GDF v2.0, an enhanced version of GDF

https://doi.org/10.1016/j.cpc.2007.08.008

Abstract

An improved version of the function estimation program GDF is presented. The main enhancements of the new version are multi-output function estimation, the ability to define custom functions in the grammar, and selection of the error function. The new version has been evaluated on a series of classification and regression datasets that are widely used for the evaluation of such methods. It is compared to two known neural networks and outperforms them on 5 of the 10 datasets.

Program summary

Title of program: GDF v2.0

Catalogue identifier: ADXC_v2_0

Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADXC_v2_0.html

Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland

Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html

No. of lines in distributed program, including test data, etc.: 98 147

No. of bytes in distributed program, including test data, etc.: 2 040 684

Distribution format: tar.gz

Programming language: GNU C++

Computer: The program is designed to be portable to all systems running the GNU C++ compiler

Operating system: Linux, Solaris, FreeBSD

RAM: 200000 bytes

Classification: 4.9

Does the new version supersede the previous version?: Yes

Nature of problem: Function estimation tries to discover, from a series of input data, a functional form that best describes them. This can be done with parametric models whose parameters adapt to the input data.

Solution method: Functional forms are created by genetic programming; these forms are approximate solutions to the symbolic regression problem.

Reasons for new version: The GDF package was extended to be more flexible and more customizable than the old package. Users can extend the package by defining their own error functions and by adding new functions to the function repertoire of the grammar. The new version can also estimate multi-output functions and can be used for classification problems.

Summary of revisions: The following features have been added to the package GDF:

  • Multi-output function approximation. The package can now approximate any function $f: R^N \to R^M$. This feature also gives the package the capability of performing classification and not only regression.

  • User-defined functions can be added to the repertoire of the grammar, extending the regression capabilities of the package. This feature is limited to 3 functions, but this number can easily be increased.

  • Capability of selecting the error function. In addition to the mean square error, the package now offers other error functions, such as the mean absolute square error and the maximum square error. User-defined error functions can also be added to the set of error functions.

  • More verbose output. The main program displays more information to the user, including the default values of the parameters. The user can also specify an output file in which the output of the gdf program for the testing set is stored when the process terminates.
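The idea of a selectable error function over multi-output data can be illustrated with a minimal C++ sketch. It is not taken from the GDF source: the names `Output`, `ErrorFunction`, `squaredErrors`, `meanSquareError` and `maxSquareError` are hypothetical, and averaging over every output component of every sample is an assumption about how the error is aggregated.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Outputs of one sample: M values for a model f: R^N -> R^M.
using Output = std::vector<double>;

// Hypothetical type for a pluggable error function (not GDF's actual API).
// It receives the model outputs and the target outputs of all samples.
using ErrorFunction =
    std::function<double(const std::vector<Output>&, const std::vector<Output>&)>;

// Squared difference of every output component of every sample.
std::vector<double> squaredErrors(const std::vector<Output>& predicted,
                                  const std::vector<Output>& target) {
    assert(predicted.size() == target.size());
    std::vector<double> errors;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        assert(predicted[i].size() == target[i].size());
        for (std::size_t j = 0; j < predicted[i].size(); ++j) {
            const double d = predicted[i][j] - target[i][j];
            errors.push_back(d * d);
        }
    }
    assert(!errors.empty());
    return errors;
}

// Mean square error, averaged over all components (an assumed convention).
double meanSquareError(const std::vector<Output>& predicted,
                       const std::vector<Output>& target) {
    const std::vector<double> e = squaredErrors(predicted, target);
    double sum = 0.0;
    for (double v : e) sum += v;
    return sum / static_cast<double>(e.size());
}

// Maximum square error: the largest single squared difference.
double maxSquareError(const std::vector<Output>& predicted,
                      const std::vector<Output>& target) {
    const std::vector<double> e = squaredErrors(predicted, target);
    return *std::max_element(e.begin(), e.end());
}
```

Under these assumptions, selecting an error function amounts to assigning one of the functions above, or a user-written one with the same signature, to an `ErrorFunction` variable and calling it on the model outputs and the targets.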

Additional comments: A technical report describing the revisions, experiments and test runs is packaged with the source code.

Running time: Depends on the training data.


This paper and its associated computer program are available via the Computer Physics Communications homepage on ScienceDirect (http://www.sciencedirect.com/science/journal/00104655).
