abstract = "A recent advance in genetic computations is the
heuristic prediction model (symbolic regression), which
has received little statistical scrutiny. Diagnostic
checks of genetically evolved models (GEMs) as a
forecasting method are therefore essential. This
requires assessing the statistical properties of errors
produced by GEMs. Since the predicted models and their
forecasts are produced artificially by a computer
program, little controls the final model specification.
However, it is of interest to understand the final
specification and to know the statistical
characteristics of its errors, particularly if
artificially produced models furnish better forecasts
than humanly conceived ones. This paper's main concern
is the statistical analysis of errors from genetically
evolved models. Genetic programming (GP) is one of two
computational algorithms for evolving regression
models, the other being evolutionary programming (EP).
GP-QUICK computer code written in C++ evolves the
regression models for this study. GP-QUICK replicates
an original GP program in LISP by Koza. Both are
designed to evolve regression models randomly, finding
one that replicates the series' data-generating process
best. Prediction errors from GP-evolved regression
models are tested for whiteness (or autocorrelation)
and for normality. Well-established diagnostic tools
for linear time-series modeling apply also to nonlinear
models. Only diagnostic methods that use the errors alone, without
having to replicate the models that produced them, are
selected and applied to series. This restriction
avoids reproducing the resulting genetically evolved
equations. These equations are generated by a random
selection mechanism almost impossible to replicate with
GP unless the process is deterministic, and they are
usually too complex for standard statistical software
to reproduce and analyze. The diagnostic methods are
selected for their simplicity and speed of execution
without sacrificing reliability. This paper contains
four other sections. One presents the diagnostic tools
to determine the statistical properties of residuals
produced by GEMs. Residuals from evolved models
representing systems with known characteristics are
used to evaluate the statistical performance of GEMs.
Another furnishes six data-generating processes
representing linear, linear-stochastic, nonlinear,
nonlinear-stochastic, and pseudo-random systems for
which models are evolved and residuals computed. The
final contains those residuals' diagnostics. Diagnostic
tools include the Kolmogorov-Smirnov test for whiteness
developed by Durbin (1969) in addition to statistical
testing of the null hypotheses that the fitted
residuals' mean, skewness, and kurtosis are
independently equal to zero. Conclusions and directions for
future research are given.",
notes = "CEF'99 RePEc:sce:scecf9:1031 23 Nov 1999: Our printers
barf if given GP-Stat.prn
22 Aug 2004
http://ideas.repec.org/p/sce/scecf9/1031.html CEF
number 1031",