Created by W.Langdon from gp-bibliography.bib Revision:1.8098
Affymetrix High Density Oligonucleotide Arrays (HDONA) simultaneously measure the expression of thousands of genes using millions of probes. Regular expressions can be evolved from a Backus-Naur form (BNF) context-free grammar using tree-based, strongly typed genetic programming written in gawk. Fitness is given by egrep. The quality of each HG-U133A probe is indicated by its correlation, across 6685 human tissue samples from NCBI's GEO database, with other measurements for the same gene. Low concordance indicates a poor probe. The evolved, data-mined motif is better at predicting poor DNA sequences than an existing human-generated RE, suggesting that runs of cytosine, runs of guanine, and mixtures of the two should all be avoided. Section 4.6 gives more details of the gawk implementation of the RE GP.
Code is available at ftp://ftp.cs.ucl.ac.uk/genetic/gp-code/RE_gp.tar
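To make the idea of scoring a candidate regular expression against labelled probes concrete, here is a minimal Python sketch. It is not the authors' gawk/egrep code. The probe sequences, their poor/good labels, the two candidate patterns, and the accuracy measure are all made up for illustration, and Python's `re` module stands in for egrep (the patterns use only syntax common to both).

```python
import re

# Hypothetical toy data: (probe sequence, is_poor). These are invented
# sequences, not HG-U133A probes, and the labels are illustrative only.
TOY_PROBES = [
    ("ATGGGGTCATTAGCATTACGATTAC", True),
    ("TTACCCCCATGATTAGCATAGATCA", True),
    ("ATCGATTAGCATTAGCATCGATTAG", False),
    ("GATTACAGATTACAGATTACAGATT", False),
    ("ACGGCGGCATTAGCATTAGCATTAG", True),
]


def confusion_counts(pattern, probes):
    """Return (tp, fp, fn, tn), treating a regex match as 'predicted poor'."""
    rx = re.compile(pattern)
    tp = fp = fn = tn = 0
    for seq, is_poor in probes:
        matched = rx.search(seq) is not None
        if matched and is_poor:
            tp += 1
        elif matched:
            fp += 1
        elif is_poor:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(pattern, probes):
    """Assumed fitness measure: fraction of probes classified correctly."""
    tp, fp, fn, tn = confusion_counts(pattern, probes)
    return (tp + tn) / len(probes)


# Two illustrative candidates: pure runs only, versus runs that may mix C and G.
PURE_RUNS = "G{4,}|C{4,}"
MIXED_RUNS = "[CG]{4,}"

if __name__ == "__main__":
    for name, pat in [("pure runs", PURE_RUNS), ("mixed runs", MIXED_RUNS)]:
        print(name, confusion_counts(pat, TOY_PROBES), accuracy(pat, TOY_PROBES))
```

On this toy data the mixed-run pattern also catches the third poor sequence, which contains no run of four identical bases. A genetic programming system would generate many such candidate patterns from a grammar and keep those that score best; the sketch covers only the scoring step.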
Genetic Programming entries for William B Langdon, Andrew P Harrison