ABSTRACT
The use of machine learning techniques to automatically analyse data for information is becoming increasingly widespread. In this paper we examine the use of Genetic Programming and a Genetic Algorithm to pre-process data before it is classified by an external classifier. Genetic Programming is combined with a Genetic Algorithm to construct and select new features from those available in the data, a potentially significant process for data mining since it gives consideration to hidden relationships between features. We then examine techniques to improve the human readability of these new features and extract more information about the domain.
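As a rough illustration of the pipeline described above, the following toy sketch evolves a small set of expression-tree features (the Genetic Programming part), each paired with an on/off selection bit (the Genetic Algorithm part), and scores each candidate by the accuracy of an external classifier on the transformed data. It is not the authors' implementation. The function set, the leave-one-out 1-nearest-neighbour classifier, the mutation-only search with truncation selection, and every name and parameter value here are assumptions chosen for brevity.

```python
import random

# Hypothetical function set for constructed features (an assumption, not from the paper).
OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b, "mul": lambda a, b: a * b}
SYMBOLS = {"add": "+", "sub": "-", "mul": "*"}

def random_tree(rng, n_inputs, depth=2):
    """Grow a random expression tree whose leaves are original features."""
    if depth == 0 or rng.random() < 0.3:
        return ("x", rng.randrange(n_inputs))
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, n_inputs, depth - 1), random_tree(rng, n_inputs, depth - 1))

def evaluate(tree, row):
    """Compute the value of one constructed feature for one data row."""
    if tree[0] == "x":
        return row[tree[1]]
    return OPS[tree[0]](evaluate(tree[1], row), evaluate(tree[2], row))

def to_string(tree):
    """Render a tree in infix form so that a person can read it."""
    if tree[0] == "x":
        return f"x{tree[1]}"
    return f"({to_string(tree[1])} {SYMBOLS[tree[0]]} {to_string(tree[2])})"

def transform(individual, rows):
    """Apply only the features whose selection bit is switched on."""
    trees = [tree for tree, selected in individual if selected]
    return [[evaluate(tree, row) for tree in trees] for row in rows]

def loo_1nn_accuracy(features, labels):
    """Leave-one-out accuracy of 1-nearest-neighbour, standing in for the external classifier."""
    if not features or not features[0]:
        return 0.0  # no feature selected, so nothing to classify on
    correct = 0
    for i, fi in enumerate(features):
        nearest = min(
            (j for j in range(len(features)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(fi, features[j])),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(features)

def mutate(individual, rng, n_inputs):
    """Flip one selection bit (GA-style) or regenerate one tree (a crude GP-style mutation)."""
    child = list(individual)
    k = rng.randrange(len(child))
    tree, selected = child[k]
    if rng.random() < 0.5:
        child[k] = (tree, not selected)
    else:
        child[k] = (random_tree(rng, n_inputs), selected)
    return child

def evolve(rows, labels, n_features=3, pop_size=20, generations=15, seed=0):
    """Mutation-only search with truncation selection; the better half always survives."""
    rng = random.Random(seed)
    n_inputs = len(rows[0])

    def fitness(individual):
        return loo_1nn_accuracy(transform(individual, rows), labels)

    pop = [[(random_tree(rng, n_inputs), True) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = [mutate(rng.choice(parents), rng, n_inputs)
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)

def make_data(n=40, seed=1):
    """Synthetic problem: the class depends on the sign of x0 * x1."""
    rng = random.Random(seed)
    rows = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    labels = [int(row[0] * row[1] > 0) for row in rows]
    return rows, labels

if __name__ == "__main__":
    rows, labels = make_data()
    best, accuracy = evolve(rows, labels)
    for tree, selected in best:
        print("on " if selected else "off", to_string(tree))
    print("leave-one-out accuracy:", accuracy)
```

Printing each surviving tree with `to_string` is the simplest form of the readability concern raised in the abstract: a feature such as `(x0 * x1)` exposes a relationship between the original features that a person can inspect. Crossover and any size-control measure are deliberately left out of this sketch.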
Index Terms
Improving the human readability of features constructed by genetic programming