Abstract
Combining a set of classifiers is a common way to improve classification performance. A good ensemble requires base classifiers that are both accurate and diverse, so estimating diversity among classifiers has been widely investigated. This paper presents an ensemble approach that combines a set of diverse rules obtained by genetic programming. Genetic programming generates interpretable classification rules, and the diversity among them is estimated directly. Several diverse rules are then combined by a fusion method to produce the final decision. The proposed method has been applied to cancer classification using gene expression profiles, an important problem in bioinformatics. Experiments on several popular cancer datasets demonstrate the usability of the method: it achieves high performance, and accuracy increases with the diversity among the base classification rules.
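The pipeline outlined in the abstract (generate rules, estimate their diversity, fuse a diverse subset) can be illustrated with a minimal sketch. The abstract does not specify the diversity measure or the fusion method, so everything below is an assumption made for illustration: the four hand-written rules stand in for rules evolved by genetic programming, diversity is approximated by pairwise disagreement on sample predictions, and fusion is plain majority voting. The gene names and expression values are hypothetical.

```python
from itertools import combinations

# Hypothetical stand-ins for evolved rules: each maps an expression
# profile (dict of gene name -> value) to a class label 0 or 1.
def rule_a(x): return int(x["g1"] > 0.5)
def rule_b(x): return int(x["g2"] - x["g3"] > 0.0)
def rule_c(x): return int(x["g1"] > 0.4)   # close variant of rule_a
def rule_d(x): return int(x["g4"] < 0.3)

def disagreement(r1, r2, samples):
    """Fraction of samples on which two rules give different labels
    (an assumed diversity measure, not necessarily the paper's)."""
    return sum(r1(x) != r2(x) for x in samples) / len(samples)

def most_diverse_subset(rules, samples, k):
    """Return the k-rule subset with the highest mean pairwise disagreement."""
    def score(subset):
        pairs = list(combinations(subset, 2))
        return sum(disagreement(a, b, samples) for a, b in pairs) / len(pairs)
    return max(combinations(rules, k), key=score)

def majority_vote(rules, x):
    """Fuse rule outputs: predict 1 if more than half of the rules say 1."""
    votes = sum(r(x) for r in rules)
    return int(2 * votes > len(rules))

# Made-up expression profiles, for illustration only.
samples = [
    {"g1": 0.9,  "g2": 0.8, "g3": 0.1, "g4": 0.2},
    {"g1": 0.45, "g2": 0.1, "g3": 0.6, "g4": 0.9},
    {"g1": 0.7,  "g2": 0.2, "g3": 0.5, "g4": 0.1},
    {"g1": 0.1,  "g2": 0.9, "g3": 0.3, "g4": 0.8},
]

rules = [rule_a, rule_b, rule_c, rule_d]
chosen = most_diverse_subset(rules, samples, k=3)
predictions = [majority_vote(chosen, x) for x in samples]
```

In this toy setting, rules that always agree contribute nothing to the diversity score, so the selection tends to avoid keeping both of them. The exhaustive subset search is only practical for a small rule pool.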
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hong, J.-H., Cho, S.-B. (2005). Cancer Prediction Using Diversity-Based Ensemble Genetic Programming. In: Torra, V., Narukawa, Y., Miyamoto, S. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2005. Lecture Notes in Computer Science, vol 3558. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11526018_29
DOI: https://doi.org/10.1007/11526018_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27871-9
Online ISBN: 978-3-540-31883-5
eBook Packages: Computer Science, Computer Science (R0)