Abstract
Lung cancer has several types, which can be distinguished by cell size and growth pattern, and each type is treated differently. Classifying the types of lung cancer helps determine the appropriate treatment and thereby reduce fatality rates. In this paper, we extend the analysis of lung cancer by using gene expression data, binary decomposition strategies and the Gene Expression Programming (GEP) technique, aiming at better classification performance. Classification performance was assessed and compared between our GEP models and three representative machine learning techniques, SVM, NNW and C4.5, on real microarray lung tumor datasets. Reliability was evaluated by cross-dataset validation. The evaluation results demonstrate that our technique achieves better classification performance in terms of accuracy, standard deviation and area under the receiver operating characteristic curve. The proposed technique provides a helpful tool for lung cancer classification.
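To illustrate what a binary decomposition strategy does, the following minimal sketch splits a multiclass problem into one binary problem per class (one-vs-rest) and predicts the class whose binary model scores highest. The abstract does not say which decomposition strategies the paper uses, so one-vs-rest is an assumption here. The `CentroidScorer` class is a hypothetical stand-in for the binary learner (it is not GEP), and the class names and two-feature samples are invented toy data, not values from the paper's datasets.

```python
def one_vs_rest_labels(labels, positive_class):
    """Relabel a multiclass target as 1 (positive_class) versus 0 (all others)."""
    return [1 if y == positive_class else 0 for y in labels]


class CentroidScorer:
    """Hypothetical binary learner (NOT GEP): compares distances to class centroids."""

    def fit(self, X, y):
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == 0]
        self.pos_centroid = [sum(col) / len(pos) for col in zip(*pos)]
        self.neg_centroid = [sum(col) / len(neg) for col in zip(*neg)]
        return self

    def score(self, x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, self.pos_centroid))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, self.neg_centroid))
        # A larger score means the sample lies closer to the positive class.
        return d_neg - d_pos


def fit_one_vs_rest(X, labels):
    """Train one binary model per class on the relabelled data."""
    return {
        cls: CentroidScorer().fit(X, one_vs_rest_labels(labels, cls))
        for cls in sorted(set(labels))
    }


def predict(models, x):
    """Return the class whose binary model gives the highest score."""
    return max(models, key=lambda cls: models[cls].score(x))


# Toy data: three invented classes, two invented expression features per sample.
X = [[1.0, 1.0], [1.2, 0.9], [5.0, 5.0], [5.1, 4.8], [9.0, 1.0], [8.8, 1.2]]
labels = ["A", "A", "B", "B", "C", "C"]
models = fit_one_vs_rest(X, labels)
```

Calling `predict(models, sample)` on a new two-feature sample returns one of the three labels. Any binary classifier exposing `fit` and `score` could replace `CentroidScorer` in this sketch.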
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Azzawi, H., Hou, J., Alanni, R., Xiang, Y., Abdu-Aljabar, R., Azzawi, A. (2017). Multiclass Lung Cancer Diagnosis by Gene Expression Programming and Microarray Datasets. In: Cong, G., Peng, WC., Zhang, W., Li, C., Sun, A. (eds) Advanced Data Mining and Applications. ADMA 2017. Lecture Notes in Computer Science, vol 10604. Springer, Cham. https://doi.org/10.1007/978-3-319-69179-4_38
DOI: https://doi.org/10.1007/978-3-319-69179-4_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69178-7
Online ISBN: 978-3-319-69179-4
eBook Packages: Computer Science (R0)