Gene Expression Programming Ensemble for Classifying Big Datasets

  • Conference paper
Computational Collective Intelligence (ICCCI 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10449)

Abstract

The paper proposes a new batch ensemble classifier based on gene expression programming (GEP) and constructed using the stacked generalization concept. In our approach, base classifiers are combined by evolving a meta-gene from genes that GEP induces from randomly generated combinations of instances with randomly selected subsets of attributes. The main property of the discussed classifier is its scalability, which allows it to adapt to the size of the dataset under consideration. To validate the proposed classifier, we carried out a computational experiment involving a number of publicly available benchmark datasets. The experiment results show that the approach assures good performance, scalability and robustness.
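The two-level structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the single-attribute threshold rules below are a stand-in for GEP-induced genes, and the log-odds weighted vote fitted on held-out data is a stand-in for the evolved meta-gene. The toy data and all names (`make_data`, `train_stump`, `train_ensemble`, `fit_meta`, `predict`, `run`) and parameter values are assumptions made for illustration only.

```python
import math
import random


def make_data(n, n_attrs, rng):
    """Toy data (assumption): the label depends only on attribute 0."""
    rows = [[rng.random() for _ in range(n_attrs)] for _ in range(n)]
    labels = [1 if r[0] > 0.5 else 0 for r in rows]
    return rows, labels


def train_stump(rows, labels, attrs):
    """Stand-in for a GEP-induced gene: best single-attribute threshold rule."""
    best = None
    for a in attrs:
        for t in (i / 10 for i in range(1, 10)):
            for sign in (1, -1):
                hits = sum(
                    (1 if sign * (r[a] - t) > 0 else 0) == y
                    for r, y in zip(rows, labels)
                )
                if best is None or hits > best[0]:
                    best = (hits, a, t, sign)
    _, a, t, sign = best
    return lambda r: 1 if sign * (r[a] - t) > 0 else 0


def train_ensemble(rows, labels, n_base, sample_size, n_sub_attrs, rng):
    """Level 0: each base rule sees random instances and a random attribute subset."""
    n_attrs = len(rows[0])
    base = []
    for _ in range(n_base):
        idx = [rng.randrange(len(rows)) for _ in range(sample_size)]
        attrs = rng.sample(range(n_attrs), n_sub_attrs)
        base.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx], attrs)
        )
    return base


def fit_meta(base, rows, labels):
    """Level 1 (stand-in for the meta-gene): log-odds weights from held-out accuracy."""
    weights = []
    for clf in base:
        acc = sum(clf(r) == y for r, y in zip(rows, labels)) / len(rows)
        acc = min(max(acc, 0.01), 0.99)  # clamp so the logarithm stays finite
        weights.append(math.log(acc / (1 - acc)))
    return weights


def predict(base, weights, row):
    """Weighted vote over base outputs mapped to -1 / +1."""
    score = sum(w * (1 if clf(row) == 1 else -1) for clf, w in zip(base, weights))
    return 1 if score > 0 else 0


def run(seed=0):
    rng = random.Random(seed)
    rows, labels = make_data(600, 5, rng)
    train, meta, test = slice(0, 300), slice(300, 450), slice(450, 600)
    base = train_ensemble(rows[train], labels[train], 15, 50, 2, rng)
    weights = fit_meta(base, rows[meta], labels[meta])
    preds = [predict(base, weights, r) for r in rows[test]]
    acc = sum(p == y for p, y in zip(preds, labels[test])) / len(preds)
    return base, weights, preds, acc


if __name__ == "__main__":
    print("held-out accuracy:", run()[3])
```

The number of base rules and the per-rule sample size are the knobs that would let such a scheme adapt to dataset size, since each base learner only ever touches a small sample; the values used here are arbitrary.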

Author information

Corresponding author

Correspondence to Joanna Jędrzejowicz.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Jędrzejowicz, J., Jędrzejowicz, P. (2017). Gene Expression Programming Ensemble for Classifying Big Datasets. In: Nguyen, N., Papadopoulos, G., Jędrzejowicz, P., Trawiński, B., Vossen, G. (eds) Computational Collective Intelligence. ICCCI 2017. Lecture Notes in Computer Science, vol 10449. Springer, Cham. https://doi.org/10.1007/978-3-319-67077-5_1

  • DOI: https://doi.org/10.1007/978-3-319-67077-5_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67076-8

  • Online ISBN: 978-3-319-67077-5

  • eBook Packages: Computer Science, Computer Science (R0)
