Mining Distributed Evolving Data Streams Using Fractal GP Ensembles

  • Conference paper
Genetic Programming (EuroGP 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4445)

Abstract

A Genetic Programming-based boosting ensemble method for the classification of distributed streaming data is proposed. The approach handles flows of data coming from multiple locations by building a global model that aggregates the local models coming from each node. A main characteristic of the algorithm is its adaptability in the presence of concept drift. Changes in data can cause serious deterioration of the ensemble performance. Our approach is able to discover changes by adopting a strategy based on the self-similarity of the ensemble behavior, measured by its fractal dimension, and to revise itself, promptly restoring classification accuracy. Experimental results on a synthetic data set show the validity of the approach in maintaining an accurate and up-to-date GP ensemble.
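
The abstract does not spell out how the fractal dimension is computed or how a change is declared, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a plain box-counting estimate over points scaled into the unit hypercube, and a hypothetical rule (`drift_suspected`, with an illustrative `tolerance`) that flags a change when the latest estimate moves away from the mean of earlier ones.

```python
import math


def box_counting_dimension(points, box_sizes):
    """Estimate the box-counting dimension of points in the unit hypercube.

    points    : iterable of coordinate tuples, each coordinate in [0, 1)
    box_sizes : box edge lengths r to probe, e.g. [1/2, 1/4, 1/8]
    """
    log_inv_r, log_count = [], []
    for r in box_sizes:
        # A point belongs to the box indexed by floor(coordinate / r).
        occupied = {tuple(int(c // r) for c in p) for p in points}
        log_inv_r.append(math.log(1.0 / r))
        log_count.append(math.log(len(occupied)))

    # Least-squares slope of log N(r) against log(1/r).
    n = len(log_inv_r)
    mean_x = sum(log_inv_r) / n
    mean_y = sum(log_count) / n
    num = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log_inv_r, log_count))
    den = sum((x - mean_x) ** 2 for x in log_inv_r)
    return num / den


def drift_suspected(fd_history, new_fd, tolerance=0.1):
    """Hypothetical rule (an assumption, not taken from the paper):
    flag a change when the new estimate deviates from the mean of
    earlier estimates by more than `tolerance`."""
    baseline = sum(fd_history) / len(fd_history)
    return abs(new_fd - baseline) > tolerance
```

As a sanity check on the estimator, points sampled along the diagonal of the unit square yield a dimension close to 1, while a dense regular grid over the square yields a value close to 2. In a streaming setting the points would instead be derived from the ensemble's recent behavior, for instance a sliding window of accuracy values; that mapping is left open here because the abstract does not describe it.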

Editor information

Marc Ebner Michael O’Neill Anikó Ekárt Leonardo Vanneschi Anna Isabel Esparcia-Alcázar

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Folino, G., Pizzuti, C., Spezzano, G. (2007). Mining Distributed Evolving Data Streams Using Fractal GP Ensembles. In: Ebner, M., O’Neill, M., Ekárt, A., Vanneschi, L., Esparcia-Alcázar, A.I. (eds) Genetic Programming. EuroGP 2007. Lecture Notes in Computer Science, vol 4445. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71605-1_15

  • DOI: https://doi.org/10.1007/978-3-540-71605-1_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71602-0

  • Online ISBN: 978-3-540-71605-1

  • eBook Packages: Computer Science, Computer Science (R0)
