Multi- and Many-Threaded Heterogeneous Parallel Grammatical Evolution

Abstract

Several algorithms are suited to inferring human-interpretable models for classification and regression tasks in machine learning, but few can compete with Grammatical Evolution (GE) in power, model expressiveness and ease of implementation. On the other hand, algorithms that iteratively optimize a set of programs of arbitrary complexity, as GE does, may require a prohibitive amount of running time when tackling complex problems. Fortunately, GE can scale to such problems by carefully harnessing the parallel processing of modern heterogeneous systems, taking advantage of traditional multi-core processors and many-core accelerators to speed up the execution by orders of magnitude. This chapter covers the subject of parallel GE, focusing on heterogeneous multi- and many-threaded decomposition in order to achieve a fully parallel implementation in which both breeding and evaluation are parallelized. In the studied benchmarks, the overall parallel implementation ran 68 times faster than the sequential version, with the program-evaluation kernel alone reaching a 350-fold acceleration. Details on how to accomplish this efficiently are given in the context of two well-established open standards for parallel computing: OpenMP and OpenCL. Decomposition strategies, optimization techniques and parallel benchmarks, followed by analyses, are presented in the chapter.
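To make the decomposition concrete, the sketch below shows in plain C with OpenMP the overall shape of one fully parallel GE generation as described above: breeding is split among the CPU threads, while evaluation, the dominant cost, is concentrated in a single bulk call that a heterogeneous setup would offload to an OpenCL device. All names here (Individual, breed, evaluate_population) are illustrative placeholders, not the chapter's actual interface.

    /* Minimal sketch, not the chapter's implementation: breeding parallelized on
     * the CPU with OpenMP, evaluation gathered into one bulk call that a real
     * heterogeneous setup would offload to an OpenCL device. */
    #include <omp.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define GENOME_LEN 32

    typedef struct { unsigned char genome[GENOME_LEN]; double fitness; } Individual;

    /* Toy breeding operator: copy one parent and mutate a single codon.
     * A real GE operator would apply selection, crossover and mutation. */
    static void breed(const Individual *parents, int pop_size, Individual *child, int i)
    {
        *child = parents[i % pop_size];
        child->genome[i % GENOME_LEN] ^= (unsigned char)(i + 1);
    }

    /* Stand-in for the device-side evaluation (an OpenCL kernel in the chapter):
     * here it just sums the codons so the example runs on its own. */
    static void evaluate_population(Individual *pop, int pop_size)
    {
        for (int i = 0; i < pop_size; ++i) {
            double sum = 0.0;
            for (int j = 0; j < GENOME_LEN; ++j)
                sum += pop[i].genome[j];
            pop[i].fitness = sum;
        }
    }

    int main(void)
    {
        enum { POP = 1024 };
        Individual *parents   = calloc(POP, sizeof *parents);
        Individual *offspring = calloc(POP, sizeof *offspring);

        /* Breeding is embarrassingly parallel: each offspring is produced
         * independently, so the loop is split among the CPU threads. */
        #pragma omp parallel for schedule(static)
        for (int i = 0; i < POP; ++i)
            breed(parents, POP, &offspring[i], i);

        /* Evaluation handled as one bulk operation. */
        evaluate_population(offspring, POP);

        printf("fitness[0] = %g\n", offspring[0].fitness);
        free(parents);
        free(offspring);
        return 0;
    }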


Notes

  1. Put simply, a cache is a very fast but small memory that stores the most recently or most commonly accessed data. When requested data are not in the cache, a fixed-size block containing not only the requested data but also nearby data is copied from main memory into the cache; such a block is known as a cache line or cache block [10]. (A small illustration of why cache-line granularity matters for multi-threaded code is given after these notes.)

  2. Due to accelerator device constraints, the OpenCL programming model forbids synchronization among work-groups, which precludes gathering the partial errors within the evaluation kernel. Since only a fraction of the partial errors remains at the end of the kernel execution, it is preferable to reduce them directly on the host processor (CPU); a sketch of this pattern is given after these notes.

  3. ppi is freely available at http://github.com/daaugusto/ppi.

  4. Note that a single generation would suffice to estimate the execution time of the tasks of the GE algorithm; however, we used two generations in order to make the timeline visualization more appealing.

  5. Speedup is the ratio of the execution time of the algorithm before improvement to its execution time after improvement (formalized after these notes).

  6. Improvement percentage is the difference between the execution times before and after improvement, divided by the execution time of the unimproved version and multiplied by 100 (formalized after these notes).

  7. GPop/s stands for genetic programming operations per second and is equivalent to the number of program “symbols” processed per second (see the formula after these notes).

  8. SIMD stands for Single Instruction, Multiple Data and means that a single instruction is executed on multiple data elements simultaneously.

  9. SPMD stands for Single Program, Multiple Data and means that different compute units of a compute device are able to execute different instructions simultaneously from the same kernel (both concepts are illustrated by the kernel sketch after these notes).
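As a minimal illustration of the cache-line behaviour mentioned in note 1, the C/OpenMP snippet below keeps per-thread accumulators on separate cache lines via padding, avoiding the false sharing that occurs when several threads repeatedly write to the same line [3]. The 64-byte line size and all identifiers are assumptions for the example, not values taken from the chapter.

    /* Illustration only: per-thread partial sums padded to a full cache line so
     * that concurrent updates from different threads never touch the same line
     * (no false sharing). */
    #include <omp.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define CACHE_LINE 64   /* common cache-line size; query the hardware in practice */

    typedef struct {
        double value;
        char pad[CACHE_LINE - sizeof(double)];  /* padding fills the rest of the line */
    } PaddedDouble;

    int main(void)
    {
        int nthreads = omp_get_max_threads();
        PaddedDouble *partial = calloc((size_t)nthreads, sizeof *partial);

        #pragma omp parallel
        {
            int t = omp_get_thread_num();
            for (long i = 0; i < 1000000L; ++i)
                partial[t].value += 1.0;        /* each thread stays on its own cache line */
        }

        double total = 0.0;
        for (int t = 0; t < nthreads; ++t)
            total += partial[t].value;
        printf("total = %.0f\n", total);

        free(partial);
        return 0;
    }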
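The pattern described in note 2 can be sketched as follows, assuming a hypothetical evaluation pipeline in which each work-group reduces its work-items' errors in local memory and writes a single partial sum; the handful of remaining partial sums is then added on the host, because OpenCL provides no global synchronization across work-groups. This is an illustrative OpenCL C kernel, not the chapter's actual code.

    /* One partial sum per work-group; the host finishes the reduction. */
    __kernel void partial_errors(__global const float *errors,  /* one error per training case */
                                 __global float *partial,       /* one slot per work-group */
                                 __local float *scratch,
                                 const unsigned int n)
    {
        const size_t lid = get_local_id(0);
        const size_t gid = get_global_id(0);

        scratch[lid] = (gid < n) ? errors[gid] : 0.0f;
        barrier(CLK_LOCAL_MEM_FENCE);

        /* Tree reduction within the work-group (local size assumed a power of two). */
        for (size_t s = get_local_size(0) / 2; s > 0; s >>= 1) {
            if (lid < s)
                scratch[lid] += scratch[lid + s];
            barrier(CLK_LOCAL_MEM_FENCE);
        }

        if (lid == 0)
            partial[get_group_id(0)] = scratch[0];
    }

    /* Host side, after reading 'partial' back (e.g. via clEnqueueReadBuffer):
     *     float error = 0.0f;
     *     for (size_t g = 0; g < num_groups; ++g)
     *         error += partial[g];
     */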
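In symbols, the standard definition of speedup referred to in note 5 is

    \[ S = \frac{T_{\text{before}}}{T_{\text{after}}} \]

so, for example, a kernel whose evaluation time drops from 350 s to 1 s achieves S = 350.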
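Likewise, the improvement percentage of note 6 can be written as

    \[ I(\%) = \frac{T_{\text{before}} - T_{\text{after}}}{T_{\text{before}}} \times 100 = \left(1 - \frac{1}{S}\right) \times 100 \]

so a speedup of S = 2 corresponds to a 50% improvement.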
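One common way to compute the GPop/s figure of note 7 (an assumption about the exact accounting, not a formula quoted from the chapter) is

    \[ \text{GPop/s} = \frac{\sum_{i=1}^{|P|} |p_i| \cdot n_{\text{cases}}}{t_{\text{eval}}} \]

where |P| is the population size, |p_i| the number of symbols in program i, n_cases the number of training cases, and t_eval the evaluation time in seconds.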
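Finally, the toy OpenCL C kernel below only makes the SIMD/SPMD distinction of notes 8 and 9 concrete; it is not taken from the chapter. Every work-item executes the same kernel but may follow its own branch (SPMD), while the float4 arithmetic inside maps to vector (SIMD) instructions on hardware that supports them.

    __kernel void axpy4(__global const float4 *x, __global float4 *y, const float a)
    {
        const size_t i = get_global_id(0);

        if (i % 2 == 0)                 /* SPMD: each work-item may take its own branch */
            y[i] = a * x[i] + y[i];     /* SIMD: one operation applied to four lanes    */
        else
            y[i] = x[i];
    }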

References

  1. D.A. Augusto, H.J.C. Barbosa, Symbolic regression via genetic programming, in Proceedings of the Sixth Brazilian Symposium on Neural Networks, vol. 1 (2000), pp. 173–178
  2. D.A. Augusto, H.J.C. Barbosa, Accelerated parallel genetic programming tree evaluation with OpenCL. J. Parallel Distrib. Comput. 73(1), 86–100 (2013)
  3. W.J. Bolosky, M.L. Scott, False sharing and its effect on shared memory performance, in USENIX Experiences with Distributed and Multiprocessor Systems, vol. 4, Sedms ’93 (USENIX Association, Berkeley, CA, 1993), pp. 3–3
  4. R. Chandra, L. Dagum, D. Kohr, D. Maydan, J. McDonald, R. Menon, Parallel Programming in OpenMP (Morgan Kaufmann, San Francisco, CA, 2001)
  5. L. Deng, D. Yu, Deep learning: methods and applications. Technical Report (2014)
  6. S.S. Haykin, Neural Networks and Learning Machines, 3rd edn. (Pearson Education, Upper Saddle River, NJ, 2009)
  7. W.D. Hillis, G.L. Steele Jr., Data parallel algorithms. Commun. ACM 29, 1170–1183 (1986)
  8. C. Iancu, S. Hofmeyr, F. Blagojević, Y. Zheng, Oversubscription on multicore processors, in 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS) (April 2010), pp. 1–11
  9. P. Jääskeläinen, C.S. de La Lama, E. Schnetter, K. Raiskila, J. Takala, H. Berg, pocl: a performance-portable OpenCL implementation. Int. J. Parallel Program. 43(5), 752–785 (2015)
  10. B. Jacob, S. Ng, D. Wang, Memory Systems: Cache, DRAM, Disk (Morgan Kaufmann, San Francisco, CA, 2007)
  11. D. Kaeli, P. Mistry, D. Schaa, D.P. Zhang (eds.), Heterogeneous Computing with OpenCL 2.0, 3rd edn. (Morgan Kaufmann, Boston, 2015)
  12. C. Lameter, NUMA (non-uniform memory access): an overview. Queue 11(7), 40:40–40:51 (2013)
  13. B.W. Lampson, Lazy and speculative execution in computer systems, in OPODIS, ed. by A.A. Shvartsman. Lecture Notes in Computer Science, vol. 4305 (Springer, Berlin, 2006), pp. 1–2
  14. P. Pospichal, E. Murphy, M. O’Neill, J. Schwarz, J. Jaros, Acceleration of grammatical evolution using graphics processing units: computational intelligence on consumer games and graphics hardware, in Proceedings of the 13th Annual Conference Companion on Genetic and Evolutionary Computation, GECCO ’11 (ACM, New York, NY, 2011), pp. 431–438
  15. D. Robilliard, V. Marion-Poty, C. Fonlupt, Genetic programming on graphics processing units. Genet. Program Evolvable Mach. 10(4), 447 (2009)
  16. I.L.S. Russo, H.S. Bernardino, H.J.C. Barbosa, A massively parallel grammatical evolution technique with OpenCL. J. Parallel Distrib. Comput. 109, 333–349 (2017)
  17. J. Shen, J. Fang, H.J. Sips, A.L. Varbanescu, Performance traps in OpenCL for CPUs, in PDP (IEEE Computer Society, 2013), pp. 38–45
  18. J.E. Smith, A study of branch prediction strategies, in Proceedings of the 8th Annual Symposium on Computer Architecture, ISCA ’81 (IEEE Computer Society Press, Los Alamitos, CA, 1981), pp. 135–148


Acknowledgements

The authors would like to thank the support provided by CNPq (grants 310778/2013-1, 502836/2014-8 and 300458/2017-7), FAPEMIG (grant APQ-03414-15), EU H2020 Programme and MCTI/RNP–Brazil under the HPC4E Project (grant 689772).

Author information

Corresponding author

Correspondence to Amanda Sabatini Dufek.


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Dufek, A.S., Augusto, D.A., Barbosa, H.J.C., da Silva Dias, P.L. (2018). Multi- and Many-Threaded Heterogeneous Parallel Grammatical Evolution. In: Ryan, C., O'Neill, M., Collins, J. (eds) Handbook of Grammatical Evolution. Springer, Cham. https://doi.org/10.1007/978-3-319-78717-6_9

  • DOI: https://doi.org/10.1007/978-3-319-78717-6_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-78716-9

  • Online ISBN: 978-3-319-78717-6

  • eBook Packages: Computer Science, Computer Science (R0)
