Abstract
In this work, we present an extensive bibliometric and content-based analysis of the scientific literature on genetic programming in the twenty-first century. Our work has two distinctive features. First, we reveal the topics emerging from the literature through an unsupervised analysis of the textual content of titles and abstracts. Second, we execute every analysis twice: once on the papers published in venues typical of the evolutionary computation research community, and once on those published in all other venues. This view from “both sides of the fence” allows us to gain broader and deeper insights into the actual contributions of our community.
Notes
Formally, the twenty-first century started on January 1, 2001, rather than on January 1, 2000, which is the starting date of the so-called “2000s century.” We chose, however, to use “twenty-first century” because we think it is a more accessible locution.
The publication venue is shown in the “source title” field of Scopus results.
http://species-society.org, accessed in September 2018. SPECIES is a non-profit association that “aims to promote evolutionary algorithmic thinking within Europe and wider, and more generally to promote inspiration of parallel algorithms derived from natural processes”.
The rank is provided by the Computing Research and Education Association of Australasia and is one of A\(^*\) (best), A, B, and C (worst); see http://www.core.edu.au/conference-portal.
Both lemmatization and stemming have been done using the NLTK toolkit, https://www.nltk.org/.
Originally, the countries of affiliation form a multiset, since more than one author can be affiliated with institutions in the same country; we considered the corresponding set.
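The difference between the multiset and the set of affiliation countries can be illustrated with a minimal sketch. The data, the variable names, and the helper `countries_per_paper` are hypothetical and only serve the illustration; the sketch assumes that the multiset is built per paper, with one entry per author, which the note does not state explicitly.

```python
from collections import Counter

# Hypothetical affiliation countries of the authors of one paper
# (illustrative data, not taken from the analyzed corpus).
author_countries = ["Italy", "Italy", "Portugal"]

# Multiset view: a country appears once per affiliated author.
country_multiset = Counter(author_countries)

# Set view: a country appears at most once per paper.
country_set = set(author_countries)


def countries_per_paper(papers):
    """Count, for each country, the papers with at least one author from it.

    `papers` is an iterable of per-paper lists of author countries
    (an assumed input format).
    """
    counts = Counter()
    for countries in papers:
        # Collapse the multiset to a set so each paper counts once per country.
        counts.update(set(countries))
    return counts
```

With the set view, a paper with two authors from the same country contributes one unit to that country's count rather than two, so countries are not over-represented by papers with many co-located authors.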
Acknowledgements
This work was supported by national funds through FCT (Fundação para a Ciência e a Tecnologia) under Project DSAIPA/DS/0022/2018 (GADgET). Mauro Castelli acknowledges the financial support from the Slovenian Research Agency (research core Funding No. P5-0410).
Cite this article
De Lorenzo, A., Bartoli, A., Castelli, M. et al. Genetic programming in the twenty-first century: a bibliometric and content-based analysis from both sides of the fence. Genet Program Evolvable Mach 21, 181–204 (2020). https://doi.org/10.1007/s10710-019-09363-3
DOI: https://doi.org/10.1007/s10710-019-09363-3