ABSTRACT
Tasks related to Natural Language Processing (NLP) have recently been the focus of a large research endeavor by the machine learning community. The increased interest in this area is mainly due to the success of deep learning methods. Genetic Programming (GP), however, has not been in the spotlight with respect to NLP tasks. Here, we propose a first proof of concept that combines GP with the well-established NLP tool word2vec for the next-word prediction task. The main idea is that, once words have been mapped into a vector space, traditional GP operators can work directly on vectors, thus producing meaningful words as output. To assess the suitability of this approach, we perform an experimental evaluation on a set of existing newspaper headlines. Individuals resulting from this (pre-)training phase can be employed as the initial population in other NLP tasks, such as sentence generation, which will be the focus of future investigations, possibly employing adversarial co-evolutionary approaches.
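The core idea, a GP expression tree whose operators act on word vectors and whose output vector is decoded back to a word, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the tiny hand-written `EMBEDDINGS` table stands in for a trained word2vec model, the tuple-based tree encoding and the `add`/`sub` operator set are hypothetical, and the evolutionary loop (selection, crossover, mutation, fitness on headlines) is omitted. It is not the authors' implementation.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-written for illustration only;
# a real setup would load vectors from a trained word2vec model.
EMBEDDINGS = {
    "king":  (0.9, 0.8, 0.1),
    "queen": (0.9, 0.1, 0.8),
    "man":   (0.1, 0.9, 0.1),
    "woman": (0.1, 0.1, 0.9),
}


def evaluate(tree, inputs):
    """Evaluate a GP expression tree over the vectors of the input words.

    A tree is either ("in", i), a terminal selecting the i-th input word,
    or (op, left, right) with op in {"add", "sub"} applied element-wise.
    """
    op = tree[0]
    if op == "in":
        return EMBEDDINGS[inputs[tree[1]]]
    left = evaluate(tree[1], inputs)
    right = evaluate(tree[2], inputs)
    if op == "add":
        return tuple(a + b for a, b in zip(left, right))
    if op == "sub":
        return tuple(a - b for a, b in zip(left, right))
    raise ValueError(f"unknown operator: {op}")


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_word(vector):
    """Decode a vector to the vocabulary word with the highest cosine similarity."""
    return max(EMBEDDINGS, key=lambda w: cosine(vector, EMBEDDINGS[w]))


# One hand-built individual: (in0 - in1) + in2.
tree = ("add", ("sub", ("in", 0), ("in", 1)), ("in", 2))
context = ["king", "man", "woman"]
predicted = nearest_word(evaluate(tree, context))
print(predicted)
```

In an evolutionary setting, many such trees would form the population, and a fitness measure (for example, similarity between the decoded word and the actual next word of a headline) would drive selection; those details are not specified here and would have to come from the full paper.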
Index Terms
- Towards an evolutionary-based approach for natural language processing