research-article

Towards an evolutionary-based approach for natural language processing

Authors:
Luca Manzoni, Università degli Studi di Trieste, Trieste, Italy
Domagoj Jakobovic, University of Zagreb, Zagreb, Croatia
Luca Mariot, Delft University of Technology, Delft, The Netherlands
Stjepan Picek, Delft University of Technology, Delft, The Netherlands
Mauro Castelli, Universidade Nova de Lisboa, Lisbon, Portugal

GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference, June 2020, Pages 985–993, https://doi.org/10.1145/3377930.3390248

Published: 26 June 2020

ABSTRACT

Tasks related to Natural Language Processing (NLP) have recently been the focus of a large research effort by the machine learning community. The increased interest in this area is mainly due to the success of deep learning methods. Genetic Programming (GP), however, has received little attention for NLP tasks. Here, we propose a first proof of concept that combines GP with the well-established NLP tool word2vec for the next-word prediction task. The main idea is that, once words have been mapped into a vector space, traditional GP operators can successfully work on vectors and thus produce meaningful words as output. To assess the suitability of this approach, we perform an experimental evaluation on a set of existing newspaper headlines. Individuals resulting from this (pre-)training phase can be employed as the initial population in other NLP tasks, such as sentence generation, which will be the focus of future investigations, possibly employing adversarial co-evolutionary approaches.
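The core idea of the abstract (evaluate a GP tree on the vectors of the preceding words, then map the resulting vector back to the closest word) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy 3-dimensional embeddings stand in for word2vec vectors, the function set (`add`, `sub`, `mean`) and the tuple-based tree encoding are hypothetical, and cosine similarity is used both for the nearest-word lookup and as a stand-in fitness measure.

```python
import math

# Toy 3-dimensional vectors standing in for word2vec embeddings (values are made up).
EMBEDDINGS = {
    "police": [0.9, 0.1, 0.0],
    "arrest": [0.8, 0.3, 0.1],
    "suspect": [0.7, 0.4, 0.2],
    "rain": [0.0, 0.9, 0.1],
    "floods": [0.1, 0.8, 0.3],
    "town": [0.2, 0.6, 0.7],
}

# Hypothetical GP function set: element-wise operations on vectors.
FUNCTIONS = {
    "add": lambda a, b: [x + y for x, y in zip(a, b)],
    "sub": lambda a, b: [x - y for x, y in zip(a, b)],
    "mean": lambda a, b: [(x + y) / 2 for x, y in zip(a, b)],
}


def evaluate(tree, context):
    """Evaluate a GP tree over the context vectors.

    A leaf is an int index into `context`; an inner node is (op, left, right).
    """
    if isinstance(tree, int):
        return context[tree]
    op, left, right = tree
    return FUNCTIONS[op](evaluate(left, context), evaluate(right, context))


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_word(vector):
    """Map an output vector back to the vocabulary word closest to it."""
    return max(EMBEDDINGS, key=lambda w: cosine(vector, EMBEDDINGS[w]))


def predict_next(tree, words):
    """Predict the next word from the preceding words using one GP individual."""
    context = [EMBEDDINGS[w] for w in words]
    return nearest_word(evaluate(tree, context))


def fitness(tree, headlines, window=2):
    """Assumed fitness: mean cosine similarity between the tree's output
    and the vector of the true next word, over all sliding windows."""
    scores = []
    for words in headlines:
        for i in range(window, len(words)):
            context = [EMBEDDINGS[w] for w in words[i - window:i]]
            scores.append(cosine(evaluate(tree, context), EMBEDDINGS[words[i]]))
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    tree = ("mean", 0, 1)  # one hand-written individual
    print(predict_next(tree, ["police", "arrest"]))
    print(fitness(tree, [["police", "arrest", "suspect"], ["rain", "floods", "town"]]))
```

In a full GP run, a population of such trees would be evolved with crossover and mutation and ranked by a fitness of this kind. With real embeddings, the nearest-word lookup would typically be delegated to the embedding library rather than a linear scan over a dictionary.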

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Machine learning approaches
      1. Bio-inspired approaches
        Genetic programming

Published in
GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference
June 2020
1349 pages
ISBN: 9781450371285
DOI: 10.1145/3377930
General Chair: Carlos Artemio Coello Coello, CINVESTAV-IPN
Copyright © 2020 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
genetic programming
natural language processing
next word prediction
Qualifiers
- research-article