Genetic programming-based feature learning for question answering

https://doi.org/10.1016/j.ipm.2015.09.001

Highlights

  • A new framework for answering definitional and factoid questions.

  • Producing new features by combining effective features with arithmetic operators.

  • A Genetic Programming (GP) algorithm has been employed for feature learning.

  • Three discriminant-based methods have been used for learning feature weights.

Abstract

Question Answering (QA) systems are developed to answer human questions. In this paper, we have proposed a framework for answering definitional and factoid questions, enriched by machine learning and evolutionary methods and integrated into a web-based QA system. Our main purpose is to build new features by combining state-of-the-art features with arithmetic operators. To accomplish this goal, we have presented a Genetic Programming (GP)-based approach. The exact duty of GP is to find the most promising formulas, built from a set of features and operators, that can accurately rank paragraphs, sentences, and words. We have also developed a QA system in order to test the new features. The input of our system is the text of documents retrieved by a search engine. To answer definitional questions, our system performs paragraph ranking and returns the most related paragraph. To answer factoid questions, the system evaluates sentences of the filtered paragraphs ranked by the previous module of our framework. After this phase, the system extracts one or more words from the ranked sentences based on a set of hand-made patterns and ranks them to find the final answer. We have used Text Retrieval Conference (TREC) QA track questions, web data, and the AQUAINT and AQUAINT-2 datasets for training and testing our system. Results show that the learned features perform better ranking in comparison with other evaluation formulas.

Introduction

Question Answering systems are advanced search engines that can provide the briefest and most complete answer to users instead of making them read a set of documents. QA systems are essential tools for dealing with fast-growing global information. However, upgrading a search engine to a QA system is a complex and open-ended problem (Zadeh, 2003). Machine-based human-like answering has been a dream that Artificial Intelligence (AI) scientists have long been trying to achieve. According to Russell and Norvig (2010), definitions of AI fall into four groups, one of which is based on the Turing test, i.e., the ability of machines to communicate or answer like a human. Moreover, Arthur Samuel (Samuel, 1983), in his talk titled "AI: where it has been and where it is going", stated the main goal of AI and machine learning as: "to get machines to exhibit behavior, which if done by humans, would be assumed to involve the use of intelligence."

In the early years, the fundamental problem of QA was converting a natural language question into a Structured Query Language (SQL) query and retrieving answers from structured data. These convert-to-query systems have been called restricted-domain systems because they can only answer questions related to their already-provided structured data (Indurkhya & Damerau, 2010). However, with the rapid growth of data in unstructured formats, extracting answers from domain-independent sources became the main challenge of QA. QA systems that operate on general sources, free of any specific domain, are called open-domain QA systems. The first web-based QA system, named Start (Katz, Lin, & Felshin, 2002), was developed in 2004; contemporary systems include Wolfram Alpha, Watson from IBM, and AskHERMES (Yen et al., 2013b).

The simplest form of answering for a QA system is returning a paragraph in response to a definitional question. The factoid question, however, is the most discussed question type. The answer to a factoid question is a simple fact, such as the name of a person or a location, that can be found in a sentence (Jurafsky & Martin, 2009). In addition to these two question types, there are others such as list, hypothetical, causal, relationship, procedural, and confirmation questions. In this paper, we have worked on definitional and factoid questions with a four-phase framework comprising paragraph ranking, sentence ranking, word extraction, and word ranking.

In order to select an answer from a set of candidates, the candidates must be ranked. The ranking problem involves computing a score based on a set of features. Computing the sum of all feature values, i.e., the $\sum_i \text{feature}_i$ formula, cannot reflect the true value of an answer because each feature has a weight. Finding the weights is a supervised learning problem that can be solved by a discriminant-based classification algorithm. In this paper, we used three methods for this task: Linear Discriminant Analysis (LDA), Logistic Regression, and Support Vector Machines (SVM). Taking the weights into account, the score computation formula becomes $\sum_i(\text{feature}_i \times \text{weight}_i)$.
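As a minimal sketch of this step (our illustration, not the authors' code), the following Python snippet learns feature weights with each of the three discriminant-based methods named above and scores candidates with the weighted-sum formula; the feature matrix and labels are synthetic stand-ins.

    # Minimal sketch: learn feature weights with LDA, Logistic Regression, and
    # a linear SVM, then score candidates with sum_i(feature_i * weight_i).
    # X and y are synthetic stand-ins for real candidate features and labels.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    X = rng.random((200, 6))                        # 200 candidates, 6 features
    y = (X @ [2, 1, 0, 3, 0, 1] > 3.5).astype(int)  # 1 = correct answer candidate

    for model in (LinearDiscriminantAnalysis(), LogisticRegression(), LinearSVC()):
        model.fit(X, y)
        weights = model.coef_.ravel()               # one weight per feature
        scores = X @ weights                        # weighted-sum score per candidate
        print(type(model).__name__, "top candidate:", int(np.argmax(scores)))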

Following these two formulas, a natural continuation is to use other arithmetic operators such as multiplication, division, exponential, and logarithm, which is the main purpose of this paper. The challenge here is finding a promising ranking formula over a given set of operators and features. Evolutionary approaches are well suited to this kind of problem since they can search large-scale spaces effectively. Among the different types of evolutionary algorithms, Genetic Programming (GP), whose individuals are trees, is the natural candidate: our problem is finding a ranking formula, which can be modeled directly as a tree of operators and features. The main contribution of this paper is learning efficient features via a GP algorithm for a Question Answering system.
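To make the idea concrete, here is a self-contained, simplified sketch (our illustration over toy data, not the paper's implementation): individuals are formula trees over features and protected arithmetic operators, fitness is the fraction of correct/incorrect candidate pairs the formula ranks correctly, and evolution is reduced to truncation selection with subtree mutation (no crossover) for brevity.

    import math
    import random

    FEATURES = 6                      # assumed number of features per candidate
    OPS = {
        "+": lambda a, b: a + b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b if abs(b) > 1e-9 else 1.0,  # protected division
    }
    UNARY = {
        "exp": lambda a: math.exp(min(a, 50.0)),   # clipped to avoid overflow
        "log": lambda a: math.log(abs(a) + 1e-9),  # protected logarithm
    }

    def random_tree(depth=3):
        """Grow a random formula tree; leaves are feature indices."""
        if depth == 0 or random.random() < 0.3:
            return ("feat", random.randrange(FEATURES))
        if random.random() < 0.25:
            return (random.choice(list(UNARY)), random_tree(depth - 1))
        op = random.choice(list(OPS))
        return (op, random_tree(depth - 1), random_tree(depth - 1))

    def evaluate(tree, feats):
        """Apply the formula encoded by `tree` to one feature vector."""
        tag = tree[0]
        if tag == "feat":
            return feats[tree[1]]
        if tag in UNARY:
            return UNARY[tag](evaluate(tree[1], feats))
        return OPS[tag](evaluate(tree[1], feats), evaluate(tree[2], feats))

    def mutate(tree, depth=2):
        """Replace a randomly chosen subtree with a freshly grown one."""
        if tree[0] == "feat" or random.random() < 0.3:
            return random_tree(depth)
        if tree[0] in UNARY:
            return (tree[0], mutate(tree[1], depth))
        if random.random() < 0.5:
            return (tree[0], mutate(tree[1], depth), tree[2])
        return (tree[0], tree[1], mutate(tree[2], depth))

    def fitness(tree, pairs):
        """Fraction of (correct, incorrect) candidate pairs ranked correctly."""
        ok = sum(evaluate(tree, pos) > evaluate(tree, neg) for pos, neg in pairs)
        return ok / len(pairs)

    # Toy data: correct candidates tend to have larger feature values.
    random.seed(0)
    pairs = [([random.random() + 0.3 for _ in range(FEATURES)],
              [random.random() for _ in range(FEATURES)]) for _ in range(100)]

    population = [random_tree() for _ in range(30)]
    for _ in range(40):                              # generations
        population.sort(key=lambda t: fitness(t, pairs), reverse=True)
        parents = population[:10]                    # truncation selection
        population = parents + [mutate(random.choice(parents)) for _ in range(20)]

    best = max(population, key=lambda t: fitness(t, pairs))
    print("best formula:", best, "fitness:", fitness(best, pairs))

A full GP, as used in the paper, would add crossover between parent trees and a proper selection scheme; this stripped-down loop only illustrates the representation and the ranking-based fitness.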

The remainder of this paper is organized as follows: Section 2 discusses related work; Section 3 presents the structure of our proposed QA system in detail; Sections 4 and 5 deal with answering definitional and factoid questions; and the last three sections describe the experimental results, discussion, and conclusions, respectively.

Section snippets

Related works

The question answering task is covered in book chapters by Indurkhya and Damerau (2010) and Jurafsky and Martin (2009), and in survey papers such as Kolomiyets and Moens (2011).

Previous work on the feature engineering task includes Severyn and Moschitti (2012, 2013), Severyn et al. (2013), and Tymoshenko et al. (2014). Tymoshenko et al. (2014) represented questions and candidate answer passages with pairs of shallow syntactic/semantic trees whose constituents are connected

The structure of proposed QA system

The structure of the proposed QA system is illustrated in Fig. 1. The first input is the question, which is given to a search engine. The contents of the retrieved sources are the second input. The Question_analyzer component extracts the type and features of the question. The question type is determined based on the Question_types collection, a set of handcrafted question-type patterns that we created. The contents of the retrieved sources are given to the Paragraph_feature_extractor and the Pre-processor components for

Answering definitional questions

In order to answer a definitional question, our system ranks the paragraphs of the related documents, which are retrieved by a search engine. The ranking is based on the $\sum_i(\text{feature}_i \times \text{weight}_i)$ formula. The features will be described later; the weights are learned by discriminant-based classification methods, as detailed in the following paragraphs.
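A hypothetical usage sketch of this module follows; extract_features is an assumed helper that maps a paragraph to its feature vector, not part of the paper or of any library.

    import numpy as np

    def rank_paragraphs(paragraphs, weights, extract_features):
        """Score each paragraph with sum_i(feature_i * weight_i) and sort."""
        scores = [float(np.dot(extract_features(p), weights)) for p in paragraphs]
        order = np.argsort(scores)[::-1]             # highest score first
        return [paragraphs[i] for i in order]

    # For a definitional question, the top-ranked paragraph is the answer.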

In supervised machine learning, the output is estimated from the input based on the learned model. This model is

Sentence ranking

In order to rank the sentences of the filtered paragraphs from the previous step, we used 12 features and one learned feature. These features, categorized into three groups, are illustrated in Fig. 9, and those not described in Table 3 are listed in Table 11. One of the new features is Question type adaptation, which is based on the match between a sentence and the expected answer type of the question. This feature contains three inner features, each of which is
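One way to read "one learned feature" is that the GP formula's output is appended to the 12 hand-crafted sentence features before the weighted-sum ranking; the sketch below is our assumption, reusing the evaluate function and tree representation from the GP sketch above.

    import numpy as np

    def sentence_feature_vector(hand_features, gp_tree):
        """Return a 13-dimensional vector: 12 hand-crafted sentence features
        plus the GP formula output as the learned feature."""
        learned = evaluate(gp_tree, hand_features)   # evaluate() from the GP sketch
        return np.append(np.asarray(hand_features, dtype=float), learned)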

Experimental results

The accuracy of the system has been computed for answering TREC 2004 (Voorhees, 2005) and TREC 2007 QA Track factoid questions with texts from the top-n Google-retrieved documents and from the AQUAINT (Graff, 2002) and AQUAINT-2 (Voorhees & Graff, 2008) datasets. These results are presented in Table 19. The answering depends on the value of n, the number of retrieved documents; we used n values of 1, 5, 10, 15, and 50. Web-based data provided by the Google API

Discussion

The reason the learned formulas have proved more valuable than the others is that their combination of features and operators better reflects the true value of a text and better discriminates among a set of candidates. The multiplication, division, and exponential operators enforce a useful distance between candidates based on key features.
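A toy numerical illustration of this point (our numbers, not the paper's): two candidates that are close under an additive score become clearly separated once the key features combine multiplicatively.

    a = [0.9, 0.8]   # stronger candidate's two key features
    b = [0.5, 0.7]   # weaker candidate's two key features

    additive_gap = (a[0] + a[1]) / (b[0] + b[1])        # 1.7 / 1.2  ~= 1.42
    multiplicative_gap = (a[0] * a[1]) / (b[0] * b[1])  # 0.72 / 0.35 ~= 2.06
    print(additive_gap, multiplicative_gap)             # multiplication widens the gap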

The GP algorithm can find the proper position for each feature and connect features with appropriate operators. Moreover, the GP algorithm can take into account the intra

Conclusions

We proposed a Genetic Programming-based approach for learning new features from a set of base features and arithmetic operators. The learned features proved to perform better ranking than three types of evaluation formulas: $\sum_i(\text{feature}_i \times \text{weight}_i)$, $\sum_i(\text{feature}_i^2 \times \text{weight}_i)$, and $\sum_i \text{feature}_i$. The weights of the learned features, which are higher than those of the other features, also show their importance.

We also developed a QA framework with three hierarchical

References (36)

  • Yu, Zhengtao, et al.

    Question classification based on co-training style semi-supervised learning

    Pattern Recognition Letters

    (2010)
  • Croce, Danilo, et al.

    Semantic convolution kernels over dependency trees

  • Dang, Hoa Trang, et al.

    Overview of the TREC 2007 question answering track

  • Dang, Hoa Trang, et al.

    Overview of the TREC 2006 question answering track

  • Figueroa, Alejandro G., et al.

    Genetic Algorithms for data-driven web question answering

    Evolutionary Computation

    (2008)
  • Graff, David (2002). The AQUAINT Corpus of English News Text LDC2002T31. Web Download. Philadelphia: Linguistic Data...
  • Holland, J.H.

    Adaptation in Natural and Artificial Systems

    (1975)
  • Indurkhya, Nitin, et al.

    Handbook of Natural Language Processing

    (2010)