Opinion-Based Entity Ranking Using Learning to Rank
Introduction
With the growth of social web content on the Internet, people are increasingly likely to express their views and opinions. These opinions are important to individual users when making decisions. This trend affects more and more critical business processes such as customer support and satisfaction, brand and reputation management, and product design and marketing [34], [31], [48], [47]. It has also led to an evolution in the behavior of web users, who now increasingly read reviews or comments before purchasing products or services [25], [2], [18], [5]. There is now massive growth of opinions on the web, ranging from opinions on businesses and products to diseases and people. While these opinions are meant to be helpful, their vast number is overwhelming to users, as there is simply too much to read. For example, for popular products or hotels such as the iPhone, Marriott or Hilton, the number of opinions can reach hundreds or even thousands [31], [17]. The large number of opinions makes it difficult for a potential customer to read and understand them in a limited time and to make an informed decision on whether or not to purchase a product or service. Thus, there is a need to develop information retrieval techniques that help users exploit the available opinions.
Opinion-Based Entity Ranking (OpER) is an information retrieval task that automatically ranks entities on the basis of opinions [17], [7]. OpER directly ranks entities according to how well the opinions on them match the user's preferences. The idea is to represent each entity by the text of all the opinions expressed about it. Then, given a user's search query (whose keywords represent desired aspects of entities), OpER ranks the relevant entities based on how well the opinions on each entity (expressed by other users) match the user's search preferences. With such an automatic ranking system, the user does not need to read the large number of opinions available on all entities of a topic; instead, the user can focus on a much smaller set of relevant entities that appear at the top and roughly match his/her preferences with the judgments of other users. Further, this type of ranking is flexible in that it can be applied to any collection of entities for which opinions are available.
Previous information retrieval approaches to the OpER task determine the relevance (weights) of query keywords for a particular entity by treating all of its opinions as a single field of text, as is commonly done in regular information retrieval [17]. These weights are aggregated and normalized for each entity so that a final score can be assigned to the entity for a particular topic. This assumption ignores the relevance of query keywords in individual opinions and does not model weights according to the subjectivity (judgments) of those opinions. Without such a capability, an irrelevant entity may be ranked high simply because query keywords match strongly in a large number of negative opinions. As we show later in the paper, modeling the importance of query keywords in individual opinions significantly improves the ranking effectiveness of OpER. To do this, we propose a set of heuristically motivated ranking features. One subset of these features is based on standard document weighting schemes (such as TFIDF, BM25 and PL2), while another subset approximates the subjectivity of query keywords when calculating the relevance of entities. We call these features keyword-opinion features. We perform an effectiveness analysis of these keyword-opinion features to identify their correlation to relevance among the top-ranked retrieved entities. Although single features show significant effectiveness, further improvement is possible by combining them using a learning-to-rank approach [21], [15], [42]. Thus, we employ a machine learning approach to search for an optimal solution in the space of (keyword-opinion) feature combinations. After learning, we evaluate the effectiveness of the optimal solution over entity collections to analyze the extent to which it achieves a significant increase in effectiveness over single features.
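To make the contrast with whole-text matching concrete, the following is a minimal sketch of per-opinion keyword weighting, assuming plain TF-IDF as the weighting scheme and a precomputed subjectivity score per opinion. All function and variable names here are our own illustration, not the paper's actual feature definitions:

```python
import math

def opinion_tfidf(keyword, opinion_tokens, df, n_opinions):
    """TF-IDF weight of a query keyword within one opinion (standard formulation)."""
    tf = opinion_tokens.count(keyword)
    if tf == 0:
        return 0.0
    idf = math.log((n_opinions + 1) / (df.get(keyword, 0) + 1))
    return (1 + math.log(tf)) * idf

def entity_score(query_keywords, opinions, subjectivity, df, n_opinions):
    """Aggregate per-opinion keyword weights, each modulated by that opinion's
    subjectivity score (e.g. polarity in [-1, 1]), normalized by opinion count.
    A heavily matched but negative opinion thus lowers, not raises, the score."""
    total = 0.0
    for op_tokens, subj in zip(opinions, subjectivity):
        total += subj * sum(opinion_tfidf(k, op_tokens, df, n_opinions)
                            for k in query_keywords)
    return total / max(len(opinions), 1)
```

Under this scheme, an entity whose query keywords occur mostly in negative opinions receives a low score even if the raw keyword match is strong, which is precisely the failure mode of the single-field assumption.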
The remainder of this paper is structured as follows. Section 2 reviews related work on OpER and other related areas. Section 3 describes the architecture of our proposed approach and lists the keyword-opinion features that we employ for ranking entities. Section 4 describes the experimental setting, the collections, the query sets and the relevance judgments that we use to validate the effectiveness of our approach. Section 5 presents the effectiveness analysis of keyword-opinion features. In Section 6, we combine keyword-opinion features using a learning-to-rank approach to automatically evolve an effective retrieval model. Finally, Section 7 briefly summarizes the key lessons learned from this study.
Related work
Opinion-Based Entity Ranking (OpER) is a new task in information retrieval. We begin the related work discussion with a brief introduction to the OpER task and then discuss several lines of related work that are similar to this domain.
Opinion-Based Entity Ranking (OpER): Ganesan and Zhai [17] proposed a novel concept of ranking entities on the basis of opinions. OpER directly ranks entities on the basis of the user's search preferences that are given in the form of query keywords
Our approach
Fig. 1 explains the architecture of our OpER approach. The aim is to employ a machine learning approach to search through the space of retrieval models for the OpER task. Given training samples, the feature extraction component extracts keyword-opinion features from entities. These keyword-opinion features are then used to train a learning-to-rank system using genetic programming (GP) (see Section 6). The trained system is then used for ranking entities in response to a user's queries.
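The feature extraction step of this pipeline can be sketched as a function producing one feature vector per (query, entity) pair. The three features below are illustrative placeholders chosen by us, not the paper's actual feature set:

```python
def extract_features(query_keywords, entity_opinions):
    """Hypothetical keyword-opinion feature vector for one (query, entity) pair.
    entity_opinions is a list of token lists, one per opinion."""
    n = max(len(entity_opinions), 1)
    # f1: fraction of opinions containing at least one query keyword
    hit = sum(any(k in op for k in query_keywords) for op in entity_opinions)
    # f2: average per-opinion query-keyword frequency
    avg_tf = sum(op.count(k) for op in entity_opinions
                 for k in query_keywords) / n
    # f3: coverage of distinct query keywords across all opinions
    cov = sum(any(k in op for op in entity_opinions)
              for k in query_keywords) / max(len(query_keywords), 1)
    return [hit / n, avg_tf, cov]
```

Vectors of this form, paired with relevance judgments, are what the GP-based learner in Section 6 combines into a single ranking function.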
Experiments
Collections: For running experiments, we require a collection that contains opinions on entities, so that our system can return a set of relevant entities based on how well the opinions on these entities match a user's queries. For this purpose, we use two collections obtained from Ganesan and Zhai [17]. These collections contain opinions on hotels and cars. The reason for selecting these collections is that they have been frequently used in related work on opinion
Effectiveness analysis of keyword-opinion features
In this section, we analyze each single feature to test whether or not it provides an ample supply of relevant entities at top rank positions. For each query, we first order the retrieved entities by descending feature score and then examine their effectiveness using nDCG@10. Although effectiveness analysis can be performed with different combinations of features, due to the complexity of the problem and the potentially large number of combinations, it is difficult to perform an in-depth
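The per-feature evaluation step described above can be reproduced with a standard nDCG@10 computation over the relevance gains of the entities in ranked order (a sketch of the conventional formula, not code from the paper):

```python
import math

def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the first k gains in ranked order."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(ranked_gains, k=10):
    """nDCG@k: DCG of the system ranking divided by DCG of the ideal ranking."""
    idcg = dcg_at_k(sorted(ranked_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / idcg if idcg > 0 else 0.0
```

A feature that places highly relevant entities first yields nDCG@10 close to 1.0, which is the criterion used to compare single features here.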
GPrank: combining ranking features using learning to rank
Since the information represented by different search features is complementary, it is natural to combine features in a useful manner. Among the possible feature combination techniques, genetic programming (GP) based feature combination is widely used in IR for automatically evolving effective combinations of features. The use of GP in IR is not a new research idea. In the past few years, there have been several attempts at evolving effective retrieval models using GP [6], [43], [8], [12], [11],
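As a greatly simplified illustration of GP-based feature combination (truncation selection plus random reseeding, rather than the subtree crossover and mutation a full GP system would use, and not the paper's GPrank implementation), a ranking function can be represented as an expression tree over feature values and evolved against mean nDCG@10:

```python
import math
import operator
import random

OPS = [(operator.add, '+'), (operator.sub, '-'), (operator.mul, '*')]

def random_tree(depth, n_feats, rng):
    """Random expression tree: leaves are feature indices, internal nodes are ops."""
    if depth == 0 or rng.random() < 0.3:
        return ('feat', rng.randrange(n_feats))
    return ('op', rng.choice(OPS),
            random_tree(depth - 1, n_feats, rng),
            random_tree(depth - 1, n_feats, rng))

def evaluate(tree, feats):
    if tree[0] == 'feat':
        return feats[tree[1]]
    _, (fn, _), left, right = tree
    return fn(evaluate(left, feats), evaluate(right, feats))

def fitness(tree, queries):
    """Mean nDCG@10 over queries; each query is (feature_vectors, relevance_labels)."""
    total = 0.0
    for fvs, rels in queries:
        scores = [evaluate(tree, f) for f in fvs]
        order = sorted(range(len(fvs)), key=lambda i: -scores[i])
        gains = [rels[i] for i in order]
        ideal = sorted(rels, reverse=True)
        dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains[:10]))
        idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal[:10]))
        total += dcg / idcg if idcg else 0.0
    return total / len(queries)

def evolve(queries, n_feats, pop=30, gens=20, seed=0):
    """Keep the fitter half each generation, refill with fresh random trees."""
    rng = random.Random(seed)
    population = [random_tree(3, n_feats, rng) for _ in range(pop)]
    for _ in range(gens):
        survivors = sorted(population, key=lambda t: -fitness(t, queries))[:pop // 2]
        fresh = [random_tree(3, n_feats, rng) for _ in range(pop - len(survivors))]
        population = survivors + fresh
    return max(population, key=lambda t: fitness(t, queries))
```

The point of the sketch is the shape of the search: candidate retrieval models are expression trees over keyword-opinion features, and retrieval effectiveness itself is the fitness function that drives selection.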
Conclusion
In this paper we address the Opinion-Based Entity Ranking (OpER) task. OpER ranks entities based on how well the opinions on them match given user queries. Previous research on OpER determines the relevance of query keywords for a particular entity by treating all of its opinions as a single field of text (as commonly done in regular information retrieval). This assumption ignores the relevance of individual opinions, and such a system might rank irrelevant entities at top ranked
References (48)
- et al., A review on the application of evolutionary computation to information retrieval, Int. J. Approx. Reason. (2003)
- et al., A generic ranking function discovery framework by genetic programming for information retrieval, Inf. Process. Manage. J. (2004)
- Crossover improvement for the genetic algorithm in information retrieval, Inf. Process. Manage. J. (1998)
- et al., Sentiment classification of online reviews to travel destinations by supervised machine learning approaches, Expert Syst. Appl. (2009)
- et al., Probabilistic models of information retrieval based on measuring the divergence from randomness, ACM Trans. Inf. Syst. (2002)
- et al., Multi-facet rating of product reviews
- et al., SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining
- et al., Expertise retrieval, Found. Trends Inf. Retr. (2012)
- et al., Learning sentiments from tweets with personal health information
- Machine learning for information retrieval: neural networks, symbolic learning, and genetic algorithms, J. Am. Soc. Inf. Sci. Technol. (1995)