Learning to rank: new approach with the layered multi-population genetic programming on click-through features

Keyhanipour, Amir Hosein; Moshiri, Behzad; Oroumchian, Farhad; Rahgozar, Maseud; Badie, Kambiz

doi:10.1007/s10710-016-9263-y

Published: 20 January 2016

Volume 17, pages 203–230, (2016)

Genetic Programming and Evolvable Machines

Amir Hosein Keyhanipour¹,
Behzad Moshiri¹,
Farhad Oroumchian²,
Maseud Rahgozar¹ &
Kambiz Badie³

Abstract

Users’ click-through data is a valuable source of information about the performance of Web search engines, but it is included in few datasets for learning to rank. In this paper, inspired by the click-through data model, a novel approach is proposed for extracting the implicit user feedback from evidence embedded in benchmarking datasets. This process outputs a set of new features, named click-through features. Generated click-through features are used in a layered multi-population genetic programming framework to find the best possible ranking functions. The layered multi-population genetic programming framework is fast and provides more extensive search capability compared to the traditional genetic programming approaches. The performance of the proposed ranking generation framework is investigated both in the presence and in the absence of explicit click-through data in the utilized benchmark datasets. The experimental results show that click-through features can be efficiently extracted in both cases but that more effective ranking functions result when click-through features are generated from benchmark datasets with explicit click-through data. In either case, the most noticeable ranking improvements are achieved at the tops of the provided ranked lists of results, which are highly targeted by the Web users.
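The layered multi-population idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that several populations evolve independently within a layer, and that the outputs of each population's best ranking function become the input features of the next layer, until a single population remains. The names `evolve`, `layered_gp`, and `pairwise_accuracy`, the mutation-only evolution loop, the pairwise fitness measure, and all parameter values are hypothetical choices made for illustration. The input rows stand in for click-through feature vectors.

```python
import random

# Binary operators available to the evolved ranking expressions (an assumption).
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}


def random_tree(rng, n_features, depth=3):
    """Build a random expression tree over feature indices and constants."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.8:
            return ("x", rng.randrange(n_features))
        return ("c", rng.uniform(-1.0, 1.0))
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, n_features, depth - 1),
            random_tree(rng, n_features, depth - 1))


def evaluate(tree, row):
    """Score one feature vector with an expression tree."""
    if tree[0] == "x":
        return row[tree[1]]
    if tree[0] == "c":
        return tree[1]
    return OPS[tree[0]](evaluate(tree[1], row), evaluate(tree[2], row))


def mutate(rng, tree, n_features):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if tree[0] in ("x", "c") or rng.random() < 0.3:
        return random_tree(rng, n_features, depth=2)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(rng, left, n_features), right)
    return (op, left, mutate(rng, right, n_features))


def pairwise_accuracy(scores, labels):
    """Fraction of document pairs with different labels that are ordered correctly."""
    correct = total = 0
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[i] > labels[j]:
                total += 1
                correct += scores[i] > scores[j]
    return correct / total if total else 0.0


def evolve(rng, rows, labels, pop_size=20, generations=15):
    """Evolve one population (truncation selection plus mutation); return its best tree."""
    n_features = len(rows[0])

    def fitness(tree):
        return pairwise_accuracy([evaluate(tree, r) for r in rows], labels)

    population = [random_tree(rng, n_features) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # the better half survives unchanged
        children = [mutate(rng, rng.choice(parents), n_features)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)


def layered_gp(rows, labels, layer_sizes=(4, 2, 1), seed=0):
    """Stack layers of populations; each layer's best outputs feed the next layer."""
    rng = random.Random(seed)
    layers, current = [], rows
    for n_populations in layer_sizes:  # the last layer must hold one population
        best = [evolve(rng, current, labels) for _ in range(n_populations)]
        layers.append(best)
        current = [[evaluate(t, r) for t in best] for r in current]

    def rank_score(row):
        for best in layers:
            row = [evaluate(t, row) for t in best]
        return row[0]

    return rank_score


if __name__ == "__main__":
    rng = random.Random(1)
    rows = [[rng.random() for _ in range(3)] for _ in range(30)]  # synthetic features
    labels = [int(2 * r[0] + r[1] > 1.5) for r in rows]           # synthetic relevance
    ranker = layered_gp(rows, labels)
    print(pairwise_accuracy([ranker(r) for r in rows], labels))
```

In this sketch each layer reduces the number of features to the number of populations in that layer, so later layers search a smaller space built from already useful combinations. A fuller treatment would add crossover, a held-out validation set, and a rank-aware fitness measure.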

Acknowledgments

This research was supported financially by the University of Tehran (Grant ID: 8101004/1/02). The authors thank the Editor-in-Chief, the Associate Editor, and three anonymous reviewers for their helpful comments and suggestions. The authors also give special thanks to Dr. Alireza Tavakoli Targhi and Ms. Maryam Piroozmand for their help and support.

Author information

Authors and Affiliations

Control and Intelligent Processing, Center of Excellence, School of ECE, University of Tehran, No. 5, Zolfaghary Alley, Shadmehr Street, SattrKhan Avenue, Tehran, Iran
Amir Hosein Keyhanipour, Behzad Moshiri & Maseud Rahgozar
Faculty of Engineering and Information Sciences, University of Wollongong in Dubai, Dubai, UAE
Farhad Oroumchian
Computer Society of Iran, Tehran, Iran
Kambiz Badie

Corresponding author

Correspondence to Amir Hosein Keyhanipour.

Appendices

Appendix 1

See Table 11.

Table 11 List of features in the LETOR4.0 benchmark dataset

Appendix 2

See Table 12.

Table 12 List of features in the WCL2R benchmark dataset

About this article

Cite this article

Keyhanipour, A.H., Moshiri, B., Oroumchian, F. et al. Learning to rank: new approach with the layered multi-population genetic programming on click-through features. Genet Program Evolvable Mach 17, 203–230 (2016). https://doi.org/10.1007/s10710-016-9263-y

Received: 12 March 2015
Revised: 31 December 2015
Published: 20 January 2016
Issue Date: September 2016
DOI: https://doi.org/10.1007/s10710-016-9263-y
