Transductive Transfer Learning in Genetic Programming for Document Classification

Fu, Wenlong; Xue, Bing; Zhang, Mengjie; Gao, Xiaoying

doi:10.1007/978-3-319-68759-9_45

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10593)

Included in the following conference series:

Asia-Pacific Conference on Simulated Evolution and Learning

Abstract

Document classification tasks generally involve sparse, high-dimensional features, so effective feature extraction is important. Different categories and different datasets in document classification often share similarities. A document classification task may also have no labelled training data. To obtain effective classifiers for such a task, this paper proposes a Genetic Programming (GP) system that uses transductive transfer learning. The proposed GP system automatically extracts features from different source domains, and these GP-extracted features are combined to form new classifiers that are applied directly to a target domain. Experimental results show that the proposed transductive transfer learning GP system can evolve features from source domains that apply effectively to target domains similar to those source domains.
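
The abstract gives no implementation details, so the following is only a minimal sketch of the transductive setting it describes, not the authors' system. It assumes binary labels, made-up term-count vectors, a hypothetical function set (add, subtract, multiply), and a random search that stands in for full GP (no crossover or mutation). One feature is evolved per labelled source domain, and the evolved features are combined by majority vote and applied to an unlabelled target domain.

```python
import operator
import random

# Hypothetical function set for the toy expression trees (not taken from the paper).
FUNCS = [operator.add, operator.sub, operator.mul]


def random_tree(n_features, depth, rng):
    """Grow a random expression tree over the input features."""
    if depth == 0 or rng.random() < 0.3:
        return ("x", rng.randrange(n_features))  # terminal: index of one feature
    return (rng.choice(FUNCS),
            random_tree(n_features, depth - 1, rng),
            random_tree(n_features, depth - 1, rng))


def evaluate(tree, doc):
    """Compute the tree's output (the extracted feature) for one document."""
    if tree[0] == "x":
        return doc[tree[1]]
    func, left, right = tree
    return func(evaluate(left, doc), evaluate(right, doc))


def accuracy(tree, docs, labels):
    """Fitness: share of documents where 'output > 0' agrees with label 1."""
    hits = sum((evaluate(tree, d) > 0) == (y == 1) for d, y in zip(docs, labels))
    return hits / len(docs)


def evolve_feature(docs, labels, rng, trials=500):
    """Simplified stand-in for GP: random search that keeps the fittest tree."""
    n_features = len(docs[0])
    best = random_tree(n_features, 3, rng)
    best_fit = accuracy(best, docs, labels)
    for _ in range(trials):
        candidate = random_tree(n_features, 3, rng)
        fit = accuracy(candidate, docs, labels)
        if fit > best_fit:
            best, best_fit = candidate, fit
    return best


def predict(trees, doc):
    """Majority vote over the source-evolved features; uses no target labels."""
    votes = sum(evaluate(t, doc) > 0 for t in trees)
    return int(2 * votes > len(trees))


# Made-up term counts: three labelled source domains and one unlabelled target.
sources = [
    ([[3, 0, 1, 0], [2, 1, 0, 0], [0, 0, 2, 3], [1, 0, 3, 1]], [1, 1, 0, 0]),
    ([[0, 4, 0, 1], [1, 2, 1, 0], [0, 1, 0, 3], [0, 0, 4, 0]], [1, 1, 0, 0]),
    ([[2, 2, 0, 0], [3, 0, 0, 1], [1, 0, 2, 2], [0, 1, 3, 0]], [1, 1, 0, 0]),
]
target_docs = [[2, 2, 0, 1], [0, 1, 3, 2]]

rng = random.Random(0)  # fixed seed so repeated runs give the same trees
trees = [evolve_feature(docs, labels, rng) for docs, labels in sources]
predictions = [predict(trees, doc) for doc in target_docs]
print(predictions)
```

The target documents are only ever passed to `predict`, never to `evolve_feature`, which is what makes the setting transductive in this sketch: all supervision comes from the source domains. Using an odd number of source domains avoids tied votes.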

Author information

Authors and Affiliations

School of Engineering and Computer Science, Victoria University of Wellington, PO Box 600, Wellington, New Zealand
Wenlong Fu, Bing Xue, Mengjie Zhang & Xiaoying Gao

Corresponding author

Correspondence to Bing Xue.

Editor information

Editors and Affiliations

Southern University of Science and Technology, Shenzhen, China
Yuhui Shi
City University of Hong Kong, Hong Kong, Kowloon, Hong Kong
Kay Chen Tan
Victoria University of Wellington, Wellington, Wellington, New Zealand
Mengjie Zhang
Southern University of Science and Technology, Shenzhen, China
Ke Tang
RMIT University, Melbourne, Victoria, Australia
Xiaodong Li
City University of Hong Kong, Kowloon Tong, Hong Kong
Qingfu Zhang
Peking University, Beijing, China
Ying Tan
University of Leipzig, Leipzig, Germany
Martin Middendorf
University of Surrey, Guildford, Surrey, United Kingdom
Yaochu Jin

About this paper

Cite this paper

Fu, W., Xue, B., Zhang, M., Gao, X. (2017). Transductive Transfer Learning in Genetic Programming for Document Classification. In: Shi, Y., et al. Simulated Evolution and Learning. SEAL 2017. Lecture Notes in Computer Science, vol 10593. Springer, Cham. https://doi.org/10.1007/978-3-319-68759-9_45

DOI: https://doi.org/10.1007/978-3-319-68759-9_45
Published: 14 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68758-2
Online ISBN: 978-3-319-68759-9
eBook Packages: Computer Science, Computer Science (R0)
