Abstract
Multiple instance learning is a challenging task in supervised learning and data mining. However, algorithms become slow when learning from large-scale and high-dimensional data sets. Graphics processing units (GPUs) are increasingly used to reduce the computing time of algorithms. This paper presents an implementation of the G3P-MI algorithm on GPUs for solving multiple instance problems using classification rules. The proposed GPU model can be distributed across multiple GPUs so that it scales to large-scale and high-dimensional data sets. The proposal is compared with the multi-threaded CPU algorithm, which uses Streaming SIMD Extensions parallelism, over a series of data sets. Experimental results show that computation time is significantly reduced and scalability is improved. Specifically, a speedup of up to 149\(\times\) is achieved over the multi-threaded CPU algorithm when using four GPUs, and the rules interpreter runs over 108 billion genetic programming operations per second.
Notes
The GPU kernel code and the data sets are publicly available, to facilitate replication of the experiments and future comparisons, at: http://www.uco.es/grupos/kdis/kdiswiki/MIL-GPU.
Acknowledgments
This work was supported by the Regional Government of Andalusia and the Ministry of Science and Technology, Project TIN-2011-22408, and FEDER funds. This research was also supported by the Spanish Ministry of Education under FPU Grant AP2010-0042.
Cite this article
Cano, A., Zafra, A. & Ventura, S. Speeding up multiple instance learning classification rules on GPUs. Knowl Inf Syst 44, 127–145 (2015). https://doi.org/10.1007/s10115-014-0752-0
DOI: https://doi.org/10.1007/s10115-014-0752-0