Speeding up multiple instance learning classification rules on GPUs

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Multiple instance learning is a challenging task in supervised learning and data mining, and algorithms for it become slow when learning from large-scale and high-dimensional data sets. Graphics processing units (GPUs) are increasingly used to reduce the computing time of such algorithms. This paper presents a GPU implementation of the G3P-MI algorithm, which solves multiple instance problems using classification rules. The proposed GPU model can be distributed across multiple GPUs so that it scales to large-scale and high-dimensional data sets. The proposal is compared with a multi-threaded CPU implementation that uses Streaming SIMD Extensions (SSE) parallelism over a series of data sets. Experimental results show that computation time is significantly reduced and scalability is improved. Specifically, a speedup of up to 149\(\times\) is achieved over the multi-threaded CPU algorithm when using four GPUs, and the rule interpreter runs over 108 billion genetic programming operations per second.
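To make the idea of evaluating a classification rule over multiple instance data concrete, the following is a minimal CPU sketch in Python, not the paper's GPU kernels. It assumes the standard multiple instance hypothesis (a bag is positive if at least one of its instances satisfies the rule) and a postfix rule encoding evaluated with a stack. The abstract specifies neither, and the names `eval_rule`, `classify_bag`, and `confusion_matrix` are hypothetical.

```python
from typing import List, Sequence, Tuple, Union

# Hypothetical rule encoding (an assumption, not taken from the paper):
# a rule is a postfix token list. A condition is (op, attribute_index, threshold)
# with op in {">", "<="}; "AND", "OR", "NOT" combine boolean results.
Token = Union[Tuple[str, int, float], str]
Instance = Sequence[float]
Bag = Sequence[Instance]


def eval_rule(rule: Sequence[Token], instance: Instance) -> bool:
    """Evaluate a postfix rule on one instance with a boolean stack."""
    stack: List[bool] = []
    for tok in rule:
        if tok == "AND":
            b, a = stack.pop(), stack.pop()
            stack.append(a and b)
        elif tok == "OR":
            b, a = stack.pop(), stack.pop()
            stack.append(a or b)
        elif tok == "NOT":
            stack.append(not stack.pop())
        else:
            op, attr, threshold = tok
            value = instance[attr]
            stack.append(value > threshold if op == ">" else value <= threshold)
    return stack.pop()


def classify_bag(rule: Sequence[Token], bag: Bag) -> bool:
    """Standard MI assumption: the bag is positive if any instance matches."""
    return any(eval_rule(rule, inst) for inst in bag)


def confusion_matrix(rule: Sequence[Token], bags: Sequence[Bag],
                     labels: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for the rule over all labeled bags."""
    tp = fp = tn = fn = 0
    for bag, label in zip(bags, labels):
        predicted = classify_bag(rule, bag)
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and not label:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Toy data: two bags of two-attribute instances (illustrative values only).
rule = [(">", 0, 0.5), ("<=", 1, 2.0), "AND"]  # attr0 > 0.5 AND attr1 <= 2.0
bags = [
    [(0.1, 1.0), (0.9, 1.5)],  # second instance satisfies the rule
    [(0.2, 3.0), (0.6, 2.5)],  # no instance satisfies the rule
]
labels = [True, False]
predictions = [classify_bag(rule, bag) for bag in bags]
counts = confusion_matrix(rule, bags, labels)
```

Each rule-instance evaluation in this sketch is independent of the others, which is the kind of workload that can be spread over many GPU threads; how the paper actually maps rules, instances, and bags to threads is not stated in the abstract.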


Notes

  1. The GPU kernel code and the data sets are publicly available, to facilitate replication of the experiments and future comparisons, at: http://www.uco.es/grupos/kdis/kdiswiki/MIL-GPU.

Acknowledgments

This work was supported by the Regional Government of Andalusia and the Ministry of Science and Technology, Project TIN-2011-22408, and FEDER funds. This research was also supported by the Spanish Ministry of Education under FPU Grant AP2010-0042.

Author information

Corresponding author

Correspondence to Sebastián Ventura.

Cite this article

Cano, A., Zafra, A. & Ventura, S. Speeding up multiple instance learning classification rules on GPUs. Knowl Inf Syst 44, 127–145 (2015). https://doi.org/10.1007/s10115-014-0752-0

  • DOI: https://doi.org/10.1007/s10115-014-0752-0
