Abstract
Most approaches for the extraction of association rules look for associations in a dataset that takes the form of a single table. However, with the growing interest in the storage of information, relational databases comprising a series of relations (tables) and relationships have become essential. We present the first grammar-guided genetic programming approach for mining association rules directly from relational databases. We represent relational databases as trees by means of genetic programming, preserving the original database structure and enabling rules to be defined in an expressive and very flexible way. The proposed model deals with both positive and negative items, and with both discrete and quantitative attributes. We exemplify the utility of the proposed approach with an artificially generated database having different characteristics. We also analyse a real case study, discovering interesting student behaviors from a Moodle database.
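As a rough illustration of the kind of rule described above, the following sketch evaluates one hand-written rule that mixes a discrete item, a negated quantitative item, and a consequent drawn from a second table. It is not the chapter's algorithm: the tables, column names, and the rule itself are hypothetical, and the sketch materializes a join for simplicity, whereas the proposed approach mines the relational structure directly. Support and confidence follow their standard definitions, here counted over the joined rows.

```python
from fractions import Fraction

# Hypothetical toy relational database: two tables linked by student_id.
students = [
    {"student_id": 1, "course": "math"},
    {"student_id": 2, "course": "math"},
    {"student_id": 3, "course": "physics"},
    {"student_id": 4, "course": "math"},
]
results = [
    {"student_id": 1, "forum_posts": 7, "grade": "pass"},
    {"student_id": 2, "forum_posts": 1, "grade": "fail"},
    {"student_id": 3, "forum_posts": 0, "grade": "fail"},
    {"student_id": 4, "forum_posts": 2, "grade": "fail"},
]


def join(left, right, key):
    """Inner-join two lists of dict rows on a shared key column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def antecedent(row):
    # Discrete item AND a negated (negative) quantitative item.
    return row["course"] == "math" and not (row["forum_posts"] >= 5)


def consequent(row):
    return row["grade"] == "fail"


def support_confidence(rows, antecedent, consequent):
    """Return (support, confidence) of antecedent -> consequent over rows."""
    n = len(rows)
    n_a = sum(1 for r in rows if antecedent(r))
    n_ac = sum(1 for r in rows if antecedent(r) and consequent(r))
    support = Fraction(n_ac, n) if n else Fraction(0)
    confidence = Fraction(n_ac, n_a) if n_a else Fraction(0)
    return support, confidence


rows = join(students, results, "student_id")
support, confidence = support_confidence(rows, antecedent, consequent)
print(f"support={support}, confidence={confidence}")
```

On this toy data, two of the four joined rows satisfy both sides of the rule, and both rows that satisfy the antecedent also satisfy the consequent. In a one-to-many schema a join would repeat parent rows, so counts taken over the joined table would no longer correspond to counts over the original entities.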
Notes
1. This synthetic relational database is available for download at http://www.uco.es/grupos/kdis/kdiswiki/index.php/Mining_Association_Rules_in_Relational_Databases.
Acknowledgements
This research was supported by the Spanish Ministry of Science and Technology, project TIN-2011-22408, and by FEDER funds. This research was also supported by the Spanish Ministry of Education under FPU grants AP2010-0041 and AP2010-0042.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Luna, J.M., Cano, A., Ventura, S. (2015). Genetic Programming for Mining Association Rules in Relational Database Environments. In: Gandomi, A., Alavi, A., Ryan, C. (eds) Handbook of Genetic Programming Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-20883-1_17
DOI: https://doi.org/10.1007/978-3-319-20883-1_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20882-4
Online ISBN: 978-3-319-20883-1
eBook Packages: Computer Science (R0)