Multi-gene Genetic Programming Based Defect-Ranking Software Modules

Guo, Junxia; Duan, Yingying; Shang, Ying

doi:10.1007/978-981-15-0310-8_4

Part of the book series: Communications in Computer and Information Science (CCIS, volume 861)

Abstract

Most software defect prediction models aim to predict the number of defects in a given software system. However, it is very difficult to predict the precise number of defects in a module because of noisy data. Another frequently used approach is to rank the software modules by their relative number of defects, so that defect prediction can guide testers to allocate limited resources preferentially to the modules with the most defects. Owing to redundant metrics in software defect data sets, researchers typically need to reduce the dimensionality of the metrics before constructing defect prediction models. However, dimensionality reduction may discard useful information too early and consequently degrade the performance of the prediction model. In this paper, we propose an approach that uses multi-gene genetic programming (MGGP) to build a defect ranking model. We compared the MGGP-based model with other optimized methods on 11 publicly available defect data sets drawn from several software systems. The fault-percentile-average (FPA) is used to evaluate the performance of MGGP and the other methods. The results show that, for the different test objects, models built with the MGGP approach perform better than those based on other nonlinear prediction approaches when constructing the defect ranking. In addition, the correlation between the software metrics does not affect the prediction performance. This means that, with the MGGP method, the original features can be used to construct a prediction model without considering the influence of correlation between software module features.
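
The abstract names FPA as the evaluation measure but does not define it. The sketch below assumes the definition commonly used in the defect-ranking literature: modules are ordered by predicted defect count, and FPA is the average, over every cutoff m, of the share of actual defects found in the top m modules. The function name and the sample data are illustrative only and are not taken from the paper.

```python
from typing import Sequence


def fault_percentile_average(actual: Sequence[float],
                             predicted: Sequence[float]) -> float:
    """Return the FPA of a ranking (assumed definition, see lead-in).

    actual[i]    -- observed defect count of module i
    predicted[i] -- model score used to rank module i (higher = more defects)
    """
    k = len(actual)
    if k == 0 or k != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    total = sum(actual)
    if total <= 0:
        raise ValueError("FPA is undefined when there are no actual defects")

    # Rank modules from highest to lowest predicted defect count.
    order = sorted(range(k), key=lambda i: predicted[i], reverse=True)

    cumulative = 0.0   # actual defects covered by the top-m modules so far
    fpa_sum = 0.0      # running sum of the covered share for m = 1..k
    for i in order:
        cumulative += actual[i]
        fpa_sum += cumulative / total
    return fpa_sum / k


if __name__ == "__main__":
    # Hypothetical data: three modules with 0, 1, and 3 actual defects.
    actual = [0, 1, 3]
    good_scores = [0.1, 0.5, 2.0]   # ranks the most defective module first
    bad_scores = [2.0, 0.5, 0.1]    # ranks the defect-free module first
    print(fault_percentile_average(actual, good_scores))
    print(fault_percentile_average(actual, bad_scores))
```

Under this definition a higher FPA means the defective modules appear earlier in the ranking, which is why only the order of the predictions matters and not their exact values.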

Acknowledgment

The work described in this paper is supported by the National Natural Science Foundation of China under Grant Nos. 61702029, 61872026, and 61672085.

Author information

Authors and Affiliations

College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China
Junxia Guo, Yingying Duan & Ying Shang

Corresponding author

Correspondence to Ying Shang.

Editor information

Editors and Affiliations

Beijing University of Chemical Technology, Beijing, China
Zheng Li
Beijing Institute of Technology, Beijing, China
He Jiang
Peking University, Beijing, China
Ge Li
Peking University, Beijing, China
Minghui Zhou
Nanjing University, Nanjing, China
Ming Li

About this paper

Cite this paper

Guo, J., Duan, Y., Shang, Y. (2019). Multi-gene Genetic Programming Based Defect-Ranking Software Modules. In: Li, Z., Jiang, H., Li, G., Zhou, M., Li, M. (eds) Software Engineering and Methodology for Emerging Domains. NASAC 2017, NASAC 2018. Communications in Computer and Information Science, vol 861. Springer, Singapore. https://doi.org/10.1007/978-981-15-0310-8_4

DOI: https://doi.org/10.1007/978-981-15-0310-8_4
Published: 12 September 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0309-2
Online ISBN: 978-981-15-0310-8
eBook Packages: Computer Science, Computer Science (R0)
