ABSTRACT
Software effort estimation is an important task within software engineering. It is widely used to plan and monitor software project development so that the product is delivered on time and within budget. Several approaches for generating predictive models from collected metrics have been proposed over the years. Machine learning algorithms, in particular, have been widely employed for this task because they can provide accurate predictive models for project stakeholders to analyze. In this paper, we propose a grammatical evolution approach for software metrics estimation. Our novel algorithm, named SEEGE, is empirically evaluated on public project data sets, and we compare its performance with state-of-the-art machine learning algorithms such as support vector machines for regression and artificial neural networks, as well as with the popular linear regression. Results show that SEEGE outperforms the other algorithms on three different evaluation measures, indicating its effectiveness for the effort estimation task.
Index Terms
- A grammatical evolution approach for software effort estimation