Using Genetic Programming to Improve Software Effort Estimation Based on General Data Sets

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2724)

Abstract

This paper investigates the use of several techniques, including genetic programming (GP), with public data sets to model and hence estimate software project effort. The main research question is whether genetic programs can offer a ‘better’ solution search using public domain metrics rather than company-specific ones. Unlike most previous research, a realistic approach is taken, whereby predictions are made on the basis of the data available at a given date. Experiments are reported that are designed to assess the accuracy of estimates made using data from within and beyond a specific company. This research also offers insights into genetic programming’s performance, relative to alternative methods, as a problem solver in this domain. The results do not identify a clear winner: for this data, GP performs consistently well but is harder to configure and produces more complex models. The evidence here agrees with other researchers’ findings that companies would do well to base estimates on in-house data rather than incorporating public data sets. The complexity of GP must be weighed against the small increases in accuracy when deciding whether to use it as part of any effort estimation process.


Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lefley, M., Shepperd, M.J. (2003). Using Genetic Programming to Improve Software Effort Estimation Based on General Data Sets. In: Cantú-Paz, E., et al. Genetic and Evolutionary Computation — GECCO 2003. GECCO 2003. Lecture Notes in Computer Science, vol 2724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45110-2_151

  • DOI: https://doi.org/10.1007/3-540-45110-2_151

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40603-7

  • Online ISBN: 978-3-540-45110-5

  • eBook Packages: Springer Book Archive
