Abstract
Automated generation of system-level tests for grammar-based systems requires the generation of complex and highly structured inputs, which must typically satisfy some formal grammar. In our previous work, we showed that genetic programming combined with probabilities learned from corpora gives significantly better results than the baseline (random) strategy. In this work, we extend that approach by introducing grammar annotations as an alternative to learned probabilities, to be used when finding and preparing the corpus required for learning is not affordable. Experiments carried out on six grammar-based systems of varying levels of complexity show that grammar annotations produce a higher number of valid sentences and achieve levels of coverage and fault detection similar to those of learned probabilities.
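To make the notion of generating sentences from a grammar with production probabilities concrete, the following is a minimal sketch, not the implementation evaluated in the article. The toy grammar, its probability values, the `generate` function, and the depth limit are all hypothetical choices made for illustration; in the learned-probability setting described above, the probabilities would be estimated from a corpus rather than written by hand.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of
# (probability, right-hand side) pairs. The probabilities are invented
# for illustration; they are not taken from the article.
GRAMMAR = {
    "expr": [
        (0.5, ["term"]),
        (0.3, ["expr", "+", "term"]),
        (0.2, ["(", "expr", ")"]),
    ],
    "term": [
        (0.7, ["num"]),
        (0.3, ["term", "*", "num"]),
    ],
    "num": [
        (0.5, ["0"]),
        (0.5, ["1"]),
    ],
}


def generate(symbol, rng, depth=0, max_depth=8):
    """Return a list of terminal tokens derived from `symbol`."""
    if symbol not in GRAMMAR:
        # Anything that is not a nonterminal is emitted as a terminal.
        return [symbol]
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth budget, pick the shortest production so that
        # the derivation is guaranteed to terminate for this grammar.
        rhs = min((r for _, r in productions), key=len)
    else:
        # Otherwise sample a production according to its probability.
        weights = [p for p, _ in productions]
        choices = [r for _, r in productions]
        rhs = rng.choices(choices, weights=weights, k=1)[0]
    tokens = []
    for child in rhs:
        tokens.extend(generate(child, rng, depth + 1, max_depth))
    return tokens


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(5):
        print(" ".join(generate("expr", rng)))
```

Giving the recursive productions lower probabilities than the terminating ones keeps the sampled sentences short; the depth limit is only a safety net for the rare derivations that keep recursing.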
Notes
io was written by Mark Johnson, see http://web.science.mq.edu.au/~mjohnson/Software.htm
http://www.mozilla.org/rhino (version 1.7R4)
counted by CLOC - http://cloc.sourceforge.net/
Additional information
Communicated by: Claire Le Goues and Shin Yoo
About this article
Cite this article
Kifetew, F.M., Tiella, R. & Tonella, P. Generating valid grammar-based test inputs by means of genetic programming and annotated grammars. Empir Software Eng 22, 928–961 (2017). https://doi.org/10.1007/s10664-015-9422-4
DOI: https://doi.org/10.1007/s10664-015-9422-4