Abstract
Can the execution of software be perturbed without breaking the correctness of the output? In this paper, we devise a protocol to answer this question from a novel perspective. In an experimental study on ten subject programs, we observe that many perturbations do not break the correctness of the output. We call this phenomenon “correctness attraction”. The uniqueness of this protocol is that it considers a systematic exploration of the perturbation space as well as perfect oracles to determine the correctness of the output. As a result, our findings on the stability of software under execution perturbations have a level of validity that has never been reported before in the scarce related work. A qualitative manual analysis enables us to set up the first taxonomy of the reasons behind correctness attraction.
Notes
| is the bitwise or operator. >> is the binary right shift operator. The assignment |= computes the bitwise or of the left operand and the right operand, then assigns the result to the left operand.
In our experiments, we implement this transformation on Java programs using the Spoon transformation library (Pawlak et al. 2015).
We note that the oracles for the programs laguerre and linreg can be considered as approximate computing; however, the error margin we accept is very low (10^{-6}).
Version 3.6.1: https://frama.link/tQCYrZ2W.
Version 3.8.0: https://frama.link/fCjiqzk2.
References
Barr E, Harman M, McMinn P, Shahbaz M, Yoo S (2015) The oracle problem in software testing: a survey. IEEE Trans Softw Eng 41(5):507–525
Baudry B, Monperrus M (2015) The multiple facets of software diversity: recent developments in year 2000 and beyond. ACM Comput Surv 1–26
Dijkstra EW (1988) On the cruelty of really teaching computing science
Eggert PR, Parker DS (2005) Perturbing and evaluating numerical programs without recompilation—the Wonglediff way. Softw Pract Exper 35(4):313–322
Khoo WM (2013) Decompilation as search. University of Cambridge, PhD thesis
Li X, Yeung D (2007) Application-level correctness and its impact on fault tolerance. In: 2007 IEEE 13th International symposium on high performance computer architecture, pp 181–192
Mittal S (2016) A survey of techniques for approximate computing. ACM Comput Surv 48(4):62,1–62,33
Morell L, Murrill B, Rand R (1997) Perturbation analysis of computer programs. In: Proceedings of the 12th annual conference on computer assurance, COMPASS ’97: Are we making progress towards computer assurance?, pp 77–87
Pawlak R, Monperrus M, Petitprez N, Noguera C, Seinturier L (2015) Spoon: a library for implementing analyses and transformations of Java source code. Softw Pract Exper 46:1155–1179
Rinard M, Cadar C, Nguyen HH (2005) Exploring the acceptability envelope. In: Companion to the 20th annual ACM SIGPLAN conference on object-oriented programming, systems, languages, and applications, OOPSLA ’05. ACM, New York
Roy P, Ray R, Wang C, Wong WF (2014) ASAC: automatic sensitivity analysis for approximate computing. SIGPLAN Not 49(5):95–104
Sedgewick R (1978) Implementing quicksort programs. Commun ACM 21(10):847–857
Tallam S, Tian C, Gupta R, Zhang X (2008) Avoiding program failures through safe execution perturbations. In: Proceedings of the 2008 32nd annual IEEE international computer software and applications conference, COMPSAC ’08. IEEE Computer Society, Washington, DC, pp 152–159
Tang E, Barr E, Li X, Su Z (2010) Perturbing numerical calculations for statistical analysis of floating-point program (in)stability. In: Proceedings of the 19th International symposium on software testing and analysis, ISSTA ’10. ACM, New York, pp 131–142
Wang N, Fertig M, Patel S (2003) Y-branches: when you come to a fork in the road, take it. In: 12th International conference on parallel architectures and compilation techniques, pp 56–66
Welch TA (1984) A technique for high-performance data compression. Computer 17(6):8–19
Acknowledgments
This work was partially supported by the EU Project STAMP ICT-16-10 No.731529, CPER Nord-Pas de Calais/FEDER DATA Advanced data science and technologies 2015-2020, and the French Ministry of Higher Education and Research. We also wish to acknowledge the continual support of Inria, and PP acknowledges the stimulating environment provided by the SequeL Inria project-team.
Additional information
Communicated by: Atif Memon
Appendix: Experiment Subject
A.1 Overview
Table 9 gives an overview of the considered benchmark. The first column is the name used to refer to the subject, and the second column gives the number of lines of code (LOC) of the program. The third, fourth and fifth columns give, respectively, the number of integer perturbation points, the number of Boolean perturbation points and the total number of perturbation points for each subject. The last column gives a brief description of the computation considered.
A.2 Quicksort
Quicksort is a sorting algorithm. We consider an implementation of the Quicksort algorithm in Java. The original code is available at https://frama.link/XGMArl34. A live demo is available at https://danglotb.github.io/resources/correctness-attraction/live-demo.html
Correctness Oracle: The oracle checks that the array is correctly sorted, checks that each element of the input is also in the output, and checks that no element that is not present in the input is in the output.
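The three checks above can be sketched in a few lines of Java. This is only an illustration under our own assumptions: the class and method names are hypothetical, and the sketch compares against a copy of the input sorted by the JDK, which is slightly stricter than the description because it also checks that duplicates keep their multiplicity.

```java
import java.util.Arrays;

// Hypothetical sketch of a perfect oracle for a sorting routine.
// Not the code used in the study; names are illustrative only.
final class SortOracle {

    // Returns true when output is in non-decreasing order and contains
    // exactly the elements of input (same values, same multiplicities).
    static boolean isCorrect(int[] input, int[] output) {
        if (input.length != output.length) {
            return false;
        }
        // Check 1: the output is sorted.
        for (int i = 1; i < output.length; i++) {
            if (output[i - 1] > output[i]) {
                return false;
            }
        }
        // Checks 2 and 3: no element was lost or invented. Sorting a copy
        // of the input with the JDK gives a trusted reference to compare to.
        int[] reference = input.clone();
        Arrays.sort(reference);
        return Arrays.equals(reference, output);
    }
}
```

A perturbed run would then be classified as correct when `SortOracle.isCorrect(input, perturbedOutput)` returns true.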
A.3 Zip
Lempel-Ziv-Welch (LZW) (Welch 1984) is a lossless data compression algorithm. We use it to compress/uncompress strings. The implementation comes from Rosetta Code (Footnote 4), with 1 class and 2 methods: one method to compress, and the other to uncompress. The implementation has 6 Boolean perturbation points and 19 numerical perturbation points spread over 56 lines of code.
Correctness Oracle: The scenario is to uncompress the compressed input string. The perfect oracle asserts that the output string is the same as the input string.
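To make the round-trip scenario concrete, here is a minimal LZW sketch in Java together with the corresponding oracle. It is written for illustration and is not the Rosetta Code implementation used in the study; the class name, method names and the assumption that every input character has a code below 256 are ours.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical LZW sketch; assumes every input character is below 256.
final class LzwRoundTrip {

    static List<Integer> compress(String text) {
        Map<String, Integer> dict = new HashMap<>();
        for (int i = 0; i < 256; i++) {
            dict.put(String.valueOf((char) i), i);
        }
        List<Integer> out = new ArrayList<>();
        String w = "";
        for (char c : text.toCharArray()) {
            String wc = w + c;
            if (dict.containsKey(wc)) {
                w = wc;                      // keep extending the current phrase
            } else {
                out.add(dict.get(w));        // emit the longest known phrase
                dict.put(wc, dict.size());   // learn the new phrase
                w = String.valueOf(c);
            }
        }
        if (!w.isEmpty()) {
            out.add(dict.get(w));
        }
        return out;
    }

    static String decompress(List<Integer> codes) {
        if (codes.isEmpty()) {
            return "";
        }
        Map<Integer, String> dict = new HashMap<>();
        for (int i = 0; i < 256; i++) {
            dict.put(i, String.valueOf((char) i));
        }
        String w = dict.get(codes.get(0));
        StringBuilder out = new StringBuilder(w);
        for (int k : codes.subList(1, codes.size())) {
            String entry;
            if (dict.containsKey(k)) {
                entry = dict.get(k);
            } else if (k == dict.size()) {
                entry = w + w.charAt(0);     // phrase defined by the code itself
            } else {
                throw new IllegalArgumentException("Bad code: " + k);
            }
            out.append(entry);
            dict.put(dict.size(), w + entry.charAt(0));
            w = entry;
        }
        return out.toString();
    }

    // Perfect oracle: uncompressing the compressed input must give the input back.
    static boolean oracle(String input) {
        return decompress(compress(input)).equals(input);
    }
}
```

In a perturbation experiment, the perturbed `compress` or `decompress` would be plugged into `oracle`; an exception or a differing string counts as a broken execution.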
A.4 Sudoku
We consider a Sudoku solver taken from Rosetta Code. We input a randomly generated grid in which some cells are already filled in with values. There is 1 class of 87 lines of code, containing 89 numerical perturbation points and 26 Boolean perturbation points.
Correctness Oracle: The oracle asserts that all Sudoku constraints are satisfied: all cells are filled and valid, and all cells already in the input problem remain unchanged.
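A possible shape for such an oracle is sketched below. The representation is an assumption on our side: grids are 9x9 `int` arrays and `0` marks an empty cell in the input puzzle; the class and method names are hypothetical.

```java
// Hypothetical sketch of a Sudoku oracle; grids are 9x9, 0 marks an empty puzzle cell.
final class SudokuOracle {

    static boolean isCorrect(int[][] puzzle, int[][] solution) {
        // Every cell must be filled with 1..9 and must preserve the given clues.
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 9; c++) {
                int v = solution[r][c];
                if (v < 1 || v > 9) {
                    return false;
                }
                if (puzzle[r][c] != 0 && puzzle[r][c] != v) {
                    return false;
                }
            }
        }
        // Every row, column and 3x3 box must contain each value at most once.
        for (int i = 0; i < 9; i++) {
            boolean[] row = new boolean[10];
            boolean[] col = new boolean[10];
            boolean[] box = new boolean[10];
            for (int j = 0; j < 9; j++) {
                int rv = solution[i][j];
                int cv = solution[j][i];
                int bv = solution[3 * (i / 3) + j / 3][3 * (i % 3) + j % 3];
                if (row[rv] || col[cv] || box[bv]) {
                    return false;
                }
                row[rv] = true;
                col[cv] = true;
                box[bv] = true;
            }
        }
        return true;
    }
}
```

Because each unit holds nine values in the range 1 to 9 with no repetition, the check also guarantees that every value appears exactly once per row, column and box.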
A.5 MD5
The Message Digest 5 (MD5) algorithm is used to hash a string of a given size. We take the implementation from Rosetta Code. There is 1 class with 1 method, and 91 lines of code. We find 164 numerical perturbation points and 11 Boolean perturbation points.
Correctness Oracle: The oracle is that the hash is the same as the one from the reference implementation.
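As an illustration, a reference-based oracle of this kind could look as follows. We assume here that the reference is the JDK's `java.security.MessageDigest`; the text does not say which reference implementation was actually used, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical sketch: the JDK digest plays the role of the reference implementation.
final class Md5Oracle {

    // Digest of the UTF-8 bytes of the message, computed by the JDK.
    static byte[] referenceDigest(String message) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return md5.digest(message.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // The candidate digest is correct only if it matches the reference byte for byte.
    static boolean isCorrect(String message, byte[] candidateDigest) {
        return Arrays.equals(referenceDigest(message), candidateDigest);
    }

    // Lowercase hexadecimal rendering, convenient for logging.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

Since a hash has no tolerance, a single differing byte makes the perturbed execution incorrect.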
A.6 RSA
The RSA cryptosystem was designed by Ron Rivest, Adi Shamir, and Leonard Adleman. This implementation is a real, production-ready one taken from bouncy-castle (Footnotes 5 and 6). The project is composed of 1494 classes with a total of 241483 lines of code. We studied the RSACoreEngine class, which has 6 methods with 203 lines of code, 73 numerical perturbation points and 19 Boolean perturbation points. Many integer points are BigInteger Java objects, which we perturb appropriately. The considered inputs are random strings of 64 bytes.
Correctness Oracle: The considered scenario is decrypt(crypt(x)). The oracle asserts that the decrypted string is the same as the input string.
A.7 RC4
RC4 is a stream cipher designed by Ron Rivest. This algorithm is fast and simple, yet not secure according to today’s standards. We use BouncyCastle’s class RC4CoreEngine, which has 150 lines of code, with 7 Boolean perturbation points and 112 integer perturbation points.
Correctness Oracle: The considered scenario is decrypt(crypt(x)). The oracle asserts that the decrypted string is the same as the input string.
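The decrypt(crypt(x)) scenario can be sketched with a compact, textbook RC4. This is not BouncyCastle's RC4CoreEngine; the class and method names are hypothetical, and the sketch assumes a non-empty key. Because RC4 encrypts by XOR with a keystream, the same routine serves for both directions.

```java
import java.util.Arrays;

// Hypothetical textbook RC4 for illustration; assumes key.length > 0.
final class Rc4RoundTrip {

    // Encrypts or decrypts data with key (the operation is its own inverse).
    static byte[] apply(byte[] key, byte[] data) {
        // Key-scheduling: build the initial permutation of 0..255 from the key.
        int[] s = new int[256];
        for (int k = 0; k < 256; k++) {
            s[k] = k;
        }
        for (int k = 0, m = 0; k < 256; k++) {
            m = (m + s[k] + (key[k % key.length] & 0xff)) & 0xff;
            int t = s[k];
            s[k] = s[m];
            s[m] = t;
        }
        // Keystream generation, XORed with the data one byte at a time.
        byte[] out = new byte[data.length];
        int i = 0;
        int j = 0;
        for (int n = 0; n < data.length; n++) {
            i = (i + 1) & 0xff;
            j = (j + s[i]) & 0xff;
            int t = s[i];
            s[i] = s[j];
            s[j] = t;
            out[n] = (byte) (data[n] ^ s[(s[i] + s[j]) & 0xff]);
        }
        return out;
    }

    // Perfect oracle: decrypting the ciphertext must give back the plaintext.
    static boolean oracle(byte[] key, byte[] plaintext) {
        return Arrays.equals(apply(key, apply(key, plaintext)), plaintext);
    }
}
```

Note that a round-trip oracle only observes the final plaintext: a perturbation that alters the keystream identically in both directions would go unnoticed unless encryption and decryption are perturbed separately.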
A.8 Canny
A Canny filter is an edge detector for images. We use the implementation of Tom Gibara (Footnote 7). There is 1 class with 568 lines of code, with 450 integer perturbation points and 79 Boolean perturbation points.
Correctness Oracle: The oracle asserts that the detected edges are accurate to the pixel with regard to the result of an unperturbed reference run.
A.9 LCS
We consider the Longest Common Subsequence problem, implemented using dynamic programming (Footnote 8). As input, we use real RNA sequences of two plants, sativa and thaliana, extracted from the mature dataset of miRBase (Footnote 9). This implementation has 43 lines, with 9 Boolean perturbation points and 79 integer perturbation points.
Correctness Oracle: The oracle is that the output is the same as the one of the reference unperturbed implementation.
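For readers unfamiliar with the dynamic-programming formulation, the sketch below computes the length of a longest common subsequence. It is a generic textbook version written for illustration, not the implementation studied here, which may return the subsequence itself; the class and method names are hypothetical.

```java
// Hypothetical textbook sketch of the LCS dynamic program (length only).
final class LcsSketch {

    // dp[i][j] holds the LCS length of the first i chars of a and the first j chars of b.
    static int length(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    // Matching characters extend the best subsequence of both prefixes.
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    // Otherwise drop one character from either string.
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.length()][b.length()];
    }

    // Reference-based oracle: the perturbed result must equal the unperturbed one.
    static boolean isCorrect(String a, String b, int perturbedResult) {
        return perturbedResult == length(a, b);
    }
}
```

The loop indices and table accesses in such a program are typical integer perturbation points, and the character comparison is a typical Boolean one.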
A.10 Laguerre
Laguerre is a numerical analysis program which computes the roots of a polynomial equation. The implementation comes from the Apache Commons Mathematics Library (Footnote 10). The class under study is “LaguerreSolver”, which is 440 lines long and has 176 integer perturbation points and 25 Boolean perturbation points.
Correctness Oracle: The oracle checks whether the computed solution actually nullifies the equation. Because the computation acts on floating-point numbers, we accept the solution if the polynomial evaluated at that point is within ±10^{-6} of zero.
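A minimal sketch of this residual check is given below. It is restricted to real coefficients and real roots, which is a simplification on our side; the class name, method names and the coefficient ordering (index i multiplies x^i) are assumptions made for the example.

```java
// Hypothetical sketch of a residual-based oracle for a real polynomial root.
final class RootOracle {

    // Error margin stated in the text.
    static final double TOLERANCE = 1e-6;

    // Evaluates the polynomial with Horner's scheme; coefficients[i] multiplies x^i.
    static double evaluate(double[] coefficients, double x) {
        double result = 0.0;
        for (int i = coefficients.length - 1; i >= 0; i--) {
            result = result * x + coefficients[i];
        }
        return result;
    }

    // Accepts the candidate root when the polynomial is close enough to zero there.
    // A NaN residual fails the comparison and is therefore rejected.
    static boolean isCorrect(double[] coefficients, double candidateRoot) {
        return Math.abs(evaluate(coefficients, candidateRoot)) <= TOLERANCE;
    }
}
```

Unlike the reference-based oracles, this check does not need an unperturbed run: any value that nullifies the polynomial within the margin is accepted.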
A.11 Linreg
Linreg computes a linear regression using Tikhonov regularization. We take the implementation from the Weka Library (Footnote 11). The class under study is “LinearRegression”: it has 188 lines of code, with 75 integer perturbation points and 15 Boolean perturbation points. We generate inputs by randomly sampling the coefficients of the equation.
Correctness Oracle: The oracle checks that the computed coefficients are equal to those obtained from a reference run, up to a precision of 10^{-6}.
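The comparison with the reference run can be sketched as an element-wise check with an absolute tolerance. The use of an absolute (rather than relative) difference is our assumption, as are the class and method names.

```java
// Hypothetical sketch of a tolerance-based comparison against a reference run.
final class CoefficientOracle {

    // Returns true when both arrays have the same length and every pair of
    // coefficients differs by at most tolerance (absolute difference).
    static boolean isCorrect(double[] reference, double[] computed, double tolerance) {
        if (reference.length != computed.length) {
            return false;
        }
        for (int i = 0; i < reference.length; i++) {
            // Written as a negated <= so that NaN values are rejected.
            if (!(Math.abs(reference[i] - computed[i]) <= tolerance)) {
                return false;
            }
        }
        return true;
    }
}
```

With the margin from the text, a call would read `CoefficientOracle.isCorrect(referenceCoefficients, perturbedCoefficients, 1e-6)`.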
Cite this article
Danglot, B., Preux, P., Baudry, B. et al. Correctness attraction: a study of stability of software behavior under runtime perturbation. Empir Software Eng 23, 2086–2119 (2018). https://doi.org/10.1007/s10664-017-9571-8
DOI: https://doi.org/10.1007/s10664-017-9571-8