Abstract
Formally verifying software correctness is a highly manual process. However, because verification proof scripts often share structure, it is possible to learn from existing proof scripts to fully automate some formal verification. The goal of this paper is to improve proof script synthesis and enable fully automating more verification. Interactive theorem provers, such as the Coq proof assistant, allow programmers to write partial proof scripts, observe the semantics of the proof state thus far, and then attempt more progress. Knowing the proof state semantics is a significant aid. Recent research has shown that the proof state can help predict the next step. In this paper, we present TacTok, the first technique that attempts to fully automate proof script synthesis by modeling proof scripts using both the partial proof script written thus far and the semantics of the proof state. Thus, TacTok more completely models the information the programmer has access to when writing proof scripts manually. We evaluate TacTok on a benchmark of 26 software projects in Coq, consisting of over 10 thousand theorems. We compare our approach to five tools: two prior techniques, CoqHammer, the state-of-the-art proof synthesis technique, and ASTactic, a proof script synthesis technique that models proof state; and three new proof script synthesis techniques we create ourselves, SeqOnly, which models only the partial proof script and the initial theorem being proven, and WeightedRandom and WeightedGreedy, which use metaheuristic search biased by frequencies of proof tactics in existing, successful proof scripts. We find that TacTok outperforms WeightedRandom and WeightedGreedy, and is complementary to CoqHammer and ASTactic: for 24 out of the 26 projects, TacTok can synthesize proof scripts for some theorems the prior tools cannot. Together with TacTok, 11.5% more theorems can be proven automatically than by CoqHammer alone, and 20.0% more than by ASTactic alone.
Compared to a combination of CoqHammer and ASTactic, TacTok can prove 3.6% more theorems, proving 115 theorems no tool could previously prove. Overall, our experiments provide evidence that partial proof script and proof state semantics, together, provide useful information for proof script modeling, and that metaheuristic search is a promising direction for proof script synthesis. TacTok is open-source and we make public all our data and a replication package of our experiments.
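The idea of biasing search by tactic frequencies, which the abstract attributes to WeightedRandom and WeightedGreedy, can be illustrated with a minimal sketch. This is not the paper's implementation: the toy corpus, the function names (`tactic_frequencies`, `weighted_random_tactic`, `greedy_ranking`), and the treatment of a proof script as a flat list of tactic names are all assumptions made for illustration, and no proof assistant is involved.

```python
import random
from collections import Counter


def tactic_frequencies(proof_scripts):
    """Count how often each tactic name occurs across existing proof scripts."""
    counts = Counter()
    for script in proof_scripts:
        counts.update(script)
    return counts


def weighted_random_tactic(counts, rng):
    """Sample one tactic with probability proportional to its frequency."""
    tactics = sorted(counts)  # fixed order keeps sampling reproducible
    weights = [counts[t] for t in tactics]
    return rng.choices(tactics, weights=weights, k=1)[0]


def greedy_ranking(counts):
    """Order tactics from most to least frequent, breaking ties by name."""
    return [t for t, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


# Hypothetical toy corpus: each proof script is a list of tactic names.
corpus = [
    ["intros", "induction", "simpl", "reflexivity"],
    ["intros", "simpl", "auto"],
    ["intros", "auto"],
]

counts = tactic_frequencies(corpus)
ranking = greedy_ranking(counts)
sampled = weighted_random_tactic(counts, random.Random(0))
```

In a real synthesis loop, each candidate tactic would be submitted to the proof assistant and kept only if it makes progress on the proof state; that interaction is omitted here.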
Index Terms
- TacTok: semantics-aware proof synthesis