PHOG: Probabilistic Model for Code

Pavol Bielik, Veselin Raychev, Martin Vechev
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2933-2942, 2016.

Abstract

We introduce a new generative model for code called probabilistic higher order grammar (PHOG). PHOG generalizes probabilistic context free grammars (PCFGs) by allowing conditioning of a production rule beyond the parent non-terminal, thus capturing rich contexts relevant to programs. Even though PHOG is more powerful than a PCFG, it can be learned from data just as efficiently. We trained a PHOG model on a large JavaScript code corpus and show that it is more precise than existing models, while similarly fast. As a result, PHOG can immediately benefit existing programming tools based on probabilistic models of code.
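The abstract's core idea — conditioning a production rule on richer context than just the parent nonterminal — can be illustrated with a minimal counts-based sketch. This is not the paper's implementation; the `ProductionModel` class, the toy (parent, left-sibling) context, and the example symbols are all illustrative assumptions.

```python
from collections import defaultdict

class ProductionModel:
    """Counts-based estimate of P(rule | conditioning context).

    A PCFG conditions only on the parent nonterminal; a PHOG-style
    model conditions on a richer context (here, parent plus a nearby
    tree symbol). Both can be learned by simple counting, which is
    why richer conditioning need not cost training efficiency.
    """

    def __init__(self):
        # context -> rule -> count, and context -> total count
        self.rule_counts = defaultdict(lambda: defaultdict(int))
        self.context_totals = defaultdict(int)

    def observe(self, context, rule):
        self.rule_counts[context][rule] += 1
        self.context_totals[context] += 1

    def prob(self, context, rule):
        total = self.context_totals[context]
        if total == 0:
            return 0.0
        return self.rule_counts[context][rule] / total

# Toy "corpus" of (parent, left-sibling) -> production observations,
# loosely evoking JavaScript ASTs (symbols are made up for illustration).
observations = [
    (("Expr", "console"), "MemberAccess"),
    (("Expr", "console"), "MemberAccess"),
    (("Expr", "x"), "BinaryOp"),
]

pcfg = ProductionModel()
phog = ProductionModel()
for (parent, sibling), rule in observations:
    pcfg.observe(parent, rule)              # PCFG: parent only
    phog.observe((parent, sibling), rule)   # PHOG-style: richer context

# Where the parent-only model must hedge across competing rules,
# the conditioned model can be decisive:
pcfg_p = pcfg.prob("Expr", "MemberAccess")               # 2/3
phog_p = phog.prob(("Expr", "console"), "MemberAccess")  # 1.0
```

Both models train by the same single counting pass over the corpus; the only difference is the key used for conditioning, which is the intuition behind PHOG learning "just as efficiently" as a PCFG.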

Cite this Paper


BibTeX
@InProceedings{pmlr-v48-bielik16,
  title     = {PHOG: Probabilistic Model for Code},
  author    = {Bielik, Pavol and Raychev, Veselin and Vechev, Martin},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  pages     = {2933--2942},
  year      = {2016},
  editor    = {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume    = {48},
  series    = {Proceedings of Machine Learning Research},
  address   = {New York, New York, USA},
  month     = {20--22 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v48/bielik16.pdf},
  url       = {https://proceedings.mlr.press/v48/bielik16.html},
  abstract  = {We introduce a new generative model for code called probabilistic higher order grammar (PHOG). PHOG generalizes probabilistic context free grammars (PCFGs) by allowing conditioning of a production rule beyond the parent non-terminal, thus capturing rich contexts relevant to programs. Even though PHOG is more powerful than a PCFG, it can be learned from data just as efficiently. We trained a PHOG model on a large JavaScript code corpus and show that it is more precise than existing models, while similarly fast. As a result, PHOG can immediately benefit existing programming tools based on probabilistic models of code.}
}
Endnote
%0 Conference Paper
%T PHOG: Probabilistic Model for Code
%A Pavol Bielik
%A Veselin Raychev
%A Martin Vechev
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-bielik16
%I PMLR
%P 2933--2942
%U https://proceedings.mlr.press/v48/bielik16.html
%V 48
%X We introduce a new generative model for code called probabilistic higher order grammar (PHOG). PHOG generalizes probabilistic context free grammars (PCFGs) by allowing conditioning of a production rule beyond the parent non-terminal, thus capturing rich contexts relevant to programs. Even though PHOG is more powerful than a PCFG, it can be learned from data just as efficiently. We trained a PHOG model on a large JavaScript code corpus and show that it is more precise than existing models, while similarly fast. As a result, PHOG can immediately benefit existing programming tools based on probabilistic models of code.
RIS
TY  - CPAPER
TI  - PHOG: Probabilistic Model for Code
AU  - Pavol Bielik
AU  - Veselin Raychev
AU  - Martin Vechev
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-bielik16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 2933
EP  - 2942
L1  - http://proceedings.mlr.press/v48/bielik16.pdf
UR  - https://proceedings.mlr.press/v48/bielik16.html
AB  - We introduce a new generative model for code called probabilistic higher order grammar (PHOG). PHOG generalizes probabilistic context free grammars (PCFGs) by allowing conditioning of a production rule beyond the parent non-terminal, thus capturing rich contexts relevant to programs. Even though PHOG is more powerful than a PCFG, it can be learned from data just as efficiently. We trained a PHOG model on a large JavaScript code corpus and show that it is more precise than existing models, while similarly fast. As a result, PHOG can immediately benefit existing programming tools based on probabilistic models of code.
ER  -
APA
Bielik, P., Raychev, V. & Vechev, M. (2016). PHOG: Probabilistic Model for Code. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:2933-2942. Available from https://proceedings.mlr.press/v48/bielik16.html.