A trustworthy mechanized formalization of R
- 6 April 2020
- journal article
- research article
- Published by Association for Computing Machinery (ACM) in ACM SIGPLAN Notices
- Vol. 53 (8), 13-24
- https://doi.org/10.1145/3393673.3276946
Abstract
The R programming language is very popular for developing statistical software and data analysis, thanks to rich libraries, concise and expressive syntax, and support for interactive programming. Yet, the semantics of R is fairly complex, contains many subtle corner cases, and is not formally specified. This makes it difficult to reason about R programs. In this work, we develop a big-step operational semantics for R in the form of an interpreter written in the Coq proof assistant. We ensure the trustworthiness of the formalization by introducing a monadic encoding that allows the Coq interpreter, CoqR, to be in direct visual correspondence with the reference R interpreter, GNU R. Additionally, we provide a testing framework that supports systematic comparison of CoqR and GNU R. In its current state, CoqR covers the nucleus of the R language as well as numerous additional features, making it pass a significant number of realistic test cases from the GNU R and FastR projects. To exercise the formal specification, we prove in Coq the preservation of memory invariants in selected parts of the interpreter. This work is an important first step towards a robust environment for formal verification of R programs.Keywords
Funding Information
- Millenium Institute for Foundational Research on Data (IMFD Chile).
This publication has 18 references indexed in Scilit:
- KJS: a complete formal semantics of JavaScriptPublished by Association for Computing Machinery (ACM) ,2015
- Validating LR(1) ParsersLecture Notes in Computer Science, 2012
- Evaluating the Design of the R LanguageLecture Notes in Computer Science, 2012
- The Eval That Men DoLecture Notes in Computer Science, 2011
- A Formalization of the C99 Standard in HOL, Isabelle and CoqLecture Notes in Computer Science, 2011
- The Essence of JavaScriptLecture Notes in Computer Science, 2010
- Formal verification of a realistic compilerCommunications of the ACM, 2009
- Isolating JavaScript with Filters, Rewriting, and WrappersLecture Notes in Computer Science, 2009
- An Operational Semantics for JavaScriptLecture Notes in Computer Science, 2008
- Comprehending monadsMathematical Structures in Computer Science, 1992