Verifying Compiler Optimization Passes


Responsive image Arxiv Slides

Verifying term graph optimizations using Isabelle/HOL
Brae J. Webb, Ian J. Hayes, Mark Utting

Our objective is to formally verify the correctness of the hundreds of expression optimization rules used within the GraalVM compiler. When defining the semantics of a programming language, expressions naturally form abstract syntax trees, or, terms. However, in order to facilitate sharing of common subexpressions, modern compilers represent expressions as term graphs. Defining the semantics of term graphs is more complicated than defining the semantics of their equivalent term representations. More significantly, defining optimizations directly on term graphs and proving semantics preservation is considerably more complicated than on the equivalent term representations. On terms, optimizations can be expressed as conditional term rewriting rules, and proofs that the rewrites are semantics preserving are relatively straightforward. In this paper, we explore an approach to using term rewrites to verify term graph transformations of optimizations within the GraalVM compiler. This approach significantly reduces the overall verification effort and allows for simpler encoding of optimization rules.


Slides

Verified expression graph optimization in GraalVM
Brae J. Webb, Mark Utting, Ian J. Hayes
Presented at FM Australia 2022

Compiler optimization passes have been shown to be prone to subtle implementation bugs. Almost all software relies on the faithful translation performed by compilers, thus implementation bugs pose a substantial problem. Our work focuses on addressing this problem by verifying the optimization passes of a state-of-the-art polyglot compiler, GraalVM.
The GraalVM optimizing compiler represents expressions using a graph structure in SSA form. We present our efforts to verify the correctness of expression optimizations performed by the compiler. Proof infrastructure has been developed to aid the lifting of machine word theories in Isabelle/HOL to the compiler's graph structure. Optimization rules are specified via a domain specific language from which proof obligations and supporting lemmas are generated. This enables optimization verification by non-expert users.



Responsive imageATVA Responsive image Arxiv Slides Recording

A Formal Semantics of the GraalVM IR
Brae J. Webb, Mark Utting, Ian J. Hayes

The optimization phase of a compiler is responsible for transforming an intermediate representation (IR) of a program into a more efficient form. Modern optimizers, such as that used in the GraalVM compiler, use an IR consisting of a sophisticated graph data structure that combines data flow and control flow into the one structure. As part of a wider project on the verification of optimization passes of GraalVM, this paper describes a semantics for its IR within Isabelle/HOL. The semantics consists of a big-step operational semantics for data nodes (which are represented in a graph-based static single assignment (SSA) form) and a small-step operational semantics for handling control flow including heap-based reads and writes, exceptions, and method calls. We have proved a suite of canonicalization optimizations and conditional elimination optimizations with respect to the semantics.



Article

Verifying GraalVM Optimizations — Why?
Brae Webb

At the University of Queensland, we're working to verify the optimization passes of the GraalVM compiler. This article conveys the advantages of verifying a polyglot partially evaluating compiler, such as GraalVM. Specifially, we explore how optimizations enable polyglot behaviour, and how optimizations become shared amoung host languages. This is demonstrated through the manual partial evaluation of a hypothetical Simple Arithmetic Language (SAL).



Document Slides Recording

Brae Webb PhD Confirmation
Brae Webb

Optimizing compilers are notoriously difficult to implement correctly. Rigorous fuzzy testing has found that the optimization phase is most frequently the source of bugs within a compiler implementation. An incorrect compiler implementation can manifest bugs in any of the myriad of programs which the implementation compiles, including safety-critical software. Mechanized formal verification, as the most trusted approach to ensuring software is bug-free, has shown promising results as a solution to incorrect compiler implementations. Unfortunately, mechanized formal verification is a time consuming task requiring experts in the field and, as a consequence, it is not a standard practice for compiler implementations.

We introduce and outline a plan to apply mechanized formal verification techniques to the optimization phase of a state-of-the-art production compiler. The GraalVM compiler uses Futamura projections to implement a variety of hosted languages; subsequently, all hosted languages share the same optimization phase. The shared optimization phase allows all hosted languages to benefit from verifying the optimization phase with no additional verification effort. We propose research into verifying the correctness of the shared optimization phase in a manner that is accessible to the existing compiler developers.



Repository

Veriopt Theories

Veriopt is a formal verification effort undertaken by researchers at the University of Queensland. The project aims to formally verify the optimization phases within the GraalVM compiler.

The intermediate representation of the compiler is formalized within the Isabelle/HOL interactive theorem prover. Optimizations are then proven to be semantic preserving transformations of the intermediate representation.

This repository hosts the Isabelle/HOL theories which define a formal semantics of the intermediate representation and prove the correctness of optimization phases. The repository is in a pre-release state as we develop the theory, the theories including the formal semantics may change frequently.