Technical Perspective: Toward a Formally Verified Compiler for a Synchronous, Functional, Data-Flow Programming Language

Automatic control systems, embedded systems, cyber-physical systems, real-time systems, reactive systems: All of these refer to computer systems that interact continuously with their environment, through inputs received from physical sensors and outputs sent to physical actuators. The crucial point is that these interactions must occur at a speed imposed by the environment. For instance, an airplane fly-by-wire software cannot wait to issue its next outputs without jeopardizing the stability of the airplane. To take a more mundane (and less critical) example, your smartphone navigation app must depict your new position every second or so; otherwise, it would be rendered useless.

There are multiple approaches to program such systems, ranging from multi-threaded low-level programming languages such as C, to model-based tools such as Simulink. Low-level programming languages raise many issues: lack of formal semantics, complexity of inter-thread communications, race conditions, and so on. This raises massive difficulties in testing and validating the program. At the other end of the spectrum, model-based tools raise the issue of the faithfulness between the model and the generated code. To sum up, in the first case, the charge of testing and validating is incurred by the programmer, while in the second case, the charge of generating faithful code is incurred by the compiler.

The accompanying paper focuses on Lustre, a synchronous, functional, data-flow programming language augmented with hierarchical state machines. Lustre is the academic language underlying the Scade model-based tool from Ansys. Lustre and Scade target safety-critical embedded systems, such as those found in civil avionics (for example, fly-by-wire), nuclear power plants (for example, rods’ automatic control), trains (for example, segment admission), and automotive (for example, airbag ignition). Failures in safety-critical systems are extremely costly, so they must comply with strict certification norms. For instance the FAA, EASA, and Transport Canada agencies have all adopted the DO-178C norm, titled “Software Consideration in Airborne Systems and Equipment Certification.” Similar norms exist for other domains: EN-50128 for railway, IEC-61508 for industrial manufacturing, and ISO-26262 for automotive.

A failure of fly-by-wire software can be catastrophic for an aircraft, so the DO-178C norm classifies this software as Design Assurance Level A. This requires both the software (here the fly-by-wire software) and the design tool (here Scade) to be certified, meaning software development must follow strict rules regarding formal specification, code review, and test coverage. However, the DO-178C norm does not impose to formally prove the faithfulness of the generated code to the source model (at least not yet). Like all synchronous programming languages, the Lustre semantics and compiler guarantee that the generated code is deterministic and always runs in bounded time and bounded memory. These properties are extremely useful for safety-critical embedded systems; for example, bounded time execution allows for checking that the system reacts at the speed imposed by its environment. Yet, it does not guarantee the faithfulness of the generated code.

For this reason, researchers have tried to develop verified compilers, where the faithfulness between the source program and the generated code is guaranteed thanks to a mathematical proof, itself checked by a proof assistant. The first verified compiler to achieve this feat was CompCert, an optimizing C compiler entirely verified with the Rocq proof assistant (formerly known as Coq). An alternative approach to verified compilers is translation validation, where the object code produced by the compiler is a posteriori proven to be semantically equivalent to the source code. This is the solution adopted in the accompanying paper. Translation validation is very useful when a compilation pass is too complex to be formally proved. For instance, the graph-coloring algorithm used in register allocation is often implemented as a heuristic: This heuristic is too complex to be proven correct, while the obtained register allocation is fairly easy to verify, but it must be verified for each compiled program.

Validation of the translation can be achieved with symbolic interpretation, static analysis, model checking, automatic theorem proving, and so on. In the accompanying paper, this is achieved with the Frama-C software-analysis platform.

Smartly, the authors chose to axiomatize the relational state-transition semantics of Lustre, which serves as an intermediate semantics between the classic denotational dataflow semantics of the source program and the state-transition operational semantics of the generated code. Thanks to this axiomatization, logical annotations are attached to each generated code instruction, which are then turned into predicates in ACSL, the input language of Frama-C. The main code optimizations are formalized in the framework (conditional fusion, variable inlining, variable recycling, and enumerated type elimination). And finally, the generated ACSL predicates are proved correct by Frama-C and external SMT solvers. Thanks to these neat contributions, the accompanying paper is a great step toward a model-based tool where the faithfulness of the generated code is formally proven, which is essential for safety-critical embedded systems.

Similar Posts