Formal Verification of Programming Languages


Literature Seminar — Presented to PLASMA research group on 8th December 2009

    Presentation Transcript

    • Formal Verification of Programming Language Implementations. Ph.D. Literature Seminar. Jason S. Reich <jason@cs.york.ac.uk>, University of York, December 8, 2009. Outline: Motivation, 1960s, 1970s, 1980s, 1990s, 2000s, Conclusions.
    • Compiling an arithmetic language: compile from a simple arithmetic language to machine code for a simple register machine. Source language: numeric constants; variables; addition, e.g. (x + 3) + (x + (y + 2)). Target language: LI (load immediate into ac); LOAD (load into ac from address/register); STO (store ac value to address/register); ADD (add register value to ac). Example taken from [McCart67].
    • Compiling an arithmetic language: arithmetic expression compiler in Haskell.

          compile :: Source -> Int -> Target
          compile (Const v)   t = [Li v]
          compile (Var x)     t = [Load x]
          compile (Sum e1 e2) t = compile e1 t
                               ++ [Sto ("t + " ++ show t)]
                               ++ compile e2 (t + 1)
                               ++ [Add ("t + " ++ show t)]
    • Compiling an arithmetic language: when compiled and executed, is the value in the accumulator the result of the source arithmetic expression? (x + 3) + (x + (y + 2)) compiled to machine code:

          LOAD x
          STO t
          LI 3
          ADD t
          STO t
          LOAD x
          STO t + 1
          LOAD y
          STO t + 2
          LI 2
          ADD t + 2
          ADD t + 1
          ADD t

      N.b. x and y are known memory locations and t + k are registers.
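The slide's compiler leaves its data types implicit. The following is a minimal sketch of one way to fill them in so that the compiler can be loaded and tried out. The `Source` and `Instr` declarations and the `example` value are assumptions made for illustration; only the body of `compile` comes from the slides.

```haskell
-- Sketch only: constructor names follow the slide's compiler, but the
-- data declarations and `example` are assumptions, not taken from the talk.
data Source
  = Const Int
  | Var String
  | Sum Source Source
  deriving (Show, Eq)

data Instr
  = Li Int        -- load an immediate value into the accumulator
  | Load String   -- load into the accumulator from an address/register
  | Sto String    -- store the accumulator to an address/register
  | Add String    -- add an address/register value to the accumulator
  deriving (Show, Eq)

type Target = [Instr]

-- The compiler from the slide; `t` is the index of the next free temporary.
compile :: Source -> Int -> Target
compile (Const v)   _ = [Li v]
compile (Var x)     _ = [Load x]
compile (Sum e1 e2) t =
  compile e1 t
    ++ [Sto ("t + " ++ show t)]
    ++ compile e2 (t + 1)
    ++ [Add ("t + " ++ show t)]

-- The running example: (x + 3) + (x + (y + 2))
example :: Source
example =
  Sum (Sum (Var "x") (Const 3))
      (Sum (Var "x") (Sum (Var "y") (Const 2)))
```

In GHCi, `compile example 0` yields the thirteen instructions of the listing, with the first temporary rendered as `t + 0` rather than `t`.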
    • Why use high-level languages? Rapid development; easier to understand, maintain and modify; less likely to make mistakes; easier to reason about and infer properties; architecture portability. But...
    • Can you trust your compiler? We use a compiler to translate from a high-level language to a low-level one. Compilers are programs (generally) written by people, and people make mistakes. A compiler bug can silently turn “a correct program into an incorrect executable” [Leroy09]. GHC 6.10.x is ≈ 800,000 lines of code and has had 737 bugs reported in the bug tracker as of 04/12/2009 [GHC]. Can we formally verify a compiler?
    • McCarthy and Painter, 1967: “Correctness of a compiler for arithmetic expressions” [McCart67]. Describes, in first-order predicate logic, the source language semantics, the target language semantics and a compilation process. Reasons that the compiler maintains semantic equivalence.
    • McCarthy and Painter, 1967: semantic equivalence in [McCart67]:

          ∀e ∈ Expressions, ∀µ : Variable Mappings • interpret(e, µ) ≡ acValue(emulate(compile(e), mkState(µ)))

      Very limited: small toy source and target languages. Proof performed by hand. Logical framework and proof presented in under ten pages. Shows that proving a compiler correct is possible.
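The equivalence statement can be made concrete with a small executable sketch. The names `interpret`, `emulate`, `acValue`, `mkState` and `compile` come from the slides; everything else (the `State` record, `fetch`, `step`, `agrees`, reading unset cells as 0, and keeping temporaries in the same name space as variables) is an assumption made for illustration. Running it checks the property on individual inputs only; it is not a proof.

```haskell
-- Sketch only: an executable reading of the equivalence statement.
data Source = Const Int | Var String | Sum Source Source
data Instr  = Li Int | Load String | Sto String | Add String

-- The compiler from the slides.
compile :: Source -> Int -> [Instr]
compile (Const v)   _ = [Li v]
compile (Var x)     _ = [Load x]
compile (Sum e1 e2) t =
  compile e1 t
    ++ [Sto ("t + " ++ show t)]
    ++ compile e2 (t + 1)
    ++ [Add ("t + " ++ show t)]

-- Variable mapping µ, as an association list.
type Env = [(String, Int)]

-- Machine state: the accumulator plus named memory cells/registers.
data State = State { acValue :: Int, memory :: Env }

-- Assumption: cells that were never written read as 0.
fetch :: String -> Env -> Int
fetch k env = maybe 0 id (lookup k env)

-- Source semantics.
interpret :: Source -> Env -> Int
interpret (Const v) _  = v
interpret (Var x)   mu = fetch x mu
interpret (Sum a b) mu = interpret a mu + interpret b mu

-- Initial machine state: accumulator 0, memory holding the variables.
mkState :: Env -> State
mkState mu = State 0 mu

-- Target semantics for a single instruction.
step :: State -> Instr -> State
step (State _  m) (Li v)   = State v m
step (State _  m) (Load a) = State (fetch a m) m
step (State ac m) (Sto a)  = State ac ((a, ac) : m)  -- newest binding shadows older ones
step (State ac m) (Add a)  = State (ac + fetch a m) m

-- Run a whole program from a given state.
emulate :: [Instr] -> State -> State
emulate prog s = foldl step s prog

-- The equivalence from [McCart67], checked for one expression and mapping.
agrees :: Source -> Env -> Bool
agrees e mu = interpret e mu == acValue (emulate (compile e 0) (mkState mu))

-- The running example: (x + 3) + (x + (y + 2))
example :: Source
example =
  Sum (Sum (Var "x") (Const 3))
      (Sum (Var "x") (Sum (Var "y") (Const 2)))
```

In GHCi, `agrees example [("x", 5), ("y", 7)]` evaluates to `True`: both sides equal 22. A proof has to establish this for every expression and every mapping, which is what the induction in [McCart67] does.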
    • Milner and Weyhrauch, 1972: “Proving compiler correctness in a mechanised logic” [Milner72]. Provide an LCF machine-checked proof of the McCarthy-Painter example. Proceed towards mechanically proving a compiler for a more complex language to a stack machine. Claim to have “no significant doubt that the remainder of the proof can be done on machine” [Milner72].
    • Morris, 1973: “Advice on structuring compilers and proving them correct” [Morris73]. Proves by hand the correctness of a compiler for a source language that contains assignment, conditionals, loops, arithmetic, boolean operations and local definitions. “Essence” of the advice presented in [Morris73]:

                              compile
          Source language ------------> Target language
                 |                             |
          Source semantics              Target semantics
                 v                             v
          Source meanings <------------ Target meanings
                               decode
    • Thatcher, Wagner and Wright, 1980: “More on advice on structuring compilers and proving them correct” [Thatch80]. Advice presented in [Thatch80]:

                              compile
          Source language ------------> Target language
                 |                             |
          Source semantics              Target semantics
                 v                             v
          Source meanings ------------> Target meanings
                               encode

      Provides a correct compiler for a more advanced target language than [Morris73]. Claim that mechanised theorem proving tools required further development.
    • The “structuring compilers” series: discusses constructing algebras to describe languages and how to move from one algebra to another. Encode abstract state to concrete, or decode to abstract? “there is not enough information in the [abstract] state to recover the [concrete] state completely” [Moore89]. Further paper: “Even more on advice on structuring compilers and proving them correct: changing an arrow” [Orejas81]. [Moore89] discusses this issue from a practical perspective.
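Read as commuting squares, the two diagrams from [Morris73] and [Thatch80] state slightly different correctness conditions. The following is one hedged way to write them down; the shorthand $S$ (source semantics), $T$ (target semantics) and $c$ (compile) is introduced here and does not appear in the slides.

```latex
% S, T and c are assumed shorthand for source semantics, target semantics and compile.
\[
  \text{[Morris73]:}\quad \mathit{decode} \circ T \circ c = S
  \qquad\qquad
  \text{[Thatch80]:}\quad T \circ c = \mathit{encode} \circ S
\]
% If decode is a left inverse of encode, the second condition implies the first:
\[
  \mathit{decode} \circ \mathit{encode} = \mathit{id}
  \;\Longrightarrow\;
  \mathit{decode} \circ T \circ c
    = \mathit{decode} \circ \mathit{encode} \circ S
    = S
\]
```

The choice of arrow therefore matters mainly when no such inverse exists, which is the situation the [Moore89] quotation describes.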
    • Meijer, 1994: “More advice on proving a compiler correct: Improve a correct compiler” [Meijer94]. Given an interpreter for a source language, can we transform it into a compiler to, and a residual interpreter for, the target language? A functional decomposition problem (i.e. interpreter = emulator ◦ compiler). Demonstrates this technique for a first-order imperative language compiling to a three-address code machine. While quite feasible for first-order languages, it becomes far more difficult for higher-order languages.
    • Berghofer and Strecker, 2003: “Extracting a formally verified, fully executable compiler from a proof assistant” [Bergho03]. Proves correct a compiler from a subset of the Java source language to Java bytecode. Includes typechecking, abstract syntax tree annotation and bytecode translation. Isabelle/HOL is used to prove properties about an abstract compiler, and Isabelle code extraction is used to produce an executable compiler.
    • Dave, 2003: Maulik A. Dave’s bibliography for “Compiler Verification” [Dave03]. [Chart: papers listed against decade published.] Ninety-nine papers listed. Ninety-one of those listed were published after 1990. Interestingly, neither the Milner and Weyhrauch paper nor the Meijer paper is included.
    • Recent work: Leroy’s “A formally verified compiler back-end” [Leroy09] proves correct a compiler from Cminor to PowerPC assembly. Chlipala’s “A verified compiler for an impure functional language” [Chlipa10] does so for a toy (but still quite feature-rich) functional source language compiled to instructions for a register-based machine. Both use the Coq proof assistant and code extraction. Both decompose the problem into compilation through several intermediate languages. Both express worries that the proof assistant itself may contain bugs that would invalidate correctness.
    • Conclusions: Compilers have been proved correct for progressively larger source languages. It rapidly became apparent that some kind of proof assistant is required. Decomposition of large compilers is a key factor for success. Programs are only verified when all surrounding elements are verified.
    • Open questions: What about compilers for larger target languages and more advanced compilation facilities? Are our mechanised assistants producing valid proofs? Are particular language paradigms more amenable to compiler verification? Why haven’t the concepts of [Meijer94] been more widely used? What other ways are there of decomposing the compiler verification problem?
    • More information: slides and bibliography will be made available at http://www-users.cs.york.ac.uk/~jason/. Jason S. Reich <jason@cs.york.ac.uk>