Formal Verification of Programming Languages

Now with more Haskell and proof!


Transcript of "Formal Verification of Programming Languages"

  1. Formal Verification of Programming Language Implementations
     Ph.D. Literature Seminar
     Jason S. Reich <jason@cs.york.ac.uk>
     University of York
     11th January 2010
     Sections: Motivation, 1960s, Proof, 1970s, 1980s, 1990s, 2000s, Conclusions
  2. Compiling an arithmetic language
     Compile from a simple arithmetic language to machine code for a simple register machine.
     Example taken from [McCart67]
  3. Compiling an arithmetic language
     Compile from a simple arithmetic language to machine code for a simple register machine.
     Source language:
       - Numeric constants
       - Variables
       - Addition
       e.g. (x + 3) + (x + (y + 2))
     Example taken from [McCart67]
  4. Compiling an arithmetic language
     Compile from a simple arithmetic language to machine code for a simple register machine.
     Source language:
       - Numeric constants
       - Variables
       - Addition
       e.g. (x + 3) + (x + (y + 2))
     Target language:
       - Load Immediate into ac
       - LOAD into ac from address/register
       - STOre ac value to address/register
       - ADD register value to ac
     Example taken from [McCart67]
  5. Compiling an arithmetic language
     Arithmetic expression compiler in Haskell:

         -- t is the first register free for temporaries
         compile :: Int -> Source -> Target
         compile t (Const v)   = [Li v]
         compile t (Var x)     = [Load (Map x)]
         compile t (Add e1 e2) = compile t e1 ++ [Sto (Reg t)]
                              ++ compile (t + 1) e2 ++ [Sum (Reg t)]

     When compiled and executed, is the value in the accumulator the result of the source arithmetic expression?
  6. Compiling an arithmetic language
     (x + 3) + (x + (y + 2)) compiled to machine code?

         LOAD M[x]
         STO  R[t + 0]
         LI   3
         ADD  R[t + 0]
         STO  R[t + 0]
         LOAD M[x]
         STO  R[t + 1]
         LOAD M[y]
         STO  R[t + 2]
         LI   2
         ADD  R[t + 2]
         ADD  R[t + 1]
         ADD  R[t]

     n.b. M is a mapping of variable names to memory locations and R is an indexing of registers.
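The listing on slide 6 can be reproduced by running the compiler from slide 5. The sketch below is one way to do that. The data type declarations are assumptions (the slides show only the functions), and the constructor names follow the later semantics slide: Add for source addition and Sum for the ADD instruction. Register R[t + k] is written Reg k, taking t = 0.

```haskell
-- Assumed data types; the slides give only the function definitions.
type Name = String

data Source = Const Int | Var Name | Add Source Source
  deriving (Eq, Show)

data Address = Ac | Reg Int | Map Name
  deriving (Eq, Show)

-- Sum is the ADD instruction of the target language.
data Instr = Li Int | Load Address | Sto Address | Sum Address
  deriving (Eq, Show)

type Target = [Instr]

-- t is the first register that is free for temporaries.
compile :: Int -> Source -> Target
compile _ (Const v)   = [Li v]
compile _ (Var x)     = [Load (Map x)]
compile t (Add e1 e2) =
  compile t e1 ++ [Sto (Reg t)] ++ compile (t + 1) e2 ++ [Sum (Reg t)]

-- The running example: (x + 3) + (x + (y + 2))
example :: Source
example =
  Add (Add (Var "x") (Const 3))
      (Add (Var "x") (Add (Var "y") (Const 2)))

-- The thirteen instructions produced for the running example.
listing :: Target
listing = compile 0 example
```

Evaluating `mapM_ print listing` in GHCi prints one instruction per line, in the same order as the slide.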
  7. Why use high-level languages?
     - Rapid development
     - Easier to understand, maintain and modify
     - Less likely to make mistakes
     - Easier to reason about and infer properties
     - Architecture portability
     But...
  8. Can you trust your compiler?
     - We use a compiler to translate from a high-level language to a low-level one
     - Compilers are programs (generally) written by people
     - People make mistakes
     - A compiler bug can silently turn “a correct program into an incorrect executable” [Leroy09]
     - GHC 6.10.x is ≈ 800,000 lines of code and has had 737 bugs reported in the bug tracker as of 04/12/2009 [GHC]
     Can we formally verify a compiler?
  9. McCarthy and Painter, 1967
     “Correctness of a compiler for arithmetic expressions” [McCart67]
     Describe, in first-order predicate logic:
       - Source language semantics
       - Target language semantics
       - A compilation process
     Reason that the compiler maintains semantic equivalence
  10. McCarthy and Painter, 1967
      Semantic equivalence in [McCart67]:

          ∀e ∈ Expressions, ∀µ ∈ Variable Mappings •
            source(e, µ) ≡ acValue(target(compile(e), construct(µ)))

      - Very limited, small toy source and target language
      - Proof performed by hand
      - Logical framework and proof presented in under ten pages
      - Shows that proving a compiler correct is possible
  11. Proving the [McCart67] compiler

          target (compile t x) (construct s) Ac ≡ source x s

          type Abstract = Name -> Value
          type Concrete = Address -> Value

          -- Lift a variable mapping to a machine state (defined on variable cells only)
          construct s = \(Map v) -> s v
          -- Update cell k to hold v, leaving every other cell unchanged
          write k v s = \k' -> if k == k' then v else s k'

          -- Semantics for the source language
          source :: Source -> Abstract -> Value
          source (Const n) _ = n
          source (Var v)   s = s v
          source (Add x y) s = source x s + source y s

          -- Semantics for the target language
          target :: Target -> Concrete -> Concrete
          target []       s = s
          target (i : is) s = target is $ case i of
            Li n   -> write Ac n s
            Load r -> write Ac (s r) s
            Sto r  -> write r (s Ac) s
            Sum r  -> write Ac (s Ac + s r) s
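To see the two semantics from slide 11 agree on a concrete case, the sketch below runs a hand-written target program for x + 3 and compares the accumulator with the source meaning. The data type declarations, the name xPlus3, the environment env, and the choice to start the accumulator and registers at 0 are all assumptions made for illustration. The slide's construct is defined only on variable cells.

```haskell
type Name  = String
type Value = Int

-- Assumed data types; the slides give only the function definitions.
data Source  = Const Value | Var Name | Add Source Source
data Address = Ac | Reg Int | Map Name deriving (Eq, Show)
data Instr   = Li Value | Load Address | Sto Address | Sum Address
type Target  = [Instr]

type Abstract = Name -> Value
type Concrete = Address -> Value

-- Assumption: the accumulator and registers start at 0.
construct :: Abstract -> Concrete
construct s (Map v) = s v
construct _ _       = 0

-- Update cell k to hold v, leaving every other cell unchanged.
write :: Address -> Value -> Concrete -> Concrete
write k v s = \k' -> if k == k' then v else s k'

-- Semantics for the source language.
source :: Source -> Abstract -> Value
source (Const n) _ = n
source (Var v)   s = s v
source (Add x y) s = source x s + source y s

-- Semantics for the target language: run each instruction in turn.
target :: Target -> Concrete -> Concrete
target []       s = s
target (i : is) s = target is $ case i of
  Li n   -> write Ac n s
  Load r -> write Ac (s r) s
  Sto r  -> write r (s Ac) s
  Sum r  -> write Ac (s Ac + s r) s

-- Hypothetical environment: x is 5, every other variable is 0.
env :: Abstract
env "x" = 5
env _   = 0

-- Hand-written target code for x + 3, using register 0 as scratch space.
xPlus3 :: Target
xPlus3 = [Load (Map "x"), Sto (Reg 0), Li 3, Sum (Reg 0)]

-- Accumulator after running xPlus3, and the source meaning of x + 3.
fromTarget, fromSource :: Value
fromTarget = target xPlus3 (construct env) Ac
fromSource = source (Add (Var "x") (Const 3)) env
```

Both fromTarget and fromSource evaluate to 8: the program loads 5, saves it in register 0, loads 3, then adds register 0 to the accumulator.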
  12. Proving the [McCart67] compiler
      Proof of correctness for constants:

            { case where ‘x = Const n’ }
            target (compile t (Const n)) (construct s) Ac
          = { inline ‘compile’ }
            target [Li n] (construct s) Ac
          = { inline ‘target’ }
            write Ac n (construct s) Ac
          = { inline ‘write’ }
            n
          = { equivalent to }
            source (Const n) s
  13. Proving the [McCart67] compiler
      Proof of correctness for variables:

            { case where ‘x = Var v’ }
            target (compile t (Var v)) (construct s) Ac
          = { inline ‘compile’ }
            target [Load (Map v)] (construct s) Ac
          = { inline ‘target’ }
            write Ac (construct s (Map v)) (construct s) Ac
          = { inline ‘write’ }
            (construct s) (Map v)
          = { inline ‘construct’ }
            s v
          = { equivalent to }
            source (Var v) s
  14. Assumed lemmas
      Untouched Registers lemma: Any expression x, compiled to use registers t and above, will not write to a register less than t. Therefore:

          r < t ⇒ target (compile t x) s (Reg r) ≡ s (Reg r)

      Untouched Variables lemma: The compiled form of expression x will never write to a memory location mapped to a variable. Therefore:

          target (compile t x) s (Map v) ≡ s (Map v)
  15. Proving the [McCart67] compiler
      Proof of correctness for addition:

            { case where ‘x = Add x y’ }
            target (compile t (Add x y)) (construct s) Ac
          = { inline ‘compile’ and ‘target’ }
            let s1 = target (compile t x) (construct s)
                s2 = write (Reg t) (s1 Ac) s1
                s3 = target (compile (t + 1) y) s2
            in  write Ac (s3 Ac + s3 (Reg t)) s3 Ac
          = { State lemmas and inline ‘write’s }
            target (compile t x) (construct s) Ac
              + target (compile (t + 1) y) (construct s) Ac
          = { inductive hypothesis - structural induction }
            source x s + source y s
          = { equivalent to }
            source (Add x y) s
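The correctness statement and the two assumed lemmas can also be checked by brute force on small inputs. That is no substitute for the proof, but it is a cheap sanity check. The sketch below enumerates every expression up to a given depth and tests all three properties. The data types, the enumeration exprs, the environment env and the initial register contents (100 + r, chosen so an accidental overwrite is visible) are assumptions, not part of the slides.

```haskell
type Name  = String
type Value = Int

-- Assumed data types; the slides give only the function definitions.
data Source  = Const Value | Var Name | Add Source Source
data Address = Ac | Reg Int | Map Name deriving (Eq, Show)
data Instr   = Li Value | Load Address | Sto Address | Sum Address
type Target  = [Instr]

type Abstract = Name -> Value
type Concrete = Address -> Value

compile :: Int -> Source -> Target
compile _ (Const v)   = [Li v]
compile _ (Var x)     = [Load (Map x)]
compile t (Add e1 e2) =
  compile t e1 ++ [Sto (Reg t)] ++ compile (t + 1) e2 ++ [Sum (Reg t)]

-- Assumption: registers start with distinct junk values, accumulator at 0.
construct :: Abstract -> Concrete
construct s (Map v) = s v
construct _ (Reg r) = 100 + r
construct _ Ac      = 0

write :: Address -> Value -> Concrete -> Concrete
write k v s = \k' -> if k == k' then v else s k'

source :: Source -> Abstract -> Value
source (Const n) _ = n
source (Var v)   s = s v
source (Add x y) s = source x s + source y s

target :: Target -> Concrete -> Concrete
target []       s = s
target (i : is) s = target is $ case i of
  Li n   -> write Ac n s
  Load r -> write Ac (s r) s
  Sto r  -> write r (s Ac) s
  Sum r  -> write Ac (s Ac + s r) s

-- Hypothetical environment used for every check.
env :: Abstract
env "x" = 5
env "y" = 7
env _   = 0

-- Every expression of at most the given depth over a small vocabulary.
exprs :: Int -> [Source]
exprs 0 = [Const 1, Var "x", Var "y"]
exprs d = smaller ++ [Add a b | a <- smaller, b <- smaller]
  where smaller = exprs (d - 1)

-- Main theorem: the accumulator holds the source meaning.
correct :: Int -> Source -> Bool
correct t e = target (compile t e) (construct env) Ac == source e env

-- Untouched Registers lemma: registers below t keep their values.
untouchedRegs :: Int -> Source -> Bool
untouchedRegs t e =
  and [after (Reg r) == construct env (Reg r) | r <- [0 .. t - 1]]
  where after = target (compile t e) (construct env)

-- Untouched Variables lemma: variable cells keep their values.
untouchedVars :: Int -> Source -> Bool
untouchedVars t e =
  and [after (Map v) == env v | v <- ["x", "y"]]
  where after = target (compile t e) (construct env)
```

For example, `all (correct 0) (exprs 2)` checks the theorem on all 156 expressions of depth at most two. The same pattern works for `untouchedRegs 3` and `untouchedVars 0`.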
  16. Milner and Weyhrauch, 1972
      “Proving compiler correctness in a mechanised logic” [Milner72]
      - Provide an LCF machine-checked proof of the McCarthy-Painter example
      - Proceed towards mechanically proving correct a compiler from a more complex language to a stack machine
      - Claim to have “no significant doubt that the remainder of the proof can be done on machine” [Milner72]
  17. Morris, 1973
      “Advice on structuring compilers and proving them correct” [Morris73]
      - Proves by hand the correctness of a compiler for a source language that contains assignment, conditionals, loops, arithmetic, boolean operations and local definitions
      “Essence” of the advice presented in [Morris73]:

                             compile
          Source language -------------> Target language
                 |                              |
          Source semantics               Target semantics
                 |                              |
                 v                              v
          Source meanings <------------- Target meanings
                             decode
  18. Thatcher, Wagner and Wright, 1980
      “More on advice on structuring compilers and proving them correct” [Thatch80]
      Advice presented in [Thatch80]:

                             compile
          Source language -------------> Target language
                 |                              |
          Source semantics               Target semantics
                 |                              |
                 v                              v
          Source meanings -------------> Target meanings
                             encode

      - Provides a different encoding of the target language to [Morris73]
      - Claim that mechanised theorem proving tools required further development
  19. Syntax of source language in [Thatch80]

          ae ::= integer constant
               | variable
               | - ae
               | Pr ae
               | Su ae
               | ae + ae
               | ae − ae
               | ae × ae
               | if be then ae else ae
               | st result ae
               | let variable be ae in ae

          st ::= continue
               | variable := ae
               | if be then st else st
               | st ; st
               | while be do st

          be ::= boolean constant
               | even ae
               | ae ≤ ae
               | ae ≥ ae
               | ae = ae
               | ¬ be
               | be ∧ be
               | be ∨ be

      n.b. Similar to [Milner72] and [Morris73] but with more operators and sequential composition. Struggling to fit this onto one slide.
  20. The “structuring compilers” series
      - Discuss constructing algebras to describe language syntax and meaning
      - The languages' abstract syntaxes are initial algebras
      - The semantics is the unique homomorphism from syntax to meanings
      - The compiler is the unique homomorphism between source and target syntaxes
      - “... reduces to a proof that encode is a homomorphism ...” [Thatch80]
      - “No structural induction is required ...” [Thatch80]
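The algebraic view on slide 20 can be illustrated in miniature. In the sketch below, the syntax is an ordinary data type (the initial algebra), foldExpr is the unique homomorphism out of it, and the semantics, the compiler and the target meanings are each given as an algebra over the same signature. Everything here is an assumption made for illustration: the toy stack machine, the record Alg and every name are hypothetical, and are much simpler than the languages treated in [Morris73] and [Thatch80].

```haskell
type Name = String
type Env  = Name -> Int

-- Source syntax: the initial algebra of the signature {const, var, add}.
data Expr = Const Int | Var Name | Add Expr Expr

-- An algebra for that signature with carrier a.
data Alg a = Alg
  { constA :: Int -> a
  , varA   :: Name -> a
  , addA   :: a -> a -> a
  }

-- The unique homomorphism from the syntax to any algebra.
foldExpr :: Alg a -> Expr -> a
foldExpr alg (Const n) = constA alg n
foldExpr alg (Var v)   = varA alg v
foldExpr alg (Add x y) = addA alg (foldExpr alg x) (foldExpr alg y)

-- Source meanings: functions from environments to values.
type SrcMeaning = Env -> Int

srcAlg :: Alg SrcMeaning
srcAlg = Alg (\n _ -> n) (\v env -> env v) (\f g env -> f env + g env)

-- Target syntax: code for a toy stack machine.
data Instr = Push Int | Fetch Name | Plus deriving (Eq, Show)

-- The compiler is the fold with this algebra.
codeAlg :: Alg [Instr]
codeAlg = Alg (\n -> [Push n]) (\v -> [Fetch v]) (\c1 c2 -> c1 ++ c2 ++ [Plus])

-- Target meanings: stack transformers.
type TgtMeaning = Env -> [Int] -> [Int]

-- Target semantics: run the code against a stack.
exec :: [Instr] -> TgtMeaning
exec []             _   st           = st
exec (Push n : is)  env st           = exec is env (n : st)
exec (Fetch v : is) env st           = exec is env (env v : st)
exec (Plus : is)    env (b : a : st) = exec is env (a + b : st)
exec (Plus : _)     _   _            = error "stack underflow"

-- Target meanings as an algebra over the source signature.
tgtAlg :: Alg TgtMeaning
tgtAlg = Alg
  (\n _ st -> n : st)
  (\v env st -> env v : st)
  (\f g env st -> case g env (f env st) of
      b : a : rest -> a + b : rest
      _            -> error "stack underflow")

-- encode maps a source meaning to the target meaning that pushes its value.
encode :: SrcMeaning -> TgtMeaning
encode m env st = m env : st
```

If exec (from codeAlg) and encode (from srcAlg) are both homomorphisms into tgtAlg, then by initiality `exec . foldExpr codeAlg` and `encode . foldExpr srcAlg` must both equal `foldExpr tgtAlg`, so the square commutes without a separate induction over expressions. Evaluating the three paths on a sample expression, environment and stack gives the same result for each.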
  21. Meijer, 1994
      “More advice on proving a compiler correct: Improve a correct compiler” [Meijer94]
      - Given an interpreter for a source language, can we transform it into a compiler to, and a residual interpreter for, the target language?
      - A functional decomposition problem (i.e. interpreter = emulator ◦ compiler)
      - Demonstrates this technique for a first-order imperative language compiling to a three-address code machine
      - While quite feasible for first-order languages, this becomes far more difficult for higher-order languages
  22. Berghofer and Stecker, 2003
      “Extracting a formally verified, fully executable compiler from a proof assistant” [Bergho03]
      - Proves correct a compiler from a subset of the Java source language to Java bytecode
      - Includes typechecking, abstract syntax tree annotation and bytecode translation
      - Isabelle/HOL used to prove properties about an abstract compiler
      - Isabelle code extraction used to produce an executable compiler
  23. Dave, 2003
      Maulik A. Dave’s bibliography for “Compiler Verification” [Dave03]
      [Chart: papers listed against decade published]
      - Ninety-nine papers listed
      - Ninety-one of those listed were published after 1990
      - Interestingly, neither the Milner and Weyhrauch paper nor the Meijer paper is included
  24. Recent work
      Leroy’s “A formally verified compiler back-end” [Leroy09]
        - Proves correct a compiler from Cminor to PowerPC assembler
      Chlipala’s “A verified compiler for an impure functional language” [Chlipa10]
        - From a toy (but still quite feature rich) functional source language to instructions for a register-based machine
      - Both use the Coq proof assistant and code extraction
      - Both decompose the problem into compilation to several intermediate languages
      - Both express worries that the proof assistant itself may contain bugs that would invalidate correctness
  25. Conclusions
      - Compilers have been proved correct for progressively larger source languages
      - A variety of techniques are available for ensuring semantic equivalence
      - It rapidly became apparent that some kind of proof assistant is required
      - Decomposition of large compilers is a key factor for success
      - Programs are only verified when all surrounding elements are verified
  26. Open questions
      - What about compilers for larger target languages and more advanced compilation facilities?
      - Are our mechanised assistants producing valid proofs?
      - What other ways are there of decomposing the compiler verification problem?
      - Are particular language paradigms more amenable to compiler verification?
      - Why haven’t the concepts of [Meijer94] been more widely used?
  27. More information
      Slides and bibliography will be made available at: http://www-users.cs.york.ac.uk/~jason/
      Jason S. Reich <jason@cs.york.ac.uk>
