
Testing, fixing, and proving with contracts


Invited tutorial at the 9th International Conference on Tests & Proofs (TAP 2015) part of STAF 2015 in L'Aquila, Italy.

Published in: Software

  1. 1. Testing, Fixing, and Proving with Contracts Carlo A. Furia Chair of Software Engineering, ETH Zurich bugcounting.net @bugcounting
  2. 2. The (AlpTransit) Gotthard tunnel. The tunnel: • 57 km long • construction at both ends • underneath the Gotthard massif. Erstfeld: • canton Uri • German-speaking • weather probably cloudy. Bodio: • canton Ticino • Italian-speaking • weather probably sunny
  3. 3. Users with different requirements. Joe the programmer: • little or no background in formal techniques • weak and simple (incomplete) specifications • design not optimal for verification • bugs: full verification is unattainable • looks for the low-hanging fruit of verification. Verification expert: • fluent in formal logic techniques • strong, often complete, specifications • design for full verification • could use automation of simpler steps • aims at the holy grail of verified software
  4. 4. The Eiffel Verification Environment (EVE). Diagram: GUI; Verification Assistant; tools Inspector, AutoTest, AutoFix, AutoProof
  5. 5. The Eiffel Verification Environment (EVE). Diagram: GUI, CLI, and ComCom (web); Verification Assistant; tools Inspector, AutoTest, AutoFix, AutoProof
  6. 6. A key ingredient: contracts. Contracts are a form of lightweight specification: • Assertions (pre- and postconditions, invariants) • Contract language = Boolean expressions • Executable: bring immediate benefits for testing, debugging, and so on. Verification tools in EVE take advantage of (simple) functional specifications in the form of contracts.
  7. 7. Auto-active user/tool interaction: 1. Code + Annotations 2. Push button 3. Verification outcome 4. Correct/Revise
  8. 8. Roadmap: 1. AutoTest: find faults automatically 2. AutoFix: patch faults automatically 3. Verification assistant: combine tests & proofs 4. Two-step verification: help debug failed proofs 5. AutoProof: prove realistic programs
  9. 9. Next stop: AutoTest. Roadmap: 1. AutoTest: find faults automatically 2. AutoFix: patch faults automatically 3. Verification assistant: combine tests & proofs 4. Two-step verification: help debug failed proofs 5. AutoProof: prove realistic programs
  10. 10. AutoTest in a nutshell. AutoTest is a push-button generator of unit tests • Test = sequence of method calls on objects • Contracts as oracles, for a target call o.m: – Invalid test: o does not satisfy m’s precondition – Passing test: all contracts evaluate to True – Failing test: some contract evaluates to False. Similar tools: • Korat (Java + assertions) • QuickCheck (Haskell)
  11. 11. How AutoTest works. Pick a random object o: • Existing object from object pool • Fresh object of primitive type (e.g. random integer) • New object of class type (call constructor). Pick a random method m and call o.m. Classification based on runtime contract checking: • Invalid test • Failing test: bug found • Passing test: add any new objects to object pool
  12. 12. Test generation strategies. AutoTest is a push-button generator of unit tests • Basic generation strategy: random • Other strategies as extensions: – Random+ – Adaptive-random (object distance) – Precondition satisfaction – Stateful testing
  13. 13. Demo example: Bank Account
      class ACCOUNT
        balance: INTEGER
        deposit (amount: INTEGER)
          require 0 <= amount
          ensure balance = old balance + amount
        withdraw (amount: INTEGER)
          require 0 <= amount
          ensure
            balance_set: amount <= old balance implies balance = old balance - amount
            balance_not_set: amount > old balance implies balance = old balance
        invariant
          balance_nonnegative: balance >= 0
  14. 14. Demo 1: bug finding. AutoTest finds a bug in the implementation of withdraw that violates postcondition balance_not_set.
      withdraw (amount: INTEGER)
        require 0 <= amount
        do balance := balance + amount
        ensure
          balance_set: amount <= old balance implies balance = old balance - amount
          balance_not_set: amount > old balance implies balance = old balance
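The random testing loop described on the AutoTest slides can be sketched in a few lines. This is a Python stand-in, not AutoTest itself: the contracts of ACCOUNT are hand-coded as runtime checks, and the names `run_random_tests`, `PreconditionViolation`, and `ContractViolation`, the 50% reuse probability, and the amount range are all assumptions made for illustration.

```python
import random


class PreconditionViolation(Exception):
    """The caller broke a `require` clause: the test is invalid."""


class ContractViolation(Exception):
    """A postcondition or invariant failed: the test reveals a bug."""


class Account:
    """Python stand-in for ACCOUNT, with contracts checked at run time."""

    def __init__(self):
        self.balance = 0

    def _check_invariant(self):
        if not self.balance >= 0:
            raise ContractViolation("balance_nonnegative")

    def deposit(self, amount):
        if not 0 <= amount:
            raise PreconditionViolation("deposit")
        old_balance = self.balance
        self.balance = self.balance + amount
        if self.balance != old_balance + amount:
            raise ContractViolation("deposit postcondition")
        self._check_invariant()

    def withdraw(self, amount):
        if not 0 <= amount:
            raise PreconditionViolation("withdraw")
        old_balance = self.balance
        self.balance = self.balance + amount  # the bug from Demo 1
        if amount <= old_balance and self.balance != old_balance - amount:
            raise ContractViolation("balance_set")
        if amount > old_balance and self.balance != old_balance:
            raise ContractViolation("balance_not_set")
        self._check_invariant()


def run_random_tests(seed, count):
    """Run `count` random calls and classify each one using the contracts."""
    rng = random.Random(seed)
    pool = []  # objects created by earlier passing tests
    outcomes = {"invalid": 0, "passing": 0, "failing": 0}
    violated = set()
    for _ in range(count):
        reuse = bool(pool) and rng.random() < 0.5
        target = rng.choice(pool) if reuse else Account()
        method = rng.choice(["deposit", "withdraw"])
        amount = rng.randint(-5, 20)
        try:
            getattr(target, method)(amount)
        except PreconditionViolation:
            outcomes["invalid"] += 1
        except ContractViolation as violation:
            outcomes["failing"] += 1
            violated.add(str(violation))
            if reuse:
                pool.remove(target)  # drop the object left in a bad state
        else:
            outcomes["passing"] += 1
            if not reuse:
                pool.append(target)
    return outcomes, violated
```

Calling `run_random_tests(seed=0, count=200)` returns the count of invalid, passing, and failing tests together with the tags of the violated postconditions. A negative amount yields an invalid test, and a positive withdrawal on a fresh account trips `balance_not_set`, as in the demo.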
  16. 16. Next stop: AutoFix. Roadmap: 1. AutoTest: find faults automatically 2. AutoFix: patch faults automatically 3. Verification assistant: combine tests & proofs 4. Two-step verification: help debug failed proofs 5. AutoProof: prove realistic programs
  17. 17. AutoFix in a nutshell. AutoFix is a push-button generator of fixes. Diagram: coding produces code + contracts; AutoFix returns bugs + patches. Similar tools: • GenProg, Kali (C) • PAR (Java)
  18. 18. How AutoFix works. Pipeline: Program + Contracts → AutoTest → Test suite → Analysis → Suspicious states → Synthesis → Candidate fixes → Validation & ranking → Valid fixes. Example: states observed in tests: count = 1, count = 2, count = 0; suspicious state: count = 0 @ L4; candidate fix: if count = 0 then ...
  19. 19. AutoFix: Components. Program state abstraction: • snapshots: location, predicate, value. Fault localization: • static information: proximity to failing location/expression • dynamic information: number of failing/passing tests
  20. 20. AutoFix: Components. Program state abstraction: • snapshots: location, predicate, value. Synthesis: • enumeration of common replacement expressions and instructions • conditional execution: @ location: if predicate = value then some fix action
  21. 21. AutoFix: Components. Validation: • regression testing with all available tests for method being fixed • valid fix: passes all available tests. Ranking: • based on suspiciousness score of snapshots
  22. 22. Demo 1b: bug fixing. AutoFix builds fixes for the bug in the implementation of withdraw. A “high-quality” (proper, correct) fix:
  23. 23. Demo 1b: bug fixing. AutoFix builds fixes for the bug in the implementation of withdraw. A fix that just happens to pass all tests:
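The difference between a valid fix and a high-quality fix can be illustrated with a small validation step. This is a minimal sketch, assuming a hand-written candidate list and two hypothetical test suites. It does not reproduce AutoFix's synthesis or ranking, and the names `CANDIDATES`, `validate`, and `valid_fixes` are made up for the example.

```python
def postcondition_holds(old_balance, amount, new_balance):
    """Oracle: the postcondition of withdraw from the ACCOUNT example."""
    if amount <= old_balance:
        return new_balance == old_balance - amount  # balance_set
    return new_balance == old_balance               # balance_not_set


# Candidate bodies for withdraw, mapping (balance, amount) to the new balance.
CANDIDATES = {
    "original: balance + amount": lambda b, a: b + a,
    "replace + with -": lambda b, a: b - a,
    "guarded: subtract only if amount <= balance": lambda b, a: b - a if a <= b else b,
}


def validate(candidate, tests):
    """A fix is valid if it passes all available tests."""
    return all(postcondition_holds(b, a, candidate(b, a)) for b, a in tests)


def valid_fixes(tests):
    return [name for name, fix in CANDIDATES.items() if validate(fix, tests)]


# Hypothetical test inputs as (old balance, amount) pairs.
weak_tests = [(10, 3), (5, 5), (7, 0)]  # never withdraws more than the balance
strong_tests = weak_tests + [(2, 9)]    # adds an overdraft attempt
```

With `weak_tests`, both the plain subtraction and the guarded version are reported as valid, although the plain subtraction would drive the balance negative on an overdraft. With `strong_tests`, only the guarded version survives. The quality of the available tests therefore bounds the quality of the fixes that validation can vouch for.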
  24. 24. Experiments with AutoFix. Source programs: standard data-structure libraries, text library, card game.
      Fix implementation: LOC of source + contracts: 73’000; # unique errors: 204; % fixed errors: 42%; % high-quality fixes: 25%; time (test + fix): 17 + 3 minutes
      Fix contracts: LOC of source + contracts: 24’500; # unique errors: 44; % fixed errors: 95%; % high-quality fixes: 25%; time (test + fix): 31 + 3 minutes
  25. 25. Experiments with AutoFix. Source programs: standard data-structure libraries, text library, card game.
      Fix implementation: LOC of source + contracts: 73’000; # unique errors: 204; % fixed errors: 42%; % high-quality fixes: 25%; time (test + fix): 17 + 3 minutes
      For comparison, GenProg, according to the analysis by [Qi+, ISSTA’15]: < 2%
  26. 26. Next stop: Verification assistant. Roadmap: 1. AutoTest: find faults automatically 2. AutoFix: patch faults automatically 3. Verification assistant: combine tests & proofs 4. Two-step verification: help debug failed proofs 5. AutoProof: prove realistic programs
  27. 27. Integrating different tools. A verification assistant manages individual tools – Select tools and program parts to be verified – Collect results and aggregate them. Diagram: the Verification Assistant runs the tools (AutoTest, AutoProof, AutoFix, Inspector) on classes C1, C2, ..., Cn and collects the per-class results of each tool in a data pool
  28. 28. Scores: aggregated verification results. Each method & class receives a correctness score • A value in the interval [-1, 1] • Estimate of evidence for correctness. Scale: -1 = conclusive evidence of incorrectness; 0 = lack of evidence; 1 = conclusive evidence of correctness
  29. 29. Score for testing • Failing test case: conclusive evidence of incorrectness • Passing test case: increases evidence of correctness • Absolute value may vary according to other metrics – used heuristics, coverage, testing time, … On the scale from -1 to 1, slides 29 to 33 mark one failing test case and then, one at a time, three passing test cases.
  34. 34. Score for correctness proofs. AutoProof is sound but incomplete: – Timeout: score 0 – Failed proof: score -0.2. On the scale from -1 to 1: a failed proof for a complete tool sits at -1, a successful proof for a sound tool at 1
  35. 35. Combining scores of different tools • Running each tool determines a score for each method • Overall score for a class: weighted average • Weights depend on the relative confidence in reliability of tools – may be application and configuration dependent • Overall score of modules (packages) may also weigh components differently according to their criticality
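A weighted average of per-tool scores can be sketched as follows. The slides give only the general scheme, so the function name `combine_scores`, the weights, and the testing score of 0.8 are assumptions for illustration. The proof score of -0.2 is the value the slides assign to a failed proof with AutoProof.

```python
def combine_scores(scores, weights):
    """Weighted average of per-tool scores, each a value in [-1, 1]."""
    for tool, score in scores.items():
        if not -1.0 <= score <= 1.0:
            raise ValueError(f"score for {tool} is outside [-1, 1]")
    total_weight = sum(weights[tool] for tool in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[tool] * scores[tool] for tool in scores)
    return weighted_sum / total_weight


# A routine that fails to verify (-0.2) but passes its tests.
# The testing score 0.8 is a made-up value for this example.
deposit_scores = {"AutoProof": -0.2, "AutoTest": 0.8}

equal_weights = {"AutoProof": 1.0, "AutoTest": 1.0}
proof_heavy_weights = {"AutoProof": 3.0, "AutoTest": 1.0}
```

With equal weights the combined score is about 0.3, which reads as moderate evidence of correctness. Trusting the prover three times as much as the tests pulls the result down to about 0.05, close to "lack of evidence".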
  36. 36. Demo 2: combined testing and proving. The verification assistant runs on the version of ACCOUNT patched by AutoFix: deposit does not verify, but passes all tests ⇒ reasonable confidence in its correctness.
  37. 37. Next stop: Two-step verification. Roadmap: 1. AutoTest: find faults automatically 2. AutoFix: patch faults automatically 3. Verification assistant: combine tests & proofs 4. Two-step verification: help debug failed proofs 5. AutoProof: prove realistic programs
  38. 38. Modular proofs. Verifiers such as AutoProof perform modular reasoning • Effects of a call to method m within the caller = m’s specification (pre, post, frame)
      How we wrote it:
        deposit (amount: INTEGER)
          require 0 <= amount
          do update_balance (amount)
      How AutoProof sees it:
        deposit (amount: INTEGER)
          require 0 <= amount
          do
            assert update_balance.pre
            havoc update_balance.frame
            assume update_balance.post
  39. 39. Modular proofs in practice. Verifiers such as AutoProof perform modular reasoning • Necessary for scalability • Consistent with design-by-contract and information hiding • But providing the detailed specifications necessary for verification may be tedious or overly complex
  40. 40. Specification writing fatigue. Providing the specification necessary for verification may be tedious, especially in the most straightforward cases.
      How we wrote it:
        deposit (amount: INTEGER)
          require 0 <= amount
          do update_balance (amount)
          ensure balance = old balance + amount
      How we thought about it:
        deposit (amount: INTEGER)
          require 0 <= amount
          do balance := balance + amount
          ensure balance = old balance + amount
  41. 41. Debugging failed verification. When verification fails with verifiers such as AutoProof (modular, sound, incomplete): • Is there a bug? • Or is the program correct, but the specification insufficient? To help debug failed verification attempts, AutoProof features two-step verification.
  42. 42. Two-step verification. Two-step verification improves user feedback, especially when little specification is available. 1. First verification step – Standard modular verification 2. Second verification step – Ignore specification of called routines and loops – Use inlining and unrolling. Feedback: combination of outcomes of 1 & 2
  43. 43. Step 1: modular verification
      update_balance (a: INTEGER)
        do balance := balance + a end
      deposit (amount: INTEGER)
        require 0 <= amount
        do update_balance (amount)
        ensure balance = old balance + amount   -- Postcondition violated
      Modular verification fails. No postcondition of callee: effect on balance undefined.
  44. 44. Step 2: verification with inlining
      update_balance (a: INTEGER)
        do balance := balance + a end
      deposit (amount: INTEGER)
        require 0 <= amount
        do balance := balance + amount
        ensure balance = old balance + amount
      Verification with inlining succeeds. Attribute balance is incremented by amount. Feedback: change (strengthen) the specification of update_balance.
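The contrast between the two steps can be imitated with a bounded check. This is only an illustration: enumeration over a tiny value range stands in for the prover and is not a proof, and `DOMAIN`, `deposit_modular`, `deposit_inlined`, and `two_step` are names invented for the sketch.

```python
from itertools import product

DOMAIN = range(0, 4)  # small, arbitrary set of values to enumerate


def update_balance(balance, a):
    """Body of the callee: balance := balance + a."""
    return balance + a


def deposit_modular():
    """Step 1: replace the call by the callee's (empty) specification."""
    for old_balance, amount in product(DOMAIN, DOMAIN):
        # Every amount in DOMAIN satisfies `require 0 <= amount`.
        # Havoc: with no postcondition to assume, balance may be anything.
        for balance in range(0, 2 * max(DOMAIN) + 1):
            if balance != old_balance + amount:
                return ("fails", (old_balance, amount, balance))
    return ("succeeds", None)


def deposit_inlined():
    """Step 2: ignore the callee's specification and use its body."""
    for old_balance, amount in product(DOMAIN, DOMAIN):
        balance = update_balance(old_balance, amount)
        if balance != old_balance + amount:
            return ("fails", (old_balance, amount, balance))
    return ("succeeds", None)


def two_step():
    """Outcomes of the modular step and of the inlined step."""
    return deposit_modular()[0], deposit_inlined()[0]
```

`two_step()` reports that the modular check fails while the inlined check succeeds, which points to a missing postcondition on the callee and not to a bug in the implementation.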
  45. 45. Demo 2b: two-step verification. AutoProof with two-step verification runs on the version of ACCOUNT patched by AutoFix: deposit verifies after inlining update_balance • Provide postcondition to update_balance or • Direct AutoProof to use update_balance inlined. Follow this demo at http://bit.do/tap-tutorial (Switch to tab account2.e)
  46. 46. Two-step verification: feedback. Setting: r (require Pr do s ensure Qr) calls s (require Ps do ... ensure Qs).
      Step 1 (modular), verify r | Step 1 (modular), verify s | Step 2 (inlined), verify r | Suggestion
      Ps fails                   | Succeeds                   | Succeeds                   | Weaken Ps or use inlined
      Qr fails                   | Succeeds                   | Succeeds                   | Strengthen Qs or use inlined
      Succeeds                   | Qs fails                   | Succeeds                   | Strengthen Ps / Weaken Qs
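The feedback table amounts to a lookup from the three verification outcomes to a suggestion. The sketch below encodes only the three rows shown on the slide. The function name `suggest` and the fallback message are assumptions, and how AutoProof handles other outcome combinations is not covered here.

```python
# Keys: (step 1: verify r, step 1: verify s, step 2: verify r with s inlined)
FEEDBACK = {
    ("Ps fails", "succeeds", "succeeds"): "Weaken Ps or use inlined",
    ("Qr fails", "succeeds", "succeeds"): "Strengthen Qs or use inlined",
    ("succeeds", "Qs fails", "succeeds"): "Strengthen Ps / Weaken Qs",
}


def suggest(modular_r, modular_s, inlined_r):
    """Return the suggestion for a combination of outcomes, if the table has one."""
    key = (modular_r, modular_s, inlined_r)
    return FEEDBACK.get(key, "no suggestion in the table")
```

For the deposit example, the caller's postcondition fails modularly while everything else succeeds, so `suggest("Qr fails", "succeeds", "succeeds")` yields the advice to strengthen the callee's postcondition or use it inlined.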
  50. 50. Next stop: AutoProof. Roadmap: 1. AutoTest: find faults automatically 2. AutoFix: patch faults automatically 3. Verification assistant: combine tests & proofs 4. Two-step verification: help debug failed proofs 5. AutoProof: prove realistic programs
  51. 51. AutoProof in a nutshell. AutoProof is an auto-active verifier for Eiffel • Prover for functional properties • All-out support of object-oriented idiomatic structures (e.g. patterns) – Based on class invariants • Flexible: incrementality – Proving simple properties requires few annotations – Proving complex properties is possible with more effort
  52. 52. Demo 3: a taste of AutoProof. AutoProof verifies method transfer with suitable specification. Follow this demo at http://bit.do/tap-tutorial (Switch to tab account3.e)
      transfer (amount: INTEGER; other: ACCOUNT)
          -- Transfer `amount' from this account to `other'.
        require
          amount_non_negative: 0 <= amount
          amount_available: amount <= balance
        do
          withdraw (amount)
          other.deposit (amount)
        ensure
          deposit_done: other.balance = old other.balance + amount
          withdrawal_done: balance = old balance - amount
  53. 53. Sound program verifiers compared. Chart with axes "more complex properties" and "more automation", ranging from static analysis to interactive verifiers (KIV); tools shown: ESC/Java2, OpenJML, Spec#, VCC, Chalice, Dafny, KeY, VeriFast
  54. 54. Reasoning with class invariants. Class invariants are a natural way to reason about object-oriented programs: invariant = consistency of objects. Example: ACCOUNT invariant balance >= 0
  55. 55. Multi-object structures. Object-oriented programs involve multiple objects (duh!), whose consistency is often mutually dependent. Diagram: ACCOUNT and LIST, linked by attribute transactions. ACCOUNT invariant: balance >= 0; balance = sum (transactions)
  56. 56. Consistency of multi-object structures. Mutually dependent object structures require extra care to enforce, and reason about, consistency (cf. encapsulation). Diagram: AUDITOR, LIST, and ACCOUNT (attribute transactions). ACCOUNT invariant: balance >= 0; balance = sum (transactions)
  58. 58. Open and closed objects. When (at which program points) must class invariants hold? To provide flexibility, objects in AutoProof can be open or closed.
                   CLOSED       OPEN
      Object:      Consistent   Inconsistent
      State:       Stable       Transient
      Invariant:   Holds        May not hold
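A runtime analogue may help picture the open/closed discipline: the invariant is only checked when an object is closed again, so it can be broken temporarily while the object is open. AutoProof reasons about this statically. The Python class below, its `closed` flag, the `unwrap`/`wrap` method names, and the `InvariantViolation` exception are assumptions chosen for this sketch.

```python
class InvariantViolation(Exception):
    """Raised when an object is closed while its invariant does not hold."""


class Account:
    def __init__(self):
        self.transactions = []
        self.balance = 0
        self.closed = False
        self.wrap()

    def invariant(self):
        return self.balance >= 0 and self.balance == sum(self.transactions)

    def unwrap(self):
        """Open the object: its invariant may be broken from here on."""
        if not self.closed:
            raise RuntimeError("object is already open")
        self.closed = False

    def wrap(self):
        """Close the object: this is the point where the invariant must hold."""
        if self.closed:
            raise RuntimeError("object is already closed")
        if not self.invariant():
            raise InvariantViolation("cannot close an inconsistent object")
        self.closed = True

    def deposit(self, amount):
        if not 0 <= amount:
            raise ValueError("require 0 <= amount")
        self.unwrap()
        self.transactions.append(amount)  # transient state: invariant broken
        self.balance += amount            # invariant restored
        self.wrap()
```

Between the two updates inside `deposit` the object is inconsistent, which is acceptable because it is open. Trying to close it in that state would raise `InvariantViolation`.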
  59. 59. Ownership. For hierarchical object structures, AutoProof offers an ownership protocol. ACCOUNT invariant: balance >= 0; owns = [ transactions ]; balance = sum (transactions). Diagram, built up over slides 59 to 65: ACCOUNT owns the LIST referenced by transactions; AUDITOR, add_node, and update_balance are added in later steps.
  66. 66. Demo 4: ownership in AutoProof. AutoProof verifies the ACCOUNT with an owned list of transactions. Follow this demo at http://bit.do/tap-tutorial (Switch to tab account4.e)
      transactions: SIMPLE_LIST [INTEGER]
          -- History of transactions:
          -- positive integer = deposited amount
          -- negative integer = withdrawn amount
          -- latest transactions in back of list
  67. 67. Semantic collaboration. For collaborative object structures, AutoProof offers a novel protocol: semantic collaboration. Diagram: ACCOUNT objects reference a BANK through attribute bank. ACCOUNT invariant: interest_rate = bank.rate
  69. 69. Semantic collaboration • Subjects = objects my consistency depends on • Observers = objects whose consistency depends on me. Diagram: ACCOUNT objects reference BANK through bank (subjects); BANK references them back (observers).
      invariant
        subjects = [ bank ]
        Current in bank.observers -- Implicit in AutoProof
        interest_rate = bank.rate
  70. 70. Demo 5: collaboration in AutoProof. AutoProof verifies the ACCOUNT with a BANK that sets a master interest rate. Follow this demo at http://bit.do/tap-tutorial (Switch to tabs account5.e and bank5.e)
      bank: BANK
          -- Provider of this account
      invariant
        non_negative_rate: 0 <= interest_rate
        bank_exists: bank /= Void
        consistent_rate: interest_rate = bank.master_rate
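The subject/observer relationship can be pictured with a small runtime model. It is a sketch of the idea only: AutoProof verifies the protocol statically, and the Python classes, the method `set_master_rate`, and the eager update of all observers are assumptions made for illustration.

```python
class Bank:
    def __init__(self, master_rate):
        self.master_rate = master_rate
        self.observers = []  # accounts whose invariant depends on this bank

    def set_master_rate(self, rate):
        if not 0 <= rate:
            raise ValueError("rate must be non-negative")
        self.master_rate = rate
        # The bank knows its observers, so it can restore their consistency.
        for account in self.observers:
            account.interest_rate = rate


class Account:
    def __init__(self, bank):
        self.bank = bank
        self.subjects = [bank]  # objects this account's consistency depends on
        self.interest_rate = bank.master_rate
        bank.observers.append(self)

    def invariant(self):
        return (
            0 <= self.interest_rate                        # non_negative_rate
            and self.bank is not None                      # bank_exists
            and self.interest_rate == self.bank.master_rate  # consistent_rate
            and self in self.bank.observers
        )
```

Changing the rate through `set_master_rate` keeps every registered account consistent. Assigning `bank.master_rate` directly bypasses the observers and breaks `consistent_rate`, which is the kind of update the protocol is meant to rule out.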
  71. 71. AutoProof on realistic software
      Verification benchmarks: # programs: 25; LOC: 4400; SPEC/CODE: lines 1.0, tokens 1.9; verification time: total 3.4 min, longest method 12 sec, average method < 1 sec
      EiffelBase2 – a realistic container library: # classes: 46; LOC: 8400; SPEC/CODE: lines 1.4, tokens 2.7; verification time: total 7.2 min, longest method 12 sec, average method < 1 sec
  72. 72. Testing, fixing, and proving with contracts: acknowledgements. Julian Tschannen, Nadia Polikarpova, Yu (Max) Pei, Yi (Jason) Wei, Andreas Zeller, Bertrand Meyer, Ilinca Ciupa-Moser, Andreas Leitner
  73. 73. Testing, fixing, and proving with contracts (in Eiffel): 1. AutoTest 2. AutoFix 3. Verif. assist. 4. Two-step 5. AutoProof. http://se.inf.ethz.ch/research/eve/ http://cloudstudio.ethz.ch/comcom/ See TAP 2015’s proceedings for references to technical papers
