Havoc

Published in: Technology
  • I will start by talking a bit about HAVOC. HAVOC is a modular checker for C programs. The tool extends the state of the art in C verification along several dimensions: it provides a much more detailed view of memory operations, an expressive annotation language, and a precise heap representation, while retaining efficiency. Checking is performed locally for each procedure and can therefore scale to large code bases. While analyzing a procedure, HAVOC uses the contracts of the procedures it calls, so those procedures need to have contracts.

    1. 1. HAVOC: A precise and scalable verifier for systems software
        Shaz Qadeer, Microsoft Research
    2. 2. Collaborators
        • Researchers
            – Jeremy Condit, Shuvendu Lahiri
        • Interns
            – Shaunak Chatterjee, Brian Hackett, Zvonimir Rakamaric, Ian Wehrman, Thomas Wies
    3. 3. HAVOC
        • Modular verifier for C programs
            – Verifies each procedure separately
            – Requires contracts: preconditions, postconditions, modifies clauses, loop invariants
        • Features
            – Accurate heap model
            – Expressive annotation language
            – Efficient checking using SMT solvers
        • Precise and efficient reasoning for loop-free and call-free code
    4. 4. [Figure: HAVOC tool flow] Annotated C program → Visual C front end → control flow graph → CtoBoogiePL (with memory model) → Boogie program → Boogie VC generator → verification condition → Z3 (SMT solver) → Verified / Warning
    5. 5. Challenges for HAVOC
        • Concise and precise expression of non-aliasing and disjointness of heap values
        • Properties of unbounded collections
            – Lists, arrays, …
        • Enable such reasoning for low-level software
            – pointer arithmetic
            – interior pointers
            – nested structures and unions
            – …
    6. 6. But will programmers ever write contracts?
        • In some cases, they might
            – security properties: thousands of buffer annotations in Windows code
            – maintenance of critical legacy code: the Windows NT file system
        • Automatic annotation inference
            – precise and efficient checking of annotated programs is a crucial first step
    7. 7. Roadmap
        • Novel features of the specification language
        • Dealing with low-level features of C
        • Concluding remarks
    8. 8. [Figure: doubly linked list log_list with head and tail pointers. Each LinkNode has next, prev, and data fields; data points to a struct _logentry with fields char *channel_name, file_name, and logtype. Example taken from muh, an Internet Relay Chat (IRC) bouncer.]
    9. 9. LinkNode *iter = log_list.head;
          while (iter != NULL) {
            struct _logentry *entry = iter->data;
            free (entry->channel_name);
            free (entry->file_name);
            free (entry);
            entry = NULL;
            iter = iter->next;
          }
        Goal: ensure absence of double free
        Data structure invariant (needs a reachability predicate and universal quantification):
          For every node x in the list between log_list.head and null:
            x->data is a unique pointer, and
            x->data->channel_name is a unique pointer, and
            x->data->file_name is a unique pointer.
    10. 10. Limitations of SMT solvers
        • No support for precise reasoning with reachability predicate
            – Incompleteness in Floyd-Hoare proofs for straight-line code
        • Brittle support for quantifiers
            – Complexity: NP-complete (ground) → undecidable
            – Leads to unpredictable behavior of verifiers: proof times, proof success rate
            – Requires user ingenuity to craft axioms/invariants with quantifiers
    11. 11. Contribution
        • Expressive and efficient logic for precise reasoning about reachability, unique pointers, and restricted quantification
        • A decision procedure for the logic built over an SMT solver
    12. 12. Simple Java-like memory model
        • Heap consists of a set of objects (obj)
        • Each field “f” is a mutable map
            – f: obj → obj
            – g: obj → int
            – h: obj → bool
        • The sort obj may be refined into a collection of sorts
    13. 13. Reachability predicate: Btwn_f
        [Figure: doubly linked list from node x to node y; each node has next, prev, and data fields]
        Btwn_next(x, y)
        Btwn_prev(y, x)
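A minimal C sketch of how membership in Btwn_f(x, y) could be checked on a concrete heap, assuming the set reading used on the later slides (t ∈ Btwn_f(x, y) when y is reachable from x along f and t lies on that path, endpoints included). The names Node, Field, next_of, prev_of, and in_btwn are hypothetical and are not part of HAVOC; the tool reasons about this predicate symbolically rather than by traversal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type mirroring the slide's next/prev/data fields. */
typedef struct Node {
    struct Node *next;
    struct Node *prev;
    void *data;
} Node;

/* A field f is modeled as a function from nodes to nodes. */
typedef const Node *(*Field)(const Node *);

static const Node *next_of(const Node *n) { return n->next; }
static const Node *prev_of(const Node *n) { return n->prev; }

/*
 * Returns 1 iff t is in Btwn_f(x, y): y is reachable from x by following f,
 * and t lies on that path (x and y included).
 * Assumption of this sketch: the f-path from x either reaches y, reaches
 * NULL, or returns to x; other cyclic shapes are not handled.
 */
static int in_btwn(Field f, const Node *t, const Node *x, const Node *y) {
    int seen_t = 0;
    const Node *cur = x;
    for (;;) {
        if (cur == t) seen_t = 1;
        if (cur == y) return seen_t;   /* reached the end of the segment */
        if (cur == NULL) return 0;     /* ran off the list: y unreachable */
        cur = f(cur);
        if (cur == x) return 0;        /* went around a cycle without meeting y */
    }
}
```

For a list a → b → c, in_btwn(next_of, &b, &a, &c) and in_btwn(prev_of, &b, &c, &a) both hold, matching Btwn_next(x, y) and Btwn_prev(y, x) on the slide.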
    14. 14. Inverse of a function: f⁻¹
        [Figure: doubly linked list with nodes x and y whose data fields both point to the same object w]
        data⁻¹(w) = {x, y}
    15. 15. LinkNode *iter = log_list.head;
          while (iter != NULL) {
            struct _logentry *entry = iter->data;
            free (entry->channel_name);
            free (entry->file_name);
            free (entry);
            entry = NULL;
            iter = iter->next;
          }
        Data structure invariant:
          For every node x in the list between log_list.head and null: x->data is a unique pointer, and …
          ∀x ∈ Btwn_f(log_list.head, null) \ {null}. data⁻¹(data(x)) = {x} …
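The invariant on this slide can be illustrated with a runtime check, sketched below under stated assumptions: the inverse data⁻¹ is computed only over the nodes of one list (in the logic it ranges over all heap objects), and the helper names data_inverse_size, data_is_unique, and free_entries are hypothetical. HAVOC proves the invariant statically; this code only shows what the formula says.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct _logentry {
    char *channel_name;
    char *file_name;
    int logtype;
};

typedef struct LinkNode {
    struct LinkNode *next;
    struct LinkNode *prev;
    void *data;
} LinkNode;

/* Size of data^-1(w), restricted to the list starting at head:
 * the number of nodes whose data field points to w. */
static size_t data_inverse_size(const LinkNode *head, const void *w) {
    size_t n = 0;
    for (const LinkNode *x = head; x != NULL; x = x->next)
        if (x->data == w) n++;
    return n;
}

/* Runtime reading of:
 * forall x in Btwn_next(head, null) \ {null}. data^-1(data(x)) = {x} */
static int data_is_unique(const LinkNode *head) {
    for (const LinkNode *x = head; x != NULL; x = x->next)
        if (data_inverse_size(head, x->data) != 1) return 0;
    return 1;
}

/* The loop from the slide; returns the number of entries freed.
 * Free of double frees only if data_is_unique(head) holds, and likewise
 * for channel_name and file_name, which this sketch does not check.
 * Unlike the slide, it also clears iter->data after freeing. */
static size_t free_entries(LinkNode *head) {
    size_t freed = 0;
    for (LinkNode *iter = head; iter != NULL; iter = iter->next) {
        struct _logentry *entry = iter->data;
        free(entry->channel_name);
        free(entry->file_name);
        free(entry);
        iter->data = NULL;
        freed++;
    }
    return freed;
}
```

If two nodes share a data pointer, data_is_unique returns 0, which is exactly the situation in which the loop would free the same entry twice.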
    16. 16. Expressive logic
        • Express properties of collections
            ∀x ∈ Btwn_f(f(hd), hd). state(x) = LOCKED    // cyclic
        • Arithmetic reasoning on data (e.g. sortedness)
            ∀x ∈ Btwn_f(hd, null) \ {null}. ∀y ∈ Btwn_f(x, null) \ {null}. d(x) ≤ d(y)
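As an illustration, the two quantified properties on this slide can be read as loops over a concrete list. The sketch below assumes a hypothetical Cell type with a link field f, a data field d, and a state field; the function names are not from HAVOC, and the sortedness comparison is taken to be ≤.

```c
#include <assert.h>
#include <stddef.h>

enum { UNLOCKED = 0, LOCKED = 1 };

/* Hypothetical list cell: f is the link field, d the data field. */
typedef struct Cell {
    struct Cell *f;
    int d;
    int state;
} Cell;

/* forall x in Btwn_f(hd, null) \ {null}.
 *   forall y in Btwn_f(x, null) \ {null}. d(x) <= d(y)
 * Assumes the list starting at hd is NULL-terminated. */
static int is_sorted(const Cell *hd) {
    for (const Cell *x = hd; x != NULL; x = x->f)
        for (const Cell *y = x; y != NULL; y = y->f)
            if (x->d > y->d) return 0;
    return 1;
}

/* forall x in Btwn_f(f(hd), hd). state(x) = LOCKED
 * Assumes hd is non-NULL and lies on a cycle of f links. */
static int all_locked_cyclic(const Cell *hd) {
    const Cell *x = hd->f;
    for (;;) {
        if (x->state != LOCKED) return 0;
        if (x == hd) return 1;   /* walked the whole cycle back to hd */
        x = x->f;
    }
}
```

The nested loop mirrors the nested quantifiers directly: the outer variable ranges over the whole list, the inner one over the suffix starting at x.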
    17. 17. Precise
        • Given the Floyd-Hoare triple X = {P} S {Q}
            – P and Q are expressed in our logic
            – S is a loop-free, call-free program
        • We can construct a formula Y in our logic
            – Y is linear in the size of X
            – X is valid iff Y is valid
        • Need annotations/abstractions only at procedure/loop boundaries
    18. 18. Efficient
        • Decision problem is NP-complete
            – Can’t expect any better with propositional logic!
            – Retains the complexity of current SMT logics
        • Provide a decision procedure for the logic on top of the state-of-the-art Z3 SMT solver
            – Leverages powerful ground-theory reasoning (arithmetic, arrays, uninterpreted functions, …)
    19. 19. Ground logic
        t ∈ Term     ::= c | x | t1 + t2 | t1 - t2 | f(t)
        G ∈ GFormula ::= t = t’ | t < t’ | t ∈ Btwn_f(t1, t2) | ¬G
        S ∈ Set      ::= f⁻¹(t) | Btwn_f(t1, t2)
        F ∈ Formula  ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F
    20. 20. Ground decision procedure
        • Provide a set of 10 rewrite rules for Btwn_f
            – Sound, complete, and terminating
        • E.g. Transitivity3
            premises:    t1 ∈ Btwn_f(t0, t2),  t ∈ Btwn_f(t0, t1)
            conclusions: t ∈ Btwn_f(t0, t2),  t1 ∈ Btwn_f(t, t2)
    21. 21. Logic
        t ∈ Term     ::= c | x | t1 + t2 | t1 - t2 | f(t)
        G ∈ GFormula ::= t = t’ | t < t’ | t ∈ Btwn_f(t1, t2) | ¬G
        S ∈ Set      ::= f⁻¹(t) | Btwn_f(t1, t2)
        F ∈ Formula  ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F
        Note: ∀x ∈ S. F is bounded quantification over interpreted sets
    22. 22. Lazy quantifier instantiation
        • Instantiation rule
            premises:   t ∈ S,  ∀x ∈ S. F
            conclusion: F[t/x]
        • Lazy instantiation
            – Instantiate only when a term t belongs to the set S
            – Substantially reduces the number of terms used to instantiate a quantified fact
        • Terminates if ∀x ∈ S. F is sort-restricted
            – sort(x) is less than sort(t[x]) for any term t[x] in F
    23. 23. Experience
        • Compared with an earlier implementation
            – Unrestricted quantifiers, incomplete axiomatization of reachability, no f⁻¹
            – Small to medium-sized benchmarks
        • Greatly improved the predictability of HAVOC
            – Reduced runtimes (2X – 100X)
            – Eliminated the need for carefully crafted axioms and invariants
            – Can handle newer examples
    24. 24. Roadmap
        • Novel features of the specification language
        • Dealing with low-level features of C
        • Concluding remarks
    25. 25. struct list { list *next; list *prev; };
          struct record { int data1; list node; int data2; };
        [Figure: two record objects, each laid out as data1, next, prev, data2, with pointers p and q into a record]
          q = CONTAINER(p, record, node)
            = (record *) ((int *) p - (int) (&(((record *)0)->node)))
            = (record *) ((int *) p - 1)
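A compilable sketch of the CONTAINER idiom from this slide. It is an assumption of this example, not the slide's exact macro: it uses the standard offsetof and byte arithmetic instead of the null-pointer expression and word arithmetic, because on a real compiler the offset of node depends on padding and is generally not one int. The helper record_of is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct list {
    struct list *next;
    struct list *prev;
} list;

typedef struct record {
    int data1;
    list node;
    int data2;
} record;

/* Byte-offset variant of the slide's CONTAINER macro: step back from a
 * pointer to the embedded field to the start of the enclosing object. */
#define CONTAINER(ptr, type, field) \
    ((type *)((char *)(ptr) - offsetof(type, field)))

/* Recover the enclosing record from a pointer to its embedded list node.
 * Only meaningful if p really points at the node field of a record,
 * which is the invariant the slides set out to verify. */
static record *record_of(list *p) {
    return CONTAINER(p, record, node);
}
```

The slide's simplified form, (record *) ((int *) p - 1), corresponds to the idealized memory model in which every field occupies exactly one word.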
    26. 26. void init_all_records(list *p) {
            while (p != NULL) {
              init_record(p);
              p = p->next;
            }
          }
          void init_record(list *p) {
            record *r = CONTAINER(p, record, node);
            r->data2 = 42;
          }
        • Type safety requires nontrivial reasoning
            – the container of every element in the list has type record*
        • Use of a memory model with field abstraction is unsound
        • Field abstraction is crucial to all property checkers
            – &a->data1 is not aliased to &b->data2
            – init_all_records(p) preserves the assertion a->data1 == 0
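The two procedures on this slide can be exercised as ordinary C. The sketch below repeats the struct definitions from the previous slide and assumes an offsetof-based CONTAINER macro (the slide's own macro uses word arithmetic). It shows the property the slide names: writing data2 through the container pointer leaves data1 untouched, provided every list node really is embedded in a record.

```c
#include <assert.h>
#include <stddef.h>

typedef struct list {
    struct list *next;
    struct list *prev;
} list;

typedef struct record {
    int data1;
    list node;
    int data2;
} record;

/* Assumed offsetof-based form of the CONTAINER macro. */
#define CONTAINER(ptr, type, field) \
    ((type *)((char *)(ptr) - offsetof(type, field)))

/* Writes data2 of the record that contains the list node p. */
static void init_record(list *p) {
    record *r = CONTAINER(p, record, node);
    r->data2 = 42;
}

/* Initializes every record whose node is on the list starting at p.
 * Correct only if each node on the list is embedded in a record. */
static void init_all_records(list *p) {
    while (p != NULL) {
        init_record(p);
        p = p->next;
    }
}
```

If some node on the list were not embedded in a record, init_record would write outside the object, which is why type safety here depends on a program-specific invariant.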
    27. 27. Unify type checking and property checking
        • Harness the power of constraint solvers to enhance type checking
            – type safety often depends on program-specific invariants
        • Harness the strong guarantees provided by the type invariant to enhance property checking
            – non-aliasing, field abstraction
    28. 28. Mem: int → int (mutable)
          Type: int → type (immutable)
        [Figure: memory cells at addresses 99 to 102 with their Mem values and Type entries, drawn from Int, List, Record, Ptr(Int), Ptr(List), Ptr(Record)]
        Type invariant: ∀a:int. HasType(Mem(a), Type(a))
    29. 29. C source:
          struct list { list *next; list *prev; };
          struct record { int data1; list node; int data2; };
          void init_record(list *p) {
            record *r = CONTAINER(p, record, node);
            r->data2 = 42;
          }
        Translation with contracts:
          requires ∀a:int. HasType(Mem(a), Type(a))
          requires HasType(p, Ptr(List))
          ensures ∀a:int. HasType(Mem(a), Type(a))
          void init_record(int p) {
            var r:int;
            r := p-1;
            assert HasType(r, Ptr(Record));
            Mem(r+3) := 42;
            assert ∀a:int. HasType(Mem(a), Type(a));
          }
    30. 30. struct list { list *next; list *prev; };
          struct record { int data1; list node; int data2; };
        HasType(v, Int) ⇔ true
        HasType(v, Ptr(t)) ⇔ v = 0 ∨ (v > 0 ∧ Match(v, t))
        Match(a, Int) ⇔ Type(a) = Int
        Match(a, Ptr(t)) ⇔ Type(a) = Ptr(t)
        Match(a, List) ⇔ Match(a, Ptr(List)) ∧ Match(a+1, Ptr(List))
        Match(a, Record) ⇔ Match(a, Int) ∧ Match(a+1, List) ∧ Match(a+3, Int)
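The definitions of HasType and Match can be tried out on a small simulated memory. The sketch below is an illustration under stated assumptions: memory is a fixed array of 16 words, addresses outside it never match, the connectives in the definitions are read as ⇔, ∨, and ∧, and all C names (Heap, Ty, match, has_type, type_invariant, init_record) are hypothetical. HAVOC discharges these conditions with an SMT solver; nothing here is its implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MEM_WORDS 16

/* Word types (Int, Ptr(t)) and structured types (List, Record). */
typedef enum {
    TY_INT = 0, TY_LIST, TY_RECORD,
    TY_PTR_INT, TY_PTR_LIST, TY_PTR_RECORD
} Ty;

typedef struct {
    int mem[MEM_WORDS];   /* Mem:  address -> value (mutable)        */
    Ty type[MEM_WORDS];   /* Type: address -> word type (immutable)  */
} Heap;

/* Match(a, t): the words starting at address a are laid out as type t. */
static int match(const Heap *h, int a, Ty t) {
    if (a < 0 || a >= MEM_WORDS) return 0;   /* outside the modeled memory */
    switch (t) {
    case TY_LIST:
        return match(h, a, TY_PTR_LIST) && match(h, a + 1, TY_PTR_LIST);
    case TY_RECORD:
        return match(h, a, TY_INT) && match(h, a + 1, TY_LIST)
            && match(h, a + 3, TY_INT);
    default:                                 /* Int and Ptr(t) are word types */
        return h->type[a] == t;
    }
}

/* HasType(v, t): value v is a legal value of word type t. */
static int has_type(const Heap *h, int v, Ty t) {
    switch (t) {
    case TY_INT:        return 1;
    case TY_PTR_INT:    return v == 0 || (v > 0 && match(h, v, TY_INT));
    case TY_PTR_LIST:   return v == 0 || (v > 0 && match(h, v, TY_LIST));
    case TY_PTR_RECORD: return v == 0 || (v > 0 && match(h, v, TY_RECORD));
    default:            return 0;            /* List and Record are not word types */
    }
}

/* Type invariant: forall a. HasType(Mem(a), Type(a)). */
static int type_invariant(const Heap *h) {
    for (int a = 0; a < MEM_WORDS; a++)
        if (!has_type(h, h->mem[a], h->type[a])) return 0;
    return 1;
}

/* Translated init_record: r := p - 1; Mem(r+3) := 42.
 * Returns 1 if both assertions of the translation hold, else 0.
 * A null container pointer is rejected before the write. */
static int init_record(Heap *h, int p) {
    int r = p - 1;
    if (r == 0 || !has_type(h, r, TY_PTR_RECORD)) return 0;
    h->mem[r + 3] = 42;
    return type_invariant(h);
}
```

With a record laid out at addresses 4 to 7 (Int, Ptr(List), Ptr(List), Int), init_record(&h, 5) succeeds and stores 42 at address 7, while storing a non-null value into a Ptr(List) cell that does not point at a List breaks type_invariant.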
    31. 31. C source:
          struct list { list *next; list *prev; };
          struct record { int data1; list node; int data2; };
          void init_record(list *p) {
            record *r = CONTAINER(p, record, node);
            r->data2 = 42;
          }
        Translation with contracts:
          requires HasType(p-1, Ptr(Record)) ∧ p-1 ≠ 0
          requires ∀a:int. HasType(Mem(a), Type(a))
          requires HasType(p, Ptr(List))
          ensures ∀a:int. HasType(Mem(a), Type(a))
          void init_record(int p) {
            var r:int;
            r := p-1;
            assert HasType(r, Ptr(Record));
            Mem(r+3) := 42;
            assert ∀a:int. HasType(Mem(a), Type(a));
          }
    32. 32. struct list { list *next; list *prev; };
          struct record { int data1; list node; int data2; };
        HasType(v, Int) ⇔ true
        HasType(v, Data1) ⇔ true
        HasType(v, Data2) ⇔ true
        HasType(v, Ptr(t)) ⇔ v = 0 ∨ (v > 0 ∧ Match(v, t))
        Match(a, Int) ⇔ Type(a) = Int
        Match(a, Data1) ⇔ Type(a) = Data1
        Match(a, Data2) ⇔ Type(a) = Data2
        Match(a, Ptr(t)) ⇔ Type(a) = Ptr(t)
        Match(a, List) ⇔ Match(a, Ptr(List)) ∧ Match(a+1, Ptr(List))
        Match(a, Record) ⇔ Match(a, Data1) ∧ Match(a+1, List) ∧ Match(a+3, Data2)
    33. 33. Other highlights
        • Decision procedure for type safety
            – suffices to instantiate the type invariant and the definitions of Match and HasType on a few terms
        • Extensions
            – unions
            – function pointers
            – parametric polymorphism
            – user-defined types
            – sub-word accesses (char, short)
    34. 34. Experience
        • Property checking on small benchmarks
            – list manipulation: insertion, removal, multiple lists each with a different container type
            – sorting: bubble sort, merge sort, quick sort
            – intuitive and concise annotations
        • Type checking of four WDK drivers
            – cancel, event, kbfiltr, vserial
            – ~1 min to check each driver
            – ~5KLOC, ~225 annotations
    35. 35. Roadmap
        • Novel features of the specification language
        • Dealing with low-level features of C
        • Concluding remarks
    36. 36. Other case studies with HAVOC
        • Synchronization protocols protecting critical data structures in the NT file system (Brian Hackett)
            – ~300KLOC, 1500 procedures
            – reference count usage, lock usage, data races, teardown races
            – 45 confirmed bugs (out of 125 warnings)
            – most bugs fixed
        • Spin lock usage in Windows device drivers (Juan Pablo Galeotti, Thomas Wies)
            – flpydisk, kbdclass, daytona, serial (~50KLOC)
    37. 37. HAVOC is available
        • Download:
            – http://research.microsoft.com/projects/HAVOC
    38. 38. Future directions
        • Unified decision procedure for reachability, inverse, arrays, and types for the low-level memory model
        • Exploiting the type invariant for property checking on device drivers
        • Annotation inference
    39. 39. Questions
