Bounded Model Checking for C Programs in an Enterprise Environment

Michael Tautschnig
Amazon Web Services & Queen Mary University of London

  1. Bounded Model Checking for C Programs in an Enterprise Environment. Michael Tautschnig, Amazon Web Services & Queen Mary University of London
  2. Customer: I would like to get a guarantee that there are no security bugs in this software.
  3. “Software”
  4. “Software” eco system of can’t be published, but …
  5. Ample Open-Source Software “out there”
      • Debian (http://sources.debian.net/stats/, 21st October 2016)
          • 26,900 source packages
          • 13,736,903 individual source files
          • 1,276,743,654 lines of source code (any programming language)
          • 45.5% (approx 500M) C code, 22.2% C++, 5.6% shell, 4.7% Java
      • SourceForge, GitHub, CodePlex, ...: how to automate any kind of analysis?
      • Distributions (RedHat, Ubuntu/Debian, SuSE, … but also industrial set-ups)!
          • Software organised in source packages
          • Uniform interface to access/download packages
          • Uniform build interface, dependency management
  6. How?
  7. Building one Source Package: Compiler Tool-chain
      • For now: C source code only
      • goto-cc (part of CBMC distribution)
          • Uses compiler’s (here: GCC’s) preprocessor
          • Own C parser/front end (no Cil, LLVM, EDG, ...)
          • Supports GCC, Visual Studio, CodeWarrior, ARM-CC dialects and command line options
          • Builds intermediate representation understood by CBMC/CProver tools
          • Linking of compiled files/archives/libraries
  8. Supporting arbitrary Build Systems
      • Builds are performed in chroot environments
      • /usr/bin/gcc and /usr/bin/ld replaced by scripts invoking goto-cc (+ more work)
      • Key procedure:
          1. Run real compiler/linker (gcc/ld)
          2. Compile/link using goto-cc
          3. Add result as additional ELF section
      • Resulting file remains executable
      • Stable under file renaming, archiving, etc.
      • Linking stage extracts intermediate representation from extra ELF section
      • [Diagram labels: x86 binary, CProver IR]
  9. Building Thousands of Packages
  10. Infrastructure: (Ab-)using Jenkins
      • Scripts, notes, configuration: https://github.com/tautschnig/cprover-debian
      • Jenkins master: 4 cores, 64 GB
      • 5 slave nodes: each 64 cores, 256 GB memory
      • Ultimate Debian Database: package versions, bugs
      • Debian mirror: source archives
      • [Diagram edge labels: SQL, SSH, FTP]
  11. Current per-package Work Flow
      • Compile, link
      • Store archive of all object files/executables
      • dump-c: create human-readable C code from IR
      • Add generic assertions (pointer checks, arithmetic overflow, no-NaN, ...)
      • Run CBMC w/unwinding bound 1, Z3/Minisat (DAC’03, TACAS’04, CAV’13)
      • Loop acceleration (CAV’13)
      • Re-compile using goto-cc
      • Static weak memory cycles (TOPLAS/PLDI’14)
      • Re-compile using gcc (errors not fatal)
  12. Results?
  13. Exercising Language Front Ends [per-package work flow diagram repeated from slide 11]
  14. Exercising Language Front Ends
      • Many bug fixes and improvements to the parser and type checker
      • Re-engineering of parts of the linker
      • Bug fixes in IR construction
      • Compilation (without further analysis steps) of entire archive: ~2 days
      • > 250 GB of compressed archives of IR object files/executables
      • 10314 archives available: http://theory.eecs.qmul.ac.uk/debian+mole/pkgs/
  15. Results Relevant to Practitioners: Bug Reports
      • Key feature: type checking at link time
      • 844 bugs reported, 530 already fixed by developers
      • Hundreds still to be reported
      • http://bugs.debian.org/cgi-bin/pkgreport.cgi?users=mt@debian.org&tag=goto-cc&archive=both
  16. Reporting bugs
  17. Where are the cats?
      • CAV’14: J. Alglave, D. Kroening, V. Nimal, D. Poetzl: Don't sit on the fence: A static analysis approach to automatic fence insertion
      • PLDI’14/TOPLAS: J. Alglave, L. Maranget, M. Tautschnig: Herding Cats - Modelling, simulation, testing, and data-mining for weak memory (cited in Linux Weekly News and C/C++ WG21/N4036)
  18. Focus on improving/developing Methods [per-package work flow diagram repeated from slide 11]
  19. TOPLAS/PLDI’14: analysing 200 million LOC for potential weak memory susceptibility
  20. Automated Information Leak Detection
  21. Analysing the Patched Version
  22. Overall Analysis Status (preliminary!) [per-package work flow diagram repeated from slide 11]
  23. Overall Analysis Status (preliminary!)
      • In addition to 314 bugs reported and not yet fixed: 4915 packages with error reports. Top causes:
          • 1789 CBMC counterexamples (including several using loop acceleration)
          • 1711 Loop acceleration bugs
          • 200 Floating point support in Z3 back end
          • 198 Type-inconsistent access to heap with symbolic offset
          • 129 CBMC out-of-memory
          • 54 Parameter counts differ
          • 48 Conflicting array sizes
          • 46 Conflicting types
          • 42 Conflicting struct types
          • 32 Conflicting return types (byte size)
  24. Questions
      • Software? Yes.
      • Guarantees? Sometimes.
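
The three-step key procedure on slide 8 (run the real compiler, compile again with goto-cc, attach the result as an extra ELF section) could be sketched as the command plan a wrapper script might produce. This is only an illustration under stated assumptions: the path `/usr/bin/gcc.real`, the `.goto` suffix, and the section name are hypothetical and not taken from the slides, and the sketch handles only command lines with an explicit `-o FILE`. It relies on goto-cc accepting GCC-style options (as slide 7 states) and on the `--add-section name=file` option of GNU objcopy.

```python
import shlex

# Assumptions for illustration only (not from the slides):
REAL_GCC = "/usr/bin/gcc.real"  # hypothetical location of the original compiler
GOTO_CC = "goto-cc"             # goto-cc from the CBMC distribution
IR_SECTION = "goto-cc"          # assumed name of the extra ELF section


def split_output(args):
    """Return (output path, remaining args) for a command line using '-o FILE'."""
    if "-o" not in args:
        raise ValueError("sketch only handles an explicit '-o FILE'")
    i = args.index("-o")
    if i + 1 >= len(args):
        raise ValueError("'-o' given without a file name")
    return args[i + 1], args[:i] + args[i + 2:]


def plan(args):
    """List the commands a gcc wrapper would run for one invocation."""
    out, rest = split_output(args)
    ir = out + ".goto"  # hypothetical temporary file holding the IR
    return [
        [REAL_GCC, *rest, "-o", out],                             # 1. real compiler
        [GOTO_CC, *rest, "-o", ir],                               # 2. goto-cc
        ["objcopy", "--add-section", f"{IR_SECTION}={ir}", out],  # 3. embed IR
    ]


for command in plan(["-c", "foo.c", "-o", "foo.o"]):
    print(shlex.join(command))
```

Because the IR travels inside the object file itself, later renaming or archiving of `foo.o` carries the IR along, which is the stability property the slide points out.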
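
Three of the per-package steps on slide 11 (dump-c, CBMC with generic assertions and unwinding bound 1, loop acceleration) might translate into command lines roughly as follows. The option names reflect my understanding of the `cbmc` and `goto-instrument` command-line tools and should be checked against the `--help` output of the installed version; the file names and the `analysis_commands` helper are hypothetical. The archive, re-compilation, and weak-memory steps are left out.

```python
import shlex

# Generic checks named on the slide, mapped to CBMC options as I understand them.
GENERIC_CHECKS = ["--pointer-check", "--signed-overflow-check", "--nan-check"]


def analysis_commands(goto_binary, use_z3=False):
    """Build (label, argv) pairs for three of the per-package analysis steps."""
    cbmc = ["cbmc", goto_binary, "--unwind", "1", *GENERIC_CHECKS]
    if use_z3:
        cbmc.append("--z3")  # otherwise CBMC uses its built-in SAT back end
    return [
        ("dump-c", ["goto-instrument", "--dump-c", goto_binary, goto_binary + ".c"]),
        ("cbmc", cbmc),
        ("accelerate", ["goto-instrument", "--accelerate", goto_binary,
                        goto_binary + ".accel"]),
    ]


for label, argv in analysis_commands("pkg.goto"):
    print(f"{label}: {shlex.join(argv)}")
```

An unwinding bound of 1 keeps each run cheap across thousands of packages, at the cost of exploring only the first iteration of every loop.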
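
Slide 15 names type checking at link time as the key feature behind the bug reports. A conventional C linker resolves symbols by name only, so two translation units can disagree about a function's signature without any diagnostic. The toy model below illustrates the idea and is not goto-cc's implementation: the data layout and the `link_conflicts` helper are hypothetical, and the category strings simply echo three of the causes listed on slide 23.

```python
from collections import defaultdict


def link_conflicts(units):
    """Compare declarations of the same symbol across translation units.

    `units` maps a unit name to {symbol: (return_type, (param_types, ...))}.
    Every later declaration is compared against the first one seen.
    """
    seen = defaultdict(list)
    for unit, symbols in units.items():
        for name, signature in symbols.items():
            seen[name].append((unit, signature))

    conflicts = []
    for name, decls in sorted(seen.items()):
        first_unit, (first_ret, first_params) = decls[0]
        for unit, (ret, params) in decls[1:]:
            if len(params) != len(first_params):
                kind = "parameter counts differ"
            elif ret != first_ret:
                kind = "conflicting return types"
            elif params != first_params:
                kind = "conflicting types"
            else:
                continue  # declarations agree
            conflicts.append((name, kind, first_unit, unit))
    return conflicts


example = {
    "a.c": {"f": ("int", ("int",)), "g": ("long", ())},
    "b.c": {"f": ("int", ("int", "int")), "g": ("int", ())},
}
for conflict in link_conflicts(example):
    print(conflict)
```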
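
The figures on slides 15 and 23 can be cross-checked with a little arithmetic. The sketch below uses only numbers from the slides and assumes, which the slides do not state, that each package with an error report is counted under exactly one cause.

```python
# Bug reports (slide 15)
reported, fixed = 844, 530
open_reports = reported - fixed  # matches the 314 quoted on slide 23

# Top causes among packages with error reports (slide 23)
packages_with_errors = 4915
top_causes = {
    "CBMC counterexamples": 1789,
    "Loop acceleration bugs": 1711,
    "Floating point support in Z3 back end": 200,
    "Type-inconsistent access to heap with symbolic offset": 198,
    "CBMC out-of-memory": 129,
    "Parameter counts differ": 54,
    "Conflicting array sizes": 48,
    "Conflicting types": 46,
    "Conflicting struct types": 42,
    "Conflicting return types (byte size)": 32,
}

listed = sum(top_causes.values())
share = listed / packages_with_errors

print(f"open bug reports: {open_reports}")
print(f"packages covered by listed causes: {listed} of {packages_with_errors}")
print(f"share: {share:.1%}")
```

Under that assumption, the ten listed causes account for most, but not all, of the packages with error reports.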
