
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware

Over the last few years, as the world has moved closer to realizing the idea of the Internet of Things, an increasing number of the analog things with which we used to interact every day have been replaced with connected devices. The increasingly complex systems that drive these devices have one thing in common: they must all communicate to carry out their intended functionality. Such communication is handled by firmware embedded in the device. And firmware, like any piece of software, is susceptible to a wide range of errors and vulnerabilities.

Published in: Software

  1. 1. • Co-founder and Chief Scientist at Lastline, Inc. – Lastline offers protection against zero-day threats and advanced malware • Professor in Computer Science at UC Santa Barbara – many systems security papers in academic conferences • Part of Shellphish Who are we?
  2. 2. • PhD Student at UC Santa Barbara – research focused primarily on binary security and embedded devices • Part of Shellphish – team leader of Shellphish's effort in the DARPA Cyber Grand Challenge • Doesn't like peanut butter Who are we?
  3. 3. - firmware - binary analysis - angr What are we talking about?
  4. 4. The “Internet of Things”
  5. 5. Embedded software is everywhere
  6. 6. • Embedded Linux and user-space programs • Custom OS and custom programs combined together in a binary blob – typically, the binary is all that you get – and, sometimes, it is not easy to get this off the device What is on embedded devices?
  7. 7. Binary analysis noun | bi·na·ry anal·y·sis | ˈbī-nə-rē ə-ˈna-lə-səs 1. The process of automatically deriving properties about the behavior of binary programs 2. Including static binary analysis and dynamic binary analysis Binary Analysis
  8. 8. • Program verification • Program testing • Vulnerability excavation • Vulnerability signature generation • Reverse engineering • Exploit generation Goals of Binary Analysis
  9. 9. – reason over multiple (all) execution paths – can achieve excellent coverage – precision versus scalability trade-off • very precise analysis can be slow and not scalable • too much approximation leads to wrong results (false positives) – often works on abstract program model • for example, binary code is lifted to an intermediate representation Static Binary Analysis
  10. 10. – examine individual program paths – very precise – coverage is (very) limited – sometimes hard to properly run program • hard to attach debugger to embedded system • when code is extracted and emulated, what happens with calls to peripherals? Dynamic Binary Analysis
  11. 11. • Get the binary code • Binaries lack significant information present in source • Often no clear library or operating system abstractions o where to start the analysis from? o hard to handle environment interactions Challenges of Static Binary Analysis
  12. 12. From Source to Binary Code compile link strip
  13. 13. From Source to Binary Code compile link strip type info function names variable names jump targets
  14. 14. • (Linux) system call interface is great – you know what the I/O routines are • important to understand what user can influence – you have typed parameters and return values – lets the analysis focus on (much smaller) main program • OS is not there or embedded in binary blob – heuristics to find I/O routines – open challenge to find mostly independent components Missing OS and Library Abstractions
  15. 15. • Library functions are great – you know what they do and can write a “function summary” – you have typed parameters and return values – lets the analysis focus on (much smaller) main program • Library functions are embedded (like static linking) – need heuristics to rediscover library functions – IDA FLIRT (Fast Library Identification and Recognition Technology) – more robustness based on looking for control flow similarity Missing OS and Library Abstractions
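The idea of rediscovering statically linked library functions can be illustrated with a minimal byte-signature matcher. This is only a sketch: it is not FLIRT's actual signature format, and the signature names and byte patterns below are invented for the example. Wildcard positions (`None`) stand for bytes that vary between builds, such as relocated addresses.

```python
from typing import Dict, List, Optional, Sequence

# A signature is a sequence of byte values; None marks a wildcard byte
# (for example, part of an address that differs between binaries).
Signature = Sequence[Optional[int]]

# Hypothetical signatures, invented for this example only.
SIGNATURES: Dict[str, Signature] = {
    "hypothetical_strcmp": [0x55, 0x89, 0xE5, None, None, 0x8A, 0x01],
    "hypothetical_memcpy": [0x55, 0x89, 0xE5, 0x57, 0x56, None, 0xF3, 0xA4],
}


def matches_at(blob: bytes, offset: int, sig: Signature) -> bool:
    """Return True if sig matches blob starting at offset."""
    if offset + len(sig) > len(blob):
        return False
    return all(want is None or blob[offset + i] == want
               for i, want in enumerate(sig))


def find_library_functions(blob: bytes) -> Dict[str, List[int]]:
    """Scan the whole blob and list every offset where a signature matches."""
    hits: Dict[str, List[int]] = {name: [] for name in SIGNATURES}
    for offset in range(len(blob)):
        for name, sig in SIGNATURES.items():
            if matches_at(blob, offset, sig):
                hits[name].append(offset)
    return hits


# A made-up firmware fragment containing both patterns.
blob = bytes([0x90, 0x90,
              0x55, 0x89, 0xE5, 0x12, 0x34, 0x8A, 0x01,
              0xC3,
              0x55, 0x89, 0xE5, 0x57, 0x56, 0xAA, 0xF3, 0xA4])
hits = find_library_functions(blob)
```

Exact byte matching breaks as soon as the compiler or optimization level changes, which is why the slide points to control-flow similarity as the more robust option.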
  16. 16. • Memory safety vulnerabilities – buffer overrun – out of bounds reads (heartbleed) – write-what-where • Authentication bypass (backdoors) • Actuator control! Types of Vulnerabilities
  17. 17. Linux embedded device: HTTP server for management and video monitoring, with a known backdoor. Backdoor!!! ➔ Username: 3sadmin ➔ Password: 27988303 Heffner, Craig. "Finding and Reversing Backdoors in Consumer Firmware." EELive! (2014). Motivating Example
  18. 18. Authentication Bypass Prompt Authentication Success Failure
  19. 19. Authentication Bypass Prompt Authentication Success Failure Backdoor e.g. strcmp()
  20. 20. Authentication Bypass Prompt Authentication Success Failure Backdoor e.g. strcmp() Hard to find.
  21. 21. Authentication Bypass Prompt Success Missing!
  22. 22. Modeling Authentication Bypass Prompt Authentication Success Failure Backdoor e.g. strcmp() Easier to find! Hard to find.
  23. 23. Input Determinism Prompt Authentication Success Failure Backdoor e.g. strcmp() Can we determine the input needed to reach the success function, just by analyzing the code? The answer is NO
  24. 24. Input Determinism Prompt Authentication Success Failure Backdoor e.g. strcmp() Can we determine the input needed to reach the success function, just by analyzing the code? The answer is YES
  25. 25. Modeling Authentication Bypass Prompt Authentication Success Failure Backdoor e.g. strcmp() Easier to find! But how?
  26. 26. • Without OS/ABI information: • With ABI information: Finding “Authenticated Point” EXEC()
  27. 27. Using Binary Analysis to Hunt for Vulnerabilities Program Symbolic Execution Security policies Security Policy Checker POCs Static Analysis
  28. 28. angr: A Binary Analysis Framework Static Analysis Routines Symbolic Execution Engine Binary Loader angr
  29. 29. angr: A Binary Analysis Framework Static Analysis Routines Symbolic Execution Engine Binary Loader angr
  30. 30. "How do I trigger path X or condition Y?" Symbolic Execution
  31. 31. Input Determinism Prompt Authentication Success Failure Backdoor e.g. strcmp() Can we determine the input needed to reach the success function, just by analyzing the code?
  32. 32. "How do I trigger path X or condition Y?" - Dynamic analysis - Input A? No. Input B? No. Input C? … - Based on concrete inputs to application. - (Concrete) static analysis - "You can't"/"You might be able to" - Based on various static techniques. We need something slightly different. Symbolic Execution
  33. 33. "How do I trigger path X or condition Y?" 1. Interpret the application. 2. Track "constraints" on variables. 3. When the required condition is triggered, "concretize" to obtain a possible input. Symbolic Execution
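The three steps above can be sketched on a toy program. This is not angr's engine: the "program" is a hypothetical table of byte comparisons invented for the example (a password check with a hidden second path), constraints are limited to "byte equals / differs from a constant", and concretization is a trivial per-byte search rather than an SMT query.

```python
from typing import Dict, List, Optional, Tuple

# Toy program: each node compares one input byte with a constant and branches.
# node -> (input index, constant, next node if equal, next node if different)
PROGRAM: Dict[str, Tuple[int, int, str, str]] = {
    "start":    (0, ord("a"), "pw1", "backdoor"),
    "pw1":      (1, ord("b"), "SUCCESS", "FAIL"),
    "backdoor": (0, ord("3"), "bd1", "FAIL"),
    "bd1":      (1, ord("s"), "SUCCESS", "FAIL"),
}

Constraint = Tuple[int, bool, int]  # (input index, must_equal, constant)


def run(data: bytes, start: str = "start") -> str:
    """Concrete execution: follow one path for one specific input."""
    node = start
    while node in PROGRAM:
        index, const, if_equal, if_differs = PROGRAM[node]
        node = if_equal if data[index] == const else if_differs
    return node


def concretize(constraints: List[Constraint], length: int) -> Optional[bytes]:
    """Step 3: pick concrete bytes satisfying every constraint, if possible."""
    out = []
    for i in range(length):
        equal = {c for idx, eq, c in constraints if idx == i and eq}
        differ = {c for idx, eq, c in constraints if idx == i and not eq}
        if len(equal) > 1 or equal & differ:
            return None  # contradictory path, no input can follow it
        candidates = sorted(equal) or [b for b in range(256) if b not in differ]
        if not candidates:
            return None
        out.append(candidates[0])
    return bytes(out)


def find_input(start: str, target: str, length: int) -> Optional[bytes]:
    """Steps 1 and 2: interpret every path, tracking constraints per path."""
    worklist: List[Tuple[str, List[Constraint]]] = [(start, [])]
    while worklist:
        node, constraints = worklist.pop()
        if node == target:
            result = concretize(constraints, length)
            if result is not None:
                return result
            continue
        if node not in PROGRAM:
            continue  # some other terminal node
        index, const, if_equal, if_differs = PROGRAM[node]
        worklist.append((if_equal, constraints + [(index, True, const)]))
        worklist.append((if_differs, constraints + [(index, False, const)]))
    return None


found = find_input("start", "SUCCESS", 2)
```

Feeding `found` back into `run` reaches `SUCCESS`, which is the sense in which the result is a directly usable input. The toy program is loop-free; real binaries are not, which is where path explosion comes from.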
  34. 34. Constraint solving: ❏ Conversion from set of constraints to set of concrete values that satisfy them. ❏ NP-complete, in general. Constraints x >= 10 x < 100 x = 42 Symbolic Execution
  35. 35. Pros - Precise - No false positives (with correct environment model) - Produces directly actionable inputs Cons - Not scalable - constraint solving is NP-complete - path explosion Symbolic Execution - Pros and Cons
  36. 36. Our Case
  37. 37. Worst-Case
  38. 38. Worst-Case
  39. 39. angr: A Binary Analysis Framework Static Analysis Routines Symbolic Execution Engine Binary Loader angr
  40. 40. angr: A Binary Analysis Framework Control-Flow Graph Data-Flow Analysis Value-Set Analysis Static Analysis Routines Symbolic Execution Engine Binary Loader angr
  41. 41. angr: A Binary Analysis Framework Control-Flow Graph Data-Flow Analysis Value-Set Analysis Static Analysis Routines Symbolic Execution Engine Binary Loader angr
  42. 42. angr: A Binary Analysis Framework Control-Flow Graph Data-Flow Analysis Value-Set Analysis Static Analysis Routines Symbolic Execution Engine Binary Loader angr
  43. 43. angr: A Binary Analysis Framework Control-Flow Graph Data-Flow Analysis Value-Set Analysis Static Analysis Routines Symbolic Execution Engine Binary Loader angr
  44. 44. Example mov rax, 0x400000; mov rbx, 0; cmp rbx, 0x1024; ja _OUT; cmp [rax+rbx], 1337; je _OUT; add rbx, 4. What is rbx in the yellow square? Symbolic execution: state explosion. Naive static analysis: "anything". Range analysis: "< 0x1024". Can we do better?
  45. 45. Memory access checks Type inference Variable recovery Range recovery Wrapped-interval analysis Value-set analysis Abstract interpretation Value Set Analysis
  46. 46. Value Set Analysis - Strided Intervals 4[0x100, 0x120],32 (stride 4, low 0x100, high 0x120, size 32 bits) represents the values 0x100, 0x104, 0x108, 0x10c, 0x110, 0x114, 0x118, 0x11c, 0x120
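A strided interval written `stride[low, high],size` can be modeled in a few lines. The class below is a minimal sketch for illustration, not angr's or claripy's implementation; it assumes non-wrapping unsigned values and allows stride 0 only for a single constant.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class StridedInterval:
    stride: int  # distance between consecutive values
    low: int     # smallest value in the set
    high: int    # largest value in the set
    bits: int    # width of the value in bits

    def __post_init__(self) -> None:
        if not 0 <= self.low <= self.high < (1 << self.bits):
            raise ValueError("bounds must satisfy 0 <= low <= high < 2**bits")
        if self.stride < 0:
            raise ValueError("stride must be non-negative")
        if self.stride == 0 and self.low != self.high:
            raise ValueError("stride 0 is only valid for a single value")
        if self.stride > 0 and (self.high - self.low) % self.stride != 0:
            raise ValueError("high must be reachable from low in stride steps")

    def values(self) -> Iterator[int]:
        """Enumerate every concrete value the interval represents."""
        if self.stride == 0:
            yield self.low
        else:
            yield from range(self.low, self.high + 1, self.stride)

    def __contains__(self, value: int) -> bool:
        if not self.low <= value <= self.high:
            return False
        if self.stride == 0:
            return value == self.low
        return (value - self.low) % self.stride == 0

    def __str__(self) -> str:
        return f"{self.stride}[{self.low:#x}, {self.high:#x}],{self.bits}"


si = StridedInterval(4, 0x100, 0x120, 32)
members = list(si.values())
```

Here `members` holds the nine addresses from 0x100 to 0x120 in steps of 4, so a membership test such as `0x10c in si` needs only a range check and one modulo.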
  47. 47. Example mov rax, 0x400000; mov rbx, 0; cmp rbx, 0x1024; ja _OUT; cmp [rax+rbx], 1337; je _OUT; add rbx, 4. What is rbx in the yellow square? 1. 1[0x0, 0x0],64 2. 4[0x0, 0x4],64 3. 4[0x0, 0x8],64 4. 4[0x0, 0xc],64 5. 4[0x0, ∞],64 (widen) 6. 4[0x0, 0x1024],64 (narrow)
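The widen/narrow sequence can be reproduced with a small fixpoint iteration over `(stride, low, high)` triples. This is a sketch under several assumptions: the transcript does not show the loop's back edge or where exactly the "yellow square" sits, so the code assumes `add rbx, 4` jumps back to the `cmp`, tracks `rbx` at the loop head, and separately reports the value at the guarded memory access. Constants use stride 0 internally (the slide prints them with stride 1), only the upper bound is widened, and the bit width is ignored.

```python
import math
from itertools import count
from math import gcd
from typing import List, Tuple, Union

Bound = Union[int, float]      # float is used only for math.inf
SI = Tuple[int, int, Bound]    # (stride, low, high); stride 0 = constant

INIT: SI = (0, 0, 0)           # mov rbx, 0
LIMIT = 0x1024                 # cmp rbx, 0x1024 / ja _OUT
STEP = 4                       # add rbx, 4


def join(a: SI, b: SI) -> SI:
    """Smallest strided interval covering both operands."""
    stride = gcd(gcd(a[0], b[0]), abs(a[1] - b[1]))
    return (stride, min(a[1], b[1]), max(a[2], b[2]))


def guard_le(a: SI, limit: int) -> SI:
    """Keep only values <= limit (the fall-through side of `ja _OUT`)."""
    stride, low, high = a
    if low > limit:
        raise ValueError("no value passes the guard")
    if high <= limit:
        return a
    # stride > 0 here, because a constant has low == high
    return (stride, low, low + (limit - low) // stride * stride)


def add_const(a: SI, k: int) -> SI:
    return (a[0], a[1] + k, a[2] + k)


def widen(old: SI, new: SI) -> SI:
    """Send a growing upper bound to infinity so the iteration terminates."""
    high = old[2] if new[2] <= old[2] else math.inf
    return (gcd(old[0], new[0]), min(old[1], new[1]), high)


def loop_head_step(head: SI) -> SI:
    """One trip around the loop: guard, add 4, merge with the entry value."""
    return join(INIT, add_const(guard_le(head, LIMIT), STEP))


def analyze(widen_after: int = 3) -> Tuple[SI, SI, List[SI]]:
    head = INIT
    history = [head]
    for i in count():
        nxt = loop_head_step(head)
        if i >= widen_after:
            nxt = widen(head, nxt)
        if nxt == head:        # fixpoint reached
            break
        head = nxt
        history.append(head)
    head = loop_head_step(head)  # one narrowing pass re-applies the guard
    history.append(head)
    return head, guard_le(head, LIMIT), history


head, at_access, history = analyze()
```

With these assumptions the loop-head history starts at the constant 0, grows by one stride-4 step per iteration, jumps to an infinite upper bound when widening kicks in, and the narrowing pass pulls the bound back using the `ja` guard. `at_access` is the value seen by `cmp [rax+rbx], 1337`.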
  48. 48. angr: A Binary Analysis Framework Control-Flow Graph Data-Flow Analysis Value-Set Analysis Static Analysis Routines Symbolic Execution Engine Binary Loader angr
  49. 49. CB vulnerable program RB patched program POV exploit Cyber Reasoning System The Cyber Grand Challenge!
  50. 50. The Shellphish CRS CB Proposed RBs Autonomous vulnerability scanning Autonomous service resiliency PCAP Test cases POV RB Autonomous processing Autonomous patching Proposed POVs
  51. 51. The Shellphish CRS CB Proposed RBs Autonomous vulnerability scanning Autonomous service resiliency PCAP Test cases POV RB Autonomous processing Autonomous patching Proposed POVs
  52. 52. - ipython-accessible - powerful analyses - versatile - well-encapsulated - open and expandable - architecture "independent" Angr
  53. 53. Angr Mini-howto
      # ipython
      In [1]: import angr, networkx
      In [2]: binary = angr.Project("/some/binary")
      In [3]: cfg = binary.analyses.CFG()
      In [4]: networkx.draw(cfg.graph)
      In [5]: explorer = binary.factory.path_group()
      In [6]: explorer.explore(find=0xc001b000)
  54. 54. ➔ http://angr.io ➔ https://github.com/angr ➔ angr@lists.cs.ucsb.edu Pull requests, issues, questions, etc super-welcome! Let's bring on the next generation of binary analysis! Angr - Open source! Birthday: September 2013 Total line numbers: 59950 Total commits: ALMOST 9000!! (actually ~6000)
  55. 55. Value Set Analysis - Strided Intervals: 4[0x100, 0x110],32 + 1 = 4[0x101, 0x111],32; 4[0x100, 0x110],32 >> 1 = 2[0x80, 0x88],31; 2[0x80, 0x88],31 << 1 = 1[0x100, 0x110],32; 4[0x100, 0x110],32 ⋃ 4[0x102, 0x112],32 = 2[0x100, 0x112],32; 4[0x100, 0x110],32 ⋂ 3[0x100, 0x110],32 = 12[0x100, 0x10c],32; WIDEN (4[0x100, 0x110],32 ⋃ 4[0x100, 0x112],32) = 4[0x100, ∞],32
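Some of these operations are easy to check mechanically. The sketch below covers adding a constant, union, and intersection on `(stride, low, high)` triples, with the bit width omitted and no wraparound. It is an illustration rather than any library's implementation: the intersection is computed by enumerating concrete values and re-abstracting them, which is only practical for small intervals.

```python
from functools import reduce
from math import gcd
from typing import Iterable, Tuple

SI = Tuple[int, int, int]  # (stride, low, high); bit width omitted


def values(si: SI) -> range:
    """Concrete values represented by a strided interval."""
    stride, low, high = si
    return range(low, high + 1, stride) if stride else range(low, low + 1)


def abstract(vals: Iterable[int]) -> SI:
    """Tightest strided interval covering a non-empty set of values."""
    ordered = sorted(set(vals))
    if not ordered:
        raise ValueError("the empty set has no strided interval")
    stride = reduce(gcd, (v - ordered[0] for v in ordered), 0)
    return (stride, ordered[0], ordered[-1])


def add_const(si: SI, k: int) -> SI:
    """Adding a constant shifts both bounds and keeps the stride."""
    return (si[0], si[1] + k, si[2] + k)


def union(a: SI, b: SI) -> SI:
    """Over-approximating union: the stride must divide both strides and
    the distance between the two lower bounds."""
    stride = gcd(gcd(a[0], b[0]), abs(a[1] - b[1]))
    return (stride, min(a[1], b[1]), max(a[2], b[2]))


def intersection(a: SI, b: SI) -> SI:
    """Exact intersection by enumeration, then re-abstraction."""
    return abstract(set(values(a)) & set(values(b)))


shifted = add_const((4, 0x100, 0x110), 1)
merged = union((4, 0x100, 0x110), (4, 0x102, 0x112))
common = intersection((4, 0x100, 0x110), (3, 0x100, 0x110))
```

The union is an over-approximation: `merged` also contains values such as 0x106 that belong to neither operand, which is the precision cost paid for a compact representation.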
