DieHard: Probabilistic Memory Safety for Unsafe Languages

DieHard uses randomization and replication to transparently make C and C++ programs tolerate a wide range of errors, including buffer overflows and dangling pointers. Instead of crashing or running amok, DieHard lets programs continue to run correctly in the face of memory errors with high probability. Using DieHard also makes programs highly resistant to heap-based hacker attacks. Downloadable at www.diehard-software.org.

Published in: Technology, Education

  1. DieHard: Probabilistic Memory Safety for Unsafe Programming Languages
     Emery Berger, University of Massachusetts Amherst
     Ben Zorn, Microsoft Research
  2. Problems with Unsafe Languages
     - C, C++: pervasive apps, but langs. memory unsafe
     - Numerous opportunities for security vulnerabilities, errors
       - Double free
       - Invalid free
       - Uninitialized reads
       - Dangling pointers
       - Buffer overflows (stack & heap)
  3. Current Approaches
     - Unsound, may work or abort
       - Windows, GNU libc, etc., Rx [Zhou]
     - Unsound, will definitely continue
       - Failure oblivious [Rinard]
     - Sound, definitely aborts (fail-safe)
       - CCured [Necula], CRED [Ruwase & Lam], SAFECode [Dhurjati, Kowshik & Adve]
         - Requires C source, programmer intervention
         - 30% to 20X slowdowns
       - Good for debugging, less for deployment
  4. Probabilistic Memory Safety
     - Fully-randomized memory manager
       - Increases odds of benign memory errors
       - Ensures different heaps across users
     - Replication
       - Run multiple replicas simultaneously, vote on results
         - Detects crashing & non-crashing errors
     - Trades space for increased reliability
     DieHard: correct execution in face of errors with high probability
  5. Soundness for “Erroneous” Programs
     - Normally: memory errors ⇒ ? …
     - Consider infinite-heap allocator:
       - All `new`s fresh; ignore `delete`
         - No dangling pointers, invalid frees, double frees
       - Every object infinitely large
         - No buffer overflows, data overwrites
     - Transparent to correct program
     - “Erroneous” programs sound
  6. Approximating Infinite Heaps
     - Infinite ⇒ M-heaps: probabilistic soundness
     - Pad allocations & defer deallocations
       - Simple
       - No protection from larger overflows
         - pad = 8 bytes, overflow = 9 bytes…
       - Deterministic: overflow crashes everyone
     - Better: randomize heap
       - Probabilistic protection against errors
         - Independent across heaps
       - Efficient implementation…
  7. Implementation Choices
     - Conventional, freelist-based heaps
       - Hard to randomize, protect from errors
         - Double frees, heap corruption
     - What about bitmaps? [Wilson90]
       - Catastrophic fragmentation
         - Each small object likely to occupy one page
     [Diagram: small objects scattered across separate pages]
  8. Randomized Heap Layout
     - Bitmap-based, segregated size classes
       - Bit represents one object of given size
         - i.e., one bit = 2^(i+3) bytes, etc.
       - Prevents fragmentation
     [Diagram: metadata bitmaps (00000001, 1010, 10) above heap regions with object sizes 2^(i+3), 2^(i+4), 2^(i+5)]
  9-10. Randomized Allocation
     - malloc(8):
       - compute size class = ceil(log2(sz)) - 3
       - randomly probe bitmap for zero-bit (free)
     - Fast: runtime O(1)
       - M = 2 ⇒ E[# of probes] ≤ 2
     [Diagram: the first bitmap changes from 00000001 to 00010001 once the probed free bit is set; object sizes 2^(i+3), 2^(i+4), 2^(i+5)]
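The allocation steps on the slide above can be sketched as a small simulation. This is a minimal Python sketch under stated assumptions, not DieHard's actual code: the names `size_class` and `RandomizedBitmapHeap`, the number of slots per size class, and the `(class, slot)` pair standing in for an address are all hypothetical. It follows the slide in computing the size class as ceil(log2(sz)) - 3 and in probing random bits until a zero bit is found, and it keeps each class at most 1/M full so that each probe succeeds with probability at least 1 - 1/M.

```python
import random


def size_class(sz: int) -> int:
    """Size class from the slide: ceil(log2(sz)) - 3, so 8 bytes -> class 0."""
    # (sz - 1).bit_length() equals ceil(log2(sz)) for sz >= 1, using integers only.
    return max(0, (max(sz, 1) - 1).bit_length() - 3)


class RandomizedBitmapHeap:
    """Illustrative model: one bitmap per size class, one bit per object slot."""

    def __init__(self, slots_per_class=64, num_classes=4, M=2, seed=None):
        self.M = M  # heap expansion factor: each class stays at most 1/M full
        self.rng = random.Random(seed)
        self.bitmaps = [[0] * slots_per_class for _ in range(num_classes)]
        self.in_use = [0] * num_classes

    def malloc(self, sz):
        """Return a (class, slot) pair, or None if the request cannot be met."""
        c = size_class(sz)
        if c >= len(self.bitmaps):
            return None  # larger than any size class in this toy model
        bitmap = self.bitmaps[c]
        if (self.in_use[c] + 1) * self.M > len(bitmap):
            return None  # would exceed the 1/M occupancy bound
        while True:
            # Each probe hits a free bit with probability >= 1 - 1/M,
            # so the expected number of probes is at most M / (M - 1).
            slot = self.rng.randrange(len(bitmap))
            if bitmap[slot] == 0:
                bitmap[slot] = 1
                self.in_use[c] += 1
                return (c, slot)
```

With M = 2 the bound M / (M - 1) works out to 2 expected probes, which matches the figure quoted on the slide.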
  11-13. Randomized Deallocation
     - free(ptr):
       - Ensure object valid: aligned to right address
       - Ensure allocated: bit set
       - Resets bit
     - Prevents invalid frees, double frees
     [Diagram: the first bitmap changes from 00010001 to 00000001 once the freed object's bit is reset; object sizes 2^(i+3), 2^(i+4), 2^(i+5)]
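The deallocation checks listed above can be illustrated with a short sketch. It is a simplified Python model rather than DieHard's code: `BitmapRegion`, `checked_free`, the integer addresses, and the choice to silently ignore a request that fails a check are assumptions made for illustration.

```python
class BitmapRegion:
    """One size class: `slots` objects of `obj_size` bytes starting at `base`."""

    def __init__(self, base, obj_size, slots):
        self.base = base
        self.obj_size = obj_size
        self.bitmap = [0] * slots  # 1 = allocated, 0 = free

    def contains(self, addr):
        return self.base <= addr < self.base + self.obj_size * len(self.bitmap)


def checked_free(regions, addr):
    """Return True if the object was freed, False if the request was ignored."""
    for region in regions:
        if not region.contains(addr):
            continue
        offset = addr - region.base
        if offset % region.obj_size != 0:
            return False  # invalid free: not the start of an object
        slot = offset // region.obj_size
        if region.bitmap[slot] == 0:
            return False  # double free, or the slot was never allocated
        region.bitmap[slot] = 0  # reset the bit: the slot is free again
        return True
    return False  # invalid free: address lies outside every region
```

Because the bitmap lives apart from the objects, a second `checked_free` on the same address finds the bit already cleared and does nothing, which is the sense in which double frees are prevented here.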
  14. Randomized Heaps & Reliability
     - Objects randomly spread across heap
     - Different run = different heap
       - Errors across heaps independent
     [Diagram: objects 1-6 occupy different random slots of the 2^(i+3) and 2^(i+4) size classes in two runs. My Mozilla: “malignant” overflow. Your Mozilla: “benign” overflow.]
  15. DieHard software architecture
     - “Output equivalent”: kill failed replicas
     - Each replica has different allocator
     [Diagram: input is broadcast to replica 1 (seed 1), replica 2 (seed 2), and replica 3 (seed 3); the replicas execute, and a vote on their results produces the output]
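The voting step in this architecture can be sketched as follows. This is a hypothetical Python illustration, not the DieHard voter: the function name `vote`, the representation of a crashed replica as `None`, and the rule that an output is accepted once at least two live replicas agree on it are assumptions.

```python
from collections import Counter


def vote(outputs):
    """Pick the output that at least two live replicas agree on.

    `outputs` maps a replica id to its output, or to None if it crashed.
    Returns (winning_output, ids_of_replicas_to_kill); the winner is None
    when no two live replicas agree.
    """
    live = {rid: out for rid, out in outputs.items() if out is not None}
    counts = Counter(live.values())
    if not counts:
        return None, sorted(outputs)
    winner, agreeing = counts.most_common(1)[0]
    if agreeing < 2:
        return None, sorted(outputs)
    # Replicas that crashed or disagreed with the winner are treated as failed.
    losers = sorted(rid for rid, out in outputs.items() if out != winner)
    return winner, losers
```

For example, `vote({1: b"ok", 2: b"ok", 3: b"garbage"})` accepts `b"ok"` and marks replica 3 for termination, so a non-crashing error that corrupts one replica's output is caught as well as a crash.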
  16. Results
     - Analytical results (pictures!)
       - Buffer overflows
       - Dangling pointer errors
       - Uninitialized reads
     - Empirical results
       - Runtime overhead
       - Error avoidance
         - Injected faults & actual applications
  17-19. Analytical Results: Buffer Overflows
     - Model overflow as write of live data
       - Heap half full (max occupancy)
  20-22. Analytical Results: Buffer Overflows
     - Replicas: Increase odds of avoiding overflow in at least one replica
     - P(Overflow in all replicas) = (1/2)^3 = 1/8
     - P(No overflow in ≥ 1 replica) = 1 - (1/2)^3 = 7/8
  23. Analytical Results: Buffer Overflows
     - F = free space
     - H = heap size
     - N = # objects worth of overflow
     - k = replicas
     - Overflow one object
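The formula that accompanied these symbols is not preserved in the transcript. As a hedged reconstruction, the sketch below generalizes the three-replica arithmetic from the preceding slides, assuming that each of the N overflowed object slots lands in free space independently with probability F/H and that the k replicas fail independently. The function name `p_mask_overflow` is hypothetical, and the expression should be read as an illustration of that reasoning, not as the authors' exact statement.

```python
from fractions import Fraction


def p_mask_overflow(F, H, N, k):
    """P(at least one of k replicas has all N overflowed slots land in free space).

    F: free space, H: heap size (same units), N: objects' worth of overflow,
    k: number of replicas. Independence of slots and replicas is assumed.
    """
    p_one_replica = Fraction(F, H) ** N  # one replica: every overwritten slot is free
    return 1 - (1 - p_one_replica) ** k  # 1 - P(overflow hits live data in all k)


# Half-full heap, one object's worth of overflow, three replicas.
print(p_mask_overflow(1, 2, 1, 3))  # 7/8
```

With F/H = 1/2, N = 1, and k = 3 this reproduces the 7/8 from the replica slides. Larger overflows (bigger N) lower the per-replica odds, and more replicas (bigger k) raise the combined odds.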
  24-25. Empirical Results: Runtime
     [Charts: runtime results; not preserved in the transcript]
  26. Empirical Results: Error Avoidance
     - Injected faults:
       - Dangling pointers (@ 50%, 10 allocations)
         - glibc: crashes; DieHard: 9/10 correct
       - Overflows (@ 1%, 4 bytes over)
         - glibc: crashes 9/10, inf loop; DieHard: 10/10 correct
     - Real faults:
       - Avoids Squid web cache overflow
         - Crashes BDW & glibc
       - Avoids dangling pointer error in Mozilla
         - DoS in glibc & Windows
  27. Conclusion
     - Randomization + replicas = probabilistic memory safety
       - Improves over today (0%)
       - Useful point between absolute soundness (fail-stop) and unsound
     - Trades hardware resources (RAM, CPU) for reliability
       - Hardware trends
         - Larger memories, multi-core CPUs
       - Follows in footsteps of ECC memory, RAID
  28. DieHard software
     - http://www.cs.umass.edu/~emery/diehard
     - Linux, Solaris (stand-alone & replicated)
     - Windows (stand-alone only)
