Yandex may 2013 a san-tsan_msan


Published on

Научный семинар 30 мая: Константин Серебряный

Published in: Technology
1 Like
  • Be the first to comment

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Yandex may 2013 a san-tsan_msan

  1. 1. AddressSanitizer и ThreadSanitizer:Поиск ошибок и уязвимостейв браузере Chrome и серверных приложенияхКостя Серебряный, Google30 Мая 2013
  2. 2. • AddressSanitizer (aka ASan)o detects use-after-free and buffer overflows (C++)• ThreadSanitizer (aka TSan)o detects data races (C++ & Go)• MemorySanitizer (aka MSan)o detects uninitialized memory reads (C++)• Similar tools, find different kinds of bugsAgenda
  3. 3. AddressSanitizeraddressability bugs
  4. 4. AddressSanitizer overview• Findso buffer overflows (stack, heap, globals)o use-after-freeo some more• Compiler module (LLVM, GCC)• Run-time library
  5. 5. int global_array[100] = {-1};int main(int argc, char **argv) {return global_array[argc + 100]; // BOOM}% clang++ -O1 -fsanitize=address ; ./a.out==10538== ERROR: AddressSanitizer global-buffer-overflowREAD of size 4 at 0x000000415354 thread T0#0 0x402481 in main 0x7f0a1c295c4d in __libc_start_main ??:0#2 0x402379 in _start ??:00x000000415354 is located 4 bytes to the right of globalvariable global_array (0x4151c0) of size 400ASan report example: global-buffer-overflow
  6. 6. int main(int argc, char **argv) {int stack_array[100];stack_array[1] = 0;return stack_array[argc + 100]; // BOOM}% clang++ -O1 -fsanitize=address; ./a.out==10589== ERROR: AddressSanitizer stack-buffer-overflowREAD of size 4 at 0x7f5620d981b4 thread T0#0 0x4024e8 in main 0x7f5620d981b4 is located at offset 436 in frame<main> of T0s stack:This frame has 1 object(s):[32, 432) stack_arrayASan report example: stack-buffer-overflow
  7. 7. int main(int argc, char **argv) {int *array = new int[100];int res = array[argc + 100]; // BOOMdelete [] array;return res;}% clang++ -O1 -fsanitize=address; ./a.out==10565== ERROR: AddressSanitizer heap-buffer-overflowREAD of size 4 at 0x7fe4b0c76214 thread T0#0 0x40246f in main is located 4 bytes to the right of 400-byteregion [0x7fe..., 0x7fe...)allocated by thread T0 here:#0 0x402c36 in operator new[](unsigned long)#1 0x402422 in main report example: heap-buffer-overflow
  8. 8. ASan report example: use-after-freeint main(int argc, char **argv) {int *array = new int[100];delete [] array;return array[argc]; // BOOM}% clang++ -O1 -fsanitize=address && ./a.out==30226== ERROR: AddressSanitizer heap-use-after-freeREAD of size 4 at 0x7faa07fce084 thread T0#0 0x40433c in main is located 4 bytes inside of 400-byteregionfreed by thread T0 here:#0 0x4058fd in operator delete[](void*) _asan_rtl_#1 0x404303 in main allocated by thread T0 here:#0 0x405579 in operator new[](unsigned long) _asan_rtl_
  9. 9. Any aligned 8 bytes may have 9 states:N good bytes and 8 - N bad (0<=N<=8)07654321-1Good byteBad byteShadow valueASan shadow byte
  10. 10. ASan virtual address space (-pie)0xffffffff0x200000000x1fffffff0x040000000x03ffffff0x00000000ApplicationShadowmprotect-edShadow = (Addr >> 3)
  11. 11. ASan virtual address space (no -pie)0xffffffff0x400000000x3fffffff0x280000000x27ffffff0x240000000x23ffffff0x200000000x1fffffff0x00000000ApplicationShadowmprotect-edShadow = (Addr >> 3)+ 0x20000000
  12. 12. void foo() {char a[328];<------------- CODE ------------->Instrumenting stack
  13. 13. void foo() {char rz1[32]; // 32-byte alignedchar a[328];char rz2[24];char rz3[32];int *shadow = (&rz1 >> 3) + kOffset;shadow[0] = 0xffffffff; // poison rz1shadow[11] = 0xffffff00; // poison rz2shadow[12] = 0xffffffff; // poison rz3<------------- CODE ------------->shadow[0] = shadow[11] = shadow[12] = 0;}Instrumenting stack
  14. 14. Instrumenting globalsint a;struct {int original;char redzone[60];} a; // 32-aligned
  15. 15. Run-time library• Initializes shadow memory at startup• Provides full malloc replacemento Insert poisoned redzones around allocated memoryo Quarantine for free-ed memoryo Collect stack traces for every malloc/free• Provides interceptors for functions like memset• Prints error messages
  16. 16. ASan instrumentation: 8-byte accesschar *shadow = addr >> 3;if (*shadow)ReportError(a);*a = ...*a = ...
  17. 17. ASan instrumentation: N-byte access (1, 2, 4)char *shadow = addr >> 3;if (*shadow &&*shadow <= ((a&7)+N-1))ReportError(a);*a = ...*a = ...
  18. 18. Instrumentation example (x86_64)mov %rdi,%rax # address is in %rdishr $0x3,%rax # shift by 3cmpb $0x0,0x7fff8000(%rax) # shadow ? 0je 1f <foo+0x1f>callq __asan_report_store8 # Report errormovq $0x1234,(%rdi) # original store
  19. 19. • 2x slowdown (Valgrind: 20x and more)• 1.5x-3x memory overhead• 1000+ bugs found in Chrome• 1000+ bugs found in Google server apps• 1000+ bugs everywhere elseo Firefox, FreeType, FFmpeg, WebRTC, libjpeg-turbo,Perl, Vim, LLVM, GCC, MySQLASan marketing slide
  20. 20. ASan and Chrome• Chrome was the first ASan user (May 2011)• All existing tests are running with ASano Linux, Mac, ChromeOSo Tree is closed on failure• Fuzzing on massive scale (ClusterFuzz)o Generate test cases, minimize, de-duplicateo Find regression ranges, verify fixeso Thousands CPUs running 24/7• Over 1000 bugs found in 2 yearso Most may have security implicationso External researchers found 100+ bugs
  21. 21. ThreadSanitizerdata races
  22. 22. ThreadSanitizer• Detects data races• Compile-time instrumentation (LLVM, GCC)• Run-time library
  23. 23. TSan report example: data racevoid Thread1() { Global = 42; }int main() {pthread_create(&t, 0, Thread1, 0);Global = 43;...% clang -fsanitize=thread -g a.c -fPIE -pie &&./a.outWARNING: ThreadSanitizer: data race (pid=20373)Write of size 4 at 0x7f... by thread 1:#0 Thread1 a.c:1Previous write of size 4 at 0x7f... by main thread:#0 main a.c:4Thread 1 (tid=20374, running) created at:#0 pthread_create ??:0#1 main a.c:3
  24. 24. Compiler instrumentationvoid foo(int *p) {*p = 42;}void foo(int *p) {__tsan_func_entry(__builtin_return_address(0));__tsan_write4(p);*p = 42;__tsan_func_exit()}
  25. 25. Direct shadow mapping (64-bit Linux)Application0x7fffffffffff0x7f0000000000Protected0x7effffffffff0x200000000000Shadow0x1fffffffffff0x180000000000Protected0x17ffffffffff0x000000000000Shadow = 4 * (Addr & kMask);
  26. 26. Shadow cellAn 8-byte shadow cell represents one memoryaccess:o ~16 bits: TID (thread ID)o ~42 bits: Epoch (scalar clock)o 5 bits: position/size in 8-byte wordo 1 bit: IsWriteFull information (no more dereferences)TIDEpoPosIsW
  27. 27. 4 shadow cells per 8 app. bytesTIDEpoPosIsWTIDEpoPosIsWTIDEpoPosIsWTIDEpoPosIsW
  28. 28. Example: first accessT1E10:2WWrite in thread T1
  29. 29. Example: second accessT1E10:2WT2E24:8RRead in thread T2
  30. 30. Example: third accessT1E10:2WT3E30:4RT2E24:8RRead in thread T3
  31. 31. Example: race?T1E10:2WT3E30:4RT2E24:8RRace if E1 does not"happen-before" E3
  32. 32. Fast happens-before• Constant-time operationo Get TID and Epoch from the shadow cello 1 load from thread-local storageo 1 comparison• Similar to FastTrack (PLDI09)
  33. 33. Stack trace for previous access• Important to understand the report• Per-thread cyclic buffer of eventso 64 bits per event (type + PC)o Events: memory access, function entry/exito Information will be lost after some timeo Buffer size is configurable• Replay the event buffer on reporto Unlimited number of frames
  34. 34. TSan overhead• CPU: 4x-10x• RAM: 5x-8x
  35. 35. Trophies• 500+ races in Google server-side apps(C++)o Scales to huge apps• 100+ races in Go programso 25+ bugs in Go stdlib• Several races in OpenSSLo 1 fixed, ~5 benign• Several races in Chromeo Weve just started, more to come :)
  36. 36. Key advantages• Speedo > 10x faster than other tools• Native support for atomicso Hard or impossible to implement with binarytranslation (Helgrind, Intel Inspector)
  37. 37. Limitations• Only 64-bit Linux• Hard to port to 32-bit platformso Small address spaceo Relies on atomic 64-bit load/store• Does not instrument:o pre-built librarieso inline assembly
  38. 38. MemorySanitizeruninitialized memory reads (UMR)
  39. 39. MSan report example: UMRint main(int argc, char **argv) {int x[10];x[0] = 1;if (x[argc]) return 1;...% clang -fsanitize=memory -fPIE -pie a.c -g; ./a.outWARNING: MemorySanitizer: UMR (uninitialized-memory-read)#0 0x7ff6b05d9ca7 in main stack_umr.c:4ORIGIN: stack allocation: x@main
  40. 40. Shadow memory• Bit to bit shadow mappingo 1 means poisoned (uninitialized)• Uninitialized memory:o Returned by malloco Local stack objects (poisoned at function entry)• Shadow is propagated through arithmeticoperations and memory writes• Shadow is unpoisoned when constants arestored
  41. 41. Direct 1:1 shadow mappingApplication0x7fffffffffff0x600000000000Protected0x5fffffffffff0x400000000000Shadow0x3fffffffffff0x200000000000Protected0x1fffffffffff0x000000000000Shadow = Addr - 0x400000000000;
  42. 42. Shadow propagation• Reporting UMR on first read causes false positiveso E.g. copying struct {char x; int y;}• Report UMR only on some uses (branch, syscall, etc)o Thats what Valgrind does• Propagate shadow values through expressionso A = B + C: A = B | Co A = B & C: A = (B & C) | (~B & C) | (B & ~C)o Approximation to minimize false positives/negativeso Similar to Valgrind• Function parameter/retval: shadow is stored in TLSo Valgrind shadows registers/stack instead
  43. 43. Tracking origins• Where was the poisoned memory allocated?a = malloc() ...b = malloc() ...c = *a + *b ...if (c) ... // UMR. Is a guilty or b?• Valgrind --track-origins: propagate the origin ofthe poisoned memory alongside the shadow• MemorySanitizer: secondary shadowo Origin-ID is 4 bytes, 1:1 mappingo 2x additional slowdown
  44. 44. Secondary shadow (origin)Application0x7fffffffffff0x600000000000Origin0x5fffffffffff0x400000000000Shadow0x3fffffffffff0x200000000000Protected0x1fffffffffff0x000000000000Origin = Addr - 0x200000000000;
  45. 45. • Without origins:o CPU: 3xo RAM: 2x• With origins:o CPU: 6xo RAM: 3xMSan overhead
  46. 46. Tricky part :(• Missing any write instruction causes false reports• Must monitor ALL stores in the programo libc, libstdc++, syscalls, etcSolutions:• Instrumented libstdc++, wrappers for libco Works for many "console" apps, e.g. LLVM• Instrument libraries at run-timeo DynamoRIO-based prototype (SLOW)• Instrument libraries statically (is it possible?)• Compile everything, wrap syscallso Will help AddressSanitizer/ThreadSanitizer too
  47. 47. MSan trophies• Proprietary console app, 1.3 MLOC in C++o Not tested with Valgrind previouslyo 20+ unique bugs in < 2 hourso Valgrind finds the same bugs in 24+ hourso MSan gives better reports for stack memory• Several bugs in LLVMo Caught by regular LLVM bootstrap• A few bugs in Chrome (just started)o Have to use DynamoRIO module (MSanDR)o 7x faster than Valgrind
  48. 48. • AddressSanitizer (memory corruption)o A "must use" for everyone (C++)o Linux, OSX, CrOS, Androido i386, x86_64, ARM, PowerPCo WIP: iOS, Windows, *BSD (?)o Clang 3.1+ and GCC 4.8+• ThreadSanitizer (races)o A "must use" if you have threads (C++, Go)o Only x86_64 Linux; Clang 3.2+ and GCC 4.8+• MemorySanitizer (uses of uninitialized data)o WIP, usable for "console" apps (C++)o Only x86_64 Linux; Clang 3.3Summary (all 3 tools)
  49. 49. Q&A
  50. 50. ASan/MSan vs Valgrind (Memcheck)Valgrind ASan MSanHeap out-of-bounds YES YES NOStack out-of-bounds NO YES NOGlobal out-of-bounds NO YES NOUse-after-free YES YES NOUse-after-return NO Sometimes NOUninitialized reads YES NO YESCPU Overhead 10x-300x 1.5x-3x 3x
  51. 51. • Slowdowns will add upo Bad for interactive or network apps• Memory overheads will multiplyo ASan redzone vs TSan/MSan large shadow• Not trivial to implementWhy not a single tool?