Published in: Education, Technology

  1. BitBlaze
     Alex Bazhanyuk, @Abazhanyuk
     “RE” school, DefCon-UA, 2012
  2. Motivation
     ● Service
     ● Kernel space: kernel, drivers
     ● Mobile
     ● Embedded
  3. Goal
     ● Vulnerabilities
     ● Malware
  4. What do we have?
     ● Taint analysis
     ● Symbolic execution
     ● Solvers: STP, SMT, Z3
     ● Data-flow analysis, a special case of graph analysis
  5. Theory of taint analysis
  6. Taint analysis is a form of data tracing
     ● Taint sources: network, keyboard, memory, disk, function outputs
     ● Taint propagation: a data-flow technique
       - Shadow memory
       - Whole-system
       - Across register/memory/disk/swapping
  7. Fundamentals of taint analysis
  8. Taint propagation
     ● If an operation uses the value of some tainted object, say X, to derive a value for another object, say Y, then Y becomes tainted: object X taints object Y.
     ● Taint operator t: X → t(Y)
     ● The taint operator is transitive: if X → t(Y) and Y → t(Z), then X → t(Z).
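The propagation rule can be sketched in a few lines of Python. This is only an illustration, not TEMU code: the class name `ShadowState`, the location names and the label `"net:0"` are made up for the example, and the shadow memory is modelled as a dictionary from a location to the set of taint labels it carries.

```python
from dataclasses import dataclass, field


@dataclass
class ShadowState:
    """Toy shadow memory: location name -> set of taint labels (hypothetical model)."""
    taint: dict = field(default_factory=dict)

    def taint_source(self, loc, label):
        # Mark a location as tainted by an external source (network, keyboard, ...).
        self.taint[loc] = {label}

    def propagate(self, dst, *srcs):
        # dst is derived from srcs, so it inherits the union of their labels.
        labels = set()
        for src in srcs:
            labels |= self.taint.get(src, set())
        if labels:
            self.taint[dst] = labels
        else:
            # Overwriting dst with untainted data clears its taint.
            self.taint.pop(dst, None)

    def is_tainted(self, loc):
        return bool(self.taint.get(loc))


state = ShadowState()
state.taint_source("X", "net:0")   # X comes from the network
state.propagate("Y", "X")          # Y = f(X)     -> X taints Y
state.propagate("Z", "Y", "K")     # Z = g(Y, K)  -> transitivity: X taints Z
state.propagate("W", "K")          # W = h(K)     -> K is clean, so W stays clean
print(state.is_tainted("Z"), state.is_tainted("W"))
```

Keeping a set of labels per location, rather than a single bit, is one possible design choice; it makes it possible to say which input a value depends on.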
  9. Static taint analysis
     ● Analysis performed over multiple paths of a program
     ● Typically performed on a control flow graph (CFG): statements are nodes, and there is an edge between two nodes if there is a possible transfer of control between them.
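A minimal sketch of static taint analysis on a CFG, assuming a toy representation where each node defines one variable from a list of used variables. The graph `CFG`, the node names and the function `static_taint` are hypothetical; the analysis is a standard worklist fixpoint that merges paths with set union ("may be tainted").

```python
from collections import deque

# Hypothetical toy CFG: node -> (defined variable, used variables, successors).
CFG = {
    "n1": ("x", ["input"], ["n2"]),
    "n2": ("c", [], ["n3", "n4"]),   # c is a constant; control may go either way
    "n3": ("y", ["x"], ["n5"]),      # y = f(x)
    "n4": ("y", ["c"], ["n5"]),      # y = g(c)
    "n5": ("z", ["y"], []),
}


def static_taint(cfg, entry, sources):
    """Return, for every node, the variables that may be tainted on entry to it."""
    taint_in = {node: set() for node in cfg}
    taint_in[entry] = set(sources)
    work = deque([entry])
    while work:
        node = work.popleft()
        defined, used, succs = cfg[node]
        out = set(taint_in[node])
        if any(u in out for u in used):
            out.add(defined)         # derived from tainted data
        else:
            out.discard(defined)     # overwritten with clean data
        for succ in succs:
            if not out <= taint_in[succ]:
                taint_in[succ] |= out    # merge over all incoming paths
                work.append(succ)
    return taint_in


result = static_taint(CFG, "n1", {"input"})
print(sorted(result["n5"]))
```

On the path through n3 the variable y is tainted, on the path through n4 it is not; because the analysis covers both paths, y is reported as possibly tainted at n5.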
  10. BitBlaze: Binary Analysis Infrastructure
      ● Automatically extracting security-related properties from binary code
      ● Build a unified binary analysis platform for security
        - Static analysis + dynamic analysis + symbolic analysis
        - Leverages recent advances in program analysis, formal methods, binary instrumentation…
      ● Solve security problems via binary analysis
        - More than a dozen different security applications
        - Over 25 research publications
  11. BitBlaze
      http://bitblaze.cs.berkeley.edu/
      ● TEMU, VINE
      ● Rudder, Panorama, Renovo
  12. TEMU
  13. TEMU limitations
      ● Only gcc-3.4
      ● QEMU 0.9.1: TEMU
      ● QEMU 0.10: TCG (Tiny Code Generator), TODO
      ● QEMU 0.10 != QEMU 1.01
  14. Create a trace with the network service:
      $ sudo ./tracecap/temu -m 256 -net nic,vlan=0 -net tap,vlan=0,script=/etc/qemu-ifup -monitor stdio winxp.img
      (qemu) load_plugin tracecap/tracecap.so
      general/trace_only_after_first_taint is enabled.
      general/log_external_calls is disabled.
      general/write_ops_at_insn_end is enabled.
      general/save_state_at_trace_stop is disabled.
      tracing/tracing_table_lookup is enabled.
      tracing/tracing_tainted_only is disabled.
      tracing/tracing_single_thread_only is disabled.
      tracing/tracing_kernel is disabled.
      tracing/tracing_kernel_tainted is disabled.
      tracing/tracing_kernel_partial is disabled.
      network/ignore_dns is disabled.
      Enabled: 0x00 Proto: 0x00 Sport: 0 Dport: 0 Src: Dst: plugin options from: SRC_PATH/tracecap/ini/hook_plugin.ini
      Loading plugins from: SRC_PATH/shared/hooks/hook_plugins
      tracecap/tracecap.so is loaded successfully!
      (qemu) enable_emulation
      Emulation is now enabled
      (qemu) taint_nic 1
      (qemu) tracebyname "FileZilla_server.exe" /traces/file_zilla
      (We send data over the network)
      (qemu) Time of first tainted data: 1289487729.962905
      (qemu) trace_stop
  15. VINE
  16. The Vine Intermediate Language
  17. Example of disasm:
      fc32dcec: rep stos %eax,%es:(%edi) R@eax[0x00000000][4](R) T0 R@ecx[0x00000002][4](RCW) T0 M@0xfb7bfff8[0x00000000][4](CW) T1 {15 (1231, 69624) (1231, 69625) (1231, 69626) (1231, 69627) }
      fc32dcec: rep stos %eax,%es:(%edi) R@eax[0x00000000][4](R) T0 R@ecx[0x00000001][4](RCW) T0 M@0xfb7bfffc[0x00000000][4](CW) T1 {15 (1231, 69628) (1231, 69629) (1231, 69630) (1231, 69631) }
      fc32dcee: mov %edx,%ecx R@edx[0x0000015c][4](R) T0 R@ecx[0x00000000][4](W) T0
      fc32dcf0: and $0x3,%ecx I@0x00000000[0x00000003][1](R) T0 R@ecx[0x0000015c][4](RW) T0
      fc32dcf5: andl $0x0,-0x4(%ebp) I@0x00000000[0x00000000][1](R) T0 M@0xfb5ae738[0x00000002][4](RW) T0
      fc32dcf9: jmp 0x00000000fc32c726 J@0x00000000[0xffffea2d][4](R) T0
      fc32c726: cmpl $0x0,-0x58(%ebp) I@0x00000000[0x00000000][1](R) T0 M@0xfb5ae6e4[0x00000000][4](R) T0
      fc32c72a: je 0x00000000fc32c369 J@0x00000000[0xfffffc3f][4](R) T0
      fc32c369: mov 0xc(%ebp),%eax M@0xfb5ae748[0x81144e70][4](R) T0 R@eax[0x00000000][4](W) T0
      fc32c36c: xor %edx,%edx R@edx[0x0000015c][4](R) T0 R@edx[0x0000015c][4](RW) T0
  18. Taint
      ● T0 means that the operand is not tainted.
      ● T1 means that the operand is tainted; the curly brackets show what is tainted and what it depends on.
      Here is an example:
      fc32dcec: rep stos %eax,%es:(%edi) R@eax[0x00000000][4](R) T0 R@ecx[0x00000001][4](RCW) T0 M@0xfb7bfffc[0x00000000][4](CW) T1 {15 (1231, 628) (1231, 629) (1231, 630) (1231, 631) }
      One can see that 4 bytes of the memory operand are tainted and that they depend on offsets 628, 629, 630 and 631. 1231 is the number (name) of the taint origin.
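The textual trace lines above are regular enough to be picked apart with a regular expression. The following is a rough sketch that only relies on the layout visible on the slides; the function name `parse_operands` and the dictionary keys are made up, and reading each `(a, b)` pair as (origin, offset) is an assumption based on the explanation on this slide. The first number inside the braces is captured but not interpreted.

```python
import re

# One operand: kind@name[value][size](access) T<flag> {optional taint record}
OPERAND = re.compile(
    r"([RMIJ])@(\w+)\[(0x[0-9a-fA-F]+)\]\[(\d+)\]\((\w+)\)\s*T(\d)"
    r"(?:\s*\{(\d+)((?:\s*\(\d+,\s*\d+\))*)\s*\})?"
)
PAIR = re.compile(r"\((\d+),\s*(\d+)\)")


def parse_operands(line):
    """Return a list of operand dictionaries found in one textual trace line."""
    operands = []
    for kind, name, value, size, access, flag, _first, pairs in OPERAND.findall(line):
        operands.append({
            "kind": kind,                      # R, M, I or J as printed in the trace
            "name": name,                      # register name or address
            "value": int(value, 16),
            "size": int(size),
            "access": access,
            "tainted": flag == "1",            # T1 -> tainted, T0 -> clean
            "origins": [(int(a), int(b)) for a, b in PAIR.findall(pairs)],
        })
    return operands


line = ("fc32dcec: rep stos %eax,%es:(%edi) R@eax[0x00000000][4](R) T0 "
        "R@ecx[0x00000001][4](RCW) T0 M@0xfb7bfffc[0x00000000][4](CW) T1 "
        "{15 (1231, 628) (1231, 629) (1231, 630) (1231, 631) }")
ops = parse_operands(line)
tainted_offsets = [off for op in ops if op["tainted"] for _, off in op["origins"]]
print(tainted_offsets)
```

A list like `tainted_offsets` tells which bytes of the input reached this instruction, which is the information needed to decide which input bytes are worth mutating or solving for.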
  19. appreplay
      ./vine-1.0/trace_utils/appreplay -trace font.trace -ir-out font.trace.il -assertion-on-var false -use-post-var false
      where:
      ● appreplay: the OCaml utility that we run;
      ● -trace: the path to the trace;
      ● -ir-out: the path where the IL code is written;
      ● -assertion-on-var false -use-post-var false: flags that control the format of the IL code; setting them to false makes the text more readable.
  20. Example of IL code
      It begins with the declaration of variables:
      var R_EBP_0:reg32_t;
      var R_ESP_1:reg32_t;
      var R_ESI_2:reg32_t;
      var R_EDI_3:reg32_t;
      var R_EIP_4:reg32_t;
      var R_EAX_5:reg32_t;
      var R_EBX_6:reg32_t;
      var R_ECX_7:reg32_t;
      var R_EDX_8:reg32_t;
      var EFLAGS_9:reg32_t;
      var R_CF_10:reg1_t;
      var R_PF_11:reg1_t;
      var R_AF_12:reg1_t;
      var R_ZF_13:reg1_t;
      var R_SF_14:reg1_t;
      var R_OF_15:reg1_t;
      var R_CC_OP_16:reg32_t;
      var R_CC_DEP1_17:reg32_t;
      var R_CC_DEP2_18:reg32_t;
      var R_CC_NDEP_19:reg32_t;
      var R_DFLAG_20:reg32_t;
      var R_IDFLAG_21:reg32_t;
      var R_ACFLAG_22:reg32_t;
      var R_EMWARN_23:reg32_t;
      var R_LDT_24:reg32_t;
      var R_GDT_25:reg32_t;
      var R_CS_26:reg16_t;
      var R_DS_27:reg16_t;
      var R_ES_28:reg16_t;
      var R_FS_29:reg16_t;
      var R_GS_30:reg16_t;
      var R_SS_31:reg16_t;
      var R_FTOP_32:reg32_t;
      var cond_000017_0x4010ce_00_162:reg1_t;
      var cond_000013_0x4010c3_00_161:reg1_t;
      var cond_000012_0x4010c0_00_160:reg1_t;
      var cond_000007_0x4010b6_00_159:reg1_t;
      var INPUT_10000_0000_62:reg8_t;
      var INPUT_10000_0001_63:reg8_t;
      var INPUT_10000_0002_64:reg8_t;
      var INPUT_10000_0003_65:reg8_t;
      var mem_arr_57:reg8_t[4294967296]; (memory as an array)
      var mem_35:mem32l_t;
      INPUT variables are free memory cells: the ones that were tainted at the very beginning (back in TEMU), i.e. input that enters the program from an external source.
  21. Example of IL code (continued)
      Now comes the filter initialization:
      /*Initializers*/
      R_EAX_5:reg32_t = 0x73657930:reg32_t;
      {
      var idx_144:reg32_t;
      var val_143:reg8_t;
      idx_144:reg32_t = 0x12fef0:reg32_t;
      val_143:reg8_t = INPUT_10000_0000_62:reg8_t;
      mem_arr_57[idx_144:reg32_t + 0:reg32_t]:reg8_t = cast((val_143:reg8_t & 0xff:reg8_t) >> 0:reg8_t)L:reg8_t;
      }
      {
      var idx_146:reg32_t;
      var val_145:reg8_t;
      idx_146:reg32_t = 0x12fef1:reg32_t;
      val_145:reg8_t = INPUT_10000_0001_63:reg8_t;
      mem_arr_57[idx_146:reg32_t + 0:reg32_t]:reg8_t = cast((val_145:reg8_t & 0xff:reg8_t) >> 0:reg8_t)L:reg8_t;
      }
      Then the instructions. Each one starts with a label whose name is the address of the instruction, followed by the filter itself and the IR of the instruction:
      label pc_0x40143c_1:
      /*Filter IRs:*/
      /*ASM IR:*/
      {
      {
      var T_32t0_58:reg32_t;
      var T_32t1_59:reg32_t;
      var T_32t2_60:reg32_t;
      var T_32t3_61:reg32_t;
      T_32t2_60:reg32_t = R_ESP_1:reg32_t;
      T_32t1_59:reg32_t = T_32t2_60:reg32_t + 0x1c8:reg32_t;
      T_32t3_61:reg32_t = ( ( cast(mem_arr_57[T_32t1_59:reg32_t + 0:reg32_t]:reg8_t)U:reg32_t << 0:reg32_t | cast(mem_arr_57[T_32t1_59:reg32_t + 1:reg32_t]:reg8_t)U:reg32_t << 8:reg32_t ) | cast(mem_arr_57[T_32t1_59:reg32_t + 2:reg32_t]:reg8_t)U:reg32_t << 0x10:reg32_t ) | cast(mem_arr_57[T_32t1_59:reg32_t + 3:reg32_t]:reg8_t)U:reg32_t << 0x18:reg32_t ;
      R_EAX_5:reg32_t = T_32t3_61:reg32_t;
      }
      }
  22. Weakest precondition
      Vine:
      ● CFG → PDG (DDG) → GCL → WP form
        - PDG: program dependence graph
        - DDG: data dependence graph
        - GCL: Guarded Command Language
      ● WP(S, R): S is a statement (program), R is a postcondition.
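For reference, these are the textbook weakest-precondition rules for a guarded-command-style language. They are the standard definitions, not a description of how Vine implements them internally:

```latex
\begin{align*}
wp(x := e,\; R) &= R[e/x] \\
wp(\texttt{assert } b,\; R) &= b \land R \\
wp(\texttt{assume } b,\; R) &= b \Rightarrow R \\
wp(S_1 ; S_2,\; R) &= wp(S_1,\; wp(S_2,\; R)) \\
wp(S_1 \,\square\, S_2,\; R) &= wp(S_1,\; R) \land wp(S_2,\; R)
\end{align*}
```

A one-line worked example: $wp(x := x + 1,\; x > 5) = (x + 1 > 5)$, i.e. $x > 4$. Any state with $x > 4$ is guaranteed to reach a state with $x > 5$ after the assignment.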
  23. What is STP and what does it do?
      STP is a solver for bit-vector expressions. It is a separate project, independent of BitBlaze:
      http://sites.google.com/site/stpfastprover/
      To produce STP code from IL code:
      ./vine-1.0/utils/wputil trace.il -stpout stp.code
      where the input is IL code and the output is STP code.
  24. STP program example
      bv : BITVECTOR(10);
      a : BOOLEAN;
      QUERY(
      0bin01100000[5:3]=(0bin1111001@bv[0:0])[4:2]
      AND
      0bin1@(IF a THEN 0bin0 ELSE 0bin1 ENDIF)=(IF a THEN 0bin110 ELSE 0bin011 ENDIF)[1:0]
      );
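To see what this query states, the two bit-vector operations it uses can be mimicked with plain integers: `[hi:lo]` extracts bits hi down to lo, and `@` concatenates. The sketch below brute-forces every value of `bv` and `a` instead of calling STP; the helper names `extract`, `concat` and `query_holds` are made up for the example.

```python
def extract(value, hi, lo):
    """Bits hi..lo (inclusive) of value, like value[hi:lo] in the STP input language."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def concat(left, right, right_width):
    """left @ right, where right is right_width bits wide."""
    return (left << right_width) | right


def query_holds(bv, a):
    # 0bin01100000[5:3] = (0bin1111001 @ bv[0:0])[4:2]
    first = extract(0b01100000, 5, 3) == extract(
        concat(0b1111001, extract(bv, 0, 0), 1), 4, 2)
    # 0bin1 @ (IF a THEN 0bin0 ELSE 0bin1 ENDIF)
    #   = (IF a THEN 0bin110 ELSE 0bin011 ENDIF)[1:0]
    second = concat(0b1, 0 if a else 1, 1) == extract(0b110 if a else 0b011, 1, 0)
    return first and second


# bv is a 10-bit vector and a is a boolean, so exhaustive checking is cheap.
valid = all(query_holds(bv, a) for bv in range(1 << 10) for a in (False, True))
print(valid)
```

Both sides agree for every assignment, so a solver would be expected to report this query as valid. For real traces the search space is far too large for enumeration, which is why a solver is used.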
  25. STP example code
      It begins with the declaration of variables:
      R_EBX_6_16 : BITVECTOR(32);
      INPUT_10000_0003_65_7 : BITVECTOR(8);
      INPUT_10000_0002_64_6 : BITVECTOR(8);
      INPUT_10000_0001_63_5 : BITVECTOR(8);
      mem_arr_57_8 : ARRAY BITVECTOR(64) OF BITVECTOR(8);
      INPUT_10000_0000_62_4 : BITVECTOR(8);
      % end free variables.
      Then comes the expression itself, in the form of an assertion (ASSERT):
      ASSERT( 0bin1 =
      (LET R_EAX_5_232 = 0hex73657930 IN
      (LET idx_144_233 = 0hex0012fef0 IN
      (LET val_143_234 = INPUT_10000_0000_62_4 IN
      (LET mem_arr_57_393 = (mem_arr_57_8 WITH [(0bin00000000000000000000000000000000 @ BVPLUS(32, idx_144_233,0hex00000000))] := (val_143_234 & 0hexff)[7:0])
      …….
      IN
      (cond_000017_0x4010ce_00_162_392 & 0bin1))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))));
      Then comes the question: is this expression never true?
      QUERY (FALSE);
      And a request for a counterexample:
      COUNTEREXAMPLE;
  26. STP example
      How to ask STP for a solution:
      ./stp stp.code
      Example of STP output:
      ASSERT( INPUT_10000_0001_63_5 = 0x00 );
      ASSERT( INPUT_10000_0002_64_6 = 0x00 );
      ASSERT( INPUT_10000_0000_62_4 = 0x61 );
      ASSERT( INPUT_10000_0003_65_7 = 0x00 );
      Invalid.
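The counterexample printed by STP is the interesting part: it is a concrete assignment to the INPUT variables. A small sketch of turning that text into input bytes follows. It assumes the variable naming seen on the slides, `INPUT_<origin>_<offset>_<n>_<n>`, where the second field is taken to be the byte offset; that reading, and the function name `counterexample_to_bytes`, are assumptions made for the example.

```python
import re

# Matches lines such as: ASSERT( INPUT_10000_0001_63_5 = 0x00 );
ASSIGN = re.compile(
    r"ASSERT\(\s*INPUT_(\d+)_(\d+)_\d+_\d+\s*=\s*0x([0-9a-fA-F]+)\s*\);"
)


def counterexample_to_bytes(stp_output):
    """Collect the INPUT assignments and return them as bytes ordered by offset."""
    values = {}
    for _origin, offset, hexval in ASSIGN.findall(stp_output):
        values[int(offset)] = int(hexval, 16)
    # Offsets that STP did not mention are simply skipped in this sketch.
    return bytes(values[off] for off in sorted(values))


stp_output = """\
ASSERT( INPUT_10000_0001_63_5 = 0x00 );
ASSERT( INPUT_10000_0002_64_6 = 0x00 );
ASSERT( INPUT_10000_0000_62_4 = 0x61 );
ASSERT( INPUT_10000_0003_65_7 = 0x00 );
Invalid.
"""
new_input = counterexample_to_bytes(stp_output)
print(new_input)
```

With `QUERY(FALSE)`, the answer "Invalid" means the assertions can be satisfied, and the counterexample is an input that satisfies them.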
  27. Solver
      ● SeL4
      ● Hash: md5, …
      ● Physics
  28. Diagram of the basic method
      [Diagram: input data + software → TEMU → trace, alloc-file, state → Vine (appreplay → IL code → wputil) → STP code → STP → input data]
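The chain in this diagram corresponds to the three command lines shown on the earlier slides. A small sketch that only builds those command lines (it does not run anything) is given below. The flags are copied from the slides; the function names, the default paths and the `.il`/`.stp` output file names are assumptions made for the example.

```python
def appreplay_cmd(trace, il_out, vine_dir="./vine-1.0"):
    """Trace -> IL code (flags as shown on the appreplay slide)."""
    return [f"{vine_dir}/trace_utils/appreplay",
            "-trace", trace, "-ir-out", il_out,
            "-assertion-on-var", "false", "-use-post-var", "false"]


def wputil_cmd(il_in, stp_out, vine_dir="./vine-1.0"):
    """IL code -> STP code (flags as shown on the STP slide)."""
    return [f"{vine_dir}/utils/wputil", il_in, "-stpout", stp_out]


def stp_cmd(stp_in, stp_bin="./stp"):
    """STP code -> answer and counterexample."""
    return [stp_bin, stp_in]


def pipeline(trace):
    """Return the three commands in the order the diagram chains them."""
    il_file = trace + ".il"      # hypothetical naming convention
    stp_file = trace + ".stp"    # hypothetical naming convention
    return [appreplay_cmd(trace, il_file),
            wputil_cmd(il_file, stp_file),
            stp_cmd(stp_file)]


for cmd in pipeline("font.trace"):
    print(" ".join(cmd))
```

In a real wrapper each list could be passed to `subprocess.run(cmd, check=True)`; collecting the trace itself still happens interactively in the TEMU monitor.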
  29. SASV
      Components:
      ● TEMU (tracecap: start/stop tracing; various additions to tracecap, hooks etc.)
      ● Vine (appreplay, wputil)
      ● STP
      ● IDA plugins:
        - DangerFunctions: finds calls to malloc, strcpy, memcpy etc.
        - IndirectCalls: indirect jumps, indirect calls.
        - ida2sql (zynamics): exports the idb into a MySQL db (http://blog.zynamics.com/2010/06/29/ida2sql-exporting-ida-databases-to-mysql/)
      ● Iterators: wrappers for TEMU, Vine, STP.
      ● Various publishers: for DeviceIoControl etc.
  30. How does SASV work?
  31. SASV
      Pattern:
      ● Min goal: 100% coverage of the dangerous code
      ● Max goal: 100% coverage of all the code
  32. SASV basic algorithm
      1. IDA plugins run -> dangerous places
      2. Publisher -> invokes the code under analysis
      3. TEMU -> trace
      4. Trace -> appreplay -> IL
      5. IL -> path-change algorithm -> IL’
      6. IL’ -> wputil -> STP_program’
      7. STP_program’ -> STP -> data for iteration n+1
      8. Go to step 2
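The loop in steps 2 to 8 can be imitated on a toy scale. The sketch below is not SASV code: `target` is a made-up two-branch program standing in for the traced binary, the recorded branch list stands in for the trace and IL, and `solve` is a brute-force search standing in for STP. Step 1 (the IDA plugins) is not modelled.

```python
from itertools import product


def target(data, trace):
    """Toy program under test; records (branch id, predicate, taken) per branch."""
    def branch(bid, pred):
        taken = pred(data)
        trace.append((bid, pred, taken))
        return taken

    if branch("b0", lambda d: d[0] == 0x61):
        if branch("b1", lambda d: d[1] > 0x7F):
            return "danger"
    return "ok"


def solve(constraints, size):
    """Brute-force stand-in for the solver: bytes satisfying all (pred, wanted)."""
    for cand in product(range(256), repeat=size):
        if all(pred(cand) == wanted for pred, wanted in constraints):
            return bytes(cand)
    return None


def explore(seed, max_iters=10):
    seen_paths, queue, results = set(), [seed], {}
    for _ in range(max_iters):
        if not queue:
            break
        data = queue.pop(0)
        trace = []
        results[data] = target(data, trace)         # steps 2-4: run and record the path
        path = tuple((bid, taken) for bid, _, taken in trace)
        if path in seen_paths:
            continue
        seen_paths.add(path)
        for i in range(len(trace)):                 # step 5: keep a prefix, flip branch i
            constraints = [(pred, taken) for _, pred, taken in trace[:i]]
            constraints.append((trace[i][1], not trace[i][2]))
            new_input = solve(constraints, len(seed))   # steps 6-7: ask the "solver"
            if new_input is not None and new_input not in results:
                queue.append(new_input)             # step 8: data for the next iteration
    return results


results = explore(b"\x00\x00")
for data, outcome in results.items():
    print(data, outcome)
```

Starting from two zero bytes, the loop first flips `b0` to obtain an input beginning with 0x61, then flips `b1` and reaches the "danger" branch on the third run. Each iteration costs one run plus one solver call per flipped branch.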
  33. Diagram for a new path in the graph
      [Diagram: input data + software → TEMU → trace, alloc-file, state → Vine (appreplay → IL code → trace changer, symbolic execution → IL’ code → wputil) → STP’ code → STP → new input data → next iteration]
  34. Complex system
      [Diagram: three schemes that combine SASV with a blackbox fuzzer.
      1) Elements: input data, SASV, new input data, blackbox fuzzer.
      2) Elements: set of input data, coverage, input data, SASV, new input data, blackbox fuzzer.
      3) Elements: set of input data, blackbox fuzzer, new input data, coverage, SASV, new input data.]
  35. Diagram of the basic method
      [Diagram: input data + software → TEMU → trace, alloc-file, state → Vine (appreplay → IL code → wputil) → STP code → STP → input data]
  36. BitBlaze vs blackbox fuzzer (Optipng)
      BitBlaze:
      ● PNG format file: 7309 iterations. Unique addresses: 12313. Time: 73:32 hours. Coverage: 36.27%. Size of first input data: 2.3Kb.
      ● BMP format file: 2368 iterations. Unique addresses: 9620. Time: 12:35 hours. Coverage: 28.34%. Size of first input data: 1.1Kb.
      ● GIF format file: 16884 iterations. Unique addresses: 10361. Time: 112:21 hours. Coverage: 30.52%. Size of first input data: 0.9Kb, 2.3Kb, 15Kb.
      Blackbox fuzzer, 1 byte change:
      ● PNG format file: initial set: 3654; time for one iteration: 10 sec; number of iterations: 43200
      ● GIF format file: initial set: 8442; time for one iteration: 3 sec; number of iterations: 86400
      ● BMP format file: initial set: 2368; time for one iteration: 3 sec; number of iterations: 86400
      Blackbox fuzzer, 2 byte change:
      ● PNG format file: initial set: 3654; time for one iteration: 10 sec; number of iterations: 86400
      ● GIF format file: initial set: 8442; time for one iteration: 3 sec; number of iterations: 172800
      ● BMP format file: initial set: 2368; time for one iteration: 3 sec; number of iterations: 172800
  37. Result
      ● Blackbox fuzzer: 1
      ● BitBlaze: 0
      ● Blackbox fuzzer + BitBlaze = 3
  38. CFG
  39. Call graph
  40. Disadvantages
      ● Choosing the vulnerability to target is a difficult question.
      ● Performance: the speed of tracing in TEMU is AWFUL.
  41. Get rid of that damned QEMU!
      ● Move taint propagation to the hypervisor!
      ● Damn good idea!
      ● But a lot of code to port/rewrite
  42. S2E
      ● Symbolic execution
      ● Architecture hypothesis for S²E Android support
  43. BAP
      ● Dynamic: PIN tools
      ● Static: LLVM
  44. S2E + SASV
      ● S2E = QEMU + KLEE
      ● KLEE = LLVM + STP
      ● Input data => taint analysis (new concept)
      ● ARM support
      ● QEMU 0.12 support
  45. LLVM
      ● LLVM
      ● ELLCC: The Embedded LLVM Compiler Collection
  46. Data-flow analysis
  47. Integrity graph
  48. Data flow analysis schema
  49. State machine
  50. Thanks :)
      virvdova@gmail.com