CSTalks-Visualizing Software Behavior-14Sep

  1. Visualizing Software Behavior
     Wu Yongzheng
     14/Sep/2011, NUS SoC CSTalks
  2. Problems
     • Software is complex
       – Large codebase
       – Interaction between components
       – Components from different vendors
       – Closed source, closed API
     • Why understand software?
       – As a developer => fewer bugs
       – As an administrator => diagnosis
       – Curiosity?
     • An execution trace contains information about software behavior, but it is huge.
  3. Software Traces
     • Types of traces
       – Instruction trace: records machine instructions
       – Call trace: records function calls
       – System call trace: records system calls
       – Software logs: important events
     • System trace
       – System call trace from all processes
       – Mainly resource usage, system & process interaction
  4. WinResMon
     • WinResMon: our trace recorder
     • Works on Windows
     • Types of events:
       – File: open, read, write, close, rename, …
       – Registry: open, get value, set value, delete, …
       – Network: connect, listen, send, receive, …
       – Process/thread: create, terminate
  5. Information (fields) in an Event
     • PID/TID: process/thread ID
     • Program name: path of the program’s EXE
     • User name/group: the process’s owner
     • Start/end time: event timing in CPU ticks
     • Operation type: e.g. file open
     • Parameter: type dependent, e.g. file path, system call flags, registry path, IP address
     • Call stack trace: call stack in the user process
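The fields listed on this slide map naturally onto a record type. The following is a minimal sketch only: the class name `TraceEvent`, the attribute names, and all sample values are assumptions made for illustration, not WinResMon's actual data format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TraceEvent:
    """One trace event, with one attribute per field named on the slide."""
    pid: int                          # process ID
    tid: int                          # thread ID
    program: str                      # path of the program's EXE
    user: str                         # process owner (user name/group)
    start_ticks: int                  # event start, in CPU ticks
    end_ticks: int                    # event end, in CPU ticks
    operation: str                    # operation type, e.g. "file_open"
    parameter: str                    # type dependent: file path, registry path, IP address, ...
    call_stack: Tuple[str, ...] = ()  # call stack in the user process

    @property
    def duration_ticks(self) -> int:
        """Time the event took, in CPU ticks."""
        return self.end_ticks - self.start_ticks


# Hypothetical sample event; none of these values come from a real trace.
event = TraceEvent(
    pid=1234, tid=1,
    program=r"c:\windows\system32\xcopy.exe",
    user="alice",
    start_ticks=1000, end_ticks=1450,
    operation="file_open",
    parameter=r"c:\data\a.bin",
)
print(event.duration_ticks)
```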
  6. Why Visualize System Traces
     • Software is complex
       – Interaction between modules and with other software
     • Software can be closed source, but its interaction is open
     • Humans are good at detecting
       – Repeated patterns
       – Anomalies
  7. What is DotPlot?
     [Figure: DotPlot with axes Trace X and Trace Y; events are labeled with the letters A to E]
  8. What is DotPlot?
     [Figure: DotPlot with axes Trace X and Trace Y; events are labeled with the letters A to E]
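A DotPlot places one trace on each axis and marks cell (x, y) whenever the event at position x of Trace X matches the event at position y of Trace Y. The sketch below illustrates that idea with plain equality as the matching test. The function names and the two short traces are made up for illustration; they are not the traces shown on the slide.

```python
def dotplot(trace_x, trace_y, match=lambda a, b: a == b):
    """Return a grid of booleans: grid[y][x] is True when the two events match."""
    return [[match(x, y) for x in trace_x] for y in trace_y]


def render(grid):
    """Draw the grid as text, '#' for a match and '.' otherwise."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in grid)


# Hypothetical traces; each letter stands for one event.
trace_x = list("ABCABC")
trace_y = list("BCA")
grid = dotplot(trace_x, trace_y)
print(render(grid))
```

Repeated subsequences show up as diagonal runs of matches, which is what makes repeated patterns easy to see.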
  9. An Example
     Visualization comparing MS PowerPoint, MS Word, OO Word, and OO PowerPoint.
  10. Elements of VDP
      [Figure: annotated VDP screenshot with numbered regions]
      • 1: Extended DotPlot
      • 2, 3: Axis Histogram
      • 4, 5: Barcode
  11. Extended DotPlot
      • Matching Rule
        – Defines whether two events match
        – By fields, e.g. “if PIDs and resource paths are the same”, “if program names are the same”
      • DP Coloring Rule
        – Defines the color for matched events
        – Traditional DP uses black only
        – Uses the RGB model on a black background, CMY on a white background
        – Uses regular expressions to specify events
        – E.g. “.*file_open.*” → blue, “.*reg_.*” → cyan
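The two kinds of rule can be sketched as follows. This is an illustration under assumptions, not the tool's actual rule syntax: events are plain dictionaries, `describe` is a hypothetical way of turning an event into the text that the regular expressions are matched against, and falling back to black is an assumed default. The RGB/CMY color mixing is not modeled.

```python
import re


def make_matching_rule(*fields):
    """Build a rule that says two events match when all given fields are equal."""
    def rule(a, b):
        return all(a[f] == b[f] for f in fields)
    return rule


# Coloring rules as (regular expression, color) pairs, using the slide's two examples.
COLORING_RULES = [
    (re.compile(r".*file_open.*"), "blue"),
    (re.compile(r".*reg_.*"), "cyan"),
]


def describe(event):
    """Hypothetical textual form of an event, used only for regex matching."""
    return f'{event["program"]} {event["operation"]} {event["parameter"]}'


def color_for(event, rules=COLORING_RULES, default="black"):
    """Return the color of the first rule whose pattern matches the whole event text."""
    text = describe(event)
    for pattern, color in rules:
        if pattern.fullmatch(text):
            return color
    return default


# Hypothetical events.
e1 = {"pid": 1, "program": "xcopy.exe", "operation": "file_open", "parameter": "a.txt"}
e2 = {"pid": 2, "program": "xcopy.exe", "operation": "reg_get_value", "parameter": "HKLM\\Software"}
e3 = {"pid": 2, "program": "xcopy.exe", "operation": "connect", "parameter": "10.0.0.1"}

same_program = make_matching_rule("program")
same_pid_and_path = make_matching_rule("pid", "parameter")
print(same_program(e1, e2), same_pid_and_path(e1, e2), color_for(e1), color_for(e2))
```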
  12. Event-ordered and Time-ordered
      • Each event takes a different amount of time
      • The meaning/unit of each axis
      [Figure: two panels, Event-ordered and Time-ordered]
  13. Axis Histogram
      – Ticks mark unit time (e.g. 1 second)
      – Histogram
        • Event density (time-ordered)
        • Time spent (event-ordered)
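The two histogram variants can be computed roughly as below. This is a sketch under assumptions: each event is reduced to a hypothetical `(start, end)` pair of timestamps, events are binned by start time, and the bin sizes are arbitrary.

```python
def event_density(events, bin_width):
    """Time-ordered axis: number of events that start in each time bin."""
    if not events:
        return []
    first = min(start for start, _ in events)
    last = max(start for start, _ in events)
    counts = [0] * ((last - first) // bin_width + 1)
    for start, _ in events:
        counts[(start - first) // bin_width] += 1
    return counts


def time_spent(events, events_per_bin):
    """Event-ordered axis: total duration of each run of consecutive events."""
    return [
        sum(end - start for start, end in events[i:i + events_per_bin])
        for i in range(0, len(events), events_per_bin)
    ]


# Hypothetical (start, end) timestamps in arbitrary units.
events = [(0, 2), (1, 9), (3, 4), (10, 11), (25, 30)]
print(event_density(events, bin_width=10))
print(time_spent(events, events_per_bin=2))
```

On a time-ordered axis every pixel covers the same amount of time, so the interesting quantity is how many events fall there. On an event-ordered axis every pixel covers the same number of events, so the interesting quantity is how long they took.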
  14. Barcode
      • One dimensional
      • Highlights user-chosen events
      • E.g. file_open → red
      • One or more (e.g. three below)
      • Barcode coloring rules
  15. Example 1: File Copying
      • Self-comparison, event-ordered
      • xcopy copying 8 files: 1MB, 10KB, 10MB, 100KB, 1MB, 10KB, 10MB and 100KB
      • DP match: operation + parameter (pathname)
      • DP color: magenta → source; cyan → destination; black → other
      [Figure labels: File Operation, Source/Dst File Operation, Registry Operation]
  16. File Size
      • File size is visible
      • The two 1MB and two 10MB files are shown
      • The two 10KB and two 100KB files are visible only when zoomed in
  17. Zooming in
      DP color: magenta → source; cyan → destination; black → other
  18. A Surprise: Registry Operations
      So many registry operations for a console application
      [Figure label: Registry Operation]
  19. Another Surprise: DLLs
      • DLLs: file operations, but on neither the source nor the destination
      • More time is spent on DLLs than on a 1MB file
      [Figure labels: File Operation, Source/Dst File Operation]
  20. Example 2: Software Build
      • X: succeeded; Y: failed due to a missing .c file
      • DP match: program + operation + value (pathname)
      • DP color: black → any
      • Bar1 color: black → nmake.exe
      • Bar2 color: cyan → cl.exe; magenta → link.exe
      • Bar3 color: cyan → reading .c files; magenta → reading .h files
  21. Number of Executions
      • X: 4 compiles (cl.exe), 1 link (link.exe)
      • Y: 3 compiles, 0 links
      • Y: the third compile doesn’t read .c or .h files
      • Bar2 color: cyan → cl.exe; magenta → link.exe
      • Bar3 color: cyan → reading .c files; magenta → reading .h files
      [Figure labels: X: 4 compiler, 1 linker; Y: 3 compiler, 0 linker]
  22. Similarity & Difference
      • The two traces are similar.
      • The Y (failed) trace terminates earlier, right before reading a .c file.
  23. Different Matching Rule
      [Figure: two panels, matching by Operation Type and by Program Name]
  24. Example 3: Two Idle Windows Machines
      • Time-ordered
      • 1 hour each
      • Different times
      • About 750K events each
  25. Anomaly & Repeated Pattern
      • Periodic pattern
      • Most events in R1
      • Most time in R2 alike
      • Easy to spot anomalies & regular patterns
      [Figure: DotPlot with regions R1 and R2 marked]
  26. Zoom In
      [Figure: zoomed-in views of regions R2 and R1]
  27. R1: Windows Update
      • Similar events (darker area) are by the Windows Auto Updater
      • More file operations, fewer registry operations
      • magenta → wuauclt.exe (Windows Update)
      [Figure labels: File Operation, Registry Operation]
  29. Visualizing Module Dependencies
      • The problem
        – There’s a vulnerability in X. Which software uses X?
        – Why does my software use X? I never call it.
        – Is it safe to uninstall X?
      • Software modules
        – Windows DLLs
        – UNIX .so files
        – Java classes, packages
  30. Examples of dependencies (1)
      • Binaries used by notepad
        – c:\windows\apppatch\acgenral.dll
        – c:\windows\system32\avgrsstx.dll
        – c:\windows\system32\imm32.dll
        – c:\windows\system32\lpk.dll
        – c:\windows\system32\msacm32.dll
        – c:\windows\system32\msctf.dll
        – c:\windows\system32\msctfime.ime
        – c:\windows\system32\shimeng.dll
        – c:\windows\system32\usp10.dll
        – c:\windows\system32\uxtheme.dll
        – c:\windows\system32\winmm.dll
        – c:\windows\system32\winspool.drv
        – c:\windows\winsxs\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.2600.5512_x-ww_35d4ce83\comctl32.dll
  31. Examples of dependencies (2)
      • Simple boot (only Windows installed)
        – DLLs: 154
        – EXEs: 10
        – Drivers: 1
        – IME: 1
      • Typical boot (Windows + applications)
        – DLLs: 274
        – EXEs: 15
        – Telephony/Modem: 6
        – Drivers: 3
        – ActiveX: 2
        – IME: 1
  32. Visualization (1)
      • Basic dependency graph
      • Graph is too dense
  33. Binary Dependency Visualization
      • Two types of nodes: EXE, DLL + etc.
      • Three types of directed edges
        1. EXE X launches another EXE Y
        2. EXE X loads a DLL Y
        3. A function in binary X calls a function in binary Y
      • How are binaries shared among programs?
        – EXE Dependency Graph
        – Only Type 1 and 2 edges
        – Group DLLs by loader
      • How do binaries interact?
        – DLL Dependency Graph
        – Only Type 2 and 3 edges
        – Group DLLs manually by functionality or software vendor
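The two graphs on this slide are views of one typed edge list. The sketch below shows one way to derive them. The edge list is hypothetical, and "group DLLs by loader" is read here as "put DLLs that are loaded by the same set of EXEs into one group", which is an interpretation rather than something the slide spells out.

```python
from collections import defaultdict

# Edge types as numbered on the slide.
LAUNCH, LOAD, CALL = 1, 2, 3


def filter_edges(edges, kinds):
    """Keep only the (source, destination, type) edges of the given types."""
    return [(src, dst, kind) for src, dst, kind in edges if kind in kinds]


def exe_dependency_graph(edges):
    """EXE Dependency Graph: only Type 1 (launch) and Type 2 (load) edges."""
    return filter_edges(edges, {LAUNCH, LOAD})


def dll_dependency_graph(edges):
    """DLL Dependency Graph: only Type 2 (load) and Type 3 (call) edges."""
    return filter_edges(edges, {LOAD, CALL})


def group_dlls_by_loader(edges):
    """Map each set of loading EXEs to the DLLs loaded by exactly that set."""
    loaders = defaultdict(set)
    for src, dst, kind in edges:
        if kind == LOAD:
            loaders[dst].add(src)
    groups = defaultdict(set)
    for dll, exes in loaders.items():
        groups[frozenset(exes)].add(dll)
    return dict(groups)


# Hypothetical edges, for illustration only.
edges = [
    ("explorer.exe", "notepad.exe", LAUNCH),
    ("notepad.exe", "uxtheme.dll", LOAD),
    ("notepad.exe", "comctl32.dll", LOAD),
    ("explorer.exe", "comctl32.dll", LOAD),
    ("comctl32.dll", "uxtheme.dll", CALL),
]
groups = group_dlls_by_loader(edges)
print(groups)
```

Grouping this way collapses all DLLs shared by the same programs into a single node, which is what makes the dense graph readable.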
  34. Visualization (1)
      • Basic dependency graph
      • Graph is too dense
  35. A more usable Visualization: EXE Dependency Graph
      • Grouped dependency graph
      [Figure: grouped dependency graph with labels 1 and 2]
  36. Comparing Microsoft Word and Open Office Writer
  37. DLL Dependency Graph: actual binary usage
      • Some definitions:
        – An EXE-DLL dependency in a DLL Dependency Graph exists when there is a control transfer from code in executable x to code in DLL y. We say that x has an EXE-DLL dependency on y.
        – A DLL-DLL dependency in a DLL Dependency Graph exists when there is a control transfer from code in DLL x to code in DLL y. We say that x has a DLL-DLL dependency on y.
  38. wget: DLL dependency without grouping
  39. wget: DLL dependency grouped by functionality
  40. Examples of grouping
      By functionality (GIMP)
  41. Examples of grouping
      By software vendor (GIMP)
  42. Two Operations
      • Diff
        – Compare two graphs.
          • E.g. from the same program but with a different environment/input
          • E.g. from two related programs
        – Diff graphs G1 and G2 to get G3.
      • Projection
        – Focus on a particular module X
        – Only show modules that call X or are called by X (recursive definition)
        – Project graph G1 on module M to get G2
        – Not a simple subgraph problem
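The Diff operation can be sketched as a comparison of edge sets. The slides do not say how G3 is represented, so the three-way split below (edges only in G1, only in G2, and in both) is an assumption, and the module names are hypothetical.

```python
def diff_graphs(g1, g2):
    """Compare two graphs given as collections of (source, destination) edges."""
    g1, g2 = set(g1), set(g2)
    return {
        "only_in_g1": g1 - g2,
        "only_in_g2": g2 - g1,
        "common": g1 & g2,
    }


# Hypothetical DLL dependency edges from two runs of the same program.
without_plugin = [("browser.exe", "render.dll"), ("render.dll", "gdi32.dll")]
with_plugin = [
    ("browser.exe", "render.dll"),
    ("render.dll", "gdi32.dll"),
    ("render.dll", "plugin.dll"),
    ("plugin.dll", "winmm.dll"),
]
g3 = diff_graphs(with_plugin, without_plugin)
print(sorted(g3["only_in_g1"]))
```

A renderer could then draw the common edges in a neutral color and the two difference sets in contrasting colors.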
  43. Diff of the DLL dependency graphs of Internet Explorer with and without Flash
  44. Projection of the DLL dependency graph of Internet Explorer on Flash
  45. Firefox using tortoisesvn
  46. Questions?
  47. Visualizing binaries executed
      • The call graph is large.
      • Group functions into images => DLL dependency graph.
      • The DLL dependency graph is still large.
      • Group DLLs by properties:
        – By functionality: graphics, audio, network…
        – By vendor: Microsoft, Adobe…
        – By path: C:\windows\system32\*.dll, D:\vmware\*.dll…
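Grouping by path can be sketched with wildcard patterns like the ones on this slide. The code below is an illustration under assumptions: the rule list, the group names, the sample DLL names, and the choice to drop edges that stay inside one group are all made up for the example.

```python
import fnmatch


def group_of(path, rules, default="other"):
    """Return the group of the first wildcard pattern that matches the path."""
    for pattern, group in rules:
        # Compare in lower case so the result does not depend on the platform.
        if fnmatch.fnmatchcase(path.lower(), pattern):
            return group
    return default


def collapse(edges, rules):
    """Replace each DLL by its group and drop edges inside a single group."""
    collapsed = set()
    for src, dst in edges:
        src_group, dst_group = group_of(src, rules), group_of(dst, rules)
        if src_group != dst_group:
            collapsed.add((src_group, dst_group))
    return collapsed


# Hypothetical grouping rules and DLL dependency edges.
rules = [
    (r"c:\windows\system32\*.dll", "system32"),
    (r"d:\vmware\*.dll", "vmware"),
]
edges = [
    (r"C:\windows\system32\user32.dll", r"C:\windows\system32\gdi32.dll"),
    (r"D:\vmware\vmtools.dll", r"C:\windows\system32\user32.dll"),
    (r"D:\vmware\vmtools.dll", r"C:\windows\system32\gdi32.dll"),
]
print(collapse(edges, rules))
```

Three edges between five hypothetical files collapse to a single edge between two groups, which is the size reduction the slide is after.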
  48. Visualizing binaries executed (1)
      • Generate call tree, call graph, DLL dependency graph
      • PIN tool to collect the execution trace
        – The trace includes call, return, thread, context, and system call events
        – Call and return events record the stack pointer, PC and target address.
      • Not trivial to maintain the call stack by tracking calls and returns
        – Non-returning functions (long jump)
        – Threads, fibers
        – Context
        – Kernel callbacks
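One common way to cope with calls that never return is to use the recorded stack pointer to discard stale frames. The sketch below shows that technique; it is not necessarily what the authors' PIN tool does. It assumes a stack that grows toward lower addresses, stack pointer values normalized so that a return reports the same value as its matching call, and one `ShadowStack` instance per thread. Fibers, context switches, and kernel callbacks are not handled.

```python
class ShadowStack:
    """Reconstructs the call stack of one thread from call and return events."""

    def __init__(self):
        self.frames = []  # list of (function name, stack pointer at the call)

    def _unwind_to(self, sp):
        # A frame whose call-time stack pointer is at or below the current
        # one can no longer be live, so drop it (covers long jumps too).
        while self.frames and self.frames[-1][1] <= sp:
            self.frames.pop()

    def on_call(self, target, sp):
        """Handle a call event: discard dead frames, then push the callee."""
        self._unwind_to(sp)
        self.frames.append((target, sp))

    def on_return(self, sp):
        """Handle a return event: discard the returning frame and any dead ones."""
        self._unwind_to(sp)

    def current(self):
        """Names of the functions currently on the reconstructed stack."""
        return [name for name, _ in self.frames]


# Hypothetical events: main calls A, A calls B, B long-jumps back to main
# (no return events are seen), and main then calls C.
stack = ShadowStack()
stack.on_call("A", sp=100)
stack.on_call("B", sp=90)
depth_before_jump = stack.current()
stack.on_call("C", sp=100)
depth_after_jump = stack.current()
print(depth_before_jump, depth_after_jump)
```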
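  49. Projection

      /* Forward declarations so the calls below compile. */
      void A(void);
      void B(int i);
      void C(void);
      void D(void);

      int main(void) {
          A();
          B(1);
          return 0;
      }

      /* A reaches C through B(0). */
      void A(void) {
          B(0);
      }

      /* B calls D for a nonzero argument and C otherwise. */
      void B(int i) {
          if (i) D();
          else C();
      }

      void C(void) {}
      void D(void) {}

      [Figure: Full Graph with nodes main, A, B, C, D; Project on A with nodes main, A, B, C]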
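The example suggests that projection works on the recorded call tree rather than on the static call graph: B is reachable from A, and B can call D, yet D does not appear in the projection because no recorded call chain through A reaches it. The sketch below implements that reading, keeping an edge only when the call it represents lies on a chain that passes through the target. This is an interpretation of the slide and not a confirmed description of the authors' algorithm.

```python
def call_graph(tree):
    """Collect every caller-to-callee edge in a call tree of (name, children) nodes."""
    name, children = tree
    edges = set()
    for child in children:
        edges.add((name, child[0]))
        edges |= call_graph(child)
    return edges


def project(tree, target):
    """Keep only the edges on call chains that pass through `target`."""
    edges = set()

    def contains(node):
        # True when the target is this node or is somewhere below it.
        name, children = node
        return name == target or any(contains(child) for child in children)

    def walk(node, under_target):
        name, children = node
        under = under_target or name == target
        for child in children:
            # Keep the edge if the target is above the callee or below it.
            if under or contains(child):
                edges.add((name, child[0]))
            walk(child, under)

    walk(tree, False)
    return edges


# Call tree recorded when running the program on the slide:
# main calls A, which calls B(0), which calls C; main then calls B(1), which calls D.
tree = ("main", [
    ("A", [("B", [("C", [])])]),
    ("B", [("D", [])]),
])
full = call_graph(tree)
projected = project(tree, "A")
print(sorted(projected))
```

The repeated `contains` scan keeps the sketch short at the cost of quadratic time on deep trees; a single bottom-up pass would avoid that.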
