Visualizing Software Behavior
Wu Yongzheng
14/Sep/2011, NUS SoC CSTalks
CSTalks-Visualizing Software Behavior-14Sep

  1. Visualizing Software Behavior. Wu Yongzheng, 14/Sep/2011, NUS SoC CSTalks
  2. Problems
     • Software is complex
       – Large codebase
       – Interaction between components
       – Components from different vendors
       – Closed source, closed API
     • Why understand software?
       – As a developer => fewer bugs
       – As an administrator => diagnosis
       – Curiosity?
     • An execution trace contains information about software behavior, but it is huge.
  3. Software Traces
     • Types of traces
       – Instruction trace: records machine instructions
       – Call trace: records function calls
       – System call trace: records system calls
       – Software logs: important events
     • System trace
       – System call trace from all processes
       – Mainly resource usage, system & process interaction
  4. WinResMon
     • WinResMon: our trace recorder
     • Works on Windows
     • Types of events:
       – File: open, read, write, close, rename, …
       – Registry: open, get value, set value, delete, …
       – Network: connect, listen, send, receive, …
       – Process/thread: create, terminate
  5. Information (fields) in an Event
     • PID/TID: process/thread ID
     • Program name: path of the program’s EXE
     • User name/group: the process’s owner
     • Start/end time: event timing in CPU ticks
     • Operation type: e.g. file open
     • Parameter: type dependent, e.g.
       – file path, system call flags, registry path
       – IP address
     • Call stack trace: call stack in the user process
  6. Why Visualize System Traces
     • Software is complex
       – Interaction between modules and with other software
     • Software can be closed source, but its interaction is open
     • Humans are good at detecting
       – Repeated patterns
       – Anomalies
  7. What is DotPlot?
     [Figure: dot plot of Trace X against Trace Y; events are labelled A to E.]
  8. What is DotPlot?
     [Figure: the same dot plot of Trace X against Trace Y.]
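The figure on these two slides did not survive extraction, so here is a minimal sketch of the general dot plot idea: one trace along each axis and a dot wherever the two events match. The traces below are hypothetical (they are not the ones on the slide), and plain equality stands in for a matching rule.

```python
def dotplot(trace_x, trace_y, match=lambda a, b: a == b):
    """Return a matrix with one row per event in trace_y and one column
    per event in trace_x; a cell is True where the two events match."""
    return [[match(x, y) for x in trace_x] for y in trace_y]


def render(matrix):
    """Draw the matrix as text: '#' for a dot, '.' for an empty cell."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in matrix)


# Hypothetical traces, one letter per event type.
trace_x = list("ABCAB")
trace_y = list("BCA")
matrix = dotplot(trace_x, trace_y)
print(render(matrix))
```

Diagonal runs of dots correspond to subsequences that the two traces share, which is what makes repeated patterns easy to spot by eye.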
  9. An Example
     Visualization comparing MS PowerPoint, MS Word, OO Word, and OO PowerPoint.
  10. Elements of VDP
     [Figure: annotated screenshot with numbered regions.]
     • 1: Extended DotPlot
     • 2, 3: Axis Histogram
     • 4, 5: Barcode
  11. Extended DotPlot
     • Matching Rule
       – Defines whether two events match
       – By fields: e.g. “if PIDs and resource paths are the same”, “if program names are the same”
     • DP Coloring Rule
       – Defines the color of matched events
       – Traditional DP uses black only
       – Uses the RGB model on a black background, CMY on a white background
       – Uses regular expressions to specify events
       – E.g. “.*file_open.*” → blue, “.*reg_.*” → cyan
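The two rules on this slide can be sketched as small functions. This is an illustration only, not the tool's implementation: the event layout (a dict with `pid`, `program`, `operation`, `path`), the way an event is turned into text for the regular expression, and the fallback color are all assumptions. Only the two example patterns and their colors come from the slide.

```python
import re


def matching_rule(*fields):
    """Build a matching rule: two events match if all named fields are equal."""
    def match(a, b):
        return all(a[f] == b[f] for f in fields)
    return match


# The two example rules quoted on the slide.
same_pid_and_path = matching_rule("pid", "path")
same_program = matching_rule("program")

# Coloring rules: the first pattern that matches the event text wins.
COLOR_RULES = [
    (re.compile(r".*file_open.*"), "blue"),
    (re.compile(r".*reg_.*"), "cyan"),
]


def event_text(event):
    """Flatten an event into one line of text (hypothetical format)."""
    return "{program} {operation} {path}".format(**event)


def dot_color(event, rules=COLOR_RULES, default="black"):
    """Pick the color of a matched dot; 'black' as fallback is an assumption."""
    text = event_text(event)
    for pattern, color in rules:
        if pattern.fullmatch(text):
            return color
    return default


# Hypothetical events.
e1 = {"pid": 4, "program": "xcopy.exe", "operation": "file_open", "path": r"c:\src\a.bin"}
e2 = {"pid": 4, "program": "xcopy.exe", "operation": "reg_get_value", "path": r"HKLM\Software\Example"}
e3 = {"pid": 9, "program": "xcopy.exe", "operation": "file_open", "path": r"c:\src\a.bin"}
```

Swapping the matching rule changes which cells of the plot hold a dot, while the coloring rule only changes how an existing dot is drawn.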
  12. Event-ordered and Time-ordered
     • Each event takes a different amount of time
     • The meaning/unit of each axis
     [Figure: two panels, “Event-ordered” and “Time-ordered”.]
  13. Axis Histogram
     • Ticks mark unit time (e.g. 1 second)
     • Histogram
       – Event density (time-ordered)
       – Time spent (event-ordered)
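The two histogram readings can be sketched as follows, assuming each event carries hypothetical `start` and `end` fields in the same time unit. On a time-ordered axis each bar counts the events that start in one time bin; on an event-ordered axis each position is one event and the bar is the time that event took.

```python
def event_density(events, bin_width):
    """Time-ordered axis: number of events starting in each time bin."""
    if not events:
        return []
    first = min(e["start"] for e in events)
    last = max(e["start"] for e in events)
    counts = [0] * (int((last - first) // bin_width) + 1)
    for e in events:
        counts[int((e["start"] - first) // bin_width)] += 1
    return counts


def time_spent(events):
    """Event-ordered axis: duration of each event, in trace order."""
    return [e["end"] - e["start"] for e in events]


# Hypothetical events with integer timestamps.
events = [
    {"start": 0, "end": 1},
    {"start": 1, "end": 4},
    {"start": 5, "end": 6},
]
density = event_density(events, bin_width=2)
durations = time_spent(events)
```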
  14. Barcode
     • One dimensional
     • Highlights user-chosen events
       – E.g. file_open → red
     • One or more (e.g. three below)
     • Barcode coloring rules
  15. Example 1: File Copying
     • Self-comparison, event-ordered
     • xcopy copying 8 files: 1MB, 10KB, 10MB, 100KB, 1MB, 10KB, 10MB and 100KB
     • DP match: operation + parameter (pathname)
     • DP color: magenta → source; cyan → destination; black → other
     [Figure labels: File Operation, Source/Dst File Operation, Registry Operation.]
  16. File Size
     • File size is visible
     • The two 1MB and two 10MB files are shown
     • The two 10KB and two 100KB files are visible only when zoomed in
  17. Zooming in
     • DP color: magenta → source; cyan → destination; black → other
  18. A Surprise: Registry Operations
     • So many registry operations for a console application
     [Figure label: Registry Operation.]
  19. Another Surprise: DLLs
     • DLLs are files, but neither source nor destination
     • More time is spent on DLLs than on a 1MB file
     [Figure labels: File Operation, Source/Dst File Operation.]
  20. Example 2: Software Build
     • X: succeeded; Y: failed due to a missing .c file
     • DP match: program + operation + value (pathname)
     • DP color: black → any
     • Bar1 color: black → nmake.exe
     • Bar2 color: cyan → cl.exe; magenta → link.exe
     • Bar3 color: cyan → reading .c files; magenta → reading .h files
  21. Number of Executions
     • X: 4 compiles (cl.exe), 1 link (link.exe)
     • Y: 3 compiles, 0 links
     • Y: the third compile doesn’t read .c or .h files
     • Bar2 color: cyan → cl.exe; magenta → link.exe
     • Bar3 color: cyan → reading .c files; magenta → reading .h files
  22. Similarity & Difference
     • The two traces are similar
     • The Y (failed) trace terminates earlier, right before reading a .c file
  23. Different Matching Rule
     [Figure: two panels, “Operation Type” and “Program Name”.]
  24. Example 3: Two Idle Windows Machines
     • Time-ordered
     • 1 hour each
     • Different time
     • About 750K events each
  25. Anomaly & Repeated Pattern
     • Periodic pattern
     • Most events in R1
     • Most time in R2 alike
     • Easy to spot anomalies & regular patterns
     [Figure labels: regions R1 and R2.]
  26. Zoom In
     [Figure labels: regions R1 and R2.]
  27. R1: Windows Update
     • Similar events (darker area) are by Windows Auto Updater
     • More file operations, fewer registry operations
     • magenta → wuauclt.exe (Windows Update)
     [Figure labels: File Operation, Registry Operation.]
  28. [No text.]
  29. Visualizing Module Dependencies
     • The problem
       – There’s a vulnerability in X. Which software uses X?
       – Why does my software use X? I never call it.
       – Is it safe to uninstall X?
     • Software module
       – Windows DLLs
       – UNIX .so
       – Java classes, packages
  30. Examples of dependencies (1)
     • Binaries used by notepad
       – c:\windows\apppatch\acgenral.dll
       – c:\windows\system32\avgrsstx.dll
       – c:\windows\system32\imm32.dll
       – c:\windows\system32\lpk.dll
       – c:\windows\system32\msacm32.dll
       – c:\windows\system32\msctf.dll
       – c:\windows\system32\msctfime.ime
       – c:\windows\system32\shimeng.dll
       – c:\windows\system32\usp10.dll
       – c:\windows\system32\uxtheme.dll
       – c:\windows\system32\winmm.dll
       – c:\windows\system32\winspool.drv
       – c:\windows\winsxs\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.2600.5512_x-ww_35d4ce83\comctl32.dll
  31. Examples of dependencies (2)
     • Simple boot (only Windows installed)
       – DLLs: 154
       – EXEs: 10
       – Drivers: 1
       – IME: 1
     • Typical boot (Windows + applications)
       – DLLs: 274
       – EXEs: 15
       – Telephony/Modem: 6
       – Drivers: 3
       – ActiveX: 2
       – IME: 1
  32. Visualization (1)
     • Basic dependency graph
     • Graph is too dense
  33. Binary Dependency Visualization
     • Two types of nodes: EXE, DLL + etc.
     • Three types of directed edges
       1. EXE X launches another EXE Y
       2. EXE X loads a DLL Y
       3. A function in binary X calls a function in binary Y
     • How are binaries shared among programs?
       – EXE Dependency Graph
       – Only Type 1 and 2 edges
       – Group DLLs by loader
     • How do binaries interact?
       – DLL Dependency Graph
       – Only Type 2 and 3 edges
       – Group DLLs manually by functionality or software vendor
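The edge types and the two derived graphs on this slide can be sketched with a small data structure. The binary names are hypothetical, and "group DLLs by loader" is read here as "put DLLs loaded by exactly the same set of EXEs into one group", which is an assumption about what the slide means.

```python
from collections import defaultdict
from typing import NamedTuple

LAUNCH, LOAD, CALL = 1, 2, 3  # the three edge types listed on the slide


class Edge(NamedTuple):
    src: str
    dst: str
    kind: int


def exe_dependency_graph(edges):
    """EXE Dependency Graph: keep only Type 1 and Type 2 edges."""
    return {e for e in edges if e.kind in (LAUNCH, LOAD)}


def dll_dependency_graph(edges):
    """DLL Dependency Graph: keep only Type 2 and Type 3 edges."""
    return {e for e in edges if e.kind in (LOAD, CALL)}


def group_dlls_by_loader(edges):
    """Group DLLs that are loaded by exactly the same set of EXEs."""
    loaders = defaultdict(set)
    for e in edges:
        if e.kind == LOAD:
            loaders[e.dst].add(e.src)
    groups = defaultdict(set)
    for dll, exes in loaders.items():
        groups[frozenset(exes)].add(dll)
    return dict(groups)


# Hypothetical recorded edges.
EDGES = {
    Edge("a.exe", "b.exe", LAUNCH),
    Edge("a.exe", "x.dll", LOAD),
    Edge("b.exe", "x.dll", LOAD),
    Edge("b.exe", "y.dll", LOAD),
    Edge("x.dll", "y.dll", CALL),
}
```

Grouping collapses many DLL nodes into one node per loader set, which is one way a dense graph could be made readable.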
  34. Visualization (1)
     • Basic dependency graph
     • Graph is too dense
  35. A More Usable Visualization: EXE Dependency Graph
     • Grouped dependency graph
  36. Comparing Microsoft Word and Open Office Writer
  37. DLL Dependency Graph: actual binary usage
     • Some definitions:
       – An EXE-DLL dependency in a DLL Dependency Graph exists when there is a control transfer from code in executable x to code in DLL y. We say that x has an EXE-DLL dependency on y.
       – A DLL-DLL dependency in a DLL Dependency Graph exists when there is a control transfer from code in DLL x to code in DLL y. We say that x has a DLL-DLL dependency on y.
  38. wget: DLL dependency without grouping
  39. wget: DLL dependency grouped by functionality
  40. Examples of grouping
     By functionality (GIMP)
  41. Examples of grouping
     By software vendor (GIMP)
  42. Two Operations
     • Diff
       – Compare two graphs
         • E.g. from the same program but a different environment/input
         • E.g. from two related programs
       – Diff graphs G1 and G2 to get G3
     • Projection
       – Focus on a particular module X
       – Only show modules that call X or are called by X (recursive definition)
       – Project graph G1 on module M to get G2
       – Not a simple subgraph problem
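The slide does not say how G3 is built from G1 and G2. One plausible reading, sketched here as an assumption, is a set difference on edges: each edge is labelled as present only in G1, only in G2, or in both. The module names are hypothetical.

```python
def diff_graphs(g1, g2):
    """Label every edge by where it occurs. Graphs are sets of (src, dst)."""
    return {
        "only_in_g1": g1 - g2,
        "only_in_g2": g2 - g1,
        "common": g1 & g2,
    }


# Hypothetical graphs of the same program run with and without a plugin.
G1 = {("app.exe", "core.dll"), ("app.exe", "plugin.dll"), ("plugin.dll", "core.dll")}
G2 = {("app.exe", "core.dll")}
G3 = diff_graphs(G1, G2)
```

A renderer could then draw the three edge classes in different colors so that the extra dependencies stand out.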
  43. Diff of the DLL dependency graphs of Internet Explorer with and without Flash
  44. Projection of the DLL dependency graph of Internet Explorer on Flash
  45. Firefox using TortoiseSVN
  46. Questions?
  47. Visualizing binaries executed
     • The call graph is large
     • Group functions into images => DLL dependency graph
     • The DLL dependency graph is still large
     • Group DLLs by properties:
       – By functionality: graphics, audio, network…
       – By vendor: Microsoft, Adobe…
       – By path: C:\windows\system32\*.dll, D:\vmware\*.dll…
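Grouping by path can be sketched with glob patterns like the ones on this slide. The group names, the sample DLL paths and the fallback group are hypothetical. Both sides are lowercased before matching because Windows paths are case-insensitive.

```python
from fnmatch import fnmatchcase


def group_by_path(paths, patterns, default="other"):
    """Assign each path to the first group whose glob pattern matches it."""
    groups = {}
    for path in paths:
        lowered = path.lower()
        for name, pattern in patterns:
            if fnmatchcase(lowered, pattern.lower()):
                groups.setdefault(name, []).append(path)
                break
        else:
            # No pattern matched this path.
            groups.setdefault(default, []).append(path)
    return groups


# Patterns modelled on the slide's examples; the group names are made up.
PATTERNS = [
    ("system", r"C:\windows\system32\*.dll"),
    ("vmware", r"D:\vmware\*.dll"),
]

# Hypothetical paths.
DLLS = [
    r"c:\windows\system32\imm32.dll",
    r"D:\vmware\example.dll",
    r"C:\apps\tool\helper.dll",
]
GROUPS = group_by_path(DLLS, PATTERNS)
```

Note that `*` in `fnmatch` also matches path separators, so these patterns include DLLs in subdirectories.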
  48. Visualizing binaries executed (1)
     • Generate call tree, call graph, DLL dependency graph
     • PIN tool to collect the execution trace
       – Traces include call, return, thread, context and system call events
       – Call and return events record the stack pointer, PC and target address
     • Not trivial to maintain the call stack by tracking calls and returns
       – Non-returning functions (long jump)
       – Threads, fibers
       – Context
       – Kernel callbacks
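The slide names the difficulty but not the fix. One common approach, sketched here as an assumption rather than as the author's method, is a per-thread shadow stack that uses the recorded stack pointer to discard frames that never returned. The event tuples are a hypothetical format, the stack is assumed to grow downward, and a return is assumed to report the same stack pointer value as its matching call.

```python
from collections import defaultdict


def replay(events):
    """Rebuild per-thread call stacks from call/return events.

    Each event is ("call", tid, sp, target) or ("ret", tid, sp).
    """
    stacks = defaultdict(list)  # tid -> list of (sp_at_call, target)
    for event in events:
        kind, tid, sp = event[0], event[1], event[2]
        stack = stacks[tid]
        if kind == "call":
            stack.append((sp, event[3]))
        elif kind == "ret":
            # Frames below the current stack pointer were abandoned,
            # e.g. by a long jump, and will never see their own return.
            while stack and stack[-1][0] < sp:
                stack.pop()
            # Pop the frame that this return actually belongs to.
            if stack and stack[-1][0] == sp:
                stack.pop()
    return {tid: [target for _, target in stack] for tid, stack in stacks.items()}


# Thread 1: A calls B, B calls C, and control leaves C without a return;
# the next return seen belongs to B. Thread 2 is tracked independently.
EVENTS = [
    ("call", 1, 100, "A"),
    ("call", 1, 90, "B"),
    ("call", 1, 80, "C"),
    ("ret", 1, 90),
    ("call", 2, 500, "X"),
]
STACKS = replay(EVENTS)
```

Fibers, context switches and kernel callbacks would need extra handling that this sketch does not attempt.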
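  49. Projection
     Example program:
         /* Forward declarations so the file compiles as written. */
         void A(void);
         void B(int i);
         void C(void);
         void D(void);

         int main(void) { A(); B(1); return 0; }
         void A(void) { B(0); }                   /* A always takes B's else branch */
         void B(int i) { if (i) D(); else C(); }
         void C(void) {}
         void D(void) {}
     [Figure: “Full Graph” with nodes main, A, B, C, D; “Project on A” with nodes main, A, B, C.]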
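The example shows why projection is not a plain subgraph: D is reachable from A in the static call graph (A calls B, and B can call D), yet D is absent from the projection on A. One reading consistent with the figure, offered here as an assumption, is that projection keeps only the call paths that actually pass through the chosen module. The call paths below are the ones the example program produces when run.

```python
def edges_of(paths):
    """Collect caller -> callee edges from root-to-leaf call paths."""
    return {(a, b) for path in paths for a, b in zip(path, path[1:])}


def nodes_of(edges):
    """All modules that appear at either end of an edge."""
    return {n for edge in edges for n in edge}


def project(paths, module):
    """Keep only the call paths that pass through `module`."""
    return edges_of(p for p in paths if module in p)


# main calls A (which calls B(0), which calls C), then B(1), which calls D.
CALL_PATHS = [
    ["main", "A", "B", "C"],
    ["main", "B", "D"],
]
FULL = edges_of(CALL_PATHS)
PROJECTED = project(CALL_PATHS, "A")
```

Under this reading the projection on A contains main, A, B and C, which matches the node set in the slide's figure. Whether the figure also keeps the direct main to B edge cannot be recovered from the transcript.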