CSTalks-Visualizing Software Behavior-14Sep
Transcript

  • 1. Visualizing Software Behavior. Wu Yongzheng. 14/Sep/2011, NUS SoC CSTalks
  • 2. Problems
      • Software is complex
          – Large codebase
          – Interaction between components
          – Components from different vendors
          – Closed source, closed API
      • Why understand software?
          – As developer => fewer bugs
          – As administrator => diagnosis
          – Curiosity?
      • An execution trace contains software behavior information, but it is huge.
  • 3. Software Traces
      • Types of traces
          – Instruction trace: records machine instructions
          – Call trace: records function calls
          – System call trace: records system calls
          – Software logs: record important events
      • System trace
          – System call trace from all processes
          – Mainly resource usage, system & process interaction
  • 4. WinResMon
      • WinResMon: our trace recorder
      • Works on Windows
      • Types of events:
          – File: open, read, write, close, rename, …
          – Registry: open, get value, set value, delete, …
          – Network: connect, listen, send, receive, …
          – Process/thread: create, terminate
  • 5. Information (fields) in an Event
      • PID/TID: process/thread ID
      • Program name: path of the program’s EXE
      • User name/group: the process’s owner
      • Start/end time: event timing in CPU ticks
      • Operation type: e.g. file open
      • Parameter: type dependent, e.g. file path, system call flags, registry path, IP address
      • Call stack trace: call stack in the user process
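
The fields listed on this slide can be pictured as a single record per event. The sketch below is only an illustration: the class name, field names, and sample values are assumptions, not WinResMon's actual format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TraceEvent:
    """One recorded event. Field names are illustrative, not WinResMon's."""
    pid: int
    tid: int
    program: str       # path of the program's EXE
    user: str          # owner of the process
    start_ticks: int   # event start, in CPU ticks
    end_ticks: int     # event end, in CPU ticks
    operation: str     # e.g. "file_open"
    parameter: str     # type dependent: file path, registry path, IP address, ...
    call_stack: List[int] = field(default_factory=list)  # user-mode return addresses

    @property
    def duration(self) -> int:
        """Time the event took, in CPU ticks."""
        return self.end_ticks - self.start_ticks


# Hypothetical sample event.
ev = TraceEvent(1234, 1, r"c:\windows\notepad.exe", "alice",
                1000, 1450, "file_open", r"c:\temp\a.txt")
print(ev.operation, ev.duration)
```
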
  • 6. Why Visualize System Traces
      • Software is complex
          – Interaction between modules and with other software
      • Software can be closed source, but its interaction is open
      • Humans are good at detecting
          – Repeated patterns
          – Anomalies
  • 7. What is DotPlot?
      • [Figure: DotPlot of Trace X against Trace Y; event labels as extracted: Trace X E A C B E E E D C A C B C D E, Trace Y B C E]
  • 8. What is DotPlot?
      • [Figure: DotPlot of Trace X against Trace Y; event labels as extracted: Trace X E A C B E E E D C A C B C D E, Trace Y B C E]
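
A basic dot plot can be sketched in a few lines: cell (i, j) is marked when event i of trace X equals event j of trace Y. The two short traces below are made up for illustration and are not the ones drawn on the slide.

```python
def dotplot(trace_x, trace_y):
    """Return rows (one per trace_y event) of booleans: True where events are equal."""
    return [[ex == ey for ex in trace_x] for ey in trace_y]


def render(matrix):
    """Draw the matrix as text: '#' for a match, '.' otherwise."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in matrix)


# Hypothetical traces; each letter stands for one event.
x = list("ABCABC")
y = list("BCA")
m = dotplot(x, y)
print(render(m))
```

Diagonal runs of marks correspond to subsequences that occur in both traces, which is what makes repeated patterns easy to spot by eye.
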
  • 9. An Example
      • Visualization comparing MS PowerPoint, MS Word, OO Word, and OO PowerPoint
  • 10. Elements of VDP
      • 1: Extended DotPlot
      • 2, 3: Axis Histogram
      • 4, 5: Barcode
  • 11. Extended DotPlot
      • Matching rule
          – Defines whether two events match
          – By fields: e.g. “if PIDs and resource paths are the same”, “if program names are the same”
      • DP coloring rule
          – Defines the color for matched events
          – Traditional DP uses black only
          – Use the RGB model on a black background, CMY on a white background
          – Use regular expressions to specify events
          – E.g. “.*file_open.*” → blue, “.*reg_.*” → cyan
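
The two rules can be sketched as follows. This is a minimal illustration under several assumptions: events are plain dicts, the regular expressions are applied to a made-up text form of the event, and a matched cell takes the color of the X-axis event. None of these details are stated in the talk.

```python
import re


def make_matcher(*fields):
    """Matching rule: two events match when all the named fields are equal."""
    def match(a, b):
        return all(a[f] == b[f] for f in fields)
    return match


# Coloring rules in priority order; the first pattern that matches wins.
COLOR_RULES = [
    (re.compile(r".*file_open.*"), "blue"),
    (re.compile(r".*reg_.*"), "cyan"),
]


def event_text(ev):
    """Assumed text form of an event that the regular expressions run against."""
    return f'{ev["program"]} {ev["operation"]} {ev["path"]}'


def color_of(ev, default="black"):
    text = event_text(ev)
    for pattern, color in COLOR_RULES:
        if pattern.fullmatch(text):
            return color
    return default


def colored_dotplot(xs, ys, match):
    """Cell holds the X event's color when the pair matches, else None."""
    return [[color_of(ex) if match(ex, ey) else None for ex in xs] for ey in ys]


# Hypothetical events.
a = {"pid": 1, "program": "xcopy.exe", "operation": "file_open", "path": "a.txt"}
b = {"pid": 1, "program": "xcopy.exe", "operation": "reg_get_value", "path": "HKLM\\Software"}
c = {"pid": 2, "program": "cmd.exe", "operation": "file_open", "path": "a.txt"}

same_pid_and_path = make_matcher("pid", "path")
same_program = make_matcher("program")
plot = colored_dotplot([a, b, c], [a, c], same_program)
print(plot)
```
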
  • 12. Event-ordered and Time-ordered
      • Each event takes a different amount of time
      • The meaning/unit of each axis depends on the ordering
      • [Figure panels: Event-ordered, Time-ordered]
  • 13. Axis Histogram
      • Ticks mark unit time (e.g. 1 second)
      • Histogram
          – Event density (time-ordered)
          – Time spent (event-ordered)
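
The two histogram variants can be sketched like this, assuming each event is reduced to a (start, end) pair in ticks and that the axis is divided into equal bins. The bin sizes and sample events are made up.

```python
def event_density(events, bin_ticks):
    """Time-ordered axis: number of events that start in each time bin."""
    if not events:
        return []
    t0 = min(start for start, _ in events)
    t1 = max(start for start, _ in events)
    bins = [0] * ((t1 - t0) // bin_ticks + 1)
    for start, _ in events:
        bins[(start - t0) // bin_ticks] += 1
    return bins


def time_spent(events, events_per_bin):
    """Event-ordered axis: total duration of each consecutive group of events."""
    return [
        sum(end - start for start, end in events[i:i + events_per_bin])
        for i in range(0, len(events), events_per_bin)
    ]


# Hypothetical (start, end) pairs in ticks.
events = [(0, 5), (2, 3), (10, 40), (11, 12), (25, 26)]
print(event_density(events, 10))
print(time_spent(events, 2))
```
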
  • 14. Barcode
      • One-dimensional
      • Highlights user-chosen events
      • E.g. file_open → red
      • One or more barcodes (e.g. three below)
      • Barcode coloring rules
  • 15. Example 1: File Copying
      • Self-comparison, event-ordered
      • xcopy copying 8 files: 1MB, 10KB, 10MB, 100KB, 1MB, 10KB, 10MB and 100KB
      • DP match: operation + parameter (pathname)
      • DP color: magenta → source; cyan → destination; black → other
      • [Figure legend: File Operation, Source/Dst File Operation, Registry Operation]
  • 16. File Size
      • File size is visible
      • The two 1MB and two 10MB files are shown
      • The two 10KB and two 100KB files are visible only when zoomed in
  • 17. Zooming In
      • DP color: magenta → source; cyan → destination; black → other
  • 18. A Surprise: Registry Operations
      • So many registry operations for a console application
      • [Figure legend: Registry Operation]
  • 19. Another Surprise: DLLs
      • DLLs: file operations, but on neither the source nor the destination
      • More time is spent on DLLs than on a 1MB file
      • [Figure legend: File Operation, Source/Dst File Operation]
  • 20. Example 2: Software Build
      • X: succeeded; Y: failed due to a missing .c file
      • DP match: program + operation + value (pathname)
      • DP color: black → any
      • Bar1 color: black → nmake.exe
      • Bar2 color: cyan → cl.exe; magenta → link.exe
      • Bar3 color: cyan → reading .c files; magenta → reading .h files
  • 21. Number of Executions
      • X: 4 compiles (cl.exe), 1 link (link.exe)
      • Y: 3 compiles, 0 links
      • Y: the third compile doesn’t read any .c or .h file
      • Bar2 color: cyan → cl.exe; magenta → link.exe
      • Bar3 color: cyan → reading .c files; magenta → reading .h files
  • 22. Similarity & Difference
      • The two traces are similar
      • The Y (failed) trace terminates earlier, right before reading the .c file
  • 23. Different Matching Rule
      • [Figure panels: Operation Type, Program Name]
  • 24. Example 3: Two Idle Windows Machines
      • Time-ordered
      • 1 hour each
      • Different times
      • About 750K events each
  • 25. Anomaly & Repeated Pattern
      • Periodic pattern
      • Most events in R1
      • Most time in R2 alike
      • Easily spot anomalies & regular patterns
      • [Figure regions: R1, R2]
  • 26. Zoom In
      • [Figure regions: R1, R2]
  • 27. R1: Windows Update
      • Similar events (darker area) are by Windows Auto Updater
      • More file operations, fewer registry operations
      • Color: magenta → wuauclt.exe (Windows Update)
      • [Figure legend: File Operation, Registry Operation]
  • 28.
  • 29. Visualizing Module Dependencies
      • The problem
          – There’s a vulnerability in X. Which software uses X?
          – Why does my software use X? I never call it.
          – Is it safe to uninstall X?
      • Software modules
          – Windows DLLs
          – UNIX .so files
          – Java classes, packages
  • 30. Examples of dependencies (1)
      • Binaries used by notepad
          – c:\windows\apppatch\acgenral.dll
          – c:\windows\system32\avgrsstx.dll
          – c:\windows\system32\imm32.dll
          – c:\windows\system32\lpk.dll
          – c:\windows\system32\msacm32.dll
          – c:\windows\system32\msctf.dll
          – c:\windows\system32\msctfime.ime
          – c:\windows\system32\shimeng.dll
          – c:\windows\system32\usp10.dll
          – c:\windows\system32\uxtheme.dll
          – c:\windows\system32\winmm.dll
          – c:\windows\system32\winspool.drv
          – c:\windows\winsxs\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.2600.5512_x-ww_35d4ce83\comctl32.dll
  • 31. Examples of dependencies (2)
      • Simple boot (only Windows installed)
          – DLLs: 154
          – EXEs: 10
          – Drivers: 1
          – IME: 1
      • Typical boot (Windows + applications)
          – DLLs: 274
          – EXEs: 15
          – Telephony/Modem: 6
          – Drivers: 3
          – ActiveX: 2
          – IME: 1
  • 32. Visualization (1)
      • Basic dependency graph
      • Graph is too dense
  • 33. Binary Dependency Visualization
      • Two types of nodes: EXE, DLL + etc.
      • Three types of directed edges
          1. EXE X launches another EXE Y
          2. EXE X loads a DLL Y
          3. A function in binary X calls a function in binary Y
      • How are binaries shared among programs?
          – EXE Dependency Graph
          – Only Type 1 and 2 edges
          – Group DLLs by loader
      • How do binaries interact?
          – DLL Dependency Graph
          – Only Type 2 and 3 edges
          – Group DLLs manually by functionality or software vendor
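
"Group DLLs by loader" can be read as merging all DLLs that are loaded by exactly the same set of EXEs into one node. The sketch below implements that reading; the interpretation and all binary names are assumptions made for illustration.

```python
from collections import defaultdict


def group_dlls_by_loader(load_edges):
    """load_edges: iterable of (exe, dll) Type 2 edges.

    Returns {frozenset of loading EXEs: sorted list of DLLs loaded by exactly that set}.
    """
    loaders = defaultdict(set)
    for exe, dll in load_edges:
        loaders[dll].add(exe)
    groups = defaultdict(list)
    for dll, exes in loaders.items():
        groups[frozenset(exes)].append(dll)
    return {exes: sorted(dlls) for exes, dlls in groups.items()}


# Hypothetical load edges for two programs.
edges = [
    ("a.exe", "shared1.dll"), ("b.exe", "shared1.dll"),
    ("a.exe", "shared2.dll"), ("b.exe", "shared2.dll"),
    ("a.exe", "only_a.dll"),
    ("b.exe", "only_b.dll"),
]
groups = group_dlls_by_loader(edges)
for exes, dlls in groups.items():
    print(sorted(exes), "->", dlls)
```

With this grouping, six DLL-level edges collapse into three group nodes, which is the kind of reduction a dense graph needs.
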
  • 34. Visualization (1)
      • Basic dependency graph
      • Graph is too dense
  • 35. A more usable Visualization: EXE Dependency Graph
      • Grouped dependency graph
  • 36. Comparing Microsoft Word and OpenOffice Writer
  • 37. DLL Dependency Graph: actual binary usage
      • Some definitions:
          – An EXE-DLL dependency in a DLL Dependency Graph exists when there is a control transfer from code in executable x to code in DLL y. We say that x has an EXE-DLL dependency on y.
          – A DLL-DLL dependency in a DLL Dependency Graph exists when there is a control transfer from code in DLL x to code in DLL y. We say that x has a DLL-DLL dependency on y.
  • 38. wget: DLL dependency without grouping
  • 39. wget: DLL dependency grouped by functionality
  • 40. Examples of grouping: by functionality (GIMP)
  • 41. Examples of grouping: by software vendor (GIMP)
  • 42. Two Operations
      • Diff
          – Compare two graphs
              • E.g. from the same program but a different environment/input
              • E.g. from two related programs
          – Diff graphs G1 and G2 to get G3
      • Projection
          – Focus on a particular module X
          – Only show modules that call X or are called by X (recursive definition)
          – Project graph G1 on module M to get G2
          – Not a simple subgraph problem
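
The talk does not spell out how G3 is built. One simple reading of Diff, assumed here, is to take the union of both edge sets and tag every edge by where it appears. The module names are hypothetical.

```python
def diff_graphs(g1, g2):
    """g1, g2: sets of (caller, callee) edges. Returns {edge: tag} for the union."""
    g3 = {}
    for edge in g1 | g2:
        if edge in g1 and edge in g2:
            g3[edge] = "both"
        elif edge in g1:
            g3[edge] = "only_g1"
        else:
            g3[edge] = "only_g2"
    return g3


# Hypothetical dependency graphs of one program with and without a plugin.
with_plugin = {
    ("browser.exe", "core.dll"),
    ("browser.exe", "plugin.dll"),
    ("plugin.dll", "core.dll"),
}
without_plugin = {("browser.exe", "core.dll")}

g3 = diff_graphs(with_plugin, without_plugin)
for edge, tag in sorted(g3.items()):
    print(edge, tag)
```

A renderer could then draw "both" edges in a neutral color and the two "only" classes in contrasting colors.
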
  • 43. Diff of the DLL dependency graph of Internet Explorer with Flash and without
  • 44. Projection of the DLL dependency graph of Internet Explorer on Flash
  • 45. Firefox using TortoiseSVN
  • 46. Questions?
  • 47. Visualizing binaries executed
      • The call graph is large.
      • Group functions into images => DLL dependency graph.
      • The DLL dependency graph is still large.
      • Group DLLs by properties:
          – By functionality: graphics, audio, network…
          – By vendor: Microsoft, Adobe…
          – By path: C:\windows\system32\*.dll, D:\vmware\*.dll…
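
Grouping by path can be sketched with shell-style wildcard patterns. The group names, the rule order, and the sample DLL under D:\vmware are assumptions; note that `*` here also matches backslashes, so a pattern covers subdirectories as well.

```python
from fnmatch import fnmatchcase

# (group name, lowercase wildcard pattern); the first matching rule wins.
GROUP_RULES = [
    ("system", r"c:\windows\system32\*.dll"),
    ("vmware", r"d:\vmware\*.dll"),
]


def group_of(path, rules=GROUP_RULES, default="other"):
    """Return the name of the first group whose pattern matches the path."""
    lowered = path.lower()  # Windows paths are case-insensitive
    for name, pattern in rules:
        if fnmatchcase(lowered, pattern):
            return name
    return default


print(group_of(r"C:\Windows\System32\imm32.dll"))
print(group_of(r"D:\vmware\example.dll"))   # hypothetical file name
print(group_of(r"c:\windows\notepad.exe"))
```
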
  • 48. Visualizing binaries executed (1)
      • Generate call tree, call graph, DLL dependency graph
      • PIN tool to collect the execution trace
          – The trace includes call, return, thread, context, and system call events
          – Call and return events record the stack pointer, PC and target address
      • It is not trivial to maintain the call stack by tracking calls and returns
          – Non-returning functions (long jump)
          – Threads, fibers
          – Context
          – Kernel callbacks
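
The talk does not say how the call stack is kept consistent. One common heuristic, sketched here as an assumption, uses the recorded stack pointer: on a downward-growing stack, any frame entered at or below the current stack pointer can no longer be live, so it is discarded even if no return event was seen (as after a long jump). The event format is made up, and threads and fibers would each need their own stack.

```python
def replay(events):
    """Rebuild the call stack from ("call", sp, target) and ("ret", sp) events.

    sp is the stack pointer in the caller at the call, or after the return.
    Returns the list of function names on the stack after each event.
    """
    stack = []    # (sp_at_call, target); sp_at_call strictly decreases toward the top
    history = []
    for event in events:
        sp = event[1]
        # Frames entered at or below the current stack pointer are dead:
        # either they returned, or they were skipped by a long jump.
        while stack and stack[-1][0] <= sp:
            stack.pop()
        if event[0] == "call":
            stack.append((sp, event[2]))
        history.append([target for _, target in stack])
    return history


# Hypothetical trace: main calls f, f calls g, g long-jumps back into main
# (no return events), then main calls h, which returns normally.
events = [
    ("call", 1000, "main"),
    ("call", 900, "f"),
    ("call", 800, "g"),
    ("call", 900, "h"),
    ("ret", 900),
]
history = replay(events)
for snapshot in history:
    print(snapshot)
```
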
  • 49. Projection
        void main (void) { A(); B(1); }
        void A (void) { B(0); }
        void B (int i) { if (i) D(); else C(); }
        void C (void) {}
        void D (void) {}
      • [Figure: Full Graph, with nodes main, A, B, C, D]
      • [Figure: Project on A, with nodes main, A, B, C]
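
The example on this slide shows why projection is not a plain subgraph: in the full graph B calls D, but never while A is on the call stack. A sketch that reproduces the node set of the "Project on A" figure works on call paths instead of graph edges. Keeping only the paths that pass through the chosen module is one plausible reading of the operation, not necessarily the author's exact definition; the trace below is a hand-written model of the slide's program.

```python
def call_paths(trace):
    """trace: ("call", name) / ("ret",) events. Returns the call stack at every call."""
    stack, paths = [], []
    for event in trace:
        if event[0] == "call":
            stack.append(event[1])
            paths.append(tuple(stack))
        else:
            stack.pop()
    return paths


def edges_of(paths):
    """Caller -> callee edges: the last two entries of every path."""
    return {(p[-2], p[-1]) for p in paths if len(p) >= 2}


def project(paths, module):
    """Keep only the calls made while `module` is on the call stack."""
    return edges_of([p for p in paths if module in p])


# Execution of the slide's program: main calls A (A calls B(0), which calls C),
# then main calls B(1), which calls D.
trace = [
    ("call", "main"),
    ("call", "A"), ("call", "B"), ("call", "C"), ("ret",), ("ret",), ("ret",),
    ("call", "B"), ("call", "D"), ("ret",), ("ret",),
    ("ret",),
]
paths = call_paths(trace)
full = edges_of(paths)
on_a = project(paths, "A")
print(sorted(full))
print(sorted(on_a))
```

A naive traversal of the full graph from A would reach D through B; the path-based version leaves D out, as in the figure.
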