Semantics-Aware Trace Analysis [PLDI 2009]
We present a novel dynamic program analysis that builds a semantic view of program executions. These views reflect program abstractions and aspects; however, views are not simply projections of execution traces, but are linked to each other to capture semantic interactions among abstractions at different levels of granularity in a scalable manner.

We describe our approach in the context of Java and demonstrate its utility to improve regression analysis. We first formalize a subset of Java and a grammar for traces generated at program execution. We then introduce several types of views used to analyze regression bugs along with a novel, scalable technique for semantic differencing of traces from different versions of the same program. Benchmark results on large open-source Java programs demonstrate that semantic-aware trace differencing can identify precise and useful details about the underlying cause for a regression, even in programs that use reflection, multithreading, or dynamic code generation, features that typically confound other analysis techniques.

Transcript

  • 1. Kevin Hoffman, Patrick Eugster, Suresh Jagannathan
  • 2. Roadmap
      - Motivation
      - Prior Approaches
      - Semantics-Aware Trace Analysis (SATA)
      - Applying SATA to Regression Analysis
      - Evaluation
      - Conclusions
  • 3. Motivation
      - Apache XalanJ 2.4.1 works:
          java … xslt.Process -xsl case1.xsl -in test.xml
          java … xslt.Process -xsl case2.xsl -in test.xml
          java … xslt.Process -xsl case3.xsl -in test.xml
      - Upgrade to 2.5.1, now it's broken!
          java … xslt.Process -xsl case1.xsl -in test.xml
          java … xslt.Process -xsl case2.xsl -in test.xml
          java … xslt.Process -xsl case3.xsl -in test.xml
  • 4. How to find the cause?
      - Manual inspection is hard
          - 12 months of development from 2.4.1 to 2.5.1
          - 79K new or changed lines of code
          - 97 new features and bugfixes
  • 5. How to find the cause?
      - Debugging is hard
          - Separation of cause and effect
              ○ e.g. in XalanJ, bug in XSLT compiler
          - Complex web of interacting components
          - Debugging requires in-depth domain-specific knowledge (limited resource)
  • 6. Roadmap (same outline as slide 2)
  • 7. Challenges: Static Analysis
      - Dynamically generated code
      - Advanced language features
          - Dynamic dispatch (e.g., polymorphism)
          - Reflection
          - Advanced aspect-oriented language features
  • 8. Challenges: Dynamic Analysis
      - Dynamic program slicing
          - Slices are still quite large (e.g. 1000s of events)
      - Control-flow similarity metrics
      - State-space exploration / refinement
  • 9. Execution Indexing
      - Use structure/state of execution to compute an "index" at each execution point
      - Find correlations between indices for profiling, debugging, execution comparison
  • 10. Roadmap (same outline as slide 2)
  • 11. Semantic Trace Views
      Execution Trace:
          --> LOG-1.addMsg('Handling..')
              ...
          <-- LOG-1.addMsg(..)
          --> SP-1.setRequestType('text/html')
              --> STR-1.equals('text/html')
              <-- STR-1.equals(..) ret=true
              --> NUM-1.new(32, 127)
                  set NUM-1._minCharRange = 32
                  set NUM-1._maxCharRange = 127
              <-- NUM-1.new(..)
              set SP-1._binConv = NUM-1
              ...
              --> LOG-1.addMsg('Set req..')
                  ...
              <-- LOG-1.addMsg(..)
          <-- SP-1.setRequestType(..)
      Organize execution traces into "views"
  • 12. Semantic Trace Views
      Execution Trace (Thread View): same trace as slide 11
      Thread views based on thread ID
  • 13. Semantic Trace Views
      Execution Trace (and Thread View): same trace as slide 11
      Method View for SP.setRequestType:
          --> STR-1.equals('text/html')
          <-- STR-1.equals(..) ret=true
          --> NUM-1.new(32, 127)
          <-- NUM-1.new(..)
          set SP-1._binConv = NUM-1
          ...
          --> LOG-1.addMsg('Set req..')
          <-- LOG-1.addMsg(..)
      Method views based on top of call stack
  • 14. Semantic Trace Views
      Execution Trace (and Thread View): same trace as slide 11
      Method View for NUM.new:
          set NUM-1._minCharRange = 32
          set NUM-1._maxCharRange = 127
      Method View for LOG.addMsg:
          ...
          ...
      Method views based on top of call stack
  • 15. Semantic Trace Views
      Execution Trace (and Thread View): same trace as slide 11
      Active Object View for NUM-1.new:
          set NUM-1._minCharRange = 32
          set NUM-1._maxCharRange = 127
      Active Object View for LOG-1.addMsg:
          ...
          ...
      Active object views based on top of call stack
  • 16. Semantic Trace Views
      Execution Trace (and Thread View): same trace as slide 11
      Target Object View for NUM-1:
          --> NUM-1.new(32, 127)
          set NUM-1._minCharRange = 32
          set NUM-1._maxCharRange = 127
          <-- NUM-1.new(..)
      Target Object View for LOG-1:
          --> LOG-1.addMsg('Handling..')
          <-- LOG-1.addMsg(..)
          --> LOG-1.addMsg('Set req..')
          <-- LOG-1.addMsg(..)
      Target object views
  • 17. Semantic Trace Views
      Execution Trace (and Thread View): same trace as slide 11
      Target Object View for NUM-1:
          --> NUM-1.new(32, 127)
          set NUM-1._minCharRange = 32
          set NUM-1._maxCharRange = 127
          <-- NUM-1.new(..)
      Method View for SP.setRequestType:
          --> STR-1.equals('text/html')
          <-- STR-1.equals(..) ret=true
          --> NUM-1.new(32, 127)
          <-- NUM-1.new(..)
          set SP-1._binConv = NUM-1
          ...
          --> LOG-1.addMsg('Set req..')
          <-- LOG-1.addMsg(..)
      Target Object View for LOG-1:
          --> LOG-1.addMsg('Handling..')
          <-- LOG-1.addMsg(..)
          --> LOG-1.addMsg('Set req..')
          <-- LOG-1.addMsg(..)
      Views are linked allowing for multilevel analysis
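The view slides can be made concrete with a small sketch. The Python below is not RPrism's implementation. It assumes a hypothetical event format `(thread_id, kind, target_object, member)` and a helper named `build_views`, neither of which comes from the talk, and it groups the example trace into thread views, method views (keyed by the method on top of the call stack), and target object views. Active object views are omitted for brevity.

```python
from collections import defaultdict

# Hypothetical event format (an assumption, not RPrism's trace grammar):
# (thread_id, kind, target_object, member), kind in {"call", "ret", "set"}.
TRACE = [
    (1, "call", "LOG-1", "addMsg"),
    (1, "ret", "LOG-1", "addMsg"),
    (1, "call", "SP-1", "setRequestType"),
    (1, "call", "STR-1", "equals"),
    (1, "ret", "STR-1", "equals"),
    (1, "call", "NUM-1", "new"),
    (1, "set", "NUM-1", "_minCharRange"),
    (1, "set", "NUM-1", "_maxCharRange"),
    (1, "ret", "NUM-1", "new"),
    (1, "set", "SP-1", "_binConv"),
    (1, "call", "LOG-1", "addMsg"),
    (1, "ret", "LOG-1", "addMsg"),
    (1, "ret", "SP-1", "setRequestType"),
]


def build_views(trace):
    """Group trace events into thread, method, and target object views."""
    thread_views = defaultdict(list)   # thread id -> events
    method_views = defaultdict(list)   # "Class.method" on top of stack -> events
    target_views = defaultdict(list)   # target object id -> events
    stacks = defaultdict(list)         # per-thread call stack of "Class.method"

    for event in trace:
        tid, kind, obj, member = event
        thread_views[tid].append(event)
        target_views[obj].append(event)

        stack = stacks[tid]
        # A return belongs to the caller's view, so pop the callee first.
        if kind == "ret" and stack:
            stack.pop()
        if stack:
            method_views[stack[-1]].append(event)
        # A call belongs to the caller's view, so push the callee afterwards.
        if kind == "call":
            class_name = obj.rsplit("-", 1)[0]  # "NUM-1" -> "NUM"
            stack.append(class_name + "." + member)

    return thread_views, method_views, target_views


thread_views, method_views, target_views = build_views(TRACE)
for event in method_views["NUM.new"]:
    print(event)
```

Under these assumptions the method view for `NUM.new` holds only the two field writes, and the target object view for `NUM-1` additionally holds the constructor call and return, which matches the views shown on the slides.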
  • 18. Roadmap (same outline as slide 2)
  • 19. What if we just used diff?
      - Collect dynamic traces:
          2.4.1: sssdyntracer … xslt.Process -xsl case2.xsl -in test.xml
          2.5.1: sssdyntracer … xslt.Process -xsl case2.xsl -in test.xml
      - Traces are about 48K entries
  • 20. What if we just used diff?
      - Collect dynamic traces:
          2.4.1: sssdyntracer … xslt.Process -xsl case2.xsl -in test.xml
          2.5.1: sssdyntracer … xslt.Process -xsl case2.xsl -in test.xml
      - Traces are about 48K entries
      - Run "diff" tool on traces:
          - Requires 25 minutes on a 1.8 GHz x64 CPU
          - Requires 27 GB of RAM
          - Produces 1594 differences (3.3% of trace)
  • 21. Challenges of diff / LCS
      - (Figure: Old and New sequences)
      - diff is based on the LCS algorithm:
          - Intractable on large traces: Ω(n²)
          - Can't detect moved sequences
          - Is not semantic-aware
      - diff produces too many differences
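To see why plain LCS struggles with moved sequences, here is a minimal sketch of the textbook dynamic-programming LCS. It is not the diff tool measured in the talk, and the function name `lcs` is chosen here for illustration. It is applied to the Old/New example strings from the later differencing slides, read as `CBDHXYFEFZ` and `CADFEFXYZ`. The table has one cell per pair of entries, which is where the quadratic time and memory cost comes from.

```python
def lcs(a, b):
    """Return one longest common subsequence of a and b as a list."""
    n, m = len(a), len(b)
    # table[i][j] = LCS length of a[i:] and b[j:]; (n+1) x (m+1) cells.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start to recover one optimal subsequence.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


old, new = "CBDHXYFEFZ", "CADFEFXYZ"
common = "".join(lcs(old, new))
print(common)  # CDFEFZ
```

Because `XY` and `FEF` swap order between the two strings, the LCS can keep only one of them. It keeps `FEF`, so `X` and `Y` are reported as deleted from the old trace and inserted into the new one rather than as moved.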
  • 22. Leveraging Semantic Views
      - Use secondary views (method/object) to find correlations in primary view (thread)
          - Robust against reorderings in other views
          - Correlations are semantically sound
      - Apply LCS/diff over fixed-sized windows in primary view to find "best overall correlation" in primary view
  • 23. Recall: What LCS would produce (figure: Old and New sequences)
  • 24. View-based Semantic Differencing
      Main view: Old: CBDHXYFEFZ; New: CADFEFXYZ
  • 25. View-based Semantic Differencing
      Main view: Old: CBDHXYFEFZ; New: CADFEFXYZ
      Secondary view: Old: DHXYZ; New: DXYZ
      View construction (only one of many secondary views displayed here)
  • 26-27. View-based Semantic Differencing (same diagram): Lock-step scanning of main view
  • 28. View-based Semantic Differencing (same diagram): Discovery of correlating secondary views
  • 29-33. View-based Semantic Differencing (same diagram): Exploration of correlating secondary views
  • 34-35. View-based Semantic Differencing (same diagram): Lock-step scanning of main view
  • 36. View-based Semantic Differencing (same diagram): Lock-step scanning of main view; exploration of secondary views
  • 37-38. View-based Semantic Differencing (same diagram): Apply LCS over fixed-size window in main view to find the next correlation
  • 39-41. View-based Semantic Differencing (same diagram): Lock-step scanning of main view
  • 42. View-based Semantic Differencing (same diagram): Apply LCS over fixed-size window in main view to find the next correlation
  • 43. View-based Semantic Differencing (same diagram): Lock-step scanning of main view
  • 44. View-based Semantic Differencing (same diagram): View-based differencing identified moved sequences properly
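The following sketch illustrates only the intuition behind the animation, not the differencing algorithm presented in the talk. It assumes the main and secondary views from the slides are given as strings of unique labels, and it uses a hypothetical helper `lcs`. Entries that a main-view LCS reports as differences but that still line up in a secondary view are treated as moved rather than changed.

```python
def lcs(a, b):
    """Textbook LCS; returns one longest common subsequence as a list."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    out, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i, j = i + 1, j + 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


# Views copied from the slides (the old main view read as CBDHXYFEFZ).
old_main, new_main = "CBDHXYFEFZ", "CADFEFXYZ"
old_secondary, new_secondary = "DHXYZ", "DXYZ"

main_common = lcs(old_main, new_main)                 # what plain LCS keeps
secondary_common = lcs(old_secondary, new_secondary)  # correlation in the view

# Entries the secondary view correlates although main-view LCS dropped them.
moved = [e for e in secondary_common if e not in main_common]
print(moved)  # ['X', 'Y']
```

In this toy setting the secondary view recovers `X` and `Y`, which main-view LCS alone counted as one deletion and one insertion each. The slides describe a richer procedure (lock-step scanning, exploring correlated secondary views, and LCS over fixed-size windows), which this sketch does not attempt to reproduce.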
  • 45. View-Based Differencing vs. LCS
      - Collect dynamic traces:
          2.4.1: sssdyntracer … xslt.Process -xsl case2.xsl -in test.xml
          2.5.1: sssdyntracer … xslt.Process -xsl case2.xsl -in test.xml
      - Traces are about 48K entries
      - Run view-based differencing tool on traces:
          - Requires 0.3 minutes instead of 25 minutes
          - Requires 0.1 GB instead of 27 GB of RAM
          - Produces 598 differences (1.2% of trace)
              ○ vs 1594 differences (3.3% of trace) for LCS
  • 46. Regression Analysis Process (flow diagram; components as labeled)
      - Old Program, New Program
      - AspectJ Load-time Weaver, Tracing Aspects
      - Old Program w/ Instrumentation, New Program w/ Instrumentation
      - Regressing Test Case(s), Working (but similar) Test Case
      - Traces for 4 Cases
      - View Trace Differencing
      - RPrism Analysis Algorithm
      - Likely Regression Causes
  • 47. RPrism Analysis Algorithm
      - Suspected differences set: Old Program, Regressing Test Case vs. New Program, Regressing Test Case
      - Expected differences set: Old Program, Working Test Case vs. New Program, Working Test Case
      - Regression differences set: New Program, Working Test Case vs. New Program, Regressing Test Case
  • 48. RPrism Analysis Algorithm (Venn diagram: Results shown as a region of the Suspected, Expected, and Regression differences sets)
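The exact set expression is not spelled out in this transcript. The sketch below assumes, based on the Venn slide and the backup slide on regression cause analysis, that the results are the suspected differences (set A) minus the expected differences (set B), optionally intersected with the regression differences (set C). The function name and the difference identifiers are hypothetical.

```python
def likely_regression_causes(suspected, expected, regression=None):
    """Assumed combination of the three difference sets: (A - B), then & C.

    suspected  (A): old vs. new program on the regressing test case
    expected   (B): old vs. new program on the working test case
    regression (C): new program, working vs. regressing test case
    Passing regression=None skips the intersection, since intersecting
    with C can introduce false negatives (e.g. a cause that is removed code).
    """
    candidates = set(suspected) - set(expected)
    if regression is not None:
        candidates &= set(regression)
    return candidates


# Made-up difference identifiers, for illustration only.
A = {"d1", "d2", "d3", "d4"}
B = {"d2", "d5"}
C = {"d1", "d3", "d6"}

print(sorted(likely_regression_causes(A, B, C)))  # ['d1', 'd3']
print(sorted(likely_regression_causes(A, B)))     # ['d1', 'd3', 'd4']
```

Keeping the intersection optional reflects the trade-off noted on the backup slide: using set C trims false positives but can hide the cause in some regressions.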
  • 49. Roadmap (same outline as slide 2)
  • 50. 4 Regressions on 3 Projects
      - Daikon
          - Dynamic invariant detector from MIT
          - Used as a test subject in 11 other publications
      - Apache XalanJ
          - Implements XML XPath and XSLT
          - Interprets XSLT or compiles XSLT to Java bytecode
          - Used in Sun JDK to implement javax.xml.* classes
      - Apache Derby (720 KLOC)
          - Embedded or client/server relational DB
          - AKA Sun Java DB, included in JDK 6
  • 51. Daikon Regression
      - About Daikon
          - 169 KLOC, 1100 classes
          - Dynamic invariant detector from MIT
          - Used as a test subject in 11 other publications
      - About the Regression
          - Regression first studied by JUnit/CIA [FSE '06]
              ○ 1 week of differences
          - Execution traces about 15K entries in length
  • 52. Daikon Regression
      - 42 differences before, 3 after analysis
      - Same accuracy as LCS
      - 12.9x speedup
      - 12.1 times less memory
  • 53. XalanJ-1725 Regression
      - About XalanJ
          - 365 KLOC, 1500 classes
          - Implements XPath and XSLT for XML
          - Used by Sun to implement javax.xml.* classes
      - About the Regression
          - Regression from version 2.5.1 to 2.5.2
              ○ 4 months of code changes, 84 major changes
          - Execution traces about 98K entries in length
          - Regressing behavior exhibited within dynamically generated code
  • 54. XalanJ-1725 Regression
      - 296 differences before, 1 after analysis
      - LCS failed to find the regression cause
      - 82.8x speedup
      - 269 times less memory
  • 55. XalanJ-1802 Regression
      - About XalanJ
          - 365 KLOC, 1500 classes
          - Implements XPath and XSLT for XML
          - Used by Sun to implement javax.xml.* classes
      - About the Regression
          - Regression from version 2.4.1 to 2.5.1
              ○ 79K new or changed lines of code over 12 months
              ○ 97 bugfixes and feature enhancements
          - Execution traces about 44K entries in length
          - Regressing behavior exhibited within a completely rearchitected module
  • 56. XalanJ-1802 Regression
      - 184 differences before, 10 after analysis
      - Same accuracy as LCS
      - 9.4x speedup
      - 35.4 times less memory
  • 57. Derby-1633 Regression
      - About Derby
          - 720K lines of code
          - Embedded or client/server relational DB
          - AKA Sun Java DB, included in JDK 6
      - About the Regression
          - Regression from version 10.1.2.1 to 10.1.3.1
              ○ 7 months of changes, 9 enhancements, 97 bugfixes
          - Execution traces about 335K entries in length
          - Involves multiple threads, larger code base (2x), and longer running traces (3x)
  • 58. Derby-1633 Regression
      - 2663 differences before, 6 after analysis
      - LCS completely failed (out of memory failure at 32 GB)
  • 59. Roadmap (same outline as slide 2)
  • 60. Summary / Future Directions
      - New view-based model for traces
          - Facilitates semantics-aware dynamic analyses
          - One application is efficient trace differencing
          - Full formal framework in paper
      - Other potential applications:
          - Race detection
          - Object-protocol enforcement
          - Data-mining from traces
          - Malware detection
  • 61. Download RPrism, try it out! http://cs.purdue.edu/homes/kjhoffma/rprism/ Contact Information: Kevin Hoffman kjhoffma@cs.purdue.edu
  • 62. View-based Diff vs LCS
  • 63. Regression Cause Analysis
      - Factors affecting false negatives:
          - Dynamic traces are complete, set A must contain cause
          - Differences in set B produced correct output, not likely to contain the direct regression cause
          - Intersecting with set C can introduce false negatives (e.g., regression caused by code removal)
      - Factors affecting false positives:
          - Choice of similar test case affects quality of set B
          - Intersecting/subtracting set C also helps
      - Set A is the suspected differences set
      - Set B is the expected differences set
      - Set C is the regression differences set
  • 64. Lock-step Scanning of Main View
  • 65. Lock-step Scanning of Main View
  • 66. Exploration of Secondary Views with LCS
  • 67. Apply LCS over Fixed-size Window in Main View to Find the Next Correlation
  • 68. Exploration of Secondary Views with LCS
  • 69. Apply LCS over Fixed-size Window in Main View to Find the Next Correlation
  • 70. Lock-step Scanning of Main View
