Semantics-Aware Trace Analysis [PLDI 2009]

We present a novel dynamic program analysis that builds a semantic view of program executions. These views reflect program abstractions and aspects; however, views are not simply projections of execution traces, but are linked to each other to capture semantic interactions among abstractions at different levels of granularity in a scalable manner.

We describe our approach in the context of Java and demonstrate its utility to improve regression analysis. We first formalize a subset of Java and a grammar for traces generated at program execution. We then introduce several types of views used to analyze regression bugs along with a novel, scalable technique for semantic differencing of traces from different versions of the same program. Benchmark results on large open-source Java programs demonstrate that semantic-aware trace differencing can identify precise and useful details about the underlying cause for a regression, even in programs that use reflection, multithreading, or dynamic code generation, features that typically confound other analysis techniques.

  1. Kevin Hoffman, Patrick Eugster, Suresh Jagannathan
  2. Roadmap
     • Motivation
     • Prior Approaches
     • Semantics-Aware Trace Analysis (SATA)
     • Applying SATA to Regression Analysis
     • Evaluation
     • Conclusions
  3. Motivation
     • Apache XalanJ 2.4.1 works:
         java … xslt.Process -xsl case1.xsl -in test.xml
         java … xslt.Process -xsl case2.xsl -in test.xml
         java … xslt.Process -xsl case3.xsl -in test.xml
     • Upgrade to 2.5.1, now it's broken!
         java … xslt.Process -xsl case1.xsl -in test.xml
         java … xslt.Process -xsl case2.xsl -in test.xml
         java … xslt.Process -xsl case3.xsl -in test.xml
  4. How to find the cause?
     • Manual inspection is hard
        • 12 months of development from 2.4.1 to 2.5.1
        • 79K new or changed lines of code
        • 97 new features and bugfixes
  5. How to find the cause?
     • Debugging is hard
        • Separation of cause and effect
           ○ e.g. in XalanJ, bug in XSLT compiler
        • Complex web of interacting components
        • Debugging requires in-depth domain-specific knowledge (limited resource)
  6. Roadmap (same outline as slide 2)
  7. Challenges: Static Analysis
     • Dynamically generated code
     • Advanced language features
        • Dynamic dispatch (e.g., polymorphism)
        • Reflection
        • Advanced aspect-oriented language features
  8. Challenges: Dynamic Analysis
     • Dynamic program slicing
        • Slices are still quite large (e.g. 1000s of events)
     • Control-flow similarity metrics
     • State-space exploration / refinement
  9. Execution Indexing
     • Use structure/state of execution to compute an 'index' at each execution point
     • Find correlations between indices for profiling, debugging, execution comparison
  10. Roadmap (same outline as slide 2)
  11. Semantic Trace Views
      Execution Trace
        --> LOG-1.addMsg('Handling..')
            ...
        <-- LOG-1.addMsg(..)
        --> SP-1.setRequestType('text/html')
            --> STR-1.equals('text/html')
            <-- STR-1.equals(..) ret=true
            --> NUM-1.new(32, 127)
                set NUM-1._minCharRange = 32
                set NUM-1._maxCharRange = 127
            <-- NUM-1.new(..)
            set SP-1._binConv = NUM-1
            ...
            --> LOG-1.addMsg('Set req..')
                ...
            <-- LOG-1.addMsg(..)
        <-- SP-1.setRequestType(..)
      Organize execution traces into "views"
  12. Semantic Trace Views
      Execution Trace (Thread View): same trace as slide 11
      Thread views based on thread ID
  13. Semantic Trace Views
      Execution Trace (and Thread View): same trace as slide 11
      Method View for SP.setRequestType
        --> STR-1.equals('text/html')
        <-- STR-1.equals(..) ret=true
        --> NUM-1.new(32, 127)
        <-- NUM-1.new(..)
        set SP-1._binConv = NUM-1
        ...
        --> LOG-1.addMsg('Set req..')
        <-- LOG-1.addMsg(..)
      Method views based on top of call stack
  14. Semantic Trace Views
      Execution Trace (and Thread View): same trace as slide 11
      Method View for NUM.new
        set NUM-1._minCharRange = 32
        set NUM-1._maxCharRange = 127
      Method View for LOG.addMsg
        ...
        ...
      Method views based on top of call stack
  15. Semantic Trace Views
      Execution Trace (and Thread View): same trace as slide 11
      Active Object View for NUM-1.new
        set NUM-1._minCharRange = 32
        set NUM-1._maxCharRange = 127
      Active Object View for LOG-1.addMsg
        ...
        ...
      Active object views based on top of call stack
  16. Semantic Trace Views
      Execution Trace (and Thread View): same trace as slide 11
      Target Object View for NUM-1
        --> NUM-1.new(32, 127)
        set NUM-1._minCharRange = 32
        set NUM-1._maxCharRange = 127
        <-- NUM-1.new(..)
      Target Object View for LOG-1
        --> LOG-1.addMsg('Handling..')
        <-- LOG-1.addMsg(..)
        --> LOG-1.addMsg('Set req..')
        <-- LOG-1.addMsg(..)
      Target object views
  17. Semantic Trace Views
      Execution Trace (and Thread View), Target Object View for NUM-1, Method View for SP.setRequestType, and Target Object View for LOG-1 shown together (contents as on slides 11, 13, and 16)
      Views are linked, allowing for multilevel analysis
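
The view construction on slides 11 to 17 can be sketched in a few lines of Python. This is a minimal illustration, not RPrism's implementation: the `Event` record, the `build_views` function, and the thread ID `"T1"` are hypothetical names, the elided `...` trace entries are left out, and active object views are omitted. Views store positions in the main trace rather than copies of the entries, which is one simple way to keep them linked to each other.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    thread: str  # ID of the thread that produced the entry
    kind: str    # "call" (-->), "return" (<--), or "set" (field write)
    target: str  # object acted on, e.g. "NUM-1"
    member: str  # method or field name, e.g. "new" or "_binConv"


def build_views(trace):
    """Group trace positions into thread, method, and target-object views."""
    thread_views = defaultdict(list)
    method_views = defaultdict(list)
    target_views = defaultdict(list)
    stacks = defaultdict(list)  # per-thread call stack of (object, method)

    for pos, event in enumerate(trace):
        thread_views[event.thread].append(pos)
        target_views[event.target].append(pos)
        stack = stacks[event.thread]
        if event.kind == "return":
            stack.pop()  # a return is attributed to the caller's view
        if stack:
            obj, method = stack[-1]
            cls = obj.rsplit("-", 1)[0]  # "SP-1" -> "SP"
            method_views[f"{cls}.{method}"].append(pos)
        if event.kind == "call":
            stack.append((event.target, event.member))
    return thread_views, method_views, target_views


def e(kind, target, member):
    return Event("T1", kind, target, member)


# The trace from slide 11, without the elided "..." entries.
TRACE = [
    e("call", "LOG-1", "addMsg"), e("return", "LOG-1", "addMsg"),
    e("call", "SP-1", "setRequestType"),
    e("call", "STR-1", "equals"), e("return", "STR-1", "equals"),
    e("call", "NUM-1", "new"),
    e("set", "NUM-1", "_minCharRange"), e("set", "NUM-1", "_maxCharRange"),
    e("return", "NUM-1", "new"),
    e("set", "SP-1", "_binConv"),
    e("call", "LOG-1", "addMsg"), e("return", "LOG-1", "addMsg"),
    e("return", "SP-1", "setRequestType"),
]

thread_views, method_views, target_views = build_views(TRACE)
```

With this input, `method_views["NUM.new"]` holds the positions of the two field writes and `target_views["NUM-1"]` additionally holds the constructor call and return, matching the slide 14 and slide 16 views.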
  18. Roadmap (same outline as slide 2)
  19. What if we just used diff?
      • Collect dynamic traces:
          2.4.1: sssdyntracer … xslt.Process -xsl case2.xsl -in test.xml
          2.5.1: sssdyntracer … xslt.Process -xsl case2.xsl -in test.xml
      • Traces are about 48K entries
  20. What if we just used diff?
      • Collect dynamic traces:
          2.4.1: sssdyntracer … xslt.Process -xsl case2.xsl -in test.xml
          2.5.1: sssdyntracer … xslt.Process -xsl case2.xsl -in test.xml
      • Traces are about 48K entries
      • Run "diff" tool on traces:
         • Requires 25 minutes on a 1.8 GHz x64 CPU
         • Requires 27 GB of RAM
         • Produces 1594 differences (3.3% of trace)
  21. Challenges of diff / LCS
      [Figure: Old and New traces; content not recoverable from this transcript]
      • diff based on LCS algorithm:
         • Intractable on large traces: Ω(n²)
         • Can't detect moved sequences
         • Is not semantics-aware
      • diff produces too many differences
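
To make the quadratic cost and the moved-sequence problem concrete, here is a minimal textbook LCS in Python (a sketch, not the `diff` tool's actual implementation). The inputs are the Old and New strings used later in the deck, with the space in "CBDHXYFEF Z" treated as slide layout, which is an assumption. The table has (n+1)(m+1) cells, so two traces of about 48K entries would need roughly 2.3 billion cells in this naive form.

```python
def lcs(a, b):
    """Return one longest common subsequence of a and b (O(n*m) time and space)."""
    n, m = len(a), len(b)
    # table[i][j] = LCS length of the suffixes a[i:] and b[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the top-left corner to recover the subsequence.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return "".join(out)


OLD = "CBDHXYFEFZ"
NEW = "CADFEFXYZ"
common = lcs(OLD, NEW)
```

Because `XY` precedes `FEF` in the old string and follows it in the new one, no common subsequence can contain both, so LCS keeps `FEF` and reports `XY` as deleted in one place and inserted in another instead of as a move.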
  22. Leveraging Semantic Views
      • Use secondary views (method/object) to find correlations in primary view (thread)
         • Robust against reorderings in other views
         • Correlations are semantically sound
      • Apply LCS/diff over fixed-size windows in primary view to find 'best overall correlation' in primary view
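
The fixed-size-window idea in the last bullet can be sketched as follows. This is a deliberately reduced illustration under stated assumptions: the function names, the window size of 4, and the rule of resynchronizing at the first matched pair are all hypothetical, and the sketch omits the secondary views entirely, so it is not the view-based algorithm itself. It scans both sequences in lock step and runs LCS over a small window only when the entries disagree, so each resynchronization costs O(window²) regardless of trace length.

```python
def lcs_pairs(a, b):
    """Return index pairs (i, j) of one longest common subsequence of a and b."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    pairs = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def windowed_diff(old, new, window=4):
    """Lock-step scan; on a mismatch, use LCS over a fixed-size window to resync."""
    i = j = 0
    removed, added = [], []
    while i < len(old) and j < len(new):
        if old[i] == new[j]:
            i += 1
            j += 1
            continue
        w_old, w_new = old[i:i + window], new[j:j + window]
        pairs = lcs_pairs(w_old, w_new)
        # Skip to the first correlated pair, or past the window if none exists.
        di, dj = pairs[0] if pairs else (len(w_old), len(w_new))
        removed.extend(range(i, i + di))
        added.extend(range(j, j + dj))
        i += di
        j += dj
    removed.extend(range(i, len(old)))
    added.extend(range(j, len(new)))
    return removed, added


removed, added = windowed_diff("CBDHXYFEFZ", "CADFEFXYZ")
```

On this input the windowed scan alone still reports `XY` as removed from the old sequence and added to the new one. Recognizing it as a moved sequence is what the correlations from secondary views are meant to provide.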
  23. Recall: What LCS would produce
      [Figure: Old and New traces; content not recoverable from this transcript]
  24. View-based Semantic Differencing
      Main View
        Old: CBDHXYFEF Z
        New: CADFEFXYZ
  25. View-based Semantic Differencing
      Main View
        Old: CBDHXYFEF Z
        New: CADFEFXYZ
      Secondary View
        Old: DHXYZ
        New: DXYZ
      View construction (only one of many secondary views displayed here)
  26. View-based Semantic Differencing (views as on slide 25): Lock-step scanning of main view
  27. View-based Semantic Differencing (views as on slide 25): Lock-step scanning of main view
  28. View-based Semantic Differencing (views as on slide 25): Discovery of correlating secondary views
  29. View-based Semantic Differencing (views as on slide 25): Exploration of correlating secondary views
  30. View-based Semantic Differencing (views as on slide 25): Exploration of correlating secondary views
  31. View-based Semantic Differencing (views as on slide 25): Exploration of correlating secondary views
  32. View-based Semantic Differencing (views as on slide 25): Exploration of correlating secondary views
  33. View-based Semantic Differencing (views as on slide 25): Exploration of correlating secondary views
  34. View-based Semantic Differencing (views as on slide 25): Lock-step scanning of main view
  35. View-based Semantic Differencing (views as on slide 25): Lock-step scanning of main view
  36. View-based Semantic Differencing (views as on slide 25): Lock-step scanning of main view; exploration of secondary views
  37. View-based Semantic Differencing (views as on slide 25): Apply LCS over fixed-size window in main view to find the next correlation
  38. View-based Semantic Differencing (views as on slide 25): Apply LCS over fixed-size window in main view to find the next correlation
  39. View-based Semantic Differencing (views as on slide 25): Lock-step scanning of main view
  40. View-based Semantic Differencing (views as on slide 25): Lock-step scanning of main view
  41. View-based Semantic Differencing (views as on slide 25): Lock-step scanning of main view
  42. View-based Semantic Differencing (views as on slide 25): Apply LCS over fixed-size window in main view to find the next correlation
  43. View-based Semantic Differencing (views as on slide 25): Lock-step scanning of main view
  44. View-based Semantic Differencing (views as on slide 25): View-based differencing identified moved sequences properly
  45. View-Based Differencing vs. LCS
      • Collect dynamic traces:
          2.4.1: sssdyntracer … xslt.Process -xsl case2.xsl -in test.xml
          2.5.1: sssdyntracer … xslt.Process -xsl case2.xsl -in test.xml
      • Traces are about 48K entries
      • Run view-based differencing tool on traces:
         • Requires 0.3 minutes instead of 25 minutes
         • Requires 0.1 GB instead of 27 GB of RAM
         • Produces 598 differences (1.2% of trace)
            ○ vs 1594 differences (3.3% of trace) for LCS
  46. Regression Analysis Process
      [Diagram; flow reconstructed from the box labels]
      • Old Program and New Program, plus Tracing Aspects → AspectJ Load-time Weaver → Old Program w/ Instrumentation and New Program w/ Instrumentation
      • Regressing Test Case and Working (but similar) Test Case(s) → Trace → Traces for 4 Cases
      • Traces for 4 Cases → View Trace Differencing → RPrism Analysis Algorithm → Likely Regression Causes
  47. RPrism Analysis Algorithm
      • Suspected differences set: Old Program, Regressing Test Case VS New Program, Regressing Test Case
      • Expected differences set: Old Program, Working Test Case VS New Program, Working Test Case
      • Regression differences set: New Program, Working Test Case VS New Program, Regressing Test Case
  48. RPrism Analysis Algorithm
      [Diagram relating the Suspected, Expected, and Regression differences sets to the Results]
  49. Roadmap (same outline as slide 2)
  50. 4 Regressions on 3 Projects
      • Daikon
         • Dynamic invariant detector from MIT
         • Used as a test subject in 11 other publications
      • Apache XalanJ
         • Implements XML XPath and XSLT
         • Interprets XSLT or compiles XSLT to Java bytecode
         • Used in Sun JDK to implement javax.xml.* classes
      • Apache Derby (720 KLOC)
         • Embedded or client/server relational DB
         • AKA Sun Java DB, included in JDK 6
  51. Daikon Regression
      • About Daikon
         • 169 KLOC, 1100 classes
         • Dynamic invariant detector from MIT
         • Used as a test subject in 11 other publications
      • About the Regression
         • Regression first studied by JUnit/CIA [FSE '06]
            ○ 1 week of differences
         • Execution traces about 15K entries in length
  52. Daikon Regression
      • 42 differences before, 3 after analysis
      • Same accuracy as LCS
      • 12.9x speedup
      • 12.1 times less memory
  53. XalanJ-1725 Regression
      • About XalanJ
         • 365 KLOC, 1500 classes
         • Implements XPath and XSLT for XML
         • Used by Sun to implement javax.xml.* classes
      • About the Regression
         • Regression from version 2.5.1 to 2.5.2
            ○ 4 months of code changes, 84 major changes
         • Execution traces about 98K entries in length
         • Regressing behavior exhibited within dynamically generated code
  54. XalanJ-1725 Regression
      • 296 differences before, 1 after analysis
      • LCS failed to find the regression cause
      • 82.8x speedup
      • 269 times less memory
  55. XalanJ-1802 Regression
      • About XalanJ
         • 365 KLOC, 1500 classes
         • Implements XPath and XSLT for XML
         • Used by Sun to implement javax.xml.* classes
      • About the Regression
         • Regression from version 2.4.1 to 2.5.1
            ○ 79K lines of changed code over 12 months
            ○ 97 bugfixes and feature enhancements
         • Execution traces about 44K entries in length
         • Regressing behavior exhibited within a completely rearchitected module
  56. XalanJ-1802 Regression
      • 184 differences before, 10 after analysis
      • Same accuracy as LCS
      • 9.4x speedup
      • 35.4 times less memory
  57. Derby-1633 Regression
      • About Derby
         • 720K lines of code
         • Embedded or client/server relational DB
         • AKA Sun Java DB, included in JDK 6
      • About the Regression
         • Regression from version 10.1.2.1 to 10.1.3.1
            ○ 7 months of changes, 9 enhancements, 97 bugfixes
         • Execution traces about 335K entries in length
         • Involves multiple threads, larger code base (2x), and longer running traces (3x)
  58. Derby-1633 Regression
      • 2663 differences before, 6 after analysis
      • LCS completely failed (out of memory failure at 32 GB)
  59. Roadmap (same outline as slide 2)
  60. Summary / Future Directions
      • New view-based model for traces
      • Facilitates semantics-aware dynamic analyses
      • One application is efficient trace differencing
      • Full formal framework in paper
      • Other potential applications:
         • Race detection
         • Object-protocol enforcement
         • Data-mining from traces
         • Malware detection
  61. Download RPrism, try it out!
      http://cs.purdue.edu/homes/kjhoffma/rprism/
      Contact Information: Kevin Hoffman, kjhoffma@cs.purdue.edu
  62. View-based Diff vs LCS
  63. Regression Cause Analysis
      • Factors affecting false negatives:
         • Dynamic traces are complete, so set A must contain the cause
         • Differences in set B produced correct output, so they are not likely to contain the direct regression cause
         • Intersecting with set C can introduce false negatives (e.g., regression caused by code removal)
      • Factors affecting false positives:
         • Choice of similar test case affects quality of set B
         • Intersecting/subtracting set C also helps
      • Set A is the suspected differences set
      • Set B is the expected differences set
      • Set C is the regression differences set
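
One way to read the set operations named on this slide is as plain set algebra over difference identifiers. The sketch below assumes the result is (A − B) ∩ C; the slides mention subtracting B and intersecting with C but do not spell out the exact formula, so treat that combination, the function name `likely_causes`, and the labels `d1` to `d5` as illustrative assumptions only.

```python
def likely_causes(suspected, expected, regression):
    """Return (A - B) & C.

    A (suspected):  old vs. new program on the regressing test case
    B (expected):   old vs. new program on the working test case
    C (regression): working vs. regressing test case on the new program
    """
    return (suspected - expected) & regression


# Hypothetical difference labels, for illustration only.
A = {"d1", "d2", "d3", "d4"}
B = {"d2"}              # also seen when the output was still correct
C = {"d1", "d3", "d5"}  # seen between working and regressing runs

result = likely_causes(A, B, C)
```

Here `d2` is dropped because it also appears in a run that produced correct output, and `d4` is dropped because it does not separate the working run from the regressing run. The second filter is the step the slide warns about: a cause that leaves no entry in set C, such as removed code, would be lost.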
  64. Lock-step Scanning of Main View
  65. Lock-step Scanning of Main View
  66. Exploration of Secondary Views with LCS
  67. Apply LCS over Fixed-size Window in Main View to Find the Next Correlation
  68. Exploration of Secondary Views with LCS
  69. Apply LCS over Fixed-size Window in Main View to Find the Next Correlation
  70. Lock-step Scanning of Main View
