
Unifi


UniFi presentation at ICSE 2009 (May 20, 2009)

Published in: Technology, Education


  1. 1. Automatic Dimension Inference and Checking for Object-Oriented Programs Sudheendra Hangal Monica S. Lam http://suif.stanford.edu/unifi International Conference on Software Engineering Vancouver, Canada May 20th, 2009
  2. 2. Overview • A fully automatic dimension inference system for Java programs • Diff-based method to detect dimension errors • Case-study on a 19KLoC program • UniFi: Usable open-source tool
  3. 3. Dimensionality Checking Used by physicists (and high-schoolers) E.g. E = m * c; [M × L² × T⁻²] vs. [M × L × T⁻¹] Doesn’t “type check”!
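The check on this slide can be sketched in a few lines of Java. This is only an illustration of classical dimensional analysis, not UniFi's implementation; the class `Dim` and its constants are hypothetical names introduced here. A dimension is modeled as integer exponents over mass, length, and time, and multiplication adds exponents.

```java
import java.util.Arrays;

/** Hypothetical illustration: a dimension as integer exponents over mass, length, time. */
final class Dim {
    final int[] exp; // {M, L, T}

    Dim(int m, int l, int t) { exp = new int[] {m, l, t}; }

    /** Multiplying two quantities adds their exponents. */
    Dim times(Dim o) {
        return new Dim(exp[0] + o.exp[0], exp[1] + o.exp[1], exp[2] + o.exp[2]);
    }

    /** Two quantities are dimensionally compatible only if all exponents match. */
    boolean sameAs(Dim o) { return Arrays.equals(exp, o.exp); }

    static final Dim MASS = new Dim(1, 0, 0);
    static final Dim SPEED = new Dim(0, 1, -1);
    static final Dim ENERGY = new Dim(1, 2, -2);

    public static void main(String[] args) {
        // E = m * c: the right-hand side is [M x L x T^-1], so the check fails.
        System.out.println(ENERGY.sameAs(MASS.times(SPEED)));              // false
        // E = m * c * c: the right-hand side is [M x L^2 x T^-2], so the check passes.
        System.out.println(ENERGY.sameAs(MASS.times(SPEED).times(SPEED))); // true
    }
}
```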
  4. 4. Dimensions are Everywhere • Program values have dimensions like id, $, date, port, color, flag, state, mask, count, message, filename, property, … and of course, mass, length, time, etc. • We focus on primitive types and strings – Hard to define custom types for everything – No benefit of type-checking
  5. 5. Programmer View java.awt.event.MouseWheelEvent public MouseWheelEvent(Component source, int id, long when, int modifiers, int x, int y, int clickCount, boolean popupTrigger, int scrollType, int scrollAmount, int wheelRotation);
  6. 6. Type-checker View java.awt.event.MouseWheelEvent public MouseWheelEvent(Component source, int id, long when, int modifiers, int x, int y, int clickCount, boolean popupTrigger, int scrollType, int scrollAmount, int wheelRotation);
  7. 7. Observation • Programmers use suffixes to capture dimensions int proxyPort, backgroundColor; long startTimeMillis, eventMask; String inputFilename, serverURL;
  8. 8. Putting Dimensions to Work How do we get the benefit of dimension checking in mainstream languages? Two ideas: 1) Detect (likely) errors automatically by diff’ing dimension usage between programs 2) Bootstrap from standard libraries
  9. 9. UniFi’s Core Idea • Infer dimensions of variables automatically – Static analysis, type inference techniques – Standard Java programs, zero annotation burden • Optional: Examine results • Compare inferred dimensions across two programs that have something in common
  10. 10. [Diagram: Program 1 → UniFi Inference → Results 1; Program 2 → UniFi Inference → Results 2; Results 1 and Results 2 → UniFi Diff → Diffs → UniFi GUI]
  11. 11. Use Cases • Report changes as the same code evolves – Nightly builds – During program maintenance • Compare against a different configuration – Different programs using the same library – Two different implementations of an interface – Implementation of a library vs. a program using it – Different programmers’ code
  12. 12. Inference Algorithm • Input: Java program • Assigns dimensions to variables – Initially independent • Set up constraints between dim. vars • Solve constraints • Output: a set of relations between dimension variables
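  13. 13. Inference Example (1) • x = y + z ⇒ x, y, z have the same dimension • a < b ⇒ a, b have the same dimension • d[i] ⇒ i, d.length have the same dimension • u = v * w ⇒ dimension of u = (dimension of v) * (dimension of w)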
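The same-dimension constraints from addition, comparison, and array indexing can be sketched with a union-find structure. This is a minimal illustration under the assumption that dimension variables are identified by plain strings; the class `DimUnifier` is a hypothetical name, and the slides do not say how UniFi represents equivalence classes internally.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: dimension variables merged into equivalence classes with union-find. */
final class DimUnifier {
    private final Map<String, String> parent = new HashMap<>();

    /** Returns the representative of v's equivalence class, registering v if it is new. */
    String find(String v) {
        parent.putIfAbsent(v, v);
        String p = parent.get(v);
        if (p.equals(v)) return v;
        String root = find(p);
        parent.put(v, root); // path compression
        return root;
    }

    /** Records the constraint that a and b have the same dimension. */
    void unify(String a, String b) {
        parent.put(find(a), find(b));
    }

    boolean sameDimension(String a, String b) { return find(a).equals(find(b)); }

    public static void main(String[] args) {
        DimUnifier u = new DimUnifier();
        u.unify("x", "y"); u.unify("y", "z"); // x = y + z
        u.unify("a", "b");                    // a < b
        u.unify("i", "d.length");             // d[i]
        System.out.println(u.sameDimension("x", "z")); // true: same class
        System.out.println(u.sameDimension("x", "a")); // false: still independent
    }
}
```

Variables start out independent, as on the previous slide, and only the statements seen in the program merge them.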
  14. 14. Inference Example (2) int f(int x) { return x * x; } a1 = f(a); // dim(a1) = dim(a) * dim(a) b1 = f(b); // dim(b1) = dim(b) * dim(b) • Context-sensitive analysis – Uses method summaries
  15. 15. OO Constraints • Subtypes retain supertype interface – Liskov Substitution Principle • Constrains dimensions of parameters and return value of subtype methods class A { int m( int x ) { … } } class B extends A { int m( int x ) { … } }
  16. 16. Multiply/Divide Constraints • Linear equation style expressions for multiply and divide – Special handling of java.math libraries • Solved using Gaussian elimination style algorithm
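One way to picture the linear-equation view is to store each dimension as integer exponents over base dimension variables, so that multiply adds exponents and divide subtracts them. The sketch below shows only that representation, not the Gaussian-elimination-style solver the slide mentions; `DimFormula` and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: a dimension formula as integer exponents over base dvars. */
final class DimFormula {
    final TreeMap<String, Integer> exp = new TreeMap<>();

    static DimFormula of(String dvar) {
        DimFormula f = new DimFormula();
        f.exp.put(dvar, 1);
        return f;
    }

    /** Combines two formulas; sign is +1 for multiply and -1 for divide. */
    private DimFormula combine(DimFormula o, int sign) {
        DimFormula r = new DimFormula();
        r.exp.putAll(exp);
        for (Map.Entry<String, Integer> e : o.exp.entrySet()) {
            int sum = r.exp.getOrDefault(e.getKey(), 0) + sign * e.getValue();
            // Dropping zero exponents keeps the map canonical, so equals() compares formulas.
            if (sum == 0) r.exp.remove(e.getKey()); else r.exp.put(e.getKey(), sum);
        }
        return r;
    }

    DimFormula times(DimFormula o) { return combine(o, 1); }
    DimFormula over(DimFormula o) { return combine(o, -1); }

    boolean sameAs(DimFormula o) { return exp.equals(o.exp); }

    public static void main(String[] args) {
        DimFormula v = of("v"), w = of("w");
        DimFormula u = v.times(w);               // u = v * w
        System.out.println(u.over(v).sameAs(w)); // true: u / v has the dimension of w
        DimFormula a = of("a");
        System.out.println(a.times(a).exp);      // {a=2}, as for a1 = f(a) with f(x) = x * x
    }
}
```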
  17. 17. Comparing Inferred Dimensions • Identify common variables – Same name of field, position of method param, etc. • Compare equivalence classes formed by unification constraints • Compare Multiply-divide constraints – Need canonical formulas for dvars – Make common variables more “stable” than others – See paper for details
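Comparing equivalence classes over common variables can be sketched as follows. The sketch assumes each inference result is available as a map from variable name to an arbitrary class label; that format, the class `DimDiff`, and the report strings are hypothetical. It covers only the unification constraints, not the multiply-divide formulas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical sketch: diff two inference results over their common variables. */
final class DimDiff {
    /**
     * Each map sends a variable name to a label for its equivalence class.
     * Labels are meaningful only within one map, so pairs of variables are compared
     * rather than the labels themselves.
     */
    static List<String> diff(Map<String, String> before, Map<String, String> after) {
        TreeSet<String> common = new TreeSet<>(before.keySet());
        common.retainAll(after.keySet());
        List<String> names = new ArrayList<>(common);
        List<String> reports = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            for (int j = i + 1; j < names.size(); j++) {
                String a = names.get(i), b = names.get(j);
                boolean was = before.get(a).equals(before.get(b));
                boolean now = after.get(a).equals(after.get(b));
                if (!was && now) reports.add("merged: " + a + ", " + b);
                if (was && !now) reports.add("split: " + a + ", " + b);
            }
        }
        return reports;
    }

    public static void main(String[] args) {
        // Modeled on the bug example on slide 20: two constants that were independent
        // in the earlier build end up in one class in the later build.
        Map<String, String> before = Map.of("NO_CLASS", "c1", "NO_CLASS_SCORE", "c2");
        Map<String, String> after = Map.of("NO_CLASS", "c1", "NO_CLASS_SCORE", "c1");
        System.out.println(diff(before, after)); // [merged: NO_CLASS, NO_CLASS_SCORE]
    }
}
```

The pairwise loop is quadratic in the number of common variables, which is fine for a sketch but would need a smarter comparison on a real code base.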
  18. 18. Case Study: bddbddb http://sourceforge.net/projects/bddbddb • Retroactively run over 10 months of active development – Oct. 2004 to July 2005, 292 builds – Approx. 19,000 lines of Java code • Compared successive nightly builds
  19. 19. Results • 26 reports, across 19 pairs of builds • 5 real errors (+ fixes) • False Positives – Trivial reasons like field not used – Probably easy to reduce number
  20. 20. Bug Example double NO_CLASS = …; // default class id double NO_CLASS_SCORE = …; // default score … double vScore=NO_CLASS, aScore=NO_CLASS; double vClass=NO_CLASS, aClass=NO_CLASS; • UniFi detected that independent dimensions NO_CLASS and NO_CLASS_SCORE merged
  21. 21. Inference Example double[] distribution = new double[numClasses]; ... // compute sum ... // initialize distribution array for (int i=0; i < NUM_TREES; ++i) distribution[i] /= sum; • numClasses and distribution.length: same dimension • i and NUM_TREES: same dimension • However: not caught, since this was in new code!
  22. 22. Experiences • Sometimes bugs indicated by removal of unification constraint (“error of omission”) • Dimensionally inconsistent code – Ignore hashCode(), compareTo() – Cannot interpret semantically
  23. 23. Experiences • Types of Errors: Sometimes can be difficult to root-cause • Dimensions vs. Units – May not catch wrong scaling factor… …but might catch the absence of one (?)
  24. 24. Future Work • Explore use-cases for UniFi in the wild • “S.I. Units” for platform libraries – Using JSR-308 for Java with understandable names • An intriguing possibility: Dimension inference for hardware languages like Verilog
  25. 25. Related Work • Osprey (Jiang and Su, ICSE ‘06) • XeLda (Antoniu et al, ICSE '04) • Type qualifiers (Foster et al, PLDI '99) • Lackwit (O’Callahan and Jackson, ICSE '97) • Fortress (Allen et al, OOPSLA '04)
  26. 26. Conclusions • UniFi is the first dimension inference system for standard Java programs – for automatically detecting bugs – for bootstrapping use of dimensions via libraries – Many uses waiting to be explored Open sourced and available from: http://suif.stanford.edu/unifi Users and collaborators welcome
  27. 27. Automatic Dimension Inference and Checking for Object-Oriented Programs Sudheendra Hangal Monica S. Lam http://suif.stanford.edu/unifi International Conference on Software Engineering Vancouver, Canada May 20th, 2009
  28. 28. Backup slides
  29. 29. Bug Example double[] distribution = new double[numClasses]; ... // compute sum and initialize distribution array for (int i=0; i < NUM_TREES; ++i) distribution[i] /= sum; • numClasses and distribution.length: same dimension • i and NUM_TREES: same dimension • However: not caught, since this was in new code!
  30. 30. Dimension Variables Assign dimension variables (dvars) to • Fields • Interfaces: Method Parameters, Return values • Array elements and lengths • Local Variables • Constants • Result of Multiply/Divide Operations • Primitive types only
  31. 31. Mechanics • Bytecode based static analysis • Scripts to monitor a CVS/SVN repository and generate diffs • GUI to view inference results, correlated with unification points in source code.
