Unifi

UniFi presentation at ICSE 2009 (May 20, 2009)

  • Dimensionality checking is a simple way of checking physics equations for consistency. Even if Prof. Einstein came up and told you that energy = mass times the velocity of light, you could tell him he was wrong because the independent physical dimensions on both sides don’t match up. In software parlance, you could say it doesn’t “type check”.
  • Programs operate on values that have dimensions, and not just in the scientific or physical sense. Regular applications manipulate values with dimensions like employee ID, network port, calendar year, filename, hostname, street address, and so on. In our work, we focus on primitive types and strings. I argue that many of the actual values a program computes with are of these types, and much of the rest is scaffolding to hold these values together. For example, your database consists of values of these types.
  • Most of these variables have their own space of values, and that’s what the colors are intended to represent.
  • I’ll be using color as a proxy to delineate different dimensions throughout this presentation.
  • How do we get the benefit of dimension checking in mainstream languages, without special languages or programmer annotations, and for, say, IT applications rather than scientific code?
  • Instantly adaptable to Java.
  • It could be that program 1 is wrong or that program 2 is wrong, or it could be that they are both fine. What notion of dimensions and what inference algorithm will capture as many real bugs as possible?
  • An example of a case where this might work: E = m * c^2 yesterday, E = m * c today. I should note that we have not explored the second area much.
  • Notice that we don’t necessarily have human-understandable names. Unification constraints are based on assignment, comparison, add/subtract, array indexing (implicit comparison with length), and method invocation.
  • Our early examples lost precision due to not having context sensitivity
  • Add example.
  • A GUI to view inference results. We also have scripts to monitor a CVS/SVN repository, run UniFi at regular intervals, and diff the results of successive runs.
  • When we started, we weren’t sure what to expect: would these equivalence classes be merging all the time due to legal program changes? Would interesting errors show up as changes in dimensional relationships at all?
  • As with mass, length, and time in physics, it would be wonderful if a platform came with a set of base units.
  • UniFi is the first such system for Java and OO features, and it is completely automatic.
  • Please use it, play around with it, give us feedback, extend it, build upon it, do whatever you want.
  • Unifi
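The unification constraints described in the notes above (assignment, comparison, add/subtract, and array indexing each force two variables into the same dimension) can be illustrated with a small union-find structure. This is only a minimal sketch under stated assumptions: the class name `DimensionUnifier` and its methods are hypothetical, dimension variables are identified by plain strings, and nothing here is taken from UniFi's actual implementation, which works on bytecode.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of dimension unification; not UniFi's actual code. */
class DimensionUnifier {
    // Maps each dimension variable to its parent in the union-find forest.
    private final Map<String, String> parent = new HashMap<>();

    /** Returns the representative of the class containing dvar, registering it if new. */
    String find(String dvar) {
        parent.putIfAbsent(dvar, dvar);
        String root = dvar;
        while (!parent.get(root).equals(root)) {
            root = parent.get(root);
        }
        // Path compression: point every node on the path directly at the root.
        String cur = dvar;
        while (!cur.equals(root)) {
            String next = parent.get(cur);
            parent.put(cur, root);
            cur = next;
        }
        return root;
    }

    /** Records that a and b must have the same dimension. */
    void unify(String a, String b) {
        String ra = find(a);
        String rb = find(b);
        if (!ra.equals(rb)) {
            parent.put(ra, rb);
        }
    }

    /** True if a and b have been forced into the same equivalence class. */
    boolean sameDimension(String a, String b) {
        return find(a).equals(find(b));
    }

    public static void main(String[] args) {
        DimensionUnifier u = new DimensionUnifier();
        // x = y + z: assignment and addition unify all three variables.
        u.unify("x", "y");
        u.unify("y", "z");
        // a < b: comparison unifies both operands.
        u.unify("a", "b");
        // d[i]: array indexing implicitly compares i with d.length.
        u.unify("i", "d.length");

        System.out.println("x ~ z: " + u.sameDimension("x", "z"));
        System.out.println("x ~ a: " + u.sameDimension("x", "a"));
    }
}
```

Variables start out independent and only merge when a constraint demands it, which matches the "initially independent" starting point described in the slides. Multiply and divide do not unify their operands, so they are left out of this sketch.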

    1. 1. Automatic Dimension Inference and Checking for Object-Oriented Programs Sudheendra Hangal Monica S. Lam http://suif.stanford.edu/unifi International Conference on Software Engineering Vancouver, Canada May 20th, 2009
    2. 2. Overview • A fully automatic dimension inference system for Java programs • Diff-based method to detect dimension errors • Case-study on a 19KLoC program • UniFi: Usable open-source tool
    3. 3. Dimensionality Checking Used by physicists (and high-schoolers) E.g. E = m * c: [M x L^2 x T^-2] vs. [M x L x T^-1] Doesn’t “type check”!
    4. 4. Dimensions are Everywhere • Program values have dimensions like id, $, date, port, color, flag, state, mask, count, message, filename, property, … and of course, mass, length, time, etc. • We focus on primitive types and strings – Hard to define custom types for everything – No benefit of type-checking
    5. 5. Programmer View java.awt.MouseWheelEvent public MouseWheelEvent(Component source, int id, long when, int modifiers, int x, int y, int clickCount, boolean popupTrigger, int scrollType, int scrollAmount, int wheelRotation);
    6. 6. Type-checker View java.awt.MouseWheelEvent public MouseWheelEvent(Component source, int id, long when, int modifiers, int x, int y, int clickCount, boolean popupTrigger, int scrollType, int scrollAmount, int wheelRotation);
    7. 7. Observation • Programmers use suffixes to capture dimensions int proxyPort, backgroundColor; long startTimeMillis, eventMask; String inputFilename, serverURL;
    8. 8. Putting Dimensions to Work How do we get the benefit of dimension checking in mainstream languages? 2 ideas: 1) Detect (likely) errors automatically by diffing dimension usage between programs 2) Bootstrap from standard libraries
    9. 9. UniFi’s Core Idea • Infer dimensions of variables automatically – Static analysis, type inference techniques – Standard Java programs, zero annotation burden • Optional: Examine results • Compare inferred dimensions across two programs that have something in common
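    10. 10. [Diagram: Program 1 and Program 2 each pass through UniFi Inference, producing Results 1 and Results 2; UniFi Diff compares the two sets of results to produce Diffs, which are viewed in the UniFi GUI.]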
    11. 11. Use Cases • Report changes as the same code evolves – Nightly builds – During program maintenance • Compare against a different configuration – Different programs using the same library – 2 different implementations of an interface – Implementation of a library v/s program using it – Different programmers’ code
    12. 12. Inference Algorithm • Input: Java program • Assigns dimensions to variables – Initially independent • Set up constraints between dim. vars • Solve constraints • Output: a set of relations between dimension variables
    13. 13. Inference Example (1) • x = y + z: x, y, z unified • a < b: a, b unified • d[i]: i, d.length unified • u = v * w: dimension of u is v * w
    14. 14. Inference Example (2) int f(int x) { return x * x; } • a1 = f(a); dimension of a1 is a * a • b1 = f(b); dimension of b1 is b * b • Context sensitive analysis – Uses method summaries
    15. 15. OO Constraints • Subtypes retain supertype interface – Liskov Substitution Principle • Constrains dimensions of parameters and return value of subtype methods class A { int m( int x ) { … } } class B extends A { int m( int x ) { … } }
    16. 16. Multiply/Divide Constraints • Linear equation style expressions for multiply and divide – Special handling of java.math libraries • Solved using Gaussian elimination style algorithm
    17. 17. Comparing Inferred Dimensions • Identify common variables – Same name of field, position of method param, etc. • Compare equivalence classes formed by unification constraints • Compare Multiply-divide constraints – Need canonical formulas for dvars – Make common variables more “stable” than others – See paper for details
    18. 18. Case Study: bddbddb http://sourceforge.net/projects/bddbddb • Retroactively run over 10 months of active development – Oct. 2004 to July 2005, 292 builds – Approx. 19,000 lines of Java code • Compared successive nightly builds
    19. 19. Results • 26 reports, across 19 pairs of builds • 5 real errors (+ fixes) • False Positives – Trivial reasons like field not used – Probably easy to reduce number
    20. 20. Bug Example double NO_CLASS = …; // default class id double NO_CLASS_SCORE = …; // default score … double vScore=NO_CLASS, aScore=NO_CLASS; double vClass=NO_CLASS, aClass=NO_CLASS; • UniFi detected that independent dimensions NO_CLASS and NO_CLASS_SCORE merged
    21. 21. Inference Example double[] distribution = new double[numClasses]; ... // compute sum ... // initialize distribution array for (int i=0; i < NUM_TREES; ++i) distribution[i] /= sum; • Unified: numClasses with distribution.length; i with NUM_TREES • However: not caught, since this was in new code!
    22. 22. Experiences • Sometimes bugs indicated by removal of unification constraint (“error of omission”) • Dimensionally inconsistent code – Ignore hashCode(), compareTo() – Cannot interpret semantically
    23. 23. Experiences • Types of Errors: Sometimes can be difficult to root-cause • Dimensions vs. Units – May not catch wrong scaling factor… …but might catch the absence of one (?)
    24. 24. Future Work • Explore use-cases for UniFi in the wild • “S.I. Units” for platform libraries – Using JSR-308 for Java w/understandable names • An intriguing possibility: Dimension inference for hardware languages like Verilog
    25. 25. Related Work • Osprey (Jiang and Su, ICSE ‘06) • XeLda (Antoniu et al, ICSE '04) • Type qualifiers (Foster et al, PLDI '99) • Lackwit (O’Callahan and Jackson, ICSE '97) • Fortress (Allen et al, OOPSLA '04)
    26. 26. Conclusions • UniFi is the first dimension inference system for standard Java programs – for automatically detecting bugs – for bootstrapping use of dimensions via libraries – Many uses waiting to be explored Open sourced and available from: http://suif.stanford.edu/unifi Users and collaborators welcome
    27. 27. Automatic Dimension Inference and Checking for Object-Oriented Programs Sudheendra Hangal Monica S. Lam http://suif.stanford.edu/unifi International Conference on Software Engineering Vancouver, Canada May 20th, 2009
    28. 28. Backup slides
    29. 29. Bug Example double[] distribution = new double[numClasses]; ... // compute sum and initialize ... // distribution array for (int i=0; i < NUM_TREES; ++i) distribution[i] /= sum; • Unified: numClasses with distribution.length; i with NUM_TREES • However: not caught, since this was in new code!
    30. 30. Dimension Variables Assign dimension variables (dvars) to • Fields • Interfaces: Method Parameters, Return values • Array elements and lengths • Local Variables • Constants • Result of Multiply/Divide Operations • Primitive types only
    31. 31. Mechanics • Bytecode based static analysis • Scripts to monitor a CVS/SVN repository and generate diffs • GUI to view inference results, correlated with unification points in source code.
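The diff step from the slides (identify common variables, then compare the equivalence classes formed by unification) can be sketched as follows, using the NO_CLASS / NO_CLASS_SCORE bug as input. This is a rough illustration under assumptions: the class `DimensionDiff`, the method `newlyMerged`, and the integer class ids are all hypothetical, the class assignments in `main` are made up to mirror the slide, and the pairwise comparison is a simplification rather than the canonicalization scheme the paper describes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical sketch of diffing inferred dimensions across two builds. */
class DimensionDiff {
    /**
     * Each map assigns a variable name to the id of its equivalence class in one build.
     * Returns "a ~ b" for every pair of common variables that were in different
     * classes in the old build but share a class in the new build.
     */
    static List<String> newlyMerged(Map<String, Integer> oldClasses,
                                    Map<String, Integer> newClasses) {
        // Only variables present in both builds can be compared; sort for stable output.
        TreeSet<String> common = new TreeSet<>(oldClasses.keySet());
        common.retainAll(newClasses.keySet());
        List<String> vars = new ArrayList<>(common);

        List<String> reports = new ArrayList<>();
        for (int i = 0; i < vars.size(); i++) {
            for (int j = i + 1; j < vars.size(); j++) {
                String a = vars.get(i);
                String b = vars.get(j);
                boolean wasSame = oldClasses.get(a).equals(oldClasses.get(b));
                boolean isSame = newClasses.get(a).equals(newClasses.get(b));
                if (!wasSame && isSame) {
                    reports.add(a + " ~ " + b);
                }
            }
        }
        return reports;
    }

    public static void main(String[] args) {
        // Assumed old build: class ids and scores are independent dimensions.
        Map<String, Integer> oldBuild = new HashMap<>();
        oldBuild.put("NO_CLASS", 1);
        oldBuild.put("vClass", 1);
        oldBuild.put("NO_CLASS_SCORE", 2);
        oldBuild.put("vScore", 2);

        // Assumed new build: vScore = NO_CLASS merges the two dimensions.
        Map<String, Integer> newBuild = new HashMap<>();
        newBuild.put("NO_CLASS", 1);
        newBuild.put("vClass", 1);
        newBuild.put("NO_CLASS_SCORE", 1);
        newBuild.put("vScore", 1);

        for (String report : newlyMerged(oldBuild, newBuild)) {
            System.out.println("merged: " + report);
        }
    }
}
```

The same comparison run in the opposite direction (same class before, different classes after) would flag the removal of a unification constraint, the "error of omission" case mentioned under Experiences.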
