Automatic Dimension Inference and
Checking for Object-Oriented Programs

             Sudheendra Hangal
               Monica S. Lam

           http://suif.stanford.edu/unifi

  International Conference on Software Engineering
                  Vancouver, Canada
                    May 20th, 2009
Overview
• A fully automatic dimension inference system
  for Java programs
• Diff-based method to detect dimension errors
• Case-study on a 19KLoC program

• UniFi: Usable open-source tool
Dimensionality Checking
          Used by physicists
          (and high-schoolers)

          E.g.
                     E = m * c;

          [M x L^2 x T^-2] vs. [M x L x T^-1]
          Doesn’t “type check”!
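A minimal sketch of the check on this slide, assuming dimensions are represented as integer exponents of mass, length and time. The class name `PhysDim` and its constants are hypothetical and are not part of UniFi.

```java
import java.util.Arrays;

/** A physical dimension as integer exponents of mass (M), length (L) and time (T). */
final class PhysDim {
    final int[] exp; // {M, L, T}

    PhysDim(int m, int l, int t) { this.exp = new int[] {m, l, t}; }

    /** Multiplying two quantities adds their exponents. */
    PhysDim times(PhysDim o) {
        return new PhysDim(exp[0] + o.exp[0], exp[1] + o.exp[1], exp[2] + o.exp[2]);
    }

    /** Two quantities may be equated only if all exponents match. */
    boolean sameAs(PhysDim o) { return Arrays.equals(exp, o.exp); }

    static final PhysDim MASS = new PhysDim(1, 0, 0);
    static final PhysDim VELOCITY = new PhysDim(0, 1, -1);
    static final PhysDim ENERGY = new PhysDim(1, 2, -2);

    public static void main(String[] args) {
        // E = m * c: [M L^2 T^-2] vs. [M L T^-1]
        System.out.println(ENERGY.sameAs(MASS.times(VELOCITY)));                 // false
        // E = m * c * c: both sides are [M L^2 T^-2]
        System.out.println(ENERGY.sameAs(MASS.times(VELOCITY).times(VELOCITY))); // true
    }
}
```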
Dimensions are Everywhere
• Program values have dimensions like
  id, $, date, port, color, flag, state, mask, count,
  message, filename, property, …
  and of course, mass, length, time, etc.

• We focus on primitive types and strings
   – Hard to define custom types for everything
   – No benefit of type-checking
Programmer View
java.awt.MouseWheelEvent

public MouseWheelEvent(Component source,
  int id, long when, int modifiers,
  int x, int y, int clickCount,
  boolean popupTrigger,
  int scrollType, int scrollAmount,
  int wheelRotation);
Type-checker View
java.awt.MouseWheelEvent

public MouseWheelEvent(Component source,
  int id, long when, int modifiers,
  int x, int y, int clickCount,
  boolean popupTrigger,
  int scrollType, int scrollAmount,
  int wheelRotation);
Observation
• Programmers use suffixes to capture
  dimensions

  int proxyPort, backgroundColor;
  long startTimeMillis, eventMask;
  String inputFilename, serverURL;
Putting Dimensions to Work
 How do we get the benefit of dimension
 checking in mainstream languages?

2 ideas:
1) Detect (likely) errors automatically by
diff’ing dimension usage between programs
2) Bootstrap from standard libraries
UniFi’s Core Idea
• Infer dimensions of variables automatically
  – Static analysis, type inference techniques
  – Standard Java programs, zero annotation burden


• Optional: Examine results

• Compare inferred dimensions across two
  programs that have something in common
[Diagram: Program 1 and Program 2 each go through UniFi Inference, giving Results 1 and Results 2. UniFi Diff compares the two results and produces Diffs, which are viewed in the UniFi GUI.]
Use Cases
• Report changes as the same code evolves
  – Nightly builds
  – During program maintenance
• Compare against a different configuration
  – Different programs using the same library
  – 2 different implementations of an interface
  – Implementation of a library vs. program using it
  – Different programmers’ code
Inference Algorithm
• Input: Java program
• Assigns dimensions to variables
  – Initially independent
• Set up constraints between dim. vars
• Solve constraints
• Output: a set of relations between dimension
  variables
Inference Example (1)
Statement        Relation between dimension variables
x = y + z        x, y, z unified
a < b            a, b unified
d[i]             i, d.length unified
u = v * w        u = v * w (multiply constraint)
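The unification constraints in this example can be sketched with a union-find structure. This is an illustrative assumption about the data structure; the class `DimUnifier` and its methods are hypothetical names, not UniFi's API, and the multiply constraint is left out here.

```java
import java.util.HashMap;
import java.util.Map;

/** Union-find over dimension-variable names; each set is one inferred dimension. */
final class DimUnifier {
    private final Map<String, String> parent = new HashMap<>();

    /** Returns the representative of v's equivalence class, compressing the path. */
    String find(String v) {
        parent.putIfAbsent(v, v);
        String p = parent.get(v);
        if (p.equals(v)) return v;
        String root = find(p);
        parent.put(v, root);
        return root;
    }

    /** Records the constraint "a and b have the same dimension". */
    void unify(String a, String b) {
        String ra = find(a), rb = find(b);
        if (!ra.equals(rb)) parent.put(ra, rb);
    }

    boolean same(String a, String b) { return find(a).equals(find(b)); }

    public static void main(String[] args) {
        DimUnifier u = new DimUnifier();
        u.unify("x", "y");          // x = y + z: operands and result agree
        u.unify("y", "z");
        u.unify("a", "b");          // a < b: compared values agree
        u.unify("i", "d.length");   // d[i]: index agrees with array length
        System.out.println(u.same("x", "z")); // true
        System.out.println(u.same("x", "a")); // false
    }
}
```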
Inference Example (2)
  int f(int x) { return x * x; }

  a1 = f(a);         a1 = a * a


  b1 = f(b);         b1 = b * b


• Context sensitive analysis
  – Uses method summaries
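One way to picture a method summary, as a sketch only: the return dimension of `f` is stored as exponents over its formal parameters and instantiated separately at each call site, so `a` and `b` are never forced into one dimension. `MethodSummary` and `instantiate` are hypothetical names, and the sketch assumes every formal has an actual.

```java
import java.util.Map;
import java.util.TreeMap;

final class MethodSummary {
    /** Return dimension as exponents over formal parameter names, e.g. {x=2} for x * x. */
    final Map<String, Integer> returnDim;

    MethodSummary(Map<String, Integer> returnDim) { this.returnDim = returnDim; }

    /** Instantiates the summary at one call site by substituting actuals for formals. */
    Map<String, Integer> instantiate(Map<String, String> formalToActual) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, Integer> e : returnDim.entrySet()) {
            String actual = formalToActual.get(e.getKey()); // assumed to be present
            result.merge(actual, e.getValue(), Integer::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        MethodSummary f = new MethodSummary(Map.of("x", 2)); // int f(int x) { return x * x; }
        System.out.println(f.instantiate(Map.of("x", "a")));  // dimension of a1
        System.out.println(f.instantiate(Map.of("x", "b")));  // dimension of b1
    }
}
```

Without per-call-site instantiation, both calls would unify `a` and `b` with the single formal `x`, which is the loss of precision that context sensitivity avoids.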
OO Constraints
• Subtypes retain supertype interface
  – Liskov Substitution Principle
• Constrains dimensions of parameters and
  return value of subtype methods

  class A           { int m( int x ) { … } }
  class B extends A { int m( int x ) { … } }
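A small illustration of why the override must share dimensions with the method it overrides: one call site can dispatch to either implementation, so the argument passed there reaches both parameters. The classes `Shape` and `Square` and the parameter name `lengthCm` are made up for this sketch.

```java
final class OverrideDemo {
    /** One call site that can dispatch to either implementation. */
    static int callThroughSupertype(Shape receiver, int lengthCm) {
        return receiver.scaled(lengthCm);
    }

    public static void main(String[] args) {
        // The same argument flows into Shape.scaled and Square.scaled, so the
        // dimensions of the two parameters (and return values) must agree.
        System.out.println(callThroughSupertype(new Shape(), 5));
        System.out.println(callThroughSupertype(new Square(), 5));
    }
}

/** Hypothetical supertype; the parameter is meant to be a length. */
class Shape {
    int scaled(int lengthCm) { return lengthCm * 2; }
}

/** Hypothetical subtype overriding the same method. */
class Square extends Shape {
    @Override
    int scaled(int lengthCm) { return lengthCm * 3; }
}
```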
Multiply/Divide Constraints
• Linear equation style expressions for multiply
  and divide
  – Special handling of java.math libraries
• Solved using Gaussian elimination style
  algorithm
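A sketch of the linear-equation view, under the assumption that each multiply/divide constraint becomes a row of integer exponents summing to zero (u = v * w becomes u - v - w = 0). This is generic fraction-free Gaussian elimination, not UniFi's actual solver; `MulDivSolver`, `rank` and `implied` are hypothetical names.

```java
import java.util.Arrays;

final class MulDivSolver {
    /** Rank of the constraint matrix, using fraction-free row elimination. */
    static int rank(long[][] input) {
        long[][] m = new long[input.length][];
        for (int i = 0; i < input.length; i++) m[i] = input[i].clone();
        int cols = m.length == 0 ? 0 : m[0].length;
        int rank = 0;
        for (int col = 0; col < cols && rank < m.length; col++) {
            int pivot = -1;
            for (int r = rank; r < m.length; r++) {
                if (m[r][col] != 0) { pivot = r; break; }
            }
            if (pivot < 0) continue; // no pivot in this column
            long[] tmp = m[pivot]; m[pivot] = m[rank]; m[rank] = tmp;
            for (int r = rank + 1; r < m.length; r++) {
                long f = m[r][col];
                if (f == 0) continue;
                long p = m[rank][col];
                // Cross-multiply so entries stay integral.
                for (int c = col; c < cols; c++) m[r][c] = m[r][c] * p - m[rank][c] * f;
            }
            rank++;
        }
        return rank;
    }

    /** A candidate relation follows from the constraints if adding it keeps the rank. */
    static boolean implied(long[][] constraints, long[] candidate) {
        long[][] extended = Arrays.copyOf(constraints, constraints.length + 1);
        extended[constraints.length] = candidate;
        return rank(extended) == rank(constraints);
    }

    public static void main(String[] args) {
        // Columns: u, v, w. Rows: u = v * w and u = v * v.
        long[][] constraints = { {1, -1, -1}, {1, -2, 0} };
        System.out.println(implied(constraints, new long[] {0, 1, -1})); // v = w: true
        System.out.println(implied(constraints, new long[] {1, 0, -1})); // u = w: false
    }
}
```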
Comparing Inferred Dimensions
• Identify common variables
  – Same name of field, position of method param,
    etc.
• Compare equivalence classes formed by
  unification constraints
• Compare Multiply-divide constraints
  – Need canonical formulas for dvars
  – Make common variables more “stable” than
    others
  – See paper for details
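The comparison of equivalence classes can be sketched as follows, assuming each build's result is a map from variable name to an arbitrary class id. Only variables present in both builds are compared. `DimDiff`, the report strings and the class ids are assumptions for illustration; the variable names are borrowed from the bug example later in the deck. Multiply/divide constraints are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

final class DimDiff {
    /** Reports pairs of common variables whose "same dimension" status changed. */
    static List<String> diff(Map<String, Integer> before, Map<String, Integer> after) {
        List<String> common = new ArrayList<>(new TreeSet<>(before.keySet()));
        common.retainAll(after.keySet());
        List<String> reports = new ArrayList<>();
        for (int i = 0; i < common.size(); i++) {
            for (int j = i + 1; j < common.size(); j++) {
                String a = common.get(i), b = common.get(j);
                boolean wasSame = before.get(a).equals(before.get(b));
                boolean isSame = after.get(a).equals(after.get(b));
                if (!wasSame && isSame) reports.add("merged: " + a + ", " + b);
                if (wasSame && !isSame) reports.add("split: " + a + ", " + b);
            }
        }
        return reports;
    }

    public static void main(String[] args) {
        Map<String, Integer> before = Map.of("NO_CLASS", 1, "NO_CLASS_SCORE", 2, "vClass", 1);
        Map<String, Integer> after =
            Map.of("NO_CLASS", 1, "NO_CLASS_SCORE", 1, "vClass", 1, "newVar", 3);
        diff(before, after).forEach(System.out::println);
    }
}
```

The pairwise loop is quadratic in the number of common variables, which is fine for a sketch but would need a smarter grouping for large programs.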
Case Study: bddbddb
  http://sourceforge.net/projects/bddbddb

• Retroactively run over 10 months of active
  development
  – Oct. 2004 to July 2005, 292 builds
  – Approx. 19,000 lines of Java code


• Compared successive nightly builds
Results
• 26 reports, across 19 pairs of builds
• 5 real errors (+ fixes)

• False Positives
  – Trivial reasons like field not used
  – Probably easy to reduce number
Bug Example


  double   NO_CLASS = …; // default class id
  double   NO_CLASS_SCORE = …; // default score
  …
  double   vScore=NO_CLASS, aScore=NO_CLASS;
  double   vClass=NO_CLASS, aClass=NO_CLASS;




• UniFi detected that independent dimensions
  NO_CLASS and NO_CLASS_SCORE merged
Inference Example
double[] distribution = new double[numClasses];
... // compute sum
... // initialize distribution array

for (int i=0; i < NUM_TREES; ++i)
    distribution[i] /= sum;




Unified: numClasses, distribution.length, i, NUM_TREES
However: not caught, since this was in new code!
Experiences
• Sometimes bugs indicated by removal of
  unification constraint (“error of omission”)

• Dimensionally inconsistent code
  – Ignore hashCode(), compareTo()
  – Cannot interpret semantically
Experiences
• Types of Errors: Sometimes can be difficult to
  root-cause

• Dimensions vs. Units
  – May not catch wrong scaling factor…
  …but might catch the absence of one (?)
Future Work
• Explore use-cases for UniFi in the wild

• “S.I. Units” for platform libraries
   – Using JSR-308 for Java with understandable names


• An intriguing possibility: Dimension inference
  for hardware languages like Verilog
Related Work
•   Osprey (Jiang and Su, ICSE '06)
•   XeLda (Antoniu et al, ICSE '04)
•   Type qualifiers (Foster et al, PLDI '99)
•   Lackwit (O’Callahan and Jackson, ICSE '97)
•   Fortress (Allen et al, OOPSLA '04)
Conclusions
• UniFi is the first dimension inference system
  for standard Java programs
  – for automatically detecting bugs
  – for bootstrapping use of dimensions via libraries
  – Many uses waiting to be explored
Open sourced and available from:
http://suif.stanford.edu/unifi
Users and collaborators welcome
Backup slides
Bug Example
double[] distribution = new double[numClasses];
... // compute sum and initialize
... // distribution array

for (int i=0; i < NUM_TREES; ++i)
  distribution[i] /= sum;




Unified: numClasses, distribution.length, i, NUM_TREES
However: not caught, since this was in new code!
Dimension Variables
Assign dimension variables (dvars) to
•   Fields
•   Interfaces: Method Parameters, Return values
•   Array elements and lengths
•   Local Variables
•   Constants
•   Result of Multiply/Divide Operations
•   Primitive types only
Mechanics
• Bytecode based static analysis

• Scripts to monitor a CVS/SVN repository and
  generate diffs

• GUI to view inference results, correlated with
  unification points in source code.

Editor's Notes

  1. Dimensionality checking is a simple way of checking physics equations for consistency. Even if Prof. Einstein came up and told you that energy = mass times the velocity of light, you could tell him he was wrong because the independent physical dimensions on both sides don’t match up. In software parlance, you could say it doesn’t “type check”.
  2. Now programs operate on values which have dimensions not just in the scientific or physical sense. Regular applications manipulate values with dimensions like employee ID, network port, calendar year, a filename, a hostname, a street address and so on. In our work, we focus on primitive types and strings, and I argue that many of the actual values a program computes with are of these types; a lot of the rest is scaffolding to hold these values together. For example, your database is composed of values of these types.
  3. Most of these variables have their own space of values, and that’s what the colours are intended to represent.
  4. I’ll be using color as a proxy to delineate different dimensions throughout this presentation.
  5. How do we get the benefit of dimension checking in mainstream languages, without special languages or programmer annotations, and for, say, IT applications and not scientific code?
  6. Instantly adaptable to Java.
  7. Could be that program 1 is wrong or program 2 is wrong. Or could be that they’re both fine. What’s a notion of dimensions and inference algorithm that will capture as many real bugs as possible?
  8. An example of a case where this might work: E = m * c^2 yesterday, E = m * c today. Should note that we’ve not explored the second area much.
  9. Notice that we don’t necessarily have human-understandable names. Unification constraints are based on assignment, comparison, add/subtract, array indexing (implicit comparison with length), and method invocation.
  10. Our early examples lost precision due to not having context sensitivity.
  11. Add example.
  12. GUI to view inference results. We also have scripts to monitor a CVS/SVN repository, run UniFi at regular intervals and diff the results of successive runs.
  13. When we started, we weren’t sure what to expect. Would these equivalence classes be merging all the time due to legal program changes? Would interesting errors show up as changes in dimensional relationships at all?
  14. Like mass, length and time in physics, it would be wonderful if a platform came with a set of base units.
  15. UniFi is the first system for Java/OO features and is completely automatic.
  16. Please use it, play around with it, give us feedback, extend it, build upon it, do whatever you want.