This document discusses ontology type inference and the type inference workflow. It describes how context-sensitive constraint variables are generated by analyzing method calls. It also explains how less conservative answers can be obtained while maintaining type safety by generating preference constraints that prefer specific ontology types and using a weighted MAX SAT solver. The document provides an overview of other recent progress including backward dataflow analysis and infrastructure improvements.
2. Annotated result for corpus (.jaif file)
class CylinderShape:
method getLocalBounds()
Method.parameter 1:
@ontology.qual.Ontology(values={ontology.qual.OntologyValue.POSITION_3D})
Our part in the big picture
• Ontology propagation in corpus
[Diagram: the corpus and an ontology -> annotation mapping file go into Type Inference, which produces the refined corpus]
3. Outline
• How ontology type inference works so far
• Workflow overview
• Constraints graph generation
• Constraints encoding & solving
• Other progress
• Infrastructure improvement
• Backward dataflow analysis
• Next steps
5. Meaning of Ontology types
• How to interpret @Ontology({SEQUENCE, FORCE})?
• 1. Possible info: the value is SEQ or FORCE, but nothing else
• 2. Precise info: the value is SEQ and FORCE, and possibly others as well
• We want precise information!
[Figure: SEQUENCE, FORCE, and VELOCITY, captioned with interpretation 1 (possible info: SEQ or FORCE, but not others)]
6. Ontology type lattice
• Power set lattice: 2^n values for n ontologies
• TOP is the empty set
• A non-top value is a set of precise ontologies
• (e.g. must be SEQ and VEL)
• BOTTOM is the conjunction of all values
• (must be SEQ and POS and VEL ...)
TOP
POS SEQ VEL
SEQ,POS SEQ,VEL POS,VEL
SEQ,POS,VEL
...
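The lattice above can be sketched in a few lines of Python. This is only an illustration, assuming that each type is the set of ontologies a value must have, so a subtype is a superset and the least upper bound keeps only the shared ontologies. The three-element universe and the function names are hypothetical and not taken from the actual checker.

```python
from itertools import combinations

# Hypothetical universe of ontologies, for illustration only.
ONTOLOGIES = frozenset({"SEQ", "POS", "VEL"})

TOP = frozenset()       # empty set: no precise information
BOTTOM = ONTOLOGIES     # conjunction of every ontology


def all_types(universe=ONTOLOGIES):
    """Enumerate the 2^n lattice elements as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_subtype(sub, sup):
    """sub <: sup when sub guarantees at least everything sup guarantees."""
    return sub >= sup


def lub(a, b):
    """Least upper bound: keep only the ontologies both sides guarantee."""
    return a & b
```

Under this reading, the LUB of {SEQ, POS} and {SEQ, VEL} is {SEQ}, and the LUB of {POS} and {VEL} is TOP.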
7. Two challenges
• Looking up the declaration alone is not enough
• How to generate context-sensitive constraint variables?
• TOP is always a valid answer, but not a useful one
• How to give a less conservative answer while keeping type safety?
[Figure: the ontology type lattice from slide 6]
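One possible reading of the first challenge is that each method call gets its own constraint variables instead of sharing the ones attached to the declaration. The Python sketch below illustrates only that idea. The names `VarFactory` and `constraints_for_call`, and the representation of a subtype constraint as a `(subtype, supertype)` pair, are assumptions made for this example and do not describe the actual inference implementation.

```python
from itertools import count


class VarFactory:
    """Hands out uniquely numbered constraint variable names."""

    def __init__(self):
        self._ids = count(1)

    def fresh(self, label):
        return f"VAR{next(self._ids)}:{label}"


def constraints_for_call(factory, method, arg_vars, result_var):
    """Create call-site-specific variables for `method`.

    Returns the fresh parameter variables, the fresh return variable,
    and subtype constraints as (subtype, supertype) pairs.
    """
    params = [factory.fresh(f"{method}.param{i}") for i in range(len(arg_vars))]
    ret = factory.fresh(f"{method}.return")
    # Each argument must be a subtype of the parameter it is passed to.
    constraints = list(zip(arg_vars, params))
    # The returned value must be a subtype of the assignment target.
    constraints.append((ret, result_var))
    return params, ret, constraints


factory = VarFactory()
params_a, ret_a, cons_a = constraints_for_call(factory, "id", ["a"], "x")
params_b, ret_b, cons_b = constraints_for_call(factory, "id", ["b"], "y")
```

Because the two calls to the hypothetical method `id` use disjoint variables, a SEQ argument at one call site does not force the other call site to be SEQ as well.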
13. Two-type lattice & preference to bottom
• Two-type lattices and corresponding constraints graph
• Current lattice: TOP above SEQ (POS and VEL appear as further two-type lattices)
• Constraints graph: the part reachable from SEQ (VAR 1 to VAR 6)
• Preference constraints
• For each supertype in subtype constraints, add a preference to be SEQ
• Encoding for a weighted MAX SAT solver
• Merge solutions by LUB to get a less conservative answer
[Figure: the full lattice (TOP; POS, SEQ, VEL; SEQ,POS; SEQ,VEL; POS,VEL; SEQ,POS,VEL; ...) next to the current two-type lattice and the SEQ-reachable constraints graph]
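The preference idea for a single two-type lattice (TOP above SEQ) can be sketched as a tiny weighted MAX SAT problem. This is a minimal illustration under stated assumptions: each variable is a boolean (True means SEQ, False means TOP), a subtype constraint `sub <: sup` becomes the hard clause "sup is SEQ implies sub is SEQ", and each supertype gets a soft unit clause of weight 1 preferring SEQ. The brute-force search stands in for a real solver, and all names, weights, and the example constraints are hypothetical.

```python
from itertools import product


def holds(clause, model):
    """A clause is a list of (variable, polarity) literals."""
    return any(model[var] == polarity for var, polarity in clause)


def solve_weighted_max_sat(variables, hard, soft):
    """Brute force: satisfy all hard clauses, maximize soft clause weight."""
    best_model, best_score = None, -1
    for bits in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, bits))
        if not all(holds(clause, model) for clause in hard):
            continue
        score = sum(weight for weight, clause in soft if holds(clause, model))
        if score > best_score:
            best_model, best_score = model, score
    return best_model, best_score


def encode(subtypes, known_seq=(), known_top=()):
    """Encode (sub, sup) subtype pairs for the TOP/SEQ lattice."""
    # sub <: sup is violated only when sup is SEQ and sub is TOP.
    hard = [[(sup, False), (sub, True)] for sub, sup in subtypes]
    hard += [[(var, True)] for var in known_seq]
    hard += [[(var, False)] for var in known_top]
    # Preference: every supertype would rather be SEQ (weight 1).
    soft = [(1, [(sup, True)]) for _, sup in subtypes]
    return hard, soft


variables = ["VAR1", "VAR2", "VAR3", "VAR4", "VAR5"]
subtypes = [("VAR1", "VAR2"), ("VAR2", "VAR3"), ("VAR4", "VAR5")]
hard, soft = encode(subtypes, known_seq=["VAR1"], known_top=["VAR4"])
model, score = solve_weighted_max_sat(variables, hard, soft)
```

In this example VAR2 and VAR3 are pushed down to SEQ because they are reachable from the SEQ source VAR1, while VAR5 stays TOP because its only subtype VAR4 is TOP. Without the soft clauses, assigning TOP to everything except VAR1 would also satisfy the hard constraints.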
14. Other progress
• Infrastructure improvement
• Support more language features
• Make CF and CFI work on bigger and bigger projects!
Project name          Inferred annotations   Lines of code
java-callgraph        45                     726
logback-extensions    548                    1308
cal10n                921                    5781
jReactPhysics3D       5347                   15502
ode4j                 25286                  91683
Statistics for the dataflow type system on real projects
15. Other progress
[Figure: control flow graph with <entry>, a then/else branch that assigns str = … on each side, the statement File file = new File(str);, and <exit>]
• Backward dataflow analysis
• Extends the dataflow framework in CF
• Can now do both forward and backward analysis
• Supports more kinds of analysis, e.g. live variable analysis
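As an illustration of a backward analysis, here is a minimal live variable analysis in Python over a small control flow graph shaped like the one on this slide. It is a generic textbook formulation, not the Checker Framework dataflow API. The block names and the use/def sets are assumptions chosen for the example.

```python
def live_variables(blocks, successors):
    """Backward fixpoint for live variables.

    blocks maps a block name to (use, define) sets;
    successors maps a block name to its successor block names.
    """
    live_in = {name: set() for name in blocks}
    live_out = {name: set() for name in blocks}
    changed = True
    while changed:
        changed = False
        for name, (use, define) in blocks.items():
            # Information flows backward: out is the union of successors' in.
            out = set()
            for succ in successors.get(name, []):
                out |= live_in[succ]
            new_in = use | (out - define)
            if out != live_out[name] or new_in != live_in[name]:
                live_out[name], live_in[name] = out, new_in
                changed = True
    return live_in, live_out


# Hypothetical CFG: both branches assign str, then the File is created.
blocks = {
    "entry": (set(), set()),
    "then": (set(), {"str"}),
    "else": (set(), {"str"}),
    "new_file": ({"str"}, {"file"}),   # File file = new File(str);
    "exit": (set(), set()),
}
successors = {
    "entry": ["then", "else"],
    "then": ["new_file"],
    "else": ["new_file"],
    "new_file": ["exit"],
}
live_in, live_out = live_variables(blocks, successors)
```

Here `str` is live on exit from both branches because the `new File(str)` statement reads it, and nothing is live at entry because each branch assigns `str` before it is used.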
16. What happened since May
• New features
• Context-sensitive type inference
• Customize preference on certain solutions
• Backward dataflow analysis
• Backend improvement
• New solver backend – Lingeling
• Constraint graph improvement – parallel solving
• Infrastructure improvement
• Checker Framework
• Checker Framework Inference
17. Next steps
• Design inter-dependent ontology lattice
• e.g. SEQUENCE and SEQ_LENGTH
• Viewpoint adaptation
• Formalize and infer context-sensitive methods
• Improve LogicBlox solver
• Support for preference constraints
• Scale-up of supported language features and program sizes