Existing reverse-engineering tools use algorithms based on vague and verbose definitions of UML constituents to recover class diagrams from source code. Thus, reverse-engineered class diagrams are neither abstract nor precise representations of source code and are of little interest to software engineers. We propose an exhaustive study of class diagram constituents with respect to their recovery from C++, Java, and Smalltalk source code. We then suggest a road-map to abstract and precise reverse-engineering. We exemplify our study by developing a tool that reverse-engineers Java programs into UML class diagrams abstractly and precisely. Such a reverse-engineering tool produces class diagrams that help software engineers better understand programs.
2. 2/25
Context
■ UML is a standard
– Used in industry
– Taught in universities
■ UML is not a standard
– Imprecise
– Heavy
3. 3/25
Problem
■ But “lost” class diagrams
– Missing, out-of-date
■ Maintenance > 50% of cost
– Program understanding > 50% of time
• Developers
• Maintainers
• Architecture
• Behaviour
• Design choices
• Implementation
4. 4/25
Problem (cont’d)
■ Lack of automated recovery tools
– Class diagrams
• Abstract
– Concentrate the essential qualities of anything more extensive or of several things (≠ graphical representations of source code)
• Precise
– Extent to which a given measurement agrees with a standard value (original and/or expected class diagrams)
6. 5/25
Example
public class A { ... }
public class Example1 {
    private A[] listOfAs = new A[10];
    private int numberOfAs = 0;

    public void addA(final A a) {
        this.listOfAs[numberOfAs++] = a;
    }
    public A getA(final int index) {
        return this.listOfAs[index];
    }
    public void removeA(final A a) {
        // ...
    }
    public static void main(final String[] args) {
        final Example1 example1 = new Example1();
        example1.addA(new A());
        // ...
    }
}
9. 6/25
Example (cont’d)
import java.util.ArrayList;
import java.util.List;

public class A { ... }
public class Example2 {
    private List listOfAs = new ArrayList();

    public void addA(final A a) {
        this.listOfAs.add(a);
    }
    public A getA(final int index) {
        return (A) this.listOfAs.remove(index);
    }
    public void removeA(final A a) {
        this.listOfAs.remove(a);
    }
    public static void main(final String[] args) {
        final Example2 example2 = new Example2();
        example2.addA(new A());
        // ...
    }
}
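As a rough illustration of why these two examples are hard for declaration-based recovery, the sketch below uses Java reflection to list the type that a tool looking only at field declarations would link each class to. `ArrayHolder`, `ListHolder`, and `NaiveFieldRecovery` are hypothetical names introduced here (the holders mirror the `listOfAs` fields of Example1 and Example2); this is an assumption-laden sketch, not how any particular tool is implemented.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for class A on the slides.
class A { }

// Mirrors the typed array field of Example1.
class ArrayHolder {
    private A[] listOfAs = new A[10];
}

// Mirrors the raw List field of Example2.
class ListHolder {
    private List listOfAs = new ArrayList();
}

class NaiveFieldRecovery {
    /** Maps each declared field to the simple name of the type its declaration points at. */
    static Map<String, String> targets(final Class<?> owner) {
        final Map<String, String> result = new LinkedHashMap<>();
        for (final Field field : owner.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // ignore compiler- or tool-generated fields
            }
            final Class<?> type = field.getType();
            // For arrays, the element type is visible in the declaration itself.
            final Class<?> target = type.isArray() ? type.getComponentType() : type;
            result.put(field.getName(), target.getSimpleName());
        }
        return result;
    }

    public static void main(final String[] args) {
        System.out.println(targets(ArrayHolder.class)); // prints {listOfAs=A}
        System.out.println(targets(ListHolder.class));  // prints {listOfAs=List}
    }
}
```

The array field exposes `A` directly, while the raw `List` field only exposes `List`; recovering the link to `A` in the second case needs further analysis of how the field is used.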
16. 8/25
Solution
■ Systematic study of class diagram constituents for their abstract and precise recovery
– 34 constituents (UML meta-model)
– C++, Java, and Smalltalk
– Prototype tool
17. 9/25
Related Work
■ Number of constituents of class diagrams handled by tools

Tool                 Recovered UML constituents   In %
ArgoUML              7/34                         21
Chava                7/34                         21
Fujaba               7/34                         21
IDEA                 16/34                        48
Rational Rose        8/34                         24
Borland Together/J   8/34                         24
Womble               9/34                         28
18. 10/25
Hypotheses
■ Class-based programming languages
■ Class diagrams document design during development and maintenance
■ Context-dependent definitions of the constituents (⇒ incrementality)
19. 11/25
Systematic Study
■ Classifier features
– Attribute ✓ ✓
– Method ✓ ✓
– Operations ✓ ✗
• Abstract public methods
• Overloaded methods
• (Comments, documentation)
(Marks, in order: Abstract, Precise)
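The distinction between operations and methods can be sketched with reflection. The block below assumes, following the bullets above, that abstract public methods are treated as candidate operations and concrete methods as UML methods; `Shape` and `OperationSketch` are hypothetical names, and the rule is a simplification for illustration rather than the study's actual definition.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical example class with one abstract and one concrete public method.
abstract class Shape {
    public abstract double area();

    public String describe() {
        return "area=" + area();
    }
}

class OperationSketch {
    /** Names of abstract public methods, taken here as candidate operations. */
    static Set<String> candidateOperations(final Class<?> type) {
        final Set<String> names = new TreeSet<>();
        for (final Method method : type.getDeclaredMethods()) {
            final int modifiers = method.getModifiers();
            if (!method.isSynthetic()
                    && Modifier.isPublic(modifiers)
                    && Modifier.isAbstract(modifiers)) {
                names.add(method.getName());
            }
        }
        return names;
    }

    /** Names of declared methods that carry an implementation. */
    static Set<String> concreteMethods(final Class<?> type) {
        final Set<String> names = new TreeSet<>();
        for (final Method method : type.getDeclaredMethods()) {
            if (!method.isSynthetic() && !Modifier.isAbstract(method.getModifiers())) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(final String[] args) {
        System.out.println(candidateOperations(Shape.class)); // prints [area]
        System.out.println(concreteMethods(Shape.class));     // prints [describe]
    }
}
```

Because the sets hold names only, overloaded methods collapse into a single entry, which is one simple reading of the "Overloaded methods" bullet.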
20. 12/25
Systematic Study
■ Classifier relationships
– Binary association ✓ ✓
– Association end N/A
– Multiplicity ✓ ✗
– Qualifier ✓ ✗
– Association class ✗ ✗
– N-ary association ✗ ✗
– Aggregation ✓ ✓
– Composition ✓ ✗
– Generalisation ✗ ✓
– Dependency ✗ ✗
(Marks, in order: Abstract, Precise)
21. 13/25
Systematic Study
■ Classifiers
– Class ✓ ✓
– Nested class ✗ ✓
– Type ✓ ✓
– Implementation class ✓ ✓
– Interface ✓ ✓
– Parameterised class ✓ ✓
– Bound Element ✓ ✓
– Metaclass ✓ ✓
– Powertype ✗ ✗
– Data Type ✓ ✓
– Enumeration ✓ ✓
– Utility class ✓ ✓
(Marks, in order: Abstract, Precise)

                       Relationships        Relationships
                       to other             from other
Classifier             classifiers          classifiers          Attributes   Operations   Methods
Class                  ✓                    ✓                    ✓            ✓            ✓
Implementation Class   ✓                    ✗                    ✓            ✗            ✗
Interface              ✗                    ✓                    ✗            ✗            ✓
Type                   ✓                    ✓                    ✗            ✓            ✓
22. 14/25
Systematic Study
■ Miscellaneous
– Stereotype ✓ ✗
– Class pathname ✓ ✓
– Importing packages ✓ ✓
– Object N/A
– Composite object N/A
– Link N/A
– Instance of ✓ ✓
– Derived element
– List compartment N/A
– Name compartment N/A
(Marks, in order: Abstract, Precise)
24. 16/25
Prototype Tool
■ Ptidej
– Pattern Trace Identification, Detection, and Enhancement in Java
• Also “breakfast” in French slang :-)
– Open framework for high-level program analyses, visualisation, pattern detection
• More at www.ptidej.net
– Executable (.exe, .jar)
– Source code
25. 17/25
Prototype Tool (cont’d)
■ Ptidej
[Figure: architecture diagram. Labels: Java, C++, Dedicated Parsers, Model, PADL Meta-Model, Transformations based on the Systematic Study, Abstract and Precise Model]
26. 18/25
Prototype Tool (cont’d)
■ Ptidej

Tool                 Recovered UML constituents   In %
ArgoUML              7/34                         21
Chava                7/34                         21
Fujaba               7/34                         21
IDEA                 16/34                        48
Ptidej               21/34                        62
Rational Rose        8/34                         24
Borland Together/J   8/34                         24
Womble               9/34                         28
27. 19/25
Prototype Tool (cont’d)
■ Ptidej
– Association, aggregation, and composition binary class relationships
• Message sends
• Invocation sites
– Simple fields
– Arrays, collections
• Lifetime and exclusivity properties
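One plausible way to combine the properties listed above into a decision is sketched below. The rules are assumptions made for illustration (any invocation counts as an association, an invocation through a stored field or collection as an aggregation, and an aggregation with both exclusivity and lifetime binding as a composition); `RelationshipSketch`, `Site`, and `Kind` are hypothetical names, and this is not Ptidej's actual algorithm.

```java
class RelationshipSketch {
    /** Kinds of binary class relationship named on the slide. */
    enum Kind { NONE, ASSOCIATION, AGGREGATION, COMPOSITION }

    /** Where the message send to the other class originates (assumed categories). */
    enum Site { NONE, PARAMETER_OR_LOCAL, FIELD, ARRAY_OR_COLLECTION }

    /**
     * Classifies a relationship from the invocation site and the
     * exclusivity and lifetime properties of the referenced instances.
     */
    static Kind classify(final Site site, final boolean exclusive, final boolean lifetimeBound) {
        if (site == Site.NONE) {
            return Kind.NONE; // no message send, no relationship in this sketch
        }
        final boolean stored = site == Site.FIELD || site == Site.ARRAY_OR_COLLECTION;
        if (!stored) {
            return Kind.ASSOCIATION; // invoked through a parameter or local only
        }
        // Stored references: composition only if both extra properties hold.
        return (exclusive && lifetimeBound) ? Kind.COMPOSITION : Kind.AGGREGATION;
    }

    public static void main(final String[] args) {
        System.out.println(classify(Site.PARAMETER_OR_LOCAL, false, false)); // prints ASSOCIATION
        System.out.println(classify(Site.ARRAY_OR_COLLECTION, false, true)); // prints AGGREGATION
        System.out.println(classify(Site.FIELD, true, true));                // prints COMPOSITION
    }
}
```

Keeping the properties as explicit inputs separates the static or dynamic analysis that computes them from the rule that names the relationship.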
29. 21/25
Example revisited
public class A { ... }
public class Example1 {
    private A[] listOfAs = new A[10];
    private int numberOfAs = 0;

    public void addA(final A a) {
        this.listOfAs[numberOfAs++] = a;
    }
    public A getA(final int index) {
        return this.listOfAs[index];
    }
    public void removeA(final A a) {
        // ...
    }
    public static void main(final String[] args) {
        final Example1 example1 = new Example1();
        example1.addA(new A());
        // ...
    }
}
31. 22/25
Example revisited (cont’d)
import java.util.ArrayList;
import java.util.List;

public class A { ... }
public class Example2 {
    private List listOfAs = new ArrayList();

    public void addA(final A a) {
        this.listOfAs.add(a);
    }
    public A getA(final int index) {
        return (A) this.listOfAs.remove(index);
    }
    public void removeA(final A a) {
        this.listOfAs.remove(a);
    }
    public static void main(final String[] args) {
        final Example2 example2 = new Example2();
        example2.addA(new A());
        // ...
    }
}
36. 24/25
Conclusion
■ Exhaustive and systematic study of UML class diagram constituents
■ Prototype tool implementing algorithms for abstract and precise recovery
■ Recovery of more constituents than any other tool
37. 25/25
Future Work
■ UML v2.0
■ Other programming languages
– Eiffel
■ Other sources of information
– Comments, names
– Documentation
■ Other diagrams
– Sequence diagrams + Integration
■ Base(s) of comparison