Slide 2/20
Problems and Summary
■ Our approach
– Builds higher-level abstractions through feature identification and comparison
– Is an alternative to source code browsing (i.e., design recovery)
■ Source code browsing is expensive
– Program comprehension, maintenance (as highlighted in yesterday's session on Program Comprehension)
Slide 3/20
Feature Identification and Comparison in a Nutshell
■ Static and dynamic analyses
■ One functionality
– Two scenarios
■ Large C++ multi-threaded program
■ Features (micro-architectures)
Slide 5/20
1. Program Model Creation
■ Static analysis of C++ source code
– Strict analysis
• Classes are only syntactic
• Structures skipped
■ AOL representation
■ PADL meta-model
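The PADL meta-model itself is not shown on the slide; purely as an illustration, a program model produced by static analysis can be pictured as classes, their methods, and inter-class relationships. All names below are hypothetical stand-ins, not PADL's actual API:

```python
from dataclasses import dataclass, field

# Hypothetical, much-simplified stand-in for a PADL-like program model:
# classes, their methods, and directed relationships between classes.
@dataclass(frozen=True)
class Method:
    klass: str
    name: str

@dataclass
class ProgramModel:
    classes: set = field(default_factory=set)
    methods: set = field(default_factory=set)    # set of Method
    relations: set = field(default_factory=set)  # (source class, target class)

    def add_method(self, klass, name):
        self.classes.add(klass)
        self.methods.add(Method(klass, name))

    def relate(self, source, target):
        self.relations.add((source, target))

# Illustrative use with made-up class names:
model = ProgramModel()
model.add_method("Browser", "loadURL")
model.add_method("BookmarkStore", "save")
model.relate("Browser", "BookmarkStore")
```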
Slide 6/20
2. Feature Identification (1/6)
■ A functionality, two scenarios ⇒ two traces
■ Comparison of the two traces to identify the feature related to the functionality
■ Dynamic analysis
– Traces = sequences of intervals
– Intervals = sequences of events
– Events are (ir)relevant to the feature
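The trace structure above can be sketched directly: a trace is a sequence of intervals, and an interval a sequence of events. Here an event is simply the name of a called method; all method names are illustrative:

```python
# A trace is a sequence of intervals; an interval is a sequence of events.
# An event here is just the name of the method entered (illustrative names).
trace_visit = [                       # scenario 1: visit a bookmarked URL
    ["Browser::loadURL", "Net::fetch"],
    ["Renderer::paint"],
]
trace_visit_and_save = [              # scenario 2: scenario 1 + save the URL
    ["Browser::loadURL", "Net::fetch"],
    ["BookmarkStore::save", "Disk::write"],
    ["Renderer::paint"],
]

def events_of(trace):
    """All events occurring anywhere in a trace, as a set."""
    return {event for interval in trace for event in interval}
```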
Slide 7/20
2. Feature Identification (2/6)
■ Without noise, set operations suffice
■ With noise
– Imprecise locations of events
– Imprecise beginnings/ends of intervals
because
– the C++ program is multi-threaded or distributed
– statistical profiling is imprecise
⇒ feature-relevant events are tangled or lost
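In the noise-free case, the comparison reduces to plain set operations on the events of the two traces: what scenario 2 executes and scenario 1 does not is a first approximation of the feature. A minimal sketch, with made-up method names:

```python
# Noise-free case: feature-relevant events are just the set difference
# between the events of the two traces (illustrative names).
events_scenario1 = {"Browser::loadURL", "Net::fetch", "Renderer::paint"}
events_scenario2 = {"Browser::loadURL", "Net::fetch", "Renderer::paint",
                    "BookmarkStore::save", "Disk::write"}

feature_events = events_scenario2 - events_scenario1

# With noise (tangled or lost events), this set difference is no longer
# reliable, which motivates the probabilistic ranking of the next slides.
```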
Slide 9/20
2. Feature Identification (4/6)
■ Probabilistic ranking
– Scenarios (not) exercising a feature
– Intervals with (ir)relevant events
– Even relevant intervals may contain irrelevant events
– Wilde's relevance index for an event
Slide 10/20
2. Feature Identification (5/6)
■ Probabilistic ranking
– Irrelevant events are frequent in both sets of intervals
– Few intervals in one set, many in the other
– Renormalisation of Wilde's equation
Slide 11/20
2. Feature Identification (6/6)
■ Probabilistic ranking
– For a positive threshold
– Set of events relevant to the scenarios
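The slides elide the exact equations. Purely as a sketch in the spirit of slides 9–11 (an assumption, not the paper's actual formula): score each event by its frequency among intervals that exercise the feature versus intervals that do not, normalising by the size of each interval set since the two sets are very unbalanced, then keep the events above a positive threshold:

```python
def relevance(event, exercising, not_exercising):
    """Renormalised relevance of an event: its per-interval frequency in
    the feature-exercising set vs. the non-exercising set. A sketch only,
    not the paper's exact equation. Returns a value in [0, 1]."""
    f_pos = sum(event in iv for iv in exercising) / len(exercising)
    f_neg = sum(event in iv for iv in not_exercising) / len(not_exercising)
    if f_pos + f_neg == 0:
        return 0.0
    return f_pos / (f_pos + f_neg)

def relevant_events(exercising, not_exercising, threshold):
    """Events whose relevance exceeds a positive threshold."""
    events = {e for iv in exercising for e in iv}
    return {e for e in events
            if relevance(e, exercising, not_exercising) > threshold}

# Illustrative intervals (method names made up):
exercising = [["save", "write", "paint"], ["save", "write"]]
not_exercising = [["load", "paint"], ["load"], ["paint"]]
print(sorted(relevant_events(exercising, not_exercising, 0.9)))
# → ['save', 'write']  ('paint' occurs in both sets, so it is ranked low)
```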
Slide 12/20
3. Feature Models Creation
■ Uses the program architectural model (from step 1) and the relevant events (from step 2)
■ A feature contains the classes and methods identified from the relevant events
■ A feature is
– a micro-architecture, a subset of the cloned program model
– arbitrarily narrow or large by transitivity
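Again only as a sketch: the classes named by the relevant events are projected onto the program model, and the resulting micro-architecture can be widened transitively along inter-class relationships. The `Class::method` event format and the `relations` structure are assumptions for illustration:

```python
def feature_model(relevant_events, relations, hops):
    """Build a micro-architecture: start from the classes named by the
    relevant events (assumed to be 'Class::method' strings), then widen
    it by following inter-class relations transitively for `hops` steps,
    making the feature arbitrarily narrow (hops=0) or large."""
    classes = {e.split("::")[0] for e in relevant_events}
    for _ in range(hops):
        classes |= {tgt for (src, tgt) in relations if src in classes}
    return classes

# Illustrative relations between made-up classes:
relations = {("BookmarkStore", "Disk"), ("Disk", "FileSystem")}
print(sorted(feature_model({"BookmarkStore::save"}, relations, 0)))
# → ['BookmarkStore']
print(sorted(feature_model({"BookmarkStore::save"}, relations, 2)))
# → ['BookmarkStore', 'Disk', 'FileSystem']
```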
Slide 13/20
4. Feature Comparisons
■ Two feature models, one per set of scenarios
■ Computation of the transformations
■ Highlighting of the differences
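At their simplest, the transformations between two feature models can be pictured as the sets of model elements to add and to remove to turn one micro-architecture into the other; the differences highlighted to the user are then these two sets. A deliberately simplified sketch, with made-up class names:

```python
def compare(model_a, model_b):
    """Simplest possible transformation between two feature models, each
    given as a set of elements (classes or methods): what must be added
    to and removed from model_a to obtain model_b."""
    return {"added": model_b - model_a, "removed": model_a - model_b}

m1 = {"Browser", "Net"}                    # feature model, first scenarios
m2 = {"Browser", "Net", "BookmarkStore"}   # feature model, second scenarios
diff = compare(m1, m2)
# diff["added"] highlights what the second feature uses beyond the first.
```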
Slide 14/20
Case Study (1/5)
■ Mozilla browser (C++, multi-threaded)
■ Naïve approach vs. formal concept analysis (FCA) approach vs. our approach
■ Processor emulator vs. statistical profiler
Slide 15/20
Case Study (2/5)
■ Scenarios
1. Visit a bookmarked URL
2. Scenario 1 + save the URL
■ Program comprehension task
– Feature relevant to saving a URL
■ Objective
– Usefulness of our approach w.r.t. the naïve and FCA-based approaches
Slide 16/20
Case Study (3/5)
■ Number of methods to analyse manually
– Naïve
• bookm: 45, link: 18
• bookm and link: 14
• Up to 3,000 methods for other terms
– FCA
• 11 concepts
• 1,038 methods retained, out of 13,325 to 26,613 methods
– Our approach
• With different threshold values
• Complemented with a naïve analysis: 1 class, 5 methods
Slide 17/20
Case Study (4/5)
■ FCA vs. our approach
[Two bar charts: numbers of method calls (up to 140,000,000) and of distinct method calls (up to 30,000) in the interval sets I* and I, for FCA and our approach, broken down by startup, shutdown, scenario 1, and scenario 2]
Slide 18/20
Case Study (5/5)
■ Valgrind, an x86 processor emulator
■ JProfiler, a statistical profiler for Mozilla
Slide 19/20
Related Work
■ Meta-modelling and transformations
“Every model needs a meta-model” (Dave Thomas)
– Pagel and Winter
– Sunyé
– Jézéquel et al. (UMLAUT)
■ Static and dynamic analyses
– Ernst et al. (Daikon)
– Jeffery et al. (UFO)
– Reiss et al.
– Hamou-Lhadj et al.
■ Feature identification
– Wilde and Scully
– Chen and Rajlich (ASDG)
– Eisenbarth et al. (FCA)
– Salah and Mancoridis (feature-interaction views)
Slide 20/20
Conclusion
■ Our approach
– No scalability issues
■ Future work
– Feature comparisons
– Width of micro-architectures
– Feature evolution
– Visualisation