How History Justifies System Architecture (or Not)
1. 1/12
International Workshop on Principles of Software Evolution · Helsinki, Finland, 1 September 2003
Thomas Zimmermann
(with Stephan Diehl and Andreas Zeller)
Lehrstuhl Softwaretechnik
Universität des Saarlandes, Saarbrücken, Germany
3. 2/12
The Problem
Your task: extend the debug component in GCC!
You identify the variable xcoff_debug_hooks.
What else do you need to change?
General issue: only change coupled entities!
You can detect existing coupling by
• Program Analysis—e.g. def-use associations.
• Learning from History—entities changed together.
6. 3/12
Evolutionary Coupling
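[Figure: evolutionary coupling graph for GCC. The files gcc/gcc/dbxout.c [134] and gcc/gcc/sdbout.c [74] are linked with support 34. Fine-grained nodes: dbx_debug_hooks [12], sdb_debug_hooks [12], xcoff_debug_hooks [10], sdb_global_decl() [4], dbx_functions_end(), dbx_symbol_name(). The remaining bracketed counts [6] and [7] appear next to the two dbx functions. Edge labels (12, 10, 4, 2) count simultaneous changes.]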
Support: How much evidence (= simultaneous changes)?
Confidence: How relevant is coupling for participants?
7. 4/12
What We Do
Our ROSE prototype (Reengineering Of Software Evolution) analyzes the evolution of CVS archives.
[Diagram: CVS → ROSE → Couplings, Graphs, Metrics]
Step 1: Restore Transactions from CVS
Step 2: Identify Modified Entities
ROSE determines entities at different granularities:
• coarse-grained entities: directories, modules, files
• fine-grained entities: methods, variables, sections
8. 5/12
Step 1: Restoring Transactions
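Two atomic changes $\delta_i$ and $\delta_{i+1}$ are part of one
transaction $\Delta = (\delta_1, \ldots, \delta_n)$ if:
$$\mathrm{author}(\delta_i) = \mathrm{author}(\delta_{i+1}) \;\wedge\; \mathrm{log\_message}(\delta_i) = \mathrm{log\_message}(\delta_{i+1}) \;\wedge\; |\mathrm{time}(\delta_{i+1}) - \mathrm{time}(\delta_i)| < 200\ \text{seconds}$$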
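We use a sliding window instead of a fixed one: each change is compared with the previous change, not with the first one, so a transaction can span more than 200 seconds.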
GNU C Compiler (GCC):
The average transaction length is 6.2 seconds.
The maximal transaction length is 1 hour 32 minutes.
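The sliding-window rule above can be sketched in a few lines of Python. This is a minimal illustration, not the ROSE implementation: the `Change` record, its field names, and the demo data are assumptions made for the example. Only the three conditions and the 200-second threshold come from the slide.

```python
from dataclasses import dataclass
from typing import List

WINDOW_SECONDS = 200  # threshold stated on the slide


@dataclass
class Change:
    """One atomic change, i.e. one file revision (field names are assumptions)."""
    author: str
    log_message: str
    time: int  # seconds since an arbitrary epoch
    path: str


def restore_transactions(changes: List[Change]) -> List[List[Change]]:
    """Group atomic changes into transactions with a sliding time window."""
    # Sorting makes changes with the same author and message adjacent, ordered by time.
    ordered = sorted(changes, key=lambda c: (c.author, c.log_message, c.time))
    transactions: List[List[Change]] = []
    for change in ordered:
        if transactions:
            last = transactions[-1][-1]  # compare with the previous change, not the first
            if (change.author == last.author
                    and change.log_message == last.log_message
                    and change.time - last.time < WINDOW_SECONDS):
                transactions[-1].append(change)
                continue
        transactions.append([change])
    return transactions


if __name__ == "__main__":
    # Hypothetical demo data.
    demo = [
        Change("alice", "fix hooks", 0, "a.c"),
        Change("alice", "fix hooks", 150, "b.c"),
        Change("alice", "fix hooks", 300, "c.c"),  # 300 s after a.c, but only 150 s after b.c
        Change("alice", "fix hooks", 600, "d.c"),  # gap of 300 s: starts a new transaction
        Change("bob", "fix hooks", 10, "e.c"),     # different author: separate transaction
    ]
    for transaction in restore_transactions(demo):
        print([c.path for c in transaction])
```

With a fixed window, `c.c` would fall outside the transaction that starts at `a.c`. The sliding window keeps it in, which is how a single transaction can grow far beyond 200 seconds.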
11. 6/12
Step 2: Light-Weight Analysis
File: Animals.java                    Step A: Map to Entities
 1  class Cat {                       Class Cat    lines 1-56
 3    public String[] COLORS = {      Cat.COLORS   lines 3-23
17      ...
23    }
25    public Cat() {                  Cat.Cat()    lines 25-30
        ...
30    }
      ...
56  }
58  class Dog {                       Class Dog    lines 58-99
60    public String[] COLORS = {      Dog.COLORS   lines 60-80
        ...
80    }
      ...
99  }
Step B: Filter Entities
We analyze C/C++, Java, Python, TeX and Texinfo files.
We get the modified methods, variables and subsections.
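A minimal sketch of Steps A and B, assuming the entity line ranges have already been extracted from the file. The ranges are those of the Animals.java example. The `Entity` type, its `kind` field, and the kind-based filter are assumptions made for illustration, since the slides do not say how ROSE filters entities.

```python
from typing import Iterable, List, NamedTuple


class Entity(NamedTuple):
    name: str
    kind: str        # "class", "variable" or "method" (labels are assumptions)
    first_line: int
    last_line: int


# Line ranges taken from the Animals.java example.
ANIMALS_JAVA = [
    Entity("Class Cat", "class", 1, 56),
    Entity("Cat.COLORS", "variable", 3, 23),
    Entity("Cat.Cat()", "method", 25, 30),
    Entity("Class Dog", "class", 58, 99),
    Entity("Dog.COLORS", "variable", 60, 80),
]


def map_to_entities(entities: Iterable[Entity], changed_lines: Iterable[int]) -> List[Entity]:
    """Step A: every entity whose line range contains a changed line."""
    lines = list(changed_lines)
    return [e for e in entities
            if any(e.first_line <= line <= e.last_line for line in lines)]


def filter_entities(entities: Iterable[Entity]) -> List[Entity]:
    """Step B (assumed filter): keep fine-grained entities, drop enclosing classes."""
    return [e for e in entities if e.kind in ("variable", "method")]


if __name__ == "__main__":
    touched = map_to_entities(ANIMALS_JAVA, [17])  # a change to line 17
    print([e.name for e in touched])
    print([e.name for e in filter_entities(touched)])
```

A change to line 17 falls inside both `Class Cat` and `Cat.COLORS`; the assumed filter then keeps only the fine-grained entity.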
16. 8/12
Visualizing Coupling
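[Figure: coupling matrix over the entities A, B, C, D. The legend distinguishes high confidence, low confidence, and no coupling (no support).]
Example: A was changed 10 times, C was changed 4 times, and both were changed together 3 times.
A ⇒ C: Confidence 3/10 = 30%
C ⇒ A: Confidence 3/4 = 75%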
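The two measures can be computed directly from a list of transactions. The sketch below is an illustration under stated assumptions: each transaction is modeled as a set of entity names, and the history is made-up data chosen to reproduce the counts in the example (A changed 10 times, C changed 4 times, both together 3 times).

```python
from typing import List, Set


def change_count(transactions: List[Set[str]], entity: str) -> int:
    """Number of transactions in which the entity was changed."""
    return sum(1 for t in transactions if entity in t)


def support(transactions: List[Set[str]], a: str, b: str) -> int:
    """Number of transactions in which both entities were changed together."""
    return sum(1 for t in transactions if a in t and b in t)


def confidence(transactions: List[Set[str]], a: str, b: str) -> float:
    """Confidence of a => b: the share of a's changes that also changed b."""
    changes = change_count(transactions, a)
    return support(transactions, a, b) / changes if changes else 0.0


if __name__ == "__main__":
    # Hypothetical history: 3 joint changes, 7 changes to A alone, 1 change to C alone.
    history = [{"A", "C"}] * 3 + [{"A"}] * 7 + [{"C"}]
    print(support(history, "A", "C"))               # 3
    print(f"{confidence(history, 'A', 'C'):.0%}")   # 30%
    print(f"{confidence(history, 'C', 'A'):.0%}")   # 75%
```

Support is symmetric, while confidence is not: the same three joint changes matter more to C, which changed rarely, than to A.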
22. 12/12
Conclusion
Fine-grained evolutionary coupling...
• detects coupling between non-program entities.
  E.g., coupling between a function and a database schema.
• guides developers while making changes.
  "Programmers who changed this function also changed..."
• gives better(?) results than coarse-grained coupling.
  Coupling between files doesn't tell you that much.
• can be compared with given coupling (= architecture).
  Results are mixed: what is coupling, anyway?
Those who cannot learn from history are doomed to repeat it.
(George Santayana)