An Empirical Study of Unspecified Dependencies in Make-Based Build Systems
Cor-Paul Bezemer, Shane McIntosh, Bram Adams,
Daniel M. German, Ahmed E. Hassan
Empirical Software Engineering – Journal First
2. 2
What is a build system?
[Figure: source code is turned into a deliverable]
3. 3
Build systems describe how sources are
translated into deliverables
[Figure: example file types on the way from sources to deliverables: .tex, .dvi, .pdf; .c, .o, .a, .deb]
4. 4
A build file specifies how to generate
targets from their dependencies
all: app.o
app.o: app.c app.h
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
5. 5
Build rules
specify how targets must be built
all: app.o
app.o: app.c app.h
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
6. 6
Targets
are the deliverables of a build system
all: app.o
app.o: app.c app.h
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
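The rules, targets, and dependencies in the build file above can be read into a small graph. The following is a minimal sketch, not the extraction used in the study: `parse_rules` and `MAKEFILE` are hypothetical names, and the parser assumes only plain `target: dependencies` lines with tab-indented recipes (no variables, pattern rules, or includes).

```python
def parse_rules(makefile_text):
    """Return {target: [dependencies]} for plain 'target: deps' rule lines."""
    graph = {}
    for line in makefile_text.splitlines():
        stripped = line.strip()
        # Skip blank lines, comments, and recipe lines (which start with a tab).
        if not stripped or stripped.startswith("#") or line.startswith("\t"):
            continue
        target, sep, deps = line.partition(":")
        if sep:
            graph[target.strip()] = deps.split()
    return graph


# The example build file from the slides, with explicit tabs before recipes.
MAKEFILE = (
    "all: app.o\n"
    "app.o: app.c app.h\n"
    "\tgcc -c app.c -o app.o\n"
    "app.c:\n"
    "\t./generator -o app.c\n"
)

conceptual = parse_rules(MAKEFILE)
print(conceptual)
```

Each key is a target and each value lists the dependencies that the build file declares for it, which is the information a conceptual build graph needs.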
9. 9
Builds can take a lot of time
[Figure: bar chart of build time in minutes (axis from 0 to 300) for glib, openldap, Linux, and Qt]
10. 10
There exist ways to speed up a build
● Incremental build
– Rebuild only the component(s) that changed and the
component(s) that depend on them
– Most build technologies (e.g., make) already do this
● ‘Hacking’ the build by removing dependencies
– Decreases the potential to parallelize the build
– Makes issues that may occur difficult to debug
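The incremental-build idea in the first bullet can be sketched as a timestamp check. This is a simplified illustration of the rule make applies (rebuild when the target is missing or older than a dependency), not make's actual implementation; `needs_rebuild` is a hypothetical helper, and dependencies that do not exist on disk are simply ignored here.

```python
import os


def needs_rebuild(target, dependencies):
    """True if target is missing or older than any existing dependency."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(
        os.path.exists(dep) and os.path.getmtime(dep) > target_mtime
        for dep in dependencies
    )
```

A file that is read during the build but not passed in `dependencies` never influences the result, which is why removing dependencies makes the build faster but also less reliable.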
12. 12
An example of an
unspecified dependency
all: app.o
app.o: app.c
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
If the code for ‘generator’ changes, app.c
(and all the targets that depend on it)
will not be rebuilt!
13. 13
Unspecified dependencies are easy to
fix, but hard to discover
all: app.o
app.o: app.c
	gcc -c app.c -o app.o
# Fix: declare the generator as a dependency of app.c
app.c: generator
	./generator -o app.c
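The effect of the missing dependency can be sketched by asking which targets are reachable from a changed file. This is a minimal illustration under the assumption that the build file is already available as a `{target: [dependencies]}` dictionary; `affected_targets` and both dictionaries are hypothetical names written for this example.

```python
def affected_targets(rules, changed_file):
    """Targets that directly or transitively depend on changed_file."""
    affected = set()
    worklist = [changed_file]
    while worklist:
        current = worklist.pop()
        for target, deps in rules.items():
            if current in deps and target not in affected:
                affected.add(target)
                worklist.append(target)
    return affected


# Build file without the dependency on the generator.
unspecified = {"all": ["app.o"], "app.o": ["app.c"], "app.c": []}
# Build file after declaring 'generator' as a dependency of app.c.
specified = {"all": ["app.o"], "app.o": ["app.c"], "app.c": ["generator"]}

print(affected_targets(unspecified, "generator"))        # set()
print(sorted(affected_targets(specified, "generator")))  # ['all', 'app.c', 'app.o']
```

With the dependency declared, a change to the generator reaches app.c and everything that depends on it; without it, nothing is rebuilt.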
15. 15
Our prototype tool compares the conceptual
with the concrete build graph
● Conceptual graph:
– What we believe is being built (i.e., the build file)
● Concrete graph:
– What is actually being built (i.e., the build execution
trace)
● Processes may internally access files that are not
explicitly defined as dependencies!
– Hence we need to compare both graphs
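Comparing the two graphs amounts to a set difference per target. The sketch below is an illustration only, not the prototype tool: it assumes both graphs are already available as `{target: [files]}` dictionaries, and `unspecified_dependencies` and the sample data are hypothetical.

```python
def unspecified_dependencies(conceptual, concrete):
    """Files accessed while building a target but not declared for it."""
    missing = {}
    for target, accessed in concrete.items():
        declared = set(conceptual.get(target, ()))
        extra = set(accessed) - declared
        if extra:
            missing[target] = sorted(extra)
    return missing


# Conceptual graph: what the build file declares.
conceptual = {"all": ["app.o"], "app.o": ["app.c"], "app.c": []}
# Concrete graph: what a (made-up) execution trace shows was accessed.
concrete = {"app.o": ["app.c"], "app.c": ["generator"]}

print(unspecified_dependencies(conceptual, concrete))  # {'app.c': ['generator']}
```

A real trace would also contain many accesses that are not of interest, so some filtering would be needed before the comparison is useful.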
30. 30
We studied unspecified dependencies in
the build systems of 4 real projects
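[Figure: number of unspecified dependencies per project. Recoverable labels: GLIB, OpenLDAP. Recoverable counts: 1.7k, 944k, 284k, 290 (the mapping of counts to projects is not preserved in this transcript).]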
32. 32
We manually identified 6 root causes
for the unspecified dependencies
● We used filters to group unspecified dependencies
that had the same root cause
– For example, a project policy
● We ‘debugged’ the unspecified dependencies
– We manually traced the code around the spot where
the dependency should have been specified
– We searched the documentation of the project, or we
consulted developers
35. 35
The “Binary Compatibility
Guarantee” root cause
● Qt has many unspecified dependencies on shared
library (.so) files
● Qt guarantees binary compatibility between minor
releases
– Changes to the .so file are OK as long as the interface
does not change
– Hence, it is sufficient to depend on the interface
For the other root causes, check our paper!
37. 37
What did developers think
of our findings?
● We submitted patches for 36 unspecified
dependencies to the GLib project
– The developers agreed, but preferred not to touch the
build system
● Developers often remove dependencies to speed
up the build
– And then rely on project processes (Qt, Linux) to deal
with those unspecified dependencies
46. 46
We combined several tools to reveal unspecified
dependencies in make-based build systems
● Conceptual graphs are extracted using MAKAO
[Adams et al., ICSM 2007]
● Concrete graphs are extracted from strace
execution logs
● Graphs are represented and analyzed using
[tool shown only as an image on the original slide]
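To give an idea of how file accesses can be recovered from an execution log, the sketch below parses a few strace-style lines. It is not the pipeline used in the study. It assumes a log in which every line starts with a process ID, as strace writes when following child processes into an output file, and it only handles `open`/`openat` calls relative to `AT_FDCWD`. The names `OPEN_CALL`, `files_read`, and the sample log lines are hypothetical.

```python
import re

# Matches lines such as:  4242 openat(AT_FDCWD, "app.c", O_RDONLY) = 3
OPEN_CALL = re.compile(
    r'^(?P<pid>\d+)\s+open(?:at)?\((?:AT_FDCWD, )?"(?P<path>[^"]+)", '
    r'(?P<flags>[^)]*)\)\s+=\s+(?P<ret>-?\d+)'
)


def files_read(log_lines):
    """Return {pid: set(paths)} for successful read-only open/openat calls."""
    reads = {}
    for line in log_lines:
        match = OPEN_CALL.match(line)
        if not match or int(match.group("ret")) < 0:
            continue  # not an open call, or the call failed
        if "O_RDONLY" in match.group("flags"):
            pid = int(match.group("pid"))
            reads.setdefault(pid, set()).add(match.group("path"))
    return reads


# Made-up sample log lines for illustration.
LOG = [
    '4242 openat(AT_FDCWD, "app.c", O_RDONLY) = 3',
    '4242 openat(AT_FDCWD, "app.o", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4',
    '4243 openat(AT_FDCWD, "missing.h", O_RDONLY) = -1 ENOENT (No such file or directory)',
]

print(files_read(LOG))  # {4242: {'app.c'}}
```

Turning these per-process reads into a concrete build graph would additionally require mapping each process to the target whose recipe started it, which this sketch leaves out.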
47. 47
An example of
a correctly specified dependency
all: app.o
app.o: app.c app.h
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
55. 55
One of the root causes: The Helper File
Is this type of unspecified dependency a problem?
Not for now...
For the other root causes, check our paper!