An Empirical Study of
Unspecified Dependencies
in Make-Based Build Systems
Cor-Paul Bezemer, Shane McIntosh, Bram Adams,
Daniel M. German, Ahmed E. Hassan
Empirical Software Engineering – Journal First
2
What is a build system?
Source code
Deliverable
3
Build systems describe how sources are
translated into deliverables
[Figure: file types along the translation from sources to deliverables: .tex, .dvi, .pdf; .c, .o, .a, .deb]
4
A build file specifies how to generate
targets from their dependencies
all: app.o
app.o: app.c app.h
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
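To make the relationship between targets and dependencies concrete, the build file above can be modelled as a small dependency graph. The following is only a sketch: the `RULES` dictionary and the `build_order` helper are hypothetical names introduced for illustration, and a real make implementation does considerably more (timestamps, pattern rules, phony targets).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical in-memory model of the slide's build file:
# target -> (listed dependencies, command or None)
RULES = {
    "all": (["app.o"], None),
    "app.o": (["app.c", "app.h"], "gcc -c app.c -o app.o"),
    "app.c": ([], "./generator -o app.c"),
}


def build_order(rules, goal):
    """Return the rule targets needed for `goal`, dependencies first."""
    needed = {}
    stack = [goal]
    while stack:
        target = stack.pop()
        if target in needed or target not in rules:
            continue  # already visited, or a plain source file such as app.h
        deps = rules[target][0]
        # Only dependencies that have their own rule need to be ordered.
        needed[target] = [d for d in deps if d in rules]
        stack.extend(deps)
    return list(TopologicalSorter(needed).static_order())


print(build_order(RULES, "all"))
```

In this model, app.c is generated first, then app.o is compiled, and finally the `all` target is satisfied.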
5
Build rules
specify how targets must be built
all: app.o
app.o: app.c app.h
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
6
Targets
are the deliverables of a build system
all: app.o
app.o: app.c app.h
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
7
Dependencies
are source code, libraries or targets
all: app.o
app.o: app.c app.h
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
8
Processes
‘glue’ targets and dependencies together
all: app.o
app.o: app.c app.h
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
9
Builds can take a lot of time
[Figure: chart of build time in minutes (axis from 0 to 300) for glib, openldap, Linux and Qt]
10
There exist ways to speed up a build
● Incremental build
– Rebuild only the component(s) that changed, and the
component(s) that depend on them
– Most build technologies (e.g., make) already do this
● ‘Hacking’ the build by removing dependencies
– Decreases potential to parallelize the build
– Difficult to debug issues that may occur
11
What happens if we fail
to specify a dependency?
12
An example of an
unspecified dependency
all: app.o
app.o: app.c
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
If the code for ‘generator’ changes, app.c
(and all the targets that depend on it)
will not be rebuilt!
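The effect described above can be illustrated with a deliberately simplified model of an incremental build. This sketch ignores timestamps and treats the build file as a plain mapping from targets to listed dependencies; the function name `targets_to_rebuild` and the two dictionaries are assumptions made for illustration, not part of any tool mentioned in the talk.

```python
def targets_to_rebuild(rules, changed):
    """Targets an incremental build would regenerate after `changed` is modified.

    `rules` maps each target to the dependencies listed in the build file.
    """
    stale = set()
    frontier = {changed}
    while frontier:
        # A target becomes stale if it lists a changed or stale file.
        frontier = {
            target
            for target, deps in rules.items()
            if target not in stale and frontier & set(deps)
        }
        stale |= frontier
    return stale


# The build file from the slide: app.c does not list 'generator'.
UNSPECIFIED = {"all": ["app.o"], "app.o": ["app.c"], "app.c": []}
# The same build file with 'generator' listed as a dependency of app.c.
SPECIFIED = {"all": ["app.o"], "app.o": ["app.c"], "app.c": ["generator"]}

print(targets_to_rebuild(UNSPECIFIED, "generator"))  # empty set: nothing is rebuilt
print(targets_to_rebuild(SPECIFIED, "generator"))
```

With the dependency left out, a change to generator reaches no target at all; once it is listed, the change propagates to app.c, app.o and all.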
13
Unspecified dependencies are easy to
fix, but hard to discover
all: app.o
app.o: app.c
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
Missing dependency of app.c: generator
14
How prevalent are
unspecified dependencies
in real projects?
15
Our prototype tool compares the conceptual
with the concrete build graph
● Conceptual graph:
– What we believe is being built (i.e., the build file)
● Concrete graph:
– What is actually being built (i.e., the build execution
trace)
● Processes may internally access files that are not
explicitly defined as dependencies!
– Hence we need to compare both graphs
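The comparison idea can be sketched as a set difference over dependency edges. This is not the prototype tool itself: the edge sets below are hand-written for the running app.c example, and the function name is hypothetical.

```python
# Conceptual graph: (target, dependency) edges declared in the build file.
conceptual = {
    ("all", "app.o"),
    ("app.o", "app.c"),
}

# Concrete graph: edges observed while the build runs. These are
# hand-written here; the last edge reflects that producing app.c
# executes the generator program.
concrete = {
    ("all", "app.o"),
    ("app.o", "app.c"),
    ("app.c", "generator"),
}


def unspecified_dependencies(conceptual_edges, concrete_edges):
    """Edges that occur during the build but are absent from the build file."""
    return sorted(concrete_edges - conceptual_edges)


print(unspecified_dependencies(conceptual, concrete))
```

Any edge that is present in the concrete graph but missing from the conceptual one is a candidate unspecified dependency.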
16
Deriving the conceptual graph
all: app.o
app.o: app.c
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
18
Deriving the concrete graph
all: app.o
app.o: app.c
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
20
How can we compare the
conceptual and concrete graph?
21
We abstract both graphs into graphs
that follow a unified schema
22
The abstracted conceptual graph
23
The abstracted concrete graph
25
The differences between the abstracted
graphs reveal unspecified dependencies
28
We can make the build file complete by
manually adding the unspecified dependency
all: app.o
app.o: app.c
	gcc -c app.c -o app.o
app.c: generator
	./generator -o app.c
30
We studied unspecified dependencies in
the build systems of 4 real projects
[Figure: project logos (GLIB, OpenLDAP, and two logos not captured as text) labeled with unspecified-dependency counts of 1.7k, 944k, 284k and 290]
32
We manually identified 6 root causes
for the unspecified dependencies
● We used filters to group unspecified dependencies
that had the same root cause
– For example, a project policy
● We ‘debugged’ the unspecified dependencies
– We manually traced the code around the spot where
the dependency should have been specified
– We searched the documentation of the project, or we
consulted developers
35
The “Binary Compatibility
Guarantee” root cause
● Qt has many unspecified dependencies on shared
library (.so) files
● Qt guarantees binary compatibility between minor
releases
– Changes to the .so file are OK as long as the interface
does not change
– Hence, it is sufficient to depend on the interface
For the other root causes, check our paper!
37
What did developers think
of our findings?
● We submitted patches for 36 unspecified
dependencies to the GLib project
– The developers agreed, but preferred not to touch the
build system
● Developers often remove dependencies to speed
up the build
– And then rely on project processes (Qt, Linux) to deal
with those unspecified dependencies
38
39
40
41
42
bezemer@cs.queensu.ca, http://sailhome.cs.queensu.ca/~corpaul
43
46
We combined several tools to reveal unspecified
dependencies in make-based build systems
● Conceptual graphs are extracted using MAKAO [Adams et al., ICSM 2007]
● Concrete graphs are extracted from STRACE execution logs
● Graphs are represented and analyzed using [logo not captured in the transcript]
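The slides only state that concrete graphs come from strace execution logs, so the following is a rough sketch of how file accesses might be recovered from such a log, not the authors' pipeline. The sample log is made up in the style of `strace -f -o` output (one line per system call, prefixed with the process id), and `file_accesses` and `concrete_edges` are hypothetical helper names. Real builds spawn many more processes and system calls than shown here.

```python
import posixpath
import re
from collections import defaultdict

# Made-up log lines in the style of `strace -f -o build.log make`.
SAMPLE_LOG = """\
1201 execve("./generator", ["./generator", "-o", "app.c"], 0x7ffc0 /* 20 vars */) = 0
1201 openat(AT_FDCWD, "app.c", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
1202 execve("/usr/bin/gcc", ["gcc", "-c", "app.c", "-o", "app.o"], 0x7ffd0 /* 20 vars */) = 0
1202 openat(AT_FDCWD, "app.c", O_RDONLY) = 3
1202 openat(AT_FDCWD, "config.h", O_RDONLY) = -1 ENOENT (No such file or directory)
1202 openat(AT_FDCWD, "app.o", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
"""

# pid, system call, first path argument, optional open flags, return value
CALL = re.compile(
    r'^(\d+)\s+(execve|open|openat)\((?:AT_FDCWD, )?"([^"]+)"'
    r'(?:, ([A-Z_|]+))?.*\) = (-?\d+)'
)


def file_accesses(log):
    """Collect the files each process read (or executed) and wrote."""
    reads, writes = defaultdict(set), defaultdict(set)
    for line in log.splitlines():
        match = CALL.match(line)
        if not match or int(match.group(5)) < 0:
            continue  # unrelated line or failed system call
        pid, call, path = match.group(1), match.group(2), match.group(3)
        flags = match.group(4) or ""
        path = posixpath.normpath(path)
        # Simplification: O_RDWR is counted as a write only.
        if call != "execve" and ("O_WRONLY" in flags or "O_RDWR" in flags):
            writes[pid].add(path)
        else:
            reads[pid].add(path)
    return reads, writes


def concrete_edges(log):
    """(output, input) pairs: every file a process wrote depends on what it read."""
    reads, writes = file_accesses(log)
    return sorted(
        (out, src) for pid in writes for out in writes[pid] for src in reads[pid]
    )


print(concrete_edges(SAMPLE_LOG))
```

In this toy trace, the process that writes app.c also executes generator, which yields the edge from app.c to generator that the build file never declares.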
47
An example of
a correctly specified dependency
all: app.o
app.o: app.c app.h
	gcc -c app.c -o app.o
app.c:
	./generator -o app.c
55
One of the root causes: The Helper File
Is this type of unspecified dependency a problem?
Not for now...
For the other root causes, check our paper!
