Studying the Evolution
of Build Systems
Shane McIntosh
Queen’s University
What is the build system?
2
Build systems help
practitioners
3
OK Testers
Developers
Managers
Build systems require 12%
of a developer’s
time (on average)
4
Kumfert, G., and Epperly, T.
Software in the DOE: The
Hidden Overhead of the “Build”
Address Bar Search Bar
5
4 months before
the issue was fixed!
Research hypothesis
Build system maintenance imposes a significant
degree of overhead on software development
6
Studied projects
PLplot
7
>30 MLOC!
Thesis Overview
1.0 1.1 2.0
1. High-level evolution analysis
2. Fine-grained evolution analysis
3. Maintenance overhead analysis
8
Acceptance Ratio:
16/51 = 31%
1.0 2.0
Acceptance Ratio:
62/441 = 14%
1.0 2.0
High level evolution
1. Code analysis: Do the static size and complexity of source code and build system evolve similarly?
2. Runtime analysis: Does the build-time complexity evolve?
9
1.0 1.1 2.0
Our approach
10
Code analysis
(Static)
1. Do the static size and complexity of source code and build system evolve similarly?
2. Does the build-time complexity evolve?
11
1.0 1.1 2.0
12
ANTLR
Abstraction
Module
Restructuring
Module
Restructuring
Source & Build Change Together!
Runtime analysis
(Dynamic)
1. Do the static size and complexity of source code and build system evolve similarly?
2. Does the build-time complexity evolve?
13
1.0 1.1 2.0
Libs built
from source
Build length increases over time
14
High level evolution
results
1. Do the static size and complexity of source code and build system evolve similarly?
2. Does the build-time complexity evolve?
15
Yes, but events may impact
build and source differently.
Yes, length tends to grow as
projects age.
Release granularity is
too coarse!
1 2 3
1.1 1.2
1.1.1 1.1.2 1.1.3
Fixed bug #1234
Added feature #1212
Issue #4444: restructured module
16
Fine-grained evolution
1. How many files does a typical build system consist of?
2. How much does a typical build system churn?
3. How large are typical build system changes?
17
1.0 2.0
Revision tagging
1.1.1 1.1.2
Fixed bug #1234
Added feature #1212
Issue #4444: Restructured module
18
Revision tagging
Fixed bug #1234
19
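A minimal sketch of revision tagging, assuming each revision is available as a list of changed file paths. The file-name patterns and source extensions below are hypothetical stand-ins; the talk does not list the ones actually used to separate build files from source files.

```python
import re

# Hypothetical build-file patterns (assumption, not taken from the talk).
BUILD_PATTERNS = [
    re.compile(r"(^|/)(GNU)?[Mm]akefile(\.[^/]*)?$"),
    re.compile(r"(^|/)build\.xml$"),
    re.compile(r"(^|/)CMakeLists\.txt$"),
    re.compile(r"(^|/)configure\.(ac|in)$"),
]

# Hypothetical source extensions for the C, C++, and Java projects mentioned.
SOURCE_EXTENSIONS = (".c", ".h", ".cc", ".cpp", ".hpp", ".java")


def is_build_file(path):
    """Return True if the path matches one of the assumed build-file patterns."""
    return any(pattern.search(path) for pattern in BUILD_PATTERNS)


def is_source_file(path):
    """Return True if the path ends with one of the assumed source extensions."""
    return path.endswith(SOURCE_EXTENSIONS)


def tag_revision(paths):
    """Tag a revision as source-modifying, build-modifying, both, or neither."""
    tags = set()
    for path in paths:
        if is_build_file(path):
            tags.add("build")
        elif is_source_file(path):
            tags.add("source")
    return tags
```

For example, tag_revision(["src/main.c", "Makefile"]) yields both tags, while a revision that only touches documentation yields an empty set.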
Size of the build system
1. How many files does a typical build system consist of?
2. How much does a typical build system churn?
3. How large are typical build system changes?
20
1.0 2.0
Build accounts for 9%
of files (median)
21
Build system rate of
change
1. How many files does a typical build system consist of?
2. How much does a typical build system churn?
3. How large are typical build system changes?
22
1.0 2.0
High churn may result
in post-release bugs!
Normalized churn is on
par with source code
23
Size of build changes
1. How many files does a typical build system consist of?
2. How much does a typical build system churn?
3. How large are typical build system changes?
24
1.0 2.0
Build changes add and
delete 3-4 lines
Source changes add and
delete 6-11 lines
25
26
Fine-grained evolution
1. How many files does a typical build system consist of?
2. How much does a typical build system churn?
3. How large are typical build system changes?
9% of files
On par with source
Adds and deletes 3-4 lines
How do projects cope
with build maintenance?
27
Build maintenance
overhead
1. How often are build changes required to complete development tasks?
2. How do projects distribute build maintenance work?
28
1.0 2.0
Logical coupling
29
LC(Source ⇒ Bld) = 2 ÷ 4 = 50%
1.0 2.0
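The 2 ÷ 4 example can be sketched in a few lines. This assumes logical coupling is the fraction of source-changing revisions that also change the build (the notes describe it only as measuring the "strength of implication"), and it assumes each revision is represented as a set of tags. The function name and data layout are illustrative.

```python
def logical_coupling(revisions, antecedent="source", consequent="build"):
    """Fraction of revisions with the antecedent tag that also carry the consequent tag.

    Assumed definition: LC(A => B) = |revisions changing A and B| / |revisions changing A|.
    """
    with_antecedent = [tags for tags in revisions if antecedent in tags]
    if not with_antecedent:
        return 0.0
    both = sum(1 for tags in with_antecedent if consequent in tags)
    return both / len(with_antecedent)


# Illustrative data: four revisions change source, two of them also change the build.
example_revisions = [
    {"source"},
    {"source", "build"},
    {"source"},
    {"source", "build"},
    {"build"},  # build-only revision, ignored by LC(Source => Bld)
]

print(logical_coupling(example_revisions))  # 2 / 4
```

Swapping the antecedent and consequent arguments gives the coupling in the opposite direction, which is generally a different value.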
Logical coupling at
revision level is low!
30
Source ⇒ Build coupling well below 12% estimation!
Work item grouping
1.1.1 1.1.2
Fixed bug #1234
Added feature #1212
Issue #4444: Restructured module
31
Work item grouping
1.1.1 1.1.2
Fixed bug #1234
Oops, forgot some code for bug #1234
Work item grouping
Bug #1234
33
Fixed bug #1234
Oops, forgot some code for bug #1234
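Grouping revisions by work item can be sketched as follows, assuming work item IDs appear in commit messages as "#" followed by digits, as in the slide's examples. The regular expression, the data layout, and the tags attached to each commit are assumptions for illustration.

```python
import re
from collections import defaultdict

# Assumed ID format: "#" followed by digits, as in "Fixed bug #1234".
WORK_ITEM_ID = re.compile(r"#(\d+)")


def group_by_work_item(revisions):
    """Merge the tags of all revisions that mention the same work item ID.

    revisions: iterable of (commit_message, tags) pairs.
    Returns a dict mapping work item ID to the union of tags across its revisions.
    """
    items = defaultdict(set)
    for message, tags in revisions:
        for item_id in WORK_ITEM_ID.findall(message):
            items[item_id] |= set(tags)
    return dict(items)


# Illustrative data: the follow-up commit for bug #1234 is the one touching the build.
example_revisions = [
    ("Fixed bug #1234", {"source"}),
    ("Oops, forgot some code for bug #1234", {"source", "build"}),
    ("Added feature #1212", {"source"}),
]

work_items = group_by_work_item(example_revisions)
```

In this data only one of three revisions touches the build, but one of the two work items does, which shows how coupling can look higher at the work item level than at the revision level.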
Task-centric analysis
1. How often are build changes required to complete development tasks?
2. How do projects distribute build maintenance work?
34
1.0 2.0
Work item coupling to
the build is high
35
Mozilla’s Source ⇒ Build coupling is very high!
Developer-centric
analysis
1. How often are build changes required to complete development tasks?
2. How do projects distribute build maintenance work?
36
1.0 2.0
If you produce source code,
do you produce build code?
37
Dispersed
Ownership:
Build maintenance is
distributed amongst
most team members
38
Concentrated Ownership:
A small team is
responsible for most
of the build maintenance
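One way to approximate the question "If you produce source code, do you produce build code?" is to measure the share of source developers who have also changed the build. This is a sketch under stated assumptions: the function name and data layout are hypothetical, and the talk gives no threshold separating dispersed from concentrated ownership.

```python
def build_share_of_source_devs(changes):
    """Fraction of source-code authors who also authored at least one build change.

    changes: iterable of (author, tags) pairs, one per revision or work item.
    A value near 1.0 suggests dispersed ownership; a value near 0.0 suggests
    concentrated ownership (interpretation assumed, not a threshold from the talk).
    """
    source_devs, build_devs = set(), set()
    for author, tags in changes:
        if "source" in tags:
            source_devs.add(author)
        if "build" in tags:
            build_devs.add(author)
    if not source_devs:
        return 0.0
    return len(source_devs & build_devs) / len(source_devs)


# Illustrative data with hypothetical author names.
example_changes = [
    ("alice", {"source", "build"}),
    ("bob", {"source"}),
    ("carol", {"build"}),
]

print(build_share_of_source_devs(example_changes))  # 1 of 2 source devs touches the build
```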
Build maintenance
overhead
1. How often are build changes required to complete development tasks?
2. How do projects distribute build maintenance work?
Up to 27% require build changes
Dispersed or concentrated
Potential bias Studied
projects
All
projects
40
41
Questions?

Editor's Notes

  • #2 My thesis is called...
  • #3 Infrastructure that translates source code into deliverables
  • #4 devs: constantly have to rebuild artifacts to test changes... bld sys incrementally updates builds. testers: automated tests...regression. managers: bundles built deliverables, docs, and 3rd-party libs into installable pkgs for delivery...
  • #5 Based on survey results...
  • #6 Firefox 3.0 was built and delivered incorrectly. For users in a networked environment, the address/search bar was broken because an incorrect version of the SQLite library was linked in the build process. The fix was delivered 4 months late in a service pack (3.0.1).
  • #7 Prior research and our preliminary analysis led us to formulate...
  • #8 Studied 13 projects of different: - prog language (c, c++, java) - domain (web communications, web servers, app servers, compilers, RDBMS, IDE...) - release style (single rel train, parallel rel train) - dev methodology (open, proprietary)
  • #9 3 analyses to investigate the hypo... Rel. level -> confirm java build sys’s evolve Rev. level -> analyze scale of build sys maint. Maint. overhead -> analyze dev impact of build maint
  • #10 We address two research questions
  • #11 We gather a set of official release snapshots from project archives - Then, we perform code and build-time analyses on each snapshot and plot metric values over time - Finally, we analyze anomalies in the curves using project docs and commit logs
  • #12 We begin with our static analysis
  • #13 We find that source and build change together (they increase and decrease together). Periods when they disagree are circled: (1) Source files were lifted into an ANTLR abstraction, shrinking the size in LOC. However, the build requires more logic to execute the ANTLR code generation step. Hence, build size increases. (2) Modules for C++ code generation (from the UML diagrams) were moved out of the main repository into an independent one. Build size shrinks because they took the opportunity to restructure the build files. SLOC grows due to normal bug fixes and new features being introduced.
  • #14 Next, we perform a build-time analysis
  • #15 Length and depth are standardized and plotted here. - Length is growing as the project ages. - Depth constant - anomaly: third party libraries were built from source
  • #16 We address two research questions
  • #17 1) Build systems appear to evolve at the release level...but we may miss phenomena that occur between releases. 2) At the release level, we cannot comment on the quantity of maintenance that the build requires
  • #18 We formulated three research questions...
  • #19 So, the revision level data contains information about each group of simultaneous developer changes (next).
  • #20 Using this data, we can mark each revision as src/bld modifying and analyze the set of all project revisions accordingly.
  • #21 First, we use this data to figure out (RQ1).
  • #22 We find that the build system accounts for 9% of the files in a project (median). - Small!
  • #23 Next, we measure (RQ2)
  • #24 1) churn of the build system is close to that of the source code (normalized) 2) Prior research => High churn may produce bugs!
  • #25 Finally, we study the size of typical build system changes
  • #26 Build changes add and delete 3-4 lines while source changes add and delete 6-11 lines.
  • #27 1) Build systems are small in size (9%) 2) But churn frequently (potential for bugs) 3) Changes are not trivially small How do projects cope?
  • #28 Build systems seem to be a burden on developers! (next)
  • #29 We formulated two research questions to address the problem
  • #30 - measures strength of implication...
  • #31 Counterintuitive, and does not match prior survey results (coupling is well below the 12% estimate)
  • #32 - devs may not commit all changes in one revision
  • #33 - devs may not commit all changes in one revision
  • #36 - Mozilla is scary high - Eclipse-core and Jazz are considerably lower - Use of PDE build
  • #38 - Jazz src devs are often responsible for build dev - Git and Linux are less so
  • #39 - Jazz distributes build work...