
An Empirical Analysis of Build Failures in the Continuous Integration Workflows of Java-Based Open-Source Software

MSR'17 presentation of our work on analyzing CI build failures.

Published in: Science


  1. An Empirical Analysis of Build Failures in the Continuous Integration Workflows of Java-Based Open-Source Software
     Thomas Rausch, Waldemar Hummer, Philipp Leitner*, Stefan Schulte
     Distributed Systems Group, Vienna University of Technology, Austria, http://dsg.tuwien.ac.at
     * Software Evolution and Architecture Lab, University of Zurich, Switzerland, http://www.ifi.uzh.ch/en/seal.html
  2. Continuous Integration
     Diagram labels: VCS, CI Server, Build, Feedback, Logs
     • Vasilescu et al. (2015). Quality and Productivity Outcomes Relating to Continuous Integration in GitHub: “Our main finding is that continuous integration improves the productivity of project teams”
     • Kerzazi et al. (2014). Why do Automated Builds Break? An Empirical Study: “We [...] quantified the cost of such build breakage as more than 336.18 man-hours”
  4. Related Work
  5. Understanding Build Failures
     • What types of errors cause CI build failures?
     • Which development practices can be associated with CI build failures?
  6. Research Setting
     Project Name        Description
     Apache Storm        Distributed Computation
     Butterknife         Android Dependency Injection
     Crate.IO            Scalable SQL database
     JabRef              BibTeX management GUI
     jcabi-github        Wrapper of GitHub API
     Hystrix             Latency and fault tolerance library
     Presto              Distributed SQL query engine
     Openmicroscopy      Microscopy data environment
     RxAndroid           RxJava bindings for Android
     Sponge API          Minecraft plugin API
     Spring Boot         Java Application Framework
     Square OkHttp       HTTP+HTTP/2 client for Android
     Square Retrofit     HTTP client for Android
     Wordpress-Android   WordPress for Android
  7. Data Acquisition
     Diagram labels: commits a, b, c, d; Topology Mapping; CI build history; Change history
  8. Understanding Build Failures
     • What types of errors cause CI build failures?
     • Which development practices can be associated with CI build failures?
  9. Error Categorization and Quantification
     • Goal
       ○ Categorization of errors
       ○ Frequency of occurrence of error types
     • Approach
       ○ Systematic exploration of ~54 000 logfiles
       ○ Categorization scheme based on log message patterns
     Example log excerpt:
       [INFO] Compiling 67 source files to /home/travis/.../target/classes
       [INFO] -------------------------------------------------------------
       [ERROR] COMPILATION ERROR :
       [INFO] -------------------------------------------------------------
       [ERROR] /home/travis/.../redis/RedisAutoConfiguration.java:[143,10] cannot find symbol
       [INFO] 1 error
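The pattern-based scheme on slide 9 could be sketched roughly as follows. This is a minimal illustration, not the authors' tooling: the function name `categorize` and the `PATTERNS` table are hypothetical, and only the `compile` pattern is taken from the log excerpt on the slide. The `testfailure` and `dependency` patterns are assumed examples of typical Maven output.

```python
import re

# Hypothetical pattern table. Labels come from the slide deck's error
# categories; only the "compile" regex is grounded in the slide's log excerpt.
PATTERNS = [
    ("compile", re.compile(r"^\[ERROR\] COMPILATION ERROR", re.MULTILINE)),
    # Assumed: a Surefire-style summary line with at least one failure.
    ("testfailure", re.compile(r"^Tests run: \d+, Failures: [1-9]", re.MULTILINE)),
    # Assumed: a typical Maven dependency resolution message.
    ("dependency", re.compile(r"Could not resolve dependencies")),
]


def categorize(log_text):
    """Return the label of the first matching pattern, or 'unknown'."""
    for label, pattern in PATTERNS:
        if pattern.search(log_text):
            return label
    return "unknown"


sample_log = "\n".join([
    "[INFO] Compiling 67 source files to /home/travis/.../target/classes",
    "[ERROR] COMPILATION ERROR : ",
    "[ERROR] /home/travis/.../redis/RedisAutoConfiguration.java:[143,10] cannot find symbol",
    "[INFO] 1 error",
])
print(categorize(sample_log))
```

Returning the first match keeps the sketch simple; a log that contains several error types would need a priority order or a multi-label result.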
  10. Error Categories
      Label          Description                                   Occurrences
      testfailure    An automated test failed                      12
      compile        Compilation error                             12
      git            VCS interaction error                         12
      buildconfig    Faulty build config                           11
      crash          Build environment crash or timeout            11
      dependency     Dependency error                              11
      quality        Coding-rule violation (e.g., Checkstyle)      10
      unknown        Errors without a clearly identifiable cause   9
      itestfailure   An automated integration test failed          4
      doc            Documentation (e.g., JavaDoc) problem         3
      license        License criteria not met (missing header)     3
      compatibility  API incompatibility                           2
      androidsdk     Android SDK-related error                     1
      buildout       Error specific to Crate.IO python module      1
  11. Distribution of Common Error Types
      Bar chart. Categories: Faulty VCS interaction, Faulty build configuration, Dependency error, Compilation error, Coding-rule violation, Failing test, Crash. Axis: 0% to 40%.
  12. Distribution of Common Error Types
      Per-project chart. Projects: Apache Storm, Butterknife, Crate.IO, Hystrix, JabRef, jcabi-github, Presto, RxAndroid, SpongeAPI, Spring Boot, Square OkHttp, Square Retrofit. Legend (Error): testfailure, compile, git, dependency, crash, buildconfig, quality, others. Axis: Percentage, 0% to 100%.
  13. Understanding Build Failures
      • What types of errors cause CI build failures?
      • Which development practices can be associated with CI build failures?
  14. Change Metrics
      • Complexity
        ○ Churn, number of files, ...
      • File types
        ○ README.txt vs. IntegrationTest.java
      • Date and time
      • Author
        ○ Experience, commit frequency, ...
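As a rough illustration of the complexity and file-type metrics on slide 14, the sketch below derives churn, file count, and a per-extension tally from one change. The input shape (tuples of path, lines added, lines deleted), the function name `change_metrics`, and the definition of churn as added plus deleted lines are all assumptions made for this example; the slides do not specify them.

```python
from collections import Counter
from pathlib import PurePosixPath


def change_metrics(changed_files):
    """Summarize one change.

    changed_files: list of (path, lines_added, lines_deleted) tuples
    (a hypothetical input format chosen for this sketch).
    """
    # Assumed definition: churn = total lines added plus lines deleted.
    churn = sum(added + deleted for _, added, deleted in changed_files)
    # Tally files by extension; files without one are grouped under "(none)".
    file_types = Counter(
        PurePosixPath(path).suffix or "(none)" for path, _, _ in changed_files
    )
    return {
        "churn": churn,
        "num_files": len(changed_files),
        "file_types": dict(file_types),
    }


example = [("README.txt", 2, 1), ("src/IntegrationTest.java", 40, 5)]
print(change_metrics(example))
```

Keeping the file-type tally separate from churn makes it possible to compare, for instance, documentation-only changes against changes that touch test code.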
  15. Process Metrics
      Diagram labels: CI build information (builds b1, b2, b3, b4 over time t); VCS commit graph (commits a, b, c, d, e, f, g)
      • Build History
        ○ Build climate
      • Build Type
        ○ Pull request, merge, ...
      • Pull Request Scenarios
        ○ Rebase, squash, ...
  16. Statistical Correlation Analysis
      • For each project individually
      • Non-parametric correlation tests
        ○ Pearson’s chi-square test
        ○ Mann-Whitney U test
      • Calculate effect sizes
        ○ Cramér’s V
        ○ Rank-biserial correlation
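The two effect sizes named on slide 16 can be written down in a few lines. The sketch below uses only the standard library and textbook definitions: Cramér's V as sqrt(chi² / (n · (min(rows, cols) − 1))) and the rank-biserial correlation as 2U / (n1 · n2) − 1, where U counts the pairs in which a value from the first sample exceeds one from the second (ties count half). The function names and the example data are made up for illustration, no continuity correction is applied, and this is not necessarily how the authors computed their values.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows).

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


def cramers_v(table):
    """Cramér's V effect size derived from the chi-square statistic."""
    stat, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(stat / (n * (k - 1)))


def rank_biserial(xs, ys):
    """Rank-biserial correlation from the Mann-Whitney U statistic."""
    # U for xs: number of (x, y) pairs with x > y, ties counted as 0.5.
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)
    return 2 * u / (len(xs) * len(ys)) - 1


# Made-up example: rows = previous build (failed, passed),
# columns = current build (failed, passed).
print(cramers_v([[30, 10], [10, 50]]))
# Made-up example: churn of failed builds vs. passed builds.
print(rank_biserial([120, 80, 300], [10, 25, 90]))
```

Both measures are bounded, which makes them comparable across projects of very different size: Cramér's V lies in [0, 1] and the rank-biserial correlation in [-1, 1].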
  17. Findings
      Chart: percentage of builds by build outcome (Passed, Failed), grouped by previous build result (Failed, Passed). Diagram labels: b, b’
      • Build history: Build failures mostly occur consecutively. Phases of build instability perpetuate failures.
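The relationship charted on slide 17 (build outcome grouped by previous build result) can be approximated with a short computation. The sketch assumes a simple linear build history given as a chronological list of outcome strings; the function name `failure_share_by_previous` and the sample data are invented for illustration. On a real commit graph, determining which build counts as the "previous" one is more involved.

```python
from collections import Counter


def failure_share_by_previous(outcomes):
    """Share of failed builds, grouped by the previous build's result.

    outcomes: chronological list of "passed" / "failed" strings
    (a linear history is assumed for this sketch).
    """
    # Count (previous, current) outcome pairs of consecutive builds.
    pairs = Counter(zip(outcomes, outcomes[1:]))
    shares = {}
    for previous in ("passed", "failed"):
        total = pairs[(previous, "passed")] + pairs[(previous, "failed")]
        if total:
            shares[previous] = pairs[(previous, "failed")] / total
    return shares


# Made-up history for illustration only.
history = ["passed", "passed", "failed", "failed", "failed", "passed"]
print(failure_share_by_previous(history))
```

A markedly higher failure share after a failed build than after a passed one would be consistent with the finding that failures tend to occur consecutively.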
  19. Findings
      • Build history: Build failures mostly occur consecutively. Phases of build instability perpetuate failures.
      • Pull request scenarios: No evidence that either history manipulation operations or parallel development to a PR affect the PR’s build outcome.
  20. Findings
      • File types: Even objectively harmless changes can break builds. This indicates unwanted flakiness of tests or the build environment.
        ○ 577 builds from Spring Boot
        ○ Changelog file change only
        ○ 14% original failures: 52% test failures, 45% environment crash, 3% dependency error
      • Build history: Build failures mostly occur consecutively. Phases of build instability perpetuate failures.
      • Pull request scenarios: No evidence that either history manipulation operations or parallel development to a PR affect the PR’s build outcome.
  21. Summary
      • Categorization of error types (beyond failed/errored)
      • Quantification of error type occurrence
      • Statistical analysis of impact factors
      • Uncovered challenges that arise when mining CI data
  22. Dipl.-Ing. Thomas Rausch, Research Assistant
      TU Wien, Distributed Systems Group
      Argentinierstraße 8/184-1, 1040, Vienna, Austria
      T: +43 1 58801 184 838
      E: rausch@dsg.tuwien.ac.at
      dsg.tuwien.ac.at/staff/trausch
