An Empirical Analysis of Build Failures in the
Continuous Integration Workflows
of Java-Based Open-Source Software
Thomas Rausch, Waldemar Hummer, Philipp Leitner*, Stefan Schulte
Distributed Systems Group
Vienna University of Technology, Austria
http://dsg.tuwien.ac.at
* Software Evolution and Architecture Lab
University of Zurich, Switzerland
http://www.ifi.uzh.ch/en/seal.html
Continuous Integration
[Figure: continuous integration workflow with the labels VCS, CI server, build, feedback, and logs]
Vasilescu et al. (2015).
Quality and Productivity Outcomes Relating to
Continuous Integration in GitHub
“Our main finding is that continuous
integration improves the productivity of project
teams”
Kerzazi et al. (2014).
Why do Automated Builds Break? An Empirical Study
“We [...] quantified the cost of such build
breakage as more than 336.18 man-hours”
Related Work
Understanding Build Failures
What types of errors cause CI build failures?
Which development practices can be
associated with CI build failures?
Research Setting
Project Name        Description
Apache Storm        Distributed Computation
Butterknife         Android Dependency Injection
Crate.IO            Scalable SQL database
JabRef              BibTeX management GUI
jcabi-github        Wrapper of GitHub API
Hystrix             Latency and fault tolerance library
Presto              Distributed SQL query engine
Openmicroscopy      Microscopy data environment
RxAndroid           RxJava bindings for Android
Sponge API          Minecraft plugin API
Spring Boot         Java Application Framework
Square OkHttp       HTTP+HTTP/2 client for Android
Square Retrofit     HTTP client for Android
Wordpress-Android   WordPress for Android
Data Acquisition
[Figure: topology mapping between the CI build history and the change history (nodes a to d)]
Error Categorization and Quantification
• Goal
  ○ Categorization of errors
  ○ Frequency of occurrence of error types
• Approach
  ○ Systematic exploration of ~54 000 logfiles
  ○ Categorization scheme based on log message patterns
[INFO] Compiling 67 source files to /home/travis/.../target/classes
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /home/travis/.../redis/RedisAutoConfiguration.java:[143,10] cannot find symbol
[INFO] 1 error
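A categorization scheme based on log message patterns could be sketched as follows. This is a minimal illustration, not the authors' implementation: the labels are taken from the slide's category names, while the regular expressions, the `PATTERNS` table, and the `categorize` function are assumptions made for this example.

```python
import re

# Hypothetical pattern table. Labels follow the category names used in the
# slides; the regular expressions are illustrative assumptions.
PATTERNS = [
    ("compile", re.compile(r"\[ERROR\] COMPILATION ERROR")),
    ("testfailure", re.compile(r"Tests run: \d+, Failures: [1-9]")),
    ("dependency", re.compile(r"Could not resolve dependencies")),
]


def categorize(log_text):
    """Return the label of the first pattern found in the log, else 'unknown'."""
    for label, pattern in PATTERNS:
        if pattern.search(log_text):
            return label
    return "unknown"


# Log excerpt copied from the slide above.
sample_log = """\
[INFO] Compiling 67 source files to /home/travis/.../target/classes
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /home/travis/.../redis/RedisAutoConfiguration.java:[143,10] cannot find symbol
[INFO] 1 error
"""

print(categorize(sample_log))
```

Because the first matching pattern wins, the order of the table decides how a log with several error messages is labeled; a real scheme would need a deliberate ordering or multi-label output.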
Error Categories
Label           Description                                  Occurrences
testfailure     An automated test failed                     12
compile         Compilation error                            12
git             VCS interaction error                        12
buildconfig     Faulty build config                          11
crash           Build environment crash or timeout           11
dependency      Dependency error                             11
quality         Coding-rule violation (e.g., Checkstyle)     10
unknown         Errors without a clearly identifiable cause  9
itestfailure    An automated integration test failed         4
doc             Documentation (e.g., JavaDoc) problem        3
license         License criteria not met (missing header)    3
compatibility   API incompatibility                          2
androidsdk      Android SDK-related error                    1
buildout        Error specific to Crate.IO python module     1
Distribution of Common Error Types
[Figure: bar chart of the share of each common error type (faulty VCS interaction, faulty build configuration, dependency error, compilation error, coding-rule violation, failing test, crash); axis from 0% to 40%]
Distribution of Common Error Types
[Figure: per-project stacked bars (Apache Storm, Butterknife, Crate.IO, Hystrix, JabRef, jcabi-github, Presto, RxAndroid, SpongeAPI, Spring Boot, Square OkHttp, Square Retrofit) showing the percentage of each error type (testfailure, compile, git, dependency, crash, buildconfig, quality, others); axis from 0% to 100%]
Change Metrics
[Figure: changes to .java and .txt files]
• Complexity
  ○ Churn, number of files, ...
• File types
  ○ README.txt vs. IntegrationTest.java
• Date and time
• Author
  ○ Experience, commit frequency, ...
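The complexity and file-type metrics listed above can be illustrated with a small sketch. It assumes a change is available as a list of touched files with added and deleted line counts; the `FileChange` record and the `change_metrics` function are hypothetical names, and churn is taken here as added plus deleted lines, which is one common definition rather than necessarily the one used in the study.

```python
from collections import Counter
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class FileChange:
    """One file touched by a change (hypothetical record layout)."""
    path: str
    lines_added: int
    lines_deleted: int


def change_metrics(changes):
    """Compute churn, number of files, and a file-type histogram for one change."""
    churn = sum(c.lines_added + c.lines_deleted for c in changes)
    file_types = Counter(PurePosixPath(c.path).suffix or "(none)" for c in changes)
    return {
        "churn": churn,
        "num_files": len(changes),
        "file_types": dict(file_types),
    }


# Made-up example change touching a text file and a Java test.
commit = [
    FileChange("README.txt", 3, 1),
    FileChange("src/test/IntegrationTest.java", 40, 12),
]
metrics = change_metrics(commit)
print(metrics)
```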
Process Metrics
[Figure: CI build information (builds b1 to b4 along a time axis t) mapped onto the VCS commit graph (nodes a to g)]
• Build History
  ○ Build climate
• Build Type
  ○ Pull request, merge, ...
• Pull Request Scenarios
  ○ Rebase, squash, ...
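Build-history metrics can be derived from a chronological list of build outcomes, as in the sketch below. The slides do not define "build climate", so the sketch assumes it is the share of failed builds among the previous `window` builds; that definition, the window size of 3, and all function names are assumptions for illustration only.

```python
def build_history_features(outcomes, window=3):
    """Derive history features from chronological outcomes (True = build failed)."""
    features = []
    for i, failed in enumerate(outcomes):
        previous = outcomes[max(0, i - window):i]
        features.append({
            "failed": failed,
            # Outcome of the directly preceding build, None for the first build.
            "previous_failed": outcomes[i - 1] if i > 0 else None,
            # Assumed "climate": failure share among the last `window` builds.
            "climate": sum(previous) / len(previous) if previous else None,
        })
    return features


def failure_rate_given_previous(features, previous_failed):
    """Share of failed builds among builds whose predecessor had the given outcome."""
    matching = [f for f in features if f["previous_failed"] == previous_failed]
    if not matching:
        return None
    return sum(f["failed"] for f in matching) / len(matching)


# Made-up build history for illustration.
history = [False, True, True, False, False, True, True, True]
feats = build_history_features(history)
print(failure_rate_given_previous(feats, True))
print(failure_rate_given_previous(feats, False))
```

Comparing the two conditional failure rates is the kind of summary that a plot of build outcome by previous build result conveys.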
Statistical Correlation Analysis
• For each project individually
• Non-parametric correlation tests
  ○ Pearson’s chi-square test
  ○ Mann-Whitney U test
• Calculate effect sizes
  ○ Cramér’s V
  ○ Rank-biserial correlation
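The test statistics and effect sizes named above can be computed with the standard textbook formulas, sketched here in plain Python. The input data is made up, the function names are hypothetical, and p-values are omitted; in practice a statistics library would supply them. The sketch assumes every row and column of the contingency table has a nonzero total.

```python
import math


def chi_square(table):
    """Pearson's chi-square statistic for a contingency table (no correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def cramers_v(table):
    """Cramér's V effect size: sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))


def mann_whitney_u(xs, ys):
    """U statistic for xs: each pair with x > y counts 1, each tie counts 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)


def rank_biserial(xs, ys):
    """Rank-biserial correlation: 2U / (n1 * n2) - 1, positive if xs tend to be larger."""
    return 2 * mann_whitney_u(xs, ys) / (len(xs) * len(ys)) - 1


# Made-up data: rows = previous build failed/passed, columns = build failed/passed.
outcome_table = [[30, 10], [10, 30]]
# Made-up churn values for failed and passed builds.
churn_failed = [120, 300, 45]
churn_passed = [10, 60, 20, 15]

print(cramers_v(outcome_table))
print(rank_biserial(churn_failed, churn_passed))
```

The chi-square test with Cramér's V fits categorical factors such as build type, while the Mann-Whitney U test with the rank-biserial correlation fits numeric factors such as churn compared between failed and passed builds.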
Findings
Build history
Build failures mostly occur consecutively. Phases of build instability perpetuate failures.
[Figure: percentage of builds by build outcome (passed, failed), grouped by previous build result (failed, passed)]
Pull request scenarios
No evidence that either history manipulation operations or parallel development to a PR affect the PR’s build outcome.
File types
Even objectively harmless changes can break builds. This indicates unwanted flakiness of tests or the build environment.
• 577 builds from Spring Boot
• Changelog file change only
• 14% original failures
  ○ 52% test failures
  ○ 45% environment crash
  ○ 3% dependency error
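The file-type finding rests on isolating builds whose change touched only a changelog file and then looking at how those builds failed. The slides do not say how such builds were identified, so the following is only a sketch under assumptions: the changelog file names, the build record layout, and the function names are all hypothetical, and the sample data is made up.

```python
from collections import Counter

# Assumed changelog file names; the study's actual selection rule is not given.
CHANGELOG_NAMES = ("CHANGELOG", "CHANGELOG.md", "CHANGELOG.txt")


def is_changelog_only(changed_paths):
    """True if the change touches at least one file and only changelog files."""
    return bool(changed_paths) and all(
        path.rsplit("/", 1)[-1] in CHANGELOG_NAMES for path in changed_paths
    )


def summarize_harmless_builds(builds):
    """Return (number of changelog-only builds, failure share, failures by category)."""
    harmless = [b for b in builds if is_changelog_only(b["paths"])]
    failed = [b for b in harmless if b["error"] is not None]
    share = len(failed) / len(harmless) if harmless else 0.0
    return len(harmless), share, Counter(b["error"] for b in failed)


# Made-up build records: changed paths plus an error label (None = passed).
builds = [
    {"paths": ["CHANGELOG.md"], "error": None},
    {"paths": ["CHANGELOG.md"], "error": "testfailure"},
    {"paths": ["docs/CHANGELOG.md"], "error": "crash"},
    {"paths": ["CHANGELOG.md", "src/Main.java"], "error": "compile"},
    {"paths": ["CHANGELOG.md"], "error": None},
]
count, failure_share, by_category = summarize_harmless_builds(builds)
print(count, failure_share, dict(by_category))
```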
Summary
• Categorization of error types (beyond failed/errored)
• Quantification of error type occurrence
• Statistical analysis of impact factors
• Uncovered challenges that arise when mining CI data
Dipl.-Ing.
Thomas Rausch
Research Assistant
TU Wien
Distributed Systems Group
Argentinierstraße 8/184-1, 1040, Vienna, Austria
T: +43 1 58801 184 838
E: rausch@dsg.tuwien.ac.at
dsg.tuwien.ac.at/staff/trausch
