Build Failure Prediction in Continuous Integration Workflows

Master’s Thesis Presentation
Software Engineering & Internet Computing
Thomas Rausch
Advisor: Stefan Schulte
Co-Advisor: Waldemar Hummer

Build
[Figure: timeline t on which a build turns code into a product]

Continuous Integration (CI)
● Build at every change
● Identify problems early
● Allow frequent releases
[Figure: the VCS triggers a build on the CI server, which returns feedback]
P. M. Duvall, S. Matyas, and A. Glover, Continuous integration:
improving software quality and reducing risk. Pearson Education, 2007.

Build Failure
[Figure: timeline t of a failed build; resources are wasted while developers examine the build log, determine the error, and fix it]

Software Defect Prediction
[Figure: complex code, bad practices, and an unstable environment as causes of a software defect]
M. D’Ambros, M. Lanza, and R. Robbes, Evaluating defect prediction approaches:
a benchmark and an extensive comparison. Empirical Software Engineering, vol. 17, no. 4–5, 2012.

Build Failure Prediction?
[Figure: do complex code, bad practices, and an unstable environment also cause build failures?]

Research Questions
Which errors cause CI build failures?
What factors influence build outcomes?
Can we predict build failures?

Empirical Study
Research Setting
● 14 open source projects that employ CI
[Figure: topology mapping between the VCS change history (commits a, b, c, d) and the CI build history (builds and logs)]

Outline
I Introduction
II Solution Approach
Study
III Systematic study of build errors
IV Factors influencing build failures
V Build failure prediction
VI Summary & Conclusion

III Systematic Study of Build Errors

Build Steps
fetch, validate, compile, inspect, test

Build Failure

Build Error
Build log output
“[ERROR] Compilation Error”
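Deriving an error category from build log output can be sketched as matching log lines against a list of patterns. This is a minimal sketch under stated assumptions: only the "[ERROR] Compilation Error" message comes from the slide, while the other patterns, the category labels, and the name classify_build_log are hypothetical illustrations and not the rules used in the study.

```python
import re

# Hypothetical patterns; only the compilation message appears on the slide.
ERROR_PATTERNS = [
    ("compilation error", re.compile(r"\[ERROR\] Compilation Error", re.IGNORECASE)),
    ("failing test", re.compile(r"Tests run: \d+, Failures: [1-9]")),
    ("dependency error", re.compile(r"Could not resolve dependencies")),
]


def classify_build_log(log_text):
    """Return the category of the first matching log line, or None."""
    for line in log_text.splitlines():
        for category, pattern in ERROR_PATTERNS:
            if pattern.search(line):
                return category
    return None


if __name__ == "__main__":
    print(classify_build_log("[INFO] Building\n[ERROR] Compilation Error\n"))
```

A first-match rule keeps the sketch short; a real classifier would likely need project-specific patterns and a rule for logs that contain several errors.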

Build Error Frequency
[Bar chart: share of build errors per category (faulty VCS interaction, faulty build configuration, dependency error, compilation error, coding-rule violation, failing test, crash); y-axis from 0% to 40%; one annotation reads 62%]

Build Error
[Figure: build error on the build timeline t]

Error Probability Distribution
[Figure: error frequency over build time t in seconds]

IV Factors Influencing Build Results

Causes for Build Failures
[Figure: complex code, bad practices, and an unstable environment as possible causes of a build failure]

Measurable Properties
[Figure: fn(complex code, bad practices, unstable environment) = ?]

[Figure: complex code, bad practices, unstable environment = fn(?)]

Change Metrics
● What was done to the software?
Process Metrics
● How were the changes applied?

Change Metrics
● Complexity
● File types
● Date and time
● Author
[Figure: changes to files of different types, e.g. .java and .txt]
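Change metrics of this kind can be computed per commit from the list of changed files. The sketch below is an illustration under assumptions: the input shape (path, lines added, lines removed), the function name change_metrics, and the use of line and file counts as a stand-in for change complexity are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from os.path import splitext


def change_metrics(changed_files):
    """Summarize a change given (path, lines_added, lines_removed) tuples."""
    # Count files per extension; files without an extension get "(none)".
    file_types = Counter(splitext(path)[1] or "(none)" for path, _, _ in changed_files)
    return {
        "files_changed": len(changed_files),
        "lines_added": sum(added for _, added, _ in changed_files),
        "lines_removed": sum(removed for _, _, removed in changed_files),
        "file_types": dict(file_types),
    }


if __name__ == "__main__":
    print(change_metrics([("src/App.java", 10, 2), ("README.txt", 1, 0)]))
```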

Process Metrics
● Build history
● Build type
● Integration scenario
[Figure: topology mapping that links CI build information (builds b1 to b4 over time t) to the VCS commit graph (commits a to g)]

Statistical Correlation Analyses
Categorical Variables
● {c1, …, cn} ~ {passed, failed}: Pearson’s chi-square test
Numerical Variables
● R ~ {passed, failed}: Mann–Whitney U test
D. J. Sheskin, Handbook of Parametric and Nonparametric Statistical Procedures.
CRC Press, 2003.
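Both test statistics are simple to compute by hand, which makes clear what each test compares. The following is a minimal sketch that computes only the statistics, not the p-values; the function names and the example numbers are made up for illustration. In practice a statistics library such as SciPy (scipy.stats.chi2_contingency, scipy.stats.mannwhitneyu) would be used instead.

```python
def chi_square_statistic(table):
    """Pearson's chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if category and build outcome were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def mann_whitney_u(xs, ys):
    """Mann-Whitney U statistic of sample xs versus ys; ties count as 0.5."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


if __name__ == "__main__":
    # Rows: two hypothetical categories; columns: passed, failed builds.
    print(chi_square_statistic([[30, 10], [20, 40]]))
    # Hypothetical numerical metric for failed versus passed builds.
    print(mann_whitney_u([1, 2, 3], [2, 4]))
```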

Results
Metric                 Meta
Build history          +++++
Build type             ++
Author                 ++
Change complexity      +
Date and time          +
File types             ~
Integration scenario   ~
Meta statistics
● Using Fisher’s method
● Relative relation strength
W. R. van Zwet and J. Oosterhoff, On the combination of independent test
statistics. The Annals of Mathematical Statistics, vol. 38, no. 3, pp. 659–680, 1967.
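Fisher's method combines k independent p-values into the statistic X = -2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The sketch below shows that combination step only; how the combined values were translated into the "+" ratings is not stated on the slides, and the function name fisher_combined_p is an assumption.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values (each > 0) with Fisher's method.

    Returns the test statistic and the combined p-value.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    # For an even number of degrees of freedom (2k), the chi-square
    # survival function is exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    half = statistic / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


if __name__ == "__main__":
    # Hypothetical per-project p-values for one metric.
    print(fisher_combined_p([0.04, 0.20, 0.01]))
```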

Results
[Figure: percentage of builds by build outcome (passed, failed), grouped by previous build result (failed, passed), for a build b’ that follows build b]


Results
Example
● Documentation file (changelog)
● 577 builds with documentation changes only
● 14% original failures
  ● 52% test failures
  ● 45% environment crash
  ● 3% dependency error
Similar behavior in all projects
Noise skews statistics

Machine Learning
[Figure: a classifier is trained on observations and predicts unknown outcomes]

Experiment Design
Well-known algorithms
● Naive Bayes
● C4.5 Decision Trees
● Random Forest
Feature sets
● Process metrics
● Change metrics
● Combined
Prediction for
● Binary (failed/passed)
● Multi-class (error type)
Baseline
● 0-R classifier: frequency table, predicts the average
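A 0-R baseline ignores all features and derives its prediction from the label frequency table alone. A minimal sketch for the binary case, where it always predicts the most frequent label; the class name ZeroR and its method names are illustrative choices, not the implementation used in the experiments.

```python
from collections import Counter


class ZeroR:
    """Baseline that always predicts the most frequent training label."""

    def fit(self, labels):
        # The frequency table is all this classifier learns.
        self.frequencies_ = Counter(labels)
        self.prediction_ = self.frequencies_.most_common(1)[0][0]
        return self

    def predict(self, n):
        """Return the same prediction for n unseen builds."""
        return [self.prediction_] * n


if __name__ == "__main__":
    model = ZeroR().fit(["passed", "passed", "failed"])
    print(model.predict(2))
```

Any useful classifier has to beat this baseline, which is why it is a common reference point on imbalanced data such as build outcomes.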

Binary classification result
Binary classification
● F1-score, 1.0 = perfect
● Ranging from 0.71 to 0.91
[Bar chart: average F1-scores per algorithm (0-R, NB, C4.5, RF) and feature set (All, CM, PM); y-axis from 0.3 to 0.9]
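The F1-score is the harmonic mean of precision and recall for one class. A minimal sketch, assuming "failed" is treated as the positive class; the slides do not state how the per-class scores were averaged, so this shows the score for a single class only.

```python
def f1_score(actual, predicted, positive="failed"):
    """F1-score of the `positive` class for two equally long label lists."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    actual = ["failed", "failed", "passed", "passed"]
    predicted = ["failed", "passed", "failed", "passed"]
    print(f1_score(actual, predicted))
```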

Multi-class classification results
Multi-class classification
● p = [p1 ... pn]
● RMSE
● Error between actual and predicted
[Bar chart: average RMSE per algorithm (0-R, NB, C4.5, RF) and feature set (All, CM, PM); y-axis from 0.0 to 0.3]
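The RMSE between an actual and a predicted vector can be computed as follows. This is a sketch under an assumption: the actual outcome is encoded as a one-hot vector over the error types, and the prediction is the probability vector p = [p1 ... pn]. The slides only say "error between actual and predicted", so the encoding is illustrative.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equally long numeric vectors."""
    if len(actual) != len(predicted):
        raise ValueError("vectors must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Assumed encoding: the second of three error types actually occurred.
    print(rmse([0, 1, 0], [0.2, 0.7, 0.1]))
```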

Update Predictions
False positive elimination
● Likelihood for error declines
● Plausibility of prediction declines
[Figure: build timeline with the markers t50, t75, and tmax]
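One way to make the declining plausibility concrete is a Bayesian update: the longer a build runs without an error, the smaller the share of historical errors that could still occur. The sketch below is an illustration and not the update rule from the thesis; the function name, the empirical error times, and the use of Bayes' rule are assumptions.

```python
def updated_failure_probability(prior, error_times, elapsed):
    """Failure probability of a running build with no error up to `elapsed`.

    prior: predicted failure probability at build start.
    error_times: seconds at which errors occurred in past failed builds.
    """
    # Share of historical errors that occurred later than `elapsed`.
    remaining = sum(t > elapsed for t in error_times) / len(error_times)
    numerator = prior * remaining
    denominator = numerator + (1.0 - prior)
    if denominator == 0.0:
        # Certain failure was predicted, yet every known error time has passed.
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    history = [10, 20, 30, 40]  # hypothetical error times in seconds
    for elapsed in (0, 20, 40):
        print(elapsed, updated_failure_probability(0.5, history, elapsed))
```

With this rule, a predicted failure that has not materialized once all known error times have passed drops to zero, which matches the idea of eliminating false positives during execution.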

Research Questions - Answered
Which errors cause CI build failures?
● Failing tests (41%)
● Coding-rule violations (11%)
● Compilation errors (10%)
● A number of errors involving the CI environment

What factors influence build outcomes?
● Process Metrics
  ● Build history: failures persist
  ● Build type: merges vs. forward engineering
● Change Metrics
  ● Actually, not so much

Can we predict build failures?
● Yes!
● We can update predictions during the execution
State    Rate   min   max
Passed   87%    45%   99%
Failed   66%    27%   96%

Contributions
In-depth analysis of CI workflow
● Multiplicity of build errors
● Factors influencing build failures
Topology mapping
Baseline for CI build failure prediction

Future Work
Examine more influence factors
● Developer behavior
● Project management workflows
Improve learning methods
Incorporate prediction into CI tools
● Improve feedback mechanism
● Improve development productivity

The end?
Questions & Answers
Thomas Rausch
t.rausch@infosys.tuwien.ac.at

Research Subjects
Project Name        Description
Apache Storm        Distributed Computation Framework
Butterknife         Android Dependency Injection Library
Crate.IO            Scalable SQL database
JabRef              Graphical Java application for managing BibTeX databases
jcabi-github        Object Oriented Wrapper of GitHub API
Hystrix             Latency and fault tolerance library for distributed systems
Presto              Distributed SQL query engine for big data
Openmicroscopy      Microscopy data environment
RxAndroid           RxJava bindings for Android
Sponge API          Minecraft plugin API
Spring Boot         Java Application Framework
Square OkHttp       HTTP+HTTP/2 client for Android and Java
Square Retrofit     HTTP client for Android and Java
Wordpress-Android   WordPress for Android

Integration Scenarios
[Figure: four example commit graphs (with commits a to g, a to f, a to e, and a to d) illustrating integration scenarios, with the labels Initialization and Update]

Related Work
Build Failure Prediction
● Hassan and Zhang 2006
  ● Outdated assumptions
  ● No CI workflow considerations
● Kerzazi et al. 2014
  ● Only statistical analysis, no prediction
  ● Only binary build outcome
● Wolf 2009, Schröter 2010
  ● Socio-technical factors
  ● Closed source enterprise software

Data Source
{
  "id": 22555277,
  "commit_id": 6534711,
  "number": "784",
  "pull_request_number": "1912",
  "pull_request_title": "Example PR",
  "started_at": "2014-04-08T19:37:44Z",
  "finished_at": "2014-04-08T19:52:56Z",
  "duration": 2648,
  "state": "failed",
  "commit": { ... }
}
Travis-CI
● Hosted CI service
● RESTful API
● Integrated with GitHub
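A build record of this shape can be read with a standard JSON parser. The sketch below uses only fields shown on the slide and leaves out the elided commit object; the helper names wall_clock_seconds and has_failed are hypothetical, and no API call is made.

```python
import json
from datetime import datetime

# Fields copied from the slide; the elided "commit" object is omitted.
record = json.loads("""
{
  "id": 22555277,
  "number": "784",
  "started_at": "2014-04-08T19:37:44Z",
  "finished_at": "2014-04-08T19:52:56Z",
  "duration": 2648,
  "state": "failed"
}
""")


def wall_clock_seconds(build):
    """Seconds between the build's start and finish timestamps."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    started = datetime.strptime(build["started_at"], fmt)
    finished = datetime.strptime(build["finished_at"], fmt)
    return int((finished - started).total_seconds())


def has_failed(build):
    """True if the record's state marks the build as failed."""
    return build["state"] == "failed"


if __name__ == "__main__":
    print(wall_clock_seconds(record), record["duration"], has_failed(record))
```

The reported "duration" value (2648) differs from the wall-clock difference between the two timestamps (912 seconds); the sketch keeps both and makes no assumption about how the service computes the field.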

Runtime Evolution
