MSR 2017 Mining Challenge
Moritz Beller @Inventitech
Georgios Gousios, Andy Zaidman
MSR 2017 Mining Challenge
Moritz Beller @Inventitech
Georgios Gousios, Andy Zaidman
TODO: Add background with
Sun
How Came to Be
TODO: Add background with
Sun
How Came to Be
TODO: Add background with
Sun
How Came to Be
Tomorrow, Sunday, 11:00-12.30 Cine. Ground floor
TODO: Add background with
Sun
How Came to Be
Tomorrow, Sunday, 11:00-12.30 Cine. Ground floor
TODO: Add background with
Sun
Travis 101
TODO: Add background with
Sun
Travis 101
TODO: Add background with
Sun
Travis 101
TODO: Add background with
Sun
Travis 101
Project: 1,200
TODO: Add background with
Sun
Travis 101
Builds:
680,209
Project: 1,200
TODO: Add background with
Sun
Travis 101
Builds:
680,209
Project: 1,200 Build result
TODO: Add background with
Sun
Travis 101
Builds:
680,209
Project: 1,200
Duration
Build result
TODO: Add background with
Sun
Travis 101
Builds:
680,209
Project: 1,200
Duration
Languages: Java, Ruby
Build result
TODO: Add background with
Sun
Travis 101
Builds:
680,209
Project: 1,200
Jobs: 2+ Million
Duration
Languages: Java, Ruby
Build result
TODO: Add background with
Sun
Travis 101
Builds:
680,209
Project: 1,200
Jobs: 2+ Million
Duration
Languages: Java, Ruby
Build result
TODO: Add background with
Sun
Travis 101
Build(job) log (up to 50M!):
1 Terabyte
(Sorry for downloading it
thrice, Travis CI!)
Builds:
680,209
Project: 1,200
Jobs: 2+ Million
Duration
Languages: Java, Ruby
Build result
TODO: Add background with
Sun
Travis 101
Build(job) log (up to 50M!):
1 Terabyte
(Sorry for downloading it
thrice, Travis CI!)
Builds:
680,209
Project: 1,200
Jobs: 2+ Million
Duration
Link to Git(Hub)
Languages: Java, Ruby
Build result
TODO: Add background with
Sun
Facts About TravisTorrent
TODO: Add background with
Sun
Facts About TravisTorrent
- git_branch
- git_all_built_commits
- ...
TODO: Add background with
Sun
Facts About TravisTorrent
- git_branch
- git_all_built_commits
- ...
- gh_project_name
- gh_is_pr
- gh_team_size
- gh_diff_tests_added
- gh_first_commit_...
TODO: Add background with
Sun
Facts About TravisTorrent
- git_branch
- git_all_built_commits
- ...
- gh_project_name
- gh_is_pr
- gh_team_size
- gh_diff_tests_added
- gh_first_commit_...
- tr_prev_build
- tr_status
- tr_log_tests_ran
- tr_log_tests_failed
TODO: Add background with
Sun
Facts About TravisTorrent
- git_branch
- git_all_built_commits
- ...
- gh_project_name
- gh_is_pr
- gh_team_size
- gh_diff_tests_added
- gh_first_commit_...
- tr_prev_build
- tr_status
- tr_log_tests_ran
- tr_log_tests_failed
TODO: Add background with
Sun
Facts About TravisTorrent
- git_branch
- git_all_built_commits
- ...
- gh_project_name
- gh_is_pr
- gh_team_size
- gh_diff_tests_added
- gh_first_commit_...
- tr_prev_build
- tr_status
- tr_log_tests_ran
- tr_log_tests_failed
new!
new!
62columns
TODO: Add background with
Sun
Community-Driven Development (CDD)
TODO: Add background with
Sun
Community-Driven Development (CDD)
● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at
are always 22:00 or 23:00 PM”
→ Bug in R’s anytime library (“anytime aims to be that general purpose
converter no matter the input” - "You are reading too much into this. We
heuristically try a bunch formats. None includes TZ because we can't. No
more, no less.” - Well, parsedate can.)
● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run
during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and
'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?”
→ Inconsistency: tr_tests_run should be NA instead 0
● Sebastian Baltes: First BigQuery Integration Script
● Questions that improved our documentation: Omar Elazhary, Matthias
Bussonnier, Ward Muylaert, and many others in disqus!
TODO: Add background with
Sun
Community-Driven Development (CDD)
● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at
are always 22:00 or 23:00 PM”
→ Bug in R’s anytime library (“anytime aims to be that general purpose
converter no matter the input” - "You are reading too much into this. We
heuristically try a bunch formats. None includes TZ because we can't. No
more, no less.” - Well, parsedate can.)
● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run
during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and
'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?”
→ Inconsistency: tr_tests_run should be NA instead 0
● Sebastian Baltes: First BigQuery Integration Script
● Questions that improved our documentation: Omar Elazhary, Matthias
Bussonnier, Ward Muylaert, and many others in disqus!
TODO: Add background with
Sun
Community-Driven Development (CDD)
● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at
are always 22:00 or 23:00 PM”
→ Bug in R’s anytime library (“anytime aims to be that general purpose
converter no matter the input” - "You are reading too much into this. We
heuristically try a bunch formats. None includes TZ because we can't. No
more, no less.” - Well, parsedate can.)
● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run
during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and
'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?”
→ Inconsistency: tr_tests_run should be NA instead 0
● Sebastian Baltes: First BigQuery Integration Script
● Questions that improved our documentation: Omar Elazhary, Matthias
Bussonnier, Ward Muylaert, and many others in disqus!
TODO: Add background with
Sun
Community-Driven Development (CDD)
● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at
are always 22:00 or 23:00 PM”
→ Bug in R’s anytime library (“anytime aims to be that general purpose
converter no matter the input” - "You are reading too much into this. We
heuristically try a bunch formats. None includes TZ because we can't. No
more, no less.” - Well, parsedate can.)
● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run
during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and
'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?”
→ Inconsistency: tr_tests_run should be NA instead 0
● Sebastian Baltes: First BigQuery Integration Script
● Questions that improved our documentation: Omar Elazhary, Matthias
Bussonnier, Ward Muylaert, and many others in disqus!
TODO: Add background with
Sun
Community-Driven Development (CDD)
● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at
are always 22:00 or 23:00 PM”
→ Bug in R’s anytime library (“anytime aims to be that general purpose
converter no matter the input” - "You are reading too much into this. We
heuristically try a bunch formats. None includes TZ because we can't. No
more, no less.” - Well, parsedate can.)
● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run
during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and
'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?”
→ Inconsistency: tr_tests_run should be NA instead 0
● Sebastian Baltes: First BigQuery Integration Script
● Questions that improved our documentation: Omar Elazhary, Matthias
Bussonnier, Ward Muylaert, and many others in disqus!
TODO: Add background with
Sun
Community-Driven Development (CDD)
● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at
are always 22:00 or 23:00 PM”
→ Bug in R’s anytime library (“anytime aims to be that general purpose
converter no matter the input” - "You are reading too much into this. We
heuristically try a bunch formats. None includes TZ because we can't. No
more, no less.” - Well, parsedate can.)
● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run
during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and
'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?”
→ Inconsistency: tr_tests_run should be NA instead 0
● Sebastian Baltes: First BigQuery Integration Script
● Questions that improved our documentation: Omar Elazhary, Matthias
Bussonnier, Ward Muylaert, and many others in disqus!
TODO: Add background with
Sun
Community-Driven Development (CDD)
● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at
are always 22:00 or 23:00 PM”
→ Bug in R’s anytime library (“anytime aims to be that general purpose
converter no matter the input” - "You are reading too much into this. We
heuristically try a bunch formats. None includes TZ because we can't. No
more, no less.” - Well, parsedate can.)
● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run
during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and
'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?”
→ Inconsistency: tr_tests_run should be NA instead 0
● Sebastian Baltes: First BigQuery Integration Script
● Questions that improved our documentation: Omar Elazhary, Matthias
Bussonnier, Ward Muylaert, and many others in disqus!
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
TODO: Add background with
Sun
MSR Challenge Organization
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
Submissions
● Record number of 31 submissions
● 1 Desk-rejected (7 pages!), 1 pulled by authors
● 29 submissions into review
Reviews
● Each paper 3 reviews
● Double-blind worked mostly well
● Except if you have to double your PC (9 20)→
● Had to re-assign one paper (my bad)
Outcome
● 14 accepted, 15 rejected (acceptance rate: 48%)
● On purpose: Every paper with a valid contribution
should be discussed
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
Submissions
● Record number of 31 submissions
● 1 Desk-rejected (7 pages!), 1 pulled by authors
● 29 submissions into review
Reviews
● Each paper 3 reviews
● Double-blind worked mostly well
● Except if you have to double your PC (9 20)→
● Had to re-assign one paper (my bad)
Outcome
● 14 accepted, 15 rejected (acceptance rate: 48%)
● On purpose: Every paper with a valid contribution
should be discussed
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
Submissions
● Record number of 31 submissions
● 1 Desk-rejected (7 pages!), 1 pulled by authors
● 29 submissions into review
Reviews
● Each paper 3 reviews
● Double-blind worked mostly well
● Except if you have to double your PC (9 20)→
● Had to re-assign one paper (my bad)
Outcome
● 14 accepted, 15 rejected (acceptance rate: 48%)
● On purpose: Every paper with a valid contribution
should be discussed
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
Submissions
● Record number of 31 submissions
● 1 Desk-rejected (7 pages!), 1 pulled by authors
● 29 submissions into review
Reviews
● Each paper 3 reviews
● Double-blind worked mostly well
● Except if you have to double your PC (9 20)→
● Had to re-assign one paper (my bad)
Outcome
● 14 accepted, 15 rejected (acceptance rate: 48%)
● On purpose: Every paper with a valid contribution
should be discussed
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
Submissions
● Record number of 31 submissions
● 1 Desk-rejected (7 pages!), 1 pulled by authors
● 29 submissions into review
Reviews
● Each paper 3 reviews
● Double-blind worked mostly well
● Except if you have to double your PC (9 20)→
● Had to re-assign one paper (my bad)
Outcome
● 14 accepted, 15 rejected (acceptance rate: 48%)
● On purpose: Every paper with a valid contribution
should be discussed
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
Submissions
● Record number of 31 submissions
● 1 Desk-rejected (7 pages!), 1 pulled by authors
● 29 submissions into review
Reviews
● Each paper 3 reviews
● Double-blind worked mostly well
● Except if you have to double your PC (9 20)→
● Had to re-assign one paper (my bad)
Outcome
● 14 accepted, 15 rejected (acceptance rate: 48%)
● On purpose: Every paper with a valid contribution
should be discussed
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
Submissions
● Record number of 31 submissions
● 1 Desk-rejected (7 pages!), 1 pulled by authors
● 29 submissions into review
Reviews
● Each paper 3 reviews
● Double-blind worked mostly well
● Except if you have to double your PC (9 20)→
● Had to re-assign one paper (my bad)
Outcome
● 14 accepted, 15 rejected (acceptance rate: 48%)
● On purpose: Every paper with a valid contribution
should be discussed
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
Submissions
● Record number of 31 submissions
● 1 Desk-rejected (7 pages!), 1 pulled by authors
● 29 submissions into review
Reviews
● Each paper 3 reviews
● Double-blind worked mostly well
● Except if you have to double your PC (9 20)→
● Had to re-assign one paper (my bad)
Outcome
● 14 accepted, 15 rejected (acceptance rate: 48%)
● On purpose: Every paper with a valid contribution
should be discussed
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
Submissions
● Record number of 31 submissions
● 1 Desk-rejected (7 pages!), 1 pulled by authors
● 29 submissions into review
Reviews
● Each paper 3 reviews
● Double-blind worked mostly well
● Except if you have to double your PC (9 20)→
● Had to re-assign one paper (my bad)
Outcome
● 14 accepted, 15 rejected (acceptance rate: 48%)
● On purpose: Every paper with a valid contribution
should be discussed
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
Submissions
● Record number of 31 submissions
● 1 Desk-rejected (7 pages!), 1 pulled by authors
● 29 submissions into review
Reviews
● Each paper 3 reviews
● Double-blind worked mostly well
● Except if you have to double your PC (9 20)→
● Had to re-assign one paper (my bad)
Outcome
● 14 accepted, 15 rejected (acceptance rate: 48%)
● On purpose: Every paper with a valid contribution
should be discussed
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
Submissions
● Record number of 31 submissions
● 1 Desk-rejected (7 pages!), 1 pulled by authors
● 29 submissions into review
Reviews
● Each paper 3 reviews
● Double-blind worked mostly well
● Except if you have to double your PC (9 20)→
● Had to re-assign one paper (my bad)
Outcome
● 14 accepted, 15 rejected (acceptance rate: 48%)
● On purpose: Every paper with a valid contribution
should be discussed
Thank
you!
TODO: Add background with
Sun
MSR Challenge Organization
Submissions
● Record number of 31 submissions
● 1 Desk-rejected (7 pages!), 1 pulled by authors
● 29 submissions into review
Reviews
● Each paper 3 reviews
● Double-blind worked mostly well
● Except if you have to double your PC (9 20)→
● Had to re-assign one paper (my bad)
Outcome
● 14 accepted, 15 rejected (acceptance rate: 48%)
● On purpose: Every paper with a valid contribution
should be discussed
Thank
you!
TODO: Add background with
Sun
Lessons Learned (Future Chairs)
TODO: Add background with
Sun
Lessons Learned (Future Chairs)
● Oh, these !”§$%&/()= updates!
● Serious (time) commitment on updating, bug fixing, support
● Provide one data format (should have been CSV)
● Make your data perfect the first time ;)
● If you can, host someone else’s data
● Disqus panel proved helpful: 98 comments in total
● Make sure your PC is large enough
TODO: Add background with
Sun
Lessons Learned (Future Chairs)
● Oh, these !”§$%&/()= updates!
● Serious (time) commitment on updating, bug fixing, support
● Provide one data format (should have been CSV)
● Make your data perfect the first time ;)
● If you can, host someone else’s data
● Disqus panel proved helpful: 98 comments in total
● Make sure your PC is large enough
TODO: Add background with
Sun
Lessons Learned (Future Chairs)
● Oh, these !”§$%&/()= updates!
● Serious (time) commitment on updating, bug fixing, support
● Provide one data format (should have been CSV)
● Make your data perfect the first time ;)
● If you can, host someone else’s data
● Disqus panel proved helpful: 98 comments in total
● Make sure your PC is large enough
TODO: Add background with
Sun
Lessons Learned (Future Chairs)
● Oh, these !”§$%&/()= updates!
● Serious (time) commitment on updating, bug fixing, support
● Provide one data format (should have been CSV)
● Make your data perfect the first time ;)
● If you can, host someone else’s data
● Disqus panel proved helpful: 98 comments in total
● Make sure your PC is large enough
TODO: Add background with
Sun
Lessons Learned (Future Chairs)
● Oh, these !”§$%&/()= updates!
● Serious (time) commitment on updating, bug fixing, support
● Provide one data format (should have been CSV)
● Make your data perfect the first time ;)
● If you can, host someone else’s data
● Disqus panel proved helpful: 98 comments in total
● Make sure your PC is large enough
TODO: Add background with
Sun
Lessons Learned (Future Chairs)
● Oh, these !”§$%&/()= updates!
● Serious (time) commitment on updating, bug fixing, support
● Provide one data format (should have been CSV)
● Make your data perfect the first time ;)
● If you can, host someone else’s data
● Disqus panel proved helpful: 98 comments in total
● Make sure your PC is large enough
TODO: Add background with
Sun
Lessons Learned (Future Chairs)
● Oh, these !”§$%&/()= updates!
● Serious (time) commitment on updating, bug fixing, support
● Provide one data format (should have been CSV)
● Make your data perfect the first time ;)
● If you can, host someone else’s data
● Disqus panel proved helpful: 98 comments in total
● Make sure your PC is large enough
TODO: Add background with
Sun
Paper Arrivals
“click”
=> Early detection of #Submissions
TODO: Add background with
Sun
Topics of Submissions
keyword #accepts #rejects freq
build result 5 4 31%
machine learning 3 4 24%
prediction 2 2 14%
factor analysis 3 1 14%
time 3 1 14%
sentiment 1 2 10%
external vs. internal contribs 2 1 10%
project success 2 7%
testing 1 1 7%
attraction 1 3%
team size 1 3%
code review 1 3%
meta-paper 1 3%
TODO: Add background with
Sun
Topics of Submissions
keyword #accepts #rejects freq
build result 5 4 31%
machine learning 3 4 24%
prediction 2 2 14%
factor analysis 3 1 14%
time 3 1 14%
sentiment 1 2 10%
external vs. internal contribs 2 1 10%
project success 2 7%
testing 1 1 7%
attraction 1 3%
team size 1 3%
code review 1 3%
meta-paper 1 3%
TODO: Add background with
Sun
Topics of Submissions
keyword #accepts #rejects freq
build result 5 4 31%
machine learning 3 4 24%
prediction 2 2 14%
factor analysis 3 1 14%
time 3 1 14%
sentiment 1 2 10%
external vs. internal contribs 2 1 10%
project success 2 7%
testing 1 1 7%
attraction 1 3%
team size 1 3%
code review 1 3%
meta-paper 1 3%
TODO: Add background with
Sun
Topics of Submissions
keyword #accepts #rejects freq
build result 5 4 31%
machine learning 3 4 24%
prediction 2 2 14%
factor analysis 3 1 14%
time 3 1 14%
sentiment 1 2 10%
external vs. internal contribs 2 1 10%
project success 2 7%
testing 1 1 7%
attraction 1 3%
team size 1 3%
code review 1 3%
meta-paper 1 3%
TODO: Add background with
Sun
Topics of Submissions
keyword #accepts #rejects freq
build result 5 4 31%
machine learning 3 4 24%
prediction 2 2 14%
factor analysis 3 1 14%
time 3 1 14%
sentiment 1 2 10%
external vs. internal contribs 2 1 10%
project success 2 7%
testing 1 1 7%
attraction 1 3%
team size 1 3%
code review 1 3%
meta-paper 1 3%
TODO: Add background with
Sun
Awards
TODO: Add background with
Sun
Awards
TODO: Add background with
Sun
Awards
Best Paper
● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI)
● Prize: 200$ voucher from Travis CI (& Fame)
● Winner ...
TODO: Add background with
Sun
Awards
Best Paper
● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI)
● Prize: 200$ voucher from Travis CI (& Fame)
● Winner ...
TODO: Add background with
Sun
Awards
Best Paper
● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI)
● Prize: 200$ voucher from Travis CI (& Fame)
● Winner ...
TODO: Add background with
Sun
Awards
Best Paper
● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI)
● Prize: 200$ voucher from Travis CI (& Fame)
● Winner ...
TODO: Add background with
Sun
Awards
Best Paper
● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI)
● Prize: 200$ voucher from Travis CI (& Fame)
● Winner ...
Best Presentation
● Elected by you, plurality vote!
● Prize: Surprise (& Fame)
TODO: Add background with
Sun
Awards
Best Paper
● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI)
● Prize: 200$ voucher from Travis CI (& Fame)
● Winner ...
Best Presentation
● Elected by you, plurality vote!
● Prize: Surprise (& Fame)
TODO: Add background with
Sun
Awards
Best Paper
● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI)
● Prize: 200$ voucher from Travis CI (& Fame)
● Winner ...
Best Presentation
● Elected by you, plurality vote!
● Prize: Surprise (& Fame)
TODO: Add background with
Sun
Awards
Best Paper
● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI)
● Prize: 200$ voucher from Travis CI (& Fame)
● Winner ...
Best Presentation
● Elected by you, plurality vote!
● Prize: Surprise (& Fame)
TODO: Add background with
Sun
TravisTorrent: In the future ...
TODO: Add background with
Sun
TravisTorrent: In the future ...
● Total infrastructure redesign
● Real time statistics: BLAaaS
● Opt-in model for new projects
● Self-updating and synced to Google BigQuery
● Official Google show case data set (hopefully)
● Addition of new languages
● More intelligent log-analysis (for example, with neural
networks instead of RegExen)

TravisTorrent: Synthesizing Travis CI and GitHub for Full-Stack Research on Continuous Integration

  • 1.
    MSR 2017 MiningChallenge Moritz Beller @Inventitech Georgios Gousios, Andy Zaidman
  • 2.
    MSR 2017 MiningChallenge Moritz Beller @Inventitech Georgios Gousios, Andy Zaidman
  • 3.
    TODO: Add backgroundwith Sun How Came to Be
  • 4.
    TODO: Add backgroundwith Sun How Came to Be
  • 5.
    TODO: Add backgroundwith Sun How Came to Be Tomorrow, Sunday, 11:00-12.30 Cine. Ground floor
  • 6.
    TODO: Add backgroundwith Sun How Came to Be Tomorrow, Sunday, 11:00-12.30 Cine. Ground floor
  • 7.
    TODO: Add backgroundwith Sun Travis 101
  • 8.
    TODO: Add backgroundwith Sun Travis 101
  • 9.
    TODO: Add backgroundwith Sun Travis 101
  • 10.
    TODO: Add backgroundwith Sun Travis 101 Project: 1,200
  • 11.
    TODO: Add backgroundwith Sun Travis 101 Builds: 680,209 Project: 1,200
  • 12.
    TODO: Add backgroundwith Sun Travis 101 Builds: 680,209 Project: 1,200 Build result
  • 13.
    TODO: Add backgroundwith Sun Travis 101 Builds: 680,209 Project: 1,200 Duration Build result
  • 14.
    TODO: Add backgroundwith Sun Travis 101 Builds: 680,209 Project: 1,200 Duration Languages: Java, Ruby Build result
  • 15.
    TODO: Add backgroundwith Sun Travis 101 Builds: 680,209 Project: 1,200 Jobs: 2+ Million Duration Languages: Java, Ruby Build result
  • 16.
    TODO: Add backgroundwith Sun Travis 101 Builds: 680,209 Project: 1,200 Jobs: 2+ Million Duration Languages: Java, Ruby Build result
  • 17.
    TODO: Add backgroundwith Sun Travis 101 Build(job) log (up to 50M!): 1 Terabyte (Sorry for downloading it thrice, Travis CI!) Builds: 680,209 Project: 1,200 Jobs: 2+ Million Duration Languages: Java, Ruby Build result
  • 18.
    TODO: Add backgroundwith Sun Travis 101 Build(job) log (up to 50M!): 1 Terabyte (Sorry for downloading it thrice, Travis CI!) Builds: 680,209 Project: 1,200 Jobs: 2+ Million Duration Link to Git(Hub) Languages: Java, Ruby Build result
  • 19.
    TODO: Add backgroundwith Sun Facts About TravisTorrent
  • 20.
    TODO: Add backgroundwith Sun Facts About TravisTorrent - git_branch - git_all_built_commits - ...
  • 21.
    TODO: Add backgroundwith Sun Facts About TravisTorrent - git_branch - git_all_built_commits - ... - gh_project_name - gh_is_pr - gh_team_size - gh_diff_tests_added - gh_first_commit_...
  • 22.
    TODO: Add backgroundwith Sun Facts About TravisTorrent - git_branch - git_all_built_commits - ... - gh_project_name - gh_is_pr - gh_team_size - gh_diff_tests_added - gh_first_commit_... - tr_prev_build - tr_status - tr_log_tests_ran - tr_log_tests_failed
  • 23.
    TODO: Add backgroundwith Sun Facts About TravisTorrent - git_branch - git_all_built_commits - ... - gh_project_name - gh_is_pr - gh_team_size - gh_diff_tests_added - gh_first_commit_... - tr_prev_build - tr_status - tr_log_tests_ran - tr_log_tests_failed
  • 24.
    TODO: Add backgroundwith Sun Facts About TravisTorrent - git_branch - git_all_built_commits - ... - gh_project_name - gh_is_pr - gh_team_size - gh_diff_tests_added - gh_first_commit_... - tr_prev_build - tr_status - tr_log_tests_ran - tr_log_tests_failed new! new! 62columns
  • 25.
    TODO: Add backgroundwith Sun Community-Driven Development (CDD)
  • 26.
    TODO: Add backgroundwith Sun Community-Driven Development (CDD) ● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at are always 22:00 or 23:00 PM” → Bug in R’s anytime library (“anytime aims to be that general purpose converter no matter the input” - "You are reading too much into this. We heuristically try a bunch formats. None includes TZ because we can't. No more, no less.” - Well, parsedate can.) ● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and 'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?” → Inconsistency: tr_tests_run should be NA instead 0 ● Sebastian Baltes: First BigQuery Integration Script ● Questions that improved our documentation: Omar Elazhary, Matthias Bussonnier, Ward Muylaert, and many others in disqus!
  • 27.
    TODO: Add backgroundwith Sun Community-Driven Development (CDD) ● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at are always 22:00 or 23:00 PM” → Bug in R’s anytime library (“anytime aims to be that general purpose converter no matter the input” - "You are reading too much into this. We heuristically try a bunch formats. None includes TZ because we can't. No more, no less.” - Well, parsedate can.) ● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and 'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?” → Inconsistency: tr_tests_run should be NA instead 0 ● Sebastian Baltes: First BigQuery Integration Script ● Questions that improved our documentation: Omar Elazhary, Matthias Bussonnier, Ward Muylaert, and many others in disqus!
  • 28.
    TODO: Add backgroundwith Sun Community-Driven Development (CDD) ● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at are always 22:00 or 23:00 PM” → Bug in R’s anytime library (“anytime aims to be that general purpose converter no matter the input” - "You are reading too much into this. We heuristically try a bunch formats. None includes TZ because we can't. No more, no less.” - Well, parsedate can.) ● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and 'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?” → Inconsistency: tr_tests_run should be NA instead 0 ● Sebastian Baltes: First BigQuery Integration Script ● Questions that improved our documentation: Omar Elazhary, Matthias Bussonnier, Ward Muylaert, and many others in disqus!
  • 29.
    TODO: Add backgroundwith Sun Community-Driven Development (CDD) ● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at are always 22:00 or 23:00 PM” → Bug in R’s anytime library (“anytime aims to be that general purpose converter no matter the input” - "You are reading too much into this. We heuristically try a bunch formats. None includes TZ because we can't. No more, no less.” - Well, parsedate can.) ● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and 'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?” → Inconsistency: tr_tests_run should be NA instead 0 ● Sebastian Baltes: First BigQuery Integration Script ● Questions that improved our documentation: Omar Elazhary, Matthias Bussonnier, Ward Muylaert, and many others in disqus!
  • 30.
    TODO: Add backgroundwith Sun Community-Driven Development (CDD) ● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at are always 22:00 or 23:00 PM” → Bug in R’s anytime library (“anytime aims to be that general purpose converter no matter the input” - "You are reading too much into this. We heuristically try a bunch formats. None includes TZ because we can't. No more, no less.” - Well, parsedate can.) ● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and 'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?” → Inconsistency: tr_tests_run should be NA instead 0 ● Sebastian Baltes: First BigQuery Integration Script ● Questions that improved our documentation: Omar Elazhary, Matthias Bussonnier, Ward Muylaert, and many others in disqus!
  • 31.
    TODO: Add backgroundwith Sun Community-Driven Development (CDD) ● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at are always 22:00 or 23:00 PM” → Bug in R’s anytime library (“anytime aims to be that general purpose converter no matter the input” - "You are reading too much into this. We heuristically try a bunch formats. None includes TZ because we can't. No more, no less.” - Well, parsedate can.) ● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and 'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?” → Inconsistency: tr_tests_run should be NA instead 0 ● Sebastian Baltes: First BigQuery Integration Script ● Questions that improved our documentation: Omar Elazhary, Matthias Bussonnier, Ward Muylaert, and many others in disqus!
  • 32.
    TODO: Add backgroundwith Sun Community-Driven Development (CDD) ● C. M. Rosenberg: “gh_first_commit_created_at and gh_build_started_at are always 22:00 or 23:00 PM” → Bug in R’s anytime library (“anytime aims to be that general purpose converter no matter the input” - "You are reading too much into this. We heuristically try a bunch formats. None includes TZ because we can't. No more, no less.” - Well, parsedate can.) ● Kevin (Unknown): “Does 'tr_tests_ran' indicate whether tests were run during the build? If yes, why is it that in 39698 jobs, 'tr_tests_run' and 'tr_tests_skipped' have a value of 0 but 'tr_tests_ran' has a value of 1?” → Inconsistency: tr_tests_run should be NA instead 0 ● Sebastian Baltes: First BigQuery Integration Script ● Questions that improved our documentation: Omar Elazhary, Matthias Bussonnier, Ward Muylaert, and many others in disqus! Thank you!
  • 33.
    TODO: Add backgroundwith Sun MSR Challenge Organization
  • 34.
    TODO: Add backgroundwith Sun MSR Challenge Organization Thank you!
  • 35.
    TODO: Add backgroundwith Sun MSR Challenge Organization Submissions ● Record number of 31 submissions ● 1 Desk-rejected (7 pages!), 1 pulled by authors ● 29 submissions into review Reviews ● Each paper 3 reviews ● Double-blind worked mostly well ● Except if you have to double your PC (9 20)→ ● Had to re-assign one paper (my bad) Outcome ● 14 accepted, 15 rejected (acceptance rate: 48%) ● On purpose: Every paper with a valid contribution should be discussed Thank you!
  • 36.
    TODO: Add backgroundwith Sun MSR Challenge Organization Submissions ● Record number of 31 submissions ● 1 Desk-rejected (7 pages!), 1 pulled by authors ● 29 submissions into review Reviews ● Each paper 3 reviews ● Double-blind worked mostly well ● Except if you have to double your PC (9 20)→ ● Had to re-assign one paper (my bad) Outcome ● 14 accepted, 15 rejected (acceptance rate: 48%) ● On purpose: Every paper with a valid contribution should be discussed Thank you!
  • 37.
    TODO: Add backgroundwith Sun MSR Challenge Organization Submissions ● Record number of 31 submissions ● 1 Desk-rejected (7 pages!), 1 pulled by authors ● 29 submissions into review Reviews ● Each paper 3 reviews ● Double-blind worked mostly well ● Except if you have to double your PC (9 20)→ ● Had to re-assign one paper (my bad) Outcome ● 14 accepted, 15 rejected (acceptance rate: 48%) ● On purpose: Every paper with a valid contribution should be discussed Thank you!
  • 38.
    TODO: Add backgroundwith Sun MSR Challenge Organization Submissions ● Record number of 31 submissions ● 1 Desk-rejected (7 pages!), 1 pulled by authors ● 29 submissions into review Reviews ● Each paper 3 reviews ● Double-blind worked mostly well ● Except if you have to double your PC (9 20)→ ● Had to re-assign one paper (my bad) Outcome ● 14 accepted, 15 rejected (acceptance rate: 48%) ● On purpose: Every paper with a valid contribution should be discussed Thank you!
  • 39.
    TODO: Add backgroundwith Sun MSR Challenge Organization Submissions ● Record number of 31 submissions ● 1 Desk-rejected (7 pages!), 1 pulled by authors ● 29 submissions into review Reviews ● Each paper 3 reviews ● Double-blind worked mostly well ● Except if you have to double your PC (9 20)→ ● Had to re-assign one paper (my bad) Outcome ● 14 accepted, 15 rejected (acceptance rate: 48%) ● On purpose: Every paper with a valid contribution should be discussed Thank you!
  • 40.
    TODO: Add backgroundwith Sun MSR Challenge Organization Submissions ● Record number of 31 submissions ● 1 Desk-rejected (7 pages!), 1 pulled by authors ● 29 submissions into review Reviews ● Each paper 3 reviews ● Double-blind worked mostly well ● Except if you have to double your PC (9 20)→ ● Had to re-assign one paper (my bad) Outcome ● 14 accepted, 15 rejected (acceptance rate: 48%) ● On purpose: Every paper with a valid contribution should be discussed Thank you!
  • 41.
    TODO: Add backgroundwith Sun MSR Challenge Organization Submissions ● Record number of 31 submissions ● 1 Desk-rejected (7 pages!), 1 pulled by authors ● 29 submissions into review Reviews ● Each paper 3 reviews ● Double-blind worked mostly well ● Except if you have to double your PC (9 20)→ ● Had to re-assign one paper (my bad) Outcome ● 14 accepted, 15 rejected (acceptance rate: 48%) ● On purpose: Every paper with a valid contribution should be discussed Thank you!
  • 42.
    TODO: Add backgroundwith Sun MSR Challenge Organization Submissions ● Record number of 31 submissions ● 1 Desk-rejected (7 pages!), 1 pulled by authors ● 29 submissions into review Reviews ● Each paper 3 reviews ● Double-blind worked mostly well ● Except if you have to double your PC (9 20)→ ● Had to re-assign one paper (my bad) Outcome ● 14 accepted, 15 rejected (acceptance rate: 48%) ● On purpose: Every paper with a valid contribution should be discussed Thank you!
  • 43.
    TODO: Add backgroundwith Sun MSR Challenge Organization Submissions ● Record number of 31 submissions ● 1 Desk-rejected (7 pages!), 1 pulled by authors ● 29 submissions into review Reviews ● Each paper 3 reviews ● Double-blind worked mostly well ● Except if you have to double your PC (9 20)→ ● Had to re-assign one paper (my bad) Outcome ● 14 accepted, 15 rejected (acceptance rate: 48%) ● On purpose: Every paper with a valid contribution should be discussed Thank you!
  • 44.
    TODO: Add backgroundwith Sun MSR Challenge Organization Submissions ● Record number of 31 submissions ● 1 Desk-rejected (7 pages!), 1 pulled by authors ● 29 submissions into review Reviews ● Each paper 3 reviews ● Double-blind worked mostly well ● Except if you have to double your PC (9 20)→ ● Had to re-assign one paper (my bad) Outcome ● 14 accepted, 15 rejected (acceptance rate: 48%) ● On purpose: Every paper with a valid contribution should be discussed Thank you!
  • 45.
    TODO: Add backgroundwith Sun MSR Challenge Organization Submissions ● Record number of 31 submissions ● 1 Desk-rejected (7 pages!), 1 pulled by authors ● 29 submissions into review Reviews ● Each paper 3 reviews ● Double-blind worked mostly well ● Except if you have to double your PC (9 20)→ ● Had to re-assign one paper (my bad) Outcome ● 14 accepted, 15 rejected (acceptance rate: 48%) ● On purpose: Every paper with a valid contribution should be discussed Thank you!
  • 46.
    TODO: Add backgroundwith Sun MSR Challenge Organization Submissions ● Record number of 31 submissions ● 1 Desk-rejected (7 pages!), 1 pulled by authors ● 29 submissions into review Reviews ● Each paper 3 reviews ● Double-blind worked mostly well ● Except if you have to double your PC (9 20)→ ● Had to re-assign one paper (my bad) Outcome ● 14 accepted, 15 rejected (acceptance rate: 48%) ● On purpose: Every paper with a valid contribution should be discussed Thank you!
  • 47.
    TODO: Add backgroundwith Sun Lessons Learned (Future Chairs)
  • 48.
    TODO: Add backgroundwith Sun Lessons Learned (Future Chairs) ● Oh, these !”§$%&/()= updates! ● Serious (time) commitment on updating, bug fixing, support ● Provide one data format (should have been CSV) ● Make your data perfect the first time ;) ● If you can, host someone else’s data ● Disqus panel proved helpful: 98 comments in total ● Make sure your PC is large enough
  • 49.
    TODO: Add backgroundwith Sun Lessons Learned (Future Chairs) ● Oh, these !”§$%&/()= updates! ● Serious (time) commitment on updating, bug fixing, support ● Provide one data format (should have been CSV) ● Make your data perfect the first time ;) ● If you can, host someone else’s data ● Disqus panel proved helpful: 98 comments in total ● Make sure your PC is large enough
  • 50.
    TODO: Add backgroundwith Sun Lessons Learned (Future Chairs) ● Oh, these !”§$%&/()= updates! ● Serious (time) commitment on updating, bug fixing, support ● Provide one data format (should have been CSV) ● Make your data perfect the first time ;) ● If you can, host someone else’s data ● Disqus panel proved helpful: 98 comments in total ● Make sure your PC is large enough
  • 51.
    TODO: Add backgroundwith Sun Lessons Learned (Future Chairs) ● Oh, these !”§$%&/()= updates! ● Serious (time) commitment on updating, bug fixing, support ● Provide one data format (should have been CSV) ● Make your data perfect the first time ;) ● If you can, host someone else’s data ● Disqus panel proved helpful: 98 comments in total ● Make sure your PC is large enough
  • 52.
    TODO: Add backgroundwith Sun Lessons Learned (Future Chairs) ● Oh, these !”§$%&/()= updates! ● Serious (time) commitment on updating, bug fixing, support ● Provide one data format (should have been CSV) ● Make your data perfect the first time ;) ● If you can, host someone else’s data ● Disqus panel proved helpful: 98 comments in total ● Make sure your PC is large enough
  • 53.
    TODO: Add backgroundwith Sun Lessons Learned (Future Chairs) ● Oh, these !”§$%&/()= updates! ● Serious (time) commitment on updating, bug fixing, support ● Provide one data format (should have been CSV) ● Make your data perfect the first time ;) ● If you can, host someone else’s data ● Disqus panel proved helpful: 98 comments in total ● Make sure your PC is large enough
  • 54.
    TODO: Add backgroundwith Sun Lessons Learned (Future Chairs) ● Oh, these !”§$%&/()= updates! ● Serious (time) commitment on updating, bug fixing, support ● Provide one data format (should have been CSV) ● Make your data perfect the first time ;) ● If you can, host someone else’s data ● Disqus panel proved helpful: 98 comments in total ● Make sure your PC is large enough
  • 55.
    TODO: Add backgroundwith Sun Paper Arrivals “click” => Early detection of #Submissions
  • 56.
    TODO: Add backgroundwith Sun Topics of Submissions keyword #accepts #rejects freq build result 5 4 31% machine learning 3 4 24% prediction 2 2 14% factor analysis 3 1 14% time 3 1 14% sentiment 1 2 10% external vs. internal contribs 2 1 10% project success 2 7% testing 1 1 7% attraction 1 3% team size 1 3% code review 1 3% meta-paper 1 3%
  • 57.
    TODO: Add backgroundwith Sun Topics of Submissions keyword #accepts #rejects freq build result 5 4 31% machine learning 3 4 24% prediction 2 2 14% factor analysis 3 1 14% time 3 1 14% sentiment 1 2 10% external vs. internal contribs 2 1 10% project success 2 7% testing 1 1 7% attraction 1 3% team size 1 3% code review 1 3% meta-paper 1 3%
  • 58.
    TODO: Add backgroundwith Sun Topics of Submissions keyword #accepts #rejects freq build result 5 4 31% machine learning 3 4 24% prediction 2 2 14% factor analysis 3 1 14% time 3 1 14% sentiment 1 2 10% external vs. internal contribs 2 1 10% project success 2 7% testing 1 1 7% attraction 1 3% team size 1 3% code review 1 3% meta-paper 1 3%
  • 59.
    TODO: Add backgroundwith Sun Topics of Submissions keyword #accepts #rejects freq build result 5 4 31% machine learning 3 4 24% prediction 2 2 14% factor analysis 3 1 14% time 3 1 14% sentiment 1 2 10% external vs. internal contribs 2 1 10% project success 2 7% testing 1 1 7% attraction 1 3% team size 1 3% code review 1 3% meta-paper 1 3%
  • 60.
    TODO: Add backgroundwith Sun Topics of Submissions keyword #accepts #rejects freq build result 5 4 31% machine learning 3 4 24% prediction 2 2 14% factor analysis 3 1 14% time 3 1 14% sentiment 1 2 10% external vs. internal contribs 2 1 10% project success 2 7% testing 1 1 7% attraction 1 3% team size 1 3% code review 1 3% meta-paper 1 3%
  • 61.
    TODO: Add backgroundwith Sun Awards
  • 62.
    TODO: Add backgroundwith Sun Awards
  • 63.
    TODO: Add backgroundwith Sun Awards Best Paper ● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI) ● Prize: 200$ voucher from Travis CI (& Fame) ● Winner ...
  • 64.
    TODO: Add backgroundwith Sun Awards Best Paper ● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI) ● Prize: 200$ voucher from Travis CI (& Fame) ● Winner ...
  • 65.
    TODO: Add backgroundwith Sun Awards Best Paper ● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI) ● Prize: 200$ voucher from Travis CI (& Fame) ● Winner ...
  • 66.
    TODO: Add backgroundwith Sun Awards Best Paper ● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI) ● Prize: 200$ voucher from Travis CI (& Fame) ● Winner ...
  • 67.
    TODO: Add backgroundwith Sun Awards Best Paper ● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI) ● Prize: 200$ voucher from Travis CI (& Fame) ● Winner ... Best Presentation ● Elected by you, plurality vote! ● Prize: Surprise (& Fame)
  • 68.
    TODO: Add backgroundwith Sun Awards Best Paper ● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI) ● Prize: 200$ voucher from Travis CI (& Fame) ● Winner ... Best Presentation ● Elected by you, plurality vote! ● Prize: Surprise (& Fame)
  • 69.
    TODO: Add backgroundwith Sun Awards Best Paper ● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI) ● Prize: 200$ voucher from Travis CI (& Fame) ● Winner ... Best Presentation ● Elected by you, plurality vote! ● Prize: Surprise (& Fame)
  • 70.
    TODO: Add backgroundwith Sun Awards Best Paper ● Suggested by reviewers, chosen by chairs with help from Anna Nagy (Travis CI) ● Prize: 200$ voucher from Travis CI (& Fame) ● Winner ... Best Presentation ● Elected by you, plurality vote! ● Prize: Surprise (& Fame)
  • 71.
    TODO: Add backgroundwith Sun TravisTorrent: In the future ...
  • 72.
    TODO: Add backgroundwith Sun TravisTorrent: In the future ... ● Total infrastructure redesign ● Real time statistics: BLAaaS ● Opt-in model for new projects ● Self-updating and synced to Google BigQuery ● Official Google show case data set (hopefully) ● Addition of new languages ● More intelligent log-analysis (for example, with neural networks instead of RegExen)