Next Step in Automation: 
Elastic Build Environment 
Kohsuke Kawaguchi / CloudBees, Inc. 
kk@kohsuke.org / @kohsukekawa 
Jesse Glick / CloudBees, Inc. 
jglick@cloudbees.com / @tyvole 
©2013 CloudBees, Inc. All Rights Reserved 1
Have You Met Jenkins? http://jenkins-ci.org/ 
My Jenkins around 2006 
http://www.flickr.com/photos/gbyrnes/912576883/
If only we had more computers… 
• Just building & testing them all… 
• Running tests more frequently 
• Testing individual commits 
http://www.flickr.com/photos/drocpsu/8546730021/
Just enough computers 
just in time 
Elasticity
My Jenkins around 2007 
http://www.flickr.com/photos/drocpsu/8546730021/
Just enough computers 
of the right kind 
just in time 
Elasticity!
http://www.flickr.com/photos/82219206@N00/7003641975/
Correct answer 
• Test assumes a fixture running on port 8080 
– Doesn’t check if it’s already being used 
• If another test runs at the same time…? 
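One common way around the hard-coded port is to let the operating system pick a free one. The sketch below is illustrative only: the class name `FreePort` is hypothetical, and it assumes the fixture can be told which port to use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Sketch: ask the OS for an unused port instead of assuming 8080 is free. */
final class FreePort {
    /** Binds to port 0 so the kernel assigns an ephemeral port, then releases it. */
    static int find() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The test would hand this port to its fixture rather than hard-coding one.
        System.out.println("fixture port: " + find());
    }
}
```

There is still a small window between releasing the port and the fixture binding it, so where possible it is safer to have the fixture itself bind to port 0 and report the port it got.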
http://www.flickr.com/photos/82219206@N00/7003641975/
Correct answer 
• Because of “pkill -f -9 tomcat” cleanup 
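Killing by name takes down every matching process on the machine, including other builds' Tomcat instances. A gentler alternative is to keep a handle on the process the test started and clean up only that one. The sketch below assumes Java 9 or later; the class name `OwnedProcess` is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

/** Sketch: a fixture process that is cleaned up by handle, not by name. */
final class OwnedProcess implements AutoCloseable {
    private final Process process;

    OwnedProcess(String... command) {
        try {
            process = new ProcessBuilder(command).inheritIO().start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isAlive() {
        return process.isAlive();
    }

    @Override
    public void close() {
        // Snapshot the children first; they are re-parented once the parent dies.
        List<ProcessHandle> children = process.descendants().collect(Collectors.toList());
        process.destroyForcibly();
        children.forEach(ProcessHandle::destroyForcibly);
        try {
            // Give the OS a moment to reap the process before returning.
            process.waitFor(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Used in a try-with-resources block, this stops the fixture when the test ends, even on failure. It only reaches processes still in the tree: a daemon that has already detached itself will escape, which is one reason isolation at the machine or container level is still needed.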
http://www.flickr.com/photos/jumilla/8667648797/
Isolation 
• At odds with large multi-core systems 
• x86 virtual machines 
• User isolation 
• Kernel containers 
http://www.flickr.com/photos/82219206@N00/7003641975/
Correct answer 
• Same Maven ID, two different jars 
• Different projects designate different ones 
• Local cache gets cleaned up periodically 
• Whichever project runs first after a cache cleanup “wins”
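Because the two jars share the same coordinates, the only reliable way to tell them apart is by content. The sketch below shows one way to fingerprint an artifact with SHA-256; the class and method names are hypothetical, and it is not meant to describe how Maven or Jenkins actually detects the conflict.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Sketch: identify an artifact by what it contains, not by its Maven ID. */
final class ArtifactFingerprint {
    /** Returns the lowercase hex SHA-256 digest of the given bytes. */
    static String sha256(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** Two jars with the same ID are the same artifact only if their digests match. */
    static boolean sameArtifact(byte[] cachedJar, byte[] expectedJar) {
        return sha256(cachedJar).equals(sha256(expectedJar));
    }
}
```

A build could compare the digest of the jar in the local cache against the one its own repository serves, and fail loudly instead of silently using whichever copy arrived first.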
http://www.flickr.com/photos/82219206@N00/7003641975/
Correct answer 
• Test script leaves a background daemon process behind 
• Over time, the leftover processes slowly choke the slaves 
Throw away & create new 
Elasticity!
Ladder to Cloud 
Single 
Multiple 
Elastic
Solid OSS Elasticity Plugins 
• EC2 plugin 
• JClouds plugin 
– OpenStack, CloudStack 
• Launch and tear down slaves on demand 
VMware auto-scaling plugin 
• Snapshot 
• Power on/off management 
• Hypervisor-aware scheduling 
• Folder-based pooling 
• VMware Tools integration 
• One-time use support 
Host that runs Docker 
Docker plugin
CloudBees DEV@cloud 
Mansion 
Slave Slave 
Slave Slave
Linux containers = zero-cost virtualization 
Maven 
Git 
Ant 
Mercurial 
Gradle 
Subversion 
Linux Kernel 
Hardware
For OS X 
Maven 
Git 
Xcode 
Git 
Xcode 
Subversion 
OS X OS X OS X 
QEMU QEMU QEMU 
Linux Kernel 
Apple Hardware
Kernel Same-page Merging 
OS X OS X OS X 
Mansion 
Slave 
Slave 
Slave 
Workspace 1 
Workspace 2 
Workspace 3 
Workspace 4
Slave 
Slave 
Workspace’ 
Workspace 
Workspace’’
Slave 
Slave 
~/.m2/repository 
~/.m2/repository 
~/.m2/repository
Parallel Testing 
Test Group #1 Test Group #2 Test Group #3
Parallel Testing 
foo #10 Test Group #1 
foo #11 Test Group #2 
foo #12 Test Group #3
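To make the groups finish at about the same time, the tests can be split by how long they took in a previous run. The sketch below uses a simple greedy scheme (slowest test first, always into the lightest group). It is one plausible approach under that assumption, not necessarily what the parallel test plugin does, and the class name `TestSplitter` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch: split tests into groups of roughly equal total duration. */
final class TestSplitter {
    /**
     * @param durationsMillis test name mapped to its duration in an earlier run
     * @param groups          number of parallel builds to spread the tests over
     */
    static List<List<String>> split(Map<String, Long> durationsMillis, int groups) {
        if (groups < 1) {
            throw new IllegalArgumentException("need at least one group");
        }
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < groups; i++) {
            buckets.add(new ArrayList<>());
        }
        long[] totals = new long[groups];

        // Slowest tests first; ties are broken by name so the result is deterministic.
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(durationsMillis.entrySet());
        sorted.sort((a, b) -> {
            int byDuration = Long.compare(b.getValue(), a.getValue());
            return byDuration != 0 ? byDuration : a.getKey().compareTo(b.getKey());
        });

        for (Map.Entry<String, Long> test : sorted) {
            // Put each test into the group with the smallest total so far.
            int lightest = 0;
            for (int i = 1; i < groups; i++) {
                if (totals[i] < totals[lightest]) {
                    lightest = i;
                }
            }
            buckets.get(lightest).add(test.getKey());
            totals[lightest] += test.getValue();
        }
        return buckets;
    }
}
```

Each resulting group would then run as its own build on its own slave, as in the foo #10 to foo #12 picture above.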
Validated Merge 
upstream 
repo 
gate 
repo
Workflow System 
• Alternative to “freestyle” projects 
• Scripted control flow 
• Resumable execution across restarts 
• All-in-one build/test/deploy pipelines 
• Under active development 
Workflow with Elastic Slaves 
• One-line provisioning from cloud 
• Language-level parallelism 
• Run commands, archive files, test results 
• Now integrates with parallel test plugin 
Conclusion: Elasticity Benefits 
• Just-in-time capacity 
• Diversity without overhead 
• Isolation 
• Productivity gain 
– parallel testing 
– validated merge 
– workflow 

JavaOne 2014: Next Step in Automation: Elastic Build Environment

Editor's Notes

  • #3 Java OSS
  • #4 More than 30% use no slaves at all, or just one
  • #5 50+ slaves. There’s a divide here.
  • #6 Growing “Cloud divide”
  • #10 Because if you are doing it right, just building and testing will require a dozen or so computers.
  • #11 As I got used to controlling a handful of computers, I started thinking about what more we could do. If you don’t think more computers are helpful, you are doing it wrong / The same can’t be said about people.
  • #12 Don’t build up capacity that is needed only a few days a year and sits idle the rest of the time.
  • #17 One of the reasons I needed so many computers is that I needed all the different environments / some combinations were very rare and old, and keeping them pristine was hard.
  • #18 Needing to have diversity in the environment adds to the capacity planning problem.
  • #19 But you don’t want to make everything too slow by over-subscribing. I’ve seen hypervisors used to run many virtual machines.
  • #21 Hey Kohsuke, my builds are failing. Can you take a look?
  • #23 Hey Kohsuke, my builds are failing. Can you take a look?
  • #25 So the lesson and the best practice = isolate builds and tests / treat them like untrusted code
  • #26 Various techniques have been deployed successfully today
  • #27 but as I found out the hard way, this isn’t enough to solve this problem
  • #28 Hey Kohsuke, my builds are failing. Can you take a look?
  • #30 Hey Kohsuke, my builds are failing. Can you take a look?
  • #32 Turns out isolation in the time dimension is just as important / somewhat like a human body: if you live long enough, things tend to break down / beyond a certain point it becomes unsalvageable, as Windows users know all too well!
  • #33 Turns out elasticity solves this problem, too, by allowing you to simply throw away old instances and create new ones in the same predictable state
  • #34 Episode from scalability summit / everyone explains their monitoring system
  • #36 Either this slide or more details Jenkins.
  • #39 If you are willing to invest in creating a great slave virtualization environment, you can.
  • #45 HS: if somebody misses the CoW concept, he’d be lost for the next two slides
  • #50 brand-new job type / based on feedback from many / scalability summit