Building Cloud Tools for Netflix with Jenkins


Brian Moyles and Gareth Bowles from Netflix describe the continuous integration system that lets us build and deploy the Netflix streaming service fast and at scale.

Published in: Technology

  1. 1. Building and Deploying Netflix in the Cloud @bmoyles @garethbowles #netflixcloud
  2. 2. Who Are These Guys? Brian Moyles, Gareth Bowles
  3. 3. What We Build
      - Large number of loosely-coupled Java Web Services
      - Common code in libraries that can be shared across apps
      - Each service is “baked” - installed onto a base Amazon Machine Image and then created as a new AMI ...
      - ... and then deployed into a Service Cluster (a set of Auto Scaling Groups running a particular service)
  4. 4. Getting Built
  5. 5. Build Pipeline (diagram). Labels: source (Perforce, GitHub); Jenkins; CBF steps (resolve, compile, publish, report, sync, check, build, test); Artifactory; yum; libraries
  6. 6. build.xml

          <project name="helloworld">
            <import file="../../../Tools/build/webapplication.xml"/>
          </project>

      ivy.xml

          <info organisation="netflix" module="helloworld">
            <publications>
              <artifact name="helloworld" type="package" e:classifier="package" ext="tgz"/>
              <artifact name="helloworld" type="javadoc" e:classifier="javadoc" ext="jar"/>
            </publications>
            <dependencies>
              <dependency org="netflix" name="resourceregistry" rev="latest.${input.status}" conf="compile"/>
              <dependency org="netflix" name="platform" rev="latest.${input.status}" conf="compile" />
              ...
  7. 7. Jenkins at Netflix
  8. 8. Jenkins Statistics
      - 1600 job definitions, 50% SCM triggered
      - 2000 builds per day
      - Common Build Framework updates trigger 800 rebuilds; by scaling up to 20 cloud slaves we can complete the flood of new builds in 30 minutes
      - 2 TB of build data
  9. 9. Jenkins Architecture (diagram). Labels: Single Master (Red Hat Linux, 2x quad core x86_64, 26G RAM); Standard slave group (x86_64 slaves, Amazon Linux, m1.xlarge); Custom slave group (x86_64 slaves, Amazon Linux); ~40 custom slaves maintained by product teams (misc. architecture, various); Ad-hoc slaves (misc. O/S & architectures); locations: us-west-1 VPC, Netflix data center, Netflix data center and office
  10. 10. Other Uses of Jenkins
      - Monitoring of our test and production Cassandra clusters
      - Automated integration tests, including bake and deploy
      - Production bake and deployment
      - Housekeeping of the build / deploy infrastructure:
        - Reap unreferenced artifacts in Artifactory
        - Disable Jenkins jobs with no recent successful builds
        - Mark Jenkins builds as permanent if they are used by an active deployment in prod or test
        - Alert owners when slaves get disconnected
  11. 11. Jenkins Scaling Challenges
      - Flood of simultaneous builds can quickly exhaust all build executors and clog the pipeline
      - Flood of simultaneous builds can hammer rest of the infrastructure (especially Artifactory)
      - Making global changes to all jobs
      - Some plugins don’t scale to our number of jobs / builds
      - Hard to test every job before upgrading master or plugins
      - Large amount of state encapsulated in build data makes restoring from backup time consuming
  12. 12. Netflix Extensions to Jenkins
      - Job DSL plugin: allows jobs to be set up with minimal definition, using templates and a Groovy-based DSL
      - Housekeeping and maintenance processes implemented as Jenkins jobs and system Groovy scripts
  13. 13. The DynaSlave Plugin: Our cloud-based army of build nodes
  14. 14. The DynaSlave Plugin: Genesis
      - Original build fleet: 15 VMs on datacenter hardware, 8G RAM, single vCPU, 2 executors per node
      - Many jobs build on SCM change. Changes to our common build framework create massive thundering herd since everything depends on it.
      - Ask for more VMs? Modify CBF less frequently?
  15. 15. The DynaSlave Plugin: What We Wanted
      - Leverage our extensive AWS infrastructure, tooling, and experience
      - No manual fiddling with machines once they launch
      - Quick and easy to maintain a fixed pool of slave nodes that can grow/shrink to meet build demand
  16. 16. The DynaSlave Plugin: What We Have
      - Exposes a new endpoint in Jenkins that EC2 instances in VPC use for registration
      - Allows a slave to name itself, label itself, tell Jenkins how many executors it can support
      - EC2 == Ephemeral. Disconnected nodes that are gone for > 30 mins are reaped
      - Sizing handled by EC2 ASGs, tweaks passed through via user data (labels, names, etc)
  17. 17. The DynaSlave Plugin: What’s Next
      - Dynamic resource management: have Jenkins respond to build demand and manage its own slave pools
      - Slave groups: Allows us to create specialized (and isolated from the genpop) pools of build nodes
      - Refresh mechanism for slave tools (JDKs, Ant versions, etc)
      - Enhanced security/registration of nodes
      - Give it back to the community (!)
  18. 18. Further Reading @netflixoss
  19. 19. Thank you @bmoyles @garethbowles
  20. 20. Thank you. Questions? @bmoyles @garethbowles
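
Slide 8 gives enough numbers for a rough throughput estimate of a Common Build Framework flood: 800 rebuilds, 20 cloud slaves, 30 minutes. The Python sketch below only does that arithmetic. It assumes, for illustration, that the 20 cloud slaves absorb the whole flood (the slides do not say how much the existing slaves contribute), and `flood_throughput` is a hypothetical helper name.

```python
def flood_throughput(rebuilds: int, slaves: int, minutes: float, builds_per_day: int) -> dict:
    """Rough throughput figures for a rebuild flood.

    Assumption (not stated in the slides): the given slaves handle
    the whole flood, so per-slave figures are upper bounds.
    """
    builds_per_slave = rebuilds / slaves
    return {
        # Fleet-wide completion rate during the flood.
        "builds_per_minute": rebuilds / minutes,
        # Builds each slave must finish within the window.
        "builds_per_slave": builds_per_slave,
        # Slave wall-clock time available per build, in minutes.
        "minutes_per_build_per_slave": minutes / builds_per_slave,
        # Size of one flood relative to a normal day's build count.
        "share_of_daily_builds": rebuilds / builds_per_day,
    }


# Figures taken from slide 8.
stats = flood_throughput(rebuilds=800, slaves=20, minutes=30, builds_per_day=2000)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under that assumption each slave has to finish 40 builds in the window, which is only plausible if a slave runs several executors in parallel or the individual builds are short.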
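
One of the housekeeping tasks on slide 10 is disabling Jenkins jobs with no recent successful builds. The slides do not say how "recent" is defined or how the job is implemented (slide 12 mentions Jenkins jobs and system Groovy scripts), so the following is only a Python sketch of the selection logic. The function name `jobs_to_disable`, the input mapping, and the 90-day threshold in the usage lines are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def jobs_to_disable(
    last_success: Dict[str, Optional[datetime]],
    now: datetime,
    max_age: timedelta,
) -> List[str]:
    """Return names of jobs whose last successful build is too old.

    last_success maps a job name to the time of its last successful
    build, or None if the job has never succeeded (hypothetical input;
    a real script would read this from Jenkins).
    """
    stale = []
    for job, finished_at in last_success.items():
        if finished_at is None or now - finished_at > max_age:
            stale.append(job)
    return sorted(stale)


# Usage with made-up job names and an assumed 90-day threshold.
now = datetime(2012, 10, 1)
history = {
    "helloworld-build": datetime(2012, 9, 28),
    "old-prototype": datetime(2012, 1, 15),
    "never-green": None,
}
print(jobs_to_disable(history, now, timedelta(days=90)))
```

Keeping the selection as a pure function makes it easy to dry-run against a snapshot of job history before anything is actually disabled.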
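
Slide 16 describes slaves that register themselves with a name, labels, and an executor count, and says that disconnected nodes gone for more than 30 minutes are reaped. A minimal Python sketch of that bookkeeping follows. It is not the DynaSlave plugin's code: `SlaveRegistry`, `SlaveRecord`, and their methods are hypothetical names, and timestamps are plain seconds so the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# From slide 16: disconnected nodes gone for > 30 mins are reaped.
REAP_AFTER_SECONDS = 30 * 60


@dataclass
class SlaveRecord:
    name: str
    labels: List[str]
    executors: int
    disconnected_at: Optional[float] = None  # None while connected


@dataclass
class SlaveRegistry:
    """Hypothetical in-memory stand-in for the master's node list."""
    slaves: Dict[str, SlaveRecord] = field(default_factory=dict)

    def register(self, name: str, labels: List[str], executors: int) -> SlaveRecord:
        # The slave supplies its own name, labels, and executor count.
        record = SlaveRecord(name, list(labels), executors)
        self.slaves[name] = record
        return record

    def mark_disconnected(self, name: str, now: float) -> None:
        self.slaves[name].disconnected_at = now

    def reap(self, now: float) -> List[str]:
        # Remove nodes that have been disconnected for longer than the limit.
        gone = [
            name for name, rec in self.slaves.items()
            if rec.disconnected_at is not None
            and now - rec.disconnected_at > REAP_AFTER_SECONDS
        ]
        for name in gone:
            del self.slaves[name]
        return sorted(gone)


# Usage with made-up node names.
registry = SlaveRegistry()
registry.register("dynaslave-a", ["x86_64", "cloud"], executors=4)
registry.register("dynaslave-b", ["x86_64", "cloud"], executors=4)
registry.mark_disconnected("dynaslave-a", now=0.0)
print(registry.reap(now=31 * 60))
```

Because the Auto Scaling Groups own the fleet size, the registry only needs to forget nodes that disappear; replacements register themselves when they launch.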