Running a production Jenkins instance
Harpreet Singh,
Senior Director, Product Management

Kohsuke Kawaguchi
Jenkins founder


                 ©2012 CloudBees, Inc. All Rights Reserved
Agenda
• Failures – a fact of life
  – Getting ready for failures
  – Preventing failures
  – Debugging failures
• Run an efficient Jenkins installation




Day: A period of 24 hours, mostly
misspent…
CloudBees – Who are we?
•   Jenkins founder on-board
•   Key Jenkins contributors on-board
•   Built Jenkins as a Service
•   Run the biggest Jenkins installation
    anywhere (2k+ masters)




CloudBees’ Mission - Eliminate
Downtime
• Eliminate time wasted due to
  – Jenkins issues
  – User issues
  – Lack of right tools…
• Improve efficiency for administrators and
  developers
• Rely on Jenkins…



Good Management of Jenkins
•   Organize jobs better
•   Secure your jobs
•   Replicate good practices
•   Respond quicker to requests
•   Ensure compliance
•   Bounce back from failures
•   Prevent failures
•   Everything should be as fast as possible…if
    not faster

Recovering from failures
High Availability, Backing up




Backing up Jenkins
Problem: Disk Failures
• JENKINS_HOME
   – Plugins, users, jobs…everything

Solution: Back it up
• Push HOME to a repo
   – HOME tends to be large
   – Commit only vital info
   – Run nightly
• Push to S3

Jenkins Enterprise Solution
• Backup plugin
• Backup-to-cloud
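
The "commit only vital info, run nightly" approach can be sketched as a small shell function intended to be called from cron or from a nightly Jenkins job. This is only a sketch under assumptions that are not from the talk: the function name `backup_jenkins_home` is hypothetical, JENKINS_HOME is assumed to already be a Git working tree, and "vital info" is taken to mean the top-level `*.xml` files plus each job's `config.xml`.

```shell
#!/bin/sh
# Sketch: commit only the small, vital files from JENKINS_HOME into Git.
# Assumptions (illustrative, not from the talk):
#   - $1 is a JENKINS_HOME directory that is already a Git working tree
#   - "vital info" = top-level *.xml and jobs/*/config.xml
backup_jenkins_home() (
    # The function body runs in a subshell so the caller's directory is unchanged.
    cd "$1" || exit 1

    # Stage global and per-job configuration only; build records and
    # workspaces are skipped because they make HOME large.
    git add -- *.xml 2>/dev/null || true
    git add -- jobs/*/config.xml 2>/dev/null || true

    # Commit only when the staged content actually changed.
    if git diff --cached --quiet; then
        echo "nothing to back up"
    else
        git -c user.name="jenkins-backup" \
            -c user.email="jenkins-backup@example.invalid" \
            commit -q -m "Nightly backup $(date +%Y-%m-%d)" &&
            echo "backup committed"
        # A real setup would follow this with "git push" to a remote repo.
    fi
)
```

A nightly cron entry or Jenkins job would call `backup_jenkins_home "$JENKINS_HOME"`; build records are left out on purpose, which is the "commit only vital info" trade-off from the slide.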
JE Backup Plugin
• Backup as a Jenkins job
• What to backup
  – Job configuration
  – Build records
  – System Configuration
     • Plugin binaries, plugin configs etc
     • Everything except job
• Where to backup
  – Local Directory
  – SFTP server
  – WebDAV
• Retention Policy
  – All
  – Last N
  – Exponential decay
Demo




Making Jenkins Highly Available
Problem: Jenkins failures
•   Machine/Jenkins failure has high cost to productivity

Solution: Notified by unhappy customers ;-)
•   Issues:
     –   Receive emails from unhappy customers, then log in and fix it
•   You do have JENKINS_HOME backed up elsewhere – don’t you?

Jenkins Enterprise Solution
•   Highly Available
     –   Set up multiple Jenkins masters
     –   Uses JGroups to elect a primary master
     –   Promotes a backup master as primary
Bounce Back Faster: High Availability
[Diagram, two panels: in each, a reverse proxy sits in front of a Jenkins cluster (labeled MT) backed by a shared JENKINS_HOME; the left panel shows two Jenkins masters, the right panel shows one Jenkins master with JENKINS_HOME on NFS]
Demo




Miscellaneous
• Jenkins is not just JENKINS_HOME…think about the slaves
   – Offload builds onto slaves
   – Other executables on the system: git, ruby, java etc as well
   – Preferably use Chef/Puppet to replicate installations
• What about geo redundancy?
   – Technically you can use HA, but network latency comes into play
   – Ideally, use HA in a localized data center and a manual failover
     to a different geo
• What HA is not?
   – Does not load balance between instances




Preventing failures
Git Validated Merges plugin




How can you delegate more to Jenkins?
• Does your CI server shift work from
  laptops to servers?
  – You need to commit to have Jenkins test it
  – But if your commit is bad, it blocks others
  – You end up testing locally before committing
  – FAIL




Motivation
• We want to make changes safely
  – Your mistake shouldn’t block others
  – Only push after changes are validated
• We want to run tests asynchronously
  – Your brain has more important things to do
  – Make change and move on
  – Even with TDD!
• We want to run tests on the server
  – Your laptop has more important things to do

Solution: Jenkins should be a Git server
•   I push to Jenkins
•   Jenkins merges it with upstream
•   Jenkins tests it
•   If good, Jenkins pushes it upstream


[Diagram: gate repo and upstream repo]
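
The four steps above (push, merge with upstream, test, push upstream only if good) can be sketched with plain Git commands. This is not how the Validated Merges plugin is implemented; it is a hedged illustration of the gate idea. The function name `gate_merge`, its arguments, and the use of a throwaway clone are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch of a "gate": merge a submission with upstream, test the result,
# and push upstream only when the tests pass.
# gate_merge UPSTREAM BRANCH SUBMISSION TEST_CMD   (all names hypothetical)
#   UPSTREAM   - URL or path of the upstream repository
#   BRANCH     - upstream branch to merge into
#   SUBMISSION - URL or path of the repository whose HEAD is the submitted change
#   TEST_CMD   - command run in the merged tree; exit status 0 means "good"
gate_merge() (
    upstream="$1"; branch="$2"; submission="$3"; test_cmd="$4"
    work="$(mktemp -d)" || exit 1
    trap 'rm -rf "$work"' EXIT

    # Start from the current tip of upstream.
    git clone -q -b "$branch" "$upstream" "$work/merge" || exit 1
    cd "$work/merge" || exit 1

    # Merge the submitted change; a conflict counts as a rejection.
    git fetch -q "$submission" HEAD || exit 1
    git -c user.name=gate -c user.email=gate@example.invalid \
        merge -q --no-edit FETCH_HEAD >/dev/null 2>&1 ||
        { echo "rejected: merge conflict"; exit 1; }

    # Test the merged result, not the submission in isolation.
    if sh -c "$test_cmd" >/dev/null 2>&1; then
        git push -q origin "HEAD:refs/heads/$branch" || exit 1
        echo "accepted"
    else
        echo "rejected: tests failed"
        exit 1
    fi
)
```

Because the push happens only after the merged tree passes, a bad change never reaches upstream and cannot block other committers. If upstream moves while the tests run, the final push is refused as a non-fast-forward and the submission would simply be retried.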
Another way to look at it

[Diagram: commit graph labeled "Tip of master in upstream" (twice) and "My changes"]
Implementation
• Transport
  – HTTP
  – SSH
• JGit embedded in Jenkins for git server
  functionality
  – A bit of magic like Gerrit to make it seamless
• Additional tags to let you pull submitted
  changes


Demo




Running an efficient production
system




Test Instance
• Run a mini 2nd instance
  – Test new core version before putting it to
    prod
  – Test new versions of plugins
  – Play with new plugins
• Copy over some jobs from prod

• Bootstrap dry-run
  – -Djenkins.model.Jenkins.killAfterLoad=true
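
A bootstrap dry-run against a copy of production data might look like the sketch below. The system property is the one named on the slide; everything else is an assumption for illustration: the function name `bootstrap_dry_run`, the overridable `JAVA` variable, and the paths in the usage note.

```shell
#!/bin/sh
# Sketch: start Jenkins against a COPY of JENKINS_HOME and let it exit
# once loading finishes, so configuration and plugin load problems show
# up without touching production.
# bootstrap_dry_run WAR HOME_COPY   (function name is hypothetical)
bootstrap_dry_run() {
    war="$1"; home_copy="$2"
    # JENKINS_HOME points at the copy, never at the production directory.
    # JAVA can be overridden to select a specific JVM (defaults to "java").
    JENKINS_HOME="$home_copy" "${JAVA:-java}" \
        -Djenkins.model.Jenkins.killAfterLoad=true \
        -jar "$war"
}
```

For example, `bootstrap_dry_run /path/to/new/jenkins.war /path/to/home-copy` (both paths hypothetical) would be run on the test instance before a core upgrade, and the startup log checked for load errors.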
Configuring Jenkins for efficiency
• Fast archiver plugin
  – Conserve network bandwidth
• No build on master
  – Also good for security




Managing and Pruning Plugins
Problem: Discovering what plugins are used in an installation
•   No visibility if a particular plugin is used or how many jobs use it

Jenkins Enterprise Solution
•   Plugin Usage Plugin
     – Tabular view of plugin name, # of jobs and the job names using the plugin
Demo




Monitoring Jenkins




Why?




What?
• What the user sees
    – GUI (load time)
• JVM memory size
    – Beware of several independent pieces
•   System load
•   Free space on $JENKINS_HOME
•   Slave availability
•   Queue length
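
As one example from the list above, free space on $JENKINS_HOME can be watched with a few lines of shell. This is a sketch only; the function names `used_percent` and `check_free_space` and the threshold argument are hypothetical, and it assumes the filesystem name reported by `df` contains no spaces.

```shell
#!/bin/sh
# used_percent DIR
#   Prints how full the filesystem holding DIR is, as an integer percentage.
used_percent() {
    # -P forces the portable format: one line per filesystem, capacity in column 5.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_free_space DIR MAX_USED_PERCENT
#   Prints a status line; returns 1 when usage reaches the threshold.
check_free_space() {
    used="$(used_percent "$1")"
    if [ "$used" -ge "$2" ]; then
        echo "WARN: $1 is ${used}% full"
        return 1
    fi
    echo "OK: $1 is ${used}% full"
}
```

Run periodically as `check_free_space "$JENKINS_HOME" 90`, this gives a warning before builds start failing on a full disk.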

Groovy Console
$ cat queue.groovy
j = Jenkins.instance
println j.queue.items.length

$ curl -u "user:apiToken" \
  --data-urlencode script@queue.groovy \
  http://jenkins/scriptText
13
Remote API
$ curl "http://jenkins/computer/api/json?pretty=true"
{
  "busyExecutors": 0,
  "totalExecutors": 2,
  ...
}
Jenkins Monitoring plugin
• JavaMelody in Jenkins




Nagios (or others like it)
• Server app for monitoring
  stuff
  – Extensible, allowing all sorts
    of things to be monitored
• Used in jenkins-ci.org/DEV@cloud
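
A Nagios check is just a program that prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The sketch below applies that convention to the build queue length; the function name `check_queue_length` and the thresholds are hypothetical, and how the number is obtained is left to the caller.

```shell
#!/bin/sh
# check_queue_length LENGTH WARN CRIT
#   Maps a queue length onto Nagios plugin status lines and exit codes.
check_queue_length() {
    len="$1"; warn="$2"; crit="$3"

    # Anything that is not a plain non-negative integer is UNKNOWN.
    case "$len" in
        ''|*[!0-9]*)
            echo "UNKNOWN - queue length is not a number: '$len'"
            return 3 ;;
    esac

    if [ "$len" -ge "$crit" ]; then
        echo "CRITICAL - build queue length $len"
        return 2
    elif [ "$len" -ge "$warn" ]; then
        echo "WARNING - build queue length $len"
        return 1
    fi
    echo "OK - build queue length $len"
    return 0
}
```

The length itself could come from the scriptText call on the Groovy Console slide, for example `check_queue_length "$(curl -s -u "user:apiToken" --data-urlencode script@queue.groovy http://jenkins/scriptText)" 10 50`. If the request fails, the output is empty and the check reports UNKNOWN rather than a false OK.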




Thread dump
• Tells us where Jenkins is stuck
• When?
  – Hang or slowness
• Look for threads that are stuck
  – HTTP request threads
  – Executor threads




How to get a thread dump
• http://jenkins/threadDump
• kill -3 <PID>




Heap dump
• Tells us what’s eating memory
• When?
  – OutOfMemoryError
  – Monitoring shows abnormal growth
• Look for objects that are big
  – Sessions
  – Classes from plugins



How to get a memory dump
• curl -L http://jenkins/heapDump > dump.hprof
• jmap -dump:format=b,file=dump.hprof <PID>
• -XX:+HeapDumpOnOutOfMemoryError (JVM option)




Wrapping up

Thank You!

More Info: http://www.cloudbees.com/jenkins-enterprise-by-cloudbees-overview.cb
Free Trial: http://www.cloudbees.com/jenkins-enterprise-by-cloudbees-download.cb
Wiki Page: https://wiki.cloudbees.com/bin/view/Jenkins+Enterprise/WebHome
User Guide: http://jenkins-enterprise.cloudbees.com/docs/user-guide-bundle/index.html#


Editor's Notes

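  • #18 A moment ago I talked at length about how important it is to make full use of the server. <<slide>> That does not mean CI is worthless, but it has potential that could be exploited much further and is going unused.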