1. Running a production Jenkins instance
Harpreet Singh,
Senior Director, Product Management
Kohsuke Kawaguchi
Jenkins founder
©2012 CloudBees, Inc. All Rights Reserved
2. Agenda
• Failures – a fact of life
– Getting ready for failures
– Preventing failures
– Debugging failures
• Run an efficient Jenkins installation
3. Day: A period of 24 hours, mostly misspent…
5. CloudBees – Who are we?
• Jenkins founder on-board
• Key Jenkins contributors on-board
• Built Jenkins as a Service
• Run the biggest Jenkins installation anywhere (2k+ masters)
6. CloudBees’ Mission - Eliminate Downtime
• Eliminate time wasted due to
– Jenkins issues
– User issues
– Lack of right tools…
• Improve efficiency for administrators and developers
• Rely on Jenkins…
7. Good Management of Jenkins
• Organize jobs better
• Secure your jobs
• Replicate good practices
• Respond quicker to requests
• Ensure compliance
• Bounce back from failures
• Prevent failures
• Everything should be as fast as possible… if not faster
9. Backing up Jenkins
Problem: Disk Failures
• JENKINS_HOME
– Plugins, users, jobs…everything
Solution: Back it up
• Push HOME to a repo
– HOME tends to be large
– Commit only vital info
– Run nightly
• Push to S3
Jenkins Enterprise Solution
• Backup plugin
• Backup-to-cloud
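The "commit only vital info" bullet can be sketched as a file filter. This is a minimal sketch, assuming "vital" means XML configuration files and that bulky directories such as `builds` and `workspace` are skipped; the name `vital_files` and the skip list are hypothetical choices for illustration, not part of any Jenkins tool.

```python
import os

# Assumption: directories that hold bulky or regenerable data and are skipped.
SKIP_DIRS = {"builds", "workspace", "war", "cache"}

def vital_files(jenkins_home):
    """Return sorted relative paths of *.xml files worth committing."""
    selected = []
    for root, dirs, files in os.walk(jenkins_home):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
        for name in files:
            if name.endswith(".xml"):
                full = os.path.join(root, name)
                selected.append(os.path.relpath(full, jenkins_home))
    return sorted(selected)
```

A nightly job could pass the returned paths to `git add` before committing. Note that a job literally named `builds` or `workspace` would also be skipped by this simple filter.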
10. JE Backup Plugin
• Backup as a Jenkins job
• What to backup
– Job configuration
– Build records
– System Configuration
• Plugin binaries, plugin configs etc
• Everything except job
• Where to backup
– Local Directory
– SFTP server
– WebDAV
• Retention Policy
– All
– Last N
– Exponential decay
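The retention policies named on this slide can be illustrated as follows. This is only a sketch: the slide does not say how the plugin implements "exponential decay", so the bucketing rule below (keep the newest backup aged 0 days, 1 day, 2-3 days, 4-7 days, and so on) is an assumption, and both function names are hypothetical.

```python
from datetime import datetime, timedelta

def keep_last_n(timestamps, n):
    """Keep the n most recent backups, newest first."""
    return sorted(timestamps, reverse=True)[:n]

def keep_exponential_decay(timestamps, now):
    """Keep the newest backup in each age bucket: 0, 1, 2-3, 4-7, 8-15 days...

    Assumption: this is one plausible reading of "exponential decay";
    the plugin's actual rule may differ.
    """
    newest_per_bucket = {}
    for ts in sorted(timestamps, reverse=True):  # newest first
        age_days = max((now - ts).days, 0)
        bucket = age_days.bit_length()  # 0 -> 0, 1 -> 1, 2-3 -> 2, 4-7 -> 3
        newest_per_bucket.setdefault(bucket, ts)  # first seen is the newest
    return sorted(newest_per_bucket.values(), reverse=True)
```

With ten daily backups, this rule keeps the ones aged 0, 1, 2, 4 and 8 days, so recent history stays dense while older history thins out.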
11. Demo
12. Making Jenkins Highly Available
Problem: Jenkins failures
• Machine/Jenkins failure has a high cost to productivity
Solution: Notified by unhappy customers ;-)
• Issues:
– Receive emails from unhappy customers and log in and fix it
• You do have JENKINS_HOME backed up elsewhere, don’t you?
Jenkins Enterprise Solution
• Highly Available
– Set up multiple Jenkins masters
– Uses JGroups to elect a primary master
– Promotes a backup master as primary
13. Bounce Back Faster: High Availability
[Diagram labels: Reverse Proxy, Jenkins Master, MT, Jenkins Cluster, JENKINS_HOME, NFS]
14. Demo
15. Miscellaneous
• Jenkins is not just JENKINS_HOME…think about the slaves
– Offload builds onto slaves
– Other executables on the system: git, ruby, java etc as well
– Preferably use Chef/Puppet to replicate installations
• What about geo redundancy?
– Technically you can use HA, but network latency comes into play
– Ideally, use HA in a localized data center and a manual failover to a different geo
• What is HA not?
– Does not load balance between instances
17. How can you delegate more to Jenkins?
• Does your CI server shift work from laptops to servers?
– You need to commit to have Jenkins test it
– But if your commit is bad, it blocks others
– You end up testing locally before committing
– FAIL
18. Motivation
• We want to make changes safely
– Your mistake shouldn’t block others
– Only push after changes are validated
• We want to run tests asynchronously
– Your brain has more important things to do
– Make change and move on
– Even with TDD!
• We want to run tests on the server
– Your laptop has more important things to do
19. Solution: Jenkins should be a Git server
• I push to Jenkins
• Jenkins merges it with upstream
• Jenkins tests it
• If good, Jenkins pushes it upstream
[Diagram labels: upstream repo, gate repo]
20. Another way to look at it
[Diagram labels: Tip of master in upstream; My changes]
21. Implementation
• Transport
– HTTP
– SSH
• JGit embedded in Jenkins for git server functionality
– A bit of magic like Gerrit to make it seamless
• Additional tags to let you pull submitted changes
22. Demo
24. Test Instance
• Run a mini second instance
– Test a new core version before putting it into prod
– Test new versions of plugins
– Play with new plugins
• Copy over some jobs from prod
• Bootstrap dry-run
– -Djenkins.model.Jenkins.killAfterLoad=true
25. Configuring Jenkins for efficiency
• Fast archiver plugin
– Conserve network bandwidth
• No build on master
– Also good for security
26. Managing and Pruning Plugins
Problem: Discovering what plugins are used in an installation
• No visibility if a particular plugin is used or how many jobs use it
Jenkins Enterprise Solution
• Plugin Usage Plugin
– Tabular view of Plugin name, # of jobs and the job names using the plugin
27. Demo
29. Why?
30. What?
• What the user sees
– GUI (load time)
• JVM memory size
– Beware of several independent pieces
• System load
• Free space on $JENKINS_HOME
• Slave availability
• Queue length
31. Groovy Console
$ cat queue.groovy
j = Jenkins.instance
println j.queue.items.length
$ curl -u "user:apiToken" \
    --data-urlencode script@queue.groovy \
    http://jenkins/scriptText
13
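The same call to `/scriptText` can be made from a monitoring script. The sketch below only builds the request and does not send it; the helper name `build_script_request` is hypothetical, and `http://jenkins`, `user` and `apiToken` are the placeholders from the slide, not real values.

```python
import base64
import urllib.parse
import urllib.request

def build_script_request(base_url, user, api_token, script):
    """Build (but do not send) a POST to <base_url>/scriptText."""
    # Form-encode the Groovy source under the "script" field, as curl does.
    body = urllib.parse.urlencode({"script": script}).encode("ascii")
    credentials = base64.b64encode(
        f"{user}:{api_token}".encode("utf-8")
    ).decode("ascii")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/scriptText", data=body, method="POST"
    )
    request.add_header("Authorization", "Basic " + credentials)
    return request

# Usage (needs a reachable Jenkins instance):
# req = build_script_request("http://jenkins", "user", "apiToken",
#                            "println Jenkins.instance.queue.items.length")
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Keeping request construction separate from sending makes the helper easy to check without a live server.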
32. Remote API
$ curl 'http://jenkins/computer/api/json?pretty=true'
{
"busyExecutors": 0,
"totalExecutors": 2,
...
}
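Once the JSON is fetched, turning it into a single number to watch is straightforward. This is a minimal sketch that uses only the two fields shown on the slide; the function name `executor_utilization` and the sample payload are illustrative assumptions.

```python
import json

def executor_utilization(payload):
    """Return the busy/total executor ratio from a /computer/api/json body."""
    data = json.loads(payload)
    total = data["totalExecutors"]
    if total == 0:
        return 0.0  # no executors configured; avoid dividing by zero
    return data["busyExecutors"] / total

# Hand-written sample payload, not captured from a real server.
sample = '{"busyExecutors": 1, "totalExecutors": 2}'
print(executor_utilization(sample))  # 0.5
```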
34. Nagios (or others like it)
• Server app for monitoring stuff
– Extensible, allowing all sorts of things to be monitored
• Used on jenkins-ci.org and DEV@cloud
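A custom Nagios check is a program that prints one status line and exits with 0 (OK), 1 (WARNING) or 2 (CRITICAL). The sketch below applies that convention to free space on the filesystem holding $JENKINS_HOME; the 20%/10% thresholds and the function names are assumptions chosen for illustration.

```python
import shutil

# Nagios plugin exit codes (standard convention).
OK, WARNING, CRITICAL = 0, 1, 2

def classify_free_space(free_bytes, total_bytes, warn_pct=20.0, crit_pct=10.0):
    """Map free space to a Nagios-style (exit_code, message) pair.

    Assumption: the 20%/10% thresholds are illustrative defaults.
    """
    free_pct = 100.0 * free_bytes / total_bytes
    if free_pct < crit_pct:
        code, label = CRITICAL, "CRITICAL"
    elif free_pct < warn_pct:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    return code, f"DISK {label} - {free_pct:.1f}% free"

def check_jenkins_home(path):
    """Check the filesystem that holds `path` (for example $JENKINS_HOME)."""
    usage = shutil.disk_usage(path)
    return classify_free_space(usage.free, usage.total)
```

A wrapper script would print the message and call `sys.exit(code)` so that Nagios can pick up the state.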
36. Thread dump
• Tells us where Jenkins is stuck
• When?
– Hang or slowness
• Look for threads that are stuck
– HTTP request threads
– Executor threads
37. How to get a thread dump
• http://jenkins/threadDump
• kill -3 <PID>
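A first pass over a dump can be automated by counting threads per state. This sketch assumes the HotSpot text format produced by `kill -3`, where each thread has a `java.lang.Thread.State:` line; the page served at `/threadDump` may be formatted differently, and the function name `count_thread_states` is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:  java.lang.Thread.State: BLOCKED (on object monitor)
STATE_LINE = re.compile(r"java\.lang\.Thread\.State: (\w+)")

def count_thread_states(dump_text):
    """Count threads per state in a HotSpot-style thread dump."""
    return Counter(STATE_LINE.findall(dump_text))
```

A large or growing BLOCKED count across successive dumps is a hint about where to start reading the stack traces.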
38. Heap dump
• Tells us what’s eating memory
• When?
– OutOfMemoryError
– Monitoring shows abnormal growth
• Look for objects that are big
– Sessions
– Classes from plugins
39. How to get a memory dump
• curl -L http://jenkins/heapDump > dump.hprof
• jmap -dump:format=b,file=dump.hprof <PID>
• -XX:+HeapDumpOnOutOfMemoryError
40. Wrapping up
Thank You!
More Info http://www.cloudbees.com/jenkins-enterprise-by-cloudbees-overview.cb
Free Trial http://www.cloudbees.com/jenkins-enterprise-by-cloudbees-download.cb
Wiki Page https://wiki.cloudbees.com/bin/view/Jenkins+Enterprise/WebHome
User Guide http://jenkins-enterprise.cloudbees.com/docs/user-guide-bundle/index.html#
41. Day: A period of 24 hours, mostly misspent…
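Editor's Notes: Earlier I talked at length about how important it is to make full use of the server. <<slide>> That does not mean CI is worthless, but it has the potential to be used far more than it is, and that potential is not being tapped.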