Deploying Software at Scale Kris Buytaert @krisbuytaert
Kris Buytaert
● I used to be a Dev,
● Then became an Op
● Chief Trolling Officer and Open Source Consultant @inuits.eu
● Everything is an effing DNS Problem
● Building Clouds since before the bookstore
● Some books, some papers, some blogs
● Evangelizing devops
Today's Goals
● A reproducible way to deploy and upgrade software
● Automatically
● Fast
● Consistent
What's the problem ?
The community of developers whose work you see on the Web, who probably don’t know what ADO or UML or JPA even stand for, deploy better systems at less cost in less time at lower risk than we see in the Enterprise. This is true even when you factor in the greater flexibility and velocity of startups.
Tim Bray, on his blog, January 2010
The Old Days
● “Put this Code Live, here's a tarball” NOW!
● What dependencies ?
● No machines available ?
● What database ?
● Security ?
● High Availability ?
● Scalability ?
● My computer can't install this ?
devops
● Culture
● (Lean)
● Automation
● Measurement
● Sharing
Damon Edwards and John Willis
Gene Kim
Nirvana
An “ecosystem” that supports continuous delivery, from infrastructure, data and configuration management to business.
Through automation of the build, deployment, and testing process, and improved collaboration between developers, testers, and operations, delivery teams can get changes released in a matter of hours, sometimes even minutes, no matter what the size of a project or the complexity of its code base.
Continuous Delivery, Jez Humble
How many times a day ?
● 10 @ Flickr
● Deployments used to be a pain
● Nobody dared to deploy a site
● Practice makes perfect
● Knowing you can vs constantly doing it

"Our job as engineers (and ops, dev-ops, QA, support, everyone in the company actually) is to enable the business goals. We strongly feel that in order to do that you must have the ability to deploy code quickly and safely. Even if the business goals are to deploy strongly QA’d code once a month at 3am (it’s not for us, we push all the time), having a reliable and easy deployment should be non-negotiable."
Etsy Blog upon releasing Deployinator
http://codeascraft.etsy.com/2010/05/20/quantum-of-deployment/
OS Baseline
● Automated Deployments
● Reproducible
● Kickstart, FAI, Preseeding
● JeOS
Infrastructure as Code
● Treat configuration automation as code
● Development best practices
  • Model your infrastructure
  • Version your cookbooks / manifests
  • Test your cookbooks / manifests
  • Dev / test / uat / prod for your infra
● Model your infrastructure
● A working service = automated ( Application Code + Infrastructure Code + Security + Monitoring )
● Think Puppet, Chef, Cfengine, Ansible, ...
Version Control
● Git !
● Version ALL the things:
  • Source code Application
  • Source code Infrastructure
  • Builds
  • Tests
  • Pipelines
  • Scripts
  • Documentation
  • Monitoring scripts
Continuous Integration
● Builds
● Nightly Builds
● Builds with tests
● Nightly Builds with tests
● Frequent integration
● Continuous Integration
Jenkins
● Open Source Continuous Integration Server
● A zillion plugins (400)
● Have developers build stable and deployable code
● Test Infra code
A pipeline
● Checkout code
● Syntax
● Style
● Code Coverage
● Tests
● Build
● More Tests
● Package
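The stage list above can be read as an ordered chain in which a failing stage blocks everything after it. A minimal Python sketch of that idea follows. The stage names come from the slide; `run_pipeline` and the placeholder stage functions are hypothetical and only stand in for whatever the real Jenkins jobs do.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable returning True on success.
# The callables here are placeholders (assumption), not real build steps.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, step in stages:
        if not step():
            break  # a red stage blocks everything downstream
        completed.append(name)
    return completed


def ok() -> bool:
    return True


def failing() -> bool:
    return False


PIPELINE: List[Stage] = [
    ("checkout", ok),
    ("syntax", ok),
    ("style", ok),
    ("coverage", ok),
    ("tests", failing),  # simulate a failing test run
    ("build", ok),
    ("more-tests", ok),
    ("package", ok),
]

if __name__ == "__main__":
    print(run_pipeline(PIPELINE))
```

Because the simulated test stage fails, nothing is built or packaged, which is the property that keeps broken code away from the package repository.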
App Requirements
● Testable
● Configuration isolated
● Automated Deployments
● “If my computer can't install it, the installer is borken” Luke Kanies at Fosdem (2007)
● Bulk provisioning of data
● http://www.krisbuytaert.be/blog/how-i-my-java
Why ops like to package
● Packages give you features
  • Consistency, security, dependencies
● Uniquely identify where files come from
  • Package or cfg-mgmt
● Source repo not always available
  • Firewall / Cloud etc.
● Weird deployment locations, no easy access
● Little overhead when you automate
● CONFIG does not belong in a package
fpm
fpm -t rpm -s dir -n hornetq -v 2.2.5 hornetq
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.nNkVwh
+ umask 022
+ cd /usr/local/build-rpm-hornetq-2.2.5.x86_64.rpm/BUILD
+ exit 0
Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.yUd4MV
+ umask 022
+ cd /usr/local/build-rpm-hornetq-2.2.5.x86_64.rpm/BUILD
+ cd /usr/local/build-rpm-hornetq-2.2.5.x86_64.rpm/BUILD
+ tar -zxf /usr/local/build-rpm-hornetq-2.2.5.x86_64.rpm/data.tar.gz
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.jkpqeA
+ umask 022
+ cd /usr/local/build-rpm-hornetq-2.2.5.x86_64.rpm/BUILD
+ /usr/lib/rpm/brp-compress
+ /usr/lib/rpm/brp-strip
+ /usr/lib/rpm/brp-strip-static-archive
+ /usr/lib/rpm/brp-strip-comment-note
Processing files: hornetq-2.2.5-1.x86_64
Checking for unpackaged file(s): /usr/lib/rpm/check-files /usr/local/build-rpm-hornetq-2.2.5.x86_64.rpm/BUILD
Wrote: /usr/local/build-rpm-hornetq-2.2.5.x86_64.rpm/SRPMS/hornetq-2.2.5-1.src.rpm
Wrote: /usr/local/build-rpm-hornetq-2.2.5.x86_64.rpm/RPMS/x86_64/hornetq-2.2.5-1.x86_64.rpm
Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.z2UL3B
+ umask 022
+ cd /usr/local/build-rpm-hornetq-2.2.5.x86_64.rpm/BUILD
+ rm -rf /usr/local/build-rpm-hornetq-2.2.5.x86_64.rpm/BUILD
+ exit 0
Created /usr/local/hornetq-2.2.5.x86_64.rpm
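In a build job the fpm call shown above is usually generated rather than typed by hand. A small Python sketch of that, assuming only the four options used on the slide (`-t`, `-s`, `-n`, `-v`); the helper name `fpm_command` is hypothetical.

```python
import shlex
from typing import List


def fpm_command(name: str, version: str, source_dir: str,
                target: str = "rpm") -> List[str]:
    """Build the argument list for an fpm run that packages a directory."""
    return [
        "fpm",
        "-t", target,   # output package type
        "-s", "dir",    # input is a plain directory tree
        "-n", name,     # package name
        "-v", version,  # package version
        source_dir,
    ]


if __name__ == "__main__":
    cmd = fpm_command("hornetq", "2.2.5", "hornetq")
    print(shlex.join(cmd))
    # To actually build, hand the list to subprocess.run(cmd, check=True)
    # on a machine where fpm is installed.
```

Keeping the command as a list rather than a string avoids shell quoting problems when package names or paths contain spaces.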
fpm in action
● https://github.com/Inuits/build-gems
● Fork, pull
● Jenkins pulls, builds, pushes to repo
● (variants for Nagios Plugins / Jenkins Plugins available)
A pipeline
● Checkout code
● Syntax
● Style
● Code Coverage
● Tests
● Build
● More Tests
● Package
● Upload to Repo
Pulp
Pulp is a Python application for managing software repositories and their associated content, such as packages, errata, and distributions. It can replicate software repositories from a variety of supported sources, such as http/https, file system, ISO, and RHN, to a local on-site repository. It provides mechanisms for systems to gain access to these repositories, providing centralized software installation.
Pulp
● Red Hat Community
● Red Hat Emerging Technology
● Part of Katello
Pulp
● “Manages” its own apache instance
● Symlinks, no copies
● Queues
  • Syncing in the background
  • No more screens ;)
● Actions are not instant
  • e.g. Add / sync / delete
● Hello mongodb :(
● v1 vs v2
● Only use repo functionality, cfgmgmt is in charge of packages
Version vs Latest
● Version your repos ? ensure => latest
● Latest your environments ?
● Strict versioning in config ? ensure => 0.98.4
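The two `ensure` values above are the two ends of the trade-off: `latest` follows whatever the (versioned) repo offers, while an explicit version pins the node regardless of the repo. A hedged Python sketch that renders both variants as Puppet package resources; `package_resource` and the package name `myapp` are hypothetical, since the slide does not say which package 0.98.4 belongs to.

```python
from typing import Optional


def package_resource(name: str, version: Optional[str] = None) -> str:
    """Render a Puppet package resource as text.

    Without a version the resource tracks the repo (ensure => latest);
    with a version it is pinned to exactly that release.
    """
    ensure = version if version is not None else "latest"
    return f"package {{ '{name}':\n  ensure => '{ensure}',\n}}"


if __name__ == "__main__":
    # 'myapp' is a made-up package name used for illustration only.
    print(package_resource("myapp"))
    print(package_resource("myapp", "0.98.4"))
```

With `latest`, promotion happens by changing what the repo contains; with a pinned version, promotion happens by changing the manifest.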
A pipeline
● Checkout code
● Syntax
● Style
● Code Coverage
● Tests
● Build
● More Tests
● Package
● Upload to Repo
● Deploy on Test
The Marionette Collective
● Distributed ssh ++
● What version of ssh do I have installed on my servers ?
● On what servers is XYZ running ?
● Clean all my ssl certs ?
● Restart apache on all servers with fact X
mc-package
mc-package -W /dev/ status jdk
 * [ ============================================================> ] 33 / 33
servicesdb01.dev.com version = -absent
services.dev.google.com version = jdk-1.6.0_13-fcs
drbdtest02.dev.google.com version = -absent
services3.dev.google.com version = jdk-1.6.0_20-fcs
um.dev.google.com version = jdk-1.5.0_19-fcs
devtools03.uat.com version = jdk-1.6.0_29-fcs
alexandria02.dev.google.com version = -absent
weblink01.dev.com version = -absent
wikitest.dev.google.com version = jdk-1.6.0_24-fcs
payment.dev.google.com version = jdk-1.5.0_17-fcs
tiff2pdf01.dev.com version = -absent
devdoos.dev.com version = jdk-1.6.0_30-fcs
wiki.dev.google.com version = jdk-1.6.0_24-fcs
reporting01.dev.com version = -absent
devtools01-dev.uat.com version = jdk-1.6.0_23-fcs
devtools02.uat.com version = jdk-1.6.0_29-fcs
drbdtest01.dev.google.com version = -absent
---- package agent summary ----
Nodes: 33/33
Versions: 1 * 1.5.0_17-fcs, 1 * 1.5.0_19-fcs, 1 * 1.6.0_13-fcs, 1 * 1.6.0_20-fcs, 1 * 1.6.0_23-fcs, 2 * 1.6.0
Elapsed Time: 1.73 s
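Output like the above is mostly useful for spotting version drift across nodes. The following Python sketch tallies versions from such a listing. It assumes each host is reported on its own line as `host version = value`, as on the slide; real mc-package output may differ between versions, and `tally_versions` is a hypothetical helper.

```python
import re
from collections import Counter

# One host per line: "<hostname> version = <value>" (format assumed
# from the slide).
LINE = re.compile(r"^(?P<host>\S+)\s+version = (?P<version>\S+)$")


def tally_versions(output: str) -> Counter:
    """Count how many nodes report each package version."""
    counts = Counter()
    for raw in output.splitlines():
        match = LINE.match(raw.strip())
        if match:  # progress bar and summary lines are skipped
            counts[match.group("version")] += 1
    return counts


SAMPLE = """\
servicesdb01.dev.com version = -absent
services.dev.google.com version = jdk-1.6.0_13-fcs
wikitest.dev.google.com version = jdk-1.6.0_24-fcs
wiki.dev.google.com version = jdk-1.6.0_24-fcs
---- package agent summary ----
Nodes: 33/33
"""

if __name__ == "__main__":
    for version, nodes in tally_versions(SAMPLE).most_common():
        print(f"{nodes} * {version}")
```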
What to Trigger ?
● Update Package
  • Only updates package
● Trigger Puppet Run
  • Updates config + package
A pipeline
● Checkout code
● Syntax
● Style
● Code Coverage
● Tests
● Build
● More Tests
● Package
● Upload to Repo
● Deploy on Test
● More Tests
● Promote
● Deploy on UAT
● More Tests
● Promote
● Deploy on Prod
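The second half of this pipeline repeats one pattern: deploy to an environment, test there, and promote only on success. A minimal Python sketch of that loop, assuming the three environments named on the slide; `promote`, `deploy`, and `verify` are hypothetical stand-ins for the real jobs (for example an MCollective-triggered Puppet run and a test suite).

```python
from typing import Callable, List

# Environment order follows the slide: Test, then UAT, then Prod.
ENVIRONMENTS = ["test", "uat", "prod"]


def promote(package: str,
            deploy: Callable[[str, str], None],
            verify: Callable[[str, str], bool]) -> List[str]:
    """Deploy a package environment by environment.

    Returns the environments in which the package was deployed and
    verified. A failed verification stops the promotion chain.
    """
    passed = []
    for env in ENVIRONMENTS:
        deploy(package, env)
        if not verify(package, env):
            break  # do not promote past a failing environment
        passed.append(env)
    return passed


if __name__ == "__main__":
    log = []
    result = promote(
        "hornetq-2.2.5",
        deploy=lambda pkg, env: log.append(env),
        verify=lambda pkg, env: env != "uat",  # simulate a UAT failure
    )
    print("deployed to:", log)
    print("verified in:", result)
```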
Done ?
● Close the feedback loop,
● Send metric on deployment
echo "deployed.$package_name 1 `date +%s`" > /dev/tcp/<%= graphite_host %>/2003
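The one-liner above writes a single line in Graphite's plaintext format (`metric value timestamp`) to port 2003. The same thing as a Python sketch, for deploy scripts that are not shell. The function names are hypothetical, and the host is whatever your Graphite server is called.

```python
import socket
import time
from typing import Optional


def graphite_line(package_name: str, value: int = 1,
                  timestamp: Optional[int] = None) -> str:
    """Format one metric in Graphite's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())  # seconds since the epoch
    return f"deployed.{package_name} {value} {timestamp}\n"


def send_deploy_metric(host: str, package_name: str,
                       port: int = 2003) -> None:
    """Send a 'deployed' event for a package to a Graphite server."""
    line = graphite_line(package_name)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

Calling `send_deploy_metric("graphite.example.com", "hornetq")` at the end of a deploy (the hostname is a placeholder) gives you a marker you can overlay on your other graphs.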
Done ?
A Software project is not done until your last enduser is in his grave !
But remember
Everything is a Fscking DNS Problem
No really, Everything is a Fscking DNS Problem
If it's not a fucking DNS Problem ..
It's an arp problem
If it's not an arp problem...
It's a Full Filesystem Problem
If your filesystem isn't full
It's a Spanning Tree problem
If it's not a spanning Tree problem...
It's a USB problem
If it's not a USB Problem
It might be an ntp problem
If it's not an ntp problem
It's a sharing IRQ Problem
If it's not a sharing IRQ Problem
But most often .. it's a Freaking DNS Problem !
Or someone playing tricks on you
Jan 2006