Puppet at GitHub / ChatOps

Jesse Newland
jnewland

hey errbody
my name is jesse newland
I do ops at GitHub

Puppet
at
GitHub
And today I’m going to be talking about Puppet at GitHub.

Really, I’m telling a story in two parts.

All of the amazing Puppet
OSS projects @rodjek
has written but doesn’t
want to talk about
First... I’ll be talking about all of the amazing Puppet open source projects Tim Sharpe has
written but doesn’t want to talk about

and how we use them at GitHub

*
And then, I want to introduce you to the star of the GitHub Ops team, Hubot, and tell you a
little bit about something we’ve been calling ChatOps

the

Setup
But, before I get into all of that, I'm actually going to talk about an
upcoming talk, one by a coworker of mine at GitHub. Will Farrington
is going to be speaking tomorrow at 2:45pm about The Setup, our
Puppet-powered GitHubber laptop management solution. It's
amazing. It's one of the coolest uses of Puppet I've ever seen, and
it's going to completely change the way you think about your
development environment.

But I’m not going to be talking about any of that today.

So, yeah, go to Will's talk tomorrow. You won't be disappointed.

Puppet
at
GitHub
So I guess you could say that I’m talking about

THE REST
of
Puppet
at
GitHub
the rest of puppet at github. For the scope of this talk, I’m going to be talking about the
Puppet infrastructure that runs github.com

4 years, >100k LOC

We’ve been managing GitHub’s infrastructure with Puppet for 4 years, since the move to
Rackspace. There’s a ton of code, and we’re developing at a rapid pace.

Simple
But we are obsessed with keeping our Puppet deployment simple

Single
Master
We use a single puppetmaster running lots of unicorns. Nothing fancy. It works for now.

However, we will need to scale this tier up or out in about 6 months if the trends look right.
We’ll probably switch to two load balanced puppetmasters around that time.

cron FTW
# cat /etc/cron.d/puppet
13 * * * * root /usr/bin/

We don’t run the agent, but rather run puppet on cron every hour in combination with runs
triggered via Hubot (more on that later)

No
ENC
We don’t use an external node classifier

([a-z0-9-_]+)(\d+)([a-z]?)\.(.*)\.github\.com
$ cat manifests/nodes/janky.rscloud.pp

node /^janky\d+\.rscloud\.github\.com$/ {
  github::role::janky { 'janky':
    public_address => dns_lookup($fqdn),
    nginx_hostname => $fqdn,
  }
}

Instead, we give nodes DNS names that adhere to a naming convention that maps them to a
pre-defined role
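To make the convention concrete, here is a minimal Ruby sketch that splits a hostname into the parts captured by the pattern on the slide. The `parse_node_name` helper and the field names (`role`, `index`, `suffix`, `site`) are assumptions for illustration only; the talk shows just the pattern, and the backslash escapes are restored here on the assumption that they were lost in the slides.

```ruby
# Hypothetical helper, not from the talk: the field names are illustrative.
# The pattern mirrors the one on the slide, anchored at both ends.
NODE_PATTERN = /\A([a-z0-9_-]+)(\d+)([a-z]?)\.(.*)\.github\.com\z/

# Returns a hash describing the node, or nil if the name does not
# follow the naming convention.
def parse_node_name(fqdn)
  match = NODE_PATTERN.match(fqdn)
  return nil unless match

  {
    role:   match[1], # e.g. "fs" or "janky"
    index:  match[2], # numeric part of the name
    suffix: match[3], # optional trailing letter, "" when absent
    site:   match[4]  # e.g. "rs", "stg", "rscloud"
  }
end
```

For example, `parse_node_name('fs1a.rs.github.com')` yields the role `fs`, index `1`, suffix `a` and site `rs`. Note that because the first group also accepts digits and is greedy, a name like `fe12` would split as `fe1` plus `2`; a stricter pattern may be preferable in practice.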

Where the magic happens
$ head modules/github/manifests/role/janky.pp

define github::role::janky($public_address,
                           $nginx_hostname='',
                           $god=true
) {

  github::core { 'janky': }

  include github::app::janky

  github::nginx { 'janky': }

}
Role definitions are where the magic happens. We try to DRY common functionality into our
core module and into other simple classes or defines so that role definitions read like a nice
summary of what makes this role different from others

Heavy use of augeas
augeas { 'my.cnf/avoid_cardinality_skew':
  context => '/files/etc/mysql/my.cnf/mysqld/',
  changes => [
    'set innodb_stats_auto_update 0',
    'set innodb_stats_on_metadata 0',
    'set innodb_stats_on_metadata 64'
  ],
  require => Percona::Server[$::fqdn],
}

We generally try to avoid templates for configuration files in favor of using Augeas.

It lets us manage the small pieces of configuration we care about and use the OS defaults for
the things we don't.

BORING
But I don’t want to just show all of you Puppet code for thirty minutes. That's boring

What’s interesting
about Puppet at
GitHub?
I’d rather talk about what's interesting about how we use Puppet at GitHub. And what I think
is the most interesting is that we focus heavily on ensuring the Puppet development workflow
is easily accessible to everyone at GitHub.

Making
Puppet Less
Scary
We’re doing our best to make puppet less scary for people that aren’t familiar with it, so they
can help the Ops team grow and evolve our infrastructure. We’re doing some things right
here, but there’s still a lot of work to do.

I’ve been thinking about this a lot recently as we’ve just had two large infrastructure projects
shipped by people that were completely or relatively new to puppet. First, Derek Greentree
shipped a Cassandra cluster.

And Adam Roben shipped puppet manifests for our windows build and CI servers.

this
is
good
This is an awesome trend, and I want it to continue. So I thought I’d talk a bit today about
what we’re doing to try to enable even more of this.

Flow just like
a (GitHub)
Ruby project
For us, an important part of making Puppet development accessible for other developers at
GitHub is making the development flow on our puppet codebase as similar as possible to that
of any other GitHub Ruby project. That means sticking with some common conventions

Setup
$ ./script/bootstrap

Like making it as easy to set up as any other project at GitHub

$ cat Gemfile
source :rubygems

gem 'puppet', '2.7.18'
gem 'facter', '1.6.10'
gem 'rspec-puppet', '0.1.2'
gem 'rake', '0.8.7'
gem 'puppet-lint', '0.2.1'
gem 'ruby-augeas', '0.3.0'
gem 'json', '1.5.1'
gem 'fog', '1.3.1'
gem 'librarian-puppet', '0.9.4'
gem 'parallel_tests'

So ruby deps are managed by Bundler

$ cat Puppetfile

forge "http://forge.puppetlabs.com"

mod 'puppetlabs/apt'
...

And puppet deps are managed by librarian-puppet, a bundler-like library that manages the
puppet modules your infrastructure depends on and installs them directly from GitHub
repositories.

I’m of the opinion that the unit of open source currency is no longer a tarball downloaded
from something named *forge. It’s a GitHub repo. All of the developers at GitHub feel the
same way, so Tim wrote librarian-puppet

 rodjek / librarian-puppet

For those of you keeping score at home, that’s the first of Tim Sharpe’s open source projects
that I’ve mentioned. Hi Tim!

Making puppet flow like other projects at GitHub means ensuring we have good editor
support for the language

 rodjek / vim-puppet

vim-puppet, that’s two.

Tests
$ ./script/cibuild

It means running tests is a simple one-step process

TESTS!
Tests are super important. A solid and easy to use test harness helps build developer
confidence in a new language.

Safety
net
And tests are a crucial safety net for helping people cut their teeth on Puppet if they haven’t
ever touched it before.

rspec-puppet
should contain_github__firewall_rule('internal_network')

should contain_ssmtp__relay_to('smtp').with_relay_host('smtp')

should contain_file('/etc/logstash/logstash.conf')

should include_class('github::ksplice')

should contain_networking__bond('bond0').with(
:gateway => '172.22.0.2',
:arp_ip_target => '172.22.0.2',
:up_commands => nil
)

We use rspec-puppet heavily. If you haven’t used rspec-puppet yet, go check it out right
now.

It’s amazing.

There are no fewer than three talks about it at PuppetConf, so I’m not going to talk about HOW
to use it today, just touch a little bit on how WE use it.

 rodjek / rspec-puppet

rspec-puppet, that’s three

role specs are king
describe 'github::role::fe' do
  let(:title) { 'fe' }
  let(:node) { 'fe1.rs.github.com' }
  let(:params) {
    {
      :public_address  => '207.97.227.242/27',
      :private_address => '172.22.1.59/22',
      :git_weight      => '16'
    }
  }
  let(:facts) {
    {
      :ipaddress       => '172.22.1.59',
      :operatingsystem => 'Debian',
      :datacenter      => 'rackspace-iad2',
    }
  }

  it do
    should contain_github__core('fe')
    ...
  end
end

We try our best to adequately test our individual puppet modules, but our central and most
frequently touched specs exercise our role system. There’s one spec for each role which
describes its intended functionality.

These specs focus on critical functionality of each role, and help a great deal to build
confidence that we’re not introducing regressions when adding or refactoring functionality or
working in other roles.

.git/hooks/pre-commit
$ git commit -am "lolbadchange"
modules/github/manifests/role/fe.pp:err: Could
not parse for environment production: Syntax
error at 'allow_outbound_syslog'; expected '}'
at /Users/jnewland/github/puppet/modules/github/
manifests/role/fe.pp:31
modules/github/manifests/role/fe.pp - WARNING:
=> is not properly aligned on line 626

For an even faster feedback loop than running specs, all Puppet dev environments
automatically get set up with a pre-commit hook that checks for syntax errors and ensures
your changes conform to the Puppet style guide.

This has proved amazingly useful for Puppet novices and experts alike: novices find it
helps them understand language conventions quickly and guides them towards solutions,
and experts use it to catch typos and avoid looking like novices.
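A hook along these lines could be sketched in Ruby as follows. This is not GitHub's actual hook, which the talk does not show; it is a minimal sketch that assumes the `puppet` and `puppet-lint` executables are on the PATH and that only staged `.pp` files need checking. The helper names are hypothetical.

```ruby
require 'open3'

# Keep only Puppet manifests from a list of paths.
def puppet_manifests(paths)
  paths.select { |path| path.end_with?('.pp') }
end

# Commands to run for one manifest: a syntax check, then a style check.
def check_commands(path)
  [
    ['puppet', 'parser', 'validate', path],
    ['puppet-lint', path]
  ]
end

# Files staged for commit (added, copied or modified).
def staged_files
  out, _status = Open3.capture2(
    'git', 'diff', '--cached', '--name-only', '--diff-filter=ACM'
  )
  out.split("\n")
end

# Run every check on every staged manifest so all problems are reported,
# then return true only if each command exited successfully.
def run_checks(paths)
  results = puppet_manifests(paths).flat_map do |path|
    check_commands(path).map { |command| system(*command) }
  end
  results.all?
end

# A script saved as .git/hooks/pre-commit would finish with:
#   exit(run_checks(staged_files) ? 0 : 1)
```

Whether style warnings should block a commit is a policy choice; this sketch simply relies on the exit status each tool returns.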

 rodjek / puppet-lint

puppet-lint, that’s four, btw.

specs run on each push

auto deploy on CI pass
rspec-puppet and puppet-lint are automatically run by CI on every commit on every branch
pushed to our Puppet repo.

Once master passes CI, puppet is automatically deployed

As you can see, Hubot automates a lot of the process of rolling out Puppet

That example covered pushing changes to master, but what about a Pull-Request based
workflow?

Say we have a pull request for a branch we want to merge, and that we’ve reviewed the code
and it all looks good.

branches
==
environments
On each deploy, we turn all git branches into puppet environments.

This combined with heaven, our capistrano-powered deployment API we interact with via
Hubot, enables us to experiment with unmerged Puppet branches in a powerful way
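The talk does not show how branch names become environment names, so the following Ruby sketch is only one plausible approach. It assumes that any character other than a lowercase letter, digit or underscore is replaced with an underscore, since newer Puppet releases restrict environment names to that set. Both helper names are hypothetical.

```ruby
# Hypothetical mapping from a git branch name to a Puppet environment name.
# Assumption: anything outside [a-z0-9_] is replaced with an underscore.
def environment_name(branch)
  branch.downcase.gsub(/[^a-z0-9_]/, '_')
end

# Build a lookup of environment name => original branch name.
# Note: two branches that sanitize to the same name would collide here;
# a real implementation would need to detect that.
def environments_for(branches)
  branches.each_with_object({}) do |branch, envs|
    envs[environment_name(branch)] = branch
  end
end
```

With this scheme a branch called `git-gh13` would be served as the environment `git_gh13`, while `master` maps to itself.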

So, to safely merge this pull request...

hubot ci status puppet/git-gh13

deploy:apply puppet/git-gh13 staging/fs1

deploy:noop puppet/git-gh13 prod/fs1

# merge pull request

hubot deploy:apply puppet to prod/fs

graph me -1h @collectd.load(fs*)

log me hooks github/github

You might ask Hubot to confirm its build status

Build #108816
(5fe75932f26ea62cb5fc5e3d0cb302cc2461d11e)
of puppet/git-gh13 was successful(421s) github/
puppet@567ea48...5fe7593

Yup, looks good.

Then roll the branch out to a staging box to make sure everything applies cleanly there.

** [out :: REDACTED ] Bootstrapping...
** [out :: REDACTED ] Gem environment up-to-date.
** [out :: REDACTED ] Running librarian-puppet...
** [out :: REDACTED ] Generating puppet environments...
** [out :: REDACTED ] Cleaning up deleted branches...
** [out :: REDACTED ] Done!
** [out :: REDACTED ] Sending 'restart' command
** [out :: REDACTED ] The following watches were affected:
** [out :: REDACTED ] puppetmaster_unicorn
** [out :: fs1a.stg.github.com] info: Applying
configuration version
'8fb1a2716d5f950b836e511471a2bdac3ed27090'
** [out :: fs1a.stg.github.com] notice: /Stage[main]
Github::Common_packages/Package[git]/ensure: ensure changed
'1:1.7.10-1+github12' to '1:1.7.10-1+github13'
...

Yup, looks good.

Then, if you wanted an extra layer of confidence, you could noop the branch against a
production node

** [out :: fs1a.rs.github.com] info: Applying
'8fb1a2716d5f950b836e511471a2bdac3ed27090'
** [out :: fs1a.rs.github.com] notice: /Stage[main]/
Github::Common_packages/Package[git]/ensure: would have
changed from '1:1.7.10-1+github12' to
'1:1.7.10-1+github13'
...

Yup, looks good

Next, you’d merge the pull request. If you stopped here, the code would gradually roll out to
all affected nodes over the next hour.

If you wanted the rollout to happen faster than that, you could force a puppet run on the
affected class of nodes

** [out :: fs1a.rs.github.com] info: Applying
'8fb1a2716d5f950b836e511471a2bdac3ed27090'
** [out :: fs7b.rs.github.com] info: Applying
'8fb1a2716d5f950b836e511471a2bdac3ed27090'
** [out :: fs1a.rs.github.com] notice: /Stage[main]/
Github::Common_packages/Package[git]/ensure: ensure
changed '1:1.7.10-1+github12' to '1:1.7.10-1+github13'
** [out :: fs7b.rs.github.com] notice: /Stage[main]/
Github::Common_packages/Package[git]/ensure: ensure
changed '1:1.7.10-1+github12' to '1:1.7.10-1+github13'
...

Yup, that looks good.

Then you’d probably want to check out load to make sure nothing went crazy

...and maybe check some logs or other related metrics to confirm your change didn’t break
something

ChatOps
How we interact with Puppet via Hubot is a great example of a core principle of how we do
ops at GitHub. We’ve been calling it ChatOps recently.

Essentially, ChatOps is the result of Hubot becoming sentient, and decreeing, among other
things, that we now address him as “Supreme Leader” and communicate with our
infrastructure through his secure channels alone.

We occasionally observe him speaking in tongues that sound eerily like YouTube comments.

 Hubot
Actually, that’s not it at all. Hubot is the star of our Ops team.

heaven
shell Hubot janky
graphme
We use Hubot day in, day out to interact with other simple tools we’ve written, over JSON APIs.

ALL OF
THE APIS
Hubot, shell, heaven, janky, graphme
Hubot interacts nicely with tons of external APIs too. If you have a JSON API, making your
service work with Hubot is a piece of cake.
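As a rough illustration of why a JSON API makes this easy, the sketch below turns a build-status payload into a chat message shaped like the CI reply shown earlier. Hubot scripts themselves are written in CoffeeScript or JavaScript; Ruby is used here only to match the other code in this talk. The payload field names (`number`, `sha`, `branch`, `green`, `duration`) are assumptions, not the real janky API.

```ruby
require 'json'

# Hypothetical formatter: turns a JSON build-status payload into one chat line.
# All field names are illustrative assumptions.
def build_summary(json)
  build = JSON.parse(json)
  outcome = build['green'] ? 'was successful' : 'failed'
  "Build ##{build['number']} (#{build['sha']}) of #{build['branch']} " \
    "#{outcome} (#{build['duration']}s)"
end
```

The chat bot's job is then little more than fetching the payload over HTTP and posting the resulting string into the room.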

Why is this stupid
chat bot so
important to Ops?
But why do we obsess about Hubot so much? It’s just a chat bot, right?

There are some distinct upsides to this approach we’ve noticed as our use of Hubot in Ops
has grown

Remember the flow I just showed you for rolling out puppet changes to our infrastructure?

Everyone sees all
of that happen
on their first day
Everyone sees all of this happen from the minute they join GitHub. It’s right there, in the Ops
room, right in the middle of the conversation in Campfire.

You don’t just see how to roll out puppet, you see how to...

hubot ci status github/smoke-perf

check the status of a branch’s last build

hubot deploy github/smoke-perf to prod/fe1

deploy any branch of any github app to any server

hubot graph me -10min @app-perf

get graphs of the app’s recent performance

hubot procs unicorn

check the status of unicorns across all frontends

hubot resque critical

check the status of the resque critical queue

hubot graph me -10min @collectd.load(fe*)

check load on the frontends

hubot conns fe1

check current connections to a frontend that you suspect has a problem

hubot log me smoke fe1

grab smoke logs for that frontend and realize that you did, in fact, break it

hubot lbctl disable fe1

take it out of the load balancer

hubot status yellow Bad deploy. Reverting now.

update the status blog

hubot who’s on call

determine who is currently on call so you can apologize to them

hubot pingdom checks

check pingdom to make sure you haven’t broken everything

hubot upset me

chill yourself out really quick

hubot deploy github to prod/fe1

revert back to master on the busted frontend

hubot log me smoke fe1

verify things have returned to normal

hubot air drum me

get pumped up because you fixed it

hubot lbctl enable fe1

bring the fixed frontend back into the rotation

hubot status green All systems go.

clear alerts on the status page

hubot whois 4.9.23.22

Once the outage has been resolved, you might see how to grab whois information for an IP
that exhibited suspicious activity in the logs you saw

hubot khanify spammers

and how to hit meme generator to make a joke when you realize that IP is a spammer

hubot play in the air tonight

then someone would queue up the song that popped into their head when they thought
about drums and gorillas at the same time

hubot tweet@github PuppetConf Drinkup Friday
night at 8:30 at Zeke’s
(3rd & Brannan)

and then finish it all off with a tweet about the Drinkup we’re throwing Friday night

ChatOps
ChatOps means building tools that make it easier to operate your infrastructure via Hubot
than via Terminal or Chrome

By placing tools
directly in the
middle of the
conversation
Because...

Everyone
is pairing
all of the time
This is the core concept behind ChatOps.

Teaching by

doing
Teaching by doing is awesome

This was always my main
motivation with hubot - teaching
by doing by making things
visible. It's an extremely
powerful teaching
technique - @rtomayko
Ryan Tomayko had this in mind from the very first commits to Hubot, which just presented a
simple wrapper around a repository of shell scripts we use for managing and monitoring
our infrastructure.

This is how I respond to “how do I do X” questions in Campfire now.

If there’s not yet Hubot functionality to do a thing, we try to write it.

Communicate
by

doing
Placing tools in the middle of the conversation also means you get communication of your
work for free.

If you’re doing something in a shell or on a website, you have to do it, then tell people about
it. If you do it with hubot, that comes free.

THINGS I
HAVEN’T ASKED
RECENTLY
For example, here are a few things I haven’t asked recently because Hubot has told me the
answer

how’s that deploy going?
are you deploying that or should i?
is anyone responding to that nagios alert?
is that branch green?
how does load look?
did anyone update the status page?
did that deploy finish?
Free communication is especially crucial in a distributed environment.

Our Ops team is entirely remote, so Campfire is our default means of communication.

http://www.flickr.com/photos/7997249@N06/6061305639/
This is extremely helpful during outages or other situations that require tactical response.

You don’t have to SAY that you’re spraying water on the fire, people SEE you doing it.

Hide the

ugly
Another awesome benefit of ChatOps-ing all of the things is that you can hide ugly interfaces
and design exactly the interaction you want with some simple porcelain commands

My favorite example of this is the ugliest of the ugly, Nagios.

[nines] hubot opened issue #4263: Nagios
(229906) - fs3b/syslog - Tue Sept 25 23:40:18
PDT 2012. github/nines#4263

Hubot politely delivers nagios alerts directly into chat

hubot nagios ack fs3b/syslog

# fix stuff

nagios check fs3b/syslog

nagios status fs3b/syslog

hubot nagios downtime fs3b/syslog 90

nagios mute fs3b/syslog

nagios unmute fs3b/syslog
Which we can interact with without any unnecessary eye bleeding. Making this easy means
developers and other ops engineers actually mute or schedule downtime when they’re testing
things.
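Porcelain commands like these are easy to parse into a structured request before handing them to whatever actually talks to Nagios. The Ruby sketch below is a guess at that first step, not the real Hubot script: it assumes `fs3b/syslog` means host/service and that the trailing number on `downtime` is a duration whose unit the talk does not state.

```ruby
# Hypothetical parser for the chat commands shown above.
# Assumptions: "fs3b/syslog" is host/service; the optional trailing
# number is a duration (unit not specified in the talk).
NAGIOS_COMMAND = %r{
  \A(?:hubot\s+)?nagios\s+
  (ack|check|status|downtime|mute|unmute)\s+  # action
  ([^/\s]+)/(\S+)                             # host/service
  (?:\s+(\d+))?                               # optional duration
  \z
}x

# Returns a hash describing the request, or nil for unrecognized input.
def parse_nagios_command(text)
  match = NAGIOS_COMMAND.match(text.strip)
  return nil unless match

  {
    action:   match[1],
    host:     match[2],
    service:  match[3],
    duration: match[4] && match[4].to_i
  }
end
```

Keeping the parsing separate from the Nagios call means the chat syntax can stay friendly even if the backend interface is not.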

Mobile
FTW
Yet another awesome benefit of ChatOps is that you get mobile support for free

Well, that is, if you have a team of awesome iOS developers that have built an actually
functioning Campfire client for the iPhone

This lets you do anything hubot can do from your phone.

Which means from your couch. Or your bed. Or a beach in Hawaii.

Which means you can fix a lot of things without pulling your laptop out of your bag.

ChatOps
That’s ChatOps at its finest.

And now for
something
completely
different
While I’m showing off mobile stuff, I thought I’d slip in a demo of something else we’ve done
to make Ops more mobile friendly.

We’ve hacked together support for PagerDuty alerts via Apple Push Notifications. When you
swipe on the alert, you go directly to the PagerDuty mobile UI for an incident

Boom
I can’t even begin to tell you how happy this makes me, and how much less shitty it makes being
on-call

So, who better to summarize all of this than Hubot himself. I asked him what he thought
about ChatOps. Here’s what he said:

ChatOps all the things.

Listen to what Hubot said. You’ll love it. Your ops team will love it.

And you’ll help other developers learn how to interact with ops tools without any additional
work.

That’s awesome.

Work at GitHub
jesse@github.com

If you can’t ChatOps all the things at your gig now, you could always just come work with me
at GitHub.

Shoot me an email if you’re interested.

Thanks!

That’s all I have. Thanks for listening! Any questions?

Tomorrow @ 8:30 PM

Zeke’s

3rd & Brannan
While I still have everyone’s attention, I wanted to mention the GitHub Drinkup we’re
throwing for PuppetConf again. It’s tomorrow night at 8:30pm at Zeke’s, which is on the
corner of 3rd and Brannan. Everyone’s invited. I’ll see you there.

Thanks again!
