Improving Operations Efficiency with Puppet
April 17th, 2015
Nicolas Brousse | Sr. Director Of Operations Engineering | nicolas@tubemogul.com
Julien Fabre | Site Reliability Engineer | julien.fabre@tubemogul.com
Who are we?
TubeMogul
●  Enterprise software company for digital branding
●  Over 27 billion ads served in 2014
●  Over 30 billion ad auctions per day
●  Bids processed in less than 50 ms
●  Bids served in less than 80 ms (including network round trip)
●  5 PB of monthly video traffic served
●  1.1 EB of data stored
Operations Engineering
●  Ensure the smooth day-to-day operation of the platform infrastructure
●  Provide a cost-effective and cutting-edge infrastructure
●  Team composed of SREs, SEs and DBAs
●  Managing over 2,500 servers (virtual and physical)
Our Infrastructure
Public Cloud On Premises
Multiple locations with a mix of Public Cloud and On Premises
Technology Hoarders
●  Java (a lot!)
●  MySQL
●  Couchbase
●  Vertica
●  Kafka
●  Storm
●  Zookeeper, Exhibitor
●  Hadoop, HBase, Hive
●  Terracotta
●  ElasticSearch, Kibana
●  LogStash
●  PHP, Python, Ruby, Go...
●  Apache httpd
●  Nagios
●  Ganglia
●  Graphite
●  Memcached
●  Puppet
●  HAproxy
●  OpenStack
●  Git and Gerrit
●  Gor
●  ActiveMQ
●  OpenLDAP
●  Redis
●  Blackbox
●  Jenkins, Sonar
●  Tomcat
●  Jetty (embedded)
●  AWS DynamoDB, EC2, S3...
Five Years Of Puppet!
●  2008 - 2010: Used SVN, Bash scripts and custom templates.
●  2010: Managing about 250 instances. Started looking at Puppet.
●  2011: Started with Puppet 0.25, then upgraded to 2.7 by EOY on 400 servers with 2 contributors.
●  2012: 800 servers managed by Puppet. 4 contributors.
●  2013: 1,000 servers managed by Puppet. 6 contributors.
●  2014: 1,500 servers managed by Puppet. Workflow using Git, Gerrit and Jenkins. 9 contributors. Started migration to 3.7.
●  2015: 2,000 servers managed by Puppet. 13 contributors.
Puppet Stats
●  2,000 nodes
●  225 unique node definitions
●  1 puppetmaster
●  112 Puppet modules
Where and how do we use Puppet?
●  Virtual and Physical Servers Configuration : Master mode
●  Building AWS AMI with Packer : Master mode
●  Local development environment with Vagrant : Master mode
●  OpenStack deployment : Masterless mode
Code Review?
A Powerful Gerrit Integration
●  Gerrit, an industry standard : Eclipse, Google, Chromium, OpenStack, Wikimedia, LibreOffice, Spotify, GlusterFS, etc...
●  Fine-grained permission rules
●  Plugged into LDAP
●  Code review per commit
●  Stream events
●  Uses Gitblit
●  Integrated with Jenkins and Jira
●  Managing about 600 Git repositories
Gerrit in Action
Continuous Delivery with Jenkins
●  1 job per module
●  1 job for the manifests and hiera data
●  1 job for the Puppet fileserver
●  1 job to deploy
Global Jenkins stats for the past year
●  ~10,000 Puppet deployments
●  Over 8,500 production app deployments
Team Awareness: HipChat Integration with Hubot
Puppet Continuous Delivery Workflow: The Vision
Infrastructure As Code
●  Follow standard development lifecycle
●  Repeatable and consistent server provisioning
Continuous Delivery
●  Iterate quickly
●  Automated code review to improve code quality
Reliability
●  Improve Production Stability
●  Enforce Better Security Practices
The Workflow
The Workflow : Puppet code logic
Puppet environments
●  Dedicated node manifests (*.pp)
●  Modules deployed by branch with Git submodules
All the data in Hiera
●  Try to avoid the params.pp class
●  Store everything : module parameters, classes, keys, passwords, ...
Puppet Code Hierarchy
/etc/puppet
├── puppet.conf, hiera.yaml, *.conf
├── hiera
└── environments
    ├── dev
    │   ├── manifests
    │   │   ├── nodes/*.pp
    │   │   └── site.pp
    │   └── modules              (Git submodules, branch dev)
    │       ├── activemq
    │       ├── apache
    │       ├── apf
    │       ...
    │       └── zookeeper
    └── production
        ├── manifests
        │   ├── nodes/*.pp
        │   └── site.pp
        └── modules              (Git submodules, branch production)
            ├── activemq
            …
            └── zookeeper
Hiera Configuration
$ cat /etc/puppet/hiera.yaml
---
:backends:
  - eyaml
  - yaml
:yaml:
  :datadir: /etc/puppet/hiera
:eyaml:
  :datadir: /etc/puppet/hiera
  :extension: 'yaml'
  :pkcs7_private_key: /var/lib/puppet/hiera_keys/private_key.pkcs7.pem
  :pkcs7_public_key: /var/lib/puppet/hiera_keys/public_key.pkcs7.pem
:hierarchy:
  - fqdn/%{::fqdn}
  - "%{::zone}/%{::vpc}/%{::hostgroup}"
  - "%{::zone}/%{::vpc}/all"
  - "%{::zone}/%{::hostgroup}"
  - "%{::zone}/all"
  - hostname/%{::hostname}
  - hostgroup/%{::hostgroup}
  - environment/%{::environment}
  - common
:merge_behavior: deeper
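To make the lookup order above concrete, here is a minimal Ruby sketch that expands the hierarchy into the data files consulted for one node, from most to least specific. It is an illustration only, not Hiera's implementation: `candidate_files` is a hypothetical helper, the fact values are made up, and skipping levels whose facts are missing is a simplification chosen for this sketch.

```ruby
# Sketch: expand a Hiera-style hierarchy into candidate data files.
# candidate_files is a hypothetical helper, not part of Hiera.

HIERARCHY = [
  'fqdn/%{::fqdn}',
  '%{::zone}/%{::vpc}/%{::hostgroup}',
  '%{::zone}/%{::vpc}/all',
  '%{::zone}/%{::hostgroup}',
  '%{::zone}/all',
  'hostname/%{::hostname}',
  'hostgroup/%{::hostgroup}',
  'environment/%{::environment}',
  'common'
].freeze

# Returns data file paths in lookup order (most specific first).
# Levels that reference a missing or empty fact are skipped
# (a simplification made for this sketch).
def candidate_files(hierarchy, facts, datadir: '/etc/puppet/hiera', extension: 'yaml')
  hierarchy.each_with_object([]) do |level, files|
    missing = false
    path = level.gsub(/%\{(?:::)?(\w+)\}/) do
      value = facts[Regexp.last_match(1)]
      missing = true if value.nil? || value.empty?
      value.to_s
    end
    files << File.join(datadir, "#{path}.#{extension}") unless missing
  end
end

# Hypothetical facts for one node (no vpc fact set).
example_facts = {
  'fqdn'        => 'bidder01.example.com',
  'hostname'    => 'bidder01',
  'zone'        => 'us-east-1',
  'hostgroup'   => 'bidder',
  'environment' => 'production'
}
puts candidate_files(HIERARCHY, example_facts)
```

With `:merge_behavior: deeper`, hash values found in several of these files are merged rather than the first match winning outright, so the order mainly decides which value takes precedence on conflicts.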
Hiera eyaml : github.com/TomPoulton/hiera-eyaml
●  Hiera backend
●  Easy to use
●  Powerful CLI : eyaml edit /etc/puppet/hiera/secrets.yaml
Encrypt Your Secrets
$ cat secret.yaml
---
ec2::access_key_id: ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMII
IBHQIBADAFMAACAQEwDQYJKoZIhvcNAQEBBQAEggEAVIa28OwyaqI5N1TDCvVkBZz3YG+s
+Hfzr0lqgcvRCIuJGpq28sQmmuBaQjWY38i86ZSFu0gM6saOHfG64OzVlurO7k/
l0CKeL0JfXNaVM4TUqMaN9dSkL5e2vsmpLKrMASawmarqbLYwllTrTe32H4NWxU1e
+qWLeUMr9ciBnA3W1Azm4RIo+3bsvgvMfdks....=]
Encrypt Files
Blackbox : github.com/StackExchange/blackbox
●  Use GPG to encrypt secret files
●  Easy to add/delete team members
●  No need to change your Puppet code !
# modules/${module_name}/files/credentials.yaml.gpg
file { '/etc/app/credentials.yaml':
  ensure => 'file',
  owner  => 'root',
  group  => 'root',
  mode   => '0644',
  # Double quotes are required for ${module_name} to be interpolated
  source => "puppet:///modules/${module_name}/credentials.yaml",
}
The Workflow
The Workflow : bottlenecks
●  Only Ops team members can commit (SRE, SE)
●  Review and validation are done only by an SRE
●  Jenkins will verify the code but will not validate the commit
●  Static Puppet environments
●  Heavy reliance on server hostnames
Puppet Workflow Reloaded!
Flexibility : R10K github.com/adrienthebo/r10k
●  Dynamic environments
●  No more Git submodules! :-)
●  Easy to reproduce any environment
●  Can use private and Forge Puppet modules
●  Can use branches and tags
●  Based on Puppetfile
R10K
$ cat Puppetfile
forge "https://forgeapi.puppetlabs.com"

# Forge modules
mod 'pdxcat/collectd'
mod 'puppetlabs/rabbitmq'
mod 'arioch/redis'
mod 'maestrodev/wget'
mod 'puppetlabs/apt'
mod 'puppetlabs/stdlib'

# TubeMogul modules
mod "hosts",
  :git    => 'ssh://<gerrit_host>/puppet/modules/hosts',
  :branch => 'dev'
mod "timezone",
  :git    => 'ssh://<gerrit_host>/puppet/modules/timezone',
  :branch => 'dev'
...
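A Puppetfile is plain Ruby, so a tool can read it by evaluating it against an object that defines `forge` and `mod`. The sketch below illustrates that idea only; `PuppetfileSketch` is a hypothetical class, not r10k's code, and `gerrit.example.com` stands in for the elided Gerrit host.

```ruby
# Minimal sketch of reading a Puppetfile as a Ruby DSL.
# PuppetfileSketch is hypothetical and is not how r10k is implemented.
class PuppetfileSketch
  attr_reader :forge_url, :modules

  def initialize
    @forge_url = nil
    @modules = []
  end

  # Handles the `forge "..."` line.
  def forge(url)
    @forge_url = url
  end

  # Handles each `mod` line. The second argument is either absent,
  # a version string, or an options hash with :git and :branch / :tag.
  def mod(name, args = nil)
    options = args.is_a?(Hash) ? args : { version: args }.compact
    source = options.key?(:git) ? :git : :forge
    @modules << { name: name, source: source }.merge(options)
  end

  # Evaluates the Puppetfile text with `forge` and `mod` in scope.
  # instance_eval runs arbitrary Ruby, so only use it on trusted files.
  def self.parse(text)
    new.tap { |puppetfile| puppetfile.instance_eval(text) }
  end
end

# Shortened example; the Git host is a placeholder.
EXAMPLE_PUPPETFILE = <<~'PUPPETFILE'
  forge "https://forgeapi.puppetlabs.com"
  mod 'puppetlabs/stdlib'
  mod "hosts",
    :git    => 'ssh://gerrit.example.com/puppet/modules/hosts',
    :branch => 'dev'
PUPPETFILE

parsed = PuppetfileSketch.parse(EXAMPLE_PUPPETFILE)
parsed.modules.each { |m| puts m.inspect }
```

Because each Git-hosted module carries its own `:branch`, switching an environment from `dev` to `production` is a change to the Puppetfile rather than to a set of submodule pointers.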
Puppet Workflow Reloaded!
Better code organization : Roles and Profiles
●  Represent the business logic : Roles
    o  Highest abstraction layer
    o  Use Profiles for implementation
●  Implement the applications : Profiles
    o  Remove potential code duplication
    o  Use modules and other Puppet resources
Roles/Profiles Pattern
class role::logs {
  include profile::base
  include profile::logstash::server
  include profile::elasticsearch
}

# Named profile::logstash::server to match the include above and the Hiera keys below
class profile::logstash::server {
  # Hiera lookups with fallback defaults
  $version    = hiera('profile::logstash::server::version', '1.4.2')
  $es_host    = hiera('profile::logstash::server::es_host', 'es01')
  $redis_host = hiera('profile::logstash::server::redis_host', 'redis01')

  class { 'logstash':
    package_url  => "https://download.elasticsearch.org/logstash/.../logstash_${version}.deb",
    java_install => true,
  }

  logstash::configfile { 'input_redis':
    content => template('logstash/configfile/logstash.input_redis.conf.erb'),
    order   => 10,
  }

  logstash::configfile { 'output_es':
    content => template('logstash/configfile/logstash.output_es.conf.erb'),
    order   => 30,
  }
}
Puppet Workflow Reloaded!
Do not rely on hostname : nodeless approach
●  Facts to guide Puppet
●  No more node myawesomeserver { } definitions
●  Enforce a cluster vision
●  site.pp gives the configuration logic
# /etc/puppet/manifests/site.pp
node default {
  if $::ec2_tag_tm_role {
    notify { "Using role : ${::ec2_tag_tm_role}": }
    include "role::${::ec2_tag_tm_role}"
  } else {
    fail('No role found. Nothing to configure.')
  }
}
AWS EC2 tags
●  Specify tags during provisioning
●  Retrieve tags with AWS Ruby SDK and create facts
●  New hierarchy
$ facter -p | grep ec2_tag
ec2_tag_cluster => rtb-bidder
ec2_tag_nagios_host => mgmt01
ec2_tag_name => bidder
ec2_tag_pupenv => production
ec2_tag_tm_role => rtb::bidder
:hierarchy:
  - "%{::zone}/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}"
  - "%{::zone}/%{::ec2_tag_vpc}/all"
  - "%{::zone}/all"
  - vpc/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}
  - vpc/%{::ec2_tag_vpc}/all
  - environment/%{::environment}
  - common
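The tag-to-fact step can be sketched in a few lines of Ruby. This is an assumption-laden illustration, not TubeMogul's custom fact: the tags are passed in as a plain hash instead of being fetched with the AWS SDK, `tags_to_facts` and `role_class` are hypothetical helpers, and the name normalization (lowercase, other characters replaced by `_`) is a guess that merely reproduces the fact names shown above.

```ruby
# Sketch: derive ec2_tag_* facts from EC2 instance tags, then pick the
# role class the way site.pp does. Both helpers are hypothetical.

# Maps a tag hash such as { 'Name' => 'bidder' } to fact names and values.
# Assumed normalization: lowercase, non [a-z0-9_] characters become '_'.
def tags_to_facts(tags)
  tags.each_with_object({}) do |(key, value), facts|
    fact_name = 'ec2_tag_' + key.downcase.gsub(/[^a-z0-9_]/, '_')
    facts[fact_name] = value
  end
end

# Mirrors the site.pp logic: include role::<tm_role>, or fail when unset.
def role_class(facts)
  role = facts['ec2_tag_tm_role']
  raise 'No role found. Nothing to configure.' if role.nil? || role.empty?

  "role::#{role}"
end

# Tag values taken from the facter output above; the tag keys are assumed.
example_tags = {
  'Name'        => 'bidder',
  'cluster'     => 'rtb-bidder',
  'nagios_host' => 'mgmt01',
  'pupenv'      => 'production',
  'tm_role'     => 'rtb::bidder'
}

example_facts = tags_to_facts(example_tags)
example_facts.each { |name, value| puts "#{name} => #{value}" }
puts role_class(example_facts)
```

In a real custom fact, each pair would be registered with `Facter.add(name) { setcode { value } }` after the tags have been retrieved for the running instance.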
New merging and reviewing rules
●  Everyone can commit Puppet code
●  Allow everyone to review a Puppet change (+1)
●  Allow SE and SRE to validate a Puppet change (+2)
●  Auto validation/merging in dev if at least 80% of test (+2)
Next improvements
●  Acceptance testing with Beaker and Docker
●  Full test provisioning with ServerSpec
●  PuppetDB to improve the reporting
●  Dedicated Puppet Masters
OpenSource Modules
●  tubemogul-aptly
●  tubemogul-blackbox
●  tubemogul-codedeploy
●  tubemogul-gor
●  tubemogul-packer
●  tubemogul-tmfile
●  tubemogul-storm
●  tubemogul-kafka
Nicolas Brousse | @orieg
Julien Fabre | @julien_fabre
