How TubeMogul reached 10,000 Puppet deployment in one year
May 26th, 2015
Nicolas Brousse | Sr. Director Of Operations Eng...

Puppet Camp Silicon Valley 2015: How TubeMogul reached 10,000 Puppet Deployment in one year


TubeMogul grew from a few servers to over two thousand servers, handling over one trillion HTTP requests a month, each processed in less than 50 ms. To keep up with this fast growth, the SRE team had to implement an efficient Continuous Delivery infrastructure that allowed over 10,000 Puppet deployments and 8,500 application deployments in 2014. In this presentation, we will cover the nuts and bolts of the TubeMogul operations engineering team and how they overcome challenges.

Published in: Engineering

  1. How TubeMogul reached 10,000 Puppet deployment in one year
     May 26th, 2015
     Nicolas Brousse | Sr. Director Of Operations Engineering | nicolas@tubemogul.com
     Julien Fabre | Site Reliability Engineer | julien.fabre@tubemogul.com
  2. Who are we? TubeMogul
     ● Enterprise software company for digital branding
     ● Over 27 billion ads served in 2014
     ● Over 30 billion ad auctions per day
     ● Bids processed in less than 50 ms
     ● Bids served in less than 80 ms (including network round trip)
     ● 5 PB of monthly video traffic served
     ● 1.3 EB of data stored
  3. Who are we? Operations Engineering
     ● Ensure the smooth day-to-day operation of the platform infrastructure
     ● Provide a cost-effective and cutting-edge infrastructure
     ● Team composed of SREs, SEs and DBAs
     ● Managing over 2,500 servers (virtual and physical)
  4. Our Infrastructure
     [Diagram labels: Public Cloud, On Premises]
     Multiple locations with a mix of Public Cloud and On Premises
  5. Technology Hoarders
     ● Java (a lot!)
     ● MySQL
     ● Couchbase
     ● Vertica
     ● Kafka
     ● Storm
     ● Zookeeper, Exhibitor
     ● Hadoop, HBase, Hive
     ● Terracotta
     ● ElasticSearch, Kibana
     ● LogStash
     ● PHP, Python, Ruby, Go...
     ● Apache httpd
     ● Nagios
     ● Ganglia
     ● Graphite
     ● Memcached
     ● Puppet
     ● HAproxy
     ● OpenStack
     ● Git and Gerrit
     ● Gor
     ● ActiveMQ
     ● OpenLDAP
     ● Redis
     ● Blackbox
     ● Jenkins, Sonar
     ● Tomcat
     ● Jetty (embedded)
     ● AWS DynamoDB, EC2, S3...
  6. Five Years Of Puppet!
     ● 2008 - 2010: Use SVN, Bash scripts and custom templates.
     ● 2010: Managing about 250 instances. Start looking at Puppet.
     ● 2011: Puppet 0.25 then 2.7 by EOY on 400 servers with 2 contributors.
     ● 2012: 800 servers managed by Puppet. 4 contributors.
     ● 2013: 1,000 servers managed by Puppet. 6 contributors.
     ● 2014: 1,500 servers managed by Puppet. Introduced Continuous Delivery Workflow. 9 contributors. Start 3.7 migration.
     ● 2015: 2,000 servers managed by Puppet. 13 contributors.
  7. Puppet Stats
     ● 2,000 nodes
     ● 225 unique node definitions
     ● 1 puppetmaster
     ● 112 Puppet modules
  8. Where and how do we use Puppet?
     ● Virtual and Physical Servers Configuration : Master mode
     ● Building AWS AMI with Packer : Master mode
     ● Local development environment with Vagrant : Master mode
     ● OpenStack deployment : Masterless mode
  9. Code Review?
  10. A Powerful Gerrit Integration
      ● Gerrit, an industry standard : Eclipse, Google, Chromium, OpenStack, WikiMedia, LibreOffice, Spotify, GlusterFS, etc...
      ● Fine Grained Permissions Rules
      ● Plugged to LDAP
      ● Code Review per commit
      ● Stream Events
      ● Use GitBlit
      ● Integrated with Jenkins and Jira
      ● Managing about 600 Git repositories
  11. Gerrit in Action
      Verify -1 when there is no ticket # or the change doesn't pass Jenkins code validation
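The voting rule on this slide can be sketched in a few lines of Ruby. This is only an illustration, not the team's actual Gerrit or Jenkins hook: the Jira-style ticket pattern (`OPS-1234`), the function name `verified_vote`, and the +1 returned on success are all assumptions, since the slide only states when a -1 is given.

```ruby
# Hypothetical ticket pattern: a Jira-style key such as "OPS-1234".
# The real ticket format is not given in the slides.
TICKET_PATTERN = /\b[A-Z][A-Z0-9]+-\d+\b/

# Returns -1 when the commit message has no ticket reference or when
# Jenkins validation failed; returns +1 otherwise (the +1 is an assumption).
def verified_vote(commit_message, jenkins_passed)
  has_ticket = commit_message.match?(TICKET_PATTERN)
  has_ticket && jenkins_passed ? 1 : -1
end

puts verified_vote("OPS-1234 Add collectd profile", true)
puts verified_vote("Add collectd profile", true)
puts verified_vote("OPS-1234 Add collectd profile", false)
```

Keeping the check as a pure function of the commit message and the build result makes it easy to unit test before wiring it into a CI job.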
  12. Continuous Delivery with Jenkins
      ● 1 job per module
      ● 1 job for the manifests and hiera data
      ● 1 job for the Puppet fileserver
      ● 1 job to deploy
      Global Jenkins stats for the past year
      ● ~10,000 Puppet deployments
      ● Over 8,500 production app deployments
  13. Jenkins job DSL : code your Jenkins jobs
      Plugin : github.com/jenkinsci/job-dsl-plugin
      ● Automate job creation
      ● Ensure a standard across all the jobs
      ● Version the configuration
      ● Apply changes to all your jobs without pain
      ● Test your configuration changes
  14. Team Awareness: HipChat Integration with Hubot
  15. Puppet Continuous Delivery Workflow: The Vision
      Infrastructure As Code
      ● Follow standard development lifecycle
      ● Repeatable and consistent server provisioning
      Continuous Delivery
      ● Iterate quickly
      ● Automated code review to improve code quality
      Reliability
      ● Improve Production Stability
      ● Enforce Better Security Practices
  16. The Workflow
  17. The Workflow : Puppet code logic
      Puppet environments
      ● Dedicated node manifests (*.pp)
      ● Modules deployed by branch with Git submodules
      All the data in Hiera
      ● Try to avoid params.pp class
      ● Store everything : module parameters, classes, keys, passwords, ...
  18. Puppet Code Hierarchy

          /etc/puppet
          ├── puppet.conf, hiera.yaml, *.conf
          ├── hiera
          └── environments
              ├── dev
              │   ├── manifests
              │   │   ├── nodes/*.pp
              │   │   └── site.pp
              │   └── modules          (Git submodules, branch dev)
              │       ├── activemq
              │       ...
              │       └── zookeeper
              └── production
                  ├── manifests
                  │   ├── nodes/*.pp
                  │   └── site.pp
                  └── modules          (Git submodules, branch production)
                      ├── activemq
                      …
                      └── zookeeper
  19. Hiera Configuration

          $ cat /etc/puppet/hiera.yaml
          ---
          :backends:
            - eyaml
            - yaml
          :yaml:
            :datadir: /etc/puppet/hiera
          :eyaml:
            :pkcs7_private_key: /var/lib/puppet/hiera_keys/private_key.pkcs7.pem
            :pkcs7_public_key: /var/lib/puppet/hiera_keys/public_key.pkcs7.pem
          :hierarchy:
            - fqdn/%{::fqdn}
            - "%{::zone}/%{::vpc}/%{::hostgroup}"
            - "%{::zone}/%{::vpc}/all"
            - "%{::zone}/%{::hostgroup}"
            - "%{::zone}/all"
            - hostname/%{::hostname}
            - hostgroup/%{::hostgroup}
            - environment/%{::environment}
            - common
          :merge_behavior: deeper
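To make the lookup order in this hierarchy concrete, here is a minimal Ruby sketch that expands each level with a node's facts and prints the resulting data file paths, most specific first. It is an illustration only, not Hiera's implementation: the fact values are made up, the `.yaml` suffix applies to the yaml backend, and skipping levels whose facts are missing is a simplification chosen for readability.

```ruby
# Hierarchy levels copied from the hiera.yaml on this slide.
HIERARCHY = [
  'fqdn/%{::fqdn}',
  '%{::zone}/%{::vpc}/%{::hostgroup}',
  '%{::zone}/%{::vpc}/all',
  '%{::zone}/%{::hostgroup}',
  '%{::zone}/all',
  'hostname/%{::hostname}',
  'hostgroup/%{::hostgroup}',
  'environment/%{::environment}',
  'common'
].freeze

# Expands every level with the given facts and returns candidate file paths
# in lookup order. Levels referencing an unknown fact are skipped here
# (a simplification of what Hiera itself does).
def lookup_paths(facts, datadir = '/etc/puppet/hiera')
  HIERARCHY.each_with_object([]) do |level, paths|
    missing = false
    expanded = level.gsub(/%\{::(\w+)\}/) do
      value = facts[Regexp.last_match(1)]
      missing = true if value.nil?
      value.to_s
    end
    paths << File.join(datadir, "#{expanded}.yaml") unless missing
  end
end

# Hypothetical facts for a node that has no vpc fact.
facts = {
  'fqdn'        => 'bidder01.example.com',
  'hostname'    => 'bidder01',
  'zone'        => 'us-east-1',
  'hostgroup'   => 'bidder',
  'environment' => 'production'
}
puts lookup_paths(facts)
```

With `:merge_behavior: deeper`, hash values found at several of these levels are merged rather than the first match simply winning, so the order mainly decides which level overrides which.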
  20. Encrypt Your Secrets
      Hiera eyaml : github.com/TomPoulton/hiera-eyaml
      ● Hiera backend
      ● Easy to use
      ● Powerful CLI : eyaml edit /etc/puppet/hiera/secrets.yaml

          $ cat secret.yaml
          ---
          ec2::access_key_id: ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMII
              IBHQIBADAFMAACAQEwDQYJKoZIhvcNAQEBBQAEggEAVIa28OwyaqI5N1TDCvVkBZz3YG+s+Hfzr0lqgcvRCIuJ
              Gpq28sQmmuBaQjWY38i86ZSFu0gM6saOHfG64OzVlurO7k/l0CKeL0JfXNaVM4TUqMaN9dSkL5e2vsmpLKrMAS
              awmarqbLYwllTrTe32H4NWxU1e+qWLeUMr9ciBnA3W1Azm4RIo+3bsvgvMfdks....=]
  21. Encrypt Files
      Blackbox : github.com/StackExchange/blackbox
      ● Use GPG to encrypt secret files
      ● Easy to add/delete team members
      ● No need to change your Puppet code!

          # modules/${module_name}/files/credentials.yaml.gpg
          file { '/etc/app/credentials.yaml':
            ensure => 'file',
            owner  => 'root',
            group  => 'root',
            mode   => '0644',
            # Double quotes so that ${module_name} is interpolated
            source => "puppet:///modules/${module_name}/credentials.yaml",
          }
  22. The Workflow
  23. The Workflow : bottlenecks
      ● Only Ops team members can commit (SRE, SE)
      ● Review and validation are done only by an SRE
      ● Jenkins will verify the code but will not validate the commit
      ● Static Puppet environments
      ● Rely a lot on server hostnames
  24. Puppet Workflow Reloaded!
      Flexibility : R10K (github.com/adrienthebo/r10k)
      ● Dynamic environments
      ● No Git submodules anymore! :-)
      ● Easy to reproduce any environment
      ● Can use private and Forge Puppet modules
      ● Can use branches and tags
      ● Based on Puppetfile
  25. R10K

          $ cat Puppetfile
          forge "https://forgeapi.puppetlabs.com"

          # Forge modules
          mod 'pdxcat/collectd'
          mod 'puppetlabs/rabbitmq'
          mod 'arioch/redis'
          mod 'maestrodev/wget'
          mod 'puppetlabs/apt'
          mod 'puppetlabs/stdlib'

          # Tubemogul modules
          mod "hosts",
            :git => 'ssh://<gerrit_host>/puppet/modules/hosts',
            :branch => 'dev'
          mod "timezone",
            :git => 'ssh://<gerrit_host>/puppet/modules/timezone',
            :branch => 'dev'
          ...
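A Puppetfile is plain Ruby: `forge` and `mod` are method calls, and the `:git` / `:branch` pairs are an ordinary options hash. The toy evaluator below illustrates that idea. It is emphatically not r10k's implementation; the class name `PuppetfileSketch` and the shape of the collected records are invented for this sketch.

```ruby
# Toy Puppetfile evaluator (hypothetical; r10k's real code is different).
class PuppetfileSketch
  attr_reader :forge_url, :modules

  def initialize
    @forge_url = nil
    @modules = []
  end

  # Called by the `forge "..."` line of the Puppetfile.
  def forge(url)
    @forge_url = url
  end

  # Called by each `mod` line; modules with a :git option come from Git,
  # all others are assumed to come from the Forge.
  def mod(name, opts = {})
    source = opts[:git] ? :git : :forge
    @modules << { name: name, source: source, branch: opts[:branch] }
  end

  # Evaluates Puppetfile text in the context of a fresh instance.
  def self.parse(text)
    sketch = new
    sketch.instance_eval(text)
    sketch
  end
end

PUPPETFILE = <<~'RUBY'
  forge "https://forgeapi.puppetlabs.com"
  mod 'puppetlabs/stdlib'
  mod "hosts",
    :git => 'ssh://<gerrit_host>/puppet/modules/hosts',
    :branch => 'dev'
RUBY

parsed = PuppetfileSketch.parse(PUPPETFILE)
parsed.modules.each do |m|
  puts "#{m[:name]} (#{m[:source]}, branch: #{m[:branch].inspect})"
end
```

Because each environment carries its own Puppetfile, switching the `:branch` values is enough to describe a different environment, which is what makes environments easy to reproduce.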
  26. Puppet Workflow Reloaded!
      Better code organization : Roles and Profiles
      ● Represent the business logic : Roles
        o Highest abstraction layer
        o Use Profiles for implementation
      ● Implement the applications : Profiles
        o Remove potential code duplication
        o Use modules and other Puppet resources
  27. Roles/Profiles Pattern

          class role::logs {
            include profile::base
            include profile::logstash::server
            include profile::elasticsearch
          }

          # Named profile::logstash::server to match the include above and the Hiera keys below
          class profile::logstash::server {
            $version    = hiera('profile::logstash::server::version', '1.4.2')
            $es_host    = hiera('profile::logstash::server::es_host', 'es01')
            $redis_host = hiera('profile::logstash::server::redis_host', 'redis01')

            class { 'logstash':
              package_url  => "https://download.elasticsearch.org/logstash/.../logstash_${version}.deb",
              java_install => true,
            }

            logstash::configfile { 'input_redis':
              content => template('logstash/configfile/logstash.input_redis.conf.erb'),
              order   => 10,
            }

            logstash::configfile { 'output_es':
              content => template('logstash/configfile/logstash.output_es.conf.erb'),
              order   => 30,
            }
          }
  28. Puppet Workflow Reloaded!
      Do not rely on hostname : nodeless approach
      ● Facts to guide Puppet
      ● No node myawesomeserver { } anymore
      ● Enforce a cluster vision
      ● site.pp gives the configuration logic

          # /etc/puppet/manifests/site.pp
          node default {
            if $::ec2_tag_tm_role {
              notify { "Using role : ${::ec2_tag_tm_role}": }
              include "role::${::ec2_tag_tm_role}"
            } else {
              fail('No role found. Nothing to configure.')
            }
          }
  29. AWS EC2 tags
      ● Specify tags during the provisioning
      ● Retrieve tags with AWS Ruby SDK and create facts
      ● New hierarchy

          $ facter -p | grep ec2_tag
          ec2_tag_cluster => rtb-bidder
          ec2_tag_nagios_host => mgmt01
          ec2_tag_name => bidder
          ec2_tag_pupenv => production
          ec2_tag_tm_role => rtb::bidder

          :hierarchy:
            - "%{::zone}/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}"
            - "%{::zone}/%{::ec2_tag_vpc}/all"
            - "%{::zone}/all"
            - vpc/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}
            - vpc/%{::ec2_tag_vpc}/all
            - environment/%{::environment}
            - common
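The "create facts" step can be sketched as a small Ruby helper that turns a hash of EC2 tags into `ec2_tag_*` fact names like those in the facter output on this slide. This is an assumption-laden illustration rather than the team's actual custom fact: the tag hash is hard-coded sample data (the AWS SDK call that would fetch it is deliberately left out), and the helper name and key-normalization rule are hypothetical.

```ruby
# Converts a Hash of EC2 tags into fact name/value pairs.
# Keys are lowercased and any character outside [a-z0-9_] becomes "_"
# (a normalization rule assumed for this sketch).
def ec2_tags_to_facts(tags)
  tags.each_with_object({}) do |(key, value), facts|
    fact_name = "ec2_tag_#{key.downcase.gsub(/[^a-z0-9_]/, '_')}"
    facts[fact_name] = value
  end
end

# Sample tags; in a real custom fact these would be retrieved with the
# AWS Ruby SDK for the current instance.
sample_tags = {
  'Name'    => 'bidder',
  'cluster' => 'rtb-bidder',
  'tm_role' => 'rtb::bidder'
}

facts = ec2_tags_to_facts(sample_tags)
facts.each { |name, value| puts "#{name} => #{value}" }

# The nodeless site.pp then includes the class "role::<ec2_tag_tm_role>".
puts "role::#{facts['ec2_tag_tm_role']}"

# When loaded by Facter as a custom fact, register each pair as a fact.
if defined?(Facter)
  facts.each do |name, value|
    Facter.add(name) { setcode { value } }
  end
end
```

Deriving the role from a tag rather than from the hostname means a replacement instance only needs the right tags at provisioning time to receive the same configuration.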
  30. New merging and reviewing rules
      ● Everyone can commit Puppet code
      ● Allow everyone to review a Puppet change (+1)
      ● Allow SE and SRE to validate a Puppet change (+2)
      ● Auto validation/merging in dev if at least 80% of test (+2)
  31. Next improvements
      ● Acceptance testing with Beaker and Docker
      ● Full test provisioning with ServerSpec
      ● PuppetDB to improve the reporting
      ● Dedicated Puppet Masters
  32. Nicolas Brousse (@orieg)
      Julien Fabre (@julien_fabre)
