                  Lessons Learned Scaling to 5000 Puppet Agents
                                    Russ Johnson
                               rjohnson@stubhub.com
                                   @professoruss




#whoami

    Started out in the mid 90’s
    Recovering Windows Admin
    Storage Guy
    Datacenter Monkey
    Once-upon-a-time network guy
    At StubHub since December 2006
    Working on Puppet adoption in a crazy infrastructure
    Puppet certified


April 9, 2013                                              2
4 Puppet Masters




    DEV/QA, PROD, DR, CORP

The road to sanity




Set up your master properly

            Apache/Passenger
            Tune Passenger
                PassengerMaxPoolSize 32
                PassengerMinInstances 4
                PassengerMaxRequests 10000
                PassengerStatThrottleRate 30


            16 cores, 32GB
            load average: 4.03, 3.71, 3.45
            4000+ agents
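
A back-of-the-envelope check on the numbers above, as a sketch only. It assumes agents check in at Puppet's default `runinterval` of 30 minutes and that runs are spread evenly; neither is stated on the slide, and the function names are hypothetical.

```python
# Rough capacity check for a Passenger-fronted Puppet master.
# Assumptions (not from the slide): default 30-minute runinterval,
# agent runs spread evenly across the interval.

AGENTS = 4000               # "4000+ agents" from the slide
RUN_INTERVAL_S = 30 * 60    # assumed: Puppet's default runinterval
POOL_SIZE = 32              # PassengerMaxPoolSize from the slide


def catalog_requests_per_second(agents: int, interval_s: int) -> float:
    """Average catalog requests per second if runs are evenly spread."""
    return agents / interval_s


def worker_budget_seconds(pool_size: int, rate: float) -> float:
    """Average time a request may occupy a worker before the pool saturates."""
    return pool_size / rate


rate = catalog_requests_per_second(AGENTS, RUN_INTERVAL_S)
budget = worker_budget_seconds(POOL_SIZE, rate)
print(f"{rate:.2f} catalog requests/s, {budget:.1f} s of worker time per request")
```

An agent run also makes file and report requests, so the real budget is tighter than this. Uneven check-ins make it tighter still.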



Thundering herds
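
The slide itself is a picture, so the following is an illustration rather than what was shown. A common cure for agents all checking in at once is to spread their start times across the run interval (Puppet's `splay` setting adds a random delay for this purpose). The sketch below uses a deterministic per-host offset instead; the 30-minute interval, the host names, and `splay_offset` are assumptions.

```python
import hashlib
from collections import Counter

RUN_INTERVAL_S = 30 * 60  # assumed run interval, not from the slides


def splay_offset(hostname: str, interval_s: int = RUN_INTERVAL_S) -> int:
    """Map a hostname to a stable start offset within the run interval."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % interval_s


# Hypothetical host names, for illustration only.
hosts = [f"web{n:04d}.example.com" for n in range(5000)]
per_minute = Counter(splay_offset(h) // 60 for h in hosts)

print("busiest minute:", max(per_minute.values()), "agents")
print("without splay :", len(hosts), "agents in the same minute")
```

Because the offset is derived from the hostname, each agent keeps the same slot from run to run, which makes load on the master predictable.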




Build your hosts the same way!

            Old way:
                SystemImager, VMware clones, manual installs
            Results:
                INCONSISTENCY!




Build your hosts the same way!

            New way:
                Cobbler: under 5 minutes from bare metal to on the network
            Results:
                Same results every time! No drift between base




Set up your working environment properly

            Geppetto – Eclipse-based IDE
                  http://cloudsmith.github.com/geppetto/index.html
            Vim
            •    Pathogen – Autoloads Vim plugins
            •    Snipmate – Snippets
            •    Tabular – Text filtering and alignment
            •    Syntastic – Syntax checking
            •    mv-vim-puppet – Makes Vim Puppet-friendly
            •    puppet-lint – Style checker (gem)

Syntastic/puppet-lint




Version Control is not enough

            Ever do a 4-way diff across 60 modules, only to find that
            most of them are different?




What to do?

    How do I stop manual edits?
    How do I deal with 80+ Dev/QA environments?

    Versioning?  Branching?  Dynamic environments?
    Librarian?  Pulp?  Internal Forge?  Puppet Module Tool?

    What does PuppetLabs do?

PuppetLabs seems to know what to do




                    Let’s investigate the puppet module tool




http://docs.puppetlabs.com/puppet/2.7/reference/modules_publishing.html



Generate a module




http://docs.puppetlabs.com/puppet/2.7/reference/modules_publishing.html

Edit Modulefile




http://docs.puppetlabs.com/puppet/2.7/reference/modules_publishing.html

Document the manifest




                http://rdoc.sourceforge.net/

Write Documentation?




Free Docs!

        puppet doc -a -o /var/www/html/puppetdocs --mode rdoc




What’s actually installed?




Catching live edits and preventing them



                Splunk -> puppet module changes -> alerting




                       The NOC will hunt you down!
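
The idea behind detecting live edits can be sketched as a checksum comparison: record a digest for every file when the module is built, then compare the installed files against that record. This is a minimal illustration and not the actual implementation of `puppet module changes`; the SHA-256 digest, the helper names, and the sample module contents are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def digest_of(path: Path) -> str:
    """Hex digest of a file's contents (algorithm choice is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def record_checksums(module_dir: Path) -> dict:
    """Checksum every file under module_dir, keyed by relative path."""
    return {
        p.relative_to(module_dir).as_posix(): digest_of(p)
        for p in sorted(module_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(module_dir: Path, recorded: dict) -> list:
    """Return files modified or removed since the checksums were recorded."""
    current = record_checksums(module_dir)
    return sorted(
        name for name, digest in recorded.items() if current.get(name) != digest
    )


# Demo on a throwaway directory standing in for an installed module.
with tempfile.TemporaryDirectory() as tmp:
    module = Path(tmp) / "puppetserver"
    (module / "manifests").mkdir(parents=True)
    init_pp = module / "manifests" / "init.pp"
    init_pp.write_text("class puppetserver { }\n")

    recorded = record_checksums(module)  # done at build/release time
    init_pp.write_text("class puppetserver { notify { 'hotfix': } }\n")  # live edit

    drift = changed_files(module, recorded)
    print(drift)  # → ['manifests/init.pp']
```

A non-empty result is the event that would be logged and alerted on.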


Build and install the module




     tar -xzf /tmp/work/stubhub-puppetserver/pkg/stubhub-puppetserver-0.0.1.tar.gz \
         -C /etc/puppet/environments/staging/modules/puppetserver




Releasing like that?




Internal Forge

    mod_rewrite:
           Simulate the API: redirect requests to JSON metadata files
           $htmlroot/api/v1/releases.json?module=user/module

    Ruby script:
           Generate metadata files for each module release and
           for all modules.
           Similar to createrepo (yum)
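
The metadata-generation step might look roughly like the sketch below: scan the release tarballs, group them by `user/module`, and emit JSON for the web server to hand out. The original is a Ruby script that is not shown in the slides; this Python version, its JSON layout, and every version number other than 0.0.1 are assumptions, and the layout is not claimed to match the real Forge API.

```python
import json
import re
from collections import defaultdict

# Matches release tarballs named <user>-<module>-<version>.tar.gz,
# e.g. stubhub-puppetserver-0.0.1.tar.gz (the name used on the build slide).
RELEASE_RE = re.compile(
    r"^(?P<user>[^-]+)-(?P<module>[^-]+)-(?P<version>\d+\.\d+\.\d+)\.tar\.gz$"
)


def version_key(version: str) -> tuple:
    """Turn '0.0.10' into (0, 0, 10) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def build_index(filenames):
    """Group release tarballs by 'user/module', oldest version first."""
    index = defaultdict(list)
    for name in filenames:
        match = RELEASE_RE.match(name)
        if match is None:
            continue  # skip anything that is not a release tarball
        full_name = f"{match['user']}/{match['module']}"
        index[full_name].append({"version": match["version"], "file": name})
    for releases in index.values():
        releases.sort(key=lambda r: version_key(r["version"]))
    return dict(index)


# Hypothetical directory listing; only 0.0.1 appears in the slides.
listing = [
    "stubhub-puppetserver-0.0.10.tar.gz",
    "stubhub-puppetserver-0.0.1.tar.gz",
    "stubhub-puppetserver-0.0.2.tar.gz",
    "README.txt",
]
index = build_index(listing)
print(json.dumps(index, indent=2))
```

Sorting on the numeric tuple rather than the string keeps 0.0.10 after 0.0.2, which matters once the client has to pick the newest release for an upgrade.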


Internal Forge - Search




Internal Forge - install




Internal Forge - upgrade




Case statements? How about Hiera?




Avoid case statement insanity

    case $::system_role {
      'browse', 'search': {
        # …do some stuff…
      }
      'db': {
        # …other stuff…
      }
      'otherrole': {
        # …please make it stop!!!!
      }
    }

hieradata

    $hieradata/browse.yaml:
           ---
           module::parameter: 'foo'

    $hieradata/search.yaml:
           ---
           module::parameter: 'bar'

    $hieradata/defaults.yaml:
           ---
           module::parameter: 'I want this everywhere unless there are overrides'
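
How the three files above replace a case statement can be sketched as a first-match lookup: check the file named after the node's role, then fall back to the defaults. The search order (role file, then defaults) is an assumption, since the slides do not show the `:hierarchy:` setting, and `lookup` is a hypothetical helper, not Hiera's API.

```python
# In-memory stand-ins for browse.yaml, search.yaml and defaults.yaml.
HIERADATA = {
    "browse": {"module::parameter": "foo"},
    "search": {"module::parameter": "bar"},
    "defaults": {
        "module::parameter": "I want this everywhere unless there are overrides"
    },
}


def lookup(key, system_role, data=HIERADATA):
    """Return the first value found, searching the role file before defaults."""
    for source in (system_role, "defaults"):
        values = data.get(source, {})
        if key in values:
            return values[key]
    raise KeyError(key)


print(lookup("module::parameter", "browse"))  # → foo
print(lookup("module::parameter", "search"))  # → bar
print(lookup("module::parameter", "db"))      # no db file, falls through to defaults
```

Adding a new role then means adding a data file, not another branch in the manifest.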




Case -> variables -> hiera

    •    9000 lines of case statements
    •    1000 lines with case/variables
    •    ~20 lines with a defined type


                     Code compression FTW!


Dynamic Environments

    •    puppet.conf:
    modulepath  = /etc/puppet/environments/$environment/modules
    manifest    = /etc/puppet/environments/$environment/manifests/site.pp
    manifestdir = /etc/puppet/environments/$environment/manifests

    •    hiera.yaml:
    :datadir: '/etc/puppet/environments/%{environment}/hieradata'
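
To make the interpolation concrete, here is a small sketch of what the settings above resolve to for a given environment name. `environment_paths` is a hypothetical helper; the directory layout is taken from the settings above, and the two environment names come from the build and release slides.

```python
from pathlib import PurePosixPath

ENV_ROOT = PurePosixPath("/etc/puppet/environments")  # from the puppet.conf settings


def environment_paths(environment: str) -> dict:
    """Expand the per-environment paths the settings above resolve to."""
    base = ENV_ROOT / environment
    return {
        "modulepath": str(base / "modules"),
        "manifest": str(base / "manifests" / "site.pp"),
        "manifestdir": str(base / "manifests"),
        "hiera_datadir": str(base / "hieradata"),
    }


for env in ("staging", "production"):
    print(env, environment_paths(env)["modulepath"])
```

Creating a new environment is then just a matter of creating one more directory tree under the root, with no change to puppet.conf.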




Release process

 •    Syntax check/validate
 •    Test on VMs
 •    Build module package
 •    Release to internal forge
 •    puppet module install to staging environment
 •    Test again!
 •    puppet module install to production environment


The road to yesop

    •    Staging
    •    Process
    •    Repeatability
    •    Consistency
    •    Document everything
    •    Breaking things where it’s cheap
    •    Test everything!


Then vs. now

 •    Environment build time:
         –      Then: 3+ weeks
                 •  It was wrong
                 •  It didn’t work
                 •  Nobody knew what to expect
         –      Now: < 1 day
                 •  It’s the same every time
                 •  We know exactly what’s installed
                 •  Internal consumers get what they expect
                 •  Fewer outages from human error


Questions?




