Lessons I Learned While Scaling to 5000 Puppet Agents

Russ Johnson of StubHub talks about "Learning Lessons Scaling to 5000 Puppet Agents" at Puppet Camp San Francisco 2013. Find a Puppet Camp near you: puppetlabs.com/community/puppet-camp/

Transcript

  • 1. Learning Lessons Scaling to 5000 Agents. Russ Johnson, rjohnson@stubhub.com, @professoruss
  • 2. #whoami (April 9, 2013)
      • Started out in the mid 90’s
      • Recovering Windows Admin
      • Storage Guy
      • Datacenter Monkey
      • Once upon a time network guy
      • At StubHub since December 2006
      • Working on puppet adoption in a crazy infrastructure
      • Puppet certified
  • 3. 4 Puppet Masters: DEV/QA, PROD, DR, CORP
  • 4. The road to sanity
  • 5. Set up your master properly
      • Apache/Passenger
      • Tune Passenger:
          PassengerMaxPoolSize 32
          PassengerMinInstances 4
          PassengerMaxRequests 10000
          PassengerStatThrottleRate 30
      • 16 cores, 32GB
      • load average: 4.03, 3.71, 3.45
      • 4000+ agents
  • 6. Thundering herds
  • 7. Build your hosts the same way! Old way: systemimager, vmware clones, manual installs. Results: INCONSISTENCY!
  • 8. Build your hosts the same way! New way: Cobbler, < 5m bare metal to on the network. Results: Same results every time! No drift between base
  • 9. Set up your working environment properly
      • Geppetto: Eclipse-based IDE, http://cloudsmith.github.com/geppetto/index.html
      • VIM
          • Pathogen: for autoloading vim plugins
          • Snipmate: snippets
          • Tabular: text filtering and alignment
          • Syntastic: syntax checking
          • mv-vim-puppet: make vim puppet friendly
          • puppet-lint: style checker (gem)
  • 10. Syntastic/puppet-lint
  • 11. Set up your working environment properly
  • 12. Version Control is not enough. Ever do a 4-way diff across 60 modules to find most of them different?
  • 13. What to do?
      • How do I stop manual edits?
      • How do I deal with 80+ Dev/QA Environments?
      • Versioning? Librarian? Pulp? Branching? Internal Forge? Dynamic Environments? Puppet Module Tool?
      • What does PuppetLabs do?
  • 14. PuppetLabs seems to know what to do. Let’s investigate puppet module tool: http://docs.puppetlabs.com/puppet/2.7/reference/modules_publishing.html
  • 15. Generate a module (http://docs.puppetlabs.com/puppet/2.7/reference/modules_publishing.html)
  • 16. Edit Modulefile (http://docs.puppetlabs.com/puppet/2.7/reference/modules_publishing.html)
  • 17. Document the manifest: http://rdoc.sourceforge.net/
  • 18. Write Documentation?
  • 19. Free Docs! puppet doc -a -o /var/www/html/puppetdocs --mode rdoc
  • 20. What’s actually installed?
  • 21. Catching live edits and preventing them. Splunk -> puppet module changes -> alerting. The NOC will hunt you down!
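
Slide 21 feeds the output of `puppet module changes` into Splunk. As a rough illustration of that kind of check, the sketch below compares the files on disk against a set of checksums recorded at release time. The helper names `checksum_tree` and `detect_drift` are hypothetical, and this is not a claim about how `puppet module changes` is implemented internally.

```ruby
require 'digest'

# Hypothetical helper: map every regular file under dir (by relative path)
# to its MD5 hex digest. Dotfiles are skipped by the default glob rules.
def checksum_tree(dir)
  sums = {}
  Dir.glob('**/*', base: dir).each do |rel|
    path = File.join(dir, rel)
    next unless File.file?(path)
    sums[rel] = Digest::MD5.file(path).hexdigest
  end
  sums
end

# Hypothetical helper: compare checksums recorded at release time with the
# current state of dir and report what was edited, deleted, or added by hand.
def detect_drift(recorded, dir)
  current = checksum_tree(dir)
  {
    modified: recorded.keys.select { |f| current.key?(f) && current[f] != recorded[f] }.sort,
    removed:  (recorded.keys - current.keys).sort,
    added:    (current.keys - recorded.keys).sort
  }
end
```

Any non-empty list in the result is the sort of event one might forward to an alerting system.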
  • 22. Build and install the module: tar -xzf /tmp/work/stubhub-puppetserver/pkg/stubhub-puppetserver-0.0.1.tar.gz -C /etc/puppet/environments/staging/modules/puppetserver
  • 23. Releasing like that?
  • 24. Internal Forge
      • mod_rewrite: Simulate the api, redirect to json metadata files: $htmlroot/api/v1/releases.json?module=user/module
      • ruby script: Generate metadata files for each module release and all modules. Similar to createrepo (yum)
  • 25. Internal Forge - Search
  • 26. Internal Forge - install
  • 27. Internal Forge - upgrade
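
Slide 24 mentions a ruby script that generates metadata files for each module release. The talk does not show that script, so the following is only a minimal sketch of the idea. It assumes release tarballs are named `<user>-<module>-<version>.tar.gz`, and the JSON shape (a list of `version` and `file` entries keyed by `user/module`) is a guess for illustration, not the documented Forge API format. The names `build_release_index` and `releases_json` and the `/releases/` path prefix are hypothetical.

```ruby
require 'json'

# Assumed naming convention: <user>-<module>-<version>.tar.gz
RELEASE_RE = /\A([a-z0-9]+)-([a-z0-9_]+)-(\d+\.\d+\.\d+)\.tar\.gz\z/

# Hypothetical helper: group release tarballs by "user/module" and sort each
# group by numeric version, oldest first. Non-matching files are ignored.
def build_release_index(filenames)
  index = Hash.new { |hash, key| hash[key] = [] }
  filenames.each do |name|
    base = File.basename(name)
    match = RELEASE_RE.match(base)
    next unless match
    user, mod, version = match.captures
    index["#{user}/#{mod}"] << { 'version' => version, 'file' => "/releases/#{base}" }
  end
  index.each_value do |releases|
    releases.sort_by! { |r| r['version'].split('.').map(&:to_i) }
  end
  index
end

# Hypothetical helper: the JSON document that a rewritten
# releases.json?module=user/module request could be pointed at.
def releases_json(index, full_name)
  JSON.generate(full_name => index.fetch(full_name, []))
end
```

Writing one such document per module to disk is what would let mod_rewrite serve static files in place of a live API, much as createrepo does for yum.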
  • 28. Case statements? How 'bout Hiera?
  • 29. Avoid case statement insanity
        case $::system_role {
          'browse', 'search': { …do some stuff… }
          'db': { …other stuff… }
          'otherrole': { …please make it stop!!!! }
        }
  • 30. hieradata
        $hieradata/browse.yaml:
          ---
          module::parameter: 'foo'
        $hieradata/search.yaml:
          ---
          module::parameter: 'bar'
        $hieradata/defaults.yaml:
          ---
          module::parameter: 'I want this everywhere unless there are overrides'
  • 31. Case -> variables -> hiera
      • 9000 lines of case statements
      • 1000 lines with case/variables
      • ~20 lines with defined type
      • Code compression FTW!
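
To make the move from case statements to Hiera concrete, here is a small sketch of a first-match lookup over the data files from slide 30. It assumes a two-level hierarchy (the node's role, then `defaults`), which the slides imply but do not spell out. The `lookup` function and `HIERADATA` constant are hypothetical stand-ins, not Hiera's own API.

```ruby
require 'yaml'

# File contents mirror slide 30; they are inlined as strings so the sketch
# runs without touching the filesystem.
HIERADATA = {
  'browse'   => "---\nmodule::parameter: 'foo'\n",
  'search'   => "---\nmodule::parameter: 'bar'\n",
  'defaults' => "---\nmodule::parameter: 'I want this everywhere unless there are overrides'\n"
}.freeze

# Hypothetical first-match lookup: walk the hierarchy in order and return the
# value from the first level that defines the key. Levels with no data file
# are skipped, so an unknown role falls through to the defaults.
def lookup(key, hierarchy, sources = HIERADATA)
  hierarchy.each do |level|
    raw = sources[level]
    next if raw.nil?
    data = YAML.safe_load(raw) || {}
    return data[key] if data.key?(key)
  end
  nil
end

lookup('module::parameter', %w[browse defaults]) # => "foo"
lookup('module::parameter', %w[db defaults])     # falls through to the defaults value
```

Adding a role then means adding one small YAML file, which is where the line-count reduction on slide 31 comes from.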
  • 32. Dynamic Environments
      • puppet.conf:
          modulepath = /etc/puppet/environments/$environment/modules
          manifest = /etc/puppet/environments/$environment/manifests/site.pp
          manifestdir = /etc/puppet/environments/$environment/manifests
      • hiera.yaml:
          :datadir: /etc/puppet/environments/%{environment}/hieradata
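
The settings on slide 32 work because both Puppet and Hiera substitute the agent's environment name into the path. The sketch below imitates that substitution for the two placeholder styles shown (`$environment` and `%{environment}`). The `expand` function is a hypothetical illustration and not the interpolation code either tool uses.

```ruby
# Hypothetical helper: replace $name and %{name} placeholders with values
# from scope, raising if a placeholder has no value.
def expand(template, scope)
  template.gsub(/%\{(\w+)\}|\$(\w+)/) do
    name = Regexp.last_match(1) || Regexp.last_match(2)
    scope.fetch(name) { raise KeyError, "no value for #{name}" }
  end
end

scope = { 'environment' => 'staging' }
expand('/etc/puppet/environments/$environment/modules', scope)
# => "/etc/puppet/environments/staging/modules"
expand('/etc/puppet/environments/%{environment}/hieradata', scope)
# => "/etc/puppet/environments/staging/hieradata"
```

Because the environment name is the only thing that varies, each of the 80+ Dev/QA environments mentioned on slide 13 can get its own directory without any change to the configuration files.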
  • 33. Release process
      • Syntax check/validate
      • Test on VMs
      • Build module package
      • Release to internal forge
      • puppet module install to staging environment
      • Test again!
      • puppet module install to production environment
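
The release process on slide 33 is a strict sequence: nothing reaches production unless every earlier step passed. A minimal sketch of that gating logic follows. The step names come from the slide, but `run_release` is a hypothetical helper and the step bodies are placeholders, since the talk does not show the actual commands used.

```ruby
# Hypothetical runner: execute [name, callable] pairs in order and stop at
# the first step whose callable returns a falsy value.
def run_release(steps)
  completed = []
  steps.each do |name, action|
    unless action.call
      return { ok: false, failed: name, completed: completed }
    end
    completed << name
  end
  { ok: true, failed: nil, completed: completed }
end

# Placeholder actions; in practice each lambda would shell out to the real
# tooling and return true only on success.
steps = [
  ['syntax check/validate',      -> { true }],
  ['test on VMs',                -> { true }],
  ['build module package',       -> { true }],
  ['release to internal forge',  -> { true }],
  ['install to staging',         -> { true }],
  ['test again',                 -> { false }], # simulate a staging failure
  ['install to production',      -> { true }]
]

result = run_release(steps)
# result[:failed] is "test again", and production was never touched
```

Returning the list of completed steps makes it clear how far a release got before it was stopped.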
  • 34. The road to yesop
      • Staging
      • Process
      • Repeatability
      • Consistency
      • Document everything
      • Breaking things where it’s cheap
      • Test everything!
  • 35. Then VS now
      • Environment build time:
          • Then: 3+ weeks
              • It was wrong
              • It didn’t work
              • Nobody knew what to expect
          • Now: < 1 day
              • It’s the same every time
              • We know exactly what’s installed
              • Internal consumers get what they expect
              • Fewer outages from human error
  • 36. Questions?