Lessons I Learned While Scaling to 5000 Puppet Agents

Russ Johnson of StubHub talks about "Learning Lessons Scaling to 5000 Puppet Agents" at Puppet Camp San Francisco 2013. Find a Puppet Camp near you: puppetlabs.com/community/puppet-camp/

Transcript

  • 1. Learning Lessons Scaling to 5000 Agents. Russ Johnson, rjohnson@stubhub.com, @professoruss
  • 2. #whoami: Started out in the mid-90s. Recovering Windows admin. Storage guy. Datacenter monkey. Once-upon-a-time network guy. At StubHub since December 2006. Working on Puppet adoption in a crazy infrastructure. Puppet certified. (Slides dated April 9, 2013.)
  • 3. 4 Puppet Masters: DEV/QA, PROD, DR, CORP
  • 4. The road to sanity
  • 5. Set up your master properly
      – Apache/Passenger
      – Tune Passenger:
          PassengerMaxPoolSize 32
          PassengerMinInstances 4
          PassengerMaxRequests 10000
          PassengerStatThrottleRate 30
      – 16 cores, 32GB
      – load average: 4.03, 3.71, 3.45
      – 4000+ agents
  • 6. Thundering herds
  • 7. Build your hosts the same way! Old way: systemimager, VMware clones, manual installs. Results: INCONSISTENCY!
  • 8. Build your hosts the same way! New way: Cobbler, < 5m from bare metal to on the network. Results: Same results every time! No drift between base
  • 9. Set up your working environment properly
      – Geppetto – Eclipse-based IDE: http://cloudsmith.github.com/geppetto/index.html
      – VIM
          • Pathogen – For autoloading vim plugins
          • Snipmate – Snippets
          • Tabular – Text filtering and alignment
          • Syntastic – Syntax checking
          • mv-vim-puppet – Make vim puppet friendly
          • puppet-lint – Syntax checker (gem)
  • 10. Syntastic/puppet-lint
  • 11. Set up your working environment properly +
  • 12. Version Control is not enough. Ever do a 4-way diff across 60 modules to find most of them different?
  • 13. What to do? How do I stop manual edits? How do I deal with 80+ Dev/QA Environments? Versioning? Librarian? Pulp? Branching? Internal Forge? Dynamic Environments? Puppet Module Tool? What does PuppetLabs do?
  • 14. PuppetLabs seems to know what to do. Let's investigate puppet module tool: http://docs.puppetlabs.com/puppet/2.7/reference/modules_publishing.html
  • 15. Generate a module: http://docs.puppetlabs.com/puppet/2.7/reference/modules_publishing.html
  • 16. Edit Modulefile: http://docs.puppetlabs.com/puppet/2.7/reference/modules_publishing.html
  • 17. Document the manifest: http://rdoc.sourceforge.net/
  • 18. Write Documentation?
  • 19. Free Docs!
        puppet doc -a -o /var/www/html/puppetdocs --mode rdoc
  • 20. What's actually installed?
  • 21. Catching live edits and preventing them. Splunk -> puppet module changes -> alerting. The NOC will hunt you down!
  • 22. Build and install the module
        tar -xzf /tmp/work/stubhub-puppetserver/pkg/stubhub-puppetserver-0.0.1.tar.gz -C /etc/puppet/environments/staging/modules/puppetserver
  • 23. Releasing like that?
  • 24. Internal Forge
      – mod_rewrite: Simulate the API by redirecting to JSON metadata files
          $htmlroot/api/v1/releases.json?module=user/module
      – ruby script: Generate metadata files for each module release and all modules. Similar to createrepo (yum)
  • 25. Internal Forge - Search
  • 26. Internal Forge - install
  • 27. Internal Forge - upgrade
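
The metadata-generation step on slide 24 can be sketched roughly as follows. The slides describe a ruby script and do not show it; this is a Python illustration only, and it assumes package files are named `<user>-<module>-<version>.tar.gz` (as in the tarball on slide 22). The field names `version` and `file` and the `/pkg/` path prefix are hypothetical, not the actual format the module tool expects.

```python
import json
import re

# Assumed naming convention: <user>-<module>-<version>.tar.gz
TARBALL = re.compile(
    r"^(?P<user>[^-]+)-(?P<module>.+)-(?P<version>\d+\.\d+\.\d+)\.tar\.gz$"
)


def version_key(release):
    """Sort key so that 0.0.10 orders after 0.0.2."""
    return tuple(int(part) for part in release["version"].split("."))


def build_index(filenames):
    """Group package files into {"user/module": [release, ...]}."""
    index = {}
    for name in filenames:
        match = TARBALL.match(name)
        if match is None:
            continue  # skip anything that is not a module package
        key = "{}/{}".format(match.group("user"), match.group("module"))
        index.setdefault(key, []).append(
            # "file" and the "/pkg/" prefix are hypothetical placeholders
            {"version": match.group("version"), "file": "/pkg/" + name}
        )
    for releases in index.values():
        releases.sort(key=version_key)
    return index


if __name__ == "__main__":
    files = ["stubhub-puppetserver-0.0.1.tar.gz", "README"]
    print(json.dumps(build_index(files), indent=2))
```

Like createrepo, a script of this kind would be rerun whenever a new package lands in the web root, so the static JSON files served via mod_rewrite stay in step with what is actually published.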
  • 28. Case statements? How 'bout Hiera?
  • 29. Avoid case statement insanity
        case $::system_role {
          'browse', 'search': { …do some stuff… }
          'db': { …other stuff… }
          'otherrole': { …please make it stop!!!! }
        }
  • 30. hieradata
        $hieradata/browse.yaml:
          ---
          module::parameter: 'foo'
        $hieradata/search.yaml:
          ---
          module::parameter: 'bar'
        $hieradata/defaults.yaml:
          ---
          module::parameter: 'I want this everywhere unless there are overrides'
  • 31. Case -> variables -> hiera
      – 9000 lines of case statements
      – 1000 lines with case/variables
      – ~20 lines with defined type
      – Code compression FTW!
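
The lookup behaviour behind slides 29 to 31 can be illustrated with a small sketch. It is not Hiera itself: the dictionaries stand in for the YAML files on slide 30, and the two-level hierarchy (role file first, then defaults) is an assumption, since the slides do not show the hierarchy from hiera.yaml.

```python
# In-memory stand-ins for browse.yaml, search.yaml and defaults.yaml.
HIERADATA = {
    "browse": {"module::parameter": "foo"},
    "search": {"module::parameter": "bar"},
    "defaults": {
        "module::parameter": "I want this everywhere unless there are overrides"
    },
}

# Assumed hierarchy: most specific source first, defaults last.
HIERARCHY = ("{role}", "defaults")


def lookup(key, system_role):
    """Return the first value found while walking the hierarchy."""
    for level in HIERARCHY:
        source = level.format(role=system_role)
        data = HIERADATA.get(source, {})
        if key in data:
            return data[key]
    raise KeyError(key)


if __name__ == "__main__":
    print(lookup("module::parameter", "browse"))  # role-specific value
    print(lookup("module::parameter", "db"))      # no db data, uses defaults
```

Adding a role then means adding a data file rather than another branch in a case statement, which is where the line-count reduction on slide 31 comes from.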
  • 32. Dynamic Environments
      – puppet.conf:
          modulepath = /etc/puppet/environments/$environment/modules
          manifest = /etc/puppet/environments/$environment/manifests/site.pp
          manifestdir = /etc/puppet/environments/$environment/manifests
      – hiera.yaml:
          :datadir: /etc/puppet/environments/%{environment}/hieradata
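
To make the effect of slide 32 concrete, the sketch below expands the same path templates for a given environment name. It only imitates the interpolation with plain string substitution; how Puppet and Hiera resolve these settings internally is not shown in the slides, and the `resolve` helper is hypothetical.

```python
from string import Template

# Settings copied from the puppet.conf fragment on slide 32.
PUPPET_CONF = {
    "modulepath": "/etc/puppet/environments/$environment/modules",
    "manifest": "/etc/puppet/environments/$environment/manifests/site.pp",
    "manifestdir": "/etc/puppet/environments/$environment/manifests",
}

# hiera.yaml uses %{environment} rather than $environment.
HIERA_DATADIR = "/etc/puppet/environments/%{environment}/hieradata"


def resolve(environment):
    """Return the concrete paths an agent in `environment` would be served from."""
    paths = {
        name: Template(value).substitute(environment=environment)
        for name, value in PUPPET_CONF.items()
    }
    paths["datadir"] = HIERA_DATADIR.replace("%{environment}", environment)
    return paths


if __name__ == "__main__":
    for name, path in sorted(resolve("staging").items()):
        print(name, "=", path)
```

Because every path is derived from the environment name, creating a new environment amounts to creating a new directory tree under /etc/puppet/environments, which is what makes 80+ Dev/QA environments manageable.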
  • 33. Release process
      – Syntax check/validate
      – Test on VMs
      – Build module package
      – Release to internal forge
      – puppet module install to staging environment
      – Test again!
      – puppet module install to production environment
  • 34. The road to yesop
      – Staging
      – Process
      – Repeatability
      – Consistency
      – Document everything
      – Breaking things where it's cheap
      – Test everything!
  • 35. Then vs. now
      – Environment build time:
          – Then: 3+ weeks
              • It was wrong
              • It didn't work
              • Nobody knew what to expect
          – Now: < 1 day
              • It's the same every time
              • We know exactly what's installed
              • Internal consumers get what they expect
              • Fewer outages from human error
  • 36. Questions?
