PuppetConf 2016: Multi-Tenant Puppet at Scale – John Jawed, eBay, Inc.

Here are the slides from John Jawed's PuppetConf 2016 presentation called Multi-Tenant Puppet at Scale. Watch the videos at https://www.youtube.com/playlist?list=PLV86BgbREluVjwwt-9UL8u2Uy8xnzpIqa


  1. Multi-tenant Puppet: Automation for everyone
  2. JJ: John Jawed, github.com/johnj. Dogs, anything with an ocean.
  3. Gap up: linear, exponential
  4. Change: function of time. 2014; 118,000 hosts; 13,000 environments; fewer puppetmasters; baremetal, VM, containers
  5. Cha-cha-cha-changes: unavoidable; happen everywhere
  6. Oops: changes do not always go according to plan; 48 minutes
  7. Goals: performance & scale; policy; seamless onboarding
  8. Bottlenecks? Try giving up: capacity, abilities; paradigms (epoll vs select); insanity
  9. Classification, Catalog, Reports/Facts: average puppet run 8 seconds
  10. Classification: node_terminus = /enc_script.rb; 320ms loading gems, files, certs; only 100ms for API call to ENC. Optimize: ENC run time as close to 100ms as possible
  11. Classification: paradigm shift from "exec /enc_script.rb fqdn" to "write fqdn to ENC workers"
  12. Classification: a little dash of bash. node_terminus = /enc_handler.sh; $ cat enc_handler.sh: ... echo $1 | nc -U /unix.sock ...
  13. Classification: a little go go. William Kennedy’s workpool (github.com/goinggo/workpool); go server listening on /unix.sock; workpool routes requests to an idle worker
  14. Classification: exec/exit to listen/process. $ cat /enc_script.rb: … while certname = $stdin.gets do enc(certname) end …
  15. Classification: PPM calls node_terminus; node_terminus writes request to socket; go handles request, workpool routes
  16. Classification, end result: gets close to 100ms goal (110ms); CPU usage: no constant bootstrapping; frees up resources, puppet master process; at scale, 200ms per run adds up quickly (30 for every 60 seconds of CPU time)
  17. Catalogs (Catalog): catalog compilation: low-hanging fruit, difficult. Source: http://www.isrubyfastyet.com
  18. Agents (Catalog): everything is SSL, that is good; everything is SSL, that is expensive; use yum.puppetlabs.com or apt.puppetlabs.com to make sure you run 3.7+; runtime savings: 40%
  19. Post-run woes (Reports/Facts): after the agent runs, the real fun begins; puppetmaster and agent both wait for report processors to finish; slow report collection will cause your infrastructure to fall over (some just avoid it)
  20. Foreman (Reports/Facts): foreman report/fact processing needs to spread read I/O; fact processing is read heavy, reports are write heavy; ruby activerecord: makara; postgresql: local read slaves, pg_shard
  21. Reports (Reports/Facts): 4k run reports per minute, using pg_shard: psql> SELECT master_create_distributed_table(table_name := 'reports', partition_column := 'report_id'); psql> SELECT master_create_worker_shards(table_name := 'reports', shard_count := 365);
  22. Facts (Reports/Facts): most of the workload is read I/O, kept local; facts updated immediately after puppet runs; master DB loadavg 2
  23. Classification, Catalog, Reports/Facts: average puppet run 2 seconds
  24. runinterval is not your friend
  25. pvc: open source, github.com/johnj/pvc; basis of orchestration in 2014
  26. pvc.conf (pvc): host_endpoint=your.pvcbackend.com/host
  27. Simple is hard: “Simple can be harder than complex: You have to work hard to get your thinking clean to make it simple. But it’s worth it in the end because once you get there, you can move mountains.” (Steve Jobs)
  28. Host / Infrastructure
  29. Host events: most systems have audit frameworks: files (inotify), processes (audit), network; puppet needs to react to these events
  30. osquery
  31. osquery: services, files, and any resource that can be tracked as a host event; event information can also be recorded (doorman, zentral, etc); event info is stored in tables (sqlite)
  32. File monitoring: {"file_paths": {"homes": ["/root/.ssh/%%", "/home/%/.ssh/%%"], "binaries": ["/usr/bin/%%", "/sbin/%%"], "etc": ["/etc/%%"], "tmp": ["/tmp/%%"]}}
  33. Infrastructure events: code releases, package upgrades, access changes; puppet needs to be told to run when these events occur
  34. pvc and foreman: foreman’s puppetrun API to set flag; pvc queries foreman to trigger run; logical separation with host groups
  35. runinterval is an afterthought: puppet runs instantly when it needs to; runinterval can be 3 minutes or 3 hours; frees up puppet masters, allows more resources for other things; your infrastructure is still kept honest
  36. git
  37. “I pummel people with questions, because I need to know what they're thinking, what they're trying to achieve, what they believe the final outcome is going to be.” (Tim Gunn)
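
The classification slides replace an ENC script that is exec'd once per node with long-lived workers fed over a Unix socket. The talk's server is written in Go on top of workpool; what follows is only a minimal Python sketch of the same listen/process idea, with a thread pool standing in for workpool. The names `classify`, `handle_connection` and `serve`, and the YAML returned, are hypothetical placeholders rather than the author's code.

```python
import os
import socket
from concurrent.futures import ThreadPoolExecutor


def classify(certname: str) -> str:
    """Hypothetical stand-in for the ENC API call; returns ENC-style YAML."""
    return (
        "---\n"
        "classes: {}\n"
        "parameters:\n"
        f"  certname: {certname}\n"
    )


def handle_connection(conn: socket.socket) -> None:
    """Read one certname line from the client and write the classification back."""
    with conn, conn.makefile("r") as reader:
        certname = reader.readline().strip()
        if certname:
            conn.sendall(classify(certname).encode())


def serve(path: str, workers: int = 8) -> None:
    """Accept connections forever and hand each one to an idle pool worker."""
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket file left by a previous run
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server, \
            ThreadPoolExecutor(max_workers=workers) as pool:
        server.bind(path)
        server.listen()
        while True:
            conn, _ = server.accept()
            pool.submit(handle_connection, conn)
```

Calling `serve("/unix.sock")` in a long-running process would let a handler such as `echo $1 | nc -U /unix.sock` obtain a classification without paying the interpreter start-up cost on every request.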
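
The end-result slide notes that per-run overhead adds up quickly at scale. As a rough worked example, the sketch below combines figures that appear on different slides: 320ms of bootstrapping plus 100ms for the API call before the change, 110ms after it, and 4k run reports per minute. Treating 4k reports per minute as 4,000 classifications per minute, summing 320ms and 100ms, and counting the measured milliseconds as CPU time are all assumptions made here for illustration; the talk does not state them.

```python
RUNS_PER_MINUTE = 4_000   # assumption: one classification per run report
BEFORE_MS = 320 + 100     # assumption: bootstrapping plus ENC API call
AFTER_MS = 110            # figure quoted on the end-result slide


def busy_cores(runs_per_minute: int, ms_per_run: int) -> float:
    """Average number of cores kept busy by classification alone."""
    return runs_per_minute * ms_per_run / 60_000


before = busy_cores(RUNS_PER_MINUTE, BEFORE_MS)
after = busy_cores(RUNS_PER_MINUTE, AFTER_MS)
print(f"before: {before:.1f} cores, after: {after:.1f} cores")
```

Under these assumptions classification alone drops from roughly 28 busy cores to roughly 7.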
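
The reports slide distributes the `reports` table across 365 shards keyed on `report_id`. The sketch below illustrates only the general idea of hash-based shard routing. It uses CRC32 and a modulo for brevity, which is not how pg_shard assigns rows internally, and `shard_for` is a hypothetical helper.

```python
import zlib
from collections import Counter

SHARD_COUNT = 365  # shard_count used on the reports slide


def shard_for(report_id: int, shard_count: int = SHARD_COUNT) -> int:
    """Map a report_id to a shard index in [0, shard_count)."""
    digest = zlib.crc32(str(report_id).encode("ascii"))
    return digest % shard_count


# Count how sequential ids spread over the shards, so write load can be inspected.
usage = Counter(shard_for(report_id) for report_id in range(100_000))
print(len(usage), "shards used; busiest shard holds", max(usage.values()), "rows")
```

Because the shard is derived from the key alone, any writer can compute the destination without consulting a central index.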
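
The `file_paths` entries on the file monitoring slide use osquery's SQL-style wildcards: a single `%` matches one path level and a trailing `%%` matches recursively. The sketch below approximates those rules with regular expressions to show which paths each category would cover. `pattern_to_regex` and `categories_for` are hypothetical helpers, not part of osquery.

```python
import re

FILE_PATHS = {
    "homes": ["/root/.ssh/%%", "/home/%/.ssh/%%"],
    "binaries": ["/usr/bin/%%", "/sbin/%%"],
    "etc": ["/etc/%%"],
    "tmp": ["/tmp/%%"],
}


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate '%' (one path level) and '%%' (recursive) into a regex."""
    parts = []
    for token in re.split(r"(%%|%)", pattern):
        if token == "%%":
            parts.append(".*")      # any depth below this point
        elif token == "%":
            parts.append("[^/]*")   # a single path component
        else:
            parts.append(re.escape(token))
    return re.compile("".join(parts))


def categories_for(path: str) -> list:
    """Return the names of the categories whose patterns cover the path."""
    return [
        name
        for name, patterns in FILE_PATHS.items()
        if any(pattern_to_regex(p).fullmatch(path) for p in patterns)
    ]


print(categories_for("/home/alice/.ssh/authorized_keys"))
```

A change under any user's `.ssh` directory falls in the `homes` category, while a `.ssh` directory nested deeper inside a home directory does not match the single-level `%`.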
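
The closing slides describe runs triggered by events (a flag set through foreman) with runinterval kept only as a fallback. The following sketch shows just that decision logic. `should_run` and its parameters are hypothetical and do not reflect pvc's actual implementation or foreman's API.

```python
from datetime import datetime, timedelta


def should_run(flag_set: bool, last_run: datetime, now: datetime,
               runinterval: timedelta) -> bool:
    """Run immediately when an event flagged the host; otherwise fall back
    to the periodic runinterval so the host is still kept honest."""
    if flag_set:
        return True
    return now - last_run >= runinterval


now = datetime(2016, 10, 20, 12, 0)
recent = now - timedelta(minutes=5)
# An event-triggered flag wins even with a 3-hour runinterval.
print(should_run(True, recent, now, timedelta(hours=3)))     # True
print(should_run(False, recent, now, timedelta(hours=3)))    # False
print(should_run(False, recent, now, timedelta(minutes=3)))  # True
```

With the flag handling the urgent cases, the interval can be stretched without delaying changes that matter.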
