Managing Puppet using MCollective: Presentation Transcript

Managing Puppet using MCollective
Puppet Camp Ghent
R.I.Pienaar

Who am I?
• Puppet user since 0.22.x
• Architect of MCollective
• Author of Extlookup and Hiera
• Developer at Puppet Labs London
• Blog at http://devco.net
• Tweets at @ripienaar
• Volcane on IRC
R.I.Pienaar | rip@devco.net | http://devco.net | @ripienaar

The Problem?
• Puppet needs management just like other software
• Enabling, disabling, ad-hoc runs, custom environments etc
• The Puppet Master is a finite resource that needs protection
• Orchestrated deploys

Obtaining The Agent Status

Obtaining Statuses

    $ mco puppet status
     * [ ============================================================> ] 11 / 11

       node8.example.net: Currently stopped; last completed run 14 minutes 16 seconds ago
       ....

    Summary of Applying:
       false = 11
    Summary of Daemon Running:
       stopped = 11
    Summary of Enabled:
       enabled = 10
       disabled = 1
    Summary of Idling:
       false = 11

    Finished processing 11 / 11 hosts in 72.05 ms

Slide callouts: the per-host lines are the per node status; the "Summary of" blocks are the estate wide summary.

Obtaining Statuses

    $ mco puppet count
    Total Puppet nodes: 11
              Nodes currently enabled: 10
             Nodes currently disabled: 1
    Nodes currently doing puppet runs: 5
              Nodes currently stopped: 6
           Nodes with daemons started: 10
        Nodes without daemons started: 1
           Daemons started but idling: 6

Obtaining Statuses

    $ mco rpc puppet last_run_summary
     * [ ============================================================> ] 28 / 28
    . . .
    Summary of Config Retrieval Time:
       Average: 20.13
    Summary of Total Resources:
       Average: 435
    Summary of Total Time:
       Average: 39.33

    Finished processing 28 / 28 hosts in 311.23 ms

Doing Basic Runs

    $ mco puppet runonce
     * [ ============================================================> ] 11 / 11

    node9.example.net                        Request Aborted
       Puppet is disabled: machine under maintenance

    Finished processing 11 / 11 hosts in 2593.85 ms

    $ mco puppet count
    Total Puppet nodes: 11
              Nodes currently enabled: 10
             Nodes currently disabled: 1
    Nodes currently doing puppet runs: 2
              Nodes currently stopped: 9
           Nodes with daemons started: 10
        Nodes without daemons started: 1
           Daemons started but idling: 8

Slide callout: "machine under maintenance" is the Puppet 3 disable message.
Run with default configured splay and splaylimit.

Doing Basic Runs

    $ mco puppet runonce -f
     * [ ============================================================> ] 11 / 11

    node9.example.net                        Request Aborted
       Puppet is disabled: machine under maintenance

    Finished processing 11 / 11 hosts in 2661.99 ms

Run with no splay, still subject to enable/disable.

Doing Basic Runs

    $ mco puppet runonce --splay --splaylimit 120
     * [ ============================================================> ] 11 / 11

    node9.example.net                        Request Aborted
       Puppet is disabled: machine under maintenance

    Finished processing 11 / 11 hosts in 2661.99 ms

Force splay and set a custom splay limit.

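Splay spreads agent runs over a window instead of starting them all at once. A minimal Python sketch of that idea, assuming each node independently picks a uniformly random delay between 0 and the splay limit. The function name, the seed parameter and the node list are hypothetical, and this is not how the Puppet agent itself is implemented:

```python
import random

def splay_delays(nodes, splaylimit, seed=None):
    """Return a hypothetical start delay in seconds per node, each in [0, splaylimit]."""
    rng = random.Random(seed)  # seeded only so the illustration is repeatable
    return {node: rng.uniform(0, splaylimit) for node in nodes}

# Hypothetical estate of 11 nodes, matching the node count on the slides.
nodes = [f"node{i}.example.net" for i in range(11)]
delays = splay_delays(nodes, splaylimit=120, seed=1)

# Print the nodes in the order they would start.
for node, delay in sorted(delays.items(), key=lambda item: item[1]):
    print(f"{node}: start after {delay:.1f}s")
```

With a limit of 120 the starts are scattered across two minutes, which is what protects the Puppet Master from 11 simultaneous catalog requests.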
Tags and Environment

    $ mco puppet runonce --tag webserver --tag syslog --environment development
     * [ ============================================================> ] 11 / 11

    node9.example.net                        Request Aborted
       Puppet is disabled: machine under maintenance

    Finished processing 11 / 11 hosts in 2661.99 ms

Selects 2 tags in a specific Puppet Environment.

Doing noop Runs

    $ mco puppet runonce --noop
     * [ ============================================================> ] 11 / 11

    node9.example.net                        Request Aborted
       Puppet is disabled: machine under maintenance

    Finished processing 11 / 11 hosts in 2661.99 ms

Does a noop run; gathers reports and audit information.

Doing no-noop Runs

    $ mco puppet runonce --tag webserver --no-noop
     * [ ============================================================> ] 11 / 11

    node9.example.net                        Request Aborted
       Puppet is disabled: machine under maintenance

    Finished processing 11 / 11 hosts in 2661.99 ms

When puppet.conf has noop=true, do an actual run on demand.

Choosing a Master

    $ mco puppet runonce --server secops.example.net:8134 --tag compliance
     * [ ============================================================> ] 11 / 11

    node9.example.net                        Request Aborted
       Puppet is disabled: machine under maintenance

    Finished processing 11 / 11 hosts in 2661.99 ms

Does a single run against a different Puppet Master.

The Big Red Button

    $ mco puppet disable "we f'd up, stop the train!"
     * [ ============================================================> ] 11 / 11

    node9.example.net                        Request Aborted
       Could not disable Puppet: Already disabled

    Summary of Enabled:
       disabled = 11

    Finished processing 11 / 11 hosts in 90.06 ms

Disables Puppet; does not change the disable reason on nodes that are already disabled.

The Big Green Button

    $ mco puppet enable -S 'puppet().disable_message=/stop the train/'
     * [ ============================================================> ] 10 / 10

    Summary of Enabled:
       enabled = 10

    Finished processing 10 / 10 hosts in 90.06 ms

Enables the Puppet nodes that were disabled with a message matching /stop the train/.

Operating On Groups Of Hosts

Selective Runs

    $ mco puppet runonce -W "cluster=a roles::webserver"
     * [ ============================================================> ] 5 / 5

    Finished processing 5 / 5 hosts in 90.06 ms

Slide callouts: cluster=a is a Facter fact; roles::webserver is a Puppet Class.
Run using a filter: all web servers with fact cluster=a.

Selective Runs

    $ mco puppet runonce -S "resource('File[/srv/www]').managed=true"
     * [ ============================================================> ] 5 / 5

    Finished processing 5 / 5 hosts in 90.06 ms

Slide callout: File[/srv/www] can be any Puppet resource.
Run using a filter: nodes where we manage /srv/www.

Selective Runs

    $ mco puppet runonce -S "resource().failed_resources>5 and resource().config_version=xyz"
     * [ ============================================================> ] 5 / 5

    Finished processing 5 / 5 hosts in 90.06 ms

Run using a filter: nodes whose most recent run had config_version xyz and more than 5 resource failures.

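The compound filter can be read as a predicate evaluated against each node's most recent run. A sketch in Python with made-up data: the field names mirror those in the filter, but the dictionary layout, node names and values are assumptions for illustration, not MCollective's data format:

```python
# Hypothetical last-run data per node; values are invented for the example.
last_runs = {
    "node1.example.net": {"failed_resources": 7, "config_version": "xyz"},
    "node2.example.net": {"failed_resources": 0, "config_version": "xyz"},
    "node3.example.net": {"failed_resources": 9, "config_version": "abc"},
}

def matches(run):
    """Both conditions must hold, mirroring the 'and' in the filter."""
    return run["failed_resources"] > 5 and run["config_version"] == "xyz"

selected = sorted(node for node, run in last_runs.items() if matches(run))
print(selected)
```

Only node1 satisfies both conditions here: node2 has too few failures and node3 ran a different config_version.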
Roll Out A Change Quickly

    $ mco puppet runall 7
    2013-01-19 20:58:59: Running all nodes with a concurrency of 7
    2013-01-19 20:58:59: Discovering enabled Puppet nodes to manage
    2013-01-19 20:59:02: Found 11 enabled nodes
    2013-01-19 20:59:06: node3.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:07: node1.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:09: node4.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:10: node6.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:12: node0.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:13: node5.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:17: Currently 7 nodes applying the catalog; waiting for less than 7
    2013-01-19 20:59:21: Currently 7 nodes applying the catalog; waiting for less than 7
    2013-01-19 20:59:25: node9.example.net schedule status: Puppet is currently applying a catalog, cannot run now
    2013-01-19 20:59:29: node8.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:33: Currently 7 nodes applying the catalog; waiting for less than 7
    2013-01-19 20:59:38: node2.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:41: Currently 7 nodes applying the catalog; waiting for less than 7
    2013-01-19 20:59:46: middleware.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:50: Currently 7 nodes applying the catalog; waiting for less than 7
    2013-01-19 20:59:55: node7.example.net schedule status: Started a background Puppet run

Runs all nodes with a maximum concurrency.

Roll Out A Change Quickly

    2013-01-19 20:58:59: Running all nodes with a concurrency of 7
    2013-01-19 20:58:59: Discovering enabled Puppet nodes to manage
    2013-01-19 20:59:02: Found 11 enabled nodes

Does not attempt to manage disabled nodes.

Roll Out A Change Quickly

    2013-01-19 20:59:02: Found 11 enabled nodes
    2013-01-19 20:59:06: node3.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:07: node1.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:09: node4.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:10: node6.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:12: node0.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:13: node5.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:17: Currently 7 nodes applying the catalog; waiting for less than 7

Starts the first 6 quickly, but takes into account that an administrator is doing 1 other run at the same time.

Roll Out A Change Quickly

    2013-01-19 20:59:17: Currently 7 nodes applying the catalog; waiting for less than 7
    2013-01-19 20:59:21: Currently 7 nodes applying the catalog; waiting for less than 7
    2013-01-19 20:59:25: node9.example.net schedule status: Puppet is currently applying a catalog, cannot run now
    2013-01-19 20:59:29: node8.example.net schedule status: Started a background Puppet run

node9 was already being run by an administrator or by the normal schedule, so it was skipped and the next node was started.

Roll Out A Change Quickly

    2013-01-19 20:59:29: node8.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:33: Currently 7 nodes applying the catalog; waiting for less than 7
    2013-01-19 20:59:38: node2.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:41: Currently 7 nodes applying the catalog; waiting for less than 7
    2013-01-19 20:59:46: middleware.example.net schedule status: Started a background Puppet run
    2013-01-19 20:59:50: Currently 7 nodes applying the catalog; waiting for less than 7
    2013-01-19 20:59:55: node7.example.net schedule status: Started a background Puppet run

Regularly checks the concurrency and starts more nodes as soon as possible. Average node run time 34.39s, total time 55 seconds.

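The behaviour in the log (start nodes while fewer than 7 are applying, skip a node that is already running, wait when the limit is reached) can be illustrated with a small simulation. This is a sketch of the scheduling idea only, not the runall implementation: time is modelled in abstract ticks, every run takes the same number of ticks, and the function and parameter names are hypothetical:

```python
def run_all(nodes, concurrency, run_ticks, already_running=None):
    """Simulate starting runs while fewer than `concurrency` nodes are applying.

    already_running maps a node name to the ticks its externally started run
    still needs (an administrator's run or the normal schedule).
    Returns (started, skipped, peak number of concurrent runs).
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    active = dict(already_running or {})   # node -> ticks remaining
    pending = list(nodes)
    started, skipped = [], []
    peak = len(active)
    while pending:
        # Start nodes for as long as there is spare capacity.
        while pending and len(active) < concurrency:
            node = pending.pop(0)
            if node in active:
                skipped.append(node)       # already applying a catalog
                continue
            active[node] = run_ticks
            started.append(node)
        peak = max(peak, len(active))
        # Wait one tick, then drop the runs that have finished.
        active = {n: t - 1 for n, t in active.items() if t > 1}
    return started, skipped, peak

nodes = [f"node{i}.example.net" for i in range(10)] + ["middleware.example.net"]
started, skipped, peak = run_all(
    nodes, concurrency=7, run_ticks=3,
    already_running={"node9.example.net": 5},
)
print(len(started), skipped, peak)
```

As on the slides, the first 6 nodes start at once because 1 run is already in progress, node9 is skipped when its turn comes because it is still applying a catalog, and the number of concurrent runs never exceeds the limit.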
Roll Out A Change Slowly

    $ mco puppet runonce --batch 5 --batch-sleep 300
     * [ ============================================================> ] 11 / 11

    Finished processing 11 / 11 hosts in 903686.29 ms

Slide callout: --batch-sleep 300 means wait 5 minutes.
Does runonce in batches of 5, with a 5 minute sleep per batch. ^C after any batch to stop. 15 minute total run time.

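The 15 minute figure follows from simple arithmetic. A sketch in Python, assuming a sleep follows every batch including the last one, which is what the reported 903686.29 ms suggests; the helper name is hypothetical:

```python
import math

def batch_plan(node_count, batch_size, batch_sleep):
    """Return (number of batches, total seconds spent sleeping)."""
    batches = math.ceil(node_count / batch_size)
    # Assumption: one sleep per batch, including after the final batch.
    return batches, batches * batch_sleep

batches, sleep_total = batch_plan(node_count=11, batch_size=5, batch_sleep=300)
print(f"{batches} batches, {sleep_total / 60:.0f} minutes of sleep")
```

11 nodes in batches of 5 gives 3 batches (5, 5 and 1 nodes) and 900 seconds of sleep; the remaining few seconds of the reported time are the requests themselves.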
Performance Analysis

    $ mco find -S "resource().config_retrieval_time > 30"
    dev3.example.net
    dev4.example.net
    dev7.example.net
    dev6.example.net
    dev8.example.net
    dev9.example.net
    dev10.example.net

Find machines with config_retrieval_time over 30 seconds: all the dev servers.

Maintenance Windows and Access Control

Puppet State As ACL

    policy default deny
    allow cert=manager enable disable * *
    allow cert=sysadmin runonce status * *
    allow cert=developer * environment=development *

Only cert=manager can enable and disable the Puppet Agent, indicating maintenance periods.

Puppet State As ACL

    policy default deny
    allow cert=manager stop start * *
    allow cert=noc stop start puppet().enabled=false
    allow cert=developer * environment=development *

NOC can start and stop services only during a maintenance window. Manager user can always override maintenance windows.

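How such a policy gates requests on the Puppet state can be sketched in Python. This assumes rules are checked in order, the first matching rule decides, and the default policy applies when nothing matches. The Rule class, the node dictionary and its keys are hypothetical stand-ins, not the authorization plugin's implementation:

```python
from dataclasses import dataclass
from typing import Callable, Set

@dataclass
class Rule:
    verdict: str                        # "allow" or "deny"
    caller: str                         # e.g. "cert=noc"
    actions: Set[str]                   # {"*"} matches any action
    condition: Callable[[dict], bool]   # evaluated against node state

def authorize(rules, caller, action, node, default="deny"):
    """Return the verdict of the first matching rule, else the default."""
    for rule in rules:
        if rule.caller != caller:
            continue
        if "*" not in rule.actions and action not in rule.actions:
            continue
        if not rule.condition(node):
            continue
        return rule.verdict
    return default

def always(node):
    return True

# The second policy from the slides, with node state as a plain dictionary.
rules = [
    Rule("allow", "cert=manager", {"stop", "start"}, always),
    Rule("allow", "cert=noc", {"stop", "start"},
         lambda node: node["puppet_enabled"] is False),
    Rule("allow", "cert=developer", {"*"},
         lambda node: node["environment"] == "development"),
]

in_window = {"puppet_enabled": False, "environment": "production"}
normal = {"puppet_enabled": True, "environment": "production"}
print(authorize(rules, "cert=noc", "stop", in_window))   # during maintenance
print(authorize(rules, "cert=noc", "stop", normal))      # outside maintenance
```

Disabling Puppet on a node is what opens the maintenance window: the NOC rule only matches while the agent is disabled, whereas the manager rule carries no such condition.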
What is MCollective?
• Ruby framework for writing Orchestration systems
• Provides Authentication, Authorization and Auditing
• No direct communication between client and nodes
