Sensu at nycdevops Meetup

My sensu tutorial from the nycdevops Meetup on November 14, 2013.

  1. SENSE AND SENSU-BILITY
     Painless Metrics And Monitoring In The Cloud with Sensu
     Bethany Erskine
     nycdevops Meetup
     http://github.com/skymob/sensu-tutorial
     Thursday, November 14, 13
  2. DO YOU LOVE YOUR MONITORING SETUP?
  3. #MONITORINGLOVE
  4. MY STORY
     + (╯︵╰,)
  7. WHY SENSU
     ✓ Ruby
     ✓ Plugins can be written in any language
     ✓ community
     ✓ sensu-chef cookbook
  8. WHY SENSU
     ✓ re-use Nagios checks
     ✓ metrics and checks all collected by one system
     ✓ easy to scale
     ✓ Graphite integration
  9. WHY SENSU
     ✓ “Can I do X with Sensu?” Probably!
  10. WHY SENSU
  11. WHY SENSU?
      ✓ Sensu source is well-written and easy to parse
      ✓ https://github.com/sensu
  12. WHY SENSU?
      ✓ sensu-community-plugins
      ✓ 80 contributors
      ✓ over 600 plugins
      https://github.com/sensu/sensu-community-plugins
  13. TODAY at PAPERLESS
      Two Sensu environments (prod/testing)
      ~250-275 instances of sensu-client
      4-6 sensu-server instances
      25k metrics/hour to Graphite
      1 custom dashboard
      1 custom CLI
  14. RESOURCES
      ✓ All of our Sensu infrastructure is virtualized.
      ✓ We typically give a sensu-server box 1.5GB RAM and 4 processors, scaling up RAM for any box running more than one Sensu service on it.
      ✓ 4GB RAM for a monolithic Sensu install (Rabbit, Redis, all Sensu components on one box)
  15. AS WE GREW
      Growing pains and lessons learned...
  16. NEEDS MORE SENSU
      ✓ High load on Sensu server
      ✓ Backed-up queues in RabbitMQ
      ✓ TIP: set up a check to monitor the RabbitMQ ready queue size; you'll want an email when the queue grows above 10K and stays there
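
As a rough illustration of the tip above, a queue-depth check could look like the following Python sketch. It assumes the queue data comes from the RabbitMQ management API's `/api/queues` listing, which reports a `messages_ready` count per queue. The function name and the 50K critical threshold are hypothetical; only the 10K level comes from the slide. Exit codes follow the Nagios plugin convention that Sensu checks reuse.

```python
import json

# Nagios-style exit codes, which Sensu checks also use.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_ready_queue(queues_json, warn=10000, crit=50000):
    """Return (status, message) for the queue with the most ready messages.

    queues_json is the JSON text of a queue listing (assumed shape:
    a list of objects with "name" and "messages_ready").
    The crit default is a made-up value for illustration.
    """
    try:
        queues = json.loads(queues_json)
    except ValueError:
        return UNKNOWN, "could not parse queue data"
    if not queues:
        return UNKNOWN, "no queues reported"

    worst = max(queues, key=lambda q: q.get("messages_ready", 0))
    depth = worst.get("messages_ready", 0)
    message = "%s has %d ready messages" % (worst.get("name", "?"), depth)

    if depth >= crit:
        return CRITICAL, message
    if depth >= warn:
        return WARNING, message
    return OK, message


if __name__ == "__main__":
    sample = json.dumps([
        {"name": "results", "messages_ready": 12000},
        {"name": "keepalives", "messages_ready": 3},
    ])
    status, message = check_ready_queue(sample)
    print(status, message)
```

The "and stays there" part of the tip is not handled here; one option is to let the handler act only after several consecutive occurrences of the event.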
  17. HOW TO SCALE
      ✓ Add more sensu-server instances
      ✓ No special configuration needed
      ✓ checks will be distributed in round-robin fashion to the sensu-servers
  18. GRAPHITE PAINS
      ✓ symptoms: backed up queues in RabbitMQ, spotty graphs
      ✓ cluster couldn’t keep up with the large amount of metrics we were now serving it via AMQP
  19. GRAPHITE PAINS
      ✓ Solution: stop collecting metrics every 10 seconds (excessive!)
      ✓ moved staging metrics to staging Graphite cluster
      ✓ Moved prod Graphite cluster to SSD
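
To see why the collection interval matters so much, here is a small back-of-the-envelope sketch in Python. The series count and the 60-second interval are made-up numbers for illustration; the slides only say that 10 seconds was excessive.

```python
def points_per_hour(num_series, interval_seconds):
    """Datapoints sent per hour for num_series metrics at a fixed interval."""
    return num_series * 3600 // interval_seconds


if __name__ == "__main__":
    series = 1000  # hypothetical number of metric series
    every_10s = points_per_hour(series, 10)
    every_60s = points_per_hour(series, 60)
    print("10s interval:", every_10s, "points/hour")
    print("60s interval:", every_60s, "points/hour")
    print("reduction factor:", every_10s // every_60s)
```

Under these assumptions, moving from a 10-second to a 60-second interval cuts the datapoints flowing through RabbitMQ into Graphite by a factor of six.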
  20. THE MIGRATION
      or, How To Quit Nagios in Ten Easy Steps
  21. STEP 1: NUKE AND PAVE
  22. STEP 2: PLAN
      METRICS AND MONITORING SURVEY
  23. METRICS AND MONITORING SURVEY
  24. STEP 3: DEFINE GLOBALS
      ✓ CHECKS: must be actionable!
      ✓ METRICS: go nuts
      ✓ HANDLERS: EMAIL for everything initially, added Pagerduty later.
  25. OUR GLOBALS
      ✓ CHECKS: disk usage, swap usage, zombie processes, RO filesystems
      ✓ METRICS: vmstat, disk usage, cpu, memory, interface and disk perf
      ✓ HANDLERS: Email, Campfire, Pagerduty
  26. STEP 4: DEFINE SPECIFICS
      ✓ For each server role, define additional states to be checked and alerted on:
      ✓ Process Checks
      ✓ System Checks
      ✓ Service Checks
      ✓ Service Metrics
  27. STEP 5: SET UP A PLACE TO TEST
      ✓ Set up a permanent testing Sensu stack using your CM tool of choice
      ✓ we used sensu-chef cookbook
  28. STEP 6: SET A WORKFLOW
      ✓ Develop and document a workflow for implementing, testing, deploying and signing off on checks
      ✓ You’ll get the best coverage if anyone (developers or ops) can easily add checks and metrics to Sensu
  29. EXAMPLE WORKFLOW
      ✓ add new sensu_check definitions to the appropriate cookbook in Chef
      ✓ deploy new check to staging env using Chef
      ✓ Pull Request with sample graphs or alerts
      ✓ Code Review from colleague
      ✓ Deploy to Prod
  30. SENSU IN CHEF
  31. STEP 7: EXECUTE WORKFLOW
      ✓ Starting with the low-hanging fruit (plugins that already existed in sensu-community-plugins repository), configure and deploy each check in the worksheet to the testing Sensu server
      ✓ deploy sensu-client to a few select machines
  32. STEP 8: WATCH THE WATCHER
      ✓ Set up some bare-minimum 3rd party monitoring for the Sensu servers
  34. MONITOR THE MONITOR
      ✓ Other ideas: have Testing Sensu monitor Prod Sensu
      ✓ Sensu can collect metrics about itself
  35. STEP 9: ROLLOUT
      ✓ Deploy your Production server infrastructure
      ✓ Roll out the client and checks to the rest of your prod environments.
  36. STEP 10: TUNE
      ✓ Expect to need to tune thresholds and alert occurrences.
      ✓ Laissez le bon alertes roulent! (Let the good alerts roll!)
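
Tuning "alert occurrences" usually means deciding how many consecutive failures should pass before anyone is notified, and how often to repeat the notification afterwards. The Python sketch below shows one way to express that rule. It is a hypothetical helper, not Sensu's built-in filtering logic; the only thing taken from Sensu is that events carry an `occurrences` count, and the default numbers are illustrative.

```python
def should_notify(occurrences, min_occurrences=3, refresh_every=30):
    """Decide whether to notify for the Nth consecutive occurrence of an event.

    Notifies on the min_occurrences-th occurrence, then again every
    refresh_every occurrences after that. Both defaults are illustrative.
    """
    if occurrences < min_occurrences:
        return False
    return (occurrences - min_occurrences) % refresh_every == 0


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 33):
        print(n, should_notify(n))
```

Raising `min_occurrences` trades slower notification for fewer alerts on brief flaps; raising `refresh_every` reduces repeat emails for long-running problems.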
  37. SENSU ARCHITECTURE
  38. SENSU ARCHITECTURE
  39. OMNIBUS INSTALLER
      is awesome
  40. LET’S PLAY WITH SENSU
      If you haven’t been able to get your sandboxes up and running, please pair with someone near you.
  41. SANDBOX GOALS
      ✓ Get familiar with Sensu configuration
      ✓ Install a Handler
      ✓ Deploy a check
      ✓ Trigger an alert on that check
      ✓ Give you something to take home and hack on
  42. OOPS
      If you mess anything up:
      vagrant halt; vagrant up
      Worst case:
      vagrant destroy; vagrant up
  43. TWO VIRTUALBOXES
      Sensu-Server and Sensu-Client
      Vagrant/Chef
      CentOS 6.4
      Sensu Version 0.10.2
  44. SENSU CONFIGURATION
      ✓ Please open up a terminal and SSH into both your sensu-server and sensu-client VMs
      ✓ sudo su
      ✓ cd /etc/sensu
  45. SENSU CONFIGURATION
      ✓ /etc/sensu/config.json - config for redis, rabbitmq, api and dashboard
      ✓ /etc/sensu/conf.d/ - checks go here
      ✓ /etc/sensu/conf.d/client.json - client configuration, subscriptions
      ✓ /etc/sensu/{extensions|handlers|mutators|plugins}
  46. TRIGGER AN ALERT!
      On sensu-client:
      service sensu-client stop
  47. CHECK YOUR DASHBOARD
      ✓ Open a web browser and go to http://10.254.254.10:8080
      ✓ username: admin / password: secret
  48. HANDLERS
      ✓ A HANDLER takes action on an event using a pipe, TCP, UDP, AMQP, or a set of other handlers
      ✓ Examples: send an email, send event to Pagerduty, send metrics to Graphite
      ✓ Default is “debug”
  49. HANDLER EXAMPLES
      ✓ BASIC: send an email to ops@
      ✓ ADVANCED: attempt to remediate the alert (e.g. run a custom script that spins up additional EC2 instances)
  50. HANDLERS
      ✓ Let’s configure an EMAIL handler to send an informative email for an event.
      ✓ The /etc/sensu/handlers/mailer.rb plugin is installed for you; we just need to configure and install it
  51. CONFIGURE THE PLUGIN
      ON SENSU SERVER:
      vim /etc/sensu/conf.d/handlers/mailer.json
      {
        "mailer": {
          "mail_from": "sensu@you.com",
          "mail_to": "you@yourdomain.com"
        }
      }
  52. CONFIGURE THE HANDLER
      cp /etc/sensu/conf.d/handlers/default.json /etc/sensu/conf.d/handlers/email.json
      vim /etc/sensu/conf.d/handlers/email.json
  53. EMAIL.JSON
      {
        "handlers": {
          "email": {
            "type": "pipe",
            "command": "/etc/sensu/handlers/mailer.rb"
          }
        }
      }
  54. CHECK GEM DEPENDENCIES
      /opt/sensu/embedded/bin/gem list | grep mail
  55. FIX PERMISSIONS
      chown -R .sensu /etc/sensu/conf.d/
  56. RESTART SERVICES
      service sensu-server restart
      tail -100 /var/log/sensu/sensu-server.log | grep mail
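
The email handler above is a "pipe" handler: Sensu runs the configured command and writes the event to it as JSON on standard input. The Python sketch below illustrates that contract only; it is not the mailer.rb plugin and does not send mail. It assumes the event carries `client.name`, `check.name`, `check.status` and `check.output`, and the sample values are invented.

```python
import io
import json

STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def format_subject(event):
    """Build a one-line summary such as 'CRITICAL - host/check'."""
    client = event.get("client", {}).get("name", "unknown-client")
    check = event.get("check", {})
    status = STATUS_NAMES.get(check.get("status"), "UNKNOWN")
    return "%s - %s/%s" % (status, client, check.get("name", "unknown-check"))


def handle(stream):
    """Read one JSON event from stream and return (subject, body)."""
    event = json.load(stream)
    body = event.get("check", {}).get("output", "").strip()
    return format_subject(event), body


if __name__ == "__main__":
    # Demo with an in-memory event; a real pipe handler would
    # call handle(sys.stdin) instead.
    sample = io.StringIO(json.dumps({
        "client": {"name": "sensu-client"},
        "check": {"name": "keepalive", "status": 2,
                  "output": "No keep-alive sent from client\n"},
        "occurrences": 1,
    }))
    subject, body = handle(sample)
    print(subject)
    print(body)
```

Because the interface is just standard input, a pipe handler can be written in any language, in the same way that plugins can.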
  57. CHECKS
      ✓ Sensu-client runs CHECKS that are defined and scheduled either locally (standalone) or on the sensu-server (subscription).
      ✓ A CHECK sends a RESULT as an EVENT to a HANDLER - this applies to anything - service checks, metrics, etc
  58. CHECK EXECUTION
      ✓ Either scheduled by the server (subscription) or scheduled by the client (standalone)
      ✓ Today we will configure a subscription-based check on the server that will run on our client
  59. LET’S CONFIGURE A CHECK
      ✓ Use check-procs.rb to make sure at least one instance of cornbread is running
  60. DETERMINE OUR CHECK COMMAND
      On your SENSU CLIENT:
      /opt/sensu/embedded/bin/ruby /etc/sensu/plugins/check-procs.rb -p cornbread -W1
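
To make the check contract concrete, here is a minimal Python sketch of what a process check like the command above does: count processes matching a pattern and report a status through a Nagios-style code (0 OK, 1 WARNING, 2 CRITICAL). This is an illustration, not the check-procs.rb implementation. It assumes `-W1` means "warn when fewer than 1 process matches", and the helper names are hypothetical.

```python
import re
import subprocess

OK, WARNING, CRITICAL = 0, 1, 2


def list_command_lines():
    """Return the command line of every running process (uses `ps`).

    Note: a real check would also need to exclude its own process.
    """
    out = subprocess.check_output(["ps", "-A", "-o", "args="])
    return out.decode("utf-8", "replace").splitlines()


def check_procs(command_lines, pattern, warn_under=1):
    """Return (status, message) for processes whose command line matches pattern."""
    count = sum(1 for line in command_lines if re.search(pattern, line))
    message = "found %d process(es) matching %r" % (count, pattern)
    if count < warn_under:
        return WARNING, "WARNING: " + message
    return OK, "OK: " + message


if __name__ == "__main__":
    # Demo with a fixed process list; pass list_command_lines() for real data.
    status, message = check_procs(["/usr/sbin/sshd -D", "cornbread --serve"], "cornbread")
    print(status, message)
```

A standalone script would print the message and exit with the status code, which is how the client learns the result.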
  61. INSTALL OUR CHECK
      ✓ On your SENSU SERVER:
      ✓ vim /etc/sensu/conf.d/checks/cornbread_process.json
  62. CORNBREAD_PROCESS.JSON
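
The transcript preserves only the file name for this slide, not its contents. As a guess at the general shape, the Python sketch below builds a subscription check definition and prints it as JSON. The keys `command`, `subscribers`, `interval` and `handlers` are standard Sensu check attributes; the subscription name `"all"` and the 60-second interval are assumptions, not values from the talk.

```python
import json


def build_check(name, command, subscribers, interval=60, handlers=("default",)):
    """Return a Sensu-style check definition as a plain dict."""
    return {
        "checks": {
            name: {
                "command": command,
                "subscribers": list(subscribers),
                "interval": interval,
                "handlers": list(handlers),
            }
        }
    }


definition = build_check(
    "cornbread_process",
    "/opt/sensu/embedded/bin/ruby /etc/sensu/plugins/check-procs.rb -p cornbread -W1",
    subscribers=["all"],   # hypothetical subscription name
    interval=60,           # hypothetical interval, in seconds
    handlers=["email"],    # the handler configured earlier in the tutorial
)

if __name__ == "__main__":
    print(json.dumps(definition, indent=2))
```

For the check to run, the client's subscriptions in client.json must include whichever subscription name the check lists.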
  63. RESTART SERVICES
      service sensu-server restart
      tail -100 /var/log/sensu/sensu-server.log | grep cornbread
  64. CHECK YOUR DASHBOARD
  65. CHECK YOUR EMAIL
  66. SENSU API
      ✓ REST API
      ✓ HTTP/4567
      ✓ on SENSU SERVER try:
      curl -l http://localhost:4567/events | python -mjson.tool
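
The same `/events` endpoint can be queried from a script. The Python sketch below fetches the event list and counts events per status code. The exact JSON shape of an event varies by Sensu version, so the code is an assumption-laden sketch: it accepts a top-level `status` field or a nested `check.status`. The function names are hypothetical.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_events(base_url="http://localhost:4567"):
    """GET /events from the Sensu API and return the parsed JSON list."""
    with urlopen(base_url + "/events", timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def count_by_status(events):
    """Return {status_code: number_of_events}."""
    counts = Counter()
    for event in events:
        status = event.get("status")
        check = event.get("check")
        if status is None and isinstance(check, dict):
            status = check.get("status")
        counts[status] += 1
    return dict(counts)


if __name__ == "__main__":
    # Demo with canned data; call fetch_events() against a live API instead.
    sample = [
        {"client": "sensu-client", "check": "keepalive", "status": 2},
        {"client": "sensu-client", "check": "cornbread_process", "status": 1},
    ]
    print(count_by_status(sample))
```

A summary like this is a small first step toward the command-line tooling and custom dashboards mentioned in the tips later in the talk.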
  67. SENSU SERVICES
      ✓ Sensu API
      ✓ Sensu Server
      ✓ Sensu Client
      ✓ Sensu Dashboard
  68. EVERYTHING OK?
      ✓ /etc/init.d/sensu-service {client|server|api|dashboard} {start|stop|status|restart}
      ✓ ps -ef | grep sensu
      ✓ tail -f /var/log/sensu/*.log
      ✓ curl -l localhost:4567/info
  69. COOL SENSU TRICKS
  70. SEND DIRECTLY TO SENSU
      netcat to: 127.0.0.1:3030
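
The client's local socket accepts a JSON check result, so an application can report its own state without a scheduled check. The Python sketch below does the same job as piping JSON into netcat. It assumes the socket listens on the loopback address 127.0.0.1, port 3030, and that a result needs `name`, `output` and `status`; the check name in the demo is invented.

```python
import json
import socket


def build_result(name, output, status=0):
    """Serialize a check result for the Sensu client socket."""
    return json.dumps({"name": name, "output": output, "status": status})


def send_result(payload, host="127.0.0.1", port=3030):
    """Send one JSON result to the local sensu-client socket over TCP."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload.encode("utf-8"))


if __name__ == "__main__":
    payload = build_result("app_heartbeat", "heartbeat missed", status=1)
    print(payload)
    # send_result(payload)  # uncomment on a host running sensu-client
```

The result then flows through the same handlers as any scheduled check.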
  71. AGGREGATE ALERTS
      ✓ Alert when X% of checks are not OK
      ✓ Handy for preventing alert floods
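
The aggregate idea reduces to simple arithmetic: gather the latest status of the same check across many clients, compute the share that is not OK, and alert only when that share crosses a threshold. The Python sketch below shows the calculation. It is an illustration of the rule, not Sensu's aggregates API, and the function name and threshold are hypothetical.

```python
def aggregate_alert(statuses, max_non_ok_percent):
    """Return (should_alert, non_ok_percent) for a list of check status codes.

    A status of 0 means OK; anything else counts as not OK.
    """
    if not statuses:
        return False, 0.0
    non_ok = sum(1 for status in statuses if status != 0)
    percent = 100.0 * non_ok / len(statuses)
    return percent >= max_non_ok_percent, percent


if __name__ == "__main__":
    # One failing web server out of four stays below a 50% threshold.
    print(aggregate_alert([0, 0, 0, 2], 50))
    # Three out of four failing crosses it.
    print(aggregate_alert([0, 2, 2, 1], 50))
```

With this rule a single failing node produces no page, while a widespread failure produces one alert instead of one per node.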
  72. MY SENSU TIPS
      ✓ install the RabbitMQ management web interface and bookmark it (see http://10.254.254.10:15672/#/ )
      ✓ lock your plugins’ gem dependency versions
  73. TIPS TIPS TIPS
      ✓ have alternate ways to access your Dashboard information
      ✓ we integrated our command-line developer tools with Sensu API
      ✓ we also created our own Ops dashboard that queries Sensu, Graphite and our app for data
  74. MORE TIPS
      ✓ Put NGINX in front of sensu-dashboard
  75. HA SENSU
      ✓ Redundancy is easy (bring up more sensu-servers)
      ✓ Making Redis and RabbitMQ HA is more challenging
      ✓ We’re still running one solitary Redis and RabbitMQ but are OK with this risk for now
  76. WHERE TO GO FOR HELP
      ✓ IRC: #sensu - freenode
      ✓ sensu-users mailing list
      ✓ http://docs.sensuapp.org
  77. QUESTIONS
  78. THANK YOU
      bethany@paperlesspost.com
      @skymob - twitter
      robotwitharose - #sensu on IRC (freenode)