Sharing Sensu with Multiple Teams
Deployment & Configuration using Ansible
David Schroeder
August 23, 2018

Sharing Sensu with Multiple Teams using Ansible


For the last two years, David Schroeder, Software Engineer at Viasat, Inc., has supported a single Sensu cluster shared by multiple teams, each with their own requirements, thresholds, and contacts. How does it all work, and how can these different uses coexist?

This talk from Sensu Summit 2018 describes how Ansible is used to configure and deploy Sensu for multiple teams, how much autonomy is granted to each one, and where the bottlenecks are.

Published in: Technology


  1. Sharing Sensu with Multiple Teams: Deployment & Configuration using Ansible. David Schroeder, August 23, 2018
  2. Overview: Short story shorter
  3. Team Requirements: "Can Sensu do #{this_thing}?"
     › Environment segregation
       – Access limits
       – Contacts
     › Different deployment strategies
     › Different thresholds
       – Both keepalive and other checks
     › Different checks, different platforms (even Windows)
     › API calls
       – Creating silence
       – Gather check results
  4. Team Requirements: "Sensu can do #{this_thing}!"
     › Sensu Enterprise RBAC!
     › Contact routing!
     › Check parameter tokenization!
     › API tokens!
     › Custom configuration anywhere and everywhere!
  5. (Slide contains no text.)
  6. Ansible Roles
     sensu-client / sensu-server / sensu-enterprise / rabbitmq-server
     › Installs & configures
     › Satisfies dependencies
     › Creates client.json
       – Maintenance mode
     › Configures checks
       – Pub/sub
       – Aggregate
       – API endpoint
       – Ping
     › Installs handlers & standalone check scripts
     › Configures handlers
     › Configures contacts
     › Installs Sensu Enterprise
     › Configures API
     › Configures dashboard
       – RBAC through LDAP
     › Installs and configures RabbitMQ cluster
     › Installs and configures Redis Sentinel
     › Fetches certificates
     sensu-winclient
     › Generates configuration
     › Bundles installer & dependencies
     sensu-standalone
     › Subrepo of community sensu-ansible role
     redis-server
     › Installs and configures Redis
     › Installs and configures Graphite
  7. Ansible Roles (same role breakdown as the previous slide), with an added callout:
     › Shared role, "galaxy" style
     › Included as 'subrepo'
  8. Ansible Structure: Drilling Down (callout: Team Environments)
     › sensu/
       – group_vars/
         ▪ framework_pdx_dev/
         ▪ framework_pdx_stage/
         ▪ framework_pdx_prod/
         ▪ sensu_one/
         ▪ sensu_two/
       – roles/
         ▪ sensu_client/
         ▪ sensu_winclient/
         ▪ sensu_server/
         ▪ sensu_enterprise/
  9. Ansible Structure: Drilling Down (same directory tree as the previous slide; callout: Sensu Clusters)
  10. Ansible Structure: Per Environment
      › sensu/
        – group_vars/
          ▪ infrastructure_pdx_dev/
            – main.yml
            – vault.yml

          ---
          ### Environment Definitions ###########################################
          host_subscriptions:
            - "basic"
            - "framework"
            - "framework_pdx_dev"
          host_environment: "framework_pdx_dev"
          host_contact: "framework"
          # Keepalive thresholds: number of seconds before warning or alerting
          keepalive_warn: 150
          keepalive_crit: 210
          # Set re-notification time (in seconds) for keepalive alarms. Default is 300.
          keepalive_refresh: 3600
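The environment variables on this slide presumably end up in each host's client.json. The following Python sketch shows one way that mapping could look, assuming the Sensu 1.x client keepalive attributes (`thresholds.warning`, `thresholds.critical`, `refresh`). The function name `render_client_config`, the host name, and the placement of `environment` and `contact` are illustrative assumptions, not the template used in the talk.

```python
import json


def render_client_config(host_vars, hostname):
    """Build a Sensu 1.x style client definition from Ansible-like variables.

    Hypothetical helper: the real role renders client.json from a template
    that is not shown on the slides.
    """
    return {
        "client": {
            "name": hostname,
            "subscriptions": list(host_vars["host_subscriptions"]),
            "environment": host_vars["host_environment"],
            "contact": host_vars["host_contact"],
            "keepalive": {
                "thresholds": {
                    "warning": host_vars["keepalive_warn"],
                    "critical": host_vars["keepalive_crit"],
                },
                # The slide notes that the default re-notification time is 300 s.
                "refresh": host_vars.get("keepalive_refresh", 300),
            },
        }
    }


# Values copied from the slide; the host name is made up.
host_vars = {
    "host_subscriptions": ["basic", "framework", "framework_pdx_dev"],
    "host_environment": "framework_pdx_dev",
    "host_contact": "framework",
    "keepalive_warn": 150,
    "keepalive_crit": 210,
    "keepalive_refresh": 3600,
}

config = render_client_config(host_vars, "example-host")
print(json.dumps(config, indent=2))
```

Keeping the thresholds in per-environment group_vars means a team can loosen or tighten keepalive alerting without touching the shared role.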
  11. Ansible Structure: Per Environment
      › sensu/
        – group_vars/
          ▪ infrastructure_pdx_dev/
            – main.yml
            – vault.yml

          # To add a subscription based on server role as included in the hostname,
          # include the subscription name as the key, and hostname pattern as the
          # value. Be sure to escape out backslashes.
          role_patterns:
            framework_zeromq: "-mq00d"
            framework_utility: "^utly"
          # Enable Sensu client socket commands
          enable_client_socket: true
          # Custom client-side configuration
          custom_client_configs:
            checks:
              check_ram:
                warning: 101
                critical: 100
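The `role_patterns` map on this slide ties a subscription name to a hostname pattern. A minimal Python sketch of that idea follows, assuming the patterns are regular expressions searched anywhere in the hostname. The function `subscriptions_for_host` and the host names are hypothetical; the slides do not show how the role actually evaluates the patterns.

```python
import re


def subscriptions_for_host(hostname, base_subscriptions, role_patterns):
    """Return the base subscriptions plus any whose pattern matches the hostname."""
    subscriptions = list(base_subscriptions)
    for subscription, pattern in role_patterns.items():
        # Assumption: a pattern matches if it is found anywhere in the hostname.
        if re.search(pattern, hostname):
            subscriptions.append(subscription)
    return subscriptions


# Patterns copied from the slide.
role_patterns = {
    "framework_zeromq": "-mq00d",
    "framework_utility": "^utly",
}
base = ["basic", "framework", "framework_pdx_dev"]

# Made-up host names for illustration.
print(subscriptions_for_host("utly01", base, role_patterns))
print(subscriptions_for_host("fw-mq00d", base, role_patterns))
print(subscriptions_for_host("web01", base, role_patterns))
```

With this approach a new host picks up role-specific checks from its name alone, so teams do not need a per-host inventory entry for each subscription.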
  12. Ansible Structure: Per Environment
      › sensu/
        – group_vars/
          ▪ infrastructure_pdx_dev/
            – main.yml
            – vault.yml

          ### Communicating with Sensu ##########################################
          # Hostname or IP address of the graphite API server for graph rendering
          graphite_server: "172.16.20.100"
          rabbitmq_params:
            port: 5671
            user: "sensu"
            pass: "{{ vault_rabbitmq['password'] }}"
            host1: "172.16.20.101"
            host1_cert: "{{ vault_rabbitmq['host1_cert'] }}"
            host1_key: "{{ vault_rabbitmq['host1_key'] }}"
            host2: "172.16.20.102"
            host2_cert: "{{ vault_rabbitmq['host2_cert'] }}"
            host2_key: "{{ vault_rabbitmq['host2_key'] }}"
            host3: "172.16.20.103"
            host3_cert: "{{ vault_rabbitmq['host3_cert'] }}"
            host3_key: "{{ vault_rabbitmq['host3_key'] }}"
  13. Ansible Structure: Sensu Clusters
      › sensu/
        – group_vars/
          ▪ sensu_one/
            – main.yml
            – vault.yml
            – aggregatechecks.yml
            – endpoints.yml
            – handlers.yml
            – pingchecks.yml
            – site_checks.yml

          ldap:
            server: "auth.somewhere.out.there"
            port: 636
            roles:
              framework_team:
                name: "framework_team"
                readonly: "false"
                members:
                  - "framework"
                datacenters: []
                subscriptions:
                  - "framework"
  14. Ansible Structure: Sensu Clusters
      › sensu/
        – group_vars/
          ▪ sensu_one/
            – main.yml
            – vault.yml
            – aggregatechecks.yml
            – endpoints.yml
            – handlers.yml
            – pingchecks.yml
            – site_checks.yml

          ldap:
            roles:
              jenkins_api:
                name: "jenkins_api"
                readonly: "false"
                token: "{{ vault_ldap.jenkins_api.token }}"
                members: []
                datacenters: []
                subscriptions: []
                methods:
                  get:
                    - aggregates
                    - clients
                    - silenced
                  post:
                    - silenced
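The `jenkins_api` role above grants a token-holding client GET access to aggregates, clients, and silenced, plus POST access to silenced. As a rough illustration of what such a client might send to create a silence, here is a Python sketch that only builds the request and does not send it. The base URL, the token value, and the `Authorization: token ...` header format are assumptions to check against your Sensu Enterprise dashboard documentation; the payload fields (`subscription`, `check`, `expire`, `reason`) follow the Sensu 1.x `/silenced` API.

```python
import json
import urllib.request


def build_silence_request(base_url, token, subscription, check=None,
                          expire=3600, reason=""):
    """Build (but do not send) a POST /silenced request.

    Hypothetical helper: URL layout and auth header are assumptions.
    """
    payload = {"subscription": subscription, "expire": expire, "reason": reason}
    if check is not None:
        payload["check"] = check
    return urllib.request.Request(
        url=f"{base_url}/silenced",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed token header format for the dashboard API.
            "Authorization": f"token {token}",
        },
        method="POST",
    )


request = build_silence_request(
    "https://sensu.example.invalid",  # placeholder host
    "REPLACE_ME",                     # placeholder token
    "framework_pdx_dev",
    check="check_ram",
    reason="deployment in progress",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it. A deployment job could silence a subscription this way before a rollout and let the entry expire afterwards.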
  15. Ansible Structure: Sensu Clusters
      › sensu/
        – group_vars/
          ▪ sensu_one/
            – main.yml
            – vault.yml
            – aggregatechecks.yml
            – endpoints.yml
            – handlers.yml
            – pingchecks.yml
            – site_checks.yml

          handler_contacts:
            - contacts.json:
                contacts:
                  framework:
                    hipchatter:
                      api_token: ChahL8XeiphohBi2eiceiseehaele5eu1aesahyuu
                      room: 1234
                    mailer:
                      mail_to: frameworkteam.dl@wherever.com
                  sensu_admin:
                    hipchatter:
                      api_token: Aivoubah0iexi6eyioQu0eeThee2Aenu6kohw4qui
                      room: 2345
                    mailer:
                      mail_to: sensuteam.dl@wherever.com
  16. Ansible Structure: Sensu Server Role
      › sensu/
        – roles/sensu-server/
          ▪ vars/
            – main.yml
            – checks.yml
            – filters.yml
            – mutators.yml

          pubsub_checks:
            # Basic Checks
            - check_ram.json:
                checks:
                  check_ram:
                    command: "check-memory-percent.rb -w :::custom.checks.check_ram.warning|95::: -c :::custom.checks.check_ram.critical|98:::"
                    interval: "{{ default_interval }}"
                    subscribers:
                      - basic
                    handlers: "{{ default_handlers }}"
                    occurrences: 5
                    refresh: "{{ default_renotify }}"
                    runbook: "{{ runbook_base_url }}/check_ram"
                    graph: "http://{{ graphite_server }}/render?from={{ graph_time }}&until=now&{{ graph_size }}&target=:::environment:::.:::graphname:::.memory.usedWOBuffersCaches&title=Memory+Used+Without+Buffers+and+Caches&uchiwa_force_image=.jpg"
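The `:::custom.checks.check_ram.warning|95:::` syntax in the command above is a Sensu check token: the client replaces it with the named attribute from its own definition, or with the default after the `|` when the attribute is absent. The Python sketch below imitates that substitution to show how one shared check can carry per-team thresholds. It is a simplified model, not Sensu's implementation, and it assumes `custom_client_configs` is rendered under a `custom` key in client.json.

```python
import re

# :::dotted.path::: or :::dotted.path|default:::
TOKEN_PATTERN = re.compile(r":::([^:|]+)(?:\|([^:]*))?:::")


def substitute_tokens(command, client):
    """Replace check tokens with client attributes, falling back to defaults.

    Returns the substituted command and a list of tokens that could not be
    resolved (no attribute and no default).
    """
    unmatched = []

    def resolve(match):
        path, default = match.group(1), match.group(2)
        value = client
        for key in path.split("."):
            if isinstance(value, dict) and key in value:
                value = value[key]
            else:
                value = None
                break
        if value is None:
            if default is None:
                unmatched.append(path)
                return match.group(0)  # leave the token in place
            return default
        return str(value)

    return TOKEN_PATTERN.sub(resolve, command), unmatched


command = (
    "check-memory-percent.rb "
    "-w :::custom.checks.check_ram.warning|95::: "
    "-c :::custom.checks.check_ram.critical|98:::"
)

# Client with the per-environment override shown on the earlier slide.
override_client = {"custom": {"checks": {"check_ram": {"warning": 101, "critical": 100}}}}
# Client with no override: the defaults after "|" apply.
plain_client = {"name": "example-host"}

print(substitute_tokens(command, override_client)[0])
print(substitute_tokens(command, plain_client)[0])
```

The check is defined once on the server, and each team changes its behavior by setting client attributes in its own group_vars.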
  17. (Slide contains no text.)
  18. Sensu Change Workflow: Pull Request → Code Review → Client Deployment → Server Deployment → Win!
  19. Problems? Let's be honest: yes.
  20. Ongoing Challenges
      01 API calls: Limited availability in RBAC
      02 Dashboard: Missing hosts in Events list
      03 Cleanup: Old checks, forgotten hosts
      04 Bottlenecks
  21. Ongoing Challenges 01: API calls, limited availability in LDAP RBAC
      › Works through RBAC, but without subscription limitations:
        – /clients
        – /clients/:client/history (deprecated)
        – /events (returns all events)
        – /silenced (POST ignores 'begin' field)
      › Does not work at all through RBAC layer:
        – /results
        – /events/:client/
        – /silenced/subscriptions/:subscription
        – /silenced/checks/:check
        – ?filter
      › Good news: support in Sensu 2.0!
  22. Ongoing Challenges 02: Dashboard, missing hosts in Events list
      › If a host matches a subscription in RBAC, but the alerting check does not, it is not visible on the Events page
  23. Ongoing Challenges 03: Cleanup (old checks, forgotten hosts)
  24. Ongoing Challenges 04: Bottlenecks. This guy!
  25. Thank you #monitoringlove
