Monitoring MySQL Replication lag with Prometheus & pt-heartbeat

Lightning talk given at PromCon 2017 about monitoring MySQL replication lag with pt-heartbeat and Prometheus (mysqld_exporter).

  1. Monitoring MySQL Replication Delay with mysqld_exporter & pt-heartbeat
     Julien Pivotto (@roidelapluie)
     PromCon Munich, August 18, 2017
  2. SELECT USER();
     Julien "roidelapluie" Pivotto, @roidelapluie
     Sysadmin at inuits
     Automation, monitoring, HA
     MySQL/MariaDB user/admin/contributor
     Grafana and Prometheus user/contributor
  3. inuits
  4. MySQL Replication
     MySQL Master <-> MySQL Master
     MySQL Master -> MySQL Slave
     MySQL Master -> MySQL Slave -> MySQL Slave
     MySQL Masters -> MySQL Slaves -> MySQL Slaves -> MySQL Slaves
     MySQL Master -> MySQL Slaves
  5. mysqld_exporter
  6. mysqld_exporter
  7. mysqld_exporter is great
     Lots of data
     Lots of alert examples
     Percona's Grafana dashboard collection brings dozens of useful dashboards
  8. Migrating to Prometheus does not mean that we should forget the past ... or lower our monitoring expectations.
  9. pt-heartbeat
     pt-heartbeat is a daemon that updates a row with the current timestamp on the MySQL master every second. The row replicates like any other write, so on the replica you can read the timestamp and compute NOW - timestamp to get the real lag.
     +----------------------------+-----------+
     | ts                         | server_id |
     +----------------------------+-----------+
     | 2017-08-17T16:55:01.001030 |         1 |
     +----------------------------+-----------+
  10. pt-heartbeat
      GPL
      Perl
      Part of Percona Toolkit
  11. pt-heartbeat
      Our previous monitoring tool (Munin) had support for pt-heartbeat. Prometheus mysqld_exporter didn't.
  12. Wait, MySQL has that natively
      mysql> SHOW SLAVE STATUS\G
      ...
      Seconds_Behind_Master: 0
      ...
      aka mysqld_exporter metric: mysql_slave_lag_seconds
  13. Bugs
      Fixes for Seconds_Behind_Master in: 5.7.18, 5.6.36, 5.6.23, 5.6.16.
  14. pt-heartbeat is useful
      Okay, so we had that thing. Now that we are moving to Prometheus, we don't want to lose it.
      Idea: let's implement this!
  15. Pull Request 183
      https://github.com/prometheus/mysqld_exporter/pull/183
      Opened Feb 20
      Merged Feb 21
  16. How it works
      Checks the heartbeat table (SQL query). It does not call the pt-heartbeat CLI, so it is independent of it.
  17. CLI flags
      collect.heartbeat
      collect.heartbeat.database
      collect.heartbeat.table
  18. Metrics
      mysql_heartbeat_stored_timestamp_seconds{server_id="1"}
      mysql_heartbeat_now_timestamp_seconds{server_id="1"}
  19. Recording Lag
      mysql_heartbeat_lag_seconds =
          mysql_heartbeat_now_timestamp_seconds -
          mysql_heartbeat_stored_timestamp_seconds
      https://github.com/prometheus/mysqld_exporter/blob/master/example.rules
  20. Alert
      ALERT MySQLReplicationLag
        IF
            (mysql_heartbeat_lag_seconds > 30)
          AND on (instance)
            (predict_linear(mysql_heartbeat_lag_seconds[5m], 60*2) > 0)
        FOR 1m
        LABELS {
          severity = "critical"
        }
        ANNOTATIONS {
          summary = "MySQL slave replication is lagging",
          description = "The mysql slave replication has fallen behind and is not recovering",
        }
      https://github.com/prometheus/mysqld_exporter/blob/master/example.rules
  21. Contributing to Percona Grafana Dashboards
      Less great
      PR opened Feb 23
      Still open
  22. Takeaways
      Contributing to Prometheus is easy
      pt-heartbeat is the way to monitor MySQL replication lag
      and now it's available in Prometheus
      Any volunteers to rewrite pt-heartbeat in Go? :)
  23. Contact
      Julien Pivotto
      roidelapluie
      roidelapluie@inuits.eu
      Inuits
      https://inuits.eu
      info@inuits.eu
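
The NOW - timestamp idea from slide 9 can be sketched in a few lines of Python. This is only an illustration, not how pt-heartbeat or mysqld_exporter implement it: the function name and the "now" value are made up, and it assumes the clocks are in sync and both timestamps are in the same timezone.

```python
from datetime import datetime


def heartbeat_lag_seconds(stored_ts: str, now: datetime) -> float:
    """Return NOW - timestamp, in seconds, for one heartbeat row.

    stored_ts is the `ts` column as an ISO 8601 string, as shown on slide 9.
    """
    stored = datetime.fromisoformat(stored_ts)
    return (now - stored).total_seconds()


# The row from slide 9, read on the replica 2.5 seconds after it was written.
# The "now" value is hypothetical and chosen only for the example.
row_ts = "2017-08-17T16:55:01.001030"
replica_now = datetime(2017, 8, 17, 16, 55, 3, 501030)

print(heartbeat_lag_seconds(row_ts, replica_now))  # 2.5
```

Because the timestamp is written on the master and read on the replica, any clock drift between the two hosts shows up directly in the result, so time synchronisation matters.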

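The alert on slide 20 fires only when the lag is above 30 seconds and a linear extrapolation says it will still be positive two minutes from now. Below is a rough Python sketch of that condition, assuming a plain least-squares fit over (timestamp, lag) samples. It approximates what predict_linear does, is not Prometheus's implementation, and does not model the FOR 1m clause; both function names and the sample data are hypothetical.

```python
def predict_linear(samples, horizon_seconds):
    """Fit a least-squares line through (timestamp, value) samples and
    extrapolate it horizon_seconds past the last sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must have distinct timestamps")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    intercept = mean_v - slope * mean_t
    last_t = samples[-1][0]
    return intercept + slope * (last_t + horizon_seconds)


def should_alert(samples, threshold=30.0, horizon_seconds=120):
    """Mirror the IF clause: lag > threshold AND predicted lag > 0."""
    current_lag = samples[-1][1]
    return current_lag > threshold and predict_linear(samples, horizon_seconds) > 0


# Hypothetical lag series, one sample per minute over a 5m window.
growing = [(0, 40), (60, 50), (120, 60), (180, 70), (240, 80)]
recovering = [(0, 200), (60, 160), (120, 120), (180, 80), (240, 40)]

print(should_alert(growing))     # True: lag is high and still climbing
print(should_alert(recovering))  # False: lag is high but on track to reach zero
```

The second condition is what keeps the alert quiet while a replica is catching up after a burst of writes.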