Build Full Stack Monitoring and Notification with Prometheus
Jazz Yao-Tsung Wang
Initiator of Taiwan Data Engineering Association
Co-Founder of Taiwan Hadoop User Group
Shared at 2018-02-10 <TDEA Workshop 2018 Q1>
Hello!
I am Jazz Wang
Co-Founder of Hadoop.TW
Initiator of Taiwan Data Engineering Association (TDEA)
Hadoop Evangelist since 2008.
Open Source Promoter. System Admin (Ops).
- 11 years (2002/08 ~ 2014/02) Researcher in HPC field.
- 2 years (2014/03 ~ 2016/04) Assistant Vice President (AVP),
Product Management of ‘Big Data Platform Management Product’
- 1.8 years (2016/04 ~ Now) Data Architect of Real-Time Bidding
You can find me at @jazzwang_tw or

https://fb.com/groups/dataengineering.tw 

https://slideshare.net/jazzwang
1. Why do I need Full Stack Monitoring and Notification?
Let’s start with Jazz’s Jobs / Pains / Gains
[Figure: hybrid infrastructure spanning VM, AWS, Azure, and GCP]
[Figure: roles involved: NetAdmin, Research, Developer, Security, Cloud Ops, SysAdmin, Data Engineer]
[Figure: monitoring tools in use across the roles: Cacti, NewRelic Server, OpsCenter, Kafka Manager, NewRelic Synthetic / APM, Status Cake, DataDog]
Pain
▷ Data Fragments
▷ Data Retention
▷ 7
▷ Black Box
▷ (Metrics)
▷ Metrics
▷ Vendor Lock-in
Gain
▷ Centralized Time-Series Database
▷ Support Alert Notification
▷ Slack, E-mail, SMS …
▷ Self-defined Data Retention Rate
▷ White Box
▷ Metrics = (Metrics)
▷ Self-defined Dashboard
▷ Ex. Data Pipeline
… Inspired by Outlyer …
https://www.outlyer.com/
2. Introduction to Prometheus Ecosystem
Features / Pain Relievers / Gain Creators
Concepts
Common Building Blocks
[Diagram: the building blocks are Target, Exporter, Collector, Time-Series Database, Rule, Dashboard (with Annotation), and Alert Message; metrics reach the Collector by either Push or Pull]
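To make the Target / Exporter / Pull relationship concrete, here is a minimal sketch of the exporter side, using only the Python standard library. It renders samples in the Prometheus text exposition format and serves them over HTTP so that a collector can pull them. The metric name `demo_queue_depth`, the `stage` label, and port 8000 are hypothetical and not taken from the slides.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metric(name, value, labels=None, help_text=None, metric_type=None):
    """Render one sample in the Prometheus text exposition format."""
    lines = []
    if help_text:
        lines.append(f"# HELP {name} {help_text}")
    if metric_type:
        lines.append(f"# TYPE {name} {metric_type}")
    if labels:
        # Assumes label values need no escaping (no quotes, backslashes, newlines).
        pairs = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        lines.append(f"{name}{{{pairs}}} {value}")
    else:
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers every GET with one hypothetical gauge, as an exporter would on /metrics."""

    def do_GET(self):
        body = render_metric(
            "demo_queue_depth", 7, {"stage": "ingest"},
            help_text="Hypothetical queue depth.", metric_type="gauge",
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, waiting for a collector to pull from this target."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and pointing a collector at the chosen port would complete the pull path. In practice an official client library is likely the better choice over hand-rendered text.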
Ranking of Time Series DBMS
https://db-engines.com/en/ranking/time+series+dbms
Comparison of Common Monitoring and Notification Systems
Target / Exporter | Push / Pull | Collector | DBMS | Dashboard | Alert
snmpd | Pull | Cacti Device (snmpwalk) | RRDTool | Cacti Graph | Plugin*
gmond | Pull | Ganglia gmetad | RRDTool | Ganglia | Nagios
newrelic-agent | Push (?) | NewRelic | ?? | NewRelic | NewRelic Alert
statsD | Push | Carbon / whisper | Graphite | Grafana | Grafana
Telegraf | Push | Telegraf | InfluxDB | Grafana | Grafana
snmp_exporter, node_exporter, jmx_exporter … | Pull, Push* | Prometheus | Prometheus | Grafana | AlertManager
About Prometheus
▷ https://prometheus.io/
▷ Started at SoundCloud in November 2012
▷ Written in Go, licensed under Apache 2.0
▷ Joined the Cloud Native Computing Foundation in 2016, the second hosted project after Kubernetes (K8S)
▷ v1.0.0 / 2016-07-18, v2.0.0 / 2017-11-08
▷ PromQL
▷ Grafana
▷ AlertManager
▷ v2.0
Components of Prometheus
[Diagram: Prometheus components, with Push, Pull, and Query paths]
Comparison of Time-Series DBMS
[Table not recoverable; surviving labels: Prometheus, HA, Data Model]
Client Libraries
▷ Official Prometheus client libraries
○ Go
○ Java or Scala
○ Python
○ Ruby
▷ Unofficial 3rd-party client libraries
○ Bash
○ C++
○ Common Lisp
○ Elixir
○ Erlang
○ Haskell
○ Lua for Nginx
○ Lua for Tarantool
○ .NET / C#
○ Node.js
○ PHP
○ Rust
3. Docker Compose Full Stack
Show me the source code!!
○ https://github.com/jazzwang/prometheus-labs
○ Docker Compose
Data Pipeline
[Diagram: Fluentd (in_dummy → out_kafka) → Kafka → Fluentd (in_kafka_group → out_file)]
Network Layer
▷ snmp_exporter
○ https://github.com/prometheus/snmp_exporter
○ Exposes SNMP data as metrics
○ Maps MIB OIDs to metrics
○ The snmp_exporter generator produces snmp.yml
▷ blackbox_exporter
○ https://github.com/prometheus/blackbox_exporter
○ Probes over HTTP, HTTPS, DNS, TCP, and ICMP
○ Web service, SSH, DNS, and ping checks can all go through blackbox_exporter
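As a rough illustration of what a black-box TCP check measures, the sketch below opens a connection and reports whether it succeeded and how long it took. It is not blackbox_exporter's implementation. The result keys borrow the `probe_success` and `probe_duration_seconds` metric names that blackbox_exporter exposes, while the function name `probe_tcp` and the default timeout are assumptions made for this example.

```python
import socket
import time


def probe_tcp(host, port, timeout=2.0):
    """Try to open a TCP connection and report success and elapsed time."""
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            success = 1
    except OSError:
        success = 0
    return {
        "probe_success": success,
        "probe_duration_seconds": time.monotonic() - start,
    }
```

A call such as `probe_tcp("example.org", 22)` would check an SSH port from the outside, without any agent on the target.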
System Layer
▷ node_exporter
○ https://github.com/prometheus/node_exporter
○ OS Level Metrics
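For a quick look at what an exporter such as node_exporter returns, the text exposition format can be parsed in a few lines. This is a minimal sketch that assumes samples carry no trailing timestamp and that series names contain no spaces. The helper name `parse_samples` is hypothetical.

```python
def parse_samples(text):
    """Map each series (name plus labels) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hand-written sample in the shape of node_exporter output.
sample = """# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_load5 0.37
"""
loads = parse_samples(sample)
```

Here `loads["node_load1"]` holds the one-minute load average from the sample text.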
Middleware Layer
▷ jmx_exporter
○ https://github.com/prometheus/jmx_exporter
○ Java agent, configured in YAML, that exposes JMX data as Prometheus metrics
○ Example configurations cover:
■ Apache Kafka
■ Apache Cassandra
■ Apache Flink
■ Apache Spark
■ Apache Tomcat
■ Apache ZooKeeper
■ Apache ActiveMQ Artemis 2.x
■ WebLogic
■ WildFly 10
Kafka
▷ `jmx_exporter` works for both Kafka and Cassandra
○ Docker examples: https://github.com/RobustPerception/docker_examples
▷ kafka_topic_exporter
○ Java, Jetty
○ https://github.com/ogibayashi/kafka-topic-exporter
▷ kafka_zookeeper_exporter
○ ZK, topic_partition
○ https://github.com/cloudflare/kafka_zookeeper_exporter
▷ prometheus-kafka-consumer-group-exporter
○ Python; metrics: consumer_group_offset, topic_highwater, lag
○ https://github.com/braedon/prometheus-kafka-consumer-group-exporter
▷ burrow_exporter
○ Exports results from Burrow, LinkedIn's Kafka lag checker (Go, sliding window)
○ https://github.com/jirwin/burrow_exporter
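Consumer lag, which several of these exporters report, is the gap between a partition's highwater mark and the consumer group's committed offset. The sketch below shows that arithmetic only. The function name, the dictionary layout, and the choice to treat a partition with no committed offset as offset 0 are assumptions for illustration, not any exporter's actual behavior.

```python
def consumer_lag(highwater, committed):
    """Return per-partition lag from highwater marks and committed offsets.

    Both arguments map partition number to offset.
    """
    lag = {}
    for partition, end_offset in highwater.items():
        # Assumption: a partition with no committed offset counts from 0.
        current = committed.get(partition, 0)
        # Clamp at zero in case the two readings were taken at different times.
        lag[partition] = max(end_offset - current, 0)
    return lag


lag = consumer_lag({0: 100, 1: 50}, {0: 90, 1: 50})
total_lag = sum(lag.values())
```

With these hypothetical offsets, partition 0 is 10 messages behind and partition 1 is caught up.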
Kafka
▷ kafka-consumer-group-exporter
○ Go, kafka-consumer-groups.sh
○ https://github.com/kawamuray/prometheus-kafka-consumer-group-exporter
▷ kafka-prometheus-exporter
○ Go, consumergoup_lag metrics
○ Kafka 0.8 (ZK)
○ https://github.com/ogibayashi/kafka-topic-exporter
▷ kafka_exporter
○ Go, Metrics
○ Kafka 0.9 (KF)
○ https://github.com/danielqsj/kafka_exporter
Fluentd
▷ fluent-agent-lite_exporter
○ For tagomoris's fluent-agent-lite [1]
○ https://github.com/matsumana/fluent-agent-lite_exporter
○ [1] https://github.com/tagomoris/fluent-agent-lite
▷ fluent-plugin-prometheus
○ fluentd → monitor_agent → fluent-plugin-prometheus
○ http://prometheus:9090/metrics → `fluent-plugin-prometheus` → fluentd
○ https://github.com/fluent/fluent-plugin-prometheus
▷ fluentd_exporter
○ Release,
○ https://github.com/wyukawa/fluentd_exporter
▷ fluentd_exporter
○ http://fluentd:9224/metrics → `fluentd_exporter` (by V3ckt0r) → prometheus
○ https://github.com/wyukawa/fluentd_exporter
Application Layer
▷ https://prometheus.io/docs/instrumenting/clientlibs/
▷ http://metrics.dropwizard.io/4.0.0/
4. Lesson Learned
Lesson Learned
▷ Lesson #1
Prometheus
▷ Lesson #2
Metrics exporter
○ Exporter list: https://prometheus.io/docs/instrumenting/exporters/
○ Default port allocations: https://github.com/prometheus/prometheus/wiki/Default-port-allocations
○ exporter Metrics
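The default port allocations are handy when assembling scrape targets. The sketch below turns a host list into `host:port` strings using the well-known defaults for a few official components. The host names and the helper `scrape_targets` are hypothetical, and any port should be checked against the wiki page before relying on it.

```python
# Default listen ports for a few official components.
DEFAULT_PORTS = {
    "prometheus": 9090,
    "alertmanager": 9093,
    "node_exporter": 9100,
    "blackbox_exporter": 9115,
    "snmp_exporter": 9116,
}


def scrape_targets(hosts_by_exporter):
    """Build host:port target strings for each exporter's hosts."""
    targets = {}
    for exporter, hosts in hosts_by_exporter.items():
        port = DEFAULT_PORTS[exporter]
        targets[exporter] = [f"{host}:{port}" for host in hosts]
    return targets


# Hypothetical hosts for illustration.
targets = scrape_targets({"node_exporter": ["kafka-1", "kafka-2"]})
```

Each resulting list could then be pasted into the `targets` field of a scrape job.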
Lesson Learned
○ github
○ exporter Metrics
○ http://prometheus:9090/graph
○ Grafana Dashboard
○ Grafana Alert
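Besides the graph page, the same expressions can be sent to the Prometheus HTTP API at `/api/v1/query`, which is useful for confirming that a new exporter is being scraped before building a dashboard. A minimal sketch follows. The helper names are hypothetical, the sample response is hand-written in the shape the API returns, and the base URL reuses the `prometheus:9090` host from the slide.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for an instant query against the Prometheus HTTP API."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def down_instances(response):
    """Return instance labels whose `up` sample is 0 in a decoded query response."""
    if response.get("status") != "success":
        return []
    down = []
    for series in response["data"]["result"]:
        _timestamp, value = series["value"]  # the value arrives as a string
        if float(value) == 0:
            down.append(series["metric"].get("instance", ""))
    return down


url = instant_query_url("http://prometheus:9090", "up")

# Hand-written response with hypothetical instances.
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "instance": "kafka-1:9100"}, "value": [1518220800, "1"]},
            {"metric": {"__name__": "up", "instance": "kafka-2:9100"}, "value": [1518220800, "0"]},
        ],
    },
}
down = down_instances(sample_response)
```

Fetching `url` with any HTTP client and decoding the JSON body would yield a dictionary of the same shape as `sample_response`.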
Thanks!
Any questions?
You can find me at @jazzwang_tw or

https://fb.com/groups/dataengineering.tw 

https://slideshare.net/jazzwang
https://github.com/jazzwang
Github *^__^*
