SlideShare a Scribd company logo
Monitoring Clojure Applications with Prometheus
Jo Draeger 18/09/2018
How do you know your
Clojure web service is doing
what it is supposed to do?
1. Monitoring
2. Prometheus
3. Clojure + Prometheus LIVE!
Agenda
Monitoring
What is Monitoring?
Metrics Raw measurements e.g. CPU usage, disk space,
requests/s, number of errors
Instrumentation Adding code to components to expose metrics
Monitoring Collect, store, aggregate and visualise metrics, trigger
actions
Alerting Automated action when metrics values leave defined
boundaries
● Reliability, Availability
● Preempt problems
● Detect problems
● Identify causes
● Optimise resource utilisation: Efficiency
Why Monitoring
Monitoring @ Signal
AI Text-Analytics
Pipeline
Summarisation
Topic Classification
Entity Recognition
Story collation
Deduplication
Transformation
Content
Provider
User
Print
Online
Broadcast
Alerts
API
searches, latency,
memory, CPU...
queue sizes,
latency, msgs/s,
doc sizes...
article count, API
health...
timeliness...
Prometheus
Prometheus
discovers by
Scrapes
by
HTTP
visualises
alerts
stores to disk
Targets (tasks, VMs, services)
m
anaged
by
https://github.com/SignalMedia/prometheus-ecs-sd
curl http://my-service/metrics
# HELP process_start_time_seconds Start time of the process since unix epoch
in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.537190369188E9
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 86.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 10240.0
# HELP jvm_gc_collection_seconds Time spent in a given JVM garbage collector
in seconds.
# TYPE jvm_gc_collection_seconds summary
jvm_gc_collection_seconds_count{gc="PS Scavenge",} 9.0
jvm_gc_collection_seconds_sum{gc="PS Scavenge",} 0.085
Prometheus Metrics Exposition
Push Pull
● Client needs to know target
● Rate controlled by client
● Can monitor single events
● Scraper needs to discover targets
● Rate controlled by scraper
● Aggregates events
● Easy to develop and test (curl!)
● HollywoodPrinciple
Thomas Wolf, www.foto-tw.de / Wikimedia Commons / CC BY-SA 3.0
Gauges: Current value, for slowly changing values, e.g.:
Number of workers.
Counters: Sum of events, or sum of values, e.g. sum of
requests, sum of time taken for requests
Gauges vs Counters
Counters Example
t requests requests
/s
avg
latency
request
counter
latency
counter
requests
/s
avg
latency
0s
20s 80 4.0 15ms 80 1200
40s 90 4.5 20ms 170 3000
60s 40 2.0 5ms 210 3200 3.5 15.2
80s 30 1.5 10ms 240 3500
100s 80 4.0 25ms 320 5500
120s 20 1.0 5ms 340 4500 2.2 18.5
Scraping every 60s
210/50=3.5
(340-210)
/ 60
= 2.2
3200/210=15.2
(5600-3200)
/( 340- 210)
= 18.5
Prometheus and Grafana in
Docker Compose
http-kit and ring based
webservice
Looks up coordinates for a
postcode and then looks up crime
stats
https://tinyurl.com/cljpromdemo
Live Demo
● Add Metrics, switch on the light!
● Pull is much, much better then Push ;-) (it depends)
● Keep an eye on JVM metrics
● Try it
Summary
https://tinyurl.com/cljpromdemo
Any Questions?
https://tinyurl.com/cljpromdemo
@joachimdraeger
Thank you!
https://tinyurl.com/cljpromdemo
@joachimdraeger

More Related Content

Similar to Monitoring Clojure Applications with Prometheus

How to reduce expenses on monitoring
How to reduce expenses on monitoringHow to reduce expenses on monitoring
How to reduce expenses on monitoring
RomanKhavronenko
 
stackconf 2023 | How to reduce expenses on monitoring with VictoriaMetrics by...
stackconf 2023 | How to reduce expenses on monitoring with VictoriaMetrics by...stackconf 2023 | How to reduce expenses on monitoring with VictoriaMetrics by...
stackconf 2023 | How to reduce expenses on monitoring with VictoriaMetrics by...
NETWAYS
 
Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)
Brian Brazil
 
Scalable Realtime Analytics with declarative SQL like Complex Event Processin...
Scalable Realtime Analytics with declarative SQL like Complex Event Processin...Scalable Realtime Analytics with declarative SQL like Complex Event Processin...
Scalable Realtime Analytics with declarative SQL like Complex Event Processin...
Srinath Perera
 
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataMonitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
GetInData
 
Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)
Brian Brazil
 
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
Paul Brebner
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
Brendan Gregg
 
When third parties stop being polite... and start getting real
When third parties stop being polite... and start getting realWhen third parties stop being polite... and start getting real
When third parties stop being polite... and start getting real
Charles Vazac
 
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Brian Brazil
 
Mantis: Netflix's Event Stream Processing System
Mantis: Netflix's Event Stream Processing SystemMantis: Netflix's Event Stream Processing System
Mantis: Netflix's Event Stream Processing System
C4Media
 
Prometheus Introduction (InfraCoders Vienna)
Prometheus Introduction (InfraCoders Vienna)Prometheus Introduction (InfraCoders Vienna)
Prometheus Introduction (InfraCoders Vienna)
Oliver Moser
 
Tracing-for-fun-and-profit.pptx
Tracing-for-fun-and-profit.pptxTracing-for-fun-and-profit.pptx
Tracing-for-fun-and-profit.pptx
Hai Nguyen Duy
 
How to Improve the Observability of Apache Cassandra and Kafka applications...
How to Improve the Observability of Apache Cassandra and Kafka applications...How to Improve the Observability of Apache Cassandra and Kafka applications...
How to Improve the Observability of Apache Cassandra and Kafka applications...
Paul Brebner
 
Approaches for application request throttling - Cloud Developer Days Poland
Approaches for application request throttling - Cloud Developer Days PolandApproaches for application request throttling - Cloud Developer Days Poland
Approaches for application request throttling - Cloud Developer Days Poland
Maarten Balliauw
 
Fluent 2018: When third parties stop being polite... and start getting real
Fluent 2018: When third parties stop being polite... and start getting realFluent 2018: When third parties stop being polite... and start getting real
Fluent 2018: When third parties stop being polite... and start getting real
Akamai Developers & Admins
 
When Third Parties Stop Being Polite... and Start Getting Real
When Third Parties Stop Being Polite... and Start Getting RealWhen Third Parties Stop Being Polite... and Start Getting Real
When Third Parties Stop Being Polite... and Start Getting Real
Nicholas Jansma
 
#TwitterRealTime - Real time processing @twitter
#TwitterRealTime - Real time processing @twitter#TwitterRealTime - Real time processing @twitter
#TwitterRealTime - Real time processing @twitter
Twitter Developers
 
Approaches for application request throttling - dotNetCologne
Approaches for application request throttling - dotNetCologneApproaches for application request throttling - dotNetCologne
Approaches for application request throttling - dotNetCologne
Maarten Balliauw
 
ACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics PatternsACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics Patterns
Srinath Perera
 

Similar to Monitoring Clojure Applications with Prometheus (20)

How to reduce expenses on monitoring
How to reduce expenses on monitoringHow to reduce expenses on monitoring
How to reduce expenses on monitoring
 
stackconf 2023 | How to reduce expenses on monitoring with VictoriaMetrics by...
stackconf 2023 | How to reduce expenses on monitoring with VictoriaMetrics by...stackconf 2023 | How to reduce expenses on monitoring with VictoriaMetrics by...
stackconf 2023 | How to reduce expenses on monitoring with VictoriaMetrics by...
 
Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)Microservices and Prometheus (Microservices NYC 2016)
Microservices and Prometheus (Microservices NYC 2016)
 
Scalable Realtime Analytics with declarative SQL like Complex Event Processin...
Scalable Realtime Analytics with declarative SQL like Complex Event Processin...Scalable Realtime Analytics with declarative SQL like Complex Event Processin...
Scalable Realtime Analytics with declarative SQL like Complex Event Processin...
 
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataMonitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
 
Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)
 
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
ApacheCon2019 Talk: Improving the Observability of Cassandra, Kafka and Kuber...
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
 
When third parties stop being polite... and start getting real
When third parties stop being polite... and start getting realWhen third parties stop being polite... and start getting real
When third parties stop being polite... and start getting real
 
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
 
Mantis: Netflix's Event Stream Processing System
Mantis: Netflix's Event Stream Processing SystemMantis: Netflix's Event Stream Processing System
Mantis: Netflix's Event Stream Processing System
 
Prometheus Introduction (InfraCoders Vienna)
Prometheus Introduction (InfraCoders Vienna)Prometheus Introduction (InfraCoders Vienna)
Prometheus Introduction (InfraCoders Vienna)
 
Tracing-for-fun-and-profit.pptx
Tracing-for-fun-and-profit.pptxTracing-for-fun-and-profit.pptx
Tracing-for-fun-and-profit.pptx
 
How to Improve the Observability of Apache Cassandra and Kafka applications...
How to Improve the Observability of Apache Cassandra and Kafka applications...How to Improve the Observability of Apache Cassandra and Kafka applications...
How to Improve the Observability of Apache Cassandra and Kafka applications...
 
Approaches for application request throttling - Cloud Developer Days Poland
Approaches for application request throttling - Cloud Developer Days PolandApproaches for application request throttling - Cloud Developer Days Poland
Approaches for application request throttling - Cloud Developer Days Poland
 
Fluent 2018: When third parties stop being polite... and start getting real
Fluent 2018: When third parties stop being polite... and start getting realFluent 2018: When third parties stop being polite... and start getting real
Fluent 2018: When third parties stop being polite... and start getting real
 
When Third Parties Stop Being Polite... and Start Getting Real
When Third Parties Stop Being Polite... and Start Getting RealWhen Third Parties Stop Being Polite... and Start Getting Real
When Third Parties Stop Being Polite... and Start Getting Real
 
#TwitterRealTime - Real time processing @twitter
#TwitterRealTime - Real time processing @twitter#TwitterRealTime - Real time processing @twitter
#TwitterRealTime - Real time processing @twitter
 
Approaches for application request throttling - dotNetCologne
Approaches for application request throttling - dotNetCologneApproaches for application request throttling - dotNetCologne
Approaches for application request throttling - dotNetCologne
 
ACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics PatternsACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics Patterns
 

Recently uploaded

原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
mz5nrf0n
 
E-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet DynamicsE-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet Dynamics
Hornet Dynamics
 
Requirement Traceability in Xen Functional Safety
Requirement Traceability in Xen Functional SafetyRequirement Traceability in Xen Functional Safety
Requirement Traceability in Xen Functional Safety
Ayan Halder
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
lorraineandreiamcidl
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
Green Software Development
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
Rakesh Kumar R
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata
 
SQL Accounting Software Brochure Malaysia
SQL Accounting Software Brochure MalaysiaSQL Accounting Software Brochure Malaysia
SQL Accounting Software Brochure Malaysia
GohKiangHock
 
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
XfilesPro
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
Octavian Nadolu
 
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
kalichargn70th171
 
How Can Hiring A Mobile App Development Company Help Your Business Grow?
How Can Hiring A Mobile App Development Company Help Your Business Grow?How Can Hiring A Mobile App Development Company Help Your Business Grow?
How Can Hiring A Mobile App Development Company Help Your Business Grow?
ToXSL Technologies
 
How to write a program in any programming language
How to write a program in any programming languageHow to write a program in any programming language
How to write a program in any programming language
Rakesh Kumar R
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
Quickdice ERP
 
Lecture 2 - software testing SE 412.pptx
Lecture 2 - software testing SE 412.pptxLecture 2 - software testing SE 412.pptx
Lecture 2 - software testing SE 412.pptx
TaghreedAltamimi
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
Grant Fritchey
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
timtebeek1
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
ICS
 

Recently uploaded (20)

原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
 
E-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet DynamicsE-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet Dynamics
 
Requirement Traceability in Xen Functional Safety
Requirement Traceability in Xen Functional SafetyRequirement Traceability in Xen Functional Safety
Requirement Traceability in Xen Functional Safety
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
 
SQL Accounting Software Brochure Malaysia
SQL Accounting Software Brochure MalaysiaSQL Accounting Software Brochure Malaysia
SQL Accounting Software Brochure Malaysia
 
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
 
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
 
How Can Hiring A Mobile App Development Company Help Your Business Grow?
How Can Hiring A Mobile App Development Company Help Your Business Grow?How Can Hiring A Mobile App Development Company Help Your Business Grow?
How Can Hiring A Mobile App Development Company Help Your Business Grow?
 
How to write a program in any programming language
How to write a program in any programming languageHow to write a program in any programming language
How to write a program in any programming language
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
 
Lecture 2 - software testing SE 412.pptx
Lecture 2 - software testing SE 412.pptxLecture 2 - software testing SE 412.pptx
Lecture 2 - software testing SE 412.pptx
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
 
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdfAutomated software refactoring with OpenRewrite and Generative AI.pptx.pdf
Automated software refactoring with OpenRewrite and Generative AI.pptx.pdf
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
 

Monitoring Clojure Applications with Prometheus

  • 1. Monitoring Clojure Applications with Prometheus Jo Draeger 18/09/2018
  • 2. How do you know your Clojure web service is doing what it is supposed to do?
  • 3.
  • 4. 1. Monitoring 2. Prometheus 3. Clojure + Prometheus LIVE! Agenda
  • 6. What is Monitoring? Metrics Raw measurements e.g. CPU usage, disk space, requests/s, number of errors Instrumentation Adding code to components to expose metrics Monitoring Collect, store, aggregate and visualise metrics, trigger actions Alerting Automated action when metrics values leave defined boundaries
  • 7. ● Reliability, Availability ● Preempt problems ● Detect problems ● Identify causes ● Optimise resource utilisation: Efficiency Why Monitoring
  • 8. Monitoring @ Signal AI Text-Analytics Pipeline Summarisation Topic Classification Entity Recognition Story collation Deduplication Transformation Content Provider User Print Online Broadcast Alerts API searches, latency, memory, CPU... queue sizes, latency, msgs/s, doc sizes... article count, API health... timeliness...
  • 10. Prometheus discovers by Scrapes by HTTP visualises alerts stores to disk Targets (tasks, VMs, services) m anaged by https://github.com/SignalMedia/prometheus-ecs-sd
  • 11. curl http://my-service/metrics # HELP process_start_time_seconds Start time of the process since unix epoch in seconds. # TYPE process_start_time_seconds gauge process_start_time_seconds 1.537190369188E9 # HELP process_open_fds Number of open file descriptors. # TYPE process_open_fds gauge process_open_fds 86.0 # HELP process_max_fds Maximum number of open file descriptors. # TYPE process_max_fds gauge process_max_fds 10240.0 # HELP jvm_gc_collection_seconds Time spent in a given JVM garbage collector in seconds. # TYPE jvm_gc_collection_seconds summary jvm_gc_collection_seconds_count{gc="PS Scavenge",} 9.0 jvm_gc_collection_seconds_sum{gc="PS Scavenge",} 0.085 Prometheus Metrics Exposition
  • 12. Push Pull ● Client needs to know target ● Rate controlled by client ● Can monitor single events ● Scraper needs to discover targets ● Rate controlled by scraper ● Aggregates events ● Easy to develop and test (curl!) ● HollywoodPrinciple Thomas Wolf, www.foto-tw.de / Wikimedia Commons / CC BY-SA 3.0
  • 13. Gauges: Current value, for slowly changing values, e.g.: Number of workers. Counters: Sum of events, or sum of values, e.g. sum of requests, sum of time taken for requests Gauges vs Counters
  • 14. Counters Example t requests requests /s avg latency request counter latency counter requests /s avg latency 0s 20s 80 4.0 15ms 80 1200 40s 90 4.5 20ms 170 3000 60s 40 2.0 5ms 210 3200 3.5 15.2 80s 30 1.5 10ms 240 3500 100s 80 4.0 25ms 320 5500 120s 20 1.0 5ms 340 4500 2.2 18.5 Scraping every 60s 210/50=3.5 (340-210) / 60 = 2.2 3200/210=15.2 (5600-3200) /( 340- 210) = 18.5
  • 15. Prometheus and Grafana in Docker Compose http-kit and ring based webservice Looks up coordinates for a postcode and then looks up crime stats https://tinyurl.com/cljpromdemo Live Demo
  • 16. ● Add Metrics, switch on the light! ● Pull is much, much better then Push ;-) (it depends) ● Keep an eye on JVM metrics ● Try it Summary https://tinyurl.com/cljpromdemo