EH Monitoring System Overview


Luong Vo

Employment Hero monitoring solution

Technology

EH Monitoring System
Engineering Team
Minh Nguyen & Luong Vo

Before we start
- How do we answer these questions?
+ Why is the system so slow?
+ Does everything work fine?
+ What’s the main bottleneck of our system?
+ What happened at 10:00 AM this morning that made a
lot of customers complain?
+ What’s the average time a user has to wait until they get
the notification?
+ etc.

In short, we built a system successfully.
BUT WE HAVE NO IDEA HOW IT PERFORMS.

Observability
- Programmatically and continuously capture the state of a
running system
- Analyze the captured data and extract the information
the observer is interested in
- Detect abnormal behavior, notify the people responsible,
and automatically take action to resolve the situation
- Archive the data in convenient forms that support future
investigation or analysis

Pillars of Observability
Log Management
Distributed Tracing
Metrics Monitoring
Error Tracking

Pillars of Observability
Metrics Monitoring

We need a solution that offers
- Detailed statistics (both real-time and aggregated) about our
microservices.
- Alerting when usage peaks or incidents happen.
- An easy way to instrument our microservices.
- Support for a variety of metric types (counter, gauge,
histogram, …).
- Two-way integration with Kubernetes.
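The metric types named above (counter, gauge, histogram) can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins written for this example; they are not the Prometheus client API or the EhMonitoring gem, and the bucket bounds are assumed values.

```python
import bisect


class Counter:
    """Value that only increases, e.g. total requests served."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("a counter can only go up")
        self.value += amount


class Gauge:
    """Value that can go up and down, e.g. jobs currently queued."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value


class Histogram:
    """Sorts observations into buckets, e.g. request latency in seconds."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        # One slot per upper bound, plus a final slot for everything larger.
        self.bucket_counts = [0] * (len(self.bounds) + 1)
        self.count = 0
        self.total = 0.0

    def observe(self, value):
        # bisect_left returns the first bucket whose upper bound is >= value.
        self.bucket_counts[bisect.bisect_left(self.bounds, value)] += 1
        self.count += 1
        self.total += value

    def cumulative(self):
        # Cumulative view: each bucket also counts all smaller observations.
        running, out = 0, []
        for bucket in self.bucket_counts:
            running += bucket
            out.append(running)
        return out


# Hypothetical usage: two requests served, four latencies observed.
requests_total = Counter()
requests_total.inc()
requests_total.inc()

queue_depth = Gauge()
queue_depth.set(7)
queue_depth.set(3)

latency = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 0.3, 2.0):
    latency.observe(seconds)
```

A counter answers "how many in total", a gauge answers "how many right now", and a histogram answers distribution questions such as "how many requests finished within 0.5 seconds".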

Prometheus and Grafana
- Prometheus is an open-source systems
monitoring and alerting toolkit
originally built at SoundCloud.
- Grafana is an open-source
dashboard tool for data visualization.
- We chose them to collect and
display our monitoring data.

Push Model
- Node 1: Application
- Node 2: Application
- Node 3: Metrics collector
- Each application sends POST /metrics to the metrics collector
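The push model can be sketched with nothing but the Python standard library. This is an illustration under stated assumptions, not the collector the team ran: the slide only shows POST /metrics, so the JSON payload, the 204 response, and the names CollectorHandler, push and demo_push are all hypothetical.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # samples the collector has accepted so far


class CollectorHandler(BaseHTTPRequestHandler):
    """Collector side (Node 3): accepts samples pushed by applications."""

    def do_POST(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(204)  # accepted, nothing to return
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example quiet


def push(collector_url, sample):
    """Application side (Node 1, Node 2): send one sample to the collector."""
    request = urllib.request.Request(
        collector_url,
        data=json.dumps(sample).encode(),
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


def demo_push():
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), CollectorHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/metrics"
    try:
        return [push(url, {"node": node, "requests_total": 1})
                for node in ("node1", "node2")]
    finally:
        server.shutdown()
        server.server_close()
```

In this model every application has to know where the collector lives, and the collector cannot tell a silent application from a dead one.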

Pull Model
- Node 1: Application
- Node 2: Application
- Node 3: Metrics collector
- The metrics collector sends GET /metrics to each application
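The pull model can be sketched the same way, with the direction reversed: the application exposes GET /metrics and the collector scrapes it. The response body follows the Prometheus text exposition format; the metric name app_requests_total, its value, and the helper names are hypothetical, and the tiny parser ignores labels and timestamps.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUESTS_TOTAL = {"value": 42}  # hypothetical application counter


class MetricsHandler(BaseHTTPRequestHandler):
    """Application side (Node 1, Node 2): expose current values on request."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = (
            "# TYPE app_requests_total counter\n"
            f"app_requests_total {REQUESTS_TOTAL['value']}\n"
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def scrape(url):
    """Collector side (Node 3): fetch one target and parse its samples."""
    with urllib.request.urlopen(url, timeout=5) as response:
        text = response.read().decode()
    samples = {}
    for line in text.splitlines():
        if line and not line.startswith("#"):  # skip comments and blanks
            name, value = line.rsplit(" ", 1)
            samples[name] = float(value)
    return samples


def demo_scrape():
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return scrape(f"http://127.0.0.1:{server.server_port}/metrics")
    finally:
        server.shutdown()
        server.server_close()
```

Here the application needs no knowledge of the collector, and a failed scrape is itself a signal that the target is unhealthy.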

Pull Model and Sidecar Model
- Node 1: Application, Metric Server, /tmp/monitoring
- Node 2: Application, Metric Server, /tmp/monitoring
- Node 3: Metrics collector
- The metrics collector sends GET /metrics to the Metric Server on each node
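One way to read the sidecar diagram is that the application writes its values into the shared /tmp/monitoring directory and the Metric Server reads them back when it is scraped. The slide does not spell out that mechanism, so the sketch below is an assumption throughout: the JSON file layout, the process label, and the functions write_metrics and render_metrics are hypothetical, and a temporary directory stands in for /tmp/monitoring.

```python
import json
import os
import tempfile


def write_metrics(shared_dir, process_name, metrics):
    """Application side: dump current values to a file in the shared directory."""
    final_path = os.path.join(shared_dir, f"{process_name}.json")
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(metrics, f)
    # Rename into place so a reader never sees a half-written file.
    os.replace(tmp_path, final_path)


def render_metrics(shared_dir):
    """Sidecar side: build the body it would return for GET /metrics."""
    lines = []
    for filename in sorted(os.listdir(shared_dir)):
        if not filename.endswith(".json"):
            continue  # ignore in-progress .tmp files
        process = filename[: -len(".json")]
        with open(os.path.join(shared_dir, filename)) as f:
            for name, value in sorted(json.load(f).items()):
                lines.append(f'{name}{{process="{process}"}} {value}')
    return "\n".join(lines) + "\n"


# Hypothetical usage: two processes on one node share a directory.
with tempfile.TemporaryDirectory() as shared_dir:
    write_metrics(shared_dir, "web", {"requests_total": 10})
    write_metrics(shared_dir, "worker", {"jobs_total": 3})
    body = render_metrics(shared_dir)
```

Under this reading the application never has to run an HTTP server itself, which matters for processes such as background workers that have no web port to expose.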

EhMonitoring gem
- This gem helps you monitor your
service with ease.
- It abstracts away much of the
infrastructure layer through helpers.
- Built-in native support for gRPC,
Kafka, and Sidekiq (soon).

Service owners are responsible for their children

What’s next?
- Support other common libraries, like Sidekiq
- Apply EhMonitoring to all services
- Dump Instana and create our own Tracing system

Reference
https://github.com/Thinkei/feature-flag-api/pull/81 - Add metrics to feature flag API.
https://docs.google.com/document/d/1-wjTM600u5Q68ImhHHA2DTtlh8wX5mc9Xv5EEawFNFI/edit - Employment Hero microservices documents.
https://github.com/Thinkei/eh-monitoring - EH monitoring gem
http://monitor.staging.ehrocks.com/ - Our monitoring page.
