MONITORING MICROSERVICES
APRIL 26, 2017
Rich Schofield
rschofield@signalfx.com
Jamison Clouthier
jamison@signalfx.com
FOUNDERS
Karthik Rau, CEO
• VMware, VP Products
• Loudcloud/Opsware, Products
Phillip Liu, CTO
• Facebook, Lead Architect
• Loudcloud/Opsware, Chief Architect
SELECT CUSTOMERS / INVESTORS [logos]
A LITTLE BACKGROUND
SignalFx is an advanced monitoring & alerting system for cloud apps, delivered as a SaaS solution
SIGNALFX HIGH LEVEL ARCHITECTURE
[Diagram components: DATAPOINTS, INGEST, MESSAGEBUS, METADATA, TIME SERIES DATABASE, ANALYTICS, NOTIFICATION]
[Diagram labels: 1s / 5s, 1m, 1h, •••; 8d, 384d, 32d; 1s 16m (IN MEMORY)]
MICROSERVICES
BENEFITS OF MICROSERVICES ARCHITECTURE
Application modules released independently
• Enables agile development per team
• Canary deployments for testing
• Containers to enable rapid automated deployment & rollback
Scale in/out with service load
• Respond quickly to increased data flow or usage
• Optimize infrastructure costs as load falls
Technology flexibility
• Easier to upgrade components or change platforms
MONITORING A MICROSERVICE APPLICATION
Real-time metrics and analytics
• Detect issues and trends quickly
Apply context: history, related events, service metadata
• Metrics without context are just numbers!
Tag-based monitoring for elastic/ephemeral services
• Adapt automatically to monitor services as they scale in/out
Shared, self-service access across all groups
• App developers, tech leads, support, management
• Avoid different tools for different teams
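A minimal sketch of the tag-based idea above, not SignalFx's implementation: a monitor is defined by a tag selector rather than a host list, so its membership is recomputed from whichever instances currently report those tags. The `Instance` type, the `members` function, and the tag names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    # Hypothetical record for one running service instance.
    name: str
    tags: dict = field(default_factory=dict)


def members(instances, **selector):
    """Return names of instances whose tags match every key/value in selector."""
    return sorted(
        inst.name
        for inst in instances
        if all(inst.tags.get(k) == v for k, v in selector.items())
    )


fleet = [
    Instance("web-1", {"service": "checkout", "env": "prod"}),
    Instance("web-2", {"service": "checkout", "env": "prod"}),
    Instance("db-1", {"service": "orders-db", "env": "prod"}),
]
before = members(fleet, service="checkout", env="prod")

# Scale out: a new instance appears with the same tags.
# The selector is unchanged, yet the monitor now covers it.
fleet.append(Instance("web-3", {"service": "checkout", "env": "prod"}))
after = members(fleet, service="checkout", env="prod")

print(before)
print(after)
```

Because the selector never names a host, scaling in works the same way: an instance that stops reporting simply drops out of the result.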
ISSUES WITH TRADITIONAL MONITORING
CHALLENGE: Noisy, reactive monitoring
• Too many alerts fire at once for a cluster-wide problem
• Is the machine down because we scaled down the cluster or because we had a real problem?
• Do we even care if a single node is down?
• Component-specific monitoring configurations that require constant maintenance in ephemeral/elastic environments
What matters? Where to start?
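One possible reading of the questions above is to alert on the cluster as a whole instead of on each node. The sketch below assumes a hypothetical `cluster_alert` function, an illustrative 25% threshold, and an `expected_nodes` value that tracks the desired cluster size. None of these come from the slides.

```python
def cluster_alert(up_nodes, expected_nodes, max_missing_fraction=0.25):
    """Return one alert message for the whole cluster, or None if healthy.

    expected_nodes is the desired cluster size, so an intentional scale-down
    (which lowers expected_nodes) does not look like a failure.
    """
    if expected_nodes <= 0:
        return None
    missing = max(expected_nodes - up_nodes, 0)
    fraction = missing / expected_nodes
    if fraction > max_missing_fraction:
        return f"cluster degraded: {missing}/{expected_nodes} nodes missing"
    return None


# One node of eight is down: below the threshold, so nobody is paged.
single_node_down = cluster_alert(up_nodes=7, expected_nodes=8)
# The cluster was deliberately scaled to four nodes: healthy.
scaled_down = cluster_alert(up_nodes=4, expected_nodes=4)
# Half of the expected nodes are gone: a single alert fires.
real_problem = cluster_alert(up_nodes=4, expected_nodes=8)

print(single_node_down, scaled_down, real_problem)
```

A cluster-wide failure produces one message rather than one alert per node, which is the noise problem the slide describes.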
THE MODERN MONITORING LANDSCAPE
• APM: Performance Testing (Pre-Flight). Luxury of Time
• METRICS: Streaming Metrics Aggregated (In-Flight). Real Time Matters
• LOGS: Black Box Recorder (Post-Flight). Luxury of Time
LET’S TALK METRICS
Metric name: cpu.idle
Metric value: 27
Metric type: gauge
Timestamp: 1234567
Dimensions: host = relic47df, datacenter = sjc1, env = prod, …
Dimensions allow you to filter, aggregate, compare across sources
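To make "filter, aggregate, compare" concrete, here is a small sketch that models a datapoint with the five fields listed above and averages one metric per value of a chosen dimension. The `Datapoint` class and `mean_by` function are hypothetical names. Only the first datapoint comes from the slide; the other two are made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Datapoint:
    metric: str
    value: float
    metric_type: str
    timestamp: int
    dimensions: dict = field(default_factory=dict)


def mean_by(datapoints, metric, dimension):
    """Group one metric by a dimension and average each group."""
    groups = defaultdict(list)
    for dp in datapoints:
        if dp.metric == metric and dimension in dp.dimensions:
            groups[dp.dimensions[dimension]].append(dp.value)
    return {key: mean(values) for key, values in groups.items()}


points = [
    # The example from the slide.
    Datapoint("cpu.idle", 27, "gauge", 1234567,
              {"host": "relic47df", "datacenter": "sjc1", "env": "prod"}),
    # Made-up hosts and values, for illustration only.
    Datapoint("cpu.idle", 33, "gauge", 1234567,
              {"host": "host-b", "datacenter": "sjc1", "env": "prod"}),
    Datapoint("cpu.idle", 80, "gauge", 1234567,
              {"host": "host-c", "datacenter": "dc2", "env": "prod"}),
]

by_dc = mean_by(points, "cpu.idle", "datacenter")
print(by_dc)
```

Swapping `"datacenter"` for `"host"` or `"env"` regroups the same data without changing how it was emitted.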
METRICS IN A TIME SERIES
10:15:02
{
"gauge":
[{"metric":"cpu.idle",
"dimensions":
{"host":"hostname123",
"datacenter":"snc"},
"value":249}]
}
10:15:03
{
"gauge":
[{"metric":"cpu.idle",
"dimensions":
{"host":"hostname123",
"datacenter":"snc"},
"value":230}]
}
10:15:04
{
"gauge":
[{"metric":"cpu.idle",
"dimensions":
{"host":"hostname123",
"datacenter":"snc"},
"value":202}]
}
10:15:05
{
"gauge":
[{"metric":"cpu.idle",
"dimensions":
{"host":"hostname123",
"datacenter":"snc"},
"value":284}]
}
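A sketch of how payloads like these could be folded into a time series. It assumes that a series is identified by the metric name plus its full set of dimensions, and that the four times on the slide correspond to the four payloads in order. The `series_key` and `append_payload` helpers are hypothetical names, not part of any SignalFx client library.

```python
import json
from collections import defaultdict


def series_key(metric, dimensions):
    """Identify a time series by metric name plus its sorted dimensions."""
    return (metric, tuple(sorted(dimensions.items())))


def append_payload(store, payload_text, received_at):
    """Parse one payload and append each gauge datapoint to its series."""
    payload = json.loads(payload_text)
    for dp in payload.get("gauge", []):
        key = series_key(dp["metric"], dp.get("dimensions", {}))
        store[key].append((received_at, dp["value"]))


store = defaultdict(list)
samples = [("10:15:02", 249), ("10:15:03", 230),
           ("10:15:04", 202), ("10:15:05", 284)]
for ts, value in samples:
    # Rebuild the payload shown on the slide for each sample.
    text = json.dumps({"gauge": [{
        "metric": "cpu.idle",
        "dimensions": {"host": "hostname123", "datacenter": "snc"},
        "value": value,
    }]})
    append_payload(store, text, ts)

key = series_key("cpu.idle", {"host": "hostname123", "datacenter": "snc"})
print(store[key])
```

All four payloads share the same metric and dimensions, so they land in a single series. Changing any dimension value would start a new one.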
USING TIME SERIES ANALYTICS TO CORRELATE AND IDENTIFY PATTERNS
How well load balanced is this 8-node Kafka cluster?
• Create a signal to represent the cluster’s load balancing effectiveness, computed within seconds
• Compare the signal against historical patterns and alert on anomalous patterns
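The slides do not say how the load-balancing signal is computed. As one plausible sketch, the coefficient of variation of per-node load can serve as the signal, and a simple mean-plus-three-standard-deviations rule can stand in for "compare against historical patterns". Both choices, the function names, and all numbers below are assumptions for illustration.

```python
from statistics import mean, pstdev


def imbalance(per_node_load):
    """Coefficient of variation across nodes: 0.0 means perfectly balanced."""
    avg = mean(per_node_load)
    return pstdev(per_node_load) / avg if avg else 0.0


def is_anomalous(current, history, n_sigma=3.0):
    """Flag a signal that sits more than n_sigma above the historical mean."""
    return current > mean(history) + n_sigma * pstdev(history)


# Made-up messages/sec for an 8-node cluster.
balanced = [100, 102, 98, 101, 99, 100, 103, 97]
skewed = [100, 102, 98, 101, 99, 100, 240, 20]

# Made-up past values of the imbalance signal.
history = [0.02, 0.018, 0.021, 0.019, 0.02]

print(is_anomalous(imbalance(balanced), history))
print(is_anomalous(imbalance(skewed), history))
```

Collapsing eight per-node series into one number means a single alert rule covers the cluster regardless of which node is hot.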
GUIDED TRIAGE
[Chart labels: INFRASTRUCTURE SERVICE, PRODUCT LAUNCH EVENT]
CORRELATE EVENTS TO MONITORING
IS THIS A PROBLEM?
ANALYSIS
1. PRODUCT LAUNCH (EVENT RECORDED)
2. TRAFFIC SPIKE
3. TRANSIENT?
4. IMPACT EXISTING CUSTOMERS?
5. IS IT LEVELING OUT?
6. RAM ISSUE?
7. STORAGE ISSUE?
8. BUS BACKED UP?
JOURNEY TO METRICS BASED MONITORING
PHASE 0: Health checks and logs
PHASE 1: Small internal metrics system
PHASE 2: Build out scalable, highly-available metrics system
PHASE 3: Build out more sophisticated analytics
From individual component checks to proactive management of service-wide performance
DEMO
THANK YOU!
jamison@signalfx.com
rschofield@signalfx.com
SIGN UP FOR A TRIAL AT:
signalfx.com

GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 

Microservices meetup April 2017

  • 1. MONITORING MICROSERVICES. APRIL 26, 2017. Rich Schofield (rschofield@signalfx.com), Jamison Clouthier (jamison@signalfx.com)
  • 2. A LITTLE BACKGROUND. SignalFx is an advanced monitoring & alerting system for cloud apps, delivered as a SaaS solution. FOUNDERS: Karthik Rau, CEO (VMware, VP Products; Loudcloud/Opsware, Products); Phillip Liu, CTO (Facebook, Lead Architect; Loudcloud/Opsware, Chief Architect). SELECT CUSTOMERS. INVESTORS.
  • 3. SIGNALFX HIGH LEVEL ARCHITECTURE (diagram). Components: DATAPOINTS, INGEST, MESSAGEBUS, METADATA, TIME SERIES DATABASE, ANALYTICS, NOTIFICATION. Other diagram labels: 1s / 5s, 1m, 1h, •••; 8d, 32d, 384d; 1s 16m (IN MEMORY); 25; MICROSERVICES.
  • 4. BENEFITS OF MICROSERVICES ARCHITECTURE Application modules released independently • Enables agile development per team • Canary deployments for testing • Containers to enable rapid automated deployment & rollback Scale in/out with service load • Respond quickly to increased data flow or usage • Optimize infrastructure costs as load falls Technology flexibility • Easier to upgrade components or change platforms
  • 5. MONITORING A MICROSERVICE APPLICATION Real-time metrics and analytics • Detect issues and trends quickly Apply context: history, related events, service metadata • Metrics without context are just numbers! Tag-based monitoring for elastic/ephemeral services • Adapt automatically to monitor services as they scale in/out Shared, self-service access across all groups • App developers, tech leads, support, management • Avoid different tools for different teams
  • 6. ISSUES WITH TRADITIONAL MONITORING. CHALLENGE: Noisy, reactive monitoring • Too many alerts fire at once for a cluster-wide problem • Is the machine down because we scaled down the cluster or because we had a real problem? • Do we even care if a single node is down? • Component-specific monitoring configurations that require constant maintenance in ephemeral/elastic environments. What matters? Where to start?
  • 7. THE MODERN MONITORING LANDSCAPE. APM: Performance Testing, Pre-Flight, Luxury of Time. METRICS: Streaming Metrics Aggregated, In-Flight, Real Time Matters. LOGS: Black Box Recorder, Post-Flight, Luxury of Time.
  • 8. LET’S TALK METRICS. Metric name: cpu.idle. Metric value: 27. Metric type: gauge. Timestamp: 1234567. Dimensions: host = relic47df, datacenter = sjc1, env = prod, … Dimensions allow you to filter, aggregate, compare across sources.
  • 9. METRICS IN A TIME SERIES
      10:15:02  { "gauge": [{"metric":"cpu.idle", "dimensions": {"host":"hostname123", "datacenter":"snc"}, "value":249}] }
      10:15:03  { "gauge": [{"metric":"cpu.idle", "dimensions": {"host":"hostname123", "datacenter":"snc"}, "value":230}] }
      10:15:04  { "gauge": [{"metric":"cpu.idle", "dimensions": {"host":"hostname123", "datacenter":"snc"}, "value":202}] }
      10:15:05  { "gauge": [{"metric":"cpu.idle", "dimensions": {"host":"hostname123", "datacenter":"snc"}, "value":284}] }
  • 10. USING TIME SERIES ANALYTICS TO CORRELATE AND IDENTIFY PATTERNS. How well load balanced is this 8-node Kafka cluster? Create a signal to represent the cluster’s load balancing effectiveness, computed within seconds. Compare the signal against historical patterns and alert on anomalous patterns.
  • 11. GUIDED TRIAGE. CORRELATE EVENTS TO MONITORING (chart series: INFRASTRUCTURE, SERVICE; marker: PRODUCT LAUNCH EVENT). IS THIS A PROBLEM? ANALYSIS: 1. PRODUCT LAUNCH (EVENT RECORDED) 2. TRAFFIC SPIKE 3. TRANSIENT? 4. IMPACT EXISTING CUSTOMERS? 5. IS IT LEVELING OUT? 6. RAM ISSUE? 7. STORAGE ISSUE? 8. BUS BACKED UP?
  • 12. JOURNEY TO METRICS BASED MONITORING. From individual component checks to proactive management of service-wide performance. PHASE 0: Health checks and logs. PHASE 1: Small internal metrics system. PHASE 2: Build out scalable, highly-available metrics system. PHASE 3: Build out more sophisticated analytics.
  • 13. DEMO
  • 14. THANK YOU! jamison@signalfx.com rschofield@signalfx.com SIGN UP FOR A TRIAL AT: signalfx.com
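
Slides 8 to 10 describe datapoints that carry dimensions, and a derived signal for how evenly a cluster is loaded. The sketch below is one possible way to model that in plain Python, not SignalFx's implementation. The `Datapoint` class and the `select`, `aggregate_by`, and `imbalance` helpers are hypothetical names, and using the coefficient of variation as the "load balancing effectiveness" signal is an assumption; the slides do not say which statistic is used.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Datapoint:
    """One sample of a metric, tagged with dimensions (slide 8)."""
    metric: str
    value: float
    metric_type: str
    timestamp: int
    dimensions: dict


def select(points, metric, **dims):
    """Filter: keep datapoints for `metric` whose dimensions match `dims`."""
    return [
        p for p in points
        if p.metric == metric
        and all(p.dimensions.get(k) == v for k, v in dims.items())
    ]


def aggregate_by(points, dimension, fn=mean):
    """Aggregate: group values by one dimension and reduce each group."""
    groups = defaultdict(list)
    for p in points:
        groups[p.dimensions.get(dimension)].append(p.value)
    return {key: fn(values) for key, values in groups.items()}


def imbalance(per_node):
    """Assumed balance signal: population std dev divided by the mean.

    0.0 means every node carries the same load; larger values mean the
    load is spread less evenly across nodes.
    """
    values = list(per_node.values())
    avg = mean(values)
    return pstdev(values) / avg if avg else 0.0


if __name__ == "__main__":
    # The four cpu.idle samples from slide 9 (timestamps simplified to 0..3).
    samples = [
        Datapoint("cpu.idle", v, "gauge", t,
                  {"host": "hostname123", "datacenter": "snc"})
        for t, v in enumerate([249, 230, 202, 284])
    ]
    per_host = aggregate_by(select(samples, "cpu.idle", datacenter="snc"), "host")
    print(per_host)

    # Hypothetical per-node load for a small cluster (values are made up).
    print(imbalance({"node-1": 50, "node-2": 150}))
```

Because filtering and grouping key off dimensions rather than host names, the same query keeps working as nodes are added or removed, which is the property the deck attributes to tag-based monitoring.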