SlideShare a Scribd company logo
©2015 Azul Systems, Inc.	 	 	 	 	 	
How NOT to
Measure Latency
Gil Tene, CTO & co-Founder, Azul Systems
@giltene
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
http://www.infoq.com/presentations/
latency-response-time
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
Presented at QCon San Francisco
www.qconsf.com
©2015 Azul Systems, Inc.	 	 	 	 	 	
The “Oh S@%#!” talk
Gil Tene, CTO & co-Founder, Azul Systems
@giltene
©2015 Azul Systems, Inc.	 	 	 	 	 	
About me: Gil Tene
co-founder, CTO @Azul
Systems

Have been working on
“think different” GC
approaches since 2002

A Long history building
Virtual & Physical
Machines, Operating
Systems, Enterprise apps,
etc...

I also depress people by
pulling the wool up from
over their eyes…
* working on real-world trash compaction issues, circa 2004
©2015 Azul Systems, Inc.
©2015 Azul Systems, Inc.	 	 	 	 	 	
Latency Behavior
Latency: The time it took one operation to happen

Each operation occurrence has its own latency

What we care about is how latency behaves

Behavior is a lot more than “the common case was X”
©2015 Azul Systems, Inc.	 	 	 	 	 	
We like to look at pretty charts…
95%’lie
The “We only want to show good things” chart
©2015 Azul Systems, Inc.
©2015 Azul Systems, Inc.	 	 	 	 	 	
A real world, real time example
©2015 Azul Systems, Inc.	 	 	 	 	 	
A real world, real time example
©2015 Azul Systems, Inc.	 	 	 	 	 	
A real world, real time example
So this is a better picture. Right?
©2015 Azul Systems, Inc.	 	 	 	 	 	
Why do we tend to avoid plotting Max latency?
Because no other %’ile will be visible on the same chart…
©2015 Azul Systems, Inc.	 	 	 	 	 	
I like to rant about latency…
©2015 Azul Systems, Inc.	 	 	 	 	 	
#LatencyTipOfTheDay:
If you are not measuring and/or
plotting Max, what are you hiding
(from)?
©2015 Azul Systems, Inc.
©2015 Azul Systems, Inc.	 	 	 	 	 	
What (TF) does the Average
of the 95%’lie mean?
©2015 Azul Systems, Inc.	 	 	 	 	 	
What (TF) does the Average
of the 95%’lie mean?
Lets do the same with 100%’ile; Suppose we a set of
100%’ile values for each minute:

[1, 0, 3, 1, 601, 4, 2, 8, 0, 3, 3, 1, 1, 0, 2]

“The average 100%’ile over the past 15 minutes was 42”

Same nonsense applies to any other %’lie
©2015 Azul Systems, Inc.	 	 	 	 	 	
#LatencyTipOfTheDay:
You can't average percentiles.
Period.
©2015 Azul Systems, Inc.	 	 	 	 	 	
Percentiles Matter
©2015 Azul Systems, Inc.	 	 	 	 	 	
Is the 99%’lie “rare”?
©2015 Azul Systems, Inc.	 	 	 	 	 	
99%’lie: a good indicator, right?
What are the chances of a single web page
view experiencing >99%’lie latency of:
- A single search engine node?
- A single Key/Value store node?
- A single Database node?
- A single CDN request?
©2015 Azul Systems, Inc.
©2015 Azul Systems, Inc.
©2015 Azul Systems, Inc.	 	 	 	 	 	
#LatencyTipOfTheDay:
MOST page loads will experience
the 99%'lie server response
©2015 Azul Systems, Inc.	 	 	 	 	 	
Which HTTP response time metric is more
“representative” of user experience?
The 95%’lie or the 99.9%’lie
©2015 Azul Systems, Inc.	 	 	 	 	 	
Gauging user experience
Example: If a typical user session involves 5 page
loads, averaging 40 resources per page.
- How many of our users will NOT experience
something worse than the 95%’lie of http requests?
Answer: ~0.003%
- How may of our users will experience at least one
response that is longer than the 99.9%’lie?
Answer: ~18%
©2015 Azul Systems, Inc.	 	 	 	 	 	
Gauging user experience
Example: If a typical user session involves 5 page
loads, averaging 40 resources per page.
- What http response percentile will be experienced
by the 95%’ile of users?
Answer: ~99.97%
- What http response percentile will be experienced
by the 99%’ile of users
Answer: ~99.995%
©2015 Azul Systems, Inc.	 	 	 	 	 	
#LatencyTipOfTheDay:
Median Server Response Time:
The number that 99.9999999999%
of page views can be worse than
©2015 Azul Systems, Inc.	 	 	 	 	 	
Why don’t we have response
time or latency stats with
multiple 9s in them???
©2015 Azul Systems, Inc.	 	 	 	 	 	
You can’t average
percentiles…
And you also can’t get an
hour’s 99.999%’lie out of lots
of 10 second interval 99%’lie
reports…
Why don’t we have response
time or latency stats with
multiple 9s in them???
©2015 Azul Systems, Inc.	 	 	 	 	 	
You can’t average percentiles…
And you also can’t get an hour’s
99.999%’lie out of lots
of 10 second interval 99%’lie reports…
Check out HdrHistogram
It lets you have nice things….
Why don’t we have response
time or latency stats with
multiple 9s in them???
0"
20"
40"
60"
80"
100"
120"
0" 500" 1000" 1500" 2000" 2500"
Hiccup&Dura*on&(mse
&Elapsed&Time&(sec)&
0%" 90%" 99%" 99.9%" 99.99%" 99.999%" 99.9999%"
0"
20"
40"
60"
80"
100"
120"
140"Hiccup&Dura*on&(msec)&
&
&
Percen*le&
Hiccups&by&Percen*le&Distribu*on&
Hiccups"by"Percen?le" SLA"
©2015 Azul Systems, Inc.	 	 	 	 	 	
Latency “wishful thinking”
We know how to compute
averages & std. deviation, etc.

Wouldn’t it be nice if latency
had a normal distribution?

The average, 90%’lie, 99%’lie,
std. deviation, etc. can give us
a “feel” for the rest of the
distribution, right?

If 99% of the stuff behaves
well, how bad can the rest be,
really?
©2015 Azul Systems, Inc.	 	 	 	 	 	
The real world: latency distribution
0%! 90%! 99%!
0!
0.05!
0.1!
0.15!
0.2!
0.25!
0.3!
0.35!
0.4!
0.45!
0.5!
Latency(msec)!
!
!
Percentile!
Latency by Percentile Distribution!
©2015 Azul Systems, Inc.	 	 	 	 	 	
The real world: latency distribution
0%! 90%! 99%! 99.9%!
0!
0.5!
1!
1.5!
2!
2.5!
3!
3.5!
4!
4.5!
5!
Latency(msec)!
!
!
Percentile!
Latency by Percentile Distribution!
©2015 Azul Systems, Inc.	 	 	 	 	 	
The real world: latency distribution
0%! 90%! 99%! 99.9%! 99.99%! 99.999%! 99.9999%!
0!
10!
20!
30!
40!
50!
60!
Latency(msec)!
!
!
Percentile!
Latency by Percentile Distribution!
©2015 Azul Systems, Inc.	 	 	 	 	 	
Dispelling standard deviation
0%# 90%# 99%# 99.9%# 99.99%# 99.999%# 99.9999%#
0#
10#
20#
30#
40#
50#
Latency(((msec)(
(
(
Percen/le(
Latency(by(Percen/le(Distribu/on(
A#
B#
C#
D#
E#
F#
©2015 Azul Systems, Inc.	 	 	 	 	 	
Dispelling standard deviation
0%# 90%# 99%# 99.9%# 99.99%# 99.999%# 99.9999%#
0#
10#
20#
30#
40#
50#
Latency(((msec)(
(
(
Percen/le(
Latency(by(Percen/le(Distribu/on(
A#
B#
C#
D#
E#
F#
Mean = 0.06 msec

Std. Deviation (𝞂) = 0.21msec
99.999% = 38.66msec
In a normal distribution,
These are NOT normal distributions
~184 σ (!!!) away from the mean
the 99.999%’ile falls within 4.5 σ
©2015 Azul Systems, Inc.	 	 	 	 	 	
The coordinated omission problem
An accidental conspiracy...
The lie in the 99%’lies
©2015 Azul Systems, Inc.	 	 	 	 	 	
The coordinated omission
problem
Common Example A (load testing):

each “client” issues requests at a certain rate

measure/log response time for each request

So what’s wrong with that?

works only if ALL responses fit within interval

implicit “automatic back off” coordination
©2015 Azul Systems, Inc.	 	 	 	 	 	
Common Example B: 

Coordinated Omission in Monitoring Code
Long operations only get measured once
delays outside of timing window do not get measured at all
How	bad	can	this	get?
40
Avg. is 1 msec
over 1st 100 sec
System Stalled
for 100 Sec
Elapsed Time
System easily handles
100 requests/sec
Responds to each
in 1msec
How would you characterize this system?
~50%‘ile is 1 msec ~75%‘ile is 50 sec 99.99%‘ile is ~100sec
Avg. is 50 sec.
over next 100 sec
Overall Average response time is ~25 sec.
Measurement	in	prac8ce
41
System Stalled
for 100 Sec
Elapsed Time
System easily handles
100 requests/sec
Responds to each
in 1msec
What actually gets measured?
(should be ~50sec) (should be ~100 sec)
50%‘ile is 1 msec 75%‘lie is 1 msec 99.99%‘lie is 1 msec
10,000 measurements
@ 1 msec each
1 measurement
@ 100 sec
Overall Average is 10.9 msec (!!!)
Proper	measurement
42
System Stalled
for 100 Sec
Elapsed Time
System easily handles
100 requests/sec
Responds to each
in 1msec
10,000 results
Varying linearly
from 100 sec
to 10 msec
10,000 results
@ 1 msec each
~50%‘ile is 1 msec ~75%‘ile is 50 sec 99.99%‘ile is ~100sec
Proper	measurement
43
System Stalled
for 100 Sec
Elapsed Time
System easily handles
100 requests/sec
Responds to each
in 1msec
10,000 results
Varying linearly
from 100 sec
to 10 msec
10,000 results
@ 1 msec each
~50%‘ile is 1 msec ~75%‘ile is 50 sec 99.99%‘ile is ~100sec
Coordinated
Omission
1 msec 1 msec
1 result
@ 100 sec
“Be?er”	can	look	“Worse”
44
System Slowed
for 100 Sec
Elapsed Time
System easily handles
100 requests/sec
Responds to each
in 1msec
10,000 @ 1msec 10,000 @ 5 msec
50%‘ile is 1 msec 75%‘lie is 2.5msec 99.99%‘lie is ~5msec
Still easily handles
100 requests/sec
Responds to each
in 5 msec
(stalled shows 1 msec) (stalled shows 1 msec)
“Correc8on”:	“Chea8ng	Twice”
45
System Stalled
for 100 Sec
Elapsed Time
System easily handles
100 requests/sec
Responds to each
in 1msec
10,000 results
Varying linearly
from 100 sec
to 10 msec
10,000 results
@ 1 msec each
~50%‘ile is 1 msec ~75%‘ile is 50 sec 99.994%‘ile is ~100sec
Coordinated
Omission
1 msec 1 msec
9,999 additional results
@ 1 msec each
1 result
@ 100 sec
“correction”
©2015 Azul Systems, Inc.	 	 	 	 	 	
Response Time vs. Service Time
©2015 Azul Systems, Inc.	 	 	 	 	 	
Service Time vs. Response Time
©2015 Azul Systems, Inc.	 	 	 	 	 	
Coordinated Omission
Usually
makes something that you think is a
Response Time metric
only represent
the Service Time component
©2015 Azul Systems, Inc.	 	 	 	 	 	
Response Time vs. Service Time @2K/sec
©2015 Azul Systems, Inc.	 	 	 	 	 	
Response Time vs. Service Time @20K/sec
©2015 Azul Systems, Inc.	 	 	 	 	 	
Response Time vs. Service Time @60K/sec
©2015 Azul Systems, Inc.	 	 	 	 	 	
Response Time vs. Service Time @80K/sec
©2015 Azul Systems, Inc.	 	 	 	 	 	
Response Time vs. Service Time @90K/sec
©2015 Azul Systems, Inc.	 	 	 	 	 	
How “real” people react
©2015 Azul Systems, Inc.	 	 	 	 	 	
Service Time, 90K/s vs 80K/s
©2015 Azul Systems, Inc.	 	 	 	 	 	
Response Time, 90K/s vs 80K/s
©2015 Azul Systems, Inc.	 	 	 	 	 	
Response Time, 90K/s vs 80K/s (to scale)
©2015 Azul Systems, Inc.	 	 	 	 	 	
Latency doesn’t live in a vacuum
©2015 Azul Systems, Inc.	 	 	 	 	 	
Sustainable Throughput:
The throughput achieved while
safely maintaining service levels
©2015 Azul Systems, Inc.	 	 	 	 	 	
Comparing behavior under different throughputs
and/or configurations
0%# 90%# 99%# 99.9%# 99.99%# 99.999%#
0#
20#
40#
60#
80#
100#
120#
!Dura&on!(msec)!
!
!
Percen&le!
Dura&on!by!Percen&le!Distribu&on!
Setup#D#
Setup#E#
Setup#F#
Setup#A#
Setup#B#
Setup#C#
©2015 Azul Systems, Inc.	 	 	 	 	 	
Comparing response time or
latency behaviors
©2015 Azul Systems, Inc.	 	 	 	 	 	
System A @90K/s & 85K/s vs.
System B @90K/s & 85K/s
Wrong Place to Look:
They both “suck” at >85K/sec
©2015 Azul Systems, Inc.	 	 	 	 	 	
System A 85K/s vs. System B 85K/s
Looks good, but still
the wrong place to look
©2015 Azul Systems, Inc.	 	 	 	 	 	
System A @40K/s vs. System B @40K/s
More interesting…
What can we do with this?
©2015 Azul Systems, Inc.	 	 	 	 	 	
System A @10K/s vs. System B @40K/s
E.g. if “99%’ile < 5msec” was a goal:
System B delivers similar 99%’ile and superior
99.9%’ile+ while carrying 4x the throughput
©2015 Azul Systems, Inc.	 	 	 	 	 	
System A @2K/s vs. System B @20K/s
E.g. if “99.9%’ile < 10msec” was a goal:
System B delivers similar 99%’ile and 99.9%’ile
while carrying 10x the throughput
©2015 Azul Systems, Inc.	 	 	 	 	 	
System A @2k thru 80k
©2015 Azul Systems, Inc.	 	 	 	 	 	
System A @2k thru 70k
©2015 Azul Systems, Inc.	 	 	 	 	 	
System B @20k thru 70k
©2015 Azul Systems, Inc.	 	 	 	 	 	
System A & System B @2k thru 70k
©2015 Azul Systems, Inc.	 	 	 	 	 	
System A & System B
10K/s thru 60K/s
System A @ 10K, 20K, 40K, 60K
System B @20K, 40K, 60K
Lots of conclusions can be drawn from the above…
E.g. System B delivers a consistent 100x reduction in the
rate of occurrence of >20msec response times
©2015 Azul Systems, Inc.	 	 	 	 	 	
System A: 200-1400 msec stalls
System B drawn to scale
Service timeResponse Time Response TimeService time
©2015 Azul Systems, Inc.	 	 	 	 	 	
This is Your Load on System A
This is Your Load on System B
Any Questions?
A simple visual summary
©2015 Azul Systems, Inc.	 	 	 	 	 	
http://www.azulsystems.com
Any Questions?
Watch the video with slide synchronization on
InfoQ.com!
http://www.infoq.com/presentations/latency-
response-time

More Related Content

What's hot

Distributed computing with spark
Distributed computing with sparkDistributed computing with spark
Distributed computing with spark
Javier Santos Paniego
 
Introduction to Microservices Patterns
Introduction to Microservices PatternsIntroduction to Microservices Patterns
Introduction to Microservices Patterns
Dimosthenis Botsaris
 
Elastic Observability keynote
Elastic Observability keynoteElastic Observability keynote
Elastic Observability keynote
Elasticsearch
 
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf
MukundThakur22
 
MuleSoft Runtime Fabric (RTF): Foundations : MuleSoft Virtual Muleys Meetups
MuleSoft Runtime Fabric (RTF): Foundations  : MuleSoft Virtual Muleys MeetupsMuleSoft Runtime Fabric (RTF): Foundations  : MuleSoft Virtual Muleys Meetups
MuleSoft Runtime Fabric (RTF): Foundations : MuleSoft Virtual Muleys Meetups
Angel Alberici
 
Algebraic Data Types for Data Oriented Programming - From Haskell and Scala t...
Algebraic Data Types forData Oriented Programming - From Haskell and Scala t...Algebraic Data Types forData Oriented Programming - From Haskell and Scala t...
Algebraic Data Types for Data Oriented Programming - From Haskell and Scala t...
Philip Schwarz
 
Square Engineering's "Fail Fast, Retry Soon" Performance Optimization Technique
Square Engineering's "Fail Fast, Retry Soon" Performance Optimization TechniqueSquare Engineering's "Fail Fast, Retry Soon" Performance Optimization Technique
Square Engineering's "Fail Fast, Retry Soon" Performance Optimization Technique
ScyllaDB
 
Monitoring MySQL Replication lag with Prometheus & pt-heartbeat
Monitoring MySQL Replication lag with Prometheus & pt-heartbeatMonitoring MySQL Replication lag with Prometheus & pt-heartbeat
Monitoring MySQL Replication lag with Prometheus & pt-heartbeat
Julien Pivotto
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
DataWorks Summit/Hadoop Summit
 
Postgresql tutorial
Postgresql tutorialPostgresql tutorial
Postgresql tutorial
Ashoka Vanjare
 
Druid deep dive
Druid deep diveDruid deep dive
Druid deep dive
Kashif Khan
 
Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
 Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit... Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
Databricks
 
How We Test MongoDB: Evergreen
How We Test MongoDB: EvergreenHow We Test MongoDB: Evergreen
How We Test MongoDB: Evergreen
MongoDB
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
Flink Forward
 
Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...
Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...
Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...
Amazon Web Services
 
Why Splunk Chose Pulsar_Karthik Ramasamy
Why Splunk Chose Pulsar_Karthik RamasamyWhy Splunk Chose Pulsar_Karthik Ramasamy
Why Splunk Chose Pulsar_Karthik Ramasamy
StreamNative
 
The Truth About the Service Mesh Data Plane
The Truth About the Service Mesh Data PlaneThe Truth About the Service Mesh Data Plane
The Truth About the Service Mesh Data Plane
Christian Posta
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
Gleb Kanterov
 
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptxGrafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
RomanKhavronenko
 
Microservices
MicroservicesMicroservices
Microservices
Stephan Lindauer
 

What's hot (20)

Distributed computing with spark
Distributed computing with sparkDistributed computing with spark
Distributed computing with spark
 
Introduction to Microservices Patterns
Introduction to Microservices PatternsIntroduction to Microservices Patterns
Introduction to Microservices Patterns
 
Elastic Observability keynote
Elastic Observability keynoteElastic Observability keynote
Elastic Observability keynote
 
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf
[ACNA2022] Hadoop Vectored IO_ your data just got faster!.pdf
 
MuleSoft Runtime Fabric (RTF): Foundations : MuleSoft Virtual Muleys Meetups
MuleSoft Runtime Fabric (RTF): Foundations  : MuleSoft Virtual Muleys MeetupsMuleSoft Runtime Fabric (RTF): Foundations  : MuleSoft Virtual Muleys Meetups
MuleSoft Runtime Fabric (RTF): Foundations : MuleSoft Virtual Muleys Meetups
 
Algebraic Data Types for Data Oriented Programming - From Haskell and Scala t...
Algebraic Data Types forData Oriented Programming - From Haskell and Scala t...Algebraic Data Types forData Oriented Programming - From Haskell and Scala t...
Algebraic Data Types for Data Oriented Programming - From Haskell and Scala t...
 
Square Engineering's "Fail Fast, Retry Soon" Performance Optimization Technique
Square Engineering's "Fail Fast, Retry Soon" Performance Optimization TechniqueSquare Engineering's "Fail Fast, Retry Soon" Performance Optimization Technique
Square Engineering's "Fail Fast, Retry Soon" Performance Optimization Technique
 
Monitoring MySQL Replication lag with Prometheus & pt-heartbeat
Monitoring MySQL Replication lag with Prometheus & pt-heartbeatMonitoring MySQL Replication lag with Prometheus & pt-heartbeat
Monitoring MySQL Replication lag with Prometheus & pt-heartbeat
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
 
Postgresql tutorial
Postgresql tutorialPostgresql tutorial
Postgresql tutorial
 
Druid deep dive
Druid deep diveDruid deep dive
Druid deep dive
 
Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
 Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit... Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
 
How We Test MongoDB: Evergreen
How We Test MongoDB: EvergreenHow We Test MongoDB: Evergreen
How We Test MongoDB: Evergreen
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...
Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...
Netflix Development Patterns for Scale, Performance & Availability (DMG206) |...
 
Why Splunk Chose Pulsar_Karthik Ramasamy
Why Splunk Chose Pulsar_Karthik RamasamyWhy Splunk Chose Pulsar_Karthik Ramasamy
Why Splunk Chose Pulsar_Karthik Ramasamy
 
The Truth About the Service Mesh Data Plane
The Truth About the Service Mesh Data PlaneThe Truth About the Service Mesh Data Plane
The Truth About the Service Mesh Data Plane
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptxGrafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
 
Microservices
MicroservicesMicroservices
Microservices
 

Similar to How NOT to Measure Latency

Misery Metrics & Consequences
Misery Metrics & ConsequencesMisery Metrics & Consequences
Misery Metrics & Consequences
ScyllaDB
 
Understanding Latency Response Time and Behavior
Understanding Latency Response Time and BehaviorUnderstanding Latency Response Time and Behavior
Understanding Latency Response Time and Behavior
zuluJDK
 
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and ToolsIntelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
Azul Systems Inc.
 
How NOT to Measure Latency
How NOT to Measure LatencyHow NOT to Measure Latency
How NOT to Measure Latency
Azul Systems, Inc.
 
Demystifying Sample Size - How Many Participants Do You Really Need for UX Re...
Demystifying Sample Size - How Many Participants Do You Really Need for UX Re...Demystifying Sample Size - How Many Participants Do You Really Need for UX Re...
Demystifying Sample Size - How Many Participants Do You Really Need for UX Re...
UserZoom
 
VSLive Orlando 2019 - When "We are down" is not good enough. SRE on Azure
VSLive Orlando 2019 - When "We are down" is not good enough. SRE on AzureVSLive Orlando 2019 - When "We are down" is not good enough. SRE on Azure
VSLive Orlando 2019 - When "We are down" is not good enough. SRE on Azure
Rene Van Osnabrugge
 
Site Reliability Engineering (SRE) - Tech Talk by Keet Sugathadasa
Site Reliability Engineering (SRE) - Tech Talk by Keet SugathadasaSite Reliability Engineering (SRE) - Tech Talk by Keet Sugathadasa
Site Reliability Engineering (SRE) - Tech Talk by Keet Sugathadasa
Keet Sugathadasa
 
Provisioning and Capacity Planning (Travel Meets Big Data)
Provisioning and Capacity Planning (Travel Meets Big Data)Provisioning and Capacity Planning (Travel Meets Big Data)
Provisioning and Capacity Planning (Travel Meets Big Data)
Brian Brazil
 
How Machines Help Humans Root Case Issues @ Netflix
How Machines Help Humans Root Case Issues @ NetflixHow Machines Help Humans Root Case Issues @ Netflix
How Machines Help Humans Root Case Issues @ Netflix
C4Media
 
Hidden Costs of Chasing the Mythical 'Five Nines'
Hidden Costs of Chasing the Mythical 'Five Nines'Hidden Costs of Chasing the Mythical 'Five Nines'
Hidden Costs of Chasing the Mythical 'Five Nines'
DevOpsDays DFW
 
Understanding Hardware Transactional Memory
Understanding Hardware Transactional MemoryUnderstanding Hardware Transactional Memory
Understanding Hardware Transactional Memory
C4Media
 
Reliability?
Reliability?Reliability?
Reliability?
Teiva Harsanyi
 
Forecasting using monte carlo simulations
Forecasting using monte carlo simulationsForecasting using monte carlo simulations
Forecasting using monte carlo simulations
Daniel Ploeg
 
Enabling Java in Latency Sensitive Environments
Enabling Java in Latency Sensitive EnvironmentsEnabling Java in Latency Sensitive Environments
Enabling Java in Latency Sensitive Environments
C4Media
 
Enabling Java in Latency-Sensitive Environments - Austin JUG April 2015
Enabling Java in Latency-Sensitive Environments - Austin JUG April 2015Enabling Java in Latency-Sensitive Environments - Austin JUG April 2015
Enabling Java in Latency-Sensitive Environments - Austin JUG April 2015
Azul Systems, Inc.
 
Enabling Java in Latency Sensitive Environments - Dallas JUG April 2015
Enabling Java in Latency Sensitive Environments - Dallas JUG April 2015Enabling Java in Latency Sensitive Environments - Dallas JUG April 2015
Enabling Java in Latency Sensitive Environments - Dallas JUG April 2015
Azul Systems, Inc.
 
Toc Education
Toc EducationToc Education
Toc Education
teddalexander
 
Enabling Java in Latency Sensitive Applications by Gil Tene, CTO, Azul Systems
Enabling Java in Latency Sensitive Applications by Gil Tene, CTO, Azul SystemsEnabling Java in Latency Sensitive Applications by Gil Tene, CTO, Azul Systems
Enabling Java in Latency Sensitive Applications by Gil Tene, CTO, Azul Systems
zuluJDK
 
When down is not good enough. SRE On Azure
When down is not good enough. SRE On AzureWhen down is not good enough. SRE On Azure
When down is not good enough. SRE On Azure
Rene Van Osnabrugge
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)
Brian Brazil
 

Similar to How NOT to Measure Latency (20)

Misery Metrics & Consequences
Misery Metrics & ConsequencesMisery Metrics & Consequences
Misery Metrics & Consequences
 
Understanding Latency Response Time and Behavior
Understanding Latency Response Time and BehaviorUnderstanding Latency Response Time and Behavior
Understanding Latency Response Time and Behavior
 
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and ToolsIntelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
 
How NOT to Measure Latency
How NOT to Measure LatencyHow NOT to Measure Latency
How NOT to Measure Latency
 
Demystifying Sample Size - How Many Participants Do You Really Need for UX Re...
Demystifying Sample Size - How Many Participants Do You Really Need for UX Re...Demystifying Sample Size - How Many Participants Do You Really Need for UX Re...
Demystifying Sample Size - How Many Participants Do You Really Need for UX Re...
 
VSLive Orlando 2019 - When "We are down" is not good enough. SRE on Azure
VSLive Orlando 2019 - When "We are down" is not good enough. SRE on AzureVSLive Orlando 2019 - When "We are down" is not good enough. SRE on Azure
VSLive Orlando 2019 - When "We are down" is not good enough. SRE on Azure
 
Site Reliability Engineering (SRE) - Tech Talk by Keet Sugathadasa
Site Reliability Engineering (SRE) - Tech Talk by Keet SugathadasaSite Reliability Engineering (SRE) - Tech Talk by Keet Sugathadasa
Site Reliability Engineering (SRE) - Tech Talk by Keet Sugathadasa
 
Provisioning and Capacity Planning (Travel Meets Big Data)
Provisioning and Capacity Planning (Travel Meets Big Data)Provisioning and Capacity Planning (Travel Meets Big Data)
Provisioning and Capacity Planning (Travel Meets Big Data)
 
How Machines Help Humans Root Case Issues @ Netflix
How Machines Help Humans Root Case Issues @ NetflixHow Machines Help Humans Root Case Issues @ Netflix
How Machines Help Humans Root Case Issues @ Netflix
 
Hidden Costs of Chasing the Mythical 'Five Nines'
Hidden Costs of Chasing the Mythical 'Five Nines'Hidden Costs of Chasing the Mythical 'Five Nines'
Hidden Costs of Chasing the Mythical 'Five Nines'
 
Understanding Hardware Transactional Memory
Understanding Hardware Transactional MemoryUnderstanding Hardware Transactional Memory
Understanding Hardware Transactional Memory
 
Reliability?
Reliability?Reliability?
Reliability?
 
Forecasting using monte carlo simulations
Forecasting using monte carlo simulationsForecasting using monte carlo simulations
Forecasting using monte carlo simulations
 
Enabling Java in Latency Sensitive Environments
Enabling Java in Latency Sensitive EnvironmentsEnabling Java in Latency Sensitive Environments
Enabling Java in Latency Sensitive Environments
 
Enabling Java in Latency-Sensitive Environments - Austin JUG April 2015
Enabling Java in Latency-Sensitive Environments - Austin JUG April 2015Enabling Java in Latency-Sensitive Environments - Austin JUG April 2015
Enabling Java in Latency-Sensitive Environments - Austin JUG April 2015
 
Enabling Java in Latency Sensitive Environments - Dallas JUG April 2015
Enabling Java in Latency Sensitive Environments - Dallas JUG April 2015Enabling Java in Latency Sensitive Environments - Dallas JUG April 2015
Enabling Java in Latency Sensitive Environments - Dallas JUG April 2015
 
Toc Education
Toc EducationToc Education
Toc Education
 
Enabling Java in Latency Sensitive Applications by Gil Tene, CTO, Azul Systems
Enabling Java in Latency Sensitive Applications by Gil Tene, CTO, Azul SystemsEnabling Java in Latency Sensitive Applications by Gil Tene, CTO, Azul Systems
Enabling Java in Latency Sensitive Applications by Gil Tene, CTO, Azul Systems
 
When down is not good enough. SRE On Azure
When down is not good enough. SRE On AzureWhen down is not good enough. SRE On Azure
When down is not good enough. SRE On Azure
 
Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)Prometheus (Prometheus London, 2016)
Prometheus (Prometheus London, 2016)
 

More from C4Media

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live Video
C4Media
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy Mobile
C4Media
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020
C4Media
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java Applications
C4Media
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No Keeper
C4Media
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like Owners
C4Media
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
C4Media
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
C4Media
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
C4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
C4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
C4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
C4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
C4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
C4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
C4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
C4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
C4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
C4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
C4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
C4Media
 

More from C4Media (20)

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live Video
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy Mobile
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java Applications
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No Keeper
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like Owners
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 

Recently uploaded

Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
shyamraj55
 
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
FIDO Alliance
 
Mastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for SuccessMastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for Success
David Wilson
 
What's New in Teams Calling, Meetings, Devices June 2024
What's New in Teams Calling, Meetings, Devices June 2024What's New in Teams Calling, Meetings, Devices June 2024
What's New in Teams Calling, Meetings, Devices June 2024
Stephanie Beckett
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
ldtexsolbl
 
LeadMagnet IQ Review: Unlock the Secret to Effortless Traffic and Leads.pdf
LeadMagnet IQ Review:  Unlock the Secret to Effortless Traffic and Leads.pdfLeadMagnet IQ Review:  Unlock the Secret to Effortless Traffic and Leads.pdf
LeadMagnet IQ Review: Unlock the Secret to Effortless Traffic and Leads.pdf
SelfMade bd
 
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision MakingConnector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
DianaGray10
 
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
bellared2
 
The History of Embeddings & Multimodal Embeddings
The History of Embeddings & Multimodal EmbeddingsThe History of Embeddings & Multimodal Embeddings
The History of Embeddings & Multimodal Embeddings
Zilliz
 
Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10
ankush9927
 
COVID-19 and the Level of Cloud Computing Adoption: A Study of Sri Lankan Inf...
COVID-19 and the Level of Cloud Computing Adoption: A Study of Sri Lankan Inf...COVID-19 and the Level of Cloud Computing Adoption: A Study of Sri Lankan Inf...
COVID-19 and the Level of Cloud Computing Adoption: A Study of Sri Lankan Inf...
AimanAthambawa1
 
Mastering Board Best Practices: Essential Skills for Effective Non-profit Lea...
Mastering Board Best Practices: Essential Skills for Effective Non-profit Lea...Mastering Board Best Practices: Essential Skills for Effective Non-profit Lea...
Mastering Board Best Practices: Essential Skills for Effective Non-profit Lea...
OnBoard
 
History and Introduction for Generative AI ( GenAI )
History and Introduction for Generative AI ( GenAI )History and Introduction for Generative AI ( GenAI )
History and Introduction for Generative AI ( GenAI )
Badri_Bady
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
Steven Carlson
 
Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024
siddu769252
 
Accelerating Migrations = Recommendations
Accelerating Migrations = RecommendationsAccelerating Migrations = Recommendations
Accelerating Migrations = Recommendations
isBullShit
 
UX Webinar Series: Aligning Authentication Experiences with Business Goals
UX Webinar Series: Aligning Authentication Experiences with Business GoalsUX Webinar Series: Aligning Authentication Experiences with Business Goals
UX Webinar Series: Aligning Authentication Experiences with Business Goals
FIDO Alliance
 
Retrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with RagasRetrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with Ragas
Zilliz
 
Welcome to Cyberbiosecurity. Because regular cybersecurity wasn't complicated...
Welcome to Cyberbiosecurity. Because regular cybersecurity wasn't complicated...Welcome to Cyberbiosecurity. Because regular cybersecurity wasn't complicated...
Welcome to Cyberbiosecurity. Because regular cybersecurity wasn't complicated...
Snarky Security
 
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
shanihomely
 

Recently uploaded (20)

Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
 
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
UX Webinar Series: Drive Revenue and Decrease Costs with Passkeys for Consume...
 
Mastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for SuccessMastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for Success
 
What's New in Teams Calling, Meetings, Devices June 2024
What's New in Teams Calling, Meetings, Devices June 2024What's New in Teams Calling, Meetings, Devices June 2024
What's New in Teams Calling, Meetings, Devices June 2024
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
 
LeadMagnet IQ Review: Unlock the Secret to Effortless Traffic and Leads.pdf
LeadMagnet IQ Review:  Unlock the Secret to Effortless Traffic and Leads.pdfLeadMagnet IQ Review:  Unlock the Secret to Effortless Traffic and Leads.pdf
LeadMagnet IQ Review: Unlock the Secret to Effortless Traffic and Leads.pdf
 
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision MakingConnector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
 
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
 
The History of Embeddings & Multimodal Embeddings
The History of Embeddings & Multimodal EmbeddingsThe History of Embeddings & Multimodal Embeddings
The History of Embeddings & Multimodal Embeddings
 
Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10Computer HARDWARE presenattion by CWD students class 10
Computer HARDWARE presenattion by CWD students class 10
 
COVID-19 and the Level of Cloud Computing Adoption: A Study of Sri Lankan Inf...
COVID-19 and the Level of Cloud Computing Adoption: A Study of Sri Lankan Inf...COVID-19 and the Level of Cloud Computing Adoption: A Study of Sri Lankan Inf...
COVID-19 and the Level of Cloud Computing Adoption: A Study of Sri Lankan Inf...
 
Mastering Board Best Practices: Essential Skills for Effective Non-profit Lea...
Mastering Board Best Practices: Essential Skills for Effective Non-profit Lea...Mastering Board Best Practices: Essential Skills for Effective Non-profit Lea...
Mastering Board Best Practices: Essential Skills for Effective Non-profit Lea...
 
History and Introduction for Generative AI ( GenAI )
History and Introduction for Generative AI ( GenAI )History and Introduction for Generative AI ( GenAI )
History and Introduction for Generative AI ( GenAI )
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
 
Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024
 
Accelerating Migrations = Recommendations
Accelerating Migrations = RecommendationsAccelerating Migrations = Recommendations
Accelerating Migrations = Recommendations
 
UX Webinar Series: Aligning Authentication Experiences with Business Goals
UX Webinar Series: Aligning Authentication Experiences with Business GoalsUX Webinar Series: Aligning Authentication Experiences with Business Goals
UX Webinar Series: Aligning Authentication Experiences with Business Goals
 
Retrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with RagasRetrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with Ragas
 
Welcome to Cyberbiosecurity. Because regular cybersecurity wasn't complicated...
Welcome to Cyberbiosecurity. Because regular cybersecurity wasn't complicated...Welcome to Cyberbiosecurity. Because regular cybersecurity wasn't complicated...
Welcome to Cyberbiosecurity. Because regular cybersecurity wasn't complicated...
 
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
 

How NOT to Measure Latency

  • 1. ©2015 Azul Systems, Inc. How NOT to Measure Latency Gil Tene, CTO & co-Founder, Azul Systems @giltene
  • 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/ latency-response-time
  • 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon San Francisco www.qconsf.com
  • 4. ©2015 Azul Systems, Inc. The “Oh S@%#!” talk Gil Tene, CTO & co-Founder, Azul Systems @giltene
  • 5. ©2015 Azul Systems, Inc. About me: Gil Tene co-founder, CTO @Azul Systems Have been working on “think different” GC approaches since 2002 A Long history building Virtual & Physical Machines, Operating Systems, Enterprise apps, etc... I also depress people by pulling the wool up from over their eyes… * working on real-world trash compaction issues, circa 2004
  • 7. ©2015 Azul Systems, Inc. Latency Behavior Latency: The time it took one operation to happen Each operation occurrence has its own latency What we care about is how latency behaves Behavior is a lot more than “the common case was X”
  • 8. ©2015 Azul Systems, Inc. We like to look at pretty charts… 95%’lie The “We only want to show good things” chart
  • 10. ©2015 Azul Systems, Inc. A real world, real time example
  • 11. ©2015 Azul Systems, Inc. A real world, real time example
  • 12. ©2015 Azul Systems, Inc. A real world, real time example So this is a better picture. Right?
  • 13. ©2015 Azul Systems, Inc. Why do we tend to avoid plotting Max latency? Because no other %’ile will be visible on the same chart…
  • 14. ©2015 Azul Systems, Inc. I like to rant about latency…
  • 15. ©2015 Azul Systems, Inc. #LatencyTipOfTheDay: If you are not measuring and/or plotting Max, what are you hiding (from)?
  • 17. ©2015 Azul Systems, Inc. What (TF) does the Average of the 95%’lie mean?
  • 18. ©2015 Azul Systems, Inc. What (TF) does the Average of the 95%’lie mean? Lets do the same with 100%’ile; Suppose we a set of 100%’ile values for each minute: [1, 0, 3, 1, 601, 4, 2, 8, 0, 3, 3, 1, 1, 0, 2] “The average 100%’ile over the past 15 minutes was 42” Same nonsense applies to any other %’lie
  • 19. ©2015 Azul Systems, Inc. #LatencyTipOfTheDay: You can't average percentiles. Period.
  • 20. ©2015 Azul Systems, Inc. Percentiles Matter
  • 21. ©2015 Azul Systems, Inc. Is the 99%’lie “rare”?
  • 22. ©2015 Azul Systems, Inc. 99%’lie: a good indicator, right? What are the chances of a single web page view experiencing >99%’lie latency of: - A single search engine node? - A single Key/Value store node? - A single Database node? - A single CDN request?
  • 25. ©2015 Azul Systems, Inc. #LatencyTipOfTheDay: MOST page loads will experience the 99%'lie server response
  • 26. ©2015 Azul Systems, Inc. Which HTTP response time metric is more “representative” of user experience? The 95%’lie or the 99.9%’lie
  • 27. ©2015 Azul Systems, Inc. Gauging user experience Example: If a typical user session involves 5 page loads, averaging 40 resources per page. - How many of our users will NOT experience something worse than the 95%’lie of http requests? Answer: ~0.003% - How may of our users will experience at least one response that is longer than the 99.9%’lie? Answer: ~18%
  • 28. ©2015 Azul Systems, Inc. Gauging user experience Example: If a typical user session involves 5 page loads, averaging 40 resources per page. - What http response percentile will be experienced by the 95%’ile of users? Answer: ~99.97% - What http response percentile will be experienced by the 99%’ile of users Answer: ~99.995%
  • 29. ©2015 Azul Systems, Inc. #LatencyTipOfTheDay: Median Server Response Time: The number that 99.9999999999% of page views can be worse than
  • 30. ©2015 Azul Systems, Inc. Why don’t we have response time or latency stats with multiple 9s in them???
  • 31. ©2015 Azul Systems, Inc. You can’t average percentiles… And you also can’t get an hour’s 99.999%’lie out of lots of 10 second interval 99%’lie reports… Why don’t we have response time or latency stats with multiple 9s in them???
  • 32. ©2015 Azul Systems, Inc. You can’t average percentiles… And you also can’t get an hour’s 99.999%’lie out of lots of 10 second interval 99%’lie reports… Check out HdrHistogram It lets you have nice things…. Why don’t we have response time or latency stats with multiple 9s in them??? 0" 20" 40" 60" 80" 100" 120" 0" 500" 1000" 1500" 2000" 2500" Hiccup&Dura*on&(mse &Elapsed&Time&(sec)& 0%" 90%" 99%" 99.9%" 99.99%" 99.999%" 99.9999%" 0" 20" 40" 60" 80" 100" 120" 140"Hiccup&Dura*on&(msec)& & & Percen*le& Hiccups&by&Percen*le&Distribu*on& Hiccups"by"Percen?le" SLA"
  • 33. ©2015 Azul Systems, Inc. Latency “wishful thinking” We know how to compute averages & std. deviation, etc. Wouldn’t it be nice if latency had a normal distribution? The average, 90%’lie, 99%’lie, std. deviation, etc. can give us a “feel” for the rest of the distribution, right? If 99% of the stuff behaves well, how bad can the rest be, really?
  • 34. ©2015 Azul Systems, Inc. The real world: latency distribution 0%! 90%! 99%! 0! 0.05! 0.1! 0.15! 0.2! 0.25! 0.3! 0.35! 0.4! 0.45! 0.5! Latency(msec)! ! ! Percentile! Latency by Percentile Distribution!
  • 35. ©2015 Azul Systems, Inc. The real world: latency distribution 0%! 90%! 99%! 99.9%! 0! 0.5! 1! 1.5! 2! 2.5! 3! 3.5! 4! 4.5! 5! Latency(msec)! ! ! Percentile! Latency by Percentile Distribution!
  • 36. ©2015 Azul Systems, Inc. The real world: latency distribution 0%! 90%! 99%! 99.9%! 99.99%! 99.999%! 99.9999%! 0! 10! 20! 30! 40! 50! 60! Latency(msec)! ! ! Percentile! Latency by Percentile Distribution!
  • 37. ©2015 Azul Systems, Inc. Dispelling standard deviation 0%# 90%# 99%# 99.9%# 99.99%# 99.999%# 99.9999%# 0# 10# 20# 30# 40# 50# Latency(((msec)( ( ( Percen/le( Latency(by(Percen/le(Distribu/on( A# B# C# D# E# F#
  • 38. ©2015 Azul Systems, Inc. Dispelling standard deviation 0%# 90%# 99%# 99.9%# 99.99%# 99.999%# 99.9999%# 0# 10# 20# 30# 40# 50# Latency(((msec)( ( ( Percen/le( Latency(by(Percen/le(Distribu/on( A# B# C# D# E# F# Mean = 0.06 msec Std. Deviation (𝞂) = 0.21msec 99.999% = 38.66msec In a normal distribution, These are NOT normal distributions ~184 σ (!!!) away from the mean the 99.999%’ile falls within 4.5 σ
  • 39. ©2015 Azul Systems, Inc. The coordinated omission problem An accidental conspiracy... The lie in the 99%’lies
  • 40. ©2015 Azul Systems, Inc. The coordinated omission problem Common Example A (load testing): each “client” issues requests at a certain rate measure/log response time for each request So what’s wrong with that? works only if ALL responses fit within interval implicit “automatic back off” coordination
  • 41. ©2015 Azul Systems, Inc. Common Example B: Coordinated Omission in Monitoring Code Long operations only get measured once delays outside of timing window do not get measured at all
  • 42. How bad can this get? 40 Avg. is 1 msec over 1st 100 sec System Stalled for 100 Sec Elapsed Time System easily handles 100 requests/sec Responds to each in 1msec How would you characterize this system? ~50%‘ile is 1 msec ~75%‘ile is 50 sec 99.99%‘ile is ~100sec Avg. is 50 sec. over next 100 sec Overall Average response time is ~25 sec.
  • 43. Measurement in prac8ce 41 System Stalled for 100 Sec Elapsed Time System easily handles 100 requests/sec Responds to each in 1msec What actually gets measured? (should be ~50sec) (should be ~100 sec) 50%‘ile is 1 msec 75%‘lie is 1 msec 99.99%‘lie is 1 msec 10,000 measurements @ 1 msec each 1 measurement @ 100 sec Overall Average is 10.9 msec (!!!)
  • 44. Proper measurement 42 System Stalled for 100 Sec Elapsed Time System easily handles 100 requests/sec Responds to each in 1msec 10,000 results Varying linearly from 100 sec to 10 msec 10,000 results @ 1 msec each ~50%‘ile is 1 msec ~75%‘ile is 50 sec 99.99%‘ile is ~100sec
  • 45. Proper measurement 43 System Stalled for 100 Sec Elapsed Time System easily handles 100 requests/sec Responds to each in 1msec 10,000 results Varying linearly from 100 sec to 10 msec 10,000 results @ 1 msec each ~50%‘ile is 1 msec ~75%‘ile is 50 sec 99.99%‘ile is ~100sec Coordinated Omission 1 msec 1 msec 1 result @ 100 sec
  • 46. “Be?er” can look “Worse” 44 System Slowed for 100 Sec Elapsed Time System easily handles 100 requests/sec Responds to each in 1msec 10,000 @ 1msec 10,000 @ 5 msec 50%‘ile is 1 msec 75%‘lie is 2.5msec 99.99%‘lie is ~5msec Still easily handles 100 requests/sec Responds to each in 5 msec (stalled shows 1 msec) (stalled shows 1 msec)
  • 47. “Correc8on”: “Chea8ng Twice” 45 System Stalled for 100 Sec Elapsed Time System easily handles 100 requests/sec Responds to each in 1msec 10,000 results Varying linearly from 100 sec to 10 msec 10,000 results @ 1 msec each ~50%‘ile is 1 msec ~75%‘ile is 50 sec 99.994%‘ile is ~100sec Coordinated Omission 1 msec 1 msec 9,999 additional results @ 1 msec each 1 result @ 100 sec “correction”
  • 48. ©2015 Azul Systems, Inc. Response Time vs. Service Time
  • 49. ©2015 Azul Systems, Inc. Service Time vs. Response Time
  • 50. ©2015 Azul Systems, Inc. Coordinated Omission Usually makes something that you think is a Response Time metric only represent the Service Time component
  • 51. ©2015 Azul Systems, Inc. Response Time vs. Service Time @2K/sec
  • 52. ©2015 Azul Systems, Inc. Response Time vs. Service Time @20K/sec
  • 53. ©2015 Azul Systems, Inc. Response Time vs. Service Time @60K/sec
  • 54. ©2015 Azul Systems, Inc. Response Time vs. Service Time @80K/sec
  • 55. ©2015 Azul Systems, Inc. Response Time vs. Service Time @90K/sec
  • 56. ©2015 Azul Systems, Inc. How “real” people react
  • 57. ©2015 Azul Systems, Inc. Service Time, 90K/s vs 80K/s
  • 58. ©2015 Azul Systems, Inc. Response Time, 90K/s vs 80K/s
  • 59. ©2015 Azul Systems, Inc. Response Time, 90K/s vs 80K/s (to scale)
  • 60. ©2015 Azul Systems, Inc. Latency doesn’t live in a vacuum
  • 61. ©2015 Azul Systems, Inc. Sustainable Throughput: The throughput achieved while safely maintaining service levels
  • 62. ©2015 Azul Systems, Inc. Comparing behavior under different throughputs and/or configurations 0%# 90%# 99%# 99.9%# 99.99%# 99.999%# 0# 20# 40# 60# 80# 100# 120# !Dura&on!(msec)! ! ! Percen&le! Dura&on!by!Percen&le!Distribu&on! Setup#D# Setup#E# Setup#F# Setup#A# Setup#B# Setup#C#
  • 63. ©2015 Azul Systems, Inc. Comparing response time or latency behaviors
  • 64. ©2015 Azul Systems, Inc. System A @90K/s & 85K/s vs. System B @90K/s & 85K/s Wrong Place to Look: They both “suck” at >85K/sec
  • 65. ©2015 Azul Systems, Inc. System A 85K/s vs. System B 85K/s Looks good, but still the wrong place to look
  • 66. ©2015 Azul Systems, Inc. System A @40K/s vs. System B @40K/s More interesting… What can we do with this?
  • 67. ©2015 Azul Systems, Inc. System A @10K/s vs. System B @40K/s E.g. if “99%’ile < 5msec” was a goal: System B delivers similar 99%’ile and superior 99.9%’ile+ while carrying 4x the throughput
  • 68. ©2015 Azul Systems, Inc. System A @2K/s vs. System B @20K/s E.g. if “99.9%’ile < 10msec” was a goal: System B delivers similar 99%’ile and 99.9%’ile while carrying 10x the throughput
  • 69. ©2015 Azul Systems, Inc. System A @2k thru 80k
  • 70. ©2015 Azul Systems, Inc. System A @2k thru 70k
  • 71. ©2015 Azul Systems, Inc. System B @20k thru 70k
  • 72. ©2015 Azul Systems, Inc. System A & System B @2k thru 70k
  • 73. ©2015 Azul Systems, Inc. System A & System B 10K/s thru 60K/s System A @ 10K, 20K, 40K, 60K System B @20K, 40K, 60K Lots of conclusions can be drawn from the above… E.g. System B delivers a consistent 100x reduction in the rate of occurrence of >20msec response times
  • 74. ©2015 Azul Systems, Inc. System A: 200-1400 msec stalls System B drawn to scale Service timeResponse Time Response TimeService time
  • 75. ©2015 Azul Systems, Inc. This is Your Load on System A This is Your Load on System B Any Questions? A simple visual summary
  • 76. ©2015 Azul Systems, Inc. http://www.azulsystems.com Any Questions?
  • 77. Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/latency- response-time