Presented at All Things Open 2023
Presented by Dave McAllister - nginx
Title: Know Your Data: The stats behind your alerts
Abstract: Quick, what's the difference between the mean, the mode and the median? Which mean do you mean? Do you need a Gaussian or a normal distribution? And does your choice impact the alerts and observations you get from your observability tools?
Come get refreshed on the impact some basic choices in statistical behavior can have on what gets triggered. Learn why a median might be the choice for historical anomaly or sudden change. Jump into Gaussian distributions, data alignment challenges and the trouble with sampling. Walk out with a deeper understanding of your metrics and what they might be telling you.
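As a quick refresher on the distinction the abstract teases, here is a minimal sketch with made-up latency samples (the numbers are illustrative, not from the talk) showing how a single outlier drags the mean while the median and mode stay near the typical value - which is why a median can be the better basis for a historical-anomaly alert:

```python
from statistics import mean, median, mode

# Hypothetical response times in ms; one slow request skews the set
latencies = [110, 120, 115, 120, 118, 2500]

print(mean(latencies))    # pulled far above any typical request
print(median(latencies))  # 119: stays near the typical request
print(mode(latencies))    # 120: the most common observation
```

An alert thresholded on the mean here would fire on one stray request; one on the median would not.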
Find more info about All Things Open:
On the web: https://www.allthingsopen.org/
Twitter: https://twitter.com/AllThingsOpen
LinkedIn: https://www.linkedin.com/company/all-things-open/
Instagram: https://www.instagram.com/allthingsopen/
Facebook: https://www.facebook.com/AllThingsOpen
Mastodon: https://mastodon.social/@allthingsopen
Threads: https://www.threads.net/@allthingsopen
2023 conference: https://2023.allthingsopen.org/
OSMC 2023 | Know your data: The stats behind your alerts by Dave McAllister (NETWAYS)
Quick, what’s the difference between the mean, the mode and the median? Do you need a Gaussian or a normal distribution? And does your choice impact the alerts and observations you get from your observability tools?
Come get refreshed on the impact some basic choices in statistical behavior can have on what gets triggered. Learn why a median might be the choice for historical anomaly or sudden change. Jump into Gaussian distributions, data alignment challenges, and the trouble with sampling. Walk out with a deeper understanding of your metrics and what they might tell you.
OARC 30: What part of 'NO' is so hard for the DNS to understand? (APNIC)
APNIC Chief Scientist Geoff Huston presents 'What part of 'NO' is so hard for the DNS to understand?' at OARC 30 in Bangkok, Thailand, 12-13 May 2019.
With the rise of NoSQL databases, consistency models less strict than ACID transactions became popular again. After the first enthusiasm, the developer community became aware that these relaxed consistency models hold new challenges they never faced in the ACID world. Fortunately, there are concepts for dealing with those challenges. This presentation gives a rough introduction to the different consistency models available and their characteristics. It then focuses on two techniques for dealing with relaxed consistency. The first is quorum-based reads and writes, which provide a client-side strong consistency model (even if the database only implements eventual consistency). Then CRDTs (Conflict-free Replicated Data Types) are presented. CRDTs are self-stabilizing data structures designed for environments with extremely high availability requirements - and thus extremely weak consistency guarantees. Even though, as always, the majority of the information is on the voice track, I designed the slides so that they also provide some useful information without the voice.
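The quorum technique mentioned in that abstract boils down to an overlap condition: reads see the latest write when read and write quorums intersect. A minimal sketch of the check (the replica counts are illustrative, not tied to any particular database):

```python
def quorum_is_strong(n_replicas: int, r: int, w: int) -> bool:
    """Client-side strong consistency holds when every read quorum
    overlaps every write quorum (R + W > N) and write quorums overlap
    each other (W > N/2), even on an eventually consistent store."""
    return r + w > n_replicas and w > n_replicas / 2

print(quorum_is_strong(3, 2, 2))  # True: majority reads and writes
print(quorum_is_strong(3, 1, 1))  # False: eventual consistency only
print(quorum_is_strong(5, 1, 5))  # True: write-all lets you read-one
```

The trade-off the talk hints at is visible in the last line: cheap reads are bought with expensive, availability-hurting writes.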
Heuristic design of experiments with meta gradient search (Greg Makowski)
Once you have started learning about predictive algorithms, and the basic knowledge discovery in databases process, what is the next level of detail to learn for a consulting project?
* Give examples of the many model training parameters
* Track results in a "model notebook"
* Use a model metric that combines both accuracy and generalization to rank models
* How to strategically search over the model training parameters - use a gradient descent approach
* One way to describe an arbitrarily complex predictive system is by using sensitivity analysis
Hardware fails, applications fail, our code... well, it fails too (at least mine). To prevent software failure we test. Hardware failures are inevitable, so we write code that tolerates them, then we test. From tests we gather metrics and act upon them by improving parts that perform inadequately. Measuring the right things at the right places in an application is as much about good engineering practices and maintaining SLAs as it is about end-user experience, and may differentiate a successful product from a failure.
In order to act on performance metrics such as max latency and consistent response times, we need to know their accurate values. The problem with such metrics is that popular tools give us results that are not only inaccurate but also too optimistic.
During my presentation I will simulate services that require monitoring and show how the gathered metrics differ from the real numbers - all while using what currently seems to be the most popular metrics pipeline, Graphite together with the metrics.dropwizard.io library, and getting completely false results. We will learn to tune it and get much better accuracy. We will use JMeter to measure latency and observe how falsely reassuring the results are. Finally I will show how HdrHistogram helps in gathering reliable metrics. We will also run tests measuring the performance of different metric classes.
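To illustrate the kind of optimism that abstract warns about, here is a minimal sketch in plain Python with synthetic data (not Graphite, Dropwizard, or HdrHistogram themselves): a service that is mostly fast but occasionally stalls looks healthy by its mean, and only the high percentiles reveal the stalls.

```python
import random
from statistics import mean, quantiles

random.seed(1)
# Simulated latencies in ms: 990 fast requests plus 10 GC-like stalls
samples = [random.gauss(100, 10) for _ in range(990)] + [1500.0] * 10

cuts = quantiles(samples, n=100)  # the 99 percentile cut points
p50, p99 = cuts[49], cuts[98]
print(f"mean={mean(samples):.0f}ms  p50={p50:.0f}ms  p99={p99:.0f}ms")
```

The mean sits close to the median and hides the 1.5-second stalls entirely; the p99 does not, which is why histogram-based tools that retain the tail matter.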
This document discusses how to conduct spot speed studies to collect traffic speed data. It outlines the objectives of spot speed studies which include determining characteristics like the mean, median, mode and 85th percentile speed. It describes different speed study considerations and parameters of interest. It also covers how to analyze spot speed study data, check if the speed distribution is normal, and how to determine the required sample size.
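As a small illustration of the statistics such a study computes, here is a hedged sketch with hypothetical spot-speed readings; the 85th percentile speed is the figure traffic engineers typically quote when setting limits.

```python
from statistics import quantiles

# Hypothetical spot speeds (km/h) collected at one study location
speeds = [48, 52, 55, 57, 58, 60, 61, 62, 63, 64,
          65, 66, 67, 68, 70, 72, 74, 76, 80, 88]

# 85th percentile: the speed at or below which 85% of vehicles travel
v85 = quantiles(speeds, n=100)[84]
print(f"85th percentile speed: {v85:.1f} km/h")
```

With a larger sample the same call also yields the median (the 50th cut point), letting mean, median, and v85 be compared on one data set.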
As your service footprint grows, adding traffic control capabilities beyond stock solutions like kube-proxy becomes critical. Envoy provides fine grained routing control, load shedding, and metrics that help you scale your environment smoothly. We'll walk through several traffic control strategies using Envoy.
Initially presented at Software Architecture Conference in Boston, MA on 3/18/15.
Distributed systems are complex beasts. Breaking your application into multiple services introduces new types of errors, cascading failures, and CAP theorem limitations. Unfortunately, your uptime and sanity both suffer. This session will focus on various tactics and learnings from Lucid Software's migration to a service oriented architecture.
Finding Bugs Faster with Assertion Based Verification (ABV) - DVClub
1) Assertion-based verification introduces assertions into a design to improve observability and controllability during simulation and formal analysis.
2) Assertions define expected behavior and can detect errors by monitoring signals within a design.
3) An assertion-based verification methodology leverages assertions throughout the verification flow from module to system level using various tools like simulation, formal analysis, and acceleration for improved productivity, quality, and reduced verification time.
Reaching Consensus in Crowdsourced Transcription of Biocollections Information
Andréa Matsunaga (ammatsun@ufl.edu), Austin Mast, and José A.B. Fortes
10th IEEE International Conference on e-Science
October 23, 2014
Guarujá, SP, Brazil
EF-1115210
This document evaluates the Nebraska Department of Roads' (NDOR) Actuated Advance Warning (AAW) system at signalized intersections. It analyzes safety and operational data from 26 intersections with AAW systems and 29 comparison intersections. Statistical models show the AAW system likely reduces crashes by over 90%. Operational analyses found the system reduces the number of vehicles trapped in the "dilemma zone" and decreases the frequency that lights reach maximum time. Microsimulation models were developed and validated for two test sites. A sensitivity analysis examined how factors like turn percentage affect average wait times and conflicts. The conclusions recommend the AAW system for other high-speed intersections and provide guidelines for when to install or remove the systems based on measures like
Performance Issue? Machine Learning to the rescue! (Maarten Smeets)
It can be difficult to determine how to improve the performance of microservices. There are many factors you can vary, but which one will have the most impact? During this presentation, a method using the random forest machine learning algorithm will be applied to help improve the performance of a microservice running inside a JVM. Several measures are taken, such as throughput and response times. Java version, JVM supplier, heap, garbage collection algorithm and microservice framework are all varied. Which factor is most important in determining the response time and throughput of the services? The Random Forest algorithm will be introduced to solve this challenge. Not only will this presentation give some useful suggestions for improving the performance of microservices, but it will also introduce a novel way to take on the challenge of performance tuning which can be applied to other use cases. This presentation is especially interesting to developers and architects.
This document proposes fast single-pass k-means clustering algorithms to allow for fast nearest neighbor search on large datasets. It discusses the rationale for using k-means clustering, describes algorithms like ball k-means and surrogate methods that can perform clustering in a single pass. It covers implementations using techniques like locality sensitive hashing and projection search to speed up vector searches. Evaluation on synthetic and real datasets shows the algorithms can achieve the same or better accuracy as traditional k-means 10x faster, enabling applications like fast nearest neighbor search on massive datasets for applications like customer modeling.
Sean Kandel - Data profiling: Assessing the overall content and quality of a ... (huguk)
The task of “data profiling”—assessing the overall content and quality of a data set—is a core aspect of the analytic experience. Traditionally, profiling was a fairly cut-and-dried task: load the raw numbers into a stat package, run some basic descriptive statistics, and report the output in a summary file or perhaps a simple data visualization. However, data volumes can be so large today that traditional tools and methods for computing descriptive statistics become intractable; even with scalable infrastructure like Hadoop, aggressive optimization and statistical approximation techniques must be used. In this talk Sean will cover technical challenges in keeping data profiling agile in the Big Data era. He will discuss both research results and real-world best practices used by analysts in the field, including methods for sampling, summarizing and sketching data, and the pros and cons of using these various approaches.
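One of the classic sampling methods in the space this abstract surveys is reservoir sampling (Algorithm R), which profiles a stream of unknown length in one pass and constant memory. The sketch below is a generic textbook illustration, not Trifacta's implementation:

```python
import random

def reservoir_sample(stream, k, seed=0):
    """Algorithm R: keep a uniform random sample of k items from a
    stream of unknown length, in one pass and O(k) memory."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)          # fill the reservoir first
        else:
            j = rng.randrange(i + 1)     # replace with prob. k/(i+1)
            if j < k:
                sample[j] = item
    return sample

print(reservoir_sample(range(1_000_000), 5))
```

Descriptive statistics computed on the reservoir approximate those of the full stream, which is the pro; the con, as the talk notes, is that rare values and exact extremes can be missed.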
Sean is Trifacta’s Chief Technical Officer. He completed his Ph.D. at Stanford University, where his research focused on user interfaces for database systems. At Stanford, Sean led development of new tools for data transformation and discovery, such as Data Wrangler. He previously worked as a data analyst at Citadel Investment Group.
Bridging the Gap: Machine Learning for Ubiquitous Computing -- Evaluation (Thomas Ploetz)
Tutorial @Ubicomp 2015: Bridging the Gap -- Machine Learning for Ubiquitous Computing (evaluation session).
A tutorial on promises and pitfalls of Machine Learning for Ubicomp (and Human Computer Interaction). From Practitioners for Practitioners.
Presenter: Nils Hammerla <n.hammerla@gmail.com>
Video recording of the talks as they were held at Ubicomp:
https://youtu.be/LgnnlqOIXJc?list=PLh96aGaacSgXw0MyktFqmgijLHN-aQvdq
The document discusses research on learning to improve the efficiency of machine learning algorithms through speedup learning. It provides three key points:
1) Early work on explanation-based learning for speedup had limited success, but techniques like memoization and clause learning led to major improvements in SAT solvers.
2) More recent approaches use machine learning to build predictive models of problem instances and solver behavior, in order to inform strategies like automatic noise setting and randomized restart policies.
3) Case studies demonstrate these learning-based approaches can outperform traditional techniques and fixed policies by customizing resource allocation and reformulation based on problem structure and solver progress.
The document discusses research on learning to improve the efficiency of machine learning algorithms through speedup learning. It provides three key points:
1) Early work on explanation-based learning for speedup had limited success, but techniques like memoization and clause learning led to major improvements in SAT solvers.
2) More recent approaches use predictive models trained on dynamic features to learn optimal policies for controlling search algorithms, like setting noise levels or restart policies.
3) Open problems remain in developing optimal predictive policies with partial information and approximations, to continue improving search and reasoning performance.
(BDT207) Real-Time Analytics In Service Of Self-Healing Ecosystems (Amazon Web Services)
Netflix strives to provide an amazing experience to each member. To accomplish this, Netflix needs to maintain very high availability across our systems. However, at a certain scale, humans can no longer scale their ability to monitor the status of all systems, making it critical for Netflix to build tools and platforms that can automatically monitor their production environments and make intelligent real-time operational decisions to remedy the problems they identify. In this session, we discuss how Netflix uses data mining and machine learning techniques to automate decisions in real-time with the goal of supporting operational availability, reliability, and consistency. We review how we got to the current state, the lessons we learned, and the future of real-time analytics at Netflix. While Netflix's scale is larger than most other companies, we believe the approaches and technologies we discuss are highly relevant to other production environments, and audience members should come away with actionable ideas that are implementable in, and benefit, most other environments.
Decision Forests and discriminant analysis (potaters)
This document summarizes a tutorial on randomised decision forests and tree-structured algorithms. It discusses how tree-based algorithms like boosting and random forests can be used for tasks like object detection, tracking and segmentation. It also describes techniques for speeding up computation, such as converting boosted classifiers to decision trees and using multiple classifier systems. The tutorial is structured in two parts, covering tree-structured algorithms and randomised forests.
This document discusses various topics related to optimizing OLTP performance in Oracle databases, including:
1) Database performance principles such as acceptable CPU utilization levels and how user response times are affected by utilization levels above 60-65%.
2) Different connection architectures including dedicated servers, shared servers, and database resident connection pooling and their tradeoffs in terms of connection speed and code path length.
3) The importance of writing efficient SQL statements and maintaining proper schema statistics to enable the database to choose efficient execution plans.
4) Best practices for SQL optimization such as validating join conditions, indexes, partition pruning strategies, and parallelization levels are emphasized.
The document describes several parallel approaches for speeding up sequence alignment. It discusses splitting a DNA string into chunks that are distributed to worker nodes. Various techniques are proposed for handling matches that span multiple chunks, including using bigger chunks, on-demand requests for additional data, and having workers find partial matches along chunk edges to be combined by the master. The approaches are analyzed in terms of advantages and disadvantages, and a test plan is outlined to evaluate performance under varying parameters like number of workers and query length.
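The boundary-spanning problem described above is commonly handled by overlapping the chunks, so any match shorter than the overlap cannot straddle a split point unseen. A minimal sketch with a hypothetical helper (not taken from the document's test plan):

```python
def chunk_with_overlap(seq: str, chunk: int, overlap: int):
    """Split a sequence into chunks that overlap by `overlap`
    characters, so that no match up to `overlap` long is lost
    at a chunk boundary when workers scan independently."""
    step = chunk - overlap
    return [seq[i:i + chunk]
            for i in range(0, max(len(seq) - overlap, 1), step)]

print(chunk_with_overlap("ACGTACGTACGT", chunk=6, overlap=2))
# each chunk repeats the last 2 characters of its predecessor
```

Larger overlaps cover longer boundary matches at the cost of redundant work per worker - the same trade-off the document analyzes for its "bigger chunks" and "partial matches at chunk edges" variants.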
This document discusses factors that influence web search latency from both the user and system perspectives. It summarizes that users expect fast response times from search engines, while search engines aim to balance speed, quality, and costs. The document then outlines components that contribute to latency, experiments measuring user sensitivity to latency, and the impact of latency on user search experience. Specifically, it finds users notice delays over 1000ms and that faster search sites lead to higher user engagement.
Beyond Averages - Web Performance Meetup (Dan Kuebrich)
When raw data becomes overwhelming, we turn to abstraction to understand our world. In examining the performance of our systems, the data is always overwhelming. Solutions like summary statistics have come to our rescue, and they are good - up to a point. In order to truly understand our systems, we need to know when and how to sidestep those abstractions, to get deep, detailed performance insight. At this meetup, I’ll explore techniques for visualizing the underlying structure of performance data and how this empowers drilling down to populations and individual samples in the data set.
Towards Detecting Performance Anti-patterns Using Classification Techniques (James Hill)
This is the talk I gave on behalf of my Ph.D. student at the Machine Learning and Information Retrieval (MALIR) for Software Evolution (MALIR-SE) workshop at ASE 2013.
Splunk Enterprise for Information Security Hands-On Breakout Session (Splunk)
The document provides information about detecting various types of cyber attacks using Splunk, including web attacks, lateral movement, and DNS exfiltration. It includes examples of search queries and apps that can be used in Splunk to detect SQL injection, pass-the-hash attacks, and DNS tunneling used for data exfiltration. The document demonstrates how machine data from different sources can be analyzed in Splunk to gain visibility into attack behaviors and detect security incidents.
Beyond Averages - Web Performance MeetupDan Kuebrich
When raw data becomes overwhelming, we turn to abstraction to understand our world. In examining the performance of our systems, the data is always overwhelming. Solutions like summary statistics have come to our rescue, and they are good—up to a point. In order to truly understand our systems, we need to know when and how to sidestep those abstractions,to get deep, detailed performance insight. At this meetup, I’ll explore techniques for visualizing the underlying structure of performance data and how this empowers drilling down to populations and individual samples in the data set.
Towards Detecting Performance Anti-patterns Using Classification TechniquesJames Hill
This is the talk I gave on behalf of my Ph.D. student at the Machine Learning and Information Retrieval (MALIR) for Software Evolution (MALIR-SE) workshop at ASE 2013.
Splunk Enterprise for Information Security Hands-On Breakout SessionSplunk
The document provides information about detecting various types of cyber attacks using Splunk, including web attacks, lateral movement, and DNS exfiltration. It includes examples of search queries and apps that can be used in Splunk to detect SQL injection, pass-the-hash attacks, and DNS tunneling used for data exfiltration. The document demonstrates how machine data from different sources can be analyzed in Splunk to gain visibility into attack behaviors and detect security incidents.
Similar to Statistical Analysis of DNS Latencies.pdf (20)
Gen Z and the marketplaces - let's translate their needsLaura Szabó
The product workshop focused on exploring the requirements of Generation Z in relation to marketplace dynamics. We delved into their specific needs, examined the specifics in their shopping preferences, and analyzed their preferred methods for accessing information and making purchases within a marketplace. Through the study of real-life cases , we tried to gain valuable insights into enhancing the marketplace experience for Generation Z.
The workshop was held on the DMA Conference in Vienna June 2024.
Ready to Unlock the Power of Blockchain!Toptal Tech
Imagine a world where data flows freely, yet remains secure. A world where trust is built into the fabric of every transaction. This is the promise of blockchain, a revolutionary technology poised to reshape our digital landscape.
Toptal Tech is at the forefront of this innovation, connecting you with the brightest minds in blockchain development. Together, we can unlock the potential of this transformative technology, building a future of transparency, security, and endless possibilities.
3. DNS Performance Metrics (quick intro) Measuring DNS Latency
• Performance under normal conditions
• The data are right-skewed
‣ The usual descriptive statistics are misleading (arithmetic mean, standard deviation, …)
‣ Most of the queries are answered very quickly
‣ In fact, 95% of the queries are answered in under 2 milliseconds
‣ The tail is what makes it interesting
Ondřej Surý <ondrej@isc.org> 2024-05-21 1 / 8
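A minimal sketch of why the usual summary statistics mislead on right-skewed latency data. The lognormal sample below is a synthetic stand-in for real DNS response times, not actual measurement data:

```python
# Synthetic right-skewed "latencies": the mean is dragged up by the
# tail, while the median and percentiles describe the bulk honestly.
import numpy as np

rng = np.random.default_rng(42)
latencies_ms = rng.lognormal(mean=0.0, sigma=1.5, size=100_000)

print(f"mean:   {latencies_ms.mean():8.2f} ms")   # inflated by the tail
print(f"median: {np.median(latencies_ms):8.2f} ms")
print(f"p95:    {np.percentile(latencies_ms, 95):8.2f} ms")
print(f"p99:    {np.percentile(latencies_ms, 99):8.2f} ms")
```

For a distribution like this the mean lands well above the median, so an "average latency" alert would be reacting mostly to the tail.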
5. Logarithmic Percentile Histogram Measuring DNS Latency
• Both axes are logarithmic
‣ x-axis: slowest percentile
‣ y-axis: average latency
• It makes the tail more visible
• Variant of Complementary Cumulative Distribution Function
• Very robust, can be used for monitoring (1% slowest percentile)
• Introduced by the good folks at PowerDNS
See more: https://blog.powerdns.com/2017/11/02/dns-performance-metrics-the-logarithmic-percentile-histogram
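A rough sketch of the idea behind the chart (not PowerDNS's exact bucketing): for each "slowest percentile" cutoff, compute the mean latency of the queries above it. Synthetic lognormal data again stands in for real measurements:

```python
# Mean latency of the slowest 50% / 10% / 1% / 0.1% of queries.
# Plotting these points with both axes on a log scale gives the
# logarithmic percentile histogram described above.
import numpy as np

rng = np.random.default_rng(7)
latencies_ms = np.sort(rng.lognormal(mean=0.0, sigma=1.5, size=100_000))

for pct in (50, 90, 99, 99.9):
    cutoff = np.percentile(latencies_ms, pct)
    tail = latencies_ms[latencies_ms >= cutoff]
    print(f"slowest {100 - pct:5.1f}%: mean {tail.mean():8.2f} ms")
```

Because each point is an average over a shrinking tail, the curve stays robust enough to monitor continuously, e.g. alerting on the slowest-1% point.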
7. DNS Performance for Developers Compare DNS Latencies
• Comparing two branches of BIND 9
‣ Did we improve the code?
‣ Did we make things worse?
‣ Currently, we compare the graphs by looking at them;
‣ then we run more tests;
‣ and then apply some wishful thinking…
• Sending thanks to Python’s numpy and scipy developers!
8. Pick the right statistics Compare DNS Latencies
• The distribution is not normal
• A non-parametric test, then?
‣ Kolmogorov-Smirnov test didn’t really work
• Normalize the data?
‣ The Box-Cox transformation didn't really work
• Maybe look only at the tail then?
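A sketch of the two approaches the slide says didn't really work here, using scipy: a Kolmogorov-Smirnov two-sample test and a Box-Cox normalisation. The two lognormal samples below are illustrative stand-ins for measurements from two BIND 9 branches:

```python
# Two synthetic "branches" with right-skewed latency distributions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
branch_a = rng.lognormal(mean=0.0, sigma=1.2, size=5_000)
branch_b = rng.lognormal(mean=0.05, sigma=1.2, size=5_000)

# KS test: no normality assumption, but with large samples it flags
# even tiny, practically irrelevant differences as significant.
ks = stats.ks_2samp(branch_a, branch_b)
print(f"KS statistic = {ks.statistic:.4f}, p = {ks.pvalue:.4g}")

# Box-Cox (strictly positive data only): pulls the skewed sample
# toward a normal shape by picking a power transform lambda.
transformed, lmbda = stats.boxcox(branch_a)
print(f"Box-Cox lambda = {lmbda:.3f}")
```

The transform does reduce skew, but whether the result is normal enough for parametric tests on real latency data is exactly the problem the slide reports.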
9. Looking at the tail Compare DNS Latencies
• Pick the 95th (99th) percentile complement
‣ Either return the lowest bucket needed to contain the slowest 5% of responses
‣ Or count the answers in the slowest buckets (1.9-2.0 seconds)
• Have at least 3 runs for each group
• Yay! The data are normal and the group variances are equal
‣ Shapiro-Wilk test
– first group (𝑊 = 0.905, 𝑝 = 0.436)
– second group (𝑊 = 0.970, 𝑝 = 0.874)
‣ Brown-Forsythe test (𝐹 = 0.070, 𝑝 = 0.798)
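The two checks can be run with scipy as sketched below: Shapiro-Wilk for normality of each group, and Brown-Forsythe (Levene's test centred on the median) for equal variances. The per-run tail counts are made up for illustration; real values would come from at least 3 runs per branch:

```python
# Hypothetical tail-metric values (e.g. slow answers per run),
# three runs per group -- not the measurements from the slide.
import numpy as np
from scipy import stats

baseline  = np.array([512.0, 498.0, 505.0])
candidate = np.array([431.0, 440.0, 425.0])

w1, p1 = stats.shapiro(baseline)
w2, p2 = stats.shapiro(candidate)
print(f"Shapiro-Wilk baseline:  W = {w1:.3f}, p = {p1:.3f}")
print(f"Shapiro-Wilk candidate: W = {w2:.3f}, p = {p2:.3f}")

# center='median' turns Levene's test into the Brown-Forsythe variant
f, p = stats.levene(baseline, candidate, center='median')
print(f"Brown-Forsythe: F = {f:.3f}, p = {p:.3f}")
```

High p-values on both checks are what license the parametric tests on the next slide.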
10. Parametric test (ANOVA) Compare DNS Latencies
• We can test more than two branches
• One-way ANOVA reports a difference between branches
‣ 𝐹 = 9244.090, 𝑝 < .001
• Two-sample t-test (for confirmation)
‣ 𝑇 = −96.146, 𝑝 < .001
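A sketch of the final comparison with scipy: one-way ANOVA across the branches, with a two-sample t-test on one pair as a cross-check. The per-run numbers are illustrative, not the measurements behind the F and T values quoted above:

```python
# Hypothetical tail-metric values for three branches, three runs each.
import numpy as np
from scipy import stats

baseline = np.array([512.0, 498.0, 505.0])
branch_a = np.array([431.0, 440.0, 425.0])
branch_b = np.array([455.0, 462.0, 449.0])

# One-way ANOVA generalises to any number of branches in one test
f, p = stats.f_oneway(baseline, branch_a, branch_b)
print(f"one-way ANOVA: F = {f:.3f}, p = {p:.4g}")

# t-test on a single pair as a sanity check of the ANOVA result
t, p2 = stats.ttest_ind(baseline, branch_a)
print(f"t-test baseline vs branch_a: t = {t:.3f}, p = {p2:.4g}")
```

A small p-value from both tests says the branches differ; the sign of t tells which direction the change went.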
12. Other tests? More ideas?
• Is this even correct? Or am I crazy? (I’m not a statistician)
• Can we just compare two data sets (1x baseline with 1x branch)?
• Can we use the full (right-skewed) population?
• Are there any other non-parametric tests I can try/use?
• Are there any other suitable statistical methods?
• Is this useful for other Internet measurements?