The Statistics of Web Performance
 


Analysis of user experience is typically done by taking a random sample of users, measuring their experiences and extracting a single number from that sample. In terms of web performance, the experience we need to measure is user-perceived page load time, and the single number we need to extract depends on the distribution of measurements across the sample.

There are a few contenders for what the magic number should be. Do you use the mean, median, mode, or something else? How do you determine the correctness of this number or whether your sample size is large enough? Is one number sufficient?

This talk covers some of the statistics behind figuring out which numbers one should be looking at and how to go about extracting them from the sample.

Upload Details

Uploaded as Adobe PDF

Usage Rights

CC Attribution-NonCommercial-ShareAlike License


    The Statistics of Web Performance: Presentation Transcript

    • The Statistics of Web Performance
      • Philip Tellis / philip@bluesmoon.info
      • ConFoo / 2010-03-12
      • Outline: Introduction, Statistics - I, Statistics - II
    • $ finger philip
      • Philip Tellis
      • philip@bluesmoon.info
      • @bluesmoon
      • yahoo
      • geek
    • Introduction
    • Accurately measure page performance
      • At least, as accurately as possible
    • Be unintrusive
      • If you try to measure something accurately, you will change something related – Heisenberg’s uncertainty principle
    • And one number to rule them all
    • Bandwidth
      • Real bandwidth vs. advertised bandwidth
      • Bandwidth to your server, not to the ISP
      • Bandwidth during normal internet usage
      • If the user’s always watching movies, you’re not winning
    • Latency
      • How long does it take a byte to get to the user?
      • Wired, wireless, mobile, satellite?
      • How many hops in between?
      • Speed of light is constant
      • This is not a battle we will soon win.
      • When was the last time you heard latency mentioned in a TV ad?
      • http://www.stuartcheshire.org/rants/Latency.html
    • User perceived page load time
      • Time from “click on a link” to “spinner stops spinning”
      • This is what users notice
      • Depends on how long your page takes to build
      • Depends on what’s in your page
      • Depends on how long components take to load
      • Depends on how long the browser takes to execute and render
    • We need to measure real user data
    • The statistics apply to any kind of performance data though
    • Statistics - I
    • Disclaimer: I am not a statistician
    • Population: All possible users of your system
    • Sample: Representative subset of the population
    • Bad sample: Sometimes it’s not
    • How to randomize?
      • Pick 10% of users at random and always test them
      • OR
      • For each user, decide at random if they should be tested
      • http://tech.bluesmoon.info/2010/01/statistics-of-performance-measurement.html
    • Select 10% of users - I
          if ($sessionid % 10 === 0) {
              // instrument code for measurement
          }
      • Once a user enters the measurement bucket, they stay there until they log out
      • Fixed set of users, so tests may be more consistent
      • Error in the sample results in positive feedback
    • Select 10% of users - II
          if (rand() < 0.1 * getrandmax()) {
              // instrument code for measurement
          }
      • For every request, a user has a 10% chance of being tested
      • Gets rid of positive feedback errors, but sample size != 10% of population
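
The two selection strategies can be sketched side by side in Python. This is a minimal illustration rather than the speaker's implementation. The function names are hypothetical, and hashing the session id with CRC32 is an assumption made here so that non-numeric session ids also work; the slides simply take a numeric id modulo 10.

```python
import random
import zlib


def in_sticky_sample(session_id: str, percent: int = 10) -> bool:
    """Strategy I: a given session is either always measured or never measured."""
    # crc32 returns the same integer for the same session id on every request,
    # so a session that lands in the bucket stays there.
    bucket = zlib.crc32(session_id.encode("utf-8")) % 100
    return bucket < percent


def in_per_request_sample(rate: float = 0.1, rng=random) -> bool:
    """Strategy II: every request independently has a `rate` chance of being measured."""
    return rng.random() < rate
```

With the sticky variant the measured population is fixed for the life of the session; with the per-request variant the share of measured requests only approaches 10% on average, which is the "sample size != 10% of population" caveat.
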
    • How big a sample is representative?
      • Select n such that $1.96\,\frac{\sigma}{\sqrt{n}} \le 5\%\,\mu$
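
Solving the inequality for n gives n ≥ (1.96σ / (0.05µ))². A small sketch of that calculation follows. The function name and the example figures (a 2.0 s mean with a 1.5 s standard deviation) are made up for illustration, and µ and σ would in practice be estimates taken from a pilot sample.

```python
import math


def required_sample_size(mean: float, stdev: float,
                         tolerance: float = 0.05, z: float = 1.96) -> int:
    """Smallest n for which z * stdev / sqrt(n) <= tolerance * mean."""
    if mean <= 0 or stdev < 0:
        raise ValueError("mean must be positive and stdev non-negative")
    return max(1, math.ceil((z * stdev / (tolerance * mean)) ** 2))


# Hypothetical pilot estimates: mean load time 2.0 s, standard deviation 1.5 s.
n = required_sample_size(mean=2.0, stdev=1.5)
```
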
    • Standard Deviation
      • Standard deviation tells you the spread of the curve
      • The narrower the curve, the more confident you can be
    • MoE at 95% confidence: $\pm 1.96\,\frac{\sigma}{\sqrt{n}}$
    • MoE & Sample size
      • The margin of error is inversely proportional to the square root of the sample size
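
The relationship between margin of error and sample size can be checked numerically. The helper names below are hypothetical. The sketch uses the sample standard deviation as the estimate of σ, which is an assumption the slides do not spell out.

```python
import math
import statistics


def margin_of_error(stdev: float, n: int, z: float = 1.96) -> float:
    """Margin of error at 95% confidence (z = 1.96) for a sample of size n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return z * stdev / math.sqrt(n)


def sample_margin_of_error(samples, z: float = 1.96) -> float:
    """Same formula, estimating the standard deviation from the sample itself."""
    return margin_of_error(statistics.stdev(samples), len(samples), z)


# Quadrupling the sample size halves the margin of error.
moe_400 = margin_of_error(stdev=1.5, n=400)
moe_1600 = margin_of_error(stdev=1.5, n=1600)
```
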
    • But wait... it’s not complicated enough.
      • We have different types of margins of error
      • ...more about that later
    • Ding dong
    • One number: Mean (Arithmetic)
      • Good for symmetric curves
      • Affected by outliers
      • Mean(10, 11, 12, 11, 109) = 30.6
    • One number: Median
      • Middle value measures central tendency well
      • Not trivial to pull out of a DB
      • Median(10, 11, 12, 11, 109) = 11
    • One number: Mode
      • Not often used
      • Multi-modal distributions suggest problems
      • Mode(10, 11, 12, 11, 109) = 11
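
All three measures are available in Python's standard `statistics` module, so the five sample values from the slides can be checked directly. The wrapper function is a hypothetical convenience. `multimode` is used here rather than `mode` so that a multi-modal sample shows up as more than one value.

```python
import statistics


def central_tendency(samples):
    """Return the arithmetic mean, the median and every mode of the sample."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "modes": statistics.multimode(samples),  # more than one entry => multi-modal
    }


load_times = [10, 11, 12, 11, 109]
summary = central_tendency(load_times)
print(summary)
```

The single outlier of 109 pulls the mean far above the other four values, while the median and mode are unaffected.
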
    • Other numbers
      • A percentile point in the distribution: 95th, 98.5th or 99th
      • Used to find out the worst user experience
      • Makes more sense if you filter data first
      • P95th(10, 11, 12, 11, 109) = 12
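
Percentiles have several competing definitions. The sketch below uses the nearest-rank method, which is an assumption since the slides do not say which one is meant. Under that definition the value of 12 quoted on the slide is what comes out once the outlier 109 has been filtered away; taken over all five values the 95th percentile is the outlier itself.

```python
import math


def percentile_nearest_rank(samples, p: float):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


unfiltered = percentile_nearest_rank([10, 11, 12, 11, 109], 95)
filtered = percentile_nearest_rank([10, 11, 12, 11], 95)
```
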
    • Other means: Geometric mean
      • Good if your data is exponential in nature (with the tail on the right)
      • GMean(10, 11, 12, 11, 109) ≈ 17.37
    • Wait... how did I get that?
      • $\sqrt[N]{\prod_{i=1}^{N} x_i}$ (could lead to overflow)
      • $e^{\frac{1}{N}\sum_{i=1}^{N}\log_e(x_i)}$ (computationally simpler)
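
The log-sum form translates directly into code. This is a minimal sketch with a hypothetical function name. It assumes every sample is strictly positive, which holds for load times and bandwidth but is not stated on the slides.

```python
import math


def geometric_mean(samples) -> float:
    """Geometric mean computed as exp(mean(log(x))) to avoid overflowing the product."""
    if not samples:
        raise ValueError("samples must not be empty")
    if any(x <= 0 for x in samples):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(x) for x in samples) / len(samples))


gmean = geometric_mean([10, 11, 12, 11, 109])

# Multiplying these ten values directly would overflow a float; the log form does not.
huge = geometric_mean([1e200] * 10)
```
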
    • Other means
      • And there is also the Harmonic mean, but forget about that
    • ...though consequently
      • We have other margins of error
      • Geometric margin of error: uses geometric standard deviation
      • Median margin of error: uses ranges of actual values from data set
      • Stick to the arithmetic MoE – simpler to calculate, simpler to read and not incorrect
    • Statistics - II
    • Outliers
      • Out of range data points
      • Nothing you can fix here
      • There’s even a book about them
    • DNS problems can cause outliers
      • 2 or 3 DNS servers for an ISP
      • 30 second timeout if first fails
      • ... 30 second increase in page load time
      • Maybe measure both and fix what you can
      • http://nms.lcs.mit.edu/papers/dns-ton2002.pdf
    • Band-pass filtering
      • Strip everything outside a reasonable range
      • Bandwidth range: 4kbps - 4Gbps
      • Page load time: 50ms - 120s
      • You may need to relook at the ranges all the time
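
A band-pass filter of this kind is a one-line range check. The sketch below uses the ranges from the slide, expressed in milliseconds and kbps. The constant and function names are hypothetical, and treating 4 Gbps as 4,000,000 kbps assumes decimal units.

```python
PAGE_LOAD_MS = (50, 120_000)      # 50 ms to 120 s
BANDWIDTH_KBPS = (4, 4_000_000)   # 4 kbps to 4 Gbps, assuming decimal units


def band_pass(samples, low, high):
    """Keep only the samples that fall inside the closed range [low, high]."""
    return [x for x in samples if low <= x <= high]


raw_load_times_ms = [12, 480, 950, 1_300, 31_000, 640_000]
plausible = band_pass(raw_load_times_ms, *PAGE_LOAD_MS)
```

Because the limits are fixed constants rather than derived from the data, they need to be revisited as connections and pages change.
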
    • IQR filtering
      • Here, we derive the range from the data
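
One common way to derive the range from the data is to keep everything within 1.5 times the interquartile range of the first and third quartiles. The 1.5 multiplier (Tukey's fences) and the "inclusive" quartile method are assumptions here, since the slide names the technique but gives no parameters. The function name is hypothetical.

```python
import statistics


def iqr_filter(samples, k: float = 1.5):
    """Drop points lying more than k * IQR below the first or above the third quartile."""
    if len(samples) < 2:
        return list(samples)
    q1, _median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in samples if low <= x <= high]


kept = iqr_filter([10, 11, 12, 11, 109])
```

For the five sample values used earlier in the talk the quartiles are 11 and 12, so the accepted range is 9.5 to 13.5 and only the 109 is dropped.
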
    • Let’s look at some real charts
    • Bandwidth distribution for web devs
      • x-axis is linear
    • Now let’s use log(kbps) instead of kbps
      • x-axis is exponential
    • Exponential == Geometric
      • Categories/Buckets grow exponentially
      • Data is related geometrically
      • Use the geometric mean and geometric margin of error
      • $\text{Error range} = \left[\frac{\mathit{gmean}}{\mathit{gmoe}},\ \mathit{gmean} \times \mathit{gmoe}\right]$
      • Non-linear ranges are hard for humans to grok
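
A sketch of the geometric error range follows. The slides give the range in terms of gmean and gmoe but not how gmoe is computed, so the definition used here is an assumption: gmoe = exp(1.96 · s / √n), where s is the standard deviation of the logged samples (equivalently, the geometric standard deviation raised to the power 1.96 / √n). The function name is hypothetical.

```python
import math
import statistics


def geometric_error_range(samples, z: float = 1.96):
    """Return (gmean / gmoe, gmean, gmean * gmoe) for strictly positive samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    logs = [math.log(x) for x in samples]  # raises ValueError for x <= 0
    gmean = math.exp(statistics.fmean(logs))
    # Assumed definition: the arithmetic margin of error of the logs, mapped back with exp.
    gmoe = math.exp(z * statistics.stdev(logs) / math.sqrt(len(logs)))
    return gmean / gmoe, gmean, gmean * gmoe


low, gmean, high = geometric_error_range([10, 11, 12, 11, 109])
```

The range is multiplicative rather than additive: the geometric mean sits at the geometric midpoint of the two limits, not halfway between them, which is why such ranges are harder to read.
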
    • So...
    • Further reading
      • Web Performance - Not a Simple Number: http://www.netforecast.com/Articles/BCR+C25+Web+Performance+-+Not+A+Simple+Number.pdf
      • Revisiting statistics for web performance (introduction to Log-Normal): http://home.pacbell.net/ciemo/statistics/WhatDoYouMean.pdf
      • Random Sampling: http://tech.bluesmoon.info/2010/01/statistics-of-performance-measurement.html
      • Khan Academy’s tutorials on statistics: http://khanacademy.com/
      • Learning about Statistical Learning: http://measuringmeasures.blogspot.com/2010/01/learning-about-statistical-learning.html
      • Wikipedia articles on Random Sampling, Central Tendency, Standard Error, Confounding, Means and IQR
    • Summary
      • Choose a reasonable sample size and sampling factor
      • Tune sample size for minimal margin of error
      • Decide based on your data whether to use mode, median or one of the means
      • Figure out whether your data is Normal, Log-Normal or something else
      • Filter out anomalous outliers
    • contact me
      • Philip Tellis
      • philip@bluesmoon.info
      • bluesmoon.info
      • @bluesmoon
    • Photo credits
      • http://www.flickr.com/photos/leoffreitas/332360959/ by leoffreitas
      • http://www.flickr.com/photos/cobalt/56500295/ by cobalt123
      • http://www.flickr.com/photos/sophistechate/4264466015/ by Lisa Brewster
      • http://www.flickr.com/photos/nchoz/243216008/ by nchoz
    • List of figures
      • http://en.wikipedia.org/wiki/File:Standard_deviation_diagram.svg
      • http://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg
      • http://en.wikipedia.org/wiki/File:KilroySchematic.svg
      • http://en.wikipedia.org/wiki/File:Boxplot_vs_PDF.png