• Philip Tellis

• lognormal.com
• philip@lognormal.com
• @bluesmoon
• geek paranoid speedfreak
• http://bluesmoon.info/




    Boston #WebPerf Meetup / 2012-08-14   The Statistics of Web Performance Analysis   1
I’m a Web Speedfreak




We measure real user website performance




This talk is about the Statistics we learned while building it




The Statistics of Web Performance Analysis

            Philip Tellis / philip@lognormal.com


             Boston #WebPerf Meetup / 2012-08-14




0  Numbers

Accurately measure page performance∗




Be unintrusive




     If you try to measure something accurately, you will change something related
                                       – Heisenberg’s uncertainty principle




And one number to rule them all




What do we measure?




    • Network Throughput
    • Network Latency
    • User perceived page load time




We measure real user data




Which is noisy




1  Statistics - 1

Disclaimer




   I am not a statistician




1-1  Random Sampling



Population



                        All possible users of your system




Sample



                    Representative subset of the population




Bad sample



                      Sometimes the sample is not representative




How to randomize?




                                                                                   http://xkcd.com/221/




How to randomize?




      • Pick 10% of users at random and always test them

                                               OR

      • For each user, decide at random if they should be tested

   http://tech.bluesmoon.info/2010/01/statistics-of-performance-measurement.html




Select 10% of users - I




       if ($sessionid % 10 === 0) {   // assumes $sessionid is an integer
          // instrument code for measurement
       }

     • Once a user enters the measurement bucket, they stay
       there until they log out
     • Fixed set of users, so tests may be more consistent
     • An error in the sample is reinforced over time, because the
       same users keep being measured (positive feedback)




Select 10% of users - II




       if(rand() < 0.1 * getrandmax()) {
          // instrument code for measurement
       }

     • For every request, a user has a 10% chance of being
       tested
     • Gets rid of positive feedback errors, but the sample size is
       not exactly 10% of the population
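
A minimal sketch of why the per-request approach only approximates 10%, written in Python rather than the PHP used on the slide. The function name `sampled_fraction`, the request counts and the seed are illustrative assumptions, not part of the talk.

```python
import random

def sampled_fraction(num_requests, rate=0.1, seed=None):
    """Simulate per-request sampling and return the fraction of requests selected."""
    rng = random.Random(seed)  # seeded only so that runs are repeatable
    selected = sum(1 for _ in range(num_requests) if rng.random() < rate)
    return selected / num_requests

# The realized fraction hovers around the target rate and tightens as traffic grows.
for n in (100, 10_000, 100_000):
    print(n, sampled_fraction(n, seed=42))
```

With little traffic the realized fraction can be visibly off the target; it converges towards 10% as the number of requests grows.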




How big a sample is representative?




                                     Select n such that
                          $1.96\,\frac{\sigma}{\sqrt{n}} \le 5\%\,\mu$
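
Solving the inequality for n gives n ≥ (1.96 σ / (0.05 µ))². A short Python sketch of that calculation follows; the function name and the example figures (a 2000 ms mean with a 1500 ms standard deviation) are assumptions chosen for illustration.

```python
import math

def required_sample_size(sigma, mu, z=1.96, rel_error=0.05):
    """Smallest n such that z * sigma / sqrt(n) <= rel_error * mu."""
    if sigma < 0 or mu <= 0:
        raise ValueError("sigma must be non-negative and mu must be positive")
    return max(1, math.ceil((z * sigma / (rel_error * mu)) ** 2))

# Hypothetical page load times: mean 2000 ms, standard deviation 1500 ms.
print(required_sample_size(sigma=1500, mu=2000))
```

In practice σ and µ are not known in advance, so they would have to be estimated from an initial batch of measurements.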




1-2  Margin of Error

Standard Deviation

     • Standard deviation tells you the spread of the curve
     • The narrower the curve, the more confident you can be




MoE at 95% confidence




                          $\pm 1.96\,\frac{\sigma}{\sqrt{n}}$
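
As a rough Python sketch, the margin of error can be computed from a list of measurements as below. Using the sample standard deviation as the estimate of σ is an assumption here, and the function name is illustrative.

```python
import math
import statistics

def margin_of_error(samples, z=1.96):
    """Return z * s / sqrt(n), where s is the sample standard deviation."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    return z * statistics.stdev(samples) / math.sqrt(n)

load_times = [10, 11, 12, 11, 109]
print(statistics.mean(load_times), "+/-", margin_of_error(load_times))
```

A single outlier in so small a sample makes the margin of error larger than the mean itself, which is a hint that the data needs filtering or more samples.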




MoE & Sample size




   The margin of error is inversely proportional to the square root
                         of the sample size
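
As a worked example of this relationship, assuming σ stays the same while the sample grows, quadrupling the sample size only halves the margin of error:

```latex
\mathrm{MoE}(n) = 1.96\,\frac{\sigma}{\sqrt{n}}
\quad\Longrightarrow\quad
\mathrm{MoE}(4n) = 1.96\,\frac{\sigma}{\sqrt{4n}} = \tfrac{1}{2}\,\mathrm{MoE}(n)
```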




1-3  Central Tendency

One number




    • Mean (Arithmetic)
       • Good for symmetric curves
       • Affected by outliers


                Mean(10, 11, 12, 11, 109) = 30.6




One number




    • Median
       • Middle value measures central tendency well
       • Not trivial to pull out of a DB


              Median(10, 11, 12, 11, 109) = 11




One number




    • Mode
       • Not often used
       • Multi-modal distributions suggest problems


                Mode(10, 11, 12, 11, 109) = 11
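
The three candidates for "one number" can be checked against the running example with Python's standard `statistics` module. This is only a sketch for small in-memory lists; the variable names are illustrative.

```python
import statistics

samples = [10, 11, 12, 11, 109]

mean = statistics.mean(samples)      # pulled upwards by the outlier 109
median = statistics.median(samples)  # middle value of the sorted data
mode = statistics.mode(samples)      # most frequent value

print(mean, median, mode)
```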




Other numbers




    • A percentile point in the distribution: 95th, 98.5th or 99th
        • Used to find out the worst user experience
        • Makes more sense if you filter data first


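                P95th(10, 11, 12, 11, 109) = 12
                (after filtering out the outlier 109; unfiltered, the
                nearest-rank 95th percentile of these five values is 109)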
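
There are several percentile definitions in use. The Python sketch below assumes the nearest-rank method, which always returns a value that is actually present in the data; the function name is illustrative.

```python
import math

def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank into the sorted data
    return ordered[rank - 1]

print(percentile_nearest_rank([10, 11, 12, 11, 109], 95))  # unfiltered data
print(percentile_nearest_rank([10, 11, 12, 11], 95))       # outlier removed first
```

On very small samples a high percentile is simply the largest value, which is why filtering before taking percentiles matters.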




Other means




    • Geometric mean
        • Good if your data is exponential in nature
          (with the tail on the right)


           GMean(10, 11, 12, 11, 109) ≈ 17.37




Wait... how did I get that?




     $\sqrt[N]{\prod_{i=1}^{N} x_i}$   (could lead to overflow)

     $e^{\frac{1}{N}\sum_{i=1}^{N} \log_e(x_i)}$   (computationally simpler)
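
Both forms can be sketched in Python as follows, assuming all values are strictly positive. The function names are illustrative, and the list of three very large floats exists only to show the overflow problem.

```python
import math

def gmean_product(samples):
    """N-th root of the product; the intermediate product can overflow."""
    return math.prod(samples) ** (1 / len(samples))

def gmean_log(samples):
    """exp of the mean of natural logs; avoids the huge intermediate product."""
    return math.exp(sum(math.log(x) for x in samples) / len(samples))

samples = [10, 11, 12, 11, 109]
print(gmean_product(samples), gmean_log(samples))

# With large floating-point values the product overflows to infinity,
# while the log form still returns a finite result.
big = [1e200, 1e200, 1e200]
print(gmean_product(big), gmean_log(big))
```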




Other means




    And there is also the Harmonic mean, but forget about that




...though consequently




   We have other margins of error
     • Geometric margin of error
        • Uses geometric standard deviation
     • Median margin of error
        • Uses ranges of actual values from data set
     • Stick to the arithmetic MoE
        • Simpler to calculate, simpler to read and not incorrect
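
For comparison, one common way to attach a margin to a geometric mean is to work on the logarithms and transform back. The Python sketch below assumes that approach and uses the sample standard deviation of the logs; it is not necessarily the calculation the talk has in mind, and the function name is illustrative.

```python
import math
import statistics

def geometric_interval(samples, z=1.96):
    """Return (low, geometric mean, high) from an interval computed on log values."""
    logs = [math.log(x) for x in samples]  # requires strictly positive values
    log_mean = statistics.mean(logs)
    log_sd = statistics.stdev(logs)        # exp(log_sd) is a geometric standard deviation
    half_width = z * log_sd / math.sqrt(len(logs))
    return (math.exp(log_mean - half_width),
            math.exp(log_mean),
            math.exp(log_mean + half_width))

low, gmean, high = geometric_interval([10, 11, 12, 11, 109])
print(low, gmean, high)
```

The resulting interval is multiplicative (the mean times or divided by a factor) rather than plus or minus a fixed amount, which is part of why it is harder to read than the arithmetic MoE.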




2  Statistics - 2

2-1  Distributions

Let’s look at some real charts




Sparse Distribution




Log-normal distribution




Bimodal distribution




What does all of this mean?




Distributions




     • Sparse distribution suggests that you don’t have enough
       data points
     • Log-normal distribution is typical
     • Bi-modal distribution suggests two (or more) distributions
       combined




In practice, a bi-modal distribution is not uncommon




Hint: Does your site do a lot of back-end caching?




2-2  Filtering

Outliers




     • Out of range data points
     • Nothing you can fix here
     • There’s even a book about them




DNS problems can cause outliers




     • 2 or 3 DNS servers for an ISP
     • 30 second timeout if first fails
     • So page load time increases by 30 seconds
     • Maybe measure both and fix what you can
     • http://nms.lcs.mit.edu/papers/dns-ton2002.pdf




Band-pass filtering




     • Strip everything outside a reasonable range
         • Bandwidth range: 4kbps - 4Gbps
         • Page load time: 50ms - 120s
     • You may need to revisit the ranges regularly
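
A band-pass filter of this kind is a one-line range check. In the Python sketch below the bounds come from the slide, while the function name, the constant name, the choice of milliseconds and the inclusive bounds are assumptions.

```python
# Page load time band from the slide, expressed in milliseconds.
PAGE_LOAD_MS = (50, 120_000)

def band_pass(samples, low, high):
    """Keep only the values inside the inclusive range [low, high]."""
    return [x for x in samples if low <= x <= high]

load_times_ms = [10, 50, 900, 120_000, 130_000]
print(band_pass(load_times_ms, *PAGE_LOAD_MS))
```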




IQR filtering




                  Here, we derive the range from the data
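
A minimal Python sketch of IQR filtering follows. It assumes the conventional fences of 1.5 times the interquartile range below the first quartile and above the third, which the slide does not spell out, and it uses the "inclusive" quartile method from Python's `statistics` module (Python 3.8 or later).

```python
import statistics

def iqr_filter(samples, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR], with quartiles taken from the data."""
    q1, _, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in samples if low <= x <= high]

print(iqr_filter([10, 11, 12, 11, 109]))
```

Unlike the band-pass filter, nothing here is hard-coded: the acceptable range moves with the data.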




Further Reading




   lognormal.com/blog/2012/08/13/analysing-performance-data/




Summary




    • Choose a reasonable sample size and sampling factor
    • Tune sample size for minimal margin of error
    • Decide based on your data whether to use mode, median
      or one of the means
    • Figure out whether your data is Normal, Log-Normal or
      something else
    • Filter out anomalous outliers




Thank you




Photo credits




     • http://www.flickr.com/photos/leoffreitas/332360959/ by leoffreitas
     • http://www.flickr.com/photos/cobalt/56500295/ by cobalt123
     • http://www.flickr.com/photos/sophistechate/4264466015/ by Lisa
       Brewster




List of figures




     • http://en.wikipedia.org/wiki/File:Standard_deviation_diagram.svg
     • http://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg
     • http://en.wikipedia.org/wiki/File:KilroySchematic.svg
     • http://en.wikipedia.org/wiki/File:Boxplot_vs_PDF.png




