Advanced network performance measurement techniques


See how multi-point measures allow you to isolate network performance issues both temporally and spatially.

Published in: Technology

  • There is a branch of mathematics called Large Deviation Theory that has something to say about the predictive power of averages, and what it says is not comforting: as a predictor of underlying hazards and risks, an average is pretty bad. Capturing the distribution gives the ability to assure that arrival patterns are within specification (see QTAs later). Multipoint measurement gives some level of spatial identification, but suffers from the same issues as A (single-point averages), in that it remains a bad predictor of the hazards and risks. Multipoint instantaneous observation is the measurement nirvana: it makes available all the information that can be obtained by observation. In a well-designed system this is, by the principle of observational bisimilarity, the ultimate evidence of correct operation, including its performance aspects. Although some people pride themselves on capturing (average) data over smaller and smaller timescales, the real issue is the number of events that occur in each timeslot. This is where the M5 (a major holiday highway in the UK) analogy comes in: the number of possible “events” (i.e. packets) in 10 minutes on a 1Gb Ethernet is broadly equivalent to the number of events (cars on a three-lane highway) in two years. For “averaging” to make sense you would need to generate the averages over 20ms to 250ms intervals, and no one can afford that.
  • Note that S is not a number; it is a function from packet size to delay. S is not necessarily a simple line: it may have a more complex structure depending on media quantisation (e.g. ATM cells, WiFi) and bearer allocation choices (e.g. 3GPP).
  • Transcript

    • 1. Advanced network performance measurement techniques. Dr Neil Davies (Predictable Network Solutions Ltd), Peter Thompson (Predictable Network Solutions Ltd), Martin Geddes (Martin Geddes Consulting Ltd). PREDICTABLE NETWORK SOLUTIONS © 2013 All Rights Reserved
    • 2. Dr Neil Davies: Co-founder, Predictable Network Solutions Ltd. Peter Thompson: CTO, Predictable Network Solutions Ltd. Martin Geddes: Founder, Martin Geddes Consulting Ltd.
    • 3. The only ex ante network performance engineering company in the world. Consultancy on the future of telecoms and the Internet.
    • 4. Context for this presentation • We are all in the business of “information translocation” – The timely movement of information from one computational process to another • The value lies in delivering application outcomes – That people will pay for • You are reading this because you are interested in delivering successful outcomes – And understanding the causes of failure, so they can be mitigated • You may be working in a culture of deflecting the attribution of blame – We’d like to help you turn away from the path to the Dark Side
    • 5. What affects the timeliness? • The timeliness of application outcomes is dependent on the end-to-end loss and delay characteristics of the translocation • We call this end-to-end property ∆Q – ∆Q applies in each direction – not just the round trip – These characteristics need to be suitably bounded • ∆Q depends on the offered load – “Bandwidth” is an aspect of the relationship between offered load and ∆Q This presentation is about measuring ∆Q – and the benefits that approach brings
    • 6. Good measurement is NOT about averages • The average number of legs of a Swedish person is 1.9 – Now find me one! • Measuring average throughput on a 1Gb link over 10 mins is like measuring the traffic on the M5 motorway over two years – No indicator of my likely travel experience • Need to know the instantaneous properties – The ∆Q the “next packet” is going to get • It is all about the probability distribution of quality attenuation – This is what determines timeliness of application outcomes
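The point of slide 6 can be illustrated with a small sketch. The delay values below are made up for illustration and are not measurements from the presentation; `percentile` is a hypothetical helper using the nearest-rank method. Two sets of delays share the same mean, yet the delay the "next packet" may get is very different.

```python
import math
import statistics

def percentile(samples, p):
    """Nearest-rank percentile (p in 0..100) of a non-empty list of samples."""
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least p% of samples at or below it.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical one-way delays in milliseconds (illustrative values only).
steady = [10.0] * 100                # every packet takes 10 ms
bursty = [5.0] * 95 + [105.0] * 5    # most packets are fast, a few are very slow

for name, delays in (("steady", steady), ("bursty", bursty)):
    print(name, "mean:", statistics.mean(delays), "p99:", percentile(delays, 99))
```

Both lists have a mean of 10 ms, so an average alone cannot tell them apart; only the distribution shows that one in twenty packets in the second list is more than ten times slower.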
    • 7. One-point measures • This is the typical information captured by equipment today – Counters (e.g. packets passed, packet sizes, packets dropped) – Sampled over a period • Does not capture ∆Q – Not end-to-end • Multiple one-point measures don’t help • Creates an equipment-centric view – Focuses on the equipment, not the service to the customers – Leads to focus on capacity, and ignores schedulability
    • 8. Multipoint measures • Measure a value between different points – Not just counting things • Same “information translocation” at various points – Measuring the dynamics of the flow • Isolates issues, in both space and time – Excellent diagnostic power • Leads to a focus on schedulability and trading – Which in turn focuses on the outcomes for the customer
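As a minimal sketch of what a two-point measure involves, the following matches the same packets seen at two observation points and derives per-packet delay and loss. The function name, the packet identifiers and the timestamps are all hypothetical, and the clocks at both points are assumed to be synchronised; none of this is taken from the authors' tooling.

```python
def two_point_delays(sent, received):
    """Match packets observed at two points.

    sent / received map a packet identifier to the time (in seconds) at which
    that packet was observed at the upstream / downstream point.
    Returns (delays, lost): the delay of each matched packet, and the
    identifiers seen upstream but never seen downstream.
    """
    delays = {pid: received[pid] - t for pid, t in sent.items() if pid in received}
    lost = [pid for pid in sent if pid not in received]
    return delays, lost

# Hypothetical observations (illustrative values only).
at_a = {1: 0.000, 2: 0.010, 3: 0.020, 4: 0.030}
at_b = {1: 0.012, 2: 0.023, 4: 0.055}

delays, lost = two_point_delays(at_a, at_b)
loss_ratio = len(lost) / len(at_a)
print(delays, lost, loss_ratio)
```

Unlike a counter at a single point, this keeps the delay of every individual packet, so the full distribution (and the loss) between the two points is available for analysis.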
    • 9. Different measurement approaches • Single point, average: Offered Load and Utilisation (mean values only) – Limited predictive power • Single point, instantaneous: Arrival Patterns – Temporal predictive power, localised assurance (compliance with arrival pattern policy) • Multiple point, average: Delay and Loss (mean and variance) – Spatial predictive power • Multiple point, instantaneous: Temporal and spatial predictive power – Assurance of both arrival and service (demand and supply); represents all that can be known about a system (by observation)
    • 10. Interpreting the two-point measure. Raw data: there is no discernible structure here, so it is not possible to work with data in this form.
    • 11. Sort by packet size: a clear structure emerges
    • 12. Serialisation (or size-related) delay S. Packets with bigger payloads experience more structural delay: it takes longer to turn the packet into a bitstream, and back again into a packet at the next network element.
    • 13. G: Geographic delay • S: Serialisation delay • V: Variable contention delay
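One way to picture the G, S and V split is the sketch below. It is not the authors' method: it assumes S is a straight line in packet size (the slide notes warn that S may have a more complex structure), assumes at least one packet of each size met no contention, and needs at least two distinct packet sizes. The function names and the sample values are hypothetical.

```python
def fit_g_and_s(samples):
    """samples: list of (packet_size_bytes, delay_seconds).

    Takes the minimum delay seen for each packet size as the structural
    delay, then fits a least-squares line through those minima.
    Returns (g, slope): the intercept at size zero and the delay per byte.
    """
    min_by_size = {}
    for size, delay in samples:
        min_by_size[size] = min(delay, min_by_size.get(size, delay))
    sizes = list(min_by_size)
    minima = [min_by_size[s] for s in sizes]
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(minima) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)  # zero if only one size is present
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, minima))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def decompose(samples):
    """Return (size, G, S, V) per sample, where V is what G and S do not explain."""
    g, slope = fit_g_and_s(samples)
    return [(size, g, slope * size, delay - g - slope * size) for size, delay in samples]

# Hypothetical samples: G = 10 ms, 1 microsecond per byte, plus some contention.
samples = [(100, 0.0101), (100, 0.0131), (500, 0.0105),
           (500, 0.0125), (1500, 0.0115), (1500, 0.0215)]
g, slope = fit_g_and_s(samples)
for row in decompose(samples):
    print(row)
```

With a non-linear S (for example one quantised by cells or bearer allocation), the per-size minima themselves could be used as G + S instead of the fitted line.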
    • 14. Example multi-point measure The bi-directional, end-to-end path of a small cell deployment over commodity infrastructure
    • 15. How to read the information • Different views tell different stories • We’ll see some of those stories in the following slides • The focus on V is because that is where the issues of schedulability manifest themselves
    • 16. Key to following charts • Two point measures (by time) • GSV view (by packet) • V (by time) • V (by packet size) • V cumulative distribution function (main) • V cumulative distribution function (tail)
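The "main" and "tail" views of the cumulative distribution function of V listed in the key can be sketched as follows. The helper names, the V values and the 100 ms threshold are illustrative assumptions, not figures from the charts.

```python
def empirical_cdf(values):
    """Return (value, fraction of samples <= value) pairs in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]

def tail_fraction(values, threshold):
    """Fraction of samples strictly above the threshold (the tail of the CDF)."""
    return sum(1 for v in values if v > threshold) / len(values)

# Hypothetical V samples in seconds (illustrative values only).
v_samples = [0.0, 0.001, 0.002, 0.004, 0.120]

cdf = empirical_cdf(v_samples)
beyond_100ms = tail_fraction(v_samples, 0.1)
print(cdf)
print("fraction of packets with V above 100 ms:", beyond_100ms)
```

The main body of the CDF shows what most packets experience, while the tail fraction is the part that a mean hides and that tends to matter for application outcomes.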
    • 17. E to A direction (user experience) Return Transit (run dd0a2310-d235-495b-8d2f-a4dc [Figure: two panels, “Observed Delay against Experiment Run Time E->A” and “Observed Delay Variability (V) against Experiment Run Time E->A”; axes: delay (s) against run time (s)] Note the delay spike during the test run at approx. 60 seconds in. How can we begin to analyse this performance issue?
    • 18. E -> A (by packet) This ‘spike’ doesn’t appear to be related to a particular packet size (note that the ‘striations’ in the S value are an artefact of 3GPP scheduling)
    • 19. E -> A (Dynamic response) Removing G and S influences clearly highlights the magnitude of the contention issue
    • 20. Spatial Isolation Same magnitude issue between D and B, but not between D and C
    • 21. Spatial Isolation (2) It is occurring between C and B NOTE: this is the effect that we are measuring – NOT the cause (which in this case was not the access network but elsewhere) Armed with this information, we can begin to analyse root causes (e.g. what is over-driving this link?)
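The reasoning in slides 20 and 21 can be sketched as a comparison of segments along the path. The sketch assumes the observation points lie in the order E, D, C, B, A, that clocks are synchronised, and that all packets are the same size, so that each segment's minimum delay can stand in for its G and S. The function name and all timestamps are hypothetical.

```python
def segment_variability(observations, path):
    """observations: point name -> {packet id: timestamp in seconds}.
    path: observation points in the order packets traverse them.

    Returns {(upstream, downstream): [delay minus that segment's minimum delay]}
    for each adjacent pair of points, using packets seen at both points.
    """
    result = {}
    for up, down in zip(path, path[1:]):
        common = [pid for pid in observations[up] if pid in observations[down]]
        delays = [observations[down][pid] - observations[up][pid] for pid in common]
        base = min(delays)  # crude stand-in for G + S on this segment
        result[(up, down)] = [d - base for d in delays]
    return result

# Hypothetical timestamps: packet 3 is held up for 100 ms between C and B.
observations = {
    "E": {1: 0.00, 2: 1.00, 3: 2.00, 4: 3.00},
    "D": {1: 0.01, 2: 1.01, 3: 2.01, 4: 3.01},
    "C": {1: 0.02, 2: 1.02, 3: 2.02, 4: 3.02},
    "B": {1: 0.03, 2: 1.03, 3: 2.13, 4: 3.03},
    "A": {1: 0.04, 2: 1.04, 3: 2.14, 4: 3.04},
}

variability = segment_variability(observations, ["E", "D", "C", "B", "A"])
worst = max(variability, key=lambda seg: max(variability[seg]))
print("largest excess delay is on segment", worst)
```

As the slide stresses, this locates where the effect shows up, not what causes it; the cause may lie elsewhere.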
    • 22. ∆Q for ADSL line. Compare and contrast: baseline data for an ADSL line
    • 23. ∆Q for Femto (over ADSL). Now run a femtocell over that same line: much worse performance
    • 24. Summary • Multipoint distribution-based measurement gives access to all the information available through observation – “observation” is key – independent of equipment – Captures the influence of technology etc. • G, S and V give you a way of extracting both temporal and spatial details • Becomes extremely powerful when combined with analysis – E.g. you have a model of what V should be, or what G and S should be given the network layout
    • 25. Upcoming workshops: Sustainable Public Service Networks (London, 19th September 2013); Fundamentals of Network Performance (London, 20th September 2013)
    • 26. Neil Davies, Peter Thompson, Martin Geddes