
PEnDAR webinar 2 with notes

Second webinar from the PEnDAR project, going into more detail about validating quantified performance requirements with a worked example.


  1. PEnDAR Focus Group: Performance Ensurance by Design, Assuring Requirements. Second Webinar. © Predictable Network Solutions Ltd 2017, www.pnsol.com
  2. Recap: the recording of the previous webinar is at https://attendee.gotowebinar.com/register/1615742612853821699 and also on SlideShare: https://www.slideshare.net/pnsol-slides
  3. Goals:
     • Consider how to enable validation & verification of cost and performance:
       • for distributed and hierarchical systems;
       • via sophisticated but easy-to-use tools;
       • supporting both initial and ongoing incremental development.
     • Provide early visibility of cost/performance hazards:
       • avoiding costly failures;
       • maximising the chances of successful in-budget delivery of acceptable end-user outcomes.
     (Partners. Supported by: PEnDAR project.)
  4. Role of the focus group.
     Our offer to you: learn what we know about the laws of the distributed world:
     • where the realm of the possible lies;
     • how to push the envelope in a way that benefits you most;
     • how to innovate without getting your fingers burnt.
     Get an insight into the future:
     • understand more about achieving sustainability and viability;
     • see where competition may lie in the future.
     Our 'ask': help us to find out:
     • in which markets/sectors/application domains will the benefits outweigh the costs?
     • which are the major benefits, and which are just 'nice to have'?
     • how could this methodology be consumed: tools? services?
  5. Outline of a coherent methodology
  6. [Diagram relating: outcomes delivered; resources consumed; variability; exception/failure; externally created; mitigation; system impact; propagation; scale; schedulability; capacity; distance; number; time; space; density.]
  7. Performance/resource analysis. [Diagram: subsystems with shared resources, the rest of the system seen as ∆Q.] Starting with a functional decomposition:
     • Take each subsystem in isolation:
       • analyse performance, modelling the remainder of the system as ∆Q;
       • quantify resource consumption, which may be dependent on the ∆Q.
     • Examine resource sharing:
       • within the system: quantify resource costs;
       • between systems: quantify opportunity cost.
     • Successive refinements:
       • consider couplings;
       • iterate to a fixed point.
     (Quantitative Timeliness Agreement: not today.)
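The 'iterate to a fixed point' step can be made concrete with a small sketch. Everything below (the load-to-latency curve, the retry rule, the numbers, the function names) is an illustrative assumption of mine rather than material from the slides; the point is only the shape of the iteration: offered load determines latency, latency determines retries, and retries feed back into offered load.

```python
# Minimal sketch of "iterate to a fixed point" for a load/delay coupling.
# The latency model and all numbers are illustrative assumptions.

def latency_at(load):
    """Hypothetical server latency (seconds) as a function of offered load (requests/s)."""
    capacity = 100.0                        # assumed service capacity
    utilisation = min(load / capacity, 0.99)
    return 0.05 / (1.0 - utilisation)       # simple queueing-style blow-up near saturation

def retry_fraction(latency, timeout=0.333):
    """Hypothetical fraction of requests that time out and are retried."""
    return 0.0 if latency < timeout else min(1.0, (latency - timeout) / timeout)

def fixed_point(base_load, iterations=50):
    """Offered load = base load plus retries; retries depend on latency, which depends on load."""
    load = base_load
    for _ in range(iterations):
        new_load = base_load * (1.0 + retry_fraction(latency_at(load)))
        if abs(new_load - load) < 1e-6:     # converged
            break
        load = new_load
    return load, latency_at(load)

if __name__ == "__main__":
    for base in (50, 80, 95):
        load, lat = fixed_point(base)
        print(f"base={base:5.1f} req/s -> effective load={load:6.1f} req/s, latency={lat * 1000:6.1f} ms")
```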
  8. Recap of ∆Q:
     • ∆Q is a measure of the 'quality impairment' of an outcome: the extent of deviation from 'instantaneous and infallible'. Nothing in the real world is perfect, so ∆Q always exists.
     • ∆Q is conserved: a delayed outcome can't be 'undelayed'; a failed outcome can't be 'unfailed'.
     • ∆Q can be traded, e.g. accept more delay in return for more certainty of completion.
     • ∆Q can be represented as an improper random variable, combining continuous and discrete probabilities, thus encompassing normal behaviour and exceptions/failures in one model.
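As an illustration of the last point, here is a minimal sketch (my own, not from the slides) of a ∆Q represented as an improper random variable: a tabulated delay distribution whose probabilities sum to less than one, the missing mass being the discrete probability that the outcome never occurs.

```python
# Sketch: ∆Q as an improper random variable, i.e. a (delay, probability) tabulation
# whose probabilities may sum to less than 1; the missing mass is loss/failure.
# Class name and figures are illustrative only.

from dataclasses import dataclass

@dataclass
class DeltaQ:
    delays_ms: list      # sorted delay values in milliseconds
    probs: list          # probability of each delay; the sum may be < 1

    @property
    def failure_mass(self):
        """Discrete probability that the outcome never occurs."""
        return 1.0 - sum(self.probs)

    def cdf(self, t_ms):
        """P(outcome delivered within t_ms); never reaches 1 when there is loss."""
        return sum(p for d, p in zip(self.delays_ms, self.probs) if d <= t_ms)

# Example: 98% of outcomes arrive between 95 ms and 145 ms; 2% never arrive.
dq = DeltaQ(delays_ms=[95, 110, 125, 145], probs=[0.30, 0.40, 0.20, 0.08])
print("P(within 120 ms) =", dq.cdf(120))                  # continuous (delay) part
print("P(never)         =", round(dq.failure_mass, 3))    # discrete (failure) part
```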
  9. Validating performance requirements:
     • Decompose the performance requirement following the system structure, using engineering judgement/best practice/cosmic constraints; this creates initial subsystem requirements.
     • Validate the decomposition by re-combining via the behaviour:
       • formally and automatically checkable; can be part of continuous integration;
       • captures interactions and couplings.
     • Necessary and sufficient: IF all subsystems function correctly and integrate properly AND all subsystems satisfy their performance requirements, THEN the overall system will meet its performance requirement.
     • Apply this hierarchically until either the behaviour is trivially provable OR there is a complete set of testable subsystem verification/acceptance criteria.
  10. Quantitative worked example
  11. Generic RPC. [Diagram: front end, network and back end, with the steps 0 Idle, 1 Construct transaction, 2 Send request, 3 Receive request, 4 Process request, 5 Commit results, 6 Transmit response, 7 Receive response, 8 Complete transaction.]
  12. Generic RPC - observables. [Diagram: the same steps with observation points A to F marked, and the impairments ∆Q(A⇝B), ∆Q(B⇝C), ∆Q(C⇝D), ∆Q(D⇝E) and ∆Q(E⇝F) between them.]
  13. Generic RPC – abstract performance. [Diagram: the back end is abstracted away; the front end and network remain, with the back end's contribution summarised as ∆Q(C⇝D).]
  14. Generic RPC – abstract performance. [Diagram: the network and back end are abstracted away; the front end remains, with their combined contribution summarised as ∆Q(B⇝E).]
  15. Generic RPC – abstract performance. [Diagram: the whole transaction, from leaving the Idle state to returning to it, summarised as a single ∆Q(A⇝F).]
  16. Performance requirement:
     • The user performance requirement is a bound on ∆Q(A⇝F), e.g. 50% within 500ms, 95% within 750ms, etc.
     • Decompose this as ∆Q(A⇝F) = ∆Q(A⇝B) ⊕ ∆Q(B⇝E) ⊕ ∆Q(E⇝F).
     • How these terms combine depends on the behaviour; the interaction of performance and behaviour is examined next.
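For steps performed in sequence and assumed independent, the ⊕ composition amounts to convolving the delay distributions and multiplying the success probabilities. The sketch below, using a 1 ms grid and illustrative figures of my own choosing, shows how a composed ∆Q(A⇝F) could be checked automatically (e.g. in continuous integration) against percentile bounds like those above.

```python
# Sketch of ⊕ as sequential composition: for independent steps, delays convolve and
# success probabilities multiply. Discretised on a 1 ms grid; figures are illustrative.

import numpy as np

GRID = 1000  # model delays from 0 to 999 ms

def uniform_dq(lo_ms, hi_ms, success=1.0):
    """Delay spread uniformly over [lo_ms, hi_ms], delivered with probability `success`."""
    pmf = np.zeros(GRID)
    pmf[lo_ms:hi_ms + 1] = success / (hi_ms - lo_ms + 1)
    return pmf

def seq(dq_a, dq_b):
    """∆Q(a) ⊕ ∆Q(b) for steps in sequence: convolution of the (improper) delay pmfs."""
    return np.convolve(dq_a, dq_b)[:GRID]

def meets(dq, bounds):
    """Check percentile bounds given as {delay_ms: required_fraction}, e.g. {500: 0.50}."""
    cdf = np.cumsum(dq)
    return all(cdf[min(t, GRID - 1)] >= frac for t, frac in bounds.items())

# Illustrative decomposition of ∆Q(A⇝F) = ∆Q(A⇝B) ⊕ ∆Q(B⇝E) ⊕ ∆Q(E⇝F)
dq_ab = uniform_dq(1, 5)                      # front-end request construction
dq_be = uniform_dq(215, 310, success=0.97)    # network + server, with some loss
dq_ef = uniform_dq(1, 5)                      # front-end completion
dq_af = seq(seq(dq_ab, dq_be), dq_ef)

print("meets 50%/500ms and 95%/750ms:", meets(dq_af, {500: 0.50, 750: 0.95}))
```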
  17. Front-end behaviour (1):
     • Since the request or its acknowledgement may be lost, the process retries after a 333ms timeout (failure mitigation).
     • After three attempts the transaction is deemed to have failed (failure propagation).
     • The loop can be unrolled for a clearer picture. [Diagram: front-end behaviour between observation points A, B, E and F.]
  18. Front-end behaviour (2). [Diagram: the retry loop unrolled.]
  19. Performance analysis (1):
     • There is a non-preemptive first-to-finish synchronisation between receiving the response and the timeout.
     • Which execution path is taken is a probabilistic choice depending on ∆Q(B⇝E); the ∆Qs of the different paths are combined with these probabilities.
     • Assumptions, for an initial customer deployment over UK broadband with the server on the US East Coast:
       • ∆Q(B⇝C) and ∆Q(D⇝E) are the same: minimum delay 95ms, 50ms variability, 1-2% loss;
       • ∆Q(C⇝D) distributed between 25ms and 60ms.
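A minimal Monte Carlo sketch of this behaviour under the stated assumptions: 95ms minimum one-way delay with up to 50ms variability, loss taken as 1.5% per direction (within the 1-2% range above), server time between 25ms and 60ms, a 333ms timeout and at most three attempts. The uniform shapes of the variability and of the server time are my own simplification, and a late response that loses the race against the timeout is simply discarded.

```python
# Monte Carlo sketch of the front-end retry loop: first-to-finish between the response
# and a 333 ms timeout, at most three attempts. Figures follow the slide's assumptions;
# uniform distributions are a simplification.

import random

ONE_WAY_MIN, ONE_WAY_VAR, LOSS = 0.095, 0.050, 0.015   # seconds, seconds, per direction
SERVER_LO, SERVER_HI = 0.025, 0.060                     # server processing time (s)
TIMEOUT, ATTEMPTS = 0.333, 3

def one_way():
    """One network traversal: (delay in seconds, delivered?)."""
    return ONE_WAY_MIN + random.uniform(0.0, ONE_WAY_VAR), random.random() >= LOSS

def attempt():
    """One request/response attempt: (round-trip delay, response arrived?)."""
    d_out, ok_out = one_way()
    server = random.uniform(SERVER_LO, SERVER_HI)
    d_back, ok_back = one_way()
    return d_out + server + d_back, ok_out and ok_back

def transaction():
    """∆Q(B⇝E) as seen through the retry behaviour: (total delay, succeeded?)."""
    elapsed = 0.0
    for _ in range(ATTEMPTS):
        delay, ok = attempt()
        if ok and delay <= TIMEOUT:      # the response wins the first-to-finish race
            return elapsed + delay, True
        elapsed += TIMEOUT               # the timeout wins; retry
    return elapsed, False                # failure propagated after three attempts

if __name__ == "__main__":
    runs = [transaction() for _ in range(100_000)]
    delays = sorted(d for d, ok in runs if ok)
    failure = 1.0 - len(delays) / len(runs)
    print(f"failure probability ~ {failure:.4%}")
    print(f"median ~ {delays[len(delays) // 2] * 1000:.0f} ms, "
          f"95th percentile ~ {delays[int(0.95 * len(delays))] * 1000:.0f} ms")
```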
  20. Performance analysis (2). [Chart: ∆Q(C⇝D), ∆Q(B⇝E) and ∆Q(A⇝F) plotted against the requirement.]
  21. Verification:
     • So far, the verification task seems straightforward: provided the network and server performance are within the given bounds, the front end will deliver the required end-user experience.
     • However, this seems to have plenty of slack: can we validate a different set of requirements? How far can we relax them?
     • With this model we can explore varying the network performance and varying the server performance (e.g. as a result of load).
  22. Performance analysis (3). [Chart: ∆Q(C⇝D), ∆Q(B⇝E) and ∆Q(A⇝F) plotted against the requirement.]
  23. Server resource consumption:
     • A virtualised server has a limit on the rate of I/O operations, so that the underlying infrastructure can be reasonably shared; such parameters affect the OPEX.
     • Typically access is controlled via a token-bucket shaper, which allows small bursts while limiting the long-term average use. This constitutes a QTA, and both sides of it need to be verified.
     • The delivered performance of the virtualised I/O thus depends on recent usage history.
     • Load is a function of the number of front ends accessing the service and the proportion of retries; retries depend on ∆Q(B⇝E), which in turn depends on the performance of the I/O system. This has been modelled here via a probability of taking a 'slow path' in the server.
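A minimal token-bucket sketch (the rate and burst figures are illustrative, not any particular provider's limits) showing how small bursts are admitted while the long-term average I/O rate is capped; an operation that exceeds the envelope is the kind that would be queued or throttled, i.e. take the 'slow path' in the model above.

```python
# Sketch of a token-bucket shaper on I/O operations: a burst allowance plus a
# long-term average rate cap. Rate and burst figures are illustrative only.

class TokenBucket:
    def __init__(self, rate_per_s, burst):
        self.rate = rate_per_s      # long-term average I/O operations per second
        self.burst = burst          # bucket depth: the burst allowance
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        """Return True if an I/O operation at time `now` (seconds) is within the envelope."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False                # over the envelope: queue, throttle or degrade

bucket = TokenBucket(rate_per_s=100, burst=20)
# A burst of 30 back-to-back operations at t=0: the first 20 pass, the rest are shaped.
print(sum(bucket.allow(0.0) for _ in range(30)), "of 30 immediate operations allowed")
```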
  24. Performance analysis (4). [Chart: ∆Q(C⇝D), ∆Q(B⇝E) and ∆Q(A⇝F) plotted against the requirement.]
  25. [Diagram: the map of concerns from slide 6, annotated with: impact of retry behaviour on delivered ∆Q; impact of retry behaviour on server load; effect of server loading on delivered ∆Q; server I/O constraint triggered by load; coupling of retries, load and response delay.]
  26. Relation to challenges in ICT, e.g. virtualisation
  27. Virtualisation creates:
     • new digital supply-chain hazards;
     • loss of mitigation opportunities, since aspects of performance are no longer under direct control;
     • additional ∆Q, as seen in the example; this will be 'traded' to extract value, but who benefits: the user or the infrastructure provider?
     Now need to consider:
     • ongoing resource costs: OPEX rather than (sunk) CAPEX;
     • opportunity costs: Amazon Web Services' pricing of compute instances, SQS and Lambda micro-services shows engagement with opportunity-cost issues.
  28. General ICT application.
     Need quantitative intent:
     • at the very least, some indication of what is really not acceptable;
     • comparison with something else (that is measurable) will do, but not 'the best possible' or 'whatever will sell'.
     Can approximate the best possible ∆Q:
     • use it to quickly check the degree of feasibility: add up the minimum times (means and variances also add);
     • this puts bounds on future improvement and gives early visibility of performance hazards, enabling early mitigation.
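A sketch of that quick feasibility check: per-step minimum delays add directly and, assuming the steps are independent, so do their means and variances, which bounds the best possible ∆Q before any detailed design is done. The step figures below are illustrative only.

```python
# Quick feasibility check: minimum delays add, and (for independent steps) so do
# means and variances. Step figures are illustrative, not from the slides.

from math import sqrt

# (name, minimum ms, mean ms, variance ms^2) for each step on the critical path
steps = [
    ("construct request",   1,   2,   1),
    ("network outbound",   95, 120, 200),
    ("server processing",  25,  40, 100),
    ("network return",     95, 120, 200),
    ("complete",            1,   2,   1),
]

best_case = sum(s[1] for s in steps)
mean_total = sum(s[2] for s in steps)
std_dev = sqrt(sum(s[3] for s in steps))

print(f"best possible completion: {best_case} ms")
print(f"expected completion:      {mean_total} ms +/- {std_dev:.1f} ms (1 sigma)")
# If the requirement (e.g. 50% within 500 ms) is already close to the best case,
# that is an early warning of a performance hazard.
```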
  29. Summary:
     • System performance validation consists of analysing the interaction of system behaviour and subsystem performance requirements, and showing that this 'adds up' to meet the quantified user requirements (QTA).
     • System performance verification consists of showing that subsystems meet their QTAs, by analysis and/or measurement of the ∆Q of the subsystems' observable behaviour.
     • This provides acceptance and/or contractual criteria for third-party subsystems or services, and substantially reduces performance integration risks and hence re-work.
  30. What now?
  31. The questions. Given the right inputs, this sort of analysis takes relatively little time and effort; however:
     • Quantitative intent is fundamental: where can this be found?
     • Which parts of the system decomposition are forced, e.g. by use of existing infrastructure such as a network?
     • How to handle subsystems with undocumented characteristics? This can be mitigated with in-life measurement of ∆Q driven by the validation (forewarned is forearmed), using inexpensive, focused data rather than a wide-angle 'big data' approach.
  32. Thank you! www.pnsol.com
