Performance Oriented Design


Performance Oriented Design, presented at QCon São Paulo 2011 by Rodrigo Campos

Published in: Technology


  1. Performance Oriented Design - QCon São Paulo 2011 - Rodrigo Albani de Campos - @xinu
  2. Agenda
      • Performance & Design
      • Why should I care?
      • What should I measure?
      • References
  3. What is performance? The capabilities of a machine or product, esp. when observed under particular conditions: the hardware is put through tests which assess the performance of the processor.
  4. What is design? His design of reaching the top: intention, aim, purpose, plan, intent, objective, object, goal, end, target; hope, desire, wish, dream, aspiration, ambition.
  5. McLaren MP4 12c GT3
  6. Underlying Operating Systems: Read IOPS, IO Wait, Page Faults, Run Queue, Disk Usage, # users, User CPU, System CPU, Resident Size, Write IOPS, Network Traffic, Memory Usage, Page out, Interrupts, Page in, Packet Loss, Network Collision, # processes, Buffers, Kernel Tables
  7. What about code?
  8. Apr 25, 2011 5:44:02 PM fatalError
      SEVERE: javax.xml.transform.TransformerException: java.lang.NullPointerException: Parameter alpha must not be null
      Apr 25, 2011 5:44:02 PM org.apache.fop.cli.Main startFOP
      SEVERE: Exception
      javax.xml.transform.TransformerException: java.lang.NullPointerException: Parameter alpha must not be null
        at org.apache.fop.cli.InputHandler.transformTo(
        at org.apache.fop.cli.InputHandler.renderTo(
        at org.apache.fop.cli.Main.startFOP(
        at org.apache.fop.cli.Main.main(
      by: javax.xml.transform.TransformerException: java.lang.NullPointerException: Parameter alpha must not be null
        at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(
        at org.apache.xalan.templates.ElemLiteralResult.execute(
        at org.apache.xalan.templates.ElemApplyTemplates.transformSelectedNodes(
        at org.apache.xalan.templates.ElemApplyTemplates.execute(
  9. We’ve been riding space shuttles blindfolded and handcuffed
  10. Why should I care?
  11. Why should I care? “Capacity planning is not just about the future anymore. Today, there is a serious need to squeeze more out of your current capital equipment.” The Guerrilla Manual Online
  12. Why should I care? “Our systems are very simple, there’s no need for such performance metrics”
  13. It goes like this... The Internet, Web Server, Application Server, Database
  14. It goes like this... The Internet, Web Server, Application Server, Database
  15. It goes like this... The Internet, Web Server, Application Server, Database
  16. It goes like this... The Internet, Web Server, Application Server, Slaves RO, Master RW
  17. It goes like this... The Internet, Web Server, Application Server, Master RW, Slaves RO
  18. It goes like this... The Internet, Web Server, Application Server, Caches, Master RW, Slaves RO, Evil Machines Corporation
  19. It goes like this... The Internet, Web Server, Application Server, Caches, Master RW, Slaves RO, Evil Machines Corporation. CPUs will be idle, disks will be sleeping, network will be underused... and your users will be complaining...
  20. Why should I care? “But we are using the Cloud!”
  21. Why should I care?
      • So now you’re in a utility computing model
      • You’re charged per usage
  22. Why should I care? “Updating performance counters will make my code run slower”
  23. Why should I care?
      • Datacenter average CPU utilization is around 15%
      • If updating performance counters is a problem then you really need them
      • Those microseconds will save you hours of troubleshooting!
  24. Why should I care? “These are non-functional requirements”
  25. Why should I care?
      Server delay  Distinct Queries/User  Query Refinement  Revenue/User  Any Clicks  Satisfaction  Time to Click (increase in ms)
      50ms          0                      0                 0             0           0             0
      200ms         0                      0                 0             -0.30%      -0.40%        500
      500ms         0                      -0.60%            -1.20%        -1.00%      -0.90%        1200
      1000ms        -0.70%                 -0.90%            -2.80%        -1.90%      -1.60%        1900
      2000ms        -1.80%                 -2.10%            -4.30%        -4.40%      -3.80%        3100
      Source: The User and Business Impact of Server Delays, Additional Bytes, and HTTP Chunking in Web Search - Eric Schurman (Amazon), Jake Brutlag (Google)
  26. Why should I care? “Fast isn’t a feature, fast is a requirement” Jesse Robbins - Opscode
  27. What should I measure?
  28. Queues: the not so typical performance metrics. Agner Krarup Erlang:
      • Invented the fields of traffic engineering and queuing theory
      • 1909 - Published “The Theory of Probabilities and Telephone Conversations”
      • 1917 - Published “Solution of some Problems in the Theory of Probabilities of Significance in Automatic Telephone Exchanges”
  29. Queues: the not so typical performance metrics
      • 1961 - CTSS was first demonstrated at MIT
      • 1965 - Allan Scherr used the machine repairman problem to model a time-shared system as part of Project MAC
      • Another offspring of Project MAC is Multics
  30. Queues: the not so typical performance metrics
      • IBM System/370 model 158-3 - 1.0 MIPS @ 1.0 MHz - 1972
        • Average purchase price: $771,000*
        • No disks or peripherals included
        • $4,082,039 by 2011
      • Intel Core i7 Extreme Edition 990x, released in 2011, peaks at 159,000 MIPS @ 3.46 GHz
      * Source:
  31. Queues: the not so typical performance metrics. Computer System: Disks, CPU
  32. Queues: the not so typical performance metrics. Open/Closed Network, with arrivals (A) at rate λ, queue W, server S, residence R, and completions (C) at throughput X
      A   Arrival count
      λ   Arrival rate (A/T)
      W   Time spent in queue
      R   Residence time (W+S)
      S   Service time
      X   System throughput (C/T)
      C   Completed tasks count
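The symbols on this slide translate directly into a few lines of arithmetic. The following is a minimal sketch, assuming T is the length of the observation window in seconds (the slide uses T without defining it). The `Observation` class, the function names, and the sample numbers are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Raw measurements from one observation window (hypothetical container)."""
    period_s: float        # T: length of the window, in seconds (assumed unit)
    arrivals: int          # A: requests that arrived during the window
    completions: int       # C: requests that completed during the window
    queue_time_s: float    # W: average time a request spent waiting in the queue
    service_time_s: float  # S: average time a request spent being served


def arrival_rate(obs: Observation) -> float:
    """lambda = A / T"""
    return obs.arrivals / obs.period_s


def throughput(obs: Observation) -> float:
    """X = C / T"""
    return obs.completions / obs.period_s


def residence_time(obs: Observation) -> float:
    """R = W + S"""
    return obs.queue_time_s + obs.service_time_s


if __name__ == "__main__":
    # Made-up sample values, not taken from the talk.
    obs = Observation(period_s=60.0, arrivals=1200, completions=1188,
                      queue_time_s=0.004, service_time_s=0.016)
    print(f"lambda = {arrival_rate(obs):.2f} req/s")
    print(f"X      = {throughput(obs):.2f} req/s")
    print(f"R      = {residence_time(obs) * 1000:.1f} ms")
```

Keeping A, C, W and S as raw measurements and deriving λ, X and R on demand means the same counters can be re-read over any window length.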
  33. Arrival Rate (λ)
      • Pretty straightforward
      • Requests per second/hour/day
      • Not the same as throughput (X)
        • Although in a steady state:
          • A = C as T → ∞
          • λ = X
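One way to see why λ and X coincide in steady state, assuming the system starts empty (so that A − C is the number of requests still inside it at time T) and that this backlog stays bounded:

```latex
\[
\lim_{T \to \infty} \left( \lambda - X \right)
  = \lim_{T \to \infty} \left( \frac{A}{T} - \frac{C}{T} \right)
  = \lim_{T \to \infty} \frac{A - C}{T}
  = 0
\]
```

If the backlog instead grows in proportion to T, arrivals are outpacing completions and λ stays above X.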
  34. Service Time (S)
      • Time spent in processing
        • Web server response time
        • Total query time
        • IO operation time length
  35. What to look for?
      • Stretch factor
      • Method Count
      • Method Service Time
      • Geolocation
      • Inbound & Outbound Traffic
      • Round Trip Delays
  36. What should I measure? Average Hits/s = 65.142; Average Svc time = 0.0159
  37. What should I measure?
      • A simple tag collection data store
      • For each data operation:
        • A 64 bit counter for the number of calls
        • An average counter for the service time
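The two counters per data operation described above could be kept with something like the sketch below. This is an assumption about how such instrumentation might look, not the speaker's implementation: `OpStats`, `STATS`, `measured`, and the body of `fetchDatum` are hypothetical names and stand-ins, and the code is not thread-safe as written.

```python
import time
from collections import defaultdict
from functools import wraps


class OpStats:
    """Per-operation counters: number of calls and average service time."""

    def __init__(self):
        # Python integers do not overflow; in a language with fixed-width
        # integers this would be the 64 bit counter from the slide.
        self.calls = 0
        self.avg_service_ms = 0.0

    def record(self, elapsed_ms):
        """Count one call and fold its duration into the running average."""
        self.calls += 1
        # Incremental mean: no need to keep every sample around.
        self.avg_service_ms += (elapsed_ms - self.avg_service_ms) / self.calls


# One OpStats per operation name, created on first use.
STATS = defaultdict(OpStats)


def measured(func):
    """Decorator that times each call and updates the counters for func."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            STATS[func.__name__].record(elapsed_ms)
    return wrapper


@measured
def fetchDatum(key):
    """Stand-in for a real data store read (hypothetical body)."""
    return {"key": key}


if __name__ == "__main__":
    for k in ("a", "b", "c"):
        fetchDatum(k)
    for name, s in STATS.items():
        print(f"{name}: calls={s.calls} avg={s.avg_service_ms:.4f} ms")
```

The per-call overhead is two clock reads and a handful of arithmetic operations, which is the "microseconds" trade-off argued for earlier in the deck.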
  38. What should I measure?
      Method            Call Count   Service Time (ms)
      dbConnect              1,876                11.2
      fetchDatum        19,987,182                12.4
      postDatum          1,285,765                98.4
      deleteDatum          312,873                31.1
      fetchKeys         27,334,983               278.3
      fetchCollection   34,873,194               211.9
      createCollection     118,853               219.4
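One way to read the table above is to multiply each method's call count by its average service time, which gives the total time spent in that method and shows where optimization effort would pay off most. A minimal sketch, with the figures from the table written as plain Python numbers; the `TABLE` and `rank_by_total_time` names are hypothetical.

```python
# (method, call count, average service time in ms), copied from the table.
TABLE = [
    ("dbConnect", 1_876, 11.2),
    ("fetchDatum", 19_987_182, 12.4),
    ("postDatum", 1_285_765, 98.4),
    ("deleteDatum", 312_873, 31.1),
    ("fetchKeys", 27_334_983, 278.3),
    ("fetchCollection", 34_873_194, 211.9),
    ("createCollection", 118_853, 219.4),
]


def rank_by_total_time(rows):
    """Return (method, total_ms) pairs, largest total time first."""
    totals = [(name, calls * avg_ms) for name, calls, avg_ms in rows]
    return sorted(totals, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, total_ms in rank_by_total_time(TABLE):
        hours = total_ms / 3_600_000  # 3.6 million ms per hour
        print(f"{name:<18}{hours:>10.1f} h")
```

With these figures fetchKeys and fetchCollection dominate, while createCollection, despite a similar per-call service time, contributes little because it is called far less often.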
  39. What should I measure? [Chart: Call Count x Service Time. Axes: Call Count, Service Time (ms). Points: fetchKeys, createCollection, fetchCollection, deleteDatum, postDatum, dbConnect, fetchDatum]
  40. References: Guerrilla Capacity Planning: A Tactical Approach to Planning for Highly Scalable Applications and Services. Neil J. Gunther
  41. References: Analyzing Computer Systems Performance: With Perl::PDQ. Neil J. Gunther
  42. References: Performance by Design: Computer Capacity Planning By Example. Daniel A. Menasce et al.
  43. References: The Art of Capacity Planning: Scaling Web Resources. John Allspaw
  44. References: Capacity Planning for Web Performance: Metrics, Models, and Methods. Daniel Menasce & Virgilio Almeida
  45. References: Measure IT: Conference Proceedings: Brazilian Chapter:
  46. Last but not least... “Measure what is measurable, and make measurable what is not so.” Galileo Galilei