Performance Instrumentation Beyond What You Do Now

    Performance Instrumentation Beyond What You Do Now - Presentation Transcript

    1. Performance Instrumentation beyond what you do now Cary Millsap cary.millsap@method-r.com Percona Performance Conference Santa Clara, California 9:00a–9:55a Thursday 23 April 2009 1
    2. Introductions 2
    3. Cary Millsap carymillsap.blogspot.com cary_millsap 3
    5. Timeline: 1986, 1989, 1999, 2008. Software Developer and Performance Analyst 4
    7. Method R Corporation http://method-r.com 6
    8. What we do at Method R Corporation… • Write code for you • Troubleshoot performance problems • Teach you how to do what we do • Write software tools that make your work easier 7
    9. Thinking clearly about performance 8
    10. Performance is HARD 9
    11. “Our users say that everything is slow, but I don’t know where to begin.” 10
    12. “Our users are complaining, but all our dials are green.” 11
    13. A story. 12
    14. In the beginning... (1989: Oracle 6.0.26) 13
    15. “Tuning” was… 14
    16. bstat.sql ... estat.sql report.txt 15
    18. Oracle views: V$PARAMETER, V$DB_OBJECT_CACHE, V$OPEN_CURSOR, V$SESSTAT, V$FIXED_VIEW_DEFINITION, V$LATCH, V$TRANSACTION, V$PROCESS, V$FILESTAT, V$LOCK, V$SQL, V$SESSION, V$SYSSTAT, V$SQLTEXT, V$SESS_IO, V$LIBRARYCACHE, V$ROLLSTAT, V$ROWCACHE, V$WAITSTAT, V$TIMER. OS tools: sar, ps, iostat, netstat, nfsstat, vmstat, pstat. 16
    19. People looked for “bad numbers.” 17
    20. Inefficiencies. 18
    21. But how can you know what causes a specific task to be slow? 19
    30. It's latches; It's I/O; It's always I/O; It's bad SQL; It's always bad SQL; There's not enough memory; There's never enough memory 21
    31. My problem… 22
    32. How can you possibly know that? 23
    33. Reminded me of… 24
    34. 25 vailroger.googlepages.com/orionconstellation
    35. You do see it... Right? 26
    37. 27 vailroger.googlepages.com/orionconstellation
    38. But who says that is what you have to see? 28
    41. Why not? 30
    42. Performance is hard. 31
    43. A good pilot makes it look easy. —Van R. Millsap 1936–2004 32
    44. Performance is EASY 33
    45. How? 34
    46. It’s the user’s experience that matters. 35
    48. A user’s performance experience consists of two elements… 37
    49. 1. a task 2. time 38
    50. Task 39
    51. The things we used to “computerize”… tasks. http://olathe.lib.ks.us/images/Image/Computer%20User.jpg 40
    52. A task is a business unit of work. • Post to the General Ledger • Enter an order • Look up a book by author 41
    58. Tasks can nest. • Print Addresses is a task • Print Address #42 is a (sub)task • Often, a program is a task • Often, a tiny part of a program is a task (diagram labels: Posting; PO, AP, AR, …, FA) 42
    59. Tasks are it. Business people don’t care about the “system” except through execution of the tasks that make up their business. 43
    60. Tasks are it. Tasks are what system owners care about. 44
    61. Time 45
    62. Performance is about time. 46
    63. How fast: “Daddy, can your car go 500 miles?” He meant “500 miles per hour.” To talk about performance (speed), you have to talk about time. 47
    64. Two ways to measure performance… 48
    70. tasks per time (that’s throughput); time per task (that’s response time) 49
    74. Throughput and response time… • Throughput (X) – The tasks-per-time way – Number of task executions completed in a given duration • “orders/second” • Response time (R) – The time-per-task way – Elapsed duration of an execution of a given task • “seconds/order” 50
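
The two definitions above can be sketched in a few lines of Python. This is only an illustration: the `Execution` record, the function names, and the sample timings are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass
class Execution:
    """One completed execution of a task (hypothetical record layout)."""
    task: str
    begin: float  # seconds on some common clock
    end: float    # seconds on the same clock


def response_times(executions):
    """R per execution: elapsed seconds for each execution of the task."""
    return [e.end - e.begin for e in executions]


def throughput(executions, window_begin, window_end):
    """X: executions completed per second inside the observation window."""
    completed = sum(1 for e in executions if window_begin <= e.end <= window_end)
    return completed / (window_end - window_begin)


# Made-up sample: four "enter order" executions observed over four seconds.
orders = [
    Execution("enter order", 0.0, 0.5),
    Execution("enter order", 0.25, 1.0),
    Execution("enter order", 1.5, 1.75),
    Execution("enter order", 3.0, 4.0),
]

R = response_times(orders)          # seconds/order, one value per execution
X = throughput(orders, 0.0, 4.0)    # orders/second over the window
mean_R = sum(R) / len(R)
print("R per execution:", R)
print("mean R:", mean_R, "sec/order")
print("X:", X, "orders/sec")
```

In this made-up sample the executions overlap and the system is sometimes idle, so the measured X (1 order/sec) is not simply the inverse of the mean R (0.625 sec/order).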
    78. X = 1/R (kind of) 51
    81. Average throughput is the inverse of average response time. X = 1,000 txn/sec? Then R = (1 sec)/(1,000 txn) = .001 sec/txn But… 52
    83. …Adding load to create higher throughput changes response time. 53
    84. …Which leads to a whole ’nother conversation I’d love to have with you some other time. 54
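
The slides leave the load relationship for another conversation. As one hedged illustration that is not taken from the talk, the textbook single-server queueing model (M/M/1, assuming random arrivals and random service times) gives R = S / (1 − X·S), where S is the service time per task. The function name and the numbers below are assumptions made for the sketch.

```python
def mm1_response_time(service_time, arrival_rate):
    """Mean response time under the M/M/1 model (an assumed model, not the talk's).

    service_time: seconds of service one task needs with no queueing (S)
    arrival_rate: tasks arriving per second, equal to throughput when stable (X)
    """
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        raise ValueError("arrival rate exceeds capacity; the queue grows without bound")
    return service_time / (1.0 - utilization)


# With S = 0.001 sec/txn, response time climbs as load approaches 1,000 txn/sec.
for x in (100, 500, 900, 990):
    print(x, "txn/sec ->", mm1_response_time(0.001, x), "sec/txn")
```

Under these assumptions R equals S only when the system is nearly idle, which is one way to read the point that adding load changes response time.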
    85. Sequence Diagram 55
    86. A simple way to view response time is with a UML sequence diagram. RA http://www.websequencediagrams.com 56
    87. More complicated systems have nested levels of suppliers and consumers. RA RB http://www.websequencediagrams.com 57
    88. The tiers represent the way your system is constructed. RUser http://www.websequencediagrams.com 58
    89. This sequence diagram shows the complicated interactions among consumers and suppliers. RUser http://www.websequencediagrams.com 59
    90. The sequence diagram is a good conceptual tool. 60
    91. But when you need to analyze thousands of calls, you need something else. 61
    92. Profile 62
    93. A profile is a complete account of a task’s response time. 63

        Response time (seconds)   # Calls   R/call (seconds)   Call name
        -----------------------   -------   ----------------   -------------------------------
           0.769      50.3%         5,003       0.000154       unaccounted-for between dbcalls
           0.393      25.7%         5,010       0.000078       SQL*Net message from client
           0.381      24.9%         5,013       0.000076       CPU service, execute calls
           0.090       5.9%            11       0.008194       CPU service, prepare calls
           0.027       1.8%             1       0.027396       log file sync
           0.008       0.5%         5,010       0.000002       SQL*Net message to client
           0.000       0.0%             9       0.000000       CPU service, fetch calls
          -0.138      -9.1%         5,031      -0.000028       unaccounted-for within dbcalls
        -----------------------
           1.530     100.0%                                    Total
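
A flat profile like the one above can be built from call-by-call durations. The sketch below is a minimal illustration, not the tool that produced the slide: the `flat_profile` function, the call names, and the sample durations are hypothetical.

```python
from collections import defaultdict


def flat_profile(calls):
    """Aggregate (call_name, duration_seconds) pairs into profile rows.

    Returns (rows, total), where each row is
    (seconds, percent_of_total, call_count, seconds_per_call, call_name)
    and rows are sorted by descending contribution to response time.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for name, duration in calls:
        totals[name] += duration
        counts[name] += 1
    total = sum(totals.values())
    rows = []
    for name, seconds in totals.items():
        percent = 100.0 * seconds / total if total else 0.0
        rows.append((seconds, percent, counts[name], seconds / counts[name], name))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows, total


# Made-up call-by-call data for one task execution.
sample_calls = [
    ("execute", 0.25),
    ("fetch", 0.75),
    ("execute", 0.25),
    ("log sync", 0.125),
]

rows, total = flat_profile(sample_calls)
for seconds, percent, count, per_call, name in rows:
    print(f"{seconds:8.3f} {percent:6.1f}% {count:6d} {per_call:10.6f}  {name}")
print(f"{total:8.3f} {100.0:6.1f}%  Total")
```

Because every row is a share of the same total, the percentages account for the whole response time, which is what lets the largest contributor float to the top.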
    94. You’ve done this before, if you’ve ever used… • gcc -pg …; gprof … • java -prof …; java ProfilerViewer … • perl -d:DProf …; dprofpp … • dbms_monitor.session_trace_enable(…); p5prof … 64
    95. Profile • Full account of response time – Spanning (sum ≮ R) – Non-overlapping (sum ≯ R) • Sorted by descending R • Useful dimension – Flat profile – Call graph • Contributions as %R • Duration per call – Mean, minimum, maximum, … – Skew • Drill-down – Individual call level of detail – Maybe even deeper 65
    96. Response Time 66
    97. To optimize throughput, you must analyze response time. 67
    103. (Proof) You cannot optimize X for a task that’s inefficient. You cannot measure a task’s efficiency without measuring its R. Therefore, to optimize X, you must first analyze R. 68
    104. The universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail. —Donald Knuth 69
    105. (Programmers aren’t very good at guessing where their code spends time.) 70
    106. To optimize performance (throughput or response time), people need profiles. 71
    107. Performance is EASY 72
    108. Performance is easy if you can stop guessing where your code is slow. 73
    109. When you have profiles for task response times, performance cannot hide problems from you. 74
    110. Some surprising things I’ve learned by measuring R… 75
    111. Disk I/O is often less important than people think. http://carymillsap.blogspot.com/2009/04/cary-on-joel-on-ssd.html 76
    117. Common performance problems: CPU, network I/O, software serialization 77
    118. The point… 78
    119. Your problems have nothing to do with experiences I’ve had. So measure. 79
    120. Finding what you need to see 80
    121. How are you supposed to create these profiles? 81
    122. You have to insist on seeing where time goes for any task you think is important. 82
    123. To drill down, you need call-by-call data. (NOT data about aggregations of calls.) 83
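
One way application code could record call-by-call data for a task is sketched below. This is an assumption-laden illustration rather than anything shown in the talk: the `CallTrace` class and the call names are hypothetical, and a real system would write the records to a trace file instead of keeping them in memory.

```python
import time
from contextlib import contextmanager


class CallTrace:
    """Records one row per call for a named task (hypothetical helper)."""

    def __init__(self, task):
        self.task = task
        self.calls = []      # (sequence, call_name, begin_seconds, end_seconds)
        self._sequence = 0

    @contextmanager
    def call(self, name):
        """Time one call; the row is stored even if the call raises."""
        self._sequence += 1
        sequence = self._sequence
        begin = time.perf_counter()
        try:
            yield
        finally:
            end = time.perf_counter()
            self.calls.append((sequence, name, begin, end))


trace = CallTrace("enter order")
with trace.call("parse input"):
    sum(range(1000))          # stand-in for real work
with trace.call("db execute"):
    time.sleep(0.01)          # stand-in for a database round trip

for sequence, name, begin, end in trace.calls:
    print(trace.task, sequence, name, f"{end - begin:.6f} sec")
```

Each row keeps the task name, the order in which the call began, and its begin and end times, so individual calls stay visible instead of being averaged away.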
    124. In Oracle, we do it with a feature called extended SQL tracing. • For Developers: Making Friends with the Oracle Database for Fast, Scalable Applications – Cary Millsap http://method-r.com/downloads/doc_details/10-for-developers-making-friends-with-the-oracle-database-cary-millsap • Optimizing Oracle Performance – Cary Millsap with Jeff Holt 84
    125. The stuff you need… 85
    126. The stuff you need, by tier 86

        Feature (attribute)       Oracle             MySQL   App tier
        -----------------------   ----------------   -----   --------
        Task identification       y
        Call-by-call coverage     98%+
        DB call begin sequence    partly derivable
        DB call begin time        partly derivable
        DB call end time          y
        DB call context info      y
        OS call begin sequence    partly derivable
        OS call begin time        derivable
        OS call end time          y
        OS call context info      y
        Call SQL context          y
        Call CPU (sys mode)       -
        Call CPU (usr mode)       -
        Call CPU (total)          y
        SQL execution plans       y
    127. Recap 87
    128. Here’s what I hope you take away today… 88
    129. Performance is about time and tasks. 89
    130. If you’re interested in performance, then read Goldratt’s The Goal. 90
    134. Don’t guess; you’re probably wrong. Measure response time before you optimize anything. Insist on it. 91
    135. Performance is easy (and fun!) when code measures its own time and tasks. 92

    © All Rights Reserved
