OOW13 JB KP ASH Deep Dive


Joint session with JB from Oracle at OOW13/Oracle Open World 2013


Transcript

  • 1. Copyright © 2013, Oracle and/or its affiliates. All rights reserved.
  • 2. ASH Deep Dive: Advanced Performance Analysis Tips
      John Beresniewicz, Oracle America
      Kellyn Pot’vin, Enkitec
  • 3. Safe Harbor Statement
      The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
  • 4. Program Agenda
      • What is ASH?
      • How does ASH work?
      • How do we use ASH data?
      • Enterprise Manager: ASH Analytics
      • ASH in Action: Kellyn Pot’vin
  • 5. What is ASH?
  • 6. What is ASH?
      • Time-based sampling of foreground session state
        – Highly multi-dimensional view of database activity and therefore DB Time
      • Observations of specific values of the (DB Time/time) function
        – This function is called: Average Active Sessions
      An instrumentation mechanism that actualizes an important concept
  • 7. Important Properties of ASH
      • Samples represent “snapshots” of session activity at the “same time”
        – Not strictly true, since sampling uses a latchless mechanism
      • Sampling time is independent of session activity
        – Important, since otherwise sessions may be over- or under-sampled
  • 8. Active Session Sampling
      Time-based captures of state information for active sessions
      (Diagram: timelines for Session 1, Session 2 and Session 3, crossed by Sample_t1, Sample_t2 and Sample_t3)
      Time  Session  State    Wait Class   SQL_ID         Object
      t1    1        ON CPU   null         53qkkf6yzc2x0  null
      t1    2        WAITING  User I/O     0naxkcasaz162  EMP
      t1    3        WAITING  User I/O     cs4qrt8kr3uhx  EMP
      t2    3        WAITING  Application  4uh6zm2wg03mx  DEPT
  • 9. ASH is Highly Multi-dimensional
      Most of these represent useful investigative paths in some context
      desc v$active_session_history
      Name                           Null     Type
      ------------------------------ -------- ----------------
      SAMPLE_ID                               NUMBER
      SAMPLE_TIME                             TIMESTAMP(3)
      IS_AWR_SAMPLE                           VARCHAR2(1)
      SESSION_ID                              NUMBER
      SESSION_SERIAL#                         NUMBER
      SESSION_TYPE                            VARCHAR2(10)
      FLAGS                                   NUMBER
      USER_ID                                 NUMBER
      . . .
      93 rows selected
  • 10. SQL Dimensions
      SQL_ID                         VARCHAR2(13)
      IS_SQLID_CURRENT               VARCHAR2(1)
      SQL_CHILD_NUMBER               NUMBER
      SQL_OPCODE                     NUMBER
      SQL_OPNAME                     VARCHAR2(64)
      FORCE_MATCHING_SIGNATURE       NUMBER
      TOP_LEVEL_SQL_ID               VARCHAR2(13)
      TOP_LEVEL_SQL_OPCODE           NUMBER
      SQL_PLAN_HASH_VALUE            NUMBER
      SQL_PLAN_LINE_ID               NUMBER
      SQL_PLAN_OPERATION             VARCHAR2(30)
      SQL_PLAN_OPTIONS               VARCHAR2(30)
      SQL_EXEC_ID                    NUMBER
      SQL_EXEC_START                 DATE
      PLSQL_ENTRY_OBJECT_ID          NUMBER
      PLSQL_ENTRY_SUBPROGRAM_ID      NUMBER
      PLSQL_OBJECT_ID                NUMBER
      PLSQL_SUBPROGRAM_ID            NUMBER
      QC_INSTANCE_ID                 NUMBER
      QC_SESSION_ID                  NUMBER
      QC_SESSION_SERIAL#             NUMBER
  • 11. Wait Event Dimensions
      EVENT                          VARCHAR2(64)
      EVENT_ID                       NUMBER
      EVENT#                         NUMBER
      SEQ#                           NUMBER
      P1TEXT                         VARCHAR2(64)
      P1                             NUMBER
      P2TEXT                         VARCHAR2(64)
      P2                             NUMBER
      P3TEXT                         VARCHAR2(64)
      P3                             NUMBER
      WAIT_CLASS                     VARCHAR2(64)
      WAIT_CLASS_ID                  NUMBER
      WAIT_TIME                      NUMBER
      SESSION_STATE                  VARCHAR2(7)
      TIME_WAITED                    NUMBER
  • 12. Application Dimensions
      Instrumented applications can benefit greatly
      SERVICE_HASH                   NUMBER
      PROGRAM                        VARCHAR2(48)
      MODULE                         VARCHAR2(48)
      ACTION                         VARCHAR2(32)
      CLIENT_ID                      VARCHAR2(64)
      MACHINE                        VARCHAR2(64)
      PORT                           NUMBER
      ECID                           VARCHAR2(64)
      CONSUMER_GROUP_ID              NUMBER
      TOP_LEVEL_CALL#                NUMBER
      TOP_LEVEL_CALL_NAME            VARCHAR2(64)
      CONSUMER_GROUP_ID              NUMBER
      XID                            RAW(8)
      REMOTE_INSTANCE#               NUMBER
      TIME_MODEL                     NUMBER
  • 13. How does ASH work?
  • 14. ASH Key Architecture Concepts
      • In-memory ASH sampling:
        – Dedicated background process: MMNL
        – Circular SGA memory buffer: one writer; many readers
        – Lean and robust mechanism: no locking or latching
        – Default 1000ms (1 sec) sampling interval
      • ASH sub-sampling to disk:
        – Flush to AWR with snapshot or on emergency flush
        – Default: 1-in-10 of the 1-sec samples are persisted
        – Future: continuous sub-sampling
      Session activity sampled efficiently into memory and onto disk
  • 15. Reading / Writing in Opposite Directions
      • MMNL writes to ASH circular buffer one way
      • Readers of V$ASH start at current write pointer
      • Readers proceed in opposite direction of MMNL through buffer
      • Stop when current sample_id > last read sample_id
      • SELECT from V$ASH returned recent-last order
      (Diagram: circular buffer showing the MMNL write direction and the start and finish positions of a reader, SALLY)
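The reader logic on slide 15 can be sketched roughly as follows. This is a minimal illustration, not Oracle's implementation: the buffer is modeled as a Python list of sample_ids, and the names `read_backwards`, `ring` and `write_pos` are hypothetical. The sketch returns rows in the order the reader visits them and does not try to model how V$ASH orders its output.

```python
def read_backwards(ring, write_pos):
    """Walk a circular buffer backwards, starting just behind the write pointer.

    ring holds sample_ids (None marks a slot that was never written).
    The walk stops on an empty slot, after one full lap, or when it meets a
    sample_id larger than the last one read, which means the writer has
    already overwritten that slot with newer data.
    """
    n = len(ring)
    rows = []
    last = None
    pos = (write_pos - 1) % n
    for _ in range(n):
        sid = ring[pos]
        if sid is None or (last is not None and sid > last):
            break
        rows.append(sid)
        last = sid
        pos = (pos - 1) % n
    return rows


# Five slots; the writer has written ids 1..7, so 6 and 7 overwrote 1 and 2.
quiet = read_backwards([6, 7, 3, 4, 5], write_pos=2)

# Same read, but the writer stored id 8 in slot 2 before the reader got there.
overtaken = read_backwards([6, 7, 8, 4, 5], write_pos=2)
```

In the second call the reader drops out as soon as it sees id 8 after id 4, so it never returns a row the writer has lapped, and neither side needs a lock.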
  • 16. Sampling Pseudo-code (lean and mean, but there is a hole)
      1) FOR ALL SESSION STATE OBJECTS
      2)   IS SESSION CONNECTED?
             NO  => NEXT SESSION
             YES:
      3)   IS SESSION ACTIVE?
             NO  => NEXT SESSION
             YES:
      4)   MEMCPY SESSION STATE OBJ
      5)   CHECK CONSISTENCY OF COPY WITH LIVE SESSION
      6)   IS COPY CONSISTENT?
             YES: WRITE ASH ROW FROM COPY
             NO:  IF FIRST COPY, REPEAT STEPS 4-6
                  ELSE => NEXT SESSION (NO ASH ROW WRITTEN)
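A runnable approximation of the pseudo-code above, offered only as a sketch. The `SessionState` class, its `version` counter (standing in for whatever consistency check the real sampler performs) and the `make_racy_copy` helper are all hypothetical names introduced for illustration.

```python
import copy
from dataclasses import dataclass


@dataclass
class SessionState:
    sid: int
    connected: bool
    active: bool
    version: int = 0  # hypothetical change counter used as the consistency check


def sample_once(sessions, copy_fn=copy.copy, max_attempts=2):
    """One sampling pass: copy without locking, verify, retry once, else skip."""
    rows = []
    for live in sessions:                      # step 1
        if not live.connected:                 # step 2
            continue
        if not live.active:                    # step 3
            continue
        for _ in range(max_attempts):
            snap = copy_fn(live)               # step 4
            if snap.version == live.version:   # steps 5 and 6
                rows.append(snap)
                break
        # if both copies were inconsistent, no row is written for this session
    return rows


def make_racy_copy(changes):
    """Copy function whose first `changes` calls race with a session update."""
    remaining = [changes]

    def racy(live):
        snap = copy.copy(live)
        if remaining[0] > 0:
            remaining[0] -= 1
            live.version += 1  # the live session changes right after the copy
        return snap

    return racy


sessions = [SessionState(1, True, True), SessionState(2, True, False),
            SessionState(3, False, False)]
sampled = [s.sid for s in sample_once(sessions)]
```

With `make_racy_copy(1)` the second copy succeeds and a row is written; with `make_racy_copy(2)` both attempts fail and the session is simply missing from that sample.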
  • 17. Default Settings
      • Sampling interval = 1000ms = 1 sec
      • Disk filter ratio = 10 = 1 in 10 samples written to AWR
      • ASH buffer size:
        – Min( Max (5% shared pool, 2% SGA), 2MB per CPU)
        – Absolute Max of 256MB
      These are carefully chosen for maximum general utility
      NOTE: the MMNL sampler session is not sampled
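As a worked example of the buffer-size rule on slide 17, the following sketch evaluates the formula for two made-up configurations. The function name and the input sizes are illustrative assumptions; only the formula itself comes from the slide.

```python
MB = 1024 * 1024


def default_ash_buffer_bytes(shared_pool_bytes, sga_bytes, cpu_count):
    """Min(Max(5% shared pool, 2% SGA), 2MB per CPU), capped at 256MB."""
    by_memory = max(shared_pool_bytes * 5 // 100, sga_bytes * 2 // 100)
    by_cpu = 2 * MB * cpu_count
    return min(by_memory, by_cpu, 256 * MB)


# 1 GB shared pool, 4 GB SGA, 16 CPUs: the per-CPU term (32 MB) is the limit.
small = default_ash_buffer_bytes(1024 * MB, 4096 * MB, 16)

# 8 GB shared pool, 64 GB SGA, 256 CPUs: the absolute 256 MB cap applies.
large = default_ash_buffer_bytes(8192 * MB, 65536 * MB, 256)
```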
  • 18. Control Parameters (geeks want underscores)
      • _ash_size : size of ASH buffer in bytes
        – K/M notation works (e.g. 200M)
      • _ash_sampling_interval : in milliseconds
        – Min = 100, Max = 10,000
      • _ash_disk_filter_ratio : every Nth sample to AWR
        – MOD(sample_id, N) = 0 where N=disk filter ratio
      • _sample_all : samples idle and active sessions
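The disk filter rule, MOD(sample_id, N) = 0, can be illustrated with a few lines of Python. The function name and the sample_id range are made up for the example.

```python
def persisted_to_awr(sample_id, disk_filter_ratio=10):
    """True when MOD(sample_id, N) = 0, i.e. the sample is kept for AWR."""
    return sample_id % disk_filter_ratio == 0


# Thirty consecutive one-second samples; only every tenth one reaches disk.
kept = [sid for sid in range(101, 131) if persisted_to_awr(sid)]
```

Because the filter keys on sample_id rather than on a per-session counter, a persisted sample keeps the rows of every session that was active at that sample time.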
  • 19. V$ASH_INFO
      New in 11.2 (but unfortunately un-documented)
      desc v$ash_info
      Name                           Null     Type
      ------------------------------ -------- --------------
      TOTAL_SIZE                              NUMBER
      FIXED_SIZE                              NUMBER
      SAMPLING_INTERVAL                       NUMBER
      OLDEST_SAMPLE_ID                        NUMBER
      OLDEST_SAMPLE_TIME                      TIMESTAMP(9)
      LATEST_SAMPLE_ID                        NUMBER
      LATEST_SAMPLE_TIME                      TIMESTAMP(9)
      SAMPLE_COUNT                            NUMBER
      SAMPLED_BYTES                           NUMBER
      SAMPLER_ELAPSED_TIME                    NUMBER
      DISK_FILTER_RATIO                       NUMBER
      AWR_FLUSH_BYTES                         NUMBER
      AWR_FLUSH_ELAPSED_TIME                  NUMBER
      AWR_FLUSH_COUNT                         NUMBER
      AWR_FLUSH_EMERGENCY_COUNT               NUMBER
      DROPPED_SAMPLE_COUNT                    NUMBER
      Callouts on the slide: Compute buffer time window size; Compute average time per sample
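The two callouts on slide 19 suggest simple derived metrics. The sketch below assumes, without confirmation from the slide, that the time window is LATEST_SAMPLE_TIME minus OLDEST_SAMPLE_TIME and that average time per sample is SAMPLER_ELAPSED_TIME divided by SAMPLE_COUNT. The values are invented, and the unit of SAMPLER_ELAPSED_TIME is left unspecified.

```python
from datetime import datetime

# Hypothetical contents of one V$ASH_INFO row (values are made up).
ash_info = {
    "OLDEST_SAMPLE_TIME": datetime(2013, 9, 24, 8, 0, 0),
    "LATEST_SAMPLE_TIME": datetime(2013, 9, 24, 14, 30, 0),
    "SAMPLE_COUNT": 23400,
    "SAMPLER_ELAPSED_TIME": 4680000,
}

# How much history the in-memory buffer currently holds.
window = ash_info["LATEST_SAMPLE_TIME"] - ash_info["OLDEST_SAMPLE_TIME"]
window_hours = window.total_seconds() / 3600

# Average sampler cost per sample, in whatever unit the view reports.
avg_per_sample = ash_info["SAMPLER_ELAPSED_TIME"] / ash_info["SAMPLE_COUNT"]
```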
  • 20. ASH is Robust when CPU-constrained
      1. ASH sampler is very efficient and does not lock
         – Should complete a sample within a single CPU slice
      2. After sampling, the sampler computes next scheduled sample time and sleeps until then
      3. Upon scheduled wake-up, it waits for CPU (runq) and samples again
         – CPU bound sample times are shifted by one runq but intervals stay close to 1 second
      (These are precisely the times when reliable data is necessary)
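A small simulation may help show why intervals stay close to one second. It assumes the next scheduled time is computed on a fixed grid rather than relative to when the sampler actually ran; that reading of step 2 is an assumption, and both function names are hypothetical. A naive "sleep one interval after each sample" loop is included purely for contrast.

```python
def fixed_schedule_times(n, interval_ms, runq_delays_ms):
    """Actual sample times when wake-ups are scheduled on a fixed grid."""
    return [k * interval_ms + runq_delays_ms[k] for k in range(n)]


def relative_sleep_times(n, interval_ms, runq_delays_ms):
    """Contrast case: sleep a full interval after each sample, so delays pile up."""
    times, t = [], 0
    for k in range(n):
        t += runq_delays_ms[k]   # wait on the run queue
        times.append(t)          # sample
        t += interval_ms         # then sleep a whole interval
    return times


def intervals(times):
    return [b - a for a, b in zip(times, times[1:])]


delays = [40, 40, 40, 40]  # constant 40 ms run-queue wait under CPU pressure
grid = intervals(fixed_schedule_times(4, 1000, delays))
drift = intervals(relative_sleep_times(4, 1000, delays))
```

With a steady run-queue delay, the grid-based schedule shifts every sample by the same amount and the spacing is unchanged, while the relative-sleep variant stretches each interval by the delay.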
  • 21. ASH Sampler and Run-queue
      Sampling interval is consistent under CPU-starvation
      (Diagram: timeline with markers S_t0, S_t1, S_t2 and A_t0, A_t1, A_t2; repeating phases labeled “Sleep until next time”, “Run queue” and “Sample”)
  • 22. The ASH “Fix-up”
      A unique and very important feature
      • ASH column values may be unknown at sampling time
        – TIME_WAITED: session is still waiting
        – PLAN_HASH: session is still optimizing SQL
        – GC events: event details unknown at event initiation
      • ASH “fixes up” data during subsequent sampling
        – TIME_WAITED fixed up in first sample after event completes
        – Long events: last sample gets correct TIME_WAITED (all others 0)
      • Querying V$ASH may return un-fixed rows
        – Should not be a problem generally
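The TIME_WAITED fix-up for a long event can be pictured with the toy model below. Rows are plain dictionaries, and `record_sample` and `fix_up` are hypothetical helpers; the real mechanism operates on the in-memory buffer and is not described in this much detail on the slide.

```python
def record_sample(rows, sample_id, session_id, seq):
    """Sampler sees a session still waiting: the wait duration is not known yet."""
    rows.append({"sample_id": sample_id, "session_id": session_id,
                 "seq": seq, "time_waited": 0})


def fix_up(rows, session_id, seq, time_waited):
    """After the event completes, patch the most recent row sampled for it."""
    for row in reversed(rows):
        if row["session_id"] == session_id and row["seq"] == seq:
            row["time_waited"] = time_waited
            return row
    return None


rows = []
for sample_id in (501, 502, 503):      # one long wait caught by three samples
    record_sample(rows, sample_id, session_id=42, seq=7)

fix_up(rows, session_id=42, seq=7, time_waited=2_500_000)
waits = [r["time_waited"] for r in rows]
```

Only the last of the three rows carries the full wait time; the earlier rows stay at 0, which matches the "last sample gets correct TIME_WAITED" rule.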
  • 23. How do we use ASH data?
  • 24. How do we use ASH data?
      • Estimate DB Time and Average Active Sessions
        – For specific time intervals
        – Decomposed and filtered by many ASH dimensions
      • Investigate tuning opportunities
        – Excesses of DB Time in tune-able areas
      • ASH Forensics
        – Figure out “what happened to SID?”
  • 25. ASH Math: Estimating DB Time from ASH
      • Each ASH row counts for :INTERVAL of active session time
      • Default for :INTERVAL is 1 second (1000 ms)
      • Therefore COUNT(*) = DB Time in seconds
      • This is what I call “ASH Math”
      • An estimate because it is computed over a sample of true reality
  • 26. ASH Math and DB Time
      The count of sampled rows is an estimate (unbiased) of DB time
      COUNT (ASH SAMPLED ROWS) ≈ Estimate of DB Time
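ASH Math amounts to counting rows and multiplying by the sampling interval. The sketch below applies it to a handful of invented rows, broken down by one dimension; `db_time_by` and the row layout are illustrative, not an Oracle API.

```python
from collections import Counter


def db_time_by(rows, dimension, interval_s=1.0):
    """Estimated DB Time in seconds per value of one ASH dimension."""
    counts = Counter(row[dimension] for row in rows)
    return {key: n * interval_s for key, n in counts.items()}


# Nine sampled rows (made-up data) at the default 1-second interval.
rows = (
    [{"sql_id": "a", "wait_class": "User I/O"}] * 4
    + [{"sql_id": "b", "wait_class": "Application"}] * 2
    + [{"sql_id": "a", "wait_class": "CPU"}] * 3
)

by_class = db_time_by(rows, "wait_class")
by_sql = db_time_by(rows, "sql_id")
total_db_time = len(rows) * 1.0
```

The same nine rows add up to nine seconds of estimated DB Time whichever dimension is used for the breakdown, which is what makes the decomposition additive.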
  • 27. Computing Average Active Sessions
      • AAS = DELTA(DB TIME) / DELTA(elapsed_time)
        – Over some time interval(s) of sampled workload
      • SUM(:sampling_interval) / [ MAX(sample_time) - MIN(sample_time) ]
        – Normalized to common time units, e.g. seconds
      • COUNT(*) / [ MAX(sample_id) - MIN(sample_id) ]
        – This works for default sampling interval and one time interval
      The centerpiece measure for EM Activity charts and ASH Analytics
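Both AAS formulas from slide 27 can be written directly, as in this sketch. The function names and data are hypothetical; the sample_id version assumes, as the slide notes, the default one-second interval and a single contiguous time range.

```python
def aas_from_sample_times(sample_times_s, interval_s=1.0):
    """SUM(:sampling_interval) / [MAX(sample_time) - MIN(sample_time)]."""
    elapsed = max(sample_times_s) - min(sample_times_s)
    return len(sample_times_s) * interval_s / elapsed


def aas_from_sample_ids(sample_ids):
    """COUNT(*) / [MAX(sample_id) - MIN(sample_id)] for 1-second sampling."""
    return len(sample_ids) / (max(sample_ids) - min(sample_ids))


# Two active sessions caught in each of the samples 100..110 (22 rows).
sample_ids = [sid for sid in range(100, 111) for _ in range(2)]
sample_times = [float(sid) for sid in sample_ids]  # seconds, one per row

aas_ids = aas_from_sample_ids(sample_ids)
aas_times = aas_from_sample_times(sample_times)
```

Here both calls give 22 / 10 = 2.2 rather than exactly 2, because the denominator spans ten intervals while eleven samples were counted; over longer windows that edge effect becomes negligible.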
  • 28. Bad ASH Math and TIME_WAITED
      These mistakes are very common and very wrong
      • AVG(TIME_WAITED)
        – This does not estimate average event latencies because sampling is biased toward longer events
      • SUM(TIME_WAITED)
        – This does not compute total wait time in the database since ASH does not contain all waits
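The bias toward longer events is easy to reproduce with a deterministic toy timeline: one 3000 ms wait followed by two hundred 10 ms waits, sampled once per second. Everything here (function name, durations, sample times) is invented for illustration.

```python
from bisect import bisect_right
from itertools import accumulate


def sampled_event_durations(durations_ms, sample_times_ms):
    """Durations of the distinct back-to-back events that a sampler catches."""
    ends = list(accumulate(durations_ms))      # end time of each event
    caught = {}
    for t in sample_times_ms:
        i = bisect_right(ends, t)              # event in progress at time t
        if i < len(durations_ms):
            caught[i] = durations_ms[i]
    return list(caught.values())


durations = [3000] + [10] * 200                # one long wait, many short ones
samples = [500, 1500, 2500, 3500, 4500]        # 1-second sampling

caught = sampled_event_durations(durations, samples)
true_avg = sum(durations) / len(durations)     # average over all 201 events
sampled_avg = sum(caught) / len(caught)        # average over sampled events
```

The sampler catches the single long event and only two of the two hundred short ones, so the average over sampled events lands near 1000 ms while the true average latency is under 25 ms. Summing the sampled durations is just as misleading, since 198 waits never appear at all.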
  • 29. ASH Timing for Nano-operations
      Magic trick: timing what cannot be timed
      • Some important operations are still too frequent and short-lived for timing
        – E.g. no wait event for “bind” operations
      • A session-level bit vector is updated in binary fashion before/after such operations
        – Much cheaper than timer calls
      • The session bit vector is sampled into ASH
      • “ASH Math” used to estimate time spent in un-timed transient operations
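The bit-vector idea can be sketched as follows: each sampled row carries a copy of the session's flags, and ASH Math counts the samples in which a given bit was set. The bit positions and names below are hypothetical and do not reflect Oracle's actual encoding.

```python
IN_BIND = 0b01   # hypothetical bit: session is inside a bind operation
IN_PARSE = 0b10  # hypothetical bit: session is inside a parse


def estimate_seconds(sampled_flags, bit, interval_s=1.0):
    """ASH Math on a flag: samples with the bit set times the sampling interval."""
    return sum(1 for flags in sampled_flags if flags & bit) * interval_s


# Flag values copied into six consecutive samples of one session (made up).
sampled_flags = [0b00, 0b01, 0b11, 0b10, 0b00, 0b01]

bind_seconds = estimate_seconds(sampled_flags, IN_BIND)
parse_seconds = estimate_seconds(sampled_flags, IN_PARSE)
```

Setting and clearing a bit costs almost nothing per operation, yet over many samples the fraction of samples with the bit set approximates the share of time spent in that operation.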
  • 30. “ON CPU” and ASH
      ASH CPU and Time Model CPU don’t always agree
      • ASH session status ‘ON CPU’ is derived, not observed
        – Session is in a database call
        – Session is NOT in a wait event (idle or non-idle)
      • Un-instrumented waits => ‘ON CPU’
        – These are bugs and should be rare, but have happened
      • Sessions on run queue may be ‘WAITING’ or ‘ON CPU’
        – Depends on state prior to going onto run queue
  • 31. Enterprise Manager: ASH Analytics
  • 32. Average Active Sessions (Origin: EM Top Activity)
      • Display AAS by wait class over time
      • 5-minute Time Selector for details
      • Top SQL and Top Sessions
        – Broken down by wait class
        – Additional fact columns
      • User-selectable Top dimension
  • 33. Design Issues: Top Activity
      • Top Lists not graphically comparable
        – “% Activity” depends on sample count
      • Time Series by Wait Class only
        – What about SQL, User, etc?
      • Lots of wasted visual real-estate
  • 34. EM ASH Analytics
      Flexible multi-dimensional ASH-based performance analysis tool
      • Logical extension of EM Top Activity page
      • Average Active Sessions (AAS) over time
        – Decomposed by user-selectable ASH dimension (“parent” dimension)
      • “Top” Lists by two other user-selectable ASH dimensions
        – With breakdown by “parent” dimension
      • ASH Analytics Loadmap
        – AAS decomposed into Treemap of up to 3 ASH dimensions
        – Investigate skew and/or balance of load over dimension combinations
        – Investigate possible cause-effect relationships
  • 35. EM ASH Analytics: Average Active Sessions
      • Load (AAS) over time with time selector
        – Selected time broken down by ASH dimension
      • 2 “Top” lists by other dimensions
        – Broken down by parent dimension also
      • 4 Charts with a shared dimension
        – Extremely powerful
  • 36. (No transcript text for this slide.)
  • 37. EM ASH Analytics: Loadmap (ASH Treemap Visualization)
      • Space-filling, scales well
      • Decompose load (AAS) by multiple ASH dimensions
      • Hierarchical decomposition
      • Some hierarchies natural, others investigative
  • 39. Tanel Poder, Consultant, Enkitec:
      “Active Session History has radically changed the performance diagnosis of Oracle Databases, by design. With ASH you have detailed performance data always and immediately available… This translates to much faster problem solution times and also more accurate diagnosis…”
  • 40. About Us
      I am…
      • Oracle ACE Director
      • Sr. Technical Consultant, Enkitec
      Enkitec is…
      • Oracle Platinum Partner specializing in:
        – Oracle Exadata
        – Oracle Database, including RAC
        – Oracle Database Performance Tuning
        – Oracle APEX and so much more!
  • 41. The Consultant’s Challenge
      • “Hybrid” workload environment:
        – Transactional, ETL, Reporting
      • Upgraded to 11g in previous year
        – Consistent degradation since upgrade
        – ETL down from 400 “businesses” per hour to 200-300
      • ETL code review and enhancement in the works
        – “What can you do for us now outside of that effort?”
      Goal: Load 700 businesses per hour!!
  • 42. Oracle Tools of the Trade
      • AWR Reports:
        – First offered by onsite DBA, always available
        – “Averaging” effect of large snapshot times hiding issues
      • ASH Reports:
        – Help identify problem times with finer granularity
        – Target reports to problem times, gives clearer picture
      • Enterprise Manager 12c
        – Used to enhance ASH findings and do further research
        – Top Activity, SQL Details, ASH Analytics
  • 43. Why AWR Wasn’t the Answer
      • The “problem” was not visible
      • We expect to use CPU and to do I/O
      • Did not want to alter AWR snapshot timing but needed finer-grained time view
      • Problem not related to workload change or data volumes, ETL just degrading over time
  • 44. Why ASH Was…
      • Exposed competing PL/SQL procedures
      • More definitive breakdown of data
      • Zero-in on problem time
      • Session level information
      • Interested in impacts not frequencies
  • 45. ASH Report Targets CPU Spike
      • Breakdown by the minute, by interval
      (Callout: CPU spikes in four minute period)
  • 46. ASH Top SQL Exposes Oddities
      • STATS_ADMIN??
      • SQL Analyze??
      (Callout: Where does this SQL originate?)
  • 47. EM Exposes Problem SQL Profiles
      • EM Search SQL found multiple plans for critical ETL statements with vastly different performance (?)
      • Click-through bad plan to expose existence of SQL Profile
      • Oops, profiles are supposed to fix plans!
  • 48. What Caused This?
      • High profile environment, very sensitive to change
      • Stats collection using custom wrapper over deprecated Oracle package dating back prior to 9i (DBMS_ADMIN)
        – Also using 11g stats collection (DBMS_STATS)
        – DBMS_ADMIN was deprecated for a reason!
        – Analysis of object stats providing poor data to CBO
      • Other automated maintenance window tasks were expensive and competing for resources at exactly the wrong time (i.e. ETL time)
  • 49. Steps to Correct
      • Migrated to DBMS_STATS for all stats collection
      • Disabled jobs using custom wrapper over DBMS_ADMIN
      • Removed SQL Profiles impacting bad ETL plans
      • Additional steps taken:
        – Migrated select b-tree indexes to bitmap indexes, which also freed much-needed disk space
        – Continued to review ASH, AWR and Session SQL performance for improvement
  • 50. Victory within reach…
      • Throughput improvement after stats gathering changes
  • 51. Where They Are Today…
      • With further physical and logical tuning: 750!!
      GOAL ACHIEVED
  • 53. JB’s ASH Analytics Adventure
      A short sequence of using the tool on a real system
  • 54-65. (No transcript text for these slides.)
