The Database Sizing Workflow


  • Outline:
    Ultimate Exadata IO monitoring – Flash, Hard Disk, & Write-Back Cache overhead
    I’ll do a session highlighting a very write-intensive OLTP Exadata environment and will discuss the different ways to monitor IO from the database and storage-layer perspectives, correlating it back to the application by mining the dba_hist_sqlstat data. I’ll also touch on utilizing the OEM12c Metric Extensions and BI Publisher integration to ultimately scale the monitoring to a bunch of Exadata environments. It’s going to be a fun hacking session.
    > discuss the capacity doodle
    > the variables
    > monitoring
    > the reclaim
    > highlight an issue on a very write-intensive OLTP environment
    > monitoring problem
    on the OEM perf page > show that the IO perf page does not account for the flash IOs
    ** partly because some people on the team have access to only a limited view of things
    ** or they have difficulty interpreting the numbers; they need simple stuff
    on the OEM12c storage grid perf page > although 12c has Exadata IO monitoring,
    I'd like to get the IOPS numbers separated by flash and disk
    > wbfc patent
    > write back cache
    > exadata oltp optimizations
    > discuss about the basic architecture
    > discuss different ways to monitor IO (email to randy)
    Different views of IO performance
    SECTION 1: USER IO wait class and cell single block reads latency with curve fitting
    SECTION 2: Small IOPS vs Large IOPS
    SECTION 3: Flash vs HD IOPS
    SECTION 4: Flash vs HD IOPS with read/write breakdown
    SECTION 5: IO throughput read/write MB/s
    SECTION 6: Drill down on smart scans affecting cell single block latency on 24hour period
    > correlate the IO workload up to the top events and sqlstat data
    > causal links - produce an analysis that relates database load to application processing, creating a strong
    front-to-back understanding as an enabler to ‘fix’
    > feedback loop on what is working and what is not
    > track IO config changes - IORM (topevents data)
    > basic, auto, low latency... and when it is applicable
    > scaling it!
    > metrics extension
    > BIP
    > show data model
    > email everyday
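The flash-vs-disk IOPS split called out above boils down to post-processing two samples of cumulative request counters. A minimal sketch, assuming made-up counter names and sample values (not real cell metrics or a real collector):

```python
# Minimal sketch: derive flash vs hard-disk IOPS from two samples of
# cumulative IO request counters. Counter names and values are hypothetical.

def iops_breakdown(prev, curr, elapsed_sec):
    """Return per-counter IOPS from cumulative counter deltas."""
    return {name: (curr[name] - prev[name]) / elapsed_sec for name in curr}

# Two samples taken 60 seconds apart (made-up numbers).
prev = {"flash_reads": 1_000_000, "flash_writes": 400_000,
        "disk_reads":    200_000, "disk_writes": 150_000}
curr = {"flash_reads": 1_600_000, "flash_writes": 520_000,
        "disk_reads":    212_000, "disk_writes": 159_000}

rates = iops_breakdown(prev, curr, elapsed_sec=60)
flash_iops = rates["flash_reads"] + rates["flash_writes"]   # 10000 + 2000
disk_iops  = rates["disk_reads"]  + rates["disk_writes"]    # 200 + 150
print(flash_iops, disk_iops)  # 12000.0 350.0
```

The point of the separation: a combined IOPS number on the OEM perf page hides the fact that flash absorbs the bulk of the requests, which is exactly what you want visible on a write-intensive environment.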
  • Just a brief introduction of myself..
  • And this is what the tar file looks like; it’s just simple CSV output of AWR data
  • And what makes Tableau really interesting is that it automatically creates “dimensions” out of those CSV files
    My objective in this image is to quickly see the CPU utilization if I combine particular instances. I can do that by just pulling the Total Oracle CPU seconds metric onto the graph; that’s the boxed line chart at the bottom, and it is the sum of Total Oracle CPU seconds of the instances selected on the right-hand portion of the graph.
    So let’s say I want to consolidate the 3 instances onto a single 24-core compute node (24 cores x 3600 seconds = 86400 seconds of CPU capacity per hour). I’ll be able to tell from the workload trend that it can fit on that box, and the highest CPU utilization I expect is about 69% (60000/86400).
    And you can also right click on this and do a “View Data”
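The consolidation arithmetic above can be sketched in a few lines; the peak value of 60000 CPU seconds is the one read off the workload trend in the viz:

```python
# Sketch of the consolidation check: can 3 instances fit on one 24-core node?
# CPU capacity is expressed in CPU-seconds per hour, matching the AWR metric.
cores = 24
capacity_sec_per_hour = cores * 3600          # 86400 CPU-seconds available per hour

peak_oracle_cpu_sec = 60000                   # summed "Total Oracle CPU seconds" at peak
utilization_pct = 100.0 * peak_oracle_cpu_sec / capacity_sec_per_hour
print(round(utilization_pct))  # 69 -> expected peak CPU utilization in percent
```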
  • So how it works is: whatever SNAP_IDs on the selected instances fall within a specific hour dimension get summed. The tool therefore automatically takes care of the snap-interval differences between databases, which is tedious to do manually.
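The hourly summation can be sketched as a group-by on the truncated timestamp; the instance names, timestamps, and CPU values below are made up, with one instance on a 15-minute and another on a 60-minute snap interval:

```python
from collections import defaultdict
from datetime import datetime

# Snapshots from two instances with different snap intervals (15 vs 60 min);
# values are "Total Oracle CPU seconds" per snapshot (made-up numbers).
snaps = [
    ("inst1", "01/05/14 10:00:00", 500), ("inst1", "01/05/14 10:15:00", 700),
    ("inst1", "01/05/14 10:30:00", 600), ("inst1", "01/05/14 10:45:00", 400),
    ("inst2", "01/05/14 10:00:00", 1500),
]

hourly = defaultdict(float)
for inst, ts, cpu_sec in snaps:
    # Truncate each snapshot timestamp to its hour bucket, then sum:
    # snap-interval differences between instances no longer matter.
    hour = datetime.strptime(ts, "%m/%d/%y %H:%M:%S").replace(minute=0, second=0)
    hourly[hour] += cpu_sec

print(hourly)  # one 10:00 bucket totaling 3700.0
```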
  • The Database Sizing Workflow

    1. The Database Sizing Workflow (Presented by: Karl Arao)
    2. whoami: Karl Arao
       • Senior Technical Consultant @ Enkitec
       • Performance and Capacity Planning Enthusiast
       • 7+ years DBA experience
       • Oracle ACE, OCP-DBA, RHCE, OakTable
       • Blog: Wiki: Twitter: @karlarao
    3. 200+
    4. Agenda
       • The sizing scenarios/objective
       • The general sizing workflow: Extract, Visualize, Model, Project
       • Putting it all together: Real Sizing Scenarios
    5. (image-only slide)
    6. The sizing scenarios/objective
       • Consolidation, HW refresh, platform migration
         – How many can fit?
         – Can I combine A + B + ½ of C?
         – What's the ideal hardware to buy: "right sizing"
    7. The sizing workflow
       • Extract: workload data
       • Visualize: consolidated peak workload
       • Model: provisioning plan
       • Project: “Headroom”
    8. (image-only slide)
    9. Extract
    10. AWR data
       • Top Events: AAS CPU, latency, wait class
       • SYSSTAT: PGA, SGA, physical memory, Executes/sec
       • IO: IOPS breakdown, MB/s
       • CPU: Load Average, NUM_CPUS
       • Storage: total storage size, per-tablespace size
       • Services: distribution of workload/modules
       • Top SQL: PIOs, LIOs, modules, SQL type, SQL_ID, PX
       Correlate across months of workload data!
    11. (image-only slide)
    12. OS data
    13. Visualize
    14. Visualize – Workload Characterization
       • General Workload: top events, load profile (exec/sec), top modules/services
       • CPU usage: CPU, cpuwait, scheduler
       • SGA/PGA
       • IOPS, MB/s, latency: IO breakdown, read/write ratio
       • Storage Size
    15. Tableau auto-creates a time dimension for the time column “MM/DD/YY HH24:MI:SS” of the AWR CSV output
    16. Summary and underlying data (1-2AM, 2-3AM)
    17. Consolidated CPU usage
    18. Model
    19. What to model?
       • The provisioning plan: instance mapping, node failure scenarios, resource management
       • Backups, test/dev, DR, ZFS
       • Hardware options
       • Memory upgrade
       • Redundancy (normal or high)
    20. (image-only slide)
    21. Projection
    22. (image-only slide)
    23. Putting it all together
    24. Summary
       • The sizing scenarios/objective
       • The 4 points of the sizing workflow
    25. References
       • Where did my CPU go? (webinar) (paper)
       • Book: Computer Architecture: A Quantitative Approach, 5th Ed. - Chapter 1, Section 1.10, “Putting It All Together: Perf, Price, Power”
       • Book: The Art of Scalability - Ch. 11, “Headroom”
       • Viz example: CPU sizing, 15 vs 60 min snap interval
       • Viz example: Different views of IO performance
       • Exadata Provisioning Worksheet
       @karlarao