The 5S Approach to Performance Tuning by Chuck Ezell


If your organization relies on data, optimizing the performance of your database can increase your earnings and savings. Many factors large and small can affect performance, so fine-tuning your database is essential. Performance Tuning expert and Senior Applications Tuner for Datavail, Chuck Ezell, sheds light on the right questions to get the answers that will help you move forward by using a defined approach, referred to as 5S.

This performance tuning white paper addresses each stage of this novel approach, as well as key performance issues: SQL, Space, Sessions, Statistics, and Scheduled Processes.

Published in: Technology

  1. The 5S Approach to Database Performance Tuning
     Presented by Chuck Ezell, 478-714-1615
  2. The Landscape
       • Bank of America: online banking down for 6 days, affecting 29 million online customers.
       • Gmail: down for 2 days, caused by a software update, affecting 120,000 users.
       • Virgin Blue: reservation desk down for 11 days, affecting 50,000 passengers and 400 flights and costing millions in profit.
       • Netflix: down 4-8 hours, affecting 20 million customers, potentially due to software deployment issues that were termed “internal technical issues.”
       • PayPal: battled on-and-off service outages for about five days in October 2004 after upgrading its site, and blamed the glitches on a software update.
     11/19/2013
  3. Why Performance Tune?
       • 80% of unplanned outages are due to ill-planned changes made by operations or developers.
       • 60% of availability and performance errors are the result of misconfigurations.
       • 80% of incidents are caused by changes made to the IT environment, including application code.
     Looking Ahead:
       • Through 2015, 80% of outages impacting mission-critical services will be caused by people and process issues.
       • More than 50% will be caused by change/configuration/release integration and hand-off issues.
  5. DBA’s Top Performance Issues?
     “What are the Top 3 performance issues that you encounter with your SQL servers?” – Stack Exchange User
  6. What is Database Performance Tuning?
       • Pull an AWR and ASH report!
       • What’s your Buffer Cache Hit Ratio?
       • Look in the Workload History.
       • Are your statistics up to date?
     Buffer busy waits; I/O Wait; SQL*Net message from client; Enq: CF contention; db file sequential reads; Disk Reads; Enq: TX - row lock contention; Buffer Gets; Cursor: pin S wait on X; Concurrency Wait Time; LGWR wait for redo copy; Rollbacks & Transactions; Physical Reads; Log File Sync Waits
  7. Ignore the Forest for the Trees
     “Every defect is a treasure, if the company can uncover its cause and work to prevent it across the corporation.” - Kiichiro Toyoda, founder of Toyota
     Reactive Approach vs. Proactive Approach
       • Many approaches out there take a hardware/architecture perspective or deal with peripheral issues and distractions.
       • What works in reactive situations often applies when planning proactively (in both cases we’re mitigating).
       • Most often we’re facing reactive situations.
       • Build on what we know are real problems we’re fixing right now.
       • Let’s avoid the distractions of all the potential peripheral issues.
       • We need a direct, quick way to address root cause and remediate.
  8. The 5S Approach
       • SQL Code
       • Statistics
       • Space/Indexing
       • Sessions
       • Scheduled Processes
  9. Step 1 - SQL Code
     Review the SQL Execution Plan
       • What are the peaks and bottlenecks?
       • What indexes are being used?
       • Do you see Index Skip Scans, Index Range Scans or Full Table Scans?
       • Why is the optimizer/execution engine generating this plan?
       • Are there embedded HINTs forcing the poor execution?
     Review the SQL Code
       • Is there wise use of built-in, optimized core language functions?
       • Are there ANSI JOINs instead of Core Product Friendly JOINs?
       • Are you seeing date or integer calculations without the use of Core Product Friendly functions (e.g. implicit conversions)?
       • Are there too many rows being selected (e.g. SELECT * is bad form)?
       • Are bind variables being used?
       • Are iterative calls generating multiple SQL statements (reduce these)?
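The execution-plan and bind-variable questions in Step 1 can be illustrated with a small, self-contained sketch. It uses Python's standard `sqlite3` module purely as a stand-in for the Oracle and Informix systems discussed in the slides; the `orders` table, the `idx_orders_customer` index, and the `plan_details` helper are hypothetical names introduced for illustration, and the plan wording shown by SQLite differs from what other engines report.

```python
import sqlite3


def plan_details(conn, sql, params=()):
    """Return the human-readable 'detail' column of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# Hypothetical schema and data, used only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

# Name only the needed columns (no SELECT *) and pass the value as a bind variable.
query = "SELECT id, total FROM orders WHERE customer_id = ?"

plan_before = plan_details(conn, query, (42,))  # expected to be a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan_details(conn, query, (42,))   # expected to use the new index

matching = conn.execute(query, (42,)).fetchall()

print("before index:", plan_before)
print("after index: ", plan_after)
print("rows returned:", len(matching))
```

Comparing the plan before and after a change is the point of the exercise: the same bind-variable query is reused unchanged, so any difference in the plan comes from the index alone.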
  10. Step 2 - Statistics
        • Are the statistics up to date?
        • Are you finding stats on temporary tables?
        • Is there a great degree of data manipulation on the tables in question that might have left a high water mark?
        • Are there sufficient transaction slots for the table/index in question?
        • What is the clustering factor on the indexes in question?
      Step 3 - Space/Indexing
        • Are there better indexes that can be used?
        • Are you missing needed indexes (or carrying too many)?
        • Is there too much data in the table, and is it in need of purging?
        • Are the tables fragmented and in need of rebuilding?
        • Is there sufficient space available for temp data to be processed?
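As a rough illustration of the first Step 2 question, the sketch below checks whether optimizer statistics exist at all and then gathers them. It again uses SQLite through Python's `sqlite3` module as a stand-in: SQLite stores its statistics in the `sqlite_stat1` table after `ANALYZE` is run, whereas Oracle and Informix use their own mechanisms. The `orders` table, the index name, and the `index_statistics` helper are assumptions made for the example.

```python
import sqlite3


def index_statistics(conn):
    """Return {(table, index): stat} from sqlite_stat1, or {} if ANALYZE never ran."""
    has_stats = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'sqlite_stat1'"
    ).fetchone()
    if not has_stats:
        return {}
    rows = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
    return {(tbl, idx): stat for tbl, idx, stat in rows}


# Hypothetical table and index, populated so there is something to analyze.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

stats_before = index_statistics(conn)  # nothing gathered yet
conn.execute("ANALYZE")                # gather statistics for the optimizer
stats_after = index_statistics(conn)

print("before ANALYZE:", stats_before)
print("after ANALYZE: ", stats_after)
```

The same before/after habit applies on any engine: record what statistics exist, change one thing, and compare, so that an unexpected plan change can be traced back to a specific statistics job.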
  11. Step 4 - Sessions
        • Is it possible a developer is testing SQL code against production?
        • Are you seeing long running sessions causing blocks or waiting?
        • Are there sessions locking objects and/or possibly invalidating objects?
        • Are you seeing too many sessions open at once?
        • Are you finding abandoned sessions consuming connections and CPU?
      Step 5 - Scheduled Processes
        • Was that backup scheduled for 12 a.m. or 12 p.m., right in the middle of the sales day?
        • Are there processes competing for resources?
        • Do you know how many child processes the scheduled process will spawn?
        • Are your update statistics jobs conflicting with other crucial processes and causing extended run times?
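The Step 5 questions about mistimed and competing jobs can be sketched as a simple overlap check. This is an illustrative model only: the job names, start times, durations, and the 9-to-17 business window are all invented for the example, and real schedules would come from the scheduler in use (cron, a database job scheduler, and so on).

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Job:
    name: str
    start_hour: float      # 24-hour clock: 0 is 12 a.m., 12 is 12 p.m.
    duration_hours: float

    @property
    def end_hour(self):
        return self.start_hour + self.duration_hours


def runs_during(job, window_start, window_end):
    """True if the job's run window intersects [window_start, window_end)."""
    return job.start_hour < window_end and window_start < job.end_hour


def overlapping_jobs(jobs):
    """Return name pairs of jobs whose run windows overlap (no wrap past midnight)."""
    return [
        (a.name, b.name)
        for a, b in combinations(jobs, 2)
        if runs_during(a, b.start_hour, b.end_hour)
    ]


# Hypothetical schedule: the backup was meant for 12 a.m. but entered as 12 p.m.
jobs = [
    Job("nightly_backup", 12, 2),
    Job("update_statistics", 13, 1.5),
    Job("month_end_report", 22, 1.5),
]
BUSINESS_HOURS = (9, 17)  # assumed sales day

conflicts = overlapping_jobs(jobs)
daytime = [job.name for job in jobs if runs_during(job, *BUSINESS_HOURS)]

print("competing jobs:", conflicts)
print("running during business hours:", daytime)
```

Here the mis-entered backup both lands in the sales day and collides with the statistics job, which are the two situations the slide warns about.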
  12. Case Study #1
      Customer Environment: Informix, Java, Silverlight
      Complaint: Reports were running longer than 5 minutes and would often time out when searching by city, state, zip code, vendor name, or vendor id.
      Root Cause: SQL Code
        • Too many ANSI joins to an external database for vendor information.
        • SQL was also using OR statements (in the JOINs) against two different sets of tables.
        • Normalization was too high, which forced multiple joins to get simple vendor information.
        • Poor, non-standard SQL caused Informix to generate a bad execution plan.
      Solution:
        • Replicate the external dependency into de-normalized tables within the DB.
        • Rewrite the SQL joins.
        • Eliminate OR clauses through de-normalization.
      Result: 5+ minute response times decreased to a maximum of 30 seconds.
  13. Case Study #2
      Customer Environment: Oracle EBS & iStore
      Complaint: Vendor sales checkout form was performing slowly.
      Root Cause: Statistics
        • A poor execution plan was causing multiple blocking locks, waits, high I/O and high CPU.
        • Statistics were found on two temporary tables that should not have been there.
        • A DBA had improperly scheduled a statistics update job, and it generated statistics on ALL objects in the database.
      Solution:
        • Dropped the statistics on the temporary tables, and the execution plan reverted to the previous plan.
      Result: Response time went from several minutes with blocking locks and waits to sub-second response times hardly worth noting.
  14. Case Study #3
      Customer Environment: Oracle Financials
      Complaint: Month-end reporting was taking days to complete. It had to be scheduled through the weekend, with no interaction with the system until it completed.
      Root Cause: SQL Code, Indexing
        • A concurrent process was spawning multiple child processes.
        • Each process was executing full table scans and index skip scans.
        • The reporting process was taking 17 hours to complete.
      Solution:
        • Add an index on a specific table to eliminate the full table scans.
        • Register the new index with a histogram and turn off logging.
        • Force the use of a better index with statistics for the skip scans.
      Result: Even with 5% increased executions, the 17-hour report began returning in less than 3 minutes.
  15. Case Study #4
      Customer Environment: Informix, Java, Silverlight
      Complaint: Shipping change alert not working.
      Root Cause: SQL Code
        • The SQL execution was selecting too much data.
        • The alert was looking back against 6 months of data to verify a shipment should have been received within 10 days.
        • The SQL code had no way of limiting the selection against all the rows of shipment data.
      Solution:
        • Add a new condition to the SQL that limited the data selection to 10 days or less.
        • It was finally decided to add a new feature to the UI to allow the user to define how far back the alert would look, with a maximum of 30 days.
      Result: The alert began working because response time went from timing out to sub-second.
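A minimal sketch of the kind of fix described in Case Study #4 follows. The original system was Informix with a Java front end; this version uses Python's `sqlite3` module only to stay runnable, and the `shipments` table, its columns, and the `overdue_shipments` function are hypothetical. The two ideas it shows come from the slide: bound the date range in the WHERE clause, and cap the user-defined lookback at 30 days.

```python
import sqlite3
from datetime import date, timedelta

MAX_LOOKBACK_DAYS = 30  # upper bound named in the case study


def overdue_shipments(conn, today, lookback_days=10):
    """Return ids of unreceived shipments sent within the bounded lookback window."""
    # Clamp the user-supplied value so the query can never scan months of data.
    lookback_days = max(1, min(lookback_days, MAX_LOOKBACK_DAYS))
    cutoff = (today - timedelta(days=lookback_days)).isoformat()
    rows = conn.execute(
        "SELECT id FROM shipments "
        "WHERE received = 0 AND shipped_on >= ? "
        "ORDER BY id",
        (cutoff,),  # bind variable; ISO dates compare correctly as text
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical data, used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shipments (id INTEGER PRIMARY KEY, shipped_on TEXT, received INTEGER)"
)
conn.executemany(
    "INSERT INTO shipments (id, shipped_on, received) VALUES (?, ?, ?)",
    [
        (1, "2013-11-15", 0),
        (2, "2013-11-01", 0),
        (3, "2013-06-01", 0),
        (4, "2013-11-18", 1),
    ],
)

today = date(2013, 11, 19)
default_window = overdue_shipments(conn, today)
wider_window = overdue_shipments(conn, today, lookback_days=30)
clamped_window = overdue_shipments(conn, today, lookback_days=180)

print(default_window, wider_window, clamped_window)
```

Clamping inside the function, rather than trusting the UI, keeps the six-month scan from coming back if another caller passes a large value.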
  16. Case Study #5
      Customer Environment: Oracle & Java concurrent program
      Complaint: Billing invoices were backing up and many were failing.
      Root Cause: SQL Code, Indexing
        • Memory and hardware improvements made no improvement in throughput.
        • The vendor support couldn’t provide patches or improvements.
        • The SQL execution plan was sub-par, selecting the wrong indexes for optimal performance and performing index skip scans and index range scans.
      Solution:
        • Added 2 new indexes that provided a higher degree of selectivity than the indexes the optimizer was currently using.
      Result: Doubled throughput of invoice billing, eliminating failures due to timeouts and response times.
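Case Study #5 turns on index selectivity. One common way to estimate it, sketched below, is the ratio of distinct values to total rows for each candidate column: values near 1.0 narrow a search to very few rows, while values near 0 barely narrow it. This is a general heuristic, not necessarily the method used in the case study; the `invoices` table, its columns, and the `column_selectivity` helper are hypothetical, and SQLite stands in for Oracle.

```python
import sqlite3


def column_selectivity(conn, table, columns):
    """Return {column: distinct_values / total_rows} for each candidate column.

    Table and column names are interpolated into the SQL, so they must come
    from trusted code, never from user input.
    """
    total = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    result = {}
    for column in columns:
        distinct = conn.execute(
            f'SELECT COUNT(DISTINCT "{column}") FROM "{table}"'
        ).fetchone()[0]
        result[column] = distinct / total if total else 0.0
    return result


# Hypothetical invoice data: 4 statuses, 250 customers, unique invoice numbers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (invoice_no INTEGER, customer_id INTEGER, status INTEGER)"
)
conn.executemany(
    "INSERT INTO invoices (invoice_no, customer_id, status) VALUES (?, ?, ?)",
    [(i, i % 250, i % 4) for i in range(1000)],
)

selectivity = column_selectivity(
    conn, "invoices", ["status", "customer_id", "invoice_no"]
)
# Rank candidates from most to least selective.
ranked = sorted(selectivity, key=selectivity.get, reverse=True)

print(selectivity)
print("best index candidates first:", ranked)
```

A ranking like this is only a starting point: the columns also have to appear in the predicates of the slow statements, which is why the execution plan review in Step 1 comes first.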
  17. Benefits of 5S Approach
        • Get to root cause faster
        • Greater degree of insight (no Voodoo!)
        • Eliminates distracting possibilities
        • Time and money saved
        • Better utilization of resources (people and hardware)
        • Works both in reactive and proactive situations
        • Outcome provides measured response
        • Root cause reporting is much easier
  18. Final Thoughts
  19. Questions?
      The 5S Approach for Database Performance Tuning
      Chuck Ezell – 478-714-1615