Predictive Performance Analysis using SQL Pattern Matching

Transcript

  • 1. Predictive Performance Analysis using SQL Pattern Matching. October 2013. Horia Berca
  • 2. TABLE OF CONTENTS
    - Overview
    - Introduction
    - Metrics
    - Workload Cycles
    - Find Patterns. Fast
    - How Data is processed in Pattern Matching
    - Appendix
      - SYSSTAT - Logical Reads
      - CPU Load
  • 3. Overview
    Predictive performance analysis allows efficient scheduling by anticipating workload behaviour before execution. This in turn enables efficient resource utilisation and helps maximize application and system performance, with control from the design stage through to delivery.

    Introduction
    The development of tools and methods to aid application and system performance analysis continues to be a research area. Performance tuning is an iterative process. After you remove the first bottleneck, you may see no improvement, because another bottleneck with an even greater effect on performance may be revealed. You need to diagnose the performance problem accurately to ensure that your changes actually improve performance.

    Typically, performance problems result from a lack of throughput (the amount of work that can be completed in a specified time), unacceptable user or job response time (the time to complete a specified workload), or both. The problem might be localized to specific application modules or it might span the whole system.

    The Automatic Workload Repository (AWR) stores a wealth of data regarding database performance. This data is a key component of the Diagnostics and Tuning pack. There are times when viewing longer-term historical information is useful: it can help pinpoint when a performance problem started, and it can identify peak hours and peak days. Similarly, historical characteristics of the workload, such as I/O requests per second or user calls per second, show whether the workload is constant, increasing, or decreasing.
    The trend of a high-load SQL statement can show whether its characteristics are changing: is it using more CPU, is it taking more elapsed time per execution, is it retrieving more data per execution, or is it simply being executed more often?

    All of the above information is available in the AWR, and Oracle Enterprise Manager displays it in graphs along several dimensions. However, this data can be analysed in many more ways; here we are specifically interested in showing how to make predictions from it.

    In a default configuration, the AWR retention period is eight days. For a longer analysis that covers the various workloads in your organization, it is recommended to increase the default retention.
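As a rough illustration of the retention change recommended above, the sketch below converts a retention period from days to minutes and builds a call to DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS, whose retention and interval parameters are expressed in minutes. The helper name awr_retention_call, the 35-day retention and the 60-minute interval are assumptions chosen for illustration and do not come from the presentation; the generated block would still have to be run by a suitably privileged database user.

```python
MINUTES_PER_DAY = 24 * 60


def awr_retention_call(retention_days, interval_minutes=60):
    """Build a PL/SQL block that sets AWR retention (hypothetical helper).

    retention_days and interval_minutes are illustrative inputs; the
    procedure itself expects both values in minutes.
    """
    if retention_days < 1:
        raise ValueError("retention must be at least one day")
    retention_minutes = retention_days * MINUTES_PER_DAY
    return (
        "BEGIN\n"
        "  DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(\n"
        f"    retention => {retention_minutes},\n"
        f"    interval  => {interval_minutes});\n"
        "END;\n"
        "/"
    )


if __name__ == "__main__":
    # Assumed example: keep five weeks of snapshots, one snapshot per hour.
    print(awr_retention_call(35))
```

Generating the statement as text keeps the sketch free of any database driver; the default of eight days corresponds to 11520 minutes in the same units.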
  • 4. Metrics
    Tuning metrics table:

                 Stats                        Waits                                                   Files
    Raw          V$SYSSTAT                    V$SYSTEM_EVENT, V$EVENT_HISTOGRAM, V$SYSTEM_WAIT_CLASS  V$FILEIO
    Now          V$SYSMETRIC,                 V$EVENTMETRIC, V$WAITCLASSMETRIC                        V$FILEMETRIC
                 V$SYSMETRIC_SUMMARY
    1 Hour       V$SYSMETRIC_HISTORY          V$WAITCLASSMETRIC_HISTORY                               V$FILEMETRIC_HISTORY
    7 days       DBA_HIST_SYSMETRIC_SUMMARY   DBA_HIST_SYSTEM_EVENT                                   DBA_HIST_FILESTATXS
    Alerts Only  DBA_HIST_SYSMETRIC_HISTORY   DBA_HIST_WAITCLASSMETRIC_HISTORY                        DBA_HIST_FILEMETRIC_HISTORY

    Workload Cycles
    The workload may follow cyclic usage patterns over a given period, for example day versus night, or weekdays versus weekends. Metrics can be aggregated according to the following categories.

    Daily activity
    - By day and night: metrics aggregated separately for day hours (8am to 8pm) and night hours (8pm to 8am)
    - Hourly: metrics aggregated separately for every hour of the day
    - None: all metrics aggregated together

    Weekly options
    - By day of week: metrics aggregated separately for each day of the week
    - By weekdays and weekends
    - None
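The daily and weekly categories above can be sketched in Python as simple bucketing functions applied before aggregation. This is a minimal sketch under stated assumptions: the function names, the sample values and the use of an average as the aggregate are hypothetical; only the 8am/8pm boundaries and the weekday/weekend split come from the text.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_bucket(ts):
    """Return 'day' for 08:00 up to (not including) 20:00, else 'night'."""
    return "day" if 8 <= ts.hour < 20 else "night"


def weekly_bucket(ts):
    """Return 'weekend' for Saturday and Sunday, else 'weekday'."""
    return "weekend" if ts.weekday() >= 5 else "weekday"


def aggregate(samples, bucket_fn):
    """Average metric values per bucket; samples are (datetime, value) pairs."""
    groups = defaultdict(list)
    for ts, value in samples:
        groups[bucket_fn(ts)].append(value)
    return {bucket: mean(values) for bucket, values in groups.items()}


# Made-up sample values, for illustration only.
samples = [
    (datetime(2013, 10, 28, 10), 100.0),  # Monday, day hours
    (datetime(2013, 10, 28, 22), 20.0),   # Monday, night hours
    (datetime(2013, 11, 2, 11), 40.0),    # Saturday, day hours
]

if __name__ == "__main__":
    print(aggregate(samples, daily_bucket))
    print(aggregate(samples, weekly_bucket))
```

Passing the bucketing function as a parameter keeps the "Hourly" and "By day of week" options a one-line change, for example `lambda ts: ts.hour` or `lambda ts: ts.weekday()`.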
  • 5. Find Patterns. Fast
    Patterns are usually defined as a repetitive series or sequence of specific events or actions, and they occur everywhere in business. The ability to find, analyze and quantify individual patterns or groups of patterns within a data set can help you better understand operational activities, spot trends and predict direction.

    Oracle Database 12c adds native pattern matching capabilities to SQL. This brings the simplicity and efficiency of the most common data analysis language to the process of identifying patterns within a data set, and it offers significant gains in terms of performance. The new native SQL syntax adopts the regular expression capabilities of Perl, implementing a core set of rules to define patterns in sequences of rows.

    Pattern matching in SQL is performed using the MATCH_RECOGNIZE clause, which enables you to do the following tasks:
    - Logically partition and order the data used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses.
    - Define patterns of rows to seek using the PATTERN clause. These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define.
    - Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause.
    - Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause.

    How Data is processed in Pattern Matching
    The MATCH_RECOGNIZE clause performs these steps:
    1. The row pattern input table is partitioned according to the PARTITION BY clause. Each partition consists of the set of rows of the input table that have the same value in the partitioning columns.
    2. Each row pattern partition is ordered according to the ORDER BY clause.
    3. Each ordered row pattern partition is searched for matches to the PATTERN.
    4. Pattern matching operates by seeking the match at the earliest row, considering the rows in a row pattern partition in the order specified by the ORDER BY clause. Pattern matching in a sequence of rows is an incremental process, with one row after another examined to see if it fits the pattern. With this incremental processing model, at any step until the complete pattern is recognized you have only a partial match, and you do not know what rows might be added in the future, nor to which variables those future rows might be mapped. If no match is found at the earliest row, the search moves to the next row in the partition and checks whether a match can be found starting with that row.
    5. After a match is found, row pattern matching calculates the row pattern measure columns, which are expressions defined by the MEASURES clause.
    6. With ONE ROW PER MATCH, pattern matching generates one row for each match that is found. With ALL ROWS PER MATCH, as used in the appendix queries, every row that is matched is included in the pattern match output.
    7. The AFTER MATCH SKIP clause determines where row pattern matching resumes within a row pattern partition after a non-empty match is found. In the appendix queries, row pattern matching resumes at the last row of the match found (AFTER MATCH SKIP TO LAST UP).

    Most of the AWR tables store statistic values accumulated since instance startup, so per-interval figures are obtained as differences between consecutive snapshots. This makes the AWR a useful resource to explore.
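The processing steps above can be mimicked outside the database with a short Python sketch. It is not how Oracle implements MATCH_RECOGNIZE; it is a simplified illustration that only handles a sequence of DOWN+ and UP+ phases after a starting row, as in the appendix pattern STRT DOWN+ UP+ DOWN+ UP+. The function name find_matches and the tuple layout are assumptions. Because a row cannot be both lower and higher than its predecessor, a greedy scan needs no backtracking for this particular pattern. The sample rows reuse the logical reads values listed in the appendix output.

```python
from itertools import groupby
from operator import itemgetter

# Conditions corresponding to the DEFINE clause: compare a row with its predecessor.
CONDITIONS = {
    "DOWN": lambda cur, prev: cur < prev,
    "UP": lambda cur, prev: cur > prev,
}


def find_matches(rows, phases=("DOWN", "UP", "DOWN", "UP")):
    """rows: (partition_key, order_key, value) tuples. Returns one dict per match."""
    results = []
    # Steps 1-2: partition the input, then order each partition.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for key, group in groupby(ordered, key=itemgetter(0)):
        part = list(group)
        match_num = 0
        start = 0
        # Steps 3-4: try to match at the earliest row, then move forward.
        while start < len(part):
            pos = start + 1  # the row at `start` plays the role of STRT
            matched = True
            for phase in phases:
                cond = CONDITIONS[phase]
                begin = pos
                while pos < len(part) and cond(part[pos][2], part[pos - 1][2]):
                    pos += 1
                if pos == begin:  # '+' requires at least one row
                    matched = False
                    break
            if matched:
                # Step 5: compute the measures for this match.
                match_num += 1
                last = pos - 1
                results.append({
                    "partition": key,
                    "match_num": match_num,
                    "start": part[start][1],
                    "end": part[last][1],
                })
                start = last  # Step 7: resume at the last UP row
            else:
                start += 1
    return results


rows = [
    ("session logical reads", "10-28-13 10", 541.71),
    ("session logical reads", "10-28-13 11", 66.42),
    ("session logical reads", "10-28-13 12", 31.14),
    ("session logical reads", "10-28-13 13", 467.44),
    ("session logical reads", "10-28-13 14", 32.11),
    ("session logical reads", "10-28-13 15", 29.1),
    ("session logical reads", "10-28-13 16", 107.27),
]

if __name__ == "__main__":
    for match in find_matches(rows):
        print(match)
```

The sketch reports one result per match, which corresponds to ONE ROW PER MATCH; emitting every row between the start and end positions would correspond to ALL ROWS PER MATCH.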
  • 6. Appendix
    SYSSTAT - Logical Reads

    select *
    from (
      select stat_name
           , to_char(round(end_interval_time,'hh24'),'mm-dd-rr hh24') snap_time
           , round(avg(pSec),2) perSec
      from (
        -- clamp negative rates to zero
        select stat_name, instance_number, end_interval_time
             , greatest(v/ela,0) pSec
        from (
          -- instance_number is carried through so that the outer GROUP BY can reference it
          select /*+ leading(s,sn,sy) */
                 sn.stat_name, s.snap_id, s.dbid, s.instance_number, s.end_interval_time
                 -- counters are cumulative since startup: take the difference to the previous snapshot
               , case when s.begin_interval_time = s.startup_time
                      then sy.value
                      else sy.value - lag(sy.value,1) over (partition by sy.stat_id, sy.dbid, s.startup_time
                                                            order by sy.snap_id)
                 end v
                 -- snapshot interval length in seconds
               , (cast(end_interval_time as date) - cast(begin_interval_time as date))*24*3600 ela
          from dba_hist_snapshot s
             , dba_hist_sysstat sy
             , dba_hist_stat_name sn
          where s.dbid = sy.dbid
            and s.instance_number = sy.instance_number
            and s.snap_id = sy.snap_id
            and s.dbid = sn.dbid
            and sy.stat_id = sn.stat_id
            and sn.stat_name = 'session logical reads'
        )
      )
      group by stat_name, to_char(round(end_interval_time,'hh24'),'mm-dd-rr hh24'), instance_number
      order by stat_name, to_char(round(end_interval_time,'hh24'),'mm-dd-rr hh24'), instance_number
    )
    match_recognize (
      partition by stat_name
      order by snap_time
      measures match_number() as match_num,
               STRT.snap_time AS start_tstamp,
               FINAL LAST(UP.snap_time) AS end_tstamp
      ALL ROWS PER MATCH
      AFTER MATCH SKIP TO LAST UP
      PATTERN (STRT DOWN+ UP+ DOWN+ UP+)
      DEFINE DOWN AS DOWN.perSec < PREV(DOWN.perSec),
             UP AS UP.perSec > PREV(UP.perSec)
    ) MR
    ORDER BY MR.stat_name, MR.match_num, MR.snap_time;

    stat_name              snap_time    match_num  start_tstamp  end_tstamp   perSec
    session logical reads  10-28-13 10  1          10-28-13 10   10-28-13 16  541,71
    session logical reads  10-28-13 11  1          10-28-13 10   10-28-13 16  66,42
    session logical reads  10-28-13 12  1          10-28-13 10   10-28-13 16  31,14
  • 7. SYSSTAT - Logical Reads (output, continued)

    stat_name              snap_time    match_num  start_tstamp  end_tstamp   perSec
    session logical reads  10-28-13 13  1          10-28-13 10   10-28-13 16  467,44
    session logical reads  10-28-13 14  1          10-28-13 10   10-28-13 16  32,11
    session logical reads  10-28-13 15  1          10-28-13 10   10-28-13 16  29,1
    session logical reads  10-28-13 16  1          10-28-13 10   10-28-13 16  107,27

    CPU Load

    select *
    from (
      select stat_name
           , to_char(round(s.end_interval_time,'hh24'),'mm-dd-rr hh24') snap_time
           , round(os.value,2) value
      from dba_hist_snapshot s
         , dba_hist_osstat os
      where s.dbid = os.dbid
        and s.instance_number = os.instance_number
        and s.snap_id = os.snap_id
        and os.stat_name = 'LOAD'
      order by 2
    )
    match_recognize (
      partition by stat_name
      order by snap_time
      measures match_number() as match_num,
               STRT.snap_time AS start_tstamp,
               FINAL LAST(UP.snap_time) AS end_tstamp
      ALL ROWS PER MATCH
      AFTER MATCH SKIP TO LAST UP
      PATTERN (STRT DOWN+ UP+ DOWN+ UP+)
      DEFINE DOWN AS DOWN.value < PREV(DOWN.value),
             UP AS UP.value > PREV(UP.value)
    ) MR
    ORDER BY MR.stat_name, MR.match_num, MR.snap_time;

    stat_name  snap_time    match_num  start_tstamp  end_tstamp
    LOAD       10-28-13 10  1          10-28-13 10   10-28-13 15
    LOAD       10-28-13 11  1          10-28-13 10   10-28-13 15
    LOAD       10-28-13 12  1          10-28-13 10   10-28-13 15
    LOAD       10-28-13 13  1          10-28-13 10   10-28-13 15
    LOAD       10-28-13 14  1          10-28-13 10   10-28-13 15
    LOAD       10-28-13 15  1          10-28-13 10   10-28-13 15

    value column, as it appears in the transcript: 7 0,81 0,35 0,85 0,22 0,45 8
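The logical reads query earlier in this appendix derives per-second rates from counters that accumulate since instance startup. The Python sketch below mirrors that logic under stated assumptions: the function name per_second_rates and the dictionary keys are hypothetical, times are plain seconds rather than timestamps, snapshots are assumed to arrive in snapshot order with a positive interval length, and the sample values are made up.

```python
def per_second_rates(snapshots):
    """Convert cumulative counter values into per-second rates.

    Each snapshot is a dict with 'begin', 'end' and 'startup' (in seconds)
    and 'value' (the cumulative counter). Mirrors the CASE/LAG logic of the
    query: the first snapshot after a startup uses the raw value, later ones
    use the difference to the previous snapshot of the same startup.
    """
    rates = []
    previous_by_startup = {}
    for snap in snapshots:
        startup = snap["startup"]
        elapsed = snap["end"] - snap["begin"]
        if snap["begin"] == startup:
            delta = snap["value"]
        elif startup in previous_by_startup:
            delta = snap["value"] - previous_by_startup[startup]
        else:
            delta = None  # no earlier snapshot for this startup, like a NULL from LAG
        previous_by_startup[startup] = snap["value"]
        # As with GREATEST(v/ela, 0), negative rates are clamped to zero.
        rates.append(None if delta is None else max(delta / elapsed, 0))
    return rates


# Made-up snapshots: two hourly intervals, then an instance restart.
snapshots = [
    {"startup": 0, "begin": 0, "end": 3600, "value": 7200},
    {"startup": 0, "begin": 3600, "end": 7200, "value": 10800},
    {"startup": 7200, "begin": 7200, "end": 10800, "value": 1800},
]

if __name__ == "__main__":
    print(per_second_rates(snapshots))
```

Tracking the previous value per startup time plays the role of the `partition by ... s.startup_time` in the LAG call, so a restart never produces a difference across two different instance lifetimes.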