Database Performance Analysis with Time Series

Showing how to use R and Time Series Analysis techniques to analyse performance and plan capacity and SLAs.

Published in: Technology, Economy & Finance

1 Comment
  • I am using Enteros Performance Explorer-i database performance analysis tool - IMHO it is absolutely industry's best!

    It has moving averages, seasonality analysis, linear regression prediction, trend analysis, automated spike analysis, cross-database and cross-instance analysis, Oracle RAC support, ASH analysis and much more.

    Sorry for being too excited, but for me Performance Explorer-i was a treasure chest, and considering my complex, challenging and hugely active production database environment, it is a lifesaver.

  • Time series: data that is collected sequentially, usually at regular intervals. Time series are all around us: weather, stocks, CPU, disk space…
  • Recognize abnormal data and send alerts. Recognize changes and be proactive. Analyze long-term trends for planning. Set realistic SLAs.
  • One question we’ll keep asking ourselves: Which techniques are really useful?
  • All kinds of data issues can prevent analysis. You can, and sometimes should, fix the data so that analysis is possible. Replace missing data with average values (or maximum values where that makes sense). Remove outliers when it makes sense. Analyze the two sides of a discontinuity separately.
  • Linear trend. Easy to fit and use, but rarely makes sense in real life.
  • A moving average requires picking a window size and weights. Small window: matches the data better, but may include noise. Large window: shows more of a general trend, but introduces a delay.
  • Remove trend to allow analyzing other components.
  • 50 degrees Fahrenheit is cold for August but hot for January. How about 60% CPU? Is it always OK or always a problem?
  • Reminder: Correlation is a measure of the strength of the relation between two variables. How much do the variables change together?
  • How does the data in our series correlate with itself? We see strong correlation between data points 24 hours apart.
  • Average CPU for each hour. Similar to those average temperatures for each month charts you sometimes see in tour guides.
  • One chart to rule them all – data, trend, seasonality and all the rest.
  • “All the rest” is not completely random: there is still some auto-correlation. The data correlates with points at lags of one and two.
  • R used the auto-correlations to model the data
  • We test the model. We can see that the residuals no longer have auto-correlation, and the statistical test for the fit shows that the result is likely not random.
  • I added a couple of hours with high CPU here. Can you spot them?
  • After removing seasonality and average, we can clearly see the data point that is an outlier. It stands out.
  • Calculate the moving average of the future by adding the moving average of the last 20 points as an additional point. Then use the last 19 real points and the new one to calculate another point… Obviously this gets less accurate the more you do it. Adding seasonality is a matter of adding the hourly average to the appropriate new points.
  • Red: the model matched to existing data. Blue: predicted data. Green: 99% probability that we will not get data outside these lines.
  • A bit like moving average but with very specific weights.
  • Blue: predicted data. Green: 99% probability that we will not get data outside these lines.
  • The redo data is very noisy, but adding a moving average trend allows us to see a point where redo generation drops. This happened to be Dec 20, when many users left for vacation.
  • Correlation every 6 hours and stronger correlation every 24. These are the times we recalculate materialized views: a few views every 6 hours and a bunch every 24.
  • Removing the seasonality allows us to notice abnormal data. Worth investigating – what was running at that time? Is it likely to happen again?
  • Not exactly trend, but we do have changing levels of data.
  • There are periodic correlations, but they are not regular, so it is not seasonality. This graph does indicate extremely strong auto-correlation.
  • Partial autocorrelation graph. This is similar to autocorrelation, but when we calculate the auto-correlation for lag 2, we remove the correlation already explained by lag 1, and so on. Using this graph we can see auto-correlation up to lag 17. Once the CPU climbs, it may take over 3 hours until it is back to normal!
  • Checking that the AR(17) model fits.
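
The slide notes above describe a moving-average trend, an hourly seasonal profile, autocorrelation at a given lag, and spotting an incident once seasonality is removed. The deck does this in R; what follows is only a minimal sketch in plain Python, not the author's code. The data is synthetic, and the function names, the 24-point period, and the outlier threshold of 5 standard deviations are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, pstdev


def moving_average(series, window):
    """Trailing equal-weight moving average; the first window-1 slots are None."""
    out = [None] * len(series)
    for i in range(window - 1, len(series)):
        out[i] = mean(series[i - window + 1:i + 1])
    return out


def hourly_profile(series, period=24):
    """Average value for each position in the cycle (e.g. each hour of the day)."""
    return [mean(series[h::period]) for h in range(period)]


def autocorrelation(series, lag):
    """Sample autocorrelation between the series and itself shifted by `lag`."""
    m = mean(series)
    denom = sum((x - m) ** 2 for x in series)
    num = sum((series[i] - m) * (series[i - lag] - m)
              for i in range(lag, len(series)))
    return num / denom


def find_outliers(series, period=24, threshold=5.0):
    """Indexes whose value, after subtracting the hourly average, lies more
    than `threshold` standard deviations away from zero (threshold is an
    arbitrary choice for this sketch)."""
    profile = hourly_profile(series, period)
    residuals = [x - profile[i % period] for i, x in enumerate(series)]
    spread = pstdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * spread]


# Synthetic hourly "CPU" data (an assumption, not the data from the slides):
# a daily cycle between roughly 20% and 60% plus noise, over 14 days.
random.seed(1)
cpu = [40 + 20 * math.sin(2 * math.pi * (t % 24) / 24) + random.gauss(0, 2)
       for t in range(24 * 14)]

# Fake incident: +30 points at a quiet hour. The raw value stays inside the
# normal daily range, so it is hard to spot without removing seasonality.
cpu[90] += 30

trend = moving_average(cpu, window=24)
lag24 = autocorrelation(cpu, 24)
lag12 = autocorrelation(cpu, 12)
outliers = find_outliers(cpu)
print("outlier indexes:", outliers)
print("autocorrelation at lag 24:", round(lag24, 2), "at lag 12:", round(lag12, 2))
```

A 24-point window is used for the trend here so that each average covers exactly one daily cycle; a shorter window would follow the data more closely and carry the daily swing into the trend line.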
  • Transcript

    • 1. Analyzing Oracle Performance Using Time Series Models
      Chen (Gwen) Shapira
    • 2. Why?
      Abnormal Data
    • 3. See
      Use Cases
      Real Data
    • 4. Techniques
    • 5.
    • 6. Trend
    • 7. Trend
    • 8. Moving Average Trend
    • 9.
    • 10. Remove Trend
    • 11. Seasonality
    • 12.
    • 13.
    • 14. Seasonal Effect
    • 15. Components
    • 16. More AutoCorrelation
    • 17. $X_t = 0.33X_{t-1} + 0.07X_{t-2} - 0.09X_{t-3} + e_t$
    • 18. Test Model
    • 19. Use Cases
    • 20. Fake Incident
    • 21. Detect By
      Remove trend
      Remove Seasonality
      Mark “normal data”
      What’s left?
    • 22. Spot the Incident
    • 23. “I have seen the future and it is very much like the present, only longer”
    • 24. Exponential Smoothing
      Calculate moving average of future
      Add seasonality
    • 25.
    • 26. AutoCorrelation
      Use the model: $X_t = aX_{t-1}$… to calculate $X_{t+1}$, $X_{t+2}$…
    • 27.
    • 28. Real Data 1: Redo Blocks per Hour
    • 29. Holiday
    • 30. Seasonality
    • 31. Abnormal Data
    • 32. Real Data 2: CPU on DB Server
    • 33.
    • 34. Seasonality?
    • 35. Partial AutoCorrelation
    • 36. Check Fit of Model
    • 37. Prediction
    • 38. Conclusions
      Use moving average to describe trend
      Look for seasonality
      Predict with Exponential Smoothing
      Seasonality aware monitoring
    • 39. Questions?
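
The two prediction approaches in slides 24 and 26 can be sketched roughly as follows. This is plain Python rather than the R used in the deck, and it is one possible reading of the slide notes, not the author's implementation. The function names are hypothetical. The coefficients in the last example are the ones shown on slide 17; the input values fed to them are made up.

```python
from statistics import mean


def forecast_moving_average(history, steps, window=20):
    """Iterated moving-average forecast: each new point is the mean of the
    last `window` points, with earlier forecasts fed back in as data."""
    extended = list(history)
    for _ in range(steps):
        extended.append(mean(extended[-window:]))
    return extended[len(history):]


def seasonal_effects(history, period=24):
    """Average for each position in the cycle, relative to the overall mean."""
    overall = mean(history)
    return [mean(history[h::period]) - overall for h in range(period)]


def forecast_with_seasonality(history, steps, period=24, window=20):
    """Remove the seasonal effect, forecast the remaining level with the
    iterated moving average, then add the effect back for each future point."""
    effects = seasonal_effects(history, period)
    level = [x - effects[i % period] for i, x in enumerate(history)]
    flat = forecast_moving_average(level, steps, window)
    n = len(history)
    return [value + effects[(n + k) % period] for k, value in enumerate(flat)]


def forecast_ar(history, coefficients, steps):
    """Recursive autoregressive forecast X_t = a1*X_{t-1} + a2*X_{t-2} + ...
    with the noise term replaced by its expected value of zero."""
    extended = list(history)
    for _ in range(steps):
        recent = extended[::-1][:len(coefficients)]  # newest value first
        extended.append(sum(a * x for a, x in zip(coefficients, recent)))
    return extended[len(history):]


# Toy series with a 4-point cycle (made-up numbers, not the slide data).
toy = [10, 20, 30, 20] * 10
seasonal_forecast = forecast_with_seasonality(toy, steps=5, period=4)

# Slide 17 coefficients applied to three made-up residual values.
ar_forecast = forecast_ar([1.0, 2.0, 3.0], (0.33, 0.07, -0.09), steps=2)

print(seasonal_forecast)
print(ar_forecast)
```

As the slide notes warn, every forecast that is fed back in as input compounds its own error, so both recursions become less reliable the further ahead they run.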