Analyzing Your Logs: What are they telling you?

Use systems thinking and statistical analysis to learn more about your proprietary applications. Analyze their behavior based on the logs they generate. Identify patterns and trends to help prevent system downtime.

Published in: Business, Technology

  1. Analyzing Your Logs: What are they telling you? Gerard Ibarra, PhD. November 2008
  2. • Goals
     • Systems Thinking
     • Definition of System: This Presentation
     • Log Analysis
     • Analysis Summary
     Copyright © 2008 Buildwave Technologies, Inc. All rights reserved.
  3. • Think systems first
     • Use statistics to understand what is going on
     • Get a better picture with charts
     • Include control charts to monitor the system
  4. “A system is an assemblage or combination of elements or parts forming a complex or unitary whole;…” (Blanchard, B. S., and Fabrycky, W. J., Systems Engineering and Analysis (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall, 1990)
  5. Systems could be any of the following:
       • A transportation network moving items from one place to another – dynamic
       • A bridge used to connect places together – static
       • A set of unmanned aerial vehicles (UAVs) located in a strategic region providing intelligence – dynamic
       • A group of applications and servers acting together to perform a service – dynamic
       • A motor for a car – static/dynamic
  6. Systems today are more complex than before (Using Systems Engineering to Improve RMS&L Requirements, A Government-Industry Training Workshop, various discussions, Springfield, VA: November 12-13, 2008)
  7. Changes in one part of the system affect the system as a whole
       • More items to move – extra resources to process
       • Increased traffic – longer times to cross the bridge
       • Reduction in UAVs – changes strategies if the mission remains the same
       • Server down – increases load; possible sales loss
       • New and improved parts – increased inventory to maintain both motors
  8. Why think systems for your network?
       • Because changes made to its parts affect its overall mission and ultimately the business as a whole. For example, the items below affect how the system operates, which in turn affects how the company can conduct its business.
           • Adding or removing applications
           • Modifying software/hardware configuration
           • Adding or removing hardware from operations
           • Improving, adding, or deleting features
  9. In this presentation, the system is the aggregation of applications, servers, and services working in unison to produce a common function for the use, goals, sustainment, and operations of the company
  10. Various ways to analyze logs: Examples
       • Statistical
           • Central Tendency
           • Variation
           • Skewness
           • Kurtosis
  11. Examples Continued:
       • Graphical
           • Bar Chart
           • Line Chart
           • Pie Chart
           • Control Charts
  12. Statistical – Central Tendency
       • Determine how much central tendency there is in the log data
           • Know and understand the average number of events occurring in a system – used for a quick check of how the system is currently operating
           • Compare the average number of events occurring over time – see if there are any patterns
           • Look at the startup of a process – determine whether the number of errors differs as time progresses
  13. Statistical – Central Tendency Example
       • Use the following analytics to generate the report
           • Mean
           • Median
           • Mode
           • Quartiles
      First Hour (based on 1-min aggregations over 1-hour periods)
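
The 1-minute aggregation that the report is based on can be sketched as follows. This is a minimal illustration, not the tool's actual method: the log line layout (`<ISO timestamp> <LEVEL> <message>`), the sample lines, and the function name `errors_per_minute` are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log lines in the assumed layout: "<ISO timestamp> <LEVEL> <message>".
SAMPLE_LOG = """\
2008-11-03T09:00:05 ERROR disk quota exceeded
2008-11-03T09:00:41 ERROR disk quota exceeded
2008-11-03T09:01:12 INFO job started
2008-11-03T09:02:30 ERROR connection reset
"""

def errors_per_minute(log_text, level="ERROR"):
    """Count events of the given level in each one-minute bucket."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split(maxsplit=2)
        if len(parts) < 2 or parts[1] != level:
            continue  # skip blank lines and other levels
        stamp = datetime.fromisoformat(parts[0])
        # Truncate the timestamp to the start of its minute.
        counts[stamp.replace(second=0, microsecond=0)] += 1
    return counts

per_minute = errors_per_minute(SAMPLE_LOG)
for minute, count in sorted(per_minute.items()):
    print(minute.strftime("%H:%M"), count)
```

Minutes with no matching events do not appear in the result; before computing statistics over a full hour, those minutes would need to be filled in with zero counts.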
  14. • The mean is 2.3333 – this is the average number of times per minute, over the one-hour period, that the error occurred; anything more than this should raise a flag when comparing the same events during the same hour on other days
      Example
                 Day1   Day2   Day3   Day4   Day5
         Mean    2.33   2.35   2.21   7.45   2.41
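
One way to automate the day-to-day comparison is sketched below, using the Day1 to Day5 means from the example. The rule used here (flag any day whose mean exceeds 1.5 times the median of all daily means) is an arbitrary threshold chosen for illustration, and `flag_days` is a hypothetical helper; each site would choose its own rule.

```python
from statistics import median

# Daily means for the same hour, copied from the Day1-Day5 example.
daily_means = {"Day1": 2.33, "Day2": 2.35, "Day3": 2.21, "Day4": 7.45, "Day5": 2.41}

def flag_days(means, factor=1.5):
    """Return the days whose mean exceeds factor x the median of all daily means.

    The factor of 1.5 is an illustrative assumption, not a standard value.
    """
    baseline = median(means.values())
    return [day for day, value in means.items() if value > factor * baseline]

print(flag_days(daily_means))  # ['Day4']
```

The median of the daily means is used as the baseline so that a single unusual day does not pull the baseline toward itself.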
  15. • The median is 2 – this is the midpoint of the per-minute event counts for the hour; it should be fairly close to the mean unless the data is skewed
      Example
         Data (1, 1, 1, 1, 1, 1, 2, 20, 21, 22, 23, 24, 25)
         Mean = 11; Median = 2
         The mean is over five times the median – should raise a flag; notice that the data is split between the ones and the twenties
  16. • The mode is 1 – this is the most frequently occurring number of events, based on one-minute aggregations over the one hour; it shows where most of the data comes from and should make some sense with respect to the mean, the median, or both
      Example
         Data (1, 1, 1, 1, 1, 1, 5, 17, 19, 21, 23, 25, 27)
         Mode = 1; Mean = 11; Median = 5
         There is a wide variation between the three indices – should raise a flag
  17. • The lower and upper quartiles are 1 and 3.5 – these are the medians of the lower half and the upper half of the data, based on the Moore and McCabe or “M-and-M” method (there are various ways to calculate the quartiles)
      Example
         Data (0, 0, 0, 0, 0, 0, 5, 17, 19, 21, 23, 25, 27)
         LQ = 0; UQ = 22; Mean ≈ 10.5
         The mean is far from the LQ in relative terms – should raise a flag; this could show that at the start of the period the number of errors was nil and that the errors increased as time went on
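
The four central-tendency measures can be reproduced with Python's standard `statistics` module. The sketch below is one reading of the Moore and McCabe quartile rule (exclude the median from both halves when the number of values is odd); the function names `mm_quartiles` and `summarize` are made up for this example, and the two data sets are the ones used in the median and quartile examples.

```python
import statistics

def mm_quartiles(values):
    """Lower and upper quartiles by the Moore and McCabe ("M-and-M") rule.

    The data is split at the median; when the count is odd, the median
    itself is left out of both halves. Needs at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return statistics.median(lower), statistics.median(upper)

def summarize(values):
    """Collect mean, median, mode, and quartiles in one dictionary."""
    lq, uq = mm_quartiles(values)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "lq": lq,
        "uq": uq,
    }

skewed = [1, 1, 1, 1, 1, 1, 2, 20, 21, 22, 23, 24, 25]
late_errors = [0, 0, 0, 0, 0, 0, 5, 17, 19, 21, 23, 25, 27]

print(summarize(skewed))       # mean 11, median 2, mode 1
print(summarize(late_errors))  # lq 0, uq 22, mean about 10.54
```

Other quartile conventions (for example, interpolating between values) give slightly different numbers on small samples, so the same rule should be used whenever days are compared.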
  18. Statistical – Variation
       • Determine how much the log data varies from the mean
           • The closer the data is to the mean, the less the system varies
           • The less variation there is, the more smoothly the system typically operates
  19. Statistical – Variation Example
       • Use the following analytics to generate the report
           • Mean
           • Variation
      First Hour (based on 1-min aggregations over 1-hour periods)
  20. • The mean is 2.3333 and the standard deviation is 1.91195 – the standard deviation is the amount by which the data varies from the mean; it is the spread around the mean expressed in the original units
      Example
         Mean = 45; StdDev = 41
         The standard deviation is almost as large as the mean – this should raise a flag (note that the company could define this type of behavior as normal)
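
The "standard deviation almost as large as the mean" check can be expressed as a ratio of the two (often called the coefficient of variation). The sketch below is an illustration only: the cut-off of 0.9 is an assumed value, chosen so that the Mean = 45, StdDev = 41 example (ratio of about 0.91) would be flagged, and both sample data sets are made up.

```python
import statistics

def spread_report(values, cv_threshold=0.9):
    """Return the mean, sample standard deviation, their ratio, and a flag.

    cv_threshold is a hypothetical cut-off; a site may well define a
    high ratio as normal behavior and pick a different value.
    """
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1)
    cv = stdev / mean if mean else None  # ratio is undefined for a zero mean
    flag = cv is not None and cv >= cv_threshold
    return {"mean": mean, "stdev": stdev, "cv": cv, "flag": flag}

bursty = [1, 1, 1, 1, 1, 1, 2, 20, 21, 22, 23, 24, 25]  # made-up per-minute counts
steady = [2, 3, 2, 2, 3, 2, 2, 3]                        # made-up per-minute counts

print(spread_report(bursty)["flag"])  # True
print(spread_report(steady)["flag"])  # False
```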
  21. Statistical – Skewness and Kurtosis
       • Try to find out the type of distribution the system generates
           • Learn whether the data is normal – good for predictions
           • See how the system operates – determine whether there are modes during certain periods
  22. Statistical – Skewness and Kurtosis Example
       • Use the following analytics to generate the report
           • Statistics
      (Based on 1-hour aggregations over the range of the data)
  23. • The skewness is -2.34592 – this is a measure of the symmetry of the distribution (negative means that it skews to the left, positive to the right)
      • The kurtosis is 8.49086 – this is a measure of how peaked the distribution is (the larger the number, the more “peaked”)
      Example of a possible distribution: most of the events take place at the start of the process and peak in a short interval
  24. • A skewness of 0.0 and a kurtosis of 3.0 mean that this is an ideal normal distribution – great for predicting possible outcomes
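
Skewness and kurtosis can be computed from the central moments of the data. The sketch below uses the plain population-moment definitions, under which a normal distribution has a kurtosis of 3.0; the tool behind the reported figures may apply a small-sample correction, so its numbers could differ slightly on the same data. The function name and both data sets are assumptions for illustration.

```python
import random

def skewness_kurtosis(values):
    """Population skewness and (non-excess) kurtosis from central moments.

    skewness = m3 / m2**1.5 and kurtosis = m4 / m2**2, where mk is the
    k-th central moment. The data must not be constant (m2 would be 0).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

random.seed(1)  # fixed seed so the run is repeatable
normal_like = [random.gauss(0, 1) for _ in range(20000)]  # simulated, not log data
left_skewed = [1, 9, 9, 10, 10, 10, 10]                   # made-up counts

print(skewness_kurtosis(normal_like))  # skewness near 0, kurtosis near 3
print(skewness_kurtosis(left_skewed))  # negative skewness (long left tail)
```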
  25. Graphical – Bar Charts
       • View the errors based on different periods
       • Understand the behavior of the systems better
      Chart notes: most errors on Day 2; fewest errors at 6:00 am and 5:00 pm; two instances of almost zero errors on Day 5
  26. Graphical – Line Charts
       • Get a clearer perspective on the error rates
       • View the same data, but from a different perspective
      Chart notes: most errors on Day 2; fewest errors at 6:00 am and 5:00 pm; two instances of almost zero errors on Day 5
  27. Graphical – Line Charts
       • Use them to forecast
      Chart notes: follows the same trend in both periods (Aug01 – Sep01 and Aug02 – Sep02); shows an upward trend
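
A simple way to turn an upward trend into a forecast is to fit a straight line to the counts and extend it. The slide does not say how its trend was derived, so the least-squares fit below is an assumed technique, and the weekly counts are made up for illustration.

```python
def linear_trend(counts):
    """Least-squares slope and intercept for counts at periods 0, 1, 2, ...

    Needs at least two data points.
    """
    n = len(counts)
    x_mean = (n - 1) / 2
    y_mean = sum(counts) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(counts))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def forecast(counts, steps_ahead=1):
    """Extend the fitted line steps_ahead periods past the last observation."""
    slope, intercept = linear_trend(counts)
    return intercept + slope * (len(counts) - 1 + steps_ahead)

weekly_errors = [10, 12, 14, 16, 18]  # made-up illustrative counts

slope, intercept = linear_trend(weekly_errors)
print(slope)                    # 2.0, an upward trend of two errors per period
print(forecast(weekly_errors))  # 20.0
```

A straight line ignores seasonal patterns such as the repeating Aug – Sep shape, so it is best read as a rough direction rather than a prediction.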
  28. Graphical – Pie Charts
       • Compare to other events
       • Compare to the system as a whole
      Chart notes: errors account for less than 2% of the events in the system; significant number of errors relative to the number of warnings
  29. Graphical – Control Charts
       • Monitor the system or individual subsystems
       • Anticipate possible problems
      Chart notes: out of compliance; trending upwards: try to keep it from going above the upper control limit (UCL) again
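
The control limits on such a chart can be computed from a baseline period. The slide does not say which kind of control chart is drawn; the sketch below assumes a c-chart, a common choice for event counts, where the center line is the average count and the limits sit three square roots of that average above and below it. The baseline and new counts are made up.

```python
import math

def c_chart_limits(baseline_counts):
    """Return (LCL, center line, UCL) for a c-chart built from baseline counts."""
    c_bar = sum(baseline_counts) / len(baseline_counts)
    margin = 3 * math.sqrt(c_bar)
    # Counts cannot be negative, so the lower limit is floored at zero.
    return max(0.0, c_bar - margin), c_bar, c_bar + margin

def out_of_control(counts, ucl):
    """Indexes of the observations that fall above the upper control limit."""
    return [i for i, count in enumerate(counts) if count > ucl]

baseline = [2, 3, 1, 2, 4, 2, 3, 2, 1, 4]  # made-up errors per period, stable system
new_counts = [2, 3, 12, 4, 5]              # made-up later observations

lcl, center, ucl = c_chart_limits(baseline)
print(round(center, 2), round(ucl, 2))  # 2.4 7.05
print(out_of_control(new_counts, ucl))  # [2]
```

Only the third new observation (12) is above the UCL here; a run of points climbing toward the limit, like the upward trend noted on the chart, is a separate warning sign that this sketch does not test for.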
  30. Use analytics and charting to help view and understand what the system and its subsystems may be doing
       • Look for
           • Abnormalities
           • Deviations
           • Compliances
       • Learn how to
           • Predict
           • Anticipate
           • Forecast
  31. Most of the chart and result screenshots shown in this presentation were created in Violog. http://www.buildwave.com/violog
