This is the presentation I gave at Velocity Santa Clara on June 25, 2014:
You’ve instrumented your system and application to the hilt. You can now “measure all the things”. Your team has set up thousands of metrics collecting millions of data points a day. Now what?
Analyzing this mountain of data and extracting signal from the noise is not easy, so most IT ops teams only keep an eye on a small fraction of the metrics they collect, or they run some simplistic analytics that don’t generate any useful information. The available analytic methods range from simple statistical analysis to sophisticated machine learning techniques, and no single algorithm fits all data.
While the more advanced algorithms require math nerds with PhDs to develop, there are some basic statistical methods that anyone can implement, and they can provide surprisingly valuable insights. In this talk, Toufic will show you how to determine the distribution of your data and explain why this is important. We will then explore a few statistical techniques that are appropriate for the most common distributions and discuss their pros and cons. Yes, there will be math in this talk, but you will be able to follow along. The goal is for you to walk away with at least one technique that you can apply right away in your monitoring environment to improve the signal-to-noise ratio and get information out of your data.
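To give a feel for what "determining the distribution of your data" can look like in practice, here is a minimal sketch that is not taken from the talk itself. It computes sample skewness and excess kurtosis with the Python standard library and uses them as a rough check of whether a metric looks bell-shaped. The function names, the tolerances, and the synthetic data are all assumptions made for illustration.

```python
import random
import statistics


def shape_summary(values):
    """Return (skewness, excess kurtosis) computed from population moments."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("a constant series has no meaningful shape")
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    excess_kurt = sum(((v - mean) / std) ** 4 for v in values) / n - 3.0
    return skew, excess_kurt


def looks_roughly_normal(values, skew_tol=0.5, kurt_tol=1.0):
    """Heuristic only: the tolerances are illustrative assumptions, not standards."""
    skew, excess_kurt = shape_summary(values)
    return abs(skew) <= skew_tol and abs(excess_kurt) <= kurt_tol


# Synthetic stand-ins for real metrics (hypothetical data, seeded for repeatability).
rng = random.Random(42)
gaussian_like = [rng.gauss(100.0, 10.0) for _ in range(5000)]
long_tailed = [rng.expovariate(1 / 100.0) for _ in range(5000)]

for name, series in (("gaussian_like", gaussian_like), ("long_tailed", long_tailed)):
    skew, excess_kurt = shape_summary(series)
    print(f"{name}: skew={skew:.2f} excess_kurtosis={excess_kurt:.2f} "
          f"roughly_normal={looks_roughly_normal(series)}")
```

A check like this matters because techniques that assume a bell-shaped distribution, such as thresholds based on the mean and standard deviation, can mislead on strongly skewed metrics. A histogram or a formal normality test would be a reasonable next step before choosing a method.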