"Cloud environments and Open Source software have lowered the bar for anyone to implement software solutions.
Complex relationships between system components are frequently missed by the human eye, and small but important changes are neglected. This, along with the sheer amount of monitoring data, calls for a new approach.
"
18. • Supervised Learning:
• Uses data with clearly-defined output (“labeled data”)
• Machine learns explicitly through right and wrong answers
• Two main types:
• Regression – Predict continuous values based on sets of (correlated) data
• Classification – Predict the class of an item based on its properties
Types of Machine Learning - Supervised
20. • Regression 2 – Given the number of cups of coffee sold per 10 minutes
• Predict how many cups are sold at any given time of day
• Linear regression:
• Polynomial regression:
Types of Machine Learning – Regression (2)
[Plot: cups of coffee sold vs. time of day (hours)]
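The difference between the linear and the polynomial fit can be sketched in a few lines of plain Python. This is a minimal illustration, not the presenters' code: the `hours` and `cups` values are made-up sample data, and `fit_polynomial`, `predict`, and `sum_squared_error` are hypothetical helper names. The fit solves the least-squares normal equations by Gaussian elimination.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y, where A[i][j] = xs[i] ** j
    ata = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    aty = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Forward elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, n):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= factor * ata[col][c]
            aty[r] -= factor * aty[col]
    # Back substitution
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(ata[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (aty[r] - known) / ata[r][r]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def sum_squared_error(coeffs, xs, ys):
    """Total squared error of the fit on the given points."""
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys))


# Made-up sample data (assumption): cups sold in a 10-minute window at each hour
hours = [6, 8, 10, 12, 14, 16, 18, 20]
cups = [5, 30, 22, 18, 25, 15, 8, 3]

linear = fit_polynomial(hours, cups, 1)     # straight line
quadratic = fit_polynomial(hours, cups, 2)  # degree-2 polynomial

print("linear error:   ", sum_squared_error(linear, hours, cups))
print("quadratic error:", sum_squared_error(quadratic, hours, cups))
print("predicted cups at 13:00:", predict(quadratic, 13))
```

On the points it was fitted to, a higher-degree polynomial never has a larger squared error than the line, but a lower training error does not by itself mean better predictions for unseen times of day.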
27. Types of Machine Learning – Unsupervised (1)
• Problem:
• Supervised learning is good, but requires labeled data
• Most data in the world is not labeled; there is no right/wrong answer
• Labeling requires human effort → tedious and expensive
• Unsupervised Learning:
• The machine automatically recognizes relationships in the data
• No right or wrong answers are given
• Often used to enhance Supervised Learning
28. • Some approaches include:
• Clustering algorithms: k-means, hierarchical clustering, etc.
• Anomaly detection of rare events
• Deep learning (for pretty much everything…)
• Deep Learning approach:
• Learn from a lot of non-labeled data
• Learn highly non-linear correlations (represent complex relationships)
• Surprisingly good results for many applications!
Types of Machine Learning – Unsupervised (2)
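As an illustration of clustering without labels, here is a minimal one-dimensional k-means sketch. The function name `k_means_1d`, the deterministic initialization, and the latency values are assumptions made for this example; they do not come from the slides.

```python
def k_means_1d(values, k, iterations=20):
    """Plain k-means on a list of numbers; returns (centers, clusters)."""
    ordered = sorted(values)
    # Deterministic start (assumption): spread the initial centers over the sorted values
    centers = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: each center moves to the mean of its cluster
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    return centers, clusters


# Made-up request latencies in milliseconds (assumption), with no labels attached
latencies = [12, 15, 14, 13, 210, 205, 198, 16, 202]
centers, clusters = k_means_1d(latencies, k=2)
print(centers)
print(clusters)
```

No "right answer" is supplied anywhere: the two groups (fast and slow requests) emerge from the distances between the values alone.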
34. Applying to Log Records (1)
• Problems:
• Log data is very redundant
• Hard to find the important events
• Rare logs are a needle in the haystack
• Also:
• Actions in the system are represented by a series of log records
• But other logs interrupt the visual flow
• Tracing the logs of a complete action is hard
43. Log analysis pipeline - clustering
• Result: M raw log records → N log prototypes (N << M)
• M is in the billions; N is in the thousands
“Creating tag on Stream: -1 Position: 42”
“Creating tag on Stream: 2 Position: 65”
“Creating tag on Stream: {var1} Position: {var2}”
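The reduction from raw records to prototypes can be approximated with a much simpler technique than clustering: masking every numeric token with a numbered placeholder. The sketch below does only that; it is a stand-in for the clustering step, not the pipeline's actual method. `to_prototype` and `count_prototypes` are hypothetical names, and the third record is invented for the example.

```python
import re
from collections import Counter

# A token counts as a variable if it is an integer or decimal, optionally negative
NUMBER = re.compile(r"^-?\d+(\.\d+)?$")


def to_prototype(record):
    """Replace each numeric token with {var1}, {var2}, ... in order of appearance."""
    tokens = []
    var_index = 0
    for token in record.split():
        if NUMBER.match(token):
            var_index += 1
            tokens.append("{var%d}" % var_index)
        else:
            tokens.append(token)
    return " ".join(tokens)


def count_prototypes(records):
    """Map each prototype to the number of raw records it covers."""
    return Counter(to_prototype(r) for r in records)


records = [
    "Creating tag on Stream: -1 Position: 42",
    "Creating tag on Stream: 2 Position: 65",
    "Connection closed by peer",  # invented record (assumption)
]
for prototype, count in count_prototypes(records).items():
    print(count, prototype)
```

Real log variables are not always numeric (user names, paths, IDs), which is one reason a clustering approach is more robust than this fixed rule.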
49. Log analysis pipeline – sequence finding
• Repeat the process:
• For each k-sequence try to construct a longer (k+1)-sequence
• Stop when the G-test fails or when the normalized score decreases:
[Equation: normalized score of the (k+1)-sequence < normalized score of the k-sequence]
• Save the k-sequence as valid (an action in the system)
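The slides do not spell out how the G-test is set up, so the following is only one plausible sketch: a G-test of independence on a 2x2 table that asks whether log type `second` follows log type `first` more often than chance would suggest. `g_test_2x2` and `follows_table` are hypothetical helpers, the stream of log types is invented, and the normalized score is not modeled here.

```python
import math


def g_test_2x2(table):
    """G-test of independence on a 2x2 table of counts; returns (G, p_value).

    The p-value uses the chi-square distribution with 1 degree of freedom,
    whose upper tail equals erfc(sqrt(G / 2)).
    """
    total = sum(sum(row) for row in table)
    if total == 0:
        return 0.0, 1.0
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    g = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            expected = row_sums[i] * col_sums[j] / total
            if observed > 0:  # empty cells contribute nothing
                g += 2.0 * observed * math.log(observed / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding errors
    return g, math.erfc(math.sqrt(g / 2.0))


def follows_table(log_types, first, second):
    """Count adjacent pairs: rows = current is `first` or not, columns = next is `second` or not."""
    table = [[0, 0], [0, 0]]
    for current, following in zip(log_types, log_types[1:]):
        row = 0 if current == first else 1
        col = 0 if following == second else 1
        table[row][col] += 1
    return table


# Invented stream of log types (assumption): one action repeated, plus a heartbeat
stream = ["open", "read", "close", "heartbeat"] * 50
g, p = g_test_2x2(follows_table(stream, "open", "read"))
print("G =", g, "p =", p)
if p < 0.001:
    print("'open' -> 'read' passes the test; try extending it to a 3-sequence")
```

A pair that passes becomes a candidate 2-sequence, and the same test can then be applied to the pair followed by a third log type, which is the repeat-and-extend loop described above.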
51. Log analysis pipeline
• Alert on a sequence anomaly when the observed ratio is distant enough from that of the valid sequence, e.g. p < 0.001
• Software is constantly changing – update all models all the time
• Of course, there is much more than we explored here!
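One way to turn the alert rule into code is a G-test goodness of fit that compares how often a sequence completes in the current window against its learned baseline ratio. This is a sketch under assumptions: the slides do not say which ratio or which test is used, `sequence_anomaly_p_value` and `should_alert` are hypothetical names, and the counts are invented. The baseline ratio must lie strictly between 0 and 1.

```python
import math


def sequence_anomaly_p_value(completed, started, baseline_ratio):
    """p-value of a G-test goodness of fit (1 degree of freedom).

    completed / started is the observed completion ratio in the current window;
    baseline_ratio is the ratio learned for the valid sequence (0 < ratio < 1).
    """
    observed = [completed, started - completed]
    expected = [started * baseline_ratio, started * (1.0 - baseline_ratio)]
    g = 0.0
    for o, e in zip(observed, expected):
        if o > 0:  # empty cells contribute nothing
            g += 2.0 * o * math.log(o / e)
    g = max(g, 0.0)  # guard against tiny negative rounding errors
    return math.erfc(math.sqrt(g / 2.0))


def should_alert(completed, started, baseline_ratio, threshold=0.001):
    """Alert when the observed ratio is too unlikely under the baseline."""
    return sequence_anomaly_p_value(completed, started, baseline_ratio) < threshold


# Invented numbers (assumption): the sequence normally completes 98% of the time
print(should_alert(975, 1000, 0.98))  # small deviation
print(should_alert(900, 1000, 0.98))  # large deviation
```

Because the software keeps changing, the baseline ratio itself would need to be re-estimated continuously; a fixed value like the one above goes stale after the next release.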
70% of developers use logs for production monitoring and production troubleshooting
Identify what “log types” are
Explain why rare logs are important
Frequent logs: core business, a lot of noise, etc.
Rare logs
Show text only after some explanation
Insert business example visually
And keep coming back to it along the explanation
Take numbers off
Sequence log types which relate to the same action
The more population a country has, the more “candidates” it has to fill up its 11 soccer player slots.
The more money a country has (GDP per capita), the more resources it has to train its players.
Take one graph and slice it manually to clarify how the algorithm works.
Ditch all other plots
Place in the next slide