How often do you ask yourself this question? In this talk, I’ll use it as a guide and walk you through a few interesting problems we work on at Datadog around anomaly detection in time series. We’ll see how this questioning can improve our understanding of a variety of topics: when to use machine learning, how to select the best algorithm for a problem, when to publish a paper, and how to build useful products.
Meetup talk from https://www.meetup.com/fr-FR/pyladiesparis/events/297190950/
1. Why am I doing this???
Anne-Marie Tousch
Senior Data Scientist, Datadog
PyLadies Meetup
November 16th, 2023
2. Why am I doing this?
❏ To share my pain
❏ To show off my knowledge
❏ To explain away why I'm failing so much
❏ Why did I sign up for this talk?
❏ To make you ask the same question
3. Why am I doing this???
Or why data science is harder than you think
4. Quick bio
● 2006–2010: computer vision (PhD)
● 2010–2014: computer vision (startup)
● 2014–2020: ML (RecSys, …)
● 2020–?: AIOps
(Original slide axis: more machine learning ↔ more software engineering)
5. ● We run on millions of hosts
● We collect tens of trillions of events per day
Visit datadoghq.com for more information
11. Ghosh, Supriyo, et al. "How to fight production incidents? An empirical study on a large-scale cloud service." Proceedings of the 13th Symposium on Cloud Computing. 2022.
12. "How incidents are detected? … we observe that about 55% of the incidents were detected by the automated watchdogs." (Ghosh et al., 2022)
13. (Ghosh et al., 2022)
14. (Ghosh et al., 2022)
15. The challenge of anomaly detection
Is this an anomaly?
Should I page someone?
16. Anomaly detection for cloud systems
● Account for the severity of the anomaly
● Low time to detection
● Low false detection rates
● Explainability matters
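These requirements pull against each other: a higher alerting threshold lowers the false detection rate but raises time to detection. A minimal sketch of that trade-off using a rolling z-score (purely illustrative, not Datadog's production algorithm; all names and values are made up):

```python
import statistics

def anomaly_score(history, value):
    """Severity of `value` relative to recent history, as a z-score.
    Illustrative only -- not Datadog's production algorithm."""
    mean = statistics.mean(history)
    std = statistics.pstdev(history) or 1.0  # avoid division by zero on a flat series
    return abs(value - mean) / std

def is_anomaly(history, value, threshold=3.0):
    # Higher threshold -> fewer false detections, but slower to page someone.
    return anomaly_score(history, value) >= threshold

history = [100, 102, 98, 101, 99, 100, 103, 97]
print(is_anomaly(history, 101))  # False: within normal variation
print(is_anomaly(history, 160))  # True: far outside recent behavior
```

The score itself (not just the boolean) is what lets you account for severity and keep the decision explainable ("this point is 30 standard deviations out").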
17. Understand the context of the product
Why am I building this algorithm?
(Example metrics: hits/second, errors/hit)
19. The challenge of Time Series
Hewamalage, Hansika, Klaus Ackermann, and Christoph Bergmeir. "Forecast evaluation for data scientists: common pitfalls and best practices." Data Mining and Knowledge Discovery 37.2 (2023): 788-832.
20. The challenge of Time Series
"we regularly come across papers in top Artificial Intelligence (AI)/ML conferences and journals (even winning best paper awards) that use inadequate and misleading benchmark methods for comparison" (Hewamalage et al., 2023)
21. The challenge of Time Series
MAE: mean absolute error
MSE: mean squared error
(Hewamalage et al., 2023)
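A quick illustration of why the choice between these two metrics matters for forecast evaluation; the actuals and the naive (previous-value) forecast below are made-up numbers:

```python
def mae(actual, forecast):
    """Mean absolute error: every miss counts linearly."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mse(actual, forecast):
    """Mean squared error: large misses dominate."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

actual = [10.0, 12.0, 11.0, 30.0]   # one large spike at the end
naive = [10.0, 10.0, 12.0, 11.0]    # naive forecast: repeat the previous value

print(mae(actual, naive))  # 5.5
print(mse(actual, naive))  # 91.5
```

MSE is dominated by the single spike, so two models can be ranked differently depending on which metric you pick, which is part of the pitfall the paper describes.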
22. The challenge of Time Series
(Hewamalage et al., 2023)
23. Schmidl, Sebastian, Phillip Wenig, and Thorsten Papenbrock. "Anomaly detection in time series: a comprehensive evaluation." Proceedings of the VLDB Endowment 15.9 (2022): 1779-1797.
24. "This comprehensive, scientific study carefully evaluates most state-of-the-art anomaly detection algorithms. We collected and re-implemented 71 anomaly detection algorithms from different domains and evaluated them on 976 time series datasets." (Schmidl et al., 2022)
25. "Our experimental results on the different datasets show that, overall, every anomaly detection family can be effective and there is no clear winner." (Schmidl et al., 2022)
26.
27. Choosing the right algorithm for the context
● What do your time series look like?
○ Domain knowledge
● Are you evaluating correctly?
○ Do you have relevant benchmarks?
○ Do you have a strong "simple" baseline?
○ Do you have relevant evaluation metrics?
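One way to check the last two boxes is to score a trivial baseline with task-relevant metrics before trying anything fancier. A sketch on made-up labeled data (the detector, series, and labels are all illustrative):

```python
def threshold_detector(series, limit):
    """Trivial baseline: flag any point above a fixed limit."""
    return [x > limit for x in series]

def precision_recall(predicted, labels):
    """Precision/recall are usually more relevant than accuracy
    when anomalies are rare."""
    tp = sum(p and l for p, l in zip(predicted, labels))
    fp = sum(p and not l for p, l in zip(predicted, labels))
    fn = sum(l and not p for p, l in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

series = [5, 6, 5, 40, 6, 5, 35, 5]
labels = [False, False, False, True, False, False, True, False]
print(precision_recall(threshold_detector(series, 20), labels))  # (1.0, 1.0)
```

If a sophisticated model cannot beat this kind of baseline on a benchmark that resembles your own time series, it is not the better algorithm for your context.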
28. The challenge of anomaly detection
Is this an anomaly?
Is this unlike other events in the same context?
34. Different kinds of contracts
Strong contracts: function definition is clear
- Rules / mathematics
- Unit tests
- Explainable
Weak contracts: function definition is data-dependent
- Examples
- Statistical accuracy
- Uncertain outcome
35. Different kinds of contracts
Strong: "An anomaly is whenever latency goes above a given threshold"
Weak: "An anomaly is an event unlike others in the same context"
(Ideas from "Two big challenges in machine learning", keynote by Leon Bottou, ICML 2015)
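The difference can be made concrete in code: a strong contract is a rule you can pin down with unit tests, while a weak one can only be checked statistically on examples. A minimal sketch (the threshold value is illustrative):

```python
LATENCY_THRESHOLD_MS = 500  # illustrative value, not a real SLO

def is_latency_anomaly(latency_ms):
    """Strong contract: the definition IS the rule, so it is
    fully explainable and fully testable."""
    return latency_ms > LATENCY_THRESHOLD_MS

# The contract is exhaustively checkable with unit tests:
assert is_latency_anomaly(800)
assert not is_latency_anomaly(200)

# A weak contract ("an event unlike others in the same context") admits
# no such test: its behavior is learned from data, and it can only be
# judged by statistical accuracy on held-out examples.
```

This is why the question "should I use machine learning?" is really a question about which kind of contract the product needs.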
36.
37. Should I use machine learning?
● Can you describe the problem with simple rules?
● Do you have data?
● Do you need 100% accuracy?
○ Can you realistically achieve 100% accuracy?
● Do you need 100% explainability?
○ E.g., regulations/law
39. Data science is harder than you think
● Understand the product
○ What kind of contract fits better?
● Evaluate rigorously
○ Why is this algorithm better than any other?
● Adapt to the context
○ Why am I doing this?