Why am I doing this???
Anne-Marie Tousch
Senior Data Scientist, Datadog
PyLadies Meetup
November 16th, 2023
❏ To share my pain
❏ To show off my knowledge
❏ To explain away why I'm failing so much
❏ Why did I sign up for this talk?
❏ To make you ask the same question
Why am I doing this?
Why am I doing this???
Or why data science is harder than you think
Anne-Marie Tousch
Senior Data Scientist, Datadog
PyLadies Meetup
November 16th, 2023
Quick bio
● 2006-2010: computer vision (PhD)
● 2010-2014: computer vision (startup)
● 2014-2020: ML (RecSys, …)
● 2020-?: AIOps
(Timeline: roles shifting from more machine learning toward more software engineering)
● We run on millions of hosts
● We collect tens of trillions of events per day
Visit datadoghq.com for more information
Datadog Watchdog™
https://docs.datadoghq.com/watchdog/
Anomaly monitors
https://docs.datadoghq.com/monitors/types/anomaly/#overview
The challenge of Anomaly Detection
The challenge of anomaly detection
Is this an anomaly?
Ghosh, Supriyo, et al. "How to fight production incidents? an empirical study on a large-scale cloud
service." Proceedings of the 13th Symposium on Cloud Computing. 2022.
"How incidents are detected? … we
observe that about 55% of the incidents
were detected by the automated
watchdogs."
The challenge of anomaly detection
Is this an anomaly?
Should I page someone?
Anomaly detection for cloud systems
● Account for the severity of the anomaly
● Low time to detection
● Low false detection rates
● Explainability matters
Understand the context of the product
Why am I building this algorithm?
(Charts: hits/second, errors/hits)
The challenge of Time Series
The challenge of Time Series
Hewamalage, Hansika, Klaus Ackermann, and Christoph Bergmeir. "Forecast evaluation for data scientists:
common pitfalls and best practices." Data Mining and Knowledge Discovery 37.2 (2023): 788-832.
The challenge of Time Series
"we regularly come across papers in top
Artificial Intelligence (AI)/ML conferences
and journals (even winning best paper
awards) that use inadequate and misleading
benchmark methods for comparison"
Hewamalage, Hansika, Klaus Ackermann, and Christoph Bergmeir. "Forecast evaluation for data scientists:
common pitfalls and best practices." Data Mining and Knowledge Discovery 37.2 (2023): 788-832.
The challenge of Time Series
MAE: mean absolute error
MSE: mean squared error
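Both metrics fit in a few lines, and a toy series with a single spike shows why they can rank the same forecasts differently: MAE penalizes the spike linearly, MSE quadratically.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error: average of (actual - predicted) squared."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [10, 12, 11, 50]   # one large spike at the end
y_pred = [10, 11, 12, 12]
print(mae(y_true, y_pred))  # 10.0  -- spike contributes linearly
print(mse(y_true, y_pred))  # 361.5 -- spike dominates quadratically
```

Which penalty is "right" depends on the product: paging someone over a missed spike is very different from slightly misdrawing a forecast band.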
Schmidl, Sebastian, Phillip Wenig, and Thorsten Papenbrock. "Anomaly detection in time series: a
comprehensive evaluation." Proceedings of the VLDB Endowment 15.9 (2022): 1779-1797.
This comprehensive, scientific study
carefully evaluates most
state-of-the-art anomaly detection
algorithms. We collected and
re-implemented 71 anomaly detection
algorithms from different domains and
evaluated them on 976 time series
datasets.
Our experimental results on the
different datasets show that, overall,
every anomaly detection family can be
effective and there is no clear winner.
Choosing the right algorithm for the context
● What do your time series look like?
○ Domain knowledge
● Are you evaluating correctly?
○ Do you have relevant benchmarks?
○ Do you have a strong "simple" baseline?
○ Do you have relevant evaluation metrics?
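A strong "simple" baseline for forecasting can be as small as the naive and seasonal-naive forecasts below. This is a generic sketch, not code from the talk; if a sophisticated model cannot beat these, the evaluation should say so.

```python
def naive_forecast(series, horizon=1):
    """Naive baseline: predict that future values equal the last observed value."""
    return [series[-1]] * horizon

def seasonal_naive_forecast(series, season=24, horizon=1):
    """Seasonal naive: repeat the value from one season ago
    (e.g. season=24 for hourly data with a daily cycle)."""
    return [series[-season + (h % season)] for h in range(horizon)]

hourly = list(range(48))  # toy hourly series: two "days" of data
print(naive_forecast(hourly, 3))                # [47, 47, 47]
print(seasonal_naive_forecast(hourly, 24, 3))   # [24, 25, 26]
```

Comparing a new model against these two, on relevant metrics, is exactly the kind of baseline the Hewamalage et al. paper finds missing in published work.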
The challenge of anomaly detection
Is this an anomaly?
Is this unlike other events in the same context?
The challenge of Data Science in general
Classical software
Use algorithms to process data.
Classical Software
(Diagram: metric → smooth → threshold → anomaly detection)
Strong contracts
Machine Learning: so what's different?
The function is generated from the data
Machine Learning
Weak contracts
Different kinds of contracts

Strong contracts: function definition is clear
- Rules / mathematics
- Unit tests
- Explainable

Weak contracts: function definition is data-dependent
- Examples
- Statistical accuracy
- Uncertain outcome
Different kinds of contracts
Strong: "An anomaly is whenever latency goes above a given threshold"
Weak: "An anomaly is an event unlike others in the same context"
(ideas from "Two big challenges in machine learning", keynote by Léon Bottou, ICML 2015)
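The difference shows up in how each contract is tested: a rule admits exact unit tests, while a learned detector can only be held to a statistical accuracy target. Everything below is a toy illustration; the functions and numbers are invented for this sketch.

```python
# Strong contract: a rule with a clear definition -- unit-testable exactly.
def is_anomaly_rule(latency_ms, threshold_ms=500):
    return latency_ms > threshold_ms

assert is_anomaly_rule(600) is True    # exact, per-input guarantee
assert is_anomaly_rule(400) is False

# Weak contract: a toy stand-in for a trained model (k-sigma around
# "learned" statistics) -- we can only hold it to an accuracy target.
def is_anomaly_model(x, mean=10.0, std=2.0, k=3.0):
    return abs(x - mean) > k * std

labeled = [(10, False), (11, False), (30, True), (9, False), (17, True)]
accuracy = sum(is_anomaly_model(x) == y for x, y in labeled) / len(labeled)
assert accuracy >= 0.8   # statistical contract, not a per-example one
```

The weak contract's test can pass today and fail after retraining or a data shift, which is precisely the uncertainty the slide is pointing at.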
Should I use machine learning?
● Can you describe the problem with simple rules?
● Do you have data?
● Do you need 100% accuracy?
○ Can you have 100% accuracy, realistically?
● Do you need 100% explainability?
○ E.g. regulations/law
So, why am I doing this?
Takeaways
Data science is harder than you think
● Understand the product
○ What kind of contract fits better?
● Evaluate rigorously
○ Why is this algorithm better than any other?
● Adapt to the context
○ Why am I doing this?
Thanks! Questions?
annemarie@datadoghq.com
