Why am I doing this???
Anne-Marie Tousch
Senior Data Scientist, Datadog
PyLadies Meetup
November 16th, 2023
Why am I doing this?
❏ To share my pain
❏ To show off my knowledge
❏ To explain away why I'm failing so much
❏ Why did I sign up for this talk?
❏ To make you ask the same question
Why am I doing this???
Or why data science is harder than you think
Quick bio
computer vision (PhD)
computer vision (startup)
ML (RecSys, …)
2020-?: AIOps
[Timeline figure: years 2006, 2010, 2014, 2020; axis labels "More Machine Learning" and "More Software Engineering"]
● We run on millions of hosts
● We collect tens of trillions of events per day
Visit datadoghq.com for more information
Datadog Watchdog™
https://docs.datadoghq.com/watchdog/
Anomaly monitors
https://docs.datadoghq.com/monitors/types/anomaly/#overview
The challenge of anomaly detection
Is this an anomaly?
Ghosh, Supriyo, et al. "How to fight production incidents? an empirical study on a large-scale cloud
service." Proceedings of the 13th Symposium on Cloud Computing. 2022.
"How incidents are detected? … we observe that about 55% of the incidents were detected by the automated watchdogs."
The challenge of anomaly detection
Is this an anomaly?
Should I page someone?
Anomaly detection for cloud systems
● Account for the severity of the anomaly
● Low time to detection
● Low false detection rates
● Explainability matters
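Two of the requirements above can be measured once some incidents are labeled. The following is a minimal sketch, not Datadog's method: the function name `evaluate_alerts`, the sample timestamps, and the definition of "false detection rate" as the share of alerts that fall outside every labeled incident are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def evaluate_alerts(
    alert_times: List[float],
    incidents: List[Tuple[float, float]],
) -> Tuple[float, Optional[float]]:
    """Return (false detection rate, mean time to detection in seconds).

    alert_times: timestamps at which a detector fired (hypothetical data).
    incidents: (start, end) intervals of labeled true incidents.
    """
    false_alerts = 0
    delays = []
    detected = set()
    for t in sorted(alert_times):
        # Find the first labeled incident that contains this alert, if any.
        hit = next(
            (i for i, (start, end) in enumerate(incidents) if start <= t <= end),
            None,
        )
        if hit is None:
            false_alerts += 1
        elif hit not in detected:
            # Only the first alert inside an incident counts for the delay.
            detected.add(hit)
            delays.append(t - incidents[hit][0])
    false_rate = false_alerts / len(alert_times) if alert_times else 0.0
    mean_delay = sum(delays) / len(delays) if delays else None
    return false_rate, mean_delay


alerts = [10.0, 125.0, 130.0, 400.0]
incidents = [(120.0, 180.0), (390.0, 450.0)]
false_rate, mean_delay = evaluate_alerts(alerts, incidents)
print(false_rate, mean_delay)
```

Severity and explainability are not captured by these two numbers, so they would need their own checks.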
Understand the context of the product
Why am I building this algorithm?
[Figure panels: Hits/seconds, Errors/hits]
The challenge of Time Series
Hewamalage, Hansika, Klaus Ackermann, and Christoph Bergmeir. "Forecast evaluation for data scientists:
common pitfalls and best practices." Data Mining and Knowledge Discovery 37.2 (2023): 788-832.
"we regularly come across papers in top Artificial Intelligence (AI)/ML conferences and journals (even winning best paper awards) that use inadequate and misleading benchmark methods for comparison"
MAE: mean absolute error
MSE: mean squared error
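As a small illustration of why the choice between these two metrics matters, here is a sketch with made-up numbers (the series and both forecasts are hypothetical). The two forecasts have the same MAE, but MSE penalizes the single large miss much more.

```python
from typing import Sequence


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error: average of |actual - forecast|."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean squared error: average of (actual - forecast)^2."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


actual = [10.0, 12.0, 11.0, 13.0]
steady = [11.0, 13.0, 12.0, 14.0]  # off by 1 at every step
spiky = [10.0, 12.0, 11.0, 17.0]   # exact except for one miss of 4

print(mae(actual, steady), mse(actual, steady))
print(mae(actual, spiky), mse(actual, spiky))
```

Which forecast is "better" therefore depends on whether rare large errors or frequent small ones hurt the product more.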
Schmidl, Sebastian, Phillip Wenig, and Thorsten Papenbrock. "Anomaly detection in time series: a
comprehensive evaluation." Proceedings of the VLDB Endowment 15.9 (2022): 1779-1797.
"This comprehensive, scientific study carefully evaluates most state-of-the-art anomaly detection algorithms. We collected and re-implemented 71 anomaly detection algorithms from different domains and evaluated them on 976 time series datasets."
"Our experimental results on the different datasets show that, overall, every anomaly detection family can be effective and there is no clear winner."
Choosing the right algorithm for the context
● What do your time series look like?
○ Domain knowledge
● Are you evaluating correctly?
○ Do you have relevant benchmarks?
○ Do you have a strong "simple" baseline?
○ Do you have relevant evaluation metrics?
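One way to act on the "strong simple baseline" question is to score any candidate against a seasonal naive forecast, which simply repeats the last observed season. This is a minimal sketch under stated assumptions: the data, the season length of 3, and the flat `candidate` forecast standing in for a more complex model are all hypothetical.

```python
from typing import List, Sequence


def seasonal_naive(history: Sequence[float], season_length: int, horizon: int) -> List[float]:
    """Forecast by repeating the last observed season."""
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


history = [1.0, 5.0, 2.0, 1.0, 5.0, 2.0]  # two full seasons of length 3
future = [1.0, 5.0, 2.0]                  # held-out values to forecast

baseline = seasonal_naive(history, season_length=3, horizon=3)
candidate = [2.5, 2.5, 2.5]  # stand-in for the output of a fancier model

baseline_mae = mae(future, baseline)
candidate_mae = mae(future, candidate)
print(baseline_mae, candidate_mae)
```

If the candidate cannot beat this baseline on data that resembles production, the extra complexity is hard to justify.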
The challenge of anomaly detection
Is this an anomaly?
Is this unlike other events in the same
context?
The challenge of Data Science in general
Classical software
Use algorithms to process data.
[Diagram: smooth, threshold, anomaly detection]
Strong contracts
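A sketch of what such a classical pipeline could look like, assuming a trailing moving average for the smoothing step and a fixed threshold (the window size, threshold, and latency values are made up). Because the behavior follows from explicit rules, it can be pinned down exactly with a unit test, which is what makes the contract strong.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; early points use the samples available so far."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def detect(values: Sequence[float], window: int, threshold: float) -> List[bool]:
    """Flag every point whose smoothed value exceeds the threshold."""
    return [v > threshold for v in moving_average(values, window)]


# One isolated spike (400) followed by a sustained rise (300, 320, 310).
latency_ms = [100.0, 100.0, 400.0, 100.0, 100.0, 300.0, 320.0, 310.0]
flags = detect(latency_ms, window=3, threshold=250.0)
print(flags)
```

With these inputs the isolated spike is smoothed away and only the last point of the sustained rise is flagged.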
Machine Learning: so what's different?
The function is generated from the data
Machine Learning
Weak contracts
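To make "the function is generated from the data" concrete, here is a minimal sketch in which the threshold is fitted rather than written by hand. The rule "mean plus three standard deviations" and the two training sets are assumptions chosen for illustration, not a method taken from the talk.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence


def fit_threshold_detector(
    training_values: Sequence[float], n_sigmas: float = 3.0
) -> Callable[[float], bool]:
    """Build a detector whose threshold is derived from training data."""
    threshold = mean(training_values) + n_sigmas * pstdev(training_values)

    def detector(value: float) -> bool:
        return value > threshold

    return detector


quiet_service = [100.0, 102.0, 98.0, 100.0]   # hypothetical latencies (ms)
busy_service = [100.0, 300.0, 100.0, 300.0]

detect_quiet = fit_threshold_detector(quiet_service)
detect_busy = fit_threshold_detector(busy_service)

# The same code yields two different functions: 250 ms is flagged by one
# detector and accepted by the other, depending only on the training data.
print(detect_quiet(250.0), detect_busy(250.0))
```

Nothing in the code states what the right answer for 250 ms is, so correctness can only be judged on examples.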
Different kinds of contracts
Strong contracts: function definition is clear
- Rules / mathematics
- Unit tests
- Explainable
Weak contracts: function definition is data-dependent
- Examples
- Statistical accuracy
- Uncertain outcome
Different kinds of contracts
"An anomaly is whenever latency goes above a given threshold"
"An anomaly is an event unlike others in the same context"
(ideas from "Two big challenges in machine learning", keynote by Leon Bottou, ICML 2015)
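The second definition can be sketched as a score relative to the other events in the same context. The example below uses a robust score (distance from the median, in units of median absolute deviation); this scoring rule, the cutoff of 5, and the latency values are assumptions for illustration only.

```python
from statistics import median
from typing import List, Sequence


def robust_scores(values: Sequence[float]) -> List[float]:
    """Distance of each value from the median, in MAD units."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Degenerate context: every deviation from the median is extreme.
        return [0.0 if v == med else float("inf") for v in values]
    return [abs(v - med) / mad for v in values]


# The same 300 ms observation, seen in two different contexts.
fast_context = [100.0, 102.0, 98.0, 101.0, 99.0, 300.0]
slow_context = [290.0, 310.0, 300.0, 305.0, 295.0, 300.0]

score_in_fast = robust_scores(fast_context)[-1]
score_in_slow = robust_scores(slow_context)[-1]

CUTOFF = 5.0  # hypothetical cutoff; in practice it would be tuned on examples
print(score_in_fast > CUTOFF, score_in_slow > CUTOFF)
```

A fixed threshold of, say, 250 ms would flag both observations; the context-based score flags only the one that is unlike its neighbors, at the cost of an outcome that depends on the data.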
Should I use machine learning?
● Can you describe the problem with simple rules?
● Do you have data?
● Do you need 100% accuracy?
○ Can you have 100% accuracy realistically?
● Do you need 100% explainability?
○ E.g. regulations/law
So, why am I doing this?
Takeaways
Data science is harder than you think
● Understand the product
○ What kind of contract fits better?
● Evaluate rigorously
○ Why is this algorithm better than any other?
● Adapt to the context
○ Why am I doing this?
Thanks! Questions?
annemarie@datadoghq.com
