SlideShare a Scribd company logo
A field guide to the machine
learning zoo
Theodore Vasiloudis SICS/KTH
From idea to objective function
Formulating an ML problem
Formulating an ML problem
● Common aspects
Source: Xing (2015)
Formulating an ML problem
● Common aspects
○ Model (θ)
Source: Xing (2015)
Formulating an ML problem
● Common aspects
○ Model (θ)
○ Data (D)
Source: Xing (2015)
Formulating an ML problem
● Common aspects
○ Model (θ)
○ Data (D)
● Objective function: L(θ, D)
Source: Xing (2015)
Formulating an ML problem
● Common aspects
○ Model (θ)
○ Data (D)
● Objective function: L(θ, D)
● Prior knowledge: r(θ)
Source: Xing (2015)
Formulating an ML problem
● Common aspects
○ Model (θ)
○ Data (D)
● Objective function: L(θ, D)
● Prior knowledge: r(θ)
● ML program: f(θ, D) = L(θ, D) + r(θ)
Source: Xing (2015)
Formulating an ML problem
● Common aspects
○ Model (θ)
○ Data (D)
● Objective function: L(θ, D)
● Prior knowledge: r(θ)
● ML program: f(θ, D) = L(θ, D) + r(θ)
● ML Algorithm: How to optimize f(θ, D)
Source: Xing (2015)
Example: Improve retention at Twitter
● Goal: Reduce the churn of users on Twitter
● Assumption: Users churn because they don’t engage with the platform
● Idea: Increase the retweets, by promoting tweets more likely to be
retweeted
Example: Improve retention at Twitter
● Goal: Reduce the churn of users on Twitter
● Assumption: Users churn because they don’t engage with the platform
● Idea: Increase the retweets, by promoting tweets more likely to be
retweeted
● Data (D):
● Model (θ):
● Objective function - L(D, θ):
● Prior knowledge (Regularization):
● Algorithm:
Example: Improve retention at Twitter
● Goal: Reduce the churn of users on Twitter
● Assumption: Users churn because they don’t engage with the platform
● Idea: Increase the retweets, by promoting tweets more likely to be
retweeted
● Data (D): Features and labels, xi
, yi
● Model (θ):
● Objective function - L(D, θ):
● Prior knowledge (Regularization):
● Algorithm:
Example: Improve retention at Twitter
● Goal: Reduce the churn of users on Twitter
● Assumption: Users churn because they don’t engage with the platform
● Idea: Increase the retweets, by promoting tweets more likely to be
retweeted
● Data (D): Features and labels, xi
, yi
● Model (θ): Logistic regression, parameters w
○ p(y|x, w) = Bernouli(y | sigm(wΤ
x))
● Objective function - L(D, θ):
● Prior knowledge (Regularization):
● Algorithm:
Example: Improve retention at Twitter
● Goal: Reduce the churn of users on Twitter
● Assumption: Users churn because they don’t engage with the platform
● Idea: Increase the retweets, by promoting tweets more likely to be
retweeted
● Data (D): Features and labels, xi
, yi
● Model (θ): Logistic regression, parameters w
○ p(y|x, w) = Bernouli(y | sigm(wΤ
x))
● Objective function - L(D, θ): NLL(w) = Σ log(1 + exp(-y wΤ
xi
))
● Prior knowledge (Regularization): r(w) = λ*wΤ
w
● Algorithm:
Warning:
Notation abuse
Example: Improve retention at Twitter
● Goal: Reduce the churn of users on Twitter
● Assumption: Users churn because they don’t engage with the platform
● Idea: Increase the retweets, by promoting tweets more likely to be
retweeted
● Data (D): Features and labels, xi
, yi
● Model (θ): Logistic regression, parameters w
○ p(y|x, w) = Bernouli(y | sigm(wΤ
x))
● Objective function - L(D, θ): NLL(w) = Σ log(1 + exp(-y wΤ
xi
))
● Prior knowledge (Regularization): r(w) = λ*wΤ
w
● Algorithm: Gradient Descent
Data problems
Data problems
● GIGO: Garbage In - Garbage Out
Data readiness
Source: Lawrence (2017)
Data readiness
● Problem: “Data” as a concept is hard to reason about.
● Goal: Make the stakeholders aware of the state of the data at all stages
Source: Lawrence (2017)
Data readiness
Source: Lawrence (2017)
Data readiness
● Band C
○ Accessibility
Source: Lawrence (2017)
Data readiness
● Band C
○ Accessibility
● Band B
○ Representation and faithfulness
Source: Lawrence (2017)
Data readiness
● Band C
○ Accessibility
● Band B
○ Representation and faithfulness
● Band A
○ Data in context
Source: Lawrence (2017)
Data readiness
● Band C
○ “How long will it take to bring our user data to C1 level?”
● Band B
○ “Until we know the collection process we can’t move the data to B1.”
● Band A
○ “We realized that we would need location data in order to have an A1 dataset.”
Source: Lawrence (2017)
Data readiness
● Band C
○ “How long will it take to bring our user data to C1 level?”
● Band B
○ “Until we know the collection process we can’t move the data to B1.”
● Band A
○ “We realized that we would need location data in order to have an A1 dataset.”
Selecting algorithm & software:
“Easy” choices
Selecting algorithms
An ML algorithm “farm”
Source: scikit-learn.org
The neural network zoo
Source: Asimov
Institute (2016)
Selecting algorithms
● Always go for the simplest model you can afford
Selecting algorithms
● Always go for the simplest model you can afford
○ Your first model is more about getting the infrastructure right
Source: Zinkevich (2017)
Selecting algorithms
● Always go for the simplest model you can afford
○ Your first model is more about getting the infrastructure right
○ Simple models are usually interpretable. Interpretable models are easier to debug.
Source: Zinkevich (2017)
Selecting algorithms
● Always go for the simplest model you can afford
○ Your first model is more about getting the infrastructure right
○ Simple models are usually interpretable. Interpretable models are easier to debug.
○ Complex model erode boundaries
Source: Sculley et al. (2015)
Selecting algorithms
● Always go for the simplest model you can afford
○ Your first model is more about getting the infrastructure right
○ Simple models are usually interpretable. Interpretable models are easier to debug.
○ Complex model erode boundaries
■ CACE principle: Changing Anything Changes Everything
Source: Sculley et al. (2015)
Selecting software
The ML software zoo
Leaf
Your model vs. the world
What are the problems with ML systems?
Data ML Code Model
Expectation
What are the problems with ML systems?
Data ML Code Model
Reality
Sculley et al. (2015)
Things to watch out for
● Data dependencies
Things to watch out for
Sculley et al. (2015)
& Zinkevich (2017)
● Data dependencies
○ Unstable dependencies
Things to watch out for
Sculley et al. (2015)
& Zinkevich (2017)
● Data dependencies
○ Unstable dependencies
● Feedback loops
Things to watch out for
Sculley et al. (2015)
& Zinkevich (2017)
● Data dependencies
○ Unstable dependencies
● Feedback loops
○ Direct
Things to watch out for
Sculley et al. (2015)
& Zinkevich (2017)
● Data dependencies
○ Unstable dependencies
● Feedback loops
○ Direct
○ Indirect
Things to watch out for
Sculley et al. (2015)
& Zinkevich (2017)
Bringing it all together
Bringing it all together
● Define your problem as optimizing your objective function using data
● Determine (and monitor) the readiness of your data
● Don't spend too much time at first choosing an ML framework/algorithm
● Worry much more about what happens when your model meets the world.
Thank you.
@thvasilo
tvas@sics.se
Sources
● Google auto-replies: Shared photos, and text
● Silver et al. (2016): Mastering the game of Go
● Xing (2015): A new look at the system, algorithm and theory foundations of Distributed ML
● Lawrence (2017): Data readiness levels
● Asimov Institute (2016): The Neural Network Zoo
● Zinkevich (2017): Rules of Machine Learning - Best Practices for ML Engineering
● Sculley et al. (2015): Hidden Technical Debt in Machine Learning Systems

More Related Content

Similar to A field guide the machine learning zoo

General introduction to AI ML DL DS
General introduction to AI ML DL DSGeneral introduction to AI ML DL DS
General introduction to AI ML DL DS
Roopesh Kohad
 
L15.pptx
L15.pptxL15.pptx
L15.pptx
ImonBennett
 
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
Strata 2016 -  Lessons Learned from building real-life Machine Learning SystemsStrata 2016 -  Lessons Learned from building real-life Machine Learning Systems
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
Xavier Amatriain
 
PyData SF 2016 --- Moving forward through the darkness
PyData SF 2016 --- Moving forward through the darknessPyData SF 2016 --- Moving forward through the darkness
PyData SF 2016 --- Moving forward through the darkness
Chia-Chi Chang
 
A Plan for Sustainable MIR Evaluation
A Plan for Sustainable MIR EvaluationA Plan for Sustainable MIR Evaluation
A Plan for Sustainable MIR Evaluation
Julián Urbano
 
Online Machine Learning: introduction and examples
Online Machine Learning:  introduction and examplesOnline Machine Learning:  introduction and examples
Online Machine Learning: introduction and examples
Felipe
 
BIG2016- Lessons Learned from building real-life user-focused Big Data systems
BIG2016- Lessons Learned from building real-life user-focused Big Data systemsBIG2016- Lessons Learned from building real-life user-focused Big Data systems
BIG2016- Lessons Learned from building real-life user-focused Big Data systems
Xavier Amatriain
 
Jakub Langr (University of Oxford) - Overview of Generative Adversarial Netwo...
Jakub Langr (University of Oxford) - Overview of Generative Adversarial Netwo...Jakub Langr (University of Oxford) - Overview of Generative Adversarial Netwo...
Jakub Langr (University of Oxford) - Overview of Generative Adversarial Netwo...
Codiax
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
Sampath Kumar
 
Human in the loop: Bayesian Rules Enabling Explainable AI
Human in the loop: Bayesian Rules Enabling Explainable AIHuman in the loop: Bayesian Rules Enabling Explainable AI
Human in the loop: Bayesian Rules Enabling Explainable AI
Pramit Choudhary
 
D7 MarkPlus - Machine Learning Algorithm.pdf
D7 MarkPlus - Machine Learning Algorithm.pdfD7 MarkPlus - Machine Learning Algorithm.pdf
D7 MarkPlus - Machine Learning Algorithm.pdf
ikraizn
 
A few questions about large scale machine learning
A few questions about large scale machine learningA few questions about large scale machine learning
A few questions about large scale machine learning
Theodoros Vasiloudis
 
DSDT Meetup April 2021
DSDT Meetup April 2021DSDT Meetup April 2021
DSDT Meetup April 2021
DSDT_MTL
 
Data science
Data scienceData science
Data science
Purna Chander
 
London TensorFlow Meetup - 18th July 2017
London TensorFlow Meetup - 18th July 2017London TensorFlow Meetup - 18th July 2017
London TensorFlow Meetup - 18th July 2017
Daniel Ecer
 
Learning for Big Data-林軒田
Learning for Big Data-林軒田Learning for Big Data-林軒田
Learning for Big Data-林軒田
台灣資料科學年會
 
An Empirical Comparison of Knowledge Graph Embeddings for Item Recommendation
An Empirical Comparison of Knowledge Graph Embeddings for Item RecommendationAn Empirical Comparison of Knowledge Graph Embeddings for Item Recommendation
An Empirical Comparison of Knowledge Graph Embeddings for Item Recommendation
Enrico Palumbo
 
Improving How We Deliver Machine Learning Models (XCONF 2019)
Improving How We Deliver Machine Learning Models (XCONF 2019)Improving How We Deliver Machine Learning Models (XCONF 2019)
Improving How We Deliver Machine Learning Models (XCONF 2019)
David Tan
 
C2_W1---.pdf
C2_W1---.pdfC2_W1---.pdf
C2_W1---.pdf
Humayun Kabir
 
Intro to Graphs and Neo4j
Intro to Graphs and Neo4jIntro to Graphs and Neo4j
Intro to Graphs and Neo4jjexp
 

Similar to A field guide the machine learning zoo (20)

General introduction to AI ML DL DS
General introduction to AI ML DL DSGeneral introduction to AI ML DL DS
General introduction to AI ML DL DS
 
L15.pptx
L15.pptxL15.pptx
L15.pptx
 
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
Strata 2016 -  Lessons Learned from building real-life Machine Learning SystemsStrata 2016 -  Lessons Learned from building real-life Machine Learning Systems
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
 
PyData SF 2016 --- Moving forward through the darkness
PyData SF 2016 --- Moving forward through the darknessPyData SF 2016 --- Moving forward through the darkness
PyData SF 2016 --- Moving forward through the darkness
 
A Plan for Sustainable MIR Evaluation
A Plan for Sustainable MIR EvaluationA Plan for Sustainable MIR Evaluation
A Plan for Sustainable MIR Evaluation
 
Online Machine Learning: introduction and examples
Online Machine Learning:  introduction and examplesOnline Machine Learning:  introduction and examples
Online Machine Learning: introduction and examples
 
BIG2016- Lessons Learned from building real-life user-focused Big Data systems
BIG2016- Lessons Learned from building real-life user-focused Big Data systemsBIG2016- Lessons Learned from building real-life user-focused Big Data systems
BIG2016- Lessons Learned from building real-life user-focused Big Data systems
 
Jakub Langr (University of Oxford) - Overview of Generative Adversarial Netwo...
Jakub Langr (University of Oxford) - Overview of Generative Adversarial Netwo...Jakub Langr (University of Oxford) - Overview of Generative Adversarial Netwo...
Jakub Langr (University of Oxford) - Overview of Generative Adversarial Netwo...
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 
Human in the loop: Bayesian Rules Enabling Explainable AI
Human in the loop: Bayesian Rules Enabling Explainable AIHuman in the loop: Bayesian Rules Enabling Explainable AI
Human in the loop: Bayesian Rules Enabling Explainable AI
 
D7 MarkPlus - Machine Learning Algorithm.pdf
D7 MarkPlus - Machine Learning Algorithm.pdfD7 MarkPlus - Machine Learning Algorithm.pdf
D7 MarkPlus - Machine Learning Algorithm.pdf
 
A few questions about large scale machine learning
A few questions about large scale machine learningA few questions about large scale machine learning
A few questions about large scale machine learning
 
DSDT Meetup April 2021
DSDT Meetup April 2021DSDT Meetup April 2021
DSDT Meetup April 2021
 
Data science
Data scienceData science
Data science
 
London TensorFlow Meetup - 18th July 2017
London TensorFlow Meetup - 18th July 2017London TensorFlow Meetup - 18th July 2017
London TensorFlow Meetup - 18th July 2017
 
Learning for Big Data-林軒田
Learning for Big Data-林軒田Learning for Big Data-林軒田
Learning for Big Data-林軒田
 
An Empirical Comparison of Knowledge Graph Embeddings for Item Recommendation
An Empirical Comparison of Knowledge Graph Embeddings for Item RecommendationAn Empirical Comparison of Knowledge Graph Embeddings for Item Recommendation
An Empirical Comparison of Knowledge Graph Embeddings for Item Recommendation
 
Improving How We Deliver Machine Learning Models (XCONF 2019)
Improving How We Deliver Machine Learning Models (XCONF 2019)Improving How We Deliver Machine Learning Models (XCONF 2019)
Improving How We Deliver Machine Learning Models (XCONF 2019)
 
C2_W1---.pdf
C2_W1---.pdfC2_W1---.pdf
C2_W1---.pdf
 
Intro to Graphs and Neo4j
Intro to Graphs and Neo4jIntro to Graphs and Neo4j
Intro to Graphs and Neo4j
 

Recently uploaded

Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 

Recently uploaded (20)

Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 

A field guide the machine learning zoo

  • 1. A field guide to the machine learning zoo Theodore Vasiloudis SICS/KTH
  • 2. From idea to objective function
  • 4. Formulating an ML problem ● Common aspects Source: Xing (2015)
  • 5. Formulating an ML problem ● Common aspects ○ Model (θ) Source: Xing (2015)
  • 6. Formulating an ML problem ● Common aspects ○ Model (θ) ○ Data (D) Source: Xing (2015)
  • 7. Formulating an ML problem ● Common aspects ○ Model (θ) ○ Data (D) ● Objective function: L(θ, D) Source: Xing (2015)
  • 8. Formulating an ML problem ● Common aspects ○ Model (θ) ○ Data (D) ● Objective function: L(θ, D) ● Prior knowledge: r(θ) Source: Xing (2015)
  • 9. Formulating an ML problem ● Common aspects ○ Model (θ) ○ Data (D) ● Objective function: L(θ, D) ● Prior knowledge: r(θ) ● ML program: f(θ, D) = L(θ, D) + r(θ) Source: Xing (2015)
  • 10. Formulating an ML problem ● Common aspects ○ Model (θ) ○ Data (D) ● Objective function: L(θ, D) ● Prior knowledge: r(θ) ● ML program: f(θ, D) = L(θ, D) + r(θ) ● ML Algorithm: How to optimize f(θ, D) Source: Xing (2015)
  • 11. Example: Improve retention at Twitter ● Goal: Reduce the churn of users on Twitter ● Assumption: Users churn because they don’t engage with the platform ● Idea: Increase the retweets, by promoting tweets more likely to be retweeted
  • 12. Example: Improve retention at Twitter ● Goal: Reduce the churn of users on Twitter ● Assumption: Users churn because they don’t engage with the platform ● Idea: Increase the retweets, by promoting tweets more likely to be retweeted ● Data (D): ● Model (θ): ● Objective function - L(D, θ): ● Prior knowledge (Regularization): ● Algorithm:
  • 13. Example: Improve retention at Twitter ● Goal: Reduce the churn of users on Twitter ● Assumption: Users churn because they don’t engage with the platform ● Idea: Increase the retweets, by promoting tweets more likely to be retweeted ● Data (D): Features and labels, xi , yi ● Model (θ): ● Objective function - L(D, θ): ● Prior knowledge (Regularization): ● Algorithm:
  • 14. Example: Improve retention at Twitter ● Goal: Reduce the churn of users on Twitter ● Assumption: Users churn because they don’t engage with the platform ● Idea: Increase the retweets, by promoting tweets more likely to be retweeted ● Data (D): Features and labels, xi , yi ● Model (θ): Logistic regression, parameters w ○ p(y|x, w) = Bernouli(y | sigm(wΤ x)) ● Objective function - L(D, θ): ● Prior knowledge (Regularization): ● Algorithm:
  • 15. Example: Improve retention at Twitter ● Goal: Reduce the churn of users on Twitter ● Assumption: Users churn because they don’t engage with the platform ● Idea: Increase the retweets, by promoting tweets more likely to be retweeted ● Data (D): Features and labels, xi , yi ● Model (θ): Logistic regression, parameters w ○ p(y|x, w) = Bernouli(y | sigm(wΤ x)) ● Objective function - L(D, θ): NLL(w) = Σ log(1 + exp(-y wΤ xi )) ● Prior knowledge (Regularization): r(w) = λ*wΤ w ● Algorithm: Warning: Notation abuse
  • 16. Example: Improve retention at Twitter ● Goal: Reduce the churn of users on Twitter ● Assumption: Users churn because they don’t engage with the platform ● Idea: Increase the retweets, by promoting tweets more likely to be retweeted ● Data (D): Features and labels, xi , yi ● Model (θ): Logistic regression, parameters w ○ p(y|x, w) = Bernouli(y | sigm(wΤ x)) ● Objective function - L(D, θ): NLL(w) = Σ log(1 + exp(-y wΤ xi )) ● Prior knowledge (Regularization): r(w) = λ*wΤ w ● Algorithm: Gradient Descent
  • 18. Data problems ● GIGO: Garbage In - Garbage Out
  • 20. Data readiness ● Problem: “Data” as a concept is hard to reason about. ● Goal: Make the stakeholders aware of the state of the data at all stages Source: Lawrence (2017)
  • 22. Data readiness ● Band C ○ Accessibility Source: Lawrence (2017)
  • 23. Data readiness ● Band C ○ Accessibility ● Band B ○ Representation and faithfulness Source: Lawrence (2017)
  • 24. Data readiness ● Band C ○ Accessibility ● Band B ○ Representation and faithfulness ● Band A ○ Data in context Source: Lawrence (2017)
  • 25. Data readiness ● Band C ○ “How long will it take to bring our user data to C1 level?” ● Band B ○ “Until we know the collection process we can’t move the data to B1.” ● Band A ○ “We realized that we would need location data in order to have an A1 dataset.” Source: Lawrence (2017)
  • 26. Data readiness ● Band C ○ “How long will it take to bring our user data to C1 level?” ● Band B ○ “Until we know the collection process we can’t move the data to B1.” ● Band A ○ “We realized that we would need location data in order to have an A1 dataset.”
  • 27. Selecting algorithm & software: “Easy” choices
  • 29. An ML algorithm “farm” Source: scikit-learn.org
  • 30. The neural network zoo Source: Asimov Institute (2016)
  • 31. Selecting algorithms ● Always go for the simplest model you can afford
  • 32. Selecting algorithms ● Always go for the simplest model you can afford ○ Your first model is more about getting the infrastructure right Source: Zinkevich (2017)
  • 33. Selecting algorithms ● Always go for the simplest model you can afford ○ Your first model is more about getting the infrastructure right ○ Simple models are usually interpretable. Interpretable models are easier to debug. Source: Zinkevich (2017)
  • 34. Selecting algorithms ● Always go for the simplest model you can afford ○ Your first model is more about getting the infrastructure right ○ Simple models are usually interpretable. Interpretable models are easier to debug. ○ Complex model erode boundaries Source: Sculley et al. (2015)
  • 35. Selecting algorithms ● Always go for the simplest model you can afford ○ Your first model is more about getting the infrastructure right ○ Simple models are usually interpretable. Interpretable models are easier to debug. ○ Complex model erode boundaries ■ CACE principle: Changing Anything Changes Everything Source: Sculley et al. (2015)
  • 37. The ML software zoo Leaf
  • 38. Your model vs. the world
  • 39. What are the problems with ML systems? Data ML Code Model Expectation
  • 40. What are the problems with ML systems? Data ML Code Model Reality Sculley et al. (2015)
  • 41. Things to watch out for
  • 42. ● Data dependencies Things to watch out for Sculley et al. (2015) & Zinkevich (2017)
  • 43. ● Data dependencies ○ Unstable dependencies Things to watch out for Sculley et al. (2015) & Zinkevich (2017)
  • 44. ● Data dependencies ○ Unstable dependencies ● Feedback loops Things to watch out for Sculley et al. (2015) & Zinkevich (2017)
  • 45. ● Data dependencies ○ Unstable dependencies ● Feedback loops ○ Direct Things to watch out for Sculley et al. (2015) & Zinkevich (2017)
  • 46. ● Data dependencies ○ Unstable dependencies ● Feedback loops ○ Direct ○ Indirect Things to watch out for Sculley et al. (2015) & Zinkevich (2017)
  • 47. Bringing it all together
  • 48. Bringing it all together ● Define your problem as optimizing your objective function using data ● Determine (and monitor) the readiness of your data ● Don't spend too much time at first choosing an ML framework/algorithm ● Worry much more about what happens when your model meets the world.
  • 50. Sources ● Google auto-replies: Shared photos, and text ● Silver et al. (2016): Mastering the game of Go ● Xing (2015): A new look at the system, algorithm and theory foundations of Distributed ML ● Lawrence (2017): Data readiness levels ● Asimov Institute (2016): The Neural Network Zoo ● Zinkevich (2017): Rules of Machine Learning - Best Practices for ML Engineering ● Sculley et al. (2015): Hidden Technical Debt in Machine Learning Systems