SlideShare a Scribd company logo
Sonpvh – VNG 2017-july-11
LITERATURE REVIEW
1. The machine learning workflow
2. Evaluation Metrics
3. Offline Evaluation mechanisms
4. Hyperparameter turning
5. A/B Testing
6. Casual effect measurament
1
2
Why is it so complicated?
1. Offline evaluation: accuracy, precision,
recall …
Online evaluation: business metrics
2. Distribution drift: the distribution of data
changes overtime, so keep track the models
performance on the validation metrics of live
data.
 Ex: spam detection, prostitute detection…
Confusion matrix
Per-class accuracy
3
ROC: receiver operating characteristic
AUC: area under the curve
 Ex: search ranker, personalized recommendation ..
4
"The precision is the proportion of recommendations that
are good recommendations, and recall is the proportion of
good recommendations that appear in top
recommendations."
 Ex: prediction…

5
 Evaluation metrics # model log loss function: Train a personalized recommender
by minimizing the loss between its predictions and observed ratings, and then use
this recommender to produce a ranked list of recommendations. AVOID
 Skewed data, imbalanced, classes, outliers, rare data: analysis carefully before
doing anything else
6
1. The machine learning workflow
2. Evaluation Metrics
3. Offline Evaluation mechanisms
4. Hyperparameter turning
5. A/B Testing
6. Casual effect measurament
7
8
9
Cross validation:
Independently and Identically distributed
1. The machine learning workflow
2. Evaluation Metrics
3. Offline Evaluation mechanisms
4. Hyperparameter turning
5. A/B Testing
6. Casual effect measurament
10
11
 Model parameter: y = WT x
 Hyper-parameter (nuisance parameters): optimization state.
 Ex:
 Linear regression: regularization parameter,
 Decision trees: desired depth and number of leaves.
 SVMs: misclassification penalty term
 ..
1. The machine learning workflow
2. Evaluation Metrics
3. Offline Evaluation mechanisms
4. Hyperparameter turning
5. A/B Testing
6. Casual effect measurament
12
1. Split into randomized control/experimentation groups.
2. Observe behavior of both groups on the proposed methods.
3. Compute test statistics.
4. Compute p-value.
5. Output decision.
13
1. Baggage of the old: should do A/A testing first
2. Choose metrics, indexes (business design)
3. Did you count right?
4. How many observations do you need?
5. Is the distribution of the metric Gaussian?
6. Variances equal?
7. Multiple models, multiple hypotheses: A/A1/A2/…/B testing
8. How long to run the test?
9. Catching distribution drift: stationarity assumption
14
1. The machine learning workflow
2. Evaluation Metrics
3. Offline Evaluation mechanisms
4. Hyperparameter turning
5. A/B Testing
6. Casual effect measurament
15
 D = E(Yi(1)|Ti=1)- E(Yi(0)|Ti=0) + [E(Yi(0)|T=1) – E(Yi(0) |Ti=1)]
D: casual effect
Y(1),Y(0): outcome
T =1/0: treatment/control group
E(): expectation function
 D = ATE + B
 ATE = E(Yi(1)|Ti=1)- E(Yi(0)|Ti=1) Average Treatment Effect
 B = [E(Yi(0)|T=1) – E(Yi(0) |Ti=0)]
 We cannot observe E(Yi(0)|T=1) => cannot evaluate B => cannot evaluate D.
16
17
TOT = E[Yi(1) – Yi(0)|Ti=1] = E[Yi(1)|Ti=1] - E[Yi(0)|Ti=1]
Because of random sample, we have: E[Yi(0)|Ti=1] = E[Yi(0)|Ti=0]
So E[Yi(1)|Ti=1] - E[Yi(0)|Ti=0] = D
18
1. (Conditional) independence
2. Common support
3. TOT = EP(X)|T=1 {E[Y(1)|T=1, P(X)] – E[Y(0)|T=0, P(X)]}
19
 cov(T, Z) != 0
 cov(Z, ε) = 0 (exclusion restriction)
 Ti = γZi + φXi + ui , predict Ti
 Yi = αXi + βTi + εi
20

More Related Content

What's hot

Random Forest Algorithm - Random Forest Explained | Random Forest In Machine ...
Random Forest Algorithm - Random Forest Explained | Random Forest In Machine ...Random Forest Algorithm - Random Forest Explained | Random Forest In Machine ...
Random Forest Algorithm - Random Forest Explained | Random Forest In Machine ...
Simplilearn
 
Classification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning Algorithms
Md. Main Uddin Rony
 
Lecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Lecture 3: Basic Concepts of Machine Learning - Induction & EvaluationLecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Lecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Marina Santini
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Md. Main Uddin Rony
 
K Nearest Neighbor Algorithm
K Nearest Neighbor AlgorithmK Nearest Neighbor Algorithm
K Nearest Neighbor Algorithm
Tharuka Vishwajith Sarathchandra
 
Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning Classifiers
Functional Imperative
 
Machine Learning and Data Mining: 14 Evaluation and Credibility
Machine Learning and Data Mining: 14 Evaluation and CredibilityMachine Learning and Data Mining: 14 Evaluation and Credibility
Machine Learning and Data Mining: 14 Evaluation and Credibility
Pier Luca Lanzi
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
Benjamin Bengfort
 
Classification
ClassificationClassification
Classification
CloudxLab
 
Lecture 6: Ensemble Methods
Lecture 6: Ensemble Methods Lecture 6: Ensemble Methods
Lecture 6: Ensemble Methods
Marina Santini
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
Jon Lederman
 
22 Machine Learning Feature Selection
22 Machine Learning Feature Selection22 Machine Learning Feature Selection
22 Machine Learning Feature Selection
Andres Mendez-Vazquez
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
MartinHogg9
 
K means clustering
K means clusteringK means clustering
K means clustering
keshav goyal
 
Overfitting & Underfitting
Overfitting & UnderfittingOverfitting & Underfitting
Overfitting & Underfitting
SOUMIT KAR
 
Logistic regression in Machine Learning
Logistic regression in Machine LearningLogistic regression in Machine Learning
Logistic regression in Machine Learning
Kuppusamy P
 
Ensemble learning Techniques
Ensemble learning TechniquesEnsemble learning Techniques
Ensemble learning Techniques
Babu Priyavrat
 
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Simplilearn
 
Supervised Machine Learning
Supervised Machine LearningSupervised Machine Learning
Supervised Machine Learning
Ankit Rai
 

What's hot (20)

Random Forest Algorithm - Random Forest Explained | Random Forest In Machine ...
Random Forest Algorithm - Random Forest Explained | Random Forest In Machine ...Random Forest Algorithm - Random Forest Explained | Random Forest In Machine ...
Random Forest Algorithm - Random Forest Explained | Random Forest In Machine ...
 
Classification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning Algorithms
 
Lecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Lecture 3: Basic Concepts of Machine Learning - Induction & EvaluationLecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
Lecture 3: Basic Concepts of Machine Learning - Induction & Evaluation
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
 
K Nearest Neighbor Algorithm
K Nearest Neighbor AlgorithmK Nearest Neighbor Algorithm
K Nearest Neighbor Algorithm
 
Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning Classifiers
 
Machine Learning and Data Mining: 14 Evaluation and Credibility
Machine Learning and Data Mining: 14 Evaluation and CredibilityMachine Learning and Data Mining: 14 Evaluation and Credibility
Machine Learning and Data Mining: 14 Evaluation and Credibility
 
Introduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-LearnIntroduction to Machine Learning with SciKit-Learn
Introduction to Machine Learning with SciKit-Learn
 
Classification
ClassificationClassification
Classification
 
Lecture 6: Ensemble Methods
Lecture 6: Ensemble Methods Lecture 6: Ensemble Methods
Lecture 6: Ensemble Methods
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
 
22 Machine Learning Feature Selection
22 Machine Learning Feature Selection22 Machine Learning Feature Selection
22 Machine Learning Feature Selection
 
K means Clustering Algorithm
K means Clustering AlgorithmK means Clustering Algorithm
K means Clustering Algorithm
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
K means clustering
K means clusteringK means clustering
K means clustering
 
Overfitting & Underfitting
Overfitting & UnderfittingOverfitting & Underfitting
Overfitting & Underfitting
 
Logistic regression in Machine Learning
Logistic regression in Machine LearningLogistic regression in Machine Learning
Logistic regression in Machine Learning
 
Ensemble learning Techniques
Ensemble learning TechniquesEnsemble learning Techniques
Ensemble learning Techniques
 
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
 
Supervised Machine Learning
Supervised Machine LearningSupervised Machine Learning
Supervised Machine Learning
 

Similar to Model evaluation - machine learning

Experimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles BakerExperimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles Baker
Databricks
 
Data analytcis-first-steps
Data analytcis-first-stepsData analytcis-first-steps
Data analytcis-first-steps
Shesha R
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
Shiva Rama Krishna Dasharathi
 
Customer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenCustomer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R Open
Poo Kuan Hoong
 
The ways to fuck up ab testing (from data products meetup)
The ways to fuck up ab testing (from data products meetup)The ways to fuck up ab testing (from data products meetup)
The ways to fuck up ab testing (from data products meetup)
Data Products Meetup
 
Use the Windshield, Not the Mirror Predictive Metrics that Drive Successful ...
 Use the Windshield, Not the Mirror Predictive Metrics that Drive Successful ... Use the Windshield, Not the Mirror Predictive Metrics that Drive Successful ...
Use the Windshield, Not the Mirror Predictive Metrics that Drive Successful ...
Seapine Software
 
Anton Muzhailo - Practical Test Process Improvement using ISTQB
Anton Muzhailo - Practical Test Process Improvement using ISTQBAnton Muzhailo - Practical Test Process Improvement using ISTQB
Anton Muzhailo - Practical Test Process Improvement using ISTQB
Ievgenii Katsan
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
Xavier Amatriain
 
Predictive Analytics in Practice
Predictive Analytics in PracticePredictive Analytics in Practice
Predictive Analytics in Practice
Hobsons
 
An introduction to variable and feature selection
An introduction to variable and feature selectionAn introduction to variable and feature selection
An introduction to variable and feature selection
Marco Meoni
 
Six sigma
Six sigmaSix sigma
Six sigma
Alaleh Irooni
 
Data Analyst Interview Questions & Answers
Data Analyst Interview Questions & AnswersData Analyst Interview Questions & Answers
Data Analyst Interview Questions & Answers
Satyam Jaiswal
 
Master the essentials of conversion optimization
Master the essentials of conversion optimization Master the essentials of conversion optimization
Master the essentials of conversion optimization
Steve Clough
 
Lesson14
Lesson14Lesson14
Lesson14
GeorgeLasluisa
 
Measurement system analysis
Measurement system analysisMeasurement system analysis
Measurement system analysis
Tina Arora
 
ASC Model: A Process Model for the Evaluation of Simulated Field Exercises in...
ASC Model: A Process Model for the Evaluation of Simulated Field Exercises in...ASC Model: A Process Model for the Evaluation of Simulated Field Exercises in...
ASC Model: A Process Model for the Evaluation of Simulated Field Exercises in...
streamspotter
 
Master the Essentials of Conversion Optimization
Master the Essentials of Conversion Optimization Master the Essentials of Conversion Optimization
Master the Essentials of Conversion Optimization
Kyle Curnow
 
Rapid Miner
Rapid MinerRapid Miner
Rapid Miner
SrushtiSuvarna
 
Enhance testing with monitoring and analytics
Enhance testing with monitoring and analyticsEnhance testing with monitoring and analytics
Enhance testing with monitoring and analytics
Dana Aonofriesei
 
Practical Tools for Measurement Systems Analysis
Practical Tools for Measurement Systems AnalysisPractical Tools for Measurement Systems Analysis
Practical Tools for Measurement Systems Analysis
Gabor Szabo, CQE
 

Similar to Model evaluation - machine learning (20)

Experimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles BakerExperimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles Baker
 
Data analytcis-first-steps
Data analytcis-first-stepsData analytcis-first-steps
Data analytcis-first-steps
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Customer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenCustomer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R Open
 
The ways to fuck up ab testing (from data products meetup)
The ways to fuck up ab testing (from data products meetup)The ways to fuck up ab testing (from data products meetup)
The ways to fuck up ab testing (from data products meetup)
 
Use the Windshield, Not the Mirror Predictive Metrics that Drive Successful ...
 Use the Windshield, Not the Mirror Predictive Metrics that Drive Successful ... Use the Windshield, Not the Mirror Predictive Metrics that Drive Successful ...
Use the Windshield, Not the Mirror Predictive Metrics that Drive Successful ...
 
Anton Muzhailo - Practical Test Process Improvement using ISTQB
Anton Muzhailo - Practical Test Process Improvement using ISTQBAnton Muzhailo - Practical Test Process Improvement using ISTQB
Anton Muzhailo - Practical Test Process Improvement using ISTQB
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
 
Predictive Analytics in Practice
Predictive Analytics in PracticePredictive Analytics in Practice
Predictive Analytics in Practice
 
An introduction to variable and feature selection
An introduction to variable and feature selectionAn introduction to variable and feature selection
An introduction to variable and feature selection
 
Six sigma
Six sigmaSix sigma
Six sigma
 
Data Analyst Interview Questions & Answers
Data Analyst Interview Questions & AnswersData Analyst Interview Questions & Answers
Data Analyst Interview Questions & Answers
 
Master the essentials of conversion optimization
Master the essentials of conversion optimization Master the essentials of conversion optimization
Master the essentials of conversion optimization
 
Lesson14
Lesson14Lesson14
Lesson14
 
Measurement system analysis
Measurement system analysisMeasurement system analysis
Measurement system analysis
 
ASC Model: A Process Model for the Evaluation of Simulated Field Exercises in...
ASC Model: A Process Model for the Evaluation of Simulated Field Exercises in...ASC Model: A Process Model for the Evaluation of Simulated Field Exercises in...
ASC Model: A Process Model for the Evaluation of Simulated Field Exercises in...
 
Master the Essentials of Conversion Optimization
Master the Essentials of Conversion Optimization Master the Essentials of Conversion Optimization
Master the Essentials of Conversion Optimization
 
Rapid Miner
Rapid MinerRapid Miner
Rapid Miner
 
Enhance testing with monitoring and analytics
Enhance testing with monitoring and analyticsEnhance testing with monitoring and analytics
Enhance testing with monitoring and analytics
 
Practical Tools for Measurement Systems Analysis
Practical Tools for Measurement Systems AnalysisPractical Tools for Measurement Systems Analysis
Practical Tools for Measurement Systems Analysis
 

Recently uploaded

Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Subhajit Sahu
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
vikram sood
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
mzpolocfi
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
AnirbanRoy608946
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
u86oixdj
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 

Recently uploaded (20)

Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 

Model evaluation - machine learning

  • 1. Sonpvh – VNG 2017-july-11 LITERATURE REVIEW
  • 2. 1. The machine learning workflow 2. Evaluation Metrics 3. Offline Evaluation mechanisms 4. Hyperparameter turning 5. A/B Testing 6. Casual effect measurament 1
  • 3. 2 Why is it so complicated? 1. Offline evaluation: accuracy, precision, recall … Online evaluation: business metrics 2. Distribution drift: the distribution of data changes overtime, so keep track the models performance on the validation metrics of live data.
  • 4.  Ex: spam detection, prostitute detection… Confusion matrix Per-class accuracy 3 ROC: receiver operating characteristic AUC: area under the curve
  • 5.  Ex: search ranker, personalized recommendation .. 4 "The precision is the proportion of recommendations that are good recommendations, and recall is the proportion of good recommendations that appear in top recommendations."
  • 7.  Evaluation metrics # model log loss function: Train a personalized recommender by minimizing the loss between its predictions and observed ratings, and then use this recommender to produce a ranked list of recommendations. AVOID  Skewed data, imbalanced, classes, outliers, rare data: analysis carefully before doing anything else 6
  • 8. 1. The machine learning workflow 2. Evaluation Metrics 3. Offline Evaluation mechanisms 4. Hyperparameter turning 5. A/B Testing 6. Casual effect measurament 7
  • 9. 8
  • 10. 9 Cross validation: Independently and Identically distributed
  • 11. 1. The machine learning workflow 2. Evaluation Metrics 3. Offline Evaluation mechanisms 4. Hyperparameter turning 5. A/B Testing 6. Casual effect measurament 10
  • 12. 11  Model parameter: y = WT x  Hyper-parameter (nuisance parameters): optimization state.  Ex:  Linear regression: regularization parameter,  Decision trees: desired depth and number of leaves.  SVMs: misclassification penalty term  ..
  • 13. 1. The machine learning workflow 2. Evaluation Metrics 3. Offline Evaluation mechanisms 4. Hyperparameter turning 5. A/B Testing 6. Casual effect measurament 12
  • 14. 1. Split into randomized control/experimentation groups. 2. Observe behavior of both groups on the proposed methods. 3. Compute test statistics. 4. Compute p-value. 5. Output decision. 13
  • 15. 1. Baggage of the old: should do A/A testing first 2. Choose metrics, indexes (business design) 3. Did you count right? 4. How many observations do you need? 5. Is the distribution of the metric Gaussian? 6. Variances equal? 7. Multiple models, multiple hypotheses: A/A1/A2/…/B testing 8. How long to run the test? 9. Catching distribution drift: stationarity assumption 14
  • 16. 1. The machine learning workflow 2. Evaluation Metrics 3. Offline Evaluation mechanisms 4. Hyperparameter turning 5. A/B Testing 6. Casual effect measurament 15
  • 17.  D = E(Yi(1)|Ti=1)- E(Yi(0)|Ti=0) + [E(Yi(0)|T=1) – E(Yi(0) |Ti=1)] D: casual effect Y(1),Y(0): outcome T =1/0: treatment/control group E(): expectation function  D = ATE + B  ATE = E(Yi(1)|Ti=1)- E(Yi(0)|Ti=1) Average Treatment Effect  B = [E(Yi(0)|T=1) – E(Yi(0) |Ti=0)]  We cannot observe E(Yi(0)|T=1) => cannot evaluate B => cannot evaluate D. 16
  • 18. 17 TOT = E[Yi(1) – Yi(0)|Ti=1] = E[Yi(1)|Ti=1] - E[Yi(0)|Ti=1] Because of random sample, we have: E[Yi(0)|Ti=1] = E[Yi(0)|Ti=0] So E[Yi(1)|Ti=1] - E[Yi(0)|Ti=0] = D
  • 19. 18 1. (Conditional) independence 2. Common support 3. TOT = EP(X)|T=1 {E[Y(1)|T=1, P(X)] – E[Y(0)|T=0, P(X)]}
  • 20. 19
  • 21.  cov(T, Z) != 0  cov(Z, ε) = 0 (exclusion restriction)  Ti = γZi + φXi + ui , predict Ti  Yi = αXi + βTi + εi 20

Editor's Notes

  1. Rank score. Lift index