Practical Machine Learning
• Your model makes unacceptably large errors on new data. What to do next?
• Collect more training samples
• Reduce number of features
• Increase number of features
• Regularization
• Bigger Model
• Hyper-parameter tuning
Bias vs. Variance
[Figure: three plots of f(x) against x fitted to the same data points: High bias (underfit), “Just right”, High variance (overfit)]
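A minimal sketch of the two failure modes in the figure, in plain Python. The target function, noise level, and both toy models are assumptions chosen for illustration and are not taken from the slides: a constant predictor stands in for the underfit case, and a 1-nearest-neighbour predictor that memorizes the training set stands in for the overfit case.

```python
import random

random.seed(0)

def make_data(n):
    """Noisy samples of a hypothetical target f(x) = 2x + 1."""
    data = []
    for _ in range(n):
        x = random.uniform(0.0, 1.0)
        data.append((x, 2.0 * x + 1.0 + random.gauss(0.0, 0.3)))
    return data

def mse(predict, data):
    """Mean squared error of a prediction function on (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

train, val = make_data(30), make_data(30)

# High bias: ignore x and always predict the mean training label.
mean_y = sum(y for _, y in train) / len(train)

def constant_model(x):
    return mean_y

# High variance: memorize the training set and return the label
# of the closest training point.
def nearest_neighbour_model(x):
    return min(train, key=lambda point: abs(point[0] - x))[1]

const_train = mse(constant_model, train)
const_val = mse(constant_model, val)
nn_train = mse(nearest_neighbour_model, train)
nn_val = mse(nearest_neighbour_model, val)

print(f"constant model:    train={const_train:.3f}  val={const_val:.3f}")
print(f"nearest neighbour: train={nn_train:.3f}  val={nn_val:.3f}")
```

The memorizing model reaches zero training error because every training point is its own nearest neighbour, yet it still makes errors on the validation samples. The constant model cannot fit even the training data.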
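Bias vs. Variance – Machine Learning perspective
• Data splits: Training, Validation, Test
• Error rates to compare:
  • Optimal error rate (e.g. Bayes rate, best human error)
  • Training error
  • Validation error
• Bias: gap between the optimal error rate and the training error
• Variance: gap between the training error and the validation error
• Example 1: optimal 1%, training 5%, validation 6%
• Example 2: optimal 1%, training 2%, validation 6%
• Example 3: optimal 1%, training 5%, validation 10%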
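The gaps for the three example error profiles on this slide (optimal, training, validation: 1/5/6, 1/2/6, 1/5/10) can be worked out with a few lines of Python. This is a sketch: the function name `bias_variance_gaps` is hypothetical, and error rates are given in whole percentage points to keep the arithmetic exact.

```python
def bias_variance_gaps(optimal, training, validation):
    """Return (bias, variance) gaps in percentage points.

    bias     = training error minus optimal error
    variance = validation error minus training error
    """
    return training - optimal, validation - training

# (optimal, training, validation) error rates in percent, from the slide.
examples = [(1, 5, 6), (1, 2, 6), (1, 5, 10)]

for optimal, training, validation in examples:
    bias, variance = bias_variance_gaps(optimal, training, validation)
    print(f"optimal={optimal}% train={training}% val={validation}% "
          f"-> bias gap={bias}, variance gap={variance}")
```

The first profile is dominated by the bias gap, the second by the variance gap, and in the third both gaps are large.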
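Data from different distributions/domains
• Example datasets: 10-hour call-center speech, 50-hour conversational speech
• Splits: Training, Train-Val, Val, Test
• Optimal error rate (e.g. Bayes rate, best human error): 1%
• Training error: 5%
• Train-Val error: 6%
• Validation error: 10%
• Test error: 20%
• Gaps between consecutive error rates:
  • Bias: optimal error rate to training error
  • Variance: training error to Train-Val error
  • Train-Test mismatch: Train-Val error to validation error
  • Overfitting of Val: validation error to test error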
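As a worked example, the four gaps can be computed from the five error rates on this slide. The sketch below assumes the numbers map onto the rows in order (optimal 1%, training 5%, Train-Val 6%, validation 10%, test 20%); that pairing is a reading of the slide layout, and the function name `error_gaps` is hypothetical.

```python
def error_gaps(optimal, training, train_val, validation, test):
    """Return the gap between each pair of consecutive error rates,
    in percentage points, keyed by the label used on the slide."""
    return {
        "Bias": training - optimal,
        "Variance": train_val - training,
        "Train-Test mismatch": validation - train_val,
        "Overfitting of Val": test - validation,
    }

# Error rates in percent; the row assignment is an assumption.
gaps = error_gaps(optimal=1, training=5, train_val=6, validation=10, test=20)

for name, gap in gaps.items():
    print(f"{name}: {gap} points")

largest = max(gaps, key=gaps.get)
print("Largest gap:", largest)
```

Under that reading the largest gap is between validation and test error, which is the one the slide labels as overfitting of the validation set.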
Workflow (courtesy of Andrew Ng)
• Training error high?
  • Yes: Bigger model, Train longer, New model architecture
  • No: continue to the next check
• Train-Val error high?
  • Yes: More data, Regularization, New model architecture
  • No: continue to the next check
• Val error high?
  • Yes: More data similar to test, Data synthesis, New model architecture
  • No: continue to the next check
• Test error high?
  • Yes: More validation data
  • No: Done
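The workflow can be sketched as a small lookup that walks the checks in order and returns the remedies for the first error that counts as high. The slides do not define "high"; here it is assumed to mean that an error exceeds the previous one in the chain by more than a hypothetical `tolerance` (in percentage points). The names `WORKFLOW` and `next_steps` are also illustrative.

```python
# Checks in the order given on the slide, with the suggested remedies.
WORKFLOW = [
    ("training", ["Bigger model", "Train longer", "New model architecture"]),
    ("train_val", ["More data", "Regularization", "New model architecture"]),
    ("val", ["More data similar to test", "Data synthesis",
             "New model architecture"]),
    ("test", ["More validation data"]),
]

def next_steps(errors, tolerance=2.0):
    """Return (failing_check, remedies) for the first check that fails.

    `errors` maps "optimal", "training", "train_val", "val" and "test"
    to error rates in percent. An error is treated as high when it
    exceeds the previous error in the chain by more than `tolerance`
    (an assumption, not a rule from the slides).
    """
    previous = errors["optimal"]
    for name, remedies in WORKFLOW:
        if errors[name] - previous > tolerance:
            return name, remedies
        previous = errors[name]
    return None, ["Done"]

check, remedies = next_steps(
    {"optimal": 1, "training": 5, "train_val": 6, "val": 10, "test": 20}
)
print(check, remedies)
```

With a tolerance of 2 points the example stops at the first check, because the training error is 4 points above the optimal error rate.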
Learning curves
[Figure: three plots of training and validation error against the amount of training data]
• High bias: getting more data likely doesn’t help much
• High variance: getting more data is likely to help
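A learning curve can be produced by training on growing subsets and recording both errors each time. The sketch below does this in plain Python for a deliberately underfit model (a constant predictor on linear data), so it illustrates the high-bias case only. The data generator, subset sizes, and model are assumptions made for the example.

```python
import random

random.seed(1)

def sample(n):
    """Noisy samples of a hypothetical target f(x) = 2x + 1."""
    data = []
    for _ in range(n):
        x = random.uniform(0.0, 1.0)
        data.append((x, 2.0 * x + 1.0 + random.gauss(0.0, 0.3)))
    return data

def mse(prediction, data):
    """Mean squared error of a constant prediction on (x, y) pairs."""
    return sum((prediction - y) ** 2 for _, y in data) / len(data)

pool = sample(320)   # training pool, used in growing prefixes
val = sample(500)    # fixed validation set

curve = []
for n in (5, 20, 80, 320):
    subset = pool[:n]
    # "Training" the constant model means taking the mean label.
    mean_y = sum(y for _, y in subset) / n
    curve.append((n, mse(mean_y, subset), mse(mean_y, val)))

for n, train_err, val_err in curve:
    print(f"n={n:4d}  train={train_err:.3f}  val={val_err:.3f}")
```

For this model the two errors end up close to each other while both stay well above the noise level, which is the pattern the slide associates with high bias: adding samples narrows the gap but cannot lower the plateau.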
Working with imbalanced datasets
• Change your performance metric (e.g. F1 score instead of Accuracy)
• Customize objective function
• Data:
  • Oversampling/Undersampling
  • Synthesize minority class (e.g. SMOTE)
  • Buy more data
• Algorithms:
  • Bagging
  • New/Other models
  • Different perspective, e.g. anomaly detection
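To see why the metric matters, the sketch below scores a classifier that always predicts the majority class on a hypothetical 5%-positive dataset, then balances the classes by random oversampling. All names (`accuracy`, `f1`, `oversample`) and the toy data are illustrative assumptions. The oversampler simply duplicates minority rows; it is not SMOTE.

```python
import random

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1(y_true, y_pred, positive=1):
    """F1 score for the positive class; 0.0 when there are no true positives."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def oversample(rows, labels, minority=1, seed=0):
    """Duplicate random minority rows until both classes are the same size."""
    rng = random.Random(seed)
    minority_rows = [r for r, l in zip(rows, labels) if l == minority]
    majority_count = sum(l != minority for l in labels)
    extra = [rng.choice(minority_rows)
             for _ in range(majority_count - len(minority_rows))]
    return rows + extra, labels + [minority] * len(extra)

# Hypothetical dataset: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
always_negative = [0] * 100

print("accuracy:", accuracy(y_true, always_negative))
print("F1:", f1(y_true, always_negative))

rows = list(range(100))  # stand-in feature rows
balanced_rows, balanced_labels = oversample(rows, y_true)
print("positives after oversampling:", balanced_labels.count(1))
```

The always-negative classifier looks strong on accuracy but scores zero on F1, since it never finds a positive example.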
Dirty work drives progress
