Introducing
OptiML
BigML Release: OptiML
BigML, Inc 2OptiML Release Webinar
OptiML Release
Speaker: CHARLES PARKER, PH.D. - VP of Machine Learning Algorithms
Moderator: ATAKAN CETINSOY - VP of Predictive Applications
Questions: Please enter questions into the chat box. We will answer some via chat and others at the end of the session.
Resources: https://bigml.com/releases
Contact: support@bigml.com
Twitter: @bigmlcom
Parameter Optimization
• There are lots of algorithms and lots of parameters
• We don’t have time to try even close to everything
• If only we had a way to make a prediction . . .
Did I hear someone say
Machine Learning?
The Allure of ML
“Why don’t we just use
Machine Learning to predict
the quality of a set of
modeling parameters before
we train a model on them?”
— Every first year ML grad student ever
In This Webinar
• Technology Overview
• Metric Selection
• The Dangers of Naive Cross-validation
• Selecting the “Best” Model
• Caveat Emptor!
Bayesian Parameter Optimization
• The performance of an ML algorithm (with associated parameters) is
data dependent
• So: Learn from your previous attempts
• Train a model, then evaluate it
• After you’ve done a number of evaluations, learn a regression
model to predict the performance of future, as-yet-untrained
models
• Use this regression model to choose a promising set of “next models” to
evaluate
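The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BigML's implementation: `true_score` is a hypothetical stand-in for "train a model, then evaluate it", the two parameters (a depth and a rate) are invented, and a simple nearest-neighbour average plays the role of the regression model. A full Bayesian optimizer would also weigh the uncertainty of each prediction when choosing the next candidate; this sketch always picks the highest predicted score.

```python
import random

def true_score(params):
    """Hypothetical stand-in for 'train a model, then evaluate it'."""
    depth, rate = params
    return 1.0 - (depth - 6) ** 2 / 100.0 - (rate - 0.3) ** 2

def predict_score(history, params, k=3):
    """Surrogate regression: mean score of the k nearest evaluated points."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(history, key=lambda h: dist(h[0], params))[:k]
    return sum(score for _, score in nearest) / len(nearest)

def optimize(candidates, n_initial=4, n_rounds=6, seed=0):
    """Evaluate a few random candidates, then let the surrogate pick the rest."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)
    # Seed the history with a handful of real evaluations.
    history = [(p, true_score(p)) for p in pool[:n_initial]]
    pool = pool[n_initial:]
    for _ in range(n_rounds):
        if not pool:
            break
        # Choose the untried candidate with the highest predicted score.
        best = max(pool, key=lambda p: predict_score(history, p))
        pool.remove(best)
        history.append((best, true_score(best)))
    # Return the best (params, score) pair that was actually evaluated.
    return max(history, key=lambda h: h[1])

candidates = [(d, r) for d in range(1, 11) for r in (0.1, 0.3, 0.5)]
best_params, best_score = optimize(candidates)
```

Only ten of the thirty candidates are ever trained here; the surrogate decides which ones are worth the cost.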
Bayesian Parameter Optimization
[Figure, built up over several slides: six candidate parameter sets (Parameters 1 through Parameters 6) feed a “Model and Evaluate” step, which returns evaluation scores one at a time (0.75, then 0.56, then 0.92). The scored candidates then train a Machine Learning model mapping parameters ⟶ performance.]
Some Other Tricks
• Use metalearning to select a good set of initial candidates
• Cross-validation is expensive, and there’s no reason to do it for models
with terrible performance; stop early in these cases
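A minimal sketch of the early-stopping idea, assuming a hypothetical rule: abandon a candidate once its running mean score drops below a fixed `floor` after at least `min_folds` folds. The slides do not say how OptiML decides when to stop, so the rule, the function names, and the threshold are all illustrative.

```python
def cross_validate_with_early_stop(fold_scores, floor, min_folds=2):
    """Average per-fold scores, stopping once the running mean is below `floor`.

    `fold_scores` is any iterable yielding one evaluation score per fold; in
    practice each item would come from training and evaluating on that fold.
    Returns (mean_score, folds_used, stopped_early).
    """
    total = 0.0
    used = 0
    for score in fold_scores:
        total += score
        used += 1
        if used >= min_folds and total / used < floor:
            return total / used, used, True
    if used == 0:
        raise ValueError("no folds were evaluated")
    return total / used, used, False

def lazy_folds(scores, log):
    """Generator that records which folds were actually evaluated."""
    for index, score in enumerate(scores):
        log.append(index)
        yield score

evaluated = []
poor = cross_validate_with_early_stop(
    lazy_folds([0.4, 0.5, 0.9, 0.9, 0.9], evaluated), floor=0.6)
good = cross_validate_with_early_stop([0.8, 0.9, 0.85], floor=0.6)
```

Because the folds are consumed lazily, the poor candidate pays for two folds instead of five, while the good candidate is evaluated on every fold.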
A Metric Selection Flowchart
[Flowchart. Decision questions, each answered YES or NO:
• Will you bother about threshold setting?
• Is yours a “ranking” problem?
• Is your dataset imbalanced?
• Do you care more about the top-ranked instances?
Metrics at the leaves: Max. Phi, KS-statistic, Area Under the ROC / PR curve, Kendall’s Tau, Spearman’s Rho, Accuracy, Phi coefficient, f-measure]
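To see why the imbalance question matters, three of the threshold-based metrics named in the flowchart can be computed from a confusion matrix. This is a sketch using the standard textbook definitions (the f-measure is taken with beta = 1, and the phi coefficient is the same quantity as the Matthews correlation coefficient); the function name and the example counts are illustrative and do not come from the webinar.

```python
import math

def classification_metrics(tp, fp, fn, tn):
    """Accuracy, F1 and phi from confusion-matrix counts (at least one row)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    # Phi is undefined when any margin is zero; report 0.0 in that case.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    phi = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f_measure": f_measure, "phi": phi}

# 10 positives among 1000 instances, and a model that always says "negative".
always_negative = classification_metrics(tp=0, fp=0, fn=10, tn=990)
# Same data, but the model now finds 8 of the 10 positives with 5 false alarms.
useful = classification_metrics(tp=8, fp=5, fn=2, tn=985)
```

The always-negative model scores 0.99 accuracy while its f-measure and phi are both 0, which is the kind of gap that makes accuracy a poor choice on imbalanced data.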
Ranking Problems
Medical Diagnosis (no) vs. Stock Picking (yes)
Top-heavy Importance
Draft-Style Selections (no) vs. Customer Churn (yes)
Is Cross-Validation Right for You?
• Cross-validation is a good tool some
of the time
• Many other times, it is disastrously bad
• Overly optimistic
• False confidence in results
• This is why we offer the option for a
specific holdout set
Case #1: Market Direction
• Suppose you want to predict the direction of the stock market
• You have information for that market for each minute of each day
• But adjacent minutes are correlated in both the input and objective fields
• So if you have the answer for one minute, you can trivially predict the rest!
• Cross-validation will tell you your classifier is near-perfect!
[Figure labels: All Negative, All Positive, Close of Day]
Case #2: Photo Age Prediction
• Suppose you want to predict the age of a printed photograph (based on dye-fade,
paper watermarks, the presence and type of border, etc.)
• Your training set: A few thousand photos from a few dozen people
• But the ages of one person’s photos are correlated in both the input and output
spaces! (same age, camera, storage conditions, etc.)
• So you can trivially do well predicting the age of some of one person’s photos if
you know the ages of the rest
• Cross-validation will tell you your model is near-perfect!
Take Care!
• These situations are very common
whenever data comes in batches
(days, users, etc.)
• The solution is to hold out whole batches
of data (e.g., a specific test set) rather
than just random points from each one
(as in cross-validation)
• It’s possible that it isn’t a problem in your
dataset, but when in doubt, try both!
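The difference between the two ways of holding out data can be sketched on synthetic data. Everything here is an illustrative assumption: the data generator, the single-feature 1-nearest-neighbour classifier, and the split helpers are invented for the example and are not BigML APIs. Rows in a batch share a label and have nearly identical features, mimicking minutes within one trading day or photos from one person.

```python
import random

def make_batched_data(n_batches=20, per_batch=30, seed=0):
    """Synthetic rows of (batch_id, feature, label), one label per batch."""
    rng = random.Random(seed)
    rows = []
    for batch in range(n_batches):
        label = rng.choice([0, 1])
        for _ in range(per_batch):
            rows.append((batch, batch + rng.uniform(-0.1, 0.1), label))
    return rows

def one_nn_accuracy(train, test):
    """Accuracy of a 1-nearest-neighbour classifier on the single feature."""
    correct = 0
    for _, x, y in test:
        _, _, guess = min(train, key=lambda row: abs(row[1] - x))
        correct += guess == y
    return correct / len(test)

def random_split(rows, test_fraction=0.25, seed=1):
    """Hold out random points, as naive cross-validation does."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * test_fraction)
    return shuffled[cut:], shuffled[:cut]

def batch_split(rows, test_fraction=0.25, seed=1):
    """Hold out whole batches so no batch appears on both sides."""
    batches = sorted({row[0] for row in rows})
    random.Random(seed).shuffle(batches)
    held_out = set(batches[:max(1, int(len(batches) * test_fraction))])
    train = [r for r in rows if r[0] not in held_out]
    test = [r for r in rows if r[0] in held_out]
    return train, test

rows = make_batched_data()
random_acc = one_nn_accuracy(*random_split(rows))
batch_train, batch_test = batch_split(rows)
batch_acc = one_nn_accuracy(batch_train, batch_test)
```

With random points held out, every test row has a same-batch neighbour in the training set, so the estimate looks perfect. With whole batches held out, the score depends only on whether a neighbouring batch happens to share the label, which carries no real signal in this synthetic setup.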
Which Model is Best?
• Performance isn’t the only issue!
• Retraining: Will the amount of data you have be different in the future?
• Fit stability: How confident must you be that the model’s behavior is invariant
to small data changes?
• Prediction speed: The difference can be orders of magnitude
Modeling Tradeoffs
Simple (Logistic) vs. Complex (Deepnets)
• Interpretability vs. Representability
• Weak vs. Slow
• Confidence vs. Performance
• Biased vs. Data-hungry
Mo’ Problems
• Model selection tends to take a lot of
data, and the more accurate you
want the search to be, the more data
you need.
• We had to define a search space that
would suit “most” datasets. It’s
possible that the right model for your
data isn’t in there!
Learn More
https://bigml.com/releases/winter-2018
https://bigml.com/whatsnew
Questions?
Twitter: @bigmlcom
Contact: support@bigml.com

꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 

BigML Release: OptiML

  • 2. BigML, Inc. OptiML Release Webinar. OptiML Release. Presenters (speaker and moderator): Charles Parker, Ph.D., VP of Machine Learning Algorithms; Atakan Cetinsoy, VP of Predictive Applications. Questions: please enter questions into the chat box; we will answer some via chat and others at the end of the session. Resources: https://bigml.com/releases. Contact: support@bigml.com. Twitter: @bigmlcom
  • 3. Parameter Optimization: • There are lots of algorithms and lots of parameters • We don’t have time to try even close to everything • If only we had a way to make a prediction . . . Did I hear someone say Machine Learning?
  • 4. The Allure of ML: “Why don’t we just use Machine Learning to predict the quality of a set of modeling parameters before we train a model on them?” (Every first-year ML grad student ever)
  • 5. In This Webinar: • Technology Overview • Metric Selection • The Dangers of Naive Cross-validation • Selecting the “Best” Model • Caveat Emptor!
  • 6. In This Webinar: • Technology Overview • Metric Selection • The Dangers of Naive Cross-validation • Selecting the “Best” Model • Caveat Emptor!
  • 7. Bayesian Parameter Optimization: • The performance of an ML algorithm (with its associated parameters) is data dependent • So: learn from your previous attempts • Train a model, then evaluate it • After you’ve done a number of evaluations, learn a regression model to predict the performance of future, as-yet-untrained models • Use this regression model to choose a promising set of “next models” to evaluate
  • 8–14. Bayesian Parameter Optimization [animated diagram, built up over seven slides: candidate settings “Parameters 1” through “Parameters 6” feed a “Model and Evaluate” step; evaluation scores 0.75, 0.56, and 0.92 appear one at a time; a final box reads “Machine Learning! parameters ⟶ performance”]
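The loop described on the Bayesian Parameter Optimization slides (evaluate a few settings, learn a model from parameters to performance, use it to pick what to try next) can be sketched as below. This is a minimal illustration under stated assumptions, not BigML's implementation: `evaluate` is a toy stand-in for "train a model, then evaluate it", the two parameters (`depth`, `rate`) are hypothetical, and the surrogate is a plain nearest-neighbour average with a greedy pick. Bayesian optimization proper typically uses a probabilistic surrogate and an acquisition function that also rewards uncertainty.

```python
import random

def evaluate(params):
    """Toy stand-in for 'train a model, then evaluate it' (hypothetical score)."""
    depth, rate = params
    return 1.0 - ((depth - 6) / 10) ** 2 - (rate - 0.3) ** 2

def predict_score(history, params, k=3):
    """Surrogate: mean score of the k already-evaluated settings nearest to params."""
    def dist(p, q):
        return ((p[0] - q[0]) / 10) ** 2 + (p[1] - q[1]) ** 2
    nearest = sorted(history, key=lambda h: dist(h[0], params))[:k]
    return sum(score for _, score in nearest) / len(nearest)

def search(n_initial=5, n_rounds=10, n_candidates=50, seed=0):
    rng = random.Random(seed)
    sample = lambda: (rng.randint(1, 10), rng.uniform(0.0, 1.0))
    # Start with a handful of real evaluations.
    history = [(p, evaluate(p)) for p in (sample() for _ in range(n_initial))]
    for _ in range(n_rounds):
        # Score many untrained candidates with the cheap surrogate...
        candidates = [sample() for _ in range(n_candidates)]
        promising = max(candidates, key=lambda p: predict_score(history, p))
        # ...and spend a real evaluation only on the most promising one.
        history.append((promising, evaluate(promising)))
    return max(history, key=lambda h: h[1])

best_params, best_score = search()
```

Each round costs one real evaluation plus `n_candidates` cheap surrogate predictions, which is the trade the slides describe: the regression model is inexpensive to query, while training and evaluating a model is not.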
  • 15. Some Other Tricks: • Use metalearning to select a good set of initial candidates • Cross-validation is expensive, and there’s no reason to do it for models with terrible performance; stop early in these cases
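The early-stopping trick can be sketched as follows. The stopping rule here (abandon a candidate once its running mean falls more than `margin` below the best score seen so far, after at least `min_folds` folds) is an assumption chosen for illustration; the slides do not say which rule OptiML applies. All names are hypothetical.

```python
def cross_validate_with_early_stop(fold_scores, best_so_far, margin=0.1, min_folds=2):
    """Consume per-fold scores lazily; give up on clearly poor candidates.

    fold_scores: iterable yielding one evaluation score per fold, so folds
                 that are never requested are never trained.
    Returns (mean_score, folds_used, stopped_early).
    """
    seen = []
    for score in fold_scores:
        seen.append(score)
        mean = sum(seen) / len(seen)
        if len(seen) >= min_folds and mean < best_so_far - margin:
            return mean, len(seen), True  # not worth finishing the remaining folds
    if not seen:
        raise ValueError("fold_scores yielded no folds")
    return sum(seen) / len(seen), len(seen), False

# A weak candidate is abandoned after two folds instead of five.
weak = cross_validate_with_early_stop(iter([0.50, 0.52, 0.90, 0.90, 0.90]), best_so_far=0.9)
# A competitive candidate runs every fold.
strong = cross_validate_with_early_stop(iter([0.88, 0.91, 0.90]), best_so_far=0.9)
```

Passing a generator that trains one fold per `next()` call is what makes the saving real: folds after the stopping point are never computed.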
  • 16. In This Webinar: • Technology Overview • Metric Selection • The Dangers of Naive Cross-validation • Selecting the “Best” Model • Caveat Emptor!
  • 17. A Metric Selection Flowchart [flowchart]. Decision questions (each answered YES or NO): Will you bother about threshold setting? Is yours a “ranking” problem? Is your dataset imbalanced? Do you care more about the top-ranked instances? Metrics at the leaves: Max. Phi; KS-statistic; Area Under the ROC / PR curve; Kendall’s Tau; Spearman’s Rho; Accuracy; Phi coefficient; f-measure
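The flowchart lists both accuracy and the phi coefficient and asks whether the dataset is imbalanced. A minimal sketch of why that question matters, using the standard definitions of both metrics on a binary confusion matrix; the counts are invented for illustration, and returning 0 when the phi denominator vanishes is a common convention assumed here.

```python
import math

def accuracy(tp, fp, fn, tn):
    """Fraction of all instances classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

def phi_coefficient(tp, fp, fn, tn):
    """Phi (Matthews correlation) coefficient; 0.0 when a row or column is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Invented example: 1,000 instances, only 10 of them positive.
always_negative = dict(tp=0, fp=0, fn=10, tn=990)   # never predicts the rare class
finds_half = dict(tp=5, fp=5, fn=5, tn=985)         # catches half of the positives

acc_a, phi_a = accuracy(**always_negative), phi_coefficient(**always_negative)
acc_b, phi_b = accuracy(**finds_half), phi_coefficient(**finds_half)
```

Both classifiers reach 99% accuracy, yet only the second has any skill on the rare class; phi separates them (0 versus roughly 0.49) where accuracy cannot.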
  • 18. Ranking Problems: Medical Diagnosis (no) vs. Stock Picking (yes)
  • 19. Top-heavy Importance: Draft-Style Selections (no) vs. Customer Churn (yes)
  • 20. In This Webinar: • Technology Overview • Metric Selection • The Dangers of Naive Cross-validation • Selecting the “Best” Model • Caveat Emptor!
  • 21. Is Cross-Validation Right for You? • Cross-validation is a good tool some of the time • Many other times, it is disastrously bad • Overly optimistic • False confidence in results • This is why we offer the option for a specific holdout set
  • 22. Case #1: Market Direction • Suppose you want to predict the direction of the stock market • You have information for that market for each minute of each day • But minutes next to each other are correlated in both the input and objective fields • So if you have the answer for one minute, you can trivially predict the rest! • Cross-validation will tell you your classifier is near-perfect! [figure labels: All Negative, All Positive, Close of Day]
  • 23. Case #2: Photo Age Prediction • Suppose you want to predict the age of a printed photograph (based on dye fade, paper watermarks, the presence and type of border, etc.) • Your training set: a few thousand photos from a few dozen people • But the ages of one person’s photos are correlated in both the input and output spaces! (same age, camera, storage conditions, etc.) • So you can trivially do well predicting the ages of some of one person’s photos if you know the ages of the rest • Cross-validation will tell you your classifier is near-perfect!
  • 24. Take Care! • These situations are very common whenever data comes in batches (days, users, etc.) • The solution is to hold out whole batches of data (e.g., a specific test set) rather than just random points from each one (as in cross-validation) • It’s possible that this isn’t a problem in your dataset, but when in doubt, try both!
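The difference between holding out random points and holding out whole batches can be sketched on toy data. Everything below is invented for illustration: each batch has near-identical features and a single shared label, and the labels alternate between neighbouring batches so that the effect is as stark as possible. The 1-nearest-neighbour classifier is a stand-in for any model flexible enough to memorise batch identity.

```python
def nearest_neighbor_accuracy(train, test):
    """Accuracy of a 1-nearest-neighbour classifier on (feature, label) pairs."""
    correct = 0
    for x, label in test:
        _, predicted = min(train, key=lambda row: abs(row[0] - x))
        correct += predicted == label
    return correct / len(test)

# 12 batches (think: days, or people) of 5 rows each: (batch, row index, feature, label).
rows = [(batch, i, batch + (i - 2) * 0.02, batch % 2)
        for batch in range(12) for i in range(5)]

# Point-wise holdout, as in naive cross-validation: one row from every batch.
row_train = [(x, y) for b, i, x, y in rows if i != 0]
row_test = [(x, y) for b, i, x, y in rows if i == 0]

# Batch-wise holdout: every fourth batch is withheld in its entirety.
batch_train = [(x, y) for b, i, x, y in rows if b % 4 != 0]
batch_test = [(x, y) for b, i, x, y in rows if b % 4 == 0]

row_accuracy = nearest_neighbor_accuracy(row_train, row_test)        # 1.0
batch_accuracy = nearest_neighbor_accuracy(batch_train, batch_test)  # 0.0
```

With point-wise holdout every test row has a sibling from its own batch in the training set, so the score is perfect. Once whole batches are withheld, the same classifier has learned nothing that transfers, which is the situation the two case studies describe.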
  • 25. In This Webinar: • Technology Overview • Metric Selection • The Dangers of Naive Cross-validation • Selecting the “Best” Model • Caveat Emptor!
  • 26. Which Model is Best? • Performance isn’t the only issue! • Retraining: Will the amount of data you have be different in the future? • Fit stability: How confident must you be that the model’s behavior is invariant to small data changes? • Prediction speed: The difference can be orders of magnitude
  • 27. Modeling Tradeoffs, Simple (Logistic) vs. Complex (Deepnets): • Interpretability vs. Representability • Weak vs. Slow • Confidence vs. Performance • Biased vs. Data-hungry
  • 28. In This Webinar: • Technology Overview • Metric Selection • The Dangers of Naive Cross-validation • Selecting the “Best” Model • Caveat Emptor!
  • 29. Mo’ Problems: • Model selection tends to take a lot of data, and the more accurate you want the search to be, the more data you need. • We had to define a search space that would suit “most” datasets. It’s possible that the right model for your data isn’t in there!
  • 30. Learn More: https://bigml.com/releases/winter-2018 and https://bigml.com/whatsnew