Machine Learning - Black Art
Charles Parker
Allston Trading
Machine Learning is Hard!
• By now, you know kind of a lot

• Different types of models

• Feature engineering

• Ways to evaluate

• But you’ll still fail!

• Out in the real world, there’s a
whole bunch of things that will kill
your project

• FYI - A lot of these talks are stolen
Join Me!
• On a journey into the Machine Learning House of
Horrors!

• Mwa ha ha!
The Machine Learning House of Horrors!
• The Horror of The Huge Hypothesis Space

• The Perils of The Poorly Picked Loss Function

• The Creeping Creature Called Cross Validation

• The Dread of the Drifting Domain

• The Repugnance of Reliance on Research Results
Choosing A Hypothesis Space
• By “hypothesis space” we
mean the possible classifiers
you could build with an
algorithm given the data

• This is the choice you make
when you pick a learning
algorithm

• You have one job!

• Is there any way to make it
easier?
Theory to The Rescue!
• Probably Approximately Correct

• We’d like our model to have error less than epsilon

• We’d like that to happen at least some percentage of the time

• If the error is epsilon, the percentage is sigma, the number of
training examples is m, and the hypothesis space size is d:
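One standard textbook form of this bound, assuming a finite hypothesis space and a learner that fits the training data perfectly (assumptions not stated on the slide), is m ≥ (1/epsilon) · (ln d + ln(1 / (1 − sigma))), reading sigma as the desired success probability. A minimal sketch; the function name `pac_sample_size` is chosen here for illustration:

```python
import math


def pac_sample_size(epsilon: float, sigma: float, d: int) -> int:
    """Number of training examples sufficient for error <= epsilon
    with probability >= sigma.

    Assumes a finite hypothesis space of size d and a consistent learner;
    real problems rarely satisfy both, so treat the result as a rough guide.
    """
    if not (0 < epsilon < 1 and 0 < sigma < 1 and d >= 1):
        raise ValueError("need 0 < epsilon < 1, 0 < sigma < 1, d >= 1")
    m = (math.log(d) + math.log(1.0 / (1.0 - sigma))) / epsilon
    return math.ceil(m)


# A larger hypothesis space or a tighter error target demands more data.
print(pac_sample_size(0.1, 0.95, 1000))
print(pac_sample_size(0.1, 0.95, 10**6))
print(pac_sample_size(0.01, 0.95, 1000))
```

Because d enters only through its logarithm, the data requirement grows slowly with the hypothesis space but linearly with 1/epsilon.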
The Triple Trade-Off
• There is a triple trade-off between the error, the size
of the hypothesis space, and the amount of training
data you have
[Figure: Error, Hypothesis Space, Training Data]
What About Huge Data?
• I’m clever, so I’ll use non-parametric
methods (decision trees, k-NN,
kernelized SVMs)

• As data scales, curious things
tend to happen

• Simpler models become more
desirable as they’re faster to fit.

• You can increase model
complexity by adding features
(maybe word counts)

• Big data often trumps modeling!
A Dirty Little Secret About ML Algorithms
• They don’t care what you want

• Decision Trees:

• SVM:

• LR:

• LDA:
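As an illustration, here are the standard textbook objectives usually associated with three of these algorithms (assumed here, not taken from the slide): entropy-based splitting for decision trees, hinge loss for SVMs, and log loss for logistic regression. None of them knows anything about what a mistake costs in your application.

```python
import math


def entropy(labels):
    """Shannon entropy in bits of a list of class labels.

    Decision trees typically choose the split that reduces this the most.
    """
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def hinge_loss(y, score):
    """SVM hinge loss for a label y in {-1, +1} and a real-valued score."""
    return max(0.0, 1.0 - y * score)


def log_loss(y, p):
    """Logistic regression log loss for y in {0, 1} and predicted P(y = 1) = p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


# Both error types are penalized identically: a confident miss on a positive
# costs the same as an equally confident miss on a negative.
print(log_loss(1, 0.2), log_loss(0, 0.8))
print(hinge_loss(+1, -0.5), hinge_loss(-1, +0.5))
```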
Real-world Losses
• Real losses are nothing like this

• False positive in disease
diagnosis

• False positive in face
detection

• False positive in thumbprint
identification

• Some aren’t even instance-
based

• Path dependencies

• Game playing
Specializing Your Loss
• One solution is to let developers apply their own loss

• This is the approach of SVM light: 

http://svmlight.joachims.org/

It’s been around for a while

• Losses other than Mutual Information can be plugged into the appropriate
place in splitting code

• Models trained via gradient descent can obviously be customized (Python’s
Theano is interesting for this)

• In the case of multi-example loss functions, we have SEARN in Vowpal Wabbit

https://github.com/JohnLangford/vowpal_wabbit
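To make the "plug a different loss into the splitting code" idea concrete, here is a minimal sketch of a one-feature split search whose impurity function is a parameter. The names `cost_impurity` and `best_threshold` are hypothetical and do not come from any of the packages above; the cost-based impurity is one possible choice, assumed for illustration.

```python
import math
from typing import Callable, Optional, Sequence

Impurity = Callable[[Sequence[int]], float]


def cost_impurity(fp_cost: float, fn_cost: float) -> Impurity:
    """Impurity = per-example cost of the cheapest constant prediction for a node."""
    def impurity(labels: Sequence[int]) -> float:
        if not labels:
            return 0.0
        p = sum(labels) / len(labels)  # fraction of positives (labels are 0/1)
        # Predicting 1 pays fp_cost on negatives; predicting 0 pays fn_cost on positives.
        return min((1 - p) * fp_cost, p * fn_cost)
    return impurity


def best_threshold(xs: Sequence[float], ys: Sequence[int],
                   impurity: Impurity) -> Optional[float]:
    """Return the cut on a single feature that minimizes weighted child impurity."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best_score, best_cut = math.inf, None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if score < best_score:
            best_score = score
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_cut


xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 1, 0, 1, 1]
print(best_threshold(xs, ys, cost_impurity(fp_cost=1, fn_cost=10)))
print(best_threshold(xs, ys, cost_impurity(fp_cost=10, fn_cost=1)))
```

On this toy data the same search picks a lower cut when false negatives are expensive and a higher one when false positives are, which is the behavior a fixed built-in criterion cannot give you.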
Other Hackery
• Sometimes, the solution is just to hack
around the actual prediction

• Use a cascade (several levels) of
classifiers, e.g., in medical diagnosis or
text recognition

• Apply logic to explicitly avoid high loss
cases (e.g., when buying/selling equities)

• Changing the problem setting

• Will you be doing queries? Use ranking
or metric learning

• If you think “I want to do crazy thing x
with classifiers”, chances are it’s already
been done and you can read about it.
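One simple form of "hacking around the prediction" is to wrap the classifier's probability in an explicit expected-cost rule that can also defer to a later stage (a second classifier, a human, or simply not trading). This is a sketch under assumed costs; the function name `decide` and the cost values are hypothetical.

```python
def decide(p_positive: float, fp_cost: float, fn_cost: float,
           defer_cost: float) -> str:
    """Return 'positive', 'negative', or 'defer', whichever has the lowest expected cost.

    p_positive is the model's estimated probability of the positive class.
    defer_cost is what it costs to pass the case on to the next stage.
    """
    expected = {
        "positive": (1 - p_positive) * fp_cost,  # pay fp_cost if truly negative
        "negative": p_positive * fn_cost,        # pay fn_cost if truly positive
        "defer": defer_cost,
    }
    return min(expected, key=expected.get)


# A 5% chance of disease still triggers a positive call when misses are very costly.
print(decide(0.05, fp_cost=1, fn_cost=100, defer_cost=2))
# A coin-flip prediction with symmetric high costs is handed to the next stage.
print(decide(0.5, fp_cost=10, fn_cost=10, defer_cost=1))
```

The rule is only as good as the probabilities fed into it, so it assumes the classifier's outputs are reasonably calibrated.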
When Validation Attacks!
• Cross validation

• n-Fold - Hold out one fold for
testing, train on n - 1 folds

• Great way to measure
performance, right?

• It’s all about information leakage

• via instances

• via features
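For reference, plain n-fold splitting can be sketched in a few lines. This version uses contiguous, unshuffled folds and a hypothetical function name `n_fold_splits`; it is the baseline that the leakage problems above quietly break.

```python
def n_fold_splits(num_examples: int, n: int):
    """Yield (train_indices, test_indices) for each of n contiguous folds.

    Each example is held out exactly once; no shuffling is applied.
    """
    if not 2 <= n <= num_examples:
        raise ValueError("need 2 <= n <= num_examples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [num_examples // n + (1 if i < num_examples % n else 0) for i in range(n)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, num_examples))
        yield train, test
        start += size


for train, test in n_fold_splits(10, 3):
    print(test)
```

The split guarantees only that row indices do not overlap. It says nothing about whether a training row carries information about a test row, which is where leakage comes in.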
Case Study #1: Law of Averages
• Estimate sporting event
outcomes

• Use previous games to
estimate points scored for
each team (via windowing
transform)

• Choose winner based on
predicted score

• What if you’re off by one on
the window?
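The off-by-one is easy to reproduce. In this sketch (hypothetical function name and toy scores), the correct feature averages only games before the one being predicted, while the buggy variant lets the window include the game itself, so the target leaks into its own feature.

```python
def windowed_mean(scores, window, include_current=False):
    """Mean of `window` consecutive scores ending just before each game.

    Returns None where there is not enough history.
    include_current=True reproduces the off-by-one bug: the window then ends
    on the game being predicted, leaking its score into the feature.
    """
    features = []
    for i in range(len(scores)):
        end = i + 1 if include_current else i
        start = end - window
        if start < 0:
            features.append(None)
        else:
            features.append(sum(scores[start:end]) / window)
    return features


scores = [10, 20, 30, 40]
print(windowed_mean(scores, window=2))                        # uses past games only
print(windowed_mean(scores, window=2, include_current=True))  # leaks the target
```

A model trained on the leaky feature will look excellent in cross validation and fail on the first game whose score is not yet known.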
Case Study #2: Photo Dating
• Take scanned photos from
30 different users (on
average 200 per user) and
create a model to assign a
date taken (plus or minus
five years)

• Perform 10-fold cross
validation

• Accuracy is 85%. Can
you trust it?
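One likely reason for doubt (a reading of the question, not something the slide states): with ordinary folds, photos from the same user land in both training and test sets, so the model can score well by recognizing the user's scanner or album rather than the date. A sketch of splitting by user instead, with a hypothetical function name `group_folds`:

```python
def group_folds(groups, n):
    """Yield (train_indices, test_indices) so that no group is on both sides.

    groups[i] is the group (here, the user) that example i belongs to.
    """
    unique = sorted(set(groups))
    if not 2 <= n <= len(unique):
        raise ValueError("need 2 <= n <= number of distinct groups")
    for fold in range(n):
        held_out = set(unique[fold::n])  # every n-th group goes to this fold
        test = [i for i, g in enumerate(groups) if g in held_out]
        train = [i for i, g in enumerate(groups) if g not in held_out]
        yield train, test


users = ["ann", "ann", "bob", "bob", "cat", "cat"]
for train, test in group_folds(users, 3):
    print(sorted({users[i] for i in test}))
```

If accuracy drops sharply under this split, the original 85% was measuring how well the model knows these 30 users, not how well it dates photos from a new one.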
Case Study #3: Moments In Time
• You have a buy/sell
opportunity every five
seconds

• The signals you use to
evaluate the opportunity
are aggregates of market
activity over the last five
minutes

• How careful must you be
with cross-validation?
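With a five-minute window sampled every five seconds, neighboring examples share almost all of their underlying market data, so a random split puts near-duplicates on both sides. One common remedy, sketched here under the assumption that examples are indexed in time order, is to leave a gap of at least one full window between training and test blocks. The function name `purged_time_split` is hypothetical.

```python
WINDOW_SECONDS = 5 * 60   # signals aggregate the last five minutes
STEP_SECONDS = 5          # one opportunity every five seconds
GAP = WINDOW_SECONDS // STEP_SECONDS  # 60 examples share some window data


def purged_time_split(num_examples, test_start, test_end, gap=GAP):
    """Return (train, test) indices for a contiguous test block.

    Training examples within `gap` positions of the test block are dropped,
    so no training window overlaps a test window.
    """
    test = list(range(test_start, test_end))
    train = [i for i in range(num_examples)
             if i < test_start - gap or i >= test_end + gap]
    return train, test


train, test = purged_time_split(1000, test_start=400, test_end=500)
print(len(train), len(test))
```

This handles overlap in the input windows only. If the label also looks forward in time (for example, the price a few minutes later), the gap needs to cover that horizon as well.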
Breaking Machine Learning
• You’ve got this great model!
Congratulations!

• Suddenly it stops working.
Why?

• You might be in a domain
that tends to change over
time (document classification,
sales prediction)

• You might be experiencing
adverse selection (market
data predictions, spam)
Concept Drift
• This is called non-stationarity in either the prior or the conditional
distributions

• Could be a couple of different things

• If the prior p(input) is changing, it’s covariate shift

• If the conditional p(output | input) is changing, it’s concept drift

• No rule that it can’t be both

• http://blog.bigml.com/2013/03/12/machine-learning-from-streaming-data-two-problems-two-solutions-two-concerns-and-two-lessons/
Take Action!
• First: Look for symptoms

• Getting a lot of errors

• The distribution of predicted values changes

• Drift detection algorithms (that I know about) have the same basic flavor:

• Buffer some data in memory

• If recent data is “different” from past data, retrain, update or give up

• Some resources - A nice survey paper and an open source package:
http://www.win.tue.nl/~mpechen/publications/pubs/Gama_ACMCS_AdaptationCD_accepted.pdf

http://moa.cms.waikato.ac.nz/
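The "buffer some data, compare recent to past" flavor can be sketched as follows. This is a generic illustration, not an algorithm from the survey or from MOA; the class name, window size, and tolerance are all assumptions.

```python
from collections import deque


class ErrorRateDriftDetector:
    """Flags drift when the recent error rate exceeds a reference rate.

    The first `window` observations form a fixed reference buffer; after that,
    a sliding buffer of the latest `window` observations is compared to it.
    """

    def __init__(self, window: int, tolerance: float):
        self.window = window
        self.tolerance = tolerance
        self.reference = []                 # past data, frozen once full
        self.recent = deque(maxlen=window)  # most recent data

    def update(self, is_error: bool) -> bool:
        """Record one prediction outcome; return True if drift is suspected."""
        if len(self.reference) < self.window:
            self.reference.append(is_error)
            return False
        self.recent.append(is_error)
        if len(self.recent) < self.window:
            return False
        reference_rate = sum(self.reference) / self.window
        recent_rate = sum(self.recent) / self.window
        return recent_rate - reference_rate > self.tolerance


detector = ErrorRateDriftDetector(window=10, tolerance=0.3)
stream = [False] * 20 + [True] * 10   # the model suddenly starts failing
flags = [detector.update(e) for e in stream]
print(flags.index(True))
```

When the flag fires, the caller chooses between retraining, updating, or giving up; after retraining, the reference buffer would need to be rebuilt, which this sketch leaves out.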
The Benefits of Archeology
• Why might you train on old
data, even if it’s not relevant?

• Verification of your research
process

• You’d have done the same thing
last year. Would it have worked?

• Gives you a good idea of
how much drift you should
expect
Publish or Perish
• Academic papers are a certain type of
result

• Show incremental improvement in
accuracy or generality

• Prove something about your
algorithm

• This latter is hard to come by as results
get more realistic

• Machine learning proofs assume data
is “i.i.d”, but this is obviously false.

• Real world data sucks, and dealing
with that significantly changes the
dataset
Usefulness of Results
• Theoretical Results

• Most of the time bounds do not apply (error, sample
complexity, convergence)

• Sometimes they don’t even make any sense

• Beware of putting too much faith in a single person or single
person’s work

• Usefulness generally occurs only in the aggregate

• And sometimes not even then (researchers are people, too)
Machine Learning Isn’t About Machine Learning
• Why doesn’t it work like in the
paper?

• Remember, the paper is carefully
controlled in a way your application
is not.

• Performance is rarely driven by
machine learning

• It’s driven by camera
microphones

• It’s driven by Mario Draghi
So, Don’t Bother With It?
• Of course not!

• What’s the alternative?

• “All our science, measured
against reality, is primitive
and childlike — and yet it is
the most precious thing we
have” - Albert Einstein

• Use academia as your
starting point, but don’t
think it will get you out of
the work
Some Themes
• The major points of this talk:

• Machine learning is hard to get right

• The algorithms won’t do what you want

• Good results are probably spurious

• Even if they aren’t, it won’t last

• Reading the research won’t help

• Wait, no!

• Have an attitude of skeptical optimism (or optimal skepticism?)

一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 

L15. Machine Learning - Black Art

  • 1. Machine Learning - Black Art Charles Parker Allston Trading
  • 2. Machine Learning is Hard! • By now, you know kind of a lot • Different types of models • Feature engineering • Ways to evaluate • But you’ll still fail! • Out in the real world, there’s a whole bunch of things that will kill your project • FYI - A lot of these talks are stolen
  • 3. Join Me! • On a journey into the Machine Learning House of Horrors! • Mwa ha ha!
  • 4. The Machine Learning House of Horrors! • The Horror of The Huge Hypothesis Space • The Perils of The Poorly Picked Loss Function • The Creeping Creature Called Cross Validation • The Dread of the Drifting Domain • The Repugnance of Reliance on Research Results
  • 5. Choosing A Hypothesis Space • By “hypothesis space” we mean the possible classifiers you could build with an algorithm given the data • This is the choice you make when you pick a learning algorithm • You have one job! • Is there any way to make it easier?
  • 6. Theory to The Rescue! • Probably Approximately Correct • We’d like our model to have error less than epsilon • We’d like that to happen at least some percentage of the time • If the error is epsilon, the percentage is sigma, the number of training examples is m, and the hypothesis space size is d:
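The equation on slide 6 did not survive in this transcript. A plausible reconstruction, assuming the slide shows the standard PAC bound for a finite hypothesis space and a learner that is consistent with its training data, and reading sigma as the fraction of the time the error bound should hold (textbooks usually write δ = 1 − σ for the failure probability):

```latex
m \;\ge\; \frac{1}{\epsilon}\left(\ln d + \ln\frac{1}{1-\sigma}\right)
```

Under this assumed form, the data requirement grows only logarithmically in the hypothesis space size d but linearly in 1/ε.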
  • 7. The Triple Trade-Off • There is a triple trade-off between the error, the size of the hypothesis space, and the amount of training data you have • (Figure labels: Error, Hypothesis Space, Training Data)
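A minimal sketch of the trade-off on slide 7, assuming the standard finite-hypothesis-space PAC bound m ≥ (ln d + ln(1/(1 − sigma))) / epsilon. The bound itself and the function name `pac_sample_size` are assumptions for illustration; the slide's own formula is not in this transcript.

```python
import math


def pac_sample_size(epsilon, sigma, d):
    """Examples sufficient under the ASSUMED bound m >= (ln d + ln(1/(1-sigma))) / epsilon.

    epsilon: target error, sigma: fraction of the time the bound should hold,
    d: number of hypotheses in the (finite) hypothesis space.
    """
    if not (0 < epsilon < 1 and 0 < sigma < 1 and d >= 1):
        raise ValueError("need 0 < epsilon < 1, 0 < sigma < 1 and d >= 1")
    return math.ceil((math.log(d) + math.log(1.0 / (1.0 - sigma))) / epsilon)


# Fix two corners of the triangle and watch the third move.
for d in (10**3, 10**6, 10**9):
    for epsilon in (0.1, 0.01):
        print(f"d={d:>10}  epsilon={epsilon:<5} m>={pac_sample_size(epsilon, 0.95, d)}")
```

Shrinking epsilon by a factor of ten costs ten times the data, while growing the hypothesis space a thousandfold only adds a constant amount.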
  • 8. What About Huge Data? • I’m clever, so I’ll use non-parametric methods (decision trees, k-NN, kernelized SVMs) • As data scales, curious things tend to happen • Simpler models become more desirable as they’re faster to fit. • You can increase model complexity by adding features (maybe word counts) • Big data often trumps modeling!
  • 9. The Machine Learning House of Horrors! • The Horror of The Huge Hypothesis Space • The Perils of The Poorly Picked Loss Function • The Creeping Creature Called Cross Validation • The Dread of the Drifting Domain • The Repugnance of Reliance on Research Results
  • 10. A Dirty Little Secret About ML Algorithms • They don’t care what you want • Decision Trees: • SVM: • LR: • LDA:
  • 11. Real-world Losses • Real losses are nothing like this • False positive in disease diagnosis • False positive in face detection • False positive in thumbprint identification • Some aren’t even instance-based • Path dependencies • Game playing
  • 12. Specializing Your Loss • One solution is to let developers apply their own loss • This is the approach of SVMlight (http://svmlight.joachims.org/), which has been around for a while • Losses other than Mutual Information can be plugged into the appropriate place in splitting code • Models trained via gradient descent can obviously be customized (Python’s Theano is interesting for this) • In the case of multi-example loss functions, we have SEARN in Vowpal Wabbit https://github.com/JohnLangford/vowpal_wabbit
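As a sketch of the gradient-descent point on slide 12, here is a one-feature logistic regression in plain Python where the log loss is weighted by class, so that missed positives can cost more than false alarms. The weighting scheme, the toy data, and all names (`fit_weighted_logistic`, `fn_cost`, `fp_cost`) are assumptions for illustration, not something taken from the talk.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_weighted_logistic(xs, ys, fn_cost=1.0, fp_cost=1.0, lr=0.1, epochs=5000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on a cost-weighted log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            # Errors on positives are scaled by fn_cost, errors on negatives by fp_cost.
            weight = fn_cost if y == 1 else fp_cost
            grad_w += weight * (p - y) * x
            grad_b += weight * (p - y)
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Toy, overlapping 1-D data with features scaled to [0, 1].
xs = [i / 7 for i in range(8)]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w_eq, b_eq = fit_weighted_logistic(xs, ys)
w_fn, b_fn = fit_weighted_logistic(xs, ys, fn_cost=10.0)

p_equal = sigmoid(w_eq * 0.5 + b_eq)   # borderline point under the plain loss
p_costly = sigmoid(w_fn * 0.5 + b_fn)  # same point when false negatives cost 10x
print(p_equal, p_costly)
```

The only change from ordinary logistic regression is the `weight` factor in the gradient, which is why losses of this kind are easy to customize in gradient-trained models.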
  • 13. Other Hackery • Sometimes, the solution is just to hack around the actual prediction • Have several levels (a cascade) of classifiers in, e.g., medical diagnosis or text recognition • Apply logic to explicitly avoid high-loss cases (e.g., when buying/selling equities) • Changing the problem setting • Will you be doing queries? Use ranking or metric learning • If you think “I want to do crazy thing x with classifiers”, chances are it’s already been done and you can read about it.
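One way to "hack around the actual prediction" is to leave the model alone and wrap its predicted probability in an explicit expected-cost rule, including an option to not act at all. The sketch below assumes a calibrated probability and made-up cost figures; the function name `decide` and the abstain option are illustrative, not from the talk.

```python
def decide(p_positive, cost_fp, cost_fn, cost_abstain=None):
    """Return the action with the lowest expected cost for one prediction."""
    expected = {
        # Acting as if positive is only wrong when the truth is negative.
        "positive": (1.0 - p_positive) * cost_fp,
        # Acting as if negative is only wrong when the truth is positive.
        "negative": p_positive * cost_fn,
    }
    if cost_abstain is not None:
        # Fixed cost of declining to act (e.g. skipping a trade, ordering another test).
        expected["abstain"] = cost_abstain
    return min(expected, key=expected.get)


print(decide(0.3, cost_fp=1, cost_fn=10))                   # misses are expensive
print(decide(0.3, cost_fp=1, cost_fn=1))                    # symmetric costs
print(decide(0.5, cost_fp=10, cost_fn=10, cost_abstain=2))  # too risky either way
```

With asymmetric costs the decision threshold moves away from 0.5 without retraining anything.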
  • 14. The Machine Learning House of Horrors! • The Horror of The Huge Hypothesis Space • The Perils of The Poorly Picked Loss Function • The Creeping Creature Called Cross Validation • The Dread of the Drifting Domain • The Repugnance of Reliance on Research Results
  • 15. When Validation Attacks! • Cross validation • n-Fold - Hold out one fold for testing, train on n - 1 folds • Great way to measure performance, right? • It’s all about information leakage • via instances • via features
  • 16. Case Study #1: Law of Averages • Estimate sporting event outcomes • Use previous games to estimate points scored for each team (via windowing transform) • Choose winner based on predicted score • What if you’re off by one on the window?
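The off-by-one in Case Study #1 can be sketched as two windowing transforms that differ only in where the slice ends. The function names and the tiny score list are hypothetical.

```python
def prior_window_mean(scores, window):
    """For game i, average the scores of up to `window` games strictly before i."""
    features = []
    for i in range(len(scores)):
        past = scores[max(0, i - window):i]  # slice stops before i: game i is excluded
        features.append(sum(past) / len(past) if past else None)
    return features


def leaky_window_mean(scores, window):
    """Off-by-one version: the slice ends at i + 1, so game i leaks into its own feature."""
    features = []
    for i in range(len(scores)):
        past = scores[max(0, i - window + 1):i + 1]
        features.append(sum(past) / len(past))
    return features


scores = [10, 20, 30, 40]
print(prior_window_mean(scores, 2))
print(leaky_window_mean(scores, 2))
```

In the leaky version every feature already contains the score it is supposed to predict, so validation results will look far better than anything achievable on a game that has not been played yet.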
  • 17. Case Study #2: Photo Dating • Take scanned photos from 30 different users (on average 200 per user) and create a model to assign a date taken (plus or minus five years) • Perform 10-fold cross validation • Accuracy is 85%. Can you trust it?
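A likely problem in Case Study #2 is that photos from the same user land in both the training and test folds, so the model can key on the user (scanner, album, family) instead of the date. A minimal sketch of a split that keeps each user on one side only is given here; the helper `group_k_fold` is written for illustration, and scikit-learn's `GroupKFold` offers the same idea in library form.

```python
from collections import defaultdict


def group_k_fold(groups, n_folds):
    """Yield (train_idx, test_idx) so that no group appears on both sides of a split."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    if n_folds > len(by_group):
        raise ValueError("need at least as many groups as folds")
    # Greedy balancing: put the largest groups first into the currently smallest fold.
    folds = [[] for _ in range(n_folds)]
    for g in sorted(by_group, key=lambda name: -len(by_group[name])):
        min(folds, key=len).extend(by_group[g])
    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test


# Hypothetical user ids, one per photo.
users = ["u1", "u1", "u2", "u2", "u2", "u3", "u4", "u4"]
for train, test in group_k_fold(users, 2):
    print("train:", train, "test:", test)
```

If accuracy drops sharply under a split like this, the original 85% was mostly measuring leakage via instances.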
  • 18. Case Study #3: Moments In Time • You have a buy/sell opportunity every five seconds • The signals you use to evaluate the opportunity are aggregates of market activity over the last five minutes • How careful must you be with cross-validation?
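In Case Study #3, a five-minute aggregate sampled every five seconds means each example shares raw data with roughly the 60 examples on either side of it, so randomly shuffled folds leak heavily. One hedged sketch of a remedy: use contiguous test blocks and drop a gap of training examples around each block. The function name and the choice of gap are assumptions; if the labels also look forward in time, the gap would need to cover that horizon too.

```python
def purged_blocked_split(n_samples, n_folds, gap):
    """Yield (train_idx, test_idx) with contiguous test blocks and `gap` samples dropped on each side."""
    fold_size = n_samples // n_folds
    for k in range(n_folds):
        start = k * fold_size
        # The last fold absorbs any remainder.
        stop = n_samples if k == n_folds - 1 else start + fold_size
        test = list(range(start, stop))
        # Keep only training samples whose feature window cannot overlap the test block.
        train = [i for i in range(n_samples) if i < start - gap or i >= stop + gap]
        yield train, test


# 5-minute window / 5-second sampling = 60 overlapping samples.
for train, test in purged_blocked_split(n_samples=1000, n_folds=5, gap=60):
    print(f"test {test[0]}-{test[-1]}, train size {len(train)}")
```

This trades some training data for an estimate that is not inflated by overlapping windows.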
  • 19. The Machine Learning House of Horrors! • The Horror of The Huge Hypothesis Space • The Perils of The Poorly Picked Loss Function • The Creeping Creature Called Cross Validation • The Dread of the Drifting Domain • The Repugnance of Reliance on Research Results
  • 20. Breaking Machine Learning • You’ve got this great model! Congratulations! • Suddenly it stops working. Why? • You might be in a domain that tends to change over time (document classification, sales prediction) • You might be experiencing adverse selection (market data predictions, spam)
  • 21. Concept Drift • This is called non-stationarity in either the prior or the conditional distributions • Could be a couple of different things • If the prior p(input) is changing, it’s covariate shift • If the conditional p(output | input) is changing, it’s concept drift • No rule that it can’t be both • http://blog.bigml.com/2013/03/12/machine-learning-from-streaming-data-two-problems-two-solutions-two-concerns-and-two-lessons/
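One way to write down the distinction on slide 21, using x for the input, y for the output, and t, t' for two points in time (this notation is assumed here; the slide only says "input" and "output"):

```latex
\begin{align*}
p_t(x, y) &= p_t(y \mid x)\, p_t(x) \\
\text{covariate shift:}\quad & p_t(x) \neq p_{t'}(x), \qquad p_t(y \mid x) = p_{t'}(y \mid x) \\
\text{concept drift:}\quad & p_t(y \mid x) \neq p_{t'}(y \mid x)
\end{align*}
```

Both factors of the joint distribution can change at once, which is the "no rule that it can't be both" case.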
  • 22. Take Action! • First: Look for symptoms • Getting a lot of errors • The distribution of predicted values changes • Drift detection algorithms (that I know about) have the same basic flavor: • Buffer some data in memory • If recent data is “different” from past data, retrain, update or give up • Some resources: a nice survey paper (http://www.win.tue.nl/~mpechen/publications/pubs/Gama_ACMCS_AdaptationCD_accepted.pdf) and an open source package (http://moa.cms.waikato.ac.nz/)
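The "basic flavor" described on slide 22 can be sketched as a detector that buffers recent values and compares their mean against a fixed reference sample. This is a deliberately simple z-test on the window mean, chosen for illustration; it is not one of the algorithms from the survey or from MOA, and the class name, window size, and threshold are all assumptions.

```python
from collections import deque
import statistics


class WindowDriftDetector:
    """Flag drift when the mean of a recent window moves far from a reference sample."""

    def __init__(self, reference, window=50, threshold=3.0):
        self.ref_mean = statistics.mean(reference)
        self.ref_std = statistics.pstdev(reference)
        self.recent = deque(maxlen=window)  # the in-memory buffer of recent data
        self.threshold = threshold

    def update(self, value):
        """Add one observation (e.g. a prediction or an error) and return True if drift is flagged."""
        self.recent.append(value)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough recent data yet
        recent_mean = statistics.mean(self.recent)
        # Standard error of a window mean if the data still followed the reference distribution.
        std_err = self.ref_std / len(self.recent) ** 0.5
        if std_err == 0:
            return recent_mean != self.ref_mean
        return abs(recent_mean - self.ref_mean) / std_err > self.threshold


detector = WindowDriftDetector(reference=[0.0, 1.0] * 50, window=20)
stable = [detector.update(v) for v in [0.0, 1.0] * 10]  # same mix as the reference
shifted = [detector.update(v) for v in [1.0] * 20]      # the stream changes
print(any(stable), shifted[-1])
```

What to do when the flag fires (retrain, update, or give up) is a separate decision, as the slide notes.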
  • 23. The Benefits of Archeology • Why might you train on old data, even if it’s not relevant? • Verification of your research process • You’d have done the same thing last year. Did it work? • Gives you a good idea of how much drift you should expect
  • 24. The Machine Learning House of Horrors! • The Horror of The Huge Hypothesis Space • The Perils of The Poorly Picked Loss Function • The Creeping Creature Called Cross Validation • The Dread of the Drifting Domain • The Repugnance of Reliance on Research Results
  • 25. Publish or Perish • Academic papers are a certain type of result • Show incremental improvement in accuracy or generality • Prove something about your algorithm • This latter is hard to come by as results get more realistic • Machine learning proofs assume data is “i.i.d.”, but this is obviously false. • Real world data sucks, and dealing with that significantly changes the dataset
  • 26. Usefulness of Results • Theoretical Results • Most of the time bounds do not apply (error, sample complexity, convergence) • Sometimes they don’t even make any sense • Beware of putting too much faith in a single person or single person’s work • Usefulness generally occurs only in the aggregate • And sometimes not even then (researchers are people, too)
  • 27. Machine Learning Isn’t About Machine Learning • Why doesn’t it work like in the paper? • Remember, the paper is carefully controlled in a way your application is not. • Performance is rarely driven by machine learning • It’s driven by camera microphones • It’s driven by Mario Draghi
  • 28. So, Don’t Bother With It? • Of course not! • What’s the alternative? • “All our science, measured against reality, is primitive and childlike, and yet it is the most precious thing we have” - Albert Einstein • Use academia as your starting point, but don’t think it will get you out of the work
  • 29. Some Themes • The major points of this talk: • Machine learning is hard to get right • The algorithms won’t do what you want • Good results are probably spurious • Even if they aren’t, it won’t last • Reading the research won’t help • Wait, no! • Have an attitude of skeptical optimism (or optimal skepticism?)