The Power of Ensembles in Machine Learning

Power of Ensembles
___
Amit Kapoor
@amitkaps
Bargava Subramanian
@bargava
The Blind Men & the Elephant
___
“And so these men of Indostan
Disputed loud and long,
Each in his own opinion
Exceeding stiff and strong,
Though each was partly in the right,
And all were in the wrong.”
— John Godfrey Saxe
Model Abstraction
___
"All models are wrong, but some are
useful"
— George Box
Building Many Models
___
"All models are wrong, but some are
useful, and their combination may be
better"
Ensembles
___
(noun) /änˈsämbəl/
a group of items viewed as a whole
rather than individually.
Machine Learning Process
___
Frame: Problem definition
Acquire: Data ingestion
Refine: Data wrangling
Transform: Feature creation
Explore: Feature selection
Model: Model selection
Insight: Solution communication
Machine Learning Model Pipeline
___
Data  ---  Feature    ---    Model      ---  Model
Space      Space             Algorithm       Parameters
(n)        (f)               (m)             (p)
Machine Learning Challenge
___
"...the task of searching through a
hypotheses space to find a suitable
hypothesis that will make good
predictions for a particular problem"
Hypotheses Space
___
Data  ---  Feature    ---    Model      ---  Model
Space      Space             Algorithm       Parameters
(n)        (f)               (m)             (p)
|_____________________________________________________|
                   Hypotheses Space
Search Hypotheses Space
___
The key question: how do we search this
hypotheses space efficiently?
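To get a feel for how quickly the hypotheses space grows, the four pipeline axes can be enumerated as a plain Cartesian product. This is only a sketch: the candidate values below are illustrative picks from the tables in this deck, and the names `data_samples`, `feature_sets`, `algorithms` and `enumerate_hypotheses` are hypothetical.

```python
from itertools import product

# Illustrative candidates for each axis of the pipeline (assumed, not exhaustive).
data_samples = ["100%", "R[50%]"]
feature_sets = [("x1", "x2"), ("x2", "x3"), ("x1", "x3"), ("pc1", "pc2")]
algorithms = {
    "Logistic": [{"C": 1}, {"C": 1e-1}, {"C": 1e-3}, {"C": 1e-5}],
    "Decision Tree": [{"d": "full"}],
    "KNN": [{"nn": 3}],
}


def enumerate_hypotheses():
    """Yield every (data, features, algorithm, params) combination."""
    for data, feats in product(data_samples, feature_sets):
        for algo, param_grid in algorithms.items():
            for params in param_grid:
                yield data, feats, algo, params


hypotheses = list(enumerate_hypotheses())
print(len(hypotheses))  # 2 * 4 * (4 + 1 + 1) = 48
```

Even this tiny grid gives 48 candidates, and every extra option on any axis multiplies the count.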
Illustrative Example
___
Binary Classification Problem
Classes: c = 2
Features: f = x1, x2, x3
Observations: n = 1,000
Dimensionality Reduction
___
"Easier to visualise the data space"
Principal Component Analysis
Dimensions = 2 --> (pc1, pc2)
Approach to Build Models
___
1. Train a model
2. Predict the probabilities for each
class
3. Score the model on AUC via 5-Fold
Cross-Validation
4. Show the Decision Space
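Steps 1 to 3 can be sketched without any library. This is not the authors' code (theirs is in the repository linked at the end). The helpers `auc`, `k_fold_indices` and `cross_val_auc` are hypothetical names, and `fit` stands for any training function that returns a probability predictor.

```python
import random


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=5, seed=0):
    """Shuffle range(n) and deal it into k nearly equal test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_auc(fit, X, y, k=5):
    """Mean AUC over k folds; fit(X_train, y_train) returns row -> P(class 1).

    Assumes every test fold contains both classes (no stratification here).
    """
    fold_aucs = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predict_proba = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores = [predict_proba(X[i]) for i in test_idx]
        fold_aucs.append(auc([y[i] for i in test_idx], scores))
    return sum(fold_aucs) / k


print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75 (3 of 4 pairs ordered correctly)
```

The pairwise definition is quadratic in the number of observations, which is fine for n = 1,000 but would need a rank-based version for larger data.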
Simple Model with Different Features
___
Data  ---  Feature    ---    Model      ---  Model
Space      Space             Algorithm       Parameters
(n)        (f)               (m)             (p)
100%       x1,x2             Logistic        C=1
100%       x2,x3             Logistic        C=1
100%       x1,x3             Logistic        C=1
100%       pc1,pc2           Logistic        C=1
Tune the Model Parameters
___
Data  ---  Feature    ---    Model      ---  Model
Space      Space             Algorithm       Parameters
(n)        (f)               (m)             (p)
100%       pc1,pc2           Logistic        C=1
100%       pc1,pc2           Logistic        C=1e-1
100%       pc1,pc2           Logistic        C=1e-3
100%       pc1,pc2           Logistic        C=1e-5
Make Many Simple Models
___
Data  ---  Feature    ---    Model      ---  Model
Space      Space             Algorithm       Parameters
(n)        (f)               (m)             (p)
100%       pc1,pc2           Logistic        C=1
100%       pc1,pc2           SVM-Linear      prob=True
100%       pc1,pc2           Decision Tree   d=full
100%       pc1,pc2           KNN             nn=3
Hypotheses Search Approach
___
Exhaustive search is impossible
Compute as a proxy for human IQ
Clever algorithmic way to search the
solution space
Ensemble Thought Process
___
"The goal of ensemble methods is to
combine the predictions of several base
estimators built with (one or multiple)
model algorithms in order to improve
generalisation and robustness over a
single estimator."
Ensemble Approach
___
Different training sets
Different feature sampling
Different algorithms
Different (hyper) parameters
Clever aggregation
Ensemble Methods
___
[1] Averaging
[2] Boosting
[3] Voting
[4] Stacking
[1] Averaging: Concept
___
"Build several estimators independently
and then average their predictions to
reduce model variance"
Ensemble several good models to produce
a lower-variance model.
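A small simulation can make the variance claim concrete. It assumes each estimator's error is independent Gaussian noise around the same true value, which real models trained on overlapping data only approximate; `TRUE_VALUE` and `NOISE_SD` are made-up numbers for illustration.

```python
import random
import statistics

rng = random.Random(42)
TRUE_VALUE = 0.7   # hypothetical quantity every estimator tries to predict
NOISE_SD = 0.2     # hypothetical spread of each estimator's independent error


def single_estimate():
    """One noisy estimator."""
    return TRUE_VALUE + rng.gauss(0, NOISE_SD)


def averaged_estimate(n_estimators=10):
    """Average of n independent noisy estimators."""
    return statistics.fmean(single_estimate() for _ in range(n_estimators))


singles = [single_estimate() for _ in range(2000)]
averages = [averaged_estimate() for _ in range(2000)]

var_single = statistics.variance(singles)
var_average = statistics.variance(averages)
print(var_single, var_average)  # the second is roughly one tenth of the first
```

With correlated estimators the reduction is smaller, which is why the averaging methods that follow inject randomness into the rows, the features or the splits.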
[1] Averaging: Simple Models
___
Data  ---  Feature    ---    Model      ---  Model
Space      Space             Algorithm       Parameters
(n)        (f)               (m)             (p)
R[50%]     pc1,pc2           Decision Tree   n_est=10
R[50%,r]   pc1,pc2           Decision Tree   n_est=10
100%       R[50%]            Decision Tree   n_est=10
R[50%]     R[50%]            Decision Tree   n_est=10
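The sampling notation in the table can be sketched as follows. The reading of `R[50%]` as a random 50% draw without replacement and `R[50%,r]` as a draw with replacement is an assumption about the slide's shorthand, and `sample_rows` and `sample_features` are hypothetical helpers.

```python
import random


def sample_rows(n_rows, fraction=0.5, replace=False, rng=None):
    """Row indices for one base estimator: R[50%] (replace=False) or R[50%,r] (replace=True)."""
    rng = rng or random.Random()
    k = int(n_rows * fraction)
    if replace:
        return [rng.randrange(n_rows) for _ in range(k)]
    return rng.sample(range(n_rows), k)


def sample_features(features, fraction=0.5, rng=None):
    """Random subset of feature names, keeping at least one."""
    rng = rng or random.Random()
    k = max(1, int(len(features) * fraction))
    return rng.sample(features, k)


rng = random.Random(0)
features = ["pc1", "pc2"]
n_estimators = 10

# Last row of the table: every base tree gets its own random rows AND random features.
draws = [
    (sample_rows(1000, 0.5, replace=False, rng=rng), sample_features(features, 0.5, rng=rng))
    for _ in range(n_estimators)
]
print(len(draws), len(draws[0][0]), draws[0][1])
```

Each of the ten trees would then be trained on its own draw, and their predicted probabilities averaged.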
[1] Averaging: Extended Models
___
Use perturb-and-combine techniques
Bagging: Bootstrap Aggregation
Random Forest: Best split amongst random
features
Extremely Randomised: Random threshold
for split
[1] Averaging: Extended Models
___
Data  ---  Feature    ---    Model      ---  Model
Space      Space             Algorithm       Parameters
(n)        (f)               (m)             (p)
100%       pc1,pc2           Decision Tree   d=full
R[50%,r]   pc1,pc2           Decision Tree   n_est=10
R[50%,r]   R[Split]          Decision Tree   n_est=10
R[50%,r]   R[Split,thresh]   Decision Tree   n_est=10
[2] Boosting: Concept
___
"Build base estimators sequentially and
then try to reduce the bias of the
combined estimator."
Combine several weak models to produce a
powerful ensemble.
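One way to see weak models add up is a hand-rolled discrete AdaBoost over one-dimensional decision stumps. This is a toy sketch, not the boosted trees used in the deck: the data set is a small made-up example with labels in {-1, +1}, and `fit_stump`, `adaboost` and `predict` are hypothetical names.

```python
import math


def stump_predict(x, threshold, sign):
    """Depth-1 rule on a single number: `sign` above the threshold, `-sign` otherwise."""
    return sign if x > threshold else -sign


def fit_stump(X, y, w):
    """Return (weighted error, threshold, sign) of the best stump under weights w."""
    xs = sorted(set(X))
    thresholds = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    best = None
    for t in thresholds:
        for sign in (1, -1):
            err = sum(wi for xi, yi, wi in zip(X, y, w) if stump_predict(xi, t, sign) != yi)
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best


def adaboost(X, y, n_estimators=3):
    """Fit stumps sequentially, re-weighting the points the previous stumps got wrong."""
    w = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(n_estimators):
        err, t, sign = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)        # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)      # vote strength of this stump
        ensemble.append((alpha, t, sign))
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, t, sign))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, t, sign) for alpha, t, sign in ensemble)
    return 1 if score >= 0 else -1


X = list(range(10))
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(X, y, n_estimators=3)
print([predict(ensemble, xi) for xi in X] == y)  # True: three stumps fit all ten points
```

No single stump can do better than 7 out of 10 on this data, yet three re-weighted stumps classify every point, which is the bias reduction the quote refers to.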
[2] Boosting: Concept
___
Source: Wang Tan
[2] Boosting: Models
___
Data  ---  Feature    ---    Model      ---  Model
Space      Space             Algorithm       Parameters
(n)        (f)               (m)             (p)
100%       pc1,pc2           Decision Tree   d=full
100%       pc1,pc2           DTree Boost     n_est=10
100%       pc1,pc2           DTree Boost     n_est=10, d=1
100%       pc1,pc2           DTree Boost     n_est=10, d=2
[3] Voting: Concept
___
"To use conceptually different model
algorithms and use majority vote or soft
vote (average probabilities) to
combine."
Ensemble a set of equally well-performing
models to balance out their individual
weaknesses.
[3] Voting: Approach
___
Hard Voting: the majority (mode) of the
class labels predicted
Soft Voting: the argmax of the sum of
predicted probabilities
Weighted Voting: the argmax of the
weighted sum of predicted probabilities
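The three voting rules fit in a few lines. The probabilities below are made up to show a case where the rules disagree; `hard_vote` and `soft_vote` are hypothetical helper names, and each inner list holds one model's [P(class 0), P(class 1)] for a single observation.

```python
from collections import Counter


def hard_vote(labels):
    """Majority (mode) of the predicted class labels."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probas, weights=None):
    """Argmax of the (optionally weighted) sum of predicted probabilities."""
    weights = weights or [1.0] * len(probas)
    n_classes = len(probas[0])
    totals = [sum(w * p[c] for w, p in zip(weights, probas)) for c in range(n_classes)]
    return max(range(n_classes), key=totals.__getitem__)


# Two mildly confident votes for class 1, one very confident vote for class 0.
probas = [[0.45, 0.55], [0.40, 0.60], [0.90, 0.10]]
labels = [max(range(2), key=p.__getitem__) for p in probas]  # [1, 1, 0]

print(hard_vote(labels))                     # 1: two of three labels
print(soft_vote(probas))                     # 0: summed probabilities are 1.75 vs 1.25
print(soft_vote(probas, weights=[3, 3, 1]))  # 1: weighted sums are 3.45 vs 3.55
```

Soft voting lets one confident model overrule two hesitant ones, so it only helps when the predicted probabilities are reasonably well calibrated.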
[3] Voting: Models
___
Data  ---  Feature    ---    Model      ---  Model
Space      Space             Algorithm       Parameters
(n)        (f)               (m)             (p)
100%       pc1,pc2           Gaussian        -
100%       pc1,pc2           SVM - RBF       gamma=0.5, C=1
100%       pc1,pc2           Gradient Boost  n_est=10, d=2
-----      -------           --------------  --------------
100%       pc1,pc2           Voting          Soft
[4] Stacking
___
"To use output of different model
algorithms as feature inputs for a meta
classifier."
Ensemble a set of well-performing models
to uncover higher-order interaction
effects.
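The mechanics of stacking can be sketched with two deliberately crude base models and a logistic meta-classifier. Everything here is illustrative: the synthetic data, the threshold-based base models and the helper names (`fit_stump`, `out_of_fold_features`, `fit_logistic`) are assumptions, not the models in the deck. The point it tries to show is that the meta-classifier is trained on out-of-fold base predictions, so it never sees a prediction made on a row the base model was trained on.

```python
import math
import random


def fit_stump(X, y, feature, threshold=0.5):
    """Base model: P(class 1) estimated separately on each side of a fixed threshold."""
    def rate(labels):
        return sum(labels) / len(labels) if labels else 0.5
    hi = rate([yi for xi, yi in zip(X, y) if xi[feature] > threshold])
    lo = rate([yi for xi, yi in zip(X, y) if xi[feature] <= threshold])
    return lambda row: hi if row[feature] > threshold else lo


def out_of_fold_features(X, y, base_fitters, k=5):
    """Meta-features: each row is scored by base models fitted on the other folds."""
    n = len(X)
    meta = [[0.0] * len(base_fitters) for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        X_train, y_train = [X[i] for i in train], [y[i] for i in train]
        for j, fit in enumerate(base_fitters):
            model = fit(X_train, y_train)
            for i in range(fold, n, k):
                meta[i][j] = model(X[i])
    return meta


def fit_logistic(X, y, lr=1.0, epochs=500):
    """Meta-classifier: logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)

    def proba(row):
        return 1 / (1 + math.exp(-(sum(wj * xj for wj, xj in zip(w, row)) + b)))

    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            residual = proba(xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += residual * xj / n
            grad_b += residual / n
        for j in range(len(w)):
            w[j] -= lr * grad_w[j]
        b -= lr * grad_b
    return proba


rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(400)]
y = [1 if a + b > 1 else 0 for a, b in X]          # synthetic target
base_fitters = [lambda X, y: fit_stump(X, y, 0), lambda X, y: fit_stump(X, y, 1)]

meta_X = out_of_fold_features(X, y, base_fitters)
meta_model = fit_logistic(meta_X, y)
accuracy = sum((meta_model(m) > 0.5) == bool(t) for m, t in zip(meta_X, y)) / len(y)
print(round(accuracy, 2))
```

With base models this weak the stacked accuracy is modest. The sketch is about the data flow (base predictions in, meta prediction out), not about squeezing out performance.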
[4] Stacking: Models
___
Data  ---  Feature    ---    Model      ---  Model
Space      Space             Algorithm       Parameters
(n)        (f)               (m)             (p)
100%       pc1,pc2           Gaussian        -
100%       pc1,pc2           SVM - RBF       gamma=0.5, C=1
100%       pc1,pc2           Gradient Boost  n_est=10, d=2
-----      -------           --------------  --------------
100%       pc1,pc2           Stacking        Logistic()
Ensemble Advantages
___
Improved accuracy
Robustness and better generalisation
Parallelization
Ensemble Disadvantages
___
Models become harder for humans to interpret
The extra time and effort may not be worth
the accuracy gain
Advanced Ensemble Approach
___
— Grid Search for voting weights
— Bayesian Optimization for weights &
hyper-parameters
— Feature Engineering e.g. tree or
cluster embedding
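The first item, a grid search over voting weights, can be sketched as below. The validation probabilities are invented so that one unreliable but confident model misleads an equal-weight vote; `weighted_soft_vote`, `accuracy` and `grid_search_weights` are hypothetical names, and a real search would score the weights on held-out data.

```python
from itertools import product


def weighted_soft_vote(probas, weights):
    """Argmax of the weighted sum of per-model class probabilities."""
    n_classes = len(probas[0])
    totals = [sum(w * p[c] for w, p in zip(weights, probas)) for c in range(n_classes)]
    return max(range(n_classes), key=totals.__getitem__)


def accuracy(weights, all_probas, y_true):
    """Share of observations the weighted vote classifies correctly."""
    preds = [weighted_soft_vote(probas, weights) for probas in all_probas]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def grid_search_weights(all_probas, y_true, candidates=(0, 1, 2, 3)):
    """Try every weight combination and keep the first one with the best accuracy."""
    n_models = len(all_probas[0])
    best_weights, best_acc = None, -1.0
    for weights in product(candidates, repeat=n_models):
        if sum(weights) == 0:
            continue  # all-zero weights carry no vote
        acc = accuracy(weights, all_probas, y_true)
        if acc > best_acc:
            best_weights, best_acc = weights, acc
    return best_weights, best_acc


# One entry per validation observation; inside it, one [P(0), P(1)] pair per model.
all_probas = [
    [[0.45, 0.55], [0.40, 0.60], [0.90, 0.10]],
    [[0.70, 0.30], [0.65, 0.35], [0.10, 0.90]],
    [[0.30, 0.70], [0.20, 0.80], [0.40, 0.60]],
    [[0.80, 0.20], [0.60, 0.40], [0.70, 0.30]],
]
y_true = [1, 0, 1, 0]

print(accuracy((1, 1, 1), all_probas, y_true))  # 0.5 with equal weights
print(grid_search_weights(all_probas, y_true))  # ((0, 1, 0), 1.0)
```

The grid grows as candidates to the power of the number of models, which is one reason the next item suggests Bayesian optimisation instead.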
Ensembles
___
"It is the harmony of the diverse parts,
their symmetry, their happy balance; in
a word it is all that introduces order,
all that gives unity, that permits us to
see clearly and to comprehend at once
both the ensemble and the details."
- Henri Poincaré
Code and Slides
___
All the code and slides are available at
https://github.com/amitkaps/ensemble
Contact
___
Amit Kapoor
@amitkaps
amitkaps.com
Bargava Subramanian
@bargava
1 of 51

Recommended

Workshop - Introduction to Machine Learning with R by
Workshop - Introduction to Machine Learning with RWorkshop - Introduction to Machine Learning with R
Workshop - Introduction to Machine Learning with RShirin Elsinghorst
39.5K views60 slides
Auto encoding-variational-bayes by
Auto encoding-variational-bayesAuto encoding-variational-bayes
Auto encoding-variational-bayesmehdi Cherti
1.8K views17 slides
(DL輪読)Matching Networks for One Shot Learning by
(DL輪読)Matching Networks for One Shot Learning(DL輪読)Matching Networks for One Shot Learning
(DL輪読)Matching Networks for One Shot LearningMasahiro Suzuki
7.8K views28 slides
Build your own Convolutional Neural Network CNN by
Build your own Convolutional Neural Network CNNBuild your own Convolutional Neural Network CNN
Build your own Convolutional Neural Network CNNhichem felouat
194 views23 slides
Auto-encoding variational bayes by
Auto-encoding variational bayesAuto-encoding variational bayes
Auto-encoding variational bayesKyuri Kim
574 views16 slides
Generative adversarial networks by
Generative adversarial networksGenerative adversarial networks
Generative adversarial networksKyuri Kim
175 views20 slides

More Related Content

What's hot

Introduction To Generative Adversarial Networks GANs by
Introduction To Generative Adversarial Networks GANsIntroduction To Generative Adversarial Networks GANs
Introduction To Generative Adversarial Networks GANshichem felouat
625 views23 slides
Genetic programming by
Genetic programmingGenetic programming
Genetic programmingOmar Ghazi
1.8K views36 slides
Rabbit challenge 3 DNN Day2 by
Rabbit challenge 3 DNN Day2Rabbit challenge 3 DNN Day2
Rabbit challenge 3 DNN Day2TOMMYLINK1
22 views32 slides
Introduction of Xgboost by
Introduction of XgboostIntroduction of Xgboost
Introduction of Xgboostmichiaki ito
3.4K views29 slides
Xgboost by
XgboostXgboost
XgboostVivian S. Zhang
708.8K views128 slides
NIPS読み会2013: One-shot learning by inverting a compositional causal process by
NIPS読み会2013: One-shot learning by inverting  a compositional causal processNIPS読み会2013: One-shot learning by inverting  a compositional causal process
NIPS読み会2013: One-shot learning by inverting a compositional causal processnozyh
21.6K views15 slides

What's hot(20)

Introduction To Generative Adversarial Networks GANs by hichem felouat
Introduction To Generative Adversarial Networks GANsIntroduction To Generative Adversarial Networks GANs
Introduction To Generative Adversarial Networks GANs
hichem felouat625 views
Genetic programming by Omar Ghazi
Genetic programmingGenetic programming
Genetic programming
Omar Ghazi1.8K views
Rabbit challenge 3 DNN Day2 by TOMMYLINK1
Rabbit challenge 3 DNN Day2Rabbit challenge 3 DNN Day2
Rabbit challenge 3 DNN Day2
TOMMYLINK122 views
Introduction of Xgboost by michiaki ito
Introduction of XgboostIntroduction of Xgboost
Introduction of Xgboost
michiaki ito3.4K views
NIPS読み会2013: One-shot learning by inverting a compositional causal process by nozyh
NIPS読み会2013: One-shot learning by inverting  a compositional causal processNIPS読み会2013: One-shot learning by inverting  a compositional causal process
NIPS読み会2013: One-shot learning by inverting a compositional causal process
nozyh21.6K views
GBM package in r by mark_landry
GBM package in rGBM package in r
GBM package in r
mark_landry41.4K views
Predict future time series forecasting by hichem felouat
Predict future time series forecastingPredict future time series forecasting
Predict future time series forecasting
hichem felouat107 views
Side Notes on Practical Natural Language Processing: Bootstrap Test by HarithaGangavarapu
Side Notes on Practical Natural Language Processing: Bootstrap TestSide Notes on Practical Natural Language Processing: Bootstrap Test
Side Notes on Practical Natural Language Processing: Bootstrap Test
HarithaGangavarapu174 views
Comparison Study of Decision Tree Ensembles for Regression by Seonho Park
Comparison Study of Decision Tree Ensembles for RegressionComparison Study of Decision Tree Ensembles for Regression
Comparison Study of Decision Tree Ensembles for Regression
Seonho Park1.2K views
Machine Learning with Go by James Bowman
Machine Learning with GoMachine Learning with Go
Machine Learning with Go
James Bowman94 views
Vc dimension in Machine Learning by VARUN KUMAR
Vc dimension in Machine LearningVc dimension in Machine Learning
Vc dimension in Machine Learning
VARUN KUMAR4.3K views
XGBoost: the algorithm that wins every competition by Jaroslaw Szymczak
XGBoost: the algorithm that wins every competitionXGBoost: the algorithm that wins every competition
XGBoost: the algorithm that wins every competition
Jaroslaw Szymczak16.7K views
Tree models with Scikit-Learn: Great models with little assumptions by Gilles Louppe
Tree models with Scikit-Learn: Great models with little assumptionsTree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptions
Gilles Louppe7.5K views
Machine learning by Shreyas G S
Machine learningMachine learning
Machine learning
Shreyas G S2.6K views
Anomaly detection using deep one class classifier by 홍배 김
Anomaly detection using deep one class classifierAnomaly detection using deep one class classifier
Anomaly detection using deep one class classifier
홍배 김5.9K views
Gradient boosting in practice: a deep dive into xgboost by Jaroslaw Szymczak
Gradient boosting in practice: a deep dive into xgboostGradient boosting in practice: a deep dive into xgboost
Gradient boosting in practice: a deep dive into xgboost
Jaroslaw Szymczak3.4K views

Viewers also liked

Python Visualisation for Data Science by
Python Visualisation for Data SciencePython Visualisation for Data Science
Python Visualisation for Data ScienceAmit Kapoor
2.3K views17 slides
Kontrak kuliah pengetahuan lingkungan by
Kontrak kuliah pengetahuan lingkunganKontrak kuliah pengetahuan lingkungan
Kontrak kuliah pengetahuan lingkunganHotnida D'kanda
265 views6 slides
Sanjeet kumar social media profile by
Sanjeet kumar social media profileSanjeet kumar social media profile
Sanjeet kumar social media profileSanjeet Kumar
161 views3 slides
Co to jest psychologia? by
Co to jest psychologia?Co to jest psychologia?
Co to jest psychologia?Paula Pilarska
2K views19 slides
Cuadro comparativo linux, windows y android by
Cuadro comparativo linux, windows y androidCuadro comparativo linux, windows y android
Cuadro comparativo linux, windows y androidSthefy Zavala
840 views3 slides
What is psychology by
What is psychologyWhat is psychology
What is psychologyPaula Pilarska
747 views18 slides

Viewers also liked(20)

Python Visualisation for Data Science by Amit Kapoor
Python Visualisation for Data SciencePython Visualisation for Data Science
Python Visualisation for Data Science
Amit Kapoor2.3K views
Kontrak kuliah pengetahuan lingkungan by Hotnida D'kanda
Kontrak kuliah pengetahuan lingkunganKontrak kuliah pengetahuan lingkungan
Kontrak kuliah pengetahuan lingkungan
Hotnida D'kanda265 views
Sanjeet kumar social media profile by Sanjeet Kumar
Sanjeet kumar social media profileSanjeet kumar social media profile
Sanjeet kumar social media profile
Sanjeet Kumar161 views
Cuadro comparativo linux, windows y android by Sthefy Zavala
Cuadro comparativo linux, windows y androidCuadro comparativo linux, windows y android
Cuadro comparativo linux, windows y android
Sthefy Zavala840 views
презентація сучасний кабінет інструментарій педагога в підготовці кваліфіко... by Stepan Revin
презентація сучасний кабінет   інструментарій педагога в підготовці кваліфіко...презентація сучасний кабінет   інструментарій педагога в підготовці кваліфіко...
презентація сучасний кабінет інструментарій педагога в підготовці кваліфіко...
Stepan Revin337 views
SARK Trading & Marketing Intl Trade Show Brochure by Sadik R Khan
SARK Trading & Marketing Intl Trade Show BrochureSARK Trading & Marketing Intl Trade Show Brochure
SARK Trading & Marketing Intl Trade Show Brochure
Sadik R Khan381 views
Visualising Big Data by Amit Kapoor
Visualising Big DataVisualising Big Data
Visualising Big Data
Amit Kapoor8.7K views
Telling Stories with Data - Using Story Spine by Amit Kapoor
Telling Stories with Data - Using Story SpineTelling Stories with Data - Using Story Spine
Telling Stories with Data - Using Story Spine
Amit Kapoor4.8K views
Atmosfer dan pencemaran udara by Hotnida D'kanda
Atmosfer dan pencemaran udaraAtmosfer dan pencemaran udara
Atmosfer dan pencemaran udara
Hotnida D'kanda11.7K views
Ensemble Learning: The Wisdom of Crowds (of Machines) by Lior Rokach
Ensemble Learning: The Wisdom of Crowds (of Machines)Ensemble Learning: The Wisdom of Crowds (of Machines)
Ensemble Learning: The Wisdom of Crowds (of Machines)
Lior Rokach9K views
Five Things I Wish I Knew the First Day I Used Tableau by Ryan Sleeper
Five Things I Wish I Knew the First Day I Used TableauFive Things I Wish I Knew the First Day I Used Tableau
Five Things I Wish I Knew the First Day I Used Tableau
Ryan Sleeper8.1K views
SORACOM Bootcamp Rec1 - SORACOM Air (1) by SORACOM,INC
SORACOM Bootcamp Rec1 - SORACOM Air (1)SORACOM Bootcamp Rec1 - SORACOM Air (1)
SORACOM Bootcamp Rec1 - SORACOM Air (1)
SORACOM,INC977 views
Data driven storytelling tips from an iron viz champion ryan sleeper by Ryan Sleeper
Data driven storytelling tips from an iron viz champion   ryan sleeperData driven storytelling tips from an iron viz champion   ryan sleeper
Data driven storytelling tips from an iron viz champion ryan sleeper
Ryan Sleeper5K views
SORACOM Bootcamp Rec5 - SORACOM Funnel by SORACOM,INC
SORACOM Bootcamp Rec5 - SORACOM FunnelSORACOM Bootcamp Rec5 - SORACOM Funnel
SORACOM Bootcamp Rec5 - SORACOM Funnel
SORACOM,INC871 views
Crafting Visual Stories with Data by Amit Kapoor
Crafting Visual Stories with DataCrafting Visual Stories with Data
Crafting Visual Stories with Data
Amit Kapoor26.2K views
Learning the Craft of Data Visualisation by Amit Kapoor
Learning the Craft of Data VisualisationLearning the Craft of Data Visualisation
Learning the Craft of Data Visualisation
Amit Kapoor1.7K views
Visualising Multi Dimensional Data by Amit Kapoor
Visualising Multi Dimensional DataVisualising Multi Dimensional Data
Visualising Multi Dimensional Data
Amit Kapoor14K views

Similar to The Power of Ensembles in Machine Learning

Two methods for optimising cognitive model parameters by
Two methods for optimising cognitive model parametersTwo methods for optimising cognitive model parameters
Two methods for optimising cognitive model parametersUniversity of Huddersfield
150 views16 slides
Computational Intelligence Assisted Engineering Design Optimization (using MA... by
Computational Intelligence Assisted Engineering Design Optimization (using MA...Computational Intelligence Assisted Engineering Design Optimization (using MA...
Computational Intelligence Assisted Engineering Design Optimization (using MA...AmirParnianifard1
33 views29 slides
(Machine Learning) Ensemble learning by
(Machine Learning) Ensemble learning (Machine Learning) Ensemble learning
(Machine Learning) Ensemble learning Omkar Rane
294 views14 slides
Lecture7 cross validation by
Lecture7 cross validationLecture7 cross validation
Lecture7 cross validationStéphane Canu
1.4K views27 slides
Machine learning (5) by
Machine learning (5)Machine learning (5)
Machine learning (5)NYversity
273 views8 slides
Transfer Learning for Improving Model Predictions in Highly Configurable Soft... by
Transfer Learning for Improving Model Predictions in Highly Configurable Soft...Transfer Learning for Improving Model Predictions in Highly Configurable Soft...
Transfer Learning for Improving Model Predictions in Highly Configurable Soft...Pooyan Jamshidi
932 views39 slides

Similar to The Power of Ensembles in Machine Learning(20)

Computational Intelligence Assisted Engineering Design Optimization (using MA... by AmirParnianifard1
Computational Intelligence Assisted Engineering Design Optimization (using MA...Computational Intelligence Assisted Engineering Design Optimization (using MA...
Computational Intelligence Assisted Engineering Design Optimization (using MA...
(Machine Learning) Ensemble learning by Omkar Rane
(Machine Learning) Ensemble learning (Machine Learning) Ensemble learning
(Machine Learning) Ensemble learning
Omkar Rane294 views
Lecture7 cross validation by Stéphane Canu
Lecture7 cross validationLecture7 cross validation
Lecture7 cross validation
Stéphane Canu1.4K views
Machine learning (5) by NYversity
Machine learning (5)Machine learning (5)
Machine learning (5)
NYversity273 views
Transfer Learning for Improving Model Predictions in Highly Configurable Soft... by Pooyan Jamshidi
Transfer Learning for Improving Model Predictions in Highly Configurable Soft...Transfer Learning for Improving Model Predictions in Highly Configurable Soft...
Transfer Learning for Improving Model Predictions in Highly Configurable Soft...
Pooyan Jamshidi932 views
Evaluation of a hybrid method for constructing multiple SVM kernels by infopapers
Evaluation of a hybrid method for constructing multiple SVM kernelsEvaluation of a hybrid method for constructing multiple SVM kernels
Evaluation of a hybrid method for constructing multiple SVM kernels
infopapers331 views
Summary.ppt by butest
Summary.pptSummary.ppt
Summary.ppt
butest476 views
[PR12] PR-036 Learning to Remember Rare Events by Taegyun Jeon
[PR12] PR-036 Learning to Remember Rare Events[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare Events
Taegyun Jeon2.7K views
PPT - AutoML-Zero: Evolving Machine Learning Algorithms From Scratch by Jisang Yoon
PPT - AutoML-Zero: Evolving Machine Learning Algorithms From ScratchPPT - AutoML-Zero: Evolving Machine Learning Algorithms From Scratch
PPT - AutoML-Zero: Evolving Machine Learning Algorithms From Scratch
Jisang Yoon118 views
Deep Learning for Developers by Julien SIMON
Deep Learning for DevelopersDeep Learning for Developers
Deep Learning for Developers
Julien SIMON1.1K views
The Concurrent Constraint Programming Research Programmes -- Redux (part2) by Pierre Schaus
The Concurrent Constraint Programming Research Programmes -- Redux (part2)The Concurrent Constraint Programming Research Programmes -- Redux (part2)
The Concurrent Constraint Programming Research Programmes -- Redux (part2)
Pierre Schaus1.1K views
Machine learning for_finance by Stefan Duprey
Machine learning for_financeMachine learning for_finance
Machine learning for_finance
Stefan Duprey2K views
Intro to Machine Learning for non-Data Scientists by Parinaz Ameri
Intro to Machine Learning for non-Data ScientistsIntro to Machine Learning for non-Data Scientists
Intro to Machine Learning for non-Data Scientists
Parinaz Ameri224 views
DataEngConf: Feature Extraction: Modern Questions and Challenges at Google by Hakka Labs
DataEngConf: Feature Extraction: Modern Questions and Challenges at GoogleDataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at Google
Hakka Labs3K views
This is a heavily data-oriented by butest
This is a heavily data-orientedThis is a heavily data-oriented
This is a heavily data-oriented
butest840 views
This is a heavily data-oriented by butest
This is a heavily data-orientedThis is a heavily data-oriented
This is a heavily data-oriented
butest660 views
An Adaptive Masker for the Differential Evolution Algorithm by IOSR Journals
An Adaptive Masker for the Differential Evolution AlgorithmAn Adaptive Masker for the Differential Evolution Algorithm
An Adaptive Masker for the Differential Evolution Algorithm
IOSR Journals454 views

More from Amit Kapoor

Deep Learning for NLP by
Deep Learning for NLPDeep Learning for NLP
Deep Learning for NLPAmit Kapoor
2.2K views46 slides
Model Visualisation by
Model VisualisationModel Visualisation
Model VisualisationAmit Kapoor
1.6K views60 slides
Storytelling with Data - Approach | Skills by
Storytelling with Data - Approach | SkillsStorytelling with Data - Approach | Skills
Storytelling with Data - Approach | SkillsAmit Kapoor
3.5K views134 slides
Tools & Resources for Data Visualisation by
Tools & Resources for Data VisualisationTools & Resources for Data Visualisation
Tools & Resources for Data VisualisationAmit Kapoor
4.5K views34 slides
Fifth Elephant 2014 talk - Crafting Visual Stories with Data by
Fifth Elephant 2014 talk - Crafting Visual Stories with DataFifth Elephant 2014 talk - Crafting Visual Stories with Data
Fifth Elephant 2014 talk - Crafting Visual Stories with DataAmit Kapoor
9.8K views76 slides
Storytelling with Data - See | Show | Tell | Engage by
Storytelling with Data - See | Show | Tell | EngageStorytelling with Data - See | Show | Tell | Engage
Storytelling with Data - See | Show | Tell | EngageAmit Kapoor
7.5K views145 slides

More from Amit Kapoor(14)

Deep Learning for NLP by Amit Kapoor
Deep Learning for NLPDeep Learning for NLP
Deep Learning for NLP
Amit Kapoor2.2K views
Model Visualisation by Amit Kapoor
Model VisualisationModel Visualisation
Model Visualisation
Amit Kapoor1.6K views
Storytelling with Data - Approach | Skills by Amit Kapoor
Storytelling with Data - Approach | SkillsStorytelling with Data - Approach | Skills
Storytelling with Data - Approach | Skills
Amit Kapoor3.5K views
Tools & Resources for Data Visualisation by Amit Kapoor
Tools & Resources for Data VisualisationTools & Resources for Data Visualisation
Tools & Resources for Data Visualisation
Amit Kapoor4.5K views
Fifth Elephant 2014 talk - Crafting Visual Stories with Data by Amit Kapoor
Fifth Elephant 2014 talk - Crafting Visual Stories with DataFifth Elephant 2014 talk - Crafting Visual Stories with Data
Fifth Elephant 2014 talk - Crafting Visual Stories with Data
Amit Kapoor9.8K views
Storytelling with Data - See | Show | Tell | Engage by Amit Kapoor
Storytelling with Data - See | Show | Tell | EngageStorytelling with Data - See | Show | Tell | Engage
Storytelling with Data - See | Show | Tell | Engage
Amit Kapoor7.5K views
Business Process Improvement - A Strategic and Supply Chain Perspective by Amit Kapoor
Business Process Improvement - A Strategic and Supply Chain Perspective Business Process Improvement - A Strategic and Supply Chain Perspective
Business Process Improvement - A Strategic and Supply Chain Perspective
Amit Kapoor1.8K views
What makes a data-story work? by Amit Kapoor
What makes a data-story work?What makes a data-story work?
What makes a data-story work?
Amit Kapoor2.1K views
What is Strategy - Thinking like a Strategist by Amit Kapoor
What is Strategy - Thinking like a StrategistWhat is Strategy - Thinking like a Strategist
What is Strategy - Thinking like a Strategist
Amit Kapoor4.7K views
Story Structure and Modern Storytelling by Amit Kapoor
Story Structure and Modern StorytellingStory Structure and Modern Storytelling
Story Structure and Modern Storytelling
Amit Kapoor2K views
Targeting the Moment of Truth - Using Big Data in Retail by Amit Kapoor
Targeting the Moment of Truth - Using Big Data in RetailTargeting the Moment of Truth - Using Big Data in Retail
Targeting the Moment of Truth - Using Big Data in Retail
Amit Kapoor1.3K views
Storytelling - Gutenberg by Amit Kapoor
Storytelling - GutenbergStorytelling - Gutenberg
Storytelling - Gutenberg
Amit Kapoor534 views
Analytics in Consulting by Amit Kapoor
Analytics in ConsultingAnalytics in Consulting
Analytics in Consulting
Amit Kapoor2.7K views
Retail Pricing Perspective by Amit Kapoor
Retail Pricing PerspectiveRetail Pricing Perspective
Retail Pricing Perspective
Amit Kapoor2.2K views

Recently uploaded

Building Real-Time Travel Alerts by
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel AlertsTimothy Spann
109 views48 slides
RuleBookForTheFairDataEconomy.pptx by
RuleBookForTheFairDataEconomy.pptxRuleBookForTheFairDataEconomy.pptx
RuleBookForTheFairDataEconomy.pptxnoraelstela1
67 views16 slides
ColonyOS by
ColonyOSColonyOS
ColonyOSJohanKristiansson6
9 views17 slides
Chapter 3b- Process Communication (1) (1)(1) (1).pptx by
Chapter 3b- Process Communication (1) (1)(1) (1).pptxChapter 3b- Process Communication (1) (1)(1) (1).pptx
Chapter 3b- Process Communication (1) (1)(1) (1).pptxayeshabaig2004
5 views30 slides
Survey on Factuality in LLM's.pptx by
Survey on Factuality in LLM's.pptxSurvey on Factuality in LLM's.pptx
Survey on Factuality in LLM's.pptxNeethaSherra1
5 views9 slides
Cross-network in Google Analytics 4.pdf by
Cross-network in Google Analytics 4.pdfCross-network in Google Analytics 4.pdf
Cross-network in Google Analytics 4.pdfGA4 Tutorials
6 views7 slides

Recently uploaded(20)

Building Real-Time Travel Alerts by Timothy Spann
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel Alerts
Timothy Spann109 views
RuleBookForTheFairDataEconomy.pptx by noraelstela1
RuleBookForTheFairDataEconomy.pptxRuleBookForTheFairDataEconomy.pptx
RuleBookForTheFairDataEconomy.pptx
noraelstela167 views
Chapter 3b- Process Communication (1) (1)(1) (1).pptx by ayeshabaig2004
Chapter 3b- Process Communication (1) (1)(1) (1).pptxChapter 3b- Process Communication (1) (1)(1) (1).pptx
Chapter 3b- Process Communication (1) (1)(1) (1).pptx
ayeshabaig20045 views
Survey on Factuality in LLM's.pptx by NeethaSherra1
Survey on Factuality in LLM's.pptxSurvey on Factuality in LLM's.pptx
Survey on Factuality in LLM's.pptx
NeethaSherra15 views
Cross-network in Google Analytics 4.pdf by GA4 Tutorials
Cross-network in Google Analytics 4.pdfCross-network in Google Analytics 4.pdf
Cross-network in Google Analytics 4.pdf
GA4 Tutorials6 views
Data structure and algorithm. by Abdul salam
Data structure and algorithm. Data structure and algorithm.
Data structure and algorithm.
Abdul salam 18 views
Introduction to Microsoft Fabric.pdf by ishaniuudeshika
Introduction to Microsoft Fabric.pdfIntroduction to Microsoft Fabric.pdf
Introduction to Microsoft Fabric.pdf
ishaniuudeshika24 views
UNEP FI CRS Climate Risk Results.pptx by pekka28
UNEP FI CRS Climate Risk Results.pptxUNEP FI CRS Climate Risk Results.pptx
UNEP FI CRS Climate Risk Results.pptx
pekka2811 views
Short Story Assignment by Kelly Nguyen by kellynguyen01
Short Story Assignment by Kelly NguyenShort Story Assignment by Kelly Nguyen
Short Story Assignment by Kelly Nguyen
kellynguyen0118 views
JConWorld_ Continuous SQL with Kafka and Flink by Timothy Spann
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and Flink
Timothy Spann100 views
Advanced_Recommendation_Systems_Presentation.pptx by neeharikasingh29
Advanced_Recommendation_Systems_Presentation.pptxAdvanced_Recommendation_Systems_Presentation.pptx
Advanced_Recommendation_Systems_Presentation.pptx
Organic Shopping in Google Analytics 4.pdf by GA4 Tutorials
Organic Shopping in Google Analytics 4.pdfOrganic Shopping in Google Analytics 4.pdf
Organic Shopping in Google Analytics 4.pdf
GA4 Tutorials10 views
Launch of the Knowledge Exchange Platform - Romina Boarini - 21 November 2023 by StatsCommunications
Launch of the Knowledge Exchange Platform - Romina Boarini - 21 November 2023Launch of the Knowledge Exchange Platform - Romina Boarini - 21 November 2023
Launch of the Knowledge Exchange Platform - Romina Boarini - 21 November 2023
Supercharging your Data with Azure AI Search and Azure OpenAI by Peter Gallagher
Supercharging your Data with Azure AI Search and Azure OpenAISupercharging your Data with Azure AI Search and Azure OpenAI
Supercharging your Data with Azure AI Search and Azure OpenAI
Peter Gallagher37 views

The Power of Ensembles in Machine Learning

  • 1. Power of Ensembles______ Amit Kapoor @amitkaps Bargava Subramanian @bargava
  • 2. The Blind Men & the Elephant ___ “And so these men of Indostan Disputed loud and long, Each in his own opinion Exceeding stiff and strong, Though each was partly in the right, And all were in the wrong.” — John Godfrey Saxe
  • 3. Model Abstraction ___ "All models are wrong, but some are useful" — George Box
  • 4. Building Many Models ___ "All models are wrong, but some are useful, and their combination may be better"
  • 5. Ensembles ___ (noun) /änˈsämbəl/ a group of items viewed as a whole rather than individually.
  • 6. Machine Learning Process ___ Frame: Problem definition Acquire: Data ingestion Refine: Data wrangling Transform: Feature creation Explore: Feature selection Model: Model selection Insight: Solution communication
  • 7. Machine Learning Model Pipeline ___ Data --- Feature --- Model --- Model Space Space Algorithm Parameters (n) (f) (m) (p)
  • 8. Machine Learning Challenge ___ "...the task of searching through a hypotheses space to find a suitable hypothesis that will make good predictions for a particular problem"
  • 9. Hypotheses Space ___ Data Space (n) --- Feature Space (f) --- Model Algorithm (m) --- Model Parameters (p). The Hypotheses Space spans all four stages.
  • 10. Search Hypotheses Space ___ The key question is how do we efficiently search this Hypotheses Space?
  • 11. Illustrative Example ___ Binary Classification Problem. Classes: c = 2; Features: f = x1, x2, x3; Observations: n = 1,000
  • 13. Dimensionality Reduction ___ "Easier to visualise the data space". Principal Component Analysis; Dimensions = 2 --> (pc1, pc2)
  • 15. Approach to Build Models ___ 1. Train a model 2. Predict the probabilities for each class 3. Score the model on AUC via 5-Fold Cross-Validation 4. Show the Decision Space
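Steps 1 to 3 of this approach can be sketched as follows. This is a minimal illustration, assuming scikit-learn; the talk's dataset is not reproduced here, so `make_classification` stands in as a synthetic substitute with the same shape (1,000 rows, 3 features, 2 classes). Step 4, drawing the decision space, is omitted.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the talk's data (assumption): n = 1,000, f = 3, c = 2.
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=2, random_state=0)

# Reduce to two principal components (pc1, pc2).
# For simplicity PCA is fit on all rows before cross-validation.
X_pc = PCA(n_components=2).fit_transform(X)

# Step 1: train a model; step 2: predict the probability of each class.
model = LogisticRegression(C=1.0)
model.fit(X_pc, y)
proba = model.predict_proba(X_pc)  # one column per class

# Step 3: score the model on AUC via 5-fold cross-validation.
auc_scores = cross_val_score(model, X_pc, y, cv=5, scoring="roc_auc")
mean_auc = auc_scores.mean()
print(f"mean AUC over 5 folds: {mean_auc:.3f}")
```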
  • 16. Simple Model with Different Features ___
      Data Space (n)   Feature Space (f)   Model Algorithm (m)   Model Parameters (p)
      100%             x1,x2               Logistic              C=1
      100%             x2,x3               Logistic              C=1
      100%             x1,x3               Logistic              C=1
      100%             pc1,pc2             Logistic              C=1
  • 18. Tune the Model Parameters ___
      Data Space (n)   Feature Space (f)   Model Algorithm (m)   Model Parameters (p)
      100%             pc1,pc2             Logistic              C=1
      100%             pc1,pc2             Logistic              C=1e-1
      100%             pc1,pc2             Logistic              C=1e-3
      100%             pc1,pc2             Logistic              C=1e-5
  • 20. Make Many Simple Models ___
      Data Space (n)   Feature Space (f)   Model Algorithm (m)   Model Parameters (p)
      100%             pc1,pc2             Logistic              C=1
      100%             pc1,pc2             SVM-Linear            prob=True
      100%             pc1,pc2             Decision Tree         d=full
      100%             pc1,pc2             KNN                   nn=3
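The four simple models in this table could be compared along the following lines. This is a sketch, assuming scikit-learn and synthetic data in place of the talk's dataset; reading `prob=True`, `d=full` and `nn=3` as `probability=True`, `max_depth=None` and `n_neighbors=3` is an interpretation of the slide's shorthand.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the talk's data (assumption).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=2, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

# Same data and features, four different model algorithms.
models = {
    "Logistic (C=1)": LogisticRegression(C=1.0),
    "SVM-Linear (prob=True)": SVC(kernel="linear", probability=True, random_state=0),
    "Decision Tree (d=full)": DecisionTreeClassifier(max_depth=None, random_state=0),
    "KNN (nn=3)": KNeighborsClassifier(n_neighbors=3),
}

# Mean AUC over 5-fold cross-validation for each model.
scores = {
    name: cross_val_score(model, X_pc, y, cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}
for name, auc in scores.items():
    print(f"{name:24s} AUC = {auc:.3f}")
```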
  • 22. Hypotheses Search Approach ___ Exhaustive search is impossible; Compute as a proxy for human IQ; Clever algorithmic way to search the solution space
  • 23. Ensemble Thought Process ___ "The goal of ensemble methods is to combine the predictions of several base estimators built with (one or multiple) model algorithms in order to improve generalisation and robustness over a single estimator."
  • 24. Ensemble Approach ___ Different training sets; Different feature sampling; Different algorithms; Different (hyper) parameters; Clever aggregation
  • 25. Ensemble Methods ___ [1] Averaging [2] Boosting [3] Voting [4] Stacking
  • 26. [1] Averaging: Concept ___ "Build several estimators independently and then average their predictions to reduce model variance." Ensemble several good models to produce a lower-variance model.
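The variance claim can be made concrete with a standard identity. The symbols are assumptions introduced here, not taken from the slides: $M$ estimators $\hat{f}_1, \dots, \hat{f}_M$, each with prediction variance $\sigma^2$ and pairwise correlation $\rho$.

```latex
\operatorname{Var}\!\left(\frac{1}{M}\sum_{i=1}^{M}\hat{f}_i(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{M}\,\sigma^2
```

With fully independent estimators ($\rho = 0$) the variance of the average falls to $\sigma^2 / M$. With identical estimators ($\rho = 1$) averaging gains nothing, which is why averaging methods perturb the training rows or features to decorrelate their members.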
  • 28. [1] Averaging: Simple Models ___
      Data Space (n)   Feature Space (f)   Model Algorithm (m)   Model Parameters (p)
      R[50%]           pc1,pc2             Decision Tree         n_est=10
      R[50%,r]         pc1,pc2             Decision Tree         n_est=10
      100%             R[50%]              Decision Tree         n_est=10
      R[50%]           R[50%]              Decision Tree         n_est=10
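One way to express these four sampling schemes is with scikit-learn's `BaggingClassifier`. This is a sketch under stated assumptions: `R[50%]` is read as a random 50% sample, the trailing `r` as sampling with replacement, and synthetic data stands in for the talk's dataset. The helper `make_bagger` is hypothetical, and all four variants are run on (pc1, pc2) for simplicity.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the talk's data (assumption).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=2, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

def make_bagger(max_samples, bootstrap, max_features):
    """Hypothetical helper: 10 decision trees, each fit on a random subset."""
    return BaggingClassifier(DecisionTreeClassifier(), n_estimators=10,
                             max_samples=max_samples, bootstrap=bootstrap,
                             max_features=max_features, random_state=0)

variants = {
    # 50% of rows without replacement, all features
    "R[50%] rows": make_bagger(0.5, False, 1.0),
    # 50% of rows with replacement, all features
    "R[50%,r] rows": make_bagger(0.5, True, 1.0),
    # all rows, random 50% of features
    "R[50%] features": make_bagger(1.0, False, 0.5),
    # 50% of rows and 50% of features
    "R[50%] rows + R[50%] features": make_bagger(0.5, False, 0.5),
}

scores = {
    name: cross_val_score(model, X_pc, y, cv=5, scoring="roc_auc").mean()
    for name, model in variants.items()
}
for name, auc in scores.items():
    print(f"{name:32s} AUC = {auc:.3f}")
```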
  • 30. [1] Averaging: Extended Models ___ Use perturb-and-combine techniques. Bagging: Bootstrap Aggregation; Random Forest: Best split amongst random features; Extremely Randomised: Random threshold for split
  • 31. [1] Averaging: Extended Models ___
      Data Space (n)   Feature Space (f)   Model Algorithm (m)   Model Parameters (p)
      100%             pc1,pc2             Decision Tree         d=full
      R[50%,r]         pc1,pc2             Decision Tree         n_est=10
      R[50%,r]         R[Split]            Decision Tree         n_est=10
      R[50%,r]         R[Split,thresh]     Decision Tree         n_est=10
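The four rows above map naturally onto a single tree, bagging, a random forest and extremely randomised trees. The sketch below assumes scikit-learn and synthetic data; setting `max_samples=0.5` to mirror `R[50%,r]` is an interpretation of the slide's notation.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import (BaggingClassifier, ExtraTreesClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the talk's data (assumption).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=2, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

models = {
    # Baseline: one fully grown tree on all rows.
    "Single tree (d=full)": DecisionTreeClassifier(random_state=0),
    # Bagging: bootstrap samples of the rows, all features.
    "Bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=10,
                                 max_samples=0.5, bootstrap=True, random_state=0),
    # Random Forest: bootstrap rows, best split among a random feature subset.
    "Random Forest": RandomForestClassifier(n_estimators=10, max_samples=0.5,
                                            random_state=0),
    # Extra Trees: random feature subset and random split thresholds.
    "Extra Trees": ExtraTreesClassifier(n_estimators=10, bootstrap=True,
                                        max_samples=0.5, random_state=0),
}

scores = {
    name: cross_val_score(model, X_pc, y, cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}
for name, auc in scores.items():
    print(f"{name:22s} AUC = {auc:.3f}")
```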
  • 33. [2] Boosting: Concept ___ "Build base estimators sequentially and then try to reduce the bias of the combined estimator." Combine several weak models to produce a powerful ensemble.
  • 35. [2] Boosting: Models ___
      Data Space (n)   Feature Space (f)   Model Algorithm (m)   Model Parameters (p)
      100%             pc1,pc2             Decision Tree         d=full
      100%             pc1,pc2             DTree Boost           n_est=10
      100%             pc1,pc2             DTree Boost           n_est=10, d=1
      100%             pc1,pc2             DTree Boost           n_est=10, d=2
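A boosting comparison in this spirit might look like the sketch below. The slide does not say which algorithm "DTree Boost" refers to; AdaBoost over decision trees is assumed here, with scikit-learn and synthetic data. The row without a depth is read as boosting fully grown trees, which is also an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the talk's data (assumption).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=2, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

def boosted_tree(depth):
    """Hypothetical helper: 10 sequentially boosted trees of the given depth."""
    base = DecisionTreeClassifier(max_depth=depth, random_state=0)
    # The base estimator is passed positionally so the call does not depend
    # on the keyword name, which differs between scikit-learn versions.
    return AdaBoostClassifier(base, n_estimators=10, random_state=0)

models = {
    "Decision Tree (d=full)": DecisionTreeClassifier(random_state=0),
    "DTree Boost (n_est=10)": boosted_tree(None),
    "DTree Boost (n_est=10, d=1)": boosted_tree(1),
    "DTree Boost (n_est=10, d=2)": boosted_tree(2),
}

scores = {
    name: cross_val_score(model, X_pc, y, cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}
for name, auc in scores.items():
    print(f"{name:30s} AUC = {auc:.3f}")
```

Shallow trees (d=1, d=2) are the weak learners that boosting is designed for. A fully grown tree already fits the training data, which leaves later boosting rounds little to correct.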
  • 37. [3] Voting: Concept ___ "To use conceptually different model algorithms and use majority vote or soft vote (average probabilities) to combine." To ensemble a set of equally well performing models in order to balance out their individual weaknesses.
  • 39. [3] Voting: Approach ___ Hard Voting: the majority (mode) of the class labels predicted; Soft Voting: the argmax of the sum of predicted probabilities; Weighted Voting: the argmax of the weighted sum of predicted probabilities
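The three voting rules are small enough to write out directly. The sketch below uses only the standard library; the function names, the three example classifiers and their numbers are made up for illustration.

```python
from collections import Counter

def hard_vote(labels):
    """Hard voting: the majority (mode) of the predicted class labels."""
    return Counter(labels).most_common(1)[0][0]

def soft_vote(probas, weights=None):
    """Soft voting: argmax of the (optionally weighted) sum of probabilities."""
    if weights is None:
        weights = [1.0] * len(probas)  # unweighted: every model counts equally
    n_classes = len(probas[0])
    totals = [sum(w * p[c] for w, p in zip(weights, probas))
              for c in range(n_classes)]
    return max(range(n_classes), key=totals.__getitem__)

# Three hypothetical classifiers scoring one observation (classes 0 and 1).
labels = [0, 0, 1]
probas = [[0.55, 0.45], [0.60, 0.40], [0.10, 0.90]]

hard = hard_vote(labels)                          # two of three predict class 0
soft = soft_vote(probas)                          # sums: 1.25 vs 1.75
weighted = soft_vote(probas, weights=[3, 3, 1])   # sums: 3.55 vs 3.45
print(hard, soft, weighted)  # 0 1 0
```

The example is chosen so the rules disagree: two lukewarm votes for class 0 win the hard vote, one confident vote for class 1 wins the soft vote, and up-weighting the first two models tips the result back to class 0.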
  • 40. [3] Voting: Models ___
      Data Space (n)   Feature Space (f)   Model Algorithm (m)   Model Parameters (p)
      100%             pc1,pc2             Gaussian              -
      100%             pc1,pc2             SVM - RBF             gamma=0.5, C=1
      100%             pc1,pc2             Gradient Boost        n_est=10, d=2
      -----            -------             --------------        --------------
      100%             pc1,pc2             Voting                Soft
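A soft-voting ensemble of these three models could be assembled as below. This is a sketch, assuming scikit-learn and synthetic data; reading "Gaussian" as Gaussian Naive Bayes is an assumption, since the slide does not spell it out.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in for the talk's data (assumption).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=2, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

# Three conceptually different base models.
base_models = [
    ("gaussian", GaussianNB()),  # assumed meaning of "Gaussian"
    # probability=True is needed so the SVM can contribute to a soft vote.
    ("svm_rbf", SVC(kernel="rbf", gamma=0.5, C=1.0, probability=True,
                    random_state=0)),
    ("grad_boost", GradientBoostingClassifier(n_estimators=10, max_depth=2,
                                              random_state=0)),
]

# Soft voting averages the predicted class probabilities of the base models.
voter = VotingClassifier(estimators=base_models, voting="soft")
voting_auc = cross_val_score(voter, X_pc, y, cv=5, scoring="roc_auc").mean()
print(f"soft-voting AUC = {voting_auc:.3f}")
```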
  • 42. [4] Stacking ___ "To use output of different model algorithms as feature inputs for a meta classifier." To ensemble a set of well performing models in order to uncover higher order interaction effects
  • 44. [4] Stacking: Models ___
      Data Space (n)   Feature Space (f)   Model Algorithm (m)   Model Parameters (p)
      100%             pc1,pc2             Gaussian              -
      100%             pc1,pc2             SVM - RBF             gamma=0.5, C=1
      100%             pc1,pc2             Gradient Boost        n_est=10, d=2
      -----            -------             --------------        --------------
      100%             pc1,pc2             Stacking              Logistic()
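The stacked version of the same three models, with a logistic regression as the meta classifier, might be written as follows. This is a sketch, assuming scikit-learn's `StackingClassifier` and synthetic data; as before, "Gaussian" is read as Gaussian Naive Bayes. The original talk may have built the stack differently.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in for the talk's data (assumption).
X, y = make_classification(n_samples=1000, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=2, random_state=0)
X_pc = PCA(n_components=2).fit_transform(X)

base_models = [
    ("gaussian", GaussianNB()),  # assumed meaning of "Gaussian"
    ("svm_rbf", SVC(kernel="rbf", gamma=0.5, C=1.0, random_state=0)),
    ("grad_boost", GradientBoostingClassifier(n_estimators=10, max_depth=2,
                                              random_state=0)),
]

# The base models' out-of-fold outputs (cv=5) become the input features
# of the meta classifier, here a logistic regression.
stack = StackingClassifier(estimators=base_models,
                           final_estimator=LogisticRegression(), cv=5)
stacking_auc = cross_val_score(stack, X_pc, y, cv=5, scoring="roc_auc").mean()
print(f"stacking AUC = {stacking_auc:.3f}")
```

Training the meta classifier on out-of-fold predictions rather than in-sample ones keeps it from simply learning which base model overfits the most.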
  • 46. Ensemble Advantages ___ Improved accuracy; Robustness and better generalisation; Parallelization
  • 47. Ensemble Disadvantages ___ Model human readability isn't great; Time/effort trade-off to improve accuracy may not make sense
  • 48. Advanced Ensemble Approach ___ Grid Search for voting weights; Bayesian Optimization for weights & hyper-parameters; Feature Engineering, e.g. tree or cluster embedding
  • 49. Ensembles ___ "It is the harmony of the diverse parts, their symmetry, their happy balance; in a word it is all that introduces order, all that gives unity, that permits us to see clearly and to comprehend at once both the ensemble and the details." - Henri Poincare
  • 50. Code and Slides ____ All the code and slides are available at https://github.com/amitkaps/ensemble