Adaboost Algorithm
Apr. 09, 2019
Yangwoo, Kim
Master’s student at BioComputing Lab
1
Contents
2
1. Introduction
2. Ensemble Summary
3. Random Forest (Bagging)
4. Adaboost (Boosting)
5. Application
3
1. Introduction
“Ensemble Learning” Definition:
In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.
Ensemble model types: 01 Bagging, 02 Boosting, 03 Bayesian model combination, 04 Stacking
In these slides, we will focus on the ‘Boosting’ algorithm!
https://en.wikipedia.org/wiki/Boosting_(machine_learning)
4
1. Introduction
• Kaggle hosts many different types of data, and most people build a baseline using xgboost and lightgbm
5
1. Introduction
• Among the 29 challenge winning solutions published at Kaggle’s blog during 2015, 17 solutions used XGBoost !
• Among these solutions, eight solely used XGBoost to train the model, while most others combined XGBoost
with neural nets in ensembles.
• For comparison, the second most popular method, deep neural nets, was used in 11 solutions.
6
2. Ensemble Summary
[ Bagging (bootstrap aggregating) ]
• Improves stability and accuracy by reducing variance and avoiding overfitting.
From a training set $D$ of size $n$, generate training sets $D_i$ of size $n'$ ($n' = n$) by sampling with replacement → bootstrap samples
For categorical data, combine by voting!
For regression, average (mean) the outputs!
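A minimal Python sketch of this bagging recipe (the toy dataset, 25 bootstrap rounds, and decision trees as base learners are my own assumptions, not from the slides):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)  # toy data (assumption)
rng = np.random.default_rng(0)

models = []
for _ in range(25):                                # 25 bootstrap rounds
    idx = rng.integers(0, len(X), size=len(X))     # sample n' = n rows with replacement
    models.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

votes = np.stack([m.predict(X) for m in models])   # one row of predictions per model
y_vote = (votes.mean(axis=0) > 0.5).astype(int)    # categorical: majority vote
# For regression, the combination step would be votes.mean(axis=0) instead.
```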
7
2. Ensemble Summary
https://www.youtube.com/watch?v=2Mg8QD0F1dQ
[ Bagging (bootstrap aggregating) ]
[Diagram: the data is split into Train (60%) and Test (40%); bootstrap samples $D_1, D_2, \dots, D_m$ are drawn from the training set at random with replacement; one model is trained on each sample; the models' outputs are combined by mean / voting.]
8
2. Ensemble Summary
[ Bagging (bootstrap aggregating) ]
• e.g. Random Forest
For solving:
1. Underfitting with high bias
2. Overfitting with high variance
9
2. Ensemble Summary
https://quantdare.com/what-is-the-difference-between-bagging-and-boosting/
[ Boosting ]
• “Can a set of weak learners create a single strong learner?”
• Weak learner: a classifier that is only slightly correlated with the true classification
• Compared to bagging, boosting gives more weight to misclassified data to improve classification accuracy
• We will only focus on ‘Adaboost’ in this presentation
10
2. Ensemble Summary
https://www.gormanalysis.com/blog/guide-to-model-stacking-i-e-meta-ensembling/
[ Stacking ]
• Involves training a learning algorithm to combine the predictions of several other learning algorithms.
• First, all of the other algorithms are trained using the available data, then a combiner algorithm is trained to make
a final prediction using all the predictions of the other algorithms as additional inputs.
• Four people (Bob, Kate, Mark, Sue) threw 187 darts at a board!
• 150 samples (train data) + 27 samples (test data)
11
2. Ensemble Summary
https://www.gormanalysis.com/blog/guide-to-model-stacking-i-e-meta-ensembling/
[ Stacking ]
• Our goal is to stack these models (KNN, SVM)
The KNN model does a good job at classifying Kate’s and Mark’s throws
The SVM model does a good job at classifying Bob’s and Sue’s throws
How can we combine the two advantages by stacking?
12
2. Ensemble Summary
https://www.gormanalysis.com/blog/guide-to-model-stacking-i-e-meta-ensembling/
[ Stacking ]
The training dataset is divided into 5 folds
(Train: 150 samples, Test: 27 samples)
True value (Competitor)
13
2. Ensemble Summary
https://www.gormanalysis.com/blog/guide-to-model-stacking-i-e-meta-ensembling/
[ Stacking ]
• M1 : K-Nearest Neighbors(k=1)
• M2 : Support Vector Machine (type = 4, cost = 1000)
Make meta data
14
2. Ensemble Summary
https://www.gormanalysis.com/blog/guide-to-model-stacking-i-e-meta-ensembling/
[ Stacking ]
• M1 : K-Nearest Neighbors(k=1)
• M2 : Support Vector Machine (type = 4, cost = 1000)
• Fit the base model to the training folds (see the sketch below)
• (Fill FoldId = 1) Train on FoldIds 2–5, then predict with M1, M2 on FoldId 1
• (Fill FoldId = 2) Train on FoldIds 1, 3, 4, 5, then predict with M1, M2 on FoldId 2
• ...
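A rough sketch of the out-of-fold procedure above in Python; the darts data is replaced by a random toy stand-in, and SVC(C=1000) approximates the slide's SVM settings:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def out_of_fold_predictions(model, X, y, n_splits=5):
    """Fill each fold with predictions from a model trained on the other folds."""
    preds = np.empty(len(y), dtype=y.dtype)
    for train_idx, fill_idx in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(X):
        model.fit(X[train_idx], y[train_idx])          # train on the other 4 folds
        preds[fill_idx] = model.predict(X[fill_idx])   # predict the held-out fold
    return preds

# Toy stand-in for the 150 training throws (2 coordinates, 2 competitors).
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(150, 2)), np.repeat([0, 1], 75)

meta_M1 = out_of_fold_predictions(KNeighborsClassifier(n_neighbors=1), X_train, y_train)
meta_M2 = out_of_fold_predictions(SVC(C=1000), X_train, y_train)   # rough analogue of the slide's SVM
```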
15
2. Ensemble Summary
https://www.gormanalysis.com/blog/guide-to-model-stacking-i-e-meta-ensembling/
[ Stacking ]
• Fit each base model (M1, M2) to the full training dataset
- This means test_meta’s M1 and M2 columns are filled using models trained on all of the training folds
• Fit a new model (the stacking model)! Optionally, include other features from the original training dataset or engineered features
• The main point to take home is that we use the predictions of the base models as features (i.e. meta features)
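The same pipeline can be sketched with scikit-learn's StackingClassifier, which builds the out-of-fold meta features internally, refits the base models on the full training set, and trains the combiner; the toy data and the logistic-regression combiner are assumptions, not the blog's choices:

```python
import numpy as np
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(150, 2)), np.repeat([0, 1], 75)   # toy darts stand-in
X_test = rng.normal(size=(27, 2))

stack = StackingClassifier(
    estimators=[("M1", KNeighborsClassifier(n_neighbors=1)), ("M2", SVC(C=1000))],
    final_estimator=LogisticRegression(),   # the combiner trained on the meta features
    cv=5,                                   # 5 folds, as in the slides
)
stack.fit(X_train, y_train)                 # builds train_meta internally, then fits the stacker
y_pred = stack.predict(X_test)              # base models are refit on the full training set
```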
16
2. Ensemble Summary
https://www.gormanalysis.com/blog/guide-to-model-stacking-i-e-meta-ensembling/
[ Stacking ]
The KNN model does a good job at classifying Kate’s and Mark’s throws
The SVM model does a good job at classifying Bob’s and Sue’s throws
The stacked model combines the base models’ advantages
17
3. Random Forest
• Before we understand XGBoost, we have to know ‘Gradient Boosting’...
• Before we understand ‘Gradient Boosting’, we have to know ‘Boosting’, ‘AdaBoost’, and ‘Random Forest’
https://www.youtube.com/watch?v=J4Wdy0Wc_xQ&t=0s
Step 1) Create a bootstrapped dataset
Samples are randomly selected with replacement!
18
3. Random Forest
Step 2) Create a decision tree using the bootstrapped dataset !
Only consider a random subset of variables at each step!
You’d do this hundreds of times!
https://www.youtube.com/watch?v=J4Wdy0Wc_xQ&t=0s
Note: the blue node can be split using 2 or more candidate variables
19
3. Random Forest
Step 3) How to use it ?
https://www.youtube.com/watch?v=J4Wdy0Wc_xQ&t=0s
Run the new sample down all of the trees (etc.) and tally the votes: Voting → YES!
Bootstrapping the data plus using the aggregate to make a decision is called ‘Bagging’
20
3. Random Forest
Step 4) How do we know if it’s any good?
The Out-Of-Bag (OOB) dataset consists of the samples that were not included in the bootstrapped dataset
https://www.youtube.com/watch?v=J4Wdy0Wc_xQ&t=0s
[The trees vote Yes, Yes, No on an out-of-bag sample.]
We can also compute an OOB error from this step.
21
3. Random Forest
https://www.youtube.com/watch?v=J4Wdy0Wc_xQ&t=0s
Step 5) Change the number of variables considered per step and find the best model
Recall that random forest is bagging !
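A hedged sketch of Steps 1-5 with scikit-learn's RandomForestClassifier: bootstrapping and random feature subsetting happen internally, oob_score=True gives the Step 4 check, and varying max_features is Step 5 (the toy dataset is an assumption, not the video's example):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)  # toy data (assumption)

for max_features in (1, 2, 3, 4):            # Step 5: vary the variables considered per split
    rf = RandomForestClassifier(
        n_estimators=500,                    # hundreds of bootstrapped trees (Steps 1-2)
        max_features=max_features,
        oob_score=True,                      # Step 4: accuracy on the out-of-bag samples
        random_state=0,
    ).fit(X, y)
    print(max_features, round(rf.oob_score_, 3))   # keep the setting with the best OOB score
```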
22
4. Adaboost
• Before we read the XGBoost theory, recall boosting!
- Converts weak learners into strong ones / sequential model / random sampling with replacement
Timeline: “Thoughts on Hypothesis Boosting” (1988) → AdaBoost (1995): “A decision-theoretic generalization of on-line learning and an application to boosting” → generalization of AdaBoost as Gradient Boosting (2012) → XGBoost (2016)
https://www.slideshare.net/freepsw/boosting-bagging-vs-boosting
23 https://www.youtube.com/watch?v=GM3CDQfQ4sw
[ Boosting ]
[Diagram: the data is split into Train (60%) and Test (40%). A random sample $D_1$ trains the first model; the samples it misclassifies are emphasized when drawing $D_2$ for the second model; $D_3$ and later samples repeat the process, and the models are combined.]
4. Adaboost
• How do we optimize this?
- AdaBoost, Gradient Boosting, and so on
24 https://www.youtube.com/watch?v=LsK-xG1cLYA
4. Adaboost
• In a random forest, there is no predetermined maximum tree depth
• In contrast, each tree in a forest of trees made with AdaBoost
- uses only one variable (feature) to make a decision
- therefore, its accuracy on its own is low (a weak learner)
Stump → weak learner
25 https://www.youtube.com/watch?v=LsK-xG1cLYA
4. Adaboost
• In AdaBoost, the sequence is important
- The next stump is built under the influence of the previous stump’s errors
- i.e. the errors that the second stump makes influence how the third stump is made
Initial sample weight $= \frac{1}{\#\text{ total samples}}$
• Let’s start !
26 https://www.youtube.com/watch?v=LsK-xG1cLYA
4. Adaboost
1) Make the first stump / Use the Gini index
2) Find the errors (incorrectly classified samples) / Determine the Amount of Say
3) Give weights to (or modify the weights of) the error samples
27 https://www.youtube.com/watch?v=LsK-xG1cLYA
4. Adaboost
1) Make the first stump / Use the Gini index
2) Find the errors (incorrectly classified samples) / Determine the Amount of Say
3) Give weights to (or modify the weights of) the error samples
Gini index = probability of misclassification
- Gini is 0 for all signal or all background
- $Gini = \left(\sum_{i=1}^{n} W_i\right) P(1 - P)$
Gini of each candidate stump (one term per leaf):
$\frac{3}{5}\cdot\frac{2}{5} + \frac{2}{3}\cdot\frac{1}{3} = 0.24 + 0.22 = 0.46$
$\frac{3}{6}\cdot\frac{3}{6} + \frac{1}{2}\cdot\frac{1}{2} = 0.25 + 0.25 = 0.50$
$\frac{3}{3}\cdot\frac{0}{3} + \frac{4}{5}\cdot\frac{1}{5} = 0 + 0.20 = 0.20$
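A small sketch of scoring one candidate stump with the weighted Gini form above; the feature values, labels, weights, and threshold below are illustrative assumptions, not the video's table:

```python
import numpy as np

def stump_gini(feature, labels, weights, threshold):
    """Slide's form: sum over the two leaves of (sum of W_i in the leaf) * P * (1 - P)."""
    gini = 0.0
    for leaf in (feature <= threshold, feature > threshold):
        w = weights[leaf]
        if w.sum() == 0:
            continue
        p = np.average(labels[leaf], weights=w)   # weighted fraction of positives in the leaf
        gini += w.sum() * p * (1 - p)
    return gini

weights = np.full(8, 1 / 8)                              # initial sample weights
labels = np.array([1, 1, 1, 1, 0, 0, 0, 1])              # 1 = positive class (toy values)
patient_weight = np.array([76, 58, 92, 88, 54, 63, 60, 81])  # hypothetical feature
print(stump_gini(patient_weight, labels, weights, threshold=70.0))  # lower Gini = better stump
```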
28 https://www.youtube.com/watch?v=LsK-xG1cLYA
4. Adaboost
1) Make the first stump / Use the Gini index
2) Find the errors (incorrectly classified samples) / Determine the Amount of Say
3) Give weights to (or modify the weights of) the error samples
Total Error $= \sum \text{Error} = \frac{1}{8}$
Amount of Say $a_m = \frac{1}{2}\log\frac{1 - \text{Total Error}}{\text{Total Error}}$, where $\text{Total Error} = \sum_{y_i \neq k_m(x_i)} w_i^{m} \Big/ \sum_{i=1}^{N} w_i^{m}$
$a_m = \frac{1}{2}\log\frac{1 - 1/8}{1/8} = \frac{1}{2}\log 7 = \mathbf{0.97}$
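Checking the slide's number in Python (natural log, as in the formula above):

```python
import math

total_error = 1 / 8
amount_of_say = 0.5 * math.log((1 - total_error) / total_error)   # 0.5 * ln(7)
print(round(amount_of_say, 2))   # 0.97, matching the slide
```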
29 https://www.youtube.com/watch?v=LsK-xG1cLYA
4. Adaboost
Amount of Say $a_m = \frac{1}{2}\log\frac{1 - \text{Total Error}}{\text{Total Error}}$, where $\text{Total Error} = \sum_{y_i \neq k_m(x_i)} w_i^{m} \Big/ \sum_{i=1}^{N} w_i^{m}$
1) Make the first stump / Use the Gini index
2) Find the errors (incorrectly classified samples) / Determine the Amount of Say
3) Give weights to (or modify the weights of) the error samples
[ Amount of Say graph ]
Error ↓ → Amount of Say ↑
Error = ½ → Amount of Say = 0
Error ↑ → Amount of Say ↓
30 https://www.youtube.com/watch?v=LsK-xG1cLYA
4. Adaboost
1) Make the first stump / Use the Gini index
2) Find the errors (incorrectly classified samples) / Determine the Amount of Say
3) Give weights to (or modify the weights of) the error samples
• Now, let’s look at how to modify the sample weights!
31 https://www.youtube.com/watch?v=LsK-xG1cLYA
4. Adaboost
1) Make the first stump / Use the Gini index
2) Find the errors (incorrectly classified samples) / Determine the Amount of Say
3) Give weights to (or modify the weights of) the error samples
For correctly classified samples: $\text{New sample weight} = \text{sample weight} \times e^{-\text{amount of say}}$
For misclassified samples: $\text{New sample weight} = \text{sample weight} \times e^{+\text{amount of say}}$
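A minimal sketch of one round of this weight update (the 8-sample setup follows the slides; which sample is misclassified is an assumption for illustration):

```python
import numpy as np

n = 8
weights = np.full(n, 1 / n)                    # initial sample weights = 1 / #samples
misclassified = np.zeros(n, dtype=bool)
misclassified[3] = True                        # assume the stump got the 4th sample wrong

total_error = weights[misclassified].sum()     # 1/8
say = 0.5 * np.log((1 - total_error) / total_error)

weights[misclassified] *= np.exp(+say)         # push misclassified samples up
weights[~misclassified] *= np.exp(-say)        # push correctly classified samples down
weights /= weights.sum()                       # normalize so the weights sum to 1 (next slide)
print(weights.round(2))                        # the misclassified sample now carries ~0.5
```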
32 https://www.youtube.com/watch?v=LsK-xG1cLYA
4. Adaboost
1) Make the first stump / Use the Gini index
2) Find the errors (incorrectly classified samples) / Determine the Amount of Say
3) Give weights to (or modify the weights of) the error samples
Normalize the new weights so that their sum = 1.
A Weighted Gini Index would then put more emphasis on correctly classifying this sample (the one misclassified by the previous stump).
33 https://www.youtube.com/watch?v=LsK-xG1cLYA
4. Adaboost
• Let’s think about the modified weights: how do they change the learning?
- We can make a new collection of samples that contains duplicate copies of the samples, drawn according to the weight ranges below (see the sketch after the list)
0.00 – 0.07 : 1st row samples
0.07 – 0.14 : 2nd row samples
0.14 – 0.21 : 3rd row samples
0.21 – 0.70 : 4th row samples
...
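A sketch of building the new collection from those cumulative-weight ranges: draw a random number in [0, 1) and pick the row whose bin contains it (the weight values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
weights = np.array([0.07, 0.07, 0.07, 0.49, 0.07, 0.07, 0.07, 0.07])
weights = weights / weights.sum()              # make them sum to exactly 1

bins = np.cumsum(weights)                      # ~0.07, 0.14, 0.21, 0.70, ... as on the slide
draws = rng.random(size=len(weights))          # one random number in [0, 1) per new sample
new_rows = np.searchsorted(bins, draws)        # row index whose range contains each draw

# Equivalent one-liner: sample row indices with probability proportional to the weights.
new_rows_alt = rng.choice(len(weights), size=len(weights), p=weights)
```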
34 https://www.youtube.com/watch?v=LsK-xG1cLYA
4. Adaboost
• We go back to the beginning and try to find the stump that does the best job classifying the new collection
of samples.
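Putting the whole loop together: a compact AdaBoost sketch that uses sample weights directly instead of resampling a new collection (the two views are equivalent in spirit); the dataset and the number of rounds are assumptions, not the video's example:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)
y_pm = np.where(y == 1, 1, -1)                  # labels as +1 / -1 for the final weighted vote

n_rounds = 20
w = np.full(len(y), 1 / len(y))                 # sample weight init = 1 / #total samples
stumps, says = [], []

for _ in range(n_rounds):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)  # one-split stump
    pred_pm = np.where(stump.predict(X) == 1, 1, -1)
    err = w[pred_pm != y_pm].sum() / w.sum()    # weighted Total Error
    say = 0.5 * np.log((1 - err) / (err + 1e-10))
    w = w * np.exp(-say * y_pm * pred_pm)       # e^{+say} for mistakes, e^{-say} for correct
    w = w / w.sum()                             # renormalize so the weights sum to 1
    stumps.append(stump)
    says.append(say)

# Final classification: sign of the say-weighted vote over all stumps.
votes = sum(a * np.where(s.predict(X) == 1, 1, -1) for s, a in zip(stumps, says))
print(((votes >= 0) == (y_pm > 0)).mean())      # training accuracy of the boosted ensemble
```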
Thank you!