Decision tree random forest classifier
Glucose Age Diabetes
78 26 No
85 31 No
89 21 No
100 32 No
103 33 No
107 31 Yes
110 30 No
115 29 Yes
126 27 No
115 32 Yes
116 31 Yes
118 31 Yes
183 50 Yes
189 59 Yes
197 53 Yes
A few sample observations of the diabetes result, along with glucose and age, are given above. Attempting a decision tree prediction model:

Root of the tree: Glucose
- 75 < G < 90 → Age: 20 < Age <= 31 → Y-0, N-3 → Prediction: No (majority)
- G > 90 → Glucose:
  - 100 <= G <= 110 → Age: 30 <= Age < 34 → Y-1, N-3 → Prediction: No (majority)
  - G > 110 → Glucose:
    - 110 < G < 127 → Age: 30 < Age < 34 → Y-4, N-1 → Prediction: Yes (majority)
    - G > 180 → Age >= 50 → Y-3, N-0 → Prediction: Yes (majority)

The topmost node (Glucose) is the root of the tree, the intermediate splits are branches, and the terminal nodes holding the predictions are leaves.
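A minimal sketch of fitting such a tree with scikit-learn and printing its structure. The data comes from the table above; note that scikit-learn picks its own split thresholds, so the learned tree may differ from the hand-drawn one.

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# The 15 observations from the table above; 1 = diabetes "Yes", 0 = "No".
X = np.array([[78, 26], [85, 31], [89, 21], [100, 32], [103, 33],
              [107, 31], [110, 30], [115, 29], [126, 27], [115, 32],
              [116, 31], [118, 31], [183, 50], [189, 59], [197, 53]])
y = np.array([0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1])

tree = DecisionTreeClassifier(random_state=42).fit(X, y)
# Print the learned root/branch/leaf structure as text.
print(export_text(tree, feature_names=["Glucose", "Age"]))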
• In the last slide's example, the first split divided the 15 samples into groups of 3 and 12.
• What if the decision tree creates a split for every single sample?
• Then the model can predict the training samples perfectly.
• But on other (unseen) data, the model may perform worse.
• It is better to split a branch only when it holds more than a minimum number of samples. (The default value is 2.)
• We can control the minimum sample count required for a split with the “min_samples_split” argument.
Glucose
- 75 < G < 90 → No. of samples = 3
- G > 90 → No. of samples = 12
• The sample count in the last splits (the leaves of the decision tree) is also important for the model.
• The default value for the minimum sample count in a leaf is 1.
• We can control the minimum number of samples per leaf with the “min_samples_leaf” argument, as in the sketch below.
• Entropy is a measure of the impurity, disorder, or uncertainty in a set of examples.
• In a decision tree, it measures the impurity of a split.
• For two labels, the entropy value ranges from 0 to 1, where 1 represents maximum impurity.

Entropy: H(S) = -P(Yes) * log2(P(Yes)) - P(No) * log2(P(No))

The possible splits of 6 samples (labeled “Yes” or “No”), with the entropy of each split, are shown in the table below.
In case of more than 2 labels: H(S) = -∑ Pi * log2(Pi)
Gini
Gini is another measure of impurity in a decision tree split.
Gini = 1 - (P(Yes)^2 + P(No)^2)
In case of more than 2 labels: Gini = 1 - ∑ (Pi)^2
No. of “Yes”  No. of “No”  P(Yes)  P(No)  Entropy  Gini  Notes
0             6            0       1      0        0     pure split
1             5            0.17    0.83   0.65     0.28
2             4            0.33    0.67   0.92     0.44
3             3            0.5     0.5    1        0.50  maximum impurity
4             2            0.67    0.33   0.92     0.44
5             1            0.83    0.17   0.65     0.28
6             0            1       0      0        0     pure split
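Both columns can be recomputed directly from the two formulas above; a small plain-Python sketch:

from math import log2

def entropy(counts):
    # H(S) = -sum(p_i * log2(p_i)) over the class counts; empty classes contribute 0.
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def gini(counts):
    # Gini = 1 - sum(p_i^2) over the class counts.
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

# Reproduce the table: every split of 6 samples into "Yes"/"No".
for n_yes in range(7):
    n_no = 6 - n_yes
    print(n_yes, n_no, round(entropy([n_yes, n_no]), 2), round(gini([n_yes, n_no]), 2))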
Information Gain measures the reduction in entropy (or surprise) achieved by splitting a dataset on a given value of a random variable.

Information Gain: IG(S, a) = H(S) - H(S | a)

IG(S, a) = the information gain for dataset S when split on variable a
H(S) = the entropy of the dataset before any change
H(S | a) = the conditional entropy of the dataset given the variable a
A larger information gain means the split produces lower-entropy (purer) groups of samples.
Overall entropy = -(8/15)*log2(8/15) - (7/15)*log2(7/15) = 0.996

Feature - Gender:
Entropy of ‘Male’ = -(4/8)*log2(4/8) - (4/8)*log2(4/8) = 1
Entropy of ‘Female’ = -(4/7)*log2(4/7) - (3/7)*log2(3/7) = 0.985
Weighted entropy = (8/15)*1 + (7/15)*0.985 = 0.993
Information gain for the gender feature = 0.996 - 0.993 = 0.003
Information gain for the exercise feature (computed the same way) = 0.996 - 0.884 = 0.112, so exercise is the more informative first split.
Gender Exercise Diabetes
Male Regular No
Female Irregular No
Male Regular No
Male Regular No
Female No No
Male Irregular Yes
Female No No
Female Regular Yes
Male Regular No
Female Regular Yes
Female Regular Yes
Female Irregular Yes
Male Irregular Yes
Male No Yes
Male Irregular Yes
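Both information-gain values can be checked with a short plain-Python sketch; the row list mirrors the table above.

from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

# (gender, exercise, diabetes) rows from the table above.
rows = [("Male", "Regular", "No"), ("Female", "Irregular", "No"),
        ("Male", "Regular", "No"), ("Male", "Regular", "No"),
        ("Female", "No", "No"), ("Male", "Irregular", "Yes"),
        ("Female", "No", "No"), ("Female", "Regular", "Yes"),
        ("Male", "Regular", "No"), ("Female", "Regular", "Yes"),
        ("Female", "Regular", "Yes"), ("Female", "Irregular", "Yes"),
        ("Male", "Irregular", "Yes"), ("Male", "No", "Yes"),
        ("Male", "Irregular", "Yes")]

def information_gain(feature_index):
    labels = [r[2] for r in rows]
    base = entropy(labels)                      # H(S), entropy before the split
    weighted = 0.0
    for value in set(r[feature_index] for r in rows):
        subset = [r[2] for r in rows if r[feature_index] == value]
        weighted += len(subset) / len(rows) * entropy(subset)  # H(S | a)
    return base - weighted

print(f"IG(gender)   = {information_gain(0):.3f}")   # ~0.003
print(f"IG(exercise) = {information_gain(1):.3f}")   # ~0.112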
A Random Forest model creates many decision trees and combines their outputs.
The diagram shows a training set of 17 rows with features x1, x2, x3, x4, x5 and target y. Several random samples of its rows and features (Sample-1, Sample-2, Sample-3, ..., Sample-n) are drawn, a decision tree (Decision Tree-1 through Decision Tree-n) is trained on each sample, and the combined output is decided by MAJORITY vote.
Creating multiple models and combining their outputs is called bagging (bootstrap aggregating).
The number of trees in a Random Forest can be set with the “n_estimators” argument; the default number of trees is 100.
Random Forest reduces the overfitting of individual decision trees and helps to improve accuracy.
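A minimal sketch with scikit-learn's RandomForestClassifier, reusing the glucose/age data from the decision tree example; the prediction input is illustrative.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Same glucose/age data as in the decision tree sketch above.
X = np.array([[78, 26], [85, 31], [89, 21], [100, 32], [103, 33],
              [107, 31], [110, 30], [115, 29], [126, 27], [115, 32],
              [116, 31], [118, 31], [183, 50], [189, 59], [197, 53]])
y = np.array([0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1])

# n_estimators sets the number of trees (default 100); each tree is trained on
# a bootstrap sample and the forest predicts by majority vote.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X, y)
print(forest.predict([[120, 33]]))  # predict for glucose=120, age=33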