Decision Tree and Random Forest
Random Forest
Training Data
Age Income Student Credit_Rating Buy_NoBuy
Youth High no fair no
Youth High no excellent no
Middle_Aged High no fair yes
Senior Medium no fair yes
Senior Low yes fair yes
Senior Low yes excellent no
Middle_Aged Low yes excellent yes
Youth Medium no fair no
Youth Low yes fair yes
Senior Medium yes fair yes
Youth Medium yes excellent yes
Middle_Aged Medium no excellent yes
Middle_Aged High yes fair yes
Senior Medium no excellent no
Goal: predict whether a person will buy a computer.
Prediction using Decision Tree
We want to predict the Buy or NoBuy decision of a person, given
that we know his/her:
• Age
• Income
• Student or Not
• Credit_Rating
Prediction with Decision Tree
• Decision trees are powerful and popular tools for classification
and prediction.
• We first make a list of attributes that we can measure. In our
case those are Age, Income, Student or Not, & Credit_Rating.
• We then choose a target attribute that we want to predict, in
our case it is the “Buy_NoBuy” decision.
Algorithms
Commonly Used Algorithms
• ID3 (Iterative Dichotomiser 3): developed in the early 1980s; suited to
discrete attributes
• C4.5 (an improvement on ID3): handles both continuous and
discrete attributes
• CART (Classification and Regression Trees): developed in 1984;
suited to both continuous and discrete attributes
ID3 Algorithm
• Information gain is used to select the most useful attribute for
classification/splitting
• To calculate information gain, we first need the entropy of the target
\[
\mathrm{Entropy}(\mathrm{Buy\_NoBuy}) = -\sum_{i=1}^{c} p_i \log_2 p_i
= -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.94
\]
Buy_NoBuy
no
no
yes
yes
yes
no
yes
no
yes
yes
yes
yes
yes
no
Yes 9
No 5
Total 14
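The entropy calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `entropy` and the list `buy_nobuy` are names chosen here, not part of the slides.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Only labels that actually occur are counted, so log2(0) never arises.
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Buy_NoBuy column of the training data: 9 "yes" and 5 "no".
buy_nobuy = ["no", "no", "yes", "yes", "yes", "no", "yes",
             "no", "yes", "yes", "yes", "yes", "yes", "no"]

print(round(entropy(buy_nobuy), 3))  # about 0.94
```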
ID3 Algorithm
\[
\mathrm{Entropy}(\mathrm{Buy\_NoBuy}, \mathrm{Age}) = \sum_{c \in \mathrm{Age}} P(c)\,\mathrm{Entropy}(c)
= P(\mathrm{Youth})\,\mathrm{Entropy}(2,3) + P(\mathrm{Middle})\,\mathrm{Entropy}(4,0) + P(\mathrm{Senior})\,\mathrm{Entropy}(3,2)
\]
Age Buy NoBuy Total
Youth 2 3 5
Middle_Aged 4 0 4
Senior 3 2 5
Total 9 5 14
Age Buy_NoBuy
Youth no
Youth no
Middle_Aged yes
Senior yes
Senior yes
Senior no
Middle_Aged yes
Youth no
Youth yes
Senior yes
Youth yes
Middle_Aged yes
Middle_Aged yes
Senior no
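The weighted entropy of Buy_NoBuy given Age can be computed by grouping the labels by age group and averaging the group entropies by group size. The sketch below assumes the two columns from the table above; the helper names `entropy` and `conditional_entropy` are illustrative.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def conditional_entropy(values, labels):
    """Entropy of labels within each attribute value, weighted by group size."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    total = len(labels)
    return sum(len(g) / total * entropy(g) for g in groups.values())


age = ["Youth", "Youth", "Middle_Aged", "Senior", "Senior", "Senior",
       "Middle_Aged", "Youth", "Youth", "Senior", "Youth", "Middle_Aged",
       "Middle_Aged", "Senior"]
buy = ["no", "no", "yes", "yes", "yes", "no", "yes",
       "no", "yes", "yes", "yes", "yes", "yes", "no"]

print(round(conditional_entropy(age, buy), 3))  # about 0.694
```

The Middle_Aged group is pure (4 Buy, 0 NoBuy), so it contributes zero entropy to the weighted sum.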
ID3 Algorithm
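\[
\begin{aligned}
\mathrm{Entropy}(\mathrm{Buy\_NoBuy}, \mathrm{Age})
&= \frac{5}{14}\left(-\frac{2}{5}\log_2\frac{2}{5} - \frac{3}{5}\log_2\frac{3}{5}\right)
 + \frac{4}{14}\left(-\frac{4}{4}\log_2\frac{4}{4} - \frac{0}{4}\log_2\frac{0}{4}\right) \\
&\quad + \frac{5}{14}\left(-\frac{3}{5}\log_2\frac{3}{5} - \frac{2}{5}\log_2\frac{2}{5}\right) \\
&= 0.694
\end{aligned}
\]
(The term \(0 \log_2 0\) is taken to be 0, so the Middle_Aged group contributes nothing.)
\[
\mathrm{InformationGain}(\mathrm{Buy\_NoBuy}, \mathrm{Age})
= \mathrm{Entropy}(\mathrm{Buy\_NoBuy}) - \mathrm{Entropy}(\mathrm{Buy\_NoBuy}, \mathrm{Age})
= 0.94 - 0.694 = 0.246
\]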
ID3 Algorithm
Similarly:
\[
\begin{aligned}
\mathrm{InformationGain}(\mathrm{Buy\_NoBuy}, \mathrm{Age}) &= 0.246 \quad \text{(highest)} \\
\mathrm{InformationGain}(\mathrm{Buy\_NoBuy}, \mathrm{Income}) &= 0.029 \\
\mathrm{InformationGain}(\mathrm{Buy\_NoBuy}, \mathrm{Student}) &= 0.151 \\
\mathrm{InformationGain}(\mathrm{Buy\_NoBuy}, \mathrm{Credit\_Rating}) &= 0.048
\end{aligned}
\]
The attribute with the highest information gain (here Age) is selected as the
splitting attribute.
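A short Python sketch of this comparison, using the 14 training rows, is given below. The names `ROWS`, `COLUMNS` and `information_gain` are chosen for illustration. Because the code does not round intermediate entropies, a computed gain can differ from the values quoted above by about 0.001.

```python
from collections import Counter, defaultdict
from math import log2

COLUMNS = ["Age", "Income", "Student", "Credit_Rating"]
# Each row: (Age, Income, Student, Credit_Rating, Buy_NoBuy)
ROWS = [
    ("Youth", "High", "no", "fair", "no"),
    ("Youth", "High", "no", "excellent", "no"),
    ("Middle_Aged", "High", "no", "fair", "yes"),
    ("Senior", "Medium", "no", "fair", "yes"),
    ("Senior", "Low", "yes", "fair", "yes"),
    ("Senior", "Low", "yes", "excellent", "no"),
    ("Middle_Aged", "Low", "yes", "excellent", "yes"),
    ("Youth", "Medium", "no", "fair", "no"),
    ("Youth", "Low", "yes", "fair", "yes"),
    ("Senior", "Medium", "yes", "fair", "yes"),
    ("Youth", "Medium", "yes", "excellent", "yes"),
    ("Middle_Aged", "Medium", "no", "excellent", "yes"),
    ("Middle_Aged", "High", "yes", "fair", "yes"),
    ("Senior", "Medium", "no", "excellent", "no"),
]


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, col):
    """Entropy of the target minus the weighted entropy after splitting on col."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


gains = {name: information_gain(ROWS, i) for i, name in enumerate(COLUMNS)}
best = max(gains, key=gains.get)

for name, value in gains.items():
    print(f"{name}: {value:.3f}")
print("Split on:", best)
```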
Final Decision Tree Using ID3
Using this tree, we can predict that a young person who is also a
student will buy a computer
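As a rough sketch of how such a tree could be built, the code below applies the information gain rule recursively. It is a minimal ID3-style builder under stated assumptions, not a full implementation: the nested-dict tree layout, the `default` fallback for unseen attribute values, and all function names are choices made here for illustration.

```python
from collections import Counter, defaultdict
from math import log2

# Each row: (Age, Income, Student, Credit_Rating, Buy_NoBuy)
DATA = [
    ("Youth", "High", "no", "fair", "no"),
    ("Youth", "High", "no", "excellent", "no"),
    ("Middle_Aged", "High", "no", "fair", "yes"),
    ("Senior", "Medium", "no", "fair", "yes"),
    ("Senior", "Low", "yes", "fair", "yes"),
    ("Senior", "Low", "yes", "excellent", "no"),
    ("Middle_Aged", "Low", "yes", "excellent", "yes"),
    ("Youth", "Medium", "no", "fair", "no"),
    ("Youth", "Low", "yes", "fair", "yes"),
    ("Senior", "Medium", "yes", "fair", "yes"),
    ("Youth", "Medium", "yes", "excellent", "yes"),
    ("Middle_Aged", "Medium", "no", "excellent", "yes"),
    ("Middle_Aged", "High", "yes", "fair", "yes"),
    ("Senior", "Medium", "no", "excellent", "no"),
]


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def split(rows, col):
    """Group rows by their value in column col."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row)
    return groups


def gain(rows, col):
    remainder = sum(len(g) / len(rows) * entropy([r[-1] for r in g])
                    for g in split(rows, col).values())
    return entropy([r[-1] for r in rows]) - remainder


def id3(rows, cols):
    """Return a class label (leaf) or a dict describing a split node."""
    labels = [r[-1] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not cols:
        return majority
    best = max(cols, key=lambda c: gain(rows, c))
    rest = [c for c in cols if c != best]
    children = {v: id3(g, rest) for v, g in split(rows, best).items()}
    return {"attr": best, "default": majority, "children": children}


def predict(tree, row):
    # Walk down until a leaf (a plain label) is reached.
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree


tree = id3(DATA, [0, 1, 2, 3])
young_student = ("Youth", "Low", "yes", "fair")
print(predict(tree, young_student))  # yes
```

On this data the root split is Age (column 0), and the Youth branch is then split on Student, which is why a young student is predicted to buy.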
Random Forest
• First proposed by Tin Kam Ho of Bell Labs in 1995.
• Random forest is an ensemble/group classifier that consists of
a large number of decision trees.
• Each decision tree makes its own prediction; the final
prediction is decided by a majority vote over the trees.
Step 1
• Take a random sample of size N with replacement from the
data (bootstrap sample).
Selected Age Income Student Credit_Rating Buy_NoBuy
X Youth High no fair no
X Middle_Aged High no fair yes
X Senior Low yes excellent no
X Middle_Aged Low yes excellent yes
X Senior Medium yes fair yes
X Youth Medium yes excellent yes
.
.
Nth Senior Medium no excellent no
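A bootstrap sample can be sketched as follows. Row numbers 1 to 14 stand in for the training rows, and the fixed seed is an assumption made only so that the example is repeatable.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows uniformly at random, with replacement."""
    return [rng.choice(rows) for _ in rows]


row_ids = list(range(1, 15))  # stand-ins for the 14 training rows
rng = random.Random(0)        # fixed seed: an assumption for repeatability
sample = bootstrap_sample(row_ids, rng)

print(sorted(sample))
print("distinct rows drawn:", len(set(sample)))
```

Because sampling is with replacement, some rows usually appear more than once and others not at all, so each tree sees a slightly different data set.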
Step 2
• At each node, draw a random sample of m attributes without
replacement, where M is the total number of attributes and
m < M.
• Typically m = sqrt(M).
• Let’s say Age and Credit_Rating are the attributes selected.
Candidate attributes (marked X on the slide): Age, Credit_Rating. Prediction target: Buy_NoBuy.
Selected Age Income Student Credit_Rating Buy_NoBuy
X Youth High no fair no
X Middle_Aged High no fair yes
X Senior Low yes excellent no
X Middle_Aged Low yes excellent yes
X Senior Medium yes fair yes
X Youth Medium yes excellent yes
.
.
Nth Senior Medium no excellent no
Information Gain
• From Age
• From Credit_Rating
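Drawing the candidate attributes for a node can be sketched as below. Rounding sqrt(M) down with `math.isqrt` and the function name `sample_attributes` are assumptions made for this example; with M = 4 attributes this gives m = 2, as in the slide.

```python
import math
import random

ATTRIBUTES = ["Age", "Income", "Student", "Credit_Rating"]


def sample_attributes(attributes, rng, m=None):
    """Pick m distinct attributes at random; m defaults to floor(sqrt(M))."""
    if m is None:
        m = max(1, math.isqrt(len(attributes)))
    return rng.sample(attributes, m)  # sampling without replacement


rng = random.Random(1)  # fixed seed: an assumption for repeatability
candidates = sample_attributes(ATTRIBUTES, rng)
print(candidates)
```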
Step 3
• Construct a split using the m attributes selected in Step 2.
• The splitting attribute can be chosen with the information gain
method; let’s say “Age” is selected for the split.
Candidate attributes: Age, Credit_Rating. Prediction target: Buy_NoBuy.
Selected Age Credit_Rating Buy_NoBuy
X Youth fair no
X Middle_Aged fair yes
X Senior excellent no
X Middle_Aged excellent yes
X Senior fair yes
X Youth excellent yes
.
.
X Senior excellent no
[Figure: partial tree with root Age and branches Youth, Middle, Senior]
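The split in Step 3 can be sketched as follows: compute the information gain of each candidate attribute, pick the best one, and partition the rows into child nodes. For illustration the code uses all 14 training rows rather than a bootstrap sample, which is an assumption; the names `ROWS`, `CANDIDATES` and `children` are also illustrative.

```python
from collections import Counter, defaultdict
from math import log2

# (Age, Credit_Rating, Buy_NoBuy) for the 14 training rows.
ROWS = [
    ("Youth", "fair", "no"), ("Youth", "excellent", "no"),
    ("Middle_Aged", "fair", "yes"), ("Senior", "fair", "yes"),
    ("Senior", "fair", "yes"), ("Senior", "excellent", "no"),
    ("Middle_Aged", "excellent", "yes"), ("Youth", "fair", "no"),
    ("Youth", "fair", "yes"), ("Senior", "fair", "yes"),
    ("Youth", "excellent", "yes"), ("Middle_Aged", "excellent", "yes"),
    ("Middle_Aged", "fair", "yes"), ("Senior", "excellent", "no"),
]
CANDIDATES = {"Age": 0, "Credit_Rating": 1}  # attribute name -> column index


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def partition(rows, col):
    """Map each value of column col to the list of target labels in that group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row[-1])
    return groups


def gain(rows, col):
    remainder = sum(len(g) / len(rows) * entropy(g)
                    for g in partition(rows, col).values())
    return entropy([row[-1] for row in rows]) - remainder


best = max(CANDIDATES, key=lambda name: gain(ROWS, CANDIDATES[name]))
children = {value: Counter(labels)
            for value, labels in partition(ROWS, CANDIDATES[best]).items()}

print("Split on:", best)
for value, counts in children.items():
    print(value, dict(counts))
```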
Step 4
• Repeat Steps 2 and 3 for each subsequent split until the tree is
complete.
• Say, for the branch Age = Youth, Income and Credit_Rating are the
attributes selected at random.
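Attributes in use: Age (already split), Income, Credit_Rating. Prediction target: Buy_NoBuy.
Only the sampled rows with Age = Youth remain at this node:
Selected Age Income Credit_Rating Buy_NoBuy
X Youth High fair no
X Youth Medium excellent yes
.
.
[Figure: partial tree with root Age and branches Youth, Middle, Senior]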
Information Gain
• From Income
• From Credit_Rating
Step 4 (continued)
• Out of Income and Credit_Rating, say Income is selected for the
split, using the information gain method as in Step 3.
[Figure: partial tree with root Age (branches Youth, Middle, Senior); the Youth branch is split on Income (branches High, Medium)]
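The choice between Income and Credit_Rating at the Youth node can be checked with a small sketch. It assumes the five Youth rows of the full training data rather than a bootstrap sample, so the Income values seen here include Low as well as High and Medium; the variable names are illustrative.

```python
from collections import Counter, defaultdict
from math import log2

# (Income, Credit_Rating, Buy_NoBuy) for the five training rows with Age = Youth.
YOUTH_ROWS = [
    ("High", "fair", "no"),
    ("High", "excellent", "no"),
    ("Medium", "fair", "no"),
    ("Low", "fair", "yes"),
    ("Medium", "excellent", "yes"),
]
CANDIDATES = {"Income": 0, "Credit_Rating": 1}  # attribute name -> column index


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain(rows, col):
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


gains = {name: gain(YOUTH_ROWS, col) for name, col in CANDIDATES.items()}
best = max(gains, key=gains.get)

for name, value in gains.items():
    print(f"{name}: {value:.3f}")
print("Split Youth branch on:", best)
```

On these rows Income has the larger gain, which agrees with the choice made in the slide.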
Step 5
• Repeat Steps 1 to 4 to create a large number of decision trees.
Let’s say we create 4 trees.
• Make prediction using each decision tree.
• Make final prediction by a majority vote over the set of trees.
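Steps 1 to 5 can be put together in a compact sketch. This is an illustration under several assumptions, not a reference implementation: trees are nested dicts, m is floor(sqrt(M)), a used attribute is not reused further down a branch, unseen attribute values fall back to the node's majority class, and the seed is fixed for repeatability. All function names are chosen here.

```python
import random
from collections import Counter, defaultdict
from math import isqrt, log2

# Each row: (Age, Income, Student, Credit_Rating, Buy_NoBuy)
DATA = [
    ("Youth", "High", "no", "fair", "no"),
    ("Youth", "High", "no", "excellent", "no"),
    ("Middle_Aged", "High", "no", "fair", "yes"),
    ("Senior", "Medium", "no", "fair", "yes"),
    ("Senior", "Low", "yes", "fair", "yes"),
    ("Senior", "Low", "yes", "excellent", "no"),
    ("Middle_Aged", "Low", "yes", "excellent", "yes"),
    ("Youth", "Medium", "no", "fair", "no"),
    ("Youth", "Low", "yes", "fair", "yes"),
    ("Senior", "Medium", "yes", "fair", "yes"),
    ("Youth", "Medium", "yes", "excellent", "yes"),
    ("Middle_Aged", "Medium", "no", "excellent", "yes"),
    ("Middle_Aged", "High", "yes", "fair", "yes"),
    ("Senior", "Medium", "no", "excellent", "no"),
]
ALL_COLS = [0, 1, 2, 3]


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def split(rows, col):
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row)
    return groups


def gain(rows, col):
    remainder = sum(len(g) / len(rows) * entropy([r[-1] for r in g])
                    for g in split(rows, col).values())
    return entropy([r[-1] for r in rows]) - remainder


def build_tree(rows, cols, m, rng):
    labels = [r[-1] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not cols:
        return majority
    candidates = rng.sample(cols, min(m, len(cols)))      # Step 2
    best = max(candidates, key=lambda c: gain(rows, c))   # Step 3
    rest = [c for c in cols if c != best]
    children = {v: build_tree(g, rest, m, rng)            # Step 4
                for v, g in split(rows, best).items()}
    return {"attr": best, "default": majority, "children": children}


def predict_tree(tree, row):
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree


def build_forest(rows, n_trees, rng):
    m = max(1, isqrt(len(ALL_COLS)))
    forest = []
    for _ in range(n_trees):                               # Step 5
        sample = [rng.choice(rows) for _ in rows]          # Step 1
        forest.append(build_tree(sample, ALL_COLS, m, rng))
    return forest


def predict_forest(forest, row):
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]


rng = random.Random(42)  # fixed seed: an assumption for repeatability
forest = build_forest(DATA, n_trees=4, rng=rng)
print(predict_forest(forest, ("Youth", "Low", "yes", "fair")))
```

With an even number of trees a tie is possible; in this sketch a tie goes to whichever class was counted first, so an odd number of trees is a simple way to avoid it.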
Prediction
• Use the random forest to predict whether a young student with low
income and a fair credit rating will buy a computer.
Tree # Predicted (Buy_NoBuy)
1 Buy
2 Buy
3 NoBuy
4 Buy
Final prediction (majority vote): Buy
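The vote over the four tree predictions in the table can be tallied as in this small sketch; the dictionary `tree_predictions` simply restates the table.

```python
from collections import Counter

# Predictions of the four trees, as listed in the table.
tree_predictions = {1: "Buy", 2: "Buy", 3: "NoBuy", 4: "Buy"}

votes = Counter(tree_predictions.values())
final_prediction, n_votes = votes.most_common(1)[0]

print(final_prediction, n_votes)  # Buy 3
```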
Supply Chain Example
