Machine Learning - V
Random Forest
Random Forest
To understand Random Forest, let's first understand the ensemble model.
An ensemble model combines the outputs of multiple models to make
predictions more accurate.
Ensemble models are in high demand because multiple models can be
implemented with little time and effort to reach high prediction accuracy.
A decision tree is a branching method built from one or more if-then-else
statements on the predictors.
* It is very useful for data exploration because it breaks the dataset down
into smaller and smaller subsets of associated records.
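To make the "if-then-else statements on the predictors" idea concrete, here is a minimal sketch of a hand-written tree. The predictors (`outlook`, `humidity`), the threshold of 70, and the function name are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def predict_play(outlook: str, humidity: float) -> str:
    """Toy decision tree expressed as nested if-then-else rules.

    Each `if` is a decision node on one predictor; each `return` is a leaf.
    """
    if outlook == "sunny":
        # Second decision node: split the "sunny" subset on humidity.
        if humidity > 70:
            return "no"
        return "yes"
    elif outlook == "rainy":
        return "no"
    # Any other outlook (for example "overcast") falls through to this leaf.
    return "yes"


if __name__ == "__main__":
    print(predict_play("sunny", 80))
    print(predict_play("overcast", 90))
```

Every path from the first `if` to a `return` corresponds to one of the smaller subsets the tree carves out of the dataset.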
Rupak Roy
Single Decision Tree
In decision trees, the branching of the tree is decided by a measure
called Information Gain.
Information Gain = Entropy of the parent node - weighted average Entropy
of the split (children)
Entropy is a measure of how disorganized a system is.
For a two-class problem, entropy ranges from 0 to 1. A pure node has an
entropy of 0, while a maximally impure node (classes evenly mixed) has an
entropy of 1.
* The core algorithm for building decision trees is known as ID3, by
J. R. Quinlan.
• It uses a top-down approach, and the same approach can be used to build
both classification and regression decision trees.
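The Information Gain formula above can be sketched in a few lines. This is an illustrative implementation under the usual assumptions (base-2 logarithm, children weighted by their share of the parent's rows); the function names and the toy labels are not from the slides.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


if __name__ == "__main__":
    parent = ["yes"] * 5 + ["no"] * 5
    # A split that separates the classes fairly well.
    children = [["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4]
    print(entropy(parent))
    print(information_gain(parent, children))
```

A split that produces pure children recovers the full parent entropy as gain, while a split that leaves the class mix unchanged has a gain of 0.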
Decision Making in Regression Decision Trees
As we know, the main aim in a regression tree is to reduce the standard
deviation, and in a classification tree the main aim is to reduce entropy.
Random Forest is suitable for numerical values, and since a random forest
is a collection of decision trees, let's first understand how numerical
values work in a decision tree.
For numerical targets, the decision tree (i.e. the regression tree) uses
standard deviation to do the splitting. The attribute with the
largest standard deviation reduction is chosen for the next decision
node (a node that can be split further). A branch with a standard
deviation greater than 0 usually needs further splitting.
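The standard deviation reduction described above can be sketched as follows. This assumes the population standard deviation and a size-weighted average over the branches, which is a common convention but is not stated in the slides; the function name and sample values are illustrative.

```python
from statistics import pstdev


def sd_reduction(target, groups):
    """Standard deviation of the target before the split, minus the
    size-weighted standard deviation of the groups after the split."""
    n = len(target)
    weighted = sum(len(g) / n * pstdev(g) for g in groups)
    return pstdev(target) - weighted


if __name__ == "__main__":
    target = [10, 10, 20, 20]
    # An attribute that separates low and high values removes all spread.
    print(sd_reduction(target, [[10, 10], [20, 20]]))
    # An attribute that mixes them removes none.
    print(sd_reduction(target, [[10, 20], [10, 20]]))
```

The attribute whose split gives the largest value from `sd_reduction` would be chosen for the next decision node.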
Decision Making in Regression Decision Trees
• A size-based stopping/pruning criterion is used to stop the tree from
growing further, because unrestricted growth leads to overfitting.
• The process of splitting the decision nodes runs recursively until it
reaches the terminal/leaf nodes (nodes that cannot be split further).
When there is more than one instance at a leaf node, we
calculate their average as the final value for the target.
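The recursive splitting, the size-based stop, and the leaf average can be put together in a small sketch. It is deliberately simplified: it assumes a single numeric predictor, candidate thresholds at midpoints between distinct values, and a hypothetical `min_size` parameter as the stopping rule. None of these details are specified in the slides.

```python
from statistics import mean, pstdev


def build_tree(points, min_size=2):
    """Grow a regression tree on (x, y) pairs; returns a dict node or a leaf value."""
    ys = [y for _, y in points]
    # Stopping criteria: node is small enough, or its targets are already identical.
    if len(points) <= min_size or pstdev(ys) == 0:
        return mean(ys)  # leaf value = average of the instances in the leaf
    xs = sorted({x for x, _ in points})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2  # candidate threshold between two distinct x values
        left = [p for p in points if p[0] <= t]
        right = [p for p in points if p[0] > t]
        # Size-weighted standard deviation after the split (lower is better).
        weighted = (len(left) * pstdev([y for _, y in left])
                    + len(right) * pstdev([y for _, y in right])) / len(points)
        if best is None or weighted < best[0]:
            best = (weighted, t, left, right)
    if best is None:  # all x values identical, nothing to split on
        return mean(ys)
    _, t, left, right = best
    return {"threshold": t,
            "left": build_tree(left, min_size),
            "right": build_tree(right, min_size)}


def predict(tree, x):
    """Walk from the root down to a leaf and return its value."""
    while isinstance(tree, dict):
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree


if __name__ == "__main__":
    data = [(1, 10), (2, 12), (3, 11), (10, 50), (11, 52), (12, 51)]
    tree = build_tree(data, min_size=3)
    print(tree)
    print(predict(tree, 2), predict(tree, 11))
```

Raising `min_size` stops growth earlier and gives a coarser tree; lowering it lets the tree follow the training data more closely, which is where overfitting comes from.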
Decision Tree Algorithms
ID3 (Iterative Dichotomiser 3) is the first of the three decision tree
algorithms listed here, developed by J. R. Quinlan in 1986.
C4.5 is the next version, also developed by J. R. Quinlan. It handles both
continuous and discrete features and improves on the overfitting
problem by using a bottom-up approach known as pruning.
CART, or Classification & Regression Trees
• The CART implementation is similar to C4.5; it prunes the tree by
imposing a complexity penalty based on the number of leaves in the
tree.
• CART uses the Gini method to create binary splits. It is the most
commonly used decision tree algorithm.
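As a rough illustration of the Gini method that CART uses for binary splits, the sketch below computes the Gini impurity of a node and the size-weighted impurity of a two-way split. The function names and sample labels are illustrative only, and the complexity-penalty pruning step is not shown.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_of_split(left, right):
    """Size-weighted Gini impurity of a binary split (lower is better)."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


if __name__ == "__main__":
    print(gini(["a", "a", "b", "b"]))
    print(gini_of_split(["a", "a"], ["b", "b"]))
    print(gini_of_split(["a", "b"], ["a", "b"]))
```

Among the candidate binary splits at a node, the one with the lowest weighted impurity would be selected.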
Advantages of Single DT
• It is a non-parametric method, i.e. it makes no assumptions about the
type or size of the underlying population, so it can be used even when
the sample size is low. It is also fast and easy to understand and implement.
• It can handle outliers and missing values, so it requires less data
preparation than other machine learning methods, and it can be used
for both continuous and categorical data types.
Here, let's focus on the disadvantages of a decision tree in order to
find a solution.
Disadvantages of Single DT
• As we know, decision trees are prone to overfitting,
which therefore needs to be controlled by pruning techniques.
• For continuous numerical variables, the tree splits on ranges of values
rather than actual values. Hence it is sometimes not very effective
for estimating continuous values.
• The robustness to outliers and skewness comes at the cost of throwing
away some of the information in the dataset.
• When an input variable has too many possible values, they need
to be aggregated into groups; otherwise there will be too many splits,
which may result in poor predictive performance.
These disadvantages of a decision tree have given rise to the ensemble
methods.
Ensemble Methods
* A collection of several models, in this case a collection of decision
trees, is used to increase predictive power, and the final score is
obtained by aggregating their outputs.
• This is known as an Ensemble Method in Machine Learning.
Random forest, bagging, and boosting are the most popular ensemble
methods, and each can be used with both continuous numerical and
categorical variables.
However, the basic functionality remains the same, i.e. the original
concept of creating a tree by using entropy & information gain.
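The aggregation step can be sketched independently of how the individual trees are built. The helpers below are a minimal illustration, assuming averaging for numerical predictions, majority vote for class labels, and bootstrap sampling (drawing rows with replacement) to give each tree its own training set, as in bagging. All names are hypothetical.

```python
import random
from collections import Counter
from statistics import mean


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement; each tree would train on one such sample."""
    return [rng.choice(rows) for _ in rows]


def aggregate_regression(predictions):
    """Final score for a numerical target: the average of the trees' outputs."""
    return mean(predictions)


def aggregate_classification(predictions):
    """Final score for a categorical target: the most common predicted label."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    print(bootstrap_sample([1, 2, 3, 4, 5], rng))
    print(aggregate_regression([10.0, 12.0, 14.0]))
    print(aggregate_classification(["yes", "no", "yes"]))
```

Because each tree sees a slightly different sample, their individual errors tend to differ, and aggregating them smooths those errors out.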
Random Forest in Brief
• The goal of random forest is to improve prediction accuracy by
using a collection of unpruned decision trees combined with a
rule-based criterion.
So let's understand the goals of random forest in detail.

Introduction to Random Forest
