Decision Tree Classification
Algorithm
Introduction
• Decision Tree is a Supervised learning technique that can be used
for both classification and regression problems, but it is mostly
preferred for solving classification problems.
• It is a tree-structured classifier, where internal nodes represent the
features of a dataset, branches represent the decision
rules and each leaf node represents the outcome.
• In a decision tree, there are two types of nodes: the Decision
Node and the Leaf Node. Decision nodes are used to make decisions
and have multiple branches, whereas leaf nodes are the outputs of those
decisions and do not contain any further branches.
• The decisions or tests are performed on the basis of the features of the
given dataset.
• It is a graphical representation for getting all the possible
solutions to a problem/decision based on given conditions.
• It is called a decision tree because, similar to a tree, it starts with the root node,
which expands into further branches and constructs a tree-like structure.
• To build the tree, we use the CART algorithm, which stands
for Classification and Regression Tree algorithm.
• A decision tree simply asks a question and, based on the answer (Yes/No), further
splits the tree into subtrees.
General structure of
a decision tree
Why use Decision Trees?
• Decision Trees usually mimic the human thinking
process while making a decision, so they are easy to
understand.
• The logic behind the decision tree can be easily
understood because it shows a tree-like
structure.
Decision Tree Terminologies
• Root Node: Root node is from where the decision tree starts. It
represents the entire dataset, which further gets divided into two
or more homogeneous sets.
• Leaf Node: Leaf nodes are the final output nodes, and the tree
cannot be segregated further after reaching a leaf node.
• Splitting: Splitting is the process of dividing the decision
node/root node into sub-nodes according to the given conditions.
• Branch/Sub Tree: A subtree formed by splitting the tree.
• Pruning: Pruning is the process of removing the unwanted
branches from the tree.
• Parent/Child node: The root node of the tree is called the
parent node, and other nodes are called the child nodes.
How does the Decision Tree algorithm
Work?
• In a decision tree, to predict the class of a given
record, the algorithm starts from the root node of the
tree. It compares the value of the root attribute with the
corresponding attribute of the record (from the real dataset) and,
based on the comparison, follows the branch and
jumps to the next node.
• For the next node, the algorithm again compares the
attribute value with those of the sub-nodes and moves
further. It continues this process until it reaches a
leaf node of the tree. The complete process can be
better understood from the algorithm below and the short
traversal sketch that follows it.
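To make this traversal concrete, here is a minimal prediction sketch over a hand-built tree. The Node structure, the attribute names and the thresholds are hypothetical illustrations, not part of the original slides.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    feature: Optional[str] = None      # attribute tested at a decision node
    threshold: Optional[float] = None  # split value for that attribute
    left: Optional["Node"] = None      # branch taken when value <= threshold
    right: Optional["Node"] = None     # branch taken when value > threshold
    label: Optional[str] = None        # class label when this node is a leaf

def predict(node: Node, record: dict) -> str:
    """Walk from the root to a leaf by comparing the record's attribute values."""
    while node.label is None:  # keep descending until a leaf is reached
        node = node.left if record[node.feature] <= node.threshold else node.right
    return node.label

# Hypothetical tree for the job-offer example: test salary first, then distance.
tree = Node(feature="salary", threshold=50_000,
            left=Node(label="Declined offer"),
            right=Node(feature="distance_km", threshold=30,
                       left=Node(label="Accepted offer"),
                       right=Node(label="Declined offer")))

print(predict(tree, {"salary": 60_000, "distance_km": 10}))  # -> Accepted offer
```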
Algorithm
• Step-1: Begin the tree with the root node, say S, which contains
the complete dataset.
• Step-2: Find the best attribute in the dataset using an Attribute
Selection Measure (ASM).
• Step-3: Divide S into subsets that contain the possible values of
the best attribute.
• Step-4: Generate the decision tree node that contains the best
attribute.
• Step-5: Recursively make new decision trees using the subsets of
the dataset created in Step-3. Continue this process until
a stage is reached where the nodes cannot be classified
further; such final nodes are called leaf nodes (see the
scikit-learn sketch below).
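Since the slides mention CART, one way to see these steps in practice is scikit-learn's DecisionTreeClassifier, which implements a CART-style tree. This is a minimal sketch; the toy feature values and labels below are made up for illustration.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy job-offer data: [salary (lakhs), distance to office (km), cab facility (0/1)]
X = [[12, 5, 1], [12, 40, 0], [6, 5, 1], [18, 25, 1], [7, 35, 0], [15, 10, 0]]
y = ["Accept", "Decline", "Decline", "Accept", "Decline", "Accept"]

# criterion="entropy" selects attributes by information gain; the default "gini"
# uses the Gini index instead (both measures are discussed later in these slides).
clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

print(export_text(clf, feature_names=["salary", "distance", "cab"]))  # learned rules
print(clf.predict([[14, 8, 1]]))                                      # class for a new candidate
```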
Example
• Suppose there is a candidate who has a job offer and wants
to decide whether he should accept the offer or not.
• So, to solve this problem, the decision tree starts with the
root node (Salary attribute by ASM).
• The root node splits further into the next decision node
(distance from the office) and one leaf node based on the
corresponding labels.
• The next decision node further gets split into one decision
node (Cab facility) and one leaf node.
• Finally, the decision node splits into two leaf nodes
(Accepted offer and Declined offer).
Example of a decision tree
We see that if the weather is cloudy then we must go to play. Why didn’t
it split more? Why did it stop there?
To answer this question, we need to know about a few more concepts like
entropy, information gain, and the Gini index.
Attribute Selection Measures
• While implementing a decision tree, the main issue is
how to select the best attribute for the root node and
for the sub-nodes.
• To solve this problem, we use a technique called an
Attribute Selection Measure, or ASM.
• With this measure, we can easily select the best attribute
for the nodes of the tree.
• There are two popular techniques for ASM, which are:
1. Information Gain
2. Gini Index
Information Gain
• Information gain is the measurement of changes in entropy after the segmentation of a dataset
based on an attribute.
• It calculates how much information a feature provides us about a class.
• According to the value of information gain, we split the node and build the decision tree.
• A decision tree algorithm always tries to maximize the value of information gain, and a
node/attribute having the highest information gain is split first.
• It can be calculated using the formula below:
Information Gain = Entropy(S) - [(Weighted Avg) * Entropy(each child node)]
• Entropy: Entropy is a metric that measures the impurity of a given node. It quantifies the
randomness in the data. Entropy can be calculated as:
Entropy(S) = -P(yes) log2 P(yes) - P(no) log2 P(no)
Where,
S = the set of samples at the node
P(yes) = probability of "yes"
P(no) = probability of "no"
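A small Python sketch of these two formulas; the function names are my own, not from the slides.

```python
import math

def entropy(yes: int, no: int) -> float:
    """Entropy of a node that holds `yes` positive and `no` negative samples."""
    total = yes + no
    result = 0.0
    for count in (yes, no):
        if count:                      # treat 0 * log2(0) as 0
            p = count / total
            result -= p * math.log2(p)
    return result

def information_gain(parent: tuple, children: list) -> float:
    """Entropy(S) minus the weighted average entropy of the child nodes."""
    total = sum(parent)
    weighted = sum(sum(c) / total * entropy(*c) for c in children)
    return entropy(*parent) - weighted

print(entropy(6, 6))                               # maximally impure node -> 1.0
print(information_gain((6, 6), [(6, 0), (0, 6)]))  # a perfect split      -> 1.0
```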
How to calculate Entropy?
• Suppose a feature has 8 "yes" and 4 "no" initially;
after the first split, the left node gets 5 'yes' and 2
'no', whereas the right node gets 3 'yes' and 2 'no'.
• We see here that the split is not pure. Why? Because we
can still see some negative classes in both
nodes. To build a decision tree, we need to
calculate the impurity of each split, and when the
purity is 100%, we make the node a leaf node.
• To check the impurity of feature 2 and feature 3 we
will take the help of the Entropy formula.
We can clearly see from the tree itself that the left node has lower entropy, i.e. more purity,
than the right node, since the left node has a greater number of "yes" and it is easier to decide
here.
Always remember that the higher the entropy, the lower the purity
and the higher the impurity.
For feature 2 (left node, 5 "yes" / 2 "no"): Entropy = -(5/7) log2(5/7) - (2/7) log2(2/7) ≈ 0.863
For feature 3 (right node, 3 "yes" / 2 "no"): Entropy = -(3/5) log2(3/5) - (2/5) log2(2/5) ≈ 0.971
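These two values can be checked quickly in Python (a minimal, self-contained sketch):

```python
import math

def entropy(yes, no):
    probs = [c / (yes + no) for c in (yes, no) if c]
    return -sum(p * math.log2(p) for p in probs)

print(round(entropy(5, 2), 3))  # left node  (feature 2, 5 yes / 2 no) -> 0.863
print(round(entropy(3, 2), 3))  # right node (feature 3, 3 yes / 2 no) -> 0.971
```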
Example to calculate Information Gain
Suppose our entire population has a total of 30 instances. The
dataset is to predict whether the person will go to the gym or
not.
Let's say 16 people go to the gym and 14 people don't.
Now we have two features to predict whether a person will go to
the gym or not.
Feature 1 is “Energy” which takes two values “high” and
“low”
Feature 2 is “Motivation” which takes 3 values “No
motivation”, “Neutral” and “Highly motivated”.
Let’s calculate Entropy
To get the weighted average entropy of the child nodes, we do the following:
Now that we have the values of E(Parent) and E(Parent|Energy), the information
gain will be:
Similarly, we will do this with the other feature “Motivation” and calculate
its information gain.
Let’s calculate the entropy here:
To get the weighted average entropy of each child node, we do the following:
Now that we have the values of E(Parent) and E(Parent|Motivation), the information gain
will be:
We now see that the "Energy" feature gives a larger reduction in entropy (0.37) than the
"Motivation" feature. Hence we select the feature with the highest
information gain and split the node based on that feature.
In this example, "Energy" will be our root node, and we do the same for the sub-
nodes. Here we can see that when the energy is "high" the entropy is low, and hence
we can say a person will almost certainly go to the gym if he has high energy. But what if
the energy is low? We will again split the node based on the other feature, which is
"Motivation".
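As a quick check on this example: only the 16/14 parent counts and the 0.37 gain are stated in the text; the per-branch counts live in the slide figures, so the sketch below works only with what is given here.

```python
import math

def entropy(yes, no):
    probs = [c / (yes + no) for c in (yes, no) if c]
    return -sum(p * math.log2(p) for p in probs)

e_parent = entropy(16, 30 - 16)   # 16 people go to the gym, 14 don't
print(round(e_parent, 3))         # E(Parent) ~ 0.997

# Information Gain = E(Parent) - E(Parent|Energy) ~ 0.37 (from the slides),
# so the weighted child entropy E(Parent|Energy) must be roughly 0.63.
print(round(e_parent - 0.37, 3))
```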
Gini Index
• Gini index is a measure of impurity or purity used
while creating a decision tree in the
CART(Classification and Regression Tree) algorithm.
• An attribute with a low Gini index should be
preferred over one with a high Gini index.
• The CART algorithm only creates binary splits, and it
uses the Gini index to create them.
• The Gini index can be calculated using the formula below:
Gini Index = 1 - ∑j (Pj)^2, where Pj is the probability of class j.
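A minimal sketch of this formula (the function name and the sample counts are illustrative):

```python
def gini_index(class_counts):
    """Gini impurity of a node: 1 minus the sum of squared class probabilities."""
    total = sum(class_counts)
    return 1 - sum((c / total) ** 2 for c in class_counts)

print(round(gini_index([8, 4]), 3))   # mixed node -> 0.444
print(gini_index([12, 0]))            # pure node  -> 0.0
```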
When to stop splitting?
Usually, real-world datasets have a large number of features, which results in a
large number of splits, which in turn gives a huge tree. Such trees take time to
build and can lead to overfitting: the tree will give very good accuracy
on the training dataset but poor accuracy on test data. There are many
ways to tackle this problem through hyperparameter tuning.
1. max_depth – sets the maximum depth of the decision tree. The
larger the value of max_depth, the more complex the tree will be. Hence
you need a value that neither overfits nor underfits the data.
2. min_samples_split – sets the minimum number of samples a node must have before it can be split.
3. min_samples_leaf – the minimum number of samples required to
be in a leaf node. Increasing this number makes the tree more conservative and reduces the risk
of overfitting (although too large a value may underfit).
4. max_features – the number of features to consider when
looking for the best split (see the scikit-learn sketch after this list).
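These four names match scikit-learn's DecisionTreeClassifier parameters. Below is a hedged sketch of constraining a tree with them; the dataset and the specific values are arbitrary examples, not tuned recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Constrain the tree so it cannot grow arbitrarily deep and overfit the training data.
clf = DecisionTreeClassifier(
    max_depth=3,            # cap on the depth of the tree
    min_samples_split=10,   # a node needs at least 10 samples before it can be split
    min_samples_leaf=5,     # every leaf must keep at least 5 samples
    max_features="sqrt",    # consider only sqrt(n_features) candidate features per split
    random_state=0,
).fit(X_train, y_train)

print(clf.score(X_train, y_train), clf.score(X_test, y_test))  # train vs. test accuracy
```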
Pruning
• Pruning is a process of deleting the unnecessary nodes from a tree in order to get the
optimal decision tree.
• A too-large tree increases the risk of overfitting, and a small tree may not capture all the
important features of the dataset.
• Pruning is therefore a technique that decreases the size of the learning tree without reducing
accuracy.
• There are mainly two tree-pruning techniques in use:
Cost Complexity Pruning
Reduced Error Pruning
• There are mainly two ways of pruning:
(i) Pre-pruning – we stop growing the tree earlier, which means we
prune/remove/cut a node if it has low importance while growing the tree.
(ii) Post-pruning – once the tree is built to its full depth, we start pruning the
nodes based on their significance (see the cost-complexity sketch below).
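Post-pruning via cost complexity pruning is exposed in scikit-learn through the ccp_alpha parameter. The sketch below is illustrative: the dataset choice is arbitrary, and in practice the alpha value should be chosen with a separate validation set or cross-validation rather than the test set used here.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Ask the unpruned tree for the sequence of effective alphas it supports ...
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)

# ... then refit with each alpha and keep the pruned tree that generalizes best.
best = max(
    (DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X_train, y_train)
     for alpha in path.ccp_alphas),
    key=lambda tree: tree.score(X_test, y_test),
)
print(best.get_n_leaves(), round(best.score(X_test, y_test), 3))
```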
Advantages of the Decision Tree
• It is simple to understand, as it follows the same
process that a human follows while making a
decision in real life.
• It can be very useful for solving decision-related
problems.
• It helps to think about all the possible outcomes for
a problem.
• It requires less data cleaning than
many other algorithms.
Disadvantages of the Decision Tree
• A decision tree can contain many layers, which
makes it complex.
• It may have an overfitting issue, which can be
resolved using the Random Forest algorithm.
• With more class labels, the computational
complexity of the decision tree may increase.