DECISION TREE
• Rabia Rehman
DECISION TREE
• A decision tree is a flowchart-like tree structure made up of a root node, branches, and leaf nodes.
• Each internal (non-leaf) node denotes a test on an attribute, each branch denotes an outcome of that test, and each leaf (terminal) node holds a class label.
• The topmost node in the tree is the root node.
• It is a supervised machine learning algorithm.
• A leaf node represents a homogeneous result (all examples belong to one class), so it requires no further classification testing.
Common terms used with Decision Tree
Root Node: Represents the entire population or sample; it is further divided into two or more homogeneous sets.
Splitting: The process of dividing a node into two or more sub-nodes.
Decision Node: Specifies a test on a single attribute.
Leaf/Terminal Node: Indicates the value of the target attribute.
Pruning: Removing sub-nodes of a decision node; it can be seen as the opposite of splitting.
Arc/Edge: A path leading out of an attribute node, one per outcome of its test.
Path: A conjunction of tests, from the root to a leaf, that yields the final decision.
• Decision trees classify instances (examples) by starting at the root of the tree and moving down through it until a leaf node is reached.
• The following decision tree is for the concept buy_computer; it indicates whether a customer at a company is likely to buy a computer or not.
• Each internal node represents a test on an attribute.
• Each leaf node represents a class.
WHY ARE DECISION TREE CLASSIFIERS SO POPULAR?
• They can handle multidimensional data.
• They require less data cleaning than some other modeling techniques and are, to a fair degree, not influenced by outliers and missing values.
• They work with both categorical and continuous input and output variables.
• The learning and classification steps of a decision tree are simple and fast.
• They perform classification without much computation.
• Constructing a decision tree classifier does not require any domain knowledge or parameter setting.
TYPES OF DECISION TREE
• There are two main types of decision tree algorithm:
1. CART (Classification And Regression Tree)
• Uses the Gini index as its splitting measure
2. ID3 (Iterative Dichotomiser 3)
• Uses entropy, the average entropy of attributes, and information gain
ID3 ALGORITHM
• During the late 1970s and early 1980s, J. Ross Quinlan, a researcher in machine learning, developed a decision tree algorithm known as ID3 (Iterative Dichotomiser 3).
• ID3 is a classification algorithm that follows a greedy approach to building a decision tree: at each step it selects the attribute that yields the maximum information gain (equivalently, the minimum average entropy).
• There is no backtracking in this algorithm; the tree is constructed in a top-down, recursive, divide-and-conquer manner.
• It uses three functions: entropy, average entropy, and information gain.
WHAT IS ENTROPY?
• A measure of the homogeneity (or uncertainty) of a set of examples.
• Given a set T of positive and negative examples of some target concept (a 2-class problem), the entropy of T relative to this binary classification is:
E(T) = -(p/(p+n)) log2(p/(p+n)) - (n/(p+n)) log2(n/(p+n))
p = number of positive examples of the target attribute
n = number of negative examples of the target attribute
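A minimal Python sketch of this formula (not from the original slides; the function name and the convention 0 * log2(0) = 0 are assumptions made here):

```python
import math

def entropy(p, n):
    """Entropy of a set containing p positive and n negative examples."""
    total = p + n
    result = 0.0
    for count in (p, n):
        if count > 0:                      # treat 0 * log2(0) as 0
            frac = count / total
            result -= frac * math.log2(frac)
    return result

print(entropy(9, 5))   # ~0.940, the entropy of the Play Tennis dataset used below
```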
WHAT IS AVERAGE ENTROPY?
• The weighted average of the entropies of the subsets obtained by splitting on each value of an attribute (its "sub-attributes").
• AvgEntropy(A) = Σv ((pv + nv) / (p + n)) * E(Tv), summed over the values v of attribute A, where Tv is the subset of examples with A = v and pv, nv are its class counts.
WHAT IS INFORMATION GAIN?
• Information gain measures the expected reduction in entropy (uncertainty) obtained by splitting on an attribute.
IG(A) = E(T) - AvgEntropy(A)
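Continuing the sketch above (these helpers reuse the entropy function just defined; the counts passed in below come from the Outlook split worked out later):

```python
def avg_entropy(partitions):
    """Weighted average entropy; partitions is a list of (p, n) counts, one per attribute value."""
    total = sum(p + n for p, n in partitions)
    return sum(((p + n) / total) * entropy(p, n) for p, n in partitions)

def information_gain(p, n, partitions):
    """Entropy of the whole set minus the average entropy after the split."""
    return entropy(p, n) - avg_entropy(partitions)

# Outlook splits the 14 examples into sunny (2 yes, 3 no), rain (3, 2), overcast (4, 0):
print(avg_entropy([(2, 3), (3, 2), (4, 0)]))             # ~0.693
print(information_gain(9, 5, [(2, 3), (3, 2), (4, 0)]))  # ~0.247
```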
THE PROCESS
1. Calculate the entropy of the dataset.
2. For each attribute/feature:
• Calculate the entropy for each of its categorical values.
• Calculate the information gain for the feature.
3. Split on the feature with the maximum information gain.
4. Repeat on each resulting subset until the desired tree is obtained.
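Put together, the process corresponds to a short recursive procedure. The sketch below illustrates these four steps; it is not the exact code behind the slides, and it assumes rows are dictionaries with `target` naming the class column.

```python
import math
from collections import Counter

def set_entropy(rows, target):
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def avg_entropy_of(rows, attr, target):
    total = len(rows)
    result = 0.0
    for value in set(row[attr] for row in rows):
        subset = [row for row in rows if row[attr] == value]
        result += (len(subset) / total) * set_entropy(subset, target)
    return result

def id3(rows, attributes, target):
    classes = set(row[target] for row in rows)
    if len(classes) == 1:                    # pure node -> leaf (stop condition of step 4)
        return classes.pop()
    if not attributes:                       # no attributes left -> majority class
        return Counter(row[target] for row in rows).most_common(1)[0][0]
    base = set_entropy(rows, target)         # step 1
    best = max(attributes,                   # steps 2-3: pick maximum information gain
               key=lambda a: base - avg_entropy_of(rows, a, target))
    tree = {best: {}}
    for value in set(row[best] for row in rows):
        subset = [row for row in rows if row[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(subset, remaining, target)   # step 4: recurse
    return tree
```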
CONSIDER THE DATASET BELOW. THE COLUMN “PLAY TENNIS” IS THE TARGET ATTRIBUTE (T), AND EACH EXAMPLE IS A DAY DESCRIBED BY THE ATTRIBUTES OUTLOOK, TEMPERATURE, HUMIDITY, AND WIND. WE WANT TO KNOW WHICH DAYS ARE GOOD FOR PLAYING TENNIS.
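The data table itself did not survive this export. The per-value class counts used in the steps below match the standard 14-day Play Tennis dataset (Quinlan / Mitchell); the rows listed here are a reconstruction consistent with those counts, not a copy of the original slide.

```python
# Columns: Outlook, Temperature, Humidity, Wind, Play Tennis (the target attribute).
play_tennis = [
    ("Sunny",    "Hot",  "High",   "Weak",   "No"),
    ("Sunny",    "Hot",  "High",   "Strong", "No"),
    ("Overcast", "Hot",  "High",   "Weak",   "Yes"),
    ("Rain",     "Mild", "High",   "Weak",   "Yes"),
    ("Rain",     "Cool", "Normal", "Weak",   "Yes"),
    ("Rain",     "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny",    "Mild", "High",   "Weak",   "No"),
    ("Sunny",    "Cool", "Normal", "Weak",   "Yes"),
    ("Rain",     "Mild", "Normal", "Weak",   "Yes"),
    ("Sunny",    "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High",   "Strong", "Yes"),
    ("Overcast", "Hot",  "Normal", "Weak",   "Yes"),
    ("Rain",     "Mild", "High",   "Strong", "No"),
]

# Sanity check against Step 1 below: 9 "Yes" and 5 "No" examples overall.
labels = [row[-1] for row in play_tennis]
print(labels.count("Yes"), labels.count("No"))   # 9 5
```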
STEP 1: CALCULATE ENTROPY FOR THE DATASET
• Choose the column “Play Tennis” as the Target Attribute (T).
• The dataset has two classes (yes and no): 9 of the 14 examples are "yes" and 5 of the 14 are "no".
• We treat Yes as Positive (p) and No as Negative (n).
• The entropy of the complete dataset (target value) is:
E(T) = -(p/(p+n)) log2(p/(p+n)) - (n/(p+n)) log2(n/(p+n))
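Substituting p = 9 and n = 5 (this intermediate step is not shown on the slide but is used as 0.94 in the gains below):
E(T) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940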
STEP 2: CALCULATE THE ENTROPY OF EACH ATTRIBUTE FOR ALL ITS CATEGORICAL VALUES
• First Attribute – Outlook
• Categorical values: sunny, overcast, rain
Sunny: p(Yes)=2, n(No)=3; Rain: p(Yes)=3, n(No)=2; Overcast: p(Yes)=4, n(No)=0
• E(Outlook=sunny) = -(2/5)*log2(2/5) - (3/5)*log2(3/5) = 0.971
• E(Outlook=rain) = -(3/5)*log2(3/5) - (2/5)*log2(2/5) = 0.971
• E(Outlook=overcast) = -(4/4)*log2(4/4) - 0 = 0
• AvgEntropy(Outlook) = p(sunny)*E(Outlook=sunny) + p(rain)*E(Outlook=rain) + p(overcast)*E(Outlook=overcast)
= (5/14)*0.971 + (5/14)*0.971 + (4/14)*0 = 0.693
• Information Gain(Outlook) = E(T) - AvgEntropy(Outlook) = 0.94 - 0.693 = 0.247
• Second Attribute – Temperature
• Categorical values: hot, mild, cool
Hot: p(Yes)=2, n(No)=2; Mild: p(Yes)=4, n(No)=2; Cool: p(Yes)=3, n(No)=1
• E(Temperature=hot) = -(2/4)*log2(2/4) - (2/4)*log2(2/4) = 1
• E(Temperature=cool) = -(3/4)*log2(3/4) - (1/4)*log2(1/4) = 0.811
• E(Temperature=mild) = -(4/6)*log2(4/6) - (2/6)*log2(2/6) = 0.9179
• AvgEntropy(Temperature) = p(hot)*E(Temperature=hot) + p(mild)*E(Temperature=mild) + p(cool)*E(Temperature=cool)
= (4/14)*1 + (6/14)*0.9179 + (4/14)*0.811 = 0.9108
• Information Gain(Temperature) = E(T) - AvgEntropy(Temperature) = 0.94 - 0.9108 = 0.0292
• Third Attribute – Humidity
• Categorical values: high, normal
High: p(Yes)=3, n(No)=4; Normal: p(Yes)=6, n(No)=1
• E(Humidity=high) = -(3/7)*log2(3/7) - (4/7)*log2(4/7) = 0.983
• E(Humidity=normal) = -(6/7)*log2(6/7) - (1/7)*log2(1/7) = 0.591
• AvgEntropy(Humidity) = p(high)*E(Humidity=high) + p(normal)*E(Humidity=normal)
= (7/14)*0.983 + (7/14)*0.591 = 0.787
• Information Gain(Humidity) = E(T) - AvgEntropy(Humidity) = 0.94 - 0.787 = 0.153
• Fourth Attribute – Wind
• Categorical values: weak, strong
Weak: p(Yes)=6, n(No)=2; Strong: p(Yes)=3, n(No)=3
• E(Wind=weak) = -(6/8)*log2(6/8) - (2/8)*log2(2/8) = 0.811
• E(Wind=strong) = -(3/6)*log2(3/6) - (3/6)*log2(3/6) = 1
• AvgEntropy(Wind) = p(weak)*E(Wind=weak) + p(strong)*E(Wind=strong)
= (8/14)*0.811 + (6/14)*1 = 0.892
• Information Gain(Wind) = E(T) - AvgEntropy(Wind) = 0.94 - 0.892 = 0.048
Attribute      Gain
Outlook        0.247
Temperature    0.029
Humidity       0.152
Wind           0.048
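As a check, the whole table can be reproduced with the information_gain helper sketched earlier; the (p, n) pairs below are the per-value class counts from the slides above.

```python
splits = {
    "Outlook":     [(2, 3), (3, 2), (4, 0)],   # sunny, rain, overcast
    "Temperature": [(2, 2), (4, 2), (3, 1)],   # hot, mild, cool
    "Humidity":    [(3, 4), (6, 1)],           # high, normal
    "Wind":        [(6, 2), (3, 3)],           # weak, strong
}
for attr, parts in splits.items():
    print(attr, round(information_gain(9, 5, parts), 3))
# Outlook 0.247, Temperature 0.029, Humidity 0.152, Wind 0.048
```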
STEP 3: FIND THE FEATURE WITH MAXIMUM INFORMATION GAIN
• Here, the attribute with the maximum information gain is Outlook (0.247), so Outlook becomes the root of the decision tree built so far.
• When Outlook == overcast, the subset is a pure class (Yes). We now repeat the same procedure for the rows with Outlook = Sunny, and then for the rows with Outlook = Rain.
• Now, find the best attribute for splitting the data with Outlook = Sunny (2 yes, 3 no).
• E(Sunny) = -(2/5) log2(2/5) - (3/5) log2(3/5) = 0.971
REPEAT STEP 2
• First Attribute – Temperature
• Categorical values: hot, mild, cool
(Sunny, Hot): p(Yes)=0, n(No)=2; (Sunny, Mild): p(Yes)=1, n(No)=1; (Sunny, Cool): p(Yes)=1, n(No)=0
• E(Sunny, Temperature=hot) = -0 - (2/2)*log2(2/2) = 0
• E(Sunny, Temperature=cool) = -(1)*log2(1) - 0 = 0
• E(Sunny, Temperature=mild) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
• AvgEntropy(Sunny, Temperature) = p(Sunny, hot)*E(Sunny, Temperature=hot) + p(Sunny, mild)*E(Sunny, Temperature=mild) + p(Sunny, cool)*E(Sunny, Temperature=cool)
= (2/5)*0 + (2/5)*1 + (1/5)*0 = 0.4
• Information Gain(Temperature) = E(Sunny) - AvgEntropy(Sunny, Temperature) = 0.971 - 0.4 = 0.571
• Second Attribute – Humidity
• Categorical values: high, normal
(Sunny, High): p(Yes)=0, n(No)=3; (Sunny, Normal): p(Yes)=2, n(No)=0
• E(Sunny, Humidity=high) = -0 - (3/3)*log2(3/3) = 0
• E(Sunny, Humidity=normal) = -(2/2)*log2(2/2) - 0 = 0
• AvgEntropy(Sunny, Humidity) = p(Sunny, high)*E(Sunny, Humidity=high) + p(Sunny, normal)*E(Sunny, Humidity=normal)
= (3/5)*0 + (2/5)*0 = 0
• Information Gain(Humidity) = E(Sunny) - AvgEntropy(Sunny, Humidity) = 0.971 - 0 = 0.971
• Third Attribute – Wind
• Categorical values: weak, strong
(Sunny, Weak): p(Yes)=1, n(No)=2; (Sunny, Strong): p(Yes)=1, n(No)=1
• E(Sunny, Wind=weak) = -(1/3)*log2(1/3) - (2/3)*log2(2/3) = 0.918
• E(Sunny, Wind=strong) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
• AvgEntropy(Sunny, Wind) = p(Sunny, weak)*E(Sunny, Wind=weak) + p(Sunny, strong)*E(Sunny, Wind=strong)
= (3/5)*0.918 + (2/5)*1 = 0.9508
• Information Gain(Wind) = E(Sunny) - AvgEntropy(Sunny, Wind) = 0.971 - 0.9508 = 0.0202
Attribute      Gain
Temperature    0.571
Humidity       0.971
Wind           0.020
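The same helpers reproduce these subset gains; the (p, n) pairs are the Sunny-subset counts (2 yes, 3 no overall) from the slides above.

```python
sunny_splits = {
    "Temperature": [(0, 2), (1, 1), (1, 0)],   # hot, mild, cool
    "Humidity":    [(0, 3), (2, 0)],           # high, normal
    "Wind":        [(1, 2), (1, 1)],           # weak, strong
}
for attr, parts in sunny_splits.items():
    print(attr, round(information_gain(2, 3, parts), 3))
# Temperature 0.571, Humidity 0.971, Wind 0.02
```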
REPEAT STEP 3
• Here, the attribute with the maximum information gain is Humidity (0.971). So Humidity is chosen as the test under the Sunny branch, giving the decision tree built so far.
• Now, find the best attribute for splitting the data with Outlook = Rain (3 yes, 2 no).
• E(Rain) = -(3/5) log2(3/5) - (2/5) log2(2/5) = 0.971
REPEAT STEP 2
• First Attribute – Temperature
• Categorical values: mild, cool, hot
(Rain, Mild): p(Yes)=2, n(No)=1; (Rain, Cool): p(Yes)=1, n(No)=1; (Rain, Hot): no examples
• E(Rain, Temperature=cool) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
• E(Rain, Temperature=mild) = -(2/3)*log2(2/3) - (1/3)*log2(1/3) = 0.918
• E(Rain, Temperature=hot) = 0
• AvgEntropy(Rain, Temperature) = p(Rain, mild)*E(Rain, Temperature=mild) + p(Rain, cool)*E(Rain, Temperature=cool)
= (3/5)*0.918 + (2/5)*1 = 0.9508
• Information Gain(Temperature) = E(Rain) - AvgEntropy(Rain, Temperature) = 0.971 - 0.9508 = 0.0202
• Second Attribute – Wind
• Categorical values: weak, strong
(Rain, Weak): p(Yes)=3, n(No)=0; (Rain, Strong): p(Yes)=0, n(No)=2
• E(Rain, Wind=weak) = -(3/3)*log2(3/3) - 0 = 0
• E(Rain, Wind=strong) = -0 - (2/2)*log2(2/2) = 0
• AvgEntropy(Rain, Wind) = p(Rain, weak)*E(Rain, Wind=weak) + p(Rain, strong)*E(Rain, Wind=strong)
= (3/5)*0 + (2/5)*0 = 0
• Information Gain(Wind) = E(Rain) - AvgEntropy(Rain, Wind) = 0.971 - 0 = 0.971
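And the corresponding check for the Rain subset (3 yes, 2 no), again reusing the information_gain helper sketched earlier:

```python
rain_splits = {
    "Temperature": [(2, 1), (1, 1)],   # mild, cool (no Rain examples are hot)
    "Wind":        [(3, 0), (0, 2)],   # weak, strong
}
for attr, parts in rain_splits.items():
    print(attr, round(information_gain(3, 2, parts), 3))
# Temperature 0.02, Wind 0.971
```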
REPEAT STEP 3
• Here, the attribute with the maximum information gain is Wind (0.971). So Wind is chosen as the test under the Rain branch, completing the decision tree.
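The final tree image is not included in this export. Based on the splits derived above, it has the structure below; the nested-dict representation and the classify helper are illustrative, not taken from the slides.

```python
# Learned tree: attribute -> value -> subtree or class label.
decision_tree = {
    "Outlook": {
        "Overcast": "Yes",
        "Sunny": {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Rain":  {"Wind":     {"Weak": "Yes", "Strong": "No"}},
    }
}

def classify(tree, example):
    """Walk from the root, following the branch chosen by each tested attribute."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[example[attribute]]
    return tree

print(classify(decision_tree, {"Outlook": "Sunny", "Humidity": "High"}))   # No
```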