Decision Tree
Presented by:
Md. Al-Amin, ID: 172015031
Belayet Hossain, ID: 172015032
Presented to:
Ms. Shamima Akter, Assistant Professor
Green University of Bangladesh
Department: CSE
Course Name: Artificial Intelligence Lab
Course Code: CSE 410
Outline
 Introduction
 What is a decision tree?
 Decision tree terms
 Types of decision trees
 Decision tree algorithms
 Example using ID3
 Advantages of decision trees
 Limitations
 Conclusion
 References
Introduction
The main ideas behind decision trees were invented more than 70 years ago, and
nowadays they are among the most powerful machine learning tools.
A machine learning researcher named J. Ross Quinlan developed a decision tree algorithm
known as ID3 (Iterative Dichotomiser 3) in the 1980s. Later, he presented C4.5, which was the
successor of ID3. ID3 and C4.5 adopt a greedy approach: there is no
backtracking, and the trees are constructed in a top-down, recursive, divide-and-conquer
manner.
[Diagram: Examples → (generalize) → Model → (instantiate for another case) → Prediction]
What is a Decision Tree?
A decision tree is a decision support tool, and one of the most powerful and popular tools
commonly used in operations research, classification, prediction and machine learning. A
decision tree is a flowchart-like tree structure, where each internal node denotes a test on
an attribute, each branch represents an outcome of the test, and each leaf node (terminal
node) holds a class label.
[Example figure]
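To make that structure concrete, here is a minimal sketch (my own addition, assuming scikit-learn is available) that fits a tiny tree and prints its internal nodes, branches and leaves; the toy data and its integer encoding are invented purely for illustration.

```python
# Minimal sketch: train a small decision tree and print its structure.
# Assumes scikit-learn is installed; the toy data and encoding are illustrative only.
from sklearn.tree import DecisionTreeClassifier, export_text

# Encode a tiny categorical data set as integers:
# Outlook (0=Sunny, 1=Overcast, 2=Rain), Humidity (0=High, 1=Normal).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1]]
y = ["No", "Yes", "Yes", "Yes", "Yes", "No"]  # class labels held by the leaves

clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X, y)

# Each internal node is a test on an attribute, each branch an outcome of the test,
# and each leaf node holds a class label.
print(export_text(clf, feature_names=["Outlook", "Humidity"]))
```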
Decision Tree Terms
[Figure: decision tree terminology diagram]
Types of Decision Trees
Decision trees used in data mining are of two main types:
 Classification tree analysis is when the predicted outcome is the class (discrete) to
which the data belongs. Classification trees are used to predict membership of cases
or objects into classes of a categorical dependent variable from their measurements
on one or more predictor variables.
 Regression tree analysis is when the predicted outcome can be considered a real
number (e.g. the price of a house, or a patient's length of stay in a hospital).
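As an aside (not from the slides), the two types can be tried out in a few lines with scikit-learn, assuming it is installed; the feature values and targets below are invented toy data.

```python
# Sketch: the two main kinds of decision trees, on invented toy data.
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification tree: the predicted outcome is a discrete class.
X_cls = [[25, 0], [47, 1], [33, 1], [19, 0]]      # e.g. [age, owns_home]
y_cls = ["no", "yes", "yes", "no"]                # categorical dependent variable
clf = DecisionTreeClassifier().fit(X_cls, y_cls)
print(clf.predict([[30, 1]]))                     # -> a class label

# Regression tree: the predicted outcome is a real number (e.g. a house price).
X_reg = [[1200], [1500], [1800], [2400]]          # e.g. floor area in square feet
y_reg = [150000.0, 180000.0, 210000.0, 280000.0]  # price
reg = DecisionTreeRegressor().fit(X_reg, y_reg)
print(reg.predict([[1600]]))                      # -> a real-valued prediction
```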
Decision Tree Algorithms
 ID3 (Iterative Dichotomiser 3): Quinlan, 1986. Uses information gain as the splitting criterion.
 C4.5 (successor of ID3): Quinlan, 1993. Based on ID3.
 CART (Classification And Regression Tree): Breiman, Friedman, Olshen and Stone, 1984. Uses the Gini diversity index as the measure of impurity when deciding a split.
 CHAID (Chi-square Automatic Interaction Detection): Gordon V. Kass, 1980. A statistical approach that uses the Chi-squared test when deciding on the best split.
A small sketch of the first two splitting criteria follows.
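The sketch below (my own, plain Python, not code from the slides) shows entropy, used by ID3/C4.5, and the Gini index, used by CART, evaluated on the example's 9-Yes / 5-No label distribution.

```python
# Sketch: the impurity measures behind ID3/C4.5 (entropy) and CART (Gini index).
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(S) = sum over classes of -p_i * log2(p_i)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def gini(labels):
    """Gini(S) = 1 - sum over classes of p_i^2."""
    total = len(labels)
    return 1 - sum((n / total) ** 2 for n in Counter(labels).values())

labels = ["Yes"] * 9 + ["No"] * 5   # the 14 decisions from the example data set
print(round(entropy(labels), 3))    # 0.940
print(round(gini(labels), 3))       # 0.459
```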
ID3 (Iterative Dichotomiser 3)
 Dichotomiser means dividing into two completely opposite things.
 The algorithm iteratively divides the data on the most dominant attribute and uses these splits to construct a tree.
 It calculates the Entropy and Information Gain of each attribute; in this way, the most dominant attribute can be found.
 The most dominant attribute is then placed at a decision node of the tree.
 Entropy and gain scores are calculated again among the remaining attributes for each branch.
 The procedure continues until a decision is reached for every branch.
 Calculate the entropy of every attribute using the data set S:
   Entropy(S) = Σ − p(i) · log2 p(i)
 Split the set S into subsets using the attribute for which the resulting entropy is minimum (i.e. the information gain is maximum):
   Gain(S, A) = Entropy(S) − Σ [P(S|A) · Entropy(S|A)]
 Make a decision tree node containing that attribute.
 Recurse on the subsets using the remaining attributes.
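A hedged sketch of the gain formula in plain Python follows; the helper names (entropy, information_gain) and the tiny four-row example are my own, repeated here so the snippet runs on its own.

```python
# Sketch: information gain of splitting a data set S on attribute A.
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Gain(S, A) = Entropy(S) - sum_v P(S_v) * Entropy(S_v) over values v of A."""
    labels = [row[target] for row in rows]
    gain = entropy(labels)
    for value in set(row[attribute] for row in rows):
        subset = [row[target] for row in rows if row[attribute] == value]
        gain -= (len(subset) / len(rows)) * entropy(subset)
    return gain

# Tiny made-up example: splitting on an attribute with two values.
rows = [{"Wind": "Weak", "Decision": "Yes"}, {"Wind": "Weak", "Decision": "No"},
        {"Wind": "Strong", "Decision": "No"}, {"Wind": "Strong", "Decision": "No"}]
print(round(information_gain(rows, "Wind", "Decision"), 3))  # 0.311
```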
Example: To Go Outing or Not
Day Outlook Temperature Humidity Wind Decision
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
3 Overcast Hot High Weak Yes
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
6 Rain Cool Normal Strong No
7 Overcast Cool Normal Strong Yes
8 Sunny Mild High Weak No
9 Sunny Cool Normal Weak Yes
10 Rain Mild Normal Weak Yes
11 Sunny Mild Normal Strong Yes
12 Overcast Mild High Strong Yes
13 Overcast Hot Normal Weak Yes
14 Rain Mild High Strong No
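For readers who want to reproduce the calculations that follow, the table can be written down as plain Python data. The encoding below (a list of tuples plus column names) is my own choice, not part of the original slides.

```python
# The "To Go Outing or Not" data set as plain Python data (my own encoding of the table).
COLUMNS = ("Day", "Outlook", "Temperature", "Humidity", "Wind", "Decision")
DATA = [
    (1,  "Sunny",    "Hot",  "High",   "Weak",   "No"),
    (2,  "Sunny",    "Hot",  "High",   "Strong", "No"),
    (3,  "Overcast", "Hot",  "High",   "Weak",   "Yes"),
    (4,  "Rain",     "Mild", "High",   "Weak",   "Yes"),
    (5,  "Rain",     "Cool", "Normal", "Weak",   "Yes"),
    (6,  "Rain",     "Cool", "Normal", "Strong", "No"),
    (7,  "Overcast", "Cool", "Normal", "Strong", "Yes"),
    (8,  "Sunny",    "Mild", "High",   "Weak",   "No"),
    (9,  "Sunny",    "Cool", "Normal", "Weak",   "Yes"),
    (10, "Rain",     "Mild", "Normal", "Weak",   "Yes"),
    (11, "Sunny",    "Mild", "Normal", "Strong", "Yes"),
    (12, "Overcast", "Mild", "High",   "Strong", "Yes"),
    (13, "Overcast", "Hot",  "Normal", "Weak",   "Yes"),
    (14, "Rain",     "Mild", "High",   "Strong", "No"),
]
# Quick sanity check: 9 Yes and 5 No decisions, as used on the next slide.
decisions = [row[-1] for row in DATA]
print(decisions.count("Yes"), decisions.count("No"))  # 9 5
```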
Worked Example Using ID3
 The Decision column consists of 14 instances and includes two labels: Yes and No.
 There are 9 decisions labelled Yes and 5 decisions labelled No.
Entropy(Decision) = − P(Yes) · log2 P(Yes) − P(No) · log2 P(No)
Entropy(Decision) = − (9/14) · log2(9/14) − (5/14) · log2(5/14) = 0.940
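A quick arithmetic check of this value (plain Python, nothing slide-specific):

```python
# Check: Entropy(Decision) with 9 Yes and 5 No out of 14 instances.
from math import log2
e = -(9/14) * log2(9/14) - (5/14) * log2(5/14)
print(round(e, 3))  # 0.940
```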
Calculate the Wind factor on the decision
 Gain(Decision, Wind) = Entropy(Decision) − Σ [P(Decision|Wind) · Entropy(Decision|Wind)]
 The Wind attribute has two labels: Weak and Strong.
 Gain(Decision, Wind) = Entropy(Decision)
   − P(Decision|Wind = Weak) · Entropy(Decision|Wind = Weak)
   − P(Decision|Wind = Strong) · Entropy(Decision|Wind = Strong)
Now we calculate the entropy of the decision for Weak and Strong wind.
Weak Wind factor on decision
Entropy(Decision | Wind = Weak) = − P(No) · log2 P(No) − P(Yes) · log2 P(Yes)
  = − (2/8) · log2(2/8) − (6/8) · log2(6/8) = 0.811
Strong Wind factor on decision
Entropy(Decision | Wind = Strong) = − P(No) · log2 P(No) − P(Yes) · log2 P(Yes)
  = − (3/6) · log2(3/6) − (3/6) · log2(3/6) = 1
Wind factor on decision
Gain(Decision, Wind) = Entropy(Decision)
  − P(Decision|Wind = Weak) · Entropy(Decision|Wind = Weak)
  − P(Decision|Wind = Strong) · Entropy(Decision|Wind = Strong)
  = 0.940 − (8/14) · 0.811 − (6/14) · 1 = 0.048
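The same number can be checked with a few lines of arithmetic (my own check, not from the slides):

```python
# Check: Gain(Decision, Wind) from the entropies computed above.
from math import log2
entropy_decision = 0.940
entropy_weak = -(2/8) * log2(2/8) - (6/8) * log2(6/8)    # 0.811
entropy_strong = -(3/6) * log2(3/6) - (3/6) * log2(3/6)  # 1.0
gain_wind = entropy_decision - (8/14) * entropy_weak - (6/14) * entropy_strong
print(round(gain_wind, 3))  # 0.048
```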
Applying similar calculations to the other columns:
 Gain (Decision, Outlook) = 0.246
 Gain (Decision, Temperature) = 0.029
 Gain (Decision, Humidity) = 0.151
 Gain (Decision, Wind) = 0.048
The Outlook factor produces the highest score, so Outlook appears as the root node of the tree (these four values are reproduced in the sketch below).
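The four gains can be reproduced with a short, self-contained script; the data encoding and helper names are my own, and the printed values should match the figures above up to rounding.

```python
# Reproduce the information gain of each attribute on the full data set.
from collections import Counter
from math import log2

ROWS = [
    # (Outlook, Temperature, Humidity, Wind, Decision)
    ("Sunny", "Hot", "High", "Weak", "No"),          ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),      ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),       ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"), ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),      ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),    ("Rain", "Mild", "High", "Strong", "No"),
]
ATTRS = {"Outlook": 0, "Temperature": 1, "Humidity": 2, "Wind": 3}

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def gain(rows, col):
    g = entropy([r[-1] for r in rows])
    for value in set(r[col] for r in rows):
        subset = [r[-1] for r in rows if r[col] == value]
        g -= (len(subset) / len(rows)) * entropy(subset)
    return g

for name, col in ATTRS.items():
    print(name, round(gain(ROWS, col), 3))
# Expected: Outlook 0.246, Temperature 0.029, Humidity 0.151, Wind 0.048
```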
[Tree so far: Outlook at the root, with branches Sunny, Overcast and Rain]
Overcast subset:
Day Outlook Temperature Humidity Wind Decision
3 Overcast Hot High Weak Yes
7 Overcast Cool Normal Strong Yes
12 Overcast Mild High Strong Yes
13 Overcast Hot Normal Weak Yes
[Tree: the Overcast branch becomes a leaf labelled Yes (instances 3, 7, 12, 13)]
The decision is always Yes when the outlook is Overcast.
Sunny outlook on decision
Gain (Outlook = Sunny | Temperature) = 0.570
Gain (Outlook = Sunny | Humidity) = 0.970
Gain (Outlook = Sunny | Wind) = 0.019
Humidity gives the highest gain, so Humidity is the decision node on the Sunny branch.
Sunny subset:
Day Outlook Temperature Humidity Wind Decision
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
8 Sunny Mild High Weak No
9 Sunny Cool Normal Weak Yes
11 Sunny Mild Normal Strong Yes
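A quick check of the Sunny branch numbers (my own arithmetic; the unrounded entropy is closer to 0.971 than the 0.970 shown above):

```python
# Check the Sunny branch: 2 Yes and 3 No, and Humidity separates them perfectly.
from math import log2
entropy_sunny = -(2/5) * log2(2/5) - (3/5) * log2(3/5)   # ~0.971
# High humidity -> 3 No (entropy 0); Normal humidity -> 2 Yes (entropy 0).
gain_humidity = entropy_sunny - (3/5) * 0 - (2/5) * 0
print(round(entropy_sunny, 3), round(gain_humidity, 3))  # 0.971 0.971
```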
Sunny and High-humidity subset:
Day Outlook Temperature Humidity Wind Decision
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
8 Sunny Mild High Weak No
Sunny and Normal-humidity subset:
Day Outlook Temperature Humidity Wind Decision
9 Sunny Cool Normal Weak Yes
11 Sunny Mild Normal Strong Yes
[Tree: on the Sunny branch, Humidity splits into High → No (instances 1, 2, 8) and Normal → Yes (instances 9, 11)]
[Tree: on the Rain branch, Wind splits into Strong → No (instances 6, 14) and Weak → Yes (instances 4, 5, 10)]
Rain subset:
Day Outlook Temperature Humidity Wind Decision
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
6 Rain Cool Normal Strong No
10 Rain Mild Normal Weak Yes
14 Rain Mild High Strong No
Rain outlook on decision: comparing
Gain (Outlook = Rain | Temperature),
Gain (Outlook = Rain | Humidity), and
Gain (Outlook = Rain | Wind),
Wind produces the highest score when the outlook is Rain, so Wind is the decision node on the Rain branch.
Final tree:
Outlook
  Sunny → Humidity
      High → No (instances 1, 2, 8)
      Normal → Yes (instances 9, 11)
  Overcast → Yes (instances 3, 7, 12, 13)
  Rain → Wind
      Strong → No (instances 6, 14)
      Weak → Yes (instances 4, 5, 10)
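Putting the whole procedure together, the following compact ID3 sketch rebuilds this exact tree from the table. It is my own illustrative implementation, not code from the slides; ties and empty attribute sets are handled in the simplest possible way (majority vote).

```python
# Compact ID3 sketch: rebuilds the Outlook/Humidity/Wind tree from the example table.
from collections import Counter
from math import log2

ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),          ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),      ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),       ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"), ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),      ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),    ("Rain", "Mild", "High", "Strong", "No"),
]
ATTRS = {"Outlook": 0, "Temperature": 1, "Humidity": 2, "Wind": 3}

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def gain(rows, col):
    g = entropy([r[-1] for r in rows])
    for value in set(r[col] for r in rows):
        subset = [r[-1] for r in rows if r[col] == value]
        g -= (len(subset) / len(rows)) * entropy(subset)
    return g

def id3(rows, attrs):
    labels = [r[-1] for r in rows]
    if len(set(labels)) == 1:                 # pure node -> leaf
        return labels[0]
    if not attrs:                             # no attributes left -> majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(rows, attrs[a]))
    col = attrs[best]
    rest = {a: c for a, c in attrs.items() if a != best}
    return {best: {value: id3([r for r in rows if r[col] == value], rest)
                   for value in set(r[col] for r in rows)}}

print(id3(ROWS, ATTRS))
# Expected shape:
# {'Outlook': {'Overcast': 'Yes',
#              'Sunny': {'Humidity': {'High': 'No', 'Normal': 'Yes'}},
#              'Rain': {'Wind': {'Weak': 'Yes', 'Strong': 'No'}}}}
```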
Advantages of Decision Trees
Decision trees have various advantages. They are:
 Simple to understand and interpret
 Able to handle both numerical and categorical data
 Requires little data preparation
 Uses a white box model
 Possible to validate a model using statistical tests
 Performs well with large datasets
 Mirrors human decision making more closely than other approaches
Limitations of Decision Trees
 Trees can be very non-robust: a small change in the training data can produce a very different tree, so they are unstable.
 Overfitting
 Not well suited to continuous variables
 Greedy algorithms cannot guarantee to return the globally optimal decision tree.
 Decision tree learners create biased trees if some classes dominate.
 A single tree is often relatively inaccurate, giving low prediction accuracy compared with other approaches.
 Calculations can become complex when there are many class labels.
Conclusion
The Decision Tree algorithm belongs to the family of supervised learning algorithms. The
general motive of using a decision tree is to create a training model that can be used to
predict the class or value of target variables by learning decision rules inferred from prior
data (training data). The primary challenge in a decision tree implementation is to
identify which attribute to consider at the root node and at each level.
Decision trees often mimic human-level thinking, so it is simple to understand the
data and make good interpretations.
References
1) https://dzone.com/articles/machine-learning-with-decision-trees
2) https://www.tutorialspoint.com/data_mining/dm_dti.htm
3) https://en.wikipedia.org/wiki/Decision_tree_learning#Decision_tree_types
4) https://en.wikipedia.org/wiki/Decision_tree_learning#Advantages
5) https://en.wikipedia.org/wiki/Decision_tree_learning#Limitations
6) https://becominghuman.ai/understanding-decision-trees-43032111380f
7) https://en.wikipedia.org/wiki/Decision_tree_learning
8) https://www.youtube.com/watch?v=Svo4MTtkHXo&t=104s
9) https://www.youtube.com/watch?v=UzpwBb3qAbs&t=153s