BAS 250
Lesson 5: Decision Trees
• Explain what decision trees are, how they are used, and the
benefits of using them
• Describe the best format for data in order to perform predictive
decision tree mining
• Interpret a visual tree’s nodes and leaves
• Explain the use of different algorithms in order to increase the
granularity of the tree’s detail
This Week’s Learning Objectives
 What is a Decision Tree
 Sample Decision Trees
 How to Construct a Decision Tree
 Problems with Decision Trees
 Summary
Overview
• Decision trees are excellent predictive models when the target attribute is categorical in
nature and when the data set is of mixed data types
• Compared with more numerically-based approaches, decision trees are better at handling
attributes that have missing or inconsistent values; decision trees will work around
such data and still generate usable results
• Decision trees are made of nodes and leaves to represent the best predictor attributes in
a data set
• Decision trees tell the user what is predicted, how confident that prediction can be, and
how we arrived at said prediction
Overview
An example of a Decision Tree developed in RapidMiner
Decision Trees
• Nodes are circular or oval shapes that represent
attributes which serve as good predictors for the label
attribute
• Leaves are end points that demonstrate the
distribution of categories from the label attribute that
follow the branch of the tree to the point of that leaf
Decision Trees
An example of meta data for playing golf based on a decision tree
Decision Trees
 An inductive learning task
o Use particular facts to make more generalized conclusions
 A predictive model based on a branching series of
Boolean tests
o These smaller Boolean tests are less complex than a one-
stage classifier
 Let’s look at a sample decision tree…
What is a Decision Tree?
Predicting Commute Time
[Decision tree diagram]
Leave At
  8 AM  -> Long
  9 AM  -> Accident?  (No -> Medium, Yes -> Long)
  10 AM -> Stall?     (No -> Short, Yes -> Long)
If we leave at 10 AM and
there are no cars stalled
on the road, what will our
commute time be?
 In this decision tree, we made a series of Boolean
decisions and followed the corresponding branch
o Did we leave at 10 AM?
o Did a car stall on the road?
o Is there an accident on the road?
 By answering each of these yes/no questions, we
then came to a conclusion on how long our commute
might take
Inductive Learning
We did not have to represent this tree graphically.
We could have represented it as a set of rules.
However, this may be much harder to read…
Decision Trees as Rules
if hour == 8am
    commute time = long
else if hour == 9am
    if accident == yes
        commute time = long
    else
        commute time = medium
else if hour == 10am
    if stall == yes
        commute time = long
    else
        commute time = short
Decision Tree as a Rule Set
• Notice that not all attributes have to be used
in each path of the decision tree.
• As we will see, all attributes may not even
appear in the tree.
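As a minimal sketch in Python (the function and value names here are illustrative, not from any particular tool), the same rule set can be written as a small runnable function:

def predict_commute(hour, accident, stall):
    # Predicts the commute time for one instance, following the rule set above
    if hour == "8 AM":
        return "Long"
    elif hour == "9 AM":
        return "Long" if accident == "Yes" else "Medium"
    elif hour == "10 AM":
        return "Long" if stall == "Yes" else "Short"
    return None  # an hour we have never seen in the experience table

print(predict_commute("10 AM", accident="No", stall="No"))  # prints: Short

Note that Weather never appears in the function, just as it never appears in the tree.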
1. We first make a list of attributes that we can measure
 These attributes (for now) must be discrete
2. We then choose a target attribute that we want to predict
3. Then create an experience table that lists what we have
seen in the past
How to Create a Decision Tree
Example   Hour    Weather   Accident   Stall   Commute (Target)
D1        8 AM    Sunny     No         No      Long
D2        8 AM    Cloudy    No         Yes     Long
D3        10 AM   Sunny     No         No      Short
D4        9 AM    Rainy     Yes        No      Long
D5        9 AM    Sunny     Yes        Yes     Long
D6        10 AM   Sunny     No         No      Short
D7        10 AM   Cloudy    No         No      Short
D8        9 AM    Rainy     No         No      Medium
D9        9 AM    Sunny     Yes        No      Long
D10       10 AM   Cloudy    Yes        Yes     Long
D11       10 AM   Rainy     No         No      Short
D12       8 AM    Cloudy    Yes        No      Long
D13       9 AM    Sunny     No         No      Medium
Sample Experience Table
The previous experience decision table had 4 attributes:
1. Hour
2. Weather
3. Accident
4. Stall
But the decision tree only showed 3 attributes:
1. Hour
2. Accident
3. Stall
Why?
Choosing Attributes
 Methods for selecting attributes show that weather is
not a discriminating attribute
 We use the principle of Occam’s Razor: Given a
number of competing hypotheses, the simplest one
is preferable
Choosing Attributes
 The basic structure of creating a decision tree is
the same for most decision tree algorithms
 The difference lies in how we select the attributes
for the tree
 We will focus on the ID3 algorithm developed by
Ross Quinlan in 1975
Choosing Attributes
 The basic idea behind any decision tree algorithm is as
follows:
o Choose the best attribute(s) to split the remaining instances and make
that attribute a decision node
o Repeat this process recursively for each child
o Stop when:
 All the instances have the same target attribute value
 There are no more attributes
 There are no more instances
Decision Tree Algorithms
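A rough Python sketch of this recursion is shown below (an illustration, not RapidMiner’s or Quinlan’s exact code); the attribute-selection step is left as a pluggable choose_attribute function, since the next slides cover how ID3 makes that choice:

from collections import Counter

def build_tree(instances, attributes, target, choose_attribute):
    # instances: list of dicts; attributes: candidate attribute names; target: label name
    if not instances:                               # no more instances
        return None
    labels = [row[target] for row in instances]
    if len(set(labels)) == 1:                       # all instances share one target value
        return labels[0]
    if not attributes:                              # no more attributes: return majority class
        return Counter(labels).most_common(1)[0][0]
    attr = choose_attribute(instances, attributes, target)
    node = {attr: {}}
    remaining = [a for a in attributes if a != attr]
    for value in {row[attr] for row in instances}:
        subset = [row for row in instances if row[attr] == value]
        node[attr][value] = build_tree(subset, remaining, target, choose_attribute)
    return node

With ID3, choose_attribute would pick the attribute with the lowest expected entropy (equivalently, the highest information gain).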
Original decision tree
Identifying the Best Attributes
[Decision tree diagram]
Leave At
  8 AM  -> Long
  9 AM  -> Accident?  (No -> Medium, Yes -> Long)
  10 AM -> Stall?     (No -> Short, Yes -> Long)
How did we know to split on Leave At, and then on Stall and
Accident, and not on Weather?
 To determine the best attribute, we look at the
ID3 heuristic
 ID3 splits attributes based on their entropy.
Entropy is a measure of uncertainty, or disorder, in the data…
ID3 Heuristic
 Entropy is minimized when all values of the target
attribute are the same
o If we know that commute time will always be short, then entropy = 0
 Entropy is maximized when there is an equal chance
of all values for the target attribute (i.e. the result is
random)
o If commute time = short in 3 instances, medium in 3 instances and long
in 3 instances, entropy is maximized
Entropy
 Calculation of entropy
o Entropy(S) = ∑ (i = 1 to l) -(|Si| / |S|) * log2(|Si| / |S|)
 S = the set of examples
 Si = the subset of S whose target attribute has value vi
 l = the number of distinct values of the target attribute
Entropy
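As a quick illustrative sketch, this is the same formula in Python, applied to the target column of the experience table (7 Long, 4 Short, 2 Medium):

import math
from collections import Counter

def entropy(labels):
    # Entropy(S) = sum over classes i of -(|Si|/|S|) * log2(|Si|/|S|)
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

commute = ["Long"] * 7 + ["Short"] * 4 + ["Medium"] * 2
print(entropy(commute))   # about 1.42 bits for the 13-row commute table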
 ID3 splits on attributes with the lowest expected entropy
 We calculate the expected entropy for an attribute as the
weighted sum of its subset entropies, as follows:
o ∑ (i = 1 to k) (|Si| / |S|) * Entropy(Si), where k is the number of
distinct values of the attribute we are testing
 We can also measure information gain (which decreases as
expected entropy increases) as follows:
o Entropy(S) - ∑ (i = 1 to k) (|Si| / |S|) * Entropy(Si)
ID3
Attribute   Expected Entropy   Information Gain
Hour        0.6511             0.768449
Weather     1.28884            0.130719
Accident    0.92307            0.496479
Stall       1.17071            0.248842
ID3
Given our commute time sample set, we can calculate
the entropy of each attribute at the root node
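The sketch below (again an added illustration) repeats the entropy function from the previous sketch so it runs on its own, then computes each attribute’s expected entropy and information gain from the experience table; the printed numbers should match the table above to within rounding:

import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def expected_entropy(rows, attribute, target):
    # Weighted sum of subset entropies: sum over values v of (|Sv|/|S|) * Entropy(Sv)
    total = len(rows)
    result = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        result += (len(subset) / total) * entropy(subset)
    return result

def information_gain(rows, attribute, target):
    return entropy([r[target] for r in rows]) - expected_entropy(rows, attribute, target)

columns = ["Hour", "Weather", "Accident", "Stall", "Commute"]
rows = [dict(zip(columns, r)) for r in [
    ("8 AM", "Sunny", "No", "No", "Long"),    ("8 AM", "Cloudy", "No", "Yes", "Long"),
    ("10 AM", "Sunny", "No", "No", "Short"),  ("9 AM", "Rainy", "Yes", "No", "Long"),
    ("9 AM", "Sunny", "Yes", "Yes", "Long"),  ("10 AM", "Sunny", "No", "No", "Short"),
    ("10 AM", "Cloudy", "No", "No", "Short"), ("9 AM", "Rainy", "No", "No", "Medium"),
    ("9 AM", "Sunny", "Yes", "No", "Long"),   ("10 AM", "Cloudy", "Yes", "Yes", "Long"),
    ("10 AM", "Rainy", "No", "No", "Short"),  ("8 AM", "Cloudy", "Yes", "No", "Long"),
    ("9 AM", "Sunny", "No", "No", "Medium"),
]]

for attr in ["Hour", "Weather", "Accident", "Stall"]:
    print(attr, round(expected_entropy(rows, attr, "Commute"), 5),
          round(information_gain(rows, attr, "Commute"), 6))

Hour has the lowest expected entropy (highest gain), which is why the tree splits on Leave At first.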
 There is another technique for reducing the
number of attributes used in a tree – pruning
 Two types of pruning:
o Pre-pruning (forward pruning)
o Post-pruning (backward pruning)
Pruning Trees
 In prepruning, we decide during the building process
when to stop adding attributes (possibly based on their
information gain)
 However, this may be problematic – Why?
o Sometimes attributes individually do not contribute much to a
decision, but combined, they may have a significant impact
Prepruning
 Postpruning waits until the full decision tree has been built
and then prunes attributes from it
 Two techniques:
o Subtree Replacement
o Subtree Raising
Postpruning
Entire subtree is replaced by a single leaf node
Subtree Replacement
[Diagram: root A with child B; B has a subtree rooted at C with leaves 1, 2, and 3, plus two other leaves 4 and 5]
• Node 6 replaced the subtree rooted at C
• This generalizes the tree a little more, but
may increase accuracy on unseen data
Subtree Replacement
[Diagram: root A with child B; the subtree rooted at C has been replaced by the single leaf 6, alongside leaves 4 and 5]
Entire subtree is raised onto another node
Subtree Raising
[Diagram: root A with child B; B has a subtree rooted at C with leaves 1, 2, and 3, plus two other leaves 4 and 5]
Entire subtree is raised onto another node
We will NOT be using Subtree Raising in this course!
Subtree Raising
[Diagram: the subtree rooted at C, with leaves 1, 2, and 3, has been raised to replace B directly under root A]
 ID3 is not optimal
o Uses expected entropy reduction, not actual reduction
 Must use discrete (or discretized) attributes
o What if we left for work at 9:30 AM?
o We could break down the attributes into smaller
values…
Problems with ID3
If we broke down leave time to the minute, we
might get something like this:
Problems with ID3
[Diagram: one branch per exact departure time (8:02, 8:03, 9:05, 9:07, 9:09, and 10:02 AM), each ending in its own single-instance leaf (Long, Medium, or Short)]
Since entropy is very low for each branch, we have n branches
with n leaves. This would not be helpful for predictive modeling.
 We can use a technique known as discretization
 We choose cut points, such as 9 AM, for splitting continuous
attributes
 These cut points generally lie in a subset of boundary points,
where a boundary point is a place where two adjacent instances
in a sorted list have different target attribute values
Problems with ID3
Consider the leave-time attribute, with each instance’s commute class shown in parentheses:
Problems with ID3
8:00 (L), 8:02 (L), 8:07 (M), 9:00 (S), 9:20 (S), 9:25 (S), 10:00 (S), 10:02 (M)
When we split at these cut points rather than on every distinct
value, the entropy of each branch increases slightly, but we no
longer have a decision tree with one leaf for every value
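As a small illustrative sketch (times converted to minutes after midnight; the names are assumptions, not from the slides), the boundary points in a sorted list like the one above can be found by looking for adjacent instances whose target values differ:

def boundary_points(sorted_pairs):
    # sorted_pairs: (numeric attribute value, target class), sorted by the attribute value
    # A candidate cut point is the midpoint between two adjacent instances with different classes
    cuts = []
    for (x1, y1), (x2, y2) in zip(sorted_pairs, sorted_pairs[1:]):
        if y1 != y2:
            cuts.append((x1 + x2) / 2)
    return cuts

# Leave times in minutes after midnight, with each instance's commute class
pairs = [(480, "L"), (482, "L"), (487, "M"), (540, "S"),
         (560, "S"), (565, "S"), (600, "S"), (602, "M")]
print(boundary_points(pairs))   # [484.5, 513.5, 601.0] -> cuts near 8:05, 8:34, and 10:01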
 While decision trees classify quickly, the time needed to build
a tree may be higher than for other types of classifiers
 Decision trees suffer from errors propagating throughout the
tree, which becomes a very serious problem as the number of
classes increases
Problems with Decision Trees
 Since decision trees work by a series of local
decisions, what happens when one of these
local decisions is wrong?
o Every decision from that point on may be wrong
o We may never return to the correct path of the
tree
Error Propagation
 Decision trees can be used to help predict the
future
 The trees are easy to understand
 Decision trees work more efficiently with discrete
attributes
 The trees may suffer from error propagation
Summary
“This workforce solution was funded by a grant awarded by the U.S. Department of Labor’s
Employment and Training Administration. The solution was created by the grantee and does not
necessarily reflect the official position of the U.S. Department of Labor. The Department of Labor
makes no guarantees, warranties, or assurances of any kind, express or implied, with respect to such
information, including any information on linked sites and including, but not limited to, accuracy of the
information or its completeness, timeliness, usefulness, adequacy, continued availability, or ownership.”
Except where otherwise stated, this work by Wake Technical Community College Building Capacity in
Business Analytics, a Department of Labor, TAACCCT funded project, is licensed under the Creative
Commons Attribution 4.0 International License. To view a copy of this license, visit
http://creativecommons.org/licenses/by/4.0/
Copyright Information