SlideShare a Scribd company logo
BY
International School of Engineering
{We Are Applied Engineering}
Disclaimer: Some of the Images and content have been taken from multiple online sources and this presentation is
intended only for knowledge sharing but not for any commercial business intention
OVERVIEW
• DEFINITION OF DECISIONTREE
• WHY DECISIONTREE?
• DECISIONTREETERMS
• EASY EXAMPLE
• CONSTRUCTING A DECISION
TREE
• CALCULATION OF ENTROPY
• ENTROPY
• TERMINATION CRITERIA
• PRUNINGTREES
• APPROACHES TO PRUNETREE
• DECISIONTREE ALGORITHMS
• LIMITATIONS
• ADVANTAGES
• VIDEO OF CONSTRUCTING A
DECISIONTREE
DEFINITION OF ‘DECISIONTREE'
 A decision tree is a natural and simple way of inducing following kind of rules.
If (Age is x) and (income is y) and (family size is z) and (credit card
spending is p) then he will accept the loan
 It is powerful and perhaps most widely used modeling technique of all
 Decision trees classify instances by sorting them down the tree from the root to some leaf
node, which provides the classification of the instance
WHY DECISIONTREE?
Source: http://www.simafore.com/blog/bid/62482/2-main-differences-between-classification-and-regression-trees
Decision Trees
To Classify
Response variable has
only two categories
Use standard
classification tree
Response variable has
multiple categories
Use c4.5
implementation
To Predict
Response variable is
continuous
Linear relationships
between predictors
and response
Use standard
Regression tree
Nonlinear relationships
between predictors and
response
Use c4.5
implementation
DECISIONTREETERMS
Root Node
Condition Check
Leaf Node(Decision Point)
Leaf Node(Decision Point)
Condition Check
Branch Branch
EASY EXAMPLE
 Joe’s garage is considering hiring another mechanic.
 The mechanic would cost them an additional $50,000 / year in salary and
benefits.
 If there are a lot of accidents in Iowa City this year, they anticipate making an
additional $75,000 in net revenue.
 If there are not a lot of accidents, they could lose $20,000 off of last year’s total
net revenues.
 Because of all the ice on the roads, Joe thinks that there will be a 70% chance of
“a lot of accidents” and a 30% chance of “fewer accidents”.
 Assume if he doesn’t expand he will have the same revenue as last year.
Joe’s Garage Hiring a
mechanic
Hire a new mechanic
Cost = $50,000
70% of an chance
increase in accidents
Profit = $70,000
30% of a chance
decrease in accidents
Profit = -$20,000
Don’t hire a mechanic
Cost = $0
• Estimated value of “Hire Mechanic” =
NPV =.7(70,000) + .3(- $20,000) - $50,000 = - $7,000
• Therefore you should not hire the mechanic
continued
CONSTRUCTING A DECISIONTREE
 Which attribute to choose?
 Information Gain
ENTROPY
 Where to stop?
 Termination criteria
Two Aspects
CALCULATION OF ENTROPY
 Entropy is a measure of uncertainty in the data
Entropy(S) = ∑(i=1 to l)-|Si|/|S| * log2(|Si|/|S|)
S = set of examples
Si = subset of S with value vi under the target attribute
l = size of the range of the target attribute
ENTROPY
 Let us say, I am considering an action like a coin toss. Say, I have five coins with probabilities
for heads 0, 0.25, 0.5, 0.75 and 1. When I toss them which one has highest uncertainty and
which one has the least?
H = − 𝑖𝑝𝑖 log2 𝑝𝑖
 Information gain = Entropy of the system before split – Entropy
of the system after split
ENTROPY: MEASURE OF RANDOMNESS
TERMINATION CRITERIA
 All the records at the node belong to one class
 A significant majority fraction of records belong to a single class
 The segment contains only one or very small number of records
 The improvement is not substantial enough to warrant making the split
PRUNINGTREES
 The decision trees can be grown deeply enough to perfectly classify the training examples
which leads to overfitting when there is noise in the data
 When the number of training examples is too small to produce a representative sample of
the true target function.
 Practically, pruning is not important for classification
APPROACHES TO PRUNETREE
 Three approaches
–Stop growing the tree earlier, before it reaches the point
where it perfectly classifies the training data,
–Allow the tree to over fit the data, and then post-prune the
tree.
–Allow the tree to over fit the data, transform the tree to rules
and then post-prune the rules.
 Pessimistic pruning
Take the upper bound error at the node and sub-trees
e= [f+
𝑧2
2𝑁
+z
𝑓
𝑁
−
𝑓2
𝑁
+
𝑧2
4𝑁2]/[1+
𝑧2
𝑁
]
 Cost complexity pruning
J(Tree, S) = ErrorRate(Tree, S) + a |Tree|
Play with several values a starting from 0
Do a K-fold validation on all of them and find the best pruning α
TWO MOST POPULAR
DECISIONTREE ALGORITHMS
 Cart
–Binary split
–Gini index
–Cost complexity pruning
 C5.0
–Multi split
–Info gain
–pessimistic pruning
LIMITATIONS
 Class imbalance
 When there are more records and very less number of attributes/features
ADVANTAGES
 They are fast
 Robust
 Requires very little experimentation
 You may also build some intuitions about your customer base. E.g. “Are customers with
different family sizes truly different?
For Detailed Description on
CONSTRUCTING A DECISION TREE
with example
Check out our video
Plot no 63/A, 1st Floor, Road No 13, Film Nagar, Jubilee
Hills, Hyderabad-500033
For Individuals (+91) 9502334561/62
For Corporates (+91) 9618 483 483
Facebook: www.facebook.com/insofe
Slide share: www.slideshare.net/INSOFE
International School of Engineering

More Related Content

What's hot

Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Simplilearn
 
Decision Tree - C4.5&CART
Decision Tree - C4.5&CARTDecision Tree - C4.5&CART
Decision Tree - C4.5&CARTXueping Peng
 
Decision Trees
Decision TreesDecision Trees
Decision Trees
Student
 
Decision tree
Decision treeDecision tree
Decision tree
ShraddhaPandey45
 
2.2 decision tree
2.2 decision tree2.2 decision tree
2.2 decision tree
Krish_ver2
 
Decision Tree Learning
Decision Tree LearningDecision Tree Learning
Decision Tree Learning
Milind Gokhale
 
Decision Tree In R | Decision Tree Algorithm | Data Science Tutorial | Machin...
Decision Tree In R | Decision Tree Algorithm | Data Science Tutorial | Machin...Decision Tree In R | Decision Tree Algorithm | Data Science Tutorial | Machin...
Decision Tree In R | Decision Tree Algorithm | Data Science Tutorial | Machin...
Simplilearn
 
Classification Using Decision tree
Classification Using Decision treeClassification Using Decision tree
Classification Using Decision tree
Mohd. Noor Abdul Hamid
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests
Derek Kane
 
Decision tree
Decision tree Decision tree
Decision tree
asna akhtar
 
Decision tree in artificial intelligence
Decision tree in artificial intelligenceDecision tree in artificial intelligence
Decision tree in artificial intelligence
MdAlAmin187
 
Decision tree
Decision treeDecision tree
Decision tree
Venkata Reddy Konasani
 
Random forest and decision tree
Random forest and decision treeRandom forest and decision tree
Random forest and decision tree
AAKANKSHA JAIN
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Marina Santini
 
Slide3.ppt
Slide3.pptSlide3.ppt
Slide3.pptbutest
 
Decision trees in Machine Learning
Decision trees in Machine Learning Decision trees in Machine Learning
Decision trees in Machine Learning
Mohammad Junaid Khan
 
Machine Learning Ml Overview Algorithms Use Cases And Applications
Machine Learning Ml Overview Algorithms Use Cases And ApplicationsMachine Learning Ml Overview Algorithms Use Cases And Applications
Machine Learning Ml Overview Algorithms Use Cases And Applications
SlideTeam
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
Jon Lederman
 
Introduction to Data Mining
Introduction to Data Mining Introduction to Data Mining
Introduction to Data Mining
Sushil Kulkarni
 
Decision tree and random forest
Decision tree and random forestDecision tree and random forest
Decision tree and random forest
Lippo Group Digital
 

What's hot (20)

Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
 
Decision Tree - C4.5&CART
Decision Tree - C4.5&CARTDecision Tree - C4.5&CART
Decision Tree - C4.5&CART
 
Decision Trees
Decision TreesDecision Trees
Decision Trees
 
Decision tree
Decision treeDecision tree
Decision tree
 
2.2 decision tree
2.2 decision tree2.2 decision tree
2.2 decision tree
 
Decision Tree Learning
Decision Tree LearningDecision Tree Learning
Decision Tree Learning
 
Decision Tree In R | Decision Tree Algorithm | Data Science Tutorial | Machin...
Decision Tree In R | Decision Tree Algorithm | Data Science Tutorial | Machin...Decision Tree In R | Decision Tree Algorithm | Data Science Tutorial | Machin...
Decision Tree In R | Decision Tree Algorithm | Data Science Tutorial | Machin...
 
Classification Using Decision tree
Classification Using Decision treeClassification Using Decision tree
Classification Using Decision tree
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests
 
Decision tree
Decision tree Decision tree
Decision tree
 
Decision tree in artificial intelligence
Decision tree in artificial intelligenceDecision tree in artificial intelligence
Decision tree in artificial intelligence
 
Decision tree
Decision treeDecision tree
Decision tree
 
Random forest and decision tree
Random forest and decision treeRandom forest and decision tree
Random forest and decision tree
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
 
Slide3.ppt
Slide3.pptSlide3.ppt
Slide3.ppt
 
Decision trees in Machine Learning
Decision trees in Machine Learning Decision trees in Machine Learning
Decision trees in Machine Learning
 
Machine Learning Ml Overview Algorithms Use Cases And Applications
Machine Learning Ml Overview Algorithms Use Cases And ApplicationsMachine Learning Ml Overview Algorithms Use Cases And Applications
Machine Learning Ml Overview Algorithms Use Cases And Applications
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
 
Introduction to Data Mining
Introduction to Data Mining Introduction to Data Mining
Introduction to Data Mining
 
Decision tree and random forest
Decision tree and random forestDecision tree and random forest
Decision tree and random forest
 

Viewers also liked

Apriori Algorithm
Apriori AlgorithmApriori Algorithm
FSU SLIS Wk 8 Intro to Info Services - Ready Reference
FSU SLIS Wk 8 Intro to Info Services - Ready ReferenceFSU SLIS Wk 8 Intro to Info Services - Ready Reference
FSU SLIS Wk 8 Intro to Info Services - Ready Reference
Lorri Mon
 
Part 4 (machine learning overview) solution architecture
Part 4 (machine learning overview)   solution architecturePart 4 (machine learning overview)   solution architecture
Part 4 (machine learning overview) solution architecture
International School of Engineering
 
Mini project1 team5
Mini project1 team5Mini project1 team5
Mini project1 team5
Saikumar Allaka
 
Pareto optimality
Pareto optimalityPareto optimality
Pareto optimality
Prabha Panth
 
Pareto optimal
Pareto optimal    Pareto optimal
Pareto optimal rmpas
 
Part 6 (machine learning overview) what makes a problem tough continued
Part 6 (machine learning overview)   what makes a problem tough continuedPart 6 (machine learning overview)   what makes a problem tough continued
Part 6 (machine learning overview) what makes a problem tough continued
International School of Engineering
 
Engineering Big Data with Hadoop
Engineering Big Data with HadoopEngineering Big Data with Hadoop
Engineering Big Data with Hadoop
International School of Engineering
 
Intelligent Decision Support Systems
Intelligent Decision Support SystemsIntelligent Decision Support Systems
Intelligent Decision Support SystemsGildardo Sanchez-Ante
 
Part 5 (machine learning overview) what makes a problem tough
Part 5 (machine learning overview)   what makes a problem toughPart 5 (machine learning overview)   what makes a problem tough
Part 5 (machine learning overview) what makes a problem tough
International School of Engineering
 
Scope and Career in Analytics
Scope and Career in AnalyticsScope and Career in Analytics
Scope and Career in Analytics
International School of Engineering
 
Decision Support Systems
Decision Support SystemsDecision Support Systems
Decision Support Systems
luzenith_g
 
Apriori algorithm
Apriori algorithmApriori algorithm
Apriori algorithm
nouraalkhatib
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysis
Amritashish Bagchi
 
Types of decision support system
Types of decision support systemTypes of decision support system
Types of decision support systemnripeshkumarnrip
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysis
Sandeep Soni Kanpur
 
Decision Support Systems
Decision Support SystemsDecision Support Systems
Decision Support Systems
Shigem
 

Viewers also liked (20)

Decision tree example problem
Decision tree example problemDecision tree example problem
Decision tree example problem
 
Apriori Algorithm
Apriori AlgorithmApriori Algorithm
Apriori Algorithm
 
FSU SLIS Wk 8 Intro to Info Services - Ready Reference
FSU SLIS Wk 8 Intro to Info Services - Ready ReferenceFSU SLIS Wk 8 Intro to Info Services - Ready Reference
FSU SLIS Wk 8 Intro to Info Services - Ready Reference
 
Part 4 (machine learning overview) solution architecture
Part 4 (machine learning overview)   solution architecturePart 4 (machine learning overview)   solution architecture
Part 4 (machine learning overview) solution architecture
 
Mini project1 team5
Mini project1 team5Mini project1 team5
Mini project1 team5
 
Pareto optimality
Pareto optimalityPareto optimality
Pareto optimality
 
Pareto optimal
Pareto optimal    Pareto optimal
Pareto optimal
 
Part 6 (machine learning overview) what makes a problem tough continued
Part 6 (machine learning overview)   what makes a problem tough continuedPart 6 (machine learning overview)   what makes a problem tough continued
Part 6 (machine learning overview) what makes a problem tough continued
 
Engineering Big Data with Hadoop
Engineering Big Data with HadoopEngineering Big Data with Hadoop
Engineering Big Data with Hadoop
 
Intelligent Decision Support Systems
Intelligent Decision Support SystemsIntelligent Decision Support Systems
Intelligent Decision Support Systems
 
Part 5 (machine learning overview) what makes a problem tough
Part 5 (machine learning overview)   what makes a problem toughPart 5 (machine learning overview)   what makes a problem tough
Part 5 (machine learning overview) what makes a problem tough
 
Scope and Career in Analytics
Scope and Career in AnalyticsScope and Career in Analytics
Scope and Career in Analytics
 
Decision Support Systems
Decision Support SystemsDecision Support Systems
Decision Support Systems
 
Apriori algorithm
Apriori algorithmApriori algorithm
Apriori algorithm
 
Decision trees
Decision treesDecision trees
Decision trees
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysis
 
Types of decision support system
Types of decision support systemTypes of decision support system
Types of decision support system
 
Decision support system
Decision support systemDecision support system
Decision support system
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysis
 
Decision Support Systems
Decision Support SystemsDecision Support Systems
Decision Support Systems
 

Similar to Decision Trees

Simple rules for building robust machine learning models
Simple rules for building robust machine learning modelsSimple rules for building robust machine learning models
Simple rules for building robust machine learning models
Kyriakos Chatzidimitriou
 
Introduction to machine learning and model building using linear regression
Introduction to machine learning and model building using linear regressionIntroduction to machine learning and model building using linear regression
Introduction to machine learning and model building using linear regression
Girish Gore
 
Random Decision Forests at Scale
Random Decision Forests at ScaleRandom Decision Forests at Scale
Random Decision Forests at Scale
Cloudera, Inc.
 
Echelon Asia Summit 2017 Startup Academy Workshop
Echelon Asia Summit 2017 Startup Academy WorkshopEchelon Asia Summit 2017 Startup Academy Workshop
Echelon Asia Summit 2017 Startup Academy Workshop
Garrett Teoh Hor Keong
 
What are the odds of making that number risk analysis with crystal ball - O...
What are the odds of making that number   risk analysis with crystal ball - O...What are the odds of making that number   risk analysis with crystal ball - O...
What are the odds of making that number risk analysis with crystal ball - O...
p6academy
 
Machine_Learning.pptx
Machine_Learning.pptxMachine_Learning.pptx
Machine_Learning.pptx
VickyKumar131533
 
Machine Learning in the Financial Industry
Machine Learning in the Financial IndustryMachine Learning in the Financial Industry
Machine Learning in the Financial Industry
Subrat Panda, PhD
 
Top 3 Considerations for Machine Learning on Big Data
Top 3 Considerations for Machine Learning on Big DataTop 3 Considerations for Machine Learning on Big Data
Top 3 Considerations for Machine Learning on Big Data
Datameer
 
Fraud Detection by Stacking Cost-Sensitive Decision Trees
Fraud Detection by Stacking Cost-Sensitive Decision TreesFraud Detection by Stacking Cost-Sensitive Decision Trees
Fraud Detection by Stacking Cost-Sensitive Decision Trees
Alejandro Correa Bahnsen, PhD
 
Machine learning overview (with SAS software)
Machine learning overview (with SAS software)Machine learning overview (with SAS software)
Machine learning overview (with SAS software)
Longhow Lam
 
Decision Tree Algorithm & Analysis | Machine Learning Algorithm | Data Scienc...
Decision Tree Algorithm & Analysis | Machine Learning Algorithm | Data Scienc...Decision Tree Algorithm & Analysis | Machine Learning Algorithm | Data Scienc...
Decision Tree Algorithm & Analysis | Machine Learning Algorithm | Data Scienc...
Edureka!
 
A General Framework for Accurate and Fast Regression by Data Summarization in...
A General Framework for Accurate and Fast Regression by Data Summarization in...A General Framework for Accurate and Fast Regression by Data Summarization in...
A General Framework for Accurate and Fast Regression by Data Summarization in...Yao Wu
 
Mini datathon
Mini datathonMini datathon
Mini datathon
Kunal Jain
 
Decision theory
Decision theoryDecision theory
Decision theory
Surekha98
 
Course Project for Coursera Practical Machine Learning
Course Project for Coursera Practical Machine LearningCourse Project for Coursera Practical Machine Learning
Course Project for Coursera Practical Machine LearningJohn Edward Slough II
 
Intro to ml_2021
Intro to ml_2021Intro to ml_2021
Intro to ml_2021
Sanghamitra Deb
 
Machine Learning part 3 - Introduction to data science
Machine Learning part 3 - Introduction to data science Machine Learning part 3 - Introduction to data science
Machine Learning part 3 - Introduction to data science
Frank Kienle
 
Data Driven Risk Management
Data Driven Risk ManagementData Driven Risk Management
Data Driven Risk Management
Resolver Inc.
 
Quantitative data essentials for charities - Learning Lab
Quantitative data essentials for charities - Learning LabQuantitative data essentials for charities - Learning Lab
Quantitative data essentials for charities - Learning Lab
Improvement Skills Consulting Ltd.
 

Similar to Decision Trees (20)

Simple rules for building robust machine learning models
Simple rules for building robust machine learning modelsSimple rules for building robust machine learning models
Simple rules for building robust machine learning models
 
Introduction to machine learning and model building using linear regression
Introduction to machine learning and model building using linear regressionIntroduction to machine learning and model building using linear regression
Introduction to machine learning and model building using linear regression
 
Random Decision Forests at Scale
Random Decision Forests at ScaleRandom Decision Forests at Scale
Random Decision Forests at Scale
 
Echelon Asia Summit 2017 Startup Academy Workshop
Echelon Asia Summit 2017 Startup Academy WorkshopEchelon Asia Summit 2017 Startup Academy Workshop
Echelon Asia Summit 2017 Startup Academy Workshop
 
What are the odds of making that number risk analysis with crystal ball - O...
What are the odds of making that number   risk analysis with crystal ball - O...What are the odds of making that number   risk analysis with crystal ball - O...
What are the odds of making that number risk analysis with crystal ball - O...
 
Machine_Learning.pptx
Machine_Learning.pptxMachine_Learning.pptx
Machine_Learning.pptx
 
Machine Learning in the Financial Industry
Machine Learning in the Financial IndustryMachine Learning in the Financial Industry
Machine Learning in the Financial Industry
 
Top 3 Considerations for Machine Learning on Big Data
Top 3 Considerations for Machine Learning on Big DataTop 3 Considerations for Machine Learning on Big Data
Top 3 Considerations for Machine Learning on Big Data
 
Fraud Detection by Stacking Cost-Sensitive Decision Trees
Fraud Detection by Stacking Cost-Sensitive Decision TreesFraud Detection by Stacking Cost-Sensitive Decision Trees
Fraud Detection by Stacking Cost-Sensitive Decision Trees
 
Machine learning overview (with SAS software)
Machine learning overview (with SAS software)Machine learning overview (with SAS software)
Machine learning overview (with SAS software)
 
Decision Tree Algorithm & Analysis | Machine Learning Algorithm | Data Scienc...
Decision Tree Algorithm & Analysis | Machine Learning Algorithm | Data Scienc...Decision Tree Algorithm & Analysis | Machine Learning Algorithm | Data Scienc...
Decision Tree Algorithm & Analysis | Machine Learning Algorithm | Data Scienc...
 
A General Framework for Accurate and Fast Regression by Data Summarization in...
A General Framework for Accurate and Fast Regression by Data Summarization in...A General Framework for Accurate and Fast Regression by Data Summarization in...
A General Framework for Accurate and Fast Regression by Data Summarization in...
 
Decision theory & decisiontrees
Decision theory & decisiontreesDecision theory & decisiontrees
Decision theory & decisiontrees
 
Mini datathon
Mini datathonMini datathon
Mini datathon
 
Decision theory
Decision theoryDecision theory
Decision theory
 
Course Project for Coursera Practical Machine Learning
Course Project for Coursera Practical Machine LearningCourse Project for Coursera Practical Machine Learning
Course Project for Coursera Practical Machine Learning
 
Intro to ml_2021
Intro to ml_2021Intro to ml_2021
Intro to ml_2021
 
Machine Learning part 3 - Introduction to data science
Machine Learning part 3 - Introduction to data science Machine Learning part 3 - Introduction to data science
Machine Learning part 3 - Introduction to data science
 
Data Driven Risk Management
Data Driven Risk ManagementData Driven Risk Management
Data Driven Risk Management
 
Quantitative data essentials for charities - Learning Lab
Quantitative data essentials for charities - Learning LabQuantitative data essentials for charities - Learning Lab
Quantitative data essentials for charities - Learning Lab
 

More from International School of Engineering

Part 3 (machine learning overview) the forms of knowledge
Part 3 (machine learning overview)   the forms of knowledgePart 3 (machine learning overview)   the forms of knowledge
Part 3 (machine learning overview) the forms of knowledge
International School of Engineering
 
Part 2 (machine learning overview) all machine learning is pattern search
Part 2 (machine learning overview)   all machine learning is pattern searchPart 2 (machine learning overview)   all machine learning is pattern search
Part 2 (machine learning overview) all machine learning is pattern search
International School of Engineering
 
Part 4 (machine learning overview) solution architecture
Part 4 (machine learning overview)   solution architecturePart 4 (machine learning overview)   solution architecture
Part 4 (machine learning overview) solution architecture
International School of Engineering
 
Part 3 (machine learning overview) the forms of knowledge
Part 3 (machine learning overview)   the forms of knowledgePart 3 (machine learning overview)   the forms of knowledge
Part 3 (machine learning overview) the forms of knowledge
International School of Engineering
 
Part 2 (machine learning overview) all machine learning is pattern search
Part 2 (machine learning overview)   all machine learning is pattern searchPart 2 (machine learning overview)   all machine learning is pattern search
Part 2 (machine learning overview) all machine learning is pattern search
International School of Engineering
 
Fast Track Machine Learning Part 1 (Machine Learning Overview) - Types of Mac...
Fast Track Machine Learning Part 1 (Machine Learning Overview) - Types of Mac...Fast Track Machine Learning Part 1 (Machine Learning Overview) - Types of Mac...
Fast Track Machine Learning Part 1 (Machine Learning Overview) - Types of Mac...
International School of Engineering
 
Fraud detection
Fraud detectionFraud detection

More from International School of Engineering (12)

Part 3 (machine learning overview) the forms of knowledge
Part 3 (machine learning overview)   the forms of knowledgePart 3 (machine learning overview)   the forms of knowledge
Part 3 (machine learning overview) the forms of knowledge
 
Part 2 (machine learning overview) all machine learning is pattern search
Part 2 (machine learning overview)   all machine learning is pattern searchPart 2 (machine learning overview)   all machine learning is pattern search
Part 2 (machine learning overview) all machine learning is pattern search
 
Part 4 (machine learning overview) solution architecture
Part 4 (machine learning overview)   solution architecturePart 4 (machine learning overview)   solution architecture
Part 4 (machine learning overview) solution architecture
 
Part 3 (machine learning overview) the forms of knowledge
Part 3 (machine learning overview)   the forms of knowledgePart 3 (machine learning overview)   the forms of knowledge
Part 3 (machine learning overview) the forms of knowledge
 
Part 2 (machine learning overview) all machine learning is pattern search
Part 2 (machine learning overview)   all machine learning is pattern searchPart 2 (machine learning overview)   all machine learning is pattern search
Part 2 (machine learning overview) all machine learning is pattern search
 
Fast Track Machine Learning Part 1 (Machine Learning Overview) - Types of Mac...
Fast Track Machine Learning Part 1 (Machine Learning Overview) - Types of Mac...Fast Track Machine Learning Part 1 (Machine Learning Overview) - Types of Mac...
Fast Track Machine Learning Part 1 (Machine Learning Overview) - Types of Mac...
 
Service assurance with predictive analytics
Service assurance with predictive analyticsService assurance with predictive analytics
Service assurance with predictive analytics
 
Analytics in Supply Chain Management
Analytics in Supply Chain ManagementAnalytics in Supply Chain Management
Analytics in Supply Chain Management
 
Health Care Analytics
Health Care AnalyticsHealth Care Analytics
Health Care Analytics
 
Analytics in the Manufacturing industry
Analytics in the Manufacturing industryAnalytics in the Manufacturing industry
Analytics in the Manufacturing industry
 
Analytics in Pharmaceutical Industry
Analytics in Pharmaceutical IndustryAnalytics in Pharmaceutical Industry
Analytics in Pharmaceutical Industry
 
Fraud detection
Fraud detectionFraud detection
Fraud detection
 

Recently uploaded

Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Linda486226
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
Tiktokethiodaily
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 

Recently uploaded (20)

Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 

Decision Trees

  • 1. BY International School of Engineering {We Are Applied Engineering} Disclaimer: Some of the Images and content have been taken from multiple online sources and this presentation is intended only for knowledge sharing but not for any commercial business intention
  • 2. OVERVIEW • DEFINITION OF DECISIONTREE • WHY DECISIONTREE? • DECISIONTREETERMS • EASY EXAMPLE • CONSTRUCTING A DECISION TREE • CALCULATION OF ENTROPY • ENTROPY • TERMINATION CRITERIA • PRUNINGTREES • APPROACHES TO PRUNETREE • DECISIONTREE ALGORITHMS • LIMITATIONS • ADVANTAGES • VIDEO OF CONSTRUCTING A DECISIONTREE
  • 3. DEFINITION OF ‘DECISIONTREE'  A decision tree is a natural and simple way of inducing following kind of rules. If (Age is x) and (income is y) and (family size is z) and (credit card spending is p) then he will accept the loan  It is powerful and perhaps most widely used modeling technique of all  Decision trees classify instances by sorting them down the tree from the root to some leaf node, which provides the classification of the instance
  • 4. WHY DECISIONTREE? Source: http://www.simafore.com/blog/bid/62482/2-main-differences-between-classification-and-regression-trees Decision Trees To Classify Response variable has only two categories Use standard classification tree Response variable has multiple categories Use c4.5 implementation To Predict Response variable is continuous Linear relationships between predictors and response Use standard Regression tree Nonlinear relationships between predictors and response Use c4.5 implementation
  • 5. DECISIONTREETERMS Root Node Condition Check Leaf Node(Decision Point) Leaf Node(Decision Point) Condition Check Branch Branch
  • 6. EASY EXAMPLE  Joe’s garage is considering hiring another mechanic.  The mechanic would cost them an additional $50,000 / year in salary and benefits.  If there are a lot of accidents in Iowa City this year, they anticipate making an additional $75,000 in net revenue.  If there are not a lot of accidents, they could lose $20,000 off of last year’s total net revenues.  Because of all the ice on the roads, Joe thinks that there will be a 70% chance of “a lot of accidents” and a 30% chance of “fewer accidents”.  Assume if he doesn’t expand he will have the same revenue as last year.
  • 7. Joe’s Garage Hiring a mechanic Hire a new mechanic Cost = $50,000 70% of an chance increase in accidents Profit = $70,000 30% of a chance decrease in accidents Profit = -$20,000 Don’t hire a mechanic Cost = $0 • Estimated value of “Hire Mechanic” = NPV =.7(70,000) + .3(- $20,000) - $50,000 = - $7,000 • Therefore you should not hire the mechanic continued
  • 8. CONSTRUCTING A DECISIONTREE  Which attribute to choose?  Information Gain ENTROPY  Where to stop?  Termination criteria Two Aspects
  • 9. CALCULATION OF ENTROPY  Entropy is a measure of uncertainty in the data Entropy(S) = ∑(i=1 to l)-|Si|/|S| * log2(|Si|/|S|) S = set of examples Si = subset of S with value vi under the target attribute l = size of the range of the target attribute
  • 10. ENTROPY  Let us say, I am considering an action like a coin toss. Say, I have five coins with probabilities for heads 0, 0.25, 0.5, 0.75 and 1. When I toss them which one has highest uncertainty and which one has the least? H = − 𝑖𝑝𝑖 log2 𝑝𝑖  Information gain = Entropy of the system before split – Entropy of the system after split
  • 11. ENTROPY: MEASURE OF RANDOMNESS
  • 12. TERMINATION CRITERIA  All the records at the node belong to one class  A significant majority fraction of records belong to a single class  The segment contains only one or very small number of records  The improvement is not substantial enough to warrant making the split
  • 13. PRUNINGTREES  The decision trees can be grown deeply enough to perfectly classify the training examples which leads to overfitting when there is noise in the data  When the number of training examples is too small to produce a representative sample of the true target function.  Practically, pruning is not important for classification
  • 14. APPROACHES TO PRUNETREE  Three approaches –Stop growing the tree earlier, before it reaches the point where it perfectly classifies the training data, –Allow the tree to over fit the data, and then post-prune the tree. –Allow the tree to over fit the data, transform the tree to rules and then post-prune the rules.
  • 15.  Pessimistic pruning Take the upper bound error at the node and sub-trees e= [f+ 𝑧2 2𝑁 +z 𝑓 𝑁 − 𝑓2 𝑁 + 𝑧2 4𝑁2]/[1+ 𝑧2 𝑁 ]  Cost complexity pruning J(Tree, S) = ErrorRate(Tree, S) + a |Tree| Play with several values a starting from 0 Do a K-fold validation on all of them and find the best pruning α
  • 16. TWO MOST POPULAR DECISIONTREE ALGORITHMS  Cart –Binary split –Gini index –Cost complexity pruning  C5.0 –Multi split –Info gain –pessimistic pruning
  • 17. LIMITATIONS  Class imbalance  When there are more records and very less number of attributes/features
  • 18. ADVANTAGES  They are fast  Robust  Requires very little experimentation  You may also build some intuitions about your customer base. E.g. “Are customers with different family sizes truly different?
  • 19. For Detailed Description on CONSTRUCTING A DECISION TREE with example Check out our video
  • 20. Plot no 63/A, 1st Floor, Road No 13, Film Nagar, Jubilee Hills, Hyderabad-500033 For Individuals (+91) 9502334561/62 For Corporates (+91) 9618 483 483 Facebook: www.facebook.com/insofe Slide share: www.slideshare.net/INSOFE International School of Engineering