“Induction of Decision Trees”
by Ross Quinlan
Papers We Love Bucharest
Stefan Alexandru Adam
27th of May 2016
TechHub
Short History on Classification Algorithms
• Perceptron (Frank Rosenblatt, 1957)
• Pattern recognition using k-NN (Cover & Hart, 1967)
• Top-down induction of decision trees (1980s)
• Bagging (Breiman, 1994)
• Boosting (Schapire, 1990)
• SVM (Cortes & Vapnik, 1995)
TDIDT family of learning systems
A family of learning systems that solve a classification problem using a decision tree.
A classification problem can be:
- diagnosing a medical condition given symptoms
- determining whether a chess position is won or lost
- determining from atmospheric observations whether it will snow or not
TDIDT family of learning systems
Decision trees are a representation of a classification rule:
• The root (and every internal node) is labelled by an attribute
• Edges are labelled by attribute values
• Each leaf is labelled by a class
• The tree as a whole is a collection of decision rules
Classification is done by traversing the tree from the root to a leaf.
TDIDT family of learning systems
Important decision tree algorithms:
- ID3 (Iterative Dichotomiser 3) and its successors
- CART (Classification and Regression Trees), which grows binary trees
TDIDT family of learning systems
Genealogy of TDIDT systems: CLS (1963) → ID3 (1979) → C4.5 (1993) → C5, with related systems ACLS (1981), ASSISTANT (1984), Expert-Ease (1983), RuleMaster (1984), EX-TRAN (1984), and, independently, CART (1984).
Induction Task
Given a training set T of n observations (x_i, y_i):

T(X, Y) = \begin{pmatrix} x_{1,1} & \cdots & x_{1,s} & y_1 \\ \vdots & \ddots & \vdots & \vdots \\ x_{n,1} & \cdots & x_{n,s} & y_n \end{pmatrix}

where each x_i has s attributes and y_i ∈ {C_1, …, C_m}.
Develop a classification rule (decision tree) that can determine the class of any object from the values of its attributes. This usually involves two steps:
• growing phase: the tree is constructed top-down
• pruning phase: the tree is pruned bottom-up
Induction Task – Algorithm for growing decision trees
GrowTree(observations, node)
1. If all observations have the same class C    // node is a leaf
       node.Class = C
       return
2. bestAttribute = FindBestAttribute(observations)    // the attribute that best splits the data
3. Partition the observations according to bestAttribute into k partitions P_1, …, P_k
4. For each partition P_i, i = 1..k:
       childNode = new Node(node)
       GrowTree(P_i, childNode)    // recursive call
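A minimal Python sketch of this generic loop (illustrative, not code from the paper; Node, find_best_attribute and partition_by are assumed helpers, so any impurity score can be plugged in):

class Node:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = {}      # attribute value -> child node
        self.attribute = None   # splitting attribute (internal nodes)
        self.cls = None         # class label (leaves)

def grow_tree(observations, node, find_best_attribute, partition_by):
    # observations: list of (attributes_dict, class_label) pairs;
    # assumes partition_by eventually yields pure partitions
    classes = {y for _, y in observations}
    if len(classes) == 1:                          # step 1: pure node -> leaf
        node.cls = classes.pop()
        return
    best = find_best_attribute(observations)       # step 2
    node.attribute = best
    partitions = partition_by(observations, best)  # step 3: value -> subset
    for value, part in partitions.items():         # step 4
        child = Node(parent=node)
        node.children[value] = child
        grow_tree(part, child, find_best_attribute, partition_by)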
Induction Task – Choosing best attribute
For the PlayTennis training set below (14 days, 9 yes / 5 no), the candidate splits are:
• Outlook: Sunny → 2 yes, 3 no; Overcast → 4 yes, 0 no; Rain → 3 yes, 2 no
• Temperature: Cool → 3 yes, 1 no; Mild → 4 yes, 2 no; Hot → 2 yes, 2 no
• Humidity: Normal → 6 yes, 1 no; High → 3 yes, 4 no
• Wind: Strong → 3 yes, 3 no; Weak → 6 yes, 2 no
Day   | Outlook  | Temperature | Humidity | Wind   | PlayTennis
Day1  | Sunny    | Hot         | High     | Weak   | No
Day2  | Sunny    | Hot         | High     | Strong | No
Day3  | Overcast | Hot         | High     | Weak   | Yes
Day4  | Rain     | Mild        | High     | Weak   | Yes
Day5  | Rain     | Cool        | Normal   | Weak   | Yes
Day6  | Rain     | Cool        | Normal   | Strong | No
Day7  | Overcast | Cool        | Normal   | Strong | Yes
Day8  | Sunny    | Mild        | High     | Weak   | No
Day9  | Sunny    | Cool        | Normal   | Weak   | Yes
Day10 | Rain     | Mild        | Normal   | Weak   | Yes
Day11 | Sunny    | Mild        | Normal   | Strong | Yes
Day12 | Overcast | Mild        | High     | Strong | Yes
Day13 | Overcast | Hot         | Normal   | Weak   | Yes
Day14 | Rain     | Mild        | High     | Strong | No
Induction Task – Measure Information
Identify the best attribute for splitting the data: compute a score for each attribute and choose the one with the highest value. The score measures how much a split reduces impurity.
Possible scores:
- Information gain (entropy-based), used in ID3 and C4.5
- Gain ratio, used in C4.5
- Gini index, used in CART
- Twoing splitting rule, used in CART
Entropy measure – Information Gain
Given a discrete random variable X with possible outcomes {x_1, …, x_n} and probability distribution P:

H(X) = E[−\log_2 P(X)] bits    (the entropy of X; the bit is its unit of measure)

Entropy measures the impurity (disorder) of a system.
For the subset S of observations corresponding to a node n of a decision tree:

H(S) = −\sum_{i=1}^{m} p(C_i) \log_2 p(C_i),  where  p(C_i) = (count of items in S with class C_i) / (total number of items in S)

H(S) reaches its maximum (maximum impurity) when the classes are equally represented, and is zero (zero impurity) when all items in S belong to one class.
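To make the definition concrete, a small Python helper (an illustrative sketch, not code from the paper) computing H(S) from a list of class labels:

from collections import Counter
from math import log2

def entropy(labels):
    # H(S) in bits for a list of class labels
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

# entropy(['yes'] * 9 + ['no'] * 5)  -> ~0.940 bits (the PlayTennis root node)
# entropy(['yes', 'no'])             -> 1.0 (maximum impurity for two classes)
# entropy(['yes', 'yes'])            -> 0.0 (zero impurity)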
Entropy measure – Information Gain
Given a subset S and an attribute A with possible values {V_1, …, V_l}, we define the information gain

G(S, A) = H(S) − \sum_{i=1}^{l} p(V_i) \, H(S_{V_i})

where p(V_i) = (count of items in S where value(A) = V_i) / (count of items in S), i.e. the probability of value(A) = V_i in S, and S_{V_i} is the subset of S where value(A) = V_i.

SplitInfo(S, A) = −\sum_{i=1}^{l} p(V_i) \log_2 p(V_i)

G_{ratio}(S, A) = G(S, A) / SplitInfo(S, A)
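Continuing the sketch, hedged the same way (observations are assumed to be (attributes_dict, label) pairs; entropy is the helper defined above):

def information_gain(observations, attribute):
    # G(S, A): entropy of S minus the weighted entropy of its partitions
    total = len(observations)
    gain = entropy([y for _, y in observations])
    for value in {x[attribute] for x, _ in observations}:
        subset = [y for x, y in observations if x[attribute] == value]
        gain -= (len(subset) / total) * entropy(subset)
    return gain

def gain_ratio(observations, attribute):
    # G_ratio(S, A) = G(S, A) / SplitInfo(S, A); SplitInfo is the entropy of
    # the value distribution of A itself (zero only if A is constant on S)
    total = len(observations)
    counts = Counter(x[attribute] for x, _ in observations)
    split_info = -sum((c / total) * log2(c / total) for c in counts.values())
    return information_gain(observations, attribute) / split_info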
ID3 – decision tree
GrowID3(S, node, attributes)
1. If all items in S have the same class C
       node.Class = C
       return
2. For each attribute A in attributes compute G(S, A);
   A_split = argmax_A G(S, A), with values {V_1, …, V_l}
3. Compute S_{V_i}, the subset of S where value(A_split) = V_i, for i = 1..l
4. attributes.remove(A_split)
5. For each subset S_{V_i}, i = 1..l:
       childNode = new Node(node)
       GrowID3(S_{V_i}, childNode, attributes)    // recursive call
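Putting the pieces together, a compact runnable ID3 sketch; it reuses Node and information_gain from the earlier snippets and adds a majority-class fallback, which the pseudocode leaves implicit, for the case where no attributes remain:

def grow_id3(observations, node, attributes):
    labels = [y for _, y in observations]
    if len(set(labels)) == 1:                  # step 1: pure node -> leaf
        node.cls = labels[0]
        return
    if not attributes:                         # fallback: majority class
        node.cls = Counter(labels).most_common(1)[0][0]
        return
    best = max(attributes,                     # step 2: highest information gain
               key=lambda a: information_gain(observations, a))
    node.attribute = best
    remaining = [a for a in attributes if a != best]    # step 4
    for value in {x[best] for x, _ in observations}:    # steps 3 and 5
        subset = [(x, y) for x, y in observations if x[best] == value]
        child = Node(parent=node)
        node.children[value] = child
        grow_id3(subset, child, remaining)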
ID3 – Example
Splitting the full training set (9 yes, 5 no; table above) on each attribute gives:
• G(Outlook) = 0.246 bits
• G(Temperature) = 0.029 bits
• G(Humidity) = 0.152 bits
• G(Wind) = 0.048 bits
Outlook has the highest information gain, so it is chosen as the root attribute.
Ex: G(Outlook) = H(S) − [5/14·H(S_Sunny) + 4/14·H(S_Overcast) + 5/14·H(S_Rain)]
= (−9/14·log2(9/14) − 5/14·log2(5/14)) − (5/14·(−2/5·log2(2/5) − 3/5·log2(3/5)) + 4/14·0 + 5/14·(−3/5·log2(3/5) − 2/5·log2(2/5))) = 0.940 − 0.694 = 0.246 bits
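As a sanity check, the helpers above reproduce these gains on the PlayTennis table (the Python encoding of the table is ours, not from the slides):

attrs = ['Outlook', 'Temperature', 'Humidity', 'Wind']
rows = """Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No"""
data = [(dict(zip(attrs, r.split()[:4])), r.split()[4]) for r in rows.splitlines()]
for a in attrs:
    print(a, round(information_gain(data, a), 4))
# Outlook 0.2467, Temperature 0.0292, Humidity 0.1518, Wind 0.0481
# (matching the slide's values after rounding)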
ID3 - Example
Classifying a new instance with the grown tree: X’ = (Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong) ⇒ No (Sunny branch, then the Humidity = High leaf).
ID3 – handling unknown values
- Treat the attribute with the unknown value as the class, and predict the missing value using the ID3 algorithm itself
Day   | PlayTennis | Temperature | Humidity | Wind   | Outlook
Day1  | No         | Hot         | High     | Weak   | ?
Day2  | No         | Hot         | High     | Strong | Sunny
Day3  | Yes        | Hot         | High     | Weak   | Overcast
Day4  | Yes        | Mild        | High     | Weak   | Rain
Day5  | Yes        | Cool        | Normal   | Weak   | Rain
Day6  | No         | Cool        | Normal   | Strong | Rain
Day7  | Yes        | Cool        | Normal   | Strong | Overcast
Day8  | No         | Mild        | High     | Weak   | Sunny
Day9  | Yes        | Cool        | Normal   | Weak   | Sunny
Day10 | Yes        | Mild        | Normal   | Weak   | Rain
Day11 | Yes        | Mild        | Normal   | Strong | Sunny
Day12 | Yes        | Mild        | High     | Strong | Overcast
Day13 | Yes        | Hot         | Normal   | Weak   | Overcast
Day14 | No         | Mild        | High     | Strong | Rain
Suppose Day1's Outlook value is missing. Then
? = class(PlayTennis=No, Temperature=Hot, Humidity=High, Wind=Weak) = Sunny
(a sketch of this imputation follows below).
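One way to realize this with the earlier sketch (illustrative; impute_missing, classify and data_with_gap are hypothetical names, and the query encodes Day1 with PlayTennis folded in as an ordinary attribute):

def classify(node, x):
    # walk the tree from the root to a leaf
    while node.cls is None:
        node = node.children[x[node.attribute]]
    return node.cls

def impute_missing(observations, target, query):
    # Treat attribute `target` as the class: the original class label is
    # folded back in as an ordinary attribute named 'PlayTennis'.
    relabeled = []
    for x, y in observations:
        if x.get(target) is None:
            continue                    # skip rows where target is unknown
        attrs = {k: v for k, v in x.items() if k != target}
        attrs['PlayTennis'] = y
        relabeled.append((attrs, x[target]))
    root = Node()
    grow_id3(relabeled, root, list(relabeled[0][0]))
    return classify(root, query)

# impute_missing(data_with_gap, 'Outlook',
#                {'PlayTennis': 'No', 'Temperature': 'Hot',
#                 'Humidity': 'High', 'Wind': 'Weak'})
# should reproduce the slide's answer, 'Sunny'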
ID3 – overfitting
To avoid overfitting, negligible leaves are removed. This mechanism is called pruning.
Two types of pruning:
• Pre-pruning
- less effective, but fast, because it is done during the growing process
- based on thresholds (e.g. maximum depth, minimum number of items classified by a node)
• Post-pruning
- done after the tree is fully grown
- prunes the leaves in order of error rate (much more effective)
ID3 - summary
• Easy to implement
• Doesn't handle numeric values (C4.5 does)
• Creates a split for each attribute value (CART trees are binary)
• C4.5, the successor of ID3, was ranked first in the “Top 10 Algorithms in Data Mining” paper published by Springer in 2008
Decision tree – real life examples
• Biomedical engineering (decision trees for identifying features to be used in implantable devices)
• Financial analysis (e.g. the Kaggle Santander customer satisfaction competition)
• Astronomy (classifying galaxies, identifying quasars, etc.)
• System control
• Manufacturing and production (quality control, semiconductor manufacturing, etc.)
• Medicine (diagnosis, cardiology, psychiatry)
• Plant diseases (CART was recently used to assess the hazard of mortality of pine trees)
• Pharmacology (drug analysis classification)
• Physics (particle detection)
Decision tree - conclusions
• Easy to implement
• Top classifier used in ensemble learning (Random Forest, Gradient Boosting, Extremely Randomized Trees)
• Widely used in practice
• Small trees are easy to understand
Editor's Notes
1. Artificial intelligence first achieved recognition in the mid-1950s. August 1985