 Emaan Tahir-442
 Kaneez Amna-448
 Izza Shah-401
 Aiman Arshad-445
 Summaya Nazir-440
Presented To:
Ma’am Nazia
 1977: Breiman, Friedman, Olshen, and Stone developed the first version of CART. 1984: the official publication appeared, together with CART decision tree software. It was a revolution in the world of algorithms, and even today CART remains one of the most widely used methods for decision tree analytics.
 CART is an umbrella term that refers to the following types of
decision trees:
Classification trees: When the target variable is categorical (discrete), the
tree is used to identify the "class" into which the target variable is
most likely to fall.
Regression trees: These are used to forecast the value of a
continuous variable.
 The representation used for CART is a binary tree.
 Stopping criteria define how much the tree learns, and pruning can be
used to improve a learned tree.
 Predictions are made with CART by traversing the binary tree given
a new input record.
 The tree is learned using a greedy algorithm on the training data to
pick splits in the tree.
 The nodes are split into subnodes on the basis of a threshold value of an
attribute. The CART algorithm searches for the split that yields the most
homogeneous subnodes, using the Gini index criterion.
 The root node holds the full training set and is split in two by choosing the
best attribute and threshold value. The resulting subsets are then split
using the same logic, and this continues until a pure subset is reached or the
tree hits its maximum number of leaves. This recursive process grows the tree;
cutting back the grown tree afterwards is known as tree pruning.
Make Predictions With CART Models
If Height > 180 cm Then Male
If Height <= 180 cm AND Weight > 80 kg Then Male
If Height <= 180 cm AND Weight <= 80 kg Then Female
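The rule listing above is just the paths through the binary tree: a prediction walks from the root, testing one threshold per internal node. A minimal sketch using the slide's example thresholds (180 cm, 80 kg):

```python
# Prediction with a learned CART model: follow the binary tree from the
# root, testing one attribute threshold per internal node.
def predict(height_cm, weight_kg):
    if height_cm > 180:   # root split
        return "Male"
    if weight_kg > 80:    # second-level split
        return "Male"
    return "Female"

print(predict(185, 70))  # Male
print(predict(170, 90))  # Male
print(predict(170, 60))  # Female
```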
Greedy Algorithm:
 The input variables and the split points are selected through a greedy
algorithm.
Stopping Criterion:
 The recursive binary splitting procedure described above must know
when to stop splitting. A common criterion is a minimum number of training
instances per node: if a split would leave a node with fewer instances than
this threshold, the split is rejected and the node becomes a leaf.
Tree pruning:
A decision tree's complexity is defined as the number of splits in the tree.
Leaf nodes are removed only if doing so decreases the total cost function
on a hold-out test set.
EXAMPLE
Day Outlook Temp. Humidity Wind Decision
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
3 Overcast Hot High Weak Yes
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
6 Rain Cool Normal Strong No
7 Overcast Cool Normal Strong Yes
8 Sunny Mild High Weak No
9 Sunny Cool Normal Weak Yes
10 Rain Mild Normal Weak Yes
11 Sunny Mild Normal Strong Yes
12 Overcast Mild High Strong Yes
13 Overcast Hot Normal Weak Yes
14 Rain Mild High Strong No
The Gini index is the metric CART uses for classification tasks. It is
based on the sum of squared class probabilities, formulated as:
Gini = 1 – Σ (Pi)² for i = 1 to number of classes
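The formula is easy to check in a few lines of Python (a minimal sketch; `gini` takes raw class counts such as [Yes, No]):

```python
def gini(counts):
    """Gini impurity 1 - sum(p_i^2), from a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)

print(round(gini([2, 3]), 2))  # 0.48 -> e.g. 2 Yes, 3 No
print(gini([4, 0]))            # 0.0  -> a pure node
```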
Outlook Yes No Number of instances
Sunny 2 3 5
Overcast 4 0 4
Rain 3 2 5
Gini(Outlook=Sunny) = 1 – (2/5)² – (3/5)² = 1 – 0.16 – 0.36 = 0.48
Gini(Outlook=Overcast) = 1 – (4/4)² – (0/4)² = 0
Gini(Outlook=Rain) = 1 – (3/5)² – (2/5)² = 1 – 0.36 – 0.16 = 0.48
Then, we calculate the weighted sum of Gini indexes for the Outlook
feature.
Gini(Outlook) = (5/14) x 0.48 + (4/14) x 0 + (5/14) x 0.48 = 0.171 + 0
+ 0.171 = 0.342
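The weighted sum above generalizes to any feature; a sketch, reusing the per-value class counts:

```python
def gini(counts):
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def weighted_gini(groups):
    """Weighted Gini for a feature; groups = class counts per feature value."""
    n = sum(sum(g) for g in groups)
    return sum(sum(g) / n * gini(g) for g in groups)

# Outlook: Sunny (2 Yes, 3 No), Overcast (4, 0), Rain (3, 2)
print(round(weighted_gini([[2, 3], [4, 0], [3, 2]]), 3))  # 0.343
```

(The slides round the intermediate terms and report 0.342.)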
Temperature Yes No Number of instances
Hot 2 2 4
Cool 3 1 4
Mild 4 2 6
Gini(Temp=Hot) = 1 – (2/4)² – (2/4)² = 0.5
Gini(Temp=Cool) = 1 – (3/4)² – (1/4)² = 1 – 0.5625 – 0.0625 = 0.375
Gini(Temp=Mild) = 1 – (4/6)² – (2/6)² = 1 – 0.444 – 0.111 = 0.445
We calculate the weighted sum of Gini indexes for the Temperature feature:
Gini(Temp) = (4/14) x 0.5 + (4/14) x 0.375 + (6/14) x 0.445 = 0.142 +
0.107 + 0.190 = 0.439
Humidity Yes No Number of instances
High 3 4 7
Normal 6 1 7
Gini(Humidity=High) = 1 – (3/7)² – (4/7)² = 1 – 0.184 – 0.327 = 0.489
Gini(Humidity=Normal) = 1 – (6/7)² – (1/7)² = 1 – 0.735 – 0.020 = 0.245
The weighted sum for the Humidity feature is calculated next:
Gini(Humidity) = (7/14) x 0.489 + (7/14) x 0.245 = 0.367
Wind Yes No Number of instances
Weak 6 2 8
Strong 3 3 6
Gini(Wind=Weak) = 1 – (6/8)² – (2/8)² = 1 – 0.5625 – 0.0625 = 0.375
Gini(Wind=Strong) = 1 – (3/6)² – (3/6)² = 1 – 0.25 – 0.25 = 0.5
Gini(Wind) = (8/14) x 0.375 + (6/14) x 0.5 = 0.428
 We've calculated the Gini index for each feature. We select the
Outlook feature for the root split because its cost is the lowest.
Feature Gini index
Outlook 0.342
Temperature 0.439
Humidity 0.367
Wind 0.428
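Selecting the split attribute is then just a minimum over the table above:

```python
# Weighted Gini per feature, copied from the table above.
ginis = {"Outlook": 0.342, "Temperature": 0.439, "Humidity": 0.367, "Wind": 0.428}
best = min(ginis, key=ginis.get)  # feature with the lowest impurity
print(best)  # Outlook
```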
Tree so far: Outlook is the root node. The Overcast branch is pure (all Yes), so it becomes a Yes leaf; the Sunny and Rain branches are split further.
Sunny outlook
Day Outlook Temp. Humidity Wind Decision
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
8 Sunny Mild High Weak No
9 Sunny Cool Normal Weak Yes
11 Sunny Mild Normal Strong Yes
Gini of temperature for sunny outlook
Temperature Yes No Number of instances
Hot 0 2 2
Cool 1 0 1
Mild 1 1 2
Gini(Outlook=Sunny and Temp.=Hot) = 1 – (0/2)² – (2/2)² = 0
Gini(Outlook=Sunny and Temp.=Cool) = 1 – (1/1)² – (0/1)² = 0
Gini(Outlook=Sunny and Temp.=Mild) = 1 – (1/2)² – (1/2)² = 1 – 0.25 – 0.25 = 0.5
Gini(Outlook=Sunny and Temp.) = (2/5)x0 + (1/5)x0 + (2/5)x0.5 = 0.2
Humidity Yes No Number of instances
High 0 3 3
Normal 2 0 2
Gini(Outlook=Sunny and Humidity=High) = 1 – (0/3)² – (3/3)² = 0
Gini(Outlook=Sunny and Humidity=Normal) = 1 – (2/2)² – (0/2)² = 0
Gini(Outlook=Sunny and Humidity) = (3/5)x0 + (2/5)x0 = 0
Gini of wind for sunny outlook
Wind Yes No Number of instances
Weak 1 2 3
Strong 1 1 2
Gini(Outlook=Sunny and Wind=Weak) = 1 – (1/3)² – (2/3)² = 1 – 0.111 – 0.444 = 0.445
Gini(Outlook=Sunny and Wind=Strong) = 1 – (1/2)² – (1/2)² = 0.5
Gini(Outlook=Sunny and Wind) = (3/5)x0.445 + (2/5)x0.5 = 0.467
Feature Gini index
Temperature 0.2
Humidity 0
Wind 0.467
Humidity is chosen for the Sunny branch because its Gini index is the lowest (0): High -> No, Normal -> Yes.
Rain outlook
Day Outlook Temp. Humidity Wind Decision
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
6 Rain Cool Normal Strong No
10 Rain Mild Normal Weak Yes
14 Rain Mild High Strong No
Temperature Yes No Number of instances
Cool 1 1 2
Mild 2 1 3
Gini(Outlook=Rain and Temp.=Cool) = 1 – (1/2)² – (1/2)² = 0.5
Gini(Outlook=Rain and Temp.=Mild) = 1 – (2/3)² – (1/3)² = 0.444
Gini(Outlook=Rain and Temp.) = (2/5)x0.5 + (3/5)x0.444 = 0.466
Gini of humidity for rain outlook
Humidity Yes No Number of instances
High 1 1 2
Normal 2 1 3
Gini(Outlook=Rain and Humidity=High) = 1 – (1/2)² – (1/2)² = 0.5
Gini(Outlook=Rain and Humidity=Normal) = 1 – (2/3)² – (1/3)² = 0.444
Gini(Outlook=Rain and Humidity) = (2/5)x0.5 + (3/5)x0.444 = 0.466
Wind Yes No Number of instances
Weak 3 0 3
Strong 0 2 2
Gini(Outlook=Rain and Wind=Weak) = 1 – (3/3)² – (0/3)² = 0
Gini(Outlook=Rain and Wind=Strong) = 1 – (0/2)² – (2/2)² = 0
Gini(Outlook=Rain and Wind) = (3/5)x0 + (2/5)x0 = 0
Decision for rain outlook
Feature Gini index
Temperature 0.466
Humidity 0.466
Wind 0
Wind is chosen for the Rain branch because its Gini index is the lowest (0): Weak -> Yes, Strong -> No.
Example
Age Job House Credit Loan Approved
Young False No Fair No
Young False No Good No
Young True No Good Yes
Young True Yes Fair Yes
Young False No Fair No
Middle False No Fair No
Middle False No Good No
Middle True Yes Good Yes
Middle False Yes Excellent Yes
Middle False Yes Excellent Yes
Old False Yes Excellent Yes
Old False Yes Good Yes
Old True No Good Yes
Old True No Excellent Yes
Old False No Fair No
Young False No Good ?
This example selects attributes by misclassification error: each attribute value predicts its majority class, and the error is the fraction of instances misclassified.
Age counts: Young 5 (Yes 2, No 3), Middle 5 (Yes 3, No 2), Old 5 (Yes 4, No 1)
Individual error and total error for the Age attribute:
Attribute Rules Error Total Error
Age Young->No 2/5 5/15
Middle->Yes 2/5
Old->Yes 1/5
Job counts: False 10 (Yes 4, No 6), True 5 (Yes 5, No 0)
Individual error and total error for the Job attribute:
Attribute Rules Error Total Error
Job False->No 4/10 4/15
True->Yes 0/5
House counts: No 9 (Yes 3, No 6), Yes 6 (Yes 6, No 0)
Individual error and total error for the House attribute:
Attribute Rules Error Total Error
House No->No 3/9 3/15
Yes->Yes 0/6
Credit counts: Fair 5 (Yes 1, No 4), Good 6 (Yes 4, No 2), Excellent 4 (Yes 4, No 0)
Individual error and total error for the Credit attribute:
Attribute Rules Error Total Error
Credit Fair->No 1/5 3/15
Good->Yes 2/6
Excellent->Yes 0/4
Attribute Rules Error Total Error
Age Young->No 2/5 5/15
Middle->Yes 2/5
Old->Yes 1/5
Job False->No 4/10 4/15
True->Yes 0/5
House No->No 3/9 3/15
Yes->Yes 0/6
Credit Fair->No 1/5 3/15
Good->Yes 2/6
Excellent->Yes 0/4
House and Credit tie for the lowest total error (3/15); House is chosen as the root split.
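The error-based attribute selection above can be sketched in a few lines: each attribute value predicts its majority class, and the total error is the number of misclassified instances over the dataset size. The `data` list below is the 15 loan instances implied by the slide's count tables.

```python
from collections import Counter

def total_error(rows, attr):
    """Misclassification error when attribute `attr` predicts majority class."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[-1])  # group labels by value
    wrong = sum(len(g) - Counter(g).most_common(1)[0][1] for g in groups.values())
    return wrong, len(rows)

# Columns: Age, Job, House, Credit, Loan Approved.
data = [
    ("Young", "False", "No", "Fair", "No"),
    ("Young", "False", "No", "Good", "No"),
    ("Young", "True", "No", "Good", "Yes"),
    ("Young", "True", "Yes", "Fair", "Yes"),
    ("Young", "False", "No", "Fair", "No"),
    ("Middle", "False", "No", "Fair", "No"),
    ("Middle", "False", "No", "Good", "No"),
    ("Middle", "True", "Yes", "Good", "Yes"),
    ("Middle", "False", "Yes", "Excellent", "Yes"),
    ("Middle", "False", "Yes", "Excellent", "Yes"),
    ("Old", "False", "Yes", "Excellent", "Yes"),
    ("Old", "False", "Yes", "Good", "Yes"),
    ("Old", "True", "No", "Good", "Yes"),
    ("Old", "True", "No", "Excellent", "Yes"),
    ("Old", "False", "No", "Fair", "No"),
]
for name, i in [("Age", 0), ("Job", 1), ("House", 2), ("Credit", 3)]:
    wrong, n = total_error(data, i)
    print(f"{name}: {wrong}/{n}")
# Age: 5/15, Job: 4/15, House: 3/15, Credit: 3/15
```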
Tree so far: House is the root node. House = Yes is pure (all approved), so it becomes a Yes leaf; the House = No subset below is split further.
Age Job Credit Loan Approved
Young False Fair No
Young False Good No
Young True Good Yes
Young False Fair No
Middle False Fair No
Middle False Good No
Old True Good Yes
Old True Excellent Yes
Old False Fair No
Attribute Rules Error Total Error
Age Young->No 1/4 2/9
Middle->No 0/2
Old->Yes 1/3
Job False->No 0/6 0/9
True->Yes 0/3
Credit Fair->No 0/4 2/9
Good->Yes/No 2/4
Excellent->Yes 0/1
Job has zero total error, so it is chosen for the House = No branch.
Final tree: House at the root. House = Yes -> Yes. House = No -> split on Job: Job = True -> Yes, Job = False -> No.
Classifying the query instance:
Age Job House Credit Loan Approved
Young False No Good No
(House = No and Job = False, so the loan is not approved.)
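The finished tree can be written directly as nested conditionals (a sketch; attribute values are strings matching the table):

```python
def approve_loan(age, job, house, credit):
    # Root split: owning a house -> approved.
    if house == "Yes":
        return "Yes"
    # House = No: split on Job.
    return "Yes" if job == "True" else "No"

print(approve_loan("Young", "False", "No", "Good"))  # No
```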
Advantages of CART Algorithm
 CART requires minimal supervision and produces easy-to-understand models.
 It captures interactions and signal discontinuities.
 It identifies important variables automatically.
 It handles any combination of continuous and discrete variables.
Disadvantages of CART Algorithm
 Each split uses a single variable; combinations of variables are not considered.
 The tree structure may be unstable: small changes in the data can produce a very different tree.
 It has a limited number of split positions to accommodate the available predictors.