SlideShare a Scribd company logo
1 of 79
Data Mining with Decision Trees: An Introduction to CART ® Dan Steinberg Mikhail Golovnya [email_address] http://www.salford-systems.com
In The Beginning… ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
The Years of Struggle  ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
The Final Triumph ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
So What is CART? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Typical CART Solution ,[object Object],[object Object],[object Object],PATIENTS = 215 SURVIVE 178 82.8% DEAD 37 17.2% Is BP<=91? Terminal Node A SURVIVE 6 30.0% DEAD 14 70.0% NODE = DEAD Terminal Node B SURVIVE 102 98.1% DEAD 2   1.9% NODE = SURVIVE PATIENTS = 195 SURVIVE 172 88.2% DEAD 23 11.8% Is AGE<=62.5? Terminal Node C SURVIVE 14 50.0% DEAD 14 50.0% NODE = DEAD PATIENTS = 91 SURVIVE 70 76.9% DEAD 21 23.1% Is SINUS<=.5? Terminal Node D SURVIVE 56 88.9% DEAD 7 11.1% NODE = SURVIVE <= 91 > 91 <= 62.5 > 62.5 >.5 <=.5
How to Read It ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Decision Questions ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
The Tree is a Classifier ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
The Importance of Binary Splits ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Accuracy of a Tree ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Prediction Success Table ,[object Object],[object Object],Classified As TRUE Class 1 Class 2 &quot;Survivors&quot; &quot;Early Deaths&quot; Total % Correct Class 1 &quot;Survivors&quot; 158 20 178 88% Class 2 &quot;Early Deaths&quot; 9 28 37 75% Total 167 48 215 86.51%
Tree Interpretation and Use ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Binary Recursive Partitioning ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Three Major Stages ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
General Workflow Stage 1 Stage 2  Stage 3 Historical Data Learn Test Validate Build a Sequence of Nested Trees Monitor Performance Best Confirm Findings
Large Trees and Nearest Neighbor Classifier ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Derivation ,[object Object],[object Object],[object Object],[object Object],[object Object]
Illustration ,[object Object]
Impact on the Relative Error ,[object Object],[object Object]
Searching All Possible Splits ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Split Tables ,[object Object],[object Object],Sorted by Age Sorted by Blood Pressure
Categorical Predictors ,[object Object],[object Object],[object Object],[object Object],Left Right 1   A B, C, D 2   B A, C, B 3  C A, B, D 4   D A, B, C 5  A, B C, D 6  A, C B, D 7  A, D B, C
Binary Target Shortcut ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
GYM Model Example ,[object Object],[object Object],[object Object],[object Object],[object Object]
Variable Definitions ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],predictors not used target
Initial Runs ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Pruning Multiple Nodes ,[object Object],[object Object],[object Object]
Pruning Weaker Split First ,[object Object],[object Object]
Understanding Variable Importance ,[object Object],[object Object],[object Object],[object Object]
Competitors and Surrogates ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
The Utility of Surrogates ,[object Object],[object Object],[object Object],[object Object],[object Object]
Competitors and Surrogates Are Different   ,[object Object],[object Object],[object Object],[object Object],A B C A B C Split X A B C A C B Split Y ,[object Object],[object Object],[object Object]
Calculating Association ,[object Object],[object Object],[object Object],[object Object],          Split X Split Y           Default           ,[object Object]
Notes on Association ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Calculating Variable Importance ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Mystery Solved ,[object Object],[object Object]
Penalizing Individual Variables ,[object Object],[object Object],[object Object]
Penalizing Missing and Categorical Predictors ,[object Object],[object Object],[object Object],[object Object]
Splitting Rules - Introduction ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Priors Adjusted Probabilities ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
GINI Splitting Rule ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Illustration
GINI Properties ,[object Object],[object Object],[object Object]
Example ,[object Object],[object Object]
ENTROPY Splitting Rule ,[object Object],[object Object],[object Object],[object Object],[object Object]
Example ,[object Object],[object Object]
TWOING Splitting Rule ,[object Object],[object Object],[object Object],[object Object],[object Object]
Example ,[object Object],[object Object]
Best Possible Split ,[object Object],[object Object],[object Object],[object Object],[object Object]
Example ,[object Object],A 40 B 30 C 20 D 10 A 40 B 30 C 20 D 10 GINI Best Split A 40 B 30 C 20 D 10 A 40 D 10 B 30 C 20 TWOING Best Split
Symmetric GINI ,[object Object],[object Object],[object Object],[object Object]
Ordered TWOING ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Class Probability Rule ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Example
Favor Even Splits Control ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Fundamental Tree Building Mechanisms ,[object Object],[object Object],[object Object]
Key Players ,[object Object],DM Engine Analyst Model Dataset Priors    i Costs   C i  j Population
Fraud Example ,[object Object],[object Object],[object Object],[object Object]
Fraud Example – PRIORS EQUAL (0.5, 0.5) ,[object Object],[object Object],[object Object],[object Object]
Fraud Example – PRIORS SPECIFY (0.9, 0.1) ,[object Object],[object Object],[object Object],[object Object]
Fraud Example – PRIORS DATA (0.98, 0.02) ,[object Object],[object Object],[object Object],[object Object]
Internal Class Assignment Rule ,[object Object],[object Object],[object Object],[object Object]
Putting It All Together ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
The Impact of Priors ,[object Object],[object Object],[object Object], r =0.1 ,   b =0.9  r =0.9 ,   b =0.1  r =0.5 ,   b =0.5
Model Evaluation ,[object Object],Evaluator Analyst Expected Cost Dataset Priors    i Costs   C i  j Model PS Table Cells   n i  j Population
Estimating Expected Cost ,[object Object],[object Object],[object Object],[object Object]
PRIORS DATA – Relative Cost ,[object Object],[object Object],[object Object],[object Object],[object Object]
PRIORS EQUAL – Relative Cost ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Using Equivalent Costs ,[object Object],[object Object]
Automated CART Runs ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
More Batteries ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Battery MCT ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Battery MVI ,[object Object],[object Object],[object Object],[object Object],[object Object]
Battery PRIOR ,[object Object],[object Object],[object Object],[object Object],[object Object]
Battery SHAVING ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Battery SAMPLE ,[object Object],[object Object],[object Object]
Battery TARGET ,[object Object],[object Object],[object Object],[object Object]
Recommended Reading ,[object Object],[object Object],[object Object],[object Object]

More Related Content

What's hot

Cluster analysis using spss
Cluster analysis using spssCluster analysis using spss
Cluster analysis using spssDr Nisha Arora
 
Cluster spss week7
Cluster spss week7Cluster spss week7
Cluster spss week7Birat Sharma
 
13 random forest
13 random forest13 random forest
13 random forestVishal Dutt
 
An Algorithm Analysis on Data Mining-396
An Algorithm Analysis on Data Mining-396An Algorithm Analysis on Data Mining-396
An Algorithm Analysis on Data Mining-396Nida Rashid
 
Using CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetUsing CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetSalford Systems
 
Medical Statistics Part-I:Descriptive statistics
Medical Statistics Part-I:Descriptive statisticsMedical Statistics Part-I:Descriptive statistics
Medical Statistics Part-I:Descriptive statisticsRamachandra Barik
 
Review of Basic Statistics and Terminology
Review of Basic Statistics and TerminologyReview of Basic Statistics and Terminology
Review of Basic Statistics and Terminologyaswhite
 
Textmining Predictive Models
Textmining Predictive ModelsTextmining Predictive Models
Textmining Predictive Modelsguest0edcaf
 
Nonparametric tests assignment
Nonparametric tests assignmentNonparametric tests assignment
Nonparametric tests assignmentROOHASHAHID1
 
13 Machine Learning Supervised Decision Trees
13 Machine Learning Supervised Decision Trees13 Machine Learning Supervised Decision Trees
13 Machine Learning Supervised Decision TreesAndres Mendez-Vazquez
 
PG STAT 531 Lecture 4 Exploratory Data Analysis
PG STAT 531 Lecture 4 Exploratory Data AnalysisPG STAT 531 Lecture 4 Exploratory Data Analysis
PG STAT 531 Lecture 4 Exploratory Data AnalysisAashish Patel
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis緯鈞 沈
 
Integrating Fuzzy Dematel and SMAA-2 for Maintenance Expenses
Integrating Fuzzy Dematel and SMAA-2 for Maintenance ExpensesIntegrating Fuzzy Dematel and SMAA-2 for Maintenance Expenses
Integrating Fuzzy Dematel and SMAA-2 for Maintenance Expensesinventionjournals
 

What's hot (17)

Cluster analysis using spss
Cluster analysis using spssCluster analysis using spss
Cluster analysis using spss
 
Eda sri
Eda sriEda sri
Eda sri
 
Cluster spss week7
Cluster spss week7Cluster spss week7
Cluster spss week7
 
13 random forest
13 random forest13 random forest
13 random forest
 
16 Simple CART
16 Simple CART16 Simple CART
16 Simple CART
 
An Algorithm Analysis on Data Mining-396
An Algorithm Analysis on Data Mining-396An Algorithm Analysis on Data Mining-396
An Algorithm Analysis on Data Mining-396
 
Using CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetUsing CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example Dataset
 
Medical Statistics Part-I:Descriptive statistics
Medical Statistics Part-I:Descriptive statisticsMedical Statistics Part-I:Descriptive statistics
Medical Statistics Part-I:Descriptive statistics
 
Data analysis01 singlevariable
Data analysis01 singlevariableData analysis01 singlevariable
Data analysis01 singlevariable
 
Review of Basic Statistics and Terminology
Review of Basic Statistics and TerminologyReview of Basic Statistics and Terminology
Review of Basic Statistics and Terminology
 
Lesson 1 07 measures of variation
Lesson 1 07 measures of variationLesson 1 07 measures of variation
Lesson 1 07 measures of variation
 
Textmining Predictive Models
Textmining Predictive ModelsTextmining Predictive Models
Textmining Predictive Models
 
Nonparametric tests assignment
Nonparametric tests assignmentNonparametric tests assignment
Nonparametric tests assignment
 
13 Machine Learning Supervised Decision Trees
13 Machine Learning Supervised Decision Trees13 Machine Learning Supervised Decision Trees
13 Machine Learning Supervised Decision Trees
 
PG STAT 531 Lecture 4 Exploratory Data Analysis
PG STAT 531 Lecture 4 Exploratory Data AnalysisPG STAT 531 Lecture 4 Exploratory Data Analysis
PG STAT 531 Lecture 4 Exploratory Data Analysis
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Integrating Fuzzy Dematel and SMAA-2 for Maintenance Expenses
Integrating Fuzzy Dematel and SMAA-2 for Maintenance ExpensesIntegrating Fuzzy Dematel and SMAA-2 for Maintenance Expenses
Integrating Fuzzy Dematel and SMAA-2 for Maintenance Expenses
 

Viewers also liked

Floidbox -Weird photos
Floidbox -Weird photosFloidbox -Weird photos
Floidbox -Weird photosFlorin Levarda
 
Prelim Evaluation
Prelim EvaluationPrelim Evaluation
Prelim Evaluationnctcmedia12
 
Теплица "Презент"
Теплица "Презент"Теплица "Презент"
Теплица "Презент"Al Maks
 
Weird photos - pictures
Weird photos - picturesWeird photos - pictures
Weird photos - picturesFlorin Levarda
 
Yaksha prasna questions of yaksha
Yaksha prasna   questions of yakshaYaksha prasna   questions of yaksha
Yaksha prasna questions of yakshaBASKARAN P
 
Сандеро
СандероСандеро
СандероAl Maks
 
Why invest in azuero October 1st 2012
Why invest in azuero October 1st 2012Why invest in azuero October 1st 2012
Why invest in azuero October 1st 2012Cubita Panama
 
Icabihal sayi 3
Icabihal sayi 3Icabihal sayi 3
Icabihal sayi 3kolormatik
 
PERGUNTAS E RESPOSTAS SOBRE OS NOVOS PROCEDIMENTOS NOS AEROPORTOS
PERGUNTAS E RESPOSTAS SOBRE OS NOVOS PROCEDIMENTOS NOS AEROPORTOSPERGUNTAS E RESPOSTAS SOBRE OS NOVOS PROCEDIMENTOS NOS AEROPORTOS
PERGUNTAS E RESPOSTAS SOBRE OS NOVOS PROCEDIMENTOS NOS AEROPORTOSMaria Santos
 
Музыка перевода. Избранные работы за 2010 г.
Музыка перевода. Избранные работы за 2010 г.Музыка перевода. Избранные работы за 2010 г.
Музыка перевода. Избранные работы за 2010 г.Вениамин Бакалинский
 
Digital Workplace by Lizard Soft
Digital Workplace by Lizard SoftDigital Workplace by Lizard Soft
Digital Workplace by Lizard SoftIgor Petrushyn
 
Prelim R&P of Similar Products
Prelim R&P of Similar ProductsPrelim R&P of Similar Products
Prelim R&P of Similar Productsnctcmedia12
 
How to look for journal articles using ebsco host_1010S
How to look for journal articles using ebsco host_1010SHow to look for journal articles using ebsco host_1010S
How to look for journal articles using ebsco host_1010Smchiware
 
Create an Assignment in MBC
Create an Assignment in MBCCreate an Assignment in MBC
Create an Assignment in MBCMaryAnn Medved
 
Research Into Target Audience
Research Into Target AudienceResearch Into Target Audience
Research Into Target Audiencenctcmedia12
 
Finding the meaning in meaningful use
Finding the meaning in meaningful useFinding the meaning in meaningful use
Finding the meaning in meaningful usegdabate
 

Viewers also liked (20)

Floidbox -Weird photos
Floidbox -Weird photosFloidbox -Weird photos
Floidbox -Weird photos
 
Edgar allan poe
Edgar allan poeEdgar allan poe
Edgar allan poe
 
Prelim Evaluation
Prelim EvaluationPrelim Evaluation
Prelim Evaluation
 
Теплица "Презент"
Теплица "Презент"Теплица "Презент"
Теплица "Презент"
 
Weird photos - pictures
Weird photos - picturesWeird photos - pictures
Weird photos - pictures
 
Yaksha prasna questions of yaksha
Yaksha prasna   questions of yakshaYaksha prasna   questions of yaksha
Yaksha prasna questions of yaksha
 
Regular verbs
Regular verbsRegular verbs
Regular verbs
 
Сандеро
СандероСандеро
Сандеро
 
Why invest in azuero October 1st 2012
Why invest in azuero October 1st 2012Why invest in azuero October 1st 2012
Why invest in azuero October 1st 2012
 
Icabihal sayi 3
Icabihal sayi 3Icabihal sayi 3
Icabihal sayi 3
 
PERGUNTAS E RESPOSTAS SOBRE OS NOVOS PROCEDIMENTOS NOS AEROPORTOS
PERGUNTAS E RESPOSTAS SOBRE OS NOVOS PROCEDIMENTOS NOS AEROPORTOSPERGUNTAS E RESPOSTAS SOBRE OS NOVOS PROCEDIMENTOS NOS AEROPORTOS
PERGUNTAS E RESPOSTAS SOBRE OS NOVOS PROCEDIMENTOS NOS AEROPORTOS
 
Музыка перевода. Избранные работы за 2010 г.
Музыка перевода. Избранные работы за 2010 г.Музыка перевода. Избранные работы за 2010 г.
Музыка перевода. Избранные работы за 2010 г.
 
Digital Workplace by Lizard Soft
Digital Workplace by Lizard SoftDigital Workplace by Lizard Soft
Digital Workplace by Lizard Soft
 
Prelim R&P of Similar Products
Prelim R&P of Similar ProductsPrelim R&P of Similar Products
Prelim R&P of Similar Products
 
How to look for journal articles using ebsco host_1010S
How to look for journal articles using ebsco host_1010SHow to look for journal articles using ebsco host_1010S
How to look for journal articles using ebsco host_1010S
 
Create an Assignment in MBC
Create an Assignment in MBCCreate an Assignment in MBC
Create an Assignment in MBC
 
GERMANY
GERMANYGERMANY
GERMANY
 
Research Into Target Audience
Research Into Target AudienceResearch Into Target Audience
Research Into Target Audience
 
testmakt
testmakttestmakt
testmakt
 
Finding the meaning in meaningful use
Finding the meaning in meaningful useFinding the meaning in meaningful use
Finding the meaning in meaningful use
 

Similar to Introduction to CART Decision Tree Analysis for Data Mining and Predictive Modeling

Data Mining In Market Research
Data Mining In Market ResearchData Mining In Market Research
Data Mining In Market Researchjim
 
Data Mining in Market Research
Data Mining in Market ResearchData Mining in Market Research
Data Mining in Market Researchbutest
 
Data Mining In Market Research
Data Mining In Market ResearchData Mining In Market Research
Data Mining In Market Researchkevinlan
 
A Decision Tree Based Classifier for Classification & Prediction of Diseases
A Decision Tree Based Classifier for Classification & Prediction of DiseasesA Decision Tree Based Classifier for Classification & Prediction of Diseases
A Decision Tree Based Classifier for Classification & Prediction of Diseasesijsrd.com
 
A General Framework for Accurate and Fast Regression by Data Summarization in...
A General Framework for Accurate and Fast Regression by Data Summarization in...A General Framework for Accurate and Fast Regression by Data Summarization in...
A General Framework for Accurate and Fast Regression by Data Summarization in...Yao Wu
 
Decision Trees for Classification: A Machine Learning Algorithm
Decision Trees for Classification: A Machine Learning AlgorithmDecision Trees for Classification: A Machine Learning Algorithm
Decision Trees for Classification: A Machine Learning AlgorithmPalin analytics
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Derek Kane
 
measure of dispersion
measure of dispersion measure of dispersion
measure of dispersion som allul
 
Machine learning session6(decision trees random forrest)
Machine learning   session6(decision trees random forrest)Machine learning   session6(decision trees random forrest)
Machine learning session6(decision trees random forrest)Abhimanyu Dwivedi
 
On cascading small decision trees
On cascading small decision treesOn cascading small decision trees
On cascading small decision treesJulià Minguillón
 
Tree net and_randomforests_2009
Tree net and_randomforests_2009Tree net and_randomforests_2009
Tree net and_randomforests_2009Matthew Magistrado
 

Similar to Introduction to CART Decision Tree Analysis for Data Mining and Predictive Modeling (20)

CART Training 1999
CART Training 1999CART Training 1999
CART Training 1999
 
Data Mining In Market Research
Data Mining In Market ResearchData Mining In Market Research
Data Mining In Market Research
 
Data Mining in Market Research
Data Mining in Market ResearchData Mining in Market Research
Data Mining in Market Research
 
Data Mining In Market Research
Data Mining In Market ResearchData Mining In Market Research
Data Mining In Market Research
 
A Decision Tree Based Classifier for Classification & Prediction of Diseases
A Decision Tree Based Classifier for Classification & Prediction of DiseasesA Decision Tree Based Classifier for Classification & Prediction of Diseases
A Decision Tree Based Classifier for Classification & Prediction of Diseases
 
A General Framework for Accurate and Fast Regression by Data Summarization in...
A General Framework for Accurate and Fast Regression by Data Summarization in...A General Framework for Accurate and Fast Regression by Data Summarization in...
A General Framework for Accurate and Fast Regression by Data Summarization in...
 
Statistics
StatisticsStatistics
Statistics
 
Classification
ClassificationClassification
Classification
 
Classification
ClassificationClassification
Classification
 
decisiontrees (3).ppt
decisiontrees (3).pptdecisiontrees (3).ppt
decisiontrees (3).ppt
 
decisiontrees.ppt
decisiontrees.pptdecisiontrees.ppt
decisiontrees.ppt
 
decisiontrees.ppt
decisiontrees.pptdecisiontrees.ppt
decisiontrees.ppt
 
Decision Trees for Classification: A Machine Learning Algorithm
Decision Trees for Classification: A Machine Learning AlgorithmDecision Trees for Classification: A Machine Learning Algorithm
Decision Trees for Classification: A Machine Learning Algorithm
 
6238578.ppt
6238578.ppt6238578.ppt
6238578.ppt
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests
 
measure of dispersion
measure of dispersion measure of dispersion
measure of dispersion
 
Machine learning session6(decision trees random forrest)
Machine learning   session6(decision trees random forrest)Machine learning   session6(decision trees random forrest)
Machine learning session6(decision trees random forrest)
 
On cascading small decision trees
On cascading small decision treesOn cascading small decision trees
On cascading small decision trees
 
Introduction to cart_2007
Introduction to cart_2007Introduction to cart_2007
Introduction to cart_2007
 
Tree net and_randomforests_2009
Tree net and_randomforests_2009Tree net and_randomforests_2009
Tree net and_randomforests_2009
 

Recently uploaded

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 

Recently uploaded (20)

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 

Introduction to CART Decision Tree Analysis for Data Mining and Predictive Modeling

  • 1. Data Mining with Decision Trees: An Introduction to CART ® Dan Steinberg Mikhail Golovnya [email_address] http://www.salford-systems.com
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16. General Workflow Stage 1 Stage 2 Stage 3 Historical Data Learn Test Validate Build a Sequence of Nested Trees Monitor Performance Best Confirm Findings
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
  • 33.
  • 34.
  • 35.
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42.
  • 44.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51.
  • 52.
  • 53.
  • 54.
  • 56.
  • 57.
  • 58.
  • 59.
  • 60.
  • 61.
  • 62.
  • 63.
  • 64.
  • 65.
  • 66.
  • 67.
  • 68.
  • 69.
  • 70.
  • 71.
  • 72.
  • 73.
  • 74.
  • 75.
  • 76.
  • 77.
  • 78.
  • 79.