 Tree pruning is performed to counter overfitting of the training data caused by noise or outliers.
 The pruned trees are smaller and less
complex.
 Two strategies for “pruning” the decision
tree:
 Postpruning - take a fully grown decision tree and discard its unreliable parts (subtrees)
 Prepruning - stop growing a branch when the information becomes unreliable
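To make the two strategies concrete, here is a minimal sketch using scikit-learn (not mentioned in the original; the data set and parameter values are illustrative). The max_depth and min_samples_leaf parameters act as prepruning stopping criteria, while ccp_alpha postprunes a fully grown tree via minimal cost-complexity pruning.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Prepruning: stop growing a branch early via stopping criteria.
prepruned = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5,
                                   random_state=0).fit(X_train, y_train)

# Postpruning: grow the full tree, then discard unreliable subtrees
# with minimal cost-complexity pruning (larger ccp_alpha => smaller tree).
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(
    X_train, y_train)
postpruned = DecisionTreeClassifier(ccp_alpha=path.ccp_alphas[-2],
                                    random_state=0).fit(X_train, y_train)

print("prepruned leaves: ", prepruned.get_n_leaves())
print("postpruned leaves:", postpruned.get_n_leaves())
```

Both pruned trees are smaller than the unrestricted tree, trading a little training accuracy for simpler, more interpretable models.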
 The efficiency of existing decision tree
algorithms has been well established for
relatively small data sets.
 Efficiency becomes an issue of concern when
these algorithms are applied to the mining of
very large real-world databases.
 In data mining applications, data sets with millions of tuples are common, yet most decision tree algorithms require the training tuples to reside in memory.
 The AVC-set ("Attribute-Value, Classlabel" set) of an attribute A at node N gives the class label counts for each value of A for the tuples at N.
 The set of all AVC-sets at a node N is the
AVC-group of N.
 The size of an AVC-set for attribute A at
node N depends only on the number of
distinct values of A and the number of
classes in the set of tuples at N.
 Typically, this size should fit in memory,
even for real-world data.
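As a sketch of the idea, the toy code below builds an AVC-group from a handful of tuples; the attribute names, values, and class labels are invented for illustration.

```python
from collections import Counter, defaultdict

# Toy tuples at a node N: ({attribute: value, ...}, class label).
# The attributes "age"/"income" and the labels are hypothetical.
tuples_at_N = [
    ({"age": "youth",  "income": "high"},   "no"),
    ({"age": "youth",  "income": "medium"}, "no"),
    ({"age": "senior", "income": "high"},   "yes"),
    ({"age": "senior", "income": "medium"}, "yes"),
    ({"age": "youth",  "income": "high"},   "yes"),
]

def avc_group(tuples):
    """AVC-group of a node: one AVC-set per attribute, where an
    AVC-set maps each attribute value to its class-label counts.
    Its size depends only on (#distinct values) x (#classes),
    not on the number of tuples, so it usually fits in memory."""
    group = defaultdict(lambda: defaultdict(Counter))
    for attrs, label in tuples:
        for attr, value in attrs.items():
            group[attr][value][label] += 1
    return group

group = avc_group(tuples_at_N)
# AVC-set of "age": {'youth': {'no': 2, 'yes': 1}, 'senior': {'yes': 2}}
print({value: dict(counts) for value, counts in group["age"].items()})
```

Because only these compact counts are needed to choose a split, the full set of tuples never has to be held in memory at once.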
 Another approach uses a statistical technique known as "bootstrapping" to create several smaller samples (or subsets) of the given training data, each of which fits in memory.
 Each subset is used to construct a tree, resulting in several trees. The trees are examined and used to construct a new tree, T′, that turns out to be "very close" to the tree that would have been generated if all the original training data had fit in memory.
 Faster at constructing the tree
 Insertions and deletions are easy to handle without reconstructing the tree from scratch
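A rough sketch of the sampling step follows, again using scikit-learn trees (the subset count and size are illustrative); the step that examines the per-sample trees and merges them into T′ is algorithm-specific and omitted here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

# Draw several bootstrap samples, each small enough to fit in memory.
n_subsets, subset_size = 5, 100   # illustrative values
trees = []
for _ in range(n_subsets):
    idx = rng.choice(len(X), size=subset_size, replace=True)
    trees.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

# Each subset yields one tree; these would then be examined and combined
# into a single tree T' close to the tree built from the full data.
print([tree.get_depth() for tree in trees])
```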
 Perception-based classification (PBC) is an interactive approach based on multidimensional visualization techniques.
 It allows the user to incorporate background
knowledge about the data when building a
decision tree.
 By visually interacting with the data, the user is
also likely to develop a deeper understanding of
the data.
 The resulting trees tend to be smaller than those
built using traditional decision tree induction
methods and so are easier to interpret, while
achieving about the same accuracy.