5. how to grow a decision tree
• pick the split (variable and cutoff) that best separates the data at the current node
• for classification, the best split is typically determined by Gini impurity (see the sketch below)
• stop growing the tree when all leaf nodes are “pure” (all one label) or a node contains too few data points
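A rough sketch of the split search described above, using numpy; the function names gini_impurity and best_split are illustrative, not from the slides:

    import numpy as np

    def gini_impurity(labels):
        # Gini impurity: 1 - sum of squared class proportions
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return 1.0 - np.sum(p ** 2)

    def best_split(X, y):
        # scan every variable and every observed cutoff, keeping the split
        # with the lowest weighted Gini impurity of the two child nodes
        best = (None, None, np.inf)  # (variable index, cutoff, impurity)
        for j in range(X.shape[1]):
            for cutoff in np.unique(X[:, j]):
                left, right = y[X[:, j] <= cutoff], y[X[:, j] > cutoff]
                if len(left) == 0 or len(right) == 0:
                    continue
                score = (len(left) * gini_impurity(left)
                         + len(right) * gini_impurity(right)) / len(y)
                if score < best[2]:
                    best = (j, cutoff, score)
        return best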
6. bootstrap aggregation (bagging)
• resample (with replacement) a dataset of the same size as the original data
• repeat, fitting a model on each resampled dataset
• aggregate the models over the resampled datasets to reduce overfitting or quantify uncertainty (see the sketch below)
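One way the bagging loop might look in Python, with scikit-learn's DecisionTreeClassifier as the base model; the helper names are illustrative, and the majority vote assumes integer class labels:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def bagged_models(X, y, n_models=100, model_factory=DecisionTreeClassifier):
        rng = np.random.default_rng(0)
        n, models = len(y), []
        for _ in range(n_models):
            # resample with replacement, same size as the original data
            idx = rng.integers(0, n, size=n)
            models.append(model_factory().fit(X[idx], y[idx]))
        return models

    def bagged_predict(models, X):
        # aggregate by majority vote (assumes integer class labels)
        votes = np.stack([m.predict(X) for m in models])
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)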
7. random forests: bagging decision trees
• a single, fully grown decision tree overfits the data (its predictions will be perfect on the training data)
• random forests grow many trees on randomly sampled subsets of the data and then aggregate (average or majority vote)
• to reduce variance further, choose splits at each node from a random subset of the predictor variables (see the example below)
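In scikit-learn, both ideas map onto RandomForestClassifier parameters: n_estimators sets how many trees are grown on bootstrap samples, and max_features restricts each split to a random subset of predictors. A minimal example on a synthetic dataset:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # max_features='sqrt': each split considers a random subset of predictors,
    # which decorrelates the trees and reduces the variance of the average
    rf = RandomForestClassifier(n_estimators=500, max_features='sqrt',
                                random_state=0).fit(X_tr, y_tr)
    print(rf.score(X_te, y_te))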
9. ensemble models
• no model is perfect
• why not combine models to improve accuracy?
• another way of looking at it: treat the outputs of different models as new features
11. linear weighted stacking
• divide the training data into two parts, call them train1 and train2
• on train1, train rf, glm, gbm, and svm models
• score each model on train2 and use those scores as four new features
• combine the four new features with the original features on train2 and fit a glm model (with interactions between original and new features)
• run test data through the full stacked model pipeline (see the sketch below)
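A sketch of that pipeline in Python under a few assumptions: LogisticRegression stands in for the glm, each base model's predicted class-1 probability is its "score", and the interactions are built as products of every original feature with every new feature. The helper names fit_stack and new_features are illustrative:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC
    from sklearn.model_selection import train_test_split

    def fit_stack(X, y, seed=0):
        # split the training data into train1 and train2
        X1, X2, y1, y2 = train_test_split(X, y, test_size=0.5, random_state=seed)

        # base models fit on train1 (logistic regression stands in for the glm)
        base = [RandomForestClassifier(random_state=seed).fit(X1, y1),
                LogisticRegression(max_iter=1000).fit(X1, y1),
                GradientBoostingClassifier(random_state=seed).fit(X1, y1),
                SVC(probability=True, random_state=seed).fit(X1, y1)]

        def new_features(X):
            # each model's predicted class-1 probability is a new feature
            S = np.column_stack([m.predict_proba(X)[:, 1] for m in base])
            # interactions: every original feature times every new feature
            inter = (X[:, :, None] * S[:, None, :]).reshape(len(X), -1)
            return np.hstack([X, S, inter])

        # fit the second-stage glm on train2 with the augmented features
        stacker = LogisticRegression(max_iter=1000).fit(new_features(X2), y2)
        return lambda X_new: stacker.predict(new_features(X_new))

Calling predict = fit_stack(X_train, y_train) and then predict(X_test) runs the test data through both stages, mirroring the last bullet.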