SlideShare a Scribd company logo
1 of 7
SMOTE
Synthetic Minority Over-sampling Technique – a
methodology to handle class imbalance problems.
• Synthesis of new minority class instances
• Over-sampling of minority class
• Under-sampling of majority class
• Tweaking the cost function to make misclassification of
minority instances more important than misclassification of
majority instances
SMOTE
SMOTE
The trade-off between recall (percent of truly positive instances that were classified as such) and
precision (percent of positive classifications that are truly positive). In situations where we want to
detect instances of a minority class, we are usually concerned more so with recall than precision
K-Fold Cross-validation
Cross-validation is a resampling procedure used to evaluate
machine learning models on a limited data sample.
1. Shuffle the dataset randomly
2. Split the dataset into k groups
3. For each unique group:
a. Take the group as a hold out or test data set
b. Take the remaining groups as a training data set
c. Fit a model on the training set and evaluate it on the test set
d. Retain the evaluation score and discard the model
4. Summarize the skill of the model using the sample of model
evaluation scores
Think of this as the leave one out technique
K-Fold Cross-validation
Choosing k
A poorly chosen value for k may result in a misrepresentative
idea of the skill of the model, such as a score with a high
variance (that may change a lot based on the data used to fit
the model), or a high bias, (such as an overestimate of the skill
of the model).
K-Fold Cross-validation
Basic example
Let’s assume we have six observations:
[1, 2, 3, 4, 5, 6]
Now, let’s use k = 3, so we need to create three random folds:
fold 1: [5, 2]
fold 2: [4, 6]
fold 3: [3, 1]
K-Fold Cross-validation
Basic example
So our train and test sets become:
train: [5, 2, 4, 6], test: [3, 1]
train: [4, 6, 3, 1], test: [5, 2]
train: [3, 1, 5, 2], test: [4, 6]
fold 1: [5, 2]
fold 2: [4, 6]
fold 3: [3, 1]

More Related Content

Similar to SMOTE and K-Fold Cross Validation-Presentation.pptx

Machine learning (5)
Machine learning (5)Machine learning (5)
Machine learning (5)NYversity
 
Cmpe 255 cross validation
Cmpe 255 cross validationCmpe 255 cross validation
Cmpe 255 cross validationAbraham Kong
 
Probability density estimation using Product of Conditional Experts
Probability density estimation using Product of Conditional ExpertsProbability density estimation using Product of Conditional Experts
Probability density estimation using Product of Conditional ExpertsChirag Gupta
 
Classification using L1-Penalized Logistic Regression
Classification using L1-Penalized Logistic RegressionClassification using L1-Penalized Logistic Regression
Classification using L1-Penalized Logistic RegressionSetia Pramana
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10Roger Barga
 
Learning On The Border:Active Learning in Imbalanced classification Data
Learning On The Border:Active Learning in Imbalanced classification DataLearning On The Border:Active Learning in Imbalanced classification Data
Learning On The Border:Active Learning in Imbalanced classification Data萍華 楊
 
Ensemble hybrid learning technique
Ensemble hybrid learning techniqueEnsemble hybrid learning technique
Ensemble hybrid learning techniqueDishaSinha9
 
Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)Marina Santini
 
in5490-classification (1).pptx
in5490-classification (1).pptxin5490-classification (1).pptx
in5490-classification (1).pptxMonicaTimber
 
Ensemble methods in Machine learning technology
Ensemble methods in Machine learning technologyEnsemble methods in Machine learning technology
Ensemble methods in Machine learning technologysikethatsarightemail
 
Cross-validation aggregation for forecasting
Cross-validation aggregation for forecastingCross-validation aggregation for forecasting
Cross-validation aggregation for forecastingDevon Barrow
 
DMTM Lecture 06 Classification evaluation
DMTM Lecture 06 Classification evaluationDMTM Lecture 06 Classification evaluation
DMTM Lecture 06 Classification evaluationPier Luca Lanzi
 
Statistical Learning and Model Selection module 2.pptx
Statistical Learning and Model Selection module 2.pptxStatistical Learning and Model Selection module 2.pptx
Statistical Learning and Model Selection module 2.pptxnagarajan740445
 
Post Graduate Admission Prediction System
Post Graduate Admission Prediction SystemPost Graduate Admission Prediction System
Post Graduate Admission Prediction SystemIRJET Journal
 
DMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification ModelsDMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification ModelsPier Luca Lanzi
 

Similar to SMOTE and K-Fold Cross Validation-Presentation.pptx (20)

Machine learning (5)
Machine learning (5)Machine learning (5)
Machine learning (5)
 
evaluation and credibility-Part 1
evaluation and credibility-Part 1evaluation and credibility-Part 1
evaluation and credibility-Part 1
 
18.1 combining models
18.1 combining models18.1 combining models
18.1 combining models
 
Cmpe 255 cross validation
Cmpe 255 cross validationCmpe 255 cross validation
Cmpe 255 cross validation
 
Probability density estimation using Product of Conditional Experts
Probability density estimation using Product of Conditional ExpertsProbability density estimation using Product of Conditional Experts
Probability density estimation using Product of Conditional Experts
 
Classification using L1-Penalized Logistic Regression
Classification using L1-Penalized Logistic RegressionClassification using L1-Penalized Logistic Regression
Classification using L1-Penalized Logistic Regression
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10
 
Learning On The Border:Active Learning in Imbalanced classification Data
Learning On The Border:Active Learning in Imbalanced classification DataLearning On The Border:Active Learning in Imbalanced classification Data
Learning On The Border:Active Learning in Imbalanced classification Data
 
Ensemble hybrid learning technique
Ensemble hybrid learning techniqueEnsemble hybrid learning technique
Ensemble hybrid learning technique
 
Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)Lecture 9: Machine Learning in Practice (2)
Lecture 9: Machine Learning in Practice (2)
 
4.1.pptx
4.1.pptx4.1.pptx
4.1.pptx
 
in5490-classification (1).pptx
in5490-classification (1).pptxin5490-classification (1).pptx
in5490-classification (1).pptx
 
Ensemble methods in Machine learning technology
Ensemble methods in Machine learning technologyEnsemble methods in Machine learning technology
Ensemble methods in Machine learning technology
 
Cross-validation aggregation for forecasting
Cross-validation aggregation for forecastingCross-validation aggregation for forecasting
Cross-validation aggregation for forecasting
 
slides.pptx
slides.pptxslides.pptx
slides.pptx
 
DMTM Lecture 06 Classification evaluation
DMTM Lecture 06 Classification evaluationDMTM Lecture 06 Classification evaluation
DMTM Lecture 06 Classification evaluation
 
Statistical Learning and Model Selection module 2.pptx
Statistical Learning and Model Selection module 2.pptxStatistical Learning and Model Selection module 2.pptx
Statistical Learning and Model Selection module 2.pptx
 
Post Graduate Admission Prediction System
Post Graduate Admission Prediction SystemPost Graduate Admission Prediction System
Post Graduate Admission Prediction System
 
LR 9 Estimation.pdf
LR 9 Estimation.pdfLR 9 Estimation.pdf
LR 9 Estimation.pdf
 
DMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification ModelsDMTM 2015 - 14 Evaluation of Classification Models
DMTM 2015 - 14 Evaluation of Classification Models
 

More from HaritikaChhatwal1

Visualization IN DATA ANALYTICS IN TIME SERIES
Visualization IN DATA ANALYTICS IN TIME SERIESVisualization IN DATA ANALYTICS IN TIME SERIES
Visualization IN DATA ANALYTICS IN TIME SERIESHaritikaChhatwal1
 
TS Decomposition IN data Time Series Fore
TS Decomposition IN data Time Series ForeTS Decomposition IN data Time Series Fore
TS Decomposition IN data Time Series ForeHaritikaChhatwal1
 
TIMES SERIES FORECASTING ON HISTORICAL DATA IN R
TIMES SERIES FORECASTING ON HISTORICAL DATA  IN RTIMES SERIES FORECASTING ON HISTORICAL DATA  IN R
TIMES SERIES FORECASTING ON HISTORICAL DATA IN RHaritikaChhatwal1
 
Factor Analysis-Presentation DATA ANALYTICS
Factor Analysis-Presentation DATA ANALYTICSFactor Analysis-Presentation DATA ANALYTICS
Factor Analysis-Presentation DATA ANALYTICSHaritikaChhatwal1
 
Additional Reading material-Probability.ppt
Additional Reading material-Probability.pptAdditional Reading material-Probability.ppt
Additional Reading material-Probability.pptHaritikaChhatwal1
 
Decision Tree_Loan Delinquent_Problem Statement.pdf
Decision Tree_Loan Delinquent_Problem Statement.pdfDecision Tree_Loan Delinquent_Problem Statement.pdf
Decision Tree_Loan Delinquent_Problem Statement.pdfHaritikaChhatwal1
 
Frequency Based Classification Algorithms_ important
Frequency Based Classification Algorithms_ importantFrequency Based Classification Algorithms_ important
Frequency Based Classification Algorithms_ importantHaritikaChhatwal1
 
M2W1 - FBS - Descriptive Statistics_Mentoring Presentation.pptx
M2W1 - FBS - Descriptive Statistics_Mentoring Presentation.pptxM2W1 - FBS - Descriptive Statistics_Mentoring Presentation.pptx
M2W1 - FBS - Descriptive Statistics_Mentoring Presentation.pptxHaritikaChhatwal1
 
Introduction to R _IMPORTANT FOR DATA ANALYTICS
Introduction to R _IMPORTANT FOR DATA ANALYTICSIntroduction to R _IMPORTANT FOR DATA ANALYTICS
Introduction to R _IMPORTANT FOR DATA ANALYTICSHaritikaChhatwal1
 
BUSINESS ANALYTICS WITH R SOFTWARE DIAST
BUSINESS ANALYTICS WITH R SOFTWARE DIASTBUSINESS ANALYTICS WITH R SOFTWARE DIAST
BUSINESS ANALYTICS WITH R SOFTWARE DIASTHaritikaChhatwal1
 
SESSION 1-2 [Autosaved] [Autosaved].pptx
SESSION 1-2 [Autosaved] [Autosaved].pptxSESSION 1-2 [Autosaved] [Autosaved].pptx
SESSION 1-2 [Autosaved] [Autosaved].pptxHaritikaChhatwal1
 
WORKSHEET INTRO AND TVM _SESSION 1 AND 2.pdf
WORKSHEET INTRO AND TVM _SESSION 1 AND 2.pdfWORKSHEET INTRO AND TVM _SESSION 1 AND 2.pdf
WORKSHEET INTRO AND TVM _SESSION 1 AND 2.pdfHaritikaChhatwal1
 
Epigeum at ntunive singapore MED900.pptx
Epigeum at ntunive singapore MED900.pptxEpigeum at ntunive singapore MED900.pptx
Epigeum at ntunive singapore MED900.pptxHaritikaChhatwal1
 
MED 900 Correlational Studies online safety sake.pptx
MED 900 Correlational Studies online safety sake.pptxMED 900 Correlational Studies online safety sake.pptx
MED 900 Correlational Studies online safety sake.pptxHaritikaChhatwal1
 
Nw Microsoft PowerPoint Presentation.pptx
Nw Microsoft PowerPoint Presentation.pptxNw Microsoft PowerPoint Presentation.pptx
Nw Microsoft PowerPoint Presentation.pptxHaritikaChhatwal1
 
HOWs CORRELATIONAL STUDIES are performed
HOWs CORRELATIONAL STUDIES are performedHOWs CORRELATIONAL STUDIES are performed
HOWs CORRELATIONAL STUDIES are performedHaritikaChhatwal1
 
CHAPTER 4 -TYPES OF BUSINESS.pptx
CHAPTER 4 -TYPES OF BUSINESS.pptxCHAPTER 4 -TYPES OF BUSINESS.pptx
CHAPTER 4 -TYPES OF BUSINESS.pptxHaritikaChhatwal1
 
CHAPTER 3-ENTREPRENEURSHIP [Autosaved].pptx
CHAPTER 3-ENTREPRENEURSHIP [Autosaved].pptxCHAPTER 3-ENTREPRENEURSHIP [Autosaved].pptx
CHAPTER 3-ENTREPRENEURSHIP [Autosaved].pptxHaritikaChhatwal1
 

More from HaritikaChhatwal1 (20)

Visualization IN DATA ANALYTICS IN TIME SERIES
Visualization IN DATA ANALYTICS IN TIME SERIESVisualization IN DATA ANALYTICS IN TIME SERIES
Visualization IN DATA ANALYTICS IN TIME SERIES
 
TS Decomposition IN data Time Series Fore
TS Decomposition IN data Time Series ForeTS Decomposition IN data Time Series Fore
TS Decomposition IN data Time Series Fore
 
TIMES SERIES FORECASTING ON HISTORICAL DATA IN R
TIMES SERIES FORECASTING ON HISTORICAL DATA  IN RTIMES SERIES FORECASTING ON HISTORICAL DATA  IN R
TIMES SERIES FORECASTING ON HISTORICAL DATA IN R
 
Factor Analysis-Presentation DATA ANALYTICS
Factor Analysis-Presentation DATA ANALYTICSFactor Analysis-Presentation DATA ANALYTICS
Factor Analysis-Presentation DATA ANALYTICS
 
Additional Reading material-Probability.ppt
Additional Reading material-Probability.pptAdditional Reading material-Probability.ppt
Additional Reading material-Probability.ppt
 
Decision Tree_Loan Delinquent_Problem Statement.pdf
Decision Tree_Loan Delinquent_Problem Statement.pdfDecision Tree_Loan Delinquent_Problem Statement.pdf
Decision Tree_Loan Delinquent_Problem Statement.pdf
 
Frequency Based Classification Algorithms_ important
Frequency Based Classification Algorithms_ importantFrequency Based Classification Algorithms_ important
Frequency Based Classification Algorithms_ important
 
M2W1 - FBS - Descriptive Statistics_Mentoring Presentation.pptx
M2W1 - FBS - Descriptive Statistics_Mentoring Presentation.pptxM2W1 - FBS - Descriptive Statistics_Mentoring Presentation.pptx
M2W1 - FBS - Descriptive Statistics_Mentoring Presentation.pptx
 
Introduction to R _IMPORTANT FOR DATA ANALYTICS
Introduction to R _IMPORTANT FOR DATA ANALYTICSIntroduction to R _IMPORTANT FOR DATA ANALYTICS
Introduction to R _IMPORTANT FOR DATA ANALYTICS
 
BUSINESS ANALYTICS WITH R SOFTWARE DIAST
BUSINESS ANALYTICS WITH R SOFTWARE DIASTBUSINESS ANALYTICS WITH R SOFTWARE DIAST
BUSINESS ANALYTICS WITH R SOFTWARE DIAST
 
SESSION 1-2 [Autosaved] [Autosaved].pptx
SESSION 1-2 [Autosaved] [Autosaved].pptxSESSION 1-2 [Autosaved] [Autosaved].pptx
SESSION 1-2 [Autosaved] [Autosaved].pptx
 
WORKSHEET INTRO AND TVM _SESSION 1 AND 2.pdf
WORKSHEET INTRO AND TVM _SESSION 1 AND 2.pdfWORKSHEET INTRO AND TVM _SESSION 1 AND 2.pdf
WORKSHEET INTRO AND TVM _SESSION 1 AND 2.pdf
 
Epigeum at ntunive singapore MED900.pptx
Epigeum at ntunive singapore MED900.pptxEpigeum at ntunive singapore MED900.pptx
Epigeum at ntunive singapore MED900.pptx
 
MED 900 Correlational Studies online safety sake.pptx
MED 900 Correlational Studies online safety sake.pptxMED 900 Correlational Studies online safety sake.pptx
MED 900 Correlational Studies online safety sake.pptx
 
Nw Microsoft PowerPoint Presentation.pptx
Nw Microsoft PowerPoint Presentation.pptxNw Microsoft PowerPoint Presentation.pptx
Nw Microsoft PowerPoint Presentation.pptx
 
HOWs CORRELATIONAL STUDIES are performed
HOWs CORRELATIONAL STUDIES are performedHOWs CORRELATIONAL STUDIES are performed
HOWs CORRELATIONAL STUDIES are performed
 
DIAS PRESENTATION.pptx
DIAS PRESENTATION.pptxDIAS PRESENTATION.pptx
DIAS PRESENTATION.pptx
 
FULLTEXT01.pdf
FULLTEXT01.pdfFULLTEXT01.pdf
FULLTEXT01.pdf
 
CHAPTER 4 -TYPES OF BUSINESS.pptx
CHAPTER 4 -TYPES OF BUSINESS.pptxCHAPTER 4 -TYPES OF BUSINESS.pptx
CHAPTER 4 -TYPES OF BUSINESS.pptx
 
CHAPTER 3-ENTREPRENEURSHIP [Autosaved].pptx
CHAPTER 3-ENTREPRENEURSHIP [Autosaved].pptxCHAPTER 3-ENTREPRENEURSHIP [Autosaved].pptx
CHAPTER 3-ENTREPRENEURSHIP [Autosaved].pptx
 

Recently uploaded

代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...ThinkInnovation
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 

Recently uploaded (20)

Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 

SMOTE and K-Fold Cross Validation-Presentation.pptx

  • 1. SMOTE Synthetic Minority Over-sampling Technique – a methodology to handle class imbalance problems. • Synthesis of new minority class instances • Over-sampling of minority class • Under-sampling of majority class • Tweaking the cost function to make misclassification of minority instances more important than misclassification of majority instances
  • 3. SMOTE The trade-off between recall (percent of truly positive instances that were classified as such) and precision (percent of positive classifications that are truly positive). In situations where we want to detect instances of a minority class, we are usually concerned more so with recall than precision
  • 4. K-Fold Cross-validation Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample. 1. Shuffle the dataset randomly 2. Split the dataset into k groups 3. For each unique group: a. Take the group as a hold out or test data set b. Take the remaining groups as a training data set c. Fit a model on the training set and evaluate it on the test set d. Retain the evaluation score and discard the model 4. Summarize the skill of the model using the sample of model evaluation scores Think of this as the leave one out technique
  • 5. K-Fold Cross-validation Choosing k A poorly chosen value for k may result in a misrepresentative idea of the skill of the model, such as a score with a high variance (that may change a lot based on the data used to fit the model), or a high bias, (such as an overestimate of the skill of the model).
  • 6. K-Fold Cross-validation Basic example Let’s assume we have six observations: [1, 2, 3, 4, 5, 6] Now, let’s use k = 3, so we need to create three random folds: fold 1: [5, 2] fold 2: [4, 6] fold 3: [3, 1]
  • 7. K-Fold Cross-validation Basic example So our train and test sets become: train: [5, 2, 4, 6], test: [3, 1] train: [4, 6, 3, 1], test: [5, 2] train: [3, 1, 5, 2], test: [4, 6] fold 1: [5, 2] fold 2: [4, 6] fold 3: [3, 1]