Threshold setting for high prediction rate with low false positives
Improving the functionality of Supervised Classification
Why do we need a low false positive rate?
Take the example of a cancer prediction problem. If our model predicts that a patient is going to have cancer when they actually are not, we cause needless distress for the patient and their family. In other words, while we want to correctly identify the people who are going to have cancer, we do not want to predict cancer for someone who is not going to have it.
Hence, when we build a classification model, we need to ensure that it is tested correctly and that the false positive rate is as low as possible without compromising the classification accuracy of the model.
Testing the Classification Model
Testing requires two parameters to be observed:
• Sensitivity = (number of true positives predicted) / (total number of actual positives)
• Specificity = (number of true negatives predicted) / (total number of actual negatives)
Sensitivity can be thought of as the classification accuracy of the model on the positive class (e.g., what fraction of the patients who have cancer we correctly identify).
Specificity can be thought of as the classification accuracy of the model on the negative class (e.g., what fraction of the patients who do not have cancer we correctly identify). The false positive rate is 1 − specificity.
For example
There is a sample of 2000 patients, of whom 20 have ovarian cancer. The classification model built by a healthcare company predicts that 22 patients have ovarian cancer, of whom 15 actually do.
What are the sensitivity and specificity?
Of the 1980 patients without cancer, 22 − 15 = 7 are false positives, so 1980 − 7 = 1973 are true negatives.
Sensitivity = 15/20 = 0.75
Specificity = 1973/1980 ≈ 0.996
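The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function name and its parameters are illustrative and not part of the original slides; the inputs are the four counts stated in the example.

```python
def sensitivity_specificity(total, actual_pos, predicted_pos, true_pos):
    """Derive sensitivity, specificity and false positive rate from four counts.

    All parameter names are hypothetical, chosen for this illustration.
    """
    actual_neg = total - actual_pos          # patients without the disease
    false_pos = predicted_pos - true_pos     # predicted positive but actually negative
    true_neg = actual_neg - false_pos        # correctly predicted negative

    sensitivity = true_pos / actual_pos
    specificity = true_neg / actual_neg
    false_positive_rate = false_pos / actual_neg  # equals 1 - specificity
    return sensitivity, specificity, false_positive_rate


# Counts from the ovarian cancer example: 2000 patients, 20 with cancer,
# 22 predicted positive, 15 of those predictions correct.
sens, spec, fpr = sensitivity_specificity(2000, 20, 22, 15)
print(f"sensitivity={sens:.3f} specificity={spec:.4f} fpr={fpr:.4f}")
```

Note that the false positive rate here is 7/1980, which is small because the negative class is so large, even though 7 of the 22 positive predictions are wrong.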
ROC Curve Analysis
• ROC curve: plot of sensitivity (true positive rate) against false positive rate (1 − specificity)
• Each point corresponds to a different threshold that separates negative samples from positive samples
• The objective is to find a point (threshold) where the prediction rate is high (high sensitivity) and the false positive rate is low.
• Example in next slide
Source: The use of Decision Threshold Adjustment in Classification of Cancer Prediction
http://www.ams.sunysb.edu/~hahn/psfile/papthres.pdf
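The threshold sweep described on this slide can be sketched in plain Python. Everything here is an assumption made for illustration: the function names, the toy labels and scores, and the selection rule (highest sensitivity among thresholds whose false positive rate stays under a chosen cap). The slides do not prescribe a particular rule, and other criteria are possible.

```python
def roc_points(labels, scores):
    """Return (threshold, sensitivity, false_positive_rate) per distinct score.

    A sample is predicted positive when its score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((threshold, tp / positives, fp / negatives))
    return points


def pick_threshold(points, max_fpr):
    """Pick the point with the highest sensitivity whose FPR is <= max_fpr.

    Ties on sensitivity are broken in favour of the lower false positive rate.
    Returns None when no threshold satisfies the cap.
    """
    allowed = [p for p in points if p[2] <= max_fpr]
    if not allowed:
        return None
    return max(allowed, key=lambda p: (p[1], -p[2]))


# Toy data (made up): 1 = positive class, scores are model outputs in [0, 1].
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.85, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]

points = roc_points(labels, scores)
best = pick_threshold(points, max_fpr=0.2)
print(best)
```

Lowering `max_fpr` trades sensitivity for fewer false positives, which is the trade-off the ROC curve makes visible.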
Cases
• Breast Cancer Prediction – 0.98
• Fraud detection – 0.92
Source: http://www.gcxanalytics.com/papers/GCX%20Fraud%20Detection%20Performance%20Evaluation-GCX.pdf
