Model Evaluation and Selection
Evaluation Metrics
Standardized equations
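• sensitivity = recall = tp / t = tp / (tp + fn), where t is the number of actual positives
• specificity = tn / n = tn / (tn + fp), where n is the number of actual negatives
• precision = tp / p = tp / (tp + fp), where p is the number of predicted positives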
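The three equations can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the encoding of labels as 1 (positive) and 0 (negative) are assumptions made for the example.

```python
def confusion_counts(y_true, y_pred):
    """Count (tp, fp, tn, fn), assuming labels are 1 = positive, 0 = negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    # tp / t: share of actual positives that were detected
    return tp / (tp + fn)


def specificity(tn, fp):
    # tn / n: share of actual negatives that were correctly rejected
    return tn / (tn + fp)


def precision(tp, fp):
    # tp / p: share of positive predictions that were correct
    return tp / (tp + fp)


# Hypothetical labels, chosen only to exercise the formulas.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(tp, fp, tn, fn)
print(sensitivity(tp, fn), specificity(tn, fp), precision(tp, fp))
```

Each function divides by a count that can be zero (for example, precision when nothing is predicted positive), so real code would need to decide how to handle that case.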
Equations explained
• Sensitivity/recall – the fraction of actual positives the test detects. A test can cheat and
maximize this by always returning “positive”.
• Specificity – the fraction of actual negatives the test correctly rejects, i.e. how good it is at
avoiding false alarms. A test can cheat and maximize this by always returning “negative”.
• Precision – the fraction of positive predictions that are actually positive. A test can cheat and
maximize this by returning “positive” only for the one result it is most confident in.
• The cheating is exposed by looking at both relevant metrics instead of just one. E.g. the
cheating test that always says “positive” reaches 100% sensitivity but has 0% specificity.
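The cheating strategies can be checked numerically. The sketch below is an illustration under assumptions not found in the slides: the `metrics` helper, the 1/0 label encoding, and the ten-sample dataset are all hypothetical. A metric whose denominator is zero is reported as NaN.

```python
import math


def metrics(y_true, y_pred):
    """Return sensitivity, specificity and precision for 1/0 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined when the denominator is zero.
        return num / den if den else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Hypothetical data: 2 actual positives, 8 actual negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

always_positive = metrics(y_true, [1] * 10)
always_negative = metrics(y_true, [0] * 10)
# Predicts positive only for the first sample (assumed to be its most confident one).
one_confident = metrics(y_true, [1, 0, 0, 0, 0, 0, 0, 0, 0, 0])

for name, m in [("always positive", always_positive),
                ("always negative", always_negative),
                ("one confident", one_confident)]:
    print(name, m)
```

On this data the always-positive test has sensitivity 1.0 but specificity 0.0, the always-negative test has specificity 1.0 but sensitivity 0.0, and the single-confident-prediction test has precision 1.0 but sensitivity of only 0.5. In each case the second metric reveals the cheat.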