Binary Classification
Validation
Au Vo, PhD
Confusion Matrix

| | Real Y is positive (Class 1) | Real Y is negative (Class 0) |
|---|---|---|
| Predicted Y is positive (Class 1) | True positive: predict positive, and is positive → Correct / True prediction | False positive: predict positive, but is negative → Incorrect / False prediction (Type I error) |
| Predicted Y is negative (Class 0) | False negative: predict negative, but is positive → Incorrect / False prediction (Type II error) | True negative: predict negative, and is negative → Correct / True prediction |
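The four cells of the confusion matrix can be counted directly from paired labels. The following is a minimal sketch, assuming labels are encoded as 1 (positive) and 0 (negative); the function name `confusion_counts` and the sample labels are hypothetical and not part of the original slides.

```python
def confusion_counts(y_true, y_pred):
    """Count (TP, FP, FN, TN) for 0/1 labels. Hypothetical helper for illustration."""
    tp = fp = fn = tn = 0
    for real, predicted in zip(y_true, y_pred):
        if predicted == 1 and real == 1:
            tp += 1  # predict positive, and is positive
        elif predicted == 1 and real == 0:
            fp += 1  # predict positive, but is negative (Type I error)
        elif predicted == 0 and real == 1:
            fn += 1  # predict negative, but is positive (Type II error)
        else:
            tn += 1  # predict negative, and is negative
    return tp, fp, fn, tn


# Made-up labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(confusion_counts(y_true, y_pred))  # (3, 1, 1, 3)
```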
Most popular Classification metrics

| Name | Description | Interpretation |
|---|---|---|
| Accuracy | $\frac{\text{true positives} + \text{true negatives}}{\text{all data}}$ | The proportion of predictions that exactly match the real y. Best is 1.0 |
| Misclassification Rate | $1 - \text{Accuracy}$ OR $\frac{\text{false positives} + \text{false negatives}}{\text{all data}}$ | The proportion of predictions that do not match the real y. Best is 0 |
| Precision | $\frac{\text{true positives}}{\text{true positives} + \text{false positives}}$ | How many selected items are relevant? Best is 1.0 |
| Recall (AKA sensitivity, hit rate) | $\frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$ | How many relevant items are selected? Best is 1.0 |
| F-score | A measure that balances precision and recall. $F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$ | To balance precision and recall, use the F-1 score |
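The metrics above follow mechanically from the four confusion-matrix counts. Here is a minimal sketch, assuming the counts are already available; the function name `classification_metrics`, the example counts, and the choice to return 0.0 when a denominator is zero are assumptions made for illustration, not something the slides specify.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the metrics from confusion-matrix counts. Hypothetical helper."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    misclassification_rate = (fp + fn) / total  # equals 1 - accuracy
    # Assumption: report 0.0 when a ratio is undefined (zero denominator).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "misclassification_rate": misclassification_rate,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up counts: 40 TP, 10 FP, 20 FN, 30 TN (100 examples in total).
for name, value in classification_metrics(40, 10, 20, 30).items():
    print(f"{name}: {value:.3f}")
```

With these counts, accuracy is 0.7, precision is 0.8, recall is about 0.667, and the F-1 score (about 0.727) falls between precision and recall.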
Predictions and errors
$\text{Precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$
How many selected items are relevant?
When it predicts yes, how often is it correct?

$\text{Recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$
How many relevant items are selected?
When it's actually yes, how often does it predict yes?

https://en.wikipedia.org/wiki/Precision_and_recall
Consider this example
The mafia syndicate makes sure to get the
right person for the family.
When in doubt, reject.
Only accept when absolutely sure.
Does the Don look for high precision or high recall?
Class 1: the Don's family
Class 0: not in the family
Willing to err on the False Negative side

$\text{Precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$
$\text{Recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$

So: precision OR recall?
When False Negative is high → False Positive is low
Best? False Negative is ALL of the misclassification rate → False Positive is 0

$\text{misclassification rate} = \frac{\text{false positives} + \text{false negatives}}{\text{all data}}$
Therefore → It is about PRECISION!

$\text{Precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$
How many selected items are relevant?
When it predicts yes, how often is it correct?
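The "when in doubt, reject" rule can be illustrated with a score threshold: accepting only candidates with a very high score drives false positives toward 0 (high precision) at the cost of more false negatives (lower recall). This is a minimal sketch; the scores, labels, thresholds, and the helper name `precision_recall_at` are all hypothetical and chosen only for illustration.

```python
def precision_recall_at(scores, y_true, threshold):
    """Return (precision, recall, FP, FN) when predicting 1 for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, y_true) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, y_true) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, y_true) if s < threshold and y == 1)
    # Assumption: report 0.0 when a ratio is undefined (zero denominator).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, fp, fn


# Made-up "trust" scores; 1 = belongs in the family, 0 = does not.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
y_true = [1, 1, 0, 1, 1, 0, 1, 0]

# Lenient rule: accept at 0.5 or above.
print(precision_recall_at(scores, y_true, 0.5))   # (0.8, 0.8, 1, 1)
# Strict rule ("only accept when absolutely sure"): accept at 0.85 or above.
print(precision_recall_at(scores, y_true, 0.85))  # (1.0, 0.4, 0, 3)
```

Under the strict rule every error is a false negative and precision reaches 1.0, which matches the trade-off the Don is willing to make.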
Consider another example
You want to marry → you go to purchase a $$$ ring.
The credit card company rejects your purchase, saying it "might be" fraud, and calls you to verify.
Does the credit card company look for high precision or high recall?
Class 1: fraud
Class 0: not fraud
Willing to err on the False Positive side

$\text{Precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$
$\text{Recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$

So: precision OR recall?
When False Positive is high → False Negative is low
Best? False Positive is ALL of the misclassification rate → False Negative is 0

$\text{misclassification rate} = \frac{\text{false positives} + \text{false negatives}}{\text{all data}}$
Therefore → It is about RECALL!

$\text{Recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$
How many relevant items are selected?
When it's actually yes, how often does it predict yes?
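The fraud example runs the trade-off in the opposite direction: flagging transactions at a lower score threshold catches every fraud (recall of 1.0) but also flags some legitimate purchases, such as the ring. This is a minimal sketch; the fraud scores, labels, thresholds, and the helper name `flag_counts` are hypothetical and do not describe how any real card issuer works.

```python
def flag_counts(scores, is_fraud, threshold):
    """Return (TP, FP, FN) when flagging every transaction with score >= threshold."""
    tp = fp = fn = 0
    for score, fraud in zip(scores, is_fraud):
        flagged = score >= threshold
        if flagged and fraud:
            tp += 1  # fraud caught
        elif flagged and not fraud:
            fp += 1  # legitimate purchase flagged (the ring)
        elif fraud:
            fn += 1  # fraud missed
    return tp, fp, fn


# Made-up fraud scores; 1 = fraud, 0 = not fraud.
scores = [0.92, 0.75, 0.55, 0.35, 0.30, 0.20, 0.15, 0.05]
is_fraud = [1, 1, 0, 1, 0, 0, 0, 0]

for threshold in (0.8, 0.5, 0.3):
    tp, fp, fn = flag_counts(scores, is_fraud, threshold)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    print(f"threshold={threshold}: recall={recall:.2f}, "
          f"precision={precision:.2f}, FP={fp}, FN={fn}")
```

At the lowest threshold in this sketch, no fraud is missed (FN = 0) and all errors are false positives, which is the side the card company is willing to err on.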
