Model Evaluation Matrix: Confusion Matrix, F1 Score, ROC curve AUC

Data Science
Model Performance Metrics
(F1 score, AUC, Confusion Matrix)

F1-Score
The F1 score is a metric used to evaluate the performance of a classification
model, especially when dealing with imbalanced classes. It's the harmonic
mean of precision and recall, providing a balance between the two.
The formula for the F1 score is:
F1 = 2 x (Precision x Recall) / (Precision + Recall)
F1 score could be an effective evaluation metric when FP and FN are
equally costly.

Example
Suppose we have a binary classification problem where we want to predict
whether emails are spam (positive class) or not spam (negative class).
• True Positives (TP) = 90
• False Positives (FP) = 10
• False Negatives (FN) = 15
• True Negatives (TN) = 885
• Precision = TP / (TP + FP) = 90 / (90 + 10) = 0.9
• Recall = TP / (TP + FN) = 90 / (90 + 15) = 0.857
• F1 Score = 2 x (0.9 x 0.857) / (0.9 + 0.857) = 0.878
The F1 score for this classification model is approximately 0.878. It
provides a single metric that considers both precision and recall, making it
useful for evaluating the model's overall performance, especially in
scenarios with imbalanced classes.
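
The arithmetic in the spam example can be reproduced with a few lines of Python. This is a minimal sketch: the function name precision_recall_f1 is a hypothetical helper introduced here for illustration, and the zero-division guards are an assumption about how empty denominators should be handled.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f1) from raw confusion-matrix counts."""
    # Guard against empty denominators (assumed convention: report 0.0).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the spam example: TP = 90, FP = 10, FN = 15.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=15)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# prints: precision=0.900 recall=0.857 f1=0.878
```

Note that TN (885) does not appear anywhere in the calculation, which is why the F1 score is not inflated by a large negative class.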

ROC Curve and AUC
The Receiver Operating Characteristic (ROC) curve and the Area Under the
ROC Curve (AUC) are widely used evaluation metrics for binary
classification models.
They are particularly useful when dealing with imbalanced datasets or when
the cost of false positives and false negatives varies.

ROC curve
• The ROC curve is a graphical representation of the trade-off between the
true positive rate (sensitivity) and the false positive rate (1 - specificity)
for different threshold values.
• The true positive rate (TPR) is the ratio of true positive predictions to the
total actual positive instances in the dataset. It represents the model's
ability to correctly identify positive instances.
• The false positive rate (FPR) is the ratio of false positive predictions to
the total actual negative instances in the dataset. It represents the model's
tendency to incorrectly identify negative instances as positive.

• The ROC curve plots the TPR against the FPR as the discrimination
threshold is varied from 0 to 1. Each point on the curve represents a
different threshold, and the curve illustrates how the model's performance
changes across various threshold values.
• A diagonal line from the bottom-left corner to the top-right corner
represents random guessing (an ineffective model). A good model's ROC
curve will be closer to the top-left corner, indicating high TPR and low
FPR across different thresholds.
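
The threshold sweep described above can be sketched in plain Python. The labels and scores below are made-up toy values, roc_points is a hypothetical helper name, and the sketch assumes that both classes are present and that a score at or above the threshold counts as a positive prediction.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per threshold, strictest threshold first."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Start above the highest score so the first point is (0, 0).
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Toy data (illustrative only): 1 = positive, 0 = negative.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold can only add predicted positives, so both rates are non-decreasing and the curve runs from (0, 0) to (1, 1).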

AUC (Area under the ROC curve)
• The AUC quantifies the overall performance of a classification model by
calculating the area under the ROC curve.
• A perfect classifier would have an AUC of 1, indicating that it achieves a
TPR of 1 (identifies all positives correctly) while maintaining an FPR of
0 (makes no false positive predictions).
• A random classifier would have an AUC of 0.5, as the ROC curve would
be a diagonal line from (0,0) to (1,1).
• The AUC provides a single scalar value that summarizes the model's
performance across all possible classification thresholds. Higher AUC
values indicate better overall performance, with values closer to 1
indicating better discrimination between positive and negative instances.

• As a rough rule of thumb, an AUC above 0.8 is considered good and an AUC
above 0.9 is considered excellent, while an AUC below 0.7 might indicate
poor discriminatory power. What counts as acceptable depends on the problem.
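
One common way to compute the area is the trapezoidal rule over the (FPR, TPR) points of the curve. The sketch below assumes the points are already available as a list of pairs; auc_trapezoid is a hypothetical helper name, and the two curves are the idealised perfect and random classifiers described above.

```python
def auc_trapezoid(points):
    """Area under a curve given as (FPR, TPR) points, via the trapezoidal rule."""
    pts = sorted(points)  # order by FPR, then by TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Width of the strip times the average of its two heights.
        area += (x1 - x0) * (y0 + y1) / 2
    return area


perfect = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # hugs the top-left corner
random_guess = [(0.0, 0.0), (1.0, 1.0)]          # the diagonal
print(auc_trapezoid(perfect))       # 1.0
print(auc_trapezoid(random_guess))  # 0.5
```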

Confusion Matrix
A confusion matrix is a table that is often used to evaluate the performance
of a classification model. It provides a comprehensive summary of the
model's predictions compared to the actual outcomes in a tabular format.
Each row of the matrix represents the instances in a predicted class, while
each column represents the instances in an actual class.
                      Actual Positive    Actual Negative
Predicted Positive    True Positive      False Positive
Predicted Negative    False Negative     True Negative

Components of confusion matrix
True Positives (TP): These are the cases where the model correctly predicts
the positive class. For example, in a medical diagnosis scenario, TP would
represent the number of patients correctly diagnosed with a disease.
False Positives (FP): These are the cases where the model incorrectly
predicts the positive class when it's actually negative. In medical terms, FP
would represent healthy patients incorrectly diagnosed with a disease.
False Negatives (FN): These are the cases where the model
incorrectly predicts the negative class when it's actually positive. In
medical terms, FN would represent patients with a disease incorrectly
classified as healthy.

True Negatives (TN): These are the cases where the model correctly
predicts the negative class. For example, in a medical diagnosis scenario,
TN would represent the number of healthy patients correctly identified as
such.
The confusion matrix is a valuable tool for understanding the strengths and
weaknesses of a classification model, particularly in scenarios with
imbalanced classes or when certain types of errors (e.g., false positives or
false negatives) are more costly or critical than others.
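
The four components can be counted directly from paired actual and predicted labels. This is a minimal sketch with made-up labels; confusion_counts is a hypothetical helper name, and the printed layout follows the convention used here (rows are predicted classes, columns are actual classes).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


# Toy labels (illustrative only): 1 = positive, 0 = negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)

print("                    Actual Pos  Actual Neg")
print(f"Predicted Positive  {tp:>10}  {fp:>10}")
print(f"Predicted Negative  {fn:>10}  {tn:>10}")
```

Some libraries use the transposed layout. scikit-learn's confusion_matrix, for example, puts actual classes on rows and predicted classes on columns, so it is worth checking the orientation before reading off FP and FN.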

Evaluating models for imbalanced datasets
When dealing with imbalanced datasets in classification tasks, where the
number of instances in one class significantly outweighs the other, standard
evaluation metrics like accuracy can be misleading.
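
A small illustration of why accuracy misleads: a model that always predicts the majority class. The class sizes below reuse the proportions of the earlier spam example (105 positives, 895 negatives); the always-negative "model" is a deliberately trivial assumption, not a real classifier.

```python
# 105 actual positives and 895 actual negatives.
y_true = [1] * 105 + [0] * 895
# A trivial baseline that labels every instance as negative.
y_pred = [0] * len(y_true)

correct = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
accuracy = correct / len(y_true)

tp = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == 1 and predicted == 1)
recall = tp / sum(y_true)

print(f"accuracy={accuracy:.3f} recall={recall:.3f}")
# prints: accuracy=0.895 recall=0.000
```

The baseline reaches 89.5% accuracy without finding a single positive instance, which only recall (and therefore F1) reveals.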
Strategies for effectively evaluating models trained on imbalanced datasets:
• Confusion Matrix
• Precision, Recall and F1 Score
• ROC curve and AUC
• Ensemble methods
• Resampling techniques: oversampling the minority class or undersampling
the majority class to balance the training data. The test set should keep
the original class distribution so that the evaluation reflects real data.
• Stratified sampling: use it when splitting the dataset into training and
testing sets to ensure that the class distribution remains consistent
across both sets.
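
A stratified split can be sketched with the standard library alone. The helper name stratified_split, the 20% test fraction and the fixed seed are all assumptions made for illustration; in practice a library routine such as scikit-learn's train_test_split with its stratify argument is the usual choice.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test so each class keeps its share in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


# Toy imbalanced dataset: 10% positives, 90% negatives.
labels = [1] * 100 + [0] * 900
train_idx, test_idx = stratified_split(labels)

test_positive_rate = sum(labels[i] for i in test_idx) / len(test_idx)
train_positive_rate = sum(labels[i] for i in train_idx) / len(train_idx)
print(len(train_idx), len(test_idx), train_positive_rate, test_positive_rate)
# prints: 800 200 0.1 0.1
```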
