1. Logistic Regression
Classification - Evaluation Metrics - Naive Bayes
Dr. D. Harimurugan
Department of Electrical Engineering
Dr B R Ambedkar National Institute of Technology
Jalandhar
2. Logistic regression
Multiclass classification
Evaluation Metrics
Naive Bayes Algorithm
Introduction
Sigmoid function
Decision Boundary
Cost Function
Logistic regression
A classification algorithm that assigns each data point to a class
The output is a categorical variable (0/1)
ML Dr. D. Harimurugan, EE - NITJ
Binary classification (0/1)
Multiclass classification (0, 1, 2)
The output is a probability value (0 to 1), which gives the probability that a data point belongs to a particular class
$h_\theta(x) \ge 0.5 \Rightarrow$ Class 1
$h_\theta(x) < 0.5 \Rightarrow$ Class 0
0.5 is the threshold value (user defined).
Linear regression for classification with outliers
Problem of outliers
$-\infty < h < \infty$ (threshold selection is a problem)
To overcome these problems, we use the sigmoid function
The value of h then varies between 0 and 1
An S-curve is used for fitting in logistic regression
S-curve: Sigmoid function
The S-curve represents the probability value (low probability for one class and high probability for the other class).
The sigmoid curve gives a wide gap in probability between the two classes (steepness and closeness).
Sigmoid function or logistic function
$$ g(z) = \frac{1}{1 + e^{-z}} $$
$$ \lim_{z \to \infty} g(z) = 1, \qquad \lim_{z \to -\infty} g(z) = 0 $$
$h(x)$ represents the estimated probability that the data point belongs to one class
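The sigmoid can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code; the function name `sigmoid` and the sample values of z are assumptions chosen for the demonstration.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function g(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z use the algebraically equivalent form e^z / (1 + e^z).
    ez = math.exp(z)
    return ez / (1.0 + ez)

# g(z) approaches 0 for large negative z and 1 for large positive z.
for z in (-10, -1, 0, 1, 10):
    print(z, round(sigmoid(z), 4))
```

Splitting on the sign of z is a design choice that keeps the exponent non-positive, so the exponential cannot overflow.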
Sigmoid function for logistic regression
$$ h_\theta(x) = g(X\theta) = \frac{1}{1 + e^{-X\theta}} $$
With $z = X\theta$:
$z \ge 0 \Rightarrow g(z) \ge 0.5 \Rightarrow h_\theta(x) \ge 0.5 \Rightarrow$ Class 1
$z < 0 \Rightarrow g(z) < 0.5 \Rightarrow h_\theta(x) < 0.5 \Rightarrow$ Class 0
$X\theta \ge 0 \Rightarrow g(X\theta) \ge 0.5 \Rightarrow$ Class 1
$X\theta < 0 \Rightarrow g(X\theta) < 0.5 \Rightarrow$ Class 0
Predicting whether y belongs to class 1 or class 0 is therefore equivalent to checking whether $X\theta$ is greater than or less than zero.
Based on the value of h, we divide the dataset into classes; the boundary between them is called the “decision boundary”.
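The equivalence between thresholding the probability at 0.5 and checking the sign of the linear term can be checked with a short sketch. The parameter vector `theta` and the helper names `hypothesis` and `predict_class` are hypothetical, chosen only for illustration.

```python
import math

def hypothesis(x, theta):
    """h_theta(x) = g(x . theta); x carries a leading 1 for the bias term."""
    z = sum(xi * ti for xi, ti in zip(x, theta))
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(x, theta, threshold=0.5):
    """Class 1 if the predicted probability reaches the threshold, else class 0."""
    return 1 if hypothesis(x, theta) >= threshold else 0

theta = [-1.0, 2.0]  # hypothetical parameters [theta0, theta1]
for x1 in (0.0, 0.5, 2.0):
    x = [1.0, x1]
    z = theta[0] + theta[1] * x1
    # The thresholded class and the sign test on z agree for every point.
    print(x1, round(hypothesis(x, theta), 3), predict_class(x, theta), int(z >= 0))
```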
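Decision Boundary
$$ h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2) $$
Find the equation of the line which separates the two classes
$$ x_1 + x_2 = 4 \quad \Leftrightarrow \quad x_1 + x_2 - 4 = 0 $$
$$ \theta = \begin{bmatrix} -4 \\ 1 \\ 1 \end{bmatrix} $$
Predict $y = 1$ if $x_1 + x_2 \ge 4$
Predict $y = 0$ if $x_1 + x_2 < 4$
$h_\theta(x) = g(x_1 + x_2 - 4)$, so $h_\theta(x) = 0.5$ exactly on the line $x_1 + x_2 = 4$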
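As a quick check of this example, the sketch below evaluates the hypothesis with θ = (−4, 1, 1) at three points: one below the line, one on it, and one above it. The test points and the names `THETA` and `h` are assumptions made for the illustration.

```python
import math

THETA = [-4.0, 1.0, 1.0]  # [theta0, theta1, theta2] from the example above

def h(x1, x2):
    """h_theta(x) = g(theta0 + theta1*x1 + theta2*x2)."""
    z = THETA[0] + THETA[1] * x1 + THETA[2] * x2
    return 1.0 / (1.0 + math.exp(-z))

for x1, x2 in [(1, 1), (2, 2), (3, 3)]:
    label = 1 if x1 + x2 >= 4 else 0
    print((x1, x2), round(h(x1, x2), 3), label)
```

The point (2, 2) lies on the boundary, so its probability is 0.5 and it is assigned to class 1 by the ≥ rule.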
The decision boundary is a property of the hypothesis and its parameters, not of the data set
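Nonlinear Decision Boundary
$$ h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_2^2) $$
The decision boundary is
$$ x_1^2 + x_2^2 = 1 $$
$$ x_1^2 + x_2^2 - 1 \ge 0 \Rightarrow y = 1 $$
$$ x_1^2 + x_2^2 - 1 < 0 \Rightarrow y = 0 $$
$$ \theta = \begin{bmatrix} -1 \\ 0 \\ 0 \\ 1 \\ 1 \end{bmatrix} $$
$$ h_\theta(x) = g(x_1^2 + x_2^2 - 1) $$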
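The circular boundary can be reproduced by adding squared features to the input. The sketch below uses θ = (−1, 0, 0, 1, 1) from the example; the helper names `features`, `h` and `predict` and the test points are assumptions for illustration.

```python
import math

THETA = [-1.0, 0.0, 0.0, 1.0, 1.0]  # [theta0 .. theta4] from the example above

def features(x1, x2):
    """Feature vector [1, x1, x2, x1^2, x2^2]."""
    return [1.0, x1, x2, x1 ** 2, x2 ** 2]

def h(x1, x2):
    z = sum(t * f for t, f in zip(THETA, features(x1, x2)))
    return 1.0 / (1.0 + math.exp(-z))

def predict(x1, x2):
    """y = 1 on or outside the unit circle, y = 0 inside it."""
    return 1 if h(x1, x2) >= 0.5 else 0

for point in [(0.0, 0.0), (0.5, 0.5), (1.0, 0.0), (2.0, 2.0)]:
    print(point, predict(*point))
```

The model is still linear in θ; the boundary is curved only because the features are nonlinear in x.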
Logistic regression cost function
P1 to P4 should have low probability
P5 to P8 should have high probability
Minimizing P4 is equivalent to maximizing (1 − P4)
The function to maximize is
$$ \text{Product} = (1 - P_1)(1 - P_2)(1 - P_3)(1 - P_4) P_5 P_6 P_7 P_8 $$
Maximizing it is equivalent to minimizing its negative
$$ \min J = -\left[ (1 - P_1)(1 - P_2)(1 - P_3)(1 - P_4) P_5 P_6 P_7 P_8 \right] $$
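A small numeric sketch may make the product concrete. The probability values below are hypothetical (the slide's figure is not available); the labels follow the slide, with P1 to P4 in class 0 and P5 to P8 in class 1.

```python
# Labels for P1..P8: the first four points are class 0, the last four class 1.
labels = [0, 0, 0, 0, 1, 1, 1, 1]

def likelihood(probs, labels):
    """Product of (1 - P) for class-0 points and P for class-1 points."""
    prod = 1.0
    for p, y in zip(probs, labels):
        prod *= p if y == 1 else (1.0 - p)
    return prod

good_model = [0.1, 0.2, 0.1, 0.3, 0.8, 0.9, 0.7, 0.9]  # hypothetical predictions
poor_model = [0.5] * 8                                  # uninformative predictions

J_good = -likelihood(good_model, labels)
J_poor = -likelihood(poor_model, labels)
print(J_good, J_poor)  # the better model has the smaller (more negative) J
```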
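Linear regression:
$$ J = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} \left( h(x^{(i)}) - y^{(i)} \right)^2 $$
Logistic regression:
$$ J = \frac{1}{m} \sum_{i=1}^{m} \mathrm{cost}\left( h_\theta(x^{(i)}), y^{(i)} \right) $$
$$ \mathrm{cost}\left( h_\theta(x^{(i)}), y^{(i)} \right) = \begin{cases} -h_\theta(x^{(i)}) & \text{if } y^{(i)} = 1 \\ -\left( 1 - h_\theta(x^{(i)}) \right) & \text{if } y^{(i)} = 0 \end{cases} $$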
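Taking the logarithm of each term (which turns the product of probabilities into a sum), the cost becomes
$$ \mathrm{cost}\left( h_\theta(x^{(i)}), y^{(i)} \right) = \begin{cases} -\log\left( h_\theta(x^{(i)}) \right) & \text{if } y^{(i)} = 1 \\ -\log\left( 1 - h_\theta(x^{(i)}) \right) & \text{if } y^{(i)} = 0 \end{cases} $$
The above cost can be written as a single expression
$$ \mathrm{cost}(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x)) $$
For $y = 1$:
$$ \mathrm{cost}(h_\theta(x), y) = -1 \cdot \log(h_\theta(x)) - (1 - 1) \log(1 - h_\theta(x)) = -\log(h_\theta(x)) $$
For $y = 0$:
$$ \mathrm{cost}(h_\theta(x), y) = -0 \cdot \log(h_\theta(x)) - (1 - 0) \log(1 - h_\theta(x)) = -\log(1 - h_\theta(x)) $$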
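The cost function for logistic regression is
$$ J = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\left( h_\theta(x^{(i)}) \right) + \left( 1 - y^{(i)} \right) \log\left( 1 - h_\theta(x^{(i)}) \right) \right] $$
Goal ⇒ find the value of θ which gives the minimum value of J
The output for a new value of x is given by
$$ h_\theta(x) = \frac{1}{1 + e^{-X\theta}} $$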
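The cost J can be evaluated directly for a candidate θ. The sketch below uses a hypothetical four-point dataset and two hypothetical parameter vectors; it only computes J and does not implement the minimization itself.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(theta, X, y):
    """J = -(1/m) * sum[ y*log(h) + (1 - y)*log(1 - h) ] with h = g(x . theta)."""
    m = len(y)
    total = 0.0
    for x_i, y_i in zip(X, y):
        h = sigmoid(sum(a * b for a, b in zip(x_i, theta)))
        total += y_i * math.log(h) + (1 - y_i) * math.log(1.0 - h)
    return -total / m

# Hypothetical data: each row is [1, x1]; the leading 1 multiplies theta0.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 1]

print(cost([0.0, 0.0], X, y))   # every h is 0.5, so J = log(2)
print(cost([-4.0, 2.0], X, y))  # boundary at x1 = 2 separates the classes, lower J
```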
Logistic regression: Regularization
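The cost function for logistic regression with regularization is
$$ J = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\left( h_\theta(x^{(i)}) \right) + \left( 1 - y^{(i)} \right) \log\left( 1 - h_\theta(x^{(i)}) \right) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2 $$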
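The effect of the penalty term can be seen in a short sketch. The function name `regularized_cost` and the numbers are hypothetical; following the sum from j = 1 to n in the formula, the bias term θ0 is left out of the penalty.

```python
import math

def regularized_cost(h_values, y, theta, lam):
    """Log-loss data term plus (lambda / 2m) * sum of theta_j^2 for j >= 1."""
    m = len(y)
    data_term = -sum(
        yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
        for h, yi in zip(h_values, y)
    ) / m
    penalty = lam / (2 * m) * sum(t * t for t in theta[1:])  # theta[0] excluded
    return data_term + penalty

h_values = [0.5, 0.5]     # hypothetical predicted probabilities
y = [0, 1]
theta = [5.0, 3.0, 4.0]   # hypothetical parameters [theta0, theta1, theta2]

print(regularized_cost(h_values, y, theta, lam=0.0))  # data term only
print(regularized_cost(h_values, y, theta, lam=2.0))  # data term + penalty
```

A larger λ pushes the minimizer toward smaller θ values.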
Multiclass classification: One Vs All
Train one binary classifier per class, find the probability given by each model, and assign the test point to the class whose model gives the highest probability
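The one-vs-all decision rule can be sketched as follows. The three parameter vectors in `models` are hypothetical stand-ins for already trained binary classifiers, and the function name `predict_one_vs_all` is an assumption.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters [theta0, theta1, theta2], one binary model per class.
models = {
    0: [1.0, -1.0, -1.0],
    1: [-1.0, 1.0, -1.0],
    2: [-1.0, -1.0, 1.0],
}

def predict_one_vs_all(x1, x2):
    """Return the class whose model gives the highest probability, plus all scores."""
    scores = {
        c: sigmoid(t[0] + t[1] * x1 + t[2] * x2) for c, t in models.items()
    }
    return max(scores, key=scores.get), scores

for point in [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]:
    label, scores = predict_one_vs_all(*point)
    print(point, label, {c: round(s, 3) for c, s in scores.items()})
```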
Evaluation metrics for classification
Accuracy
Confusion matrix
Precision and recall
F1-score
AUC-ROC
Log loss
Gini coefficient
Evaluation metric: Accuracy
Accuracy indicates the percentage of predictions the model got right
$$ \text{Accuracy} = \frac{\text{Correct predictions}}{\text{Total predictions}} $$
Accuracy is misleading with skewed classes: a model that always predicts the majority class still scores high
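The problem with skewed classes shows up in a tiny example. The 95/5 class split below is hypothetical, as is the function name `accuracy`.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical skewed dataset: 95 negatives and 5 positives.
y_true = [0] * 95 + [1] * 5
always_negative = [0] * 100  # a "model" that never predicts the positive class

print(accuracy(y_true, always_negative))  # 0.95, yet no positive is ever found
```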
Evaluation metric : Confusion matrix
$$ \text{Precision} = \frac{\text{True positives}}{\text{Predicted positives}} = \frac{\text{True positives}}{\text{True positives} + \text{False positives}} $$
Evaluation metric : Recall
$$ \text{Recall} = \frac{\text{True positives}}{\text{Actual positives}} = \frac{\text{True positives}}{\text{True positives} + \text{False negatives}} $$
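Precision and recall both follow from the four confusion-matrix counts. The sketch below uses a hypothetical set of ten labels and predictions; the function names are assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical labels and predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn, precision(tp, fp), recall(tp, fn))
```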
Evaluation Metric: Precision and Recall
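Evaluation metric: F1 score
The F1 score is used as a trade-off between precision and recall
The F1 score is the harmonic mean of precision and recall
$$ F_1 = 2 \, \frac{P \cdot R}{P + R} $$
To give more importance to precision or to recall, use the weighted form ($\beta > 1$ favors recall, $\beta < 1$ favors precision)
$$ F_\beta = (1 + \beta^2) \, \frac{P \cdot R}{\beta^2 P + R} $$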
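The weighted score can be sketched in one function, with β = 1 giving the plain F1 score. The precision and recall values are hypothetical and the function name `f_beta` is an assumption.

```python
def f_beta(p, r, beta=1.0):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R); beta = 1 gives F1."""
    b2 = beta * beta
    denom = b2 * p + r
    return (1.0 + b2) * p * r / denom if denom else 0.0

p, r = 0.5, 1.0  # hypothetical precision and recall
print(f_beta(p, r))            # F1
print(f_beta(p, r, beta=2.0))  # weights recall more heavily
print(f_beta(p, r, beta=0.5))  # weights precision more heavily
```

Because recall is the higher of the two values here, the score rises as β grows.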
Evaluation metric: AUC-ROC
ROC stands for “Receiver Operating Characteristic”, a term that comes from signals and systems, where it was used for distinguishing ’noise’ from ’not noise’
The ROC curve plots the true positive rate against the false positive rate
It shows the trade-off between true positives and false positives
With both axes running from 0 to 1, a completely random prediction gives the diagonal straight line (AUC = 0.5)
For a model better than a random one, the AUC is greater than 0.5.
The larger the area under the curve, the better the model.
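The AUC values above can be checked with a small sketch. It relies on the fact that the area under the ROC curve equals the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half. The scores are hypothetical and the function name `auc` is an assumption.

```python
def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 1, 1]
print(auc(y_true, [0.1, 0.2, 0.8, 0.9]))   # perfect ranking
print(auc(y_true, [0.5, 0.5, 0.5, 0.5]))   # uninformative scores
print(auc(y_true, [0.1, 0.4, 0.35, 0.8]))  # one pair ranked wrongly
```

The pairwise form is quadratic in the number of samples, which is acceptable for an illustration but not for large datasets.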
The steeper the curve, the better the model!
The PR (precision-recall) curve is preferred over ROC if we have skewed classes.
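Evaluation metric: Log loss
AUC considers only the order of the predicted probabilities, not their values
Log loss is the negative average of the log of the predicted probability assigned to the true class of each instance
$$ \text{Log loss} = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\left( h_\theta(x^{(i)}) \right) + \left( 1 - y^{(i)} \right) \log\left( 1 - h_\theta(x^{(i)}) \right) \right] $$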
Evaluation metric: Gini coefficient
It is derived from the AUC-ROC curve
It is given by the area between the ROC curve and the diagonal line, divided by the area of the triangle above the diagonal
As a rule of thumb, a Gini above 60% indicates a good model
$$ \text{Gini coefficient} = 2 \, \text{AUC} - 1 $$
Naive Bayes Algorithm
A supervised algorithm based on Bayes' theorem, used for classification
Generative model
Main assumption: the features are independent of one another, given the class
$$ P(A|B) = \frac{P(B|A) \, P(A)}{P(B)} $$
P(A|B) is the posterior probability: the probability of hypothesis A given the observed event B.
P(B|A) is the likelihood: the probability of the evidence given that the hypothesis is true.
P(A) is the prior probability: the probability of the hypothesis before observing the evidence.
P(B) is the marginal probability: the probability of the evidence.
Naive Bayes Procedure
Convert the given dataset into frequency tables
Generate the likelihood table by finding the probabilities of the given features
Use Bayes' theorem to calculate the posterior probability
$$ P(y | x_1, x_2, \ldots, x_n) = \frac{P(x_1|y) \, P(x_2|y) \cdots P(x_n|y) \, P(y)}{P(x_1) \, P(x_2) \cdots P(x_n)} $$
Naive Bayes Algorithm
Consider a dataset of weather conditions with playing golf as the target variable
Find P(yes|today), where today = (sunny, hot, normal, false)
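Naive Bayes Algorithm: Calculation
$$ P(y | x_1, x_2, \ldots, x_n) = \frac{P(x_1|y) \, P(x_2|y) \cdots P(x_n|y) \, P(y)}{P(x_1) \, P(x_2) \cdots P(x_n)} $$
Find P(yes|today), where today = (sunny, hot, normal, false)
$$ P(\text{yes}|\text{today}) = \frac{P(\text{sunny}|\text{yes}) \, P(\text{hot}|\text{yes}) \, P(\text{normal}|\text{yes}) \, P(\text{false}|\text{yes}) \, P(\text{yes})}{P(\text{sunny}) \, P(\text{hot}) \, P(\text{normal}) \, P(\text{false})} $$
Find P(no|today), where today = (sunny, hot, normal, false)
$$ P(\text{no}|\text{today}) = \frac{P(\text{sunny}|\text{no}) \, P(\text{hot}|\text{no}) \, P(\text{normal}|\text{no}) \, P(\text{false}|\text{no}) \, P(\text{no})}{P(\text{sunny}) \, P(\text{hot}) \, P(\text{normal}) \, P(\text{false})} $$
The denominator is the same for both classes, so it is enough to compare the numerators:
$$ P(\text{yes}|\text{today}) \propto P(\text{yes}) \, P(\text{sunny}|\text{yes}) \, P(\text{hot}|\text{yes}) \, P(\text{normal}|\text{yes}) \, P(\text{false}|\text{yes}) $$
$$ P(\text{no}|\text{today}) \propto P(\text{no}) \, P(\text{sunny}|\text{no}) \, P(\text{hot}|\text{no}) \, P(\text{normal}|\text{no}) \, P(\text{false}|\text{no}) $$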
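$$ P(\text{sunny}|\text{yes}) = \frac{3}{9}, \quad P(\text{hot}|\text{yes}) = \frac{2}{9}, \quad P(\text{normal}|\text{yes}) = \frac{6}{9}, \quad P(\text{false}|\text{yes}) = \frac{6}{9}, \quad P(\text{yes}) = \frac{9}{14} $$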
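Naive Bayes Algorithm: Calculation
Find P(yes|today), where today = (sunny, hot, normal, false)
$$ P(\text{yes}|\text{today}) \propto P(\text{sunny}|\text{yes}) \, P(\text{hot}|\text{yes}) \, P(\text{normal}|\text{yes}) \, P(\text{false}|\text{yes}) \, P(\text{yes}) $$
$$ P(\text{yes}|\text{sunny, hot, normal, false}) \propto \frac{3}{9} \cdot \frac{2}{9} \cdot \frac{6}{9} \cdot \frac{6}{9} \cdot \frac{9}{14} \approx 0.0212 $$
$$ P(\text{no}|\text{sunny, hot, normal, false}) \propto \frac{2}{5} \cdot \frac{2}{5} \cdot \frac{1}{5} \cdot \frac{2}{5} \cdot \frac{5}{14} \approx 0.0046 $$
P(yes|today) > P(no|today)
Hence, the test data belongs to class “Yes”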
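The arithmetic of this example can be reproduced with exact fractions. The sketch below takes the likelihoods and priors exactly as listed above and multiplies them out; the dictionary layout and the names `likelihood`, `prior` and `score` are assumptions for illustration. The scores are unnormalized, because the common denominator is dropped.

```python
from fractions import Fraction as F

# Likelihoods of (sunny, hot, normal, false) per class, as listed above.
likelihood = {
    "yes": [F(3, 9), F(2, 9), F(6, 9), F(6, 9)],
    "no": [F(2, 5), F(2, 5), F(1, 5), F(2, 5)],
}
prior = {"yes": F(9, 14), "no": F(5, 14)}

def score(label):
    """Unnormalized posterior: prior times the product of the likelihoods."""
    s = prior[label]
    for p in likelihood[label]:
        s *= p
    return s

scores = {c: score(c) for c in prior}
prediction = max(scores, key=scores.get)

for c, s in scores.items():
    print(c, s, round(float(s), 4))
print("prediction:", prediction)
```

Dividing each score by the sum of both scores would turn them into probabilities that add up to 1, without changing which class wins.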
Naive Bayes Algorithm: Exercise
Classify a red, domestic SUV.
P(Yes|test) = 0.037, P(No|test) = 0.069, so the test point is classified as “No”
Generative vs. Discriminative models
In discriminative models, to find the probability we first assume a functional form for P(Y|x) and estimate its parameters from the training data
In generative models, to find the conditional probability P(Y|x) we first estimate the prior probability P(Y) and the likelihood P(x|Y) from the training data, and then use Bayes' theorem to calculate the posterior probability P(Y|x)
$$ P(Y|x) = \frac{P(x|Y) \, P(Y)}{P(x)} $$
Discriminative models make predictions on unseen data based on the conditional probability P(Y|x).
Generative models focus on the distribution of the dataset to return a probability.
Discriminative models are generally better than generative models when the data contain outliers.
Generative models such as Naive Bayes assume independence among the features.
Types of Naive Bayes algorithm
The most common variants are
Gaussian Naive Bayes
Multinomial Naive Bayes
Bernoulli Naive Bayes
END OF LOGISTIC REGRESSION