This slide deck covers logistic regression for binary and multiclass classification, evaluation metrics for classifiers, and the naive Bayes algorithm. The sigmoid function maps the model output to a probability between 0 and 1; the decision boundary is determined by whether that output is above or below 0.5; and a cost function is minimized during training to learn the parameters that produce accurate predictions.
logistic regression.pdf
1. Logistic Regression
Classification - Evaluation Metrics - Naive Bayes
Dr. D. Harimurugan
Department of Electrical Engineering
Dr B R Ambedkar National Institute of Technology
Jalandhar
2. Logistic regression
Outline: Logistic regression (Introduction, Sigmoid function, Decision Boundary, Cost Function), Multiclass classification, Evaluation Metrics, Naive Bayes Algorithm
Logistic regression is a classification algorithm that assigns data points to discrete classes
The output is a categorical variable (0/1)
3. Logistic regression
Binary classification (0/1)
Multiclass classification (0, 1, 2)
4. Logistic regression
Binary classification (0/1)
Multiclass classification (0, 1, 2)
The output is a probability value (0 to 1) giving the probability that a data point belongs to a particular class:
hθ(x) ≥ 0.5 ⇒ Class 1
hθ(x) < 0.5 ⇒ Class 0
0.5 is the threshold value (user defined).
11. Linear regression for classification with outliers
14. Linear regression for classification with outliers
Problem of outliers
−∞ ≤ h ≤ ∞ (threshold selection is a problem)
To overcome these problems, we use the sigmoid function
The value of h then varies between 0 and 1
The S-curve is used for fitting in logistic regression
20. S-curve: Sigmoid function
The S-curve represents the probability value (low probability for one class, high probability for the other class)
The sigmoid curve separates the classes well: its steepness keeps the predicted probabilities of the two classes far apart
22. Sigmoid function or logistic function
g(z) = 1 / (1 + e^(−z))
g(z)|z=∞ = 1, g(z)|z=−∞ = 0
h(x) represents the estimated probability that the data point belongs to one class
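To make the limits concrete, here is a minimal NumPy sketch of the sigmoid; the function name and sample values are illustrative, not from the slides:

import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# g(z) approaches 1 as z -> +inf and 0 as z -> -inf; g(0) = 0.5
for z in [-10.0, -1.0, 0.0, 1.0, 10.0]:
    print(f"g({z:+.1f}) = {sigmoid(z):.4f}")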
28. Sigmoid function for logistic regression
Hypothesis for logistic regression:
hθ(x) = g(X·θ) = 1 / (1 + e^(−X·θ))
z ≥ 0 ⇒ g(z) ≥ 0.5 ⇒ hθ(x) ≥ 0.5 ⇒ Class 1
z < 0 ⇒ g(z) < 0.5 ⇒ hθ(x) < 0.5 ⇒ Class 0
X·θ ≥ 0 ⇒ g(X·θ) ≥ 0.5 ⇒ Class 1
X·θ < 0 ⇒ g(X·θ) < 0.5 ⇒ Class 0
Predicting whether y belongs to class 1 or class 0 is equivalent to checking whether X·θ is greater than or less than zero.
Based on the value of h we divide the dataset into classes, and the boundary between them is called the “decision boundary”.
34. Decision Boundary
hθ(x) = g(θ0 + θ1·x1 + θ2·x2)
Find the equation of the line that separates the two classes:
x1 + x2 = 4, i.e. x1 + x2 − 4 = 0
θ = [−4, 1, 1]ᵀ
Predict y = 1 if x1 + x2 ≥ 4
Predict y = 0 if x1 + x2 < 4
On the boundary line x1 + x2 = 4, hθ(x) = g(0) = 0.5
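A small sketch of this worked example with θ = [−4, 1, 1]; the two test points are made up for demonstration:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-4.0, 1.0, 1.0])           # [theta0, theta1, theta2]
points = np.array([[1.0, 1.0], [3.0, 3.0]])  # x1 + x2 = 2 and 6

# Prepend the bias term x0 = 1, then h = g(X . theta)
X = np.hstack([np.ones((len(points), 1)), points])
h = sigmoid(X @ theta)
print(h)                        # [~0.12, ~0.88]
print((h >= 0.5).astype(int))   # [0, 1]: the boundary is x1 + x2 = 4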
35. Decision Boundary
The decision boundary is a property of the hypothesis and its parameters, not of the data set.
39. Non-linear Decision Boundary
hθ(x) = g(θ0 + θ1·x1 + θ2·x2 + θ3·x1² + θ4·x2²)
The decision boundary is
x1² + x2² = 1
x1² + x2² − 1 ≥ 0 ⇒ y = 1
x1² + x2² − 1 < 0 ⇒ y = 0
θ = [−1, 0, 0, 1, 1]ᵀ
hθ(x) = g(x1² + x2² − 1)
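A minimal sketch of this circular boundary using hand-built polynomial features; the sample points are illustrative:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([-1.0, 0.0, 0.0, 1.0, 1.0])  # [theta0 .. theta4]

def h(x1, x2):
    # Feature vector [1, x1, x2, x1^2, x2^2]
    features = np.array([1.0, x1, x2, x1**2, x2**2])
    return sigmoid(features @ theta)

print(h(0.0, 0.0))   # inside the unit circle  -> < 0.5 -> y = 0
print(h(2.0, 0.0))   # outside the unit circle -> > 0.5 -> y = 1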
45. Logistic regression cost function
P1 to P4 (the class-0 points) should have low probability
P5 to P8 (the class-1 points) should have high probability
Minimizing P4 is equivalent to maximizing (1 − P4)
The function to maximize is
Product = (1 − P1)(1 − P2)(1 − P3)(1 − P4)·P5·P6·P7·P8
Maximizing this is equivalent to minimizing the negative of the function:
min J = −[(1 − P1)(1 − P2)(1 − P3)(1 − P4)·P5·P6·P7·P8]
46. Logistic regression cost function
Linear regression ⇒ J = (1/m) Σ_{i=1}^{m} (1/2)·(h(x^(i)) − y^(i))²
Logistic regression ⇒ J = (1/m) Σ_{i=1}^{m} cost(hθ(x^(i)), y^(i))
cost(hθ(x^(i)), y^(i)) = −hθ(x) if y = 1; −(1 − hθ(x)) if y = 0
48. Logistic regression cost function
cost(hθ(x^(i)), y^(i)) = −hθ(x) if y = 1; −(1 − hθ(x)) if y = 0
Replacing the raw probabilities with their logarithms gives a convex cost that heavily penalizes confident wrong predictions:
cost(hθ(x^(i)), y^(i)) = −log(hθ(x)) if y = 1; −log(1 − hθ(x)) if y = 0
53. Logistic regression cost function
cost(hθ(x^(i)), y^(i)) = −log(hθ(x)) if y = 1; −log(1 − hθ(x)) if y = 0
The above cost can be written as a single expression:
cost(hθ(x), y) = −y·log(hθ(x)) − (1 − y)·log(1 − hθ(x))
y = 1: cost(hθ(x), y) = −1·log(hθ(x)) − (1 − 1)·log(1 − hθ(x)) = −log(hθ(x))
y = 0: cost(hθ(x), y) = −0·log(hθ(x)) − (1 − 0)·log(1 − hθ(x)) = −log(1 − hθ(x))
54. Logistic regression cost function
The cost function for logistic regression is
J = −(1/m) Σ_{i=1}^{m} [ y^(i)·log(hθ(x^(i))) + (1 − y^(i))·log(1 − hθ(x^(i))) ]
Goal ⇒ find the value of θ that gives the minimum value of J
The output for a new value of x is then given by
hθ(x) = 1 / (1 + e^(−X·θ))
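A compact sketch of minimizing J by batch gradient descent; the learning rate, iteration count, and toy data are assumptions for illustration, and the gradient (1/m)·Xᵀ(h − y) follows from differentiating J:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

# Toy data: class 1 roughly when x1 + x2 > 4 (bias column prepended)
X = np.array([[1, 1, 1], [1, 2, 1], [1, 3, 3], [1, 4, 3]], dtype=float)
y = np.array([0, 0, 1, 1], dtype=float)

theta = np.zeros(3)
alpha = 0.1                    # learning rate (assumed)
for _ in range(5000):          # fixed iteration budget (assumed)
    h = sigmoid(X @ theta)
    theta -= alpha * X.T @ (h - y) / len(y)

print(cost(theta, X, y))       # J decreases toward 0 on separable data
print(sigmoid(X @ theta).round(2))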
57. Logistic regression: Regularization
The cost function for logistic regression with regularization is
J = −(1/m) Σ_{i=1}^{m} [ y^(i)·log(hθ(x^(i))) + (1 − y^(i))·log(1 − hθ(x^(i))) ] + (λ/2m) Σ_{j=1}^{n} θj²
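As a sketch, the regularized cost adds the penalty term to the function above; λ is a user-chosen hyperparameter, and the bias θ0 is conventionally excluded from the penalty (an assumption consistent with the sum starting at j = 1):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_regularized(theta, X, y, lam):
    m = len(y)
    h = sigmoid(X @ theta)
    data_term = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    penalty = lam / (2 * m) * np.sum(theta[1:] ** 2)  # skip theta0
    return data_term + penalty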
59. Multiclass classification: One vs All
Train one binary logistic regression model per class; to classify a test point, find the probability from each model and assign the point to the class whose model gives the highest probability.
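A minimal one-vs-all sketch built on the gradient-descent trainer above; the helper names are mine, not the slides':

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_binary(X, y, alpha=0.1, iters=3000):
    """Fit one logistic regression by gradient descent."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        theta -= alpha * X.T @ (sigmoid(X @ theta) - y) / len(y)
    return theta

def one_vs_all(X, y, num_classes):
    # One binary classifier per class: class k vs. the rest
    return [train_binary(X, (y == k).astype(float)) for k in range(num_classes)]

def predict(thetas, X):
    # Pick the class whose model assigns the highest probability
    probs = np.column_stack([sigmoid(X @ t) for t in thetas])
    return probs.argmax(axis=1)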
60. Evaluation metrics for classification
Accuracy
Confusion matrix
Precision and recall
F1-score
AUC-ROC
Log loss
Gini coefficient
61. Evaluation metric: Accuracy
Accuracy indicates the percentage of predictions the model got correct:
Accuracy = Correct predictions / Total predictions
Accuracy is misleading when the classes are skewed.
63. Evaluation metric: Confusion matrix
64. Evaluation metric: Precision
Precision = True positives / Predicted positives = True positives / (True positives + False positives)
65. Evaluation metric: Recall
Recall = True positives / Actual positives = True positives / (True positives + False negatives)
66. Evaluation metric: Precision and Recall
69. Evaluation metric: F1 score
The F1 score is used as a trade-off between precision and recall
The F1 score is the harmonic mean of precision and recall:
F1 = 2·P·R / (P + R)
To give more importance to either precision or recall, use the Fβ score:
Fβ = (1 + β²)·P·R / (β²·P + R)
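A short sketch computing these metrics from raw label vectors; the toy labels are assumed for illustration:

import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives

accuracy = np.mean(y_true == y_pred)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(accuracy, precision, recall, f1)   # 0.75, 0.75, 0.75, 0.75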
72. Evaluation metric: AUC-ROC
ROC stands for "Receiver Operating Characteristic", a term from signals and systems, where it was used to distinguish 'noise' from 'not noise'
It is used as an evaluation metric relating the true positive rate and the false positive rate
It gives the trade-off between true positives and false positives
73. Evaluation metric: AUC-ROC
The maximum value is 1; a completely random prediction gives the straight diagonal line (AUC = 0.5)
For a model better than random, the AUC will be greater than 0.5
The more area under the curve, the better the model
74. Evaluation metric: AUC-ROC
The steeper the curve, the better the model!
75. The precision-recall (PR) curve is preferred over ROC when the classes are skewed.
76. Evaluation metric: Log loss
AUC considers only the ordering of the predicted probabilities, not their values
Log loss is the negative average of the log of the predicted probabilities over all instances:
Log loss = −(1/m) Σ_{i=1}^{m} [ y^(i)·log(hθ(x^(i))) + (1 − y^(i))·log(1 − hθ(x^(i))) ]
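A minimal log-loss computation; the predicted probabilities are made up for illustration:

import numpy as np

y_true = np.array([1, 0, 1, 1])
p_pred = np.array([0.9, 0.2, 0.7, 0.6])   # hypothetical model probabilities

log_loss = -np.mean(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))
print(log_loss)   # ~0.30: lower is better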
77. Evaluation metric: Gini coefficient
It is derived from the AUC-ROC curve
It is given by the area between the ROC curve and the diagonal line, divided by the area of the triangle above the diagonal
A Gini above 60% indicates a good model
Gini coefficient = 2·AUC − 1
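A small sketch computing AUC by its pairwise-ranking definition (the probability that a randomly chosen positive is scored above a randomly chosen negative) and the Gini coefficient from it; the scores are made up:

import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0])
scores = np.array([0.9, 0.4, 0.45, 0.8, 0.5, 0.3])

pos = scores[y_true == 1]
neg = scores[y_true == 0]
# AUC = P(score of random positive > score of random negative),
# counting ties as half
pairs = pos[:, None] - neg[None, :]
auc = ((pairs > 0).sum() + 0.5 * (pairs == 0).sum()) / pairs.size
gini = 2 * auc - 1
print(auc, gini)   # 0.889, 0.778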
79. Naive Bayes Algorithm
A supervised algorithm based on Bayes' theorem, used for classification
It is a generative model
Main assumption: each feature is independent of the others
P(A|B) = P(B|A)·P(A) / P(B)
P(A|B) is the posterior probability: the probability of hypothesis A given the observed event B
P(B|A) is the likelihood: the probability of the evidence given that hypothesis A is true
P(A) is the prior probability: the probability of the hypothesis before observing the evidence
P(B) is the marginal probability: the probability of the evidence
80. Naive Bayes Procedure
Convert the given dataset into frequency tables
Generate a likelihood table by finding the probabilities of each feature value
Use Bayes' theorem to calculate the posterior probability:
P(y|x1, x2, ..., xn) = P(x1|y)·P(x2|y)···P(xn|y)·P(y) / [P(x1)·P(x2)···P(xn)]
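A compact sketch of this procedure on a single categorical feature; the tiny dataset and helper names are invented for illustration, and the denominator is dropped since it is the same for every class:

from collections import Counter, defaultdict

# Toy rows: (outlook, play); the counts become the frequency tables
data = [("sunny", "no"), ("sunny", "no"), ("overcast", "yes"),
        ("rain", "yes"), ("sunny", "yes"), ("rain", "no")]

class_counts = Counter(label for _, label in data)
feat_counts = defaultdict(Counter)
for value, label in data:
    feat_counts[label][value] += 1

def posterior_score(value, label):
    """P(label) * P(value | label); the denominator P(value) is omitted."""
    prior = class_counts[label] / len(data)
    likelihood = feat_counts[label][value] / class_counts[label]
    return prior * likelihood

for label in class_counts:
    print(label, posterior_score("sunny", label))
# Predict the label with the highest score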
81. Naive Bayes Algorithm: Example
Consider a dataset of weather conditions with the target variable "playing golf"
Find P(yes|today); today = (sunny, hot, normal, false)
83. Naive Bayes Algorithm: Calculation
P(y|x1, x2, ..., xn) = P(x1|y)·P(x2|y)···P(xn|y)·P(y) / [P(x1)·P(x2)···P(xn)]
Find P(yes|today); today = (sunny, hot, normal, false):
= [P(sunny|yes)·P(hot|yes)·P(normal|yes)·P(false|yes)]·P(yes) / [P(sunny)·P(hot)·P(normal)·P(false)]
Find P(no|today); today = (sunny, hot, normal, false):
= [P(sunny|no)·P(hot|no)·P(normal|no)·P(false|no)]·P(no) / [P(sunny)·P(hot)·P(normal)·P(false)]
The denominator is the same for both classes, so it suffices to compare the numerators:
P(Y|t) ∝ P(yes)·P(sunny|yes)·P(hot|yes)·P(normal|yes)·P(false|yes)
P(N|t) ∝ P(no)·P(sunny|no)·P(hot|no)·P(normal|no)·P(false|no)
84. Naive Bayes Algorithm: Calculation
85. Naive Bayes Algorithm
P(sunny|yes) = 3/9
P(hot|yes) = 2/9
P(normal|yes) = 6/9
P(false|yes) = 6/9
P(yes) = 9/14
86. Naive Bayes Algorithm: Calculation
Find P(yes|today); today = (sunny, hot, normal, false)
∝ [P(sunny|yes)·P(hot|yes)·P(normal|yes)·P(false|yes)]·P(yes)
P(yes|sunny, hot, normal, false) ∝ (3/9)·(2/9)·(6/9)·(6/9)·(9/14) = 0.0211
P(no|sunny, hot, normal, false) ∝ (2/5)·(2/5)·(1/5)·(2/5)·(5/14) = 0.0024
P(yes|today) > P(no|today)
Hence, the test data belongs to class "Yes"
88. Naive Bayes Algorithm: Exercise
Classify a red, domestic SUV.
P(Yes|test) = 0.037, P(No|test) = 0.069 ⇒ the SUV is classified as "No"
90. Generative vs Discriminative models
For discriminative models, we first assume a functional form for P(Y|x) and estimate its parameters from the training data.
For generative models, to find the conditional probability P(Y|x) we first estimate the prior probability P(Y) and the likelihood P(x|Y) from the training data, then use Bayes' theorem to calculate the posterior probability:
P(Y|x) = P(x|Y)·P(Y) / P(x)
91. Generative model vs discriminative model
96. Generative vs Discriminative models
Discriminative models make predictions on unseen data based on the conditional probability.
Generative models focus on the distribution of the dataset to return a probability.
Discriminative models handle outliers better than generative models.
Generative models such as naive Bayes assume independence among the features.
97. Types of Naive Bayes algorithms
The most common variants are:
Gaussian Naive Bayes
Multinomial Naive Bayes
Bernoulli Naive Bayes
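For reference, a minimal sketch of the three variants using scikit-learn; the toy arrays are assumptions, and each variant suits a different feature type (continuous, count, and binary respectively):

import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

y = np.array([0, 0, 1, 1])

# Gaussian NB: continuous features
GaussianNB().fit(np.array([[1.2], [0.8], [3.1], [2.9]]), y)

# Multinomial NB: count features (e.g. word counts)
MultinomialNB().fit(np.array([[2, 0], [3, 1], [0, 4], [1, 5]]), y)

# Bernoulli NB: binary features
BernoulliNB().fit(np.array([[1, 0], [1, 1], [0, 1], [0, 0]]), y)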
END OF LOGISTIC REGRESSION