Logistic Regression
Dr. Marwa M. Emam
Faculty of Computers and Information
Minia University
Dr. Marwa M. Emam 1
Agenda
 Introduction to Logistic Regression
 Logistic Regression Basics
 Model Representation
 Training the Model
 Examples and Applications
Overview
Classification Problem
 A classification problem is a type of supervised machine learning
problem where the goal is to assign data points or instances to one
of several predefined categories or classes.
 In a classification problem, the target variable is categorical, and
the objective is to learn a model that can accurately predict the
class label of new, unseen data points based on their features.
Classification Problem…
Here are some key characteristics of a classification
problem:
 Categorical Target Variable:
 In a classification problem, the target variable, often referred to as
the "class label," is a categorical variable. It can represent
different classes or categories. For example, in a spam email
classification problem, the classes might be "spam" and "not spam."
Classification Problem…
Here are some key characteristics of a
classification problem:
 Training Data:
 The model is trained on a dataset where each data point is
associated with a known class label. The dataset includes
features (independent variables) that describe the
characteristics of the data points.
Classification Problem…
Here are some key characteristics of a
classification problem:
 Model Building: The goal is to build a predictive model that can
generalize from the training data to make accurate predictions for
new, unseen data points.
 This typically involves choosing an appropriate algorithm, learning
the relationships between features and class labels, and
estimating model parameters.
Classification Problem…
Here are some key characteristics of a
classification problem:
 Prediction: Once the model is trained, it is used to predict the
class labels of new data points based on their feature values.
The output of the classification model is a class label or a
probability distribution over class labels.
Classification Problem…
Here are some key characteristics of a
classification problem:
 Evaluation: Classification models are evaluated using
various performance metrics, such as accuracy, precision,
recall, F1-score, and the area under the receiver
operating characteristic curve (ROC-AUC). These metrics
help assess the model's ability to make correct
classifications and identify any trade-offs between
different performance aspects.
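The metrics listed above can all be computed directly from the counts in a confusion matrix; a minimal Python sketch, using made-up labels and predictions purely for illustration:

```python
# Hypothetical true labels and classifier predictions (illustrative only).
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# Confusion-matrix counts.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true negatives

accuracy = (tp + tn) / len(y_true)          # fraction of correct predictions
precision = tp / (tp + fp)                  # of predicted positives, fraction correct
recall = tp / (tp + fn)                     # of actual positives, fraction found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(accuracy, precision, recall, f1)
```

Precision and recall often trade off against each other, which is why the F1-score (their harmonic mean) is reported alongside them.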
Classification Problem…
Here are some key characteristics of a
classification problem:
 Applications: Classification problems are common in a wide range of
applications, including spam detection, sentiment analysis, image
recognition, medical diagnosis, fraud detection, and customer churn
prediction.
 Overall, classification problems are concerned with making informed
decisions based on data and assigning data points to the most
appropriate categories or classes. The choice of the classification
algorithm and the quality of the features used in the model are
critical factors in the success of a classification task.
Machine Learning Models
Logistic Regression
 Logistic regression is a statistical and machine learning technique
used for binary classification.
 It is a type of regression analysis that is well-suited for predicting
the probability of a binary outcome, such as yes/no, true/false,
1/0, or success/failure.
 Despite its name, logistic regression is a classification method
rather than a regression method.
Logistic Regression …
 The primary goal of logistic regression is to model the relationship
between a set of independent variables (features) and a binary
dependent variable (target) in such a way that it can predict the
probability of the binary outcome.
 The outcome is typically coded as 0 (for one class) and 1 (for the other
class).
 The logistic function (also known as the sigmoid function) plays a
crucial role in this process, as it transforms a linear combination of
the independent variables into probabilities between 0 and 1.
Logistic Regression …
 Supervised Learning:
 Logistic regression is a supervised learning algorithm, which means
it requires labeled training data to learn and make predictions.
 In supervised learning, the algorithm is provided with a dataset in
which each data point has an associated class label, indicating the
category to which it belongs. Logistic regression learns from this
labeled data and, using the patterns it discerns, can subsequently
make predictions for new, unlabeled data points.
Logistic Regression Objective
 The algorithm aims to assign data points to one of two classes by
estimating the probability of belonging to the "1" or "positive"
class.
Logistic Function / Sigmoid function
 To estimate these probabilities, logistic regression uses the
sigmoid function (also known as the logistic function).
 The sigmoid function maps the linear combination of input
features to a value between 0 and 1.
 This mapping ensures that the output represents a valid
probability.
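A minimal Python sketch of the sigmoid function described above:

```python
import math

def sigmoid(z):
    """Map any real number z to a value in the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Large negative inputs approach 0, large positive inputs approach 1,
# and z = 0 maps to exactly 0.5.
print(sigmoid(-5), sigmoid(0), sigmoid(5))
```

Because the output always lies strictly between 0 and 1, it can be interpreted as a valid probability.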
Logistic Regression Representation
 The logistic function is represented as:
 P(Y=1) = 1 / (1 + e^-(β0 + β1*X1 + β2*X2 + ... + βn*Xn))
• P(Y=1) is the probability that the dependent variable (Y) equals 1
(or belongs to the positive class).
• β0 is the intercept.
• β1, β2, ..., βn are the coefficients associated with the
independent variables X1, X2, ..., Xn.
 Logistic regression estimates the coefficients (β values) based
on the training data to make predictions. If the predicted
probability is greater than a predefined threshold (often 0.5),
the model assigns the data point to the positive class;
otherwise, it assigns it to the negative class.
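The formula and the thresholding rule above can be sketched in Python; the coefficients below are hypothetical stand-ins, not values estimated from real training data:

```python
import math

def predict_proba(x, beta0, betas):
    """P(Y=1) = 1 / (1 + e^-(beta0 + beta1*x1 + ... + betan*xn))."""
    z = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(x, beta0, betas, threshold=0.5):
    """Assign the positive class when the probability exceeds the threshold."""
    return 1 if predict_proba(x, beta0, betas) > threshold else 0

# Hypothetical coefficients (in practice, estimated from training data).
beta0, betas = -1.0, [0.8, 0.5]
x = [2.0, 1.0]                      # z = -1.0 + 0.8*2.0 + 0.5*1.0 = 1.1
print(predict_proba(x, beta0, betas), predict_class(x, beta0, betas))
```

Changing the threshold away from 0.5 trades precision against recall, which is one reason the model outputs a probability rather than only a label.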
Transforming Linearity into Probability:
 The sigmoid function's primary purpose is to transform the
linear combination of input features into a probability value
that falls within the range of 0 to 1.
 Probabilities: The output of the sigmoid function represents the
probability that a data point belongs to the "positive" class. For
example, if the output is 0.8, it suggests an 80% probability of
the event Y=1 occurring.
Sigmoid Function
Classification problem
 Tumor: Malignant / Benign
Logistic Regression
We have a set of feature vectors X with corresponding binary outputs y ∈ {0, 1}
 We want to model p(y|x)
Odds Ratio
 The Odds Ratio (OR) is a crucial concept in logistic regression that quantifies
the relationship between a binary outcome and a predictor variable.
 In logistic regression, the odds ratio is used to understand how a one-unit
change in a predictor variable affects the odds of an event occurring. It
helps assess the strength and direction of the relationship between the
predictor and the binary outcome.
 The Odds Ratio is calculated as the ratio of the odds of an event occurring in
one group (exposure group) to the odds of the same event occurring in
another group (non-exposure group).
Odds Ratio
 To explain the idea behind logistic regression as a probabilistic model, let's
first introduce the odds ratio: the odds in favor of a particular event.
 Notations:
• p : probability of an event occurring
• 1 – p : probability of the event not occurring
• The odds ratio can be written as:
• Odds = p / (1 − p)
• where p stands for the probability of the positive event. The
term positive event does not necessarily mean good, but refers
to the event that we want to predict, for example, the
probability that a patient has a certain disease; we can think of
the positive event as class label y =1.
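A small Python sketch of the odds computation (the probability values are illustrative):

```python
def odds(p):
    """Odds in favor of the positive event: p / (1 - p)."""
    return p / (1.0 - p)

# p = 0.8 means the event is four times as likely to occur as not;
# p = 0.5 gives even odds of 1.
print(odds(0.8), odds(0.5))
```

Note that the odds grow without bound as p approaches 1, while probabilities stay capped at 1; this unbounded range is what the logit transform on the next slide exploits.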
Odds Ratio
 We can then define the logit function, which is simply the logarithm of
the odds ratio (log-odds):
 logit(p) = log( p / (1 − p) )
 The logit function takes as input values in the range 0 to 1 and
transforms them to values over the entire real-number range, which we
can use to express a linear relationship between feature values and the
log-odds:
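The logit and sigmoid functions are inverses of each other, which a short Python sketch can verify numerically:

```python
import math

def logit(p):
    """Log-odds: maps probabilities in (0, 1) onto the whole real line."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of the logit: maps real numbers back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Round-tripping through logit and sigmoid recovers the original probability.
for p in (0.1, 0.5, 0.9):
    print(p, logit(p), sigmoid(logit(p)))
```

This inverse pair is exactly why logistic regression models the log-odds as a linear function of the features: applying the sigmoid to that linear combination converts it back into a probability.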
Hypothesis function
Logistic regression: equation (ex. only one sample)
Logistic regression representation (all
observations)
Gradient ascent
 Now we could use an optimization algorithm such as gradient
ascent to maximize this log-likelihood function.
 Linear Regression:
 Logistic Regression:
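The slide's formulas were not recoverable from the extracted text, but the standard gradient-ascent update for the logistic log-likelihood is w_j := w_j + α Σ_i (y_i − p_i) x_ij, where p_i is the predicted probability for sample i. A minimal Python sketch on toy data (the learning rate, epoch count, and data are illustrative):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient_ascent(X, y, lr=0.1, epochs=1000):
    """Maximize the log-likelihood by stepping in the gradient direction:
    w_j += lr * sum_i (y_i - p_i) * x_ij  (and similarly for the intercept)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = yi - p  # derivative of the log-likelihood w.r.t. the linear term
            grad_b += err
            for j in range(n_features):
                grad_w[j] += err * xi[j]
        b += lr * grad_b
        for j in range(n_features):
            w[j] += lr * grad_w[j]
    return w, b

# Toy separable data: points below 2.5 are class 0, above are class 1.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
w, b = gradient_ascent(X, y)
print(sigmoid(b + w[0] * 1.0), sigmoid(b + w[0] * 4.0))
```

Note the sign: we *add* the gradient because we are maximizing the log-likelihood; equivalently, one can minimize the negative log-likelihood with gradient descent, which is how most libraries phrase it.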
Have a Nice Day …
Thanks
