2. What is Logistic Regression?
Logistic Regression is a supervised learning algorithm for binary
classification tasks; with extensions such as the multinomial or
one-vs-rest schemes it can also handle multi-class problems.
It is widely used because it predicts the probability of an instance
belonging to a specific class rather than producing only a hard
binary prediction.
Logistic Regression accepts independent variables as input and
predicts the categorical outcome of the dependent variable.
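As a minimal sketch of this workflow (scikit-learn and its built-in
breast cancer dataset are assumptions for the example, not something
mentioned in these slides), fitting a logistic regression classifier
and obtaining class probabilities might look like this:

# Minimal sketch using scikit-learn (assumed library) on a toy dataset
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Independent variables (features) X and dependent variable (target) y
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the model on the training data
model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)

# Predicted probability of each test instance belonging to class 1
probabilities = model.predict_proba(X_test)[:, 1]
# Hard class labels obtained by thresholding the probabilities at 0.5
labels = model.predict(X_test)
print(probabilities[:5], labels[:5])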
3. Independent Variables
These are the input variables and are also known as features or
predictors.
The model takes these as input and makes predictions about the
probability of an instance belonging to a category.
E.g.- For predicting heart disease, independent variables will be
Age, Cholesterol Level, Blood Pressure etc.
4. Dependent Variables
These are outcome or target variables that are predicted based
on features(independent variables).
These are categorical variables for which we estimate the
probability of certain categories.
E.g.- For heart disease prediction, the dependent variable will be
1 (has heart disease) or 0 (does not have heart disease).
5. How is it different from Linear Regression?
Linear Regression
Linear Regression is used for predicting continuous numeric values.
It models the relationship between independent variables and a
dependent variable by fitting a linear equation.
Logistic Regression
Logistic Regression is used for binary classification tasks.
It predicts the probability of an instance belonging to a specific
class.
6. Linear Regression vs Logistic Regression (continued)
Linear Regression
The output of Linear Regression is a continuous numerical value.
It is suitable for problems where the dependent variable is numeric.
E.g.- Predicting stock prices, housing prices, air quality etc.
Logistic Regression
The output of Logistic Regression is a probability value between 0
and 1 that represents the probability of belonging to a particular
class.
It is suitable for binary classification. E.g.- Predicting medical
conditions, churn prediction, spam detection etc.
7. Model Components
Input Features (X)
These are the independent variables or features that describe
the data instances we work with.
Weights (θ)
Weights refer to the coefficients assigned to each independent
variable (feature) in the linear combination that is used to make
predictions. These weights determine the influence of each feature
on the final prediction.
8. Bias (θ₀)
In Logistic Regression, "Bias" refers to the intercept term in the
linear equation. It allows the linear model to make predictions even
when all input features are zero.
z = b + w₁x₁ + w₂x₂ + … + wₙxₙ
Here, z is the linear combination of the features and their weights,
θ₀ (represented as b) is the bias, and
wᵢ is the weight associated with each feature xᵢ.
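As a small illustrative sketch of this linear combination (the
feature values, weights and bias below are made up for the example
and are not from the slides):

import numpy as np

# Hypothetical feature vector x, weights w, and bias b (illustrative values only)
x = np.array([63.0, 240.0, 130.0])   # e.g. Age, Cholesterol Level, Blood Pressure
w = np.array([0.04, 0.01, 0.02])     # one weight per feature
b = -7.5                             # bias / intercept term (θ₀)

# z = b + w1*x1 + w2*x2 + ... + wn*xn
z = b + np.dot(w, x)
print(z)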
9. Sigmoid Function (Logistic function)
The sigmoid function is a crucial component that transforms the
linear combination of input features and their associated weights
into a probability value.
If the output of the sigmoid function is greater than or equal to
Decision Boundary (generally 0.5), the instance is predicted to
belong to the positive class otherwise to the negative class.
P(y = 1|X) = σ(z) = 1 / (1 + e^(-z))
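A minimal sketch of the sigmoid transformation and the 0.5 decision
boundary (numpy assumed; the z values are made up for illustration):

import numpy as np

def sigmoid(z):
    # Map the linear combination z to a probability in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-3.0, 0.0, 0.4, 2.5])
probabilities = sigmoid(z)

# Instances with probability >= 0.5 are assigned to the positive class
predictions = (probabilities >= 0.5).astype(int)
print(probabilities)   # approximately [0.05, 0.50, 0.60, 0.92]
print(predictions)     # [0 1 1 1]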
11. Cost Function
A cost function (also known as a loss function) is a measure that
quantifies the discrepancy between the predicted values of a
model and the actual values.
The goal is to minimize this loss function during model training to
achieve accurate predictions.
In the case of Logistic Regression, the cost function is often the
log loss or cross-entropy loss.
During model training, optimization algorithms like gradient
descent are used to iteratively update the model parameters(θ)
to minimize the log loss function.
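A short sketch of the log loss computation (numpy and scikit-learn
assumed; the labels and probabilities are made up for illustration):

import numpy as np
from sklearn.metrics import log_loss

y_true = np.array([1, 0, 1, 1, 0])            # actual labels
y_prob = np.array([0.9, 0.2, 0.7, 0.6, 0.1])  # predicted P(y = 1|X)

# Log loss: the average negative log-likelihood of the true labels
manual = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
print(manual)
print(log_loss(y_true, y_prob))  # same value computed by scikit-learn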
13. Optimization Techniques
Regularization
Regularization is a technique used in machine learning to prevent
overfitting and improve the generalization ability of a model.
Regularization is particularly useful when dealing with complex
models that have a high number of features or parameters.
L1-Regularization (Lasso) adds the sum of the absolute values of the
weights to the cost function as a penalty; this can shrink some
weights to exactly zero, effectively performing feature selection.
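As a sketch (scikit-learn and its breast cancer dataset assumed), L1
regularization can be switched on through the penalty parameter; the
liblinear and saga solvers support it, and C is the inverse of the
regularization strength:

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# L1 (Lasso) penalty: smaller C means stronger regularization
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
model.fit(X, y)

# Many coefficients are driven exactly to zero, effectively selecting features
print((model.coef_ == 0).sum(), "of", model.coef_.size, "weights are zero")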
15. Gradient Descent
Gradient descent is an iterative optimization algorithm that
updates the model's parameters in the opposite direction of the
gradient of the cost function.
Gradient descent takes steps in the parameter space that lead to
lower values of the cost function.
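A minimal numpy sketch of batch gradient descent for logistic
regression; the tiny dataset, learning rate and iteration count are
arbitrary choices for illustration:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny illustrative dataset: one feature, binary labels
X = np.array([[0.5], [1.5], [2.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1])

X_b = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend a column of 1s for the bias
theta = np.zeros(X_b.shape[1])                  # parameters [bias, weight]
learning_rate, n_iterations = 0.1, 5000

for _ in range(n_iterations):
    probabilities = sigmoid(X_b @ theta)
    # Gradient of the log loss with respect to theta
    gradient = X_b.T @ (probabilities - y) / len(y)
    # Step in the opposite direction of the gradient
    theta -= learning_rate * gradient

print(theta)                          # learned [bias, weight]
print(sigmoid(X_b @ theta).round(2))  # predicted probabilities for the training points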
16. Model Evaluation
Accuracy
Accuracy is the ratio of correctly predicted instances to the total
number of instances in the testing set.
Precision
Precision measures the proportion of correctly predicted positive
instances out of all instances predicted as positive.
17. Recall(Sensitivity)
Recall measures the proportion of correctly predicted positive
instances out of all actual positive instances.
F1-Score
The F1-score is the harmonic mean of precision and recall, providing
a balanced measure of the model's accuracy.
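A short sketch that computes these four metrics (scikit-learn
assumed; the label arrays are made up for illustration):

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # actual labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # model predictions

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))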
18. Receiver Operating Characteristic (ROC) Curve
It is a graphical representation that shows the performance of a
classification model across different levels of decision
thresholds.
It plots the true positive rate (recall) against the false positive
rate at various threshold settings, providing valuable insights into
the model's ability to distinguish between positive and negative
classes.
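A sketch of plotting the ROC curve and computing the area under it
(scikit-learn and matplotlib assumed; the labels and probabilities
are made up for illustration):

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

y_true = [0, 0, 1, 1, 0, 1, 1, 0]                    # actual labels
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.65, 0.3]  # predicted P(y = 1|X)

# True positive rate (recall) and false positive rate at each threshold
fpr, tpr, thresholds = roc_curve(y_true, y_prob)
auc = roc_auc_score(y_true, y_prob)

plt.plot(fpr, tpr, label=f"ROC curve (AUC = {auc:.2f})")
plt.plot([0, 1], [0, 1], linestyle="--", label="Random classifier")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate (recall)")
plt.legend()
plt.show()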
20. Applications
Medical Diagnosis and Healthcare
Market and Consumer Analysis
Image and Object Recognition
Fraud Detection
Social Sciences and Political Analysis
Natural Language Processing (NLP)
21. Advantages and Limitations
Advantages
Logistic regression is computationally efficient and can handle
relatively large datasets with ease.
Compared to more complex models, logistic regression has
a lower risk of overfitting.
Logistic regression is specifically designed for binary
classification problems.
Logistic regression can help identify important features that have
a significant impact on the outcome.
22. Limitations
When dealing with high-dimensional data, logistic regression can
become prone to overfitting.
Logistic regression is inherently designed for binary classification
problems, so multi-class problems require extensions (such as
one-vs-rest or multinomial logistic regression) and can be less
straightforward to interpret.
Logistic regression is sensitive to outliers, especially if the dataset
is small.
Logistic regression requires complete data for all variables.
Dealing with missing data can be challenging.
When dealing with imbalanced datasets (where one class is
much more frequent than the other), logistic regression might
struggle to predict the minority class effectively.