Machine Learning
Supervised Learning: Logistic Regression
Linear regression
• Our goal is to estimate w from training data of <xᵢ, yᵢ> pairs.
• One way to find such a relationship is to minimize the least-squares error:
arg min_w Σᵢ (yᵢ − w·xᵢ)²
(Figure: data points in the X–Y plane with the fitted line y = wx + ε.)
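For the simple no-intercept model y = wx, this minimization has a closed form: setting the derivative of Σᵢ (yᵢ − w·xᵢ)² with respect to w to zero gives w = Σᵢ xᵢyᵢ / Σᵢ xᵢ². A minimal sketch with made-up data (not from the slides):

import numpy as np

# Hypothetical training pairs <x_i, y_i>
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])

# Closed-form minimizer of sum_i (y_i - w*x_i)^2 for the model y = w*x
w = np.sum(x * y) / np.sum(x * x)
print(w)                          # about 2.0 for this data
print(np.sum((y - w * x) ** 2))   # least-squares error at the optimum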
Applications of Linear Regression
Marks of a student based on the number of
hours he/she put into the preparation
• Simple linear regression…..
• Multiple linear regression….
• Non-linear problem….
Marks of a student based on the number of
hours he/she put into the preparation
let’s assume
Marks of a student (M) do depend on the number
of hours (H) he/she put up for preparation.
The following formula can represent the model:
Marks = function (No. of hours)
=> Marks = m*Hours + c
Marks of a student based on the number of
hours he/she put into the preparation
Let's plot the data to check whether it's a linear problem; that is the easiest way to find out.
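As a sketch of that check, with made-up hours/marks values (the actual data from the slides is not reproduced here):

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical (hours, marks) observations
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])
marks = np.array([35, 42, 50, 55, 63, 68, 76, 82])

plt.scatter(hours, marks)                 # does the cloud of points look roughly linear?
plt.xlabel("Hours of preparation")
plt.ylabel("Marks")
plt.show()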
Marks of a student based on the number of
hours he/she put into the preparation
How do we determine the slope of the line, i.e. the value of m?
Marks of a student based on the number of
hours he/she put into the preparation
• The value of m (slope of the line) can be
determined using an objective function which
is a combination of the loss function and a
regularization term.
• For simple linear regression, the objective function is simply the mean squared error (MSE).
• The best-fit line is obtained by minimizing this objective function (the MSE), as sketched below.
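As a rough illustration (made-up data, a fixed intercept, and no regularization term), the MSE can be evaluated over a grid of candidate slopes m and the minimizing slope picked; in practice the minimum is found analytically or by gradient descent rather than by grid search.

import numpy as np

hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])
marks = np.array([35, 42, 50, 55, 63, 68, 76, 82])
c = 30.0                                   # assume a fixed intercept for this illustration

candidates = np.linspace(0.0, 15.0, 301)   # candidate slopes m
mse = [np.mean((marks - (m * hours + c)) ** 2) for m in candidates]
best_m = candidates[np.argmin(mse)]        # slope with the smallest mean squared error
print(best_m)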
Predicting weight reduction in form of the
number of KGs reduced
• Let's assume it depends on input features such as age, height, the weight of the person, and the time spent on exercise.
Predicting weight reduction in form of the
number of KGs reduced
Weight Reduction = Function(Age, Height, Weight, Time on Exercise)
=> Weight Reduction = b1*Height + b2*Weight + b3*Age + b4*Time on Exercise + b0
Predicting weight reduction in form of the
number of KGs reduced
As part of training the above model
Goal:
Find the values of b1, b2, b3, b4, and b0 that minimize the objective function.
The objective function is the summation of squared errors, i.e. the sum of the squares of the differences between the actual and predicted values over the different values of age, height, weight, and time on exercise.
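A sketch of fitting those coefficients by ordinary least squares with numpy (every feature value below is invented for illustration):

import numpy as np

# Hypothetical rows: [height_cm, weight_kg, age, hours_on_exercise]
X = np.array([
    [170, 80, 25, 5.0],
    [165, 90, 32, 3.0],
    [180, 95, 40, 6.0],
    [175, 70, 28, 4.0],
    [160, 85, 35, 7.0],
    [172, 100, 45, 2.0],
])
y = np.array([4.0, 2.5, 5.5, 3.0, 6.0, 1.5])   # kg reduced (made up)

# Append a column of ones so the intercept b0 is fitted as well
X1 = np.hstack([X, np.ones((X.shape[0], 1))])
b1, b2, b3, b4, b0 = np.linalg.lstsq(X1, y, rcond=None)[0]
print(b1, b2, b3, b4, b0)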
Forecasting sales
Organizations often use linear regression models to
forecast future sales.
This can be helpful for things like budgeting and
planning.
Algorithms such as Amazon’s item-to-item
collaborative filtering are used to predict what
customers will buy in the future based on their past
purchase history
Cash forecasting
Many businesses use linear regression to forecast how
much cash they’ll have on hand in the future.
This is important for things like managing expenses
and ensuring that there is enough cash on hand to
cover unexpected costs.
Analyzing survey data
Linear regression can also be used to analyze survey
data.
This can help businesses understand things like
customer satisfaction and product preferences.
For example, a company might use linear regression
to figure out how likely people are to recommend their
product to others.
Stock predictions
A lot of businesses use linear regression models to
predict how stocks will perform in the future.
This is done by analyzing past data on stock prices
and trends to identify patterns.
Predicting consumer behavior
Businesses can use linear regression to predict things
like how much a customer is likely to spend.
Regression models can also be used to predict
consumer behavior. This can be helpful for things like
targeted marketing and product development.
For example, Walmart uses linear regression to predict
what products will be popular in different regions of
the country.
Code:

import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # mean of x and y vector
    m_x = np.mean(x)
    m_y = np.mean(y)
    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x
    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)
    # predicted response vector
    y_pred = b[0] + b[1]*x
    # plotting the regression line
    plt.plot(x, y_pred, color="g")
    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')
    # function to show plot
    plt.show()

def main():
    # observations / data
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))
    # plotting regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()
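As a quick cross-check (not part of the original listing), the same two coefficients can be recovered with numpy's built-in least-squares polynomial fit:

import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])

b_1, b_0 = np.polyfit(x, y, deg=1)   # returns (slope, intercept) for a degree-1 fit
print(b_0, b_1)                      # should match estimate_coef(x, y)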
Logistic Regression…
Regression for classification
• In some cases we can use linear regression for determining the
appropriate boundary.
• However, since the output is usually binary or discrete, there are
more efficient regression methods.
Regression for classification
• Assume we would like to use linear regression to learn the
parameters for p(y | X ; θ)
• Problems?
(Figure: data labeled 1 and -1, with the optimal regression model drawn as the decision boundary.)
wᵀX ≥ 0 ⇒ classify as 1
wᵀX < 0 ⇒ classify as -1
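A minimal sketch of that decision rule, with an arbitrary weight vector and two made-up feature vectors:

import numpy as np

w = np.array([2.0, -1.0, 0.5])          # hypothetical learned weights
X = np.array([[1.0, 0.5, 2.0],          # two hypothetical examples (rows)
              [0.2, 3.0, 1.0]])

scores = X @ w                          # w^T x for each example
labels = np.where(scores >= 0, 1, -1)   # classify as 1 or -1 by the sign of the score
print(scores, labels)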
Logistic Regression
Logistic Regression is a predictive modelling technique in which the target variable (output) takes discrete values for a given set of features or inputs (X).
For example, whether someone is COVID-19 positive (1) or negative (0).
It is a very powerful yet simple classification algorithm in machine learning, borrowed from statistics.
It is sometimes claimed that around 60% of the world's classification problems can be solved using the logistic regression algorithm.
Logistic Regression
Logistic regression is one of the most common
machine learning algorithms used for binary
classification.
It predicts the probability of occurrence of a binary
outcome.
Fraud detection, spam detection, cancer detection,
etc.
Sigmoid Function
It is a mathematical function with a characteristic S-shaped curve that takes any real value and maps it to a value between 0 and 1.
The sigmoid function is also called the logistic function:
g(h) = 1 / (1 + e^(-h))
Sigmoid Function
If the input h goes to positive infinity, the predicted value of y becomes 1; if it goes to negative infinity, the predicted value of y becomes 0.
If the output of the sigmoid function is more than 0.5, we classify the example as class 1 (the positive class); if it is less than 0.5, we classify it as class 0 (the negative class).
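A small sketch of the sigmoid and the 0.5 threshold rule (the inputs h are arbitrary):

import numpy as np

def sigmoid(h):
    # maps any real value into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-h))

h = np.array([-5.0, -0.5, 0.0, 0.5, 5.0])
p = sigmoid(h)
labels = (p > 0.5).astype(int)   # class 1 if the sigmoid output exceeds 0.5, else class 0
print(p)        # roughly [0.007, 0.378, 0.5, 0.622, 0.993]
print(labels)   # [0 0 0 1 1]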
Diff b/w Linear & Logistic
Regression
Linear Regression is used when the dependent variable is continuous in nature, for example weight, height, or other numbers; in contrast, Logistic Regression is used when the dependent variable is binary or limited, for example yes/no, true/false, 0 or 1, etc.
In the 19th century, people used linear regression in biology to predict disease, but this is risky: for example, if a patient has cancer and the probability of the tumour being malignant is 0.4, a linear regression model would report the cancer as benign (because the probability comes out below 0.5). That is where Logistic Regression comes in, as it provides a proper binary, probabilistic result.
Logistic regression vs. Linear regression
p(y = 0 | X; θ) = g(wᵀX) = 1 / (1 + e^(wᵀX))
p(y = 1 | X; θ) = 1 − g(wᵀX) = e^(wᵀX) / (1 + e^(wᵀX))
Determining parameters for logistic regression problems
• So how do we learn the parameters?
• Similar to other regression problems, we look for the MLE for w.
• The likelihood of the data given the model is:
L(y | X; w) = ∏ᵢ (1 − g(Xᵢ; w))^(yᵢ) · g(Xᵢ; w)^(1 − yᵢ)
where
p(y = 0 | X; θ) = g(X; w) = 1 / (1 + e^(wᵀX))
p(y = 1 | X; θ) = 1 − g(X; w) = e^(wᵀX) / (1 + e^(wᵀX))
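As a sketch with made-up data, the (negative) log of that likelihood can be written down directly and is what an optimizer would minimize:

import numpy as np

def neg_log_likelihood(w, X, y):
    # p1 = P(y = 1 | x; w) = e^(w.x) / (1 + e^(w.x)), i.e. 1 - g(x; w) in the
    # slide's notation, where g(x; w) = P(y = 0 | x; w)
    p1 = 1.0 / (1.0 + np.exp(-(X @ w)))
    return -np.sum(y * np.log(p1) + (1 - y) * np.log(1 - p1))

# Hypothetical data: four examples, two features plus a constant bias column
X = np.array([[ 1.0,  2.0, 1.0],
              [ 2.0,  1.0, 1.0],
              [-1.0, -2.0, 1.0],
              [-2.0, -0.5, 1.0]])
y = np.array([1, 1, 0, 0])
print(neg_log_likelihood(np.zeros(3), X, y))   # equals 4 * log(2) at w = 0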
Gradient Descent
Gradient descent is an optimization algorithm
used to minimize some function by iteratively
moving in the direction of steepest descent as
defined by the negative of the gradient.
In machine learning, we use gradient descent to
update the parameters of our model.
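A minimal sketch of the update rule w ← w − η·∇f(w) on a simple one-dimensional function (the function and the learning rate are chosen arbitrarily):

def grad(w):
    # gradient of f(w) = (w - 3)^2, which is minimized at w = 3
    return 2.0 * (w - 3.0)

w = 0.0                 # initial guess
learning_rate = 0.1     # eta: the step size
for _ in range(100):
    w = w - learning_rate * grad(w)   # move in the direction of the negative gradient
print(w)                # ends up very close to 3.0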
Gradient Descent
Starting at the top of the mountain, we take our first
step downhill in the direction specified by the
negative gradient.
Next we recalculate the negative gradient (passing in
the coordinates of our new point) and take another
step in the direction it specifies.
We continue this process iteratively until we get to
the bottom of our graph, or to a point where we can
no longer move downhill–a local minimum.
Learning Rate
The size of these steps is called the learning rate.
With a high learning rate we can cover more ground each
step, but we risk overshooting the lowest point since the slope
of the hill is constantly changing.
With a very low learning rate, we can confidently move in
the direction of the negative gradient since we are
recalculating it so frequently.
A low learning rate is more precise, but calculating the
gradient is time-consuming, so it will take us a very long time
to get to the bottom.
Gradient ascent
(Figure: a curve z(w) with its slope ∂z/∂w and a step Δw along the w axis.)
• Going in the direction of the slope will lead to a larger z.
• But not too much, otherwise we would go beyond the optimal w.
Gradient descent
(Figure: a curve z(w) with its slope ∂z/∂w and steps Δw, Δz.)
• Going in the opposite direction to the slope will lead to a smaller z.
• But not too much, otherwise we would go beyond the optimal w.
Finding The Best Weights -
Hill Descent
A ball on a complicated hilly terrain rolls down to a local valley; this is called a local minimum.
Questions:
How do we get to the bottom of the deepest valley?
How do we do this when we don't have gravity?
Our E_in Has Only One Valley
(Figure: in-sample error E_in plotted against the weights w, showing a single valley.)
... because E_in(w) is a convex function of w.
Gradient Descent Method
Batch Gradient Descent: use all examples in each iteration
Mini batch Gradient Descent: use some examples in each iteration
Stochastic Gradient Descent: use 1 example in each iteration
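A sketch of the difference on a small linear-regression problem (all data here is synthetic); batch gradient descent would pass the full X and y to the gradient, and stochastic gradient descent corresponds to batch_size = 1:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

def grad_mse(w, Xb, yb):
    # gradient of the mean squared error on the batch (Xb, yb)
    return (2.0 / len(yb)) * Xb.T @ (Xb @ w - yb)

w = np.zeros(3)
lr, batch_size = 0.05, 10
for epoch in range(50):
    order = rng.permutation(len(y))             # shuffle, then sweep over mini-batches
    for start in range(0, len(y), batch_size):
        idx = order[start:start + batch_size]
        w -= lr * grad_mse(w, X[idx], y[idx])   # mini-batch gradient descent update
print(w)   # close to [1.0, -2.0, 0.5]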
Regularization
• Similar to other data estimation problems, we may not have enough
samples to learn good models for logistic regression classification
• One way to overcome this is to 'regularize' the model, i.e. impose additional constraints on the parameters we are fitting.
• For example, let's assume that w_j comes from a Gaussian distribution with mean 0 and variance σ² (where σ² is a user-defined parameter): w_j ~ N(0, σ²)
• In that case we have a prior on the parameters, and so:
p(y = 1, θ | X) ∝ p(y = 1 | X; θ) p(θ)
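With that Gaussian prior, maximizing the posterior is equivalent to minimizing the negative log-likelihood plus an L2 penalty on w (the penalty weight playing the role of 1/(2σ²)); a minimal sketch with made-up data:

import numpy as np

def regularized_loss(w, X, y, lam):
    # negative log-likelihood plus the L2 penalty coming from the Gaussian prior on w
    p1 = 1.0 / (1.0 + np.exp(-(X @ w)))         # P(y = 1 | x; w)
    nll = -np.sum(y * np.log(p1) + (1 - y) * np.log(1 - p1))
    return nll + lam * np.sum(w ** 2)           # lam corresponds to 1 / (2 * sigma^2)

X = np.array([[ 1.0,  2.0],
              [ 2.0,  1.0],
              [-1.0, -2.0],
              [-2.0, -0.5]])
y = np.array([1, 1, 0, 0])
print(regularized_loss(np.array([0.5, -0.5]), X, y, lam=0.1))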
Credits
Yaser Abu-Mostafa, Caltech, California
Barnabás Póczos and Ziv Bar-Joseph, School of Computer Science, Carnegie Mellon University
Vibhav Gogate, The University of Texas at Dallas