This is a presentation made for our Intro to Machine Learning class. As a result, it focuses more on the use of logit regression as a classifier than on its statistical applications. Many of the slides are based on Stanford's Open Course in machine learning.
3. Regression Analysis + Classification
How can we predict a nominal class using regression analysis?
Consider a binary class:
Each instance x is a vector of feature values
Our output values or class labels are restricted to 0 or 1, i.e. f(x) ∈ {0, 1}
We need an h(x) where 0 < h(x) < 1
We need a function which exhibits this behavior
4. Logistic Functions
Sigmoid Function: σ(x) = 1 / (1 + e^(-x))
Asymptotes at y = 1 and y = 0
Easy to specify a threshold (σ(0) = .5)
Results can be read as P(y = 1)
As a result: hθ(x) = σ(θᵀx), where θ is a vector of weights
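A minimal sketch of that hypothesis in Python (NumPy assumed; the names sigmoid and h are mine, not from the slides):

    import numpy as np

    def sigmoid(z):
        # Maps any real number into the open interval (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    def h(theta, x):
        # Hypothesis h_theta(x) = sigmoid(theta^T x), read as P(y = 1 | x)
        return sigmoid(np.dot(theta, x))

    sigmoid(0.0)   # 0.5, the natural classification threshold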
5. Cost Function
Need to find an hθ(x), a logistic function that represents our data
Need to find θ to fit our data
Cost for a single instance: -log(hθ(x)) if y = 1, and -log(1 - hθ(x)) if y = 0
Over all m training instances: J(θ) = -(1/m) Σ [y log(hθ(x)) + (1 - y) log(1 - hθ(x))]
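That cost could be computed roughly as follows (a sketch assuming NumPy, a design matrix X, and 0/1 labels y; the name cost is mine):

    import numpy as np

    def cost(theta, X, y):
        # Cross-entropy cost J(theta); X is the (m, n) design matrix,
        # y the (m,) array of 0/1 labels
        h = 1.0 / (1.0 + np.exp(-(X @ theta)))
        # -log(h) punishes confident mistakes when y = 1,
        # -log(1 - h) does the same when y = 0
        return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))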
6. Gradient Descent
In order to find the minimum, we can use the partial derivative of J(θ):
do {
    θj := θj - α ∂J(θ)/∂θj (simultaneously for every j)
} until θ converges
Where α is the learning rate (almost always between 0 and 1; .1 to .3 is usually a good range)
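A runnable sketch of that loop (NumPy assumed; gradient_descent is my name, and the gradient uses the standard closed form (1/m) Xᵀ(hθ(x) - y)):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def gradient_descent(X, y, alpha=0.1, n_iters=10000, tol=1e-7):
        # Batch gradient descent on J(theta); X should carry a leading
        # column of ones so theta[0] acts as the intercept
        m = len(y)
        theta = np.zeros(X.shape[1])
        for _ in range(n_iters):
            grad = X.T @ (sigmoid(X @ theta) - y) / m   # dJ/dtheta_j for all j
            theta -= alpha * grad                       # simultaneous update
            if np.max(np.abs(grad)) < tol:              # crude stopping criterion
                break
        return theta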
7. Maximum Likelihood Estimation
Equivalently, we can find the θ that maximizes the log-likelihood of the training data, ℓ(θ) = Σ [y log(hθ(x)) + (1 - y) log(1 - hθ(x))], by gradient ascent:
do {
    θ := θ + α Σ (y - hθ(x)) x (summed over the training instances)
} until θ converges
Can also be calculated using Iteratively Reweighted Least Squares
Multinomial data uses Softmax Regression
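A sketch of the Iteratively Reweighted Least Squares (Newton's method) update in Python, assuming NumPy and a design matrix X with a leading column of ones; irls is my name for it:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def irls(X, y, n_iters=25, tol=1e-8):
        # Newton / IRLS update: theta += (X^T W X)^(-1) X^T (y - p),
        # where W = diag(p * (1 - p)) is the reweighting term
        theta = np.zeros(X.shape[1])
        for _ in range(n_iters):
            p = sigmoid(X @ theta)               # current P(y = 1) per instance
            W = p * (1.0 - p)                    # diagonal of the weight matrix
            H = X.T @ (W[:, None] * X)           # Hessian of the negative log-likelihood
            step = np.linalg.solve(H, X.T @ (y - p))
            theta += step
            if np.max(np.abs(step)) < tol:       # converged
                break
        return theta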
9. Interpreting hθ
I want to create a model that gives me the probability that I will pass a test, given how many hours I have studied
Hours 0.50 0.75 1.00 1.25 1.50 1.75 1.75 2.00 2.25 2.50 2.75 3.00 3.25 3.50 4.00 4.25 4.50 4.75 5.00 5.50
Pass     0    0    0    0    0    0    1    0    1    0    1    0    1    0    1    1    1    1    1    1
Using this generated model, calculate my probability of passing given I have studied 3 hours:
P(pass | study time = 3) ≈ .61
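One way to reproduce that number (a sketch using scikit-learn, which the slide does not name; C is set very large to approximately disable regularization so the fit matches the plain maximum-likelihood model):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    hours = np.array([0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 1.75, 2.00, 2.25, 2.50,
                      2.75, 3.00, 3.25, 3.50, 4.00, 4.25, 4.50, 4.75, 5.00, 5.50])
    passed = np.array([0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1])

    model = LogisticRegression(C=1e6).fit(hours.reshape(-1, 1), passed)
    print(model.predict_proba([[3.0]])[0, 1])   # ~0.61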
11. vs Decision Tree
Assumptions:
DT: decision boundaries are parallel to the axes
LR: one smooth decision boundary
Decision trees can be used when there are multiple decision boundaries
12. Feature Weights vs Naive Bayes
NB: each weight is set independently, depending on the class
LR: weights are set together, such that the decision function tends to be high for positive classes and low for negative classes
Because the weights are fit jointly, correlated features are not double-counted by logistic regression
13. vs Support Vector Machine
Both attempt to find a hyperplane separating the training samples
SVM: finds the solution with the maximum margin
LR: finds any solution that separates the instances
SVM is a hard classifier while LR is probabilistic
14. Advantages
Works well with diagonal decision boundaries
Does not give undue weight to correlated features
Probabilistic outcomes
Disadvantages
Requires a large sample size for stable results
16. For more info...
Helpful links to go into more depth with Logistic Regression:
Stanford Open Course (logit regression section)
Logit Regression Tutorial (exercises in MATLAB)
Logit Regression Tutorial (no code)
How to use Logit Regression in Python
How to use Logit Regression in R
How to use Logit Regression in Java using Weka
Editor's Notes
hθ(x) = σ(θᵀx)
We're trying to find the most likely θ given our training instances. This can be nondeterministic; we also need stopping criteria.
We can actually add terms such as x1^2 or x1*x2^4 to make the decision boundary non-linear (see the sketch below).
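For instance, with scikit-learn (my choice of tool, not mentioned in the slides):

    import numpy as np
    from sklearn.preprocessing import PolynomialFeatures

    X = np.array([[2.0, 3.0]])     # one instance with features x1, x2
    poly = PolynomialFeatures(degree=2)
    poly.fit_transform(X)          # [[1, x1, x2, x1^2, x1*x2, x2^2]]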
The first entry in our instance vector is always 1, so that the corresponding weight acts as the intercept.
When we say theta transpose x, we mean that we transpose θ and then multiply it by x using matrix multiplication, which gives the dot product θᵀx.