SlideShare a Scribd company logo
1 of 24
Download to read offline
1
CS340 Machine learning
Gaussian classifiers
2
Correlated features
• Height and weight are not independent
3
Multivariate Gaussian
• Multivariate Normal (MVN)
• Exponent is the Mahalanobis distance between x
and µ
Σ is the covariance matrix (positive definite)
N(x|µ, Σ)
def
=
1
(2π)p/2|Σ|1/2
exp[−1
2 (x − µ)T
Σ−1
(x − µ)]
∆ = (x − µ)T
Σ−1
(x − µ)
xT
Σx > 0 ∀x
4
Bivariate Gaussian
• Covariance matrix is
where the correlation coefficient is
and satisfies -1 ≤ ρ ≤ 1
• Density is
Σ =

σ2
x ρσxσy
ρσxσy σ2
y

ρ =
Cov(X, Y )

V ar(X)V ar(Y )
p(x, y) =
1
2πσxσy

1 − ρ2
exp

−
1
2(1 − ρ2)

x2
σ2
x
+
y2
σ2
y
−
2ρxy
(σxσy)
5
Spherical, diagonal, full covariance
Σ =

σ2
0
0 σ2

Σ =

σ2
x 0
0 σ2
y

Σ =

σ2
x ρσxσy
ρσxσy σ2
y
6
Surface plots
7
Generative classifier
• A generative classifier is one that defines a class-
conditional density p(x|y=c) and combines this with
a class prior p(c) to compute the class posterior
• Examples:
– Naïve Bayes:
– Gaussian classifiers
• Alternative is a discriminative classifier, that
estimates p(y=c|x) directly.
p(y = c|x) =
p(x|y = c)p(y = c)

c′ p(x|y = c′)p(c′)
p(x|y = c) =
d

j=1
p(xj|y = c)
p(x|y = c) = N (x|µc, Σc)
8
Naïve Bayes with Bernoulli features
• Consider this class-conditional density
• The resulting class posterior (using plugin rule) has
the form
• This can be rewritten as
p(x|y = c) =
d

i=1
θ
I(xi=1)
ic (1 − θic)I(xi=0)
p(y = c|x) =
p(y = c)p(x|y = c)
p(x)
=
πc
d
i=1 θ
I(xi=1)
ic (1 − θic)I(xi=0)
p(x)
p(Y = c|x, θ, π) =
p(x|y = c)p(y = c)

c′ p(x|y = c′)p(y = c′)
=
exp[log p(x|y = c) + log p(y = c)]

c′ exp[log p(x|y = c′) + log p(y = c′)]
=
exp [log πc +

i I(xi = 1) log θic + I(xi = 0) log(1 − θic)]

c′ exp [log πc′ +

i I(xi = 1) log θi,c′ + I(xi = 0) log(1 − θic)]
9
Form of the class posterior
• From previous slide
• Define
• Then the posterior is given by the softmax function
• This is called softmax because it acts like the max
function when |βc| → ∞
p(Y = c|x, θ, π) ∝ exp

log πc +

i
I(xi = 1) log θic + I(xi = 0) log(1 − θic)
x′
= [1, I(x1 = 1), I(x1 = 0), . . . , I(xd = 1), I(xd = 0)]
βc = [log πc, log θ1c, log(1 − θ1c), . . . , log θdc, log(1 − θdc)]
p(Y = c|x, β) =
exp[βT
c x′
]

c′ exp[βT
c′ x′]
p(Y = c|x) =
1.0 if c = arg maxc′ βT
c′ x
0.0 otherwise
10
Two-class case
• From previous slide
• In the binary case, Y ∈ {0,1}, the softmax becomes
the logistic (sigmoid) function σ(u) = 1/(1+e-u)
p(Y = c|x, β) =
exp[βT
c x′
]

c′ exp[βT
c′ x′]
p(Y = 1|x, θ) =
eβT
1 x′
eβT
1 x′
+ eβT
0 x′
=
1
1 + e(β0−β1)T x′
=
1
1 + ewT x′
= σ(wT
x′
)
11
Sigmoid function
• σ(ax + b), a controls steepness, b is threshold.
• For small a and x ≈ –b/2, roughly linear
a=0.3
a=1.0
a=3.0
b=-30 b=0 b=+30
12
Sigmoid function in 2D
Mackay 39.3
σ(w1 x1 + w2 x2) = σ(wT x): w is perpendicular to the decision boundary
13
Logit function
• Let p=p(y=1) and η be the log odds
• Then p = σ(η) and η = logit(p)
η is the natural parameter of the Bernoulli
distribution, and p = E[y] is the moment parameter
• If η = wT x, then wi is how much the log-odds
increases by if we increase xi
η = log
p
1 − p
σ(η) =
1
1 + e−η
=
eη
eη + 1
=
p
(1−p)
p
1−p + 1
=
p
(1−p)
p+1−p
1−p
= p
14
Gaussian classifiers
• Class posterior (using plug-in rule)
• We will consider the form of this equation for
various special cases:
• Σ1=Σ0,
• tied, many classes
• General case
p(Y = c|x) =
p(x|Y = c)p(Y = c)
C
c′=1 p(x|Y = c′)p(Y = c′)
=
πc|2πΣc|−
1
2 exp −1
2
(x − µc)T
Σ−1
c (x − µc)

c′ πc′ |2πΣc′ |−
1
2 exp −1
2
(x − µc′ )T Σ−1
c′ (x − µc′ )
Σc
15
Σ1 = Σ0
• Class posterior simplifies to
p(Y = 1|x) =
p(x|Y = 1)p(Y = 1)
p(x|Y = 1)p(Y = 1) + p(x|Y = 0)p(Y = 0)
=
π1 exp −1
2 (x − µ1)T
Σ−1
(x − µ1)
π1 exp −1
2
(x − µ1)T Σ−1(x − µ1) + π0 exp −1
2
(x − µ0)T Σ−1(x − µ0)
=
π1ea1
π1ea1 + π0ea0
=
1
1 + π0
π1
ea0−a1
ac
def
= −1
2
(x − µc)T
Σ(x − µc)
16
Σ1 = Σ0
• Class posterior simplifies to
p(Y = 1|x) =
1
1 + exp − log π1
π0
+ a0 − a1

a0 − a1 = −1
2 (x − µ0)T
Σ−1
(x − µ0) + 1
2 (x − µ1)T
Σ−1
(x − µ1)
= −(µ1 − µ0)T
Σ−1
x + 1
2 (µ1 − µ0)T
Σ−1
(µ1 + µ0)
so
p(Y = 1|x) =
1
1 + exp [−βT x − γ]
= σ(βT
x + γ)
β
def
= Σ−1
(µ1 − µ0)
γ
def
= −1
2 (µ1 − µ0)T
Σ−1
(µ1 + µ0) + log
π1
π0
σ(η)
def
=
1
1 + e−η
=
eη
eη + 1
Linear function of x
17
Decision boundary
• Rewrite class posterior as
• If Σ=I, then w=(µ1-µ0) is in the direction of µ1-µ0, so
the hyperplane is orthogonal to the line between
the two means, and intersects it at x0
• If π1=π0, then x0 = 0.5(µ1+µ0) is midway between
the two means
• If π1 increases, x0 decreases, so the boundary
shifts toward µ0 (so more space gets mapped to
class 1)
p(Y = 1|x) = σ(βT
x + γ) = σ(wT
(x − x0))
w = β = Σ−1
(µ1 − µ0)
x0 = −
γ
β
= 1
2 (µ1 + µ0) −
log(π1/π0)
(µ1 − µ0)T Σ−1(µ1 − µ0)
(µ1 − µ0)
18
Decision boundary in 1d
Discontinuous decision region
19
Decision boundary in 2d
20
Tied Σ, many classes
• Similarly to before
• This is the multinomial logit or softmax function
p(Y = c|x) =
πc exp −1
2 (x − µc)T
Σ−1
c (x − µc)

c′ πc′ exp −1
2
(x − µc′ )T Σ−1
c′ (x − µc′ )
=
exp µT
′ Σ−1
x − 1
2
µT
c Σ−1
µc + log πc

c′ exp µT
c′ Σ−1x − 1
2
µT
c′ Σ−1µc′ + log πc′
θc
def
=

−µT
c Σ−1
µc + log πc
Σ−1
µc

=

γc
βc

p(Y = c|x) =
eθT
c x

c′ eθT
c′ x =
eβT
c x+γc

c′ eβT
c′ x+γc′
21
Tied Σ, many classes
• Discriminant function
• Decision boundary is again linear, since xT Σ x
terms cancel
• If Σ = I, then the decision boundaries are
orthogonal to µi - µj, otherwise skewed
gc(x) = −1
2 (x − µc)T
Σ−1
(x − µc) + log p(Y = c) = βT
c x + βc0
βc = Σ−1
µc
βc0 = −1
2 µT
c Σ−1
µc + log πc
22
Decision boundaries
[x,y] = meshgrid(linspace(-10,10,100), linspace(-10,10,100));
g1 = reshape(mvnpdf(X, mu1(:)’, S1), [m n]); ...
contour(x,y,g2*p2-max(g1*p1, g3*p3),[0 0],’-k’);
23
Σ0, Σ1 arbitrary
• If the Σ are unconstrained, we end up with cross
product terms, leading to quadratic decision
boundaries
24
General case
µ1 = (0, 0), µ2 = (0, 5), µ3 = (5, 5), π = (1/3, 1/3, 1/3)

More Related Content

Similar to stoch41.pdf

AlgoPerm2012 - 03 Olivier Hudry
AlgoPerm2012 - 03 Olivier HudryAlgoPerm2012 - 03 Olivier Hudry
AlgoPerm2012 - 03 Olivier HudryAlgoPerm 2012
 
cps170_bayes_nets.ppt
cps170_bayes_nets.pptcps170_bayes_nets.ppt
cps170_bayes_nets.pptFaizAbaas
 
Roots equations
Roots equationsRoots equations
Roots equationsoscar
 
Roots equations
Roots equationsRoots equations
Roots equationsoscar
 
Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...Tomoya Murata
 
Integration techniques
Integration techniquesIntegration techniques
Integration techniquesKrishna Gali
 
Open GL T0074 56 sm4
Open GL T0074 56 sm4Open GL T0074 56 sm4
Open GL T0074 56 sm4Roziq Bahtiar
 
Density theorems for anisotropic point configurations
Density theorems for anisotropic point configurationsDensity theorems for anisotropic point configurations
Density theorems for anisotropic point configurationsVjekoslavKovac1
 
Low Complexity Regularization of Inverse Problems - Course #3 Proximal Splitt...
Low Complexity Regularization of Inverse Problems - Course #3 Proximal Splitt...Low Complexity Regularization of Inverse Problems - Course #3 Proximal Splitt...
Low Complexity Regularization of Inverse Problems - Course #3 Proximal Splitt...Gabriel Peyré
 
Function evaluation, termination, vertical line test etc
Function evaluation, termination, vertical line test etcFunction evaluation, termination, vertical line test etc
Function evaluation, termination, vertical line test etcsurprisesibusiso07
 
Gabarito completo anton_calculo_8ed_caps_01_08
Gabarito completo anton_calculo_8ed_caps_01_08Gabarito completo anton_calculo_8ed_caps_01_08
Gabarito completo anton_calculo_8ed_caps_01_08joseotaviosurdi
 
Lecture 2: linear SVM in the dual
Lecture 2: linear SVM in the dualLecture 2: linear SVM in the dual
Lecture 2: linear SVM in the dualStéphane Canu
 
Lecture 2: linear SVM in the Dual
Lecture 2: linear SVM in the DualLecture 2: linear SVM in the Dual
Lecture 2: linear SVM in the DualStéphane Canu
 
A Szemeredi-type theorem for subsets of the unit cube
A Szemeredi-type theorem for subsets of the unit cubeA Szemeredi-type theorem for subsets of the unit cube
A Szemeredi-type theorem for subsets of the unit cubeVjekoslavKovac1
 
Difference quotient algebra
Difference quotient algebraDifference quotient algebra
Difference quotient algebramath260
 
Lecture8 multi class_svm
Lecture8 multi class_svmLecture8 multi class_svm
Lecture8 multi class_svmStéphane Canu
 

Similar to stoch41.pdf (20)

talk MCMC & SMC 2004
talk MCMC & SMC 2004talk MCMC & SMC 2004
talk MCMC & SMC 2004
 
5.n nmodels i
5.n nmodels i5.n nmodels i
5.n nmodels i
 
AlgoPerm2012 - 03 Olivier Hudry
AlgoPerm2012 - 03 Olivier HudryAlgoPerm2012 - 03 Olivier Hudry
AlgoPerm2012 - 03 Olivier Hudry
 
cps170_bayes_nets.ppt
cps170_bayes_nets.pptcps170_bayes_nets.ppt
cps170_bayes_nets.ppt
 
Roots equations
Roots equationsRoots equations
Roots equations
 
Roots equations
Roots equationsRoots equations
Roots equations
 
Core 2 revision notes
Core 2 revision notesCore 2 revision notes
Core 2 revision notes
 
Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
 
Integration techniques
Integration techniquesIntegration techniques
Integration techniques
 
Open GL T0074 56 sm4
Open GL T0074 56 sm4Open GL T0074 56 sm4
Open GL T0074 56 sm4
 
Density theorems for anisotropic point configurations
Density theorems for anisotropic point configurationsDensity theorems for anisotropic point configurations
Density theorems for anisotropic point configurations
 
Low Complexity Regularization of Inverse Problems - Course #3 Proximal Splitt...
Low Complexity Regularization of Inverse Problems - Course #3 Proximal Splitt...Low Complexity Regularization of Inverse Problems - Course #3 Proximal Splitt...
Low Complexity Regularization of Inverse Problems - Course #3 Proximal Splitt...
 
Function evaluation, termination, vertical line test etc
Function evaluation, termination, vertical line test etcFunction evaluation, termination, vertical line test etc
Function evaluation, termination, vertical line test etc
 
Gabarito completo anton_calculo_8ed_caps_01_08
Gabarito completo anton_calculo_8ed_caps_01_08Gabarito completo anton_calculo_8ed_caps_01_08
Gabarito completo anton_calculo_8ed_caps_01_08
 
Lecture 2: linear SVM in the dual
Lecture 2: linear SVM in the dualLecture 2: linear SVM in the dual
Lecture 2: linear SVM in the dual
 
Lecture 2: linear SVM in the Dual
Lecture 2: linear SVM in the DualLecture 2: linear SVM in the Dual
Lecture 2: linear SVM in the Dual
 
A Szemeredi-type theorem for subsets of the unit cube
A Szemeredi-type theorem for subsets of the unit cubeA Szemeredi-type theorem for subsets of the unit cube
A Szemeredi-type theorem for subsets of the unit cube
 
Difference quotient algebra
Difference quotient algebraDifference quotient algebra
Difference quotient algebra
 
Lecture8 multi class_svm
Lecture8 multi class_svmLecture8 multi class_svm
Lecture8 multi class_svm
 
Maths 301 key_sem_1_2009_2010
Maths 301 key_sem_1_2009_2010Maths 301 key_sem_1_2009_2010
Maths 301 key_sem_1_2009_2010
 

Recently uploaded

microprocessor 8085 and its interfacing
microprocessor 8085  and its interfacingmicroprocessor 8085  and its interfacing
microprocessor 8085 and its interfacingjaychoudhary37
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2RajaP95
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 

Recently uploaded (20)

microprocessor 8085 and its interfacing
microprocessor 8085  and its interfacingmicroprocessor 8085  and its interfacing
microprocessor 8085 and its interfacing
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 

stoch41.pdf

  • 2. 2 Correlated features • Height and weight are not independent
  • 3. 3 Multivariate Gaussian • Multivariate Normal (MVN) • Exponent is the Mahalanobis distance between x and µ Σ is the covariance matrix (positive definite) N(x|µ, Σ) def = 1 (2π)p/2|Σ|1/2 exp[−1 2 (x − µ)T Σ−1 (x − µ)] ∆ = (x − µ)T Σ−1 (x − µ) xT Σx > 0 ∀x
  • 4. 4 Bivariate Gaussian • Covariance matrix is where the correlation coefficient is and satisfies -1 ≤ ρ ≤ 1 • Density is Σ = σ2 x ρσxσy ρσxσy σ2 y ρ = Cov(X, Y ) V ar(X)V ar(Y ) p(x, y) = 1 2πσxσy 1 − ρ2 exp − 1 2(1 − ρ2) x2 σ2 x + y2 σ2 y − 2ρxy (σxσy)
  • 5. 5 Spherical, diagonal, full covariance Σ = σ2 0 0 σ2 Σ = σ2 x 0 0 σ2 y Σ = σ2 x ρσxσy ρσxσy σ2 y
  • 7. 7 Generative classifier • A generative classifier is one that defines a class- conditional density p(x|y=c) and combines this with a class prior p(c) to compute the class posterior • Examples: – Naïve Bayes: – Gaussian classifiers • Alternative is a discriminative classifier, that estimates p(y=c|x) directly. p(y = c|x) = p(x|y = c)p(y = c) c′ p(x|y = c′)p(c′) p(x|y = c) = d j=1 p(xj|y = c) p(x|y = c) = N (x|µc, Σc)
  • 8. 8 Naïve Bayes with Bernoulli features • Consider this class-conditional density • The resulting class posterior (using plugin rule) has the form • This can be rewritten as p(x|y = c) = d i=1 θ I(xi=1) ic (1 − θic)I(xi=0) p(y = c|x) = p(y = c)p(x|y = c) p(x) = πc d i=1 θ I(xi=1) ic (1 − θic)I(xi=0) p(x) p(Y = c|x, θ, π) = p(x|y = c)p(y = c) c′ p(x|y = c′)p(y = c′) = exp[log p(x|y = c) + log p(y = c)] c′ exp[log p(x|y = c′) + log p(y = c′)] = exp [log πc + i I(xi = 1) log θic + I(xi = 0) log(1 − θic)] c′ exp [log πc′ + i I(xi = 1) log θi,c′ + I(xi = 0) log(1 − θic)]
  • 9. 9 Form of the class posterior • From previous slide • Define • Then the posterior is given by the softmax function • This is called softmax because it acts like the max function when |βc| → ∞ p(Y = c|x, θ, π) ∝ exp log πc + i I(xi = 1) log θic + I(xi = 0) log(1 − θic) x′ = [1, I(x1 = 1), I(x1 = 0), . . . , I(xd = 1), I(xd = 0)] βc = [log πc, log θ1c, log(1 − θ1c), . . . , log θdc, log(1 − θdc)] p(Y = c|x, β) = exp[βT c x′ ] c′ exp[βT c′ x′] p(Y = c|x) = 1.0 if c = arg maxc′ βT c′ x 0.0 otherwise
  • 10. 10 Two-class case • From previous slide • In the binary case, Y ∈ {0,1}, the softmax becomes the logistic (sigmoid) function σ(u) = 1/(1+e-u) p(Y = c|x, β) = exp[βT c x′ ] c′ exp[βT c′ x′] p(Y = 1|x, θ) = eβT 1 x′ eβT 1 x′ + eβT 0 x′ = 1 1 + e(β0−β1)T x′ = 1 1 + ewT x′ = σ(wT x′ )
  • 11. 11 Sigmoid function • σ(ax + b), a controls steepness, b is threshold. • For small a and x ≈ –b/2, roughly linear a=0.3 a=1.0 a=3.0 b=-30 b=0 b=+30
  • 12. 12 Sigmoid function in 2D Mackay 39.3 σ(w1 x1 + w2 x2) = σ(wT x): w is perpendicular to the decision boundary
  • 13. 13 Logit function • Let p=p(y=1) and η be the log odds • Then p = σ(η) and η = logit(p) η is the natural parameter of the Bernoulli distribution, and p = E[y] is the moment parameter • If η = wT x, then wi is how much the log-odds increases by if we increase xi η = log p 1 − p σ(η) = 1 1 + e−η = eη eη + 1 = p (1−p) p 1−p + 1 = p (1−p) p+1−p 1−p = p
  • 14. 14 Gaussian classifiers • Class posterior (using plug-in rule) • We will consider the form of this equation for various special cases: • Σ1=Σ0, • tied, many classes • General case p(Y = c|x) = p(x|Y = c)p(Y = c) C c′=1 p(x|Y = c′)p(Y = c′) = πc|2πΣc|− 1 2 exp −1 2 (x − µc)T Σ−1 c (x − µc) c′ πc′ |2πΣc′ |− 1 2 exp −1 2 (x − µc′ )T Σ−1 c′ (x − µc′ ) Σc
  • 15. 15 Σ1 = Σ0 • Class posterior simplifies to p(Y = 1|x) = p(x|Y = 1)p(Y = 1) p(x|Y = 1)p(Y = 1) + p(x|Y = 0)p(Y = 0) = π1 exp −1 2 (x − µ1)T Σ−1 (x − µ1) π1 exp −1 2 (x − µ1)T Σ−1(x − µ1) + π0 exp −1 2 (x − µ0)T Σ−1(x − µ0) = π1ea1 π1ea1 + π0ea0 = 1 1 + π0 π1 ea0−a1 ac def = −1 2 (x − µc)T Σ(x − µc)
  • 16. 16 Σ1 = Σ0 • Class posterior simplifies to p(Y = 1|x) = 1 1 + exp − log π1 π0 + a0 − a1 a0 − a1 = −1 2 (x − µ0)T Σ−1 (x − µ0) + 1 2 (x − µ1)T Σ−1 (x − µ1) = −(µ1 − µ0)T Σ−1 x + 1 2 (µ1 − µ0)T Σ−1 (µ1 + µ0) so p(Y = 1|x) = 1 1 + exp [−βT x − γ] = σ(βT x + γ) β def = Σ−1 (µ1 − µ0) γ def = −1 2 (µ1 − µ0)T Σ−1 (µ1 + µ0) + log π1 π0 σ(η) def = 1 1 + e−η = eη eη + 1 Linear function of x
  • 17. 17 Decision boundary • Rewrite class posterior as • If Σ=I, then w=(µ1-µ0) is in the direction of µ1-µ0, so the hyperplane is orthogonal to the line between the two means, and intersects it at x0 • If π1=π0, then x0 = 0.5(µ1+µ0) is midway between the two means • If π1 increases, x0 decreases, so the boundary shifts toward µ0 (so more space gets mapped to class 1) p(Y = 1|x) = σ(βT x + γ) = σ(wT (x − x0)) w = β = Σ−1 (µ1 − µ0) x0 = − γ β = 1 2 (µ1 + µ0) − log(π1/π0) (µ1 − µ0)T Σ−1(µ1 − µ0) (µ1 − µ0)
  • 18. 18 Decision boundary in 1d Discontinuous decision region
  • 20. 20 Tied Σ, many classes • Similarly to before • This is the multinomial logit or softmax function p(Y = c|x) = πc exp −1 2 (x − µc)T Σ−1 c (x − µc) c′ πc′ exp −1 2 (x − µc′ )T Σ−1 c′ (x − µc′ ) = exp µT ′ Σ−1 x − 1 2 µT c Σ−1 µc + log πc c′ exp µT c′ Σ−1x − 1 2 µT c′ Σ−1µc′ + log πc′ θc def = −µT c Σ−1 µc + log πc Σ−1 µc = γc βc p(Y = c|x) = eθT c x c′ eθT c′ x = eβT c x+γc c′ eβT c′ x+γc′
  • 21. 21 Tied Σ, many classes • Discriminant function • Decision boundary is again linear, since xT Σ x terms cancel • If Σ = I, then the decision boundaries are orthogonal to µi - µj, otherwise skewed gc(x) = −1 2 (x − µc)T Σ−1 (x − µc) + log p(Y = c) = βT c x + βc0 βc = Σ−1 µc βc0 = −1 2 µT c Σ−1 µc + log πc
  • 22. 22 Decision boundaries [x,y] = meshgrid(linspace(-10,10,100), linspace(-10,10,100)); g1 = reshape(mvnpdf(X, mu1(:)’, S1), [m n]); ... contour(x,y,g2*p2-max(g1*p1, g3*p3),[0 0],’-k’);
  • 23. 23 Σ0, Σ1 arbitrary • If the Σ are unconstrained, we end up with cross product terms, leading to quadratic decision boundaries
  • 24. 24 General case µ1 = (0, 0), µ2 = (0, 5), µ3 = (5, 5), π = (1/3, 1/3, 1/3)