SlideShare a Scribd company logo
1 of 35
Download to read offline
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Statistical Pattern Recognition
Lecture4
Bayesian Learning
Dr Zohreh Azimifar
School of Electrical and Computer Engineering
Shiraz University
Fall2014
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 1 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Table of contents
1 Introduction
2 Generative Learning vs Discriminative Learning
Generative Learning and Discriminative Learning
3 Linear Discriminant Analysis
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
4 Quadratic Discriminant Analysis
Quadratic Discriminant Analysis
Analysis of QDA
5 GLAD and QDA, Another point of view
GLAD and QDA, Another point of view
6 Naive Bayes
Naive Bayes
Naive Bayes: An Example
7 Lecture Summary
Summary
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 2 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Introduction
Classification based on the theory Bayesian Learning
P(y = 0|X) ≷y=0
y=1 P(y = 1|X)
Classification involves determining P(y|X), from different
perspectives.
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 3 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Generative Learning and Discriminative Learning
Generative Learning and Discriminative Learning
1 Discriminative Learning:
Direct learning of P(y|X).
Modelling of decision boundary, to which side a new sample is
assigned.
Logistic and softmax regression are called discriminative learners.
2 Generative Learning
Explicit modelling of each class separately.
Compare new sample with each class probability, based on Bayesian
rule:
P(y|X) =
Likelihood
z }| {
P(X|y)
Prior
z }| {
P(y)
P(X)
| {z }
normalizing factor
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 4 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Gaussian Linear Discriminant Analysis
Model P(y) and P(X|y) for each class y:
P(y) = φ
1{y=1}
1 φ
1{y=2}
2 · · · φ1{y=c}
c
P(X|y = i) =
1
(
√
2π)n|Σ|
1
2
exp(
−1
2
(X − µi )T
Σ−1
(X − µi ))
Parameter set: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ}
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 5 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Gaussian Linear Discriminant Analysis
Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ}
l(θ) = log
m
Y
j=1
P(X(j)
, y(j)
) = log
m
Y
j=1
P(X(j)
|P(y(j)
))P(y(j)
)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Gaussian Linear Discriminant Analysis
Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ}
l(θ) = log
m
Y
j=1
P(X(j)
, y(j)
) = log
m
Y
j=1
P(X(j)
|P(y(j)
))P(y(j)
)
Take partial derivative in terms of each individual parameter:
φMLE
i =
Pm
j=1 1{y(j)
= i}
m
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Gaussian Linear Discriminant Analysis
Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ}
l(θ) = log
m
Y
j=1
P(X(j)
, y(j)
) = log
m
Y
j=1
P(X(j)
|P(y(j)
))P(y(j)
)
Take partial derivative in terms of each individual parameter:
φMLE
i =
Pm
j=1 1{y(j)
= i}
m
µMLE
i =
Pm
j=1 1{y(j)
= i}X(j)
Pm
j=1 1{y(j) = i}
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Gaussian Linear Discriminant Analysis
Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ}
l(θ) = log
m
Y
j=1
P(X(j)
, y(j)
) = log
m
Y
j=1
P(X(j)
|P(y(j)
))P(y(j)
)
Take partial derivative in terms of each individual parameter:
φMLE
i =
Pm
j=1 1{y(j)
= i}
m
µMLE
i =
Pm
j=1 1{y(j)
= i}X(j)
Pm
j=1 1{y(j) = i}
ΣMLE
=
1
m
m
X
j=1
(X(j)
− µy(j) )(X(j)
− µy(j) )T
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Gaussian Linear Discriminant Analysis
Determine class label of a new sample Xnew
:
ynew
= argmaxy P(y|X)
= argmaxy
P(X|y)P(y)
P(X)
= argmaxy P(X|y)P(y)
Note that P(X|y) is a class dependent density.
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 7 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary is a line, a plane, or a hyper-plane. Why?
P(y = i|X) = P(y = j|X)
P(X|y = i)P(y = i)
P(X)
=
P(X|y = j)P(y = j)
P(X)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary is a line, a plane, or a hyper-plane. Why?
P(y = i|X) = P(y = j|X)
P(X|y = i)P(y = i)
P(X)
=
P(X|y = j)P(y = j)
P(X)
P(X|y = i)P(y = i) = P(X|y = j)P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary is a line, a plane, or a hyper-plane. Why?
P(y = i|X) = P(y = j|X)
P(X|y = i)P(y = i)
P(X)
=
P(X|y = j)P(y = j)
P(X)
P(X|y = i)P(y = i) = P(X|y = j)P(y = j)
1
(2π)
n
2 |Σ|
1
2
exp(
−1
2
(X − µi )T
Σ−1
(X − µi ))P(y = i)
=
1
(2π)
n
2 |Σ|
1
2
exp(
−1
2
(X − µj )T
Σ−1
(X − µj ))P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary cont’ed:
exp(
−1
2
(X − µi )T
Σ−1
(X − µi ))P(y = i) = exp(
−1
2
(X − µj )T
Σ−1
(X − µj ))P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary cont’ed:
exp(
−1
2
(X − µi )T
Σ−1
(X − µi ))P(y = i) = exp(
−1
2
(X − µj )T
Σ−1
(X − µj ))P(y = j)
−1
2
(X − µi )T
Σ−1
(X − µi ) + log P(y = i) =
−1
2
(X − µj )T
Σ−1
(X − µj ) + log P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary cont’ed:
exp(
−1
2
(X − µi )T
Σ−1
(X − µi ))P(y = i) = exp(
−1
2
(X − µj )T
Σ−1
(X − µj ))P(y = j)
−1
2
(X − µi )T
Σ−1
(X − µi ) + log P(y = i) =
−1
2
(X − µj )T
Σ−1
(X − µj ) + log P(y = j)
⇒ log
P(y = i)
P(y = j)
−
1
2
[XT
Σ−1
X − 2XT
Σ−1
µi + µT
i Σ−1
µi ]
+
1
2
[XT
Σ−1
X − 2XT
Σ−1
µj + µT
j Σ−1
µj ] = 0
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary cont’ed:
exp(
−1
2
(X − µi )T
Σ−1
(X − µi ))P(y = i) = exp(
−1
2
(X − µj )T
Σ−1
(X − µj ))P(y = j)
−1
2
(X − µi )T
Σ−1
(X − µi ) + log P(y = i) =
−1
2
(X − µj )T
Σ−1
(X − µj ) + log P(y = j)
⇒ log
P(y = i)
P(y = j)
−
1
2
[XT
Σ−1
X − 2XT
Σ−1
µi + µT
i Σ−1
µi ]
+
1
2
[XT
Σ−1
X − 2XT
Σ−1
µj + µT
j Σ−1
µj ] = 0
⇒ XT
Σ−1
(µi − µj )
| {z }
aX
+
1
2
µT
i Σ−1
µi −
1
2
µT
j Σ−1
µj + log
P(y = i)
P(y = j)
| {z }
b
= 0
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Analysis of GLDA when Σ = σ2
I
Classes are of identical distribution, but different means.
Cross-section of classes distribution is spherical.
Decision boundary is linear.
Called classifier with nearest Euclidean distance to the class mean,
when?
Σ =

1 0
0 1

Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 10 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Analysis of GLDA when Σ is not identity
Classes are of identical distribution, but different means.
Cross-section of classes distribution is ellipsoidal.
Decision boundary is linear.
Σ =

0.5 0
0 1.5

Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 11 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Analysis of GLDA with arbitrary Σ
Classes are of identical distribution, but different means.
Cross-section of classes distribution is ellipsoidal. Linear decision
boundary.
Classes are aligned with direction of covariance eigenvectors.
Σ =

0.5 0.2
0.2 1.5

Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 12 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Analysis of GLDA with arbitrary Σ
Classes are of identical distribution, but different means.
Cross-section of classes distribution is ellipsoidal. Linear decision
boundary.
Classes are aligned with direction of covariance eigenvectors.
Σ =

0.5 0.2
0.2 1.5

Called classifier with nearest Mahalanobis distance to the class
mean, when? Dist(X, µi) = (X − µi)T
Σ−1
(X − µi)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 12 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Another generative learning model; a Bayesian classifier
Here, classes are of multinomial distribution, and likelihoods are
multivariate Gaussian with separate covariance Σi .
Decision boundary becomes non-linear, Why?
P(X|y = i) =
1
(
√
2π)n|Σi |
1
2
exp(
−1
2
(X − µi )T
Σ−1
i (X − µi ))
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 13 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Decision boundary is parabolic:
P(y = i|X) = P(y = j|X)
P(X|y = i)P(y = i)
P(X)
=
P(X|y = j)P(y = j)
P(X)
P(X|y = i)P(y = i) = P(X|y = j)P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 14 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Decision boundary is parabolic:
P(y = i|X) = P(y = j|X)
P(X|y = i)P(y = i)
P(X)
=
P(X|y = j)P(y = j)
P(X)
P(X|y = i)P(y = i) = P(X|y = j)P(y = j)
1
(2π)
n
2 |Σi |
1
2
exp(
−1
2
(X − µi )T
Σ−1
i (X − µi ))P(y = i)
=
1
(2π)
n
2 |Σj |
1
2
exp(
−1
2
(X − µj )T
Σ−1
j (X − µj ))P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 14 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Decision boundary cont’ed:
1
|Σi |
1
2
exp(
−1
2
(X − µi )T
Σ−1
i (X − µi ))P(y = i)
=
1
|Σj |
1
2
exp(
−1
2
(X − µj )T
Σ−1
j (X − µj ))P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 15 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Decision boundary cont’ed:
1
|Σi |
1
2
exp(
−1
2
(X − µi )T
Σ−1
i (X − µi ))P(y = i)
=
1
|Σj |
1
2
exp(
−1
2
(X − µj )T
Σ−1
j (X − µj ))P(y = j)
−1
2
log|Σi | −
1
2
(X − µi )T
Σ−1
i (X − µi ) + logP(y = i)
=
−1
2
log|Σj | −
1
2
(X − µj )T
Σ−1
j (X − µj ) + logP(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 15 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Decision boundary cont’ed:
1
|Σi |
1
2
exp(
−1
2
(X − µi )T
Σ−1
i (X − µi ))P(y = i)
=
1
|Σj |
1
2
exp(
−1
2
(X − µj )T
Σ−1
j (X − µj ))P(y = j)
−1
2
log|Σi | −
1
2
(X − µi )T
Σ−1
i (X − µi ) + logP(y = i)
=
−1
2
log|Σj | −
1
2
(X − µj )T
Σ−1
j (X − µj ) + logP(y = j)
⇒ log
P(y = i)
P(y = j)
−
1
2
log
|Σi |
|Σj |
−
1
2
[XT
Σ−1
i X + µT
i Σ−1
i µi −
2XT
Σ−1
i µi − XT
Σ−1
j X − µT
j Σ−1
j µj + 2XT
Σ−1
j µj ] = 0
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 15 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Decision boundary cont’ed:
log
P(y = i)
P(y = j)
−
1
2
log
Σi

More Related Content

What's hot

A STUDY ON L-FUZZY NORMAL SUBl -GROUP
A STUDY ON L-FUZZY NORMAL SUBl -GROUPA STUDY ON L-FUZZY NORMAL SUBl -GROUP
A STUDY ON L-FUZZY NORMAL SUBl -GROUPijejournal
 
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet ProcessesBayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet ProcessesJinYeong Bak
 
RGA of Cayley graphs
RGA of Cayley graphsRGA of Cayley graphs
RGA of Cayley graphsJohn Owen
 
Introduction to second gradient theory of elasticity - Arjun Narayanan
Introduction to second gradient theory of elasticity - Arjun NarayananIntroduction to second gradient theory of elasticity - Arjun Narayanan
Introduction to second gradient theory of elasticity - Arjun NarayananArjun Narayanan
 
On the proof theory for Description Logics
On the proof theory for Description LogicsOn the proof theory for Description Logics
On the proof theory for Description LogicsAlexandre Rademaker
 
A new generalized lindley distribution
A new generalized lindley distributionA new generalized lindley distribution
A new generalized lindley distributionAlexander Decker
 
A Gentle Introduction to Bayesian Nonparametrics
A Gentle Introduction to Bayesian NonparametricsA Gentle Introduction to Bayesian Nonparametrics
A Gentle Introduction to Bayesian NonparametricsJulyan Arbel
 
Linear Discriminant Analysis (LDA) Under f-Divergence Measures
Linear Discriminant Analysis (LDA) Under f-Divergence MeasuresLinear Discriminant Analysis (LDA) Under f-Divergence Measures
Linear Discriminant Analysis (LDA) Under f-Divergence MeasuresAnmol Dwivedi
 
A Gentle Introduction to Bayesian Nonparametrics
A Gentle Introduction to Bayesian NonparametricsA Gentle Introduction to Bayesian Nonparametrics
A Gentle Introduction to Bayesian NonparametricsJulyan Arbel
 

What's hot (10)

A STUDY ON L-FUZZY NORMAL SUBl -GROUP
A STUDY ON L-FUZZY NORMAL SUBl -GROUPA STUDY ON L-FUZZY NORMAL SUBl -GROUP
A STUDY ON L-FUZZY NORMAL SUBl -GROUP
 
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet ProcessesBayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
 
RGA of Cayley graphs
RGA of Cayley graphsRGA of Cayley graphs
RGA of Cayley graphs
 
Introduction to second gradient theory of elasticity - Arjun Narayanan
Introduction to second gradient theory of elasticity - Arjun NarayananIntroduction to second gradient theory of elasticity - Arjun Narayanan
Introduction to second gradient theory of elasticity - Arjun Narayanan
 
On the proof theory for Description Logics
On the proof theory for Description LogicsOn the proof theory for Description Logics
On the proof theory for Description Logics
 
A new generalized lindley distribution
A new generalized lindley distributionA new generalized lindley distribution
A new generalized lindley distribution
 
A Gentle Introduction to Bayesian Nonparametrics
A Gentle Introduction to Bayesian NonparametricsA Gentle Introduction to Bayesian Nonparametrics
A Gentle Introduction to Bayesian Nonparametrics
 
Lec 17
Lec  17Lec  17
Lec 17
 
Linear Discriminant Analysis (LDA) Under f-Divergence Measures
Linear Discriminant Analysis (LDA) Under f-Divergence MeasuresLinear Discriminant Analysis (LDA) Under f-Divergence Measures
Linear Discriminant Analysis (LDA) Under f-Divergence Measures
 
A Gentle Introduction to Bayesian Nonparametrics
A Gentle Introduction to Bayesian NonparametricsA Gentle Introduction to Bayesian Nonparametrics
A Gentle Introduction to Bayesian Nonparametrics
 

Similar to Dr azimifar pattern recognition lect4

Model complexity
Model complexityModel complexity
Model complexityDeb Roy
 
Reading "Bayesian measures of model complexity and fit"
Reading "Bayesian measures of model complexity and fit"Reading "Bayesian measures of model complexity and fit"
Reading "Bayesian measures of model complexity and fit"Christian Robert
 
Nec 602 unit ii Random Variables and Random process
Nec 602 unit ii Random Variables and Random processNec 602 unit ii Random Variables and Random process
Nec 602 unit ii Random Variables and Random processDr Naim R Kidwai
 
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013Christian Robert
 
Type-based Dependency Analysis
Type-based Dependency AnalysisType-based Dependency Analysis
Type-based Dependency AnalysisMatthias Keil
 
Hypothesis testings on individualized treatment rules
Hypothesis testings on individualized treatment rulesHypothesis testings on individualized treatment rules
Hypothesis testings on individualized treatment rulesYoung-Geun Choi
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinChristian Robert
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Christian Robert
 
graphicascaaaaaaaaaa dsaaaaal models.pdf
graphicascaaaaaaaaaa dsaaaaal models.pdfgraphicascaaaaaaaaaa dsaaaaal models.pdf
graphicascaaaaaaaaaa dsaaaaal models.pdfshanmathews0007
 
PAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ WarwickPAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ WarwickPierre Jacob
 
Self-organizing Network for Variable Clustering and Predictive Modeling
Self-organizing Network for Variable Clustering and Predictive ModelingSelf-organizing Network for Variable Clustering and Predictive Modeling
Self-organizing Network for Variable Clustering and Predictive ModelingHui Yang
 
Gradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation GraphsGradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation GraphsYoonho Lee
 
prior selection for mixture estimation
prior selection for mixture estimationprior selection for mixture estimation
prior selection for mixture estimationChristian Robert
 
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian ProcessesAccelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian ProcessesMatt Moores
 

Similar to Dr azimifar pattern recognition lect4 (19)

Model complexity
Model complexityModel complexity
Model complexity
 
Reading "Bayesian measures of model complexity and fit"
Reading "Bayesian measures of model complexity and fit"Reading "Bayesian measures of model complexity and fit"
Reading "Bayesian measures of model complexity and fit"
 
Nec 602 unit ii Random Variables and Random process
Nec 602 unit ii Random Variables and Random processNec 602 unit ii Random Variables and Random process
Nec 602 unit ii Random Variables and Random process
 
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
 
DIC
DICDIC
DIC
 
Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
 
Type-based Dependency Analysis
Type-based Dependency AnalysisType-based Dependency Analysis
Type-based Dependency Analysis
 
Hypothesis testings on individualized treatment rules
Hypothesis testings on individualized treatment rulesHypothesis testings on individualized treatment rules
Hypothesis testings on individualized treatment rules
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael Martin
 
asymptotics of ABC
asymptotics of ABCasymptotics of ABC
asymptotics of ABC
 
PMED Opening Workshop - Inference on Individualized Treatment Rules from Obse...
PMED Opening Workshop - Inference on Individualized Treatment Rules from Obse...PMED Opening Workshop - Inference on Individualized Treatment Rules from Obse...
PMED Opening Workshop - Inference on Individualized Treatment Rules from Obse...
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
 
graphicascaaaaaaaaaa dsaaaaal models.pdf
graphicascaaaaaaaaaa dsaaaaal models.pdfgraphicascaaaaaaaaaa dsaaaaal models.pdf
graphicascaaaaaaaaaa dsaaaaal models.pdf
 
PAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ WarwickPAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ Warwick
 
Self-organizing Network for Variable Clustering and Predictive Modeling
Self-organizing Network for Variable Clustering and Predictive ModelingSelf-organizing Network for Variable Clustering and Predictive Modeling
Self-organizing Network for Variable Clustering and Predictive Modeling
 
Gradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation GraphsGradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation Graphs
 
prior selection for mixture estimation
prior selection for mixture estimationprior selection for mixture estimation
prior selection for mixture estimation
 
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
 
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian ProcessesAccelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
 

More from Zahra Amini

Dip azimifar enhancement_l05_2020
Dip azimifar enhancement_l05_2020Dip azimifar enhancement_l05_2020
Dip azimifar enhancement_l05_2020Zahra Amini
 
Dip azimifar enhancement_l04_2020
Dip azimifar enhancement_l04_2020Dip azimifar enhancement_l04_2020
Dip azimifar enhancement_l04_2020Zahra Amini
 
Dip azimifar enhancement_l03_2020
Dip azimifar enhancement_l03_2020Dip azimifar enhancement_l03_2020
Dip azimifar enhancement_l03_2020Zahra Amini
 
Dip azimifar enhancement_l02_2020
Dip azimifar enhancement_l02_2020Dip azimifar enhancement_l02_2020
Dip azimifar enhancement_l02_2020Zahra Amini
 
Dip azimifar enhancement_l01_2020
Dip azimifar enhancement_l01_2020Dip azimifar enhancement_l01_2020
Dip azimifar enhancement_l01_2020Zahra Amini
 
Ch 1-3 nn learning 1-7
Ch 1-3 nn learning 1-7Ch 1-3 nn learning 1-7
Ch 1-3 nn learning 1-7Zahra Amini
 
Ch 1-2 NN classifier
Ch 1-2 NN classifierCh 1-2 NN classifier
Ch 1-2 NN classifierZahra Amini
 
Ch 1-1 introduction
Ch 1-1 introductionCh 1-1 introduction
Ch 1-1 introductionZahra Amini
 
Kernel estimation(ref)
Kernel estimation(ref)Kernel estimation(ref)
Kernel estimation(ref)Zahra Amini
 
Dr azimifar pattern recognition lect2
Dr azimifar pattern recognition lect2Dr azimifar pattern recognition lect2
Dr azimifar pattern recognition lect2Zahra Amini
 
Dr azimifar pattern recognition lect1
Dr azimifar pattern recognition lect1Dr azimifar pattern recognition lect1
Dr azimifar pattern recognition lect1Zahra Amini
 

More from Zahra Amini (12)

Dip azimifar enhancement_l05_2020
Dip azimifar enhancement_l05_2020Dip azimifar enhancement_l05_2020
Dip azimifar enhancement_l05_2020
 
Dip azimifar enhancement_l04_2020
Dip azimifar enhancement_l04_2020Dip azimifar enhancement_l04_2020
Dip azimifar enhancement_l04_2020
 
Dip azimifar enhancement_l03_2020
Dip azimifar enhancement_l03_2020Dip azimifar enhancement_l03_2020
Dip azimifar enhancement_l03_2020
 
Dip azimifar enhancement_l02_2020
Dip azimifar enhancement_l02_2020Dip azimifar enhancement_l02_2020
Dip azimifar enhancement_l02_2020
 
Dip azimifar enhancement_l01_2020
Dip azimifar enhancement_l01_2020Dip azimifar enhancement_l01_2020
Dip azimifar enhancement_l01_2020
 
Ch 1-3 nn learning 1-7
Ch 1-3 nn learning 1-7Ch 1-3 nn learning 1-7
Ch 1-3 nn learning 1-7
 
Ch 1-2 NN classifier
Ch 1-2 NN classifierCh 1-2 NN classifier
Ch 1-2 NN classifier
 
Ch 1-1 introduction
Ch 1-1 introductionCh 1-1 introduction
Ch 1-1 introduction
 
Lecture 8
Lecture 8Lecture 8
Lecture 8
 
Kernel estimation(ref)
Kernel estimation(ref)Kernel estimation(ref)
Kernel estimation(ref)
 
Dr azimifar pattern recognition lect2
Dr azimifar pattern recognition lect2Dr azimifar pattern recognition lect2
Dr azimifar pattern recognition lect2
 
Dr azimifar pattern recognition lect1
Dr azimifar pattern recognition lect1Dr azimifar pattern recognition lect1
Dr azimifar pattern recognition lect1
 

Recently uploaded

Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2RajaP95
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and usesDevarapalliHaritha
 

Recently uploaded (20)

Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 

Dr azimifar pattern recognition lect4

  • 1. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Statistical Pattern Recognition Lecture4 Bayesian Learning Dr Zohreh Azimifar School of Electrical and Computer Engineering Shiraz University Fall2014 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 1 / 22
  • 2. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Table of contents 1 Introduction 2 Generative Learning vs Discriminative Learning Generative Learning and Discriminative Learning 3 Linear Discriminant Analysis Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA 4 Quadratic Discriminant Analysis Quadratic Discriminant Analysis Analysis of QDA 5 GLAD and QDA, Another point of view GLAD and QDA, Another point of view 6 Naive Bayes Naive Bayes Naive Bayes: An Example 7 Lecture Summary Summary Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 2 / 22
  • 3. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Introduction Classification based on the theory Bayesian Learning P(y = 0|X) ≷y=0 y=1 P(y = 1|X) Classification involves determining P(y|X), from different perspectives. Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 3 / 22
  • 4. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Generative Learning and Discriminative Learning Generative Learning and Discriminative Learning 1 Discriminative Learning: Direct learning of P(y|X). Modelling of decision boundary, to which side a new sample is assigned. Logistic and softmax regression are called discriminative learners. 2 Generative Learning Explicit modelling of each class separately. Compare new sample with each class probability, based on Bayesian rule: P(y|X) = Likelihood z }| { P(X|y) Prior z }| { P(y) P(X) | {z } normalizing factor Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 4 / 22
  • 5. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Model P(y) and P(X|y) for each class y: P(y) = φ 1{y=1} 1 φ 1{y=2} 2 · · · φ1{y=c} c P(X|y = i) = 1 ( √ 2π)n|Σ| 1 2 exp( −1 2 (X − µi )T Σ−1 (X − µi )) Parameter set: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 5 / 22
  • 6. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} l(θ) = log m Y j=1 P(X(j) , y(j) ) = log m Y j=1 P(X(j) |P(y(j) ))P(y(j) ) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
  • 7. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} l(θ) = log m Y j=1 P(X(j) , y(j) ) = log m Y j=1 P(X(j) |P(y(j) ))P(y(j) ) Take partial derivative in terms of each individual parameter: φMLE i = Pm j=1 1{y(j) = i} m Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
  • 8. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} l(θ) = log m Y j=1 P(X(j) , y(j) ) = log m Y j=1 P(X(j) |P(y(j) ))P(y(j) ) Take partial derivative in terms of each individual parameter: φMLE i = Pm j=1 1{y(j) = i} m µMLE i = Pm j=1 1{y(j) = i}X(j) Pm j=1 1{y(j) = i} Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
  • 9. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} l(θ) = log m Y j=1 P(X(j) , y(j) ) = log m Y j=1 P(X(j) |P(y(j) ))P(y(j) ) Take partial derivative in terms of each individual parameter: φMLE i = Pm j=1 1{y(j) = i} m µMLE i = Pm j=1 1{y(j) = i}X(j) Pm j=1 1{y(j) = i} ΣMLE = 1 m m X j=1 (X(j) − µy(j) )(X(j) − µy(j) )T Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
  • 10. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Determine class label of a new sample Xnew : ynew = argmaxy P(y|X) = argmaxy P(X|y)P(y) P(X) = argmaxy P(X|y)P(y) Note that P(X|y) is a class dependent density. Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 7 / 22
  • 11. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary is a line, a plane, or a hyper-plane. Why? P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
  • 12. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary is a line, a plane, or a hyper-plane. Why? P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) P(X|y = i)P(y = i) = P(X|y = j)P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
  • 13. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary is a line, a plane, or a hyper-plane. Why? P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) P(X|y = i)P(y = i) = P(X|y = j)P(y = j) 1 (2π) n 2 |Σ| 1 2 exp( −1 2 (X − µi )T Σ−1 (X − µi ))P(y = i) = 1 (2π) n 2 |Σ| 1 2 exp( −1 2 (X − µj )T Σ−1 (X − µj ))P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
  • 14. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary cont’ed: exp( −1 2 (X − µi )T Σ−1 (X − µi ))P(y = i) = exp( −1 2 (X − µj )T Σ−1 (X − µj ))P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
  • 15. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary cont’ed: exp( −1 2 (X − µi )T Σ−1 (X − µi ))P(y = i) = exp( −1 2 (X − µj )T Σ−1 (X − µj ))P(y = j) −1 2 (X − µi )T Σ−1 (X − µi ) + log P(y = i) = −1 2 (X − µj )T Σ−1 (X − µj ) + log P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
  • 16. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary cont’ed: exp( −1 2 (X − µi )T Σ−1 (X − µi ))P(y = i) = exp( −1 2 (X − µj )T Σ−1 (X − µj ))P(y = j) −1 2 (X − µi )T Σ−1 (X − µi ) + log P(y = i) = −1 2 (X − µj )T Σ−1 (X − µj ) + log P(y = j) ⇒ log P(y = i) P(y = j) − 1 2 [XT Σ−1 X − 2XT Σ−1 µi + µT i Σ−1 µi ] + 1 2 [XT Σ−1 X − 2XT Σ−1 µj + µT j Σ−1 µj ] = 0 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
  • 17. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary cont’ed: exp( −1 2 (X − µi )T Σ−1 (X − µi ))P(y = i) = exp( −1 2 (X − µj )T Σ−1 (X − µj ))P(y = j) −1 2 (X − µi )T Σ−1 (X − µi ) + log P(y = i) = −1 2 (X − µj )T Σ−1 (X − µj ) + log P(y = j) ⇒ log P(y = i) P(y = j) − 1 2 [XT Σ−1 X − 2XT Σ−1 µi + µT i Σ−1 µi ] + 1 2 [XT Σ−1 X − 2XT Σ−1 µj + µT j Σ−1 µj ] = 0 ⇒ XT Σ−1 (µi − µj ) | {z } aX + 1 2 µT i Σ−1 µi − 1 2 µT j Σ−1 µj + log P(y = i) P(y = j) | {z } b = 0 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
  • 18. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Analysis of GLDA when Σ = σ2 I Classes are of identical distribution, but different means. Cross-section of classes distribution is spherical. Decision boundary is linear. Called classifier with nearest Euclidean distance to the class mean, when? Σ = 1 0 0 1 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 10 / 22
  • 19. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Analysis of GLDA when Σ is not identity Classes are of identical distribution, but different means. Cross-section of classes distribution is ellipsoidal. Decision boundary is linear. Σ = 0.5 0 0 1.5 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 11 / 22
  • 20. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Analysis of GLDA with arbitrary Σ Classes are of identical distribution, but different means. Cross-section of classes distribution is ellipsoidal. Linear decision boundary. Classes are aligned with direction of covariance eigenvectors. Σ = 0.5 0.2 0.2 1.5 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 12 / 22
  • 21. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Analysis of GLDA with arbitrary Σ Classes are of identical distribution, but different means. Cross-section of classes distribution is ellipsoidal. Linear decision boundary. Classes are aligned with direction of covariance eigenvectors. Σ = 0.5 0.2 0.2 1.5 Called classifier with nearest Mahalanobis distance to the class mean, when? Dist(X, µi) = (X − µi)T Σ−1 (X − µi) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 12 / 22
  • 22. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Another generative learning model; a Bayesian classifier Here, classes are of multinomial distribution, and likelihoods are multivariate Gaussian with separate covariance Σi . Decision boundary becomes non-linear, Why? P(X|y = i) = 1 ( √ 2π)n|Σi | 1 2 exp( −1 2 (X − µi )T Σ−1 i (X − µi )) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 13 / 22
  • 23. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary is parabolic: P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) P(X|y = i)P(y = i) = P(X|y = j)P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 14 / 22
  • 24. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary is parabolic: P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) P(X|y = i)P(y = i) = P(X|y = j)P(y = j) 1 (2π) n 2 |Σi | 1 2 exp( −1 2 (X − µi )T Σ−1 i (X − µi ))P(y = i) = 1 (2π) n 2 |Σj | 1 2 exp( −1 2 (X − µj )T Σ−1 j (X − µj ))P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 14 / 22
  • 25. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary cont’ed: 1 |Σi | 1 2 exp( −1 2 (X − µi )T Σ−1 i (X − µi ))P(y = i) = 1 |Σj | 1 2 exp( −1 2 (X − µj )T Σ−1 j (X − µj ))P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 15 / 22
  • 26. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary cont’ed: 1 |Σi | 1 2 exp( −1 2 (X − µi )T Σ−1 i (X − µi ))P(y = i) = 1 |Σj | 1 2 exp( −1 2 (X − µj )T Σ−1 j (X − µj ))P(y = j) −1 2 log|Σi | − 1 2 (X − µi )T Σ−1 i (X − µi ) + logP(y = i) = −1 2 log|Σj | − 1 2 (X − µj )T Σ−1 j (X − µj ) + logP(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 15 / 22
  • 27. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary cont’ed: 1 |Σi | 1 2 exp( −1 2 (X − µi )T Σ−1 i (X − µi ))P(y = i) = 1 |Σj | 1 2 exp( −1 2 (X − µj )T Σ−1 j (X − µj ))P(y = j) −1 2 log|Σi | − 1 2 (X − µi )T Σ−1 i (X − µi ) + logP(y = i) = −1 2 log|Σj | − 1 2 (X − µj )T Σ−1 j (X − µj ) + logP(y = j) ⇒ log P(y = i) P(y = j) − 1 2 log |Σi | |Σj | − 1 2 [XT Σ−1 i X + µT i Σ−1 i µi − 2XT Σ−1 i µi − XT Σ−1 j X − µT j Σ−1 j µj + 2XT Σ−1 j µj ] = 0 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 15 / 22
  • 28. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary cont’ed: log P(y = i) P(y = j) − 1 2 log
  • 29.
  • 30.
  • 31. Σi
  • 32.
  • 33.
  • 34.
  • 35.
  • 36.
  • 37. Σj
  • 38.
  • 39.
  • 40. − 1 2 [XT (Σ−1 i − Σ−1 j )X + µT i Σ−1 i µi − µT j Σ−1 j µj − 2XT (Σ−1 i µi − Σ−1 j µj )] = 0 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 16 / 22
  • 41. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary cont’ed: log P(y = i) P(y = j) − 1 2 log
  • 42.
  • 43.
  • 44. Σi
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50. Σj
  • 51.
  • 52.
  • 53. − 1 2 [XT (Σ−1 i − Σ−1 j )X + µT i Σ−1 i µi − µT j Σ−1 j µj − 2XT (Σ−1 i µi − Σ−1 j µj )] = 0 ⇒ XT aX + bT X + c = 0 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 16 / 22
  • 54. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Analysis of QDA when Σ = σ2 i I Classes are of different distributions and different means. Cross-sections of classes distribution are spherical but of different sizes. Σ1 = 1.5 0 0 1.5 , Σ2 = 2 0 0 2 , Σ3 = 1 0 0 1 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 17 / 22
  • 55. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Analysis of QDA with arbitrary Σi 6= Σj Classes are of different distributions and different means. Cross-sections of classes distribution are ellipsoidal and of different sizes. Σ1 = 1.5 0.1 0.1 0.5 , Σ2 = 1 −0.2 −0.2 2 , Σ3 = 2 −0.25 −0.25 1.5 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 18 / 22
  • 56. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary GLAD and QDA, Another point of view GLAD and QDA, Another point of view Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 19 / 22
  • 57. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Naive Bayes Naive Bayes: An Example Naive Bayes Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 20 / 22
  • 58. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Naive Bayes Naive Bayes: An Example Naive Bayes: An Example Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 21 / 22
  • 59. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Summary Summary Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 22 / 22