Pattern Recognition
Tutorial 4 - Aly Osama
Agenda
1. Short Review
2. Solving Sheets Problems
3. Experiments
Review
Design a Bayesian Decision Classifier?
Requires: class-conditional densities and priors.
In real-life problems we usually have neither, so we must estimate them from data.
Bayes Rule
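For reference (the equation on the original slide did not survive extraction), Bayes' rule for a measurement x and class ω_j is:

```latex
P(\omega_j \mid x) = \frac{p(x \mid \omega_j)\, P(\omega_j)}{p(x)},
\qquad
p(x) = \sum_j p(x \mid \omega_j)\, P(\omega_j)
```

The posterior combines the class-conditional density and the prior, which is exactly what we need to estimate below.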
How to estimate parameters?
MLE vs MAP
1. Maximum Likelihood Estimation
a. Parameters are unknown but fixed
2. Maximum A Posteriori Estimation
a. Parameters are random variables with a prior distribution
1. Maximum Likelihood Estimation
Let's say we have a likelihood function P(X|θ). Then the MLE for θ, the parameter we want to infer, is:

θ_MLE = argmax_θ P(X|θ) = argmax_θ log P(X|θ)

To use this framework, we just need to derive the log-likelihood of our model, then maximize it with respect to θ using our favorite optimization algorithm, such as gradient ascent (or, equivalently, gradient descent on the negative log-likelihood).
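A minimal sketch of this idea (my own example, not from the slides): for i.i.d. Gaussian data the log-likelihood can be maximized in closed form, so no iterative optimizer is needed.

```python
import numpy as np

# MLE for a Gaussian N(mu, sigma^2): the log-likelihood of i.i.d.
# samples is maximized by the sample mean and the (biased, divide-by-n)
# sample variance.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=10_000)  # true mu=5, sigma^2=4

mu_mle = x.mean()                      # argmax over mu
var_mle = ((x - mu_mle) ** 2).mean()   # argmax over sigma^2

print(mu_mle, var_mle)
```

With enough samples both estimates land close to the true parameters; for models without a closed form, the same log-likelihood would instead be handed to a numerical optimizer.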
2. Maximum A Posteriori Estimation
If we replace the likelihood in the MLE formula above with the posterior, we get:

θ_MAP = argmax_θ P(θ|X) = argmax_θ P(X|θ) P(θ) = argmax_θ [log P(X|θ) + log P(θ)]

(the evidence P(X) does not depend on θ, so it drops out of the argmax). Comparing the MLE and MAP equations, the only thing that differs is the inclusion of the prior P(θ) in MAP; otherwise they are identical. This means the likelihood is now weighted by the prior.
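A small worked sketch of this weighting (my own example, not from the slides): estimating a coin's heads probability θ from a Bernoulli likelihood with a conjugate Beta prior, where both MLE and MAP have closed forms.

```python
# MAP vs MLE for a coin's heads probability theta.
# Likelihood: Bernoulli; prior: Beta(a, b), which is conjugate,
# so the MAP estimate is the mode of the Beta posterior.
heads, tails = 3, 1          # tiny sample: the MLE overfits it
a, b = 2.0, 2.0              # Beta(2, 2) prior, peaked at theta = 0.5

theta_mle = heads / (heads + tails)
theta_map = (heads + a - 1) / (heads + tails + a + b - 2)

print(theta_mle, theta_map)  # MAP is pulled toward the prior's peak
```

With only four tosses the MLE says θ = 0.75, while the prior pulls the MAP estimate back toward 0.5, illustrating how the likelihood is "weighted" by P(θ).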
Relation Between MLE and MAP
What we can conclude, then, is that MLE is a special case of MAP in which the prior is uniform!
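To see why: with a uniform prior P(θ) = c, the log-prior is a constant and drops out of the maximization:

```latex
\theta_{\text{MAP}}
= \operatorname*{argmax}_{\theta}\,\bigl[\log P(X \mid \theta) + \log c\bigr]
= \operatorname*{argmax}_{\theta}\,\log P(X \mid \theta)
= \theta_{\text{MLE}}
```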
Problems
Problem 1
Solution 1
Problem 2
Problem 3
Problem 4
Experiments
Assignment Discussion
Assignment 4
Understand and compare your classifier using
● Accuracy
● Precision
● Recall
● F-Score
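As a starting point for the comparison, all four metrics can be computed from the confusion-matrix counts. A minimal sketch for the binary case (the helper name is my own, not part of the assignment):

```python
# Accuracy, precision, recall and F1 for a binary classifier,
# computed from true/predicted label lists (labels are 0 or 1).
def binary_metrics(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = binary_metrics([1, 1, 0, 0, 1, 0],
                                    [1, 0, 0, 1, 1, 0])
```

For multi-class classifiers the same formulas apply per class (one-vs-rest), averaged across classes; library implementations such as `sklearn.metrics` offer the same quantities ready-made.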
