Estimation Theory


Published in: Education, Technology

  1. Estimation Theory
  2. Estimation Theory
     We seek to determine, from a set of data, a set of parameters whose values would yield the highest probability of obtaining the observed data.
     The unknown parameters may be seen as deterministic or as random variables.
     There are essentially two alternatives in the statistical case:
     When no a priori distribution is assumed: Maximum Likelihood.
     When an a priori distribution is known: Bayes.
  3. Maximum Likelihood
     Principle: estimate a parameter such that, for this value, the probability of obtaining the actually observed sample is as large as possible.
     That is, having got the observation, we "look back" and compute the probability that the given sample would be observed, as if the experiment were to be done again.
     This probability depends on a parameter, which is adjusted to give it the maximum possible value.
     Reminds you of politicians observing the movement of the crowd and then moving to the front to lead them?
  4. Estimation Theory
     Let a random variable $X$ have a probability distribution dependent on a parameter $\theta$.
     The parameter $\theta$ lies in a space $\Theta$ of all possible parameters.
     Let $f(x \mid \theta)$ be the probability density function of $X$.
     Assume that the mathematical form of $f$ is known but not $\theta$.
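  5. Estimation Theory
     The joint pdf of the $n$ sample random variables $X_1, \dots, X_n$, evaluated at the sample points $x_1, \dots, x_n$, is given (for independent samples with unknown parameter $\theta$) as
     $$L(\theta) = f(x_1, \dots, x_n \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$$
     The above is known as the likelihood of the sampled observation.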
  6. Estimation Theory
     The likelihood function is a function of the unknown parameter $\theta$ for a fixed set of observations.
     The Maximum Likelihood Principle requires us to select the value of $\theta$ that maximises the likelihood function.
     The parameter $\theta$ may also be regarded as a vector of parameters $\theta = (\theta_1, \dots, \theta_k)$.
  7. Estimation Theory
     It is often more convenient to use the log-likelihood
     $$\Lambda(\theta) = \ln L(\theta)$$
     The maximum is then at
     $$\frac{\partial \Lambda(\theta)}{\partial \theta} = 0$$
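
A minimal sketch of the log-likelihood recipe, assuming an exponential density with unknown rate (a distribution chosen here purely for illustration; it does not appear in the slides). The sample values and the names `log_likelihood`, `rate_closed_form` and `rate_grid` are hypothetical. The closed-form stationary point is compared with a brute-force search over a grid of candidate rates.

```python
import math


def log_likelihood(rate, samples):
    """Log-likelihood of i.i.d. exponential samples for a candidate rate."""
    # ln L(rate) = n * ln(rate) - rate * sum(x_i)
    return len(samples) * math.log(rate) - rate * sum(samples)


samples = [0.5, 1.2, 0.3, 2.0, 1.0]  # hypothetical observations

# Setting d(ln L)/d(rate) = n / rate - sum(x_i) = 0 gives the closed form.
rate_closed_form = len(samples) / sum(samples)

# Brute-force check: evaluate ln L on a grid and keep the maximiser.
grid = [k / 1000 for k in range(1, 5001)]  # candidate rates 0.001 ... 5.000
rate_grid = max(grid, key=lambda r: log_likelihood(r, samples))

print(rate_closed_form, rate_grid)
```

The grid search is only a sanity check; in practice the stationary-point equation is solved analytically where possible, or with a numerical optimiser otherwise.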
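  8. An example
     Let $x_1, \dots, x_n$ be a random sample selected from a normal distribution $N(\mu, \sigma^2)$.
     The joint pdf is
     $$L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)$$
     We wish to find the best $\mu$ and $\sigma^2$.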
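  9. Estimation Theory
     Form the log-likelihood function of the normal sample
     $$\ln L(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$$
     Hence
     $$\frac{\partial \ln L}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0, \qquad \frac{\partial \ln L}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \mu)^2 = 0$$
     or
     $$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\mu})^2$$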
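
The normal-distribution example can be checked numerically. The sketch below assumes a small hypothetical sample; `normal_mle` and `normal_log_likelihood` are illustrative names, not part of the slides. It computes the sample mean and the variance estimate that divides by n, then confirms that nudging either estimate does not increase the log-likelihood.

```python
import math


def normal_mle(samples):
    """Return the ML estimates (mean, variance) for i.i.d. normal samples."""
    n = len(samples)
    mu_hat = sum(samples) / n
    # The ML variance estimate divides by n, not n - 1.
    var_hat = sum((x - mu_hat) ** 2 for x in samples) / n
    return mu_hat, var_hat


def normal_log_likelihood(mu, var, samples):
    """Log-likelihood of i.i.d. normal samples for candidate mu and variance."""
    n = len(samples)
    squared_error = sum((x - mu) ** 2 for x in samples)
    return -0.5 * n * math.log(2 * math.pi * var) - squared_error / (2 * var)


samples = [2.1, 1.9, 2.4, 1.6, 2.0]  # hypothetical observations
mu_hat, var_hat = normal_mle(samples)
best = normal_log_likelihood(mu_hat, var_hat, samples)

# Perturbing either estimate should not improve the log-likelihood.
for d_mu, d_var in [(0.05, 0.0), (-0.05, 0.0), (0.0, 0.01), (0.0, -0.01)]:
    assert normal_log_likelihood(mu_hat + d_mu, var_hat + d_var, samples) <= best

print(mu_hat, var_hat)
```

Note that the ML variance estimate is biased for finite n; the unbiased version divides by n - 1 instead.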
  10. Fisher and Cramer-Rao
      The Fisher information helps in placing a bound on estimators.
      Cramer-Rao Lower Bound: "If $\hat{\theta}$ is any unbiased estimator of $\theta$, then
      $$\operatorname{cov}(\hat{\theta}) \geq J^{-1}(\theta)$$
      where $J(\theta)$ is the Fisher information matrix."
      That is, $J^{-1}(\theta)$ provides a lower bound on the covariance matrix of any unbiased estimator.
  11. Estimation Theory
      It can be seen that if we model the observations as the output of an AR process driven by zero-mean Gaussian noise, then the Maximum Likelihood estimator for the variance is also the Least Squares estimator.
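
A rough illustration of the AR statement, assuming a first-order process x[t] = a * x[t-1] + e[t] with Gaussian e[t] and conditioning on the first sample (the slides do not specify the model order or the initial condition). The function names and parameter values are hypothetical. Under these assumptions, maximising the conditional Gaussian likelihood reduces to minimising the sum of squared residuals, and the noise-variance estimate is the mean squared residual.

```python
import random


def simulate_ar1(a, sigma, n, seed=0):
    """Simulate n samples of x[t] = a * x[t-1] + e[t], with e[t] ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(a * x[-1] + rng.gauss(0.0, sigma))
    return x


def fit_ar1_least_squares(x):
    """Least-squares fit of the AR(1) coefficient and the noise variance."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    a_hat = num / den
    residuals = [x[t] - a_hat * x[t - 1] for t in range(1, len(x))]
    # Mean squared residual: the conditional ML estimate of the noise variance.
    var_hat = sum(e * e for e in residuals) / len(residuals)
    return a_hat, var_hat


x = simulate_ar1(a=0.6, sigma=1.0, n=5000)
a_hat, var_hat = fit_ar1_least_squares(x)
print(a_hat, var_hat)
```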
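  12. The Cramer-Rao Lower Bound
      This is an important theorem which establishes that no unbiased estimator can outperform an ML estimate that attains the bound. The Cramer-Rao lower bound is the smallest theoretical variance which an unbiased estimator can achieve. ML attains it whenever an efficient estimator exists, and approaches it asymptotically otherwise (under the usual regularity conditions), so any other unbiased estimation technique can at best only equal it.
      $$\operatorname{var}(\hat{\theta}) \geq \frac{1}{J(\theta)}, \qquad J(\theta) = -E\left[\frac{\partial^2 \ln L(\theta)}{\partial \theta^2}\right]$$
      This is the Cramer-Rao inequality.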
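
A small Monte Carlo sketch of the bound, assuming the simplest case: estimating the mean of a normal distribution with known standard deviation sigma from n samples. In that case the Fisher information is n / sigma^2, so the bound on the variance is sigma^2 / n, and the sample mean (the ML estimate) attains it. The function name, seed and parameter values are hypothetical.

```python
import random


def sample_mean_variance(n, sigma, trials, seed=0):
    """Empirical variance of the sample mean over repeated experiments."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        draws = [rng.gauss(0.0, sigma) for _ in range(n)]
        means.append(sum(draws) / n)
    grand_mean = sum(means) / trials
    return sum((m - grand_mean) ** 2 for m in means) / trials


n, sigma = 10, 2.0
crb = sigma ** 2 / n  # inverse Fisher information for the mean, sigma known
empirical = sample_mean_variance(n, sigma, trials=20000)
print(crb, empirical)
```

The empirical variance should sit close to the bound, up to Monte Carlo error; an unbiased estimator that is less efficient, such as the sample median, would land above it.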
  13. The Cramer-Rao Lower Bound
      • CRB definition: the inverse of the Fisher matrix, which gives the lowest possible variance.
      • Purpose of CRB analysis:
        • to indicate the performance bounds of a particular problem;
        • to facilitate analysis of the factors that impact most on the performance of an algorithm.
      • Fisher matrix of our Energy Decay Model
  14. The Cramer-Rao Lower Bound
      • A different way of looking at information for continuous functions r(s).
      • Intuition: we can discriminate values of s better if r(s) is changing rapidly. If r(s) does not change much with s, then we don't learn much about s.
  15. The Cramer-Rao Lower Bound
      • How much does r tell us about s?
      • Fisher information is high when there is:
        • high negative curvature of ln p(r|s) at s for all r (on average);
        • rapid change in p(r|s) at s for all r (on average).
      • This implies that it is easy to discriminate different values of s.
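
The "curvature" and "rapid change" statements correspond to the two standard forms of the Fisher information, which agree under the usual regularity conditions. As a worked illustration, assume that the response is a mean response $\bar{r}(s)$ plus zero-mean Gaussian noise of variance $\sigma^2$; this noise model and the symbols $\bar{r}$ and $\sigma$ are assumptions made here, not taken from the slides.

```latex
\begin{align}
I(s) &= -E\left[\frac{\partial^2 \ln p(r \mid s)}{\partial s^2}\right]
      = E\left[\left(\frac{\partial \ln p(r \mid s)}{\partial s}\right)^2\right] \\
\ln p(r \mid s) &= -\frac{\left(r - \bar{r}(s)\right)^2}{2\sigma^2} + \text{const}
      && \text{(assumed Gaussian noise model)} \\
\frac{\partial \ln p(r \mid s)}{\partial s} &= \frac{\left(r - \bar{r}(s)\right)\,\bar{r}'(s)}{\sigma^2} \\
I(s) &= \frac{\bar{r}'(s)^2}{\sigma^2}
      && \text{since } E\left[\left(r - \bar{r}(s)\right)^2\right] = \sigma^2
\end{align}
```

Under this assumption the information grows with the squared slope of the mean response, which matches the intuition that s is easier to discriminate where the response changes rapidly.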
  16. The Cramer-Rao Lower Bound
      • Applies to unbiased estimators, for which $E[\hat{s}] = s$.
      • The best unbiased estimator is only so good.
      • The best unbiased estimator, when it attains the bound, is the ML estimate.
      • Shows a relation between the variance of estimators and an information measure.