Stat310 Inference
Hadley Wickham
Tuesday, 31 March 2009
1. Homework / Take home exam
2. Recap
3. Data vs. distributions
4. Estimation
   1. Maximum likelihood
   2. Method of moments
5. Feedback

Assessment
Short homework this week (but you have to do some reading).
The take-home test will be available online next Thursday.
Both the take-home test and the homework are due in class on Thursday, April 9.
Study guide will be posted ASAP.


Recap

                   What are the 5 parameters of the bivariate
                   normal?
                   If X and Y are bivariate normal, and their
                   correlation is zero, what does that imply
                   about X and Y? Is that usually true?




Data vs. Distributions
                   Random experiments produce data.
                   A repeatable random experiment has
                   some underlying distribution.
                   We want to go from the data to say
                   something about the underlying
                   distribution.



Coin tossing
                   Half the class generates 100 heads and tails
                   by flipping coins.
                   The other half generates 100 heads and tails
                   just by writing down what they think the
                   sequence would be.
                   Write up on the board.
                   I’ll come in and guess which group was
                   which.


Problem

We have some data and a probability model with unknown parameters.
We want to estimate the values of those parameters.



Some definitions
Parameter space: the set of all possible parameter values
Estimator: a process/function which takes data and gives a best guess for the parameter (there are usually many possible estimators for a problem)
Point estimate: a single-value estimate of the parameter


Example

                   Data: 5.7 3.0 5.7 4.5 6.0 6.3 4.9 5.8 4.4 5.8
                   Model: Normal(?, 1)


                   What is the mean of the underlying
                   distribution? (5.2?)



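As a quick illustration (my code, not from the slides), a minimal Python sketch of the obvious point estimate here, the sample mean:

data = [5.7, 3.0, 5.7, 4.5, 6.0, 6.3, 4.9, 5.8, 4.4, 5.8]
xbar = sum(data) / len(data)  # point estimate of the unknown Normal mean
print(xbar)                   # 5.21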
Uncertainty

                   Also want to be able to quantify how
                   certain/confident we are in our answer.
                   How close is our estimate to the true
                   mean?




Simulation
                   One approach to find the answer is to use
                   simulation, i.e., set up a case where we
                   know what the true answer is and see
                   what happens.
                   X ~ Normal(5, 1)
                   Draw 10 numbers from this distribution
                   and calculate their average.


Six simulated samples of 10 draws each (the last number in each row is that sample's average):
3.1 3.4 5.1 4.9 2.2 4.4 4.2 3.9 5.6 4.9 | 4.2
5.9 2.8 6.0 5.1 2.7 6.5 4.2 4.9 4.6 4.4 | 4.7
5.0 5.3 5.3 5.1 5.4 4.7 4.7 4.4 5.9 4.2 | 5.0
4.3 5.4 5.5 4.9 3.1 4.1 4.8 3.6 6.8 5.5 | 4.8
3.8 6.1 3.8 5.2 5.7 5.2 3.2 5.2 5.3 2.3 | 4.6
5.6 6.0 5.5 5.5 5.1 7.3 5.4 6.1 4.4 4.9 | 5.6




Repeat 1000 times
[Figure: histogram (count) of the 1000 sample means (samp), ranging from about 4.0 to 6.0; 95% of values lie between 4.5 and 5.6]
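A minimal simulation sketch (mine, not from the slides) using numpy; it repeats the experiment described above, assuming the true distribution is Normal(5, 1), and checks the central 95% range of the sample means:

import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

# 1000 samples of size 10 from Normal(5, 1); average each sample
means = rng.normal(loc=5, scale=1, size=(1000, 10)).mean(axis=1)

# Empirical central 95% interval for the sample mean
print(np.quantile(means, [0.025, 0.975]))  # roughly 4.4 to 5.6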
Theory

                   From Tuesday, we know what the
                   distribution of the average is. Write it
                   down.
                   Create a 95% confidence interval.
                   How does it compare to the simulation?



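As a worked check (my numbers, not spelled out on the slide): the average of 10 independent Normal(5, 1) draws is itself normal,

\bar{X} \sim \mathrm{Normal}(5,\ 1/10), \qquad 5 \pm 1.96\sqrt{1/10} \approx (4.38,\ 5.62),

which is close to the 4.5 to 5.6 interval seen in the simulation.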
Why the mean?

                   Why is the mean of the data a good
                   estimate of μ? Are there other estimators
                   that might be as good or better?
                   In general, how can we figure out an
                   estimator for a parameter of a
                   distribution?



Maximum likelihood
                   Method of moments



Maximum likelihood

Write down the log-likelihood (i.e., given this data, how likely is it to have been generated by this parameter value?)
Find the maximum (i.e., differentiate and set to zero)




Example
                   X ~ Binomial(10, p?)
                   Here is some data drawn from that
                   random experiment: 4 5 1 5 3 2 4 2 2 4
                   We know the joint pdf because they are
                   independent. Can try out various values
                   of p and see which is most likely



Your turn

                   Write down the joint pdf for X1, X2, …, Xn
                   ~ Binomial(n, p)


                   Try evaluating it for x = (4 5 1 5 3 2 4 2 2
                   4), n = 10, p = 0.1



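A minimal sketch (mine) of that evaluation in Python with scipy; joint_pmf is a name I've made up, and n_trials = 10 is the number of trials in each Binomial observation (distinct from the 10 observations):

import numpy as np
from scipy.stats import binom

x = np.array([4, 5, 1, 5, 3, 2, 4, 2, 2, 4])

def joint_pmf(p, n_trials=10):
    # Independence: multiply the individual Binomial(n_trials, p) pmfs
    return np.prod(binom.pmf(x, n_trials, p))

print(joint_pmf(0.1))   # tiny: p = 0.1 makes this data very unlikely
print(joint_pmf(0.32))  # much larger: close to the maximum likelihood estimate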
Try 10 different values of p
[Figure: the joint probability (prob) evaluated at 10 values of p between 0 and 1; one value stands out as far more likely than the rest]
Try 100 different values of p
[Figure: the joint probability (prob) evaluated at 100 values of p between 0 and 1; the curve peaks near 0.3. True p is 0.3]
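A sketch of that grid search (again mine): evaluate the joint pmf over a grid of p values and keep the most likely one.

import numpy as np
from scipy.stats import binom

x = np.array([4, 5, 1, 5, 3, 2, 4, 2, 2, 4])
ps = np.linspace(0.01, 0.99, 100)

# Joint pmf of the 10 independent Binomial(10, p) observations, for each candidate p
probs = np.array([np.prod(binom.pmf(x, 10, p)) for p in ps])
print(ps[probs.argmax()])  # about 0.32, matching the analytic answer below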
Calculus
                   Can do the same analytically with calculus.
                   Want to find the maximum of the pdf with
                   respect to p. (How do we do this?)
                   Normally call this the likelihood when
                   we’re thinking of the x’s being fixed and
                   the parameters varying.
                   Usually easier to work with the log pdf
                   (why?)


Steps

Write out the log-likelihood
(Discard constants)
Differentiate and set to 0
(Check the second derivative is negative, confirming a maximum)




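A worked version of those steps for the binomial example (my derivation; the slides only state the result). Write m = 10 for the number of observations and n = 10 for the number of trials; the binomial coefficients are the constants that get discarded:

\ell(p) = \sum_{i=1}^{m}\Big[\log\binom{n}{x_i} + x_i\log p + (n-x_i)\log(1-p)\Big]

\ell'(p) = \frac{\sum x_i}{p} - \frac{mn - \sum x_i}{1-p} = 0
\;\Rightarrow\; \hat{p} = \frac{\sum x_i}{mn} = \frac{\bar{x}}{n} = \frac{3.2}{10} = 0.32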
Analytically

                   Mean of x’s is 3.2
                   n = 10
                   Maximum likelihood estimate of p for this
                   example is 0.32




Method of moments
                   We know how to calculate sample
                   moments (e.g. mean and variance of data)
                   We know what the moments of the
                   distribution are in terms of the
                   parameters.
                   Why not just match them up?



Binomial
E(X) = np, Var(X) = np(1-p)
p = mean / n = 3.2 / 10 = 0.32
p(1-p) = var / n = 2 / 10 = 0.2
-p² + p - 0.2 = 0
p = 0.276 or 0.724
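A small sketch (mine) of both moment matches in Python; x and n are as in the maximum-likelihood example, and the quadratic comes from the variance match:

import numpy as np

x = np.array([4, 5, 1, 5, 3, 2, 4, 2, 2, 4])
n = 10  # trials per Binomial observation

# First moment: E(X) = np
print(x.mean() / n)  # 0.32

# Second moment: Var(X) = np(1 - p), i.e. p^2 - p + var/n = 0
var = x.var(ddof=1)                # sample variance, about 1.96
print(np.roots([1, -1, var / n]))  # about 0.73 and 0.27 (with var rounded to 2, as above, you get 0.724 and 0.276)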
Your turn

                   What are the method of moments
                   estimators for the mean and variance of
                   the normal distribution?
                   What about the gamma distribution?




Feedback



