Stat310            Hypothesis testing


                             Hadley Wickham
Wednesday, 21 April 2010
                1. Quick review
                2. More about the interval on the
                   sampling distribution
                3. More complicated example
                4. Introduction to hypothesis testing
                   (aka the statistical justice system)



Review
                    Identify distribution that connects
                    estimator and true value.
                    Form confidence interval for
                    known (sampling) distribution.
                    Write as probability statement.
                    Back transform.
                    Write as interval.


Steps
                    Want P(a < Q < b) = 1 - α, and b - a to be
                    as small as possible.
                    If Q is symmetric about 0 (and unimodal), take
                    b = -a, so P(a < Q < -a) = 1 - α. Then
                    a = F^{-1}(α/2), and there is no shorter
                    interval.
                    If Q isn’t symmetric, pick a = F^{-1}(α/2),
                    b = F^{-1}(1 - α/2), but there might be a
                    shorter interval.
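
The remark that an equal-tailed interval need not be the shortest can be checked numerically. The sketch below assumes, purely for illustration, that Q follows an Exponential(1) distribution (not a distribution used in these slides), chosen because its quantile function F^{-1}(p) = -ln(1 - p) has a closed form.

```python
import math

def exp_quantile(p):
    """Quantile function F^{-1}(p) of Exponential(1); an illustrative assumption."""
    return -math.log(1.0 - p)

def length(interval):
    """Width b - a of an interval (a, b)."""
    return interval[1] - interval[0]

alpha = 0.10

# Equal-tailed choice: a = F^{-1}(alpha/2), b = F^{-1}(1 - alpha/2)
equal_tailed = (exp_quantile(alpha / 2), exp_quantile(1 - alpha / 2))

# The Exponential density is decreasing, so an interval with the same
# coverage that starts at 0 puts all of alpha in the right tail.
left_anchored = (0.0, exp_quantile(1 - alpha))

print("equal-tailed :", equal_tailed, "length", length(equal_tailed))
print("left-anchored:", left_anchored, "length", length(left_anchored))
```

Both intervals cover Q with probability 0.90, but the one starting at 0 is shorter, which is the situation the last bullet warns about.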


Example


                    If we want a 90% confidence interval, then
                    two possible ends for the interval are
                    F^{-1}(0.05) and F^{-1}(0.95).
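
As a concrete instance, suppose the sampling distribution is standard normal. The slide leaves F unspecified, so this choice is an assumption made only for illustration. Python's standard-library `statistics.NormalDist` then gives the two ends directly.

```python
from statistics import NormalDist

alpha = 0.10
F = NormalDist(mu=0.0, sigma=1.0)  # assumed sampling distribution

lower = F.inv_cdf(alpha / 2)       # F^{-1}(0.05)
upper = F.inv_cdf(1 - alpha / 2)   # F^{-1}(0.95)

# Check that the interval really has probability 1 - alpha
coverage = F.cdf(upper) - F.cdf(lower)
print(lower, upper, coverage)
```

Because the normal distribution is symmetric about 0, the two ends differ only in sign.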




Variance
                    X_i iid Normal(5, σ^2), i = 1, ..., 10
                    I ran the experiment and recorded the following
                    results: 0.15 1.48 1.25 2.47 1.09 1.95 1.46 1.49
                    2.81 1.96 (var = 0.55)
                    Find a 95% confidence interval for the variance.
                    (Hint: If X ~ χ^2(9), F_X(19) = 0.975 and
                    F_X(2.7) = 0.025).
                    EC: Can you also make a confidence interval for
                    the standard deviation?
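
One way to work the exercise is sketched below. It follows the hint's χ^2(9) pivot, (n - 1) S^2 / σ^2, so it uses the sample variance with n - 1 = 9 degrees of freedom rather than the stated mean of 5. That reading of the exercise is an assumption, and the two quantiles are the rounded values from the hint.

```python
import math
import statistics

data = [0.15, 1.48, 1.25, 2.47, 1.09, 1.95, 1.46, 1.49, 2.81, 1.96]
n = len(data)
s2 = statistics.variance(data)  # sample variance, divisor n - 1

# Quantiles of chi-square(9), taken from the hint on the slide
chi2_lo, chi2_hi = 2.7, 19.0

# P(chi2_lo < (n - 1) S^2 / sigma^2 < chi2_hi) = 0.95; back-transform for sigma^2
var_ci = ((n - 1) * s2 / chi2_hi, (n - 1) * s2 / chi2_lo)

# The square root is monotone, so it carries the interval over to sigma
sd_ci = (math.sqrt(var_ci[0]), math.sqrt(var_ci[1]))

print("sample variance:", s2)
print("95% CI for the variance:", var_ci)
print("95% CI for the standard deviation:", sd_ci)
```

Note that the larger quantile ends up in the lower limit, because σ^2 sits in the denominator of the pivot.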


Rest of chapter

                    The rest of chapter 6 just gives examples
                    of the general method we learned this
                    week. Any of it is fair game in the final.
                    But don’t memorise specifics: learn the
                    general steps!




So far

                    Given X_i iid SomeDist(a, b), and data, we
                    can give estimates (including uncertainty)
                    for a and b.
                    Next: We’ll learn how to answer questions
                    like "Is a > 0?" or "Is b in [5, 10]?"




Your turn
                    The following values were generated from
                    iid Normal(μ, 1):
                    2.9 2.1 3.0 3.2 1.2 3.0 3.3 1.2 2.3 1.5
                    (mean: 2.37)
                    Is it possible they came from a normal
                    distribution with mean 1.5?
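
With the tools covered so far, one way to approach this is a confidence interval for μ. The sketch below assumes a 95% level (the slide does not fix one) and computes the sample mean directly from the ten listed values, which gives 2.37.

```python
import math
from statistics import NormalDist, fmean

data = [2.9, 2.1, 3.0, 3.2, 1.2, 3.0, 3.3, 1.2, 2.3, 1.5]
sigma = 1.0                      # known standard deviation from Normal(mu, 1)
n = len(data)
xbar = fmean(data)

# (Xbar - mu) / (sigma / sqrt(n)) ~ Normal(0, 1); back-transform for mu
z = NormalDist().inv_cdf(0.975)  # assumed 95% level
half_width = z * sigma / math.sqrt(n)
ci = (xbar - half_width, xbar + half_width)

contains_1_5 = ci[0] <= 1.5 <= ci[1]
print("mean:", xbar, "95% CI:", ci, "contains 1.5:", contains_1_5)
```

Under these assumptions 1.5 falls outside the interval, which motivates a more formal way of asking the question.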



Hypothesis testing
 The statistical justice system




http://www.flickr.com/photos/joegratz/117048243
        A suspect is accused of a crime. The suspect is
        declared guilty or innocent based on a trial. Each
        trial has a defence and a prosecution. On the
        basis of how evidence compares to a standard,
        the judge makes a decision to convict or acquit.




        A dataset is accused of having a particular
        parameter value. The data is declared guilty or
        innocent based on the results of a statistical test.
        Each test has a null hypothesis and an
        alternative hypothesis. On the basis of how a
        test statistic compares to a standard
        distribution, we make the decision to reject the
        null or fail to reject the null hypothesis.



Your turn

                    With a partner, write down the criminal
                    equivalents to the following terms:
                    null hypothesis, alternative hypothesis,
                    test statistic, test distribution, reject the
                    null hypothesis, fail to reject the null
                    hypothesis



Hypotheses
                    Every case has two sides:
                    The null hypothesis (the defence). This is
                    the default, or the status quo, what you
                    need to argue against.
                    The alternative hypothesis (the
                    prosecution). This is the interesting case,
                    but you need to prove it’s true.


                    In the statistical justice system, evidence is
                    based on the similarity between the accused
                    and known innocents.
                    The population of innocents, called the null
                    distribution, is generated by the combination
                    of null hypothesis and test statistic.
                    To determine the guilt of the accused, we
                    compute the proportion of innocents who look
                    at least as guilty as the accused. This is the
                    p-value: the probability that the accused would
                    look at least this guilty if they actually were
                    innocent.
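
The "proportion of innocents" idea can be simulated directly. The sketch below uses a hypothetical coin example that is not in the slides: the null hypothesis is that a coin is fair, the test statistic is the number of heads in 10 flips, and the accused is an observed count of 9 heads. The helper `p_value` is an illustrative name, not a library function.

```python
import random
from math import comb

def p_value(accused, innocents):
    """Proportion of innocents that look at least as guilty as the accused."""
    return sum(x >= accused for x in innocents) / len(innocents)

rng = random.Random(310)  # fixed seed so the run is reproducible

# Population of innocents: the test statistic computed on data
# generated under the null hypothesis (a fair coin, hypothetical).
innocents = [sum(rng.random() < 0.5 for _ in range(10)) for _ in range(20_000)]

accused = 9  # hypothetical observed number of heads
simulated = p_value(accused, innocents)

# Exact answer from the Binomial(10, 0.5) null distribution, for comparison
exact = sum(comb(10, k) for k in range(accused, 11)) / 2**10

print("simulated p-value:", simulated)
print("exact p-value:    ", exact)
```

The simulated proportion approximates the exact tail probability, and the approximation improves as the population of simulated innocents grows.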


The lady tasting tea
                    A thought experiment by R. A. Fisher
                    (famous early statistician, 1890-1962)
                    A lady at a tea party claims that she can
                    tell the difference between putting the
                    milk in first and second.
                    How can we be sure?



Experiment
                    8 cups. 4 milk first, 4 milk second.
                    Presented in random order.
                    What is the null hypothesis?
                    (What is the position of the defence?)
                    What test statistic might we use?
                    (What can we measure to assess evidence?)
                    What is the null distribution?
                    (What do innocents look like?)


                    Right    Wrong      #      %
                      4        0        1     1%
                      3        1       16    23%
                      2        2       36    51%
                      1        3       16    23%
                      0        4        1     1%
                    Total              70   100%

http://www.uwsp.edu/education/wkirby/t&m/5LadyTastingTea.htm
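
The counts in this table can be reproduced by counting selections. The sketch assumes the usual setup for this experiment: the lady knows that four cups are milk-first and must pick four, and under the null hypothesis every choice of four cups is equally likely. "Right" is then the number of milk-first cups among her picks.

```python
from math import comb

cups_each = 4                                # 4 milk-first, 4 milk-second
total_ways = comb(2 * cups_each, cups_each)  # ways to pick 4 cups out of 8

null_dist = {}
for right in range(cups_each, -1, -1):
    # Choose `right` of the true milk-first cups and fill the remaining
    # picks with milk-second cups.
    ways = comb(cups_each, right) * comb(cups_each, cups_each - right)
    null_dist[right] = ways
    print(right, cups_each - right, ways, f"{ways / total_ways:.1%}")

# p-value if she gets all four right: only one selection is that good
p_all_right = null_dist[4] / total_ways
print("P(4 right | guessing) =", p_all_right)
```

The unrounded percentages explain why the rounded column adds to 99% rather than 100%.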
                1. Write down null and alternative hypotheses
                   (positions of defence and prosecution)
                2. Figure out good test statistic (what numeric
                   summary captures useful evidence)
                3. Work out null distribution
                   (distribution of innocents)
                4. Calculate p-value by comparing actual
                   value to null distribution (what proportion of
                   true innocents look at least as guilty as the
                   suspect)


Your turn

                    The following values were generated from
                    iid Normal(μ, 1):
                    2.9 2.1 3.0 3.2 1.2 3.0 3.3 1.2 2.3 1.5
                    (mean: 2.37)
                    Does μ = 1.5?
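
The four steps can be applied to this exercise as a z-test. The sketch below assumes a two-sided alternative (μ ≠ 1.5), which the slide does not specify, and computes the sample mean directly from the ten listed values.

```python
import math
from statistics import NormalDist, fmean

data = [2.9, 2.1, 3.0, 3.2, 1.2, 3.0, 3.3, 1.2, 2.3, 1.5]
sigma = 1.0   # known standard deviation from Normal(mu, 1)
mu0 = 1.5     # value claimed by the null hypothesis

# 1. Hypotheses: H0: mu = 1.5 (defence), H1: mu != 1.5 (prosecution, assumed two-sided)
# 2. Test statistic: standardised distance of the sample mean from mu0
z = (fmean(data) - mu0) / (sigma / math.sqrt(len(data)))

# 3. Null distribution: under H0 the statistic is Normal(0, 1)
null = NormalDist()

# 4. p-value: proportion of innocents at least as far from 0 as the accused
p = 2 * (1 - null.cdf(abs(z)))

print("z =", z, "p-value =", p)
```

A small p-value means few innocents look this guilty, so the data give evidence against μ = 1.5.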



Reading
                    Make sure you are familiar with the
                    similarities and differences between the
                    criminal and statistical justice systems.
                    Read Sections 7.1 and 7.41 for a classical
                    take on the issue. Can you give the errors
                    names to match the statistical justice
                    system?

