Stats chapter 10
Presentation Transcript

  • Chapter 10
    Estimating with Confidence
  • 10.1 Confidence Intervals: The Basics
  • Definitions
    Statistical Inference
    The method of drawing conclusions about a population based on a sample
    The last major topic of this statistics course
    The mindset: we are looking at data that comes from a random sample (or an experiment) and inferring characteristics of the population
  • The Basic Plan
    In the absence of other data, the sample estimate is the estimate of the parameter
    We would like an interval estimation
    State the parameter that is being estimated
    Check to see if the data can be Normalized
    Compute the interval using area under the Normal curve
    Write a nice conclusion
  • What does an interval estimation look like?
    EX1 We are 95% confident the mean is in the interval (9.11, 12.05)
    Confidence Level C: 95%
    Confidence Interval (CI): (9.11, 12.05)
    Lower Bound: 9.11
    Upper Bound: 12.05
    EX2 We are 90% confident that the proportion that support this law is 0.74 ± 0.08
    Confidence Level C: 90%
    Confidence Interval (CI): 0.74 ± 0.08
    Point Estimate: 0.74
    Margin of Error (ME): 0.08
  • Confidence Level
    The value of the parameter is fixed before we start our sampling
    Either the parameter is ‘in’ our interval or it’s not.
    There is no probability involved here
    FAIL: “There is a 95% probability the parameter is in the interval”
  • Confidence Level
    Remember that our sample is just one of many samples that could have been taken
    If good sampling technique is used, 95% (for example) of the samples would produce an interval that contains the parameter
    We just don’t know if our sample is part of the 95% or the 5%
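The "95% of the samples" idea can be checked with a small simulation. The sketch below is only an illustration: it assumes a Normal population whose true mean and standard deviation are set to 5.38 and 0.74 (values borrowed from the shark example later in the chapter and treated here as if they were the true parameters), draws many samples of size 35, and counts how often the z-interval built from each sample contains the true mean. The function name `coverage` and the seed are hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(mu=5.38, sigma=0.74, n=35, conf_level=0.95, trials=2000, seed=1):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    # Critical value z*: area between -z* and +z* equals the confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    me = z_star * sigma / sqrt(n)  # margin of error is the same for every sample
    hits = 0
    for _ in range(trials):
        # One sample of size n from the assumed Normal population.
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - me <= mu <= x_bar + me:
            hits += 1
    return hits / trials

rate = coverage()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95. Any single interval either contains the mean or it does not; the 95% describes the method over many samples.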
  • Confidence Level
    Interpretation of “90% Confidence Level”
    “We are 90% confident that our CI contains the value of the parameter”
    “90% of CI’s computed with this technique will contain the parameter”
    WRONG INTERPRETATION: “There is a 90% probability our CI contains the parameter”
  • PANIC
    When constructing Confidence Intervals for the AP test, there is a definite checklist that readers look for
    Use the acronym P.A.N.I.C. to help memorize the steps to construct a confidence interval
    P = state the Parameter
    A = check that Assumptions are satisfied
    N = state the Name of the interval
    I = compute numerical values of the Interval
    C = write a Conclusion for your findings
  • PANIC for Means
    We are going to start with Means, we’ll save proportions for later
    Parameter: define what μ measures; define what x-bar measures
    “μ = mean length of all Great White Sharks’ flukes
    x-bar = mean length of the Great White Shark flukes in our sample of 35 sharks”
  • Assumptions for CI’s of Mean
    SRS: the data must have come from an SRS
    Independence: the population size must be more than 10 times the sample size, N > 10n
    This is a condition that must be met since we are sampling without replacement and our formula for std dev needs to hold true
    Normality: the sampling distribution is approximately Normal
  • More on Normality
    We are looking for justification that the sampling distribution is Normal
    If the population is Normal, then the sampling distribution is also Normal
    If n > 30 and the sample data is not given, state “The Central Limit Theorem guarantees that the sampling distribution is approximately Normal”
    For smaller samples, check the following:
    (1) the Histogram is single peak, symmetric with no outliers
    (2) the Normal Probability Plot is approx Linear
  • Name of the Interval
    Currently, the only Interval we will worry about is “z-interval for sample means”
    This will always be the name of the distribution used
  • Confidence Interval Computation
    CI = x-bar ± z* · (σ/√n)
    Margin of Error: ME = z* · (σ/√n)
    Standard Error: SE = σ/√n, always the std dev of the sampling distribution
    Critical Value: z*, from the Normal (z) distribution
  • Critical Value
    The area between –z* and +z* = Confidence Level
    Because they are used frequently, there is a shorthand method in table C
    The last row gives different confidence levels
    Keep critical values to 3 decimal places
    The row above the CL row has the z*
    In the AP test, they will call this z
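As a cross-check on the table lookup, z* can also be computed from the rule that the area between –z* and +z* equals the confidence level. This is a minimal sketch using Python's standard library; the helper name `z_star` is a hypothetical choice, not something from the slides.

```python
from statistics import NormalDist

def z_star(conf_level):
    """Critical value with central area conf_level under the standard Normal curve."""
    tail = (1 - conf_level) / 2          # area in each tail
    return -NormalDist().inv_cdf(tail)   # inv_cdf(tail) is -z*, so negate it

for conf_level in (0.80, 0.90, 0.95, 0.99):
    print(f"{conf_level:.0%} CL: z* = {z_star(conf_level):.3f}")
```

Rounded to 3 decimal places, the values should match the z* row of Table C (for example 1.960 at 95%).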
  • A General z* curve
  • z* for an 80% CL
  • Computing a CI
    Let’s compute the 95% CI for a sample of 35, with a mean of 5.38 and a population std dev =0.74
    (1) locate the critical value
    z* = 1.960
    (2) Compute S.E. and M.E.
    SE = 0.74/(35) = 0.1251ME = (1.960)(0.1251) = 0.2452
  • Computing a CI
    State the CI = 5.38 ± 0.25
    Interval estimate:(5.13, 5.63)
    Notice that the point estimate is the average of the upper and lower bounds
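The same computation can be sketched in a few lines of Python. The numbers (n = 35, x-bar = 5.38, σ = 0.74, 95% confidence) come from the example above; the function name `z_interval` and its return order are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, conf_level):
    """z-interval for a mean when the population std dev sigma is known."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # critical value
    se = sigma / sqrt(n)                                     # standard error
    me = z_star * se                                         # margin of error
    return x_bar - me, x_bar + me, se, me

lower, upper, se, me = z_interval(x_bar=5.38, sigma=0.74, n=35, conf_level=0.95)
print(f"SE = {se:.4f}, ME = {me:.4f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Averaging `lower` and `upper` gives back the point estimate 5.38, as noted above.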
  • Conclusions
    We are 95% confident that the mean length of a great white shark fluke is 5.38 ± 0.25 ft.
    OR
    We are 95% confident that the mean length of all great white shark flukes is in the interval (5.13 ft, 5.63 ft).
  • Calculators TI83/TI84
    Your TI is very efficient for finding these intervals! This doesn’t excuse you from the mathematics, of course.
    [stat] -> “TESTS” -> “Zinterval”
    Inpt: “Stats” (if your data is in L1, you can use “Data”)
    Enter values for σ, x-bar, n, and C-Level.
    Voilà!
  • Calculators TI89
    Run the “Stats/List Editor” APP
    [2nd] -> [F2] (F7) -> “Zinterval”
    Input Method = “Stats”
    Choose “Data” if all the observations are in a List
    Enter values for σ, x-bar, n, and C-Level.
    Voilà!
  • Behavior of Margin of Error
    ME = z* · (σ/√n)
    In practice, we would like to minimize ME.
    (1) decrease z*: this also means decreasing our CL!
    (2) increase n: this is usually a trade-off, since obtaining large samples could be more expensive/time consuming
    (3) decrease σ: not really an option. σ is a known quantity.
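To see how ME = z* · (σ/√n) responds to the first two choices, the sketch below tabulates the margin of error for two confidence levels and two sample sizes. It reuses σ = 0.74 from the shark example; the sample size 140 (four times 35) is an assumed value chosen to show that quadrupling n halves ME.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, conf_level):
    """ME = z* * sigma / sqrt(n) for a z-interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return z_star * sigma / sqrt(n)

for n in (35, 140):
    for conf_level in (0.90, 0.95):
        me = margin_of_error(0.74, n, conf_level)
        print(f"n = {n:3d}, CL = {conf_level:.0%}: ME = {me:.4f}")
```

Lowering the confidence level shrinks ME at the cost of confidence, while a larger n shrinks it at the cost of more data collection.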
  • Sample Size
    By using algebra, we can get a formula to compute minimum sample size for a given ME: n ≥ (z* · σ / ME)²
    You should always round the sample size up in this calculation
    Note: ME is produced by sampling variability; it has nothing to do with “sloppy work”
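Solving ME = z* · (σ/√n) for n gives n ≥ (z* · σ / ME)², rounded up. A minimal sketch, assuming σ = 0.74 from the shark example and a hypothetical target margin of error of 0.20 at 95% confidence:

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(sigma, me, conf_level):
    """Smallest n whose z-interval margin of error is at most me."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return ceil((z_star * sigma / me) ** 2)  # always round up

n_needed = min_sample_size(sigma=0.74, me=0.20, conf_level=0.95)
print(f"Minimum sample size: {n_needed}")
```

Rounding up matters here: the unrounded value is about 52.6, and a sample of 52 would leave the margin of error slightly above the target.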
  • 10.2 Estimating a Population Mean
  • Student’s t-distribution
    The confidence intervals computed in the previous section assumed that we knew σ.
    It doesn’t seem likely that we would know σ and not know the value of μ
    When σ is unknown, we can no longer use the Normal distribution
  • Student’s t-distribution
    The t-distribution is to be used when σ is unknown.
    The t-distribution is very similar to the Normal distribution with a key difference
    The shape of the distribution changes based upon the sample size/degrees of freedom (df)
    Large samples have a tall peak and thinner tails
    Small samples have a small peak and thicker tails.
  • Student’s t-distribution
    Degrees of Freedom = n -1
  • Using the t-distribution
    The CI is found using PANIC
    The Assumptions are the same as a z-interval
    Since sample sizes tend to be small, you will most likely need to check the histogram (symmetric, no outliers) and Normal prob plot (approx linear)
    We cannot use the t-distribution when there are outliers!
    The Name of the interval is “1-sample t-interval for mean”
  • Using the t-distribution
    Interval is computed with x-bar ± t* · (s/√n)
    df = n – 1
    t* is from table C or your calculator
    s is the sample’s standard deviation
    Uses the sample std dev to approximate σ
  • Upper Tail Area
    This is the Upper Tail Area
  • Using the t-distribution
    Using TI84
    [2nd] -> [vars](DIST) -> “invT”
    “invT(1 − Upper Tail Area, df)”
    Where “Upper Tail Area” = (1 − CL)/2
    The input 1 − Upper Tail Area is the area to the left of the right critical value.
    ALTERNATIVELY, you may use “−invT(Upper Tail Area, df)”
    Don’t forget the negative!
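For readers without a calculator at hand, the invT lookup can be imitated numerically. Python's standard library has no inverse t function, so the sketch below is a rough stand-in rather than the calculator's actual algorithm: it integrates the t density with Simpson's rule and inverts the result by bisection. The names `t_pdf`, `t_cdf`, and `inv_t` are hypothetical, and the search range of ±1000 is an assumption that covers ordinary table values.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df):
    """Area to the left of x: Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    steps = 2 * max(1, math.ceil(a / 0.02))  # even step count, width about 0.01
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def inv_t(area_left, df):
    """Value with the given area to its left, found by bisection on t_cdf."""
    lo, hi = -1000.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

conf_level, df = 0.95, 7
upper_tail = (1 - conf_level) / 2
print(f"t* = {inv_t(1 - upper_tail, df):.3f}")   # mirrors invT(1 - Upper Tail Area, df)
print(f"t* = {-inv_t(upper_tail, df):.3f}")      # mirrors -invT(Upper Tail Area, df)
```

Both lines print the same positive critical value, which is why the negative sign is needed in the second form.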
  • Using the t-distribution
    Using TI89
    From home screen:
    [catalog] -> [F3](FlashApps) -> inv_t…TIStat
    “tistat.invT(1 − Upper Tail Area, df)”
    Where “Upper Tail Area” = (1 − CL)/2
    The input 1 − Upper Tail Area is the area to the left of the right critical value.
    ALTERNATIVELY, you may use “−tistat.invT(Upper Tail Area, df)”
    Don’t forget the negative!
  • Using the t-distribution
    ALTERNATIVE TI89 titanium
    [APPS] -> “Stat/List Editor” -> [F5] (distrib) -> “Inverse” -> “Inverse t…”
    “Area: 1- Upper Tail”
    Upper Tail = (1 – CL)/2
    “degrees of freedom, df: df”
    This takes longer to get to, but the menu “guides” you through
  • Table C
    Occasionally, you will need to find the t* for a df that does not appear in the chart.
    When this happens, you are to use the nearest smaller df in the same column (this gives a slightly larger, more conservative t*)
    This usually means “use the t* that is in the line above where the desired t* should be”
  • Example: 1 sample t-interval
    Problem 10.30
    The amount of Vitamin C (mg/100g) in CSB for a random sample of 8 are given as:
    26, 31, 23, 22, 11, 22, 14, 31
    Construct a 95% Confidence interval for the amount of vitamin C in CSB.
  • Example: 1 sample t-interval
    Parameter
    μ = the mean amount of vitamin C in CSB produced at the factory
    x-bar = the mean amount of vitamin C in a sample of n = 8
  • Example: 1 sample t-interval
    Assumptions
    SRS
    ‘Our problem states that we have a random sample’
    Independence
    ‘10n = 80 < N; we can infer that more than 80 units of CSB are produced in the factory’
  • Example: 1 sample t-interval
    Assumptions
    Normality
    ‘The histogram is symmetric w/ no outliers’
    ‘The Norm Prob Plot is approx linear’
    ‘We have good evidence that our sampling distribution is approximately Normal’
    [Figures: Histogram of x (axis from 10 to 35); Norm Prob Plot (axes x and z)]
  • Example: 1 sample t-interval
    Name of Interval
    ‘We will compute a 1-sample t-interval for a mean’
    Interval Calculation
    x-bar = 22.5, s = 7.191, df = 7, t* = 2.365
    22.5 ± 2.365 · (7.191/√8) = 22.5 ± 6.013
    Conclusion
    ‘We are 95% confident that the mean amount of vitamin C in a unit of CSB produced in the factory is between 16.487 and 28.513 mg/100 g’
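The interval in this example can be reproduced from the eight data values. This sketch hard-codes t* = 2.365, the Table C value for df = 7 at 95% confidence, as an assumption instead of computing it; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

vitamin_c = [26, 31, 23, 22, 11, 22, 14, 31]  # mg/100 g, from Problem 10.30
T_STAR = 2.365  # assumption: Table C critical value for df = 7, 95% confidence

n = len(vitamin_c)
x_bar = mean(vitamin_c)        # sample mean
s = stdev(vitamin_c)           # sample standard deviation (divides by n - 1)
me = T_STAR * s / sqrt(n)      # margin of error
lower, upper = x_bar - me, x_bar + me

print(f"x-bar = {x_bar}, s = {s:.3f}, df = {n - 1}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The result agrees with the interval stated in the conclusion, 16.487 to 28.513 mg/100 g.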
  • 10.3 Estimating a population Proportion
  • Estimating a Proportion
    Like with means, we are going to estimate the population proportion based on the proportion in a sample
    x = # of positive responses
    n = total number of responses
    p-hat = x/n
    We will again use the PANIC procedures
    Note: nothing is averaged. We are not looking at the average proportion from many samples
  • Parameter
    Some typical parameters:
    ‘p = proportion of people in CA who support the proposition
    p-hat = proportion of people in a sample (n = 35) who support the proposition’
    ‘p = proportion of students at THS who ride the bus
    p-hat = proportion of students in a sample (n = 14) from THS who ride the bus’
  • Assumptions
    Simple Random Sample: SRS must be either stated or inferred
    Independence: because you are usually sampling without replacement, N > 10n
    Normality: n·p-hat > 10 and n·q-hat > 10
  • Name of the Interval
    “1-proportion z interval”
    Unlike for means, you will always use the Normal curve when dealing with proportions!
  • Interval Calculation
    p-hat ± z* · √(p-hat · q-hat / n)
    z* is calculated as before
    Use table C
    OR
    “−invNorm((1 – C)/2)”
    This calculation is similar to the calculations for the last section
    Margin of Error (ME) = z* · √(p-hat · q-hat / n)
    Standard Error (SE) = √(p-hat · q-hat / n)
  • Conclusion
    Some Examples:
    We are 90% confident that the proportion of voters in CA who support the proposition is 0.34 ± 0.03
    We are 95% confident that the proportion of students at THS who ride this bus is in the interval (0.39, 0.44)
  • Sample Size
    The relevant formula (from ME) for the sample size is: n ≥ (z*/ME)² · p* · q*
    p* and q* are guessed values of the proportion
    If there is no previous data or study, use p* = q* = 0.5 (this will maximize the error and sample size)
    As before, you are to round the sample size up to the nearest integer
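Solving ME = z* · √(p* · q* / n) for n gives n ≥ (z*/ME)² · p* · q*. The sketch below applies it with the conservative guess p* = 0.5; the target margin of error of 0.03 at 95% confidence is a hypothetical value picked for illustration, and the function name is an assumption.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size_proportion(me, conf_level, p_guess=0.5):
    """Smallest n so a 1-proportion z interval has margin of error at most me."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    q_guess = 1 - p_guess
    return ceil((z_star / me) ** 2 * p_guess * q_guess)  # round up

n_conservative = min_sample_size_proportion(me=0.03, conf_level=0.95)
n_with_prior = min_sample_size_proportion(me=0.03, conf_level=0.95, p_guess=0.34)
print(f"p* = 0.50: n = {n_conservative}")
print(f"p* = 0.34: n = {n_with_prior}")
```

Because p* · q* is largest at p* = 0.5, any other guess gives a smaller required sample, which is why 0.5 is the safe default when no earlier study exists.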
  • TI83/84
    [stat] -> “TESTS” -> “1-PropZInt”
    “x: number of successes” (a whole number; this can be computed with “p-hat x n”)
    “n: number of people in sample”
    “C-Level : confidence level”
    “Calculate” and you are done
    You still need to fully write up “PANIC” procedures
  • TI 89
    [APPS] -> “Stat/List Editor”
    [2nd] -> [F2](F7) -> “1-PropZInt”
    “Successes x: # of successes”
    “n: number of people in sample”
    “C-Level : confidence level”
    “Calculate” and you are done
    You still need to fully write up “PANIC” procedures
  • Example 1-PropZInt
    The 2004 Gallup Youth Survey asked a random sample of 439 US teens aged 13 to 17 whether they thought young people should wait to have sex until marriage. 246 said “yes.” Let’s construct a 95% confidence interval for the proportion of all teens who would say “Yes”
  • Parameter
    p = the proportion of all teens in the US aged 13-17 who would answer “Yes” to the survey
    p-hat = the proportion of teens in the survey of 439 who answered “Yes” to the survey
  • Assumptions
    SRS: We are told in the problem that the survey was a random sample.
    Independence: there are more than 10(439) = 4390 teens aged 13-17 in the US
    Normality: n x p-hat = 246, n x q-hat = 193
    The sampling distribution is approx. Normal
    (Note that this is just #successes and #failures)
  • Name of Interval
    We are constructing a “1 proportion Z interval”
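  • Construction of Interval
    p-hat = 246/439 ≈ 0.560, q-hat ≈ 0.440, z* = 1.960
    0.560 ± 1.960 · √((0.560)(0.440)/439) = 0.560 ± 0.046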
  • Conclusion
    “We are 95% confident that the true proportion of 13-17 year olds who would answer “yes” when asked if young people should wait to have sex until they are married is between 0.514 and 0.606.”
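The Gallup example can be checked with a short sketch of the 1-proportion z interval. The counts (246 of 439) and the 95% level come from the example; the function name and the decision to raise an error when the Normality condition fails are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(x, n, conf_level):
    """1-proportion z interval from x successes in n trials."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # Normality condition from the Assumptions slide: successes and failures both above 10.
    if not (x > 10 and n - x > 10):
        raise ValueError("Normality condition not met")
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    me = z_star * sqrt(p_hat * q_hat / n)  # margin of error
    return p_hat - me, p_hat + me

lower, upper = one_prop_z_interval(x=246, n=439, conf_level=0.95)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

Carrying full precision, the upper bound comes out near 0.6068, which rounds to 0.607; rounding p-hat to 0.56 and ME to 0.046 before adding gives the 0.606 quoted in the conclusion.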