1. Introduction to Statistics
STA250
Lecture 11 - April 21st, 2010
4. Probability
✤ How we express likelihood mathematically
✤ For an event “A”, the probability of A occurring is denoted “P(A)”
✤ Always a number between 0 and 1
✤ P(A) = 0 means that A never happens
✤ P(A) = 1 means that A always happens
5. Independence & Exclusivity
✤ independence - A and B are independent if the occurrence of one does
not affect the probability of the other:
✤ P(A|B) = P(A) = P(A|not B)
✤ P(B|A) = P(B) = P(B|not A)
✤ mutually exclusive - A and B are mutually exclusive if it is impossible
for both of them to occur:
✤ P(A and B) = 0
6. Probability Rules
✤ Probability of not happening is 1 minus probability of occurring
✤ P(not A) = 1 - P(A)
✤ When A and B are independent:
✤ P(A and B) = P(A) × P(B)
✤ For any two events A and B (independent or not):
✤ P(A or B) = P(A) + P(B) - P(A and B)
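These rules can be checked by brute force on a small sample space. The sketch below enumerates the 36 outcomes of two fair dice; the events A and B are hypothetical choices made for illustration, picked so that they concern different dice and are therefore independent.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate over (first, second)."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

# Hypothetical events for illustration (not from the lecture).
def a(o):
    return o[0] % 2 == 0      # A: first die is even

def b(o):
    return o[1] >= 5          # B: second die shows 5 or 6

p_a, p_b = prob(a), prob(b)
p_not_a = prob(lambda o: not a(o))
p_a_and_b = prob(lambda o: a(o) and b(o))
p_a_or_b = prob(lambda o: a(o) or b(o))

print(p_not_a == 1 - p_a)                  # complement rule
print(p_a_and_b == p_a * p_b)              # multiplication rule (A, B independent)
print(p_a_or_b == p_a + p_b - p_a_and_b)   # addition rule
```

Exact fractions are used instead of floats so the equalities can be tested without rounding error.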
7. Probability Fundamentals
✤ Sum of probabilities of all possible outcomes is 1
✤ Flip a coin and you get either heads or tails:
✤ P(heads) + P(tails) = 1 = P(heads or tails)
✤ With mutually exclusive outcomes A, B, C, and D that cover every possibility
✤ P(A) + P(B) + P(C) + P(D) = 1 = P(A or B or C or D)
8. Conditional Probability
✤ With non-independent events, knowing one has happened may change
the likelihood of the other occurring
✤ Conditional probability - what is the probability of A given that B has
already happened?
✤ P(A|B)
✤ Bayes’ Rule for conditional probability:
P(A|B) = P(A and B) / P(B) = P(B|A) × P(A) / P(B)
9. Conditional Probability Hoedown
✤ At John Jay, 62.5% of all students hate statistics while 25% of all
students hate statistics and passed the class. What is the probability
that a student passes stats given that the student hates statistics?
✤ Two fair dice are rolled, what is the (conditional) probability that
exactly one die’s value is a 1 or 2 given that they show different
numbers?
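One way to work both problems is to apply P(A|B) = P(A and B) / P(B) directly. The sketch below does this with exact fractions; the variable names are illustrative, and the dice problem is solved by enumerating the 36 equally likely rolls.

```python
from fractions import Fraction
from itertools import product

# Problem 1: P(pass | hates) = P(hates and pass) / P(hates)
p_hates = Fraction(625, 1000)          # 62.5% hate statistics
p_hates_and_pass = Fraction(25, 100)   # 25% hate statistics and passed
p_pass_given_hates = p_hates_and_pass / p_hates
print(p_pass_given_hates)              # 2/5

# Problem 2: condition on the two dice showing different numbers.
rolls = list(product(range(1, 7), repeat=2))
different = [r for r in rolls if r[0] != r[1]]
# "Exactly one die is a 1 or 2": one value is <= 2 and the other is not.
exactly_one_low = [r for r in different if (r[0] <= 2) != (r[1] <= 2)]
p_one_low_given_different = Fraction(len(exactly_one_low), len(different))
print(p_one_low_given_different)       # 8/15
```

Restricting the list of rolls to those with different numbers is the enumeration version of dividing by P(B).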
10. Something Really Important
✤ Classic stats problem emerged from the game show Let’s Make a Deal,
often called the Monty Hall Problem after the show’s host
✤ Has ended many friendships and caused bitter internet arguments
11. The Game
✤ There are 3 doors labeled “1”, “2”, and “3”, behind one of these doors
is a fabulous prize that Monty has hidden
✤ You get to choose a door, which may or may not have the prize
✤ Monty opens another door without revealing the prize
✤ You now have the option to stay with your door or switch to the other
unopened door. Should you stick with your original choice or switch?
12. Choosing The First Door
✤ Three doors and one prize so you’ll pick the right door one out of
three times, i.e. P(right first choice) = 1/3
✤ Likewise, you’ll pick the wrong door with P(wrong first choice) = 2/3
13. The Reveal
✤ No matter how you choose, there are two other doors and only one prize.
This means at least one of the two unchosen doors has nothing
behind it.
✤ Monty knows where the prize is and opens an unchosen door that DOESN’T
have the prize behind it.
✤ This leaves your door and one other. One of them has the prize and
the other doesn’t. Should you switch?
14. To Switch, or Not To Switch
✤ You don’t know if you have the right door!
✤ What’s the probability that your door has the prize?
✤ What’s the probability that the other door has the prize?
✤ What’s the probability that your door doesn’t have the prize?
15. Example of the Game
✤ As an example, the prize is hidden behind door “3”.
✤ If you choose door “3” initially, switching can only lose you the prize
✤ If you choose door “2” initially, Monty must open door “1” and
switching will get you the prize
✤ If you choose door “1” initially, Monty must open door “2” and
switching will get you the prize
16. Switch Already!
✤ Switching is a way of saying “I don’t think the prize is behind this
door”
✤ Since the probability is 1/3 that the prize is behind any one door, the
probability is 2/3 that the prize is not behind that door
✤ Always switch and you’ll win 2/3 of the time!
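For anyone still unconvinced, the game is easy to simulate. The sketch below assumes Monty always opens an unchosen, prize-free door and picks at random when he has two to choose from; the seed and the number of trials are arbitrary.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant ends up with the prize."""
    doors = [1, 2, 3]
    prize = rng.choice(doors)
    choice = rng.choice(doors)
    # Monty opens a door that is neither the contestant's pick nor the prize.
    opened = rng.choice([d for d in doors if d != choice and d != prize])
    if switch:
        # Move to the one remaining closed door.
        choice = next(d for d in doors if d != choice and d != opened)
    return choice == prize

rng = random.Random(250)   # fixed seed (arbitrary) so runs are repeatable
trials = 100_000
stay_rate = sum(play(False, rng) for _ in range(trials)) / trials
switch_rate = sum(play(True, rng) for _ in range(trials)) / trials
print(f"stay:   {stay_rate:.3f}")
print(f"switch: {switch_rate:.3f}")
```

The two win rates should land close to 1/3 and 2/3 respectively.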
17. Expected Values
✤ Probability can be used to estimate rewards in a game of chance
✤ Expected Value = P(A)×Reward(A) + P(B)×Reward(B) + ...
✤ Silly coin-flipping game: If you can flip a coin three times and have
exactly one Heads, you get a dollar. If not, you give me a dollar.
✤ Should you take the bet?
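The bet can be evaluated with the expected value formula. This sketch assumes a fair coin and enumerates the eight equally likely sequences of three flips.

```python
from fractions import Fraction
from itertools import product

flips = list(product("HT", repeat=3))             # 8 equally likely sequences
wins = [f for f in flips if f.count("H") == 1]    # exactly one Heads
p_win = Fraction(len(wins), len(flips))
p_lose = 1 - p_win

# Expected Value = P(win) x Reward(win) + P(lose) x Reward(lose)
expected_value = p_win * 1 + p_lose * (-1)
print(p_win, expected_value)   # 3/8 -1/4
```

On average you lose 25 cents per game, so the bet is a bad one.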
18. Normal Distribution
✤ The distribution is
✤ unimodal
✤ symmetric
✤ “light tailed”
✤ Notation: X ~ N(μ, σ) means “the random variable X has a normal
distribution with mean μ and standard deviation σ”
19. Area Under the Curve Equals 1
[Figure: normal density curves f(x) for N(−3, 0.5), N(2, 1), and N(−1, 3), plotted for x from −4 to 4]
20. Rules of Thumb
✤ P(within one standard deviation) ≈ 0.68
✤ P(within 1.96 standard deviations) ≈ 0.95
✤ P(within three standard deviations) ≈ 0.997
✤ With “real” normal distributions, extreme outliers are very rare!
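The rule-of-thumb probabilities can be checked numerically. The sketch below uses the identity P(|Z| < k) = erf(k / √2) for a standard normal Z; the helper name `prob_within` is illustrative.

```python
from math import erf, sqrt

def prob_within(k):
    """P(|Z| < k) for a standard normal Z, via the error function."""
    return erf(k / sqrt(2))

for k in (1, 1.96, 2, 3):
    print(f"within {k} standard deviations: {prob_within(k):.4f}")
```

Values beyond three standard deviations turn up only about 0.3% of the time.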
21. Standard Normal Distribution
✤ standard normal distribution is the normal distribution with mean μ = 0
and standard deviation σ = 1: Z ~ N(0, 1)
✤ Any normal distribution can be transformed into a standard normal
distribution. If X ~ N(μ, σ), then:
Z = (X − μ) / σ
22. Z - Scores & the Standard Normal
✤ Each observation has an associated z-score, which is the number of
standard deviations that observation is away from the mean
✤ Converting a sample from a normal distribution to z-scores transforms
it to a standard normal distribution
✤ z-score = (observation - mean) ÷ standard deviation
✤ If the observation is above the mean then the Z-score is positive; if
below then the Z-score is negative
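A short sketch of the z-score calculation follows. The observations are made up for illustration, and the population standard deviation (`pstdev`) is an assumption; the sample version (`stdev`) would be used if the data were a sample.

```python
from statistics import mean, pstdev

observations = [52, 61, 70, 79, 88]   # hypothetical data
mu = mean(observations)
sigma = pstdev(observations)          # assumption: treat the data as a population

# z-score = (observation - mean) / standard deviation
z_scores = [(x - mu) / sigma for x in observations]
for x, z in zip(observations, z_scores):
    print(f"{x}: z = {z:+.2f}")
```

Observations below the mean of 70 get negative z-scores, and those above it get positive ones.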
23. Interval Estimation
✤ We might estimate the mean for an entire population using the mean
for a small sample; this is called a point estimate.
✤ A confidence interval gives a range of “plausible” values for the
population mean
✤ Usually reported as "mean ± wiggle room"
✤ Each interval has an associated level of confidence, usually written as
a percent (95% being the most common)
✤ "I am 95% confident that the population mean is in this range,
with the sample mean being the most likely guess"
24. Two-Sided: 1.96 Std. Dev.’s
25. Normal Critical Deviates
✤ Critical normal deviate: If you wanted to find the middle X% of the
distribution, how many standard deviations would you have to travel
in each direction?
✤ Define zγ to be the point for which the area under the normal curve to
the right is γ.
✤ In more mathematical notation, zγ is the point for which:
P(Z > zγ) = γ
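Critical deviates can be looked up in a table or computed. The sketch below uses `NormalDist.inv_cdf` from Python's standard library; the helper name `z_crit` is illustrative. To capture the middle X% of the distribution, the area left over in each tail is γ = (1 − X) / 2.

```python
from statistics import NormalDist

def z_crit(gamma):
    """Point with area gamma to its right under the standard normal curve."""
    return NormalDist().inv_cdf(1 - gamma)

for level in (0.90, 0.95, 0.99):
    gamma = (1 - level) / 2   # area in each tail
    print(f"middle {level:.0%}: z = {z_crit(gamma):.3f}")
```

For the middle 95% this gives roughly 1.96 standard deviations in each direction.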
26. Interpreting Confidence Intervals
✤ The width of a confidence interval indicates precision
✤ An observation's z-score shows how unusual it is: only about 5% of
observations from a normal distribution have a z-score beyond ±1.96
✤ 95% confidence intervals are by far the most common, but any level of
confidence interval can be computed:
✤ 90%: mean ± (1.645 × standard error)
✤ 95%: mean ± (1.96 × standard error)
✤ 99%: mean ± (2.58 × standard error)
✤ For a sample mean, standard error = standard deviation ÷ √n
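A minimal sketch of computing such intervals for a population mean, assuming the spread term is the standard error of the sample mean (sample standard deviation ÷ √n) and using normal critical values as on the slide. The data and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(sample, level=0.95):
    """Normal-based confidence interval for the population mean."""
    n = len(sample)
    center = mean(sample)
    std_error = stdev(sample) / sqrt(n)             # spread of the sample mean
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # e.g. about 1.96 for 95%
    return center - z * std_error, center + z * std_error

sample = [98, 102, 95, 110, 105, 99, 101, 97, 104, 100]  # hypothetical data
for level in (0.90, 0.95, 0.99):
    low, high = confidence_interval(sample, level)
    print(f"{level:.0%} CI: ({low:.2f}, {high:.2f})")
```

With a sample this small, a t critical value would normally replace the normal one; the normal version is kept here to match the multipliers above.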
27. Components of Confidence
✤ How might a confidence interval change as:
✤ Ȳ increases
✤ σ increases
✤ n increases
✤ the confidence level increases (e.g., from 95% to 99%)
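One way to explore these questions is to compute the half-width ("wiggle room") of the interval, z × σ / √n, and change one ingredient at a time. The numbers below are arbitrary illustrations. Ȳ does not appear in the formula: it shifts the center of the interval without changing its width.

```python
from math import sqrt
from statistics import NormalDist

def half_width(sigma, n, level):
    """Half-width of a normal-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z * sigma / sqrt(n)

base = half_width(sigma=10, n=25, level=0.95)
bigger_sigma = half_width(sigma=20, n=25, level=0.95)
bigger_n = half_width(sigma=10, n=100, level=0.95)
higher_level = half_width(sigma=10, n=25, level=0.99)

print(f"baseline:         {base:.2f}")
print(f"sigma doubled:    {bigger_sigma:.2f}")   # wider
print(f"n quadrupled:     {bigger_n:.2f}")       # narrower
print(f"99% instead of 95%: {higher_level:.2f}") # wider
```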
28. Conflicting Hypotheses
✤ In statistical inference, there are always two conflicting hypotheses:
✤ null hypothesis “H0” - often states “no effect” or “no difference”.
This is the hypothesis that we will assume to be true unless we
have convincing evidence to the contrary.
✤ alternative hypothesis “H1” or “Ha” - The hypothesis that we will
believe only if the evidence strongly supports it.
✤ The null hypothesis typically has “=” in it
29. Hypothesis as Metaphor
✤ Hypothesis tests are like U.S. criminal trials
✤ The judicial system is structured such that the accused person is
presumed innocent until proven guilty. In such a system the absence
of convincing evidence (“beyond a reasonable doubt”) results in the
person being set free.
✤ H0: innocent
✤ Ha: guilty
30. P-values
✤ In each hypothesis testing situation we will compute a p-value. This is
the probability of seeing data at least as extreme as ours if the null
hypothesis were true.
✤ Fail to reject H0 if the p-value is large
✤ Reject H0 if the p-value is small, go with Ha
✤ How small is small enough? It depends... (usually p < 0.05)
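As a sketch of where a p-value comes from, suppose the test statistic is a z-score that follows a standard normal distribution when H0 is true (an assumption; other tests use other reference distributions). The two-sided p-value is then the area in both tails beyond the observed value. The z values below are arbitrary examples.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. assuming H0 is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
for z in (0.5, 1.96, 3.0):
    p = two_sided_p_value(z)
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"z = {z}: p = {p:.4f} -> {decision}")
```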
31. Notes on Hypothesis Testing
✤ “Statistical significance” is not the same as “clinical significance”. A
tiny effect may be “statistically significant” if the sample size is huge.
✤ The p-value does not describe the magnitude of the effect!
✤ When reporting analysis results, a confidence interval should always
be provided along with the results of a hypothesis test.
✤ The choice of 0.05 is arbitrary. (p = 0.051 and p = 0.049 should lead to
similar conclusions, but in practice they often do not)
✤ Never report results as “p < 0.05”, report the p-value and let the
reader decide if they agree with your interpretation.
32. • Type I Error: Reject H0 when H0 is actually true.
– For example, to conclude there is an effect (or a difference)
when there really isn’t one.
– Also called “false positive”.
• Type II Error: Accept H0 when H0 is actually false.
– For example, to fail to find an effect (or a difference) when
there really is one.
– Also called “false negative”.
State of nature:
Decision | H0 is true | Ha is true
Accept H0 | Correct decision | Type II error
Reject H0 | Type I error | Correct decision
33. Probabilities of Errors of Type I and Type II
✤ Each of the errors has an associated probability:
• α = P(Type I Error)
• β = P(Type II Error)
✤ Hypothesis testing is set up to control Type I error rate (α)
✤ The experimenter chooses α - everything else follows from this!
✤ Most common (by far) choice for α is 0.05.
✤ (Also, 0.01 and 0.10 on occasion)
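What "controlling the Type I error rate" means can be seen in a small simulation. The sketch below repeatedly draws samples for which H0 is true by construction and counts how often a two-sided z-test rejects anyway. The sample size, number of experiments, seed, and known σ = 1 are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

rng = random.Random(11)            # fixed seed (arbitrary) for repeatability
alpha = 0.05
z_cut = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
n, experiments = 30, 20_000

rejections = 0
for _ in range(experiments):
    # H0 (mean = 0) is true here: the data really come from N(0, 1).
    sample = [rng.gauss(0, 1) for _ in range(n)]
    z = (sum(sample) / n) / (1 / sqrt(n))     # z-statistic with known sigma = 1
    if abs(z) > z_cut:
        rejections += 1                       # a Type I error

type_i_rate = rejections / experiments
print(f"observed Type I error rate: {type_i_rate:.3f}")
```

The observed rate should come out close to the chosen α of 0.05.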
34. Comparing Means
✤ Tests:
✤ Single group versus a fixed mean
✤ Two groups with the same variable
✤ Two groups with pairwise observations
✤ Hypotheses:
✤ H0 : the two groups have equal means ( mean A = mean B )
✤ Ha : the means of the groups are different
35. Assumptions for t-Tests
✤ The group (sample) is the Independent Variable (dichotomous)
✤ The outcome of interest is the Dependent Variable
✤ t-Tests are only valid if these assumptions are not violated:
✤ The research question DOES involve the comparison of 2 means
✤ The Dependent Variable is measured on a quantitative scale
✤ The distribution of the Dependent Variable is normal
✤ Independent Variable assigned randomly (independently)
36. Met Assumptions, but Which Test?
✤ Only one group with data: One-Sample t-Test
✤ Two groups:
✤ Not related to each other: Independent-Samples t-Test
✤ Related samples (e.g. before & after): Paired-Samples t-Test
37. One-Sample t-Test
✤ Compares a sample mean to a known population mean.
✤ Need to know the population mean!
✤ Example: Is there a difference between the population mean IQ (100)
and the mean IQ for a sample of 50 John Jay students (125)?
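The test statistic for this example can be sketched as below. The slide does not give the sample standard deviation, so the value 15 is an assumption made purely for illustration, and the function name is hypothetical.

```python
from math import sqrt

def one_sample_t(sample_mean, pop_mean, sample_sd, n):
    """t-statistic and degrees of freedom for a one-sample t-test."""
    std_error = sample_sd / sqrt(n)
    return (sample_mean - pop_mean) / std_error, n - 1

# Sample mean 125, population mean 100, n = 50 are from the example;
# sample_sd = 15 is an assumed value.
t_stat, df = one_sample_t(sample_mean=125, pop_mean=100, sample_sd=15, n=50)
print(f"t = {t_stat:.2f}, df = {df}")
```

The t-statistic is then compared against a t distribution with n − 1 degrees of freedom to obtain the p-value.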
38. Paired-Samples t-Test
✤ Sometimes we have two sets of measurements that are related:
✤ Each subject is measured before and after treatment
✤ With pairs of identical twins
✤ Subject has different treatment on left & right arms
✤ For each observation in one group there is exactly one closely related
observation in the other group (so pairs can be formed, one from each group)
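Because the observations pair up, a paired test reduces to a one-sample test on the within-pair differences (H0: mean difference = 0). A minimal sketch with made-up before/after measurements:

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements on the same six subjects.
before = [140, 152, 138, 145, 160, 149]
after = [135, 150, 136, 139, 151, 147]

# One difference per pair, then a one-sample t-statistic on the differences.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)
t_stat = mean(differences) / (stdev(differences) / sqrt(n))
df = n - 1
print(f"mean difference = {mean(differences):.2f}, t = {t_stat:.2f}, df = {df}")
```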
39. Independent-Samples t-Test
✤ Compares the means of two groups or samples.
✤ One of the most common situations in statistical inference is that of
comparing two means from independent samples
✤ Clinical trials - treatment group vs. placebo group
✤ Exposed vs. unexposed
✤ Males vs. females
✤ General population vs. specific subpopulation
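For two unrelated groups, the t-statistic divides the difference in sample means by its standard error. The slides do not say which variant is intended, so the sketch below assumes the unequal-variance (Welch) form; the data and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """t-statistic for two independent samples, not assuming equal variances."""
    std_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / std_error

treatment = [23, 27, 31, 25, 29, 30]   # hypothetical outcomes
placebo = [20, 22, 26, 21, 24, 19]
t_stat = welch_t(treatment, placebo)
print(f"difference in means = {mean(treatment) - mean(placebo):.1f}, t = {t_stat:.2f}")
```

Swapping the two groups flips the sign of t but not its size, which is why a two-sided test looks at the absolute value.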
40. Review: Hypotheses
✤ Null Hypothesis: there is no relationship between the independent
and dependent variables
✤ p-value: the probability of data at least as extreme as ours, assuming
the null hypothesis (H0) is true
✤ Reject H0 if p is too small (usually p < 0.05)
✤ If we reject H0, we must instead choose the alternative (Ha)
41. Review: t-Tests
✤ Compare the means of exactly two groups
✤ Only one group (with data) compared to a fixed number:
✤ One-Sample t-Test
✤ Two groups (with data):
✤ Not related to each other: Independent-Samples t-Test
✤ Related samples (e.g. before & after): Paired-Samples t-Test