2. Chap 4-2
Key Definitions
n A population is the collection of all items of interest or
under investigation
n N represents the population size
n A sample is an observed subset of the population
n n represents the sample size
n A parameter is a specific characteristic of a population
n A statistic is a specific characteristic of a sample
3. Chap 4-3
Data types
n Qualitative data: Categorical data
n Quantitative data: Numerical data
n Cross-Sectional data: Data that are collected at
one point in time
n Time-series data: A sequence of data points
indexed in time order
4. Chap 4-4
Variable types
n Discrete variable: Countable outcomes. Can be
qualitative or quantitative data
Ø Example: The number of customers arriving at a
bank counter within 1 hour
n Continuous variable: Can take any value
within an interval.
Ø Example: Weight of an item produced by an
automatic machine.
5. Chap 4-5
Descriptive and Inferential Statistics
Two branches of statistics:
n Descriptive statistics
n Graphical and numerical procedures to summarize
and process data
n Inferential statistics
n Using data to make predictions, forecasts, and
estimates to assist decision making
6. Chap 4-6
Inferential Statistics
n Estimation
n e.g., Estimate the population
mean weight using the sample
mean weight
n Hypothesis testing
n e.g., Test the claim that the
population mean weight is 120
pounds
Inference is the process of drawing conclusions or
making decisions about a population based on
sample results
7. Chap 4-7
A simple question
Ø If you went to school 200 days last year and were
late on 102 of them, what is the probability that you
will be late tomorrow?
“So easy, no need to think! I don’t need to study
statistics to answer this question. The probability of
being late is approximately 1/2 (0.51 to be exact!).”
8. Another question
November 8, 2016 is the day of the US presidential
election.
On FiveThirtyEight.com, Nate Silver (who correctly
forecast the results of the presidential election several
times before) wrote, at 9 a.m.:
"The probability of Hillary Clinton winning the election is
71%"
That evening, Donald Trump was declared the winner to
the surprise of many.
Chap 4-8
9. Another question
Ø In 1950, an economist preparing a report asked statistician David
Blackwell (not yet a Bayesian) to estimate the probability of another
world war in the next five years.
- Blackwell answered:
"Oh, that question just doesn’t make sense. Probability applies to a
long sequence of repeatable events, and this is clearly a unique
situation. The probability is either 0 or 1, but we won’t know for five
years."
- The economist replied:
"I was afraid you were going to say that. I’ve spoken to several other
statisticians, and they all told me the same thing."
Excerpt from “A theory that would not die” – Sharon McGrayne
Chap 4-9
10. n We use probability to quantify uncertainty of our
knowledge given data, in our life (often instinctively) and
in our work.
n There are two interpretations of probabilities:
Frequentist probability and Bayesian probability.
n The mathematical concept of probability is the same,
but how probability is used to quantify knowledge is
different!
n The two different interpretations correspond to two
distinct approaches to statistical inference
Chap 4-10
12. In a nutshell
n Frequentist approach is natural when data can
be obtained by repeated experiments.
n Bayesian approach is more appropriate when
one needs to draw conclusions based only on
the data available.
Chap 4-12
13. Chap 4-13
Graphical
Presentation of Data
n Techniques reviewed in this chapter:
Categorical Variables:
• Frequency distribution
• Bar chart
• Pie chart
• Pareto diagram
Numerical Variables:
• Line chart
• Frequency distribution
• Histogram and ogive
• Stem-and-leaf display
• Scatter plot
14. Chap 4-14
Relationships Between Variables
n Graphs illustrated so far have involved only a
single variable
n When two variables exist other techniques are
used:
Categorical (Qualitative) Variables: Cross tables
Numerical (Quantitative) Variables: Scatter plots
15. Chap 4-15
Describing Data Numerically
Central Tendency:
• Arithmetic Mean
• Median
• Mode
Variation:
• Range
• Interquartile Range
• Variance
• Standard Deviation
• Coefficient of Variation
16. Chap 4-16
Measures of Central Tendency
Overview of central tendency:
n Mean: arithmetic average
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
n Median: midpoint of ranked values
n Mode: most frequently observed value
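As a small illustration of the three measures of central tendency, the sketch below uses Python's standard `statistics` module. The data values are hypothetical and were chosen only to make the results easy to check by hand.

```python
import statistics

# Hypothetical sample, chosen only for illustration
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # sum of the values divided by n
median = statistics.median(data)  # midpoint of the ranked values
mode = statistics.mode(data)      # most frequently observed value

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

With an even number of observations, the median is taken as the average of the two middle ranked values (here 3 and 5).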
17. Chap 4-17
Shape of a Distribution
n Describes how data are distributed
n Measures of shape
n Symmetric or skewed
Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean
18. Chap 4-18
Measures of Variability
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
(Figure: distributions with the same center but different variation)
n Measures of variation give
information on the spread
or variability of the data
values.
19. Chap 4-19
Population Variance
n Average of squared deviations of values from
the mean
n Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
Where μ = population mean
N = population size
x_i = ith value of the variable x
20. Chap 4-20
Sample Variance
n Average (approximately) of squared deviations
of values from the mean
n Sample variance:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Where x̄ = arithmetic mean
n = sample size
x_i = ith value of the variable x
21. Chap 4-21
Population Standard Deviation
n Most commonly used measure of variation
n Shows variation about the mean
n Has the same units as the original data
n Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
22. Chap 4-22
Sample Standard Deviation
n Most commonly used measure of variation
n Shows variation about the mean
n Has the same units as the original data
n Sample standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
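The difference between the population formulas (divide by N) and the sample formulas (divide by n - 1) can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the data values are hypothetical.

```python
import statistics

# Hypothetical values; the mean is 14 and the squared deviations sum to 40
data = [10, 12, 14, 16, 18]

pop_var = statistics.pvariance(data)   # divides by N      -> 40 / 5
samp_var = statistics.variance(data)   # divides by n - 1  -> 40 / 4
pop_sd = statistics.pstdev(data)       # square root of the population variance
samp_sd = statistics.stdev(data)       # square root of the sample variance

print("population variance:", pop_var, " sample variance:", samp_var)
print("population sd:", pop_sd, " sample sd:", samp_sd)
```

The sample versions are always slightly larger than the population versions for the same data, because the divisor is smaller.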
23. Chap 4-23
The Sample Covariance
n The covariance measures the direction and, in unit-dependent terms,
the strength of the linear relationship between two variables
n The population covariance:
$$\mathrm{Cov}(x, y) = \sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$$
n The sample covariance:
$$\mathrm{Cov}(x, y) = s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
n Only concerned with the strength of the relationship
n No causal effect is implied
24. Chap 4-24
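Interpreting Covariance
n Covariance between two variables:
Cov(x,y) > 0: x and y tend to move in the same direction
Cov(x,y) < 0: x and y tend to move in opposite directions
Cov(x,y) = 0: x and y have no linear relationship (independent variables have zero covariance, but zero covariance alone does not imply independence)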
25. Chap 4-25
Coefficient of Correlation
n Measures the relative strength of the linear relationship
between two variables
n Population correlation coefficient:
$$\rho = \frac{\mathrm{Cov}(x, y)}{\sigma_X \sigma_Y}$$
n Sample correlation coefficient:
$$r = \frac{\mathrm{Cov}(x, y)}{s_X s_Y}$$
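To make the sample covariance and sample correlation formulas concrete, here is a minimal sketch written directly from the definitions. The helper names `sample_cov` and `sample_corr` and the data are hypothetical, introduced only for this illustration.

```python
import math

def sample_cov(xs, ys):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_corr(xs, ys):
    """Sample correlation: covariance divided by the product of sample SDs."""
    s_x = math.sqrt(sample_cov(xs, xs))  # Cov(x, x) is the sample variance of x
    s_y = math.sqrt(sample_cov(ys, ys))
    return sample_cov(xs, ys) / (s_x * s_y)

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print("covariance:", sample_cov(x, y))
print("correlation:", round(sample_corr(x, y), 4))
```

Rescaling x (for example, multiplying every value by 100) changes the covariance but leaves r unchanged, which is why r is described as unit free.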
26. Chap 4-26
Features of
Correlation Coefficient, r
n Unit free
n Ranges between –1 and 1
n The closer to –1, the stronger the negative linear
relationship
n The closer to 1, the stronger the positive linear
relationship
n The closer to 0, the weaker any linear
relationship
27. Chap 4-27
Scatter Plots of Data with Various
Correlation Coefficients
(Figure: six scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = +.3, r = +1, and a second pattern with r = 0)
28. Chap 4-28
Introduction to
Probability Distributions
n Random Variable
n Represents a possible numerical value from
a random experiment
Random Variables:
• Discrete Random Variable
• Continuous Random Variable
29. Chap 4-29
Expected Value
n Expected Value (or mean) of a discrete
distribution (Weighted Average)
$$\mu = E(x) = \sum_{x} x\,P(x)$$
n Example: Toss 2 coins,
x = # of heads,
compute expected value of x:
x    P(x)
0    .25
1    .50
2    .25
E(x) = (0 × .25) + (1 × .50) + (2 × .25)
= 1.0
31. Chap 4-31
Variance and Standard
Deviation
n Variance of a discrete random variable X
$$\sigma^2 = E[(X - \mu)^2] = \sum_{x} (x - \mu)^2 P(x)$$
n Standard Deviation of a discrete random variable X
$$\sigma = \sqrt{\sigma^2} = \sqrt{\sum_{x} (x - \mu)^2 P(x)}$$
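The expected value and variance formulas for a discrete random variable can be applied to the two-coin example from the Expected Value slide (x = number of heads). The sketch below is one way to do this; the variable names are illustrative.

```python
import math

# Probability function from the two-coin example: x = number of heads
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

# Expected value: probability-weighted average of the outcomes
mu = sum(x * p for x, p in pmf.items())

# Variance: probability-weighted average of squared deviations from mu
var = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Standard deviation: square root of the variance
sd = math.sqrt(var)

print("E(x) =", mu)
print("variance =", var)
print("standard deviation =", round(sd, 4))
```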
32. Chap 4-32
Bernoulli Distribution
n Consider only two outcomes: “success” or “failure”
n Let P denote the probability of success
n Let 1 – P be the probability of failure
n Define random variable X:
x = 1 if success, x = 0 if failure
n Then the Bernoulli probability function is
$$P(0) = (1 - P) \quad \text{and} \quad P(1) = P$$
33. Chap 4-33
Binomial Distribution Formula
$$P(x) = \frac{n!}{x!\,(n - x)!}\, P^{x} (1 - P)^{n - x}$$
P(x) = probability of x successes in n trials,
with probability of success P on each trial
x = number of ‘successes’ in sample,
(x = 0, 1, 2, ..., n)
n = sample size (number of trials
or observations)
P = probability of “success”
Example: Flip a coin four
times, let x = # heads:
n = 4
P = 0.5
1 - P = (1 - 0.5) = 0.5
x = 0, 1, 2, 3, 4
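For the coin example (n = 4, P = 0.5), the binomial formula can be evaluated for every x. This is a minimal sketch; the function name `binomial_pmf` is an illustrative choice, and `math.comb` supplies the n!/(x!(n - x)!) term.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Coin example from the slide: four flips, x = number of heads
n, p = 4, 0.5
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(probs):
    print(f"P({x}) = {prob}")
```

The five probabilities sum to 1, as they must for a probability function.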
34. Chap 4-34
Continuous Probability Distributions
n A continuous random variable is a variable that
can assume any value in an interval
n thickness of an item
n time required to complete a task
n temperature of a solution
n height, in inches
n These can potentially take on any value,
depending only on the ability to measure
accurately.
35. Chap 4-35
Probability Density Function
The probability density function, f(x), of random variable X has the
following properties:
1. f(x) ≥ 0 for all values of x
2. The area under the probability density function f(x) over all values of the
random variable X is equal to 1.0
3. The probability that X lies between two values is the area under the
density function graph between the two values
4. The cumulative distribution function F(x0) is the area under the probability
density function f(x) from the minimum x value up to x0
$$F(x_0) = \int_{x_m}^{x_0} f(x)\,dx$$
where x_m is the minimum value of the random variable x
36. Chap 4-36
The Normal Distribution
n ‘Bell Shaped’
n Symmetrical
n Mean, Median and Mode
are Equal
Location is determined by the
mean, μ
Spread is determined by the
standard deviation, σ
The random variable has an
infinite theoretical range:
-∞ to +∞
(Figure: bell-shaped curve of f(x) against x, centered at μ with spread σ, where Mean = Median = Mode)
37. Chap 4-37
The Standardized Normal Table
n Appendix Table 1 gives the probability F(a) for
any value a
Example:
P(Z < 2.00) = .9772
(Figure: standard normal curve over Z, showing the area .9772 below 2.00)
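The table lookup P(Z < 2.00) = .9772 can be reproduced in software. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
Z = NormalDist(mu=0.0, sigma=1.0)

# Cumulative probability F(a) = P(Z < a) for a = 2.00
p = Z.cdf(2.00)

print(round(p, 4))
```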
38. Chap 4-38
Sampling Distributions
n A sampling distribution is a distribution of
all of the possible values of a statistic for
a given size sample selected from a
population
39. Chap 4-39
If the Population is Normal
n If a population is normal with mean μ and
standard deviation σ, the sampling distribution
of X̄ is also normally distributed with
$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
40. Chap 4-40
Z-value for Sampling Distribution
of the Mean
n Z-value for the sampling distribution of X̄:
$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
where: X̄ = sample mean
μ = population mean
σ = population standard deviation
n = sample size
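As an illustration of the Z-value for a sample mean, the sketch below standardizes a sample mean and looks up the corresponding probability. All numbers (μ = 8, σ = 3, n = 36, sample mean 7.5) are hypothetical, and the function name is an illustrative choice.

```python
import math
from statistics import NormalDist

def z_for_sample_mean(x_bar, mu, sigma, n):
    """Standardize a sample mean using the standard error sigma / sqrt(n)."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

# Hypothetical population and sample values
z = z_for_sample_mean(x_bar=7.5, mu=8.0, sigma=3.0, n=36)

# Probability that the sample mean falls below 7.5
prob = NormalDist().cdf(z)

print("Z =", z)
print("P(sample mean < 7.5) =", round(prob, 4))
```

Here the standard error is 3/6 = 0.5, so a sample mean of 7.5 lies one standard error below μ.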
41. Chap 4-41
Population Proportions, P
P = the proportion of the population having
some characteristic
n Sample proportion (P̂) provides an estimate
of P:
$$\hat{P} = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$$
n 0 ≤ P̂ ≤ 1
n P̂ has a binomial distribution, but can be approximated
by a normal distribution when nP(1 – P) > 9
42. Chap 4-42
Sampling Distribution of P̂
n Normal approximation:
Properties:
$$E(\hat{P}) = P \quad \text{and} \quad \sigma^2_{\hat{P}} = \mathrm{Var}\left(\frac{X}{n}\right) = \frac{P(1 - P)}{n}$$
(where P = population proportion)
(Figure: sampling distribution, plotting P(P̂) against P̂ from 0 to 1)
43. Chap 4-43
Z-Value for Proportions
Standardize P̂ to a Z value with the formula:
$$Z = \frac{\hat{P} - P}{\sigma_{\hat{P}}} = \frac{\hat{P} - P}{\sqrt{\frac{P(1 - P)}{n}}}$$
44. Chap 4-44
Sampling Distribution of
Sample Variances
n The sampling distribution of s² has mean σ²
$$E(s^2) = \sigma^2$$
n If the population distribution is normal, then
$$\mathrm{Var}(s^2) = \frac{2\sigma^4}{n - 1}$$
n If the population distribution is normal then
$$\frac{(n - 1)s^2}{\sigma^2}$$
has a χ² distribution with n – 1 degrees of freedom
45. Chap 4-45
The Chi-square Distribution
n The chi-square distribution is a family of distributions,
depending on degrees of freedom:
n d.f. = n – 1
n Text Table 7 contains chi-square probabilities
(Figure: chi-square densities for d.f. = 1, d.f. = 5, and d.f. = 15, each plotted against χ² from 0 to 28)
47. Chap 4-47
Confidence Interval for μ
(σ2 Known)
n Assumptions
n Population variance σ2 is known
n Population is normally distributed
n If population is not normal, use large sample
n Confidence interval estimate:
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
(where z_{α/2} is the normal distribution value for a probability of α/2 in
each tail)
48. Chap 4-48
Margin of Error
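n The confidence interval,
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
n Can also be written as
$$\bar{x} \pm ME$$
where ME is called the margin of error
$$ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
n The interval width, w, is equal to twice the margin of
error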
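The interval and its margin of error can be computed directly from the formulas above. This is a minimal sketch assuming σ is known; the function name `z_interval` and the sample values (x̄ = 50, σ = 10, n = 25) are hypothetical.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known.

    Returns (lower limit, upper limit, margin of error).
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z value with alpha/2 in the upper tail
    me = z * sigma / math.sqrt(n)            # margin of error
    return x_bar - me, x_bar + me, me

# Hypothetical sample summary
lower, upper, me = z_interval(x_bar=50.0, sigma=10.0, n=25)

print("margin of error:", round(me, 2))
print("interval:", round(lower, 2), "to", round(upper, 2))
print("width:", round(upper - lower, 2))
```

Calling the function with a larger n or a lower confidence level returns a smaller margin of error.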
49. Chap 4-49
Reducing the Margin of Error
The margin of error can be reduced if
n The population standard deviation can be reduced (σ↓)
n The sample size is increased (n↑)
n The confidence level is decreased, (1 – α) ↓
$$ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
50. Chap 4-50
Confidence Interval for μ
(σ Unknown)
n Assumptions
n Population standard deviation is unknown
n Population is normally distributed
n If population is not normal, use large sample
n Use Student’s t Distribution
n Confidence Interval Estimate:
$$\bar{x} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$$
where t_{n-1,α/2} is the critical value of the t distribution with n-1 d.f.
and an area of α/2 in each tail:
$$P(t_{n-1} > t_{n-1,\alpha/2}) = \alpha/2$$
51. Chap 4-51
Student’s t Distribution
n The t is a family of distributions
n The t value depends on degrees of
freedom (d.f.)
n Number of observations that are free to vary after
sample mean has been calculated
d.f. = n - 1
52. Chap 4-52
Confidence Intervals for the
Population Proportion, p
n Recall that the distribution of the sample
proportion is approximately normal if the
sample size is large, with standard deviation
$$\sigma_{\hat{P}} = \sqrt{\frac{P(1 - P)}{n}}$$
n We will estimate this with sample data:
$$\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
53. Chap 4-53
Confidence Interval Endpoints
n Upper and lower confidence limits for the
population proportion are calculated with the
formula
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} < P < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
n where
n z_{α/2} is the standard normal value for the level of confidence desired
n p̂ is the sample proportion
n n is the sample size
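The confidence limits for a proportion can be sketched as follows. The function name `proportion_interval` and the sample (40 items with the characteristic out of n = 200) are hypothetical, used only to show the arithmetic.

```python
import math
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)     # z value with alpha/2 in the upper tail
    se = math.sqrt(p_hat * (1 - p_hat) / n)     # estimated standard deviation of p-hat
    return p_hat - z * se, p_hat + z * se

# Hypothetical sample: 40 of 200 items have the characteristic
p_hat = 40 / 200
lower, upper = proportion_interval(p_hat, n=200)

print("p-hat:", p_hat)
print("95% interval:", round(lower, 4), "to", round(upper, 4))
```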
54. Chap 4-54
What is a Hypothesis?
n A hypothesis is a claim
(assumption) about a
population parameter:
n population mean
n population proportion
Example: The mean monthly cell phone bill
of this city is μ = $42
Example: The proportion of adults in this
city with cell phones is p = .68
55. Chap 4-55
The Null Hypothesis, H0
n Begin with the assumption that the null
hypothesis is true
n Similar to the notion of innocent until
proven guilty
n Refers to the status quo
n Always contains “=” , “≤” or “≥” sign
n May or may not be rejected
56. Chap 4-56
The Alternative Hypothesis, H1
n Is the opposite of the null hypothesis
n e.g., The average number of TV sets in U.S.
homes is not equal to 3 ( H1: μ ≠ 3 )
n Challenges the status quo
n Never contains the “=” , “≤” or “≥” sign
n May or may not be supported
n Is generally the hypothesis that the
researcher is trying to support
57. Chap 4-57
Level of Significance, α
n Defines the unlikely values of the sample
statistic if the null hypothesis is true
n Defines rejection region of the sampling
distribution
n Is designated by α (level of significance)
n Typical values are .01, .05, or .10
n Is selected by the researcher at the beginning
n Provides the critical value(s) of the test
58. Chap 4-58
Level of Significance
and the Rejection Region
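Level of significance = α
n Lower-tail test: H0: μ ≥ 3, H1: μ < 3 (rejection region of area α in the lower tail)
n Upper-tail test: H0: μ ≤ 3, H1: μ > 3 (rejection region of area α in the upper tail)
n Two-tail test: H0: μ = 3, H1: μ ≠ 3 (rejection region of area α/2 in each tail)
(Figure: three sampling distributions centered at 0; the rejection region is shaded and its boundary represents the critical value)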
59. Chap 4-59
Errors in Making Decisions
n Type I Error
n Reject a true null hypothesis
n Considered a serious type of error
The probability of Type I Error is α
n Called level of significance of the test
n Set by researcher in advance
60. Chap 4-60
Errors in Making Decisions
n Type II Error
n Fail to reject a false null hypothesis
The probability of Type II Error is β
61. Chap 4-61
Test of Hypothesis
for the Mean (σ Known)
Hypothesis Tests for µ: σ Known | σ Unknown
n Convert sample result (x̄) to a z value
Consider the test
$$H_0: \mu = \mu_0$$
$$H_1: \mu > \mu_0$$
(Assume the population is normal)
The decision rule is:
$$\text{Reject } H_0 \text{ if } z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} > z_{\alpha}$$
62. Chap 4-62
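Decision Rule
H0: μ = μ0
H1: μ > μ0
$$\text{Reject } H_0 \text{ if } z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} > z_{\alpha}$$
Alternate rule:
$$\text{Reject } H_0 \text{ if } \bar{x} > \mu_0 + z_{\alpha}\frac{\sigma}{\sqrt{n}}$$
(Figure: sampling distribution with a rejection region of area α in the upper tail. Do not reject H0 below the critical value; reject H0 above it. The critical value is z_α on the Z axis, or μ0 + z_α σ/√n on the x̄ axis.)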
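Both forms of the upper-tail decision rule (compare z with z_α, or compare x̄ with μ0 + z_α σ/√n) lead to the same decision. The sketch below checks this on hypothetical numbers (x̄ = 53.1, μ0 = 52, σ = 6, n = 64, α = .05); the function name is an illustrative choice.

```python
import math
from statistics import NormalDist

def upper_tail_z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Upper-tail z test for the mean with known sigma.

    Returns the test statistic, the critical z value, the critical
    sample mean, and the decision (True means reject H0).
    """
    se = sigma / math.sqrt(n)
    z = (x_bar - mu0) / se
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical value with alpha in the upper tail
    x_crit = mu0 + z_alpha * se                # same cutoff on the sample-mean scale
    return z, z_alpha, x_crit, z > z_alpha

# Hypothetical sample summary
x_bar = 53.1
z, z_alpha, x_crit, reject = upper_tail_z_test(x_bar, mu0=52.0, sigma=6.0, n=64)

print("z =", round(z, 3), " critical z =", round(z_alpha, 3))
print("critical sample mean =", round(x_crit, 3))
print("reject H0:", reject)
```

In this example z is below the critical value, so H0 is not rejected; equivalently, x̄ is below the critical sample mean.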
63. Chap 4-63
p-Value Approach to Testing
n p-value: Probability of obtaining a test
statistic more extreme (≤ or ≥) than the
observed sample value given H0 is true
n Also called observed level of significance
n Smallest value of α for which H0 can be
rejected
64. Chap 4-64
p-Value Approach to Testing
n Convert sample result (e.g., x̄) to test statistic (e.g., z
statistic)
n Obtain the p-value
n For an upper
tail test:
$$p\text{-value} = P\left(Z > \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}},\ \text{given that } H_0 \text{ is true}\right) = P\left(Z > \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \,\middle|\, \mu = \mu_0\right)$$
n Decision rule: compare the p-value to α
n If p-value < α, reject H0
n If p-value ≥ α, do not reject H0
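The upper-tail p-value can be computed from the standard normal distribution. This is a minimal sketch on hypothetical numbers (x̄ = 53.1, μ0 = 52, σ = 6, n = 64); the function name `upper_tail_p_value` is an illustrative choice.

```python
import math
from statistics import NormalDist

def upper_tail_p_value(x_bar, mu0, sigma, n):
    """p-value for H0: mu = mu0 against H1: mu > mu0, with sigma known."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    return 1 - NormalDist().cdf(z)  # area to the right of the observed z

# Hypothetical sample summary
alpha = 0.05
p_value = upper_tail_p_value(x_bar=53.1, mu0=52.0, sigma=6.0, n=64)

print("p-value =", round(p_value, 4))
print("reject H0" if p_value < alpha else "do not reject H0")
```

Because the p-value exceeds α = .05 here, H0 is not rejected at that level.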
65. Chap 4-65
t Test of Hypothesis for the Mean
(σ Unknown)
Hypothesis Tests for µ: σ Known | σ Unknown
n Convert sample result (x̄) to a t test statistic
Consider the test
$$H_0: \mu = \mu_0$$
$$H_1: \mu > \mu_0$$
(Assume the population is normal)
The decision rule is:
$$\text{Reject } H_0 \text{ if } t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} > t_{n-1,\alpha}$$
66. Chap 4-66
t Test of Hypothesis for the Mean
(σ Unknown)
n For a two-tailed test:
Consider the test
$$H_0: \mu = \mu_0$$
$$H_1: \mu \neq \mu_0$$
(Assume the population is normal,
and the population variance is
unknown)
The decision rule is:
$$\text{Reject } H_0 \text{ if } t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} < -t_{n-1,\alpha/2} \ \text{ or if } \ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} > t_{n-1,\alpha/2}$$
67. Chap 4-67
Hypothesis Tests for Proportions
Hypothesis Tests for P: nP(1 – P) > 9 | nP(1 – P) < 9 (Not discussed in this chapter)
n The sampling distribution of p̂ is approximately normal, so the test
statistic is a z value:
$$z = \frac{\hat{p} - P_0}{\sqrt{\frac{P_0(1 - P_0)}{n}}}$$
68. Chap 4-68
Example: Z Test for Proportion
A marketing company
claims that it receives
8% responses from its
mailing. To test this
claim, a random sample
of 500 were surveyed
with 25 responses. Test
at the α = .05
significance level.
Check:
Our approximation for P is
p̂ = 25/500 = .05
nP(1 - P) = (500)(.05)(.95)
= 23.75 > 9 ✓
69. Chap 4-69
Z Test for Proportion: Solution
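H0: P = .08
H1: P ≠ .08
α = .05
n = 500, p̂ = .05
Critical Values: ± 1.96
Test Statistic:
$$z = \frac{\hat{p} - P_0}{\sqrt{\frac{P_0(1 - P_0)}{n}}} = \frac{.05 - .08}{\sqrt{\frac{.08(1 - .08)}{500}}} = -2.47$$
Decision:
Reject H0 at α = .05
Conclusion:
There is sufficient
evidence to reject the
company’s claim of 8%
response rate.
(Figure: standard normal curve with rejection regions of area .025 in each tail beyond ±1.96; the test statistic z = -2.47 falls in the lower rejection region)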
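The test statistic in this example can be reproduced numerically. The sketch below uses the values from the slide (25 responses out of 500, P0 = .08, α = .05); the two-sided p-value is an addition for illustration and is not part of the original solution.

```python
import math
from statistics import NormalDist

# Values from the example
n = 500
p_hat = 25 / 500   # sample proportion
p0 = 0.08          # hypothesized proportion under H0
alpha = 0.05

# Test statistic: standardize p-hat using the standard deviation under H0
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Two-tail critical value and decision
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > z_crit

# Two-sided p-value (added for illustration)
p_value = 2 * NormalDist().cdf(-abs(z))

print("z =", round(z, 2))
print("critical values: +/-", round(z_crit, 2))
print("reject H0:", reject)
print("two-sided p-value:", round(p_value, 4))
```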