GIS 665 Geospatial Analysis
Inferential Statistics
Review of Statistics
1
Outline
 Probability distributions
 Confidence intervals
 Hypothesis testing
2
Discrete Probability Distributions
3
Section I
Outline
 Concept of a random variable
 Binomial distribution
 Poisson distribution
4
5
Suppose we consider any measurable
characteristic of a population, such as the
household size of all houses in a city. Because this
characteristic can take different values, we refer
to it as a variable.
If we were to select one household at random from
this population, the value of this variable is
determined. Because the value is determined
through random sampling, we call it a random
variable.
The “random” in a
random variable comes
from the random
sampling process
Discrete vs. Continuous RV
 A discrete random variable is a random variable
that can take on only a finite or at most a
countable infinite number of values.
 The total number of heads that turn up when flipping a coin
three times: [0, 1, 2, 3]
 The number of accidents occurring in Redlands per day
 A continuous random variable is the random
variable that can take on a continuum of values
 Commute distance / annual rainfall or temperature
6
For example, GPA.
A course GPA is a discrete RV: the random variable can
only take on a finite set of values.
But the average GPA? It is continuous.
Probability Function
 A table, graph, or mathematical function that describes the
potential values of a random variable X and their
corresponding probabilities is a probability function.
 It describes the frequency distribution of the variable.
7
Probability Mass Function
 The probability distribution of a discrete
random variable is specified by a probability
mass function or the frequency function.
 We use uppercase letter X to denote random
variable and lowercase x to denote a specific value.
8
P(X = x_i) = P(x_i), and Σ_{i=1}^{k} P(x_i) = 1
The sum of P(x_i) over all x_i should always be 1.
Probability Mass Function
0 1 2 3
9
 Example
 Flipping a coin three times. Define X to be the total
number of heads that turns up
P(X=0) = 1/8 P(X=1) = 3/8
P(X=2) = 3/8 P(X=3) = 1/8
Flip the coin three times. Each flip is either head OR tail,
so the number of possible outcomes is 2^3 = 8:
{HHH, HHT, HTH, THH, HTT, TTH, THT, TTT}
AND is multiplication; OR is addition.
Head AND Head AND Head:
∴ 1/2 × 1/2 × 1/2 = 1/2^3 = 1/8
Probability on an Interval
 What is the probability that the total number of heads turns up is less
than 2 when flipping a coin three times?
P(X<2) = ? x=0 OR x=1; P = 0.5
 P(1<X<3) = ? x=2; P=0.375
 P(1≤X<3) = ? x=1 OR x=2 P=0.75
 P(1≤X≤3) = ? x=1 OR x=2 OR x=3 P=0.875
10
[1,3] = P(1≤X≤3)
(1,3) = P(1<X<3)
Expected Value of a Discrete RV
 Expected value of a discrete RV is the average value it takes.
11
E(X) = Σ_i x_i P(x_i)
xi P(xi) xi*P(xi)
0 0.125 0
1 0.375 0.375
2 0.375 0.75
3 0.125 0.375
E(X) = 1.5
This is the probability of each
number of heads turning up.
Remember IDW?
The sum of weight is
also 1.
Flipping a coin three
times. Define X to be the
total number of heads
that turns up
This is the Expected value of the total number of heads
that turn up when the coin is tossed three times,
which is the sum of these 4 items.
This is the number of heads that turn up when the coin
is tossed 3 times: either 0 or 1 or 2 or 3.
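The expected-value table above can be checked with a short Python sketch (the probabilities are taken from the slide's table):

```python
# Expected value of a discrete random variable: E(X) = sum of x_i * P(x_i).
# X = number of heads in three fair-coin flips.
pmf = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

# The probabilities must sum to 1, just like the IDW weights the slide mentions.
assert abs(sum(pmf.values()) - 1.0) < 1e-12

expected = sum(x * p for x, p in pmf.items())
print(expected)  # 1.5
```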
Binomial Distribution
 Bernoulli Trial
 Each trial results in one of two possible outcomes
(“success”/“failure” or “head”/”tail”)
 The probability of success is constant and equal to p
on each trial (the probability of failure is 1-p)
 Binomial Distribution
 The process of interest consists of n independent
Bernoulli trials with the probability of success in each
trial as p
 The total number of successes, X, is a binomial
random variable with parameters n and p.
12
Suppose that n independent experiments
are performed, where n is fixed number,
and each experiment results in a success
with probability p and a failure with probability 1-p. The
total number of successes, X, is a binomial
random variable with parameters n and p.
What is the probability that one head would turn up when a fair
coin is tossed three times?
1st 2nd 3rd
H T T 0.5×0.5×0.5 = (0.5^1 × 0.5^2) = 0.125
T H T 0.5×0.5×0.5 = (0.5^1 × 0.5^2) = 0.125
T T H 0.5×0.5×0.5 = (0.5^1 × 0.5^2) = 0.125
P(X=1) = 3 × 0.5^1 × 0.5^2 = 0.375
13
C(n, r) = (n choose r) = n! / (r!(n − r)!); here C(3, 1) = 3
AND is multiplication;
OR is addition.
HTT,
OR THT,
OR TTH;
∴0.125 +0.125 +0.125 =0.375
What is the probability that two “2” would turn up if a dice is rolled four
times?
A B C D
1 1 0 0 1/6 × 1/6 × 5/6 × 5/6 = (1/6)^2 × (5/6)^2
1 0 1 0 1/6 × 5/6 × 1/6 × 5/6 = (1/6)^2 × (5/6)^2
1 0 0 1 1/6 × 5/6 × 5/6 × 1/6 = (1/6)^2 × (5/6)^2
0 1 1 0 5/6 × 1/6 × 1/6 × 5/6 = (1/6)^2 × (5/6)^2
0 1 0 1 5/6 × 1/6 × 5/6 × 1/6 = (1/6)^2 × (5/6)^2
0 0 1 1 5/6 × 5/6 × 1/6 × 1/6 = (1/6)^2 × (5/6)^2
14
P(X=2) = 6*(1/6)^2*(5/6)^2 = 0.116
Whatever other than “2”
The four trial/roll
“2”(Success): 1
Not “2”(Failure): 0
C(n, r) = (n choose r) = n! / (r!(n − r)!); here C(4, 2) = 6
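Both the coin and dice examples follow the same pattern; a minimal sketch using Python's `math.comb` for the combination C(n, x) from the slides:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Exactly one head in three tosses of a fair coin: 3 * 0.5^1 * 0.5^2 = 0.375
print(binomial_pmf(1, 3, 0.5))  # 0.375

# Exactly two "2"s in four rolls of a fair die: 6 * (1/6)^2 * (5/6)^2 ≈ 0.116
print(round(binomial_pmf(2, 4, 1/6), 3))  # 0.116
```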
Frequency Function
 Part I: any particular sequence of x
successes occurs with probability px (1-p)n-x
(multiplication law)
 Part II: there are C(n, x) ways to assign x
successes to n trials
P(X = x) = C(n, x) p^x (1 − p)^(n−x)
15
AND is multiplication;
OR is addition.
Among the n trials, the number of ways for the situation to happen: C(n, x)
AND the probability of something successfully happening x times: p^x
AND the probability of something NOT successfully happening in the REST of the trials: (1 − p)^(n−x)
…which is the Probability Function, see p.7
Probability Function “describes the frequency distribution of the variable”
C(n, x) = (n choose x) = n! / (x!(n − x)!)
Binomial -> Poisson
16
 Consider this situation…
Suppose you are a transportation planner, and
you are concerned about the safety of a particular
intersection. During the last 60 days, there were 3
accidents each occurring on separate days. You
are asked to estimate the probability that 2 will
occur during the next 30 days.
Among the n trials, the number of ways for the situation to happen: C(30, 2)
AND the probability of success x times: p^x = (3/60)^2 = 0.05^2
AND the probability of failure in the REST of the trials: (1 − p)^(n−x) = (57/60)^(30−2) = 0.95^28
Binomial -> Poisson (cont.)
17
 Solution – the binomial distribution
If we define observing the traffic accident per day
as a Bernoulli trial, the number of days in which
an accident occurs is a binomial random variable
However, it is possible to have more than one
accident per day. So we can take the half day as the
analysis unit
Among the n trials, the number of ways
for the situation to happen: C(30, 2)
AND the probability of success x times:
p^x = (3/60)^2 = 0.05^2
AND the probability of failure in the
REST of the trials:
(1 − p)^(n−x) = (57/60)^(30−2) = 0.95^28
n = 30, p = 3/60 = 0.05, 1 − p = 0.95
P(X = 2) = C(30, 2) × 0.05^2 × 0.95^28 = 0.2586
Binomial -> Poisson
18
 Solution – the binomial distribution
X is defined as the number of half days in which
one accident occurs
Again, the choice of time unit is artificial. We
can continue to divide the day into smaller time
periods.
n = 60, p = 3/120 = 0.025, 1 − p = 0.975
P(X = 2) = C(60, 2) × 0.025^2 × 0.975^58 = 0.2548
n → ∞, p → 0
Among the n trials, the number of ways
for the situation to happen: C(60, 2)
AND the probability of success x times:
p^x = (3/120)^2 = 0.025^2
AND the probability of failure in the
REST of the trials:
(1 − p)^(n−x) = (117/120)^(60−2) = 0.975^58
Binomial -> Poisson
 The Poisson distribution can be defined as the
limiting case of the binomial distribution:
 Poisson distribution can be used as the
approximation of Binomial distribution for large n
and small p
n → ∞, p → 0, with np = λ held constant
19
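The limiting behavior can be illustrated numerically: holding np = λ = 1.5 fixed (the accident example) while n grows, the binomial probability of 2 successes approaches the Poisson value. A sketch using only Python's standard library:

```python
from math import comb, exp, factorial

lam, x = 1.5, 2  # np -> lambda; P(X = 2) with 1.5 expected accidents

# Poisson limit: P(X = x) = lambda^x * e^(-lambda) / x!
poisson = lam**x * exp(-lam) / factorial(x)

# Binomial with np = lambda, for increasingly fine time units
for n in (30, 60, 120, 1000):
    p = lam / n
    binom = comb(n, x) * p**x * (1 - p)**(n - x)
    print(n, round(binom, 4))   # 0.2586, 0.2548, ... converging downward
print("poisson", round(poisson, 4))  # 0.251
```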
Poisson Distribution
 The process of interest consists of events that occur
repeatedly and randomly within certain time period or
space
 Traffic accidents in Redlands / Tornados in Columbus (Ohio)
 Events are independent of past or future occurrences
 The occurrence of an event has a constant mean rate
or density (underlying process governing the
phenomenon must be invariant)
 The random variable of interest, X, is the number of
events occurring within a given unit of time, area,
volume, etc.
20
The Poisson distribution is sometimes known as the
Law of Small Numbers, because it describes the
behavior of events that are rare
The probability that an event will occur within a
given unit must be the same for all units (i.e. the
underlying process governing the phenomenon must
be invariant)
3. The number of events occurring per unit must be
independent of the number of events occurring in
other units (no interactions)
4. The mean or expected number of events per unit
(λ) is found by past experience (observations)
The “counts” of events
Poisson Distribution (cont.)
 Frequency function
 The mean or expected number of events (λ) is found
by past experience (observations)
where e = 2.71828 (base of the natural logarithm)
λ = the mean or expected value
(for THE given time, expected value, not actually happening)
(The mean or expected number of events per unit)
x = 0, 1, 2, …  # of occurrences
P(X = x) = λ^x e^(−λ) / x!
21
Number of trials: n
Probability of success: p
𝑛𝑝 = 𝜆
Poisson Distribution (cont.)
λ
22
λ affects the skew: the larger the λ, the
more symmetrical the distribution becomes.
Notice that the Poisson distribution is for
relatively rare incidents, such as
accidents and cancer. If the frequency is
relatively high, we should use the normal
distribution.
Example 1
Three(3) accidents were observed in last 60 days. Find
the probability of observing x accidents in the next 30
days
Solution:
1. Random variable X: the # of accidents occurred during
the 30-day period
2. The mean number of accidents during the 30-day
period is constant and equal to λ = 3/2 = 1.5
3. Find the probability observing x accidents during the
30-day period. That is, find the value of P(X = x)
23
P(X = x) = λ^x e^(−λ) / x!
(3/60) × 30 = 3/2 = 1.5
[3 accidents per 60 days]
When the time period is 30
days, the mean number of
accidents would be 1.5
(accidents).
x P(X = x)
0 e^(−1.5) 1.5^0 / 0! = 0.2231
1 e^(−1.5) 1.5^1 / 1! = 0.3347
2 e^(−1.5) 1.5^2 / 2! = 0.2510
3 e^(−1.5) 1.5^3 / 3! = 0.1255
Example 1
24
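A quick check of the table with Python's standard library (note that the x = 1 entry works out to e^(−1.5) × 1.5 ≈ 0.3347):

```python
from math import exp, factorial

lam = 1.5  # expected accidents in 30 days (3 accidents per 60 days)

# Poisson pmf: P(X = x) = lambda^x * e^(-lambda) / x!
probs = [lam**x * exp(-lam) / factorial(x) for x in range(4)]
for x, p in enumerate(probs):
    print(x, f"{p:.4f}")  # 0.2231, 0.3347, 0.2510, 0.1255
```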
Example 2
A disease occurs randomly in space, with one(1)
incident every 16 square kilometers. What is the
probability of finding four(4) incidents in a 30
square kilometer area?
25
Example 2
Solution:
1. Random variable X: the # of incidents in a 30 square
kilometer area
2. The mean number of incidents in a 30 square
kilometer area is equal to λ = 30/16 = 1.875
3.
26
P(X = 4) = 1.875^4 e^(−1.875) / 4! = 0.079
One(1) every 16, so λ, the mean number of incidents in a 30 km2
area, is 30/16, which is 1.875 (incidents)
Binomial vs. Poisson
 If a mean or average probability of an event happening per unit
time/per page/per mile cycled etc., is given, and you are asked to
calculate a probability of n events happening in a given
time/number of pages/number of miles cycled, then the Poisson
Distribution is used. You do not know the number of trials.
 If, on the other hand, an exact probability of an event happening is
given, or implied, in the question, and you are asked to calculate
the probability of this event happening k times out of n, then the
Binomial Distribution must be used. You know the number of
trials.
27
http://personal.maths.surrey.ac.uk/st/J.Deane/Teac
h/se202/poiss_bin.html
Expected value = 𝜆
Variance = 𝜆
Expected value = 𝑛𝑝
Variance = 𝑛𝑝(1 − 𝑝)
Binomial vs. Poisson
The Binomial and Poisson distributions are similar, but they are different. Also, the fact that they are both
discrete does not mean that they are the same. The Geometric distribution and one form of the Uniform
distribution are also discrete, but they are very different from both the Binomial and Poisson distributions.
The difference between the two is that while both measure the number of certain random events (or
"successes") within a certain frame, the Binomial is based on discrete events, while the Poisson is based
on continuous events. That is, with a Binomial distribution you have a certain number, n, of "attempts,"
each of which has probability of success p. With a Poisson distribution, you essentially have infinite
attempts, with infinitesimal chance of success. That is, given a Binomial distribution with some n,p, if
you let n→∞ and p→0 in such a way that np→λ, then that distribution approaches a Poisson distribution
with parameter λ.
Because of this limiting effect, Poisson distributions are used to model occurrences of events that could
happen a very large number of times but happen rarely. That is, they are used in situations that would
be more properly represented by a Binomial distribution with a very large n and small p, especially when the
exact values of n and p are unknown. (Historically, the number of wrongful criminal convictions in a country)
28
Exercise
 A typist makes on average 2 mistakes per page. What is the probability of a
particular page having no errors on it? P
 A computer crashes once every 2 days on average. What is the probability
of there being 2 crashes in one week? P
 Components are packed in boxes of 20. The probability of a component
being defective is 0.1. What is the probability of a box containing 2 defective
components? B
 ICs are packaged in boxes of 10. The probability of an IC being faulty is 2%.
What is the probability of a box containing 2 faulty ICs? B
 The mean number of faults in a new house is 8. What is the probability of
buying a new house with exactly 1 fault? P
 A box contains a large number of washers; there are twice as many steel
washers as brass ones. Four washers are selected at random from the box.
What is the probability that 3 are brass? B
29
http://personal.maths.surrey.ac.uk/st/J.Deane/Teac
h/se202/poiss_bin.html
n=20, p=0.1, x=2
n=10, p=2%, x=2
n=4, p=0.33, x=3
P(X = x) = C(n, x) p^x (1 − p)^(n−x)
 Suppose 30 events are randomly distributed among 35
equally sized grid cells, how many of the grid cells are
expected to have one event?
30
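For the grid-cell question, a hedged sketch: if the 30 events fall independently and uniformly over the 35 cells, the count per cell is approximately Poisson with λ = 30/35, so the expected number of cells with exactly one event is 35 · λe^(−λ). This modeling assumption (Poisson per cell) is one reasonable reading of the exercise, not stated in the slides:

```python
from math import exp

events, cells = 30, 35
lam = events / cells           # mean events per cell
p_one = lam * exp(-lam)        # Poisson P(X = 1) = lambda * e^(-lambda)
print(round(cells * p_one, 1)) # expected cells with exactly one event, ≈ 12.7
```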
Normal Distribution
31
Section II
Outline
 Probability density function
 Uniform probability distribution
 Normal distribution
32
Continuous Random Variable
 A continuous random variable is the
random variable that can take on a
continuum of values
 Travel distance / the magnitude of a flood
 For a continuous random variable, the role
of frequency function is taken by a density
function, f(x)
33
Probability Density Function
 Probability distribution of a continuous random variable
is expressed by its probability density function (PDF), f (x),
which has the following properties
1) f(x) > 0
2) f is piecewise continuous
3) ∫_{−∞}^{∞} f(x) dx = 1
 f(x) is often represented by a graph or an equation
34
“The total area under
the curve”
= “The Sum of all
possibility”
= 1
∫ from the lower limit to the upper limit of (the function) dx
∫_{−∞}^{∞} f(x) dx
P(X = x) = 0
The probability of one particular value alone is zero,
because what we look at is the area: the probability is tied
to the area under the curve, and a single value has zero
width, hence zero area and zero probability.
Therefore P(X = x) = 0
Probability on an Interval
 If X is a random variable with density function f, then
for any a < b, the probability that X falls in the interval
(a, b) is the area under the density function between
a and b:
P(a ≤ X ≤ b) = ∫_a^b f(x) dx
35
a b
∫ from the lower limit (a) to the upper limit (b) of the function, dx
Uniform Distribution
 Uniform distribution
 The process of interest consists of equally likely
outcomes
 Probability density function
f(x) = 1/(b − a), for a ≤ x ≤ b; 0 otherwise
36
Uniform Distribution (cont.)
 Probability on an interval
P(c ≤ X ≤ d) = F(d) − F(c)
= ∫_a^d 1/(b − a) dx − ∫_a^c 1/(b − a) dx
= (d − a)/(b − a) − (c − a)/(b − a)
= (d − c)/(b − a)
37
Uniform Distribution - Example
 The annual mean temperature is uniformly distributed
between 10ºC and 18ºC. Find the probability that the
annual mean temperature falls in between 12ºC and
15ºC.
a = 10, b = 18, c = 12, d = 15
 What is the probability that annual mean temperature is
greater than 15 ºC? What is the probability that annual
temperature is less than 13 ºC?
X ~ U(10, 18)
38
P(12 ≤ X ≤ 15) = (d − c)/(b − a) = (15 − 12)/(18 − 10) = 3/8
The probability is only related
to the length of the interval,
not to the location of the
interval, given that the
interval is defined
between a and b.
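The interval probabilities above reduce to simple length ratios; a minimal sketch (the function name is illustrative, not from the slides):

```python
def uniform_interval_prob(a, b, c, d):
    """P(c <= X <= d) for X ~ U(a, b), assuming a <= c <= d <= b."""
    return (d - c) / (b - a)

print(uniform_interval_prob(10, 18, 12, 15))  # P(12 <= X <= 15) = 0.375
print(uniform_interval_prob(10, 18, 15, 18))  # P(X > 15)        = 0.375
print(uniform_interval_prob(10, 18, 10, 13))  # P(X < 13)        = 0.375
```

All three answers happen to equal 3/8 here because each interval is 3 units long: only length matters, not location.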
Normal Distribution
 It was proposed by Karl Friedrich Gauss as a
model for measurement errors.
 It is also called Gaussian distribution.
 The most common and important probability
distribution
 Most naturally occurring variables are distributed
normally (e.g. heights, weights, annual temperature
variations, test scores, IQ scores, etc.)
 Foundation in probability and statistics
39
A normal distribution can also
be produced by tracking the
errors made in repeated
measurements of the same
thing; Karl Friedrich Gauss
was a 19th century astronomer
who found that the distribution
of the repeated errors of
determining the position of the
same star formed a normal (or
Gaussian) distribution
Normal Distribution (cont.)
 Probability density function
 The normal distribution is a continuous distribution
that is symmetric and bell-shaped.
X ~ N(μ, σ^2)
f(x) = 1/√(2πσ^2) · e^(−(x−μ)^2 / (2σ^2))
μ – population mean
σ^2 – population variance
σ – population standard deviation
40
Normal distributions with various parameters
41
Properties of Normal Distribution
 Symmetry: values below μ are just as likely as values above
μ.
 Center: f(x) has maximum value for x = μ, so values close to μ
are the most likely to occur.
 Dispersion: the density is “wider” for large σ compared to
small values of σ (for fixed μ), so the larger σ the more likely
are observations far from μ.
42
Normal Distribution (cont.)
 Probability on an interval
 The areas under normal curves can be obtained from standard
normal tables. Therefore, it is necessary to standardize normal
distributions.
P(a ≤ X ≤ b) = ∫_a^b 1/√(2πσ^2) · e^(−(x−μ)^2 / (2σ^2)) dx
43
Standard Normal Distribution
 Standard normal distribution
 The special case of normal distribution which has
and
μ = 0 and σ^2 = 1
Z ~ N(0, 1)
f(z) = 1/√(2π) · e^(−z^2 / 2)
44
Central Part of z Distribution
45
46
We would expect the interval as
±1.96 to contain approximately
95% of the observations. This
corresponds to a commonly used
rule-of-thumb that roughly 95% of
the observations are within[-2, 2].
Similar computations are made
for other percentages.
 The standardization is achieved by converting the
data into z-scores
 Example 1
 population mean and variance are known
The annual precipitation X ~ N(80, 40^2); what is the z-score of x = 150 mm?
Standardization of Normal Distributions
z = (x_i − μ)/σ = (150 − 80)/40 = 1.75
47
“z-score”, in other
words, is “how many
S.D.s x deviates
from the mean.”
By referring to the
Standard Normal
Table, we can
calculate the
probability of x
lying above, below,
or between certain
values.
Z ~ N(0, 1^2)
X ~ N(μ, σ^2)
z = (x_i − μ)/σ
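The standardization step is one line of code; a sketch reproducing Example 1:

```python
def z_score(x, mu, sigma):
    """Standardize x given the population mean and standard deviation."""
    return (x - mu) / sigma

# Annual precipitation X ~ N(80, 40^2); z-score of x = 150 mm
print(z_score(150, 80, 40))  # 1.75
```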
Standardization of normal distributions
 Example 2
 Sample mean and variance
are known
 Step I:
 Sample mean
𝑥 = 59.7
 Sample standard deviation
𝑠 = 12.97
 Step II:
Month T (°F) Z-score
J 39.53 -1.56
F 46.36 -1.03
M 46.42 -1.02
A 60.32 0.05
M 66.34 0.51
J 75.49 1.22
J 75.39 1.21
A 77.29 1.36
S 68.64 0.69
O 57.57 -0.16
N 54.88 -0.37
D 48.2 -0.89
z = (x_i − x̄)/s
48
x̄ = Sample mean; s = Sample Standard Deviation;
μ = Population mean; σ = Population Standard Deviation
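Step I and Step II can be reproduced directly; a sketch using only the standard library, with the temperatures taken from the slide's table:

```python
from math import sqrt

temps = [39.53, 46.36, 46.42, 60.32, 66.34, 75.49,
         75.39, 77.29, 68.64, 57.57, 54.88, 48.2]

n = len(temps)
mean = sum(temps) / n                                   # Step I: sample mean
s = sqrt(sum((t - mean)**2 for t in temps) / (n - 1))   # sample std deviation

print(round(mean, 1), round(s, 2))       # 59.7 12.97
for t in temps[:3]:
    print(round((t - mean) / s, 2))      # Step II: -1.56, -1.03, -1.02
```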
Calculating Probabilities from a Normal Distribution
 Consider this problem
Suppose that the annual precipitation was normally
distributed with mean 80 mm per year and standard
deviation 40 mm. What is the probability that the
annual precipitation is greater than 150 mm?
Solution:
1. calculate the z score(s)
49
z = (x − μ)/σ = (150 − 80)/40 = 1.75
Standard Normal Table
1.75
P(Z>=1.75)
= 0.0401
z
50
2. Look up the
standard
normal table
Standard Normal Table, Head-End (z=0.0~0.99)
Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
.3 .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517
.4 .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879
.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
51
Standard Normal Table, Head-End (z=1.0~1.99)
Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
1.0 .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621
1.1 .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830
1.2 .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015
1.3 .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177
1.4 .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319
1.5 .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
1.6 .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
1.7 .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
1.8 .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
1.9 .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767
52
Standard Normal Table, Head-End (z=2.0~2.99)
Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
2.0 .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817
2.1 .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857
2.2 .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890
2.3 .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916
2.4 .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936
2.5 .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
2.9 .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986
53
Standard Normal Table, Head-End (z=3.0~3.49)
Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
3.0 .9987 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990
3.1 .9990 .9991 .9991 .9991 .9992 .9992 .9992 .9992 .9993 .9993
3.2 .9993 .9993 .9994 .9994 .9994 .9994 .9994 .9995 .9995 .9995
3.3 .9995 .9995 .9995 .9996 .9996 .9996 .9996 .9996 .9996 .9997
3.4 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9998
54
Calculating Probabilities (cont.)
μ = 0
f(x)
+1.75
.0401 P(Z > 1.75) = 0.0401
P(Z <= 1.75) = 0.9599
μ = 80
f(x)
.0401
+150
P(X > 150) = 0.0401
P(X <= 150) = 0.9599
55
3. Calculate the
areas of
interest
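Instead of a table lookup, the standard normal CDF can be computed with the error function, Φ(z) = ½(1 + erf(z/√2)); a sketch reproducing the precipitation example:

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

z = (150 - 80) / 40                 # 1.75
print(round(normal_cdf(z), 4))      # P(X <= 150) ≈ 0.9599
print(round(1 - normal_cdf(z), 4))  # P(X > 150)  ≈ 0.0401
```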
 Random variable X follows a normal
distribution with mean μ and variance σ^2.
Find the interval that contains the middle
95% of the data.
56
Central Part of a Normal Distribution
57
Are data normally distributed?
58
 Compare the observed histogram to a normal curve
that has the sample mean and sample standard deviation.
Normal Q-Q plot
59
 A normal quantile-
quantile plot compares
the sample quantiles
to those of the normal
distribution. If data are
N(μ,σ2) distributed, the
points in the QQ-plot
should be scattered
around the straight
line.
Straight line in Q-Q plot = normal distribution
Confidence Interval
Central limit theorem, interval estimation
60
Section III
Outline
 Point estimation
 Central Limit Theorem
 Confidence interval
61
Estimation
 Estimate population parameters based on
sample data
 Two types of estimate
 Point estimate
 Interval estimate
62
 Mean
 Population parameter: 
 Point estimate: 𝑥
 Standard Deviation
 Population parameter: 
 Point estimate: s
Point Estimate
63
x̄ = (1/n) Σ_{i=1}^{n} x_i
s = √( Σ_{i=1}^{n} (x_i − x̄)^2 / (n − 1) )
The first type of estimation
“True mean” μ
“Sample mean” x̄
“Sample s.d.” s
“Population s.d.” σ
Sampling Error
 Sampling error is the difference between the value of a
population characteristic and the value of that
characteristic inferred from a sample.
 Example: consider the population characteristic of the
average selling price of homes in Redlands in 2009. If
every house is examined, the average selling price is
$200,000. If only 25 homes per month are sampled, the
average selling price of the 300 homes is $230,000.
The sampling error is $200,000-$230,000 = -$30,000
Sampling Error cannot
be removed.
Interval Estimate
 It is very unlikely that sample point estimates will
exactly equal the true population parameters due to
uncertainty in probability sampling.
 To determine how good our point estimates are, we
could extend a point estimate to an interval within
which the population parameter lies.
65
x
The second type of estimation
Confidence Level and Interval
 In probabilistic terms, we like to attach some measure
of certainty (confidence) to our interval estimates.
 What does the 90% confidence level mean?
 The chance that the interval estimate containing the true
mean is 90%.
 In other words, the probability that the true mean falls in the
interval is 90%.
 This interval is called 90% confidence interval.
66
So, let’s say we repeated
the sampling process
many times, and there
are many sets of samples.
The sample means from
the sample sets are
different, and hence they
form an interval.
So, does the true
population mean (𝝁)
fall within the interval?
Maybe.
The wider the interval, the
higher the chance that it
contains the true mean.
Technically, if the interval is
−∞ ≤ x ≤ ∞, from negative
infinity to infinity, the
chance is always 100%,
and the confidence level
would always be 100%...
But that interval would be
meaningless.
Therefore, statistically
we apply a tolerance
level: the probability is
less than 100%, but high
enough (90%, 95%,
99%, etc.) that the
interval is useful.
Then, here it is, the
concept of confidence
interval and the
confidence level.
How to obtain a confidence interval?
 If we know the relationship between the sample mean and the
true mean, we can link them together.
 The sampling distribution or probability distribution of the
sample mean reveals the relationship.
67
Sampling Distribution of Sample Mean
 The sampling distribution of the sample mean can be
developed by taking all possible or many samples of size n
from a population, calculating the value of the mean for each
sample, and drawing the distribution of these values.
68
Sampling Distribution of Sample Mean
 When the sampling process is repeated many times,
we could get many different samples, which give
different sample means.
 http://www.ltcconline.net/greenl/java/Statistics/clt/cltsimulation.html
69
The larger the sample size,
the closer the sample
mean is to the true mean.
Hence, the larger the
sample size, the smaller
the variance of the sample
means, which means this
frequency plot would be
narrower.
Central Limit Theorem (CLT)
70
 Let X1, X2, X3 … Xn be a random sample of size n drawn from
a population with mean μ and standard deviation σ.
 Then for a large n, the sampling distribution of X̄ is
approximately normally distributed with mean μ and
standard deviation σ/√n.
 In a special case where X is normal, the distribution of X̄ is
exactly normal regardless of sample size.
 The standard deviation of the sample mean, σ/√n, is also
called the standard error.
X̄: Sample mean
μ: Population Mean
σ/√n: Standard deviation of sample means (= standard error)
σ: Standard deviation of population
s: Standard deviation of a sample
The mean of the frequency distribution of sample
means is theoretically the same as the population
mean (𝝁). The standard deviation of the frequency
distribution of sample means is σ/√n.
X̄ ~ N(μ, (σ/√n)^2): the frequency distribution of the sample mean
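The CLT can be illustrated with a small simulation, in the spirit of the linked applet. A sketch assuming a U(0,1) population (μ = 0.5, σ = 1/√12 ≈ 0.2887) and n = 30; the seed and repetition count are arbitrary choices:

```python
import random
from statistics import mean, stdev

random.seed(1)
n, reps = 30, 5000

# Draw many samples of size n and record each sample mean
sample_means = [mean(random.random() for _ in range(n)) for _ in range(reps)]

print(round(mean(sample_means), 3))   # close to mu = 0.5
print(round(stdev(sample_means), 3))  # close to sigma/sqrt(n) ≈ 0.0527
```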
CLT (cont.)
 The central limit theorem only applies to sample
mean, not for other sample statistics.
 Generally, a sample size n ≥ 30 could be regarded
as sufficiently large so that the sampling distribution
of sample means is approximately normal.
 The sample size is inversely related to the standard
error.
71
Central limit
theorem is used to
estimate the
sample mean and
sample mean only.
The statement of
sample size n ≥ 30
is obtained by
comparing the t-table
and the normal
distribution, which
we will talk about in
later slides.
Notation for Confidence Level and Interval
 Confidence level
 Denoted by (1 – α) × 100%
 Usually α = 0.05, could also be 0.10 or 0.01
 Thus, the likelihood that we are wrong is α (also
called significance level)
 The interval in which the true mean lies within
(1 – α) × 100% confidence (lies within the Confidence level) is
called (1 – α) × 100% confidence interval.
72
95% Confidence
level
5% likelihood of
being wrong
90% Confidence level
10% likelihood of being
wrong
99% Confidence level
1% likelihood of being
wrong
“Significance level”
the likelihood that
we are wrong
Basic Steps
 Step 1: Standardize 𝑋
 Step 2: find the z score
 Step 3: calculate margin of error
 Step 4: obtain the final CI (Confident Interval)
 Interpretation
73
 Suppose the sample size n is sufficiently large (n ≥
30), according to the central limit theorem, the
frequency distribution of sample mean is normal
with mean μ and standard deviation σ/√n
 Step 1: Standardize 𝑋
Step 1: Standardize 𝑋
74
X̄ ~ N(μ, (σ/√n)^2)
Z = (X̄ − μ)/(σ/√n) ~ N(0, 1)
Step 2: Find The Z Score
75
The three z-scores 1.65, 1.96, 2.58 are associated with three
confidence levels 1 – α (=0.90, 0.95, 0.99),
where α is 0.10, 0.05, and 0.01 respectively.
z_{α/2} is a z score or z value that corresponds to a tail area of α/2.
76
[Figure: standard normal curve with the point z_{α/2} marking a tail area of α/2]
Step 3: Calculate Margin Of Error
 The range of values above and below the sample
statistic with a specified confidence.
 Put differently
P(X̄ − ME ≤ μ ≤ X̄ + ME) = 1 − α
77
ME = z_{α/2} · σ/√n
ME: “Margin of Error”
Step 4: Obtain The Final CI(confidence Interval)
 Add/subtract the margin of error from the
sample mean to get the CI.
78
[X̄ − z_{α/2}·σ/√n, X̄ + z_{α/2}·σ/√n]
If α = 0.05, then [X̄ − 1.96·σ/√n, X̄ + 1.96·σ/√n]
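Steps 1–4 can be collected into a small helper; a sketch (the function name is illustrative) applied to the light-bulb example that appears later in these slides:

```python
from math import sqrt

def z_confidence_interval(xbar, sigma, n, z=1.96):
    """CI for the mean with known sigma: xbar +/- z * sigma / sqrt(n)."""
    me = z * sigma / sqrt(n)   # margin of error
    return xbar - me, xbar + me

# Light bulbs: n = 100, sample mean 987 hours, sigma = 40 hours
lo, hi = z_confidence_interval(987, 40, 100)
print(round(lo, 2), round(hi, 2))  # 979.16 994.84
```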
Interpretation
 When α = 0.05, we can say that
 “I am 95% confident that the mean of the population is
somewhere between x̄ − 1.96·σ/√n and x̄ + 1.96·σ/√n.”
 The true population mean μ should, 95% of the time,
lie within ±1.96·σ/√n of the sample mean.
 95% of all confidence intervals that can be
constructed will contain the unknown true mean.
79
80
After repeated sampling of 100 times, how many of the confidence
intervals would you expect to contain the true mean?
What influence CI?
 Sample variance ↑, range of CI ↑
 larger sample variability , higher uncertainty
 wider CI
 Sample size ↑, range of CI↓
 Larger sample size n, more information
 narrower CI
 Confidence level ↑, range of CI ↑
 Higher confidence level,
higher uncertainty to be accounted
 wider CI
81
Some issues
 How about small sample size (n < 30) ?
 t-distribution unless the sample is drawn from a
normally distributed population
 How about the population standard deviation σ is
unknown?
 Use sample standard deviation s to approximate
population standard deviation σ
 t-distribution, providing the population is normal
82
t-Distribution
 When the sample size is not sufficiently large, the
frequency distribution of sample means has what is
known as the t distribution (or Student’s t
distribution)
 t-distribution also copes with the uncertainty resulting
from estimating the standard deviation from a sample;
if the population standard deviation were known, a
normal distribution would be used
 The overall shape of the probability density function of
the t-distribution resembles the bell shape of a
standard normal distribution, except that it is a bit
lower and wider.
 t-distribution depends on a new parameter – degree of
freedom (df =n -1)
83
Student's t-distribution
copes with the uncertainty
that results from estimating
the standard deviation from
a sample; if the population
standard deviation were
known, a normal distribution
would be used.
t-Distribution vs. Standard Normal
df = 1 df = 2 df = 3
df = 5 df = 10 df = 30
84
Using t distribution to construct CI
 Population standard deviation is unknown
 Population standard deviation is known but
sample size is small
85
σ unknown: [X̄ − t_{α/2, n−1} s/√n, X̄ + t_{α/2, n−1} s/√n]

σ known, small n: [X̄ − t_{α/2, n−1} σ/√n, X̄ + t_{α/2, n−1} σ/√n]
Example question
A local bank needs information concerning the savings account
balances of its customers. A random sample of 15 accounts
was checked. The mean balance was $686 with a standard
deviation of $256. Which of the following is the
95% confidence interval for the true mean?
The correct answer is:
[686 − 2.15 × 256/√15, 686 + 2.15 × 256/√15]
86
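The bank example can be evaluated numerically. The critical value t_{0.025, 14} ≈ 2.145 (the slide rounds it to 2.15) is hard-coded here from a t table rather than computed:

```python
import math

xbar, s, n = 686.0, 256.0, 15
t_crit = 2.145                      # t_{0.025, df=14} from a t table

me = t_crit * s / math.sqrt(n)      # margin of error
lower, upper = xbar - me, xbar + me
print(round(lower, 2), round(upper, 2))  # roughly (544.2, 827.8)
```

The interval is wide because n is small and the sample standard deviation is large, which is exactly the sample-size and variability effect discussed earlier.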
Hypothesis Testing
87
Section IV
Outline
 What is hypothesis testing?
 Errors in hypothesis testing
 One-sample z-test
 One-sample t-test
88
Consider this situation
A consumer advocacy group collects a random sample
of n = 100 light bulbs from a manufacturer and observes a
sample mean of 987 hours. Assume the standard deviation
of all light bulbs is 40 hours. Estimate, on average, how
many hours of light the light bulbs provide.
α = 0.05, z_{0.025} = 1.96
ME = z_{0.025} × σ/√n = 1.96 × 40/10 = 7.84
Confidence Interval: [979.16, 994.84]
89
We are 95% confident that
the mean lifetime of light
bulbs manufactured by the
manufacturer falls between
979.16 and 994.84 hours.
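The computation above, sketched in Python using the standard library's `NormalDist` for the z value:

```python
from statistics import NormalDist

xbar, sigma, n = 987.0, 40.0, 100
z = NormalDist().inv_cdf(0.975)    # z_{0.025} ~ 1.96

me = z * sigma / n ** 0.5          # margin of error ~ 7.84
ci = (xbar - me, xbar + me)
print(tuple(round(v, 2) for v in ci))  # (979.16, 994.84)
```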
Consider a related situation
A consumer advocacy group thinks a manufacturer of light bulbs is mistaken
in their claim that their bulbs on average provide 1000 hours of light. They
believe the light bulbs are defective. To test this, they collect a random
sample of n = 100 light bulbs and observe a sample mean of 987 hours.
Assume standard deviation of all light bulbs is 40 hours.
90
We are 95% confident that
the mean lifetime of light
bulbs manufactured by the
manufacturer falls between
979.16 and 994.84 hours.
We cannot say that, because the sample mean of
one sample is 987 hours, the claim of the
manufacturer is false.
But with statistics and probability, we can
argue that the manufacturer's claim is
false: we are 95% confident that the
mean lifetime of light bulbs made by the
manufacturer falls between 979.16
and 994.84 hours.
1000 hours falls outside this 95%
range. In other words, the manufacturer's
claim is, at the 95% confidence level,
contradicted by the data.
 If we assume the manufacturer's claim is true, we
would expect the average lifetime of the samples
to be close to 1000 hours.
 But how close is close? Is the sample mean of
987 close enough to the presumed value of 1000?
 We need to quantify the closeness or difference
between the sample mean and the presumed
mean.
 To do so, we may compare 987 to a threshold that is
deemed as “close enough”
91
Common sense is important
 95% of the time the sample mean will range between
992 and 1008. So we can take 992 and 1008 as the
thresholds for "close enough".
 However, there is a small chance (<5%) that the sample
mean falls outside the range of 992 to 1008. So we
could be wrong if we conclude the true mean is not
1000. This 5% is called the significance level.
92
(Figure: sampling distribution X̄ ~ N(1000, 40²/100), centered at 1000,
with 987, 992, and 1008 marked; P(X̄ ≤ 992) = 0.025.)
Significant: the tails.
Something you do
not expect.
Confident: the center.
The range in which you
are confident.
For Hypothesis testing, we
start with the claim. Hence,
the mean is set to be 1000
hours. Next, we throw the
Margin of Error into the
graph.
By setting the mean at
1000, we can calculate the
probability of getting a
sample mean as small as
987.
 Different problems will have different thresholds. Can we
obtain a standardized threshold that applies to all problems?
 Yes, we would use the z score as a generic measure for
“closeness”.
 Now let us take a look at the basic steps.
93
 Step 1: state a null hypothesis
 Step 2: state alternative hypothesis
 Step 3: choose a significance level
 Step 4: calculate test statistic
 Step 5: find critical value and region of rejection
 Step 6: make a decision
94
Basic Steps of Hypothesis Testing
Step 1: state a null hypothesis
 Step 1: state a null hypothesis
 H0: μ = 1000
Note: the null hypothesis states this large random
sample is drawn from the population that has a
mean of 1000.
If the null hypothesis is true, we can then conclude
that the sample mean approximately follows a
normal distribution: X̄ ~ N(1000, 40²/100)
95
A consumer advocacy group thinks
a manufacturer of light bulbs is
mistaken in their claim that their
bulbs on average provide 1000
hours of light. They believe the light
bulbs are defective. To test this,
they collect a random sample of n =
100 light bulbs and observe a
sample mean of 987 hours.
Assume standard deviation of all
light bulbs is 40 hours.
Step 2: state alternative hypothesis
 Alternative hypothesis
 Two-sided hypothesis testing (test if the lifetime
differs from 1000 hours of light)
 HA: μ ≠ 1000
 One-sided hypothesis testing (test if the lightbulbs
provide less than 1000 hours of light)
 HA: μ < 1000
96
Population parameter
Reverse of what the
experimenter believes
A consumer advocacy group thinks
a manufacturer of light bulbs is
mistaken in their claim that their
bulbs on average provide 1000
hours of light. They believe the light
bulbs are defective. To test this,
they collect a random sample of n =
100 light bulbs and observe a
sample mean of 987 hours.
Assume standard deviation of all
light bulbs is 40 hours.
Step 3: choose a significance level
 α= 0.1, 0.05, 0.01
 Example: α=0.05
 The corresponding two-sided critical z values (z_{α/2})
would be 1.65, 1.96, and 2.58, respectively
97
A result was said to be significant at the 5% level.
This means the result would be unexpected if the
null hypothesis were true.
A consumer advocacy group thinks
a manufacturer of light bulbs is
mistaken in their claim that their
bulbs on average provide 1000
hours of light. They believe the light
bulbs are defective. To test this,
they collect a random sample of n =
100 light bulbs and observe a
sample mean of 987 hours.
Assume standard deviation of all
light bulbs is 40 hours.
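The critical values come from the inverse of the standard normal CDF. A quick check with the Python standard library (a sketch, not part of the slides; the slides round these to 1.65, 1.96, and 2.58):

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal
for alpha in (0.10, 0.05, 0.01):
    z_two_sided = nd.inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    print(alpha, round(z_two_sided, 3))       # 1.645, 1.96, 2.576
```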
Step 4: calculate test statistic
 If H0 is true, the CLT gives
 Test statistic
 z-score:
z_test = (x̄ − μ0)/(σ/√n) = (987 − 1000)/(40/10) = −3.25
98
X̄ ~ N(1000, 40²/100)
if a large random sample is drawn from the population that
has a mean of μ0 and a standard deviation of σ
A consumer advocacy group thinks
a manufacturer of light bulbs is
mistaken in their claim that their
bulbs on average provide 1000
hours of light. They believe the light
bulbs are defective. To test this,
they collect a random sample of n =
100 light bulbs and observe a
sample mean of 987 hours.
Assume standard deviation of all
light bulbs is 40 hours.
Step 5: find critical value and region of rejection
 Two-sided
 Example: HA: μ ≠ 1000
 One-sided
 Example: HA: μ < 1000
(μ < μ0)
99
±z_{α/2} = ±1.96
−z_α = −1.65
A consumer advocacy group thinks
a manufacturer of light bulbs is
mistaken in their claim that their
bulbs on average provide 1000
hours of light. They believe the light
bulbs are defective. To test this,
they collect a random sample of n =
100 light bulbs and observe a
sample mean of 987 hours.
Assume standard deviation of all
light bulbs is 40 hours.
Step 6: make a decision
 Two-sided
 Reject the null
hypothesis if
 One-sided
 Reject the null
hypothesis if
100
z_test > z_{α/2} or z_test < −z_{α/2}
z_test > z_α for μ > μ0
z_test < −z_α for μ < μ0
A consumer advocacy group thinks
a manufacturer of light bulbs is
mistaken in their claim that their
bulbs on average provide 1000
hours of light. They believe the light
bulbs are defective. To test this,
they collect a random sample of n =
100 light bulbs and observe a
sample mean of 987 hours.
Assume standard deviation of all
light bulbs is 40 hours.
Step 6: make a decision
 Example
Therefore, we can reject the null hypothesis. The lifetime of
lightbulbs is significantly less than 1000 hours at α=0.05.
101
z_test = −3.25, −z_α = −1.65
z_test < −z_α
A consumer advocacy group thinks
a manufacturer of light bulbs is
mistaken in their claim that their
bulbs on average provide 1000
hours of light. They believe the light
bulbs are defective. To test this,
they collect a random sample of n =
100 light bulbs and observe a
sample mean of 987 hours.
Assume standard deviation of all
light bulbs is 40 hours.
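The six steps above, applied to the light bulb data in a short sketch (one-sided test at α = 0.05, using only the standard library):

```python
from statistics import NormalDist

xbar, mu0, sigma, n = 987.0, 1000.0, 40.0, 100
alpha = 0.05

z_test = (xbar - mu0) / (sigma / n ** 0.5)   # (987 - 1000) / 4 = -3.25
z_crit = NormalDist().inv_cdf(alpha)         # -1.645 (one-sided, lower tail)

reject = z_test < z_crit                     # True: reject H0
print(z_test, round(z_crit, 3), reject)
```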
What is hypothesis testing?
 Now let us take a deeper look at hypothesis testing
 Hypothesis
 A proposition whose truth or falsity is capable of being
tested
102
Errors in Hypothesis Testing
 Type I Error
 “False Positive”
 Rejecting a true hypothesis
 The likelihood of making a type I error is denoted by α, referred to as
significance level
 Type II Error
 “False Negative”
 Accepting a false hypothesis
 The likelihood of making a type II error is denoted by β
103
Errors in Hypothesis Testing (cont.)
104
We want to control Type I
error, more than Type II.
In most cases, Type I error
would lead to more severe
consequences.
Take a trial as example.
The null hypothesis (H0) is
an assumption of
innocence.
In a Type I error,
The H0 is true: The person is
innocent.
The H0 is rejected: The
person is deemed guilty
This creates the following
consequence:
(1) an innocent person is
deemed guilty;
(2) the criminal is still out
there free from the
system.
Take the same trial as example.
In a Type II error,
H0 is false: the person is guilty;
H0 is accepted: the person is
deemed to be innocent.
This creates the following
consequence:
(1) The criminal is deemed
innocent and released
Type I error, also known as
False Positive
Type II error, also known as
False Negative
Controlling Type I Error
 It is almost always impossible to simultaneously minimize the
probability of both types of errors.
 Classical hypothesis testing adopts the strategy of controlling α.
 By making α small, we have a small probability of making an error when we are
able to reject our hypothesis.
 If we have evidence to reject the null hypothesis, we will be confident in our
analysis.
 The null hypothesis should be something we want to reject, rather than
something we want to confirm.
105
 Population standard deviation σ is unknown
 Sample size is small
 Test Statistic
 When H0 is true (the sample is drawn from the specified
population that has a mean of μ0), the T random variable follows
a Student's t-distribution with df = n − 1
One-sample t-test
T = (X̄ − μ0)/(S/√n)
106
𝑇: T-value
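A one-sample t-test sketch with a small hypothetical sample. The data and μ0 below are made up for illustration; they are not from the slides:

```python
import math
from statistics import mean, stdev

data = [9.8, 10.2, 10.4, 9.9, 10.1, 10.6, 9.7, 10.3]  # hypothetical measurements
mu0 = 10.0                                            # hypothesized population mean

n = len(data)
xbar, s = mean(data), stdev(data)    # stdev uses the (n - 1) sample denominator
t_stat = (xbar - mu0) / (s / math.sqrt(n))
df = n - 1

print(round(t_stat, 3), df)  # compare t_stat against t_{alpha/2, df} from a t table
```

With only 8 observations, the resulting t statistic would be compared against the t critical value with df = 7 rather than a z value.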
Limitations of classic hypothesis testing
 The significance level must be selected a
priori, and the choice is often arbitrary and lacks a
theoretical basis
 The final decision regarding the null and
alternative hypothesis is binary:
 H0 is rejected or not rejected
 A more flexible method is needed
 What is the exact significance level associated with
the test statistic?
107
p-value
 The probability of getting a test statistic value as
extreme as or more extreme than that observed
by chance, if the null hypothesis H0 is true
 If the null hypothesis is rejected, the p-value is the
probability of making a Type I error
 The smaller the p-value, the more convincing to
reject the null hypothesis
108
Rejecting a true hypothesis
Typically, we can reject the null hypothesis when the p-value is less than 10% (a loose standard);
5% is common for spatial analysis.
Determining p-value
 Using Calculated z or t test statistic to determine p-value
 p-value corresponds to the shaded area under the
standard normal (or t) curve
109
Light Bulbs Example
 What is the p-value of the lightbulb test? Would
you reject your null hypothesis at α = 0.01
significance level?
110
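A sketch of the p-value computation for the one-sided light bulb test, using the stdlib normal CDF:

```python
from statistics import NormalDist

z_test = (987 - 1000) / (40 / 100 ** 0.5)   # -3.25
p_value = NormalDist().cdf(z_test)          # one-sided lower-tail p-value

print(round(p_value, 5))   # well below 0.01, so H0 is rejected even at alpha = 0.01
```

The p-value is about 0.0006, so the null hypothesis would be rejected at α = 0.01 as well as at α = 0.05.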
But when to reject the Null Hypothesis?
111
When the p-value of our test is
smaller than α, in other words,
when the test statistic falls within
the shaded area, we can reject H0.
Let's say the p-value is 0.006.
We can reject the null hypothesis
at all three levels (p < 0.1,
p < 0.05, and p < 0.01).
Let's say the p-value is 0.02.
We can reject the null hypothesis
at the p < 0.1 and p < 0.05 levels,
but not at the p < 0.01 level.
Let's say the p-value is 0.06.
We can reject the null hypothesis
only at the p < 0.1 level, not at
the p < 0.05 or p < 0.01 levels.
Let's say the p-value is 0.2. We
cannot reject the null hypothesis
at any of the three levels (p < 0.1,
p < 0.05, or p < 0.01).
Useful graph/area plotting URL
http://www.statdistributions.com/normal/ http://www.statdistributions.com/t/
112
113
Null hypothesis (you want to reject)
No difference / no change / equal to “=”
Alternative hypothesis (research interest that you want to accept)
Two-sided: different / not equal "≠"
Left sided: smaller than / less than “<”
Right sided: larger than / more than “>”
No sample statistics appear in either H0 or Ha (H1)

More Related Content

Similar to 2 Review of Statistics. 2 Review of Statistics.

Econometrics 2.pptx
Econometrics 2.pptxEconometrics 2.pptx
Econometrics 2.pptxfuad80
 
Statistical Analysis with R- III
Statistical Analysis with R- IIIStatistical Analysis with R- III
Statistical Analysis with R- IIIAkhila Prabhakaran
 
kinds of distribution
 kinds of distribution kinds of distribution
kinds of distributionUnsa Shakir
 
Random variables
Random variablesRandom variables
Random variablesMenglinLiu1
 
Probability distribution 2
Probability distribution 2Probability distribution 2
Probability distribution 2Nilanjan Bhaumik
 
Module 5 Lecture Notes
Module 5 Lecture NotesModule 5 Lecture Notes
Module 5 Lecture NotesLumen Learning
 
Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)
Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)
Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)jemille6
 
AP Statistic and Probability 6.1 (1).ppt
AP Statistic and Probability 6.1 (1).pptAP Statistic and Probability 6.1 (1).ppt
AP Statistic and Probability 6.1 (1).pptAlfredNavea1
 
Mba i qt unit-4.1_introduction to probability distributions
Mba i qt unit-4.1_introduction to probability distributionsMba i qt unit-4.1_introduction to probability distributions
Mba i qt unit-4.1_introduction to probability distributionsRai University
 
Statistik 1 5 distribusi probabilitas diskrit
Statistik 1 5 distribusi probabilitas diskritStatistik 1 5 distribusi probabilitas diskrit
Statistik 1 5 distribusi probabilitas diskritSelvin Hadi
 
Principles of Actuarial Science Chapter 2
Principles of Actuarial Science Chapter 2Principles of Actuarial Science Chapter 2
Principles of Actuarial Science Chapter 2ssuser8226b2
 
random variable and distribution
random variable and distributionrandom variable and distribution
random variable and distributionlovemucheca
 
PG STAT 531 Lecture 5 Probability Distribution
PG STAT 531 Lecture 5 Probability DistributionPG STAT 531 Lecture 5 Probability Distribution
PG STAT 531 Lecture 5 Probability DistributionAashish Patel
 
Binomial and Poission Probablity distribution
Binomial and Poission Probablity distributionBinomial and Poission Probablity distribution
Binomial and Poission Probablity distributionPrateek Singla
 
CPSC531-Probability.pptx
CPSC531-Probability.pptxCPSC531-Probability.pptx
CPSC531-Probability.pptxRidaIrfan10
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
 

Similar to 2 Review of Statistics. 2 Review of Statistics. (20)

PTSP PPT.pdf
PTSP PPT.pdfPTSP PPT.pdf
PTSP PPT.pdf
 
Econometrics 2.pptx
Econometrics 2.pptxEconometrics 2.pptx
Econometrics 2.pptx
 
Statistical Analysis with R- III
Statistical Analysis with R- IIIStatistical Analysis with R- III
Statistical Analysis with R- III
 
U unit7 ssb
U unit7 ssbU unit7 ssb
U unit7 ssb
 
kinds of distribution
 kinds of distribution kinds of distribution
kinds of distribution
 
Unit II PPT.pptx
Unit II PPT.pptxUnit II PPT.pptx
Unit II PPT.pptx
 
Random variables
Random variablesRandom variables
Random variables
 
Probability distribution 2
Probability distribution 2Probability distribution 2
Probability distribution 2
 
Module 5 Lecture Notes
Module 5 Lecture NotesModule 5 Lecture Notes
Module 5 Lecture Notes
 
Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)
Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)
Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)
 
AP Statistic and Probability 6.1 (1).ppt
AP Statistic and Probability 6.1 (1).pptAP Statistic and Probability 6.1 (1).ppt
AP Statistic and Probability 6.1 (1).ppt
 
Mba i qt unit-4.1_introduction to probability distributions
Mba i qt unit-4.1_introduction to probability distributionsMba i qt unit-4.1_introduction to probability distributions
Mba i qt unit-4.1_introduction to probability distributions
 
Statistik 1 5 distribusi probabilitas diskrit
Statistik 1 5 distribusi probabilitas diskritStatistik 1 5 distribusi probabilitas diskrit
Statistik 1 5 distribusi probabilitas diskrit
 
Principles of Actuarial Science Chapter 2
Principles of Actuarial Science Chapter 2Principles of Actuarial Science Chapter 2
Principles of Actuarial Science Chapter 2
 
Unit 2 Probability
Unit 2 ProbabilityUnit 2 Probability
Unit 2 Probability
 
random variable and distribution
random variable and distributionrandom variable and distribution
random variable and distribution
 
PG STAT 531 Lecture 5 Probability Distribution
PG STAT 531 Lecture 5 Probability DistributionPG STAT 531 Lecture 5 Probability Distribution
PG STAT 531 Lecture 5 Probability Distribution
 
Binomial and Poission Probablity distribution
Binomial and Poission Probablity distributionBinomial and Poission Probablity distribution
Binomial and Poission Probablity distribution
 
CPSC531-Probability.pptx
CPSC531-Probability.pptxCPSC531-Probability.pptx
CPSC531-Probability.pptx
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 

Recently uploaded

Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...jaredbarbolino94
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitolTechU
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfMahmoud M. Sallam
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
MICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxMICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxabhijeetpadhi001
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 

Recently uploaded (20)

Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptx
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
MICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxMICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptx
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 

2 Review of Statistics. 2 Review of Statistics.

  • 1. GIS 665 Geospatial Analysis Inferential Statistics Review of Statistics 1
  • 2. Outline  Probability distributions  Confidence intervals  Hypothesis testing 2
  • 4. Outline  Concept of a random variable  Binomial distribution  Poisson distribution 4
  • 5. 5 Suppose we consider any measureable characteristics of a population, such as the household size of all houses in a city. Because this characteristics can take different values, we refer to it as a variable. If we were to select one household at random from this population, the value of this variable is determined. Because of the value is determined through a random sampling, we call it a random variable. The “random” of a random variable is from the random sampling process
  • 6. Discrete vs. Continuous RV  A discrete random variable is a random variable that can take on only a finite or at most a countable infinite number of values.  The total number of heads turns up when flipping a coin three times: [0, 1, 2, 3]  The number of accidents occurred in Redlands per day  A continuous random variable is the random variable that can take on a continuum of values  Commute distance / annual rainfall or temperature 6 For example, GPA. A course GPA is a Discrete RV. The random variable only take on a finite value. But the average GPA? It is continuous.
  • 7. Probability Function  A table, graph, or mathematical function that describes the potential values of a random variable X and their corresponding probabilities is a probability function.  It describes the frequency distribution of the variable. 7
  • 8. Probability Mass Function  The probability distribution of a discrete random variable is specified by a probability mass function or the frequency function.  We use uppercase letter X to denote random variable and lowercase x to denote a specific value. 8 1 ) ( and ) ( ) ( 1      k i i i i x P x X P x P The sum of all xi should always be: 1
  • 9. Probability Mass Function 0 1 2 3 9  Example  Flipping a coin three times. Define X to be the total number of heads that turns up P(X=0) = 1/8 P(X=1) = 3/8 P(X=2) = 3/8 P(X=3) = 1/8 Flip the coin three times. It can either be head OR tail. So, the number of possible outcome is: 23 {HHH, HHT, HTH, THH, HTT, TTH, THT, TTT} AND is multiplication; OR is addition. Head AND Head AND Head. ∴ 1 2 × 1 2 × 1 2 = 1 23 = 1 8
  • 10. Probability on an Interval  What is the probability that the total number of heads turns up is less than 2 when flipping a coin three times? P(X<2) = ? x=0 OR x=1; P = 0.5  P(1<X<3) = ? x=2; P=0.375  P(1≤X<3) = ? x=1 OR x=2 P=0.75  P(1≤X≤3) = ? x=1 OR x=2 OR x=3 P=0.875 10 [1,3] = P(1≤X≤3) (1,3) = P(1<X<3)
  • 11. Expected Value of a Discrete RV  Expected value of a discrete RV is the average value it takes. 11   i i i x P x X E ) ( ) ( xi P(xi) xi*P(xi) 0 0.125 0 1 0.375 0.375 2 0.375 0.75 3 0.125 0.375 E(X) = 1.5 This is the probability of different number of head turns up. Remember IDW? The sum of weight is also 1. Flipping a coin three times. Define X to be the total number of heads that turns up This is the Expected value of the total number of heads turns up when tossed three times. Which is the sum of these 4 items. This is the number of heads turns up when the coin is tossed 3 times. Either 0 or 1 or 2 or 3.
  • 12. Binomial Distribution  Bernoulli Trial  Each trial results in one of two possible outcomes (“success”/“failure” or “head”/”tail”)  The probability of success is constant and equal to p on each trial (the probability of failure is 1-p)  Binomial Distribution  The process of interest consists of n independent Bernoulli trials with the probability of success in each trial as p  The total number of successes, X, is a binomial random variable with parameters n and p. 12 Suppose that n independent experiments are performed, where n is fixed number, and each experiment results in a success with probability p and a failure with 1-p. the total number of successes, X, is a binomial random variable with parameters n and p.
  • 13. What is the probability that one head would turn up when a fair coin is tossed three times? 1st 2nd 3rd H T T 0.5*0.5*0.5= (0.51* 0.52) =0.125 T H T 0.5*0.5*0.5 =(0.51* 0.52) =0.125 T T H 0.5*0.5*0.5 =(0.51* 0.52) =0.125 P(X=1) = 3 × 0.51 × 0.52 = 0.375 13 𝐶 𝑛, 𝑟 = 𝑛 𝑟 = 3 1 = 𝑛! 𝑟! 𝑛 − 𝑟 ! = 3 AND is multiplication; OR is addition. HTT, OR THT, OR TTH; ∴0.125 +0.125 +0.125 =0.375
  • 14. What is the probability that two “2” would turn up if a dice is rolled four times? A B C D 1 1 0 0 1/6*1/6*5/6*5/6 =(1/6)2* (5/6)2 1 0 1 0 1/6*5/6*1/6*5/6 =(1/6)2* (5/6)2 1 0 0 1 1/6*5/6*5/6*1/6 =(1/6)2* (5/6)2 0 1 1 0 5/6*1/6*1/6*5/6 =(1/6)2* (5/6)2 0 1 0 1 5/6*1/6*5/6*1/6 =(1/6)2* (5/6)2 0 0 1 1 5/6*5/6*1/6*1/6 =(1/6)2* (5/6)2 14 P(X=2) = 6*(1/6)^2*(5/6)^2 = 0.116 Whatever other than “2” The four trial/roll “2”(Success): 1 Not “2”(Failure): 0 𝐶 𝑛, 𝑟 = 𝑛 𝑟 = 4 2 = 𝑛! 𝑟! 𝑛 − 𝑟 ! = 6
  • 15. Frequency Function  Part I: any particular sequence of x successes occurs with probability px (1-p)n-x (multiplication law)  Part II: there are ways to assign x successes to n trials         x n x n x p p x n x X P             ) 1 ( ) ( 15 AND is multiplication; OR is addition. Among the n trials, the frequency for the situation to happen 𝑛 𝑥 AND Probability of something successfully happening x times (𝑝𝑥 ) AND Probability of something NOT successfully happening in the REST of the trials ( 1 − 𝑝 𝑛−𝑥 ) …which is the Probability Function, see p.7 Probability Function “describes the frequency distribution of the variable” 𝐶 𝑛, 𝑥 = 𝑛 𝑥 = 𝑛! 𝑥! 𝑛 − 𝑥 !
  • 16. Binomial -> Poisson 16  Consider this situation… Suppose you are a transportation planner, and you are concerned about the safety of a particular intersection. During the last 60 days, there were 3 accidents, each occurring on a separate day. You are asked to estimate the probability that 2 will occur during the next 30 days. Among the n trials, the number of ways for the situation to happen: C(n, x) = C(30, 2). AND the probability of success happening x times: p^x = (3/60)^2 = 0.05^2. AND the probability of NOT succeeding in the REST of the trials: (1 − p)^(n − x) = (57/60)^(30 − 2) = 0.95^28.
  • 17. Binomial -> Poisson (cont.) 17  Solution – the binomial distribution If we define observing whether a traffic accident occurs on a given day as a Bernoulli trial, the number of days in which an accident occurs is a binomial random variable. However, it is possible to have more than one accident per day, so we can take the half day as the analysis unit instead. n = 30, p = 3/60 = 0.05, 1 − p = 0.95 P(X = 2) = C(30, 2) × 0.05^2 × 0.95^28 = 0.2586
  • 18. Binomial -> Poisson 18  Solution – the binomial distribution X is defined as the number of half days in which one accident occurs. Again, the choice of time unit is artificial; we can continue to divide the day into smaller time periods. n = 60, p = 3/120 = 0.025, 1 − p = 0.975 P(X = 2) = C(60, 2) × 0.025^2 × 0.975^58 = 0.2548 As the unit shrinks, n → ∞ and p → 0. Among the n trials, the number of ways for the situation to happen: C(60, 2). AND the probability of success happening x times: p^x = (3/120)^2 = 0.025^2. AND the probability of NOT succeeding in the REST of the trials: (1 − p)^(n − x) = (117/120)^(60 − 2) = 0.975^58.
  • 19. Binomial -> Poisson  The Poisson distribution can be defined as the limiting case of the binomial distribution: n → ∞, p → 0, with np = λ held constant  The Poisson distribution can be used as an approximation of the binomial distribution for large n and small p 19
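The limiting behavior can be seen numerically. The sketch below (illustrative; λ = 1.5 is an assumed value matching the accident example) holds np = λ fixed while n grows, and the binomial probabilities drift toward the Poisson value:

```python
from math import comb, exp, factorial

lam = 1.5  # fixed mean: np -> lambda as n grows

def binomial_pmf(x, n, p):
    """Binomial P(X = x) with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Poisson P(X = 2) with mean 1.5 (the limiting value)
poisson = exp(-lam) * lam**2 / factorial(2)

# Binomial P(X = 2) with np = 1.5 approaches the Poisson value as n grows
for n in (30, 60, 120, 1000):
    print(n, round(binomial_pmf(2, n, lam / n), 4))
print(round(poisson, 4))   # 0.251
```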
  • 20. Poisson Distribution  The process of interest consists of events that occur repeatedly and randomly within a certain time period or space  Traffic accidents in Redlands / Tornados in Columbus (Ohio)  Events are independent of past or future occurrences  The occurrence of an event has a constant mean rate or density (the underlying process governing the phenomenon must be invariant)  The random variable of interest, X, is the number of events occurring within a given unit of time, area, volume, etc. 20 The Poisson distribution is sometimes known as the Law of Small Numbers, because it describes the behavior of events that are rare. The probability that an event will occur within a given unit must be the same for all units (i.e. the underlying process governing the phenomenon must be invariant). The number of events occurring per unit must be independent of the number of events occurring in other units (no interactions). The mean or expected number of events per unit (λ) is found from past experience (observations). X is the “count” of events.
  • 21. Poisson Distribution (cont.)  Frequency function P(X = x) = λ^x e^(−λ) / x!  The mean or expected number of events (λ) is found from past experience (observations) where e = 2.71828 (base of the natural logarithm) λ = the mean or expected value for the given unit (an expected value, not an actual count) x = 0, 1, 2, …  # of occurrences 21 Number of trials: n Probability of success: p np = λ
  • 22. Poisson Distribution (cont.) λ 22 λ affects the skew: the larger the λ, the more symmetrical the distribution becomes. Notice that the Poisson distribution is for relatively rare incidents, such as accidents and cancer. If the frequency is relatively high, we should use the normal distribution.
  • 23. Example 1 Three (3) accidents were observed in the last 60 days. Find the probability of observing x accidents in the next 30 days. Solution: 1. Random variable X: the # of accidents occurring during the 30-day period 2. The mean number of accidents during the 30-day period is constant and equal to  = 3/2 = 1.5 3. Find the probability of observing x accidents during the 30-day period. That is, find the value of P(X = x) 23 P(X = x) = λ^x e^(−λ) / x! (3/60) × 30 = 3/2 = 1.5 [3 accidents per 60 days] When the time period is 30 days, the mean number of accidents is 1.5.
  • 24. x P(X = x) 0 e^-1.5 × 1.5^0 / 0! = 0.2231 1 e^-1.5 × 1.5^1 / 1! = 0.3347 2 e^-1.5 × 1.5^2 / 2! = 0.2510 3 e^-1.5 × 1.5^3 / 3! = 0.1255 Example 1 24
  • 25. Example 2 A disease occurs randomly in space, with one(1) incident every 16 square kilometers. What is the probability of finding four(4) incidents in a 30 square kilometer area? 25
  • 26. Example 2 Solution: 1. Random variable X: the # of incidents in a 30 square kilometer area 2. The mean number of incidents in a 30 square kilometer area equals  = 30/16 = 1.875 3. P(X = 4) = e^(−1.875) × 1.875^4 / 4! = 0.079 26 One (1) every 16: λ, the mean number of incidents in a 30 km2 area, is 30/16 = 1.875 (incidents)
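Both worked examples can be checked with a small Poisson helper (an illustrative sketch, not part of the slides):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = lam^x * e^(-lam) / x! for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

# Example 1: lambda = (3/60) * 30 = 1.5 accidents per 30 days
for x in range(4):
    print(x, round(poisson_pmf(x, 1.5), 4))

# Example 2: lambda = 30/16 = 1.875 incidents per 30 km^2 area
print(round(poisson_pmf(4, 30/16), 3))   # 0.079
```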
  • 27. Binomial vs. Poisson  If a mean or average probability of an event happening per unit time/per page/per mile cycled etc., is given, and you are asked to calculate a probability of n events happening in a given time/number of pages/number of miles cycled, then the Poisson Distribution is used. You do not know the number of trials.  If, on the other hand, an exact probability of an event happening is given, or implied, in the question, and you are asked to calculate the probability of this event happening k times out of n, then the Binomial Distribution must be used. You know the number of trials. 27 http://personal.maths.surrey.ac.uk/st/J.Deane/Teach/se202/poiss_bin.html Poisson: expected value = λ, variance = λ. Binomial: expected value = np, variance = np(1 − p).
  • 28. Binomial vs. Poisson The Binomial and Poisson distributions are similar, but they are different. Also, the fact that they are both discrete does not mean that they are the same. The Geometric distribution and one form of the Uniform distribution are also discrete, but they are very different from both the Binomial and Poisson distributions. The difference between the two is that while both measure the number of certain random events (or "successes") within a certain frame, the Binomial is based on discrete events, while the Poisson is based on continuous events. That is, with a Binomial distribution you have a certain number, n, of "attempts," each of which has probability of success p. With a Poisson distribution, you essentially have infinite attempts, with infinitesimal chance of success. That is, given a Binomial distribution with some n,p, if you let n→∞ and p→0 in such a way that np→λ, then that distribution approaches a Poisson distribution with parameter λ. Because of this limiting effect, Poisson distributions are used to model occurrences of events that could happen a very large number of times but happen rarely. That is, they are used in situations that would be more properly represented by a Binomial distribution with a very large n and small p, especially when the exact values of n and p are unknown. (Historically, the number of wrongful criminal convictions in a country) 28
  • 29. Exercise  A typist makes on average 2 mistakes per page. What is the probability of a particular page having no errors on it? P  A computer crashes once every 2 days on average. What is the probability of there being 2 crashes in one week? P  Components are packed in boxes of 20. The probability of a component being defective is 0.1. What is the probability of a box containing 2 defective components? B  ICs are packaged in boxes of 10. The probability of an IC being faulty is 2%. What is the probability of a box containing 2 faulty ICs? B  The mean number of faults in a new house is 8. What is the probability of buying a new house with exactly 1 fault? P  A box contains a large number of washers; there are twice as many steel washers as brass ones. Four washers are selected at random from the box. What is the probability that 3 are brass? B 29 http://personal.maths.surrey.ac.uk/st/J.Deane/Teach/se202/poiss_bin.html n=20, p=0.1, x=2 n=10, p=2%, x=2 n=4, p=1/3, x=3 P(X = x) = C(n, x) p^x (1 − p)^(n − x)
  • 30.  Suppose 30 events are randomly distributed among 35 equally sized grid cells, how many of the grid cells are expected to have one event? 30
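One way to attack this exercise, assuming a Poisson model with mean λ = 30/35 events per cell (an illustrative sketch, not the slides' worked answer):

```python
from math import exp, factorial

cells, events = 35, 30
lam = events / cells   # mean events per cell, about 0.857

# P(exactly one event in a cell) under Poisson(lam)
p_one = exp(-lam) * lam**1 / factorial(1)

# Expected number of cells containing exactly one event
expected_cells = cells * p_one
print(round(expected_cells, 1))   # 12.7
```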
  • 32. Outline  Probability density function  Uniform probability distribution  Normal distribution 32
  • 33. Continuous Random Variable  A continuous random variable is a random variable that can take on a continuum of values  Travel distance / the magnitude of a flood  For a continuous random variable, the role of the frequency function is taken by a density function, f(x) 33
  • 34. Probability Density Function  Probability distribution of a continuous random variable is expressed by its probability density function (PDF), f (x), which has the following properties 1) f(x) ≥ 0 2) f is piecewise continuous 3) ∫−∞^∞ f(x) dx = 1  f(x) is often represented by a graph or an equation 34 “The total area under the curve” = “the sum of all probability” = 1. The integral ∫−∞^∞ f(x) dx runs from the lower limit to the upper limit of the function. P(X = x) = 0: the probability of any one particular value is zero, because probability corresponds to area under the curve, and the width of a single value approaches zero; hence the area, and the probability, are zero as well.
  • 35. Probability on an Interval  If X is a random variable with density function f, then for any a < b, the probability that X falls in the interval (a, b) is the area under the density function between a and b: P(a ≤ X ≤ b) = ∫a^b f(x) dx 35 (integral from the lower limit a to the upper limit b of the function)
  • 36. Uniform Distribution  Uniform distribution  The process of interest consists of equally likely outcomes  Probability density function 𝑓 𝑥 = 1 𝑏 − 𝑎 ; 𝑎 ≤ 𝑥 ≤ 𝑏 0; 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒 36
  • 37. Uniform Distribution (cont.)  Probability on an interval a b c d P(c ≤ X ≤ d) = F(d) − F(c) = ∫a^d 1/(b − a) dx − ∫a^c 1/(b − a) dx = (d − c)/(b − a) 37
  • 38. Uniform Distribution - Example  The annual mean temperature is uniformly distributed between 10ºC and 18ºC. Find the probability that the annual mean temperature falls between 12ºC and 15ºC. a = 10, b = 18, c = 12, d = 15  What is the probability that the annual mean temperature is greater than 15ºC? What is the probability that the annual temperature is less than 13ºC? X ~ U(10, 18) 38 P(12 ≤ X ≤ 15) = (d − c)/(b − a) = (15 − 12)/(18 − 10) = 3/8 The probability depends only on the length of the interval, not on its location, provided the interval lies between a and b.
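The interval formula can be sketched directly (the helper name `uniform_prob` is my own, for illustration); all three questions on this slide turn out to have the same answer, 3/8, because each interval has length 3:

```python
def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X ~ Uniform(a, b), clipping the interval to [a, b]."""
    c, d = max(c, a), min(d, b)
    return max(0.0, (d - c) / (b - a))

print(uniform_prob(12, 15, 10, 18))   # 0.375 (= 3/8)
print(uniform_prob(15, 18, 10, 18))   # P(X > 15) = 0.375
print(uniform_prob(10, 13, 10, 18))   # P(X < 13) = 0.375
```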
  • 39. Normal Distribution  It was proposed by Karl Friedrich Gauss as a model for measurement errors.  It is also called the Gaussian distribution.  The most common and important probability distribution  Many naturally occurring variables are approximately normally distributed (e.g. heights, weights, annual temperature variations, test scores, IQ scores, etc.)  Foundation of probability and statistics 39 A normal distribution can also be produced by tracking the errors made in repeated measurements of the same thing; Karl Friedrich Gauss was a 19th century astronomer who found that the distribution of the repeated errors of determining the position of the same star formed a normal (or Gaussian) distribution
  • 40. Normal Distribution (cont.)  Probability density function  The normal distribution is a continuous distribution that is symmetric and bell-shaped. X ~ N(μ, σ^2), f(x) = 1/√(2πσ^2) × e^(−(x − μ)^2 / (2σ^2)) 𝜇 – population mean 𝜎^2 – population variance 𝜎 – population standard deviation 40
  • 41. Normal distributions with various parameters 41
  • 42. Properties of Normal Distribution  Symmetry: values below μ are just as likely as values above μ.  Center: f(x) has maximum value for x = μ, so values close to μ are the most likely to occur.  Dispersion: the density is “wider” for large σ compared to small values of σ (for fixed μ), so the larger σ the more likely are observations far from μ. 42
  • 43. Normal Distribution (cont.)  Probability on an interval P(a ≤ X ≤ b) = ∫a^b 1/√(2πσ^2) e^(−(x − μ)^2 / (2σ^2)) dx  The areas under normal curves can be obtained from standard normal tables. Therefore, it is necessary to standardize normal distributions. 43
  • 44. Standard Normal Distribution  Standard normal distribution  The special case of the normal distribution which has μ = 0 and σ^2 = 1 Z ~ N(0, 1), f(z) = 1/√(2π) × e^(−z^2 / 2) 44
  • 45. Central Part of z Distribution 45
  • 46. 46 We would expect the interval ±1.96 to contain approximately 95% of the observations. This corresponds to a commonly used rule of thumb that roughly 95% of the observations are within [-2, 2]. Similar computations are made for other percentages.
  • 47.  The standardization is achieved by converting the data into z-scores  Example 1  population mean and variance are known The annual precipitation X ~ N(80, 40^2); what is the z-score of x = 150 mm? Standardization of Normal Distributions z = (x_i − μ)/σ = (150 − 80)/40 = 1.75 47 The z-score is, in other words, how many standard deviations x deviates from the mean. By referring to the standard normal table, we can calculate the probability of X lying above, below, or between certain values. Z ~ N(0, 1^2) X ~ N(μ, σ^2) z = (x_i − μ)/σ
  • 48. Standardization of normal distributions  Example 2  Sample mean and variance are known  Step I:  Sample mean 𝑥 = 59.7  Sample standard deviation 𝑠 = 12.97  Step II: Month T (°F) Z-score J 39.53 -1.56 F 46.36 -1.03 M 46.42 -1.02 A 60.32 0.05 M 66.34 0.51 J 75.49 1.22 J 75.39 1.21 A 77.29 1.36 S 68.64 0.69 O 57.57 -0.16 N 54.88 -0.37 D 48.2 -0.89 z = (x_i − x̄)/s 48 x̄ = sample mean; s = sample standard deviation; μ = population mean; σ = population standard deviation
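The table above can be reproduced with Python's standard library; `statistics.stdev` uses the same n − 1 divisor as the sample standard deviation on this slide (an illustrative check, not part of the original deck):

```python
from statistics import mean, stdev

# Monthly temperatures (°F) from the slide's table
temps = [39.53, 46.36, 46.42, 60.32, 66.34, 75.49,
         75.39, 77.29, 68.64, 57.57, 54.88, 48.2]

x_bar = mean(temps)                    # sample mean
s = stdev(temps)                       # sample standard deviation (n - 1 divisor)
z = [(t - x_bar) / s for t in temps]   # z-score for each month

print(round(x_bar, 1))   # 59.7
print(round(s, 2))       # 12.97
print(round(z[0], 2))    # -1.56 (January)
```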
  • 49. Calculating Probabilities from a Normal Distribution  Consider this problem Suppose that the annual precipitation was normally distributed with mean 80 mm per year and standard deviation 40 mm. What is the probability that the annual precipitation is greater than 150 mm? Solution: 1. calculate the z score(s) 49 z = (x − μ)/σ = (150 − 80)/40 = 1.75
  • 50. Standard Normal Table 1.75 P(Z>=1.75) = 0.0401 z 50 2. Look up the standard normal table
  • 51. Standard Normal Table, Head-End (z=0.0~0.99) Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09 .0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359 .1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753 .2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141 .3 .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517 .4 .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879 .5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224 .6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549 .7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852 .8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133 .9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389 51
  • 52. Standard Normal Table, Head-End (z=1.0~1.99) Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09 1.0 .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621 1.1 .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830 1.2 .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015 1.3 .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177 1.4 .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319 1.5 .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441 1.6 .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545 1.7 .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633 1.8 .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706 1.9 .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767 52
  • 53. Standard Normal Table, Head-End (z=2.0~2.99) Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09 2.0 .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817 2.1 .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857 2.2 .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890 2.3 .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916 2.4 .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936 2.5 .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952 2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964 2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974 2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981 2.9 .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986 53
  • 54. Standard Normal Table, Head-End (z=3.0~3.49) Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09 3.0 .9987 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990 3.1 .9990 .9991 .9991 .9991 .9992 .9992 .9992 .9992 .9993 .9993 3.2 .9993 .9993 .9994 .9994 .9994 .9994 .9994 .9995 .9995 .9995 3.3 .9995 .9995 .9995 .9996 .9996 .9996 .9996 .9996 .9996 .9997 3.4 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9998 54
  • 55. Calculating Probabilities (cont.) μ = 0 f(x) +1.75 .0401 P(Z > 1.75) = 0.0401 P(Z <= 1.75) = 0.9599 μ = 80 f(x) .0401 +150 P(X > 150) = 0.0401 P(X <= 150) = 0.9599 55 3. Calculate the areas of interest
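Instead of a printed table, the tail area can be computed with `statistics.NormalDist` from the Python standard library; this sketch reproduces the 0.0401 looked up above:

```python
from statistics import NormalDist

X = NormalDist(mu=80, sigma=40)   # annual precipitation model from the example

z = (150 - X.mean) / X.stdev
print(round(z, 2))                # 1.75

p_greater = 1 - NormalDist().cdf(z)   # tail area P(Z > 1.75)
print(round(p_greater, 4))            # 0.0401

# The same answer without standardizing, working on the original scale
print(round(1 - X.cdf(150), 4))       # 0.0401
```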
  • 56.  Random variable X follows a normal distribution with mean μ and variance σ^2. Find the interval that contains the middle 95% of the data. 56
  • 57. Central Part of a Normal Distribution 57
  • 58. Are data normally distributed? 58  Compare the observed histogram to a normal curve that has the sample mean and sample standard deviation.
  • 59. Normal Q-Q plot 59  A normal quantile- quantile plot compares the sample quantiles to those of the normal distribution. If data are N(μ,σ2) distributed, the points in the QQ-plot should be scattered around the straight line. Straight line in Q-Q plot = normal distribution
  • 60. Confidence Interval Central limit theorem, interval estimation 60 Section III
  • 61. Outline  Point estimation  Central Limit Theorem  Confidence interval 61
  • 62. Estimation  Estimate population parameters based on sample data  Two types of estimate  Point estimate  Interval estimate 62
  • 63.  Mean  Population parameter:   Point estimate: x̄ = (1/n) Σ_{i=1}^{n} x_i  Standard Deviation  Population parameter:   Point estimate: s = √( Σ_{i=1}^{n} (x_i − x̄)^2 / (n − 1) ) Point Estimate 63 The first type of estimation “True mean” “Sample mean” “Sample s.d.” “Population s.d.”
  • 64. Sampling Error  Sampling error is the difference between the value of a population characteristic and the value of that characteristic inferred from a sample.  Example: consider the population characteristic of the average selling price of homes in Redlands in 2009. If every house is examined, the average selling price is $200,000. If only 25 homes per month are sampled, the average selling price of the 300 homes is $230,000. The sampling error is $200,000 - $230,000 = -$30,000 Sampling error cannot be removed.
  • 65. Interval Estimate  It is very unlikely that sample point estimates will exactly equal the true population parameters due to uncertainty in probability sampling.  To determine how good our point estimates are, we could extend a point estimate to an interval within which the population parameter lies. 65 x The second type of estimation
  • 66. Confidence Level and Interval  In probabilistic terms, we like to attach some measure of certainty (confidence) to our interval estimates.  What does the 90% confidence level mean?  The chance that the interval estimate contains the true mean is 90%.  In other words, the probability that the true mean falls in the interval is 90%.  This interval is called the 90% confidence interval. 66 Suppose we repeated the sampling process many times, producing many sample sets. The sample mean from each set is different, and together they form an interval. Does the true population mean (𝝁) fall within that interval? Maybe. The wider the interval, the higher the chance it is in the interval. Technically, if the interval is −∞ ≤ 𝑥 ≤ ∞, from negative infinity to infinity, the chance is always 100%, and the confidence level would always be 100%... but that interval would be meaningless. Therefore, statistically we apply a tolerance level: the probability is less than 100%, but high enough (90%, 95%, 99%, etc.) that the interval is useful. This is the concept of the confidence interval and the confidence level.
  • 67. How to obtain a confidence interval?  If we know the relationship between the sample mean and the true mean, we can link them together.  The sampling distribution or probability distribution of the sample mean reveals the relationship. 67
  • 68. Sampling Distribution of Sample Mean  The sampling distribution of the sample mean can be developed by taking all possible or many samples of size n from a population, calculating the value of the mean for each sample, and drawing the distribution of these values. 68
  • 69. Sampling Distribution of Sample Mean  When the sampling process is repeated many times, we could get many different samples, which give different sample means.  http://www.ltcconline.net/greenl/java/Statistics/clt/cltsimulation.html 69 The larger the sample size, the closer the sample mean is to the true mean. Hence, the larger the sample size, the smaller the variance of the sample means, which means this frequency plot would be narrower.
  • 70. Central Limit Theorem (CLT) 70  Let X1, X2, X3… Xn be a random sample of size n drawn from a population with mean  and standard deviation .  Then for a large n, the sampling distribution of 𝑋 is approximately normally distributed with mean  and standard deviation σ/√n.  In a special case where X is normal, the distribution of 𝑋 is exactly normal regardless of sample size.  The standard deviation of the sample mean, σ/√n, is also called the standard error. 𝑋: sample mean : population mean σ/√n: standard deviation of sample means (= standard error) σ: standard deviation of the population 𝑠: standard deviation of a sample The mean of the frequency distribution of sample means is theoretically the same as the population mean (𝝁). The standard deviation of the frequency distribution of sample means is 𝝈/√𝒏. X̄ ~ N(μ, (σ/√n)^2): the frequency distribution of the sample mean
  • 71. CLT (cont.)  The central limit theorem only applies to the sample mean, not to other sample statistics.  Generally, a sample size n ≥ 30 can be regarded as sufficiently large so that the sampling distribution of sample means is approximately normal.  The sample size is inversely related to the standard error. 71 The central limit theorem is used to estimate the sample mean, and the sample mean only. The rule of thumb n ≥ 30 is obtained by comparing the t table with the normal distribution, which we will discuss in later slides.
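A small simulation makes the theorem concrete (an illustrative sketch; the Uniform(0, 1) population, seed, and sample sizes are my own choices, not from the slides). The standard deviation of the simulated sample means should land close to σ/√n:

```python
import random
from statistics import mean, stdev

random.seed(1)
n, trials = 30, 5000

# Population: Uniform(0, 1), so mu = 0.5 and sigma = sqrt(1/12)
sample_means = [mean(random.random() for _ in range(n)) for _ in range(trials)]

mu_hat = mean(sample_means)              # should be near the population mean 0.5
se_hat = stdev(sample_means)             # observed spread of the sample means
se_theory = (1 / 12) ** 0.5 / n ** 0.5   # CLT prediction: sigma / sqrt(n)

print(round(mu_hat, 2))
print(round(se_hat, 3), round(se_theory, 3))
```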
  • 72. Notation for Confidence Level and Interval  Confidence level  Denoted by (1 – α) × 100%  Usually α = 0.05, could also be 0.10 or 0.01  Thus, the likelihood that we are wrong is α (also called significance level)  The interval in which the true mean lies within (1 – α) × 100% confidence (lies within the Confidence level) is called (1 – α) × 100% confidence interval. 72 95% Confidence level 5% likelihood of being wrong 90% Confidence level 10% likelihood of being wrong 99% Confidence level 1% likelihood of being wrong “Significance level” the likelihood that we are wrong
  • 73. Basic Steps  Step 1: Standardize 𝑋  Step 2: find the z score  Step 3: calculate margin of error  Step 4: obtain the final CI (Confident Interval)  Interpretation 73
  • 74.  Suppose the sample size n is sufficiently large (n ≥ 30); according to the central limit theorem, the frequency distribution of the sample mean is normal with mean μ and standard deviation σ/√n  Step 1: Standardize 𝑋 Step 1: Standardize 𝑋 74 X̄ ~ N(μ, (σ/√n)^2) Z = (X̄ − μ)/(σ/√n) ~ N(0, 1)
  • 75. Step 2: Find The Z Score 75 The three z-scores 1.65, 1.96, 2.58 are associated with three confidence levels 1 – α (=0.90, 0.95, 0.99), where α is 0.10, 0.05, and 0.01 respectively.
  • 76. z_{α/2} is a z score or z value that corresponds to a tail area of α/2. 76 z_{α/2} α/2
  • 77. Step 3: Calculate Margin Of Error  The range of values above and below the sample statistic with a specified confidence.  Put differently: P(X̄ − ME ≤ μ ≤ X̄ + ME) = 1 − α 77 ME = z_{α/2} × σ/√n 𝑀𝐸: “Margin of Error”
  • 78. Step 4: Obtain The Final CI (Confidence Interval)  Add/subtract the margin of error from the sample mean to get the CI. 78 CI = [X̄ − z_{α/2} σ/√n, X̄ + z_{α/2} σ/√n] If α = 0.05, then CI = [X̄ − 1.96 σ/√n, X̄ + 1.96 σ/√n]
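The four steps can be collected into one sketch (the helper `z_interval` is my own, for illustration; the assumed numbers x̄ = 987, σ = 40, n = 100 are just an example dataset):

```python
from statistics import NormalDist

def z_interval(x_bar, sigma, n, alpha=0.05):
    """(1 - alpha) confidence interval for the mean, sigma known, large n."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # e.g. about 1.96 for alpha = 0.05
    me = z * sigma / n ** 0.5                 # margin of error
    return x_bar - me, x_bar + me

# Assumed example: x_bar = 987, sigma = 40, n = 100
lo, hi = z_interval(987, 40, 100)
print(round(lo, 2), round(hi, 2))   # 979.16 994.84
```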
  • 79. Interpretation  When α = 0.05, we can say that  “I am 95% confident that the mean of the population is somewhere between 𝑥 − 1.96 𝜎 𝑛 and 𝑥 + 1.96 𝜎 𝑛  The true population mean μ should, 95% of the time, lie within ±1.96 𝜎 𝑛 of sample mean.  95% of all confidence intervals that can be constructed will contain the unknown true mean. 79
  • 80. 80 After repeated sampling of 100 times, how many of the confidence intervals would you expect to contain the true mean?
  • 81. What influence CI?  Sample variance ↑, range of CI ↑  larger sample variability , higher uncertainty  wider CI  Sample size ↑, range of CI↓  Larger sample size n, more information  narrower CI  Confidence level ↑, range of CI ↑  Higher confidence level, higher uncertainty to be accounted  wider CI 81
  • 82. Some issues  How about small sample size (n < 30) ?  t-distribution unless the sample is drawn from a normally distributed population  How about the population standard deviation σ is unknown?  Use sample standard deviation s to approximate population standard deviation σ  t-distribution, providing the population is normal 82
  • 83. t-Distribution  When the sample size is not sufficiently large, the frequency distribution of sample means has what is known as the t distribution (or Student’s t distribution)  The t-distribution also copes with the uncertainty resulting from estimating the standard deviation from a sample, whereas if the population standard deviation were known, a normal distribution would be used  The overall shape of the probability density function of the t-distribution resembles the bell shape of a standard normal distribution, except that it is a bit lower and wider.  t-distribution depends on a new parameter – degree of freedom (df = n - 1) 83
  • 84. t-Distribution vs. Standard Normal df = 1 df = 2 df = 3 df = 5 df = 10 df = 30 84
  • 85. Using t distribution to construct CI  Population standard deviation is unknown: CI = [X̄ − t_{α/2, n−1} s/√n, X̄ + t_{α/2, n−1} s/√n]  Population standard deviation is known but sample size is small: CI = [X̄ − t_{α/2, n−1} σ/√n, X̄ + t_{α/2, n−1} σ/√n] 85
  • 86. Example question A local bank needs information concerning the savings account balances of its customers. A random sample of 15 accounts was checked. The mean balance was $686 with a standard deviation of $256. Which of the following is the 95% confidence interval for the true mean? The correct answer is: 686 − 2.15 × 256 15 , 686 + 2.15 × 256 15 [686-2.15*256/sqrt(15), 686+2.15*256/sqrt(15)] 86
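A sketch of the arithmetic, with the critical value t(0.025, df = 14) ≈ 2.145 taken from a t table (the slide rounds it to 2.15; a library such as SciPy could compute the same quantile, but a hardcoded table value keeps this self-contained):

```python
t_crit = 2.145            # t(0.025, df = 14), read from a t table
x_bar, s, n = 686, 256, 15

me = t_crit * s / n ** 0.5   # margin of error using the sample s.d.
print(round(x_bar - me, 1), round(x_bar + me, 1))   # 544.2 827.8
```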
  • 88. Outline  What is hypothesis testing?  Errors in hypothesis testing  One-sample z-test  One-sample t-test 88
  • 89. Consider this situation A consumer advocacy group collects a random sample of n = 100 light bulbs from a manufacturer, and observes a sample mean of 987 hours. Assume the standard deviation of all light bulbs is 40 hours. Estimate on average how many hours of light the light bulbs provide. α = 0.05, z_{0.025} = 1.96 ME = z_{0.025} × σ/√n = 1.96 × 40/10 = 7.84 Confidence Interval [979.16, 994.84] 89 We are 95% confident that the mean lifetime of light bulbs made by the manufacturer falls between 979.16 and 994.84 hours.
  • 90. Consider a related situation A consumer advocacy group thinks a manufacturer of light bulbs is mistaken in their claim that their bulbs on average provide 1000 hours of light. They believe the light bulbs are defective. To test this, they collect a random sample of n = 100 light bulbs and observe a sample mean of 987 hours. Assume the standard deviation of all light bulbs is 40 hours. 90 We are 95% confident that the mean lifetime of light bulbs made by the manufacturer falls between 979.16 and 994.84 hours. We cannot simply say that, because the sample mean of one sample is 987 hours, the manufacturer’s claim is false. But with statistics and probability, we can assess the claim: 95% of the time, the mean lifetime of a sample of this manufacturer’s light bulbs would fall between 979.16 and 994.84 hours, and 1000 hours falls outside of that 95% range. In other words, the manufacturer’s claim is, in at least 95% of cases, wrong.
  • 91.  If we assume the manufacture’s claim is true, we would expect the average lifetime of the samples is close to 1000h.  But how close is close? Is the sample mean of 987 close enough to the presumed value of 1000?  We need to quantify the closeness or difference between the sample mean and the presumed mean.  To do so, we may compare 987 to a threshold that is deemed as “close enough” 91 Common sense is important
  • 92.  95% of the time the sample mean will range between about 992 and 1008. So we can take 992 and 1008 as the thresholds for “close enough”.  However, there is a small chance (<5%) that the sample mean falls outside the range of 992 to 1008. So we could be wrong if we conclude the true mean is not 1000. This 5% is called the significance level 92 𝑋~𝑁(1000, 402 100 ) 1000 987 992 1008 𝑃 𝑋 ≤ 992 ≈ 0.025 Significant: the tails, something you do not expect. Confident: the center, the range that you’re confident in. For hypothesis testing, we start with the claim. Hence, the mean is set to be 1000 hours. Next, we throw the margin of error into the graph. By setting the mean at 1000, we can calculate the probability of getting a sample mean as small as 987
  • 93.  Different problems will have different thresholds. Can we obtain a standardized threshold that applies to all problems?  Yes, we would use the z score as a generic measure for “closeness”.  Now let us take a look of basic steps. 93
  • 94.  Step 1: state a null hypothesis  Step 2: state alternative hypothesis  Step 3: choose a significance level  Step 4: calculate test statistic  Step 5: find critical value and region of rejection  Step 6: make a decision 94 Basic Steps of Hypothesis Testing
  • 95. Step 1: state a null hypothesis  Step 1: state a null hypothesis  H0: μ = 1000 Note: the null hypothesis states this large random sample is drawn from the population that has a mean of 1000. If the null hypothesis is true, we then can conclude that the sample mean approximately follows normal distribution (𝑋~𝑁(1000, 402 100 )) 95 A consumer advocacy group thinks a manufacturer of light bulbs is mistaken in their claim that their bulbs on average provide 1000 hours of light. They believe the light bulbs are defective. To test this, they collect a random sample of n = 100 light bulbs and observe a sample mean of 987 hours. Assume standard deviation of all light bulbs is 40 hours.
  • 96. Step 2: state alternative hypothesis  Two-sided hypothesis testing (test whether the lifetime differs from 1000 hours of light)  HA: μ ≠ 1000  One-sided hypothesis testing (test whether the light bulbs provide less than 1000 hours of light)  HA: μ < 1000 96 The hypotheses concern the population parameter; the null hypothesis is the reverse of what the experimenter believes
  • 97. Step 3: choose a significance level  α = 0.1, 0.05, or 0.01  Example: α = 0.05  The corresponding two-sided critical z values are 1.65, 1.96, and 2.58 respectively 97 A result said to be significant at the 5% level means the result would be unexpected if the null hypothesis were true.
  • 98. Step 4: calculate test statistic  If H0 is true, the CLT gives X̄ ~ N(1000, 40²/100)  Test statistic (z score): z_test = (x̄ − μ0)/(σ/√n) = (987 − 1000)/(40/10) = −3.25 98 This holds if a large random sample is drawn from a population that has a mean of μ0 and a standard deviation of σ
  • 99. Step 5: find critical value and region of rejection  Two-sided, HA: μ ≠ 1000  Example: ±z_{α/2} = ±1.96  One-sided, HA: μ < 1000 (μ < μ0)  Example: −z_α = −1.65 99
  • 100. Step 6: make a decision  Two-sided: reject the null hypothesis if z_test > z_{α/2} or z_test < −z_{α/2}  One-sided: reject the null hypothesis if z_test > z_α for HA: μ > μ0, or z_test < −z_α for HA: μ < μ0 100
  • 101. Step 6: make a decision  Example: z_test = −3.25 and −z_α = −1.65, so z_test < −z_α. Therefore, we can reject the null hypothesis: the lifetime of the light bulbs is significantly less than 1000 hours at α = 0.05. 101
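The six-step z-test above can be sketched in a few lines of Python for the light bulb example (using 1.65 as the one-sided critical value, as on the slides):

```python
import math

# Light bulb example: n = 100 bulbs, sample mean 987 h,
# claimed mean 1000 h, population standard deviation 40 h.
x_bar, mu0, sigma, n = 987, 1000, 40, 100

# Step 4: test statistic
z_test = (x_bar - mu0) / (sigma / math.sqrt(n))   # -3.25

# Steps 5-6: one-sided test (HA: mu < 1000) at alpha = 0.05,
# reject H0 if z_test < -z_alpha = -1.65
z_alpha = 1.65
reject_h0 = z_test < -z_alpha

print(z_test, reject_h0)   # -3.25 True
```

Since −3.25 falls well inside the rejection region, the code reaches the same decision as the slide: reject H0.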
  • 102. What is hypothesis testing?  Now let us take a deeper look at hypothesis testing  Hypothesis  A proposition whose truth or falsity is capable of being tested 102
  • 103. Errors in Hypothesis Testing  Type I Error  “False Positive”  Rejecting a true null hypothesis  The likelihood of making a Type I error is denoted by α, referred to as the significance level  Type II Error  “False Negative”  Failing to reject a false null hypothesis  The likelihood of making a Type II error is denoted by β 103
  • 104. Errors in Hypothesis Testing (cont.) 104 We want to control Type I error more than Type II: in most cases, a Type I error leads to more severe consequences. Take a trial as an example, where the null hypothesis (H0) is an assumption of innocence. In a Type I error, H0 is true (the person is innocent) but H0 is rejected (the person is deemed guilty). This creates two consequences: (1) an innocent person is deemed guilty; (2) the real criminal remains free. In a Type II error, H0 is false (the person is guilty) but H0 is accepted (the person is deemed innocent). The consequence: the criminal is deemed innocent and released. Type I error is also known as a false positive; Type II error is also known as a false negative
  • 105. Controlling Type I Error  It is almost always impossible to simultaneously minimize the probability of both types of errors.  Classical hypothesis testing adopts the strategy of controlling α.  By making α small, we have a small probability of making an error when we reject the null hypothesis.  If we have evidence to reject the null hypothesis, we can be confident in our analysis.  The null hypothesis should be something we want to reject, rather than something we want to confirm. 105
  • 106. One-sample t-test  Used when the population standard deviation σ is unknown and the sample size is small  Test statistic: T = (X̄ − μ0)/(S/√n)  When H0 is true (the sample is drawn from the specified population with mean μ0), the T random variable follows a Student’s t-distribution with df = n − 1 106
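A minimal sketch of the one-sample t statistic, using a small hypothetical sample of bulb lifetimes (the eight values below are invented for illustration, not from the slides):

```python
import math
import statistics

# Hypothetical small sample of bulb lifetimes (n = 8), testing H0: mu = 1000
sample = [985, 1002, 978, 991, 1010, 969, 995, 988]
n = len(sample)

x_bar = statistics.mean(sample)   # sample mean
s = statistics.stdev(sample)      # sample std dev (n - 1 in the denominator)

# T = (X-bar - mu0) / (S / sqrt(n)), with df = n - 1
t_stat = (x_bar - 1000) / (s / math.sqrt(n))
df = n - 1                        # 7 degrees of freedom

print(round(t_stat, 2), df)
```

Because σ is estimated by S, the critical value comes from the t-distribution with df = 7 rather than the standard normal; in practice a library routine such as SciPy’s one-sample t-test would also return the p-value directly.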
  • 107. Limitations of classic hypothesis testing  The significance level must be selected a priori, and the choice is often arbitrary and lacks a theoretical basis  The final decision regarding the null and alternative hypotheses is binary:  H0 is rejected or not rejected  A more flexible method is needed  What is the exact significance level associated with the test statistic? 107
  • 108. p-value  The probability of getting a test statistic value as extreme as, or more extreme than, the one observed by chance, if the null hypothesis H0 is true  If the null hypothesis is rejected, the p-value is the probability of having made a Type I error (rejecting a true null hypothesis)  The smaller the p-value, the stronger the evidence for rejecting the null hypothesis 108 Typically, we can reject the null hypothesis when the p-value is less than 10% (a loose standard); 5% is common for spatial analysis
  • 109. Determining p-value  Use the calculated z or t test statistic to determine the p-value  The p-value corresponds to the shaded area under the standard normal (or t) curve 109
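A sketch of converting a z test statistic into a p-value using only the standard library (here with the z = −3.25 computed earlier for the light bulb example):

```python
import math

def norm_cdf(z):
    """Standard normal CDF, expressed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

z_test = -3.25   # test statistic from the light bulb example

# One-sided p-value: P(Z <= -3.25), the lower-tail shaded area
p_one_sided = norm_cdf(z_test)

# Two-sided p-value: probability in both tails
p_two_sided = 2 * norm_cdf(-abs(z_test))

print(round(p_one_sided, 4), round(p_two_sided, 4))
```

The one-sided p-value is well below 0.01, so the conclusion holds even at the stricter α = 0.01 level.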
  • 110. Light Bulbs Example  What is the p-value of the light bulb test? Would you reject the null hypothesis at the α = 0.01 significance level? 110
  • 111. But when to reject the Null Hypothesis? 111 When the p-value of our test falls below the chosen significance level (that is, the test statistic falls within the shaded rejection region), we can reject H0. Say the p-value is 0.006: we can reject the null hypothesis at the p < 0.1, p < 0.05, and p < 0.01 levels. Say the p-value is 0.02: we can reject at the p < 0.1 and p < 0.05 levels, but not at the p < 0.01 level. Say the p-value is 0.06: we can reject only at the p < 0.1 level, not at the p < 0.05 or p < 0.01 levels. Say the p-value is 0.2: we cannot reject the null hypothesis at any of the three levels (p < 0.1, p < 0.05, p < 0.01).
  • 112. Useful graph/area plotting URL http://www.statdistributions.com/normal/ http://www.statdistributions.com/t/ 112
  • 113. 113 Null hypothesis (what you want to reject): no difference / no change / equal to “=” Alternative hypothesis (the research interest you want to accept): Two-sided: different / not equal “≠” Left-sided: smaller than / less than “<” Right-sided: larger than / greater than “>” No sample statistics appear in either H0 or HA (H1)