Statistics (Recap)
Finance & Management Students
Farzad Javidanrad
October 2013
University of Nottingham-Business School
Probability
• Some Preliminary Concepts:
 Random: Something that happens (occurs) by chance.
 Population: The set of all possible outcomes of a random experiment,
or a collection of all members of a specific group under study. This
collection forms a space from which all possible samples can be
derived. For that reason it is sometimes called the sample space.
 Sample: Any subset of population (sample space).
In tossing a die:
A random event is the appearance of any particular face of the die.
The population (sample space) is the set {1, 2, 3, 4, 5, 6}.
A sample is any subset of the set above, such as {3} or {2, 4, 6}.
Probability
• Two events are mutually exclusive if they cannot happen together:
the occurrence of one of them prevents the occurrence of the other.
For example, if the baby is a boy it cannot be a girl, and vice versa.
• Two events are independent if the occurrence of one of them has no
effect on the chance of occurrence of the other. For example, the
result of rolling a die has no impact on the outcome of flipping a
coin. But in the experiment of taking two cards consecutively from a
set of 52 cards (if the cards can be chosen equally likely) the chance
of getting the second card is affected by the result of the first card.
• Two events are exhaustive if together they include all possible
outcomes. For example, in rolling a die, the events of getting an odd
number and getting an even number are exhaustive.
Probability
• If event 𝑨 can happen in 𝒎 different ways out of 𝒏 equally likely
ways, the probability of event 𝑨 can be shown as its relative
frequency; i.e. :
$P(A) = \frac{m}{n} = \frac{\text{No. of ways that event } A \text{ can occur}}{\text{Total number of equally likely possible outcomes}}$
[Venn diagram: sample space U containing event A and its complement A′]
U: sample space (population)
A: an event (sample)
A′: the event mutually exclusive with A
A & A′ are collectively exhaustive
Probability
• As 0 ≤ 𝑚 ≤ 𝑛 it can be concluded that
$0 \le \frac{m}{n} \le 1 \quad\text{or}\quad 0 \le P(A) \le 1$
• 𝑃 𝐴 = 0 means that event 𝐴 cannot happen and 𝑃 𝐴 = 1
means that the event will happen with certainty.
• With the definition of 𝐴′ as an event of “non-occurrence” of event
𝐴, we can find that:
$P(A') = \frac{n-m}{n} = 1 - \frac{m}{n} = 1 - P(A)$
Or
$P(A) + P(A') = 1$
Probability of Multiple Events
• If 𝑨 and 𝑩 are not mutually exclusive events, the probability that at
least one of them happens (𝑨 or 𝑩) can be calculated as follows:
𝑷 𝑨 ∪ 𝑩 = 𝑷 𝑨 + 𝑷 𝑩 − 𝑷(𝑨 ∩ 𝑩)
[Venn diagram: overlapping events A and B; $P(A \cup B)$ is "A or B" and the overlap $P(A \cap B)$ is "A and B"]
Probability of Multiple Events
In case we are dealing with three events:
$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$
[Venn diagram: three overlapping events A, B and C; the triple overlap is $P(A \cap B \cap C)$]
Probability of Multiple Events
• Considering 𝑷 𝑨 ∪ 𝑩 = 𝑷 𝑨 + 𝑷 𝑩 − 𝑷(𝑨 ∩ 𝑩) we can have the
following situations:
1. If 𝑨 and 𝑩 are mutually exclusive events, then :
𝑷 𝑨 ∩ 𝑩 = 𝟎
2. If 𝑨 and 𝑩 are two independent events, then:
𝑷 𝑨 ∩ 𝑩 = 𝑷(𝑨) × 𝑷(𝑩)
3. If 𝑨 and 𝑩 are dependent events, then:
$P(A \cap B) = P(A) \times P(B \mid A) = P(B) \times P(A \mid B)$
Where $P(A \mid B)$ and $P(B \mid A)$ are conditional probabilities; $P(A \mid B)$ means the
probability of event $A$ given that event $B$ has already happened.
Probability of Multiple Events
o The probability of picking at random a Heart or a Queen on a single
experiment from a card deck of 52 is:
$P(H \cup Q) = P(H) + P(Q) - P(H \cap Q) = \frac{13}{52} + \frac{4}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}$
o The probability of getting a 1 or a 4 on a single toss of a fair die is:
$P(1 \cup 4) = P(1) + P(4) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$
As they cannot happen together they are mutually exclusive events
and 𝑃 1 ∩ 4 = 0.
o The probability of having two heads in the experiment of tossing
two fair coins is: (two independent events)
$P(H \cap H) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$
Probability of Multiple Events
o The probability of picking two aces without returning the first card
to the deck of 52 playing cards, which involves a conditional
probability, is:
$P(1\text{st ace} \cap 2\text{nd ace}) = P(1\text{st ace}) \times P(2\text{nd ace} \mid 1\text{st ace})$
Or, written more compactly:
$P(A_1 \cap A_2) = P(A_1) \times P(A_2 \mid A_1) = \frac{4}{52} \times \frac{3}{51} = \frac{1}{221}$
• If two events $A$ and $B$ are independent from each other then:
$P(A \mid B) = P(A)$ and $P(B \mid A) = P(B)$
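Results like the two-aces calculation are easy to check by simulation. Below is a minimal Python sketch (not part of the original slides; the names deck and trials are illustrative) that estimates $P(A_1 \cap A_2)$ by repeatedly drawing two cards without replacement:

```python
import random

# Monte Carlo check of P(1st ace and 2nd ace) = 4/52 * 3/51 = 1/221 ~ 0.00452
rng = random.Random(42)
deck = ["ace"] * 4 + ["other"] * 48      # a 52-card deck with 4 aces
trials = 500_000
hits = sum(1 for _ in range(trials)
           if all(c == "ace" for c in rng.sample(deck, 2)))
print(hits / trials, 1 / 221)            # the two values should be close
```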
Random Variable & Probability Distribution
Some Basic Concepts:
• Variable: A letter (symbol) which represents the elements of a
specific set.
• Random Variable: A variable whose values appear randomly,
according to a probability distribution.
• Probability Distribution: A rule (function) that assigns a probability
to each value of a random variable.
• Variables (including random variables) are divided into two general
categories:
1) Discrete Variables, and
2) Continuous Variables
Random Variable & Probability Distribution
• A discrete variable is a variable whose elements (values) can be put
in one-to-one correspondence with the set of natural numbers or a
subset of it. So it is possible to order and count its elements
(values). The number of elements can be finite or infinite.
• For a discrete variable it is not possible to define a neighbourhood,
however small, around any value in its domain. There is a jump from
one value to the next.
• If the elements of the domain of a variable can be put in
correspondence with the set of real numbers or a subset of it, the
variable is called continuous. It is not possible to order and count the
elements of a continuous variable. A variable is continuous if a
neighbourhood, however small, can be defined around any value in its domain.
Random Variable & Probability Distribution
• Probability Distribution: A rule (function) that associates a
probability either to all possible elements of a random variable (RV)
individually or a set of them in an interval.*
• For a discrete RV this rule associates a probability to each possible
individual outcome. For example, the probability distributions for the
number of Heads ($x$) when flipping a fair coin: (Note: $\sum P_i = 1$)
In one trial {H, T}:
x:     0    1
P(x):  0.5  0.5
In two trials {HH, HT, TH, TT}:
x:     0     1    2
P(x):  0.25  0.5  0.25
o The probability distribution for the change in the price of a share in the stock market in one day:
x = price change:  +1   0    −1
P(x):              0.6  0.1  0.3
Probability Distributions (Continuous)
• The probability that a continuous random variable takes exactly
one of the values in its domain is zero, because the number of all
possible outcomes $n$ is infinite and $\frac{m}{n} = \frac{m}{\infty} \to 0$.
• For the above reason, the probability for a continuous random
variable needs to be calculated over an interval.
• The probability distribution of a continuous random variable is
often called a probability density function (PDF), or simply a
probability function; it is usually shown by $f(x)$ and has the
following properties:
I. $f(x) \ge 0$ (similar to $P(x) \ge 0$ for a discrete RV*)
II. $\int_{-\infty}^{+\infty} f(x)\, dx = 1$ (similar to $\sum P(x) = 1$ for a discrete RV)
III. $\int_{a}^{b} f(x)\, dx = P(a \le x \le b) = F(b) - F(a)$ (the probability
given to the set of values in an interval $[a, b]$)**
Probability Distributions (Continuous)
• where $F(x)$ is the integral of the PDF $f(x)$ and is called the
Cumulative Distribution Function (CDF); for any real value of $x$ it is
defined as:
𝐹(𝑥) ≡ 𝑃(𝑋 ≤ 𝑥)
The CDF shows the area under the PDF $f(x)$ from $-\infty$ to $x$. For a
discrete random variable, the CDF is the sum of all probabilities up
to and including the value $x$.
Adopted from http://beyondbitsandatomsblog.stanford.edu/spring2010/tag/embodied-artifacts/
[Figure: graphs of the PDF $f(x)$ and the CDF $F(x) \equiv P(X \le x)$]
Some Characteristics of Probability Distributions
• Expected Value (Probabilistic Mean Value): It is one of the most
important measures which shows the central tendency of the
distribution. It is the weighted average of all possible values of
random variable 𝒙 and it is shown by 𝑬(𝒙).
• For a discrete RV (with $n$ possible outcomes):
$E(x) = x_1 P(x_1) + x_2 P(x_2) + \dots + x_n P(x_n) = \sum_{i=1}^{n} x_i P(x_i)$
• For a continuous RV:
$E(x) = \int_{-\infty}^{+\infty} x\, f(x)\, dx$
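As an illustration, the expected value of the share-price distribution shown earlier (x = +1, 0, −1 with probabilities 0.6, 0.1, 0.3) follows directly from the discrete formula; a minimal Python sketch:

```python
# E(x) = sum of x_i * P(x_i), using the share-price distribution above
values = [1, 0, -1]
probs  = [0.6, 0.1, 0.3]
E_x = sum(x * p for x, p in zip(values, probs))
print(E_x)  # 0.3: the expected daily price change
```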
Some Characteristics of Probability Distributions
• Properties of $E(x)$:
i. If $c$ is a constant then $E(c) = c$.
ii. If $a$ and $b$ are constants then $E(ax + b) = aE(x) + b$.
iii. If $a_1, \dots, a_n$ are constants then
$E(a_1 x_1 + \dots + a_n x_n) = a_1 E(x_1) + \dots + a_n E(x_n)$, or
$E\left(\sum_{i=1}^{n} a_i x_i\right) = \sum_{i=1}^{n} a_i E(x_i)$
iv. If $x$ and $y$ are independent random variables then
$E(xy) = E(x) \cdot E(y)$
Some Characteristics of Probability Distributions
v. If $g(x)$ is a function of random variable $x$ then
$E[g(x)] = \sum g(x)\, P(x)$ (for a discrete RV)
$E[g(x)] = \int g(x)\, f(x)\, dx$ (for a continuous RV)
• Variance: To measure how the random variable $x$ is dispersed around
its expected value, variance can help. If we denote $E(x) = \mu$, then
$var(x) = \sigma^2 = E[(x - E(x))^2] = E[(x - \mu)^2]$
$= E[x^2 - 2x\mu + \mu^2]$
$= E(x^2) - 2\mu E(x) + \mu^2$
$= E(x^2) - \mu^2$
Some Characteristics of Probability Distributions
$var(x) = \sum_{i=1}^{n} (x_i - \mu)^2 P(x_i)$ (for a discrete RV)
$var(x) = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\, dx$ (for a continuous RV)
• Properties of Variance:
i. If $c$ is a constant then $var(c) = 0$.
ii. If $a$ and $b$ are constants then $var(ax + b) = a^2\, var(x)$.
iii. If $x$ and $y$ are independent random variables then
$var(x \pm y) = var(x) + var(y)$ (this can be extended to more variables)
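Continuing the same share-price example, a short sketch computing var(x) both from the definition and from the shortcut $E(x^2) - \mu^2$ derived above:

```python
values = [1, 0, -1]
probs  = [0.6, 0.1, 0.3]
mu = sum(x * p for x, p in zip(values, probs))                      # E(x) = 0.3
var_def      = sum((x - mu)**2 * p for x, p in zip(values, probs))  # definition
var_shortcut = sum(x * x * p for x, p in zip(values, probs)) - mu**2
print(var_def, var_shortcut)  # both 0.81, confirming var(x) = E(x^2) - mu^2
# property ii above: var(2x + 5) = 2^2 * var(x) = 3.24 (the constant drops out)
```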
• Some of the well-known probability distributions are:
• The Binomial Distribution:
1. The probability of the occurrence of an event is $p$ and does not change.
2. The experiment is repeated for 𝒏 times.
3. The probability that out of 𝒏 times, the event appears 𝒙 times is:
$P(x) = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}$
The mean value and standard deviation of the binomial distribution are:
$\mu = \sum_{i=0}^{n} x_i P(x_i) = np$ and $\sigma = \sqrt{\sum_{i=0}^{n} (x_i - \mu)^2 P(x_i)} = \sqrt{np(1-p)}$
So, to show that the probability distribution of the random variable $X$
is binomial we can write: $X \sim Bi(np,\ np(1-p))$
Probability Distributions (Discrete RV)
• A gambler thinks his chance of getting a 1 when rolling a die is high. What
is the chance of getting four 1s out of six rolls of a fair die?
The probability of getting a 1 in an individual trial is $\frac{1}{6}$ and it
remains the same in all 6 trials. So,
$P(x = 4) = \frac{6!}{4!\,2!}\left(\frac{1}{6}\right)^4\left(\frac{5}{6}\right)^2 = \frac{375}{46656} \approx 0.008 \approx 0.8\%$
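This value is easy to verify with the formula or a library routine; a minimal sketch (assuming SciPy is available):

```python
from math import comb
from scipy.stats import binom

n, p = 6, 1/6
print(comb(6, 4) * (1/6)**4 * (5/6)**2)   # 0.00804 = 375/46656, by the formula
print(binom.pmf(4, n, p))                 # same value from the library
print(binom.mean(n, p), binom.std(n, p))  # np = 1.0, sqrt(np(1-p)) ~ 0.913
```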
• The Poisson Distribution:
1. It is used to calculate the probability of a given number of occurrences
of a desired event (number of successes) in a specific period of time.
2. The average number of desired events (number of successes) per unit of
time remains constant.
• So, the probability of having $x$ successes is calculated by:
$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$
Where $\lambda$ is the average number of successes in the specific period of time and
$e \approx 2.718$.
• The mean value and standard deviation of the Poisson distribution are:
$\mu = \sum x_i P(x_i) = \lambda$ and $\sigma = \sqrt{\sum (x_i - \mu)^2 P(x_i)} = \sqrt{\lambda}$
So, to show that the probability distribution of the random variable 𝑋 is
Poisson we can write: 𝑿~Poi(𝝀, 𝝀).
o The emergency section in a hospital receives 2 calls per half an hour (4
calls in an hour). The probability of getting just 2 calls in a randomly
chosen hour in a random day is:
$P(x = 2) = \frac{4^2 e^{-4}}{2!} = 0.146 \approx 15\%$
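The hospital-call figure can be reproduced either from the formula or with a library call; a short sketch (assumes SciPy):

```python
from math import exp, factorial
from scipy.stats import poisson

lam, x = 4, 2
print(lam**x * exp(-lam) / factorial(x))  # 0.1465, by the formula
print(poisson.pmf(x, lam))                # same value from the library
```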
The Normal Distribution (Continuous RV)
• The Normal Distribution: It is the best-known probability
distribution and describes many random variables encountered in
practice. The probability density function (PDF) of the normal
distribution is:
1. Symmetrical around its mean value (𝝁).
2. Bell-shaped, with two tails approaching the horizontal axis
asymptotically as we move further away from the mean.
Adopted from http://www.pdnotebook.com/2010/06/statistical-tolerance-analysis-root-sum-square/
The Normal Distribution (Continuous RV)
3. The probability density function (PDF) of normal distribution
can be represented by:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \qquad (-\infty < x < +\infty)$
Where 𝝁 and 𝝈 are mean and standard deviation respectively.
$\mu = \int_{-\infty}^{+\infty} x\, f(x)\, dx$ and $\sigma = \sqrt{\int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\, dx}$
So, $X \sim N(\mu, \sigma^2)$.
• A linear combination of independent normally distributed random
variables is itself normally distributed, that is,
If $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$ and if $Z = aX + bY$ then
$Z \sim N(a\mu_1 + b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$
• This can be extended to more than two random variables.
The Normal Distribution (Continuous RV)
• Recalling the last property of the PDF ($\int_a^b f(x)\,dx = P(a \le x \le b)$), it is
difficult to calculate probabilities using the above PDF for different
values of $\mu$ and $\sigma$. The solution to this problem is to transform the
normal variable $x$ into the standardised normal variable (or simply,
standard normal variable) $z$, by:
$z = \frac{x - \mu}{\sigma}$
whose parameters ($\mu$ and $\sigma^2$) do not depend on the parameters of the
original normally distributed variable, because we always have
$E(z) = 0$ and $var(z) = 1$ (why?)
• The probability distribution for the standard normal variable is defined as:
$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}, \qquad Z \sim N(0, 1)$
[Figure: standardising $X \sim N(\mu, \sigma^2)$ into $Z \sim N(0, 1)$. Adopted and amended from http://www.mathsisfun.com/data/standard-normal-distribution.html]
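Besides applying the properties of E and var from the earlier slides, one can see $E(z) = 0$ and $var(z) = 1$ numerically; a minimal simulation sketch (assumes NumPy; the parameter values are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 50, 8                     # any illustrative parameter values
x = rng.normal(mu, sigma, 1_000_000)
z = (x - mu) / sigma                  # the standardising transformation
print(z.mean(), z.var())              # ~0 and ~1, whatever mu and sigma are
```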
The Standard Normal Distribution
• Properties of the standard normal distribution curve:
1. It is symmetrical around the y-axis.
2. The area under the curve can be split into two equal areas, that is:
$\int_{-\infty}^{0} f(z)\, dz = \int_{0}^{+\infty} f(z)\, dz = 0.5$
• To find the area under the curve below $z_1 = 1.26$, using the z-table (next slide), we have:
$P(z \le z_1 = 1.26) = \int_{-\infty}^{0} f(z)\, dz + \int_{0}^{z_1} f(z)\, dz = 0.5 + 0.3962 = 0.8962 \approx 90\%$
[Figure: standard normal curve $f(z)$ split into equal 50% halves; the area 0.5 + 0.3962 is shaded up to $z_1 = 1.26$]
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
Working with the Z-Table
• To find the probability
$P(0.89 < z < 1.5) = \int_{0}^{z_2} f(z)\, dz - \int_{0}^{z_1} f(z)\, dz = F(1.5) - F(0.89) = 0.4332 - 0.3133 = 0.1199 \approx 12\%$
as both values are positive.
• To find the probability in the negative area we
need to find the equivalent area in the positive side:
$P(-1.32 < z < -1.25) = P(1.25 < z < 1.32) = F(1.32) - F(1.25) = 0.4066 - 0.3944 = 0.0122 \approx 1\%$
[Figure: shaded area between 0.89 and 1.5 under the standard normal curve]
Working with the Z-Table
• To find $P(z < -2.15)$ we can write:
$\int_{-\infty}^{-2.15} f\, dz = \int_{-\infty}^{0} f\, dz - \int_{-2.15}^{0} f\, dz = 0.5 - 0.4842 = 0.0158 \approx 2\%$
where, by symmetry, $\int_{-2.15}^{0} f\, dz = \int_{0}^{2.15} f\, dz = 0.4842$.
• And finally, to find $P(z \ge 1.93)$, we have:
$\int_{1.93}^{+\infty} f\, dz = \int_{0}^{+\infty} f\, dz - \int_{0}^{1.93} f\, dz = 0.5 - 0.4732 = 0.0268$
[Figures: left-tail area below $-2.15$ and right-tail area above $1.93$]
An Example
o If the income of employees in a big company is normally distributed
with $\mu = £20000$ and $\sigma = £4000$, what is the probability that a
randomly picked employee has an income
a) above £22000, b) between £16000 and £24000?
a) We need to transform $x$ to $z$ first:
$P(x > 22000) = P\left(\frac{x - 20000}{4000} > \frac{22000 - 20000}{4000}\right) = P(z > 0.5) = 0.5 - 0.1915 = 0.3085 \approx 31\%$
b) $P(16000 < x < 24000) = P\left(\frac{16000 - 20000}{4000} < \frac{x - 20000}{4000} < \frac{24000 - 20000}{4000}\right)$
$= P(-1 < z < 1) = 0.3413 + 0.3413 = 0.6826 \approx 68\%$
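The same two probabilities follow directly from the normal CDF, avoiding the table; a minimal sketch (assumes SciPy):

```python
from scipy.stats import norm

mu, sigma = 20_000, 4_000
print(norm.sf(22_000, mu, sigma))   # P(x > 22000) ~ 0.3085
print(norm.cdf(24_000, mu, sigma) - norm.cdf(16_000, mu, sigma))  # ~0.6827
```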
The χ² (Chi-Squared) Distribution
• The χ² (Chi-Squared) Distribution:
Let $Z_1, Z_2, \dots, Z_k$ be $k$ independent standardised normally distributed
random variables; then the sum of their squares
$X = \sum_{i=1}^{k} Z_i^2$
has a Chi-Squared distribution with degrees of freedom equal to the
number of random variables ($df = k$). So, $X \sim \chi^2_k$.
The mean value and standard deviation of a RV with a Chi-Squared
distribution are $k$ and $\sqrt{2k}$ respectively. So we can write:
$X \sim \chi^2_k(k, 2k)$
Probability Density Function (PDF) of the χ² Distribution
[Figure adopted from http://2012books.lardbucket.org/books/beginning-statistics/s15-chi-square-tests-and-f-tests.html; table adopted from http://www.docstoc.com/docs/80811492/chi--square-table]
From the χ² table: $P(\chi^2 \ge 32 \mid df = 16) = 0.01$, or $\chi^2_{0.01,\,16} = 32$
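The tabulated value can be checked with the inverse survival function; a short sketch (assumes SciPy):

```python
from scipy.stats import chi2

df = 16
print(chi2.isf(0.01, df))           # ~32.0: the critical value chi2_{0.01,16}
print(chi2.sf(32, df))              # ~0.01: the upper-tail probability at 32
print(chi2.mean(df), chi2.std(df))  # mean k = 16, sd sqrt(2k) ~ 5.66
```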
The t-Distribution
• If $Z \sim N(0, 1)$ and $X \sim \chi^2_k$, and the two random variables
$Z$ and $X$ are independent, then the random variable
$t = \frac{Z}{\sqrt{X/k}} = \frac{Z\sqrt{k}}{\sqrt{X}}$
follows Student's t-distribution (the t-distribution) with $k$ degrees of
freedom. For a sample of size $n$ we have $df = k = n - 1$.
• The mean value and standard deviation of this distribution are
$\mu = \begin{cases} 0 & n > 2 \\ \text{undefined} & n = 1, 2 \end{cases}$ and $\sigma = \begin{cases} \sqrt{\frac{n-1}{n-3}} & n > 3 \\ \infty & n = 3 \\ \text{undefined} & n = 1, 2 \end{cases}$
The t-Distribution
• The t-distribution, like the standard normal distribution, is a bell-
shaped and symmetrical distribution with zero mean (n > 2), but it is
flatter. As the degrees of freedom increase (i.e. as $n$ increases) it
approaches the standard normal distribution, and for $n \ge 30$ their
behaviours are similar.
• From the table (next slide):
$P(t \ge 1.706 \mid df = 26) = 0.05 \approx 5\%$, or $t_{0.05,\,26} = 1.706$
[Figure: t-distribution with the 5% right tail beyond $t = 1.706$ shaded. Adopted from http://education-portal.com/academy/lesson/what-is-a-t-test-procedure-interpretation-examples.html#lesson]
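A sketch checking the tabulated value and the convergence to the standard normal (assumes SciPy):

```python
from scipy.stats import t, norm

print(t.isf(0.05, 26))    # ~1.706, matching t_{0.05,26} in the table
print(t.isf(0.05, 1000))  # ~1.646: with large df the t critical value...
print(norm.isf(0.05))     # ~1.645: ...approaches the standard normal one
```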
df 0.20 0.15 0.10 0.05 0.025 0.01 0.005 0.0025 0.001 0.0005
1 1.376 1.963 3.078 6.314 12.706 31.821 63.656 127.321 318.289 636.578
2 1.061 1.386 1.886 2.920 4.303 6.965 9.925 14.089 22.328 31.600
3 0.978 1.250 1.638 2.353 3.182 4.541 5.841 7.453 10.214 12.924
4 0.941 1.190 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610
5 0.920 1.156 1.476 2.015 2.571 3.365 4.032 4.773 5.894 6.869
6 0.906 1.134 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959
7 0.896 1.119 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408
8 0.889 1.108 1.397 1.860 2.306 2.896 3.355 3.833 4.501 5.041
9 0.883 1.100 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781
10 0.879 1.093 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587
11 0.876 1.088 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437
12 0.873 1.083 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318
13 0.870 1.079 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221
14 0.868 1.076 1.345 1.761 2.145 2.624 2.977 3.326 3.787 4.140
15 0.866 1.074 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073
16 0.865 1.071 1.337 1.746 2.120 2.583 2.921 3.252 3.686 4.015
17 0.863 1.069 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965
18 0.862 1.067 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922
19 0.861 1.066 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883
20 0.860 1.064 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850
21 0.859 1.063 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819
22 0.858 1.061 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792
23 0.858 1.060 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.768
24 0.857 1.059 1.318 1.711 2.064 2.492 2.797 3.091 3.467 3.745
25 0.856 1.058 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725
26 0.856 1.058 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707
27 0.855 1.057 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.689
28 0.855 1.056 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674
29 0.854 1.055 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.660
30 0.854 1.055 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646
31 0.853 1.054 1.309 1.696 2.040 2.453 2.744 3.022 3.375 3.633
32 0.853 1.054 1.309 1.694 2.037 2.449 2.738 3.015 3.365 3.622
33 0.853 1.053 1.308 1.692 2.035 2.445 2.733 3.008 3.356 3.611
34 0.852 1.052 1.307 1.691 2.032 2.441 2.728 3.002 3.348 3.601
35 0.852 1.052 1.306 1.690 2.030 2.438 2.724 2.996 3.340 3.591
36 0.852 1.052 1.306 1.688 2.028 2.434 2.719 2.990 3.333 3.582
37 0.851 1.051 1.305 1.687 2.026 2.431 2.715 2.985 3.326 3.574
38 0.851 1.051 1.304 1.686 2.024 2.429 2.712 2.980 3.319 3.566
39 0.851 1.050 1.304 1.685 2.023 2.426 2.708 2.976 3.313 3.558
40 0.851 1.050 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551
50 0.849 1.047 1.299 1.676 2.009 2.403 2.678 2.937 3.261 3.496
60 0.848 1.045 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460
80 0.846 1.043 1.292 1.664 1.990 2.374 2.639 2.887 3.195 3.416
100 0.845 1.042 1.290 1.660 1.984 2.364 2.626 2.871 3.174 3.390
150 0.844 1.040 1.287 1.655 1.976 2.351 2.609 2.849 3.145 3.357
Infinity 0.842 1.036 1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.290
The F Distribution
• If $Z_1 \sim \chi^2_{k_1}$ and $Z_2 \sim \chi^2_{k_2}$, and $Z_1$ and $Z_2$ are independent, then the
random variable
$F = \frac{Z_1 / k_1}{Z_2 / k_2}$
follows the F distribution with $k_1$ and $k_2$ degrees of freedom, i.e.:
$F \sim F_{k_1, k_2}$ or $F \sim F(k_1, k_2)$
• This distribution is skewed to the right, like the Chi-Squared
distribution, but as $k_1$ and $k_2$ increase ($n \to \infty$) it approaches the
normal distribution.
[Figure: F distribution density. Adopted from http://www.vosesoftware.com/ModelRiskHelp/index.htm#Distributions/Continuous_distributions/F_distribution.htm]
The F Distribution
• The mean and standard deviation of the F distribution are:
$\mu = \frac{k_2}{k_2 - 2}$ for $k_2 > 2$, and $\sigma = \frac{k_2}{k_2 - 2}\sqrt{\frac{2(k_1 + k_2 - 2)}{k_1 (k_2 - 4)}}$ for $k_2 > 4$
• Relation of the t and Chi-Squared distributions to the F distribution:
• For a random variable $X \sim t_k$ it can be shown that $X^2 \sim F_{1,k}$. This can
also be written as $t_k^2 = F_{1,k}$ (see the sketch below).
• If $k_2$ is large enough, then $k_1 \cdot F_{k_1, k_2} \sim \chi^2_{k_1}$ approximately.
[F tables for α = 0.25, 0.10, 0.05, 0.025 and 0.01. All adopted from http://www.stat.purdue.edu/~yuzhu/stat514s05/tables.html]
Statistical Inference (Estimation)
• Statistical inference, or statistical induction, is one of the most
important aspects of decision making. It refers to the process of
drawing a conclusion about the unknown parameters of the
population from a sample of randomly chosen data.
• So, the idea is that a sample of randomly chosen data provides the
best information about the parameters of the population, and it can
be considered representative of the population when its size is
reasonably (appropriately) large.
• The first step in statistical inference (induction) is estimation, which
is the process of finding an estimate or approximation of the
population parameters (such as the mean value and standard deviation)
using the data in the sample.
Statistical Inference (Estimation)
• The value of $\bar{X}$ (the sample mean) in a randomly chosen and
appropriately large sample is a good estimator of the population
mean $\mu$. The value of $s^2$ (the sample variance) is also a good
estimator of the population variance $\sigma^2$.
• Before taking any sample from the population (when the sample is not
yet realised or observed) we can talk about the probability distribution
of a hypothetical sample. The probability distribution of a random
variable $x$ in a hypothetical sample follows the probability
distribution of the population, even if the sampling process is
repeated many times.
• But the probability distribution of the sample mean $\bar{X}$ in repeated
sampling does not necessarily follow the probability distribution of
its population.
Central Limit Theorem
• Central Limit Theorem:
Imagine a random variable $X$ with any probability distribution, defined
in a population with mean $\mu$ and variance $\sigma^2$, so $X \sim i.i.d.(\mu, \sigma^2)$
($i.i.d.$ ≡ Independent & Identically Distributed RVs). If we take
$n$ independent samples $X_1, X_2, \dots, X_n$ and for each sample we
calculate the mean values $\bar{X}_1, \bar{X}_2, \dots, \bar{X}_n$ (see figure below):
[Figure: repeated samples $X_1, X_2, \dots, X_n$, each yielding a sample mean $\bar{X}_i$]
Central Limit Theorem
As the sample size $n$ increases infinitely, the random variable $\bar{X}$
has a normal distribution (regardless of the population distribution)
and we have
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ when $n \to +\infty$
And in the standard form:
$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{\sqrt{n}(\bar{X} - \mu)}{\sigma} \sim N(0, 1)$
o Taking a sample of 36 elements from a population with mean 20 and
standard deviation 12, what is the probability that the sample mean
falls between 18 and 24?
$P(18 < \bar{x} < 24) = P\left(-1 < \frac{\bar{x} - 20}{12/\sqrt{36}} < 2\right) = 0.3413 + 0.4772 \approx 82\%$
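A simulation sketch of the CLT for this example: sample means of n = 36 draws from a deliberately non-normal population with μ = 20 and σ = 12 (the shifted exponential is purely an illustrative choice; assumes NumPy and SciPy):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
# shifted exponential: mean 8 + 12 = 20, standard deviation 12, heavily skewed
samples = rng.exponential(scale=12, size=(200_000, 36)) + 8
means = samples.mean(axis=1)                      # one sample mean per row
print(np.mean((means > 18) & (means < 24)))       # ~0.82 by simulation
print(norm.cdf(24, 20, 2) - norm.cdf(18, 20, 2))  # 0.8186 from the CLT
```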
Estimation
• In previous slides we introduced some of the most important
probability distributions for discrete & continuous random variables.
• In many cases we know the nature of the probability distribution of
a random variable defined in a population, but have no idea about
its parameters, such as the mean value and/or standard deviation.
• Point Estimation:
• To estimate the unknown parameters of a probability distribution of
a random variable we can either have a point estimation or an
interval estimation using an estimator.
• The estimator is a function of the sample values $x_1, x_2, \dots, x_n$ and is
often called a statistic. If $\hat{\theta}$ represents that estimator, we have:
$\hat{\theta} = f(x_1, x_2, \dots, x_n)$
Estimation
• $\hat{\theta}$ is said to be an unbiased estimator of the true $\theta$ (the population
parameter) if $E(\hat{\theta}) = \theta$, because the bias itself is defined as
$Bias = E(\hat{\theta}) - \theta$
o For example, the sample mean $\bar{X}$ is a point and unbiased estimator
of the unknown parameter $\mu$ (the population mean):
$\hat{\theta} = \bar{X} = f(x_1, x_2, \dots, x_n) = \frac{1}{n}(x_1 + x_2 + \dots + x_n)$
It is unbiased because $E(\bar{X}) = \mu$.
• The sample variance in the form $s^2 = \frac{\sum (x_i - \bar{x})^2}{n}$ is a point but
biased estimator of the population variance $\sigma^2$ in a small sample:
$E(s^2) = \sigma^2\left(1 - \frac{1}{n}\right) \ne \sigma^2$
But it is a consistent estimator, because it approaches $\sigma^2$ when the
sample size $n$ increases indefinitely ($n \to \infty$).
• With Bessel's correction (changing $n$ to $n - 1$) we can define
another sample variance which is unbiased even for a small sample
size (see the sketch below):
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
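The bias and its removal by Bessel's correction are easy to see numerically; a minimal sketch (assumes NumPy, whose ddof argument chooses the divisor n − ddof):

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(0, 2, size=(200_000, 5))  # sigma^2 = 4, small samples (n = 5)
print(np.var(samples, axis=1, ddof=0).mean())  # ~3.2 = 4*(1 - 1/5): biased
print(np.var(samples, axis=1, ddof=1).mean())  # ~4.0: unbiased (divisor n - 1)
```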
• The main methods for finding point estimators are the least-squares
method and the maximum likelihood method; of these, the first will
be discussed later.
Interval Estimation
• Interval Estimation:
• Interval estimation, in contrast, provides an interval or a range of
possible estimates at a specific level of probability, called the
level of confidence, within which the true value of the population
parameter may lie.
• If $\hat{\theta}_1$ and $\hat{\theta}_2$ are respectively the lowest and highest estimates of $\theta$,
the probability that $\theta$ is covered by the interval $(\hat{\theta}_1, \hat{\theta}_2)$ is:
$\Pr(\hat{\theta}_1 \le \theta \le \hat{\theta}_2) = 1 - \alpha \qquad (0 < \alpha < 1)$
Where $1 - \alpha$ is the level of confidence and $\alpha$ itself is called the level of
significance. The interval $(\hat{\theta}_1, \hat{\theta}_2)$ is called a confidence interval.
Interval Estimation
 How to find $\hat{\theta}_1$ and $\hat{\theta}_2$?
In order to find the lower and upper limits of a confidence interval we need
prior knowledge about the nature of the distribution of the random variable
in the population.
 If the random variable $x$ is normally distributed in the population and the
population standard deviation ($\sigma$) is known, the 95% confidence interval for
the unknown population mean ($\mu$) can be constructed by finding the
symmetric z-values associated with the 95% area under the standard normal
curve:
$1 - \alpha = 95\% \rightarrow \alpha = 5\% \rightarrow \frac{\alpha}{2} = 2.5\%$
So, $\pm Z_{0.025} = \pm 1.96$
We know that $Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$, so:
$P\left(-Z_{\alpha/2} \le Z \le Z_{\alpha/2}\right) = 95\%$
[Figure: standard normal curve with central area $1 - \alpha$ and tails of $\frac{\alpha}{2} = 0.025$ beyond $\pm Z_{\alpha/2}$. Adopted & altered from http://upload.wikimedia.org/wikipedia/en/b/bf/NormalDist1.96.png]
Interval Estimation
• So we can write:
$P(\bar{x} - 1.96\,\sigma_{\bar{x}} \le \mu \le \bar{x} + 1.96\,\sigma_{\bar{x}}) = 0.95$
Or
$P\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$
Therefore, the interval $\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$ represents a 95%
confidence interval ($CI_{95\%}$) for the unknown value of $\mu$.
It means that in repeated random sampling (say, 100 times) we
expect 95 out of 100 such intervals to cover the unknown value of
the population mean $\mu$.
[Figure: many sample confidence intervals $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ around $\mu$. Adopted and altered from http://forums.anarchy-online.com/showthread.php?t=604728]
Interval Estimation for population Proportion
 A confidence interval can also be constructed for the population
proportion (see the graph below), where $X \sim Bi(np,\ np(1-p))$:
[Figure: repeated samples from the population (with mean $\mu$ and variance $\sigma^2$) give sample proportions $\hat{p}_1, \hat{p}_2, \dots, \hat{p}_n$; each $\hat{p}$ represents a sample proportion. In repeated random sampling $\hat{p}$ has its own probability distribution, with mean value and variance:]
$\mu_{\hat{p}} = E(\hat{p}) = p = \frac{\mu}{n}$
$\sigma^2_{\hat{p}} = var(\hat{p}) = \frac{\sigma^2}{n^2} = \frac{p(1 - p)}{n}$
Interval Estimation for population Proportion
• The 90% confidence interval for the population proportion $p$, when
the sample size is bigger than 30 ($n > 30$) and there is no information
about the population variance, is constructed as follows:
$Z = \frac{\hat{p} - p}{\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}}, \qquad P(-Z_{\alpha/2} \le Z \le +Z_{\alpha/2}) = 1 - \alpha$
$P\left(\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \le p \le \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\right) = 0.9$
So, the confidence interval can simply be written as:
$CI_{90\%} = \hat{p} \mp 1.645\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
[Figure: standard normal curve with central area 90% and tails $\frac{\alpha}{2} = 0.05$ at $\pm Z_{\alpha/2} = \pm 1.645$. Adopted and altered from http://www.stat.wmich.edu/s216/book/node83.html]
Obviously, if we had knowledge about the population variance we
would be able to estimate the population proportion $p$ directly. Why?
Examples
o Imagine the weight of people in a society is distributed normally. A
random sample of 25 with sample mean 72 kg is taken from this
society. If the standard deviation of the population is 6 kg, find the
a) 90%, b) 95% and c) 99% confidence intervals for the unknown
population mean.
a) $1 - \alpha = 0.9 \rightarrow \frac{\alpha}{2} = 0.05 \rightarrow Z_{\alpha/2} = 1.645$
So, $CI_{90\%} = 72 \pm 1.645 \times \frac{6}{\sqrt{25}} = (70.03,\ 73.97)$
b) $1 - \alpha = 0.95 \rightarrow \frac{\alpha}{2} = 0.025 \rightarrow Z_{\alpha/2} = 1.96$
So, $CI_{95\%} = 72 \pm 1.96 \times \frac{6}{\sqrt{25}} = (69.65,\ 74.35)$
c) $1 - \alpha = 0.99 \rightarrow \frac{\alpha}{2} = 0.005 \rightarrow Z_{\alpha/2} = 2.58$
So, $CI_{99\%} = 72 \pm 2.58 \times \frac{6}{\sqrt{25}} = (68.9,\ 75.1)$
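The three intervals can be reproduced in a few lines; a sketch (assumes SciPy; the tiny differences at 99% come from using 2.5758 instead of the rounded 2.58):

```python
from math import sqrt
from scipy.stats import norm

xbar, sigma, n = 72, 6, 25
for conf in (0.90, 0.95, 0.99):
    z = norm.isf((1 - conf) / 2)            # 1.645, 1.960, 2.576
    half = z * sigma / sqrt(n)
    print(f"CI{conf:.0%}: ({xbar - half:.2f}, {xbar + half:.2f})")
# CI90%: (70.03, 73.97)  CI95%: (69.65, 74.35)  CI99%: (68.91, 75.09)
```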
Examples
o Samples from one of the production lines in a factory suggest
that 10% of products are defective. If a difference of 1% between
the sample and population proportions is acceptable, what sample
size do we need to construct a 95% confidence interval for the
population proportion? What if the acceptable gap between the
sample & population proportions increased to 3%?
$1 - \alpha = 0.95 \rightarrow \frac{\alpha}{2} = 0.025 \rightarrow Z_{\alpha/2} = 1.96$
$Z_{\alpha/2} = \frac{\hat{p} - p}{\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}} \rightarrow 1.96 = \frac{0.01}{\sqrt{\frac{0.1 \times 0.9}{n}}} \rightarrow n = (196 \times 0.3)^2 \approx 3458$
If the gap increases to 3%, then:
$1.96 = \frac{0.03}{\sqrt{\frac{0.1 \times 0.9}{n}}} \rightarrow n = (196 \times 0.1)^2 \approx 385$
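The same arithmetic as a short sketch (pure Python; the helper name sample_size is illustrative, and the result is rounded up to the next whole unit):

```python
import math

def sample_size(z, p_hat, margin):
    # smallest n with z * sqrt(p(1-p)/n) <= the acceptable margin
    return math.ceil((z / margin) ** 2 * p_hat * (1 - p_hat))

print(sample_size(1.96, 0.10, 0.01))  # 3458
print(sample_size(1.96, 0.10, 0.03))  # 385
```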
Interval Estimation (Using t-distribution)
• If the population standard deviation 𝝈 is unknown and we use
sample standard deviation 𝒔 instead, and the size of the sample is
less than 30 (𝒏 < 𝟑𝟎) then the random variable
$\frac{\bar{x} - \mu}{s / \sqrt{n}} \sim t_{n-1}$
has a t-distribution with $df = n - 1$.
This means a confidence interval for the population mean 𝝁 will be in
the form of:
$CI_{(1-\alpha)} = \left(\bar{x} - t_{\frac{\alpha}{2},\,n-1}\frac{s}{\sqrt{n}},\ \ \bar{x} + t_{\frac{\alpha}{2},\,n-1}\frac{s}{\sqrt{n}}\right)$
[Figure: t-distribution with central area $(1 - \alpha)$ between $-t_{\frac{\alpha}{2},\,n-1}$ and $+t_{\frac{\alpha}{2},\,n-1}$ and tails of $\frac{\alpha}{2}$ each. Adopted and altered from http://cnx.org/content/m46278/latest/?collection=col11521/latest]
Interval Estimation
• The following flowchart can help in choosing between the Z and t
distributions when an interval estimate is constructed for $\mu$ in
the population.
[Flowchart: choosing between the Z and t distributions; when neither applies, use nonparametric methods. Adopted from http://www.expertsmind.com/questions/flow-chart-for-confidence-interval-30112489.aspx]
Interval Estimation
• Here is a list of confidence intervals for various population parameters.
[Table adopted from http://www.bls-stats.org/uploads/1/7/6/7/1767713/250709.image0.jpg]
Hypothesis Testing
• Hypothesis testing is one of the important aspects of statistical inference.
The main idea is to find out whether some claims/statements (in the form of
hypotheses) about population parameters can be statistically rejected by
the evidence from the sample, using a test statistic (a function of the sample).
• Claims are made in the form of a null hypothesis ($H_0$) against an
alternative hypothesis ($H_1$), and they can only be rejected, never
proven. These two hypotheses should be mutually exclusive and
collectively exhaustive. For example:
𝐻0: 𝜇 = 0.8 𝑎𝑔𝑎𝑖𝑛𝑠𝑡 𝐻1: 𝜇 ≠ 0.8
𝐻0: 𝜇 ≥ 2.1 𝑎𝑔𝑎𝑖𝑛𝑠𝑡 𝐻1: 𝜇 < 2.1
$H_0: \sigma^2 \le 0.4$ against $H_1: \sigma^2 > 0.4$
 Always remember that the equality sign comes with 𝐻0.
• If the value of the test statistic lies in the rejection area(s) the null
hypothesis must be rejected, otherwise the sample does not
provide sufficient evidence to reject the null hypothesis.
Hypothesis Testing
• Assuming we know the distribution of the random variable in the
population, and that there is statistical independence between different
random variables, in hypothesis testing we need to follow these
steps:
1. Stating the relevant null & alternative hypotheses. The form of the null
hypothesis (being =, ≥ or ≤ something) indicates how many rejection
regions we will have: for the = sign we have two regions, and for the
others just one region (depending on the difference between the value of
the estimator and the claimed value of the population parameter, the
rejection area could be on the right or the left of the distribution curve).
𝐻0: 𝜇 = 0.5
𝐻1: 𝜇 ≠ 0.5
𝐻0: 𝜇 ≥ 0.5 (𝑜𝑟 𝜇 ≤ 0.5)
𝐻1: 𝜇 < 0.5 (𝑜𝑟 𝜇 > 0.5)
[Graphs adopted from http://www.soc.napier.ac.uk/~cs181/Modules/CM/Statistics/Statistics%203.html]
Hypothesis Testing
2. Identifying the level of significance of the test ($\alpha$), usually taken to
be 5% or 1% depending on the nature of the test and the goals of the
researcher. When $\alpha$ is known, together with prior knowledge about the
sample distribution, the critical region(s) (or rejection area(s)) can be
identified.
[Figure: two critical values for the standard normal distribution associated with the significance levels $\alpha = 5\%$ ($Z_\alpha = 1.65$) and $\alpha = 1\%$ ($Z_\alpha = 2.33$). Adopted from http://www.psychstat.missouristate.edu/introbook/sbk26.htm]
Hypothesis Testing
3. Constructing a test statistic (a function based on the sample distribution &
sample size). This function is used to decide whether or not to reject $H_0$.
[Table: a list of some of the test statistics for testing different hypotheses. Adopted from http://www.bls-stats.org/uploads/1/7/6/7/1767713/250714.image0.jpg]
Hypothesis Testing
4. Taking a random sample from the population and calculating the value of
the test statistic. If the value is in the rejection area, the null hypothesis $H_0$
will be rejected in favour of the alternative $H_1$ at the predetermined
significance level $\alpha$; otherwise the sample does not provide sufficient
evidence to reject $H_0$ (this does not mean that we accept $H_0$).
[Figure adopted from http://www.onekobo.com/Articles/Statistics/03-Hypotheses/Stats3%20-%2010%20-%20Rejection%20Region.htm]
Critical values: $-Z_\alpha$ or $-t_{\alpha,\,df}$ for a left-tail test; $+Z_\alpha$ or $+t_{\alpha,\,df}$ for a right-tail test; $\pm Z_{\alpha/2}$ or $\pm t_{\alpha/2,\,df}$ for a two-tail test.
Example
o A chocolate factory claims that its new tin of cocoa powder contains at
least 500 gr of the powder. A standards-checking agency takes a random
sample of $n = 25$ tins and finds that the sample mean weight of the tins
is $\bar{X} = 520$ gr and the sample standard deviation is $s = 75$ gr. If we
assume the weight of cocoa powder in the tins has a normal distribution,
does the sample provide enough evidence to reject the claim at the 95%
level of confidence?
1. $H_0: \mu \ge 500$ against $H_1: \mu < 500$ (so it is a one-tail, left-tail test)
2. Level of significance $\alpha = 5\% \rightarrow t_{\alpha,\,n-1} = t_{0.05,\,24} = 1.711$ (we use the
t-distribution because $n < 30$ and we have no prior knowledge of the
population standard deviation)
3. The value of the test statistic is: $t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{520 - 500}{75 / \sqrt{25}} = 1.33$
4. The rejection region of this left-tail test is $t < -1.711$. As $1.33 > -1.711$,
the test statistic is not in the rejection area, so the claim cannot be
rejected at the 5% level of significance.
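The cocoa-tin test as a short sketch (assumes SciPy; note the left-tail critical value is the negative of the tabulated one):

```python
from math import sqrt
from scipy.stats import t

xbar, mu0, s, n = 520, 500, 75, 25
t_stat = (xbar - mu0) / (s / sqrt(n))   # 1.33
t_crit = -t.isf(0.05, n - 1)            # -1.711: left-tail critical value
print(t_stat, t_crit, t_stat < t_crit)  # 1.33, -1.711, False -> cannot reject H0
```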
Type I & Type II Errors
• Two types of errors can occur in hypothesis testing:
A. Type I error; when based on our sample we reject a true null
hypothesis.
B. Type II error; when based on our sample we cannot reject a false
null hypothesis.
• By reducing the level of significance $\alpha$ we can reduce the
probability of making a type I error (why?); however, at the same
time, we increase the probability of making a type II error.
• What would happen to type I and type II errors if we increase the
sample size? (Hint: look at the confidence intervals)
[Table adopted from http://whatilearned.wikia.com/wiki/Hypothesis_Testing?file=Type_I_and_Type_II_Error_Table.jpg]
Type I & Type II Errors
• The following graph shows how a change of the critical line (critical
value) changes the probability of making type I and type II errors:
$P(\text{Type I error}) = \alpha$ and $P(\text{Type II error}) = \beta$
[Figure adopted from http://www.weibull.com/hotwire/issue88/relbasics88.htm]
The Power Of a Test:
The power of a test is
the probability that the
test will correctly reject
the null hypothesis. It is
the probability of not
committing type II
error. The power is
equal to 𝟏 − 𝜷 which
means by reducing 𝜷
the power of the test
will increase.
The P-Value
• It is not unusual to reject $H_0$ at some level of significance, for
example $\alpha = 5\%$, but to be unable to reject it at some other
level, e.g. $\alpha = 1\%$. The dependence of the final decision on the
value of $\alpha$ is the weak point of the classical approach.
• In the new approach, we try to find the p-value, which is the lowest
significance level at which $H_0$ can be rejected. If the level of
significance is set at 5% and the lowest significance level at
which $H_0$ can be rejected (the p-value) is 2%, then the null hypothesis
should be rejected; i.e.
$p\text{-value} < \alpha \Rightarrow \text{Reject } H_0$
 To understand this concept better let’s look at an example:
• Suppose we believe that the mean life expectancy of the people in
a city is 75 years ($H_0: \mu = 75$). But our observation shows a sample
mean of 76 years for a sample of size 100 with a sample standard
deviation of 4 years.
The P-Value
• The Z-score (test statistic) can be calculated as follows:
$Z = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{76 - 75}{4 / \sqrt{100}} = 2.5$
• At the 5% level of significance the critical Z-value is 1.96, so we must
reject $H_0$. But we should not have had this result (or should not
have had those observations in our random sample) in the first place
if our assumption about the population mean $\mu$ was correct.
• The p-value is the probability of having this type of result or one
even more extreme (i.e. a Z-score bigger than 2.5), given that the null
hypothesis is correct:
$P(Z \ge 2.5 \mid \mu = 75) = p\text{-value} \approx 0.006$ (it means that in 1000 samples this type of
result can theoretically happen about 6 times; yet it has happened in our first
random sampling).
[Figure: standard normal curve with the tail area $P(Z \ge 2.5) \approx 0.006$ shaded beyond $Z = 2.5$. From http://faculty.elgin.edu/dkernler/statistics/ch10/10-2.html]
The P-Value
• As we cannot deny what we have observed and obtained from the
sample, eventually we need to change our belief about the
population mean and reject our assumption about it.
• The smaller the p-value, the stronger the evidence against $H_0$.
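Computing the p-value of the life-expectancy example directly; a minimal sketch (assumes SciPy; the one-sided tail is used, as on the slide):

```python
from math import sqrt
from scipy.stats import norm

z = (76 - 75) / (4 / sqrt(100))    # 2.5
p_value = norm.sf(z)               # P(Z >= 2.5) ~ 0.0062
print(p_value, p_value < 0.05)     # reject H0 at the 5% significance level
```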

More Related Content

What's hot

Regression analysis
Regression analysisRegression analysis
Regression analysisSohag Babu
 
Simple linear regression (final)
Simple linear regression (final)Simple linear regression (final)
Simple linear regression (final)Harsh Upadhyay
 
Basic concept of probability
Basic concept of probabilityBasic concept of probability
Basic concept of probabilityIkhlas Rahman
 
Eigen values and eigen vectors engineering
Eigen values and eigen vectors engineeringEigen values and eigen vectors engineering
Eigen values and eigen vectors engineeringshubham211
 
Geometric Distribution
Geometric DistributionGeometric Distribution
Geometric DistributionRatul Basak
 
Geometric distributions
Geometric distributionsGeometric distributions
Geometric distributionsUlster BOCES
 
Probability distribution 2
Probability distribution 2Probability distribution 2
Probability distribution 2Nilanjan Bhaumik
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression AnalysisASAD ALI
 
Introduction to probability distributions-Statistics and probability analysis
Introduction to probability distributions-Statistics and probability analysis Introduction to probability distributions-Statistics and probability analysis
Introduction to probability distributions-Statistics and probability analysis Vijay Hemmadi
 
Probability And Probability Distributions
Probability And Probability Distributions Probability And Probability Distributions
Probability And Probability Distributions Sahil Nagpal
 
2.2 laws of probability (1)
2.2 laws of probability (1)2.2 laws of probability (1)
2.2 laws of probability (1)gracie
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysisnadiazaheer
 
Hypergeometric distribution
Hypergeometric distributionHypergeometric distribution
Hypergeometric distributionmohammad nouman
 
Multiple linear regression
Multiple linear regressionMultiple linear regression
Multiple linear regressionKen Plummer
 

What's hot (20)

Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Simple linear regression (final)
Simple linear regression (final)Simple linear regression (final)
Simple linear regression (final)
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Discrete probability distributions
Discrete probability distributionsDiscrete probability distributions
Discrete probability distributions
 
Ch05 4
Ch05 4Ch05 4
Ch05 4
 
Basic concept of probability
Basic concept of probabilityBasic concept of probability
Basic concept of probability
 
Eigen values and eigen vectors engineering
Eigen values and eigen vectors engineeringEigen values and eigen vectors engineering
Eigen values and eigen vectors engineering
 
Geometric Distribution
Geometric DistributionGeometric Distribution
Geometric Distribution
 
Geometric distributions
Geometric distributionsGeometric distributions
Geometric distributions
 
Probability distribution 2
Probability distribution 2Probability distribution 2
Probability distribution 2
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
FPDE presentation
FPDE presentationFPDE presentation
FPDE presentation
 
Introduction to probability distributions-Statistics and probability analysis
Introduction to probability distributions-Statistics and probability analysis Introduction to probability distributions-Statistics and probability analysis
Introduction to probability distributions-Statistics and probability analysis
 
The Standard Normal Distribution
The Standard Normal DistributionThe Standard Normal Distribution
The Standard Normal Distribution
 
Probability And Probability Distributions
Probability And Probability Distributions Probability And Probability Distributions
Probability And Probability Distributions
 
2.2 laws of probability (1)
2.2 laws of probability (1)2.2 laws of probability (1)
2.2 laws of probability (1)
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
Ring
RingRing
Ring
 
Hypergeometric distribution
Hypergeometric distributionHypergeometric distribution
Hypergeometric distribution
 
Multiple linear regression
Multiple linear regressionMultiple linear regression
Multiple linear regression
 

Similar to Statistics (recap)

Statistical Analysis with R- III
Statistical Analysis with R- IIIStatistical Analysis with R- III
Statistical Analysis with R- IIIAkhila Prabhakaran
 
2 Review of Statistics. 2 Review of Statistics.
2 Review of Statistics. 2 Review of Statistics.2 Review of Statistics. 2 Review of Statistics.
2 Review of Statistics. 2 Review of Statistics.WeihanKhor2
 
Statistical Description of Turbulent Flow
Statistical Description of Turbulent FlowStatistical Description of Turbulent Flow
Statistical Description of Turbulent FlowKhusro Kamaluddin
 
Stat 2153 Stochastic Process and Markov chain
Stat 2153 Stochastic Process and Markov chainStat 2153 Stochastic Process and Markov chain
Stat 2153 Stochastic Process and Markov chainKhulna University
 
Probability & Information theory
Probability & Information theoryProbability & Information theory
Probability & Information theory성재 최
 
Probability Proficiency.pptx
Probability Proficiency.pptxProbability Proficiency.pptx
Probability Proficiency.pptxHarshGupta137011
 
RSS probability theory
RSS probability theoryRSS probability theory
RSS probability theoryKaimrc_Rss_Jd
 
Random Variable.pptx
Random Variable.pptxRandom Variable.pptx
Random Variable.pptxSoumyaPanja2
 
Different types of distributions
Different types of distributionsDifferent types of distributions
Different types of distributionsRajaKrishnan M
 
PG STAT 531 Lecture 5 Probability Distribution
PG STAT 531 Lecture 5 Probability DistributionPG STAT 531 Lecture 5 Probability Distribution
PG STAT 531 Lecture 5 Probability DistributionAashish Patel
 
PROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTION
PROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTIONPROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTION
PROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTIONJournal For Research
 
Statistical inference: Probability and Distribution
Statistical inference: Probability and DistributionStatistical inference: Probability and Distribution
Statistical inference: Probability and DistributionEugene Yan Ziyou
 
Random variables and probability distributions Random Va.docx
Random variables and probability distributions Random Va.docxRandom variables and probability distributions Random Va.docx
Random variables and probability distributions Random Va.docxcatheryncouper
 

Similar to Statistics (recap) (20)

1. Probability.pdf
1. Probability.pdf1. Probability.pdf
1. Probability.pdf
 
Lecture 4
Lecture 4Lecture 4
Lecture 4
 
Statistical Analysis with R- III
Statistical Analysis with R- IIIStatistical Analysis with R- III
Statistical Analysis with R- III
 
2 Review of Statistics. 2 Review of Statistics.
2 Review of Statistics. 2 Review of Statistics.2 Review of Statistics. 2 Review of Statistics.
2 Review of Statistics. 2 Review of Statistics.
 
Statistical Description of Turbulent Flow
Statistical Description of Turbulent FlowStatistical Description of Turbulent Flow
Statistical Description of Turbulent Flow
 
Unit 2 Probability
Unit 2 ProbabilityUnit 2 Probability
Unit 2 Probability
 
Machine learning session2
Machine learning   session2Machine learning   session2
Machine learning session2
 
Crv
CrvCrv
Crv
 
Probability
ProbabilityProbability
Probability
 
Stat 2153 Stochastic Process and Markov chain
Stat 2153 Stochastic Process and Markov chainStat 2153 Stochastic Process and Markov chain
Stat 2153 Stochastic Process and Markov chain
 
Probability & Information theory
Probability & Information theoryProbability & Information theory
Probability & Information theory
 
5. Probability.pdf
5. Probability.pdf5. Probability.pdf
5. Probability.pdf
 
Probability Proficiency.pptx
Probability Proficiency.pptxProbability Proficiency.pptx
Probability Proficiency.pptx
 
RSS probability theory
RSS probability theoryRSS probability theory
RSS probability theory
 
Random Variable.pptx
Random Variable.pptxRandom Variable.pptx
Random Variable.pptx
 
Different types of distributions
Different types of distributionsDifferent types of distributions
Different types of distributions
 
PG STAT 531 Lecture 5 Probability Distribution
PG STAT 531 Lecture 5 Probability DistributionPG STAT 531 Lecture 5 Probability Distribution
PG STAT 531 Lecture 5 Probability Distribution
 
PROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTION
PROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTIONPROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTION
PROBABILITY DISTRIBUTION OF SUM OF TWO CONTINUOUS VARIABLES AND CONVOLUTION
 
Statistical inference: Probability and Distribution
Statistical inference: Probability and DistributionStatistical inference: Probability and Distribution
Statistical inference: Probability and Distribution
 
Random variables and probability distributions Random Va.docx
Random variables and probability distributions Random Va.docxRandom variables and probability distributions Random Va.docx
Random variables and probability distributions Random Va.docx
 

More from Farzad Javidanrad

More from Farzad Javidanrad (12)

Lecture 5
Lecture 5Lecture 5
Lecture 5
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
 
Lecture 2
Lecture 2Lecture 2
Lecture 2
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
 
Specific topics in optimisation
Specific topics in optimisationSpecific topics in optimisation
Specific topics in optimisation
 
Matrix algebra
Matrix algebraMatrix algebra
Matrix algebra
 
Introduction to correlation and regression analysis
Introduction to correlation and regression analysisIntroduction to correlation and regression analysis
Introduction to correlation and regression analysis
 
Integral calculus
Integral calculusIntegral calculus
Integral calculus
 
Basic calculus (ii) recap
Basic calculus (ii) recapBasic calculus (ii) recap
Basic calculus (ii) recap
 
Basic calculus (i)
Basic calculus (i)Basic calculus (i)
Basic calculus (i)
 
The Dynamic of Business Cycle in Kalecki’s Theory: Duality in the Nature of I...
The Dynamic of Business Cycle in Kalecki’s Theory: Duality in the Nature of I...The Dynamic of Business Cycle in Kalecki’s Theory: Duality in the Nature of I...
The Dynamic of Business Cycle in Kalecki’s Theory: Duality in the Nature of I...
 
Introductory Finance for Economics (Lecture 10)
Introductory Finance for Economics (Lecture 10)Introductory Finance for Economics (Lecture 10)
Introductory Finance for Economics (Lecture 10)
 

Recently uploaded

Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxJiesonDelaCerna
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfMahmoud M. Sallam
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
Statistics (recap)

• 9. Probability of Multiple Events
o The probability of getting two heads when two fair coins are tossed (two independent events) is:
𝑃(𝐻 ∩ 𝐻) = 1/2 × 1/2 = 1/4
• 10. Probability of Multiple Events
o The probability of picking two aces, without returning the first card to the deck of 52 playing cards (a conditional probability), is:
𝑃(1st ace ∩ 2nd ace) = 𝑃(1st ace) × 𝑃(2nd ace | 1st ace)
Or, written more compactly:
𝑃(𝐴₁ ∩ 𝐴₂) = 𝑃(𝐴₁) × 𝑃(𝐴₂ | 𝐴₁) = 4/52 × 3/51 = 1/221
• If two events 𝐴 and 𝐵 are independent from each other, then:
𝑃(𝐴 | 𝐵) = 𝑃(𝐴) and 𝑃(𝐵 | 𝐴) = 𝑃(𝐵)
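A quick check of the two-ace calculation (a minimal Python sketch using only the standard library; it is an illustration, not part of the original slides):

```python
from fractions import Fraction

# P(A1 and A2) = P(A1) * P(A2 | A1): drawing two aces without replacement
p_first = Fraction(4, 52)     # 4 aces in a full deck of 52
p_second = Fraction(3, 51)    # 3 aces left among the remaining 51 cards
p_both = p_first * p_second
print(p_both, float(p_both))  # 1/221, ~0.0045
```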
• 11. Random Variable & Probability Distribution
Some Basic Concepts:
• Variable: A letter (symbol) which represents the elements of a specific set.
• Random Variable: A variable whose values appear randomly, according to a probability distribution.
• Probability Distribution: A rule (function) which assigns a probability to the values of a random variable.
• Variables (including random variables) are divided into two general categories: 1) Discrete Variables, and 2) Continuous Variables
• 12. Random Variable & Probability Distribution
• A discrete variable is a variable whose elements (values) can be put in correspondence with the set of natural numbers or a subset of it. So it is possible to order and count its values; the number of values can be finite or infinite.
• For a discrete variable it is not possible to define a neighbourhood, however small, around any value in its domain: there is a jump from one value to the next.
• If the elements of the domain of a variable can be put in correspondence with the set of real numbers or a subset of it, the variable is called continuous. It is not possible to order and count the elements of a continuous variable. A variable is continuous if a neighbourhood, however small, can be defined around any value in its domain.
• 13. Random Variable & Probability Distribution
• Probability Distribution: A rule (function) that assigns a probability either to each possible value of a random variable (RV) individually or to a set of values in an interval.*
• For a discrete RV this rule assigns a probability to each possible individual outcome. For example, the probability distribution of the number of heads when flipping a fair coin (note: Σ𝑃ᵢ = 1):
In one trial {𝐻, 𝑇}:   𝑥: 0, 1   𝑃(𝑥): 0.5, 0.5
In two trials {𝐻𝐻, 𝐻𝑇, 𝑇𝐻, 𝑇𝑇}:   𝑥: 0, 1, 2   𝑃(𝑥): 0.25, 0.5, 0.25
o The probability distribution for the change in the price of a share in one day on the stock market:
𝑥 = price change: +1, 0, −1   𝑃(𝑥): 0.6, 0.1, 0.3
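The two-trial distribution above can be generated by enumerating the equally likely outcomes (a short Python sketch, added here for illustration):

```python
from itertools import product
from collections import Counter

# Enumerate the equally likely outcomes of tossing two fair coins
# and count how many heads each outcome contains.
outcomes = list(product("HT", repeat=2))   # HH, HT, TH, TT
counts = Counter(o.count("H") for o in outcomes)
dist = {x: n / len(outcomes) for x, n in sorted(counts.items())}
print(dist)   # {0: 0.25, 1: 0.5, 2: 0.25}
```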
• 14. Probability Distributions (Continuous)
• The probability that a continuous random variable takes exactly one of the values in its domain is zero, because the number of all possible outcomes 𝑛 is infinite and 𝑚/𝑛 → 0 as 𝑛 → ∞.
• For this reason, the probability for a continuous random variable has to be calculated over an interval.
• The probability distribution of a continuous random variable is often called a probability density function (PDF), or simply a probability function; it is usually denoted by 𝑓(𝑥) and has the following properties:
I. 𝑓(𝑥) ≥ 0 (similar to 𝑃(𝑥) ≥ 0 for a discrete RV*)
II. ∫_{−∞}^{+∞} 𝑓(𝑥) 𝑑𝑥 = 1 (similar to Σ𝑃(𝑥) = 1 for a discrete RV)
III. ∫_{𝑎}^{𝑏} 𝑓(𝑥) 𝑑𝑥 = 𝑃(𝑎 ≤ 𝑥 ≤ 𝑏) = 𝐹(𝑏) − 𝐹(𝑎) (the probability given to the set of values in an interval [𝑎, 𝑏])**
• 15. Probability Distributions (Continuous)
• where 𝐹(𝑥) is the integral of the PDF 𝑓(𝑥) and is called the Cumulative Distribution Function (CDF); for any real value of 𝑥 it is defined as:
𝐹(𝑥) ≡ 𝑃(𝑋 ≤ 𝑥)
The CDF gives the area under the PDF 𝑓(𝑥) from −∞ to 𝑥. For a discrete random variable, the CDF is the sum of all probabilities up to and including the value 𝑥.
Adopted from http://beyondbitsandatomsblog.stanford.edu/spring2010/tag/embodied-artifacts/
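Property III and the PDF–CDF relation can be verified numerically; a minimal sketch for the standard normal distribution, assuming scipy is available:

```python
from scipy.integrate import quad
from scipy.stats import norm

# Integrating the PDF over [a, b] reproduces F(b) - F(a)
a, b = -1.0, 1.0
area, _ = quad(norm.pdf, a, b)    # numerical integral of f(x)
print(area)                       # ~0.6827
print(norm.cdf(b) - norm.cdf(a))  # same value from the CDF
```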
• 16. Some Characteristics of Probability Distributions
• Expected Value (Probabilistic Mean Value): One of the most important measures of the central tendency of a distribution. It is the weighted average of all possible values of the random variable 𝑥 and is denoted by 𝐸(𝑥).
• For a discrete RV (with 𝑛 possible outcomes):
𝐸(𝑥) = 𝑥₁𝑃(𝑥₁) + 𝑥₂𝑃(𝑥₂) + ⋯ + 𝑥ₙ𝑃(𝑥ₙ) = Σᵢ₌₁ⁿ 𝑥ᵢ𝑃(𝑥ᵢ)
• For a continuous RV:
𝐸(𝑥) = ∫_{−∞}^{+∞} 𝑥·𝑓(𝑥) 𝑑𝑥
• 17. Some Characteristics of Probability Distributions
• Properties of 𝐸(𝑥):
i. If 𝑐 is a constant then 𝐸(𝑐) = 𝑐.
ii. If 𝑎 and 𝑏 are constants then 𝐸(𝑎𝑥 + 𝑏) = 𝑎𝐸(𝑥) + 𝑏.
iii. If 𝑎₁, …, 𝑎ₙ are constants then 𝐸(𝑎₁𝑥₁ + ⋯ + 𝑎ₙ𝑥ₙ) = 𝑎₁𝐸(𝑥₁) + ⋯ + 𝑎ₙ𝐸(𝑥ₙ), or 𝐸(Σᵢ₌₁ⁿ 𝑎ᵢ𝑥ᵢ) = Σᵢ₌₁ⁿ 𝑎ᵢ𝐸(𝑥ᵢ)
iv. If 𝑥 and 𝑦 are independent random variables then 𝐸(𝑥𝑦) = 𝐸(𝑥)·𝐸(𝑦)
• 18. Some Characteristics of Probability Distributions
v. If 𝑔(𝑥) is a function of the random variable 𝑥 then:
𝐸(𝑔(𝑥)) = Σ 𝑔(𝑥)·𝑃(𝑥) for a discrete RV
𝐸(𝑔(𝑥)) = ∫ 𝑔(𝑥)·𝑓(𝑥) 𝑑𝑥 for a continuous RV
• Variance: Variance measures how the random variable 𝑥 is dispersed around its expected value. If we write 𝐸(𝑥) = 𝜇, then:
𝑣𝑎𝑟(𝑥) = 𝜎² = 𝐸[(𝑥 − 𝐸(𝑥))²] = 𝐸[(𝑥 − 𝜇)²] = 𝐸[𝑥² − 2𝑥𝜇 + 𝜇²] = 𝐸(𝑥²) − 2𝜇𝐸(𝑥) + 𝜇² = 𝐸(𝑥²) − 𝜇²
• 19. Some Characteristics of Probability Distributions
𝑣𝑎𝑟(𝑥) = Σᵢ₌₁ⁿ (𝑥ᵢ − 𝜇)²·𝑃(𝑥ᵢ) for a discrete RV
𝑣𝑎𝑟(𝑥) = ∫_{−∞}^{+∞} (𝑥 − 𝜇)²·𝑓(𝑥) 𝑑𝑥 for a continuous RV
• Properties of Variance:
i. If 𝑐 is a constant then 𝑣𝑎𝑟(𝑐) = 0.
ii. If 𝑎 and 𝑏 are constants then 𝑣𝑎𝑟(𝑎𝑥 + 𝑏) = 𝑎²𝑣𝑎𝑟(𝑥).
iii. If 𝑥 and 𝑦 are independent random variables then 𝑣𝑎𝑟(𝑥 ± 𝑦) = 𝑣𝑎𝑟(𝑥) + 𝑣𝑎𝑟(𝑦) (this can be extended to more variables).
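Using the share-price distribution from slide 13, the expected value and the two equivalent variance formulas can be checked directly (a small Python sketch, added for illustration):

```python
# Share-price-change distribution from slide 13: x in {+1, 0, -1}
xs = [1, 0, -1]
ps = [0.6, 0.1, 0.3]

mu = sum(x * p for x, p in zip(xs, ps))                       # E(x) = 0.3
var_def = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))      # E[(x - mu)^2]
var_short = sum(x * x * p for x, p in zip(xs, ps)) - mu ** 2  # E(x^2) - mu^2
print(mu, var_def, var_short)   # 0.3, 0.81, 0.81
```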
• 20. Probability Distributions (Discrete RV)
• Some of the well-known probability distributions are:
• The Binomial Distribution:
1. The probability of the occurrence of an event is 𝑝 and does not change.
2. The experiment is repeated 𝑛 times.
3. The probability that out of 𝑛 trials the event appears 𝑥 times is:
𝑃(𝑥) = [𝑛!/(𝑥!(𝑛 − 𝑥)!)] 𝑝ˣ(1 − 𝑝)ⁿ⁻ˣ
The mean value and standard deviation of the binomial distribution are:
𝜇 = Σᵢ₌₀ⁿ 𝑥ᵢ·𝑃(𝑥ᵢ) = 𝑛𝑝 and 𝜎 = √(Σᵢ₌₀ⁿ (𝑥ᵢ − 𝜇)²·𝑃(𝑥ᵢ)) = √(𝑛𝑝(1 − 𝑝))
So, to show that the probability distribution of the random variable 𝑋 is binomial we can write: 𝑋 ~ Bi(𝑛𝑝, 𝑛𝑝(1 − 𝑝))
• 21. Probability Distributions (Discrete RV)
• A gambler thinks his chance of getting a 1 in rolling a die is high. What is his chance of getting four 1s out of six rolls of a fair die? The probability of getting a 1 in an individual trial is 1/6 and it remains the same in all 6 trials. So,
𝑃(𝑥 = 4) = [6!/(4!·2!)] (1/6)⁴(5/6)² = 375/46656 ≈ 0.008 ≈ 0.8%
• The Poisson Distribution:
1. It is used to calculate the probability of a number of desired events (no. of successes) in a specific period of time.
2. The average number of desired events (no. of successes) per unit of time remains constant.
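A check of the binomial calculation above, both by the formula and with a library call (a Python sketch, assuming scipy is installed):

```python
from math import comb
from scipy.stats import binom

n, p, x = 6, 1/6, 4
manual = comb(n, x) * p**x * (1 - p)**(n - x)
print(manual)               # ~0.00804, i.e. about 0.8%
print(binom.pmf(x, n, p))   # same value from scipy
```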
• 22. Probability Distributions (Discrete RV)
• So, the probability of having 𝑥 successes is calculated by:
𝑃(𝑥) = 𝜆ˣ𝑒^{−𝜆}/𝑥!
where 𝜆 is the average number of successes in the specific period of time and 𝑒 ≈ 2.7182.
• The mean value and standard deviation of the Poisson distribution are:
𝜇 = Σᵢ₌₀ⁿ 𝑥ᵢ·𝑃(𝑥ᵢ) = 𝜆 and 𝜎 = √(Σᵢ₌₀ⁿ (𝑥ᵢ − 𝜇)²·𝑃(𝑥ᵢ)) = √𝜆
So, to show that the probability distribution of the random variable 𝑋 is Poisson we can write: 𝑋 ~ Poi(𝜆, 𝜆).
o The emergency section of a hospital receives 2 calls per half hour (4 calls per hour). The probability of getting exactly 2 calls in a randomly chosen hour on a random day is:
𝑃(𝑥 = 2) = 4²𝑒⁻⁴/2! ≈ 0.146 ≈ 15%
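The hospital example in code (a sketch, assuming scipy is available):

```python
from scipy.stats import poisson

lam = 4                     # average number of calls per hour
print(poisson.pmf(2, lam))  # ~0.1465, i.e. about 15%
```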
• 23. The Normal Distribution (Continuous RV)
• The Normal Distribution: The best-known probability distribution, and a good approximation to the behaviour of many random variables in practice. The probability density function (PDF) of the normal distribution is:
1. Symmetrical around its mean value (𝜇).
2. Bell-shaped, with two tails approaching the horizontal axis asymptotically as we move further away from the mean.
Adopted from http://www.pdnotebook.com/2010/06/statistical-tolerance-analysis-root-sum-square/
• 24. The Normal Distribution (Continuous RV)
3. The probability density function (PDF) of the normal distribution can be represented by:
𝑓(𝑥) = (1/(𝜎√(2𝜋))) 𝑒^{−(𝑥−𝜇)²/(2𝜎²)}   (−∞ < 𝑥 < +∞)
where 𝜇 and 𝜎 are the mean and standard deviation respectively:
𝜇 = ∫_{−∞}^{+∞} 𝑥·𝑓(𝑥) 𝑑𝑥 and 𝜎 = √(∫_{−∞}^{+∞} (𝑥 − 𝜇)²·𝑓(𝑥) 𝑑𝑥)
So, 𝑋 ~ 𝑁(𝜇, 𝜎²).
• A linear combination of independent normally distributed random variables is itself normally distributed; that is, if 𝑋 ~ 𝑁(𝜇₁, 𝜎₁²) and 𝑌 ~ 𝑁(𝜇₂, 𝜎₂²) and 𝑍 = 𝑎𝑋 + 𝑏𝑌, then 𝑍 ~ 𝑁(𝑎𝜇₁ + 𝑏𝜇₂, 𝑎²𝜎₁² + 𝑏²𝜎₂²).
• This can be extended to more than two random variables.
• 25. The Normal Distribution (Continuous RV)
• Recalling the last property of the PDF (∫_{𝑎}^{𝑏} 𝑓(𝑥) 𝑑𝑥 = 𝑃(𝑎 ≤ 𝑥 ≤ 𝑏)), it is difficult to calculate probabilities using the above PDF for every different pair of values of 𝜇 and 𝜎. The solution is to transform the normal variable 𝑥 into the standardised normal (or simply, standard normal) random variable 𝑧, by:
𝑧 = (𝑥 − 𝜇)/𝜎
whose parameters do not depend on the parameters of other normally distributed random variables, because we always have 𝐸(𝑧) = 0 and 𝑣𝑎𝑟(𝑧) = 1 (why?).
• The probability distribution of the standard normal variable is defined as:
𝑓(𝑧) = (1/√(2𝜋)) 𝑒^{−𝑧²/2}, with 𝑍 ~ 𝑁(0, 1)
Standardised: 𝑋 ~ 𝑁(𝜇, 𝜎²) → 𝑍 ~ 𝑁(0, 1)
Adopted and amended from http://www.mathsisfun.com/data/standard-normal-distribution.html
• 26. The Standard Normal Distribution
• Properties of the standard normal distribution curve:
1. It is symmetrical around the y-axis.
2. The area under the curve can be split into two equal areas, that is:
∫_{−∞}^{0} 𝑓(𝑧) 𝑑𝑧 = ∫_{0}^{+∞} 𝑓(𝑧) 𝑑𝑧 = 0.5
• To find the area under the curve to the left of 𝑧₁ = 1.26, using the z-table (next slide), we have:
𝑃(𝑧 ≤ 𝑧₁ = 1.26) = ∫_{−∞}^{0} 𝑓(𝑧) 𝑑𝑧 + ∫_{0}^{𝑧₁} 𝑓(𝑧) 𝑑𝑧 = 0.5 + 0.3962 = 0.8962 ≈ 90%
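The same lookup can be done in code instead of the printed table (a sketch, assuming scipy is available):

```python
from scipy.stats import norm

# Area to the left of z = 1.26 under the standard normal curve
print(norm.cdf(1.26))        # ~0.8962
# The z-table on the next slide tabulates the area from 0 to z:
print(norm.cdf(1.26) - 0.5)  # ~0.3962
```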
• 27. Z-table: area under the standard normal curve between 0 and 𝑧 (rows give the first decimal of 𝑧, columns the second decimal)
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
• 28. Working with the Z-Table
• To find the probability 𝑃(0.89 < 𝑧 < 1.5):
𝑃(0.89 < 𝑧 < 1.5) = ∫_{0}^{1.5} 𝑓(𝑧) 𝑑𝑧 − ∫_{0}^{0.89} 𝑓(𝑧) 𝑑𝑧 = 𝐹(1.5) − 𝐹(0.89) = 0.4332 − 0.3133 = 0.1199 ≈ 12%
as both values are positive.
• To find a probability in the negative area we use the equivalent area on the positive side:
𝑃(−1.32 < 𝑧 < −1.25) = 𝑃(1.25 < 𝑧 < 1.32) = 𝐹(1.32) − 𝐹(1.25) = 0.4066 − 0.3944 = 0.0122 ≈ 1%
• 29. Working with the Z-Table
• To find 𝑃(𝑧 < −2.15) we can write:
∫_{−∞}^{−2.15} 𝑓(𝑧) 𝑑𝑧 = ∫_{−∞}^{0} 𝑓(𝑧) 𝑑𝑧 − ∫_{−2.15}^{0} 𝑓(𝑧) 𝑑𝑧 = 0.5 − 0.4842 = 0.0158 ≈ 2%
(by symmetry, ∫_{−2.15}^{0} 𝑓(𝑧) 𝑑𝑧 ≡ ∫_{0}^{2.15} 𝑓(𝑧) 𝑑𝑧)
• And finally, to find 𝑃(𝑧 ≥ 1.93), we have:
∫_{1.93}^{+∞} 𝑓(𝑧) 𝑑𝑧 = ∫_{0}^{+∞} 𝑓(𝑧) 𝑑𝑧 − ∫_{0}^{1.93} 𝑓(𝑧) 𝑑𝑧 = 0.5 − 0.4732 = 0.0268
• 30. An Example
o If the income of employees in a big company is normally distributed with 𝜇 = £20000 and 𝜎 = £4000, what is the probability that a randomly picked employee has an income a) above £22000, b) between £16000 and £24000?
a) We first need to transform 𝑥 to 𝑧:
𝑃(𝑥 > 22000) = 𝑃((𝑥 − 20000)/4000 > (22000 − 20000)/4000) = 𝑃(𝑧 > 0.5) = 0.5 − 0.1915 = 0.3085 ≈ 31%
b) 𝑃(16000 < 𝑥 < 24000) = 𝑃((16000 − 20000)/4000 < (𝑥 − 20000)/4000 < (24000 − 20000)/4000) = 𝑃(−1 < 𝑧 < 1) = 0.3413 + 0.3413 = 0.6826 ≈ 68%
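Both parts of the income example, computed directly on the 𝑁(20000, 4000²) distribution rather than via the z-table (a Python sketch, assuming scipy is available):

```python
from scipy.stats import norm

mu, sigma = 20000, 4000
print(norm.sf(22000, mu, sigma))                                # a) ~0.3085
print(norm.cdf(24000, mu, sigma) - norm.cdf(16000, mu, sigma))  # b) ~0.6827
```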
• 31. The χ² (Chi-Squared) Distribution
• The χ² (Chi-Squared) Distribution: Let 𝑍₁, 𝑍₂, …, 𝑍ₖ be 𝑘 independent standardised normally distributed random variables; then the sum of their squares
𝑋 = Σᵢ₌₁ᵏ 𝑍ᵢ²
has a chi-squared distribution with degrees of freedom equal to the number of random variables (𝑑𝑓 = 𝑘). So, 𝑋 ~ χ²ₖ. The mean value and standard deviation of an RV with a chi-squared distribution are 𝑘 and √(2𝑘) respectively, so we can write: 𝑋 ~ χ²(𝑘, 2𝑘)
Probability Density Function (PDF) of the χ² Distribution
Adopted from http://2012books.lardbucket.org/books/beginning-statistics/s15-chi-square-tests-and-f-tests.html
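The defining sum of squares can be simulated to confirm the mean and standard deviation (a Python sketch added for illustration, assuming numpy is available):

```python
import numpy as np

rng = np.random.default_rng(0)
k = 5
# Sum of squares of k independent standard normal variables
x = (rng.standard_normal((100_000, k)) ** 2).sum(axis=1)
print(x.mean(), x.std())   # ~k = 5 and ~sqrt(2k) = 3.16
```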
• 33. The t-Distribution
• If 𝑍 ~ 𝑁(0, 1) and 𝑋 ~ χ²ₖ, and the two random variables 𝑍 and 𝑋 are independent, then the random variable
𝑡 = 𝑍/√(𝑋/𝑘) = 𝑍·√(𝑘/𝑋)
follows Student's t-distribution (the t-distribution) with 𝑘 degrees of freedom. For a sample of size 𝑛 we have 𝑑𝑓 = 𝑘 = 𝑛 − 1.
• The mean value and standard deviation of this distribution are:
𝜇 = 0 for 𝑛 > 2, undefined for 𝑛 = 1, 2
𝜎 = √((𝑛 − 1)/(𝑛 − 3)) for 𝑛 > 3, ∞ for 𝑛 = 3, undefined for 𝑛 = 1, 2
• 34. The t-Distribution
• Like the standard normal distribution, the t-distribution is bell-shaped and symmetrical with zero mean (𝑛 > 2), but it is flatter. As the degrees of freedom increase (i.e. as 𝑛 increases), it approaches the standard normal distribution, and for 𝑛 ≥ 30 their behaviours are similar.
• From the table (next slide): 𝑃(𝑡 ≥ 1.706 | 𝑑𝑓 = 26) = 0.05 ≈ 5%, or 𝑡₀.₀₅,₂₆ = 1.706
Adopted from http://education-portal.com/academy/lesson/what-is-a-t-test-procedure-interpretation-examples.html#lesson
• 35. t-table: critical values 𝑡_{𝛼,𝑑𝑓} (upper-tail probability 𝛼 across the top, degrees of freedom down the side)
df 0.20 0.15 0.10 0.05 0.025 0.01 0.005 0.0025 0.001 0.0005
1 1.376 1.963 3.078 6.314 12.706 31.821 63.656 127.321 318.289 636.578
2 1.061 1.386 1.886 2.920 4.303 6.965 9.925 14.089 22.328 31.600
3 0.978 1.250 1.638 2.353 3.182 4.541 5.841 7.453 10.214 12.924
4 0.941 1.190 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610
5 0.920 1.156 1.476 2.015 2.571 3.365 4.032 4.773 5.894 6.869
6 0.906 1.134 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959
7 0.896 1.119 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408
8 0.889 1.108 1.397 1.860 2.306 2.896 3.355 3.833 4.501 5.041
9 0.883 1.100 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781
10 0.879 1.093 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587
11 0.876 1.088 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437
12 0.873 1.083 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318
13 0.870 1.079 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221
14 0.868 1.076 1.345 1.761 2.145 2.624 2.977 3.326 3.787 4.140
15 0.866 1.074 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073
16 0.865 1.071 1.337 1.746 2.120 2.583 2.921 3.252 3.686 4.015
17 0.863 1.069 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965
18 0.862 1.067 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922
19 0.861 1.066 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883
20 0.860 1.064 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850
21 0.859 1.063 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819
22 0.858 1.061 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792
23 0.858 1.060 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.768
24 0.857 1.059 1.318 1.711 2.064 2.492 2.797 3.091 3.467 3.745
25 0.856 1.058 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725
26 0.856 1.058 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707
27 0.855 1.057 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.689
28 0.855 1.056 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674
29 0.854 1.055 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.660
30 0.854 1.055 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646
31 0.853 1.054 1.309 1.696 2.040 2.453 2.744 3.022 3.375 3.633
32 0.853 1.054 1.309 1.694 2.037 2.449 2.738 3.015 3.365 3.622
33 0.853 1.053 1.308 1.692 2.035 2.445 2.733 3.008 3.356 3.611
34 0.852 1.052 1.307 1.691 2.032 2.441 2.728 3.002 3.348 3.601
35 0.852 1.052 1.306 1.690 2.030 2.438 2.724 2.996 3.340 3.591
36 0.852 1.052 1.306 1.688 2.028 2.434 2.719 2.990 3.333 3.582
37 0.851 1.051 1.305 1.687 2.026 2.431 2.715 2.985 3.326 3.574
38 0.851 1.051 1.304 1.686 2.024 2.429 2.712 2.980 3.319 3.566
39 0.851 1.050 1.304 1.685 2.023 2.426 2.708 2.976 3.313 3.558
40 0.851 1.050 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551
50 0.849 1.047 1.299 1.676 2.009 2.403 2.678 2.937 3.261 3.496
60 0.848 1.045 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460
80 0.846 1.043 1.292 1.664 1.990 2.374 2.639 2.887 3.195 3.416
100 0.845 1.042 1.290 1.660 1.984 2.364 2.626 2.871 3.174 3.390
150 0.844 1.040 1.287 1.655 1.976 2.351 2.609 2.849 3.145 3.357
Infinity 0.842 1.036 1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.290
• 36. The F Distribution
• If 𝑍₁ ~ χ²ₖ₁ and 𝑍₂ ~ χ²ₖ₂, and 𝑍₁ and 𝑍₂ are independent, then the random variable
𝐹 = (𝑍₁/𝑘₁)/(𝑍₂/𝑘₂)
follows the F distribution with 𝑘₁ and 𝑘₂ degrees of freedom, i.e. 𝐹 ~ 𝐹ₖ₁,ₖ₂ or 𝐹 ~ 𝐹(𝑘₁, 𝑘₂).
• This distribution is skewed to the right, like the chi-squared distribution, but as 𝑘₁ and 𝑘₂ increase (𝑛 → ∞) it approaches the normal distribution.
Adopted from http://www.vosesoftware.com/ModelRiskHelp/index.htm#Distributions/Continuous_distributions/F_distribution.htm
• 37. The F Distribution
• The mean and standard deviation of the F distribution are:
𝜇 = 𝑘₂/(𝑘₂ − 2) for 𝑘₂ > 2
𝜎 = (𝑘₂/(𝑘₂ − 2))·√(2(𝑘₁ + 𝑘₂ − 2)/(𝑘₁(𝑘₂ − 4))) for 𝑘₂ > 4
• Relation of the t and chi-squared distributions to the F distribution:
• For a random variable 𝑋 ~ 𝑡ₖ it can be shown that 𝑋² ~ 𝐹₁,ₖ. This can also be written as 𝑡ₖ² = 𝐹₁,ₖ.
• If 𝑘₂ is large enough, then 𝑘₁·𝐹ₖ₁,ₖ₂ ~ χ²ₖ₁.
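The relation 𝑡ₖ² = 𝐹₁,ₖ can be checked on the critical values (a Python sketch, assuming scipy is available; 𝑘 = 26 is an arbitrary choice for illustration):

```python
from scipy.stats import t, f

k, alpha = 26, 0.05
t_crit = t.ppf(1 - alpha / 2, k)   # two-tailed t critical value, ~2.056
f_crit = f.ppf(1 - alpha, 1, k)    # upper-tail F(1, k) critical value
print(t_crit ** 2, f_crit)         # both ~4.23
```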
• 38. [F-distribution critical-value tables, starting with 𝛼 = 0.25; all adopted from http://www.stat.purdue.edu/~yuzhu/stat514s05/tables.html]
• 43. Statistical Inference (Estimation)
• Statistical inference, or statistical induction, is one of the most important aspects of decision making; it refers to the process of drawing a conclusion about the unknown parameters of a population from a sample of randomly chosen data.
• The idea is that a sample of randomly chosen data provides the best available information about the parameters of the population, and it can be considered a representative of the population when its size is reasonably (appropriately) large.
• The first step in statistical inference (induction) is estimation, which is the process of finding an estimate or approximation for the population parameters (such as the mean value and standard deviation) using the data in the sample.
• 44. Statistical Inference (Estimation)
• The value of 𝑋̄ (the sample mean) in a randomly chosen and appropriately large sample is a good estimator of the population mean 𝜇. The value of 𝑠² (the sample variance) is likewise a good estimator of the population variance 𝜎².
• Before taking any sample from the population (when the sample is not yet realised or observed) we can talk about the probability distribution of a hypothetical sample. The probability distribution of a random variable 𝑥 in a hypothetical sample follows the probability distribution of the population, even if the sampling process is repeated many times.
• But the probability distribution of the sample mean 𝑋̄ in repeated sampling does not necessarily follow the probability distribution of its population as the number of samples increases.
• 45. Central Limit Theorem
• Central Limit Theorem: Imagine a random variable 𝑋, with any probability distribution, defined in a population with mean 𝜇 and variance 𝜎². Suppose we take 𝑛 independent samples 𝑋₁, 𝑋₂, …, 𝑋ₙ and for each sample we calculate the mean values 𝑋̄₁, 𝑋̄₂, …, 𝑋̄ₙ (see figure below).
𝑋 ~ i.i.d(𝜇, 𝜎²)   (i.i.d ≡ independent & identically distributed RVs)
• 46. Central Limit Theorem
As the number of samples increases indefinitely, the random variable 𝑋̄ has a normal distribution (regardless of the population distribution), and we have:
𝑋̄ ~ 𝑁(𝜇, 𝜎²/𝑛) when 𝑛 → +∞
And in the standard form:
𝑍 = (𝑋̄ − 𝜇_𝑋̄)/𝜎_𝑋̄ = (𝑋̄ − 𝜇)/(𝜎/√𝑛) = √𝑛(𝑋̄ − 𝜇)/𝜎 ~ 𝑁(0, 1)
o Taking a sample of 36 elements from a population with mean 20 and standard deviation 12, what is the probability that the sample mean falls between 18 and 24?
𝑃(18 < 𝑥̄ < 24) = 𝑃(−1 < (𝑥̄ − 20)/(12/√36) < 2) = 0.3413 + 0.4772 ≈ 82%
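The theorem can be illustrated by simulation with a deliberately non-normal population (a Python sketch added for illustration; the gamma population is an assumption chosen only to match the example's mean 20 and standard deviation 12):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sd, n = 20, 12, 36
theta = sd ** 2 / mu   # gamma scale chosen so the mean is 20 and sd is 12
k = mu / theta         # gamma shape
# 100,000 samples of size 36; compute each sample's mean
means = rng.gamma(k, theta, size=(100_000, n)).mean(axis=1)
print(np.mean((means > 18) & (means < 24)))   # ~0.82, as on the slide
```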
• 47. Estimation
• In the previous slides we introduced some of the most important probability distributions for discrete and continuous random variables.
• In many cases we know the nature of the probability distribution of a random variable defined in a population, but have no idea about its parameters, such as the mean value or/and standard deviation.
• Point Estimation:
• To estimate the unknown parameters of the probability distribution of a random variable we can have either a point estimate or an interval estimate, using an estimator.
• The estimator is a function of the sample values 𝑥₁, 𝑥₂, …, 𝑥ₙ and is often called a statistic. If 𝜃̂ represents that estimator, we have:
𝜃̂ = 𝑓(𝑥₁, 𝑥₂, …, 𝑥ₙ)
• 48. Estimation
• 𝜃̂ is said to be an unbiased estimator of the true 𝜃 (the parameter of the population) if 𝐸(𝜃̂) = 𝜃, because the bias itself is defined as 𝐵𝑖𝑎𝑠 = 𝐸(𝜃̂) − 𝜃.
o For example, the sample mean 𝑋̄ is a point and unbiased estimator of the unknown parameter 𝜇 (the population mean):
𝜃̂ = 𝑋̄ = 𝑓(𝑥₁, 𝑥₂, …, 𝑥ₙ) = (1/𝑛)(𝑥₁ + 𝑥₂ + ⋯ + 𝑥ₙ)
It is unbiased because 𝐸(𝑋̄) = 𝜇.
• 49. Estimation
• The sample variance in the form 𝑠² = Σ(𝑥ᵢ − 𝑥̄)²/𝑛 is a point but biased estimator of the population variance 𝜎² in a small sample:
𝐸(𝑠²) = 𝜎²(1 − 1/𝑛) ≠ 𝜎²
But it is a consistent estimator, because it approaches 𝜎² as the sample size 𝑛 increases indefinitely (𝑛 → ∞).
• With Bessel's correction (changing 𝑛 to 𝑛 − 1) we can define another sample variance which is unbiased even for small sample sizes:
𝑠² = Σ(𝑥ᵢ − 𝑥̄)²/(𝑛 − 1)
• The most common methods for finding point estimators are the least-squares method and the maximum likelihood method, of which the first will be discussed later.
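The bias and Bessel's correction can be seen by averaging both versions of 𝑠² over many small samples (a Python sketch added for illustration, assuming numpy is available):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5                                      # small samples
samples = rng.normal(0, 2, size=(200_000, n))   # true variance sigma^2 = 4
print(samples.var(axis=1, ddof=0).mean())  # ~sigma^2 * (1 - 1/n) = 3.2 (biased)
print(samples.var(axis=1, ddof=1).mean())  # ~4.0 (Bessel-corrected, unbiased)
```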
• 50. Interval Estimation
• Interval Estimation:
• Interval estimation, by contrast, provides an interval or range of possible estimates at a specific level of probability, called the level of confidence, within which the true value of the population parameter may lie.
• If 𝜃̂₁ and 𝜃̂₂ are respectively the lowest and highest estimates of 𝜃, the probability that 𝜃 is covered by the interval (𝜃̂₁, 𝜃̂₂) is:
Pr(𝜃̂₁ ≤ 𝜃 ≤ 𝜃̂₂) = 1 − 𝛼   (0 < 𝛼 < 1)
where 1 − 𝛼 is the level of confidence and 𝛼 itself is called the level of significance. The interval (𝜃̂₁, 𝜃̂₂) is called the confidence interval.
• 51. Interval Estimation
 How do we find 𝜃̂₁ and 𝜃̂₂? In order to find the lower and upper limits of a confidence interval we need prior knowledge about the nature of the distribution of the random variable in the population.
 If the random variable 𝑥 is normally distributed in the population and the population standard deviation (𝜎) is known, the 95% confidence interval for the unknown population mean (𝜇) can be constructed by finding the symmetric z-values associated with 95% of the area under the standard normal curve:
1 − 𝛼 = 95% → 𝛼 = 5% → 𝛼/2 = 2.5%, so ±𝑍₀.₀₂₅ = ±1.96
We know that 𝑍 = (𝑋̄ − 𝜇_𝑋̄)/𝜎_𝑋̄ = (𝑋̄ − 𝜇)/(𝜎/√𝑛), so:
𝑃(−𝑍_{𝛼/2} ≤ 𝑍 ≤ 𝑍_{𝛼/2}) = 95%
Adopted & altered from http://upload.wikimedia.org/wikipedia/en/b/bf/NormalDist1.96.png
• 52. Interval Estimation
• So we can write:
𝑃(𝑥̄ − 1.96𝜎_𝑥̄ ≤ 𝜇 ≤ 𝑥̄ + 1.96𝜎_𝑥̄) = 0.95
Or
𝑃(𝑥̄ − 1.96𝜎/√𝑛 ≤ 𝜇 ≤ 𝑥̄ + 1.96𝜎/√𝑛) = 0.95
Therefore, the interval (𝑥̄ − 1.96𝜎/√𝑛, 𝑥̄ + 1.96𝜎/√𝑛) represents a 95% confidence interval (𝐶𝐼₉₅%) for the unknown value of 𝜇. It means that in repeated random sampling (say 100 times) we expect 95 out of 100 such intervals to cover the unknown value of the population mean 𝜇.
Adopted and altered from http://forums.anarchy-online.com/showthread.php?t=604728
• 53. Interval Estimation for the Population Proportion
 A confidence interval can also be constructed for the population proportion (see the graph below), where 𝑋 ~ Bi(𝑛𝑝, 𝑛𝑝(1 − 𝑝)).
𝑝̂ in each sample represents a sample proportion. In repeated random sampling 𝑝̂ has its own probability distribution, with mean value and variance:
𝜇_𝑝̂ = 𝐸(𝑝̂) = 𝑝 = 𝜇/𝑛
𝜎²_𝑝̂ = 𝑣𝑎𝑟(𝑝̂) = 𝜎²/𝑛² = 𝑝(1 − 𝑝)/𝑛
• 54. Interval Estimation for the Population Proportion
• The 90% confidence interval for the population proportion 𝑝, when the sample size is bigger than 30 (𝑛 > 30) and there is no information about the population variance, is constructed as follows:
𝑍 = (𝑝̂ − 𝑝)/√(𝑝̂(1 − 𝑝̂)/𝑛)
𝑃(−𝑍_{𝛼/2} ≤ 𝑍 ≤ +𝑍_{𝛼/2}) = 1 − 𝛼
𝑃(𝑝̂ − 𝑍_{𝛼/2}·√(𝑝̂(1 − 𝑝̂)/𝑛) ≤ 𝑝 ≤ 𝑝̂ + 𝑍_{𝛼/2}·√(𝑝̂(1 − 𝑝̂)/𝑛)) = 0.9
So, with 𝛼/2 = 0.05 and ±𝑍_{𝛼/2} = ±1.645, the confidence interval can be written simply as:
𝐶𝐼₉₀% = 𝑝̂ ∓ 1.645·√(𝑝̂(1 − 𝑝̂)/𝑛)
Obviously, if we had knowledge about the population variance we would be able to estimate the population proportion 𝑝 directly. Why?
Adopted and altered from http://www.stat.wmich.edu/s216/book/node83.html
• 55. Examples
o Imagine the weight of people in a society is normally distributed. A random sample of 25, with sample mean 72 kg, is taken from this society. If the standard deviation of the population is 6 kg, find a) the 90%, b) the 95% and c) the 99% confidence interval for the unknown population mean.
a) 1 − 𝛼 = 0.9 → 𝛼/2 = 0.05 → 𝑍_{𝛼/2} = 1.645, so 𝐶𝐼₉₀% = 72 ± 1.645 × 6/√25 = (70.03, 73.97)
b) 1 − 𝛼 = 0.95 → 𝛼/2 = 0.025 → 𝑍_{𝛼/2} = 1.96, so 𝐶𝐼₉₅% = 72 ± 1.96 × 6/√25 = (69.65, 74.35)
c) 1 − 𝛼 = 0.99 → 𝛼/2 = 0.005 → 𝑍_{𝛼/2} = 2.58, so 𝐶𝐼₉₉% = 72 ± 2.58 × 6/√25 = (68.9, 75.1)
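All three intervals in one loop (a Python sketch, assuming scipy is available):

```python
from scipy.stats import norm

xbar, sigma, n = 72, 6, 25
for conf in (0.90, 0.95, 0.99):
    z = norm.ppf(1 - (1 - conf) / 2)   # 1.645, 1.960, 2.576
    half = z * sigma / n ** 0.5
    print(conf, round(xbar - half, 2), round(xbar + half, 2))
```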
• 56. Examples
o Samples from one of the production lines in a factory suggest that 10% of products are defective. If a 1% difference between the sample and population proportions is acceptable, what sample size do we need to construct a 95% confidence interval for the population proportion? What if the acceptable gap between the sample and population proportions increased to 3%?
1 − 𝛼 = 0.95 → 𝛼/2 = 0.025 → 𝑍_{𝛼/2} = 1.96
𝑍_{𝛼/2} = (𝑝̂ − 𝑝)/√(𝑝̂(1 − 𝑝̂)/𝑛) → 1.96 = 0.01/√(0.1 × 0.9/𝑛) → 𝑛 = (1.96 × 0.3/0.01)² = 58.8² ≈ 3458
If the gap increases to 3%, then: 1.96 = 0.03/√(0.1 × 0.9/𝑛) → 𝑛 = (1.96 × 0.3/0.03)² = 19.6² ≈ 385
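The same sample-size calculation in code (a sketch, assuming scipy is available):

```python
from math import ceil
from scipy.stats import norm

p_hat = 0.10
z = norm.ppf(0.975)   # 1.96
for margin in (0.01, 0.03):
    n = p_hat * (1 - p_hat) * (z / margin) ** 2
    print(margin, ceil(n))   # 3458 and 385
```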
• 57. Interval Estimation (Using the t-Distribution)
• If the population standard deviation 𝜎 is unknown and we use the sample standard deviation 𝑠 instead, and the size of the sample is less than 30 (𝑛 < 30), then the random variable
(𝑥̄ − 𝜇)/(𝑠/√𝑛) ~ 𝑡ₙ₋₁
has a t-distribution with 𝑑𝑓 = 𝑛 − 1. This means a confidence interval for the population mean 𝜇 will be of the form:
𝐶𝐼₍₁₋𝛼₎ = (𝑥̄ − 𝑡_{𝛼/2,𝑛−1}·𝑠/√𝑛 , 𝑥̄ + 𝑡_{𝛼/2,𝑛−1}·𝑠/√𝑛)
Adopted and altered from http://cnx.org/content/m46278/latest/?collection=col11521/latest
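A t-based variant of the slide-55 example, under the hypothetical assumption that the 6 kg there were the sample standard deviation rather than the population value (a sketch, assuming scipy is available):

```python
from scipy.stats import t

xbar, s, n = 72, 6, 25                     # hypothetical: s is now a sample sd
half = t.ppf(0.975, n - 1) * s / n ** 0.5  # t(0.025, 24) = 2.064
print(xbar - half, xbar + half)            # slightly wider than the z interval
```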
• 58. Interval Estimation
• The following flowchart can help in choosing between the Z- and t-distributions when an interval estimate is constructed for 𝜇 in the population (and, if neither applies, to use nonparametric methods).
Adopted from http://www.expertsmind.com/questions/flow-chart-for-confidence-interval-30112489.aspx
• 59. Interval Estimation
• Here is a list of confidence intervals for the relevant parameters in the population.
Adopted from http://www.bls-stats.org/uploads/1/7/6/7/1767713/250709.image0.jpg
• 60. Hypothesis Testing
• Hypothesis testing is one of the important aspects of statistical inference. The main idea is to find out whether some claims/statements (in the form of hypotheses) about population parameters can be statistically rejected by the evidence from the sample, using a test statistic (a function of the sample).
• Claims are made in the form of a null hypothesis (𝐻₀) against an alternative hypothesis (𝐻₁), and they can only be rejected, never proven. These two hypotheses should be mutually exclusive and collectively exhaustive. For example:
𝐻₀: 𝜇 = 0.8 against 𝐻₁: 𝜇 ≠ 0.8
𝐻₀: 𝜇 ≥ 2.1 against 𝐻₁: 𝜇 < 2.1
𝐻₀: 𝜎² ≤ 0.4 against 𝐻₁: 𝜎² > 0.4
 Always remember that the equality sign comes with 𝐻₀.
• If the value of the test statistic lies in the rejection region(s), the null hypothesis must be rejected; otherwise the sample does not provide sufficient evidence to reject the null hypothesis.
• 61. Hypothesis Testing
• Assuming we know the distribution of the random variable in the population, and also have statistical independence between the different random variables, hypothesis testing follows these steps:
1. State the relevant null and alternative hypotheses. The form of the null hypothesis (whether it uses =, ≥ or ≤) indicates how many rejection regions we will have: for the = sign there are two regions, otherwise just one. Depending on the difference between the value of the estimator and the claimed value of the population parameter, the rejection region can be on the right or the left of the distribution curve.
𝐻₀: 𝜇 = 0.5, 𝐻₁: 𝜇 ≠ 0.5 (two-tail) or 𝐻₀: 𝜇 ≥ 0.5 (or 𝜇 ≤ 0.5), 𝐻₁: 𝜇 < 0.5 (or 𝜇 > 0.5) (one-tail)
Graphs adopted from http://www.soc.napier.ac.uk/~cs181/Modules/CM/Statistics/Statistics%203.html
• 62. Hypothesis Testing
2. Identify the level of significance of the test (𝛼); it is usually taken to be 5% or 1%, depending on the nature of the test and the goals of the researcher. When 𝛼 is known, together with prior knowledge about the sampling distribution, the critical region(s) (or rejection region(s)) can be identified.
Here we have the two critical values of the standard normal distribution associated with the one-tail significance levels 𝛼 = 5% and 𝛼 = 1%: 𝑍₀.₀₅ = 1.65 and 𝑍₀.₀₁ = 2.33
Adopted from http://www.psychstat.missouristate.edu/introbook/sbk26.htm
• 63. Hypothesis Testing
3. Construct a test statistic (a function based on the sample distribution and the sample size). This function is used to decide whether or not to reject 𝐻₀.
Here we have a list of some of the test statistics for testing different hypotheses.
Table adopted from http://www.bls-stats.org/uploads/1/7/6/7/1767713/250714.image0.jpg
• 64. Hypothesis Testing
4. Take a random sample from the population and calculate the value of the test statistic. If the value is in the rejection region, the null hypothesis 𝐻₀ will be rejected in favour of the alternative 𝐻₁ at the predetermined significance level 𝛼; otherwise the sample does not provide sufficient evidence to reject 𝐻₀ (this does not mean that we accept 𝐻₀). The critical values are:
−𝑍_𝛼 or −𝑡_{𝛼,𝑑𝑓} for a left-tail test; +𝑍_𝛼 or +𝑡_{𝛼,𝑑𝑓} for a right-tail test; ±𝑍_{𝛼/2} or ±𝑡_{𝛼/2,𝑑𝑓} for a two-tail test
Adopted from http://www.onekobo.com/Articles/Statistics/03-Hypotheses/Stats3%20-%2010%20-%20Rejection%20Region.htm
• 65. Example
o A chocolate factory claims that its new tin of cocoa powder contains at least 500 gr of powder. A standards checking agency takes a random sample of 𝑛 = 25 tins and finds that the sample mean weight is 𝑋̄ = 520 gr and the sample standard deviation is 𝑠 = 75 gr. If we assume the weight of cocoa powder in the tins has a normal distribution, does the sample provide enough evidence to reject the claim at the 95% level of confidence?
1. 𝐻₀: 𝜇 ≥ 500, 𝐻₁: 𝜇 < 500 (so it is a one-tail, left-tail test)
2. Level of significance 𝛼 = 5% → 𝑡_{𝛼,𝑛−1} = 𝑡₀.₀₅,₂₄ = 1.711 (it is the t-distribution because 𝑛 < 30 and we have no prior knowledge of the population standard deviation)
3. The value of the test statistic is: 𝑡 = (𝑋̄ − 𝜇)/(𝑠/√𝑛) = (520 − 500)/(75/√25) = 1.33
4. The rejection region is 𝑡 < −1.711. As 1.33 > −1.711 we are not in the rejection region, so the claim cannot be rejected at the 5% level of significance.
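The same test in code (a Python sketch, assuming scipy is available):

```python
from scipy.stats import t

xbar, mu0, s, n = 520, 500, 75, 25
t_stat = (xbar - mu0) / (s / n ** 0.5)   # 1.33
t_crit = -t.ppf(0.95, n - 1)             # left-tail critical value, -1.711
print(t_stat, t_crit, t_stat < t_crit)   # False -> cannot reject H0
```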
• 66. Type I & Type II Errors
• Two types of error can occur in hypothesis testing:
A. Type I error: when, based on our sample, we reject a true null hypothesis.
B. Type II error: when, based on our sample, we fail to reject a false null hypothesis.
• By reducing the level of significance 𝛼 we can reduce the probability of making a type I error (why?); however, at the same time we increase the probability of making a type II error.
• What would happen to the type I and type II errors if we increased the sample size? (Hint: look at the confidence intervals.)
Adopted from http://whatilearned.wikia.com/wiki/Hypothesis_Testing?file=Type_I_and_Type_II_Error_Table.jpg
• 67. Type I & Type II Errors
• The following graph shows how a change of the critical line (critical value) changes the probabilities of making type I and type II errors:
𝑃(Type I error) = 𝛼 and 𝑃(Type II error) = 𝛽
Adopted from http://www.weibull.com/hotwire/issue88/relbasics88.htm
• The Power of a Test: The power of a test is the probability that the test will correctly reject a false null hypothesis; it is the probability of not committing a type II error. The power is equal to 1 − 𝛽, which means that by reducing 𝛽 the power of the test increases.
• 68. The P-Value
• It is not unusual to reject 𝐻₀ at some level of significance, for example 𝛼 = 5%, but to be unable to reject it at some other level, e.g. 𝛼 = 1%. The dependence of the final decision on the value of 𝛼 is the weak point of the classical approach.
• In the new approach, we try to find the p-value, which is the lowest significance level at which 𝐻₀ can be rejected. If the level of significance is set at 5% and the lowest significance level at which 𝐻₀ can be rejected (the p-value) is 2%, then the null hypothesis should be rejected; i.e. reject 𝐻₀ if 𝑝-value < 𝛼.
 To understand this concept better, let's look at an example:
• Suppose we believe that the mean life expectancy of the people in a city is 75 years (𝐻₀: 𝜇 = 75). But our observation shows a sample mean of 76 years for a sample size of 100, with a sample standard deviation of 4 years.
• 69. The P-Value
• The Z-score (test statistic) can be calculated as follows:
𝑍 = (𝑋̄ − 𝜇)/(𝑠/√𝑛) = (76 − 75)/(4/√100) = 2.5
• At the 5% level of significance the critical Z-value is 1.96, so we must reject 𝐻₀. But we should not have had this result (or should not have had those observations in our random sample) in the first place if our assumption about the population mean 𝜇 was correct.
• The p-value is the probability of having this type of result, or one even more extreme (i.e. a Z-score bigger than 2.5), given that the null hypothesis is correct:
𝑃(𝑍 ≥ 2.5 | 𝜇 = 75) = 𝑝-value ≈ 0.006
(it means that in 1000 samples this type of result should theoretically happen about 6 times; yet it happened in our very first random sample).
http://faculty.elgin.edu/dkernler/statistics/ch10/10-2.html
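The slide's upper-tail p-value in code (a sketch, assuming scipy is available):

```python
from scipy.stats import norm

z = (76 - 75) / (4 / 100 ** 0.5)   # 2.5
p_value = norm.sf(z)               # upper-tail P(Z >= 2.5)
print(z, p_value)                  # 2.5, ~0.0062
```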
• 70. The P-Value
• As we cannot deny what we have observed and obtained from the sample, eventually we need to change our belief about the population mean and reject our assumption about it.
• The smaller the p-value, the stronger the evidence against 𝐻₀.