Fundamentals of Sampling Distribution and Data Descriptions

Presentation on
CH-8
Fundamental Sampling Distribution and
Data Descriptions
Group members:
Iftekharul Islam Nahid
Syed Rizwanul Haque
Shantonu Nonda
Fardin Islam
Emonur Rahman Fahim
Department of Industrial & Production Engineering
Rajshahi University of Engineering & Technology(RUET)

What is Population
A population consists of the totality of the observations with which we are concerned.
What is Sample
A sample is a subset of a population.

Location Measures of a Sample: The Sample Mean, Median, and Mode
(a) Sample Mean:
The sample mean is the average of the values of a variable in a sample: the sum of those values divided by the number of values. In mathematical notation, if a sample of n observations $x_1, x_2, \ldots, x_n$ on a variable X is taken from the population, the sample mean is
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

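(b) Sample median:
The sample median is also a location measure; it shows the middle value of the sample. With the observations arranged in increasing order as $x_1, x_2, \ldots, x_n$:
$\tilde{x} = x_{(n+1)/2}$, if n is odd
$\tilde{x} = \frac{1}{2}\left(x_{n/2} + x_{n/2+1}\right)$, if n is even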
(c) Sample mode
The sample mode is the value of the sample that occurs most often.
Suppose a data set consists of the following observations:
0.32 0.53 0.28 0.37 0.47 0.43 0.36 0.42 0.38 0.43
The sample mode is 0.43, since this value occurs more often than any other value.
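The three location measures can be checked on the data set above with a minimal Python sketch. It relies only on the standard-library `statistics` module; the variable names are illustrative and not part of the original slides.

```python
import statistics

# Observations from the mode example above.
data = [0.32, 0.53, 0.28, 0.37, 0.47, 0.43, 0.36, 0.42, 0.38, 0.43]

# Sample mean: sum of the values divided by the number of values.
sample_mean = statistics.mean(data)

# Sample median: n = 10 is even, so this averages the two middle
# values of the sorted data.
sample_median = statistics.median(data)

# Sample mode: the value that occurs most often.
sample_mode = statistics.mode(data)

print(f"mean   = {sample_mean:.3f}")
print(f"median = {sample_median:.3f}")
print(f"mode   = {sample_mode}")
```

Because n is even here, the median falls between the fifth and sixth ordered observations (0.38 and 0.42).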

(d) Sample variance:
$S^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$
The computed value of $S^2$ for a given sample is denoted by $s^2$. Note that $S^2$ is essentially defined to be the average of the squares of the deviations of the observations from their mean.
(e) Sample standard deviation:
$S = \sqrt{S^2}$
(f) Sample range:
Range(X) = Max(X) − Min(X)

Example:
A comparison of coffee prices at 4 randomly selected grocery stores in San Diego
showed increases from the previous month of 12, 15, 17, and 20 cents for a 1-pound
bag. Find the variance of this random sample of price increases.
Solution:
The sample mean is
$\bar{x} = \frac{12 + 15 + 17 + 20}{4} = 16$ cents.
Therefore,
$s^2 = \frac{1}{3}\sum_{i=1}^{4}(x_i - 16)^2 = \frac{(12-16)^2 + (15-16)^2 + (17-16)^2 + (20-16)^2}{3} = \frac{34}{3}$
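The arithmetic in this example can be reproduced with a short Python sketch. It computes the variance both from the definition and with `statistics.variance` (which uses the n − 1 divisor), and also reports the standard deviation and range for the same sample. The variable names are illustrative.

```python
import math
import statistics

increases = [12, 15, 17, 20]  # price increases in cents

n = len(increases)
x_bar = statistics.mean(increases)

# Variance from the definition: squared deviations divided by n - 1.
s2_manual = sum((x - x_bar) ** 2 for x in increases) / (n - 1)

# Same quantity from the standard library.
s2 = statistics.variance(increases)

s = math.sqrt(s2)                             # sample standard deviation
data_range = max(increases) - min(increases)  # sample range

print(f"mean = {x_bar}, variance = {s2:.4f}, sd = {s:.4f}, range = {data_range}")
```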

What is Sampling Distribution of Means?
Definition: The sampling distribution of sample means is the distribution obtained by using the means computed from random samples of a specific size taken from a population; it is normal when the population itself is normal.
The first important sampling distribution to be considered is that of the mean $\bar{X}$.
Characteristics: Suppose that a random sample of n observations is taken from a normal population with mean μ and variance $\sigma^2$. Each observation $X_i$, i = 1, 2, ..., n, of the random sample will then have the same normal distribution as the population being sampled. Hence, we can conclude that
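$\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$
has a normal distribution with mean
$\mu_{\bar{X}} = \frac{1}{n}(\mu + \mu + \cdots + \mu) = \mu$
and variance
$\sigma^2_{\bar{X}} = \frac{1}{n^2}(\sigma^2 + \sigma^2 + \cdots + \sigma^2) = \frac{\sigma^2}{n}$.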
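The two results for the mean and variance of the sample mean can be illustrated by simulation. The sketch below is only a check under assumed values: μ = 50, σ = 10, n = 25 and the number of repetitions are hypothetical choices, not figures from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population parameters and sample size (assumptions).
mu, sigma, n = 50.0, 10.0, 25
reps = 20000

# Draw many random samples of size n and record each sample mean.
means = [
    sum(random.gauss(mu, sigma) for _ in range(n)) / n
    for _ in range(reps)
]

mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)

print(f"mean of sample means     = {mean_of_means:.3f} (theory: {mu})")
print(f"variance of sample means = {var_of_means:.3f} (theory: {sigma**2 / n})")
```

The simulated values should land close to μ and σ²/n, with small differences due to sampling noise.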

What is Central Limit Theorem
Definition: If $\bar{X}$ is the mean of a random sample of size n taken from a population with mean μ and finite variance $\sigma^2$, then the limiting form of the distribution of
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
as n → ∞ is the standard normal distribution n(z; 0, 1).
The sample size n = 30 is a guideline to use for the central limit theorem. However, as the statement of the theorem implies, the presumption of normality on the distribution of $\bar{X}$ becomes more accurate as n grows larger, beginning with the clearly nonsymmetric distribution of an individual observation (n = 1). The mean of $\bar{X}$ remains μ for any sample size, and the variance of $\bar{X}$ gets smaller as n increases.
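A small simulation can make these statements concrete. The sketch below assumes an exponential population with mean 1 and variance 1 (a hypothetical choice of a clearly nonsymmetric population) and tracks the mean, variance, and skewness of the sample mean for n = 1, 5, and 30. The helper names are illustrative.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable


def skewness(values):
    """Sample skewness: average cubed standardized deviation."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])


def sample_means(n, reps=20000):
    """Means of `reps` random samples of size n from an Exp(1) population."""
    return [
        sum(random.expovariate(1.0) for _ in range(n)) / n
        for _ in range(reps)
    ]


results = {}
for n in (1, 5, 30):
    means = sample_means(n)
    results[n] = (
        statistics.fmean(means),     # should stay near mu = 1
        statistics.variance(means),  # should shrink like 1 / n
        skewness(means),             # should move toward 0 (symmetry)
    )
    print(n, [round(v, 3) for v in results[n]])
```

The mean stays near 1 for every n, while the variance and the skewness both shrink as n grows, which is the behaviour the theorem describes.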

Example: An electrical firm manufactures light bulbs that have a length of life that is approximately normally distributed, with mean equal to 800 hours and a standard deviation of 40 hours. Find the probability that a random sample of 16 bulbs will have an average life of less than 775 hours.
Solution: The sampling distribution of $\bar{X}$ will be approximately normal with
$\mu_{\bar{X}} = 800$ and $\sigma_{\bar{X}} = \frac{40}{\sqrt{16}} = 10$.
The desired probability is given by the area under this normal curve to the left of $\bar{x} = 775$.
When $\bar{x} = 775$, $z = \frac{775 - 800}{10} = -2.5$
∴ $P(\bar{X} < 775) = P(Z < -2.5) = 0.0062$
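The same calculation can be sketched in Python using `statistics.NormalDist` from the standard library in place of a normal table. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 800, 40, 16

# Standard deviation of the sample mean.
se = sigma / math.sqrt(n)

# Standardize the threshold of 775 hours.
z = (775 - mu) / se

# Area under the standard normal curve to the left of z.
p = NormalDist().cdf(z)

print(f"se = {se}, z = {z}, P(X-bar < 775) = {p:.4f}")
```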

T-Distributions
The t-distribution is similar to the normal distribution but is adapted for small sample sizes. It is employed when dealing with small sample sizes or when the population standard deviation is unknown.
Equations:
$t = \frac{\bar{x} - \mu}{S/\sqrt{n}}$
$h(t) = \frac{\Gamma[(v+1)/2]}{\Gamma(v/2)\sqrt{\pi v}}\left(1 + \frac{t^2}{v}\right)^{-(v+1)/2}, \quad -\infty < t < \infty$
This is known as the t-distribution with v degrees of freedom.
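As a rough illustration, the density h(t) can be coded directly from the formula with `math.gamma`. The function names below are illustrative, and the comparison with the standard normal density is included only to show how the two curves approach each other as v grows.

```python
import math


def t_pdf(t, v):
    """Density of the t-distribution with v degrees of freedom."""
    coeff = math.gamma((v + 1) / 2) / (math.gamma(v / 2) * math.sqrt(math.pi * v))
    return coeff * (1 + t * t / v) ** (-(v + 1) / 2)


def std_normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


# Peak height of the t density for increasing degrees of freedom,
# compared with the peak of the standard normal density.
for v in (1, 5, 30, 200):
    print(f"v = {v:3d}: h(0) = {t_pdf(0, v):.4f}")
print(f"normal : n(0) = {std_normal_pdf(0):.4f}")
```

For small v the curve is lower at the centre and heavier in the tails than the normal curve; the gap narrows as v increases.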

Example 8.11: A chemical engineer claims that the population mean yield of a certain batch
process is 500 grams per milliliter of raw material. To check this claim he samples 25 batches each
month. If the computed t-value falls between −t0.05 and t0.05, he is satisfied with this claim. What
conclusion should he draw from a sample that has a mean $\bar{x}$ = 518 grams per milliliter and a sample
standard deviation s = 40 grams? Assume the distribution of yields to be approximately normal.
Solution: From Table A.4 we find that t0.05 = 1.711 for 24 degrees of freedom. Therefore, the engineer
can be satisfied with his claim if a sample of 25 batches yields a t-value between −1.711 and 1.711. If
μ= 500, then
$t = \frac{518 - 500}{40/\sqrt{25}} = 2.25$,
a value well above 1.711. The probability of obtaining a t-value, with v = 24, equal to or greater than 2.25
is approximately 0.02. If μ > 500, the value of t computed from the sample is more reasonable. Hence, the
engineer is likely to conclude that the process produces a better product than he thought.
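The t-value in this example can be recomputed with a few lines of Python. The critical value 1.711 is taken from the solution above (Table A.4) rather than computed, since the standard library has no t-table; the variable names are illustrative.

```python
import math

x_bar, mu0, s, n = 518, 500, 40, 25

# t statistic with n - 1 = 24 degrees of freedom.
t_value = (x_bar - mu0) / (s / math.sqrt(n))

# Critical value t_0.05 for 24 degrees of freedom, as quoted in the solution.
t_crit = 1.711

# The engineer is satisfied only if t lies between -t_crit and t_crit.
within_limits = -t_crit < t_value < t_crit

print(f"t = {t_value:.2f}, within limits: {within_limits}")
```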

Applications
1. Unknown Standard Deviation: Appropriate when population standard deviation is unknown.
2. Small Samples: Ideal for datasets with limited data points.

F-distribution
Definition: The F-distribution is the distribution of the ratio of two independent chi-squared random variables, each divided by its number of degrees of freedom.
It is used particularly to compare variances, as in analysis of variance and hypothesis testing.
$F = \frac{U/\nu_1}{V/\nu_2}$
where U and V are independent random variables having chi-squared distributions with $\nu_1$ and $\nu_2$ degrees of freedom, respectively.

Probability density function:
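$h(f) = \begin{cases} \dfrac{\Gamma[(\nu_1+\nu_2)/2]\,(\nu_1/\nu_2)^{\nu_1/2}}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)} \cdot \dfrac{f^{\nu_1/2-1}}{(1+\nu_1 f/\nu_2)^{(\nu_1+\nu_2)/2}}, & f > 0 \\ 0, & f \le 0 \end{cases}$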
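The density can be written out in Python as a sanity check on the formula. This is a minimal sketch: the function name is illustrative, and the crude midpoint-rule integration (with assumed degrees of freedom 4 and 10, truncated at f = 50) is there only to confirm that the area under the curve is close to 1.

```python
import math


def f_pdf(f, v1, v2):
    """Density of the F-distribution with v1 and v2 degrees of freedom."""
    if f <= 0:
        return 0.0
    numerator = (
        math.gamma((v1 + v2) / 2)
        * (v1 / v2) ** (v1 / 2)
        * f ** (v1 / 2 - 1)
    )
    denominator = (
        math.gamma(v1 / 2)
        * math.gamma(v2 / 2)
        * (1 + v1 * f / v2) ** ((v1 + v2) / 2)
    )
    return numerator / denominator


# Midpoint-rule approximation of the total area for v1 = 4, v2 = 10.
h = 0.001
steps = 50000  # integrates over (0, 50); the remaining tail is negligible
area = sum(f_pdf((k + 0.5) * h, 4, 10) for k in range(steps)) * h

print(f"h(1) for (2, 2) d.f. = {f_pdf(1, 2, 2)}")
print(f"approximate total area = {area:.4f}")
```

With ν1 = ν2 = 2 the formula reduces to 1/(1 + f)², which gives a quick hand check of the code.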

Characteristics of F-distribution:
1. Curve will not be symmetric.
2. Curve changes with the change of degrees of freedom.
3. It's positively skewed, meaning it has a longer tail to the right.
Applications of F-distribution:
1. Hypothesis testing while comparing sample variances.
2. ANOVA testing.
3. Regression analysis.

Essential formulas
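1. Biased estimator: $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}$
2. Unbiased estimator: $S^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1} = \frac{s^2 n}{n-1}$
3. Mean of the F-distribution: $\mu = \frac{\nu_2}{\nu_2 - 2}$; $\nu_2 > 2$
4. Variance of the F-distribution: $\sigma^2 = \frac{2\nu_2^2(\nu_1+\nu_2-2)}{\nu_1(\nu_2-2)^2(\nu_2-4)}$; $\nu_2 > 4$
5. F-distribution: $F = \frac{S_2^2}{S_1^2}$; when $S_2^2 > S_1^2$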
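The mean and variance formulas for the F-distribution can be checked against a simulation built from the definition of F as a ratio of scaled chi-squared variables. This is a sketch under assumptions: the degrees of freedom (4 and 10), the seed, and the helper names are hypothetical, and a chi-squared variable with ν degrees of freedom is generated as a gamma variable with shape ν/2 and scale 2.

```python
import random
import statistics


def f_mean(v2):
    """Mean of the F-distribution; defined only for v2 > 2."""
    if v2 <= 2:
        raise ValueError("mean requires v2 > 2")
    return v2 / (v2 - 2)


def f_variance(v1, v2):
    """Variance of the F-distribution; defined only for v2 > 4."""
    if v2 <= 4:
        raise ValueError("variance requires v2 > 4")
    return 2 * v2**2 * (v1 + v2 - 2) / (v1 * (v2 - 2) ** 2 * (v2 - 4))


def chi2(v):
    """One chi-squared draw with v degrees of freedom (gamma with scale 2)."""
    return random.gammavariate(v / 2, 2.0)


random.seed(2)  # fixed seed so the run is repeatable
v1, v2 = 4, 10  # hypothetical degrees of freedom

# F = (U / v1) / (V / v2) with independent chi-squared U and V.
draws = [(chi2(v1) / v1) / (chi2(v2) / v2) for _ in range(50000)]
sim_mean = statistics.fmean(draws)

print(f"theoretical mean     = {f_mean(v2)}")
print(f"simulated mean       = {sim_mean:.3f}")
print(f"theoretical variance = {f_variance(v1, v2)}")
```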

Example:
Consider the following measurements of the heat-producing capacity of the coal produced by two mines
(in millions of calories per ton):
Mine 1: 8260 8130 8350 8070 8340
Mine 2: 7950 7890 7900 8140 7920 7840
Can it be concluded that the two population variances are equal?
Solution:
Step 1: Identification of Null Hypothesis and Alternate hypothesis
$H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \ne \sigma_2^2$
Step 2: Calculation of F from data and table
For mine 1: $\nu_1$ = 5 − 1 = 4
For mine 2: $\nu_2$ = 6 − 1 = 5
For 5% level of significance, $F_{0.05}(4, 5)$ = 5.19

Again, from the data, for mine 1:
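$s_1^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1} = \frac{1}{n(n-1)}\left[n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right]$
Or, $s_1^2 = \frac{1}{5 \times 4}\left[5 \times 338727500 - 41150^2\right] = 15750$
Similarly, for mine 2, $s_2^2 = 10920$
Now, $F = \frac{s_1^2}{s_2^2} = \frac{15750}{10920} = 1.442$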
Step 3: Decision
Since the obtained value F = 1.442 is less than the tabulated value 5.19, it lies in the acceptance region, so we fail to reject the null hypothesis. The data give no evidence that the population variances differ, so they can be treated as equal.
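The numbers in this example can be reproduced with a short Python sketch. It computes each sample variance with `statistics.variance`, cross-checks mine 1 against the computational shortcut formula, and compares the ratio with the tabulated value 5.19 quoted in Step 2. The helper and variable names are illustrative.

```python
import statistics

mine1 = [8260, 8130, 8350, 8070, 8340]
mine2 = [7950, 7890, 7900, 8140, 7920, 7840]


def shortcut_variance(xs):
    """Sample variance via [n * sum(x^2) - (sum x)^2] / [n (n - 1)]."""
    n = len(xs)
    return (n * sum(x * x for x in xs) - sum(xs) ** 2) / (n * (n - 1))


s1_sq = statistics.variance(mine1)
s2_sq = statistics.variance(mine2)

# Larger sample variance over the smaller one.
f_ratio = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)

# Tabulated critical value for (4, 5) degrees of freedom, as quoted in Step 2.
f_crit = 5.19
reject_h0 = f_ratio > f_crit

print(f"s1^2 = {s1_sq}, s2^2 = {s2_sq}")
print(f"F = {f_ratio:.3f}, reject H0: {reject_h0}")
```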
