Fundamentals of Sampling Distribution and Data Descriptions
1. Presentation on
CH-8
Fundamental Sampling Distribution and
Data Descriptions
Group members:
Iftekharul Islam Nahid
Syed Rizwanul Haque
Shantonu Nonda
Fardin Islam
Emonur Rahman Fahim
Department of Industrial & Production Engineering
Rajshahi University of Engineering & Technology (RUET)
2. What is a Population?
A population consists of the totality of the observations with which we are concerned.
What is a Sample?
A sample is a subset of a population.
3. Location Measures of a Sample: The Sample Mean, Median, and Mode
(a) Sample mean:
The sample mean is the average of the values of a variable in a sample: the sum of those values divided by the number of values. In mathematical notation, if a sample of n observations $X_1, X_2, \ldots, X_n$ on a variable X is taken from the population, the sample mean is
\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \]
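4. (b) Sample median:
The sample median is also a location measure; it shows the middle value of the sample. With the observations arranged in increasing order as $x_1 \le x_2 \le \cdots \le x_n$, the sample median is
\[ \tilde{x} = \begin{cases} x_{(n+1)/2}, & \text{if } n \text{ is odd}, \\ \frac{1}{2}\left(x_{n/2} + x_{n/2+1}\right), & \text{if } n \text{ is even}. \end{cases} \]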
(c) Sample mode
The sample mode is the value of the sample that occurs most often.
Suppose a data set consists of the following observations:
0.32 0.53 0.28 0.37 0.47 0.43 0.36 0.42 0.38 0.43
The sample mode is 0.43, since this value occurs more often than any other value.
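The three location measures can be checked on the ten observations above. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original slides.

```python
import statistics

# Observations quoted on the slide above.
data = [0.32, 0.53, 0.28, 0.37, 0.47, 0.43, 0.36, 0.42, 0.38, 0.43]

sample_mean = statistics.mean(data)      # sum of the values divided by n
sample_median = statistics.median(data)  # n = 10 is even: average of the 5th and 6th ordered values
sample_mode = statistics.mode(data)      # value that occurs most often

print(f"mean   = {sample_mean:.3f}")
print(f"median = {sample_median:.3f}")
print(f"mode   = {sample_mode}")
```

For this data set the mean is 0.399, the median is 0.40 (the average of 0.38 and 0.42), and the mode is 0.43.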
5. (d) Sample variance:
\[ S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2 \]
The computed value of $S^2$ for a given sample is denoted by $s^2$. Note that $S^2$ is essentially defined to be the average of the squares of the deviations of the observations from their mean.
(e) Sample standard deviation:
\[ S = \sqrt{S^2} \]
(f) Sample range:
\[ \mathrm{Range}(X) = \max(X) - \min(X) \]
6. Example:
A comparison of coffee prices at 4 randomly selected grocery stores in San Diego
showed increases from the previous month of 12, 15, 17, and 20 cents for a 1-pound
bag. Find the variance of this random sample of price increases.
Solution:
The sample mean is
\[ \bar{x} = \frac{12 + 15 + 17 + 20}{4} = 16 \text{ cents.} \]
Therefore,
\[ s^2 = \frac{1}{3}\sum_{i=1}^{4}(x_i - 16)^2 = \frac{(12-16)^2 + (15-16)^2 + (17-16)^2 + (20-16)^2}{3} = \frac{34}{3}. \]
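The arithmetic in this example can be reproduced in a few lines. The sketch below computes the variance from its definition, cross-checks it against `statistics.variance`, and also evaluates the standard deviation and range defined earlier; the variable names are illustrative.

```python
import math
import statistics

# Price increases (cents) from the four stores in the example.
increases = [12, 15, 17, 20]

n = len(increases)
mean = sum(increases) / n
# Sample variance: squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in increases) / (n - 1)
std_dev = math.sqrt(variance)
value_range = max(increases) - min(increases)

# Cross-check against the standard library's sample variance.
assert math.isclose(variance, statistics.variance(increases))

print(f"mean = {mean}, s^2 = {variance:.4f}, s = {std_dev:.4f}, range = {value_range}")
```

The computed variance agrees with 34/3, which is about 11.33.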
7. What is Sampling Distribution of Means?
Definition: A sampling distribution of sample means is the distribution obtained by using the means computed from random samples of a specific size taken from a population.
The first important sampling distribution to be considered is that of the mean $\bar{X}$.
Characteristics: Suppose that a random sample of n observations is taken from a normal population with mean $\mu$ and variance $\sigma^2$. Each observation $X_i$, i = 1, 2, ..., n, of the random sample will then have the same normal distribution as the population being sampled. Hence, we can conclude that
\[ \bar{X} = \frac{1}{n}\left(X_1 + X_2 + \cdots + X_n\right) \]
has a normal distribution with mean
\[ \mu_{\bar{X}} = \frac{1}{n}\left(\mu + \mu + \cdots + \mu\right) = \mu \]
and variance
\[ \sigma^2_{\bar{X}} = \frac{1}{n^2}\left(\sigma^2 + \sigma^2 + \cdots + \sigma^2\right) = \frac{\sigma^2}{n}. \]
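A quick simulation can make the two results above concrete. The sketch below draws many random samples from a normal population and compares the mean and variance of the resulting sample means with the theoretical values. The population parameters, sample size, number of repetitions, and seed are illustrative assumptions, not values stated on this slide.

```python
import random
import statistics

# Illustrative values (assumptions, not taken from this slide).
MU, SIGMA, N, REPS = 800.0, 40.0, 16, 5000

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of one random sample of size n from a normal population N(MU, SIGMA^2)."""
    return statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(n)])

means = [sample_mean(N) for _ in range(REPS)]

mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)

print(f"mean of sample means:     {mean_of_means:.2f}  (theory: {MU})")
print(f"variance of sample means: {var_of_means:.2f}  (theory: {SIGMA**2 / N})")
```

With these settings the simulated values should land close to 800 and 100, up to sampling noise.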
8. What is Central Limit Theorem
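Definition: If $\bar{X}$ is the mean of a random sample of size n taken from a population with mean $\mu$ and finite variance $\sigma^2$, then the limiting form of the distribution of
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]
as $n \to \infty$ is the standard normal distribution n(z; 0, 1).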
The sample size n = 30 is a guideline for using the central limit theorem. However, as the statement of the theorem implies, the presumption of normality on the distribution of $\bar{X}$ becomes more accurate as n grows larger, even when an individual observation (n = 1) has a clearly nonsymmetric distribution. The mean of $\bar{X}$ remains $\mu$ for any sample size, and the variance of $\bar{X}$ gets smaller as n increases.
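The behaviour described above can be illustrated with a small simulation. The sketch below assumes an exponential population with mean 1 (a clearly nonsymmetric choice made here for illustration, not one named in the slides) and tracks the mean, variance, and skewness of the sample mean for n = 1, 5, and 30. The helper names and the seed are hypothetical.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed for repeatability

def skewness(values):
    """Sample skewness: average cubed standardized deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def sample_means(n, reps=4000):
    """Means of `reps` random samples of size n from an exponential population with mean 1."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

results = {}
for n in (1, 5, 30):
    means = sample_means(n)
    results[n] = (statistics.fmean(means), statistics.variance(means), skewness(means))
    print(f"n={n:2d}  mean={results[n][0]:.3f}  variance={results[n][1]:.4f}  skewness={results[n][2]:.2f}")
```

In a run like this the mean should stay near 1 for every n, while the variance and the skewness of the sample mean both shrink as n grows, which is the pattern the theorem predicts.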
9. Example: An electrical firm manufactures light bulbs that have a length of life that is approximately normally distributed, with mean equal to 800 hours and a standard deviation of 40 hours. Find the probability that a random sample of 16 bulbs will have an average life of less than 775 hours.
Solution: The sampling distribution of $\bar{X}$ will be approximately normal, with
\[ \mu_{\bar{X}} = 800, \qquad \sigma_{\bar{X}} = \frac{40}{\sqrt{16}} = 10. \]
The desired probability is the area under this normal curve to the left of $\bar{x} = 775$.
When $\bar{x} = 775$,
\[ z = \frac{775 - 800}{10} = -2.5, \]
\[ \therefore\ P(\bar{X} < 775) = P(Z < -2.5) = 0.0062. \]
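The table lookup in this example can be verified numerically. The sketch below uses `statistics.NormalDist` from the Python standard library to evaluate the standard normal cumulative probability; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 800.0, 40.0, 16   # population mean, standard deviation, sample size from the example
std_error = sigma / sqrt(n)      # standard deviation of the sample mean
z = (775.0 - mu) / std_error     # standardized value of the sample mean 775
p = NormalDist().cdf(z)          # P(Z < z) for a standard normal variable

print(f"standard error = {std_error}, z = {z}, P = {p:.4f}")
```

The computed probability rounds to 0.0062, matching the table value used in the solution.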
10. T-Distributions
The t-distribution is similar to the normal distribution but is adapted for small sample sizes. It is employed when the sample size is small or when the population standard deviation is unknown and is estimated by the sample standard deviation s.
Equations:
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]
\[ h(t) = \frac{\Gamma[(v+1)/2]}{\Gamma(v/2)\sqrt{\pi v}}\left(1 + \frac{t^2}{v}\right)^{-(v+1)/2}, \qquad -\infty < t < \infty \]
This is known as the t-distribution with v degrees of freedom. For the statistic t above, v = n - 1.
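To see how the density h(t) behaves, it can be evaluated directly from the formula. The sketch below is one possible implementation using `math.lgamma` for numerical stability; the function names `t_pdf` and `normal_pdf` are hypothetical helpers introduced here, and the comparison with the standard normal density is added for illustration.

```python
import math

def t_pdf(t, v):
    """Density h(t) of the t-distribution with v degrees of freedom."""
    # Work with log-gamma so the ratio of gamma functions stays stable for large v.
    log_coeff = math.lgamma((v + 1) / 2) - math.lgamma(v / 2) - 0.5 * math.log(math.pi * v)
    return math.exp(log_coeff - (v + 1) / 2 * math.log1p(t * t / v))

def normal_pdf(z):
    """Standard normal density, for comparison."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

for v in (1, 5, 24, 1000):
    print(f"v={v:4d}  h(0)={t_pdf(0.0, v):.4f}  h(2)={t_pdf(2.0, v):.4f}")
print(f"normal  n(0)={normal_pdf(0.0):.4f}  n(2)={normal_pdf(2.0):.4f}")
```

For small v the density has heavier tails than the normal curve, and as v grows it approaches the standard normal density.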
11. Example 8.11: A chemical engineer claims that the population mean yield of a certain batch process is 500 grams per milliliter of raw material. To check this claim he samples 25 batches each month. If the computed t-value falls between $-t_{0.05}$ and $t_{0.05}$, he is satisfied with this claim. What conclusion should he draw from a sample that has a mean $\bar{x} = 518$ grams per milliliter and a sample standard deviation s = 40 grams? Assume the distribution of yields to be approximately normal.
Solution: From Table A.4 we find that $t_{0.05} = 1.711$ for 24 degrees of freedom. Therefore, the engineer can be satisfied with his claim if a sample of 25 batches yields a t-value between $-1.711$ and $1.711$. If $\mu = 500$, then
\[ t = \frac{518 - 500}{40/\sqrt{25}} = 2.25, \]
a value well above 1.711. The probability of obtaining a t-value, with v = 24, equal to or greater than 2.25 is approximately 0.02. If $\mu > 500$, the value of t computed from the sample is more reasonable. Hence, the engineer is likely to conclude that the process produces a better product than he thought.
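The decision rule in this example reduces to one computation and one comparison. The sketch below takes the critical value 1.711 directly from the solution text rather than computing it; the variable names are illustrative.

```python
from math import sqrt

mu_claimed = 500.0   # claimed population mean
x_bar = 518.0        # observed sample mean
s = 40.0             # sample standard deviation
n = 25               # number of batches sampled
t_crit = 1.711       # t_0.05 with 24 degrees of freedom, copied from the solution text

t_value = (x_bar - mu_claimed) / (s / sqrt(n))
claim_supported = -t_crit < t_value < t_crit

print(f"t = {t_value:.2f}, within (-{t_crit}, {t_crit}): {claim_supported}")
```

The computed t-value of 2.25 falls outside the interval, so the check reports that the claim is not supported at this cutoff.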
13. F-distribution
Definition: The F-distribution is the distribution of the ratio of two independent chi-squared random variables, each divided by its number of degrees of freedom.
It is used chiefly to compare variances, as in analysis of variance and hypothesis testing.
\[ F = \frac{U/\nu_1}{V/\nu_2} \]
where U and V are independent chi-squared random variables with $\nu_1$ and $\nu_2$ degrees of freedom, respectively.
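The definition can be checked by simulation. The sketch below builds each chi-squared variable as a sum of squared standard normal draws, forms the ratio defined above for (4, 5) degrees of freedom, and estimates how often it exceeds 5.19, the 5% table value quoted in the mine example of this presentation. The helper names, repetition count, and seed are assumptions made for illustration.

```python
import random

rng = random.Random(2)  # fixed seed for repeatability

def chi_squared(df):
    """One chi-squared draw: the sum of df squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def f_draw(df1, df2):
    """One F draw: ratio of independent chi-squared draws, each divided by its degrees of freedom."""
    return (chi_squared(df1) / df1) / (chi_squared(df2) / df2)

REPS = 20000
draws = [f_draw(4, 5) for _ in range(REPS)]
upper_tail = sum(d > 5.19 for d in draws) / REPS

print(f"fraction of simulated F(4, 5) values above 5.19: {upper_tail:.3f}")
```

The estimated fraction should come out near 0.05, up to simulation noise, and every simulated value is positive.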
15. Characteristics of F-distribution:
1. The curve is not symmetric.
2. The shape of the curve changes with the two degrees of freedom.
3. It is positively skewed, meaning it has a longer tail to the right.
Applications of F-distribution:
1. Hypothesis testing while comparing sample variances.
2. ANOVA testing.
3. Regression analysis.
17. Example:
Consider the following measurements of the heat-producing capacity of the coal produced by two mines
(in millions of calories per ton):
Mine 1: 8260 8130 8350 8070 8340
Mine 2: 7950 7890 7900 8140 7920 7840
Can it be concluded that the two population variances are equal?
Solution:
Step 1: Identification of the null and alternative hypotheses
\[ H_0: \sigma_1^2 = \sigma_2^2 \]
\[ H_1: \sigma_1^2 \neq \sigma_2^2 \]
Step 2: Calculation of F from the data and the table
For mine 1: $\nu_1 = 5 - 1 = 4$
For mine 2: $\nu_2 = 6 - 1 = 5$
For the 5% level of significance, $F_{0.05}(4, 5) = 5.19$.
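18. Again, from the data, for mine 1:
\[ s_1^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1} = \frac{1}{n(n-1)}\left[n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right] \]
Or,
\[ s_1^2 = \frac{1}{5 \times 4}\left[5 \times 338727500 - 41150^2\right] = 15750 \]
Similarly, for mine 2, $s_2^2 = 10920$.
Now,
\[ F = \frac{s_1^2}{s_2^2} = \frac{15750}{10920} = 1.442 \]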
Step 3: Decision
Since the computed value F = 1.442 is less than the critical value 5.19, it lies in the acceptance region, so we fail to reject the null hypothesis. The data give no evidence against the conclusion that the population variances are equal.
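The three steps of this example can be reproduced from the raw data. The sketch below uses `statistics.variance` for the sample variances and takes the critical value 5.19 from Step 2 rather than computing it; the variable names are illustrative.

```python
import statistics

mine_1 = [8260, 8130, 8350, 8070, 8340]
mine_2 = [7950, 7890, 7900, 8140, 7920, 7840]

var_1 = statistics.variance(mine_1)   # sample variance, divisor n - 1
var_2 = statistics.variance(mine_2)
f_value = var_1 / var_2               # ratio of the two sample variances
f_crit = 5.19                         # table value for (4, 5) degrees of freedom, quoted in Step 2

print(f"s1^2 = {var_1}, s2^2 = {var_2}, F = {f_value:.3f}")
print("reject H0" if f_value > f_crit else "fail to reject H0")
```

The script reproduces the sample variances 15750 and 10920 and the ratio 1.442, which is below the critical value.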