This article provides the basics of the statistical techniques of sampling and sampling distributions. It is useful for students and scholars involved in research work in the field of humanities.
Data Collection tools: Questionnaire vs Schedule (Amit Uraon)
Questionnaires and schedules are commonly used methods for collecting primary data. Questionnaires involve sending a standardized set of questions to respondents to answer on their own and return. Schedules are similar but involve an enumerator personally collecting responses by asking questions directly and filling out the schedule. Both methods can be used for descriptive or explanatory research and involve designing valid and reliable questions, representative sampling, and defining relationships between variables. Questionnaires are cheaper but have higher non-response rates, while schedules provide more complete information through personal contact but are more expensive because they require field workers.
Data processing involves 5 key steps: editing data, coding data, classifying data, tabulating data, and creating data diagrams. It transforms raw collected data into a usable format through these steps of cleaning, organizing, and analyzing the data. First, data is collected from sources and prepared by cleaning errors. It is then inputted and processed using algorithms before being output and interpreted in readable formats. Finally, the processed data is stored for future use and reports.
This document discusses various measures of dispersion used to quantify how spread out or clustered data values are around a central tendency. It defines key terms like range, variance, standard deviation, and coefficient of variation. Examples are provided to demonstrate how to calculate these measures for both individual and grouped data. The normal distribution curve is also discussed to show how dispersion relates to the percentage of values that fall within a given number of standard deviations from the mean.
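As a concrete companion to that summary, here is a minimal sketch in plain Python (the data values are illustrative, not from the document) computing the measures it names for individual data:

```python
import statistics

data = [12, 15, 11, 18, 14, 20, 16]  # hypothetical individual observations

data_range = max(data) - min(data)      # range: highest value minus lowest
mean = statistics.mean(data)
variance = statistics.pvariance(data)   # population variance: mean squared deviation
std_dev = statistics.pstdev(data)       # standard deviation: square root of variance
cv = std_dev / mean * 100               # coefficient of variation, as a percentage

print(f"range={data_range}, mean={mean:.2f}, variance={variance:.2f}, "
      f"sd={std_dev:.2f}, CV={cv:.1f}%")
```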
A fully detailed treatment of editing, coding, and tabulation of data in research work.
The editing, coding, and tabulation of data are explained in this presentation.
Multistage sampling is a complex form of cluster sampling that uses multiple sampling methods together in stages. It first divides the population into primary sampling units and randomly selects some of these units. The selected units are then divided into secondary sampling units where another random sample is selected. This process can continue for third and fourth stages if needed. Multistage sampling is commonly used in large surveys to efficiently select samples across geographical areas in multiple stages.
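The staged selection described above can be sketched in a few lines of Python. The frame below (districts containing households) and all the sizes are hypothetical:

```python
import random

random.seed(1)

# Hypothetical two-stage frame: districts are the primary sampling units,
# households within them are the secondary sampling units.
districts = {f"district_{d}": [f"hh_{d}_{h}" for h in range(100)]
             for d in range(20)}

# Stage 1: randomly select a few primary sampling units.
selected_districts = random.sample(list(districts), k=4)

# Stage 2: within each selected district, randomly select secondary units.
sample = []
for d in selected_districts:
    sample.extend(random.sample(districts[d], k=10))

print(len(sample), "households sampled from", selected_districts)
```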
This document discusses the Z-test, a statistical test used to compare means and proportions. The Z-test can be used to test if a sample mean differs from a population mean, if two sample means are equal, or if two population proportions are equal. It assumes the population is normally distributed. The steps involve formulating hypotheses, choosing a significance level, calculating the Z-statistic, and comparing it to a critical value to determine if the null hypothesis should be rejected or accepted. The Z-test is useful when sample sizes are large but requires knowing the population standard deviation.
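A minimal sketch of the one-sample case described above, using Python's standard library; the hypothesized mean, known population standard deviation, and sample figures are invented for illustration:

```python
from statistics import NormalDist

# Hypothetical inputs: population sd assumed known, large sample.
pop_mean, pop_sd = 50.0, 8.0     # H0: mu = 50
sample_mean, n = 52.1, 100
alpha = 0.05

z = (sample_mean - pop_mean) / (pop_sd / n ** 0.5)  # Z-statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)        # two-sided critical value

print(f"z = {z:.2f}, critical value = +/-{z_crit:.2f}")
print("reject H0" if abs(z) > z_crit else "fail to reject H0")
```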
An introduction to and characteristics of sampling, types of sampling, and errors (Gunjan Verma)
This document discusses sampling methods used in research. It defines key terms like population, sample, sampling units and strategies. The main types of sampling discussed are probability sampling which uses random selection, and non-probability sampling which does not. Specific probability methods covered include simple random sampling, systematic random sampling, stratified random sampling and cluster sampling. Non-probability methods discussed are convenience sampling, purposive sampling, quota sampling, and snowball sampling. The document also addresses sample size determination, sources of error in sampling like sampling error and non-sampling error, and concludes with advantages of sampling.
This document discusses sampling distribution about sample mean. It defines key terms like population, sample, sampling units, stratified random sampling, systematic sampling, cluster sampling, probability sampling, non-probability sampling, estimation, estimator, estimate, and sampling distribution. It also discusses the sampling distribution of the sample mean and provides an example to calculate and compare the mean and variance of sample means for sampling with and without replacement.
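The with/without-replacement comparison mentioned above can be reproduced exhaustively for a tiny hypothetical population; this sketch enumerates every possible sample of size 2 and compares the mean and variance of the resulting sample means:

```python
from itertools import product, combinations
from statistics import mean, pvariance

population = [2, 4, 6, 8]   # hypothetical tiny population
n = 2

# All possible samples of size n, with and without replacement.
with_repl = [mean(s) for s in product(population, repeat=n)]
without_repl = [mean(s) for s in combinations(population, n)]

for label, means in (("with replacement", with_repl),
                     ("without replacement", without_repl)):
    print(f"{label}: mean of sample means = {mean(means):.2f}, "
          f"variance = {pvariance(means):.3f}")

# Both cases reproduce the population mean; the without-replacement variance
# is smaller by the finite population correction (N - n) / (N - 1).
print("population mean =", mean(population))
```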
This document discusses various complex random sampling designs, including systematic sampling, stratified sampling, cluster sampling, multi-stage sampling, sampling with probability proportional to size, and sequential sampling. It provides details on how each design is implemented and their relative advantages and disadvantages. Complex random sampling designs combine elements of probability and non-probability sampling to select samples.
This document discusses various sampling designs and their characteristics. It describes probability sampling designs like simple random sampling which gives every unit an equal chance of selection. It also describes non-probability sampling designs like purposive sampling which involves deliberately choosing units. Specific probability designs discussed include systematic sampling, stratified sampling, cluster sampling, area sampling, and multi-stage sampling.
The document discusses the differences between census surveys and sample surveys. Census surveys collect information from the entire population, while sample surveys collect information from a representative sample of the population. Census surveys are more accurate but are also more time-consuming and costly compared to sample surveys, which can be completed more quickly and at lower cost, but have some margin of error since only a sample is studied rather than the entire population.
Sampling Techniques and Sampling Methods (Sampling Types - Probability Sampli...) (Alam Nuzhathalam)
An overview of Sampling Techniques or Sampling Methods or Sampling Types (Probability Sampling: Simple Random Sampling, Stratified Random Sampling, Cluster Sampling, Systematic Random Sampling, Multi-Stage Sampling; and Non-Probability Sampling: Convenience Sampling, Quota Sampling, Judgmental Sampling, Self-Selection Sampling, Snowball Sampling), along with Sampling Errors and Non-Sampling Errors.
This document provides an introduction and overview of a presentation on hypothesis testing for a single sample test. It includes an abstract, introduction, definitions, explanations of the central limit theorem and t-test, assumptions, examples, and a question/answer section on hypothesis testing. A group of 11 students will be presenting on hypothesis testing for a single sample test, including topics like the central limit theorem, t-test, z-test, assumptions of different tests, and examples of applying the tests.
Characteristics of a good sample design & types of sample design (Dr. Sangeetha R)
The document discusses different types of sample designs, including their key characteristics and differences. It covers non-probability sampling designs like purposive sampling which rely on researcher judgement, and probability sampling designs like simple random sampling where every item has an equal chance of selection. Probability sampling is preferred because it allows estimating sampling errors and significance of results.
This document defines and explains several common measures of dispersion used in statistics including range, mean absolute deviation, variance, standard deviation, and coefficient of variation. Range is the difference between the highest and lowest values. Mean absolute deviation measures the average distance between values and the mean. Variance and standard deviation both measure how spread out numbers are by taking the average of the squared distances from the mean, with standard deviation being the square root of variance. Coefficient of variation expresses standard deviation as a percentage of the mean to allow comparison between data sets with different means.
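A short sketch of the two measures this summary adds beyond the earlier dispersion example, mean absolute deviation and the coefficient of variation, on two invented data sets with very different means:

```python
from statistics import mean, pstdev

def mean_abs_dev(xs):
    """Average absolute distance of the values from their mean."""
    m = mean(xs)
    return sum(abs(x - m) for x in xs) / len(xs)

# Two hypothetical data sets measured on very different scales.
salaries = [30000, 32000, 35000, 31000]
ages = [30, 32, 35, 31]

for name, xs in (("salaries", salaries), ("ages", ages)):
    cv = pstdev(xs) / mean(xs) * 100
    print(f"{name}: MAD = {mean_abs_dev(xs):.2f}, CV = {cv:.2f}%")

# The identical CVs show why CV, not the raw standard deviation, is used to
# compare spread across data sets with different means.
```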
Measurement & scaling ,Research methodologySONA SEBASTIAN
Measurement involves associating numbers or symbols to observations in a research study. There are different types of measurement scales including nominal, ordinal, interval, and ratio scales.
Nominal scales simply assign numbers or symbols to label elements without quantitative significance. Ordinal scales rank objects from largest to smallest but do not indicate the magnitude of differences. Interval scales assume equal units between numbers but lack a true zero point. Ratio scales have a true zero value and allow comparisons of differences between numbers through arithmetic operations.
Proper selection of measurement scales and techniques such as paired comparisons, ranking, rating, semantic differentials, and stapel scales depends on the characteristics and data type needed for the research.
This document discusses primary and secondary data. It defines primary data as data collected directly by the researcher through methods like observation, interviews, questionnaires, and surveys. Secondary data is data that has already been published through sources like books, journals, websites, and government records. The document outlines the merits and limitations of both primary and secondary data. It emphasizes the importance of evaluating secondary data for availability, relevance, accuracy, and sufficiency before using it in research.
This document discusses skewness and kurtosis in a financial context. It defines skewness as a measure of asymmetry in a distribution, with positive skewness indicating a long right tail and negative skewness a long left tail. Kurtosis is defined as a measure of the "peakedness" of a probability distribution: positive excess kurtosis indicates a sharper peak with long, fat tails (leptokurtic), while negative excess kurtosis indicates a flatter, thin-tailed distribution (platykurtic). Formulas are provided for calculating skewness and kurtosis from a data set. Examples of positively and negatively skewed distributions are given to illustrate these concepts.
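One common convention for these formulas (the moment-based, population form; the document's own versions may differ in small-sample corrections) sketched in plain Python on an invented right-skewed data set:

```python
from statistics import mean, pstdev

def skewness(xs):
    """Third standardized moment: positive => long right tail."""
    m, s = mean(xs), pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

def excess_kurtosis(xs):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    m, s = mean(xs), pstdev(xs)
    return sum(((x - m) / s) ** 4 for x in xs) / len(xs) - 3

right_skewed = [1, 2, 2, 3, 3, 4, 9]   # hypothetical data with a long right tail
print(f"skewness = {skewness(right_skewed):.2f}, "
      f"excess kurtosis = {excess_kurtosis(right_skewed):.2f}")
```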
Research is defined as a systematic, empirical investigation guided by theory to understand natural phenomena. It involves identifying a problem, reviewing existing literature, developing hypotheses and variables, collecting and analyzing data, and drawing conclusions. There are important components to research including the research design, methodology, instrumentation, sampling, data analysis, and conclusions. Sampling involves selecting a subset of a population to study. Probability sampling aims to give all population members an equal chance of selection, while non-probability sampling does not. Common probability sampling methods include simple random sampling, systematic sampling, stratified sampling, cluster sampling, and multistage sampling.
The document discusses sampling design and methods. It defines key terms like universe, population, sample, and stratum. There are several advantages to sampling like collecting information more quickly and at lower cost compared to a full census. Probability sampling ensures each unit has a known chance of selection, while non-probability sampling does not. Specific probability sampling methods discussed include simple random sampling, where each unit has an equal chance of selection, and stratified random sampling, where the population is divided into subgroups and samples are drawn from each.
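A minimal sketch of stratified random sampling with proportional allocation, as described above; the strata and all sizes are hypothetical:

```python
import random

random.seed(3)

# Hypothetical population divided into strata (subgroups) of unequal size.
strata = {
    "urban": [f"u{i}" for i in range(600)],
    "rural": [f"r{i}" for i in range(400)],
}
total = sum(len(units) for units in strata.values())
n = 50  # overall sample size

# Proportional allocation: each stratum contributes in proportion to its size.
sample = []
for name, units in strata.items():
    k = round(n * len(units) / total)
    sample.extend(random.sample(units, k))

print(f"stratified sample of {len(sample)}: "
      f"{sum(s.startswith('u') for s in sample)} urban, "
      f"{sum(s.startswith('r') for s in sample)} rural")
```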
The document defines a sampling distribution of sample means as a distribution of means from random samples of a population. The mean of sample means equals the population mean, and the standard deviation of sample means is smaller than the population standard deviation, equaling it divided by the square root of the sample size. As sample size increases, the distribution of sample means approaches a normal distribution according to the Central Limit Theorem.
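These three properties (the mean of sample means, the sigma-over-root-n standard deviation, and the approach to normality) can be checked by simulation; a small sketch using an invented, deliberately skewed population:

```python
import random
from statistics import mean, pstdev

random.seed(0)
# A skewed (exponential) population with mean about 20.
population = [random.expovariate(1 / 20) for _ in range(100_000)]

n = 49
sample_means = [mean(random.sample(population, n)) for _ in range(2_000)]

print(f"population: mean = {mean(population):.2f}, sd = {pstdev(population):.2f}")
print(f"sample means: mean = {mean(sample_means):.2f}, "
      f"sd = {pstdev(sample_means):.2f} "
      f"(theory sigma/sqrt(n) = {pstdev(population) / n ** 0.5:.2f})")
```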
Cluster sampling refers to a method where the population is divided into groups called clusters. A simple random sample of these clusters is selected, and then all or a subset of elements within the selected clusters are included in the final sample. It is cheaper than simple random sampling but has a higher chance of sampling error. The key aspects are that the population is divided into clusters, a random sample of clusters is taken, and then data is collected from elements within those clusters.
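A sketch of the one-stage variant described above, in which every element of each selected cluster enters the sample (in contrast to multistage sampling, which subsamples within selected units); the cluster frame is hypothetical:

```python
import random

random.seed(2)

# Hypothetical population grouped into naturally occurring clusters (schools).
clusters = {f"school_{i}": [f"pupil_{i}_{j}" for j in range(30)]
            for i in range(25)}

# Randomly select clusters, then take every element within them.
chosen = random.sample(list(clusters), k=5)
sample = [unit for c in chosen for unit in clusters[c]]

print(f"{len(sample)} pupils drawn from clusters: {chosen}")
```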
The document discusses the F-test, which is used to compare the variances of two random samples to determine if they are significantly different. It provides the formula for calculating the F-statistic, outlines the assumptions of the test, and gives two examples calculating F to test if sample variances are equal or different at the 5% significance level. In both examples, the calculated F-value is less than the critical value from the F-distribution table, so the null hypothesis of equal variances is not rejected.
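A minimal sketch of the calculation described, with invented samples; since the standard library has no F-distribution, the computed statistic is compared against a tabled critical value, just as in the document's examples:

```python
from statistics import variance

# Hypothetical samples; H0: the two population variances are equal.
a = [23, 25, 28, 30, 22, 27, 26]
b = [31, 35, 29, 33, 36, 30]

var_a, var_b = variance(a), variance(b)          # unbiased sample variances
f_stat = max(var_a, var_b) / min(var_a, var_b)   # larger variance in numerator

df_num = (len(a) if var_a >= var_b else len(b)) - 1
df_den = (len(b) if var_a >= var_b else len(a)) - 1

print(f"F = {f_stat:.3f} with ({df_num}, {df_den}) degrees of freedom")
# Look up the 5% critical value in an F-table; reject H0 (equal variances)
# only if the computed F exceeds the tabled value.
```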
The document discusses various data processing and analysis techniques including:
1. Editing of raw data to detect and correct errors through field editing by investigators and central editing by a team.
2. Coding of responses by assigning numerals or symbols to classify answers into categories for analysis.
3. Classification of data by grouping into classes based on common attributes or class intervals.
4. Tabulation by summarizing data into statistical tables for further analysis according to accepted principles.
The document provides information about the Chi-square test, including:
- It is a non-parametric test used to evaluate categorical data using contingency tables. The test statistic follows a Chi-square distribution.
- It can test for independence between variables and goodness of fit to theoretical distributions.
- Key steps involve calculating expected frequencies, squaring the differences between observed and expected frequencies, dividing each by the expected frequency, and summing the results (see the sketch after this list).
- The test interprets higher Chi-square values as less likelihood the results are due to chance. Modifications like Yates' correction and Fisher's exact test address limitations for small sample sizes.
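Here is the sketch referred to in the list above: a chi-square test of independence on an invented 2x2 contingency table, following exactly the steps named (expected frequencies, squared differences over expected, summed):

```python
# Hypothetical observed frequencies in a 2x2 contingency table.
observed = [[30, 10],
            [20, 40]]

row_tot = [sum(r) for r in observed]
col_tot = [sum(c) for c in zip(*observed)]
grand = sum(row_tot)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_tot[i] * col_tot[j] / grand   # expected frequency
        chi_sq += (o - e) ** 2 / e            # squared difference over expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {chi_sq:.2f}, df = {df}")
# Compare with the tabled critical value (3.84 at the 5% level for 1 d.f.);
# a larger statistic means the association is unlikely to be due to chance.
```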
Certified Specialist Business Intelligence (.docx) (durantheseldine)
Certified Specialist Business Intelligence (CSBI) Reflection
Part 5 of 6
CSBI Course 5: Business Intelligence and Analytical and Quantitative Skills
● Thinking about the Basics
● The Basic Elements of Experimental Design
● Sampling
● Common Mistakes in Analysis
● Opportunities and Problems to Solve
● The Low Severity Level ED (SL5P) Case Setup as an Example of BI Work
● Meaningful Analytic Structures
Analysis and Statistics
A key aspect of the work of the BI/Analytics consultant is analysis. Analysis can be defined as how the data is turned into information. Information is the outcome when the data is analyzed correctly.
Rigorous analysis offers the best chance of producing the sharpest picture of what the data might reveal, and it is the product of the proper application of statistics and experimental design.
Statistics encompasses a complex and detailed series of disciplines. Statistical concepts are foundational to all descriptive, predictive, and prescriptive analytic applications. However, the application of simple descriptive statistical calculations yields a great deal of usable information for transformational decision-making. The value of the information is amplified when using these same simple statistics within the context of a well-designed experiment.

This module is not designed to teach any one statistic. It is designed to place statistical work within the appropriate context so that it can be leveraged most effectively in driving organizational performance.
It offers an important review of the basic knowledge needed for work with descriptive and inferential statistics.
The Basic Elements of Experimental Design
Analytic tools can also provide an enhanced ability to conduct experiments. More than just allowing analysis of the output of activities or processes, experiments can be performed on processes and the output of processes. Experimenting on processes is a movement beyond the traditional r.
Planning clinical supplies has become more complex due to increased trial numbers, reduced timelines, recruitment challenges, and globalization. Forecasting and simulation tools help sponsors determine initial supply needs, optimize supply chain strategies, and ensure supplies remain sufficient. An interactive response technology system automates supply management and provides real-time data to forecasting dashboards. These dashboards allow exploring scenarios to prevent issues like stockouts and optimize efficiency. Regularly checking forecasts enables proactive management of clinical supplies.
Census, sampling survey, sampling design and types of sample design (Parvej Ahmed Porag)
The document contains information about a presentation by a group of students on various sampling topics. It includes the names and roll numbers of 12 presentation members and 3 paragraphs written by 4 of the members on the topics of census, sample, and sampling survey. It provides basic definitions and examples for each topic.
This document discusses methods of data collection through census and sampling. It explains that census involves collecting data from all units in the population, while sampling collects data from a subset of representative units and uses it to make inferences about the overall population. It then outlines key merits and demerits of each approach, as well as common sampling methods like simple random sampling and stratified random sampling.
Experimental research designs are considered the standard for research. They involve assigning subjects to different treatments and observing the effects on one or more dependent variables in order to draw conclusions. Experimental research has both advantages and disadvantages: it allows full researcher control but can be resource-intensive. It aims to determine relationships between dependent and independent variables by supporting or rejecting hypotheses. Data must be quantifiable and include measurements of variables like area, weight, and temperature; qualitative observations also supplement the research. Overall, experimental research uses a scientific approach to test business matters and understand customer behavior through product testing and experiments.
This document provides information on sampling techniques. It defines sampling as using a subset of a larger population to make inferences about the whole population. The purpose of sampling is to provide statistical information about the whole population while examining only a selected portion to reduce costs and increase efficiency and reliability compared to a census. The document outlines the steps in the sampling process, including defining the population, identifying a sampling frame, determining the sample size, and selecting the sample. It provides an example of calculating sample size to estimate average weekly internet usage.
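The sample-size example mentioned (estimating average weekly internet usage) is not reproduced in the summary, so the sketch below applies the standard n = (z * sigma / E)^2 formula with placeholder values of my own:

```python
import math
from statistics import NormalDist

# Hypothetical inputs: assumed population sd, desired margin of error.
sigma = 6.0        # assumed sd of weekly internet hours
E = 1.0            # tolerable error of the estimate (hours)
confidence = 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # 1.96 for 95%
n = (z * sigma / E) ** 2

print(f"required sample size: {math.ceil(n)}")      # always round up
```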
Demand forecasting can be done using two approaches: obtaining information from experts or consumers, or using past sales data through statistical techniques. Expert surveys include opinion polls and the Delphi technique. Consumer surveys can be a complete enumeration or a sample survey. Complex statistical methods include time series analysis, correlation/regression analysis, and simultaneous equation models. Demand forecasting helps with production, financial, and workforce planning as well as decision making.
This document discusses research design in marketing research. It defines research design as a framework that details the procedures for obtaining needed information to solve research problems. The document outlines exploratory and conclusive research and their differences. It also discusses descriptive, causal, cross-sectional, and longitudinal research designs. Various sources of error in research designs are presented.
The document provides details about the life insurance industry in India including its history dating back to 1818, nationalization in 1956, and the opening up of the private sector in 2000. It discusses the key milestones in the evolution of life insurance regulation and the current state of competition in the rapidly changing industry. The challenges of customer prospecting in life insurance are explored in the context of increasing competition in the sector.
This document discusses audit sampling, including:
1. The definition and purpose of audit sampling, which is using procedures on less than 100% of items to make inferences about the whole population.
2. Factors that affect sample size such as population size, confidence level, precision, risk, and materiality.
3. Types of sampling methods like simple random sampling, stratified sampling, and cluster sampling.
4. The differences between tests of control and substantive tests, and between statistical and non-statistical sampling.
5. Key concepts like type I and type II errors, tolerable error, and expected error in the population.
This document provides guidance on using direct observation techniques to evaluate development programs. It discusses advantages, such as observing programs in their natural setting, and potential limitations, such as observer bias, and it lays out the steps for conducting effective direct observations: determining the focus of observation, developing observation forms, selecting sites, deciding on timing, conducting observations, completing forms, and analyzing the data. Direct observation is recommended when performance is not meeting plans, when implementation problems exist but are not understood, or when process details need to be assessed.
The document appears to be a student report on understanding the challenges of customer prospecting in the life insurance industry. It includes sections on introduction, objectives, methodology, industry details, data analysis, secondary research on companies and competitors, and conclusions and recommendations. The report was submitted by Sachin B. Bone to complete his MBA curriculum requirements.
It is the process of selecting the sample for estimating the population characteristics. In other words, it is the process of obtaining information about an entire population by examining only a part of it.
This document discusses the concept and methodology of work sampling. It begins by defining work sampling as a statistical technique used to determine the proportion of time employees spend on different work activities. Work sampling involves taking many random observations over time to approximate how time is spent. It has advantages over other work measurement methods like being less disruptive and requiring less expertise. However, it also has limitations like not accounting for work pace. The document then outlines the typical procedure for conducting a work sampling study, including defining the problem, getting approvals, designing observation forms, determining observation frequency, and evaluating methods to reduce bias. Formulas are provided to determine required sample sizes for desired accuracy levels.
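The sample-size formulas the summary mentions are presumably the standard work-sampling one, n = z^2 * p * (1 - p) / e^2; a sketch with placeholder values (not taken from the document):

```python
import math
from statistics import NormalDist

# Hypothetical inputs for a work sampling study.
p = 0.30        # preliminary estimate of the proportion of time on an activity
e = 0.03        # desired absolute accuracy (plus or minus 3 percentage points)
confidence = 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
n = z ** 2 * p * (1 - p) / e ** 2

print(f"required random observations: {math.ceil(n)}")
```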
The document discusses different types of sampling designs used in research. It describes probability sampling methods like simple random sampling and systematic sampling which allow every unit in the population to have a chance of being selected. It also covers non-probability sampling which does not assure equal chance of selection. Key factors in sampling like sample size, target population, and parameters of interest are explained.
The document provides guidance on designing effective questionnaires. It emphasizes that questionnaires must have well-defined objectives in order to ask relevant questions and draw meaningful conclusions from the responses. Questions should follow logically from clear objectives. It also stresses that both open-ended and closed-format questions each have advantages, and the type of questions used should depend on the specific information needed. Demographic questions can help analyze response patterns among different groups. Overall, carefully considering objectives, question types, and question wording is essential for creating a questionnaire that efficiently gathers high-quality data.
The document summarizes a retail audit conducted on Lifebuoy, Lux, and Breeze soaps in Orai, Jalaun, India. The audit found that Lux and Lifebuoy were available in all 50 retail outlets surveyed, while Breeze was only available in 8 outlets. Most outlets displayed the HUL soaps at eye-level. Godrej No. 1 and Vivel were identified as the main competitors by retailers. On average, retailers sold over 40 dozen units of Lux and 18 dozen units of Lifebuoy per month. Most retailers were satisfied with the timely distribution of HUL products and their advertising effectiveness.
This document discusses different sampling methods used in research. It defines sampling as selecting a portion of a population to make generalizations about the whole population. There are two main types of sampling: probability sampling, where every member of the population has a known chance of being selected, and non-probability sampling, where not every member has a chance of selection. Some common probability sampling methods described are simple random sampling, systematic sampling, and stratified sampling. Non-probability sampling methods include convenience sampling and purposive sampling. The document provides examples and steps for implementing different sampling designs.
This document discusses sampling techniques used in research. It begins by defining key terms like population, census, sample, and sampling unit. It then outlines the 7 steps in the sampling process: 1) define the population, 2) identify the sampling frame, 3) specify the sampling unit, 4) specify the sampling method, 5) determine the sample size, 6) specify the sampling plan, and 7) select the sample. The document also discusses advantages and limitations of sampling, and describes probability and non-probability sampling designs. Probability sampling aims to give all units an equal chance of selection to help ensure results are representative of the overall population.
INDEX

Sampling and Sampling Distribution

1. Sampling
   1.1 What is Sampling?
   1.2 Why Sampling instead of Census?
   1.3 Sampling Methods
       1.3.1 Probability Sampling Methods
             1.3.1.1 Simple Random Sampling
             1.3.1.2 Systematic Sampling
             1.3.1.3 Stratified Sampling
             1.3.1.4 Cluster Sampling
       1.3.2 Non-Probability Sampling Methods
             1.3.2.1 Convenience Sampling
             1.3.2.2 Purposive Sampling
                     1.3.2.2.1 Judgment Sampling
                     1.3.2.2.2 Quota Sampling
2. Sampling Distribution
   2.1 Sampling Distribution of the Mean
   2.2 The Central Limit Theorem
   2.3 Sampling Distribution of the Variance
   2.4 The Chi-square Distribution
   2.5 Sampling Distribution of the Proportion
   2.6 The Confidence Level
3. Bibliography
***
1. Sampling
When managers use research, they are applying the methods of science to the art of management. Business operates in a world of uncertainty, and there is no unique method which can entirely eliminate this uncertainty. Nevertheless, research methodology can indeed minimise the extent of uncertainty and reduce the probability of making a wrong choice amongst alternative courses of action. Therefore, the increasingly complex nature of business and governance focusses more and more attention on the use of research methodology in solving managerial problems. In the prevailing, highly involved environment, neither a business decision nor a governmental decision can be made casually or based on intuition.
It is through appropriate data and their analysis that the decision maker becomes equipped with proper tools of decision making. Needless to say, the credibility of the results derived from the application of such methodology is dependent upon the reliability of the data included in the analysis.
The quantitative tool of inferential statistics is extensively used to address managerial and business problems by using the relevant data. Inferential statistics are the quantitative tools that use samples to estimate population parameters, beyond what could arise merely by chance. Good research is only as good as the design, methods, and statistics used. Yet the design, methods, and statistics are useless if, first, an optimal sample is not used. Thus sampling is the cornerstone of any business research.
1.1 What is Sampling?
The terminology "sampling" indicates the selection of a part of a group or an aggregate with a view to obtaining information about the whole (Figure 1: Research Methodology; a sample is drawn from the population by sampling, and the sample statistic is used, through estimation and inference, to learn about the population parameter). This aggregate or the
totality of all members is known as the Population, although the members need not be human beings. The selected part, which is used to ascertain the characteristics of the population, is called the Sample. While choosing a sample, the population is assumed to be composed of individual units or members, some of which are included in the sample. The total number of members of the population and the number included in the sample are called the Population Size and the Sample Size respectively. The concept can be shown through the following Venn diagram, where the population is a universal set and the sample is shown as a true subset.
Population: Set of all items
Sample: Set of chosen items
(Figure 2: Population & Sample)
The process of generalising on the basis of information collected on a part is really a
traditional practice. With the advancement of management science more sophisticated
applications of sampling in business and industry are available. Sampling methodology can
be used by an auditor or an accountant to estimate the value of total inventory in the stores
without actually inspecting all the items physically. Opinion polls based on samples are used
to forecast the result of a forthcoming election.
1.2 Why Sampling instead of Census?
The census or complete enumeration consists of collecting data from each and every unit of the population. Sampling, in contrast, chooses only a part of the units from the population for the same study. Sampling has a number of advantages over complete enumeration, for a variety of reasons.
Cost
The first obvious advantage of sampling is that it is less expensive. If we want to study consumer reaction before launching a new product, it will be much less expensive to carry out a consumer survey based on a sample rather than studying the entire population, which is the potential group of customers. Although in a decennial census every individual is enumerated, certain aspects of the population are studied on a sample basis with a view to reducing cost.
Time
The smaller size of the sample enables us to collect the data more quickly than surveying all the units of the population, even if we are willing to spend money. This is particularly the case if the decision is time-bound. An accountant may want to know the total inventory value quickly in order to prepare a periodical report like a quarterly balance sheet and a profit and loss account; a detailed study of the inventory is likely to take too long to enable him to prepare the report in time. If we want to measure the Consumer Price Index in a particular month, we cannot collect data on all consumer prices even if the expenditure is not a hindrance. The collection of data on all the consumer items and their processing would in all probability take a long time, and thus, when ready, the price index would not serve any meaningful purpose.
Accuracy
It is possible to achieve greater accuracy by using appropriate sampling techniques than by a complete enumeration of all the units of the population. Contrary to common belief, complete enumeration may result in inaccuracies in the data owing to the fatigue of the enumerator, or to spurious and unreliable data collected because of the large volume. On the other hand, if a small number of items is observed, the basic data will be much more accurate. It is of course true that a conclusion about a population characteristic, such as the proportion of defective items, drawn from a sample will also introduce error into the system. However, such errors, known as sampling errors, can be studied and controlled, and probability statements can be made about their magnitude. The inaccuracy which results, for instance, from the fatigue of the inspector is known as non-sampling error. It is difficult to recognise the pattern of non-sampling error, and it is not possible to make any comment about its magnitude, even probabilistically.
Reliability of Inference
In many cases, sampling provides adequate information so that not much additional
reliability can be gained with complete enumeration in spite of spending large amounts of
additional money and time. It is also possible to quantify the magnitude of the possible error when using some types of sampling, which is not the case with the census approach.
Impossibility of complete enumeration
In many situations the item being studied gets destroyed while being tested, and sampling is indispensable under such circumstances. If one is interested in computing the average life of Compact Fluorescent Lamps (CFLs) supplied in a batch, the life of the entire batch cannot be examined, since this would mean that the entire supply is wasted. In such cases there is no alternative but to examine the life of a sample of CFLs and draw an inference about the entire batch.
Infeasibility of complete enumeration
More often than not, it is practically infeasible to do a complete enumeration due to many practical difficulties. For example, suppose a shaving gel manufacturer wants to launch a new and improved version of its gel. To get consumer feedback, the manufacturer distributes the old version of the gel to, say, 500 consumers and after a week or so replaces it with the new version to get feedback on various attributes of the product. In this situation, it would be infeasible to collect information from all the consumers of shaving gel in India. Some consumers would have moved from one place to another during the period of study, some others would have stopped consuming shaving gel just before the period of study, whereas some others would have been users of shaving gel during the period of study but would have stopped using it some time later. In such situations, although it is theoretically possible to do a complete enumeration, it is practically infeasible to do so.
The above account clearly establishes that a research study gives more reliable results, at greater convenience, by way of sampling than by a study of the entire population.
1.3 Sampling Methods
A sampling frame is a list of all the units of the population. The sampling frame should always be kept up to date and be free from errors of omission and duplication of sampling units. A perfect frame identifies each element once and only once. Perfect frames are seldom available in real life. Nevertheless, it needs to be ensured that the sampling frame is complete, accurate, adequate and up-to-date.
Further, depending on the requirements of the research, sampling methods are broadly categorised into two groups, viz. probability sampling methods and non-probability sampling methods, as depicted in Figure 3.
1.3.1 Probability Sampling Methods
In probability sampling methods, the population from which the sample is drawn must be known to the researcher, and every item of the population has a known, non-zero chance of inclusion in the sample (an equal chance, in the case of simple random sampling). The lottery method, in which a student's name is drawn blindfold from a box containing the names of all students, is the classic example of random sampling; it is an unbiased technique and a sound process for selecting a representative sample. The major disadvantage is that this technique needs the complete sampling frame, i.e. the list of all the items in the population, which is not always available.
The probability sampling methods are of four types, viz. Simple Random Sampling, Systematic Sampling, Stratified Sampling and Cluster Sampling.
(Figure 3: Sampling Methods. Probability methods: Simple Random, Systematic, Stratified, Cluster. Non-probability methods: Convenience, Purposive, Quota, Judgment.)
1.3.1.1 Simple Random Sampling
Simple random sampling is based on the concept of probability. The use of probability in
sampling theory makes it a reliable tool to draw inference or conclusion about the
population. Although the types of conclusion or inference can be quite diverse, two
particular types of decision making are quite prevalent in problems of business and
government.
On various occasions, the management would like to know the percentage or proportion of units in the population with a certain characteristic. An organisation selling a consumer product may like to know the proportion of potential consumers using a certain type of cosmetic. The government may like to know the percentage of small farmers owning some cultivable land in a rural region. A manufacturer planning to export some product may be interested to ascertain the proportion of defect-free units his system is capable of manufacturing.
The representative character of a sample is ensured by allocating some probability to each
unit of the population for being included in the sample. The simple random sample assigns
equal probability to each unit of the population. The simple random sample can be chosen
both with and without replacement.
Simple Random Sampling with Replacement
Suppose the population consists of N units and we want to select a sample of size n. In simple random sampling with replacement, we choose an observation from the population in such a manner that every unit of the population has an equal chance of 1/N of being included in the sample. After the first unit is selected, its value is recorded and it is placed back in the population. The second unit is drawn in exactly the same manner as the first unit. This procedure is continued until the nth unit of the sample is selected. Obviously, in this case each unit of the population has an equal chance of 1/N of being included in each of the n draws of the sample.
Simple Random Sampling without Replacement
In this case, when the first unit is chosen, every unit of the population has a chance of 1/N of being included in the sample. After the first unit is chosen, it is no longer replaced in the population. The second unit is selected from the remaining (N-1) members of the population, so that each unit has a chance of $\frac{1}{N-1}$ of being included in the sample. The procedure is continued till the nth unit of the sample is chosen, with probability $\frac{1}{N-n+1}$.
Random numbers for simple random sampling are generated using a probabilistic mechanism.
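As a minimal illustration of the two schemes, the following Python sketch (the frame of 100 labelled units and the seed are illustrative assumptions) draws one sample of size n = 10 each way, using only the standard library:

```python
import random

population = list(range(1, 101))   # hypothetical sampling frame: N = 100 labelled units
n = 10
random.seed(42)                    # fixed seed, only so the run is reproducible

# Without replacement: a selected unit is not returned, so it appears at most once.
srs_without = random.sample(population, n)

# With replacement: each draw is made from the full population of N units,
# so every draw gives every unit the same chance of 1/N.
srs_with = random.choices(population, k=n)

print(sorted(srs_without))
print(sorted(srs_with))            # may contain repeats
```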
1.3.1.2 Systematic Sampling
Systematic sampling involves selecting items using a constant interval between the selections, determined by the sampling ratio, with the first interval having a random start. For example, if a sample of size 10 from a population of size 100 is required, the sampling ratio would be n/N = 10/100 = 1/10. It would, therefore, have to be decided where to start from among the first 10 names in our sampling frame. If this number happens to be 5, for example, then the sample would contain the members having serial numbers 5, 15, 25, 35, …, 95 in the frame. It is noteworthy that the random process establishes only the first member of the sample; the rest are pre-determined by the known sampling ratio. Usually the starting serial number of the sample is decided by allowing chance to play its role, using a table of random numbers. In other words, the sampling starts by selecting an element from the list at random, and then every kth element in the frame is selected, where k, the sampling interval (sometimes known as the skip), is calculated as $k = \frac{N}{n}$, where n is the sample size and N is the population size.
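A minimal sketch of this procedure in Python (N = 100 and n = 10 are the illustrative figures used above):

```python
import random

N, n = 100, 10
k = N // n                             # sampling interval, the "skip" (ratio 1/10 -> k = 10)
random.seed(1)
start = random.randint(1, k)           # chance decides the start among the first k serial numbers
sample = list(range(start, N + 1, k))  # start, start + k, start + 2k, ...
print(sample)                          # e.g. serial numbers 5, 15, 25, ..., 95 if start == 5
```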
Systematic sampling is relatively much easier to implement compared to simple random
sampling. However, there is one possibility that should be guarded against while using
systematic sampling - the possibility of a strong bias in the results if there is any periodicity
in the frame that parallels the sampling ratio. For example if someone were making studies
on the demand for various banking transactions in a bank branch by studying the demand
on some days randomly selected by systematic sampling and the chosen sampling ratio is
1/7 or 1/14 etc, he would always be studying the demand on the same day of the week and
the inferences could be biased depending on whether the day selected is a Monday or a
Friday and so on.
If the frame is arranged in an order, ascending or descending, of some attribute then the
location of the first sample element may affect the result of the study. For example, if the
frame contains a list of students arranged in a descending order of their percentage in the
previous examination and we are picking a systematic sample with a sampling ratio of 1/50.
If the first number picked is 1 or 2, then the sample chosen will be academically much better
off compared to another systematic sample with the first number chosen as 49 or 50. In
such situations, one should devise ways of nullifying the effect of bias due to starting number
by insisting on multiple starts after a small cycle or other such means.
On the other hand, if the frame is so arranged that similar elements are grouped together,
then systematic sampling produces almost a proportional stratified sample and would be,
therefore, more statistically efficient than simple random sampling.
Systematic sampling is perhaps the most commonly used method among the probability
sampling designs and for many purposes e.g. for estimating the precision of the results,
systematic samples are treated as simple random samples.
1.3.1.3 Stratified Sampling
Simple random sampling may not always provide a representative snapshot of the population. Certain segments of a population can easily be under-represented when an unrestricted random sample is chosen. Hence, when considerable heterogeneity is present in the population with regard to the subject matter under study, it is often a good idea to divide the population into segments or strata and select a certain number of sampling units from each stratum, thus ensuring representation from all relevant segments (Figure 5: Stratified Sampling, in which the population is divided into strata of sizes $N_1, N_2, \dots, N_p$ and samples of sizes $n_1, n_2, \dots, n_p$ are drawn from them). Thus, for designing a suitable marketing strategy for a consumer durable, the population of consumers may be divided into strata by income level and a certain number of consumers can be selected randomly from each stratum.
Therefore, in stratified random sampling the population is first divided into different homogeneous groups or strata, which may be based upon a single criterion, such as sex, or upon a combination of criteria, like sex, caste, level of education and so on. This method is generally applied when different categories of individuals constitute the population, viz. General, OBC, SC, ST; or upper income, middle income, lower income; or small farmers, big farmers, marginal farmers, landless farmers, etc. To obtain a true picture of a particular population regarding, say, the standard of living, it is advisable in the case of India to categorise the population on the basis of caste, religion or land holding; otherwise some sections may be under-represented or not represented at all.
Stratified random sampling may be either Proportionate Stratified Random Sampling or Disproportionate Stratified Random Sampling.
Proportionate Stratified Random Sampling
In the proportionate stratified random sampling method, the researcher stratifies the population according to known characteristics and subsequently draws the sample randomly from each stratum in proportion to the stratum's share of the population. That is, the population is divided into several sub-populations, called strata, depending upon certain known characteristics, and each stratum is internally homogeneous. For example, suppose a town area committee consists of 15,000 voters, among whom 60% are Hindus, 30% are Muslims and 10% are others, and the researcher wants to draw a sample of 300 voters from the population in these proportions. That can be done by multiplying the sample size by each proportion: the sample size for Hindu voters will be 300 x 60% = 180, for Muslims 300 x 30% = 90, and for others 300 x 10% = 30. The researcher then has to obtain the complete voter list of the town and randomly select the sample from each category as calculated above. In this method the sampling error is minimised and the sample possesses all the required characteristics of the population.
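The voter example can be sketched in Python as follows (the stand-in voter lists are assumptions; a real study would use the actual voter list):

```python
import random

n_total = 300
strata = {"Hindu": 9000, "Muslim": 4500, "Others": 1500}  # 60% / 30% / 10% of 15,000 voters
N = sum(strata.values())

random.seed(7)
sample = {}
for name, size in strata.items():
    n_h = round(n_total * size / N)                # proportional allocation: 180 / 90 / 30
    voters = [f"{name}-{i}" for i in range(size)]  # stand-in for the real voter list
    sample[name] = random.sample(voters, n_h)      # simple random sample within the stratum

print({name: len(chosen) for name, chosen in sample.items()})
# -> {'Hindu': 180, 'Muslim': 90, 'Others': 30}
```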
Disproportionate Stratified Random Sampling
In this method the number of sampling units taken from each stratum need not be in proportion to the stratum's share of the population. Suppose that for the said town the researcher wants to know the voting pattern of male and female voters among Hindu, Muslim and other voters; in that case he must take equal numbers of male and female voters from each category, giving equal weightage to each stratum. This is a biased type of sampling: some strata are over-represented and some under-represented, so the sample is not truly representative; still, it is useful in some special cases.
If the different strata in the population have unequal variances of the characteristic being
measured, then the sample size allocation decision should consider the variance as well. It
would be logical to have a smaller sample from a stratum where the variance is smaller than
from another stratum where the variance is higher. In fact, if $\sigma_1^2, \sigma_2^2, \dots, \sigma_p^2$ are the variances of the $p$ strata, then the statistical efficiency is highest when

$$\frac{n_1}{N_1 \sigma_1} = \frac{n_2}{N_2 \sigma_2} = \dots = \frac{n_p}{N_p \sigma_p}$$
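A small numerical sketch of this allocation rule (the strata sizes and within-stratum standard deviations are assumed figures): setting $n_i$ proportional to $N_i \sigma_i$ makes the ratios above equal.

```python
N_sizes = [9000, 4500, 1500]   # assumed stratum sizes N_1, N_2, N_3
sigmas  = [2.0, 5.0, 10.0]     # assumed within-stratum standard deviations
n_total = 300

# n_i proportional to N_i * sigma_i keeps n_i / (N_i * sigma_i) constant across strata.
weights = [N * s for N, s in zip(N_sizes, sigmas)]
allocation = [round(n_total * w / sum(weights)) for w in weights]
print(allocation)  # [97, 122, 81]: high-variance strata get more than proportional shares
```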
1.3.1.4 Cluster Sampling
This is another type of probability sampling method, in which the sampling units are not individual elements of the population; instead, groups of elements, or groups of individuals, are selected as the sample. In cluster sampling the total population is divided into a number of relatively small sub-divisions or groups, which are themselves clusters, and then some of these clusters are randomly selected for inclusion in the sample. Suppose a researcher wants to study the functioning of the mid-day meal service in a district; he can use the schools clustered in a block or two rather than selecting schools scattered all over the district. Cluster sampling reduces the cost and labour of data collection but is less precise than random sampling.
We can now compare cluster sampling with stratified sampling. Stratification is done to make the strata homogeneous within and different from other strata. Clusters, on the other hand, should be heterogeneous within, and the different clusters should be similar to each other. A cluster, ideally, is a mini-population and has all the features of the population.
The criterion used for stratification is a variable which is closely associated with the characteristic we are measuring, e.g. income level when we are measuring the family consumption of non-aerated beverages. On the other hand, convenience of data collection is usually the basis for cluster definitions. Geographic contiguity is quite often used for cluster definitions, and in such cases cluster sampling is also known as Area Sampling.
In stratified sampling there are relatively few strata, and one picks a random sample from each stratum for drawing inferences. In cluster sampling, there are many clusters, out of which only a few are picked by random sampling, and then those clusters are completely enumerated.
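A minimal Python sketch of this contrast (the frame of 40 blocks with 25 schools each is an assumption): a few clusters are drawn at random and then completely enumerated.

```python
import random

# Assumed frame: 40 blocks (clusters), each containing 25 schools.
clusters = {block: [f"school-{block}-{i}" for i in range(25)] for block in range(40)}

random.seed(3)
chosen_blocks = random.sample(list(clusters), 4)          # randomly pick a few clusters...
sample = [s for b in chosen_blocks for s in clusters[b]]  # ...and enumerate them completely
print(len(sample))                                        # 4 clusters x 25 schools = 100
```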
Multi-stage and Multi-phase Sampling
In this method sampling is drawn more than once. It is used in most large surveys, where the sampling unit is something larger than an individual element of the population in all stages but the final one. For example, in a national survey on the demand for fertilizers, one might use stratified sampling in the first stage with the district as the sampling unit and the average rainfall in the district as the criterion for stratification. Having obtained 20 districts from this stage, cluster sampling may be used in the second stage to pick 10 villages in each of the selected districts. Finally, in the third stage, stratified sampling may be used in each village to pick farms in each of the strata defined with land holding as the criterion.
Multi-phase sampling, on the other hand, is designed to make use of the information collected in
one phase to develop a sampling design in a subsequent phase. A study with two phases is often
called Double Sampling. The first phase of the study might reveal a relationship between the family
consumption of non-aerated beverages and the family income and this information would then be
used in the second phase to stratify the population with family income as the criterion.
1.3.2 Non-Probability Sampling Methods
Probability sampling has some theoretical advantages over non-probability sampling: the bias introduced by sampling can be eliminated, and it is possible to set a confidence interval for the population parameter being studied. In spite of these advantages, non-probability sampling is used quite frequently in many sampling surveys, for reasons that are all practical.
Probability sampling requires a list of all the sampling units, and this frame is not available in many situations; nor is it practically feasible to develop a frame of, say, all the households in a city or a zone or ward of a city. Sometimes the objective of the study may not be to draw a statistical inference about the population but to get familiar with extreme cases or other such objectives. In a dealer survey, our objective may be to become familiar with the problems faced by our dealers so that we can take some corrective actions wherever possible. Probability sampling is rigorous, and this rigour, e.g. in selecting samples, adds to the cost of the study. And finally, even when we are doing probability sampling, there are chances of deviations from the laid-out process, especially where some samples are selected by the interviewers on site, say after reaching a village. Also, some of the sample members may not agree to be interviewed, or may not be available to be interviewed, and our sample may turn out to be a non-probability sample in the strictest sense of the term.
1.3.2.1 Convenience Sampling
In this type of non-probability sampling, the choice of the sample is left completely to the convenience of the researcher. The cost involved in picking the sample is minimal and the cost of data collection is also generally low; e.g. the researcher can go to some retail shops and interview some shoppers while studying the demand for some commodity.
Another form of convenience sampling is known as 'Snowball Sampling'. This is a sociometric sampling technique generally used to study small groups. All the persons in a group identify their friends, who in turn know their friends and colleagues, until the informal relationships converge into some type of definite social pattern. It is like a snowball increasing in size as it rolls down an ice-field. For example, in research on drug addiction it is difficult to find out who the drug users are, but when one person is identified he can give the names of his partners, and each of his partners can give another two or three names of people whom he knows to use drugs. In this way the required number of persons is identified and data are collected. This method is suitable for studies of the diffusion of innovation, network analysis and decision making.
However, such samples can suffer from excessive bias from known or unknown sources and also
there is no way that the possible errors can be quantified.
1.3.2.2 Purposive Sampling
In convenience sampling, any member of the population can be included in the sample without any restriction. When some restrictions are put on the possible inclusion of a member in the sample, the sampling is called purposive. This is a non-random sampling method in which the researcher selects the sample arbitrarily, choosing elements he considers important for the research and believes to be typical and representative of the population. Say a researcher wants to forecast a political party's chance of coming to power in a general election. He may select some reporters, some teachers and some elite people of the territory and collect their opinions for the purpose of his study, considering these to be the leading persons whose views are relevant to the party's chances. As it is a purposive method, it can have large sampling errors and can lead to misleading conclusions.
The purposive sampling is broadly of two types, viz. Judgment Sampling and Quota Sampling.
1.3.2.2.1 Judgment Sampling
In judgment sampling, the judgment or opinion of some experts forms the basis for sample selection.
The experts are persons who are believed to have information on the population which can help in
giving us better samples. Such sampling is very useful when we want to study rare events, or when
members have extreme positions, or even when the objective of the study is to collect a wide cross-
section of views from one extreme to the other.
1.3.2.2.2 Quota Sampling
Even while using non-probability sampling, one might want the sample to be representative of the population in some defined ways. This is sought to be achieved in quota sampling, so that the bias introduced by sampling may be reduced.
If in a given population, 25% of the members belong to the high income group, 25% to the middle
income group, 35% to the low income group and 15 % are Below Poverty Line (BPL) and we are using
quota sampling, we would specify that the sample should also contain members in the same
proportion as in the population e.g. 15% of the sample members would belong to the BPL group
and so on.
The criteria used to set quotas could be many. For example, family size could be another criterion
and we can set quotas for families with family size upto 3, between 4 & 5, and above 5. However, if
the number of such criteria is large, it becomes difficult to locate sample members satisfying the
combination of the criteria. In such cases, the overall relative frequency of each criterion in the
sample is matched with the overall relative frequency of the criterion in the population.
This method of sampling is almost the same as the stratified random sampling described above; the only difference is that in selecting the elements randomisation is not done, and the quota is taken into consideration instead. As quota sampling is not random, the method is biased and can lead to large sampling errors.
2. The Sampling Distribution
Sample statistics form the basis of all inferences drawn about populations. If we know the probability
distribution of the sample statistic, then we can calculate the probability that the sample statistic
assumes a particular value (if it is a discrete random variable) or has a value in a given interval. This
ability to calculate the probability that the sample statistic lies in a particular interval is the most
important factor in all statistical inferences. Let’s demonstrate this by an example.
Suppose we know that 55% of the population of all users of Shampoo prefer brand ‘A’ to the next
competing brand. A “new improved” version of ‘A’ has been developed and given to a random
sample of 200 shampoo users for use. If 120 of these prefer the “new improved” version to the next
competing brand, what should one conclude? For an answer, we would like to know the probability
that the sample proportion in a sample of size 200 is as large as 60% or higher when the true
population proportion is only 55%, i.e. assuming that the new version is no better than the old. If
this probability is quite large, say 0.5, we might conclude that the high sample proportion viz. 60% is
perhaps because of sampling errors and the new version is not really superior to the old. On the
other hand, if this probability works out to a very small figure, say 0.001, then we might conclude
that the true population proportion is higher than 55%, i.e. the new version is actually superior to
the old one as perceived by members of the population. To calculate this probability, we need to
know the probability distribution of sample proportion or the sampling distribution of the
proportion.
The sampling distribution, thus, is a distribution of a sample statistic. It is a model of a distribution
of scores, like the population distribution, except that the scores are not raw scores, but statistics. It
is a thought experiment; "what would the world be like if a person repeatedly took samples of size
N from the population distribution and computed a particular statistic each time?" The resulting
distribution of statistics is called the sampling distribution of that statistic.
For example, suppose that a sample of size sixteen (N=16) is taken from some population. The mean
of the sixteen numbers is computed. Next a new sample of sixteen is taken, and the mean is again
computed. If this process were repeated an infinite number of times, the distribution of the now
infinite number of sample means would be called the sampling distribution of the mean. Similarly,
every statistic has a sampling distribution.
Just as population models can be described with parameters, so can the sampling distribution. The expected value (analogous to the mean) of a sampling distribution will be represented here by the symbol $\mu$. The $\mu$ symbol is often written with a subscript to indicate which sampling distribution is being discussed. For example, the expected value of the sampling distribution of the mean is represented by the symbol $\mu_{\bar{x}}$, that of the median by $\mu_{Md}$, etc. The value of $\mu_{\bar{x}}$ can be thought of as the mean of the distribution of means. In a similar manner, the value of $\mu_{Md}$ is the mean of a distribution of medians. They are not really means, because it is not possible to find a mean when $N = \infty$, but they are the mathematical equivalent of a mean.
Using advanced mathematics, in a thought experiment, the theoretical statistician often discovers a relationship between the expected value of a statistic and the model parameters. For example, it can be proven that the expected value of both the mean and the median, $\bar{X}$ and $Md$, is equal to $\mu_x$. When the expected value of a statistic equals a population parameter, the statistic is called an unbiased estimator of that parameter. In this case, both the mean and the median would be unbiased estimators of the parameter $\mu_x$.
A sampling distribution may also be described with a parameter corresponding to a variance, symbolized by $\sigma^2$. The square root of this parameter is given a special name, the standard error. Each sampling distribution has a standard error. In order to keep them straight, each has a name tagged on the end of the phrase "standard error" and a subscript on the $\sigma$ symbol. The standard deviation of the sampling distribution of the mean is called the standard error of the mean and is symbolized by $\sigma_{\bar{x}}$. Similarly, the standard deviation of the sampling distribution of the median is called the standard error of the median and is symbolized by $\sigma_{Md}$.
In each case the standard error of a statistic describes the degree to which the computed statistics will differ from one another when calculated from samples of similar size selected from similar population models. The larger the standard error, the greater the difference between the computed statistics. Consistency is a valuable property to have in the estimation of a population parameter: the statistic with the smallest standard error is preferred as the estimator of the corresponding population parameter, everything else being equal. Statisticians have proven that in most cases the standard error of the mean is smaller than the standard error of the median. Because of this property, the mean is the preferred estimator of $\mu_x$.
In practice, we refer to the sampling distributions of only the commonly used sampling statistics like
the sample mean, sample variance, sample proportion, sample median etc., which have a role in
making inferences about the population.
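The thought experiment above can be carried out numerically. The following Python sketch (the population parameters, sample size and number of repetitions are illustrative assumptions) repeatedly draws samples of size 16 from a normal population and compares the standard errors of the mean and the median:

```python
import random
import statistics

random.seed(0)
mu, sigma, n, reps = 10.0, 2.0, 16, 20000

means, medians = [], []
for _ in range(reps):                         # "repeatedly take samples of size N..."
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.mean(sample))     # "...and compute a statistic each time"
    medians.append(statistics.median(sample))

# Both statistics centre on mu = 10, but the mean has the smaller standard error.
print(statistics.mean(means), statistics.stdev(means))      # ~10.0 and ~sigma/sqrt(n) = 0.5
print(statistics.mean(medians), statistics.stdev(medians))  # ~10.0 and a visibly larger value
```

For a normal population, the standard error of the median comes out roughly 25% larger than that of the mean, which is why the mean is preferred as the estimator of $\mu_x$.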
2.1 The Sampling Distribution of the Mean
There are many (infinitely many!) possible values of the sample mean, and the particular value that we obtain, if we pick only one sample, is determined only by chance. The distribution of the sample mean is referred to as the sampling distribution of the mean. However, to observe the distribution of $\bar{x}$ empirically, we have to take many samples of size n and determine the value of $\bar{x}$ for each sample. Then, looking at the various observed values of $\bar{x}$, it might be possible to get an idea of the nature of the distribution. Such a sampling distribution of the mean is known as the distribution of sample means. This distribution is described by the parameters $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}$.
Sampling from Infinite Populations
Let’s study two cases –
1. Where the population is infinitely large or when the sampling is done with replacement
2. Where the population is finite and we are sampling without replacement
For the first scenario, let us assume we have a population which is infinitely large, with a population mean of $\mu$ and a population variance of $\sigma^2$. This implies that if x is a random variable denoting the measurement of the characteristic that we are interested in, on one element of the population picked up at random, then the expected value of x is $E(x) = \mu$ and the variance of x is $Var(x) = \sigma^2$.
The sample mean, $\bar{x}$, can be looked at as the sum of the n random variables $x_1, x_2, \dots, x_n$, each divided by n. Here $x_1$ is a random variable representing the first observed value in the sample, $x_2$ the second observed value, and so on. Now, when the population is infinitely large, whatever the value of $x_1$, the distribution of $x_2$ is not affected by it. This is true of any other pair of random variables as well. In other words, $x_1, x_2, \dots, x_n$ are independent random variables, all picked from the same population.
Therefore $E(x_i) = \mu$ and $Var(x_i) = \sigma^2$ for each $i = 1, 2, \dots, n$.
Finally,

$$E(\bar{x}) = E\left(\frac{x_1 + x_2 + \dots + x_n}{n}\right) = \frac{1}{n}E(x_1) + \frac{1}{n}E(x_2) + \dots + \frac{1}{n}E(x_n) = \frac{1}{n}\mu + \frac{1}{n}\mu + \dots + \frac{1}{n}\mu = \mu$$
This means that the expected value of the sample mean is the same as the population mean.
and, because the $x_i$ are independent, the variances add:

$$Var(\bar{x}) = Var\left(\frac{x_1 + x_2 + \dots + x_n}{n}\right) = Var\left(\frac{x_1}{n}\right) + Var\left(\frac{x_2}{n}\right) + \dots + Var\left(\frac{x_n}{n}\right)$$

$$= \frac{1}{n^2}Var(x_1) + \frac{1}{n^2}Var(x_2) + \dots + \frac{1}{n^2}Var(x_n) = \frac{1}{n^2}\sigma^2 + \frac{1}{n^2}\sigma^2 + \dots + \frac{1}{n^2}\sigma^2 = \frac{\sigma^2}{n}$$
This says that the variance of the sample mean is the variance of the population divided by the sample size. If we take a large number of samples of size n, then the average value of the sample means tends to be close to the true population mean. On the other hand, if the sample size is increased, the variance of $\bar{x}$ is reduced; by selecting an appropriately large value of n, the variance of $\bar{x}$ can be made as small as desired.
The standard deviation of 𝑥𝑥̅ is also called the standard error of the mean. Very often we estimate
the population mean by the sample mean. The standard error of the mean indicates the extent to
which the observed value of sample mean can be away from the true value, due to sampling errors.
For example, if the standard error of the mean is small, we are reasonably confident that
whatever sample mean value we have observed cannot be very far away from the true value. The
standard error of the mean is represented by 𝜎𝜎𝑥𝑥̅.
Sampling with Replacement
The above results have been obtained under the assumption that the random variables $x_1, x_2, \dots, x_n$ are independent. This assumption is valid when the population is infinitely large. It is also valid when the sampling is done with replacement, so that the population is back to the same form before the next sample member is picked.

Hence, if the sampling is done with replacement, we would again have

$$E(\bar{x}) = \mu \quad \text{and} \quad Var(\bar{x}) = \frac{\sigma^2}{n}, \quad \text{meaning thereby that} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
Sampling Without Replacement from Finite Populations
When a sample is picked without replacement from a finite population, the probability distribution of the second random variable depends on the outcome of the first pick, and so on. As the n random variables representing the n sample members do not remain independent, the expression for the variance of $\bar{x}$ changes. The derivation for this situation works out as under:

$$E(\bar{x}) = \mu \quad \text{and} \quad Var(\bar{x}) = \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}, \quad \text{meaning thereby that} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}$$

By comparing these expressions with the ones derived above, we find that the standard error of $\bar{x}$ is the same as before, but multiplied by a factor $\sqrt{(N-n)/(N-1)}$. This factor is, therefore, known as the finite population multiplier.
In practice, almost all the samples used are picked without replacement. Also, most populations are finite, although they may be very large, and so the standard error of the mean should theoretically be found by using the expression given above. However, if the population size (N) is large and consequently the sampling ratio (n/N) small, then the finite population multiplier is close to 1 and is not used, thus treating large finite populations as if they were infinitely large. For example, if N = 5,00,000 and n = 500, the finite population multiplier is

$$\sqrt{\frac{N-n}{N-1}} = \sqrt{\frac{500000-500}{500000-1}} = \sqrt{\frac{499500}{499999}} = \sqrt{0.999002} = 0.9995$$

which is very close to 1, and the standard error of the mean would, for all practical purposes, be the same whether the population is treated as finite or infinite. As a rule of thumb, the finite population multiplier need not be used if the sampling ratio (n/N) is smaller than 0.05.
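A small helper, sketched in Python, makes the rule concrete (the function name is ours, not a standard one):

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean; applies the finite population multiplier when N is given."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))   # finite population multiplier
    return se

# The example from the text: N = 5,00,000 and n = 500 (sampling ratio 0.001 < 0.05).
print(math.sqrt((500000 - 500) / (500000 - 1)))                    # 0.9995..., very close to 1
print(standard_error(10, 500), standard_error(10, 500, N=500000))  # practically identical
```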
Sampling from Normal Populations
It has been observed that the normal distribution occurs very frequently among many natural
phenomena. For example, heights or weights of individuals, the weights of filled-bags from an
automatic machine, the hardness obtained by heat treatment, etc. are distributed normally.
It is also a known fact that the sum of two independent random variables will follow a normal distribution if each of the two random variables belongs to a normal population. The sample mean, as we have seen earlier, is the sum of the n random variables $x_1, x_2, \dots, x_n$, each divided by n. Now, if each of these random variables is from the same normal population, it is not difficult to see that $\bar{x}$ would also be distributed normally.
Let $x \sim N(\mu, \sigma^2)$ symbolically represent the fact that the random variable x is distributed normally with mean $\mu$ and variance $\sigma^2$. Thus,

if $x \sim N(\mu, \sigma^2)$, then it follows that $\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
The normal distribution is a continuous distribution and so the population cannot be small and finite
if it is distributed normally; that is why the finite population multiplier is not used in the above
expression. Let’s see, by an example, how to make use of the above result.
Suppose the weight of candy produced on a semi-automatic machine is known to be distributed
normally with a mean of 10 mg and a standard deviation of 0.1 mg. If we pick up a random sample
of size 5, what is the probability that the sample mean will be between 9.95 mg and 10.05 mg?
Let x be a random variable representing the weight of one candy picked at random. We know that $x \sim N(10, 0.01)$. Therefore, it follows that

$$\bar{x} \sim N\left(10, \frac{0.01}{5}\right)$$

This denotes that $\bar{x}$ will be distributed normally with a mean of 10 and a variance which is only 1/5 of the variance of the population, since the sample size is 5.
$$\Pr\{9.95 \le \bar{x} \le 10.05\} = 2 \times \Pr\{10 \le \bar{x} \le 10.05\}$$

$$= 2 \times \Pr\left\{\frac{10 - \mu}{\sigma/\sqrt{n}} \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le \frac{10.05 - \mu}{\sigma/\sqrt{n}}\right\} = 2 \times \Pr\left\{0 \le z \le \frac{10.05 - 10}{0.1/\sqrt{5}}\right\}$$

$$= 2 \times \Pr\{0 \le z \le 1.12\} = 2 \times 0.3686 = 0.7372$$
Figure 6: Distribution of $\bar{x}$; the enclosed area represents the probability that the random variable $\bar{x}$ lies between 9.95 and 10.05.
We first make use of the symmetry of the normal distribution and then calculate the z value by subtracting the mean and dividing by the standard deviation of the normally distributed random variable, viz. $\bar{x}$. The probability of interest is shown as the enclosed area in Figure 6 above.
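The same probability can be checked without tables using Python's standard library (statistics.NormalDist):

```python
from statistics import NormalDist

mu, sigma, n = 10.0, 0.1, 5
xbar_dist = NormalDist(mu, sigma / n ** 0.5)   # x-bar ~ N(10, 0.1/sqrt(5))

p = xbar_dist.cdf(10.05) - xbar_dist.cdf(9.95)
print(round(p, 4))  # ~0.7364; the table-based 0.7372 differs slightly because z was rounded to 1.12
```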
2.2 The Central Limit Theorem
The above parameters are closely related to the parameters of the population distribution, with the relationship being described by the Central Limit Theorem. The Central Limit Theorem essentially states that the mean of the sampling distribution of the mean ($\mu_{\bar{x}}$) equals the mean of the population ($\mu_x$), that the standard error of the mean ($\sigma_{\bar{x}}$) equals the standard deviation of the population ($\sigma_x$) divided by the square root of the sample size, and that, as the sample size grows infinitely large ($N \to \infty$), the sampling distribution of the mean approaches a normal distribution. These relationships may be summarized as follows:

$$\mu_{\bar{x}} = \mu_x \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}}$$

In theory, the sample size needs to be infinitely large for these relationships to hold exactly; in practice, an infinite sample size is impossible. In most situations encountered by researchers, however, the Central Limit Theorem works reasonably well with an N greater than 10 or 20. Thus, it is possible to closely approximate what the distribution of sample means looks like, even with relatively small sample sizes.
The importance of the Central Limit Theorem to statistical thinking cannot be overstated. Most of hypothesis testing and sampling theory is based on this theorem. In addition, it provides a justification for using the normal curve as a model for many naturally occurring phenomena. If a trait, such as intelligence, can be thought of as a combination of relatively independent events, in this case both genetic and environmental, then it would be expected that the trait would be normally distributed in a population.
We need to use the Central Limit Theorem when the population distribution is either unknown or known to be non-normal. If the population distribution is known to be normal, then $\bar{x}$ will also be distributed normally, irrespective of the sample size.
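A short simulation sketch illustrates the theorem for a deliberately non-normal population (an exponential distribution with mean 1, chosen purely for illustration):

```python
import random
import statistics

random.seed(0)
n, reps = 30, 10000
# Exponential population: strongly skewed, with mean = 1 and sigma = 1.
means = [statistics.mean(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

print(statistics.mean(means))   # ~1.0, the population mean
print(statistics.stdev(means))  # ~1/sqrt(30) = 0.18, the standard error of the mean
# A histogram of `means` already looks close to a bell curve at n = 30.
```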
2.3 The Sampling Distribution of the Variance
Before attempting to discuss the sampling distribution of the variance, it is worthwhile to first
introduce the concept of sample variance and then present the chi-square distribution which helps
us in working out probabilities for the sample variance, when the population is distributed normally.
The Sample Variance
In probability theory and statistics, the variance is a measure of how far a set of numbers is spread
out. A variance of zero indicates that all the values are identical. A non-zero variance is always
positive: a small variance indicates that the data points tend to be very close to the mean (expected
value) and hence to each other, while a high variance indicates that the data points are very spread
out from the mean and from each other.
We use the sample mean to estimate the population mean when that parameter is unknown. Similarly, we use a sample statistic called the sample variance to estimate the population variance. The sample variance is usually denoted by $s^2$, and it again captures a kind of average of the squared deviations of the sample values from the sample mean. In equation form,

$$s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$$

By comparing this expression with the corresponding expression for the population variance, we notice two differences: the deviations are measured from the sample mean and not from the population mean, and the sum of squared deviations is divided by (n-1) and not by n. Consequently, we can calculate the sample variance based only on the sample values, without knowing the value of any population parameter. The division by (n-1) is for a technical reason: it makes the expected value of $s^2$ equal to $\sigma^2$, which it is supposed to estimate.
2.4 The Chi-square Distribution
The $\chi^2$ distribution is an asymmetric distribution that has a minimum value of 0 but no maximum value. The curve reaches a peak to the right of 0 and then gradually declines in height the larger the $\chi^2$ value is. The curve approaches, but never quite touches, the horizontal axis. For each number of degrees of freedom there is a different $\chi^2$ distribution. The mean of the chi-square distribution equals its degrees of freedom, and its variance is twice the degrees of freedom. This implies that the $\chi^2$ distribution is more spread out, with a peak farther to the right, for larger than for smaller degrees of freedom. As a result, for any given level of significance, the critical region begins at a larger chi-square value the larger the degrees of freedom.
In its graphical representation, the $\chi^2$ value is on the horizontal axis, with the probability density for each $\chi^2$ value represented on the vertical axis. The three curves in the diagram represent the pattern of the chi-square distribution for 1, 5 and 10 degrees of freedom respectively (Figure 7: Chi-square distribution with different degrees of freedom).
If the random variable x has the standard normal distribution, what would be the distribution of $x^2$? Intuitively speaking, it would be quite different from a normal distribution, because $x^2$, being a squared term, can assume only non-negative values. The probability density of $x^2$ will be highest near 0, because most of the values are close to 0 in a standard normal distribution. This distribution is called the chi-square distribution with 1 degree of freedom.
The chi-square distribution has only one parameter, viz. the degrees of freedom, and so there are many chi-square distributions, each with its own degrees of freedom. In statistical tables, chi-square values for different areas under the right tail and the left tail of various chi-square distributions are tabulated.
If $x_1, x_2, \dots, x_n$ are independent random variables, each having a standard normal distribution, then $x_1^2 + x_2^2 + \dots + x_n^2$ will have a chi-square distribution with n degrees of freedom.
If $y_1$ and $y_2$ are independent random variables having chi-square distributions with $\gamma_1$ and $\gamma_2$ degrees of freedom, then $(y_1 + y_2)$ will have a chi-square distribution with $\gamma_1 + \gamma_2$ degrees of freedom. Further, if $y_1$ and $y_2$ are independent random variables such that $y_1$ has a chi-square distribution with $\gamma_1$ degrees of freedom and $(y_1 + y_2)$ has a chi-square distribution with $\gamma > \gamma_1$ degrees of freedom, then $y_2$ will have a chi-square distribution with $(\gamma - \gamma_1)$ degrees of freedom.
Now, if $x_1, x_2, \dots, x_n$ are n random variables from a normal population with mean $\mu$ and variance $\sigma^2$, i.e. $x_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \dots, n$, it implies that

$$\frac{x_i - \mu}{\sigma} \sim N(0, 1)$$

and so $\left(\frac{x_i - \mu}{\sigma}\right)^2$ will have a chi-square distribution with 1 degree of freedom. Hence, $\sum_{i=1}^{n}\left(\frac{x_i - \mu}{\sigma}\right)^2$ will have a chi-square distribution with n degrees of freedom.
We can break up this expression by measuring the deviations from $\bar{x}$ in place of $\mu$. We then have

$$\sum_{i=1}^{n}\left(\frac{x_i - \mu}{\sigma}\right)^2 = \frac{1}{\sigma^2}\sum_{i=1}^{n}\left[(x_i - \bar{x}) + (\bar{x} - \mu)\right]^2$$

$$= \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \bar{x})^2 + \frac{1}{\sigma^2}\sum_{i=1}^{n}(\bar{x} - \mu)^2 + \frac{2(\bar{x} - \mu)}{\sigma^2}\sum_{i=1}^{n}(x_i - \bar{x})$$

$$= \frac{(n-1)s^2}{\sigma^2} + \left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right)^2 \quad \text{since } \sum_{i=1}^{n}(x_i - \bar{x}) = 0$$
Now, it is known that the LHS of the above equation is a random variable which has a chi-square distribution with n degrees of freedom. It is also known that

$$\bar{x} \sim N\left(\mu, \frac{\sigma^2}{n}\right), \quad \text{so} \quad \left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right)^2$$

has a chi-square distribution with 1 degree of freedom. Hence, if the two terms on the right-hand side of the above equation are independent (which will be assumed as true here), it follows that $\frac{(n-1)s^2}{\sigma^2}$ has a chi-square distribution with (n-1) degrees of freedom. One degree of freedom is lost because the deviations are measured from $\bar{x}$ and not from $\mu$.
Expected Value and Variance of $s^2$
The mean of a chi-square distribution is equal to its degrees of freedom, and its variance is equal to twice the degrees of freedom. This can be used to find the expected value and the variance of $s^2$. Since $\frac{(n-1)s^2}{\sigma^2}$ has a chi-square distribution with (n-1) degrees of freedom,

$$E\left[\frac{(n-1)s^2}{\sigma^2}\right] = n-1 \quad \text{or} \quad \frac{(n-1)}{\sigma^2} \cdot E(s^2) = n-1 \quad \therefore \quad E(s^2) = \sigma^2$$

Also,

$$Var\left[\frac{(n-1)s^2}{\sigma^2}\right] = 2(n-1)$$

Using the definition of variance, we get

$$E\left[\frac{(n-1)s^2}{\sigma^2} - E\left(\frac{(n-1)s^2}{\sigma^2}\right)\right]^2 = 2(n-1) \quad \text{or} \quad E\left[\frac{(n-1)s^2}{\sigma^2} - (n-1)\right]^2 = 2(n-1)$$

$$\text{or} \quad \frac{(n-1)^2}{\sigma^4}\, E(s^2 - \sigma^2)^2 = 2(n-1) \quad \therefore \quad E(s^2 - \sigma^2)^2 = \frac{2\sigma^4}{n-1}$$

i.e. $Var(s^2) = \frac{2\sigma^4}{n-1}$, since the expected value of $s^2$ is equal to $\sigma^2$.
It can, therefore, be concluded that if we take a large number of samples, each with a sample size of n, from a normal population with mean $\mu$ and variance $\sigma^2$, each sample will perhaps have a different value for its sample variance $s^2$, but the average of a large number of values of $s^2$ will be close to $\sigma^2$. Also, the variance of $s^2$ falls as the sample size increases. It is important to note here that all the above conclusions are based on the assumption that the population is distributed normally. If the population does not have a normal distribution, then nothing can be said about the distribution of $s^2$.
2.5 Sampling Distribution of the Proportion
Let us assume that 0.80 of all students in a school can pass a test of physical fitness. A random sample of 20 students is chosen: 13 passed and 7 failed. The parameter $\pi$ is used to designate the proportion of subjects in the population that pass (0.80 in this case), and the statistic p is used to designate the proportion who pass in a sample (13/20 = 0.65 in this case). The sample size (N) in this example is 20. If repeated samples of size N were taken from the population and the proportion passing (p) were determined for each sample, a distribution of values of p would be formed. If the sampling went on forever, the distribution would be the sampling distribution of a proportion. The sampling distribution of a proportion is given by the binomial distribution, whose mean and standard deviation are:

$$\mu = \pi \quad \text{and} \quad \sigma_p = \sqrt{\frac{\pi(1-\pi)}{N}}$$

For the present example, N = 20 and $\pi$ = 0.80, so the mean of the sampling distribution of p ($\mu$) is 0.8 and the standard error of p ($\sigma_p$) is 0.089. The shape of the binomial distribution depends on both N and
π. With large values of N and values of π in the neighborhood of 0.5, the sampling distribution is very
close to a normal distribution.
Assume that for the population of people applying for a job at a bank in a major city, 0.40 are able
to pass a basic literacy test required to get the job. Out of a group of 20 applicants, what is the
probability that 50% or more of them will pass? This problem involves the sampling distribution of p
with π = 0.40 and N = 20. The mean of the sampling distribution is π = 0.40. The standard deviation
is:

$$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{N}} = \sqrt{\frac{0.40(1-0.40)}{20}} = 0.11$$
Using the normal approximation, a proportion of 0.50 is: (0.50-0.40)/0.11 = 0.909 standard
deviations above the mean. From a z table it can be calculated that 0.818 of the area is below a z of
0.909. Therefore the probability that 50% or more will pass the literacy test is only about 1 - 0.818 =
0.182.
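The normal figure can be compared with the exact binomial probability using the Python standard library (a continuity correction would bring the approximation closer to the exact value):

```python
from math import comb
from statistics import NormalDist

N, pi = 20, 0.40

# Exact: P(10 or more of the 20 applicants pass) under Binomial(20, 0.40).
exact = sum(comb(N, k) * pi**k * (1 - pi)**(N - k) for k in range(10, N + 1))

# Normal approximation used above: z = (0.50 - 0.40) / sigma_p.
sigma_p = (pi * (1 - pi) / N) ** 0.5                  # ~0.11
approx = 1 - NormalDist().cdf((0.50 - pi) / sigma_p)

print(round(exact, 3), round(approx, 3))  # ~0.245 exact vs ~0.181 approximate: N = 20 is small
```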
2.6 The Confidence Level
The sample mean is the researcher's estimate of the population mean. If we are asked to give an interval
as our estimate, then we would add a range on the upper and the lower side of the sample mean
and give that interval as our estimate. The larger the interval, the greater is our confidence that the
interval does contain the true population mean. It is to be noted that the true population mean is a
constant and is not a variable. On the other hand, the interval that we specify is a random interval
whose position depends on the sample mean. For example if the sample mean is 50 and the standard
error of the mean is 5, we may specify our interval estimate as (45,55) i.e. from 45 to 55 which spans
one standard error of the mean on either side of the sample mean. On the other hand, if the interval
estimate is specified as (40,60) i.e. spanning two standard errors of the mean on either side of the
sample mean, we are more confident that the latter interval contains the true population mean as
compared to the former. However, if the confidence level is raised too high, the corresponding
interval may become too wide to be of any practical use.
The confidence level, therefore, may be defined as the probability that the interval estimate will
contain the true value of the population parameter that is being estimated. If we say that a 95%
confidence interval for the population mean is obtained by spanning 1.96 times the standard error
of the mean on either side of the sample mean, we mean that if we take a large number of samples of size n, say 1000, and obtain the interval estimates from each of these 1000 samples, then about 95% of these interval estimates would contain the true population mean.
Confidence Interval for the Population Mean
Let us now discuss how to obtain a confidence interval for the population mean. We shall assume
that the population distribution is normal and that the population variance is known. Later, we shall
relax the second condition.
Suppose it is known that the weight of cement in packed bags is distributed normally with a standard
deviation of 0.2 Kg. A sample of 25 bags is picked up at random and the mean weight of cement in
these 25 bags is only 49.7 Kg. We want to find a 90% confidence interval for the mean weight of
cement in filled bags.
Let x be a random variable representing the weight of cement in a bag picked up at random. We
know that x is distributed normally with a standard deviation of 0.2 Kg.
The standard error of the mean can be easily calculated as

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.2}{\sqrt{25}} = 0.04 \text{ Kg}$$
We can use the above approach when the population standard deviation is known, or when the sample size is large (n > 30), in which case the sample standard deviation can be used as an estimate of the population standard deviation. However, if the sample size is not large, as in the example above, then one has to use the t-distribution in place of the standard normal distribution to calculate the probabilities. Let us assume that we are interested in developing a 90% confidence interval in the same situation as described earlier, with the difference that the population standard deviation is now not known. However, the sample standard deviation has been calculated and is known to be 0.2 Kg.
Since the sample size is n = 25, we know that $\frac{\bar{x} - \mu}{s/\sqrt{n}}$ follows a t-distribution with 24 degrees of freedom. From the t-tables, we can see that the probability that a t-statistic with 24 degrees of freedom lies between -1.711 and +1.711 is 0.90, i.e. the probability that $\bar{x}$ lies between $\mu - 1.711\, s/\sqrt{n}$ and $\mu + 1.711\, s/\sqrt{n}$ is 0.90. In other words, if we use an interval spanning from $(\bar{x} - 1.711\, s/\sqrt{n})$ to $(\bar{x} + 1.711\, s/\sqrt{n})$, then 90% of the time this interval will contain $\mu$. Hence, for a 90% confidence interval,

$$\text{the lower limit} = \bar{x} - 1.711 \frac{s}{\sqrt{n}} = 49.7 - 1.711 \times \frac{0.2}{\sqrt{25}} = 49.6316$$

$$\text{and the upper limit} = \bar{x} + 1.711 \frac{s}{\sqrt{n}} = 49.7 + 1.711 \times \frac{0.2}{\sqrt{25}} = 49.7684$$
In this case, we can state with 90% confidence level that the mean weight of cement in a filled bag
lies between 49.6316 Kg and 49.7684 Kg.
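The arithmetic of this interval can be reproduced in a few lines of Python (the critical value 1.711 is taken from the t-table, as in the text):

```python
from math import sqrt

xbar, s, n = 49.7, 0.2, 25
t_crit = 1.711                     # t value for 24 degrees of freedom at 90% confidence

half_width = t_crit * s / sqrt(n)  # 1.711 * 0.2 / 5 = 0.06844
print(xbar - half_width, xbar + half_width)  # 49.63156 and 49.76844 Kg
```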
Using these derivations and relations, we can also calculate the sample size that would be ideal for a particular study at a desired confidence level.
***
Bibliography
1. http://www.nku.edu/~statistics/212_Sampling_Distribution_of_P-hat.htm
2. http://en.wikipedia.org/wiki/Sampling_distribution
3. http://en.wikipedia.org/wiki/Sampling_(statistics)
4. http://onlinestatbook.com
5. Course material on ‘Quantitative analysis for Managerial Applications’, MS-8, 1997, IGNOU,
Maidan Garhi, New Delhi.
6. Course material on ‘Research Methodology for Management Decisions’, MS-95, 1997, IGNOU,
Maidan Garhi, New Delhi.
7. http://stattrek.com/sampling/sampling-distribution.aspx
8. http://www.psychstat.missouristate.edu/introbook/sbk19.htm
9. http://www.stat.berkeley.edu/~stark/SticiGui/Text/index.htm
10. http://www.fao.org/docrep/w7295e/w7295e08.htm#6
***