Define and identify the different
levels of measurement and
Define, appraise, use and
interpret the different tools used
for data analysis
The entire aggregation of items from which
samples can be drawn is known as a population.
In sampling, the population may refer to the
units, from which the sample is drawn.
A population of interest may be the universe of
nations or cities.
This is one of the first things the analyst needs
to define properly while conducting a business
“N” represents the size of the population.
Measures such as the mean, standard deviation, median
and mode, when computed for the whole population, are
population parameters.
Bias is a term which refers to how far the average
statistic lies from the parameter it is estimating, that
is, the error which arises when estimating a quantity.
A statistic is biased if it is calculated in such a way that
it is systematically different from the population
parameter of interest.
Unbiasedness. This means that the average of a large
set of unbiased measurements will be close to the
true value.
Precision. This means that repeated measurements would be close
to one another but not necessarily to the true value.
Randomization. The allocation of patients to treatment and control
groups in a random manner. This minimizes
selection bias.
Block randomization. Participants are allocated to the two groups in
blocks of 2, 4, 6, 8 and so on, so that both groups contain an equal
number of participants at the end of each block.
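The block allocation described above can be sketched in a few lines of Python. This is only an illustration, not a trial-grade procedure: the function name `block_randomize`, the group labels and the fixed seed are assumptions made for the example.

```python
import random

def block_randomize(n_participants, block_size=4, seed=0):
    """Assign participants to two groups in balanced, shuffled blocks."""
    if block_size % 2 != 0:
        raise ValueError("block_size must be even to split into two equal groups")
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    allocation = []
    while len(allocation) < n_participants:
        # Each block holds the same number of treatment and control slots.
        block = ["treatment"] * (block_size // 2) + ["control"] * (block_size // 2)
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_participants]

allocation = block_randomize(12, block_size=4)
print(allocation.count("treatment"), allocation.count("control"))  # 6 6
```

Because every complete block is balanced, the two groups are equal in size whenever recruitment stops at the end of a block.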
What is sampling?
In simple words, sampling consists of obtaining
information from a portion of a larger group or
universe. When elements are selected according
to scientific principles and procedures, they
yield almost all the information about the
whole universe.
A complete study of all the elements present in the
population is known as a census. The national
population census is an example of a census survey.
A sample is a selection of units from the entire group,
called the population or universe of interest. It is
a subset of a larger population.
The probability of any particular
member being chosen for the
sample is unknown.
The sampling procedure of obtaining the
people or units that are most
Accidental sampling is a type of non-probability sampling in which the
sample is drawn from that part of the
population which is close to hand.
In quota sampling, the population is first
segmented into mutually exclusive sub-groups.
In quota sampling the selection of the
sample is non-random.
In quota sampling, the interviewers
are instructed to interview a specified number
of persons from each category. In studying
people's status, living
conditions, preferences, opinions, attitudes
Samples in which the selection criteria are
based on personal judgment that the element
is representative of the population under study.
Example: In test marketing, a judgement is made as
to which cities would constitute the best
ones for testing the marketability of a new product.
Samples in which selection of additional
respondents is based on referrals from the initial respondents.
Initial respondents are selected by
Additional respondents are obtained from
information provided by the initial respondents.
Every member of the population has a
known, non-zero probability of being selected.
Simple random sampling
Random sampling means arranging
conditions in such a manner that every item of
the whole universe from which we are to select
the sample has the same chance of
being selected as any other item.
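As a minimal sketch of this idea, Python's standard `random` module can draw a sample in which every item has the same chance of selection. The population of 100 numbered items and the sample size of 10 are hypothetical values chosen for the example.

```python
import random

# Hypothetical universe: 100 numbered items (an assumption for illustration).
population = list(range(1, 101))

rng = random.Random(42)  # fixed seed only so the example is repeatable
# sample() draws without replacement; each item is equally likely to be chosen.
sample = rng.sample(population, k=10)
print(sorted(sample))
```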
Among all the probability sampling procedures
random sampling is the most basic and least
Prepare a list of all the elements in the universe
and number them. The list can be in
alphabetical order, as in records.
Then select every third, every 8th, or every k-th
element from the list in the same manner.
This method requires the population to be
homogeneous. It is frequently
used because it is simple, direct and inexpensive.
Also known as patterned, serial or chain sampling.
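The list-and-interval procedure above can be sketched as follows. The random starting point within the first interval is an assumption added here (the slides only say to take every k-th element), and the unit names are hypothetical.

```python
import random

def systematic_sample(items, k, seed=0):
    """Take every k-th item from an ordered list, from a random start."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    start = rng.randrange(k)   # assumed: random start within the first k items
    return items[start::k]

# Hypothetical numbered list of 24 units.
units = [f"unit_{i:02d}" for i in range(1, 25)]
picked = systematic_sample(units, k=8)
print(picked)
```

With 24 units and an interval of 8, the sample always contains 3 units, whichever starting point is drawn.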
When the population is divided
into different strata or groups
and samples are then selected
from each stratum by a simple
random procedure, we call it
stratified random sampling.
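A rough sketch of stratified random sampling: group the units by stratum, then draw a simple random sample inside each group. The helper name `stratified_sample`, the urban/rural strata and the equal allocation of 5 units per stratum are assumptions for illustration; proportional allocation is another common choice.

```python
import random

def stratified_sample(units, stratum_of, n_per_stratum, seed=0):
    """Draw a simple random sample of fixed size from every stratum."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }

# Hypothetical population: 30 urban and 20 rural units.
households = [("urban", i) for i in range(30)] + [("rural", i) for i in range(20)]
by_stratum = stratified_sample(households, stratum_of=lambda u: u[0], n_per_stratum=5)
print({name: len(chosen) for name, chosen in by_stratum.items()})  # {'urban': 5, 'rural': 5}
```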
The whole population is divided into small
clusters, for example according to location. Then
whole clusters are selected for the sample.
The purpose of cluster sampling is to sample economically
while retaining the characteristics of a probability sample.
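Cluster sampling can be sketched as randomly choosing whole clusters and keeping every unit inside them. This shows the one-stage variant, which is an assumption; the village names and household labels are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly select whole clusters; all units in a chosen cluster are kept."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters defined by location, 4 households each.
villages = {
    "village_a": ["a1", "a2", "a3", "a4"],
    "village_b": ["b1", "b2", "b3", "b4"],
    "village_c": ["c1", "c2", "c3", "c4"],
    "village_d": ["d1", "d2", "d3", "d4"],
}
selected = cluster_sample(villages, n_clusters=2)
print(sorted(selected))
```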
Defining the target
Specifying the sampling
Specifying the sampling
Selection of the
Specifying the sampling
Selecting the sample.
Advantages of sampling
Helps to collect vital information more quickly and it
helps to make estimates of the characteristics of the total
population in a shorter time
Sampling cuts costs. Much time and money is saved at
each stage of research.
Sampling techniques often increase the accuracy of the
data. With small samples it becomes easier to check the
accuracy of the data.
From the administrative point of view, sampling
is also easier: hiring staff, training and
supervising all become simpler.
Disadvantages of sampling
Sampling is not flexible in a situation where
knowledge about each unit is needed. E.g. estimation
of national income for the current year.
Reliability of information depends upon how well the
sample represents the total population.
Most sampling techniques require the services of
sampling experts or statisticians.
Hospital patients may be different than those in the
Volunteers are not typical of non-volunteers.
Standard Error of Mean
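SD (mmHg): Printers 4.5 (n = 72), Farmers 4.2 (n = 48)
$\mathrm{SEM} = \mathrm{SD} / \sqrt{n}$
Printers: $\mathrm{SEM} = 4.5 / \sqrt{72} = 0.53$ mmHg
Farmers: $\mathrm{SEM} = 4.2 / \sqrt{48} = 0.61$ mmHg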
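The two results above can be checked with a short calculation, taking the standard error of the mean as the SD divided by the square root of the sample size. The function name is an assumption for this sketch; the SD and n values are the ones given for printers and farmers.

```python
import math

def standard_error_of_mean(sd, n):
    """SEM = SD / sqrt(n)."""
    return sd / math.sqrt(n)

sem_printers = standard_error_of_mean(4.5, 72)
sem_farmers = standard_error_of_mean(4.2, 48)
print(round(sem_printers, 2), round(sem_farmers, 2))  # 0.53 0.61
```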
Standard Error of a Proportion or Percentage
Total No. of patients diagnosed with Appendicitis = 120
No. of Males = 73 ( 60.8%)
No. of Females = 47 ( 39.2%)
If P represents one percentage, then 100 – P is the percentage
for the other, so
$\mathrm{SE}_{\text{percentage}} = \sqrt{P\,(100 - P) / n}$
$\mathrm{SE}_{\text{percentage}} = \sqrt{60.8 \times 39.2 / 120} = 4.46\%$
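The appendicitis figures can be verified in the same way, using the square root of P(100 – P)/n. The function name is an assumption for this sketch; P = 60.8 and n = 120 come from the example above.

```python
import math

def standard_error_of_percentage(p, n):
    """SE of a percentage p (on a 0-100 scale) from a sample of size n."""
    return math.sqrt(p * (100 - p) / n)

se_males = standard_error_of_percentage(60.8, 120)
print(round(se_males, 2))  # 4.46
```

The same value results if the female percentage (39.2) is used, since the formula is symmetric in P and 100 – P.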
Difference between Standard
Deviation & Standard Error
The SD computed from a sample is an estimate of a population
parameter: the variability of the observations. Each
population has a unique SD, and as the sample size
increases, the sample SD gives a more precise
estimate of the population SD.
SE, on the other hand, is a measure of the precision of an
estimate of a population parameter. SE is always
attached to a parameter estimate and can be calculated for
any of them, such as the mean, the median, the fifth centile and
even the SD itself.
As sample size increases, the SE of the estimate will
decrease as the precision of the estimate will increase
with increasing sample size.
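A small numerical illustration of this point, assuming the SD stays fixed at 4.5 mmHg (the printers' value) while only the sample size changes. The sample sizes 18, 72 and 288 are hypothetical, chosen so that each is four times the previous one.

```python
import math

sd = 4.5                    # SD held constant (an assumption for the illustration)
sample_sizes = [18, 72, 288]
# Standard error of the mean for each sample size.
sems = [sd / math.sqrt(n) for n in sample_sizes]
for n, sem in zip(sample_sizes, sems):
    print(n, round(sem, 2))
```

Each fourfold increase in sample size halves the standard error, because the SE falls with the square root of n.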
When to use SD & SE!!!!
If the purpose is to describe
the data, and the data are normally
distributed, use the standard deviation,
denoted by SD.
If the purpose is to describe the outcome
of a study, e.g. to estimate the prevalence
of a disease or the difference between
two treatment groups, use the
standard error, denoted by SE.