• Methods to collect data – questionnaire,
observations, recording, etc.
• Population and sampling
Data are values of quantitative or qualitative
variables, belonging to a set of items.
E.g., data on the number of seats for some
of the science courses in a college.
Primary data is a type of information that is
obtained directly from first-hand sources by
means of surveys, observation or
experimentation. It is the data that has not been
previously published and is derived from a new
or original research study and collected at the
source such as in marketing.
METHODS OF COLLECTING PRIMARY DATA
Secondary data is any information collected by
someone other than its user. It is data that
has already been collected and is readily
available for use. Secondary data saves on time
as compared to primary data which has to be
collected and analysed before use.
METHODS OF COLLECTING SECONDARY DATA
• International Publications – UNO, WTO, etc.
• Government Publications – Central and State Govt.
• Publications – Municipal Corporations, boards, etc.
• Research Works
• Records of Private Firms
A series of questions asked to
individuals to obtain statistically
useful information about a
certain topic.
• They can easily be fed back to employees.
• Questionnaires can be standard or customized.
• Large amounts of information can be collected.
• Can be analyzed easily.
• Simple and quick to fill up by the respondent.
• Can be used for sensitive topics.
• Respondents have time to think about their answers.
• Format is familiar to most respondents.
• Respondents don't always answer honestly.
• No additional questions can be asked.
• Appear impersonal.
• Respondents may misunderstand.
• Unsuitable for some kinds of respondents.
• Respondents may ignore questions.
• Different respondents may interpret differently.
• Danger of questionnaire fatigue.
KEY POINTS FOR MAKING AN
Keep it as short as possible.
Ask short, simple, clearly worded questions.
Start with demographic questions.
Use open ended questions cautiously.
Place the questions in logical order.
Try to minimize effort to be put in by the respondent.
• Put in more open ended questions.
• This method of collecting data involves
presentation of oral-verbal stimuli and reply in
terms of oral-verbal responses.
• Interviews are probably the most widely used
technique for collecting data.
• They permit the interviewer to ask the respondent direct questions.
TYPES OF INTERVIEW
• Structured: Pre-established questions
• Unstructured: Draw out information without
the use of pre-established questions
• Semi-Structured: A mixture of both strategies
FORMAL AND INFORMAL
Formal: A formal interview is just that, formal. It
includes: the office setting; the formal handshake;
appropriate attire; order and structure; and best
Informal: An informal interview attempts to ignore the
rules and roles associated with interviewing in an
attempt to gain trust and create a more natural
environment for an open and honest communication.
• Keep language pitched to that of respondent
• Avoid long questions
• Create comfort
• Establish time frame for interview
• Avoid leading questions
• Sequence topics
• Be respectful
• Listen carefully
ADVANTAGES AND DISADVANTAGES OF INTERVIEWS
Advantages:
• Deep and free response
• Glimpse into respondent's tone
• Ability to probe, follow-up
Disadvantages:
• Costly in time and personnel
• May be difficult to summarize
• Possible biases (e.g. interviewer)
Suppose IBN7 Managing Editor Ashutosh comes to KMC to interview our Principal, Dr. SP Gupta.
Now, by interviewing Dr. SP Gupta, he can get data about our college, such as:
• What is the total budget allotted by DU to KMC?
• What is the total strength of the teaching/non-teaching staff?
• How many rooms are available for the lectures?
And so on.
• Observation: A systematic method of data
collection that relies on a researcher's ability to
gather data through their senses.
• Observe: To notice using a full range of
appropriate senses. To see, hear, feel, taste, and
smell.
• They are free from the biases inherent in self-report data.
• They put the practitioner directly in touch with the
behaviors in question.
• They involve real-time data, describing behavior
occurring in the present rather than the past.
• Difficulties interpreting the meaning underlying the observations.
• Observers must decide which people to observe; choose
time periods, territory and events
• Failure to attend to these sampling issues can result in a
biased sample of data.
See what is happening
• traffic patterns
• land use patterns
• layout of city and rural areas
• quality of housing
• condition of roads
• conditions of buildings
• who goes to a health clinic
During our Manali tour, I observed some things that can provide a kind of data
regarding the tour. The observations are as follows:
• There were 3 teachers, 9 first-year, 2 second-year and 31 third-year students.
• Only 1 student among 45 asked for food without onion or garlic.
• Most rooms were booked on a triple-sharing basis and 3-4 rooms on a quad-sharing basis.
• Only 1 student among 45 was a minor.
THINGS TO CONSIDER
• All data collection methods are capable of gathering quantitative and
qualitative data, although some may be better suited towards one task or the
other.
• There is no single data collection method that can guarantee credible data.
• All data collection methods can be consciously manipulated.
• All data collection methods can be 'contaminated' by unrecognized bias.
• All data collection methods require conscious deliberation on the part of the
researcher to ensure credibility
WHAT IS A POPULATION?
• A population is any complete group with at least
one characteristic in common.
• Populations are not just people.
• Populations may consist of people, animals,
businesses, buildings, motor vehicles, farms,
objects or events.
WHY DO YOU NEED TO KNOW WHO OR
WHAT ARE IN A POPULATION?
• When looking at data, it is important to clearly identify the
population being studied or referred to, so that you can
understand who or what are included in the data.
• For example, if you were looking at some Indian farming data,
you would need to understand whether the population the data
refers to is all farms in India, just farms that grow crops, those
that only have livestock, or some other type of farm.
Sampling is the process of selecting a number of
individuals for a study in such a way that the
individuals represent the larger group from
which they were selected
The representatives selected for a study whose
characteristics match the larger group from
which they were selected.
• A sampling frame is the source material from which
a sample is drawn.
• It is a list of all those within a population who can
be sampled, and may include individuals, households
or institutions.
BASIC METHODS OF SAMPLING
Simple Random Sampling
Systematic Random Sampling
Stratified Random Sampling
SIMPLE RANDOM SAMPLING
• It is the basic random sampling technique
where a group of subjects (a sample) is
selected for study from a larger group (a
population).
• Every experimental unit is chosen entirely
by chance and each member of the
population has an equal chance of being
included in the sample.
Examples: Lottery, generation of random
numbers.
Write down the name of each member of
the population on pieces of paper.
Place these papers in a box or a container
The box or lottery drum must be shaken
thoroughly to prevent some pieces of
paper from sinking at the bottom.
Pick the required number of sample
units from the lottery drum.
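The lottery procedure above can be mimicked in a few lines of Python. This is a minimal sketch and not part of the original slides: the function name `lottery_sample`, the 45 placeholder names and the seed are assumptions made for illustration. `random.sample` draws without replacement, so every member has the same chance of being picked.

```python
import random


def lottery_sample(population, sample_size, seed=None):
    """Draw `sample_size` members at random, without replacement."""
    members = list(population)            # one "slip of paper" per member
    if sample_size > len(members):
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)             # seeded only to make the demo repeatable
    return rng.sample(members, sample_size)


# Hypothetical population of 45 names (placeholders, not real data).
population = [f"Member {i}" for i in range(1, 46)]
chosen = lottery_sample(population, 5, seed=1)
print(chosen)
```

Leaving `seed` as `None` gives a different draw on every run, which is what a real lottery would do.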
• Select a random starting point using chits, etc. and
then select every kth subject in the population.
• Simple to use so it is used often
Sampling interval: k = N/n, where n is the sample
size, and N is the population size.
Example: choosing a sample of size 84 from 500.
N = 500 and n = 84
k = 500/84 ≈ 5.95, rounded to k = 6 (every 6th subject is selected).
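The systematic procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the slides' own method: the function name `systematic_sample` is hypothetical, and rounding the interval k to the nearest whole number is one possible convention for handling a fractional k such as 5.95.

```python
import random


def systematic_sample(population, sample_size, seed=None):
    """Return (k, sample): every k-th item after a random start."""
    items = list(population)
    k = max(1, round(len(items) / sample_size))   # sampling interval k = N/n, rounded
    start = random.Random(seed).randrange(k)      # random start within the first k items
    return k, items[start::k]


# N = 500 and n = 84, as in the worked example.
k, sample = systematic_sample(range(1, 501), 84, seed=0)
print(k, len(sample), sample[:3])
```

Because k is rounded, the resulting sample has 83 or 84 subjects here, depending on the random starting point.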
• The population is divided into two or more groups called
strata, according to some criterion, such as geographic
location, grade level, age, or income, and subsamples are
randomly selected from each stratum.
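A minimal Python sketch of stratified random sampling is given below. The function name `stratified_sample`, the choice of sampling the same fraction from every stratum (proportional allocation) and the made-up student list are all assumptions for illustration; the slides do not specify how large each subsample should be.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Randomly sample the same fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)     # group units by stratum
    return {
        name: rng.sample(members, max(1, round(len(members) * fraction)))
        for name, members in strata.items()
    }


# Hypothetical students tagged with grade level: 20, 10 and 30 per level.
students = [(f"S{level}-{i}", level)
            for level, size in [(1, 20), (2, 10), (3, 30)]
            for i in range(size)]
subsamples = stratified_sample(students, stratum_of=lambda s: s[1],
                               fraction=0.2, seed=0)
print({level: len(chosen) for level, chosen in subsamples.items()})
```

Each grade level contributes to the sample in proportion to its size, so no stratum is left out by chance.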
Divide the population into groups (called clusters), randomly
select some of the groups, and then collect data from ALL
members of the selected groups
Used extensively by government and private research organizations.
Examples: Exit Polls
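Cluster sampling can be sketched in Python like this. The function name `cluster_sample` and the polling-booth data are hypothetical and serve only to illustrate the idea: whole groups are drawn at random, and then every member of a drawn group is included.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep ALL of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # randomly selected cluster names
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members


# Hypothetical polling booths, each with its list of voters.
booths = {
    "Booth A": ["a1", "a2", "a3"],
    "Booth B": ["b1", "b2"],
    "Booth C": ["c1", "c2", "c3", "c4"],
    "Booth D": ["d1"],
}
chosen, voters = cluster_sample(booths, 2, seed=0)
print(chosen, voters)
```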
It attempts to obtain a sample of
convenient elements. Often,
respondents are selected because
they happen to be in the right
place at the right time.
Using family members or
students in a classroom
It is a form of convenience sampling
in which the population elements
are selected based on the
judgment of the researcher.
Test markets, engineers selected
in industrial marketing research,
expert witnesses used in court, etc.
Quota sampling may be viewed as two-stage restricted judgmental sampling.
• The first stage consists of developing control
categories, or quotas, of population elements.
• In the second stage, sample elements are selected
based on convenience or judgment.
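The two stages of quota sampling can be illustrated with a short Python sketch. Everything named here is hypothetical (the function `quota_sample`, the category labels and the list of passers-by), and "convenience" is modelled simply as taking people in the order they are met.

```python
def quota_sample(candidates, category_of, quotas):
    """Fill each quota with the first candidates met (convenience)."""
    remaining = dict(quotas)                     # stage 1: quotas per control category
    sample = []
    for person in candidates:                    # stage 2: take people as they come
        category = category_of(person)
        if remaining.get(category, 0) > 0:
            sample.append(person)
            remaining[category] -= 1
    return sample


# Hypothetical passers-by, in the order the interviewer meets them.
passers_by = [("P1", "F"), ("P2", "M"), ("P3", "M"), ("P4", "M"),
              ("P5", "F"), ("P6", "F"), ("P7", "M")]
sample = quota_sample(passers_by, category_of=lambda p: p[1],
                      quotas={"F": 2, "M": 2})
print(sample)
```

Once a category's quota is full, further people in that category are skipped, which is why the result is not a random sample.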
• In snowball sampling, an
initial group of respondents
is selected, usually at
random.
• After being interviewed,
these respondents are asked
to identify others who
belong to the target
population of interest.
• Subsequent respondents are
selected based on the
referrals.
1. DEFINE POPULATION TO BE STUDIED
Identify the group of interest and its characteristics to
which the findings of the study will be generalized.
Note: Mostly the “accessible” or “available”
population must be used.
2. DETERMINE THE SAMPLE SIZE
• The size of the sample influences both the
representativeness of the sample and the statistical
analysis of the data
1. Larger samples are more likely to detect a difference
between different groups.
2. Smaller samples are more likely not to be
representative.
RULES OF THUMB FOR
DETERMINING THE SAMPLE SIZE
• The larger the population size, the smaller the percentage of the
population required to get a representative sample
• For smaller populations (N < 100), there is little point in sampling.
Survey the entire population.
• If the population size is around 500, 50% should be sampled.
• If the population size is around 1500, 20% should be sampled.
• Beyond a certain point (N = 5000), the population size is almost
irrelevant and a sample size of 400 may be adequate.
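The arithmetic behind these rules of thumb can be tabulated with a small Python sketch. It only evaluates the anchor points listed above; the population of 80 stands in for "fewer than 100" and is an assumption, as are the names `PERCENT_RULES` and `rule_of_thumb_rows`. No rule is implied for population sizes between the anchors.

```python
# (population size N, share of the population to sample, in percent)
PERCENT_RULES = [(80, 100), (500, 50), (1500, 20)]


def rule_of_thumb_rows():
    """Return (N, sample size, percent of N) for the listed anchor points."""
    rows = [(n_pop, n_pop * pct // 100, float(pct)) for n_pop, pct in PERCENT_RULES]
    rows.append((5000, 400, 400 / 5000 * 100))   # fixed sample of 400 at N = 5000
    return rows


for n_pop, n_sample, pct in rule_of_thumb_rows():
    print(f"N = {n_pop:>5}: sample {n_sample:>3} ({pct:.0f}% of the population)")
```

The suggested sample grows only from 250 to 400 while the population grows from 500 to 5000, which is the first rule in the list expressed in numbers.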
3. CONTROL FOR SAMPLING
BIAS AND ERROR
• Be aware of the sources of sampling bias and identify how
to avoid it.
• Decide whether the bias is so severe that the results of the
study will be seriously affected.
• In the final report, document awareness of bias, rationale for
proceeding, and potential effects.
4. SELECT THE SAMPLE...
• A process by which the researcher attempts to
ensure that the sample is representative of the
population from which it is to be selected.
Note: Requires identifying the sampling method that
will be used.
• In general, there are two types of errors:
1) non-sampling errors
2) sampling errors
• Sampling errors arise because data has been collected from a
part, rather than the whole, of the population.
• Because of the above, sampling errors are restricted to sample
surveys only, unlike non-sampling errors, which can occur in both
sample surveys and census data.
• There are no sampling errors in a census because the calculations
are based on the entire population.
• They are measurable from the sample data in the case of probability sampling.
FACTORS AFFECTING SAMPLING ERROR
Sampling error is affected by a number of factors including:
The Sample Size
• In general, larger sample sizes decrease the sampling error;
however, this decrease is not directly proportional.
• As a rough rule of thumb, you need to increase the sample
size fourfold to halve the sampling error.
The Sampling Fraction
• This is of lesser influence but as the sample size increases as a
fraction of the population, the sampling error should decrease.
The Variability Within The Population.
More variable populations give rise to larger errors as the
samples or the estimates calculated from different samples are
more likely to have greater variation.
The effect of variability within the population can be reduced
by the use of stratification.
An efficient sampling design will help in reducing sampling error.
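How sample size and sampling fraction act on sampling error can be illustrated with the textbook standard error of a sample mean under simple random sampling, s/√n, optionally multiplied by a finite population correction. The formula, the function name `standard_error` and the standard deviation of 10 are assumptions brought in for illustration; they do not appear in the slides.

```python
import math


def standard_error(sd, n, population_size=None):
    """Standard error of a sample mean under simple random sampling."""
    se = sd / math.sqrt(n)
    if population_size is not None:
        # Finite population correction: shrinks the error as n/N grows.
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


sd = 10.0                                   # hypothetical population standard deviation
for n in (100, 400, 1600):
    print(n, standard_error(sd, n))         # error falls, but not in proportion to n
print(standard_error(sd, 400, population_size=500))
```

In this sketch a larger standard deviation (a more variable population) raises the error for any sample size, and a sample that covers the whole population has no sampling error at all.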
• Nonsampling errors are more serious and are due to mistakes made in
the acquisition of data or due to the sample observations being selected
improperly.
• These are errors that arise during the course of all data collection
activities.
• In summary, they have the following characteristics:
1) exist in both sample surveys and census data.
2) difficult to measure.
SOURCES OF NON-SAMPLING ERRORS
Non-sampling errors arise from:
• defects in the sampling frame.
• failure to identify the target population.
• non response.
• responses given by respondents.
REDUCING NON-SAMPLING ERRORS
Can be minimised by adopting any of the following approaches:
• using an up-to-date and accurate sampling frame.
• careful selection of the time the survey is conducted.
• careful questionnaire design.
• providing thorough training and periodic retraining of
interviewers and processing staff.
WHAT IS A SURVEY ?
• A “survey” is a systematic method of gathering information
from a sample.
• A survey usually originates when an individual or
institution is confronted with an information need and the
existing data are insufficient
MAIL / PHONE / INTERNET SURVEYS
• Literacy issues
• Consider accessibility
Reliability of postal service
• Consider bias
What population segment has telephone access? Internet access?
Best when you want to know what
people think, believe, or perceive;
only they can tell you that.
People may not accurately recall their
behavior or may be reluctant to reveal
their behavior if it is illegal. What
people think they do or say they do is
not always the same as what they
actually do.
• Keep them short (under 5 minutes)
• Avoid huge long checklists
• Allow for text comments
• Allow for categorical identifications -- school, job
function, grade, etc.
BASIC SURVEY TYPES
• Surveys can be administered in a number of ways:
• Face to face
• Survey questions can either be open or closed:
• Open questions: These questions ask respondents to
construct answers using their own words. Open questions
can generate rich and candid data, but the data can be
difficult to code and analyse.
• Closed questions: These questions force respondents to
choose from a range of predetermined responses, and are
generally easy to code and statistically analyse.
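Why closed questions are easy to code can be seen in a short Python sketch: because every answer is one of a fixed set of options, tallying them takes a single pass. The function name `tabulate_closed`, the options and the answers are hypothetical.

```python
from collections import Counter


def tabulate_closed(responses, options):
    """Count how often each predetermined option was chosen."""
    counts = Counter(responses)
    unknown = set(counts) - set(options)
    if unknown:
        raise ValueError(f"answers outside the predetermined options: {sorted(unknown)}")
    return {option: counts[option] for option in options}


# Hypothetical answers to one closed question.
options = ["Yes", "No", "Not sure"]
responses = ["Yes", "No", "Yes", "Yes", "Not sure", "No"]
print(tabulate_closed(responses, options))
```

Answers to open questions have no fixed option list, so they would first have to be read and grouped into categories by hand before any such count is possible.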