Hallmark Business School www.hbs.ac.in
UNIT III: Data Collection
Types of Data:
Definition of Data: Data are the facts presented to the researcher from the study’s environment. They are characterized by their abstractness, verifiability, elusiveness, and closeness to the phenomenon.
1) Qualitative data: a categorical measurement expressed not in terms of numbers, but by means of a natural-language description. In statistics, the term is often used interchangeably with "categorical" data. For example: favorite color, “Blue”; height, “Tall”. Although we may have categories, the categories may have a structure to them. When the categories have no natural ordering, we call them nominal categories. Examples might be gender, race, religion, or sport. When the categories can be ordered, they are called ordinal variables. Categorical variables that judge size (small, medium, large, etc.) are ordinal variables. Attitudes (strongly disagree, disagree, neutral, agree, strongly agree) are also ordinal variables, although we may not know which value is the best or worst on a given issue. Note that the distance between ordinal categories is not something we can measure.
2) Quantitative data: a numerical measurement expressed in terms of numbers rather than by means of a natural-language description. However, not all numbers represent measurable quantities. For example, an Aadhaar number is a number, but not something that one can add or subtract. Quantitative data are always associated with a measurement scale, such as the ratio scale.
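The nominal/ordinal distinction above can be illustrated with a small Python sketch. The category lists and the helper names (mode, ordinal_median) are made up for illustration. The point is only that a "most frequent" summary works for any categories, while a "middle" category makes sense only when the categories can be ordered.

```python
from collections import Counter

# Hypothetical survey answers (illustrative data, not from any real study).
SIZE_ORDER = ["small", "medium", "large"]             # ordinal: natural order
sports = ["cricket", "hockey", "cricket", "chess"]    # nominal: no natural order
sizes = ["large", "small", "medium", "small"]

def mode(values):
    """Most frequent category; meaningful for nominal and ordinal data."""
    return Counter(values).most_common(1)[0][0]

def ordinal_median(values, order):
    """Middle category by rank; meaningful only for ordered categories."""
    ranked = sorted(values, key=order.index)
    # Take the lower middle element so that ranks are never averaged:
    # distances between ordinal categories are not measurable.
    return ranked[(len(ranked) - 1) // 2]

print(mode(sports))                       # cricket
print(ordinal_median(sizes, SIZE_ORDER))  # small
```

There is no ordinal_median for the sports list: without a natural order, "middle sport" has no meaning.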
Primary vs Secondary data: Primary data are sought for their proximity to the
truth and control over error. These cautions remind us to use care in designing
data collection procedures and generalizing from results. Secondary data have
had at least one level of interpretation inserted between the event and its
recording. Information sources are generally classified into three levels: (1) primary sources, (2) secondary sources, and (3) tertiary sources.
Primary sources are original works of research or raw data without
interpretation or pronouncements that represent an official opinion or position.
Included among the primary sources are memos; letters; complete interviews or
speeches (in audio, video, or written transcript formats); laws; regulations; court
decisions or standards; and most government data, including census, economic,
and labor data. Primary sources are always the most authoritative because the
information has not been altered or interpreted by a second party. Other internal
sources of primary data are inventory records, personnel records, purchasing
requisition forms, statistical process control charts, and similar data. Secondary
sources are interpretations of primary data. Encyclopedias, textbooks,
handbooks, magazine and newspaper articles, and most newscasts are
considered secondary information sources. Indeed, nearly all reference materials
fall into this category. Internally, sales analysis summaries and investor annual
reports would be examples of secondary sources, because they are compiled
from a variety of primary sources. To an outsider, however, the annual report is
viewed as a primary source, because it represents the official position of the
corporation.
Methods of primary data collection: 1) Monitoring (conditions, behaviors, events, processes): includes studies in which the researcher inspects the activities of a subject or the nature of some material without attempting to elicit responses from anyone, e.g., traffic counts at an intersection. 2) Communication (attitudes, motivations, intentions, expectations): the researcher questions the subjects and collects their responses by personal or impersonal means. The collected data may result from (i) interview or telephone conversations, (ii) self-administered or self-reported instruments sent through the mail, left in convenient locations, or transmitted electronically or by other means, or (iii) instruments presented before and/or after a treatment or stimulus condition in an experiment.
Data Collection Design: Steps: 1) Select relevant variables; 2) Specify levels of treatment; 3) Control the experimental environment; 4) Choose the experimental design – Screening design, Response surface design, Choice design, Life test design, Nonlinear design, Space-filling design, Full factorial design, Taguchi design, Mixture design, Evaluate design & Augment design.
Instrument Design: Steps: 1) Identify screening inquiry; 2) Prepare participation appeal; 3) Identify source of error; 4) Prepare error reduction plan; 5) Prepare instrument.
Survey vs Observation:
Survey: Very versatile in the types of data it can collect. This method gives respondents the opportunity to seek clarifications. Responses to the questions can be sought through personal interviews, ordinary mail, or electronic communication. Time and cooperation are required from the respondent.
Observation: Data collection is constrained to what can be observed or heard. Studying attitudes or feelings is not possible. Observation can be done mechanically (e.g., videotapes) or through a human observer. This method is best for studying infants and children who cannot speak. It requires no extra effort from the respondent and is not affected by the presence of an interviewer. Types of Observations: 1) Natural vs Contrived observation; 2) Disguised vs Non-disguised; 3) Human vs Mechanical; 4) Web-based observation.
Experiments: Read Unit II cheatsheet
Construction of questionnaire and instrument: Question construction involves three critical decision areas: (a) question content, (b) question wording, and (c) response strategy. Question content should pass the following tests: Should the question be asked? Is it of proper scope? Can and will the participant answer adequately? Question wording difficulties exceed most other sources of distortion in surveys. Each response strategy generates a specific level of data, and the statistical procedures available for each scale type influence the choice of response strategy. Participant factors include level of information about the topic, degree to which the topic has been thought through, ease of communication, and motivation to share information.
Instruments obtain three general classes of information. Target questions
address the investigative questions and are the most important. Classification
questions concern participant characteristics and allow participants’ answers to
be grouped for analysis. Administrative questions identify the participant,
interviewer, and interview location and conditions.
Validation of questionnaire:
Retention of a question should be confirmed by answering these questions: Is
the question stated in terms of a shared vocabulary? Does the vocabulary have a
single meaning? Does the question contain misleading assumptions? Is the
wording biased? Is it correctly personalized? Are adequate alternatives
presented?
Definitions: The idea of sampling is that by selecting some of the elements in a population, we may draw conclusions about the entire population. A population element is the individual participant or object on which the measurement is taken. It is the unit of study. A population is the total collection of elements about which we wish to make some inferences. A census is a count of all the elements in a population. The listing of all population elements from which the sample will be drawn is called the sampling frame.
Sample Types: 1) Nonprobability sampling is arbitrary and subjective; when we choose subjectively, we usually do so with a pattern or scheme in mind (e.g., only talking with young people or only talking with women). Members of the population do not each have a known chance of being included. 2) Probability sampling is based on the concept of random selection: a controlled procedure that assures that each population element is given a known nonzero chance of selection. This procedure is never haphazard. Only probability samples provide estimates of precision.
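A minimal Python sketch of random selection from a sampling frame follows. The frame contents and the function name simple_random_sample are hypothetical. It uses the standard library's random.sample, which draws without replacement, so every element of the frame has the same known, nonzero chance of selection (n / N).

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame at random."""
    rng = random.Random(seed)    # seeded only so the draw can be repeated
    return rng.sample(frame, n)  # sampling without replacement

# Hypothetical sampling frame: a listing of all 100 population elements.
frame = [f"element_{i}" for i in range(1, 101)]

sample = simple_random_sample(frame, n=10, seed=42)

# Known, nonzero chance of selection for every element: n / N.
inclusion_probability = 10 / len(frame)
print(len(sample), inclusion_probability)
```

A nonprobability sample (e.g., taking the first ten names that come to mind) has no such computable inclusion probability, which is why it cannot provide estimates of precision.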
Sample plan:
Sampling Design Steps: 1. What is the target population? 2. What are the
parameters of interest? 3. What is the sampling frame? 4. What is the
appropriate sampling method? 5. What size sample is needed?
Sample size:
The sample size is an important feature of any empirical study in which the goal is
to make inferences about a population from a sample.
Determinants of optimal sample size:
1) Type of analysis to be employed; 2) The level of precision needed; 3)
Population homogeneity/heterogeneity; 4) Available resources; 5) Sampling
technique used
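As an illustration of determinants 2) and 3), here is a sketch using one common textbook formula for estimating a proportion, n = z² · p(1 − p) / e². The formula is not given in these notes. It assumes simple random sampling from a large population, and the function name and default values are illustrative choices.

```python
import math

def sample_size_for_proportion(margin_of_error, p=0.5, z=1.96):
    """Sample size for estimating a proportion (assumed textbook formula).

    margin_of_error: desired precision e (e.g., 0.05 for +/- 5 points)
    p: expected proportion; 0.5 is the most heterogeneous (worst) case
    z: standard normal value for the confidence level (1.96 ~ 95%)
    """
    n = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
    return math.ceil(n)  # round up: a fraction of a respondent is not possible

# Higher precision (smaller e) demands a larger sample.
print(sample_size_for_proportion(0.05))  # 385
print(sample_size_for_proportion(0.03))

# A more homogeneous population (p far from 0.5) needs a smaller sample.
print(sample_size_for_proportion(0.05, p=0.1))
```

The other determinants (type of analysis, available resources, sampling technique) are not captured by this formula and still need to be weighed separately.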
Sampling techniques:
Types: Unrestricted: 1) Simple Random (Probability), 2) Convenience (Nonprobability). Restricted: 1) Complex Random (Probability) – Systematic, Cluster, Stratified, Double. 2) Purposive (Nonprobability) – Judgement, Quota, Snowball. See Exhibit 14-8.
Probability vs non-probability sampling methods:
Probability Sampling:
• You have a complete sampling frame, i.e., contact information for the entire population.
• You can select a random sample from your population. Since every person (or “unit”) has a known, nonzero chance of being selected for your survey (an equal chance under simple random sampling), you can randomly select participants without missing entire portions of your audience.
• You can generalize your results from a random sample. With this data collection method and a decent response rate, you can extrapolate your results to the entire population.
• Can be more expensive and time-consuming than convenience or purposive sampling.
Nonprobability Sampling:
• Used when an exhaustive population list is not available. Some units cannot be selected, so you have no way of knowing the size and effect of sampling error (missed persons, unequal representation, etc.).
• Not random.
• Can be effective for generating ideas and getting feedback, but you cannot generalize your results to an entire population with a high level of confidence. Quota samples (males and females, etc.) are an example.
• More convenient and less costly, but does not meet the requirements of probability theory.
Stratified Sampling: 1. We divide the population into a few subgroups: each subgroup has many elements in it; subgroups are selected according to some criterion that is related to the variables under study. 2. We try to secure homogeneity within subgroups. 3. We try to secure heterogeneity between subgroups. 4. We randomly choose elements from within each subgroup.
Cluster Sampling: 1. We divide the population into many subgroups: each subgroup has few elements in it; subgroups are selected according to some criterion of ease or availability in data collection. 2. We try to secure heterogeneity within subgroups. 3. We try to secure homogeneity between subgroups. 4. We randomly choose several subgroups that we then typically study in depth.
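The contrast between the two procedures can be sketched in Python. The student and classroom data and both function names are hypothetical. Stratified sampling draws some elements from every subgroup, while cluster sampling draws a few whole subgroups and keeps all of their elements.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Randomly choose per_stratum elements from within EACH subgroup."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in population:
        strata[stratum_of(element)].append(element)
    sample = []
    for key in sorted(strata):  # sorted only for a repeatable order
        sample.extend(rng.sample(strata[key], per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole subgroups; every element in them is studied."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Few large strata: 30 hypothetical students across 3 programmes.
programmes = ["BBA", "MBA", "PGDM"]
population = [(f"S{i:02d}", programmes[i % 3]) for i in range(30)]
strat = stratified_sample(population, lambda s: s[1], per_stratum=4, seed=1)

# Many small clusters: 10 hypothetical classrooms of 3 students each.
classrooms = {f"room_{r}": [f"R{r}-{j}" for j in range(3)] for r in range(10)}
clus = cluster_sample(classrooms, n_clusters=2, seed=1)

print(len(strat), sorted(clus))
```

In the stratified draw every programme is represented. In the cluster draw most classrooms are absent, which is acceptable only if classrooms resemble one another (homogeneity between subgroups).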