2. Unit III
• Scaling & measurement techniques:
– Concept of Measurement:
– Need of Measurement;
– Problems in measurement in management research –
– Validity and Reliability.
– Levels of measurement – Nominal, Ordinal, Interval, Ratio.
• Attitude Scaling Techniques:
– Concept of Scale – Rating Scales viz. Likert Scales, Semantic
Differential Scales, Constant Sum Scales, Graphic Rating
Scales
– Ranking Scales – Paired comparison & Forced Ranking –
Concept and Application.
3. Concept of Measurement
• Definition: Metrology is the name given to the science
of pure measurement. Engineering metrology is
restricted to measurements of length and angle.
• Measurement is defined as the process of numerical
evaluation of a dimension or the process of
comparison with standard measuring instruments
• Quantifying the independent and dependent variable.
• “The assignment of numerals to objects or events
according to rules” - Stevens
• “The assignment of numbers to represent properties”-
Campbell
4. Importance or Need for Measurement
• It helps in doing comparison.
• It helps in establishing relationship between variables.
• Observations must be quantifiable in order to subject them to
statistical analysis.
• This is the building block of any data analysis.
• To check Customer Satisfaction
• To evaluate the validity and reliability of measuring
instrument
• It helps in converting Physical parameter into meaningful
number.
• It helps to determine the true dimension and evaluate performance.
5. Problems in measurement in management research
• Problem of Uni-dimensionality: Measurement should not measure
more than one characteristic at a time. Example: scale should not
measure length and temperature at the same time.
• Problem of Linearity: A good measurement should follow the
straight line model.
• Problem of Validity: The measurement scale should measure what
it is supposed to measure.
• Problem of Reliability: This refers to consistency. The measurement
should give consistent result.
• Problem of Accuracy and precision: It should give an accurate and
precise measure of what is being measured.
• Problem of Complexity: The measurement tool should not be very
complicated or over-elaborate.
• Problem of Practicability: The tool should be easy to understand
and administer.
6. Validity and Reliability
• Reliability
• Reliability refers to the consistency of a
measure.
• Validity
• Validity is the extent to which the scores from
a measure represent the variable they are
intended to.
• A valid measure measures what it is
supposed to measure.
7. Validity and Reliability
• Reliability
• Reliability refers to the consistency of a
measure.
• Psychologists consider three types of
consistency:
I. over time (test-retest reliability),
II. across items (internal consistency),
III. and across different researchers (inter-rater
reliability).
8. Reliability
• Over time (test-retest reliability),
• When researchers measure a construct that they assume to
be consistent across time, then the scores they obtain
should also be consistent across time. Test-retest
reliability is the extent to which this is actually the
case.
• For example, intelligence is generally thought to be
consistent across time. A person who is highly intelligent
today will be highly intelligent next week. This means that
any good measure of intelligence should produce roughly
the same scores for this individual next week as it does
today.
• Test-retest reliability is assessed by correlating the scores
obtained on the two occasions.
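The test-retest check can be sketched as a correlation between two administrations of the same measure. This is only an illustration: the scores are made-up values, and `pearson_r` is a helper name introduced here, not something taken from the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five people, measured one week apart.
week_1 = [100, 112, 95, 130, 105]
week_2 = [102, 110, 97, 128, 107]

r = pearson_r(week_1, week_2)
print(f"test-retest correlation: {r:.2f}")
```

A correlation close to +1 suggests the measure is stable over time; a low value suggests it is not.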
9. Reliability
• Across items (internal consistency),
• A second kind of reliability is internal consistency, which is
the consistency of people’s responses across the items on a
multiple-item measure.
• In general, all the items on such measures are supposed to
reflect the same underlying construct, so people’s scores on
those items should be correlated with each other.
• On the Rosenberg Self-Esteem Scale, people who agree that
they are a person of worth should tend to agree that
they have a number of good qualities.
• The items are split into two halves, and the scores on the
two halves are then correlated (split-half method).
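One common way to check internal consistency is a split-half correlation: the items are divided into two sets, each person gets a total for each set, and the two totals are correlated. The sketch below assumes an odd/even split and uses invented responses; `pearson_r` and the data are illustrative names, not part of the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical responses: one row per person, one column per item (1-4 agreement).
responses = [
    [4, 4, 3, 4],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [1, 2, 1, 1],
    [4, 3, 4, 4],
]

# Split the items into odd- and even-numbered halves and total each half per person.
odd_half = [row[0] + row[2] for row in responses]
even_half = [row[1] + row[3] for row in responses]

split_half_r = pearson_r(odd_half, even_half)
print(f"split-half correlation: {split_half_r:.2f}")
```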
10. Reliability
• Across different researchers (inter-rater reliability).
• Inter-rater reliability is the extent to which different
observers are consistent in their judgments.
• For example, to measure university students' social
skills, you could make video recordings of them as
they interact with another student whom they
are meeting for the first time. Two raters would then
watch the recordings and rate each student's social skills.
• Inter-rater reliability would also have been measured
in Bandura’s doll study.
11. Validity
• Validity is the extent to which the scores from a
measure represent the variable they are intended
to.
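• Example: suppose someone believes that people’s index finger
length reflects their self-esteem and therefore tries to measure
self-esteem by holding a ruler up to people’s
index fingers. The readings would be consistent, but they would
not be a valid measure of self-esteem.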
• Types:
a. Content Validity.
b. Criterion Validity.
c. Construct Validity.
12. Validity
• Content validity:
a. In psychometrics, content validity (also known as
logical validity) refers to the extent to which a
measure represents all facets of a given
construct.
b. For example, a depression scale may lack
content validity if it only assesses the affective
dimension of depression but fails to take into
account the behavioral dimension.
c. Extrovert and Introvert.
13. Validity
• Content validity is of two types:
a. Face validity (surface or superficial assessment)
b. Sampling validity.
– Face validity is the extent to which a measurement
method appears “on its face” to measure the construct of
interest.
– Face validity is determined through a subjective
evaluation of a measuring scale.
• Example: a researcher may develop a scale to measure consumer
attitude towards a brand and pre-test the scale among a few
experts. If the experts are satisfied with the scale, the researcher
may conclude that the face validity of the scale is high.
• Personality Test (psychometric test)
• Example: Milk – Cancer.
14. Validity
– Sampling validity refers to how
representative the content of the measuring
instrument is. The instrument's
content should be representative of the
content universe of the characteristic
being measured.
– Example: if attitude is the characteristic
being measured, its content universe may
comprise statements and questions
indicating all aspects of attitude and the
sampling validity can be determined by
comparing these with the content of the
measuring instrument.
1. Social Factors.
2. Direct Instruction.
3. Family.
4. Prejudices.
5. Personal Experience.
6. Media.
7. Educational and Religious Institutions.
8. Physical Factors.
9. Economic Status and Occupations.
15. Validity
• Criterion validity is the extent to which people’s
scores on a measure are correlated with other
variables (known as criteria) that one would
expect them to be correlated with.
– For example, people’s scores on a new measure of
test anxiety should be negatively correlated with their
performance on an important school exam.
– If it were found that people’s scores were in fact
negatively correlated with their exam performance,
then this would be a piece of evidence that these
scores really represent people’s test anxiety.
16. Validity
• Construct Validity.
• A construct, or psychological construct is an attribute,
proficiency, ability, or skill of the human brain and is
defined by established theories. For example, "overall
English language proficiency" is a construct. It exists in
theory and has been observed to exist in practice.
– Example: in a differential-groups study the performances
on the test are compared for two groups: (1) one that has
the construct and (2) one that does not have the construct. If
the group with the construct performs better than the
group without the construct it may provide evidence of the
construct validity of the test.
– Example: Global Warming.
17. Levels of data
• Nominal
• Ordinal
• Interval (Scale in SPSS)
• Ratio (Scale in SPSS)
18. Nominal data
• a more “crude” form of data:
limited possibilities for statistical
analysis
• categories, classifications, or groupings
– “pigeon-holing” or labeling
• merely measures the presence
or absence of something
– gender: male or female
– immigration status:
documented,
undocumented
– zip codes: 90210, 92634,
91784
• nominal categories aren’t
hierarchical, one category isn’t
“better” or “higher” than
another
• assignment of numbers to the
categories has no mathematical
meaning.
• nominal categories should be
mutually exclusive and
exhaustive.
19. Nominal data-continued
• nominal data is usually
represented “descriptively”
• graphic representations
include tables, bar graphs, pie
charts.
• there are limited statistical
tests that can be performed
on nominal data
• if nominal data can be
converted to averages,
advanced statistical analysis is
possible.
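As a rough illustration of the descriptive statistics that suit nominal data, the sketch below counts category frequencies, converts them to percentages, and finds the mode. The zip code sample is invented for the example; averaging the codes themselves would be meaningless because the numbers are only labels.

```python
from collections import Counter

# Hypothetical nominal observations: zip codes used purely as labels.
zip_codes = ["90210", "92634", "90210", "91784", "90210"]

counts = Counter(zip_codes)                      # frequency of each category
total = len(zip_codes)
percentages = {code: 100 * n / total for code, n in counts.items()}
mode = counts.most_common(1)[0][0]               # most frequent category

for code, n in counts.items():
    print(f"{code}: {n} ({percentages[code]:.0f}%)")
print("mode:", mode)
```

These counts are what a bar graph or pie chart of nominal data would display.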
20. Ordinal data
• more sensitive than
nominal data, but still
lacking in precision
• exists in a rank order,
hierarchy, or sequence
– highest to lowest, best
to worst, first to last
• allows for comparisons
along some dimension
– example: the performance
of one person is better
than that of another
• examples:
– 1st, 2nd, 3rd places
finishes in a horse race
– top 10 movie box office
successes of 2006
– bestselling books (#1,
#2, #3 bestseller, etc.)
21. More about ordinal data
• no assumption of “equidistance” of
numbers
–increments or gradations aren’t
necessarily uniform.
• researchers do sometimes treat
ordinal data as if it were interval data
• there are limited statistical tests
available with ordinal data.
22. Interval data (scale data)
• represents a more sensitive type of data
or sophisticated form of measurement.
• assumption of “equidistance” applies to
data or numbers gathered.
– gradations, increments, or units of measure
are uniform, constant.
• examples:
– Scale data: Likert scales
– Stanford-Binet IQ test
– Semantic differential scales
23. More about interval data
• scores can be compared to one another, but in
relative, rather than absolute terms.
– example: If Fred is rated a “6” on attractiveness, and
Barney a “3,” it doesn’t mean Fred is twice as attractive as
Barney.
• No true zero point (a complete absence of the
phenomenon being measured)
– example: A person can’t have zero intelligence or zero
self-esteem.
• Scale data is usually aggregated or converted to
averages.
• Amenable to advanced statistical analysis.
24. Ratio data
• The most sensitive, powerful type of data
– ratio measures contain the most precise
information about each observation
that is made.
• Examples:
– time as a unit of measure.
– distance as a unit of measure (setting
an odometer to zero before beginning a
trip)
– weight and height as units of measure.
25. More about ratio data
• more prevalent in the natural
sciences, less common in social
science research
• includes a true zero point
• Allows for absolute comparisons
– If Fred can lift 200 lbs and Barney can
lift 100 lbs, Fred can lift twice as much
as Barney, i.e., a 2:1 ratio
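The difference between relative and absolute comparisons can be checked with a little arithmetic. In the sketch below, the weights are the Fred and Barney figures from the slides, while the 1-to-7 rating scale and its relabeling as 0-to-6 are assumptions made only for illustration.

```python
# Ratio data: a true zero exists, so the ratio survives a change of units.
LBS_TO_KG = 0.45359237
fred_lbs, barney_lbs = 200, 100
ratio_lbs = fred_lbs / barney_lbs
ratio_kg = (fred_lbs * LBS_TO_KG) / (barney_lbs * LBS_TO_KG)

# Interval data: the zero point is arbitrary. Relabeling an assumed 1-7 scale
# as 0-6 keeps the differences but changes the apparent "ratio".
fred_rating, barney_rating = 6, 3
ratio_original = fred_rating / barney_rating
ratio_shifted = (fred_rating - 1) / (barney_rating - 1)
difference_original = fred_rating - barney_rating
difference_shifted = (fred_rating - 1) - (barney_rating - 1)

print(ratio_lbs, round(ratio_kg, 6))             # same ratio in both units
print(ratio_original, ratio_shifted)             # ratio depends on where zero is put
print(difference_original, difference_shifted)   # differences are unchanged
```

Because the rating "ratio" moves when the labels are shifted, only differences are meaningful for interval data, whereas the weight ratio stays the same in any unit.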
27. Attitude
• An attitude is viewed as an enduring
disposition to respond consistently in a given
manner to various aspects of the world,
including persons, events and objects.
– Attitudes cannot be measured directly
– Attitudes are derived from perceptions
– Attitudes are observed indirectly
29. Attitude
• Cognitive component: Represents an
individual’s information and knowledge about an
object. Example: what a person remembers about
Tupperware.
• Affective component: Summarizes a person’s
overall feelings or emotions towards the object.
Example: liking the taste of food cooked in a pressure cooker.
• Intention or action component: Reflects a
person’s expected future behavior towards an
object. Example: the intention to purchase a product
in the future.
30. Scaling
• Scaling describes the procedures of assigning
numbers to various degrees of opinion,
attitude and other concepts.
• It may be stated here that a scale is a
continuum, consisting of the highest point (in
terms of some characteristic e.g., preference,
favourableness, etc.) and the lowest point
along with several intermediate points
between these two extreme points.
32. Classification of Scales
• Single Item Scale: In the single item scale, there
is only one item to measure a given construct
– How satisfied are you with your current job?
• Very Dissatisfied
• Dissatisfied
• Neutral
• Satisfied
• Very Satisfied
• Other aspects of the job, such as pay, work
environment, rules and regulations, job security
and communication with seniors, may be left out.
33. Classification of Scales
• Multiple Item Scale: In this there are many
items that play an important role forming the
underlying construct that the researcher is
trying to measure.
– How satisfied are you with the pay?
– How satisfied are you with the rules?
– How satisfied are you with the job?
34. Classification of Scales
Comparative vs Non-Comparative Scales
• Scaling Techniques
– Comparative: Paired Comparison, Constant Sum, Rank Order, Q-Sort and Other Procedures
– Non-Comparative: Graphic Rating Scale, Itemized Rating Scale (Likert, Semantic Differential, Stapel)
35. Comparative Scale
• In comparative scales it is assumed that
respondents make use of a standard frame of
reference before answering the question.
• Example: How do you rate Barista in comparison
to Café Coffee Day on quality of beverages?
• Please rate Domino's in comparison to Pizza Hut
on the basis of your satisfaction level on an 11-point
scale, where 1 = Extremely poor, 6 = Average and
11 = Extremely good.
37. Comparative Scale – Paired Comparison
• Here a respondent is presented with two objects
and is asked to select one according to whatever
criterion he or she wants to use.
• The Resulting data from this scale is ordinal in
nature.
• Example: a respondent is shown chocolate, burger, ice
cream and pizza two at a time and picks one item from each pair.
• In general, if there are n items, the number of
paired comparisons would be $\frac{n(n-1)}{2}$.
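The pair count can be checked by listing every pair explicitly. This sketch uses the four food items from the example above and Python's standard `itertools.combinations`; the variable names are chosen here for illustration.

```python
from itertools import combinations

items = ["chocolate", "burger", "ice cream", "pizza"]

# Every unordered pair of distinct items that a respondent would compare.
pairs = list(combinations(items, 2))

n = len(items)
expected = n * (n - 1) // 2   # formula for the number of paired comparisons

for first, second in pairs:
    print(f"{first} vs {second}")
print(len(pairs), expected)
```

With four items there are 6 pairs; with five brands the same formula gives 10.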
38. Comparative Scale – Paired Comparison
• There are many ways of doing it….
• The analysis of paired comparison data would
result in ordinal scale and also interval scale
measurement.
• Example: there are five brands (A, B, C, D and E),
and paired comparisons with two brands at a time
are presented to the respondent with the option to
choose one of them.
• As there are five brands, this results in 10 paired
comparisons.
• Sample of 250 respondents.
39.
        A      B      C      D      E
A       -     0.60   0.30   0.60   0.35
B      0.40    -     0.28   0.70   0.40
C      0.70   0.72    -     0.65   0.10
D      0.40   0.30   0.35    -     0.42
E      0.65   0.60   0.90   0.58    -

        A      B      C      D      E
A       -      1      0      1      0
B       0      -      0      1      0
C       1      1      -      1      0
D       0      0      0      -      0
E       1      1      1      1      -
Total   2      3      1      4      0
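The second table appears to be derived from the first by writing 1 wherever a proportion exceeds 0.5 and then totalling each column. The sketch below reproduces that step under an assumed reading, which the slides do not state explicitly: each cell is the proportion of respondents who preferred the column brand over the row brand.

```python
brands = ["A", "B", "C", "D", "E"]

# Proportions from the first table; None marks the diagonal.
proportions = [
    [None, 0.60, 0.30, 0.60, 0.35],
    [0.40, None, 0.28, 0.70, 0.40],
    [0.70, 0.72, None, 0.65, 0.10],
    [0.40, 0.30, 0.35, None, 0.42],
    [0.65, 0.60, 0.90, 0.58, None],
]

# 1 if more than half preferred the column brand (assumed reading), else 0.
wins = [[0 if p is None else int(p > 0.5) for p in row] for row in proportions]

# Column totals: how many pairings each brand wins.
totals = {brand: sum(row[j] for row in wins) for j, brand in enumerate(brands)}

# Ordinal ranking of brands from most to fewest wins.
ranking = sorted(brands, key=lambda b: totals[b], reverse=True)

print(totals)
print(ranking)
```

The totals match the last row of the table (A 2, B 3, C 1, D 4, E 0), which gives the ordinal ranking D, B, A, C, E under this reading.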
40. Comparative Scale – Rank Order Scale
• In Rank order scaling,
respondents are presented
with several objects
simultaneously and asked
to order or rank them
according to some
criterion. Consider, for
example the following
question:
Soft Drink Rank
Coke
Pepsi
Limca
Sprite
Mirinda
Seven up
Fanta
41. Comparative Scale – Constant Sum Rating Scale
• Here the respondents
are asked to allocate a
total of 100 points
among various
objects or brands. The
respondent distributes
the points among the
objects in the
order of his or her preference.
School Points
DPS
Jagran Public School
Mount Litera
DAV Public School
Jaipuria
Sunbeam
International
Atulanand
Heritage
Total 100
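A constant sum response is only usable if the points add up to the required total. The sketch below checks that and reads off the order of preference; the allocation is an invented response over a subset of the schools listed, and `validate_constant_sum` is a hypothetical helper name.

```python
def validate_constant_sum(allocation, total=100):
    """Return True if all points are non-negative and add up to `total`."""
    points = allocation.values()
    return all(p >= 0 for p in points) and sum(points) == total

# Hypothetical allocation by one respondent (subset of the schools above).
allocation = {
    "DPS": 30,
    "DAV Public School": 25,
    "Jaipuria": 20,
    "Atulanand": 15,
    "Heritage": 10,
}

is_valid = validate_constant_sum(allocation)

# More points means stronger preference, so sort from most to fewest points.
preference_order = sorted(allocation, key=allocation.get, reverse=True)

print(is_valid)
print(preference_order)
```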
42. Comparative Scale – Q-Sort
• The Q-sort technique was developed to discriminate
among a large number of objects quickly. This
technique makes use of the rank order procedure in
which objects are sorted into different piles based on
their similarity with respect to a certain criterion.
• Example: the objects can be sorted into five piles:
– Strongly agree
– Somewhat agree
– Neutral
– Somewhat disagree
– Strongly disagree
44. Non Comparative Scales
• Here the respondents do
not make use of any
frame of reference before
answering the questions.
The resulting data is
generally assumed to be
interval or ratio scale.
• The respondent may be
asked to evaluate the
quality of food in a
restaurant on a five-point
scale (1 = very poor,
2 = poor, ..., 5 = very good).
• Non-Comparative Scales:
– Graphic Rating Scale
– Itemised Rating Scale
– Likert Rating Scale
– Semantic Differential Rating Scale
– Stapel Rating Scale
45. Graphic Rating Scale
• This is a continuous scale, also called a graphic
rating scale. In the graphic rating scale the
respondent is asked to place a tick mark on a line
running between two extremes:
Least Preferred |--------------------| Most Preferred
47. Itemized Rating Scale
• Respondents are provided with a scale that
has a number or description associated with
each of the response categories.
• Issues related to the Itemized Rating Scale
– Number of categories to be used.
– Odd or even number of categories.
– Balanced versus unbalanced scales.
– Nature and degree of verbal description.
– Forced Versus non-forced scales.
– Physical form.
49. Likert Scale
• This is a multiple-item agree-disagree scale, usually with an odd
number of response categories.
• It measures the degree of agreement or disagreement with each statement.
• A typical instrument may contain 25 to 30 statements.
• The Likert Scale is a 5- or 7-point scale that offers
a range of answer options — from one extreme
attitude to another, like “extremely likely” to “not
at all likely.” Typically, they include a moderate or
neutral midpoint.
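Scoring a multiple-item Likert instrument usually means coding each answer as a number and combining the codes. The sketch below assumes a 5-point agreement scale coded 1 to 5 and a simple summated score; the labels, codes, and the `likert_score` helper are illustrative assumptions rather than anything specified in the slides.

```python
# Assumed numeric coding for a 5-point agreement scale.
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

def likert_score(answers):
    """Sum the numeric codes of one respondent's answers (summated score)."""
    return sum(LIKERT_CODES[answer.lower()] for answer in answers)

# Hypothetical answers from one respondent to four statements.
answers = ["Agree", "Strongly agree", "Neutral", "Agree"]

total = likert_score(answers)
mean_item = total / len(answers)

print(total, mean_item)
```

If some statements are worded negatively, their codes would normally be reversed before summing; that step is omitted here.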
50. Likert Scale
• Unipolar Likert Scale
– Now rate the customer care representative’s
knowledge of the product and its features:
• Not at all Satisfactory
• Not Satisfactory
• Somewhat Satisfactory
• Satisfactory
• Very Satisfactory
• Bipolar Likert Scale.
52. Semantic differential Rating Scale
• Semantic Differential Scale is a survey or
questionnaire rating scale that asks people to
rate a product, company, brand or any "entity"
on a multi-point rating scale. The two ends of
the scale are anchored by grammatically
opposite adjectives.
54. Stapel Rating Scale
• Definition: Stapel Scale is a unipolar (one
adjective) rating scale designed to measure
the respondent’s attitude towards the object
or event. The scale comprises 10
categories ranging from –5 to +5, without
any neutral point (zero).
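The ten Stapel categories can be listed directly, which also makes it easy to check a response. This is a small sketch; `is_valid_stapel` is a hypothetical helper name.

```python
# The Stapel categories: -5 to +5 with the neutral point (zero) left out.
stapel_values = [v for v in range(-5, 6) if v != 0]

def is_valid_stapel(response):
    """Return True if the response is one of the ten allowed categories."""
    return response in stapel_values

print(stapel_values)
print(len(stapel_values))
```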
55. Stapel Rating Scale
• For example, the respondent is asked to rate the quality
of food and the crew member service of an airline on a scale
ranging from -5 to +5.