2. Measurement in Research
• Quantification of a characteristic or
attribute of a person, object, or event
• Provides for a consistent and meaningful
interpretation of the nature of an attribute
when the same measurement process or
instrument is used
• Systematic process that uses rules to
assign numbers to persons, objects, or
events which represent the amount or kind
of a specified attribute
3. Sources of Error of Measurements
• Respondents
• Situations
• Measurer
• Instrument
4. Brief Overview
• Reliability and Validity are important
concepts in research as they are used for
enhancing the accuracy of the assessment
and evaluation of a research work
(Tavakol and Dennick, 2011)
• They have different meanings under the
different types of research, i.e. quantitative
and qualitative research (Creswell, 2014)
• It is possible for a measurement to be
reliable but not valid; however, a measure
cannot be valid unless it is also reliable
5. a. Validity
• Ability of an instrument to measure what it
is designed to measure
• Refers to the accuracy of responses on
self-report, norm-referenced measures of
attitudes and behavior
• It is the extent to which any measuring
instrument measures what it is intended to
measure (Thatcher, 2010)
6. a. Validity: Types
1. Face Validity
• Refers to whether the instrument looks as
though it is measuring the appropriate
construct
• It is helpful for a measure to have face
validity if other types of validity have also
been demonstrated
7. a. Validity: Types
2. Content Validity
• Concerns the degree to which an
instrument has an appropriate sample of
items for the construct being measured
• Relevant in development of both affective
measures and cognitive measures
8. a. Validity: Types
3. Criterion-Related Validity
• Involves determining the relationship
between an instrument and an external
criterion
• Relates to our ability to predict some
outcome or estimate the existence of
some current condition
• The instrument is said to be valid if its
scores correlate highly with scores on the
criterion
9. a. Validity: Types
3.1 Predictive Validity
• Refers to the adequacy of an instrument in
differentiating between people’s
performance on some future criterion
3.2. Concurrent Validity
• Refers to an instrument’s ability to
distinguish individuals who differ on a
present criterion
10. a. Validity: Types
4. Construct Validity
• Degree to which scores on a test can be
accounted for by the explanatory
constructs of a sound theory
• For determining construct validity, we
associate a set of other propositions with
the results received from using our
measurement instrument
11. a. Validity: Types
5.1. Convergent Validity
• Evidence that different methods of
measuring a construct yield similar results
• Correlations between two different
methods measuring the same trait
5.2. Divergent Validity
• Ability to differentiate the construct from
other similar constructs
12. b. Reliability
• Refers to the consistency of responses on
self-report, norm-referenced measures of
attitudes and behavior
• Refers to the consistency, stability and
repeatability of results i.e. the result of a
researcher is considered reliable if
consistent results have been obtained in
identical situations but different
circumstances (Twycross and Shields,
2004)
13. b. Reliability: Factors
• Factors affecting the reliability of a
research instrument
– Wording of questions
– Physical setting
– Respondent’s mood
– Nature of interaction
– Regression effect of an instrument
14. b. Reliability: Types
1. Stability – the degree to which a
researcher obtains the same results in
repeated administrations, i.e. when the
same tool is used on the same sample
more than once; a reliability coefficient
provides an indication of how stable the
tool is
15. b. Reliability: Types
Test of Stability
• Test-retest
– Researchers administer the same measure to
a sample on two occasions and then compare
the scores
– The comparison is performed objectively by
computing a reliability coefficient (index of
magnitude of the test’s reliability)
– Reliability coefficient ranges from -1.00 through
0.00 to +1.00
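As an illustration of test-retest reliability, a minimal Python sketch with made-up scores from six respondents measured on two occasions; the helper name `pearson_r` and the data are illustrative assumptions, not from the source:

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores from the same 6 respondents on two occasions
time1 = [10, 12, 14, 9, 15, 11]
time2 = [11, 13, 14, 8, 16, 12]
print(round(pearson_r(time1, time2), 2))  # coefficient close to +1 suggests stability
```

A coefficient near +1.00 indicates stable scores across the two administrations; values near 0.00 indicate poor stability.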
16. b. Reliability: Types
2. Homogeneity – This is a measure of the
internal consistency of the scales.
Cronbach’s alpha is used to measure the
reliability of a tool
17. b. Reliability: Types
Test of Homogeneity
• Internal Consistency
– The most widely used reliability approach
among researchers
– It is economical and is the best means of
assessing an especially important source of
measurement error in instruments
– Types:
1. Split-half technique
2. Cronbach’s alpha
18. b. Reliability: Types
Test of Homogeneity
• Split Half Technique (Internal Consistency)
– One of the oldest methods for assessing
internal consistency
– Items on a scale are split into two groups and
scored independently; scores on the two half-
tests are then used to compute a correlation
coefficient
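A minimal sketch of the split-half technique, assuming an odd/even split of items and the Spearman-Brown correction (which adjusts the half-test correlation to full test length); the data and function names are illustrative:

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """item_scores: rows = respondents, columns = item scores.
    Splits items into odd/even halves, correlates the half-scores,
    then applies the Spearman-Brown correction for full test length."""
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    r = pearson_r(odd, even)
    return 2 * r / (1 + r)

# Hypothetical scores of 5 respondents on a 4-item scale
scores = [[3, 4, 3, 4], [2, 2, 3, 2], [5, 4, 4, 5], [1, 2, 1, 1], [4, 3, 4, 4]]
print(round(split_half_reliability(scores), 2))
```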
19. b. Reliability: Types
Test of Homogeneity
• Cronbach’s alpha (Internal Consistency)
– Most widely used method
– Preferable because it gives an estimate of the
split-half correlation for all possible ways of
dividing the measure into two halves
– Values normally range between 0.00 and
+1.00
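Cronbach's alpha can be computed directly from the item variances and the variance of the total scores; a minimal sketch with made-up data (the formula is the standard one, but the sample scores are illustrative):

```python
def variance(xs):
    """Population variance of a list of scores."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(item_scores):
    """item_scores: rows = respondents, columns = items.
    alpha = (k / (k - 1)) * (1 - sum of item variances / variance of totals)."""
    k = len(item_scores[0])
    columns = list(zip(*item_scores))          # per-item score columns
    totals = [sum(row) for row in item_scores]
    item_var = sum(variance(col) for col in columns)
    return (k / (k - 1)) * (1 - item_var / variance(totals))

# Hypothetical scores of 5 respondents on a 4-item scale
scores = [[3, 4, 3, 4], [2, 2, 3, 2], [5, 4, 4, 5], [1, 2, 1, 1], [4, 3, 4, 4]]
print(round(cronbach_alpha(scores), 2))
```

Values above roughly 0.70 are conventionally taken to indicate acceptable internal consistency, though the appropriate threshold depends on the purpose of the instrument.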
20. b. Reliability: Types
3. Equivalence – this is the level of agreement
among researchers using the same data
collection tool. The ratings of two or more
researchers are compared by calculating
a correlation coefficient
21. b. Reliability: Types
Test of Equivalence
• Interrater (or interobserver) Reliability
– Estimated by having two or more trained
observers watch an event simultaneously
and independently record data according to
the instrument’s instructions
– The data can be used to compute an index of
equivalence or agreement between observers
22. b. Reliability: Types
• Techniques such as Cohen’s kappa,
analysis of variance, intraclass correlations
and rank-order correlations assess this
reliability
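Of these techniques, Cohen's kappa for two raters is simple enough to sketch; it corrects the observed proportion of agreement for the agreement expected by chance. The rater data below are made up for illustration:

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Chance-corrected agreement between two raters' categorical ratings."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    # Chance agreement: product of each category's marginal proportions
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of 8 observations by two independent raters
rater1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
rater2 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no"]
print(cohens_kappa(rater1, rater2))  # 0.5
```

Kappa is 1.0 for perfect agreement and 0.0 when agreement is no better than chance.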
23. b. Reliability: Types
Test of Equivalence
• Alternate Form
– Consisting of two sets of similar questions
designed to measure the same trait
– The two tests are based on the same content,
but the individual items are different.
24. c. Other Criteria for Assessing
Quantitative Measures
• Sensitivity and Specificity
– Sensitivity = the ability of an instrument to
identify a “case” correctly, that is, to screen
in or diagnose a condition correctly; its rate
of yielding “true positives”
– Specificity = the instrument’s ability to identify
non-cases correctly, that is, to screen out those
without the condition, yielding “true
negatives”
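The two definitions above reduce to simple ratios over the confusion-matrix counts; a minimal sketch with made-up screening results (the variable names are illustrative):

```python
def sensitivity_specificity(truth, screened):
    """truth / screened: lists of booleans, True meaning 'is a case' /
    'screened positive'. Returns (sensitivity, specificity)."""
    tp = sum(t and s for t, s in zip(truth, screened))          # true positives
    fn = sum(t and not s for t, s in zip(truth, screened))      # false negatives
    tn = sum(not t and not s for t, s in zip(truth, screened))  # true negatives
    fp = sum(not t and s for t, s in zip(truth, screened))      # false positives
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical data: 4 true cases and 6 non-cases
truth    = [True, True, True, True, False, False, False, False, False, False]
screened = [True, True, True, False, False, False, False, False, True, False]
sens, spec = sensitivity_specificity(truth, screened)
print(sens)  # 0.75 — 3 of 4 true cases screened in
```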
25. c. Other Criteria for Assessing
Quantitative Measures
• Receiver Operating Characteristic (ROC)
curve
– Used to determine the best cutoff point (the
score value used to distinguish cases from
noncases)
– A tradeoff between sensitivity and specificity of
an instrument
– Sensitivity is plotted against the false-positive
rate (rate of incorrectly diagnosing someone as
a case, inverse of specificity)
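A ROC curve can be sketched by computing, for each candidate cutoff, the sensitivity and false-positive rate; the scores and cutoffs below are made up for illustration:

```python
def roc_points(scores, is_case, cutoffs):
    """For each cutoff, classify score >= cutoff as a 'case' and return
    (false_positive_rate, sensitivity) pairs, i.e. the ROC curve points."""
    points = []
    for c in cutoffs:
        tp = sum(s >= c and y for s, y in zip(scores, is_case))
        fn = sum(s < c and y for s, y in zip(scores, is_case))
        fp = sum(s >= c and not y for s, y in zip(scores, is_case))
        tn = sum(s < c and not y for s, y in zip(scores, is_case))
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points

# Hypothetical screening scores; True = actually has the condition
scores  = [2, 4, 5, 6, 7, 8, 9, 3]
is_case = [False, False, True, False, True, True, True, False]
for fpr, sens in roc_points(scores, is_case, cutoffs=[3, 5, 7, 9]):
    print(fpr, sens)
```

Raising the cutoff moves the point toward the lower left (fewer false positives, lower sensitivity), which is the tradeoff the curve makes visible; the cutoff nearest the upper-left corner is typically chosen as the best.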