VALIDITY AND RELIABILITY
GOODNESS OF MEASURE
• Item Analysis
• Reliability
• Validity
Item Analysis
• Item analysis is done to see whether the items in the
instrument belong there or not. Each item is examined for
its ability to discriminate between subjects whose total
scores are high and those whose scores are low. The
difference is tested through t-values, and items with high
t-values are usually retained in the instrument, as
sketched below.
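A minimal sketch of this high/low-group method (the 27% split convention and the simulated responses are illustrative assumptions, not from the slides):

```python
import numpy as np
from scipy import stats

# Hypothetical item-response matrix: rows = respondents, columns = items.
rng = np.random.default_rng(42)
trait = rng.normal(size=(100, 1))
responses = np.clip(np.rint(3 + trait + rng.normal(scale=1.0, size=(100, 10))), 1, 5)

total = responses.sum(axis=1)

# A common convention: compare the top and bottom 27% of total scorers.
high = responses[total >= np.quantile(total, 0.73)]
low = responses[total <= np.quantile(total, 0.27)]

# For each item, a t-test of the high-low difference; items with high
# t-values (low p) discriminate well and are retained.
for i in range(responses.shape[1]):
    t, p = stats.ttest_ind(high[:, i], low[:, i])
    print(f"Item {i + 1}: t = {t:.2f}, p = {p:.3g} -> {'keep' if p < 0.05 else 'review'}")
```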
CRITERIA FOR GOOD
MEASUREMENT
Two criteria are commonly used to assess the
quality of measurement scales in research:
1. Reliability
2. Validity
RELIABILITY
• The degree to which a measure is free from
random error and therefore gives consistent
results.
• An indicator of the measure's internal
consistency.
• Methods to measure reliability (a sketch follows this list):
▫ Test–retest reliability
▫ Parallel-form reliability
▫ Internal consistency
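Internal consistency is commonly quantified with Cronbach's alpha, and test–retest reliability as the correlation between two administrations of the same instrument. A minimal sketch with hypothetical data (the functions and item matrix are illustrative):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha: rows = respondents, columns = items."""
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def test_retest(scores_t1: np.ndarray, scores_t2: np.ndarray) -> float:
    """Test-retest reliability: correlation between two administrations."""
    return np.corrcoef(scores_t1, scores_t2)[0, 1]

# Hypothetical 5-item Likert data for 50 respondents.
rng = np.random.default_rng(7)
trait = rng.normal(size=(50, 1))
items = np.clip(np.rint(3 + trait + rng.normal(scale=0.8, size=(50, 5))), 1, 5)
print(f"Cronbach's alpha = {cronbach_alpha(items):.2f}")
```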
VALIDITY
• The accuracy of a measure or the extent to which a
score truthfully represents a concept.
• The ability of a measure (scale) to measure what it
is intended to measure.
• Establishing validity involves answers to the
following:
▫ Is there a consensus that the scale measures what it is
supposed to measure?
▫ Does the measure correlate with other measures of
the same concept?
▫ Does the behavior expected from the measure predict
actual observed behavior?
Face validity
• Just on its face, the instrument appears to be a
good measure of the concept ("intuitive, arrived
at through inspection").
▫ e.g., Concept = pain level
▫ Measure = verbal rating scale: "rate your pain from
1 to 10".
Face validity is sometimes considered a subtype of
content validity.
Content validity
• The extent to which a measuring instrument
provides adequate coverage of the topic under
study.
• If the instrument contains a representative
sample of the universe of content, content
validity is good.
• It can also be determined by a panel of
persons who judge how well the measuring
instrument meets the standards, but there is
no numerical way to express such judgments.
Content validity
• Content of the measure is justified by other
evidence, e.g. the literature.
• Entire range or universe of the construct is
measured.
• Usually evaluated and scored by experts in the
content area.
• A CVI (content validity index) of .80 or more
is desirable; a sketch of the calculation follows.
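A sketch of one common CVI convention (item-level CVI = the proportion of experts rating the item 3 or 4 on a 4-point relevance scale; the ratings below are invented):

```python
# Hypothetical expert relevance ratings on a 4-point scale
# (1 = not relevant ... 4 = highly relevant); rows = items, cols = experts.
ratings = [
    [4, 3, 4, 4, 3],   # item 1
    [4, 4, 3, 4, 4],   # item 2
    [2, 3, 2, 4, 3],   # item 3
]

for i, item in enumerate(ratings, start=1):
    # Item-level CVI: proportion of experts rating the item 3 or 4.
    i_cvi = sum(r >= 3 for r in item) / len(item)
    print(f"Item {i}: I-CVI = {i_cvi:.2f} ({'OK' if i_cvi >= 0.80 else 'revise'})")

# Scale-level CVI (averaging approach): mean of the item-level CVIs.
s_cvi = sum(sum(r >= 3 for r in item) / len(item) for item in ratings) / len(ratings)
print(f"S-CVI/Ave = {s_cvi:.2f}")
```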
Construct validity
• Sensitivity of the instrument to pick up minor
variations in the concept being measured.
Can an instrument designed to measure anxiety detect
different levels of anxiety, or only its presence or absence?
Ways of arriving at construct validity (a factor-analysis sketch follows this list):
▫ Hypothesis-testing method
▫ Factor-analysis approach
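A minimal sketch of the factor-analysis approach using scikit-learn (the simulated anxiety items are illustrative). Items that load strongly on a single factor support the claim that the scale taps one underlying construct:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical responses to a 6-item anxiety scale (rows = respondents).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))                  # one underlying trait
loadings = np.array([[0.8, 0.7, 0.9, 0.6, 0.8, 0.7]])
responses = latent @ loadings + rng.normal(scale=0.5, size=(200, 6))

fa = FactorAnalysis(n_components=1)
fa.fit(responses)

# Strong loadings on the single factor suggest the items measure
# one construct rather than several.
print("Factor loadings:", np.round(fa.components_[0], 2))
```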
Criterion Related Validity
• Concurrent validity
▫ Correspondence of one measure of a
phenomenon with another measure of the same
construct, administered at the same time.
Two tools are used to measure the same concept,
and a correlational analysis is performed. The
tool that has already been demonstrated to be
valid is the "gold standard" with which the other
measure must correlate, as sketched below.
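A minimal sketch of a concurrent-validity check (the scores are invented): correlate the new tool with the gold standard administered at the same time.

```python
import numpy as np
from scipy import stats

# Hypothetical scores: the established ("gold standard") tool and the new
# tool, administered to the same respondents at the same time.
gold_standard = np.array([12, 18, 9, 22, 15, 30, 11, 25, 19, 14])
new_measure   = np.array([14, 20, 10, 21, 17, 28, 12, 27, 18, 15])

# A high correlation supports the concurrent validity of the new measure.
r, p = stats.pearsonr(gold_standard, new_measure)
print(f"Concurrent validity: r = {r:.2f}, p = {p:.3f}")
```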
Predictive validity
• The ability of one measure to predict another,
future measure of the same concept.
If IQ predicts SAT, and SAT predicts QPA, then shouldn't IQ predict QPA?
(We could then skip SATs in admission decisions.)
The researcher is usually looking for a more efficient way to measure a
concept. A regression sketch follows.
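A minimal regression sketch of predictive validity (the IQ and QPA values are invented): regress the future criterion on the earlier measure.

```python
import numpy as np
from scipy import stats

# Hypothetical data: IQ measured at admission, QPA observed later.
iq  = np.array([100, 115, 92, 130, 108, 121, 98, 111])
qpa = np.array([2.8, 3.4, 2.5, 3.9, 3.0, 3.6, 2.6, 3.2])

# Simple linear regression: how well does the earlier score predict the
# future criterion?
slope, intercept, r, p, se = stats.linregress(iq, qpa)
print(f"QPA ~ {intercept:.2f} + {slope:.3f} * IQ (r = {r:.2f}, p = {p:.3f})")
```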
ASSESSING VALIDITY
1. Face or content validity: the subjective agreement
among professionals that a scale logically appears to
measure what it is intended to measure.
2. Criterion validity: the degree of correlation of a
measure with other standard measures of the same
construct.
▫ Concurrent validity: the new measure/scale is taken at
the same time as the criterion measure.
▫ Predictive validity: the new measure is able to predict a
future event/measure (the criterion measure).
3. Construct validity: the degree to which a measure/scale
confirms a network of related hypotheses generated from
theory based on the concepts. Two aspects, sketched below:
▫ Convergent validity
▫ Discriminant validity
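A minimal sketch of convergent versus discriminant evidence (all data simulated): two measures of the same construct should correlate strongly, while measures of different constructs should correlate weakly.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 150

# Hypothetical constructs: two anxiety scales (same construct) and a
# job-satisfaction scale (different construct).
anxiety = rng.normal(size=n)
anxiety_scale_a = anxiety + rng.normal(scale=0.4, size=n)
anxiety_scale_b = anxiety + rng.normal(scale=0.4, size=n)
job_satisfaction = rng.normal(size=n)

r_conv = np.corrcoef(anxiety_scale_a, anxiety_scale_b)[0, 1]
r_disc = np.corrcoef(anxiety_scale_a, job_satisfaction)[0, 1]
print(f"Convergent (A vs B): r = {r_conv:.2f}")              # expected high
print(f"Discriminant (A vs satisfaction): r = {r_disc:.2f}")  # expected near 0
```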
RELATIONSHIP BETWEEN VALIDITY AND
RELIABILITY
1. A measure that is not reliable cannot be
valid; for a measure to be valid, it
must be reliable. Thus, reliability is a
necessary condition for validity.
2. A measure that is reliable is not
necessarily valid; indeed, a measure can
be reliable but not valid. Thus, reliability is
not a sufficient condition for validity.
3. Therefore, reliability is a necessary but
not sufficient condition for validity.
MEASUREMENT ERROR
Measurement error occurs when the observed measurement of a construct or
concept deviates from its true value.
Reasons
▫ Mood, fatigue, and health of the respondent
▫ Variations in the environment in which measurements are taken
▫ A respondent may not understand the question being asked, and the
interviewer may have to rephrase it; while rephrasing the question,
the interviewer's bias may creep into the responses.
▫ Some of the questions in the questionnaire may be ambiguous.
▫ Errors may be committed at the time of coding or when entering data
from the questionnaire into the spreadsheet.
