Validation and Reliability
Lovebella C. Jao
Discussant
Validation
Validation is the process of collecting and analyzing evidence to support the meaningfulness and usefulness of the test. The purpose of validation is to determine the characteristics of the test as a whole.
Validation
Validity is the extent to which a test measures what it purports to measure; it also refers to the appropriateness, correctness, meaningfulness, and usefulness of the specific decisions a teacher makes based on the test results.
Three Main Types of Evidence
Content-related evidence
Criterion-related evidence
Construct-related evidence
Content-related evidence of validity refers to the content and format of the instrument.
How appropriate is the content? How comprehensive? Does it logically get at the intended variable? How adequately does the sample of items or questions represent the content to be assessed?
Criterion-related evidence of validity refers to the relationship between the scores obtained using the instrument and scores obtained using one or more other tests (the criterion).
How strong is this relationship? How well do such scores estimate present performance or predict future performance of a certain type?
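This relationship is usually summarized with a validity coefficient: the correlation between scores on the test and scores on the criterion measure. Below is a minimal Python sketch with a hand-rolled Pearson correlation; the scores and the choice of GPA as the criterion are hypothetical, for illustration only.

```python
from math import sqrt

# Hypothetical data: each student's test score paired with a criterion
# measure (here, grade point average).
test_scores = [78, 85, 62, 90, 70, 88, 55, 95, 67, 80]
gpa         = [2.8, 3.4, 2.1, 3.7, 2.5, 3.5, 1.9, 3.9, 2.3, 3.1]

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov   = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# The closer the coefficient is to 1.0, the stronger the criterion-related
# evidence of validity.
print(f"Validity coefficient: {pearson_r(test_scores, gpa):.2f}")
```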
Expectancy Table (~ Gronlund)

                     Grade Point Average
Test Score    Very Good    Good    Needs Improvement
High              20         10            5
Average           10         25            5
Low                1         10           14
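Read row-wise, the table says that of the 35 students with high test scores, 20 earned a Very Good grade point average, 10 a Good one, and 5 a Needs Improvement one. Such a table is simply a cross-tabulation of test-score levels against criterion categories; here is a small Python sketch of how the counts could be tallied from raw records (the records below are hypothetical):

```python
from collections import Counter

# One (test-score level, GPA category) pair per student; hypothetical data.
records = [
    ("High", "Very Good"), ("High", "Good"), ("Average", "Good"),
    ("Low", "Needs Improvement"), ("Average", "Very Good"),
    # ...one pair for every remaining student
]

counts = Counter(records)
levels = ["High", "Average", "Low"]
categories = ["Very Good", "Good", "Needs Improvement"]

# Print the expectancy table: rows are test-score levels, columns are
# grade-point-average categories, cells are student counts.
print(f"{'Test Score':<12}" + "".join(f"{c:>20}" for c in categories))
for level in levels:
    print(f"{level:<12}" + "".join(f"{counts[(level, c)]:>20}" for c in categories))
```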
Construct-related evidence of validity refers to the nature of the psychological construct or characteristic being measured by the test.
How well does a measure of the construct explain differences in the behavior of individuals or their performance on a certain task?
Reliability
Reliability refers to the consistency of the scores obtained: how consistent they are for each individual from one administration of an instrument to another and from one set of items to another.
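Consistency from one administration to another is typically estimated by correlating the two sets of scores (test-retest reliability), while consistency from one set of items to another is often estimated with an internal-consistency index such as Cronbach's alpha. A minimal sketch of the alpha computation on a hypothetical students-by-items score matrix:

```python
# Cronbach's alpha: k / (k - 1) * (1 - sum of item variances / variance of totals)
# The 0/1 item scores below are hypothetical.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]  # each row: one student's scores on four items

def variance(values):
    """Sample variance (n - 1 denominator)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

k = len(scores[0])                                      # number of items
item_vars = [variance([row[i] for row in scores]) for i in range(k)]
total_var = variance([sum(row) for row in scores])      # variance of total scores
alpha = k / (k - 1) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha: {alpha:.2f}")
```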
Table of Interpretation

Reliability      Interpretation
.90 and above    Excellent reliability; at the level of the best standardized tests.
.80 - .90        Very good for a classroom test.
.70 - .80        Good for a classroom test; in the range of most. There are probably a few items which could be improved.
.60 - .70        Somewhat low. This test needs to be supplemented by other measures (e.g., more tests) to determine grades. There are probably some items which could be improved.
.50 - .60        Suggests need for revision of the test, unless it is quite short (ten or fewer items). The test definitely needs to be supplemented by other measures (e.g., more tests) for grading.
.50 or below     Questionable reliability. This test should not contribute heavily to the course grade, and it needs revision.
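The bands above can be applied mechanically once a reliability coefficient has been computed. A small helper that looks up the table's interpretation; note that the table's ranges share their endpoints, so assigning a boundary value (e.g., exactly .80) to the higher band is a choice made here, not stated in the table:

```python
def interpret_reliability(r):
    """Map a reliability coefficient to the interpretation bands above."""
    if r >= 0.90:
        return "Excellent reliability; at the level of the best standardized tests"
    if r >= 0.80:
        return "Very good for a classroom test"
    if r >= 0.70:
        return "Good for a classroom test; a few items could probably be improved"
    if r >= 0.60:
        return "Somewhat low; supplement with other measures to determine grades"
    if r >= 0.50:
        return "Needs revision unless quite short; supplement with other measures"
    return "Questionable reliability; revise and do not weight heavily in the grade"

print(interpret_reliability(0.83))  # -> Very good for a classroom test
```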
Thank You for Listening!
