2. It is the process of collecting and
analyzing evidence to support the
meaningfulness and usefulness of the
test.
The purpose of validation is to
determine the characteristics of the
whole test itself.
Validation
3. Validity is the extent to which a test
measures what it purports to measure.
It also refers to the appropriateness,
correctness, meaningfulness and
usefulness of the specific decisions a
teacher makes based on the test
results.
Validation
5. How appropriate is the content?
How comprehensive?
Does it logically get at the intended
variable?
How adequately does the sample of items or
questions represent the content to be
assessed?
Content-related evidence of validity
refers to the content and format of
the instrument.
6. How strong is this relationship?
How well do such scores estimate
present or predict future performance
of a certain type?
Criterion-related evidence of validity
refers to the relationship between the
scores obtained using the instrument
and scores obtained using one or more
other tests.
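The strength of the relationship between the two sets of scores is commonly summarized with a correlation coefficient. The sketch below computes a Pearson correlation by hand. The slides do not name a specific statistic, so the choice of Pearson's r is an assumption, and the function name `pearson_r` and both score lists are hypothetical illustration data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical data: scores on the instrument and on a criterion measure
test_scores = [12, 15, 9, 18, 14, 11]
criterion_scores = [70, 82, 65, 90, 78, 72]

validity_coefficient = pearson_r(test_scores, criterion_scores)
print(round(validity_coefficient, 2))
```

A coefficient near 1 would suggest that the instrument ranks students much as the criterion does; a coefficient near 0 would suggest little relationship.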
7.                   Grade Point Average
Test Score    Very Good    Good    Needs Improvement
High          20           10      5
Average       10           25      5
Low           1            10      14
Expectancy Table
~ Gronlund
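An expectancy table is often easier to read when each row is converted to percentages. The sketch below does that for the counts in the table above. The helper name `row_percentages` is hypothetical, and rounding to one decimal place is an assumption made for display.

```python
# Counts copied from the expectancy table above
# (rows: test score level, columns: grade point average category)
expectancy_counts = {
    "High": {"Very Good": 20, "Good": 10, "Needs Improvement": 5},
    "Average": {"Very Good": 10, "Good": 25, "Needs Improvement": 5},
    "Low": {"Very Good": 1, "Good": 10, "Needs Improvement": 14},
}


def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    result = {}
    for level, counts in table.items():
        total = sum(counts.values())
        result[level] = {
            category: round(100 * n / total, 1) for category, n in counts.items()
        }
    return result


expectancy_percent = row_percentages(expectancy_counts)
for level, row in expectancy_percent.items():
    print(level, row)
```

Read this way, 14 of the 25 students with low test scores (56%) fall in the Needs Improvement category, while 20 of the 35 students with high test scores (about 57%) fall in the Very Good category.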
8. How well does a measure of the
construct explain differences in the
behavior of the individuals or their
performance on a certain task?
Construct-related evidence of validity
refers to the nature of the psychological
construct or characteristics being
measured by the test.
9. It refers to the consistency of the
scores obtained – how consistent they
are for each individual from one
administration of an instrument to
another and from one set of items to
another.
Reliability
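Consistency "from one set of items to another" can be estimated with an internal-consistency coefficient. The sketch below uses Cronbach's alpha. The slides do not name a particular coefficient, so this choice is an assumption, and the function name `cronbach_alpha` and the 0/1 response matrix are hypothetical illustration data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a score matrix.

    item_scores: one row per student, one column per item.
    """
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least 2 items and 2 students")
    # Transpose so that each tuple holds all students' scores on one item
    columns = list(zip(*item_scores))
    item_variance_sum = sum(variance(col) for col in columns)
    # Variance of each student's total score
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: 5 students, 4 items scored 1 (correct) or 0 (incorrect)
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Consistency from one administration to another (test-retest) would instead be estimated by correlating the two sets of scores from the same students.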
10. Reliability    Interpretation
.90 and above      Excellent reliability; at the level of the
                   best standardized tests
.80 - .90          Very good for a classroom test
.70 - .80          Good for a classroom test; in the range of
                   most. There are probably a few items which
                   could be improved.
Table of Interpretation
11. .60 - .70      Somewhat low. This test needs to be
                   supplemented by other measures (e.g., more
                   tests) to determine grades. There are
                   probably some items which could be improved.
.50 - .60          Suggests need for revision of the test,
                   unless it is quite short (ten or fewer
                   items). The test definitely needs to be
                   supplemented by other measures (e.g., more
                   tests) for grading.
.50 or below       Questionable reliability. This test should
                   not contribute heavily to the course grade,
                   and it needs revision.
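The table of interpretation can be expressed as a simple lookup. In the sketch below, the function name `interpret_reliability` is hypothetical and the labels are shortened from the table. Because the bands in the table share their boundary values, the code assumes that a coefficient exactly on a boundary belongs to the higher band.

```python
# (lower bound, interpretation) pairs, highest band first.
# Assumption: a value exactly on a shared boundary goes to the higher band.
RELIABILITY_BANDS = [
    (0.90, "Excellent reliability; at the level of the best standardized tests"),
    (0.80, "Very good for a classroom test"),
    (0.70, "Good for a classroom test"),
    (0.60, "Somewhat low; supplement with other measures"),
    (0.50, "Suggests need for revision of the test, unless it is quite short"),
]
LOWEST_BAND = "Questionable reliability; the test needs revision"


def interpret_reliability(coefficient):
    """Return the interpretation band for a reliability coefficient."""
    for lower_bound, label in RELIABILITY_BANDS:
        if coefficient >= lower_bound:
            return label
    return LOWEST_BAND


for value in (0.93, 0.85, 0.72, 0.65, 0.55, 0.40):
    print(value, "->", interpret_reliability(value))
```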