Qualities of a Good Evaluation Tool
The evaluation tool serves a variety of uses. Regardless of the type of tool used or how the results of evaluation are to be used, all types of evaluation should possess certain characteristics. The most important characteristics are objective-basedness, comprehensiveness, objectivity, discriminating power, reliability, validity and usability.
Objective-basedness
Each item (question) should have a well-defined, predetermined criterion or objective behind it. This can be in accordance with the criteria specified in the taxonomy of educational objectives. Each question should represent an objective, based upon which the response can be scored.
Comprehensiveness
A tool is said to be comprehensive if it covers all the teaching points to be learned by the learner, and it should also cover all the predetermined objectives.
Objectivity
A tool is said to be objective if it is free from personal bias in interpreting its scope as well as in scoring the responses.
Objectivity has two aspects: objectivity of questions and objectivity of scoring. Objectivity of a test is mainly defined as "the extent to which a student's score on it is based on his actual answer or performance on the test and not on the opinion of the examiner".
Objectivity depends on the nature of the questions and on consistency in scoring. Objectivity is usually obtained by:
i. Stating the question specifically and precisely
ii. Requiring specific, precise answers and scoring the test by use of a previously determined marking scheme.
Discriminating power
Every item should help to discriminate pupils on the basis of their real attainment. It is the power of an item or question to discriminate students on the basis of their performance. Items that are too easy or too difficult will reduce discriminating power.
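As a concrete illustration of discriminating power, one common way to quantify it (not given in the text above, so the method and data here are an assumed sketch) is the upper-lower group discrimination index: rank students by total score, then compare how often the extreme groups answered the item correctly.

```python
# Sketch of an item discrimination index (upper-lower group method).
# The data and the 27% group fraction are illustrative assumptions.

def discrimination_index(item_scores, total_scores, fraction=0.27):
    """D = (correct in upper group - correct in lower group) / group size."""
    ranked = sorted(range(len(total_scores)),
                    key=lambda i: total_scores[i], reverse=True)
    k = max(1, int(len(ranked) * fraction))        # size of each extreme group
    upper = sum(item_scores[i] for i in ranked[:k])   # top scorers on the item
    lower = sum(item_scores[i] for i in ranked[-k:])  # bottom scorers
    return (upper - lower) / k

# 10 students: 1/0 responses to one item, and their total test scores
item   = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
totals = [95, 90, 88, 85, 80, 60, 55, 50, 45, 40]
print(discrimination_index(item, totals))  # -> 1.0 for this invented data
```

A value near +1 means the item separates high and low achievers well; an item everyone (or no one) gets right yields a value near 0, matching the point that too easy and too difficult items reduce discriminating power.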
Reliability
Reliability refers to the consistency with which the tool measures the desired criteria.
A test score is called reliable when we have reason to believe the score is stable and trustworthy. It refers to the consistency of scores obtained by the same individual when re-examined with the same test on different occasions, with different sets of equivalent items, or under other variable examining conditions.
Methods of finding reliability
1. Test-retest reliability
To estimate reliability by the test-retest method, the same test is administered twice to the same group of people with a given time interval between the two administrations. If the scores on the first administration are highly correlated with the scores on the second administration, the test is said to be reliable.
2. Equivalent or parallel forms method
Two forms of a test that are equivalent in terms of characteristics are administered to the same group of people, and the correlation coefficient between the scores on the two forms is computed. If the correlation coefficient is high, the test is said to be reliable.
3. Split-half method
In this method the test scores are divided into the scores on the first half and the second half of the test, or into the scores on the odd-numbered and the even-numbered questions. In either case the scores on the first half (or odd-numbered items) are correlated with the scores on the second half (or even-numbered items). The correlation that results from this split holds for only half the test, so the reliability coefficient of the whole test is computed by applying the Spearman-Brown prophecy formula:

Reliability of whole test = 2r / (1 + r)

where r is the correlation between the two half-test scores.
4. Kuder-Richardson 20 (KR-20)
Here the reliability of a test is calculated by using the formula

Reliability = [n / (n - 1)] x [1 - (sum of pq / variance of total scores)]

where n = number of questions in the test
p = (number of persons answering the item correctly) / (number of persons taking the test)
q = (number of persons answering the item wrongly) / (number of persons taking the test)
and the variance is that of the total test scores.
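The KR-20 computation above can be sketched directly from a matrix of right/wrong item responses (the data are invented, and population variance is assumed here for simplicity):

```python
# KR-20 reliability from a 0/1 item-response matrix
# (rows = students, columns = items); data invented for illustration.
from statistics import pvariance

responses = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 0, 1],
]

n_items = len(responses[0])
totals = [sum(row) for row in responses]   # each student's total score
var_total = pvariance(totals)              # variance of total test scores

sum_pq = 0.0
for j in range(n_items):
    p = sum(row[j] for row in responses) / len(responses)  # proportion correct
    sum_pq += p * (1 - p)                                  # q = 1 - p

kr20 = (n_items / (n_items - 1)) * (1 - sum_pq / var_total)
print(f"KR-20 = {kr20:.2f}")  # -> 0.65 for this invented data
```

Each term mirrors the formula in the text: n/(n-1) scales for test length, and the bracket compares the summed item variances (pq) with the variance of the total scores.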
Validity
A test is valid if it measures what it is intended to measure, i.e., it is the "extent to which a measurement instrument measures what it is supposed or designed to measure".
Types of validity
1) Content validity (content-related validity)
It refers to the degree to which the test actually measures, or is specifically related to, the content for which it was designed. The validity of an achievement test is commonly based on content-related evidence; that is, it shows how adequately the test questions agree with the course content and objectives.
"The extent to which the content of the test items reflects the academic discipline under study." It is usually determined by professional judgement about the appropriateness of the content, based on a blueprint.
2) Criterion-related validity
Criterion validation is the process of determining the extent to which test performance is related to some other valued measure of performance, called the criterion. The criterion-related evidence may be obtained from future data or from present performance.
"The extent to which scores on a test correlate with scores on some external criterion measure."
It covers two different types of validity, namely concurrent validity and predictive validity, with different time frames for establishing validity.
Predictive validity refers to the usefulness of a test in predicting future performance. It refers to "the extent to which predictions made from the test scores can predict the future performance of the students". Predictive validation asks whether or not the test scores predict a specified future performance.
Concurrent validity refers to the relationship between scores on a measuring tool and a criterion available at the same time. It refers to "the extent to which scores on a test match performance scores on one or more criterion measures obtained at about the same time the test is given".
3) Construct validity
It is the degree to which scores on a test can be accounted for by the explanatory constructs of a sound theory. It refers to "the extent to which a test measures one or more dimensions of a theory or trait".
A construct is a psychological quality that is assumed to exist in order to explain some aspect of behaviour.
4) Face validity
It refers to the extent to which test takers can identify the purpose of the test.
Usability
Usability should take into account the administration of the test, the scoring procedure, the number of questions, and economy.