• TEST
-a formal and systematic instrument,
usually a paper-and-pencil procedure,
designed to assess the quality, ability, skill
or knowledge of the students by giving a
set of questions in a uniform manner
-one of the many types of assessment
procedures used to gather information
about the performance of students
• VALIDITY
“Validity is the extent to which a test
measures what it claims to measure. It is
vital for a test to be valid in order for the
results to be accurately applied and
interpreted.”
Other definitions given by experts:
Gronlund and Linn (1995)- “Validity refers
to the appropriateness of the interpretation
made from test scores and other
evaluation results with regard to a
particular use.”
Anne Anastasi (1969) writes “the validity
of a test concerns what the test measures
and how well it does so.”
Ebel and Frisbie (1991)- “The term
validity, when applied to a set of test
scores, refers to the consistency
(accuracy) with which the scores measure
a particular cognitive ability of interest.”
C.V. Good (1973), in the Dictionary of
Education, defines validity as the “extent to
which a test or other measuring instrument
fulfills the purpose for which it is used.”
• TYPES OF VALIDITY
1. Face Validity:
-is the extent to which the
measurement method appears “on its
face” to measure the construct of interest.
-is done by examining the physical
appearance of the instrument to make it
readable and understandable.
E.g. People might have negative reactions
to an intelligence test that did not
appear to them to be measuring their
intelligence.
2. Content Validity:
-it is the extent to which the
measurement method covers the entire
range of relevant behaviors, thoughts,
and feelings that define the construct
-is done through a careful and critical
examination of the objectives of
assessment to reflect the curricular
objectives.
3. Criterion-based Validity:
-it is the extent to which people’s
scores are correlated with other variables
or criteria that reflect the same construct.
-is established statistically such
that a set of scores revealed by the
measuring instrument is correlated with the
scores obtained in another external
predictor or measure
E.g. An IQ test should correlate positively with
school performance.
An occupational aptitude test should
correlate positively with work performance.
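Since criterion-based validity is established by correlating test scores with an external measure, the computation can be sketched in Python. This is a minimal illustration, not a prescribed procedure: it assumes the Pearson product-moment correlation is the statistic used, and the function name `pearson_r` and both score lists are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a set of scores has no spread")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: aptitude test scores and work-performance ratings
# for the same five people.
aptitude_scores = [52, 61, 70, 75, 88]
performance_ratings = [2.9, 3.1, 3.6, 3.5, 4.4]

validity_coefficient = pearson_r(aptitude_scores, performance_ratings)
print(round(validity_coefficient, 2))
```

A coefficient close to +1 would support the claim that the test and the criterion reflect the same construct. Whether the result counts as predictive or concurrent evidence depends only on when the criterion scores were collected.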
• TYPES OF CRITERION VALIDITY:
3.1. Predictive Validity:
-describes the future performance of
an individual by correlating the sets of
scores obtained from two measures given
at a longer time interval
-when the criterion is something that
will happen or be assessed in the future,
this is called predictive validity.
3.2. Concurrent Validity:
-describes the present status of the
individual by correlating the sets of
scores obtained from two measures
given at a close interval
-when the criterion is something that is
happening or being assessed at the
same time as the construct of interest, it
is called concurrent validity.
4. Construct Validity
-is established statistically by comparing
psychological traits or factors that
theoretically influence scores in a test
• TYPES OF CONSTRUCT VALIDITY:
4.1 Convergent Validity
-is established if the instrument defines
another similar trait other than what it is
intended to measure.
E.g. Critical Thinking Test may be
correlated with Creative Thinking Test.
4.2 Divergent Validity
- is established if an instrument can
describe only the intended trait and not
the other traits.
E.g. Critical Thinking Test may not be
correlated with Reading Comprehension
Test.
Nature of Validity
1. Validity refers to the appropriateness of the
test results, not to the instrument itself.
2. Validity does not exist on an all-or-none
basis; it is a matter of degree.
3. Tests are not valid for all purposes. Validity
is always specific to a particular interpretation.
4. Validity is not of different types. It is a unitary
concept. It is based on various types of
evidence.
Factors Affecting Validity:
1. Factors in the test:
(i) Unclear directions to the students on
how to respond to the test.
(ii) Difficulty of the reading vocabulary and
sentence structure.
(iii) Test items that are too easy or too difficult.
(iv) Ambiguous statements in the test
items.
(v) Inappropriate test items for measuring
a particular outcome.
(vi) Inadequate time provided to take the
test.
2. Factors in Test Administration and
Scoring:
(i) Unfair aid to individual students who
ask for help.
(ii) Cheating by the pupils during testing.
(iii) Unreliable scoring of essay-type
answers.
(iv) Insufficient time to complete the test.
(v) Adverse physical and psychological
conditions at the time of testing.
3. Factors related to Testee:
(i) Test anxiety of the students.
(ii) Physical and psychological state of
the student.
(iii) Response set: a consistent
tendency to follow a certain pattern in
responding to the items.
• RELIABILITY
-refers to the consistency of
measurement; that is, how consistent
test results or other assessment results
are from one measurement to another
Other definitions given by experts:
Gronlund and Linn (1995)- “Reliability
refers to the consistency of measurement,
that is, how consistent test scores or other
evaluation results are from one
measurement to another”.
Ebel and Frisbie (1991)- “The term
reliability means the consistency with
which a set of test scores measure
whatever they do measure”.
C.V. Good (1973) has defined reliability
as the “worthiness with which a
measuring device measures something;
the degree to which a test or other
instrument of evaluation measures
consistently whatever it does in fact
measure”.
Davis (1946)- “The degree of relative
precision of measurement of a set of test
scores is defined as reliability”.
Nature of Reliability
1. Reliability refers to the consistency of the results
obtained with an instrument, not to the
instrument itself.
2. Reliability refers to a particular interpretation
of test scores.
3. Reliability is a statistical concept. To determine
reliability, we administer a test to a group once
or more than once.
4. Reliability is a necessary but not a sufficient
condition for validity.
Four methods of determining
reliability:
(a) Test-Retest method.
(b) Equivalent forms/Parallel forms
method.
(c) Split-half method.
(d) Rational Equivalence/Kuder-Richardson
method.
Test-Retest Method:
This is the simplest method of determining
the test reliability.
To determine reliability in this method, the
test is given and then repeated on the same group.
Then the correlation between the first set
of scores and the second set of scores is
calculated.
Equivalent Forms/Parallel Forms Method:
In this process two parallel forms of a test
are administered to the same group of
pupils within a short interval of time, then the
scores of both the tests are correlated.
This correlation provides the index of
equivalence.
Split-Half Method:
In this method a test is administered to a
group of pupils in the usual manner. Then the
test is divided into two equivalent halves,
and the correlation between the scores on
these half-tests is calculated.
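A minimal Python sketch of the split-half procedure follows. It makes two assumptions that the text above does not state: the test is split into odd-numbered and even-numbered items, and the half-test correlation is stepped up to full-test length with the Spearman-Brown formula, r_full = 2r / (1 + r), which is the usual companion to this method. The function names and the response matrix are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a half-test has no spread")
    return cov / math.sqrt(var_x * var_y)

def split_half_reliability(item_scores):
    """Return (half-test correlation, Spearman-Brown full-test estimate).

    item_scores: one row per pupil, one 0/1 entry per item.
    Assumption: an odd-even split gives two roughly equivalent halves.
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    if r_half <= -1:
        raise ValueError("Spearman-Brown correction is undefined for r = -1")
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full

# Hypothetical responses of five pupils to a six-item test (1 = correct).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

r_half, r_full = split_half_reliability(responses)
print(round(r_half, 2), round(r_full, 2))
```

The correction is needed because the raw correlation describes a test only half as long as the one actually given, and shorter tests tend to be less reliable.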
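Rational Equivalence/Kuder-Richardson Method:
This method also provides a measure of
internal consistency. It requires neither the
administration of two equivalent forms of a
test nor the splitting of the test into two
equal halves.
The reliability coefficient is determined by using
the Kuder-Richardson formula 20 (KR-20), which
in its standard form reads:
\[ r = \frac{k}{k-1}\left(1 - \frac{\sum pq}{\sigma^2}\right) \]
where k is the number of items, p is the
proportion of pupils answering an item correctly,
q = 1 - p, and \sigma^2 is the variance of the
total scores.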
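As a hedged illustration, the standard textbook form of KR-20, r = (k / (k - 1)) * (1 - Σpq / σ²), can be computed in Python as below. Here k is the number of items, p is the proportion of pupils answering an item correctly, q = 1 - p, and σ² is the variance of the total scores. The sketch assumes items are scored 0 or 1 and uses the population variance (dividing by N). Some texts divide by N - 1 instead, which changes the result slightly. The function name `kr20` and the response matrix are hypothetical.

```python
def kr20(item_scores):
    """Kuder-Richardson formula 20 for dichotomously scored (0/1) items.

    item_scores: one row per pupil, one 0/1 entry per item.
    """
    n_pupils = len(item_scores)
    k = len(item_scores[0])
    if n_pupils < 2 or k < 2:
        raise ValueError("need at least two pupils and two items")

    # Variance of the total scores (population form: divide by N)
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_pupils
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_pupils
    if var_total == 0:
        raise ValueError("KR-20 is undefined when all total scores are equal")

    # Sum of p*q over items, where p is the proportion answering correctly
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_scores) / n_pupils
        sum_pq += p * (1 - p)

    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Hypothetical responses of five pupils to a six-item test (1 = correct).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

print(round(kr20(responses), 3))  # 0.792
```

For this data the total-score variance is 4 and Σpq is 1.36, so the coefficient is 1.2 × (1 - 0.34) = 0.792. Only one administration of one form is needed, which is the practical advantage of this method.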
Factors affecting reliability:
1. Factors related to test:
(i) length of the test
(ii) content of the test
(iii) characteristics of items
(iv) spread of scores
2. Factors related to testee:
(i) Heterogeneity of the group
(ii) Test-wiseness of the students
(iii) Motivation of the students
3. Factors related to testing procedures:
(i) Time limit of test
(ii) Cheating opportunity given to the
students
• OBJECTIVITY
-refers to the agreement of two or more
raters or test administrators concerning
the score of the student.
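Agreement between two raters can be quantified in more than one way. The Python sketch below shows two common options: the simple proportion of identical scores, and Cohen's kappa, which corrects that proportion for agreement expected by chance. Cohen's kappa is not mentioned in the text above and is brought in here only as an illustration. The function names and the ratings are hypothetical.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of papers on which two raters gave exactly the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of papers")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement: (observed - expected) / (1 - expected)."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Agreement expected if each rater assigned scores independently
    # at their own observed rates
    expected = sum(counts_a[score] * counts_b[score] for score in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single score")
    return (observed - expected) / (1 - expected)

# Hypothetical scores (scale 1 to 5) given by two raters to the same eight essays.
rater_a = [5, 4, 4, 3, 2, 5, 3, 4]
rater_b = [5, 4, 3, 3, 2, 5, 3, 5]

print(percent_agreement(rater_a, rater_b))       # 0.75
print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.67
```

The raters agree on 6 of 8 essays, and kappa is lower than that raw figure because some of the agreement would be expected by chance alone.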
Other definitions given by experts:
C.V. Good (1973) defines objectivity in
testing as “the extent to which the
instrument is free from personal error
(personal bias), that is subjectivity on the
part of the scorer”.
Gronlund and Linn (1995) state
“Objectivity of a test refers to the degree to
which equally competent scorers obtain the
same results. So a test is considered
objective when it makes for the elimination
of the scorer’s personal opinion and biased
judgement. In this context there are two
aspects of objectivity which should be kept
in mind while constructing a test.”
Two aspects of objectivity which should be kept
in mind while constructing a test:
1. Objectivity in Scoring
-means that the same person or different persons
scoring the test at any time arrive at the
same result without any chance error. To be
objective, a test must necessarily be so
worded that only one correct answer can be
given to each item.
2. Objectivity of Test Items
-means that the item must call for a
definite single answer. Well-constructed
test items should lend themselves to one
and only one interpretation by students
who know the material involved. It means
the test items should be free from
ambiguity.
• FAIRNESS
-means the test items should not have
any bias. They should not be offensive
to any examinee subgroup.
-a test can only be good if it is fair to all
examinees.
-a fair assessment provides all
students with an equal opportunity to
demonstrate achievement.
The keys to fairness are as follows:
Students have knowledge of learning targets.
Students are given equal opportunity to
learn.
Students possess the pre-requisite
knowledge and skills.
Students are free from teacher stereotypes.
Students are free from biased assessment
tasks and procedures.
• SCORABILITY
-means that the test should be easy
to score. Directions for scoring should be
clearly stated in the instructions. Provide the
students with an answer sheet, and provide an
answer key for the one who will check the test.
• ADEQUACY
-means that the test should contain a
wide sampling of items to
determine the educational outcomes or
abilities, so that the resulting scores are
representative of the total performance in
the areas measured.
• ADMINISTRABILITY
-means that the test should be
administered uniformly to all students so
that the scores obtained will not vary due to
factors other than differences in the
students’ knowledge and skills. There should
be clear instructions for the
students, the proctors, and the one who will
check the test (the test scorer).
• PRACTICALITY AND EFFICIENCY
-refers to the teacher’s familiarity with
the methods used, the time required for the
assessment, the complexity of the
administration, the ease of scoring, the ease of
interpreting the test results, and the cost of
the materials used, which should be kept as low
as possible.
-a balanced assessment sets targets in all
domains of learning (cognitive, affective and
psychomotor) or domains of intelligence
(verbal-linguistic, logical-mathematical,
bodily-kinesthetic, visual-spatial, musical-rhythmic,
introspection, physical world-natural,
-makes use of both traditional and
alternative assessment.