STANDARDIZED AND NON-STANDARDIZED TESTS
Mr. Sam Jose
Bhopal College of Nursing
Education aims at the all-round development of a
student, not merely at imparting knowledge.
Evaluation is the process of judging the value
or worth of an individual’s achievements or
It is the judging of the goals attained by the
educational system. To evaluate students'
knowledge, the teacher uses different types of
Standardization means uniformity of procedure in administering,
scoring and interpreting the results.
standardized tests and scales that have met the criteria of
This means that they have been used in a sufficiently large
number of cases to make it possible to determine their
validity, reliability, objectivity and administrability.
A standardized test is a test that is administered and
scored in a consistent, or "standard", manner.
Any test in which the same test is given in the same
manner to all test takers, and graded in the same
manner for everyone, is a standardized test. It is
developed by test specialists.
A standardized test is any form of test that (1) requires
all test takers to answer the same questions, or a
selection of questions from a common bank of
questions, in the same way, and that (2) is scored in a
“standard” or consistent manner, which makes it
possible to compare the relative performance of
individual students or groups of students.
While different types of tests and assessments may be
“standardized” in this way, the term is primarily
associated with large-scale tests administered to large
populations of students, such as a multiple-choice test
given to all the eighth-grade public-school students in
a particular state.
In addition to the familiar multiple-choice format,
standardized tests can include true-false questions,
short-answer questions, essay questions, or a mix of
question types. While standardized tests were
traditionally presented on paper and completed using
pencils, and many still are, they are increasingly being
administered on computers connected to online
While standardized tests may come in a variety of
forms, multiple-choice and true-false formats are
widely used for large-scale testing situations because
computers can score them quickly, consistently, and
E.g.: NCLEX (National Council Licensure
Examination) is a nationwide examination for
the licensing of nurses in the United States and Canada.
SAT (Scholastic Assessment Test)
A systematic procedure for determining the amount a
student has learned through instruction.
“Non-standardized tests focus upon
an examinee’s attainment at a given point of time.”
Basically, teacher-made tests are used to evaluate the
progress of the students in school. However, the
specific use of tests may vary from school to school
and from teacher to teacher.
A non-standardized test is one that allows for an
assessment of an individual's abilities or performances,
but doesn't allow for a fair comparison of one student to
another.
The test results can be used for students, for teachers,
and for other administrative purposes.
These tests are very simple to use.
Easy for the students.
Teachers can assess the strengths and weaknesses of
Teachers can understand the need for re-teaching
concepts and can decide on remedial instruction.
Teacher-made tests are devised by teachers to meet
their various needs and directives.
Tests are not so carefully and scientifically prepared.
The items of teacher-made tests are seldom analyzed.
Reliability is a characteristic of any test. It
refers to the accuracy and consistency of information obtained
in a study.
A well-developed scientific tool should give accurate results
both at present and over time.
Good test reliability means that the test taker will obtain the
same test score over repeated testing, as long as no
extraneous factors have affected the score.
A good instrument will produce consistent scores. An
instrument’s reliability is estimated using a correlation
Types of Reliability
1. Test-retest reliability is a measure of reliability
obtained by administering the same test twice over a
period of time to a group of individuals. The scores from
Time 1 and Time 2 can then be correlated in order to
evaluate the test for stability over time.
2. Parallel forms reliability is a measure of reliability
obtained by administering different versions of an
assessment tool (both versions must contain items that
probe the same construct, skill, knowledge base, etc.) to
the same group of individuals. The scores from the two
versions can then be correlated in order to evaluate the
consistency of results across alternate versions.
3. Inter-rater reliability is a measure of reliability
used to assess the degree to which different
judges or raters agree in their assessment
decisions. Inter-rater reliability is useful because
human observers will not necessarily interpret
answers the same way; raters may disagree as to
how well certain responses or material
demonstrate knowledge of the construct or skill
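The three types of reliability above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the scores and grades are made-up data, the function names `pearson_r` and `percent_agreement` are hypothetical, and simple percent agreement is used here as one possible (and fairly crude) measure of inter-rater reliability. Test-retest and parallel forms reliability both correlate two sets of scores from the same group, so the same correlation function serves for either.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


def percent_agreement(rater_a, rater_b):
    """Share of responses on which two raters gave the same grade."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty lists of grades")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)


# Hypothetical scores of the same five students at Time 1 and Time 2.
time_1 = [12, 15, 9, 18, 14]
time_2 = [13, 14, 10, 19, 13]
test_retest_r = pearson_r(time_1, time_2)  # close to 1 means stable scores

# Hypothetical grades given by two raters to the same five answers.
rater_a = ["pass", "pass", "fail", "pass", "fail"]
rater_b = ["pass", "fail", "fail", "pass", "fail"]
agreement = percent_agreement(rater_a, rater_b)  # 4 of 5 grades match

print(f"test-retest r = {test_retest_r:.2f}, agreement = {agreement:.0%}")
```

For parallel forms reliability, the two lists would hold the scores from Form A and Form B instead of Time 1 and Time 2.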
The accuracy with which a test measures whatever it is
supposed to measure.
An evaluation procedure is valid to the extent that it provides
an assessment of the degree to which pupils have achieved
specific objectives, content matter and learning experiences.
Validity is an important characteristic of any test. This refers to
what the test really measures. A test is valid, if it measures
what we really wish to measure.
It is a more complex concept that broadly concerns the
soundness of the study's evidence, that is, whether the findings
are unbiased and well grounded.
Factors affecting validity
If reading vocabulary is poor, students fail to reply to
Difficult sentences are hard to understand.
Use of inappropriate items.
Medium of expression: instruction in English is difficult
for non-English-medium students.
Test items that are too easy or too difficult do not
discriminate among pupils.
Influence of extraneous factors such as grammar, handwriting and legibility.
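The point that items which are too easy or too difficult do not discriminate among pupils can be checked with a simple item analysis. The sketch below is an illustration only: the data are made up, the function names are hypothetical, and comparing the top and bottom 27% of pupils is a common convention assumed here, not something the source prescribes.

```python
def item_difficulty(responses):
    """Proportion of pupils answering the item correctly (1 = right, 0 = wrong)."""
    if not responses:
        raise ValueError("need at least one response")
    return sum(responses) / len(responses)


def item_discrimination(item_responses, total_scores, fraction=0.27):
    """Upper-lower index: proportion correct in the top group minus the bottom group."""
    n = len(total_scores)
    if n != len(item_responses) or n < 2:
        raise ValueError("need matching lists for at least 2 pupils")
    group_size = max(1, round(n * fraction))
    # Pupil indices ordered from lowest to highest total test score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower = order[:group_size]
    upper = order[-group_size:]
    p_upper = sum(item_responses[i] for i in upper) / group_size
    p_lower = sum(item_responses[i] for i in lower) / group_size
    return p_upper - p_lower


# Hypothetical total scores of ten pupils and their answers to two items.
total_scores = [95, 88, 82, 75, 70, 64, 58, 50, 43, 35]
easy_item = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]  # everyone answers correctly
good_item = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]  # mostly the stronger pupils succeed

print(item_difficulty(easy_item), item_discrimination(easy_item, total_scores))
print(item_difficulty(good_item), item_discrimination(good_item, total_scores))
```

In this made-up data the item everyone answers correctly has a discrimination index of 0, so it tells the teacher nothing about differences among pupils, while the second item separates the top group from the bottom group.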
Types of validity
1. Content validity: all major aspects of the content area
should be covered by the test items.
2. Predictive validity: extent to which a test can predict the
future performance of the students.
3. Concurrent validity: extent to which a test diagnoses the
existing status of the individual, rather than predicting
future performance.
4. Construct validity: extent to which a test
measures a hypothesized trait.
5. Face validity: extent to which the test, on inspection,
seems logically related to what
is being tested.
A good test should be inexpensive, not only from the
viewpoint of money but also from the viewpoint of
time and effort taken in the construction of a test.
Fortunately, there is no direct relationship between
cost and quality.
Ease in administration:
A test is good only when the conditions of answering
are simple (scientific and logical). Its instructions
should be simple and clear.
Generally, the time available to students is in short
supply, and students do not accept very
long tests either. Therefore a test should be neither very long
nor very short.
A good test should be acceptable to the students to whom
it is given, without regard to any specific situation;
that is, the questions in the test should be neither
very difficult nor very easy.
The items in a test should be specific to the objectives.
The extent to which independent researchers would arrive
at similar judgements or conclusions,
i.e., judgements not biased by personal values or beliefs.
Achievement of the correct proportion among
questions allotted to each of the objectives & teaching
Precise & clear:
Items should be precise and clear so that the students
can answer well and score marks.
Usefulness of an object or product
Considered as the ease of use or the extent to which a
product can be used by a specified user
In education the teacher verifies the accuracy of
information obtained about the students' performance
after administering an educational tool.
Standard of comparison for test results, developed by giving
the test to large, well-defined groups of people.
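How a norm serves as a standard of comparison can be sketched in a few lines of Python. This is an illustration under stated assumptions: the norm group below is made-up and far smaller than a real one, the function names are hypothetical, and the percentile rank counts tied scores as half, which is one common convention among several.

```python
from statistics import mean, pstdev


def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring below `score`, counting ties as half."""
    if not norm_scores:
        raise ValueError("norm group must not be empty")
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)


def z_score(score, norm_scores):
    """Distance of `score` from the norm-group mean, in standard deviations."""
    sd = pstdev(norm_scores)
    if sd == 0:
        raise ValueError("norm scores must vary")
    return (score - mean(norm_scores)) / sd


# Hypothetical norm group; real norms come from large, well-defined groups.
norm_group = [40, 45, 50, 50, 55, 60, 65, 70, 75, 90]

print(percentile_rank(60, norm_group))  # 55.0
print(z_score(60, norm_group))          # 0.0, since 60 is the group mean
```

A raw score of 60 means little on its own; against this norm group it sits at the mean and at the 55th percentile.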