2. Definitions
A test is an instrument or systematic procedure for observing
and describing one or more characteristics of a student, using
either a numerical scale or a classification scheme.
Measurement: a procedure for assigning numbers to a specified
attribute or characteristic of a person.
Evaluation: the process of making a value judgment about the
worth of someone or something (Nitko, 2001).
aliheydari.tums@gmail.com
3. Literature Review
• 13% of students who fail in class do so because of faulty test
questions (World Watch, The Philadelphia Trumpet, 2005)
• An estimated 90% of testing items are of poor quality
(Wilen, 1992)
4. Learning objectives
A learning objective (target) specifies what
you would like students to achieve at the
completion of an instructional segment.
5. Stages in Test Construction
I. Planning the Test
A. Determining the Objectives
B. Preparing the Table of Specifications
C. Selecting the Appropriate Item Format
D. Writing the Test Items
E. Editing the Test Items
6. Stages in Test Construction
II. Trying Out the Test
A. Administering the test
B. Item analysis
C. Preparing the Final Form of the Test
7. Stages in Test Construction
III. Establishing Test Validity
IV. Establishing Test Reliability
V. Interpreting the Test Scores
8. A Table of Specifications is:
The teacher's blueprint for constructing a test for classroom use.
A TOS ensures a balance between items that test lower-level
thinking skills and items that test higher-order thinking skills.
9. Steps in Preparing TOS
List the topics covered for inclusion in the test.
Determine the objectives (Bloom's Taxonomy) to be
assessed by the test.
Determine the percentage allocation of test items
for each topic.
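The percentage-allocation step can be sketched numerically. A minimal sketch, assuming (hypothetically) that items are allocated in proportion to the instructional hours spent on each topic; the topic names, hours, and test length below are invented for illustration:

```python
# Allocate test items to topics in proportion to instructional time.
# Topics, hours, and total_items are hypothetical examples.
hours = {"Validity": 4, "Reliability": 4, "Test construction": 2}
total_items = 50

total_hours = sum(hours.values())
allocation = {topic: round(total_items * h / total_hours)
              for topic, h in hours.items()}

print(allocation)  # each topic's share of the 50 items
```

With these numbers, a topic taught for 4 of the 10 hours receives 20 of the 50 items.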
10. Characteristics of a Good Test
Validity
Reliability
Practicality
Administrability
Comprehensiveness
Objectivity
Simplicity
Scorability
11. Validity
A test is valid if it measures what we want it to measure and
nothing else.
Validity is a test-dependent concept, whereas reliability is a
purely statistical parameter.
12. Types Of Validity
Content Validity
Criterion-Related Validity
Construct Validity
Face Validity
13. Content Validity
Does the test measure the objectives of the course?
The extent to which a test measures a representative
sample of the content to be tested at the intended
level of learning.
14. Criterion-related Validity
Criterion-related Validity investigates the
correspondence between the scores obtained from
the newly-developed test and the scores obtained
from some independent outside criteria.
15. Criterion-related Validity
Depending on the time of administration
Concurrent Validity:
Correlation between the test scores (new test) with a
recognized measure taken at the same time.
Predictive validity:
Comparison (correlation) of students' scores with a
criterion taken at a later time.
16. Construct validity
Refers to measuring certain traits or theoretical
constructs.
It is based on the degree to which the items in a
test reflect the essential aspects of the theory on
which the test is based.
17. Face Validity
Does it appear to measure what it is supposed to
measure?
18. Factors Affecting Validity
a. Directions (clear and simple)
b. Difficulty level of the test (neither too easy nor too difficult)
c. Structure of the items (poorly constructed and/or
ambiguous items will contribute to invalidity)
d. Arrangement of items and correct responses (starting
with the easiest items and ending with the difficult ones +
arranging item responses randomly not based on an
identifiable pattern)
19. Reliability
A test is reliable if it yields the same results repeatedly.
With an unreliable test, on the other hand, a student's score
might fluctuate from one administration to another.
20. Ways of Measuring Reliability
Internal Consistency
Test-Retest Reliability
Split-Half Method
Inter-Rater Reliability
Parallel-Forms
21. Test-Retest
Administering a given test to a particular group twice and
calculating the correlation between the two sets of scores.
Since there has to be a reasonable amount of time
between the two administrations, this kind of reliability is
referred to as reliability or consistency over time.
23. Disadvantages of Test-Retest
It requires two administrations.
Preparing similar conditions under which the
administrations take place adds to the complications of this
method.
The interval between the two administrations should be
neither too short nor too long; to keep the balance, a period
of two weeks is recommended.
24. Parallel-Forms
Two similar, or parallel, forms of the same test are administered to a
group of examinees just once.
The two forms of the test should be equivalent, and their
subtests should also be equivalent.
The problem here is constructing two parallel forms of a test, which is
a difficult job.
25. Split-Half Test
In this method, when a single test with homogeneous
items is administered to a group of examinees, the
test is split, or divided, into two equal halves. The
correlation between the two halves is an estimate of
the test score reliability.
In this method, easy and difficult items should be
distributed equally between the two halves.
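The split-half procedure above can be sketched in code. A minimal sketch, assuming an odd/even split of the items and the usual Spearman-Brown correction to estimate reliability at the full test length; the response matrix is invented for illustration:

```python
import statistics

def pearson(x, y):
    """Pearson correlation between two score lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """item_scores: one row per examinee, one 0/1 column per item.
    Splits the items into odd/even halves, correlates the half
    scores, then applies the Spearman-Brown correction so the
    estimate refers to the full-length test."""
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    r_half = pearson(odd, even)
    return 2 * r_half / (1 + r_half)

# Hypothetical responses of four examinees to a four-item test.
scores = [[1, 1, 1, 1],
          [1, 1, 0, 0],
          [0, 0, 0, 0],
          [1, 0, 1, 0]]
print(split_half_reliability(scores))  # → 0.625
```

The correction step matters because the raw half-test correlation understates the reliability of the full-length test.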
26. Split-Half Advantages and Disadvantages
Advantages:
There is no need to administer the same test twice, nor is
it necessary to develop two parallel forms of the same
test.
Disadvantages:
Developing a test with homogeneous items is difficult.
27. Which method should we use?
It depends on the function of the test.
The test-retest method is appropriate when the consistency of
scores over a particular time interval (the stability of test scores
over time) is important.
The parallel-forms method is desirable when the consistency
of scores over different forms is of importance.
When the go-togetherness of the items of a test (internal
consistency) is of significance, the split-half and KR-21 methods
will be the most appropriate.
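The KR-21 estimate mentioned above needs only the number of items and the mean and variance of examinees' total scores. A minimal sketch of the standard KR-21 formula, r = k/(k-1) * (1 - M(k-M)/(k*s^2)); the score data below are invented:

```python
import statistics

def kr21(k, total_scores):
    """KR-21 internal-consistency estimate for a k-item test,
    computed from examinees' total scores (standard formula:
    k/(k-1) * (1 - M(k-M)/(k*variance))."""
    m = statistics.mean(total_scores)
    var = statistics.pvariance(total_scores)  # population variance
    return (k / (k - 1)) * (1 - m * (k - m) / (k * var))

# Hypothetical total scores of ten examinees on a 10-item test.
scores = [2, 4, 6, 8, 10, 2, 4, 6, 8, 10]
print(round(kr21(10, scores), 3))  # → 0.778
```

Because it uses only summary statistics, KR-21 is quicker to compute than item-level methods, at the cost of assuming items are of roughly equal difficulty.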
28. Factors Influencing Reliability
To obtain a reliability estimate, one or two sets of scores
should be obtained from the same group of testees. Thus,
two factors contribute to test reliability: the testee and the
test itself.
29. Validity and Reliability
[Target diagrams: neither valid nor reliable; reliable but not valid; valid and reliable]
A test must be reliable to be valid, but reliability does not guarantee validity.
30. Practicality
Practicality refers to the ease of administration
and scoring of a test.
31. Administrability
Administrability: the test should be administered uniformly to all
students so that the scores obtained do not vary due to
factors other than differences in the students' knowledge and
skills. There should be clear directions and procedures for both
the students and whoever scores the test.
32. Scorability
Scorability: the test should be easy to score, with clear
directions for scoring, an answer sheet, and an answer
key provided.
33. Comprehensiveness
A test is said to have comprehensiveness if it
encompasses all aspects of a particular subject of
study.
34. Simplicity
A test is said to be simple if it is easy to understand
along with the instructions and other details.
35. Objectivity
Objectivity represents the agreement of two or more
raters or a test administrator concerning the score of a
student.
Not influenced by emotion or personal prejudice.
Lack of objectivity reduces test validity in the same way
that lack of reliability does.
36. The Other Factors
Test length
Speed
Item difficulty
37. References
Miller, M. D., Linn, R. L., & Gronlund, N. E. (2008). Measurement and
Assessment in Teaching (10th ed.). Pearson.
Nitko, A. J. (2001). Educational Assessment of Students (3rd ed.).
Merrill Prentice Hall.
http://www.ehow.com/how_4913690_steps-preparing-test.html