4. 02
OBJECTIVES
a. explain key concepts, principles,
and practices of assessment;
b. express the importance of your
role as a teacher in developing fairer
tests;
c. evaluate existing tests and
recommend improvements to
enhance the quality of the test.
8. 05
A TEST
A test is used to examine someone’s knowledge of
something to determine what that person knows or
has learned. It measures the level of skill or
knowledge that has been reached.
A test is a “product” that measures a particular
behavior or set of objectives.
Tests are given after instruction has taken
place; they are a way to conclude the instruction and
obtain results. Unlike assessment results, test results
do not have to be interpreted.
9. 05
AN ASSESSMENT
Assessment is the systematic process of
documenting and using empirical data on
knowledge, skills, attitudes, and beliefs.
Assessment is seen as a procedure instead
of a product.
Assessment is used both during instruction
and after it has taken place.
10. 09
(Taken from Brown, H.D. 2004. Language Assessment: Principles
and Classroom Practices. New York: Pearson Education p. 5)
• By doing the assessment,
teachers can hopefully gain
information about every
aspect of their students,
especially their
achievements.
• One element that plays a crucial
role in assessment is the test.
12. 07
PRACTICALITY
Practicality can be simply defined as the
relationship between available resources for
the test, i.e. human resources, material
resources, time, etc., and resources that will
be required in the design, development, and
use of the test (Bachman & Palmer,
1996:35-36).
Brown (2004:19) defines practicality in
terms of 1) Cost, 2) Time, 3)
Administration, and 4) Scoring /
Evaluation.
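The relationship between available and required resources can be sketched as a simple check. This is only an illustration: the resource names, the numbers, and the rule that every required resource must be covered are assumptions made for the example, not something stated on the slide.

```python
def is_practical(available, required):
    """Return True when every required resource is covered by what is available.

    Both arguments are dicts mapping a resource name to an amount.
    A resource that is required but not listed as available counts as 0.
    """
    return all(available.get(name, 0) >= amount
               for name, amount in required.items())


# Hypothetical figures for one classroom test.
available = {"budget": 100, "minutes": 90, "raters": 2}
required = {"budget": 60, "minutes": 120, "raters": 2}

for name, amount in required.items():
    status = "ok" if available.get(name, 0) >= amount else "short"
    print(name, status)

print("practical:", is_practical(available, required))
```

In this made-up case the test needs more time than the class period allows, so it would need to be shortened before it could be considered practical.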
13. 14
COST
The test should not be too expensive to
conduct.
The cost of the test has to stay within the
budget.
Avoid conducting a test that requires an
excessive budget.
14. 14
TIME
The test should stay within
appropriate time constraints.
The test should not be too long or
too short.
17. 07
VALIDITY
The validity of a test is the extent to
which it measures exactly what it is
supposed to measure (Hughes,
2003:26).
A test must aim to provide a true
measure of the particular skill which it is
intended to measure; to the extent
that it measures external knowledge
and other skills at the same time,
it is not a valid test
(Heaton, 1990:159).
19. 14
CONTENT VALIDITY
• The relationship between the contents
of the test and the skills, structures,
etc. that it is meant to
measure has to be crystal clear.
• The test items should truly
represent the course objectives.
20. 14
CONSEQUENTIAL
VALIDITY
Consequential validity refers to the
social consequences of using a
particular test for a particular purpose.
The use of a test is said to have
consequential validity to the extent that
society benefits from the use of the
test.
21. 14
FACE VALIDITY
A test is said to have face validity if it looks to
other testers, teachers, moderators, and
students as if it measures what it is supposed to
measure (Heaton, 1990:159).
The test can be judged to have face validity by
simply looking at the items of the test.
Note that face validity can affect how students
perform on the test (Brown, 2004:27; Heaton,
1988:160).
22. 14
FACE VALIDITY
Students will be more confident if they face a well-
constructed, expected format with familiar tasks.
Students will be less anxious if the test is clearly
doable within the allotted time limit.
Students will be optimistic if the items are clear and
uncomplicated (simple).
Students will find it easy to do the test if the directions
are very clear.
Students will be at ease if the difficulty level presents a
reasonable challenge.
23. 07
RELIABILITY
Reliability refers to the consistency of the
scores obtained (Gronlund, 1977:138).
It means that if the test is administered to the
same students on different occasions then it
produces (almost) the same results.
Reliability does not really deal with
the test itself; it deals with the results of the
test. The test results should be consistent.
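One common way to put a number on this consistency is to correlate the scores from two administrations of the same test. The sketch below uses the Pearson correlation as an assumed choice of statistic (the slide does not name one), and the scores are invented for illustration.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2 students)")
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    var_1 = sum((a - mean_1) ** 2 for a in first)
    var_2 = sum((b - mean_2) ** 2 for b in second)
    if var_1 == 0 or var_2 == 0:
        raise ValueError("scores must vary within each administration")
    return cov / sqrt(var_1 * var_2)


# Hypothetical scores of the same five students on two occasions.
occasion_1 = [70, 82, 65, 90, 75]
occasion_2 = [72, 80, 66, 91, 74]

r = pearson_r(occasion_1, occasion_2)
print(round(r, 2))
```

A value close to 1 suggests that students were ranked in nearly the same way on both occasions; a value near 0 suggests the results are not consistent.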
24. Take a look at the two scores below!
Which one is more reliable?
25.
Reliability falls into four
kinds (Brown, 2004:21-22).
They are:
1) Student-Related
Reliability
2) Rater Reliability
3) Test Administration
Reliability
4) Test Reliability
26. 14
STUDENT-RELATED
RELIABILITY
This kind of reliability concerns
temporary illness, fatigue, a bad day,
anxiety, and other physical or
psychological factors affecting the students.
Thus, the score obtained by the student
may not be his/her actual score.
27. 14
RATER RELIABILITY
Rater reliability deals with the scoring process.
Factors that can affect the reliability might be
human error, subjectivity, and bias in the scoring
process.
Two categories:
1. Inter-rater reliability
2. Intra-rater reliability
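Inter-rater reliability can be illustrated by comparing the grades two raters give to the same set of papers. The sketch below computes simple percent agreement and Cohen's kappa, which corrects agreement for chance. Both statistics are an assumed choice for illustration, since the slide names none, and the grades are invented. The same functions could be applied to intra-rater reliability by comparing one rater's grades on two occasions.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of papers on which both raters gave exactly the same grade."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty grade lists of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same grade at random,
    # given how often each rater used each grade.
    p_chance = sum(counts_a[g] * counts_b[g] for g in counts_a) / (n * n)
    if p_chance == 1:
        raise ValueError("kappa is undefined when both raters use a single grade")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical grades given by two raters to the same six essays.
rater_1 = ["A", "B", "B", "C", "A", "B"]
rater_2 = ["A", "B", "C", "C", "A", "A"]

print(round(percent_agreement(rater_1, rater_2), 2))
print(round(cohens_kappa(rater_1, rater_2), 2))
```

Kappa is lower than raw agreement because some matches would be expected even if the raters graded at random.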
29. 14
TEST RELIABILITY
Test reliability refers to the test itself,
including whether the test fits into the time
constraints.
It means that the test should not be too
long or too short.
The items of the test should be crystal clear
so that they do not cause ambiguity.
30. 07
AUTHENTICITY
Authenticity deals with the “real
world”.
Teachers should construct a test
whose items are likely to
be used or applied in the real
contexts of daily life.
31.
Brown (2004:28) also proposes considerations that
might be helpful to present authenticity in a test.
They are:
1. The language in the test is as natural as possible.
2. Items are contextualized rather than isolated.
3. Topics are meaningful (relevant, interesting) to the
learners.
4. Some thematic organization to items is provided, such
as through a story or episode.
5. Tasks represent, or closely approximate, real-world
tasks.
32. 07
WASHBACK/
BACKWASH
The word backwash can be found in certain dictionaries and it is defined as
“an effect that is not the direct result of something” by the Cambridge
Advanced Learner’s Dictionary.
Washback (Brown, 2004) or Backwash (Heaton, 1990) refers to the
influence of testing on teaching and learning.
The influence itself can be positive or negative (Cheng et al. (Eds.), 2008:7-11).
33.
POSITIVE WASHBACK
Positive washback has a beneficial influence on
teaching and learning. It means teachers and
students have a positive attitude toward the
examination or test and work willingly and
collaboratively toward its objective (Cheng & Curtis,
2008:10).
A good test should have a good effect.
34.
NEGATIVE WASHBACK
Negative washback does not have any
beneficial influence on teaching and
learning (Cheng and Curtis, 2008:9).
Tests with negative washback harm
teaching and learning rather than
supporting them.
35. Match the example with the principle.
A. A student should receive the same score on the assessment
if it is given on the next day or if it is scored by two different
raters.
B. This refers to the effects that tests have on teaching and
learning. What is tested should be what is agreed upon as the
most important language knowledge and use.
C. Needed resources (including time and place) are available
for the assessment task and for scoring.
D. The assessment task is at the appropriate level for the
learners and they are not being asked to use language or
content that they haven’t encountered.
E. The language being assessed is used in the ways in which
learners were instructed and in ways that are appropriate to the learners.
Answers:
Washback: B
Practicality: C
Validity: D
Reliability: A
Authenticity: E
36. 15
“A good exam gives all students an equal
opportunity to fully demonstrate their
learning.” –University of Waterloo