“It cannot be denied that a great deal of language testing is of very poor quality... they (the tests) often fail to measure accurately whatever it is they are intended to measure” (Hughes, 2000).
“Many beginning teachers believe that their most important or only responsibility is to select effective instructional procedures” (Sax, 2002).
Characteristics of a Good Test
1. Validity: The extent to which a test measures what it is supposed to measure.
The truthfulness or accuracy of a test score or of the interpretation of an experiment.
How should a reading or listening test be designed?
What does the TOEFL evaluate?
2. Reliability (consistency): The extent to which individual differences are measured consistently.
Do the classroom, its conditions, or the environment affect students' results? E.g., a marching band… health or personal problems such as a fever. Taking the listening part in a noisy place? With poor equipment?
Being drunk or sober?
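One common way to put a number on consistency is to give the same test twice and correlate the two sets of scores (test-retest reliability). The sketch below is only an illustration: the function name pearson_r and all the scores are hypothetical, invented to show how a disturbed second sitting could lower the correlation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five students (made up for illustration).
first_sitting = [55, 62, 70, 81, 90]
second_sitting_quiet = [57, 60, 72, 80, 91]  # retest under similar conditions
second_sitting_noisy = [70, 48, 66, 59, 85]  # retest with a marching band outside

r_quiet = pearson_r(first_sitting, second_sitting_quiet)
r_noisy = pearson_r(first_sitting, second_sitting_noisy)
print(f"quiet retest: r = {r_quiet:.2f}")
print(f"noisy retest: r = {r_noisy:.2f}")
```

The closer r is to 1, the more consistently the two sittings rank the students. In this made-up data the noisy retest gives a clearly lower value than the quiet one.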
3. Practicality: Financial limitations, time constraints, ease of administration, and ease of scoring and interpretation.
Types of tests
A) Placement test
B) Diagnostic test
C) Progress or achievement test*
D) Proficiency test
__ designed to measure how much a student has learned at a given moment in a course.
__ designed to get a general picture of a student’s knowledge and ability.
__ designed to measure a student’s strengths and weaknesses in order to better things
__ designed to fit new students in a course according to their level of language competence.
But why do we evaluate?
To measure students' learning.
To measure teaching effectiveness and to improve it!
To provide feedback to students, teachers, and parents!
To diagnose students' weaknesses and strengths.
To identify which students are in need of remedial or advanced work.
“To teach without evaluation is a contradiction in terms” (Sax, 1997).
Key Terms in evaluation
A test: a task or series of tasks, usually a paper-and-pencil procedure, used to obtain systematic observations about students' performance.
Measurement: the process that assigns numbers to attributes or characteristics of persons, objects, or events according to explicit formulations or rules. More quantitative.
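As a small illustration of "assigning numbers according to explicit rules", the sketch below scores a multiple-choice answer sheet against a key. The function score_test, the answer key, and the responses are all hypothetical. The rule assumed here is one point per correct item and no penalty for wrong answers.

```python
def score_test(responses, answer_key, points_per_item=1):
    """Apply an explicit scoring rule: award points for each response that matches the key."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer key must have the same number of items")
    return sum(
        points_per_item
        for given, correct in zip(responses, answer_key)
        if given == correct
    )


# Hypothetical four-item test (made up for illustration).
answer_key = ["b", "a", "d", "c"]
student_responses = ["b", "c", "d", "c"]

score = score_test(student_responses, answer_key)
print(f"Score: {score} out of {len(answer_key)}")
```

Because the rule is explicit, any two scorers applying it to the same answer sheet should arrive at the same number.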
Assessment vs. Evaluation
Assessment: the process of collecting, synthesizing, and interpreting information to aid in decision making. More qualitative.
Evaluation: the process of determining the value or worth of a program, course, or other initiative, toward the ultimate goal of making decisions about adopting, rejecting, or revising the innovation.
Evaluation is the more inclusive term, often making use of assessment data in addition to many other data sources.
Discrete point testing vs. integrative testing
Discrete point testing refers to measuring one element at a time, item by item: a series of items, each testing a particular grammatical structure.
Integrative testing requires the candidate to combine many language elements in the completion of a task.
Discrete point testing will be indirect, and integrative testing will be direct.
Direct and indirect test items
Direct items: an item is direct if it asks candidates to perform the communicative skill being tested (e.g., through a composition or an oral interview).
Indirect items: measure a student's knowledge and ability by getting at what lies beneath their receptive and productive skills (controlled items, for example the multiple-choice type).