Once a consensus has been built as to the purposes and types of tests to employ in a program, a strategy must be worked out to maximize the quality and effectiveness of those tests. In the best of all possible worlds, each program would have a resident testing expert whose entire job would be to develop tests especially created for and suited to that program. But even in the worst of all possible worlds, rational decisions can be made when selecting commercially available tests. Between developing tests from scratch and adopting them from commercial sources lies the possibility of adapting existing tests so that they better fit the purposes and objectives of the program.
Many language tests are, or should be, situation specific. That is to say, a test can be very effective in one situation with one particular group of students and be virtually useless in another situation or with another group of students. Teachers cannot simply go out and buy (or worse yet, illegally photocopy) a test and automatically expect it to work with their students. It may have been developed for completely different types of students (different in background, level of proficiency, gender, and so forth) and for entirely different purposes (that is, based on differing approaches, syllabuses, techniques, or exercises).
Though all of this may seem like a great deal of work, remember that in most language programs, any rational approach to testing will be a vast improvement over the existing conditions. The purpose of this section of the chapter is to suggest systematic bases for getting started in adopting, developing, or adapting decent language tests for a particular and very specific language program.
Adopting Language Tests
The tests that are used for program decisions are very often bought from commercial publishing houses. Tests are also sometimes adopted from other language programs or taken straight from the current textbook. Given the wide diversity in the nationalities and levels involved in the various language programs around the world, it may turn out that any test that is adopted is being applied to a population quite different from the one envisioned when the test was originally written. As a result, program decisions that can dramatically affect the lives of the students may be irresponsibly based on tests consisting of questions that are quite unrelated to the needs of the particular group of students or to the curriculum being taught in the specific program involved.
Selecting good tests that match the specific needs of a program is therefore important. Test reviews are one good place to start. Such reviews can be found in the review sections of some language teaching journals, right alongside the reviews of texts and professional books. Unfortunately, test reviews appear infrequently.
Language Testing is a journal that focuses on language tests and also provides reviews. For those in ESL/EFL, Alderson, Krahnke, and Stansfield (1987) is a useful source of test reviews for most of the major tests available at the time it was published. The Mental Measurements Yearbook also includes some reviews of language tests.
Alternative ways to approach the task of selecting tests for a program might include:
1. Taking a language testing course.
2. Reading up on testing.
3. Hiring a person who already knows about testing.
4. Giving one member of the staff release time to become informed on the topic.
Table 4.3 Test Evaluation Checklist
General Background Information
3. Publisher and Date of Publication
4. Published reviews available
1. Test family (norm-referenced or criterion-referenced)
2. Purpose of decision (proficiency, placement, achievement or diagnosis)
3. Language methodology orientation (approaches and syllabuses)
1. Target population (age, level, nationality, language/dialect, educational background and so forth)
2. Skills tested (for instance, reading, writing, listening, speaking, structure, vocabulary, and pronunciation)
3. Number of subtests and separate scores.
4. Types of items reflect appropriate techniques and exercises (receptive: true-false, multiple-choice, matching; productive: fill-in, short response, essay, extended discourse task)
a. Cost of test booklets, cassette tapes, manual, answer sheets, scoring templates, scoring services, and any other necessary test components
b. Quality of items listed immediately above (paper, printing, audio clarity, durability and so forth)
c. Ease of administration (time required, proctor/examinee ratio, proctor qualifications, equipment necessary, availability and quality of directions for administration, and so forth)
d. Ease of scoring (method of scoring, amount of training necessary, time per test, score conversion information and so forth)
e. Ease of interpretation (quality of guidelines for the interpretation of scores in terms of norms or other criteria)
Developing Language Tests
In the best of all possible worlds, sufficient resources and expertise will be available in a program so that proficiency, placement, achievement, and diagnostic tests can be developed and fitted to the specific goals of the program and to the specific population studying in it.
If this is the case, decisions must be made about which types of tests should be developed first. That might mean first developing achievement and diagnostic tests, while temporarily adopting previously published proficiency and placement tests.
A program-specific placement test could be developed so that the reasons for separating students into levels in the program are related to the things that the students can learn while in those levels.
It is rarely necessary or even useful to develop program-specific proficiency tests because of their interprogrammatic nature. In other words, for purposes of reference to other programs elsewhere, an adopted test that is used by a wide variety of language programs will be most appropriate. Naturally, all of these decisions are up to the teachers, administrators, and curriculum developers in the program in question.
Adapting Language Tests
It may turn out that a pre-existing test that works fairly well, but not perfectly, can be adapted to the specific testing needs of a particular program.
The process of adapting a test to a specific situation will probably involve some variant of the following strategy:
1. Administer the test to the students in the program.
2. Select those items that appear to be doing a good job of spreading the students out.
3. Create a shorter, more efficient, revised version of the test that fits the ability levels of the specific population of students.
4. Create new items that function like those that were working well in order to have a test of sufficient length.
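Step 2 of this strategy is often carried out with simple item statistics. The following is a minimal sketch, not a prescribed procedure: it assumes each student's answers have been scored as 1 (correct) or 0 (incorrect), computes item facility (the proportion of students answering correctly), and computes item discrimination as the difference in facility between the highest-scoring and lowest-scoring groups. The function names, the one-third group size, and the 0.4 cutoff are illustrative assumptions rather than values given in the text.

```python
from typing import List

Responses = List[List[int]]  # one row per student, one 0/1 column per item


def item_facility(responses: Responses, item: int) -> float:
    """Proportion of students who answered the given item correctly."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses: Responses, item: int,
                        group_fraction: float = 1 / 3) -> float:
    """Facility in the top-scoring group minus facility in the bottom group.

    group_fraction is an assumed convention (upper and lower thirds).
    """
    ranked = sorted(responses, key=sum, reverse=True)  # highest totals first
    n = max(1, round(len(ranked) * group_fraction))
    upper, lower = ranked[:n], ranked[-n:]
    return item_facility(upper, item) - item_facility(lower, item)


def select_items(responses: Responses,
                 min_discrimination: float = 0.4) -> List[int]:
    """Indices of items that separate stronger from weaker students.

    The 0.4 cutoff is a hypothetical threshold chosen for this sketch.
    """
    n_items = len(responses[0])
    return [i for i in range(n_items)
            if item_discrimination(responses, i) >= min_discrimination]


# Invented scores for six students on five items (illustration only).
sample = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
]

print(select_items(sample))  # -> [1, 2, 3]
```

In this invented data set, item 0 is answered correctly by everyone and item 4 is answered equally well by strong and weak students, so neither spreads the students out. Only items 1 to 3 would be carried into the shorter revised version, and new items modeled on them would then be written to restore the test to a sufficient length.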
Organizing and Using Test Results
Table 4.4 A Checklist for Successful Testing
(adapted from Brown forthcoming)
Purposes of Test
1. Clearly defined (theoretical and practical orientation)
2. Understood and agreed upon by staff
Physical needs arranged
1. Adequate and quiet space
2. Enough time in that space for some flexibility
3. Clear scheduling
1. Students properly notified
2. Students signed up for level
3. Students given precise information (where and when test will be, as well as what they should do to prepare and what they should bring with them especially identification if required)
1. Adequate materials in hand (test booklets, answer sheets, cassette tapes, pencils, scoring templates, and so forth) plus extras
2. All necessary equipment in hand and tested (cassette players, microphones, public address system, videotape players, blackboard, chalk, and so forth) with backups where appropriate
3. Proctors trained in their duties
4. All necessary information distributed to proctors (test directions, answers to obvious questions, schedule of who is to be where and when, and so forth)
1. Adequate space for all scoring to take place
2. Clear scheduling of scoring and notification of results
3. Sufficient qualified staff for all scoring activities
4. Staff trained in all scoring procedures
1. Clearly defined uses for results
2. Provision for helping teachers interpret scores and explain them to students
3. Provision for eventual systematic termination of records
1. Results used to full advantage for research
2. Results incorporated into overall program evaluation plan