LANGUAGE TESTING SCHOOL: English NAME: Mgs. Vanessa Toro G. TERM: Second, October 2011-February 2012
Unit 5 Validity: Testing the test
According to McNamara (2008), “The purpose of validation in language testing is to ensure the defensibility and fairness of interpretations based on test performance.” According to Hughes (2008), “A test is said to be valid if it measures accurately what it is intended to measure.”
Hughes (2008, p. 26) argues that a test has content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned. For example, “we would not expect an achievement test for intermediate learners to contain just the same set of structures as one for advanced learners.”
Test method is “the way in which the candidate is asked to engage with the materials and tasks in the test, and how these responses will be scored.” Examples include fixed and constructed response formats and short-answer questions.
Test construct is “what is being measured by the test; those aspects of the candidate’s underlying knowledge or skill which are the target of measurement in a test.” In addition, “it refers to those aspects of knowledge or skill possessed by the candidate which are being measured.”
Measurement investigates the quality of the process of assessment by looking at scores. Two main steps are involved:
Quantification, the assigning of scores to various outcomes of assessment.
Checking for various kinds of mathematical and statistical patterning within the matrix of scores to investigate the extent to which consistency of performance by candidates, or by judges, is present in the assessment.
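The two steps above can be illustrated with a small sketch. The slides do not name a particular statistic, so the choice of Cronbach's alpha here is an assumption: it is one common index of consistency across the columns (items or judges) of a score matrix. The data and the function name `cronbach_alpha` are hypothetical.

```python
from statistics import variance


def cronbach_alpha(score_matrix):
    """Estimate consistency for a matrix with candidates as rows and items (or judges) as columns."""
    n_columns = len(score_matrix[0])
    if n_columns < 2 or len(score_matrix) < 2:
        raise ValueError("need at least two candidates and two columns")
    # Sample variance of each column across candidates.
    column_variances = [
        variance([row[j] for row in score_matrix]) for j in range(n_columns)
    ]
    # Sample variance of each candidate's total score.
    total_variance = variance([sum(row) for row in score_matrix])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (n_columns / (n_columns - 1)) * (1 - sum(column_variances) / total_variance)


# Step 1 (quantification): hypothetical scores, five candidates on four items (1 = correct).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

# Step 2 (patterning): a value closer to 1 suggests more consistent performance.
alpha = cronbach_alpha(scores)
print(round(alpha, 2))
```

The same function could be applied to a matrix whose columns are judges rather than items, which would speak to consistency among judges instead of consistency among candidates.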
Norm-referenced tests are designed to enable the test user to make “normative” interpretations of test results. That is, test results are interpreted with reference to the performance of a given group, or norm. The “norm group” is a large group of individuals who are similar to the individuals for whom the test is designed.
In the development of norm-referenced tests the norm group is given the test, and then the characteristics, or norms, of this group’s performance are used as reference points for interpreting the performance of other students who take the test.
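As a rough illustration of a normative interpretation, the sketch below reports a score as a percentile rank within a norm group. The percentile rank is only one possible reference point, and the norm-group scores and the function name `percentile_rank` are hypothetical.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring below `score`, counting ties as half."""
    if not norm_scores:
        raise ValueError("norm group is empty")
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)


# Hypothetical norm group; a real one would be far larger.
norm_group = [35, 42, 48, 50, 55, 58, 60, 63, 70, 81]

# A later test taker's score is interpreted against the norm group's performance.
rank = percentile_rank(60, norm_group)
print(rank)
```

The raw score of 60 means nothing on its own here; its interpretation depends entirely on how the norm group performed.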
Criterion-referenced tests are designed to enable the test user to interpret a test score with reference to a criterion level of ability or domain of content. For example, students are evaluated in terms of their relative degree of mastery of course content, rather than with respect to their relative ranking in the class.
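A minimal sketch of a criterion-referenced interpretation follows. It assumes a fixed mastery cut-off of 80% of the content domain; that figure and the function name `mastery_decision` are hypothetical, chosen only for illustration.

```python
def mastery_decision(items_correct, items_total, cut_proportion=0.8):
    """Return the proportion of content mastered and whether it meets the criterion."""
    if items_total <= 0:
        raise ValueError("items_total must be positive")
    proportion = items_correct / items_total
    return proportion, proportion >= cut_proportion


# Each student is judged against the criterion, not against classmates.
print(mastery_decision(17, 20))
print(mastery_decision(15, 20))
```

Because the decision depends only on the student's own score and the criterion, every student in a class could reach mastery, or none could.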
Item response theory: an approach to measurement which uses complex statistical modeling of test performance data to make powerful generalizations about item characteristics, the relation of items to candidate ability, and about the overall quality of tests.
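To make the relation between item difficulty and candidate ability concrete, here is a sketch of the Rasch (one-parameter logistic) model, one of the simplest item response models. The slides do not say which model is meant, so using Rasch here is an assumption, and the ability and difficulty values are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model.

    Ability and difficulty are expressed on the same (logit) scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# A candidate whose ability equals the item's difficulty has an even chance of success.
print(rasch_probability(0.0, 0.0))

# The same candidate is more likely to succeed on an easier item than on a harder one.
print(rasch_probability(0.0, -1.0) > rasch_probability(0.0, 1.0))
```

In practice the ability and difficulty values are estimated from test performance data with specialised software rather than set by hand.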
Assessment is done not by someone acting in a private capacity or motivated by personal curiosity, but in an institutional role, and serving institutional purposes. It involves the fulfillment of policy objectives in education and other areas of social policy.
In international education, tests are used to control access to educational opportunities. Typically, international students need to meet a standard on a test of language for academic purposes before they are admitted to the university of their choice.
However, research both on the presumed negative washback of conservative test formats and on the presumed positive washback of communicative assessment has shown that washback is often rather unpredictable.
Factors that may affect washback include conditions in the classroom, the immediate motivation of learners, etc.
Test impact: tests can also have effects beyond the classroom. Test impact is the wider effect of tests on the community as a whole, including the school.
For example, the TOEFL has effects beyond the classroom, in terms of educational policy and the allocation of resources to education.
Unit 7 Codes of professional ethics for language testers
Professional bodies of language testers should formulate codes of practice which will guide language testers in their work. The emphasis is on good professional practice: that is, language testers should in general take responsibility for the development of quality language tests.
From the perspective of critical language testing, the emphasis in ethical language testing on the individual responsibility of the language tester is misguided, because it presupposes that this responsibility operates within the established institution of testing, and so essentially accepts the status quo and concedes its legitimacy.
Computers and language testing: many national and international language tests, including TOEFL, are moving to computer-based testing (CBT), where stimulus texts and prompts are presented on the screen and candidates are required to key in their responses. This represents a change in test method, not in test content.
CBT requires the prior creation of an item bank: a large group of items which have been thoroughly trialed, and whose likely difficulty for candidates at given levels of ability has been estimated as precisely as possible.
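One way to picture an item bank is as a collection of trialed items, each stored with its estimated difficulty. The sketch below is only an assumption about how such a bank might be represented and queried; the class `BankItem`, the function `pick_items`, the item identifiers, and the difficulty scale are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BankItem:
    item_id: str
    difficulty: float  # estimated from trialling; the scale is hypothetical


def pick_items(bank, target_level, count):
    """Return the `count` items whose estimated difficulty is closest to `target_level`."""
    return sorted(bank, key=lambda item: abs(item.difficulty - target_level))[:count]


# A tiny hypothetical bank; a real one would hold a large group of trialed items.
bank = [
    BankItem("R01", -1.5),
    BankItem("R02", -0.4),
    BankItem("R03", 0.1),
    BankItem("R04", 0.9),
    BankItem("R05", 2.0),
]

chosen = pick_items(bank, target_level=0.0, count=2)
print([item.item_id for item in chosen])
```

The more precisely each item's difficulty has been estimated, the more useful a lookup of this kind becomes.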
Automatic scoring and the fact that candidates can be given a score immediately are both advantages of CBT.
Computers represent the area of most rapid technological change in language testing.
Tape recorders can be used in the administration of speaking tests. Candidates are presented with a prompt on tape, and are asked to respond as if they were talking to a person, the response being recorded on tape.