LANGUAGE TESTING (Second Bimester, April-August 2011)


Universidad Técnica Particular de Loja
Academic period: April-August 2011
Program: English
Instructor: Mgs. Orlando Lizaldes E.
Cycle: Sixth
Bimester: Second

Published in: Education

  1. LANGUAGE TESTING
     English
     Second Bimester
     Language School
     Teacher: Orlando V. Lizaldes E.
     April – August 2011
  2. Second Bimester
     - 5 Validity
     - 6 Measurement
     - 7 The social character of language tests
     - 8 New directions – and dilemmas?
     (Image source: http://www.google.com/imgres)
  6. Testing is a matter of using data to establish evidence of learning.
  7. What makes a good test good?
     Its qualities: reliable, valid, practical.
     There is no such thing as a “good test”.
  8. Validity
     - Validity
     - Reliability (standardized tests)
     - Inference
     - Judgment
     - Test validation
  9. Testing the test
  10. Key questions in assessment
      Validity: does this test measure what it is supposed to measure?
      Reliability: does this test or instrument consistently measure what it is supposed to measure?
  11. The harder of the two concepts is…
      Reliability does not often apply to classroom teachers or classroom-based tests.
  12. Reliability: conceptual understanding
      - It may refer to a complete test or to individual items on the test.
      - It deals with consistency of measurement: the same test given to the same group of students should produce the same results.
      - It is rarely applied in classroom-based teaching, because there is no time to give the same test over and over to the same person to see whether the test is reliable. For high-stakes tests it is applied (YES).
  14. VALID TEST
      Remember:
      T: V = R (a valid test is also reliable)
      T: R ≠ V (a reliable test is not necessarily valid)
  15. EXAMPLE:
  16. Validity: the degree to which the test actually measures what it is intended to measure.
  17. If there is no validation
      There is potential for unfairness and injustice.
      The potential is in proportion to what is at stake.
      The validation procedure guarantees the FACE VALIDITY of the test.
  18. MEASUREMENT
      What is measurement?
      Measurement is the estimation of a physical quantity such as distance, energy, temperature, or time. A measurement finds the ratio of a physical quantity to a standard quantity of the same type; a measurement of length, for example, is the ratio of a physical length to some standard length, such as a standard meter.
  19. MEASUREMENT
      Assessment usually involves allocating a score, an attractively simple number.
      “A rose is a rose is a rose” (Gertrude Stein, “Sacred Emily”).
      A score is not a score is not a score, because different raters may give the same performance different scores.
      Measurement is a dauntingly technical field: means, percentiles, standard deviations, and statistics.
  20. Measurement always involves some error, and so in science measurements are accompanied by error bounds.
  21. QUANTIFICATION: the assigning of numbers and scores.
      MATH PROCEDURES: checking for various kinds of mathematical and statistical patterning within the matrix, in order to investigate the extent to which necessary properties are present in the assessment.
  22. Investigating the properties of individual test items (ITEM ANALYSIS)
      Investigating rater characteristics is important for guaranteeing the meaningfulness and fairness of performance assessment.
      Item analysis is a normal part of test development.
      PILOT → OPERATIONAL
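
As a rough illustration of what item analysis on pilot data can look like, the sketch below computes, for each item, the proportion of candidates who answered it correctly. The name "item facility" for this statistic and the response matrix are assumptions made for the example; they do not come from the slides.

```python
# Hypothetical pilot data: one row per candidate, one column per item.
# 1 = correct answer, 0 = incorrect answer.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
]


def item_facility(matrix):
    """Return the proportion of candidates answering each item correctly."""
    n_candidates = len(matrix)
    n_items = len(matrix[0])
    return [
        sum(row[i] for row in matrix) / n_candidates
        for i in range(n_items)
    ]


facilities = item_facility(responses)
for number, value in enumerate(facilities, start=1):
    print(f"Item {number}: facility = {value:.2f}")
```

An item that every pilot candidate answers correctly, or that almost nobody answers correctly, says little about differences between candidates, so it would normally be reviewed before the operational version of the test.
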
  23. Correlation coefficient r
      It expresses the extent to which one set of scores is knowable from another. In general r ranges from -1 to +1; when it is used as a reliability coefficient it is reported on a scale from 0 to 1.
      - Reliability coefficient
      - Inter-rater reliability
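
A minimal sketch of how a correlation coefficient could be computed for two raters who scored the same performances. It uses the standard Pearson formula; the function name `pearson_r` and the scores are hypothetical, chosen only to illustrate inter-rater reliability.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores given by two raters to the same five essays.
rater_a = [12, 15, 9, 18, 14]
rater_b = [11, 16, 10, 17, 13]

r = pearson_r(rater_a, rater_b)
print(f"Inter-rater correlation: r = {r:.2f}")
```

A value close to 1 suggests that the two raters rank the essays in much the same way; a value near 0 suggests that one rater's scores tell us little about the other's.
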
  24. Norm-referenced and criterion-referenced MEASUREMENT
      Norm-referenced measurement (NRM) adopts a framework of comparison between individuals for understanding the significance of any single score.
      In criterion-referenced measurement (CRM), individual performances are evaluated against a verbal description of a satisfactory performance at a given level.
  25. Criterion-referenced
      The criteria are not always easily reduced to a yes/no judgment.
  26. Norm-referenced
      Scores may not be consistent across instruments.
      (www.utpl.edu.ec)
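
The contrast between the two frameworks can be sketched in a few lines. In this sketch the group scores, the candidate's score, and the numeric cut score are all hypothetical; in particular, the cut score is only a simplified stand-in for the verbal description of a satisfactory performance that criterion-referenced measurement relies on.

```python
# Hypothetical scores for a group of ten candidates.
group_scores = [45, 52, 58, 60, 63, 67, 70, 74, 81, 90]


def percentile_rank(score, scores):
    """Norm-referenced view: percentage of the group scoring below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)


def meets_criterion(score, cut_score):
    """Criterion-referenced view: compare the score with a fixed standard."""
    return score >= cut_score


candidate_score = 67
CUT_SCORE = 70  # hypothetical standard, not taken from the slides

print("Percentile rank:", percentile_rank(candidate_score, group_scores))
print("Meets criterion:", meets_criterion(candidate_score, CUT_SCORE))
```

The same score is read in two ways: relative to the other candidates in the first function, and relative to a fixed standard in the second, regardless of how anyone else performed.
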
  27. [Figure: bell curve of a normal distribution]
      (Image source: http://www.google.com/imgres?imgurl=http://classes)
  28. CENTRAL TENDENCY
      The central tendency of a distribution is an estimate of the “center” of a distribution of values.
      (Image source: http://www.google.com/images?imgurlstr=http://centraltendency)
  29. CENTRAL TENDENCY
      There are three major types of estimates of central tendency:
      - Mean
      - Median
      - Mode
  30. CENTRAL TENDENCY
      The mean, or average, is probably the most commonly used method of describing central tendency.
  31. CENTRAL TENDENCY
      The Mean
      To compute the mean, add up all the values and divide by the number of values.
  32. CENTRAL TENDENCY
      The Mean
      For example: 20, 20, 20, 18, 17, 14, 14, 12
      The sum of these 8 values is 135, so the mean is 135 / 8 = 16.875.
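
The calculation of the mean can be checked with a short sketch. It assumes the eight scores are 20, 20, 20, 18, 17, 14, 14, and 12; the eighth score, 12, is an assumption: it is the value implied by the stated total of 135 for eight scores.

```python
import statistics

# The eighth score (12) is an assumption: it is the value implied by the
# stated total of 135 for eight scores.
scores = [20, 20, 20, 18, 17, 14, 14, 12]

# Add up all the values and divide by the number of values.
total = sum(scores)
mean = total / len(scores)
print(f"Sum = {total}, number of values = {len(scores)}, mean = {mean}")

# The standard library computes the same value.
print("statistics.mean:", statistics.mean(scores))
```
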
  33. CENTRAL TENDENCY
      The Median
      The median is the score found at the exact middle of the set of values. One way to compute the median is to list all scores in numerical order and then locate the score in the center of the sample.
  34. The Median: EXAMPLES
      15, 15, 15, 15, 15, 17, 18, 20
      There are 8 scores, and scores #4 and #5 represent the halfway point. Since both of these scores are 15, the median is 15.
      Example: find the median of {12, 3, 5}.
      Put them in order: 3, 5, 12
      The middle number is 5, so the median is 5.
  35. CENTRAL TENDENCY
      If the two middle scores have different values, you have to interpolate to determine the median.
      Here there are fourteen numbers, so there is not just one middle number but a pair of middle numbers:
      3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 40, 56
      In this example the middle numbers are 21 and 23.
      To find the value halfway between them, add them together and divide by 2:
      21 + 23 = 44, and 44 ÷ 2 = 22
      So the median in this example is 22.
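
The median procedure described in these slides (sort the scores, take the middle one, or average the two middle ones) can be sketched as follows. The helper name `median` is chosen for illustration; Python's standard `statistics.median` is called alongside it as a cross-check.

```python
import statistics


def median(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    # Even number of scores: take the value halfway between the middle pair.
    return (ordered[middle - 1] + ordered[middle]) / 2


# The three data sets used in the slides.
examples = [
    [15, 15, 15, 15, 15, 17, 18, 20],
    [12, 3, 5],
    [3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 40, 56],
]
for data in examples:
    print(median(data), statistics.median(data))
```
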
  36. The social character of language tests
      Educational assessment has traditionally drawn its concepts and procedures from the field of psychology.
      When test reforms are introduced within the educational system, they are likely to figure prominently in the press and become matters of public concern.
  37. Conventional proficiency tests have been used for purposes of exclusion.
      Industrialized countries have developed more flexible policies for the recognition and certification of specific work-related skills (competencies).
      International students need to meet a standard on a language test for academic purposes.
  38. Computers and Language Testing
      The proponents of computer-based testing can point to a number of advantages. First, scoring of fixed-response items can be done automatically, and the candidate can be given a score immediately. Second, the computer can deliver tests that are tailored to the particular abilities of the candidate.
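
Automatic scoring of fixed-response items amounts to comparing each response with an answer key. The sketch below is a minimal illustration; the item labels, the answer key, and the function name `score_responses` are all hypothetical.

```python
# Hypothetical answer key for a four-item multiple-choice test.
ANSWER_KEY = {"q1": "b", "q2": "d", "q3": "a", "q4": "c"}


def score_responses(responses, key=ANSWER_KEY):
    """Return (number correct, percentage correct) for one candidate."""
    correct = sum(
        1
        for item, answer in key.items()
        # Unanswered items count as wrong; case and spaces are ignored.
        if responses.get(item, "").strip().lower() == answer
    )
    return correct, 100 * correct / len(key)


candidate_responses = {"q1": "b", "q2": "a", "q3": "A", "q4": "c"}
raw, percent = score_responses(candidate_responses)
print(f"Score: {raw}/{len(ANSWER_KEY)} ({percent:.0f}%)")
```

Because the comparison is mechanical, the score is available as soon as the last response is entered.
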
  39. It seems inefficient for all candidates to take all the questions on a test; clearly some are so easy for some candidates that they provide little information on their abilities; others are too hard to be of use. It makes sense to use the very limited time available for testing to focus on those items that are just within, and just beyond, a candidate’s threshold of ability.
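
One simple way to picture a test that homes in on a candidate's threshold is a "staircase" rule: give a harder item after a correct answer and an easier one after an incorrect answer. The sketch below is only that simplified rule, with hypothetical difficulty levels and a hypothetical candidate; operational computer-adaptive tests generally rely on statistical models of item difficulty rather than this rule.

```python
def run_adaptive_test(answers_correctly, levels=range(1, 11), n_items=6):
    """Administer n_items, moving up one difficulty level after a correct
    answer and down one level after an incorrect answer.

    Returns the list of (level, correct) pairs in the order administered.
    """
    lowest, highest = min(levels), max(levels)
    level = (lowest + highest) // 2  # start in the middle of the item bank
    history = []
    for _ in range(n_items):
        correct = answers_correctly(level)
        history.append((level, correct))
        step = 1 if correct else -1
        # Stay inside the range of available difficulty levels.
        level = max(lowest, min(highest, level + step))
    return history


# Hypothetical candidate who can handle items up to difficulty level 7.
def candidate(level):
    return level <= 7


history = run_adaptive_test(candidate)
for level, correct in history:
    print(f"Level {level}: {'correct' if correct else 'incorrect'}")
```

After a few items the levels administered oscillate around the point where the candidate starts to fail, so testing time is spent on items just within and just beyond that threshold.
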
  40. The use of computers for the delivery of test materials raises questions of validity. For example, different levels of familiarity with computers will affect people’s performance with them, and interaction with the computer may be a stressful experience for some students or candidates (McNamara, 2000, pp. 79-81).
  41. New directions
      - Computer-based tests (CBT)
      - Do raters react differently to printed versus handwritten texts?
      - Semi-direct tests of speaking: cheaper to administer, but they raise questions of validity since there is no COMMUNICATION at all.
  42. Summing up
      Language testing remains a complex and perplexing activity.
      Language testing is an uncertain and approximate business at the best of times, even if to the outsider this may be camouflaged by its impressive, even daunting, technical trappings (McNamara, 2000, p. 86).
  43. Consulted Bibliography
      - McNamara, T. (2000). Language Testing. Oxford University Press. London.
      - Heaton, J. B. (1998). Classroom Testing. Keys to Language Teaching. Longman. New York (USA).
      - Richards, J. C. (2005). Communicative Language Teaching. Cambridge University Press.
      - Brown, H. D. (2004). Language Assessment: Principles and Classroom Practices. Longman. United States.
      - IBT Tests (2004). McGraw-Hill.
      - Freeman, D., Richards, J. C. (2001). Teacher Learning in Language Teaching. Pearson. USA.
      - O’Malley, J. M., Valdez Pierce, L. (1996). Authentic Assessment for English Language Learners: Practical Approaches for Teachers. Longman. USA.
  44. THANK YOU