Testing and Test construction (Evaluation in EFL)


Published in: Education

  1. 3. What is testing? It’s an activity whose purpose is to determine what learners know or can do in a given area. What is a test? It’s a formal instrument for measuring what candidates know or can do in that area.
  2. 4. <ul><li>What are tests for? </li></ul><ul><li>To inform learners and teachers of the strengths and weaknesses of the process. </li></ul><ul><li>To motivate learners to review or consolidate specific material. </li></ul><ul><li>To create a sense of accomplishment/success. </li></ul><ul><li>To guide the planning/development of the ongoing teaching process. </li></ul><ul><li>To determine if (and to what extent) the objectives have been achieved. </li></ul><ul><li>To encourage improvement. </li></ul>
  3. 6. Depending on purpose: Screening/Selection/Admission, Placement, Proficiency, Aptitude, Diagnostic, Achievement, Progress. Depending on characteristics: Direct tests/Indirect tests, Discrete point/Integrative tests, Criterion-referenced/Norm-referenced, Objective tests/Subjective tests, Speed tests/Power tests, Knowledge tests/Skill tests.
  4. 7. <ul><li>Depending on purpose: </li></ul><ul><li>Screening/Selection/Admission: To know if a person has the required behavior to be successful in a specific program (not based on objectives), e.g. IPC’s admission test. </li></ul><ul><li>Placement: To determine the level at which a person should be placed within a program (designed by the institution), e.g. CVA’s placement test. </li></ul><ul><li>Proficiency: To know if a person shows overall proficiency in a language, compared to native speakers in real-life contexts, e.g. the TOEFL test. </li></ul>
  5. 8. <ul><li>Aptitude: To know a person’s talent for doing something specific, i.e. the suitability of a candidate for a specific program of instruction. </li></ul><ul><li>Diagnostic: It refers to entrance behavior or previous knowledge. Its purpose is to determine strengths and weaknesses and to guarantee that potential problems will be corrected (performed by the teacher). </li></ul><ul><li>Achievement: To know if a given objective has been covered successfully. </li></ul><ul><li>Progress: To check the improvement achieved in relation to a reference point in a program. </li></ul>
  6. 9. <ul><li>Depending on characteristics: </li></ul><ul><li>Direct tests: They test what they are intended to assess in a straightforward manner. </li></ul><ul><li>Indirect tests: They give information about aspects that are not the focus but are implicitly addressed (a reading comprehension cloze may give an indirect measure of vocabulary knowledge). </li></ul><ul><li>Discrete point tests: The focus is on restricted areas of the target language (e.g. a cloze test on verb tenses). </li></ul><ul><li>Integrative tests: Answers demand the combination of many areas of language knowledge to generate the required product (e.g. oral interviews, reading comprehension, essay writing). </li></ul>
  7. 10. <ul><li>Criterion-referenced: These exams describe what a person can do in relation to the course objectives or predefined criteria. There is no comparison between students. </li></ul><ul><li>Norm-referenced: Test results are compared so as to measure one person’s performance in relation to a given population. </li></ul><ul><li>Objective tests: No judgment is involved. Answers are either right or wrong (e.g. multiple choice items). </li></ul><ul><li>Subjective tests: Judgment and opinions on the part of the rater are involved. There is no right or wrong answer, but a continuum (e.g. opinion/discussion items). </li></ul>
  8. 11. <ul><li>Speed tests: Easy items that must be answered in a very short time. They assess speed of performance and strategy, e.g. scanning exercises. </li></ul><ul><li>Power tests: The items are difficult, so candidates must be given enough time to respond. They assess actual control over the aspects under scrutiny. </li></ul><ul><li>Knowledge tests: They assess the language components, e.g. grammar quizzes. </li></ul><ul><li>Skill tests: They focus on listening, speaking, reading and/or writing, e.g. listening quizzes. </li></ul>
  9. 13. I. Specific guidelines: The way the test is designed and organized. II. Moderation of mark scheme: The way in which teachers set the score of the test. III. Standardization of examiners: The way in which examiners guarantee common criteria for correction.
  10. 14. <ul><li>I. Specific Guidelines </li></ul><ul><li>Moderation of tasks: Seeking feedback through revision by other teachers. </li></ul><ul><li>Level of difficulty: The tasks in a test should be arranged from easy to difficult. Starting with the most difficult task will lead the weakest learners to give up early. In a pilot test, an item is considered easy if 75% of students answer it correctly, average if 50% answer it correctly, and difficult if only 25% answer it correctly. </li></ul><ul><li>Discrimination: A test should allow candidates at different levels to perform according to their abilities. A variety of tasks ranging from easy to difficult should bring out the differences between stronger and weaker learners. The number of difficult tasks should be limited, and they should go at the end of the test. </li></ul>
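The level-of-difficulty guideline above can be illustrated with a small calculation. The following is a minimal sketch, not a standard procedure: it reads the 75%, 50% and 25% figures as the share of pilot students who answer an item correctly, and it assumes that everything between 25% and 75% counts as average, since the slide names only three reference points. The function names and the pilot data are hypothetical.

```python
from typing import Dict, List


def facility(responses: List[bool]) -> float:
    """Share of pilot students who answered the item correctly (0.0 to 1.0)."""
    if not responses:
        raise ValueError("an item needs at least one response")
    return sum(responses) / len(responses)


def difficulty_band(p: float) -> str:
    """Map a facility value to a band. The cut-offs are an assumption:
    the slide gives 75%, 50% and 25% only as reference points."""
    if p >= 0.75:
        return "easy"
    if p <= 0.25:
        return "difficult"
    return "average"


def order_easy_to_difficult(pilot: Dict[str, List[bool]]) -> List[str]:
    """Item names sorted so that the easiest items come first."""
    return sorted(pilot, key=lambda name: facility(pilot[name]), reverse=True)


if __name__ == "__main__":
    # Hypothetical pilot results: one True/False entry per student.
    pilot = {
        "essay prompt": [False, False, False, True],
        "gap fill": [True, True, True, False],
        "matching": [True, False, True, False],
    }
    for name in order_easy_to_difficult(pilot):
        p = facility(pilot[name])
        print(f"{name}: {p:.0%} correct, {difficulty_band(p)}")
```

Sorting by the share of correct answers gives the easy-to-difficult ordering the guideline recommends, with the few difficult items ending up last.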
  11. 15. <ul><li>Appropriate sample: The test should present a representative sample of the objectives, activities and tasks taught or used in the classroom. </li></ul><ul><li>Overlap: It occurs when content is assessed more than once. It should be avoided, both because reassessing content yields an unrepresentative sample and to prevent visual and mental overload for students. </li></ul><ul><li>Clarity of tasks: Instructions should be simple and unambiguous, providing a clear indication of what the task demands from the student. Instructions should never be more difficult than the task. </li></ul><ul><li>Questions and texts: The selection of questions and texts will depend on the purpose and the formats chosen by the designer of the test. Again, the difficulty should not lie in the question but in the task. Conversely, questions should not be too simple, obvious or answerable from world knowledge. </li></ul>
  12. 16. <ul><li>Timing: Give students a reasonable time to complete the test, since too little time will produce unreliable results. Students should be aware of the time set to complete each part of the test. The time allotted should reflect the importance and difficulty of what is being assessed. To determine the appropriate time, teachers can pilot the test with a group of a similar level, or draw on similar evaluative experiences in the classroom. </li></ul><ul><li>Layout: Presentation, printing, spacing, font size, style, formats (a, b, c… I, II, III, IV… 1, 2, 3…). The layout should be consistent. Each part should be kept together on the same page. </li></ul><ul><li>Bias: It can result from experiential, cultural or knowledge-based factors. Avoid items or topics likely to give an unfair advantage to a particular group of students. Also avoid tasks or issues so obscure that candidates might have no frame of reference within which to process and comprehend what is being asked. </li></ul>
  13. 17. <ul><li>II. Moderation of Mark Scheme </li></ul><ul><li>Acceptable responses/variations. </li></ul><ul><li>Subjectivity in productive tasks. </li></ul><ul><li>Weighting (balance between items/tasks and scores). </li></ul><ul><li>Computation: The data and results should be easy to compute. The manipulation of numbers must be convenient, and simple for students and teachers to conceive and process. </li></ul><ul><li>Avoidance of muddied measurement: The use of one skill should not interfere with the measurement of another. </li></ul><ul><li>Accessibility/intelligibility of mark scheme: Easy and convenient to access, use and understand. </li></ul>
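One way to keep weighting and computation simple, as the mark-scheme points above ask, is to express each part of the test as a share of a 100-point total. The sketch below is only an illustration under that assumption; the function name, the section names and the weights are hypothetical and do not come from the slides.

```python
from typing import Dict


def weighted_score(
    raw: Dict[str, float],
    max_points: Dict[str, float],
    weights: Dict[str, float],
) -> float:
    """Combine section scores into a mark out of 100.

    Each section contributes (raw / max_points) scaled by its weight;
    weights are normalized, so they do not have to add up to 100.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must add up to a positive number")
    earned = sum(weights[s] * raw[s] / max_points[s] for s in weights)
    return 100 * earned / total_weight


if __name__ == "__main__":
    # Hypothetical test with two sections; writing counts for more than reading.
    raw = {"reading": 8, "writing": 15}
    max_points = {"reading": 10, "writing": 20}
    weights = {"reading": 40, "writing": 60}
    print(weighted_score(raw, max_points, weights))
```

Keeping the weights as round numbers that add up to 100 lets students check their own mark by hand, which is the kind of convenience the computation point calls for.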
  14. 18. <ul><li>III. Standardization of examiners </li></ul><ul><li>Agreement on criteria: by teachers and students. </li></ul><ul><li>Trial assessment: to assess difficulty and potential problems. </li></ul><ul><li>Review procedures: to make sure they fit the test’s purposes. </li></ul><ul><li>Follow-up checks: notes or reports on the results of the tests (to improve or consolidate them). </li></ul>