# Evaluation of in-house item banks by administering actual CATs

TERA & PROMS 2013, Taiwan

1. Evaluation of in-house item banks by administering actual CATs. Tetsuo KIMURA (Niigata Seiryo University). TERA-PROMS 2013, Kaohsiung, Taiwan, August 5, 2013.
2. Outline
    - Background & Previous Studies
        - What is CAT?
        - UCAT and Moodle UCAT
        - Construction of Item Bank
    - Current Study
        - Sample & Method
        - Results
        - Conclusion
3. What is CAT? Paper-Pencil Test; Computerized Test; Computer Adaptive Test
4. What is CAT? Paper-Pencil Test; Computerized Test; Computer Adaptive Test; Interview Test; Self-scoring flexilevel test (Lord, 1971); Binet's IQ test (Binet & Simon, 1905); Adaptive Test
5. Binet's IQ test (Binet & Simon, 1905): The First Adaptive Test
6. Flexilevel Test (Lord, 1971)

    Start: the middle difficulty item, number 11 in difficulty order. Every item offers four answer options (① ② ③ ④).

    | Step | Easier item | Harder item |
    |---|---|---|
    | 1 | A slightly easier item, number 10 in difficulty order | A slightly harder item, number 12 in difficulty order |
    | 2 | A slightly easier item, number 9 in difficulty order | A slightly harder item, number 13 in difficulty order |
    | 3 | ・・・ | ・・・ |
    | 10 | The easiest item, number 1 in difficulty order | The hardest item, number 21 in difficulty order |
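The branching on the flexilevel slide can be sketched in a few lines of Python. This is a minimal illustration, not code from UCAT or Moodle UCAT: the function name `flexilevel_sequence` is hypothetical, and it assumes the usual flexilevel rule that a correct answer leads to the next unused harder item, a wrong answer leads to the next unused easier item, and the test stops after (N + 1) / 2 items.

```python
def flexilevel_sequence(responses, n_items=21):
    """Return the difficulty ranks administered in a flexilevel test.

    responses: iterable of booleans (True = correct), one per administered item.
    n_items:   odd number of items ranked 1 (easiest) to n_items (hardest).
    """
    if n_items % 2 == 0:
        raise ValueError("a flexilevel test needs an odd number of items")
    middle = (n_items + 1) // 2      # item 11 when n_items == 21
    next_easier = middle - 1         # next unused item on the easier side
    next_harder = middle + 1         # next unused item on the harder side
    current = middle
    administered = []
    # The test taker answers (n_items + 1) / 2 items in total.
    for correct in list(responses)[:middle]:
        administered.append(current)
        if correct:
            current = next_harder
            next_harder += 1
        else:
            current = next_easier
            next_easier -= 1
    return administered


if __name__ == "__main__":
    print(flexilevel_sequence([True, False, True, True, False]))
```

A test taker who answers every item correctly works upward from item 11 to item 21, and one who answers every item incorrectly works downward from item 11 to item 1.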
7. What is CAT? Individualization of test; efficiency of measurement
    1. item selection suitable to each test taker
    2. shortening of test administration time
    3. improvement of measurement accuracy
8. Previous Studies
    - Rasch-based CAT program
        - Linacre (1987). UCAT: a BASIC computer-adaptive testing program.
        - Kimura, Ohnishi & Nagaoka (2012). Moodle UCAT: a computer-adaptive test module for Moodle based on the Rasch model. ⇒ ACP (SG) & Version2 (JP) cooperative project
    - Construction of item banks for CAT
        - Kimura (2009). Construction of a Moodle-based placement test and possibility of a Moodle-based computer adaptive test.
        - Kimura & Nagaoka (2010). Towards the construction of item banks for moodle-based in-house computer adaptive English tests.
9. Construction of item bank
    - Pretesting
    - Item analysis & elimination of misfit
    - More pretests with new items and anchored items
    - Item bank (calibrated items, anchored items)
10. Types of items used in the study. All the items were adopted from the Eiken Test, Grade Pre-1 to Grade 3, with the permission of the Society for Testing English Proficiency (STEP).
    - Listening comprehension (Lng)
    - Reading comprehension (Rdg)
    - Vocabulary and grammar (Vgm)
    - Listening comprehension with dialogue (Dlg)
    - Listening comprehension with monologue (Mlg)
11. Construction of item bank: Common Person Linking (Dlg & Mlg → Lng)
    - r = 0.86
    - Mlg = Dlg × 1.18 + 0.06
    - r = 0.89
    - Dlg = Mlg × 0.85 + 0.05
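As a small worked example, the two linking equations on this slide can be applied directly to convert a measure from one listening scale to the other. The function names are hypothetical, and the input values are illustrative rather than taken from the study.

```python
def dlg_to_mlg(dlg):
    """Convert a dialogue-scale measure to the monologue scale (Mlg = Dlg x 1.18 + 0.06)."""
    return dlg * 1.18 + 0.06


def mlg_to_dlg(mlg):
    """Convert a monologue-scale measure to the dialogue scale (Dlg = Mlg x 0.85 + 0.05)."""
    return mlg * 0.85 + 0.05


if __name__ == "__main__":
    # Illustrative measures in logits, not data from the study.
    for measure in (-1.0, 0.0, 1.0):
        print(measure, round(dlg_to_mlg(measure), 2), round(mlg_to_dlg(measure), 2))
```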
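12. Current item banks

    | Vgm | N | AVG | SD |
    |---|---|---|---|
    | G1.5 (B2) | 73 | 1.57 | 0.84 |
    | G2 (B1) | 69 | 0.52 | 0.81 |
    | G2.5 (A2) | 67 | -0.47 | 0.91 |
    | G3 (A1) | 49 | -1.41 | 0.80 |
    | Total | 258 | 0.19 | 1.37 |

    | Lng | N | AVG | SD |
    |---|---|---|---|
    | G1.5 (B2) | 44 | 1.26 | 1.42 |
    | G2 (B1) | 109 | 0.77 | 1.11 |
    | G2.5 (A2) | 75 | 0.35 | 1.05 |
    | G3 (A1) | 80 | -0.90 | 1.33 |
    | Total | 308 | 0.30 | 1.43 |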
13. CAT Algorithm: Item Selection (logit bias)

    In Moodle UCAT, LL and UL can be adjusted by adding a logit value to the Logit bias box in the CAT setting window:

    $\mathrm{UL}_{\text{biased}} = \mathrm{UL} + \mathrm{Bias}$

    $\mathrm{LL}_{\text{biased}} = \mathrm{LL} + \mathrm{Bias}$

    - A positive logit value decreases the chance of answering correctly.
    - A negative logit value increases the chance of answering correctly.
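The effect of the logit bias on item selection can be illustrated with a short sketch. This is not the Moodle UCAT source code: `select_item`, the dictionary-based item bank, and the nearest-item fallback are assumptions made for illustration. It also assumes that LL and UL are item-difficulty bounds in logits already placed around the current ability estimate, and that the bias is added to both bounds.

```python
import math
import random


def rasch_probability(ability, difficulty):
    """Rasch model probability of a correct answer (both arguments in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def select_item(bank, ll, ul, bias=0.0, used=(), rng=random):
    """Pick an unused item whose difficulty lies in the biased window.

    bank: dict mapping item id -> difficulty in logits (hypothetical structure).
    ll, ul: lower and upper difficulty limits before the bias is applied.
    """
    lower, upper = ll + bias, ul + bias
    remaining = [item for item in bank if item not in used]
    if not remaining:
        return None  # the bank is exhausted
    in_window = [item for item in remaining if lower <= bank[item] <= upper]
    if in_window:
        return rng.choice(in_window)
    # Assumed fallback: take the unused item closest to the window centre.
    centre = (lower + upper) / 2.0
    return min(remaining, key=lambda item: abs(bank[item] - centre))


if __name__ == "__main__":
    bank = {"easy": -1.0, "medium": 0.1, "hard": 1.2}  # illustrative values
    for bias in (-1.0, 0.0, 1.0):
        item = select_item(bank, -0.5, 0.5, bias=bias)
        print(bias, item, round(rasch_probability(0.0, bank[item]), 2))
```

With a positive bias the window moves toward harder items, so the probability of a correct answer for a test taker at the current estimate falls below 0.5; a negative bias has the opposite effect.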
14. Current Study: Sample & Method

    Test takers: about 160 Japanese university freshmen majoring in nursing and social welfare. Some students were excluded from the data because they had not completed the CAT properly.

    Item banks:

    | Eiken grade | Vgm | Lng |
    |---|---|---|
    | Pre 1st | 115 | 113 |
    | 2nd | 105 | 108 |
    | Pre 2nd | 95 | 104 |
    | 3rd | 85 | 91 |

    CAT conditions:
    - Initial ability estimate: 0.0 logit (100 units)
    - Ending condition: number of items (16 items); the S.E. can theoretically reach as low as 0.5 logit (Linacre, 2006)
    - Logit bias: 0 (so that the target probability of a correct answer would be 0.5)
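The 0.5-logit figure for a 16-item test can be checked with the standard Rasch formula for the standard error of an ability estimate, S.E. = 1 / sqrt(sum of p(1 - p)), where p is the probability of a correct answer on each administered item. The sketch below assumes the ideal case in which every item is perfectly targeted (p = 0.5); the function name is hypothetical.

```python
import math


def rasch_standard_error(probabilities):
    """Standard error of a Rasch ability estimate, in logits.

    probabilities: success probability of the test taker on each administered item.
    Each item contributes p * (1 - p) to the test information.
    """
    information = sum(p * (1.0 - p) for p in probabilities)
    if information <= 0.0:
        raise ValueError("no information: S.E. is undefined")
    return 1.0 / math.sqrt(information)


if __name__ == "__main__":
    # 16 perfectly targeted items: information = 16 * 0.25 = 4, so S.E. = 0.5.
    print(rasch_standard_error([0.5] * 16))
    # Off-target items carry less information, so the S.E. is larger.
    print(rasch_standard_error([0.8] * 16))
```

Because p(1 - p) peaks at p = 0.5, any mistargeting raises the S.E. above 0.5 logit for the same 16 items.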
15. Current Study: Results
    - Vgm: More than 90% of 157 test takers ended their CAT with an S.E. of less than .55 logits.
    - Lng: More than 90% of 130 test takers ended their CAT with an S.E. of less than .55 logits.

    [Figure: item exposure rate (frequency per 100 test takers); panels: Vgm, Lng]
16. Current Study: Results

    [Figure: item exposure rate (frequency per 100 test takers) and current item banks; panels: Vgm, Lng]
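The item exposure rate shown in the figures is a frequency per 100 test takers. A minimal sketch of that calculation follows; the session data and the function name `exposure_rates` are hypothetical, and each CAT session is assumed to be recorded as a list of administered item ids.

```python
from collections import Counter


def exposure_rates(sessions):
    """Return {item id: number of administrations per 100 test takers}.

    sessions: list of CAT sessions, each a list of the item ids administered.
    """
    n_takers = len(sessions)
    if n_takers == 0:
        return {}
    # Count each item at most once per session.
    counts = Counter(item for session in sessions for item in set(session))
    return {item: 100.0 * count / n_takers for item, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical sessions, not data from the study.
    sessions = [["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"]]
    for item, rate in sorted(exposure_rates(sessions).items()):
        print(item, rate)
```

Items with a rate near 100 are seen by almost every test taker, which points to a difficulty range where the bank has too few alternatives.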
17. Current Study: Conclusions
    - More items with lower difficulty should be added to both item banks.
    - If the CATs were administered to students at an advanced level, more items with higher difficulty would need to be added to both item banks.
    - If the cut-off point of the test is set between 0 and 3 logits for Vgm and between -1 and 3 logits for Lng, the current item banks can serve well for the CAT.
18. Thank you for listening. Tetsuo Kimura, tetsuo.kmr@gmail.com. Files for Moodle UCAT: https://github.com/VERSION2-Inc/moodle_ucat. Acknowledgements: A part of the present study was supported by a Grant-in-Aid for Scientific Research for 2010-2012 (No. 22520590) from the Japan Society for the Promotion of Science.