# RIT 101: Understanding Scores From MAP

RIT 101
Gage Kingsbury & Steve Wise, Senior Research Fellows, NWEA
Fusion 2012, the NWEA summer conference in Portland, Oregon

It’s easy to say that the RIT scale is an equal-interval scale, but not as easy to back it up. This session will provide a conceptual review of the RIT scale and its characteristics and help to answer these questions: What is a RIT? What is a Rasch model? Why isn’t the number of correct answers used as the score? How are scores compared if students take different test items? Does a 200 RIT score from a third-grader mean the same thing as a 200 from an eighth-grader?

Learning outcome:
- Gain a deeper understanding of the Rasch model.

Audience:
- New data user
- Experienced data user
- Advanced data user
- Curriculum and Instruction

Published in: Education, Business, Technology

### RIT 101: Understanding Scores From MAP

1. RIT 101: Understanding Scores from MAP. Steven L. Wise, Senior Research Fellow
2. RIT 101
    - Unique features of the RIT scales
    - Calibrating items for MAP
    - The RIT scale and adaptive testing
    - Scoring a test
    - Interpretation of scores
3. Unique Features of RIT Scales
    - Equal Interval
    - Cross Graded
    - Stable over time
    - Allows us to assess change (growth) over time
    - Allows us to develop item banks
    - Allows us to give tests specific to student needs
4. How do we use the RIT scale?
    - The RIT scale is the platform upon which new items are calibrated, a test is chosen for a student, and a student’s score is computed and interpreted.
    - MAP is a computerized adaptive test (CAT), which means that each student receives a test that is tailored to his/her level of proficiency.
5. Item Calibration
    - Item calibration is the process by which we figure out how difficult an item is.
    - This is extremely useful both in building an item bank and in administering a CAT.
    - It is based on item response theory, specifically the Rasch model.
        - The Rasch model specifies the relationship between a student’s proficiency level and his/her chances of passing the item.
6. How do we decide a new item’s difficulty?
    - Some items are more difficult than others.
    - We figure out an item’s difficulty by field testing it during live test events.
    - We then consider how many students got the item right relative to their standing on the RIT scale.
7. A Basic Math Item: 5 + 5 = ?
    - [Plot: proportion correct (0.0 to 1.0) against RIT (120 to 270).]
8. Fitting a Rasch Curve
    - [Plot: proportion correct (0.0 to 1.0) against RIT (120 to 270).]
9. Item Difficulty: the RIT value at which we expect half of the students to pass the item.
    - [Plot: proportion correct (0.0 to 1.0) against RIT (120 to 270). Difficulty = 170.]
10. The Item Bank
    - Once an item has been calibrated, it (along with its difficulty) will be added to the MAP item bank.
    - MAP banks contain thousands of test items.
    - Large item banks are essential for using CAT.
11. Scoring a Test
    - The scoring of a student’s test under the Rasch model takes into account two things:
        - how difficult the items the student received were
        - how she did on those items
    - A standard method of scoring is called “maximum likelihood.”
        - This just means, “What is the most likely RIT score for a student who performed as she did on the items she received?”
    - Conceptually, this is not as complicated as it sounds.
12. A One-item Test
    - [Plot: proportion correct (0 to 1) against RIT (130 to 270).]
    - If this item was passed, what are the most likely values of the student’s RIT? What are the least likely values?
13. A Two-item Test
    - [Plot: proportion correct (0 to 1) against RIT (130 to 270).]
    - What if the Blue item was passed and the Red item was failed?
14. A Three-item Test
    - [Plot: proportion correct (0 to 1) against RIT (130 to 270).]
    - What if the Blue and Green items were passed and the Red item was failed?
15. Maximum Likelihood Scoring and CAT
    - Notice that item difficulty and student scores are on the same scale (RIT).
    - The best measurement occurs when students are given items whose difficulties are well matched to their proficiency levels.
    - This is what a CAT does. It tailors the test to each student by adjusting item difficulty.
    - Result: all students can be measured with equal precision.
16. How a CAT Works
    1. Pick an item of appropriate starting difficulty.
    2. The item is presented to and answered by the student.
    3. If the answer is right, choose a harder item to give next. If the answer is wrong, choose an easier item to give next.
    4. Repeat steps 2 and 3 until enough items have been given.
    5. Calculate the student’s RIT score.
17. Interpreting a RIT Score
    - How is a student’s RIT score interpreted?
    - A RIT score in math of, say, 221 is not interpretable by itself.
    - We need to have one or more reference points to interpret a score.
18. Reference Points for a Spring RIT Score of 221 in Math
    - Normative: Shauna’s 221 is at the 62nd percentile relative to other 5th grade students.
    - Growth: She has gained 13 RIT points since fall MAP testing. Typical growth for students starting at the same level was 9 points.
    - Predictive: Her score indicates that she is on track to be college ready by the 12th grade.
    - Content: DesCartes provides information about which skills Shauna is currently ready to learn.
19. Thank you for your attention. steve.wise@nwea.org
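
The ideas on slides 5 through 16 (the Rasch curve, maximum likelihood scoring, and the CAT loop) can be tied together in a short sketch. This is an illustration under stated assumptions, not NWEA's implementation. The slides do not give the scaling between RIT and the underlying Rasch (logit) scale, so the 10 RIT points per logit used here is an assumption. The function names, the 100 to 300 search range, the simulated student, and the item bank are all hypothetical. Picking the unused item closest to the current estimate stands in for the "harder after a right answer, easier after a wrong answer" rule on slide 16.

```python
import math
import random

# Assumption: 10 RIT points per logit. The slides do not state the scaling.
RIT_PER_LOGIT = 10.0


def p_correct(student_rit, item_rit):
    """Rasch model: chance that a student passes an item."""
    logit = (student_rit - item_rit) / RIT_PER_LOGIT
    return 1.0 / (1.0 + math.exp(-logit))


def log_likelihood(student_rit, responses):
    """Log-likelihood of (item_rit, passed) pairs at a candidate RIT."""
    total = 0.0
    for item_rit, passed in responses:
        p = p_correct(student_rit, item_rit)
        total += math.log(p if passed else 1.0 - p)
    return total


def ml_score(responses, low=100, high=300):
    """Most likely whole-number RIT in [low, high], found by grid search.

    If every item was passed (or every item failed) the likelihood has no
    interior peak, so the result lands on the upper (or lower) bound.
    """
    return max(range(low, high + 1),
               key=lambda rit: log_likelihood(rit, responses))


def run_cat(true_rit, bank, n_items=20, start_rit=200, seed=0):
    """Simulate the CAT loop for a hypothetical student with a known RIT."""
    rng = random.Random(seed)
    available = list(bank)
    responses = []
    estimate = start_rit
    for _ in range(min(n_items, len(available))):
        # Choose the unused item whose difficulty is closest to the
        # current estimate; a pass raises the estimate, a fail lowers it.
        item = min(available, key=lambda d: abs(d - estimate))
        available.remove(item)
        # Simulate the answer from the Rasch pass probability.
        passed = rng.random() < p_correct(true_rit, item)
        responses.append((item, passed))
        estimate = ml_score(responses)
    return estimate, responses


if __name__ == "__main__":
    # Two-item test in the spirit of slide 13 (difficulties are made up).
    print(ml_score([(170, True), (190, False)]))
    # Simulated adaptive test against a hypothetical bank of 51 items.
    bank = list(range(150, 251, 2))
    final_rit, given = run_cat(true_rit=221, bank=bank)
    print(final_rit, len(given))
```

With a passed item at 170 and a failed item at 190 (made-up difficulties, since the slides do not label the Blue and Red curves), the likelihood is symmetric around the midpoint, so the grid search settles on 180. A one-item test that was passed has no upper limit on the most likely score, which is why `ml_score` returns the top of its search range in that case.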