This webinar discusses using item analysis and statistics to evaluate assessment questions and better understand assessment data. It covers common item analysis statistics such as item difficulty, the discrimination index, and point-biserial values, and provides guidelines for desired statistical ranges. The webinar emphasizes that statistics alone cannot determine whether a question is good or bad, and lists factors to consider, such as author intent, content delivery method, and historical analysis. Best practices such as strengthening the item review process and reusing questions are recommended to improve the accuracy and usefulness of the statistics.
EAC 2018 | Fort Lauderdale | June 27-29
Psychometrics 101: Know What Your Assessment Data Is Telling You
Eric Ermie, Vice President of Sales, ExamSoft WorldWide, Inc.
Formerly Program Manager for Evaluation and Assessment at The Ohio State University College of Medicine
AGENDA
• Overview
• Types of stats
• Interpreting the item analysis report
• Examples
• General statistical guidelines
OVERVIEW

"Where do I start?"
"Is this a good or bad question? Can statistics even tell me that?"
"How can I reconcile what I know about my assessment's past with what the data is telling me?"

Item analysis is not a foolproof answer to these questions.
But… YOU HAVE TO START SOMEWHERE.
TYPES OF STATS

Common Stats:
• Item Difficulty/p Value: a decimal representation of difficulty, equal to the proportion of students who answered the item correctly. The lower the value, the harder the item.
• Upper 27%: of the top 27% of scorers only, the percentage who answered the item correctly.
• Lower 27%: of the bottom 27% of scorers only, the percentage who answered the item correctly.
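These three statistics are straightforward to compute. A minimal sketch with hypothetical data (the class roster, scores, and function names are illustrative, not ExamSoft's implementation):

```python
# Illustrative only: compute the p value and Upper/Lower 27% for one item.
# `scores` are total exam scores; `correct` flags whether each student
# answered this particular item correctly (hypothetical data).

def item_difficulty(correct):
    """p value: proportion of all students answering the item correctly."""
    return sum(correct) / len(correct)

def upper_lower_27(scores, correct):
    """Percent correct within the top and bottom 27% of total scorers."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    k = max(1, round(0.27 * len(scores)))          # students per tail group
    lower_group, upper_group = order[:k], order[-k:]
    pct = lambda grp: sum(correct[i] for i in grp) / len(grp)
    return pct(upper_group), pct(lower_group)

scores  = [95, 90, 88, 84, 80, 75, 70, 65, 55, 40]   # hypothetical class
correct = [1,  1,  1,  1,  1,  0,  1,  0,  0,  0]

p = item_difficulty(correct)                     # 0.6: moderately hard item
upper, lower = upper_lower_27(scores, correct)   # 1.0 and 0.0 here
```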
TYPES OF STATS

Common Stats:
• Discrimination Index: calculated by subtracting the percentage of the bottom-27% group that answered the item correctly from the percentage of the top-27% group that did. It measures whether the item discriminates between the highest and lowest performers.
• Point-Biserial: a discrimination statistic that indicates whether doing well on that specific item correlates with doing well on the exam overall; in other words, whether the item was a good or bad predictor of overall performance on the exam.
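Both statistics can likewise be sketched in a few lines (again with hypothetical data; the point-biserial shown is the standard textbook formula, not necessarily the exact variant any particular software uses):

```python
import statistics

def discrimination_index(upper_pct, lower_pct):
    """Upper-27% percent correct minus Lower-27% percent correct."""
    return upper_pct - lower_pct

def point_biserial(scores, correct):
    """Correlation between item correctness (0/1) and total exam score."""
    right = [s for s, c in zip(scores, correct) if c == 1]
    wrong = [s for s, c in zip(scores, correct) if c == 0]
    p = len(right) / len(scores)                 # item difficulty
    sd = statistics.pstdev(scores)               # population SD of totals
    return (statistics.mean(right) - statistics.mean(wrong)) / sd \
        * (p * (1 - p)) ** 0.5

scores  = [95, 90, 88, 84, 80, 75, 70, 65, 55, 40]   # hypothetical totals
correct = [1,  1,  1,  1,  1,  0,  1,  0,  0,  0]

disc = discrimination_index(1.0, 0.0)    # 1.0: maximal discrimination
rpb  = point_biserial(scores, correct)   # ~0.78: item tracks overall score
```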
ITEM ANALYSIS REPORT

• Item Difficulty: range 0.0 to 1.0
• Discrimination Index: range -1.0 to 1.0
• Point-Biserial: range -1.0 to 1.0
But with any statistic it is important to remember: context matters!
EXTRANEOUS FACTORS

"Stats alone cannot tell the whole story."

Six factors to always consider when evaluating item performance:
1. Cheating
2. Return on investment
3. Conflicting content/faculty
4. "Six degrees of Kevin Bacon"
5. Author intent
6. Content delivery method
ITEM ANALYSIS EXAMPLES

Item #1 (correct answer: E)
Diff(p) 0.98 | Upper 27%: 100.00% | Lower 27%: 96.15% | Disc. Index 0.04 | Point Biserial 0.10

Response frequencies (* indicates correct answer):

                 A      B      C      D      E
  # selected     0      1      1      1   *178
  % selected   0.00   0.55   0.55   0.55  98.34
  Point Bis.   0.00   0.02   0.00  -0.10   0.10
  Disc. Index  0.00   0.00   0.00  -0.02   0.02
  Upper 27%    0.00   0.00   0.00   0.00   1.00
  Lower 27%    0.00   0.00   0.00   0.02   0.98
Item #7 (correct answer: D)
Diff(p) 0.66 | Upper 27%: 82.00% | Lower 27%: 46.15% | Disc. Index 0.36 | Point Biserial 0.28

Response frequencies (* indicates correct answer):

                 A      B      C      D      E
  # selected     7     17     28   *120      9
  % selected   3.87   9.39  15.47  66.30   4.97
  Point Bis.  -0.11  -0.19  -0.12   0.28  -0.07
  Disc. Index -0.04  -0.19  -0.09   0.36  -0.04
  Upper 27%    0.00   0.00   0.12   0.82   0.06
  Lower 27%    0.04   0.19   0.21   0.46   0.10
Item #22 (correct answer: D)
Diff(p) 0.36 | Upper 27%: 52.00% | Lower 27%: 26.92% | Disc. Index 0.25 | Point Biserial 0.22

Response frequencies (* indicates correct answer):

                 A      B      C      D      E
  # selected    35     34     21    *66     25
  % selected  19.34  18.78  11.60  36.46  13.81
  Point Bis.  -0.09   0.04  -0.20   0.22  -0.06
  Disc. Index -0.15   0.07  -0.15   0.25  -0.02
  Upper 27%    0.10   0.24   0.04   0.52   0.10
  Lower 27%    0.25   0.17   0.19   0.27   0.12
Item #24 (correct answer: C)
Diff(p) 0.52 | Upper 27%: 64.00% | Lower 27%: 42.31% | Disc. Index 0.22 | Point Biserial 0.18

Response frequencies (* indicates correct answer):

                 A      B      C      D      E
  # selected    61     21    *94      5      0
  % selected  33.70  11.60  51.93   2.76   0.00
  Point Bis.  -0.10  -0.19   0.18   0.12   0.00
  Disc. Index -0.12  -0.13   0.22   0.04   0.00
  Upper 27%    0.26   0.04   0.64   0.06   0.00
  Lower 27%    0.38   0.17   0.42   0.02   0.00
Item #34 (correct answer: B)
Diff(p) 0.71 | Upper 27%: 90.00% | Lower 27%: 55.77% | Disc. Index 0.34 | Point Biserial 0.31

Response frequencies (* indicates correct answer):

                 A      B      C      D      E
  # selected     0   *129      1     30     21
  % selected   0.00  71.27   0.55  16.57  11.60
  Point Bis.   0.00   0.31  -0.16  -0.25  -0.11
  Disc. Index  0.00   0.34  -0.02  -0.23  -0.09
  Upper 27%    0.00   0.90   0.00   0.06   0.04
  Lower 27%    0.00   0.56   0.02   0.29   0.13
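One useful sanity check when reading reports like these: the discrimination index should equal Upper 27% minus Lower 27%, and the item-level discrimination index and point biserial should match those of the correct option. A quick check against four of the example items (values as read from the tables, to two decimals):

```python
# (p, upper, lower, disc) for four of the example items above
examples = {
    7:  (0.66, 0.82, 0.46, 0.36),
    22: (0.36, 0.52, 0.27, 0.25),
    24: (0.52, 0.64, 0.42, 0.22),
    34: (0.71, 0.90, 0.56, 0.34),
}
for item, (p, upper, lower, disc) in examples.items():
    # allow a little slack for two-decimal rounding in the report
    assert abs((upper - lower) - disc) < 0.01, f"item {item} inconsistent"
```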
GENERAL GUIDELINES

Desired statistical ranges: opinions differ, but the most commonly used are:
• Item Difficulty/p Value: acceptable difficulty is not a set number; it depends on the question's intention. If the item is meant to be a mastery item, you want the difficulty as close to 1.00 as possible. If you want a discriminating question, significantly lower values are acceptable.
• Upper 27%: if fewer than 60% of your top performers answer a question correctly, further analysis is needed to see whether there are issues with the question. Likewise, if fewer of your upper 27% answer correctly than your lower 27%, there could be an issue.
• Lower 27%: generally this should never be higher than the Upper 27%. A value as low as 0% can be acceptable, and a value as high as 100% can be acceptable if it is a mastery question.
GENERAL GUIDELINES

Desired statistical ranges: opinions differ, but the most commonly used are:
• Discrimination Index: some set specific cutoffs for acceptable and unacceptable values; a more accurate guide is that the lower the p value, the higher the discrimination index needs to be. Generally, at 0.2 an item is considered to have discriminated, below 0.2 it is considered non-discriminating, and 0.3 or greater is considered highly discriminating.
• Point-Biserial: as with the discrimination index, some set specific cutoffs for acceptable and unacceptable values. Generally, 0.2 and above is considered to discriminate and to have a positive association with overall performance on the assessment; lower values are acceptable for mastery questions, and 0.3+ is desired for discriminating questions.
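These guidelines lend themselves to a simple screening rule. A hypothetical flagger (the thresholds are the commonly cited ones above, not hard rules; the `mastery` parameter relaxes the discrimination checks, since mastery items are expected to discriminate little):

```python
def review_flags(p, upper, lower, disc, rpb, mastery=False):
    """Return reasons an item deserves further review, per the guidelines."""
    flags = []
    if upper < 0.60:
        flags.append("fewer than 60% of top performers answered correctly")
    if lower > upper:
        flags.append("lower 27% outperformed upper 27%")
    if not mastery and disc < 0.20:
        flags.append("discrimination index below 0.20")
    if not mastery and rpb < 0.20:
        flags.append("point-biserial below 0.20")
    return flags

# A hard but still-discriminating item: only the Upper-27% rule fires
flags = review_flags(p=0.36, upper=0.52, lower=0.27, disc=0.25, rpb=0.22)
```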
KR-20
• Used as an overall measure of reliability for the assessment.
• Measured on a scale from 0.0 to 1.0 with 0.0 being very poor and 1.0 being excellent.
• Quick notes:
1. Heavily influenced by the number of questions in the assessment
2. Heavily influenced by the number of students taking the assessment
3. The combination can FREQUENTLY lead to false positive and false negative KR-20 values.
GENERAL GUIDELINES
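For reference, KR-20 for dichotomously scored items is k/(k-1) * (1 - sum(p_i * q_i) / var), where k is the number of items, p_i the proportion correct on item i, q_i = 1 - p_i, and var the variance of total scores. A minimal sketch over a small hypothetical score matrix, which, per the notes above, is exactly the situation where the resulting value is least trustworthy:

```python
def kr20(matrix):
    """KR-20 reliability; rows are students, columns are 0/1 item scores."""
    n_items = len(matrix[0])
    totals = [sum(row) for row in matrix]
    mean_t = sum(totals) / len(totals)
    var_t = sum((t - mean_t) ** 2 for t in totals) / len(totals)
    pq = 0.0
    for j in range(n_items):
        p_j = sum(row[j] for row in matrix) / len(matrix)  # item difficulty
        pq += p_j * (1 - p_j)
    return (n_items / (n_items - 1)) * (1 - pq / var_t)

# Tiny hypothetical matrix: 5 students x 4 items
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(answers)   # 0.8 for this toy data
```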
BEST PRACTICES

Ways to increase the accuracy/usefulness of your stats:
• Item review process
  – Format
  – Level of difficulty
  – Alternative correct options
• Historical item analysis
  – Across assessments
  – Across versions
• Reuse/Recycle
NEW PORTAL
PHONE: +1.954.429.8889
EMAIL: info@examsoft.com
WEBSITE: learn.examsoft.com