2. INTRODUCTION
• Item analysis is an important (probably the most important) tool for increasing test effectiveness. The contribution each item makes to the test is analyzed and assessed.
3. DEFINITION
• Item analysis is the process of evaluating single test items by any of several methods. It usually involves determining how well an individual item separates examinees, its relative difficulty value, and its correlation with some criterion of measurement.
--- Donald Clark
4. PURPOSES
• To identify and change test items or distracters which are not performing well.
• To identify objectives which students did not learn properly and which will need to be re-taught and re-assessed.
6. RELIABILITY OF A TEST
• Reliability refers to the consistency of test
scores; how consistent a particular student’s test
scores are from one testing to another.
• To increase the likelihood of obtaining higher reliability, a teacher can:
increase the length of the test,
include questions that measure higher, more complex levels of learning,
include questions with a range of difficulty, with most questions in the middle range, and
if one or more essay questions are included on the test, grade them as objectively as possible.
7. VALIDITY OF A TEST
• Content or curricular validity is generally
used to assess whether a classroom test is
measuring what it is supposed to measure.
• A quantitative method of assessing test validity is to examine each test item. This is accomplished by reviewing each item's discrimination index (ID).
8. IMPORTANCE OF RELIABILITY AND VALIDITY
• High reliability means that the questions of a
test tended to "pull together."
• Low reliability means that the questions
tended to be unrelated to each other in
terms of who answered them correctly.
9. GENERAL GUIDELINES CAN BE USED TO INTERPRET RELIABILITY COEFFICIENTS
Reliability / Interpretation:
.90 and above: Excellent reliability; at the level of the best standardized tests.
.80 to .90: Very good for a classroom test.
.70 to .80: Good for a classroom test; in the range of most classroom tests. There are probably a few items which could be improved.
.60 to .70: Somewhat low. This test needs to be supplemented by other measures (e.g., more tests) to determine grades.
.50 or below: Questionable reliability. This test should not contribute heavily to the course grade, and it needs revision.
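The slides do not say how such a reliability coefficient is computed; for a classroom test whose items are scored 0/1, one common choice is Kuder-Richardson 20 (KR-20). A minimal Python sketch, with illustrative names:

```python
def kr20(score_matrix):
    """KR-20 reliability for dichotomously scored (0/1) items.

    score_matrix: one row per student, each row a list of 0/1 item scores.
    """
    n_students = len(score_matrix)
    n_items = len(score_matrix[0])

    # Variance of the students' total scores (population variance).
    totals = [sum(row) for row in score_matrix]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students

    # Sum of item variances p * (1 - p), where p is the proportion correct.
    pq_sum = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in score_matrix) / n_students
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)
```

A value of, say, .85 from this function would fall in the "very good for a classroom test" band above.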
10. STEPS INVOLVED IN ITEM ANALYSIS
• Award a score to each student.
• Arrange the answer sheets in order of merit, from the highest score to the lowest score.
• Make two groups, i.e. the highest scorers in one group and the lowest scorers in the other, or simply the top and bottom halves, to identify the high and low groups (see the sketch below).
• For each item, count the number of students in each group who answered the item correctly. For alternative-response items, count the number of students in each group who chose each alternative.
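A Python sketch of the ranking-and-grouping step (the 27% split used for the upper/lower figures later in the slides is one common convention; names are illustrative):

```python
def split_groups(scores, fraction=0.27):
    """Rank students by total score and return (upper, lower) groups.

    scores: dict mapping student id to total score.
    fraction: share of students placed in each extreme group
              (use 0.5 for top and bottom halves).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]

upper, lower = split_groups({"s1": 48, "s2": 35, "s3": 29, "s4": 44, "s5": 22})
print(upper, lower)  # ['s1'] ['s5']
```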
11. OPTIMAL ITEM DIFFICULTY
• “Item difficulty” is the percentage of the total group that got the item correct.
• For each item, the percentage of students who got the item correct is called the “item difficulty index.”
12. CALCULATION OF DIFFICULTY INDEX OF A
QUESTION
• D = R/N x 100
• R: Number of pupils who answered the item correctly
• N: Number of pupils who attempted the item
The higher the difficulty index, the easier the item.
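A direct translation of the formula into Python (names are illustrative):

```python
def difficulty_index(r, n):
    """D = R/N x 100: percentage of attempting pupils who answered correctly."""
    return r / n * 100

# 36 of 45 pupils answered the item correctly: D = 80.0, a fairly easy item.
print(difficulty_index(36, 45))
```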
13. DIFFICULTY LEVEL
• The difficulty level (facility level) of a test is an index of how easy or difficult the test is from the point of view of the teachers.
Difficulty level = (Average score on the test / Maximum possible score) x 100
14. DIFFICULTY INDEX
Difficulty index = ((H + L) / N) x 100
H: Number of correct answers from the high group
L: Number of correct answers from the low group
N: Total number of students in both groups
Find the facility value of the objective items first:
Facility value = (Number of students answering the question correctly / Number of students who have taken the test) x 100
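Both formulas on this slide, sketched in Python (names are illustrative):

```python
def facility_value(n_correct, n_taken):
    """Percentage of all test takers who answered the question correctly."""
    return n_correct / n_taken * 100

def difficulty_index_groups(h_correct, l_correct, n_total):
    """(H + L) / N x 100, computed from the high and low groups only."""
    return (h_correct + l_correct) / n_total * 100

# 12 of the high group and 6 of the low group answered correctly,
# with 40 students across both groups: (12 + 6) / 40 x 100 = 45.0.
print(difficulty_index_groups(12, 6, 40))
```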
15. ITEM DISCRIMINATION I
• The discriminating power (validity index) of an item refers to the degree to which it discriminates among students who differ sharply in the functions measured by the test as a whole.
• The difference between the upper and lower groups' percentage correct is one measure of item discrimination (ID). The formula is:
• ID = (Upper Group % Correct) – (Lower Group % Correct)
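The same calculation from raw counts, as a Python sketch (names are illustrative):

```python
def item_discrimination(upper_correct, n_upper, lower_correct, n_lower):
    """ID = (upper group % correct) - (lower group % correct)."""
    return upper_correct / n_upper * 100 - lower_correct / n_lower * 100

# 16 of 20 upper-group and 6 of 20 lower-group students answered
# correctly: ID = 80 - 30 = 50 percentage points.
print(item_discrimination(16, 20, 6, 20))
```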
16. FORMULAS
• DI = (RU – RL) / (N/2)
RU: Number of correct responses from the upper group
RL: Number of correct responses from the lower group
N: Total number of pupils in both groups who attempted the item
• DI = (HAQ – LAQ) / HAG
HAQ: Number of students in the high-ability group answering the question correctly
LAQ: Number of students in the low-ability group answering the question correctly
HAG: Number of students in the high-ability group
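The two formulas in code form; note that the first divides by half the combined group size while the second divides by the size of the high group alone, so they agree when the two groups are equal in size (names are illustrative):

```python
def di_half_n(ru, rl, n):
    """DI = (RU - RL) / (N / 2), with N counting both groups together."""
    return (ru - rl) / (n / 2)

def di_high_group(haq, laq, hag):
    """DI = (HAQ - LAQ) / HAG, normalising by the high-group size."""
    return (haq - laq) / hag

# 15 upper-group and 6 lower-group correct answers, 40 pupils in total
# (20 per group): both forms give 0.45.
print(di_half_n(15, 6, 40))      # 0.45
print(di_high_group(15, 6, 20))  # 0.45
```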
22. ITEM DISCRIMINATION II
The point biserial correlation (PBC) measures
the correlation between the correct answer
(viewed as 1 = right and 0 = wrong) on an
item and the total test score of all students.
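A self-contained Python sketch using the standard point biserial formula, r = (M_correct - M_all) / s x sqrt(p/q); names are illustrative:

```python
def point_biserial(item_scores, total_scores):
    """Correlation between a 0/1 item and the students' total test scores."""
    n = len(item_scores)
    p = sum(item_scores) / n          # proportion who got the item right
    q = 1 - p
    mean_all = sum(total_scores) / n
    sd_all = (sum((t - mean_all) ** 2 for t in total_scores) / n) ** 0.5
    # Mean total score of the students who answered the item correctly.
    mean_correct = (sum(t for s, t in zip(item_scores, total_scores) if s)
                    / sum(item_scores))
    return (mean_correct - mean_all) / sd_all * (p / q) ** 0.5

item = [1, 1, 0, 1, 0, 0, 1, 1]
totals = [42, 38, 25, 40, 22, 30, 36, 44]
print(point_biserial(item, totals))
```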
23. CRITERIA USED TO EVALUATE
PBC / Interpretation:
.30 and above: Very good items
.20 to .29: Reasonably good items, but subject to improvement
.10 to .19: Marginal items, usually needing improvement
.00 to .09: Poor items, to be rejected or revised
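The table translated into a small helper (the cut-offs are exactly those above):

```python
def interpret_pbc(pbc):
    """Map a point biserial value to the interpretation bands above."""
    if pbc >= 0.30:
        return "Very good item"
    if pbc >= 0.20:
        return "Reasonably good item, but subject to improvement"
    if pbc >= 0.10:
        return "Marginal item, usually needing improvement"
    return "Poor item, to be rejected or revised"

print(interpret_pbc(0.24))  # Reasonably good item, but subject to improvement
```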
24. DISTRACTERS AND THEIR EFFECTIVENESS
• The effectiveness of a multiple-choice
question is heavily dependent on its
distracters. It is, therefore, important for
teachers to observe how many students
select each distracter and to revise
those that draw little or no attention.
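A sketch of the tallying this suggests, assuming each response is recorded as the option letter chosen (names are illustrative):

```python
from collections import Counter

def distracter_counts(responses):
    """Count how often each option, including the key, was chosen."""
    return Counter(responses)

# 'C' is the key; 'B' drew one student and 'D' drew none, so 'D'
# is a candidate for revision.
print(distracter_counts(["C", "A", "C", "B", "C", "A", "C", "C"]))
```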
25. USING ITEM ANALYSIS RESULTS
• It helps to judge the worth or quality of a test.
• It aids in subsequent test revisions.
• It leads to increased skill in test construction.
• It provides diagnostic value and helps in planning future learning activities.
• It provides a basis for discussing test results.
• It informs decisions about the promotion of students to the next higher grades.
• It helps bring about improvement in teaching methods and techniques.
26. UNDERSTANDING PERFORMANCE OF
INDIVIDUAL ITEMS
• Look at the Total Group % correct.
• Point biserial is a correlation coefficient which indicates the relationship between each individual item and the test as a whole. This number varies between -1 and +1. Items with a correlation above about .2 are generally acceptable.
• The upper and lower 27% figures are similar to the correlation in what they show. These two numbers indicate the percentage of top scorers who got an item correct and the percentage of bottom scorers who got an item correct.
• Distracter analysis indicates the percentage of test takers who selected each answer option.
27. ITEM ANALYSIS GUIDELINES
• Item analysis gives necessary but not
sufficient information concerning the
appropriateness of an item as a measure of
intended outcomes of instruction.
• An item must be of appropriate difficulty for
the students to whom it is administered.
• An item should discriminate between upper and lower groups.
• All of the incorrect options, or distracters,
should actually be distracting.