3. TEST DEVELOPMENT PROCESS
1. Review National and Professional Standards
2. Convene National Advisory Committee
3. Develop Domain, Knowledge and Skills Statements
4. Conduct Needs Analysis
5. Construct Table of Specifications
6. Develop Test Design
7. Develop New Test Questions
8. Review Test Questions
9. Assemble Operational Test Forms
10. Produce Printed Test Materials
11. Administer Tests
12. Conduct Item Analysis
13. Standard Setting Study
14. Set Passing Standard
4. WHAT IS ITEM ANALYSIS?
A process that examines student responses to individual
test items in order to assess the quality of the items
and of the test as a whole.
Item analysis is valuable for improving items that will be
used again in later tests and for eliminating
ambiguous or misleading items.
It is also valuable for increasing instructors' skill in
test construction, and for
identifying specific areas of course content
which need greater emphasis or clarity.
5. SEVERAL PURPOSES
1. More diagnostic information on students
• Classroom level:
• determine which questions most students found very
difficult or guessed on • reteach that concept
• determine which questions all students got right
• don't waste more class time on that area
• find which wrong answers students are choosing • identify common misconceptions
• Individual level:
• isolate the specific errors each student made
6. 2. Build future tests, revise test
items to make them better
• recognize how much work goes into writing good
questions
• SHOULD NOT REUSE WHOLE TESTS -->
diagnostic teaching means responding to the
needs of the students, so after a few years a test
bank is built up and you choose tests to suit the
class
• you can spread difficulty levels across your
blueprint (TOS)
7. 3. Part of continuing professional
development
• doing occasional item analysis will help you
become a better test writer
• it documents just how good your evaluation
is
• useful for dealing with parents or
administrators if there's ever a dispute
• once you can bring out these
statistics, parents and
administrators are more likely to accept
why some students failed.
9. TEST-LEVEL STATISTICS
Quality of the Test
• Reliability and Validity
• Reliability: consistency of measurement
• Validity: truthfulness of response
• Overall Test Quality
• Individual Item Quality
10. RELIABILITY
• refers to the extent to which the test is likely to
produce consistent scores.
Characteristics:
1. The intercorrelations among the items -
the greater the number and strength of
positive relationships among items, the greater
the reliability.
2. The length of the test -
a test with more items will have higher
reliability, all other things being equal.
11. 3. The content of the test -
generally, the more diverse the
subject matter tested and the testing
techniques used, the lower the
reliability.
4. Heterogeneous groups of test takers -
the more heterogeneous the group, the greater
the spread of scores and the higher the reliability.
13. • Stability
2. Inter-rater / Observer / Scorer
• applicable mostly to essay questions
• use Cohen's Kappa statistic
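As a rough illustration, Cohen's Kappa can be computed from two raters' marks on the same set of essays. A minimal Python sketch; the rating data below are entirely hypothetical:

# Cohen's Kappa sketch for two raters scoring the same essays.
# The ratings are hypothetical, for illustration only.
from collections import Counter

rater_a = ["pass", "pass", "fail", "pass", "fail", "pass"]
rater_b = ["pass", "fail", "fail", "pass", "fail", "pass"]

n = len(rater_a)
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # observed agreement

# Expected chance agreement, from each rater's marginal proportions.
pa, pb = Counter(rater_a), Counter(rater_b)
expected = sum((pa[c] / n) * (pb[c] / n) for c in set(rater_a) | set(rater_b))

kappa = (observed - expected) / (1 - expected)
print(f"Cohen's Kappa = {kappa:.3f}")  # 1.0 = perfect agreement, 0 = chance level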
14. • Equivalence
3. Parallel Forms / Equivalent Forms
Used to assess the consistency of the results of
two tests constructed in the same way from the
same content domain.
15. • Internal Consistency
• Used to assess the consistency of results across items
within a test.
4. Split-Half
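A minimal Python sketch of the split-half approach, assuming a hypothetical 0/1 score matrix (rows = students, columns = items): the test is split into odd and even items, the two half-scores are correlated, and the Spearman-Brown formula projects the correlation to full test length.

# Split-half reliability sketch; the score matrix is hypothetical.
scores = [
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1],
]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Score each half: odd-numbered items vs. even-numbered items.
odd_half = [sum(row[0::2]) for row in scores]
even_half = [sum(row[1::2]) for row in scores]

r_half = pearson(odd_half, even_half)
# Spearman-Brown correction projects the half-test r to full length.
r_full = 2 * r_half / (1 + r_half)
print(f"half-test r = {r_half:.3f}, split-half reliability = {r_full:.3f}")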
16. • 5. Kuder-Richardson Formula 20 / 21
The correlation is determined from a
single administration of a test
through a study of score variances.
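KR-20 follows directly from the item and total-score variances: KR-20 = (k / (k - 1)) * (1 - sum(p*q) / variance of total scores). A minimal Python sketch with a hypothetical 0/1 score matrix:

# KR-20 sketch from a single administration; the score matrix is hypothetical.
# Rows = students, columns = items (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0],
    [0, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
]
n, k = len(scores), len(scores[0])

# p = proportion answering each item correctly, q = 1 - p.
p = [sum(row[j] for row in scores) / n for j in range(k)]
sum_pq = sum(pj * (1 - pj) for pj in p)

# Variance of the total scores (population variance used here).
totals = [sum(row) for row in scores]
mean = sum(totals) / n
var_total = sum((t - mean) ** 2 for t in totals) / n

kr20 = (k / (k - 1)) * (1 - sum_pq / var_total)
print(f"KR-20 = {kr20:.3f}")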
18. RELIABILITY INDICES AND INTERPRETATION
.91 and above: Excellent reliability; at the level of the best
standardized tests.
.81 - .90: Very good for a classroom test.
.71 - .80: Good for a classroom test; in the range of most
classroom tests. There are probably a few items which
could be improved.
.61 - .70: Somewhat low. This test needs to be supplemented by
other measures (e.g., more tests) to determine
grades. There are probably some items which could
be improved.
.51 - .60: Suggests a need for revision of the test, unless it is
quite short (ten or fewer items). The test definitely needs
to be supplemented by other measures (e.g., more tests)
for grading.
.50 or below: Questionable reliability. This test should not
contribute heavily to the course grade, and it needs revision.
19. TEST ITEM STATISTICS
Item Difficulty
• Percent answering correctly
Item Discrimination
• How well the item "functions"
• How "valid" the item is, based on
the total-test-score criterion
20. WHAT IS A WELL-FUNCTIONING
TEST ITEM?
• How many students got it correct?
(DIFFICULTY)
• Which students got it correct?
(DISCRIMINATION)
21. THREE IMPORTANT PIECES OF INFORMATION
ON THE QUALITY OF TEST ITEMS
• Item difficulty: measures whether an item was
too easy or too hard.
• Item discrimination: measures whether an item
discriminated between students who knew the
material well and students who did not.
• Effectiveness of alternatives: determines
whether distractors (incorrect but plausible
answers) tend to be marked by the less able
students and not by the more able students.
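A minimal sketch of a distractor analysis in Python (the responses, scores, and option labels are hypothetical): it tallies how often each alternative is chosen by the upper- and lower-scoring groups.

# Distractor analysis sketch; all data are hypothetical.
# Each tuple: (total test score, option chosen on the item under review).
responses = [
    (48, "B"), (45, "B"), (44, "B"), (41, "C"), (40, "B"),
    (25, "A"), (22, "C"), (20, "C"), (18, "D"), (15, "C"),
]
correct_option = "B"

# Sort by total score; take the top and bottom halves as the two groups.
responses.sort(key=lambda r: r[0], reverse=True)
half = len(responses) // 2
upper, lower = responses[:half], responses[half:]

for option in "ABCD":
    n_upper = sum(1 for _, c in upper if c == option)
    n_lower = sum(1 for _, c in lower if c == option)
    tag = "(key)" if option == correct_option else ""
    print(f"Option {option} {tag}: upper = {n_upper}, lower = {n_lower}")

# A plausible distractor is chosen mostly by the lower group;
# one chosen mainly by the upper group may signal a flawed item.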
22. ITEM DIFFICULTY
• Item difficulty is simply the percentage of
students who answer an item correctly. For an
item scored 0/1, it is also equal to the item
mean, expressed as a percentage.
Diff = (# of students choosing correctly / total # of students) x 100
• The item difficulty index ranges from 0 to 100;
the higher the value, the easier the question.
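The computation in Python, as a minimal sketch with a hypothetical list of 0/1 responses for one item:

# Item difficulty sketch; the responses are hypothetical.
# 1 = correct, 0 = incorrect, one entry per student.
item_responses = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

difficulty = 100 * sum(item_responses) / len(item_responses)
print(f"Difficulty index = {difficulty:.0f}%")  # 70%: the higher, the easier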
23. ITEM DIFFICULTY LEVEL: DEFINITION
The percentage of students who answered
the item correctly, on a scale from 0 to 100:
High (Difficult): <= 30%
Medium (Moderate): > 30% and < 80%
Low (Easy): >= 80%
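These cutoffs translate directly into a small helper function (a Python sketch; the function name is our own):

# Classify an item by its difficulty index, using the cutoffs above.
def difficulty_level(pct_correct: float) -> str:
    if pct_correct <= 30:
        return "High (Difficult)"
    if pct_correct >= 80:
        return "Low (Easy)"
    return "Medium (Moderate)"

for pct in (30, 50, 70, 90):
    print(pct, "->", difficulty_level(pct))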
24. ITEM DIFFICULTY LEVEL: SAMPLE
Number of students who answered each item = 50

Item No.   No. Correct Answers   % Correct   Difficulty Level
1          15                    30          High
2          25                    50          Medium
3          35                    70          Medium
4          45                    90          Low
25. ITEM DIFFICULTY LEVEL:
QUESTIONS/DISCUSSION
• Is a test that nobody failed too
easy?
• Is a test on which nobody got 100%
too difficult?
• Should items that are “too easy” or
“too difficult” be thrown out?
26. ITEM DISCRIMINATION
• Traditionally computed using high- and low-scoring
groups (upper 27% and lower 27%).
• Computerized analyses provide a more accurate
assessment of the discrimination power of items,
since they account for all responses rather than just
the high- and low-scoring groups.
• This is equivalent to the point-biserial correlation,
which estimates the degree to which an individual item
measures the same thing as the rest of the items.
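A minimal Python sketch of both approaches (scores and item responses are hypothetical): the upper/lower 27% index D = p_upper - p_lower, and the point-biserial correlation between the item and the total score.

# Item discrimination sketch; all data are hypothetical.
# Each tuple: (total test score, item response: 1 = correct, 0 = incorrect).
data = [(50, 1), (47, 1), (45, 1), (42, 1), (40, 0), (38, 1),
        (35, 0), (30, 1), (26, 0), (22, 0), (20, 0), (15, 0)]

# --- Upper/lower 27% method ---
data.sort(key=lambda d: d[0], reverse=True)
g = max(1, round(0.27 * len(data)))           # group size
upper, lower = data[:g], data[-g:]
d_index = (sum(r for _, r in upper) - sum(r for _, r in lower)) / g
print(f"D (upper-lower 27%) = {d_index:.2f}")

# --- Point-biserial correlation (item vs. total score) ---
scores = [s for s, _ in data]
items = [r for _, r in data]
n = len(data)
ms, mi = sum(scores) / n, sum(items) / n
cov = sum((s - ms) * (r - mi) for s, r in data) / n
sd_s = (sum((s - ms) ** 2 for s in scores) / n) ** 0.5
sd_i = (mi * (1 - mi)) ** 0.5                 # SD of a 0/1 variable
r_pb = cov / (sd_s * sd_i)
print(f"point-biserial r = {r_pb:.2f}")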
27. WHAT IS ITEM
DISCRIMINATION?
• Generally, students who did well on the
exam should select the correct answer to
any given item on the exam.
• The Discrimination Index distinguishes for
each item between the performance of
students who did well on the exam and
students who did poorly.
28. INDICES OF DIFFICULTY AND
DISCRIMINATION
(BY HOPKINS AND ANTES)

Index            Difficulty       Discrimination
0.86 and above   Very Easy        To be discarded
0.71 - 0.85      Easy             To be revised
0.30 - 0.70      Moderate         Very good items
0.15 - 0.29      Difficult        To be revised
0.14 and below   Very Difficult   To be discarded
29. ITEM DISCRIMINATION:
QUESTIONS / DISCUSSION
• What factors could contribute to
low item discrimination between
the two groups of students?
• What is a likely cause for a
negative discrimination index?
32. STEPS IN ITEM ANALYSIS
1. Code the test items:
- 1 for correct and 0 for incorrect
- columns (vertical) = item numbers
- rows (horizontal) = respondents/students
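As a sketch, step 1 amounts to building a 0/1 matrix from the answer key (the student answers and key below are hypothetical):

# Step 1 sketch: code raw answers into a 0/1 matrix.
# Rows = students, columns = items; the answers and key are hypothetical.
answer_key = ["B", "D", "A", "C"]
student_answers = {
    "student_01": ["B", "D", "A", "A"],
    "student_02": ["B", "C", "A", "C"],
    "student_03": ["A", "D", "B", "C"],
}

matrix = {
    student: [1 if ans == key else 0 for ans, key in zip(answers, answer_key)]
    for student, answers in student_answers.items()
}

for student, row in matrix.items():
    print(student, row)   # e.g. student_01 [1, 1, 1, 0]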
35. Sample SPSS reliability output:

****** Method 1 (space saver) will be used for this analysis ******
R E L I A B I L I T Y   A N A L Y S I S  -  S C A L E  (A L P H A)

Item-total Statistics

            Scale Mean   Scale Variance   Corrected     Alpha
            if Item      if Item          Item-Total    if Item
            Deleted      Deleted          Correlation   Deleted
VAR00001    14.4211      127.1053         .9401         .9502
VAR00002    14.6316      136.8440         .7332         .9542
VAR00022    14.4211      129.1410         .7311         .9513
VAR00023    14.4211      127.1053         .4401         .9502
VAR00024    14.6316      136.8440         -.0332        .9542
VAR00047    14.4737      128.6109         .8511         .9508
VAR00048    14.4737      128.8252         .8274         .9509
VAR00049    14.0526      130.6579         .5236         .9525
VAR00050    14.2105      127.8835         .7533         .9511

Reliability Coefficients
N of Cases = 57.0          N of Items = 50
Alpha = .9533
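For readers without SPSS, the two key columns of that output can be reproduced by hand. A minimal Python sketch (the 0/1 score matrix is hypothetical) computing Cronbach's Alpha and each item's corrected item-total correlation:

# Sketch reproducing Alpha and the corrected item-total correlation.
# Rows = students, columns = items; the data are hypothetical.
scores = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1],
    [0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
]
n, k = len(scores), len(scores[0])

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / ((sum((a - mx) ** 2 for a in x) *
                   sum((b - my) ** 2 for b in y)) ** 0.5)

# Cronbach's Alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
item_vars = [variance([row[j] for row in scores]) for j in range(k)]
totals = [sum(row) for row in scores]
alpha = (k / (k - 1)) * (1 - sum(item_vars) / variance(totals))
print(f"Alpha = {alpha:.4f}")

# Corrected item-total correlation: item vs. total of the REMAINING items.
for j in range(k):
    item = [row[j] for row in scores]
    rest = [sum(row) - row[j] for row in scores]
    print(f"item {j + 1}: corrected item-total r = {pearson(item, rest):.4f}")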
36. 3. In the output:
• Alpha is placed at the bottom.
• The corrected item-total correlation is the
point-biserial correlation, which serves as the
basis for the index of test reliability.
37. 4. Count the number
of items discarded
and fill in the summary
item analysis table.
38. TEST ITEM RELIABILITY ANALYSIS
SUMMARY (SAMPLE)
Test: Math (50 items)

Level of Difficulty   Number of Items   %    Item Numbers
Very Easy             1                 2    1
Easy                  2                 4    2, 5
Moderate              10                20   3, 4, 10, 15, ...
Difficult             30                60   6, 7, 8, 9, 11, ...
Very Difficult        7                 14   16, 24, 32, ...
39. 5. Count the number of
items retained based on
the cognitive domains
in the TOS. Compute the
percentage per level of
difficulty.
41. • Realistically: do item analysis on
your most important tests
• end-of-unit tests, final exams -->
summative evaluation
• common exams with other teachers
(departmentalized exams)
• common exams give a bigger
sample to work with, which is good
• and they make sure that questions other
teachers prepared are working for
your class
42. ITEM ANALYSIS is one area where
even a lot of otherwise very good
classroom teachers fall down:
• they think they're doing a good job;
• they think they're doing good
evaluation;
• but without doing item analysis,
• they don't really know.
43. ITEM ANALYSIS is not an
end in itself:
• there is no point unless you use it
to revise items, and
• to help students on the basis
of the information you get out
of it.