Analyzing a Multiple Choice Test

LANGUAGE TESTING II
ANALYSIS OF MULTIPLE CHOICE ITEM TEST
Lecturer : Nuri Atiningsih, S.Pd, M.Pd
Written by:
Lia Suparni (11321157/6E)
Siti Purwaningsih (11321159/6E)
ENGLISH TEACHING DEPARTMENT
FACULTY OF LETTERS AND ARTS EDUCATION
IKIP PGRI MADIUN
2014

ANALYSIS OF ITEM TEST
We constructed a multiple choice item test to evaluate seventh-grade students of
Junior High School. We also analyzed the items to determine their validity, reliability,
index of difficulty, distinguishing characteristic, and the pattern of respondents’ choice.
The data and blueprint of the item test are given below.
Level of school : Junior High School
Grade : VII
Lesson : English
Period : Odd-semester midterm test
Kind of item test : Multiple choice
Total item test : 40 items
Respondent : 10 people
Blueprint of item test :
Material                                      Item number    Total
Unit 1 (22 items)
Greeting                                      6, 20          2
Leave taking                                  7, 21          2
Expression of introducing oneself             5, 32          2
Expression of introducing others              2, 31          2
Asking for people’s identity                  19, 37, 40     3
Command/ request                              8, 22, 33      3
Prohibition                                   13, 30, 38     3
Greeting card                                 12, 36         2
Simple present tense                          1, 11, 26      3
Unit 2 (18 items)
Expression of asking for information          9, 27          2
Expression of giving information              10, 23         2
Direction                                     4, 15, 16      3
Expression of thanking                        28, 34, 39     3
Expression of apologizing                     14, 29, 35     3
Expression of showing politeness              3, 24          2
Positive, negative, interrogative sentence    17, 18, 25     3
Total                                                        40
I. ANALYSIS OF VALIDITY
Validity analysis determines whether each item of the test is valid. A test is valid if it
measures what it is intended to measure. Valid items are kept for use in a test, while
invalid items are omitted and replaced by other items. An item is valid if rxy > rtable.
With n = 10 respondents, rtable is 0.632. The formula of rxy is:

$$r_{XY} = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left(N\sum X^2 - (\sum X)^2\right)\left(N\sum Y^2 - (\sum Y)^2\right)}}$$
rxy = correlation coefficient between variables X and Y
N = number of respondents
ΣXY = sum of the products of X and Y
ΣX = sum of the X scores
ΣY = sum of the Y scores
ΣX² = sum of the squared X scores
ΣY² = sum of the squared Y scores
X = score on item i
Y = total score
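The formula above can be sketched in code. The following is a minimal illustration, not the computation used for appendix 2: the function names and the sample scores are hypothetical, and treating an item with zero variance as invalid is an assumption made here to avoid division by zero.

```python
from math import sqrt

R_TABLE = 0.632  # critical value quoted in the text for 10 respondents


def item_total_correlation(x, y):
    """Pearson correlation between item scores x (0 or 1) and total scores y."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        # Assumption: an item everyone answers the same way has no variance,
        # so the correlation is undefined; it is reported here as 0.0.
        return 0.0
    return numerator / denominator


def is_valid(x, y, r_table=R_TABLE):
    """An item is valid when rxy exceeds rtable."""
    return item_total_correlation(x, y) > r_table


# Hypothetical data for one item and 10 respondents (not taken from the appendix).
item_scores = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
total_scores = [38, 36, 35, 33, 30, 28, 25, 22, 20, 18]
print(round(item_total_correlation(item_scores, total_scores), 3))
print(is_valid(item_scores, total_scores))
```

In this made-up example the five highest scorers answer the item correctly, so the correlation is high and the item counts as valid.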
Result of validity analysis:
Criterion   Item Number                                    Total   Percentage (%)
Valid       26, 27, 31                                     3       7.5
Invalid     1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,     37      92.5
            14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
            24, 25, 28, 29, 30, 32, 33, 34, 35, 36,
            37, 38, 39, 40
The full data can be seen in appendix 2. With rtable = 0.632, 3 items are valid and 37
items are invalid. The items may be invalid because the questions are too difficult for the
students, or because the students have low ability, so they cannot answer the questions
correctly. The invalid items should be omitted and replaced with other items before the
test is used to evaluate the students.
II. ANALYSIS OF RELIABILITY
Test reliability refers to the degree to which a test is consistent and stable in
measuring what it is intended to measure. Most simply put, a test is reliable if it is
consistent within itself and across time. The reliability can be computed with
Kuder-Richardson formula 21:
$$r = \frac{k}{k-1}\left[1 - \frac{M(k-M)}{kS^2}\right]$$

r : reliability
k : the number of items in the test
M : the mean of the total scores
S² : the variance of the total scores
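As an illustration, Kuder-Richardson formula 21 can be computed from the respondents' total scores alone. This is a sketch under stated assumptions: the function name and the sample scores are hypothetical, and the variance is taken as the population variance (dividing by the number of respondents), which the text does not specify.

```python
def kr21(total_scores, k):
    """Kuder-Richardson formula 21 for a test with k items."""
    n = len(total_scores)
    mean = sum(total_scores) / n
    # Assumption: population variance (divide by n), not sample variance.
    variance = sum((s - mean) ** 2 for s in total_scores) / n
    if variance == 0:
        raise ValueError("all total scores are equal; reliability is undefined")
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))


# Hypothetical total scores of 10 respondents on a 40-item test.
total_scores = [38, 36, 35, 33, 30, 28, 25, 22, 20, 18]
print(round(kr21(total_scores, k=40), 3))
```

With these made-up scores the mean is 28.5 and the variance is 44.85, which gives a coefficient of about 0.84.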
The reliability level is between 0 and 1.00. The criteria of reliability are:
r < 0.4 : very low
0.4 < r < 0.6 : low
0.6 < r < 0.8 : high
r > 0.8 : very high
A reliability level of 1.00 shows that the test is perfectly reliable.
The result of reliability analysis:
Criterion   Item Number                                    Total   Percentage (%)
Very high   1, 2, 4, 7, 8, 9, 11, 12, 13, 14, 16, 17,      21      52.5
            19, 20, 22, 24, 28, 29, 33, 39, 40
High        5, 6, 15, 18, 21, 23, 26, 27, 31, 32,          15      37.5
            34, 35, 36, 37, 38
Low         3, 10, 25, 30                                  4       10
Very low    -                                              0       0
The reliability analysis can be seen in appendix 3. From the data above, most of the
items fall in the very high and high categories: 21 items have very high reliability, 15
items have high reliability, and only 4 items have low reliability. Because the reliability
is categorized as high overall, the item test can be used to measure the students’ ability
consistently.
III. ANALYSIS OF INDEX OF DIFFICULTY
The index of difficulty measures how difficult an item is: easy, average, or hard. A good
item is neither too easy nor too hard. A very easy item requires no effort from the
students, while a very hard item discourages them from trying to answer it. The index of
difficulty lies between 0.0 and 1.0. An index of 0.0 shows that the item is very hard (no
student answered it correctly), while an index of 1.0 shows that the item is very easy
(every student answered it correctly). The formula of the index of difficulty is:

$$P = \frac{B}{JS}$$
P = index of difficulty
B = number of students who answered the item correctly
JS = total number of respondents
The criteria of the index of difficulty are:
P ≤ 0.3 : hard
0.3 < P ≤ 0.7 : average
P > 0.7 : easy
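The formula and the three criteria can be combined in a short sketch. The function names are hypothetical, and the counts in the example are made up for a group of 10 respondents; they are not taken from appendix 4.

```python
def difficulty_index(correct_count, respondents):
    """P = B / JS: the proportion of respondents who answered correctly."""
    return correct_count / respondents


def classify_difficulty(p):
    """Apply the criteria from the text: hard, average, or easy."""
    if p <= 0.3:
        return "hard"
    if p <= 0.7:
        return "average"
    return "easy"


# Hypothetical counts of correct answers out of 10 respondents.
for correct in (2, 5, 9):
    p = difficulty_index(correct, 10)
    print(correct, p, classify_difficulty(p))
```

With 10 respondents, an item is hard when at most 3 students answer it correctly and easy when at least 8 do.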
The result of index of difficulty analysis:
Criterion   Item Number                                        Total   Percentage (%)
Hard        3, 10, 25, 30                                      4       10
Average     5, 6, 15, 18, 21, 23, 26, 27, 29, 31, 32, 34, 38   13      32.5
Easy        1, 2, 4, 7, 8, 9, 11, 12, 13, 14, 16, 17, 19, 20,  23      57.5
            22, 24, 28, 33, 35, 36, 37, 39, 40
The analysis of the index of difficulty can be seen in appendix 4. From the data above,
the items vary in difficulty: 4 items are hard, 13 items are average, and 23 items are easy.
More than half of the items are easy, so the students can answer most of the test without
difficulty.
IV. ANALYSIS OF DISTINGUISHING CHARACTERISTIC
Distinguishing characteristic is used to differentiate between the students who have
low ability (lower group) and high ability (upper group). Like the index of difficulty, the
index of discrimination normally lies between 0.0 and 1.0. The difference is that the index
of discrimination can also be negative. A negative value shows that the item discriminates
in reverse: the lower group answered it correctly more often than the upper group. The
formula of distinguishing characteristic is:

$$D = \frac{B_A}{J_A} - \frac{B_B}{J_B} = P_A - P_B$$
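D : distinguishing characteristic
BA : number of upper-group students who answered the item correctly
JA : number of students in the upper group
BB : number of lower-group students who answered the item correctly
JB : number of students in the lower group
PA : index of difficulty of the upper group
PB : index of difficulty of the lower group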
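A minimal sketch of this formula follows. The function name is hypothetical, and the example assumes the 10 respondents are split into an upper and a lower group of 5 each, which the text does not state. No category labels are assigned because the thresholds for them are not given here.

```python
def discrimination_index(upper_correct, upper_size, lower_correct, lower_size):
    """D = PA - PB: difference between the upper- and lower-group proportions correct."""
    p_upper = upper_correct / upper_size
    p_lower = lower_correct / lower_size
    return p_upper - p_lower


# Hypothetical item: 4 of 5 upper-group and 1 of 5 lower-group students are correct.
print(round(discrimination_index(4, 5, 1, 5), 2))

# Hypothetical item that discriminates in reverse: the lower group does better.
print(round(discrimination_index(1, 5, 3, 5), 2))
```

The first item has a positive index, so it separates the two groups in the expected direction. The second has a negative index, which is the case described above.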

The result of distinguishing characteristic analysis:
Criterion      Item Number                                 Total   Percentage (%)
Poor           1, 2, 4, 9, 19, 22, 30, 34, 35, 40          10      25
Satisfactory   3, 10, 18, 25, 28, 37                       6       15
Good           21, 31, 32, 39                              4       10
Excellent      26, 27, 29                                  3       7.5
Not good       5, 6, 7, 8, 11, 12, 13, 14, 15, 16, 17,     17      42.5
               20, 23, 24, 33, 36, 38
The analysis of distinguishing characteristic can be seen in appendix 5. From the data
above, the distinguishing characteristic of the items varies: 10 items are poor, 6 are
satisfactory, 4 are good, 3 are excellent, and 17 are not good. The items that are not good
have to be omitted because they are not appropriate for evaluating the students.
V. ANALYSIS OF PATTERN OF RESPONDENTS’ CHOICE
The pattern of respondents’ choice is the distribution of the testees across the options
of a multiple choice item. It is determined by counting how many testees choose option
a, b, c, or d, or choose none of them (omit). The pattern shows whether each distractor is
good or not: a distractor is good if it is chosen by at least 5% of the respondents. The
pattern can also be used to analyze the quality of an item: an item is good if it is omitted
by no more than 10% of the respondents.
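The two rules above (at least 5% per distractor, at most 10% omitted) can be checked with a short sketch. The function names, the use of None for an omitted answer, and the sample responses are assumptions made for illustration only.

```python
from collections import Counter

OPTIONS = ("a", "b", "c", "d")


def option_pattern(responses, options=OPTIONS):
    """Share of respondents choosing each option, plus the share who omitted."""
    n = len(responses)
    counts = Counter(responses)
    shares = {opt: counts.get(opt, 0) / n for opt in options}
    # Assumption: any response that is not a listed option (e.g. None) is an omit.
    shares["omit"] = sum(1 for r in responses if r not in options) / n
    return shares


def distractors_are_good(shares, key, min_share=0.05):
    """Every wrong option must be chosen by at least 5% of the respondents."""
    return all(
        share >= min_share
        for opt, share in shares.items()
        if opt not in (key, "omit")
    )


def item_is_good(shares, max_omit=0.10):
    """The item is good if no more than 10% of the respondents omit it."""
    return shares["omit"] <= max_omit


# Hypothetical responses of 10 testees to one item whose key is "a".
responses = ["a", "a", "a", "b", "a", "c", "a", "d", "a", None]
shares = option_pattern(responses)
print(shares)
print(distractors_are_good(shares, key="a"), item_is_good(shares))
```

With only 10 respondents, a single testee already represents 10%, so a distractor passes the 5% rule as soon as one testee chooses it and fails only when nobody does.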
The result of pattern of respondents’ choice analysis is:
Criterion         Item Number                                    Total   Percentage (%)
Good distractor   3, 5, 8, 9, 15, 16, 21, 25, 26, 30, 32, 34,    13      32.5
                  38
Bad distractor    1, 2, 4, 6, 7, 10, 11, 12, 13, 14, 17, 18, 19, 27      67.5
                  20, 22, 23, 24, 27, 28, 29, 31, 33, 35, 36,
                  37, 39, 40
Good item test    1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 39      97.5
                  16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
                  27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
                  38, 39, 40
Bad item test     15                                             1       2.5
The analysis of the pattern of respondents’ choice can be seen in appendix 6. From the
data above, more than half of the items (67.5%) have bad distractors. The bad distractors
should not be used again and should be replaced. Meanwhile, almost all of the items
(97.5%) are categorized as good items, because no more than 10% of the testees omitted
them. Therefore, the items can be used again to measure the students’ ability.
CONCLUSION
Based on the analysis of the multiple choice item test above, which covers the validity,
reliability, index of difficulty, distinguishing characteristic, and pattern of
respondents’ choice, we can conclude:
a. Validity: 3 items (7.5%) of the odd-semester midterm test are valid and 37 items
(92.5%) are invalid. The items may be invalid because they are too difficult or because
the respondents have low ability. The invalid items should be replaced by new items
because they cannot be used to measure the students’ ability.
b. Reliability: most of the items have very high or high reliability. There are 21 items
with very high reliability, 15 items with high reliability, and 4 items with low reliability.
c. Index of difficulty: the items vary in difficulty. There are 4 hard items, 13 average
items, and 23 easy items. More than half of the items are easy, possibly because the
students’ ability is quite high, so the students can answer them easily.
d. Distinguishing characteristic: the items vary. They are poor, satisfactory, good,
excellent, or not good. The items that are not good have to be omitted because they are
not appropriate for evaluating the students.
e. Pattern of respondents’ choice: the responses are not spread evenly across the options.
Many of the distractors are bad, but almost all of the items are good because few or no
testees omitted them.
The analysis of validity, reliability, index of difficulty, distinguishing characteristic,
and pattern of respondents’ choice can be used to decide whether the item test can
be used to evaluate the students’ ability or not.
