SUBJECT: PSYCHOLOGICAL TESTING
TOPIC: ITEM ANALYSIS
SUBMITTED TO: MAM RABIA
Name: Zainab Tahir
Class: MSC Psychology
Roll no: 1
ITEM ANALYSIS:
 It is a general term that refers to the specific
methods used in education to evaluate test
items for the purpose of test construction and
revision.
 It is one of the most important approaches to test
construction incorporated into item response
theory, which serves as an alternative to
classical measurement theory (also called
classical test theory).
ITEM ANALYSIS:
 Classical measurement theory:
Considers a score to be the direct result of a
person's true score plus error. It is this error that
is of great interest, as previous measurements
have been unable to specify its source.
ITEM ANALYSIS:
Item response theory:
Item response theory uses item analysis to
differentiate between types of errors in order to
gain a clearer understanding of any existing
deficiencies.
ITEM ANALYSIS:
 Particular attention is given to individual test
items, item characteristics, the probability of
answering items correctly, the overall ability of the
test taker, and the degree or level of knowledge
being assessed.
THE PURPOSE OF ITEM ANALYSIS:
 There must also be an effort to test for more
complex levels of understanding,
taking care to avoid oversampling items
that assess only basic levels of knowledge.
 Tests that are too difficult lead to frustration and
deflated scores.
 Tests that are too easy lead to a decline in
motivation and to inflated scores.
 Tests can be improved by maintaining a valid
pool of test items from which future tests can be
drawn and that covers a reasonable span of difficulty.
THE PURPOSE OF ITEM ANALYSIS:
 Item analysis helps improve test items and
identify unfair or biased items.
 Results should be used to refine test item
wordings.
 Closer examination of items will reveal which
questions were more difficult.
 If a particular distracter (that is, an incorrect
answer choice) is the most often chosen answer,
then it must be examined more closely for
correctness.
THE PURPOSE OF ITEM ANALYSIS:
 The value of these items can be
systematically assessed using several
methods representative of item analysis:
 A test item’s level of difficulty.
 An item’s capacity to discriminate.
 The item characteristic curve.
THE PURPOSE OF ITEM ANALYSIS:
 Difficulty is assessed by examining the
number of persons correctly endorsing the
answer.
 Discrimination can be examined by
comparing the number of persons getting a
particular item correct with the total test
score.
 Item characteristic curves can be used to
plot the likelihood of answering correctly
against the level of success on the test.
ITEM DIFFICULTY:
 In test construction, item difficulty is determined by
the proportion of people who answer a particular test
item correctly.
 For example, if the first question on a test was
answered correctly by 76% of the class, then the
difficulty level (p or percentage passing) for that
question is p = .76. If the second question on a test
was answered correctly by only 48% of the class,
then the difficulty level for that question is p = .48.
The higher the percentage of people who answer
correctly, the easier the item, so that a difficulty level
of .48 indicates that question two was more difficult
than question one, which had a difficulty level of .76.
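The p-value computation above can be sketched in a few lines of Python (a minimal illustration with made-up response data, not part of the original slides):

```python
# Hypothetical 0/1 response matrix: rows are students, columns are items.
responses = [
    [1, 0],
    [1, 1],
    [1, 0],
    [0, 1],
]

n_students = len(responses)
n_items = len(responses[0])

# Item difficulty p = proportion of students answering the item correctly.
p_values = [
    sum(student[i] for student in responses) / n_students
    for i in range(n_items)
]
print(p_values)  # [0.75, 0.5] -> item 1 is easier than item 2
```

As in the slide's example, the higher p-value marks the easier item.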
ITEM DIFFICULTY:
Many educators find themselves wondering
how difficult a good test item should be.
Several things must be taken into
consideration in order to determine appropriate
difficulty level. The first task of any test maker
should be to determine the probability of
answering an item correctly by chance alone,
also referred to as guessing or luck.
ITEM DIFFICULTY:
 For example:
 A true-false item, because it has only two
choices, could be answered correctly by
chance half of the time. Therefore, a true-false
item with a demonstrated difficulty level
of only p = .50 would not be a good test item,
because that level of success could be
achieved through guessing alone and would
not be an actual indication of knowledge or
ability level.
ITEM DIFFICULTY:
 For example:
 Similarly, a multiple-choice item with five
alternatives could be answered correctly
by chance 20% of the time. Therefore, an
item difficulty greater than .20 would be
necessary in order to discriminate
between respondents' ability to guess
correctly and respondents' level of
knowledge.
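The chance-level baseline in both examples follows the same rule, which can be sketched as (an illustrative helper, not from the original slides; it assumes a pure random guesser with equally likely choices):

```python
def chance_level(num_choices):
    """Probability of answering an item correctly by guessing alone,
    assuming all answer choices are equally likely to be picked."""
    return 1 / num_choices

print(chance_level(2))  # true-false item -> 0.5
print(chance_level(5))  # five-option multiple choice -> 0.2
```

An item's observed difficulty p must exceed this baseline before it says anything about knowledge rather than luck.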
ITEM DIFFICULTY:
 In most instances, it is desirable for a test to
contain items of various difficulty levels in order
to distinguish between students who are not
prepared at all, students who are fairly
prepared, and students who are well prepared.
 In other words, educators do not want the same
level of success for students who did not
study as for those who studied a fair amount, or
for those who studied a fair amount as for those
who studied exceptionally hard.
ITEM DIFFICULTY:
 Therefore, it is necessary for a test to be
composed of items of varying levels of
difficulty. As a general rule for
norm-referenced tests, items in the difficulty range
of .30 to .70 yield important differences
between individuals' levels of knowledge,
ability, and preparedness.
DISCRIMINATION INDEX:
 According to Wilson (2005), item difficulty is the most
essential component of item analysis. However, it is not
the only way to evaluate test items. Discrimination goes
beyond determining the proportion of people who answer
correctly and looks more specifically at who answers
correctly.
 In other words, item discrimination determines whether
those who did well on the entire test did well on a
particular item. An item should in fact be able to
discriminate between upper and lower scoring groups.
Membership in these groups is usually determined based
on their total test score, and it is expected that those
scoring higher on the overall test will also be more likely
to endorse the correct response on a particular item.
DISCRIMINATION INDEX:
 Sometimes an item will discriminate
negatively; that is, a larger proportion of the
lower-scoring group selects the correct response
compared with the higher-scoring
group. Such an item should be revised or
discarded.
 One way to determine an item's power to
discriminate is to compare those who have
done very well with those who have done
very poorly, known as the extreme group
method.
DISCRIMINATION INDEX:
 First, identify the students who scored in the top one-third
as well as those in the bottom one-third of the class.
 Next, calculate the proportion of each group that
answered a particular test item correctly (i.e., percentage
passing for the high and low groups on each item).
 Finally, subtract the p of the bottom performing group
from the p for the top performing group to yield an item
discrimination index (D). Item discriminations of D = .50 or
higher are considered excellent. D = 0 means the item
has no discrimination ability, while D = 1.00 means the
item has perfect discrimination ability.
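The three steps above can be sketched as follows (a minimal Python illustration with hypothetical scores; the one-third split is the fraction described in the slides):

```python
def discrimination_index(total_scores, item_correct, fraction=1 / 3):
    """Extreme group method: D = p(top group) - p(bottom group)."""
    # Rank students by total test score (ascending).
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    n = max(1, int(len(total_scores) * fraction))
    bottom, top = order[:n], order[-n:]
    # Percentage passing the item in each extreme group.
    p_top = sum(item_correct[i] for i in top) / n
    p_bottom = sum(item_correct[i] for i in bottom) / n
    return p_top - p_bottom

# Hypothetical data: six students' total scores and their 0/1 result on one item.
scores = [90, 80, 70, 60, 50, 40]
item = [1, 1, 1, 0, 0, 0]
print(discrimination_index(scores, item))  # 1.0 -> perfect discrimination
```

Reversing the item results (high scorers failing it) would yield D = -1.0, the negative discrimination the slides say calls for revision or removal.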
CHARACTERISTIC CURVE:
 A third parameter used to conduct item analysis
is known as the item characteristic curve (ICC).
 This is a graphical or pictorial depiction of the
characteristics of a particular item, or taken
collectively, can be representative of the entire
test.
 In the item characteristic curve, the total test
score is represented on the horizontal axis, and
the proportion of test takers passing the item
within that range of test scores is scaled along
the vertical axis.
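The points of such a curve can be computed numerically as described above (a hypothetical Python sketch; the data and score bins are arbitrary assumptions for illustration):

```python
def icc_points(total_scores, item_correct, score_bins):
    """For each (low, high) total-score range, return the proportion of
    test takers in that range who passed the item -- the points that,
    plotted against total score, trace the item characteristic curve."""
    points = []
    for low, high in score_bins:
        in_bin = [i for i, s in enumerate(total_scores) if low <= s <= high]
        if in_bin:
            passed = sum(item_correct[i] for i in in_bin) / len(in_bin)
            points.append(((low, high), passed))
    return points

# Hypothetical data: six test takers' total scores and 0/1 results on one item.
totals = [3, 5, 7, 9, 4, 8]
item = [0, 0, 1, 1, 0, 1]
print(icc_points(totals, item, [(0, 4), (5, 9)]))
# [((0, 4), 0.0), ((5, 9), 0.75)] -> the item gets easier as total score rises
```

A well-behaved item shows a rising curve: the proportion passing increases with total test score.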
