SUBJECT: PSYCHOLOGICAL TESTING
TOPIC: ITEM ANALYSIS
SUBMITTED TO: MAM RABIA
Name: Zainab Tahir
Class: MSC Psychology
Roll no: 1
ITEM ANALYSIS:
 It is a general term that refers to the specific
methods used in education to evaluate test
items for the purpose of test construction and
revision.
 It is one of the most important approaches to test
construction incorporated into item response
theory, which serves as an alternative to
classical measurement theory (also called
classical test theory).
ITEM ANALYSIS:
 Classical measurement theory:
Considers a score to be the direct result of a
person's true score plus error. It is this error that
is of great interest, as previous measurements
have been unable to specify its source.
ITEM ANALYSIS:
Item response theory:
Item response theory uses item analysis to
differentiate between types of errors in order to
gain a clearer understanding of any existing
deficiencies.
ITEM ANALYSIS:
 Particular attention is given to individual test
items, item characteristics, the probability of
answering items correctly, the overall ability of the
test taker, and the degree or level of knowledge
being assessed.
THE PURPOSE OF ITEM ANALYSIS:
 There must also be an effort to test for more
complex levels of understanding,
taking care to avoid oversampling items
that assess only basic levels of knowledge.
 Tests that are too difficult lead to frustration and
deflated scores.
 Tests that are too easy lead to a decline in
motivation and to inflated scores.
 Tests can be improved by maintaining a valid
pool of test items from which future tests can be
drawn and that covers a reasonable span of difficulty.
THE PURPOSE OF ITEM ANALYSIS:
 Item analysis helps improve test items and
identify unfair or biased items.
 Results should be used to refine test item
wordings.
 Closer examination of items will reveal which
questions were more difficult.
 If a particular distracter (that is, an incorrect
answer choice) is the most often chosen answer,
then it must be examined more closely for
correctness.
THE PURPOSE OF ITEM ANALYSIS:
 The value of these items can be
systematically assessed using several
methods representative of item analysis:
 A test item’s level of difficulty.
 An item’s capacity to discriminate.
 The item characteristic curve.
THE PURPOSE OF ITEM ANALYSIS:
 Difficulty is assessed by examining the
number of persons correctly endorsing the
answer.
 Discrimination can be examined by
comparing the number of persons getting a
particular item correct with the total test
score.
 Item characteristic curves can be used to
plot the likelihood of answering correctly
against the level of success on the test.
ITEM DIFFICULTY:
 In test construction, item difficulty is determined by
the proportion of people who answer a particular test
item correctly.
 For example, if the first question on a test was
answered correctly by 76% of the class, then the
difficulty level (p or percentage passing) for that
question is p = .76. If the second question on a test
was answered correctly by only 48% of the class,
then the difficulty level for that question is p = .48.
The higher the percentage of people who answer
correctly, the easier the item, so that a difficulty level
of .48 indicates that question two was more difficult
than question one, which had a difficulty level of .76.
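The p-value computation above can be sketched in a few lines of Python (a minimal illustration with made-up response data, not part of the original slides):

```python
# Hypothetical 0/1 response matrix: rows are students, columns are items.
responses = [
    [1, 0],
    [1, 1],
    [1, 0],
    [0, 1],
]

n_students = len(responses)
n_items = len(responses[0])

# Item difficulty p = proportion of students answering the item correctly.
p_values = [
    sum(student[i] for student in responses) / n_students
    for i in range(n_items)
]
print(p_values)  # [0.75, 0.5] -> item 1 is easier than item 2
```

As in the slide's example, the higher p-value marks the easier item.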
ITEM DIFFICULTY:
Many educators find themselves wondering
how difficult a good test item should be.
Several things must be taken into
consideration in order to determine appropriate
difficulty level. The first task of any test maker
should be to determine the probability of
answering an item correctly by chance alone,
also referred to as guessing or luck.
ITEM DIFFICULTY:
 For example:
 A true-false item, because it has only two
choices, could be answered correctly by
chance half of the time. Therefore, a true-false
item with a demonstrated difficulty level
of only p = .50 would not be a good test item,
because that level of success could be
achieved through guessing alone and would
not be an actual indication of knowledge or
ability level.
ITEM DIFFICULTY:
 For example:
 Similarly, a multiple-choice item with five
alternatives could be answered correctly
by chance 20% of the time. Therefore, an
item difficulty greater than .20 would be
necessary in order to discriminate
between respondents' ability to guess
correctly and respondents' level of
knowledge.
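The chance-level baseline in both examples follows the same rule, which can be sketched as (an illustrative helper, not from the original slides; it assumes a pure random guesser with equally likely choices):

```python
def chance_level(num_choices):
    """Probability of answering an item correctly by guessing alone,
    assuming all answer choices are equally likely to be picked."""
    return 1 / num_choices

print(chance_level(2))  # true-false item -> 0.5
print(chance_level(5))  # five-option multiple choice -> 0.2
```

An item's observed difficulty p must exceed this baseline before it says anything about knowledge rather than luck.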
ITEM DIFFICULTY:
 In most instances, it is desirable for a test to
contain items of various difficulty levels in order
to distinguish between students who are not
prepared at all, students who are fairly
prepared, and students who are well prepared.
 In other words, educators do not want the same
level of success for students who did not
study as for those who studied a fair amount, or
for those who studied a fair amount as for those
who studied exceptionally hard.
ITEM DIFFICULTY:
 Therefore, it is necessary for a test to be
composed of items of varying levels of
difficulty. As a general rule for
norm-referenced tests, items in the difficulty range
of .30 to .70 yield important differences
between individuals' levels of knowledge,
ability, and preparedness.
DISCRIMINATION INDEX:
 According to Wilson (2005), item difficulty is the most
essential component of item analysis. However, it is not
the only way to evaluate test items. Discrimination goes
beyond determining the proportion of people who answer
correctly and looks more specifically at who answers
correctly.
 In other words, item discrimination determines whether
those who did well on the entire test did well on a
particular item. An item should in fact be able to
discriminate between upper and lower scoring groups.
Membership in these groups is usually determined based
on their total test score, and it is expected that those
scoring higher on the overall test will also be more likely
to endorse the correct response on a particular item.
DISCRIMINATION INDEX:
 Sometimes an item will discriminate
negatively; that is, a larger proportion of the
lower-scoring group selects the correct response
compared with the higher-scoring
group. Such an item should be revised or
discarded.
 One way to determine an item's power to
discriminate is to compare those who have
done very well with those who have done
very poorly, known as the extreme group
method.
DISCRIMINATION INDEX:
 First, identify the students who scored in the top one-third
as well as those in the bottom one-third of the class.
 Next, calculate the proportion of each group that
answered a particular test item correctly (i.e., percentage
passing for the high and low groups on each item).
 Finally, subtract the p of the bottom performing group
from the p for the top performing group to yield an item
discrimination index (D). Item discriminations of D = .50 or
higher are considered excellent. D = 0 means the item
has no discrimination ability, while D = 1.00 means the
item has perfect discrimination ability.
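The three steps above can be sketched as follows (a minimal Python illustration with hypothetical scores; the one-third split is the fraction described in the slides):

```python
def discrimination_index(total_scores, item_correct, fraction=1 / 3):
    """Extreme group method: D = p(top group) - p(bottom group)."""
    # Rank students by total test score (ascending).
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    n = max(1, int(len(total_scores) * fraction))
    bottom, top = order[:n], order[-n:]
    # Percentage passing the item in each extreme group.
    p_top = sum(item_correct[i] for i in top) / n
    p_bottom = sum(item_correct[i] for i in bottom) / n
    return p_top - p_bottom

# Hypothetical data: six students' total scores and their 0/1 result on one item.
scores = [90, 80, 70, 60, 50, 40]
item = [1, 1, 1, 0, 0, 0]
print(discrimination_index(scores, item))  # 1.0 -> perfect discrimination
```

Reversing the item results (high scorers failing it) would yield D = -1.0, the negative discrimination the slides say calls for revision or removal.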
CHARACTERISTIC CURVE:
 A third parameter used to conduct item analysis
is known as the item characteristic curve (ICC).
 This is a graphical or pictorial depiction of the
characteristics of a particular item, or taken
collectively, can be representative of the entire
test.
 In the item characteristic curve, the total test
score is represented on the horizontal axis, and
the proportion of test takers passing the item
within that range of test scores is scaled along
the vertical axis.
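The points of such a curve can be computed numerically as described above (a hypothetical Python sketch; the data and score bins are arbitrary assumptions for illustration):

```python
def icc_points(total_scores, item_correct, score_bins):
    """For each (low, high) total-score range, return the proportion of
    test takers in that range who passed the item -- the points that,
    plotted against total score, trace the item characteristic curve."""
    points = []
    for low, high in score_bins:
        in_bin = [i for i, s in enumerate(total_scores) if low <= s <= high]
        if in_bin:
            passed = sum(item_correct[i] for i in in_bin) / len(in_bin)
            points.append(((low, high), passed))
    return points

# Hypothetical data: six test takers' total scores and 0/1 results on one item.
totals = [3, 5, 7, 9, 4, 8]
item = [0, 0, 1, 1, 0, 1]
print(icc_points(totals, item, [(0, 4), (5, 9)]))
# [((0, 4), 0.0), ((5, 9), 0.75)] -> the item gets easier as total score rises
```

A well-behaved item shows a rising curve: the proportion passing increases with total test score.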
