NEMA GRACE B. MEDILLO
Educ. 243 – Education Evaluation
Prof. James L. Paglinawan, Ph.D.
Central Mindanao University
Graduate School
Education Evaluation
Chapter 10: ADMINISTERING, ANALYZING, AND IMPROVING THE TEST
OR ASSESSMENT
Assume that you have already:
 written measurable instructional objectives;
 prepared a test blueprint, specifying the number of items for each content and process area; and
 written test items that match your instructional objectives.
Then you are ready to assemble, administer, score, and analyze the test, and to debrief students on the items.
 When assembling the test:
o The following packaging guidelines are worth applying:
a. group together all items of similar format
b. arrange test items from easy to hard
c. space the items for easy reading
d. keep items and options on the same page
e. position illustrations near their descriptions
f. check your answer key
g. determine how students will record answers
h. provide space for name and date
i. check test directions
j. proofread the test
o To avoid illegible copies that would impair the test's validity, take care in reproducing the test:
a. know the photocopying machine
b. specify copying instructions
c. file original test
 In administering the test, make an effort to do the following:
a. induce a positive test-taking attitude
b. maximize the achievement nature of the test
c. equalize the advantages test-wise students have over non-test-wise students
d. avoid surprise tests
e. provide special instructions before the tests are actually distributed
f. alternate your test distribution procedures
g. have the students check that they have the entire test
h. keep distractions to a minimum
i. alert students to the amount of time left toward the end of the test
j. clarify test collection procedures before handing out the test
 In scoring the test, try to do the following:
a. have the key prepared in advance
b. have the key checked for accuracy
c. score blindly
d. check for multiple answers if machine scoring is used
e. double-check scores if scored by hand
f. record scores before returning the tests
 In analyzing the test:
o quantitative item analysis is a mathematical approach to assessing an item's utility
 an item's difficulty level (p) is computed by
p = (number of students selecting the correct answer) / (total number of students attempting the item)
 If the p level is > 0.75, the item is considered relatively easy.
 If the p level is < 0.25, the item is considered relatively difficult.
 Build tests in which most items fall between p levels of 0.20 and 0.80, with an average p level of about 0.50 (a computational sketch follows).
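To make the computation concrete, here is a minimal Python sketch of the difficulty index. The response data, the answer key, and the `difficulty_index` helper are hypothetical illustrations, not part of the chapter.

```python
# Minimal sketch (hypothetical data): difficulty index for one item.

def difficulty_index(responses, key):
    """p = number selecting the correct answer / number attempting the item."""
    attempted = [r for r in responses if r is not None]  # None = item omitted
    return sum(1 for r in attempted if r == key) / len(attempted)

# Eight students; one omitted the item, and five of the seven attempts were correct.
responses = ["A", "C", "A", "A", "B", "A", None, "A"]
print(round(difficulty_index(responses, key="A"), 2))  # 0.71 -> moderately easy
```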
 an item's discrimination index (D) is computed by
D = (number who got the item correct in the upper group − number who got the item correct in the lower group) / (number of students in either group)
(if group sizes are unequal, use the larger group size as the divisor)
 the keyed (correct) option should discriminate positively, and the incorrect options should discriminate negatively
 consider retaining items with a positive D value (see the sketch below)
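The following minimal Python sketch computes D for the key and for each distractor; the response lists and option labels are hypothetical. Note how the key discriminates positively while every distractor discriminates negatively.

```python
# Minimal sketch (hypothetical data): discrimination index for each option.

def discrimination_index(upper, lower, option):
    n = max(len(upper), len(lower))  # if sizes are unequal, use the larger group
    return (upper.count(option) - lower.count(option)) / n

upper = ["A", "A", "A", "B", "A"]  # top scorers' responses (key = "A")
lower = ["A", "C", "B", "B", "D"]  # bottom scorers' responses

print("key A:", discrimination_index(upper, lower, "A"))  # 0.6 (positive: keep)
for distractor in ("B", "C", "D"):
    # each distractor should come out negative (it drew more low scorers)
    print(distractor, ":", discrimination_index(upper, lower, distractor))
```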
 quantitative item analysis helps us decide
 whether to retain or eliminate an item,
 which distractor(s) should be modified or eliminated,
 whether an item is miskeyed,
 whether guessing occurred, and
 whether ambiguity is present (heuristics for the last three are sketched below)
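As one way to operationalize the last three checks, here is a hypothetical Python sketch that flags miskeying, guessing, and ambiguity from the upper group's choices, using the rules given in the quantitative item analysis checklist at the end of this handout. The `flag_item` helper and its data are illustrative only.

```python
# Hypothetical sketch: flag miskeying, guessing, and ambiguity from the
# upper group's choices only (rules from the checklist at the end).
from collections import Counter

def flag_item(upper_responses, key):
    counts = Counter(upper_responses)
    key_count = counts[key]
    distractors = {opt: n for opt, n in counts.items() if opt != key}
    flags = []
    if any(n > key_count for n in distractors.values()):
        flags.append("possible miskeying: a distractor outdrew the key")
    if len(counts) > 1 and len(set(counts.values())) == 1:
        flags.append("possible guessing: even spread of choices across options")
    if any(n == key_count for n in distractors.values()):
        flags.append("possible ambiguity: a distractor tied the key")
    return flags

print(flag_item(["A", "B", "B", "A", "C"], key="A"))  # ['possible ambiguity: ...']
```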
o qualitative item analysis is a nonmathematical approach to assessing an item's utility
 performed by checking an item's content validity and inspecting it for technical faults (outlined in Chapters 5, 6, and 7)
o quantitative item analysis procedures were developed and proven with norm-referenced tests; apply them to criterion-referenced tests with appropriate caution and modifications, such as:
a. Using the pretest and posttest groups as the lower and upper groups, respectively
 Step 1: Compute p levels for both tests.
 Step 2: Determine the discrimination index for the key.
 Step 3: Determine whether each option separately discriminates negatively.
The modified quantitative item analysis indicates a sound item when:
1. there was a sizeable increase in p value from pretest to posttest;
2. the D index for the key was positive; and
3. the distractors all discriminated negatively (a sketch of this pre/post approach follows).
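A minimal Python sketch of this pre/post modification, with hypothetical response data: the pretest group serves as the lower group and the posttest group as the upper group, and the example satisfies all three indicators above.

```python
# Hypothetical sketch: pretest group as "lower", posttest group as "upper".

def p_level(responses, key):
    return sum(1 for r in responses if r == key) / len(responses)

def d_index(post, pre, option):
    return (post.count(option) - pre.count(option)) / max(len(post), len(pre))

pre  = ["B", "C", "A", "D", "B", "C"]  # before instruction (key = "A")
post = ["A", "A", "A", "B", "A", "A"]  # after instruction

print("p pre :", round(p_level(pre, "A"), 2))        # 0.17
print("p post:", round(p_level(post, "A"), 2))       # 0.83 -> sizeable increase
print("D key :", round(d_index(post, pre, "A"), 2))  # 0.67 -> positive
for opt in ("B", "C", "D"):
    print("D", opt, ":", round(d_index(post, pre, opt), 2))  # all negative
```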
b. Comparing the percentage answering each item correctly on both the pre- and posttest:
percentage passing posttest − percentage passing pretest
 The more positive the difference, the more the item taps the content taught (a worked example follows).
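A hypothetical worked example of this difference for a single item:

```python
# Hypothetical percentages for one item:
pct_passing_pretest = 20   # percent correct before instruction
pct_passing_posttest = 85  # percent correct after instruction

print(pct_passing_posttest - pct_passing_pretest)  # 65: a large positive
# difference, suggesting the item taps content that was actually taught
```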
c. Determining the percentage of items answered in the expected direction for the entire test:
 Step 1: For each student, find the number of items answered incorrectly prior to instruction but correctly after instruction.
 Step 2: Add these counts and divide by the number of students.
 Step 3: Divide by the number of test items.
 Step 4: Multiply by 100.
 The greater the overall positive percentage of change, the more your test is likely to match your instruction and to be content-valid (see the sketch below).
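A minimal Python sketch of these four steps, with hypothetical correctness data (one boolean per student per item); the `pct_expected_direction` helper is illustrative.

```python
# Hypothetical sketch: percentage of items answered in the expected direction.
# pretest_ok / posttest_ok: per student, True where the item was answered correctly.

def pct_expected_direction(pretest_ok, posttest_ok):
    n_students = len(pretest_ok)
    n_items = len(pretest_ok[0])
    # Step 1: per student, items missed before instruction but correct after
    changed = [
        sum(1 for pre, post in zip(s_pre, s_post) if not pre and post)
        for s_pre, s_post in zip(pretest_ok, posttest_ok)
    ]
    # Steps 2-4: average per student, divide by item count, convert to percent
    return sum(changed) / n_students / n_items * 100

pretest_ok  = [[False, False, True], [False, True, False]]  # 2 students, 3 items
posttest_ok = [[True, True, True],  [True, True, True]]
print(round(pct_expected_direction(pretest_ok, posttest_ok), 1))  # 66.7
```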
d. Limitations of these modifications:
 they are difficult to carry out
 the unit of instruction may be brief
 they adapt norm-referenced procedures to criterion-referenced tests
 results depend on the time devoted to instruction between pre- and posttest
 After the test has been scored, but before giving students their scores:
a. Discuss any items considered to be problematic
b. Listen to student concerns and try to stay unemotional
c. Let students know you will consider their comments but will not make any
decisions affecting their scores until you have had time to reflect on their
comments
d. Let students know that any changes made will apply equally to all students.
 After the students are given their scores, ask them to check for clerical errors
Process of Evaluating Classroom Achievement
1. Instructional objectives are formulated, based on the school's or school district's educational goals.
2. Instructional procedures are implemented that lead to the achievement of these objectives.
3. A test blueprint is drawn to ensure that each important content and process area is adequately sampled by the appropriate number and kind of test items.
4. Test items are written. The format, number, and level are determined in part by the objectives, in part by the test blueprint, and in part by the teacher's judgment.
5. Test items are reviewed and, where necessary, edited or replaced.
6. The actual test is assembled using the Test Assembly Checklist to be sure important assembly considerations are not overlooked.
7. The test is reproduced, with care taken to ensure that copies are legible.
8. The test is administered, with steps taken to ensure the appropriate psychological atmosphere.
9. The test is scored, with appropriate safeguards to minimize scoring errors.
10. Items that look marginal are subjected to quantitative and qualitative item analysis, and appropriate changes in scoring or revisions are made.
11. The scored tests are returned to the students for debriefing. Items the students find marginal are subjected to quantitative and qualitative item analysis, and appropriate changes in scoring or revisions are made.
12. Notes are made of deficient or problematic items, enabling the teacher to be aware of problems, changes, etc., before administering the test again.
Test Assembly Checklist
Put a check in the blank to the right of each statement after you’ve checked to
see that it applies to your test.
Yes No
1. Are items of similar format grouped together? ____ ____
2. Are items arranged in easy-to-hard order of difficulty? ____ ____
3. Are items properly spaced? ____ ____
4. Are items and options on the same page? ____ ____
5. Are diagrams, maps, and supporting material above
designated items and on the same page with items? ____ ____
6. Are answers random? ____ ____
7. Have you decided whether an answer sheet will be used? ____ ____
8. Are blanks for name and date included? ____ ____
9. Have the directions been checked for clarity? ____ ____
10. Has the test been proofread for errors? ____ ____
11. Do items avoid racial and gender bias? ____ ____
Quantitative Item Analysis Checklist
1. Item Number ______
2. Difficulty Index:
p = (Number Correct) / (Total) = ______
3. Discrimination Index:
D = (Number Correct in upper group − Number Correct in lower group) / (Number of students in either group) = ______
4. Eliminate or revise the item? Check:
a. Does the key discriminate positively?
b. Do the distractors discriminate negatively?
** If you answer yes to both a and b, no revision may be
necessary.
** If you answer no to a or b, revision is necessary.
** If you answer no to both a and b, eliminate the item.
5. Check for miskeying, ambiguity, or guessing. Among the choices
for the upper group only, was there evidence of:
a. Miskeying (more chose distractor than key)?
b. Guessing (equal spread of choices across options)?
c. Ambiguity (equal number chose one distractor and the key)?
______