Learning assessments gather information on what learners know and what they can do with what they have learned. They also offer critical information on the processes and contexts that enable learning, and on those that may be hindering learning progress.
4. • Audio-visual aids are defined as any devices used to aid in the
communication of an idea. As such, virtually anything can be used as
an audio-visual aid provided it successfully communicates the idea or
information for which it is designed.
• An audio-visual aid includes still photography, motion pictures, audio or
video tape, slides, or filmstrips, prepared individually or in
combination to communicate information or to elicit a desired
response.
I. Educational Technology
5. DEVICE
- A device is any means other than the
subject matter itself that is employed by
the teacher in presenting the subject
matter to the learner.
6. 1.To challenge students’ attention
2.To stimulate the imagination and develop the mental
imagery of the pupils
3.To facilitate the understanding of the pupils
4.To provide motivation to the learners
5.To develop the ability to listen
Purpose of Visual Devices
8. 1.Extrinsic – used to supplement a method.
Example: pictures, graphs, film strips, slides, etc.
2.Intrinsic – used as a part of the method or teaching procedure.
Example: pictures accompanying an article.
3.Material Devices – devices that have no bearing on the subject matter.
Example: blackboard, chalk, books, pencils, etc.
4.Mental Devices – devices that are related in form and meaning to
the subject matter being presented.
Example: questions, projects, drills, lesson plans, etc.
Classification of Devices
9. -Non-projected aids are those that do not require the use of audio-
visual equipment such as a projector and screen. These include
charts, graphs, maps, illustrations, photographs, brochures, and
handouts. Charts are commonly used almost everywhere.
-A chart is a diagram which shows relationships. An organizational
chart is one of the most widely and commonly used kinds of charts.
(Page 263)
NON-PROJECTED AUDIOVISUAL AIDS
10. -It focuses on the development and utilization of assessment tools
to improve the teaching-learning process. It emphasizes the
use of testing for measuring knowledge, comprehension and
other thinking skills. It allows the students to go through the
standard steps in test construction for quality assessment.
Students will experience how to develop rubrics for performance-
based and portfolio assessment.
II.ASSESSMENT OF LEARNING
12. I. Measurement
Refers to the quantitative aspect of evaluation. It involves outcomes that
can be quantified statistically. It can also be defined as the process of
determining and differentiating information about the attributes or
characteristics of things.
13. II. EVALUATION
Is the qualitative aspect of determining the outcomes of learning. It
involves value judgment. Evaluation is more comprehensive than
measurement; in fact, measurement is one aspect of evaluation.
14. III. TEST
Consists of questions, exercises, or other devices
for measuring the outcomes of learning
15. I. ACCORDING TO MANNER OF RESPONSE
A. ORAL
B. WRITTEN
II. ACCORDING TO METHOD OF PREPARATION
A. SUBJECTIVE/ESSAY
B. OBJECTIVE
III. ACCORDING TO NATURE OF ANSWER
A. PERSONALITY TESTS
B. INTELLIGENCE TESTS
C. APTITUDE TESTS
D. ACHIEVEMENT OR SUMMATIVE TESTS
E. SOCIOMETRIC TESTS
F. DIAGNOSTIC OR FORMATIVE TESTS
G. TRADE OR VOCATIONAL TESTS
16. • OBJECTIVE TESTS - are tests which have definite answers and therefore
are not subject to personal bias.
• TEACHER-MADE TESTS - or educational tests, are constructed by the
teachers based on the contents of the different subjects taught.
• DIAGNOSTIC TESTS - are used to measure a student's strengths and
weaknesses, usually to identify deficiencies in skills or performance.
• FORMATIVE AND SUMMATIVE - are terms often used with evaluation, but
they may also be used with testing.
17. • FORMATIVE TESTING - is done to monitor students' attainment of the
instructional objectives. Formative testing occurs over a period of time and
monitors student progress.
• SUMMATIVE TESTING - is done at the conclusion of instruction and
measures the extent to which students have attained the desired
outcomes.
• STANDARDIZED TESTS - are already valid, reliable and objective.
Standardized tests are tests for which contents have been selected and for
which norms or standards have been established. Psychological tests and
government national examinations are examples of standardized tests.
• STANDARDS OR NORMS - are the goals to be achieved, expressed in
terms of the average performance of the population.
18. Criterion-referenced measure is a measuring device with a
predetermined level of success or standard on the part of the test-
takers. For example, a level of 75 percent score in all the test items
could be considered a satisfactory performance.
Norm-referenced measure is a test that is scored on the basis of the
norm or standard level of accomplishment by the whole group taking
the test. The grades of the students are based on the normal curve of
distribution.
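The two interpretations can be contrasted in a short sketch. The class scores, student names, and the 75-percent cutoff below are hypothetical, chosen only to illustrate the distinction:

```python
# Hypothetical scores of four students on a 50-item test.
scores = {"Ana": 42, "Ben": 35, "Cara": 47, "Dan": 28}
total_items = 50

# Criterion-referenced: pass/fail against a fixed 75% standard,
# regardless of how the rest of the group performs.
passers = sorted(n for n, s in scores.items() if s / total_items >= 0.75)

# Norm-referenced: each student is placed relative to the group.
ranked = sorted(scores, key=scores.get, reverse=True)

print("Passed the 75% criterion:", passers)  # ['Ana', 'Cara']
print("Rank order in the group:", ranked)    # ['Cara', 'Ana', 'Ben', 'Dan']
```

Note that the criterion-referenced result for a student does not change if the rest of the group does better or worse; the norm-referenced ranking does.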
19. -A good examination must pass the following criteria:
Validity
-Refers to the degree to which a test measures what it intends to measure. It
is the usefulness of the test for a given measure.
-A valid test is always reliable. To test the validity of a test, it is pretested
to determine whether it really measures what it intends to measure or what it
purports to measure.
CRITERIA OF A GOOD EXAMINATION
20. Reliability
-Pertains to the consistency of the scores a test yields.
-The test of reliability is the consistency of the results when it is
administered to different groups of individuals with similar
characteristics in different places at different times.
-Also, the results are almost similar when the test is given to the
same group of individuals on different days and the coefficient of
correlation is not less than 0.85.
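The 0.85 benchmark refers to the coefficient of correlation between two administrations of the test. A minimal sketch of the Pearson correlation, with invented scores for five students tested twice, might look like this:

```python
def pearson(x, y):
    # Pearson product-moment correlation between two score lists.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores of the same five students on two administrations.
first = [40, 35, 48, 30, 45]
second = [42, 33, 47, 31, 44]

r = pearson(first, second)
print(round(r, 2))  # about 0.97, above the 0.85 threshold
```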
21. Objectivity
- Is the degree to which personal bias is eliminated in the
scoring of the answers. When we refer to the quality of
measurement, essentially, we mean the amount of
information contained in a score generated by the
measurement.
-Measures of students' instructional outcomes are rarely as
precise as those of physical characteristics such as height
and weight.
22. Example:
Numbers from an interval scale of measurement contain
all of the information that nominal and ordinal scales would
provide, plus some supplementary input. However, a ratio
scale of the same attribute would contain even more
information than the interval scale. This idea will become
clearer as each scale of measurement is described.
23. I. Nominal Measurement
-Nominal measurements are the least sophisticated; they merely classify
objects or events by assigning numbers to them.
-These numbers are arbitrary and imply no quantification, but the
categories must be mutually exclusive and exhaustive.
-For example, one could designate baseball positions by
assigning the pitcher the numeral
1; the catcher, 2; the first baseman, 3; the second baseman, 4; and so
on. These assignments are arbitrary, and no arithmetic on these numbers
is meaningful. For example, 1 plus 2 does not equal 3, because a pitcher
plus a catcher does not equal a first baseman.
24. II. Ordinal Measurement
-Ordinal scales classify, but they also assign rank order. An example of ordinal
measurement is ranking individuals in a class according to their test scores.
-Students’ scores could be ordered from first, second, third, and so forth to the lowest
score. Such a scale gives more information than nominal measurement, but it still has
limitations.
-The units of an ordinal scale are most likely unequal.
25. III. Interval Measurement
-In order to be able to add and subtract scores, we use interval scales,
sometimes called equal interval or equal unit measurement.
-This measurement scale contains the nominal and ordinal properties and is also
characterized by equal units between score points.
-Examples include thermometers and calendar years.
26. IV. Ratio Measurement
-The most sophisticated type of measurement includes all the preceding
properties, but in a ratio scale the zero point is not arbitrary; a score of zero
indicates the absence of what is being measured.
-For example, if a person's wealth equaled zero, he or she would have no
wealth at all. This is unlike a social studies test, where missing every item (i.e.,
receiving a score of zero) does not necessarily mean a total absence of knowledge.
-Ratio measurement is rarely achieved in educational assessment, in either the
cognitive or the affective area.
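The contrast between the scales can be made concrete with a short sketch (the data are invented for illustration): nominal codes carry no arithmetic meaning, while an ordinal ranking preserves order but not equal units:

```python
# Nominal: arbitrary codes for categories; 1 + 2 != 3 in any meaningful sense.
positions = {1: "pitcher", 2: "catcher", 3: "first baseman"}

# Ordinal: test scores reduced to rank order (1st, 2nd, ...).
scores = [48, 50, 46, 41, 37]
ranks = {s: r for r, s in enumerate(sorted(scores, reverse=True), start=1)}

print(ranks)  # {50: 1, 48: 2, 46: 3, 41: 4, 37: 5}
# The score gap between ranks 1 and 2 (2 points) differs from the gap
# between ranks 4 and 5 (4 points): ordinal units are unequal.
```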
27.
a. There should be two columns. Under “A” are the stimuli which should be longer and more descriptive
than the responses under column “B”. The response may be a word, a phrase, a number, or a formula.
b. The stimuli under column “A” should be numbered and the response under column “B” should be
lettered. Answers will be indicated by letters only on lines provided in column “A”.
c. The number of pairs usually should not exceed twenty items. Fewer than ten introduces chance
elements. Twenty pairs may be used, but more than twenty is decidedly wasteful of time.
d. The number of responses in column “B” should be two or more than the number of items in Column “A”
to avoid guessing.
e. Only one correct matching for each item should be possible.
f. Matching sets should be neither too long nor too short.
g. All items should be on the same page to avoid turning of pages in the process of matching pairs.
h. Score is the number of correct answers.
MATCHING TYPE
29. Common types of essay questions. (The types are related to the
purposes for which the essay examinations are to be used.)
1.Comparison of two things
2.Explanations of the use or meaning of a statement or passage.
3.Analysis
4.Decisions for or against
5.Discussion
30. How to construct essay examinations.
1.Determine the objectives or essentials for each question to be evaluated.
2.Phrase questions in simple, clear and concise language.
3.Suit the length of the questions to the time available for answering the essay
examination. The teacher should try to answer the test herself.
4.Scoring:
a. Have a model answer in advance.
b. Indicate the number of points for each question.
c. Score a point for each essential.
32. a. The objective test is free from personal bias in scoring.
b. It is easy to score. With a scoring key, the test can be corrected by different
individuals without affecting the accuracy of the grades given.
c. It has high validity because it is comprehensive, with wide sampling of
essentials.
d. It is less time-consuming since many items can be answered in a given time.
e. It is fair to students since the slow writers can accomplish the test as fast as
the fast writers.
Advantages
33. a. It is difficult to construct and requires more time to prepare.
b. It does not afford the students the opportunity for training in
self-expression and thought organization.
c. It cannot be used to test ability in theme writing or
journalistic writing.
Disadvantages
35. a. The essay examination can be used in practically all subjects of the school curriculum.
b. It trains students for thought organization and self-expression.
c. It affords students opportunities to express their originality and independence of thinking.
d. Only the essay test can be used in some subjects, like composition writing and journalistic
writing, which cannot be tested by the objective type of test.
e. Essay examination measures higher mental abilities like comparison, interpretation,
criticism, defense of opinion and decision.
f. The essay test is easily prepared.
g. It is inexpensive.
Advantages
36. a. The limited sampling of items makes the test an unreliable measure
of achievements or abilities.
b. Questions usually are not well prepared.
c. Scoring is highly subjective due to the influence of the corrector’s
personal judgment.
d. Grading of the essay test is an inaccurate measure of pupils'
achievements due to the subjectivity of scoring.
Disadvantages
38. Frequency Distributions
- A simple, common sense technique for describing a set of test scores is through the use of a
frequency distribution. A frequency distribution is merely a listing of the possible score values and
the number of persons who achieved each score. Such an arrangement presents the scores in a more
simple and understandable manner than merely listing all of the separate scores. Consider a specific
set of scores to clarify these ideas.
- A set of scores for a group of 25 students who took a 50-item test is listed in Table 1. It is easier to
analyze the scores if they are arranged in a simple frequency distribution. (The frequency distribution
for the same set of scores is given in Table 2). The steps that are involved in creating the frequency
distribution are:
- First list the possible scores values in rank order, from highest to lowest. Then a second column
indicates the frequency or number of persons who received each score. For example, three students
received a score of 47, two received 40 and so forth. There is no need to list the score values below
the lowest score that anyone received.
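The two steps above can be sketched in a few lines of Python using the scores of Table 1; `Counter` tallies the frequency of each score value:

```python
from collections import Counter

# Scores of the 25 students in Table 1.
scores = [48, 50, 46, 41, 37, 48, 38, 47, 49, 44, 48, 49, 40,
          43, 47, 48, 42, 44, 38, 49, 34, 35, 47, 40, 48]

freq = Counter(scores)

# List the possible score values in rank order, highest to lowest,
# with the number of students who received each score.
for value in range(max(scores), min(scores) - 1, -1):
    print(f"{value:>5} {freq[value]:>9}")
```

The printed table matches Table 2: for example, five students scored 48 and no one scored 39.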
39. Table 1. Scores of 25 Students on a 50 Item Test
Student Score Student Score
A 48 N 43
B 50 O 47
C 46 P 48
D 41 Q 42
E 37 R 44
F 48 S 38
G 38 T 49
H 47 U 34
I 49 V 35
J 44 W 47
K 48 X 40
L 49 Y 48
M 40
40. Table 2. Frequency Distribution of the 25 Scores of Table 1
Score Frequency Score Frequency
50 1 41 1
49 3 40 2
48 5 39 0
47 3 38 2
46 1 37 1
45 0 36 0
44 2 35 1
43 1 34 1
42 1
41. -When there is a wide range of scores in a frequency distribution, the
distribution can be quite long, with a lot of zeros in the column of frequencies.
Such a frequency distribution can make interpretation of the scores difficult
and confusing. A grouped frequency distribution would be more appropriate
in this kind of situation. Groups of score values are listed rather than each
separate possible score value.
-If we were to change the frequency distribution in Table 2 into a grouped
frequency distribution, we might choose intervals such as 48-50, 45-47, and so
forth. The frequency corresponding to interval 48-50 would be 9 (1+3+5). The
choice of the width of the interval is arbitrary, but it must be the same for all
intervals. In addition, it is a good idea to have an odd-numbered interval width
(we used 3 above) so that the midpoint of the interval is a whole number. This
42. - strategy will simplify subsequent graphs and descriptions of the data. The grouped
frequency distribution is presented in Table 3.
Table 3. Grouped Frequency Distribution
Score Interval Frequency
48-50 9
45-47 4
42-44 4
39-41 3
36-38 3
33-35 2
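Grouping can be sketched directly from the raw scores. The interval width of 3 follows the example in the text; the bottom interval starts at 33 so that the lowest scores are covered:

```python
# Scores of the 25 students in Table 1.
scores = [48, 50, 46, 41, 37, 48, 38, 47, 49, 44, 48, 49, 40,
          43, 47, 48, 42, 44, 38, 49, 34, 35, 47, 40, 48]

def grouped_frequencies(scores, low, width):
    # Build equal-width intervals covering all scores, listed highest first.
    intervals = []
    start = low
    while start <= max(scores):
        intervals.append((start, start + width - 1))
        start += width
    return {(lo, hi): sum(lo <= s <= hi for s in scores)
            for lo, hi in reversed(intervals)}

for (lo, hi), f in grouped_frequencies(scores, 33, 3).items():
    print(f"{lo}-{hi} {f}")  # 48-50 9, 45-47 4, ... down to 33-35 2
```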
43. Frequency distributions summarize sets of test scores by listing
the number of people who received each test score. All of the
test scores can be listed separately, or the scores can be
grouped in a grouped frequency distribution.
44. MEASURES OF CENTRAL TENDENCY
- Frequency distributions are helpful for indicating the shape of a distribution of
scores, but we need more information than the shape to describe a distribution
adequately. We need to know where on the scale of measurement a distribution is
located and how the scores are dispersed in the distribution. For the former, we
compute measures of central tendency, and for the latter, we compute measures of
dispersion. Measures of central tendency are points on the scale of measurement,
and they are representative of how the scores tend to average. There are three
commonly used measures of central tendency: the mean, the median, and the mode,
but the mean is by far the most widely used.
45. The Mean
- The mean of a set of scores is the arithmetic mean. It is found by
summing the scores and dividing the sum by the number of scores. The
mean is the most commonly used measure of central tendency because
it is easily understood and is based on all of the scores in the set;
hence, it summarizes a lot of information. The formula for the mean is as
follows:
X̄ = ΣX / N
Where:
X̄ is the mean
X is the symbol for a score
Σ is the summation operator (it tells us to add all the Xs)
N is the number of scores.
For the set of scores in Table 1,
ΣX = 1100
N = 25,
so then
X̄ = 1100/25 = 44
47. The mean of the set of scores in Table 1 is 44. The mean does not have to
equal an observed score; it is usually not even a whole number.
When the scores are arranged in a grouped frequency distribution, the formula is:
X̄ = Σ(f · X mdpt) / N
where f · X mdpt means that the midpoint of the interval is multiplied by the
frequency for that interval. In computing the mean for the scores in Table 3,
using this formula we obtain:
X̄ = [9(49) + 4(46) + 4(43) + 3(40) + 3(37) + 2(34)] / 25 = 1096/25 = 43.84
Note that this mean is slightly different from the mean using ungrouped data. This
difference is due to the midpoint representing the scores in the interval rather than using
the actual scores.
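Both computations can be checked with a short sketch; the grouped version uses the interval midpoints and frequencies of Table 3:

```python
# Scores of the 25 students in Table 1.
scores = [48, 50, 46, 41, 37, 48, 38, 47, 49, 44, 48, 49, 40,
          43, 47, 48, 42, 44, 38, 49, 34, 35, 47, 40, 48]

# Ungrouped mean: sum of the scores divided by their number.
mean = sum(scores) / len(scores)

# Grouped mean: interval midpoint weighted by interval frequency (Table 3).
grouped = {49: 9, 46: 4, 43: 4, 40: 3, 37: 3, 34: 2}  # midpoint: frequency
grouped_mean = sum(m * f for m, f in grouped.items()) / sum(grouped.values())

print(mean, grouped_mean)  # 44.0 43.84
```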
48. The Median
- Another measure of central tendency is the median, which is the point that divides the distribution in
half; that is, half of the scores fall above the median and half of the scores fall below it.
- When there are only a few scores, the median can often be found by inspection. If there is an odd
number of scores, the middle score is the median. When there is an even number of scores, the
median is halfway between the two middle scores. However, when there are tied scores in the
middle of the distribution, or when the scores are in a frequency distribution, the median may not be so
obvious.
- Consider again the frequency distribution in Table 2. There were 25 scores in the distribution, so
the middle score should be the median. A straightforward way to find this median is to augment the
frequency distribution with a column of cumulative frequencies.
- Cumulative frequencies indicate the number of scores at or below each score. Table 4
indicates the cumulative frequencies for the data in Table 2.
49. Table 4. Frequency Distribution, Cumulative Frequencies for
the Scores of Table 2
Score Frequency Cumulative Frequency
50 1 25
49 3 24
48 5 21
47 3 16
46 1 13
45 0 12
44 2 12
43 1 10
42 1 9
41 1 8
40 2 7
39 0 5
38 2 5
37 1 3
36 0 2
35 1 2
34 1 1
For example, 7 people scored at or below a score of 40, and 21 persons scored at or below a
score of 48.
- To find the median, we need to locate the middle score in the cumulative frequency
column, because this score is the median. Since there are 25 scores
50. in the distribution, the middle one is the 13th, a score of 46. Thus, 46 is the
median of this distribution; half of the people scored above 46 and half scored below.
- When there are ties in the middle of the distribution, there may be a need
to interpolate between scores to get the exact median. However, such precision
is not needed for most classroom tests. The whole number closest to the
median is usually sufficient.
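The cumulative-frequency procedure can be sketched as a short function: walking up from the lowest score, the median is the score at which the cumulative frequency first reaches the middle position (the 13th of 25):

```python
from collections import Counter

# Scores of the 25 students in Table 1.
scores = [48, 50, 46, 41, 37, 48, 38, 47, 49, 44, 48, 49, 40,
          43, 47, 48, 42, 44, 38, 49, 34, 35, 47, 40, 48]

def median_from_frequencies(freq, n):
    middle = (n + 1) // 2        # the 13th score when n = 25
    cumulative = 0
    for score in sorted(freq):   # from the lowest score upward
        cumulative += freq[score]
        if cumulative >= middle:
            return score

print(median_from_frequencies(Counter(scores), len(scores)))  # 46
```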
51. The Mode
- The measure of central tendency that is the easiest to find is the mode. The mode
is the most frequently occurring score in the distribution. The mode of the scores in
Table 1 is 48. Five people had scores of 48, and no other score occurred as often.
- Each of these three measures of central tendency – the mean, the median, and
the mode – provides a legitimate definition of "average" performance on this test.
However, each does provide different information. The arithmetic average was 44; half
of the people scored at or below 46; and more people received 48 than any other
score.
- There are some distributions in which all the three measures of central
tendency are equal, but more often than not they will be different. The choice of which
measure of central tendency is best will differ from situation to situation. The mean is
used most often, perhaps because it includes information from all of the scores.
52. - When a distribution has a small number of very extreme
scores, though, the median may be a better definition of
central tendency. The mode provides the least information
and is used infrequently as an “average”. The mode can be
used with nominal scale data, just as an indicator of the most
frequently appearing category. The mean, the median, and
the mode all describe central tendency:
The mean is the arithmetic average.
The median divides the distribution in half
The mode is the most frequent score.
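All three measures can be confirmed for the Table 1 scores with the standard library's `statistics` module:

```python
from statistics import mean, median, mode

# Scores of the 25 students in Table 1.
scores = [48, 50, 46, 41, 37, 48, 38, 47, 49, 44, 48, 49, 40,
          43, 47, 48, 42, 44, 38, 49, 34, 35, 47, 40, 48]

print(mean(scores))    # 44 (the arithmetic average)
print(median(scores))  # 46 (half the scores fall at or below this point)
print(mode(scores))    # 48 (the most frequent score)
```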
53. MEASURES OF DISPERSION
- Measures of central tendency are useful for summarizing average
performance, but they tell us nothing about how the scores are distributed or
"spread out." Two distributions may have the same central tendency yet differ in
other ways: one distribution may have the scores tightly clustered around the
average, and the other distribution may have scores that are widely separated.
As you may have anticipated, there are descriptive statistics that measure
dispersion, which are also called measures of variability. These measures
indicate how spread out the scores tend to be.
54. The Range
- The range indicates the difference between the highest and lowest scores
in the distribution. It is simple to calculate, but it provides limited
information. We subtract the lowest from the highest score and add 1 so
that we include both scores in the spread between them. For the scores of
Table 2, the range is 50 – 34 + 1 = 17.
- A problem with using the range is that only the two most extreme scores are
used in the computation. There is no indication of the spread of scores
between the highest and lowest. Measures of dispersion that take into
consideration every score in the distribution are the variance and the standard
deviation. The standard deviation is used a great deal in interpreting scores
from standardized tests.
55. The Variance
- The variance measures how widely the scores in the distribution are
spread about the mean. In other words, the variance is the average squared
difference between the scores and the mean. As a formula, it looks like this:
S² = Σ(X − X̄)² / N
An equivalent formula, easier to compute, is:
S² = [ΣX² − (ΣX)²/N] / N
- The computation of the variance for the scores of Table 1 is illustrated
in Table 5. The data for students K through V are omitted to save space, but
these values are included in the column totals and in the computation.
56. The Standard Deviation
- The standard deviation also indicates how spread out the scores are, but it is expressed
in the same units as the original scores. The standard deviation is computed by finding
the square root of the variance:
S = √S²
- For the data in Table 1, the variance is 22.8. The standard deviation is √22.8, or 4.77.
- The scores of most norm groups have the shape of a "normal distribution" – a symmetrical,
bell-shaped distribution with which most people are familiar. With a normal distribution,
about 95 percent of the scores are within two standard deviations of the mean.
Even when scores are not normally distributed, most of the scores will be within two
standard deviations of the mean. In the example, the mean minus two standard deviations is
34.46, and the mean plus two standard deviations is 53.54. Therefore, only one score is
outside of this interval; the lowest score, 34, is slightly more than two standard deviations
from the mean.
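The two-standard-deviation check can be reproduced from the raw scores of Table 1:

```python
# Scores of the 25 students in Table 1.
scores = [48, 50, 46, 41, 37, 48, 38, 47, 49, 44, 48, 49, 40,
          43, 47, 48, 42, 44, 38, 49, 34, 35, 47, 40, 48]

n = len(scores)
mean = sum(scores) / n
sd = (sum((x - mean) ** 2 for x in scores) / n) ** 0.5  # sqrt(22.8) ≈ 4.77

low, high = mean - 2 * sd, mean + 2 * sd                # ≈ 34.46 and 53.54
outside = [x for x in scores if x < low or x > high]
print(outside)  # [34] – only the lowest score falls outside
```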
57. Table 5. Computation of the Variance for the Scores of Table 1
Student X X − X̄ (X − X̄)²
A 48 4 16
B 50 6 36
C 46 2 4
D 41 -3 9
E 37 -7 49
F 48 4 16
G 38 -6 36
H 47 3 9
I 49 5 25
J 44 0 0
. . . .
. . . .
. . . .
W 47 3 9
X 40 -4 16
Y 48 4 16
Totals 1100 0 570
58. To determine the mean:
X̄ = ΣX/N = 1100/25 = 44
Then, to determine the variance:
S² = Σ(X − X̄)²/N = 570/25 = 22.8
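The same computation can be sketched directly, without building Table 5 by hand:

```python
# Scores of the 25 students in Table 1.
scores = [48, 50, 46, 41, 37, 48, 38, 47, 49, 44, 48, 49, 40,
          43, 47, 48, 42, 44, 38, 49, 34, 35, 47, 40, 48]

n = len(scores)
mean = sum(scores) / n                         # 1100/25 = 44.0
squared_deviations = [(x - mean) ** 2 for x in scores]
variance = sum(squared_deviations) / n         # 570/25 = 22.8
print(mean, variance)
```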
-The usefulness of the standard deviation
becomes apparent when scores from different
tests are compared. Suppose that two tests are
given to the same class, one on fractions and the
other on reading comprehension. The fractions
test has a mean of 30 and a standard deviation
of 8; the reading comprehension test has a
mean of 60 and a standard deviation of 10.
- If Ann scored 38 on the fractions test and
55 on the reading comprehension test, it
appears from the raw scores that she did better
in reading than in fractions, because 55 is
greater than 38.
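Comparing raw scores from tests with different means and standard deviations is misleading; converting each score to a standard (z) score expresses both in standard-deviation units. A sketch using Ann's scores from the example:

```python
def z_score(raw, mean, sd):
    # Number of standard deviations the raw score lies from the mean.
    return (raw - mean) / sd

z_fractions = z_score(38, 30, 8)   # (38 - 30) / 8  = +1.0
z_reading = z_score(55, 60, 10)    # (55 - 60) / 10 = -0.5

# Relative to her class, Ann actually did better in fractions.
print(z_fractions, z_reading)
```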
59. • Descriptive statistics that indicate dispersion are the range,
the variance, and the standard deviation.
• The range is the difference between the highest and lowest
scores in the distribution plus one.
• The standard deviation is a unit of measurement that shows
by how much the separate scores tend to differ from the
mean.
• The variance is the square of the standard deviation. Most
scores are within two standard deviations from the mean.
60. Graphing Distributions
-A graph of a distribution of test scores is often better understood than is the frequency
distribution or a mere table of numbers.
-The general pattern of scores, as well as any unique characteristics of the distribution,
can be seen easily in simple graphs. There are several kinds of graphs that can be
used, but a simple bar graph, or histogram, is as useful as any.
-The general shape of the distribution is clear from the graph. Most of the scores in this
distribution are high, at the upper end of the graph.
-Such a shape is quite common for the scores of classroom tests.
-A normal distribution has most of the test scores in the middle of the distribution and
progressively fewer scores toward the extremes. The scores of norm groups are seldom
graphed, but they could be if we were concerned about seeing the specific shape of the
distribution of scores.
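A rough text histogram is enough to see the shape described above; the sketch below plots the Table 1 scores with one asterisk per student:

```python
from collections import Counter

# Scores of the 25 students in Table 1.
scores = [48, 50, 46, 41, 37, 48, 38, 47, 49, 44, 48, 49, 40,
          43, 47, 48, 42, 44, 38, 49, 34, 35, 47, 40, 48]

freq = Counter(scores)
for value in range(min(scores), max(scores) + 1):
    print(f"{value:>3} | {'*' * freq[value]}")
```

The longest bars cluster near the top of the score range, the shape the text describes as typical for classroom tests.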
61. Assessment of Learning
• Criterion-referenced tests lend themselves more to individualizing
instruction than do norm-referenced tests. In individualizing instruction, a
student's performance is interpreted more appropriately by comparison to
the desired behaviors for that particular student, rather than by comparison
with the performance of a group.
• Norm-referenced test items tend to be of average difficulty. Criterion-
referenced tests have item difficulty matched to the learning tasks. This
distinction in item difficulty is necessary because norm-referenced tests
emphasize discrimination among individuals and criterion-referenced
tests emphasize the description of performance.
62. Criterion- Referenced Interpretation
-The concepts of criterion-referenced testing have developed with a dual meaning
for criterion-referenced. On one hand, it means referencing an individual's
performance to some criterion that is a defined performance level. The individual's
score is interpreted in absolute rather than relative terms. The criterion, in this
situation, means some level of specified performance that has been determined
independently of how others might perform.
-The second meaning of criterion-referenced involves the idea of a defined behavioral
domain – that is, a defined body of learner behaviors. The learner's performance
on a test is referenced to a specifically defined group of behaviors. The criterion in
this situation is the desired behaviors.
63. Criterion-referenced tests focus on a specific group of learner behaviors.
To show the contrast, consider an example. Arithmetic skills represent a
general and broad category of student outcomes and would likely be
measured by a norm-referenced test. On the other hand, behaviors such
as solving addition problems with two five-digit numbers or
-Norm-referenced test scores are transformed to positions within the
normative group. Criterion-referenced test scores are usually given as the
percentage of correct answers or another indicator of mastery or the lack
thereof. Criterion-referenced tests tend to lend…
Distinctions between Norm-Referenced
and Criterion-Referenced Tests
64. Norm-Referenced and Criterion-
Referenced Measurement
-When we contrast norm-referenced measurement (or testing) with
criterion-referenced measurement, we are basically referring to two
different ways of interpreting information. However, Popham (1988, page
135) points out that certain characteristics tend to go with each type of
measurement, and it is unlikely that results of norm-referenced tests are
interpreted in criterion-referenced ways and vice versa.
65. Norm-Referenced Interpretation
-Norm-referenced interpretation stems from the desire to differentiate
among individuals or to discriminate among the individuals for some
defined group on whatever is being measured. In norm-referenced
measurement, an individual's score is interpreted by comparing it to the
scores of a defined group, often called the normative group. Norms
represent the scores earned by one or more groups of students who
have taken the test.
66. Achievement Test as an Example.
-Most standardized achievement tests, especially those covering
several skills and academic areas, are primarily designed for norm-
referenced interpretations. However, the form of the results and the
interpretations of these tests are somewhat complex and require
concepts not yet introduced in this text. Scores on teacher-
constructed tests are often given norm-referenced interpretations.
67. STAGES IN TEST CONSTRUCTION
I. Planning the Test
A. Determining the Objectives
B. Preparing the Table of Specifications
C. Selecting the Appropriate Item Format
D. Writing the Test Items
E. Editing the Test Items
II. Trying Out the Test
A. Administering the First Tryout- then Item Analysis
B. Administering the Second Tryout- then Item Analysis
C. Preparing the Final Form of the Test
III. Establishing Test Validity
IV. Establishing the Test Reliability
V. Interpreting the Test Score
69. -The following are the major considerations in
test construction:
Type of Test
- Our usual idea of testing is an in-class test that is administered by the teacher.
However, there are many variations on this theme: group test, individual test, written test,
oral test, speed test, power test, pretest and posttest. Each of these has different
characteristics that must be considered when the test is planned.
Test Length
- A major decision in test planning is how many items should be included on the
test. There should be enough to cover the content adequately, but the length of the
class period or the attention span or fatigue limits of the students usually restricts the
test length. Decisions about test length are usually based on practical constraints
more than on theoretical considerations.
70. Item Formats
- Determining what kinds of items are included on the test is a major
decision. Should they be objectively scored formats such as multiple
choice or matching type? Should they cause the students to organize
their own thoughts through short-answer or essay formats? These are
important questions that can be answered only by the teacher in terms of
the local context, his or her students, his or her classroom, and the
specific purpose of the test. Once the planning decision is made, the
item writing begins. This task is often the most feared by beginning test
constructors. However, the procedures are more common sense than
formal rules.
71. POINTS TO BE CONSIDERED IN
PREPARING A TEST
1.Are the instructional objectives clearly defined?
2.What knowledge, skills and attitudes do you want to measure?
3.Did you prepare a table of specifications?
4.Did you formulate well defined and clear test items?
5.Did you employ correct English in writing the items?
6.Did you avoid giving clues to the correct answer?
7.Did you test the important ideas rather than the trivial?
8.Did you adapt the test's difficulty to your students' ability?
9.Did you avoid using textbook jargon?
10.Did you cast the items in positive forms?
11.Did you prepare a scoring key?
12.Does each item have a single correct answer?
13.Did you review your items?
72. GENERAL PRINCIPLES IN CONSTRUCTING
DIFFERENT TYPES OF TEST
1. The test items should be selected very carefully. Only important facts should be included.
2. The test should have extensive sampling of items.
3. The test items should be carefully expressed in simple, clear, definite, and meaningful sentences.
4. There should be only one possible correct response for each test item.
5. Each item should be independent. Leading clues to other items should be avoided.
6. Lifting sentences from books should not be done, to encourage thinking and understanding.
7. The first-person pronouns I and we should not be used.
8. Various types of test items should be made to avoid monotony.
9. The majority of the test items should be of moderate difficulty. A few difficult and a few easy items should be included.
10. The test items should be arranged in ascending order of difficulty. Easy items should come at the beginning to
encourage the examinee to pursue the test, and the most difficult items should be at the end.
73. 11. Clear, concise, and complete directions should precede all types of tests. Sample test items may be
provided for expected responses.
12. Items which can be answered by previous experience alone without knowledge of the subject matter should
not be included.
13. Catchy words should not be used in the test items.
14. Test items must be based upon the objectives of the course and upon the course content.
15. The test should measure the degree of achievement or determine the difficulties of the learners.
16. The test should emphasize ability to apply and use facts as well as knowledge of facts.
17. The test should be of such length that it can be completed within the allotted time by all or nearly all of the
pupils. The teacher should take the test herself to determine its approximate time allotment.
18. Rules governing good language expression, grammar, spelling, punctuation, and capitalization should be
observed at all times.
19. Information on how scoring will be done should be provided.
20. A scoring key for correcting and scoring the test should be provided.
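A scoring key (points 19 and 20) can be as simple as a mapping from item numbers to correct answers. A minimal sketch in Python, where the item numbers and answers are hypothetical examples rather than from an actual test:

```python
# Minimal scoring-key sketch; item numbers and answers are hypothetical.
def score_test(key, responses):
    """Count how many responses match the scoring key."""
    return sum(1 for item, answer in key.items()
               if responses.get(item) == answer)

key = {1: "b", 2: "d", 3: "a", 4: "c"}          # the scoring key
responses = {1: "b", 2: "d", 3: "c", 4: "c"}    # one examinee's answers
print(score_test(key, responses))  # 3 correct answers
```

Preparing the key before administering the test also doubles as a check on point 12 above: every item must have exactly one entry in the key.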
74. POINTERS TO BE OBSERVED IN CONSTRUCTING AND SCORING
THE DIFFERENT TYPES OF TESTS
75. A.RECALL TYPES
1.Simple recall type
a)This type consists of questions calling for a single word or expression as an
answer.
b)Items usually begin with who, where, when, and what.
c)Score is the number of correct answers.
2.Completion type
a. Only important words or phrases should be omitted to avoid confusion.
b. Blanks should be of equal length.
c. The blank, as much as possible, is placed near or at the end of the sentence.
d. The articles a, an, and the should not be provided before the omitted word or
phrase, to avoid giving clues to the answer.
e. Score is the number of correct answers.
76. 3.Enumeration Type
a. The exact number of expected answers should be
stated.
b. Blanks should be of equal length.
c. Score is the number of correct answers.
4.Identification type
a. The items should make an examinee think of a word,
number, or group of words that would complete the
statement or answer the problem.
b. Score is the number of correct answers.
77. B. RECOGNITION TYPES
1.True-false or alternate-response type
a. Declarative sentences should be used.
b. The number of “true” and “false” items should be more or less equal.
c. The truth or falsity of the sentence should not be too evident.
d. Negative statements should be avoided.
e. The “modified true-false” is preferable to the plain true-false.
f. In arranging the items, avoid the regular recurrence of “true” and “false” statements.
g. Avoid using specific determiners like: all, always, never, none, nothing, most, often,
some, etc., and avoid weak qualifiers such as may, sometimes, as a rule, in general, etc.
h. Minimize the use of qualitative terms like: few, great, many, more, etc.
i. Avoid leading clues to answers at all times.
j. Score is the number of correct answers in “modified true-false”, and right answers
minus wrong answers in “plain true-false”.
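The two scoring rules in point j reduce to a pair of one-line functions. A minimal sketch; the counts of right and wrong answers used in the example are hypothetical:

```python
# Scoring rules for the two true-false variants (point j above).
def plain_tf_score(rights, wrongs):
    """Plain true-false: right answers minus wrong answers."""
    return rights - wrongs

def modified_tf_score(rights):
    """Modified true-false: simply the number of correct answers."""
    return rights

# Hypothetical examinee: 18 right, 7 wrong out of 25 items.
print(plain_tf_score(18, 7))   # 11
print(modified_tf_score(18))   # 18
```

The subtraction in the plain variant penalizes guessing, which is why the same raw performance yields a lower score than under the modified rule.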
78. 2.Yes-No type
a. The items should be in interrogative sentences.
b. The same rules as in true-false are applied.
3.Multiple-response type
a. There should be three to five choices. The number of choices used in the first item should
be the same in all the items of this type of test.
b. The choices should be numbered or lettered so that only the number or letter can be written
in the blank provided.
c. If the choices are figures, they should be arranged in ascending order.
d. Avoid the use of “a” or “an” as the last word prior to the listing of the responses.
e. Random occurrence of responses should be employed.
f. The choices, as much as possible, should be at the end of the statements.
g. The choices should be related in some way or should belong to the same class.
h. Avoid the use of “none of these” as one of the choices.
i. Score is the number of correct answers.
79. 4.Best answer type
a. There should be three to five choices, all of which are right but
vary in their degree of merit, importance, or desirability.
b. The other rules for multiple-response items are applied here.
c. Score is the number of correct answers.