2. Assessment of Learning
• It focuses on the development and utilization of assessment
tools to improve the teaching-learning process.
• Measurement refers to the quantitative aspect of evaluation.
It involves outcomes that can be quantified statistically.
• Evaluation is the qualitative aspect of determining the
outcomes of learning. It involves value judgment.
• A test consists of questions, exercises, or other devices for
measuring the outcomes of learning.
3. Educational Technology
• Audio visual aids are defined as any device used to aid in the
communication of an idea.
• As such, virtually anything can be used as an audio visual aid,
provided it successfully communicates the idea or information
for which it is designed.
• Audio visual aids include still photography, motion pictures,
audio or video tape, slides, and filmstrips, prepared
individually or in combination to communicate information or
to elicit a desired audience response.
• Even though early aids, such as maps and drawings, are still in
use, advances in the audiovisual field have opened up new
ways of presenting them, such as video tapes and multimedia
equipment, which allow more professional and entertaining
presentations not only in the classroom but anywhere ideas
are to be conveyed to an audience.
4. Device
• A device is any means, other than the subject matter itself,
that is employed by the teacher in presenting the subject
matter to the learner.
• Purpose of visual devices
• 1. To challenge the student’s attention
• 2. To stimulate the imagination and develop the
mental imagery of the pupils.
• 3. To facilitate the understanding of the pupils.
• 4. To provide motivation to the learners.
• 5. To develop the ability to listen.
5. Traditional Forms of Visual Aids
• 1. Demonstration
• 2. Field trips
• 3. Laboratory experiments
• 4. Pictures, films, simulations, models
• 5. Real objects
• Classification of Devices
• 1. Extrinsic – used to supplement a method
• Ex. Pictures, graphs, film strips, slides, etc.
• 2. Intrinsic – used as a part of the method or teaching procedure
• Ex. Pictures accompanying an article
• 3. Material devices – devices that have no bearing on the subject
matter.
• Ex. Blackboard, chalk, books, pencils, etc.
• 4. Mental devices – devices that are related in form and meaning
to the subject matter being presented.
• Ex. Questions, projects, drills, lesson plans, etc.
6. NONPROJECTED AUDIOVISUAL AIDS
• Nonprojected aids are those that do not require the
use of audiovisual equipment such as a projector
and screen.
• These include charts, graphs, maps, illustrations,
photographs, brochures, and handouts.
• Charts are commonly used almost anywhere.
• A chart is a diagram which shows relationships.
• An organizational chart is one of the most widely
and commonly used kinds of charts.
7. Classification of Tests
• 1. According to manner of response
• a. oral
• b. written
2. According to method of preparation
a. subjective/essay
b. objective
3. According to the nature of the answer
a. personality test
b. intelligence test
c. aptitude test
d. achievement or summative test
e. sociometric test
f. diagnostic or formative test
g. trade or vocational test
8. Classification of Tests
Objective tests are tests which have definite answers and are therefore least
subject to personal bias.
Teacher-made tests or educational tests are constructed by teachers based
on the contents of the different subjects taught.
Diagnostic tests are used to measure a student's strengths and weaknesses.
Formative and summative are terms often used with evaluation, but they may
also be used with testing. Formative testing is done to monitor students'
attainment of the instructional objectives. Summative testing is done at
the conclusion of instruction and measures the extent to which students have
attained the desired outcomes.
Standardized tests are already valid, reliable, and objective. Standardized tests
are tests whose contents have been selected and for which norms or standards
have been established.
Standards or norms are the goals to be achieved, expressed in terms of the
average performance of the population tested.
9. Criteria of a Good Examination
• A good examination must pass the following criteria:
• Validity – validity refers to the degree to which a test
measures what it is intended to measure. It is the usefulness
of the test for a given purpose.
• Reliability – reliability pertains to the consistency with which
a test measures whatever it measures; a reliable test yields
similar scores on repeated administrations.
• Objectivity – objectivity is the degree to which personal bias is
eliminated in the scoring of the answers. When we refer
to the quality of measurement, essentially we mean the
amount of information contained in a score generated by
the measurement.
10. Levels of Measurement
• The four scales of measurement are nominal, ordinal, interval, and ratio.
• The terms nominal, ordinal, interval, and ratio actually
form a hierarchy.
• Nominal scales of measurement are the least sophisticated
and contain the least information.
• Ordinal, interval, and ratio scales increase respectively in
sophistication.
• The arrangement is a hierarchy: each higher level carries the
information of the levels below it, along with additional information.
11. Nominal Measurement
• Nominal scales – are the least sophisticated; they merely
classify objects or events by assigning numbers to them.
These numbers are arbitrary and imply no quantification,
but the categories must be mutually exclusive and
exhaustive.
• Ordinal Measurement
• Ordinal scales – classify, but they also assign rank order.
An example of ordinal measurement is ranking individuals
in a class according to their test scores. Student scores
could be ordered from first, second, third, and so forth down
to the lowest score.
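The ranking described above can be sketched in Python; the student names and scores here are invented for illustration:

```python
# Ordinal measurement sketch: rank hypothetical students by test score.
scores = {"Ana": 92, "Ben": 78, "Cara": 85, "Dan": 71}

# Sort students from highest to lowest score, then number them 1, 2, 3, ...
ordered = sorted(scores, key=scores.get, reverse=True)
ranks = {student: position + 1 for position, student in enumerate(ordered)}

# Ranks preserve order only: the distance between rank 1 and rank 2
# says nothing about how far apart the underlying scores are.
```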
12. Interval Measurement
• In order to be able to add and subtract scores, we use interval scales,
sometimes called equal interval or equal unit measurement. This
measurement scale contains the nominal and ordinal properties and is
also characterized by equal units between score points.
• Ratio Measurement
• The most sophisticated type of measurement includes all the preceding
properties. In a ratio scale, the zero point is not arbitrary: a score of zero
indicates the absence of what is being measured. A score of zero on a social
studies test, by contrast, may not indicate the complete absence of social
studies knowledge.
• The desirability of ratio measurement scales is that they allow ratio comparisons.
Ratio measurement is rarely achieved in educational assessment, in either
cognitive or affective areas.
• We can seldom say that one person's intelligence or achievement is 1½ times as
great as that of another person.
13. Norm-Referenced and Criterion-Referenced
Measurement
• When we contrast norm-referenced measurement (or testing) with
criterion-referenced measurement, we are basically referring to two
different ways of interpreting information.
• Norm-referenced interpretation historically has been used in education
and remains common in today's schools. The terminology of criterion-referenced
measurement has existed for close to three decades.
• Norm-Referenced Interpretation
• Norm-referenced interpretation stems from the desire to
differentiate among individuals, or to discriminate among the
individuals of some defined group, on whatever is being
measured.
• It is a relative interpretation based on an individual's position with
respect to some group, often called the normative group.
• Norms consist of the scores, usually in some form of descriptive
statistics, of the normative group.
14. Achievement Test as an Example
• Most standardized achievement tests, especially those
covering several skills and academic areas, are primarily
designed for norm-referenced interpretations.
• The form of results and the interpretations of these tests are
somewhat complex and require concepts not yet introduced
in this text.
• Scores on teacher-constructed tests are often given norm-
referenced interpretations.
• Specified percentages of scores are assigned the different
grades; for example, the grades assigned to final examination
performance might be 10 percent As, 20 percent Bs, 40 percent Cs,
20 percent Ds, and 10 percent Fs.
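A quota-based grade assignment like the one above can be sketched in Python; the quota table and the exam scores are illustrative assumptions, not data from the text:

```python
# Norm-referenced grading sketch: grades assigned by fixed quotas
# (10% A, 20% B, 40% C, 20% D, 10% F), as in the example above.
def quota_grades(n, quotas=(("A", 0.10), ("B", 0.20), ("C", 0.40),
                            ("D", 0.20), ("F", 0.10))):
    """Return n letter grades, best first, following the percentage quotas."""
    grades = []
    for letter, share in quotas:
        grades.extend([letter] * round(share * n))
    grades.extend(["F"] * (n - len(grades)))  # rounding leftovers get the lowest grade
    return grades[:n]

# Pair grades with scores ranked from highest to lowest (hypothetical scores).
exam_scores = sorted([95, 90, 85, 80, 75, 70, 65, 60, 55, 50], reverse=True)
graded = list(zip(exam_scores, quota_grades(len(exam_scores))))
```

Note that the interpretation is entirely relative: the same raw score earns a different grade in a different group.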
15. Criterion-Referenced Interpretation
• The concepts of criterion-referenced testing have developed
with a dual meaning for criterion-referenced.
• The first meaning is referencing an individual's performance to
some criterion that is a defined performance level.
• The second meaning involves the idea of a defined behavioral
domain, that is, a defined body of learner behaviors. The learner's
performance on a test is referenced to a specifically defined group
of behaviors.
• Criterion-referenced interpretation is an absolute rather than a
relative interpretation, referenced to a defined body of learner
behaviors or, as is commonly done, to some specified level of
performance.
• A student who does not attain the criterion has not mastered
the skill sufficiently to move ahead in the instructional
sequence. To a large extent, the criterion is based on teacher
judgment.
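An absolute interpretation like this reduces to comparing each learner with a fixed performance level. A minimal sketch, in which the 80 percent criterion is a hypothetical teacher judgment:

```python
# Criterion-referenced sketch: compare a learner's performance with an
# absolute criterion level rather than with other learners.
def has_mastered(correct, total, criterion=0.80):
    """True if the proportion of correct answers meets the criterion."""
    return correct / total >= criterion
```

Because the reference point is absolute, every learner in a group (or none) can attain mastery, unlike the fixed-quota grading of norm-referenced interpretation.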
16. Distinctions between Norm-Referenced and
Criterion-Referenced Tests
• Although interpretations, not characteristics, provide the distinctions
between norm-referenced and criterion-referenced tests, the two types
tend to differ in some ways.
• Norm-referenced tests are usually more general and comprehensive and
cover a large domain of content and learning tasks. They are used for
survey testing, although this is not their exclusive use.
• Scores are transformed into positions within the normative group.
• Criterion-referenced tests focus on a specific group of learner behaviors.
To show the contrast, consider an example: arithmetic skills represent a
general and broad category of student outcomes and would likely be
measured by a norm-referenced test.
• They focus more on subskills than on broad skills.
• When mastery learning is involved, criterion-referenced measurement
would be used.
17. STAGES IN TEST CONSTRUCTION
• I. Planning the test
• A. Determining the objectives
• B. Preparing the Table of Specifications
• C. Selecting the Appropriate Item Format
• D. Writing the Test Items
• E. Editing the Test Items
II. Trying Out the Test
A. Administering the First Tryout – then Item Analysis
B. Administering the Second Tryout – then Item Analysis
C. Preparing the Final Form of the Test
III. Establishing Test Validity
IV. Establishing the Test Reliability
V. Interpreting the Test Score
18. MAJOR CONSIDERATIONS IN TEST
CONSTRUCTION
• The following are the major considerations in test
construction:
• Type of Test
• Our usual idea of testing is an in-class test administered
by the teacher. However, there are many variations on this
theme: group tests, individual tests, written tests, oral tests,
speed tests, power tests, pretests, and posttests.
• Each of these has different characteristics that must be
considered when the tests are planned. These characteristics can
be communicated to students, administrators, parents, and
others who may be affected by the testing program.
19. Test Length
• A major decision in test planning is how many items
should be included on the test.
• Most teachers want test scores to be determined by how much
the student understands rather than by how quickly he or she
answers the questions.
• Item Formats
• Determining what kinds of items to include on the test is a
major decision. Should they be objectively scored formats
such as multiple choice or matching?
• These are some important questions that can be answered
only by the teacher in terms of the local context, his or her
students, his or her classroom, and the specific purpose of the
test.
20. POINTS TO BE CONSIDERED IN PREPARING
A TEST
• 1. Are the instructional objectives clearly defined?
• 2. What knowledge, skills, and attitudes do you want to measure?
• 3. Did you prepare a table of specifications?
• 4. Did you formulate well defined and clear test items?
• 5. Did you employ correct English in writing the items?
• 6. Did you avoid giving clues to the correct answer?
• 7. Did you test the important ideas rather than the trivial?
• 8. Did you adapt the test's difficulty to your students' ability?
• 9. Did you avoid using textbook jargon?
• 10. Did you cast the items in positive form?
• 11. Did you prepare a scoring key?
• 12. Does each item have a single correct answer?
• 13. Did you review your items?
21. GENERAL PRINCIPLES IN CONSTRUCTING
DIFFERENT TYPES OF TESTS
• 1. The test items should be selected very carefully. Only important facts should be
included.
• 2. The test should have an extensive sampling of items.
• 3. The test items should be carefully expressed in simple, clear, definite, and
meaningful sentences.
• 4. There should be only one possible correct response for each test item.
• 5. Each item should be independent. Leading clues to other items should be avoided.
• 6. Lifting sentences from books should not be done, to encourage thinking and
understanding.
• 7. The first-person pronouns I and we should not be used.
• 8. Various types of test items should be used to avoid monotony.
• 9. The majority of the test items should be of moderate difficulty. A few difficult and
a few easy items should be included.
• 10. The test items should be arranged in ascending order of difficulty. Easy items
should be at the beginning, to encourage the examinee to pursue the test, and the
most difficult items should be at the end.
22. -
11. Clear, concise, and complete directions should precede all types of tests.
Sample test items may be provided for expected responses.
12. Items which can be answered by previous experience alone, without
knowledge of the subject matter, should not be included.
13. Catchy words should not be used in the test items.
14. Test items must be based upon the objectives of the course and upon the
course content.
15. The test should measure the degree of achievement or determine the
difficulties of the learners.
16. The test should emphasize the ability to apply and use facts as well as
knowledge of facts.
17. The test should be of such length that it can be completed within the time
allotted by all or nearly all of the pupils. The teacher should take the test
herself to determine its approximate time allotment.
18. Rules governing good language expression, grammar, spelling,
punctuation, and capitalization should be observed in all items.
19. Information on how scoring will be done should be provided.
20. Scoring keys for correcting and scoring tests should be provided.
23. POINTERS TO BE OBSERVED IN CONSTRUCTING
AND SCORING THE DIFFERENT TYPES OF TESTS
• A. RECALL TYPES
• 1. Simple recall type
a. This type consists of questions calling for a single word or
expression as an answer.
b. Items usually begin with who, where, when, and what.
c. Score is the number of correct answers.
2. Completion type
a. Only important words or phrases should be omitted, to avoid
confusion.
b. Blanks should be of equal length.
c. The blank, as much as possible, should be placed near or at the end
of the sentence.
d. The articles a, an, and the should not be provided before the omitted
word or phrase, to avoid giving clues to answers.
e. Score is the number of correct answers.
24. • 3. Enumeration
a. The exact number of expected answers should be stated.
b. Blanks should be of equal length.
c. Score is the number of correct answers.
4. Identification type
a. The items should make an examinee think of a word, number, or
group of words that would complete the statement or answer the problem.
b. Score is the number of correct answers.
B. RECOGNITION TYPES
1. True-false or alternative-response type
a. Declarative sentences should be used.
b. The number of "true" and "false" items should be more or less
equal.
c. The truth or falsity of the sentence should not be too evident.
d. Negative statements should be avoided.
e. The "modified true-false" is preferable to the "plain true-
false".
25. -
f. In arranging the items, avoid a regular recurrence of "true" and "false"
statements.
g. Avoid using specific determiners like all, always, never, none, nothing, as a
rule, in general, etc.
h. Minimize the use of qualitative terms like few, great, many, more, etc.
i. Avoid leading clues to answers in all items.
j. Score is the number of correct answers in "modified true-false", and right
answers minus wrong answers in "plain true-false".
2. Yes-no type
a. The items should be in interrogative sentences.
b. The same rules as in "true-false" apply.
3. Multiple-response type
a. There should be three to five choices. The number of choices used
in the first item should be the same as the number of choices in all the items
of this type of test.
26. -
b. The choices should be numbered or lettered so that only the number
or letter can be written on the blank provided.
c. If the choices are figures, they should be arranged in ascending order.
d. Avoid the use of "a" or "an" as the last word prior to the listing of
the responses.
e. Random occurrence of responses should be employed.
f. The choices, as much as possible, should be at the end of the
statements.
g. The choices should be related in some way or should belong to the
same class.
h. Avoid the use of "none of these" as one of the choices.
i. Score is the number of correct answers.
4. Best-answer type
a. There should be three to five choices, all of which are right but vary in
their degree of merit, importance, or desirability.
b. The other rules for multiple-response items apply here.
c. Score is the number of correct answers.
27. -
5. Matching type
a. There should be two columns. Under column "A" are the stimuli, which
should be longer and more descriptive than the responses under column "B".
A response may be a word, a phrase, a number, or a formula.
b. The stimuli under column "A" should be numbered and the responses
under column "B" should be lettered. Answers are indicated by letters only,
on lines provided in column "A".
c. The number of pairs usually should not exceed twenty items. Fewer than
ten introduces chance elements. Twenty pairs may be used, but more than
twenty is decidedly wasteful of time.
d. The number of responses in column "B" should be two or more greater
than the number of items in column "A", to avoid guessing.
e. Only one correct matching for each item should be possible.
f. Matching sets should be neither too long nor too short.
g. All items should be on the same page to avoid the turning of pages in the
process of matching pairs.
h. Score is the number of correct answers.
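The true-false scoring rules above (rights only for "modified true-false", rights minus wrongs for "plain true-false") can be sketched in Python; the answer lists are hypothetical:

```python
# Scoring sketch for the two true-false scoring rules.
def score_modified(answers, key):
    """Modified true-false: score is the number of right answers."""
    return sum(a == k for a, k in zip(answers, key))

def score_plain(answers, key):
    """Plain true-false: rights minus wrongs (a correction for guessing)."""
    rights = sum(a == k for a, k in zip(answers, key))
    return rights - (len(key) - rights)
```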
28. C. ESSAY TYPE EXAMINATIONS
Common types of essay questions (the types are related to the purpose for
which the essay examination is to be used):
1. Comparison of two things
2. Explanation of the use or meaning of a statement or passage
3. Analysis
4. Decisions for or against
5. Discussion
How to construct essay examinations:
1. Determine the objectives or essentials for each question to be
evaluated.
2. Phrase questions in simple, clear, and concise language.
3. Suit the length of the questions to the time available for answering
the essay examination. The teacher should try to answer the test herself.
4. Scoring
a. Have a model answer in advance.
b. Indicate the number of points for each question.
c. Score a point for each essential.
29. Advantages and Disadvantages of the Objective Type
of Tests
• Advantages
• a. The objective test is free from personal bias in scoring.
• b. It is easy to score. With a scoring key, the test can be corrected
by different individuals without affecting the accuracy of the
grades given.
• c. It has high validity because it is comprehensive, with a wide
sampling of essentials.
• d. It is less time-consuming, since many items can be answered in a
given time.
• e. It is fair to students, since slow writers can accomplish the
test as fast as fast writers.
30. -
• Disadvantages
• a. It is difficult to construct and requires more time to prepare.
• b. It does not afford students the opportunity to train in
self-expression and thought organization.
• c. It cannot be used to test ability in theme writing or journalistic
writing.
ADVANTAGES AND DISADVANTAGES OF THE
ESSAY TYPE OF TESTS
Advantages
a. The essay examination can be used in practically all subjects of
the school curriculum.
b. It trains students in thought organization and self-expression.
c. It affords students opportunities to express their originality and
independence of thinking.
31. -
• d. Only the essay test can be used in some subjects, like composition
writing.
• e. The essay examination measures higher mental abilities like
comparison, interpretation, criticism, defense of opinion, and decision.
• f. The essay test is easily prepared.
• g. It is inexpensive.
• Disadvantages
• a. The limited sampling of items makes the test an unreliable measure
of achievements or abilities.
• b. Questions usually are not well prepared.
• c. Scoring is highly subjective due to the influence of the corrector's
personal judgment.
• d. Grading of the essay test is an inaccurate measure of the pupil's
achievement due to the subjectivity of scoring.
32. STATISTICAL MEASURES OR TOOLS USED IN
INTERPRETING NUMERICAL DATA
• Frequency Distributions
• A simple, common-sense technique for describing a set of test
scores is a frequency distribution.
• A frequency distribution is merely a listing of the possible score
values and the number of persons who achieved each score.
• Such an arrangement presents the scores in a simpler and more
understandable manner than merely listing all of the separate
scores. Consider a specific set of scores to clarify these ideas.
• First, list the possible score values in rank order, from highest to
lowest.
• The second column indicates the frequency, or number of persons
who received each score.
• The accompanying tables provide examples.
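The two steps above (score values in rank order, frequencies alongside) can be sketched in Python; the scores below are hypothetical, not the full data set of the tables:

```python
# Frequency-distribution sketch: list the score values from highest to
# lowest with the number of persons who received each.
from collections import Counter

scores = [48, 50, 46, 41, 48, 37, 47, 48, 44, 46]
freq = Counter(scores)  # tally how many persons got each score

# Each pair is (score value, frequency), highest score first.
distribution = sorted(freq.items(), reverse=True)
```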
36. Measures of Central Tendency
• Frequency distributions are helpful for indicating the shape
of a distribution of scores, but we need more information than
the shape to describe a distribution adequately.
• To describe typical performance we compute measures of central
tendency, and to describe spread we compute measures of dispersion.
• There are three commonly used measures of central tendency:
the mean, the median, and the mode. The mean is by far
the most widely used.
• The Mean
• The mean of a set of scores is their arithmetic average. It is found
by summing the scores and dividing the sum by the number of
scores.
37. Example:

X̄ = ΣX / N

where:
X̄ is the mean
Σ is the summation operator (it tells us to add the Xs)
X is the symbol for a score
N is the number of scores

For the set of scores in Table 1, ΣX = 1,100 and N = 25, so

X̄ = 1,100 / 25 = 44
The mean of the set of scores in Table 1 is 44. The mean does not have to equal an
observed score; it is usually not even a whole number.

When the scores are arranged in a grouped frequency distribution, the formula is

X̄ = Σ(f · X_mdpt) / N
38. • Here f · X_mdpt means that the midpoint of each interval is
multiplied by the frequency for the interval. In computing the
mean for the scores in Table 3 with this formula, we obtain:

• X̄ = [9(49) + 4(46) + 4(43) + 3(40) + 3(37) + 2(34)] / 25 = 43.84

Note that this mean is slightly different from the mean computed
from the ungrouped data. The difference is due to the midpoints
representing the scores in each interval, rather than the actual
scores, being used.
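The grouped-data computation above can be checked with a short Python sketch, using the frequencies and midpoints quoted from the Table 3 computation:

```python
# Grouped-data mean: X̄ = Σ(f · midpoint) / N, reproducing the slide's
# computation for the Table 3 intervals.
freqs = [9, 4, 4, 3, 3, 2]            # interval frequencies
midpoints = [49, 46, 43, 40, 37, 34]  # interval midpoints

N = sum(freqs)  # 25 scores in all
mean = sum(f * m for f, m in zip(freqs, midpoints)) / N
```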
39. The Median
• Another measure of central tendency is the median, which is the
point that divides the distribution in half: half of the
scores fall above the median and half of the scores fall below it.
• Consider again the frequency distribution in Table 2.
• There were 25 scores in the distribution, so the middle (thirteenth)
score is the median.
• Cumulative frequencies indicate the number of scores at or
below each score. Table 4 indicates the cumulative frequencies
for the data in Table 2.
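Locating the median by accumulating frequencies from the bottom of the distribution can be sketched as follows; the frequency table here is hypothetical, not Table 2's data:

```python
# Median from cumulative frequencies: walk the distribution from the
# lowest score up, and stop at the score whose cumulative frequency
# first reaches the middle position (the 13th score when N = 25).
def median_from_freqs(freq_table):
    """freq_table: (score, frequency) pairs in ascending score order."""
    n = sum(f for _, f in freq_table)
    middle = (n + 1) // 2
    cumulative = 0  # number of scores at or below the current value
    for score, f in freq_table:
        cumulative += f
        if cumulative >= middle:
            return score

table = [(37, 3), (40, 4), (43, 6), (46, 7), (49, 5)]
```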
41. The Mode
• The measure of central tendency that is the easiest to find is the
mode.
• The mode is the most frequently occurring score in the distribution.
• The mode of the scores in the table is 48: five persons had scores of 48,
and no other score occurred as often.
• Each of these three measures of central tendency, the mean, the
median, and the mode, provides a legitimate definition of "average"
performance on the test.
• There are some distributions in which all three measures of central
tendency are equal, but more often than not they will differ.
• When the distribution has a small number of very extreme scores,
the median may be a better indicator of central tendency than the mean.
• The mean is the arithmetic average.
• The median divides the distribution in half.
• The mode is the most frequent score.
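Finding the mode is a simple tally; a sketch with hypothetical scores (chosen so that 48 occurs five times, mirroring the example above):

```python
# Mode sketch: the most frequently occurring score in the distribution.
from collections import Counter

scores = [48, 44, 48, 37, 48, 41, 48, 50, 48, 46]
mode, count = Counter(scores).most_common(1)[0]
```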
42. MEASURES OF DISPERSION
• Measures of central tendency are useful for summarizing
average performance, but they tell us nothing about how the
scores are distributed, or "spread out," around the averages.
• Two sets of test scores may have equal measures of central
tendency but differ in other ways.
• The Range
• The range indicates the difference between the highest and lowest
scores in the distribution.
• A problem with using the range is that only the two most extreme
scores are used in the computation.
• Measures of dispersion that take every score in the distribution
into consideration are the variance and the standard deviation.
The standard deviation is used a great deal in interpreting scores
from standardized tests.
43. The Variance
• The variance measures how widely the scores in the distribution
are spread about the mean. In other words, the variance is the
average squared difference between the scores and the mean.
As a formula, it looks like this:

• S² = Σ(X − X̄)² / N

An equivalent formula that is easier to compute is:

S² = ΣX² / N − X̄²
44. The Standard Deviation
• The standard deviation also indicates how spread out the
scores are, but it is expressed in the same units as the original
scores.
• The standard deviation is computed by finding the square root
of the variance: S = √S².
• For Table 1, the variance is 22.8. The standard deviation is
√22.8, or 4.77.
• The scores of most norm groups have the shape of a "normal"
distribution, the symmetrical, bell-shaped distribution with
which most people are familiar.
45. Table 5. Computation of the Variance for the Scores of Table 1

Student | Score (X) | Score − Mean (X − X̄) | (Score − Mean)² (X − X̄)²
A       | 48        | 4                     | 16
B       | 50        | 6                     | 36
C       | 46        | 2                     | 4
D       | 41        | −3                    | 9
E       | 37        | −7                    | 49
F       | 48        | 4                     | 16
G       | 38        | −6                    | 36
H       | 47        | 3                     | 9
I       | 49        | 5                     | 25
J       | 44        | 0                     | 0
…       | …         | …                     | …
W       | 47        | 3                     | 9
X       | 40        | −4                    | 16
Y       | 48        | 4                     | 16
Totals  | 1,100     | 0                     | 570
46. -
• To determine the mean:

X̄ = 1,100 / 25 = 44

Then, to determine the variance:

S² = Σ(X − X̄)² / N = 570 / 25 = 22.8

The usefulness of the standard deviation becomes apparent when scores
from different tests are compared.
In sum, the descriptive statistics that indicate dispersion are the range,
the variance, and the standard deviation.
The standard deviation is a unit of measurement that shows by how much
the separate scores tend to differ from the mean.
The variance is the square of the standard deviation. Most scores fall within
two standard deviations of the mean.
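The definitional formula S² = Σ(X − X̄)² / N and S = √S² can be sketched in Python; the scores are a small hypothetical set chosen for clean arithmetic, not Table 1's full data:

```python
# Variance and standard deviation by the definitional formula:
# average squared deviation from the mean, then its square root.
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(scores) / len(scores)
variance = sum((x - mean) ** 2 for x in scores) / len(scores)
std_dev = math.sqrt(variance)
```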
47. Graphing Distributions
• A graph of a distribution of test scores is often better
understood than a frequency distribution or a mere table
of numbers.
• The general shape of the distribution is clear from the graph.
• A normal distribution has most of the test scores in the middle
of the distribution and progressively fewer scores toward the
extremes.
• The scores of norm groups are seldom graphed, but they could
be if we were concerned about seeing the specific shape of
the distribution of scores.
• Usually, we know or assume that the scores are normally
distributed.