1. Assessment of Learning
2. What is Assessment of Learning?
• It focuses on the development and utilization of assessment
tools to improve the teaching-learning process.
• It emphasizes the use of testing for measuring knowledge,
comprehension and other thinking skills.
• It allows the students to go through the standard steps in
test construction for quality assessment.
• Students will experience how to develop rubrics for
performance-based and portfolio assessment.
3. MEASUREMENT
•Refers to the quantitative aspect of
evaluation. It involves outcomes that can be
quantified statistically. It can also be defined
as the process of determining and
differentiating the information about the
attributes or characteristics of things.
4. EVALUATION
•Is the qualitative aspect of determining
outcomes of learning. It involves value
judgment. Evaluation is more comprehensive
than measurement. In fact, measurement is
one aspect of evaluation.
9. According to the nature of test:
• Personality test
• Intelligence test
• Aptitude test
• Achievement or summative test
• Sociometric test
• Diagnostic or formative test
• Trade or vocational test
12. Diagnostic Tests
•Are used to measure a student’s strengths
and weaknesses, usually to identify
deficiencies in skills or performance.
13. Formative and Summative Tests
• Are terms often used with evaluation, but they may
also be used with testing. Formative testing is done to
monitor students’ attainment of the instructional
objectives. Formative testing occurs over a period of
time and monitors students’ progress. Summative
testing is done at the conclusion of instruction and
measures the extent to which students have attained
the desired outcomes.
14. Standardized Tests
•Are already valid, reliable and objective.
Standardized tests are tests for which contents have
been selected and for which norms or standards
have been established. Psychological tests and
government national examinations are examples of
standardized tests.
15. Standards or Norms
•Are goals to be achieved expressed in terms of the
average performance of the population tested.
16. Criterion-referenced measure
•Is a measuring device with a predetermined level of
success or standard on the part of the test-taker.
•For example, a level of 75 percent score in all the
test items could be considered a satisfactory
performance.
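A minimal Python sketch of a criterion-referenced decision, assuming the 75 percent level from the example above. The function name `meets_criterion` and the sample scores are hypothetical and serve only as an illustration.

```python
def meets_criterion(correct, total_items, cutoff_percent=75.0):
    """Return True if the percentage of correct items reaches the preset cutoff.

    The cutoff is fixed in advance and does not depend on how
    other test-takers perform.
    """
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    percent = 100.0 * correct / total_items
    return percent >= cutoff_percent


# Hypothetical results on a 40-item test.
print(meets_criterion(30, 40))  # 30 of 40 is 75.0 percent
print(meets_criterion(29, 40))  # 29 of 40 is 72.5 percent
```

The same score always leads to the same decision, whatever the rest of the class scored.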
17. Norm-referenced measure
•Is a test that is scored on the basis of the norm or
standard level of accomplishment by the whole
group taking the test. The grades of the students
are based on the normal curve of distribution.
22. Nominal Measurement
• Merely classifies objects or events by assigning
numbers to them.
-For example, one could nominally designate baseball
positions by assigning the pitcher the number 1; the catcher, 2; the
first baseman, 3; and so on.
23. Ordinal Measurement
•Ordinal scales classify, but they also assign rank
order. Ranking individuals according to their test
scores is an example of ordinal measurement.
24. Interval Measurement
•In order to be able to add and subtract scores, we
use interval scales, sometimes called equal
interval or equal unit measurement. This contains
the nominal and ordinal properties and is also
characterized by equal units between score points.
25. Ratio Measurement
•It includes all the preceding properties, but in ratio
scale, the zero point is not arbitrary; a score of zero
indicates the absence of what is being measured.
26. Norm-referenced and Criterion-referenced Measurement
• When we contrast norm-referenced measurement
(or testing) with criterion-referenced measurement,
we are basically referring to two different ways of
interpreting information. However, Popham (1988,
page 135) points out that certain characteristics tend
to go with each type of measurement, and it is likely
that results of norm-referenced tests are interpreted
in criterion-referenced ways and vice versa.
27. Norm-referenced Interpretation
•An individual score is interpreted by comparing it
to the scores of a defined group, often called the
normative group. Norms represent the scores
earned by one or more groups of students who
have taken the test.
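As an illustration of a norm-referenced interpretation, the Python sketch below places one score relative to a normative group. It assumes one simple definition of percentile rank (the percentage of norm-group scores strictly below the given score); other definitions exist. The function name and the norm-group scores are hypothetical.

```python
def percentile_rank(score, norm_scores):
    """Percentage of scores in the normative group that fall below `score`."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


# Hypothetical scores earned by a normative group of ten students.
norm_group = [12, 15, 15, 18, 20, 22, 25, 27, 30, 33]
print(percentile_rank(25, norm_group))  # six of the ten scores are below 25
```

The result describes the score's position in the group; it says nothing about how much of the content was mastered.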
28. Achievement Test as an Example
• Most standardized achievement tests, especially those covering
several skills and academic areas, are primarily designed for norm-
referenced interpretations. However, the form of results and the
interpretations of these tests are somewhat complex and require
concepts not yet introduced in this text. Scores on teacher-constructed
tests are often given norm-referenced interpretations. Grading on the
curve, for example, is a norm-referenced interpretation of test scores
on some type of performance measure. Specified percentages of scores
are assigned the different grades, and an individual's score is
positioned in the distribution of scores. (We mention this only as an
example; we do not endorse this procedure.)
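Purely to make the mechanics of grading on the curve concrete, and not as a recommendation, here is a minimal Python sketch. The grade labels, the percentages (20/30/30/20), the student names, and the function name `grade_on_curve` are all assumptions made for this illustration.

```python
def grade_on_curve(scores, shares):
    """Assign grades by rank position in the group.

    scores: dict mapping a student name to a score.
    shares: list of (grade, percent) pairs from the highest grade to the
            lowest; the integer percents must sum to 100.
    """
    if sum(percent for _, percent in shares) != 100:
        raise ValueError("percents must sum to 100")
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    grades = {}
    start = 0
    cumulative = 0
    for grade, percent in shares:
        cumulative += percent
        end = cumulative * n // 100  # last rank position receiving this grade
        for name in ranked[start:end]:
            grades[name] = grade
        start = end
    return grades


# Hypothetical class of ten students with distinct scores.
scores = {"S1": 95, "S2": 90, "S3": 85, "S4": 80, "S5": 75,
          "S6": 70, "S7": 65, "S8": 60, "S9": 55, "S10": 50}
shares = [("A", 20), ("B", 30), ("C", 30), ("D", 20)]
print(grade_on_curve(scores, shares))
```

Note that this sketch does not treat tied scores specially, so two students with the same score could fall on either side of a grade boundary.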
29. Criterion-referenced Interpretation
• The concepts of criterion-referenced testing have
developed with a dual meaning for criterion-
referenced. On one hand, it means referencing an
individual’s performance to some criterion that is a
defined performance level. The individual's score is
interpreted in absolute rather than relative terms.
The criterion, in this situation, means some level of
specified performance that has been determined
independently of how others might perform.
30. Distinctions between Norm-referenced and
Criterion-referenced Tests
• Although interpretations, not characteristics,
provide the distinction between norm-referenced
and criterion-referenced tests, the two types do tend
to differ in some ways. Norm-referenced tests are
usually more general and comprehensive and cover a
large domain of content and learning tasks. They are
used for survey testing, although this is not their
exclusive use.
31. • Criterion-referenced tests focus on a specific group of
learner behaviors. To show the contrast, consider an
example. Arithmetic skills represent a general and broad
category of student outcomes and would likely be measured
by a norm-referenced test. On the other hand, behaviors
such as solving addition problems with two five-digit
numbers or determining the multiplication products of
three- and four-digit numbers are much more specific and
may be measured by criterion-referenced tests.
32. • A criterion-referenced test tends to focus more on subskills
than on broad skills. Thus, criterion-referenced tests tend to be
shorter. If mastery learning is involved, criterion-referenced
measurement would be used.
• Norm-referenced test scores are transformed to positions
within the normative group. Criterion-referenced test scores are
usually given in the percentage of correct answers or another
indicator of mastery or the lack thereof. Criterion-referenced
tests tend to lend
33. STAGES IN TEST CONSTRUCTION
I. Planning the Test
• A. Determining the Objectives
• B. Preparing the Table of Specifications
• C. Selecting the Appropriate Item Format
• D. Writing the Test Items
• E. Editing the Test Items
34. II. Trying Out the Test
• A. Administering the First Tryout - then Item Analysis
• B. Administering the Second Tryout - then Item Analysis
• C. Preparing the Final Form of the Test
35. •III. Establishing Test Validity
•IV. Establishing the Test Reliability
•V. Interpreting the Test Score
36. MAJOR CONSIDERATIONS IN
TEST CONSTRUCTION
• Type of Test: Our usual idea of testing is an in-class
test that is administered by the teacher. However,
there are many variations on this theme: group tests,
individual tests, written tests, oral tests, speed tests,
power tests, pretests and posttests. Each of these
has different characteristics that must be considered
when the tests are planned.
37. Test Length
• A major decision in the test planning is how many items
should be included on the test. There should be enough to
cover the content adequately, but the length of the class
period or the attention span or fatigue limits of the students
usually restrict the test length. Decisions about test length
are usually based on practical constraints more than on
theoretical considerations.
38. Item Formats
• Determining what kind of items to include on the test is a
major decision. Should they be objectively scored formats
such as multiple choice or matching type? Should they cause
the students to organize their own thoughts through short
answer or essay formats? These are important questions that
can be answered only by the teacher in terms of the local
context, his or her students, his or her classroom, and the
specific purpose of the test. Once the planning decisions are
made, the item writing begins. This task is often the most
feared by beginning test constructors. However, the
procedures are more common sense than formal rules.
39. POINTS TO BE CONSIDERED IN
PREPARING A TEST
1. Are the instructional objectives clearly defined?
2. What knowledge, skills and attitudes do you want to measure?
3. Did you prepare a table of specifications?
4. Did you formulate well defined and clear test items?
5. Did you employ correct English in writing the items?
6. Did you avoid giving clues to the correct answer?
40. 7. Did you test the important ideas rather than the trivial?
8. Did you adapt the test's difficulty to your students' ability?
9. Did you avoid using textbook jargon?
10. Did you cast the items in positive form?
11. Did you prepare a scoring key?
12. Does each item have a single correct answer?
13. Did you review your items?
41. General Principles in Constructing
Different Types of Tests
1. The test items should be selected very carefully. Only important
facts should be included.
2. The test should have an extensive sampling of items.
3. The test items should be carefully expressed in simple, clear,
definite, and meaningful sentences.
4. There should be only one possible correct response for each test
item.
5. Each item should be independent.
42. 6. To encourage thinking and understanding, sentences should
not be lifted from books.
7. The first-person pronouns I and we should not be used.
8. Various types of test items should be made to avoid
monotony.
9. The majority of the test items should be of moderate difficulty.
10. The test items should be arranged in an ascending order of
difficulty.
43. 11. Clear, concise and complete directions should precede all types of
tests.
12. Items which can be answered by previous experience alone
without knowledge of the subject matter should not be included.
13. Catchy words should not be used in the test items.
14. Test items must be based upon the objectives of the course and
upon the course content.
15. The test should measure the degree of achievement or determine
the difficulties of the learners.
16. The test should emphasize ability to apply and use facts as well as
knowledge of facts.
44. 17. The test should be of such length that it can be completed
within the time allotted by all or nearly all of the pupils.
18. Rules governing good language expression, grammar,
spelling, punctuation, and capitalization should be
observed in all items.
19. Information on how scoring will be done should be
provided.
20. Scoring keys in correcting and scoring tests should be
provided.
45. POINTERS TO BE OBSERVED IN
CONSTRUCTING AND SCORING THE
DIFFERENT TYPES OF TESTS
• A. RECALL TYPES
1. Simple recall type
a. This type consists of questions calling for a single word or
expression as an answer.
b. Items usually begin with who, where, when, and what.
c. Score is the number of correct answers.
46. 2. Completion type
a. Only important words or phrases should be omitted to avoid
confusion.
b. Blanks should be of equal lengths.
c. The blank, as much as possible, is placed near or at the end of the
sentence.
d. Articles a, an, and the should not be provided before the omitted
word or phrase to avoid clues for answers.
e. Score is the number of correct answers.
47. 3. Enumeration type
a. The exact number of expected answers should be stated.
b. Blanks should be of equal lengths.
c. Score is the number of correct answers.
48. • 4. Identification type
a. The items should make an examinee think of a word,
number, or group of words that would complete the
statement or answer the problem.
b. Score is the number of correct answers.
49. B. RECOGNITION TYPES
• 1. True-false or alternate-response type
a. Declarative sentences should be used.
b. The number of “true” and “false” items should be more or less equal.
c. The truth or falsity of the sentence should not be too evident.
d. Negative statements should be avoided.
e. The “modified true-false” is preferable to the “plain true-false”.
f. In arranging the items, avoid the regular recurrence of “true” and “false”
statements.
50. g. Avoid using specific determiners like all, always, never, none, nothing,
most, often, some, etc., and avoid weak statements such as may, sometimes,
as a rule, in general, etc.
h. Minimize the use of qualitative terms like few, great, many, more,
etc.
i. Avoid leading clues to answers in all stems.
j. Score is the number of correct answers in “modified true-false” and
right answers minus wrong answers in “plain true-false”.
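The two scoring rules in item j can be sketched in Python as follows. The treatment of omitted items (counted as neither right nor wrong) is an assumption not stated in the slides, and the function names, answer key, and responses are hypothetical.

```python
def score_number_correct(answers, key):
    """Score is the number of correct answers."""
    return sum(1 for a, k in zip(answers, key) if a == k)


def score_right_minus_wrong(answers, key):
    """Right answers minus wrong answers.

    Assumption: an omitted item (None) is neither right nor wrong.
    """
    right = sum(1 for a, k in zip(answers, key) if a is not None and a == k)
    wrong = sum(1 for a, k in zip(answers, key) if a is not None and a != k)
    return right - wrong


# Hypothetical five-item true-false test; the fourth item was left blank.
key = [True, False, True, True, False]
answers = [True, True, True, None, False]
print(score_number_correct(answers, key))     # 3 right
print(score_right_minus_wrong(answers, key))  # 3 right minus 1 wrong
```

The right-minus-wrong rule lowers the expected gain from blind guessing, since each wrong guess cancels one right answer.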
51. 2. Yes-No type
• a. The items should be in interrogative sentences.
• b. The same rules as in “true-false” are applied.
52. • 3. Multiple-response type
a. There should be three to five choices. The number of
choices used in the first item should be the same number of
choices in all the items of this type of test.
b. The choices should be numbered or lettered so that only
the number or letter can be written on the blank provided.
c. If the choices are figures, they should be arranged in
ascending order.
d. Avoid the use of “a” or “an” as the last word prior to the
listing of the responses.
53. e. Random occurrence of responses should be employed.
f. The choices, as much as possible, should be at the end of the
statements.
g. The choices should be related in some way or should belong
to the same class.
h. Avoid the use of “none of these” as one of the choices.
i. Score is the number of correct answers.
54. 4. Best answer type
a. There should be three to five choices all of which are
right but vary in their degree of merit, importance or
desirability.
b. The other rules for multiple-response items are
applied here.
c. Score is the number of correct answers.
55. 5. Matching type
a. There should be two columns. Under “A” are the stimuli,
which should be longer and more descriptive than the
responses under column “B”. The response may be a word,
a phrase, a number, or a formula.
b. The stimuli under column “A” should be numbered and
the responses under column “B” should be lettered. Answers
will be indicated by letters only on lines provided in column
“A”.
c. The number of pairs usually should not exceed twenty
items. Fewer than ten introduces chance elements. Twenty
pairs may be used, but more than twenty is decidedly
wasteful of time.
56. d. The number of responses in column “B” should exceed
the number of items in column “A” by two or more to avoid
guessing.
e. Only one correct matching for each item should be
possible.
f. Matching sets should be neither too long nor too short.
g. All items should be on the same page to avoid turning of
pages in the process of matching pairs.
h. Score is the number of correct answers.
57. C. ESSAY TYPE EXAMINATIONS
Common types of essay questions.
(The types are related to the purposes for which the essay examinations are to be used)
1. Comparison of two things
2. Explanation of the use or meaning of a statement or passage.
3. Analysis
4. Decisions for or against
5. Discussion
58. •How to construct essay examinations
1. Determine the objectives or essentials for each question to be
evaluated.
2. Phrase questions in simple, clear and concise language.
3. Suit the length of the questions to the time available for answering the
essay examination. The teacher should try to answer the test herself.
59. 4. Scoring
a. Have a model answer in advance.
b. Indicate the number of points for each question.
c. Score a point for each essential.
60. Advantages and disadvantages
of the Objective Type of Tests
• Advantages
a. The objective test is free from personal bias in scoring.
b. It is easy to score. With a scoring key, the test can be
corrected by different individuals without affecting the accuracy
of the grades given.
c. It has high validity because it is comprehensive with wide
sampling of essentials.
d. It is less time-consuming since many items can be answered
in a given time.
e. It is fair to students since the slow writers can accomplish the
test as fast as the fast writers.
61. •Disadvantages
a. It is difficult to construct and requires more time to prepare.
b. It does not afford students the opportunity for training
in self-expression and thought organization.
c. It cannot be used to test ability in theme writing or
journalistic writing.
62. Advantages and Disadvantages of
the Essay type of Tests
• Advantages
a. The essay examination can be used in practically all subjects of the
school curriculum.
b. It trains students for thought organization and self-expression.
c. It affords students opportunities to express their originality and
independence of thinking.
d. Only the essay test can be used in some subjects like composition
writing and journalistic writing which cannot be tested by the objective
type test.
e. Essay examination measures higher mental abilities like comparison,
interpretation, criticism, defense of opinion and decision.
f. The essay test is easily prepared.
g. It is inexpensive.
63. • Disadvantages
a. The limited sampling of items makes the test an unreliable
measure of achievements or abilities.
b. Questions usually are not well prepared.
c. Scoring is highly subjective due to the influence of the
corrector’s personal judgment.
d. Grading of the essay test is an inaccurate measure of pupils’
achievements due to subjectivity of scoring.
64. STATISTICAL MEASURES OR TOOLS USED IN
INTERPRETING NUMERICAL DATA
• Frequency Distributions
A simple, common sense technique for describing a set of test
scores is through the use of a frequency distribution. A
frequency distribution is merely a listing of the possible score
values and the number of persons who achieved each score.
Such an arrangement presents the scores in a simpler and more
understandable manner than merely listing all of the separate
scores. Consider a specific set of scores to clarify these ideas.
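The slides do not include the set of scores they refer to, so the Python sketch below uses a hypothetical set of twelve quiz scores to show what a frequency distribution looks like. It relies only on `collections.Counter` from the standard library.

```python
from collections import Counter

# Hypothetical quiz scores for twelve students (not from the original slides).
scores = [7, 9, 8, 7, 10, 6, 8, 7, 9, 8, 7, 5]

# Count how many students achieved each score value.
frequency = Counter(scores)

print("Score Frequency")
for value in sorted(frequency, reverse=True):
    print(f"{value:>5} {frequency[value]:>9}")
```

Listing each score value once with its count is easier to read than the twelve separate scores.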
65. MEASURES OF CENTRAL
TENDENCY
• Frequency distributions are helpful for indicating the
shape of a distribution of scores, but we
need more information than the shape to describe a
distribution adequately. We need to know where on
the scale of measurement the distribution is located;
this is given by a measure of central tendency.
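The three most common measures of central tendency are the mean, the median, and the mode. A minimal Python sketch, using the standard `statistics` module and a hypothetical set of scores, shows how each is obtained.

```python
import statistics

# Hypothetical quiz scores for twelve students.
scores = [7, 9, 8, 7, 10, 6, 8, 7, 9, 8, 7, 5]

mean = statistics.mean(scores)      # arithmetic average of all scores
median = statistics.median(scores)  # middle value of the sorted scores
mode = statistics.mode(scores)      # most frequently occurring score

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
```

For these scores the three measures fall close together, which suggests a fairly symmetric distribution.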
66. MEASURES OF DISPERSION
• Measures of central tendency are useful for summarizing
average performance, but they tell us nothing about how the
scores are distributed or “spread out” around the averages.
Two sets of test scores may have equal measures of central
tendency, but they might differ in other ways. One of the
distributions may have the scores tightly clustered around the
average, and the other distribution may have scores that are
widely separated. As you may have anticipated, there are
descriptive statistics that measure dispersion, which also are
called measures of variability. These measures indicate how
spread out the scores tend to be.
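To illustrate, the Python sketch below compares two hypothetical sets of scores that share the same mean but differ in spread. The range and the standard deviation are used here as two common measures of dispersion; `statistics.pstdev` treats each set as a whole population.

```python
import statistics

# Two hypothetical sets of scores with the same average.
clustered = [9, 10, 10, 10, 11]
spread_out = [2, 6, 10, 14, 18]

for name, data in [("clustered", clustered), ("spread out", spread_out)]:
    mean = statistics.mean(data)
    score_range = max(data) - min(data)   # highest score minus lowest score
    std_dev = statistics.pstdev(data)     # population standard deviation
    print(f"{name}: mean = {mean}, range = {score_range}, sd = {std_dev:.2f}")
```

Both sets average 10, yet the second set has a much larger range and standard deviation.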
67. Graphing Distributions
• A graph of a distribution of test scores is often better
understood than is the frequency distribution or a mere table
of numbers. The general pattern of scores, as well as any
unique characteristics of the distribution, can be seen easily in
simple graphs. There are several kinds of graphs that can be
used, but a simple bar graph, or histogram, is as useful as any.
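A simple histogram can even be drawn as plain text. The Python sketch below is one minimal way to do so, assuming integer scores; the function name `text_histogram` and the scores are hypothetical, and a plotting library could be used instead for a graphical version.

```python
from collections import Counter


def text_histogram(scores):
    """Return a text bar graph with one row per score value."""
    frequency = Counter(scores)
    lines = []
    for value in range(min(scores), max(scores) + 1):
        # One '#' per student who achieved this score.
        lines.append(f"{value:>3} | " + "#" * frequency[value])
    return "\n".join(lines)


# Hypothetical quiz scores for twelve students.
scores = [7, 9, 8, 7, 10, 6, 8, 7, 9, 8, 7, 5]
print(text_histogram(scores))
```

The longest bar marks the most frequent score, and the overall shape of the distribution is visible at a glance.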