Assessment
Test
• An instrument designed to measure any characteristic, quality, ability, knowledge or skill. It is composed of items in the area it is designed to measure.
Measurement
• A process of quantifying the degree to which someone or something possesses a given trait.
Assessment
• A process of gathering and organizing quantitative or qualitative data into an interpretable form to have a basis for judgement or decision making. It is a prerequisite to evaluation: it provides the information which enables evaluation to take place.
Evaluation
• A process of systematic interpretation, analysis, appraisal or judgement of the worth of organized data as a basis for decision making. It involves judgement about the desirability of changes in students.
Traditional Assessment
• Refers to the use of paper-and-pencil objective tests.
Alternative Assessment
• Refers to the use of methods other than paper-and-pencil objective tests, including performance tests, projects, portfolios, journals and the like.
Authentic Assessment
• Refers to the use of assessment methods that simulate true-to-life situations. These could be objective tests that reflect real-life situations or alternative methods that parallel what we experience in real life.
Purposes of Classroom Assessment

1. Assessment FOR Learning – includes three types of assessment done before and during instruction: placement, formative and diagnostic.
a. Placement – done prior to instruction
• Its purpose is to assess the needs of the learners to have a basis for planning relevant instruction.
• Teachers use this assessment to know what their students bring into the learning situation and use this as a starting point for instruction.
• The results of this assessment place students in specific learning groups to facilitate teaching and learning.
b. Formative – done during instruction
• This assessment is where teachers continuously monitor the students' level of attainment of the learning objectives.
• The results of this assessment are communicated clearly and promptly to the students for them to know their strengths and weaknesses and the progress of their learning.
c. Diagnostic – done before or during instruction
• This is used to determine students' recurring or persistent difficulties.
• It searches for the underlying causes of students' learning problems that do not respond to first-aid treatment. It helps formulate a plan for detailed remedial instruction.
2. Assessment OF Learning
• This is done after instruction and is usually referred to as summative assessment.
• It is used to certify what students know and can do and the level of their proficiency or competency.
• The information from assessment of learning is usually expressed as marks or grades.
• The results are communicated to the students, parents and other stakeholders for decision making.
3. Assessment AS Learning
• This is done so that teachers can understand and perform well their role of assessing FOR and OF learning. It requires teachers to undergo training on how to assess learning and to be equipped with the competencies needed in performing their work as assessors.
MODES OF ASSESSMENT

Traditional
- Description: the objective paper-and-pencil test, which usually assesses low-level thinking skills
- Examples: standardized tests; teacher-made tests
- Advantages: scoring is objective; administration is easy because students can take the test at the same time
- Disadvantages: preparation of the instrument is time-consuming; prone to cheating

Performance
- Description: requires actual demonstration of skills or creation of products of learning
- Examples: practical tests; oral tests; projects
- Advantages: preparation of the instrument is relatively easy; measures behaviors that cannot be faked
- Disadvantages: scoring tends to be subjective without rubrics; administration is time-consuming

Portfolio
- Description: a process of gathering multiple indicators of student progress to support course goals in a dynamic, ongoing and collaborative process
- Examples: working portfolios; show portfolios; documentary portfolios
- Advantages: measures the student's growth and development; intelligence-fair
- Disadvantages: development is time-consuming; ratings tend to be subjective without rubrics
FOUR TYPES OF EVALUATION PROCEDURES

1. Placement Evaluation
- done before instruction
- determines mastery of prerequisite skills
- not graded
2. Summative Evaluation
-done after instruction
-certifies mastery of the intended learning
outcomes
-graded
-examples include quarterly exams, unit or
chapter tests, final exams
Both Placement and Summative Evaluations:
• determine the extent of what the pupils have achieved or mastered in the objectives of the intended instruction
• place students in specific learning groups to facilitate teaching and learning
• serve as a pretest for the next unit
• serve as a basis for planning relevant instruction
3. Formative Evaluation
• reinforces successful learning
• provides continuous feedback to both students and teachers concerning learning successes and failures
• not graded
• examples: short quizzes, recitations
4. Diagnostic Evaluation
• determines persistent deficiencies
• helps formulate a plan for remedial instruction
Formative and Diagnostic Evaluation are both:
1. administered during instruction
2. designed to formulate a plan for remedial instruction
3. used to modify the teaching and learning process
4. not graded
PRINCIPLES OF HIGH-QUALITY ASSESSMENT

Principle 1: Clarity of Learning Targets
• Clear and appropriate learning targets include what students know and can do and the criteria for judging student performance.
Principle 2: Appropriateness of Assessment Methods
• The method of assessment to be used should match the learning targets.
Principle 3: Balance
• A balanced assessment sets targets in all domains of learning or domains of intelligence.
• A balanced assessment makes use of both traditional and alternative assessments.
Principle 4: Validity
• Validity is the degree to which the assessment instrument measures what it intends to measure. It also refers to the usefulness of the instrument for a given purpose. It is the most important criterion of a good assessment instrument.
Ways of Establishing Validity
1. Face validity – done by examining the physical appearance of the instrument to make it readable and understandable.
2. Content validity – done through a careful and critical examination of the objectives of assessment so that they reflect the curricular objectives.
3. Criterion-related validity – established statistically by correlating a set of scores with an external predictor or measure. It has two types: concurrent and predictive.
a. Concurrent validity – describes the present status of the individual by correlating the sets of scores obtained from two measures given at a close interval.
b. Predictive validity – describes the future performance of an individual by correlating the sets of scores obtained from two measures given at a longer time interval.
4. Construct validity – established statistically by comparing psychological traits or factors that theoretically influence scores in a test.
a. Convergent validity – established if the instrument defines a similar trait other than what it is intended to measure (e.g., a critical thinking test may be correlated with a creative thinking test).
b. Divergent validity – established if an instrument can describe only the intended trait and not other traits (e.g., a critical thinking test may not be correlated with a reading comprehension test).
Principle 5: Reliability
• Reliability refers to the degree of consistency when several items in a test measure the same thing, and to stability when the same measure is given across time.
• Ways of establishing reliability: the split-half method, the test-retest method, and parallel or equivalent forms.
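To make the split-half method concrete, here is a minimal sketch (not part of the original module; the response data are hypothetical). It splits a test into odd- and even-numbered items, correlates the two half-test scores across examinees, and applies the Spearman-Brown correction to estimate the reliability of the full test. It uses statistics.correlation, available in Python 3.10+.

```python
from statistics import correlation  # Pearson r; Python 3.10+

def split_half_reliability(item_scores):
    """Estimate reliability from one test administration.

    item_scores: one list of 0/1 item scores per examinee.
    """
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = correlation(odd, even)                 # correlation of the two halves
    return (2 * r_half) / (1 + r_half)              # Spearman-Brown correction

# Hypothetical responses of 5 examinees to a 6-item test (1 = correct)
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
]
print(round(split_half_reliability(scores), 2))
```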
Principle 6: Fairness
• A fair assessment is unbiased and provides students with opportunities to demonstrate what they have learned.
Principle 7: Practicality and Efficiency
• When assessing learning, the information obtained should be worth the resources and time required to obtain it. The easier the procedure, the more practical the assessment is.
Principle 8: Continuity
• Assessment takes place in all phases of instruction. It can be done before, during and after instruction.
Principle 9: Communication
• Assessment targets and standards should be communicated. Assessment results should be communicated to important users, and to students through direct interaction or regular, ongoing feedback on their progress.
Principle 10: Positive Consequences
• Assessment should have a positive consequence for students; that is, it should motivate them to learn.
• Assessment should have a positive consequence for teachers; that is, it should help them improve the effectiveness of their instruction.
Principle 11: Ethics
• Teachers should free students from harmful consequences of the misuse or overuse of assessment procedures, such as embarrassing students or violating students' right to confidentiality.
• Teachers should be guided by the laws and policies that affect their classroom assessment.
• Administrators and teachers should understand that it is inappropriate to use standardized student achievement tests to measure teaching effectiveness.
Performance-based Assessment
• A process of gathering information about students' learning through actual demonstration of essential and observable skills and the creation of products that are grounded in real-world contexts and constraints. It is an assessment that is open to many possible answers and judged using multiple criteria or standards of excellence that are pre-specified and public.
Types of Performance-based Tasks
1. Demonstration type – a task that requires no product. Examples: cooking demonstrations, presentations.
2. Creation type – a task that requires a tangible product. Examples: project plans, research papers, project flyers.
Portfolio Assessment
• Also an alternative to paper-and-pencil objective tests. It is a purposeful, ongoing, dynamic and collaborative process of gathering multiple indicators of the learner's growth and development. Portfolio assessment is also performance-based, but more authentic than any performance-based task.
Principles Underlying Portfolio Assessment
1. Content principle – suggests that portfolios should reflect the subject matter that is important for the students to learn.
2. Learning principle – suggests that portfolios should enable the students to become active and thoughtful learners.
3. Equity principle – explains that portfolios should allow students to demonstrate their learning styles and multiple intelligences.
Types of Portfolios
1. The working portfolio is a collection of a student's day-to-day work which reflects his/her learning.
2. The show portfolio is a collection of a student's best works.
3. The documentary portfolio is a combination of a working and a show portfolio.
A. COGNITIVE DOMAIN

Levels of learning outcomes, with descriptions and some question clues:

- Knowledge: involves remembering or recalling previously learned material or a wide range of materials. Clues: list, define, identify, name, recall, state, arrange.
- Comprehension: ability to grasp the meaning of material by translating it from one form to another or by interpreting it. Clues: describe, interpret, classify, differentiate, explain, translate.
- Application: ability to use learned material in new and concrete situations. Clues: apply, demonstrate, solve, interpret, use, experiment.
- Analysis: ability to break down material into its component parts so that the whole structure is understood. Clues: analyze, separate, explain, examine, discriminate, infer.
- Synthesis: ability to put parts together to form a new whole. Clues: integrate, plan, generalize, construct, design, propose.
- Evaluation: ability to judge the value of material on the basis of definite criteria. Clues: assess, decide, judge, support, summarize, defend.
B. AFFECTIVE DOMAIN

Categories, with descriptions and some illustrative verbs:

- Receiving: willingness to receive or attend to a particular phenomenon or stimulus. Verbs: acknowledge, ask, choose, follow, listen, reply, watch.
- Responding: refers to active participation on the part of the student. Verbs: answer, assist, contribute, cooperate, follow up.
- Valuing: ability to see worth or value in a subject or activity. Verbs: adopt, commit, desire, display, explain, initiate.
- Organization: bringing together a complex of values, resolving conflicts between them, and beginning to build an internally consistent value system. Verbs: adapt, categorize, establish, generalize, integrate, organize.
C. PSYCHOMOTOR DOMAIN

Categories, with descriptions and some illustrative verbs:

- Imitation: the early stages of learning a complex skill, after an indication of readiness to take a particular type of action. Verbs: carry out, assemble, practice, follow, repeat, sketch, move.
- Manipulation: a particular skill or sequence is practiced continuously until it becomes habitual and is done with some confidence and proficiency. Verbs: acquire, complete, conduct, improve, perform, produce.
- Precision: a skill has been attained with proficiency and efficiency. Verbs: achieve, accomplish, excel, master, succeed.
- Articulation: an individual can modify movement patterns to meet a particular situation. Verbs: adapt, change, excel, reorganize, rearrange.
- Naturalization: an individual responds automatically and creates new motor ways of manipulation out of the understanding, abilities and skills developed. Verbs: arrange, combine, compose, construct, create, design.
DIFFERENT TYPES OF TESTS
(by main point of comparison)

Purpose: Psychological vs. Educational
- Psychological: aims to measure students' intelligence or mental ability, largely without reference to what the student has learned (e.g., aptitude tests, personality tests, intelligence tests).
- Educational: aims to measure the results of instruction and learning (e.g., achievement tests, performance tests).
Scope of Content: Survey vs. Mastery
- Survey: covers a broad range of objectives; measures general achievement in certain subjects; constructed by trained professionals.
- Mastery: covers a specific objective; measures fundamental skills and abilities; typically constructed by the teacher.
Language Mode: Verbal vs. Non-verbal
- Verbal: words are used by students in attaching meaning to or responding to test items.
- Non-verbal: students do not use words in attaching meaning to or responding to test items.
Construction: Standardized vs. Informal
- Standardized: constructed by a professional item writer; covers a broad range of content within a subject area; uses mainly multiple choice; items are screened and the best are chosen for the final instrument; can be scored by machine; interpretation of results is usually norm-referenced.
- Informal: constructed by a classroom teacher; covers a narrow range of content; uses various types of items; the teacher picks or writes items as needed for the test; scored manually by the teacher; interpretation is usually criterion-referenced.
Manner of Administration: Individual vs. Group
- Individual: mostly given orally or requires actual demonstration of skill; one-on-one situations offer many opportunities for clinical observation; there is a chance to follow up the examinee's responses in order to clarify or comprehend them more clearly.
- Group: a paper-and-pencil test; rapport, insights and knowledge about each examinee are lost; in the time needed to gather information from one student individually, information can be gathered from the whole group.
Effect of Biases: Objective vs. Subjective
- Objective: the scorer's personal judgment does not affect the scoring; items are worded so that only one answer is acceptable; little or no disagreement on what the correct answer is.
- Subjective: affected by the scorer's personal opinions, biases and judgments; several answers are possible; disagreement on the correct answer is possible.
Time Limit and Level of Difficulty: Power vs. Speed
- Power: consists of a series of items arranged in ascending order of difficulty; measures the student's ability to answer more and more difficult items.
- Speed: consists of items of approximately equal difficulty; measures the student's speed or rate and accuracy in responding.
Format: Selective vs. Supply
- Selective: there are choices for the answer (multiple choice, true-false, matching type); can be answered quickly; prone to guessing; time-consuming to construct.
- Supply: there are no choices for the answer (short answer, completion, restricted or extended essay); may require a longer time to answer; less prone to guessing but prone to bluffing; time-consuming to answer and score.
Nature of Assessment: Maximum Performance vs. Typical Performance
- Maximum performance: determines what individuals can do when performing at their best.
- Typical performance: determines what individuals will do under natural conditions.
Interpretation: Norm-referenced vs. Criterion-referenced
- Norm-referenced: results are interpreted by comparing one student's performance with other students' performance; some will surely pass; there is competition for a limited percentage of high scores; typically covers a large domain of learning tasks; emphasizes discrimination among individuals in terms of level of learning; favors items of average difficulty and typically omits very easy and very difficult items; interpretation requires a clearly defined group.
- Criterion-referenced: results are interpreted by comparing students' performance against a predefined standard (mastery); all or none may pass; there is no competition for a limited percentage of high scores; typically focuses on a delimited domain of learning tasks; emphasizes description of what learning tasks individuals can and cannot perform; matches item difficulty to the learning tasks, without altering item difficulty or omitting easy items; interpretation requires a clearly defined and delimited achievement domain.
FOUR COMMONLY-USED REFERENCES FOR CLASSROOM INTERPRETATION

- Ability-referenced. Interpretation provided: how are students performing relative to what they are capable of doing? Condition that must be present: good measures of the students' maximum possible performance.
- Growth-referenced. Interpretation provided: how much have students changed or improved relative to what they were doing? Condition: pre- and post-measures of performance that are highly reliable.
- Norm-referenced. Interpretation provided: how well are students doing with respect to what is typical or reasonable? Condition: a clear understanding of whom the students are being compared to.
- Criterion-referenced. Interpretation provided: what can students do and not do? Condition: a well-defined content domain that was assessed.
TYPES OF TESTS ACCORDING TO FORMAT

1. Selective Type – provides choices for the answer.
a. Multiple Choice – consists of a stem, which describes the problem, and three or more alternatives, which give the suggested solutions. The incorrect alternatives are the distractors.
b. True-False or Alternative Response – consists of a declarative statement that one has to mark true or false, right or wrong, correct or incorrect, yes or no, fact or opinion, and the like.
c. Matching Type – consists of two parallel columns: Column A, the column of premises from which a match is sought; Column B, the column of responses from which the selection is made.
Advantages and limitations by type:

Multiple Choice
- Advantages: more adequate sampling of content; tends to structure the problem to be addressed more effectively; can be quickly and objectively scored.
- Limitations: prone to guessing; often measures targeted behaviors indirectly; time-consuming to construct.

Alternative Response
- Advantages: more adequate sampling of content; easy to construct; can be quickly and objectively scored.
- Limitations: prone to guessing; can be used only when dichotomous answers represent sufficient response options; usually must indirectly measure performance related to procedural knowledge.

Matching Type
- Advantages: allows comparison of related ideas, concepts or theories; effectively assesses association between a variety of items within a topic; encourages integration of information; can be quickly and objectively scored; can be easily administered.
- Limitations: difficult to produce a sufficient number of plausible premises; not effective in testing isolated facts; may be limited to lower levels of understanding; useful only when there is a sufficient number of related items; may be influenced by guessing.
2. Supply Test
a. Short Answer – uses a direct question that can be answered by a word, phrase, number or symbol.
b. Completion Test – consists of an incomplete statement.
- Advantages: easy to construct; requires the student to supply the answer; many items can be included in one test.
- Limitations: generally limited to measuring recall of information; more likely to be scored erroneously due to the variety of possible responses.
3. Essay Test
a. Restricted Response – limits the content of the response by restricting the scope of the topic.
b. Extended Response – allows the students to select any factual information that they think is pertinent and to organize their answers in accordance with their best judgment.
- Advantages: measures more directly the behaviors specified by performance objectives; examines students' written communication skills; requires the students to supply the response.
- Limitations: provides a less adequate sampling of content; less reliable scoring; time-consuming.
GENERAL SUGGESTIONS IN WRITING TESTS

1. Use your TOS (Table of Specifications) as a guide to item writing.
2. Write more test items than needed.
3. Write the test items well in advance of the testing date.
4. Write each item so that the task to be performed is clearly defined.
5. Write each test item at the appropriate reading level.
6. Write each test item so that it does not provide help in answering other items in the test.
7. Write each test item so that the answer is one that would be agreed upon by experts.
8. Write each test item at the proper level of difficulty.
9. Whenever a test is revised, recheck its relevance.
SPECIFIC SUGGESTIONS

A. SUPPLY TYPE
1. Word the items so that the required answer is both brief and specific.
2. Do not take statements directly from textbooks to use as a basis for short-answer items.
3. A direct question is generally more desirable than an incomplete statement.
4. If an item is to be expressed in numerical units, indicate the type of answer wanted.
5. Blanks should be equal in length.
6. Answers should be written before the item number for easy checking.
7. When completion items are used, do not have too many blanks. Blanks should be at the center of the sentence, not at the beginning.
B. ESSAY TYPE
1. Restrict the use of essay questions to those learning outcomes that cannot be satisfactorily measured by objective items.
2. Formulate questions that will call forth the behavior specified in the learning outcomes.
3. Phrase each question so that the pupil's task is clearly indicated.
4. Indicate an approximate time limit for each question.
5. Avoid the use of optional questions.
C. SELECTIVE TYPE

Alternative Response
1. Avoid broad statements.
2. Avoid trivial statements.
3. Avoid the use of negative statements, especially double negatives.
4. Avoid long and complex sentences.
Matching Type
1. Use only homogeneous material in a single matching exercise.
2. Include an unequal number of responses and premises, and instruct the pupils that a response may be used once, more than once, or not at all.
3. Keep the list of items to be matched brief and place the shorter responses at the right.
4. Arrange the list of responses in logical order.
5. Indicate in the directions the basis for matching the responses and premises.
6. Place all the items for one matching exercise on the same page.
Multiple Choice
1. The stem of the item should be meaningful by itself and should present a definite problem.
2. The stem should include as much of the item as possible and should be free of irrelevant information.
3. Use a negatively stated item only when a significant learning outcome requires it.
4. Highlight negative words in the stem for emphasis.
5. All the alternatives should be grammatically consistent with the stem of the item.
6. An item should have only one correct or clearly best answer.
ALTERNATIVE ASSESSMENT

PERFORMANCE AND AUTHENTIC ASSESSMENTS

When to use:
- Specific behaviors or behavioral outcomes are to be observed.
- It is possible to judge the appropriateness of students' actions.
- A process or outcome cannot be directly measured by paper-and-pencil tests.

Advantages:
- Allow evaluation of complex skills which are difficult to assess using written tests.
- Positive effect on instruction and learning.
- Can be used to evaluate both the process and the product.

Limitations:
- Time-consuming to administer, develop and score.
- Subjectivity in scoring.
- Inconsistencies in performance on alternative skills.
PORTFOLIO ASSESSMENT

Characteristics
1. Adaptable to individualized instructional goals.
2. Focuses on assessment of products.
3. Identifies students' strengths rather than weaknesses.
4. Actively involves students in the evaluation process.
5. Communicates student achievement to others.
6. Time-consuming.
7. Needs a scoring plan to increase reliability.
Types:
- Showcase: a collection of students' best work.
- Reflective: used for helping teachers, students and family members think about various dimensions of student learning (e.g., effort, achievement).
- Cumulative: a collection of items done over an extended period of time, analyzed to verify changes in the products and processes associated with student learning.
- Goal-based: a collection of works chosen by students and teachers to match pre-established objectives.
- Process: a way of documenting the steps and processes a student has gone through to complete a piece of work.
RUBRICS
• A scoring guide consisting of specific pre-established performance criteria, used in evaluating student work in performance assessments.
Two Types
1. Holistic Rubric – requires the teacher to score the overall process or product as a whole, without judging the component parts separately.
2. Analytic Rubric – requires the teacher to score individual components of the product or performance first, then sum the individual scores to obtain a total score.
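The analytic type lends itself to a simple computation. Below is a minimal sketch, not from the original module: each component is scored separately and the weighted component scores are summed into a total. The criteria, weights and scores are invented for illustration.

```python
# Hypothetical analytic rubric for an essay: each criterion is scored 1-4,
# and the weights reflect an assumed relative importance of the criteria.
weights = {"content": 0.4, "organization": 0.3, "mechanics": 0.3}

def analytic_total(component_scores, weights):
    """Sum the individually scored components, applying the rubric weights."""
    return sum(weights[c] * score for c, score in component_scores.items())

student = {"content": 4, "organization": 3, "mechanics": 2}
print(analytic_total(student, weights))  # 0.4*4 + 0.3*3 + 0.3*2 = 3.1
```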
AFFECTIVE ASSESSMENTS

1. Closed-Item or Forced-Choice Instruments – ask for one specific answer.
a. Checklist – measures students' preferences, hobbies, attitudes, feelings, beliefs, interests, etc. by marking a set of possible responses.
b. Scales – instruments that indicate the extent or degree of one's response.
1. Rating Scale – measures the degree or extent of one's attitudes, feelings and perceptions about ideas, objects and people by marking a point along a 3- or 5-point scale.
2. Semantic Differential Scale – measures the degree of one's attitudes, feelings and perceptions about ideas and people by marking a point along a 3-, 7- or 11-point scale of semantic adjectives.
3. Likert Scale – measures the degree of one's agreement or disagreement with positive or negative statements about objects and people (see the scoring sketch after this list of instruments).
c. Alternate Response – measures students' preferences, hobbies, attitudes, feelings, beliefs and interests by choosing between two possible responses.
d. Ranking – measures students' preferences or priorities by ranking a set of responses.
2. Open-ended Instruments – open to more than one answer.
a. Sentence Completion – measures students' preferences over a variety of attitudes and allows students to answer by completing an unfinished statement, which may vary in length.
b. Surveys – measure the values held by an individual by writing one or many responses to a given question.
c. Essays – allow the students to reveal and clarify their preferences, hobbies, attitudes, feelings, beliefs and interests by writing their reactions or opinions to a given question.
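As a concrete illustration of how Likert-scale responses could be scored (a sketch under assumed conventions, not part of the original module): responses to positively worded statements are scored directly on a 1-5 scale, while negatively worded statements are reverse-scored before averaging, so a higher mean always indicates a more favorable attitude. The items and responses are hypothetical.

```python
# Each entry: (statement is positively worded?, response on a 1-5 scale,
# where 1 = strongly disagree and 5 = strongly agree)
responses = [
    (True, 5),   # e.g. "I enjoy mathematics."              -> scored as 5
    (True, 4),   # e.g. "Math class is interesting."        -> scored as 4
    (False, 2),  # e.g. "Math makes me anxious." (negative) -> reverse-scored as 4
]

def likert_mean(items, scale_max=5):
    """Average the item scores, reverse-scoring negatively worded statements."""
    scored = [r if positive else (scale_max + 1 - r) for positive, r in items]
    return sum(scored) / len(scored)

print(round(likert_mean(responses), 2))  # (5 + 4 + 4) / 3 = 4.33
```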
CRITERIA TO CONSIDER IN CONSTRUCTING GOOD TESTS

• VALIDITY – the degree to which a test measures what it is intended to measure. It is the usefulness of the test for a given purpose. It is the most important criterion of a good examination.
Factors influencing the validity of a test in general:
a. Appropriateness of the test – it should measure the abilities, skills and information it is supposed to measure.
b. Directions – it should indicate how the learners should answer and record their answers.
c. Reading vocabulary and sentence structure – it should be based on the intellectual level of maturity and the background experience of the learners.
d. Difficulty of items – it should have items that are neither too difficult nor too easy, so that it can discriminate the bright from the slow pupils.
e. Construction of items – it should not provide clues (so it does not become a test of clue-finding), nor should it be ambiguous (so it does not become a test of interpretation).
f. Length of the test – it should be of sufficient length to measure what it is supposed to measure, not so short that it cannot adequately measure the performance we want to measure.
g. Arrangement of items – items should be arranged in ascending level of difficulty, starting with the easy ones, so that pupils will persevere in taking the test.
h. Pattern of answers – it should not allow the creation of patterns in answering the test.
WAYS OF ESTABLISHING VALIDITY

1. Face Validity – done by examining the physical appearance of the test.
2. Content Validity – done through a careful and critical examination of the objectives of the test so that it reflects the curricular objectives.
3. Criterion-related Validity – established statistically by correlating a set of scores revealed by the test with scores obtained from another external measure.
•Concurrent Validity – describes the present
status of the individual by correlating the sets
of scores obtained from two measures given
concurrently
•Predictive Validity – describes the future
performance of an individual by correlating
the sets of scores obtained from two
measures given at a longer time interval.
•Construct validity – is established statistically
by comparing psychological traits or factors
that influence scores in a test, e.g. verbal,
numerical, spatial, etc.
• Convergent Validity – is established if the instrument defines a similar trait other than what it is intended to measure (e.g., a Critical Thinking Test may be correlated with a Creative Thinking Test).
•Divergent Validity – is established if an
instrument can describe only the intended
trait and not other traits (e.g. Critical Thinking
Test may not be correlated with Reading
Comprehension Test)
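Concurrent, predictive, convergent and divergent validity are all established by correlating two sets of scores, so a short computation makes the idea concrete. The sketch below (not part of the original module; the score lists are hypothetical) computes the usual Pearson r between scores on a new test and scores on an external measure; a coefficient near 1 indicates a strong positive relationship, while one near 0 indicates none.

```python
from statistics import mean
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores on a new critical thinking test and on an
# established creative thinking test taken at a close interval.
new_test = [78, 85, 62, 90, 70]
external = [74, 88, 65, 93, 72]
print(round(pearson_r(new_test, external), 2))  # near 1.0: evidence of convergent validity
```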
RELIABILITY
• Refers to the consistency of scores obtained by the same person when retested using the same instrument or one that is parallel to it.
Factors affecting Reliability

1. Length of the test – as a general rule, the longer the test, the higher the reliability. A longer test provides a more adequate sample of the behavior being measured and is less distorted by chance factors like guessing.
2. Difficulty of the test – ideally, achievement tests should be constructed so that the average score is 50 percent correct and the scores range from zero to near perfect. The bigger the spread of scores, the more reliable the measured differences are likely to be. A test is reliable if the coefficient of correlation is not less than 0.85.
3. Objectivity – obtained by eliminating the biases, opinions and judgments of the person who checks the test.
4. Administrability – the test should be administered with clarity and uniformity so that the scores obtained are comparable. Uniformity can be obtained by setting a time limit and giving uniform oral instructions.
5. Scorability – the test should be easy to score: directions for scoring are clear, the scoring key is simple, and provisions for answer sheets are made.
6. Economy – the test should be given in the cheapest way, which means that answer sheets should be provided so the test can be given from time to time.
7. Adequacy – the test should contain a wide sampling of items to determine the educational outcomes or abilities, so that the resulting scores are representative of total performance in the areas measured.
ITEM ANALYSIS

Steps:
1. Score the test. Arrange the scores from highest to lowest.
2. Take the top 27% (upper group) and the bottom 27% (lower group) of the examinees.
3. Count the number of examinees in the upper group and in the lower group who got each item correct.
4. Compute the difficulty index of each item.
5. Compute the discrimination index of each item.
Difficulty Index: Df = (DU + DL) / 2
where:
Df = difficulty index
DU = difficulty index for the upper group (proportion of the upper group answering the item correctly)
DL = difficulty index for the lower group (proportion of the lower group answering the item correctly)

Discrimination Index: D = DU - DL
INTERPRETATION OF THE DIFFICULTY INDEX (Df)

0.00 - 0.25  Very difficult            Revise/Discard
0.26 - 0.75  Average/Right difficulty  Retain
0.76 - 1.00  Very easy                 Revise/Discard
INTERPRETATION OF THE DISCRIMINATION INDEX

 0.46 to 1.00   Positive discriminating power  Retain
-0.50 to 0.45   Could not discriminate         Revise
-1.00 to -0.51  Negative discriminating power  Discard
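A minimal sketch of the computation described above (not from the original module; the group sizes and counts are hypothetical): given the number of correct responses in the upper and lower 27% groups, it computes the difficulty and discrimination indices and applies the interpretation tables.

```python
def item_analysis(correct_upper, correct_lower, n_upper, n_lower):
    """Return (difficulty index Df, discrimination index D) for one item."""
    du = correct_upper / n_upper   # difficulty index for the upper group
    dl = correct_lower / n_lower   # difficulty index for the lower group
    return (du + dl) / 2, du - dl

def interpret(df, d):
    """Apply the interpretation tables above to one item's indices."""
    if df <= 0.25:
        difficulty = "very difficult: revise/discard"
    elif df <= 0.75:
        difficulty = "average/right difficulty: retain"
    else:
        difficulty = "very easy: revise/discard"
    if d >= 0.46:
        discrimination = "positive discriminating power: retain"
    elif d >= -0.50:
        discrimination = "could not discriminate: revise"
    else:
        discrimination = "negative discriminating power: discard"
    return difficulty, discrimination

# Hypothetical item: 16 examinees per 27% group; 10 correct in the
# upper group and 6 correct in the lower group.
df, d = item_analysis(10, 6, 16, 16)
print(round(df, 2), round(d, 2))  # 0.5 0.25
print(interpret(df, d))
```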
Distractor Analysis Sample – Item 1

Option         A    B    C    D
Upper group    3   10    2    1
Lower group    5    6    5    0
Total         12   25   13   10
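Here is a sketch of how the sample above could be read (not part of the original module; it assumes, purely for illustration, that B is the keyed answer). A distractor generally functions well when it attracts more examinees from the lower group than from the upper group, and a distractor no one chooses contributes nothing.

```python
# Upper/lower counts per option from the sample item above.
# Assumption for illustration only: B is the keyed (correct) answer.
options = {"A": (3, 5), "B": (10, 6), "C": (2, 5), "D": (1, 0)}
key = "B"

for option, (upper, lower) in options.items():
    if option == key:
        continue  # the keyed answer is judged by the discrimination index instead
    if upper + lower == 0:
        print(option, "- chosen by no one: replace this distractor")
    elif upper > lower:
        print(option, "- attracts more upper-group examinees: review this distractor")
    else:
        print(option, "- attracts more lower-group examinees: functioning distractor")
```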
SCORING ERRORS AND BIASES

• Leniency error – faculty tends to judge work as better than it really is.
• Generosity error – faculty tends to use only the high end of the scale.
• Severity error – faculty tends to use only the low end of the scale.
• Central tendency error – faculty avoids both extremes of the scale.
• Bias – letting other factors influence the score (e.g., handwriting, typos).
• Halo effect – letting a general impression (e.g., of the student's prior work) influence the judgment of the work at hand.
• Contamination effect – judgment is influenced by irrelevant knowledge about the student or by other factors that have no bearing on performance level (e.g., student appearance).
• Similar-to-me effect – judging more favorably those students whom faculty see as similar to themselves (e.g., expressing similar interests or points of view).
• First-impression effect – judgment is based on early opinions rather than on a complete picture (e.g., the opening paragraph).
• Contrast effect – judging by comparing a student against other students instead of against established criteria and standards.
• Rater drift – unintentionally redefining criteria and standards over time or across a series of scorings (e.g., getting tired and cranky and therefore more severe, or getting tired and reading more quickly/leniently to get the job done).
FOUR TYPES OF MEASUREMENT SCALES

- Nominal: groups and labels data. Example: gender (1 = male, 2 = female).
- Ordinal: distances between points are indefinite. Example: income (1 = low, 2 = average, 3 = high).
- Interval: distances between points are equal, but there is no absolute zero. Examples: test scores, temperature.
- Ratio: has an absolute zero. Examples: height, weight.
SHAPES OF FREQUENCY POLYGONS

1. Normal/Bell-shaped/Symmetrical
2. Positively skewed – most scores are below the mean and there are extremely high scores.
3. Negatively skewed – most scores are above the mean and there are extremely low scores.
4. Leptokurtic – highly peaked, with tails more elevated above the baseline.
5. Mesokurtic – moderately peaked.
6. Platykurtic – flattened peak.
7. Bimodal curve – a curve with 2 peaks or modes.
8. Polymodal curve – a curve with 3 or more modes.
9. Rectangular distribution – there is no mode.
• Skewness – distortion or asymmetry from a symmetrical bell curve (normal distribution) in a set of data.
• Positively skewed (or right-skewed) – the mean is greater than the median; most scores are low, which suggests that students found the test difficult or did not understand the concepts (or were not taught well).
• Negatively skewed (or left-skewed) – the mean is less than the median; most scores are high, which suggests that most students understood the material or the test was easy.
MEASURES OF CENTRAL TENDENCY
AND VARIABILITY
Which statistical tool is appropriate depends on the data. Measures of central tendency describe the representative value of a set of data; measures of variability describe the degree of spread or dispersion of a set of data.

- When the frequency distribution is regular or symmetrical (normal), usually with numeric data (interval or ratio): use the Mean (the arithmetic average) with the Standard Deviation (the root mean square of the deviations from the mean).
- When the frequency distribution is irregular or skewed, usually with ordinal data: use the Median (the middle score in a group of ranked scores) with the Quartile Deviation (the average deviation of the 1st and 3rd quartiles from the median).
- When the distribution of scores is normal and a quick answer is needed, usually with nominal data: use the Mode (the most frequent score) with the Range (the difference between the highest and the lowest score in the distribution).
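A short sketch (not from the original module) computing the three pairs of statistics above for a hypothetical set of scores, using Python's statistics module (statistics.quantiles requires Python 3.8+):

```python
import statistics as st

scores = [70, 75, 75, 80, 82, 85, 88, 90, 95]  # hypothetical test scores

print("mean:", round(st.mean(scores), 2))
print("median:", st.median(scores))
print("mode:", st.mode(scores))

q1, _, q3 = st.quantiles(scores, n=4)          # 1st and 3rd quartiles
print("standard deviation:", round(st.stdev(scores), 2))
print("quartile deviation:", round((q3 - q1) / 2, 2))
print("range:", max(scores) - min(scores))
```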
How to Interpret the Measures of Central Tendency
• The value that represents a set of data will be the basis for determining whether the group is performing better or more poorly than other groups.
How to Interpret the Standard Deviation
• The result will help you determine whether the group is homogeneous or not.
• The result will help you determine the number of students who fall below and above the average performance.
• The standard deviation is a measure of how spread out numbers are: it indicates how much a group of scores varies from the average. It can tell you how much a group of grades varied on any given test, and it may tell you whether the test was too easy or too difficult.
• A small standard deviation means that the values in a statistical data set are close to the mean of the data set, while a large SD means the values are farther away from the mean.
How to Interpret the Quartile Deviation
• The result will help you determine whether the group is homogeneous or not.
• The result will also help you determine the number of students who fall below and above the average performance.
PERCENTILE
– tells the percentage of examinees that lies below one's score.
• Percentile scores – the percentage of scores in a frequency distribution that are equal to or lower than a given score.
• Scores are arranged in rank order from lowest to highest and divided into 100 equally sized groups or bands. The lowest score is "in the 1st percentile" (there is no 0th percentile); the highest score is "in the 99th percentile".
• If your score is in the 60th percentile, it means that you scored better than 60 percent of all the test takers.
Z-SCORES
– tell how many standard deviations a given raw score lies from the mean: z = (x - mean) / SD.
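A sketch combining the two ideas above (not part of the original module; the class scores are hypothetical): the z-score standardizes a raw score against the group mean and standard deviation, and the percentile rank counts the share of scores at or below a given score.

```python
from statistics import mean, pstdev

scores = [55, 60, 65, 70, 70, 75, 80, 85, 90, 95]  # hypothetical class scores

def z_score(x, data):
    """Number of standard deviations x lies from the mean of data."""
    return (x - mean(data)) / pstdev(data)

def percentile_rank(x, data):
    """Percentage of scores in data that are equal to or lower than x."""
    return 100 * sum(s <= x for s in data) / len(data)

print(round(z_score(85, scores), 2))   # 0.85: about 0.85 SDs above the mean
print(percentile_rank(85, scores))     # 80.0: as high as or higher than 80% of scores
```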
GRADES

a. could represent:
- how a student is performing in relation to other students (norm-referenced grading)
- the extent to which a student has mastered a particular body of knowledge (criterion-referenced grading)
- how a student is performing in relation to a teacher's judgment of his or her potential
b. could be for:
- certification that gives assurance that a student has mastered specific content or achieved a certain level of accomplishment
- selection that provides a basis for identifying or grouping students for certain educational paths or programs
- direction that provides information for diagnosis and planning
- motivation that emphasizes the specific material or skills to be learned and helps students understand and improve their performance
c. could be assigned by using:

• Criterion-Referenced Grading – grading based on fixed or absolute standards, where the grade is assigned based on how the student has met the criteria or the well-defined objectives of a course that were spelled out in advance. It is then up to the student to earn the grade he or she wants to receive, regardless of how other students in the class have performed. This is done by transmuting test scores into marks or ratings.
• Norm-Referenced Grading – grading based on relative standards, where a student's grade reflects his or her level of achievement relative to the performance of other students in the class. In this system, the grade is assigned based on the average of test scores.
• Point or Percentage Grading System – the teacher identifies points or percentages for various tests and class activities, and students earn them depending on their performance. The total of these points is the basis for the grade assigned to the student.
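A minimal sketch of a point or percentage grading system (not from the original module; the components, weights and scores are hypothetical): points earned in each component are converted to percentages, weighted, and combined into a final grade.

```python
# Hypothetical grading components: (points earned, points possible, weight)
components = {
    "quizzes":    (45, 50, 0.30),
    "unit tests": (80, 100, 0.40),
    "project":    (27, 30, 0.30),
}

def final_grade(parts):
    """Weighted sum of the percentage earned in each component."""
    return sum(weight * (earned / possible) * 100
               for earned, possible, weight in parts.values())

print(round(final_grade(components), 1))  # 0.3*90 + 0.4*80 + 0.3*90 = 86.0
```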
• Contract Grading System – each student agrees to work for a particular grade according to agreed-upon standards.
Conducting Parent-Teacher Conferences

The following points provide helpful reminders when preparing for and conducting parent-teacher conferences.
1. Make plans for the conference. Set the goals and objectives of the conference ahead of time.
2. Begin the conference in a positive manner. Starting the conference by making a positive statement about the student sets the tone for the meeting.
3. Present the student's strong points before describing the areas needing improvement. It is helpful to present examples of the student's work when discussing the student's performance.
4. Encourage parents to participate and share information. Although as a teacher you are in charge of the conference, you must be willing to listen to parents and share information rather than "talk at" them.
5. Plan a course of action cooperatively. The discussion
should lead to what steps can be taken by the teacher and
the parent to help the student.
6. End the conference with a positive comment. At the end of the conference, thank the parents for coming and say something positive about the student, like "Lucas has a good sense of humor and I enjoy having him in class."
7. Use good human relations skills during the conference. Some of these skills can be summarized by following the do's and don'ts.
Thank You!

Assessment.pptx module 1: professional education

  • 1.
  • 2.
    •an instrument designedto measure any characteristic, quality, ability, knowledge or skill. It comprised of items in the area it is designed to measure. Test
  • 3.
    •a process ofquantifying the degree to which someone/something possess a given trait Measurement
  • 4.
    •a process ofgathering and organizing quantitative or qualitative data into an interpretable form to have a basis for judgement or decision making. It is a prerequisite to evaluation. It provides the information which enables evaluation to take place. Assessment
  • 5.
    •a process ofsystematic interpretation, analysis, appraisal or judgement of the worth of organized data as basis for decision making. It involves judgement about the desirability of changes in students. Evaluation
  • 6.
    •It refers tothe use of pen and paper objective test. Traditional Assessment
  • 7.
    •it refers tothe use of methods other than pen and paper objective test which includes performance tests, projects, portfolios, journals and the likes. Alternative Assessment
  • 8.
    •it refers tothe use of an assessment method that simulate true to life situations. This could be objective tests that reflect real-life situations or alternative methods that are parallel to what we experience in real life. Authentic Assessment
  • 9.
    1. Assessment FORLearning – this includes three types of assessment done before and during instruction. These are placement, formative and diagnostic. Purposes of Classroom Assessment
  • 10.
    • Its purposeis to assess the needs of the learners to have basis in planning for a relevant instruction. • Teachers use this assessment to know what their students are bringing into the learning situation and use this as a starting point for instruction. • The results of this assessment place students in specific learning groups to facilitate teaching and learning. a. Placement – done prior to instruction
  • 11.
    • This assessmentis where teachers continuously monitor the students’ level of attainment of the learning objectives. • The results of this assessment are communicated clearly and promptly to the students for them to know their strengths and weaknesses and the progress of their learning. b. Formative – done during instruction
  • 12.
    • This isused to determine students’ recurring or persistent difficulties. • It searches for the underlying causes of student’s learning problems that do not respond to first aid treatment. It helps formulate a plan for detailed remedial instruction. c. Diagnostic – done before or during instruction
  • 13.
    • this isdone after instruction. This is usually referred to as the summative assessment. • It is used to certify what students know and can do and the level of their proficiency or competency. • The information from assessment of learning is usually expressed as marks or grades. • The results of which are communicated to the students, parents and other stakeholders for decision making. 2. Assessment OF Learning
  • 14.
    • this isdone for teachers to understand and perform well their role of assessing FOR and OF learning. It requires teachers to undergo training on how to assess learning and be equipped with the following competencies needed in performing their work as assessors. 3. Assessment AS Learning
  • 15.
    MODE DESCRIPTION EXAMPLESADVANTAGE DISADVANTAGE Traditional The objective paper and pen test which usually assesses s low-level thinking skills. Standardized tests Teacher-made tests -Scoring is objective. -Administration is easy because students can take the test at the same time. -Preparation of instrument is time- consuming. -Prone to cheating. Performance Requires actual demonstration of skills of creation of products of learning -Practical test -oral test -projects -preparation of the instrument is relatively easy -measures behaviors that cannot be deceived -scoring tends to be subjective without rubrics -administration is time consuming Portfolio -a process of gathering multiple indicators of student progress to support course goals in dynamic, ongoing and collaborative process. -working portfolios -show portfolios -documentary portfolios -measures student’s growth and development -intelligence-fair -development is time consuming -ratings tends to be subjective without rubrics MODES OF ASSESSMENT
  • 16.
    1. Placement Evaluation -donebefore instruction -determines mastery of prerequisite skills -not graded FOUR TYPES OF EVALUATION PROCEDURES
  • 17.
    2. Summative Evaluation -doneafter instruction -certifies mastery of the intended learning outcomes -graded -examples include quarterly exams, unit or chapter tests, final exams
  • 18.
    • determine theextent of what the pupils have achieved or mastered in the objectives of the intended instruction • determine the students in specific learning groups to facilitate teaching and learning • serve as a pretest for the next unit • serve as basis in planning for a relevant instruction Both Placement and Summative Evaluations:
  • 19.
    •reinforces successful learning •providescontinues feedback to both students and teachers concerning learning success and failures •not graded •examples: short quizzes, recitations 3. Formative Evaluation
  • 20.
    •determine persistent deficiencies •helpsformulate a plan for remedial instruction 4. Diagnostic Evaluation
  • 21.
    1. administered duringinstruction 2. designed to formulate a plan for remedial instruction 3. modify the teaching and learning process 4. not graded Formative and Diagnostic Evaluation both:
  • 22.
    Principle 1: Clarityof Learning Targets • Clear and appropriate learning targets include what students know and can do and the criteria for judging student performance. PRINCIPLES OF HIGH QUALITY ASSESSMENT
  • 23.
    •the method ofassessment to be used should match the learning targets Principle 2: Appropriateness of Assessment Methods
  • 24.
    • A balancedassessment sets target in all domains of learning or domains of intelligence. • A balanced assessment makes use of both traditional and alternative assessments. Principle 3: Balance
  • 25.
    •is the degreeto which the assessment instrument measures what it intends to measure. It is also refers to the usefulness of the instrument for a given purpose. It is the most important criterion of a good assessment instrument. Principle 4: Validity
  • 26.
    1. Face validity– is done by examining the physical appearance of the instrument to make it readable and understandable. 2. Content validity – is done through a careful and critical examination of the objectives of assessment to reflect the curricular objectives. Ways in Establishing Validity
  • 27.
    –is established statisticallysuch that a set of scores obtained in another external predictor or measure. It has two purposes: concurrent and predictive. a. Concurrent validity – describes the present status of the individual by correlating the sets of scores obtained from two measures given at a close interval. b. Predictive validity – describes the future performance of an individual by correlating the sets of scores obtained from two measures given at a longer time interval. 3. Criterion-related validity
  • 28.
    – is establishedstatistically by comparing psychological traits of factors that theoretically influence scores in a test. a. Convergent validity – is established if the instrument defines another similar trait other than what it is intended to measure. Ex. Critical thinking may be correlated with creative thinking test b. Divergent Validity – is established if an instrument can describe only the intended trait and not the other traits. Ex. Critical thinking test may not be correlated with reading comprehension test. 4. Construct validity
  • 29.
    •this refers tothe degree of consistency when several items in a test measure the same thing and the stability when the same measures are given across time •Split-Half method, test-retest method, parallel or equivalent form Principle 5: Reliability
  • 30.
    •fair assessment isunbiased and provides students with opportunities to demonstrate what they have learned. Principle 6: Fairness
  • 31.
    •When assessing learning,the information obtained should be worth the resources and time required to obtain it. The easier the procedure, the more reliable the assessment is. Principle 7: Practicality and Efficiency
  • 32.
    •Assessment takes placein all phases of instruction. It could be done before, during and after instruction. Principle 8: Continuity
  • 33.
    •Assessment targets andstandards should be communicated. Assessment results should be communicated to important users. Assessment results should be communicated to students through direct interaction or regular ongoing of feedback on their progress. Principle 9: Communication
  • 34.
    •Assessment should havea positive consequence to students; that is; it should motivate them to learn. •Assessment should have a positive consequence to teachers; that is, it should help them improve the effectiveness of their instruction. Principle 10: Positive Consequences
  • 35.
    • Teachers shouldfree the students from harmful consequences of misuse or overuse of various assessment procedures such as embarrassing students and violating students right to confidentiality. • Teachers should be guided by laws and policies that affect their classroom assessment. • Administrators and teachers should understand that it is inappropriate to use standardized student achievement to measure teaching effectiveness. Principle 11: Ethics
  • 36.
    • is aprocess of gathering information about student’s learning through actual demonstration of essential and observable skills and creation of products that are grounded in real world contexts and constraints. It is an assessment that is open to many possible answers and judged using multiple criteria or standards of excellence that are pre- specified and public. Performance-based Assessment
  • 37.
    1. Demonstration type– this is a task that requires no product. Examples: cooking demonstration, presentations 2. Creation-type – this is a task that requires tangible products •Example: project plan, research paper, project flyers Types of Performance-based Task
  • 38.
    •is also analternative to pen and paper objective test. It is a purposeful, ongoing, dynamic and collaborative process of gathering multiple indicators of the learner’s growth and development. Portfolio assessment is also performance based but more authentic than any performance- based task. Portfolio Assessment
  • 39.
    1. Content principle– suggests that portfolios should reflect the subject matter that is important for the students to learn. 2. Learning principle – suggests that portfolios should enable the students to become active and thoughtful learners. 3. Equity principle -explains that portfolios should allow students to demonstrate their learning styles and multiple intelligences. Principles Underlying Portfolio Assessment
  • 40.
    1. The workingportfolio is a collection of a student’s day to day works which reflect his/her learning. 2. The show portfolio is a collection of a student’s best works. 3. The documentary portfolio is a combination of a working and a show portfolio. Types of Portfolios
  • 41.
    Levels of Learning Outcomes DescriptionSome Question Clues Knowledge Involves remembering or recalling previously learned material or a wide range of materials -list, define, identify, name, recall, state, arrange Comprehension Ability to grasp the meaning of material b translating materials from one form to another or by interpreting material -describe, interpret, classify, differentiate, explain, translate Application Ability to use learned material in new and concrete situations -apply, demonstrate, solve, interpret, use, experiment Analysis Ability to break down material into its component parts so that the whole structure is understood -analyze, separate, explain, examine, discriminate, infer Synthesis Ability to put parts together to form a new whole -integrate, plan, generalize, construct, design, propose Evaluation Ability to judge the value of material on the basis of a definite criteria -assess, decide, judge, support, summarize, defend A. COGNITIVE DOMAIN
  • 43.
    Categories Description SomeIllustrative Verbs Receiving Willingness to receive or to attend to a particular phenomenon or stimulus -acknowledge, ask, choose, follow, listen, reply, watch Responding Refers to active participation on the part of the student -answer, assist, contribute, cooperate, follow-up Valuing Ability to see worth or value in a subject activity -adopt, commit, desire, display, explain, initiate Organization Bringing together a complex of values, resolving conflicts between them, and beginning to build an internally consistent value system -adapt, categories, establish, generalize, integrate, organize B. AFFECTIVE DOMAIN
  • 44.
    Categories Description SomeIllustrative Verbs Imitation Early stages in learning a complex skill after an indication or readiness to take a particular type of action -carry out, assemble, practice, follow, repeat, sketch, move Manipulation A particular skill or sequence; is practiced continuously until it becomes habitual and done with some confidence and proficiency -acquire, complete, conduct, improve, perform, produce Precision A skill has been attained with proficiency and efficiency -achieve, accomplish, excel. master, succeed Articulation An individual can modify movement patters to meet a particular situation -adapt, change, excel, reorganize, rearrange Naturalization An individual responds automatically and creates new motor ways of manipulation out of understanding, abilities and skills developed  arrange, combine, compose, construct, create, design C. PSYCHOMOTOR DOMAIN
  • 45.
    MAIN POINT OF COMPARISON TYPESOF TESTS Purpose Psychological Educational -aims to measure students’ intelligence or mental ability in a large degree without reference to what the students has learned (e.g. aptitude test, personality tests, intelligence tests) -aims to measure the result of instructions and learning (e.g. achievement tests, performance test) DIFFERENT TYPES OF TEST
  • 46.
    Scope of ContentSurvey Mastery -covers a broad range of objectives -covers a specific objective -measures general achievement in certain subjects -measures fundamental skills and abilities -constructed by trained professional -typically constructed by the teacher
  • 47.
    Language Mode VerbalNon-verbal -words are used by students in attaching meaning to or responding to test items -students do not use words in attaching meaning to or in responding to test items
  • 48.
    Construction Standardized Informal -constructedby a professional item writer -constructed by a classroom teacher -covers a broad range of content covered in a subject area -covers a narrow range of content Uses mainly multiple choice -various types of items are used -items written are screened and the best items were chosen for the final instrument -teacher picks or writes items as needed for the test -can be scored by a machine -scored manually by the teacher -interpretation of results is usually norm-referenced -interpretation is usually criterion-referenced
  • 49.
    Manner of Administration Individual Group -mostlygiven orally or requires actual demonstration of skill -this is a paper and pen test -one on one situations, thus, many opportunities for clinical observation -loss of rapport, insights and knowledge about each examinee -chance to follow up examinee’s response in order to clarify or comprehend more clearly -same amount of time needed to gather information from one student
  • 50.
Main point of comparison: Effect of Biases
• Objective – the scorer's personal judgment does not affect the scoring; worded so that only one answer is acceptable; little or no disagreement on what is the correct answer.
• Subjective – affected by the scorer's personal opinions, biases and judgments; several answers are possible; disagreement on what is the correct answer is possible.
Main point of comparison: Time Limit and Level of Difficulty
• Power – consists of a series of items arranged in ascending order of difficulty; measures the student's ability to answer more and more difficult items.
• Speed – consists of items approximately equal in difficulty; measures the student's speed or rate and accuracy in responding.
Main point of comparison: Format
• Selective – there are choices for the answer (multiple choice, true-false, matching type); can be answered quickly; prone to guessing; time-consuming to construct.
• Supply – there are no choices for the answer (short answer, completion, restricted or extended essay); may require a longer time to answer; less chance of guessing but prone to bluffing; time-consuming to answer and score.
Main point of comparison: Nature of Assessment
• Maximum Performance – determines what individuals can do when performing at their best.
• Typical Performance – determines what individuals will do under natural conditions.
Main point of comparison: Interpretation
• Norm-referenced – the result is interpreted by comparing one student's performance with other students' performance; some will really pass; there is competition for a limited percentage of high scores; typically covers a large domain of learning tasks; emphasizes discrimination among individuals in terms of level of learning; favors items of average difficulty and typically omits very easy and very difficult items; interpretation requires a clearly defined group.
• Criterion-referenced – the result is interpreted by comparing the student's performance against a predefined standard (mastery); all or none may pass; there is no competition for a limited percentage of high scores; typically focuses on a delimited domain of learning tasks; emphasizes description of what learning tasks individuals can and cannot perform; matches item difficulty to the learning tasks without altering item difficulty or omitting easy items; interpretation requires a clearly defined and delimited achievement domain.
FOUR COMMONLY-USED REFERENCES FOR CLASSROOM INTERPRETATION

Reference, the interpretation it provides, and the condition that must be present:
• Ability-referenced – How are students performing relative to what they are capable of doing? Requires good measures of the students' maximum possible performance.
• Growth-referenced – How much have students changed or improved relative to what they were doing? Requires pre- and post-measures of performance that are highly reliable.
• Norm-referenced – How well are students doing with respect to what is typical or reasonable? Requires a clear understanding of whom the students are being compared to.
• Criterion-referenced – What can students do and not do? Requires a well-defined content domain to be assessed.
TYPES OF TEST ACCORDING TO FORMAT

1. Selective Type – provides choices for the answer.
a. Multiple Choice – consists of a stem, which describes the problem, and three or more alternatives, which give the suggested solutions. The incorrect alternatives are the distractors.
b. True-False or Alternative Response – consists of a declarative statement that one has to mark true or false, right or wrong, correct or incorrect, yes or no, fact or opinion, and the like.
c. Matching Type – consists of two parallel columns: Column A, the column of premises from which a match is sought; Column B, the column of responses from which the selection is made.
Advantages and limitations by type:
• Multiple Choice – Advantages: more adequate sampling of content; tends to structure the problem to be addressed more effectively; can be quickly and objectively scored. Limitations: prone to guessing; often measures targeted behaviors only indirectly; time-consuming to construct.
• Alternative Response – Advantages: more adequate sampling of content; easy to construct; can be effectively and objectively scored. Limitations: prone to guessing; can be used only when dichotomous answers represent sufficient response options; usually must measure performance related to procedural knowledge indirectly.
• Matching Type – Advantages: allows comparison of related ideas, concepts or theories; effectively assesses associations between a variety of items within a topic; encourages integration of information; can be quickly and objectively scored; can be easily administered. Limitations: difficult to produce a sufficient number of plausible premises; not effective in testing isolated facts; may be limited to lower levels of understanding; useful only when there is a sufficient number of related items; may be influenced by guessing.
2. Supply Test
a. Short Answer – uses a direct question that can be answered by a word, phrase, number or symbol.
b. Completion Test – consists of an incomplete statement.
Advantages: easy to construct; requires the student to supply the answer; many items can be included in one test.
Limitations: generally limited to measuring recall of information; more likely to be scored erroneously due to the variety of possible responses.
3. Essay Test
a. Restricted Response – limits the content of the response by restricting the scope of the topic.
b. Extended Response – allows the students to select any factual information that they think is pertinent and to organize their answers in accordance with their best judgment.
Advantages: measures more directly the behaviors specified by performance objectives; examines students' written communication skills; requires the students to supply the response.
Limitations: provides a less adequate sampling of content; less reliable scoring; time-consuming.
GENERAL SUGGESTIONS IN WRITING TESTS

1. Use your TOS (Table of Specifications) as a guide to item writing.
2. Write more test items than needed.
3. Write the test items well in advance of the testing date.
4. Write each item so that the task to be performed is clearly defined.
5. Write each test item at an appropriate reading level.
6. Write each test item so that it does not provide help in answering other items in the test.
7. Write each test item so that the answer is one that would be agreed upon by experts.
8. Write each test item at the proper level of difficulty.
9. Whenever a test is revised, recheck its relevance.
SPECIFIC SUGGESTIONS

A. SHORT ANSWER AND COMPLETION TYPE
1. Word the item(s) so that the required answer is both brief and specific.
2. Do not take statements directly from textbooks to use as a basis for short-answer items.
3. A direct question is generally more desirable than an incomplete statement.
4. If the item is to be expressed in numerical units, indicate the type of answer wanted.
5. Blanks should be equal in length.
6. Answers should be written before the item number for easy checking.
7. When completion items are used, do not have too many blanks. Blanks should be at the center of the sentence, not at the beginning.
B. ESSAY TYPE
1. Restrict the use of essay questions to those learning outcomes that cannot be satisfactorily measured by objective items.
2. Formulate questions that will call forth the behavior specified in the learning outcome.
3. Phrase each question so that the pupil's task is clearly indicated.
4. Indicate an approximate time limit for each question.
5. Avoid the use of optional questions.
C. SELECTIVE TYPE

Alternative Response
1. Avoid broad statements.
2. Avoid trivial statements.
3. Avoid the use of negative statements, especially double negatives.
4. Avoid long and complex sentences.
Matching Type
1. Use only homogeneous material in a single matching exercise.
2. Include an unequal number of responses and premises, and instruct the pupils that a response may be used once, more than once, or not at all.
3. Keep the list of items to be matched brief, and place the shorter responses at the right.
4. Arrange the list of responses in logical order.
5. Indicate in the directions the basis for matching the responses and premises.
6. Place all the items for one matching exercise on the same page.
Multiple Choice
1. The stem of the item should be meaningful by itself and should present a definite problem.
2. The stem should include as much of the item as possible and should be free of irrelevant information.
3. Use a negatively stated stem only when a significant learning outcome requires it.
4. Highlight negative words in the stem for emphasis.
5. All the alternatives should be grammatically consistent with the stem of the item.
6. An item should have only one correct or clearly best answer.
PERFORMANCE AND AUTHENTIC ASSESSMENTS

When to use:
• Specific behaviors or behavioral outcomes are to be observed.
• There is a possibility of judging the appropriateness of students' actions.
• A process or outcome cannot be directly measured by paper-and-pencil tests.
Advantages:
• Allow evaluation of complex skills which are difficult to assess with written tests.
• Positive effect on instruction and learning.
• Can be used to evaluate both the process and the product.
Limitations:
• Time-consuming to administer, develop and score.
• Subjectivity in scoring.
• Inconsistencies in performance across alternative skills.
PORTFOLIO ASSESSMENT

Characteristics:
1. Adaptable to individualized instructional goals.
2. Focuses on the assessment of products.
3. Identifies students' strengths rather than weaknesses.
4. Actively involves students in the evaluation process.
5. Communicates student achievement to others.
6. Time-consuming.
7. Needs a scoring plan to increase reliability.
Types of portfolios:
• Showcase – a collection of students' best work.
• Reflective – used for helping teachers, students and family members think about various dimensions of student learning (e.g., effort, achievement).
• Cumulative – a collection of items done over an extended period of time, analyzed to verify changes in the products and processes associated with student learning.
• Goal-based – a collection of works chosen by students and teachers to match pre-established objectives.
• Process – a way of documenting the steps and processes a student has gone through to complete a piece of work.
RUBRICS

• A scoring guide consisting of specific pre-established performance criteria, used in evaluating student work in performance assessments.
Two Types
1. Holistic Rubric – requires the teacher to score the overall process or product as a whole, without judging the component parts separately.
2. Analytic Rubric – requires the teacher to score individual components of the product or performance first, then sum the individual scores to obtain a total score (see the sketch below).
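To make the analytic approach concrete, here is a minimal Python sketch that sums per-criterion ratings into a total score. The criteria and maximum points are hypothetical examples, not values prescribed by these notes.

# Hypothetical analytic rubric: criterion -> maximum points.
RUBRIC = {
    "content": 10,
    "organization": 5,
    "mechanics": 5,
}

def analytic_score(ratings):
    """Sum per-criterion ratings, checking each against the rubric maximum."""
    for criterion, points in ratings.items():
        if points > RUBRIC[criterion]:
            raise ValueError(f"{criterion}: rating exceeds the rubric maximum")
    return sum(ratings.values())

# One student's component scores, summed to a total of 15 out of 20.
print(analytic_score({"content": 8, "organization": 4, "mechanics": 3}))

A holistic rubric, by contrast, would replace the per-criterion loop with a single overall rating of the whole performance.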
AFFECTIVE ASSESSMENTS

1. Closed-Item or Forced-Choice Instruments – ask for one specific answer.
a. Checklist – measures students' preferences, hobbies, attitudes, feelings, beliefs, interests, etc. by having them mark a set of possible responses.
b. Scales – instruments that indicate the extent or degree of one's response.
1. Rating Scale – measures the degree or extent of one's attitudes, feelings and perceptions about ideas, objects and people by marking a point along a 3- or 5-point scale.
2. Semantic Differential Scale – measures the degree of one's attitudes, feelings and perceptions about ideas and people by marking a point along a 3-, 7- or 11-point scale of semantic adjectives.
3. Likert Scale – measures the degree of one's agreement or disagreement with positive or negative statements about objects and people (a scoring sketch follows).
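A common scoring convention for Likert scales, sketched below in Python, is to reverse-score negatively worded statements so that a high total always indicates a favorable attitude. The item keys and the 5-point range are illustrative assumptions, not taken from these notes.

# Hypothetical item keys: q1 and q3 are positively worded, q2 negatively.
POSITIVE = {"q1", "q3"}

def likert_total(responses, points=5):
    """Sum ratings (1..points), reverse-scoring negatively worded items."""
    total = 0
    for item, rating in responses.items():
        total += rating if item in POSITIVE else (points + 1 - rating)
    return total

# Strong agreement with the positive items and strong disagreement
# with the negative one yields the maximum score of 15.
print(likert_total({"q1": 5, "q2": 1, "q3": 5}))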
c. Alternate Response – measures students' preferences, hobbies, attitudes, feelings, beliefs and interests by having them choose between two possible responses.
d. Ranking – measures students' preferences or priorities by having them rank a set of responses.
2. Open-ended Instruments – open to more than one answer.
a. Sentence Completion – measures students' preferences over a variety of attitudes; students answer by completing an unfinished statement, which may vary in length.
b. Surveys – measure the values held by an individual, who writes one or many responses to a given question.
c. Essays – allow the students to reveal and clarify their preferences, hobbies, attitudes, feelings, beliefs and interests by writing their reactions or opinions to a given question.
CRITERIA TO CONSIDER IN CONSTRUCTING GOOD TESTS

• VALIDITY – the degree to which a test measures what it is intended to measure. It is the usefulness of the test for a given purpose, and the most important criterion of a good examination.
FACTORS influencing the validity of tests in general:
a. Appropriateness of the test – it should measure the abilities, skills and information it is supposed to measure.
b. Directions – it should indicate how the learners should answer and record their answers.
c. Reading vocabulary and sentence structure – it should be suited to the intellectual level of maturity and background experience of the learners.
d. Difficulty of items – its items should be neither too difficult nor too easy, so that it can discriminate the bright from the slow pupils.
e. Construction of items – it should not provide clues (so it does not become a test of finding clues) nor be ambiguous (so it does not become a test of interpretation).
f. Length of the test – it should be of sufficient length to measure what it is supposed to measure; if it is too short, it cannot adequately measure the performance we want to measure.
g. Arrangement of items – its items should be arranged in ascending level of difficulty, starting with the easy ones, so that pupils will persevere in taking the test.
h. Pattern of answers – it should not allow the creation of patterns in answering the test.
WAYS OF ESTABLISHING VALIDITY

1. Face Validity – done by examining the physical appearance of the test.
2. Content Validity – done through a careful and critical examination of the objectives of the test, so that it reflects the curricular objectives.
3. Criterion-related Validity – established statistically: the set of scores revealed by the test is correlated with the scores obtained on a criterion measure.
• Concurrent Validity – describes the present status of the individual by correlating the sets of scores obtained from two measures given concurrently.
• Predictive Validity – describes the future performance of an individual by correlating the sets of scores obtained from two measures given at a longer time interval.
• Construct Validity – established statistically by comparing psychological traits or factors that influence scores in a test, e.g., verbal, numerical, spatial, etc.
• Convergent Validity – established if the instrument correlates with a measure of a similar trait other than the one it is intended to measure (e.g., a Critical Thinking Test may be correlated with a Creative Thinking Test).
• Divergent Validity – established if the instrument describes only the intended trait and not other traits (e.g., a Critical Thinking Test may not be correlated with a Reading Comprehension Test).
RELIABILITY

• Refers to the consistency of scores obtained by the same person when retested using the same instrument or one that is parallel to it. It may be estimated through the test-retest method, the parallel or equivalent forms method, or the split-half method.
Factors affecting Reliability
1. Length of the test – as a general rule, the longer the test, the higher the reliability. A longer test provides a more adequate sample of the behavior being measured and is less distorted by chance factors such as guessing (see the sketch after this list).
2. Difficulty of the test – ideally, achievement tests should be constructed so that the average score is 50 percent correct and the scores range from zero to near perfect. The bigger the spread of scores, the more reliable the measured differences are likely to be. A test is reliable if the coefficient of correlation is not less than 0.85.
3. Objectivity – can be obtained by eliminating the bias, opinions and judgments of the person who checks the test.
4. Administrability – the test should be administered with clarity and uniformity so that the scores obtained are comparable. Uniformity can be obtained by setting a time limit and standardizing the oral instructions.
5. Scorability – the test should be easy to score: directions for scoring are clear, the scoring key is simple, and provisions for answer sheets are made.
6. Economy – the test should be given in the cheapest way, which means that answer sheets should be provided so the test can be given from time to time.
7. Adequacy – the test should contain a wide sampling of items to determine the educational outcomes or abilities, so that the resulting scores are representative of the total performance in the areas measured.
ITEM ANALYSIS

Steps:
1. Score the test. Arrange the scores from highest to lowest.
2. Get the top 27% (upper group) and the bottom 27% (lower group) of the examinees.
3. Count the number of examinees in the upper group and in the lower group who got each item correct.
4. Compute the difficulty index of each item.
5. Compute the discrimination index of each item.
Df = (CU + CL) / (NU + NL)
where: Df = difficulty index; CU and CL = number of examinees in the upper and lower groups who answered the item correctly; NU and NL = number of examinees in the upper and lower groups.

Discrimination Index = DU - DL
where: DU = difficulty index of the upper group (CU/NU); DL = difficulty index of the lower group (CL/NL).

A worked sketch follows the two interpretation tables below.
INTERPRETATION

Difficulty Index (Df)   Description                Action Taken
0.00 - 0.25             Very difficult             Revise/Discard
0.26 - 0.75             Average/right difficulty   Retain
0.76 - 1.00             Very easy                  Revise/Discard
INTERPRETATION

Discrimination Index   Description                      Action Taken
0.46 to 1.00           Positive discriminating power    Retain
-0.50 to 0.45          Could not discriminate           Revise
-1.00 to -0.51         Negative discriminating power    Discard
Item 1 Distractor Analysis Sample

Option        A    B    C    D
Upper group   3   10    2    1
Lower group   5    6    5    0
Total        12   25   13   10

Reading the sample: option B, chosen by more upper-group than lower-group examinees, behaves like the key; options A and C attract more lower-group examinees and are working as distractors; option D draws almost no examinees from either extreme group and may be worth reviewing.
SCORING ERRORS AND BIASES

• Leniency error – faculty tend to judge work as better than it really is.
• Generosity error – faculty tend to use only the high end of the scale.
• Severity error – faculty tend to use only the low end of the scale.
• Central tendency error – faculty avoid both extremes of the scale.
• Bias – letting other factors influence the score (e.g., handwriting, typing).
• Halo effect – letting a general impression (e.g., the student's prior work) influence the judgment of the current performance.
• Contamination effect – judgment is influenced by irrelevant knowledge about the student or other factors that have no bearing on performance level (e.g., student appearance).
• Similar-to-me effect – judging more favorably those students whom faculty see as similar to themselves (e.g., expressing similar interests or points of view).
• First-impression effect – judgment is based on early opinions rather than on a complete picture (e.g., the opening paragraph).
• Contrast effect – judging by comparing the student against other students instead of against established criteria and standards.
• Rater drift – unintentionally redefining criteria and standards over time or across a series of scorings (e.g., getting tired and cranky and therefore more severe, or getting tired and reading more quickly and leniently to get the job done).
FOUR TYPES OF MEASUREMENT SCALES

• Nominal – groups and labels data. Example: gender (1 = male, 2 = female).
• Ordinal – distances between points are indefinite. Example: income (1 = low, 2 = average, 3 = high).
• Interval – distances between points are equal; no absolute zero. Examples: test scores, temperature.
• Ratio – has an absolute zero. Examples: height, weight.
SHAPES OF FREQUENCY POLYGONS

1. Normal/Bell-shaped/Symmetrical
2. Positively Skewed – most scores are below the mean and there are extremely high scores.
3. Negatively Skewed – most scores are above the mean and there are extremely low scores.
4. Leptokurtic – highly peaked, with tails more elevated above the baseline.
5. Mesokurtic – moderately peaked.
6. Platykurtic – flattened peak.
7. Bimodal Curve – a curve with two peaks or modes.
8. Polymodal Curve – a curve with three or more modes.
9. Rectangular Distribution – there is no mode.
• Skewness – distortion or asymmetry from the symmetrical bell curve (normal distribution) in a set of data.
• Positively skewed (right-skewed) – the mean is greater than the median; most students scored low, suggesting they did not understand the concepts or the test was difficult.
• Negatively skewed (left-skewed) – the mean is less than the median; most students scored high, suggesting they understood the course or the test was easy.
MEASURES OF CENTRAL TENDENCY AND VARIABILITY
Measures of Central Tendency describe the representative value of a set of data; Measures of Variability describe the degree of spread or dispersion of a set of data. The appropriate pair depends on the assumptions about the data (a computational sketch follows this list):
• When the frequency distribution is regular or symmetrical (normal) and the data are numeric (interval or ratio): Mean – the arithmetic average; Standard Deviation – the root mean square of the deviations from the mean.
• When the frequency distribution is irregular or skewed, or the data are ordinal: Median – the middle score in a group of ranked scores; Quartile Deviation – the average deviation of the 1st and 3rd quartiles from the median.
• When the distribution is normal and a quick answer is needed, or the data are nominal: Mode – the most frequent score; Range – the difference between the highest and the lowest score in the distribution.
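The sketch below computes each pair of statistics with Python's standard statistics module. The quartile deviation follows the definition given above, which works out to half the distance between the 1st and 3rd quartiles; the score list is sample data.

import statistics

scores = [10, 12, 12, 15, 18, 20, 22, 25, 30]

# Normal, numeric data: mean and standard deviation.
mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # population standard deviation

# Skewed or ordinal data: median and quartile deviation.
median = statistics.median(scores)
q1, _, q3 = statistics.quantiles(scores, n=4)
quartile_deviation = (q3 - q1) / 2

# Nominal data, or when a quick answer is needed: mode and range.
mode = statistics.mode(scores)
score_range = max(scores) - min(scores)

print(mean, sd, median, quartile_deviation, mode, score_range)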
How to Interpret the Measures of Central Tendency
• The value that represents a set of data is the basis for determining whether the group is performing better or worse than other groups.
How to Interpret the Standard Deviation
• The result will help you determine whether the group is homogeneous or not.
• The result will help you determine the number of students that fall below and above the average performance.
• Standard deviation is a measure of how spread out numbers are: it indicates how much a group of scores varies from the average. It can tell you how much a group of grades varied on any given test, and it may indicate whether the test was too easy or too difficult.
• A small standard deviation means that the values in a data set are close to the mean, while a large SD means the values are farther away from the mean.
How to Interpret the Quartile Deviation
• The result will help you determine whether the group is homogeneous or not.
• The result will also help you determine the number of students that fall below and above the average performance.
PERCENTILE

• Tells the percentage of examinees that lie below one's score.
• Percentile score – the percentage of scores in a frequency distribution that are equal to or lower than it.
• Scores are arranged in rank order from lowest to highest and divided into 100 equally sized groups or bands. The lowest score is "in the 1st percentile" (there is no 0th percentile); the highest score is "in the 99th percentile."
• If your score is in the 60th percentile, it means that you scored better than 60 percent of all the test takers.
Z-SCORES

• Tell the number of standard deviations a given raw score lies above or below the mean: z = (X - mean) / SD. A sketch computing both statistics follows.
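A short Python sketch computing z-scores and percentile ranks for a sample score distribution; the percentile rank here is taken as the percentage of scores equal to or lower than a given score, per the definition above.

import statistics

scores = [45, 50, 55, 60, 60, 65, 70, 75, 80, 90]
mean = statistics.mean(scores)   # 65.0
sd = statistics.pstdev(scores)   # population standard deviation

def z_score(x):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

def percentile_rank(x):
    """Percentage of scores equal to or lower than x."""
    return 100 * sum(s <= x for s in scores) / len(scores)

print(round(z_score(80), 2))   # a raw score of 80 expressed in SD units
print(percentile_rank(80))     # 90.0 -> equal to or better than 90% of scores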
GRADES

a. Could represent:
- how a student is performing in relation to other students (norm-referenced grading);
- the extent to which a student has mastered a particular body of knowledge (criterion-referenced grading);
- how a student is performing in relation to a teacher's judgment of his or her potential.
b. Could be for:
- certification, giving assurance that a student has mastered specific content or achieved a certain level of accomplishment;
- selection, providing a basis for identifying or grouping students for certain educational paths or programs;
- direction, providing information for diagnosis and planning;
- motivation, emphasizing specific material or skills to be learned and helping students understand and improve their performance.
c. Could be assigned by using:

• Criterion-Referenced Grading – grading based on fixed or absolute standards, where the grade is assigned according to how well the student has met the criteria or well-defined objectives of a course that were spelled out in advance. It is then up to the student to earn the grade he or she wants to receive, regardless of how other students in the class have performed. This is done by transmuting test scores into marks or ratings (a sketch follows).
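A minimal sketch of one way to transmute raw scores into marks under a criterion-referenced scheme. The linear formula and the base grade of 50 are illustrative assumptions; actual transmutation tables differ from school to school.

def transmute(raw, total, base=50.0):
    """Map a raw score linearly onto a base..100 grade scale (hypothetical formula)."""
    return base + (raw / total) * (100.0 - base)

# A student answering 38 of 50 items correctly receives a mark of 88.0.
print(transmute(38, 50))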
• Norm-Referenced Grading – grading based on relative standards, where a student's grade reflects his or her level of achievement relative to the performance of other students in the class. In this system, the grade is assigned based on the average of test scores (a sketch follows).
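A hedged sketch of norm-referenced grade assignment by relative standing: students are ranked and grades are filled by quota. The quota percentages and names are illustrative assumptions, not values from these notes.

def norm_referenced_grades(scores, quotas=(("A", 0.20), ("B", 0.30),
                                           ("C", 0.30), ("D", 0.20))):
    """Rank students by score and assign grades by fixed shares of the class."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    grades, i = {}, 0
    for grade, share in quotas:
        take = round(share * len(ranked))
        for name in ranked[i:i + take]:
            grades[name] = grade
        i += take
    for name in ranked[i:]:  # anyone left over by rounding gets the lowest grade
        grades[name] = quotas[-1][0]
    return grades

print(norm_referenced_grades({"Ana": 92, "Ben": 85, "Cara": 78, "Dan": 70, "Eli": 66}))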
• Point or Percentage Grading System – the teacher assigns points or percentages to various tests and class activities, which students earn depending on their performance. The total of these points is the basis for the grade assigned to the student.
• Contract Grading System – each student agrees to work for a particular grade according to agreed-upon standards.
Conducting Parent-Teacher Conferences

The following points provide helpful reminders when preparing for and conducting parent-teacher conferences.
1. Make plans for the conference. Set the goals and objectives of the conference ahead of time.
2. Begin the conference in a positive manner. Starting with a positive statement about the student sets the tone for the meeting.
3. Present the student's strong points before describing the areas needing improvement. It is helpful to present examples of the student's work when discussing the student's performance.
4. Encourage parents to participate and share information. Although as a teacher you are in charge of the conference, you must be willing to listen to parents and share information rather than "talk at" them.
5. Plan a course of action cooperatively. The discussion should lead to what steps can be taken by the teacher and the parents to help the student.
6. End the conference with a positive comment. At the end of the conference, thank the parents for coming and say something positive about the student, like "Lucas has a good sense of humor and I enjoy having him in class."
7. Use good human-relations skills during the conference. Some of these skills can be summarized by following the do's and don'ts.
