Module 6: Learning Assessments
and Evaluation
Course code: TECS 623
Credit hours: 2
Prof. Omprakash H M
Department of Curriculum and Instruction
College of Education and Behavioral Sciences
Bule Hora University, Adola, Ethiopia
Unit One: Concepts, Purposes and
Principles of Assessment
1.1 Concept of Assessment and related
terms (Test, Measurement, Assessment
and Evaluation)
1.2 Function of Assessment and
Evaluation
1.3 Principles of Assessment (Validity,
Equity, reliability and explicitness)
1.4 Basic assumptions in assessing
students’ performance.
1.1 Concept of Assessment and related terms (Test,
Measurement, Assessment and Evaluation)
Concept of Assessment:
Assessment: Assessment is the process of gathering and
discussing information from multiple and diverse sources in
order to develop a deep understanding of what students know,
understand, and can do with their knowledge as a result of their
educational experiences; the process culminates when
assessment results are used to improve subsequent learning.
Comprehensive definition of assessment that incorporates its
key elements: the planned process of gathering and synthesizing
information relevant to the purposes of (a) discovering and
documenting students' strengths and weaknesses, (b) planning
and enhancing instruction, or (c) evaluating progress and making
decisions about students.
What is the basic concept of assessment?
Assessment refers to the full range of information
gathered and produced by teachers about their students
and their classrooms (Arends, 1994). Assessment is a
method for analyzing and evaluating student
achievement or program success.
Assessment involves the use of empirical data on
student learning to refine programs and improve
student learning.
It is the process of defining, selecting, designing,
collecting, analyzing, interpreting, and using
information to increase students' learning and
development.
All those activities undertaken by teachers,
and by their students in assessing themselves,
which provide information to be used as
feedback to modify the teaching and learning
activities in which they are engaged.
Basic Concepts in Testing and Assessment
Test Measurement Assessment Evaluation
What is a Test?
Perhaps test is a concept that we are more familiar with
than the other concepts. We have been taking tests ever
since we started schooling to determine our
academic performance. Tests are also used in work-
places to select individuals for a certain job vacancy.
Thus, a test in the educational context refers to the presentation of
a standard set of questions to be answered by students. It is one
instrument that is used for collecting information about
students’ behaviors or performances. It can be noted that there
are many other ways of collecting information about students’
educational performances other than tests, such as observations,
assignments, project works, portfolios, etc.
 Most commonly used method of making
measurements in education.
 An instrument or systematic procedure for
measuring a sample of behaviour by posing a set of
questions in a uniform manner, in order to measure a
quality, ability, skill or knowledge.
 Designed in various formats, e.g. objective, subjective,
descriptive, timed, or pretest/posttest, to enhance
observation of learning.
 There is always a right/best and a wrong answer.
What is Measurement?
Measurement: In our day to day life there are
different things that we measure.
We measure our height and put it in terms of
meters and centimeters.
We measure some of our daily consumptions like
sugar in kilograms and liquids in liters.
We measure temperature and express it in terms of
degree centigrade or degree Celsius. How do we
measure these things?
Well definitely we need to have appropriate instruments such as a
meter, a weighing scale, or a thermometer in order to have reliable
measurements.
Similarly, in education measurement is the process by which the
attributes of a person are measured and described in numbers. It is
a quantitative description of the behavior or performance of
students. As educators we frequently measure human attributes
such as attitudes, academic achievement, aptitudes, interests,
personality and so forth. Measurement permits more objective
description concerning traits and facilitates comparisons. Hence,
to measure we have to use certain instruments so that we can
conclude that a certain student is better in a certain subject than
another student. How do we measure performance in
mathematics?
We use a mathematics test which is an instrument containing
questions and problems to be solved by students. The number of
right responses obtained is an indication of performance of
individual students in mathematics. Thus, the purpose of
educational measurement is to represent how much of ‘something’
is possessed by a person using numbers. Note that we are only
collecting information. We are not evaluating! Evaluation is
therefore quite different from measurement. Measurement is also
not the same as testing. While a test is an instrument to collect
information about students’ behaviors, measurement is the
assignment of quantitative value to the results of a test or other
assessment techniques. Measurement can refer to both the score
obtained as well as the process itself.
 Basically the assignment of numbers.
 A variety of instruments, such as tests, rating scales and rubrics,
are used.
 The process of obtaining a numerical description of the degree
to which an individual possesses a given attribute.
 Quantifying how much learners have learned.
What is Assessment?
Assessment: In educational literature the concepts
‘assessment’ and ‘evaluation’ have been used with
some confusion. Some educators have used them
interchangeably to mean the same thing. Others
have used them as two different concepts. Even
when they are used differently there is too much
overlap in the interpretations of the two concepts.
Cizek (in Phiye, 1997) provides us a comprehensive definition of
assessment that incorporates its key elements:
The planned process of gathering and synthesizing information
relevant to the purposes of:
(a) Discovering and documenting students' strengths and
weaknesses,
(b) Planning and enhancing instruction, or
(c) Evaluating progress and making decisions about students.
Process by which evidence of student achievement is obtained
and evaluated (marks obtained, how...checked).
 Including testing, interpreting and placing information in
context.
 Process of gathering and organizing data – the basis
for decision- making (evaluation).
 Methods of measuring and evaluating the nature of the
learner (what he learned / how he learned), e.g. objective or
descriptive.
Principles of Assessment
1. Assessment should be aimed at improving student performance.
2. Assessment should be based on an understanding of how students
learn.
3. Assessment should be an integral component of course design
and not something to add afterwards.
4. Good assessment provides useful information to report credibly to
parents on student achievement.
The Role of Assessment in Learning
Assessment plays a major role in how students learn, their
motivation to learn, and how teachers teach.
Assessment is used for various purposes:
 Assessment for learning: where assessment helps teachers gain
insight into student learning in order to adjust instruction.
 Assessment as learning: where students develop an awareness of
how they learn and monitor their own learning.
 Assessment of learning: where assessment informs students,
teachers and parents, as well as the broader educational
community, of achievement at a certain point in time.
Research and experience show that student learning is best
supported when:
 Instruction and assessment are based on clear learning goals.
 Instruction and assessment are differentiated according to
student learning needs.
 Assessment information is used to make decisions that
support further learning.
 Parents are well informed about their child’s learning, and
work with the school to help plan and provide support,
e.g. weekly/monthly meetings.
Advantages of Assessment
 Helps in knowing the position of a
student when they enter a course,
e.g. Adam, a student at Adola Teachers’
College.
 It provides a broad view of students’
needs and achievement.
 In accordance with the students’
achievement, the curriculum and teaching
methods can be adjusted.
Disadvantages of Assessment
It limits the potential of a student to a mere ‘test’.
 Under pressure and supervision, creativity and
performance are affected.
 Though assessment aims at bringing out latent/hidden
knowledge, it often conceals it through the pressure it creates.
 The parameter for judging knowledge is just a test score.
Evaluation
It is the process of obtaining, analyzing and interpreting
information to determine the extent to which students
achieve the learning objectives, e.g. how many marks a student secures.
1.2 Function of Assessment and Evaluation
1. Capturing student time and attention.
2. Generating appropriate student learning activity.
3. Providing timely feedback which students pay attention
to.
4. Helping students to internalize the discipline’s
standards and notions of quality.
5. Generating marks or grades which distinguish between
students or enable pass/fail decisions to be made.
 Monitoring the progress
 Decision making
 Screening
 Diagnostic process
 Placement of students in remedial courses
 Instructional planning
 Evaluation of instructional programs
 Feedback
 Motivation
1.3 Principles of Assessment (Validity, Equity,
reliability and explicitness)
What are the basic principles of assessment?
1 Assessment should be valid.
2 Assessment should be reliable and consistent.
3 Information about assessment should be explicit, accessible
and transparent.
4 Assessment should be inclusive and equitable
 Validity
Validity refers to the evidence base that can be provided about
appropriateness of the inferences, uses, and consequences that
come from assessment.
Appropriateness has to do with the soundness, trustworthiness, or
legitimacy of the claims or inferences that testers would like to
make on the basis of obtained scores.
 Validity is “the extent to which inferences
made from assessment results are appropriate,
meaningful, and useful in terms of the purpose
of the assessment” (Gronlund, 1998).
 Validity refers to whether the test is actually
measuring what it claims to measure (Arshad,
2004).
Face Validity
It is pertinent that a test looks like a test even at first
impression.
If students taking a test feel that the questions given
to them are not a test or part of a test, then the test may
not be valid, as the students may not take it seriously
when attempting the questions.
Construct Validity
Construct is a psychological concept used in measurement.
Construct validity is the most obvious reflection of whether a
test measures what it is supposed to measure as it directly
addresses the issue of what it is that is being measured.
In other words, construct validity refers to whether the
underlying theoretical constructs that the test measures are
themselves valid.
 Equity
Principle 1 - Assessment should be inclusive and equitable/fair. As
far as is possible without compromising academic standards,
inclusive and equitable assessment should ensure that tasks and
procedures do not disadvantage any group or individual.
Principle 2 - Information about assessment should be explicit,
accessible and transparent. Clear, accurate, consistent and timely
information on assessment tasks and procedures should be made
available to students, staff and other external assessors or
examiners.
Principle 3 - Assessment should be inclusive and equitable. As far
as is possible without compromising academic standards,
inclusive and equitable assessment should ensure that tasks and
procedures do not disadvantage any group or individual.
Principle 4 - Assessment should be an integral part of programme
design and should relate directly to the programme aims and
learning outcomes. Assessment tasks should primarily reflect the
nature of the discipline or subject but should also ensure that
students have the opportunity to develop a range of generic skills
and capabilities.
Principle 5 - The amount of assessed work should be manageable.
The scheduling of assignments and the amount of assessed work
required should provide a reliable and valid profile of achievement
without overloading staff or students.
Principle 6 - Formative and summative assessment should be
included in each programme. Formative and summative assessment
should be incorporated into programmes to ensure that the
purposes of assessment are adequately addressed. Many
programmes may also wish to include diagnostic assessment.
Principle 7 - Timely feedback that promotes learning and facilitates
improvement should be an integral part of the assessment process.
Students are entitled to feedback on submitted formative
assessment tasks, and on summative tasks, where appropriate. The
nature, extent and timing of feedback for each assessment task
should be made clear to students in advance.
Principle 8 - Staff development policy and strategy should include
assessment. All those involved in the assessment of students must
be competent to undertake their roles and responsibilities.
 Reliability
According to Brown (2010), a reliable test can be described as
follows:
◦ Consistent in its conditions across two or more
administrations
◦ Gives clear directions for scoring / evaluation
◦ Has uniform rubrics for scoring / evaluation
◦ Lends itself to consistent application of those rubrics by
the scorer
◦ Contains items / tasks that are unambiguous to the
test-taker
Reliability means the degree to which an assessment tool
produces stable and consistent results.
Reliability essentially denotes ‘consistency, stability,
dependability, and accuracy of assessment results’
Since there is considerable variability from one teacher or
tester to another that affects student performance,
consistency in planning, implementing, and scoring
student performances gives rise to valid assessment.
Split Half Reliability
A test is administered once to a group, divided into two
equal halves after the students have returned it, and
the two halves are then correlated.
Halves are often determined based on the number assigned to
each item with one half consisting of odd numbered items and
the other half even numbered items.
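The odd/even split described above can be sketched in code. This is a minimal illustration with hypothetical score data: it correlates the two half-test scores with the Pearson formula, then applies the Spearman-Brown correction (full-test reliability = 2r / (1 + r)) to estimate the reliability of the whole test.

```python
# Split-half reliability sketch (hypothetical data): each row is one
# student's item scores (1 = correct, 0 = wrong) on a 6-item test.
scores = [
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 1, 0],
]

def pearson(x, y):
    """Pearson correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Odd-numbered items form one half, even-numbered items the other.
odd_half = [sum(row[0::2]) for row in scores]
even_half = [sum(row[1::2]) for row in scores]

r_half = pearson(odd_half, even_half)
# Spearman-Brown correction estimates reliability of the full-length test.
r_full = 2 * r_half / (1 + r_half)
print(round(r_full, 3))
```

Because the correlation is computed between half-length tests, the Spearman-Brown step is what brings the estimate back up to the full test length.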
Factors that affect the reliability of a test
a. Test factor
b. Teacher and student factor
c. Environment factor
d. Test administration factor
e. Marking factor
a. Test Factor
In general, longer tests produce higher reliabilities.
Due to the dependency on coincidence and guessing, the
scores will be more accurate if the duration of the test is
longer.
An objective test has higher consistency because it
is not exposed to a variety of interpretations.
b. Teacher and Student factor
In most tests, it is normal for teachers to construct and
administer tests for students. Thus, any good teacher-student
relationship would help increase the consistency of the
results.
Other factors that contribute to positive effects to the
reliability of a test include teacher’s encouragement, positive
mental and physical condition, familiarity with the test formats,
and perseverance and motivation.
c. Environment Factor
An examination environment certainly influences test-
takers and their scores.
Any favourable environment with comfortable chairs and desks,
good ventilation, sufficient light and space will improve the
reliability of the test.
On the contrary, a non-conducive environment will
affect test-takers’ performance and test reliability.
d. Test Administration Factor
Because students' grades are dependent on the way tests are
being administered, test administrators should strive to
provide clear and accurate instructions, sufficient time and
careful monitoring of tests to improve the reliability of their
tests.
A test-retest technique can be used to determine test
reliability.
e. Marking Factor
Human judges have many opportunities to
introduce error in scoring essays.
It is also common that different markers award
different marks for the same answer even with a
prepared mark scheme.
A marker’s assessment may vary from time to
time and with different situations.
Conversely, it does not happen to the objective type of
tests since the responses are fixed. Thus, objectivity is
a condition for reliability.
 Explicitness
Principle - Information about assessment should be explicit,
accessible and transparent. Clear, accurate, consistent and
timely information on assessment tasks and procedures
should be made available to students, staff and other external
assessors or examiners.
1.4 Basic assumptions in assessing students’ performance
1. Quality of student learning is directly related to quality of
teaching.
2. The first step in getting useful feedback about course goals is
to make these goals explicit.
3. Students need focused feedback early and often, and they
should be taught how to assess their own learning.
4. The most effective assessment addresses problem-directed
questions that faculty ask themselves.
5. Course assessment is an intellectual challenge and therefore
motivating for the faculty.
6. Assessment does not require special training.
7. Collaboration with colleagues and students improves learning
and is satisfying.
Reflection: When planning to assess students, what are the
assumptions that one held in mind?
What are the things that should be kept in mind when preparing
assessment tools for assessing students?
Angelo and Cross (1993) have listed seven basic assumptions of
classroom assessment which are described as follows:
The quality of student learning is directly, although not
exclusively related to the quality of teaching.
Therefore, one of the most promising ways to improve learning
is to improve teaching.
If assessment is to improve the quality of students’ learning, both
teachers and students must become personally invested and
actively involved in the process.
Reflection: What should be the roles of students and teachers in
classroom assessment so that it helps students’ learning?
To improve their effectiveness, teachers need first to make their
goals and objectives explicit and then to get specific,
comprehensible feedback on the extent to which they are
achieving those goals and objectives.
Effective assessment begins with clear goals.
Before teachers can assess how well their students are learning,
they must identify and clarify what they are trying to teach.
After teachers have identified specific teaching goals they wish
to assess, they can better determine what kind of feedback to
collect.
To improve their learning, students need to receive appropriate
and focused feedback early and often; they also need to learn how
to assess their own learning.
Reflection: How do you think feedback and self-assessment will
help to improve students’ learning?
The type of assessment most likely to improve teaching and
learning is that conducted by teachers to answer questions they
themselves have formulated in response to issues or problems in
their own teaching.
To best understand their students’ learning, teachers need
specific and timely information about the particular individuals in
their classes. As a result of the different students’ needs, there is
often a gap between assessment and student learning. One goal of
classroom assessment is to reduce this gap.
Reflection: How does classroom assessment help to reduce this
gap between assessment and student learning?
Systematic inquiry and intellectual challenge are powerful sources
of motivation, growth, and renewal for teachers, and classroom
assessment can provide such challenge.
Classroom assessment is an effort to encourage and assist those
teachers who wish to become more knowledgeable, involved, and
successful.
Classroom assessment does not require specialized training; it
can be carried out by dedicated teachers from all disciplines.
To succeed in classroom assessment, teachers need only a
detailed knowledge of the discipline, dedication to teaching, and
the motivation to improve.
By collaborating with colleagues and actively involving
students in classroom assessment efforts, teachers
(and students) enhance learning and personal
satisfaction. By working together, all parties achieve
results of greater value than those they can achieve by
working separately.
Reflection: Can you explain how teachers’
collaboration with colleagues can be more effective in
enhancing learning and personal satisfaction than
working alone?
Unit Two: Assessment types, Methods and Tools
2.1. Assessment Types
2.2. Assessment Method
2.3. Assumption in selecting assessment methods
2.4. Table of specification and construction of item
2.5. Test administration, marking and grading
2.1. Assessment Types
Advantages of Formative Assessment
 Develops knowledge
 Continuous improvement
 Provides quick feedback
 Achieves successful outcomes
 Regular communication with parents
Disadvantages of Formative Assessment
 Time consuming and requires resources
 Tiring process
 Requires trained and qualified professionals
 Develops challenges
Advantages of Summative Assessment
 To know if students have understood
 They determine achievement
 They make academic records
 Boosts individuals
 Weak areas can be identified
 Training success can be measured
Disadvantages of Summative Evaluation
 Demotivates individuals
 Rectification is late
 It is disruptive
 No remedy to identify challenges in advance
 Not accurate reflection of learning
 Negative effect for students
Assessment as Procedure
 This paradigm has elements of both measurement
paradigm and inquiry paradigm.
 The primary focus is on assessment procedures and
not on the underlying purposes of the assessment
program.
 Knowledge is believed to exist separately from the
learners, and it can be transmitted to the students and
eventually objectively measured.
Assessment as Inquiry
 Under this paradigm, assessment is based on
constructive theories of knowledge, student-centered
learning and the inquiry process.
 The teachers use various qualitative and quantitative
techniques to inquire about particular learners.
 It is a process of inquiry, and a process of
interpretation, used to promote reflection
concerning students' understandings, attitudes, and
literate abilities.
 Assessment, in this paradigm, is viewed as a social,
context-specific, interpretive activity.
Assessment as Measurement
 The primary instrument of this paradigm is the large-scale,
norm-referenced standardized test.
 Objectivity, standardization and reliability are the main concerns.
 Knowledge is considered to exist separately from the
learners; and the learners work to acquire it and not construct
it.
 Decisions about the information to be collected and the
means of evaluation are made by authorities outside the
classroom.
Conclusion:
 Therefore, it may be concluded that though the leanings are
towards the formative evaluation system, it is not without
the need for modification.
 A balance is required not only in the formative but also in any
form of evaluation system.
 In the summative system too much leniency risks losing the
attention of the students, while in formative evaluation too
much stress can endanger the clarity of concepts in the
students.
 A midway has to be found between the two extremes, and a
modified, flexible and accommodating system of evaluation, along
the lines of the formative system, should be adopted, so that
the faculty and the students can keep track of the latter’s
progress while at the same time also have the chance to
improve, develop and grow.
2.2. Assessment Method
Methods of Assessment. Methods will vary depending on the
learning outcome(s) to be measured. Direct methods are when
students demonstrate that they have achieved a learning
outcome or objective. Indirect methods are when students (or
others) report perceptions of how well students have achieved an
objective or outcome.
A. Direct Assessment Method:
Direct assessment involves looking at actual samples of student
work produced in our programs. Examples include observations of
field work, internship performance, service learning and clinical
experiences; grades based on explicit criteria related to clear
learning goals; tests of writing, critical thinking, or general
knowledge; and performance on achievement tests.
B. Indirect Assessment Method
Indirect methods are when students (or others) report perceptions
of how well students have achieved an objective or outcome.
Indirect assessment methods require that faculty infer actual
student abilities, knowledge, and values rather than observe direct
evidence. Among indirect methods are surveys, exit interviews, focus
groups, and the use of external reviewers.
 Surveys: Surveys usually are given to large numbers of possible
respondents, usually in writing, and often at a distance.
 Exit interviews and focus groups: Exit interviews and focus
groups allow faculty to ask specific questions face-to-face with
students.
 External reviewers: External reviewers are usually representatives
of the discipline and usually are guided by discipline-based
standards.
Advantages:
•Indirect methods are easy to administer;
•Indirect methods may be designed to facilitate
statistical analyses;
•Indirect methods may provide clues about what
could be assessed directly;
•Indirect methods can flesh out areas that direct
assessments cannot capture;
•Indirect methods are particularly useful for
ascertaining values and beliefs;
•Surveys can be given to many respondents at a time;
•Surveys are useful for gathering information from alumni,
employers, and graduate program representatives;
•Exit interviews and focus groups allow faculty to question
students face to face;
•External reviewers can bring a degree of objectivity to the
assessment;
•External reviewers can be guided either by questions that
the Department wants answered or by discipline-based
national standards.
DISADVANTAGES:
•Indirect methods provide only impressions and
opinions, not hard evidence;
•Impressions and opinions may change over time
and with additional experience;
•Respondents may tell you what they think you
want to hear;
•The number of surveys returned is usually low,
with 33 percent considered a good number;
•You cannot assume those who do not
respond would have responded in the
same way as those who did respond;
•Exit interviews take time to carry out;
•Focus groups usually involve a limited
number of respondents;
•Unless the faculty agree upon the
questions that are asked in exit
interviews and focus groups, there may
not be consistency in the responses.
2.3 Assumption in selecting assessment methods
Choosing appropriate assessments:
1. Vary assessments.
2. Consider intervals for assessment.
3. Match learning goals to assessments.
4. Direct and indirect assessment.
5. Collect data on student performance.
6. Revise assessment choices.
7. Assessment Primer.
8. Creating Assignments and Exams.
Assessment methods are selected based on the level and
content of your learning objectives. Certain assessment
methods are better suited for different types of knowledge,
skills, or attitude aspects.
1. Vary assessments
Student learning styles vary widely, and their strengths and
challenges with respect to assessment vary as well. Instructors need
to consider that variation as they choose assessments for their
courses. By varying the way we assess student understanding, we are
more likely to offer opportunities for every student to demonstrate
their knowledge. This can be accomplished by creating courses with
three or more forms of assessment, for example papers, class projects
and exams. This can also be accomplished by offering choices of how
to be assessed, for example giving students the option of writing a
paper or taking an exam for a unit of instruction, as long as by the
end of a course they have done both forms of assessment. This
might also be accomplished by offering multiple questions, and
having students choose which to answer. New faculty members
should think creatively how to best elicit quality student responses.
2. Consider intervals for assessment
The frequency of assessment varies widely from course to
course. Some classes assess only twice, on a midterm and a
final. Others have weekly assignments, presentations and
homework. Think about the frequency with which your
students should be assessed, based on the knowledge that
assessment drives learning by focusing student attention,
energy, and motivation to learn. New faculty members need
to try various intervals and choose those that best support
their students’ learning.
3. Match learning goals to assessments
What we assess is what our students study, engage with, and
explore in more depth. By beginning with what we want
students to know and be able to do, we can design and choose
assessments to demonstrate the appropriate knowledge and
skills we are aiming for them to learn. After choosing student
learning outcomes, make a grid that places learning outcomes
across one axis, and the assessment that demonstrates their
achievement of those outcomes on the other axis. In this way
new faculty members can double-check to be certain that each of
the student learning outcomes has been assessed. If we make
clear to students how each assessment furthers the goals of the
course, they are able to make informed choices about how to
spend their limited study time to achieve the course goals.
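The grid suggested above can also be sketched programmatically. This is a minimal illustration with hypothetical outcome and assessment names: it records which assessment covers which learning outcome, then flags any outcome that no assessment demonstrates.

```python
# Hypothetical learning outcomes for a unit on demand and supply.
outcomes = ["define demand", "apply law of supply", "analyze price changes"]

# Coverage grid: assessment -> set of outcomes it demonstrates.
coverage = {
    "midterm exam":  {"define demand", "apply law of supply"},
    "class project": {"analyze price changes"},
}

# Union of everything the assessments cover.
covered = set().union(*coverage.values())

# Outcomes left unassessed indicate a gap in the assessment plan.
unassessed = [o for o in outcomes if o not in covered]
print(unassessed)  # an empty list means every outcome is assessed
```

The same check works in the other direction: iterating over `coverage` shows whether any single assessment carries too many outcomes on its own.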
4. Direct and indirect assessment
Assessment strategies are typically classified as direct, where
actual student behavior is measured or assessed, or indirect,
including things like surveys, focus groups, and similar
activities that gather impressions or opinions about a
program or its learning goals. If student assessment is
embedded in a course, meaning it impacts a course grade, it
is typically taken more seriously.
5. Collect data on student performance
In spite of our best efforts at choosing the appropriate forms of
assessment, and the intervals that best support student learning,
there will be some topics, or units of instruction where students
come up short. If we collect data on these issues, which test
questions are commonly missed, which paper topics are
commonly derailed, what misconceptions some students are
taking away, we can identify weaknesses in instruction and
assessment choices and make adjustments as needed.
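One simple way to collect such data is an item-difficulty analysis. The sketch below uses hypothetical response data: the difficulty index of an item is the proportion of students who answered it correctly, and items below a chosen threshold are the "commonly missed" questions worth reviewing.

```python
# Hypothetical item-response data: rows are students, columns are test
# items (1 = correct, 0 = wrong).
responses = [
    [1, 0, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
]

n_students = len(responses)

# Difficulty index per item: proportion of students answering correctly.
difficulty = [sum(col) / n_students for col in zip(*responses)]

# Flag items answered correctly by fewer than half of the class.
flagged = [i + 1 for i, p in enumerate(difficulty) if p < 0.5]
print(difficulty)  # proportion correct per item
print(flagged)     # item numbers needing review
```

A flagged item may point to weak instruction on that topic, an ambiguous question, or both, which is exactly the signal this step of the cycle is meant to produce.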
6. Revise assessment choices
After analyzing student achievement systematically, we should
begin to see gaps in our teaching or the effectiveness of our
assessments to measure student understanding. This is the time to
modify our assessments and the instruction leading up to them to
better support student learning. Accomplished faculty members
continually revise the ways they assess student knowledge and
skills to close the learning gap. The more students we can move
toward deep understanding of the course topics, the more effective
we are as instructors. The best time to make these revisions is
right after an assessment is evaluated and the results analyzed to
be certain to make changes when the understanding of weaknesses
is fresh in our minds. Throughout this revision process it is
important to maintain high expectations about what students
should know and be able to do.
7. Assessment Primer
Assessment Primer: Describing Your Community, Collecting
Data, Analyzing the Issues and Establishing a Road Map for
Change
8. Creating Assignments and Exams
Set the grading categories, but ask the students to help write the
descriptions. Draft the complete grading scale, then give it to
students for review and suggestions. Determining goals for
the assignment and its essential logistics is a good start
to creating an effective assignment.
2.4. Table of specification and construction of item
What is a table of specifications in test construction?
The table of specifications (TOS) is a tool used to ensure that
a test or assessment measures the content and thinking skills that
the test intends to measure. That is,
a TOS helps test constructors to focus on the issue of response content,
ensuring that the test or assessment measures what it intends to
measure.
A table of specification is an activity which enumerates the
information and cognitive tasks on which examinees are to be
assessed. It defines as clearly as possible the scope and emphasis
of the test and relates the objectives to the content in order to
ensure a balanced set of test items. A table of specification,
sometimes referred to as a test blueprint, is a table that helps
teachers align objectives, instruction and assessment.
A sample table of specification is shown in Table 1 below.
Table 1: Table of specification for a 30-item Economics test for SS2.

Content                                    Remembering  Understanding  Thinking  Total
Consumers behavior & price determination        2             4            3       9
Population                                      2             2            2       6
Money & Inflation                               1             3            2       6
Economics Systems                               1             2            2       5
Principles of Economics                         1             2            1       4
Total                                           7            13           10      30
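The row and column totals in a table of specification like Table 1 can be checked mechanically. The sketch below is illustrative only (the dictionary layout and the function names are my own, not part of the module):

```python
# Table of specification: topic -> items per cognitive level
# (Remembering, Understanding, Thinking), mirroring Table 1.
tos = {
    "Consumers behavior & price determination": [2, 4, 3],
    "Population": [2, 2, 2],
    "Money & Inflation": [1, 3, 2],
    "Economics Systems": [1, 2, 2],
    "Principles of Economics": [1, 2, 1],
}

def level_totals(table):
    """Sum the items in each cognitive-level column across topics."""
    return [sum(col) for col in zip(*table.values())]

def grand_total(table):
    """Total number of items on the test."""
    return sum(sum(row) for row in table.values())
```

Here level_totals(tos) recovers the bottom row of Table 1 (7, 13, 10), and grand_total(tos) confirms the 30 items.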
Table II: General format of a table of specification

Content   Knowledge          Understanding      Application        Total
          (No. and/or %)     (No. and/or %)     (No. and/or %)
Topic 1
Topic 2
Topic 3
Topic 4
Topic 5
Total
At the end of the lesson students should be able to:
1. Define the term consumer’s behavior i.e. demand and supply.
2. State the law of demand and supply.
3. Identify the forces of demand and supply as determinant of
price of goods and services.
Table III: Table of specification for an objective test

Sl No  Content              Recall/Knowledge   Understanding     Application     Total
1      Consumer behavior    17½% (7 items)     20½% (8 items)    7½% (3 items)   45% (18 items)
2      Price determination  12½% (5 items)     17½% (7 items)    5½% (2 items)   35% (14 items)
3      Public finance       0% (no items)      12½% (5 items)    7½% (3 items)   20% (8 items)
4      Total                30% (12 items)     50½% (20 items)   20% (8 items)   100% (40 items)
In an essay test the weighting can be achieved by assigning the
amount of time to be spent on each test item to show the
relative importance of the topics. For instance, if five essay
items are designed to test three subject topics, the
weighting can be assigned in the same proportions as the time
divisions, as can be seen in Table 4.
Topic               Importance   Item         Time
Blacksmithing       35%          Question 1   9 minutes
Missionary journey  25%          Question 2   11 minutes
Photography         40%          Question 3   16 minutes
                                 Question 4   14 minutes
                                 Question 5   10 minutes
1. Teachers are able to determine what topics are being stressed; it
also assists in the preparation of tests that reflect what students
have learnt and limits the amount of time spent on each unit.
2. No important objective or content area will be inadvertently
omitted.
3. The table of specifications can assist immensely in the
preparation of test items, in the production of valid and robust
tests, in the clarification of objectives to both teacher and
students, and in helping the teacher select the most appropriate
teaching strategy.
4. Only those aims and objectives actually involved in the
instructional process will be assessed.
5. Each objective will receive a proportional emphasis on the test
in relation to the emphasis placed on that objective by the teacher.
2.5. Test administration, marking and grading
The traditional approach to assessment of student learning is
formal testing. Still the most widely used of all methods of
assessment, testing has been the center of discussion and debate
among educators for years. The topic of testing includes a large
body of information, some of which will be discussed in the upcoming
section. Basically, testing consists of four primary steps: test
construction, test administration, scoring, and analyzing the test.
Each of these steps can result in a variety of test forms and elicit
a variety of useful outcomes, such as:
° Ideas for lesson plans
° Knowledge of individual students
° Ideas for approaching different students/classes
° Scores for admission
° Indication of teacher effectiveness
° Ways to assess student learning
° Goals for tests
° Suggestions to help students do better on exams
° Descriptions of common testing methods
° Issues to consider when assigning test grades
° Understanding the results, whether or not they agree
with expectations
° Decision-making skills based on results both expected
and unanticipated (application of theory)
° Methods of recording, presenting, and analyzing data,
observations and results (the notebook and final report)
° Performance of physical manipulations (technique)
The following is a schematic of the steps in testing that will
be covered in the rest of this section.
Constructing a test:
There are eight basic steps in constructing a test:
1. Defining the purpose. Before considering content and
procedure, the teacher must first determine who is taking the
test, why the test is being taken, and how the scores will be used.
Furthermore, the teacher should have a rationale for giving a test
at a particular point in the course: Does the test cover a
particular part of the unit content? Or should material currently
being studied be saved and tested at a later time when the entire
section is completed?
2. Listing the topics. Once the purpose and parameters have been
established, specific topics are listed and examined for their
relative importance in the section. This is called representative
sampling.
For example, if the study of crustaceans comprised approximately
10% of all class work in the section to be tested (including class time,
homework, and other assignments), then that topic should comprise
approximately 10% of the test. This can be done either by calculating
the number of questions per topic or by weighting different sections
to match class coverage (see 7. Making a Scoring Key below).
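Representative sampling, as described in step 2, amounts to a simple proportional allocation. The sketch below is a hedged illustration (the function name and the use of rounding are my own choices):

```python
def items_per_topic(topic_weights, total_items):
    """Allocate test items to topics in proportion to class coverage.

    topic_weights: dict mapping topic -> fraction of class work
    (class time, homework, other assignments), summing to 1.0.
    Rounding can leave the total a question short or over;
    adjust the final counts by hand.
    """
    return {topic: round(weight * total_items)
            for topic, weight in topic_weights.items()}
```

With crustaceans at 10% of class work on a 40-item test, items_per_topic assigns that topic 4 questions.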
3. Listing types of questions. Different types of material call for
different types of test questions. While multiple choice questions
might adequately test a student's knowledge of mathematics,
essays reveal more about a student's understanding of literature or
philosophy. Thus, in deciding what types of test questions to use
(short answer, essay, true/false, matching, multiple choice, etc.)
the following advantages and disadvantages should be kept in
mind:
Type: Short Answer
Advantages: Can test many facts in a short time; fairly easy to
score; excellent format for math; tests recall.
Disadvantages: Often ambiguous; difficult to measure complex
learning.

Type: Essay
Advantages: Can test complex learning; can evaluate thinking
process and creativity.
Disadvantages: Difficult to score objectively; uses a great deal of
testing time; subjective.

Type: True/False
Advantages: Tests the most facts in the shortest time; easy to
score; tests recognition; objective.
Disadvantages: Difficult to measure complex learning; difficult to
write reliable items; subject to guessing.

Type: Matching
Advantages: Excellent for testing associations and recognition of
facts; although terse, can test complex learning (especially
concepts); objective.
Disadvantages: Difficult to write good items; subject to process of
elimination.

Type: Multiple Choice
Advantages: Can evaluate learning at all levels of complexity; can
be highly reliable; objective; tests a fairly large knowledge base
in a short time; easy to score.
Disadvantages: Difficult to write; somewhat subject to guessing.
In choosing types of questions to be used on a test, it is also
important to consider the following points:
° Classroom conditions can automatically eliminate certain
types of questions. Since answers to multiple choice questions
can be easily copied in an overcrowded classroom, they might
not be an accurate measure of student learning. Likewise, if
blackboards are the only media available for presenting the test,
long questions and textual references might be impossible to
include on the test.
° Considerations regarding administration and scoring often
dictate the type of questions to be included on a test. Numbers
of students, time constraints, and other factors might
necessitate the use of questions which can be administered and
scored quickly and easily.
The types of knowledge being tested should be considered in the
assessment process. A simplified checklist could be used by the
teacher to determine if students have been assessed in all
relevant areas. This could take the form of a graph such as the
one which follows:
TOPICS TO BE TESTED                     FACTS  SKILLS  CONCEPTS  APPLICATION
Verbs: Conjugation of "to be"             x
Pronunciation: Short "a"                         x
Use of Modals: Should, Must, Ought to                      x
Free Expression                                                       x
4. Writing items. Once purpose, topics and types of questions have
been determined, the teacher is ready to begin writing the specific
parts, or items, of the test. Initially, more items should be written
than will be included on the test. When writing items, the following
guidelines should be followed:
° Cover important material. No item should be included on
a test unless it covers a fact, concept, skill or applied principle
that is relevant to the information covered in class (see 3. Listing
types of questions above).
° Items should be independent. The answer to one item should not be
found in another item; correctly answering one item should not be
dependent on correctly answering a previous item. (This guideline
might not apply in some cases. For example, a math test might begin
by testing simple skills and then test their integration. In all cases, the
teacher should be aware of what is being tested at each level and use
this strategy sparingly.)
° Write simply and clearly. Use only terms and examples
students will understand and eliminate all nonfunctional words.
° Be sure students know how to respond. The item should define
the task clearly enough that students who understand the
material will know what type of answer is required and how to
record their answers. For example, on essay questions, the
teacher may specify the length and scope of the answer required.
° Include questions of varying difficulty. Tests should include at
least one question that all students can answer and one that few,
if any, can answer. Tests should be designed to go from the
easiest to most difficult items so as not to immediately
discourage the weaker students.
° Be flexible. No one type of item is best for all situations or all
types of material. Whenever feasible, any test should contain
several types of items.
5. Reviewing items Regardless of how skilled the teacher is, not all
his/her first efforts will be perfect or even acceptable. It is
therefore important to review all items, revising the good and
eliminating the bad. Finally, all items should be evaluated in terms
of purpose, standardization, validity, practicality, efficiency, and
fairness (see 8. Evaluating a Test below).
6. Writing directions. Clear and concise directions should be
written for each section. Whenever possible, an example of a
correctly answered test item should be provided as a model. If
there is any question as to the clarity of the directions, the teacher
should "try them out" on someone else before giving the exam.
7. Devising a scoring key. While the test items are fresh in his/her
mind, the teacher should make a scoring key -- a list of correct
responses, acceptable variations, and weights assigned to each
response (see Scoring below). In order to assure representative
sampling, all items should be assigned values at this time. For
example, if "factoring" comprised 50% of class material to be tested
and only 25% of the total number of test questions, each question
should be assigned double value.
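The weighting rule in step 7 (a topic's share of total points should match its share of class material) reduces to a one-line calculation. A minimal sketch; the function name is mine:

```python
def item_weight(material_share, question_share):
    """Relative value per item so that a topic's share of total points
    matches its share of class material.
    E.g. 50% of material but only 25% of the questions -> each of
    that topic's items counts double."""
    return material_share / question_share
```

In the "factoring" example above, item_weight(0.50, 0.25) gives 2.0, i.e. double value per question.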
8. Evaluating a test. All methods of assessing student learning
should achieve the same thing: the clear, consistent and systematic
measurement of a behavior or something that is learned. Once a test
has been constructed, it should be reviewed to ensure that it meets
six specific criteria: clarity, consistency, validity, practicality,
efficiency, and fairness.
The following is a checklist of questions that should be
asked after the test (or any assessment activity) has been
prepared and before it is administered:
A CLEARLY DEFINED PURPOSE
  Who is being assessed?
  What material is the test (or activity) measuring?
  What kinds of knowledge or skills is the test (or activity) measuring?
  Do the tasks or test items relate to the objectives?
STANDARDIZATION OF CONTENT
  Are content, administration, and scoring consistent in all groups?
VALIDITY
  Is this test (or activity) a representative sampling of the material
  presented in this section?
  Does this test (or activity) faithfully reflect the level of difficulty
  of material covered in the class?
PRACTICALITY AND EFFICIENCY
  Will the students have enough time to finish the test (or activity)?
  Are there sufficient materials available to present the test or
  complete the activity effectively?
  What problems might arise due to structural or material difficulties
  or shortages?
FAIRNESS
  Did the teacher adequately prepare students for this activity/test?
  Were they given advance notice?
  Did they understand the testing procedure?
  How will the scores affect the students' lives?
Administering a test:
Once the items, directions, and answer key have been written, the
teacher should consider the manner in which the test will be
presented. Factors such as duplication, visual aids, and
use of the blackboard should be considered in advance to ensure
clarity in presentation as well as to avoid technical difficulties.
Establish Classroom Policy:
Because discipline is a major factor in test administration, the
teacher must establish a classroom policy concerning such matters as
tardiness, absences, make-ups, leaving the room, and cheating (see
Classroom Management). The teacher must also advise students of
procedural rules such as:
° What to do if they have any questions.
° What to do when they are finished taking the test.
° What to do if they run out of paper, need a new pen, etc.
° What to do if they run out of time.
The teacher should always be aware of the effect of testing
conditions on testing outcomes. Physical shortcomings should be
alleviated wherever possible. If some students cannot see the
blackboard, they should be allowed to move to a better location. If
students are cramped into benches, more benches should be brought
in and students should be spread out. If this is not possible, two
separate tests can be written and distributed to students on an
alternating basis.
Similarly, psychological conditions can inhibit optimal performance.
Such factors as motivation, test anxiety, temporary states (everyone
has a bad day once in a while), and long-term changes can
profoundly affect the test-taker and therefore his/her performance
on the test. It is therefore the teacher's responsibility to establish an
official, yet not oppressive, atmosphere in the testing room to
maximize student performance.
Teaching Test-Taking Techniques:
Students often fail tests not because they do not know the
material but because they do not understand the procedures
and techniques for successful test-taking. If a test is to be as
fair as possible, students must understand both test-taking
procedures and techniques. This means that the teacher should
familiarize his/her students with:
° The type of test to be given (e.g. diagnostic, proficiency,
achievement, etc.) and how to study for it.
° The types of items which will appear on the test and how to
respond to them (e.g. matching, fill in the blank, essay
questions, etc.).
° The types of directions commonly accompanying certain types
of test items.
° Strategies for successful test-taking (e.g. time management,
the process of elimination, guessing, etc.).
Grading a test:
In order to determine how well a student performed on a test
or in an activity, a specific value must be assigned to each test
item or activity component. Then, raw scores must be derived
and, if necessary, transformed to fit the requirements of
testing within specific contexts.
Obtaining Raw Scores
The first step in determining how well a student performed on
a test or in an activity is to derive a raw score, or number of
items answered correctly. Hence, if a student answers eight
out of ten items correctly, his/her raw score is eight.
Transforming Raw Scores:
While grades in many countries are determined based on 100
points, grading in countries following the French model is based
on a system of 20 points. In order to make tests match such a
predetermined number, raw scores must be transformed into
fractions, decimals, or multiples of their raw value. For
example, say the desired result is a score over 20, but a test
includes 30 questions. If all questions are of equal
importance and difficulty, they can be considered as
fractions ( 2/3 pt. each) or as decimals (.66 each). Likewise,
if a test has only 10 questions, each can be multiplied by two
to obtain a score over 20.
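The transformations just described all reduce to one proportional rescaling. A minimal sketch, assuming all items carry equal value (the function name is my own):

```python
def transform_score(raw, total_items, scale=20):
    """Rescale a raw score onto a predetermined scale such as the
    French /20: each item is worth scale / total_items points."""
    return raw * scale / total_items
```

For the 30-question test above, each item is worth 20/30 = 2/3 point, so a raw score of 24 becomes 16/20; for a 10-question test, each raw point is simply doubled.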
Cross-Cultural Considerations:
In general, grading is much harsher in many countries than in
the United States. Students rarely, if ever, achieve perfect or
even near perfect scores on tests or as a final grade. In
countries following the British model, a passing grade is
50/100 or better; in the French model, 10/20 or better. It is
therefore inappropriate, for example, to give even the best
students a grade higher than 80% (British) or 16/20 (French).
In fact, your school administration, fellow teachers, and
students will be bewildered and even angry if you deviate from
this strict rule. Remember: 50% or 10/20 reflects an adequate
performance, equivalent to the U.S. 70% or C. It is, therefore,
important when designing a test that you include items of
sufficient difficulty to reflect this grading tradition.
Weighting Test Items:
In the event that some questions are more important or more
difficult than others, they can be weighted; that is, some
questions can be considered of double value (in the example
above, 1 point each) and others of less value (1/4 point, or .25).
In other words, as long as the total value for the test equals the
predetermined number required, individual item values can be
juggled as the teacher sees fit (see table on coefficients under
Norm-Referenced Scoring below).
Deriving Percentages:
By transforming raw scores into percentages, the teacher can
compare tests of varying length and difficulty or tests of varying
amounts of points on equal terms.
If all items on a test are worth the same amount, the percentage
correct can be determined by dividing the number of correct items
by the total number of items, then multiplying by 100%:
Percent correct = (Number of items correct ) / (Total number of
items) x 100%
If the items are of different weight, the percentage correct can be
determined by dividing the number of points earned by the
maximum number of points, then multiplying by 100%:
Percent correct = (Points earned) / (Maximum number of points) x
100%
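Both formulas can be written directly in code; the sketch below is a straightforward transcription (function names are mine):

```python
def percent_correct_equal(items_correct, total_items):
    """Percent correct when all items carry the same value."""
    return items_correct / total_items * 100

def percent_correct_weighted(points_earned, max_points):
    """Percent correct when items carry different weights."""
    return points_earned / max_points * 100
```

For example, 8 of 10 equally weighted items gives 80%, and 45 of a possible 60 points gives 75%.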
Assuring Objectivity:
As with test construction, the key to successful test scoring is
objectivity. By setting certain standards and prescribing certain
rules, the teacher can be sure that scoring has been objective and
students have been treated fairly. Three techniques are
particularly helpful in assuring objectivity:
° Immediate scoring & recording
° Using a scoring key
° Having a procedure for comparing responses to the key
Perhaps it seems self-evident, but immediate scoring and
recording of scores can do much to alleviate misunderstanding
and bias. The more time that goes by between test-taking and
scoring, the greater the chances of forgetting relevant
information or losing papers altogether.
More importantly, the sooner the students get their tests back,
the more meaningful their performance on the test. It does little
good to return a test months after it has been taken when
students have to review the material tested just to remember
why they answered the way they did.
Using a scoring key can make scoring papers go quickly while
reducing the possibility of error and bias. It can also simplify and
standardize the process of scoring if numerous people will be
scoring the test. Having a procedure for comparing responses to
the key can also speed up the scoring process and increase
objectivity. For example, the teacher can:
° Scan several papers before starting scoring to get a baseline
view of the type and level of responses.
° Grade a sample of papers twice to see if he/she is, in fact,
grading consistently.
° Score papers anonymously so as not to be influenced by
students' performance in other aspects of the course (this can
be done by assigning numbers beforehand, folding the tops of
test papers back, etc.).
° Grade items one at a time -- that is, first grade all answers
to item 1, then all responses to item 2, and so on (this
technique is particularly useful with essay tests where it is
important to look for key points in each response).
Analysing test results:
Once test papers have been scored, they can then be analyzed
in numerous ways to provide the teacher with information
about student performance. For example, a student's tests
from one semester can be ranked to show relative areas of
strength and weakness; averaged class scores on a given test
can be ranked to compare one class's performance to that of
another. Such information is important for making decisions
about lesson planning and future testing as well as knowing
how to approach different students and classes.
In order to analyze anything, specific criteria must be
established. In test analysis, three different criteria are
generally used: the content of the test, the norm group taking
the test, or an individual student.
Criterion-Referenced Scoring:
Criterion-referenced scoring uses the content of the test itself as
the basis of comparison for assessing the student's level of
achievement. Thus, a content-referenced score of 80% means
that the student correctly answered 80% of the items on the test.
The most common of all methods of test analysis, content-
referenced scoring is used:
° To determine the level of achievement at which to begin a
student;
° To determine how much a student has learned from a given
section of material; and
° To determine a student's potential in a given field.
Norm-Referenced Scoring:
Sometimes referred to as "grading on a curve," norm-referenced
scoring uses the class as a whole as a referent. The class average,
or mean, usually serves as the base score against which all other
grades are judged. The mean is calculated by adding all the scores
and then dividing by the number of scores given (e.g., if the total
of test scores in a class of 25 equals 1625, the class average, or
mean, is 1625/25, or 65).
Some schools require certain percentages of passing grades per
class. If these percentages are exceeded, the teacher is seen as
"too easy"; conversely, if these percentages are not met, students
can become indignant and discipline problems can result.
In these instances, it is important to be able to adjust students'
scores so that official standards can be met. Such adjustments
can be made by:
° Adding (or subtracting) points to students' overall scores
° Adding (or subtracting) points to sections in which students
scored the highest
° Making the next test easier (or harder)
° Weighting the test lightly (or heavily) on the semester-end
grade by multiplying each test by an appropriate amount, or
coefficient. For example, in the table below, Panafricanism is
weighted three times, which gives the student an end of term
score of 84%.
TEST                  SCORE  RELATIVE WEIGHT  SCORE WITH WEIGHTING
                             (coefficient)    (coefficient)
Pre-colonial Africa    50%        1                 50
Neo-colonial Africa    85%        1                 85
Panafricanism          95%        3                285
TOTAL                             5                420    420/5 = 84%
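The coefficient-weighted end-of-term score in the table above is a weighted mean. A small sketch (the function name and the pair layout are my own):

```python
def weighted_term_score(scores_and_coefficients):
    """End-of-term score: sum of (score x coefficient) divided by
    the sum of the coefficients."""
    total = sum(score * coeff for score, coeff in scores_and_coefficients)
    weights = sum(coeff for _, coeff in scores_and_coefficients)
    return total / weights
```

Calling weighted_term_score([(50, 1), (85, 1), (95, 3)]) reproduces the 420/5 = 84% shown in the table.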
Self-Referenced Scoring:
Though it is difficult to do in large
classes, self-referenced scoring measures
an individual student's rate of progress
relative to his or her own past
performance. By comparing past test
scores, a teacher can assess a student's
rate of progress in a given subject area or
across subjects to see where he/she is in
need of help.
The advantages and disadvantages of Criterion-, Norm- and Self-
Referenced scoring are listed below:
Type of Grading: Norm-referenced
Advantages:
1. Allows for comparisons among students.
2. Classes can be compared to other classes.
3. Allows the teacher to spot students who are dropping behind
the class.
Disadvantages:
1. If the whole class does well, some students still get poor grades.
2. If the class as a whole does poorly, a good grade could be
misleading.
3. Does not allow individual progress or individual circumstances
to be considered.
4. The whole class (or large portions of it) must be evaluated in
the same way.
5. Everyone in the class (or norm group) must be evaluated with
the same instrument under the same conditions.

Type of Grading: Criterion-referenced
Advantages:
1. Helps the teacher to decide if students are ready to move on.
2. Criteria are independent of group performance.
3. Works well in a mastery-learning setting.
4. Each individual can be evaluated on different material depending
on his or her level of achievement.
Disadvantages:
1. It is difficult to develop meaningful criteria (therefore arbitrary
cut-off scores are often used).
2. Presents unique problems in computing the reliability of
criterion-referenced tests.
3. Makes it difficult to make comparisons among students.

Type of Grading: Self-referenced
Advantages:
1. Allows you to check student progress.
2. Makes it possible to compare achievement across different
subjects for the same individual.
Disadvantages:
1. All measures taken on an individual must be taken with similar
instruments under similar circumstances.
2. Does not help you to compare an individual with his or her peers.
Percentile Ranking:
Just as the raw scores for individual test items can be
transformed to fit a certain testing model (e.g. Francophone
testing: score/20), so can one set of test results be analyzed
in relation to previous tests as well as other classes'
performances. Percentile ranks offer a way to obtain an
image of class performance on a test by calculating the
percentage of persons who obtain lower scores. To obtain a
percentile rank, divide the number of students below the
passing grade by the total number of students who took the
test. For example, if 10 students out of 30 get passing scores
(50% and above), then the percentile ranking for that test
would be 66%; that is, 66% of that class scored below the
passing grade.
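The calculation just described can be sketched as follows. The function name is mine, and integer truncation is chosen to match the 66% in the worked example (20 of 30 students below passing):

```python
def percent_below_passing(scores, passing=50):
    """Percent of the class scoring below the passing grade,
    truncated to a whole number."""
    below = sum(1 for score in scores if score < passing)
    return below * 100 // len(scores)
```

With 10 of 30 students at or above 50, the remaining 20 give 20 * 100 // 30 = 66.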
Charting Student Performance:
Just as percentile ranking can give a teacher a comparative measure
of class performance, charting the results of a test can give the
teacher an internal picture of how his/her class has performed as a
whole. The graph below, for example, clearly and graphically
illustrates that the majority of the students in the class failed the
test.
[Figure: chart of student performance on a test]
To chart student performance:
1. Tally the number of students who obtain each score (e.g.
4 students at 4/20, or 20/100; 16 students at 8/20, or
40/100).
2. Plot each number on a chart as illustrated above.
3. Draw a vertical line intersecting the passing grade. (In the
French system 10/20 is passing; in the British system
50/100 is passing.)
The teacher can obtain a visual comparison of class
performance over a semester or a year by superimposing
charted results of multiple tests.
Unit Three: Item Analysis
3.1. Item difficult level
3.2. Item discrimination index
3.3. Item Banking
3.1 Item difficulty level
How difficult do you think a test should be? How do we determine
the difficulty level of test items? Why is it important to know the
difficulty level of test items?
Item difficulty index is one of the most useful, and most frequently
reported, item analysis statistics. It is a measure of the proportion
of examinees who answered the item correctly; for this reason it is
frequently called the p-value. If scores from all students in a group
are included the difficulty index is simply the total percent correct.
When there is a sufficient number of scores available (i.e., 100 or
more) difficulty indexes are calculated using scores from the top
and bottom 27 percent of the group.
Item Analysis Procedures:
1.Rank the papers in order from the highest to the lowest scores.
2.Select one-third of the papers with the highest total scores and
another one-third of the papers with lowest total scores.
3.For each test item, tabulate the number of students in the upper
and lower groups who selected each option.
4.Compute the difficulty of each item (% of students who got the
right item)
The item difficulty index can be calculated using the following formula:
P = (Successes in the HSG + Successes in the LSG) / (N in HSG + LSG)
Where, HSG = high scoring group
LSG = low scoring group
N = the total number of students in the HSG and LSG
The difficulty indexes can range between 0.0 and 1.0 and are
usually expressed as a percentage. A higher value indicates that a
greater proportion of examinees responded to the item correctly,
and it was thus an easier item. The average difficulty of a test is
the average of the individual item difficulties. For maximum
discrimination among students, an average difficulty of .60 is ideal.
For example: If 243 students answered item no. 1 correctly and 9
students answered incorrectly, the difficulty level of the item
would be 243/252 or .96.
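The p-value is a single proportion, so the example above can be checked with a one-line function (the name is my own):

```python
def difficulty_index(num_correct, num_examinees):
    """Item difficulty (p-value): proportion of examinees who
    answered the item correctly."""
    return num_correct / num_examinees
```

Rounding difficulty_index(243, 252) to two places gives the .96 of the example.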
In the example below, five true-false questions were part of a larger
test administered to a class of 20 students. For each question, the
number of students answering correctly was determined, and then
converted to the percentage of students answering correctly.
Question  Correct responses  Item difficulty
1               15           75% (15/20)
2               17           85% (17/20)
3                6           30% (6/20)
4               13           65% (13/20)
5               20           100% (20/20)
Activity: Calculate the item difficulty level for the following
four options multiple choice test item. (The sign (*) shows the
correct answer).
                   Response Options
Groups        A    B    C    D*   Total
High Scorers  0    1    1    8    10
Low Scorers   1    1    5    3    10
Total         1    2    6    11   20
3.2 Item Discrimination Index
To what extent do you think a test item should discriminate
between higher achievers and lower achievers? Should it be highly
discriminating, averagely discriminating, or less discriminating?
The index of discrimination is a numerical indicator that enables
us to determine whether the question discriminates appropriately
between lower scoring and higher scoring students.
When students who earn high scores are compared with those who
earn low scores, we would expect to find more students in the high
scoring groups answering a question correctly than students from
the low scoring group.
In the case of very difficult items which no one in either group
answered correctly, or fairly easy questions which even the
students in the low group answered correctly, the numbers of
correct answers might be equal for the two groups.
What we would not expect to find is a case in which the low
scoring students answered correctly more frequently than
students in the high group.
D = (Successes in the HSG − Successes in the LSG) / (½ × (N in HSG + N in LSG))
Where, HSG = High Scoring Group
LSG = Low Scoring Group
In the example, there are 8 students in the high scoring group
and 8 in the low scoring group (with 12 between the two groups
who are not represented). For question 1, all 8 in the high scoring
group answered correctly, while only 4 in the low scoring group
did so. Thus successes in the HSG minus successes in the LSG is
8 − 4 = +4. The last step is to divide the +4 by half of the total
number of both groups (½ × 16 = 8).
Thus 4/8 gives us +.5, which is the D value.
Question  Success in the HSG  Success in the LSG  Difference  D value
1                8                   4            8 − 4 = 4     .5
2                7                   2
3                5                   6
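The D formula and the worked example for question 1 can be sketched in code (the function name is my own):

```python
def discrimination_index(correct_hsg, correct_lsg, n_hsg, n_lsg):
    """D = (successes in the HSG - successes in the LSG) divided by
    half the combined size of the two groups."""
    return (correct_hsg - correct_lsg) / ((n_hsg + n_lsg) / 2)
```

Calling discrimination_index(8, 4, 8, 8) reproduces the +.5 found for question 1.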
Activity 2: Calculate the item discrimination index for
questions 2 & 3 in the table above.
The item discrimination index
can vary from -1.00 to +1.00. A negative discrimination index
(between -1.00 and zero) results when more students in the low
group answered correctly than students in the high group. A
discrimination index of zero means equal numbers of high and low
students answered correctly, so the item did not discriminate
between groups.
A positive index occurs when more students in the high group
answer correctly than the low group. If the students in the class
are fairly homogeneous in ability and achievement, their test
performance is also likely to be similar, resulting in little
discrimination between high and low groups.
Questions that have an item difficulty index (NOT item discrimination)
of 1.00 or 0.00 need not be included when calculating item
discrimination indices. An item difficulty of 1.00 indicates that
everyone answered correctly, while 0.00 means no one answered
correctly. We already know that neither type of item discriminates
between students.
When computing the discrimination index, the scores are divided into
three groups, with the top 27% of the scores in the upper group,
the bottom 27% in the lower group, and the middle 46% set aside.
The number of correct responses for an item by the lower
group is subtracted from the number of correct responses for
the item in the upper group. The difference is divided by the
number of students in either group. The process is repeated
for each item.
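The grouping step described above can be sketched as follows (the function name and the rounding rule for group size are illustrative assumptions, not from the text):

```python
def split_groups(scores, fraction=0.27):
    """Return (upper, lower): the top and bottom `fraction` of scores."""
    ordered = sorted(scores, reverse=True)
    n = max(1, round(len(ordered) * fraction))
    return ordered[:n], ordered[-n:]

# 10 scores -> round(10 * 0.27) = 3 students in each group
scores = [95, 88, 84, 80, 76, 70, 65, 60, 52, 40]
upper, lower = split_groups(scores)
print(upper)  # [95, 88, 84]
print(lower)  # [60, 52, 40]
```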
The value is interpreted in terms of both:
 direction (positive or negative) and
 strength (non-discriminating to strongly-
discriminating).
These values can range from -1.00 to +1.00.
Item discrimination interpretation
D-Value         Direction   Strength
> +.40          positive    strong
+.20 to +.40    positive    moderate
-.20 to +.20    none        ---
< -.20          negative    moderate to strong
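The interpretation table can be expressed as a small helper function. The thresholds come directly from the table; how the exact boundary values (±.20, +.40) are classified is an assumption, since the table leaves them ambiguous:

```python
def interpret_d(d):
    """Map a discrimination index to (direction, strength) per the table."""
    if d > 0.40:
        return ("positive", "strong")
    if d >= 0.20:                 # boundary handling assumed
        return ("positive", "moderate")
    if d > -0.20:
        return ("none", "---")
    return ("negative", "moderate to strong")

print(interpret_d(0.5))   # ('positive', 'strong')
print(interpret_d(-0.5))  # ('negative', 'moderate to strong')
```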
For a small group of students, an index of discrimination for an
item that exceeds .20 is considered satisfactory. For larger groups,
the index should be higher because more difference between groups
would be expected. The guidelines for an acceptable level of
discrimination depend upon item difficulty. For very easy or very
difficult items, low discrimination levels would be expected; most
students, regardless of ability, would get the item correct or
incorrect as the case may be. For items with a difficulty level of
about 70 percent, the discrimination should be at least .30.
When an item is discriminating negatively, overall the most
knowledgeable examinees are getting the item wrong and the least
knowledgeable examinees are getting the item right. A negative
discrimination index may indicate that the item is measuring
something other than what the rest of the test is measuring. More
often, it is a sign that the item has been mis-keyed.
3.3 Item Banking
Building a file of effective test items and assessment tasks
involves recording the items or tasks, adding information
from analyses of students' responses, and filing the records
by both the content area and the objective that the item or
task measures. Thus, items and tasks are recorded as they are
constructed; information from the analysis of students'
responses is added after the items and tasks have been used,
and then the effective items and tasks are deposited in the
file. In a few years, it is possible to start using some of
the items and tasks from the file and supplement these with
new items and tasks.
As the file grows, it becomes possible to select the majority of the
items and tasks from the file for any given test or assessment
without repeating them frequently. Such a file is especially
valuable in areas of complex achievement, where the construction
of test items and assessment tasks is difficult and time consuming.
When enough high-quality items and tasks have been assembled,
the burden of preparing tests and assessments is considerably
lightened. Computerized item banking makes the task even easier.
Summary:
In this unit you learned how to judge the quality of a classroom
test by carrying out item analysis, which is the process of "testing
the item" to ascertain whether the item is functioning properly in
measuring what the entire test is measuring.
You also learned about the process of item analysis: how to
compute item difficulty and item discriminating power, and how to
evaluate the effectiveness of distracters. You have learned that item
difficulty indicates the percentage of testees who get the item right;
item discriminating power is an index which indicates how well an
item is able to distinguish between high achievers and low
achievers given what the test is measuring; and the power of a
distracter is its ability to differentiate between those who know
and those who do not know what the item is measuring.
Finally, you learned that after conducting item analysis, items
may still be usable after modest changes are made to improve
their performance on future exams. Thus, good test items should
be kept in test item banks, and in this unit you were given
highlights on how to build a Test Item File/Item Bank.
Unit Four: Ethical Standards of Assessment
4.1 Ethical and professional standards of
assessment and its use
4.2 Race, ethnicity, gender, religion and culture
in assessment and test
4.1 Ethical Standards of assessment
Ethical standards guide teachers in fulfilling their obligation to
provide and use tests that are fair to all test takers regardless of
age, gender, disability, ethnicity, religion, linguistic background, or
other personal characteristics.
Fairness is a primary consideration in all aspects of testing:
 It helps to ensure that all test takers are given a comparable
opportunity to demonstrate what they know and how they
can perform in the area being tested.
 Implies that every test taker has the opportunity to prepare
for the test and is informed about the general nature and
content of the test.
 Also extends to the accurate reporting of individual and
group test results.
The following are some ethical standards that teachers may
consider in their assessment practices.
1. Teachers should be skilled in choosing assessment methods
appropriate for instructional decisions. Skills in choosing
appropriate, useful, administratively convenient, technically
adequate, and fair assessment methods are prerequisite to good
use of information to support instructional decisions. Teachers
need to be well-acquainted with the kinds of information
provided by a broad range of assessment alternatives and their
strengths and weaknesses. In particular, they should be familiar
with criteria for evaluating and selecting assessment methods in
light of instructional plans.
2. Teachers should develop tests that meet the intended
purpose and that are appropriate for the intended test
takers. This requires teachers to:
Define the purpose for testing, the content and skills
to be tested, and the intended test takers.
Develop tests that are appropriate with content,
skills tested, and content coverage for the intended
purpose of testing.
Develop tests that have clear, accurate, and complete
information.
Develop tests with appropriately modified forms or
administration procedures for test takers with
disabilities who need special accommodations.
3. The teacher should be skilled in administering, scoring and
interpreting the results from diverse assessment methods. It is not
enough that teachers are able to select and develop good
assessment methods; they must also be able to apply them
properly. This requires teachers to:
Follow established procedures for administering tests in a
standardized manner.
Provide and document appropriate procedures for test takers
with disabilities who need special accommodations or those
with diverse linguistic backgrounds.
Protect the security of test materials, including eliminating
opportunities for test takers to obtain scores by fraudulent
means.
Develop and implement procedures for ensuring the
confidentiality of scores
4. Teachers should be skilled in using assessment results when making
decisions about individual students, planning teaching, developing
curriculum, and school improvement.
Assessment results are used to make educational decisions at several
levels: in the classroom about students, in the community about a
school and a school district, and in society, generally, about the
purposes and outcomes of the educational enterprise. Teachers play a
vital role when participating in decision-making at each of these
levels and must be able to use assessment results effectively.
5. Teachers should be skilled in developing valid pupil grading
procedures which use pupil assessments. Grading students is an
important part of professional practice for teachers. Grading is defined
as indicating both a student's level of performance and a teacher's
valuing of that performance. The principles for using assessments to
obtain valid grades are known and teachers should employ them.
6. Teachers should be skilled in communicating assessment
results to students, parents, other lay audiences, and other
educators. Teachers must routinely report assessment results to
students and to parents or guardians. In addition, they are
frequently asked to report or to discuss assessment results with
other educators and with diverse lay audiences.
If the results are not communicated effectively, they may be
misused or not used. To communicate effectively with others on
matters of student assessment, teachers must be able to use
assessment terminology appropriately and must be able to
articulate the meaning, limitations, and implications of
assessment results.
Furthermore, teachers will sometimes be in a position that will
require them to defend their own assessment procedures and
their interpretations of them.
At other times, teachers may need to help the public to interpret
assessment results appropriately.
7. Teachers should be skilled in recognizing unethical, illegal, and
otherwise inappropriate assessment methods and uses of assessment
information.
Fairness, the rights of all concerned, and professional ethical
behavior must undergird all student assessment activities, from the
initial planning for and gathering of information to the
interpretation, use, and communication of the results.
Teachers must be well-versed in their own ethical and legal
responsibilities in assessment.
In addition, they should also attempt to have the inappropriate
assessment practices of others discontinued whenever they are
encountered.
8. Teachers should also participate with the wider educational
community in defining the limits of appropriate professional
behavior in assessment.
In addition, the following are principles of grading that can
guide the development of a grading system.
1. The system of grading should be clear and understandable (to
parents, other stakeholders, and most especially students).
2. The system of grading should be communicated to all
stakeholders (e.g., students, parents, administrators).
3. Grading should be fair for all students regardless of gender,
socioeconomic status or any other personal characteristics.
4. Grading should support, enhance, and inform the instructional
process.
4.2 Race, ethnicity, gender, religion and culture in
assessment and test
In the previous section we have learned that fairness is the
fundamental principle that has to be followed in teachers’
assessment practices.
It has been said that all students have to be provided with equal
opportunity to demonstrate the skills and knowledge being
assessed. Fairness is fundamentally a socio-cultural, rather than a
technical, issue.
Thus, in this section we are going to see how culture and
ethnicity may influence teachers' assessment practices and what
precautions we have to take in order to avoid bias and be
accommodative to students from all cultural groups.
Students represent a variety of cultural and linguistic
backgrounds. If the cultural and linguistic backgrounds are
ignored, students may become alienated or disengaged from the
learning and assessment process. Teachers need to be aware of
how such backgrounds may influence student performance and
the potential impact on learning. Teachers should be ready to
provide accommodations where needed.
Classroom assessment practices should be sensitive to the
cultural and linguistic diversity of students in order to obtain
accurate information about their learning.
Assessment practices that attend to issues of cultural diversity
include those that
 Acknowledge students' cultural backgrounds.
 Are sensitive to those aspects of an assessment that may
hamper students’ ability to demonstrate their knowledge and
understanding.
 Use that knowledge to adjust or scaffold assessment practices
if necessary.
Assessment practices that attend to issues of linguistic
diversity include those that
 Acknowledge students’ differing linguistic abilities.
 Use that knowledge to adjust or scaffold assessment
practices if necessary.
 Use assessment practices in which the language demands do
not unfairly prevent the students from understanding what is
expected of them.
Use assessment practices that allow students to accurately
demonstrate their understanding by responding in ways that
accommodate their linguistic abilities, if the response method is
not relevant to the concept being assessed (e.g., allow a student
to respond orally rather than in writing).
Disability and Assessment Practices:
It is quite obvious that our education system has been exclusionary,
failing to fully accommodate the educational needs of students with
disabilities. This has been true not only in our country but in the
rest of the world as well, although the magnitude may differ from
country to country.
It was in response to this situation that UNESCO has been
promoting the principle of inclusive education to guide the
educational policies and practice of all governments. Different world
conventions were held and documents signed towards the
implementation of inclusive education.
Our country, Ethiopia, has been a signatory of these documents and
therefore has accepted inclusive education as a basic principle to
guide its policy and practice in relation to the education of disabled
students.
UN Convention on the Rights of Persons with Disabilities (2006)
Activity: One group should work on one convention; the documents
can be found on the internet.
Inclusive education is based on the idea that all students,
including those with disabilities, should be provided with the
best possible education to develop themselves. This implies the
provision of all possible accommodations to address the
educational needs of disabled students. Accommodations
should not refer only to the teaching and learning process;
they should also cover the assessment mechanisms and
procedures.
There are different strategies that can be considered to make
assessment practices accessible to students with disabilities
depending on the type of disability. In general terms, however, the
following strategies could be considered in summative assessments:
 Modifying assessments: This should enable disabled students to
have full access to the assessment without giving them any unfair
advantage.
 Others' support: Disabled students may need the support of
others in certain assessment activities which they cannot do
independently. For instance, they may require readers and
scribes in written exams; they may also need others' assistance in
practical activities, such as using equipment, locating materials,
drawing and measuring.
 Time allowances: Disabled students should be given additional
time to complete their assessments which the individual
instructor has to decide based on the purpose and nature of the
assessment.
 Rest breaks: Some students may need rest breaks during the
examination. This may be to relieve pain or to attend to
personal needs.
 Flexible schedules: In some cases disabled students may require
flexibility in the scheduling of examinations. For example, some
students may find it difficult to manage a number of
examinations in quick succession and need to have
examinations scheduled over a period of days.
 Alternative methods of assessment: In certain situations where
formal methods of assessment may not be appropriate for
disabled students, the instructor should assess them using
non-formal methods such as class work, portfolios, oral
presentations, etc.
 Assistive Technology: Specific equipment may need to be
available to the student in an examination. Such arrangements
often include the use of personal computers, voice activated
software and screen readers.
Gender issues in assessment:
Do you feel that gender has any influence in teachers’
assessment practices? Is there any gender-related stereotype
in relation to assessment results?
Teachers’ assessment practices can also be affected by gender
stereotypes. The issues of gender bias and fairness in
assessment are concerned with differences in opportunities
for boys and girls.
A test is biased if boys and girls with the same ability levels
tend to obtain different scores.
Test questions should be checked for:
 material or references that may be offensive to members of
one gender,
 references to objects and ideas that are likely to be more
familiar to men or to women,
 unequal representation of men and women as actors in test
items or representation of members of each gender only in
stereotyped roles.
If the questions involve objects and ideas that are more
familiar or less offensive to members of one gender, then the
test may be easier for individuals of that gender.
Standards for achievement on such a test may be unfair to
individuals of the gender that is less familiar with or more
offended by the objects and ideas discussed, because it may be
more difficult for such individuals to demonstrate their abilities
or their knowledge of the material.
Summary: In this unit you have learned that ethics is a very
important issue we have to follow in our assessment practices.
And the most important ethical consideration is fairness. If we
are to draw reasonably good conclusions about what our students
have learned, it is imperative that we make our assessments—and
our uses of the results—as fair as possible for as many students as
possible. A fair assessment is one in which students are given
equitable opportunities to demonstrate their abilities and
knowledge.
Teachers must make every effort to address and minimize the
effect of bias in classroom assessment practices. Biases in
assessment can occur because of differences in culture or
ethnicity, disability as well as gender. To ensure suitability and
fairness for all students, teachers need to check the
assessment strategy for its appropriateness and if there are
cultural, disability and gender biases.
Equitable assessment means that students are assessed using
methods and procedures most appropriate to them. Classroom
assessment practices should be sensitive and diverse enough to
accommodate all types of diversity in the classroom in order to
obtain accurate information about learning.
Module 6-L A & E, Weekend.pptx

Module 6-L A & E, Weekend.pptx

  • 1.
    Module 6: LearningAssessments and Evaluation Course code: TECS 623 Credit hours: 2 Prof.Omprakash H M Department of Curriculum and Instructons College of Education and Behavioral Sciences Bule Hora University, Adola, Ethiopia
  • 3.
    Unit One: Concepts,Purposes and Principles of Assessment 1.1 Concept of Assessment and related terms(Test,Mesurement,assessment and Evaluation) 1.2 Function of Assessment and Evaluation 1.3 Principles of Assessment (Validity, Equity, reliability and explicitness) 1.4 Basic assumption in assessing students’ performance.
  • 4.
    1.1 Concept ofAssessment and related terms (Test, Mesurement, Assessment and Evaluation) Concept of Assessment: Assessment: Assessment is the process of gathering and discussing information from multiple and diverse sources in order to develop a deep understanding of what students know, understand, and can do with their knowledge as a result of their educational experiences; the process culminates when assessment results are used to improve subsequent learning. Comprehensive definition of assessment that incorporates its key elements: the planned process of gathering and synthesizing information relevant to the purposes of (a) discovering and documenting students' strengths and weaknesses, (b) planning and enhancing instruction, or (c) evaluating progress and making decisions about students.
  • 5.
    What is basicconcept of assessment? Assessment refers to the full range of information gathered and produced by teachers about their students and their classrooms (Arends, 1994) Assessment is a method for analyzing and evaluating student achievement or program success. Assessment involves the use of empirical data on student learning to refine programs and improve student learning. It is the process of defining, selecting, designing, collecting, analyzing, interpreting, and using information to increase students' learning and development.
  • 6.
    All those activitiesundertaken by teachers, and by their students in assessing themselves, which provide information to be used as feedback to modify the teaching and learning activities in which they are engaged. Basic Concepts in Testing and Assessment Test Measurement Assessment Evaluation What is a Test? Perhaps test is a concept that we are more familiar with than the other concepts. we have been taking tests ever since we have started schooling to determine our academic performance. Tests are also used in work places to select individuals for a certain job vacancy.
  • 7.
    Thus, test ineducational context is meant to the presentation of a standard set of questions to be answered by students. It is one instrument that is used for collecting information about students’ behaviors or performances. It can be noted that there are many other ways of collecting information about students’ educational performances other than tests, such as observations, assignments, project works, portfolios, etc.  Most commonly used method of making measurements in education.  An instrument or systematic procedures for measuring sample of behaviour by posing to measure any quality, ability, skill or knowledgea set of questions in a uniform manner.
  • 8.
     Designed e.g.objective, subjective, descripitive, time management, pretest-post, develop conditioning, enhance observation ( in learning) with exp.  There is always right/best and wrong answer. What is Measurement? Measurement: In our day to day life there are different things that we measure. We measure our height and put it in terms of meters and centimeters. We measure some of our daily consumptions like sugar in kilograms and liquids in liters. We measure temperature and express it in terms of degree centigrade or degree Celsius. How do we measure these things?
  • 9.
    Well definitely weneed to have appropriate instruments such as a meter, a weighing scale, or a thermometer in order to have reliable measurements. Similarly, in education measurement is the process by which the attributes of a person are measured and described in numbers. It is a quantitative description of the behavior or performance of students. As educators we frequently measure human attributes such as attitudes, academic achievement, aptitudes, interests, personality and so forth. Measurement permits more objective description concerning traits and facilitates comparisons. Hence, to measure we have to use certain instruments so that we can conclude that a certain student is better in a certain subject than another student. How do we measure performance in mathematics?
  • 10.
    We use amathematics test which is an instrument containing questions and problems to be solved by students. The number of right responses obtained is an indication of performance of individual students in mathematics. Thus, the purpose of educational measurement is to represent how much of ‘something’ is possessed by a person using numbers. Note that we are only collecting information. We are not evaluating! Evaluation is therefore quite different from measurement. Measurement is not also that same as testing. While a test is an instrument to collect information about students’ behaviors, measurement is the assignment of quantitative value to the results of a test or other assessment techniques. Measurement can refer to both the score obtained as well as the process itself.
  • 11.
     Basically assignmentof numbers.  Variety of i n s t r u m e n t s such as tests, rating scales, rubrics are used.  The process of obtaining numerical description of the degree of individual processes.  Quantifying how much learners learned. What is Assessment? Assessment: In educational literature the concepts ‘assessment’ and ‘evaluation’ have been used with some confusion. Some educators have used them interchangeably to mean the same thing. Others have used them as two different concepts. Even when they are used differently there is too much overlap in the interpretations of the two concepts.
  • 12.
    Cizek (in Phiye,1997) provides us a comprehensive definition of assessment that incorporates its key elements: The planned process of gathering and synthesizing information relevant to the purposes of: (a) Discovering and documenting students' strengths and weaknesses, (b) Planning and enhancing instruction, or (c) Evaluating progress and making decisions about students. Process by which evidence of student achievement is obtained and evaluated (marks obtained, how...checked).  Including testing, interpreting and placing information in context.
  • 13.
     Process ofgathering and organizing data – the basis for decision- making (evaluation).  Methods of measuring and evaluating the nature of the learner (what he learned/ how he learned) e.g objective /descriptive.
  • 14.
    Principles of Assessment 1.Assessment should be aimed at improving student performance. 2. Assessment should be based on an understanding of how students learn. 3. Assessment should be an integral component of course design and not something to addafterwards. 4. Good assessment provides useful information to report credibly to parents on student achievement. The Role of Assessment in Learning Assessment plays a major role in how students learn, their motivation to learn learn, and how teachers to teach. Assessment is used for various purpose:  Assessment for learning: Where assessment helps teachers gain insight.  Assessment as learning: Where students develop an awareness.
  • 15.
     Assessment oflearning: where assessment informs students, teachers and parents, as well as the broader educational community, of achievement at a certain point in time. Research and experience show that student learning is best supported when:  Instruction and assessment are based on clear learning goals.  Instruction and assessment are differentiated according to student learning needs.  Assessment information is used to make decisions that support further learning.  Parents are well informed about their child’s learning, and work with the school to help plan and provide support, e.g.weekly/monthly meeting.
  • 16.
    Advantages of Assessment Helps in knowing the position of a student when they enter a course. E.g. Adam student in Adola Teacher’s College.  It provides a large view of students’ need and assessment.  In accordance to the students’ achievement, the curriculum and teaching methods can be adjusted Disadvantages of Assessment It limits the potential of a student to a mere ‘test’.  Under supervision and pressure and supervision, creativity and performance is affected.  Though assessment aims at bringing out the latent/hidden knowledge, it often conceals it by the pressure it creates.  The parameter to judging knowledge is just a test score.
  • 18.
    Evaluation It is theprocess of obtaining, analyzing and interpreting information to determine the extent to which students achieve .e.g How many marks secure
  • 32.
    1.2 Function ofAssessment and Evaluation 1. Capturing student time and attention. 2. Generating appropriate student learning activity. 3. Providing timely feedback which students pay attention to. 4.Helping students to internalize the discipline’s standards and notions of equality. 5. Generating marks or grades which distinguish between students or enable pass/fail decisions to be made.
  • 33.
     Monitoring theprogress  Decision making  Screening  Diagnostic process  Placement of students in remedial courses  Instructional planning  Evaluation of instructional programed.  Feedback  Motivation
  • 34.
    1.3 Principles ofAssessment (Validity, Equity, reliability and explicitness) What are the basic principles of assessment? 1 Assessment should be valid. 2 Assessment should be reliable and consistent. 3 Information about assessment should be explicit, accessible and transparent. 4 Assessment should be inclusive and equitable  Validity Validity refers to the evidence base that can be provided about appropriateness of the inferences, uses, and consequences that come from assessment. Appropriateness has to do with the soundness, trustworthiness, or legitimacy of the claims or inferences that testers would like to make on the basis of obtained scores.
  • 35.
     Validity is“the extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment” (Gronlund, 1998).  Validity refers to whether the test is actually measuring what it claims to measure (Arshad, 2004). Face Validity It is pertinent that a test looks like a test even at first impression. If students taking a test do not feel that the questions given to them are not a test or part of a test, then the test may not be valid as the students may not take it seriously to attempt the questions.
  • 36.
    Construct Validity Construct isa psychological concept used in measurement. Construct validity is the most obvious reflection of whether a test measures what it is supposed to measure as it directly addresses the issue of what it is that is being measured. In other words, construct validity refers to whether the underlying theoretical constructs that the test measures are themselves valid.  Equity Principle 1 - Assessment should be inclusive and equitable/fair. As far as is possible without compromising academic standards, inclusive and equitable assessment should ensure that tasks and procedures do not disadvantage any group or individual.
  • 37.
    Principle 2 -Information about assessment should be explicit, accessible and transparent Clear, accurate, consistent and timely information on assessment tasks and procedures should be made available to students, staff and other external assessors or examiners. Principle 3 - Assessment should be inclusive and equitable As far as is possible without compromising academic standards, inclusive and equitable assessment should ensure that tasks and procedures do not disadvantage any group or individual. Principle 4 - Assessment should be an integral part of programme design and should relate directly to the programme aims and learning outcomes Assessment tasks should primarily reflect the nature of the discipline or subject but should also ensure that students have the opportunity to develop a range of generic skills and capabilities. Principle 5 - The amount of assessed work should be manageable The scheduling of assignments and the amount of assessed work required should provide a reliable and valid profile of achievement without overloading staff or students, Principle.
  • 38.
    Principle 6 -Formative and summative assessment should be included in each programme Formative and summative assessment should be incorporated into programmes to ensure that the purposes of assessment are adequately addressed. Many programmes may also wish to include diagnostic assessment. Principle 7 - Timely feedback that promotes learning and facilitates improvement should be an integral part of the assessment process Students are entitled to feedback on submitted formative assessment tasks, and on summative tasks, where appropriate. The nature, extent and timing of feedback for each assessment task should be made clear to students in advance. Principle 8 - Staff development policy and strategy should include assessment All those involved in the assessment of students must be competent to undertake their roles and responsibilities.
Reliability
According to Brown (2010), a reliable test can be described as follows:
◦ Consistent in its conditions across two or more administrations
◦ Gives clear directions for scoring/evaluation
◦ Has uniform rubrics for scoring/evaluation
◦ Lends itself to consistent application of those rubrics by the scorer
◦ Contains items/tasks that are unambiguous to the test-taker
Reliability means the degree to which an assessment tool produces stable and consistent results. Reliability essentially denotes 'consistency, stability, dependability, and accuracy of assessment results'. Because there is tremendous variability from teacher to teacher and tester to tester that affects student performance, reliability in planning, implementing, and scoring student performances gives rise to valid assessment.

Split-Half Reliability
A test is administered once to a group and, after the students have returned the test, is divided into two equal halves, which are then correlated. The halves are often determined by the number assigned to each item, with one half consisting of the odd-numbered items and the other half the even-numbered items.
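The odd/even split described above can be computed directly: sum each student's odd-numbered and even-numbered item scores, then correlate the two half-test totals. A minimal sketch (the function names and the sample score matrix are hypothetical illustrations, not from the module):

```python
# Split-half reliability: correlate odd-item totals with even-item totals.
def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """item_scores: one list of 1/0 item scores per student."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson(odd_totals, even_totals)

scores = [
    [1, 1, 1, 0, 1, 1],  # strong student
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0],  # weak student
]
print(round(split_half_reliability(scores), 2))  # prints 0.89
```

A high correlation between the two halves indicates that the items measure the trait consistently; in practice the coefficient is often further adjusted upward to estimate full-test reliability.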
Factors that affect the reliability of a test:
a. Test factor
b. Teacher and student factor
c. Environment factor
d. Test administration factor
e. Marking factor

a. Test Factor
In general, longer tests produce higher reliabilities: because scores depend less on coincidence and guessing, they will be more accurate if the test is longer. An objective test has higher consistency because it is not exposed to a variety of interpretations.
b. Teacher and Student Factor
For most tests, it is normal for teachers to construct and administer tests for their own students, so any good teacher-student relationship helps increase the consistency of the results. Other factors that contribute positively to the reliability of a test include the teacher's encouragement, positive mental and physical condition, familiarity with the test formats, and perseverance and motivation.

c. Environment Factor
An examination environment certainly influences test-takers and their scores. A favourable environment with comfortable chairs and desks, good ventilation, and sufficient light and space will improve the reliability of the test. On the contrary, a non-conducive environment will affect test-takers' performance and test reliability.
d. Test Administration Factor
Because students' grades depend on the way tests are administered, test administrators should strive to provide clear and accurate instructions, sufficient time, and careful monitoring of tests to improve their reliability. A test-retest technique can be used to determine test reliability.
e. Marking Factor
Human judges have many opportunities to introduce error when scoring essays. It is also common for different markers to award different marks for the same answer, even with a prepared mark scheme, and a marker's assessment may vary from time to time and between situations. This does not happen with objective types of tests, since the responses are fixed. Thus, objectivity is a condition for reliability.
Explicitness
Principle - Information about assessment should be explicit, accessible and transparent. Clear, accurate, consistent and timely information on assessment tasks and procedures should be made available to students, staff and other external assessors or examiners.
1.4 Basic assumptions in assessing students' performance
1. The quality of student learning is directly related to the quality of teaching.
2. The first step in getting useful feedback about course goals is to make those goals explicit.
3. Students need focused feedback early and often, and they should be taught how to assess their own learning.
4. The most effective assessment addresses problem-directed questions that faculty ask themselves.
5. Course assessment is an intellectual challenge and is therefore motivating for faculty.
6. Assessment does not require special training.
7. Collaboration with colleagues and students improves learning and is satisfying.
Reflection: When planning to assess students, what assumptions does one hold in mind? What should be kept in mind when preparing tools for assessing students?

Angelo and Cross (1993) list seven basic assumptions of classroom assessment, described as follows:
The quality of student learning is directly, although not exclusively, related to the quality of teaching. Therefore, one of the most promising ways to improve learning is to improve teaching. If assessment is to improve the quality of student learning, both teachers and students must become personally invested and actively involved in the process.
Reflection: What should the roles of students and teachers in classroom assessment be, so that it helps students' learning?
To improve their effectiveness, teachers need first to make their goals and objectives explicit and then to get specific, comprehensible feedback on the extent to which they are achieving those goals and objectives. Effective assessment begins with clear goals: before teachers can assess how well their students are learning, they must identify and clarify what they are trying to teach. After teachers have identified the specific teaching goals they wish to assess, they can better determine what kind of feedback to collect. To improve their learning, students need to receive appropriate and focused feedback early and often; they also need to learn how to assess their own learning.
Reflection: How do you think feedback and self-assessment help to improve students' learning?
The type of assessment most likely to improve teaching and learning is that conducted by teachers to answer questions they themselves have formulated in response to issues or problems in their own teaching. To best understand their students' learning, teachers need specific and timely information about the particular individuals in their classes. Because students' needs differ, there is often a gap between assessment and student learning; one goal of classroom assessment is to reduce this gap.
Reflection: How does classroom assessment help to reduce the gap between assessment and student learning?
Systematic inquiry and intellectual challenge are powerful sources of motivation, growth, and renewal for teachers, and classroom assessment can provide such a challenge. Classroom assessment is an effort to encourage and assist those teachers who wish to become more knowledgeable, involved, and successful. Classroom assessment does not require specialized training; it can be carried out by dedicated teachers from all disciplines. To succeed in classroom assessment, teachers need only a detailed knowledge of their discipline, dedication to teaching, and the motivation to improve.
By collaborating with colleagues and actively involving students in classroom assessment efforts, teachers (and students) enhance learning and personal satisfaction. By working together, all parties achieve results of greater value than those they can achieve by working separately.
Reflection: Can you explain how teachers' collaboration with colleagues can be more effective in enhancing learning and personal satisfaction than working alone?
Unit Two: Assessment Types, Methods and Tools
2.1. Assessment Types
2.2. Assessment Methods
2.3. Assumptions in selecting assessment methods
2.4. Table of specification and construction of items
2.5. Test administration, marking and grading
Advantages of Formative Assessment
• Develops knowledge
• Supports continuous improvement
• Provides quick feedback
• Achieves successful outcomes
• Enables regular communication with parents
Disadvantages of Formative Assessment
• Time-consuming and resource-intensive
• A tiring process
• Requires trained and qualified professionals
• Presents its own challenges
Advantages of Summative Assessment
• Shows whether students have understood
• Determines achievement
• Creates academic records
• Boosts individuals
• Identifies weak areas
• Measures training success
Disadvantages of Summative Evaluation
• Can demotivate individuals
• Rectification comes late
• It is disruptive
• Offers no remedy to identify challenges in advance
• Not an accurate reflection of learning
• Can affect students negatively
Assessment as Procedure
This paradigm has elements of both the measurement paradigm and the inquiry paradigm.
• The primary focus is on assessment procedures, not on the underlying purposes of the assessment program.
• Knowledge is believed to exist separately from the learners; it can be transmitted to the students and eventually objectively measured.
Assessment as Inquiry
• Under this paradigm, assessment is based on constructivist theories of knowledge, student-centered learning and the inquiry process.
• Teachers use various qualitative and quantitative techniques to inquire about particular learners.
• It is a process of inquiry and interpretation, used to promote reflection concerning students' understandings, attitudes, and literate abilities.
• Assessment, in this paradigm, is viewed as a social, context-specific, interpretive activity.
Assessment as Measurement
• The primary instrument of this paradigm is the large-scale, norm-referenced standardized test.
• Objectivity, standardization and reliability are the main concerns.
• Knowledge is considered to exist separately from the learners; the learners work to acquire it, not construct it.
• Decisions about the information to be collected and the means of evaluation are made by authorities outside the classroom.
Conclusion:
• Therefore, it may be concluded that although the leanings are towards the formative evaluation system, it is not without the need for modification.
• A balance is required not only in formative evaluation but in any form of evaluation system.
• In the summative system, too much leniency risks losing the students' attention, while in formative evaluation too much stress can endanger the clarity of concepts in the students.
• A midway has to be found between the two extremes, and a modified, flexible and accommodating system of evaluation, along the lines of the formative system, should be adopted, so that faculty and students can keep track of the students' progress while also having the chance to improve, develop and grow.
2.2. Assessment Methods
Methods of assessment will vary depending on the learning outcome(s) to be measured. Direct methods are those in which students demonstrate that they have achieved a learning outcome or objective. Indirect methods are those in which students (or others) report perceptions of how well students have achieved an objective or outcome.
A. Direct Assessment Methods: Direct assessment involves looking at actual samples of student work produced in our programs: observations of field work, internship performance, service learning and clinical experiences; grades based on explicit criteria related to clear learning goals; tests of writing, critical thinking, or general knowledge; and performance on achievement tests.
B. Indirect Assessment Methods
Indirect methods are those in which students (or others) report perceptions of how well students have achieved an objective or outcome. Indirect assessment methods require that faculty infer actual student abilities, knowledge, and values rather than observe direct evidence. Among indirect methods are surveys, exit interviews, focus groups, and the use of external reviewers.
• Surveys: Surveys usually are given to large numbers of possible respondents, usually in writing, and often at a distance.
• Exit interviews and focus groups: Exit interviews and focus groups allow faculty to ask specific questions of students face to face.
• External reviewers: External reviewers are usually representatives of the discipline and usually are guided by discipline-based standards.
Advantages:
• Indirect methods are easy to administer;
• Indirect methods may be designed to facilitate statistical analyses;
• Indirect methods may provide clues about what could be assessed directly;
• Indirect methods can flesh out areas that direct assessments cannot capture;
• Indirect methods are particularly useful for ascertaining values and beliefs;
• Surveys can be given to many respondents at a time;
• Surveys are useful for gathering information from alumni, employers, and graduate program representatives;
• Exit interviews and focus groups allow faculty to question students face to face;
• External reviewers can bring a degree of objectivity to the assessment;
• External reviewers can be guided either by questions that the department wants answered or by discipline-based national standards.
Disadvantages:
• Indirect methods provide only impressions and opinions, not hard evidence;
• Impressions and opinions may change over time and with additional experience;
• Respondents may tell you what they think you want to hear;
• The number of surveys returned is usually low, with 33 percent considered a good return rate;
• You cannot assume that those who did not respond would have responded in the same way as those who did;
• Exit interviews take time to carry out;
• Focus groups usually involve a limited number of respondents;
• Unless the faculty agree on the questions asked in exit interviews and focus groups, there may not be consistency in the responses.
2.3 Assumptions in selecting assessment methods
Choosing appropriate assessments:
1. Vary assessments.
2. Consider intervals for assessment.
3. Match learning goals to assessments.
4. Use direct and indirect assessment.
5. Collect data on student performance.
6. Revise assessment choices.
7. Assessment primer.
8. Creating assignments and exams.
Assessment methods are selected based on the level and content of your learning objectives. Certain assessment methods are better suited to different types of knowledge, skills, or attitudes.
1. Vary assessments
Student learning styles vary widely, and their strengths and challenges with respect to assessment vary as well. Instructors need to consider that variation as they choose assessments for their courses. By varying the way we assess student understanding, we are more likely to offer opportunities for every student to demonstrate their knowledge. This can be accomplished by creating courses with three or more forms of assessment, for example papers, class projects and exams. It can also be accomplished by offering choices of how to be assessed, for example giving students the option of writing a paper or taking an exam for a unit of instruction, as long as by the end of the course they have done both forms of assessment. It might also be accomplished by offering multiple questions and having students choose which to answer. New faculty members should think creatively about how best to elicit quality student responses.
2. Consider intervals for assessment
The frequency of assessment varies widely from course to course. Some classes assess only twice, on a midterm and a final; others have weekly assignments, presentations and homework. Think about the frequency with which your students should be assessed, based on the knowledge that assessment drives learning by focusing student attention, energy, and motivation to learn. New faculty members need to try various intervals and choose those that best support their students' learning.
3. Match learning goals to assessments
What we assess is what our students study, engage with, and explore in more depth. By beginning with what we want students to know and be able to do, we can design and choose assessments that demonstrate the knowledge and skills we are aiming for them to learn. After choosing student learning outcomes, make a grid that places the learning outcomes along one axis and the assessments that demonstrate achievement of those outcomes along the other. In this way new faculty members can double-check that each of the student learning outcomes has been assessed. If we make clear to students how each assessment furthers the goals of the course, they are able to make informed choices about how to spend their limited study time to achieve those goals.
4. Direct and indirect assessment
Assessment strategies are typically classified as direct, where actual student behavior is measured or assessed, or indirect, including surveys, focus groups, and similar activities that gather impressions or opinions about a program or its learning goals. If student assessment is embedded in a course, meaning it affects a course grade, it is typically taken more seriously.
5. Collect data on student performance
In spite of our best efforts at choosing appropriate forms of assessment, and the intervals that best support student learning, there will be some topics or units of instruction where students come up short. If we collect data on these issues (which test questions are commonly missed, which paper topics are commonly derailed, what misconceptions some students are taking away), we can identify weaknesses in instruction and assessment choices and make adjustments as needed.
6. Revise assessment choices
After analyzing student achievement systematically, we should begin to see gaps in our teaching or in the effectiveness of our assessments at measuring student understanding. This is the time to modify our assessments, and the instruction leading up to them, to better support student learning. Accomplished faculty members continually revise the ways they assess student knowledge and skills to close the learning gap; the more students we can move toward a deep understanding of the course topics, the more effective we are as instructors. The best time to make these revisions is right after an assessment is evaluated and the results analyzed, while the understanding of weaknesses is fresh in our minds. Throughout this revision process it is important to maintain high expectations about what students should know and be able to do.
7. Assessment primer
Assessment Primer: Describing Your Community, Collecting Data, Analyzing the Issues and Establishing a Road Map for Change.
8. Creating assignments and exams
Set the grading categories, but ask the students to help write the descriptions. Draft the complete grading scale, then give it to students for review and suggestions. Determining goals for the assignment and its essential logistics is a good start to creating an effective assignment.
2.4. Table of specification and construction of items
What is a table of specifications in test construction? The table of specifications (TOS) is a tool used to ensure that a test or assessment measures the content and thinking skills that it intends to measure; that is, a TOS helps test constructors focus on response content, ensuring that the test or assessment measures what it intends to measure. A table of specification enumerates the information and cognitive tasks on which examinees are to be assessed: it defines as clearly as possible the scope and emphasis of the test and relates the objectives to the content in order to ensure a balanced set of test items. The table of specification, sometimes referred to as a test blueprint, is a table that helps teachers align objectives, instruction and assessment.
A sample table of specification is shown in Table 1 below.

Table 1: Table of specification for a 30-item Economics test for SS2.

Content                                    Remembering  Understanding  Thinking  Total
Consumer behaviour & price determination        2             4            3       9
Population                                      2             2            2       6
Money & inflation                               1             3            2       6
Economic systems                                1             2            2       5
Principles of economics                         1             2            1       4
Total                                           7            13           10      30
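The row and column totals in Table 1 can be checked (or generated) mechanically: each topic's items are spread across the cognitive levels, and the rows and columns must sum to the test length. A minimal sketch using the Table 1 numbers (the dictionary layout is a hypothetical illustration, not from the module):

```python
# Table of specification for the 30-item Economics test in Table 1.
tos = {
    "Consumer behaviour & price determination": {"Remembering": 2, "Understanding": 4, "Thinking": 3},
    "Population":               {"Remembering": 2, "Understanding": 2, "Thinking": 2},
    "Money & inflation":        {"Remembering": 1, "Understanding": 3, "Thinking": 2},
    "Economic systems":         {"Remembering": 1, "Understanding": 2, "Thinking": 2},
    "Principles of economics":  {"Remembering": 1, "Understanding": 2, "Thinking": 1},
}

# Row totals: items per topic.
row_totals = {topic: sum(cells.values()) for topic, cells in tos.items()}

# Column totals: items per cognitive level.
col_totals = {}
for cells in tos.values():
    for level, n in cells.items():
        col_totals[level] = col_totals.get(level, 0) + n

print(row_totals)                # topic totals: 9, 6, 6, 5, 4
print(col_totals)                # level totals: 7, 13, 10
print(sum(row_totals.values()))  # prints 30, the test length
```

Running such a check before writing items guards against the imbalances the TOS is meant to prevent (a topic or cognitive level silently over- or under-represented).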
General format of a table of specification (Table II):

Content   Knowledge       Understanding   Application     Total
          (no. and/or %)  (no. and/or %)  (no. and/or %)
Topic 1
Topic 2
Topic 3
Topic 4
Topic 5
Total
At the end of the lesson students should be able to:
1. Define the term consumer behaviour, i.e. demand and supply.
2. State the law of demand and supply.
3. Identify the forces of demand and supply as determinants of the price of goods and services.

Table III: Table of specification for an objective test

No.  Content              Recall/Knowledge   Understanding     Application      Total
1    Consumer behaviour   17.5% (7 items)    20% (8 items)     7.5% (3 items)   45% (18)
2    Price determination  12.5% (5 items)    17.5% (7 items)   5% (2 items)     35% (14)
3    Public finance       0% (no items)      12.5% (5 items)   7.5% (3 items)   20% (8)
     Total                30% (12 items)     50% (20 items)    20% (8 items)    100% (40)
In an essay test, weighting can be achieved by assigning the amount of time to be spent on each test item to reflect the relative importance of the topics. For instance, if five essay items are designed to test three subject topics, the weighting can be assigned in the same proportions as the time divisions, as in Table 4:

Topic               Importance   Item         Time
Blacksmithing       35%          Question 1   9 minutes
Missionary journey  25%          Question 2   11 minutes
Photography         40%          Question 3   16 minutes
                                 Question 4   14 minutes
                                 Question 5   10 minutes
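The proportional-time idea above is simple arithmetic: each topic's share of the total testing time equals its importance weight. A small sketch (the 60-minute total is an assumption, and the topic names approximate those in Table 4):

```python
# Allocate essay-test time in proportion to topic importance.
total_minutes = 60  # assumed overall length of the essay test
weights = {
    "Blacksmithing": 0.35,
    "Missionary journey": 0.25,
    "Photography": 0.40,
}

for topic, w in weights.items():
    # Each topic gets its importance share of the total time.
    print(topic, round(w * total_minutes), "minutes")
# Blacksmithing gets 21 minutes, Missionary journey 15, Photography 24
```

The per-question times in Table 4 then subdivide each topic's allocation among the questions that test it.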
Uses of the table of specifications:
1. Teachers are able to determine which topics are being stressed; it also assists in the preparation of tests that reflect what students have learnt and limits the amount of time spent on each unit.
2. No important objective or content area will be inadvertently omitted.
3. The table of specifications can assist immensely in the preparation of test items, in the production of valid and robust tests, in the clarification of objectives for both teacher and students, and in helping the teacher select the most appropriate teaching strategy.
4. Only those aims and objectives actually involved in the instructional process will be assessed, and each objective will receive emphasis on the test in proportion to the emphasis placed on that objective by the teacher.
2.5. Test administration, marking and grading
The traditional approach to assessment of student learning is formal testing. Still the most widely used of all methods of assessment, testing has been the center of discussion and debate among educators for years. The topic of testing includes a large body of information, some of which is discussed in the upcoming section. Basically, testing consists of four primary steps: test construction, test administration, scoring, and analyzing the test. Each of these steps can take a variety of forms and elicit a variety of useful outcomes, such as:
° Ideas for lesson plans
° Knowledge of individual students
° Ideas for approaching different students/classes
° Scores for admission
° Indications of teacher effectiveness
Ways to assess student learning:
• Goals for tests
• Suggestions to help students do better on exams
• Descriptions of common testing methods
• Issues to consider when assigning test grades
• Understanding the results, whether or not they agree with expectations
• Decision-making skills based on results both expected and unanticipated (application of theory)
• Methods of recording, presenting, and analyzing data, observations and results (the notebook and final report)
• Performance of physical manipulations (technique)
The rest of this section covers the steps in testing in turn, from construction through administration.
Constructing a test: There are eight basic steps in constructing a test.
1. Defining the purpose. Before considering content and procedure, the teacher must first determine who is taking the test, why the test is being taken, and how the scores will be used. Furthermore, the teacher should have a rationale for giving a test at a particular point in the course: does the test cover a particular part of the unit content, or should the material currently being studied be saved and tested later, when the entire section is completed?
2. Listing the topics. Once the purpose and parameters have been established, specific topics are listed and examined for their relative importance in the section. This is called representative sampling.
For example, if the study of crustaceans comprised approximately 10% of all class work in the section to be tested (including class time, homework, and other assignments), then that topic should comprise approximately 10% of the test. This can be done either by calculating the number of questions per topic or by weighting different sections to match class coverage (see 7. Devising a scoring key below).
3. Listing types of questions. Different types of material call for different types of test questions. While multiple choice questions might adequately test a student's knowledge of mathematics, essays reveal more about a student's understanding of literature or philosophy. Thus, in deciding what types of test questions to use (short answer, essay, true/false, matching, multiple choice, etc.), the following advantages and disadvantages should be kept in mind:
Type             Advantages                                          Disadvantages
Short answer     Can test many facts in a short time;                Often ambiguous;
                 fairly easy to score;                               difficult to measure complex learning;
                 excellent format for math                           tests recall
Essay            Can test complex learning;                          Difficult to score objectively;
                 can evaluate thinking process and creativity        uses a great deal of testing time;
                                                                     subjective
True/false       Tests the most facts in the shortest time;          Difficult to measure complex learning;
                 easy to score; objective                            tests recognition; difficult to write
                                                                     reliable items; subject to guessing
Matching         Excellent for testing associations and              Difficult to write good items;
                 recognition of facts; although terse, can test      subject to process of elimination
                 complex learning (especially concepts); objective
Multiple choice  Can evaluate learning at all levels of complexity;  Difficult to write;
                 can be highly reliable; objective; tests a fairly   somewhat subject to guessing
                 large knowledge base in a short time; easy to score
In choosing the types of questions to be used on a test, it is also important to consider the following points:
° Classroom conditions can automatically eliminate certain types of questions. Since answers to multiple choice questions can easily be copied in an overcrowded classroom, they might not be an accurate measure of student learning. Likewise, if blackboards are the only medium available for presenting the test, long questions and textual references might be impossible to include.
° Considerations regarding administration and scoring often dictate the type of questions to be included on a test. Numbers of students, time constraints, and other factors might necessitate the use of questions which can be administered and scored quickly and easily.
The types of knowledge being tested should be considered in the assessment process. A simplified checklist could be used by the teacher to determine whether students have been assessed in all relevant areas. This could take the form of a grid such as the following:

Topics to be tested                     Facts  Skills  Concepts  Application
Verbs: conjugation of "to be"             x
Pronunciation: short "a"                          x
Use of modals: should, must, ought to                     x
Free expression                                                       x
4. Writing items. Once the purpose, topics and types of questions have been determined, the teacher is ready to begin writing the specific parts, or items, of the test. Initially, more items should be written than will be included on the test. When writing items, the following guidelines are followed:
° Cover important material. No item should be included on a test unless it covers a fact, concept, skill or applied principle that is relevant to the information covered in class (see 3. Listing types of questions above).
° Items should be independent. The answer to one item should not be found in another item, and correctly answering one item should not depend on correctly answering a previous item. (This guideline might not apply in some cases. For example, a math test might begin by testing simple skills and then test their integration. In all cases, the teacher should be aware of what is being tested at each level and use this strategy sparingly.)
° Write simply and clearly. Use only terms and examples students will understand, and eliminate all nonfunctional words.
° Be sure students know how to respond. The item should define the task clearly enough that students who understand the material will know what type of answer is required and how to record their answers. For example, on essay questions the teacher may specify the length and scope of the answer required.
° Include questions of varying difficulty. Tests should include at least one question that all students can answer and one that few, if any, can answer. Tests should be designed to go from the easiest to the most difficult items so as not to immediately discourage the weaker students.
° Be flexible. No one type of item is best for all situations or all types of material. Whenever feasible, any test should contain several types of items.
5. Reviewing items. Regardless of how skilled the teacher is, not all of his/her first efforts will be perfect or even acceptable. It is therefore important to review all items, revising the good and eliminating the bad. Finally, all items should be evaluated in terms of purpose, standardization, validity, practicality, efficiency, and fairness (see 8. Evaluating a test below).
6. Writing directions. Clear and concise directions should be written for each section. Whenever possible, an example of a correctly answered test item should be provided as a model. If there is any question as to the clarity of the directions, the teacher should try them out on someone else before giving the exam.
7. Devising a scoring key. While the test items are fresh in his/her mind, the teacher should make a scoring key: a list of correct responses, acceptable variations, and the weights assigned to each response (see Scoring below). In order to assure representative sampling, all items should be assigned values at this time. For example, if "factoring" comprised 50% of the class material to be tested but only 25% of the total number of test questions, each factoring question should be assigned double value.
8. Evaluating a test. All methods of assessing student learning should achieve the same thing: the clear, consistent and systematic measurement of a behavior or something that is learned. Once a test has been constructed, it should be reviewed to ensure that it meets six specific criteria: clarity, consistency, validity, practicality, efficiency, and fairness.
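The "factoring" example in step 7 can be stated as a general weighting rule: each item's value is the topic's share of class content divided by its share of test items, so that the topic's total marks match its coverage. A sketch with the numbers from the example (the helper name and the 20-item test length are hypothetical):

```python
# Weight items so each topic's total marks match its share of class coverage.
def item_weight(content_share, n_topic_items, n_total_items):
    """Value per item = marks the topic should carry, spread over its items
    (one mark per item is the baseline)."""
    return content_share * n_total_items / n_topic_items

# "Factoring": 50% of class material, but only 25% of a 20-item test (5 items).
w = item_weight(0.50, 5, 20)
print(w)  # prints 2.0: each factoring question counts double
```

A topic whose item share already equals its content share gets a weight of 1.0, i.e. no adjustment.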
The following is a checklist of questions that should be asked after the test (or any assessment activity) has been prepared and before it is administered:
A clearly defined purpose: Who is being assessed? What material is the test (or activity) measuring? What kinds of knowledge or skills is the test (or activity) measuring? Do the tasks or test items relate to the objectives?
Standardization of content: Are content, administration, and scoring consistent across all groups?
Validity: Is this test (or activity) a representative sampling of the material presented in this section? Does it faithfully reflect the level of difficulty of the material covered in class?
Practicality and efficiency: Will the students have enough time to finish the test (or activity)? Are sufficient materials available to present the test or complete the activity effectively? What problems might arise due to structural or material difficulties or shortages?
Fairness: Did the teacher adequately prepare students for this activity/test? Were they given advance notice? Did they understand the testing procedure? How will the scores affect the students' lives?
  • 115.
Administering a test: Once the items, directions, and answer key have been written, the teacher should consider in advance the manner in which the test will be presented. Factors such as duplication, visual aids, and use of the blackboard should be considered ahead of time to ensure clarity in presentation as well as to avoid technical difficulties.
Establish Classroom Policy: Because discipline is a major factor in test administration, the teacher must establish a classroom policy concerning such matters as tardiness, absences, make-ups, leaving the room, and cheating (see Classroom Management). The teacher must also advise students of procedural rules such as:
° What to do if they have any questions.
° What to do when they are finished taking the test.
° What to do if they run out of paper, need a new pen, etc.
° What to do if they run out of time.
The teacher should always be aware of the effect of testing conditions on testing outcomes. Physical shortcomings should be alleviated wherever possible. If some students cannot see the blackboard, they should be allowed to move to a better location. If students are cramped into benches, more benches should be brought in and students should be spread out. If this is not possible, two separate versions of the test can be written and distributed to students on an alternating basis. Similarly, psychological conditions can inhibit optimal performance. Such factors as motivation, test anxiety, temporary states (everyone has a bad day once in a while), and long-term changes can profoundly affect the test-taker and therefore his/her performance on the test. It is therefore the teacher's responsibility to establish an official, yet not oppressive, atmosphere in the testing room to maximize student performance.
Teaching Test-Taking Techniques: Students often fail tests not because they do not know the material but because they do not understand the procedures and techniques for successful test-taking. If a test is to be as fair as possible, students must understand both test-taking procedures and techniques. This means that the teacher should familiarize his/her students with:
° The type of test to be given (e.g. diagnostic, proficiency, achievement, etc.) and how to study for it.
° The types of items which will appear on the test and how to respond to them (e.g. matching, fill in the blank, essay questions, etc.).
° The types of directions commonly accompanying certain types of test items.
° Strategies for successful test-taking (e.g. time management, the process of elimination, guessing, etc.).
Grading a test: In order to determine how well a student performed on a test or in an activity, a specific value must be assigned to each test item or activity component. Then, raw scores must be derived and, if necessary, transformed to fit the requirements of testing within specific contexts.
Obtaining Raw Scores: The first step in determining how well a student performed on a test or in an activity is to derive a raw score, or the number of items answered correctly. Hence, if a student answers eight out of ten items correctly, his/her raw score is eight.
Transforming Raw Scores: While grades in many systems are determined on a base of 100 points, grading in countries following the French model is based on a system of 20 points. In order to make tests match such a predetermined number, raw scores must be transformed into fractions, decimals, or multiples of their raw value. For example, say the desired result is a score out of 20, but the test includes 30 questions. If all questions are of equal importance and difficulty, they can be counted as fractions (2/3 pt. each) or as decimals (.66 each). Likewise, if a test has only 10 questions, each can be multiplied by two to obtain a score out of 20.
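The scaling just described can be sketched as a small helper; the function name and the sample numbers are illustrative, not part of the module.

```python
def scale_score(raw, total_items, scale=20):
    """Transform a raw score onto a predetermined scale (e.g. a score out of 20)."""
    return raw * scale / total_items

# 30 equally weighted questions scaled to 20 points: each is worth 2/3 pt.
print(scale_score(15, 30))  # half the items correct -> 10.0
# A 10-question test: each item is effectively multiplied by two.
print(scale_score(7, 10))   # -> 14.0
```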
Cross-Cultural Considerations: In general, grading is much harsher in many countries than in the United States. Students rarely, if ever, achieve perfect or even near-perfect scores on tests or as a final grade. In countries following the British model, a passing grade is 50/100 or better; in the French model, 10/20 or better. It is therefore inappropriate, for example, to give even the best students a grade higher than 80% (British) or 16/20 (French). In fact, your school administration, fellow teachers, and students will be bewildered and even angry if you deviate from this strict rule. Remember: 50% or 10/20 reflects an adequate performance, equivalent to the U.S. 70% or C. It is therefore important when designing a test that you include items of sufficient difficulty to reflect this grading tradition.
Weighting Test Items: In the event that some questions are more important or more difficult than others, they can be weighted; that is, some questions can be counted at double value and others at less value (e.g. 1/4 point, or .25). In other words, as long as the total value for the test equals the predetermined number required, individual item values can be adjusted as the teacher sees fit (see the table on coefficients under Norm-Referenced Scoring below).
Deriving Percentages: By transforming raw scores into percentages, the teacher can compare tests of varying length, difficulty, or point totals on equal terms. If all items on a test are worth the same amount, the percentage correct can be determined by dividing the number of correct items by the total number of items, then multiplying by 100%:
Percent correct = (Number of items correct) / (Total number of items) x 100%
If the items are of different weight, the percentage correct can be determined by dividing the number of points earned by the maximum number of points, then multiplying by 100%:
Percent correct = (Points earned) / (Maximum number of points) x 100%
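Both formulas above reduce to the same one-line computation; the function name is mine, not the module's.

```python
def percent_correct(earned, maximum):
    """Percent correct = (items or points earned) / (maximum possible) x 100."""
    return earned / maximum * 100

print(percent_correct(8, 10))   # 8 of 10 equal-weight items -> 80.0
print(percent_correct(42, 60))  # 42 of 60 possible points  -> 70.0
```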
Assuring Objectivity: As with test construction, the key to successful test scoring is objectivity. By setting certain standards and prescribing certain rules, the teacher can be sure that scoring has been objective and students have been treated fairly. Three techniques are particularly helpful in assuring objectivity:
° Immediate scoring and recording
° Using a scoring key
° Having a procedure for comparing responses to the key
Perhaps it seems self-evident, but immediate scoring and recording of scores can do much to alleviate misunderstanding and bias. The more time that goes by between test-taking and scoring, the greater the chances of forgetting relevant information or losing papers altogether.
More importantly, the sooner the students get their tests back, the more meaningful their performance on the test remains to them. It does little good to return a test months after it has been taken, when students have to review the material tested just to remember why they answered the way they did. Using a scoring key can make scoring papers go quickly while reducing the possibility of error and bias. It can also simplify and standardize the process of scoring if numerous people will be scoring the test. Having a procedure for comparing responses to the key can also speed up the scoring process and increase objectivity. For example, the teacher can:
° Scan several papers before starting to score to get a baseline view of the type and level of responses.
° Grade a sample of papers twice to see if he/she is, in fact, grading consistently.
° Score papers anonymously so as not to be influenced by students' performance in other aspects of the course (this can be done by assigning numbers beforehand, folding the tops of test papers back, etc.).
° Grade items one at a time -- that is, first grade all answers to item 1, then all responses to item 2, and so on (this technique is particularly useful with essay tests, where it is important to look for key points in each response).
Analysing test results: Once test papers have been scored, they can be analyzed in numerous ways to provide the teacher with information about student performance. For example, a student's tests from one semester can be ranked to show relative areas of strength and weakness; averaged class scores on a given test can be ranked to compare one class's performance to that of another. Such information is important for making decisions about lesson planning and future testing, as well as for knowing how to approach different students and classes. In order to analyze anything, specific criteria must be established. In test analysis, three different criteria are generally used: the content of the test, the norm group taking the test, or an individual student.
Criterion-Referenced Scoring: Criterion-referenced scoring uses the content of the test itself as the basis of comparison for assessing the student's level of achievement. Thus, a criterion-referenced score of 80% means that the student correctly answered 80% of the items on the test. The most common of all methods of test analysis, criterion-referenced scoring is used:
° To determine the level of achievement at which to begin a student;
° To determine how much a student has learned from a given section of material; and
° To determine a student's potential in a given field.
Norm-Referenced Scoring: Sometimes referred to as "grading on a curve," norm-referenced scoring uses the class as a whole as a referent. The class average, or mean, usually serves as the base score against which all other grades are judged. The mean is calculated by adding all the scores and then dividing by the number of scores (e.g., if the total of the test scores in a class of 25 equals 1625, the class average, or mean, is 1625/25, or 65). Some schools require certain percentages of passing grades per class. If these percentages are exceeded, the teacher is seen as "too easy"; conversely, if these percentages are not met, students can become indignant and discipline problems can result.
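A minimal sketch of the mean calculation described above (the function name is illustrative):

```python
def class_mean(scores):
    """Norm-referenced base score: the class average (mean)."""
    return sum(scores) / len(scores)

# A class of 25 whose scores total 1625 has a mean of 1625/25 = 65.
scores = [65] * 25
print(class_mean(scores))  # -> 65.0
```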
In these instances, it is important to be able to adjust students' scores so that official standards can be met. Such adjustments can be made by:
° Adding (or subtracting) points to students' overall scores
° Adding (or subtracting) points in the sections in which students scored the highest
° Making the next test easier (or harder)
° Weighting the test lightly (or heavily) in the semester-end grade by multiplying each test by an appropriate amount, or coefficient. For example, in the table below, Panafricanism is weighted three times, which gives the student an end-of-term score of 84%.
TEST                  SCORE   RELATIVE WEIGHT (coefficient)   SCORE WITH WEIGHTING
Pre-colonial Africa   50%     1                               50
Neo-colonial Africa   85%     1                               85
Panafricanism         95%     3                               285
TOTAL                         5                               420 (420/5 = 84%)
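The coefficient arithmetic above can be expressed as a small helper; the (score, weight) pairs mirror the table, but the function itself is my sketch.

```python
def weighted_average(scored_tests):
    """End-of-term score from (score, coefficient) pairs."""
    total = sum(score * weight for score, weight in scored_tests)
    weights = sum(weight for _, weight in scored_tests)
    return total / weights

# Pre-colonial 50% x1, Neo-colonial 85% x1, Panafricanism 95% x3
print(weighted_average([(50, 1), (85, 1), (95, 3)]))  # -> 84.0
```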
Self-Referenced Scoring: Though it is difficult to do in large classes, self-referenced scoring measures an individual student's rate of progress relative to his or her own past performance. By comparing past test scores, a teacher can assess a student's rate of progress in a given subject area, or across subjects to see where he/she is in need of help.
The advantages and disadvantages of Criterion-, Norm- and Self-Referenced scoring are listed below:

Norm-referenced
Advantages:
1. Allows for comparisons among students
2. Classes can be compared to other classes
3. Allows the teacher to spot students who are dropping behind the class
Disadvantages:
1. If the whole class does well, some students still get poor grades
2. If the class as a whole does poorly, a good grade could be misleading
3. Does not allow individual progress or individual circumstances to be considered
4. The whole class (or large portions of it) must be evaluated in the same way
5. Everyone in the class (or norm group) must be evaluated with the same instrument under the same conditions

Criterion-referenced
Advantages:
1. Helps the teacher to decide if students are ready to move on
2. Criteria are independent of group performance
3. Works well in a mastery-learning setting
4. Each individual can be evaluated on different material depending on his or her level of achievement
Disadvantages:
1. It is difficult to develop meaningful criteria (therefore arbitrary cut-off scores are often used)
2. Presents unique problems in computing the reliability of criterion-referenced tests
3. Makes it difficult to make comparisons among students

Self-referenced
Advantages:
1. Allows you to check student progress
2. Makes it possible to compare achievement across different subjects for the same individual
Disadvantages:
1. All measures taken on an individual must be taken with similar instruments under similar circumstances
2. Does not help you to compare an individual with his or her peers
Percentile Ranking: Just as the raw scores for individual test items can be transformed to fit a certain testing model (e.g. Francophone testing -- score/20), so can one set of test results be analyzed in relation to previous tests as well as other classes' performances. Percentile ranks offer a way to obtain an image of class performance on a test by calculating the percentage of persons who obtain lower scores. To obtain this figure, divide the number of students below the passing grade by the total number of students who took the test. For example, if only 10 students out of 30 get passing scores (50% and above), then 20 of the 30 students -- 66% of the class -- rank below the passing grade.
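The calculation just described, sketched as a function; the name, the cutoff parameter, and the sample scores are mine.

```python
def percent_below(scores, cutoff=50):
    """Percentage of the class scoring below a given cutoff (e.g. the passing grade)."""
    below = sum(1 for s in scores if s < cutoff)
    return below / len(scores) * 100

# 10 of 30 students pass (50 and above); the other 20 rank below.
scores = [60] * 10 + [40] * 20
print(percent_below(scores))  # about 66% of the class scored below passing
```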
Charting Student Performance: Just as percentile ranking can give a teacher a comparative measure of class performance, charting the results of a test can give the teacher an internal picture of how his/her class has performed as a whole. The graph below, for example, clearly and graphically illustrates that the majority of the students in the class failed the test.
[Graph: Student performance]
To chart student performance:
1. Tally the number of students who obtain each score (e.g. 4 students at 4/20, or 20/100; 16 students at 8/20, or 40/100).
2. Plot each number on a chart as illustrated above.
3. Draw a vertical line intersecting the passing grade. (In the French system 10/20 is passing; in the British system 50/100 is passing.)
The teacher can obtain a visual comparison of class performance over a semester or a year by superimposing the charted results of multiple tests.
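The three charting steps above can be sketched as a crude text histogram; everything here (function names, sample scores) is illustrative.

```python
from collections import Counter

def tally_scores(scores):
    """Step 1: count how many students obtained each score."""
    return Counter(scores)

def text_chart(scores, passing=10, max_score=20):
    """Steps 2-3: one row per score, one '*' per student, '|' marking the passing grade."""
    tally = tally_scores(scores)
    rows = []
    for s in range(max_score + 1):
        marker = "|" if s == passing else " "
        rows.append(f"{s:2d} {marker} " + "*" * tally.get(s, 0))
    return "\n".join(rows)

# 4 students at 4/20 and 16 students at 8/20 -- everyone below the 10/20 passing line.
print(text_chart([4] * 4 + [8] * 16))
```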
Unit Three: Item Analysis
3.1. Item difficulty level
3.2. Item discrimination index
3.3. Item Banking
3.1 Item difficulty level
How difficult do you think a test should be? How do we determine the difficulty level of test items? Why is it important to know the difficulty level of test items?
The item difficulty index is one of the most useful, and most frequently reported, item analysis statistics. It is a measure of the proportion of examinees who answered the item correctly; for this reason it is frequently called the p-value. If scores from all students in a group are included, the difficulty index is simply the total percent correct. When a sufficient number of scores is available (i.e., 100 or more), difficulty indexes are calculated using scores from the top and bottom 27 percent of the group.
Item Analysis Procedures:
1. Rank the papers in order from the highest to the lowest scores.
2. Select the one-third of the papers with the highest total scores and the one-third of the papers with the lowest total scores.
3. For each test item, tabulate the number of students in the upper and lower groups who selected each option.
4. Compute the difficulty of each item (the % of students who got the item right).
The item difficulty index can be calculated using the following formula:
P = (Successes in the HSG + Successes in the LSG) / (N in HSG + N in LSG)
Where,
HSG = High scoring group
LSG = Low scoring group
N = The total number in the HSG and LSG
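The formula above, as a one-line helper; the function name is mine, and the example numbers match the multiple-choice activity that follows.

```python
def item_difficulty(correct_high, correct_low, n_high, n_low):
    """p-value: (successes in HSG + successes in LSG) / (N in HSG + N in LSG)."""
    return (correct_high + correct_low) / (n_high + n_low)

# 8 of 10 high scorers and 3 of 10 low scorers answer correctly:
print(item_difficulty(8, 3, 10, 10))  # -> 0.55
```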
The difficulty indexes can range between 0.0 and 1.0 and are usually expressed as a percentage. A higher value indicates that a greater proportion of examinees responded to the item correctly, and that it was thus an easier item. The average difficulty of a test is the average of the individual item difficulties. For maximum discrimination among students, an average difficulty of .60 is ideal. For example: if 243 students answered item no. 1 correctly and 9 students answered incorrectly, the difficulty level of the item would be 243/252, or .96. In the example below, five true-false questions were part of a larger test administered to a class of 20 students. For each question, the number of students answering correctly was determined and then converted to the percentage of students answering correctly.
Question   Correct responses   Item difficulty
1          15                  75% (15/20)
2          17                  85% (17/20)
3          6                   30% (6/20)
4          13                  65% (13/20)
5          20                  100% (20/20)
Activity: Calculate the item difficulty level for the following four-option multiple-choice test item. (The sign (*) shows the correct answer.)

                Response Options
Groups          A    B    C    D*   Total
High Scorers    0    1    1    8    10
Low Scorers     1    1    5    3    10
Total           1    2    6    11   20
3.2 Item Discrimination Index
To what extent do you think a test item should discriminate between higher achievers and lower achievers? Should it be highly discriminating, moderately discriminating, or less discriminating?
The index of discrimination is a numerical indicator that enables us to determine whether a question discriminates appropriately between lower scoring and higher scoring students. When students who earn high scores are compared with those who earn low scores, we would expect to find more students in the high scoring group answering a question correctly than students from the low scoring group. In the case of very difficult items, which no one in either group answered correctly, or fairly easy questions, which even the students in the low group answered correctly, the numbers of correct answers might be equal for the two groups.
What we would not expect to find is a case in which the low scoring students answered correctly more frequently than students in the high group. The index is calculated as:
D = (Successes in the HSG - Successes in the LSG) / (1/2 (N in HSG + N in LSG))
Where,
HSG = High Scoring Group
LSG = Low Scoring Group
In the example below, there are 8 students in the high scoring group and 8 in the low scoring group (with 12 between the two groups who are not represented). For question 1, all 8 in the high scoring group answered correctly, while only 4 in the low scoring group did so. Thus successes in the HSG minus successes in the LSG is 8 - 4 = +4. The last step is to divide the +4 by half of the total number in both groups (half of 16, or 8). Thus 4/8 gives us +.5, which is the D value.
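The same arithmetic as a sketch; the function name is mine, and the numbers reproduce the worked example above.

```python
def discrimination_index(correct_high, correct_low, n_high, n_low):
    """D = (successes in HSG - successes in LSG) / (half the total in both groups)."""
    return (correct_high - correct_low) / ((n_high + n_low) / 2)

# Question 1: all 8 high scorers correct, but only 4 of the 8 low scorers.
print(discrimination_index(8, 4, 8, 8))  # -> 0.5
```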
Question   Success in the HSG   Success in the LSG   Difference    D value
1          8                    4                    8 - 4 = 4     .5
2          7                    2
3          5                    6

Activity 2: Calculate the item discrimination index for questions 2 and 3 in the table above.
The item discrimination index can vary from -1.00 to +1.00. A negative discrimination index (between -1.00 and zero) results when more students in the low group answered correctly than students in the high group. A discrimination index of zero means equal numbers of high and low students answered correctly, so the item did not discriminate between the groups.
A positive index occurs when more students in the high group answer correctly than in the low group. If the students in the class are fairly homogeneous in ability and achievement, their test performance is also likely to be similar, resulting in little discrimination between high and low groups. Questions that have an item difficulty index (NOT item discrimination) of 1.00 or 0.00 need not be included when calculating item discrimination indices. An item difficulty of 1.00 indicates that everyone answered correctly, while 0.00 means no one answered correctly. We already know that neither type of item discriminates between students. When computing the discrimination index, the scores are divided into three groups, with the top 27% of the scores in the upper group and the bottom 27% in the lower group.
The number of correct responses for an item by the lower group is subtracted from the number of correct responses for the item in the upper group. The difference is divided by the number of students in either group. The process is repeated for each item. The resulting value, which can range from -1.00 to +1.00, is interpreted in terms of both:
 direction (positive or negative) and
 strength (non-discriminating to strongly discriminating).
Item discrimination interpretation
D value          Direction   Strength
> +.40           positive    strong
+.20 to +.40     positive    moderate
-.20 to +.20     none        ---
< -.20           negative    moderate to strong
For a small group of students, an index of discrimination that exceeds .20 is considered satisfactory for an item. For larger groups, the index should be higher because more difference between groups would be expected. The guidelines for an acceptable level of discrimination depend upon item difficulty. For very easy or very difficult items, low discrimination levels would be expected; most students, regardless of ability, would get the item correct or incorrect as the case may be. For items with a difficulty level of about 70 percent, the discrimination should be at least .30. When an item is discriminating negatively, overall the most knowledgeable examinees are getting the item wrong and the least knowledgeable examinees are getting the item right. A negative discrimination index may indicate that the item is measuring something other than what the rest of the test is measuring. More often, it is a sign that the item has been mis-keyed.
3.3 Item Banking
Building a file of effective test items and assessment tasks involves recording the items or tasks, adding information from analyses of student responses, and filing the records by both the content area and the objective that the item or task measures. Thus, items and tasks are recorded as they are constructed; information from analysis of student responses is added after the items and tasks have been used; and then the effective items and tasks are deposited in the file. Within a few years, it becomes possible to draw some of the items and tasks from the file and supplement these with new items and tasks.
As the file grows, it becomes possible to select the majority of the items and tasks for any given test or assessment from the file without repeating them frequently. Such a file is especially valuable in areas of complex achievement, where the construction of test items and assessment tasks is difficult and time-consuming. When enough high-quality items and tasks have been assembled, the burden of preparing tests and assessments is considerably lightened. Computer item banking makes these tasks even easier.
Summary: In this unit you learned how to judge the quality of a classroom test by carrying out item analysis, which is the process of "testing the item" to ascertain specifically whether the item is functioning properly in measuring what the entire test is measuring.
You also learned about the process of item analysis, how to compute item difficulty and item discriminating power, and how to evaluate the effectiveness of distracters. You have learned that item difficulty indicates the percentage of testees who get the item right; item discriminating power is an index which indicates how well an item is able to distinguish between the high achievers and low achievers, given what the test is measuring; and the distracting power of a distracter is its ability to differentiate between those who do not know and those who know what the item is measuring. Finally, you learned that after conducting item analysis, items may still be usable after modest changes are made to improve their performance on future exams. Thus, good test items should be kept in test item banks, and in this unit you were given highlights on how to build a Test Item File/Item Bank.
Unit Four: Ethical Standards of Assessment
4.1 Ethical and professional standards of assessment and its use
4.2 Race, ethnicity, gender, religion and culture in assessment and test
4.1 Ethical Standards of Assessment
Ethical standards guide teachers in fulfilling their obligation to provide and use tests that are fair to all test takers regardless of age, gender, disability, ethnicity, religion, linguistic background, or other personal characteristics. Fairness is a primary consideration in all aspects of testing:
 It helps to ensure that all test takers are given a comparable opportunity to demonstrate what they know and how they can perform in the area being tested.
 It implies that every test taker has the opportunity to prepare for the test and is informed about the general nature and content of the test.
 It also extends to the accurate reporting of individual and group test results.
The following are some ethical standards that teachers may consider in their assessment practices.
1. Teachers should be skilled in choosing assessment methods appropriate for instructional decisions. Skill in choosing appropriate, useful, administratively convenient, technically adequate, and fair assessment methods is a prerequisite to the good use of information to support instructional decisions. Teachers need to be well-acquainted with the kinds of information provided by a broad range of assessment alternatives and with their strengths and weaknesses. In particular, they should be familiar with criteria for evaluating and selecting assessment methods in light of instructional plans.
2. Teachers should develop tests that meet the intended purpose and that are appropriate for the intended test takers. This requires teachers to:
Define the purpose for testing, the content and skills to be tested, and the intended test takers.
Develop tests that are appropriate in content, skills tested, and content coverage for the intended purpose of testing.
Develop tests that have clear, accurate, and complete information.
Develop tests with appropriately modified forms or administration procedures for test takers with disabilities who need special accommodations.
3. Teachers should be skilled in administering, scoring and interpreting the results from diverse assessment methods. It is not enough that teachers are able to select and develop good assessment methods; they must also be able to apply them properly. This requires teachers to:
Follow established procedures for administering tests in a standardized manner.
Provide and document appropriate procedures for test takers with disabilities who need special accommodations or those with diverse linguistic backgrounds.
Protect the security of test materials, including eliminating opportunities for test takers to obtain scores by fraudulent means.
Develop and implement procedures for ensuring the confidentiality of scores.
4. Teachers should be skilled in using assessment results when making decisions about individual students, planning teaching, developing curriculum, and pursuing school improvement. Assessment results are used to make educational decisions at several levels: in the classroom about students, in the community about a school and a school district, and in society generally about the purposes and outcomes of the educational enterprise. Teachers play a vital role when participating in decision-making at each of these levels and must be able to use assessment results effectively.
5. Teachers should be skilled in developing valid pupil grading procedures which use pupil assessments. Grading students is an important part of professional practice for teachers. Grading is defined as indicating both a student's level of performance and a teacher's valuing of that performance. The principles for using assessments to obtain valid grades are known, and teachers should employ them.
6. Teachers should be skilled in communicating assessment results to students, parents, other lay audiences, and other educators. Teachers must routinely report assessment results to students and to parents or guardians. In addition, they are frequently asked to report or to discuss assessment results with other educators and with diverse lay audiences. If the results are not communicated effectively, they may be misused or not used. To communicate effectively with others on matters of student assessment, teachers must be able to use assessment terminology appropriately and must be able to articulate the meaning, limitations, and implications of assessment results.
Furthermore, teachers will sometimes be in a position that requires them to defend their own assessment procedures and their interpretations of them. At other times, teachers may need to help the public to interpret assessment results appropriately.
7. Teachers should be skilled in recognizing unethical, illegal, and otherwise inappropriate assessment methods and uses of assessment information. Fairness, the rights of all concerned, and professional ethical behavior must undergird all student assessment activities, from the initial planning for and gathering of information to the interpretation, use, and communication of the results. Teachers must be well-versed in their own ethical and legal responsibilities in assessment.
In addition, they should also attempt to have the inappropriate assessment practices of others discontinued whenever they are encountered. Teachers should also participate with the wider educational community in defining the limits of appropriate professional behavior in assessment.
In addition, the following are principles of grading that can guide the development of a grading system.
1. The system of grading should be clear and understandable (to parents, other stakeholders, and most especially students).
2. The system of grading should be communicated to all stakeholders (e.g., students, parents, administrators).
3. Grading should be fair to all students regardless of gender, socioeconomic status or any other personal characteristics.
4. Grading should support, enhance, and inform the instructional process.
4.2 Race, ethnicity, gender, religion and culture in assessment and test
In the previous section we learned that fairness is the fundamental principle that has to be followed in teachers' assessment practices. It has been said that all students have to be provided with an equal opportunity to demonstrate the skills and knowledge being assessed. Fairness is fundamentally a socio-cultural, rather than a technical, issue.
Thus, in this section we are going to see how culture and ethnicity may influence teachers' assessment practices and what precautions we have to take in order to avoid bias and be accommodative to students from all cultural groups. Students represent a variety of cultural and linguistic backgrounds. If these cultural and linguistic backgrounds are ignored, students may become alienated or disengaged from the learning and assessment process. Teachers need to be aware of how such backgrounds may influence student performance and of the potential impact on learning. Teachers should be ready to provide accommodations where needed.
Classroom assessment practices should be sensitive to the cultural and linguistic diversity of students in order to obtain accurate information about their learning. Assessment practices that attend to issues of cultural diversity include those that
 Acknowledge students' cultural backgrounds.
 Are sensitive to those aspects of an assessment that may hamper students' ability to demonstrate their knowledge and understanding.
 Use that knowledge to adjust or scaffold assessment practices if necessary.
    Assessment practices thatattend to issues of linguistic diversity include those that  Acknowledge students’ differing linguistic abilities.  Use that knowledge to adjust or scaffold assessment practices if necessary.  Use assessment practices in which the language demands do not unfairly prevent the students from understanding what is expected of them. Use assessment practices that allow students to accurately demonstrate their understanding by responding in ways that accommodate their linguistic abilities, if the response method is not relevant to the concept being assessed (e.g., allow a student to respond orally rather than in writing).
Disability and Assessment Practices: It is quite obvious that our education system has been exclusionary, failing to fully accommodate the educational needs of students with disabilities. This has been true not only in our country but in the rest of the world as well, although the magnitude differs from country to country. It was in response to this situation that UNESCO has been promoting the principle of inclusive education to guide the educational policies and practices of all governments. Several world conventions have been held and documents signed towards the implementation of inclusive education. Our country, Ethiopia, is a signatory to these documents and has therefore accepted inclusive education as a basic principle to guide its policy and practice in relation to the education of students with disabilities.
UN Convention on the Rights of Persons with Disabilities (2006): One group should work on one convention; the documents can be found on the internet. Inclusive education is based on the idea that all students, including those with disabilities, should be provided with the best possible education to develop themselves. This calls for the provision of all possible accommodations to address the educational needs of students with disabilities. Accommodations should not refer only to the teaching and learning process; they should also cover assessment mechanisms and procedures.
There are different strategies that can be considered to make assessment practices accessible to students with disabilities, depending on the type of disability. In general terms, however, the following strategies could be considered in summative assessments:
 Modifying assessments: This should enable students with disabilities to have full access to the assessment without giving them any unfair advantage.
 Others' support: Students with disabilities may need the support of others in certain assessment activities that they cannot do independently. For instance, they may require readers and scribes in written exams; they may also need assistance in practical activities, such as using equipment, locating materials, drawing, and measuring.
 Time allowances: Students with disabilities should be given additional time to complete their assessments; the individual instructor decides how much, based on the purpose and nature of the assessment.
 Rest breaks: Some students may need rest breaks during the examination, for example to relieve pain or to attend to personal needs.
 Flexible schedules: In some cases students with disabilities may require flexibility in the scheduling of examinations. For example, some students may find it difficult to manage a number of examinations in quick succession and need to have examinations scheduled over a period of days.
 Alternative methods of assessment: In situations where formal methods of assessment may not be appropriate for students with disabilities, the instructor should assess them using non-formal methods such as class work, portfolios, and oral presentations.
 Assistive technology: Specific equipment may need to be available to the student in an examination. Such arrangements often include the use of personal computers, voice-activated software, and screen readers.
Gender issues in assessment: Do you feel that gender has any influence on teachers' assessment practices? Is there any gender-related stereotype in relation to assessment results? Teachers' assessment practices can also be affected by gender stereotypes. The issues of gender bias and fairness in assessment concern differences in opportunities for boys and girls. A test is biased if boys and girls with the same ability levels tend to obtain different scores.
Test questions should be checked for:
 material or references that may be offensive to members of one gender,
 references to objects and ideas that are likely to be more familiar to men or to women,
 unequal representation of men and women as actors in test items, or representation of members of each gender only in stereotyped roles.
If the questions involve objects and ideas that are more familiar or less offensive to members of one gender, then the test may be easier for individuals of that gender.
Standards for achievement on such a test may be unfair to individuals of the gender that is less familiar with, or more offended by, the objects and ideas discussed, because it may be more difficult for such individuals to demonstrate their abilities or their knowledge of the material.
Summary: In this unit you have learned that ethics is a very important issue to observe in our assessment practices, and the most important ethical consideration is fairness. If we are to draw reasonably good conclusions about what our students have learned, it is imperative that we make our assessments, and our uses of the results, as fair as possible for as many students as possible. A fair assessment is one in which students are given equitable opportunities to demonstrate their abilities and knowledge.
Teachers must make every effort to address and minimize the effect of bias in classroom assessment practices. Biases in assessment can arise from differences in culture or ethnicity, from disability, and from gender. To ensure suitability and fairness for all students, teachers need to check each assessment strategy for appropriateness and for cultural, disability, and gender bias. Equitable assessment means that students are assessed using the methods and procedures most appropriate to them. Classroom assessment practices should be sensitive and diverse enough to accommodate all types of diversity in the classroom in order to obtain accurate information about learning.