Module 6: Learning Assessments
and Evaluation
Course code: TECS 623
Credit hours: 2
Prof. Omprakash H M
Department of Curriculum and Instruction
College of Education and Behavioral Sciences
Bule Hora University, Adola, Ethiopia
Unit One: Concepts, Purposes and
Principles of Assessment
1.1 Concept of Assessment and related
terms (Test, Measurement, Assessment
and Evaluation)
1.2 Function of Assessment and
Evaluation
1.3 Principles of Assessment (Validity,
Equity, reliability and explicitness)
1.4 Basic assumptions in assessing
students’ performance.
1.1 Concept of Assessment and related terms (Test,
Measurement, Assessment and Evaluation)
Concept of Assessment:
Assessment: Assessment is the process of gathering and
discussing information from multiple and diverse sources in
order to develop a deep understanding of what students know,
understand, and can do with their knowledge as a result of their
educational experiences; the process culminates when
assessment results are used to improve subsequent learning.
Comprehensive definition of assessment that incorporates its
key elements: the planned process of gathering and synthesizing
information relevant to the purposes of (a) discovering and
documenting students' strengths and weaknesses, (b) planning
and enhancing instruction, or (c) evaluating progress and making
decisions about students.
What is the basic concept of assessment?
Assessment refers to the full range of information
gathered and produced by teachers about their students
and their classrooms (Arends, 1994). Assessment is a
method for analyzing and evaluating student
achievement or program success.
Assessment involves the use of empirical data on
student learning to refine programs and improve
student learning.
It is the process of defining, selecting, designing,
collecting, analyzing, interpreting, and using
information to increase students' learning and
development.
All those activities undertaken by teachers,
and by their students in assessing themselves,
which provide information to be used as
feedback to modify the teaching and learning
activities in which they are engaged.
Basic Concepts in Testing and Assessment
Test Measurement Assessment Evaluation
What is a Test?
Perhaps test is a concept that we are more familiar with
than the other concepts. We have been taking tests ever
since we started schooling to determine our
academic performance. Tests are also used in work-
places to select individuals for a certain job vacancy.
Thus, a test in the educational context refers to the presentation of
a standard set of questions to be answered by students. It is one
instrument that is used for collecting information about
students’ behaviors or performances. It can be noted that there
are many other ways of collecting information about students’
educational performances other than tests, such as observations,
assignments, project works, portfolios, etc.
 Most commonly used method of making
measurements in education.
 An instrument or systematic procedure for
measuring a sample of behaviour by posing a set of
questions in a uniform manner, in order to measure a
quality, ability, skill or knowledge.
 Designed in various formats, e.g. objective, subjective,
descriptive, timed, or pretest/posttest, to enhance
observation of learning.
 There is always a right/best and a wrong answer.
What is Measurement?
Measurement: In our day to day life there are
different things that we measure.
We measure our height and put it in terms of
meters and centimeters.
We measure some of our daily consumptions like
sugar in kilograms and liquids in liters.
We measure temperature and express it in terms of
degree centigrade or degree Celsius. How do we
measure these things?
Well definitely we need to have appropriate instruments such as a
meter, a weighing scale, or a thermometer in order to have reliable
measurements.
Similarly, in education measurement is the process by which the
attributes of a person are measured and described in numbers. It is
a quantitative description of the behavior or performance of
students. As educators we frequently measure human attributes
such as attitudes, academic achievement, aptitudes, interests,
personality and so forth. Measurement permits more objective
description concerning traits and facilitates comparisons. Hence,
to measure we have to use certain instruments so that we can
conclude that a certain student is better in a certain subject than
another student. How do we measure performance in
mathematics?
We use a mathematics test which is an instrument containing
questions and problems to be solved by students. The number of
right responses obtained is an indication of performance of
individual students in mathematics. Thus, the purpose of
educational measurement is to represent how much of ‘something’
is possessed by a person using numbers. Note that we are only
collecting information. We are not evaluating! Evaluation is
therefore quite different from measurement. Measurement is also
not the same as testing. While a test is an instrument to collect
information about students’ behaviors, measurement is the
assignment of quantitative value to the results of a test or other
assessment techniques. Measurement can refer to both the score
obtained as well as the process itself.
 Basically the assignment of numbers.
 A variety of instruments, such as tests, rating scales and rubrics,
are used.
 The process of obtaining a numerical description of the degree
to which an individual possesses a given attribute.
 Quantifying how much learners have learned.
What is Assessment?
Assessment: In educational literature the concepts
‘assessment’ and ‘evaluation’ have been used with
some confusion. Some educators have used them
interchangeably to mean the same thing. Others
have used them as two different concepts. Even
when they are used differently there is too much
overlap in the interpretations of the two concepts.
Cizek (in Phiye, 1997) provides us a comprehensive definition of
assessment that incorporates its key elements:
The planned process of gathering and synthesizing information
relevant to the purposes of:
(a) Discovering and documenting students' strengths and
weaknesses,
(b) Planning and enhancing instruction, or
(c) Evaluating progress and making decisions about students.
Process by which evidence of student achievement is obtained
and evaluated (marks obtained, how...checked).
 Including testing, interpreting and placing information in
context.
 Process of gathering and organizing data – the basis
for decision- making (evaluation).
 Methods of measuring and evaluating the nature of the
learner (what he learned / how he learned), e.g. objective or
descriptive.
Principles of Assessment
1. Assessment should be aimed at improving student performance.
2. Assessment should be based on an understanding of how students
learn.
3. Assessment should be an integral component of course design
and not something to add afterwards.
4. Good assessment provides useful information to report credibly to
parents on student achievement.
The Role of Assessment in Learning
Assessment plays a major role in how students learn, their
motivation to learn, and how teachers teach.
Assessment is used for various purposes:
 Assessment for learning: where assessment helps teachers gain
insight into student learning in order to adjust instruction.
 Assessment as learning: where students develop an awareness of
how they learn and monitor their own learning.
 Assessment of learning: where assessment informs students,
teachers and parents, as well as the broader educational
community, of achievement at a certain point in time.
Research and experience show that student learning is best
supported when:
 Instruction and assessment are based on clear learning goals.
 Instruction and assessment are differentiated according to
student learning needs.
 Assessment information is used to make decisions that
support further learning.
 Parents are well informed about their child’s learning, and
work with the school to help plan and provide support,
e.g. weekly/monthly meetings.
Advantages of Assessment
 Helps in knowing the position of a
student when they enter a course,
e.g. Adam, a student at Adola Teachers’
College.
 It provides a broad view of students’
needs and achievement.
 In accordance with the students’
achievement, the curriculum and teaching
methods can be adjusted.
Disadvantages of Assessment
It limits the potential of a student to a mere ‘test’.
 Under pressure and supervision, creativity and
performance are affected.
 Though assessment aims at bringing out latent/hidden
knowledge, it often conceals it through the pressure it creates.
 The parameter for judging knowledge is just a test score.
Evaluation
It is the process of obtaining, analyzing and interpreting
information to determine the extent to which students
achieve the learning objectives, e.g. how many marks a student secures.
1.2 Function of Assessment and Evaluation
1. Capturing student time and attention.
2. Generating appropriate student learning activity.
3. Providing timely feedback which students pay attention
to.
4. Helping students to internalize the discipline’s
standards and notions of quality.
5. Generating marks or grades which distinguish between
students or enable pass/fail decisions to be made.
 Monitoring the progress
 Decision making
 Screening
 Diagnostic process
 Placement of students in remedial courses
 Instructional planning
 Evaluation of instructional programs
 Feedback
 Motivation
1.3 Principles of Assessment (Validity, Equity,
reliability and explicitness)
What are the basic principles of assessment?
1 Assessment should be valid.
2 Assessment should be reliable and consistent.
3 Information about assessment should be explicit, accessible
and transparent.
4 Assessment should be inclusive and equitable
 Validity
Validity refers to the evidence base that can be provided about
appropriateness of the inferences, uses, and consequences that
come from assessment.
Appropriateness has to do with the soundness, trustworthiness, or
legitimacy of the claims or inferences that testers would like to
make on the basis of obtained scores.
 Validity is “the extent to which inferences
made from assessment results are appropriate,
meaningful, and useful in terms of the purpose
of the assessment” (Gronlund, 1998).
 Validity refers to whether the test is actually
measuring what it claims to measure (Arshad,
2004).
Face Validity
It is pertinent that a test looks like a test even at first
impression.
If students taking a test feel that the questions given
to them are not a test or part of a test, then the test may
not be valid, as the students may not take it seriously
when attempting the questions.
Construct Validity
Construct is a psychological concept used in measurement.
Construct validity is the most obvious reflection of whether a
test measures what it is supposed to measure as it directly
addresses the issue of what it is that is being measured.
In other words, construct validity refers to whether the
underlying theoretical constructs that the test measures are
themselves valid.
 Equity
Principle 1 - Assessment should be inclusive and equitable/fair. As
far as is possible without compromising academic standards,
inclusive and equitable assessment should ensure that tasks and
procedures do not disadvantage any group or individual.
Principle 2 - Information about assessment should be explicit,
accessible and transparent. Clear, accurate, consistent and timely
information on assessment tasks and procedures should be made
available to students, staff and other external assessors or
examiners.
Principle 3 - Assessment should be inclusive and equitable. As far
as is possible without compromising academic standards,
inclusive and equitable assessment should ensure that tasks and
procedures do not disadvantage any group or individual.
Principle 4 - Assessment should be an integral part of programme
design and should relate directly to the programme aims and
learning outcomes. Assessment tasks should primarily reflect the
nature of the discipline or subject but should also ensure that
students have the opportunity to develop a range of generic skills
and capabilities.
Principle 5 - The amount of assessed work should be manageable.
The scheduling of assignments and the amount of assessed work
required should provide a reliable and valid profile of achievement
without overloading staff or students.
Principle 6 - Formative and summative assessment should be
included in each programme. Formative and summative assessment
should be incorporated into programmes to ensure that the
purposes of assessment are adequately addressed. Many
programmes may also wish to include diagnostic assessment.
Principle 7 - Timely feedback that promotes learning and facilitates
improvement should be an integral part of the assessment process.
Students are entitled to feedback on submitted formative
assessment tasks, and on summative tasks, where appropriate. The
nature, extent and timing of feedback for each assessment task
should be made clear to students in advance.
Principle 8 - Staff development policy and strategy should include
assessment. All those involved in the assessment of students must
be competent to undertake their roles and responsibilities.
 Reliability
According to Brown (2010), a reliable test can be described as
follows:
◦ Consistent in its conditions across two or more
administrations
◦ Gives clear directions for scoring / evaluation
◦ Has uniform rubrics for scoring / evaluation
◦ Lends itself to consistent application of those rubrics by
the scorer
◦ Contains items / tasks that are unambiguous to the
test-taker
Reliability means the degree to which an assessment tool
produces stable and consistent results.
Reliability essentially denotes ‘consistency, stability,
dependability, and accuracy of assessment results’
Since there is considerable variability from one teacher or
tester to another that affects student performance,
consistency in planning, implementing, and scoring
student performances gives rise to valid assessment.
Split Half Reliability
A test is administered once to a group, divided into two
equal halves after the students have returned it, and
the two halves are then correlated.
Halves are often determined based on the number assigned to
each item with one half consisting of odd numbered items and
the other half even numbered items.
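The odd/even split described above can be sketched in code. This is a minimal illustration with hypothetical score data: it correlates the two half-test scores with the Pearson formula, then applies the Spearman-Brown correction (full-test reliability = 2r / (1 + r)) to estimate the reliability of the whole test.

```python
# Split-half reliability sketch (hypothetical data): each row is one
# student's item scores (1 = correct, 0 = wrong) on a 6-item test.
scores = [
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 1, 0],
]

def pearson(x, y):
    """Pearson correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Odd-numbered items form one half, even-numbered items the other.
odd_half = [sum(row[0::2]) for row in scores]
even_half = [sum(row[1::2]) for row in scores]

r_half = pearson(odd_half, even_half)
# Spearman-Brown correction estimates reliability of the full-length test.
r_full = 2 * r_half / (1 + r_half)
print(round(r_full, 3))
```

Because the correlation is computed between half-length tests, the Spearman-Brown step is what brings the estimate back up to the full test length.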
Factors that affect the reliability of a test
a. Test factor
b. Teacher and student factor
c. Environment factor
d. Test administration factor
e. Marking factor
a. Test Factor
In general, longer tests produce higher reliabilities.
Due to the dependency on coincidence and guessing, the
scores will be more accurate if the duration of the test is
longer.
An objective test has higher consistency because it
is not exposed to a variety of interpretations.
b. Teacher and Student factor
In most tests, it is normal for teachers to construct and
administer tests for students. Thus, any good teacher-student
relationship would help increase the consistency of the
results.
Other factors that contribute to positive effects to the
reliability of a test include teacher’s encouragement, positive
mental and physical condition, familiarity with the test formats,
and perseverance and motivation.
c. Environment Factor
An examination environment certainly influences test-
takers and their scores.
Any favourable environment with comfortable chairs and desks,
good ventilation, sufficient light and space will improve the
reliability of the test.
On the contrary, a non-conducive environment will
affect test-takers’ performance and test reliability.
d. Test Administration Factor
Because students' grades are dependent on the way tests are
being administered, test administrators should strive to
provide clear and accurate instructions, sufficient time and
careful monitoring of tests to improve the reliability of their
tests.
A test-retest technique can be used to determine test
reliability.
e. Marking Factor
Human judges have many opportunities to
introduce error in scoring essays.
It is also common that different markers award
different marks for the same answer even with a
prepared mark scheme.
A marker’s assessment may vary from time to
time and with different situations.
Conversely, it does not happen to the objective type of
tests since the responses are fixed. Thus, objectivity is
a condition for reliability.
 Explicitness
Principle - Information about assessment should be explicit,
accessible and transparent. Clear, accurate, consistent and
timely information on assessment tasks and procedures
should be made available to students, staff and other external
assessors or examiners.
1.4 Basic assumptions in assessing students’ performance
1. Quality of student learning is directly related to quality of
teaching.
2. The first step in getting useful feedback about course goals is
to make these goals explicit.
3. Students need focused feedback early and often, and they
should be taught how to assess their own learning.
4. The most effective assessment addresses problem-directed
questions that faculty ask themselves.
5. Course assessment is an intellectual challenge and therefore
motivating for the faculty.
6. Assessment does not require special training.
7. Collaboration with colleagues and students improves learning
and is satisfying.
Reflection: When planning to assess students, what are the
assumptions that one held in mind?
What are the things that should be kept in mind when preparing
assessment tools for assessing students?
Angelo and Cross (1993) have listed seven basic assumptions of
classroom assessment which are described as follows:
The quality of student learning is directly, although not
exclusively related to the quality of teaching.
Therefore, one of the most promising ways to improve learning
is to improve teaching.
If assessment is to improve the quality of students’ learning, both
teachers and students must become personally invested and
actively involved in the process.
Reflection: What should be the roles of students and teachers in
classroom assessment so that it helps students’ learning?
To improve their effectiveness, teachers need first to make their
goals and objectives explicit and then to get specific,
comprehensible feedback on the extent to which they are
achieving those goals and objectives.
Effective assessment begins with clear goals.
Before teachers can assess how well their students are learning,
they must identify and clarify what they are trying to teach.
After teachers have identified specific teaching goals they wish
to assess, they can better determine what kind of feedback to
collect.
To improve their learning, students need to receive appropriate
and focused feedback early and often; they also need to learn how
to assess their own learning.
Reflection: How do you think feedback and self-assessment will
help to improve students’ learning?
The type of assessment most likely to improve teaching and
learning is that conducted by teachers to answer questions they
themselves have formulated in response to issues or problems in
their own teaching.
To best understand their students’ learning, teachers need
specific and timely information about the particular individuals in
their classes. As a result of the different students’ needs, there is
often a gap between assessment and student learning. One goal of
classroom assessment is to reduce this gap.
Reflection: How does classroom assessment help to reduce this
gap between assessment and student learning?
Systematic inquiry and intellectual challenge are powerful sources
of motivation, growth, and renewal for teachers, and classroom
assessment can provide such challenge.
Classroom assessment is an effort to encourage and assist those
teachers who wish to become more knowledgeable, involved, and
successful.
Classroom assessment does not require specialized training; it
can be carried out by dedicated teachers from all disciplines.
To succeed in classroom assessment, teachers need only a
detailed knowledge of the discipline, dedication to teaching, and
the motivation to improve.
By collaborating with colleagues and actively involving
students in classroom assessment efforts, teachers
(and students) enhance learning and personal
satisfaction. By working together, all parties achieve
results of greater value than those they can achieve by
working separately.
Reflection: Can you explain how teachers’
collaboration with colleagues can be more effective in
enhancing learning and personal satisfaction than
working alone?
Unit Two: Assessment types, Methods and Tools
2.1. Assessment Types
2.2. Assessment Method
2.3. Assumption in selecting assessment methods
2.4. Table of specification and construction of item
2.5. Test administration, marking and grading
2.1. Assessment Types
Advantages of Formative Assessment
 Develops knowledge
 Continuous improvement
 Provides quick feedback
 Achieves successful outcomes
 Regular communication with parents
Disadvantages of Formative Assessment
 Time consuming and requires resources
 Tiring process
 Requires trained and qualified professionals
 Develops challenges
Advantages of Summative Assessment
 To know if students have understood
 They determine achievement
 They make academic records
 Boosts individuals
 Weak areas can be identified
 Training success can be measured
Disadvantages of Summative Evaluation
 Demotivates individuals
 Rectification is late
 It is disruptive
 No remedy to identify challenges in advance
 Not accurate reflection of learning
 Negative effect for students
Assessment as Procedure
 This paradigm has elements of both measurement
paradigm and inquiry paradigm.
 The primary focus is on assessment procedures and
not on the underlying purposes of the assessment
program.
 Knowledge is believed to exist separately from the
learners, and it can be transmitted to the students and
eventually objectively measured.
Assessment as Inquiry
 Under this paradigm, assessment is based on
constructive theories of knowledge, student-centered
learning and the inquiry process.
 The teachers use various qualitative and quantitative
techniques to inquire about particular learners.
 It is a process of inquiry, and a process of
interpretation, used to promote reflection
concerning students' understandings, attitudes, and
literate abilities.
 Assessment, in this paradigm, is viewed as a social,
context-specific, interpretive activity.
Assessment as Measurement
 The primary instrument of this paradigm is the large-scale,
norm-referenced standardized test.
 Objectivity, standardization and reliability are the main concerns.
 Knowledge is considered to exist separately from the
learners; and the learners work to acquire it and not construct
it.
 Decisions about the information to be collected and the
means of evaluation are made by authorities outside the
classroom.
Conclusion:
 Therefore, it may be concluded that though the leanings are
towards the formative evaluation system, it is not without
the need for modification.
 A balance is required not only in the formative but also in any
form of evaluation system.
 In the summative system too much leniency risks losing the
attention of the students, while in formative evaluation too
much stress can endanger the clarity of concepts in the
students.
 A midway has to be found between the two extremes, and a
modified, flexible and accommodating system of evaluation, along
the lines of the formative system, should be adopted, so that
the faculty and the students can keep track of the latter’s
progress while at the same time also have the chance to
improve, develop and grow.
2.2. Assessment Method
Methods of Assessment. Methods will vary depending on the
learning outcome(s) to be measured. Direct methods are when
students demonstrate that they have achieved a learning
outcome or objective. Indirect methods are when students (or
others) report perceptions of how well students have achieved an
objective or outcome.
A. Direct Assessment Method:
Direct assessment involves looking at actual samples of student
work produced in our programs. Examples include observations of
field work, internship performance, service learning and clinical
experiences; grades based on explicit criteria related to clear
learning goals; tests of writing, critical thinking, or general
knowledge; and performance on achievement tests.
B. Indirect Assessment Method
Indirect methods are when students (or others) report perceptions
of how well students have achieved an objective or outcome.
Indirect assessment methods require that faculty infer actual
student abilities, knowledge, and values rather than observe direct
evidence. Among indirect methods are surveys, exit interviews, focus
groups, and the use of external reviewers.
 Surveys: Surveys usually are given to large numbers of possible
respondents, usually in writing, and often at a distance.
 Exit interviews and focus groups: Exit interviews and focus
groups allow faculty to ask specific questions face-to-face with
students.
 External reviewers: External reviewers are usually representatives
of the discipline and usually are guided by discipline-based
standards.
Advantages:
•Indirect methods are easy to administer;
•Indirect methods may be designed to facilitate
statistical analyses;
•Indirect methods may provide clues about what
could be assessed directly;
•Indirect methods can flesh out areas that direct
assessments cannot capture;
•Indirect methods are particularly useful for
ascertaining values and beliefs;
•Surveys can be given to many respondents at a time;
•Surveys are useful for gathering information from alumni,
employers, and graduate program representatives;
•Exit interviews and focus groups allow faculty to question
students face to face;
•External reviewers can bring a degree of objectivity to the
assessment;
•External reviewers can be guided either by questions that
the Department wants answered or by discipline-based
national standards.
DISADVANTAGES:
•Indirect methods provide only impressions and
opinions, not hard evidence;
•Impressions and opinions may change over time
and with additional experience;
•Respondents may tell you what they think you
want to hear;
•The number of surveys returned is usually low,
with 33 percent considered a good number;
•You cannot assume those who do not
respond would have responded in the
same way as those who did respond;
•Exit interviews take time to carry out;
•Focus groups usually involve a limited
number of respondents;
•Unless the faculty agree upon the
questions that are asked in exit
interviews and focus groups, there may
not be consistency in the responses.
2.3 Assumption in selecting assessment methods
Choosing appropriate assessments:
1. Vary assessments.
2. Consider intervals for assessment.
3. Match learning goals to assessments.
4. Direct and indirect assessment.
5. Collect data on student performance.
6. Revise assessment choices.
7. Assessment Primer.
8. Creating Assignments and Exams.
Assessment methods are selected based on the level and
content of your learning objectives. Certain assessment
methods are better suited for different types of knowledge,
skills, or attitude aspects.
1. Vary assessments
Student learning styles vary widely, and their strengths and
challenges with respect to assessment vary as well. Instructors need
to consider that variation as they choose assessments for their
courses. By varying the way we assess student understanding, we are
more likely to offer opportunities for every student to demonstrate
their knowledge. This can be accomplished by creating courses with
three or more forms of assessment, for example papers, class projects
and exams. This can also be accomplished by offering choices of how
to be assessed, for example giving students the option of writing a
paper or taking an exam for a unit of instruction, as long as by the
end of a course they have done both forms of assessment. This
might also be accomplished by offering multiple questions, and
having students choose which to answer. New faculty members
should think creatively how to best elicit quality student responses.
2. Consider intervals for assessment
The frequency of assessment varies widely from course to
course. Some classes assess only twice, on a midterm and a
final. Others have weekly assignments, presentations and
homework. Think about the frequency with which your
students should be assessed, based on the knowledge that
assessment drives learning by focusing student attention,
energy, and motivation to learn. New faculty members need
to try various intervals and choose those that best support
their students’ learning.
3. Match learning goals to assessments
What we assess is what our students study, engage with, and
explore in more depth. By beginning with what we want
students to know and be able to do, we can design and choose
assessments to demonstrate the appropriate knowledge and
skills we are aiming for them to learn. After choosing student
learning outcomes, make a grid that places learning outcomes
across one axis, and the assessment that demonstrates their
achievement of those outcomes on the other axis. In this way
new faculty members can double-check to be certain that each of
the student learning outcomes has been assessed. If we make
clear to students how each assessment furthers the goals of the
course, they are able to make informed choices about how to
spend their limited study time to achieve the course goals.
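The grid suggested above can also be sketched programmatically. This is a minimal illustration with hypothetical outcome and assessment names: it records which assessment covers which learning outcome, then flags any outcome that no assessment demonstrates.

```python
# Hypothetical learning outcomes for a unit on demand and supply.
outcomes = ["define demand", "apply law of supply", "analyze price changes"]

# Coverage grid: assessment -> set of outcomes it demonstrates.
coverage = {
    "midterm exam":  {"define demand", "apply law of supply"},
    "class project": {"analyze price changes"},
}

# Union of everything the assessments cover.
covered = set().union(*coverage.values())

# Outcomes left unassessed indicate a gap in the assessment plan.
unassessed = [o for o in outcomes if o not in covered]
print(unassessed)  # an empty list means every outcome is assessed
```

The same check works in the other direction: iterating over `coverage` shows whether any single assessment carries too many outcomes on its own.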
4. Direct and indirect assessment
Assessment strategies are typically classified as direct, where
actual student behavior is measured or assessed, or indirect,
including things like surveys, focus groups, and similar
activities that gather impressions or opinions about a
program or its learning goals. If student assessment is
embedded in a course, meaning it impacts a course grade, it
is typically taken more seriously.
5. Collect data on student performance
In spite of our best efforts at choosing the appropriate forms of
assessment, and the intervals that best support student learning,
there will be some topics, or units of instruction where students
come up short. If we collect data on these issues, which test
questions are commonly missed, which paper topics are
commonly derailed, what misconceptions some students are
taking away, we can identify weaknesses in instruction and
assessment choices and make adjustments as needed.
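One simple way to collect such data is an item-difficulty analysis. The sketch below uses hypothetical response data: the difficulty index of an item is the proportion of students who answered it correctly, and items below a chosen threshold are the "commonly missed" questions worth reviewing.

```python
# Hypothetical item-response data: rows are students, columns are test
# items (1 = correct, 0 = wrong).
responses = [
    [1, 0, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
]

n_students = len(responses)

# Difficulty index per item: proportion of students answering correctly.
difficulty = [sum(col) / n_students for col in zip(*responses)]

# Flag items answered correctly by fewer than half of the class.
flagged = [i + 1 for i, p in enumerate(difficulty) if p < 0.5]
print(difficulty)  # proportion correct per item
print(flagged)     # item numbers needing review
```

A flagged item may point to weak instruction on that topic, an ambiguous question, or both, which is exactly the signal this step of the cycle is meant to produce.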
6. Revise assessment choices
After analyzing student achievement systematically, we should
begin to see gaps in our teaching or the effectiveness of our
assessments to measure student understanding. This is the time to
modify our assessments and the instruction leading up to them to
better support student learning. Accomplished faculty members
continually revise the ways they assess student knowledge and
skills to close the learning gap. The more students we can move
toward deep understanding of the course topics, the more effective
we are as instructors. The best time to make these revisions is
right after an assessment is evaluated and the results analyzed to
be certain to make changes when the understanding of weaknesses
is fresh in our minds. Throughout this revision process it is
important to maintain high expectations about what students
should know and be able to do.
7. Assessment Primer
Assessment Primer: Describing Your Community, Collecting
Data, Analyzing the Issues and Establishing a Road Map for
Change
8. Creating Assignments and Exams
Set the grading categories, but ask the students to help write the
descriptions. Draft the complete grading scale, then give it to
students for review and suggestions. Determining goals for
the assignment and its essential logistics is a good start
to creating an effective assignment.
2.4. Table of specification and construction of item
What is a table of specifications in test construction?
The table of specifications (TOS) is a tool used to ensure that
a test or assessment measures the content and thinking skills that
the test intends to measure. That is,
a TOS helps test constructors to focus on the issue of response content,
ensuring that the test or assessment measures what it intends to
measure.
A table of specification is an activity which enumerates the
information and cognitive tasks on which examinees are to be
assessed. It defines as clearly as possible the scope and emphasis
of the test and relates the objectives to the content in order to
ensure a balanced set of test items. A table of specification,
sometimes referred to as a test blueprint, is a table that helps
teachers align objectives, instruction and assessment.
A sample table of specification is shown in Table 1 below.
Table 1: Table of specification for a 30-item Economics test for SS2.

Content                                    Remembering  Understanding  Thinking  Total
Consumers behavior & price determination        2             4            3       9
Population                                      2             2            2       6
Money & Inflation                               1             3            2       6
Economics Systems                               1             2            2       5
Principles of Economics                         1             2            1       4
Total                                           7            13           10      30
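The row and column totals in a table of specification like Table 1 can be checked mechanically. The sketch below is illustrative only (the dictionary layout and the function names are my own, not part of the module):

```python
# Table of specification: topic -> items per cognitive level
# (Remembering, Understanding, Thinking), mirroring Table 1.
tos = {
    "Consumers behavior & price determination": [2, 4, 3],
    "Population": [2, 2, 2],
    "Money & Inflation": [1, 3, 2],
    "Economics Systems": [1, 2, 2],
    "Principles of Economics": [1, 2, 1],
}

def level_totals(table):
    """Sum the items in each cognitive-level column across topics."""
    return [sum(col) for col in zip(*table.values())]

def grand_total(table):
    """Total number of items on the test."""
    return sum(sum(row) for row in table.values())
```

Here level_totals(tos) recovers the bottom row of Table 1 (7, 13, 10), and grand_total(tos) confirms the 30 items.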
Table II: General format of a table of specification

Content   Knowledge          Understanding      Application        Total
          (No. and/or %)     (No. and/or %)     (No. and/or %)
Topic 1
Topic 2
Topic 3
Topic 4
Topic 5
Total
At the end of the lesson students should be able to:
1. Define the term consumer’s behavior i.e. demand and supply.
2. State the law of demand and supply.
3. Identify the forces of demand and supply as determinant of
price of goods and services.
Table III: Table of specification for an objective test

Sl No  Content              Recall/Knowledge   Understanding     Application     Total
1      Consumer behavior    17½% (7 items)     20½% (8 items)    7½% (3 items)   45% (18 items)
2      Price determination  12½% (5 items)     17½% (7 items)    5½% (2 items)   35% (14 items)
3      Public finance       0% (no items)      12½% (5 items)    7½% (3 items)   20% (8 items)
4      Total                30% (12 items)     50½% (20 items)   20% (8 items)   100% (40 items)
In an essay test the weighting can be achieved by assigning the
amount of time to be spent on each test item to show the
relative importance of the topics. For instance, if five essay
items are designed to test three subject topics, the
weighting can be assigned in the same proportions as the time
divisions, as can be seen in Table 4.
Topic               Importance   Item         Time
Blacksmithing       35%          Question 1   9 minutes
Missionary journey  25%          Question 2   11 minutes
Photography         40%          Question 3   16 minutes
                                 Question 4   14 minutes
                                 Question 5   10 minutes
1. Teachers are able to determine what topics are being stressed; it
also assists in the preparation of tests that reflect what students
have learnt and limits the amount of time spent on each unit.
2. No important objective or content area will be inadvertently
omitted.
3. The table of specifications can assist immensely in the
preparation of test items, in the production of valid and robust
tests, in the clarification of objectives to both teacher and
students, and in helping the teacher select the most appropriate
teaching strategy.
4. Only those aims and objectives actually involved in the
instructional process will be assessed.
5. Each objective will receive a proportional emphasis on the test
in relation to the emphasis placed on that objective by the teacher.
2.5. Test administration, marking and grading
The traditional approach to assessment of student learning is
formal testing. Still the most widely used of all methods of
assessment, testing has been the center of discussion and debate
among educators for years. The topic of testing includes a large
body of information, some of which will be discussed in the upcoming
section. Basically, testing consists of four primary steps: test
construction, test administration, scoring, and analyzing the test.
Each of these steps can result in a variety of test forms and elicit
a variety of useful outcomes, such as:
° Ideas for lesson plans
° Knowledge of individual students
° Ideas for approaching different students/classes
° Scores for admission
° Indication of teacher effectiveness
° Ways to assess student learning
° Goals for tests
° Suggestions to help students do better on exams
° Descriptions of common testing methods
° Issues to consider when assigning test grades
° Understanding the results, whether or not they agree
with expectations
° Decision-making skills based on results both expected
and unanticipated (application of theory)
° Methods of recording, presenting, and analyzing data,
observations and results (the notebook and final report)
° Performance of physical manipulations (technique)
The following is a schematic of the steps in testing that will
be covered in the rest of this section.
Constructing a test:
There are eight basic steps in constructing a test:
1. Defining the purpose. Before considering content and
procedure, the teacher must first determine who is taking the
test, why the test is being taken, and how the scores will be used.
Furthermore, the teacher should have a rationale for giving a test
at a particular point in the course: Does the test cover a
particular part of the unit content? Or should material currently
being studied be saved and tested at a later time when the entire
section is completed?
2. Listing the topics. Once the purpose and parameters have been
established, specific topics are listed and examined for their
relative importance in the section. This is called representative
sampling.
For example, if the study of crustaceans comprised approximately
10% of all class work in the section to be tested (including class time,
homework, and other assignments), then that topic should comprise
approximately 10% of the test. This can be done either by calculating
the number of questions per topic or by weighting different sections
to match class coverage (see 7. Making a Scoring Key below).
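Representative sampling, as described in step 2, amounts to a simple proportional allocation. The sketch below is a hedged illustration (the function name and the use of rounding are my own choices):

```python
def items_per_topic(topic_weights, total_items):
    """Allocate test items to topics in proportion to class coverage.

    topic_weights: dict mapping topic -> fraction of class work
    (class time, homework, other assignments), summing to 1.0.
    Rounding can leave the total a question short or over;
    adjust the final counts by hand.
    """
    return {topic: round(weight * total_items)
            for topic, weight in topic_weights.items()}
```

With crustaceans at 10% of class work on a 40-item test, items_per_topic assigns that topic 4 questions.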
3. Listing types of questions. Different types of material call for
different types of test questions. While multiple choice questions
might adequately test a student's knowledge of mathematics,
essays reveal more about a student's understanding of literature or
philosophy. Thus, in deciding what types of test questions to use
(short answer, essay, true/false, matching, multiple choice, etc.)
the following advantages and disadvantages should be kept in
mind:
Type: Short Answer
Advantages: Can test many facts in a short time; fairly easy to
score; excellent format for math; tests recall.
Disadvantages: Often ambiguous; difficult to measure complex
learning.

Type: Essay
Advantages: Can test complex learning; can evaluate thinking
process and creativity.
Disadvantages: Difficult to score objectively; uses a great deal of
testing time; subjective.

Type: True/False
Advantages: Tests the most facts in the shortest time; easy to
score; tests recognition; objective.
Disadvantages: Difficult to measure complex learning; difficult to
write reliable items; subject to guessing.

Type: Matching
Advantages: Excellent for testing associations and recognition of
facts; although terse, can test complex learning (especially
concepts); objective.
Disadvantages: Difficult to write good items; subject to process of
elimination.

Type: Multiple Choice
Advantages: Can evaluate learning at all levels of complexity; can
be highly reliable; objective; tests a fairly large knowledge base
in a short time; easy to score.
Disadvantages: Difficult to write; somewhat subject to guessing.
In choosing types of questions to be used on a test, it is also
important to consider the following points:
° Classroom conditions can automatically eliminate certain
types of questions. Since answers to multiple choice questions
can be easily copied in an overcrowded classroom, they might
not be an accurate measure of student learning. Likewise, if
blackboards are the only media available for presenting the test,
long questions and textual references might be impossible to
include on the test.
° Considerations regarding administration and scoring often
dictate the type of questions to be included on a test. Numbers
of students, time constraints, and other factors might
necessitate the use of questions which can be administered and
scored quickly and easily.
The types of knowledge being tested should be considered in the
assessment process. A simplified checklist could be used by the
teacher to determine if students have been assessed in all
relevant areas. This could take the form of a graph such as the
one which follows:
TOPICS TO BE TESTED                     FACTS  SKILLS  CONCEPTS  APPLICATION
Verbs: Conjugation of "to be"             x
Pronunciation: Short "a"                         x
Use of Modals: Should, Must, Ought to                      x
Free Expression                                                       x
4. Writing items. Once purpose, topics and types of questions have
been determined, the teacher is ready to begin writing the specific
parts, or items, of the test. Initially, more items should be written
than will be included on the test. When writing items, the following
guidelines should be followed:
° Cover important material. No item should be included on
a test unless it covers a fact, concept, skill or applied principle
that is relevant to the information covered in class (see 3. Listing
types of questions above).
° Items should be independent. The answer to one item should not be
found in another item; correctly answering one item should not be
dependent on correctly answering a previous item. (This guideline
might not apply in some cases. For example, a math test might begin
by testing simple skills and then test their integration. In all cases, the
teacher should be aware of what is being tested at each level and use
this strategy sparingly.)
° Write simply and clearly. Use only terms and examples
students will understand and eliminate all nonfunctional words.
° Be sure students know how to respond. The item should define
the task clearly enough that students who understand the
material will know what type of answer is required and how to
record their answers. For example, on essay questions, the
teacher may specify the length and scope of the answer required.
° Include questions of varying difficulty. Tests should include at
least one question that all students can answer and one that few,
if any, can answer. Tests should be designed to go from the
easiest to most difficult items so as not to immediately
discourage the weaker students.
° Be flexible. No one type of item is best for all situations or all
types of material. Whenever feasible, any test should contain
several types of items.
5. Reviewing items Regardless of how skilled the teacher is, not all
his/her first efforts will be perfect or even acceptable. It is
therefore important to review all items, revising the good and
eliminating the bad. Finally, all items should be evaluated in terms
of purpose, standardization, validity, practicality, efficiency, and
fairness (see 8. Evaluating a Test below).
6. Writing directions. Clear and concise directions should be
written for each section. Whenever possible, an example of a
correctly answered test item should be provided as a model. If
there is any question as to the clarity of the directions, the teacher
should "try them out" on someone else before giving the exam.
7. Devising a scoring key. While the test items are fresh in his/her
mind, the teacher should make a scoring key -- a list of correct
responses, acceptable variations, and weights assigned to each
response (see Scoring below). In order to assure representative
sampling, all items should be assigned values at this time. For
example, if "factoring" comprised 50% of class material to be tested
and only 25% of the total number of test questions, each question
should be assigned double value.
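The weighting rule in step 7 (a topic's share of total points should match its share of class material) reduces to a one-line calculation. A minimal sketch; the function name is mine:

```python
def item_weight(material_share, question_share):
    """Relative value per item so that a topic's share of total points
    matches its share of class material.
    E.g. 50% of material but only 25% of the questions -> each of
    that topic's items counts double."""
    return material_share / question_share
```

In the "factoring" example above, item_weight(0.50, 0.25) gives 2.0, i.e. double value per question.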
8. Evaluating a test. All methods of assessing student learning
should achieve the same thing: the clear, consistent and systematic
measurement of a behavior or something that is learned. Once a test
has been constructed, it should be reviewed to ensure that it meets
six specific criteria: clarity, consistency, validity, practicality,
efficiency, and fairness.
The following is a checklist of questions that should be
asked after the test (or any assessment activity) has been
prepared and before it is administered:
A CLEARLY DEFINED PURPOSE
  Who is being assessed?
  What material is the test (or activity) measuring?
  What kinds of knowledge or skills is the test (or activity) measuring?
  Do the tasks or test items relate to the objectives?
STANDARDIZATION OF CONTENT
  Are content, administration, and scoring consistent in all groups?
VALIDITY
  Is this test (or activity) a representative sampling of the material
  presented in this section?
  Does this test (or activity) faithfully reflect the level of difficulty
  of material covered in the class?
PRACTICALITY AND EFFICIENCY
  Will the students have enough time to finish the test (or activity)?
  Are there sufficient materials available to present the test or
  complete the activity effectively?
  What problems might arise due to structural or material difficulties
  or shortages?
FAIRNESS
  Did the teacher adequately prepare students for this activity/test?
  Were they given advance notice?
  Did they understand the testing procedure?
  How will the scores affect the students' lives?
Administering a test:
Once the items, directions, and answer key have been written, the
teacher should consider the manner in which the test will be
presented. Factors such as duplication, visual aids, and
use of the blackboard should be considered in advance to ensure
clarity in presentation as well as to avoid technical difficulties.
Establish Classroom Policy:
Because discipline is a major factor in test administration, the
teacher must establish a classroom policy concerning such matters as
tardiness, absences, make-ups, leaving the room, and cheating (see
Classroom Management). The teacher must also advise students of
procedural rules such as:
° What to do if they have any questions.
° What to do when they are finished taking the test.
° What to do if they run out of paper, need a new pen, etc.
° What to do if they run out of time.
The teacher should always be aware of the effect of testing
conditions on testing outcomes. Physical shortcomings should be
alleviated wherever possible. If some students cannot see the
blackboard, they should be allowed to move to a better location. If
students are cramped into benches, more benches should be brought
in and students should be spread out. If this is not possible, two
separate tests can be written and distributed to students on an
alternating basis.
Similarly, psychological conditions can inhibit optimal performance.
Such factors as motivation, test anxiety, temporary states (everyone
has a bad day once in a while), and long-term changes can
profoundly affect the test-taker and therefore his/her performance
on the test. It is therefore the teacher's responsibility to establish an
official, yet not oppressive, atmosphere in the testing room to
maximize student performance.
Teaching Test-Taking Techniques:
Students often fail tests not because they do not know the
material but because they do not understand the procedures
and techniques for successful test-taking. If a test is to be as
fair as possible, students must understand both test-taking
procedures and techniques. This means that the teacher should
familiarize his/her students with:
° The type of test to be given (e.g. diagnostic, proficiency,
achievement, etc.) and how to study for it.
° The types of items which will appear on the test and how to
respond to them (e.g. matching, fill in the blank, essay
questions, etc.).
° The types of directions commonly accompanying certain types
of test items.
° Strategies for successful test-taking (e.g. time management,
the process of elimination, guessing, etc.).
Grading a test:
In order to determine how well a student performed on a test
or in an activity, a specific value must be assigned to each test
item or activity component. Then, raw scores must be derived
and, if necessary, transformed to fit the requirements of
testing within specific contexts.
Obtaining Raw Scores
The first step in determining how well a student performed on
a test or in an activity is to derive a raw score, or number of
items answered correctly. Hence, if a student answers eight
out of ten items correctly, his/her raw score is eight.
Transforming Raw Scores:
While grades in many countries are determined based on 100
points, grading in countries following the French model is based
on a system of 20 points. In order to make tests match such a
predetermined number, raw scores must be transformed into
fractions, decimals, or multiples of their raw value. For
example, say the desired result is a score over 20, but a test
includes 30 questions. If all questions are of equal
importance and difficulty, they can be considered as
fractions ( 2/3 pt. each) or as decimals (.66 each). Likewise,
if a test has only 10 questions, each can be multiplied by two
to obtain a score over 20.
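The transformations just described all reduce to one proportional rescaling. A minimal sketch, assuming all items carry equal value (the function name is my own):

```python
def transform_score(raw, total_items, scale=20):
    """Rescale a raw score onto a predetermined scale such as the
    French /20: each item is worth scale / total_items points."""
    return raw * scale / total_items
```

For the 30-question test above, each item is worth 20/30 = 2/3 point, so a raw score of 24 becomes 16/20; for a 10-question test, each raw point is simply doubled.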
Cross-Cultural Considerations:
In general, grading is much harsher in many countries than in
the United States. Students rarely, if ever, achieve perfect or
even near perfect scores on tests or as a final grade. In
countries following the British model, a passing grade is
50/100 or better; in the French model, 10/20 or better. It is
therefore inappropriate, for example, to give even the best
students a grade higher than 80% (British) or 16/20 (French).
In fact, your school administration, fellow teachers, and
students will be bewildered and even angry if you deviate from
this strict rule. Remember: 50% or 10/20 reflects an adequate
performance, equivalent to the U.S. 70% or C. It is, therefore,
important when designing a test that you include items of
sufficient difficulty to reflect this grading tradition.
Weighting Test Items:
In the event that some questions are more important or more
difficult than others, they can be weighted; that is, some
questions can be considered of double value (in the example
above, 1 point each) and others of less value (1/4 point, or .25).
In other words, as long as the total value for the test equals the
predetermined number required, individual item values can be
juggled as the teacher sees fit (see table on coefficients under
Norm-Referenced Scoring below).
Deriving Percentages:
By transforming raw scores into percentages, the teacher can
compare tests of varying length and difficulty or tests of varying
amounts of points on equal terms.
If all items on a test are worth the same amount, the percentage
correct can be determined by dividing the number of correct items
by the total number of items, then multiplying by 100%:
Percent correct = (Number of items correct ) / (Total number of
items) x 100%
If the items are of different weight, the percentage correct can be
determined by dividing the number of points earned by the
maximum number of points, then multiplying by 100%:
Percent correct = (Points earned) / (Maximum number of points) x
100%
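Both formulas can be written directly in code; the sketch below is a straightforward transcription (function names are mine):

```python
def percent_correct_equal(items_correct, total_items):
    """Percent correct when all items carry the same value."""
    return items_correct / total_items * 100

def percent_correct_weighted(points_earned, max_points):
    """Percent correct when items carry different weights."""
    return points_earned / max_points * 100
```

For example, 8 of 10 equally weighted items gives 80%, and 45 of a possible 60 points gives 75%.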
Assuring Objectivity:
As with test construction, the key to successful test scoring is
objectivity. By setting certain standards and prescribing certain
rules, the teacher can be sure that scoring has been objective and
students have been treated fairly. Three techniques are
particularly helpful in assuring objectivity:
° Immediate scoring & recording
° Using a scoring key
° Having a procedure for comparing responses to the key
Perhaps it seems self-evident, but immediate scoring and
recording of scores can do much to alleviate misunderstanding
and bias. The more time that goes by between test-taking and
scoring, the greater the chances of forgetting relevant
information or losing papers altogether.
More importantly, the sooner the students get their tests back,
the more meaningful their performance on the test. It does little
good to return a test months after it has been taken when
students have to review the material tested just to remember
why they answered the way they did.
Using a scoring key can make scoring papers go quickly while
reducing the possibility of error and bias. It can also simplify and
standardize the process of scoring if numerous people will be
scoring the test. Having a procedure for comparing responses to
the key can also speed up the scoring process and increase
objectivity. For example, the teacher can:
° Scan several papers before starting scoring to get a baseline
view of the type and level of responses.
° Grade a sample of papers twice to see if he/she is, in fact,
grading consistently.
° Score papers anonymously so as not to be influenced by
students' performance in other aspects of the course (this can
be done by assigning numbers beforehand, folding the tops of
test papers back, etc.).
° Grade items one at a time -- that is, first grade all answers
to item 1, then all responses to item 2, and so on (this
technique is particularly useful with essay tests where it is
important to look for key points in each response).
Analysing test results:
Once test papers have been scored, they can then be analyzed
in numerous ways to provide the teacher with information
about student performance. For example, a student's tests
from one semester can be ranked to show relative areas of
strength and weakness; averaged class scores on a given test
can be ranked to compare one class's performance to that of
another. Such information is important for making decisions
about lesson planning and future testing as well as knowing
how to approach different students and classes.
In order to analyze anything, specific criteria must be
established. In test analysis, three different criteria are
generally used: the content of the test, the norm group taking
the test, or an individual student.
Criterion-Referenced Scoring:
Criterion-referenced scoring uses the content of the test itself as
the basis of comparison for assessing the student's level of
achievement. Thus, a content-referenced score of 80% means
that the student correctly answered 80% of the items on the test.
The most common of all methods of test analysis, content-
referenced scoring is used:
° To determine the level of achievement at which to begin a
student;
° To determine how much a student has learned from a given
section of material; and
° To determine a student's potential in a given field.
Norm-Referenced Scoring:
Sometimes referred to as "grading on a curve," norm-referenced
scoring uses the class as a whole as a referent. The class average,
or mean, usually serves as the base score against which all other
grades are judged. The mean is calculated by adding all the scores
and then dividing by the number of scores given (e.g., if the total
of test scores in a class of 25 equals 1625, the class average, or
mean, is 1625/25, or 65).
Some schools require certain percentages of passing grades per
class. If these percentages are exceeded, the teacher is seen as
"too easy"; conversely, if these percentages are not met, students
can become indignant and discipline problems can result.
In these instances, it is important to be able to adjust students'
scores so that official standards can be met. Such adjustments
can be made by:
° Adding (or subtracting) points to students' overall scores
° Adding (or subtracting) points to sections in which students
scored the highest
° Making the next test easier (or harder)
° Weighting the test lightly (or heavily) on the semester-end
grade by multiplying each test by an appropriate amount, or
coefficient. For example, in the table below, Panafricanism is
weighted three times, which gives the student an end of term
score of 84%.
TEST                  SCORE  RELATIVE WEIGHT  SCORE WITH WEIGHTING
                             (coefficient)    (coefficient)
Pre-colonial Africa    50%        1                 50
Neo-colonial Africa    85%        1                 85
Panafricanism          95%        3                285
TOTAL                             5                420    420/5 = 84%
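The coefficient-weighted end-of-term score in the table above is a weighted mean. A small sketch (the function name and the pair layout are my own):

```python
def weighted_term_score(scores_and_coefficients):
    """End-of-term score: sum of (score x coefficient) divided by
    the sum of the coefficients."""
    total = sum(score * coeff for score, coeff in scores_and_coefficients)
    weights = sum(coeff for _, coeff in scores_and_coefficients)
    return total / weights
```

Calling weighted_term_score([(50, 1), (85, 1), (95, 3)]) reproduces the 420/5 = 84% shown in the table.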
Self-Referenced Scoring:
Though it is difficult to do in large
classes, self-referenced scoring measures
an individual student's rate of progress
relative to his or her own past
performance. By comparing past test
scores, a teacher can assess a student's
rate of progress in a given subject area or
across subjects to see where he/she is in
need of help.
The advantages and disadvantages of Criterion-, Norm- and Self-
Referenced scoring are listed below:
Type of Grading: Norm-referenced
Advantages:
1. Allows for comparisons among students.
2. Classes can be compared to other classes.
3. Allows the teacher to spot students who are dropping behind
the class.
Disadvantages:
1. If the whole class does well, some students still get poor grades.
2. If the class as a whole does poorly, a good grade could be
misleading.
3. Does not allow individual progress or individual circumstances
to be considered.
4. The whole class (or large portions of it) must be evaluated in
the same way.
5. Everyone in the class (or norm group) must be evaluated with
the same instrument under the same conditions.

Type of Grading: Criterion-referenced
Advantages:
1. Helps the teacher to decide if students are ready to move on.
2. Criteria are independent of group performance.
3. Works well in a mastery-learning setting.
4. Each individual can be evaluated on different material depending
on his or her level of achievement.
Disadvantages:
1. It is difficult to develop meaningful criteria (therefore arbitrary
cut-off scores are often used).
2. Presents unique problems in computing the reliability of
criterion-referenced tests.
3. Makes it difficult to make comparisons among students.

Type of Grading: Self-referenced
Advantages:
1. Allows you to check student progress.
2. Makes it possible to compare achievement across different
subjects for the same individual.
Disadvantages:
1. All measures taken on an individual must be taken with similar
instruments under similar circumstances.
2. Does not help you to compare an individual with his or her peers.
Percentile Ranking:
Just as the raw scores for individual test items can be
transformed to fit a certain testing model (e.g. Francophone
testing: score/20), so can one set of test results be analyzed
in relation to previous tests as well as other classes'
performances. Percentile ranks offer a way to obtain an
image of class performance on a test by calculating the
percentage of persons who obtain lower scores. To obtain a
percentile rank, divide the number of students below the
passing grade by the total number of students who took the
test. For example, if 10 students out of 30 get passing scores
(50% and above), then the percentile ranking for that test
would be 66%; that is, 66% of that class scored below the
passing grade.
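The calculation just described can be sketched as follows. The function name is mine, and integer truncation is chosen to match the 66% in the worked example (20 of 30 students below passing):

```python
def percent_below_passing(scores, passing=50):
    """Percent of the class scoring below the passing grade,
    truncated to a whole number."""
    below = sum(1 for score in scores if score < passing)
    return below * 100 // len(scores)
```

With 10 of 30 students at or above 50, the remaining 20 give 20 * 100 // 30 = 66.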
Charting Student Performance:
Just as percentile ranking can give a teacher a comparative measure
of class performance, charting the results of a test can give the
teacher an internal picture of how his/her class has performed as a
whole. The graph below, for example, clearly and graphically
illustrates that the majority of the students in the class failed the
test.
[Figure: chart of student performance on a test]
To chart student performance:
1. Tally the number of students who obtain each score (e.g.
4 students at 4/20, or 20/100; 16 students at 8/20, or
40/100).
2. Plot each number on a chart as illustrated above.
3. Draw a vertical line intersecting the passing grade. (In the
French system 10/20 is passing; in the British system
50/100 is passing.)
The teacher can obtain a visual comparison of class
performance over a semester or a year by superimposing
charted results of multiple tests.
Unit Three: Item Analysis
3.1. Item difficult level
3.2. Item discrimination index
3.3. Item Banking
3.1 Item difficulty level
How difficult do you think a test should be? How do we determine
the difficulty level of test items? Why is it important to know the
difficulty level of test items?
Item difficulty index is one of the most useful, and most frequently
reported, item analysis statistics. It is a measure of the proportion
of examinees who answered the item correctly; for this reason it is
frequently called the p-value. If scores from all students in a group
are included the difficulty index is simply the total percent correct.
When there is a sufficient number of scores available (i.e., 100 or
more) difficulty indexes are calculated using scores from the top
and bottom 27 percent of the group.
Item Analysis Procedures:
1.Rank the papers in order from the highest to the lowest scores.
2.Select one-third of the papers with the highest total scores and
another one-third of the papers with lowest total scores.
3.For each test item, tabulate the number of students in the upper
and lower groups who selected each option.
4.Compute the difficulty of each item (% of students who got the
right item)
The item difficulty index can be calculated using the following formula:
P = (Successes in the HSG + Successes in the LSG) / (N in HSG + LSG)
Where, HSG = high scoring group
LSG = low scoring group
N = the total number of students in the HSG and LSG
The difficulty indexes can range between 0.0 and 1.0 and are
usually expressed as a percentage. A higher value indicates that a
greater proportion of examinees responded to the item correctly,
and it was thus an easier item. The average difficulty of a test is
the average of the individual item difficulties. For maximum
discrimination among students, an average difficulty of .60 is ideal.
For example: If 243 students answered item no. 1 correctly and 9
students answered incorrectly, the difficulty level of the item
would be 243/252 or .96.
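The p-value is a single proportion, so the example above can be checked with a one-line function (the name is my own):

```python
def difficulty_index(num_correct, num_examinees):
    """Item difficulty (p-value): proportion of examinees who
    answered the item correctly."""
    return num_correct / num_examinees
```

Rounding difficulty_index(243, 252) to two places gives the .96 of the example.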
In the example below, five true-false questions were part of a larger
test administered to a class of 20 students. For each question, the
number of students answering correctly was determined, and then
converted to the percentage of students answering correctly.
Question  Correct responses  Item difficulty
1               15           75% (15/20)
2               17           85% (17/20)
3                6           30% (6/20)
4               13           65% (13/20)
5               20           100% (20/20)
Activity: Calculate the item difficulty level for the following
four options multiple choice test item. (The sign (*) shows the
correct answer).
                   Response Options
Groups        A    B    C    D*   Total
High Scorers  0    1    1    8    10
Low Scorers   1    1    5    3    10
Total         1    2    6    11   20
3.2 Item Discrimination Index
To what extent do you think a test item should discriminate
between higher achievers and lower achievers? Should it be highly
discriminating, averagely discriminating, or less discriminating?
The index of discrimination is a numerical indicator that enables
us to determine whether the question discriminates appropriately
between lower scoring and higher scoring students.
When students who earn high scores are compared with those who
earn low scores, we would expect to find more students in the high
scoring groups answering a question correctly than students from
the low scoring group.
In the case of very difficult items which no one in either group
answered correctly, or fairly easy questions which even the
students in the low group answered correctly, the numbers of
correct answers might be equal for the two groups.
What we would not expect to find is a case in which the low
scoring students answered correctly more frequently than
students in the high group.
D = (Successes in the HSG − Successes in the LSG) / (½ × (N in HSG + N in LSG))
Where, HSG = High Scoring Group
LSG = Low Scoring Group
In the example, there are 8 students in the high scoring group
and 8 in the low scoring group (with 12 between the two groups
who are not represented). For question 1, all 8 in the high scoring
group answered correctly, while only 4 in the low scoring group
did so. Thus successes in the HSG minus successes in the LSG is
8 − 4 = +4. The last step is to divide the +4 by half of the total
number of both groups (½ × 16 = 8).
Thus 4/8 gives us +.5, which is the D value.
Question  Success in the HSG  Success in the LSG  Difference  D value
1                8                   4            8 − 4 = 4     .5
2                7                   2
3                5                   6
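The D formula and the worked example for question 1 can be sketched in code (the function name is my own):

```python
def discrimination_index(correct_hsg, correct_lsg, n_hsg, n_lsg):
    """D = (successes in the HSG - successes in the LSG) divided by
    half the combined size of the two groups."""
    return (correct_hsg - correct_lsg) / ((n_hsg + n_lsg) / 2)
```

Calling discrimination_index(8, 4, 8, 8) reproduces the +.5 found for question 1.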
Activity 2: Calculate the item discrimination index for
questions 2 & 3 in the table above.
The item discrimination index
can vary from -1.00 to +1.00. A negative discrimination index
(between -1.00 and zero) results when more students in the low
group answered correctly than students in the high group. A
discrimination index of zero means equal numbers of high and low
students answered correctly, so the item did not discriminate
between groups.
A positive index occurs when more students in the high group
answer correctly than the low group. If the students in the class
are fairly homogeneous in ability and achievement, their test
performance is also likely to be similar, resulting in little
discrimination between high and low groups.
Questions that have an item difficulty index (NOT item discrimination)
of 1.00 or 0.00 need not be included when calculating item
discrimination indices. An item difficulty of 1.00 indicates that
everyone answered correctly, while 0.00 means no one answered
correctly. We already know that neither type of item discriminates
between students.
When computing the discrimination index, the scores are divided into
three groups, with the top 27% of the scores in the upper group,
the bottom 27% in the lower group, and the middle 46% set aside.
The number of correct responses for an item by the lower
group is subtracted from the number of correct responses for
the item in the upper group. The difference is divided by the
number of students in either group. The process is repeated
for each item.
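The grouping step described above can be sketched as follows (the function name and the rounding rule for group size are illustrative assumptions, not from the text):

```python
def split_groups(scores, fraction=0.27):
    """Return (upper, lower): the top and bottom `fraction` of scores."""
    ordered = sorted(scores, reverse=True)
    n = max(1, round(len(ordered) * fraction))
    return ordered[:n], ordered[-n:]

# 10 scores -> round(10 * 0.27) = 3 students in each group
scores = [95, 88, 84, 80, 76, 70, 65, 60, 52, 40]
upper, lower = split_groups(scores)
print(upper)  # [95, 88, 84]
print(lower)  # [60, 52, 40]
```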
The value is interpreted in terms of both:
 direction (positive or negative) and
 strength (non-discriminating to strongly-
discriminating).
These values can range from -1.00 to +1.00.
Item discrimination interpretation
D-Value         Direction   Strength
> +.40          positive    strong
+.20 to +.40    positive    moderate
-.20 to +.20    none        ---
< -.20          negative    moderate to strong
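The interpretation table can be expressed as a small helper function. The thresholds come directly from the table; how the exact boundary values (±.20, +.40) are classified is an assumption, since the table leaves them ambiguous:

```python
def interpret_d(d):
    """Map a discrimination index to (direction, strength) per the table."""
    if d > 0.40:
        return ("positive", "strong")
    if d >= 0.20:                 # boundary handling assumed
        return ("positive", "moderate")
    if d > -0.20:
        return ("none", "---")
    return ("negative", "moderate to strong")

print(interpret_d(0.5))   # ('positive', 'strong')
print(interpret_d(-0.5))  # ('negative', 'moderate to strong')
```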
For a small group of students, an index of discrimination for an
item that exceeds .20 is considered satisfactory. For larger groups,
the index should be higher because more difference between groups
would be expected. The guidelines for an acceptable level of
discrimination depend upon item difficulty. For very easy or very
difficult items, low discrimination levels would be expected; most
students, regardless of ability, would get the item correct or
incorrect as the case may be. For items with a difficulty level of
about 70 percent, the discrimination should be at least .30.
When an item is discriminating negatively, overall the most
knowledgeable examinees are getting the item wrong and the least
knowledgeable examinees are getting the item right. A negative
discrimination index may indicate that the item is measuring
something other than what the rest of the test is measuring. More
often, it is a sign that the item has been mis-keyed.
3.3 Item Banking
Building a file of effective test items and assessment tasks
involves recording the items or tasks, adding information
from analyses of students' responses, and filing the records
by both the content area and the objective that the item or
task measures. Thus, items and tasks are recorded as they are
constructed; information from the analysis of students'
responses is added after the items and tasks have been used,
and then the effective items and tasks are deposited in the
file. In a few years, it is possible to start using some of
the items and tasks from the file and supplement these with
new items and tasks.
As the file grows, it becomes possible to select the majority of the
items and tasks from the file for any given test or assessment
without repeating them frequently. Such a file is especially
valuable in areas of complex achievement, where the construction
of test items and assessment tasks is difficult and time consuming.
When enough high-quality items and tasks have been assembled,
the burden of preparing tests and assessments is considerably
lightened. Computerized item banking makes the task even easier.
Summary:
In this unit you learned how to judge the quality of a classroom
test by carrying out item analysis, which is the process of "testing
the item" to ascertain whether the item is functioning properly in
measuring what the entire test is measuring.
You also learned about the process of item analysis: how to
compute item difficulty and item discriminating power, and how to
evaluate the effectiveness of distracters. You have learned that item
difficulty indicates the percentage of testees who get the item right;
item discriminating power is an index which indicates how well an
item is able to distinguish between high achievers and low
achievers given what the test is measuring; and the power of a
distracter is its ability to differentiate between those who know
and those who do not know what the item is measuring.
Finally, you learned that after conducting item analysis, items
may still be usable after modest changes are made to improve
their performance on future exams. Thus, good test items should
be kept in test item banks, and in this unit you were given
highlights on how to build a Test Item File/Item Bank.
Unit Four: Ethical Standards of Assessment
4.1 Ethical and professional standards of
assessment and its use
4.2 Race, ethnicity, gender, religion and culture
in assessment and test
4.1 Ethical Standards of assessment
Ethical standards guide teachers in fulfilling their obligation to
provide and use tests that are fair to all test takers regardless of
age, gender, disability, ethnicity, religion, linguistic background, or
other personal characteristics.
Fairness is a primary consideration in all aspects of testing:
 It helps to ensure that all test takers are given a comparable
opportunity to demonstrate what they know and how they
can perform in the area being tested.
 Implies that every test taker has the opportunity to prepare
for the test and is informed about the general nature and
content of the test.
 Also extends to the accurate reporting of individual and
group test results.
The following are some ethical standards that teachers may
consider in their assessment practices.
1. Teachers should be skilled in choosing assessment methods
appropriate for instructional decisions. Skills in choosing
appropriate, useful, administratively convenient, technically
adequate, and fair assessment methods are prerequisite to good
use of information to support instructional decisions. Teachers
need to be well-acquainted with the kinds of information
provided by a broad range of assessment alternatives and their
strengths and weaknesses. In particular, they should be familiar
with criteria for evaluating and selecting assessment methods in
light of instructional plans.
2. Teachers should develop tests that meet the intended
purpose and that are appropriate for the intended test
takers. This requires teachers to:
Define the purpose for testing, the content and skills
to be tested, and the intended test takers.
Develop tests that are appropriate with content,
skills tested, and content coverage for the intended
purpose of testing.
Develop tests that have clear, accurate, and complete
information.
Develop tests with appropriately modified forms or
administration procedures for test takers with
disabilities who need special accommodations.
3. The teacher should be skilled in administering, scoring and
interpreting the results from diverse assessment methods. It is not
enough that teachers are able to select and develop good
assessment methods; they must also be able to apply them
properly. This requires teachers to:
Follow established procedures for administering tests in a
standardized manner.
Provide and document appropriate procedures for test takers
with disabilities who need special accommodations or those
with diverse linguistic backgrounds.
Protect the security of test materials, including eliminating
opportunities for test takers to obtain scores by fraudulent
means.
Develop and implement procedures for ensuring the
confidentiality of scores
4. Teachers should be skilled in using assessment results when making
decisions about individual students, planning teaching, developing
curriculum, and school improvement.
Assessment results are used to make educational decisions at several
levels: in the classroom about students, in the community about a
school and a school district, and in society, generally, about the
purposes and outcomes of the educational enterprise. Teachers play a
vital role when participating in decision-making at each of these
levels and must be able to use assessment results effectively.
5. Teachers should be skilled in developing valid pupil grading
procedures which use pupil assessments. Grading students is an
important part of professional practice for teachers. Grading is defined
as indicating both a student's level of performance and a teacher's
valuing of that performance. The principles for using assessments to
obtain valid grades are known and teachers should employ them.
6. Teachers should be skilled in communicating assessment
results to students, parents, other lay audiences, and other
educators. Teachers must routinely report assessment results to
students and to parents or guardians. In addition, they are
frequently asked to report or to discuss assessment results with
other educators and with diverse lay audiences.
If the results are not communicated effectively, they may be
misused or not used. To communicate effectively with others on
matters of student assessment, teachers must be able to use
assessment terminology appropriately and must be able to
articulate the meaning, limitations, and implications of
assessment results.
Furthermore, teachers will sometimes be in a position that will
require them to defend their own assessment procedures and
their interpretations of them.
At other times, teachers may need to help the public to interpret
assessment results appropriately.
7. Teachers should be skilled in recognizing unethical, illegal, and
otherwise inappropriate assessment methods and uses of assessment
information.
Fairness, the rights of all concerned, and professional ethical
behavior must undergird all student assessment activities, from the
initial planning for and gathering of information to the
interpretation, use, and communication of the results.
Teachers must be well-versed in their own ethical and legal
responsibilities in assessment.
In addition, they should also attempt to have the inappropriate
assessment practices of others discontinued whenever they are
encountered.
8. Teachers should also participate with the wider educational
community in defining the limits of appropriate professional
behavior in assessment.
In addition, the following are principles of grading that can
guide the development of a grading system.
1. The system of grading should be clear and understandable (to
parents, other stakeholders, and most especially students).
2. The system of grading should be communicated to all
stakeholders (e.g., students, parents, administrators).
3. Grading should be fair for all students regardless of gender,
socioeconomic status or any other personal characteristics.
4. Grading should support, enhance, and inform the instructional
process.
4.2 Race, ethnicity, gender, religion and culture in
assessment and test
In the previous section we have learned that fairness is the
fundamental principle that has to be followed in teachers’
assessment practices.
It has been said that all students have to be provided with equal
opportunity to demonstrate the skills and knowledge being
assessed. Fairness is fundamentally a socio-cultural, rather than a
technical, issue.
Thus, in this section we are going to see how culture and
ethnicity may influence teachers' assessment practices and what
precautions we have to take in order to avoid bias and be
accommodative to students from all cultural groups.
Students represent a variety of cultural and linguistic
backgrounds. If the cultural and linguistic backgrounds are
ignored, students may become alienated or disengaged from the
learning and assessment process. Teachers need to be aware of
how such backgrounds may influence student performance and
the potential impact on learning. Teachers should be ready to
provide accommodations where needed.
Classroom assessment practices should be sensitive to the
cultural and linguistic diversity of students in order to obtain
accurate information about their learning.
Assessment practices that attend to issues of cultural diversity
include those that
 Acknowledge students' cultural backgrounds.
 Are sensitive to those aspects of an assessment that may
hamper students’ ability to demonstrate their knowledge and
understanding.
 Use that knowledge to adjust or scaffold assessment practices
if necessary.
Assessment practices that attend to issues of linguistic
diversity include those that
 Acknowledge students’ differing linguistic abilities.
 Use that knowledge to adjust or scaffold assessment
practices if necessary.
 Use assessment practices in which the language demands do
not unfairly prevent the students from understanding what is
expected of them.
Use assessment practices that allow students to accurately
demonstrate their understanding by responding in ways that
accommodate their linguistic abilities, if the response method is
not relevant to the concept being assessed (e.g., allow a student
to respond orally rather than in writing).
Disability and Assessment Practices:
It is quite obvious that our education system has been exclusionary,
failing to fully accommodate the educational needs of students with
disabilities. This has been true not only in our country but in the
rest of the world as well, although the magnitude may differ from
country to country.
It was in response to this situation that UNESCO has been
promoting the principle of inclusive education to guide the
educational policies and practice of all governments. Different world
conventions were held and documents signed towards the
implementation of inclusive education.
Our country, Ethiopia, has been a signatory of these documents and
therefore has accepted inclusive education as a basic principle to
guide its policy and practice in relation to the education of disabled
students.
UN Convention on the Rights of Persons with Disabilities (2006)
Activity: One group should work on one convention; the documents
can be found on the internet.
Inclusive education is based on the idea that all students,
including those with disabilities, should be provided with the
best possible education to develop themselves. This implies the
provision of all possible accommodations to address the
educational needs of disabled students. Accommodations
should not refer only to the teaching and learning process;
they should also cover the assessment mechanisms and
procedures.
There are different strategies that can be considered to make
assessment practices accessible to students with disabilities
depending on the type of disability. In general terms, however, the
following strategies could be considered in summative assessments:
 Modifying assessments: This should enable disabled students to
have full access to the assessment without giving them any unfair
advantage.
 Others' support: Disabled students may need the support of
others in certain assessment activities which they cannot do
independently. For instance, they may require readers and
scribes in written exams; they may also need others' assistance in
practical activities, such as using equipment, locating materials,
drawing and measuring.
 Time allowances: Disabled students should be given additional
time to complete their assessments which the individual
instructor has to decide based on the purpose and nature of the
assessment.
 Rest breaks: Some students may need rest breaks during the
examination. This may be to relieve pain or to attend to
personal needs.
 Flexible schedules: In some cases disabled students may require
flexibility in the scheduling of examinations. For example, some
students may find it difficult to manage a number of
examinations in quick succession and need to have
examinations scheduled over a period of days.
 Alternative methods of assessment: In certain situations where
formal methods of assessment may not be appropriate for
disabled students, the instructor should assess them using
non-formal methods such as class work, portfolios, oral
presentations, etc.
 Assistive Technology: Specific equipment may need to be
available to the student in an examination. Such arrangements
often include the use of personal computers, voice activated
software and screen readers.
Gender issues in assessment:
Do you feel that gender has any influence in teachers’
assessment practices? Is there any gender-related stereotype
in relation to assessment results?
Teachers’ assessment practices can also be affected by gender
stereotypes. The issues of gender bias and fairness in
assessment are concerned with differences in opportunities
for boys and girls.
A test is biased if boys and girls with the same ability levels
tend to obtain different scores.
Test questions should be checked for:
 material or references that may be offensive to members of
one gender,
 references to objects and ideas that are likely to be more
familiar to men or to women,
 unequal representation of men and women as actors in test
items or representation of members of each gender only in
stereotyped roles.
If the questions involve objects and ideas that are more
familiar or less offensive to members of one gender, then the
test may be easier for individuals of that gender.
Standards for achievement on such a test may be unfair to
individuals of the gender that is less familiar with or more
offended by the objects and ideas discussed, because it may be
more difficult for such individuals to demonstrate their abilities
or their knowledge of the material.
Summary: In this unit you have learned that ethics is a very
important issue we have to follow in our assessment practices.
And the most important ethical consideration is fairness. If we
are to draw reasonably good conclusions about what our students
have learned, it is imperative that we make our assessments—and
our uses of the results—as fair as possible for as many students as
possible. A fair assessment is one in which students are given
equitable opportunities to demonstrate their abilities and
knowledge.
Teachers must make every effort to address and minimize the
effect of bias in classroom assessment practices. Biases in
assessment can occur because of differences in culture or
ethnicity, disability as well as gender. To ensure suitability and
fairness for all students, teachers need to check the
assessment strategy for its appropriateness and if there are
cultural, disability and gender biases.
Equitable assessment means that students are assessed using
methods and procedures most appropriate to them. Classroom
assessment practices should be sensitive and diverse enough to
accommodate all types of diversity in the classroom in order to
obtain accurate information about learning.
Module 6-L A & E, Weekend.pptx

Module 6-L A & E, Weekend.pptx

  • 1.
    Module 6: LearningAssessments and Evaluation Course code: TECS 623 Credit hours: 2 Prof.Omprakash H M Department of Curriculum and Instructons College of Education and Behavioral Sciences Bule Hora University, Adola, Ethiopia
  • 3.
    Unit One: Concepts,Purposes and Principles of Assessment 1.1 Concept of Assessment and related terms(Test,Mesurement,assessment and Evaluation) 1.2 Function of Assessment and Evaluation 1.3 Principles of Assessment (Validity, Equity, reliability and explicitness) 1.4 Basic assumption in assessing students’ performance.
  • 4.
    1.1 Concept ofAssessment and related terms (Test, Mesurement, Assessment and Evaluation) Concept of Assessment: Assessment: Assessment is the process of gathering and discussing information from multiple and diverse sources in order to develop a deep understanding of what students know, understand, and can do with their knowledge as a result of their educational experiences; the process culminates when assessment results are used to improve subsequent learning. Comprehensive definition of assessment that incorporates its key elements: the planned process of gathering and synthesizing information relevant to the purposes of (a) discovering and documenting students' strengths and weaknesses, (b) planning and enhancing instruction, or (c) evaluating progress and making decisions about students.
  • 5.
    What is basicconcept of assessment? Assessment refers to the full range of information gathered and produced by teachers about their students and their classrooms (Arends, 1994) Assessment is a method for analyzing and evaluating student achievement or program success. Assessment involves the use of empirical data on student learning to refine programs and improve student learning. It is the process of defining, selecting, designing, collecting, analyzing, interpreting, and using information to increase students' learning and development.
  • 6.
    All those activitiesundertaken by teachers, and by their students in assessing themselves, which provide information to be used as feedback to modify the teaching and learning activities in which they are engaged. Basic Concepts in Testing and Assessment Test Measurement Assessment Evaluation What is a Test? Perhaps test is a concept that we are more familiar with than the other concepts. we have been taking tests ever since we have started schooling to determine our academic performance. Tests are also used in work places to select individuals for a certain job vacancy.
  • 7.
    Thus, test ineducational context is meant to the presentation of a standard set of questions to be answered by students. It is one instrument that is used for collecting information about students’ behaviors or performances. It can be noted that there are many other ways of collecting information about students’ educational performances other than tests, such as observations, assignments, project works, portfolios, etc.  Most commonly used method of making measurements in education.  An instrument or systematic procedures for measuring sample of behaviour by posing to measure any quality, ability, skill or knowledgea set of questions in a uniform manner.
  • 8.
     Designed e.g.objective, subjective, descripitive, time management, pretest-post, develop conditioning, enhance observation ( in learning) with exp.  There is always right/best and wrong answer. What is Measurement? Measurement: In our day to day life there are different things that we measure. We measure our height and put it in terms of meters and centimeters. We measure some of our daily consumptions like sugar in kilograms and liquids in liters. We measure temperature and express it in terms of degree centigrade or degree Celsius. How do we measure these things?
  • 9.
    Well definitely weneed to have appropriate instruments such as a meter, a weighing scale, or a thermometer in order to have reliable measurements. Similarly, in education measurement is the process by which the attributes of a person are measured and described in numbers. It is a quantitative description of the behavior or performance of students. As educators we frequently measure human attributes such as attitudes, academic achievement, aptitudes, interests, personality and so forth. Measurement permits more objective description concerning traits and facilitates comparisons. Hence, to measure we have to use certain instruments so that we can conclude that a certain student is better in a certain subject than another student. How do we measure performance in mathematics?
  • 10.
    We use amathematics test which is an instrument containing questions and problems to be solved by students. The number of right responses obtained is an indication of performance of individual students in mathematics. Thus, the purpose of educational measurement is to represent how much of ‘something’ is possessed by a person using numbers. Note that we are only collecting information. We are not evaluating! Evaluation is therefore quite different from measurement. Measurement is not also that same as testing. While a test is an instrument to collect information about students’ behaviors, measurement is the assignment of quantitative value to the results of a test or other assessment techniques. Measurement can refer to both the score obtained as well as the process itself.
  • 11.
     Basically assignmentof numbers.  Variety of i n s t r u m e n t s such as tests, rating scales, rubrics are used.  The process of obtaining numerical description of the degree of individual processes.  Quantifying how much learners learned. What is Assessment? Assessment: In educational literature the concepts ‘assessment’ and ‘evaluation’ have been used with some confusion. Some educators have used them interchangeably to mean the same thing. Others have used them as two different concepts. Even when they are used differently there is too much overlap in the interpretations of the two concepts.
  • 12.
    Cizek (in Phiye,1997) provides us a comprehensive definition of assessment that incorporates its key elements: The planned process of gathering and synthesizing information relevant to the purposes of: (a) Discovering and documenting students' strengths and weaknesses, (b) Planning and enhancing instruction, or (c) Evaluating progress and making decisions about students. Process by which evidence of student achievement is obtained and evaluated (marks obtained, how...checked).  Including testing, interpreting and placing information in context.
  • 13.
     Process ofgathering and organizing data – the basis for decision- making (evaluation).  Methods of measuring and evaluating the nature of the learner (what he learned/ how he learned) e.g objective /descriptive.
  • 14.
    Principles of Assessment 1.Assessment should be aimed at improving student performance. 2. Assessment should be based on an understanding of how students learn. 3. Assessment should be an integral component of course design and not something to addafterwards. 4. Good assessment provides useful information to report credibly to parents on student achievement. The Role of Assessment in Learning Assessment plays a major role in how students learn, their motivation to learn learn, and how teachers to teach. Assessment is used for various purpose:  Assessment for learning: Where assessment helps teachers gain insight.  Assessment as learning: Where students develop an awareness.
  • 15.
     Assessment oflearning: where assessment informs students, teachers and parents, as well as the broader educational community, of achievement at a certain point in time. Research and experience show that student learning is best supported when:  Instruction and assessment are based on clear learning goals.  Instruction and assessment are differentiated according to student learning needs.  Assessment information is used to make decisions that support further learning.  Parents are well informed about their child’s learning, and work with the school to help plan and provide support, e.g.weekly/monthly meeting.
  • 16.
    Advantages of Assessment Helps in knowing the position of a student when they enter a course. E.g. Adam student in Adola Teacher’s College.  It provides a large view of students’ need and assessment.  In accordance to the students’ achievement, the curriculum and teaching methods can be adjusted Disadvantages of Assessment It limits the potential of a student to a mere ‘test’.  Under supervision and pressure and supervision, creativity and performance is affected.  Though assessment aims at bringing out the latent/hidden knowledge, it often conceals it by the pressure it creates.  The parameter to judging knowledge is just a test score.
  • 18.
    Evaluation It is theprocess of obtaining, analyzing and interpreting information to determine the extent to which students achieve .e.g How many marks secure
  • 32.
    1.2 Function ofAssessment and Evaluation 1. Capturing student time and attention. 2. Generating appropriate student learning activity. 3. Providing timely feedback which students pay attention to. 4.Helping students to internalize the discipline’s standards and notions of equality. 5. Generating marks or grades which distinguish between students or enable pass/fail decisions to be made.
  • 33.
     Monitoring theprogress  Decision making  Screening  Diagnostic process  Placement of students in remedial courses  Instructional planning  Evaluation of instructional programed.  Feedback  Motivation
  • 34.
    1.3 Principles ofAssessment (Validity, Equity, reliability and explicitness) What are the basic principles of assessment? 1 Assessment should be valid. 2 Assessment should be reliable and consistent. 3 Information about assessment should be explicit, accessible and transparent. 4 Assessment should be inclusive and equitable  Validity Validity refers to the evidence base that can be provided about appropriateness of the inferences, uses, and consequences that come from assessment. Appropriateness has to do with the soundness, trustworthiness, or legitimacy of the claims or inferences that testers would like to make on the basis of obtained scores.
  • 35.
     Validity is“the extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment” (Gronlund, 1998).  Validity refers to whether the test is actually measuring what it claims to measure (Arshad, 2004). Face Validity It is pertinent that a test looks like a test even at first impression. If students taking a test do not feel that the questions given to them are not a test or part of a test, then the test may not be valid as the students may not take it seriously to attempt the questions.
  • 36.
    Construct Validity Construct isa psychological concept used in measurement. Construct validity is the most obvious reflection of whether a test measures what it is supposed to measure as it directly addresses the issue of what it is that is being measured. In other words, construct validity refers to whether the underlying theoretical constructs that the test measures are themselves valid.  Equity Principle 1 - Assessment should be inclusive and equitable/fair. As far as is possible without compromising academic standards, inclusive and equitable assessment should ensure that tasks and procedures do not disadvantage any group or individual.
  • 37.
    Principle 2 -Information about assessment should be explicit, accessible and transparent Clear, accurate, consistent and timely information on assessment tasks and procedures should be made available to students, staff and other external assessors or examiners. Principle 3 - Assessment should be inclusive and equitable As far as is possible without compromising academic standards, inclusive and equitable assessment should ensure that tasks and procedures do not disadvantage any group or individual. Principle 4 - Assessment should be an integral part of programme design and should relate directly to the programme aims and learning outcomes Assessment tasks should primarily reflect the nature of the discipline or subject but should also ensure that students have the opportunity to develop a range of generic skills and capabilities. Principle 5 - The amount of assessed work should be manageable The scheduling of assignments and the amount of assessed work required should provide a reliable and valid profile of achievement without overloading staff or students, Principle.
  • 38.
    Principle 6 -Formative and summative assessment should be included in each programme Formative and summative assessment should be incorporated into programmes to ensure that the purposes of assessment are adequately addressed. Many programmes may also wish to include diagnostic assessment. Principle 7 - Timely feedback that promotes learning and facilitates improvement should be an integral part of the assessment process Students are entitled to feedback on submitted formative assessment tasks, and on summative tasks, where appropriate. The nature, extent and timing of feedback for each assessment task should be made clear to students in advance. Principle 8 - Staff development policy and strategy should include assessment All those involved in the assessment of students must be competent to undertake their roles and responsibilities.
Reliability
According to Brown (2010), a reliable test can be described as follows:
◦ Consistent in its conditions across two or more administrations
◦ Gives clear directions for scoring/evaluation
◦ Has uniform rubrics for scoring/evaluation
◦ Lends itself to consistent application of those rubrics by the scorer
◦ Contains items/tasks that are unambiguous to the test-taker
Reliability means the degree to which an assessment tool produces stable and consistent results. Reliability essentially denotes 'consistency, stability, dependability, and accuracy of assessment results'. Because there is tremendous variability from teacher to teacher and tester to tester that affects student performance, reliability in planning, implementing, and scoring student performances gives rise to valid assessment.

Split-Half Reliability
A test is administered once to a group and, after the students have returned the test, is divided into two equal halves, which are then correlated. The halves are often determined by the number assigned to each item, with one half consisting of the odd-numbered items and the other half the even-numbered items.
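The odd/even split described above can be computed directly: sum each student's odd-numbered and even-numbered item scores, then correlate the two half-test totals. A minimal sketch (the function names and the sample score matrix are hypothetical illustrations, not from the module):

```python
# Split-half reliability: correlate odd-item totals with even-item totals.
def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """item_scores: one list of 1/0 item scores per student."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson(odd_totals, even_totals)

scores = [
    [1, 1, 1, 0, 1, 1],  # strong student
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0],  # weak student
]
print(round(split_half_reliability(scores), 2))  # prints 0.89
```

A high correlation between the two halves indicates that the items measure the trait consistently; in practice the coefficient is often further adjusted upward to estimate full-test reliability.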
Factors that affect the reliability of a test:
a. Test factor
b. Teacher and student factor
c. Environment factor
d. Test administration factor
e. Marking factor

a. Test Factor
In general, longer tests produce higher reliabilities: because scores depend less on coincidence and guessing, they will be more accurate if the test is longer. An objective test has higher consistency because it is not exposed to a variety of interpretations.
b. Teacher and Student Factor
For most tests, it is normal for teachers to construct and administer tests for their own students, so any good teacher-student relationship helps increase the consistency of the results. Other factors that contribute positively to the reliability of a test include the teacher's encouragement, positive mental and physical condition, familiarity with the test formats, and perseverance and motivation.

c. Environment Factor
An examination environment certainly influences test-takers and their scores. A favourable environment with comfortable chairs and desks, good ventilation, and sufficient light and space will improve the reliability of the test. On the contrary, a non-conducive environment will affect test-takers' performance and test reliability.
d. Test Administration Factor
Because students' grades depend on the way tests are administered, test administrators should strive to provide clear and accurate instructions, sufficient time, and careful monitoring of tests to improve their reliability. A test-retest technique can be used to determine test reliability.
e. Marking Factor
Human judges have many opportunities to introduce error when scoring essays. It is also common for different markers to award different marks for the same answer, even with a prepared mark scheme, and a marker's assessment may vary from time to time and between situations. This does not happen with objective types of tests, since the responses are fixed. Thus, objectivity is a condition for reliability.
Explicitness
Principle - Information about assessment should be explicit, accessible and transparent. Clear, accurate, consistent and timely information on assessment tasks and procedures should be made available to students, staff and other external assessors or examiners.
1.4 Basic assumptions in assessing students' performance
1. The quality of student learning is directly related to the quality of teaching.
2. The first step in getting useful feedback about course goals is to make those goals explicit.
3. Students need focused feedback early and often, and they should be taught how to assess their own learning.
4. The most effective assessment addresses problem-directed questions that faculty ask themselves.
5. Course assessment is an intellectual challenge and is therefore motivating for faculty.
6. Assessment does not require special training.
7. Collaboration with colleagues and students improves learning and is satisfying.
Reflection: When planning to assess students, what assumptions does one hold in mind? What should be kept in mind when preparing tools for assessing students?

Angelo and Cross (1993) list seven basic assumptions of classroom assessment, described as follows:
The quality of student learning is directly, although not exclusively, related to the quality of teaching. Therefore, one of the most promising ways to improve learning is to improve teaching. If assessment is to improve the quality of student learning, both teachers and students must become personally invested and actively involved in the process.
Reflection: What should the roles of students and teachers in classroom assessment be, so that it helps students' learning?
To improve their effectiveness, teachers need first to make their goals and objectives explicit and then to get specific, comprehensible feedback on the extent to which they are achieving those goals and objectives. Effective assessment begins with clear goals: before teachers can assess how well their students are learning, they must identify and clarify what they are trying to teach. After teachers have identified the specific teaching goals they wish to assess, they can better determine what kind of feedback to collect. To improve their learning, students need to receive appropriate and focused feedback early and often; they also need to learn how to assess their own learning.
Reflection: How do you think feedback and self-assessment help to improve students' learning?
The type of assessment most likely to improve teaching and learning is that conducted by teachers to answer questions they themselves have formulated in response to issues or problems in their own teaching. To best understand their students' learning, teachers need specific and timely information about the particular individuals in their classes. Because students' needs differ, there is often a gap between assessment and student learning; one goal of classroom assessment is to reduce this gap.
Reflection: How does classroom assessment help to reduce the gap between assessment and student learning?
Systematic inquiry and intellectual challenge are powerful sources of motivation, growth, and renewal for teachers, and classroom assessment can provide such a challenge. Classroom assessment is an effort to encourage and assist those teachers who wish to become more knowledgeable, involved, and successful. Classroom assessment does not require specialized training; it can be carried out by dedicated teachers from all disciplines. To succeed in classroom assessment, teachers need only a detailed knowledge of their discipline, dedication to teaching, and the motivation to improve.
By collaborating with colleagues and actively involving students in classroom assessment efforts, teachers (and students) enhance learning and personal satisfaction. By working together, all parties achieve results of greater value than those they can achieve by working separately.
Reflection: Can you explain how teachers' collaboration with colleagues can be more effective in enhancing learning and personal satisfaction than working alone?
Unit Two: Assessment Types, Methods and Tools
2.1. Assessment Types
2.2. Assessment Methods
2.3. Assumptions in selecting assessment methods
2.4. Table of specification and construction of items
2.5. Test administration, marking and grading
Advantages of Formative Assessment
• Develops knowledge
• Supports continuous improvement
• Provides quick feedback
• Achieves successful outcomes
• Enables regular communication with parents
Disadvantages of Formative Assessment
• Time-consuming and resource-intensive
• A tiring process
• Requires trained and qualified professionals
• Presents its own challenges
Advantages of Summative Assessment
• Shows whether students have understood
• Determines achievement
• Creates academic records
• Boosts individuals
• Identifies weak areas
• Measures training success
Disadvantages of Summative Evaluation
• Can demotivate individuals
• Rectification comes late
• It is disruptive
• Offers no remedy to identify challenges in advance
• Not an accurate reflection of learning
• Can affect students negatively
Assessment as Procedure
This paradigm has elements of both the measurement paradigm and the inquiry paradigm.
• The primary focus is on assessment procedures, not on the underlying purposes of the assessment program.
• Knowledge is believed to exist separately from the learners; it can be transmitted to the students and eventually objectively measured.
Assessment as Inquiry
• Under this paradigm, assessment is based on constructivist theories of knowledge, student-centered learning and the inquiry process.
• Teachers use various qualitative and quantitative techniques to inquire about particular learners.
• It is a process of inquiry and interpretation, used to promote reflection concerning students' understandings, attitudes, and literate abilities.
• Assessment, in this paradigm, is viewed as a social, context-specific, interpretive activity.
Assessment as Measurement
• The primary instrument of this paradigm is the large-scale, norm-referenced standardized test.
• Objectivity, standardization and reliability are the main concerns.
• Knowledge is considered to exist separately from the learners; the learners work to acquire it, not construct it.
• Decisions about the information to be collected and the means of evaluation are made by authorities outside the classroom.
Conclusion:
• Therefore, it may be concluded that although the leanings are towards the formative evaluation system, it is not without the need for modification.
• A balance is required not only in formative evaluation but in any form of evaluation system.
• In the summative system, too much leniency risks losing the students' attention, while in formative evaluation too much stress can endanger the clarity of concepts in the students.
• A midway has to be found between the two extremes, and a modified, flexible and accommodating system of evaluation, along the lines of the formative system, should be adopted, so that faculty and students can keep track of the students' progress while also having the chance to improve, develop and grow.
2.2. Assessment Methods
Methods of assessment will vary depending on the learning outcome(s) to be measured. Direct methods are those in which students demonstrate that they have achieved a learning outcome or objective. Indirect methods are those in which students (or others) report perceptions of how well students have achieved an objective or outcome.
A. Direct Assessment Methods: Direct assessment involves looking at actual samples of student work produced in our programs: observations of field work, internship performance, service learning and clinical experiences; grades based on explicit criteria related to clear learning goals; tests of writing, critical thinking, or general knowledge; and performance on achievement tests.
B. Indirect Assessment Methods
Indirect methods are those in which students (or others) report perceptions of how well students have achieved an objective or outcome. Indirect assessment methods require that faculty infer actual student abilities, knowledge, and values rather than observe direct evidence. Among indirect methods are surveys, exit interviews, focus groups, and the use of external reviewers.
• Surveys: Surveys usually are given to large numbers of possible respondents, usually in writing, and often at a distance.
• Exit interviews and focus groups: Exit interviews and focus groups allow faculty to ask specific questions of students face to face.
• External reviewers: External reviewers are usually representatives of the discipline and usually are guided by discipline-based standards.
Advantages:
• Indirect methods are easy to administer;
• Indirect methods may be designed to facilitate statistical analyses;
• Indirect methods may provide clues about what could be assessed directly;
• Indirect methods can flesh out areas that direct assessments cannot capture;
• Indirect methods are particularly useful for ascertaining values and beliefs;
• Surveys can be given to many respondents at a time;
• Surveys are useful for gathering information from alumni, employers, and graduate program representatives;
• Exit interviews and focus groups allow faculty to question students face to face;
• External reviewers can bring a degree of objectivity to the assessment;
• External reviewers can be guided either by questions that the department wants answered or by discipline-based national standards.
Disadvantages:
• Indirect methods provide only impressions and opinions, not hard evidence;
• Impressions and opinions may change over time and with additional experience;
• Respondents may tell you what they think you want to hear;
• The number of surveys returned is usually low, with 33 percent considered a good return rate;
• You cannot assume that those who did not respond would have responded in the same way as those who did;
• Exit interviews take time to carry out;
• Focus groups usually involve a limited number of respondents;
• Unless the faculty agree on the questions asked in exit interviews and focus groups, there may not be consistency in the responses.
2.3 Assumptions in selecting assessment methods
Choosing appropriate assessments:
1. Vary assessments.
2. Consider intervals for assessment.
3. Match learning goals to assessments.
4. Use direct and indirect assessment.
5. Collect data on student performance.
6. Revise assessment choices.
7. Assessment primer.
8. Creating assignments and exams.
Assessment methods are selected based on the level and content of your learning objectives. Certain assessment methods are better suited to different types of knowledge, skills, or attitudes.
1. Vary assessments
Student learning styles vary widely, and their strengths and challenges with respect to assessment vary as well. Instructors need to consider that variation as they choose assessments for their courses. By varying the way we assess student understanding, we are more likely to offer opportunities for every student to demonstrate their knowledge. This can be accomplished by creating courses with three or more forms of assessment, for example papers, class projects and exams. It can also be accomplished by offering choices of how to be assessed, for example giving students the option of writing a paper or taking an exam for a unit of instruction, as long as by the end of the course they have done both forms of assessment. It might also be accomplished by offering multiple questions and having students choose which to answer. New faculty members should think creatively about how best to elicit quality student responses.
2. Consider intervals for assessment
The frequency of assessment varies widely from course to course. Some classes assess only twice, on a midterm and a final; others have weekly assignments, presentations and homework. Think about the frequency with which your students should be assessed, based on the knowledge that assessment drives learning by focusing student attention, energy, and motivation to learn. New faculty members need to try various intervals and choose those that best support their students' learning.
3. Match learning goals to assessments
What we assess is what our students study, engage with, and explore in more depth. By beginning with what we want students to know and be able to do, we can design and choose assessments that demonstrate the knowledge and skills we are aiming for them to learn. After choosing student learning outcomes, make a grid that places the learning outcomes along one axis and the assessments that demonstrate achievement of those outcomes along the other. In this way new faculty members can double-check that each of the student learning outcomes has been assessed. If we make clear to students how each assessment furthers the goals of the course, they are able to make informed choices about how to spend their limited study time to achieve those goals.
4. Direct and indirect assessment
Assessment strategies are typically classified as direct, where actual student behavior is measured or assessed, or indirect, including surveys, focus groups, and similar activities that gather impressions or opinions about a program or its learning goals. If student assessment is embedded in a course, meaning it affects a course grade, it is typically taken more seriously.
5. Collect data on student performance
In spite of our best efforts at choosing appropriate forms of assessment, and the intervals that best support student learning, there will be some topics or units of instruction where students come up short. If we collect data on these issues (which test questions are commonly missed, which paper topics are commonly derailed, what misconceptions some students are taking away), we can identify weaknesses in instruction and assessment choices and make adjustments as needed.
6. Revise assessment choices
After analyzing student achievement systematically, we should begin to see gaps in our teaching or in the effectiveness of our assessments at measuring student understanding. This is the time to modify our assessments, and the instruction leading up to them, to better support student learning. Accomplished faculty members continually revise the ways they assess student knowledge and skills to close the learning gap; the more students we can move toward a deep understanding of the course topics, the more effective we are as instructors. The best time to make these revisions is right after an assessment is evaluated and the results analyzed, while the understanding of weaknesses is fresh in our minds. Throughout this revision process it is important to maintain high expectations about what students should know and be able to do.
7. Assessment primer
Assessment Primer: Describing Your Community, Collecting Data, Analyzing the Issues and Establishing a Road Map for Change.
8. Creating assignments and exams
Set the grading categories, but ask the students to help write the descriptions. Draft the complete grading scale, then give it to students for review and suggestions. Determining goals for the assignment and its essential logistics is a good start to creating an effective assignment.
2.4. Table of specification and construction of items
What is a table of specifications in test construction? The table of specifications (TOS) is a tool used to ensure that a test or assessment measures the content and thinking skills that it intends to measure; that is, a TOS helps test constructors focus on response content, ensuring that the test or assessment measures what it intends to measure. A table of specification enumerates the information and cognitive tasks on which examinees are to be assessed: it defines as clearly as possible the scope and emphasis of the test and relates the objectives to the content in order to ensure a balanced set of test items. The table of specification, sometimes referred to as a test blueprint, is a table that helps teachers align objectives, instruction and assessment.
A sample table of specification is shown in Table 1 below.

Table 1: Table of specification for a 30-item Economics test for SS2.

Content                                    Remembering  Understanding  Thinking  Total
Consumer behaviour & price determination        2             4            3       9
Population                                      2             2            2       6
Money & inflation                               1             3            2       6
Economic systems                                1             2            2       5
Principles of economics                         1             2            1       4
Total                                           7            13           10      30
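The row and column totals in Table 1 can be checked (or generated) mechanically: each topic's items are spread across the cognitive levels, and the rows and columns must sum to the test length. A minimal sketch using the Table 1 numbers (the dictionary layout is a hypothetical illustration, not from the module):

```python
# Table of specification for the 30-item Economics test in Table 1.
tos = {
    "Consumer behaviour & price determination": {"Remembering": 2, "Understanding": 4, "Thinking": 3},
    "Population":               {"Remembering": 2, "Understanding": 2, "Thinking": 2},
    "Money & inflation":        {"Remembering": 1, "Understanding": 3, "Thinking": 2},
    "Economic systems":         {"Remembering": 1, "Understanding": 2, "Thinking": 2},
    "Principles of economics":  {"Remembering": 1, "Understanding": 2, "Thinking": 1},
}

# Row totals: items per topic.
row_totals = {topic: sum(cells.values()) for topic, cells in tos.items()}

# Column totals: items per cognitive level.
col_totals = {}
for cells in tos.values():
    for level, n in cells.items():
        col_totals[level] = col_totals.get(level, 0) + n

print(row_totals)                # topic totals: 9, 6, 6, 5, 4
print(col_totals)                # level totals: 7, 13, 10
print(sum(row_totals.values()))  # prints 30, the test length
```

Running such a check before writing items guards against the imbalances the TOS is meant to prevent (a topic or cognitive level silently over- or under-represented).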
General format of a table of specification (Table II):

Content   Knowledge       Understanding   Application     Total
          (no. and/or %)  (no. and/or %)  (no. and/or %)
Topic 1
Topic 2
Topic 3
Topic 4
Topic 5
Total
At the end of the lesson students should be able to:
1. Define the term consumer behaviour, i.e. demand and supply.
2. State the law of demand and supply.
3. Identify the forces of demand and supply as determinants of the price of goods and services.

Table III: Table of specification for an objective test

No.  Content              Recall/Knowledge   Understanding     Application      Total
1    Consumer behaviour   17.5% (7 items)    20% (8 items)     7.5% (3 items)   45% (18)
2    Price determination  12.5% (5 items)    17.5% (7 items)   5% (2 items)     35% (14)
3    Public finance       0% (no items)      12.5% (5 items)   7.5% (3 items)   20% (8)
     Total                30% (12 items)     50% (20 items)    20% (8 items)    100% (40)
In an essay test, weighting can be achieved by assigning the amount of time to be spent on each test item to reflect the relative importance of the topics. For instance, if five essay items are designed to test three subject topics, the weighting can be assigned in the same proportions as the time divisions, as in Table 4:

Topic               Importance   Item         Time
Blacksmithing       35%          Question 1   9 minutes
Missionary journey  25%          Question 2   11 minutes
Photography         40%          Question 3   16 minutes
                                 Question 4   14 minutes
                                 Question 5   10 minutes
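The proportional-time idea above is simple arithmetic: each topic's share of the total testing time equals its importance weight. A small sketch (the 60-minute total is an assumption, and the topic names approximate those in Table 4):

```python
# Allocate essay-test time in proportion to topic importance.
total_minutes = 60  # assumed overall length of the essay test
weights = {
    "Blacksmithing": 0.35,
    "Missionary journey": 0.25,
    "Photography": 0.40,
}

for topic, w in weights.items():
    # Each topic gets its importance share of the total time.
    print(topic, round(w * total_minutes), "minutes")
# Blacksmithing gets 21 minutes, Missionary journey 15, Photography 24
```

The per-question times in Table 4 then subdivide each topic's allocation among the questions that test it.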
Uses of the table of specifications:
1. Teachers are able to determine which topics are being stressed; it also assists in the preparation of tests that reflect what students have learnt and limits the amount of time spent on each unit.
2. No important objective or content area will be inadvertently omitted.
3. The table of specifications can assist immensely in the preparation of test items, in the production of valid and robust tests, in the clarification of objectives for both teacher and students, and in helping the teacher select the most appropriate teaching strategy.
4. Only those aims and objectives actually involved in the instructional process will be assessed, and each objective will receive emphasis on the test in proportion to the emphasis placed on that objective by the teacher.
2.5. Test administration, marking and grading
The traditional approach to assessment of student learning is formal testing. Still the most widely used of all methods of assessment, testing has been the center of discussion and debate among educators for years. The topic of testing includes a large body of information, some of which is discussed in the upcoming section. Basically, testing consists of four primary steps: test construction, test administration, scoring, and analyzing the test. Each of these steps can take a variety of forms and elicit a variety of useful outcomes, such as:
° Ideas for lesson plans
° Knowledge of individual students
° Ideas for approaching different students/classes
° Scores for admission
° Indications of teacher effectiveness
Ways to assess student learning:
• Goals for tests
• Suggestions to help students do better on exams
• Descriptions of common testing methods
• Issues to consider when assigning test grades
• Understanding the results, whether or not they agree with expectations
• Decision-making skills based on results both expected and unanticipated (application of theory)
• Methods of recording, presenting, and analyzing data, observations and results (the notebook and final report)
• Performance of physical manipulations (technique)
The rest of this section covers the steps in testing in turn, from construction through administration.
Constructing a test: There are eight basic steps in constructing a test.
1. Defining the purpose. Before considering content and procedure, the teacher must first determine who is taking the test, why the test is being taken, and how the scores will be used. Furthermore, the teacher should have a rationale for giving a test at a particular point in the course: does the test cover a particular part of the unit content, or should the material currently being studied be saved and tested later, when the entire section is completed?
2. Listing the topics. Once the purpose and parameters have been established, specific topics are listed and examined for their relative importance in the section. This is called representative sampling.
For example, if the study of crustaceans comprised approximately 10% of all class work in the section to be tested (including class time, homework, and other assignments), then that topic should comprise approximately 10% of the test. This can be done either by calculating the number of questions per topic or by weighting different sections to match class coverage (see 7. Devising a scoring key below).
3. Listing types of questions. Different types of material call for different types of test questions. While multiple choice questions might adequately test a student's knowledge of mathematics, essays reveal more about a student's understanding of literature or philosophy. Thus, in deciding what types of test questions to use (short answer, essay, true/false, matching, multiple choice, etc.), the following advantages and disadvantages should be kept in mind:
Type             Advantages                                          Disadvantages
Short answer     Can test many facts in a short time;                Often ambiguous;
                 fairly easy to score;                               difficult to measure complex learning;
                 excellent format for math                           tests recall
Essay            Can test complex learning;                          Difficult to score objectively;
                 can evaluate thinking process and creativity        uses a great deal of testing time;
                                                                     subjective
True/false       Tests the most facts in the shortest time;          Difficult to measure complex learning;
                 easy to score; objective                            tests recognition; difficult to write
                                                                     reliable items; subject to guessing
Matching         Excellent for testing associations and              Difficult to write good items;
                 recognition of facts; although terse, can test      subject to process of elimination
                 complex learning (especially concepts); objective
Multiple choice  Can evaluate learning at all levels of complexity;  Difficult to write;
                 can be highly reliable; objective; tests a fairly   somewhat subject to guessing
                 large knowledge base in a short time; easy to score
In choosing the types of questions to be used on a test, it is also important to consider the following points:
° Classroom conditions can automatically eliminate certain types of questions. Since answers to multiple choice questions can easily be copied in an overcrowded classroom, they might not be an accurate measure of student learning. Likewise, if blackboards are the only medium available for presenting the test, long questions and textual references might be impossible to include.
° Considerations regarding administration and scoring often dictate the type of questions to be included on a test. Numbers of students, time constraints, and other factors might necessitate the use of questions which can be administered and scored quickly and easily.
The types of knowledge being tested should be considered in the assessment process. A simplified checklist could be used by the teacher to determine whether students have been assessed in all relevant areas. This could take the form of a grid such as the following:

Topics to be tested                     Facts  Skills  Concepts  Application
Verbs: conjugation of "to be"             x
Pronunciation: short "a"                          x
Use of modals: should, must, ought to                     x
Free expression                                                       x
4. Writing items. Once the purpose, topics and types of questions have been determined, the teacher is ready to begin writing the specific parts, or items, of the test. Initially, more items should be written than will be included on the test. When writing items, the following guidelines are followed:
° Cover important material. No item should be included on a test unless it covers a fact, concept, skill or applied principle that is relevant to the information covered in class (see 3. Listing types of questions above).
° Items should be independent. The answer to one item should not be found in another item, and correctly answering one item should not depend on correctly answering a previous item. (This guideline might not apply in some cases. For example, a math test might begin by testing simple skills and then test their integration. In all cases, the teacher should be aware of what is being tested at each level and use this strategy sparingly.)
° Write simply and clearly. Use only terms and examples students will understand, and eliminate all nonfunctional words.
° Be sure students know how to respond. The item should define the task clearly enough that students who understand the material will know what type of answer is required and how to record their answers. For example, on essay questions the teacher may specify the length and scope of the answer required.
° Include questions of varying difficulty. Tests should include at least one question that all students can answer and one that few, if any, can answer. Tests should be designed to go from the easiest to the most difficult items so as not to immediately discourage the weaker students.
° Be flexible. No one type of item is best for all situations or all types of material. Whenever feasible, any test should contain several types of items.
5. Reviewing items. Regardless of how skilled the teacher is, not all of his/her first efforts will be perfect or even acceptable. It is therefore important to review all items, revising the good and eliminating the bad. Finally, all items should be evaluated in terms of purpose, standardization, validity, practicality, efficiency, and fairness (see 8. Evaluating a test below).
6. Writing directions. Clear and concise directions should be written for each section. Whenever possible, an example of a correctly answered test item should be provided as a model. If there is any question as to the clarity of the directions, the teacher should try them out on someone else before giving the exam.
7. Devising a scoring key. While the test items are fresh in his/her mind, the teacher should make a scoring key: a list of correct responses, acceptable variations, and the weights assigned to each response (see Scoring below). In order to assure representative sampling, all items should be assigned values at this time. For example, if "factoring" comprised 50% of the class material to be tested but only 25% of the total number of test questions, each factoring question should be assigned double value.
8. Evaluating a test. All methods of assessing student learning should achieve the same thing: the clear, consistent and systematic measurement of a behavior or something that is learned. Once a test has been constructed, it should be reviewed to ensure that it meets six specific criteria: clarity, consistency, validity, practicality, efficiency, and fairness.
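The "factoring" example in step 7 can be stated as a general weighting rule: each item's value is the topic's share of class content divided by its share of test items, so that the topic's total marks match its coverage. A sketch with the numbers from the example (the helper name and the 20-item test length are hypothetical):

```python
# Weight items so each topic's total marks match its share of class coverage.
def item_weight(content_share, n_topic_items, n_total_items):
    """Value per item = marks the topic should carry, spread over its items
    (one mark per item is the baseline)."""
    return content_share * n_total_items / n_topic_items

# "Factoring": 50% of class material, but only 25% of a 20-item test (5 items).
w = item_weight(0.50, 5, 20)
print(w)  # prints 2.0: each factoring question counts double
```

A topic whose item share already equals its content share gets a weight of 1.0, i.e. no adjustment.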
The following is a checklist of questions that should be asked after the test (or any assessment activity) has been prepared and before it is administered:
A clearly defined purpose: Who is being assessed? What material is the test (or activity) measuring? What kinds of knowledge or skills is the test (or activity) measuring? Do the tasks or test items relate to the objectives?
Standardization of content: Are content, administration, and scoring consistent across all groups?
Validity: Is this test (or activity) a representative sampling of the material presented in this section? Does it faithfully reflect the level of difficulty of the material covered in class?
Practicality and efficiency: Will the students have enough time to finish the test (or activity)? Are sufficient materials available to present the test or complete the activity effectively? What problems might arise due to structural or material difficulties or shortages?
Fairness: Did the teacher adequately prepare students for this activity/test? Were they given advance notice? Did they understand the testing procedure? How will the scores affect the students' lives?
  • 115.
Administering a test: Once the items, directions, and answer key have been written, the teacher should consider in advance the manner in which the test will be presented. Factors such as duplication, visual aids, and use of the blackboard should be considered ahead of time to ensure clarity in presentation as well as to avoid technical difficulties.
Establish Classroom Policy: Because discipline is a major factor in test administration, the teacher must establish a classroom policy concerning such matters as tardiness, absences, make-ups, leaving the room, and cheating (see Classroom Management). The teacher must also advise students of procedural rules such as:
° What to do if they have any questions.
° What to do when they are finished taking the test.
° What to do if they run out of paper, need a new pen, etc.
° What to do if they run out of time.
The teacher should always be aware of the effect of testing conditions on testing outcomes. Physical shortcomings should be alleviated wherever possible. If some students cannot see the blackboard, they should be allowed to move to a better location. If students are cramped into benches, more benches should be brought in and students should be spread out. If this is not possible, two separate versions of the test can be written and distributed to students on an alternating basis. Similarly, psychological conditions can inhibit optimal performance. Such factors as motivation, test anxiety, temporary states (everyone has a bad day once in a while), and long-term changes can profoundly affect the test-taker and therefore his/her performance on the test. It is therefore the teacher's responsibility to establish an official, yet not oppressive, atmosphere in the testing room to maximize student performance.
Teaching Test-Taking Techniques: Students often fail tests not because they do not know the material but because they do not understand the procedures and techniques for successful test-taking. If a test is to be as fair as possible, students must understand both test-taking procedures and techniques. This means that the teacher should familiarize his/her students with:
° The type of test to be given (e.g. diagnostic, proficiency, achievement, etc.) and how to study for it.
° The types of items which will appear on the test and how to respond to them (e.g. matching, fill in the blank, essay questions, etc.).
° The types of directions commonly accompanying certain types of test items.
° Strategies for successful test-taking (e.g. time management, the process of elimination, guessing, etc.).
Grading a test: In order to determine how well a student performed on a test or in an activity, a specific value must be assigned to each test item or activity component. Then, raw scores must be derived and, if necessary, transformed to fit the requirements of testing within specific contexts.
Obtaining Raw Scores: The first step in determining how well a student performed on a test or in an activity is to derive a raw score, or the number of items answered correctly. Hence, if a student answers eight out of ten items correctly, his/her raw score is eight.
Transforming Raw Scores: While grades in many systems are determined on a base of 100 points, grading in countries following the French model is based on a system of 20 points. In order to make tests match such a predetermined number, raw scores must be transformed into fractions, decimals, or multiples of their raw value. For example, say the desired result is a score out of 20, but the test includes 30 questions. If all questions are of equal importance and difficulty, they can be counted as fractions (2/3 pt. each) or as decimals (.66 each). Likewise, if a test has only 10 questions, each can be multiplied by two to obtain a score out of 20.
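The scaling just described can be sketched as a small helper; the function name and the sample numbers are illustrative, not part of the module.

```python
def scale_score(raw, total_items, scale=20):
    """Transform a raw score onto a predetermined scale (e.g. a score out of 20)."""
    return raw * scale / total_items

# 30 equally weighted questions scaled to 20 points: each is worth 2/3 pt.
print(scale_score(15, 30))  # half the items correct -> 10.0
# A 10-question test: each item is effectively multiplied by two.
print(scale_score(7, 10))   # -> 14.0
```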
Cross-Cultural Considerations: In general, grading is much harsher in many countries than in the United States. Students rarely, if ever, achieve perfect or even near-perfect scores on tests or as a final grade. In countries following the British model, a passing grade is 50/100 or better; in the French model, 10/20 or better. It is therefore inappropriate, for example, to give even the best students a grade higher than 80% (British) or 16/20 (French). In fact, your school administration, fellow teachers, and students will be bewildered and even angry if you deviate from this strict rule. Remember: 50% or 10/20 reflects an adequate performance, equivalent to the U.S. 70% or C. It is therefore important when designing a test that you include items of sufficient difficulty to reflect this grading tradition.
Weighting Test Items: In the event that some questions are more important or more difficult than others, they can be weighted; that is, some questions can be counted at double value and others at less value (e.g. 1/4 point, or .25). In other words, as long as the total value for the test equals the predetermined number required, individual item values can be adjusted as the teacher sees fit (see the table on coefficients under Norm-Referenced Scoring below).
Deriving Percentages: By transforming raw scores into percentages, the teacher can compare tests of varying length, difficulty, or point totals on equal terms. If all items on a test are worth the same amount, the percentage correct can be determined by dividing the number of correct items by the total number of items, then multiplying by 100%:
Percent correct = (Number of items correct) / (Total number of items) x 100%
If the items are of different weight, the percentage correct can be determined by dividing the number of points earned by the maximum number of points, then multiplying by 100%:
Percent correct = (Points earned) / (Maximum number of points) x 100%
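Both formulas above reduce to the same one-line computation; the function name is mine, not the module's.

```python
def percent_correct(earned, maximum):
    """Percent correct = (items or points earned) / (maximum possible) x 100."""
    return earned / maximum * 100

print(percent_correct(8, 10))   # 8 of 10 equal-weight items -> 80.0
print(percent_correct(42, 60))  # 42 of 60 possible points  -> 70.0
```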
Assuring Objectivity: As with test construction, the key to successful test scoring is objectivity. By setting certain standards and prescribing certain rules, the teacher can be sure that scoring has been objective and students have been treated fairly. Three techniques are particularly helpful in assuring objectivity:
° Immediate scoring and recording
° Using a scoring key
° Having a procedure for comparing responses to the key
Perhaps it seems self-evident, but immediate scoring and recording of scores can do much to alleviate misunderstanding and bias. The more time that goes by between test-taking and scoring, the greater the chances of forgetting relevant information or losing papers altogether.
More importantly, the sooner the students get their tests back, the more meaningful their performance on the test remains to them. It does little good to return a test months after it has been taken, when students have to review the material tested just to remember why they answered the way they did. Using a scoring key can make scoring papers go quickly while reducing the possibility of error and bias. It can also simplify and standardize the process of scoring if numerous people will be scoring the test. Having a procedure for comparing responses to the key can also speed up the scoring process and increase objectivity. For example, the teacher can:
° Scan several papers before starting to score to get a baseline view of the type and level of responses.
° Grade a sample of papers twice to see if he/she is, in fact, grading consistently.
° Score papers anonymously so as not to be influenced by students' performance in other aspects of the course (this can be done by assigning numbers beforehand, folding the tops of test papers back, etc.).
° Grade items one at a time -- that is, first grade all answers to item 1, then all responses to item 2, and so on (this technique is particularly useful with essay tests, where it is important to look for key points in each response).
Analysing test results: Once test papers have been scored, they can be analyzed in numerous ways to provide the teacher with information about student performance. For example, a student's tests from one semester can be ranked to show relative areas of strength and weakness; averaged class scores on a given test can be ranked to compare one class's performance to that of another. Such information is important for making decisions about lesson planning and future testing, as well as for knowing how to approach different students and classes. In order to analyze anything, specific criteria must be established. In test analysis, three different criteria are generally used: the content of the test, the norm group taking the test, or an individual student.
Criterion-Referenced Scoring: Criterion-referenced scoring uses the content of the test itself as the basis of comparison for assessing the student's level of achievement. Thus, a criterion-referenced score of 80% means that the student correctly answered 80% of the items on the test. The most common of all methods of test analysis, criterion-referenced scoring is used:
° To determine the level of achievement at which to begin a student;
° To determine how much a student has learned from a given section of material; and
° To determine a student's potential in a given field.
Norm-Referenced Scoring: Sometimes referred to as "grading on a curve," norm-referenced scoring uses the class as a whole as a referent. The class average, or mean, usually serves as the base score against which all other grades are judged. The mean is calculated by adding all the scores and then dividing by the number of scores (e.g., if the total of the test scores in a class of 25 equals 1625, the class average, or mean, is 1625/25, or 65). Some schools require certain percentages of passing grades per class. If these percentages are exceeded, the teacher is seen as "too easy"; conversely, if these percentages are not met, students can become indignant and discipline problems can result.
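A minimal sketch of the mean calculation described above (the function name is illustrative):

```python
def class_mean(scores):
    """Norm-referenced base score: the class average (mean)."""
    return sum(scores) / len(scores)

# A class of 25 whose scores total 1625 has a mean of 1625/25 = 65.
scores = [65] * 25
print(class_mean(scores))  # -> 65.0
```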
In these instances, it is important to be able to adjust students' scores so that official standards can be met. Such adjustments can be made by:
° Adding (or subtracting) points to students' overall scores
° Adding (or subtracting) points in the sections in which students scored the highest
° Making the next test easier (or harder)
° Weighting the test lightly (or heavily) in the semester-end grade by multiplying each test by an appropriate amount, or coefficient. For example, in the table below, Panafricanism is weighted three times, which gives the student an end-of-term score of 84%.
TEST                  SCORE   RELATIVE WEIGHT (coefficient)   SCORE WITH WEIGHTING
Pre-colonial Africa   50%     1                               50
Neo-colonial Africa   85%     1                               85
Panafricanism         95%     3                               285
TOTAL                         5                               420 (420/5 = 84%)
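The coefficient arithmetic above can be expressed as a small helper; the (score, weight) pairs mirror the table, but the function itself is my sketch.

```python
def weighted_average(scored_tests):
    """End-of-term score from (score, coefficient) pairs."""
    total = sum(score * weight for score, weight in scored_tests)
    weights = sum(weight for _, weight in scored_tests)
    return total / weights

# Pre-colonial 50% x1, Neo-colonial 85% x1, Panafricanism 95% x3
print(weighted_average([(50, 1), (85, 1), (95, 3)]))  # -> 84.0
```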
Self-Referenced Scoring: Though it is difficult to do in large classes, self-referenced scoring measures an individual student's rate of progress relative to his or her own past performance. By comparing past test scores, a teacher can assess a student's rate of progress in a given subject area, or across subjects to see where he/she is in need of help.
The advantages and disadvantages of Criterion-, Norm- and Self-Referenced scoring are listed below:

Norm-referenced
Advantages:
1. Allows for comparisons among students
2. Classes can be compared to other classes
3. Allows the teacher to spot students who are dropping behind the class
Disadvantages:
1. If the whole class does well, some students still get poor grades
2. If the class as a whole does poorly, a good grade could be misleading
3. Does not allow individual progress or individual circumstances to be considered
4. The whole class (or large portions of it) must be evaluated in the same way
5. Everyone in the class (or norm group) must be evaluated with the same instrument under the same conditions

Criterion-referenced
Advantages:
1. Helps the teacher to decide if students are ready to move on
2. Criteria are independent of group performance
3. Works well in a mastery-learning setting
4. Each individual can be evaluated on different material depending on his or her level of achievement
Disadvantages:
1. It is difficult to develop meaningful criteria (therefore arbitrary cut-off scores are often used)
2. Presents unique problems in computing the reliability of criterion-referenced tests
3. Makes it difficult to make comparisons among students

Self-referenced
Advantages:
1. Allows you to check student progress
2. Makes it possible to compare achievement across different subjects for the same individual
Disadvantages:
1. All measures taken on an individual must be taken with similar instruments under similar circumstances
2. Does not help you to compare an individual with his or her peers
Percentile Ranking: Just as the raw scores for individual test items can be transformed to fit a certain testing model (e.g. Francophone testing -- score/20), so can one set of test results be analyzed in relation to previous tests as well as other classes' performances. Percentile ranks offer a way to obtain an image of class performance on a test by calculating the percentage of persons who obtain lower scores. To obtain this figure, divide the number of students below the passing grade by the total number of students who took the test. For example, if only 10 students out of 30 get passing scores (50% and above), then 20 of the 30 students -- 66% of the class -- rank below the passing grade.
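The calculation just described, sketched as a function; the name, the cutoff parameter, and the sample scores are mine.

```python
def percent_below(scores, cutoff=50):
    """Percentage of the class scoring below a given cutoff (e.g. the passing grade)."""
    below = sum(1 for s in scores if s < cutoff)
    return below / len(scores) * 100

# 10 of 30 students pass (50 and above); the other 20 rank below.
scores = [60] * 10 + [40] * 20
print(percent_below(scores))  # about 66% of the class scored below passing
```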
Charting Student Performance: Just as percentile ranking can give a teacher a comparative measure of class performance, charting the results of a test can give the teacher an internal picture of how his/her class has performed as a whole. The graph below, for example, clearly and graphically illustrates that the majority of the students in the class failed the test.
[Graph: Student performance]
To chart student performance:
1. Tally the number of students who obtain each score (e.g. 4 students at 4/20, or 20/100; 16 students at 8/20, or 40/100).
2. Plot each number on a chart as illustrated above.
3. Draw a vertical line intersecting the passing grade. (In the French system 10/20 is passing; in the British system 50/100 is passing.)
The teacher can obtain a visual comparison of class performance over a semester or a year by superimposing the charted results of multiple tests.
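The three charting steps above can be sketched as a crude text histogram; everything here (function names, sample scores) is illustrative.

```python
from collections import Counter

def tally_scores(scores):
    """Step 1: count how many students obtained each score."""
    return Counter(scores)

def text_chart(scores, passing=10, max_score=20):
    """Steps 2-3: one row per score, one '*' per student, '|' marking the passing grade."""
    tally = tally_scores(scores)
    rows = []
    for s in range(max_score + 1):
        marker = "|" if s == passing else " "
        rows.append(f"{s:2d} {marker} " + "*" * tally.get(s, 0))
    return "\n".join(rows)

# 4 students at 4/20 and 16 students at 8/20 -- everyone below the 10/20 passing line.
print(text_chart([4] * 4 + [8] * 16))
```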
Unit Three: Item Analysis
3.1. Item difficulty level
3.2. Item discrimination index
3.3. Item Banking
3.1 Item difficulty level
How difficult do you think a test should be? How do we determine the difficulty level of test items? Why is it important to know the difficulty level of test items?
The item difficulty index is one of the most useful, and most frequently reported, item analysis statistics. It is a measure of the proportion of examinees who answered the item correctly; for this reason it is frequently called the p-value. If scores from all students in a group are included, the difficulty index is simply the total percent correct. When a sufficient number of scores is available (i.e., 100 or more), difficulty indexes are calculated using scores from the top and bottom 27 percent of the group.
Item Analysis Procedures:
1. Rank the papers in order from the highest to the lowest scores.
2. Select the one-third of the papers with the highest total scores and the one-third of the papers with the lowest total scores.
3. For each test item, tabulate the number of students in the upper and lower groups who selected each option.
4. Compute the difficulty of each item (the % of students who got the item right).
The item difficulty index can be calculated using the following formula:
P = (Successes in the HSG + Successes in the LSG) / (N in HSG + N in LSG)
Where,
HSG = High scoring group
LSG = Low scoring group
N = The total number in the HSG and LSG
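The formula above, as a one-line helper; the function name is mine, and the example numbers match the multiple-choice activity that follows.

```python
def item_difficulty(correct_high, correct_low, n_high, n_low):
    """p-value: (successes in HSG + successes in LSG) / (N in HSG + N in LSG)."""
    return (correct_high + correct_low) / (n_high + n_low)

# 8 of 10 high scorers and 3 of 10 low scorers answer correctly:
print(item_difficulty(8, 3, 10, 10))  # -> 0.55
```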
The difficulty indexes can range between 0.0 and 1.0 and are usually expressed as a percentage. A higher value indicates that a greater proportion of examinees responded to the item correctly, and that it was thus an easier item. The average difficulty of a test is the average of the individual item difficulties. For maximum discrimination among students, an average difficulty of .60 is ideal. For example: if 243 students answered item no. 1 correctly and 9 students answered incorrectly, the difficulty level of the item would be 243/252, or .96. In the example below, five true-false questions were part of a larger test administered to a class of 20 students. For each question, the number of students answering correctly was determined and then converted to the percentage of students answering correctly.
Question   Correct responses   Item difficulty
1          15                  75% (15/20)
2          17                  85% (17/20)
3          6                   30% (6/20)
4          13                  65% (13/20)
5          20                  100% (20/20)
Activity: Calculate the item difficulty level for the following four-option multiple-choice test item. (The sign (*) shows the correct answer.)

                Response Options
Groups          A    B    C    D*   Total
High Scorers    0    1    1    8    10
Low Scorers     1    1    5    3    10
Total           1    2    6    11   20
3.2 Item Discrimination Index
To what extent do you think a test item should discriminate between higher achievers and lower achievers? Should it be highly discriminating, moderately discriminating, or less discriminating?
The index of discrimination is a numerical indicator that enables us to determine whether a question discriminates appropriately between lower scoring and higher scoring students. When students who earn high scores are compared with those who earn low scores, we would expect to find more students in the high scoring group answering a question correctly than students from the low scoring group. In the case of very difficult items, which no one in either group answered correctly, or fairly easy questions, which even the students in the low group answered correctly, the numbers of correct answers might be equal for the two groups.
What we would not expect to find is a case in which the low scoring students answered correctly more frequently than students in the high group. The index is calculated as:
D = (Successes in the HSG - Successes in the LSG) / (1/2 (N in HSG + N in LSG))
Where,
HSG = High Scoring Group
LSG = Low Scoring Group
In the example below, there are 8 students in the high scoring group and 8 in the low scoring group (with 12 between the two groups who are not represented). For question 1, all 8 in the high scoring group answered correctly, while only 4 in the low scoring group did so. Thus successes in the HSG minus successes in the LSG is 8 - 4 = +4. The last step is to divide the +4 by half of the total number in both groups (half of 16, or 8). Thus 4/8 gives us +.5, which is the D value.
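The same arithmetic as a sketch; the function name is mine, and the numbers reproduce the worked example above.

```python
def discrimination_index(correct_high, correct_low, n_high, n_low):
    """D = (successes in HSG - successes in LSG) / (half the total in both groups)."""
    return (correct_high - correct_low) / ((n_high + n_low) / 2)

# Question 1: all 8 high scorers correct, but only 4 of the 8 low scorers.
print(discrimination_index(8, 4, 8, 8))  # -> 0.5
```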
Question   Success in the HSG   Success in the LSG   Difference    D value
1          8                    4                    8 - 4 = 4     .5
2          7                    2
3          5                    6

Activity 2: Calculate the item discrimination index for questions 2 and 3 in the table above.
The item discrimination index can vary from -1.00 to +1.00. A negative discrimination index (between -1.00 and zero) results when more students in the low group answered correctly than students in the high group. A discrimination index of zero means equal numbers of high and low students answered correctly, so the item did not discriminate between the groups.
A positive index occurs when more students in the high group answer correctly than in the low group. If the students in the class are fairly homogeneous in ability and achievement, their test performance is also likely to be similar, resulting in little discrimination between high and low groups. Questions that have an item difficulty index (NOT item discrimination) of 1.00 or 0.00 need not be included when calculating item discrimination indices. An item difficulty of 1.00 indicates that everyone answered correctly, while 0.00 means no one answered correctly. We already know that neither type of item discriminates between students. When computing the discrimination index, the scores are divided into three groups, with the top 27% of the scores in the upper group and the bottom 27% in the lower group.
The number of correct responses for an item by the lower group is subtracted from the number of correct responses for the item in the upper group. The difference is divided by the number of students in either group. The process is repeated for each item. The resulting value, which can range from -1.00 to +1.00, is interpreted in terms of both:
 direction (positive or negative) and
 strength (non-discriminating to strongly discriminating).
Item discrimination interpretation
D value          Direction   Strength
> +.40           positive    strong
+.20 to +.40     positive    moderate
-.20 to +.20     none        ---
< -.20           negative    moderate to strong
For a small group of students, an index of discrimination that exceeds .20 is considered satisfactory for an item. For larger groups, the index should be higher because more difference between groups would be expected. The guidelines for an acceptable level of discrimination depend upon item difficulty. For very easy or very difficult items, low discrimination levels would be expected; most students, regardless of ability, would get the item correct or incorrect as the case may be. For items with a difficulty level of about 70 percent, the discrimination should be at least .30. When an item is discriminating negatively, overall the most knowledgeable examinees are getting the item wrong and the least knowledgeable examinees are getting the item right. A negative discrimination index may indicate that the item is measuring something other than what the rest of the test is measuring. More often, it is a sign that the item has been mis-keyed.
3.3 Item Banking
Building a file of effective test items and assessment tasks involves recording the items or tasks, adding information from analyses of student responses, and filing the records by both the content area and the objective that the item or task measures. Thus, items and tasks are recorded as they are constructed; information from analysis of student responses is added after the items and tasks have been used; and then the effective items and tasks are deposited in the file. Within a few years, it becomes possible to draw some of the items and tasks from the file and supplement these with new items and tasks.
As the file grows, it becomes possible to select the majority of the items and tasks for any given test or assessment from the file without repeating them frequently. Such a file is especially valuable in areas of complex achievement, where the construction of test items and assessment tasks is difficult and time-consuming. When enough high-quality items and tasks have been assembled, the burden of preparing tests and assessments is considerably lightened. Computer item banking makes these tasks even easier.
Summary: In this unit you learned how to judge the quality of a classroom test by carrying out item analysis, which is the process of "testing the item" to ascertain specifically whether the item is functioning properly in measuring what the entire test is measuring.
You also learned about the process of item analysis, how to compute item difficulty and item discriminating power, and how to evaluate the effectiveness of distracters. You have learned that item difficulty indicates the percentage of testees who get the item right; item discriminating power is an index which indicates how well an item is able to distinguish between the high achievers and low achievers, given what the test is measuring; and the distracting power of a distracter is its ability to differentiate between those who do not know and those who know what the item is measuring. Finally, you learned that after conducting item analysis, items may still be usable after modest changes are made to improve their performance on future exams. Thus, good test items should be kept in test item banks, and in this unit you were given highlights on how to build a Test Item File/Item Bank.
Unit Four: Ethical Standards of Assessment
4.1 Ethical and professional standards of assessment and its use
4.2 Race, ethnicity, gender, religion and culture in assessment and test
4.1 Ethical Standards of Assessment
Ethical standards guide teachers in fulfilling their obligation to provide and use tests that are fair to all test takers regardless of age, gender, disability, ethnicity, religion, linguistic background, or other personal characteristics. Fairness is a primary consideration in all aspects of testing:
 It helps to ensure that all test takers are given a comparable opportunity to demonstrate what they know and how they can perform in the area being tested.
 It implies that every test taker has the opportunity to prepare for the test and is informed about the general nature and content of the test.
 It also extends to the accurate reporting of individual and group test results.
The following are some ethical standards that teachers may consider in their assessment practices.
1. Teachers should be skilled in choosing assessment methods appropriate for instructional decisions. Skill in choosing appropriate, useful, administratively convenient, technically adequate, and fair assessment methods is a prerequisite to the good use of information to support instructional decisions. Teachers need to be well-acquainted with the kinds of information provided by a broad range of assessment alternatives and with their strengths and weaknesses. In particular, they should be familiar with criteria for evaluating and selecting assessment methods in light of instructional plans.
2. Teachers should develop tests that meet the intended purpose and that are appropriate for the intended test takers. This requires teachers to:
Define the purpose for testing, the content and skills to be tested, and the intended test takers.
Develop tests that are appropriate in content, skills tested, and content coverage for the intended purpose of testing.
Develop tests that have clear, accurate, and complete information.
Develop tests with appropriately modified forms or administration procedures for test takers with disabilities who need special accommodations.
3. Teachers should be skilled in administering, scoring and interpreting the results from diverse assessment methods. It is not enough that teachers are able to select and develop good assessment methods; they must also be able to apply them properly. This requires teachers to:
Follow established procedures for administering tests in a standardized manner.
Provide and document appropriate procedures for test takers with disabilities who need special accommodations or those with diverse linguistic backgrounds.
Protect the security of test materials, including eliminating opportunities for test takers to obtain scores by fraudulent means.
Develop and implement procedures for ensuring the confidentiality of scores.
4. Teachers should be skilled in using assessment results when making decisions about individual students, planning teaching, developing curriculum, and pursuing school improvement. Assessment results are used to make educational decisions at several levels: in the classroom about students, in the community about a school and a school district, and in society generally about the purposes and outcomes of the educational enterprise. Teachers play a vital role when participating in decision-making at each of these levels and must be able to use assessment results effectively.
5. Teachers should be skilled in developing valid pupil grading procedures which use pupil assessments. Grading students is an important part of professional practice for teachers. Grading is defined as indicating both a student's level of performance and a teacher's valuing of that performance. The principles for using assessments to obtain valid grades are known, and teachers should employ them.
6. Teachers should be skilled in communicating assessment results to students, parents, other lay audiences, and other educators. Teachers must routinely report assessment results to students and to parents or guardians. In addition, they are frequently asked to report or to discuss assessment results with other educators and with diverse lay audiences. If the results are not communicated effectively, they may be misused or not used. To communicate effectively with others on matters of student assessment, teachers must be able to use assessment terminology appropriately and must be able to articulate the meaning, limitations, and implications of assessment results.
Furthermore, teachers will sometimes be in a position that requires them to defend their own assessment procedures and their interpretations of them. At other times, teachers may need to help the public to interpret assessment results appropriately.
7. Teachers should be skilled in recognizing unethical, illegal, and otherwise inappropriate assessment methods and uses of assessment information. Fairness, the rights of all concerned, and professional ethical behavior must undergird all student assessment activities, from the initial planning for and gathering of information to the interpretation, use, and communication of the results. Teachers must be well-versed in their own ethical and legal responsibilities in assessment.
In addition, they should also attempt to have the inappropriate assessment practices of others discontinued whenever they are encountered. Teachers should also participate with the wider educational community in defining the limits of appropriate professional behavior in assessment.
In addition, the following are principles of grading that can guide the development of a grading system.
1. The system of grading should be clear and understandable (to parents, other stakeholders, and most especially students).
2. The system of grading should be communicated to all stakeholders (e.g., students, parents, administrators).
3. Grading should be fair to all students regardless of gender, socioeconomic status or any other personal characteristics.
4. Grading should support, enhance, and inform the instructional process.
4.2 Race, ethnicity, gender, religion and culture in assessment and test
In the previous section we learned that fairness is the fundamental principle that has to be followed in teachers' assessment practices. It has been said that all students have to be provided with an equal opportunity to demonstrate the skills and knowledge being assessed. Fairness is fundamentally a socio-cultural, rather than a technical, issue.
Thus, in this section we are going to see how culture and ethnicity may influence teachers' assessment practices and what precautions we have to take in order to avoid bias and be accommodative to students from all cultural groups. Students represent a variety of cultural and linguistic backgrounds. If these cultural and linguistic backgrounds are ignored, students may become alienated or disengaged from the learning and assessment process. Teachers need to be aware of how such backgrounds may influence student performance and of the potential impact on learning. Teachers should be ready to provide accommodations where needed.
Classroom assessment practices should be sensitive to the cultural and linguistic diversity of students in order to obtain accurate information about their learning. Assessment practices that attend to issues of cultural diversity include those that
 Acknowledge students' cultural backgrounds.
 Are sensitive to those aspects of an assessment that may hamper students' ability to demonstrate their knowledge and understanding.
 Use that knowledge to adjust or scaffold assessment practices if necessary.
    Assessment practices thatattend to issues of linguistic diversity include those that  Acknowledge students’ differing linguistic abilities.  Use that knowledge to adjust or scaffold assessment practices if necessary.  Use assessment practices in which the language demands do not unfairly prevent the students from understanding what is expected of them. Use assessment practices that allow students to accurately demonstrate their understanding by responding in ways that accommodate their linguistic abilities, if the response method is not relevant to the concept being assessed (e.g., allow a student to respond orally rather than in writing).
Disability and Assessment Practices: It is quite obvious that our education system has been exclusionary, failing to fully accommodate the educational needs of students with disabilities. This has been true not only in our country but in the rest of the world as well, although the magnitude differs from country to country. It was in response to this situation that UNESCO has been promoting the principle of inclusive education to guide the educational policies and practices of all governments. Several world conventions have been held and documents signed towards the implementation of inclusive education. Our country, Ethiopia, is a signatory to these documents and has therefore accepted inclusive education as a basic principle to guide its policy and practice in relation to the education of students with disabilities.
UN Convention on the Rights of Persons with Disabilities (2006): One group should work on one convention; the documents can be found on the internet. Inclusive education is based on the idea that all students, including those with disabilities, should be provided with the best possible education to develop themselves. This calls for the provision of all possible accommodations to address the educational needs of students with disabilities. Accommodations should not refer only to the teaching and learning process; they should also cover assessment mechanisms and procedures.
There are different strategies that can be considered to make assessment practices accessible to students with disabilities, depending on the type of disability. In general terms, however, the following strategies could be considered in summative assessments:
 Modifying assessments: This should enable students with disabilities to have full access to the assessment without giving them any unfair advantage.
 Others' support: Students with disabilities may need the support of others in certain assessment activities that they cannot do independently. For instance, they may require readers and scribes in written exams; they may also need assistance in practical activities, such as using equipment, locating materials, drawing, and measuring.
 Time allowances: Students with disabilities should be given additional time to complete their assessments; the individual instructor decides how much, based on the purpose and nature of the assessment.
 Rest breaks: Some students may need rest breaks during the examination, for example to relieve pain or to attend to personal needs.
 Flexible schedules: In some cases students with disabilities may require flexibility in the scheduling of examinations. For example, some students may find it difficult to manage a number of examinations in quick succession and need to have examinations scheduled over a period of days.
 Alternative methods of assessment: In situations where formal methods of assessment may not be appropriate for students with disabilities, the instructor should assess them using non-formal methods such as class work, portfolios, and oral presentations.
 Assistive technology: Specific equipment may need to be available to the student in an examination. Such arrangements often include the use of personal computers, voice-activated software, and screen readers.
Gender issues in assessment: Do you feel that gender has any influence on teachers' assessment practices? Is there any gender-related stereotype in relation to assessment results? Teachers' assessment practices can also be affected by gender stereotypes. The issues of gender bias and fairness in assessment concern differences in opportunities for boys and girls. A test is biased if boys and girls with the same ability levels tend to obtain different scores.
Test questions should be checked for:
 material or references that may be offensive to members of one gender,
 references to objects and ideas that are likely to be more familiar to men or to women,
 unequal representation of men and women as actors in test items, or representation of members of each gender only in stereotyped roles.
If the questions involve objects and ideas that are more familiar or less offensive to members of one gender, then the test may be easier for individuals of that gender.
Standards for achievement on such a test may be unfair to individuals of the gender that is less familiar with, or more offended by, the objects and ideas discussed, because it may be more difficult for such individuals to demonstrate their abilities or their knowledge of the material.
Summary: In this unit you have learned that ethics is a very important issue to observe in our assessment practices, and the most important ethical consideration is fairness. If we are to draw reasonably good conclusions about what our students have learned, it is imperative that we make our assessments, and our uses of the results, as fair as possible for as many students as possible. A fair assessment is one in which students are given equitable opportunities to demonstrate their abilities and knowledge.
Teachers must make every effort to address and minimize the effect of bias in classroom assessment practices. Biases in assessment can arise from differences in culture or ethnicity, from disability, and from gender. To ensure suitability and fairness for all students, teachers need to check each assessment strategy for appropriateness and for cultural, disability, and gender bias. Equitable assessment means that students are assessed using the methods and procedures most appropriate to them. Classroom assessment practices should be sensitive and diverse enough to accommodate all types of diversity in the classroom in order to obtain accurate information about learning.