Assessing Student Learning and Performance

Evaluation:
1) The activity of placing value on something or
making judgments about students’ learning.
2) The process of monitoring progress during
and after instruction.
3) Making an overall judgment about one’s work
or a whole school’s work.

Assessment:
1) The procedures used to obtain information
about student performance.
2) The process of identifying, gathering and
interpreting information about students'
learning.
3) The process of trying to determine
what students already know about a topic
before instruction.

4) A set of procedures used to get information
about students’ achievement and
progress, such as observations, scales, or
written tests.
5) An emphasis on the performance of actual,
complex tasks.

6) Any means of checking what students can
do with the language.
7) Checking what they cannot do.
8) It may be carried out before, during or after
a course, or it may not even be connected
with a course.
9) Assessment may be of individual students,
or it may be to check the capabilities of a
whole class.

10) It is concerned with the quality of teaching
as well as the quality of learning.
The central purpose of assessment is to
provide information on student achievement
and progress and set the direction for ongoing
teaching and learning.

Assessment in ELT means discovering what the
learners know and can do at a certain stage of
the learning process. Its purposes are:
1. To discover learners’ achievements.
2. To evaluate the existing curriculum.
3. To check on teachers’ performance.
4. To motivate learners.
5. To provide an incentive for learning.

6. To provide the basis for further planning
of teaching: what to teach next.
7. To qualify students.
8. To provide criteria to qualify for higher
level studies.
9. To provide learners with a sense of
accomplishment.
10. To foster the ability to learn.

In short, the purposes of assessment are:
a. To measure language proficiency
b. To discover how far students have achieved
the objectives of a course of study.
c. To diagnose students’ strengths and
weaknesses.
d. To identify what they know and what they do
not know.
e. To assist placement of students by
identifying the stage or part of a teaching
program most appropriate to their ability.

Measurement:
- The activity of giving a mark or score to a test
result.
- A process of assigning numbers to the
individual members of a set of objects or
persons for the purpose of indicating
differences in the characteristic being
measured among them.
- It is useful for describing the amount of
certain abilities that individuals have.

- A test is a type of assessment consisting of
several questions administered within a set
period of time, which primarily focuses on
comparing learning achievement among
the learners.
- A test is an instrument, procedure, or set of
activities used to obtain a sample of
a person’s behavior that describes his or her
capability in a certain instructional activity.

- By conducting a test, a teacher is expected to
get representative information
about the learner’s behavioral change
or learning output.
- It often takes the ‘pencil and paper’ form and
is usually done at the end of a learning
period, such as a unit test, mid-term test, or
semester test.

- Where are my students now?
Assessment provides information about what
students already know, understand, or
can do.
- What do I want my students to learn?
Standards, objectives, or performance
measures.

- How will my students get there? (Teaching
and learning strategies)
Consider: teaching strategies, classroom
organization, learning environment. How can
students demonstrate learning?
- How do I know when my students get there?
Assessment: conventional assessment
(testing) and alternative/authentic
assessment.

- A good communicative assessment of
language should have a strongly positive effect
on teaching and learning and should
generally result in improved learning habits.
The effect of assessment on teaching and
learning is known as backwash.
- Tests may be constructed primarily as
devices to reinforce learning and to motivate
the students, or to improve the students’
performance → beneficial or harmless
backwash.

- When the assessment content and assessment
techniques are at variance with the objectives
of the course → harmful backwash. Such tests fail
to measure accurately whatever it is they are
intended to measure.
- E.g., if the skill of writing is tested only by
multiple-choice items, then there is great
pressure to practice such items rather than
the skill of writing itself.

- Causes of harmful backwash → invalidity &
unreliability.
- Invalidity:
a) Assessment content (large scale, writing
tasks, difficulty in marking & scoring).
b) Techniques (what type of test: multiple choice
(MCT) or essay? MCTs are difficult to construct,
and not all cases are appropriate for MCT).
(Un)reliability:
- Whether or not the test measures what it is
meant to measure consistently.

- A test is unreliable when equivalent test
performances are accorded significantly different
scores. For example, the
same composition may be given very different
scores by different markers (or even by the
same marker on different occasions).
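
The marker-consistency problem described above can be made concrete with a small calculation. The following is a minimal sketch, not a prescribed procedure: it assumes two markers have scored the same set of compositions, and it uses a Pearson correlation plus the largest score gap as rough indicators of agreement. The scores and the names `marker_a` and `marker_b` are hypothetical.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / (var_x * var_y) ** 0.5


def largest_disagreement(xs, ys):
    """Return (index, gap) for the script on which the markers differ most."""
    gaps = [abs(x - y) for x, y in zip(xs, ys)]
    worst = max(range(len(gaps)), key=gaps.__getitem__)
    return worst, gaps[worst]


# Hypothetical scores (out of 20) given to the same six compositions.
marker_a = [14, 9, 17, 12, 6, 15]
marker_b = [12, 10, 11, 13, 7, 16]

r = pearson_r(marker_a, marker_b)
index, gap = largest_disagreement(marker_a, marker_b)
print(f"inter-marker correlation: {r:.2f}")
print(f"largest gap: {gap} points on composition {index + 1}")
```

A low correlation or a large gap on individual scripts would suggest that the scoring is not yet consistent enough, for example because the marking criteria need to be tightened.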

Achieving beneficial backwash:
- Assess the abilities whose development you
want to encourage.
- Sample widely and unpredictably.
- Use direct testing.
- Make testing criterion-referenced.
- Base assessment on objectives.
- Ensure the assessment is known and
understood by students and teachers.
- Where necessary, provide assistance to
teachers.
- Count the cost → economy & practicality.

1) Essay-translation (essay writing, translation &
grammatical analysis).
2) Structuralist (systematic acquisition of
habits, contrastive analysis, mastery of
separate elements of phonology, vocabulary
& grammar).
3) Discrete (a type of test that measures only one
skill or language component).
4) Integrative (combination of language skills or
components; language, context & meaning
(pragmatics)).

5) Pragmatic (real language use, meaning & context,
linguistic & nonlinguistic elements).
6) Communicative (linked to the integrative approach;
how language is used in daily communication (use);
psycholinguistic & sociolinguistic orientations;
measuring proficiency; involves culture &
authentic materials; needs analysis (ESP); more
on the qualitative (non-test) mode).
7) Contextual (relating learning materials to
learners’ real situation/context; constructivism
(activating existing knowledge)).

- Characteristics of Contextual Teaching and Learning
(CTL): cooperative work, mutual support, exciting,
high motivation, integrated, a variety of sources,
active learners, sharing ideas,
critical learners & creative teachers, display of
learners’ work, reports to learners’ parents.
- Elements of constructivism: activating existing
knowledge, acquiring new knowledge,
understanding knowledge, applying knowledge &
experience, reflecting on knowledge.
- Seven components of effective learning in CTL:
constructivism, questioning, inquiry, learning
community, modeling, reflection, authentic
assessment.

1) Diagnosis and Feedback
To pinpoint strengths and
weaknesses in the learned abilities of
the student. It provides critical
information to the student, teacher,
and administrator that should make
the learning process more efficient.

2) Screening and selection
To assist in the decision of who should
be allowed to participate in a particular
program of instruction. In the area of
language testing, a common screening
instrument is the aptitude test,
used to predict the success or failure of
prospective students in a language-learning
program.

3) Placement
Tests are used to identify a
particular performance level of
the student and to place him or
her at an appropriate level of
instruction.

4) Aptitude, Achievement and
Proficiency Tests
- Aptitude tests are most often
used to measure the suitability of
a candidate for a specific program
of instruction or a particular kind
of employment.

- Achievement tests are used to
measure the extent of learning
in a prescribed content
domain, in line with the stated objectives of a
learning program.

- Proficiency tests are most often
global measures of ability in a
language or other content area,
not necessarily developed or
administered with reference to
some previously experienced
course of instruction or
curriculum.

5) Program Evaluation and Provision of
Research Criteria
- To provide information about the
effectiveness of programs of
instruction.
- Pretests → to assess gross levels of
student proficiency or “entry behavior”
prior to instruction.

- Posttests → to measure post-instructional
levels of proficiency or
“exit behavior”, yielding “gain
scores”.
- Formative and summative tests →
all are used to evaluate the
effectiveness of a program.
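
The idea of a gain score can be illustrated with a few lines of code. This is a minimal sketch under the assumption that each student has exactly one pretest and one posttest score on the same scale; the student names and scores are invented for illustration.

```python
def gain_scores(pretest, posttest):
    """Per-student gain: posttest score minus pretest score."""
    unmatched = set(pretest) ^ set(posttest)
    if unmatched:
        raise ValueError(f"students without both scores: {sorted(unmatched)}")
    return {name: posttest[name] - pretest[name] for name in pretest}


def mean_gain(gains):
    """Average gain across all students, a rough indicator of program effect."""
    return sum(gains.values()) / len(gains)


# Hypothetical entry and exit scores (0-100) for three students.
pretest = {"Ani": 45, "Budi": 60, "Citra": 52}
posttest = {"Ani": 70, "Budi": 68, "Citra": 75}

gains = gain_scores(pretest, posttest)
for name, gain in gains.items():
    print(f"{name}: {gain:+d}")
print(f"mean gain: {mean_gain(gains):.1f}")
```

A positive mean gain is only suggestive: without a comparison group it cannot separate the effect of the program from other influences on the students' scores.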

6) Assessment of Attitude and Socio-psychological
Differences
- The importance of non-cognitive
factors in achievement is seldom more
evident than in language learning.
- Attitudes toward the target language,
its people, and their culture have been
identified as important affective
correlates of good language learning.

- The cognitive style of the learner,
socioeconomic status and locus of
control of the learner, linguistic
situational context, and ego-permeability
of the learner have
been found to relate to levels of
language achievement and/or
strategies of language use.

6. Objective vs. Subjective Tests
- The distinction lies in how the tests are scored.
Objective:
- responses are compared with the scoring
key
- no training is needed for the scorer
Subjective:
- scoring relies on the opinionated judgment of the scorer
- e.g. writing or speaking

7) Speed Test vs. Power Test
- A speed test contains items so easy that
every person might be expected to get every
item correct, given enough time. Conversely,
power tests are tests that allow sufficient
time for every person to finish, but that
contain such difficult items that few if any
examinees are expected to get every item
correct.

8) Cloze Test
- The principle of cloze testing is based on the
Gestalt theory of ‘closure’ (closing gaps in
patterns subconsciously). Cloze tests
measure the testees’ ability to decode
‘interrupted’ or ‘mutilated’ messages by
making the most acceptable substitutions
from all the contextual clues available.
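
To show what an 'interrupted' message looks like in practice, the sketch below builds a cloze passage by deleting words at a fixed interval. It is only an illustration: the function name `make_cloze`, the parameters `every_nth` and `lead_in`, and the sample passage are assumptions made here, not values given in the slides.

```python
import string


def make_cloze(text, every_nth=7, lead_in=0):
    """Return (gapped_text, answer_key) using fixed-interval deletion.

    every_nth: delete every n-th word (illustrative default).
    lead_in:   number of opening words left intact to establish context.
    """
    if every_nth < 2:
        raise ValueError("every_nth must be at least 2")
    gapped, answers = [], []
    countdown = every_nth
    for index, word in enumerate(text.split()):
        # Keep trailing punctuation in the passage, not in the answer key.
        core = word.rstrip(string.punctuation)
        tail = word[len(core):]
        if index < lead_in or not core:
            gapped.append(word)
            continue
        countdown -= 1
        if countdown == 0:
            answers.append(core)
            gapped.append(f"({len(answers)}) ______{tail}")
            countdown = every_nth
        else:
            gapped.append(word)
    return " ".join(gapped), answers


passage = (
    "Listening is often tested with recorded material. The teacher plays "
    "the recording twice, and the students write down the main points "
    "that they have understood."
)
cloze_text, key = make_cloze(passage, every_nth=5, lead_in=4)
print(cloze_text)
print(key)
```

Scoring could accept only the exact word or any contextually acceptable word; the sketch leaves that decision to the test writer.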

9) C-Test
- The C-test is a development of the cloze test.
It was developed because of the
difficulty test developers faced in constructing
cloze tests. The C-test also requires
contextual texts or discourses. A cloze test uses
long, complete discourses, while a C-test uses
several short texts. In a cloze test, the deletion
of words in the text is systematic, whereas in a
C-test there is no obligation to follow that pattern.

QUESTIONS AND PRINCIPLES
1. Questions
The following questions or problems should be
answered before constructing a test:
a. What kind of test is it to be? Achievement (final or
progress), proficiency, diagnostic, or placement?
b. What is its precise purpose?
c. What abilities are to be tested?
d. How detailed must the results be?
e. How accurate must the results be?
f. How important is backwash?
g. What constraints are set by the unavailability of
expertise, facilities, and time (for construction,
administration and scoring)?

2. Principles
a. Achievement tests should measure clearly
defined competencies that are in harmony
with the basic competencies.
b. Achievement tests should measure an
adequate sample of the competencies and
subject matter content included in the
lesson plan.

c. Achievement tests should include the types
of test items which are most appropriate for
measuring the desired competencies.
d. Achievement tests should be designed to fit
the particular uses to be made of the results.
e. Achievement tests should be made as reliable
as possible and should then be interpreted
with caution.
f. Achievement tests should be used to improve
student learning.

3. Phases of Test construction
In planning an achievement test, a tester should:
a. Identify the competencies to be measured
(assessed). (Which domain?)
b. Define the competencies in terms of specific
observable behavior or performance.
c. Outline the subject matter or activities.
d. Prepare the table of specifications.
e. Write (develop) the test based on the
specifications.
f. Carry out sampling, item writing and moderation,
writing and moderation of the scoring key, and pretesting.
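
Preparing the table of specifications often comes down to deciding how many items each competency should receive. The sketch below is one possible way to do that, assuming each competency has a numeric weight (for example, hours of instruction) and that items are shared out in proportion to those weights. The competency names, weights, and the function `allocate_items` are hypothetical.

```python
def allocate_items(weights, total_items):
    """Split total_items across competencies in proportion to their weights.

    Uses largest-remainder rounding so the counts always add up to
    total_items.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    exact = {name: total_items * w / total_weight for name, w in weights.items()}
    counts = {name: int(share) for name, share in exact.items()}
    leftover = total_items - sum(counts.values())
    # Hand the remaining items to the competencies with the largest fractions.
    by_fraction = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_fraction[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical weights: hours of instruction spent on each competency.
hours = {"vocabulary": 3, "grammar": 3, "reading": 4}
table = allocate_items(hours, total_items=40)
for competency, n_items in table.items():
    print(f"{competency:<12}{n_items:>3} items")
```

Proportional allocation is only a starting point; a tester may still shift items toward competencies that matter more for the purpose of the test.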

4. Specifications involve:
a. Purpose
b. What sort of learners will be taking the test:
age, sex, level of proficiency, stage of
learning, first language, cultural background,
reason for taking the test, interest, etc.
c. How many sections/papers should the test
have?
d. Target language situation?
e. What text types should be chosen? Written or spoken?
f. What language skills or topics should be
tested?

g. What sort of tasks are required? Discrete-point,
integrative, etc.?
h. How many items?
i. What test methods are to be used? MC, gap
filling, etc.
j. What rubrics are to be used as instructions for
candidates? Are examples needed?
k. What criterial level of performance will be used
for assessment by markers? How important are
accuracy, appropriateness, spelling, length of
utterances/scripts, etc.?
l. Timing.

5. Who needs test Specifications?
a. Those who produce the test itself
b. Those who are responsible for or interested
in establishing the test validity
c. The test users
d. Teachers
e. Admission officers
f. Publishers to produce books.

1. Short-answer or Completion item
a. State the item so that only a single,
brief answer is possible.
b. Start with a direct question and
switch to an incomplete statement
only when greater conciseness is
possible.

c. The words to be supplied should relate to the
main point of the statement. Avoid asking
students to respond to unimportant or minor
aspects of the statement, and leave blanks only for
key words. Students should not be expected to
supply such words as ‘the’ and ‘an.’
d. Place the blanks at the end of the statement.
This permits the learners to read the complete
problem before they come to the blank. Starting
with a direct question will make it easier to
construct incomplete statements of this type.

e. Avoid extraneous clues to the answer. The
use of the indefinite article ‘a’ or ‘an’ at the end
of an incomplete statement is apt to provide
a clue to the answer.
f. For numerical answers, indicate the degree of
precision expected and the units in which
they are to be expressed. This will clarify the
task to the student and make scoring easier.

2. Matching item
a. The premises and responses should be
as homogeneous as possible. Homogeneous
premises and responses provide an
opportunity to arrive at responses based
upon association.
b. The lists of premises and responses should
be logically arranged, for instance in
alphabetical order.
c. Keep the list of premises and responses to a
maximum of twelve items.

d. Place the list with the longer words or
phrases in the left hand premise column
since this column is read first.
e. Avoid an equal number of premises and
responses, which permits one-to-one
matching.
f. Present the entire item on one page.

3. True-False item
a. Avoid broad general statements if they are to
be judged true or false.
b. Avoid the use of negative statements,
especially double negatives.
c. Long and complex sentences should be
avoided. When statements are not relatively
simple and direct, the student’s reading
ability, rather than achievement in subject
matter, is tested.

4. Multiple Choice Item
a. The stem should be meaningful by itself and
should present a definite problem.
b. The stem should include as much of the
item as possible and should be free of
irrelevant material.
c. Use a negatively stated item stem only when
significant learning outcomes require it.

d. All of the alternatives should be
grammatically consistent with the stem of
the item.
e. An item should contain only one correct or
clearly best answer.
f. Each option should be grammatically correct
when placed in the stem, except of course in
the case of specific grammar test items.

g. Multiple choice items should be as brief and
as clear as possible (though it is desirable to
provide short contexts in the case of specific
grammar test items).
h. All the distractors should be plausible.

i. Verbal associations between the stem and
the correct answer should be avoided.
j. The relative length of the alternatives should
not provide a clue to the answer.
k. The correct answer should appear in each of
the alternative positions an approximately
equal number of times, but in random order.
l. Use special alternatives such as ‘none of the
above’ or ‘all of the above’ sparingly.

m. In many tests, items are arranged in rough order
of increasing difficulty.
n. Don’t use a dash (-) as an alternative.
o. Each option should belong to the same word
class as the word in the stem, particularly when
the word appears in the context of a sentence
(Vocabulary Test).
p. The correct option and the distractors should be
at approximately the same level of difficulty.
q. All of the options should be approximately the
same length.
r. Don’t use MC items where other item types are
more appropriate
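
Guideline k above (the correct answer spread evenly over the option positions, but in random order) can be produced mechanically rather than by hand. The following is a minimal sketch using Python's standard `random` module; the function name `balanced_key` and the four-option layout "ABCD" are assumptions made for illustration.

```python
import random


def balanced_key(n_items, options="ABCD", seed=None):
    """Build an answer key in which each position is used about equally often.

    The key is shuffled so that the order of correct positions is random.
    Pass a seed only if a reproducible key is wanted.
    """
    if n_items < 0 or not options:
        raise ValueError("need a non-negative item count and at least one option")
    rng = random.Random(seed)
    repeats, extra = divmod(n_items, len(options))
    # Full rounds of every position, plus a random subset for the remainder.
    key = list(options) * repeats + rng.sample(list(options), extra)
    rng.shuffle(key)
    return key


key = balanced_key(20, seed=1)
print("".join(key))
print({option: key.count(option) for option in "ABCD"})
```

The item writer would then arrange the alternatives of each item so that the correct answer lands on the position given by the key.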

Difficulties in writing multiple choice items:
- The direct question form is easier to write
than the incomplete statement.
- Not all knowledge can be stated so precisely
that there is only one absolutely correct
response.

- Listening vs. reading/speaking: listeners
cannot usually move backwards and forwards
over what is being said in the way that they
can with a written text.
- Listening & speaking are separate when there is no
interaction, e.g. radio broadcasts, lectures,
public speeches and other spoken genres.

Basic types of Listening:
1. Recognition of speech sounds & short-term
memory.
2. Types of speech event (monologue &
dialogue) related to context (who the
speaker is, location, purpose) & content.
3. Bottom-up (linguistic decoding skills) &
schemata to bring a plausible interpretation
to the message (top-down) and assign literal
& intended meaning to the utterance.

Potential assessment objectives:
1. Comprehending surface structure
elements such as phonemes, words,
intonation or grammatical category.
2. Understanding the pragmatic context.
3. Determining the meaning of auditory input.
4. Developing the gist, a global or
comprehensive understanding.

Types of listening performance:
1. Intensive → perception of the linguistic
components (phonemes, words, intonation,
discourse markers).
2. Responsive → listening to language
functions (greeting, question, command,
comprehension check) in order to make an equally
short response.

3. Selective → scanning a short monologue,
e.g. listening for names, numbers, a
grammatical category, directions (in a map
exercise), certain facts & events.
4. Extensive → top-down, global
understanding of spoken language, e.g. long
lectures → gist, main ideas & making
inferences.

Some of the more plausible tasks requiring
careful listening → note-making and
summarizing.
Macro-skills:
1. Listening for specific information
2. Obtaining gist of what is being said
3. Following directions
4. Following instructions.

Micro-skills might include:
1. Interpretation of intonation patterns
(recognition of sarcasm, etc.).
2. Recognition of the function of structures (such as
an interrogative used as a request, for example, ‘Could
you pass the salt?’).
3. Distinguishing between phonemes (for
example between [w] and [v]).

- Text types might include monologue,
dialogue, or multi-participant; further
specified: announcement, talk or lecture,
instructions, directions, etc.
- Sources: radio broadcasts, teaching materials,
and our own recordings of native speakers.
- A recording may be used simply as the basis
for a ‘live’ presentation → care must be taken
to make such presentations as natural as possible.
