Testing a Test: Evaluating Our Assessment Tools
This slideshow was used for teacher training workshops I conducted in the fall of 2011 at the Center for English as a Second Language, University of Arizona (Tucson, USA).

Presentation Transcript

  • ‘Testing a test’ – Evaluating our Assessment Tools
    Eddy White, Ph.D., Assessment Coordinator
    Center for English as a Second Language, University of Arizona


  • Targets
    1. My background
    2. Classroom-based assessment
    3. Tests – purposes/functions
    4. The ‘cardinal criteria’ for evaluating a test
    5. Conclusions

  • Targets
    1. My background
    2. Classroom-based assessment
    3. Tests – purposes/functions
    4. The ‘cardinal criteria’ for evaluating a test
    5. Conclusions

  • (1994–2009)

  • Classroom-based Assessment
    • Assessment of Learning
    • Assessment for Learning

  • Targets
    1. My background
    2. Classroom-based assessment
    3. Tests – purposes/functions
    4. The ‘cardinal criteria’ for evaluating a test
    5. Conclusions

  • The goal of assessment is to . . .

  • The goal of assessment has to be, above all, to support the improvement of learning and teaching. (Fredrickson & Collins, 1989)

  • Definition: Classroom Assessment – a cycle of Planning, Collecting, Analyzing, and Reporting

  • ESL Assessment – Purposes
    • identify strengths and weaknesses of individual students,
    • adjust instruction to build on students’ strengths and alleviate weaknesses,
    • monitor the effectiveness of instruction,
    • provide feedback to students (sponsors, parents, etc.), and
    • make decisions about the advancement of students to the next level of the program.
    (Source: ESL Senior High Guide to Implementation, 2002)

  • Consider
    • Research suggests that teachers spend from one-quarter to one-third of their professional time on assessment-related activities.
    • Almost all do so without the benefit of having learned the principles of sound assessment. (Stiggins, 2007)

  • Teachers learn how to teach without learning much about how to assess. (Heritage, 2007)

  • Assessment literacy
    • the kinds of assessment know-how and understanding that teachers need to assess their students effectively
    • Assessment-literate educators should have knowledge and skills related to the basic principles of quality assessment practices.
    (SERVE Center, University of North Carolina, 2004)

  • Assessment Literacy: Know-how and understanding teachers need to assess students effectively and maximize learning

  • Importance of classroom assessment
    • We may not like it, but students can and do ignore our teaching;
    • however, if they want to get a qualification, they have to participate in the assessment processes we design and implement.
    (Brown, S. 2004. Assessment for learning. Learning and Teaching in Higher Education, 1, 81–89)


  • Who are the assessment ‘deciders’ at your institution?

  • Classroom-Based Assessment: Challenges, Choices, and Consequences
  • Assessment Frameworks

  • Assessment framework
    • the series of assessment tools (exams, tasks, projects, etc.) that are scored and used to arrive at a summative grade for a course
    • it should be skills-based and knowledge-based (i.e., students demonstrate what they know about and can do with English)
    • based on learning outcomes

  • The spirit and style of student assessment defines the de facto curriculum. (Rowntree, 1987)
    de facto = existing in fact, actual, whether intended or not

  • Targets
    1. My background
    2. Classroom-based assessment
    3. Tests – purposes/functions
    4. The ‘cardinal criteria’ for evaluating a test
    5. Conclusions

  • Quiz time!

  • Assessing an English articles quiz
    Context:
    • Conversation class (listening & speaking)
    • high-beginner level

  • What is a fundamental problem with this quiz?

  • Answer

  • Targets
    1. My background
    2. Classroom-based assessment
    3. Tests – purposes/functions
    4. The ‘cardinal criteria’ for evaluating a test
    5. Conclusions

  • What is a test?

  • A test . . .
    • is a method of measuring a person’s ability, knowledge, or performance in a given domain.
    • is an instrument – a set of techniques, procedures, or items – that requires performance on the part of the test-taker.

  • Tests – measuring function

  • A test must measure
    • Some tests measure general ability, while others focus on very specific competencies or objectives.
    • Examples: a multi-skill proficiency test measures general ability; a quiz on recognizing correct use of definite articles measures very specific knowledge.

  • A test measures performance, . . . but the results imply the test-taker’s ability, or competence.

  • Performance-based tests sample the test-taker’s actual use of language, but from those samples the test administrator infers general competence.

  • A well-constructed test is an instrument that provides an accurate measure of a test-taker’s ability within a particular domain.
    • Constructing a good test is a complex task.

  • Your assessment practices?

  • Think about what is happening in your context and your assessment practices.

  • Your assessment practices? How do you assess your students?
    • True–False Items
    • Multiple Choice
    • Completion
    • Short Answer
    • Essay
    • Journals
    • Papers/Reports
    • Projects
    • Questionnaires
    • Presentations
    • Inventories
    • Checklists
    • Peer Rating
    • Self Rating
    • Practical Exam
    • Portfolios
    • Observations
    • Discussions
    • Interviews

  • For you, which of the four skills are more/less challenging to test?

  • Targets
    1. My background
    2. Classroom-based assessment
    3. Tests – purposes/functions
    4. The ‘cardinal criteria’ for evaluating a test
    5. Conclusions

  • Quiz time!

  • 2010

  • Exploring how principles of language assessment can and should be applied to formal tests.
    • These principles apply to assessment of all kinds.
    • How to use these principles to design a good test.

  • What are the ‘five cardinal criteria’ that can be used to design and evaluate all types of assessment?

  • Q. How do you know if a test is effective, appropriate, useful, or, in down-to-earth terms, a “good” test?

  • Five key assessment principles?
    • Discuss
    • 3 minutes
    • Hint (five nouns)

  • Five key assessment principles
    • Practicality
    • Reliability
    • Validity
    • Authenticity
    • Washback


  • Key Assessment Principles

  • These questions provide excellent criteria for evaluating the tests we design and use.


  • 1. Practicality
    • Is the procedure relatively easy to administer?

  • Practicality considerations
    • the logistical and administrative issues involved in making, giving, and scoring an assessment instrument
    • the amount of time it takes to construct and administer
    • the ease of scoring
    • ease of interpreting/reporting the results

  • An effective test is practical. This means that it:
    • is not excessively expensive
    • stays within appropriate time constraints
    • is relatively easy to administer, and
    • has a scoring/evaluation procedure that is specific and time-efficient

  • The value and quality of a test sometimes hinge on such nitty-gritty practical considerations.

  • In classroom-based testing, _________ is almost always a crucial practical factor for busy teachers.


  • 2. Reliability
    • Is all work being consistently marked to the same standard?

  • A reliable test is consistent and dependable.
    • If you give the same test to the same student or matched students on two different occasions, the test should yield similar results.

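The slides stop at the definition, but one standard way to put a number on this test-retest idea is to correlate the two sets of scores: a Pearson r close to 1.0 suggests a consistent instrument. A minimal sketch in Python, with invented student scores:

    # Test-retest reliability estimated as the Pearson correlation between
    # two administrations of the same test. All scores are invented.
    from math import sqrt

    def pearson_r(xs, ys):
        """Pearson correlation coefficient for paired score lists."""
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        var_x = sum((x - mean_x) ** 2 for x in xs)
        var_y = sum((y - mean_y) ** 2 for y in ys)
        return cov / sqrt(var_x * var_y)

    # Hypothetical scores for five students who sat the same test twice.
    first_sitting = [72, 85, 90, 64, 78]
    second_sitting = [70, 88, 91, 60, 80]

    print(f"test-retest estimate: r = {pearson_r(first_sitting, second_sitting):.2f}")
    # -> r = 0.99: the two sittings rank and space the students very similarly

A low r would point to one of the contributing factors taken up on the following slides, not necessarily to the students.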
  • What factors contribute to the unreliability of a test?

  • Test unreliability – contributing factors
    • Student-related reliability
    • Rater reliability (inter-rater, intra-rater; see the agreement sketch below)
    • Test administration reliability
    • Test reliability

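Of these factors, inter-rater reliability is the easiest to check empirically: have two raters score the same work and measure their agreement. The sketch below is illustrative only (the ratings are invented); it computes Cohen's kappa, a standard agreement statistic that discounts the agreement two raters would reach by chance.

    # Cohen's kappa for inter-rater reliability: agreement between two
    # raters on the same set of essays, corrected for chance agreement.
    from collections import Counter

    def cohens_kappa(rater_a, rater_b):
        n = len(rater_a)
        # Observed agreement: proportion of items both raters scored the same.
        p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        # Expected agreement: overlap of each rater's category frequencies.
        freq_a, freq_b = Counter(rater_a), Counter(rater_b)
        p_exp = sum((freq_a[c] / n) * (freq_b[c] / n)
                    for c in set(rater_a) | set(rater_b))
        return (p_obs - p_exp) / (1 - p_exp)

    # Hypothetical pass/fail judgments from two raters on eight essays.
    rater_1 = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail"]
    rater_2 = ["pass", "pass", "fail", "fail", "fail", "pass", "pass", "fail"]

    print(f"Cohen's kappa: {cohens_kappa(rater_1, rater_2):.2f}")  # -> 0.75

A kappa of 1.0 is perfect agreement; values near 0 mean the raters agree no more often than chance, a signal that the scoring criteria need tightening.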
  • Q. What is one key way to increase reliability?
    A. Use rubrics.

  • Rubrics are scoring guidelines.
    • They provide a way to make judgments fair and sound when assessing performance.
    • A uniform set of precisely defined criteria or guidelines is set forth to judge student work.

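To make “a uniform set of precisely defined criteria” concrete: a rubric can be modeled as a fixed table of weighted criteria that every rater applies to every student in exactly the same way. This is only an illustrative sketch; the criteria, weights, and bands are invented.

    # A rubric as a uniform, precisely defined set of scoring criteria.
    # Each criterion is scored on a 0-4 band and carries a fixed weight.
    RUBRIC = {
        "task_fulfilment": 0.30,
        "organization":    0.25,
        "grammar":         0.25,
        "vocabulary":      0.20,
    }

    def score_essay(band_scores):
        """Weighted total on the 0-4 scale; every criterion must be scored."""
        if set(band_scores) != set(RUBRIC):
            raise ValueError("every rubric criterion must be scored")
        return sum(RUBRIC[c] * band_scores[c] for c in RUBRIC)

    student = {"task_fulfilment": 3, "organization": 4, "grammar": 2, "vocabulary": 3}
    print(f"weighted rubric score: {score_essay(student):.2f} / 4")  # -> 3.00 / 4

Because the weights and required criteria live in one fixed definition, two raters cannot silently weight, say, grammar differently – the inter- and intra-rater consistency the surrounding slides call for.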

  • 3. Validity – the most complex and most important of the criteria
    • Does the assessment measure what we really want to measure?
  • Validity – definition
    • ‘The extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment.’ (Gronlund, 1998, p. 226)

  • A valid test of reading ability . . .
    • actually measures reading ability –
    • not math skills,
    • or previous knowledge in a subject,
    • nor writing skills,
    • nor some other variable of questionable relevance

  • How is the validity of a test established?
    1. Content validity
    2. Face validity

  • Content validity
    • If a test requires the test-taker to perform the behavior that is being measured . . .
    • it can claim content-related evidence of validity (content validity).
    • e.g., a test of a person’s ability to speak an L2 requires the student to actually speak within some sort of authentic context.
    • A test with paper-and-pencil multiple-choice questions requiring grammatical judgments does not achieve content validity.

  • Another way of understanding content validity is to consider the difference between direct and indirect testing.
    • direct testing – involves the test-taker in actually performing the target task
    • indirect testing – students do not perform the target task itself, but a related task (e.g., testing oral production of syllable stress)

  • To achieve content validity in classroom assessment, try to test performance directly.


  • How is the validity of a test established?
    1. Content validity
    2. Face validity

  • Face validity
    • The extent to which students view the assessment as:
      1. fair
      2. relevant
      3. useful for improving learning
    • Face validity refers to the degree to which a test looks right, and appears to measure the knowledge or abilities it claims to measure.

  • High face validity: the test . . .
    • is well-constructed, in an expected format with familiar tasks
    • is clearly doable within the allotted time
    • has items that are clear and uncomplicated
    • has directions that are crystal clear
    • has tasks related to course work (content validity)
    • has a difficulty level that presents a reasonable challenge

  • Validity is the most significant cardinal principle of assessment evaluation.
    • If validity is not established, all other considerations may be rendered useless.


  • 4. Authenticity
    • Are students asked to perform real-world tasks?

  • Test task authenticity
    • tasks represent, or closely approximate, real-world tasks
    • the task is likely to be enacted in the “real world”
    • not contrived or artificial

  • Authenticity checklist
    • Is the language in the test as natural as possible?
    • Are topics as contextualized as possible rather than isolated?
    • Are topics and situations interesting, enjoyable, and/or humorous?
    • Is some thematic organization provided, such as through a story line or episode?
    • Do tasks represent, or closely approximate, real-world tasks?


  • 5. Washback
    • Does the assessment have positive effects on learning and teaching?

  • Washback = the effect of testing on teaching and learning
    • positive washback
    • negative washback

  • Washback
    • Classroom assessment: the effects of an assessment on teaching and learning prior to the assessment itself (preparation)
    • Another form of washback = the information that ‘washes back’ to students in the form of useful diagnoses of strengths and weaknesses.
    • Formal tests provide no washback if students receive a simple letter grade or single overall numerical score.

  • A test that provides beneficial washback . . .
    • positively influences what and how teachers teach
    • positively influences what and how students learn
    • offers learners a chance to adequately prepare
    • gives learners feedback that enhances their language development
    • provides conditions for peak performance by the learner

  • Teachers’ challenge
    • to create classroom tests that serve as learning tools through which washback is achieved


  • Targets
    1. My background
    2. Classroom-based assessment
    3. Tests – purposes/functions
    4. The ‘cardinal criteria’ for evaluating a test
    5. Conclusions


  • Q. How do you know if a test is effective, appropriate, useful, or, in down-to-earth terms, a “good” test?

  • Answer. A ‘good’ test:
    • can be given within appropriate administrative constraints,
    • is dependable,
    • accurately measures what you want it to measure,
    • uses language representative of real-world language use, and
    • provides information that is useful for the learner.

  • These principles will help you make accurate judgments about the English competence of your students.
    • They provide useful guidelines for evaluating existing tests, and designing your own.

  • Assessment Literacy: Know-how and understanding teachers need to assess students effectively and maximize learning

  • There is no getting away from the fact that most of the things that go wrong with assessment are our fault, the result of poor assessment design – and not the fault of our students. (Race et al., 2005)

  • Improving student learning implies improving the assessment system.
    • Teachers often assume that it is their teaching that directs student learning.
    • In practice, assessment directs student learning, because it is the assessment system that defines what is worth learning. (Havnes, 2004, p. 1)

  • There is substantial evidence that assessment, rather than teaching, has the major influence on students’ learning.
    • It directs attention to what is important, acts as an incentive for study, and has a powerful effect on students’ approaches to their work.
    (Boud & Falchikov, 2007, Rethinking Assessment in Higher Education)
  • “We owe it to ourselves and our students to devote at least as much energy to ensuring that our assessment practices are worthwhile as we do to ensuring that we teach well.”
    – Dr. David Boud, University of Technology, Sydney, Australia

  • Thank you for your time and participation.