Cracking the challenges of
assessment and feedback
Professor Tansy Jessop
Medical Educators Course
12 May 2017
Two premises
Assessment drives what students pay attention to,
and defines the actual curriculum
(Ramsden 1992).
Feedback is the single most important factor in
student learning
(Hattie 2007).
Mixed methods approach (TESTA)
• Programme Team Meeting
• Assessment Experience Questionnaire (AEQ)
• Programme Audit
• Student Focus Groups
Growth of TESTA
Widespread use
A modular degree
Does IKEA 101 work for complex learning?
Defining the terms
• Summative assessment carries a grade which
counts towards the degree classification.
• Formative assessment does not count
towards the degree, elicits feedback, and is
required.
Definitions of formative assessment
• Basic idea is simple – to contribute to student
learning through the provision of information about
performance (Yorke, 2003).
• A fine tuning mechanism for how and what we learn
(Boud 2000).
• ‘…short-circuiting the randomness and inefficiency of
trial-and-error learning’ (Sadler, 1989, p.120).
1. Variations in assessment patterns
• What is striking for
you about this data?
• How does it compare
with your context?
• Does variation
matter?
Variations in assessment diets (n=73 UG degrees in 14 UK universities)

Characteristic | Range
Summative | 12–227
Formative | 0–116
Varieties of assessment | 5–21
Proportion of examinations | 0%–87%
Time to return marks & feedback | 10–42 days
Volume of oral feedback | 37–1,800 minutes
Volume of written feedback | 936–22,000 words
Patterns on three-year UG degrees (n=73 programmes in 14 universities)

Characteristic | Low | Medium | High
Volume of summative assessment | Below 33 | 40–48 | More than 48
Volume of formative only | Below 1 | 5–19 | More than 19
% of tasks by examinations | Below 11% | 22–31% | More than 31%
Variety of assessment methods | Below 8 | 11–15 | More than 15
Written feedback in words | Less than 3,800 | 6,000–7,600 | More than 7,600
2. High summative: low formative
• High summative on UK, Irish, NZ and Indian degrees
• Summative as a ‘pedagogy of control’
• Low formative: roughly one formative task for every
eight summative (1:8)
• Formative weakly practised and poorly understood
Assessment Arms Race
What students say about high summative
• A lot of people don’t do wider reading. You just focus
on your essay question.
• In Weeks 9 to 12 there is hardly anyone in our
lectures. I'd rather use those two hours of lectures to
get the assignment done.
• It’s been non-stop assignments, and I’m now free of
assignments until the exams – I’ve had to rush every
piece of work I’ve done.
Deep and Surface Learning
Deep Learning
• Meaning
• Concepts
• Active learning
• Generating knowledge
• Relating new to
previous knowledge
• Real-world learning
Surface Learning
• External purpose
• Topics
• Passive process
• Reproducing knowledge
• Isolated and disconnected
knowledge
• Artificial learning
(Marton and Saljo 1976)
What students say about formative…
• It was really useful. We were assessed on it but we
weren’t officially given a grade, but they did give us
feedback on how we did.
• It didn’t actually count so that helped quite a lot
because it was just a practice and didn’t really
matter what we did and we could learn from
mistakes so that was quite useful.
But…
• If there weren’t loads of other assessments, I’d do it.
• If there are no actual consequences of not doing it,
most students are going to sit in the bar.
• It’s good to know you’re being graded because you
take it more seriously.
• The lecturers do formative assessment but we don’t
get any feedback on it.
So, how do we do it?
Five case studies of
successful formative
Identify the principles that
make them work
How could you adapt them?
Case Study 1: Business School
• Reduction from an average of 2 x summative and
zero formative per module…
• …to 1 x summative and 3 x formative
• Required of students across the entire business school
• A systematic shift: experimenting together made
change less risky
Case Study 2: Social Sciences
• Problem: silent seminar, students not reading
• Blogging on current academic texts
• Threads and live discussion
• Linked to summative
Case Study 3: Media degree
• Formative presentations
• Students get feedback (peer and tutor)
• Feedback refines their thinking for…
• …a linked summative essay
Case study 4: Film and TV
• Seminar
• Problem: lack of discrimination about sources
• Students bring 1 x book, 1 x chapter, 1 x
journal article, 2 x pop culture articles
• Justify choices to group
• Reach consensus about five best sources
Learning-oriented summative?
https://www.youtube.com/watch?v=ZVFwQzlVFy0
3. Disconnected feedback
Lose-Lose situation
It was heavy, tons of marking for
the tutor. It was such hard work.
It was criminal.
Media Course Leader
I’m really bad at reading
feedback. I’ll look at the mark
and then be like ‘well stuff it, I
can’t do anything about it’
Student, TESTA focus group
What students say…
It’s difficult because your assignments are so detached
from the next one you do for that subject. They don’t
relate to each other.
Because it’s at the end of the module, it doesn’t feed into
our future work.
Because they have to mark so many that our essay
becomes lost in the sea that they have to mark.
It was like ‘Who’s Holly?’ It’s that relationship where
you’re just a student.
Emotional impact
What students say…
It’s always the negatives you remember, as we’ve
all said. It’s always the negatives. We hardly ever
pick out the really positive points because once
you’ve seen the negative, the negatives can
outweigh the positives.
But hedging is unhelpful…
They just pacify really. I went for help and they just
told me what I wanted to hear, not what I needed to
know.
It’s very positive, like nobody ever says ‘no, you’ve
done that completely wrong’. It’s always ‘You’ve
done that very well’. Well why have I got a low
grade then? It doesn’t really help you from there.
Take five
• What are the issues with
emotions in giving feedback?
• How honest is your
feedback? What stops you?
• What is your view of giving
personalised feedback?
• What challenges does it
pose?
A quick case study
I’m baffled. Students love my feedback
but they are a voice in the wilderness…
Is it a paradigm issue?
Scientific paradigm | Naturalistic paradigm
Neutrality | Interpretation
External environment | Personal and subjective
Marking apparatus – multiple audiences | Conversation – single audience in mind
Written and traceable | Free, ephemeral, incidental, more gaps
Convergent | Divergent
Standardised | Varied
Final word | Dialogic
Accountability and evidence | Social practice
4. Criteria and standards
What the literature says…
Marking is important. The grades we give students
and the decisions we make about whether they
pass or fail coursework and examinations are at the
heart of our academic standards
(Bloxham, Boyd and Orr 2011).
Grades matter (Sadler 2009).
What the papers say…
https://www.timeshighereducation.co.uk/news/examiners-give-hugely-different-marks/2019946.article
QAA: a paradigm of accountability
• Learning outcomes
• Criteria-based learning
• Meticulous specification
• Written discourse
• Generic discourse (Woolf 2004)
• Intended to reduce the arbitrariness of staff
decisions (Sadler 2009).
What students say…
We’ve got two tutors – one marks completely differently to
the other and it’s pot luck which one you get.
They have different criteria; they build up their own criteria.
It’s such a guessing game… You don’t know what they
expect from you.
They read the essay and then they get a general impression,
then they pluck a mark from the air.
What’s going wrong here?
There are criteria, but I find them really strange.
There’s “writing coherently, making sure the argument
that you present is backed up with evidence”.
Q: If you could change one thing to improve what
would it be?
A: More consistent marking, more consistency across
everything and that they would talk to each other.
But is this quite ‘normal’?
Differences between markers are not ‘error’, but
rather the inescapable outcome of the
multiplicity of perspectives that assessors bring
with them
(Shay 2005, 665).
Having ‘an eye for a dog’
The Art and Science of Evaluation
Judging is both an art and a science: It is an art
because the decisions with which a judge is
constantly faced are very often based on
considerations of an intangible nature that can only
be recognized intuitively. It is also a science because
without a sound knowledge of a dog’s points and
anatomy, a judge cannot make a proper assessment
of it whether it is standing or in motion.
Horner, T. (1975) Take Them Round Please: The Art of Judging Dogs.
Making criteria explicit: a continuum
• Implicit criteria
• Explicit, written, justified criteria
• Co-creation and participation
• Active engagement by students
Marking as social practice
The typical technologies of our assessment and
moderation systems – marking memorandum,
double-marking, external examiners – privilege
reliability. These technologies are not in themselves
problematic. The problem is our failing to use these
technologies as opportunities for dialogue about
what we really value as assessors, individually and as
communities of practice
(Shay 2005).
Taking action: internalising goals and
standards
• Regular calibration exercises
• Discussion and dialogue
• Discipline specific criteria (no cut and paste)
Lecturers
• Rewrite/co-create criteria
• Marking exercises
• Exemplars
Lecturers
and students
• Enter secret garden - peer review
• Engage in drafting processes
• Self-reflection
Students
From this educational paradigm…
Transmission Model → Social Constructivist Model
References
Barlow, A. and Jessop, T. (2016) “You can’t write a load of rubbish”: Why blogging works as formative
assessment. Educational Development, 17(3), 12-15. SEDA.
Berg, M. and Seeber, B. (2016) The Slow Professor: Challenging the Culture of Speed in the Academy.
Toronto: University of Toronto Press.
Boud, D. and Molloy, E. (2013) Rethinking models of feedback for learning: The challenge of design.
Assessment & Evaluation in Higher Education, 38(6), 698-712.
Gibbs, G. and Simpson, C. (2004) Conditions under which assessment supports students’ learning.
Learning and Teaching in Higher Education, 1(1), 3-31.
Harland, T., McLean, A., Wass, R., Miller, E. and Sim, K. N. (2015) An assessment arms race and its fallout:
High-stakes grading and the case for slow scholarship. Assessment & Evaluation in Higher Education,
40(4), 528-541.
Jessop, T. and Tomas, C. (2016) The implications of programme assessment on student learning.
Assessment & Evaluation in Higher Education. Published online 2 August 2016.
Jessop, T. and Maleckar, B. (2014) The influence of disciplinary assessment patterns on student learning:
A comparative study. Studies in Higher Education, 41(4), 696-711.
Jessop, T., El Hakim, Y. and Gibbs, G. (2014) The whole is greater than the sum of its parts: A large-scale
study of students’ learning in response to different assessment patterns. Assessment & Evaluation in
Higher Education, 39(1), 73-88.
Nicol, D. (2010) From monologue to dialogue: Improving written feedback processes in mass higher
education. Assessment & Evaluation in Higher Education, 35(5), 501-517.
O’Donovan, B., Price, M. and Rust, C. (2008) Developing student understanding of assessment standards:
A nested hierarchy of approaches. Teaching in Higher Education, 13(2), 205-217.
Sadler, D. R. (1989) Formative assessment and the design of instructional systems. Instructional Science,
18(2), 119-144.
Shay, S. B. (2005) The assessment of complex tasks: A double reading. Studies in Higher Education, 30,
663-679.
Woolf, H. (2004) Assessment criteria: Reflections on current practices. Assessment & Evaluation in Higher
Education, 24(4), 479-493.


Editor's Notes

  • #4 Research and change process. Three premises: assessment drives the curriculum; feedback is ‘the single most important factor in student learning’; and the programme is the most important place to influence change.
  • #7 Academics operate in isolation from one another, only see their part of the degree, and don’t see the connections. The curriculum fragments into small tasks – a hamster wheel. It is a curriculum design issue. The trouble is that students experience the whole elephant, and it is often indigestible… ‘Assessment is mainly sort of the topical knowledge and the topics never relate. We’ll never do something again that we’ve already studied, like we learn something and then just move on’ (TESTA focus group data).
  • #8 Hard to make connections, difficult to see the joins between assessments; much more assessment to accredit each little box. Multiplier effect. Less challenge, less integration. Lots of little neo-liberal tasks. The Assessment Arms Race.
  • #15 Teach less, learn more. Assess less, learn more.
  • #17 The Swedes didn’t just give us IKEA.
  • #27 Feedback: all that effort, but what is the effect? (Margaret Price) But lots of projects and programmes do…
  • #46 Students can increase their understanding of the language of assessment through active engagement in ‘observation, imitation, dialogue and practice’ (Rust, Price, and O’Donovan 2003, 152). Dialogue, clever strategies, social practice, relationship building, relinquishing power.