1. Fostering a culture shift
in assessment and feedback
through TESTA
Professor Tansy Jessop
Seminar at the University of Liverpool
13 March 2017
2. What are the issues?
1. The central assessment problem at Liverpool is…
2. The main feedback problem is…
3. My blue skies idea is…
3. This session
1. Brief overview of TESTA
2. Why people find it useful
3. Three problems TESTA addresses
4. Four themes in the data with activities
5. Solutions: a taster
9. Three problems
Problem 1: Something awry, not sure why
Problem 2: Curriculum design problem
Problem 3: The problem of educational change
13. …has led to over-emphasising knowing
(Barnett and Coate 2005)
• Knowing is about content
• Acting is about becoming
a historian, actor,
psychologist, or
philosopher
• Being is about
understanding yourself,
orienting yourself and
relating your knowledge
and action to the world
[Venn diagram: Knowing, Acting, Being]
15. The best approach from the student’s perspective is to focus
on concepts. I’m sorry to break it to you, but your students are
not going to remember 90 per cent – possibly 99 per cent – of
what you teach them unless it’s conceptual…. When broad,
over-arching connections are made, education occurs. Most
details are only a necessary means to that end.
http://www.timeshighereducation.co.uk/features/a-students-lecture-to-rofessors/2013238.fullarticle
A student’s lecture to her professor
16. Problem 3: The educational change problem
Three misguided assumptions:
1. There is not enough high-quality data.
2. Data will do it.
3. Academics will buy it.
http://www.liberalarts.wabash.edu/study-overview/
17. Proving is different from improving
“It is incredibly difficult to translate assessment
evidence into improvements in student learning”
“It’s far less risky and complicated to analyze data
than it is to act”
(Blaich & Wise, 2011)
18. Paradigms and what they look like
• Technical rational: focus on data and tools
• Relational: focus on people
• Emancipatory: focus on systems and structures
19. TESTA themes and impacts
1. Variations in assessment patterns
2. High summative: low formative
3. Disconnected feedback
4. Lack of clarity about goals and standards
20. Defining the terms
• Summative assessment carries a grade which counts towards
the degree classification.
• Formative assessment does not count towards the degree
(whether pass/fail or graded), elicits comments, and is
required of all students.
21. 1. Variations in assessment patterns
• What is striking for
you about this data?
• How does it compare
with your context?
• Does variation
matter?
22. Variations in assessment diets (n=73 UG degrees in 14 UK universities)
Characteristic: Range
Summative assessments: 12–227
Formative assessments: 0–116
Varieties of assessment: 5–21
Proportion of examinations: 0%–87%
Time to return marks & feedback: 10–42 days
Volume of oral feedback: 37–1,800 minutes
Volume of written feedback: 936–22,000 words
23. Patterns on three-year UG degrees (n=73 programmes in 14 universities)
Characteristic: Low | Medium | High
Volume of summative assessment: below 33 | 40–48 | more than 48
Volume of formative only: below 1 | 5–19 | more than 19
% of tasks by examination: below 11% | 22–31% | more than 31%
Variety of assessment methods: below 8 | 11–15 | more than 15
Written feedback in words: less than 3,800 | 6,000–7,600 | more than 7,600
24. Theme 2: High summative, low formative
• Summative ‘pedagogies of control’
• Circa 2 summative tasks per module in the UK
• A 1:8 ratio of formative to summative tasks
• Formative assessment weakly understood and practised
26. What students say about high summative
• A lot of people don’t do wider reading. You just focus
on your essay question.
• In Weeks 9 to 12 there is hardly anyone in our
lectures. I'd rather use those two hours of lectures to
get the assignment done.
• It’s been non-stop assignments, and I’m now free of
assignments until the exams – I’ve had to rush every
piece of work I’ve done.
27. Deep and Surface Learning (Marton and Saljo, 1976)
Deep Learning
• Meaning
• Concepts
• Active learning
• Generating knowledge
• Relating new to previous knowledge
• Real-world learning
Surface Learning
• External purpose
• Topics
• Passive process
• Reproducing knowledge
• Isolated and
disconnected knowledge
• Artificial learning
28. What students say about formative
• If there weren’t loads of other assessments, I’d do
it.
• If there are no actual consequences of not doing
it, most students are going to sit in the bar.
• It’s good to know you’re being graded because
you take it more seriously.
• The lecturers do formative assessment but we
don’t get any feedback on it.
30. Why formative matters
1) Low-risk opportunities for students to learn from feedback (Sadler, 1989)
2) Students fine-tune and understand requirements and standards (Boud, 2000; Nicol, 2006)
3) Feedback to lecturers from formative tasks helps to adapt teaching (Hattie, 2009)
4) Cycles of reflection and collaboration (Biggs, 2003; Nicol & Macfarlane-Dick, 2006)
5) Distributes student effort (Gibbs & Simpson, 2004).
31. So, how do we do it?
Five case studies of successful formative assessment.
Your task will be to identify the principles that make them work.
How could you adapt them?
32. Case Study 1: Business School
• Reduction from an average of 2 summative and zero formative tasks per module…
• …to 1 summative and 3 formative tasks
• Required of students across the entire Business School
• All working to a similar script
• Systematic shift, experimentation, less risky together
33. Case Study 2: Social Sciences
• Education, Sociology and PGCert in HE degrees
• Problem: silent seminar, students not reading
• Blogging on current academic texts
• Threads and live discussion
• Linked to summative
34. Case Study 3: Media degree
• Formative presentations
• Students get feedback (peer and tutor)
• Refines their thinking for…
• …a linked summative essay
35. Case study 4: Film and TV
• Seminar
• Problem: lack of discrimination about sources
• Students bring 1 x book, 1 x chapter, 1 x
journal article, 2 x pop culture articles
• Justify choices to group
• Reach consensus about five best sources
36. Case study 5: Engineering
• Problem: low averages
• Course requirement to complete 50 problems
• Peer assessed in six ‘lecture’ slots
• Marks do not count
• Lectures, problems, classes and exams unchanged
• Exam marks increased from 45% to 85%
37. Your task
• In groups, identify five principles for making
formative work. Write them down on flipchart
paper.
• Devise an adaptation for your discipline using the
principles, and discuss at your tables what you already
do or what might work.
39. Take five
• Choose a quote that
strikes you.
• What is the key issue?
• What strategies might
address this issue?
40. What students say…
It’s difficult because your assignments are so detached
from the next one you do for that subject. They don’t
relate to each other.
Because it’s at the end of the module, it doesn’t feed into
our future work.
Because they have to mark so many that our essay
becomes lost in the sea that they have to mark.
It was like ‘Who’s Holly?’ It’s that relationship where
you’re just a student.
41. Actions based on evidence
• Conversation: who starts the dialogue?
• Iterative cycles of reflection across modules
• Quick generic feedback: the ‘Sherlock’ factor
• Feedback synthesis tasks
• Technology: audio, screencast and blogging
• From feedback as ‘telling’…
• … to feedback as asking questions
42. Theme 4: Confusion about goals and standards
• Consistently low scores on the Assessment Experience
Questionnaire (AEQ) for clear goals and standards
• Alienation from the tools, especially criteria
and guidelines
• Symptoms: perceptions of marker variation,
unfair standards and inconsistencies in practice
43. What the literature says…
Marking is important. The grades we give
students and the decisions we make about
whether they pass or fail coursework and
examinations are at the heart of our academic
standards (Bloxham, Boyd and Orr 2011).
Grades matter (Sadler 2009).
44. What the papers say…
https://www.timeshighereducation.co.uk/news/examiners-give-hugely-different-marks/2019946.article
45. QAA: a paradigm of accountability
• Learning outcomes
• Criteria-based learning
• Meticulous specification
• Written discourse
• Generic discourse (Woolf 2004)
• Intended to reduce the arbitrariness of staff
decisions (Sadler 2009).
46. What students say…
We’ve got two tutors – one marks completely differently to
the other and it’s pot luck which one you get.
They have different criteria, they build up their own criteria.
It’s such a guessing game.... You don’t know what they
expect from you.
They read the essay and then they get a general impression,
then they pluck a mark from the air.
47. What’s going wrong here?
There are criteria, but I find them really strange.
There’s “writing coherently, making sure the argument
that you present is backed up with evidence”.
Q: If you could change one thing to improve what
would it be?
A: More consistent marking, more consistency across
everything and that they would talk to each other.
48. But is this quite ‘normal’?
Differences between markers are not ‘error’, but
rather the inescapable outcome of the multiplicity of
perspectives that assessors bring with them
(Shay 2005, 665).
The tension between ‘the scientific aspirations of
assessment technologies to represent an objective
reality and the unavoidable subjectivities injected by
the human focus of these technologies’
(Broadfoot 2002, 157).
51. The Art and Science of Evaluation
Judging is both an art and a science: It is an art
because the decisions with which a judge is
constantly faced are very often based on
considerations of an intangible nature that can only
be recognized intuitively. It is also a science because
without a sound knowledge of a dog’s points and
anatomy, a judge cannot make a proper assessment
of it whether it is standing or in motion.
Horner, T. (1975) Take Them Round, Please: The Art of Judging Dogs.
52. Marking as social practice
The typical technologies of our assessment and
moderation systems – marking memorandum,
double-marking, external examiners – privilege
reliability. These technologies are not in themselves
problematic. The problem is our failing to use these
technologies as opportunities for dialogue about
what we really value as assessors, individually and as
communities of practice
(Shay 2005).
53. Taking action: internalising goals and standards
Lecturers:
• Regular calibration exercises
• Discussion and dialogue
• Discipline-specific criteria (no cut and paste)
Lecturers and students:
• Rewrite/co-create criteria
• Marking exercises
• Exemplars
Students:
• Enter the secret garden: peer review
• Engage in drafting processes
• Self-reflection
57. References
Arum, R. and Roksa, J. (2011) Academically Adrift: Limited Learning on College Campuses. Chicago: University of Chicago Press.
Blaich, C. and Wise, K. (2011) From Gathering to Using Assessment Results: Lessons from the Wabash National Study. Occasional Paper No. 8. University of Illinois: National Institute for Learning Outcomes Assessment.
Bloxham, S., Boyd, P. and Orr, S. (2011) Mark my words: the role of assessment criteria in UK higher education practices. Studies in Higher Education, 36(6), 655-670.
Boud, D. and Molloy, E. (2013) Rethinking models of feedback for learning: the challenge of design. Assessment & Evaluation in Higher Education, 38(6), 698-712.
Gibbs, G. and Simpson, C. (2004) Conditions under which assessment supports students’ learning. Learning and Teaching in Higher Education, 1(1), 3-31.
Harland, T., McLean, A., Wass, R., Miller, E. and Sim, K. N. (2014) An assessment arms race and its fallout: high-stakes grading and the case for slow scholarship. Assessment & Evaluation in Higher Education.
Jessop, T. and Tomas, C. (2016) The implications of programme assessment on student learning. Assessment and Evaluation in Higher Education. Published online 2 August 2016.
Jessop, T. and Maleckar, B. (2014) The influence of disciplinary assessment patterns on student learning: a comparative study. Studies in Higher Education. Published online 27 August 2014.
Jessop, T., El Hakim, Y. and Gibbs, G. (2014) The whole is greater than the sum of its parts: a large-scale study of students’ learning in response to different assessment patterns. Assessment and Evaluation in Higher Education, 39(1), 73-88.
Nicol, D. (2010) From monologue to dialogue: improving written feedback processes in mass higher education. Assessment & Evaluation in Higher Education, 35(5), 501-517.
O’Donovan, B., Price, M. and Rust, C. (2008) Developing student understanding of assessment standards: a nested hierarchy of approaches. Teaching in Higher Education, 13(2), 205-217.
Sadler, D. R. (1989) Formative assessment and the design of instructional systems. Instructional Science, 18(2), 119-144.
Shay, S. B. (2005) The assessment of complex tasks: a double reading. Studies in Higher Education, 30(6), 663-679.
Woolf, H. (2004) Assessment criteria: reflections on current practices. Assessment and Evaluation in Higher Education, 29(4), 479-493.
Editor's Notes
What started as a research methodology has become a way of thinking. David Nicol – changing the discourse, the way we think about assessment and feedback; not only technical work, research and mapping, but also shaping our thinking. Evidence, assessment principles. Habermas framework.
I realised what we were saying was ‘That’s only two per module’. And I was like ‘Ah, but that’s the point. This is a programmatic thing and you’re used to thinking about a module’
(Programme Leader, American Studies).
Data – persistent problem with A&F scores. Traffic light systems – green for good. DVC: find the people who are doing well so we can share best practice. Three programmes. Neil McCaw
Hard to make connections, difficult to see the joins between assessments; much more assessment to accredit each little box. Multiplier effect. Less challenge, less integration. Lots of little neo-liberal tasks. The Assessment Arms Race.
Language of ‘covering material’. Should we be surprised?
Wabash study – 2005-2011, 17,000 students in 49 American colleges, 60-70 publications. Critical thinking, moral reasoning, leadership towards social justice, engagement in diversity, deep intellectual work.
TESTA has done the data and that’s been useful. Ideological compromises. Mixed methods approaches. Critical pedagogy sleeping with the enemy. Democratic, participatory, liberating curriculum and pedagogy. Teachers and students shape and change education. Resist managerialism and the market. Risky pedagogies.
Teach Less, learn more. Assess less, learn more.
The Swedes didn’t just give us IKEA
Students can increase their understanding of the language of assessment through their active engagement in ‘observation, imitation, dialogue and practice’ (Rust, Price, and O’Donovan 2003, 152). Dialogue, clever strategies, social practice, relationship building, relinquishing power.