Assessing with confidence
Why ask students about certainty?
● What do you know?
● How certain is your knowledge?
How can we get students to honestly report their certainty?
Confidence ≡ certainty
Confidence-based marking (CBM)
NB terminology: ‘certainty’ would be better, but ‘confidence’ has stuck
Certainty   Mark if correct   Mark if wrong
Low         1                 0
Medium      2                 -2
High        3                 -6
Tentative & correct
Cocksure – and wrong!
Gardner-Medwin & Curtin (2007) Certainty-Based Marking (CBM) for Reflective Learning and Proper Knowledge Assessment
Some trial questions
Question                             Taken by   Mean score
What is 2 + 2?                       238        2.51
What is derivative of x³?            223        -0.47
Who painted the 'Mona Lisa'?         212        2.14
Who is the 'Mona Lisa'?              208        0.21
Uncertainty principle -- whose?      207        0.03
Uncertainty principle -- formula?    218        0.01
Do students honestly assess confidence?
Question                             High   Medium   Low
What is 2 + 2?                       210    5        23
What is derivative of x³?            96     46       81
Who painted the 'Mona Lisa'?         165    21       26
Who is the 'Mona Lisa'?              56     41       111
Uncertainty principle -- whose?      62     31       114
Uncertainty principle -- formula?    39     21       158
CBM – for learning and revision
Under exam conditions…
Benefits to students
CBM – motivations
● Rewards care and effort
● Greater engagement
● Encourages reflective learning
● Encourages self-assessment
Do students like CBM?
Yes – regard it as fair and challenging, helpful to learning
No – less likely to do CBM than MCQ when optional
Schoendorfer, N., & Emmett, D. (2012). Use of certainty-based marking in a second-year medical student cohort: a pilot study. Advances in Medical Education and Practice, 3, 139–43. doi:10.2147/AMEP.S35972
Nix, I., & Wyllie, A. (2011). Exploring design features to enhance computer-based assessment: Learners’ views on using a confidence-indicator tool and computer-based feedback. British Journal of Educational Technology, 42(1), 101–112. doi:10.1111/j.1467-8535.2009.00992.x
Barr, D. A., & Burke, J. R. (2013). Using confidence-based marking in a laboratory setting: A tool for student self-assessment and learning. The Journal of Chiropractic Education, 27(1), 21–26. doi:10.7899/JCE-12-018
Is CBM fair?
● No significant gender differences
● Very few students seem over-confident, but some were under-confident
‘In decision-rich occupations such as medicine, mis-calibration of reliability is a serious handicap’ Gardner-Medwin (2014, p.6)
● Scores will generally be lower when marked as CBM than MCQ, but it is possible to scale to non-CBM marking to set grade boundaries
Gardner-Medwin, A. R., & Gahan, M. (2003). Formative and Summative Confidence-Based Assessment. In Proc. 7th International Computer-Aided Assessment Conference (pp. 147–155). Retrieved from www.caaconference.com
Gardner-Medwin, T. (2014). CBM selftests at UCL: The past and the future of LAPT. Retrieved from
Types of questions
Open question Multiple-choice question Confidence-based
What is wrong with MCQ?
Easy to implement
May engender misconceptions
Numeric easy to mark
Tests deeper learning
Can find misconceptions
Free text still difficult to mark
‘Open’ CBM – what benefit?
Easy to implement
May engender misconceptions
Not always applicable
● ‘Global training partner to pharma companies’
● Need to establish that reps are properly trained for compliance
● Platform that delivers:
● Regular questions from a bank
● CBM assessment
● Mastery = all questions correctly answered more than once
Future of CBM?
● Fit with competency and mastery assessment is a good one
● Use in formative / revision contexts avoids issues to do with unconventional marking and assigning grade boundaries
● Dislike of negative marking
● Poor platform support – but improving
● Difficulty of marking more complex question types
Repeat of a talk from about 5 years ago – what has changed?
Answer: not much!
Will review why it is a good assessment technique
Not been able to persuade OU colleagues to use it
Remains niche in rest of world – but may have found a good niche
This is a confidence-based question
Ask a question – often multiple choice, doesn’t have to be
But also ask students to report their certainty
Can give feedback as normal, mark as CBM.
Teachers & trainers need to know what students know and what they don’t know.
But both students and teachers need to know whether students are certain about what they know and don’t know – if not, there could be problems. At worst, they could think they know but be wrong – and therefore make mistakes. If they are uncertain, they can’t make correct decisions.
Certainty or confidence? Subtlety in language – claim that what students are indicating and what CBM encourages is certainty, not confidence as character trait.
In typical university settings, students are driven by marks!
Show an example later which has taken a gaming approach
First, student who is tentative – if they get it correct they get some marks, if they get it wrong get nothing.
Next, student who is both correct and certain – they get maximum marks.
Finally, student who is overconfident – and wrong. They get penalty.
So really important to judge certainty realistically – if you report high certainty, risk losing marks if you are wrong. If you report low certainty, you can’t score good marks.
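As a minimal sketch of the marking rule described in these notes, the Python below looks up a mark from the Low/Medium/High values shown on the marking slide. The names `CBM_MARKS` and `cbm_mark` are illustrative assumptions, not taken from any CBM platform.

```python
# Marks from the marking slide: (mark if correct, mark if wrong).
CBM_MARKS = {
    "low": (1, 0),
    "medium": (2, -2),
    "high": (3, -6),
}


def cbm_mark(certainty: str, correct: bool) -> int:
    """Return the CBM mark for one answer at the reported certainty level."""
    if_correct, if_wrong = CBM_MARKS[certainty]
    return if_correct if correct else if_wrong


# The three students described in the notes above.
print(cbm_mark("low", True))    # tentative and correct: 1
print(cbm_mark("high", True))   # certain and correct: 3
print(cbm_mark("high", False))  # overconfident and wrong: -6
```

Reporting low certainty caps the mark at 1, while reporting high certainty risks the largest penalty, which is the trade-off the notes describe.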
An easy question that everyone is certain about – average score is high
A really difficult question on quantum physics – nearly everyone gets wrong, so score is close to zero.
The tricky one – something that many university students ought to know, but quite a few get wrong, so average mark is negative.
Detail shows how students are answering
Easy question: most answer with high confidence
Difficult question: most answer with low confidence
Tricky question: splits into those who know, and those who don’t.
First note there is lots of data. This is for students doing practice tests for learning & revision; not for serious assessment. Means you get full range of poor to good results.
Some students better than others – better students are to right of graph getting greater percentage correct. Weaker students are to left, getting fewer correct.
If they get more correct, then expect score to be higher – that’s on vertical scale. But you can score best by setting certainty correctly, so someone who got say 60% correct overall would get higher mark if they set certainty/confidence sensibly for each question.
Scores above the pale green line (the one with corners) show students judging confidence successfully. The green line shows the score from always setting the same confidence (low, medium or high) without adjusting for each individual question.
Corners represent places where switch from low to medium to high confidence should occur, if student knows only how good they are overall.
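One way to see where the corners should fall is to work out the expected mark at each certainty level. The sketch below assumes the 1/0, 2/-2, 3/-6 marks from the marking slide and a student who knows only their overall probability `p` of being right. The function names are hypothetical, and the calculation is an illustration derived from those marks rather than the method used to draw the graph.

```python
from fractions import Fraction

# (mark if correct, mark if wrong) for each certainty level.
CBM_MARKS = {"low": (1, 0), "medium": (2, -2), "high": (3, -6)}


def expected_mark(certainty, p):
    """Expected mark if the answer is correct with probability p."""
    if_correct, if_wrong = CBM_MARKS[certainty]
    return p * if_correct + (1 - p) * if_wrong


def best_certainty(p):
    """Certainty level with the highest expected mark (ties go to the lower level)."""
    return max(CBM_MARKS, key=lambda level: expected_mark(level, p))


for p in (Fraction(1, 2), Fraction(7, 10), Fraction(9, 10)):
    print(p, best_certainty(p))
```

Under these assumptions low and medium give the same expected mark at p = 2/3, and medium and high at p = 4/5, so those are the points where a student with only overall knowledge of their accuracy should switch level.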
Most students, even for revision where marks don’t count, are setting confidence level sensibly
A few students look like they are doing really badly – negative marks overall!
But remember these are students exploring, not serious final assessment.
Data from Tony Gardner-Medwin’s medical students
Another way of looking at same data which highlights the effect of certainty – most results are above the line, indicating students are aware of where their knowledge is reliable
Under exam conditions, the spread is very much reduced.
Some low marks, but no negatives.
Nearly all students are showing marks above the line – that is, they are correctly assessing their certainty for each question and so maximising their scores compared to answering with certainty set overall.
So CBM is delivering accurate assessment of knowledge.
But CBM is not just about more accurate assessment.
Could also have positives for student learning – better engagement with assessment helps students learn.
Students don’t like extra ‘stress’ imposed by setting certainty – but it is good for them!
Certainty or confidence as character trait?
Can over/under-confidence be dissociated from knowledge in any case? Can argue that correct understanding of own knowledge is essential part of academic and professional practice.
Is used in medical contexts because acknowledged that realistic judgement of certainty is essential to skills: ‘in decision-rich occupations such as medicine, mis-calibration of reliability is a serious handicap’ Gardner-Medwin (2014, p.6)
An open question asks a question but gives no clues. Here simple number is expected, but more generally a phrase, sentence, paragraph…
A multiple-choice question gives options – choose the correct one.
A confidence-marked question also asks student to say how certain they are of their answer.
MCQs are well established – objective, reliable, easy to implement
But pedagogically not ideal – have some drawbacks.
Open questions might be better tools for learning – but they are difficult to implement on a computer
This is a variant of CBM
Starts with an open question.
We don’t ask students to submit answer immediately – instead they have to set their confidence.
Once set it is locked – can’t change.
Now reveal options – actually multiple choice.
Can give feedback as normal, mark as CBM.
Benefit of open CBM is
-- an open question, so benefits for reflection
-- retains benefits and avoids drawbacks of MCQ
-- has some drawbacks – not always easy to implement, and needs further research on personality issues
-- bank of questions
-- few questions each week pushed to students
-- posed in CBM format
-- repeated until question answered successfully more than once, then dropped from pool
-- not used for summative assessment but for competency / mastery
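The pool behaviour listed above could be sketched roughly as follows. The class `MasteryPool`, its method names, and the choice of random selection are all assumptions for illustration; the notes say only that questions repeat until answered correctly more than once and are then dropped.

```python
import random


class MasteryPool:
    """Hypothetical question pool: a question leaves once answered correctly twice."""

    def __init__(self, question_ids, needed_correct=2, seed=None):
        self.needed_correct = needed_correct  # "more than once" taken as two
        self.correct_counts = {qid: 0 for qid in question_ids}
        self.active = list(question_ids)
        self._rng = random.Random(seed)

    def next_batch(self, size):
        """Pick up to `size` distinct questions still in the pool."""
        return self._rng.sample(self.active, min(size, len(self.active)))

    def record(self, qid, correct):
        """Record one answer; drop the question once it has been mastered."""
        if correct and qid in self.active:
            self.correct_counts[qid] += 1
            if self.correct_counts[qid] >= self.needed_correct:
                self.active.remove(qid)

    def mastered(self):
        """True when every question has left the pool."""
        return not self.active


pool = MasteryPool(["q1", "q2", "q3"], seed=1)
while not pool.mastered():
    for qid in pool.next_batch(2):      # a few questions pushed each week
        pool.record(qid, correct=True)  # pretend every answer is right
```

Whether the correct answers must be consecutive, and how CBM marks feed into the mastery decision, is not stated in the notes, so the sketch counts any two correct answers.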
Asked a question – have to set your confidence first.
Confidence/certainty expressed in terms of a bet – how much are you prepared to stake that you will get this right?
Virtual coins only!
Once certainty set (= stake bet), then select options
If incorrect, you lose your stake
If correct you win the bet – payback is 2 x stake
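The betting rule can be sketched as below. This assumes the stake is deducted when the bet is placed and that the "2 x stake" payback includes the returned stake, so the net gain for a correct answer equals the stake. That reading, and the function name `settle_bet`, are assumptions rather than details of the actual platform.

```python
def settle_bet(balance: int, stake: int, correct: bool) -> int:
    """Return the new virtual-coin balance after one question."""
    if not 0 < stake <= balance:
        raise ValueError("stake must be positive and no larger than the balance")
    balance -= stake          # stake is committed before the options are shown
    if correct:
        balance += 2 * stake  # payback of twice the stake
    return balance


print(settle_bet(10, 3, True))   # 13
print(settle_bet(10, 3, False))  # 7
```

If the payback were instead meant as winnings on top of the returned stake, the correct branch would add three times the stake rather than two.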
Leaderboards so competitive
Feedback on performance by topic and by confidence level
Badges as additional motivation
Fit to mastery: used in medical contexts because acknowledged that realistic judgement of certainty is essential to skills
Gardner-Medwin (2014, but Bender?) indicates higher accuracy cf standard marking – students identify uncertain answers so reduces variance so predictive accuracy & reliability of exam improves.
Complexity of setting up. Not much platform support – but now in Moodle and in Questionmark Perception.
Dislike of negative marking.
Difficulty of marking more complex question types: multiple-response versus single-response questions, partially correct answers, differently weighted questions.