The Future of Online Testing with
MOOCs: An Exploratory Analysis of
Current Practice
Eamon Costello (National Institute for Digital Learning, Dublin City University)
Jane Holland (Royal College of Surgeons in Ireland)
Mark Brown (National Institute for Digital Learning, Dublin City University)
Background
• NIDL - DCU and Funded MOOC research
– MOOCs and building regional capacities
– SCORE2020 Project
– HOME Project
• MOOCs and the Media
• Institutional Drivers
Possible MOOC Futures
Brightest MOOC Futures
Dillenbourg, P. (2015) Proposal for a Digital Education Strategy for Flanders Universities.
“Thinkers in Residence” Programme from KVAB
Koninklijke Vlaamse Academie van België voor Wetenschappen en Kunsten. Available from:
http://www.kvab.be/denkersprogramma/files/DP_BlendedLearning_No-time-to-lose.pdf
“Sooner or later, online tests will be as reliable or even more reliable than on campus exams”
Current State of Play
• The future is taken care of
• But what about the present?
– How mature are (x)MOOCs?
– How reliable and valid are MCQ type tests in
MOOCs?
Multiple Choice Question (MCQ) Tests
(Single Best Answer)
– Reliability:
If we repeated this would we get the
same result?
– Validity:
Are we measuring what we think we
are?
Don’t get “fooled by randomness”
Taleb, N. (2004). Fooled by randomness: The hidden role of chance in life and
in the markets. Random House Incorporated.
Best Practice
Case, S. M., & Swanson, D. B. (2003). Constructing written test questions for the basic
and clinical sciences (3rd ed.). Philadelphia, PA: National Board of Medical Examiners.
• Ambiguous or unclear information
• Negative worded stem (not, incorrect, except)
• Implausible distracters
• Gratuitous information in stem
• More than one or no correct answer
• Longest option is correct
• Logical cues in stem
• Word repeats in stem and correct answer
• Unfocused stem
• True/false question
• Use of all of the above
• Vague terms (sometimes, frequently)
• Absolute terms (never, always)
• Use of none of the above
• Fill-in-blank
• Complex or K-type
• Grammatical cues in sentence completion
• Convergence cues
• Position of correct option
Best Practice
Tarrant, M., Knierim, A., Hayes, S. K., & Ware, J. (2006). The frequency of item writing
flaws in multiple-choice questions used in high stakes nursing assessments. Nurse
Education Today, 26(8), 662-671
Methodology
• Use the diagnostic tool of Tarrant et al. (2006) to
analyse MCQs in MOOCs systematically
• Look for item writing flaws
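Some of the flaws in the checklist are mechanical enough to be flagged automatically. The following is a minimal sketch of such a check, not the tool used in this study: the `MCQ` class, the `find_flaws` function and the example item are all hypothetical and invented for illustration. It covers only four flaws (all of the above, none of the above, more than one or no correct answer, longest option is correct); flaws such as ambiguity or implausible distracters still need human judgement.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class MCQ:
    """Hypothetical container for one multiple choice question."""
    stem: str
    options: List[str]
    correct: Set[int]  # indices of the options marked as correct


def find_flaws(item: MCQ) -> List[str]:
    """Return the mechanically detectable item-writing flaws for one MCQ."""
    flaws = []
    lowered = [opt.strip().lower() for opt in item.options]

    if any(opt.startswith("all of the above") for opt in lowered):
        flaws.append("all of the above")
    if any(opt.startswith("none of the above") for opt in lowered):
        flaws.append("none of the above")

    # A single-best-answer item should have exactly one correct option.
    if len(item.correct) != 1:
        flaws.append("more than one or no correct answer")
    else:
        (answer,) = item.correct
        lengths = [len(opt) for opt in item.options]
        longest = max(lengths)
        # Flag only when the correct option is strictly longer than all others.
        if lengths[answer] == longest and lengths.count(longest) == 1:
            flaws.append("longest option is correct")

    return flaws


# Hypothetical item, invented for illustration (not taken from the sample).
example = MCQ(
    stem="Which data structure gives first-in, first-out access?",
    options=[
        "A stack",
        "A queue, which removes items in arrival order",
        "A heap",
        "All of the above",
    ],
    correct={1},
)

print(find_flaws(example))
```

Measuring option length in characters is a simplifying choice; a word count or a tolerance for near-ties would be equally defensible.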
The Data
• 12 Courses
– From six MOOC platforms
– From 12 Universities/Providers
• Total MCQs: 115
None/All of the Above
• None of the above: 1 (0.87%)
• All of the above: 2 (1.74%)
The kernel is defined as:
A. The graphical user interface on top of the operating system
B. The glue between hardware and software applications
C. the software libraries need to run the system
D. all of the above
Holsgrove, G., & Elzubeir, M. (1998). Imprecise terms in UK medical multiple‐choice
questions: what examiners think they mean. Medical Education, 32(4), 343-350.
Number of Correct Options per
Question
• Greater than 1 correct: 10 (8.7%)
• Average: 1.39
Check the words that are used synonymously in a
JMeter test plan
A. end users
B. virtual users
C. concurrent users
D. threads
Number of Options per Question
Greater than 4 options: 9 (8%)
Average: 3.77
Chart (share of MCQs by number of options): 1 option 2%, 2 options 6%, 3 options 17%, 4 options 68%, 5 options 6%, 7 options 2%
Correct Option is the Longest
• In 47 of the 115 MCQs the correct option is
the longest (40.87%)
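Whether 47 out of 115 is more than chance would produce can be checked with a one-sided binomial test. The sketch below is not part of the original analysis and rests on a loud assumption: that every question has four options, so the correct option would be the longest about one time in four. The real sample mixes questions with different numbers of options (3.77 on average), so `chance_rate` is a rough stand-in and should be adjusted before drawing conclusions.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


observed = 47        # MCQs where the correct option is the longest (from the slide)
total = 115          # total MCQs in the sample (from the slide)
chance_rate = 0.25   # ASSUMPTION: four options per question, all equally likely to be longest

p_value = binomial_upper_tail(observed, total, chance_rate)
print(f"Expected by chance: {total * chance_rate:.1f} of {total}")
print(f"Observed: {observed} of {total}")
print(f"One-sided p-value: {p_value:.6f}")
```

A small p-value would suggest the longest-option cue is present in this sample more often than the assumed chance rate predicts.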
Position of Correct Option (105 MCQs)
Chart (share of MCQs by position of the correct option): 1st 21%, 2nd 30%, 3rd 29%, 4th 19%, 5th 1%
Position of Correct Option in 73 Four-Option MCQs
Chart (share of MCQs by position of the correct option): 1st 15%, 2nd 27%, 3rd 32%, 4th 26%
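If correct answers were placed at random, each of the four positions would be correct about equally often. A chi-square goodness-of-fit test is one way to compare the observed distribution against that expectation. The sketch below is illustrative only and was not part of the original analysis. The slide reports percentages, not counts, so the counts used here (11, 20, 23, 19) are an assumption: they are the whole numbers summing to 73 that round to the reported 15%, 27%, 32% and 26%.

```python
from typing import List


def chi_square_uniform(counts: List[int]) -> float:
    """Chi-square statistic for observed counts against a uniform expectation."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# ASSUMPTION: counts reconstructed from the reported percentages of 73 MCQs.
counts = [11, 20, 23, 19]  # correct option in 1st, 2nd, 3rd, 4th position

statistic = chi_square_uniform(counts)

# Standard critical value of the chi-square distribution with
# 3 degrees of freedom at the 0.05 significance level.
CRITICAL_VALUE_DF3_ALPHA_05 = 7.815

print(f"Chi-square statistic: {statistic:.3f}")
print(f"Exceeds critical value: {statistic > CRITICAL_VALUE_DF3_ALPHA_05}")
```

With only 73 items the test has limited power, which is consistent with the small-sample caveat in the summary.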
Overall
• 17 (14.78%) of all the questions contain a
defined item-writing flaw
• This counts only four of the item-writing flaws
• Two more item-writing flaws are apparent in
characteristics that appear more often than
they should by chance (small sample)
Further work
Add qualitative analysis using the Tarrant et al. (2006) evaluation tool
• Ambiguous or unclear information
• Negative worded stem (not, incorrect, except)
• Implausible distracters
• Gratuitous information in stem
• Logical cues in stem
• Word repeats in stem and correct answer
• Unfocused stem
• Vague terms (sometimes, frequently)
• Absolute terms (never, always)
• Fill-in-blank
• Complex or K-type
• Grammatical cues in sentence completion
• Convergence cues
Implications
• Validity and Reliability of MOOC Testing
• Replicating unsound pedagogies
• MOOC teachers/developers need evidence-led teaching
Questions?
Me: eamon.costello@dcu.ie
