Peering Through the Looking Glass: How Advances in Technology, Psychometrics and Philosophy are Altering the Assessment Landscape in Medical Education


André De Champlain, Ottawa Conference keynote

1. Peering Through the Looking Glass: How Advances in Technology, Psychometrics and Philosophy are Altering the Assessment Landscape in Medical Education
   André F. De Champlain, PhD, Director, Psychometrics and Assessment Services
   Miriam Friedman Ben-David Lecture
   17th Ottawa Conference on the Assessment of Competence in Medicine & the Healthcare Professions
   Mar. 22, 2016, Perth, Australia
2. I do not have any conflicts of interest to report.
3. Through the Looking Glass & What Alice Found There
   Charles Dodgson (Lewis Carroll)
   • Sequel to Alice's Adventures in Wonderland (1871)
   • Key theme: inverse reflection
     ◦ Reflection on an alternative world which lies on the other side of a mirror
     ◦ Distortion of sense (Jabberwocky)
   • Portmanteau (Jabberwocky)
     ◦ Linguistic blend of words
       – Webinar (web + seminar)
       – Brunch (breakfast + lunch)
4. (No text content on this slide.)
5. Education & Assessment as Willing Partners
   • The mechanisms through which learning occurs have shifted
   • From traditional (paper-based) to electronic media
     ◦ Tablet & mobile device-based learning is ubiquitous (e.g., MedPage Today, QuantiaMD)
     ◦ Linear to exponential growth of knowledge in medicine
6. Education & Assessment as Willing Partners
   • From a traditional view of education
     ◦ Teacher-centered, with high exam scores as the main goal
   • To alternative models that stress learning, retention & integration of knowledge & skills using a host of assessment modalities (e.g., PBL, competency-based education)
7. Education & Assessment as Willing Partners
   • Evolution of learning models & modalities not completely mirrored by similar changes in educational assessment
   • Educational assessment must evolve alongside learning models or risk fostering an antagonistic relationship
   • Educational assessment must (Bennett, 2002):
     ◦ Provide meaningful information
     ◦ Satisfy multiple purposes
     ◦ Use modern conceptions of competency as a design basis
     ◦ Design for positive impact & engagement
     ◦ Use technology to achieve substantive goals
8. Rethinking the Nuts & Bolts of Assessment
   • Reconceptualising assessment
     ◦ Over the past two decades, a tremendous amount of thought & activity has been aimed at proposing models of assessment & related processes that are:
       – More transparent & flexible
       – Better linked to learning activities
       – More informative from an educational standpoint
     ◦ Concurrently, effort has been devoted to improving the processes necessary to support these re-envisioned assessments
   • Revisitation of assessment's epistemological core
     ◦ What world lies on the other side of the assessment mirror?
9. Rethinking the Nuts & Bolts of Assessment
   • Assessment paradigm shift
     ◦ Programmatic assessment (van der Vleuten et al., 2012)
     ◦ Post-modern test theory (Mislevy, 1997)
     ◦ Cognitively based assessment of, for & as learning (CBAL; Bennett, 2010)
   • Use of technology to improve
     ◦ Test development practices (automated item generation [AIG])
     ◦ Marking of open-ended responses & narrative text
10. Rethinking the Nuts & Bolts of Assessment
    • Illustrate our evolution in assessment paradigm, technology & scoring at the Medical Council of Canada
11. Assessment Paradigm Shift
    • Increasing dissatisfaction with established educational assessment models
      ◦ A candidate's "true" competency level can be measured with standardized, context-free tools & further confirmed by highly reproducible, unambiguous statistical results
      ◦ Linear relationship between learning & assessment
        – Discrete, episodic hurdles to overcome
      ◦ Unlinked assessments
12. Assessment Paradigm Shift
    • Concerns
      ◦ Lack of an overarching framework (program) to guide the design of assessment tools along an educational continuum
        – Plea for a macroscopic rather than microscopic view of assessment (de Rosnay, 1979)
      ◦ A reductionist lens applied to what is a complex, adaptive system with interconnected components & dynamic relationships
      ◦ A missed opportunity to view learning & assessment in a rich, recursive relationship
        – Both activities can dynamically inform each other, feeding information forward
13. Programmatic Assessment
    • Calls for a deliberate, arranged set of longitudinal assessment activities
    • Joint attestation of all data points for decision & remediation purposes
    • Input of expert professional judgment is a cornerstone of this model
    • Purposeful link between assessment & learning/remediation
    • Dynamic, recursive relationships between assessment & learning points
14. Programmatic Assessment
    • Application of a program evaluation framework to assessment
      ◦ Systematic collection of data to answer specific questions about a program
    • Gaining popularity within several medical education settings
      ◦ Competency-based workplace learning
      ◦ Medical schools (e.g., Dalhousie University, University of Toronto)
      ◦ And others
15. Programmatic Assessment
    • Systemic models of assessment and learning are also popular in other settings
    • K-12 education (Bennett, 2012; Educational Testing Service)
      ◦ Cognitively based assessment of, for & as learning (CBAL)
        – Documents what students have achieved ('of learning')
        – Helps identify how to plan & adjust instruction ('for learning')
        – Considered by students and teachers to be a worthwhile educational experience in and of itself ('as learning')
    • We are treading on well-travelled ground!
16. Dipping our Toes into the Programmatic Assessment Waters: The MCC Experience
    Assessment Review Task Force (ARTF)
    • As the MCC approached its 100th anniversary (2012), a task force of eminent Canadian medical educators was convened to undertake a reflective & strategic review of the MCC's assessment purposes and objectives, their structure & their alignment with the MCC's major stakeholder requirements
    • The report, published in 2011, contained six recommendations, including validating & updating the blueprint for MCC examinations, offering exams with greater flexibility, enhancing & standardizing IMG assessments & engaging in in-practice assessment
17. Tacit Sub-Recommendation: Macro-Analysis
    A challenge intimated in the ARTF report pertains to the need to conduct a macro-analysis & review of the MCC Qualifying Examination (MCCQE)
    • Applying a systemic (macroscopic) lens to the MCCQE as an integrated examination system & not simply as a restricted number of episodic hurdles (MCCQE Parts I & II)
    • How are the components of the MCCQE interconnected & how do they inform key markers along a physician's educational & professional continuum?
    • How can the MCCQE progress towards embodying an integrated, logically planned & sequenced system of assessments that mirrors the Canadian physician's journey?
18. Programmatic Assessment Refocuses the Debate
    Reductionism vs. Emergentism
    • Reductionism: a system reduced to its most basic elements (i.e., it corresponds to the sum of its parts)
      ◦ Decision point I: MCCQE Part I
      ◦ Decision point II: MCCQE Part II
    • Emergentism: a system is more than the sum of its parts & also depends on complex interdependencies amongst its component parts
      ◦ Decision point I: purposeful integration of MCCQE Part I scores with other data elements
      ◦ Decision point II: purposeful integration of MCCQE Part II scores with other data elements
19. Towards a Programmatic View of Assessment in Canada
    Assessment Continuum for Canada (ACC)
    • Purpose:
      ◦ To define the 'life of a physician', from the beginning of medical school to retirement, in terms of assessments
      ◦ To propose a common, national framework of assessment using a programmatic approach
    • Composition includes representation from the MCC, UGME/PGME, certification colleges, FMRAC and MRAs
    • First step includes defining the CMG physician pathway with key partners
20. CMG Competency Assessment Pathway (figure)
21. Assessment Continuum for Canada (ACC)
    • Next steps include a retreat to begin to define a program of assessment (April 2016)
    • Starting points to develop a program of assessment
      ◦ Various ongoing North American EPA projects
      ◦ Milestone projects (RCPSC, ACGME)
      ◦ CanMEDS 2015 (RCPSC) & Triple C curriculum (CFPC)
      ◦ MCCQE blueprint
      ◦ …and many others
    • Critical to develop an overarching framework (program) prior to specifying the elements & relationships in this program
22. Practical Implications of Programmatic Assessment
    • Programmatic assessment is predicated on more frequent & flexible assessment via a variety of tools:
      ◦ Traditional exam formats
      ◦ Lower-stakes, in-practice observations
      ◦ Narratives, etc.
    • This shift impacts core assessment tasks, including test development & scoring activities
      ◦ Assessments need to be developed, administered & scored more frequently
    • How technology is helping the MCC optimize test development & scoring activities to better support programmatic assessment
      ◦ AIG and automated marking as examples
23. Enhancing & Supplementing Test Development: AIG
    What is AIG?
    • Automated item generation (AIG) is the process of using item models to generate test items with the aid of computer technology
    • AIG uses a three-stage process for generating items in which the cognitive mechanism required to solve the items is identified & manipulated to create new items (a 'cognitive map')
24. Enhancing & Supplementing Test Development: AIG
    Three steps:
    1. Identification of the PROBLEM (e.g., post-operative fever)
    2. Specification of the SOURCES of information required to diagnose the problem (e.g., type of surgery, physical examination)
    3. Description of the VARIABLES & LEVELS within each information source (e.g., guarding & rebound, timing of fever, calf tenderness) needed to create different instances of the problem
25. Enhancing & Supplementing Test Development: AIG
    A traditional, committee-based item-writing workshop (image)
26. Enhancing & Supplementing Test Development: AIG
    The beginnings of a cognitive map (post-operative fever) (figure)
27. Enhancing & Supplementing Test Development: AIG (figure, no text content)
28. Enhancing & Supplementing Test Development: AIG
    • A cognitive model can be viewed as a template, a rendering, or a mold of the assessment task (i.e., a target into which we want to place the content for the item)
    • Example stem: "A 54-year-old woman has a <Type of Surgery>. On post-operative day <Timing of Fever>, the patient has a temperature of 38.5°C. Physical examination reveals <Physical Examination>. Which one of the following is the best next step?"
      ◦ Type of Surgery: gastrectomy, right hemicolectomy, left hemicolectomy, appendectomy, laparoscopic cholecystectomy
      ◦ Timing of Fever: 1 to 6 days
      ◦ Physical Examination: red & tender wound, guarding & rebound, abdominal tenderness, calf tenderness
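To make the cognitive-model template above concrete, here is a minimal sketch, in Python, of how such a model could be instantiated programmatically. The stem and variable values are taken from the slide; the code itself is purely illustrative and is not the MCC's AIG software.

```python
from itertools import product

# Item-model stem from the slide; placeholders mark the manipulable variables.
STEM = (
    "A 54-year-old woman has a {surgery}. On post-operative day {day}, "
    "the patient has a temperature of 38.5°C. Physical examination reveals "
    "{finding}. Which one of the following is the best next step?"
)

# Variables & levels from the cognitive map (post-operative fever).
VARIABLES = {
    "surgery": ["gastrectomy", "right hemicolectomy", "left hemicolectomy",
                "appendectomy", "laparoscopic cholecystectomy"],
    "day": [str(d) for d in range(1, 7)],        # timing of fever: 1 to 6 days
    "finding": ["a red & tender wound", "guarding & rebound",
                "abdominal tenderness", "calf tenderness"],
}

def generate_stems():
    """Yield every instance of the item model (5 x 6 x 4 = 120 stems)."""
    keys = list(VARIABLES)
    for combo in product(*(VARIABLES[k] for k in keys)):
        yield STEM.format(**dict(zip(keys, combo)))

if __name__ == "__main__":
    stems = list(generate_stems())
    print(len(stems), "candidate stems generated")
    print(stems[0])
```

A production AIG system would also encode constraints between variables and generate the keyed option and plausible distractors from the same cognitive map; only the stem-generation step is shown here.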
29. Enhancing & Supplementing Test Development: AIG
    Lessons learned since 2011:
    • Test development efforts
      ◦ Thousands of items generated across 50+ cognitive maps
      ◦ Significantly improved test development process
      ◦ Deliberate, modeled process for distractors (diagnostic feedback)
      ◦ Predictive identification accuracy ranged from 32% to 52% across four experts, with an average accuracy rate of 42%
    • Psychometric efforts (250+ piloted AIG items)
      ◦ On average, AIG items are comparable in difficulty & discrimination (based on classical & IRT statistics)
      ◦ Stronger distractors (directly attributable to the AIG process)
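As a brief aside on the classical statistics mentioned in the last bullet, item difficulty (the proportion of examinees answering correctly) and point-biserial discrimination can be computed directly from a 0/1 response matrix. The sketch below uses simulated data for illustration only; it does not reproduce MCC results.

```python
import numpy as np

def classical_item_stats(responses):
    """responses: examinees x items array of 0/1 scores.
    Returns per-item difficulty (p-value) and point-biserial discrimination,
    computed against the rest score (total score minus the item itself)."""
    difficulty = responses.mean(axis=0)
    total = responses.sum(axis=1)
    discrimination = np.array([
        np.corrcoef(responses[:, j], total - responses[:, j])[0, 1]
        for j in range(responses.shape[1])
    ])
    return difficulty, discrimination

# Simulated data: 200 examinees, 5 items, responses driven by a latent ability.
rng = np.random.default_rng(0)
ability = rng.normal(size=(200, 1))
item_params = np.linspace(-1.0, 1.0, 5)                # simple 1PL-style difficulties
prob_correct = 1.0 / (1.0 + np.exp(-(ability - item_params)))
responses = (rng.random((200, 5)) < prob_correct).astype(int)

diff, disc = classical_item_stats(responses)
print(np.round(diff, 2), np.round(disc, 2))
```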
30. Streamlining the Scoring of Open Responses & Narrative Text: Automated Essay Scoring (AES)
    • Programmatic assessment calls for a combination of low- & high-stakes assessment for remediation & decision purposes
    • Lower-stakes assessments include narrative & other open-ended qualitative measures
    • Scoring of these open-ended measures is typically based on human ratings & can be very resource-intensive
    • How might we streamline the scoring of such tasks? Automated essay scoring (AES)
31. Streamlining the Scoring of Open Responses & Narrative Text: Automated Essay Scoring (AES)
    • The MCCQE Part I
      ◦ Completed as a licensing requirement in Canada towards the end of UGME
      ◦ Two-part exam (196 A-type MCQs + clinical decision-making [CDM] short-menu and write-in questions)
    • Scoring process for write-in questions
      ◦ More than 50 physicians hired to score write-in responses over two to three days
      ◦ Scoring can only occur on weekends
      ◦ Costly & logistically unsustainable
32. Streamlining the Scoring of Open Responses & Narrative Text: Automated Essay Scoring (AES)
    • Our solution: AES
      ◦ AES offers a promising alternative for supplementing hand-scoring of written-response items
      ◦ With AES, a computer builds scoring models based on previously scored (human) responses & applies the scoring model(s) to subsequent examinations
      ◦ AES relies on natural language processing (NLP) and machine-learning algorithms
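By way of illustration only (this is not the MCC's scoring engine, and the responses, scores, and model choice below are hypothetical), the sketch shows the general shape of such a pipeline: a scoring model is fit to previously human-scored write-in responses and then applied to new responses, with agreement checked via Cohen's kappa.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score
from sklearn.pipeline import make_pipeline

# Hypothetical, previously human-scored write-in responses (scores on a 0-2 scale).
# A real model would be trained on many responses per question.
train_texts = [
    "order a chest x-ray and blood cultures",
    "start broad-spectrum antibiotics",
    "reassure the patient and discharge",
    "obtain blood cultures then start antibiotics",
]
train_scores = [1, 1, 0, 2]

# Character n-grams are fairly robust to spelling variants in short clinical answers.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(train_texts, train_scores)

# Apply the trained model to new responses and compare against human scores.
new_texts = ["blood cultures and antibiotics", "discharge home"]
human_scores = [2, 0]
machine_scores = model.predict(new_texts)
print(machine_scores, cohen_kappa_score(human_scores, machine_scores))
```

In practice the features, model, and training-set size would be tuned question by question, which is consistent with the "no one-size-fits-all" observation on slide 36.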
33. Streamlining the Scoring of Open Responses & Narrative Text: Automated Essay Scoring (AES)
    Our research to date:
    • Under what conditions does AES work properly?
    • What is the level of agreement between human raters & the machine?
    • How does this agreement compare to inter-human ratings?
    • What is the impact of scoring CDM write-in questions using AES on overall pass/fail rates?
34. Streamlining the Scoring of Open Responses & Narrative Text: Automated Essay Scoring (AES)
    Agreement rates:
    • Overall, human-machine agreement is very high
      ◦ Median kappa = 0.89 for spring 2015 MCCQE Part I CDM write-in questions (range: 0.56-0.99)
    • Human-human agreement is slightly higher (Shermis, 2015)
      ◦ Median kappa = 0.96 for spring 2015 MCCQE Part I CDM write-in questions (range: 0.56-1.00)
    • Humans were not always correct!
      ◦ For a few CDM write-in questions, humans scored consistently, but incorrectly
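For reference, the kappa values quoted above are Cohen's chance-corrected agreement coefficient, which for two raters (here, human and machine) is defined as

$$\kappa = \frac{p_o - p_e}{1 - p_e}$$

where $p_o$ is the observed proportion of agreement and $p_e$ is the agreement expected by chance from the two raters' marginal score distributions; $\kappa = 1$ indicates perfect agreement and $\kappa = 0$ indicates chance-level agreement.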
35. Streamlining the Scoring of Open Responses & Narrative Text: Automated Essay Scoring (AES)
    Impact on pass/fail decisions
    • Pass/fail decision agreement rates were greater than or equal to 0.96 whether candidates were scored by machine or by human
    • All Cohen's kappa statistics were greater than or equal to 0.93

    Spring 2015 MCCQE Part I    Cohen's kappa
    Overall                     0.96
    English                     0.96
    French                      0.95
36. Streamlining the Scoring of Open Responses & Narrative Text: Automated Essay Scoring (AES)
    Conclusions:
    • Overall, agreement was very high
    • Results align with findings from past research
    • AES cannot be implemented using a one-size-fits-all approach
      ◦ Performance of AES is less optimal in some conditions, e.g., with some polytomous items and with French models (very small sample sizes)
    • Limited impact of machine scoring on ability estimates and pass/fail decisions
    • The MCC is confident that AES can considerably streamline the scoring of open-ended items
37. Conclusions
    • The re-emergence of competency-based educational models necessitates an analogous broadening of our assessment frameworks & strategies
    • Systemic models have challenged assessment experts to broaden our panoply of strategies & to purposefully link assessment to learning in a recursive fashion (assessment engineering)
    • The implementation of linked models is predicated on more frequent & broad assessments of performance
38. Conclusions
    • Technology provides several useful means by which:
      ◦ Test development processes can be significantly improved, systematized & streamlined to support programmatic assessment (exam engineering)
      ◦ Scoring of open-ended, qualitative narrative can be accomplished in an efficient & defensible manner
    • "The mind is not a vessel to be filled, but a fire to be kindled." (Plutarch)
39. THANK YOU!
    André F. De Champlain, PhD
    adechamplain@mcc.ca
