Peering through the Looking Glass: Towards a Programmatic View of the Qualifying Examination

2015 Annual Meeting - Dr. André De Champlain

1. Peering Through the Looking Glass: Towards a Programmatic View of the Qualifying Examination
    André De Champlain, PhD
    Director, Psychometrics and Assessment Services
    Presented at the MCC Annual General Meeting | 27 September 2015

2. A Brave New Journey… Programmatic Assessment

3. Guiding Vision: The Assessment Review Task Force
    Six ARTF recommendations:
    • Recommendation 1: LMCC becomes the ultimate credential (legislation issue)
    • Recommendation 2: Validate and update the blueprint for MCC examinations
    • Recommendation 3: More frequent scheduling of the exams and associated automation
    • Recommendation 4: IMG assessment enhancement and national standardization (NAC & Practice Ready Assessment)
    • Recommendation 5: Physician practice improvement assessments
    • Recommendation 6: Implementation oversight, committee priorities and budgets

4. Validate and Update Blueprint for MCC Examinations (ARTF Recommendation #2)

5. Blueprint Project
    That the content of MCC examinations be expanded by:
    • Defining knowledge and behaviours in all CanMEDS Roles that demonstrate competency of the physician about to enter independent practice
    • Reviewing adequacy of content and skill coverage on blueprints for all MCC examinations
    • Revising examination blueprints and reporting systems with the aim of demonstrating that appropriate assessment of all core competencies is covered and fulfills the purpose of each examination
    • Determining whether any general core competencies considered essential cannot be tested employing the current MCC examinations, and exploring the development of new tools to assess these specific competencies when current examinations cannot
6. Addressing Micro-Gaps
    A number of efforts are underway to assess how the current MCCQE Parts I & II might evolve towards better fulfilling the MCCQE blueprint:
    • New OSCE stations focusing on more generic skills and complex presentations expected of all physicians, irrespective of specialty, have been piloted with great success in the fall 2014 and spring 2015 MCCQE Part II administrations
    • Potential inclusion of innovative item types, including situational judgment test challenges, is also under careful consideration and review
    • From a micro-gap perspective, the MCCQE is aligning better with the two dimensions outlined in our new blueprint

7. Tacit Sub-Recommendation: Macro-Analysis
    An intimated challenge outlined in the ARTF report pertains to the need to conduct a macro-analysis and review of the MCCQE:
    • Applying a systemic (macroscopic) lens to the MCCQE as an integrated examination system and not simply as a restricted number of episodic "hurdles" (MCCQE Parts I & II)
    • How are the components of the MCCQE interconnected, and how do they inform key markers along a physician's educational and professional continuum?
    • How can the MCCQE progress towards embodying an integrated, logically planned and sequenced system of assessments that mirrors the Canadian physician's journey?

8. Key Recommendation: MEAAC
    • Medical Education Assessment Advisory Committee (MEAAC)
    • MEAAC was a key contributor to our practice analysis (blueprinting) efforts through their report, Current Issues in Health Professional and Health Professional Trainee Assessment
    • Key recommendations:
      ◦ Explore the implementation of an integrated and continuous model of assessment (linked assessments)
      ◦ Continue to incorporate "authentic" assessments in the MCCQE
        – OSCE stations that mimic real practice
        – Direct observation-based assessment to supplement MCCQE Parts I & II

9. Framework for a Systemic Analysis of the MCCQE

10. Programmatic Assessment (van der Vleuten et al., 2012)
    • Calls for a "deliberate", arranged set of longitudinal assessment activities
    • Joint attestation of all data points for decision and remediation purposes
    • Input of expert professional judgment is a cornerstone of this model
    • (Purposeful) link between assessment and learning/remediation
    • Dynamic, recursive relationships between assessment and learning points
11. Programmatic Assessment (van der Vleuten et al., 2012)
    • Application of a program evaluation framework to assessment
    • Systematic collection of data to answer specific questions about a program
    • Gaining in popularity within several medical education settings:
      ◦ Competency-based workplace learning
      ◦ Medical schools (e.g., Dalhousie University, University of Toronto, etc.)

12. Programmatic Assessment Refocuses the Debate
    Reductionism:
    • A system reduces to its most basic elements (i.e., it corresponds to the sum of its parts)
    • Decision point I = MCCQE Part I
    • Decision point II = MCCQE Part II
    vs. Emergentism:
    • A system is more than the sum of its parts and also depends on complex interdependencies amongst its component parts
    • Decision point I: purposeful integration of MCCQE Part I scores with other data elements
    • Decision point II: purposeful integration of MCCQE Part II scores with other data elements

13. Can a Programmatic Assessment Framework Be Applied to the MCCQE?
    • Programmatic assessment is primarily restricted to local contexts (medical school, postgraduate training program, etc.)
    • What about applicability to high-stakes registration/licensing exam programs?
    • "The model is limited to programmatic assessment in the educational context, and consequently licensing assessment programmes are not considered" (van der Vleuten et al., 2012, p. 206)

14. Can a Programmatic Assessment Framework Be Applied to the MCCQE?
    • Probably not as conceived, due to differences in:
      ◦ Settings (medical school vs. qualifying exam)
      ◦ Stakes (graduation vs. licence to practise as a physician)
      ◦ Outcomes
      ◦ Interpretation of data sources
      ◦ Nature of the program and its constituent elements

15. Can the Philosophy Underpinning Programmatic Assessment Be Applied to the MCCQE?
16. How Can the MCCQE Evolve?
    • At its philosophical core, from (1) an episodic system of two point-in-time exams to (2) an integrated program of assessment, continually supported by best practice and evidence, which includes:
      ◦ The identification of data elements aimed at informing key decisions and activities along the continuum of a physician's medical education
      ◦ Clearly laid-out relationships, exemplified by the interactions between those elements, predicated on a clearly defined program (the MCCQE)
      ◦ Defensible feedback interwoven at key points in the program
    • The $64,000 question: What does the MCCQE program of assessment look like (actually, the $529,153.80 question)?
17. Assessment Continuum for Canadian Trainees
    [Diagram: the continuum spans undergraduate education, postgraduate training and practice/continuing professional development, with Decision Point 1 (D1), Decision Point 2 (D2) and full licensure marked along the way]
    • Undergraduate education (UGME assessments potentially leading to MCC blueprint Decision Point 1 in clerkship): SR-items (17/17), CR-items (16/17), OSCE (17/17), direct observation reports (14/17), in-training evaluation (13/17), simulation (10/17), MSF/360 (6/17), others (11/17)
    • Postgraduate training: CFPC (direct observation, CR-items/SAMPs, structured orals); Royal College, 32 entry specialties (ITEs, direct observation, SR-items, CR-items, OSCE/orals, simulations, chart audits)
    • In practice / continuing professional development: PPI (FMRAC) (assessment of practice, audits)
18. Major Step Towards a Programmatic View of the MCCQE
    • What constitutes the learning/assessment continuum for physicians from "cradle to grave" (UGME to PPI)?
    • At a pan-Canadian level:
      ◦ A temporal timeline is a necessary, but insufficient, condition for better understanding the lifecycle of a physician
      ◦ What competencies do physicians develop throughout this lifecycle?
      ◦ What behavioural indicators (elements) best describe "competency" at various points in the lifecycle?
      ◦ How are these competencies related (both within and across)?
      ◦ How do these competencies evolve?
    • All of these questions are critical in better informing the development of a programmatic model for the LMCC

19. First Step Towards a Programmatic View of the MCCQE
    November Group on Assessment (NGA)
    • Purpose:
      ◦ To define the "life of a physician", from the beginning of medical school to retirement, in terms of assessments
      ◦ To propose a common national framework of assessment using a programmatic approach
    • Composition:
      ◦ Includes representation from the AFMC, CFPC, CMQ, FMRAC, MCC, MRAs and Royal College
    • First step:
      ◦ Physician pathway

20. November Group on Assessment: Next Step
    • Summit to define a program of assessment, planned for the first quarter of 2016
    • Starting points to develop a program of assessment:
      ◦ Various ongoing North American EPA projects
      ◦ Milestone projects (Royal College, ACGME)
      ◦ CanMEDS 2015
      ◦ MCCQE Blueprint!
      ◦ … and many others
    • Critical to develop an overarching framework (program) prior to specifying the elements and relationships of this program

21. Validating a Program of Assessment

22. Appeal
    • Emphasis is on a composite of data elements (quantitative and qualitative) to better inform key educational decisions as well as learning
    • Including both micro-level (elements) and macro-level (complex system of interrelated elements) indicators adds an extra layer of complexity to the MCCQE validation process
    • The systemic nature of programmatic assessment requires validating not only the constituent elements (various data points) but also the program in and of itself
    • How do we proceed?
23. [Image-only slide]
24. Standards for Educational and Psychological Testing (2014)
    Key objectives:
    • Provide criteria for the development and evaluation of tests and testing practices, and provide guidelines for assessing the validity of interpretations of test scores for the intended test uses
    • Although such evaluations should depend heavily on professional judgment, the Standards provides a frame of reference to ensure that relevant issues are addressed

25. Assessing the Foundational Properties of a Program of Assessment
    1. Reliability
    2. Validity
26. Reliability
    • "Test" score: a reminder
      ◦ Any assessment, by virtue of practical constraints (e.g., available testing time), is composed of a very restricted sample of the items, stations and tasks that comprise the domain of interest
      ◦ My WBA program includes 12 completed mini-CEX forms
      ◦ But as a test score user, are you really interested in the performance of candidates in those 12 very specific encounters? No!

27. Reliability
    • You're interested in generalizing from the performance in those 12 very specific encounters to the broader domains of interest
    • Reliability provides an indication of the degree of consistency (or precision) with which test scores and/or decisions are measured by a given examination (a sample of OSCE stations, a sample of MCQs, a sample of workplace-based assessments, etc.)

28. Reliability
    • Measurement error arises from multiple sources (it is multifaceted)
    • For a WBA, measurement error could be attributable to:
      ◦ Selection of a particular set of patient encounters
      ◦ Patient effects
      ◦ Occasion effects
      ◦ Rater effects
      ◦ Setting (if given at multiple locations)
    • These sources need to be clearly identified and addressed a priori
    • The impact of all of these sources needs to be estimated, as sketched below
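To make the estimation of these error sources concrete, the sketch below runs a one-facet generalizability analysis on simulated candidate-by-encounter mini-CEX ratings. It is a minimal illustration only: the data are invented, and a real WBA design would also model rater, occasion and setting facets, typically with dedicated G-theory software.

```python
import numpy as np

# Minimal sketch (not MCC's actual analysis): a one-facet generalizability
# study for a candidates x encounters design, where each of n_p candidates
# is rated on the same n_i mini-CEX encounters (hypothetical, simulated data).
rng = np.random.default_rng(0)
n_p, n_i = 200, 12                                   # candidates, encounters
true_ability = rng.normal(0, 0.5, size=(n_p, 1))
encounter_difficulty = rng.normal(0, 0.3, size=(1, n_i))
noise = rng.normal(0, 0.7, size=(n_p, n_i))
scores = 6.0 + true_ability + encounter_difficulty + noise

grand = scores.mean()
person_means = scores.mean(axis=1, keepdims=True)
encounter_means = scores.mean(axis=0, keepdims=True)

# Classic ANOVA mean squares for a crossed design with one observation per cell
ms_p = n_i * ((person_means - grand) ** 2).sum() / (n_p - 1)
ms_i = n_p * ((encounter_means - grand) ** 2).sum() / (n_i - 1)
resid = scores - person_means - encounter_means + grand
ms_pi = (resid ** 2).sum() / ((n_p - 1) * (n_i - 1))

# Estimated variance components
var_p = max((ms_p - ms_pi) / n_i, 0.0)   # candidate (universe score) variance
var_i = max((ms_i - ms_pi) / n_p, 0.0)   # encounter difficulty variance
var_pi = ms_pi                           # interaction + residual error

# Generalizability (relative) and dependability (absolute) coefficients
g_coef = var_p / (var_p + var_pi / n_i)
phi = var_p / (var_p + (var_i + var_pi) / n_i)
print(f"var_p={var_p:.3f}  var_i={var_i:.3f}  var_pi,e={var_pi:.3f}")
print(f"G (relative) = {g_coef:.2f}, Phi (absolute) = {phi:.2f}")
```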
29. Reliability of a Program of Assessment?
    • Programmatic assessment is predicated on the notion that many purposefully selected and arranged data elements contribute to the evaluation of candidates
    • In addition to assessing the reliability of each element in the system, the reliability of scores/decisions based on this composite of measures therefore needs to be assessed (see the sketch below)
    • Models:
      ◦ Multivariate generalizability theory (Moonen-van Loon et al., 2013)
      ◦ Structural equation modeling
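As a purely illustrative example of what "reliability of a composite" means here, the sketch below applies Mosier's (1943) classical formula for the reliability of a weighted composite to three made-up components. The component names, weights, reliabilities and correlations are assumptions chosen for the example, not MCC figures; a full programmatic analysis would use the multivariate generalizability or structural equation models named above.

```python
import numpy as np

# Hypothetical illustration: reliability of a weighted composite of assessment
# components via Mosier's (1943) formula,
#   rho_C = 1 - sum(w_k^2 * var_k * (1 - rho_k)) / var_composite,
# where var_composite = w' S w and S is the observed-score covariance matrix.
w = np.array([0.4, 0.4, 0.2])            # policy weights (assumed)
sd = np.array([1.0, 1.0, 1.0])           # observed-score SDs (standardized here)
rho = np.array([0.90, 0.80, 0.70])       # component reliabilities (assumed)
corr = np.array([[1.00, 0.55, 0.40],     # observed-score correlations (assumed)
                 [0.55, 1.00, 0.45],
                 [0.40, 0.45, 1.00]])

cov = corr * np.outer(sd, sd)            # covariance matrix of component scores
var_composite = w @ cov @ w
error_var = np.sum((w * sd) ** 2 * (1 - rho))
rho_composite = 1 - error_var / var_composite
print(f"Composite reliability = {rho_composite:.2f}")
```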
30. Validity: What It Is
    "Validity is an overall evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of interpretations and actions based on test scores or other modes of assessment." (Messick, 1989)

31. Validity: What It's Not
    • There is no such thing as a valid or invalid exam or assessment
    • Statements such as "my mini-CEX shows construct validity" are completely devoid of meaning
    • Validity refers to the appropriateness of inferences or judgments based on test scores, given supporting empirical evidence

32. Validity: Kane's Framework (1992)
    1. State the interpretive argument as clearly as possible
    2. Assemble evidence relevant to the interpretive argument
    3. Evaluate the weakest part(s) of the interpretive argument
    4. Restate the interpretive argument and repeat
    The framework rests on five key arguments, outlined on the next slides.

33. Validity: Kane's Five Key Arguments
    1. Evaluation argument
       • The scoring rule is appropriate
       • The scoring rule is applied accurately and consistently
       • Evidence: clearly documented training, scoring rules and processes for the elements included in the program, as well as for complex interactions among these components
    2. Generalization argument
       • The sample of items/cases in the exam is representative of the domain (universe of items/cases)
       • Evidence: practice analysis/blueprinting effort

34. Validity: Kane's Five Key Arguments
    3. Extrapolation argument
       • Does the program of assessment lead to intended outcomes?
       • Evidence: do the outcomes of the program (LMCC or not) relate to clear practice-based indicators as anticipated?
    4. Explanation argument
       • Is the program of assessment measuring what was intended?
       • Evidence: structural equation modeling, mapping of expert judgments, etc.

35. Validity: Kane's Five Key Arguments
    5. Decision-making argument
       • Is the program of assessment appropriately passing and failing the "right" candidates?
       • How do we set a passing standard for a program of assessment?
       • Evidence:
         ◦ Internal validity: documentation of the process followed; inter-judge reliability, generalizability analyses, etc.
         ◦ External validity: relationship of performance on the exam to other criteria
36. A Practical Framework (Dijkstra et al., 2012)
    Collecting information:
    • Identify the components of the assessment program (What?)
    • Identify how each component contributes to the goal of the assessment program for stakeholders (Why?)
    • Outline the balance between components that best achieves the goal of the assessment program for stakeholders (How?)

37. A Practical Framework (Dijkstra et al., 2012)
    Obtaining support for the program:
    • A significant amount of faculty development is required to assure a level of expertise in performing critical tasks (e.g., rating)
    • The higher the stakes, the more robust the procedures need to be
    • Acceptability: involve and seek buy-in from key stakeholders

38. A Practical Framework (Dijkstra et al., 2012)
    Domain mapping:
    • Gather evidence to support that each assessment component targets the intended element in the program (micro level)
    • Gather evidence to support that the combination of components measures the overarching framework (macro level)

39. A Practical Framework (Dijkstra et al., 2012)
    Justifying the program:
    • All new initiatives need to be supported by scientific (psychometric) evidence
    • A cost-benefit analysis should be undertaken in light of the purpose(s) of the assessment program

40. Some Additional Challenges

41. Narrative Data in the MCCQE
    • Narrative (qualitative) data pose unique opportunities and challenges for inclusion in a high-stakes program of assessment (the MCCQE)
    • Opportunity: enhance the quality and usefulness of feedback provided at key points in the program
    • Challenge: how best to integrate qualitative data, in a sound, defensible, reliable and valid fashion, into a program of assessment that fully meets legal and psychometric best practice
    • How can we better systematize feedback?

42. Narrative Data in the MCCQE
    • Automated Essay Scoring (AES)
      ◦ AES can build scoring models based on previously human-scored responses
      ◦ AES relies on natural language processing (NLP) to extract linguistic features of each written answer, and on machine-learning algorithms (MLA) to construct a mathematical model linking the linguistic features to the human scores
      ◦ The same scoring model can then be applied to new sets of answers (a toy sketch follows)
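The toy sketch below illustrates the general AES recipe described on this slide (NLP feature extraction plus a machine-learning model trained on human scores). It is not the MCC's actual scoring engine (the deck mentions LightSide), and the responses and scores are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented write-in responses with previously assigned human scores
# (e.g., credit / no credit); real CDM answers and keys are not shown here.
train_responses = [
    "order serum electrolytes and creatinine",
    "start iv fluids and reassess blood pressure",
    "no further investigation required",
    "refer to physiotherapy",
]
train_human_scores = [1, 1, 0, 0]

# NLP feature extraction (TF-IDF over word unigrams/bigrams) plus a
# machine-learning model linking those features to the human scores.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(max_iter=1000),
)
model.fit(train_responses, train_human_scores)

# The same scoring model is then applied to new, unseen write-in answers.
new_responses = ["order electrolytes and renal function tests"]
print(model.predict(new_responses))        # predicted score
print(model.predict_proba(new_responses))  # confidence, useful for routing answers to humans
```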
43. Narrative Data in the MCCQE
    • AES of MCCQE Part I CDM write-in responses
      ◦ AES was used to parallel-score 73 spring 2015 CDM write-ins (LightSide)
      ◦ Overall human-machine concordance rate >0.90 (higher for dichotomous items, lower for polytomous items); an example of how such checks are computed follows
      ◦ Overall pass/fail concordance near 0.99, whether CDMs are scored by residents or by the computer
    • AES holds a great deal of promise as a means to systematize qualitative data in the MCCQE program
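For readers who want to see how such concordance checks are typically computed, here is a small hypothetical example reporting exact human-machine agreement and a chance-corrected (weighted kappa) statistic; the numbers are made up and do not reproduce the 2015 CDM results.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical item-level scores for the same set of write-in answers
human = np.array([1, 0, 1, 1, 0, 1, 2, 0, 1, 2, 1, 0])    # human marker scores
machine = np.array([1, 0, 1, 1, 0, 1, 2, 0, 0, 2, 1, 0])  # AES scores

exact_agreement = np.mean(human == machine)                     # proportion of identical scores
kappa = cohen_kappa_score(human, machine, weights="quadratic")  # chance-corrected agreement

print(f"Exact agreement = {exact_agreement:.2f}")
print(f"Quadratic-weighted kappa = {kappa:.2f}")
```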
44. Argument for Accreditation of Observation-Based Data
    • There is insufficient evidence to support incorporating "local" (e.g., medical school) scores (ratings) obtained from direct observation into the MCCQE without first addressing a number of issues
      ◦ Examiner training, patient problem variability, etc.
      ◦ These issues may never be fully resolved to high-stakes assessment standards
    • However, accrediting (attesting) observational data sources based on strict criteria and guidelines might be a viable compromise

45. Argument for Accreditation of Observation-Based Data
    • Accrediting (with partners) all facets of observation-based data sources will require meeting a number of agreed-upon standards:
      ◦ Selection of specific rating tool(s) (e.g., mini-CEX)
      ◦ Adherence to a strict examiner training protocol, with attestation that examiners have successfully met training targets (online video training module)
      ◦ A sampling strategy (patient mix) based on an agreed-upon list of common problems (and the MCCQE blueprint)
      ◦ Adherence to common scoring models

46. Putting the Pieces Together
    • At a programmatic level, how can we aggregate this combination of low- and high-stakes data to arrive at a defensible decision, both for entry into supervised practice and for entry into independent practice?
    • A standard-setting process offers a defensible model that would allow expert judgment to be applied towards the development of a policy that could factor in all sources of data (a schematic aggregation rule is sketched below)
    • Empirical (substantively based) analyses would then be carried out to support and better inform (or even refute) that policy
      ◦ Structural equation modeling, multivariate generalizability analysis, etc.
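To show what such a policy might look like in schematic form only, the sketch below encodes a simple compensatory rule (a weighted composite with a cut score) combined with a conjunctive minimum. Every component name, weight and cut score is invented for illustration and does not represent any MCC standard or decision rule.

```python
# Purely schematic sketch of one way a standard-setting policy could aggregate
# multiple data sources into a decision; all values below are invented.
WEIGHTS = {"mccqe_part1": 0.40, "mccqe_part2": 0.40, "accredited_wba": 0.20}
COMPOSITE_CUT = 0.60                 # compensatory cut score on the weighted composite
MINIMUMS = {"mccqe_part2": 0.50}     # conjunctive floor(s) expert judges might impose

def program_decision(scores):
    """Return 'pass' or 'fail' for component scores rescaled to a common 0-1 metric."""
    composite = sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)
    meets_floors = all(scores[k] >= floor for k, floor in MINIMUMS.items())
    return "pass" if composite >= COMPOSITE_CUT and meets_floors else "fail"

# Example candidate profile (hypothetical)
print(program_decision({"mccqe_part1": 0.72, "mccqe_part2": 0.58, "accredited_wba": 0.65}))
```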
47. Next Steps
    • Begin to lay the foundation for an MCCQE program of assessment
      ◦ Define both the micro- and macro-elements that define a program of assessment leading up to each MCCQE decision point
      ◦ Initial efforts led by the November Group on Assessment
    • Agree on all supporting standards that need to be uniformly adopted by all stakeholders:
      ◦ Accreditation criteria, where applicable
      ◦ Core tools and a pool of cases to be adopted by all schools
      ◦ Training standards and clear outcomes for examiners
      ◦ Scoring and standard-setting frameworks

48. Next Steps
    • Collaborative pilot project framework with key partners and stakeholders
      ◦ Formulate the key targeted research questions needed to support the implementation of a programmatic framework for the MCCQE
      ◦ Identify collaborators (e.g., UGME programs, postgraduate training programs, MRAs) to answer specific questions from investigations
      ◦ Aggregate information to better inform and support a programmatic model of assessment for the MCCQE

49. "Would you tell me, please, which way I ought to go from here?"
    "That depends a good deal on where you want to get to!"
    - Alice in Wonderland

50. THANK YOU!
    André De Champlain, PhD
    adechamplain@mcc.ca
