1. Introduction: Hello, welcome, and thank you for this opportunity to present our work.
2. Disclosure: Our disclosure statement
3. Background: Although it has been a matter of interest within academia for decades, there is still no widely available, comprehensive, uniform, easy-to-use platform for evaluating surgical resident/fellow operative performance. Furthermore, current educational reforms depend on one being developed and integrated into training programs. In considering how to approach the problem, one does well to remember that other professions have tackled similar problems before. For example, flight training within the U.S. Navy (with which I am personally familiar) has been making trainee performance assessments since before WWII, and business management has been integrating systems of key process measurements into manufacturing and service industries since post-WWII Japan.
4. Hypothesis: Our hypothesis was first that we could in fact build a comprehensive, easy-to-use, useful platform for achieving continuous operative performance assessments of surgical trainees, and second, that we could successfully integrate it into both our daily work-flow and our educational curriculum.
5. STAT: …What resulted is “The Surgical Training & Assessment Tool,” or “STAT” for short. STAT is an internet-based software system that functions to: (1) de-construct the complex training events of an operative case into its distinct elements; (2) enable people to record their subjective impressions of a trainee’s execution of those distinct elements in an organized, thoughtful way; and then (3) reassemble or re-construct these distinct impressions into usable, reviewable formats. These are reviewed for the purposes of improving the surgical training process.
6. Accessing STAT: You access the system by logging in from any computer, at work or at home, and entering your unique user-name and password. The attending and the surgical trainee responsible for a case log in separately, and each makes an assessment of the trainee. Neither sees the other's data until both submissions are complete.
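The blinding rule just described (neither rater sees the other's entry until both have submitted) can be sketched in a few lines. This is only an illustration under stated assumptions, not STAT's actual code: the class `CaseAssessment` and its methods `submit`, `is_complete`, and `view` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

ROLES = ("attending", "trainee")


@dataclass
class CaseAssessment:
    """Holds both raters' submissions for one case (hypothetical model)."""
    submissions: Dict[str, dict] = field(default_factory=dict)

    def submit(self, role: str, ratings: dict) -> None:
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        self.submissions[role] = ratings

    def is_complete(self) -> bool:
        # Complete only when both the attending and the trainee have submitted.
        return all(role in self.submissions for role in ROLES)

    def view(self, viewer: str, target: str) -> Optional[dict]:
        """Return target's ratings, or None while the case is still blinded."""
        if viewer == target or self.is_complete():
            return self.submissions.get(target)
        return None


case = CaseAssessment()
case.submit("trainee", {"overall": "good"})
print(case.view("attending", "trainee"))  # still blinded: attending has not submitted
case.submit("attending", {"overall": "very good"})
print(case.view("attending", "trainee"))  # now visible to both
```

Keeping the check inside `view` means the blinding cannot be bypassed by reading results early; a rater can always see their own entry.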
7. Entry Page: The entry page is where you enter the basic details of the case you are about to assess: --you pick the date from a drop-down list… …Enter the name of the other assessor… …and then pick the type of case from the various categories present on the right side of the page. For example, let us suppose we have just completed an esophageal case and wish to assess it.
8. Entry Page / expanded: … you click on the little plus sign next to “Esophagus” and a list of esophageal procedures to choose from materializes; you click on “transhiatal esophagectomy” and click proceed.
9. Assessment Page: … which will bring you to our “assessment page.” There isn’t time at present to describe it completely, but an overview is important. The page itself consists of four separate sections, or lines of approach, towards making a performance assessment. The first two sections feature a structured format and, as you can see, have a great number of little buttons underneath them; the second two involve no structure at all. The first section, the “general capabilities,” comprises three separate global ratings which are applicable to any and every type of case: “applied knowledge,” “technical skill,” and “professional independence.” Each of these three capabilities expands into finer levels of detail if you click on the little “+” sign, and they can be rated at any level of detail. The trainee’s scores in each category range from “poor” to “excellent,” and they are benchmarked against a theoretical ideal resident or fellow at the same level of training. Thus an intern compares himself to a theoretical intern, and a fellow compares herself against a theoretical fellow. The second section is a checklist organized specifically around the type of case being assessed. For Surgical Oncology we have generated 187 unique checklists for 187 different types of cases. In this example of an esophagectomy you can see there are six separate phases, and these expand into a total of 61 separate steps. When making an assessment, you can assess at the “phase” level, or you can go into progressively finer detail, as you choose. The scale is based upon the “phases of technical skills acquisition,” which I can explain later if there are any questions regarding it. The third section is an open comments box where you can add context or make specific comments. The final section, which you reach only after working your way through the first three, is a summative, overall case-competency rating. It is not calculated by the system; it is an independent click made by the rater.
10. Results: Once all the data is entered, it is all retrievable as “Results.” STAT makes individual graphs of all the scores for any given trainee. Here are graphs of the three general capabilities as well as the overall case-competency scores of one trainee. What you are looking at here is about 75 consecutive cases performed by one trainee; his self-ratings are in purple, and the various attendings’ ratings are in black. Ratings for each case are matched vertically. As you can see, this trainee performed quite well in each of the separate categories, as well as in the overall score.
11. Results/written: STAT also records and arranges all the comments made by the trainee and/or by the attending for the various cases. The comments are matched for any given case and organized by case-type.
Next comes the focal point of the entire system. The STAT data is meant to be put to use, and here is how we do it: every two weeks there is an informal meeting between a trainee and their mentor to review the trainee’s recent performance data. Trainee and mentor discuss and interpret the various graphs and comments, develop a performance improvement plan that defines the trainee’s educational goals for the next two weeks, and enter it into the comments box. The trainee enters their password, and the improvement plan becomes part of their longitudinal educational record. In short, every two weeks, the trainee has a chance to hear the truth, and the mentor a chance to speak the truth.
12. Time per case entry: It is important to demonstrate how quick it is to enter a STAT assessment. Across thousands of case entries, the vast majority, as demonstrated by the bottom, blue bar on the graph, required under a minute to complete. The median time required is just 39 seconds, and the mean is 60 seconds.
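A median well below the mean is what one expects when most entries are quick and a few take much longer, pulling the mean upward. The sketch below illustrates that effect only; the timings are invented for the example and are NOT study data.

```python
from statistics import mean, median

# Hypothetical entry times in seconds (NOT study data):
# mostly quick entries plus a couple of slow ones.
entry_times = [25, 30, 35, 39, 39, 45, 50, 97, 180]

under_a_minute = sum(t < 60 for t in entry_times) / len(entry_times)

print(median(entry_times))       # the middle value, unaffected by the slow entries
print(mean(entry_times))         # pulled upward by the slow entries
print(round(under_a_minute, 2))  # share of entries finished in under 60 seconds
```

For a skewed distribution like this, the median is the better description of a typical entry, which is why both figures are worth reporting.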
13. System usage: The red bar here represents the total number of submissions by either attending or trainee over the 18 months of the study period, and the blue one represents the total number of cases which have a submission by either one or the other. You can see that usage was sustained over the study period. Participation in the study was voluntary, and only in the first four months did we actively remind and send out encouragements to people to utilize the system. By about the one-year mark the usage ebbed, but then you see it picked itself back up on its own, without any prompting or nagging, which strongly suggests that people accord some value to the experience and were happy to carry it out further until the end of the study period.
14. Usage / Attending & Trainee: The entries were made in roughly even numbers between attending and trainee.
15. Overall Grades: Here is what the overall grades look like, separated into trainee self-grades and attending grades. What is reassuring about these graphs is that the overall grades assume a normal, bell-shaped distribution.
16. Internal Consistency: This slide asks whether the measurements of Applied Knowledge, Technical Skill, and Professional Independence correlate with the independent measurements of Overall Competency. The answer is yes: each component is strongly correlated with the overall score. This is important because it suggests that not only are we measuring the right things, but we are measuring them in an adequate fashion.
17. Inter-rater Reliability: This slide shows the summary rankings of separate trainees’ overall scores, as recorded by different attendings. Each of these trainees had at least ten cases with each of the attendings. This graph represents 355 case submissions. The rank-lists match up, and this is not a random phenomenon. Patterns such as these suggest that while the values of the scores are free to vary between attendings, (a) the attendings’ mental models of the theoretical ideal resident must in fact resemble one another, and (b) the STAT system must be effective in capturing how closely the trainee’s actual performance did or did not approximate the theoretical ideal. As the graph shows, the attendings’ independent conclusions are consistent and reproducible from one rater to another. These results are achieved despite the essentially subjective nature of the STAT assessments and without STAT assessments depending upon absolute or “criterion” measurements.
18. Inter-rater Reliability: This slide shows how well the trainees’ self-assessments match up with those of their attendings. Each graph I’ll show you arranges the consecutive cases of one trainee along the horizontal axis, and their assessment scores on the vertical axis. There will be individual graphs for the separate general capabilities, as well as the “overall” assessment. Shown here is a graph of about 125 of “Trainee A’s” consecutive “Overall Assessments.” The trainee’s self-assessments form one line (the purple) and the various attending assessments form the other (the grey). If you assume that the various attendings are making accurate assessments (the grey line), then how well the trainee’s self-assessments (the purple) correlate with them shows how much insight the trainee has into their own abilities. Is “insight” important? Absolutely. By the end of a residency or of a fellowship, one thing you really want to know about yourself is whether or not you can take on a given case safely, and to do that requires a high degree of insight into your own abilities and limitations.
22. Inter-rater Reliability: Here is someone showing gradual improvement over about 85 cases. It’s intriguing to see how the trainee’s awareness parallels the scores given by the attendings.
22. Summary: In summary: The Surgical Training & Assessment Tool is an internet-based software system which facilitates the production of abundant trainee operative performance data of a uniquely specific and refined quality. The data is immediately retrievable by the trainee, their mentors, and their program directors, and it appears to be suitable for directing their week-by-week training objectives, as well as remaining part of their longitudinal training records. The system is convenient and cheap to use. And using the system on a regular basis appears to favorably influence the educational process by promoting trainee self-reflection and self-awareness and by facilitating mentor-to-trainee feedback, in a fashion so timely as to be immediately meaningful to the trainee.
23. Conclusion: In conclusion, STAT is clearly a practical assessment instrument insofar as it is feasible to use; it is comprehensive, covering a broad array of competencies and all kinds of surgical cases; and it is relevant because it measures the actual events of training as opposed to simulated ones. STAT may in fact be a “reliable” instrument insofar as it appears to be internally consistent and appears to feature an acceptable degree of inter-rater reliability. And STAT as an educational tool is clearly not too difficult to integrate into the busy work-flow of a training program, and it meshes well with the surgical oncology educational curriculum.
26. Future Directions: Firstly, we plan to study the system further, to better establish its “reliability” and its “validity,” and we have five training programs enlisted in a study for the purpose. Secondly, and in an exciting twist, we have adapted the STAT system for use in a major, multi-institutional, multi-national, prospective, randomized controlled trial of HIPEC treatment sponsored by the US Military Cancer Institute. The study will have an educational component wherein participating surgeons will be proctored and certified by the study experts in the proper technique for performing the procedure, and STAT will be utilized in this regard. Furthermore, STAT was adapted to facilitate real-time reporting of the many details of the operative procedure, so the HIPEC trial will be able to conduct operative performance quality-control throughout the course of the study by reviewing the operative details on a real-time basis. This new capacity, we believe, will set a new standard for trials of complex operative procedures.
10. General capabilities: The first section on the assessment page is “structured,” and it addresses what are variously described as “global ratings,” “competencies,” or “surgical qualities”; we are calling them “general capabilities.” This first section addresses three separate capabilities important to the surgeon which are applicable to any and every type of case: Applied Knowledge, Technical Skill, and Professional Independence. Regardless of whether you are logging a breast biopsy or an esophagectomy, these same qualities are assessed. By clicking on the little plus sign, each of these categories breaks down into finer levels of detail, and the trainee can rate him or herself at either the broader level or the finer level of detail, depending upon their own choice. The attending rates the trainee similarly, at whatever level of detail they choose. Both attending and trainee are rating the trainee; neither rates the attending. The scale of performance for both attending and trainee raters is “poor,” “fair,” “good,” “very good,” and “excellent.” Ratings are benchmarked against a theoretical level of where the trainee ought to be for their year of training: an intern would therefore be rated against an imaginary, theoretical standard intern, and a fellow against a theoretical standard fellow.
11. General Capabilities Criteria Checklist / expanded: If desired, each of these categories can expand to greater detail. For example, the “applied knowledge” section expands into…
12. General Capabilities Criteria Checklist – further expanded: … “knowing the planned procedure,” “core knowledge,” and “thinking.” “Knowing the planned procedure” entails knowing the details of the patient upon whom you are operating, knowing every step of the case, and understanding any pertinent issues. “Core knowledge” subdivides into having an accessible grasp of anatomy, physiology, and pathology. “Thinking” involves differential diagnosis abilities, diagnostic abilities, treatment options, backup plans, knowledge of complications, and anticipation of post-operative complications. We ask that you click on a minimum of just one box for each of the three sections within the “General Capabilities” area. For example, for the “Knowledge & Understanding” category, the click can be made at the highest level, or at any of these lower levels.
14. Specific Capabilities Criteria Checklist: … and what you find when you expand them is that each phase is comprised of sub-phases of greater detail. Because you can rate at any level of desired detail, we find that attendings typically submit ratings upon the trainees made at the broader, “phase” levels, occasionally digging down into the deeper detail to make a specific point; and trainees typically submit their self-ratings at the more detailed, sub-phase levels.
11. Specific Capabilities Criteria Checklist: Next comes the checklist specific to the type of case being assessed. 187 separate Surgical Oncology cases were individually outlined and conceptually broken down into separate “phases.” Simpler cases feature just three phases; the most complex may have six or seven. The phases were constructed as similarly as possible across different case-types as well as across different organ-systems. For an esophagectomy, for example, there is an “exploration phase,” an “abdominal mobilization & vascular management phase,” a “cervical mobilization & vascular management phase,” an “abdominal resection phase,” an “esophageal resection phase,” and a “conclusion phase.” In a manner similar to what you’ve already seen within the “General” section, in the “Specific” section each of these separate “phases” can break down into greater detail should it be desired. This case gradually expands into a total of 61 steps (if a step of the case is necessary for its successful conduct, it was included in the checklist). Again, you can make a single click for each phase (at any level of detail), or as many as one click per step; to expand into greater detail you just click on the little “+” sign. The ratings scale used here needs a quick explanation: for this section it differs depending upon whether you are a trainee or an attending. If you are a trainee, you rate yourself along the “phases of technical skills acquisition”: these are phases defined by cognitive psychologists and represent the progressive levels of skill mastery which anybody moves through when learning a new skill. 
The buttons are “NP,” which stands for “Non-Participatory,” and “C,” “I,” “A,” and “P.” Skill mastery begins at a purely conceptual level, which they call “cognitive”: the hands are clumsy and the person’s mind is completely task-absorbed. The learner then moves through an “integrative” phase until the hand movements become graceful and second-nature and the mind is freed up from the task; this is referred to as “autonomous.” We’ve also added a fourth level, “proficient,” meaning not only can you perform the step automatically, but you can do it precisely, perfectly, and with an appreciation of nuance. Filling out this section of the assessment therefore demands a bit of introspection on the part of the trainee, forcing them to appreciate at what level of mastery they may be for each phase or step. This introspection is encouraged intentionally, as a major objective of the STAT system is for trainees to develop an appreciation of their own abilities and limitations. If you’re an attending rating the trainee on these steps you use a similar four-grade scale, but it simply measures the grace with which the step was performed: “poor,” “awkward,” “competent,” or “excellent.”
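The expandable checklist described above, where a rating may be clicked at the phase level or at any step beneath it, can be pictured as a small tree. The sketch below is an assumption-laden illustration, not STAT's implementation: `Node`, `rate`, `is_rated`, and `unrated_phases` are hypothetical names, and the step names are placeholders rather than items from the real checklist.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Trainee buttons as described in the talk: Non-Participatory, Cognitive,
# Integrative, Autonomous, Proficient.
TRAINEE_SCALE = ("NP", "C", "I", "A", "P")


@dataclass
class Node:
    """A phase, or a step within it; either level may carry a rating."""
    name: str
    children: List["Node"] = field(default_factory=list)
    rating: Optional[str] = None

    def rate(self, value: str) -> None:
        if value not in TRAINEE_SCALE:
            raise ValueError(f"not on the trainee scale: {value}")
        self.rating = value

    def is_rated(self) -> bool:
        """True if this node, or anything beneath it, has been rated."""
        return self.rating is not None or any(c.is_rated() for c in self.children)


def unrated_phases(phases: List[Node]) -> List[str]:
    """Names of phases that still have no click at any level of detail."""
    return [p.name for p in phases if not p.is_rated()]


# Two of the esophagectomy phases, with placeholder (hypothetical) steps.
exploration = Node("exploration", [Node("step 1"), Node("step 2")])
conclusion = Node("conclusion", [Node("step 1")])

exploration.children[1].rate("I")  # a click at the finer, step level
print(unrated_phases([exploration, conclusion]))
conclusion.rate("A")               # a click at the broader, phase level
print(unrated_phases([exploration, conclusion]))
```

Treating phases and steps as the same kind of node is what lets a rater stop at whatever level of detail suits them.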
12. Unstructured Elements: The last two sections of the assessment are unstructured, and exist to complement the first two sections. The open comments box allows one to get beyond the inescapable limitations of any pre-formatted assessment scheme. It exists as an escape hatch, because the actual events of any case are generally too unpredictable to be captured in advance by a pre-formatted system. The comments people write add tremendously to the whole depiction. This section is optional. The final section, the overall case assessment, exists for the rater to sum up all the prior observations into a single statement on that trainee’s case-competence for that given case on that particular day. The overall section is a mandatory click; it is not a value calculated by the system, but a grade chosen independently by the rater.
Use of a Novel, Web-based Educational Platform Facilitates Intra-operative Training in a Surgical Oncology Fellowship Program
Paul B. Roach, Kevin K. Roggin, Eugene Selkov Jr., James Dignam, Mitchell C. Posner, Jonathan C. Silverstein
The University of Chicago Department of Surgery
Disclosure Statement
- The principal and contributing authors have no conflicts of interest to report.
- The software program presented has an “open source” copyright, and will be distributed freely to any interested surgical program directors.
Background
- No widely available, comprehensive, easy-to-use platform for evaluating surgical resident/fellow operative performance.
- Current educational reforms depend on one being developed and integrated into training programs.
- Examples exist from within Naval Aviation and Business Management.
Hypothesis
- Can a system be created for achieving continuous operative performance assessments?
- Can it be integrated into our daily routine / educational curriculum?
“The Surgical Training and Assessment Tool” or “STAT”
1. A web-based software system to de-construct the complex training events of an operative case into distinct elements;
2. Enable methodical, thoughtful, subjective assessments of these distinct elements;
3. Re-construct data into usable formats to impact / improve the surgical training process.
Do the “General Capabilities” correlate with the “Overall Score”?
- Each of the general capabilities is significantly and strongly correlated with the overall score.
Linear correlation with overall score:
Knowledge     0.6036
Skill         0.7638
Independence  0.6935
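The slide says only “linear correlation”; assuming this means the usual Pearson coefficient, it can be computed as sketched below. The function name `pearson_r`, the 1-to-5 numeric coding of the ratings, and the sample scores are all assumptions made for illustration, and the scores are NOT study data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length samples.

    Raises ZeroDivisionError if either sample has no variation.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples of at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical ratings coded 1 ("poor") to 5 ("excellent"); NOT study data.
skill = [3, 4, 4, 5, 2, 3]
overall = [3, 4, 5, 5, 2, 4]
print(round(pearson_r(skill, overall), 3))
```

A value near 1 means cases rated high on the capability also tend to be rated high overall; a value near 0 means the two ratings move independently.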
Do the general capabilities predict the overall score?
- In a multiple regression model, the general capabilities significantly predict the overall score.
Multiple regression model for overall score:
Predictor     Coef. (slope)  P
Knowledge     0.312          <0.001
Skill         1.255          <0.001
Independence  0.546          <0.001
Model R² = 0.62
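Written out, the regression reported on this slide has the form below. The intercept is not given on the slide, so it is left as the symbol \(\beta_0\), and the numeric coding of the ratings is likewise not stated; both are gaps in what the slide reports, not values assumed here.

```latex
\[
\widehat{\text{Overall}}
  = \beta_0
  + 0.312\,\text{Knowledge}
  + 1.255\,\text{Skill}
  + 0.546\,\text{Independence},
\qquad R^2 = 0.62
\]
```

Each slope is the change in the predicted overall score per one-unit change in that capability with the other two held fixed, and the \(R^2\) value means the three capabilities together account for about 62% of the variance in the overall score.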
How well do attending ratings of a given trainee correlate with one another?
[Figure: average rank of each trainee across attendings: 1.4, 2.6, 2.8, 3.2]
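One plausible way to arrive at an “average rank” per trainee is to have each attending's scores rank the trainees and then average those ranks across attendings. The slide does not spell out the method, so the sketch below is an assumption; `average_ranks`, the attending and trainee labels, and the scores are hypothetical and NOT study data.

```python
from collections import defaultdict


def average_ranks(scores_by_attending):
    """Average each trainee's rank across attendings.

    scores_by_attending maps attending -> {trainee: mean overall score}.
    Rank 1 is the highest score for that attending; ties are broken by
    trainee name. Assumes every attending has rated every trainee.
    """
    totals = defaultdict(float)
    for scores in scores_by_attending.values():
        ordered = sorted(scores, key=lambda trainee: (-scores[trainee], trainee))
        for rank, trainee in enumerate(ordered, start=1):
            totals[trainee] += rank
    n_attendings = len(scores_by_attending)
    return {trainee: total / n_attendings for trainee, total in totals.items()}


# Hypothetical mean overall scores; NOT study data.
scores = {
    "Attending 1": {"A": 4.5, "B": 3.9, "C": 3.5},
    "Attending 2": {"A": 4.0, "B": 3.2, "C": 3.4},
}
print(average_ranks(scores))
```

Ranking within each attending first removes differences in how generously individual attendings score, so only the ordering of trainees is compared.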
How well do the self-ratings of the trainees compare with the ratings their attendings give them?
[Figure: Trainee “A”; legend: Attending, Trainee]
Summary
- Abundant trainee performance data
- Real-time, transparent, longitudinal record of trainee performance
- Inexpensive & time-efficient
- Influences the educational process:
  - Reflection & self-awareness
  - Feedback
  - Formative & summative
Conclusion
1. STAT is PRACTICAL: Feasible, Comprehensive, Relevant
2. STAT appears to be RELIABLE: Internal consistency, Inter-rater reliability
3. STAT is EASILY INTEGRATED into the daily work-flow and educational curriculum: Voluntary, self-sustaining use
Acknowledgements
- Surgical Oncology Faculty
- Pritzker School of Medicine
- General Surgery Faculty & Residents
- Bob McBride – Education
- CAPT JJ Lee – Naval Aviation
- Steve Roach – Managerial Science
Future Directions
1. STAT Validation Study: 4 military training programs + Hadassah University
2. STAT Expansion into other specialties: Orthopedic, Thoracic, Plastic, Obstetric-Gynecologic Surgery
3. HIPEC study: Training Component, Reporting Component
4. Society of Surgical Oncology: Possible tool for fellowship training