The document summarizes a webinar on using item categories in ExamSoft to direct curriculum reform and evaluation. The webinar covered:
1. Mapping assessment categories to identify gaps and redundancies in curriculum content coverage and frequency.
2. Connecting assessment data to learning objectives to measure intended and unintended changes in student learning over time.
3. Examples of how one medical school used category data in ExamSoft for curriculum blueprinting, predicting future performance, and gap/redundancy analysis.
The webinar emphasized that category data in ExamSoft can contribute significantly to program evaluation efforts and encouraged attendees to think about integrating it with other assessment systems.
Using Categories to Direct Curriculum Reform and Evaluation
1. 866.429.8889 | 1.954.429.8889 learn.examsoft.com
Thank you for joining.
The webinar will begin shortly.
Using Categories to Direct
Curriculum Reform & Evaluation
2. Please pose questions to the presenter through the “Questions” field of the GoToWebinar tool on the right side of your screen.
All questions will be addressed at the conclusion of the presentation.
3. USING CATEGORIES TO DIRECT CURRICULUM REFORM & EVALUATION
Sarah B. McBrien, MS
Curriculum & Assessment Manager
College of Medicine
University of Nebraska Medical Center
4. Session Objectives
• Transform category data into a map of the topics you’re currently assessing.
• Identify gaps and redundancies in a curriculum in order to better manage content coverage, frequency, and location.
• Consider methods for connecting assessment data to learning objectives data.
7. Systematic Approach
When: Frequent & Regular Monitoring
Who: All Students & Subgroups
8. Opportunities for Category Development
Licensing Bodies, Governing Bodies, Course Objectives, Program Objectives, Faculty, Students
9. Opportunities for Category Development: At UNMC
Item Type, ACGME, USMLE, Course Objectives, Review Process, Faculty Author, Item Usage, Cognitive Level, Instructional Mode, Construction Zone
18. Example C: Making Predictions
Principles of Foundational Science > Biochemistry and molecular biology > Gene expression: DNA structure, replication, exchange, and epigenetics (eg, imprinting, X-inactivation, DNA methylation)
24 items
5 exams (3 in M1 year, 2 in M2 year)
Average difficulty index: 0.845
Average time on item: 1.12 mins
19. Example D: Gap & Redundancy Analysis
M1 ICE Exams: 4 exams, 228 total items
45 items come from the USMLE category: Medical ethics and jurisprudence, including issues related to death and dying and palliative care
24. Concluding Messages
• Think carefully about how the category feature in ExamSoft can contribute to your program evaluation efforts.
• It’s never too late: categories can be attached to an item after an exam administration.
• ExamSoft’s category feature is flexible.
• Consider the other systems you have in place and how they connect to the data in ExamSoft.
25. Contact Information & Resources
• Sarah McBrien: sarah.mcbrien@unmc.edu 402-559-2505
• UNMC College of Medicine
• UNMC Training the Physicians of Tomorrow
• USMLE Content Guide
• ACGME Competencies Guidebook
Editor's Notes
UNMC COM is undergoing curriculum renovation for rollout during the 2017-18 academic year. It is an integrated approach to teaching medicine that will consist of three phases. The first phase, Foundation of Medicine, will include integrated blocks organized by organ systems. Our current students take two years of pre-clinical coursework, with the first year focusing on normal structure and function and the second year on abnormal and disease processes. This content will be converted to an integrated approach that combines normal processes with disease processes in a systems-based approach.
Assessment data is vital for:
identifying gaps and redundancies in our current and new curriculum.
managing content coverage, frequency, and location.
mapping instruction and assessment activities to course and program objectives.
Objectives are already mapped to instructional activities via OASIS, and we can connect the data in OASIS with the data in ExamSoft.
Even if you aren’t in the process of renovating your curriculum, this presentation might help you identify opportunities to gather and respond to assessment data from ExamSoft.
The primary purpose of program evaluation is to measure change. This change may be deliberate, purposeful change that we are working toward. These changes might be short-term, based on slight changes to instructional practices, or long-term, as part of a larger initiative. But program evaluation is also important for identifying and responding to unintended changes in student performance. By looking at assessment data from ExamSoft, we can detect changes in student performance and then work to identify the factors that may be contributing to that change.
How ExamSoft can support efforts toward systematic program evaluation is the focus of this webinar, but using assessment data from student examinations is just one piece of a larger puzzle. We also have to review faculty and course evaluation data, student performance on standardized assessments, focus group data, and cultural and social changes in our students that may be difficult to detect with an objective measure of student performance.
There are significant challenges to conducting systematic program evaluation. The learning process itself is complex. Students enter medical school with varying degrees of familiarity with the topics and gain new knowledge at varying rates. What takes one student only one exposure to learn might take another student multiple encounters.
The MD program is complex as well. Multiple instructional modalities are used to reinforce the same content, so we may be unable to pinpoint which approach has the most impact. The content that students tackle is interrelated, sophisticated, and delivered in a fast-paced environment.
Your approach to program evaluation should be deliberate and systematic and should happen on a frequent, regular basis. The literature on theories underpinning approaches to program evaluation, as well as more pragmatic step-by-step guides, is rich; consult it and consider the needs of your students, faculty, and institution when developing a plan for taking a systematic approach.
Tracking the progress of all students and of subgroups is an important part of the process. If you’d like to look more closely at certain groups of students, you can create Exam Taker groups in ExamSoft. The longitudinal analysis feature will let you identify trends across an entire class, and by setting up the subgroups you’re interested in as Exam Taker groups you can track their progress. These groups might be minority students, pipeline program students, or students who have struggled previously.
ExamSoft’s category feature is an important building block in this process. I’d encourage you to think about the opportunities you have for collecting assessment data. One item can have an infinite number of categories. Careful thought about how you’ll build your category structure will make this process easier for you later. I’ll show you portions of our category structure so that you can consider how that might work in your setting.
These are general categories we’re using in ExamSoft. Some of these are directly related to our purpose here today ~ to collect data over time about student performance. Some of them are internal categories that we use to manage our process for building and administering assessments.
Used the import questions feature to import categories ~ see Donna’s presentation
The numbering scheme keeps items in the same order as the outline, and it’s also a helpful shorthand for talking about content in reports.
We keep records of the year in which an item was added to our item bank. We also keep records of when an item was used. Our goal is to have more items in our bank than we need for multiple choice examinations, and this helps us track how often an item is being used in an examination.
The construction zone is used during the year to actually build each exam – it’s like a bin for all of the possible items that could be used on an exam. Our faculty use it to manage the building process, and when an exam is finalized we can take all of those items and drop them into an assessment to be configured and posted. So – why also keep the Years Used data when the construction zone is already doing that? It simply removes a layer of data that we may not need for longitudinal analysis, while the construction zone lets us really zero in on an examination. With this, I can determine the historical usage of an item and provide data about the specific exam in which it was used. So the construction zone serves two purposes for us. You’ll notice here that we’ve carried over a similar numbering approach to keep the examinations in the order we want to see them in our portal.
Let’s start by looking at how we can use the longitudinal reports feature to identify the frequency and locations in which a topic is being assessed.
From the entire 15-16 school year: 177 items were categorized as musculoskeletal items. 118 of those are in the abnormal processes category. We can look more specifically at each topic under the abnormal processes category to see how many assessments include items from that category, how many items, and the average score on that item. Your “group” is defined when you run the report. In this case, this is all students, all examinations from the 15-16 school year.
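Outside ExamSoft’s built-in reports, the same kind of tally can be sketched over an item-level export. This is a minimal sketch assuming a hypothetical export of (category, exam, score) rows; the column layout, category paths, and values are illustrative, not ExamSoft’s actual schema.

```python
from collections import defaultdict

# Hypothetical rows as they might appear in an item-level export:
# (category path, exam name, fraction of students answering correctly).
rows = [
    ("Musculoskeletal/Abnormal Processes/Degenerative", "MS/Derm Exam", 0.92),
    ("Musculoskeletal/Abnormal Processes/Degenerative", "MS/Derm Exam", 0.95),
    ("Musculoskeletal/Abnormal Processes/Inflammatory", "MS/Derm Exam", 0.71),
    ("Musculoskeletal/Normal Processes/Structure",      "Anatomy Exam", 0.84),
]

# Tally item count and mean score for each category.
totals = defaultdict(lambda: [0, 0.0])   # category -> [item count, score sum]
for category, exam, score in rows:
    totals[category][0] += 1
    totals[category][1] += score

for category, (count, score_sum) in sorted(totals.items()):
    print(f"{category}: {count} items, avg {score_sum / count:.0%}")
```

Grouping on the full category path keeps the parent/child rollup (e.g. all musculoskeletal vs. just abnormal processes) a matter of string-prefix filtering.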
From this general report, we can:
identify gaps in the curriculum.
identify topics which may require more attention.
identify trends in performance.
Maybe we’re curious about why the average student score for degenerative and metabolic disorders is above 90% and want to find out where these topics are being assessed.
We can focus on that category and look closer to find out where the examination items categorized in this fashion are being used. I can now see that the majority of those items were included on the MS/Derm examination, which makes sense. I can also see that our anatomy course at the beginning of the M1 year includes one item categorized this way, and two items in the CPEE core in the second year also assess this content.
I can see by looking at the categories assigned to those items which items were used in the 15-16 school year and in the two years prior to that as well. As we think about transitioning to our new curriculum, these items might be redistributed to new blocks in the first phase; but we can refer to this information to examine changes in performance on this topic or on specific items.
Because we’re redistributing and reusing some of our examination items and will identify a pass/fail cut score for students, we will want to be extremely deliberate about planning our examinations next year. We can use the blueprinting feature in ExamSoft to manage the content of each examination.
Exam Blueprinting: Using a variety of categories, we can see metrics about an assessment before it is deployed.
How many items were new this academic year?
What percentage of items came from problem-based learning activities?
What is the weight of normal vs. abnormal processes questions?
Most of the items on this examination were imported from our former system, 16 were created this year, and 37 were created last year. This is a metric we like to keep our eye on, as we strive to build our item bank with well-constructed examination items.
The percentages on this screen show the percent of total items on the exam in each category, not student performance. 47 of 128 (about 36%) of the items on this assessment are categorized in the General Principles of Foundational Science. We do have performance data from our longitudinal reports about how students perform on that general category but also on each of the sub categories below it, and on each item within those categories as well.
That leads us to the third way we can use categories to mine valuable data: we can make predictions about how students will perform on an examination by looking at the make-up of that exam and comparing performance on those types of items in previous years. This will be an important part of the test-building process for us next year as we re-arrange the content and reuse some of the same test items. Because we will have historical data on most of the items used on future exams, we can make some predictions about how students will perform.
And most importantly, we can respond promptly if students do not perform on examination items in the same manner as their predecessors. We can’t afford to wait until the end of the first phase to respond to gaps or weaknesses in our teaching in our new curriculum, and comparing data to previous cohorts of students is one way to monitor progress.
In this example, medical ethics makes up nearly a fifth of the content assessed on our first-year ICE exams (45 of 228 items). This content will be distributed among blocks in the first phase, and we have the capability to be sure that we maintain this ratio if we feel it is appropriate. We can also be sure that student performance on this type of item is consistent with that of previous cohorts of students.
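A quick way to monitor that ratio as items are redistributed is to recompute the category’s share of the blueprint. A minimal sketch, using the 45-of-228 figure from the slide; the target weight and tolerance band are assumptions for illustration, not values from the talk.

```python
# Category weight on the M1 ICE exams: 45 ethics items of 228 total (from the slide).
ethics_items, total_items = 45, 228
weight = ethics_items / total_items
print(f"Ethics weight: {weight:.1%}")

# Hypothetical blueprint target and tolerance; flag drift outside the band.
target, tolerance = 0.20, 0.05
if abs(weight - target) > tolerance:
    print("Flag: category weight drifted from the blueprint target")
```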
Gap & Redundancy Analysis
Using the USMLE categories as a guide, longitudinal reports can be generated to identify content areas with low performance rates.
Likewise, content areas can be analyzed to determine:
where content is being taught (or assessed).
whether redundancy is planned and appropriate.
whether weighting of the topic is appropriate.
The ability to connect outcomes from our assessments to our program and course objectives is important for our transition as well. What you see here is the list of course objectives from our biochemistry course, which currently takes place in the first year. Each of these objectives maps to a program objective; and likewise, objectives for each of the instructional activities in this course map to these course objectives.
The screenshot you are seeing here is from OASIS, the information management system we use at UNMC.
Let’s focus on this course objective ~ how cells generate energy.
We can tie that objective directly to this USMLE category, and use our historical data to identify changes in performance as we deliver our improved curriculum next year.
10 items are categorized under energy metabolism, and the average performance by M1 students in 15-16 on those items is 73%. This percentage is fairly consistent with overall exam scores in this course. We can compare this to the performance of our M1s in the 16-17 school year and use that data to predict how students will do in Phase 1 next school year. Early responses to downward trends in performance will be vital next year, and by developing a set of baseline data now we can keep an eye on how our students are performing with our new approach. These same exam items may not be used on an exam or might be redistributed to a quiz or incorporated into another instructional activity, but we have created a baseline set of data.
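The baseline-and-compare idea can be sketched as a simple check: store last year’s category averages and flag any category whose new cohort average drops by more than a chosen threshold. The 73% baseline comes from the talk; the current-year figure and the alert threshold are hypothetical.

```python
# Baseline category averages from the 15-16 cohort (73% is from the talk);
# current-year figures are hypothetical placeholders.
baseline = {"Energy metabolism": 0.73}
current  = {"Energy metabolism": 0.65}

ALERT_DROP = 0.05   # hypothetical threshold: flag drops of 5 points or more
alerts = []
for category, base in baseline.items():
    delta = current[category] - base
    if delta <= -ALERT_DROP:
        alerts.append(category)
    print(f"{category}: {base:.0%} -> {current[category]:.0%} ({delta:+.0%})")

print("Needs early response:", alerts)
```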
You’ll want to think about your existing category structure within your framework for program evaluation. Don’t re-invent the wheel. Many health sciences universities are tackling this project at the same time, and our collective experiences are helpful in thinking through this process. Consider how other institutions have created their structures and take the best parts for your setting.
Frankly ~ we did not get this right on our first try. We learned the hard way on some things and had great tips on others. It’s important to remember that you can categorize items at any time. We found ourselves barely able to keep up last school year with the implementation of ExamSoft in both the first and second year. Some of the work we have done took place after the fact, knowing that we needed to have this structure in place for a curriculum transition. We’re still fine tuning our approach and constantly considering new ways of thinking about how we can use categories to our advantage.
Any one item can be categorized multiple ways, so there is also no harm in trying out one or two different approaches to see what works best for you. There is certainly a balance between having too few and too many categories, but it’s okay to experiment with the system to try things on and see how they fit with your needs. If you decide to move away from a certain approach to categorizing, it’s easy to archive those categories.
As we move forward with connecting ExamSoft data and OASIS data, we’re making careful decisions about how we label items for inclusion so that we can easily merge data. Thinking through the reporting mechanisms of all of your systems on the front end will save you time, but as I said before, you can always adjust ExamSoft’s category information if you don’t get it right the first time around.
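As a rough illustration of that merge, suppose each system can export data keyed by a shared objective label. The IDs, field layout, and values below are hypothetical; neither OASIS’s nor ExamSoft’s real export schema is shown in the talk.

```python
# Hypothetical OASIS export: objective ID -> course objective text.
oasis_objectives = {
    "BIO-04": "Explain how cells generate energy",
}
# Hypothetical ExamSoft export: objective ID -> (item count, average score).
examsoft_results = {
    "BIO-04": (10, 0.73),
}

# Join the two exports on the shared objective ID; objectives with no
# assessment data get a count of 0 and no score.
merged = {
    obj_id: (text, *examsoft_results.get(obj_id, (0, None)))
    for obj_id, text in oasis_objectives.items()
}

for obj_id, (text, n_items, avg) in merged.items():
    print(f"{obj_id}: {text} -- {n_items} items, avg {avg:.0%}")
```

Deciding on a shared label like this up front is the “careful decision about how we label items” the note describes: once both exports carry the same key, the join is trivial.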