Conducting a Summative Study of EHR Usability: Case Study


At last year’s conference, a group of us explored the complexity involved in evaluating the usability of Electronic Health Records: the wide range of user profiles and characteristics, a seemingly infinite number of tasks, and the challenge of obtaining realistic data while respecting HIPAA regulations. In December, the Usability team at athenahealth conducted a summative usability study of [product]. In this case study, Kris will discuss how the team navigated the challenges of summative EHR evaluation to conduct this study. Topics include task selection, recruiting, metric selection, logistics, and lessons learned.

  • Welcome, thank you, introductions
  • What is athenahealth? athenahealth is a Watertown-based company delivering cloud-based solutions for practice management, electronic health records, patient communication, and care coordination. User Experience was introduced to athenahealth in 2008. Backed by the CEO, embedded in R&D processes, tackling really interesting UX problems. The team has grown to 22 people, providing: user experience design, user research, patient safety, and product copy. The company and the team are still growing (come talk to us!)
  • Here’s where we left off from last year’s conference. In the next slides, we’ll talk about how athenahealth’s usability team approached these challenges.
  • It is possible to conduct usability studies on EHRs. We routinely conduct formative studies, along with field studies and other user research activities, as part of our user-centered design process. In addition, in December 2011, we conducted a baseline summative study. So, while it is challenging to conduct a summative usability study, it is possible. We’ve done it.
  • What does it take to meet the challenges?
  • (This really should be a much bigger bag, or a much smaller box)
  • Imagine testing all the tasks that all of these health care workers do. Then imagine testing all those tasks with multiple user groups, segmented by age, specialty, tech-savviness, etc.
  • The number of tasks that are supported by EHRs is huge – choosing which ones to cover is a big challenge. The first thing to ask is how the data will be used – why are you doing this in the first place? A summative test is really for comparative measurement – comparing two versions of the same thing, or two similar things. We conducted our study as a baseline measurement against which we could measure future versions of our EHR. This allowed us to focus on tasks that mattered to us: tasks that our users do, tasks we intended to reexamine during 2012. Most of our clients are in an ambulatory setting, so we tested tasks that often happen in a doctor’s office. We support a variety of specialties, so we picked tasks common to many specialties. As with any test, we proposed some tasks, assigned maximum times for each task, and then presented the list to stakeholders for prioritization.
  • (I recreated this slide, removing circles that were part of the original author’s point, but might have implied that these were the tasks we tested.) Everyone who’s ever had a visit to the doctor’s office, raise your hand. You have some idea of what the doctor does. This slide, from the Journal of Usability Studies, shows a rough task analysis of an office visit for an IM doctor – in case you aren’t familiar with what a doctor does.
  • How many participants? “Basically, you need an estimate of the variance of the dependent measure(s) of interest (typically obtained from previous, similar studies or pilot data) and an idea of how precise the measurement must be (which is a function of the magnitude of the desired minimum critical difference and statistical confidence level); once you have that, the rest is arithmetic. There are numerous sources for information on standard sample-size estimation [6, 23]. For this reason, I’m not going to describe them in any additional detail here (but for a detailed discussion of this type of sample size estimation in the context of usability testing, see Lewis [14]).” (from James Lewis, “Sample Sizes for Usability Tests: Mostly Math, Not Magic”, in Interactions, November–December 2006)
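The arithmetic Lewis alludes to can be sketched with the basic normal-approximation formula for estimating a mean within a chosen margin of error. This is a minimal illustration, not the full t-distribution-based iteration Lewis describes; the numbers in the example are invented for demonstration.

```python
import math

def sample_size_for_mean(std_dev, margin, confidence_z=1.96):
    """Participants needed so the sample mean of a measure
    (e.g., time on task) lands within +/- margin of the true mean.
    confidence_z is the normal critical value (1.96 ~ 95%)."""
    return math.ceil((confidence_z * std_dev / margin) ** 2)

# Example: pilot data shows a 40-second standard deviation in task time,
# and we want the mean estimated within +/- 15 seconds at 95% confidence.
print(sample_size_for_mean(40, 15))  # 28
```

Tightening the margin or raising the confidence level grows the required sample quickly, which is why the "how much time and money do you have" questions later in this deck matter.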
  • AND THEN… Because we are cloud-based, the environment changes every month – including our testing environment. Our EHR is released every month; because it’s cloud-based, everyone gets a new version once a month. This is a summative test, so all participants need the same environment and data: same fake patient, same fake problem. We needed to replicate the environment. We arranged to replicate our scrambled, set-up environment. We borrowed some servers from another group. Our Development resource arranged for eight copies that were refreshed every night, and added a way for us to refresh a copy during the day if need be. Each participant in a day used a different instance of the environment. Environment setup and the release schedule set boundaries on testing dates: we scheduled test sessions for the last two weeks before the next release.
  • This is not our real data. Task success – definition: reached the goal within the time limit without committing a critical error. Some tasks had partial success. We counted number and percentage. Time on task – definition: from when they started the task to when they declared themselves done. Averaged over successful participants only. Error-free rate – error definitions were determined during analysis; critical errors were defined as errors that carried patient safety risk. We counted the number of participants who completed the task without error. Some had partial success but no errors – guess how. Post-task ease of use ratings, and SUS at the end of all tasks – average ratings on tasks; SUS.
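The metric definitions above can be sketched as a small computation over per-session records. The session data and field names here are invented for illustration; the SUS scoring itself follows the standard published procedure (odd items score minus 1, even items 5 minus score, sum scaled by 2.5).

```python
# Hypothetical session records; field names are illustrative only.
sessions = [
    {"success": True,  "time_s": 95,  "errors": 0, "ease": 6},
    {"success": True,  "time_s": 120, "errors": 1, "ease": 5},
    {"success": False, "time_s": 300, "errors": 2, "ease": 2},
]

success_rate = sum(s["success"] for s in sessions) / len(sessions)

# Time on task: averaged over successful participants only,
# matching the definition above.
times = [s["time_s"] for s in sessions if s["success"]]
mean_time = sum(times) / len(times)

error_free_rate = sum(s["errors"] == 0 for s in sessions) / len(sessions)

def sus_score(responses):
    """Score one 10-item SUS questionnaire (answers on a 1-5 scale).
    Odd items contribute (score - 1), even items (5 - score);
    the sum is scaled to a 0-100 range."""
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5

print(success_rate, mean_time, error_free_rate)
print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))  # 85.0
```

Deciding these definitions (what counts as success, which errors are critical) before analysis, and documenting them, is exactly the consistency work the later "test hygiene" slide calls for.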
  • Markers for the next travelers… / Lessons learned
  • Johan says that cleansed patient data is available from the ONC – but I’m not finding it easily. When we asked him, he sent this link:
  • Your turn. Write and tell us how it goes. Use the EHR CIF format.

    1. Conducting a Summative Study of EHR Usability: Case Study from athenahealth Kris Engdahl May 7, 2012
    2. How do we define EHR usability? What is an electronic health record (EHR)? • Electronic version of paper charting, plus capacity for electronic data exchange; managed by healthcare professionals • Used by single-physician practices, large healthcare networks, and everything in between • Different from a personal health record, which is managed by the patient What is usability? (Choose your definition – here is the NIST definition) • ISO 9241-11: “The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.”
    3. Why is EHR Usability such a hot topic? Healthcare is a large industry • $2.5 trillion, 14.3 million jobs, 17.3% of GDP It affects all of us • We are all once and future patients • We pay for everyone’s healthcare, either through insurance or taxes Recent healthcare reform legislation encourages the adoption of EHRs • Health Information Technology for Economic and Clinical Health (HITECH) Act includes incentives for “meaningful use” of EHRs • Office of the National Coordinator (ONC) is paying attention to the field of usability as it evaluates EHRs for certification Poor EHR usability can put patients at risk • Increasing attention is being paid to patient safety in regards to EHR usability Healthcare providers are demanding better EHR usability
    4. Challenges in EHR Usability Testing Who do we test with? • Who are “representative” users? • How do we get access to them? What tasks do we test? • Different kinds of users have widely varying tasks What product “version” do we test? • Most EHRs are highly customized for individual practices • How useful is a test of a “generic” version? What data do we use for testing? • Participants are always distracted by unrealistic data – so it has to be realistic • Real data is protected by HIPAA It is challenging, BUT…
    5. Meeting the challenge Know why you’re testing • Why do a summative test? • What will you do with the data? Manage the scope • How much time do you have? • What kinds of resources do you have? • How much can you reasonably do? Prepare thoroughly • Consider all the pieces • Plan for logistics and timing • Prepare the testing team Do it!
    6. Scope: You could do this forever Testing an EHR could mean testing everything all these people do And then there are the other dimensions that make more user groups • Age, specialty, tech-savviness, domain experience, etc.
    7. Knowing why helped us scope our test Why we conducted a summative study • Planning was underway for 2012, with UX changes in it • We wanted to be able to quantify the improvements we were embarking on • This would be the “before” measurement This determined the tasks and users we selected • We focused on areas that we intended to re-examine in upcoming releases  We had completed a heuristic review and a patient safety review  We had identified areas for UX work in 2012 • We focused on clinicians’ tasks in an ambulatory setting • We focused on tasks that are common to a number of different specialties • We presented the list, with time limits, to key stakeholders for prioritization
    8. Here’s a rough view of an internist’s work in an office visit. Source:
    9. Scoping the recruit Challenge: Recruit “representative” users • Clinicians vary by specialty, practice size, age, experience, gender, EHR knowledge • Just how many do you need to recruit, anyway? How we narrowed the list • We screened for clinicians – MD, DO, NP, PA • We asked for a mix of specialty, gender, years of experience, and practice size • We asked for people who used EHRs but had not used ours • We screened for people who commonly did tasks we were testing Recruiting itself • Hired a professional recruiting firm • We paid participants through the recruiter • We recruited 22, hoping to get 20 participants
    10. Determining the environment to test Challenge: Most EHRs are highly customized by clients • Installed EHRs have lots of custom programming • Our cloud-based EHR is highly configurable • And it changes every month Challenge: Data has to be realistic, but not real • Participants will be distracted by incomplete or incorrect data • It is wrong (and illegal) to use actual medical data What we did • We modeled the environment we tested with on an actual client environment • We chose a practice that had very little configuration (as “vanilla” as possible) • We scrambled the data from the practice so all records were deidentified • We arranged to be able to copy our set-up test environment for each participant
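The "scrambled but realistic" idea on this slide can be sketched as a tiny deidentification pass: swap identifiers for surrogates and shift every date by a per-patient offset, so intervals between visits stay plausible. This is an illustrative sketch only, with invented record fields and a hypothetical `scramble_patient` helper; real deidentification must satisfy the HIPAA Safe Harbor or Expert Determination standards, not a toy script.

```python
import hashlib
import random
from datetime import date, timedelta

# Pool of obviously fake replacement names (illustrative).
FAKE_NAMES = ["Alex Rivera", "Sam Chen", "Jordan Lee"]

def scramble_patient(record, secret="study-salt"):
    """Return a deidentified copy of a patient record.
    Deterministic per patient: the same id + secret always maps to
    the same surrogate key, fake name, and date offset, so repeated
    environment rebuilds stay consistent."""
    seed = hashlib.sha256((secret + record["id"]).encode()).hexdigest()
    rng = random.Random(seed)
    offset = timedelta(days=rng.randint(-180, 180))
    return {
        "id": seed[:8],                     # surrogate key, not the real id
        "name": rng.choice(FAKE_NAMES),     # fake name
        # Shifting all dates by one offset preserves intervals
        # between visits, keeping the chart clinically plausible.
        "visits": [d + offset for d in record["visits"]],
    }

original = {"id": "12345", "name": "Jane Doe",
            "visits": [date(2011, 3, 1), date(2011, 3, 15)]}
print(scramble_patient(original))
```

Determinism is the useful property here: each nightly refresh of the eight environment copies can regenerate identical scrambled data, so every participant sees the same fake patient with the same fake problem.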
    11. Managing participant training Challenge: EHRs are not walk-up-and-use applications • Most implementations include 2-5 days of training Challenge: We had 5 moderators • With varying expertise with the application • Any 5 people will say things differently What we did • We worked with Training professionals to develop a training script • ~8 minutes long • Covered key concepts / areas of the interface • Walked through the tasks that we tested • Each moderator followed the script, so each participant had the same training • We printed out the screen shots from the training walkthrough as “Help”
    12. Challenge: What do you measure? Note: This data is for illustration only
    13. Start with a reasonable scope Know how your data will be used • What are you comparing? • Who will use the data, and how? Prioritize tasks and user groups • Most common tasks • Critical tasks • Tasks that carry patient safety risks • “Disparity-oriented use cases” (NISTIR 7769) • NIST has some user scenarios in NISTIR 7804 Determine a reasonable sample size • How much money do you have? • How much time do you have? • (see “who will use the data, and how” above)
    14. Determine how realistic you can get Balance customization with comparability • Least configuration / customization? • “Typical” configuration / customization? Be sure to get realistic – but not real – data • Creating realistic data from scratch will be time- and knowledge-intensive • Patients’ real data is covered by HIPAA • Would be nice if NIST had importable patient charts for their scenarios Decide how to handle training • How much training do users normally get with what you’re testing? • How much time can you get with participants? • Can you develop customized training for the tasks you will test?
    15. Plan for a specialized recruit Prepare the screener(s) • NISTIR 7804 has a sample screener • You may need more questions, depending on tasks and users Allow time for recruiting • The more specific your screener, the more time you need • Book your professional recruiter in advance Don’t skimp • On recruiting • On incentives Be flexible to accommodate participants • Medical people are wicked busy • Be prepared to test early in the morning and in the evening
    16. Prepare your moderators Get any equipment you need ahead of time • Technical problems tend to happen when you least expect them • Plan to have backup plans for everything Schedule time to train moderators • Product: Effective paths, ineffective paths (and ways back), accelerators • Test script: Starting points, time limits, what to look for Plan for multiple pilot sessions • Ideally at least one for each moderator, with everyone watching • Discussions ahead of time about success and errors  Identify likely errors  Get consensus on definitions of success • Discussions on how to handle possible “situations”
    17. Use good test hygiene Schedule sessions reasonably • Time between sessions, for rest and reset • Do not overwork moderators • Make sure moderators eat and sleep Normalize observations and analysis • Encourage multiple observers • Document decisions about success and errors • Do a consistency pass of all recordings • Double-check data storage, analysis, and statistics
    18. References General info on healthcare industry: • annual-jump/1117 Forecasts on health care spending: • What’s an EHR? • basics/ Articles on EHR Usability: • Health Insurance Portability and Accountability Act of 1996 (HIPAA) privacy: •
    19. References Government documents on EHR usability • NIST site on the usability of Healthcare IT:  NISTIR 7769 – Human Factors Guidance to Prevent Healthcare Disparities with the Adoption of EHRs  NISTIR 7741 – NIST Guide to the Processes Approach for Improving the Usability of Electronic Health Records  NISTIR 7742 – Customized CIF Format Template for Electronic Health Record Testing  NISTIR 7743 – Usability in Health IT: Technical Strategy, Research, and Implementation  NISTIR 7804 – Technical Evaluation, Testing, and Validation of the Usability of Electronic Health Records • AHRQ articles:  EF.pdf  EF.pdf Information on EHR usability and patient safety • • • _and_converging_technologies/