Reprint of paper appearing in IEEE
Transactions on Professional Communication ,
Volume 37, Number 3, September 1994. ©1994...
Ideally, the project manager has played a key             •   Test the test. Before conducting the actual
role throughout ...
These axioms govern the project manager’s               Defining the scope of a usability test is usually
responsibility t...
Table I. Working Time Requirements for Usability Test Phases.

   Project Phase                       Working Time
The end product of the design stage is a                RECRUITING PARTICIPANTS
document listing the organization and orde...
The results of recruiting are:                            a map to the location, and describing their
The detailed task descriptions from the test              Create lists of what to supply for each session,
design serve as...
script, becoming more comfortable with the                    software back up without losing valuable
language and modify...
Regardless of the method chosen, observers can            REFERENCES
flag interesting participant actions and
Upcoming SlideShare
Loading in...5

Techniques for Managing a Usability Test


Published on

  • Be the first to comment

  • Be the first to like this

No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Techniques for Managing a Usability Test

  1. 1. Reprint of paper appearing in IEEE Transactions on Professional Communication , Volume 37, Number 3, September 1994. ©1994 IEEE. Home Techniques for Managing a Usability Test Laurie Kantner Abstract—Investing time, energy, and money in a The goal of this paper is to provide the project usability test pays off when the data you collect manager of a usability test with tools for answer your questions. Who makes sure usability managing the details and fostering teamwork. tests meet their information-gathering goals? The The guidelines presented here apply to all types project manager, who has to be ready to solve the of usability projects, from laboratory tests to many problems that will inevitably arise. This paper group interviews to field studies, and to many assumes the reader has taken a course on usability industries. Specific examples generally assume methods or has conducted a usability test with the individual laboratory tests of computer software assistance of a professional in the field. products and documentation. “U SABILITY is in the details.” Usability testing demands both creative innovation and meticulous attention to details. Only “after the fact” does SKILLS BROUGHT TO THE TASK The person best qualified to manage a the critical-path nature of some details become usability test most likely wears multiple hats in obvious. Managers need to give these details the the organization. Within a computer software attention they require. manufacturing company, this person may be a usability engineer, quality-assurance specialist, Usability test teams provide a new working user-interface designer, or technical context for the individual team members, who communicator. Recognizing that there is more to are generally from different groups or usability testing than “crunching numbers,” the departments: development or engineering, ideal project manager: human factors, documentation, quality control, technical support, customer sales/support, and • Identifies strongly as a user advocate. current internal users. As with any collaborative • Possesses creative problem-solving endeavor, usability tests can run fairly ability. successfully by themselves when they target agreed-on goals and follow an agreed-on • Provides an objective, sensitive view process. But as is often the case with team when facilitating goal-setting among projects, not everyone has the same goals or the individuals with different responsibilities same understanding of the process and schedule and backgrounds. to follow; and when changes occur, expectations • Understands the importance of often differ about what alternatives to pursue. scheduling tasks early and rescheduling Details overlooked during preparation for a diligently. usability test become gaping holes in the data • Knows the value of keeping written track collected during the actual test. With project of the test plan, progress, and decisions. management formally assigned to a designated team member, someone becomes responsible for • Understands the scientific method. paying attention to these details for fostering • Can keep the ultimate goals of the test in team communication and agreement on the test focus even when immersed in situational goals, processes, and schedule; and for revising constraints. goals when compromises become necessary. 1
  2. 2. Ideally, the project manager has played a key • Test the test. Before conducting the actual role throughout the product life cycle, helping sessions, conduct one or two “dry-run” define past studies for earlier product versions if sessions and pilot-test sessions to verify not actually managing those studies directly. the flow of the tasks, the clarity of the The project manager understands and promotes administrator’s language, the operation of usability testing as an iterative process, keeping the product, and the length of the session. the purpose of the current test in focus by The dry run identifies major flaws; the reminding team members of the results of pilot test identifies where fine-tuning is earlier studies and the goals of future studies. needed. Even when these axioms are well understood AXIOMS FOR COLLECTING THE RIGHT DATA during the initial test definition and receive full support from the project team, two pressure There are many ways to collect the wrong data points during the preparation stage lead to during a usability test. Even when the test axiom amnesia: initially focuses on high-priority, specific questions, its definition can easily change • “While we have them here” tempts team during the preparation stage, increasing the members to add more test activities to likelihood of ambiguous and unusable results. take advantage of the investment in Because preparation often constitutes the first recruiting participants. half to three-fourths of the entire project, there • “We’re running out of time before next are numerous opportunities for the definition to week’s sessions” causes three types of blur before any actual data collection begins. compromises: (1) features to be tested are The project manager is the gatekeeper, not available for the dry-run or pilot test, overseeing what goes into the test definition and reducing the usefulness of those pre-test what is taken out, and therefore must be verification steps; (2) features to be tested prepared to explain the consequences of every cannot be completed in time and must be suggested change in the definition. Some axioms removed to keep the test on schedule; and that help ensure usable data at the end of the (3) participant recruiters are tempted to test are the following: relax the screening requirements because finding appropriate participants is taking • Recruit participants who truly match the longer than expected. target audience for the product. If the participants don’t match, the results The project manager must keep waving the won’t reflect the experiences you can axiom flag, asking questions such as these: expect from your “real-world” audience • “If we study this additional feature, what [1, 2]. will we do with the information we • Design tasks that match expected—and learn?” natural—use of the product. Only realistic • “Will the added activity weaken the data tasks will enable identification of for one of our primary goals?” potential problems in actual use situations. • “If we remove this activity, how will what we don’t learn hurt us? What is the • Minimize variables and activities that do trade-off between that and rescheduling not generate (or might even mask) the test two weeks later?” meaningful results. Identify how every activity addresses a major question the • “Will using less-qualified participants still test is to answer [3]. give us the data we want?” • Make sessions as consistent as possible for each participant. Balance the order of presentation where you are deliberately varying participants’ experiences. Extraneous data complicates analysis and can dilute the usefulness of the results. 2
  3. 3. These axioms govern the project manager’s Defining the scope of a usability test is usually responsibility throughout all seven phases of a iterative. You begin by listing everything usability test: everyone would like to know, then evaluating the resources required to design and conduct a • Planning the test. test of that magnitude, paring lower-priority • Designing the test activities. Recruiting items from the list, reevaluating resource participants. requirements, and continuing this process until the right combination of content and feasibility • Preparing the test materials. is reached. Several rules of thumb help to • Setting up the test environment. shorten these iterations: Conducting the test. • Participant sessions can vary in length • Compiling the test results. from half an hour to half a day. The primary consideration is the length of time required to analyze the data and PLANNING THE TEST make recommendations, generally two to The end product of planning the test is a four times as long as the time spent document describing the test goals and collecting it. In addition, long sessions methodology, participant- selection may not be feasible for many potential requirements, working procedure and schedule, participants, lessening the likelihood of and resource requirements. To allow enough recruiting a representative cross-section. time for participant recruiting, a draft of the • A session longer than two hours requires participant selection questionnaire/script for that you add time for taking a formal screening candidates should also be ready as break. In addition, 15 to 30 minutes of part of the initial test plan. session time may be needed for The project manager is the logical person to participants to fill out an initial produce the test plan, based on input from background questionnaire and for others during initial planning discussions. The post-session debriefing. planning process should elicit agreement on the • Base the session length on the following information: least-experienced or most methodical • Major questions the test should answer, participant. In laboratory testing, sessions with assigned priorities. running over their allotted time may need to be compressed or truncated, introducing • Number of participants and desired more variability into the test results. characteristics. Remember that the dry run, often with an • Completeness, maturity, or fidelity of the internal, experienced participant, may be product (and documentation) for the test. completed in one-third to one-half the time For a software product: a paper prototype that a test session with an external, or on-line demonstration? For software inexperienced participant will require. documentation: handouts, a manual, or • For studies designed to detect serious on-line support? problems with a product or document, • Resources available to plan and conduct about 6 participants per “cell” or audience the test, and the desired schedule (first group is usually adequate [4]. If the goal cut). is to generalize the results over a broad range of uses and users, consult with a • Participant compensation available. Be statistician to determine an appropriate ready to offer an honorarium if gifts or number of participants. souvenirs do not attract candidates, and decide on the amount. Check with your The table below suggests ranges of hours to company’s legal department for allow for each test phase, to help the project record-keeping requirements if you offer manager determine the resources and calendar an honorarium. time required to prepare for and conduct a usability test. 3
  4. 4. Table I. Working Time Requirements for Usability Test Phases. Project Phase Working Time Design 20 to 40 hours Participant recruiting 2 to 4 hours per final participant Materials preparation 30 to 80 hours Session administration Number of participants times the session length, plus time for dry-runs, pilot-testing, breaks between sessions, and modification to materials Results reporting 50 to 150 hours (depends on study complexity, number of participants, session length, and level of detail in the final report) Recruiting 8 to 10 participants will probably Note that a usability test that must be take 20 to 40 hours; however, you need to completed in less than 8 to 12 weeks may mean allocate two to three times as much calendar using fewer participants, testing only a small time to allow for “down time” spent awaiting part of the product, and providing a limited returned calls and collecting more names of analysis (and final report) [5]. However, a candidates. smaller-scale test may be appropriate in two cases: to demonstrate the value (albeit limited) Figure 1 shows a typical schedule for a of usability testing to uninformed management; laboratory test with 8 participants, 1.5-hour test and to test mature products undergoing limited sessions, and a detailed final report. When changes from one release to another. developing your schedule, assume that the date by which developers promise a testable product DESIGNING THE TEST ACTIVITIES may slip another week or two. Make sure your It’s tempting to go from the test plan directly plan accommodates this likelihood yet still to preparing the materials. However, a detailed delivers the test results on the originally test design aligns specific expectations about the specified date, since that date is unlikely to test among all concerned parties, before the move. Also, postponing test sessions is time-consuming activity of preparing the time-consuming and increases participant materials begins. cancellations. Week Week Week Week Week Week Week Week Week Week 1 2 3 4 5 6 7 8 9 10 Design the test Recruit participants Prepare materials Administer sessions Report results Figure 1. Typical schedule for a laboratory usability test. 4
  5. 5. The end product of the design stage is a RECRUITING PARTICIPANTS document listing the organization and order of As a parallel activity to the test design and tasks, detailed activity descriptions, relevant materials development, recruiting participants issues/questions each task or activity is to requires adequate time and attention. Writing a address, and estimated timings for each task or detailed recruiting script and questionnaire activity. If the initial timings result in a session enables supporting staff to conduct this activity, that is too long, the test designer uses the test freeing the test designer to concentrate on test goals and priorities to eliminate activities. The activities. initial test design should conservatively allow for about 25% growth in length as a result of the The recruiting script provides the sequence of review process. qualifying questions the recruiter asks prospective candidates, designed to minimize In addition to describing activities the the time required of both the candidate and the participants will perform, the design lists recruiter. The script contains notes to the assumptions about participant skills, as well as recruiter indicating where an answer product functionality and documentation disqualifies the candidate from participating, at elements to be tested. The more detailed these which point the script prompts the recruiter to descriptions, the earlier you can identify any thank the candidate and ask for referrals. It also discrepancies between test activities and skills provides standard language for describing the and resources to complete them. test, so that all participants hold roughly the Team members carefully review the test same expectations. design for how it addresses test goals and its To ensure that a consistent methodology is assumptions about participant skills and used to screen candidates, preferably only one product/documentation readiness. The project person should conduct all recruiting. The manager facilitates the review discussion and recruiter should be outgoing, persuasive, and negotiates differing priorities when team sufficiently knowledgeable about the test to set members suggest design changes. candidates at ease. The recruiter should test the At the same time, team members also approve script on coworkers to ensure that it reads the final participant-screening questionnaire so smoothly and naturally before using it on that screening can start right away. That potential participants. approval should include agreement on the People’s motivations to participate vary, and priorities of the different participant so does their level of commitment when the characteristics, so that any compromises made actual day of their session arrives. Plan for quickly during screening are consistent with test inevitable drop-outs and “no shows” by goals. recruiting 10-20% more people than you actually At this point, the test team’s ability to produce need. Inform backup participants of the all required elements to meet the originally importance of their flexibility (in being willing defined schedule becomes clear. If necessary, the to be scheduled at the last minute), and be project manager revises the test schedule to prepared to offer them the same compensation reflect changes. as firmly scheduled participants. Although it provides more concrete details Expect the recruiter to have to make 10-15 about the exact activities during the test, the telephone calls for each final participant design is still a concept document, written recruited. Many initial calls are to persons without benefit of the product or documentation within customer organizations who can refer the to be studied. Thus, it is only as “accurate” as recruiter to an appropriate candidate; the script the development team’s stated intentions and needs to provide language for these the test designer’s imagination; expect transactions. The recruiter should make sure significant factual changes during materials each telephone call results in either a qualified preparation. candidate or names of other persons to contact, preferably both. 5
  6. 6. The results of recruiting are: a map to the location, and describing their compensation (gift, souvenir, honorarium). • Recruited participants (including extras). In recruiting dry-run and pilot-test • A questionnaire filled in by the recruiter candidates, qualifications are generally for all candidates who pass the initial somewhat less stringent. A dry-run participant qualification stage. need not have all the characteristics of the • A log of all contacts made, organized by desired participants (most tests use internal candidate and by status of recruiting participants-employees-for dry runs). The (“accepted,” “pending,” “rejected,” pilot-test participant should be an external “rejected but possible future candidate”). candidate who would qualify as a real The contact log should contain complete participant except in perhaps one characteristic. contact information for candidates, including a fax number for instant PREPARING THE TEST MATERIALS confirmation. The test materials govern all activity during • A summary of the characteristics met by the test sessions. They contain several types of the recruited participants. Team members information: should review the list of recruited participants to identify any qualification • A script for the administrator. Using a problems before final confirmation script ensures a consistent presentation packages are sent. from one participant to the next and minimizes the effects of fatigue as the day • A short description of the usability test for progresses. Logistics notes for the those candidates who need to present administrator. something in writing to receive their manager’s approval to participate. • Logistics notes help the administrator keep track of details about the Regular contact and final confirmation are key environment or the order of activities to ingredients for letting participants know how ensure consistency. important their participation is. The purpose of the first telephone call is to determine whether a • Convenient spaces for note-taking (such person qualifies; a second telephone call as questions with check-boxes or lists of confirms their selection and clarifies convenient key issues to observe [3]). Note-taking and inconvenient times for scheduling their test; forms make it easy to record data for and a third call confirms the place and time. quick compilation to report initial results These phone calls should be prompt to avoid after the test. This technique also ensures leaving candidates wondering about their that key issues are noted consistently, status. If the exact schedule of the test sessions is making it possible to analyze trends uncertain, more phone calls are required to set without viewing videotapes. the stage and then make final confirmation. • Removable questionnaires (if used) for Schedule the sessions to allow some break the administrator to hand the participant. time between participants, and try to avoid Questionnaires elicit participant opinions, more than six hours of sessions per day. The an important element of most tests. team members’ fatigue can become a factor in • (If the test includes protocol analysis) a the test results. However, be prepared to offer handout or videotape instructing evening (and perhaps weekend) sessions to participants on how to “think aloud.” accommodate participants who cannot make time during the working day. • Other handouts for participants, providing task instructions, pictures of At least three working days before sessions objects to identify, or lists of items to rank begin, participants should receive a or rate. For some studies, participants confirmation package containing a letter simply receive blank paper for providing thanking them for agreeing to participate, listing requested information. the location and time of their session, providing 6
  7. 7. The detailed task descriptions from the test Create lists of what to supply for each session, design serve as the test designer’s outline for such as sufficient videotapes and audiotapes, as writing the test materials. Often one draft, well as checklists of how to reset the equipment reviewed by team members, and one revision and environment between participants for cycle are sufficient to prepare a draft for the dry smooth, error-free transitions. Ensure that run. (The initial revisions often catch awkward sufficient copies of all documents are available: script language as well as obvious logistical and one set of test materials per participant for the procedural errors.) administrator and one set for each observer; copies of background questionnaires, In addition to the materials used during the nondisclosure forms, and videotape release sessions, the following administrative materials forms; documentation for reference during the are recommended: session; and any other documents that are part • Participant background questionnaire. of the session. This questionnaire asks for the same Make sure the right people are planning to be information collected during oral available to assist throughout the sessions, screening, to verify participant starting with yourself: clear your calendar, and characteristics. It can also collect other tell your coworkers and colleagues that you will information of interest. be unavailable during the test sessions. If a • Nondisclosure forms and videotape colleague will be assisting you with observing release forms—check with your legal and note-taking, get that person’s time department about these. If you will be commitment in advance. Make sure a technical videotaping participants’ faces, the resource person is available throughout the videotape release form is more complex. sessions in case of product difficulties. Make Plan on at least two more cycles of sure someone, preferably the original screener, revisions to the materials-after the dry is responsible for scheduling backup run, and after each pilot test-before actual participants in case of no-shows. sessions begin, and space the dry-run and Participant stress can affect how well the test pilot-test sessions to allow for this data will generalize to actual users [6]. Hospitality revision time. arrangements are an important component of reducing participant stress. Be a good host: make SETTING UP THE TEST ENVIRONMENT sure someone is there to greet participants on arrival, offer them a variety of refreshments, and Whether a specially equipped usability lab or (when the session is finished) escort them out. an office or conference room converted for temporary use, the environment should be Prevent interruptions during sessions by customized for the usability test sessions. In posting “Usability Session in Progress” signs addition to the necessary equipment, power and disabling the telephone in the usability test connections, and telephone access, the room. Circulate a session schedule in advance to environment should provide a comfortable encourage observers, and label the observation working space for the participant. Here are room. The observation room should provide some guidelines. sufficient space to accommodate the expected number of observers. • Allow sufficient working surface for any documents or paperwork the participant uses during the session. CONDUCTING THE TEST • Consider a separate room or area in Dry-run sessions and pilot-test sessions are which the participant fills in forms and often conducted a few days before actual testing relaxes, with enough space to hold begins. As mentioned earlier, these pre-tests refreshments. reduce the number of bugs, help verify test session length, and identify discrepancies • Furnish the space with high-quality between test activities, product functionality, furniture and decor to resemble the and documentation. The pre-tests also enable participant’s actual work environment [6]. the administrator to practice speaking from the 7
  8. 8. script, becoming more comfortable with the software back up without losing valuable language and modifying it as needed. data. Have technical staff available who can solve problems quickly, to avoid In testing software products, developers eliminating test activities to stay within usually make last-minute changes to the product the time limit, as well as to minimize the after each dry-run and pilot-test session. The participants’ anxiety. session administrator should personally run through the task sequences again before sessions As sessions are completed, recorded data for with external participants start, to make sure each participant accumulates: a filled-in these software changes do not have an background questionnaire, signed forms (if unexpected effect on the test. needed), one or more sets of annotated test materials reflecting administrator and observer Once real test sessions begin, the time for notes during each session, one or more labeled changes has ended. Participant experiences must videotapes (if used) and audiotapes, and be extremely consistent from one session to the perhaps computer-generated data. After each next, to provide a common foundation for the session, put all paper materials and tapes for data collected. each participant in a separate, labeled, clasped Several administrative practices can ensure envelope, and store electronic data in two places usable data from each test session, as well as for backup. ensure that a sufficient number of test sessions Between sessions, make sure all materials in take place: each envelope are labeled with the participant’s • Obtain the participants’ informed consent name, in case they become separated. Avoid to participate by having them sign a form removing more than one participant’s materials stating what will happen during the from an envelope at a time. usability test [5]. Make sure they know Developers or marketing staff frequently ask they can withdraw from the test at any for additional changes to the test design after time. observing the first few sessions. • After each participant fills in the Assess the value of having a smaller set of background questionnaire, compare that participant data for these changes versus having information with the information initially a larger set of participant data for the initially received during oral screening of agreed-on activities. participants. If you find discrepancies, decide whether they will invalidate the data you will collect. Generally they will COMPILING THE TEST RESULTS simply change your expectations. For If you have promised “oral quick results” to example, a participant who reported more the product developers, introduce them with computer software experience during ample caveats about their reliability if you’ve screening than on the background had to work quickly to generate them. Divide questionnaire will shift to your the results into short-term issues and long-term “inexperienced” category but will still issues. Always follow up with a carefully provide valuable data. prepared written report, however brief, to • Plan on at least one “no-show” for every 5 record the results and correct any or 6 participants scheduled. If the number misstatements. The decision of who analyzes the of no-shows leaves too few participants results, presents them, and writes the report for the desired amount of data, it may be depends on team member skills and availability. necessary to extend the sessions another An increasingly popular method of reporting day or two to schedule backup results is to create a “highlights videotape.” The participants. time investment depends on the presentation • In testing software products, prototype or method chosen-a compendium of revealing preliminary software can often crash experiences, organized by participant, is faster during the test activities, so try to learn to create than a tape organized by issue with the proper procedure for bringing the participant clips as supporting evidence. 8
  9. 9. Regardless of the method chosen, observers can REFERENCES flag interesting participant actions and [1] S. Rosenbaum, “Selecting the appropriate comments on their session notes to expedite subjects for documentation usability videotape preparation. testing.” Proceedings of the Third International Creating a detailed test report to convey final Conference on Human-Computer Interaction, results can increase the investment in a usability Boston, MA, 1989, vol. I. project by another 20-25% over the investment in test planning, preparation, and [2] S. Rosenbaum, “Chapter 3: Selecting administration, but it provides a level of appropriate participants for documentation information not otherwise available. The final usability testing.” Practical Approaches to report, which often takes three to four weeks to Usability Testing for Technical Documentation, complete, provides a comprehensive record of “ Chris Velotta (Ed.). Society for Technical what happened during the test. It explains the Communication, in press. reasons for specific elements of the test design and describes participant activity in more depth, [3] J. Ramey, “Usability testing of the providing details that may be difficult to documentation and user interface of reconstruct from a quick-results report and computer systems,” seminar sponsored by people’s memories as time goes on. It also the Dept. of Communication, College of makes possible the analysis of design decisions Engineering, University of Washington, and usability test results over a product’s life March 1994. cycle. [4] R. Virzi, “Refining the test phase of usability A final test report that includes key quotations evaluation: how many subjects is enough?”, from participants can often influence Human Factors, vol. 34, no. 4, August 1992. product-development decisions more readily than tables of summary data. (Developers who [5] J. S. Dumas, and J. C. Redish, A Practical watch all of the sessions would hear the Guide to Usability Testing, Norwood, NJ: participant comments, but developers rarely Ablex Publishing, 1993. watch more than one or two sessions.) In addition, the time you spend with the data [6] J. R. Schrier, “Reducing stress associated while preparing the final report gives you a with participating in a usability test,” Proc. chance to discover patterns not immediately Human Factors and Ergonomics Society 36th discernible initially. These issues are often Annual Meeting. 1992, pp. 1210-1214. secondary, but they might suggest areas for further exploration. Laurie Kantner is a technical writer and usability specialist with Tec-Ed, Inc., in Ann CONCLUSION Arbor, MI. She manages usability projects for computer-based products and documentation, The nature of usability tests-with tight from laboratory testing to group interviews to schedules, moving targets, and human field studies. interactions-invites the unexpected. The focus of this paper has been to help usability specialists identify where the unexpected is likely to occur, and plan alternative ways to cope with these changes. Designating project management as a separate responsibility increases the likelihood that usability test goals will be met and usability tests will become an integral part of product development. 9