
Usability Testing


My usability testing talk from Web Directions East

Published in: Design, Technology, Education

  1. 1. Guerrilla Usability Testing by Andy Budd
  2. 2. Buying a train ticket in the UK
  3. 3. Buying a train ticket in Hong Kong
  4. 4. Buying a train ticket in Japan
  5. 5. Early user interfaces mirrored the internal workings Early interfaces mirrored the internal workings of the system. You needed to understand how the system worked in order to operate it. This wasn’t a problem as the systems were usually built from kits by the users, so they had prior knowledge.
  6. 6. Early Desktop Interface However as computers got more powerful, it became increasingly difficult for the average person to understand how they worked. To make computers useful to the wider population, their operation needed to be abstracted from the way they worked. By using the “desktop” metaphor, it was easier for people to understand how to operate a computer and what to do with it. We look back at these old pictures and they seem crude, but at the time they were groundbreaking.
  7. 7. Computers now more powerful than moon landing Modern computers now have more power than was used on the first moon landing. However, they are no longer operated by rocket scientists.
  8. 8. That’s your mum, that is! They are being operated by people like your mum.
  9. 9. Frustration We live in a world of a million little frustrations. They may seem small, but they all add up over the day. We don’t want the one small issue on our site sending somebody postal. Our job as user experience designers is to remove as many of those barriers as possible.
  10. 10. Technology shouldn’t make us feel stupid When things don’t conform to our expectations we get frustrated. Either we curse the system for not being easier to use, or we curse ourselves for not being smart enough. Neither of these is particularly desirable. Whatever you do, please don’t make users feel stupid. If you’re starting a discussion by saying “oh, it’s just the stupid users” then it’s probably your fault.
  11. 11. People don’t want to learn UNIX to update their iPod As technology becomes more powerful and ubiquitous, people expect to be shielded from the inner workings of a system. Most people don’t want to know how UNIX works in order to download music to their iPod or upload pictures to Flickr.
  12. 12. People no longer read instructions In fact, with so much technology in our lives, users no longer have the time or inclination to read the instructions any more. When was the last time you bought a new TV or DVD player and read the manual from cover to cover before you turned it on? I think the last time I did this I was about 9.
  13. 13. If you can program a VCR you can program a DVD This is because we’ve become adept at learning new technologies by basing our assumptions on past experiences. If the video recorder worked one way, there is a good chance the DVD player works in a similar way. So when creating new systems, we need to be careful not to break commonly held idioms. [Talk about typical web conventions/patterns]
  14. 14. We learn new technology by exploring More importantly we learn new technologies by exploring: by pressing buttons and seeing what they do. As long as something “affords” pressing and there is little indication that doing so will damage the system, we’ll explore. When I got my new TV recently, rather than reading the manual, I went straight to the menu and started clicking around to see what I could do. Similarly when I buy a new bit of software, one of the first things I’ll do is click on menu items and check out the preferences pane to build a mental idea of what the application can do.
  15. 15. Modern life places great demands on our attention The modern world places great demands on our attention. We’ll be walking down the street listening to a podcast, dodging cars and pedestrians while trying to send a text message. Or maybe we’re sat in our offices listening to music, monitoring our emails and IM messages while flicking between different browser tabs as the pages load. It is very rare that we’ll be devoting 100% of our cognitive ability to a single given task. In fact, this behaviour has been given a name: continuous partial attention.
  16. 16. We’ve become adept at multitasking We’ve become a race of multi-taskers: time and attention poor. This means the technology we use needs to be as simple and straightforward as possible. It needs to become transparent. To get out of our way, letting us accomplish our goals with the minimum of effort or interruption.
  17. 17. We think people use our sites like this (screenshot showing a linear progression) When designing a site we assume that the visitors know what they want and know where to get it. That they understand the site structure and will take time to learn how the site works. We assume that they will read every bit of copy in a linear fashion, weigh up all the options and then select the most likely fit.
  18. 18. However they actually use them like this In reality, users’ attention will dart around the page looking for the first thing that jumps out at them. Because we bombard them with pointless offers, they have become adept at ignoring anything that looks like an advert or marketing copy. We think people use our sites like a game of chess. However, most of the time it’s more like a game of snap. I’ve moderated countless sessions where users have said something like ‘I would have expected there to be X’ where X was right under their mouse pointer. They just hadn’t noticed. It would be easy at that stage to blame the ‘stupid user’, but in fact it’s the fault of the ‘stupid designer’ for not having a better understanding of basic user behaviour.
  19. 19. Test on real users So the only way we can hope to build better products is to see how people are using the things we build, find out the problems, and then make them better. Otherwise, it’s just guesswork.
  20. 20. Usability Testing Halo 3 This is exactly what Microsoft did with Halo 3. Halo 1 was a big success, but Halo 2 was a bit of a flop. New weapons made it too easy for advanced players and reduced the amount of hand-to-hand combat. The single-player game was very short and the multi-player game didn’t add much that was new or interesting. Microsoft analysed more than 3,000 hours of gameplay across 600 test participants over two and a half years, in possibly one of the biggest usability studies ever undertaken. Is it any wonder that Halo 3 sold $300m in the first week alone? They did classic usability testing using a testing suite with two-way mirrors and video feeds. They set tasks and got the users to think aloud. They did multi-user simultaneous testing (MUST) with up to 40 people at a time, as well as more traditional testing. Rather than looking for usability problems per se, they were looking for things that resulted in user frustration, like accidentally killing yourself by driving off a cliff or getting hit by rebounding cannon fire.
  21. 21. Aiming Found on one of the earlier levels that people kept running out of ammo and being killed, despite having loads of spare clips on that level. Turned out that people were trying to kill the baddies when they were too far away, running out of ammo and getting slaughtered. The fix was to change the colour of the targeting ring when baddies were in range. Hidden grenades.
  22. 22. Getting Lost On the first campaign people were led through a jungle level with lush vegetation and limited views. They would get lost and end up backtracking. With no more baddies to kill this got very boring and frustrating. The designers added several small cliffs to prevent backtracking and the problem was solved.
  23. 23. Grunts of Death On this slide the dots represent where people were killed and the colours represent what killed them. Purple dots are bosses, red dots are suicides and tan dots are grunts. It turns out that there was a bug in the system which made grunts, the lowest and easiest baddies to kill, almost indestructible.
  24. 24. Usability testing for the web Of course usability testing isn’t just limited to the games industry.
  25. 25. Formal Testing Screenshot of delicious usability testing Formal testing tends to be carried out by large, specialist companies with dedicated test suites. These suites often have one-way mirrors, live audio and video feeds, and notice boards for collaborative note taking and discussion. This setup is great for getting the whole team together to watch the tests. They will use a professional facilitator/HCI person and generally test on a larger sample for more statistical rigour. Tests tend to be more quantitative than qualitative and the output is usually in the form of a written report to the site owners or developers. Pros: • Will recruit test subjects for you • Facilitators less likely to bias results • Comfortable facilities • The whole team can be present • Quantitative data likely to have more statistical validity • Professionally written report or presentation Cons: • Time-consuming to set up, run and analyse results • Expensive, so often only run once at the end of the project • Facilitators only have shallow knowledge of the project • Focus on quantitative over qualitative reduces insight • Smaller test samples can invalidate quantitative results • Most projects don’t require the level of statistical rigour • Easy for the results to be ‘thrown over the fence’ and forgotten
  26. 26. Informal (Guerrilla) Testing Screenshot of a guerrilla test You don’t actually need much equipment to run a usability test. Just a computer to run the test, somebody to moderate and somebody to take notes. Instead of a note taker, I prefer to set up a video recorder pointed at the screen. This captures the mouse movements as well as the audio, so you don’t need to take notes. Alternatively you could have two videos, one recording the screen and one recording the person. Or a video and some screen capture software. However, you then need to composite both feeds and this can get complicated. Guerrilla testing tends to be carried out anywhere possible, from an empty meeting room to a local cafe. It’s usually (although not necessarily) carried out on a one-to-one basis by a member of the design team. Instead of focusing on quantitative data, these tests usually focus on qualitative data for design insight rather than statistical validity. The results are usually fed back into the design process. As such, they tend to be formative rather than summative. Pros: • Very quick and easy to perform • Relatively inexpensive, so you can afford multiple tests • Qualitative nature can provide improved design insight • Results can be fed back into the design process almost immediately with no loss of signal Cons: • Have to recruit subjects, arrange location and write tests yourself • Familiarity with project may introduce personal bias • Less statistical rigour on quantitative results
  27. 27. So what is usability testing anyway? Usability testing generally involves setting a series of tasks for people to complete and noting any problems they encounter – It’s as simple as that!
  28. 28. Think aloud protocol As the name suggests, you set a series of tasks and ask the subject to verbalise their thoughts and feelings by ‘thinking aloud’. The object of this is to gain valuable insight into the thought processes behind the user’s actions. Think aloud testing (introspection) is usually the type of testing we perform as it’s the easiest to get insight out of. However, be aware that by asking users to ‘think aloud’ you’re changing their normal pattern of behaviour. For instance, ‘time on task’ calculations may no longer be valid. That being said, you’d be amazed by how tolerant people are of this type of test and their willingness to suspend disbelief and play along. Be aware that thinking aloud may make people more conscious of, and therefore more attentive to, what they are doing. Ironically, it may also distract them from what they are actually doing. Subjects often go quiet when they get stuck. The other option is retrospective introspection, where you let people do a task and then get them to explain what happened after the fact.
  29. 29. Recruiting Test Subjects • Client contacts • Via the client website ( • Recruitment agency • Market research agency • Twitter • Local notice boards and mailing lists • Other offices/departments • Classified ads e.g. Gumtree
  30. 30. How many test subjects are enough? There is no easy answer here. Several people, including Jakob Nielsen, have suggested that testing on 5 users will uncover 80% of the problems on your site that have a high likelihood of detection. This is based on the binomial probability distribution. After 5, you get diminishing returns. However, these must be five of the same type of user. In my experience, 6 is a good number. You can run them all in a single day and you’re still OK if somebody drops out. Laura Faulkner of the University of Texas at Austin did an analysis of the results of a test using 60 subjects. Groups of 5 test subjects did find an average of 85% of the total issues found. However, the percentage of problems found by any one sample set ranged from 55% to 100%. Groups of 10 found 95% of problems with a minimum of 82%, groups of 15 found 97% with a minimum of 90%, and groups of 20 found 98% with a minimum of 95%. If you want to discover the majority of issues, 10-20 participants is ideal.
  31. 31. Select your tasks (5-6 tasks, 45 min max) You can’t test everything, so determine which are the most important tasks to test. Test the most common tasks, the most complicated tasks and tasks you think may have usability issues. Create a believable flow of tasks, e.g. register > complete profile > upload photo etc. Start with simpler tasks and slowly increase complexity. Base tasks on your user scenarios. Select around 5-6 tasks that should take approximately 45 min. Run through the tasks to ensure they are achievable. No point setting tasks you know will fail!
  32. 32. Good and Bad Tasks Bad: Search for a bookcase. Good: You have 200+ books in your fiction collection, currently in boxes strewn around your living room. Find a way to organize them. Don’t load tasks with the solution to the design. It’s much harder than you think. Quite often users will pick up on a couple of words in your task and simply look for those in the interface. Instead, create realistic scenarios that people can emotionally connect with. The above task is good, but maybe a bit woolly. Good tasks should have a specific and unambiguous goal, so both you and the subject know when it is completed.
  33. 33. Review and implement ASAP • Review the videos while the test is still fresh in your head • Capture the data in a spreadsheet • Patterns will start to emerge • Note the problems and your recommended fixes • Summarise your findings in a report or presentation (shoe example) • Compile a short ‘highlights’ video • Make the changes ASAP
  34. 34. Thanks
  35. 35. Questions? Repeat Questions!
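
The binomial reasoning behind the five-user figure on slide 30 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a calculation taken from the talk: it assumes every problem is hit independently by each test subject with the same probability, and the value 0.31 for that probability is an assumption here (it is the figure often quoted alongside Nielsen's five-user claim). The names `share_found` and `ASSUMED_P` are made up for this sketch.

```python
def share_found(p: float, n: int) -> float:
    """Expected share of problems seen at least once across n test subjects.

    Assumes each subject independently runs into any given problem
    with the same probability p (a simplification).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must not be negative")
    # A problem is missed only if all n subjects miss it: (1 - p) ** n
    return 1.0 - (1.0 - p) ** n


# Assumed per-subject detection probability; not a number from the talk.
ASSUMED_P = 0.31

if __name__ == "__main__":
    for n in (1, 3, 5, 6, 10, 15, 20):
        print(f"{n:>2} subjects: {share_found(ASSUMED_P, n):.0%} of problems expected")
```

With the assumed probability, five subjects come out at about 84%, and each extra subject adds less than the one before, which is the diminishing return the slide describes. The result is an average only. Faulkner's ranges on the same slide show how far a single group of five can fall below it.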