
Comparison_of_UX_Evaluation_Techniques_CA2_N00147768



Assignment Cover Sheet
MSc in User Experience Design
Student Name: Stephen Norman
Student Number: N00147768
Programme: MSc UX Design
Year of Programme: 2015/2016
Module Name: User Research and Usability
Assignment: Comparison of UX Evaluation Techniques
Assignment Deadline: 14/02/2015
I declare that this submission is my own work. Where I have read, consulted and used the work of others I have acknowledged this in the text.
Signature: Stephen Norman
Date: 14/02/2016
Table of Contents
1. Introduction
2. Evaluation Methods
2.1. Usability Testing
2.1.1. Case Study: “Find it if you can: Usability Case Study of Search Engines for Young Users”
2.1.2. Case Study Review
2.2. UX Curve
2.2.1. Case Study: “Comparing the Effectiveness of Electronic Diary and UX Curve Methods in Multi-Component Product Study”
2.2.2. Case Study Review
2.3. Web Surveys
2.3.1. Case Study: “Approaches to Cross-Cultural Design: Two Case Studies with UX Web-Surveys”
2.3.2. Case Study Review
3. Comparison of Evaluation Methods
4. Conclusion
5. References
6. Bibliography
1. Introduction
There are numerous user experience evaluation methods currently in use. Some 84 evaluation methods are currently listed¹, a reduction since the publication by Vermeeren et al. (2010) in which 96 methods were referenced. This paper examines three methods: Usability Testing, UX Curve and Web Surveys. Each will be discussed and its effectiveness demonstrated through various case studies. Furthermore, their performance will be examined and critiqued, and examples of their multi-functional roles introduced and discussed. This is followed by a comparison of the real-world feasibility of each method, with possible future improvements outlined in the conclusion.

2. Evaluation Methods
In this chapter three evaluation methods are introduced: Usability Testing, UX Curve and Web Surveys. Each method is described and analysed through a case study review.

2.1. Usability Testing
Usability testing is a behavioural study in which as few as five users can maximise the outcome (Nielsen, 2012). Participants are set tasks while observers watch, listen and take notes (Usability.gov, 2016). It is an effective method for gathering both quantitative and qualitative information, and is also ideal for examining attitudinal and behavioural dimensions (Rohrer, 2014). These tests are cost effective, requiring no formal laboratory (Usability.gov, 2016): any room with portable recording devices will be sufficient, or the testing can be performed remotely. Controlled settings also eliminate factors that can alter human behaviour, such as location, time of day, season or temperature (Trivedi, 2012). Remote testing is conducted in one of two ways: moderated or unmoderated (Schade, 2013). Moderated testing involves two-way communication between the participant and facilitator, allowing additional information to be gathered. Unmoderated testing is done solely by the user, who completes predefined tasks without a facilitator; such studies lack real-time questioning and support (Schade, 2013). Because usability testing is done mostly in controlled environments, Monahan et al. (2008) argue that these studies lack context. However, this also depends on the type of application being tested.

¹ http://www.allaboutux.org/all-methods
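The five-user guideline cited above (Nielsen, 2012) rests on a simple problem-discovery model: the proportion of usability problems found by n test users is 1 − (1 − L)^n, where L is the probability that a single user uncovers a given problem (Nielsen uses L ≈ 0.31). As an illustrative sketch, not drawn from any of the case studies in this paper:

```python
# Nielsen's problem-discovery model: proportion of usability problems
# found by n test users, where L is the chance one user hits a problem.
def problems_found(n: int, L: float = 0.31) -> float:
    return 1 - (1 - L) ** n

if __name__ == "__main__":
    for n in (1, 3, 5, 15):
        print(f"{n:2d} users: {problems_found(n):.0%} of problems found")
```

With L = 0.31 the model predicts that five users uncover about 84% of problems, which is why adding further participants to a single test yields diminishing returns.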
2.1.1. Case Study: “Find it if you can: Usability Case Study of Search Engines for Young Users”
This study set out to assess seven English and five German search engines on their ability to match their interfaces to the abilities and skills of children. Interestingly, the study was conducted without the involvement of any children, which deviates from standard practice (Nielsen, 2012). Three main points were addressed: motor skills, cognitive skills and presentation of results. The motor skills research covered artefacts such as the mouse and keyboard, assessing everything from the handling of these devices to their accuracy on the interface. This included button sizes, clickable regions such as imagery, and alternate methods of providing results, such as the tangible figurines used in applications like TeddIR² proposed by Jansen et al. (2010). Cognitive abilities were studied through previous research on both children's understanding of general search and how they interacted with these interfaces. Children from age six to thirteen were in scope, as were two types of interface: browsing versus keyword-orientated. Browsing interfaces allow users to navigate and explore a set of predefined categories, as used in KidsClick³, whereas keyword-orientated interfaces, e.g. Google, require the user to type each query. The final assessment criteria focused on font size, number of results per page, use of imagery, and whether the search catered for semantics and spell checking.

2.1.2. Case Study Review
This was an untraditional usability evaluation in its reliance on existing research into children's web use. The authors acknowledged that further studies should be conducted with children to verify their research and enrich the results. With sufficient prior research, a good user model was created that allowed the researchers to conduct their own study of these interfaces, saving time and money. Furthermore, the chosen method was appropriate for producing the desired results, although further studies, such as contextual usability inquiries or EmoCards⁴, could be performed to gather richer qualitative data. Moreover, credit should be given to the paper's authors for their organisational skills. Exemplary effort went into the categorisation (Figure 1), which was applied to all criteria throughout the paper. Without this it would have been difficult to assess the search engines properly.

² An interface designed to help children retrieve books by placing tangible figurines on screen to represent search terms, in the hope of reducing errors in spelling and finding the correct query (Gossen et al., 2010).
³ http://www.kidsclick.org/
⁴ http://www.allaboutux.org/emocards
Figure 1 - Categorisation of search results by button size and page length.

2.2. UX Curve
UX Curve is a method in which participants are asked to sketch their retrospective experience of a product's use over time (Figure 2). It has been designed to better understand user emotions and experiences chronologically (Kujala et al., 2011a). Sketching is done on a template divided into two axes: the x-axis represents time, while the y-axis can represent any desired evaluation factor, e.g. satisfaction or dissatisfaction (Sahar, Varsaluoma & Kujala, 2014).

Figure 2 - (Left) A deteriorating and a stable curve. (Right) An improving ease-of-use curve.

When compared to a questionnaire, UX Curve has proven more effective at collecting hedonic aspects of use such as fun and pleasure (Kujala et al., 2011b). However, a later study concluded that long-term diary studies were more effective than UX Curve at collecting detailed information (Sahar et al., 2014). Given the length of that study, the results favoured the long-term diary study (LTDS), which recorded data more accurately, whereas the UX Curve evaluation required recollection because it was presented after the diary studies had concluded. Having to recall such information can lead to biases, argues Norman (2009): “Retrospective evaluations of long-term user experiences are based on memories of the user and they can be vulnerable to biases” (Kujala et al., 2011b). According to Vermeeren et al. (2010), it is one of the lesser-used methods because it is not cost effective and is impractical in product development contexts (Kujala et al., 2011b).

2.2.1. Case Study: “Comparing the Effectiveness of Electronic Diary and UX Curve Methods in Multi-Component Product Study”
This case study assessed the performance of both UX Curve and an LTDS as remote methods for collecting qualitative data (Sahar, Varsaluoma & Kujala, 2014). Twenty-five customers were recruited who had recently purchased a sports watch and were using it at least five times per week. This multi-product study, which included connected accessories such as a heart rate monitor, a speed sensor and a website, was conducted remotely over an eight-week period. Participants were asked to complete the electronic diary online up to twice a week; upon completion of the eight-week period they were sent four UX Curve templates, one for each of the components. The templates addressed the “Attractiveness” of the product: “We chose ‘attractiveness’ UX dimension because it represents overall appeal and non-instrumental qualities (aesthetics, symbolic and motivational aspects), although these were not specific to the users” (Sahar et al., 2014). The dimension was also chosen based on a previous study by Kujala et al. (2014).

2.2.2. Case Study Review
The results made clear that the LTDS was more effective than UX Curve at collecting in-depth information about each component, although getting good user response rates was a challenge over the study duration (Sahar et al., 2014). UX Curve took less time overall, from implementation to deployment and analysis, as it was conducted in one session with participants.
However, this study did not maximise the intended use of UX Curve: “UX Curve is intended to be used in a face to face setting where the researcher is better able to inquire into the participants’ reasoning and thoughts” (Kujala et al., 2011a). Instead, the templates were mailed to participants during the Sahar et al. (2014) study, eliminating some of the potential for qualitative data gathering. Although this limited the full use of UX Curve, the study did consider the existing research of Kujala et al. (2011a), who tested six UX Curve types (Figure 3) and identified “attractiveness” as the best performing template.
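As a hedged illustration of what analysing returned UX Curve templates can involve, the sketch below classifies a digitised curve as improving, stable or deteriorating from sampled points. The function name, thresholds and sample data are hypothetical, not taken from the Sahar et al. (2014) study:

```python
# Hypothetical sketch: classify a digitised UX Curve from sampled
# (time, rating) points, where rating is the drawn y-axis value.
def classify_curve(points):
    """points: list of (time, rating) tuples, ordered by time."""
    if len(points) < 2:
        return "stable"
    delta = points[-1][1] - points[0][1]   # net change from start to end
    if delta > 1:
        return "improving"
    if delta < -1:
        return "deteriorating"
    return "stable"

# e.g. an attractiveness curve sampled at weeks 0, 4 and 8
attractiveness = [(0, 3), (4, 5), (8, 7)]
print(classify_curve(attractiveness))   # -> improving
```

A real analysis would of course inspect the whole curve shape rather than only its endpoints, but even this toy version shows why converting sketches into a comparable digital form takes extra analysis time.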
Figure 3 - Different curve types used while testing a product.

2.3. Web Surveys
Web surveys are a commonly used method in the researcher's toolkit. They allow access to a larger audience thanks to the accessibility of the internet. Both Walsh (2012) and Vermeeren et al. (2010) agree that they are desirable due to their lightweight nature: speed of implementation and ease of use. These highly versatile studies can be used at any stage of the design process. In a recent project (Norman, 2016) a web survey was used during the exploratory phase to gauge people's attitudes towards, and use of, the An Post website before prototype conceptualisation. At the opposite end of the scale, web surveys were used as an LTDS in a study by Sahar et al. (2014). A challenge raised by both Walsh (2012) and Sahar et al. (2014) was the difficulty of keeping participants engaged for the duration, as users tended to drop out or not complete the survey; this should be considered when surveys are used in LTDS contexts. A further issue identified by Walsh (2012) is that researchers who formulate questions and hypotheses should consider their own cultural background, which may affect the research questions, their performance and participant interpretation if testing occurs across different regions and cultural backgrounds.

2.3.1. Case Study: “Approaches to Cross-Cultural Design: Two Case Studies with UX Web-Surveys”
This study assesses the use of web surveys in two different cases: one covers an online gaming site and the other an online sports diary. The objective for the online gaming site was to gain insights into how to design a good UX for new markets in the future (Walsh, 2012). The online sports diary evaluated customer usage over a period of three months. The sample sizes differed greatly: 11,238 participants of the gaming site were sent an invite email, with only 632 responding, while 17 were recruited for the online sports diary, of whom 7 dropped out during the evaluation. A more effective response was noted for the online sports diary, which screened for willing volunteers prior to the evaluation. Both surveys were sent internationally; however, translations had to be considered prior to survey deployment (Walsh, 2012).
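The contrast in engagement between the two cases can be made concrete with a quick calculation on the figures above, a sketch using only the numbers reported by Walsh (2012):

```python
# Response and retention rates for the two web-survey cases.
def percentage(part: int, total: int) -> float:
    return 100 * part / total

gaming_response = percentage(632, 11238)   # invited vs. responded
diary_retention = percentage(17 - 7, 17)   # recruited vs. completed

print(f"Gaming site response rate: {gaming_response:.1f}%")  # -> 5.6%
print(f"Sports diary retention:    {diary_retention:.1f}%")  # -> 58.8%
```

A roughly 5.6% response rate for the mass email against 58.8% retention for the pre-screened diary group supports the observation that screening for willing volunteers yields a more effective response.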
Therefore, the survey was created in both Swedish and Spanish, requiring the researchers to translate the responses into English for collection. For the sports diary, an invitation questionnaire was sent first, allowing the researchers to screen for English-speaking participants and to collect internet and device usage information.

2.3.2. Case Study Review
The researchers are believed to have followed best practice in both studies. However, in the diary study the survey could have been used in conjunction with UX Curve, which might have reflected more positive results with regard to customer satisfaction, based on the study by Sahar et al. (2014). Walsh (2012) experienced the same difficulties as Sahar et al. (2014) with participation levels dropping in long-term studies. However, the benefits of the long-term diary's format allowed for the collection of rich qualitative data in conjunction with context, which is an important cultural factor according to Gillham (2005). The gaming site drew on the research of Soley & Smith (2008), as their work appeared to prove that a “sentence completion survey” method is most effective across cultures. This research could also be improved by introducing an invitation questionnaire to recruit willing participants initially, opening the research to additional forms of questioning.

3. Comparison of Evaluation Methods
Short-term studies, such as Usability Testing and the web survey used on the gaming website (Section 2.3.1), are more cost effective, requiring less time to implement, run and analyse. However, short-term studies lack visibility of long-term user-experienced emotions, whereas an LTDS provides rich qualitative data during the evaluation because users usually provide feedback within the same day of use, while the information is fresh. UX Curve seems open to debate, as some researchers would argue that evaluating retrospectively during a long-term study can be open to biases (Norman, 2009). However, during their study Sahar et al. (2014) found that although UX Curve requires less implementation effort, it requires additional time for analysing and converting user sketches to digital formats. Although not in scope, iScale is worth studying further, as its application can certainly be improved: the original publication was issued in 2012, technology has improved since, and there is potential for this product to be every bit as intuitive as sketching on paper. If both projects were combined, perhaps a better UX evaluation could emerge.
4. Conclusion
With all evaluations there are definite challenges, from participation levels to the time required to implement, coordinate and evaluate the data. Given the facts, it is the opinion of the author that, with current trends and technology, web surveys of any form are the most effective at acquiring rich data. Although the set-up may take longer than other methods, the ability to access a database of users quickly and easily makes this a strong candidate for addressing the majority of business objectives, especially from a costing perspective. Interestingly, during the analysis of UX Curve the author had questioned the feasibility of a digital platform to address the same issue, as it would eliminate the time needed to convert sketches to a spreadsheet. Surprisingly enough, such an application has already been conceived. A more in-depth study is required to analyse the potential of merging UX Curve and iScale with current technology.

5. References
Desmet, P., Overbeeke, K., & Tax, S. (2001). Designing products with added emotional value: Development and application of an approach for research through design. The Design Journal, 4(1), 32-47.
Gossen, T., Hempel, J., & Nürnberger, A. (2013). Find it if you can: usability case study of search engines for young users. Personal and Ubiquitous Computing, 17(8), 1593-1603.
Gillham, R. (2005). Diary Studies as a Tool for Efficient Cross-Cultural Design. In IWIPS (pp. 57-65).
Kujala, S., Roto, V., Väänänen-Vainio-Mattila, K., Karapanos, E., & Sinnelä, A. (2011). UX Curve: A method for evaluating long-term user experience. Interacting with Computers, 23(5), 473-483.
Kujala, S., Roto, V., Väänänen-Vainio-Mattila, K., & Sinnelä, A. (2011, June). Identifying hedonic factors in long-term user experience. In Proceedings of the 2011 Conference on Designing Pleasurable Products and Interfaces (p. 17). ACM.
Nielsen, J. (2012). How Many Test Users in a Usability Study? Nngroup.com.
Retrieved 10 February 2016, from https://www.nngroup.com/articles/how-many-test-users/
Norman, S. (2016). Interaction Design Project - Anpost.ie (1st ed., pp. 3-4).
Monahan, K., Lahteenmaki, M., McDonald, S., & Cockton, G. (2008, September). An investigation into the use of field methods in the design and evaluation of interactive systems. In Proceedings of the 22nd British HCI Group Annual Conference on People and Computers: Culture, Creativity, Interaction - Volume 1 (pp. 99-108). British Computer Society.
Reijneveld, K., de Looze, M., Krause, F., & Desmet, P. (2003, June). Measuring the emotions elicited by office chairs. In Proceedings of the 2003 International Conference on Designing Pleasurable Products and Interfaces (pp. 6-10). ACM.
Rohrer, C. (2014). When to Use Which User-Experience Research Methods. Nngroup.com. Retrieved 7 January 2016, from https://www.nngroup.com/articles/which-ux-research-methods/
Sahar, F., Varsaluoma, J., & Kujala, S. (2014, November). Comparing the effectiveness of electronic diary and UX curve methods in multi-component product study. In Proceedings of the 18th International Academic MindTrek Conference: Media Business, Management, Content & Services (pp. 93-100). ACM.
Schade, A. (2013). Remote Usability Tests: Moderated and Unmoderated. Nngroup.com. Retrieved 10 February 2016, from https://www.nngroup.com/articles/remote-usability-tests/
Soley, L., & Smith, A. (2008). Projective techniques for social science and business research.
Usability.gov. (2016). Usability Testing. Retrieved 11 February 2016, from http://www.usability.gov/how-to-and-tools/methods/usability-testing.html
Vermeeren, A. P., Law, E. L. C., Roto, V., Obrist, M., Hoonhout, J., & Väänänen-Vainio-Mattila, K. (2010, October). User experience evaluation methods: current state and development needs. In Proceedings of the 6th Nordic Conference on Human-Computer Interaction: Extending Boundaries (pp. 521-530). ACM.
Walsh, T., & Nurkka, P. (2012, November). Approaches to cross-cultural design: two case studies with UX web-surveys. In Proceedings of the 24th Australian Computer-Human Interaction Conference (pp. 633-642). ACM.

6. Bibliography
Allaboutux.org. (2016). All UX evaluation methods « All About UX. Retrieved 11 February 2016, from http://www.allaboutux.org/all-methods
Karapanos, E., Martens, J. B., & Hassenzahl, M. (2012). Reconstructing experiences with iScale. International Journal of Human-Computer Studies, 70(11), 849-865.
Jansen, M., Bos, W., van der Vet, P., Huibers, T., & Hiemstra, D. (2010, June). TeddIR: tangible information retrieval for children. In Proceedings of the 9th International Conference on Interaction Design and Children (pp. 282-285). ACM.
Norman, D. A. (2009). THE WAY I SEE IT: Memory is more important than actuality. Interactions, 16(2), 24-26.
