SlideShare a Scribd company logo
Erfaringer med Remote Usability Testing?


Jan Stage

Professor, PhD
Forskningsleder i Informationssystemer (IS)/Human-Computer Interaction (HCI)
Aalborg Universitet, Institut for Datalogi, HCI-Lab

jans@cs.aau.dk
Oversigt
       • Undersøgelse 1
       • Undersøgelse 2




Institut for Datalogi     2
Oversigt
       • Undersøgelse 1: synkron eller asynkron
               •        Metode
               •        Resultater
               •        Konklusion
       • Undersøgelse 2




Institut for Datalogi                             3
Empirical Study 1
       Four methods: LAB – RS – AE – AU
       Test subjects: 6 in each condition (18 users and 6 with usability expertise), all
          students at Aalborg University
       System: Email client (Mozilla Thunderbird 1.5)
       9 defined tasks (typical email functions)
       Setting, procedure and data collection in accordance with method
       Data analysis: 24 outputs were analysed by three persons in random and
          different order
       Generated their individual lists of usability problems with their own
         categorizations (also for the AE and AU conditions)
       These were merged into an overall problem list through negotiation
Institut for Datalogi                                                                      4
Results: Task Completion
                              No significant difference in task completion
                              Significant difference in task completion
                                 time
                              The users in the two asynchronous
                                 conditions spent considerably more
                                 time
                              We do not know the reason




Institut for Datalogi                                                        5
Results: Usability Problems Identified




       A total of 46 usability problems
       No significant difference between LAB and RS
       AE/AU identified significantly fewer problems, also critical problems
       No significant difference between AE and AU in terms of problems identified


Institut for Datalogi                                                                6
Conclusion
       RS is the most widely described and used remote method. The performance
          is virtually equivalent to LAB (or slightly better)
       AE and AU perform surprisingly well
       Experts do not perform significantly better than users
       Video analysis (LAB and RS) required considerably more evaluator effort than
          the user-based reporting (AU and AE)
       Users can actually contribute to usability evaluation – not with the same
          quality, but reasonably well, and there are plenty of them




Institut for Datalogi                                                                 7
Oversigt
       • Undersøgelse 1
       • Undersøgelse 2: hvilken asynkron metode
               •        Metode
               •        Resultater
               •        Konklusion




Institut for Datalogi                              8
Empirical Study 2
       Purpose: examine and compare remote asynchronous methods
       Focus on usability problems identified
       Comparable with the previous study
       Selection of asynchronous methods based on literature survey




Institut for Datalogi                                                 9
The 3 Remote Asynchronous Methods
       User-reported critical incident (UCI)
               •    Well-defined method (Castillo et al. CHI 1998)
       Forum-based online reporting and discussion (Forum)
               •    Assumption: through collaboration participants may give input which increases data
                    quality and richness (Thompson, 1999)
               •    A source for collecting qualitative data in a study of auto logging (Millen, 1999): the
                    participants turned out to report detailed usability feedback
       Diary-based longitudinal user reporting (Diary)
               •    Used on a longitudinal basis for participants in a study of auto logging to provide
                    qualitative information (Steves et al. CSCW 2001)
               •    First day: same tasks as the other conditions (first part of diary delivered)
               •    Four more days: new tasks (same type) sent daily (complete diary delivered)
       Conventional user-based laboratory test (Lab)
               •    Included as benchmark


Institut for Datalogi                                                                                         10
Empirical Study (1)
       Participants:
               • 40 test subjects, 10 for each condition
               • Students, age 20 to 30
               • Distributed evenly: gender and tech/non-tech education
       Setting:
               • LAB: in our usability lab
               • Remote asynchronous: in the participants’ homes
       Participants in the remote asynchronous conditions received the software and
          installed it on their computer
       Training material for the remote asynchronous conditions
               • Identification and categorisation of usability problems
               • A minimalist approach that was strictly remote and asynchronous (via email)



Institut for Datalogi                                                                          11
Empirical Study (2)
       Tasks:
               • Nine fixed tasks
               • The same across the four conditions to ensure that all participants used the
                 same parts of the system
               • Typical email tasks (same as previous study)
       Data collection in accordance with the method
               •    LAB: video recordings
               •    UCI: web-based system for generating problem descriptions while solving tasks
               •    Forum: after solving tasks, one week for posting and discussing problems
               •    Diary: a diary with no imposed structure; first part after the first day




Institut for Datalogi                                                                               12
Data Analysis
       All data collected before the data analysis started
       3 evaluators did the whole data analysis
       The 40 data sets were analysed by the 3 evaluators
               • In random order: by a draw
               • In different order between them
       The user input from the three remote conditions was transformed into
          usability problem descriptions
       Each evaluator generated his/her own individual lists of usability problems with
          their own severity ratings
               • A problem list for each condition
               • A complete problem list (joined)
       These were merged into an overall problem list through negotiation

Institut for Datalogi                                                                     13
Results: Task Completion Time
       Considerable variation in task completion times




       Participants in the remote conditions worked in their home at a time they
          selected
       For each task there was a hint that allowed them to check if they had solved
          the task correctly
       As we have no data on the task solving process in the remote conditions, we
          cannot explain this variation
Institut for Datalogi                                                                 14
Results: Usability Problems Identified
                                        Lab               UCI           Forum             Diary
                                        N=10              N=10           N=10             N=10
         Task completion time in                                                      Tasks 1-9:
         minutes: Average (SD)     24.24 (6.3)      34.45 (14.33)     15.45 (5.83)   32.57 (28.34)

         Usability problems:       #           %     #           %     #       %      #           %
         Critical (21)             20          95    10          48    9       43    11       52
         Serious (17)              14          82    2           12    1       6      6       35
         Cosmetic (24)             12          50    1           4     5       21    12       50
         Total (62)                46          74    13          21   15       24    29       47



       LAB: significantly better than the 3 remote conditions
       UCI-Forum: no significant difference
       UCI-Diary: significant overall: Diary – also significant on cosmetic
       Forum-Diary: significant overall: Diary – not significant on any level

Institut for Datalogi                                                                                 15
Results: Evaluator Effort
                                 Lab      UCI     Forum   Diary
                                   (46)    (13)    (15)     (29)
         Preparation               6:00    2:40    2:40     2:40
         Conducting test          10:00    1:00    1:00     1:30
         Analysis                 33:18    2:52    3:56     9:38
         Merging problem lists    11:45    1:41    1:42     4:58
         Total time spent         61:03    8:13    9:18    18:46
         Avg. time per problem     1:20    0:38    0:37     0:39


       The sum for all evaluators involved in each activity
       Time for finding test subjects is not included (8h, common for all)
       Task specifications from an earlier study. Preparation in the remote
          conditions: work out written instructions
       Considerable differences between the remote conditions for analysis and
         merging of problem lists
Institut for Datalogi                                                            16
Conclusion
       The three remote methods performed significantly below the classical lab test
          in terms of the number of usability problems identified
       The Diary was the best remote method – it identified half of the problems
          found in the Lab condition
       UCI and Forum performed similarly for critical problems but worse for
         serious problems
       UCI and Forum took 13% of the lab test. Diary took 30%
       The productivity of the remote methods was considerably higher




Institut for Datalogi                                                                  17
Institut for Datalogi   18
Interaktionsdesign og usability-evaluering
       Master i IT
       Videreuddannelse under IT-Vest
       Fagpakke i Interaktionsdesign og usability-evaluering starter 1/2-12
       Optager bachelorer, men også indgang for datamatikere
       Information: http://www.master-it-vest.dk/




Institut for Datalogi                                                         19

More Related Content

Similar to Erfaringer med Remote Usability Testing af Jan Stage, AAU

Thesis Defense_Karpinsky
Thesis Defense_KarpinskyThesis Defense_Karpinsky
Thesis Defense_Karpinsky
Nicole D. Karpinsky-Mosley
 
Differences in-task-descriptions
Differences in-task-descriptionsDifferences in-task-descriptions
Differences in-task-descriptions
Sameer Chavan
 
WUD Slovakia 2015: Experiment v UX class / Prof. Ing. Mária Bieliková, PhD.
WUD Slovakia 2015: Experiment v UX class / Prof. Ing. Mária Bieliková, PhD.WUD Slovakia 2015: Experiment v UX class / Prof. Ing. Mária Bieliková, PhD.
WUD Slovakia 2015: Experiment v UX class / Prof. Ing. Mária Bieliková, PhD.
suxask
 
What Metrics Matter?
What Metrics Matter? What Metrics Matter?
What Metrics Matter?
CS, NcState
 
Workflow Reuse in Practice: A Study of Neuroimaging Pipeline Users
Workflow Reuse in Practice: A Study of Neuroimaging Pipeline UsersWorkflow Reuse in Practice: A Study of Neuroimaging Pipeline Users
Workflow Reuse in Practice: A Study of Neuroimaging Pipeline Users
dgarijo
 
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
Vincenzo Lomonaco
 
Towards Task Analysis Tool Support
Towards Task Analysis Tool SupportTowards Task Analysis Tool Support
Towards Task Analysis Tool Support
Suzanne Kieffer
 
Usability evaluation methods (part 2) and performance metrics
Usability evaluation methods (part 2) and performance metricsUsability evaluation methods (part 2) and performance metrics
Usability evaluation methods (part 2) and performance metrics
Andres Baravalle
 
Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.
Lionel Briand
 
Lecture-2 Applied ML .pptx
Lecture-2 Applied ML .pptxLecture-2 Applied ML .pptx
Lecture-2 Applied ML .pptx
ZainULABIDIN496386
 
ISM2014
ISM2014ISM2014
ISM2014
nlab_utokyo
 
A personal journey towards more reproducible networking research
A personal journey towards more reproducible networking researchA personal journey towards more reproducible networking research
A personal journey towards more reproducible networking research
Olivier Bonaventure
 
Investigating teachers' understanding of IMS Learning Design: Yes they can!
Investigating teachers' understanding of IMS Learning Design: Yes they can!Investigating teachers' understanding of IMS Learning Design: Yes they can!
Investigating teachers' understanding of IMS Learning Design: Yes they can!
Michael Derntl
 
Vision Based Analysis on Trajectories of Notes Representing Ideas Toward Work...
Vision Based Analysis on Trajectories of Notes Representing Ideas Toward Work...Vision Based Analysis on Trajectories of Notes Representing Ideas Toward Work...
Vision Based Analysis on Trajectories of Notes Representing Ideas Toward Work...
Yuji Oyamada
 
Evaluating the User Experience of Virtual Learning Environments Using Biometr...
Evaluating the User Experience of Virtual Learning Environments Using Biometr...Evaluating the User Experience of Virtual Learning Environments Using Biometr...
Evaluating the User Experience of Virtual Learning Environments Using Biometr...
Renée Schulz
 
DeepXplore: Automated Whitebox Testing of Deep Learning Systems
DeepXplore: Automated Whitebox Testing of Deep Learning Systems DeepXplore: Automated Whitebox Testing of Deep Learning Systems
DeepXplore: Automated Whitebox Testing of Deep Learning Systems
Jeongwhan Choi
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Anita de Waard
 
Controlled Experiments - Shengdong Zhao
Controlled Experiments - Shengdong ZhaoControlled Experiments - Shengdong Zhao
Controlled Experiments - Shengdong Zhao
Michael Shilman
 
Usability testing through the decades
Usability testing through the decadesUsability testing through the decades
Usability testing through the decades
UX Firm, LLC
 
18.02.05_IAAI2018_Mobille Network Failure Event Detection and Forecasting wit...
18.02.05_IAAI2018_Mobille Network Failure Event Detection and Forecasting wit...18.02.05_IAAI2018_Mobille Network Failure Event Detection and Forecasting wit...
18.02.05_IAAI2018_Mobille Network Failure Event Detection and Forecasting wit...
LINE Corp.
 

Similar to Erfaringer med Remote Usability Testing af Jan Stage, AAU (20)

Thesis Defense_Karpinsky
Thesis Defense_KarpinskyThesis Defense_Karpinsky
Thesis Defense_Karpinsky
 
Differences in-task-descriptions
Differences in-task-descriptionsDifferences in-task-descriptions
Differences in-task-descriptions
 
WUD Slovakia 2015: Experiment v UX class / Prof. Ing. Mária Bieliková, PhD.
WUD Slovakia 2015: Experiment v UX class / Prof. Ing. Mária Bieliková, PhD.WUD Slovakia 2015: Experiment v UX class / Prof. Ing. Mária Bieliková, PhD.
WUD Slovakia 2015: Experiment v UX class / Prof. Ing. Mária Bieliková, PhD.
 
What Metrics Matter?
What Metrics Matter? What Metrics Matter?
What Metrics Matter?
 
Workflow Reuse in Practice: A Study of Neuroimaging Pipeline Users
Workflow Reuse in Practice: A Study of Neuroimaging Pipeline UsersWorkflow Reuse in Practice: A Study of Neuroimaging Pipeline Users
Workflow Reuse in Practice: A Study of Neuroimaging Pipeline Users
 
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
2023-08-22 CoLLAs Tutorial - Beyond CIL.pdf
 
Towards Task Analysis Tool Support
Towards Task Analysis Tool SupportTowards Task Analysis Tool Support
Towards Task Analysis Tool Support
 
Usability evaluation methods (part 2) and performance metrics
Usability evaluation methods (part 2) and performance metricsUsability evaluation methods (part 2) and performance metrics
Usability evaluation methods (part 2) and performance metrics
 
Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.
 
Lecture-2 Applied ML .pptx
Lecture-2 Applied ML .pptxLecture-2 Applied ML .pptx
Lecture-2 Applied ML .pptx
 
ISM2014
ISM2014ISM2014
ISM2014
 
A personal journey towards more reproducible networking research
A personal journey towards more reproducible networking researchA personal journey towards more reproducible networking research
A personal journey towards more reproducible networking research
 
Investigating teachers' understanding of IMS Learning Design: Yes they can!
Investigating teachers' understanding of IMS Learning Design: Yes they can!Investigating teachers' understanding of IMS Learning Design: Yes they can!
Investigating teachers' understanding of IMS Learning Design: Yes they can!
 
Vision Based Analysis on Trajectories of Notes Representing Ideas Toward Work...
Vision Based Analysis on Trajectories of Notes Representing Ideas Toward Work...Vision Based Analysis on Trajectories of Notes Representing Ideas Toward Work...
Vision Based Analysis on Trajectories of Notes Representing Ideas Toward Work...
 
Evaluating the User Experience of Virtual Learning Environments Using Biometr...
Evaluating the User Experience of Virtual Learning Environments Using Biometr...Evaluating the User Experience of Virtual Learning Environments Using Biometr...
Evaluating the User Experience of Virtual Learning Environments Using Biometr...
 
DeepXplore: Automated Whitebox Testing of Deep Learning Systems
DeepXplore: Automated Whitebox Testing of Deep Learning Systems DeepXplore: Automated Whitebox Testing of Deep Learning Systems
DeepXplore: Automated Whitebox Testing of Deep Learning Systems
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
 
Controlled Experiments - Shengdong Zhao
Controlled Experiments - Shengdong ZhaoControlled Experiments - Shengdong Zhao
Controlled Experiments - Shengdong Zhao
 
Usability testing through the decades
Usability testing through the decadesUsability testing through the decades
Usability testing through the decades
 
18.02.05_IAAI2018_Mobille Network Failure Event Detection and Forecasting wit...
18.02.05_IAAI2018_Mobille Network Failure Event Detection and Forecasting wit...18.02.05_IAAI2018_Mobille Network Failure Event Detection and Forecasting wit...
18.02.05_IAAI2018_Mobille Network Failure Event Detection and Forecasting wit...
 

More from InfinIT - Innovationsnetværket for it

Erfaringer med-c kurt-noermark
Erfaringer med-c kurt-noermarkErfaringer med-c kurt-noermark
Erfaringer med-c kurt-noermark
InfinIT - Innovationsnetværket for it
 
Object orientering, test driven development og c
Object orientering, test driven development og cObject orientering, test driven development og c
Object orientering, test driven development og c
InfinIT - Innovationsnetværket for it
 
Embedded softwaredevelopment hcs
Embedded softwaredevelopment hcsEmbedded softwaredevelopment hcs
Embedded softwaredevelopment hcs
InfinIT - Innovationsnetværket for it
 
C og c++-jens lund jensen
C og c++-jens lund jensenC og c++-jens lund jensen
C og c++-jens lund jensen
InfinIT - Innovationsnetværket for it
 
201811xx foredrag c_cpp
201811xx foredrag c_cpp201811xx foredrag c_cpp
C som-programmeringssprog-bt
C som-programmeringssprog-btC som-programmeringssprog-bt
C som-programmeringssprog-bt
InfinIT - Innovationsnetværket for it
 
Infinit seminar 060918
Infinit seminar 060918Infinit seminar 060918
DCR solutions
DCR solutionsDCR solutions
Not your grandfathers BPM
Not your grandfathers BPMNot your grandfathers BPM
Not your grandfathers BPM
InfinIT - Innovationsnetværket for it
 
Kmd workzone - an evolutionary approach to revolution
Kmd workzone - an evolutionary approach to revolutionKmd workzone - an evolutionary approach to revolution
Kmd workzone - an evolutionary approach to revolution
InfinIT - Innovationsnetværket for it
 
EcoKnow - oplæg
EcoKnow - oplægEcoKnow - oplæg
Martin Wickins Chatbots i fronten
Martin Wickins Chatbots i frontenMartin Wickins Chatbots i fronten
Martin Wickins Chatbots i fronten
InfinIT - Innovationsnetværket for it
 
Marie Fenger ai kundeservice
Marie Fenger ai kundeserviceMarie Fenger ai kundeservice
Marie Fenger ai kundeservice
InfinIT - Innovationsnetværket for it
 
Mads Kaysen SupWiz
Mads Kaysen SupWizMads Kaysen SupWiz
Leif Howalt NNIT Service Support Center
Leif Howalt NNIT Service Support CenterLeif Howalt NNIT Service Support Center
Leif Howalt NNIT Service Support Center
InfinIT - Innovationsnetværket for it
 
Jan Neerbek NLP og Chatbots
Jan Neerbek NLP og ChatbotsJan Neerbek NLP og Chatbots
Jan Neerbek NLP og Chatbots
InfinIT - Innovationsnetværket for it
 
Anders Soegaard NLP for Customer Support
Anders Soegaard NLP for Customer SupportAnders Soegaard NLP for Customer Support
Anders Soegaard NLP for Customer Support
InfinIT - Innovationsnetværket for it
 
Stephen Alstrup infinit august 2018
Stephen Alstrup infinit august 2018Stephen Alstrup infinit august 2018
Stephen Alstrup infinit august 2018
InfinIT - Innovationsnetværket for it
 
Innovation og værdiskabelse i it-projekter
Innovation og værdiskabelse i it-projekterInnovation og værdiskabelse i it-projekter
Innovation og værdiskabelse i it-projekter
InfinIT - Innovationsnetværket for it
 
Rokoko infin it presentation
Rokoko infin it presentation Rokoko infin it presentation
Rokoko infin it presentation
InfinIT - Innovationsnetværket for it
 

More from InfinIT - Innovationsnetværket for it (20)

Erfaringer med-c kurt-noermark
Erfaringer med-c kurt-noermarkErfaringer med-c kurt-noermark
Erfaringer med-c kurt-noermark
 
Object orientering, test driven development og c
Object orientering, test driven development og cObject orientering, test driven development og c
Object orientering, test driven development og c
 
Embedded softwaredevelopment hcs
Embedded softwaredevelopment hcsEmbedded softwaredevelopment hcs
Embedded softwaredevelopment hcs
 
C og c++-jens lund jensen
C og c++-jens lund jensenC og c++-jens lund jensen
C og c++-jens lund jensen
 
201811xx foredrag c_cpp
201811xx foredrag c_cpp201811xx foredrag c_cpp
201811xx foredrag c_cpp
 
C som-programmeringssprog-bt
C som-programmeringssprog-btC som-programmeringssprog-bt
C som-programmeringssprog-bt
 
Infinit seminar 060918
Infinit seminar 060918Infinit seminar 060918
Infinit seminar 060918
 
DCR solutions
DCR solutionsDCR solutions
DCR solutions
 
Not your grandfathers BPM
Not your grandfathers BPMNot your grandfathers BPM
Not your grandfathers BPM
 
Kmd workzone - an evolutionary approach to revolution
Kmd workzone - an evolutionary approach to revolutionKmd workzone - an evolutionary approach to revolution
Kmd workzone - an evolutionary approach to revolution
 
EcoKnow - oplæg
EcoKnow - oplægEcoKnow - oplæg
EcoKnow - oplæg
 
Martin Wickins Chatbots i fronten
Martin Wickins Chatbots i frontenMartin Wickins Chatbots i fronten
Martin Wickins Chatbots i fronten
 
Marie Fenger ai kundeservice
Marie Fenger ai kundeserviceMarie Fenger ai kundeservice
Marie Fenger ai kundeservice
 
Mads Kaysen SupWiz
Mads Kaysen SupWizMads Kaysen SupWiz
Mads Kaysen SupWiz
 
Leif Howalt NNIT Service Support Center
Leif Howalt NNIT Service Support CenterLeif Howalt NNIT Service Support Center
Leif Howalt NNIT Service Support Center
 
Jan Neerbek NLP og Chatbots
Jan Neerbek NLP og ChatbotsJan Neerbek NLP og Chatbots
Jan Neerbek NLP og Chatbots
 
Anders Soegaard NLP for Customer Support
Anders Soegaard NLP for Customer SupportAnders Soegaard NLP for Customer Support
Anders Soegaard NLP for Customer Support
 
Stephen Alstrup infinit august 2018
Stephen Alstrup infinit august 2018Stephen Alstrup infinit august 2018
Stephen Alstrup infinit august 2018
 
Innovation og værdiskabelse i it-projekter
Innovation og værdiskabelse i it-projekterInnovation og værdiskabelse i it-projekter
Innovation og værdiskabelse i it-projekter
 
Rokoko infin it presentation
Rokoko infin it presentation Rokoko infin it presentation
Rokoko infin it presentation
 

Recently uploaded

UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
Mariano Tinti
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
Webinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data WarehouseWebinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data Warehouse
Federico Razzoli
 

Recently uploaded (20)

UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Webinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data WarehouseWebinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data Warehouse
 

Erfaringer med Remote Usability Testing af Jan Stage, AAU

  • 1. Erfaringer med Remote Usability Testing? Jan Stage Professor, PhD Forskningsleder i Informationssystemer (IS)/Human-Computer Interaction (HCI) Aalborg Universitet, Institut for Datalogi, HCI-Lab jans@cs.aau.dk
  • 2. Oversigt • Undersøgelse 1 • Undersøgelse 2 Institut for Datalogi 2
  • 3. Oversigt • Undersøgelse 1: synkron eller asynkron • Metode • Resultater • Konklusion • Undersøgelse 2 Institut for Datalogi 3
  • 4. Empirical Study 1 Four methods: LAB – RS – AE – AU Test subjects: 6 in each condition (18 users and 6 with usability expertise), all students at Aalborg University System: Email client (Mozilla Thunderbird 1.5) 9 defined tasks (typical email functions) Setting, procedure and data collection in accordance with method Data analysis: 24 outputs were analysed by three persons in random and different order Generated their individual lists of usability problems with their own categorizations (also for the AE and AU conditions) These were merged into an overall problem list through negotiation Institut for Datalogi 4
  • 5. Results: Task Completion No significant difference in task completion Significant difference in task completion time The users in the two asynchronous conditions spent considerably more time We do not know the reason Institut for Datalogi 5
  • 6. Results: Usability Problems Identified A total of 46 usability problems No significant difference between LAB and RS AE/AU identified significantly fewer problems, also critical problems No significant difference between AE and AU in terms of problems identified Institut for Datalogi 6
  • 7. Conclusion RS is the most widely described and used remote method. The performance is virtually equivalent to LAB (or slightly better) AE and AU perform surprisingly well Experts do not perform significantly better than users Video analysis (LAB and RS) required considerably more evaluator effort than the user-based reporting (AU and AE) Users can actually contribute to usability evaluation – not with the same quality, but reasonably well, and there are plenty of them Institut for Datalogi 7
  • 8. Oversigt • Undersøgelse 1 • Undersøgelse 2: hvilken asynkron metode • Metode • Resultater • Konklusion Institut for Datalogi 8
  • 9. Empirical Study 2 Purpose: examine and compare remote asynchronous methods Focus on usability problems identified Comparable with the previous study Selection of asynchronous methods based on literature survey Institut for Datalogi 9
  • 10. The 3 Remote Asynchronous Methods User-reported critical incident (UCI) • Well-defined method (Castillo et al. CHI 1998) Forum-based online reporting and discussion (Forum) • Assumption: through collaboration participants may give input which increases data quality and richness (Thompson, 1999) • A source for collecting qualitative data in a study of auto logging (Millen, 1999): the participants turned out to report detailed usability feedback Diary-based longitudinal user reporting (Diary) • Used on a longitudinal basis for participants in a study of auto logging to provide qualitative information (Steves et al. CSCW 2001) • First day: same tasks as the other conditions (first part of diary delivered) • Four more days: new tasks (same type) sent daily (complete diary delivered) Conventional user-based laboratory test (Lab) • Included as benchmark Institut for Datalogi 10
  • 11. Empirical Study (1) Participants: • 40 test subjects, 10 for each condition • Students, age 20 to 30 • Distributed evenly: gender and tech/non-tech education Setting: • LAB: in our usability lab • Remote asynchronous: in the participants’ homes Participants in the remote asynchronous conditions received the software and installed it on their computer Training material for the remote asynchronous conditions • Identification and categorisation of usability problems • A minimalist approach that was strictly remote and asynchronous (via email) Institut for Datalogi 11
  • 12. Empirical Study (2) Tasks: • Nine fixed tasks • The same across the four conditions to ensure that all participants used the same parts of the system • Typical email tasks (same as previous study) Data collection in accordance with the method • LAB: video recordings • UCI: web-based system for generating problem descriptions while solving tasks • Forum: after solving tasks, one week for posting and discussing problems • Diary: a diary with no imposed structure; first part after the first day Institut for Datalogi 12
  • 13. Data Analysis All data collected before the data analysis started 3 evaluators did the whole data analysis The 40 data sets were analysed by the 3 evaluators • In random order: by a draw • In different order between them The user input from the three remote conditions was transformed into usability problem descriptions Each evaluator generated his/her own individual lists of usability problems with their own severity ratings • A problem list for each condition • A complete problem list (joined) These were merged into an overall problem list through negotiation Institut for Datalogi 13
  • 14. Results: Task Completion Time Considerable variation in task completion times Participants in the remote conditions worked in their home at a time they selected For each task there was a hint that allowed them to check if they had solved the task correctly As we have no data on the task solving process in the remote conditions, we cannot explain this variation Institut for Datalogi 14
  • 15. Results: Usability Problems Identified Lab UCI Forum Diary N=10 N=10 N=10 N=10 Task completion time in Tasks 1-9: minutes: Average (SD) 24.24 (6.3) 34.45 (14.33) 15.45 (5.83) 32.57 (28.34) Usability problems: # % # % # % # % Critical (21) 20 95 10 48 9 43 11 52 Serious (17) 14 82 2 12 1 6 6 35 Cosmetic (24) 12 50 1 4 5 21 12 50 Total (62) 46 74 13 21 15 24 29 47 LAB: significantly better than the 3 remote conditions UCI-Forum: no significant difference UCI-Diary: significant overall: Diary – also significant on cosmetic Forum-Diary: significant overall: Diary – not significant on any level Institut for Datalogi 15
  • 16. Results: Evaluator Effort Lab UCI Forum Diary (46) (13) (15) (29) Preparation 6:00 2:40 2:40 2:40 Conducting test 10:00 1:00 1:00 1:30 Analysis 33:18 2:52 3:56 9:38 Merging problem lists 11:45 1:41 1:42 4:58 Total time spent 61:03 8:13 9:18 18:46 Avg. time per problem 1:20 0:38 0:37 0:39 The sum for all evaluators involved in each activity Time for finding test subjects is not included (8h, common for all) Task specifications from an earlier study. Preparation in the remote conditions: work out written instructions Considerable differences between the remote conditions for analysis and merging of problem lists Institut for Datalogi 16
  • 17. Conclusion The three remote methods performed significantly below the classical lab test in terms of the number of usability problems identified The Diary was the best remote method – it identified half of the problems found in the Lab condition UCI and Forum performed similarly for critical problems but worse for serious problems UCI and Forum took 13% of the lab test. Diary took 30% The productivity of the remote methods was considerably higher Institut for Datalogi 17
  • 19. Interaktionsdesign og usability-evaluering Master i IT Videreuddannelse under IT-Vest Fagpakke i Interaktionsdesign og usability-evaluering starter 1/2-12 Optager bachelorer, men også indgang for datamatikere Information: http://www.master-it-vest.dk/ Institut for Datalogi 19