
User Experiments in Human-Computer Interaction


This lecture covers the basics of user experiment design in human-computer interaction. Computer scientists and developers often create interfaces for a particular purpose. This lecture explains how a user experiment can be designed and conducted to systematically compare one interface with another.

Published in: Technology


  1. 1. LECTURE 5: USER EXPERIMENTS IN HCI COMP 4026 – Advanced HCI Semester 5 - 2017 Arindam Dey University of South Australia
  2. 2. OVERVIEW •  Why do we need user experiments? •  How to design a user experiment? •  Activity •  How to run a user experiment? •  Ethical considerations
  3. 3. Testing your idea/design/prototype with real users of the application
  4. 4. You (designer / developer) ≠ User Because you •  know your system well •  have special skills •  know what you are measuring
  5. 5. Who should your users be in the study? Sample must be a true representation of the population Everyone who may use your product Participants in your study
  6. 6. What do users do and say? To what extent do they do it? Why do they do it, and how can we fix it?
  7. 7. Categories of usability tests based on goals •  Formative - Beginning of and during the product development phase - Usability problems and fixes •  Summative - Towards the end of the development phase - Statistically measured usability
  8. 8. Categories of usability tests based on data collected •  Qualitative - Descriptions (verbally or behaviorally) - Directly measured - Takes more effort to analyze - Mostly earlier in the design phase •  Quantitative - Measurements (numbers) - Indirectly measured - Later in the design phase
  9. 9. User Experiments •  A method of academic research in HCI - To discover/test/prove new knowledge •  Hypothesis driven - Compares multiple conditions to discover causal relationships •  Replicable (generalizable) - Strives to remove bias and error (random assignment) •  Draw conclusions with statistical tests of the hypothesis
  10. 10. Usability Testing vs. User Experiments •  The methods can be the same •  The goals are often different •  Usability testing goals - Identify usability problems & issues of a product •  User experiment goals - Answer research questions, discovering new knowledge (generalizable results)
  11. 11. Usability Testing vs. User Experiments
      Usability Testing | User Experiment
      Improve products | Discover knowledge
      Few participants | Many participants
      Results inform design | Results validated statistically
      Usually not completely replicable - case-specific results | Must be replicable - generalizable results
      Condition(s) controlled as much as possible | Strongly controlled conditions
      Procedure planned | Experimental design
      Results reported to product designer / developer | Scientific report to scientific community
  12. 12. Designing User Experiments •  Hypothesis (research question) •  Experimental task •  Independent variables (IV) •  Dependent variables (DV) •  Subjective •  Objective •  Other variables •  Random, controlled, confounding •  Experimental designs •  Within-subjects, between-subjects, mixed-factorial
  13. 13. Hypothesis •  A prediction of the outcome - Based on research question but narrower - A research question can be tested in multiple hypotheses - Causal relationship between IV and DV - A precise statement that can be directly tested through an experiment e.g. Condition A will be faster than Condition B
  14. 14. Hypothesis •  Null hypothesis (H0) - Predicts there is no effect of IV on DV - Statistical tests accept/reject null hypothesis •  Alternative hypothesis (HA) - Predicts there is an effect of IV on DV •  H0 and HA are mutually exclusive
  15. 15. Hypothesis Testing Statistical tests (next lecture) are subject to Type I errors (rejecting a true H0 - a false positive) and Type II errors (failing to reject a false H0 - a false negative)
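These error types can be made concrete with a small simulation. The sketch below (hypothetical data, standard library only) runs many simulated experiments in which H0 is actually true - both conditions sample the same population - and counts how often a simple two-sample z-test falsely rejects it; that Type I rate should land near the chosen α of 0.05.

```python
import random
import statistics
from statistics import NormalDist

def z_test_p(a, b):
    """Two-sided, two-sample z-test p-value (normal approximation,
    reasonable for samples of 30+ per group)."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

rng = random.Random(1)
ALPHA, TRIALS = 0.05, 2000

# H0 is true here: both "conditions" draw from the same population,
# so every rejection is a Type I error (false positive).
false_rejections = 0
for _ in range(TRIALS):
    a = [rng.gauss(10.0, 2.0) for _ in range(40)]  # condition A task times
    b = [rng.gauss(10.0, 2.0) for _ in range(40)]  # condition B, same population
    if z_test_p(a, b) < ALPHA:
        false_rejections += 1

type1_rate = false_rejections / TRIALS
print(f"Type I error rate ~ {type1_rate:.3f}")  # should be close to ALPHA
```

A Type II error is the mirror image: give the two groups genuinely different means and count how often the test fails to reject H0.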
  16. 16. Experimental Task •  A task that participants will do in a study under different conditions e.g. in Fitts' Law studies participants click on buttons using different input devices •  Must be suitable to the application - depends on the research question •  Ideally risk-free
  17. 17. Independent Variables (IV) •  Variables that are independent of participant's behaviour •  Systematically manipulated by the experimenter •  Variables that experimenter is interested in •  There can be one or more IVs in an experiment
  18. 18. Typical Independent Variables •  Technology (controlled) - Types of technology, device, interface, design •  User - Physical/mental/social status - age, gender, computer experience, professional domain, education, culture, motivation, mood, and disabilities •  Context of use - Environmental status (physical/social) - Lighting, noise, indoor/outdoor, public/ private
  19. 19. Independent Variables vs. Experimental Conditions Conditions = combinations of the levels of all IVs (e.g. one IV with 2 levels and another with 3 levels give 2 x 3 = 6 conditions)
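The cross-product of IV levels can be enumerated mechanically. A minimal sketch with hypothetical IVs (the factor names and levels are invented for illustration):

```python
from itertools import product

# Hypothetical IVs: an interface factor (2 levels) x a lighting factor (3 levels).
ivs = {
    "interface": ["A", "B"],
    "lighting": ["dim", "normal", "bright"],
}

# Every combination of levels is one experimental condition.
conditions = list(product(*ivs.values()))
print(len(conditions), "conditions:", conditions)  # 2 x 3 = 6
```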
  20. 20. Dependent Variables (DV) •  The outcome or effect that the researchers are interested in •  Dependent on participants’ behavior or the changes in the IVs •  Usually the outcomes that the researchers need to measure - measurements or observations
  21. 21. Dependent Variables (DV) •  Subjective - Based on users’ opinions, interpretations, points of view, emotions and judgment - More vulnerable to context and users’ status - e.g. questionnaires, NASA TLX •  Objective - Not influenced by personal feeling/opinion - Based on observation, compared against standardized scale - More consistent - e.g. time, error
  22. 22. Typical Dependent Variables •  Efficiency - e.g. task completion time, speed •  Accuracy - e.g. error, success rate •  Subjective satisfaction - e.g. Likert scale ratings •  Ease of learning - e.g. test score, learning curve, retention rate •  Physical or cognitive demand - e.g. NASA task load index (TLX)
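As a toy illustration of objective DVs, the snippet below aggregates completion time (efficiency) and error counts (accuracy) per condition from a hypothetical trial log - participant IDs and all numbers are invented:

```python
from statistics import mean

# Hypothetical trial log: (participant, condition, completion time in s, errors)
trials = [
    ("P01", "A", 12.3, 0), ("P01", "B", 15.1, 1),
    ("P02", "A", 11.8, 1), ("P02", "B", 16.4, 0),
    ("P03", "A", 13.0, 0), ("P03", "B", 14.7, 2),
]

# Aggregate two objective DVs per condition.
for cond in ("A", "B"):
    times = [t for _, c, t, _ in trials if c == cond]
    errors = [e for _, c, _, e in trials if c == cond]
    print(f"{cond}: mean time {mean(times):.1f}s, {sum(errors)} errors")
```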
  23. 23. Other Variables •  Controlled Variables - Set to not change during an experiment - The more controlled - the more internal validity, but less generalizable •  Random Variables - The more influence of random variable - the less internal validity •  Confounding Variables - Variables that researchers failed to control - damages internal validity
  24. 24. Validity of User Experiments •  Internal Validity - approximate truth about inferences regarding cause-effect or causal relationships - not relevant for observational studies - higher under strictly controlled lab conditions •  External Validity - the extent to which the conclusions of the experiment are generalizable - three types: population, environmental, and temporal
  25. 25. Experimental Designs •  Within-subjects - Each subject performs under all the different conditions - Repeated-measure •  Between-subjects - Each subject is assigned to one experimental condition - Independent samples - Matched groups •  Mixed-factorial - Combination of the two - More than one IV needed
  26. 26. Experimental Designs Condition A Participant 1 Participant 2 . . . Participant 10 Condition B Participant 1 Participant 2 . . . Participant 10 Condition A Participant 1 Participant 2 . . . Participant 10 Condition B Participant 11 Participant 12 . . . Participant 20 Within-subjects Between-subjects
  27. 27. Within-Subjects vs. Between-Subjects
      Within-subjects | Between-subjects
      Learning effect | Avoids interference effects (e.g. practice / learning effect)
      Longer time for each participant (larger impact of fatigue and frustration) | Shorter time for each participant (less fatigue and frustration)
      Individual differences can be isolated | Impact of individual differences
      Easier to detect differences between conditions | Harder to detect differences between conditions
      Requires smaller sample size | Requires larger sample size
      Counterbalance/randomize the order of presenting conditions | Randomized assignment to conditions or matched groups
  28. 28. Randomization •  Critical condition of a true experiment •  The random assignment of treatments to the experimental units or participants •  No one, including the experimenters, can control the assignments •  Main way to minimize the effects of random variables
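A minimal sketch of between-subjects random assignment: shuffle the participant pool and split it evenly across conditions. The seed is fixed only so the assignment can be audited and reproduced; participant names, conditions, and group sizes are invented for illustration.

```python
import random

# Hypothetical pool of 20 participants and two conditions.
participants = [f"P{i:02d}" for i in range(1, 21)]
rng = random.Random(42)  # fixed seed so the assignment is reproducible/auditable

shuffled = participants[:]
rng.shuffle(shuffled)

# Between-subjects design: split the shuffled pool evenly.
half = len(shuffled) // 2
assignment = {"A": shuffled[:half], "B": shuffled[half:]}
for condition, group in sorted(assignment.items()):
    print(condition, group)
```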
  29. 29. Counterbalancing •  All possible permutations - 3 conditions => 3P3 = 6 permutations - (1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1) - 4 conditions => 4P4 = 24 permutations - (1,2,3,4), (1,2,4,3), (1,3,2,4), (1,3,4,2), … •  Number of participants must be a multiple of the number of permutations
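The orderings above can be generated directly, e.g. with Python's itertools, which also makes the "multiple of the number of permutations" constraint easy to check:

```python
from itertools import permutations
from math import factorial

conditions = [1, 2, 3]
orders = list(permutations(conditions))  # 3P3 = 6 orderings
print(len(orders), orders)

assert len(orders) == factorial(len(conditions))

# Full counterbalancing requires the participant count to be a
# multiple of the number of orderings: 6, 12, 18, ... here.
n_participants = 12
assert n_participants % len(orders) == 0
```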
  30. 30. Balanced Latin Square •  Latin Square - Each item occurs once in each row and column •  Balanced Latin Square - Each item both precedes and follows each other item an equal number of times
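One common construction for a balanced Latin square (with an even number of conditions) weaves i, i+1, i-1, i+2, i-2, ... modulo n for each starting row i. A sketch:

```python
def balanced_latin_square(n):
    """Balanced Latin square for n conditions (n even): each condition
    appears once per row and column, and each ordered pair of conditions
    appears equally often in adjacent positions."""
    rows = []
    for i in range(n):
        row, up, down = [], 0, 0
        for step in range(n):
            if step % 2 == 0:
                row.append((i + up) % n)   # step forward: i, i+1, i+2, ...
                up += 1
            else:
                down += 1
                row.append((i - down) % n)  # step backward: i-1, i-2, ...
        rows.append(row)
    return rows

for order in balanced_latin_square(4):
    print(order)  # one presentation order per row
```

For an odd number of conditions, a common workaround is to use this square plus the mirror image of each row, giving 2n presentation orders.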
  31. 31. Participants / Subjects •  The sample in your experiment •  Number of participants - Between-subjects design: 15~20 per condition - Within-subjects design: 15~20 in total - The smaller the effect size, the more participants needed - The more variance between users, the more participants needed - The more conditions in the experiment, the more participants needed
  32. 32. Power Analysis •  You can calculate the ideal number of participants you have to test •  Parameters needed: - α: the probability of rejecting H0 given that H0 is true (usually set to 0.05) - β: the probability of missing a real effect; power = 1-β is the probability of observing a difference when it really exists (usually set to 0.8) - Effect size: difference of means divided by std. dev. •  Free program for power analysis: G*Power
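The kind of calculation G*Power performs can be approximated by hand. The sketch below uses the normal approximation for a two-sided, two-sample comparison of means; it slightly underestimates the n that exact t-distribution-based tools report, so treat it as a ballpark, not a substitute.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sided, two-sample
    comparison of means (normal approximation to the t-test)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_beta = z(power)           # ~0.84 for power = 0.80
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Medium effect size (Cohen's d = 0.5) with the defaults above:
print(sample_size_per_group(0.5), "participants per group")
```

Note how strongly the answer depends on effect size: a large effect (d = 0.8) needs far fewer participants per group than a small one (d = 0.2).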
  33. 33. Errors •  Random Errors - Also called ‘chance errors’ or ‘noises’ - Cause variations in both direction - Can be controlled by a large sample size + randomization •  Systematic Errors - Also called ‘biases’ - Always push actual value in the same direction - No matter how large the sample is, cannot be offset unless the source of error is controlled
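The difference between the two error types is easy to see in a simulation (all numbers invented): averaging more samples shrinks random noise toward zero, but a constant bias survives any sample size.

```python
import random

rng = random.Random(0)
TRUE_VALUE = 10.0

def mean_measurement(n, bias=0.0):
    """Average of n simulated readings: true value + optional constant
    bias (systematic error) + Gaussian noise (random error)."""
    return sum(TRUE_VALUE + bias + rng.gauss(0, 1.0) for _ in range(n)) / n

# Random error shrinks as the sample grows...
print(abs(mean_measurement(10) - TRUE_VALUE))
print(abs(mean_measurement(100_000) - TRUE_VALUE))
# ...but a systematic bias does not, no matter how large n is.
print(abs(mean_measurement(100_000, bias=0.5) - TRUE_VALUE))
```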
  34. 34. Errors •  Five major sources - measurement instruments - experimental procedures - sampling participants - experimenter behavior - experimental environment
  35. 35. After Designing the Study •  Write down the design - hypotheses - task - IVs and DVs - design of the experiment - participants - randomization / counterbalancing - data collection •  Critically review your own design •  Ask others to review your design
  36. 36. Activity Fill out the template with your study design You have designed a new application to quickly resize photos on mobile phones (Condition A). There are several alternative solutions available in the market; pick any one of them (Condition B). Design a user experiment to compare Condition A and Condition B.
  37. 37. Running User Experiments • Experimental Procedure • Pilot Study • Main Study
  38. 38. Typical Experimental Session (1/2) •  Ensure the apparatus are ready - Both the system under test and measurement devices - Test-run - Make sure forms, questionnaires etc. are printed,… •  Greet the participants •  Introduce the purpose of the study and the procedures (experimenter script) •  Get the consent of the participants •  Assign the participants to a specific experiment condition according to the pre-defined randomization method
  39. 39. Typical Experimental Session (2/2) •  Participants complete training task •  Participants complete experimental tasks •  Participants answer questionnaires (if any) •  If within-subjects design - change conditions and repeat above •  Debriefing session - Collect details through interview •  Compensation (always give some token of appreciation)
  40. 40. Pilot Study •  A small trial run of the main study - Can identify the majority of issues with both the prototype and the experimental design •  Pilot testing checks: - that the experimental plan is viable - that you can conduct the procedure - that your prototype and measurement instruments work appropriately - the experimental task and environment •  Iron out problems before doing the main experiment •  This is not optional
  41. 41. As an Experimenter •  Offload your Brain! - Write down instructions and important information - Prepare checklists - Print questionnaires and documents in advance •  Take notes, document oddities - Create templates •  Rehearse procedures - Do you need assistants? •  Nothing is as bad as lost data - AVOID!!! - Collect ASAP, Backup
  42. 42. Research Ethics • Consent • Respect • Privacy
  43. 43. Consent •  Participant has the right to know - The experimental procedure - What kind of data is collected - Risks involved - How the data will be stored and presented •  Experimenter must - Explain the experiment in detail - Ask participant to sign a consent form
  44. 44. Respect Participants •  They are volunteers and should be allowed to - Take a rest (between conditions) - Leave the experiment anytime without reasoning - Given a token of appreciation (gift, money etc.) - Take time to organize (but don’t waste their time) “ Do unto others as you would have them do unto you.” - MATTHEW 7:12
  45. 45. Privacy •  Never disclose their identifiable data to anyone without written consent •  Data must be stored in secure locations - Digitally and physically •  Don’t use identifiable data, images, or videos in reports or publications
  46. 46. Limitations • No data collection method will be perfect - control problems - available technical equipment • Differences - Multiple researchers - Multiple methods - Multiple measures - Objective vs. Subjective - Qualitative vs. Quantitative
  47. 47. Limitations • A single study cannot tell us everything - Important to make sure it’s replicable • One paper ≠ scientific truth - Different researchers, different methods, all coming to the same conclusion, that’s when you find consensus • Science is not static - Theories evolve and change over time