This brief presentation describes a statistical approach to measuring the difference in user experience between two prototypes. In this case, a new interface with a revised taxonomy and a modified search tool was compared against the old search interface.
In this example I was comparing the time on task for two different prototypes, the old and the new. This was the first iteration, a proof of concept to lay the foundation for additional content classification and search across the enterprise.
Let’s determine which statistical measurement to use. There are different calculations for each variable type, i.e., the number of prototypes or interfaces, the type of participant set, and the number of participants.
First, we are comparing two prototypes: the old interface and the new interface. We are going to compare the means of the participants’ time on task. (Click)
Second, we are using the same group of eight participants to complete tasks on each of the two prototypes, so we have a paired sample. That is, the comparison is within the same group of participants, NOT ACROSS different groups. (Click)
Finally, we have fewer than 30 participants, so a paired samples t-test is the proper measurement. (Click)
The t-test is a hypothesis test that determines whether the means of two groups are statistically different from each other.
This t-test compares one set of measurements with a second set from the same sample. It is often used to compare “before” and “after” scores in experiments to determine whether significant change has occurred.
As few as three paired samples could be used, but the minimum is usually 4 or 5. My null hypothesis is FALSE, or in other words, there is a significant difference in the user experience of the prototypes.
Use Excel, SPSS, or other software. In Excel, I needed to add the Data Analysis add-in.
Hypothesis Confirmation - There is a difference between the two prototypes, and one is more satisfying to use.
State Certainty - 95% certainty.
Sample Size - Re-emphasize why a small sample number is acceptable. Power calculations, which are provided by many statistical software packages, can be used to predict the minimum sample size necessary for a reliable rejection of H0.
It may be necessary to ‘school’ the marketing person presenting to the client, to avoid their embarrassment when some misinformed person insists that only a large sample will do. To counter the misconception that LARGE numbers are needed, emphasize why a small number is still statistically significant with this technique, and why 95% certainty is reported.
Explain descriptive vs. inferential statistical measurement and formative vs. summative usability testing.
Reiterate the hypothesis and give examples of how this technique has been used in other similar circumstances; use a story-telling element. Give two examples from: linguistics, medical, usability.
Now the clients will object to the small sample size and say things like “that cannot be statistically significant” or “you need at least 30 or 400.” This is where “five is enough” comes from. Five is not enough when more independent variables are introduced. There are tables that indicate what sample sizes are needed under what conditions.
This objection mixes two concepts: representativeness and sample size. A representative sample means asking the people in your population of interest. It is more important to ask a few of the right people what they think than a lot of the wrong people. You need to identify the highest margin of error that you or your client (the survey sponsor) can endure.
Never alone in report - Other methods should accompany quantitative methods.
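To make the power calculation point concrete for a presenter, here is a minimal sketch of how a minimum number of paired participants could be estimated. It uses the common normal approximation rather than any particular software package's method, and the function name `min_pairs` and the `effect_size` input (expected mean difference divided by the standard deviation of the paired differences) are assumptions introduced for illustration.

```python
import math
from statistics import NormalDist


def min_pairs(effect_size, alpha=0.05, power=0.80):
    """Approximate minimum number of paired participants.

    Normal approximation for a two-tailed paired test. effect_size is an
    assumed input: expected mean difference / SD of the paired differences.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed risk factor
    z_power = NormalDist().inv_cdf(power)          # desired power
    return math.ceil(((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # A large assumed effect needs few pairs; a moderate one needs many more.
    print(min_pairs(1.0), min_pairs(0.5))
```

With an assumed effect size of 1.0 the sketch returns 8 pairs, and with 0.5 it returns 32. The normal approximation tends to run slightly low for very small samples, so a published power table or a statistics package remains the better source for the final figure.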
Statistical user experience measurement 1
Presentation Transcript
1.
User Experience Measurement Prototype Comparison Sharon Harper - Taxonomist & User Experience Consultant
2.
User Experience Measurement: Taxonomy Structure, Organizational, Navigational Difference in Prototype Designs
3.
Statistical Measurement Strategy
Compare Means - task completion time
3 or more prototypes | 2 prototypes
Across different participants (independent samples) | Within same group of participants (paired samples)
≥ 30 participants | < 30 participants
z-test | t-test
4.
T-test: Paired Two Samples for Means Comparison
Null Hypothesis. There is no difference in the usability (performance & satisfaction) of the new user interface and the previous interface. Neither the taxonomy integration, organization, nor navigation efforts have improved the user experience.
(New user interface) – (Earlier user interface) = 0
Measure. What difference in the values of the means of the two interfaces would disprove this hypothesis? The difference between the paired sample means from the same participants is taken relative to the spread, or variability, of their scores (time on successful task completion between prototype interfaces). This is the t value.
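The t value described on this slide can be written as the standard paired-samples formula. The symbols are notation introduced here, not taken from the slides: d_i is one participant's time on the new interface minus their time on the earlier interface, and n is the number of participants.

```latex
d_i = x_{\text{new},i} - x_{\text{old},i}, \qquad
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2}, \qquad
t = \frac{\bar{d}}{s_d / \sqrt{n}}
```

Under the null hypothesis the mean difference is 0, so a t value far from 0 means the observed difference is large relative to the variability of the scores.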
5.
T-test: Paired Two Samples for Means Comparison (cont.)
Compare this result, the t-value, against the critical value to determine whether the result is statistically significant.
Assign a risk factor of 5% (α = 0.05) to ensure 95% confidence.
The number of paired measurements (participants) minus 1 is the degrees of freedom, df.
With the acceptable risk factor of 5% (95% confidence), df, and the t-value, the hypothesis is tested.
If the resulting p-value is less than 0.05, then the null hypothesis is rejected: the time on task for the two interfaces is significantly different, that is, the new interface is better.
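For readers who want to check the spreadsheet output by hand, the procedure on this slide can be sketched in a few lines of Python using only the standard library. The times below are hypothetical values for eight participants, made up for illustration; they are not the study's measurements. The critical value 2.365 is the two-tailed t table entry for α = 0.05 and df = 7.

```python
import math
import statistics

# Hypothetical time-on-task data in seconds (illustration only,
# NOT the measurements from this study).
old_times = [95.0, 110.0, 87.0, 120.0, 101.0, 99.0, 130.0, 105.0]
new_times = [80.0, 92.0, 85.0, 97.0, 90.0, 88.0, 104.0, 91.0]

# Two-tailed critical t for alpha = 0.05 and df = 7, from a standard t table.
T_CRITICAL_DF7 = 2.365


def paired_t(before, after):
    """Return (t value, degrees of freedom) for paired samples."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


t_value, df = paired_t(old_times, new_times)
# Reject the null hypothesis when |t| exceeds the critical value.
significant = abs(t_value) > T_CRITICAL_DF7
print(f"t = {t_value:.2f}, df = {df}, significant at 0.05: {significant}")
```

Comparing against a table value keeps the sketch free of extra libraries. A statistics package or the Excel add-in reports the exact p-value instead, which is what would go in the report.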
6.
Excel™ Data Analysis Add In
Other software packages available: SPSS, SAS, Minitab, http://usablestats.com …
Explanatory resources:
The Cartoon Guide to Statistics by Gonick & Smith
Introductory Statistics by Ross
Web Center for Social Research Methods http://www.socialresearchmethods.net/kb/stat_t.php
7.
Install Excel Data Analysis Add In
8.
Time on Task t-Test paired sample for means
9.
Result
10.
Report/Presentation Elements
Misinformation about statistical measurement is strongly ingrained, ‘math anxiety’
‘School’ marketing presenters, anticipate KIS(S), use examples from other domains (medicine, linguistics, UX development), NOT pages of formulas or jargon
Do not show p-value = .001293051; use .00129 or .001
Usability analysts should have rudimentary statistical knowledge and be able to read power tables identifying the minimum sample size and the acceptable risk from that sample size
Other methods should accompany quantitative methods
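The p-value rounding advice is easy to automate when results are pasted into a report. The helper below is a small sketch. The name `format_p` and the "< 0.001" convention for very small values are assumptions added here, not part of the slides.

```python
def format_p(p, digits=3):
    """Round a p-value for reporting.

    Values below the smallest displayable number are shown as a bound
    (for example "< 0.001") instead of a misleading "0.000".
    """
    threshold = 10 ** -digits
    if p < threshold:
        return f"< {threshold:.{digits}f}"
    return f"{p:.{digits}f}"


if __name__ == "__main__":
    print(format_p(0.001293051))     # three decimal places
    print(format_p(0.001293051, 5))  # five decimal places
```

Three decimal places is usually enough for a client audience, and the extra digits add no information at this sample size.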
11.
UX Resources
Tullis & Albert. (2008) Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics.
Sauro. A Practical Guide to Measuring Usability: 72 Answers to the Most Common Questions about Quantifying the Usability of Websites and Software.
Sauro & Lewis. (2012) Quantifying the User Experience: Practical Statistics for User Research.
http://www.measuringusability.com
12.
Questions?