4. Measuring the User Experience
• The next slides are based on the core
textbook for this module, "Measuring the User
Experience"
5. Issue-based metrics
• Usability issues typically include qualitative
data:
– The identification and description of a problem one or
more participants experienced
– An assessment of the underlying cause of the
problem
– Specific recommendations for remedying the problem
• Many reports include positive findings as well
(what went well)
6. Usability issues
• Usability issues are based on behaviour in
using a product
• Common issues include:
– Task is not completed
– User goes “off course” or doesn't see
something that should be noticed
– User is frustrated
– User misinterprets some piece of content
7. What do you do with usability
issues?
• Use them in iterative
design!
8. How do you identify issues?
• In-person studies (observing participants)
• Automated (or semi-automated) studies
(analysing behaviour, e.g. through logs)
9. Severity ratings
• Severity ratings help focus attention on what
really matters
– Low: any issue that annoys or frustrates participants
but does not play a role in task failure
– Medium: any issue that contributes to significant task
difficulty but does not cause task failure
– High: any issue that leads directly to task failure;
encountering this issue will stop the user from
completing the task
10. Severity ratings: 2 factors
• Severity rating can also use a combination
of 2 factors – typically frequency and
impact
11. Severity ratings: 4 factors
• You can also use four three-point scales
(low, medium, high):
– Impact on the user experience
– Predicted frequency of occurrence
– Impact on the business goals
– Technical/implementation costs
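One possible way to turn the four three-point scales into a single rating is sketched below. The slide does not say how the scales are combined. The 0/1/2 point mapping, the plain sum, the factor names and the function name combined_severity are all assumptions for illustration. A team may prefer weights, or may want to treat implementation cost differently from the other three factors.

```python
# Assumption: each level is worth 0, 1 or 2 points and the four ratings
# are simply summed, giving a score from 0 to 8.
LEVEL_POINTS = {"low": 0, "medium": 1, "high": 2}

# Hypothetical factor names matching the four scales on the slide.
FACTORS = (
    "user_experience_impact",
    "predicted_frequency",
    "business_impact",
    "implementation_cost",
)


def combined_severity(ratings):
    """Return a 0-8 score from a dict mapping each factor to a level."""
    missing = [factor for factor in FACTORS if factor not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    return sum(LEVEL_POINTS[ratings[factor]] for factor in FACTORS)


# Hypothetical issue: high UX impact, medium frequency,
# high business impact, low implementation cost.
example = {
    "user_experience_impact": "high",
    "predicted_frequency": "medium",
    "business_impact": "high",
    "implementation_cost": "low",
}
example_score = combined_severity(example)  # 2 + 1 + 2 + 0
```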
12. Using a severity rating system
• What does each level mean? Is it clear to
the team?
• Have more than one usability specialist
assigning severity ratings to each issue!
– How do you establish the final rating? How do
you address differences in the evaluation?
• Track the usability issues!
13. Analysing usability issues
• How is the overall usability of the product?
• Is the usability improving with each design
iteration?
• Where should you focus your efforts to
improve the design?
14. Analysing usability issues (2)
• Analysing usability issues typically focuses
on identifying
– Unique issues
– Issues per participant
– Frequency per participant
– Issues by category
– Issues by task
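The counts listed above can be derived from a simple log of observations. The sketch below is a minimal illustration, not a prescribed method. The record layout, the sample data and the name issue_metrics are hypothetical, and "frequency" is interpreted here as the share of participants who encountered each issue.

```python
from collections import defaultdict

# Hypothetical observations: (participant, task, issue id, category).
OBSERVATIONS = [
    ("P1", "checkout", "I1", "navigation"),
    ("P1", "search", "I2", "terminology"),
    ("P2", "checkout", "I1", "navigation"),
    ("P2", "checkout", "I3", "content"),
    ("P3", "search", "I2", "terminology"),
    ("P3", "checkout", "I1", "navigation"),
]


def issue_metrics(observations):
    """Summarise issue observations. Participants who encountered no
    issue do not appear in the log, so they are not counted here."""
    participants = set()
    issues_by_participant = defaultdict(set)
    participants_by_issue = defaultdict(set)
    issues_by_category = defaultdict(set)
    issues_by_task = defaultdict(set)

    for participant, task, issue, category in observations:
        participants.add(participant)
        issues_by_participant[participant].add(issue)
        participants_by_issue[issue].add(participant)
        issues_by_category[category].add(issue)
        issues_by_task[task].add(issue)

    n = len(participants)
    return {
        "unique_issues": len(participants_by_issue),
        "issues_per_participant": {
            p: len(found) for p, found in issues_by_participant.items()
        },
        # Share of participants who encountered each issue.
        "share_of_participants": {
            i: len(who) / n for i, who in participants_by_issue.items()
        },
        "issues_by_category": {
            c: len(found) for c, found in issues_by_category.items()
        },
        "issues_by_task": {
            t: len(found) for t, found in issues_by_task.items()
        },
    }


metrics = issue_metrics(OBSERVATIONS)
```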
15. Consistency in identifying
usability issues
• Research shows very little agreement on what a
usability issue is or how severe it is
• A set of studies coordinated by Molich, with
different teams of usability experts evaluating
the same design, showed that there is very little
overlap in the findings of the teams
– Molich & Dumas (2008) showed that 60% of all the
issues were identified by only 1 of the 17 teams
participating in the study
16. Number of participants: five
users is enough
• About 80% of usability issues will be
observed with the first five participants
(Nielsen & Landauer, 1993)
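The "five users" figure is usually explained with a simple discovery model: if each participant independently reveals a given issue with probability p, the expected share of issues seen after n participants is 1 - (1 - p)^n. The sketch below assumes p = 0.31, the average often quoted from Nielsen & Landauer. That value and the function name proportion_found are assumptions here, and real studies can have a much lower p.

```python
def proportion_found(n_participants, p=0.31):
    """Expected share of issues observed after n participants,
    assuming each participant reveals any given issue independently
    with probability p (p = 0.31 is an assumed average)."""
    if n_participants < 0:
        raise ValueError("n_participants must be non-negative")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    return 1 - (1 - p) ** n_participants


# Share of issues expected after 1 to 5 participants under the assumption.
curve = [round(proportion_found(n), 2) for n in range(1, 6)]
```

With the assumed p the model gives roughly 84% after five participants, in line with the "about 80%" claim. With a lower p the same formula gives far less, which is one way to read the disagreement on the next slide.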
17. Number of participants: five
participants is not enough
• Lindgaard and Chattratichart (2007) tested
a web site with a known number of issues
– 2 teams (6 and 12 participants)
– The teams found 42% and 43% of the usability issues
in the web site, but only 28% in common!
19. What are self-reported metrics?
• Self-reported metrics relate to users'
perception of their interaction with an
interface
– They focus on subjective data
20. Collecting self-reported metrics
• Answer questions or provide ratings orally
– This is typically done through interviews
• Record responses on a paper form, or with
some type of online tool (questionnaires)
21. Interviews
• Unstructured - not directed by a script.
Rich but not replicable.
• Structured - tightly scripted, often like a
questionnaire. Replicable but may lack
richness.
• Semi-structured - guided by a script but
interesting issues can be explored in more
depth. Can provide a good balance
between richness and replicability.
22. Closed vs. open questions
• ‘Closed questions’ have a predetermined
answer format, e.g., ‘yes’ or ‘no’
– Easier to analyse
• ‘Open questions’ do not have a
predetermined format
– Allow research topics to be explored in more depth
23. Questions to avoid
• Long questions
• Compound sentences - split them into two
• Jargon and language that the interviewee may
not understand
• Leading questions that make assumptions
– e.g., why do you like …?
• Questions that the respondent is not qualified
to answer
• Unconscious biases, e.g. gender stereotypes
24. Running the interview
• Introduction – introduce yourself, explain the goals of
the interview, reassure about the ethical issues, ask to
record, present any informed consent form.
• Warm-up – make first questions easy and non-
threatening.
• Main body – present questions in a logical order
• A cool-off period – include a few easy questions to
defuse tension at the end
• Closure – thank interviewee and signal the end,
e.g. switch recorder off.
25. Enriching the interview process
• Use props - devices for prompting interviewee,
e.g. a prototype or a scenario
26. Questionnaires
• Questions can be closed or open
– Closed questions are easier to analyse, and
the analysis may be done by computer
• Can be administered to large populations
– Paper, email and the web are used for
dissemination
• Sampling can be a problem when the size
of a population is unknown, as is common
online
27. Questionnaire design
• Provide clear instructions on how to
complete the questionnaire
• Decide on whether phrases will all be
positive, all negative or mixed
• Different versions of the questionnaire
might be needed for different populations
• The impact of a question can be
influenced by question order
28. Question and response format
• Questionnaires can include:
– Binary choices
– Checkboxes that offer many options
– Rating scales
• Likert scales
• Semantic scales
– Open-ended questions
29. Encouraging a good response
• Make sure purpose of study is clear
• Ensure questionnaire is well designed
– Consider offering a short version for those who do not
have time to complete a long questionnaire
• Promise anonymity
• Follow up with emails, phone calls, letters
• Provide an incentive
• 40% response rate is high, 20% is often
acceptable
30. Online questionnaires
• Responses are usually
received quickly
• No copying and/or
postage costs
• Data can be easily
collected in a database for
analysis
• Time required for data
analysis is reduced
• Errors can be corrected
easily
32. Problems with online
questionnaires
• Sampling is problematic if population size
is unknown
• Preventing individuals from responding
more than once
33. Analysing data
• When analysing data from rating scales,
use the frequency distribution of the
responses (rather than the average or
median)
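A minimal sketch of why the distribution matters: the hypothetical ratings below are split between the two ends of a five-point scale, so their mean (3) suggests a neutral reaction that no respondent actually reported. The data and the function name frequency_distribution are illustrative assumptions.

```python
from collections import Counter


def frequency_distribution(responses, scale=(1, 2, 3, 4, 5)):
    """Return the share of responses at each point of the scale."""
    if not responses:
        raise ValueError("no responses to analyse")
    counts = Counter(responses)
    total = len(responses)
    return {point: counts.get(point, 0) / total for point in scale}


# Hypothetical ratings: half the respondents chose 1, half chose 5.
ratings = [1, 1, 1, 5, 5, 5]
mean_rating = sum(ratings) / len(ratings)
distribution = frequency_distribution(ratings)
```

Reporting the distribution (50% at 1, 50% at 5) shows the polarised responses that the single mean value hides.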
34. System usability scale
• One of the most widely used tools for
assessing the perceived usability of a
system (Brooke, 1996)
• 10 statements for which users rate their
level of agreement
– Half the statements are worded positively and
half are worded negatively.
– A five-point scale of agreement is used for
each
35. System usability scale (2)
• A technique for combining the 10 ratings into an
overall score (on a scale of 0 to 100) is also
given
36. System usability scale
(questions 1-5)
• I think that I would like to use this system
frequently
• I found the system unnecessarily complex
• I thought the system was easy to use
• I think that I would need the support of a
technical person to be able to use this
system
• I found the various functions in this
system were well integrated
37. System usability scale
(questions 6-10)
• I thought there was too much
inconsistency in this system
• I would imagine that most people would
learn to use this system very quickly
• I found the system very cumbersome to
use
• I felt very confident using the system
• I needed to learn a lot of things before I
could get going with this system
39. System usability scale: score
• Sum the score contributions from each item
– For items 1, 3, 5, 7, and 9, the score contribution is
the scale position minus 1
– For items 2, 4, 6, 8, and 10, the contribution is 5
minus the scale position
• Multiply the sum of the scores by 2.5 to obtain
the overall SUS score:
– <50: Not acceptable
– 50–70: Marginal
– >70: Acceptable
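The scoring rule above can be sketched in a few lines. The function names sus_score and sus_band are illustrative, the input is assumed to be the ten scale positions (1 to 5) in item order, and the bands follow the thresholds on this slide, with a score of exactly 50 or 70 counted as marginal.

```python
def sus_score(positions):
    """Compute the SUS score (0-100) from ten scale positions (1-5),
    given in item order 1..10."""
    if len(positions) != 10:
        raise ValueError("SUS needs exactly 10 ratings")
    total = 0
    for item, position in enumerate(positions, start=1):
        if not 1 <= position <= 5:
            raise ValueError(f"item {item}: position must be 1-5")
        if item % 2 == 1:
            # Positively worded items (1, 3, 5, 7, 9).
            total += position - 1
        else:
            # Negatively worded items (2, 4, 6, 8, 10).
            total += 5 - position
    return total * 2.5


def sus_band(score):
    """Map a SUS score to the bands listed on the slide."""
    if score < 50:
        return "Not acceptable"
    if score <= 70:
        return "Marginal"
    return "Acceptable"


# Hypothetical respondent who answered 4 on odd items and 2 on even items.
example_score = sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2])
example_band = sus_band(example_score)
```

For this respondent every item contributes 3 points, so the sum is 30 and the score is 75, which falls in the acceptable band.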
40. Usability scales/questionnaires
• There are also other scales:
– Post-Study System Usability Questionnaire
and Computer System Usability
Questionnaire (Lewis, 1995)
– Questionnaire for User Interface Satisfaction
(Chin, Diehl, & Norman, 1988)
– Product Reaction Cards (Benedek and Miner,
2002)
– More here:
http://oldwww.acm.org/perlman/question.html
41. Assessing attributes
• The techniques described in the previous pages are
typically used to assess interfaces or tasks as a whole
• You can also look at specific attributes of an interface:
– Visual appeal
– Perceived efficiency
– Confidence
– Usefulness
– Enjoyment
– Credibility
– Appropriateness of terminology
– Ease of navigation
– Responsiveness
42. Biases in self-reported data
• Answers provided in person or over the
phone tend to be more positive than
those given through an anonymous survey
(Dillman et al., 2008)