Game Design 2
Lecture 7: User Evaluation
2. Why End-User Evaluation?
• As discussed last lecture, much evaluation
can take place without end-users
• However, there is no substitute for the
feedback that the target audience provides
• Only by having players use your interface
can you tell where the most important
issues are
4. Qualitative Vs Quantitative
• Quantitative data is numeric data.
• Qualitative data is verbose, descriptive
and more difficult to summarise.
– Techniques such as cluster analysis can help
to make meaningful interpretations of
qualitative data
5. Two Approaches
• You can ask players what they think
– Identify unexpected issues
– Can be a relatively quick and simple way to gather data
from many players
– The only way to ‘get inside the head’ of a player.
• You can observe what players do
– It can be hard for players to voice issues
– Designer can focus on players’ natural response to a
game item without drawing attention
– Can find issues that the designer doesn’t expect and
the player doesn’t notice
6. Asking Questions
• Oral or Written
• 2 Basic types of questions
– Open-ended questions
– Closed-ended questions
• Questionnaires can contain one or both
types of question
7. Open Ended Questionnaires
• Lead to a larger variety of answers than
closed-ended questions
• Answered in natural language.
• Result in qualitative data which has to be
analysed further before being considered
8. Can you see any difficulties in receiving data like this?
9. Open Ended Questions
10. How to write open ended questions
• Avoid leading language
– “Was the control scheme easy to use?”
– Questions like this tend to guide the
interviewee into answering the way the
interviewer expects
• Avoid double-barrelled questions that ask
more than one thing at once.
– “How challenging was the second level?
Would you have preferred more or less ammo
on this level?”
11. Benefits of Open Ended Questions
• Allow respondents to include more
information including feelings, attitudes
and understanding of the subject.
• Respondents can’t forget or miss the
range of answers applicable (imagine a
race survey and missing the race that
most applies to you)
• Respondents can’t lazily answer ‘no’ or
‘yes’ to every question (thus skewing data)
12. Disadvantages of Open-Ended Questions
• More difficult to write than closed-ended
questions
• May result in irrelevant information
• Some interviewees don’t know how to
answer them or feel on the spot
• May result in too much information
• May be (very) difficult to analyse
13. Closed Ended Questions
• User chooses from a list of predefined
answers
• Lead directly to quantifiable data.
• Although they naturally result in a
narrower range of responses than open
ended questions, it is still possible to
determine much of the same kind of data.
14. Closed Ended Questions
15. How would you redesign the question as a closed ended one
so as to avoid the problems identified earlier?
16. Choosing Answers
• Can be a list of predefined answers (such
as with colours).
• For simple questions, can use ‘yes’, ‘don’t
know’, ‘no’ to determine the audience’s
attitude towards a facet of your game.
• To gauge more subtle responses, a Likert
scale may be used.
17. Likert Scale
• Typically a 5 point scale
– Strongly disagree
– Disagree
– Neither agree nor disagree
– Agree
– Strongly agree
• Allows gauging of ‘in between’ answers
that are not boolean.
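Because the points on a Likert scale are ordered, responses can be counted per label and also summarised with a median. The following is a minimal sketch, assuming the usual five labels; the function name and the sample answers are hypothetical and only for illustration.

```python
from collections import Counter

# The usual five Likert labels, ordered from lowest to highest agreement.
LIKERT_LABELS = [
    "Strongly disagree",
    "Disagree",
    "Neither agree nor disagree",
    "Agree",
    "Strongly agree",
]


def summarise_likert(responses):
    """Return (count per label, median label) for a list of responses."""
    counts = Counter(responses)
    # Report every label, including those nobody chose.
    ordered_counts = {label: counts.get(label, 0) for label in LIKERT_LABELS}
    # Convert labels to positions 0..4 so they can be sorted.
    positions = sorted(LIKERT_LABELS.index(r) for r in responses)
    # Lower median, so the result is always one of the real labels.
    median_position = positions[(len(positions) - 1) // 2]
    return ordered_counts, LIKERT_LABELS[median_position]


# Hypothetical answers to "The control scheme was easy to use".
answers = [
    "Agree",
    "Strongly agree",
    "Agree",
    "Disagree",
    "Neither agree nor disagree",
]
counts, median = summarise_likert(answers)
print(counts)
print("Median response:", median)
```

The median is used rather than the mean because the labels are ordered categories, and the distance between neighbouring labels is not necessarily equal.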
18. Benefits of Closed-Ended Questions
• Easy to quantify since they directly provide
quantitative data
– (you just count the number for each response)
• Simpler to write and to fill out
• Can have a more semantic understanding
of responses rather than the potential
infinity of natural language
19. Observing Behaviour
• Playtests can be conducted in a focus group
setting or individually.
• Focus group settings allow for a quicker
understanding of consensus
– Care is needed to prevent dominant personalities from
taking over the discussion
– Some people are too shy to speak up in public
• Individual observations allow for discovering a
wider range of responses.
20. Usability Observation
• Ideally play is recorded for later analysis
although it is possible to take notes during
play
• Video should be used to record facial
expressions, body language and input
devices in addition to game output.
• The eMotion lab at Caledonian is fitted
with additional equipment including an eye
tracker and physiological sensors.
• Observations can result in meaningful data from
a small number of participants.
• Nielsen says:
“the best results come from testing no more than
five users and running as many small tests as
you can afford. As you add more and more
users, you learn less and less because you keep
seeing the same things again and again.”
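The diminishing returns described in the quote are often illustrated with the problem-discovery model that Nielsen published with Tom Landauer, in which the share of problems seen at least once by n users is 1 - (1 - L)^n. The sketch below assumes L = 0.31, the average per-user discovery rate Nielsen reports. That value does not come from this lecture and will differ between games and interfaces, so treat the numbers as an illustration only.

```python
def share_of_problems_found(n_users, discovery_rate=0.31):
    """Expected share of usability problems seen at least once by n_users.

    discovery_rate is the assumed chance that a single user reveals any
    given problem (0.31 is Nielsen's reported average, not a constant).
    """
    return 1 - (1 - discovery_rate) ** n_users


# Each extra user adds less new information than the one before.
for n in range(1, 11):
    print(f"{n:2d} users: {share_of_problems_found(n):.0%}")
```

Under this assumption five users reveal roughly 84% of the problems, which is why several small tests on successive designs tend to be a better use of a budget than one large test.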
22. • Important to remember that the goal is to test the game
(or game interface) and NOT to test the user.
• If the player struggles to understand an interface, it is the
design’s ‘fault’, not the player’s.
• The facilitator should provide no more information than
the final end-user would receive. In other words, the
facilitator should not try to help the participant in any
way.
• Observations are ideal places to use ‘Think Aloud’
methods in addition to analysis of what is recorded.
• Audience feedback is used to modify
designs which are then tested again until
the design satisfies its requirements.
• Often a wholly different set of users test
the new design to avoid improved results
due to familiarity.
24. Think Aloud Protocol
• Combines elements of observation with
self-report
25. Think Aloud/Online Self Report
“The think aloud technique is pretty
much what it sounds like. You ask
someone to do a task, and to think
aloud about what they are doing
while they are doing it.”
- Rugg, 2007
26. “The basic concept is simple: you tell the respondent
what the task is, and ask them to think aloud while
doing it. If they are silent for more than a set length
of time (e.g. five seconds) then you use a prearranged
prompt to get them talking again”
- Rugg, 2007
“Could you tell me
what you’re thinking
“Are you looking at
the background of
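When reviewing a recorded session, the "set length of time" rule can be checked against timestamps of when the participant spoke. The sketch below is one possible way to do that, assuming speech is logged as a sorted list of times in seconds from the start of the session; the function name, the log format and the sample times are all hypothetical.

```python
def silent_gaps(speech_times, session_end, max_silence=5.0):
    """Return (start, duration) for every silence longer than max_silence.

    speech_times: sorted times (seconds) at which the participant spoke.
    session_end:  time (seconds) at which the session finished.
    Each utterance is treated as an instant, which is a simplification.
    """
    gaps = []
    previous = 0.0  # the session is assumed to start at t = 0
    for t in list(speech_times) + [session_end]:
        if t - previous > max_silence:
            gaps.append((previous, t - previous))
        previous = t
    return gaps


# Hypothetical log: the participant goes quiet after 4 s and after 13 s.
speech = [1.0, 3.5, 4.0, 12.0, 13.0, 30.0]
for start, duration in silent_gaps(speech, session_end=32.0):
    print(f"Silence of {duration:.0f} s starting at {start:.0f} s: prompt was due")
```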
27. “… if you have to transcribe the data,
then this can be very time-consuming (in
the order of ten hours of transcription
per hour of tape, depending on how
good your typing is and how loquacious
your respondents are).”
- Rugg, 2007
28. “the convention on transcripts is to use one
full stop per second of silence (so “….”
shows four seconds of silence). “Um” and
“er” sounds are also worth noting, for the
same reason, particularly when the
respondent is otherwise articulate.”
- Rugg, 2007
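If transcripts follow this convention, pauses and hesitations can be counted automatically. The following is a rough sketch, assuming a pause is written as a run of two or more plain full stops (so a single sentence-ending full stop is ignored) and that only "um" and "er" are counted as fillers. The sample sentence is made up.

```python
import re


def pause_lengths(transcript):
    """Return the length in seconds of each pause marked with full stops.

    Assumes one full stop per second of silence and that a pause is a
    run of at least two full stops.
    """
    return [len(run) for run in re.findall(r"\.{2,}", transcript)]


def count_fillers(transcript):
    """Count 'um' and 'er' as whole words, ignoring case."""
    return len(re.findall(r"\b(?:um|er)\b", transcript, flags=re.IGNORECASE))


# Hypothetical transcript fragment.
sample = "I think the map is.... um, over here... er, no.. here."
print("Pauses (seconds):", pause_lengths(sample))
print("Fillers:", count_fillers(sample))
```

A one-second pause at the end of a sentence cannot be told apart from ordinary punctuation under this assumption, so very short pauses are undercounted.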
29. Fixing Problems With Card Sorting
• If an interface proves difficult to design, a
more structured approach may be
necessary to create a design that is
intuitive to the target audience.
• Card Sorting is a methodology which
enables non-expert end-users to help
categorise items in a way which is useful
for interface and user experience design.
30. Card Sorting
• Open sorting
– Often used early on in process
– Users can define their own categories
– Can also repeat the task depending on a criterion
of their choosing
• Closed sorting
– Used later in process
– Categories are pre-defined
31. Card Sorts Analysis
• Category/Criteria names
– verbatim agreement
– gist agreement
• super-ordinate grouping
– cluster analysis, tree diagram, co-occurrence
• Number of criteria/categories
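Verbatim agreement can be checked mechanically, since it only asks whether respondents used the same words. The sketch below counts how many participants used each category name, assuming names are treated as identical after trimming spaces and ignoring case (that normalisation is an assumption, as are the sample names). Gist agreement still needs a human judge.

```python
from collections import Counter


def verbatim_agreement(names_by_participant):
    """Count how many participants used each category name.

    names_by_participant: one list of category names per participant.
    """
    counts = Counter()
    for names in names_by_participant:
        # A set, so each participant counts at most once per name.
        counts.update({name.strip().lower() for name in names})
    return counts


# Hypothetical category names from three participants.
names = [
    ["Settings", "Save games"],
    ["settings", "Saving"],
    ["Options", "Save games"],
]
agreement = verbatim_agreement(names)
for name, n in agreement.most_common():
    print(f"{name}: used by {n} participant(s)")
```

Here "Saving" and "Options" would only be matched to "Save games" and "Settings" by gist agreement, which is the kind of judgement the super-ordinate grouping step relies on.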
32. Super-ordinate grouping performed by an Independent Judge
“Your task is to interpret the criteria into super-ordinate constructs. You should
try to identify where the criterion given by one respondent could be said to have
meant the same as another but simply have chosen different wording.”
Example superordinate grouping
33. Co-occurrence matrix
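A co-occurrence matrix records, for each pair of cards, how many participants placed the two cards in the same group. The following is a minimal sketch of how one could be built, assuming each participant's sort is stored as a list of groups of card names; the data layout, function name and menu items are hypothetical.

```python
from itertools import combinations


def co_occurrence(sorts):
    """Count how many participants placed each pair of cards together.

    sorts: one entry per participant; each entry is a list of groups,
    and each group is a list of card names.
    Returns a dict keyed by alphabetically ordered (card_a, card_b) pairs.
    """
    counts = {}
    for groups in sorts:
        for group in groups:
            # Sorting makes ("Load", "Save") and ("Save", "Load") one key.
            for pair in combinations(sorted(set(group)), 2):
                counts[pair] = counts.get(pair, 0) + 1
    return counts


# Hypothetical open sorts of five menu items by three participants.
sorts = [
    [["Save", "Load"], ["Audio", "Video", "Controls"]],
    [["Save", "Load", "Controls"], ["Audio", "Video"]],
    [["Save", "Load"], ["Audio", "Video"], ["Controls"]],
]
matrix = co_occurrence(sorts)
for (card_a, card_b), n in sorted(matrix.items()):
    print(f"{card_a} + {card_b}: {n} of {len(sorts)} participants")
```

Pairs with high counts are candidates for sharing a menu or screen, and the same counts can be fed into a cluster analysis to produce a tree diagram.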
34. Card Sorting Advantages
• Quick to Execute
• Good Foundation for data
35. Card Sorting Disadvantages
• Emphasises data over actions
• Possible to have divergent results
• Analysis can be time consuming
– Especially if little consensus between
participants
• May capture ‘surface’ characteristics only
– i.e. ignoring how the data would be used
36. Further reading
Card Sorting (boxes & arrows) - Definitive Guide: http://bit.ly/1aLMZvs
Card Sorting (boxes & arrows) - Analysis Spreadsheet: http://bit.ly/1bRXUqd
Dr Ed De Quincey’s Card Sorting Links: http://bit.ly/1jeGK8H