It’s time to say ELMO:
Unmoderated vs. Moderated Showdown
Asha Fereydouni, Research Partner
23 January, 2020
Quick Housekeeping
• Control panel on the side of your screen if you
have any comments during the presentation
• Time at the end for Q&A
• Today’s webinar will be recorded for future
viewing
• All attendees will receive a copy of the
slides/recording
• Continue the discussion using #UZwebinar
Let’s make sure you’re all set up for the webinar!
I. Review of Unmoderated vs. Moderated Literature
II. Research Scope and Project Overview
III. Discussion of Severity Matrices and Sample Size
IV. Most/Least Problematic Tasks, Key Insights, & Takeaways
V. Saying ELMO (and Next Steps)
VI. Q&A
Today’s Presentation:
Quick Poll
Industry Research
2002: Tullis et al - Quant comparison - 8 lab, 29 unmoderated
2010: UX Matters - Industry Blog Post
2010: UPA - Quantitative Task comparison
2013: Nielsen Norman Group - Industry Article
2015: International Journal of HCI - Academic Journal Publication
2016: UX Magazine - Industry Article
2018: MeasuringU - Industry Blog Post
2018: MeasuringU - Quant Summary
Everyone has an opinion
“When considering doing unmoderated user research, it’s
important to keep in mind that unmoderated user research is never
as good as moderated user research.
You should always avoid attempting to replace necessary
moderated user research with unmoderated user research.”
- Actual quote from a leading publication -
Quick Poll
More data!
And what’s that…
Research Scope and Project Overview
Are the (1) number and (2) severity of usability issues discovered
through 10 remote moderated sessions and 10 remote unmoderated
sessions different?
Primary Research Question
Note: Not looking at generative insights, but usability issues.
Tasks:
1. First impressions
2. Find a specific item w/o Search
3. Find a specific item w/ Search and add to cart
4. Find key site info – policies around organic cotton
5. Find a store
6. Describe site to a friend
Study 1: Remote Moderated - 10 sessions; 5 completed Saturday, 5 Sunday.
Study 2: Remote Unmoderated - 10 sessions; set-up Friday, launched Saturday.
Severity Matrices - Past Work
Exist to help researchers more consistently rate and present the (1) number and (2) severity
of various usability issues.
Molich & Jeffries, 1-3 point scale:
1. Minor: delays user briefly;
2. Serious: delays user significantly but eventually allows them to complete the task;
3. Catastrophic: prevents user from completing their task.
Jakob Nielsen, 0-4 point scale:
0. I don’t agree that this is a usability problem at all;
1. Cosmetic problem only: need not be fixed unless extra time is available on project;
2. Minor usability problem: fixing this should be given low priority;
3. Major usability problem: important to fix, so should be given high priority;
4. Usability catastrophe: imperative to fix this before product can be released.
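To apply a scale like Nielsen’s consistently across sessions, a researcher might tag each logged issue with a severity level and tally the results. A minimal sketch, using the 0–4 labels quoted above; the issue list and helper function are hypothetical illustrations, not the study’s actual data:

```python
from collections import Counter

# Nielsen's 0-4 severity scale, as quoted above.
NIELSEN_SCALE = {
    0: "Not a usability problem",
    1: "Cosmetic problem only",
    2: "Minor usability problem",
    3: "Major usability problem",
    4: "Usability catastrophe",
}

def summarize(issues):
    """Tally issues per severity level so the (1) number and (2) severity
    can be reported side by side. `issues` is a list of (description, level)."""
    counts = Counter(level for _, level in issues)
    return {level: (label, counts.get(level, 0))
            for level, label in NIELSEN_SCALE.items()}

# Hypothetical issues logged in one session:
issues = [
    ("Unclear first click for policy info", 3),
    ("Dense text on organic cotton page", 2),
    ("Map pin colors unexplained", 1),
]
summary = summarize(issues)
```

Keeping the scale in one shared mapping is what makes ratings comparable across moderated and unmoderated sessions.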
Confidence is Key

Identify Usability Issues:
Problem/Insight Occurrence → Sample Size Needed
40% → 4
30% → 5
20% → 9
10% → 18
5% → 37

Estimating Parameter KPIs (90% Confidence):
Margin of Error (+/-) → Sample Size Needed
24% → 10
15% → 28
10% → 65
8% → 103
5% → 268
3% → 749
2% → 1,689

Comparing Options (90% Confidence):
Difference to Detect → Sample Size (Within Subjects) / Sample Size (Between Subjects)
50% → 17 / 22
30% → 29 / 64
12% → 93 / 426
10% → 115 / 614
5% → 246 / 2,468
3% → 421 / 6,866
1% → 1,297 / 61,822
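The slides don’t state the formulas behind these tables, but the first two are consistent with standard ones. A sketch, under two assumptions: the discovery table reflects the sample size needed to see an issue at least once with roughly 85% likelihood, and the margin-of-error table uses an adjusted-Wald estimate at 90% confidence (worst case p = 0.5):

```python
import math

def n_to_discover(p, likelihood=0.85):
    """Sample size so an issue affecting a proportion p of users is seen
    at least once with the given likelihood:
    1 - (1 - p)^n >= likelihood  =>  n ~ ln(1 - likelihood) / ln(1 - p).
    Rounded to the nearest whole participant, matching the table."""
    return round(math.log(1 - likelihood) / math.log(1 - p))

def n_for_margin(moe, z=1.644854):  # z for 90% confidence
    """Sample size for a +/- moe margin of error on a proportion,
    worst case p = 0.5, with an adjusted-Wald correction (subtract z^2)."""
    return math.ceil(z**2 * 0.25 / moe**2 - z**2)

# e.g. n_to_discover(0.10) -> 18 and n_for_margin(0.05) -> 268,
# matching the tables above.
```

The takeaway from the formulas is the same as from the tables: 10 sessions per method is plenty for discovering common issues, but far too few for precise estimates or method-to-method comparisons.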
Most & Least problematic Tasks
Most problematic task: Task 4 (Find key site info – policies around organic cotton):
• Number of issues: Most participants had the same difficulty with:
1. The first click
2. Navigating successive pages on the site with no clear site-level guidance
3. Understanding the mass of text about Patagonia’s policies around organic cotton
• Severity of issues: Most participants had level 2 serious problems with this task – these were
problems that delay “user[s] significantly but eventually allows them to complete the task.”
Least problematic task: Task 5 (Find a store):
• Number of issues: Understanding the meaning of the different-colored location pins.
• Severity of issues: A few participants, for each method, had level 1 minor problems with this
task – problems that delayed the “user briefly.”
Struggling to Find Policies Around Organic Cotton
Relative Ease Finding Denver
Key Insights:
• Number of issues: Moderated discovered 20% more usability issues,
largely due to my ability to probe and ask follow-up questions.
• Severity of issues: Most of those additional issues were minor
usability issues.
• Method-to-method comparison: Moderated came out ahead on both the
number and the severity of issues uncovered.
Takeaways for UX practitioners
If you need to identify the maximum number and severity of issues – a
remote moderated test is best.
Practitioner’s opinion: While moderated may be the “winner,” practical
limits including time (to moderate, to review videos) and cost (to pay
participants, to pay a recruiting firm to recruit people, etc.) mean that an
unmoderated test may be more practical. Researchers should inform
themselves about the theoretical and actual strengths and weaknesses
of various methods.
Additional Research
Different Industries
Larger Sample Sizes
Other Moderators
Cross-validation
etc.
But for now…
Enough, Let’s Move On!
ELMO
Time to say… FarewELMO
Q&A
afereydouni@userzoom.com Main HQ: +1 866-599-1550
THANK YOU!
Appendix
Research Overview
Study 1: Remote Moderated - 10 sessions; 5 Saturday, 5 Sunday in September.
Study 2: Remote Unmoderated - 10 sessions, test set-up Friday - launched Saturday in Sept.
Stimulus: Patagonia.com public e-commerce website.
Tasks:
1. First impressions of site
2. Find a specific item w/o Search
3. Find a specific item w/ Search and add to cart
4. Find key site info – policies around organic cotton
5. Find a store
6. Describe site to a friend
