It’s time to say ELMO:
Unmoderated vs. Moderated Showdown
Asha Fereydouni, Research Partner
23 January, 2020
Quick Housekeeping
• Control panel on the side of your screen if you
have any comments during the presentation
• Time at the end for Q&A
• Today’s webinar will be recorded for future
viewing
• All attendees will receive a copy of the
slides/recording
• Continue the discussion using #UZwebinar
Let’s make sure you’re all set up for the webinar!
I. Review of Unmoderated vs. Moderated Literature
II. Research Scope and Project Overview
III. Discussion of Severity Matrices and Sample Size
IV. Most/Least Problematic Tasks, Key Insights, & Takeaways
V. Saying ELMO (and Next Steps)
VI. Q&A
Today’s Presentation:
Quick Poll
Industry Research
2002: Tullis et al - Quant comparison - 8 lab, 29 unmoderated
2010: UX Matters - Industry Blog Post
2010: UPA - Quantitative Task comparison
2013: Nielsen Norman Group - Industry Article
2015: International Journal of HCI - Academic Journal Publication
2016: UX Magazine - Industry Article
2018: MeasuringU - Industry Blog Post
2018: MeasuringU - Quant Summary
Everyone has an opinion
“When considering doing unmoderated user research, it’s
important to keep in mind that unmoderated user research is never
as good as moderated user research.
You should always avoid attempting to replace necessary
moderated user research with unmoderated user research.”
- Actual quote from a leading publication -
Quick Poll
More data!
And what’s that…
Research Scope and Project Overview
Are the (1) number and (2) severity of usability issues discovered
through 10 remote moderated sessions and 10 remote unmoderated
sessions different?
Primary Research Question
Note: Not looking at generative insights, but usability issues.
Tasks:
1. First impressions
2. Find a specific item w/o Search
3. Find a specific item w/ Search and add to cart
4. Find key site info – policies around organic cotton
5. Find a store
6. Describe site to a friend
Study 1: Remote Moderated - 10 sessions; 5 completed Saturday, 5 Sunday.
Study 2: Remote Unmoderated - 10 sessions; set-up Friday, launched Saturday.
Severity Matrices - Past Work
Exist to help researchers more consistently rate and present the (1) number and (2) severity
of various usability issues.
Molich & Jeffries, 1-3 point scale:
1. Minor: delays user briefly;
2. Serious: delays user significantly but eventually allows them to complete the task;
3. Catastrophic: prevents user from completing their task.
Jakob Nielsen, 0-4 point scale:
0. I don’t agree that this is a usability problem at all;
1. Cosmetic problem only: need not be fixed unless extra time is available on project;
2. Minor usability problem: fixing this should be given low priority;
3. Major usability problem: important to fix, so should be given high priority;
4. Usability catastrophe: imperative to fix this before product can be released.
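To apply a scale like Nielsen’s consistently across sessions, a researcher might tag each logged issue with a severity level and tally the results. A minimal sketch, using the 0–4 labels quoted above; the issue list and helper function are hypothetical illustrations, not the study’s actual data:

```python
from collections import Counter

# Nielsen's 0-4 severity scale, as quoted above.
NIELSEN_SCALE = {
    0: "Not a usability problem",
    1: "Cosmetic problem only",
    2: "Minor usability problem",
    3: "Major usability problem",
    4: "Usability catastrophe",
}

def summarize(issues):
    """Tally issues per severity level so the (1) number and (2) severity
    can be reported side by side. `issues` is a list of (description, level)."""
    counts = Counter(level for _, level in issues)
    return {level: (label, counts.get(level, 0))
            for level, label in NIELSEN_SCALE.items()}

# Hypothetical issues logged in one session:
issues = [
    ("Unclear first click for policy info", 3),
    ("Dense text on organic cotton page", 2),
    ("Map pin colors unexplained", 1),
]
summary = summarize(issues)
```

Keeping the scale in one shared mapping is what makes ratings comparable across moderated and unmoderated sessions.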
Confidence is Key

Identify Usability Issues:
Problem/Insight Occurrence → Sample Size Needed
40% → 4
30% → 5
20% → 9
10% → 18
5% → 37

Estimating Parameter KPIs (90% Confidence):
Margin of Error (+/-) → Sample Size Needed
24% → 10
15% → 28
10% → 65
8% → 103
5% → 268
3% → 749
2% → 1,689

Comparing Options (90% Confidence):
Difference to Detect → Sample Size (Within Subjects) / Sample Size (Between Subjects)
50% → 17 / 22
30% → 29 / 64
12% → 93 / 426
10% → 115 / 614
5% → 246 / 2,468
3% → 421 / 6,866
1% → 1,297 / 61,822
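The slides don’t state the formulas behind these tables, but the first two are consistent with standard ones. A sketch, under two assumptions: the discovery table reflects the sample size needed to see an issue at least once with roughly 85% likelihood, and the margin-of-error table uses an adjusted-Wald estimate at 90% confidence (worst case p = 0.5):

```python
import math

def n_to_discover(p, likelihood=0.85):
    """Sample size so an issue affecting a proportion p of users is seen
    at least once with the given likelihood:
    1 - (1 - p)^n >= likelihood  =>  n ~ ln(1 - likelihood) / ln(1 - p).
    Rounded to the nearest whole participant, matching the table."""
    return round(math.log(1 - likelihood) / math.log(1 - p))

def n_for_margin(moe, z=1.644854):  # z for 90% confidence
    """Sample size for a +/- moe margin of error on a proportion,
    worst case p = 0.5, with an adjusted-Wald correction (subtract z^2)."""
    return math.ceil(z**2 * 0.25 / moe**2 - z**2)

# e.g. n_to_discover(0.10) -> 18 and n_for_margin(0.05) -> 268,
# matching the tables above.
```

The takeaway from the formulas is the same as from the tables: 10 sessions per method is plenty for discovering common issues, but far too few for precise estimates or method-to-method comparisons.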
Most & Least problematic Tasks
Most problematic task: Task 4 (Find key site info – policies around organic cotton):
• Number of issues: Most participants had the same difficulty with:
1. The first click
2. Navigating successive pages on the site with no clear site-level guidance
3. Understanding the mass of text about Patagonia’s policies around organic cotton
• Severity of issues: Most participants had level 2 serious problems with this task – these were
problems that delay “user[s] significantly but eventually allows them to complete the task.”
Least problematic task: Task 5 (Find a store):
• Number of issues: Understanding the meaning of the different-colored location pins.
• Severity of issues: A few participants, for each method, had level 1 minor problems with this
task – problems that delayed the “user briefly.”
Struggling to Find Policies Around Organic Cotton
Relative Ease Finding Denver
Key Insights:
• Number of issues: Moderated discovered 20% more usability issues,
largely due to my ability to probe and ask follow-up questions.
• Severity of issues: Most of those additional issues were minor
usability issues.
• Method-to-method comparison: Moderated came out ahead on both the
number and the severity of issues uncovered.
Takeaways for UX practitioners
If you need to identify the maximum number and severity of issues – a
remote moderated test is best.
Practitioner’s opinion: While moderated may be the “winner,” practical
limits including time (to moderate, to review videos) and cost (to pay
participants, to pay a recruiting firm to recruit people, etc.) mean that an
unmoderated test may be more practical. Researchers should inform
themselves about the theoretical and actual strengths and weaknesses
of various methods.
Additional Research
Different Industries
Larger Sample Sizes
Other Moderators
Cross-validation
etc.
But for now…
Enough, Let’s Move On!
ELMO
Time to say… FarewELMO
Q&A
afereydouni@userzoom.com Main HQ: +1 866-599-1550
THANK YOU!
Appendix
Research Overview
Study 1: Remote Moderated - 10 sessions; 5 Saturday, 5 Sunday in September.
Study 2: Remote Unmoderated - 10 sessions, test set-up Friday - launched Saturday in Sept.
Stimulus: Patagonia.com public e-commerce website.
Tasks:
1. First impressions of site
2. Find a specific item w/o Search
3. Find a specific item w/ Search and add to cart
4. Find key site info – policies around organic cotton
5. Find a store
6. Describe site to a friend
