How to screen 100+ concepts with MaxDiff
SKIM | Hans Willems | April 6th 2017
Agenda

1. MaxDiff: intro and challenges
2. Number of MaxDiff items
3. MaxDiff's relativity issue
Maximum Difference Scaling (MaxDiff)

MaxDiff was originally invented as a superior alternative to rating, ranking and chip-allocation questions. It has also proved to be a good alternative to simple conjoint applications, and it can be used to answer a variety of business questions.
MaxDiff: How does it work?

MaxDiff forces consumers to make trade-offs between certain features/benefits.
Main advantages

• More discriminating and refined ratings
• Scale-free, hence not biased by cultural differences
• Engaging and intuitive exercise for respondents
• Sub-segments can be identified
• Generally lower costs and shorter timelines

How does a MaxDiff exercise look?

Two exercise formats: the conventional MaxDiff and the new SwipeDiff.
MaxDiff: Example output (fictional data)

Rank  Item                                           Average Score
1     Monthly costs                                  12.72
2     Data allowance                                 12.32
3     Network coverage                               7.93
4     Digital security                               7.13
5     4G network                                     6.97
6     Free calls/texts within provider network       6.76
7     Handset price                                  6.62
8     Customer service                               5.29
9     Voice allowance/call rates                     4.93
10    Mobile phone model/handset                     4.54
11    Ease of understanding mobile phone plan/rates  3.81
12    Roaming rates                                  3.80
13    Contract length                                3.78
14    Out of bundle call/text/data rates             3.47
15    Reputation of brand                            2.80
16    Text allowance/text rates                      2.56
17    Availability of regular phone upgrades         2.44
18    International call/text rates                  2.14
MaxDiff: Challenges?

How many items can be included in a MaxDiff exercise?
• There is a trade-off between the number of screens per respondent, the number of items per screen and the number of observations per item
• Sometimes more items need to be tested than can be handled with the standard MaxDiff method

How good are the winning (or losing) items?
• Ranking provides insight into relative preferences between items, but not into the overall acceptability/likeability of the full set of items
MaxDiff: Number of items

How many items can be included in a MaxDiff exercise? It is a trade-off between the number of screens per respondent, the number of items per screen and the number of observations per item.

• 4 items per screen is standard; 6 is considered the maximum
• Rule of thumb: show each item at least 3 times to each respondent, for example:
  - 12 items: 9 screens with 4 items, or 12 screens with 3 items
  - 20 items: 12 screens with 5 items, or 15 screens with 4 items
  - 30 items: 15 screens with 6 items, or 18 screens with 5 items
• Generally, the more items per screen, the more robust the read on the best and worst items, at the expense of robustness in the middle range

What solution should be used with more than 30 items, e.g. 50 or 100+?
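The rule of thumb above can be expressed as a small calculation: the screens needed per respondent is the total required item views divided by the items per screen, rounded up. A minimal sketch (hypothetical helper, not a SKIM tool):

```python
import math

def screens_needed(n_items, items_per_screen, min_views=3):
    """Screens per respondent so every item is shown at least
    `min_views` times to that respondent."""
    total_views = n_items * min_views
    return math.ceil(total_views / items_per_screen)

# Reproduce the rule-of-thumb examples from the slide:
print(screens_needed(12, 4))  # 9 screens
print(screens_needed(20, 5))  # 12 screens
print(screens_needed(30, 6))  # 15 screens
print(screens_needed(50, 5))  # 30 screens: already a heavy exercise
```

Running the same arithmetic for 50 or 100 items shows why the standard design breaks down: the screen count quickly becomes unacceptable for respondents.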
MaxDiff: Including more than 30 items
Including 30-50 items: Sparse and Express MaxDiff

Including over 30 items often requires an unacceptably high number of screens per respondent. There are two alternative MaxDiff approaches to handle this:

Sparse MaxDiff: every item is shown only once to each respondent.

Express MaxDiff: a (random) subset of the total item set is tested per respondent, with at least 3 observations for each item within the subset.

Both methods require information to be borrowed from other respondents. Although Express MaxDiff seems more respondent-friendly, some research has indicated that Sparse MaxDiff leads to slightly better results.
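The two designs can be contrasted in a few lines of code. This is an illustrative sketch of the allocation logic only (function names and the seeding are my own; real designs also balance item pairings and avoid repeating an item within a screen):

```python
import random

def sparse_screens(items, items_per_screen, seed=42):
    """Sparse MaxDiff: each respondent sees every item exactly once,
    spread over the minimum number of screens."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return [shuffled[i:i + items_per_screen]
            for i in range(0, len(shuffled), items_per_screen)]

def express_screens(items, subset_size, items_per_screen,
                    min_views=3, seed=42):
    """Express MaxDiff: each respondent gets a random subset of the
    total item set, with each subset item shown `min_views` times."""
    rng = random.Random(seed)
    subset = rng.sample(list(items), subset_size)
    views = subset * min_views  # every subset item appears min_views times
    rng.shuffle(views)
    return [views[i:i + items_per_screen]
            for i in range(0, len(views), items_per_screen)]

print(len(sparse_screens(range(40), 4)))        # 10 screens, 40 items once each
print(len(express_screens(range(100), 30, 5)))  # 18 screens for a 30-item subset
```

Either way, no single respondent evaluates every item three times, which is why both designs must pool ("borrow") information across respondents during estimation.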
Including >50 items: SKIM's Thompson MaxDiff (TMD)

The algorithm uses a step-wise process based on successive model estimations to increase the frequency with which high-potential items are shown to respondents:

1. After each respondent, aggregate-level utilities are calculated on the fly
2. Based on these, the top 10-20 items are selected
3. On top of that, 10 additional items are selected (semi-)randomly
4. These 20-30 items are shown to the next respondent

Items with the most potential are shown at a higher frequency, whereas items with the least potential are included in the item sets less often. Still, all items are shown to at least 30 respondents for a robust read.
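The four steps above follow the pattern of a Thompson-sampling bandit. Below is a minimal sketch of that idea only, not SKIM's actual TMD implementation: the Beta-posterior bookkeeping (counting "wins", i.e. best-picks, per item) and all parameter values are my own assumptions for illustration.

```python
import random

def next_respondent_items(wins, shows, n_top=20, n_random=10, seed=1):
    """Pick the item set for the next respondent: one popularity draw
    per item from a Beta posterior, keep the top draws, then add a few
    random items so weaker items still accumulate observations."""
    rng = random.Random(seed)
    items = list(wins)
    # Beta(wins + 1, losses + 1) draw = sampled popularity per item
    draws = {i: rng.betavariate(wins[i] + 1, shows[i] - wins[i] + 1)
             for i in items}
    top = sorted(items, key=draws.get, reverse=True)[:n_top]
    rest = [i for i in items if i not in top]
    extra = rng.sample(rest, min(n_random, len(rest)))
    return top + extra

# 100 items, all shown once so far with no "best" picks yet:
wins = {i: 0 for i in range(100)}
shows = {i: 1 for i in range(100)}
print(len(next_respondent_items(wins, shows)))  # 30 items for the next respondent
```

Because each draw is random, items with high but uncertain popularity keep getting sampled into the top set, which is what concentrates observations on the promising items over time.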
Thompson MaxDiff: Innovations and advantages

What is new/different compared to a standard MaxDiff?
• Focuses on the top-performing items
• Estimates popularity and uncertainty in real time
• Learns from each new respondent

Main advantages
• Able to handle a large number of MaxDiff items without showing an excessive number of screens
• Stronger reads on the top-ranked items
• Lower sample size needed to handle a large number of items
Thompson Sampling vs Sparse Design

[Chart: Top 3 hit rates (hit rate % against number of respondents, 20-1020) for Thompson (30 no split / 20/10 split) versus a fixed sparse design. With the fixed sparse design, roughly 4x as many respondents are required to achieve the same hit rates.]

Source: Fairchild, K., Orme, B. and Swartz, E. (2015), "Bandit Adaptive MaxDiff Designs for Huge Number of Items", 2015 Sawtooth Software Conference Proceedings
Thompson Sampling - Misinformed start

[Chart: Top 3 hit rates (hit rate % against number of respondents, 20-1020) for Thompson 20/10 split with misinformed start, Thompson 30 no split with misinformed start, and Thompson 30 no split / 20/10 split.]

Source: Fairchild, K., Orme, B. and Swartz, E. (2015), "Bandit Adaptive MaxDiff Designs for Huge Number of Items", 2015 Sawtooth Software Conference Proceedings
MaxDiff: The Relativity Issue

One issue MaxDiff suffers from is relativity: we don't know whether all items are good, all are bad, or some are good and some are bad.

Example screen (Most Preferred / Least Preferred): Headache, Having a cold, Broken toe, Pulled muscle. Every item here is undesirable, yet the respondent must still pick one as "most preferred", and the scores alone will not reveal that the whole set is bad.
The Solution: Using MaxDiff Anchoring - Two methods

Direct Approach
• Ask respondents to identify the acceptable items from the entire list (SKIM's own Kevin Lattery's Direct Approach)
• Can also be asked as unacceptable, least preferred, would not consider buying, etc.

Indirect Approach
• Ask respondents to indicate whether all items in a set are All Good, All Bad, or Some Good and Some Bad (Louviere's Indirect Approach)
MaxDiff Anchoring: Direct approach

Respondents mark each item as acceptable or not:

Item    Acceptable? (Yes/No)
Item 1  Yes
Item 2  No
Item 3  No
Item 4  No

On the resulting preference scale (Most Preferred to Least Preferred), the Anchor is then placed between the acceptable and the unacceptable items.
MaxDiff Anchoring: Indirect approach (Dual none)

After a MaxDiff screen (Most Preferred / Least Preferred, Items 1-4), an extra question is asked:

Considering only the items above...
( ) None of these are preferred
( ) Some of these are preferred
( ) All of these are preferred

The answer determines where the Anchor lands on the preference scale relative to the items on that screen: above all of them, among them, or below all of them.
MaxDiff: Example output with Anchor (fictional data)

Rank  Item                                           Average Score
1     Monthly costs                                  11.72
2     Data allowance                                 11.43
3     Network coverage                               7.52
4     Digital security                               6.88
5     4G network                                     6.67
6     Free calls/texts within provider network       6.57
7     Handset price                                  6.41
8     Customer service                               5.24
9     Voice allowance/call rates                     4.91
10    Mobile phone model/handset                     4.55
11    Ease of understanding mobile phone plan/rates  3.83
12    Roaming rates                                  3.81
13    Contract length                                3.79
14    Out of bundle call/text/data rates             3.54
15    Anchor                                         3.12
16    Reputation of brand                            2.85
17    Text allowance/text rates                      2.55
18    Availability of regular phone upgrades         2.48
19    International call/text rates                  2.13

Items ranked above the Anchor can be read as acceptable; items ranked below it as not acceptable.
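Once the anchor is estimated alongside the items, splitting the list into acceptable and unacceptable items is a simple threshold comparison. A minimal sketch (helper name is my own; scores abridged from the fictional table):

```python
def split_by_anchor(scores, anchor_score):
    """Split items into acceptable / not acceptable, using the
    anchor's score as the acceptability threshold."""
    acceptable = [item for item, s in scores.items() if s > anchor_score]
    rejected = [item for item, s in scores.items() if s <= anchor_score]
    return acceptable, rejected

scores = {
    "Monthly costs": 11.72,
    "Data allowance": 11.43,
    "Roaming rates": 3.81,
    "Reputation of brand": 2.85,
    "International call/text rates": 2.13,
}
ok, bad = split_by_anchor(scores, anchor_score=3.12)
print(ok)   # items scoring above the anchor
print(bad)  # items scoring below the anchor
```

This is exactly the information a plain MaxDiff ranking cannot give: an absolute cut-off, not just a relative ordering.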
MaxDiff Anchoring: Recommended method?

Neither method has been proven superior; the choice is largely based on context and personal preference, and more research and experience are needed. Be aware of the potential pitfalls of both methods:

DIRECT
• When a scale question is used, the cut-off logic can be arbitrary
• The additional question reintroduces a potential scale bias
• More questions/clicks for the respondent

INDIRECT
• With 5 or more items on a screen, many responses will likely be "some of these are preferred", which does not provide much information
Advanced MaxDiff: Key take-aways
MaxDiff is a great technique, but it also has its challenges:

• A large number of items can be an issue
• Sparse and Express MaxDiff are the solution for 30-50 items
• Thompson Sampling MaxDiff handles over 50 items
• Anchoring can be used to tackle MaxDiff's relativity issue, via two methods: the direct approach and the indirect approach
Contact us

Hans Willems
Research Manager
Based in Rotterdam
h.willems@skimgroup.com

skimgroup.com | @SKIMgroup
