Searching for Anomalies, by Thomas Dietterich, Distinguished Professor Emeritus in the School of EECS at Oregon State University and Chief Scientist of BigML.
*MLSEV 2020: Virtual Conference.
Alleviating Privacy Attacks Using Causal Models, by Amit Sharma
Machine learning models, especially deep neural networks, have been shown to reveal membership information about inputs in the training data. Such membership inference attacks are a serious privacy concern: for example, patients providing medical records to build a model that detects HIV would not want their identity to be leaked. Further, we show that the attack accuracy amplifies when the model is used to predict samples that come from a different distribution than the training set, which is often the case in real-world applications. Therefore, we propose the use of causal learning approaches, where a model learns the causal relationship between the input features and the outcome. An ideal causal model is known to be invariant to the training distribution and hence generalizes well both to shifts within the same distribution and across different distributions. First, we prove that models learned using causal structure provide stronger differential privacy guarantees than associational models under reasonable assumptions. Next, we show that causal models trained on sufficiently large samples are robust to membership inference attacks across different distributions of datasets, and that those trained on smaller sample sizes always have lower attack accuracy than corresponding associational models. Finally, we confirm our theoretical claims with an experimental evaluation on four moderately complex Bayesian network datasets and a colored MNIST image dataset. Associational models exhibit up to 80% attack accuracy under different test distributions and sample sizes, whereas causal models exhibit attack accuracy close to a random guess. Our results confirm the value of the generalizability of causal models in reducing susceptibility to privacy attacks. Paper available at https://arxiv.org/abs/1909.12732
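The membership inference threat can be sketched with a simple confidence-thresholding attack (a common baseline in this literature, not the paper's exact construction): the adversary guesses "member" whenever the model is unusually confident on the true label. All data below are synthetic.

```python
import numpy as np

def confidence_attack(model_confidences, threshold=0.9):
    """Guess 'member' when the model's confidence on the true label
    exceeds a threshold. A hypothetical baseline attack for illustration."""
    return model_confidences >= threshold

# Toy illustration: overfit models tend to be more confident on
# training (member) points than on unseen (non-member) points.
rng = np.random.default_rng(0)
member_conf = rng.uniform(0.85, 1.0, size=1000)    # training points
nonmember_conf = rng.uniform(0.5, 1.0, size=1000)  # held-out points

guesses = np.concatenate([confidence_attack(member_conf),
                          confidence_attack(nonmember_conf)])
truth = np.concatenate([np.ones(1000, bool), np.zeros(1000, bool)])
accuracy = (guesses == truth).mean()  # above 0.5 means membership leaks
```

A well-generalizing model, causal or otherwise, narrows the confidence gap between the two groups, which is exactly what pushes this attack's accuracy back toward a random guess.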
R - what do the numbers mean? #RStats This is the presentation for my demo at Orlando Live60 AILIve. We go through statistics interpretation with examples.
Module 4: Model Selection and Evaluation, by Sara Hooker
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org.
State of the Art in Machine Learning, by Thomas Dietterich, Distinguished Professor Emeritus in the School of EECS at Oregon State University and Chief Scientist of BigML.
*MLSEV 2020: Virtual Conference.
An introduction to causal graphical models with examples of causality in practice from different fields of science. More focused discussion of causal inference in online ads and recommender systems.
Traditional randomized experiments allow us to determine the overall causal impact of a treatment program (e.g., marketing, medical, social, educational, or political). Uplift modeling (also known as true lift, net lift, or incremental lift) goes a step further, using data mining and machine learning to identify the individuals who are truly positively influenced by a treatment. This technique allows us to identify the "persuadables" and thus optimize target selection to maximize treatment benefits. This important subfield of data mining, data science, and business analytics has gained significant attention in areas such as personalized marketing, personalized medicine, and political elections, with many publications and presentations appearing in recent years from both industry practitioners and academics.
In this workshop, I will introduce the concept of uplift, review existing methods, contrast them with the traditional approach, and introduce a new method that can be implemented with standard software. A method and metrics for model assessment will be recommended. Our discussion will include new approaches to handling the general situation where only observational data are available, i.e., without randomized experiments, using techniques from causal inference. Additionally, an integrated modeling approach for uplift and direct response (where we can identify who actually responded, e.g., via click-through or coupon scanning) will be discussed. Last but not least, extensions to the multiple-treatment situation, with solutions for optimizing treatments at the individual level, will also be discussed. While the talk is geared towards marketing applications ("personalized marketing"), the same methodologies can be readily applied in other fields such as insurance, medicine, education, and political and social programs. Examples from the retail and non-profit industries will be used to illustrate the methodologies.
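The "persuadables" idea above can be sketched with the simplest uplift approach, the two-model ("T-learner") method: fit separate response models for treated and control groups, then score each individual by the difference in predicted response probability. The data and feature construction below are synthetic, and this is an illustration of the baseline method, not the workshop's new method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 5000
x = rng.normal(size=(n, 3))                        # customer features
treated = rng.integers(0, 2, size=n).astype(bool)  # randomized treatment
# Response depends on the features, plus a treatment effect that only
# some customers (the "persuadables") experience.
base = 1 / (1 + np.exp(-x[:, 0]))
effect = 0.2 * (x[:, 1] > 0)
y = rng.uniform(size=n) < base + effect * treated

# Two-model approach: one response model per arm.
m_t = LogisticRegression().fit(x[treated], y[treated])
m_c = LogisticRegression().fit(x[~treated], y[~treated])

uplift = m_t.predict_proba(x)[:, 1] - m_c.predict_proba(x)[:, 1]
# Target the customers with the highest predicted uplift.
persuadables = np.argsort(uplift)[::-1][:500]
```

Note that the two-model approach is known to be noisy when the two models' errors do not cancel; the dedicated uplift methods surveyed in the workshop exist precisely to address this.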
Uplift Modelling as a Tool for Making Causal Inferences at Shopify - Mojan Hamed, Rising Media Ltd.
For many businesses, it is not enough to model the probability of an outcome; the real question is, "given a predictive model, what can we do to change the probability of this outcome?" The goal of this talk is to present how uplift modelling is used to make causal inferences that guide acquisition strategy at Shopify. Mojan will walk through a case study focused on the statistics and experimental design behind uplift modelling, in addition to the lessons learned from bringing this model to production. The Python implementation of this presentation will be made available to attendees.
Predictability of popularity on online social media: Gaps between prediction ..., by Amit Sharma
Can we predict the future popularity of a song, movie or tweet? Recent work suggests that although it may be hard to predict an item's popularity when it is first introduced, peeking into its early adopters and properties of their social network makes the problem easier. We test the robustness of such claims by using data from social networks spanning music, books, photos, and URLs. We find a stronger result: not only do predictive models with peeking achieve high accuracy on all datasets, they also generalize well, so much so that models trained on any one dataset perform with comparable accuracy on items from other datasets. Though practically useful, our models (and those in other work) are intellectually unsatisfying because common formulations of the problem, which involve peeking at the first small-k adopters and predicting whether items end up in the top half of popular items, are both too sensitive to the speed of early adoption and too easy. Most of the predictive power comes from looking at how quickly items reach their first few adopters, while for other features of early adopters and their networks, even the direction of correlation with popularity is not consistent across domains. Problem formulations that examine items that reach k adopters in about the same amount of time reduce the importance of temporal features, but also overall accuracy, highlighting that we understand little about why items become popular while providing a context in which we might build that understanding.
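The "peeking" formulation above can be sketched as follows, with synthetic data standing in for the real social-network datasets. It illustrates the abstract's central point: a single temporal feature, how quickly an item reaches its first k adopters, already carries most of the predictive signal.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_items, k = 2000, 10
# Synthetic world: items with faster intrinsic adoption speed both
# reach their first k adopters sooner and end up more popular.
speed = rng.exponential(1.0, size=n_items)
time_to_k = rng.exponential(1.0 / (0.5 + speed), size=n_items)
final_pop = speed * rng.lognormal(sigma=0.5, size=n_items)
top_half = final_pop > np.median(final_pop)   # the usual target

# Predict "top half" from the single temporal peeking feature.
X = np.log(time_to_k).reshape(-1, 1)
clf = LogisticRegression().fit(X, top_half)
acc = clf.score(X, top_half)   # beats the 0.5 random-guess baseline
```

Because faster adoption means a smaller time-to-k, the fitted coefficient is negative: early speed alone separates the popular items, which is exactly why the paper argues this formulation is "too easy".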
Recent news about the pending shortage of data scientists prompts speculation about automation: will machines replace human analysts? We propose a model of automation, and briefly review progress in automated machine learning over the past twenty years. Summarizing the current state of the art, we look at some of the remaining challenges, and the implications for practicing data scientists.
Simulating data to gain insights into power and p-hacking, by Dorothy Bishop
Very basic introduction to simulating data to illustrate issues affecting reproducibility. Uses Excel and R, but assumes no prior knowledge of R. Please let me know of errors or things that need better explanation.
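The same kind of simulation can be written in a few lines (the slides themselves use Excel and R; the logic carries over directly). Here is a sketch estimating the power of a two-sample t-test for a medium effect size, with parameters chosen for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_sims, n, d = 2000, 30, 0.5   # simulations, per-group n, true effect size

significant = 0
for _ in range(n_sims):
    a = rng.normal(0.0, 1.0, n)   # control group
    b = rng.normal(d, 1.0, n)     # group with a real effect of d SDs
    _, p = stats.ttest_ind(a, b)
    significant += p < 0.05

power = significant / n_sims      # about 0.48 for these settings
```

Running the same loop with d = 0 shows the flip side: about 5% of tests come out "significant" with no real effect, which is the raw material of p-hacking.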
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org.
Econometrics or machine learning? I explain when each tool is appropriate, and survey the issues and tools involved in establishing causal relationships.
Distributed tracing is still finding its footing in many organizations today. One challenge to overcome is data volume: keeping 100% of your traces is expensive and unnecessary. Enter sampling. Head-based or tail-based: how do you decide? Let's review the tradeoffs associated with different types of sampling strategies and how they can be mixed and matched.
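A minimal sketch of the two strategies, with made-up field names rather than any specific tracing system's API: head sampling decides at trace start (cheap, but blind to outcomes), while tail sampling decides after the trace completes (requires buffering, but can keep every error and every slow trace).

```python
import random

def head_sample(trace_id: int, rate: float = 0.01) -> bool:
    # Head-based: decide up front, deterministically per trace id,
    # so every service in the call chain makes the same decision.
    return (hash(trace_id) % 10_000) < rate * 10_000

def tail_sample(trace: dict, latency_slo_ms: float = 500.0) -> bool:
    # Tail-based: decide once the whole trace is known. Keep all
    # errors and SLO-violating traces, plus a tiny slice of the rest.
    if trace["error"]:
        return True
    if trace["duration_ms"] > latency_slo_ms:
        return True
    return random.random() < 0.001
```

Mixing them is common: head-sample aggressively to bound collection cost, then tail-sample the surviving traces to guarantee the interesting ones are retained.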
Through every change in marketing technology, the email newsletter has remained one of the most effective tools in the SMB marketer’s kit. Because of its importance, we surveyed 500 U.S. SMB principals to better understand the role email newsletters play in today’s dynamic marketing environment.
In this SlideShare you’ll learn:
• How SMBs rate their business outlook and challenges
• The formats and topics SMBs are most interested in
• Which industries SMBs most want email newsletters from, and from which they’re already subscribed
• The content mix they prefer
• Where SMBs are most likely to subscribe to an email newsletter
• What gets SMBs to forward an email newsletter to colleagues
• The effect of an email newsletter program on awareness, brand perception and purchase propensity
You’ll get valuable insights to put to work right away in your SMB email newsletter program.
Neotys organized its first Performance Advisory Council in Scotland, on the 14th and 15th of November.
With 15 load testing experts from several countries (UK, France, New Zealand, Germany, USA, Australia, India, and more), we explored several themes around load testing, such as DevOps, shift right, and AI.
By discussing their experience, the methods they used, their data analysis, and their interpretation, we created a lot of high-value content that you can use to discover the future of load testing.
Want to know more about this event? https://www.neotys.com/performance-advisory-council
How well do you know your pixels? Join this session to learn everything from basic information on how we display colors, all the way through using advanced calculations to prove that a device has a retina display. Whether you design interfaces for watches, phones, tablets, desktops, or 10-foot UI’s, you will gain some great insight into the fundamentals of how your work is displayed. This session will give you the foundation to come up with the next great concepts in digital interfaces!
dxDOE: Design of Experiments for students, by tenadrementees
A text on statistics that can be used by students and professionals. It covers additional topics relevant to professionals in the field who need the knowledge.
Statistics is not just a subject confined to textbooks; it's a powerful tool that permeates every aspect of our lives. Whether you're a student embarking on your academic journey or a seasoned professional navigating the complexities of your field, a solid understanding of statistics is indispensable. That's where this comprehensive text comes in.
From the foundational principles to advanced techniques, this text is designed to equip both students and professionals with the knowledge and skills necessary to harness the full potential of statistics. We start by laying the groundwork with essential concepts such as probability theory, random variables, and descriptive statistics. Through clear explanations and illustrative examples, we ensure that readers grasp these fundamental building blocks with ease.
But statistics is not just about crunching numbers; it's about making sense of data and drawing meaningful insights. That's why we delve into inferential statistics, exploring hypothesis testing, confidence intervals, and regression analysis. By learning how to infer conclusions from sample data, readers gain the ability to make informed decisions and predictions based on statistical evidence.
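As a small worked example of the inferential step described above, here is a 95% confidence interval for a mean, using only the Python standard library (the data values are made up):

```python
import math
import statistics

sample = [12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9]
mean = statistics.mean(sample)                       # 12.05
sem = statistics.stdev(sample) / math.sqrt(len(sample))
t_crit = 2.365   # t-distribution critical value, df = 7, two-sided 95%
ci = (mean - t_crit * sem, mean + t_crit * sem)
```

Reading the result is the "informed decision" part: any hypothesized population mean outside the interval would be rejected at the 5% level by the corresponding t-test.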
But the journey doesn't stop there. We go beyond the basics to cover advanced topics that are crucial for professionals in today's data-driven world. Multivariate analysis, time series analysis, and Bayesian statistics are just a few of the advanced techniques that readers will master, providing them with the tools to tackle complex problems and extract deeper insights from data.
What sets this text apart is its emphasis on real-world relevance. Each chapter is carefully crafted to bridge the gap between theory and practice, with practical examples and case studies drawn from a wide range of industries and disciplines. Whether you're working in finance, healthcare, marketing, or any other field, you'll find that the principles and techniques covered in this text are directly applicable to your day-to-day work.
Moreover, we recognize that proficiency in statistical software is essential for modern professionals. That's why we include discussions on popular tools such as R, Python, and SPSS, empowering readers to analyze data efficiently and effectively. With hands-on exercises and tutorials, readers can develop their skills in data analysis and visualization, gaining practical experience that will serve them well in their careers.
In sum, this text is more than just a book; it's a comprehensive guide to mastering the art and science of statistics. Whether you're a student seeking to build a strong foundation or a professional looking to expand your analytical toolkit, this text has everything you need to succeed in today's data-driven world, with clear explanations and practical examples throughout.
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. For more details, visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Opendatabay - Open Data Marketplace, by Opendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
The first open hub for data enthusiasts to collaborate and innovate: a platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, Opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. It leverages cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex: Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay also breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits, Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay: the marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Techniques to optimize the PageRank algorithm usually fall into two categories: reducing the work per iteration, and reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged can save iteration time. Skipping in-identical vertices (those with the same in-links) helps avoid duplicate computations and can also reduce iteration time. Road networks often have chains which can be short-circuited before the PageRank computation to improve performance; the final ranks of chain nodes can then be calculated directly. This can reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order. This can reduce the iteration time and the number of iterations, and also enables multi-iteration concurrency in the PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
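As a concrete illustration of the first optimization, here is a minimal power-iteration PageRank that stops updating vertices once their rank has settled. This is a sketch of the convergence-skipping heuristic only, not the STICD implementation, and it assumes every vertex has at least one out-link (no dangling nodes).

```python
def pagerank(out_links, d=0.85, tol=1e-10, max_iter=100):
    """out_links[u] is the list of vertices u links to."""
    n = len(out_links)
    rank = [1.0 / n] * n
    converged = [False] * n
    # Precompute in-links so each vertex pulls rank from its sources.
    in_links = [[] for _ in range(n)]
    for u, outs in enumerate(out_links):
        for v in outs:
            in_links[v].append(u)
    for _ in range(max_iter):
        new, active = list(rank), 0
        for v in range(n):
            if converged[v]:
                continue   # skip work for settled vertices (heuristic)
            s = sum(rank[u] / len(out_links[u]) for u in in_links[v])
            new[v] = (1 - d) / n + d * s
            if abs(new[v] - rank[v]) < tol:
                converged[v] = True
            else:
                active += 1
        rank = new
        if active == 0:
            break
    return rank

# 3-node cycle: perfectly symmetric, so every rank settles at 1/3.
ranks = pagerank([[1], [2], [0]])
```

The other optimizations in the paragraph layer onto this loop: in-identical vertices share one computed value, chain ranks are solved in closed form before iterating, and components are processed in topological order so each converges against already-final inputs.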
Webinar slides: DIY Market Mapping Using Correspondence Analysis
1. TIM BOCK PRESENTS

If you have any questions, enter them into the Questions field. Questions will be answered at the end. If we do not have time to get to your question, we will email you.

We will email you a link to the video, slides, and data.

Get a free one-month trial of Q from www.q-researchsoftware.com

DIY Market Mapping Using Correspondence Analysis

IF YOU HAVE ANY TECHNICAL ISSUES VIEWING THIS WEBINAR, YOU CAN CATCH UP ON THE FULL RECORDING ON OUR WEBSITE
2. Overview
Introduction: Overview; Visualizing a big table; Software
Interpretation: Proximities, angles, and lengths; Quality
Make it better: Removing 'outliers'; Rotation; Supplementary points
Data & algorithms: Appropriate data for correspondence analysis; Correspondence analysis of square tables; Choice of statistic; Multiple correspondence analysis; Composite tables
Interpretation again: Normalization
Visualization: Moonplots; Logos; Bubble charts; Comparing groups; Trends
End: Resources; Q&A
3. Typical input data: Brand association table (% of respondents associating each attribute with each brand)

Brand | Fun | Worth what you pay for | Innovative | Good customer service | Stylish | Easy to use | High quality | High performance | Low prices
Apple | 64% | 49% | 75% | 51% | 69% | 59% | 72% | 66% | 7%
Microsoft | 22% | 39% | 43% | 21% | 20% | 38% | 46% | 45% | 7%
IBM | 3% | 6% | 15% | 4% | 5% | 7% | 21% | 23% | 4%
Google | 63% | 40% | 59% | 27% | 32% | 58% | 40% | 42% | 17%
Intel | 4% | 15% | 19% | 8% | 5% | 10% | 21% | 23% | 3%
Hewlett-Packard | 5% | 21% | 15% | 13% | 15% | 19% | 31% | 25% | 12%
Sony | 25% | 36% | 28% | 18% | 36% | 34% | 48% | 36% | 12%
Dell | 6% | 15% | 10% | 12% | 11% | 17% | 21% | 18% | 22%
Yahoo | 14% | 7% | 9% | 6% | 3% | 14% | 7% | 7% | 11%
Nokia | 5% | 16% | 11% | 12% | 12% | 25% | 22% | 12% | 25%
Samsung | 29% | 43% | 50% | 30% | 52% | 51% | 49% | 46% | 21%
LG | 16% | 36% | 28% | 18% | 31% | 35% | 38% | 29% | 34%
Panasonic | 10% | 27% | 20% | 13% | 23% | 27% | 35% | 24% | 22%
None of these | 14% | 9% | 5% | 21% | 10% | 5% | 4% | 6% | 31%
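A correspondence analysis map of a table like this can be computed directly via singular value decomposition of the standardized residuals. The sketch below applies the textbook algorithm to a small subset of the table (four brands, four attributes, counts used as-is); it is an illustration, not the code behind the webinar's package.

```python
import numpy as np

# Subset of the brand association table: rows Apple, Microsoft, IBM,
# Google; columns Fun, Worth what you pay for, Innovative, Good
# customer service.
counts = np.array([[64, 49, 75, 51],
                   [22, 39, 43, 21],
                   [ 3,  6, 15,  4],
                   [63, 40, 59, 27]], dtype=float)

P = counts / counts.sum()            # correspondence matrix
r = P.sum(axis=1)                    # row masses
c = P.sum(axis=0)                    # column masses
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # std. residuals
U, sv, Vt = np.linalg.svd(S, full_matrices=False)

# Principal coordinates: scale singular vectors by singular values
# and by 1/sqrt(mass). Plot the first two columns of each for the map.
row_coords = (U * sv) / np.sqrt(r)[:, None]
col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
```

The squared singular values are the inertias of the dimensions; the "quality" numbers discussed later in the webinar are the share of each point's inertia captured by the two plotted dimensions.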
6. Software

Everything in this webinar can be done using R, with our flipDimensionReduction package on GitHub: https://github.com/Displayr/flipDimensionReduction

Everything we do today can also be done using Displayr: Insert > More > Dimension Reduction.

Everything in this webinar is demonstrated using Q (www.q-researchsoftware.com)
8. Interpretation 1: More similar brands (rows) are usually close together
9. Interpretation 2: The further a brand is from the origin, the more differentiated it is (usually)
10. Interpretation 3: More similar attributes (columns) are usually close together
11. Interpretation 4: The further an attribute is from the origin, the more differentiating it is (usually)
12. Interpretation 5: Relationships between brands and attributes are not determined by proximity

There is not a strong association between Easy to use, Stylish, and Samsung (Samsung is not differentiated; Easy to use is not differentiating).
13. Interpretation 6: The direction of association between brands and attributes is usually
determined by angle of the lines joining the brand and the attribute to the origin – example 1
There is a positive association between Low prices and Nokia, as the lines connecting them to the origin form a small (acute) angle.
14. Interpretation 6: The direction of association between brands and attributes is usually
determined by angle of the lines joining the brand and the attribute to the origin – example 2
There is no association between High quality and Nokia, as the angle formed by the lines connecting the brand and the attribute to the origin (0,0) is approximately 90 degrees.
15. Interpretation 6: The direction of association between brands and attributes is usually
determined by angle of the lines joining the brand and the attribute to the origin
There is a negative association between Innovative and Nokia, as they are on opposite sides of the origin.
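The angle rule above can be sketched numerically: the direction of association is given by the sign of the cosine of the angle between the two vectors from the origin. A minimal Python sketch; the coordinates and cosine thresholds below are illustrative assumptions, not values read off this deck:

```python
import math

def association_direction(brand_xy, attr_xy):
    """Classify a brand-attribute association by the angle at the origin:
    acute angle (cos > 0) -> positive, near 90 degrees (cos ~ 0) -> none,
    obtuse angle (cos < 0) -> negative."""
    dot = brand_xy[0] * attr_xy[0] + brand_xy[1] * attr_xy[1]
    cos_angle = dot / (math.hypot(*brand_xy) * math.hypot(*attr_xy))
    if cos_angle > 0.26:    # angle below roughly 75 degrees
        return "positive"
    if cos_angle < -0.26:   # angle above roughly 105 degrees
        return "negative"
    return "none"

# Hypothetical map coordinates echoing the examples above
nokia = (-1.2, 0.4)
low_prices = (-1.5, 0.6)   # small angle to Nokia -> positive association
innovative = (1.0, -0.2)   # opposite side of the origin -> negative
```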
16. Interpretation 7: The strength of association is usually proportional to the product of the
cosine of the angle, and the lengths of the lines from brand and attribute to origin – example 1
There is a strong positive association between Low prices and Nokia.
17. Interpretation 7: The strength of association is usually proportional to the product of the cosine
of the angle, and the lengths of the two lines from brand and attribute to origin – example 2
There is perhaps a very weak association between Easy to use and Samsung:
• The line to Easy to use from the origin is short
• The line to Samsung from the origin is moderate
• The angle at the origin is irrelevant because the lines are so short.
18. Interpretation 7: The strength of association is usually proportional to the product of the cosine
of the angle, and the lengths of the two lines from brand and attribute to origin – example 3
Nokia’s negative association with Innovative is stronger than LG’s.
19. Interpretation 7: The strength of association is usually proportional to the product of the cosine
of the angle, and the lengths of the two lines from brand and attribute to origin – example 4
There is a negative association between Low prices and Apple:
• The line to Low prices from the origin is long
• The line to Apple from the origin is moderate
• The angle at the origin is obtuse (this means negative).
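The rule for interpretation 7, cosine of the angle times the two line lengths, is algebraically just the dot product of the two coordinate vectors. A hedged Python sketch with hypothetical coordinates (not values from this map):

```python
def association_strength(brand_xy, attr_xy):
    """Strength of association ~ cos(angle) * |brand line| * |attribute line|.
    Expanding the cosine shows this is the dot product of the two vectors."""
    return brand_xy[0] * attr_xy[0] + brand_xy[1] * attr_xy[1]

# Hypothetical coordinates: Nokia's line is long, LG's is short,
# and both sit at an obtuse angle to Innovative.
innovative = (1.0, -0.2)
nokia = (-1.2, 0.4)
lg = (-0.3, 0.1)

# Nokia's negative association with Innovative is stronger (more negative)
# than LG's, because Nokia's line is longer at a similar angle.
```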
22. Interpretation 9: The biggest number in the raw data table will usually not be the biggest
indexed residual (i.e., strongest association)
Retail sales (millions) Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Food retailing 10,245 9,557 10,354 9,728 9,815 9,517 9,929 10,042 10,006 10,483 10,436 12,230
Household goods retailing 4,377 3,980 4,097 4,065 4,093 4,357 4,225 4,239 4,469 4,697 4,874 5,782
Clothes/Accessories 1,876 1,599 1,781 1,925 1,927 1,967 1,876 1,806 1,897 1,938 2,057 3,331
Department stores 1,519 1,156 1,452 1,451 1,450 1,596 1,468 1,294 1,394 1,497 1,684 2,850
Other retailing 3,305 3,257 3,399 3,356 3,429 3,414 3,493 3,562 3,602 3,643 4,051 4,860
Food service 3,432 3,187 3,435 3,452 3,431 3,314 3,573 3,648 3,696 3,717 3,679 4,047
23. Interpretation 10: Review the variance explained
(Select the Map: Create > Dimension Reduction > Diagnostic > Quality)
100% - 54.3% - 25.9% = 19.8% of the variance in the indexed residuals is not shown on the map. The map will be misleading in some ways.
24. Interpretation 11: Review the quality of the map for each brand (row)
(set Output to Diagnostics)
The map only explains 16% of the variance relating to Samsung.
25. Interpretation 12: Review the quality of the map for each attribute (column)
(set Output to Diagnostics)
The map largely ignores the attributes Easy to use and Stylish.
26. Interpretation 13: Check interesting results using the raw data
So, Samsung and Easy to use may still be related. (Note that all the earlier slides had the caveat “usually” regarding interpretation.)
27. Interpretation 14: Check interesting results using the raw data
% | Fun | Worth what you pay for | Innovative | Good customer service | Stylish | Easy to use | High quality | High performance | Low prices
Apple 64% 49% 75% 51% 69% 59% 72% 66% 7%
Microsoft 22% 39% 43% 21% 20% 38% 46% 45% 7%
IBM 3% 6% 15% 4% 5% 7% 21% 23% 4%
Google 63% 40% 59% 27% 32% 58% 40% 42% 17%
Intel 4% 15% 19% 8% 5% 10% 21% 23% 3%
Hewlett-Packard 5% 21% 15% 13% 15% 19% 31% 25% 12%
Sony 25% 36% 28% 18% 36% 34% 48% 36% 12%
Dell 6% 15% 10% 12% 11% 17% 21% 18% 22%
Yahoo 14% 7% 9% 6% 3% 14% 7% 7% 11%
Nokia 5% 16% 11% 12% 12% 25% 22% 12% 25%
Samsung 29% 43% 50% 30% 52% 51% 49% 46% 21%
LG 16% 36% 28% 18% 31% 35% 38% 29% 34%
Panasonic 10% 27% 20% 13% 23% 27% 35% 24% 22%
None of these 14% 9% 5% 21% 10% 5% 4% 6% 31%
28. Interpretation 15: Use standardized residuals to help interpret the raw data
(in Q, the arrows and colors are based on the standardized residuals)
% | Fun | Worth what you pay for | Innovative | Good customer service | Stylish | Easy to use | High quality | High performance | Low prices
Apple 64% 49% 75% 51% 69% 59% 72% 66% 7%
Microsoft 22% 39% 43% 21% 20% 38% 46% 45% 7%
IBM 3% 6% 15% 4% 5% 7% 21% 23% 4%
Google 63% 40% 59% 27% 32% 58% 40% 42% 17%
Intel 4% 15% 19% 8% 5% 10% 21% 23% 3%
Hewlett-Packard 5% 21% 15% 13% 15% 19% 31% 25% 12%
Sony 25% 36% 28% 18% 36% 34% 48% 36% 12%
Dell 6% 15% 10% 12% 11% 17% 21% 18% 22%
Yahoo 14% 7% 9% 6% 3% 14% 7% 7% 11%
Nokia 5% 16% 11% 12% 12% 25% 22% 12% 25%
Samsung 29% 43% 50% 30% 52% 51% 49% 46% 21%
LG 16% 36% 28% 18% 31% 35% 38% 29% 34%
Panasonic 10% 27% 20% 13% 23% 27% 35% 24% 22%
None of these 14% 9% 5% 21% 10% 5% 4% 6% 31%
Low prices and Fun are the most differentiating attributes. As they are not correlated with each other, they make up the first two dimensions, squeezing Stylish off the map.
Samsung is not well differentiated on most of the attributes.
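Standardized (Pearson) residuals divide the gap between observed and expected counts by the square root of the expected count. A small self-contained sketch with a made-up counts table (not the survey data above):

```python
import math

def pearson_residuals(table):
    """Standardized (Pearson) residuals: (observed - expected) / sqrt(expected),
    with expected counts computed from the table's margins."""
    n = sum(map(sum, table))
    row_tot = [sum(r) for r in table]
    col_tot = [sum(r[j] for r in table) for j in range(len(table[0]))]
    out = []
    for i, row in enumerate(table):
        expected = [row_tot[i] * col_tot[j] / n for j in range(len(row))]
        out.append([(o - e) / math.sqrt(e) for o, e in zip(row, expected)])
    return out

# Made-up 2x2 counts where the diagonal is over-represented:
res = pearson_residuals([[30, 10],
                         [10, 30]])
```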
29. Interpretation 16: The aspect ratio needs to be 1 for correct interpretation
[Chart: Google NPS by Age map, drawn with unequal axis scales (vertical gridlines .008 apart, horizontal gridlines .100 apart)]
This map has an aspect ratio of 12.5 (.100 / .008). This means that vertical distances are shown to be 12.5 times bigger than is appropriate.
[Chart: the same Google NPS by Age map, redrawn with an aspect ratio of 1]
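The distortion is easy to quantify. Using the principal coordinates from the correspondence analysis output reproduced in the Editor’s Notes, a pure-Python sketch (the function and `y_stretch` parameter are mine, for illustration):

```python
import math

# Principal coordinates for the Google NPS x Age map
# (from the correspondence analysis output in the Editor's Notes)
coords = {
    "Detractor": (-0.09, 0.01), "Passive": (-0.19, -0.01), "Promoter": (0.20, 0.00),
    "18 to 34": (0.15, 0.01), "35 to 49": (0.06, -0.01), "50 over": (-0.27, 0.00),
}

def apparent_distance(a, b, y_stretch=1.0):
    """Distance as the eye reads it when the vertical axis is stretched
    by y_stretch relative to the horizontal axis."""
    (x1, y1), (x2, y2) = coords[a], coords[b]
    return math.hypot(x2 - x1, (y2 - y1) * y_stretch)

true_d = apparent_distance("Promoter", "18 to 34")          # aspect ratio 1
warped_d = apparent_distance("Promoter", "18 to 34", 12.5)  # the distorted map
# Stretching the vertical axis makes the same pair look much further apart.
```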
30. Overview
31. Overview
32. When to use correspondence analysis
• When we have a table with:
• At least two rows
• At least two columns
• No missing values
• No negative values
• Data on the same scale: Does the table cease to make sense if it is sorted
by any of its rows or columns?
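The first four checks are mechanical, so they can be coded. A minimal sketch (the function name is mine, not an API from the packages mentioned earlier):

```python
def suitable_for_ca(table):
    """Screen a table against the requirements above: at least two rows,
    at least two columns, no missing values, no negative values."""
    if len(table) < 2 or any(len(row) < 2 for row in table):
        return False
    for row in table:
        for cell in row:
            if cell is None or cell < 0:
                return False
    return True

# The "data on the same scale" question is a judgment call,
# so it is left to the analyst rather than automated here.
```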
33. Overview
34. Interpretation 17: The default normalization settings of most correspondence analysis plots
misrepresent the associations between the brands and attributes
Normalization | Brand (row) relationships | Attribute (column) relationships | Brand-attribute associations
Principal | Proximity | Proximity | Angles and lengths (but angles and lengths are misrepresented)
Row principal | Proximity | Proximity, adjusting for variance explained | Angles and lengths
Row principal (scaled) | Proximity | Proximity, adjusting for variance explained | Angles and lengths
Column principal | Proximity, adjusting for variance explained | Proximity | Angles and lengths
Column principal (scaled) | Proximity, adjusting for variance explained | Proximity | Angles and lengths
Symmetrical | ½ Proximity, ½ adjusting for variance explained | ½ Proximity, ½ adjusting for variance explained | Angles and lengths
35. Overview
36. Resources
• Correspondence Analysis in Practice, 3rd Edition, by Michael Greenacre (Chapman & Hall/CRC Interdisciplinary Statistics, 2017)
• 18 posts on various aspects of correspondence analysis on our blog:
www.displayr.com/blog
• The Q wiki: http://wiki.q-researchsoftware.com/wiki/Main_Page
• All the source code: https://github.com/Displayr/flipDimensionReduction
37. TIM BOCK PRESENTS
Q&A
www.q-researchsoftware.com/webinars
DIY Market Mapping
Using Correspondence Analysis
Editor's Notes
Hello and welcome to Automate or Die.
My name is Matt Steele and I’m part of Q’s London-based team.
Today, we’re exploring the topic of automation in quantitative research.
We’ll be looking at theoretical as well as practical expressions of automation.
As noted on the screen, you can submit questions as we go along.
I’ll do my best to answer them at the end, but if I don’t get to answer all of them, we’ll be collating and posting the Q&A’s on our website.
We’ll also be posting a recording of this webinar so you can rewatch any of the material later.
OK with that continuum in mind, let’s look at the first of 8 ways we can automate our work in quant research
%, Top 2 Boxes, Means
Correspondence Analysis (Traditional)
Inertia(s):
Canonical Correlation Inertia Proportion
Dimension 1 .169 .029 .997
Dimension 2 .010 .000 .003
Standard Coordinates: Google NPS
Dimension 1 Dimension 2
Detractor -.52 1.44
Passive -1.11 -1.09
Promoter 1.17 -.27
Principal Coordinates: Google NPS
Dimension 1 Dimension 2
Detractor -.09 .01
Passive -.19 -.01
Promoter .20 .00
Standard Coordinates: Age
Dimension 1 Dimension 2
18 to 34 .87 1.12
35 to 49 .37 -1.18
50 over -1.60 .35
Principal Coordinates: Age
Dimension 1 Dimension 2
18 to 34 .15 .01
35 to 49 .06 -.01
50 over -.27 .00
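A relationship worth noting in the output above: each principal coordinate is the standard coordinate scaled by that dimension’s canonical correlation (the singular value). This pure-Python check reproduces the printed principal coordinates (to the two decimals shown) from the standard ones:

```python
sv = [0.169, 0.010]  # canonical correlations per dimension, from the output above

standard = {  # standard coordinates from the output above
    "Detractor": (-0.52, 1.44), "Passive": (-1.11, -1.09), "Promoter": (1.17, -0.27),
    "18 to 34": (0.87, 1.12), "35 to 49": (0.37, -1.18), "50 over": (-1.60, 0.35),
}

# Principal coordinate = standard coordinate * singular value of the dimension
principal = {label: [round(x * s, 2) for x, s in zip(xy, sv)]
             for label, xy in standard.items()}
```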
OK so now time for Q&A.
Again, if I don’t get to answer all of these, we’ll put them on the website.