- 1. TIM BOCK PRESENTS: DIY Market Mapping Using Correspondence Analysis. If you have any questions, enter them into the Questions field. Questions will be answered at the end. If we do not have time to get to your question, we will email you. We will email you a link to the video, slides, and data. Get a free one-month trial of Q from www.q-researchsoftware.com. If you have any technical issues viewing this webinar, you can catch up on the full recording on our website.
- 2. Overview
  - Introduction: Overview; Visualizing a big table; Software
  - Interpretation: Proximities, angles, and lengths; Quality
  - Make it better: Removing ‘outliers’; Rotation; Supplementary points
  - Data & algorithms: Appropriate data for correspondence analysis; Correspondence analysis of square tables; Choice of statistic; Multiple correspondence analysis; Composite tables
  - Interpretation again: Normalization
  - Visualization: Moonplots; Logos; Bubble charts; Comparing groups; Trends
  - End: Resources; Q&A
- 3. Typical input data: Brand association table

  | % | Fun | Worth what you pay for | Innovative | Good customer service | Stylish | Easy to use | High quality | High performance | Low prices |
  |---|---|---|---|---|---|---|---|---|---|
  | Apple | 64% | 49% | 75% | 51% | 69% | 59% | 72% | 66% | 7% |
  | Microsoft | 22% | 39% | 43% | 21% | 20% | 38% | 46% | 45% | 7% |
  | IBM | 3% | 6% | 15% | 4% | 5% | 7% | 21% | 23% | 4% |
  | Google | 63% | 40% | 59% | 27% | 32% | 58% | 40% | 42% | 17% |
  | Intel | 4% | 15% | 19% | 8% | 5% | 10% | 21% | 23% | 3% |
  | Hewlett-Packard | 5% | 21% | 15% | 13% | 15% | 19% | 31% | 25% | 12% |
  | Sony | 25% | 36% | 28% | 18% | 36% | 34% | 48% | 36% | 12% |
  | Dell | 6% | 15% | 10% | 12% | 11% | 17% | 21% | 18% | 22% |
  | Yahoo | 14% | 7% | 9% | 6% | 3% | 14% | 7% | 7% | 11% |
  | Nokia | 5% | 16% | 11% | 12% | 12% | 25% | 22% | 12% | 25% |
  | Samsung | 29% | 43% | 50% | 30% | 52% | 51% | 49% | 46% | 21% |
  | LG | 16% | 36% | 28% | 18% | 31% | 35% | 38% | 29% | 34% |
  | Panasonic | 10% | 27% | 20% | 13% | 23% | 27% | 35% | 24% | 22% |
  | None of these | 14% | 9% | 5% | 21% | 10% | 5% | 4% | 6% | 31% |
- 4. A market map / brand map / map correspondence analysis scatterplot / correspondence analysis biplot
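To make the link between the table and the map concrete, here is a minimal sketch of the textbook correspondence analysis computation (a singular value decomposition of the standardized residuals) in Python with NumPy. It is not the code behind Q, Displayr, or flipDimensionReduction. The function name `correspondence_analysis` is hypothetical, and the input is a four-brand, four-attribute subset of the brand association table on slide 3, chosen only to keep the example short.

```python
import numpy as np

brands = ["Apple", "Google", "Dell", "Nokia"]
attributes = ["Fun", "Innovative", "High quality", "Low prices"]
# Subset of the slide 3 table (percentages), used purely for illustration.
table = np.array([
    [64, 75, 72, 7],
    [63, 59, 40, 17],
    [6, 10, 21, 22],
    [5, 11, 22, 25],
], dtype=float)


def correspondence_analysis(counts):
    """Return row and column principal coordinates and the singular values."""
    p = counts / counts.sum()              # cell proportions
    row_mass = p.sum(axis=1)               # row totals as proportions
    col_mass = p.sum(axis=0)               # column totals as proportions
    expected = np.outer(row_mass, col_mass)
    standardized = (p - expected) / np.sqrt(expected)
    u, singular_values, vt = np.linalg.svd(standardized, full_matrices=False)
    # Standard coordinates, then principal coordinates (scaled by singular values).
    row_standard = u / np.sqrt(row_mass)[:, None]
    col_standard = vt.T / np.sqrt(col_mass)[:, None]
    row_principal = row_standard * singular_values
    col_principal = col_standard * singular_values
    return row_principal, col_principal, singular_values


rows, cols, sv = correspondence_analysis(table)
for name, (x, y) in zip(brands, rows[:, :2]):
    print(f"{name:8s} {x:+.3f} {y:+.3f}")   # first two dimensions = map position
```

The first two columns of `rows` and `cols` are what a two-dimensional map would plot. The signs of the axes are arbitrary, so a map produced by other software may be mirrored.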
- 5. Software. Everything in this webinar can be done using R, with our flipDimensionReduction package on GitHub: https://github.com/Displayr/flipDimensionReduction. Everything we do today can be done using Displayr: Insert > More > Dimension Reduction. Everything in this webinar is demonstrated using Q (www.q-researchsoftware.com).
- 7. Interpretation 1: More similar brands (rows) are usually close together. (Labels on the map: Similar, Similar, Not similar.)
- 8. Interpretation 2: The further a brand is from the origin, the more differentiated it is (usually). (Labels on the map: Differentiated, Undifferentiated.)
- 9. Interpretation 3: More similar attributes (columns) are usually close together. (Labels on the map: Similar, Not similar.)
- 10. Interpretation 4: The further an attribute is from the origin, the more differentiating it is (usually). (Labels on the map: Differentiating, Not differentiating.)
- 11. Interpretation 5: Relationships between brands and attributes are not determined by proximity. There is not a strong association between Easy to use, Stylish and Samsung (Samsung is not differentiated; Easy to use is not differentiating).
- 12. Interpretation 6: The direction of association between brands and attributes is usually determined by the angle between the lines joining the brand and the attribute to the origin (example 1). There is a positive association between Low prices and Nokia, as the lines connecting them to the origin form a small (acute) angle.
- 13. Interpretation 6: The direction of association between brands and attributes is usually determined by the angle between the lines joining the brand and the attribute to the origin (example 2). There is no association between High quality and Nokia, as the angle formed by the lines connecting the brand and attribute to the origin (0,0) is approximately 90 degrees.
- 14. Interpretation 6: The direction of association between brands and attributes is usually determined by the angle between the lines joining the brand and the attribute to the origin (example 3). There is a negative association between Innovative and Nokia, as they are on opposite sides of the origin.
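The angle rule in Interpretation 6 can be sketched in a few lines of Python. The function names and the 10-degree tolerance around 90 degrees are assumptions made for illustration, not values from the webinar, and the coordinates in the usage lines are made up rather than read off the map.

```python
import math


def angle_at_origin(brand_xy, attribute_xy):
    """Angle in degrees between the lines joining each point to the origin."""
    bx, by = brand_xy
    ax, ay = attribute_xy
    dot = bx * ax + by * ay
    norm = math.hypot(bx, by) * math.hypot(ax, ay)
    cosine = max(-1.0, min(1.0, dot / norm))   # guard against rounding error
    return math.degrees(math.acos(cosine))


def direction(angle_degrees, tolerance=10.0):
    """Classify the direction of association; the tolerance is hypothetical."""
    if angle_degrees < 90.0 - tolerance:
        return "positive"
    if angle_degrees > 90.0 + tolerance:
        return "negative"
    return "none"


# Hypothetical map coordinates for a brand and an attribute.
print(direction(angle_at_origin((0.4, -0.3), (0.5, -0.2))))   # acute angle
print(direction(angle_at_origin((0.4, -0.3), (-0.5, 0.3))))   # obtuse angle
```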
- 15. Interpretation 7: The strength of association is usually proportional to the product of the cosine of the angle and the lengths of the two lines from brand and attribute to origin (example 1). There is a strong positive association between Low prices and Nokia.
- 16. Interpretation 7: The strength of association is usually proportional to the product of the cosine of the angle and the lengths of the two lines from brand and attribute to origin (example 2). There is perhaps a very weak association between Easy to use and Samsung:
  - The line to Easy to use from the origin is short.
  - The line to Samsung from the origin is moderate.
  - The angle at the origin is irrelevant because the lines are so short.
- 17. Interpretation 7: The strength of association is usually proportional to the product of the cosine of the angle and the lengths of the two lines from brand and attribute to origin (example 3). Nokia’s negative association with Innovative is stronger than LG’s.
- 18. Interpretation 7: The strength of association is usually proportional to the product of the cosine of the angle and the lengths of the two lines from brand and attribute to origin (example 4). There is a negative association between Low prices and Apple:
  - The line to Low prices from the origin is long.
  - The line to Apple from the origin is moderate.
  - The angle at the origin is obtuse (this means negative).
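The product described in Interpretation 7 (cosine of the angle times the two lengths) is the dot product of the two points' coordinates, which makes it easy to compute. The sketch below assumes hypothetical coordinates; the brand names are borrowed from example 3 only to show how two brands would be compared, and the numbers are not taken from the map.

```python
def association_strength(brand_xy, attribute_xy):
    """cos(angle) * length(brand) * length(attribute), i.e. the dot product."""
    return brand_xy[0] * attribute_xy[0] + brand_xy[1] * attribute_xy[1]


# Hypothetical coordinates, for illustration only.
innovative = (-0.30, 0.10)
nokia = (0.60, -0.25)
lg = (0.35, -0.05)

print(association_strength(nokia, innovative))
print(association_strength(lg, innovative))
```

With these made-up coordinates both values are negative, and the one for the point further from the origin is larger in magnitude, which is the pattern example 3 describes.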
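- 19. Interpretation 8: Indexed residual ≈ cos(angle) × length of line to attribute × length of line to brand. Worked example: expected value = 27.2% × 41.1% = 11.1%; residual = 7.6% − 11.1% = −3.5%; indexed residual = −3.5% / 11.1% = −.317 (a ratio, not a percentage). Cosine reference: cos(0°) = 1.00; cos(10°) = .98; cos(45°) = .71; cos(90°) = .00; cos(135°) = −.71; cos(170°) = −.98; cos(180°) = −1.00.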
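The arithmetic on this slide can be reproduced with a short function. The parameter names `row_share` and `column_share` are an assumption about what the 27.2% and 41.1% represent (the two marginal proportions whose product gives the expected value); the slide itself does not label them.

```python
def indexed_residual(observed, row_share, column_share):
    """(observed - expected) / expected, where expected = row_share * column_share."""
    expected = row_share * column_share
    return (observed - expected) / expected


value = indexed_residual(observed=0.076, row_share=0.272, column_share=0.411)
print(f"{value:.3f}")   # prints -0.320
```

Using the rounded inputs gives about −.32, slightly different from the −.317 on the slide, which was presumably computed from unrounded figures.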
- 20. Interpretation 9: The biggest number in the raw data table will usually not be the biggest indexed residual (i.e., strongest association).

  | Retail sales (millions) | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
  |---|---|---|---|---|---|---|---|---|---|---|---|---|
  | Food retailing | 10,245 | 9,557 | 10,354 | 9,728 | 9,815 | 9,517 | 9,929 | 10,042 | 10,006 | 10,483 | 10,436 | 12,230 |
  | Household goods retailing | 4,377 | 3,980 | 4,097 | 4,065 | 4,093 | 4,357 | 4,225 | 4,239 | 4,469 | 4,697 | 4,874 | 5,782 |
  | Clothes/Accessories | 1,876 | 1,599 | 1,781 | 1,925 | 1,927 | 1,967 | 1,876 | 1,806 | 1,897 | 1,938 | 2,057 | 3,331 |
  | Department stores | 1,519 | 1,156 | 1,452 | 1,451 | 1,450 | 1,596 | 1,468 | 1,294 | 1,394 | 1,497 | 1,684 | 2,850 |
  | Other retailing | 3,305 | 3,257 | 3,399 | 3,356 | 3,429 | 3,414 | 3,493 | 3,562 | 3,602 | 3,643 | 4,051 | 4,860 |
  | Food service | 3,432 | 3,187 | 3,435 | 3,452 | 3,431 | 3,314 | 3,573 | 3,648 | 3,696 | 3,717 | 3,679 | 4,047 |
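Interpretation 9 can be checked directly on the retail sales table. The sketch below computes the indexed residual (observed divided by expected, minus 1) for every cell and compares the largest raw number with the largest indexed residual. The variable names are illustrative.

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
sales = {
    "Food retailing": [10245, 9557, 10354, 9728, 9815, 9517, 9929, 10042, 10006, 10483, 10436, 12230],
    "Household goods retailing": [4377, 3980, 4097, 4065, 4093, 4357, 4225, 4239, 4469, 4697, 4874, 5782],
    "Clothes/Accessories": [1876, 1599, 1781, 1925, 1927, 1967, 1876, 1806, 1897, 1938, 2057, 3331],
    "Department stores": [1519, 1156, 1452, 1451, 1450, 1596, 1468, 1294, 1394, 1497, 1684, 2850],
    "Other retailing": [3305, 3257, 3399, 3356, 3429, 3414, 3493, 3562, 3602, 3643, 4051, 4860],
    "Food service": [3432, 3187, 3435, 3452, 3431, 3314, 3573, 3648, 3696, 3717, 3679, 4047],
}

row_totals = {category: sum(values) for category, values in sales.items()}
column_totals = [sum(column) for column in zip(*sales.values())]
grand_total = sum(column_totals)

raw = {}
indexed = {}
for category, values in sales.items():
    for j, observed in enumerate(values):
        # Expected value if category and month were unrelated.
        expected = row_totals[category] * column_totals[j] / grand_total
        raw[(category, months[j])] = observed
        indexed[(category, months[j])] = observed / expected - 1.0

largest_raw = max(raw, key=raw.get)
strongest = max(indexed, key=indexed.get)
print("Largest raw value:       ", largest_raw)
print("Largest indexed residual:", strongest, round(indexed[strongest], 2))
```

The largest raw number is Food retailing in December, but its indexed residual is negative (food sales rise less in December than sales overall), while the largest indexed residual belongs to Department stores in December.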
- 21. Interpretation 10: Review the variance explained (select the map, then Create > Dimension Reduction > Diagnostic > Quality). 100% − 54.3% − 25.9% = 19.8% of the variance in the indexed residuals is not shown on the map. The map will be misleading in some ways.
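As a rough sketch of where these percentages come from: in standard correspondence analysis, each dimension's share of variance is its squared singular value divided by the sum of all squared singular values. The function names and the singular values in the usage line are hypothetical; only the final line uses the figures from the slide.

```python
def variance_explained(singular_values):
    """Share of total variance (inertia) carried by each dimension."""
    inertias = [s ** 2 for s in singular_values]
    total = sum(inertias)
    return [inertia / total for inertia in inertias]


def share_not_shown(singular_values, dimensions_plotted=2):
    """Variance left out of a map that plots only the first dimensions."""
    shares = variance_explained(singular_values)
    return 1.0 - sum(shares[:dimensions_plotted])


# Hypothetical singular values, for illustration only.
print(share_not_shown([2.0, 1.0, 1.0]))

# The slide's own arithmetic: 100% - 54.3% - 25.9%.
print(round(1.0 - 0.543 - 0.259, 3))   # prints 0.198
```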
- 22. Interpretation 11: Review the quality of the map for each brand (row) (set Output to Diagnostics). The map only explains 16% of the variance relating to Samsung.
- 23. Interpretation 12: Review the quality of the map for each attribute (column) (set Output to Diagnostics). The map largely ignores the attributes Easy to use and Stylish.
- 24. Interpretation 13: Check interesting results using the raw data. So, Samsung and Easy to use may still be related. (Note that all the earlier slides had the caveat “usually” regarding interpretation.)
- 25. Interpretation 14: Check interesting results using the raw data. (The slide repeats the brand association table from slide 3.)
- 26. Interpretation 15: Use standardized residuals to help interpret the raw data (in Q, the arrows and colors are based on the standardized residuals). (The slide repeats the brand association table from slide 3.) Low prices and Fun are the most differentiating attributes. As they are not correlated with each other, they make up the first two dimensions, squeezing Stylish off the map. Samsung is not well differentiated on most of the attributes.
- 27. Interpretation 16: The aspect ratio needs to be 1 for correct interpretation. (Figure: two maps of Google NPS (Detractor, Passive, Promoter) by Age (18 to 34, 35 to 49, 50 over). In the first, the horizontal axis runs from −0.3 to 0.3 and the vertical axis from −0.015 to 0.02, with spans of .100 and .008 marked.) The first map has an aspect ratio of 12.5 (.1 / .008). This means that vertical distances are shown to be 12.5 times bigger than is appropriate. The second map has an aspect ratio of 1.
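The aspect ratio calculation on this slide is a single division. The sketch below assumes that .100 and .008 are the data units covered by the same on-screen length horizontally and vertically; the function and parameter names are hypothetical.

```python
def aspect_ratio(horizontal_units, vertical_units):
    """How much vertical distances are exaggerated relative to horizontal ones.

    Both arguments are the data units covered by the same on-screen length.
    """
    return horizontal_units / vertical_units


ratio = aspect_ratio(horizontal_units=0.1, vertical_units=0.008)
print(round(ratio, 1))   # prints 12.5
```

A ratio of 1 means one unit looks the same length in both directions, which is what the angle and distance interpretations above rely on.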
- 30. When to use correspondence analysis: when we have a table with:
  - At least two rows
  - At least two columns
  - No missing values
  - No negatives
  - Data on the same scale: Does the table cease to make sense if it is sorted by any of its rows or columns?
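The first four conditions above can be checked mechanically. The sketch below is one possible check, with a hypothetical function name and a table stored as a list of rows. The fifth condition (data on the same scale) is a judgment call and is not automated here.

```python
import math


def check_table(table):
    """Return a list of problems that rule out correspondence analysis."""
    problems = []
    if len(table) < 2:
        problems.append("fewer than two rows")
    if not table or min(len(row) for row in table) < 2:
        problems.append("fewer than two columns")
    values = [value for row in table for value in row]
    # Treat None and NaN as missing values.
    if any(v is None or (isinstance(v, float) and math.isnan(v)) for v in values):
        problems.append("missing values")
    if any(v is not None and v < 0 for v in values):
        problems.append("negative values")
    return problems


print(check_table([[64, 49], [22, 39]]))        # prints []
print(check_table([[64, None], [22, -39]]))     # prints ['missing values', 'negative values']
```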
- 32. Interpretation 17: The default normalization settings of most correspondence analysis plots misrepresent the associations between the brands and attributes.

  | Normalization | How to interpret brand relationships | How to interpret attribute relationships | How to interpret brand-attribute associations |
  |---|---|---|---|
  | Principal | Proximity | Proximity | Angles and lengths (but angles and lengths are misrepresented) |
  | Row principal | Proximity | Proximity, adjusting for variance explained | Angles and lengths |
  | Row principal (scaled) | Proximity | Proximity, adjusting for variance explained | Angles and lengths |
  | Column principal | Proximity, adjusting for variance explained | Proximity | Angles and lengths |
  | Column principal (scaled) | Proximity, adjusting for variance explained | Proximity | Angles and lengths |
  | Symmetrical ½ | Proximity, ½ adjusting for variance explained | Proximity, ½ adjusting for variance explained | Angles and lengths |
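One way to read the table above is in terms of how each normalization scales the standard coordinates by the singular values. The sketch below follows the usual textbook convention (principal coordinates are standard coordinates multiplied by the singular values; the symmetrical option uses their square roots). That mapping is an assumption rather than something stated on the slide, the two "(scaled)" variants are deliberately left out, and all names and numbers are hypothetical.

```python
# Power of the singular value applied to (row, column) standard coordinates.
POWERS = {
    "principal": (1.0, 1.0),
    "row principal": (1.0, 0.0),
    "column principal": (0.0, 1.0),
    "symmetrical": (0.5, 0.5),
}


def scale(standard_coordinates, singular_values, power):
    """Multiply each dimension of each point by singular_value ** power."""
    return [
        [coordinate * s ** power for coordinate, s in zip(point, singular_values)]
        for point in standard_coordinates
    ]


def normalize(row_standard, column_standard, singular_values, normalization):
    row_power, column_power = POWERS[normalization]
    return (
        scale(row_standard, singular_values, row_power),
        scale(column_standard, singular_values, column_power),
    )


# Hypothetical standard coordinates (one brand, one attribute) and singular values.
rows, columns = normalize([[1.0, 2.0]], [[-1.0, 0.5]], [0.25, 0.04], "row principal")
print(rows, columns)
```

Under "row principal" the column points keep their standard coordinates, which is why the table says attribute proximities need adjusting for variance explained under that setting.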
- 34. Resources
  - Correspondence Analysis in Practice, Third Edition (Chapman & Hall/CRC Interdisciplinary Statistics), by Michael Greenacre (2017)
  - 18 posts on various aspects of correspondence analysis on our blog: www.displayr.com/blog
  - The Q wiki: http://wiki.q-researchsoftware.com/wiki/Main_Page
  - All the source code: https://github.com/Displayr/flipDimensionReduction
- 35. TIM BOCK PRESENTS: DIY Market Mapping Using Correspondence Analysis. Q&A. www.q-researchsoftware.com/webinars