Looking for Patterns in the Data
Ray Poynter
27 January 2022
Finding and Communicating the Story in the Data
Ray Poynter
To be published Mid-2022
The Framework
1. Define the Problem
2. Assess the Wider Context
3. Find the Big Picture
4. Extract the Key Findings
5. Determine the Message
6. Create the Story
7. Communicate the Story
8. Follow Up
1 Define the Problem
If you don’t define the problem properly, you are unlikely to find the answer
The process includes:
• What is the business question?
• What are the research questions?
• What do we already know?
• What does success look like?
• What does the business plan to do after receiving the answers?
• What are the predictions?
Finding the Patterns in the Data
1. Using the question as a lens
2. Making the patterns easier to see
3. Look for Connections, Correlations, Contradictions, Curiosities and Surprises
4. Assemble the story
Making the Patterns Easier to See
• Making numbers more visual
• Sorting
• Comparing
• Benchmarks
• Derived variables
• Analytics
Per Capita Govt. Spend on Health - $
183 Countries, 1995 to 2010, source Gapminder
3 Significant Digits – Units of $10
Divide by 10, display no decimals (but don’t delete them)
Three digits good
Two digits medium
One digit poor
Sorting – Filters in Excel
Don’t look at data in alphabetical order or in questionnaire order
Pattern 1, Best = USA + Northern Europe
Pattern 2, Worst = Africa overrepresented
Surprise = Niue (Pacific island, pop < 2,000, it’s an outlier)
Looking at the Shape of the Data
Steps indicate possible curiosities, e.g. OECD, EU etc.
Index, 1995 = 100
Pattern 1, the most improved tended to have low incomes per person in 1995
Pattern 2, those who declined tended to have low incomes per person in 1995
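A minimal sketch of the indexing step (Index, 1995 = 100). The spend figures below are made up for illustration and are not Gapminder values; the function name index_to_base is also just an assumption for this example.

```python
# Hypothetical per-capita spend figures for one country (NOT real Gapminder data).
spend_by_year = {1995: 50.0, 2000: 60.0, 2005: 75.0, 2010: 45.0}

def index_to_base(series, base_year):
    """Rescale a series so that the base year equals 100."""
    base_value = series[base_year]
    return {year: 100 * value / base_value for year, value in series.items()}

indexed = index_to_base(spend_by_year, base_year=1995)
for year, value in indexed.items():
    print(year, round(value))
```

An index above 100 means the country improved relative to 1995, and below 100 means it declined, which makes countries with very different spending levels comparable on one chart.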
Diagonalizing Brand Attribute Data
% Agreeing Sexy Cheap Stylish Friendly Strong Traditional
Brand 1 41.73% 63.40% 20.88% 71.20% 41.67% 31.66%
Brand 2 87.21% 22.16% 64.50% 40.02% 96.24% 61.31%
Brand 3 71.52% 48.80% 53.86% 30.08% 27.88% 28.12%
Brand 4 97.23% 9.70% 93.97% 19.89% 96.99% 98.19%
Brand 5 92.70% 12.73% 88.36% 34.63% 30.03% 26.98%
Brand 6 30.58% 60.82% 20.61% 55.34% 75.00% 64.70%
Brand 7 11.47% 72.02% 1.93% 83.76% 59.86% 28.94%
Hide the Decimal Places
% Agreeing Sexy Cheap Stylish Friendly Strong Traditional
Brand 1 42% 63% 21% 71% 42% 32%
Brand 2 87% 22% 65% 40% 96% 61%
Brand 3 72% 49% 54% 30% 28% 28%
Brand 4 97% 10% 94% 20% 97% 98%
Brand 5 93% 13% 88% 35% 30% 27%
Brand 6 31% 61% 21% 55% 75% 65%
Brand 7 11% 72% 2% 84% 60% 29%
Note, don’t change the numbers, just the way they are displayed
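Outside Excel, the same idea (change the display, not the numbers) might look like the sketch below. It uses three of the Brand 1 values from the table above; the helper name show is an assumption for this example.

```python
# Full-precision values stay in the data; only the displayed text is rounded.
brand_1 = {"Sexy": 41.73, "Cheap": 63.40, "Stylish": 20.88}

def show(values):
    """Return display strings with no decimal places, leaving the inputs untouched."""
    return {name: f"{value:.0f}%" for name, value in values.items()}

displayed = show(brand_1)
print(displayed)
```

Because the stored values keep their decimals, later sums and distances are calculated on the full-precision numbers rather than on the rounded ones.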
Conditional Formatting
% Agreeing Sexy Cheap Stylish Friendly Strong Traditional
Brand 1 42% 63% 21% 71% 42% 32%
Brand 2 87% 22% 65% 40% 96% 61%
Brand 3 72% 49% 54% 30% 28% 28%
Brand 4 97% 10% 94% 20% 97% 98%
Brand 5 93% 13% 88% 35% 30% 27%
Brand 6 31% 61% 21% 55% 75% 65%
Brand 7 11% 72% 2% 84% 60% 29%
We can see there are differences, but the patterns are not clear yet
Sum the Rows and Columns
% Agreeing Sexy Cheap Stylish Friendly Strong Traditional Sum
Brand 1 42% 63% 21% 71% 42% 32% 271%
Brand 2 87% 22% 65% 40% 96% 61% 371%
Brand 3 72% 49% 54% 30% 28% 28% 260%
Brand 4 97% 10% 94% 20% 97% 98% 416%
Brand 5 93% 13% 88% 35% 30% 27% 285%
Brand 6 31% 61% 21% 55% 75% 65% 307%
Brand 7 11% 72% 2% 84% 60% 29% 258%
Sum 432% 290% 344% 335% 428% 340%
Sort the Rows
% Agreeing Sexy Cheap Stylish Friendly Strong Traditional Sum
Brand 4 97% 10% 94% 20% 97% 98% 416%
Brand 2 87% 22% 65% 40% 96% 61% 371%
Brand 6 31% 61% 21% 55% 75% 65% 307%
Brand 5 93% 13% 88% 35% 30% 27% 285%
Brand 1 42% 63% 21% 71% 42% 32% 271%
Brand 3 72% 49% 54% 30% 28% 28% 260%
Brand 7 11% 72% 2% 84% 60% 29% 258%
Sum 432% 290% 344% 335% 428% 340%
Sort the rows by Sum, largest number at the top
Sort the Columns
% Agreeing Sexy Strong Stylish Traditional Friendly Cheap Sum
Brand 4 97% 97% 94% 98% 20% 10% 416%
Brand 2 87% 96% 65% 61% 40% 22% 371%
Brand 6 31% 75% 21% 65% 55% 61% 307%
Brand 5 93% 30% 88% 27% 35% 13% 285%
Brand 1 42% 42% 21% 32% 71% 63% 271%
Brand 3 72% 28% 54% 28% 30% 49% 260%
Brand 7 11% 60% 2% 29% 84% 72% 258%
Sum 432% 428% 344% 340% 335% 290%
In Excel, select Options in the Sort dialogue to access left-to-right sorting
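The sum-and-sort steps can also be sketched in code. This is an illustration in plain Python rather than the Excel workflow the slides use; it takes the unrounded values from the first brand table, and the variable names are assumptions for the example.

```python
attributes = ["Sexy", "Cheap", "Stylish", "Friendly", "Strong", "Traditional"]
data = {
    "Brand 1": [41.73, 63.40, 20.88, 71.20, 41.67, 31.66],
    "Brand 2": [87.21, 22.16, 64.50, 40.02, 96.24, 61.31],
    "Brand 3": [71.52, 48.80, 53.86, 30.08, 27.88, 28.12],
    "Brand 4": [97.23, 9.70, 93.97, 19.89, 96.99, 98.19],
    "Brand 5": [92.70, 12.73, 88.36, 34.63, 30.03, 26.98],
    "Brand 6": [30.58, 60.82, 20.61, 55.34, 75.00, 64.70],
    "Brand 7": [11.47, 72.02, 1.93, 83.76, 59.86, 28.94],
}

# Sum each row (brand) and each column (attribute).
row_sums = {brand: sum(values) for brand, values in data.items()}
col_sums = {
    attr: sum(values[i] for values in data.values())
    for i, attr in enumerate(attributes)
}

# Largest sums first: rows top to bottom, columns left to right.
row_order = sorted(data, key=row_sums.get, reverse=True)
col_order = sorted(attributes, key=col_sums.get, reverse=True)

print(" | ".join(["% Agreeing"] + col_order + ["Sum"]))
for brand in row_order:
    cells = [f"{data[brand][attributes.index(a)]:.0f}%" for a in col_order]
    print(" | ".join([brand] + cells + [f"{row_sums[brand]:.0f}%"]))
```

The printed row and column order should match the sorted table on the slide, with Brand 4 at the top and Sexy on the left.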
Calculate Distances from top row & left column
% Agreeing Sexy Strong Stylish Traditional Friendly Cheap Distance
Brand 4 97% 97% 94% 98% 20% 10% 0%
Brand 2 87% 96% 65% 61% 40% 22% 54%
Brand 6 31% 75% 21% 65% 55% 61% 124%
Brand 5 93% 30% 88% 27% 35% 13% 99%
Brand 1 42% 42% 21% 32% 71% 63% 146%
Brand 3 72% 28% 54% 28% 30% 49% 117%
Brand 7 11% 60% 2% 29% 84% 72% 173%
Distance 0% 101% 38% 92% 141% 154%
To calculate the distances, use the formula =SQRT(SUMXMY2(range1,range2))
SUMXMY2 sums the squared differences between the two ranges, so the square root gives the straight-line (Euclidean) distance, i.e. Pythagoras.
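For readers working outside Excel, here is a rough Python equivalent of =SQRT(SUMXMY2(range1,range2)). The function names mirror the Excel formula but are assumptions for this sketch; the two rows are the unrounded Brand 4 and Brand 2 values from the first table, in the original column order.

```python
import math

def sumxmy2(xs, ys):
    """Sum of squared differences between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys))

def distance(xs, ys):
    """Straight-line (Euclidean) distance between two rows or two columns."""
    return math.sqrt(sumxmy2(xs, ys))

brand_4 = [97.23, 9.70, 93.97, 19.89, 96.99, 98.19]
brand_2 = [87.21, 22.16, 64.50, 40.02, 96.24, 61.31]

print(round(distance(brand_4, brand_4)))  # the top row compared with itself
print(round(distance(brand_4, brand_2)))  # Brand 2 compared with the top row
```

The values here are in percentage points, so a result of 54 corresponds to the 54% shown for Brand 2 in the Distance column.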
Sort the Rows
% Agreeing Sexy Strong Stylish Traditional Friendly Cheap Distance
Brand 4 97% 97% 94% 98% 20% 10% 0%
Brand 2 87% 96% 65% 61% 40% 22% 54%
Brand 5 93% 30% 88% 27% 35% 13% 99%
Brand 3 72% 28% 54% 28% 30% 49% 117%
Brand 6 31% 75% 21% 65% 55% 61% 124%
Brand 1 42% 42% 21% 32% 71% 63% 146%
Brand 7 11% 60% 2% 29% 84% 72% 173%
Distance 0% 101% 38% 92% 141% 154%
Sort the rows by distance, smallest number at the top
Sort the Columns
% Agreeing Sexy Stylish Traditional Strong Friendly Cheap Distance
Brand 4 97% 94% 98% 97% 20% 10% 0%
Brand 2 87% 65% 61% 96% 40% 22% 54%
Brand 5 93% 88% 27% 30% 35% 13% 99%
Brand 3 72% 54% 28% 28% 30% 49% 117%
Brand 6 31% 21% 65% 75% 55% 61% 124%
Brand 1 42% 21% 32% 42% 71% 63% 146%
Brand 7 11% 2% 29% 60% 84% 72% 173%
Distance 0% 38% 92% 101% 141% 154%
Sort the columns by distance, smallest number on the left
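The two distance sorts can be put together in one short sketch. This is an illustration in Python under the same assumptions as the slides: Brand 4 is the reference row and Sexy is the reference column, because they came first after sorting by sum. Variable names are invented for the example.

```python
import math

attributes = ["Sexy", "Cheap", "Stylish", "Friendly", "Strong", "Traditional"]
data = {
    "Brand 1": [41.73, 63.40, 20.88, 71.20, 41.67, 31.66],
    "Brand 2": [87.21, 22.16, 64.50, 40.02, 96.24, 61.31],
    "Brand 3": [71.52, 48.80, 53.86, 30.08, 27.88, 28.12],
    "Brand 4": [97.23, 9.70, 93.97, 19.89, 96.99, 98.19],
    "Brand 5": [92.70, 12.73, 88.36, 34.63, 30.03, 26.98],
    "Brand 6": [30.58, 60.82, 20.61, 55.34, 75.00, 64.70],
    "Brand 7": [11.47, 72.02, 1.93, 83.76, 59.86, 28.94],
}

def distance(xs, ys):
    """Euclidean distance, the same calculation as =SQRT(SUMXMY2(...))."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)))

# Distance of every row from the reference row (Brand 4).
row_dist = {brand: distance(values, data["Brand 4"]) for brand, values in data.items()}

# Distance of every column from the reference column (Sexy).
columns = {attr: [values[i] for values in data.values()] for i, attr in enumerate(attributes)}
col_dist = {attr: distance(col, columns["Sexy"]) for attr, col in columns.items()}

# Smallest distance first: rows top to bottom, columns left to right.
row_order = sorted(data, key=row_dist.get)
col_order = sorted(attributes, key=col_dist.get)

print(" | ".join(["% Agreeing"] + col_order))
for brand in row_order:
    cells = [f"{data[brand][attributes.index(a)]:.0f}" for a in col_order]
    print(" | ".join([brand] + cells))
```

After both sorts the high values cluster towards the top-left and bottom-right corners, which is the diagonal the technique is named after.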
Tidy up the Data
% Agreeing Sexy Stylish Traditional Strong Friendly Cheap
Brand 4 97 94 98 97 20 10
Brand 2 87 65 61 96 40 22
Brand 5 93 88 27 30 35 13
Brand 3 72 54 28 28 30 49
Brand 6 31 21 65 75 55 61
Brand 1 42 21 32 42 71 63
Brand 7 11 2 29 60 84 72
Remove the extra columns, multiply by 100 to remove the % signs.
Pattern 1, Brands 4, 2 & 5 are Sexy and Stylish
Pattern 2, Brands 6, 1 & 7 are Friendly and Cheap
Do Not (Normally) use the Questionnaire Sequence
[Chart: Deaths and Cases Per Million of Population (until 27 September 2020), countries in alphabetical order: Brazil, China, France, Germany, India, Italy, Japan, Russia, South Africa, Sweden, UK, USA, World. Axes: Cases Per Million (0 to 25,000) and Deaths Per Million (0 to 700).]
Source: https://www.worldometers.info/coronavirus/ - downloaded 27 September 2020
Sort the data by something meaningful
[Chart: Deaths and Cases Per Million of Population (until 27 September 2020), countries in the sorted order shown on the slide: Brazil, USA, UK, Italy, Sweden, France, South Africa, Russia, World, Germany, India, Japan, China. Axes: Cases Per Million (0 to 25,000) and Deaths Per Million (0 to 700).]
Source: https://www.worldometers.info/coronavirus/ - downloaded 27 September 2020
World = benchmark
Countries with high death rates include LatAm, USA, Europe and SA
Lower rates include Europe and Asia
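A small sketch of sorting by something meaningful and reading the result against a benchmark. The country names and numbers are placeholders invented for this example, not the Worldometers figures behind the chart.

```python
# Placeholder figures (NOT real data); "World" plays the role of the benchmark.
deaths_per_million = {
    "Country A": 120.0,
    "Country B": 640.0,
    "World": 130.0,
    "Country C": 15.0,
    "Country D": 410.0,
}

benchmark = deaths_per_million["World"]

# Sort from highest to lowest instead of leaving the data in alphabetical order.
ranked = sorted(deaths_per_million.items(), key=lambda item: item[1], reverse=True)

above = [name for name, value in ranked if value > benchmark]
below = [name for name, value in ranked if value < benchmark]

for name, value in ranked:
    print(f"{name:10} {value:6.0f}")
print("Above benchmark:", above)
print("Below benchmark:", below)
```

Keeping the benchmark inside the sorted list means its position splits the chart into an above group and a below group at a glance.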
Sort the data by something meaningful
[Chart: the same sorted Deaths and Cases Per Million of Population chart (until 27 September 2020) as on the previous slide]
Source: https://www.worldometers.info/coronavirus/ - downloaded 27 September 2020
Some countries break the link between Cases and Deaths
Idea to test = Were UK, Italy, Sweden & France testing fewer people?
Derived Variables
If a concept is preferred (slightly) by
• Younger people
• People in London
• Men
Create a variable ‘Young Men in London’
• The differences may be much larger
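A sketch of how such a derived variable might be built. The respondent records, the field names and the age cut-off for "young" (34 here) are all assumptions made for illustration; the slides do not define them.

```python
# Hypothetical survey records (NOT real data).
respondents = [
    {"age": 22, "region": "London", "gender": "Male", "prefers_concept": True},
    {"age": 28, "region": "London", "gender": "Male", "prefers_concept": True},
    {"age": 31, "region": "London", "gender": "Female", "prefers_concept": False},
    {"age": 45, "region": "London", "gender": "Male", "prefers_concept": False},
    {"age": 24, "region": "Leeds", "gender": "Male", "prefers_concept": True},
    {"age": 60, "region": "Leeds", "gender": "Female", "prefers_concept": False},
]

MAX_YOUNG_AGE = 34  # assumed cut-off for "young"

def is_young_man_in_london(person):
    """Combine three existing variables into one derived yes/no variable."""
    return (
        person["age"] <= MAX_YOUNG_AGE
        and person["region"] == "London"
        and person["gender"] == "Male"
    )

for person in respondents:
    person["young_man_in_london"] = is_young_man_in_london(person)

def preference_rate(rows):
    """Share of the given respondents who prefer the concept."""
    return sum(row["prefers_concept"] for row in rows) / len(rows)

target = [p for p in respondents if p["young_man_in_london"]]
others = [p for p in respondents if not p["young_man_in_london"]]

print("Young men in London:", preference_rate(target))
print("Everyone else:", preference_rate(others))
```

In this toy data each single variable shows only a modest difference, while the combined variable separates the two groups much more sharply.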
Downloaded from Worldometers.info
3 March 2021
https://www.worldometers.info/coronavirus/country/uk/
Simplify and Compare
Downloaded from The Guardian 22 Jan 2022
• Pattern = Cases, Hospitalisations, then Deaths
• Alpha had the deaths and hospitalisations
• Delta had the vaccinations and lockdowns
• Omicron had boosters and light restrictions
• The link between cases and deaths changed
Try Alternative Perspectives
Add comparators
Deaths, raw numbers, new, linear scale
Derived Scale – per 100K
Deaths, per 100K, new, linear scale
Cumulative
Deaths, per 100K, cumulative, linear scale
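The derived scale and the cumulative view can be sketched as two small transformations. The daily counts and the population below are invented for the example and do not come from the charts.

```python
from itertools import accumulate

def per_100k(counts, population):
    """Convert raw counts to a rate per 100,000 people."""
    return [count * 100_000 / population for count in counts]

# Hypothetical daily deaths and population (NOT real data).
new_deaths = [10, 20, 30, 40]
population = 2_000_000

new_per_100k = per_100k(new_deaths, population)
cumulative_per_100k = list(accumulate(new_per_100k))

print(new_per_100k)
print(cumulative_per_100k)
```

Dividing by population makes countries of different sizes comparable, and the running total smooths out day-to-day noise so that longer-term differences are easier to see.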
Cumulative – longer time span
Deaths, per 100K, cumulative, linear scale
Consider other comparators
Deaths, per 100K, cumulative, linear scale
Consider other comparators
Deaths, per 100K, cumulative, linear scale
The Absence of a Strong Pattern Can be Interesting
Source: Statista, 2020
Older people cook from scratch more often – but the differences between age groups are not large.
35-64 almost identical, 18-24 a bit lower, 65+ a bit higher
Analytics
We use analytics when the patterns are not visible without it
Key tools include
• Correspondence Analysis – shows relationships on a map
• Factor analysis – great for simplifying the data
• Cluster analysis – shows groupings that exist in the data
• Regression analysis – shows the scale of relationships
• Latent Class – allows techniques to be combined, e.g. regression and cluster analysis
The Building Blocks of Stories
Fact 1 Fact 2 Fact 3 Fact 4 Fact 5 Fact 6 Fact 7 Fact 8 Fact 9
Finding 1 Finding 2 Finding 3 Finding 4 Finding 5 Finding 6 Finding 7 Finding 8
Insight
Categorizing the Facts
[Diagram: Fact a to Fact l arranged into groups, one of which is labelled Nice to Know]
Extract the Findings
[Diagram: the same grouped facts (Fact a to Fact l, including the Nice to Know group), now with Finding a to Finding e]
Secondary Findings
[Diagram: the same grouped facts and Finding a to Finding e, plus Finding f]
Extract the Insight
[Diagram: the same grouped facts and Finding a to Finding f, leading to a single Insight]
Finding the Patterns in the Data
1. Using the question as a lens
2. Making the patterns easier to see
3. Look for Connections, Correlations, Contradictions and Surprises
4. Assemble the story
Questions?
Sponsors
Communication
