SlideShare a Scribd company logo
1 of 40
Corvelle Drives Concepts to Completion
Understanding Data:
What do these numbers mean?
1
Corvelle Drives Concepts to Completion
Yogi Schulz
Biography
 Partner in Corvelle Consulting
 Information technology related management
consulting
 Microsoft Canada columnist & CBC Radio guest
 PPDM Association board member
 Industry presenter:
– Project World - 6 years
– PMI – SAC - 3 years
– CIPS – many years
– PPDM Association - several years
2
Corvelle Drives Concepts to Completion
Data Volumes Growing every Year
3
Corvelle Drives Concepts to Completion
Presentation Outline
 Introduction
 Learning
objectives
 Nine hazards
of data
misinterpretation
 Recommendations
& actions
• Insufficient Domain
Expertise
• Important Variables
Omitted
• Aggregation Obscures Truth
• Inferences are Off Base
• Sources of Variation
Overlooked
• Statistical Significance
Trumps Critical Thinking
• Numerical Analysis Missing
Something
• Correlation Mistaken for
Causation
• Explanation adds Distortion
4
Corvelle Drives Concepts to Completion
Learning Objectives
 Understand how to accurately interpret data
 Recognize factors
that lead to
misinterpreting data
 Understand actions
to minimize risk of
misinterpreting data
5
Corvelle Drives Concepts to Completion 6
Having all this health data available
is great, but I think I need a degree
in data analytics to sort it all out.
Corvelle Drives Concepts to Completion
Nine Hazards of
Data Misinterpretation
7
Corvelle Drives Concepts to Completion
Insufficient Domain Expertise
 Issues:
– Domain experts are not data scientists
– Data scientists are not domain experts
– Imbalance of expertise
 Solutions:
– Is sufficient expertise involved?
– Is the result possible or even likely?
– What experience makes you skeptical?
8
1
Corvelle Drives Concepts to Completion
Do you have sufficient expertise and
experience to interpret the data?
9
“I know nothing about the subject,
but I’d be happy to give you my
expert opinion!”
Corvelle Drives Concepts to Completion
Important Variables Omitted
 Issues:
– Too much complexity
– Odd results or strange data
– Too many variables ignored
 Solutions:
– Review the procedures
– Revise the research design
– Narrow the research goal
10
2
Corvelle Drives Concepts to Completion 11
Seriously? No worries!
It’s always
something.
I’ve factored in
lift, thrust, drag
and wind speed . . .
Just not
gravity.
Corvelle Drives Concepts to Completion
Aggregation Obscures Truth
 Issues:
– Story varies by aggregation level
– Aggregation produces surprising relationships
– Pilot results ambiguous
 Solutions:
– Check if low level trends hold up
– Identify potential sources of variation
– Confirm research design
12
Top-level report
Sub-level 1
report
Sub-level 2
report
3
Corvelle Drives Concepts to Completion
Aggregation to
confirm Trends
13
Use the
SAP database
to aggregate
our findings.
Then use
the survey
database.
Corvelle Drives Concepts to Completion
Inferences are Off Base
 Issues:
– Misunderstanding group characteristics
– Rose-coloured thinking
– Ideology-based agenda
 Solutions:
– Challenge your research project
– Review the statistical calculations
– Review the work
14
4
Corvelle Drives Concepts to Completion 15
When you two have finished
arguing your shaky inferences,
I have actual data!
Corvelle Drives Concepts to Completion
Sources of Variation Overlooked
 Issues:
– Obvious sources overlooked
– Less obvious sources overlooked
– Hidden sources not considered
 Solutions:
– Search for sources of variation
– Expand data gathering
– Look for unexpected correlations
16
5
Corvelle Drives Concepts to Completion
Not so subtle impacts
on research outcomes
17
“My diabetic research shows that
test subjects are 98% more likely
to take their diabetic pills
when the pills are covered in chocolate!”
Corvelle Drives Concepts to Completion
Statistical Significance
Trumps Critical Thinking
 Issues:
– Lack of critical thinking
– Over-emphasizing statistical significance
– Assumptions about big data
 Solutions:
– Use statistical significance as a screen
– Review hypothesis
– Develop a persuasive story
18
6
Corvelle Drives Concepts to Completion
Statistical Significance
19
2.5%
Significant
effect
2.5%
Significant
effect
95%
Non-significant
effect
Corvelle Drives Concepts to Completion 20
If you torture the data
long enough, they will confess.
Corvelle Drives Concepts to Completion
Numerical Analysis
Missing Something
 Issues:
– Superficial numerical analysis
– Insufficient analysis expertise
– Bias in analysis
 Solutions:
– Visualize data
– Check for false positives and false negatives
– Verify numerical analysis independently
21
7
Corvelle Drives Concepts to Completion 22
New study reveals that
reading too many studies
may cause heart disease.
Corvelle Drives Concepts to Completion
Correlation Mistaken
for Causation
 Issues:
– Correlation described as causation
– Delusional or misleading correlations
– Weak correlation stretched to strong correlation
– Random correlation positioned as real correlation
 Solutions:
– Create a plausible story
– Use correlation as scientific evidence
– Review the calculations
23
8
Corvelle Drives Concepts to Completion
Correlation ≠
Causation
24
I used to think
correlation implied
causation.
Then I took a
statistics class.
Now I don’t.
Sounds like the
class helped.
Well, maybe.
Corvelle Drives Concepts to Completion
Explanation adds Distortion
 Issues:
– Overly complex explanation
– Too much jargon
– Exaggeration of inferences
 Solutions:
– Express results clearly
– Develop simple, attractive charts
– Stick to the supportable inferences
25
9
Corvelle Drives Concepts to Completion
Dubious Explanation
of Survey Data
Public statement:
A survey of more than 25,000
Albertans reviewing the K-12
school curriculum found
“there exists a strong desire for
the removal of Shakespeare as
a required author.”
26
Corvelle Drives Concepts to Completion
Dubious Explanation
of Survey Data
Reality:
 25,000 survey respondents
 60 added a comment about Shakespeare
 50 called for removal of Shakespeare from
curriculum
27
Alberta Education is interpreting comments
to mean five out of six Albertans favour
removing Shakespeare from curriculum
Corvelle Drives Concepts to Completion
Number of article retractions
by cause of retraction
28
Corvelle Drives Concepts to Completion
Recommendations
 Improve your ability
to interpret data
 Watch out for these
nine hazards of data
misinterpretation
 Ensure your research
is accurate and
defensible
29
• Insufficient Domain Expertise
• Important Variables Omitted
• Aggregation Obscures Truth
• Inferences are Off Base
• Sources of Variation
Overlooked
• Statistical Significance Trumps
Critical Thinking
• Numerical Analysis Missing
Something
• Correlation Mistaken for
Causation
• Explanation adds Distortion
Corvelle Drives Concepts to Completion
Questions &
Discussion
30
Can you help us
understand data
better?Please
fill out
evaluation
form
Corvelle Drives Concepts to Completion
Understanding Data:
What do these numbers mean?
Corvelle Consulting
300, 400 - 5 Ave. S. W.
Calgary, Alberta T2P 0L6
Phone: (403) 860-5348
E-mail: YogiSchulz@corvelle.com
Web: www.corvelle.com
Yogi Schulz
Partner of Corvelle Consulting
Information technology related
management consulting
Microsoft Canada columnist
& CBC Radio host
Industry presenter
Former PPDM Association
board member
31
Corvelle Drives Concepts to Completion
Plausible Inferences for
Ranges of p Values
32
1.00
.10
.05
.01
.001
0.0
p > 0.10
0.05 < p < 0.10
0.01 < p < 0.05
0.001 < p < 0.01
0.0 < p < 0.001
Not significant
Marginally significant
Fairly significant
Strongly significant
Definitely significant
Values of p Plausible inference
Corvelle Drives Concepts to Completion
Questionable
Analysis Goals
33
Corvelle Drives Concepts to Completion
Bibliography
 A data analyst needs these 5 skills
– https://www.viva-viva.ca/index.php/news-events/114-a-data-analyst-needs-these-5-skills
 Analyzing, Interpreting and Reporting Basic Research Results
– http://managementhelp.org/businessresearch/analysis.htm
 Analyzing Data and Communicating Results
– http://strengtheningnonprofits.org/resources/e-learning/online/analyzingdata/
 Believe It Or Not, Most Published Research Findings Are Probably False
– Simon Oxenham
– http://bigthink.com/neurobonkers/believe-it-or-not-most-published-research-findings-are-
probably-false
 Big Data Solutions for Healthcare
– Stanislas Odinot, April 11, 2013
– https://www.slideshare.net/LarryCover/big-data-solutions-for-healthcare
 Big Data in Healthcare: Separating The Hype From The Reality
– Jared Crapo, Health Catalyst
– https://www.healthcatalyst.com/healthcare-big-data-realities
34
Corvelle Drives Concepts to Completion
Bibliography
 Confirmationist and falsificationist paradigms of science
– Andrew, 5 September 2014
– http://andrewgelman.com/2014/09/05/confirmationist-falsificationist-paradigms-science/
 Correlation, causation and coincidence
– 05 November 2015
– http://behindlabdoors.com/correlation-causation-and-coincidence/
 Correlation does not imply causation
– https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation
 Correlation does not imply causation – except when it does
– The Original Skeptical Raptor, August 9, 2015
– http://www.skepticalraptor.com/skepticalraptorblog.php/correlation-does-not-imply-causation-except-
when-it-does/
 Data Analysis, Interpretation and Presentation
– http://www.uio.no/studier/emner/matnat/ifi/INF4260/h10/undervisningsmateriale/DataAnalysis.pdf
 Data Analysis and Interpretation
– Anne E. Egger, Ph.D., Anthony Carpi, Ph.D.
– http://www.visionlearning.com/en/library/Process-of-Science/49/Data-Analysis-and-Interpretation/154
 Data analysis and presentation
– http://www.statcan.gc.ca/pub/12-539-x/2009001/analysis-analyse-eng.htm
35
Corvelle Drives Concepts to Completion
Bibliography
 Data Collection and Interpretation
– The Gale Group Inc.
– http://www.encyclopedia.com/education/news-wires-white-papers-and-books/data-
collection-and-interpretation
 Dataviz: Making Smarter, More Persuasive Data Visualizations
– Scott Berinato, March 30, 2016
– https://hbr.org/webinar/2016/05/dataviz-making-smarter-more-persuasive-data-
visualizations
 Data-driven decision-making process
– http://www.txprofdev.org/apps/datadecisions/node/52.html
 Describing and Interpreting Data
– www.uh.edu/~tech132/sln12.doc
 Descriptive Statistics and Interpreting Statistics
– http://www.statisticssolutions.com/descriptive-statistics-and-interpreting-statistics/
 Economics methods in Cochrane systematic reviews of health promotion and
public health related interventions
– Ian Shemilt, 15 November 2006
– https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-6-55
36
Corvelle Drives Concepts to Completion
Bibliography
 Excel errors and science papers
– Spreadsheets are playing havoc with scientists
– Economist, September 7, 2016
– http://www.economist.com/blogs/graphicdetail/2016/09/daily-chart-3
 How science goes wrong
– Scientific research has changed the world. Now it needs to change itself
– Economist, October 21 2013
– http://www.economist.com/news/leaders/21588069-scientific-research-has-changed-world-now-it-
needs-change-itself-how-science-goes-wrong
 HealthToon: Taking down chronic diseases with advanced analytics
– Oliver Clark, April 17, 2015
– http://www.ibmbigdatahub.com/blog/healthtoon-taking-down-chronic-diseases-advanced-analytics
 Hilarious Graphs Prove That Correlation Isn’t Causation
– https://www.fastcodesign.com/3030529/hilarious-graphs-prove-that-correlation-isnt-causation
– http://www.tylervigen.com/spurious-correlations
 Interpretation of Data: The Basics
– Tania, May 30, 2014
– https://blog.udemy.com/interpretation-of-data/
37
Corvelle Drives Concepts to Completion
Bibliography
 Interpreting and Presenting Data
– http://www.deq.state.or.us/lab/wqm/docs/InterpretingandPresentingData.pdf
 Misconduct, not error, is the source of most retracted papers
– Ashutosh Jogalekar, October 2, 2012
– https://blogs.scientificamerican.com/the-curious-wavefunction/misconduct-and-not-error-is-
the-source-of-most-retracted-papers/
 9 Causes Of Data Misinterpretation
– Lisa Morgan, 17 July 2015
– http://www.informationweek.com/big-data/big-data-analytics/9-causes-of-data-
misinterpretation/d/d-id/1321338
 Presentation, Analysis and Interpretation of Data
– https://www.slideshare.net/31mikaella/presentation-analysis-and-interpretation-of-data
 Retraction Watch
– http://retractionwatch.com/
 Sample Data Interpretation Questions
– http://www.psychometric-success.com/faq/faq-sample-data-interpretation-questions.htm
 Science and Engineering Practice of Analyzing and Interpreting Data
– http://ngss.nsta.org/Practices.aspx?id=4
38
Corvelle Drives Concepts to Completion
Bibliography
 Statistical significance and its part in science downfalls
– Imagine if there were a simple single statistical measure everybody could use with any set of data and it
would reliably separate true from false
– Hilda Bastian, November 11, 2013
– https://blogs.scientificamerican.com/absolutely-maybe/statistical-significance-and-its-part-in-science-
downfalls/
 Statistical Significance for CRO: 6 Things You Need to Know
– Tom Capper, May 08, 2014
– https://www.distilled.net/resources/statistical-significance-for-cro-6-things-you-need-to-know/
 Statistics Done Wrong
– The woefully complete guide
– Alex Reinhart
– https://www.statisticsdonewrong.com/
 Student Competencies & Requirements in Health Economics
– Cumming School of Medicine, 2017
– https://cumming.ucalgary.ca/gse/files/gse/competencies_health_economics_2017.pdf
 Summary and discussion of: “Why Most Published Research Findings Are False”
– Statistics Journal Club, 36-825
– Dallas Card and Shashank Srivastava, December 10, 2014
– http://www.stat.cmu.edu/~ryantibs/journalclub/ioannidis.pdf
39
Corvelle Drives Concepts to Completion
Bibliography
 Trouble at the lab
– Scientists like to think of science as self-correcting. To an alarming degree, it is not
– Economist, October 18 2013
– http://www.economist.com/news/briefing/21588057-scientists-think-science-self-correcting-
alarming-degree-it-not-trouble
 Understanding the Growth, Value of Healthcare Big Data Analytics
– HealthITAnalytics, July 30, 2015
– http://healthitanalytics.com/news/understanding-the-growth-value-of-healthcare-big-data-
analytics
 Watson Analytics
– https://www.ibm.com/analytics/watson-analytics/us-en/What’s Significant?
– Patrick Barlow, University of Tennessee
– https://www.slideshare.net/pbbarlow1/whats-significant-hypothesis-testing-effect-size-
confidence-intervals-the-pvalue-fallacy
 Why Most Published Research Findings Are False
– John P. A. Ioannidis, PLoS Med 2(8): e124
– http://faculty.dbmi.pitt.edu/day/Bioinf2118/Bioinf-2118-2013/Ioannidis-
journal.pmed.0020124.pdf
40

More Related Content

Similar to Understanding data dfm_1_yogi_schulz_2017_05

Visualizations power bi_yogi_schulz_2018_12_101
Visualizations power bi_yogi_schulz_2018_12_101Visualizations power bi_yogi_schulz_2018_12_101
Visualizations power bi_yogi_schulz_2018_12_101Beta-Research.org
 
Getting The Most Out of Your Data Analyst - HAS Session 9
Getting The Most Out of Your Data Analyst - HAS Session 9Getting The Most Out of Your Data Analyst - HAS Session 9
Getting The Most Out of Your Data Analyst - HAS Session 9Health Catalyst
 
Scale Development Techniques Presentation.pptx
Scale Development Techniques Presentation.pptxScale Development Techniques Presentation.pptx
Scale Development Techniques Presentation.pptxAnjaliUpadhye1
 
Module 1.3 data exploratory
Module 1.3  data exploratoryModule 1.3  data exploratory
Module 1.3 data exploratorySara Hooker
 
Experiences with Data Feedback - Better Software 2004 - Ben Linders
Experiences with Data Feedback - Better Software 2004 - Ben LindersExperiences with Data Feedback - Better Software 2004 - Ben Linders
Experiences with Data Feedback - Better Software 2004 - Ben LindersBen Linders
 
The Secrets of High Performance: Science Edition - Nicole Forsgren - Codemoti...
The Secrets of High Performance: Science Edition - Nicole Forsgren - Codemoti...The Secrets of High Performance: Science Edition - Nicole Forsgren - Codemoti...
The Secrets of High Performance: Science Edition - Nicole Forsgren - Codemoti...Codemotion
 
How Data Scientists Make Reliable Decisions with Data
How Data Scientists Make Reliable Decisions with DataHow Data Scientists Make Reliable Decisions with Data
How Data Scientists Make Reliable Decisions with DataTa-Wei (David) Huang
 
Title of Your Research Proposal Student Name Walde
Title of Your Research Proposal Student Name WaldeTitle of Your Research Proposal Student Name Walde
Title of Your Research Proposal Student Name WaldeTakishaPeck109
 
SharePoint "Moneyball" - The Art and Science of Winning the SharePoint Metric...
SharePoint "Moneyball" - The Art and Science of Winning the SharePoint Metric...SharePoint "Moneyball" - The Art and Science of Winning the SharePoint Metric...
SharePoint "Moneyball" - The Art and Science of Winning the SharePoint Metric...Susan Hanley
 
APF orlando diy survey workshop 071114 final
APF orlando diy survey workshop 071114 finalAPF orlando diy survey workshop 071114 final
APF orlando diy survey workshop 071114 finalMike Courtney
 
Intuition's Fall from Grace - Algorithms and Data in (Pre)-Selection by Colin...
Intuition's Fall from Grace - Algorithms and Data in (Pre)-Selection by Colin...Intuition's Fall from Grace - Algorithms and Data in (Pre)-Selection by Colin...
Intuition's Fall from Grace - Algorithms and Data in (Pre)-Selection by Colin...Textkernel
 
IA assessment criteria
IA assessment criteriaIA assessment criteria
IA assessment criteriaChris Hamper
 
Problem definition and research proposal(brm)
Problem definition and research proposal(brm)Problem definition and research proposal(brm)
Problem definition and research proposal(brm)university of peshawar
 
How To Research
How To ResearchHow To Research
How To ResearchFengyi
 
Create a proposal for your Design for Change Capstone Open.docx
Create a proposal for your Design for Change Capstone Open.docxCreate a proposal for your Design for Change Capstone Open.docx
Create a proposal for your Design for Change Capstone Open.docxwrite31
 
Data is love data viz best practices
Data is love   data viz best practicesData is love   data viz best practices
Data is love data viz best practicesGregory Nelson
 
Analyzing and Interpreting Data statippt
Analyzing and Interpreting Data statipptAnalyzing and Interpreting Data statippt
Analyzing and Interpreting Data statipptElleMaRie3
 

Similar to Understanding data dfm_1_yogi_schulz_2017_05 (20)

Visualizations power bi_yogi_schulz_2018_12_101
Visualizations power bi_yogi_schulz_2018_12_101Visualizations power bi_yogi_schulz_2018_12_101
Visualizations power bi_yogi_schulz_2018_12_101
 
Getting The Most Out of Your Data Analyst - HAS Session 9
Getting The Most Out of Your Data Analyst - HAS Session 9Getting The Most Out of Your Data Analyst - HAS Session 9
Getting The Most Out of Your Data Analyst - HAS Session 9
 
7INNOVA Lesson 5 slides for T26
7INNOVA Lesson 5 slides for T267INNOVA Lesson 5 slides for T26
7INNOVA Lesson 5 slides for T26
 
Scale Development Techniques Presentation.pptx
Scale Development Techniques Presentation.pptxScale Development Techniques Presentation.pptx
Scale Development Techniques Presentation.pptx
 
Module 1.3 data exploratory
Module 1.3  data exploratoryModule 1.3  data exploratory
Module 1.3 data exploratory
 
Experiences with Data Feedback - Better Software 2004 - Ben Linders
Experiences with Data Feedback - Better Software 2004 - Ben LindersExperiences with Data Feedback - Better Software 2004 - Ben Linders
Experiences with Data Feedback - Better Software 2004 - Ben Linders
 
The Secrets of High Performance: Science Edition - Nicole Forsgren - Codemoti...
The Secrets of High Performance: Science Edition - Nicole Forsgren - Codemoti...The Secrets of High Performance: Science Edition - Nicole Forsgren - Codemoti...
The Secrets of High Performance: Science Edition - Nicole Forsgren - Codemoti...
 
How Data Scientists Make Reliable Decisions with Data
How Data Scientists Make Reliable Decisions with DataHow Data Scientists Make Reliable Decisions with Data
How Data Scientists Make Reliable Decisions with Data
 
SPSS Data Cleaning and Management
SPSS Data Cleaning and ManagementSPSS Data Cleaning and Management
SPSS Data Cleaning and Management
 
Title of Your Research Proposal Student Name Walde
Title of Your Research Proposal Student Name WaldeTitle of Your Research Proposal Student Name Walde
Title of Your Research Proposal Student Name Walde
 
SharePoint "Moneyball" - The Art and Science of Winning the SharePoint Metric...
SharePoint "Moneyball" - The Art and Science of Winning the SharePoint Metric...SharePoint "Moneyball" - The Art and Science of Winning the SharePoint Metric...
SharePoint "Moneyball" - The Art and Science of Winning the SharePoint Metric...
 
APF orlando diy survey workshop 071114 final
APF orlando diy survey workshop 071114 finalAPF orlando diy survey workshop 071114 final
APF orlando diy survey workshop 071114 final
 
Intuition's Fall from Grace - Algorithms and Data in (Pre)-Selection by Colin...
Intuition's Fall from Grace - Algorithms and Data in (Pre)-Selection by Colin...Intuition's Fall from Grace - Algorithms and Data in (Pre)-Selection by Colin...
Intuition's Fall from Grace - Algorithms and Data in (Pre)-Selection by Colin...
 
IA assessment criteria
IA assessment criteriaIA assessment criteria
IA assessment criteria
 
Problem definition and research proposal(brm)
Problem definition and research proposal(brm)Problem definition and research proposal(brm)
Problem definition and research proposal(brm)
 
How To Research
How To ResearchHow To Research
How To Research
 
Create a proposal for your Design for Change Capstone Open.docx
Create a proposal for your Design for Change Capstone Open.docxCreate a proposal for your Design for Change Capstone Open.docx
Create a proposal for your Design for Change Capstone Open.docx
 
Data is love data viz best practices
Data is love   data viz best practicesData is love   data viz best practices
Data is love data viz best practices
 
Analyzing and Interpreting Data statippt
Analyzing and Interpreting Data statipptAnalyzing and Interpreting Data statippt
Analyzing and Interpreting Data statippt
 
Introduction to knowledge discovery
Introduction to knowledge discoveryIntroduction to knowledge discovery
Introduction to knowledge discovery
 

Recently uploaded

Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 

Recently uploaded (20)

Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 

Understanding data dfm_1_yogi_schulz_2017_05

  • 1. Corvelle Drives Concepts to Completion Understanding Data: What do these numbers mean? 1
  • 2. Corvelle Drives Concepts to Completion Yogi Schulz Biography  Partner in Corvelle Consulting  Information technology related management consulting  Microsoft Canada columnist & CBC Radio guest  PPDM Association board member  Industry presenter: – Project World - 6 years – PMI – SAC - 3 years – CIPS – many years – PPDM Association - several years 2
  • 3. Corvelle Drives Concepts to Completion Data Volumes Growing every Year 3
  • 4. Corvelle Drives Concepts to Completion Presentation Outline  Introduction  Learning objectives  Nine hazards of data misinterpretation  Recommendations & actions • Insufficient Domain Expertise • Important Variables Omitted • Aggregation Obscures Truth • Inferences are Off Base • Sources of Variation Overlooked • Statistical Significance Trumps Critical Thinking • Numerical Analysis Missing Something • Correlation Mistaken for Causation • Explanation adds Distortion 4
  • 5. Corvelle Drives Concepts to Completion Learning Objectives  Understand how to accurately interpret data  Recognize factors that lead to misinterpreting data  Understand actions to minimize risk of misinterpreting data 5
  • 6. Corvelle Drives Concepts to Completion 6 Having all this health data available is great, but I think I need a degree in data analytics to sort it all out.
  • 7. Corvelle Drives Concepts to Completion Nine Hazards of Data Misinterpretation 7
  • 8. Corvelle Drives Concepts to Completion Insufficient Domain Expertise  Issues: – Domain experts are not data scientists – Data scientists are not domain experts – Imbalance of expertise  Solutions: – Is sufficient expertise involved? – Is the result possible or even likely? – What experience makes you skeptical? 8 1
  • 9. Corvelle Drives Concepts to Completion Do you have sufficient expertise and experience to interpret the data? 9 “I know nothing about the subject, but I’d be happy to give you my expert opinion!”
  • 10. Corvelle Drives Concepts to Completion Important Variables Omitted  Issues: – Too much complexity – Odd results or strange data – Too many variables ignored  Solutions: – Review the procedures – Revise the research design – Narrow the research goal 10 2
  • 11. Corvelle Drives Concepts to Completion 11 Seriously? No worries! It’s always something. I’ve factored in lift, thrust, drag and wind speed . . . Just not gravity.
  • 12. Corvelle Drives Concepts to Completion Aggregation Obscures Truth  Issues: – Story varies by aggregation level – Aggregation produces surprising relationships – Pilot results ambiguous  Solutions: – Check if low level trends hold up – Identify potential sources of variation – Confirm research design 12 Top-level report Sub-level 1 report Sub-level 2 report 3
  • 13. Corvelle Drives Concepts to Completion Aggregation to confirm Trends 13 Use the SAP database to aggregate our findings. Then use the survey database.
  • 14. Corvelle Drives Concepts to Completion Inferences are Off Base  Issues: – Misunderstanding group characteristics – Rose-coloured thinking – Ideology-based agenda  Solutions: – Challenge your research project – Review the statistical calculations – Review the work 14 4
  • 15. Corvelle Drives Concepts to Completion 15 When you two have finished arguing your shaky inferences, I have actual data!
  • 16. Corvelle Drives Concepts to Completion Sources of Variation Overlooked  Issues: – Obvious sources overlooked – Less obvious sources overlooked – Hidden sources not considered  Solutions: – Search for sources of variation – Expand data gathering – Look for unexpected correlations 16 5
  • 17. Corvelle Drives Concepts to Completion Not so subtle impacts on research outcomes 17 “My diabetic research shows that test subjects are 98% more likely to take their diabetic pills when the pills are covered in chocolate!”
  • 18. Corvelle Drives Concepts to Completion Statistical Significance Trumps Critical Thinking  Issues: – Lack of critical thinking – Over-emphasizing statistical significance – Assumptions about big data  Solutions: – Use statistical significance as a screen – Review hypothesis – Develop a persuasive story 18 6
  • 19. Corvelle Drives Concepts to Completion Statistical Significance 19 2.5% Significant effect 2.5% Significant effect 95% Non-significant effect
  • 20. Corvelle Drives Concepts to Completion 20 If you torture the data long enough, they will confess.
  • 21. Corvelle Drives Concepts to Completion Numerical Analysis Missing Something  Issues: – Superficial numerical analysis – Insufficient analysis expertise – Bias in analysis  Solutions: – Visualize data – Check for false positives and false negatives – Verify numerical analysis independently 21 7
  • 22. Corvelle Drives Concepts to Completion 22 New study reveals that reading too many studies may cause heart disease.
  • 23. Corvelle Drives Concepts to Completion Correlation Mistaken for Causation  Issues: – Correlation described as causation – Delusional or misleading correlations – Weak correlation stretched to strong correlation – Random correlation positioned as real correlation  Solutions: – Create a plausible story – Use correlation as scientific evidence – Review the calculations 23 8
  • 24. Corvelle Drives Concepts to Completion Correlation ≠ Causation 24 I used to think correlation implied causation. Then I took a statistics class. Now I don’t. Sounds like the class helped. Well, maybe.
  • 25. Corvelle Drives Concepts to Completion Explanation adds Distortion  Issues: – Overly complex explanation – Too much jargon – Exaggeration of inferences  Solutions: – Express results clearly – Develop simple, attractive charts – Stick to the supportable inferences 25 9
  • 26. Corvelle Drives Concepts to Completion Dubious Explanation of Survey Data Public statement: A survey of more than 25,000 Albertans reviewing the K-12 school curriculum found “there exists a strong desire for the removal of Shakespeare as a required author.” 26
  • 27. Corvelle Drives Concepts to Completion Dubious Explanation of Survey Data Reality:  25,000 survey respondents  60 added a comment about Shakespeare  50 called for removal of Shakespeare from curriculum 27 Alberta Education is interpreting comments to mean five out of six Albertans favour removing Shakespeare from curriculum
  • 28. Corvelle Drives Concepts to Completion Number of article retractions by cause of retraction 28
  • 29. Corvelle Drives Concepts to Completion Recommendations  Improve your ability to interpret data  Watch out for these nine hazards of data misinterpretation  Ensure your research is accurate and defensible 29 • Insufficient Domain Expertise • Important Variables Omitted • Aggregation Obscures Truth • Inferences are Off Base • Sources of Variation Overlooked • Statistical Significance Trumps Critical Thinking • Numerical Analysis Missing Something • Correlation Mistaken for Causation • Explanation adds Distortion
  • 30. Corvelle Drives Concepts to Completion Questions & Discussion 30 Can you help us understand data better?Please fill out evaluation form
  • 31. Corvelle Drives Concepts to Completion Understanding Data: What do these numbers mean? Corvelle Consulting 300, 400 - 5 Ave. S. W. Calgary, Alberta T2P 0L6 Phone: (403) 860-5348 E-mail: YogiSchulz@corvelle.com Web: www.corvelle.com Yogi Schulz Partner of Corvelle Consulting Information technology related management consulting Microsoft Canada columnist & CBC Radio host Industry presenter Former PPDM Association board member 31
  • 32. Corvelle Drives Concepts to Completion Plausible Inferences for Ranges of p Values 32 1.00 .10 .05 .01 .001 0.0 p > 0.10 0.05 < p < 0.10 0.01 < p < 0.05 0.001 < p < 0.01 0.0 < p < 0.001 Not significant Marginally significant Fairly significant Strongly significant Definitely significant Values of p Plausible inference
  • 33. Corvelle Drives Concepts to Completion Questionable Analysis Goals 33
  • 34. Corvelle Drives Concepts to Completion Bibliography  A data analyst needs these 5 skills – https://www.viva-viva.ca/index.php/news-events/114-a-data-analyst-needs-these-5-skills  Analyzing, Interpreting and Reporting Basic Research Results – http://managementhelp.org/businessresearch/analysis.htm  Analyzing Data and Communicating Results – http://strengtheningnonprofits.org/resources/e-learning/online/analyzingdata/  Believe It Or Not, Most Published Research Findings Are Probably False – Simon Oxenham – http://bigthink.com/neurobonkers/believe-it-or-not-most-published-research-findings-are- probably-false  Big Data Solutions for Healthcare – Stanislas Odinot, April 11, 2013 – https://www.slideshare.net/LarryCover/big-data-solutions-for-healthcare  Big Data in Healthcare: Separating The Hype From The Reality – Jared Crapo, Health Catalyst – https://www.healthcatalyst.com/healthcare-big-data-realities 34
  • 35. Corvelle Drives Concepts to Completion Bibliography  Confirmationist and falsificationist paradigms of science – Andrew, 5 September 2014 – http://andrewgelman.com/2014/09/05/confirmationist-falsificationist-paradigms-science/  Correlation, causation and coincidence – 05 November 2015 – http://behindlabdoors.com/correlation-causation-and-coincidence/  Correlation does not imply causation – https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation  Correlation does not imply causation – except when it does – The Original Skeptical Raptor, August 9, 2015 – http://www.skepticalraptor.com/skepticalraptorblog.php/correlation-does-not-imply-causation-except- when-it-does/  Data Analysis, Interpretation and Presentation – http://www.uio.no/studier/emner/matnat/ifi/INF4260/h10/undervisningsmateriale/DataAnalysis.pdf  Data Analysis and Interpretation – Anne E. Egger, Ph.D., Anthony Carpi, Ph.D. – http://www.visionlearning.com/en/library/Process-of-Science/49/Data-Analysis-and-Interpretation/154  Data analysis and presentation – http://www.statcan.gc.ca/pub/12-539-x/2009001/analysis-analyse-eng.htm 35
  • 36. Corvelle Drives Concepts to Completion Bibliography  Data Collection and Interpretation – The Gale Group Inc. – http://www.encyclopedia.com/education/news-wires-white-papers-and-books/data- collection-and-interpretation  Dataviz: Making Smarter, More Persuasive Data Visualizations – Scott Berinato, March 30, 2016 – https://hbr.org/webinar/2016/05/dataviz-making-smarter-more-persuasive-data- visualizations  Data-driven decision-making process – http://www.txprofdev.org/apps/datadecisions/node/52.html  Describing and Interpreting Data – www.uh.edu/~tech132/sln12.doc  Descriptive Statistics and Interpreting Statistics – http://www.statisticssolutions.com/descriptive-statistics-and-interpreting-statistics/  Economics methods in Cochrane systematic reviews of health promotion and public health related interventions – Ian Shemilt, 15 November 2006 – https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-6-55 36
  • 37. Corvelle Drives Concepts to Completion Bibliography  Excel errors and science papers – Spreadsheets are playing havoc with scientists – Economist, September 7, 2016 – http://www.economist.com/blogs/graphicdetail/2016/09/daily-chart-3  How science goes wrong – Scientific research has changed the world. Now it needs to change itself – Economist, October 21 2013 – http://www.economist.com/news/leaders/21588069-scientific-research-has-changed-world-now-it- needs-change-itself-how-science-goes-wrong  HealthToon: Taking down chronic diseases with advanced analytics – Oliver Clark, April 17, 2015 – http://www.ibmbigdatahub.com/blog/healthtoon-taking-down-chronic-diseases-advanced-analytics  Hilarious Graphs Prove That Correlation Isn’t Causation – https://www.fastcodesign.com/3030529/hilarious-graphs-prove-that-correlation-isnt-causation – http://www.tylervigen.com/spurious-correlations  Interpretation of Data: The Basics – Tania, May 30, 2014 – https://blog.udemy.com/interpretation-of-data/ 37
  • 38. Corvelle Drives Concepts to Completion Bibliography  Interpreting and Presenting Data – http://www.deq.state.or.us/lab/wqm/docs/InterpretingandPresentingData.pdf  Misconduct, not error, is the source of most retracted papers – Ashutosh Jogalekar, October 2, 2012 – https://blogs.scientificamerican.com/the-curious-wavefunction/misconduct-and-not-error-is- the-source-of-most-retracted-papers/  9 Causes Of Data Misinterpretation – Lisa Morgan, 17 July 2015 – http://www.informationweek.com/big-data/big-data-analytics/9-causes-of-data- misinterpretation/d/d-id/1321338  Presentation, Analysis and Interpretation of Data – https://www.slideshare.net/31mikaella/presentation-analysis-and-interpretation-of-data  Retraction Watch – http://retractionwatch.com/  Sample Data Interpretation Questions – http://www.psychometric-success.com/faq/faq-sample-data-interpretation-questions.htm  Science and Engineering Practice of Analyzing and Interpreting Data – http://ngss.nsta.org/Practices.aspx?id=4 38
  • 39. Corvelle Drives Concepts to Completion Bibliography  Statistical significance and its part in science downfalls – Imagine if there were a simple single statistical measure everybody could use with any set of data and it would reliably separate true from false – Hilda Bastian, November 11, 2013 – https://blogs.scientificamerican.com/absolutely-maybe/statistical-significance-and-its-part-in-science- downfalls/  Statistical Significance for CRO: 6 Things You Need to Know – Tom Capper, May 08, 2014 – https://www.distilled.net/resources/statistical-significance-for-cro-6-things-you-need-to-know/  Statistics Done Wrong – The woefully complete guide – Alex Reinhart – https://www.statisticsdonewrong.com/  Student Competencies & Requirements in Health Economics – Cumming School of Medicine, 2017 – https://cumming.ucalgary.ca/gse/files/gse/competencies_health_economics_2017.pdf  Summary and discussion of: “Why Most Published Research Findings Are False” – Statistics Journal Club, 36-825 – Dallas Card and Shashank Srivastava, December 10, 2014 – http://www.stat.cmu.edu/~ryantibs/journalclub/ioannidis.pdf 39
  • 40. Corvelle Drives Concepts to Completion Bibliography  Trouble at the lab – Scientists like to think of science as self-correcting. To an alarming degree, it is not – Economist, October 18 2013 – http://www.economist.com/news/briefing/21588057-scientists-think-science-self-correcting- alarming-degree-it-not-trouble  Understanding the Growth, Value of Healthcare Big Data Analytics – HealthITAnalytics, July 30, 2015 – http://healthitanalytics.com/news/understanding-the-growth-value-of-healthcare-big-data- analytics  Watson Analytics – https://www.ibm.com/analytics/watson-analytics/us-en/What’s Significant? – Patrick Barlow, University of Tennessee – https://www.slideshare.net/pbbarlow1/whats-significant-hypothesis-testing-effect-size- confidence-intervals-the-pvalue-fallacy  Why Most Published Research Findings Are False – John P. A. Ioannidis, PLoS Med 2(8): e124 – http://faculty.dbmi.pitt.edu/day/Bioinf2118/Bioinf-2118-2013/Ioannidis- journal.pmed.0020124.pdf 40

Editor's Notes

  1. Understanding Data: What do these numbers mean? My name is Yogi Schulz Thank you to xxxx for inviting me to speak today We all see a lot of data every day Some of it useful, some of it suspicious, some is obviously wrong and some of it is outright bad and misleading We going to spend our time together today to learn how to better understand data Presentation created by Yogi Schulz in April 2017 - YogiSchulz@corvelle.com
  2. 2
  3. Data Volumes are Growing every Year The growth is shown for many categories of health care data and for imaging in particular Perhaps more alarmingly, the rate of growth appears to be accelerating Perhaps this cartoon of a data flood is a better way to describe the trend that most health care organizations are choking on It likely doesn’t matter if the cartoon characters represent clinicians, researchers, bureaucrats or care givers In any case, this growing volume undermines our effort to understand the data What examples of data overload have you observed or experienced?
  4. Presentation Outline Introduction I’ll start with a few introductory remarks Learning objectives Then we’ll talk about the learning objectives for this presentation We’ll spend most of the time today on these Nine hazards of data misinterpretation Insufficient Domain Expertise Important Variables Omitted Aggregation Obscures Truth Inferences are Off Base Sources of Variation Overlooked Statistical Significance Trumps Critical Thinking Numerical Analysis Missing Something Correlation Mistaken for Causation Explanation adds Distortion Recommendations & actions We’ll wrap up with some recommendations
  5. Corvelle Consulting
  6. Corvelle Consulting
  7. Nine Hazards of Data Misinterpretation Data is misinterpreted more often than you might expect Even with the best intentions, important variables may be omitted or a problem may be oversimplified or overcomplicated Sometimes organizations act on trends that are not what they seem Not surprisingly, when two people view the same analytical result, they may interpret it differently Statistics can tell you 'this versus that‘ The real questions are: Is the difference worth worrying about? Have we collected enough data to allow us to confidently make a decision? Are there any arithmetic errors in the data analysis? It is entirely possible for health care leaders to obsess about something that is statistically insignificant, or for data scientists to omit important variables, simply because they do not understand the entire context of the problem they are trying to solve The path to valuable insights can include a number of obstacles, some of which may not become apparent until well after the fact Some individuals and groups take a top-down approach to data analysis, meaning that they focus on the medical problem they are trying to solve and they make a point of identifying variables that have been relevant in the past in a same or similar context Others take a bottom-up approach, meaning that they attempt to correlate variables associated with what they are trying to improve such as readmission rates for specific conditions The danger of the latter approach is a high probability that some correlations are statistically significant but are an artifact of the way the data has been analyzed, versus being an accurate indicator of underlying relationships Because there are a lot of ways data can be misinterpreted, we need to understand how and why it can happen We’ll spend most of our time here today on the nine hazards of data misinterpretation
  8. Insufficient Domain Expertise Domain or subject matter expertise and data expertise are both necessary for the analysis and accurate interpretations of data As our available domain expertise decreases, the reliability of our research results also decreases Issues: Domain experts are not data scientists Domain experts tend not to be conversant with the techniques for analyzing data and related statistical concepts For example: Expert clinical practitioners often have limited exposure to statistical techniques Data scientists are not domain experts Data scientists do not have the same level of subject matter expertise that other experts in the organization have accumulated For example: Data scientists often don’t know much about health care or research design Imbalance of data expertise and domain expertise can easily lead to misinterpretation of data For example, the data scientist often doesn't understand the medical context of the variables they're looking at That happens a lot in large organizations where people work in silos Solutions or risk reduction: Before you use data to examine a situation or to make a recommendation, question: Is sufficient expertise involved? Would adding a business analyst add value? Business analysts operate between domain expert and data scientist, and as such, can help ensure sufficient expertise is being applied Is the result possible or even likely in the real world? What real-world experience makes you skeptical about the data? What real-world experience makes you think the data makes sense?
  9. Do you have sufficient education and experience to interpret the data? Clearly this person has not received enough training on how to set up the gurney “I know nothing about the subject, but I’d be happy to give you my expert opinion!” Do you know your expert collaborator or consultant well enough to be confident of his or her relevant expertise?
  10. Corvelle Consulting
  11. It’s really easy to omit variables Sometimes the consequences can be disastrous Seriously? No worries! I’ve factored in lift, thrust, drag and wind speed . . . Just not gravity. It’s always something. Think carefully about your design and procedures to minimize the risk of nasty surprises later
  12. Corvelle Consulting
  13. Aggregation to confirm Trends Use the SAP database to aggregate our findings. That data is wrong. Then use the survey database. That data is also wrong. Can you average our findings? Sure. I can multiply them too. This comic is funny because using fancy arithmetic or statistics will not improve accuracy, completeness or defensibility of research findings Fancy math will also not compensate for poorly designed research A better approach is to pursue continuous improvement of research design, execution and reporting processes
  14. Inferences are Off Base What inferences can we make about treatment impact on mother and her boy? The mother seems more upset than the little boy receiving the vaccination What inferences can we make about the wisdom skateboarding with crutches and a cast? Skateboarding is risky at the best of times; Skateboarding with crutches and a cast enables us to make inferences about teenagers being oblivious to risk Defensible inferences are essential to the process of assessing research data to form conclusions and then recommendations Issues: Misunderstanding group characteristics All interferences from data are conditional, so it's wise to understand as many group characteristics about which inferences are being made as possible. If not, you run the risk of inferring the wrong properties about a population or a sample For example: Is the treatment response affected by age, dosage, pre-existing conditions, sex or ethnicity? Rose-coloured thinking We all want to save the world; that leads us to interpret our data to be more indicative of the accuracy of our hypothesis than is supportable by the cold, hard facts Are you stretching the data? For example: If the treatment is effective for older adults, can I reasonably recommend the treatment for teenagers or children? Ideology-based agenda Sometimes conclusions are crafted to advance an ideology-based agenda While I obviously believe it’s dishonest and unethical, we have all observed such action particular for political issues and sometimes to advance drug or device approvals Solutions or risk reduction: Challenge your research project Are you making leaps in logic from what the data really says to your interferences? Is there a bias in your design or data collection? For example: do you inadvertently have only high-income trial participants due to travel requirements for tests? Are my recommendations clearly and reasonably based on the data I have? Review the statistical calculations with an experienced statistician If you are not trained in statistical thinking, you will tend to misinterpret the data or the results more positively than is defensible Review the work with an independent researcher That review may be ego-deflating but it will save you from more public embarrassment later
  15. Corvelle Consulting
  16. Corvelle Consulting
  17. “My diabetic research shows that test subjects are 98% more likely to take their diabetic pills when the pills are covered in chocolate!” Does your research design include a bias that is causing you to overlook a source of variation?
  18. Corvelle Consulting
  19. Statistical Significance The previous table is often charted as this ubiquitous bell curve The term "null hypothesis" is a general statement or default position that there is no relationship between two measured phenomena, or no association among groups Rejecting or disproving the null hypothesis — and thus concluding that there are grounds for believing that there is a relationship between two phenomena, such as that a potential treatment has a measurable effect — is a central task in the modern practice of science the field of statistics gives precise criteria for rejecting a null hypothesis https://en.wikipedia.org/wiki/Null_hypothesis 95% Non-significant effect Variation is likely caused by a combination of errors and noise 2.5% Significant effect The two 2.5% yellow areas together equal the p = 0.05 value I showed on the table on the previous slide Explain more P Value, Statistical Significance and Clinical Significance The Journal of Clinical and Preventive Cardiology, Volume 2, Oct 2013 Padam Singh, PhD Gurgaon, India J Clin Prev Cardiol. 2013;2(4):202-4 http://www.jcpcarchives.org/full/p-value-statistical-significance-and-clinical-significance-121.php
  20. Corvelle Consulting
  21. Corvelle Consulting
  22. File: Close_to_home_2017_04_19.gif I think the general public is confused by the seemingly contradictory findings of various studies of the same medical issue Often these contradictory findings can be legitimately explained by the differences in the various studies that have been conducted However, such subtleties aren’t easily explained in a short newspaper article or a TV sound bite
  23. Corvelle Consulting
  24. Corvelle Consulting
  25. Corvelle Consulting
  26. Corvelle Consulting
  27. Corvelle Consulting
  28. Number of article retractions by cause of retraction – first chart The authors focus on papers in the life sciences They find that about 67% of the 2047 retracted papers owed their retraction to plain old misconduct; only 21% or so can be traced back to error and honest mistakes The misconduct can come in three forms - outright fraud, plagiarism and duplication The other piece of bad news is that among these three, fraud contributes the most to the retraction with plagiarism and duplication tagging behind Clearly there’s a material number of published studies that can mislead us We need to be vigilant I wonder what is causing the increase in the number of retractions? More researchers creating more articles? More stringent peer review? More readers drawing problems to the attention of editors? Number of article retractions by cause of retraction – second chart Unfortunately these retracted studies continue to create problems in the research even after they have bee retracted because the retracted studies continue to be cited in subsequent research articles This data suggests harm to reputations and perhaps harm to patients can be reduced Misconduct, not error, is the source of most retracted papers Ashutosh Jogalekar, October 2, 2012 https://blogs.scientificamerican.com/the-curious-wavefunction/misconduct-and-not-error-is-the-source-of-most-retracted-papers/ Why researchers keep citing retracted papers https://qz.com/583497/researchers-keep-citing-these-retracted-papers/
  29. 5/19/2017
  30. Questions & Discussion “Can you help us understand data better?”
  31. Understanding Data: What do these numbers mean?
  32. Plausible Inferences for Ranges of p Values We’re often delighted when we achieve a correlation of p = 0.05 in our data This table is to remind us that this level of correlation is only fairly significant I appreciate that the effort associated with driving p to a lower value is often horrendous, stratospheric or ridiculously expensive So my point is to say that we should be cautious about inferring too much from a correlation of p = 0.05
  33. Corvelle Consulting
  34. Bibliography
  35. Bibliography
  36. Bibliography
  37. Bibliography
  38. Bibliography
  39. Bibliography
  40. Bibliography