SlideShare a Scribd company logo
1 of 1
If my samples are statistically different, my job is done.
Significant differences indicate that differences in group means are not likely due to
sampling error. The problem is that statistically significant differences can be found
even with very small differences if the sample size is large enough.
Ex: compare global volume values (cm2) of tresses treated with two hair products.
Product A is 58.32±0.81 (mean ± Standard Deviation) & Product B is 56.01±0.43.
Even though Product A is significantly better than Product B, the difference of 2.31
may not be enough to justify more resources into pursuing it.
Statistical significance is not always
actionable or has practical implications.
It rarely makes sense to treat scales, categorical or ordinal
data as continuous variables (AKA numbers).
Typical Pitfalls in Statistical Analyses
Brian LIN and Amol TEMBE
US Scientific Computing
Amol TEMBE, PhD
Director,
Scientific Computing
Brian Lin, MS MAppSc
Manager of Statistics & Analytics,
Scientific Computing
We are here to help you with your
experimental design, data, statistics, &
numbers.
Contact
Mean & Standard Deviation
Abstract
“If sets of data have the same mean and variance, they must be the same data.”
The following 4 datasets have the same means and standard deviations.
From "Pattern Classification" 2nd Edition by Richard O. Duda
The sample mean locates the center of the
distribution. The sample variance describes the
amount the data scattered along various directions.
Those two measurements not tell anything about
the distribution.
Different sets of data may have the
same means & standard deviations but very
different distributions.
Correlation & Causation
“If two attributes or variables are highly correlated, one must cause the other.”
Let’s look at the following two graphs. Do one factor cause the other?
Graphs from book “Spurious Correlations” by Tyler Vigen
Establishing a correlation between two variables is not a sufficient
condition to establish a causal relationship (in either direction). However,
it should encourage further research to uncover the truth.
Involve Your Statistician
Statistical vs Practical Significance
T-test Vs Paired t-test
When is a Number a Number?
Consider this survey: 2 treatments, 15 subject each
5-point scale: 5-love it, 4-like it, 3-neutral, 2-don’t like, 1-hate it.
A standard one-way Analysis of Variance (ANOVA) gives us
Group A Mean:2.1 Group B Mean: 3.3 F=5.15, p = 0.001
Can we conclude Group B perform better than Group A?
No!
Issue #1: how can a 5-point scale have value 9?
Issue #2: can 5-point scale be treated as pure numbers? Distribution of Group A & B data after removing 2 outliers
After removing the two outliers, the ANOVA now gives a p-value of 0.06 (not statistically
significant). Should this be your final answer?
Qualitative attributes (like, dislike, etc.) only convey implicit order (i.e., good is better than
neutral) with no quantitative meaning. Taking mean scores of these numerical
representation is also not appropriate. So, what should we do then?
An established approach to analyze this type is data uses the “Top 2 boxes” approach: The
numbers of 4s and 5s are pooled into YES category, and the rest into NO. A Chi-Square test
is performed to detect the differences between the ratios of responses .
The test concluded that the ratio of Top 2 boxes 26.7% Vs 46.2% are not statistically different
(p=0.28).
“How many subjects do I need to show a statistical significance?”
That is a trick question.
Without knowing your study objectives, hypothesis, measurements, variance,
and data, we cannot know for sure how many subjects are needed.
To compare machine readings of anti-breakage properties of tresses, one may
need 50 or fewer tresses in each treatment group. To compare consumer
perceptions on a 5-point scale, many more subjects are needed.
Size Does Matter
Samples size calculations need to be performed case by case taking many factors
into consideration. Consult your statistician to discuss your study objectives.
“I have completed the study, and now it is time to ask the statistician about data analysis method.”
No.
Well-designed experiment allows for multiple factors to be collected and analyzed properly, to
determine their effect on a desired outcome/response. A good design answers at least the follow
questions:
• What is the big question? What are the key factors in asking?
• What is the hypothesis? What are we testing?
• What are the main and interaction effects in the process?
• What settings would bring about less variation in the output?
• How many samples / data points will be sufficient?
• What blocking / design / randomization schemes will be most effective?
• What statistical methods / tools are best suited for the design and the analysis?
If you only think of those study factors after data points are collected, you might be too late. The old
saying goes: “Garbage in. Garbage out.”
“If I have good data, all statistical tests should show the same
results.”
Let’s look at the following data from Course Statistics 250 of Penn State University.
Pulse Rates Before and After Marching
Student BEFORE AFTER DIFFA-B
1 60 78 18
2 56 66 10
3 90 96 6
4 78 88 10
Two sample t-test for AFTER vs BEFORE
N Mean StdDev SE Mean
AFTER 4 82.0 13.0 6.5
BEFORE 4 71.0 15.9 7.9
T-Test mu AFTER = mu BEFORE (vs not =): T = 1.07, p = 0.33
The result concludes no difference in mean pulse rates before and after marching.
Paired t-test for AFTER - BEFORE
N Mean StdDev SE Mean
Difference 4 11.00 5.03 2.52
T-Test of mean difference = 0 (vs not = 0): T = 4.37, p = 0.02
The analysis concludes that mean pulse rate after is greater than mean pulse rate before.
Which one is the correct test?
Paired t-Test
Choosing a correct method is vital to the success of your
analysis. When in doubt, consult your statistician.
Group A 4 2 2 4 4 2 1 1 2 1 5 1 2 1 2
Group B 3 5 5 1 1 9 5 4 3 9 3 4 4 2 2
Without sample size specified, percentages along cannot provide much information .
Type I error ( α ) is the incorrect rejection of a true null
hypothesis (a "false positive"), while a Type II error (β ) is
incorrectly retaining a false null hypothesis (a "false negative").
At any given sample size n, there is a trade-off between
the significance level (Type I error) and power (Type II error).
Ideally, we want to limit Type I error to 5% (α=0.05) or 1%
(α=0.01) while achieving at least 80% power (1-β=0.8). In this
example to the right, we may achieve our goals with roughly 25
samples. However, this calculation is only good for this
population and this data.
When we have a different set of data, the power curves will have
similar look but the axis & units will be different.
The relationship between significance level and power with one-tailed tests: μ = 75, real μ = 80, and σ = 10.
Data Source: Rice University (Lead Developer), University of Houston Clear Lake, and Tufts University Online Statistics
Education
Statistics is the science of learning from data to measure, control, and communicate uncertainty in decision making. Appropriate application of statistical data analysis is essential to solving problems in a wide variety of research to shorten time and reduce cost by mitigating risks in decision making.
Improper use of statistics, either due to lack of adequate training, poor study design, or negligence, can trick the observer into believing data artifacts that obfuscate the underlying phenomena. The consequences of drawing wrong inferences that are not statistically substantiated can be detrimental
to business at large and can cause enormous monetary losses.
This poster presents a set of hypothetical but realistic scenarios widely observed in data from multitude disciplines. Each scenario contains at least one misconception and/or mistake typical of those made in analyses. The topics discussed are application & interpretation of mean, variance,
correlation, causality, significance, normality of distribution, and design of experiments.
The primary objective of the poster is to shed light on typical deceptive traps in statistical data analyses and to promote best data analyses practices within the R&I community.
Involve Scientific Computing early in the process so you
position yourself for success before the first data point comes in.
We are here to assist you with your analysis needs.

More Related Content

What's hot

Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)Matt Hansen
 
Hypothesis Testing: Proportions (Compare 2+ Factors)
Hypothesis Testing: Proportions (Compare 2+ Factors)Hypothesis Testing: Proportions (Compare 2+ Factors)
Hypothesis Testing: Proportions (Compare 2+ Factors)Matt Hansen
 
Missing data and non response pdf
Missing data and non response pdfMissing data and non response pdf
Missing data and non response pdfAnuj Bhatia
 
Hypothesis Testing: Spread (Compare 1:Standard)
Hypothesis Testing: Spread (Compare 1:Standard)Hypothesis Testing: Spread (Compare 1:Standard)
Hypothesis Testing: Spread (Compare 1:Standard)Matt Hansen
 
Hypothesis Testing: Central Tendency – Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Normal (Compare 1:1)Hypothesis Testing: Central Tendency – Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Normal (Compare 1:1)Matt Hansen
 
Hypothesis Testing: Proportions (Compare 1:1)
Hypothesis Testing: Proportions (Compare 1:1)Hypothesis Testing: Proportions (Compare 1:1)
Hypothesis Testing: Proportions (Compare 1:1)Matt Hansen
 
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)Matt Hansen
 
Hypothesis Testing: Statistical Laws and Confidence Intervals
Hypothesis Testing: Statistical Laws and Confidence IntervalsHypothesis Testing: Statistical Laws and Confidence Intervals
Hypothesis Testing: Statistical Laws and Confidence IntervalsMatt Hansen
 
Power Analysis: Determining Sample Size for Quantitative Studies
Power Analysis: Determining Sample Size for Quantitative StudiesPower Analysis: Determining Sample Size for Quantitative Studies
Power Analysis: Determining Sample Size for Quantitative StudiesStatistics Solutions
 
Power, effect size, and Issues in NHST
Power, effect size, and Issues in NHSTPower, effect size, and Issues in NHST
Power, effect size, and Issues in NHSTCarlo Magno
 
Hypothesis Testing: Spread (Compare 1:1)
Hypothesis Testing: Spread (Compare 1:1)Hypothesis Testing: Spread (Compare 1:1)
Hypothesis Testing: Spread (Compare 1:1)Matt Hansen
 
Reviewing quantitative articles_and_checklist
Reviewing quantitative articles_and_checklistReviewing quantitative articles_and_checklist
Reviewing quantitative articles_and_checklistLasse Torkkeli
 
Hypothesis Testing: Central Tendency – Normal (Compare 2+ Factors)
Hypothesis Testing: Central Tendency – Normal (Compare 2+ Factors)Hypothesis Testing: Central Tendency – Normal (Compare 2+ Factors)
Hypothesis Testing: Central Tendency – Normal (Compare 2+ Factors)Matt Hansen
 
Representing and generating uncertainty effectively presentatıon
Representing and generating uncertainty effectively presentatıonRepresenting and generating uncertainty effectively presentatıon
Representing and generating uncertainty effectively presentatıonAzdeen Najah
 
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)Matt Hansen
 
Spss introductory session data entry and descriptive stats
Spss introductory session data entry and descriptive statsSpss introductory session data entry and descriptive stats
Spss introductory session data entry and descriptive statse1033930
 
Aron chpt 7 ed effect size f2011
Aron chpt 7 ed effect size f2011Aron chpt 7 ed effect size f2011
Aron chpt 7 ed effect size f2011Sandra Nicks
 
Stayer mat 510 final exam2
Stayer mat 510 final exam2Stayer mat 510 final exam2
Stayer mat 510 final exam2shyaminfo15
 

What's hot (19)

Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 2+ Factors)
 
Hypothesis Testing: Proportions (Compare 2+ Factors)
Hypothesis Testing: Proportions (Compare 2+ Factors)Hypothesis Testing: Proportions (Compare 2+ Factors)
Hypothesis Testing: Proportions (Compare 2+ Factors)
 
Missing data and non response pdf
Missing data and non response pdfMissing data and non response pdf
Missing data and non response pdf
 
Hypothesis Testing: Spread (Compare 1:Standard)
Hypothesis Testing: Spread (Compare 1:Standard)Hypothesis Testing: Spread (Compare 1:Standard)
Hypothesis Testing: Spread (Compare 1:Standard)
 
Hypothesis Testing: Central Tendency – Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Normal (Compare 1:1)Hypothesis Testing: Central Tendency – Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Normal (Compare 1:1)
 
Hypothesis Testing: Proportions (Compare 1:1)
Hypothesis Testing: Proportions (Compare 1:1)Hypothesis Testing: Proportions (Compare 1:1)
Hypothesis Testing: Proportions (Compare 1:1)
 
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
 
Hypothesis Testing: Statistical Laws and Confidence Intervals
Hypothesis Testing: Statistical Laws and Confidence IntervalsHypothesis Testing: Statistical Laws and Confidence Intervals
Hypothesis Testing: Statistical Laws and Confidence Intervals
 
Power Analysis: Determining Sample Size for Quantitative Studies
Power Analysis: Determining Sample Size for Quantitative StudiesPower Analysis: Determining Sample Size for Quantitative Studies
Power Analysis: Determining Sample Size for Quantitative Studies
 
Power, effect size, and Issues in NHST
Power, effect size, and Issues in NHSTPower, effect size, and Issues in NHST
Power, effect size, and Issues in NHST
 
Hypothesis Testing: Spread (Compare 1:1)
Hypothesis Testing: Spread (Compare 1:1)Hypothesis Testing: Spread (Compare 1:1)
Hypothesis Testing: Spread (Compare 1:1)
 
Reviewing quantitative articles_and_checklist
Reviewing quantitative articles_and_checklistReviewing quantitative articles_and_checklist
Reviewing quantitative articles_and_checklist
 
Hypothesis Testing: Central Tendency – Normal (Compare 2+ Factors)
Hypothesis Testing: Central Tendency – Normal (Compare 2+ Factors)Hypothesis Testing: Central Tendency – Normal (Compare 2+ Factors)
Hypothesis Testing: Central Tendency – Normal (Compare 2+ Factors)
 
Representing and generating uncertainty effectively presentatıon
Representing and generating uncertainty effectively presentatıonRepresenting and generating uncertainty effectively presentatıon
Representing and generating uncertainty effectively presentatıon
 
Factor analysis
Factor analysisFactor analysis
Factor analysis
 
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
 
Spss introductory session data entry and descriptive stats
Spss introductory session data entry and descriptive statsSpss introductory session data entry and descriptive stats
Spss introductory session data entry and descriptive stats
 
Aron chpt 7 ed effect size f2011
Aron chpt 7 ed effect size f2011Aron chpt 7 ed effect size f2011
Aron chpt 7 ed effect size f2011
 
Stayer mat 510 final exam2
Stayer mat 510 final exam2Stayer mat 510 final exam2
Stayer mat 510 final exam2
 

Similar to Statistical Significance vs Practical Significance in Data Analysis

Data Analysis for Graduate Studies Summary
Data Analysis for Graduate Studies SummaryData Analysis for Graduate Studies Summary
Data Analysis for Graduate Studies SummaryKelvinNMhina
 
Statistical ProcessesCan descriptive statistical processes b.docx
Statistical ProcessesCan descriptive statistical processes b.docxStatistical ProcessesCan descriptive statistical processes b.docx
Statistical ProcessesCan descriptive statistical processes b.docxdarwinming1
 
#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx
#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx
#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docxAASTHA76
 
Statistics  What you Need to KnowIntroductionOften, when peop.docx
Statistics  What you Need to KnowIntroductionOften, when peop.docxStatistics  What you Need to KnowIntroductionOften, when peop.docx
Statistics  What you Need to KnowIntroductionOften, when peop.docxdessiechisomjj4
 
Statistics pres 3.31.2014
Statistics pres 3.31.2014Statistics pres 3.31.2014
Statistics pres 3.31.2014tjcarter
 
DataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docx
DataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docxDataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docx
DataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docxsimonithomas47935
 
Need a nonplagiarised paper and a form completed by 1006015 before.docx
Need a nonplagiarised paper and a form completed by 1006015 before.docxNeed a nonplagiarised paper and a form completed by 1006015 before.docx
Need a nonplagiarised paper and a form completed by 1006015 before.docxlea6nklmattu
 
Computing Descriptive Statistics © 2014 Argos.docx
 Computing Descriptive Statistics     © 2014 Argos.docx Computing Descriptive Statistics     © 2014 Argos.docx
Computing Descriptive Statistics © 2014 Argos.docxaryan532920
 
Computing Descriptive Statistics © 2014 Argos.docx
Computing Descriptive Statistics     © 2014 Argos.docxComputing Descriptive Statistics     © 2014 Argos.docx
Computing Descriptive Statistics © 2014 Argos.docxAASTHA76
 
Research methods 2 operationalization & measurement
Research methods 2   operationalization & measurementResearch methods 2   operationalization & measurement
Research methods 2 operationalization & measurementattique1960
 
Sample Size Estimation and Statistical Test Selection
Sample Size Estimation  and Statistical Test SelectionSample Size Estimation  and Statistical Test Selection
Sample Size Estimation and Statistical Test SelectionVaggelis Vergoulas
 
Introduction_klsfnsfsnfsgnkgni _to_meta-analysis.ppt
Introduction_klsfnsfsnfsgnkgni _to_meta-analysis.pptIntroduction_klsfnsfsnfsgnkgni _to_meta-analysis.ppt
Introduction_klsfnsfsnfsgnkgni _to_meta-analysis.pptAnnaMarieAndalRanill
 
DataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docx
DataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docxDataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docx
DataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docxtheodorelove43763
 
You clearly understand the concepts of this assignment. You’ve don.docx
You clearly understand the concepts of this assignment. You’ve don.docxYou clearly understand the concepts of this assignment. You’ve don.docx
You clearly understand the concepts of this assignment. You’ve don.docxjeffevans62972
 
Chapter 12Choosing an Appropriate Statistical TestiStockph.docx
Chapter 12Choosing an Appropriate Statistical TestiStockph.docxChapter 12Choosing an Appropriate Statistical TestiStockph.docx
Chapter 12Choosing an Appropriate Statistical TestiStockph.docxmccormicknadine86
 

Similar to Statistical Significance vs Practical Significance in Data Analysis (20)

Data Analysis for Graduate Studies Summary
Data Analysis for Graduate Studies SummaryData Analysis for Graduate Studies Summary
Data Analysis for Graduate Studies Summary
 
Statistical ProcessesCan descriptive statistical processes b.docx
Statistical ProcessesCan descriptive statistical processes b.docxStatistical ProcessesCan descriptive statistical processes b.docx
Statistical ProcessesCan descriptive statistical processes b.docx
 
T test
T test T test
T test
 
Analysis and Interpretation
Analysis and InterpretationAnalysis and Interpretation
Analysis and Interpretation
 
#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx
#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx
#06198 Topic PSY 325 Statistics for the Behavioral & Social Scien.docx
 
Statistics  What you Need to KnowIntroductionOften, when peop.docx
Statistics  What you Need to KnowIntroductionOften, when peop.docxStatistics  What you Need to KnowIntroductionOften, when peop.docx
Statistics  What you Need to KnowIntroductionOften, when peop.docx
 
Statistics pres 3.31.2014
Statistics pres 3.31.2014Statistics pres 3.31.2014
Statistics pres 3.31.2014
 
DataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docx
DataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docxDataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docx
DataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docx
 
Meta analysis with R
Meta analysis with RMeta analysis with R
Meta analysis with R
 
Need a nonplagiarised paper and a form completed by 1006015 before.docx
Need a nonplagiarised paper and a form completed by 1006015 before.docxNeed a nonplagiarised paper and a form completed by 1006015 before.docx
Need a nonplagiarised paper and a form completed by 1006015 before.docx
 
Computing Descriptive Statistics © 2014 Argos.docx
 Computing Descriptive Statistics     © 2014 Argos.docx Computing Descriptive Statistics     © 2014 Argos.docx
Computing Descriptive Statistics © 2014 Argos.docx
 
Computing Descriptive Statistics © 2014 Argos.docx
Computing Descriptive Statistics     © 2014 Argos.docxComputing Descriptive Statistics     © 2014 Argos.docx
Computing Descriptive Statistics © 2014 Argos.docx
 
Research methods 2 operationalization & measurement
Research methods 2   operationalization & measurementResearch methods 2   operationalization & measurement
Research methods 2 operationalization & measurement
 
Seawell_Exam
Seawell_ExamSeawell_Exam
Seawell_Exam
 
Sample Size Estimation and Statistical Test Selection
Sample Size Estimation  and Statistical Test SelectionSample Size Estimation  and Statistical Test Selection
Sample Size Estimation and Statistical Test Selection
 
Introduction_klsfnsfsnfsgnkgni _to_meta-analysis.ppt
Introduction_klsfnsfsnfsgnkgni _to_meta-analysis.pptIntroduction_klsfnsfsnfsgnkgni _to_meta-analysis.ppt
Introduction_klsfnsfsnfsgnkgni _to_meta-analysis.ppt
 
Sample Size Determination
Sample Size DeterminationSample Size Determination
Sample Size Determination
 
DataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docx
DataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docxDataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docx
DataIDSalaryCompaMidpoint AgePerformance RatingServiceGenderRaiseD.docx
 
You clearly understand the concepts of this assignment. You’ve don.docx
You clearly understand the concepts of this assignment. You’ve don.docxYou clearly understand the concepts of this assignment. You’ve don.docx
You clearly understand the concepts of this assignment. You’ve don.docx
 
Chapter 12Choosing an Appropriate Statistical TestiStockph.docx
Chapter 12Choosing an Appropriate Statistical TestiStockph.docxChapter 12Choosing an Appropriate Statistical TestiStockph.docx
Chapter 12Choosing an Appropriate Statistical TestiStockph.docx
 

More from Brian Lin

The Art of Incremental Improvement - A Personal Journey
The Art of Incremental Improvement - A Personal JourneyThe Art of Incremental Improvement - A Personal Journey
The Art of Incremental Improvement - A Personal JourneyBrian Lin
 
Taipei, A Hidden Jewel in East Asia - PR Strategy for Tourism
Taipei, A Hidden Jewel in East Asia - PR Strategy for TourismTaipei, A Hidden Jewel in East Asia - PR Strategy for Tourism
Taipei, A Hidden Jewel in East Asia - PR Strategy for TourismBrian Lin
 
Chief Judges Training 2015-2016
Chief Judges Training 2015-2016Chief Judges Training 2015-2016
Chief Judges Training 2015-2016Brian Lin
 
Judges training 2015-2016
Judges training 2015-2016 Judges training 2015-2016
Judges training 2015-2016 Brian Lin
 
Personal leadership & Career Development
Personal leadership & Career DevelopmentPersonal leadership & Career Development
Personal leadership & Career DevelopmentBrian Lin
 
Toastmasters New club opportunities
Toastmasters New club opportunitiesToastmasters New club opportunities
Toastmasters New club opportunitiesBrian Lin
 
Seeing is believing. Really?
Seeing is believing. Really?Seeing is believing. Really?
Seeing is believing. Really?Brian Lin
 
District new club marketing
District new club marketingDistrict new club marketing
District new club marketingBrian Lin
 
Battle Vs Games
Battle Vs GamesBattle Vs Games
Battle Vs GamesBrian Lin
 

More from Brian Lin (9)

The Art of Incremental Improvement - A Personal Journey
The Art of Incremental Improvement - A Personal JourneyThe Art of Incremental Improvement - A Personal Journey
The Art of Incremental Improvement - A Personal Journey
 
Taipei, A Hidden Jewel in East Asia - PR Strategy for Tourism
Taipei, A Hidden Jewel in East Asia - PR Strategy for TourismTaipei, A Hidden Jewel in East Asia - PR Strategy for Tourism
Taipei, A Hidden Jewel in East Asia - PR Strategy for Tourism
 
Chief Judges Training 2015-2016
Chief Judges Training 2015-2016Chief Judges Training 2015-2016
Chief Judges Training 2015-2016
 
Judges training 2015-2016
Judges training 2015-2016 Judges training 2015-2016
Judges training 2015-2016
 
Personal leadership & Career Development
Personal leadership & Career DevelopmentPersonal leadership & Career Development
Personal leadership & Career Development
 
Toastmasters New club opportunities
Toastmasters New club opportunitiesToastmasters New club opportunities
Toastmasters New club opportunities
 
Seeing is believing. Really?
Seeing is believing. Really?Seeing is believing. Really?
Seeing is believing. Really?
 
District new club marketing
District new club marketingDistrict new club marketing
District new club marketing
 
Battle Vs Games
Battle Vs GamesBattle Vs Games
Battle Vs Games
 

Statistical Significance vs Practical Significance in Data Analysis

  • 1. If my samples are statistically different, my job is done. Significant differences indicate that differences in group means are not likely due to sampling error. The problem is that statistically significant differences can be found even with very small differences if the sample size is large enough. Ex: compare global volume values (cm2) of tresses treated with two hair products. Product A is 58.32±0.81 (mean ± Standard Deviation) & Product B is 56.01±0.43. Even though Product A is significantly better than Product B, the difference of 2.31 may not be enough to justify more resources into pursuing it. Statistical significance is not always actionable or has practical implications. It rarely makes sense to treat scales, categorical or ordinal data as continuous variables (AKA numbers). Typical Pitfalls in Statistical Analyses Brian LIN and Amol TEMBE US Scientific Computing Amol TEMBE, PhD Director, Scientific Computing Brian Lin, MS MAppSc Manager of Statistics & Analytics, Scientific Computing We are here to help you with your experimental design, data, statistics, & numbers. Contact Mean & Standard Deviation Abstract “If sets of data have the same mean and variance, they must be the same data.” The following 4 datasets have the same means and standard deviations. From "Pattern Classification" 2nd Edition by Richard O. Duda The sample mean locates the center of the distribution. The sample variance describes the amount the data scattered along various directions. Those two measurements not tell anything about the distribution. Different sets of data may have the same means & standard deviations but very different distributions. Correlation & Causation “If two attributes or variables are highly correlated, one must cause the other.” Let’s look at the following two graphs. Do one factor cause the other? Graphs from book “Spurious Correlations” by Tyler Vigen Establishing a correlation between two variables is not a sufficient condition to establish a causal relationship (in either direction). However, it should encourage further research to uncover the truth. Involve Your Statistician Statistical vs Practical Significance T-test Vs Paired t-test When is a Number a Number? Consider this survey: 2 treatments, 15 subject each 5-point scale: 5-love it, 4-like it, 3-neutral, 2-don’t like, 1-hate it. A standard one-way Analysis of Variance (ANOVA) gives us Group A Mean:2.1 Group B Mean: 3.3 F=5.15, p = 0.001 Can we conclude Group B perform better than Group A? No! Issue #1: how can a 5-point scale have value 9? Issue #2: can 5-point scale be treated as pure numbers? Distribution of Group A & B data after removing 2 outliers After removing the two outliers, the ANOVA now gives a p-value of 0.06 (not statistically significant). Should this be your final answer? Qualitative attributes (like, dislike, etc.) only convey implicit order (i.e., good is better than neutral) with no quantitative meaning. Taking mean scores of these numerical representation is also not appropriate. So, what should we do then? An established approach to analyze this type is data uses the “Top 2 boxes” approach: The numbers of 4s and 5s are pooled into YES category, and the rest into NO. A Chi-Square test is performed to detect the differences between the ratios of responses . The test concluded that the ratio of Top 2 boxes 26.7% Vs 46.2% are not statistically different (p=0.28). “How many subjects do I need to show a statistical significance?” That is a trick question. Without knowing your study objectives, hypothesis, measurements, variance, and data, we cannot know for sure how many subjects are needed. To compare machine readings of anti-breakage properties of tresses, one may need 50 or fewer tresses in each treatment group. To compare consumer perceptions on a 5-point scale, many more subjects are needed. Size Does Matter Samples size calculations need to be performed case by case taking many factors into consideration. Consult your statistician to discuss your study objectives. “I have completed the study, and now it is time to ask the statistician about data analysis method.” No. Well-designed experiment allows for multiple factors to be collected and analyzed properly, to determine their effect on a desired outcome/response. A good design answers at least the follow questions: • What is the big question? What are the key factors in asking? • What is the hypothesis? What are we testing? • What are the main and interaction effects in the process? • What settings would bring about less variation in the output? • How many samples / data points will be sufficient? • What blocking / design / randomization schemes will be most effective? • What statistical methods / tools are best suited for the design and the analysis? If you only think of those study factors after data points are collected, you might be too late. The old saying goes: “Garbage in. Garbage out.” “If I have good data, all statistical tests should show the same results.” Let’s look at the following data from Course Statistics 250 of Penn State University. Pulse Rates Before and After Marching Student BEFORE AFTER DIFFA-B 1 60 78 18 2 56 66 10 3 90 96 6 4 78 88 10 Two sample t-test for AFTER vs BEFORE N Mean StdDev SE Mean AFTER 4 82.0 13.0 6.5 BEFORE 4 71.0 15.9 7.9 T-Test mu AFTER = mu BEFORE (vs not =): T = 1.07, p = 0.33 The result concludes no difference in mean pulse rates before and after marching. Paired t-test for AFTER - BEFORE N Mean StdDev SE Mean Difference 4 11.00 5.03 2.52 T-Test of mean difference = 0 (vs not = 0): T = 4.37, p = 0.02 The analysis concludes that mean pulse rate after is greater than mean pulse rate before. Which one is the correct test? Paired t-Test Choosing a correct method is vital to the success of your analysis. When in doubt, consult your statistician. Group A 4 2 2 4 4 2 1 1 2 1 5 1 2 1 2 Group B 3 5 5 1 1 9 5 4 3 9 3 4 4 2 2 Without sample size specified, percentages along cannot provide much information . Type I error ( α ) is the incorrect rejection of a true null hypothesis (a "false positive"), while a Type II error (β ) is incorrectly retaining a false null hypothesis (a "false negative"). At any given sample size n, there is a trade-off between the significance level (Type I error) and power (Type II error). Ideally, we want to limit Type I error to 5% (α=0.05) or 1% (α=0.01) while achieving at least 80% power (1-β=0.8). In this example to the right, we may achieve our goals with roughly 25 samples. However, this calculation is only good for this population and this data. When we have a different set of data, the power curves will have similar look but the axis & units will be different. The relationship between significance level and power with one-tailed tests: μ = 75, real μ = 80, and σ = 10. Data Source: Rice University (Lead Developer), University of Houston Clear Lake, and Tufts University Online Statistics Education Statistics is the science of learning from data to measure, control, and communicate uncertainty in decision making. Appropriate application of statistical data analysis is essential to solving problems in a wide variety of research to shorten time and reduce cost by mitigating risks in decision making. Improper use of statistics, either due to lack of adequate training, poor study design, or negligence, can trick the observer into believing data artifacts that obfuscate the underlying phenomena. The consequences of drawing wrong inferences that are not statistically substantiated can be detrimental to business at large and can cause enormous monetary losses. This poster presents a set of hypothetical but realistic scenarios widely observed in data from multitude disciplines. Each scenario contains at least one misconception and/or mistake typical of those made in analyses. The topics discussed are application & interpretation of mean, variance, correlation, causality, significance, normality of distribution, and design of experiments. The primary objective of the poster is to shed light on typical deceptive traps in statistical data analyses and to promote best data analyses practices within the R&I community. Involve Scientific Computing early in the process so you position yourself for success before the first data point comes in. We are here to assist you with your analysis needs.