... and are you sure?
Multiple statistical comparisons problem
Jiří Haviger
jiri.haviger@uhk.cz
May 12, 2018
Jiří Haviger (jiri.haviger@uhk.cz) ... and are you sure? May 12, 2018 1 / 24
Introduction
Jelly beans cause acne ...
Introduction
... and are you sure?
Basic idea of inferential statistics Inference, confidence intervals and p-value
Inference
Demonstration of sample mean distributions, shiny.rit.albany.edu
Demonstration of sample mean distributions, rpsychologist.com
Confidence intervals
Q: How do we estimate a population characteristic from a known sample? With a point or with an interval?
Probability theory gives us the probability density function (PDF) of a sample statistic across different samples (e.g. Student's t distribution for sample means).
we have: a sample with statistical characteristics (n, $\bar{x}$, sd, ...)
we have: α, the probability of a mistake that we are willing to accept (usually α = 0.05)
to do: from the sample information n, $\bar{x}$, sd and α, transform the sample characteristics into a variable with a known distribution (e.g. $t = \frac{\bar{x} - \mu}{s} \cdot \sqrt{n}$)
to do: based on the PDF and t, determine the confidence interval for the characteristic (e.g. CI(µ))
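The interval computation can be sketched in Python using only the standard library. The function name `mean_confidence_interval`, the sample values, and the choice to pass the critical value `t_crit` in from a t table (2.262 is the two-sided 95% value for 9 degrees of freedom) are assumptions made for illustration and do not come from the slides.

```python
from math import sqrt
from statistics import mean, stdev


def mean_confidence_interval(sample, t_crit):
    """Return (low, high) for CI(mu), built from t = (x_bar - mu) / s * sqrt(n).

    t_crit is the two-sided critical value of Student's t distribution
    for n - 1 degrees of freedom, looked up for the chosen alpha.
    """
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    half_width = t_crit * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical sample of n = 10 measurements; alpha = 0.05, df = 9.
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9]
low, high = mean_confidence_interval(sample, t_crit=2.262)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

Solving the t formula for µ gives the interval $\bar{x} \pm t_{crit} \cdot s / \sqrt{n}$, which is all the function computes.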
Hypothesis testing p-value
p-value visualisation
Hypothesis testing Hypothesis testing process
Hypothesis testing process
Q: Does our sample come from a population in which the null hypothesis holds?
we have: an idea about the population (from theory, intuition, government, ...)
we have: a sample with statistical characteristics (n, $\bar{x}$, sd, ...)
we have: α, the probability of a mistake that we are willing to accept (usually α = 0.05)
to do: formulate the null and alternative hypotheses
to do: determine the probability of obtaining a sample at least as extreme as ours if the null hypothesis holds → p-value or sig.
to do: compare the p-value from the sample with the α level
p-value < α → reject the null hypothesis
p-value ≥ α → retain the null hypothesis.
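As a rough illustration of the whole process, the sketch below tests whether a coin is fair. The coin-flip scenario, the exact binomial test, and the names `binomial_two_sided_p` and `decide` are assumptions chosen for this example; the slides do not prescribe a particular test.

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Two-sided p-value for H0: P(heads) = 0.5 after k heads in n flips.

    Binomial(n, 0.5) is symmetric, so the two-sided p-value is twice the
    probability of the more extreme tail, capped at 1.
    """
    tail_start = max(k, n - k)
    upper_tail = sum(comb(n, i) for i in range(tail_start, n + 1)) / 2**n
    return min(1.0, 2 * upper_tail)


def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when p-value < alpha."""
    return "reject H0" if p_value < alpha else "retain H0"


for heads in (50, 60, 70):
    p = binomial_two_sided_p(heads, 100)
    print(heads, "heads of 100:", f"p = {p:.4f}", "->", decide(p))
```

Retaining H0 for 60 heads does not show that the coin is fair; it only means the sample is not extreme enough at the chosen α.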
Hypothesis testing Two possible errors
Two possible errors
Q: Which mistakes can I make in null hypothesis testing?
null hypothesis rejected correctly (True Positive, TP)
null hypothesis rejected incorrectly (False Positive, FP, type I error)
null hypothesis retained correctly (True Negative, TN)
null hypothesis retained incorrectly (False Negative, FN, type II error)
Terminology: H0 is rejected ∼ test is positive ∼ discovery
test result about H0 rejection
                     positive (discovery)   negative
reality   H0 false   TP                     FN
          H0 true    FP                     TN
Online demonstration of the two types of error
Two errors
Hypothesis testing Power of analysis, sample size, effect size
Power of test
test result about H0
                     positive (discovery)   negative
reality   H0 false   TP (power, 1 − β)      FN (β)
          H0 true    FP (α)                 TN (1 − α)
In "basic-level statistics" you set α, the probability of a false positive result (e.g. a false positive cancer diagnosis).
In "advanced-level statistics" you compute the minimal required sample size from the given α, β and effect size.
Four numbers are related: α, β, effect size and sample size.
If the effect size and sample size are fixed, then decreasing α implies increasing β.
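The relation between the four numbers can be made concrete for one simple case. The sketch below assumes a one-sided, one-sample z-test with known standard deviation, with the effect size expressed in standard-deviation units; the function names are hypothetical, and dedicated tools such as G*Power cover many more test types. Power is taken as 1 − β, the probability of rejecting a false H0.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def beta_error(alpha, effect_size, n):
    """Type II error probability (beta) of a one-sided one-sample z-test."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(z_alpha - effect_size * sqrt(n))


def required_sample_size(alpha, beta, effect_size):
    """Smallest n for which the test reaches power 1 - beta."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    return ceil(((z_alpha + z_beta) / effect_size) ** 2)


# Fixed effect size and sample size: a smaller alpha gives a larger beta.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f} -> beta = {beta_error(alpha, 0.5, 20):.3f}")

print("n for alpha = 0.05, beta = 0.20, d = 0.5:",
      required_sample_size(0.05, 0.20, 0.5))
```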
Software for power analysis
G*Power, packages for R or Python, ...
Multiple comparison problem Introduction
More tests
Q: What happens to the probability of false positives if we use more than one test?
for one test: the probability of a false positive result is
$P(FP) = \alpha$
for two independent tests: the probability of at least one false positive result is
$P(FP_1 \text{ or } FP_2) = P(FP_1) + P(FP_2) - P(FP_1 \text{ and } FP_2)$
$= 1 - P(\neg FP_1 \text{ and } \neg FP_2)$
$= 1 - (1 - \alpha) \cdot (1 - \alpha) = 1 - (1 - \alpha)^2$
for m independent tests: the probability of at least one false positive result is
$P(FP_1 \text{ or } \dots \text{ or } FP_m) = 1 - (1 - \alpha)^m$
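A few values of this formula show how quickly the risk grows. The short sketch below assumes independent tests in which every null hypothesis is true; the function name is hypothetical.

```python
def family_wise_error_rate(alpha, m):
    """Probability of at least one false positive in m independent tests."""
    return 1 - (1 - alpha) ** m


for m in (1, 2, 5, 20, 100):
    print(f"m = {m:3d}: P(at least one FP) = {family_wise_error_rate(0.05, m):.3f}")
```

With α = 0.05 and 20 tests, as in the jelly bean comic, a false positive somewhere is more likely than not.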
Multiple comparison problem Family-wise error rate correction
More tests
Q: What is the relationship between the number of tests m and $P(FP_1 \text{ or } \dots \text{ or } FP_m) = 1 - (1 - \alpha)^m$?
Basic alpha correction
Q: How should we change α → $\alpha_{corr}$ so that $P(FP_1 \text{ or } \dots \text{ or } FP_m)$ equals α?
$P(FP_1 \text{ or } \dots \text{ or } FP_m)$ should be α
$P(FP_1 \text{ or } \dots \text{ or } FP_m) = \alpha$
$1 - (1 - \alpha_{corr})^m = \alpha$
$\alpha_{corr} = 1 - (1 - \alpha)^{1/m}$
$\alpha_{corr}$ is called the Šidák correction, named after the Czech statistician Zbyněk Šidák (see wiki); we will denote it $\alpha_{sid}$.
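A quick numerical check of the derivation: the sketch below computes the Šidák-corrected level and confirms that using it in each of m independent tests brings the overall false positive probability back to α. The function name `sidak_alpha` is an assumption for illustration.

```python
def sidak_alpha(alpha, m):
    """Per-test level that keeps P(at least one FP) at alpha over m independent tests."""
    return 1 - (1 - alpha) ** (1 / m)


alpha, m = 0.05, 20
alpha_sid = sidak_alpha(alpha, m)
overall = 1 - (1 - alpha_sid) ** m  # plug the corrected level back into the formula

print(f"alpha_sid = {alpha_sid:.6f}")
print(f"P(at least one FP) with alpha_sid = {overall:.6f}")
```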
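Bonferroni correction
Q: What about the Bonferroni correction $\alpha_{bonf} = \frac{\alpha}{m}$?
It is a linear approximation of the Šidák correction
$\alpha_{sid} = 1 - (1 - \alpha)^{1/m}$
Laurent series at m = ∞: $\alpha_{sid} = \frac{-\log(1 - \alpha)}{m} + O\left(\left(\frac{1}{m}\right)^2\right)$
Taylor series at α = 0: $\frac{-\log(1 - \alpha)}{m} = \frac{\alpha}{m} + O(\alpha^2)$
In practice there is almost no difference between the two:
$\alpha_{sid} \approx \frac{\alpha}{m} = \alpha_{bonf}$
Both $\alpha_{sid}$ and $\alpha_{bonf}$ depend only on the total number of tests.
The Bonferroni correction is named after the Italian mathematician Carlo Emilio Bonferroni.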
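To see how close the two corrections are, the sketch below tabulates both for a few values of m at α = 0.05. The function names are hypothetical.

```python
def sidak_alpha(alpha, m):
    """Sidak-corrected per-test level: 1 - (1 - alpha)^(1/m)."""
    return 1 - (1 - alpha) ** (1 / m)


def bonferroni_alpha(alpha, m):
    """Bonferroni-corrected per-test level: alpha / m."""
    return alpha / m


alpha = 0.05
for m in (2, 5, 20, 100, 1000):
    sid = sidak_alpha(alpha, m)
    bonf = bonferroni_alpha(alpha, m)
    print(f"m = {m:4d}: sidak = {sid:.6f}, bonferroni = {bonf:.6f}, "
          f"ratio = {sid / bonf:.4f}")
```

The Bonferroni level is always the slightly smaller (stricter) of the two, and for α = 0.05 the ratio stays below about 1.03 for any m.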
Multiple comparison problem Two types of errors again...
Balance between FP and FN
Q: And what about β?
Online demonstration of the two types of error
decreasing α → increasing β
increasing β → increasing probability of FN → the test goes "blind"
how to balance FP against FN depends on the problem being solved
sometimes it is better to reduce FP
e.g. in justice: no one should be imprisoned wrongly
sometimes it is better to reduce FN
e.g. in brain disorders: detecting some disorders correctly and some wrongly is better than detecting no disorders at all
Balance between FP and FN
Q: What if we have thousands of tests?
Šidák and Bonferroni control false positives among all results:
Family-Wise Error Rate (FWER), FWER = P(FP ≥ 1), the probability of at least one false positive among all m tests
FWER corrections are strict and tend to make the test blind
another point of view is necessary, so what about ...
... controlling false positives only among the discoveries:
False Discovery Rate (FDR), FDR = E[FP/(TP+FP)], the expected share of false positives among the discoveries
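How "blind" a FWER correction makes a test can be estimated for a single real effect. The sketch below assumes a one-sided z-test and a hypothetical standardized effect of 2.5 standard errors, then compares the power at the uncorrected level with the power at the Bonferroni level for m = 1000 tests. The setup and names are illustrative assumptions only.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def power_one_sided_z(alpha, standardized_effect):
    """Probability of rejecting H0 when the true effect is the given number of standard errors."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(z_alpha - standardized_effect)


alpha, m, effect = 0.05, 1000, 2.5
power_plain = power_one_sided_z(alpha, effect)
power_bonf = power_one_sided_z(alpha / m, effect)

print(f"power without correction:       {power_plain:.3f}")
print(f"power with Bonferroni, m = {m}: {power_bonf:.3f}")
```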
Multiple comparison problem Control of the False Discovery Rate
Benjamini-Hochberg algorithm
Q: How to control the FDR at a predefined level α in m tests?
Benjamini-Hochberg algorithm for independent tests
1 run all tests and determine all p-values
2 sort the p-values from the smallest: P[i], i = 1, ..., m
3 compute the linear series $C[i] = \alpha \cdot \frac{i}{m}$
4 set k as the largest i for which P[i] ≤ C[i]
5 $\alpha_{bh} = \alpha \cdot \frac{k}{m}$; reject H0 for every test with p-value ≤ $\alpha_{bh}$
$\alpha_{bh}$ depends on the number of all tests and on the concrete series of p-values.
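A minimal sketch of the procedure in Python. It follows the standard step-up form of Benjamini-Hochberg, in which k is the largest rank i with P[i] ≤ α · i / m and every p-value up to the resulting threshold is a discovery. The function name and the example p-values are assumptions for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (k, threshold, rejected) for the Benjamini-Hochberg step-up procedure.

    k          number of discoveries
    threshold  alpha * k / m, the corrected level
    rejected   one boolean per input p-value, in the original order
    """
    m = len(p_values)
    k = 0
    for i, p in enumerate(sorted(p_values), start=1):
        if p <= alpha * i / m:  # compare P[i] with the linear series C[i]
            k = i               # keep the largest rank that passes
    threshold = alpha * k / m
    rejected = [p <= threshold for p in p_values]
    return k, threshold, rejected


# Hypothetical p-values from m = 5 tests, deliberately unsorted.
p_values = [0.9, 0.026, 0.001, 0.027, 0.025]
k, threshold, rejected = benjamini_hochberg(p_values, alpha=0.05)
print("discoveries:", k, "threshold:", threshold, "rejected:", rejected)
```

In this example the second-smallest p-value (0.025) lies above its own C[2] = 0.02, yet it is still rejected because larger ranks pass. That is the difference between taking the largest passing rank and stopping at the first failure.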
Benjamini-Hochberg visualization
p-value distribution
Q: Why does $\alpha_{bh}$ use the number of all tests if it controls only the FDR?
we don't know which p-values come from true effects (HA)
and which do not, but ...
we can construct the p-value distribution
from the definition of p-values we know:
all p-values from H0 have a uniform distribution on (0, 1)
all p-values from HA have a decreasing distribution,
from its peak (close to 0) down to zero (close to 1)
all p-values together have a mixed distribution
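These three shapes can be reproduced with a small simulation. The sketch below draws one-sided z-test p-values, with a shift of 0 for tests where H0 holds and a hypothetical shift of 2 standard errors where HA holds, and prints ten-bin histograms. The counts, the shift, the seed and the function names are all assumptions for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def simulate_p_values(n_tests, shift, rng):
    """One-sided z-test p-values; shift = 0 simulates H0, shift > 0 simulates HA."""
    return [1 - STD_NORMAL.cdf(rng.gauss(shift, 1.0)) for _ in range(n_tests)]


def histogram(p_values, bins=10):
    """Count p-values in equal-width bins on [0, 1]."""
    counts = [0] * bins
    for p in p_values:
        counts[min(int(p * bins), bins - 1)] += 1
    return counts


rng = random.Random(2018)  # fixed seed so the run is repeatable
null_p = simulate_p_values(2000, shift=0.0, rng=rng)
alt_p = simulate_p_values(500, shift=2.0, rng=rng)

print("H0   :", histogram(null_p))
print("HA   :", histogram(alt_p))
print("mixed:", histogram(null_p + alt_p))
```

The H0 histogram is roughly flat, the HA histogram piles up near 0, and the mixed histogram is a flat floor with a peak at the left edge.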
p-value distribution
p-value distribution and q-values
Q: So is it possible to use the p-value distribution to control the FDR?
Determining q-values from the p-value distribution (Storey)
1 sort the p-values from the smallest: P[i]
2 create a density plot of P[i] on (0, 1) with step 0.05 (or smaller)
3 determine π0 from the right part of the density: the level separating H0 from HA
4 compute the q-values Q[k] as false discovery rates
5 select the largest k for which Q[k] ≤ α
6 $\alpha_{st} = P[k]$
$\alpha_{st}$ is based on the distribution of the p-values.
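A simplified sketch of the q-value computation. Instead of reading π0 off a density plot, it assumes a single fixed cut-off λ = 0.5 and estimates π0 as the share of p-values above λ divided by (1 − λ). The q-value of the i-th smallest p-value is then the smallest value of π0 · m · P[j] / j over all ranks j ≥ i. The function name, the choice of λ and the example p-values are assumptions; the qvalue software linked in the web sources is more refined.

```python
def storey_q_values(p_values, lam=0.5):
    """Return (pi0, q_values) with q-values in the original order of p_values."""
    m = len(p_values)
    # Share of H0 tests, estimated from the flat right part of the distribution.
    pi0 = min(1.0, sum(p > lam for p in p_values) / (m * (1.0 - lam)))

    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q_values = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pi0 * m * p_values[idx] / rank)
        q_values[idx] = running_min
    return pi0, q_values


# Hypothetical p-values from m = 8 tests.
p_values = [0.001, 0.002, 0.003, 0.004, 0.2, 0.4, 0.6, 0.8]
pi0, q_values = storey_q_values(p_values)
discoveries = sum(q <= 0.05 for q in q_values)
print("pi0 =", pi0, "discoveries at FDR 0.05:", discoveries)
```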
Computational Psycholinguistic Analysis of Czech Text
Two examples of p-value distributions from our research
Finish Questions?
Web sources, contact
https://xkcd.com/882/
https://shiny.rit.albany.edu/stat/confidence/
http://rpsychologist.com/d3/CI/
http://varianceexplained.org/statistics/interpreting-pvalue-histogram/
http://qvalue.princeton.edu/
Jiří Haviger
ResearchGate, ORCID, LinkedIn ...
e:jiri.haviger@uhk.cz
Multiple comparison problem

  • 1. ... and are you sure? Multiple statistical comparisons problem Jiˇr´ı Haviger jiri.haviger@uhk.cz May 12, 2018 Jiˇr´ı Haviger (jiri.haviger@uhk.cz) ... and are you sure? May 12, 2018 1 / 24
  • 2. Introduction Jelly beans cause acne ... Jiˇr´ı Haviger (jiri.haviger@uhk.cz) ... and are you sure? May 12, 2018 2 / 24
  • 3. Introduction ... and are you sure? Jiˇr´ı Haviger (jiri.haviger@uhk.cz) ... and are you sure? May 12, 2018 3 / 24
  • 4. Basic idea of inferential statistics Inference, confidence intervals and pvalue Inference Demostration of sample means distributions, shiny.rit.albany.edu Demostration of sample means distributions, rpsychologist.com Jiˇr´ı Haviger (jiri.haviger@uhk.cz) ... and are you sure? May 12, 2018 4 / 24
  • 5. Basic idea of inferential statistics Inference, confidence intervals and pvalue Confidence intervals Q: how to estimate the popultion characteristic from knowing sample? Point? Interval? Probabilistic theory: is knowing probability density function PDF of sample measures (eg. Student s T distribution of sample means m) for different samples we have: sample with statistical characteristic (n, x, sd, ...) we have: α as a probability in which we accept mistake (usually α = 0.05) to do:: from sample information N, m, sd and α... transform sample characteristics into variable with knowing distribution (e.g. t = x−µ s · √ n) to do: based on PDF and t determine confidence interval for characteristic (eg. CI(µ)) Jiˇr´ı Haviger (jiri.haviger@uhk.cz) ... and are you sure? May 12, 2018 5 / 24
  • 6. Hypothesis testing pvalue pvalue visualisation Jiˇr´ı Haviger (jiri.haviger@uhk.cz) ... and are you sure? May 12, 2018 6 / 24
  • 7. Hypothesis testing Hypothesis testing process Hypothesis testing procecss Q: Comes our sample comes population with null hypothesis? we have: idea about population (from theory, intuition, goverment, ... ) we have: sample with statistical characteristic (n, x, sd, ...) we have: α as a probability in which we accept mistake (usually α = 0.05) to do: formulate null and alternative hypothesis to do: determine probability, that our sample is from population with null hypothesis → p-value or sig. to do: compare pvalue from sample and α level pvalue < α → reject null hypothesis pvalue ≥ α → retain null hypothesis. Jiˇr´ı Haviger (jiri.haviger@uhk.cz) ... and are you sure? May 12, 2018 7 / 24
  • 8. Hypothesis testing Two possible errors Two possible errors Q: Which mistakes in null hypothesis testing can I do? null hypothesis rejected correctly (True Positive, TP) null hypothesis rejected noncorrectly (False Positive, FP, error I) null hypothesis retain correctly (True Negativ, TN) null hypothesis retain noncorrectly (False Negative, FN, error II) Terminology: H0 is reject ∼ test is positive ∼ discovery test result about H0 rejection positive (discovery) negative reality H0 false TP FN true FP TN Online demostration of two type of error Jiˇr´ı Haviger (jiri.haviger@uhk.cz) ... and are you sure? May 12, 2018 8 / 24
  • 9. Hypothesis testing Two possible errors Two errors Jiˇr´ı Haviger (jiri.haviger@uhk.cz) ... and are you sure? May 12, 2018 9 / 24
  • 10. Hypothesis testing Power of analysis, sample size, effect size Power of test test result about H0 positive (discovery) negative reality H0 false TP FN (β) true FP (α) TN (power, 1 − β) In ”basic level of statistic” you determine α as probability of false positives results (eg. false positives diagnoses of cancer) in ”advanced level of statistic” you to compute minimal reqiured sample size from given α β and effect size. There are four numbers in relation: α, β, effect size and sample size if is fixed effect size and sample size, then decreasing α implies increasing β Jiˇr´ı Haviger (jiri.haviger@uhk.cz) ... and are you sure? May 12, 2018 10 / 24
  • 11. Hypothesis testing: Power of analysis, sample size, effect size
    Software for power analysis: G*Power, packages for R or Python, ...
  • 12. Multiple comparison problem: Introduction
    More tests
    Q: What happens to the probability of false positives if we use more than one test?
    for one test: the probability of a false positive result is
      $P(FP) = \alpha$
    for two independent tests: the probability of at least one false positive result is
      $P(FP_1 \text{ or } FP_2) = P(FP_1) + P(FP_2) - P(FP_1 \text{ and } FP_2)$
      $= 1 - P(\neg FP_1 \text{ and } \neg FP_2)$
      $= 1 - (1 - \alpha)(1 - \alpha) = 1 - (1 - \alpha)^2$
    for m independent tests: the probability of at least one false positive result is
      $P(FP_1 \text{ or } \dots \text{ or } FP_m) = 1 - (1 - \alpha)^m$
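To get a feeling for how fast this probability grows, the formula can be evaluated directly. A minimal sketch, assuming independent tests in which every null hypothesis is true; the function name `fwer` is only illustrative.

```python
def fwer(alpha: float, m: int) -> float:
    """P(at least one false positive) among m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m


# Evaluate 1 - (1 - alpha)^m for a few family sizes at alpha = 0.05.
for m in (1, 2, 10, 20, 100):
    print(f"m = {m:3d}  P(at least one FP) = {fwer(0.05, m):.3f}")
```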
  • 13. Multiple comparison problem: Family-wise error rate correction
    More tests
    Q: What is the relationship between the number of tests m and $P(FP_1 \text{ or } \dots \text{ or } FP_m) = 1 - (1 - \alpha)^m$?
  • 14. Multiple comparison problem: Family-wise error rate correction
    Basic alpha correction
    Q: How do we change $\alpha \to \alpha_{corr}$ so that $P(FP_1 \text{ or } \dots \text{ or } FP_m)$ equals $\alpha$?
      $P(FP_1 \text{ or } \dots \text{ or } FP_m) = \alpha$
      $1 - (1 - \alpha_{corr})^m = \alpha$
      $\alpha_{corr} = 1 - (1 - \alpha)^{1/m}$
    $\alpha_{corr}$ is called the Šidák correction, named after the Czech statistician Zbyněk Šidák (see wiki); we will denote it $\alpha_{sid}$.
  • 15. Multiple comparison problem: Family-wise error rate correction
    Bonferroni correction
    Q: What about the Bonferroni correction $\alpha_{bonf} = \frac{\alpha}{m}$?
    It is a linear approximation of the Šidák correction $\alpha_{sid} = 1 - (1 - \alpha)^{1/m}$:
      Laurent series at $m = \infty$: $\alpha_{sid} = \frac{-\log(1 - \alpha)}{m} + O\left(\left(\frac{1}{m}\right)^2\right)$
      Taylor series at $\alpha = 0$: $\frac{-\log(1 - \alpha)}{m} = \frac{\alpha}{m} + O(\alpha^2)$
    Practically there is no difference in using either: $\alpha_{sid} \approx \frac{\alpha}{m} = \alpha_{bonf}$
    The $\alpha_{sid}$ and $\alpha_{bonf}$ corrections are based on the number of all tests.
    The Bonferroni correction is named after the Italian mathematician Carlo Emilio Bonferroni.
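How close the two corrections are can be checked numerically. A small sketch; the function names `alpha_sidak` and `alpha_bonferroni` are illustrative and not from the slides.

```python
def alpha_sidak(alpha: float, m: int) -> float:
    """Per-test level that keeps P(at least one FP) at alpha for m independent tests."""
    return 1 - (1 - alpha) ** (1 / m)


def alpha_bonferroni(alpha: float, m: int) -> float:
    """Linear approximation of the Sidak level: alpha divided by the number of tests."""
    return alpha / m


alpha = 0.05
for m in (2, 10, 100, 1000):
    sid = alpha_sidak(alpha, m)
    bonf = alpha_bonferroni(alpha, m)
    rel_diff = (sid - bonf) / bonf  # how much larger the Sidak level is
    print(f"m = {m:4d}  sidak = {sid:.6f}  bonferroni = {bonf:.6f}  diff = {rel_diff:.2%}")
```

Bonferroni is always slightly stricter than Šidák, and at α = 0.05 the two per-test levels differ by less than 3 % for any m.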
  • 16. Multiple comparison problem: Two types of error again...
    Balance between FP and FN
    Q: And what about β?
    Online demonstration of the two types of error
    decreasing α → increasing β
    increasing β → increasing probability of FN → the test is going "blind"
    how to balance FP and FN depends on the problem being solved
    sometimes it is better to reduce FP, e.g. in justice: no one should be imprisoned wrongly
    sometimes it is better to reduce FN, e.g. in brain disorders: detecting some disorders correctly and some wrongly is better than not detecting disorders at all
  • 17. Multiple comparison problem: Two types of error again...
    Balance between FP and FN
    Q: What if we have thousands of tests?
    Šidák and Bonferroni control false positives among all results: Family-Wise Error Rate (FWER), the probability of at least one false positive in the whole family, FWER = P(FP ≥ 1)
    FWER corrections are strict and tend to make the test "blind"
    another point of view is necessary, so what about ...
    ... controlling false positives only among the discoveries: False Discovery Rate (FDR), the expected proportion FDR = E[FP/(TP+FP)]
  • 18. Multiple comparison problem: Control of the False Discovery Rate
    Benjamini-Hochberg algorithm
    Q: How do we control the FDR at a predefined level $\alpha$ in m tests?
    Benjamini-Hochberg algorithm for independent tests:
      1. run all tests and determine all p-values
      2. sort the p-values from the smallest one: $P[i]$
      3. compute the linear series $C[i] = \alpha \cdot \frac{i}{m}$
      4. set k as the largest i for which $P[i] \le C[i]$ and reject the null hypotheses of the k smallest p-values
      5. $\alpha_{bh} = \alpha \cdot \frac{k}{m}$
    $\alpha_{bh}$ is based on the number of all tests and on the concrete series of p-values.
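A minimal sketch of the Benjamini-Hochberg step-up procedure as it is usually stated, where k is the largest rank i with P[i] ≤ α·i/m. The function name `benjamini_hochberg` and the p-values are made up for illustration; they are not data from the talk.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return (k, threshold, rejected) for the Benjamini-Hochberg procedure.

    k         - number of discoveries (largest rank i with P[i] <= alpha * i / m)
    threshold - alpha * k / m, the resulting p-value cut-off
    rejected  - list of booleans in the original order of pvalues
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            k = rank  # keep the largest rank that passes (step-up)
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return k, alpha * k / m, rejected


# Illustrative, unsorted p-values (hypothetical numbers).
pvals = [0.9, 0.028, 0.005, 0.029, 0.025]
k, threshold, rejected = benjamini_hochberg(pvals, alpha=0.05)
print(k, threshold, rejected)
```

In this example the second smallest p-value (0.025) lies above its own line value (0.02) but is still rejected, because larger p-values further along the sorted series fall below the line; that is what "largest i" means in step 4.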
  • 19. Multiple comparison problem: Control of the False Discovery Rate
    Benjamini-Hochberg visualization
  • 20. Multiple comparison problem: Control of the False Discovery Rate
    p-value distribution
    Q: Why does $\alpha_{bh}$ use the number of all tests if it controls only the FDR?
    we don't know which p-values come from discoveries and which do not, but ...
    we can construct the distribution of the p-values
    from the definition of p-values we know:
      p-values of tests where H0 holds are uniformly distributed between 0 and 1
      p-values of tests where HA holds have a decreasing density, highest close to 0 and falling to zero close to 1
      all p-values together follow a mixture of these two distributions
  • 21. Multiple comparison problem: Control of the False Discovery Rate
    p-value distribution
  • 22. Multiple comparison problem: Control of the False Discovery Rate
    p-value distribution and q-values
    Q: So is it possible to use the p-value distribution to control the FDR?
    Determining q-values from the p-value distribution (Storey):
      1. sort the p-values from the smallest one: $P[i]$
      2. create a density plot of $P[i]$ on (0, 1) with step 0.05 (or smaller)
      3. determine $\pi_0$ from the right part of the density: the level separating H0 from HA
      4. compute the q-values $Q[k]$ as the false discovery rate
      5. select the largest k such that $Q[k] \le \alpha$
      6. $\alpha_{st} = P[k]$
    $\alpha_{st}$ is based on the distribution of the p-values.
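The q-value steps can be sketched as follows. This is a simplified version under stated assumptions: π0 is estimated from the share of p-values above a fixed cut-off `lam` (0.5 here) instead of by inspecting or smoothing the density, and the function name `storey_qvalues` and the p-values are hypothetical.

```python
def storey_qvalues(pvalues, lam=0.5):
    """Return (pi0, qvalues) with q-values in the original order of pvalues.

    pi0 is the estimated share of true null hypotheses: p-values above lam
    are assumed to come almost only from H0, which is uniform on (0, 1).
    Note: if no p-value exceeds lam, pi0 is 0 and every q-value is 0.
    """
    m = len(pvalues)
    pi0 = min(1.0, sum(p > lam for p in pvalues) / (m * (1 - lam)))
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvalues[idx] / rank)
        qvalues[idx] = running_min
    return pi0, qvalues


# Illustrative p-values (hypothetical numbers).
pvals = [0.001, 0.004, 0.01, 0.03, 0.2, 0.4, 0.6, 0.7, 0.8, 0.95]
pi0, qvals = storey_qvalues(pvals)
discoveries = [p for p, q in zip(pvals, qvals) if q <= 0.05]
print(pi0, discoveries)
```

The largest p-value among `discoveries` plays the role of the threshold in step 6.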
  • 23. Multiple comparison problem: Control of the False Discovery Rate
    Computational Psycholinguistic Analysis of Czech Text
    Two examples of p-value distributions from our research
  • 24. Finish
    Questions?
    Web sources:
      https://xkcd.com/882/
      https://shiny.rit.albany.edu/stat/confidence/
      http://rpsychologist.com/d3/CI/
      http://varianceexplained.org/statistics/interpreting-pvalue-histogram/
      http://qvalue.princeton.edu/
    Contact: Jiří Haviger (ResearchGate, ORCID, LinkedIn ...), e-mail: jiri.haviger@uhk.cz