1. ... and are you sure?
Multiple statistical comparisons problem
Jiří Haviger
jiri.haviger@uhk.cz
May 12, 2018
Jiří Haviger (jiri.haviger@uhk.cz) ... and are you sure? May 12, 2018 1 / 24
3. Introduction
... and are you sure?
4. Basic idea of inferential statistics Inference, confidence intervals and p-value
Inference
Demonstration of sample means distributions, shiny.rit.albany.edu
Demonstration of sample means distributions, rpsychologist.com
5. Basic idea of inferential statistics Inference, confidence intervals and p-value
Confidence intervals
Q: How to estimate a population characteristic from a known sample? As a point? As an interval?
Probabilistic theory:
we know: the probability density function (PDF) of the sample statistic across different samples (e.g. Student's t distribution of sample means)
we have: a sample with statistical characteristics (n, x̄, sd, ...)
we have: α as the probability of error we accept (usually α = 0.05)
to do: from the sample information n, x̄, sd and α, transform the sample characteristics into a variable with a known distribution (e.g. t = (x̄ − µ)/s · √n)
to do: based on the PDF and t, determine the confidence interval for the characteristic (e.g. CI(µ))
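The two "to do" steps above can be sketched in Python, assuming SciPy is available; the sample characteristics n, x̄, sd are made-up illustrative numbers, not from the slides:

```python
from math import sqrt
from scipy import stats

# Hypothetical sample characteristics (illustrative numbers only)
n, x_bar, sd = 25, 102.3, 8.1
alpha = 0.05

# Step 1: the transformed statistic t = (x_bar - mu)/s * sqrt(n)
# follows Student's t distribution with n - 1 degrees of freedom.
# Step 2: invert that distribution to get CI(mu) = x_bar -/+ t_crit * sd / sqrt(n).
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
lower = x_bar - t_crit * sd / sqrt(n)
upper = x_bar + t_crit * sd / sqrt(n)
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f})")
```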
7. Hypothesis testing Hypothesis testing process
Hypothesis testing process
Q: Does our sample come from the population described by the null hypothesis?
we have: an idea about the population (from theory, intuition, government, ...)
we have: a sample with statistical characteristics (n, x̄, sd, ...)
we have: α as the probability of error we accept (usually α = 0.05)
to do: formulate the null and alternative hypotheses
to do: determine the probability that our sample comes from a population satisfying the null hypothesis → p-value or sig.
to do: compare the sample's p-value with the α level:
p-value < α → reject the null hypothesis
p-value ≥ α → retain the null hypothesis.
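The procedure above as a minimal Python sketch, assuming SciPy is available; the sample data and the null value µ = 100 are hypothetical:

```python
from scipy import stats

# Hypothetical sample and null hypothesis H0: mu = 100
sample = [101, 99, 104, 103, 97, 105, 102, 100, 98, 106]
alpha = 0.05

# p-value: probability of a sample at least this extreme under H0
t_stat, p_value = stats.ttest_1samp(sample, popmean=100)

# Compare the p-value with the alpha level
if p_value < alpha:
    print(f"p = {p_value:.3f} < {alpha}: reject H0")
else:
    print(f"p = {p_value:.3f} >= {alpha}: retain H0")
```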
8. Hypothesis testing Two possible errors
Two possible errors
Q: Which mistakes can I make in null hypothesis testing?
null hypothesis rejected correctly (True Positive, TP)
null hypothesis rejected incorrectly (False Positive, FP, Type I error)
null hypothesis retained correctly (True Negative, TN)
null hypothesis retained incorrectly (False Negative, FN, Type II error)
Terminology: H0 is rejected ∼ test is positive ∼ discovery

                   test result about H0 rejection
                   positive (discovery)   negative
reality: H0 false  TP                     FN
reality: H0 true   FP                     TN

Online demonstration of the two types of error
9. Hypothesis testing Two possible errors
Two errors
10. Hypothesis testing Power of analysis, sample size, effect size
Power of test

                   test result about H0
                   positive (discovery)   negative
reality: H0 false  TP (power, 1 − β)      FN (β)
reality: H0 true   FP (α)                 TN

At the "basic level of statistics" you set α as the probability of false positive results (e.g. false positive diagnoses of cancer).
At the "advanced level of statistics" you compute the minimal required sample size from given α, β and effect size.
There are four numbers in relation: α, β, effect size and sample size.
If the effect size and sample size are fixed, then decreasing α implies increasing β.
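A rough sketch of the sample-size computation, using the normal approximation for a two-sided one-sample test (standard library only; an exact t-based power analysis, e.g. in G*Power, gives slightly larger n):

```python
from math import ceil
from statistics import NormalDist

def required_n(alpha: float, beta: float, effect_size: float) -> int:
    """Minimal n for a two-sided one-sample z-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls false positives
    z_beta = NormalDist().inv_cdf(1 - beta)        # controls false negatives
    return ceil(((z_alpha + z_beta) / effect_size) ** 2)

# alpha = 0.05, power 1 - beta = 0.8, medium effect size d = 0.5
print(required_n(0.05, 0.2, 0.5))  # -> 32
```

Shrinking the effect size or α drives the required sample size up, which is the four-way relation the slide describes.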
11. Hypothesis testing Power of analysis, sample size, effect size
Software for power analysis
G*Power, packages for R or Python, ...
12. Multiple comparisons problem Introduction
More tests
Q: What happens to the probability of false positives if we use more than one test?
for one test: the probability of a false positive result is
P(FP) = α
for two tests: the probability of at least one false positive result is
P(FP1 or FP2) = P(FP1) + P(FP2) − P(FP1 and FP2) = · · ·
· · · = 1 − P(¬FP1 and ¬FP2) = · · ·
· · · = 1 − (1 − α) · (1 − α) = 1 − (1 − α)^2
for m tests: the probability of at least one false positive result is
P(FP1 or ... or FPm) = 1 − (1 − α)^m
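The formula above is easy to evaluate; a quick numeric illustration of how fast the family-wise error grows with m:

```python
# P(at least one FP among m tests) = 1 - (1 - alpha)^m grows quickly with m
alpha = 0.05
for m in (1, 5, 10, 20, 100):
    fwer = 1 - (1 - alpha) ** m
    print(f"m = {m:3d}: P(at least one FP) = {fwer:.3f}")
```

Already at m = 20 the chance of at least one false positive is about 64%, and at m = 100 it is near certainty.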
13. Multiple comparisons problem Family wise error rate correction
More tests
Q: What is the relationship between the number of tests m and P(FP1 or ... or FPm) = 1 − (1 − α)^m?
14. Multiple comparisons problem Family wise error rate correction
Basic alpha correction
Q: How to change α → αcorr so that P(FP1 or ... or FPm) equals α?
P(FP1 or ... or FPm) should be α:
P(FP1 or ... or FPm) = α
1 − (1 − αcorr)^m = α
αcorr = 1 − (1 − α)^(1/m)
This αcorr is called the Šidák correction, named after the Czech statistician Zbyněk Šidák
(see wiki); we will write it as αsid.
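Plugging the correction back into the FWER formula confirms it works; a two-line check:

```python
# Sidak correction: per-test level alpha_sid such that the FWER is exactly alpha
alpha, m = 0.05, 10
alpha_sid = 1 - (1 - alpha) ** (1 / m)
fwer = 1 - (1 - alpha_sid) ** m          # plug alpha_sid back in
print(f"alpha_sid = {alpha_sid:.5f}, resulting FWER = {fwer:.5f}")
```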
15. Multiple comparisons problem Family wise error rate correction
Bonferroni correction
Q: What about the Bonferroni correction αbonf = α/m?
It is a linear approximation of the Šidák correction:
αsid = 1 − (1 − α)^(1/m)
Laurent series at m = ∞: αsid ≈ −log(1 − α)/m + O(1/m^2)
Taylor series at α = 0: −log(1 − α)/m ≈ α/m + O(α^2)
Practically there is no difference in using
αsid ≈ α/m = αbonf
The αsid and αbonf corrections are based on the number of all tests.
The Bonferroni correction is named after the Italian mathematician Carlo Emilio Bonferroni.
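A numeric check of the approximation: for the usual α = 0.05 the two corrections agree to within a few percent for any m, with Bonferroni always the slightly smaller (more conservative) of the two:

```python
# Sidak vs Bonferroni: alpha/m is a close, slightly conservative approximation
alpha = 0.05
for m in (5, 10, 100, 1000):
    a_sid = 1 - (1 - alpha) ** (1 / m)
    a_bonf = alpha / m                    # always a bit smaller than a_sid
    print(f"m = {m:4d}: sidak = {a_sid:.6f}, bonferroni = {a_bonf:.6f}")
```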
16. Multiple comparisons problem Two types of errors again...
Balance between FP and FN
Q: And what about β?
Online demonstration of the two types of error
decreasing α → increasing β
increasing β → increasing probability of FN → the test goes "blind"
how to balance FP against FN depends on the problem being solved:
sometimes it is better to decrease FP
e.g. in justice: no one falsely imprisoned
sometimes it is better to decrease FN
e.g. in brain disorders: detecting some disorders correctly and some wrongly
is better than detecting no disorders at all
17. Multiple comparisons problem Two types of errors again...
Balance between FP and FN
Q: What if we have thousands of tests?
Šidák and Bonferroni control false positives among all results:
Family Wise Error Rate (FWER): the probability of at least one FP among all m tests
FWER corrections are strict and tend to make the test blind
another point of view is necessary, so what about ...
... controlling the false positive rate only among the discoveries:
False Discovery Rate (FDR), FDR = FP/(TP + FP)
18. Multiple comparisons problem Controlling the False Discovery Rate
Benjamini–Hochberg algorithm
Q: How to control the FDR at a predefined level α over m tests?
Benjamini–Hochberg algorithm for independent tests:
1 run all tests and determine all p-values
2 sort the p-values from the smallest one: P[i]
3 compute the linear series C[i] = α · i/m
4 set k as the largest i for which P[i] ≤ C[i]
5 αbh = α · k/m
αbh is based on the number of all tests and the concrete series of p-values.
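The steps above can be implemented directly; the p-values in the example are hypothetical:

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return indices of hypotheses rejected by the BH step-up procedure
    (independent tests): find the largest k with P[k] <= alpha * k / m."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:  # compare P[rank] with C[rank]
            k = rank                        # keep the largest passing rank
    return sorted(order[:k])                # reject the k smallest p-values

# Hypothetical p-values: three look like discoveries, three do not
pvals = [0.001, 0.008, 0.020, 0.041, 0.20, 0.74]
print(benjamini_hochberg(pvals, alpha=0.05))  # -> [0, 1, 2]
```

Note that 0.041 < 0.05 would be "significant" alone, but it is not rejected here: its threshold C[4] = 0.05 · 4/6 ≈ 0.033 is stricter.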
19. Multiple comparisons problem Controlling the False Discovery Rate
Benjamini–Hochberg visualization
20. Multiple comparisons problem Controlling the False Discovery Rate
p-value distribution
Q: Why does αBH use the number of all tests, if it controls only the FDR?
we don't know which p-values come from discoveries
and which do not, but ...
we can construct the p-value distribution
from the definition of p-values we know:
all p-values under H0 have a uniform distribution on (0, 1)
all p-values under HA have a decreasing distribution,
from a peak (close to 0) down to zero (close to 1)
all p-values together have a mixture distribution
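The two component distributions are easy to see in a simulation; this hypothetical mixture uses 800 one-sided z-tests under H0 and 200 under HA (standard library only):

```python
import random
from statistics import NormalDist

random.seed(42)
norm = NormalDist()

# Hypothetical mixture: 800 one-sided z-tests under H0, 200 under HA
p_h0 = [1 - norm.cdf(random.gauss(0.0, 1.0)) for _ in range(800)]   # uniform on (0, 1)
p_ha = [1 - norm.cdf(random.gauss(2.5, 1.0)) for _ in range(200)]   # piled up near 0

frac_h0 = sum(p < 0.05 for p in p_h0) / len(p_h0)
frac_ha = sum(p < 0.05 for p in p_ha) / len(p_ha)
print(f"H0 p-values below 0.05: {frac_h0:.2%} (about alpha)")
print(f"HA p-values below 0.05: {frac_ha:.2%} (far more)")
```

A histogram of the pooled p-values would show exactly the shape described above: a peak near 0 from HA sitting on a flat uniform base from H0.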
21. Multiple comparisons problem Controlling the False Discovery Rate
p-value distribution
22. Multiple comparisons problem Controlling the False Discovery Rate
p-value distribution and q-values
Q: So is it possible to use the p-value distribution to control the FDR?
Determining q-values from the p-value distribution (Storey):
1 sort the p-values from the smallest one: P[i]
2 create a density plot of P[i] on (0, 1) with step 0.05 (or smaller)
3 estimate π0 from the right part of the density (the proportion of tests coming from H0)
4 compute the q-values Q[k] as the false discovery rate
5 select the maximal Q[k] such that Q[k] ≤ α
6 αst = P[k]
7 αst is based on the distribution of the p-values
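Step 3 can be sketched as follows; the p-value mixture is simulated with true π0 = 0.8, and the estimate uses the standard cutoff idea π0 ≈ #{p > λ} / (m · (1 − λ)) rather than Storey's full smoothing procedure:

```python
import random
from statistics import NormalDist

random.seed(7)
norm = NormalDist()

# Hypothetical mixture of p-values: true pi0 = 800 / 1000 = 0.8
pvals = [1 - norm.cdf(random.gauss(0.0, 1.0)) for _ in range(800)]  # from H0
pvals += [1 - norm.cdf(random.gauss(3.0, 1.0)) for _ in range(200)] # from HA

# p-values above lambda come almost only from H0, where the density is flat,
# so pi0 can be estimated as #{p > lambda} / (m * (1 - lambda))
lam = 0.5
m = len(pvals)
pi0 = sum(p > lam for p in pvals) / (m * (1 - lam))
print(f"estimated pi0 = {pi0:.2f}")  # close to the true 0.8
```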
23. Multiple comparisons problem Controlling the False Discovery Rate
Computational Psycholinguistic Analysis of Czech Text
Two examples of p-value distributions from our research
24. Finish Questions?
Web sources, contact
https://xkcd.com/882/
https://shiny.rit.albany.edu/stat/confidence/
http://rpsychologist.com/d3/CI/
http://varianceexplained.org/statistics/interpreting-pvalue-histogram/
http://qvalue.princeton.edu/
Jiří Haviger
ResearchGate, ORCID, LinkedIn ...
e:jiri.haviger@uhk.cz