Overview of Statistics I:
Statistical Testing
Presented by: Jeff Skinner, M.S.
Biostatistics Specialist
Bioinformatics and Computational Biosciences Branch (BCBB)
Office of Cyber Infrastructure and Computational Biology (OCICB)
National Institute of Allergy and Infectious Diseases (NIAID)
Want To Publish Statistical Results?
Most research journals ask three big questions:
• Do your statistical tests inappropriately assume your
data is normal or Gaussian distributed?
 Parametric tests vs. nonparametric tests
 E.g. Student’s T-tests vs. Wilcoxon Rank Sum tests
• Do your statistical tests require large samples?
 Approximate tests vs. exact tests
• Do you require adjustments for multiple testing?
 False Discovery Rate (FDR) adjustments for high throughput biology
 Tukey’s Honest Significant Difference (HSD) for analysis of variance
Nature – Statistical Checklist
Outline
• Review the statistical testing process
• When do I use one-sided vs. two-sided tests?
• When do I use parametric vs. nonparametric tests?
• When do I use large-sample tests vs. exact tests?
• When do I need to adjust for multiple testing?
• Additional questions about testing?
Recall the Statistical Testing Process
• Formulate null and alternative hypotheses
 E.g. (null) H0: μ1 = μ2 vs. (alternative) HA: μ1 ≠ μ2
• Calculate the appropriate test statistic
 E.g. Student’s t-test, Wilcoxon test, …
• Compute the probability of observing a test statistic at least as extreme as yours
(i.e. your sample data) under the null hypothesis
 I.e. Compute a p-value
• Make a statistical decision
 “Reject the null hypothesis” or “fail to reject the null hypothesis”
• Make a biological conclusion
 E.g. New drug reduces viral load, vitamin C helps prevent cancer, …
Null and Alternative Hypotheses
• (null) Drug and placebo viral loads are equal vs. (alternative) viral
loads higher for placebo group than for drug group
• (null) H0: μP – μD ≤ 0 vs. (alternative) HA: μP – μD > 0
What is a Statistical Test?
$$\text{Test} = \frac{\text{Statistic} - \text{Null Value}}{\text{Error}} = \frac{\text{Difference}}{\text{Error}}$$
• Almost all tests used in inferential statistics can be
generalized as the ratio of a “difference” over an “error”
 Difference between a statistic and null value (usually 0)
 A statistic is nothing more than a numeric summary of the experimental
data with respect to the null hypothesis
 A null value is an assumption about the population under the null
hypothesis
 An error is an estimate of the variability of the statistic’s sampling distribution
Example: Two-sample Student’s T-test
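$$T^* = \frac{\overbrace{(\bar{X}_1 - \bar{X}_2)}^{\text{statistic}} - \overbrace{0}^{\text{null value}}}{\underbrace{\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}_{\text{standard error}}}$$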
• The “statistic” in a two-sample t-test is a difference between
the two sample means and the null value is zero
 The hypothesis μ1 = μ2 implies μ1 – μ2 = 0
• The standard error is built from a pooled estimate of the common variance
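The "difference over error" structure of the two-sample statistic can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `pooled_t_statistic` and the viral-load numbers are made up for the example and do not come from the slides.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(x1, x2, null_value=0.0):
    """Two-sample Student's T*: (statistic - null value) / standard error."""
    n1, n2 = len(x1), len(x2)
    # statistics.variance is the sample variance (n - 1 in the denominator)
    s1_sq, s2_sq = variance(x1), variance(x2)
    # Pooled estimate of the common variance shared by both groups
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    standard_error = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    difference = (mean(x1) - mean(x2)) - null_value
    return difference / standard_error


# Hypothetical viral loads in thousands (illustrative values only)
placebo = [40.0, 42.0, 38.0, 41.0, 39.0]
drug = [33.0, 35.0, 31.0, 34.0, 32.0]

t_star = pooled_t_statistic(placebo, drug)
degrees_of_freedom = len(placebo) + len(drug) - 2
print(f"T* = {t_star:.3f} with {degrees_of_freedom} degrees of freedom")
```

Turning T* into a p-value requires the Student's t distribution with n1 + n2 - 2 degrees of freedom, which the standard library does not provide, so the sketch stops at the statistic.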
Null Distributions
Compute the T* statistic for our sample
Compare against the distribution of all possible T* statistics for all
possible samples from the population under the null hypothesis
• If the null hypothesis is true (i.e. no difference between groups),
then the T* statistics from most samples should be near zero
• Many null distributions (or sampling distributions) approximately
follow well-known probability distributions, e.g. normal distribution
P-values
• A p-value is the probability of
observing data at least as extreme as yours, given that the
null hypothesis is actually true
• P-values do NOT represent the
probability that the null is true
• P-values do NOT represent the
probability that a model is incorrect
• P-values do NOT represent the
strength or size of an effect
If the null distribution follows a well-known
probability distribution, like
the normal distribution, the p-values
are computed by integration
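As a rough sketch of "computed by integration": when the null distribution is standard normal, the p-value is a tail area under the normal curve, which the cumulative distribution function gives directly. The function name `normal_p_values` and the value z = 1.96 are illustrative choices, not taken from the slides.

```python
from statistics import NormalDist


def normal_p_values(z):
    """Tail areas under a standard normal null distribution."""
    null = NormalDist(mu=0.0, sigma=1.0)
    upper = 1.0 - null.cdf(z)  # area to the right of z
    lower = null.cdf(z)        # area to the left of z
    two_sided = 2.0 * min(upper, lower)
    return {"upper": upper, "lower": lower, "two_sided": two_sided}


p = normal_p_values(1.96)
print(f"upper-tail p = {p['upper']:.4f}, two-sided p = {p['two_sided']:.4f}")
```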
Statistical Decisions and
Biological Conclusions
• A statistical decision is a choice to “reject the null
hypothesis” or “fail to reject the null hypothesis”
 The decision is based on a critical value or decision rule
 E.g. Reject the null hypothesis if p-value < 0.05
• A biological conclusion is the final interpretation of
the statistical testing process in plain language
 E.g. Vitamin C prevents cancer, drug reduced viral load, …
 Make sure conclusion can be justified by the hypotheses
Type I and Type II Errors
                                    Actual population difference?
                                    Yes                              No
Was the difference detected   Yes   OK                               Type I Error (False Positive)
by the statistical test?      No    Type II Error (False Negative)   OK
• Different types of experiments attempt to minimize
Type I errors, Type II errors or both kinds of errors
 E.g. Type II errors are more important in medical testing
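A small simulation can make the Type I error rate concrete: when the null hypothesis is true, a test run at α = 0.05 should reject in roughly 5% of experiments. The sketch below is an assumption-laden illustration, not the presenter's method. It uses a two-sample z-test with a known standard deviation (so the normal distribution applies exactly), and the name `type_i_error_rate` and all parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean


def type_i_error_rate(n_trials=2000, n=30, sigma=1.0, alpha=0.05, seed=1):
    """Fraction of null experiments that are (wrongly) declared significant."""
    rng = random.Random(seed)
    null = NormalDist()
    # Standard error of a difference of two means when sigma is known
    standard_error = sigma * math.sqrt(2.0 / n)
    rejections = 0
    for _ in range(n_trials):
        # Both groups come from the same population: the null is true
        group_a = [rng.gauss(0.0, sigma) for _ in range(n)]
        group_b = [rng.gauss(0.0, sigma) for _ in range(n)]
        z = (mean(group_a) - mean(group_b)) / standard_error
        p_value = 2.0 * (1.0 - null.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1  # a false positive
    return rejections / n_trials


rate = type_i_error_rate()
print(f"Observed false positive rate: {rate:.3f}")
```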
Type I and Type II Errors - Example
• Suppose mean viral load is 7,000
lower after taking a new drug
 Drug population mean viral load 33,000
 Placebo population mean viral load
40,000
• Samples from the population may
not be representative
 By chance we sample 120 sickly patients
for the drug treatment group
 By chance we sample 120 robust patients
for the placebo treatment group
• These data yield a Type II error: the real drug effect
goes undetected because of an unrepresentative sample
Review: Statistical Testing
• Formulate null and alternative hypotheses
 Null and alternative hypotheses are mutually exclusive
and exhaustive statements about the population
 Typically assume the null hypothesis is true, until we find
evidence to refute the null in favor of the alternative
 E.g. H0: µ = 0 versus HA: µ ≠ 0
• Calculate the appropriate test statistic and find its
probability under the null hypothesis
• Make a statistical decision and biological conclusion
Two-sample Student’s T-test
$$T^* = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
• Student’s T-test is a parametric test
to compare means of two samples
from normally distributed populations
Assumptions of Student’s T-test:
 Data from each group are normal
or sample sizes greater than 30
 Equal variances among groups
 Independent and identically
distributed (iid) normal random
errors from the group means
Parametric Statistical Tests
• Parametric tests assume the populations have known distributions
with unknown parameters that must be estimated (and compared)
Central Limit Theorem
• Student’s T-test is computed with
a difference of two sample
means
• Draw thousands of samples of
size n from one population to
view the distribution of their
sample means
• As sample size n increases, the
distribution of the sample means
becomes approximately Gaussian (normal),
even from non-normal
populations
• Student’s T-test does not require
normal data if sample size is
large
Figure: Sample means from a uniform distribution
are approximately normal for n = 20
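The figure's experiment can be approximated with a short simulation: draw many samples of size n = 20 from a uniform distribution and look at the spread of their means. This is a sketch under stated assumptions. The uniform distribution on [0, 1), the 5,000 repetitions, and the name `sample_means` are illustrative choices.

```python
import math
import random
from statistics import mean, stdev


def sample_means(n, n_samples=5000, seed=0):
    """Means of n_samples independent samples of size n from Uniform(0, 1)."""
    rng = random.Random(seed)
    return [mean([rng.random() for _ in range(n)]) for _ in range(n_samples)]


means = sample_means(n=20)

# Theory for Uniform(0, 1): mean 0.5, variance 1/12, so the sample mean
# has standard deviation sqrt(1 / (12 * n))
theoretical_sd = math.sqrt(1.0 / (12 * 20))

# For a normal distribution about 95% of values lie within 1.96 SDs
within = sum(abs(m - 0.5) <= 1.96 * theoretical_sd for m in means) / len(means)

print(f"mean of sample means: {mean(means):.4f}")
print(f"SD of sample means:   {stdev(means):.4f} (theory {theoretical_sd:.4f})")
print(f"fraction within 1.96 SD: {within:.3f}")
```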
Equal Variance Assumption
$$T^* = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
• Usual two-sample Student’s T-test computations assume both
samples share approximately equal variances
• Welch’s correction computes an appropriate T-test when variances
are NOT equal among the two samples
• Welch’s correction is available in most software, so look carefully
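A minimal sketch of the unequal-variance statistic follows. The degrees of freedom use the Welch-Satterthwaite approximation, which the slides do not show and is included here as an assumption about what "Welch's correction" involves. The function name `welch_t` and the data are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(x1, x2):
    """Welch's T* and approximate (Welch-Satterthwaite) degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    # Per-group variance of the sample mean: s^2 / n
    v1, v2 = variance(x1) / n1, variance(x2) / n2
    t_star = (mean(x1) - mean(x2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_star, df


# Hypothetical groups with clearly different spreads and sizes
group_1 = [10.0, 12.0, 14.0, 16.0, 18.0]
group_2 = [4.0, 5.0, 6.0]

t_star, df = welch_t(group_1, group_2)
print(f"Welch T* = {t_star:.3f}, approximate df = {df:.2f}")
```

The approximate degrees of freedom are usually not an integer and fall between the smaller group's n - 1 and the pooled value n1 + n2 - 2.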
Two-Sided Test
• Researchers expect two samples
will be different, but do not know
which will have the higher mean
 E.g. viral load for drug group could
be higher or lower than placebo
 E.g. H0: μP = μD vs. HA: μP ≠ μD
• Two-sided tests are less powerful,
but more general than analogous
one-sided tests of same data
 The α = 0.05 rejection region of a
two-sided test is divided in two parts
 Need larger T-statistic for significant
p-values with two-sided test
One-Sided Test
• Researchers expect one sample
will have a higher mean than the
other sample before testing
 E.g. viral load for drug group should
be lower than placebo
 E.g. H0: μP ≤ μD vs. HA: μP > μD
• One-sided tests are more powerful
but require more assumptions
 The α = 0.05 rejection region of the
one-sided test is all on one side
 Smaller T-statistics will produce
significant p-values in one-sided
tests
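The difference in rejection regions can be illustrated numerically. The sketch below assumes a standard normal null distribution as a large-sample stand-in for the Student's t distribution (which the standard library lacks), and the test statistic 1.80 is a made-up value.

```python
from statistics import NormalDist

null = NormalDist()
alpha = 0.05

# One-sided test: the whole rejection region sits in one tail
one_sided_cutoff = null.inv_cdf(1 - alpha)
# Two-sided test: the rejection region is split between both tails
two_sided_cutoff = null.inv_cdf(1 - alpha / 2)

z = 1.80  # hypothetical test statistic
p_one_sided = 1 - null.cdf(z)
p_two_sided = 2 * (1 - null.cdf(abs(z)))

print(f"cutoffs: one-sided {one_sided_cutoff:.3f}, two-sided {two_sided_cutoff:.3f}")
print(f"p-values for z = {z}: one-sided {p_one_sided:.4f}, two-sided {p_two_sided:.4f}")
```

With these assumptions the same statistic is significant in the one-sided test but not in the two-sided test, which is the trade-off the two slides describe.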
Nonparametric Statistical Tests
• Nonparametric tests do not assume a specific parametric distribution for
a population and focus on more general descriptions, like medians
Wilcoxon Rank Sum Test
$$Z^* = \frac{R_1 - \frac{n_1(n_1 + 1)}{2} - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$
• The Wilcoxon rank sum test is a
nonparametric test to compare
the sums of two ranked samples
• If assumptions are met, the test
can be used to compare medians
 Wilcoxon rank sum test assumes both
samples are from the same type of
distribution with different locations
 Also known as the Mann-Whitney Test
Example: Wilcoxon Rank Sum Test
$$Z^* = \frac{\overbrace{\left(R_1 - \frac{n_1(n_1 + 1)}{2}\right)}^{\text{statistic}} - \overbrace{\frac{n_1 n_2}{2}}^{\text{null value}}}{\underbrace{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}_{\text{standard error}}}$$
• The “statistic” in a Wilcoxon rank sum test is equivalent to the
difference between the sums of the ranked data in each class
• The null hypothesis R1 = R2 produces a strange statistic, null value
and standard error due to relationships among sums of ranks
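The statistic, null value and standard error can be computed directly from ranks. The sketch below is one possible reading of the formula: it ranks the pooled data (tied values share their average rank) and applies no tie correction to the standard error, matching the formula as written. The helper names `midranks` and `wilcoxon_rank_sum_z` and the toy data are hypothetical.

```python
import math


def midranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def wilcoxon_rank_sum_z(sample_1, sample_2):
    """Return (R1, Z*) using the large-sample normal approximation."""
    n1, n2 = len(sample_1), len(sample_2)
    ranks = midranks(list(sample_1) + list(sample_2))
    r1 = sum(ranks[:n1])  # rank sum of the first sample
    statistic = r1 - n1 * (n1 + 1) / 2
    null_value = n1 * n2 / 2
    standard_error = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return r1, (statistic - null_value) / standard_error


# Toy data: every value in the first sample is below the second sample
r1, z_star = wilcoxon_rank_sum_z([1, 2, 3, 4, 5], [6, 7, 8, 9])
print(f"R1 = {r1}, Z* = {z_star:.3f}")
```

With samples this small, the normal approximation is rough; an exact version would enumerate rank assignments instead.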
Does It Really Compare Medians?
• If two samples come from the same
type of distribution …
YES
 Median of a sample is comparable to its
middle ranked observation(s)
 If two samples share a similar shape, the
sample with the significantly higher rank
sums will have the higher median too
• If two samples come from two very
different types of distributions … NO
 The Wilcoxon rank sum test actually
compares the sums of the ranked data
 Many counter examples have significant
Wilcoxon tests, but equal medians
Figure:
 Control and Treated samples
both have median = 100
 Wilcoxon rank sum test has a
significant p-value = 0.0000027
Does Wilcoxon Compare Distributions?
• Wilcoxon rank sum test
does NOT compare the
distributions of samples
• Samples from two very
different distributions can
yield non-significant
Wilcoxon test p-values
• It is difficult to interpret
Wilcoxon rank sum test if
assumptions aren’t met
Student’s T vs. Wilcoxon
• Two-sample Student’s T-test
 Assumes populations are normal, but works for all
populations if large sample sizes are used for both
classes
 Assumes variances are equal or requires Welch
correction
• Wilcoxon Rank Sum Test
 Assumes both samples are from the same type of
distribution
 Generally preferred over Student’s T-test by all journals
• What if neither test is appropriate?
Example: Viral Loads
• Want to compare viral loads
under treatment and placebo
 Viral loads are very high (> 10,000)
and skewed right for the placebo
group
 Viral loads all equal zero for treatment
• Both Student’s T-test and the
Wilcoxon test are inappropriate
 Don’t have normal data or equal
variances for Student’s T-test
 Can’t use Wilcoxon test to compare
two different distributions
• Need an appropriate p-value, so
what statistical test can we use?
Exact Tests vs. Approximate Tests
• Exact statistical tests are based on probability
statements that are valid for any sample size
 Usually based on combinatorial or resampling strategies
 All resampling tests are considered exact tests
 Some implementations of Wilcoxon tests use exact tests based on
combinatorial arguments for small sample sizes
• Approximate statistical tests are based on mathematical
arguments about convergence with large sample sizes
 Student’s T-test is an approximate test based on arguments similar to
the Central Limit Theorem
 Approximate tests may have inaccurate p-values for small samples
Example: Bootstrap Samples
Original Data     Bootstrap Samples
Class   Data      Sample 1   Sample 2   Sample 3   …
A       1         3          1          2          …
A       2         3          5          2          …
A       3         2          3          2          …
A       4         4          1          4          …
A       5         2          2          1          …
B       6         7          9          6          …
B       7         6          6          7          …
B       8         8          6          8          …
B       9         8          9          9          …
• Bootstrapping uses sampling with replacement within each class
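Sampling with replacement within each class might be sketched as follows, using the same class A and class B values as the original data column. The function name `bootstrap_sample` and the seed are illustrative, and the random draws will not reproduce the exact samples shown on the slide.

```python
import random


def bootstrap_sample(data_by_class, rng):
    """Resample each class with replacement, keeping its original size."""
    return {
        label: rng.choices(values, k=len(values))
        for label, values in data_by_class.items()
    }


original = {"A": [1, 2, 3, 4, 5], "B": [6, 7, 8, 9]}
rng = random.Random(42)  # fixed seed so the sketch is repeatable

samples = [bootstrap_sample(original, rng) for _ in range(3)]
for number, sample in enumerate(samples, start=1):
    print(f"Sample {number}: A = {sample['A']}, B = {sample['B']}")
```

Because values are drawn with replacement, a bootstrap sample can repeat some observations and omit others, while class sizes stay fixed.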
Example: Jackknife Samples
Original Data     Jackknife Samples
Class   Data      Sample 1   Sample 2   Sample 3   …
A       1         -          1          1          …
A       2         2          -          2          …
A       3         3          3          -          …
A       4         4          4          4          …
A       5         5          5          5          …
B       6         -          6          6          …
B       7         7          -          7          …
B       8         8          8          -          …
B       9         9          9          9          …
• Jackknifing uses “leave one out” sampling within each class
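"Leave one out" sampling is deterministic, so it can be written as a short list comprehension. This is a minimal sketch for a single class; the name `jackknife_samples` is hypothetical, and taking the mean of each sample is just one example of a statistic to recompute.

```python
from statistics import mean


def jackknife_samples(values):
    """All leave-one-out subsets: the i-th sample omits the i-th value."""
    return [values[:i] + values[i + 1:] for i in range(len(values))]


class_a = [1, 2, 3, 4, 5]
leave_one_out = jackknife_samples(class_a)
leave_one_out_means = [mean(sample) for sample in leave_one_out]

for sample, sample_mean in zip(leave_one_out, leave_one_out_means):
    print(sample, sample_mean)
```

A class with n observations yields exactly n jackknife samples, each of size n - 1.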
Example: Permutation Samples
Original Data     Permutation Samples
Class   Data      Sample 1   Sample 2   Sample 3   …
A       1         B          A          A          …
A       2         B          A          B          …
A       3         A          A          A          …
A       4         A          B          B          …
A       5         B          A          A          …
B       6         A          B          B          …
B       7         A          A          B          …
B       8         A          B          A          …
B       9         B          B          A          …
• Permutation tests scramble the class labels among the samples
Example: Viral Loads
• One-sided permutation test
to compare the medians
from two different samples
• Permutations of differences
in median are centered at 0
• Compute p-value using:
$$p = \frac{\#\,\text{permutations} > \text{true difference in medians}}{\text{total } \#\,\text{permutations}}$$
or
$$p = \frac{\#\,\text{permutations} < \text{true difference in medians}}{\text{total } \#\,\text{permutations}}$$
Example: Viral Loads
 Two-sided permutation test
uses absolute values of each
permutation and the true
difference in medians
 Permutations of differences
are all greater than zero
 Compute p-value using:
$$p = \frac{\#\,\text{permutations} > \text{true difference in medians}}{\text{total } \#\,\text{permutations}}$$
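For small samples, a permutation test of the difference in medians can enumerate every relabelling exactly. The sketch below is an illustration under stated assumptions: it uses toy data rather than real viral loads, and it counts relabellings at least as extreme as the observed difference (≥ rather than the strict > in the formulas above), so that the observed labelling counts itself. The function name `permutation_test_medians` is hypothetical.

```python
from itertools import combinations
from statistics import median


def permutation_test_medians(a, b):
    """Exact permutation test for median(b) - median(a)."""
    pooled = list(a) + list(b)
    observed = median(b) - median(a)
    differences = []
    # Each way of choosing which pooled values get label "B" is one relabelling
    for b_indices in combinations(range(len(pooled)), len(b)):
        chosen = set(b_indices)
        perm_b = [pooled[i] for i in b_indices]
        perm_a = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        differences.append(median(perm_b) - median(perm_a))
    total = len(differences)
    p_one_sided = sum(d >= observed for d in differences) / total
    p_two_sided = sum(abs(d) >= abs(observed) for d in differences) / total
    return observed, total, p_one_sided, p_two_sided


# Toy data: class A and class B values from a small two-group example
observed, total, p_one, p_two = permutation_test_medians(
    [1, 2, 3, 4, 5], [6, 7, 8, 9]
)
print(f"observed difference in medians: {observed}")
print(f"{total} relabellings, one-sided p = {p_one:.4f}, two-sided p = {p_two:.4f}")
```

With larger samples the number of relabellings grows too quickly to enumerate, and a random subset of permutations is typically used instead.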
Multiple Testing
• Use Family-Wise Error-Rate (FWER) adjustments for
analysis of variance (ANOVA) tests or comparisons
among 3-20 groups of samples
 One-way ANOVA is an extension of the Student’s T-test
 Kruskal-Wallis is an extension of the Wilcoxon rank sum test
 Permutation tests can be adjusted with Bonferroni and other methods
• Use False Discovery Rate (FDR) adjustments for high-
throughput biology experiments like microarrays
 E.g. Microarrays, real time PCR, next gen sequencing, …
 FDR methods are more powerful than family-wise error rate (FWER)
controlling methods, like those used in ANOVA, for high-throughput
methods with hundreds or thousands of tests
FWER Adjustments: Bonferroni
• Suppose you want to compare 5 new drugs against
a placebo, but you know all 5 drugs are ineffective
 Compute 5 Student’s T-tests with false positive rate α = 0.05
 Each test has a 95% chance to correctly find p > 0.05
 Among all 5 tests, the chance of at least one false positive is:
1 – 0.95^5 = 0.23 > 0.05
• The Bonferroni FWER adjustment
 Divide the false positive rate α = 0.05 by the number of tests, so only
p-values smaller than α = 0.05 / 5 = 0.01 are significant
 Multiply p-values by the number of tests for an “adjusted p-value”,
using the formula min(1, 5*p) for these five tests.
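The arithmetic on this slide can be checked in a few lines. The family-wise error rate calculation treats the five tests as independent, as the slide's formula does. The function names and the raw p-values are made up for illustration.

```python
def family_wise_error_rate(alpha, n_tests):
    """Chance of at least one false positive among independent null tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_adjust(p_values):
    """Adjusted p-value min(1, m * p) for each of m tests."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


fwer = family_wise_error_rate(alpha=0.05, n_tests=5)
print(f"FWER for 5 tests at alpha = 0.05: {fwer:.3f}")

# Hypothetical raw p-values from five drug-vs-placebo tests
raw_p_values = [0.004, 0.02, 0.03, 0.2, 0.6]
adjusted = bonferroni_adjust(raw_p_values)
significant = [p < 0.05 for p in adjusted]
print("adjusted p-values:", [round(p, 3) for p in adjusted])
print("significant after adjustment:", significant)
```

Comparing raw p-values with 0.05 / 5 = 0.01 and comparing adjusted p-values with 0.05 lead to the same decisions.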
Other FWER Methods
• Tukey’s Honest Significant Difference
 Uses the “studentized range” method for all pair-wise comparisons
 E.g. for three groups, compare A vs. B, B vs. C and A vs. C
• Dunnett’s Multiple Comparisons Against a Control
 Uses “standardized range method” for comparisons against a control
 E.g. for three groups, compare A vs. C and B vs. C for control group C
• Popular, yet outdated methods
 Fisher’s LSD, Student-Newman-Keuls, Duncan’s Test, …
False Discovery Rate Methods
• Consider a microarray experiment with 20,000 genes
 Bonferroni requires a false positive rate α = 0.05 / 20,000 = 0.0000025
 Few, if any, genes will be statistically significant using Bonferroni
• The purpose of a microarray experiment is different
than an ANOVA experiment comparing 3-20 groups
 Microarray experiments are often considered “fishing expeditions”
 Want to find approximately 100 genes of interest for follow up
experiments with quantitative real-time PCR or other methods
 Willing to accept a few false positives among our significant results, if
we can capture all the biologically important genes in the process
FDR Methods (cont.)
• Suppose you could test
5,000 genes that were not
differentially expressed
• Those 5,000 genes would
include many false
positives
• The p-values should
follow a uniform
distribution from
p = 0.00 to p = 1.00
FDR Methods (cont.)
• Add in 1,000 differentially
expressed genes (DEGs)
• All DEGs have p < 0.05
• Want to adjust the cut-off
value α = 0.05 until the list
of significant genes has a
controlled proportion of
false positives
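The slides do not name a specific FDR procedure. As one widely used example, the sketch below assumes the Benjamini-Hochberg step-up adjustment: sort the p-values, scale the k-th smallest by m / k, and enforce monotonicity from the largest down. The function name `benjamini_hochberg` and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never increase
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for five genes
raw_p_values = [0.03, 0.001, 0.5, 0.01, 0.02]
fdr_adjusted = benjamini_hochberg(raw_p_values)
n_significant = sum(p < 0.05 for p in fdr_adjusted)

print("FDR-adjusted p-values:", [round(p, 4) for p in fdr_adjusted])
print("genes significant at FDR 0.05:", n_significant)
```

For these made-up values, a Bonferroni adjustment would multiply every p-value by 5, so fewer genes would pass the same 0.05 threshold.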
Thank You
For questions or comments please contact:
ScienceApps@niaid.nih.gov
301.496.4455
40

More Related Content

What's hot

Parmetric and non parametric statistical test in clinical trails
Parmetric and non parametric statistical test in clinical trailsParmetric and non parametric statistical test in clinical trails
Parmetric and non parametric statistical test in clinical trailsVinod Pagidipalli
 
Choosing appropriate statistical test RSS6 2104
Choosing appropriate statistical test RSS6 2104Choosing appropriate statistical test RSS6 2104
Choosing appropriate statistical test RSS6 2104RSS6
 
non parametric statistics
non parametric statisticsnon parametric statistics
non parametric statisticsAnchal Garg
 
Statistical Significance Testing in Information Retrieval: An Empirical Analy...
Statistical Significance Testing in Information Retrieval: An Empirical Analy...Statistical Significance Testing in Information Retrieval: An Empirical Analy...
Statistical Significance Testing in Information Retrieval: An Empirical Analy...Julián Urbano
 
Lesson 6 Nonparametric Test 2009 Ta
Lesson 6 Nonparametric Test 2009 TaLesson 6 Nonparametric Test 2009 Ta
Lesson 6 Nonparametric Test 2009 TaSumit Prajapati
 
3.1 non parametric test
3.1 non parametric test3.1 non parametric test
3.1 non parametric testShital Patil
 
Chosing the appropriate_statistical_test
Chosing the appropriate_statistical_testChosing the appropriate_statistical_test
Chosing the appropriate_statistical_testBRAJESH KUMAR PARASHAR
 
Non parametrict test
Non parametrict testNon parametrict test
Non parametrict testdobhalshiv
 
Parametric versus non parametric test
Parametric versus non parametric testParametric versus non parametric test
Parametric versus non parametric testJWANIKA VANSIYA
 
How to choose a right statistical test
How to choose a right statistical testHow to choose a right statistical test
How to choose a right statistical testKhalid Mahmood
 
Assumptions about parametric and non parametric tests
Assumptions about parametric and non parametric testsAssumptions about parametric and non parametric tests
Assumptions about parametric and non parametric testsBarath Babu Kumar
 
Alternatives to t test
Alternatives to t testAlternatives to t test
Alternatives to t testLONDIWE SHANGE
 

What's hot (20)

Parmetric and non parametric statistical test in clinical trails
Parmetric and non parametric statistical test in clinical trailsParmetric and non parametric statistical test in clinical trails
Parmetric and non parametric statistical test in clinical trails
 
Statistical test
Statistical testStatistical test
Statistical test
 
Choosing appropriate statistical test RSS6 2104
Choosing appropriate statistical test RSS6 2104Choosing appropriate statistical test RSS6 2104
Choosing appropriate statistical test RSS6 2104
 
non parametric statistics
non parametric statisticsnon parametric statistics
non parametric statistics
 
Statistical Significance Testing in Information Retrieval: An Empirical Analy...
Statistical Significance Testing in Information Retrieval: An Empirical Analy...Statistical Significance Testing in Information Retrieval: An Empirical Analy...
Statistical Significance Testing in Information Retrieval: An Empirical Analy...
 
Lesson 6 Nonparametric Test 2009 Ta
Lesson 6 Nonparametric Test 2009 TaLesson 6 Nonparametric Test 2009 Ta
Lesson 6 Nonparametric Test 2009 Ta
 
3.1 non parametric test
3.1 non parametric test3.1 non parametric test
3.1 non parametric test
 
Non parametric test
Non parametric testNon parametric test
Non parametric test
 
Non parametric-tests
Non parametric-testsNon parametric-tests
Non parametric-tests
 
Non parametric presentation
Non parametric presentationNon parametric presentation
Non parametric presentation
 
Biostatistics ii
Biostatistics iiBiostatistics ii
Biostatistics ii
 
Stat topics
Stat topicsStat topics
Stat topics
 
Non parametric tests
Non parametric testsNon parametric tests
Non parametric tests
 
Student's T Test
Student's T TestStudent's T Test
Student's T Test
 
Chosing the appropriate_statistical_test
Chosing the appropriate_statistical_testChosing the appropriate_statistical_test
Chosing the appropriate_statistical_test
 
Non parametrict test
Non parametrict testNon parametrict test
Non parametrict test
 
Parametric versus non parametric test
Parametric versus non parametric testParametric versus non parametric test
Parametric versus non parametric test
 
How to choose a right statistical test
How to choose a right statistical testHow to choose a right statistical test
How to choose a right statistical test
 
Assumptions about parametric and non parametric tests
Assumptions about parametric and non parametric testsAssumptions about parametric and non parametric tests
Assumptions about parametric and non parametric tests
 
Alternatives to t test
Alternatives to t testAlternatives to t test
Alternatives to t test
 

Similar to Overview of statistics: Statistical testing (Part I)

Test of significance in Statistics
Test of significance in StatisticsTest of significance in Statistics
Test of significance in StatisticsVikash Keshri
 
Parametric tests seminar
Parametric tests seminarParametric tests seminar
Parametric tests seminardrdeepika87
 
Statistics basics for oncologist kiran
Statistics basics for oncologist kiranStatistics basics for oncologist kiran
Statistics basics for oncologist kiranKiran Ramakrishna
 
Day-2_Presentation for SPSS parametric workshop.pptx
Day-2_Presentation for SPSS parametric workshop.pptxDay-2_Presentation for SPSS parametric workshop.pptx
Day-2_Presentation for SPSS parametric workshop.pptxrjaisankar
 
Class 5 Hypothesis & Normal Disdribution.pptx
Class 5 Hypothesis & Normal Disdribution.pptxClass 5 Hypothesis & Normal Disdribution.pptx
Class 5 Hypothesis & Normal Disdribution.pptxCallplanetsDeveloper
 
tests of significance
tests of significancetests of significance
tests of significancebenita regi
 
T test^jsample size^j ethics
T test^jsample size^j ethicsT test^jsample size^j ethics
T test^jsample size^j ethicsAbhishek Thakur
 
P-values the gold measure of statistical validity are not as reliable as many...
P-values the gold measure of statistical validity are not as reliable as many...P-values the gold measure of statistical validity are not as reliable as many...
P-values the gold measure of statistical validity are not as reliable as many...David Pratap
 
20200519073328de6dca404c.pdfkshhjejhehdhd
20200519073328de6dca404c.pdfkshhjejhehdhd20200519073328de6dca404c.pdfkshhjejhehdhd
20200519073328de6dca404c.pdfkshhjejhehdhdHimanshuSharma723273
 
Hypothesis testing1
Hypothesis testing1Hypothesis testing1
Hypothesis testing1HanaaBayomy
 
ChandanChakrabarty_1.pdf
ChandanChakrabarty_1.pdfChandanChakrabarty_1.pdf
ChandanChakrabarty_1.pdfDikshathawait
 
Parametric tests
Parametric testsParametric tests
Parametric testsheena45
 
Topic 7 stat inference
Topic 7 stat inferenceTopic 7 stat inference
Topic 7 stat inferenceSizwan Ahammed
 
Assessment 3 ContextYou will review the theory, logic, and a.docx
Assessment 3 ContextYou will review the theory, logic, and a.docxAssessment 3 ContextYou will review the theory, logic, and a.docx
Assessment 3 ContextYou will review the theory, logic, and a.docxgalerussel59292
 
Chapter 28 clincal trials
Chapter 28 clincal trials Chapter 28 clincal trials
Chapter 28 clincal trials Nilesh Kucha
 
Parametric vs non parametric test
Parametric vs non parametric testParametric vs non parametric test
Parametric vs non parametric testar9530
 

Similar to Overview of statistics: Statistical testing (Part I) (20)

Test of significance in Statistics
Test of significance in StatisticsTest of significance in Statistics
Test of significance in Statistics
 
Parametric tests seminar
Parametric tests seminarParametric tests seminar
Parametric tests seminar
 
Statistics basics for oncologist kiran
Statistics basics for oncologist kiranStatistics basics for oncologist kiran
Statistics basics for oncologist kiran
 
Day-2_Presentation for SPSS parametric workshop.pptx
Day-2_Presentation for SPSS parametric workshop.pptxDay-2_Presentation for SPSS parametric workshop.pptx
Day-2_Presentation for SPSS parametric workshop.pptx
 
Class 5 Hypothesis & Normal Disdribution.pptx
Class 5 Hypothesis & Normal Disdribution.pptxClass 5 Hypothesis & Normal Disdribution.pptx
Class 5 Hypothesis & Normal Disdribution.pptx
 
tests of significance
tests of significancetests of significance
tests of significance
 
T test^jsample size^j ethics
T test^jsample size^j ethicsT test^jsample size^j ethics
T test^jsample size^j ethics
 
P-values the gold measure of statistical validity are not as reliable as many...
P-values the gold measure of statistical validity are not as reliable as many...P-values the gold measure of statistical validity are not as reliable as many...
P-values the gold measure of statistical validity are not as reliable as many...
 
20200519073328de6dca404c.pdfkshhjejhehdhd
20200519073328de6dca404c.pdfkshhjejhehdhd20200519073328de6dca404c.pdfkshhjejhehdhd
20200519073328de6dca404c.pdfkshhjejhehdhd
 
Hypothesis testing1
Hypothesis testing1Hypothesis testing1
Hypothesis testing1
 
ChandanChakrabarty_1.pdf
ChandanChakrabarty_1.pdfChandanChakrabarty_1.pdf
ChandanChakrabarty_1.pdf
 
Ds 2251 -_hypothesis test
Ds 2251 -_hypothesis testDs 2251 -_hypothesis test
Ds 2251 -_hypothesis test
 
Parametric tests
Parametric testsParametric tests
Parametric tests
 
Hypothesis
HypothesisHypothesis
Hypothesis
 
T test
T test T test
T test
 
Topic 7 stat inference
Topic 7 stat inferenceTopic 7 stat inference
Topic 7 stat inference
 
Assessment 3 ContextYou will review the theory, logic, and a.docx
Assessment 3 ContextYou will review the theory, logic, and a.docxAssessment 3 ContextYou will review the theory, logic, and a.docx
Assessment 3 ContextYou will review the theory, logic, and a.docx
 
Chapter 28 clincal trials
Chapter 28 clincal trials Chapter 28 clincal trials
Chapter 28 clincal trials
 
Parametric vs non parametric test
Parametric vs non parametric testParametric vs non parametric test
Parametric vs non parametric test
 
T‑tests
T‑testsT‑tests
T‑tests
 

More from Bioinformatics and Computational Biosciences Branch

More from Bioinformatics and Computational Biosciences Branch (20)

Hong_Celine_ES_workshop.pptx
Hong_Celine_ES_workshop.pptxHong_Celine_ES_workshop.pptx
Hong_Celine_ES_workshop.pptx
 
Virus Sequence Alignment and Phylogenetic Analysis 2019
Virus Sequence Alignment and Phylogenetic Analysis 2019Virus Sequence Alignment and Phylogenetic Analysis 2019
Virus Sequence Alignment and Phylogenetic Analysis 2019
 
Nephele 2.0: How to get the most out of your Nephele results
Nephele 2.0: How to get the most out of your Nephele resultsNephele 2.0: How to get the most out of your Nephele results
Nephele 2.0: How to get the most out of your Nephele results
 
Introduction to METAGENOTE
Introduction to METAGENOTE Introduction to METAGENOTE
Introduction to METAGENOTE
 
Intro to homology modeling
Intro to homology modelingIntro to homology modeling
Intro to homology modeling
 
Protein fold recognition and ab_initio modeling
Protein fold recognition and ab_initio modelingProtein fold recognition and ab_initio modeling
Protein fold recognition and ab_initio modeling
 
Homology modeling: Modeller
Homology modeling: ModellerHomology modeling: Modeller
Homology modeling: Modeller
 
Protein docking
Protein dockingProtein docking
Protein docking
 
Protein function prediction
Protein function predictionProtein function prediction
Protein function prediction
 
Protein structure prediction with a focus on Rosetta
Protein structure prediction with a focus on RosettaProtein structure prediction with a focus on Rosetta
Protein structure prediction with a focus on Rosetta
 
Biological networks
Biological networksBiological networks
Biological networks
 
UNIX Basics and Cluster Computing
UNIX Basics and Cluster ComputingUNIX Basics and Cluster Computing
UNIX Basics and Cluster Computing
 
Intro to JMP for statistics
Intro to JMP for statisticsIntro to JMP for statistics
Intro to JMP for statistics
 
Better graphics in R
Better graphics in RBetter graphics in R
Better graphics in R
 
Automating biostatistics workflows using R-based webtools
Automating biostatistics workflows using R-based webtoolsAutomating biostatistics workflows using R-based webtools
Automating biostatistics workflows using R-based webtools
 
GraphPad Prism: Curve fitting
GraphPad Prism: Curve fittingGraphPad Prism: Curve fitting
GraphPad Prism: Curve fitting
 
Appendix: Crash course in R and BioConductor
Appendix: Crash course in R and BioConductorAppendix: Crash course in R and BioConductor
Appendix: Crash course in R and BioConductor
 
Crash course in R and BioConductor
Crash course in R and BioConductorCrash course in R and BioConductor
Crash course in R and BioConductor
 
GraphPad Prism: Customizing your graphs
GraphPad Prism: Customizing your graphsGraphPad Prism: Customizing your graphs
GraphPad Prism: Customizing your graphs
 
Design of experiments
Design of experiments Design of experiments
Design of experiments
 

Overview of statistics: Statistical testing (Part I)

  • 1. Overview of Statistics I: Statistical Testing
      Presented by: Jeff Skinner, M.S.
      Biostatistics Specialist
      Bioinformatics and Computational Biosciences Branch (BCBB)
      Office of Cyber Infrastructure and Computational Biology (OCICB)
      National Institute of Allergy and Infectious Diseases (NIAID)
  • 2. Want To Publish Statistical Results?
      Most research journals ask three big questions:
      • Do your statistical tests inappropriately assume your data are normally (Gaussian) distributed?
          ◦ Parametric tests vs. nonparametric tests
          ◦ E.g. Student's T-tests vs. Wilcoxon Rank Sum tests
      • Do your statistical tests require large samples?
          ◦ Approximate tests vs. exact tests
      • Do you require adjustments for multiple testing?
          ◦ False Discovery Rate (FDR) adjustments for high-throughput biology
          ◦ Tukey's Honest Significant Difference (HSD) for analysis of variance
  • 4. Outline
      • Review the statistical testing process
      • When do I use one-sided vs. two-sided tests?
      • When do I use parametric vs. nonparametric tests?
      • When do I use large-sample tests vs. exact tests?
      • When do I need to adjust for multiple testing?
      • Additional questions about testing?
  • 5. Recall the Statistical Testing Process
      • Formulate null and alternative hypotheses
          ◦ E.g. (null) H0: μ1 = μ2 vs. (alternative) HA: μ1 ≠ μ2
      • Calculate the appropriate test statistic
          ◦ E.g. Student's t-test, Wilcoxon test, …
      • Compute the probability of observing the test statistic (i.e. your sample data) under the null hypothesis
          ◦ I.e. compute a p-value
      • Make a statistical decision
          ◦ "Reject the null hypothesis" or "fail to reject the null hypothesis"
      • Make a biological conclusion
          ◦ E.g. new drug reduces viral load, vitamin C helps prevent cancer, …
  • 6. Null and Alternative Hypotheses
      • (null) Drug and placebo viral loads are equal vs. (alternative) viral loads are higher for the placebo group than for the drug group
      • (null) H0: μP − μD ≤ 0 vs. (alternative) HA: μP − μD > 0
  • 7. What is a Statistical Test?
      $\text{Test} = \dfrac{\text{Statistic} - \text{Null Value}}{\text{Error}} = \dfrac{\text{Difference}}{\text{Error}}$
      • Almost all tests used in inferential statistics can be generalized as the ratio of a "difference" over an "error"
          ◦ The difference is between a statistic and a null value (usually 0)
          ◦ A statistic is nothing more than a numeric summary of the experimental data with respect to the null hypothesis
          ◦ A null value is an assumption about the population under the null hypothesis
          ◦ An error is an estimate of the variability (standard error) of the statistic's sampling distribution
  • 8. Example: Two-sample Student's T-test
      $T^* = \dfrac{(\bar{X}_1 - \bar{X}_2) - 0}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
      Here $\bar{X}_1 - \bar{X}_2$ is the statistic, 0 is the null value and the denominator is the standard error.
      • The "statistic" in a two-sample t-test is a difference between the two sample means and the null value is zero
          ◦ The hypothesis μ1 = μ2 implies μ1 − μ2 = 0
      • The standard error is built from a pooled estimate of the common variance
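The "difference over error" structure of the two-sample statistic can be sketched in a few lines of Python. This is only an illustration using the standard library: the function name `pooled_t_statistic` and the toy data are assumptions made here, not part of the original slides.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(x1, x2, null_value=0.0):
    """Two-sample T* = (statistic - null value) / standard error,
    using the pooled estimate of a common variance."""
    n1, n2 = len(x1), len(x2)
    # Difference between the statistic (difference of sample means) and the null value
    difference = (mean(x1) - mean(x2)) - null_value
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    return difference / standard_error


# Hypothetical toy data, chosen only so the arithmetic is easy to follow
group_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
group_2 = [3.0, 4.0, 5.0, 6.0, 7.0]
t_star = pooled_t_statistic(group_1, group_2)
print(t_star)  # approximately -2.0 (means differ by 2, standard error is about 1)
```

The p-value would then come from comparing `t_star` with a t distribution on n1 + n2 − 2 degrees of freedom, which is left to a statistics package.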
  • 9. Null Distributions
      • Compute the T* statistic for our sample
      • Compare against the distribution of all possible T* statistics for all possible samples from the population under the null hypothesis
      • If the null hypothesis is true (i.e. no difference between groups), then the T* statistics from most samples should be near zero
      • Many null distributions (or sampling distributions) approximately follow well-known probability distributions, e.g. the normal distribution
  • 10. P-values
      • A p-value is the probability of observing data at least as extreme as yours, given that the null hypothesis is actually true
      • P-values do NOT represent the probability that the null is true
      • P-values do NOT represent the probability that a model is incorrect
      • P-values do NOT represent the strength or size of an effect
      • If the null distribution follows a well-known probability distribution, like the normal distribution, the p-values are computed by integration
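As a rough illustration of "p-values are computed by integration", the tail area of a standard normal null distribution can be obtained from the complementary error function in Python's standard library. The function names below are hypothetical, and the sketch assumes the test statistic is already a Z-score.

```python
import math


def normal_upper_tail(z):
    """Area under the standard normal density to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def one_sided_p(z):
    """P-value when only large positive statistics count as extreme."""
    return normal_upper_tail(z)


def two_sided_p(z):
    """P-value when large statistics in either direction count as extreme."""
    return 2 * normal_upper_tail(abs(z))


z_observed = 1.96  # hypothetical test statistic
print(one_sided_p(z_observed))  # roughly 0.025
print(two_sided_p(z_observed))  # roughly 0.05
```

The same statistic gives half the p-value in the one-sided case, which is why one-sided tests reject with smaller statistics.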
  • 11. Statistical Decisions and Biological Conclusions
      • A statistical decision is a choice to "reject the null hypothesis" or "fail to reject the null hypothesis"
          ◦ The decision is based on a critical value or decision rule
          ◦ E.g. reject the null hypothesis if p-value < 0.05
      • A biological conclusion is the final interpretation of the statistical testing process in plain language
          ◦ E.g. vitamin C prevents cancer, drug reduced viral load, …
          ◦ Make sure the conclusion can be justified by the hypotheses
  • 12. Type I and Type II Errors
                                          Actual population difference?
                                          Yes                                No
      Difference detected by test? Yes    OK                                 Type I Error (False Positive)
                                   No     Type II Error (False Negative)     OK
      • Different types of experiments attempt to minimize Type I errors, Type II errors or both kinds of errors
          ◦ E.g. Type II errors are more important in medical testing
  • 13. Type I and Type II Errors - Example
      • Suppose mean viral load is 7,000 lower after taking a new drug
          ◦ Drug population mean viral load: 33,000
          ◦ Placebo population mean viral load: 40,000
      • Samples from the population may not be representative
          ◦ By chance we sample 120 sickly patients for the drug treatment group
          ◦ By chance we sample 120 robust patients for the placebo treatment group
      • These data yield a Type II error because of a strange sample: the real difference goes undetected
  • 14. Review: Statistical Testing
      • Formulate null and alternative hypotheses
          ◦ Null and alternative hypotheses are mutually exclusive and exhaustive statements about the population
          ◦ Typically assume the null hypothesis is true, until we find evidence to refute the null in favor of the alternative
          ◦ E.g. H0: µ = 0 versus HA: µ ≠ 0
      • Calculate the appropriate test statistic and find its probability under the null hypothesis
      • Make a statistical decision and biological conclusion
  • 15. Two-sample Student's T-test
      $T^* = \dfrac{(\bar{X}_1 - \bar{X}_2) - 0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
      • Student's T-test is a parametric test to compare means of two samples from normally distributed populations
      • Assumptions of Student's T-test:
          ◦ Data from each group are normal or sample sizes are greater than 30
          ◦ Equal variances among groups
          ◦ Independent and identically distributed (iid) normal random errors from the group means
  • 16. Parametric Statistical Tests
      • Parametric tests assume the populations have known distributions with unknown parameters that must be estimated (and compared)
  • 17. Central Limit Theorem
      • Student's T-test is computed with a difference of two sample means
      • Draw thousands of samples of size n from one population to view the distribution of their sample means
      • As sample size n increases, the distribution of the sample means becomes Gaussian (normal), even from non-normal populations
      • Student's T-test does not require normal data if sample size is large
      • Figure: sample means from a uniform distribution are approximately normal for n = 20
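The thought experiment above (draw thousands of samples of size n = 20 from a uniform population and look at their means) can be simulated directly. This is a sketch under assumptions made here: 5,000 draws, a fixed random seed, and a Uniform(0, 1) population, whose mean is 0.5 and whose variance is 1/12.

```python
import math
import random
from statistics import mean, stdev


def sample_means(n, draws, rng):
    """Return `draws` sample means, each from n Uniform(0, 1) observations."""
    return [mean(rng.random() for _ in range(n)) for _ in range(draws)]


rng = random.Random(0)  # fixed seed so the run is repeatable
means = sample_means(n=20, draws=5000, rng=rng)

center = mean(means)
spread = stdev(means)
# Theory: the sample mean has standard deviation sqrt(variance / n)
theoretical_sd = math.sqrt((1 / 12) / 20)

# If the sample means are roughly normal, about 95% fall within 1.96 SDs of 0.5
within = sum(abs(m - 0.5) <= 1.96 * theoretical_sd for m in means) / len(means)

print(center, spread, theoretical_sd, within)
```

A histogram of `means` would show the bell shape described on the slide, even though the individual observations are flat (uniform).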
  • 18. Equal Variance Assumption
      $T^* = \dfrac{(\bar{X}_1 - \bar{X}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
      • Usual two-sample Student's T-test computations assume both samples share approximately equal variances
      • Welch's correction computes an appropriate T-test when variances are NOT equal among the two samples
      • Welch's correction is available in most software, so look carefully
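A minimal sketch of the unequal-variance statistic shown on this slide. The degrees-of-freedom calculation is the standard Welch-Satterthwaite approximation, which is an addition here since the slide does not spell it out; the function name and data are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(x1, x2, null_value=0.0):
    """Return (T*, approximate degrees of freedom) without assuming equal variances."""
    n1, n2 = len(x1), len(x2)
    a = variance(x1) / n1  # s1^2 / n1
    b = variance(x2) / n2  # s2^2 / n2
    t_star = ((mean(x1) - mean(x2)) - null_value) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation (not stated on the slide)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t_star, df


# Hypothetical data with clearly different spreads
low_spread = [1.0, 2.0, 3.0, 4.0, 5.0]
high_spread = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
t_star, df = welch_t(low_spread, high_spread)
print(t_star, df)
```

Unlike the pooled version, each sample's variance is divided by its own sample size, so a noisy group cannot borrow precision from a quiet one.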
  • 19. Two-Sided Test
      • Researchers expect two samples will be different, but do not know which will have the higher mean
          ◦ E.g. viral load for drug group could be higher or lower than placebo
          ◦ E.g. H0: μP = μD vs. HA: μP ≠ μD
      • Two-sided tests are less powerful, but more general, than analogous one-sided tests of the same data
          ◦ The α = 0.05 rejection region of a two-sided test is divided in two parts
          ◦ Need a larger T-statistic for significant p-values with a two-sided test
  • 20. One-Sided Test
      • Researchers expect one sample will have a higher mean than the other sample before testing
          ◦ E.g. viral load for drug group should be lower than placebo
          ◦ E.g. H0: μP ≤ μD vs. HA: μP > μD
      • One-sided tests are more powerful but require more assumptions
          ◦ The α = 0.05 rejection region of the one-sided test is all on one side
          ◦ Smaller T-statistics will produce significant p-values in one-sided tests
  • 21. Nonparametric Statistical Tests
      • Nonparametric tests do not assume a particular distribution for a population and focus on more general descriptions, like medians
  • 22. Wilcoxon Rank Sum Test
      $Z^* = \dfrac{R_1 - \dfrac{n_1(n_1 + 1)}{2} - \dfrac{n_1 n_2}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$
      • The Wilcoxon rank sum test is a nonparametric test to compare the sums of two ranked samples
      • If assumptions are met, the test can be used to compare medians
          ◦ Wilcoxon rank sum test assumes both samples are from the same type of distribution with different locations
          ◦ Also known as the Mann-Whitney Test
  • 23. Example: Wilcoxon Rank Sum Test
      $Z^* = \dfrac{\left[R_1 - \dfrac{n_1(n_1 + 1)}{2}\right] - \dfrac{n_1 n_2}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$
      Here the bracketed term is the statistic, $n_1 n_2 / 2$ is the null value and the denominator is the standard error.
      • The "statistic" in a Wilcoxon rank sum test is equivalent to the difference between the sums of the ranked data in each class
      • The null hypothesis R1 = R2 produces a strange statistic, null value and standard error due to relationships among sums of ranks
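The pieces of Z* (statistic, null value, standard error) can be computed by hand from pooled ranks. The sketch below is an assumption-laden illustration: the helper names are made up, tied values receive average ranks, and the standard error is the simple large-sample form from the slide with no correction for ties.

```python
import math


def midranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def wilcoxon_z(x1, x2):
    """Large-sample Z* for the Wilcoxon rank sum test (no tie correction)."""
    n1, n2 = len(x1), len(x2)
    ranks = midranks(list(x1) + list(x2))
    r1 = sum(ranks[:n1])                      # rank sum of the first sample
    statistic = r1 - n1 * (n1 + 1) / 2
    null_value = n1 * n2 / 2
    standard_error = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (statistic - null_value) / standard_error


# Toy data: two classes that do not overlap at all
class_a = [1, 2, 3, 4, 5]
class_b = [6, 7, 8, 9]
z_star = wilcoxon_z(class_a, class_b)
print(z_star)  # about -2.449: class A holds all of the lowest ranks
```

With samples this small an exact version of the test would normally be preferred over the normal approximation.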
  • 24. Does It Really Compare Medians?
      • If two samples come from the same type of distribution … YES
          ◦ Median of a sample is comparable to its middle ranked observation(s)
          ◦ If two samples share a similar shape, the sample with the significantly higher rank sums will have the higher median too
      • If two samples come from two very different types of distributions … NO
          ◦ The Wilcoxon rank sum test actually compares the sums of the ranked data
          ◦ Many counterexamples have significant Wilcoxon tests, but equal medians
      • Figure: Control and Treated samples both have median = 100, yet the Wilcoxon rank sum test has a significant p-value = 0.0000027
  • 25. Does Wilcoxon Compare Distributions?
      • Wilcoxon rank sum test does NOT compare the distributions of samples
      • Samples from two very different distributions can yield non-significant Wilcoxon test p-values
      • It is difficult to interpret the Wilcoxon rank sum test if assumptions aren't met
  • 26. Student's T vs. Wilcoxon
      • Two-sample Student's T-test
          ◦ Assumes populations are normal, but works for all populations if large sample sizes are used for both classes
          ◦ Assumes variances are equal or requires Welch correction
      • Wilcoxon Rank Sum Test
          ◦ Assumes both samples are from the same type of distribution
          ◦ Generally preferred over Student's T-test by all journals
      • What if neither test is appropriate?
  • 27. Example: Viral Loads
      • Want to compare viral loads under treatment and placebo
          ◦ Viral loads are very high (> 10,000) and skewed right for the placebo group
          ◦ Viral loads all equal zero for treatment
      • Both Student's T-test and the Wilcoxon test are inappropriate
          ◦ Don't have normal data or equal variances for Student's T-test
          ◦ Can't use Wilcoxon test to compare two different distributions
      • Need an appropriate p-value, so what statistical test can we use?
  • 28. Exact Tests vs. Approximate Tests
      • Exact statistical tests are based on probability statements that are valid for any sample size
          ◦ Usually based on combinatorial or resampling strategies
          ◦ All resampling tests are considered exact tests
          ◦ Some implementations of Wilcoxon tests use exact tests based on combinatorial arguments for small sample sizes
      • Approximate statistical tests are based on mathematical arguments about convergence with large sample sizes
          ◦ Student's T-test is an approximate test based on arguments similar to the Central Limit Theorem
          ◦ Approximate tests may have inaccurate p-values for small samples
  • 29. Example: Bootstrap Samples
      Original Data        Bootstrap Samples
      Class   Data         Sample 1   Sample 2   Sample 3   …
      A       1            3          1          2          …
      A       2            3          5          2          …
      A       3            2          3          2          …
      A       4            4          1          4          …
      A       5            2          2          1          …
      B       6            7          9          6          …
      B       7            6          6          7          …
      B       8            8          6          8          …
      B       9            8          9          9          …
      • Bootstrapping uses sampling with replacement within each class
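Sampling with replacement within each class can be sketched as follows. The data are the nine toy values from the slide; the function name, the dictionary layout and the random seed are assumptions made for this illustration, so the generated samples will not match the slide's columns.

```python
import random


def bootstrap_sample(data_by_class, rng):
    """Resample each class with replacement, keeping each class at its original size."""
    return {
        label: [rng.choice(values) for _ in values]
        for label, values in data_by_class.items()
    }


original = {"A": [1, 2, 3, 4, 5], "B": [6, 7, 8, 9]}
rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [bootstrap_sample(original, rng) for _ in range(3)]

for number, sample in enumerate(samples, start=1):
    print(f"Sample {number}: A={sample['A']} B={sample['B']}")
```

Because values are drawn with replacement, a bootstrap sample can repeat some observations and omit others, but class A values never leak into class B.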
  • 30. Example: Jackknife Samples
      • Original data: class A = 1, 2, 3, 4, 5; class B = 6, 7, 8, 9
      • Jackknife samples 1, 2, 3, …: the rows for A = 1, 2, 3 and B = 6, 7, 8 each appear in only two of the three samples shown, while the rows for A = 4, 5 and B = 9 appear in all three
      • Jackknifing uses "leave one out" sampling within each class
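"Leave one out" sampling is simple to write down. The sketch below, with a hypothetical function name, builds every jackknife sample for one class from the toy data on the slide.

```python
def jackknife_samples(values):
    """Return one sample per observation, each omitting exactly that observation."""
    return [values[:i] + values[i + 1:] for i in range(len(values))]


class_b = [6, 7, 8, 9]
leave_one_out = jackknife_samples(class_b)
print(leave_one_out)  # [[7, 8, 9], [6, 8, 9], [6, 7, 9], [6, 7, 8]]
```

A class with n observations yields exactly n jackknife samples of size n − 1, so, unlike the bootstrap, no random number generator is needed.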
  • 31. Example: Permutation Samples
      Original Data        Permutation Samples (permuted class labels)
      Class   Data         Sample 1   Sample 2   Sample 3   …
      A       1            B          A          A          …
      A       2            B          A          B          …
      A       3            A          A          A          …
      A       4            A          B          B          …
      A       5            B          A          A          …
      B       6            A          B          B          …
      B       7            A          A          B          …
      B       8            A          B          A          …
      B       9            B          B          A          …
      • Permutation tests scramble the class labels among the samples
  • 32. Example: Viral Loads
      • One-sided permutation test to compare the medians from two different samples
      • Permutations of differences in median are centered at 0
      • Compute p-value using:
          $p = \dfrac{\#\,\text{permutations} > \text{true difference in medians}}{\text{total}\ \#\,\text{permutations}}$
        or
          $p = \dfrac{\#\,\text{permutations} < \text{true difference in medians}}{\text{total}\ \#\,\text{permutations}}$
  • 33. Example: Viral Loads
      • Two-sided permutation test uses absolute values of each permutation and the true difference in medians
      • Permutations of (absolute) differences are all greater than or equal to zero
      • Compute p-value using:
          $p = \dfrac{\#\,\text{permutations with } |\text{difference}| > |\text{true difference in medians}|}{\text{total}\ \#\,\text{permutations}}$
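The one-sided and two-sided permutation p-values for a difference in medians can be sketched by enumerating every relabelling of a small data set. Several choices below are assumptions made for illustration: the function name, the use of the nine toy values from the resampling slides rather than real viral loads, and counting permutations that are at least as extreme (>=) so the observed labelling counts toward its own p-value, which differs slightly from the strict inequality written on the slides.

```python
from itertools import combinations
from statistics import median


def permutation_p_values(group_a, group_b):
    """Exact permutation test for median(group_b) - median(group_a)."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = median(group_b) - median(group_a)
    upper = two_sided = total = 0
    # Every way to hand the "A" label to n_a of the pooled observations
    for chosen in combinations(range(len(pooled)), n_a):
        chosen = set(chosen)
        perm_a = [pooled[i] for i in chosen]
        perm_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = median(perm_b) - median(perm_a)
        total += 1
        upper += diff >= observed                 # one-sided: B higher than A
        two_sided += abs(diff) >= abs(observed)   # two-sided: either direction
    return {
        "observed": observed,
        "one_sided": upper / total,
        "two_sided": two_sided / total,
        "total": total,
    }


result = permutation_p_values([1, 2, 3, 4, 5], [6, 7, 8, 9])
print(result)
```

For larger samples, full enumeration becomes infeasible and a random subset of relabellings is typically used instead.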
  • 34. Multiple Testing
      • Use Family-Wise Error-Rate (FWER) adjustments for analysis of variance (ANOVA) tests or comparisons among 3-20 groups of samples
          ◦ One-way ANOVA is an extension of the Student's T-test
          ◦ Kruskal-Wallis is an extension of the Wilcoxon rank sum test
          ◦ Permutation tests can be adjusted with Bonferroni and other methods
      • Use False Discovery Rate (FDR) adjustments for high-throughput biology experiments
          ◦ E.g. microarrays, real-time PCR, next-gen sequencing, …
          ◦ FDR methods are more powerful than family-wise error rate (FWER) controlling methods, like those used in ANOVA, for high-throughput methods with hundreds or thousands of tests
  • 35. FWER Adjustments: Bonferroni
      • Suppose you want to compare 5 new drugs against a placebo, but you know all 5 drugs are ineffective
          ◦ Compute 5 Student's T-tests with false positive rate α = 0.05
          ◦ Each test has a 95% chance to correctly find p > 0.05
          ◦ Among all 5 tests, if the tests are independent, the chance of at least one false positive is: 1 − 0.95^5 = 0.23 > 0.05
      • The Bonferroni FWER adjustment
          ◦ Divide the false positive rate α = 0.05 by the number of tests, so only p-values smaller than α = 0.05 / 5 = 0.01 are significant
          ◦ Multiply p-values by the number of tests for an "adjusted p-value", using the formula min(1, 5·p) for these five tests
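The arithmetic on this slide is easy to check in code. The sketch below uses hypothetical function names and made-up p-values, and the family-wise error rate formula assumes the tests are independent.

```python
def family_wise_error_rate(alpha, n_tests):
    """Chance of at least one false positive among independent tests of true nulls."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_adjust(p_values):
    """Adjusted p-value = min(1, number of tests * p)."""
    n_tests = len(p_values)
    return [min(1.0, n_tests * p) for p in p_values]


fwer = family_wise_error_rate(alpha=0.05, n_tests=5)
print(round(fwer, 2))  # 0.23, as on the slide

raw_p = [0.004, 0.03, 0.2, 0.5, 0.9]  # hypothetical p-values for the five drugs
adjusted_p = bonferroni_adjust(raw_p)
significant = [p < 0.05 for p in adjusted_p]
print(adjusted_p, significant)
```

Comparing adjusted p-values with 0.05 gives the same decisions as comparing raw p-values with 0.05 / 5 = 0.01.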
  • 36. Other FWER Methods
      • Tukey's Honest Significant Difference
          ◦ Uses the "studentized range" method for all pair-wise comparisons
          ◦ E.g. for three groups, compare A vs. B, B vs. C and A vs. C
      • Dunnett's Multiple Comparisons Against a Control
          ◦ Uses "standardized range method" for comparisons against a control
          ◦ E.g. for three groups, compare A vs. C and B vs. C for control group C
      • Popular, yet outdated methods
          ◦ Fisher's LSD, Student-Newman-Keuls, Duncan's Test, …
  • 37. False Discovery Rate Methods
      • Consider a microarray experiment with 20,000 genes
          ◦ Bonferroni requires a false positive rate α = 0.05 / 20,000 = 0.0000025
          ◦ Few, if any, genes will be statistically significant using Bonferroni
      • The purpose of a microarray experiment is different from that of an ANOVA experiment comparing 3-20 groups
          ◦ Microarray experiments are often considered "fishing expeditions"
          ◦ Want to find approximately 100 genes of interest for follow-up experiments with quantitative real-time PCR or other methods
          ◦ Willing to accept a few false positives among our significant results, if we can capture all the biologically important genes in the process
  • 38. FDR Methods (cont.)
      • Suppose you could test 5,000 genes that were not differentially expressed
      • Those 5,000 genes would include many false positives (about 250 expected at α = 0.05)
      • The p-values should follow a uniform distribution from p = 0.00 to p = 1.00
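A quick simulation illustrates the point: if all 5,000 null hypotheses are true, the p-values behave like uniform random numbers and roughly 5% of them fall below 0.05 by chance. The seed and variable names are arbitrary choices for this sketch.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
n_genes = 5000

# Under the null hypothesis, each p-value is a draw from Uniform(0, 1)
null_p_values = [rng.random() for _ in range(n_genes)]

false_positives = sum(p < 0.05 for p in null_p_values)
expected = 0.05 * n_genes

print(false_positives, expected)  # the count should land near the expected 250
```

None of these "significant" genes is a real discovery, which is the motivation for adjusting the cut-off in high-throughput experiments.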
  • 39. FDR Methods (cont.)
      • Add in 1,000 differentially expressed genes (DEGs)
      • All DEGs have p < 0.05
      • Want to adjust the cut-off value α = 0.05 until the list of significant genes has a controlled proportion of false positives
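The slides do not name a specific FDR procedure. One widely used choice is the Benjamini-Hochberg step-up method, sketched below as an assumption rather than as the presenter's method; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # keeping the adjusted values monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw_p = [0.01, 0.04, 0.03, 0.005]  # hypothetical p-values for four genes
adjusted_p = benjamini_hochberg(raw_p)
print(adjusted_p)  # approximately [0.02, 0.04, 0.04, 0.02]
```

Calling genes significant when the adjusted value is below 0.05 aims to keep the expected proportion of false positives in that list at or below 5%, which is a less strict goal than avoiding any false positive at all.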
  • 40. Thank You
      For questions or comments please contact: ScienceApps@niaid.nih.gov, 301.496.4455