# Topics in Biostatistics: Part II


1. CCEB Topics in Biostatistics, Part 2. Sarah J. Ratcliffe, Ph.D., Center for Clinical Epidemiology and Biostatistics, University of Penn School of Medicine
2. Outline
   - Hypothesis testing
   - Examples
   - Interpreting results
   - Resources
3. Hypothesis testing. Steps:
   - Select a one-sided or two-sided test.
   - Establish the level of significance (e.g., α = .05).
   - Select an appropriate test statistic.
   - Compute test statistic with actual data.
   - Calculate degrees of freedom (df) for the test statistic.
4. Hypothesis testing. Steps cont'd:
   - Obtain a tabled value for the statistical test.
   - Compare the test statistic to the tabled value.
   - Calculate a p-value.
   - Make decision to reject or fail to reject the null hypothesis.
5. Hypothesis testing. Steps:
   - Select a one-sided or two-sided test.
   - Establish the level of significance (e.g., α = .05).
   - Select an appropriate test statistic.
   - Compute test statistic with actual data.
   - Calculate degrees of freedom (df) for the test statistic.
6. Hypothesis testing: One-sided versus Two-sided
   - Determined by the alternative hypothesis.
   - Unidirectional = one-sided
   - Example: Infected macaques given vaccine or placebo. Higher viral replication in the vaccine group has no benefit of interest.
     - H0: vaccine has no beneficial effect on viral-replication levels at 6 weeks after infection.
     - Ha: vaccine lowers viral-replication levels by 6 weeks after infection.
7. Hypothesis testing: One-sided versus Two-sided
   - Bi-directional = two-sided
   - Example: Infected macaques given vaccine or placebo. Interested in whether vaccine has any effect on viral-replication levels, regardless of direction of effect.
     - H0: vaccine has no effect on viral-replication levels at 6 weeks after infection.
     - Ha: vaccine affects the viral-replication levels.
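The two macaque examples can be written in symbols. This is a sketch only: $\mu_V$ and $\mu_P$ are notation assumed here (not used in the slides) for the mean viral-replication level at 6 weeks in the vaccine and placebo groups.

$$
\text{One-sided:}\quad H_0:\ \mu_V \ge \mu_P \qquad H_a:\ \mu_V < \mu_P
$$

$$
\text{Two-sided:}\quad H_0:\ \mu_V = \mu_P \qquad H_a:\ \mu_V \ne \mu_P
$$

For a symmetric test statistic such as $t$, when the observed effect lies in the direction stated by $H_a$, the one-sided p-value is half the two-sided p-value.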
8. Hypothesis testing. Steps:
   - Select a one-sided or two-sided test.
   - Establish the level of significance (e.g., α = .05).
   - Select an appropriate test statistic.
   - Compute test statistic with actual data.
   - Calculate degrees of freedom (df) for the test statistic.
9. Hypothesis testing: Level of Significance
   - How many different hypotheses are being examined?
   - How many comparisons are needed to answer this hypothesis?
   - Are any interim analyses planned? E.g., test the data, then depending on the results collect more data and re-test.
   - => How many tests will be run in total?
10. Hypothesis testing: Level of Significance
    - α_total = desired total Type-I error (false positives) for all comparisons.
    - One test: α_1 = α_total
    - Multiple tests / comparisons:
      - If α_i = α_total, then ∑α_i > α_total
      - Need to use a smaller α for each test.
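A short worked calculation shows why each test needs a smaller α. It assumes $k$ independent tests, each run at the same level $\alpha$, with every null hypothesis true; the independence assumption is added here for illustration and is not stated on the slide.

$$
P(\text{at least one false positive}) = 1 - (1 - \alpha)^k \le k\alpha
$$

$$
k = 10,\ \alpha = 0.05:\qquad 1 - 0.95^{10} \approx 0.40
$$

So ten tests at α = .05 each carry roughly a 40% chance of at least one false positive, far above the intended 5%.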
11. Hypothesis testing: Level of Significance
    - Conservative approach: α_i = α_total / number of comparisons
    - Can give a different α to each comparison.
    - Formal methods include: Bonferroni, Tukey-Kramer, Scheffé's method, Waller-Duncan.
    - An O'Brien-Fleming boundary or a Lan-DeMets analog can be used to determine α_i for interim analyses.
    - Benjamini Y, Hochberg Y (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. JRSSB, 57:289-300.
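The conservative α_total / (number of comparisons) rule and the Benjamini-Hochberg step-up procedure cited above can be sketched in a few lines of Python. The function names and the five p-values are hypothetical, chosen only to show how the two approaches can disagree; this is an illustration, not code from the talk.

```python
def bonferroni_alpha(alpha_total, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha_total / n_tests


def benjamini_hochberg(p_values, q=0.05):
    """Indices of hypotheses rejected by the BH step-up procedure at FDR q."""
    m = len(p_values)
    # Rank the hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Find the largest rank whose p-value is under its threshold.
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    return sorted(order[:k_max])


# Hypothetical p-values from five comparisons (not data from the slides).
p_values = [0.001, 0.012, 0.025, 0.038, 0.20]

alpha_i = bonferroni_alpha(0.05, len(p_values))
bonferroni_rejected = [i for i, p in enumerate(p_values) if p <= alpha_i]
bh_rejected = benjamini_hochberg(p_values, q=0.05)

print("Bonferroni per-test alpha:", alpha_i)
print("Rejected by Bonferroni:", bonferroni_rejected)
print("Rejected by Benjamini-Hochberg:", bh_rejected)
```

Bonferroni controls the chance of any false positive, while Benjamini-Hochberg controls the expected proportion of false positives among the rejected hypotheses, which is why it rejects more here.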
12. Hypothesis testing. Steps:
    - Select a one-tailed or two-tailed test.
    - Establish the level of significance (e.g., α = .05).
    - Select an appropriate test statistic.
    - Compute test statistic with actual data.
    - Calculate degrees of freedom (df) for the test statistic.
13. Hypothesis testing: Selecting an Appropriate test
    - How many samples are being compared?
      - One sample
      - Two samples
      - Multi-samples
    - Are these samples independent?
      - Unrelated subjects in each sample.
      - Subjects in each sample related / same.
14. Hypothesis testing: Selecting an Appropriate test
    - Are your variables continuous or categorical?
    - If continuous, are the data normally distributed?
      - Normality can be assessed using a P-P (or Q-Q) plot.
      - The plot should be approximately a straight line for normality.
      - If not normal, can the data be transformed to normality?
    - Blindly assuming normality can lead to wrong conclusions!
15. Hypothesis testing: Selecting an Appropriate test. Approximately a straight line = normal assumption okay.
16. Hypothesis testing: Selecting an Appropriate test. Not a straight line = NOT normal. Can it be transformed to normality?
17. Hypothesis testing: Selecting an Appropriate test. The natural log transform of the data is approximately a straight line = normal assumption okay. Analyze the transformed data, NOT the original data.
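The "straight line" check can also be summarized numerically. The sketch below computes the correlation between the sorted data and the matching standard normal quantiles, before and after a natural log transform. The sample values, the helper name `qq_correlation`, and the plotting positions (i + 0.5)/n are all assumptions made for illustration; the slides rely on visual inspection of the plot.

```python
import math
from statistics import NormalDist


def qq_correlation(values):
    """Correlation between sorted data and standard normal quantiles.

    Values near 1 mean the Q-Q plot is close to a straight line.
    """
    n = len(values)
    ordered = sorted(values)
    # Theoretical quantiles at plotting positions (i + 0.5) / n.
    theo = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mean_t = sum(theo) / n
    mean_o = sum(ordered) / n
    sxy = sum((t - mean_t) * (o - mean_o) for t, o in zip(theo, ordered))
    sxx = sum((t - mean_t) ** 2 for t in theo)
    syy = sum((o - mean_o) ** 2 for o in ordered)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical right-skewed measurements (not data from the slides).
raw = [1.2, 1.9, 2.4, 3.1, 4.0, 5.3, 7.1, 9.8, 15.2, 31.0]
logged = [math.log(v) for v in raw]

r_raw = qq_correlation(raw)
r_log = qq_correlation(logged)

print("Q-Q correlation, raw data:", round(r_raw, 3))
print("Q-Q correlation, log-transformed:", round(r_log, 3))
```

A clearly higher correlation after the transform supports analyzing the log-transformed values, but it does not replace looking at the plot itself.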
18. Hypothesis testing: Geometric versus Arithmetic mean
    - The geometric mean of n positive numerical values is the nth root of the product of the n values.
    - The geometric mean is never greater than the arithmetic mean; the two are equal only when all values are identical.
    - The geometric mean is better when some values are very large in magnitude and others are small.
    - If the geometric mean is used, log-transform the data before analyzing.
    - The arithmetic mean of log-transformed data is the log of the geometric mean of the data.
    - E.g., t-test on log-transformed data = test for location of the geometric mean.
    - Langley R., Practical Statistics Simply Explained, 1970, Dover Press
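The relationship between the two means and the log transform can be checked directly. The four values below are hypothetical and chosen so the geometric mean is easy to verify by hand; this is a small illustration rather than part of the original slides.

```python
import math
from statistics import geometric_mean, mean

# Hypothetical positive values spanning a wide range (not slide data).
values = [2.0, 8.0, 32.0, 128.0]

arithmetic = mean(values)
geometric = geometric_mean(values)  # 4th root of 2 * 8 * 32 * 128

# Arithmetic mean of the logs, back-transformed with exp().
mean_of_logs = mean(math.log(v) for v in values)
geometric_from_logs = math.exp(mean_of_logs)

print("Arithmetic mean:", arithmetic)
print("Geometric mean:", geometric)
print("exp(mean of logs):", geometric_from_logs)
```

The last two numbers agree up to floating-point rounding, which is why a test on the mean of log-transformed data is a test about the geometric mean.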
19. Source: Richardson & Overbaugh (2005). Basic statistical considerations in virological experiments. Journal of Virology, 79(2): 669-676.

    | Type of data | No. of samples being compared | Relationship between samples | Underlying distribution of all samples | Potential statistical tests |
    |---|---|---|---|---|
    | Binary | 1 | n/a | Binary | One-sample binomial test |
    | Binary | 2 | Independent | Binary | Chi-square test, Fisher's exact test |
    | Binary | >2 | Independent | Binary | Chi-square test |
    | Binary | 2 | Paired | Binary | McNemar's test |
    | Binary | >2 | Related | Binary | Cochran's Q test |
    | Continuous | 1 | n/a | Normal | One-sample t-test for means, one-sample chi-square test for variances |
    | Continuous | 1 | n/a | Non-normal | One-sample Wilcoxon signed-rank test, one-sample sign test |
    | Continuous | 2 | Independent | Normal | Two-sample t-test for means, two-sample F test for variances |
    | Continuous | 2 | Independent | Non-normal | Wilcoxon rank sum test |
    | Continuous | 2 | Paired | Normal | Paired t-test |
    | Continuous | 2 | Paired | Non-normal | Wilcoxon signed-rank test, sign test |
    | Continuous | >2 | Independent | Normal | One-way ANOVA for means, Bartlett's test of homogeneity for variances |
    | Continuous | >2 | Independent | Non-normal | Kruskal-Wallis test |
    | Continuous | >2 | Related | Non-normal | Friedman rank sum test |
20. Hypothesis testing: Selecting an Appropriate test
    - Other tests are available for more complex situations. For example:
      - Repeated measures ANOVA: >2 measurements taken on each subject; usually interested in time effect.
      - GEEs / Mixed-effects models: >2 measurements taken on each subject; adjust for other covariates.
21. Hypothesis testing. Steps:
    - Select a one-tailed or two-tailed test.
    - Establish the level of significance (e.g., α = .05).
    - Select an appropriate test statistic.
    - Run the test.
22. Example 1
    - Expression of chemokine receptors on CD14+/CD14- populations of blood monocytes.
    - Percent of cells positive by FACS.
23. CCR8

    | Subject | CD14+ | CD14- |
    |---|---|---|
    | 1 | 5 | 17 |
    | 2 | 9 | 25 |
    | 3 | 13 | 36 |
    | 4 | 2 | 9 |
    | 5 | 5 | 18 |
    | 6 | 0 | 2 |
    | 7 | 6 | 6 |
    | 8 | 21 | 30 |
    | 9 | 5 | 6 |
    | 10 | 36 | 35 |
    | mean | 10.2 | 18.4 |
    | st dev | 10.9 | 12.6 |
    | st error | 3.4 | 4.0 |
24. Example 1 cont'd
    - Continuous data, 2 samples
      - => t-test, if normal, OR
      - => Wilcoxon rank sum or signed-rank test, if non-normal
    - Are samples independent or paired?
    - If independent, can test for equality of variances using Levene's test.
25. Example 1 cont'd
    - T-tests in Excel: =TTEST(L6:L15,M6:M15,2,2)
      - First argument: cells containing data from sample 1
      - Second argument: cells containing data from sample 2
      - Third argument: 1-sided (1) or 2-sided (2) test
      - Fourth argument: type of t-test
        - 1: paired
        - 2: independent, equal variance
        - 3: independent, unequal variance
27. Example 1 cont'd
    - Possible results for different assumptions (p-values):

    | Assumption | Normal (t-tests) | Non-normal (non-parametric tests) |
    |---|---|---|
    | Independent, equal variance | 0.137 | |
    | Independent, unequal variance | 0.137 | 0.105 |
    | Paired | 0.010 | 0.013 |
28. Example 1 cont'd
    - Which result is correct?
      - Data are paired.
      - The differences for each subject are normally distributed.
    - => Paired t-test, p = .0095
    - There is a difference in the percentage of positive CD14+ and CD14- cells.
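The paired and independent (equal variance) t-tests for the CCR8 table can be reproduced with the Python standard library. This is a sketch rather than the Excel calculation used in the talk: the two-sided p-value comes from a helper, `two_sided_p`, written here for illustration, which integrates the t density numerically with Simpson's rule instead of reading a tabled value.

```python
import math
from statistics import mean, stdev, variance

# CCR8 percent-positive cells for the 10 subjects in the table.
cd14_pos = [5, 9, 13, 2, 5, 0, 6, 21, 5, 36]
cd14_neg = [17, 25, 36, 9, 18, 2, 6, 30, 6, 35]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * area


# Paired t-test: one-sample t-test on the within-subject differences.
diffs = [neg - pos for pos, neg in zip(cd14_pos, cd14_neg)]
n = len(diffs)
t_paired = mean(diffs) / (stdev(diffs) / math.sqrt(n))
p_paired = two_sided_p(t_paired, n - 1)

# Independent two-sample t-test with a pooled (equal) variance.
n1, n2 = len(cd14_pos), len(cd14_neg)
pooled_var = ((n1 - 1) * variance(cd14_pos)
              + (n2 - 1) * variance(cd14_neg)) / (n1 + n2 - 2)
t_indep = (mean(cd14_neg) - mean(cd14_pos)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
p_indep = two_sided_p(t_indep, n1 + n2 - 2)

print(f"Paired:      t = {t_paired:.3f}, df = {n - 1}, p = {p_paired:.4f}")
print(f"Independent: t = {t_indep:.3f}, df = {n1 + n2 - 2}, p = {p_indep:.3f}")
```

The mean difference is the same 8.2 percentage points in both tests; only the standard error and the degrees of freedom change, which is what moves the p-value.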
29. A graph of the 95% CIs for the means would give the impression there is no difference …
30. When it's really the differences we are testing.
31. Example 1 cont'd
    - Note: paired tests don't always give lower p-values.
    - A 1-sided test on the CCR5 values would give p-values of:
      - p = 0.06, independent samples
      - p = 0.11, paired samples
    - WHY?
32. Example 1 cont'd
    - The differences have a larger spread than the individual variables.
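The spread of the differences follows from a standard variance identity. Here $X$ and $Y$ are notation assumed for the two paired measurements on a subject, and $D = Y - X$ is their difference:

$$
\operatorname{Var}(D) = \operatorname{Var}(X) + \operatorname{Var}(Y) - 2\operatorname{Cov}(X, Y)
$$

Pairing pays off when the two measurements are strongly positively correlated, because the covariance term then shrinks $\operatorname{Var}(D)$. For the CCR8 table the standard deviation of the differences is about 7.9, smaller than either 10.9 or 12.6. When the correlation is weak or negative, $\operatorname{Var}(D)$ can exceed the individual variances, and the paired test also has fewer degrees of freedom ($n - 1$ rather than $2n - 2$), so its p-value can be larger.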
33. Example 2
    - Does the level of CCR5 expression on PBLs (basal or upregulated using lentiviral vector) determine the % of entry that occurs via CCR5?
    - Two viruses:
      - 89.6
      - DH12
34. Example 2 cont'd. [Figure: CCR5-mediated entry into PBL from 6 donors. x-axis: % of cells CCR5 positive; y-axis: % of entry mediated by CCR5. Fitted lines: 89.6, y = 3.7371x - 0.1265, R² = 0.4473; DH12, y = 4.1408x + 4.2137, R² = 0.4333.]
35. Example 2 cont'd
    - How do we know if the slope of the line is significantly different from 0?
    - Can perform a t-test on the slope estimate. For simple linear regression, this is the same as a t-test for correlation (= square root of R²).
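For simple linear regression the slope t statistic can be recovered from R² alone, since t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom. The sketch below applies this to the two reported R² values. The number of points per line is an assumption: n = 6 is taken from the "6 donors" in the figure title, and if each donor contributes both a basal and an upregulated measurement the true n would differ and the points would not be independent.

```python
import math


def slope_t_statistic(r_squared, n_points):
    """Magnitude of the t statistic for H0: slope = 0 in simple linear regression.

    Uses |t| = sqrt(R^2 * (n - 2) / (1 - R^2)); the sign follows the slope.
    """
    df = n_points - 2
    t = math.sqrt(r_squared * df / (1.0 - r_squared))
    return t, df


# ASSUMPTION: 6 points per fitted line (one per donor); not confirmed by the slides.
N_POINTS = 6

t_896, df = slope_t_statistic(0.4473, N_POINTS)
t_dh12, _ = slope_t_statistic(0.4333, N_POINTS)

# Tabled two-sided critical value of t for alpha = .05 with 4 degrees of freedom.
T_CRIT_DF4 = 2.776

print(f"89.6: |t| = {t_896:.3f} on {df} df, exceeds critical value: {t_896 > T_CRIT_DF4}")
print(f"DH12: |t| = {t_dh12:.3f} on {df} df, exceeds critical value: {t_dh12 > T_CRIT_DF4}")
```

Under that assumption neither slope would reach significance at α = .05 two-sided, despite R² values above 0.4, which illustrates how little power six points provide.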
36. Example 2 cont'd
37. Interpreting Results
    - P-values
      - Is there a statistically significant result?
      - If not, was the sample size large enough to detect a biologically meaningful difference?
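Whether a sample was large enough can be roughed out with the usual normal-approximation formula for comparing two independent means, n per group = 2·((z_{1−α/2} + z_{power})·σ/δ)². The function name and the example values of δ (the smallest biologically meaningful difference) and σ (the common standard deviation) are hypothetical; the online calculators listed under Online Resources use more exact methods.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-sample comparison.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta)^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)


# Hypothetical inputs: detect a difference of 8 percentage points when the
# standard deviation in each group is about 12.
n_needed = n_per_group(delta=8.0, sigma=12.0)

print("Approximate subjects needed per group:", n_needed)
```

A non-significant result from a study much smaller than this figure says little about whether a difference of the stated size exists.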
38. Online Resources
    - Power / sample size calculators
      - http://calculators.stat.ucla.edu/powercalc/
      - http://www.stat.uiowa.edu/~rlenth/Power/
    - Free statistical software
      - http://members.aol.com/johnp71/javasta2.html#Freebies
39. BECC – Consulting Center
    - www.cceb.upenn.edu/main/center/becc.html
    - Hourly fee service
    - Design and analysis strategies for research proposals;
    - Selecting and implementing appropriate statistical methods for specific applications to research data;
    - Statistical and graphical analysis of data;
    - Statistical review of manuscripts.