BSHI 2002 Biostatistics Workshop [M.Tevfik DORAK]
  • Transcript

    • 1. STATISTICAL ANALYSIS OF HLA AND DISEASE ASSOCIATIONS
      BSHI 2002 Glasgow, Scotland
      M. Tevfik DORAK
      Department of Epidemiology, University of Alabama at Birmingham, U.S.A.
      Present address (2007): Newcastle University, School of Clinical Medical Sciences, U.K.
      http://www.dorak.info
    • 2. AIMS
      This workshop will cover categorical data analysis for case-control design and some concepts in population genetics.
      • Familiarization with common statistical tests useful in HLA and disease association studies
      • Clarification of several statistical concepts
      • Discussion of common mistakes
      • Interpretation of results
    • 3. Why would you do an association study?
      • Disease gene mapping and positional cloning
      • Molecular profiling (to predict susceptibility, outcome, response, prognosis)
      • Basic science (to learn about disease development and subsequently to design diagnostic tests or new treatment)
    • 4. Meaning of an association
      • Population stratification (confounding by ethnicity) or other spurious associations
      • Linkage disequilibrium (confounding by locus)
      • Direct involvement in the disease process
    • 5. Cross-validation of results
      • Replication (population level and/or family-based)
      • Functional studies
      • Split the sample into two random groups (if nothing else can be done!)
    • 6. Failure to replicate
      • False positive in the original study
      • False negative in the second one
      • Population specificity
      • Population stratification
    • 7. Considerations at the beginning
      • Will you have enough power?
      • Who are the controls? Unrelated or family-based? A subgroup vs another one (males vs females)?
      • Prospective sequential sampling or retrospective convenience samples for cases?
      • Remember you will be testing whether the cases and controls are from the same population. The answer shouldn’t be obvious at the beginning.
    • 8. An example of power calculation
      Proportion Difference Power / Sample Size Calculation
        Significance Level (alpha): .05 (Usually 0.05)
        Power (% chance of detecting): .80 (Usually 80)
        First Group Population Proportion: .40 (Between 0.0 and 1.0)
        Second Group Population Proportion: .60 (Between 0.0 and 1.0)
        Relative Sample Sizes Required: 2.0 (For equal samples, use 1.0)
        Sample Size Required: Group 1: 80, Group 2: 160
      (Sample sizes become 115 : 231 for P = 0.01)
    • 9. An example of power calculation
      Proportion Difference Power / Sample Size Calculation
        Significance Level (alpha): .01 (Usually 0.05)
        Power (% chance of detecting): .80 (Usually 80)
        First Group Population Proportion: .05 (Between 0.0 and 1.0)
        Second Group Population Proportion: .10 (Between 0.0 and 1.0)
        Relative Sample Sizes Required: 2.0 (For equal samples, use 1.0)
        Sample Size Required: Group 1: 538, Group 2: 1077
      http://statpages.org/proppowr.html
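The two power calculations above can be approximated in a few lines. The slides do not say which formula the online calculator uses; the sketch below assumes the normal approximation for two proportions with a continuity correction (a Fleiss-style formula). The function name `sample_sizes` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def sample_sizes(p1, p2, alpha=0.05, power=0.80, ratio=1.0):
    """Sizes of group 1 and group 2 (n2 = ratio * n1) for comparing two proportions.

    Assumption: normal approximation with a continuity correction
    (Fleiss-style); the slides do not name the formula behind their numbers.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    diff = abs(p1 - p2)
    p_bar = (p1 + ratio * p2) / (1 + ratio)  # pooled proportion under the null
    q_bar = 1 - p_bar

    # Uncorrected size of group 1
    n = (
        z_alpha * sqrt((ratio + 1) * p_bar * q_bar)
        + z_beta * sqrt(ratio * p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2 / (ratio * diff ** 2)

    # Continuity correction
    n1 = n / 4 * (1 + sqrt(1 + 2 * (ratio + 1) / (n * ratio * diff))) ** 2
    return n1, ratio * n1


# The three scenarios quoted on the slides
for p1, p2, alpha in [(0.40, 0.60, 0.05), (0.40, 0.60, 0.01), (0.05, 0.10, 0.01)]:
    n1, n2 = sample_sizes(p1, p2, alpha=alpha, power=0.80, ratio=2.0)
    print(p1, p2, alpha, round(n1), round(n2))
```

Rounded to the nearest integer, these results should agree with the group sizes quoted on the slides, which suggests (but does not prove) that the calculator uses a similar formula. The second example shows how quickly the required sample grows when the frequencies are low and the significance level is stricter.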
    • 10. Beware of the following flaws and fallacies of epidemiologic studies
      • confounders (known or unknown)
      • selection bias
      • response bias
      • misclassification bias
      • variable observer
      • Hawthorne effect (changes caused by the observer in the observed values)
      • diagnostic accuracy bias
      • regression to the mean
      • significance Turkey
      • nerd of nonsignificance
      • cohort effect
      • ecologic fallacy
      • Berkson bias (selection bias in hospital-based studies)
      SEE: http://www.dorak.info/epi/bc.html
    • 11. Categorical Data Analysis
      • 2x2 Table Analysis for Association
        • Chi-squared (Pearson, Yates)
        • Fisher
        • G-test
        • McNemar's test: TDT, HRR
        • (Logistic Regression)
      • Odds Ratio - Relative Risk
        • Difference between OR and RR
        • Woolf-Haldane Modification
        • Comparison of two ORs
        • Adjusted OR
      • Linkage Disequilibrium
        • Comparison of two LDs
      • RxC (multicontingency) Table Analysis
        • Chi-squared
        • G-test
        • Exact Tests (needed for HWE)
        • Trend Test (frequently overlooked)
      See http://www.dorak.info/hla/stat.html
    • 12. The SAS System: FREQ Procedure Output – I

      Statistic                      DF   Value     Prob
      Chi-Square                     1    7.9047    0.0049
      Likelihood Ratio Chi-Square    1    8.0067    0.0047
      Continuity Adj. Chi-Square     1    7.3064    0.0069
      Mantel-Haenszel Chi-Square     1    7.8840    0.0050
      Phi Coefficient                     -0.1439
      Contingency Coefficient             0.1424
      Cramer's V                          -0.1439

      Fisher's Exact Test
      Cell (1,1) Frequency (F)       45
      Left-sided Pr <= F             0.0033
      Right-sided Pr >= F            0.9983
      Table Probability (P)          0.0016
      Two-sided Pr <= P              0.0066
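The statistics in this output can be recomputed by hand. The slide does not show the underlying 2x2 table; the counts used below (45, 124 / 86, 127) are an inference from the reported statistics and should be read as an assumption, not as the author's data. The sketch uses only the Python standard library and covers the Pearson, Yates-corrected and likelihood ratio (G-test) chi-squared statistics and Fisher's exact test.

```python
from math import comb, erfc, log, sqrt

# ASSUMED 2x2 table (inferred from the reported statistics, not shown on the slide)
a, b, c, d = 45, 124, 86, 127
n = a + b + c + d
row1, row2, col1, col2 = a + b, c + d, a + c, b + d


def chi2_p(x):
    """Upper-tail P value of a chi-squared statistic with 1 degree of freedom."""
    return erfc(sqrt(x / 2))


margins = row1 * row2 * col1 * col2
pearson = n * (a * d - b * c) ** 2 / margins
yates = n * (abs(a * d - b * c) - n / 2) ** 2 / margins  # continuity-adjusted

# G-test (likelihood ratio chi-squared): 2 * sum(O * ln(O / E))
observed = [a, b, c, d]
expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
g_stat = 2 * sum(o * log(o / e) for o, e in zip(observed, expected))


def table_prob(x):
    """Hypergeometric probability of cell (1,1) = x with all margins fixed."""
    return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)


x_min, x_max = max(0, col1 - row2), min(row1, col1)
p_obs = table_prob(a)
fisher_left = sum(table_prob(x) for x in range(x_min, a + 1))
# Two-sided: add up every table that is no more probable than the observed one
fisher_two = sum(
    p for p in map(table_prob, range(x_min, x_max + 1)) if p <= p_obs * (1 + 1e-9)
)

for name, stat in [("Pearson", pearson), ("Yates", yates), ("G-test", g_stat)]:
    print(f"{name:8s} chi2 = {stat:.4f}  P = {chi2_p(stat):.4f}")
print(f"Fisher: table P = {p_obs:.4f}, left = {fisher_left:.4f}, two-sided = {fisher_two:.4f}")
```

With the assumed counts, the three chi-squared statistics and the Fisher probabilities should come out close to the SAS values above. Note how the continuity correction lowers the statistic and that the exact two-sided P value is of the same order as the corrected one.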
    • 13. The SAS System: FREQ Procedure Output – II

      Estimates of the Common Relative Risk (Row1/Row2)
      Type of Study    Method            Value     95% Confidence Limits
      Case-Control     Mantel-Haenszel   0.5359    0.3461    0.8299
      (Odds Ratio)     Logit             0.5359    0.3461    0.8299
      Cohort           Mantel-Haenszel   0.6595    0.4892    0.8891
      (Col1 Risk)      Logit             0.6595    0.4892    0.8891
      Cohort           Mantel-Haenszel   1.2306    1.0666    1.4198
      (Col2 Risk)      Logit             1.2306    1.0666    1.4198
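The difference between the odds ratio and the two relative risks in this output is easier to see when they are computed side by side. The counts below (45, 124 / 86, 127) are not shown on the slide; they are inferred from the reported estimates and are an assumption. The helper `ratio_with_ci` is a hypothetical name, and the intervals are Wald (logit-type) limits on the log scale.

```python
from math import exp, log, sqrt

# ASSUMED 2x2 table (inferred from the reported estimates, not shown on the slide)
#            col1  col2
a, b = 45, 124   # row 1
c, d = 86, 127   # row 2


def ratio_with_ci(estimate, se_log, z=1.959964):
    """Ratio estimate with a Wald 95% confidence interval built on the log scale."""
    return estimate, exp(log(estimate) - z * se_log), exp(log(estimate) + z * se_log)


# Odds ratio (case-control reading of the table)
odds_ratio = ratio_with_ci((a * d) / (b * c), sqrt(1 / a + 1 / b + 1 / c + 1 / d))

# Relative risks (cohort reading of the table), row 1 vs row 2
risk_col1 = ratio_with_ci(
    (a / (a + b)) / (c / (c + d)),
    sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d)),
)
risk_col2 = ratio_with_ci(
    (b / (a + b)) / (d / (c + d)),
    sqrt(1 / b - 1 / (a + b) + 1 / d - 1 / (c + d)),
)

for name, (est, low, high) in [
    ("Odds ratio", odds_ratio),
    ("Col1 risk", risk_col1),
    ("Col2 risk", risk_col2),
]:
    print(f"{name:10s} {est:.4f}  ({low:.4f}, {high:.4f})")
```

With the assumed counts, the estimates should be close to the SAS values above. The odds ratio is further from 1 than the column 1 relative risk computed from the same table, which is why the two measures should not be reported interchangeably.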
    • 14. Parent-Case Trios in TDT/HRR
      [Figure: pedigree diagrams of parent-case trios with the genotypes of both parents and the affected child]
      “Transmitted allele” = “case”; “Non-transmitted allele” = “control”
    • 15. AN EXAMPLE OF TDT: TRANSMISSION DISEQUILIBRIUM OF HLA-B62 TO THE PATIENTS WITH CHILDHOOD AML (Dorak et al, BSHI 2002)
      Out of 13 parents heterozygote for B62, 12 transmitted B62 to the affected child and 1 did not.

                               Nontransmitted Allele
                               B62      Other
      Transmitted    B62       x        12
      Allele         Other     1        y

      McNemar’s test results: P = 0.0055 (with continuity correction); odds ratio = 12.0, 95% CI = 1.8 to 513
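The TDT result above depends only on the two discordant cells (12 and 1), so it can be checked with a short sketch. The slide does not say how the confidence interval was obtained; the code assumes an exact (Clopper-Pearson) interval for the transmission probability, converted to odds. The helper names `upper_tail` and `solve` are hypothetical.

```python
from math import comb, erfc, sqrt

transmitted, not_transmitted = 12, 1  # discordant counts from the slide
n = transmitted + not_transmitted

# McNemar chi-squared with continuity correction, 1 degree of freedom
chi2 = (abs(transmitted - not_transmitted) - 1) ** 2 / n
p_value = erfc(sqrt(chi2 / 2))

# Matched-pairs odds ratio: ratio of the discordant cells
odds_ratio = transmitted / not_transmitted


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def solve(k, n, target):
    """Bisection for the p at which P(X >= k) equals target (increasing in p)."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if upper_tail(k, n, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# ASSUMPTION: exact (Clopper-Pearson) 95% limits for the transmission
# probability, converted to odds p / (1 - p). Requires 0 < transmitted < n.
p_low = solve(transmitted, n, 0.025)
p_high = solve(transmitted + 1, n, 0.975)
ci = (p_low / (1 - p_low), p_high / (1 - p_high))

print(f"McNemar chi2 = {chi2:.3f}, P = {p_value:.4f}")
print(f"OR = {odds_ratio:.1f}, 95% CI = {ci[0]:.1f} to {ci[1]:.0f}")
```

The very wide interval is a reminder that 13 informative parents give a significant but imprecise estimate, so the result needs replication before the size of the effect is taken at face value.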
    • 16. Multiple comparisons
      • Not needed if the study is not hypothesis driven (i.e., a fishing experiment)
      • Not needed if the study is hypothesis driven ('Possible relevance of the HLA system' is not a valid hypothesis in this context. Those studies belong to the fishing experiments group)
      • Therefore, it is not clear when it is needed in HLA association studies. Most frequently, it is an excuse for a busy reviewer to avoid a comprehensive review
      • The best solution is to avoid facing this problem, ideally by providing replication and/or functional data to support the statistical association before it is dismissed as a spurious result of multiple comparisons
    • 17. Common Mistakes in Statistical Evaluation of Association Study Results - I
      • Confusion between corrections (Yates/Williams for continuity VS Bonferroni)
      • Confusion between RR and OR (they are not the same)
      • Confusion between expected and observed values in cells of a contingency table
      • Small sample size issue
      • Don’t confuse a negative result with lack of power (‘No significant difference between the two groups and they were pooled’ VS ‘the difference did not reach significance due to small sample size’ are different interpretations of the same phenomenon, i.e., lack of power)
      • Using Chi-squared test for small sample size (why not use Fisher all the time?)
      • Using Chi-squared test for HWE (use exact test or G-test)
    • 18. Common Mistakes in Statistical Evaluation of Association Study Results - II
      • One-tailed and two-tailed P values (always use two-tailed)
      • Trend test for a multicontingency table? (if appropriate, more powerful)
      • Multiple comparison issue
      • Failure to give the strength of the association (OR, RR, RH)
      • Use of the word ‘proof’. Does statistics prove anything? (A ‘P value’ provides a sense of the strength of the evidence for or against the null hypothesis of no association)
      • Reliance on large sample effect to achieve significance
      • Showing P values as 0.000 (this means P < 0.001)
      • Confusion between association and linkage
    • 19. Association and Causality?
      An association, however strong, does not necessarily mean causation. Several criteria have been proposed to assess the role of an associated marker in causation. Some of those are as follows:
      1. Biological plausibility
      2. Strength of association (this is not measured by the P value)
      3. Dose response (are heterozygotes intermediate between the two homozygotes, or is homozygosity showing a stronger association than just having the marker?)
      4. Time sequence (this is inherent in the germ-line nature of HLA genes)
      5. Consistency (next slide lists reasons for inconsistency in HLA association studies)
      6. Specificity of the association to the disease studied
    • 20. Why Are There Inconsistencies? (I)
      1. Mistakes in genotyping (lack of HWE in controls is usually an indication of problems with typing rather than selection, admixture, nonrandom mating or other reasons for departure from HWE)
      2. Poor control selection (would your controls be in the case group if they had the disease, and would the cases be in your control group if they were free of the disease?)
      3. Design problems including the statistical power issue (negative results due to lack of statistical power should be distinguished from truly negative results observed despite having sufficient power)
      4. Publication bias (are there many more studies with negative results but we have never heard about them?)
      5. Disease misclassification or misclassification bias
    • 21. Why Are There Inconsistencies? (II)
      6. Excessive type I errors (are the positive results due to using P < 0.05 as the threshold for statistical significance?)
      7. Posthoc and subgroup analysis (are positive results due to fishing (data dredging)?)
      8. Unjustified multiple comparisons and subsequent type II error
      9. Failure to consider the mode of inheritance in a genetic disease
      10. Failure to account for the LD structure of the gene (only haplotype-tagging markers will show the association, other markers within the same gene may fail to show an association and generate background noise)
      11. Likelihood that the gene studied accounts for a small proportion of the variability in risk
    • 22. Further information
      Select ‘Biostatistics’ or ‘Epidemiology’ at http://www.dorak.info or write to me at dorakmt :at: lycos.com [please do not add to your address book as it will change periodically]
    • 23. I am grateful to the BSHI Organizing Committee for giving me the opportunity to run this workshop at BSHI 2002 in Glasgow. I also particularly thank Nancy Henderson and Ian Galbraith for their hospitality.
      BSHI AGM, 5:15 pm. All members should attend.
