Lecture 10 Sample Size

Published in: Business, Technology

  1. DESIGN OF CLINICAL TRIALS, EPIDEMIOLOGY 2181: “SAMPLE SIZE DETERMINATION AND STATISTICAL POWER IN CLINICAL TRIALS”. Lecture 4, October 30, 2008. SHERYL F. KELSEY, PhD, Department of Epidemiology. (Slide footer: S.F.Kelsey/class2181/lecture 4-sample size)
  2. QUIZ: How many subjects? Assume 90% power, α = 0.05 two-sided. For each question, does the trial need (x) more subjects with A, (y) more with B, or (z) the same?
     1. Mortality: A = 20% vs 10%; B = 40% vs 30%
     2. Mortality: A = 20% vs 10%; B = 20% vs 15%
     3. Diastolic BP: A = 80 vs 85 mmHg (St Dev 10); B = 90 vs 95 mmHg (St Dev 10)
     4. Diastolic BP: A = 80 vs 85 mmHg (St Dev 10); B = 80 vs 85 mmHg (St Dev 8)
  3. ANSWERS
     1. More with B: the variance of the binomial is bigger near 50% and smaller near 0% and 100%.
     2. More with B: a small difference needs more subjects.
     3. The same: only the standard deviation matters.
     4. More with A: a bigger standard deviation needs more subjects.
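The quiz answers can be checked numerically. The sketch below is illustrative only: it assumes the usual normal-approximation formula for two means, n = 2σ²(Z_α + Z_β)²/Δ² per group, and an arcsine-transformation formula for two proportions. The function names `n_means` and `n_props` are hypothetical helpers, not part of the lecture.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist

z = NormalDist().inv_cdf
# Multiplier (Z_alpha + Z_beta)^2 for alpha = 0.05 two-sided, power = 90%.
MULT = (z(0.975) + z(0.90)) ** 2


def n_means(mean_1, mean_2, sd):
    """Per-group n for two means (assumed normal-approximation formula)."""
    return 2 * sd ** 2 * MULT / (mean_1 - mean_2) ** 2


def n_props(p1, p2):
    """Per-group n for two proportions (assumed arcsine-transformation formula)."""
    effect = asin(sqrt(p1)) - asin(sqrt(p2))
    return MULT / (2 * effect ** 2)


# Each entry is (n per group under A, n per group under B).
quiz = {
    1: (n_props(0.20, 0.10), n_props(0.40, 0.30)),
    2: (n_props(0.20, 0.10), n_props(0.20, 0.15)),
    3: (n_means(80, 85, sd=10), n_means(90, 95, sd=10)),
    4: (n_means(80, 85, sd=10), n_means(80, 85, sd=8)),
}

for question, (n_a, n_b) in quiz.items():
    print(f"Q{question}: A needs {ceil(n_a)} per group, B needs {ceil(n_b)} per group")
```

Under these assumptions the ordering of A and B in each question agrees with the answers on the slide; the exact counts depend on which formula is used for proportions.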
  4. SHALL WE COUNT THE LIVING OR THE DEAD?

     | Rates compared | Absolute difference | Relative difference | Slide label |
     |---|---|---|---|
     | 40% vs 20% | 20% | 50% “reduction” in mortality | lower mortality |
     | 20% vs 10% | 10% | 50% “reduction” in mortality | lower mortality |
     | 10% vs 5% | 5% | 50% “reduction” in mortality | lower mortality |
     | 60% vs 80% | 20% | 33% “improvement” in survival | higher mortality |
     | 80% vs 90% | 10% | 12.5% “improvement” in survival | higher mortality |
     | 90% vs 95% | 5% | 5.6% “improvement” in survival | higher mortality |
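The arithmetic behind the absolute and relative columns is easy to reproduce. This is a minimal sketch that assumes the relative difference is taken relative to the first-listed rate, which is the convention that reproduces 50%, 33%, and 5.6%; by the same convention 80% vs 90% works out to 12.5%. The helper name `differences` is hypothetical.

```python
def differences(rate_1, rate_2):
    """Return (absolute, relative) difference, relative to the first-listed rate."""
    absolute = abs(rate_1 - rate_2)
    relative = absolute / rate_1
    return absolute, relative


pairs = [(0.40, 0.20), (0.20, 0.10), (0.10, 0.05),   # mortality
         (0.60, 0.80), (0.80, 0.90), (0.90, 0.95)]   # survival

for rate_1, rate_2 in pairs:
    absolute, relative = differences(rate_1, rate_2)
    print(f"{rate_1:.0%} vs {rate_2:.0%}: absolute {absolute:.0%}, relative {relative:.1%}")
```

The same 20-point absolute difference reads as a 50% change when counting the dead and a 33% change when counting the living, which is the point of the slide.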
  5. Even more confusing with continuous variables. Blood pressure (St Dev 10):
     - 80 vs 85 mmHg: 5.9% “reduction”
     - 90 vs 95 mmHg: 5.3% “reduction”
  6. Wording:
     - Intervention A has “increased” survival
     - No: Intervention A has longer, better, greater survival
     - “Increase” should be used for changes over time
  7. PERCENTS
     - Absolute difference
     - Relative difference
     - Percents as continuous measures
     - Proceed with caution
  8. Legal null hypothesis: innocent until proven guilty.

     | Decision of judge/jury | Truth: innocent | Truth: guilty |
     |---|---|---|
     | Innocent | ok | guilty goes free: type II error (β) |
     | Guilty | hang the innocent: type I error (α) | ok |

     Scientific null hypothesis: no difference in response between treatment groups.

     | Observed data | Truth: treatments the same | Truth: treatments different |
     |---|---|---|
     | Same | ok | miss good treatment: type II error (β) |
     | Different | promote worthless Tx: type I error (α) | ok |
  9. FUNDAMENTAL POINT: Clinical trials should have sufficient statistical power to detect differences between groups considered to be of clinical interest. Therefore, calculation of sample size with provision for adequate levels of significance and power is an essential part of planning.
  10. THE RAW INGREDIENTS
      - What is your question, precisely?
      - What is your outcome, precisely?
      - Who will be measured?
      - Type 1 and type 2 error rates
      - Variability
  11. PRIMARY COMPARISONS
      - Dichotomous response variables: the event rates in the intervention group (P_i) and the control group (P_c) are compared.
      - Continuous response variables: the true but unknown mean value in the intervention group is compared with the true but unknown mean value in the control group.
      - Survival data: a hazard rate is often compared for the two study groups, or at least is used for sample size estimation.
  12. SAMPLE SIZE ISSUES
      - Choice of outcome: primary endpoint
      - Change from baseline
        - only if correlation > .5
        - when in doubt, don't
      - The difference ____________ to detect
        - you want
        - you believe is clinically meaningful
        - you believe is biologically credible
        - you can afford to
  13. ASSESSMENTS OF EVENT RATE IN THE CONTROL AND INTERVENTION GROUPS
      - The estimate for the control group event rate is usually obtained from a previous study of similar people. A good database is desirable.
      - The investigator must choose the difference in event rate based on preliminary evidence of the potential effectiveness of the intervention, or be willing to specify some minimum difference.
      - Calculating several sample sizes over a range of estimates helps one assess how sensitive the sample size is to these estimates.
  14. To plan with continuous endpoints:
      - Δ: clinical difference worth detecting
      - 1 − β: power, the probability of obtaining a significant result if Δ is the true difference
      - α: significance level; must specify a one- or two-tailed test
      - (Z_α + Z_β)²: multiplier, which depends on the level of significance α and the power 1 − β
      - n: sample size for each of two groups
      - σ: standard deviation (for continuous measures)

      Sample size: $n = \dfrac{2\sigma^2 (Z_\alpha + Z_\beta)^2}{\Delta^2}$

      With a little algebra (Z_α = 1.96 for α = .05, two-sided):
      - solve for power: $Z_\beta = \dfrac{\Delta}{\sigma}\sqrt{\dfrac{n}{2}} - Z_\alpha$
      - solve for difference: $\Delta = \sigma (Z_\alpha + Z_\beta)\sqrt{\dfrac{2}{n}}$
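The three uses of the continuous-endpoint calculation (solve for n, for power, or for the detectable difference) can be sketched in a few lines. The sketch assumes the standard normal-approximation relation n = 2σ²(Z_α + Z_β)²/Δ² per group, which is consistent with the ingredients listed on this slide; the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

norm = NormalDist()


def z_alpha(alpha, two_sided=True):
    """Critical value for the chosen significance level."""
    return norm.inv_cdf(1 - alpha / 2 if two_sided else 1 - alpha)


def sample_size(delta, sigma, alpha=0.05, power=0.90, two_sided=True):
    """Per-group n needed to detect a true difference delta."""
    multiplier = (z_alpha(alpha, two_sided) + norm.inv_cdf(power)) ** 2
    return 2 * sigma ** 2 * multiplier / delta ** 2


def power_for(n, delta, sigma, alpha=0.05, two_sided=True):
    """Power achieved with n per group (solve the relation for Z_beta)."""
    z_beta = abs(delta) / sigma * sqrt(n / 2) - z_alpha(alpha, two_sided)
    return norm.cdf(z_beta)


def detectable_difference(n, sigma, alpha=0.05, power=0.90, two_sided=True):
    """Smallest difference detectable with n per group (solve for delta)."""
    return sigma * (z_alpha(alpha, two_sided) + norm.inv_cdf(power)) * sqrt(2 / n)


# Quiz question 3: 5 mmHg difference, standard deviation 10.
n = sample_size(delta=5, sigma=10)
print(f"n per group: {n:.1f}")
print(f"power at that n: {power_for(n, delta=5, sigma=10):.2f}")
print(f"detectable difference at that n: {detectable_difference(n, sigma=10):.2f}")
```

The round trip is a useful sanity check: plugging the computed n back in should return the power and difference that were requested.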
  15. For two proportions P_1 vs P_2, Δ = P_1 − P_2. With a little algebra: [formulas not preserved in this transcript]. Z_α = 1.64 for α = .05, one-sided; Z_α = 1.96 for α = .05, two-sided.
  16. TABLE: (Z_α + Z_β)² needed to determine the size of each sample

      | Desired power P | Z_β | Two-tailed 0.01 | Two-tailed 0.05 | Two-tailed 0.10 | One-tailed 0.01 | One-tailed 0.05 | One-tailed 0.10 |
      |---|---|---|---|---|---|---|---|
      | (Z_α two-tailed, Z_2α one-tailed) | | 2.576 | 1.96 | 1.645 | 2.32 | 1.645 | 1.28 |
      | 0.80 | 0.84 | 11.7 | 7.9 | 6.2 | 10.0 | 6.2 | 4.5 |
      | 0.90 | 1.28 | 14.9 | 10.5 | 8.6 | 13.0 | 8.6 | 6.6 |
      | 0.95 | 1.645 | 17.8 | 13.0 | 10.8 | 15.8 | 10.8 | 8.6 |

      Two groups of unequal size: calculate the harmonic mean, $n = \dfrac{2 n_1 n_2}{n_1 + n_2}$. This n is what is needed for 2 groups of equal size. Note that equal-sized groups are the most efficient; that is, the harmonic mean is less than the arithmetic mean.

      References:
      - Snedecor and Cochran. Statistical Methods, 7th Edition, 1980, pp 102-1- 5, 120, 130.
      - Fleiss, JL. Statistical Methods for Rates and Proportions, 1981, Chapter 3 & Tables.
      - Schlesselman, JJ. Case Control Studies, 1981, Chapter 6 & Tables.
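The multiplier table and the harmonic-mean adjustment can be regenerated directly. This sketch uses exact normal quantiles from Python's standard library, so a few entries may differ from the slide in the last digit because the slide works from rounded Z values. The function names `multiplier` and `harmonic_mean` are hypothetical.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf


def multiplier(alpha, power, two_sided=True):
    """(Z_alpha + Z_beta)^2 for the given significance level and power."""
    z_a = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    return (z_a + z(power)) ** 2


def harmonic_mean(n1, n2):
    """Equal-group-size equivalent of two unequal groups."""
    return 2 * n1 * n2 / (n1 + n2)


for power in (0.80, 0.90, 0.95):
    row = [round(multiplier(alpha, power, two_sided), 1)
           for two_sided in (True, False)
           for alpha in (0.01, 0.05, 0.10)]
    print(power, row)

# 100 and 300 patients carry the information of two groups of 150, not 200.
print(harmonic_mean(100, 300))
```

The last line illustrates why unequal allocation is less efficient: 400 patients split 100/300 are worth only as much as 300 patients split evenly.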
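  17. Example: Compare .10 vs .05, α = .05 one-sided, power 80%.

      $n = \dfrac{(Z_\alpha + Z_\beta)^2}{2\left(\arcsin\sqrt{P_1} - \arcsin\sqrt{P_2}\right)^2} = \dfrac{6.2}{2(.0963)^2} \approx 334$ per group

      So total study: 334 x 2 = 668.

      .10 vs .05 with 200 patients in each group:

      $Z_\beta = \sqrt{2n}\,\left|\arcsin\sqrt{P_1} - \arcsin\sqrt{P_2}\right| - Z_\alpha = \sqrt{400}\,|.0963| - 1.64 = .286$, so power = 61%.

      - with 100 patients: Z_β = −.28, 39% power
      - with 50 patients: Z_β = −.68, 25% power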
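The worked example can be reproduced in code. The sketch below assumes the arcsine-transformation formula n = (Z_α + Z_β)² / (2(arcsin√P_1 − arcsin√P_2)²), which matches the .0963 effect size, the 334 per group, and the power figures quoted on the slide. The function names are hypothetical.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist

norm = NormalDist()


def arcsine_effect(p1, p2):
    """Absolute difference of the arcsine-square-root transformed proportions."""
    return abs(asin(sqrt(p1)) - asin(sqrt(p2)))


def n_per_group(p1, p2, alpha, power):
    """Per-group n for a one-sided test at level alpha."""
    multiplier = (norm.inv_cdf(1 - alpha) + norm.inv_cdf(power)) ** 2
    return multiplier / (2 * arcsine_effect(p1, p2) ** 2)


def power_with(n, p1, p2, alpha):
    """Power of a one-sided test with n patients per group."""
    z_beta = sqrt(2 * n) * arcsine_effect(p1, p2) - norm.inv_cdf(1 - alpha)
    return norm.cdf(z_beta)


n = ceil(n_per_group(0.10, 0.05, alpha=0.05, power=0.80))
print(f"n per group: {n}, total: {2 * n}")
for patients in (200, 100, 50):
    print(f"{patients} per group: power {power_with(patients, 0.10, 0.05, 0.05):.0%}")
```

Note that Z_β turns negative once power drops below 50%, which is why the 100- and 50-patient scenarios fall to 39% and 25%.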
  18. FURTHER SAMPLE SIZE CONSIDERATIONS
      - outcome is survival time
      - discount for noncompliance/dropout projections
      - subgroups
      - more than 2 treatment groups
      - unequal group sizes
        - less efficient
        - can get more information on new treatment
      - computer packages (PASS, SAS, MINITAB)
      - ASSUMPTIONS ABOUT EVENT RATES PARAMOUNT: sensitivity analysis
  19. ONE-SIDED VERSUS TWO-SIDED TESTS
      - I. Drug A: side effects/expensive. Drug B: no side effects/cheap. Possible results: A more efficacious; A & B the same; B more efficacious.
      - II. X: nutrition intervention strategy, group sessions. Y: nutrition intervention strategy, individual program. Possible results: X reduces sodium intake more; X & Y the same; Y reduces sodium intake more.
  20. SAMPLE SIZE FOR TESTING “EQUIVALENCY” OF INTERVENTIONS
      - The problem in designing positive control studies is that there is no statistical method to demonstrate complete equivalence.
      - Computing a sample size assuming no difference results in an infinite sample size.
      - One approach is to specify a value for the difference in response such that interventions with differences smaller than this might be considered equally effective or equivalent.
  21. T = Innovative Therapy, S = Standard Therapy
      - “Superiority”: H_0: death rate T = death rate S; H_alt: death rate T < death rate S
      - Equivalence: H_0: death rate T ≥ death rate S + Δ; H_alt: death rate T < death rate S + Δ

      In general, equivalence studies require more patients.
  22. Patients: acute MI. Treatment: double bolus vs accelerated alteplase. Outcome: 30-day mortality.
      - COBALT: equivalence; death rate within 0.4%
      - GUSTO III: superiority; double bolus reduces mortality by 20%

      WARE AND ANTMAN EDITORIAL
  23. MORTALITY RESULTS

      | | COBALT | GUSTO III |
      |---|---|---|
      | N | 7169 | 15059 |
      | Double bolus | 7.98% | 7.47% |
      | Accelerated | 7.53% | 7.24% |
      | Difference | 0.45% | 0.23% |
      | 95% CI (approx.) | (-.85%, 1.66%) | (-.66%, 1.10%) |
      | Action | reject equivalence | accept null; not significantly different from zero |
  24. DESIGN OF CLINICAL TRIALS, EPIDEMIOLOGY 2181: RANDOMIZATION IN CLINICAL TRIALS. SHERYL F. KELSEY, PhD
  25. WHY RANDOMIZE?
      - Best way to assure comparability
      - In the long run, balances factors both known and unknown
      - Statistical hypothesis test is based on random assignment
      - Selection is impartial: “dice not trying to prove a point”; must convince others of the validity of the comparison
  26. RANDOMIZATION
      - FIXED ALLOCATION: assigns with pre-specified probability (not necessarily, though usually, equal)
      - ADAPTIVE: changes probabilities during study
      - Baseline adaptive:
        - on basis of number per group
        - on basis of variables
      - Responsive adaptive:
        - depends on prior outcome
        - assumes:
          - rapid response
          - stable population source
  27. STEPS IN THE RANDOMIZATION OF A PATIENT
      1. Check eligibility
      2. Informed consent
      3. Formal identification
      4. RANDOMIZE
      5. Confirmation of patient entry
  28. HOW RANDOM TREATMENT ASSIGNMENTS ARE MADE
      - Model: slips in a hat or flipping a coin
      - Masked drugs numbered and given in order: pharmacy, drug manufacturer
      - Envelopes
      - Telephone to central unit
      - Microcomputer at the site
      - Central computer: internet access
  29. HOW TO DO THE SCHEME
      - Simple randomization
      - Biased coin, urn models
        - Example: start with 2 balls, one black and one white; draw, replace, and add one of the opposite color
        - Prevents imbalance with high probability early on
      - Random permuted block
        - Balance at the end of each block
        - Could predict with an unmasked trial
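The urn example can be simulated in a few lines. This is a minimal sketch of the scheme as described on the slide (one ball per arm to start, draw with replacement, then add a ball of the opposite color); the arm labels "A" and "B" and the function name `urn_randomization` are assumptions made for illustration.

```python
import random


def urn_randomization(n_patients, rng=None):
    """Assign patients by an urn scheme that favors the lagging arm."""
    rng = rng or random.Random()
    balls = {"A": 1, "B": 1}          # start with one ball of each color
    assignments = []
    for _ in range(n_patients):
        total = balls["A"] + balls["B"]
        # Draw a ball; the drawn color is the assignment (ball is replaced).
        arm = "A" if rng.random() < balls["A"] / total else "B"
        assignments.append(arm)
        # Add one ball of the opposite color, nudging the next draw toward balance.
        other = "B" if arm == "A" else "A"
        balls[other] += 1
    return assignments


schedule = urn_randomization(20, random.Random(2181))
print("".join(schedule), schedule.count("A"), schedule.count("B"))
```

After the first assignment the urn holds two balls of the other color and one of the assigned color, so the second patient goes to the other arm with probability 2/3. The correction is strongest early and fades as the urn grows, which is the behavior the slide describes.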
  30. BLOCKS OF SIZE 4
      1. 1100
      2. 1010
      3. 1001
      4. 0110
      5. 0101
      6. 0011
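The six blocks are all the ways to place two 1s in four positions, so they can be enumerated mechanically. A small sketch, with the hypothetical helper `balanced_blocks`, treating 1 and 0 as the two treatments:

```python
from itertools import combinations


def balanced_blocks(size):
    """List every block of the given even size with equal numbers of 1s and 0s."""
    blocks = []
    for ones in combinations(range(size), size // 2):
        blocks.append("".join("1" if i in ones else "0" for i in range(size)))
    return blocks


for number, block in enumerate(balanced_blocks(4), start=1):
    print(f"{number}) {block}")
```

For a block of size 4 this yields the same six blocks in the same order as the slide. Randomization then consists of picking one of these blocks at random for each successive group of four patients.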
  31. HOW TO USE BLOCKS WHEN TREATMENT IS NOT MASKED
      - Choose the block sizes at random, too
      - Example: 2 treatments, equal allocation
        - Block sizes 4, 6, and 8
        - Balance in each block
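Random permuted blocks with randomly chosen block sizes can be sketched as follows. The block sizes 4, 6, and 8 come from the slide; the arm labels and the function name `permuted_block_schedule` are hypothetical, and the sketch returns whole blocks so that balance holds at the end of the list.

```python
import random


def permuted_block_schedule(n_patients, block_sizes=(4, 6, 8), rng=None):
    """Build an assignment list from balanced blocks of randomly chosen size."""
    rng = rng or random.Random()
    schedule = []
    while len(schedule) < n_patients:
        size = rng.choice(block_sizes)              # block size is random too
        block = ["A"] * (size // 2) + ["B"] * (size // 2)
        rng.shuffle(block)                          # random order within the block
        schedule.extend(block)
    return schedule                                 # may run past n_patients


schedule = permuted_block_schedule(30, rng=random.Random(4))
print("".join(schedule))
print(schedule.count("A"), schedule.count("B"))
```

Because an observer does not know where each block ends, the last assignment in a block is much harder to predict than with a fixed block size.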
  32. SHOULD YOU STRATIFY?
      - Clinical sites: generally yes
      - Prognostic variables: generally no
      - Size
      - Practical considerations
      - Often governed by custom rather than statistical justification
      - Stratified ANALYSIS is generally preferable
  33. MINIMIZATION
      - Advantages:
        - Balances several prognostic factors
        - Balances marginal treatment totals
        - Good for small trials (<100 patients)
        - Computer makes this fairly easy
      - Disadvantages:
        - Can't prepare the treatment assignment scheme in advance
        - Need an up-to-date record
        - Not really random: could predict, but can introduce a random element by using, say, 3/4, 1/4
  34. TABLE 5.7. TREATMENT ASSIGNMENTS BY THE FOUR PATIENT FACTORS FOR 80 PATIENTS IN AN ADVANCED BREAST CANCER TRIAL

      | Factor | Level | No. on A | No. on B | Next patient |
      |---|---|---|---|---|
      | Performance status | Ambulatory | 30 | 31 | ← |
      | | Non-ambulatory | 10 | 9 | |
      | Age | <50 | 18 | 17 | ← |
      | | ≥50 | 22 | 23 | |
      | Disease-free interval | <2 years | 31 | 32 | |
      | | ≥2 years | 9 | 8 | ← |
      | Dominant metastatic lesion | Visceral | 19 | 21 | ← |
      | | Osseous | 8 | 7 | |
      | | Soft tissue | 13 | 12 | |

      Thus, for A this sum = 30 + 18 + 9 + 19 = 76, while for B this sum = 31 + 17 + 8 + 21 = 77.

      Pocock S. Clinical Trials: A Practical Approach. John Wiley & Sons, Chichester, England, 1991, p. 85.
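The sums on this slide are the core of minimization: add up, over the next patient's factor levels, how many patients each arm already has, and favor the arm with the smaller total. The sketch below reproduces 76 and 77 from the table. The next patient's levels are inferred from the terms in those sums, and `choose_arm` with its 3/4 default is a hypothetical illustration of the random element mentioned under MINIMIZATION, not a prescribed rule.

```python
import random

# Patients already assigned, per factor level: (number on A, number on B).
counts = {
    "performance status": {"ambulatory": (30, 31), "non-ambulatory": (10, 9)},
    "age": {"<50": (18, 17), ">=50": (22, 23)},
    "disease-free interval": {"<2 years": (31, 32), ">=2 years": (9, 8)},
    "dominant metastatic lesion": {
        "visceral": (19, 21), "osseous": (8, 7), "soft tissue": (13, 12)},
}

# Levels of the next patient, inferred from the terms in the slide's sums.
next_patient = {
    "performance status": "ambulatory",
    "age": "<50",
    "disease-free interval": ">=2 years",
    "dominant metastatic lesion": "visceral",
}


def marginal_sums(counts, patient):
    """Sum the existing A and B counts over the patient's factor levels."""
    sum_a = sum(counts[factor][level][0] for factor, level in patient.items())
    sum_b = sum(counts[factor][level][1] for factor, level in patient.items())
    return sum_a, sum_b


def choose_arm(sum_a, sum_b, p_favor=0.75, rng=None):
    """Favor the arm with the smaller sum with probability p_favor."""
    rng = rng or random.Random()
    if sum_a == sum_b:
        return rng.choice(["A", "B"])
    favored, other = ("A", "B") if sum_a < sum_b else ("B", "A")
    return favored if rng.random() < p_favor else other


sum_a, sum_b = marginal_sums(counts, next_patient)
print(sum_a, sum_b, choose_arm(sum_a, sum_b))
```

Setting `p_favor=1.0` gives deterministic minimization, which always picks the arm with the smaller sum (A here); values below 1 keep the assignment from being fully predictable.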
  35. INTERNAL VALIDITY: compare treatments. EXTERNAL VALIDITY / GENERALIZABILITY: extrapolate to other patients.
      - It is not realistic to find a random sample of patients for recruitment (at the very least, they have to consent).
      - It is more important to establish the efficacy of a treatment before deciding whether it can be broadly applied.
