
Are Most Positive Findings False? Confirmatory Bias in the Evaluation of Psychological Interventions


I was tired of this 2007 presentation being plagiarized, so I am making it available. The time stamp on the file on my hard drive is 3.20.2007. An old CV I retrieved indicates that I gave a talk with this title at Catholic University of America and at the University of Groningen in 2007. I have recycled some of the slides since, and slides 48-50 have proved quite popular, with some people using them in publications without appropriate attribution.
Regardless, you should be amazed at how prescient this presentation now seems, over a decade later, and at how little things have changed.



  1. Are Most Positive Findings False? Confirmatory Bias in the Evaluation of Psychological Interventions. James C. Coyne, Ph.D. jcoyne@mail.med.upenn.edu
  2. Confirmatory Bias • Consistent Bias in the Availability and Interpretation of Data so that Intervention Appears More Effective than it is • Publication Bias • Investigator Allegiance • Investigators' Bias in Design of Trials, Selection, Analysis, Interpretation, and Subsequent Discussion of Data
  3. RCTs are Not Necessary to Resolve All Questions.
  4. In the Late 1980s the Quality of the Evidence Available in Medical Journals was Subject to Considerable Criticism. Strong findings in small trials did not replicate in subsequent large studies. Results of meta-analyses did not predict outcome of large trials. Details of trials required to form an independent opinion of a trial were not being provided in journal articles. Trials funded by industry consistently supported the superiority of the sponsor's products.
  5. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 273(5): 408-412, 1995. Compared with trials in which authors reported adequately concealed treatment allocation, trials in which concealment was either inadequate or unclear yielded larger estimates of treatment effects (P<.001). Odds ratios were exaggerated by 41% for inadequately concealed trials and by 30% for unclearly concealed trials. Trials in which participants had been excluded after randomization did not yield larger estimates of effects, but that lack of association may be due to incomplete reporting. Trials that were not double-blind also yielded larger estimates of effects (P=.01), with odds ratios being exaggerated by 17%.
  6. Chan, A.W., et al. (2004). Empirical evidence for selective reporting of outcomes in randomized trials: Comparison of protocols to published articles. JAMA, 291, 2457-2465. One hundred two trials with 122 published journal articles and 3736 outcomes were identified. Overall, 50% of outcomes per trial were incompletely reported. Statistically significant outcomes had a higher odds of being fully reported compared to nonsignificant outcomes (pooled odds ratio, 2.4; 95% confidence interval [CI], 1.4-4.0). Eighty-six percent of survey responders (42/49) denied the existence of unreported outcomes despite clear evidence to the contrary.
  7. Strategies of Data Analysis That Ensured Positive Findings Came under Sustained Criticism. Assmann, S. F., Pocock, S. J., Enos, L. E., & Kasten, L. E. (2000). Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet, 355, 1064-1069. Freemantle, N. (2001). Interpreting the results of secondary end points and subgroup analyses in clinical trials: should we lock the crazy aunt in the attic? BMJ, 322, 989. Yusuf, S., Wittes, J., Probstfield, J., & Tyroler, H. A. (1991). Analysis and interpretation of treatment effects in subgroups of patients in randomized clinical trials. JAMA, 266, 93-98.
  8. Reforms Were Instituted and Enforced (Even if Inconsistently). Requirement of Declaration of Conflict of Interest. Adherence to CONSORT required for submitting a paper for publication. Researchers now required to provide a detailed description of their study protocols, including specification of the 1-2 primary endpoints and any subgroup analyses, to journals where they intended to publish findings before they actually conducted their studies.
  9. CONSORT: Consolidated Standards of Reporting Trials. www.consort-statement.org. A list of requirements for uniform reporting of clinical trials with the overall aim of improving the reporting of Randomized Controlled Trials, to facilitate their critical appraisal, and to facilitate their inclusion in systematic reviews.
  10. CONSORT Checklist. 22-item checklist and an accompanying flow diagram of participants' progression through a trial, from approach for consent to completion of follow-up assessments. Initial intent was to provide minimal standards for reporting, not conducting, trials. Anticipated that, in addition, CONSORT would guide investigators in designing and implementing scientifically sound trials.
  11. The "Great Debate" (2005). "Resolved: Psychosocial Interventions for Cancer Patients are Ineffective and Unacceptable to Patients."
  12. The Literature was Worse Than it First Looked
  13. Shifting Views of Efficacy. Positive (1995): Meyer & Mark ('95), Fawzy et al ('95), Devine & Westlake ('95). Mixed (1996-2002): Helgeson & Cohen ('96), Sheard & Maguire ('99), Rehse & Pukrop ('03). Inconclusive (2002-2004): Newell et al ('02), Edwards et al ('04), Gysels et al ('04).
  14. What is Required for a Demonstration of Efficacy? Stopped Accruing Patients at a Pre-Set Sample Size and Without Making a Decision Based on Peeking at Data. Specified a Single Endpoint Ahead of Time that Would Determine Outcome of Trial. Analyses Based on All Patients Who Were Randomized. Obtained a Treatment x Time Interaction Effect.
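Slide 14's first requirement, a pre-set sample size, is usually fixed by an a priori power analysis. Below is a minimal sketch of that calculation for a two-arm comparison of means; the effect size, alpha, and power values are illustrative assumptions, not figures from the presentation.

```python
# A priori sample size per arm for a two-arm trial, via the normal approximation.
# Inputs are illustrative; they are not taken from any trial discussed in the slides.
from scipy.stats import norm

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per arm to detect standardized effect d with a two-sided test."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return 2 * (z_alpha + z_beta) ** 2 / d ** 2

print(round(n_per_group(0.5)))  # about 63 per arm for a medium-sized effect
```

Accruing patients until this target is reached, and not a patient longer or shorter based on interim looks, is what keeps the nominal alpha level honest.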
  15. The Literature was Worse Than It First Looked • Cannot accept positive appraisals of a particular study or the literature at face value. • Endemic confirmatory bias. • Myth that combinations of similarly flawed studies can yield an informative contribution to the literature: blend them together, you get tainted scrapple, not pate.
  16. Vickers, A.J. Analysis of variance is easily misapplied in the analysis of randomized trials: a critique and discussion of alternative statistical approaches. Psychosom Med, 67(4): 652-655, 2005. We are not concerned with whether scores will change from baseline (it seems likely that they would) or whether overall anxiety scores, including pretreatment score, differ between groups (at baseline, they should be similar because of randomization). What we are interested in, and why we conducted the randomized trial, is whether the change over time is different between groups. This is technically known as the "group by treatment interaction."
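As a companion to the Vickers quote, here is a minimal sketch of estimating the group-by-time interaction on simulated pre/post data; the data, variable names, and effect size are invented for illustration. An ANCOVA on follow-up scores adjusting for baseline, a commonly recommended alternative, is shown at the end.

```python
# Simulated pre/post outcomes for a two-arm trial, with a built-in treatment effect
# on change over time (all numbers are illustrative).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 100
group = np.repeat(["control", "treatment"], n)
pre = rng.normal(50, 10, 2 * n)
post = pre - 2 - 3 * (group == "treatment") + rng.normal(0, 8, 2 * n)
wide = pd.DataFrame({"id": np.arange(2 * n), "group": group, "pre": pre, "post": post})

# Group x time interaction on long-format data: does change over time differ by arm?
# (A mixed model would additionally account for the repeated measures per patient.)
long_df = wide.melt(id_vars=["id", "group"], value_vars=["pre", "post"],
                    var_name="time", value_name="score")
interaction = smf.ols("score ~ C(group) * C(time)", data=long_df).fit()
print(interaction.pvalues.filter(like=":"))  # p-value for the interaction term

# Alternative: ANCOVA on follow-up scores adjusting for baseline.
ancova = smf.ols("post ~ pre + C(group)", data=wide).fit()
print(ancova.params["C(group)[T.treatment]"])  # estimate near the built-in effect of -3
```

The point of the slide is that a main effect of time, or a between-group difference that ignores baseline, answers neither of these questions.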
  17. What was Going on? • Most studies recruited samples of cancer patients without regard to level of distress. • Low mean levels of distress resulted in an inability to demonstrate interventions significantly reduced distress. • Strategies for ending trials and organizing, analyzing, reporting and interpreting data hid and perpetuated the myth that interventions were being shown to be effective.
  18. Rescuing a Null Trial With A Newly Invented Outcome: Benefit Finding* • A priori primary endpoint was distress. • Primary analysis yielded null results, but subsequent reports have reported a secondary analysis in which there was an effect for a subgroup of patients. • Subsequent reports give main emphasis to benefit finding as an endpoint. • Intervention not designed to affect benefit finding, no theoretical reason for assuming an effect. • Benefit finding has unknown clinical significance. *Antoni et al, Health Psychology 2001
  19. What are the Endemic Problems in the Design, Conduct, and Reporting of Trials?
  20. Are We Done Yet? Check the Data Again and See if We Have a Finding to Report • A priori power analysis the occasional exception rather than the rule. • Operative Rule: Peek and stop when results are looking good. • Must beware of modest sized trials claiming strong effects--likely to be false positives. • Must beware of studies with odd numbers of patients accumulated without a power analysis.
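To make the "peek and stop" rule concrete, the sketch below simulates a trial with no true effect that is tested after every 10 patients per arm and stopped at the first p < .05; the look schedule and sample sizes are illustrative assumptions.

```python
# Type I error inflation from optional stopping ("peeking") when the null is true.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)

def peeking_trial(max_n=100, step=10, alpha=0.05):
    a = rng.normal(size=max_n)   # control arm, no true difference
    b = rng.normal(size=max_n)   # intervention arm, no true difference
    for n in range(step, max_n + 1, step):
        if ttest_ind(a[:n], b[:n]).pvalue < alpha:
            return True          # stop early and declare a "finding"
    return False

sims = 5000
rate = sum(peeking_trial() for _ in range(sims)) / sims
print(f"False positive rate with repeated peeking: {rate:.2f}")  # well above the nominal 0.05
```

The odd, unplanned-looking sample sizes the slide warns about are often the fingerprint of exactly this stopping rule.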
  21. Should We Get Excited About Unexpected Strong Findings With a Small Sample?
  22. Perils of Unexpected Results in Small Trials • Threat of spurious findings not a matter of low power. • Vulnerability to uncontrolled group differences, even when there has been no obvious breakdown in randomization procedures. • Finding with a low prior probability likely to represent a false positive.
  23. Perils of Unexpected Results in Small Trials • "…in a RCT, the balance of pretreatment characteristics is merely one test of the adequacy of randomization and not proof that influential imbalances do not exist. Also, because such tabulations are invariably marginal summaries only (ie, the totals for each factor are considered separately), they provide essentially no insight into the joint distribution of prognostic factors in the two treatment groups. It is simple to envision situations in which the marginal imbalances of prognostic factors are minimal, but the joint distributions are different and influential" (Piantadosi, 1990).
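A small constructed example of Piantadosi's point: two hypothetical arms in which two prognostic factors are perfectly balanced in the marginal baseline table yet have completely different joint distributions. All counts are invented for illustration.

```python
# Marginal balance does not imply joint balance of prognostic factors.
# Hypothetical trial with 40 patients per arm (illustrative only).
import pandas as pd

arm1 = pd.DataFrame({"A": [1] * 20 + [0] * 20, "B": [1] * 20 + [0] * 20})  # A and B always co-occur
arm2 = pd.DataFrame({"A": [1] * 20 + [0] * 20, "B": [0] * 20 + [1] * 20})  # A and B never co-occur

for name, arm in [("arm 1", arm1), ("arm 2", arm2)]:
    print(name, "marginal A:", arm["A"].mean(), "marginal B:", arm["B"].mean())
    print(pd.crosstab(arm["A"], arm["B"]))  # the joint distributions differ completely

# If prognosis depends on carrying both risk factors (A=1 and B=1), arm 1 contains
# 20 such patients and arm 2 contains none, despite identical baseline tables.
```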
  24. "When moderate benefits or negligibly small benefits are both more plausible than extreme benefits, then a p = .001 effect in a large trial or overview would provide much stronger evidence than the same significance level in a small trial, a small overview, or a small subgroup analysis." Collins et al, Lancet, 1995.
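The prior-plausibility argument, and the question in the title of this deck, can be made quantitative: the probability that a "significant" finding reflects a real effect depends on the prior probability of the hypothesis and the study's power, not only on the significance threshold. The prior, power, and alpha values below are illustrative assumptions.

```python
# Positive predictive value of a significant result:
# P(effect is real | p < alpha), given the prior probability and the study's power.
def ppv(prior, power, alpha=0.05):
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)

# A long-shot hypothesis tested in a small, underpowered trial:
print(round(ppv(prior=0.10, power=0.30), 2))  # 0.40: most such "positive findings" are false
# A plausible hypothesis tested in a well-powered trial:
print(round(ppv(prior=0.50, power=0.80), 2))  # 0.94
```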
  25. What is wrong with exploring multiple outcomes?
  26. Austin PC, Mamdani MM, et al. Testing multiple statistical hypotheses resulted in spurious associations: A study of astrological signs and health. J Clin Epidem 59(9): 964-969, 2006. We sought statistically significant associations between astrological signs and health that would be neither reproducible nor biologically plausible. We searched 223 of the most common diagnoses for hospitalization until we identified two for which subjects born under one astrological sign had a significantly higher probability of hospitalization. Residents born under Leo had a higher probability of gastrointestinal hemorrhage (P = 0.0447), while Sagittarians had a higher probability of humerus fracture (P = 0.0123) compared to all other signs combined.
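The mechanism behind the astrology result is easy to reproduce by simulation: screen enough pure-noise comparisons and some will cross p < .05 by chance alone. The numbers below are illustrative, with 223 chosen only to echo the number of diagnoses screened in the study.

```python
# Screen many pure-noise outcomes for a group difference; some come out "significant".
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)
n_outcomes, n_per_group = 223, 100

group_a = rng.normal(size=(n_outcomes, n_per_group))
group_b = rng.normal(size=(n_outcomes, n_per_group))  # no true differences anywhere
pvals = ttest_ind(group_a, group_b, axis=1).pvalue

print((pvals < 0.05).sum(), "of", n_outcomes, "null comparisons reach p < .05")
# Expect roughly 0.05 * 223, i.e. about 11 spurious "findings".
```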
  27. What is wrong with unplanned subgroup analyses?
  28. Schulz KF, Grimes DA. Epidemiology 4 - Multiplicity in randomised trials I: endpoints and treatments. Lancet 365(9470): 1591-1595, 2005. Thousands of potential comparisons can emanate from one trial. Investigators might only report the significant comparisons, an unscientific practice if unwitting, and fraudulent if intentional. Researchers must report all the endpoints analysed and treatments compared. Some researchers torture their data until they speak. They examine additional endpoints, manipulate group comparisons, do many subgroup analyses, and undertake repeated interim analyses. Difficulties usually manifest at the analysis phase because investigators add unplanned analyses.
  29. Just What Is Wrong With Post Hoc Subgroup Analyses? • High profile papers in the behavioral medicine literature routinely emphasize positive subgroup analyses in the face of negative primary analyses (Classen et al, 2001; Schneiderman et al., 2004). • In the broader clinical trials literature, this practice is uniformly seen as inappropriate (Yusuf et al., 1991). • Unplanned subgroup analyses frequently yield spurious results (Assman et al., 2000; Senn & Harrel, 1979)--"only in exceptional circumstances should they affect the conclusions drawn from the trial" (Brooks et al., 2004, p 229).
  30. Just What Is Wrong With Post Hoc Subgroup Analyses? • High profile papers in the behavioral medicine literature routinely emphasize subgroup analyses when they are positive in the face of negative primary analyses (Classen et al, 2001; Schneiderman et al., 2004). • In the broader clinical trials literature, this practice is uniformly criticized as inappropriate (Yusuf et al., 1991). • Unplanned subgroup analyses frequently yield spurious results (Assman et al., 2000; Senn & Harrel, 1979), and "only in exceptional circumstances should they affect the conclusions drawn from the trial" (Brooks et al., 2004, p 229).
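A minimal simulation of why unplanned subgroup analyses mislead: in a trial with no true treatment effect, testing the effect separately within many baseline subgroups will often turn up at least one "significant" subgroup. The sample size and the number of subgrouping variables below are illustrative assumptions.

```python
# In a null trial, fishing across baseline subgroups often yields a spurious "responder" subgroup.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(3)

def null_trial_has_significant_subgroup(n=200, n_subgroup_vars=10, alpha=0.05):
    treated = rng.integers(0, 2, n).astype(bool)
    outcome = rng.normal(size=n)                                # no true treatment effect
    subgroups = rng.integers(0, 2, size=(n, n_subgroup_vars)).astype(bool)  # e.g. sex, age>65, ...
    for j in range(n_subgroup_vars):
        for level in (True, False):
            mask = subgroups[:, j] == level
            if ttest_ind(outcome[mask & treated], outcome[mask & ~treated]).pvalue < alpha:
                return True
    return False

sims = 1000
rate = sum(null_trial_has_significant_subgroup() for _ in range(sims)) / sims
print(f"Null trials with at least one 'significant' subgroup: {rate:.2f}")  # far above 0.05
```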
  31. Telling It Like It Ain't: All the Results That Fit • Primary endpoint typically needs to be inferred, not stated. • Ignore negative results for presumed endpoints: Emphasize any positive effect, ignore larger number of null findings. • Favor secondary and subgroup analyses and endpoints developed post hoc over negative findings for presumed analyses. • Discuss negative findings as if positive in subsequent publications. • Accommodate existing literature "as is" rather than qualifying interpretation with reference to methodological shortcomings.
  32. The Norm: Lack of Intent to Treat Analyses • Data from patients who do not complete trial or all measurements are discarded. • "As treated" analyses ignore informative missing data. • Intervention and control patients have different reasons for not providing data and this introduces bias in the available data. • "As treated" data do not generalize back to patients entering a trial.
  33. Intent to Treat Analysis • Highly appropriate, one of the basic criteria by which the adequacy of the reporting of randomized clinical trials is evaluated, including with CONSORT. • Intent to treat analyses most accurately address the question of how effective the intervention would be if it were offered outside the clinical trial. • Intent to treat analyses preserve the baseline equivalence of groups that was presumably achieved by randomization; and these analyses help to ensure that bias is not introduced by selective retention of patients. • Particularly important when retention of patients is affected by loss of patients related to the outcome under study.
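A minimal sketch of the contrast between a completers ("as treated") analysis and intent to treat, using simulated data in which intervention patients who are doing poorly tend to drop out; all values are invented for illustration.

```python
# Why discarding dropouts ("as treated"/completers analysis) biases a null trial,
# while analyzing everyone randomized (intent to treat) does not. Simulated data.
import numpy as np

rng = np.random.default_rng(4)
n = 5000
improvement = rng.normal(0, 1, size=(2, n))   # rows: control, intervention; no true effect

# Intervention patients who are doing poorly are more likely to drop out before follow-up.
dropped = np.zeros((2, n), dtype=bool)
dropped[1] = rng.random(n) < np.where(improvement[1] < 0, 0.6, 0.1)

# Completers analysis: discard dropouts' data.
completers_diff = improvement[1][~dropped[1]].mean() - improvement[0][~dropped[0]].mean()
# ITT analysis: every randomized patient counts (in practice this requires chasing
# follow-up data or modeling missingness; here the simulation lets us see all outcomes).
itt_diff = improvement[1].mean() - improvement[0].mean()

print(f"Completers-only difference: {completers_diff:.2f}")  # spurious benefit, roughly 0.3
print(f"Intent-to-treat difference: {itt_diff:.2f}")          # near 0.0, the true effect
```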
  34. Cook, J. M., Palmer, S., Hoffman, K., & Coyne, J. C. Evaluation of clinical trials appearing in Journal of Consulting and Clinical Psychology: CONSORT and beyond. The Scientific Review of Mental Health Practice (in press).
  35. Reporting of RCTs in JCCP 1992 and 2002: Before CONSORT and Beyond. Deficiencies were noted in features empirically related to confirmatory bias: randomization, blinding, and reporting of intent to treat analyses, with most articles meeting none of these requirements. No articles specified primary and secondary endpoints.
  36. Reporting of RCTs in JCCP 1992 and 2002: Before CONSORT and Beyond. Significant improvement in reporting from 1992 to 2002, but substantial gap remained between RCTs published in 2002 and full compliance with CONSORT. Compliance with CONSORT will require education and enforcement of standards and will yield a literature that is discontinuous with the existing literature in terms of quality of reporting.
  37. Can We Bury the Idea that Psychotherapy Prolongs the Survival of Cancer Patients?
  38. Positive Appraisals of Literature • Spiegel and Giese-Davis (2004): "5 of 10 randomized trials demonstrate an effect of psychosocial intervention on survival time" • Sephton and Spiegel (2003): "If nothing else, these studies challenge us to systematically examine the interaction of mind and body, to determine the aspects of therapeutic intervention that are most effective and the populations that are most likely to benefit."
  39. Three of the "positive trials" can be eliminated because in each case, patients in the intervention got substantially better medical surveillance and care. Two of the investigator groups for these trials deny that they were even studying psychotherapy!
  40. No Clinical Trial that was Explicitly Designed to Test Whether Psychotherapy Improves Survival of Cancer Patients, Three at the Time of Spiegel's Claims, now Five, Has Shown a Positive Effect.
  41. No study that was designed to test whether psychotherapy improved survival, and in which the intervention group did not get better medical care, has demonstrated an effect. Claims that psychotherapy promotes survival depend on the Spiegel and Fawzy studies, which have serious limitations.
  42. Spiegel D, Bloom JR, Kraemer HC, Gottheil E (1989): Effect of treatment on the survival of patients with metastatic breast cancer. Lancet 2: 888-891. Cited Over 900 Times
  43. Fawzy, F.I., Canada, A.L., & Fawzy, N.W. (2003). Malignant melanoma: effects of a brief, structured psychiatric intervention on survival and recurrence at 10-year follow-up. Arch Gen Psychiat, 60, 100-103.* Fawzy FI, Fawzy NW, Hyun CS, Elashoff R, Guthrie D, Fahey JL, Morton DL (1993): Malignant melanoma. Effects of an early structured psychiatric intervention, coping, and affective state on recurrence and survival 6 years later. Arch Gen Psychiat, 50, 681-689. *Cited 448 Times
  44. Taking on the Cochrane Collaboration
  45. Coyne, JC. Cochrane reviews v industry supported meta-analyses: We should read all reviews with caution. BMJ, 333: 916, 2006. Cochrane meta-analysis concluded couples therapy was not better than individual therapy for depression. Offering of couples therapy should be a matter of "patient preference and availability of specific resources." Yet, the studies reviewed were all seriously flawed. None had close to minimal cell size necessary for inclusion in a meta-analysis, much less for a nonequivalence trial. This premature conclusion serves to discourage the commitment of scarce resources to having marital therapists available or to research providing an adequate comparison between the two forms of therapy.
  46. Jorgensen, A. W., Gotzsche, P. C., Hilden, J. Authors' reply on Cochrane reviews v industry supported meta-analyses. BMJ 333: 1072-1073, 2006. We agree with Tostad and Coyne that some Cochrane reviews are not of good quality... We urge readers who find problems with Cochrane reviews to submit a comment to be published as part of the review. This is very easy to do. Use "Add/View Feedback" in the index to the left of each review.
  47. How to Ensure a Publishable Positive Clinical Trial. Have lots of outcome variables, and particularly alternative measures of the same outcomes. Make sure patients and RAs rating outcomes know the treatment in which you are invested. Adjust randomization as needed and let RAs know what next treatment assignment will be.
  48. How to Ensure a Publishable Positive Clinical Trial. If you do not have significant effects, keep accruing patients. Examine personal characteristics of patients and throw away results for patients who were unlikely to benefit from treatment. Examine all outcomes and report those that are significant. Don't report treatment x time interactions if they are not significant.
  49. How to Ensure a Publishable Positive Clinical Trial. Don't report treatment x time interactions if they are not significant. Examine results for all possible subgroups and report only subgroup analyses for which there are significant effects. Do not report that there were outcome measures or subgroups for which the results were examined but not found to be significant. In discussion and abstract, emphasize the outcomes and subgroup analyses that were most positive.
  50. Rumors of the Efficacy of Psychological Interventions are Premature and Greatly Exaggerated.
  51. We must not allow a shared commitment to improving the wellbeing of patients to be exploited with exaggerated claims and poorly conceived, poorly conducted, and poorly reported clinical trials.
  52. Gunpowder forever sealed the knight's fate. No knight was a match for a fired bullet. Soon knights would be useless because of the projectile that could easily knock a knight off his horse, rendering him helpless.
  53. Critical Questions to Ask • Was the Sample Size Set by Power Analysis? • Is a Primary Outcome Identified? • Are Analyses Intent to Treat? • Are There Subgroup Analyses or Cherrypicking of Multiple Outcomes? • Is There a Treatment x Time Interaction? • Do the Abstract and Discussion Section Fairly Reflect the Results that were Obtained?
