The document discusses the benefits of clinical trial authors submitting supplemental materials and making raw trial data publicly available, such as enabling other researchers to verify results, test secondary hypotheses, and aid the design of future trials, while also outlining some arguments against data sharing and proposing a code of conduct for data sharing. It concludes by suggesting medical journals require data availability for publication to help address issues around researchers restricting access to trial data.
1. Clinical trials, data sharing and
supplemental materials
Andrew J. Vickers
Department of Epidemiology and Biostatistics
Memorial Sloan-Kettering Cancer Center
1.30 – 2pm. SSP meeting
2. Should authors of clinical trials
be able to submit
supplemental materials?
6. Overview of talk
• Experiences trying to obtain raw data
• Advantages of making raw data
available
• Arguments against data sharing
• A code of conduct for use of raw data
• A final thought: will journals step up to
the plate?
7. Typical experiences trying to
obtain data from medical trials
• Needed data from the control arm of a
trial to help design a study
• NIH researcher, NIH funded trial
• “I am not prepared to release the data at
this point”
8. Anecdote 2
• Conducting a meta-analysis
• Needed proportions from a published
trial that reported means and SDs
• “I would love to provide you with these
data but my biostatistician won’t allow it”
9. Anecdote 3
• Wanted data from a large cancer trial to
illustrate a novel statistical technique
\Delta NB = \sum_{i=1}^{n} x_i - p_t - \sum_{i=1}^{n} x_{i,k=1} - p_t \sum_{k=0}^{1} x_k
• Investigators were suspicious
10. I implore you, oh great king, pity
me, poor, little worm that I am
• We promised:
– The data would only be used for a statistical
methodology study
– We would expressly state in the paper that
no clinical conclusions should be drawn
– We would slightly corrupt the data
– We would send a draft to the investigators
and they would have veto power
12. How I have done it ….
• Show biomed paper
• Link to excel file
• Etc.
18. Notice that ….
• File is not large (21 KB)
• File needs no editing
19. Why share data?
• Analyses can be reproduced and checked by others
• Acts as an additional incentive for checking that a data
set is clean and accurate
• May help prevent fraud and selective reporting
• Allows testing of secondary hypotheses
• Aids design of future trials
• Simplifies data acquisition for meta-analysis
• Teaching
• Aids development and evaluation of novel statistical
methods
25. Conclusions
• We compared SELDI proteomic spectra … from
three experiments … on … ovarian cancer.
• These spectra are available on the web
at http://clinicalproteomics.steem.com
• The results were not reproducible across
experiments.
27. Key point
• Publication of raw data is routine for:
– Protein chemistry
– Genomic research
– Astronomy
• But not clinical trials
29. Acts as an additional incentive for
checking that a data set is clean
and accurate
• My house is neater when I know
someone is coming to visit
• Biomarker study:
– Gross errors found in clinical trial data set
31. Allows testing of secondary
hypotheses
• CARET study: do vitamins help prevent
lung cancer?
• Peter Bach at MSKCC: what is the
association between amount of cigarette
smoking and lung cancer?
– Predictive model
– Used to evaluate CT screening for lung
cancer
32. Aids design of future trials
• Numerous decisions on trial design
should be based on data
– How many patients do we need?
– When should we measure patients?
– What is the best way to measure outcome?
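As an illustration of "How many patients do we need?", here is a minimal sketch of a per-arm sample size calculation for comparing two means, using the standard normal-approximation formula. The function name, the default alpha and power, and the example numbers are assumptions for illustration only and do not come from the talk. The point is that the standard deviation has to come from somewhere, and raw data from the control arm of an earlier trial is a natural source.

```python
from math import ceil
from statistics import NormalDist, stdev


def patients_per_arm(sd, difference, alpha=0.05, power=0.90):
    """Approximate patients needed per arm to detect `difference` between two means.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) * sd / difference)^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / difference) ** 2)


# Hypothetical outcome scores from the control arm of an earlier trial.
control_arm_scores = [42.0, 55.5, 38.0, 61.0, 47.5, 50.0, 44.0, 58.5]
sd_from_prior_data = stdev(control_arm_scores)

# Patients per arm needed to detect a 5-point difference with 90% power.
n_per_arm = patients_per_arm(sd_from_prior_data, difference=5.0)
```

Without access to earlier raw data, the standard deviation has to be guessed or read off a published summary, and an error in that one number changes the required sample size quadratically.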
33. Simplifies data acquisition for
meta-analysis
• A single study rarely tells you much
• Combine data from several studies to
get the big picture: “meta-analysis”
• Can be difficult to combine data if
results are presented in different ways
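The talk does not name a pooling method, so the following is only a sketch of one common approach, fixed-effect inverse-variance pooling. The function name and the example numbers are hypothetical. It assumes every study reports the same effect measure with a standard error, which is exactly what breaks down when results are presented in different ways (for example, means and SDs in one paper and proportions in another).

```python
from math import sqrt


def pool_fixed_effect(estimates, standard_errors):
    """Combine study estimates using inverse-variance (fixed-effect) weights.

    Returns the pooled estimate and its standard error.
    """
    weights = [1 / se ** 2 for se in standard_errors]  # precise studies count more
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    return pooled, pooled_se


# Hypothetical treatment effects (difference in means) from three trials.
effects = [2.0, 3.5, 1.0]
ses = [1.0, 2.0, 1.5]
pooled_effect, pooled_se = pool_fixed_effect(effects, ses)
```

With raw data available, the analyst can compute the same effect measure for every trial instead of asking each investigator for a different summary.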
34. Teaching
• Best way to teach swimming is to put
your children in the water
• Best way to teach statistics is to have
students analyze real data sets
35. Development of novel statistical
methods
\Delta NB = \sum_{i=1}^{n} x_i - p_t - \sum_{i=1}^{n} x_{i,k=1} - p_t \sum_{k=0}^{1} x_k
36. Arguments against data sharing
• Cost and trouble of putting data set
together
• Doesn’t this have to be done anyway?
37. Arguments against data sharing 2
• It might violate patient privacy
• Changing names to codes and dates to
lengths of time is hardly rocket science
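The two steps named above, names to codes and dates to lengths of time, can be sketched in a few lines. This is a minimal illustration and not a complete de-identification procedure. The field names (`name`, `arm`, `randomization_date`, `event_date`) and the sample records are hypothetical.

```python
from datetime import date


def deidentify(records):
    """Replace names with study codes and calendar dates with days since randomization."""
    cleaned = []
    for number, record in enumerate(records, start=1):
        elapsed = record["event_date"] - record["randomization_date"]
        cleaned.append({
            "patient_id": f"P{number:04d}",  # code replaces the patient's name
            "arm": record["arm"],
            "days_to_event": elapsed.days,   # length of time replaces both dates
        })
    return cleaned


# Hypothetical trial records containing identifying information.
raw_records = [
    {"name": "Jane Doe", "arm": "control",
     "randomization_date": date(2005, 1, 10), "event_date": date(2005, 3, 1)},
    {"name": "John Roe", "arm": "treatment",
     "randomization_date": date(2005, 2, 1), "event_date": date(2005, 2, 15)},
]
shareable = deidentify(raw_records)
```

The output keeps only the code, the trial arm, and the elapsed time, so neither names nor calendar dates appear in the shared file.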
38. Arguments against data sharing 3
• Other researchers might conduct invalid
analyses
• A decision for the scientific community
as a whole
39. Arguments against data sharing 4
• Researchers have a right to exploit data
that they may have spent years
collecting
40. A code of conduct for sharing
data from clinical trials
1. Independent investigators planning to publish a new
analysis should contact the trialists before
undertaking any analyses
2. One or more trialists should be offered co-authorship
or the opportunity to write a commentary
published alongside the new analysis
3. Journals should not publish new analyses of
previously published data unless a trialist is a
co-author or writes a separate commentary
4. Published new analyses should cite the original trial
41. Code of conduct for trialists
1. Data set must be clean, well annotated, de-identified
2. Data for the main analyses should be published immediately
3. No need to share data for planned secondary analyses
4. All raw data should be made available no later than five years
after first publication of trial results
5. There is no need to update data
6. Trialists must share data if analyses are not to be
published
42. To the Editor:
“Cancer Data? Sorry, Can’t Have It” (Essay, Jan.
22): Andrew Vickers hints at a truth known to
most physicians with any connection to medical
research; the pursuit of academic and scientific
prestige is often as important as the potential
benefit to patients. There’s a simple fix. If the top
dozen or so medical journals refused to
consider publication of any research results
without a pledge from the authors to make the
raw data available for follow-up analysis, the
problem would disappear.
David R. Bacon, M.D.
43. Will journals step up to the plate?
• Should the results of human
experimentation become personal
property of the researchers?
• Publication:
– Ethical approval
– Disclosure of conflict of interest
– Data made publicly available