• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
Sharing Clinical Research Data:  an IOM Workshop

Sharing Clinical Research Data: an IOM Workshop



Briefing book for IOM workshop on Sharing Clinical Research Data, Oct. 4-5, 2012

Briefing book for IOM workshop on Sharing Clinical Research Data, Oct. 4-5, 2012



Total Views
Views on SlideShare
Embed Views



0 Embeds 0

No embeds



Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
Post Comment
Edit your comment

    Sharing Clinical Research Data:  an IOM Workshop Sharing Clinical Research Data: an IOM Workshop Document Transcript

    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System DisordersRoundtable on Translating Genomic-Based Research for Health National Cancer Policy Forum Present Sharing Clinical Research Data:An IOM Workshop October 4-5, 2012 National Academy of Sciences 2101 Constitution Avenue, N.W. Washington, D.C. 20418
    • Sharing Clinical Research Data: An Institute of Medicine Workshop TABLE OF CONTENTS October 4 - 5, 2012 Washington, DCITEMS TAB/PAGEAgenda AGENDASharing Clinical Research Data 1 COORDINATING ROUNDTABLE AND FORUMS COORDINATING ROUNDTABLECoordinating Roundtable and Forums AND FORUMS• Forum on Drug Discovery, Development, and Translation 1• Forum on Neuroscience and Nervous System Disorders 3• Roundtable on Translating Genomic-Based Research for Health 5• National Cancer Policy Forum 7 WORKSHOPWorkshop, October 4-5 WORKSHOP• Speaker Biographies 1 BACKGROUND ARTICLES BACKGROUNDBackground Reading for Workshop1 ARTICLESBenefits• Akil, H., M. E. Martone, et al. Challenges and opportunities in 1 mining neuroscience data• Brown, J. S., J. H. Holmes, et al. A practical and preferred approach to 6 multi-institutional evaluations of comparative effectiveness, safety, and quality of care• Krumholz, H. M. Open science and data sharing in clinical research : 13 Basing informed decisions on the totality of the evidenceModels• Califf, R. M., D. A. Zarin, et al. Characteristics of clinical trials registered in ClinicalTrials.gov, 2007-2010 15• Doshi, P., T. Jefferson, et al. The imperative to share clinical study reports: Recommendations from the Tamiflu experience 25• Piwowar, H.A. Who shares? Who doesn’t? Factors associated with openly archiving raw research data 31• Turner, E. H., A. M. Matthews, et. al. Selective publication of 44 antidepressant trials and its influence on apparent efficacy• Wagner, J. A., M. Prince, et al. The Biomarkers Consortium: 53 Practice and pitfalls of open-source precompetitive collaboration• List of Clinical Research Data Sharing Projects 571 Background materials have been gathered throughout the workshop planning committee process to inform the development of the meeting and understanding of key issues. The articles contained in this briefing book are a subset of background materials collected and inclusion of particular articles does not denote IOM endorsement.
    • ITEMS TAB/PAGEStandardization and Governance• CDISC press release. CDISC, C-Path and FDA collaborate to develop data 75 standards to streamline path to new therapies• CDISC press release. Public release of Alzheimer’s clinical trial data by pharmaceutical researchers 77• Davies, K. Get smart: Knowledge management 80• Davies, K. Running tranSMART for the drug development marathon 82• The Economist editorial. Genomic research: Consent 2.0 85• Ghersi, D., et al. Reporting the findings of clinical trials: a discussion paper 86• Hayden, E. C. Informed consent: A broken contract 88• Kolata, G. Genes now tell doctors secrets they can’t utter 91• Savage, C. J. and A. J. Vickers. Empirical study of data sharing by authors publishing in PLoS Journals 96Culture and Policy• Eichler, H.-G., E. Abadie, et al. Open clinical trial data for all? A view from regulators 99• Laine, C., S. N. Goodman, et al. Reproducible research: Moving toward research the public can really trust 101• Law, M. R., Y. Kawasumi, et al. Despite Law, Fewer Than One In Eight Completed Studies Of Drugs And Biologics Are Reported On Time On ClinicalTrials.gov 106• Mansi, B. A., J. Clark, et al. Ten recommendations for closing the credibility gap in reporting industry-sponsored clinical research: A joint journal and pharmaceutical industry perspective 114• NIH. Final NIH statement on sharing research data• Vickers, A. J. Whose data set is it anyway? Sharing raw data from randomized trials 120• Zarin, D. A. and T. Tse. Moving toward transparency of clinical trials 126 TRAVEL INFORMATION TRAVELTravel Information INFORMATION• Logistics Memo 1• Map of National Academies Building 5
    • AGENDA
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health Sharing Clinical Research Data: An Institute of Medicine Workshop DRAFT AGENDA October 4 & 5, 2012 National Academy of Sciences Building, Room 125 2101 Constitution Avenue, N.W. Washington, DCBackground: Pharmaceutical companies, academic institutions, advocacy organizations, and governmentagencies such as FDA and NIH have large quantities of clinical research data. Increased datasharing could facilitate scientific and public health advances, among other potential benefits topatients and society. Much of this information, however, is never published or shared. Morespecifically, study results are not always published and where results are published, theytypically only include summary-level data; participant-level data is privately held and rarelyshared or revealed publicly. The workshop will explore the benefits of and barriers to the sharing of clinical research dataand will help identify strategies for enhancing sharing both within sectors and across sectors. Tomake the workshop scope manageable, the workshop will focus on data resulting frompreplanned interventional studies of human subjects. While recognizing the importance of otherdata sources such as observational studies and electronic health records, this focus was selectedto encourage concrete problem-solving discussions over the course of a day-and-a-half meeting.Models and projects that involve sharing other types of data will be considered during theworkshop to the extent that these models provide lessons and best practices applicable to sharingpreplanned interventional clinical research data. The workshop is being jointly organized by the Institute of Medicine’s Forum on DrugDiscovery, Development, and Translation; Forum on Neuroscience and Nervous SystemDisorders; National Cancer Policy Forum; and Roundtable on Translating Genomic-BasedResearch for Health. Participants will be invited from industry, academia, government agenciessuch as FDA and NIH, disease advocacy groups, and other stakeholder groups.Meeting Objectives: Examine the benefits of sharing of clinical research data, and specifically clinical trial data, from all sectors and among these sectors, including, for example: o Benefits to the research and development enterprise o Benefits to the analysis of safety and efficacy Identify barriers and challenges to sharing clinical research data Explore strategies to address these barriers and challenges, including identifying priority actions and “low-hanging fruit” opportunities Discuss strategies for using these potentially large data sets to facilitate scientific and public health advances. 1
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health Day One8:30 a.m. Opening Remarks SHARON TERRY, Workshop Chair President and Chief Executive Officer Genetic Alliance SESSION I: BENEFITS OF SHARING CLINICAL RESEARCH DATASession Objectives: Provide an overview of the benefits of sharing clinical research data, specifically clinical trial data, and discuss advantages and disadvantages of sharing participant vs. summary level data from individual trials as well as pooling data across multiple studies. Outline examples of scientific success stories that illustrate what can be accomplished when clinical trial data is shared.8:40 a.m. Background and Session Objectives WILLIAM POTTER, Session Co-Chair Co-Chair Emeritus Neuroscience Steering Committee FNIH Biomarkers Consortium DEBORAH ZARIN, Session Co-Chair Director, ClinicalTrials.gov National Library of Medicine National Institutes of Health8:50 a.m. Fundamentals and Benefits of Sharing Participant-Level Clinical Trial Data ELIZABETH LODER Clinical Epidemiology Editor British Medical Journal9:10 a.m. Pooling Data from Multiple Clinical Trials to Answer Big Questions ROBERT CALIFF Director, Duke Translational Medicine Institute Professor of Medicine Vice Chancellor for Clinical and Translational Research Duke University Medical Center 2
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health9:30 a.m. Panel Discussion: Perspectives on the Benefits of Sharing Clinical Trial Data  Data sharing – what does it mean from your perspective?  Considering the benefits and risks of sharing clinical research data, how extensively should it be shared to maximize new knowledge and ultimately patient benefit? Panelists: HARLAN KRUMHOLZ Harold H. Hines, Jr. Professor of Medicine and Epidemiology and Public Health Yale University School of Medicine MYLES AXTON Editor Nature Genetics JESSE BERLIN Vice President of Epidemiology Janssen Research and Development10:30 a.m. BREAK SESSION II: DATA SHARING MODELS: DESIGN, BEST PRACTICES, AND LESSONS LEARNEDSession Objectives: Present examples, best practices, and lessons learned from projects across the continuum of data sharing opportunities (e.g., rapid publication of participant-level data, increased access to participant-level data for qualified researchers, or maximizing the use of clinical research data that is currently held in centralized locations by requiring sharing or access to subsets of data). Distill best practices and lessons learned that can be applied broadly to new projects to maximize the use of data from individual trials and/or data pooling initiatives.10:45 a.m. Background and Session Objectives JEFFREY NYE, Session Chair Vice President and Head Neuroscience External Innovation Johnson and Johnson Pharmaceutical R&D, LLC10:55 a.m. The Limits of Summary Data Reporting: Lessons from ClinicalTrials.gov 3
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health DEBORAH ZARIN Director, ClinicalTrials.gov National Institutes of Health11:10 a.m. Models that Increase Access and Use of Data from Individual Clinical Trials The DataSphere Project CHARLES HUGH-JONES Vice President, Medical Affairs North America Sanofi Oncology Yale/Medtronic Experience RICHARD KUNTZ Senior Vice President Chief Scientific, Clinical and Regulatory Officer Medtronic, Inc.11:40 a.m. Models that Foster Pooling and Analysis of Data FNIH Biomarkers Consortium Adiponectin Project JOHN WAGNER Vice President, Clinical Pharmacology Merck & Co., Inc. Novel Methods Leading to New Medications in Depression and Schizophrenia (NEWMEDS) Consortium JONATHAN RABINOWITZ Academic Lead, NEWMEDS Bar Ilan University12:10 p.m. Series of Brief Presentations on Overcoming Challenges Facing Clinical Trial Data Sharing Challenge #1: Permissions JENNIFER GEETTER Partner McDermott Will & Emery Challenge #2: Techniques and Methodologies 4
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health JOHN IOANNIDIS [via video conference] C.F. Rehnborg Chair in Disease Prevention Stanford University Challenge #3: Culture KELLY EDWARDS Acting Associate Dean, The Graduate School Associate Professor, Bioethics and Humanities University of Washington12:40 p.m. Discussion among speakers, panelists, and audience Discussant: Sally Okun, Health Data Integrity & Patient Safety, PatientsLikeMe1:00 p.m. LUNCH KEYNOTE CASE STUDY: DISTRIBUTED SYSTEMS FOR CLINICAL RESEARCH INFORMATION SHARING1:30 p.m. RICHARD PLATT Professor and Chair Department of Population Medicine Harvard Pilgrim Health Care Institute and Harvard Medical School1:50 p.m. Discussion with speaker and audience Moderator: Sharon Terry, Genetic Alliance SESSION III: STANDARDIZATION AND GOVERNANCESession Objectives: Receive an update on recent legislative and regulatory language regarding standardization of clinical research data and discuss how stakeholders are designing and implementing data standardization plans in response. Discuss the relative cost-benefit of data conversion of existing trial data versus building an infrastructure to improve data collection and sharing moving forward. Present case studies from data sharing projects using different data standardization and governance models and consider lessons learned or best practices for the future.2:00 p.m. Background and Session Objectives 5
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health FRANK ROCKHOLD, Session Co-Chair Senior Vice President, Global Clinical Safety and Pharmacovigilance GlaxoSmithKline Pharmaceuticals Research and Development LYNN HUDSON, Session Co-Chair Chief Science Officer and Executive Director Coalition Against Major Diseases Critical Path Institute2:10 p.m. PDUFA Update on Data Standards MARY ANN SLACK Office of Planning and Informatics Center for Drug Evaluation and Research U.S. Food and Drug Administration2:25 p.m. Standardization to Facilitate Data Sharing: Opportunities and Limitations CDISC Efforts to Support Clinical Research Data REBECCA KUSH President and Chief Executive Officer Clinical Data Interchange Standards Consortium HL7 Efforts to Support Clinical Care Data CHARLES JAFFE Chief Executive Officer Health Level 7 International Health Information Technology Perspective on Clinical Research Data Standards SACHIN JAIN Chief Medical Information and Innovation Officer Merck & Co., Inc.3:10 p.m. Discussion with speakers and audience3:30 p.m. BREAK3:45 p.m. Cost-Benefit Analysis of Retrospective vs. Prospective Data Standardization 6
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health VICKI SEYFERT-MARGOLIS Senior Advisor, Science Innovation and Policy Office of the Chief Scientist U.S. Food and Drug Administration4:00 p.m. Case Studies: Standardization and Governance Models in Data Sharing Critical Path Institute and Coalition Against Major Diseases (CAMD) Alzheimer’s Clinical Trial Database CAROLYN COMPTON President and CEO Critical Path Institute Translational Medicine Mart (tranSMART) ERIC PERAKSLIS Chief Information Officer and Chief Scientist, Informatics U.S. Food and Drug Administration4:30 p.m. Panel Discussion Catalog new data sharing challenges not yet discussed and provide suggestions for overcoming these challenges. Given the data standardization and governance models discussed, suggest a framework to guide the development of new data sharing projects based on their purpose (e.g., regulatory approval with FDA, detecting safety signals, testing secondary hypotheses, etc.) Panelists: LAURA LYMAN RODRIGUEZ Director Office of Policy, Communications and Education National Human Genome Research Institute MEREDITH NAHM Associate Director for Clinical Research Informatics Duke Translational Medicine Institute NEIL DE CRESCENZO Senior Vice President and General Manager Oracle Health Sciences 7
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health MICHAEL CANTOR Senior Director Clinical Informatics and Innovation Pfizer, Inc.5:30 p.m. Adjourn day 1 8
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health Day Two8:00 a.m. Opening Remarks SHARON TERRY, Workshop Chair President and Chief Executive Officer Genetic Alliance SESSION IV: INCENTIVIZING POLICY AND CULTURAL SHIFTS TO ENHANCE DATA SHARINGSession Objectives: Receive an update on clinical trial data transparency decisions in Europe. Explore current incentives for and against (i.e., benefits and risks of) data sharing within and across sectors and suggest mechanisms to encourage stakeholders to engage in a culture of data sharing. Identify existing and potential strategies, including technology-based approaches, for protecting patient privacy and confidentiality while facilitating data sharing.8:10 a.m. Background and Session Objectives ROBERT HARRINGTON, Session Chair Arthur L. Bloomfield Professor of Medicine Chair, Department of Medicine Stanford University8:20 a.m. Clinical Trial Data Transparency: European Medicines Agency Perspective HANS-GEORG EICHLER Senior Medical Officer European Medicines Agency8:40 a.m. Clinical Research Data Sharing Practices and Attitudes ANDREW VICKERS Attending Research Methodologist Department of Epidemiology and Biostatistics Memorial Sloan-Kettering Cancer Center 9
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health8:55 a.m. Overview of Data Sharing Policies: Research Funders and Publishers STEVEN GOODMAN Associate Dean for Clinical and Translational Research Professor of Medicine & Health Research and Policy Stanford University School of Medicine9:10 a.m. Series of Presentations: Incentives for Data Sharing Within and Across Sectors Academic Perspectives PETER DOSHI Post-Doctoral Fellow Johns Hopkins University School of Medicine BETH KOZEL Instructor of Pediatrics Division of Genetics and Genomic Medicine St. Louis Children’s Hospital and Washington University Pharmaceutical Company Perspective [Speaker TBA] Federal Research Funder Perspective JOSEPHINE BRIGGS Director, National Center for Complementary and Alternative Medicine Director, National Center for Advancing Translation Sciences, Division of Clinical Innovation National Institutes of Health10:10 a.m. Discussion with speakers and audience10:30 a.m. BREAK10:45 a.m. Facilitating Patient Ownership of Clinical Trial Data: Technical Challenges and Opportunities JOHN WILBANKS Director Sage Bionetworks 10
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health DEVEN MCGRAW Director, Health Privacy Project Center for Democracy and Technology11:15 a.m. Discussion with speakers and audience SESSION V: NEXT STEPS AND FUTURE DIRECTIONSSession Objectives: Discuss key themes from the workshop. Based on workshop presentations and discussions, identify potential next steps and priority actions for data sharing stakeholders to take action. Highlight potential opportunities and challenges that are currently on the horizon but may become more salient as technology evolves and/or data sharing becomes more pervasive.11:30 a.m. Background and Session Objectives SHARON TERRY, Workshop Chair President and Chief Executive Officer Genetic Alliance11:40 a.m. Session Chair Reports [5 minutes per Session] WILLIAM POTTER, Session I Co-Chair Co-Chair Emeritus Neuroscience Steering Committee FNIH Biomarkers Consortium DEBORAH ZARIN, Session I Co-Chair Director, ClinicalTrials.gov National Library of Medicine National Institutes of Health JEFFREY NYE, Session II Chair Vice President and Head Neuroscience External Innovation Johnson and Johnson Pharmaceutical R&D, LLC FRANK ROCKHOLD, Session III Co-Chair Senior Vice President Global Clinical Safety and Pharmacovigilance GlaxoSmithKline Pharmaceuticals Research and Development 11
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health LYNN HUDSON, Session III Co-Chair Chief Science Officer and Executive Director Coalition Against Major Diseases Critical Path Institute ROBERT HARRINGTON, Session IV Chair Arthur L. Bloomfield Professor of Medicine Chair, Department of Medicine Stanford University12:00 p.m. Closing Discussion with Session Chairs, Panelists, and Audience Led by Workshop Chair JOSEPHINE BRIGGS Director, National Center for Complementary and Alternative Medicine Director, National Center for Advancing Translation Sciences, Division of Clinical Innovation National Institutes of Health MICHAEL ROSENBLATT Executive Vice President and Chief Medical Officer Merck & Co., Inc. JAY “MARTY” TENENBAUM Founder and Chairman Cancer Commons JANET WOODCOCK Director, Center for Drug Evaluation and Research U.S. Food and Drug Administration12:45 p.m. Adjourn 12
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders National Cancer Policy Forum Roundtable on Translating Genomic-Based Research for Health SHARING CLINICAL RESEARCH DATA: A WORKSHOP October 4-5, 2012 – Washington, D.C. Background Information Statement of TaskPharmaceutical companies, academics, and govern- The Institute of Medicine will conduct a public work-ment agencies such as the Food and Drug Administra- shop that will focus on strategies to facilitate sharingtion and the National Institutes of Health have large of clinical research data. Participants will be invitedquantities of clinical research data. Data sharing within from industry, academia, government agencies such aseach sector and across sectors could facilitate scientific FDA and NIH, disease advocacy groups, and otherand public health advances and could enhance analysis stakeholder groups. The workshop will feature invitedof safety and efficacy. Much of this information, how- presentations and discussions that will:ever, is never published. This workshop will explore Examine the benefits of sharing of clinical re-barriers to sharing of clinical research data, specifical- search data from all sectors and among these sec-ly clinical trial data, and strategies for enhancing shar- tors, including, for example:ing within sectors and among sectors to facilitate o Benefits to the research and development en-research and development of effective, safe, and need- terpriseed products. o Benefits to the analysis of safety and efficacyA number of efforts are currently underway to enhance Identify barriers and challenges to sharing clinicalpublic/private partnerships and foster collaboration research datawithin the pharmaceutical sector to advance research Explore strategies to address these barriers andand enhance the discovery and development of drugs, challenges, including identifying priority actionsdevices, and diagnostic tools. Examples include the and “low-hanging fruit” opportunitiesAnalgesic Clinical Trials Innovation, Opportunities, Discuss strategies for using these potentially largeand Networks (ACTION) Initiative; the Foundation data sets to facilitate scientific and public healthfor the National Institute of Health’s Biomarkers Con- advances.sortium; the Innovative Medicines Initiative (Europe);Critical Path Institute; Sage Bionetworks’ Clinical Tri- Planning Committeeal Comparator Arm Partnership; and the Life Sciences Sharon Terry, chair, Genetic AllianceConsortium’s MetaPharm project. Planned within the Josephine P. Briggs, National Institutes of Healthcontext of these various efforts, this workshop will Timothy Coetzee, National Multiple Sclerosis Societyfocus specifically on data sharing, which is one com- Steven Goodman, Stanford Universityponent of larger efforts to build partnerships and en- Robert A. Harrington, Stanford Universityhance collaboration within and among sectors on Lynn Hudson, Critical Path Instituteresearch, development, and assessment of pharmaceu- Charles Hugh-Jones, Sanofi Oncologytical products. Jan Johannessen, Food and Drug AdministrationThis project will be a coordinated effort of the IOMs Jeffrey S. Nye, Johnson and JohnsonForum on Drug Discovery, Development and Transla- Richard Platt, Harvard Medical Schooltion; Forum on Neuroscience and Nervous System William Z. Potter, FNIH Biomarkers ConsortiumDisorders; National Cancer Policy Forum; and Frank W. Rockhold, GlaxoSmithKlineRoundtable on Translating Genomic-Based Research Michael Rosenblatt, Merckfor Health. All four Forums/Roundtables have done Vicki Seyfert-Margolis, Food and Drug Administrationprevious work on this topic and this issue cuts across Deborah A. Zarin, National Institutes of Healththe focus of each activity.The IOM was chartered in 1970 as a component of the National Academy of Sciences to enlist distinguished members of the appropriateprofessions in the examination of policy matters pertaining to the health of the public. In this, the Institute acts under both the Academy’s1863 congressional charter responsibility to be an adviser to the federal government and its own initiative in identifying issues of medicalcare, research, and education. For additional information about this workshop, contact Rebecca English at renglish@nas.edu.
    • Board on Health Sciences Policy Forum on Drug Discovery, Development, and Translation FORUM ON DRUG DISCOVERY, DEVELOPMENT, AND TRANSLATION The Institute of Medicine’s Forum on Drug Transforming Research and Fostering CollaborativeDiscovery, Development, and Translation was created in Research2005 by the IOM’s Board on Health Sciences Policy toprovide a unique platform for dialogue and collaboration The Forum has established an initiative to examineamong thought leaders and stakeholders in government, the state of clinical trials in the U.S., identify areas ofacademia, industry, foundations and patient advocacy. strength and weakness in our current clinical trialThe Forum brings together leaders from private sector enterprise, and consider transformative strategies forsponsors of biomedical and clinical research, federal enhancing the ways in which clinical research isagencies sponsoring and regulating biomedical and organized and conducted. Workshops and meetings heldclinical research, the academic community, and in 2009 and 2010 considered case studies in four diseaseconsumers, and in doing so serves to educate the policy areas; and included discussions around issues ofcommunity about issues where science and policy management of conflict of interest, and addressingintersect. regulatory and administrative impediments to the conduct The Forum convenes several times each year to of clinical trials. Meetings in 2011 will address how toidentify and discuss key problems and strategies in the move toward greater public engagement in anddiscovery, development, and translation of drugs. To understanding of the clinical trial enterprise, andsupplement the perspectives and expertise of its members, establishing a framework for a transformed nationalthe Drug Forum also holds public workshops to engage a clinical trial enterprise.wide range of experts, members of the public, and the Developing Drugs for Rare and Neglected Diseases andpolicy community in discussing areas of concern in the Addressing Urgent Global Health Problemsscience and policy of drug development. The Forum’spublic meetings focus substantial public attention on The Forum is sponsoring a series of workshops on thecritical areas of drug development, focusing on the major global problem of MDR TB. The Forum held athemes outlined below. foundational workshop in Washington, DC in 2008, for which it commissioned a paper from Partners In Health.The Approach to Drug Development Additional workshops are being held in the four countries with the highest MDR TB burden—South Africa and Despite exciting scientific advances, the pathway Russia (held 2010), and India and China (anticipatedfrom basic science to new therapeutics faces challenges 2011). Also in 2011, the Forum will convene a focusedon many fronts. New paradigms for discovering and initiative to consider the global drug supply chain fordeveloping drugs are being sought to bridge the ever- quality-assured second-line drugs for tuberculosis.widening gap between scientific discoveries andtranslation of those discoveries into life-changing Promoting Public Understanding of Drugmedications. The Forum has explored these issues from Developmentmany perspectives—emerging technology platforms,regulatory efficiency, intellectual property concerns, the Successful introduction of new therapeutic entitiespotential for precompetitive collaboration, and innovative requires testing in an informed and motivated public. Thebusiness models that address the “valley of death.” Forum has spent concerted effort to understand what limits public participation and how to enhance moreStrengthening the Scientific Basis of Drug Regulation widespread acceptance of the importance of advancing therapeutic development through public participation in Over the past several years, the Forum has focused its the drug development process. Forum meetings held in theattention on the scientific basis for the regulation of drugs. spring and fall of 2010 addressed these issues. The ForumIn February 2010, the Forum held a workshop that plans to continue to work with multiple stakeholders toexamined the state of the science of drug regulation and improve public understanding of and participation in theconsidered approaches for enhancing the scientific basis drug development process.of regulatory decision making. 500 Fifth Street, NW Phone: 202 334-2715 Washington, DC 20001 Fax: 202 334-1329 E-mail: aclaiborne@nas.edu www.iom.edu/drug 1
    • FORUM ON DRUG DISCOVERY, DEVELOPMENT, AND TRANSLATION JEFFREY DRAZEN, (Co-Chair) New England Journal of Medicine STEVEN GALSON (Co-Chair) Amgen Inc.MARGARET ANDERSON PETRA KAUFFMANFasterCures National Institute of Neurological Disorders and StrokeHUGH AUCHINCLOSS JACK KEENENational Institute of Allergy and Infectious Diseases Duke University Medical CenterLESLIE BENET RONALD KRALLUniversity of California, San Francisco University of PennsylvaniaANN BONHAM FREDA LEWIS-HALLAssociation of American Medical Colleges Pfizer Inc.LINDA BRADY MARK MCCLELLANNational Institute of Mental Health Brookings InstitutionROBERT CALIFF CAROL MIMURADuke University Medical Center University of California, BerkeleyTHOMAS CASKEY ELIZABETH (BETSY) MYERSBaylor College of Medicine Doris Duke Charitable FoundationGAIL CASSELL JOHN ORLOFFHarvard Medical School Novartis Pharmaceuticals CorporationPETER CORR AMY PATTERSONCeltic Therapeutics, LLLP NIH Office of the DirectorANDREW DAHLEM MICHAEL ROSENBLATTEli Lilly & Co. Merck & Co., Inc.TAMARA (MARA) DARSOW JANET SHOEMAKERAmerican Diabetes Association American Society for MicrobiologyJIM DOROSHOW ELLEN SIGALNational Cancer Institute Friends of Cancer ResearchGARY FILERMAN ELLIOTT SIGALAtlas Health Foundation Bristol-Myers SquibbGARRET FITZGERALD ELLEN STRAHLMANUniversity of Pennsylvania School of Medicine GlaxoSmithKlineMARK GOLDBERGER NANCY SUNGAbbott Pharmaceuticals Burroughs Wellcome FundHARRY GREENBERG JANET TOBIASStanford University School of Medicine Ikana MediaSTEPHEN GROFT JOANNE WALDSTREICHEROffice of Rare Diseases Research, NIH NCATS Janssen Research & DevelopmentLYNN HUDSON JANET WOODCOCKThe Critical Path Institute FDA Center for Drug Evaluation and ResearchTOM INSEL PROJECT STAFFNational Center for Advancing Translational Sciences Anne Claiborne, J.D., M.P.H., Forum Director Rebecca English, M.P.H., Associate Program OfficerMICHAEL KATZ Rita Guenther, Ph.D., Program OfficerMarch of Dimes Foundation Robin Guyse, Senior Program Assistant Elizabeth Tyson, Research AssociateThe IOM was chartered in 1970 as a component of the National Academy of Sciences to enlist distinguished members of the appropriate professions in theexamination of policy matters pertaining to the health of the public. In this, the Institute acts under both the Academy’s 1863 congressional charterresponsibility to be an adviser to the federal government and its own initiative in identifying issues of medical care, research, and education. 2
    • Board on Health Sciences Policy Forum on Neuroscience and Nervous System Disorders FORUM ON NEUROSCIENCE AND NERVOUS SYSTEM DISORDERS The Institute of Medicine’s Forum on Neuroscience Genetics of Nervous System Disordersand Nervous System Disorders focuses on buildingpartnerships to further understand the brain and nervous Understanding the roles of genes and their productssystem, disorders in their structure and function, as well that control electrical activity, trophic factors,as effective clinical prevention and treatment strategies. transporters, and receptors is critical to understandingThe forum focuses on the six themes outlined below, the brain and nervous system. The role of genetics inand serves to educate the public, press, and policy disease states needs to be more completely understoodmakers regarding these issues. to develop appropriate treatments and interventions. The Forum brings together leaders from private The forum discusses approaches to bettersector sponsors of biomedical and clinical research, understanding environmental and multigenetic factorsfederal agencies sponsoring and regulating biomedical that influence susceptibility, progression, and severityand clinical research, the academic community, and of disease.consumers. The Forum will sponsor workshops for members Cognition and Behaviorand the public to discuss approaches to resolve keychallenges identified by Forum members. It strives to Higher brain functions that give rise to language,enhance understanding of research and clinical issues cognition, memory, and emotion depend on a complexassociated with the nervous system among the scientific and interdependent set of neuronal pathways.community and the general public, and provide a Understanding higher brain functions and theirmechanism to foster partnerships among stakeholders. impairment is critical to understanding effective treatments for disease.Nervous System Disorders Modeling and Imaging The forum promotes partnerships which will leadto a more complete understanding of the agents that Tools that assist in better understanding of the basicdamage cells and trigger cell death, as well as the and higher order functions of the brain aid in researchmechanisms that carry out the process. For example, and understanding of disorders of the brain. This areaprotein aggregation has an important role in focuses on animal and computer models andneurodegeneration and is emerging as a common neuroimaging approaches.mechanism in several nervous system disorders (e.g.Parkinson’s and Alzheimer’s disease). Ethical and Social IssuesMental Illness and Addiction The Forum addresses ethical issues relating to the stigma associated with nervous system disorders, Improved understanding of the nervous system and mental illness, and addiction in society and theits role in mental illness and addictive behavior is engagement of these patient populations in researchcritical to developing new treatments for these efforts. Clinical research is also examined thatconditions. Especially important is improving the elucidates environmental, genetic, social, and culturalunderstanding of the neuronal mechanisms that give risk and protective factors which can guide therise to mental illnesses, increased risk behavior, and development of prevention and early interventionresilience. The Forum discusses mechanisms that strategies.might take advantage of the evolving nature of scienceand the opportunities it brings to investigating the brainand behavior using cross disciplinary approaches thatexpand the understanding of how complex mentalillnesses and addictions arise from individualcomponents. 500 Fifth Street, NW Phone: 202 334-3984 Washington, DC 20001 Fax: 202 334-1329 E-mail:baltevogt@nas.edu www.iom.edu/neuroforum 3
    • INSTITUTE OF MEDICINE Forum on Neuroscience and Nervous System Disorders STEVE HYMAN, (Chair) The Broad InstituteSUSAN AMARA STORY LANDISSociety for Neuroscience National Institute on Neurological Disease and StrokeMARC BARLOW ALAN LESHNERGE Healthcare, Inc. American Association for the Advancement of ScienceMARK BEAR HUSSEINI MANJIMassachusetts Institute of Technology Johnson and Johnson Pharmaceutical Research and Development, Inc.KATJA BROSENeuron DAVID MICHELSON Merck Research LaboratoriesDANIEL BURCHCeNeRx Biopharma RICHARD MOHS Lilly Research LaboratoriesC. THOMAS CASKEYBaylor College of Medicine JONATHAN MORENO University of Pennsylvania School of MedicineTIMOTHY COETZEEFast Forward, LLC ALEXANDER OMMAYA U.S. Department of Veterans AffairsEMMELINE EDWARDSNIH Neuroscience Blueprint ATUL PANDE GlaxoSmithKline, IncMARTHA FARAHUniversity of Pennsylvania STEVEN PAUL Weill Cornell Medical CollegeRICHARD FRANKGE Healthcare, Inc. TODD SHERER Michael J. Fox Foundation for Parkinson’s ResearchDANIEL GESCHWINDUniversity of California Los Angeles PAUL SIEVING National Eye InstituteHANK GREELYStanford University JUDY SIUCIAK Foundation for the National Institutes of HealthMYRON GUTMANNNational Science Foundation MARC TESSIER-LAVIGNE The Rockefeller UniversityRICHARD HODESNational Institute on Aging WILLIAM THIES Alzheimer’s AssociationSTEVEN HYMANThe Broad Institute NORA VOLKOW National Institute on Drug AbuseTHOMAS INSELNational Institute on Mental Health KENNETH WARREN National Institute on Alcohol Abuse and AlcoholismPHILLIP IREDALEPfizer Global Research and Development JOHN WILLIAMS Wellcome TrustDANIEL JAVITTNathan S. Kline Institute for Psychiatric Research STEVIN H. ZORN Lundbeck USAFRANCES JENSENUniversity of Pennsylvania, School of Medicine CHARLES ZORUMSKI Washington University School of Medicine PROJECT STAFF Bruce Altevogt, Ph.D., Forum Director Diana Pankevich, Ph.D., Associate Program Officer Elizabeth Thomas, Senior Project Assistant______________________________________________________________________________________________The IOM was chartered in 1970 as a component of the National Academy of Sciences to enlist distinguished members of the appropriate professionsin the examination of policy matters pertaining to the health of the public. In this, the Institute acts under both the Academy’s 1863 congressionalcharter responsibility to be an adviser to the federal government and its own initiative in identifying issues of medical care, research, and education.For additional information on the Forum on Neuroscience and Nervous System Disorders visit the project’s homepage at www.iom.edu/neuroforum,or call Bruce Altevogt at (202) 334-3984. 4
    • ROUNDTABLE ON TRANSLATING GENOMIC-BASED RESEARCH FOR HEALTHThe sequencing of the human genome is rapidly and span a broad range of issues relevant to theopening new doors to research and progress in translation process.biology, medicine, and health care. At the same time,these developments have produced a diversity of new Issues may include the integration and coordinationissues to be addressed. of genomic information into health care and public health including encompassing standards for geneticThe Institute of Medicine has convened a Roundtable screening and testing, improving informationon Translating Genomic-Based Research for Health technology for use in clinical decision making,that brings together leaders from academia, industry, ensuring access while protecting privacy, and usinggovernment, foundations and associations, and genomic information to reduce health disparities.representatives of patient and consumer interests who The patient and family perspective on the use ofhave a mutual concern and interest in addressing the genomic information for translation includes socialissues surrounding the translation of genome-based and behavioral issues for target populations. Thereresearch for use in maintaining and improving health. are evolving requirements for the health professionalThe mission of the Roundtable is to advance the field community, and the need to be able to understand andof genomics and improve the translation of research responsibly apply genomics to medicine and publicfindings to health care, education, and policy. The health.Roundtable will discuss the translation process,identify challenges at various points in the process, Of increasing importance is the need to identify theand discuss approaches to address those challenges. economic implications of using genome-based research for health. Such issues include incentives,The field of genomics and its translation involves cost-effectiveness, and sustainability.many disciplines, and takes place within differenteconomic, social, and cultural contexts, necessitating Issues related to the developing science base are alsoa need for increased communication and important in the translation process. Such issuesunderstanding across these fields. As a convening could include studies of gene-environmentmechanism for interested parties from diverse interactions, as well as the implications of genomicsperspectives to meet and discuss complex issues of for complex disorders such as addiction, mentalmutual concern in a neutral setting, the Roundtable: illness, and chronic diseases.fosters dialogue across sectors and institutions;illuminates issues, but does not necessarily resolve Roundtable sponsors include federal agencies,them; and fosters collaboration among stakeholders. pharmaceutical companies, medical and scientific associations, foundations, and patient/publicTo achieve its objectives, the Roundtable conducts representatives. For more information about thestructured discussions, workshops, and symposia. Roundtable on Translating Genomic-Based ResearchWorkshop summaries will be published and for Health, please visit our website at www.iom.edu/collaborative efforts among members are encouraged genomicroundtable or contact Adam Berger at 202-(e.g., journal articles). Specific issues and agenda 334-3756, or by e-mail at aberger2@nas.edu.topics are determined by the Roundtable membership, 500 Fifth Street, NW Phone: 202 334 3756 Washington, DC 20001-2721 Fax: 202 334 1329 www.iom.edu/genomicroundtable E-mail: aberger2@nas.edu 5
    • Institute of Medicine Roundtable on Translating Genomic-Based Research for Health Membership Wylie Burke, M.D., Ph.D. (Co-Chair) University of Washington Sharon Terry, M.A. (Co-Chair) Genetic AllianceNaomi Aronson, Ph.D. Debra Leonard, M.D., Ph.D.BlueCross/BlueShield Association College of American PathologistsEuan Ashley, M.R.C.P., D.Phil., FACC, FAHA Elizabeth Mansfield, Ph.D.American Heart Association Food & Drug AdministrationPaul R. Billings, M.D., Ph.D. Garry Neil, M.D.Life Technologies Corporation Johnson & JohnsonBruce Blumberg, M.D. Robert L. Nussbaum, M.D.Kaiser Permanente University of California San Francisco, School of MedicineDenise E. Bonds, M.D., M.P.H.National Heart, Lung, and Blood Institute Michelle Ann Penny, Ph.D. Eli Lilly and CompanyPhilip J. Brooks, Ph.D.Office of Rare Diseases Research Aidan Power, M.B., B.Ch., M.Sc., M.R.C.Psych. Pfizer Inc.C. Thomas Caskey, M.D., FACPBaylor College of Medicine Victoria M. Pratt, Ph.D., FACMG Quest Diagnostics Nichols InstituteSara Copeland, M.D.Health Resources and Services Administration Ronald Przygodzki, M.D. Department of Veterans AffairsVictor Dzau, M.D.Duke University Health System Allen D. Roses, M.D. Duke UniversityW. Gregory Feero, M.D., Ph.D.The Journal of the American Medical Association Kevin A. Schulman, M.D. Duke UniversityAndrew N. Freedman, Ph.D.National Cancer Institute Joan A. Scott, M.S., C.G.C. National Coalition for Health Professional EducationGeoffrey Ginsburg, M.D., Ph.D. in GeneticsDuke University David Veenstra, Pharm.D., Ph.D.Jennifer Hobin, Ph.D. University of WashingtonAmerican Society of Human Genetics Michael S. Watson, Ph.D.Richard Hodes, M.D. American College of Medical Genetics and GenomicsNational Institute on Aging Daniel Wattendorf, M.D. (Lt. Col)Sharon Kardia, Ph.D. Department of the Air ForceUniversity of Michigan, School of Public Health Catherine A. Wicklund, M.S., C.G.C.Mohamed Khan, M.D., Ph.D. National Society of Genetic CounselorsAmerican Medical Association Project StaffMuin Khoury, M.D., Ph.D. Adam C. Berger, Ph.D., Roundtable DirectorCenters for Disease Control and Prevention Sean P. David, M.D., D.Phil., Puffer/ABFM Fellow Claire Giammaria, M.P.H., Research AssociateThomas Lehner, Ph.D., M.P.H. Tonia Dickerson, Senior Program AssistantNational Institute of Mental HealthThe Institute of Medicine serves as adviser to the nation to improve health. Established in 1970 under the charter of the NationalAcademy of Sciences, the Institute of Medicine provides independent, objective, evidence-based advice to policy makers, healthprofessionals, the private sector, and the public. The mission of the Institute of Medicine embraces the health of people everywhere. 6
    • The National Cancer Policy Forum The National Cancer Policy Forum (NCPF) provides a continuous focus on cancer policy at the Institute of Medicine (IOM). IOM forums are designed to allow government, industry, academic, and other representatives to meet, confer, and plan on subject areas of mutual interest. The objectives of the NCPF are to identify emerging high priority policy issues in the nation’s effort to combat cancer and to examine those issues through convening activities that promote discussion about potential opportunities for action. These activities inform stakeholders about critical policy issues through published reports, and often provide input for planning formal IOM consensus committee studies. The IOM established the NCPF on May 1, 2005, to succeed the National Cancer Policy Board (1997-2005, the Board). The Board brought together leaders from the cancer community to identify and conduct studies and other activities contributing to cancer research, prevention, treatment, and public awareness. The combination of multi- disciplinary expertise (basic, clinical, and public health scientists, consumers, and advocates) and resources (grants from the National Cancer Institute (NCI) and Centers for Disease Control and Prevention (CDC), as well as smaller contributions from private sector organizations) allowed the Board to produce a remarkably original and diverse body of work contributing to improvements in knowledge and public policy. The NCPF continues to provide a focus within the National Academies for the consideration of issues in science, clinical medicine, public health, and public policy relevant to the goals of preventing, palliating, and curing cancer. The NCPF builds upon the work of the Board and enjoys a closer working relationship with its federal and non- federal sponsors. As a forum rather than a board, sponsors are full members with the academic, consumer, and policy community members. They bring ideas and requests to the deliberations and have the advantage of playing an active part in the discussions. Governmental sponsors represented on the NCPF include the NCI and CDC, and non- governmental sponsors include the American Association for Cancer Research, the American Cancer Society, the American Society of Clinical Oncology, the Association of American Cancer Institutes, Bristol-Myers Squibb, C-Change, the CEO Roundtable on Cancer, GlaxoSmithKline Oncology, Novartis, the Oncology Nursing Society, and Sanofi Oncology. Additional distinguished experts from the cancer community are also appointed as members to serve three-year terms. The NCPF operates under the aegis of the IOM Board on Health Care Services. The NCPF enables all members to be full participants in identifying and debating critical policy issues in cancer care and research, and in examining potential opportunities for actions. These convening activities result in published reports that are available to the public and may provide input to planning formal IOM consensus committee studies. Ideas for committee studies that emerge from the NCPF’s deliberations are handed off to an appropriate ad hoc committee appointed by IOM. The studies are conducted by NCPF staff and the committees often include one or more members of the NCPF. Forum sponsors often actively pursue activities to facilitate the implementation of recommendations made in these consensus reports, as well as suggestions put forth in NCPF workshops.500 Fifth Street, NW, Washington, DC 20001 Ph: 202-334-1233 Fax: 202-334-2862 7 www.iom.edu/Activities/Disease/NCPF
    • National Cancer Policy Forum Reports:Developing Biomarker-based Tools for Cancer Screening, Diagnosis and Therapy (2006), Effect ofthe HIPAA Privacy Rule on Health Research (2006), Implementing Cancer Survivorship CarePlanning (2007), Cancer in Elderly People (2007), Cancer-Related Genetic Counseling and Testing(2007), Improving the Quality of Cancer Clinical Trials (2008), Implementing Colorectal CancerScreening (2008), Multi-center Phase III Clinical Trials and the NCI Cooperative Group Program(2009), Ensuring Quality Cancer Care through the Oncology Workforce (2009), Assessing andImproving Value in Cancer Care (2009), Policy Issues in the Development of Personalized Medicinein Oncology (2010), A Foundation for Evidence-Driven Practice: A Rapid Learning System forCancer Care (2010), Extending the Spectrum of Precompetitive Collaboration in Oncology Research(2010), Direct to Consumer Genetic Testing (with the NRC, 2010), Policy Issues in Nanotechnologyand Oncology (2011), National Cancer Policy Summit: Opportunities and Challenges in CancerResearch and Care (2011), Patient-Centered Cancer Treatment Planning: Improving the Quality ofOncology Care (2011), Implementing a National Cancer Clinical Trials System for the 21st Century(2011), Facilitating Collaborations to Develop Combination Investigational Cancer Therapies(2011), The Role of Obesity in Cancer Survival and Recurrence (2012), Informatics Needs andChallenges in Cancer Research (2012), Reducing Tobacco-Related Cancer Incidence and Mortality(In Preparation)Spin-off IOM consensus committee reports:Cancer Biomarkers: The Promises and Challenges of Improving Detection and Treatment (2007),Beyond the HIPAA Privacy Rule: Enhancing Privacy, Improving Health through Research (2009),Evaluation of Biomarkers and Surrogate Endpoints in Chronic Disease (2010), A National CancerClinical Trials System for the 21st Century (2010), Evolution of Translational Omics (2012), Ensuringthe Quality of Cancer Care in an Aging Population (In preparation)Membership of the Forum:Chair - John Mendelsohn, MD, President, MD Anderson Cancer CenterVice Chair - Patricia Ganz, MD, Professor of Medicine, Jonsson Comprehensive Cancer CenterAmy Abernethy, MD, Director, Duke University School of Medicine Cancer Care Research ProgramRafael Amado, MD, Senior Vice President and Head of GlaxoSmithKline Oncology R&DFred Appelbaum, MD, Director, Clinical Research, Fred Hutchinson Cancer Research CenterPeter Bach, MD, MAPP, Member, Memorial Sloan-Kettering Cancer CenterEdward Benz, MD, President, Dana-Farber Cancer InstituteMonica Bertagnolli, MD, Professor of Surgery, Harvard University Medical SchoolOtis Brawley, MD, Chief Medical Officer and Executive Vice President, American Cancer SocietyMichael Caligiuri, MD, Director, Ohio State University Cancer Center, past President, AACIRenzo Canetta, MD, Vice President, Oncology Global Clinical Research, Bristol-Myers SquibbMichaele Chamblee Christian, MD, RetiredWilliam Dalton, PhD, MD, CEO, M2Gen, H. Lee Moffitt Cancer CenterWendy Demark-Wahnefried, PhD, RD, Professor and Chair, University of Alabama, BirminghamRobert Erwin, MS, President, Marti Nelson Cancer FoundationRoy Herbst, MD, PhD, Chief of Medical Oncology, Yale Cancer CenterThomas Kean, MPH, President and CEO, C-ChangeDouglas Lowy, MD, Deputy Director, Division of Basic Science, NCIDaniel R. Masys, MD, Affiliate Professor, Biomedical Informatics, University of WashingtonMartin Murphy, PhD, DMedSc, Chief Executive Officer, CEO Roundtable on CancerBrenda Nevidjon, RN, MSN, Professor, Duke University School of Nursing, past President, ONSSteven Piantadosi, MD, PhD, Director, Samuel Oschin Comprehensive Cancer InstituteLisa Richardson, MD, MPH, Associate Director, Division of Cancer Prevention and Control, CDCDebasish Roychowdhury, MD, Senior Vice President- Global Oncology, SanofiYa-Chen Tina Shih, PhD, Director, Program in the Economics of Cancer, University of ChicagoEllen Sigal, PhD, Chairperson and Founder, Friends of Cancer ResearchSteven Stein, MD, Senior Vice President, US Clinical Development & Medical Affairs, NovartisJohn Wagner, MD, PhD, Vice President, Clinical Pharmacology, Merck Research LaboratoriesRalph Weichselbaum, MD, Chairman, Radiation and Cellular Oncology, University of ChicagoJanet Woodcock, MD, Director, Center for Drug Evaluation and Research, FDAForum Director: Sharyl Nass, PhD 8
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders Roundtable on Translating Genomic-Based Research for Health National Cancer Policy Forum Speaker Biographies Sharing Clinical Research Data: An IOM Workshop October 4-5, 2012Myles Axton, Ph.D.Myles Axton is the editor of Nature Genetics. He was a university lecturer in molecular andcellular biology at the University of Oxford and a Fellow of Balliol College from 1995 to 2003. Heobtained his degree in genetics at Cambridge in 1985, and his doctorate at Imperial College in1990, and between 1990 and 1995 did postdoctoral research at Dundee and at MIT’sWhitehead Institute. Myles’s research made use of the advanced genetics of Drosophila tostudy genome stability by examining the roles of cell cycle regulators in life cycle transitions. Hisinterests broadened into human genetics, genomics and systems biology through lecturing andfrom tutoring biochemists, zoologists and medical students from primary research papers.Helping to establish Oxford’s innovative research MSc. in Integrative Biosciences led Myles torealize the importance of the integrative overview of biomedical research. As a full timeprofessional editor he is now in a position to use this perspective to help coordinate research ingenetics.Jesse A. Berlin, Sc.D.After spending 15 years as a faculty member at the University of Pennsylvania, in the Centerfor Clinical Epidemiology and Biostatistics, under the direction of Dr. Brian Strom, Dr. Berlin leftthe University of Pennsylvania to join Janssen Research & Development, where he is currentlyVice President of Epidemiology. He has authored or coauthored over 230 publications in a widevariety of clinical and methodological areas, including papers on the study of meta-analyticmethods as applied to both randomized trials and epidemiology. He served on an Institute ofMedicine Committee that developed recently-released recommendations for the use ofsystematic reviews in clinical effectiveness research, and currently serves on the ScientificAdvisory Committee to the Observational Medical Outcomes Partnership, a public-privatepartnership aimed at understanding methodology for assessing drug safety in large,administrative databases. He is also a fellow of the American Statistical Association.Josephine P. Briggs, M.D.Dr. Briggs is the Director of the National Center for Complementary and Alternative Medicine,and Acting Director, Division of Clinical Innovation, National Center for Advancing TranslationalSciences, NIH. An accomplished researcher and physician, Dr. Briggs received her A.B. inbiology from Harvard-Radcliffe College and her M.D. from Harvard Medical School. Shecompleted her residency training in internal medicine and nephrology at the Mount Sinai Schoolof Medicine, followed by a fellowship at Yale, then work as a research scientist at thePhysiology Institute at the University of Munich.In 1985, Dr. Briggs moved to the University of Michigan where she held several academicpositions, including associate chair for research in the Department of Internal Medicine andprofessorships in the Division of Nephrology, Department of Internal Medicine, and theDepartment of Physiology. She joined the National Institutes of Health (NIH) in 1997 as directorof the Division of Kidney, Urologic, and Hematologic Diseases at the National Institute ofDiabetes and Digestive and Kidney Diseases. In 2006, Dr. Briggs accepted a position as senior 1
    • scientific officer at the Howard Hughes Medical Institute. In January 2008, she returned to NIHas the Director of the National Center for Complementary and Alternative Medicine.Dr. Briggs has published more than 175 research articles, book chapter, and scholarlypublications and has served on the editorial boards of several journals, and was deputy editorfor the Journal of Clinical Investigation. Dr. Briggs is an elected member of the AmericanAssociation of Physicians and the American Society of Clinical Investigation and a fellow of theAmerican Association for the Advancement of Science. She is a recipient of many awards andprizes, including the Volhard Prize of the German Nephrological Society, the Alexander vonHumboldt Scientific Exchange Award, and NIH Directors Awards for her role in the developmentof the Trans-NIH Type I Diabetes Strategic Plan and her leadership of the Trans-NIH Zebrafishcommittee. Dr. Briggs is also a member of the NIH Steering Committee, the senior mostgoverning board at the NIH.Robert M. Califf, M.D.Dr. Califf is the Vice Chancellor for Clinical and Translational Research, Director of the DukeTranslational Medicine Institute (DTMI), and Professor of Medicine in the Division of Cardiologyat Duke University Medical Center in Durham, NC. He leads a multifaceted organization thatseeks to transform how scientific discoveries are translated into improved health outcomes.Prior to leading the DTMI, he was the founding director of the Duke Clinical Research Institute(DCRI), one of the nations premier academic research organizations. He is editor-in-chief of theAmerican Heart Journal, the oldest cardiovascular specialty journal, and a practicing cardiologistat Duke University Medical Center.Born in 1951 in Anderson, SC, Dr. Califf attended high school in Columbia, where he was amember of the 1969 AAAA SC Championship basketball team. After high school, Dr. Califfattended Duke University, graduating summa cum laude and Phi Beta Kappa in 1973. Heremained at Duke for medical school, where he was selected for the Alpha Omega Alphamedical honor society. After graduating from Duke University School of Medicine in 1978, hecompleted a residency in internal medicine at the University of California, San Francisco,returning to Duke for a cardiology fellowship.Dr. Califf is board certified in internal medicine (1984) and cardiology (1986), and was named aMaster of the American College of Cardiology in 2006. An international leader in the fields ofcardiovascular medicine, healthcare outcomes, quality of care, and medical economics, he hasauthored or coauthored more than 1,000 peer-reviewed articles and is among the mostfrequently cited authors in medicine. He is also a contributing editor for TheHeart.org, an onlineinformation resource for healthcare professionals working in the field of cardiovascularmedicine.As founder and for a decade director of the DCRI, Dr. Califf led many landmark clinical trials andhealth services research projects, and remains actively involved in designing, leading, andconducting multinational clinical trials. Under his guidance, DCRI grew into an organization withmore than 1000 employees and an annual budget of over $1 00 million; its umbrellaorganization, the DTMI, now has an annual budget of more than $300 million. Supported in partby a Clinical and Translational Science Award (CTSA) from the National Institutes of Health, theDTMI works with government agencies, academic partners, research foundations, and themedical products industry to conduct innovative research spanning multiple therapeutic arenasand scientific disciplines. 2
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders Roundtable on Translating Genomic-Based Research for Health National Cancer Policy ForumDr. Califf serves as co-chair of the first Principal Investigators Steering Committee of the CTSAand has served on the FDAs Cardiorenal Advisory Panel, and on the Institute of Medicines(IOM) Pharmaceutical Roundtable, Committee on Identifying and Preventing Medication Errors,and Committee on Nutritional Biomarkers. In 2008, he was part of the subcommittee of theFDAs Science Board that recommended sweeping reform ofthe agencys science base. He wasalso a member of the IOM committees that recommended Medicare coverage of clinical trialsand the removal of ephedra from the market. Dr. Califf is currently a member ofthe IOM Forumin Drug Discovery, Development, and Translation and a member of the National AdvisoryCouncil on Aging.Reflecting his interests in healthcare quality, Dr. Califf was the founding director of thecoordinating center of the Centers for Education & Research on Therapeutics, a public-privatepartnership that seeks to improve the use of medical products through research and education.He is currently co-chair of the Clinical Trials Transformation Initiative, a public-privatepartnership focused on improving the clinical trials system. He is also chair of the ClinicalResearch Forum, an organization of academic health and science system leaders devoted toimproving the clinical research enterprise.Dr. Califf is married to Lydia Carpenter, and the couple has three children-Sharon Califf, agraduate of Elon College; Sam, a graduate student at the University of Colorado-Boulder; andTom, a graduate of Duke-and one grandchild. Dr. Califf enjoys time with his family, works at hisgolf game, listens to music, and remains an ardent supporter of the Duke mens and womensbasketball programs.Michael N. Cantor, M.D., M.A., FACPMichael Cantor is currently Senior Director, Information Strategy and Analytics, in Pfizer’sClinical Informatics and Innovation group. His work focuses on leveraging data reuse andintegration to support future horizons of scientific decision support for precision medicine. He iscurrently co-leading several initiatives around the secondary use of clinical data, includingPfizer’s ePlacebo/eControls database, as well as its comprehensive Clinical Lab Data Catalog.He created and co-leads the MEDIC (Multisite Electronic Data Infectious Disease Consortium)project, which aims to partner with academic medical centers to perform observational studiesusing data from electronic medical record (EMR) systems. He has served as an advisor toprograms across each of Pfizer’s Business Units, as well as the Worldwide Research andDevelopment organization, on the role of healthcare IT in advancing their strategic priorities.Dr. Cantor previously lead Pfizer Business Technology’s “Data Without Borders” strategy, withthe aim of advancing data sharing and reuse, both internally and externally, to advancePrecision Medicine. Outside of Pfizer, he has been a member of AMIA’s public policy committeefor six years, and led the committee’s initiative to update its positions around data stewardshipand reuse.Prior to joining Pfizer, Michael was the Chief Medical Information Officer for the SouthManhattan Healthcare Network of the New York City Health and Hospitals Corporation, basedat Bellevue Hospital in Manhattan. His work there focused on developing the network’s EMRsystem to improve patient safety and on using the network’s clinical data warehouse forresearch. He continues to see patients 1 day/week at Bellevue, and is a Clinical AssistantProfessor of Medicine at NYU School of Medicine. 3
    • Michael completed his residency in internal medicine and informatics training at Columbia, hasan M.D. from Emory University, and an A.B. from Princeton.Carolyn Compton, M.D., Ph.D.Dr. Carolyn Compton is the President and CEO of Critical Path Institute. She was most recentlythe Director of the Office of Biorepositories and Biospecimen Research (OBBR) and theExecutive Director of the Cancer Human Biobank (caHUB) project at the National CancerInstitute. In these capacities, she had leadership responsibility for strategic initiatives thatincluded the Innovative Molecular Analysis Technologies for Cancer program, the BiospecimenResearch Network program, and the NCI Community Cancer Centers project. She is an adjunctProfessor of Pathology at the Johns Hopkins School of Medicine.She received her M.D. and Ph.D. in degrees from Harvard Medical School and the HarvardGraduate School of Arts and Sciences. She trained in Pathology at Harvards Brigham andWomens Hospital and is boarded in both Anatomic Pathology and Clinical Pathology. Shecame to the NCI from McGill University where she had been the Strathcona Professor and Chairof Pathology and the Pathologist-in-Chief of McGill University Health Center from 2000-2005.Prior to this, she had been a Professor of Pathology Harvard Medical School, the Director ofGastrointestinal Pathology at the Massachusetts General Hospital, and the Pathologist-in-Chiefof the Shriners Hospital For Crippled Children, Boston Burns Unit for 15 years. During this timeshe served as the Chair of the Pathology Committee of the Cancer and Leukemia Group B for12 years. Her research interests are in colon and pancreatic cancer as well as epithelial biologyand wound healing.Dr. Compton has held many national and international leadership positions in pathology andcancer-related professional organizations. She is a Fellow of the College of AmericanPathologists and a Fellow of the Royal Society of Medicine. Currently, she is the Chair of theAmerican Joint Committee on Cancer (AJCC), serves on the Executive Committee of theCommission on Cancer of the American College of Surgeons, and serves as the PathologySection Editor for Cancer. She is a past Chair of the Cancer Committee of the College ofAmerican Pathologists and was Editor of the first edition of the CAP Cancer Protocols(Reporting on Cancer Specimens) used as standards for COC accreditation. Among her awardsare the ISBER Award for Outstanding Achievement in Biobanking, the NIH Directors Award, theNIH Award of Merit, and the CAP Frank W. Hartman Award. She has published more than 500original scientific papers, reports, review articles, books and abstracts.Neil de Crescenzo, M.B.A.Mr. de Crescenzo is Senior Vice President and General Manager for Health Sciences at Oracle.He is responsible for managing Oracles solution groups, strategic planning, productdevelopment, and sales, service, and support for the industry solutions sold into the healthcareand life sciences markets worldwide. Mr. de Crescenzo brings more than 20 years ofoperational and IT leadership across healthcare and life sciences to his work with customersand partners worldwide.Prior to joining Oracle, Mr. de Crescenzo held a number of leadership positions during hisdecade at IBM Corporation, working with healthcare and life sciences clients worldwide. Prior toentering the information technology industry, he held leadership positions in healthcareoperations at multiple medical centers and a major health insurer. Mr. de Crescenzo began hiscareer in investment banking, working with U.S. and European clients in the areas of corporatefinance and mergers and acquisitions. 4
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders Roundtable on Translating Genomic-Based Research for Health National Cancer Policy ForumMr. de Crescenzo has been a keynote speaker at numerous industry conferences worldwideand is quoted frequently on industry issues. In 2005, he was named one of the "Top 25 MostInfluential Consultants" by Consulting Magazine.Mr. de Crescenzo has a BA in political science from Yale University and an MBA in hightechnology from Northeastern University.Peter Doshi, Ph.D.Dr. Doshi is a postdoctoral fellow in comparative effectiveness research at the Johns HopkinsUniversity School of Medicine.His overarching research interests are in improving the bases for credible evidence synthesis tosupport and improve the quality of evidence-based medical and health policy related decisionmaking.In 2009, he joined a Cochrane systematic review team evaluating neuraminidase inhibitors forthe treatment and prevention of influenza. Rather than focusing on publications, their reviewevaluates regulatory information including clinical study reports.He received his A.B. in anthropology from Brown University, A.M. in East Asian studies fromHarvard University, and Ph.D. in History, Anthropology, and Science, Technology and Societyfrom the Massachusetts Institute of Technology.Kelly Edwards, Ph.D.An associate professor in the School of Medicines Department of Bioethics and Humanities, Dr.Edwards also is a core faculty member for the UW Institute for Public Health Genetics. Shereceived both her Master of Arts degree in Medical Ethics and her Ph.D. in Philosophy ofEducation from the UW.Dr. Edwards work incorporates communication and public engagement as an ethical obligationfor clinicians and researchers. She is the director of the Ethics and Outreach Core for the UWCenter for Ecogenetics and Environmental Health, which is funded by the National Institute ofEnvironmental Health Sciences. She also is a co-director of the Regulatory Support andBioethics Core for the Institute for Translational Health Sciences (ITHS), a partnership of theUW, Fred Hutchinson Cancer Research Center, Seattle Childrens and other regionalinstitutions and community and tribal groups. Funded by the National Institutes of Health (NIH),the ITHS assists researchers with translating their scientific discoveries into practice.In addition, Dr. Edwards is a lead investigator with UW Center for Genomics and HealthcareEquality, funded by the NIHs National Human Genome Research Institute. Since 2004, she hasbeen the faculty advisor for the Forum on Science, Ethics and Policy, groups of graduate andprofessional students and postdoctoral fellows at the UW and University of Colorado thatpromote dialogue on issues concerning science and society.To further engage people in conversations about ethical dimensions of science and medicine,Dr. Edwards has facilitated Community Conversations and the Public Health Café, a series ofevents hosted in Seattle by the Northwest Association for Biomedical Research. Nationally, Dr.Edwards contributes to issues of ethical research practices with the Genetic Alliance, a healthadvocacy organization; Sage Bionetworks, a local non-profit; and the Institute of Medicine.Her courses include "Inquiry-Based Science Communication," "Applied Research Ethics,""Community-Based Participatory Research: A Model for Genetics Research with Native 5
    • American Communities?" and "Public Commentary on Ethical Issues in Public Health Genetics."She is associate editor of BMC Medical Research Methodology and a reviewer for severaljournals.Dr. Edwards serves on the School of Medicines Continuous Professional ImprovementCommittee and is a former member of Medicines Standing Committee on Issues of WomenFaculty, the Student Progress Committee and the Committee on Research and GraduateEducation. She is a current member of the UW Graduate School Committee on InterdisciplinaryEducation.Hans-Georg Eichler, M.D., M.Sc.Dr. Eichler is the Senior Medical Officer at the European Medicines Agency in London, UnitedKingdom, where he is responsible for coordinating activities between the Agency’s scientificcommittees and giving advice on scientific and public health issues. From January untilDecember 2011, Dr. Eichler was the Robert E. Wilhelm fellow at the Massachusetts Institute ofTechnology’s Center for International Studies, participating in a joint research project under theMIT’s NEWDIGS initiative. He divided his time between the MIT and the EMA in London.Prior to joining the European Medicines Agency, Dr. Eichler was at the Medical University ofVienna in Austria for 15 years. He was vice-rector for Research and International Relationssince 2003, and professor and chair of the Department of Clinical Pharmacology since 1992.His other previous positions include president of the Vienna School of Clinical Research and co-chair of the Committee on Reimbursement of Drugs of the Austrian Social Security Association.His industry experience includes time spent at Ciba-Geigy Research Labs, U.K., and OutcomesResearch at Merck & Co., in New Jersey.Dr. Eichler graduated with an M.D. from Vienna University Medical School and a Master ofScience degree in Toxicology from the University of Surrey in Guildford, U.K. He trained ininternal medicine and clinical pharmacology at the Vienna University Hospital as well as atStanford University.Jennifer S. Geetter, J.D.Ms. Geetter is a partner in the law firm of McDermott Will & Emery LLP and is based in theFirms Washington, D.C., office. She focuses her practice on emerging biotechnology andsafety issues, advising hospital, industry, insurance and provider clients on matters relating toresearch, drug and device development, off-label use, personalized medicine, formularycompliance, privacy and security, electronic health records and data strategy initiatives, patientsafety, conflicts of interest, scientific review and research misconduct, internal hospitaldisciplinary proceedings, and emerging issues in secondary research concerning biologicalsamples and data warehousing. She also assists health care clients in implementing researchstrategies, structuring research operational and compliance infrastructure, and in developingguidelines for the appropriate relationships between providers and industry. She is a frequentspeaker on these topics.Ms. Geetter a member of the Firm’s Life Sciences Affinity Group and Personalized MedicineTeam. She is also a co-chair of the Pro Bono and Community Service Committee for theWashington, D.C., office. She sits on McDermott’s National Pro Bono Committee and theAmerican Health Lawyers Association Life Sciences Section Steering Committee.Ms. Geetter is listed as a leading individual in health care in Washington, D.C., in ChambersUSA 2008: America’s Leading Lawyers for Business. 6
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders Roundtable on Translating Genomic-Based Research for Health National Cancer Policy ForumShe was recognized as a STAR Mentor in the McDermott University Mentoring Program for2006-2007. She is also a member of McDermott’s Gender Diversity Committee.In May 2003, Ms. Geetter was awarded the ACE Founders Award for her pro bono efforts aspart of a team of McDermott lawyers on behalf of a group of low-income residents of a Bostonneighborhood. She received the Firms Outstanding Achievement Award for Commitment toPro Bono and Service to the Community in 2004 and serves on the Firms national coordinatingcommittee for pro bono activities.Ms. Geetter is admitted to practice in Massachusetts, New York and Washington, D.C.Steven N. Goodman, M.D., M.H.S., Ph.D.Dr. Goodman is Professor of Medicine and Health Policy & Research and Associate Dean forClinical and Translational Research at Stanford University. Before joining Stanford in 2011, Dr.Goodman spent two decades on the Johns Hopkins medical faculty as Professor of Oncology inthe division of biostatistics, with appointments in the departments of Pediatrics, Biostatistics,and Epidemiology in the Johns Hopkins Schools of Medicine and Public Health. He has beeneditor of Clinical Trials: Journal of the Society for Clinical Trials since 2004 and senior statisticaleditor for the Annals of Internal Medicine since 1987. He has served on wide range of Instituteof Medicine committees, including Agent Orange and Veterans, Immunization Safety, theCommittee on Alternatives to the Daubert Standards, Treatment of PTSD in Veterans, and mostrecently co-chaired the Committee on Ethical and Scientific Issues in Studying the Safety ofApproved Drugs, whose report was released in May, 2012.Dr. Goodman was appointed by the GAO to serve on the Methodology Committee of the PatientCentered Outcomes Research Institute (PCORI) and is a scientific advisor to the MedicalAdvisory Panel of the National Blue Cross/Blue Shield Technology Evaluation Center. Heserved on the Surgeon General’s committee to write the 2004 report on the HealthConsequences of Smoking. Dr. Goodman received a BA from Harvard, an MD from New YorkUniversity, trained at Washington University in Pediatrics, in which he was board certified, andreceived an MHS in Biostatistics, and PhD in Epidemiology from Johns Hopkins University. Hewrites and teaches on evidence evaluation and inferential, methodologic, and ethical issues inepidemiology and clinical research.Robert A. Harrington, M.D.Dr. Robert Harrington is Arthur L. Bloomfield Professor of Medicine and Chairman of theDepartment of Medicine at Stanford University. He received his undergraduate degree inEnglish from the College of the Holy Cross Worcester, Mass. He attended Dartmouth MedicalSchool and received his medical degree from Tuft University School of Medicine in 1986. Hewas an intern, resident, and the chief medical resident in internal medicine at the University ofMassachusetts Medical Center. He was a fellow in cardiology at Duke University MedicalCenter, where he received training in interventional cardiology and research training in the DukeDatabank for Cardiovascular Diseases. Dr. Harrington was previously the director of the DukeClinical Research Institute (DCRI). His research interests include evaluating antithrombotictherapies to treat acute ischemic heart disease and to minimize the acute complications ofpercutaneous coronary procedures, studying the mechanism of disease of the acute coronarysyndromes, understanding the issue of risk stratification in the care of patients with acuteischemic coronary syndromes, trying to better understand and improve upon the methodology ofclinical trials. He is the recipient of an NIH Roadmap contract to investigate “best practices”among clinical trial networks. 7
    • He has authored more than 400 peer-reviewed manuscripts, reviews, book chapters, andeditorials. He is an associate editor of the American Heart Journal and an editorial boardmember for the JACC. He is a senior editor of the 13th edition of Hurst’s The Heart. He is aFellow of the ACC, the AHA, the ESC, the Society of Cardiovascular Angiography andIntervention, and the American College of Chest Physicians. He recently served as a memberand the chair of the Food and Drug Administration Cardiovascular and Renal Drugs AdvisoryCommittee.Lynn D. Hudson, Ph.D.Dr. Hudson serves as the Chief Science Officer for the Critical Path Institute and ExecutiveDirector of the Multiple Sclerosis Consortium. She received a BS in Biochemistry at theUniversity of Wisconsin, a Ph.D. in Genetics and Cell Biology at the University of Minnesota,and post-doctoral training at Harvard Medical School and Brown University. For most of hercareer, Lynn has been a bench neuroscientist at the National Institutes of Health (NIH), whereshe directed the Office of Science Policy Analysis from 2006-2011. As a major source for policyanalysis within NIHs Office of the Director, her office covered a wide spectrum of sensitive andemerging issues and oversaw a number of programs, including the AAAS/NIH Science PolicyFellowship program, NIH’s contract with the National Academy of Sciences, and the Public-Private Partnership Program. Her policy teams awards cite contributions to NIHsCongressional Justification, Biennial Report, implementation of the NIH Reform Act, Stem CellGuidelines, and Comparative Effectiveness Research. Lynn has conducted research in theNational Institute of Neurological Disorders and Stroke (NINDS) intramural program, where shewas Chief of the Section of Developmental Genetics for 16 years.She received the NIH Merit Award for her discovery of the causative mutations in the neurologicdisorder Pelizaeus-Merzbacher disease (PMD), and an NINDS Award for educational outreachefforts. Her research focused on defining the network of genes involved in the development ofglial cells, with the goal of designing strategies to overcome glial dysfunction in inherited oracquired neurological diseases. She served as an officer for the American Society forNeurochemistry, an officer on the scientific advisory board of the PMD Foundation, and as anadvisor for a number of granting agencies and disease foundations, including the NationalMultiple Sclerosis Society.Charles Hugh-Jones, M.D.Dr. Hugh-Jones is Vice President and Head, Medical Affairs North America, Oncology,Hematology, and Solid Organ Transplant at Sanofi. He has previously served as Vice Presidentand Head, Oncology Medical Affairs, North America for sanofi-aventis; Vice President of BrandManagement and Vice President of Medical Affairs at Enzon Pharmaceuticals; ExecutiveDirector, Medical Affairs at Schering AG/Berlex/Bayer-Schering; Director of Medical Affairs forSchering AG, Global Business Unit; Senior Medical Advisor for Schering AG, UK; and SpecialistRegistrar at Hammersmith Hospital, UK. Dr. Hugh-Jones received his medical degree from theUniversity of London. He is Board Certified in Internal Medicine and has Additional HigherSpecialist Training (UK) in Diagnostic and Interventional Radiology. 8
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders Roundtable on Translating Genomic-Based Research for Health National Cancer Policy ForumJohn P.A. Ioannidis, D.Sc., M.D.Dr. Ioannidis currently holds the C.F. Rehnborg Chair in Disease Prevention at StanfordUniversity and is Professor of Medicine, Professor of Health Research and Policy, and Directorof the Stanford Prevention Research Center at Stanford University School of Medicine,Professor of Statistics (by courtesy) at Stanford University School of Humanities and Sciences,member of the Stanford Cancer Center and of the Stanford Cardiovascular Institute, andaffiliated faculty of the Woods Institute for the Environment. From 1999 until 2010 he chaired theDepartment of Hygiene and Epidemiology at the University of Ioannina School of Medicine inGreece, as a tenured professor since 2003.Dr. Ioannidis was born in New York, NY in 1965 and grew up in Athens, Greece. He wasValedictorian of his class (1984) at Athens College and won a number of early awards, includingthe National Award of the Greek Mathematical Society, in 1984. He graduated in the top rank ofhis class from the School of Medicine, University of Athens, in 1990 and earned a doctorate inbiopathology. He trained at Harvard and Tufts, specializing in Internal Medicine and InfectiousDiseases, and then held positions at NIH, Johns Hopkins University School of Medicine andTufts University School of Medicine before returning to Greece in 1999. He has been adjunctfaculty for the Tufts University School of Medicine since 1996, with the rank of professor since2002. Since 2008 he led the Genetics/Genomics component of the Tufts Clinical andTranslational Science Institute (CTSI) and the Center for Genetic Epidemiology and Modeling(CGEM) of the Tufts Institute for Clinical Research and Health Policy Studies at Tufts MedicalCenter. He is also adjunct professor of epidemiology at the Harvard School of Public Health andvisiting professor of epidemiology and biostatistics at Imperial College London.Dr. Ioannidis is a member of the executive board of the Human Genome Epidemiology Network,and has served as President of the Society for Research Synthesis Methodology, as a memberof the editorial board of 27 leading international journals (including PLoS Medicine, Lancet,Annals of Internal Medicine, Journal of the National Cancer Institute, Science TranslationalMedicine, Molecular and Cellular Proteomics, AIDS, International Journal of Epidemiology,Journal of Clinical Epidemiology, Clinical Trials, Cancer Treatment Reviews, Open Medicine,and PLoS ONE, among others) and as Editor-in-Chief of the European Journal of ClinicalInvestigation for the period 2010-2014. He has given more than 250 invited and honorarylectures. He is one of the most-cited scientists of his generation worldwide, with indices Hirschh=97 [h(F/L)=80 when limited to first- and last(senior)-author papers), Hirsch m=5.7, andSchreiber hm=60 per GoogleScholar as of 8/2012 (h=80 per ISI). He has received severalawards, including the European Award for Excellence in Clinical Science for 2007, and hasbeen inducted in the Association of American Physicians in 2009 and in the European Academyof Cancer Sciences in 2010. The PLoS Medicine paper on “Why most Published ResearchFindings are False,” has been the most-accessed article in the history of Public Library ofScience. The Atlantic selected Ioannidis as the Brave Thinker scientist for 2010 claiming that he“may be one of the most influential scientists alive.”Charles Jaffe, M.D.Dr. Jaffe is the Chief Executive Officer of HL7 where he serves as the organizations globalambassador, fostering relationships with key industry stakeholders. A 37-year veteran of thehealthcare IT industry, Dr. Jaffe was previously the Senior Global Strategist for the DigitalHealth Group at Intel Corporation, Vice President of Life Sciences at SAIC, and the Director ofMedical Informatics at AstraZeneca Pharmaceuticals. He completed his medical training atJohns Hopkins and Duke Universities, and was a post-doctoral fellow at the National Institutes 9
    • of Health and at Georgetown University. Formerly, he was President of InforMed, an informaticsconsultancy for research informatics. Over the course of his career, he has been the principalinvestigator for more than 200 clinical trials, and has served in various leadership roles in theAmerican Medical Informatics Association. He has been a board member on leadingorganizations for information technology standards, and served as the chair of a nationalinstitutional review board. Most recently, he held an appointment in the Department ofEngineering at Penn State University. Dr. Jaffe has been the contributing editor for severaljournals and has published on a range of subjects, including clinical management, informaticsdeployment, and healthcare policy.Sachin H. Jain, M.D., M.B.A.Dr. Jain is Chief Medical Information and Innovation Officer (CMIO) at Merck and Lecturer inHealth Care Policy at Harvard Medical School. He also serves as an attending hospitalistphysician at the Boston VA-Boston Medical Center.In his capacity as Mercks CMIO, Jain is working to establish and manage global alliances touse health data to improve medication safety, efficacy, use, and adherence. In addition, Jain isleading several efforts around new ventures and business development.Previously, he was Senior Advisor to the Administrator of the Centers for Medicare andMedicaid Services (CMS). At CMS, he helped lead the launch of the Center for Medicare andMedicaid Innovation, briefly serving as its Acting Deputy Director for Policy and Programs. AtCMS, Jain had advocated for speedier translation of health care delivery research into practiceand an expanded use of clinical registries.Dr. Jain also served as Special Assistant to the National Coordinator for Health InformationTechnology at the Office of the National Coordinator for Health Information Technology (ONC).At the ONC, Jain worked with David Blumenthal to implement the HITECH Provisions of theRecovery Act that provide incentives for physicians and hospitals to become meaningful usersof health information technology.He received his undergraduate degree magna cum laude in government from Harvard College;his medical degree from Harvard Medical School; and his masters degree in businessadministration from the Harvard Business School. He was a recipient of the Paul and DaisySoros Fellowship and the Deans Award at Harvard Business School. He completed hisresidency in internal medicine at Brigham and Womens Hospital, where he was honored withthe Resident Mentor Award.Dr. Jain is a founder of several non-profit health care ventures including the Homeless HealthClinic at the Harvard Square Homeless Shelter; the Harvard Bone Marrow Initiative; andImproveHealthCare.org. He worked with DaVita-Bridge of Life to bring charity dialysis care torural Rajasthan, India and Medical Missions for Children to bring cleft lip and palate surgery tothat region.He previously maintained a faculty appointment at Harvard Business Schools Institute forStrategy and Competitiveness and worked with strategy professor Michael Porter on a newcase literature on health care delivery innovation. He presently serves as an honorary SeniorInstitute Associate at Harvard Business School.Dr. Jain has worked previously at WellPoint, McKinsey & Co, and the Institute for HealthcareImprovement. He has also served as an expert consultant to the World Health Organization. 10
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders Roundtable on Translating Genomic-Based Research for Health National Cancer Policy ForumHe has authored over 50 publications on health care delivery innovation and health care reformin journals such as the New England Journal of Medicine, JAMA, and HealthAffairs. His workhas been cited in the New York Times, CNN, the Wall Street Journal, and other media outlets.He serves on the editorial boards of the American Journal of Managed Care; Harvard Medicine;and Wing of Zock. The book he co-edited with Susan Pories and Gordon Harper, "The Soul of aDoctor" has been translated into Chinese.Beth Kozel, M.D., Ph.D.Dr. Kozel is an Instructor of Pediatrics in the Division of Genetics and Genomic Medicine at St.Louis Children’s Hospital and Washington University School of Medicine. Clinically, she caresfor children with genetic conditions including chromosomal changes, inherited disorders andmultiple medical problems of unknown etiology. In her research, she has studied the biology ofelastic fiber formation and works with patients with elastic fiber diseases such as Williams-Beuren syndrome, supravalvular aortic stenosis and cutis laxa. Her current research is aimed atidentifying modifiers of vascular disease severity among affected individuals. Her work issupported by a K08 award from the NIH and she is a scholar of the Children’s DiscoveryInstitute at Washington University. She is a member of the American Society of Matrix Biologyand the American Society of Human Genetics. She is a registered researcher with the WilliamsSyndrome Patient and Clinical Research (WSPCR) Registry.Harlan Krumholz, M.D.Dr. Krumholz is the Harold H. Hines, Jr. Professor of Medicine and Director of the Robert WoodJohnson Clinical Scholars Program at Yale University School of Medicine and Director of theYale-New Haven Hospital Center for Outcomes Research and Evaluation (CORE).His research is focused on determining optimal clinical strategies and identifying opportunitiesfor improving the prevention, treatment and outcome of cardiovascular disease. Using appliedclinical research methods, his work seeks to provide critical information to improve the quality ofhealth care, monitor changes over time, guide decisions about the allocation of scarceresources and inform decisions made by patients and their clinicians. Studies from his researchgroup have directly led to improvements in the use of guideline-based medications, thetimeliness of care for acute myocardial infarction, the information available to patients who aremaking decisions about clinical options, the public reporting of outcomes measures, and thecurrent national focus on reducing the risk of readmission. He has also advanced the applicationof quality improvement to the elimination of disparities. His work influenced several provisions inthe health reform bill.Dr. Krumholz serves on the Board of Trustees of the American College of Cardiology, the Boardof Directors of the American Board of Internal Medicine and the Board of Governors of thePatient-Centered Outcomes Research Institute. He is an elected member of the Association ofAmerican Physicians, the American Society for Clinical Investigation, and the Institute ofMedicine. He is a Distinguished Scientist of the American Heart Association and has alsoreceived the Distinguished Service Award from the organization. He received a B.S. from Yale,an M.D. from Harvard Medical School and a Masters in Health Policy and Management from theHarvard University School of Public Health. 11
    • Richard E. Kuntz, M.D., M.Sc.Dr. Kuntz is Senior Vice President and Chief Scientific, Clinical and Regulatory Officer ofMedtronic, Inc. In this role, which he assumed in August 2009, Kuntz oversees the company’sglobal regulatory affairs, health policy and reimbursement, clinical research, ventures and newtherapies and innovation functions.Dr. Kuntz joined Medtronic in October 2005, as Senior Vice President and President ofMedtronic Neuromodulation, which encompasses the company’s products and therapies used inthe treatment of chronic pain, movement disorders, spasticity, overactive bladder and urinaryretention, benign prostatic hyperplasia, and gastroparesis. In this role he was responsible forthe research, development, operations and product sales and marketing for each of thesetherapeutic areas worldwide.Dr. Kuntz brings to Medtronic a broad background and expertise in many different areas ofhealthcare. Prior to Medtronic he was the Founder and Chief Scientific Officer of the HarvardClinical Research Institute (HCRI), a university-based contract research organization whichcoordinates National Institutes of Health (NIH) and industry clinical trials with the United StatesFood and Drug Administration (FDA). He has directed over 100 multicenter clinical trials andhas authored more than 200 original publications. His major interests are traditional andalternative clinical trial design and biostatistics.He also served as Associate Professor of Medicine at Harvard Medical School, Chief of theDivision of Clinical Biometrics, and an interventional cardiologist in the division of cardiovasculardiseases at the Brigham and Women’s Hospital in Boston, MA.Dr. Kuntz graduated from Miami University, and received his medical degree from CaseWestern Reserve University School of Medicine. He completed his residency in internalmedicine at the University of Texas Southwestern Medical School, and then completedfellowships in cardiovascular diseases and interventional cardiology at the Beth Israel Hospitaland Harvard Medical School, Boston. Dr. Kuntz received his master’s of science in biostatisticsfrom the Harvard School of Public Health.Rebecca D. Kush, Ph.D.Dr. Kush is Founder, President and CEO of the Clinical Data Interchange Standards Consortium(CDISC), a non-profit standards developing organization (SDO) with a mission to develop andsupport global, platform-independent standards that enable information system interoperabilityto improve medical research and related areas of healthcare and a vision of “Informing patientcare and safety through higher quality medical research.”She has over 25 years of experience in the area of clinical research, including positions with theU.S. National Institutes of Health, academia, a global CRO and biopharmaceutical companies inthe U.S. and Japan. She earned her doctorate in Physiology and Pharmacology from theUniversity of California San Diego (UCSD) School of Medicine. She is lead author on the book:eClinical Trials: Planning and Implementation and has authored numerous publications forjournals, including New England Journal of Medicine and Science Translational Medicine. Shehas developed a Prescription Education Program for elementary and middle schools and wasnamed in PharmaVoice in 2008 as one of the 100 most inspiring individuals in the life-sciencesindustry.Dr. Kush has served on the Board of Directors for the U.S. Health Information TechnologyStandards Panel (HITSP), Drug Information Association (DIA) and currently Health Level 7 12
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders Roundtable on Translating Genomic-Based Research for Health National Cancer Policy Forum(HL7), and she was a member of the advisory committee for the WHO International ClinicalTrials Registry Platform. Dr. Kush served on the appointed Planning Committee for theHHS/ONC-sponsored Workshop Series on the “Digital Infrastructure for the Learning HealthSystem” for the National Academy of Sciences Institute of Medicine (IOM). She is a member ofthe National Cancer Advisory Board IT Workgroup and was invited to represent research as anappointed member of the U.S. Health Information Technology (HIT) Standards Committee.Dr. Kush has developed a course “A Global Approach to Accelerating Medical Research” andhas been a keynote speaker at numerous conferences in this arena in Europe, U.S., Brazil,Japan, China, Korea and Australia.Elizabeth Loder, M.D., M.P.H.Dr. Loder is a senior research editor at BMJ. She is also chief of the Division of Headache andPain in the Department of Neurology at Brigham and Women’s Hospital, Boston, and anAssociate Professor of Neurology at Harvard Medical School. She is board-certified in InternalMedicine and Headache Medicine. Dr. Loder is the current President of the American HeadacheSociety and has held leadership positions in national and international pain organizations andserved as a member of the recent Institute of Medicine Committee on Advancing Pain Care,Research, and Education. Dr. Loder has worked as a clinician and researcher in the headachefield for many years, and has extensive experience in the design and conduct of clinical trials.As a result of these experiences Dr. Loder is deeply interested in problems associated withclinical trial reporting and interpretation, especially the consequences of missing clinical trialinformation for guidelines and treatment decisions. She has devoted much of her recent careerto the development, implementation and dissemination of standards for research reporting. Shewas responsible for a recent BMJ theme issue on unpublished clinical trial data. Researchpublished in that issue of the journal has led several US lawmakers to question whether the NIHand FDA are doing enough to enforce research reporting requirements.Deven McGraw, J.D.Ms. McGraw is the Director of the Health Privacy Project at CDT. The Project is focused ondeveloping and promoting workable privacy and security protections for electronic personalhealth information.Ms. McGraw is active in efforts to advance the adoption and implementation of healthinformation technology and electronic health information exchange to improve health care. Shewas one of three persons appointed by Kathleen Sebelius, the Secretary of the U.S.Department of Health & Human Services (HHS), to serve on the Health Information Technology(HIT) Policy Committee, a federal advisory committee established in the American Recoveryand Reinvestment Act of 2009. She co-chairs the Committee’s Privacy and Security “TigerTeam” and serves as a member of its Meaningful Use, Information Exchange, and StrategicPlan Workgroups. She also served on two key workgroups of the American Health InformationCommunity (AHIC), the federal advisory body established by HHS in the Bush Administration todevelop recommendations on how to facilitate use of health information technology to improvehealth. Specifically, she co-chaired the Confidentiality, Privacy and Security Workgroup and wasa member of the Personalized Health Care Workgroup. She also served on the Policy SteeringCommittee of the eHealth Initiative and now serves on its Leadership Council. She is also onthe Steering Group of the Markle Foundation’s Connecting for Health multi-stakeholder initiative.Ms. McGraw has a strong background in health care policy. Prior to joining CDT, Ms. McGrawwas the Chief Operating Officer of the National Partnership for Women & Families, providingstrategic direction and oversight for all of the organization’s core program areas, including the 13
    • promotion of initiatives to improve health care quality. Ms. McGraw also was an associate in thepublic policy group at Patton Boggs, LLP and in the health care group at Ropes & Gray. Shealso served as Deputy Legal Counsel to the Governor of Massachusetts and taught in theFederal Legislation Clinic at the Georgetown University Law Center.Ms. McGraw graduated magna cum laude from the University of Maryland. She earned herJ.D., magna cum laude, and her L.L.M. from Georgetown University Law Center and wasExecutive Editor of the Georgetown Law Journal. She also has a Master of Public Health fromJohns Hopkins School of Hygiene and Public Health.Meredith Nahm, Ph.D.Dr. Nahm provides oversight and coordination of clinical research informatics projectsundertaken by the Core, including DTMIs provision of infrastructure for clinical research datacollection and management. She also oversees informatics for the MURDOCK communityregistry and the Duke Clinical Research Unit, as well as involvement in national efforts todevelop and implement data standards. Additionally, Dr. Nahm supports efforts to advance theapplication of informatics capabilities to enhance medical practice. Through her work andassociated collaborations, Dr. Nahm helps to shape Dukes Clinical Research Informaticsstrategy and direction.Prior to her appointment as the Associate Director for Clinical Research Informatics, Dr. Nahmserved as the Director of Clinical Data Integration at the Duke Clinical Research Institute 2001and has been with the DCRI since 1998. She has over fifteen years of experience in clinicalresearch informatics and quality control.Dr. Nahm has held numerous industry leadership roles, including serving as a member of theBoard of Trustees of the Society for Clinical Data Management (SCDM), Chair of the SCDMGood Clinical Data Management Practices (GCDMP) Committee, Chair of the Clinical DataInterchange Standards Consortium (CDISC) Industry Advisory Board, and the datamanagement content expert for the NIH-sponsored National Leadership Forum for ClinicalResearch Networks.Dr. Nahm authored the GCDMP sections on Measuring and Assuring Data Quality and hasplayed a leadership role in data standards development and implementation efforts. She iscurrently working with the data standards development efforts in Cardiology and Tuberculosison two NIH Roadmap projects, and the Data and Statistical Center for the National Institute onDrug Abuse Clinical Trials Network. She is a frequent presenter at industry meetings, andpublishes in the clinical trial operational literature. Her research interests include quantitativeevaluation of common clinical data management practices and data quality. She received herMaster’s in Nuclear Engineering from North Carolina State University, and her PhD at theUniversity of Texas at Houston School of Health Information Sciences.Jeffrey S. Nye, M.D., Ph.D.Dr. Nye in his role as the Vice President and Global Head, Neuroscience External Innovation,Janssen Pharmaceutical Companies of Johnson & Johnson, is responsible for the strategy andimplementation of external R&D in the neurosciences, aiming to expand the pipeline with novelbusiness relationships, venture, academic and public-private partnerships and risk-sharedoutsourcing. Previously, Dr. Nye was Chief Medical Officer and co-head of the East CoastResearch and Early Development unit at J&J pharma. Prior roles include VP of ExperimentalMedicine, VP of compound development (CDTL) for Galantamine (Razadyne/Reminyl), a 14
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders Roundtable on Translating Genomic-Based Research for Health National Cancer Policy Forumtherapy for Alzheimer’s disease, and for Topamax, an epilepsy and migraine medicine. Hepreviously served as a Director of CNS Discovery Genomics and Biotechnology at Pharmacia.Prior to joining the pharmaceutical industry, Dr. Nye was a tenured associate professor ofMolecular Pharmacology, Biological Chemistry and Pediatrics (Neurology) at NorthwesternUniversity and Children’s Memorial Hospital. He received his bachelor’s degree in biochemistryand master’s degree in pharmacology from Harvard, and his MD and PhD (with SolomonSnyder) from the Johns Hopkins University School of Medicine. He served as a pediatricsresident, postdoctoral fellow (with Richard Axel) and assistant professor of Pediatrics inNeurology at the College of Physicians and Surgeons of Columbia University.Dr. Nye has published over 60 peer reviewed papers on topics spanning basic and clinicalresearch, and has led pharma programs in discovery through clinical development. He hasserved on numerous study sections and advisory panels, and currently serves as a councilor ofthe British Association of Psychopharmacology, and is the co-chair of the New York AcademyAlzheimer’s Leadership council.Sally Okun, R.N., M.M.H.S.Ms. Okun is a member of the Research, Data and Analytics Team at PatientLikeMe, Inc. As theHead of Health Data Integrity and Patient Safety she is responsible for the site’s medicalontology and the integrity of patient reported health data. She also developed and oversees thePatientsLikeMe Drug Safety and Pharmacovigilance Platform.Prior to joining PatientsLikeMe, Ms. Okun practiced as a palliative care specialist. In addition toworking with patients and families she was an independent consultant supporting multi-yearclinical, research, and education projects focused on palliative and end of life care for numerousclients including Brown University, Harvard Medical School, MA Department of Mental Healthand the Robert Wood Johnson Foundation.Ms. Okun is a member of the Institute of Medicine’s (IOM) Clinical Effectiveness ResearchInnovation Collaborative, the Evidence Communication Innovation Collaborative and the BestPractices Innovation Collaborative. She is a contributing author on two IOM discussion papers:Principles and Values for Team-based Healthcare and Communicating Evidence in HealthCare: Engaging Patients for Improved Health Care Decisions.Ms. Okun received her Masters degree from The Heller School for Social Policy & Managementat Brandeis University; completed study of Palliative Care and Ethics at Memorial Sloan-Kettering Cancer Center; and was a National Library of Medicine Fellow in BiomedicalInformatics.Eric D. Perakslis, Ph.D.Dr. Perakslis is currently Chief Information Officer and Chief Scientist (Informatics) at the U.S.Food and Drug Administration. In this new role, Eric is responsible for modernizing andenhancing the IT capabilities as well as the in silico scientific capabilities at FDA.Prior to FDA, Dr. Perakslis was Senior Vice President of R&D Information Technology atJohnson & Johnson Pharmaceuticals R&D and was a member of the Corporate Office ofScience and Technology. During his thirteen years at J&J, Eric also held the posts of VicePresident R&D Informatics, Vice President and Chief Information Officer, Director of ResearchInformation Technology as well as assistant Director and Director of Drug Discovery Research 15
    • prior to his current role. Before joining J&J, he was the Group leader of Scientific Computing atArQule Inc. and he began his professional career with the Army Corps of Engineers.Dr. Perakslis has a Ph.D. in chemical and biochemical engineering from Drexel University andalso holds B.S.Che and M.S. degrees in chemical engineering. His current research interestsare enterprise knowledge management, patient stratification, healthcare IT and translationalinformatics with the specific focus precompetitive data sharing, and open source systemsglobalization.Dr. Perakslis is a late-stage kidney cancer survivor and an avid patient advocate. He has servedas the Chairman of the Survivor Advisory Board at the Cancer Institute of New Jersey and asthe Chief Information Officer of the King Hussein Institute for Biotechnology and Cancer inAmman, Jordan. Eric has also worked extensively with the Lance Armstrong Foundation, theKidney Cancer Association, the Scientist ↔ Survivor program of the American Association forCancer Research, OneMind4Research and several other top non-profit disease-basedorganizations to further their domestic and international agendas.Richard Platt, M.D., M.Sc.Dr. Platt is a Professor and Chair of the Department of Population Medicine, and ExecutiveDirector of the Harvard Pilgrim Health Care Institute. He is an internist trained in infectiousdiseases and epidemiology.His research focuses on developing multi-institution automated record linkage systems for usein pharmacoepidemiology, and for population based surveillance, reporting, and control of bothhospital and community acquired infections.He is principal investigator of the FDAs Mini-Sentinel program, of a CDC Center of Excellencein Public Health Informatics, of the AHRQ HMO Research Network DEcIDE center, and of aCDC Prevention Epicenter. He is a member of the Institute of Medicine Roundtable on Value &Science-Driven Health Care, and of the Association of American Medical Colleges’ AdvisoryPanel on Research. Dr. Platt formerly chaired the FDA Drug Safety and Risk ManagementAdvisory Committee, an NIH study section (Epidemiology and Disease Control 2), the CDCOffice of Health Care Partnerships steering committee, and co-chaired the Board of ScientificCounselors of the CDC National Center for Infectious Diseases.William Z. Potter, M.D., Ph.D.Dr. Potter earned his B.A., M.S., M.D., and Ph.D. at Indiana University, after which hefunctioned in positions of increasing responsibility and seniority over the next twenty-five yearsat the National Institutes of Health (NIH) focused on translational neuroscience. While at theNIH, Dr. Potter was widely published and appointed to many societies, committees, and boards;a role which enabled him to develop a wide reputation as an expert in psychopharmacologicalsciences and championing the development of novel treatments for CNS disorders.Dr. Potter left the NIH in 1996 to accept a position as Executive Director and Research Fellow atLilly Research Labs, specializing in the Neuroscience Therapeutic Area and in 2004 joinedMerck Research Labs (MRL) as VP of Clinical Neuroscience, then the newly created position ofTranslational Neuroscience in 2006, a position from which he retired in January of this year. Hisexperience at Lilly and MRL in identifying, expanding and developing methods of evaluatingCNS effects of compounds in human brain cover state of the art approaches across multiplemodalities. These include brain imaging and cerebrospinal fluid proteomics (plus metabolomics)as well as development of more sensitive clinical, psychophysiological and performance 16
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders Roundtable on Translating Genomic-Based Research for Health National Cancer Policy Forummeasures allowing a range of novel targets to be tested in a manner which actually addressesthe underlying hypotheses. Bill has become a widely recognized champion for the position thatmore disciplined hypothesis testing of targets in humans is the best near term approach tomoving CNS drug development forward for important neurologic and psychiatric illnesses.Jonathan Rabinowitz, Ph.D.Prof. Rabinowitz is The Elie Wiesel Professor and a former Department Chairman at Bar IlanUniversity. He also holds a visiting professorship in the Department of Psychiatry at MountSinai School of Medicine in New York. His current research focuses on understanding andminimizing placebo response in antipsychotic and antidepressant trials and developingstatistical and methodological ways of improving efficiency of CNS drug trials. His researchhas been supported by NIH, BMBF, Innovative Medicines Initiative, European CommissionEuropean Union, NARSAD and major pharmaceutical companies. He serves as a member ofthe editorial board of European Neuropsychopharmacology and is a member of the InternationalBiometric Society, International Society for Schizophrenia Research and the European Collegeof Neuropsychopharmacology. He serves on the Clinical Expert Review Committee for the FDA(Food and Drug Administration) data standards development initiative that will standardizeclinical data elements for regulatory submission. He is a member of the DIA (Drug InformationAssociation) scientific working group on missing data in clinical trials. He has published over150 scientific papers.Frank W. Rockhold, Ph.D.Dr. Rockhold is currently Senior Vice President, Global Clinical Safety and Pharmacovigilanceat GlaxoSmithKline Pharmaceuticals Research and Development. This includes casemanagement, signal detection, safety evaluation, risk management for all development andmarketed products. He is also co-chair of the GSK Global Safety Board. In his 20 years atGSK he has also held management positions within the Statistics & Epidemiology Departmentand Clinical Operations both in R&D and in the U.S. Pharmaceutical Business, and mostrecently as interim head of the Cardiovascular and Metabolic Medicines Development Center.Frank has previously held positions of Research Statistician, Lilly Research Laboratories (1979-1987) and Executive Director of Biostatistics, Data Management, and Health Economics, MerckResearch Laboratories, (1994-1997).He has BA in Statistics, from the University of Connecticut, Sc.M. in Biostatistics, from JohnsHopkins University, and a Ph.D. in Biostatistics, from the Medical College of Virginia.He has held several academic appointments in his career at Butler University, IndianaUniversity, Adjunct Professor of Health Evaluation Sciences, Penn State University and AdjunctScholar in the Department of Epidemiology and Biostatistics at the University of Pennsylvania.Frank is currently Affiliate Professor of Biostatistics at the Virginia Commonwealth UniversityMedical Center.Frank is currently Past Chairman of the Board of Directors of the Clinical Data InterchangeStandards Consortium (CDISC) and a member of the National Library of Medicine Advisorygroup for the ClinicalTrials.Gov. He is Past-President, Society for Clinical Trials, Past Chair,PhRMA Biostatistics Steering Committee, and a member of the ICH E-9 and E-10 ExpertWorking Groups and previously served as Associate Editor for Controlled Clinical Trials.He is a Fellow of the American Statistical Association and a Fellow of the Society for ClinicalTrials. Frank is also a recipient of the PhRMA Career Achievement award. He has over 150publications and abstracts. 17
    • Laura Lyman Rodriguez, Ph.D.Dr. Rodriguez is the Director of the Office of Policy, Communications, and Education at theNational Human Genome Research Institute (NHGRI), National Institutes of Health (NIH). Sheworks to develop and implement policy for research initiatives at the NHGRI, designcommunication and outreach strategies to engage the public in genomic science, and preparehealth care professionals for the integration of genomic medicine into clinical care. Laura isparticularly interested in the policy and ethics questions related to the inclusion of humanresearch participants in genomics and genetics research and sharing human genomic datathrough broadly used research resources (e.g., databases). Among other activities, Dr.Rodriguez has provided leadership for many of the policy development activities pertaining togenomic data sharing and the creation of the database for Genotypes and Phenotypes (dbGaP)at the NIH. Dr. Rodriguez received her bachelor of science with honors in biology fromWashington and Lee University in Virginia and earned a doctorate in cell biology from BaylorCollege of Medicine in Texas.Michael Rosenblatt, M.D.Dr. Rosenblatt is Executive Vice President and Chief Medical Officer of Merck & Co., Inc. He isthe first person to serve in this role for Merck. Previously he served as Dean of Tufts UniversitySchool of Medicine. Prior to that, he held the appointment of George R. Minot Professor ofMedicine at Harvard Medical School and Chief of the Division of Bone and Mineral MetabolismResearch at Beth Israel Deaconess Medical Center (BIDMC). He served as the President ofBIDMC from 1999-2001. Previously, he was the Harvard Faculty Dean and Senior VicePresident for Academic Programs at CareGroup and BIDMC and a founder of the Carl J.Shapiro Institute for Education and Research at Harvard Medical School and BIDMC, a jointventure whose mission is to manage the academic enterprise and promote academicinnovation.Prior to that, he served as Director of the Harvard-MIT Division of Health Sciences andTechnology, during which time he led a medical education organization for M.D., Ph.D., andM.D.-Ph.D. training jointly sponsored by Harvard and MIT. And earlier, he was Senior VicePresident for Research at Merck Sharp & Dohme Research Laboratories where he co-led theworldwide development team for alendronate (FOSAMAX), Mercks bisphosphonate forosteoporosis and bone disorders. In addition, he directed drug discovery efforts in molecularbiology, bone biology and calcium metabolism, virology, cancer research, lipid metabolism, andcardiovascular research in the United States, Japan, and Italy. In leading most of Mercksinternational research efforts, he established two major basic research institutes, one inTsukuba, Japan, and one near Rome, Italy. He also headed Merck Researchs worldwideUniversity and Industry Relations Department.He is the recipient of the Fuller Albright Award for his work on parathyroid hormone and theVincent du Vigneaud Award in peptide chemistry and biology, and the Chairman’s Award fromMerck. His research is in the field of hormonal regulation of calcium metabolism, osteoporosis,and cancer metastasis to bone. His major research projects are in the design of peptidehormone antagonists for parathyroid hormone and the tumor-secreted parathyroid hormone-likeprotein, isolation/characterization of receptors and mapping hormone—receptor interactions,elucidating the mechanisms by which breast cancer “homes” to bone, and osteoporosis andbone biology.He has been an active participant in the biotechnology industry, serving on the board ofdirectors and scientific advisory boards of several biotech companies. He was a scientificfounder of ProScript, the company that discovered bortezomib (Velcade), now Millennium 18
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders Roundtable on Translating Genomic-Based Research for Health National Cancer Policy ForumPharmaceutical’s drug for multiple myeloma and other malignancies. He was a member of theBoard of Scientific Counselors of the National Institute of Diabetes and Digestive and KidneyDiseases of the NIH. He has been elected to the American Society of Clinical Investigation, theAssociation of American Physicians, to Fellowship in the American Association for theAdvancement of Science and the American College of Physicians, and the presidency of theAmerican Society of Bone and Mineral Research. He has testified before a Senate Hearing onU.S. biomedical research priorities in 1997, and in 2011 as a consultant to the U.S. PresidentsCouncil of Advisors on Science and Technology.From 1981 to 1984, he served as Chief of the Endocrine Unit, Massachusetts General Hospital.He received his undergraduate degree summa cum laude from Columbia and his M.D. magnacum laude from Harvard. His internship, residency, and endocrinology training were all at theMassachusetts General Hospital.Vicki L. Seyfert-Margolis, Ph.D.Dr. Seyfert-Margolis is Senior Advisor for Science Innovation and Policy in the FDACommissioner’s Office. Dr. Seyfert-Margolis focuses on initiatives in innovation, regulatoryscience, personalized medicine, scientific computing and health informatics. Previously, sheserved as Chief Scientific Officer at Immune Tolerance Network (ITN), a non-profit consortiumof researchers seeking new treatments for diseases of the immune system. At ITN, she oversawthe development of more than 20 centralized laboratory facilities, and the design and executionof biomarker discovery studies for over 25 Phase II clinical trials. As part of the biomarker effortsshe established construction of a primer library of 1,000 genes that may be involved inestablishing and maintaining immunologic tolerance and co-discovered genes that may markkidney transplant tolerance. Dr. Seyfert-Margolis was also an Adjunct Associate Professor withthe Department of Medicine at the University of California San Francisco. Prior to academia,she served as Director of the Office of Innovative Scientific Research Technologies at theNational Institute of Allergy and Infectious Diseases at NIH, where she worked to integrateemerging technologies into existing immunology and infectious disease programs. Dr. Seyfert-Margolis completed her PhD in immunology at the University of Pennsylvania’s School ofMedicine.Mary Ann SlackMs. Slack has over 25 years’ background in technology and informatics in both the public andprivate sectors. Since joining FDA’s Center for Drug Evaluation and Research (CDER) in 2003,Ms Slack has led a number of efforts in implementing informatics solutions to businessproblems. She serves as an FDA expert on the International Conference on Harmonization’s(ICH) multidisciplinary working groups charged with establishing data exchange standards andstructures in support of drug and biologic regulatory review, and is actively engaged in thedevelopment of data standards to meet regulatory needs. Ms. Slack currently leads CDER’sdata standards development program.Jay M. Tenenbaum, Ph.D.Dr. Tenenbaum is the Founder and Chairman of Cancer Commons, a non-profit, open sciencecommunity that compiles and continually refines information about cancer subtypes andtreatments, based on the literature and actual patient outcomes.Dr. Tenenbaum’s background brings a unique perspective of a world-renowned Internetcommerce pioneer and visionary. He was founder and CEO of Enterprise IntegrationTechnologies, the first company to conduct a commercial Internet transaction (1992), secure 19
    • Web transaction (1993) and Internet auction (1993). In 1994, he founded CommerceNet toaccelerate business use of the Internet. In 1997, he co-founded Veo Systems, the company thatpioneered the use of XML for automating business-to-business transactions. Dr. Tenenbaumjoined Commerce One in January 1999, when it acquired Veo Systems. As Chief Scientist, hewas instrumental in shaping the companys business and technology strategies for the GlobalTrading Web. Post Commerce One, Dr. Tenenbaum was an officer and director of WebifySolutions, which was sold to IBM in 2006, and Medstory, which was sold to Microsoft in 2007.Dr. Tenenbaum was also the Founder and Chairman of CollabRx, a provider of Web-Basedapplications and services that help cancer patients and their physicians select optimaltreatments and trials, which was acquired by Tegal in 2012.Earlier in his career, Dr. Tenenbaum was a prominent AI researcher and led AI research groupsat SRI International and Schlumberger Ltd. Dr. Tenenbaum is a fellow and former boardmember of the American Association for Artificial Intelligence, and a former consulting professorof Computer Science at Stanford. He currently serves as a director of Efficient Finance,Patients Like Me, and the Public Library of Science, and is a consulting professor of InformationTechnology at Carnegie Mellons new West Coast campus. Dr. Tenenbaum holds B.S. andM.S. degrees in Electrical Engineering from MIT, and a Ph.D. from Stanford.Sharon F. Terry, M.A.Ms. Terry is President and CEO of the Genetic Alliance, a network of more than 10,000organizations, 1,200 of which are disease advocacy organizations. Genetic Alliance improveshealth through the authentic engagement of communities and individuals. It develops innovativesolutions through novel partnerships, connecting consumers to smart services.She is the founding CEO of PXE International, a research advocacy organization for the geneticcondition pseudoxanthoma elasticum (PXE). As co-discoverer of the gene associated with PXE,she holds the patent for ABCC6 and has assigned her rights to the foundation. She developeda diagnostic test and is conducting clinical trials.Ms. Terry is also a co-founder of the Genetic Alliance Registry and Biobank. She is the authorof more than 90 peer-reviewed articles. In her focus at the forefront of consumer participation ingenetics research, services and policy, she serves in a leadership role on many of the majorinternational and national organizations, including the Institute of Medicine Science and PolicyBoard, the National Coalition for Health Professional Education in Genetics board, theInternational Rare Disease Research Consortium Interim Executive Committee, and is amember of the IOM Roundtable on Translating Genomic-Based Research for Health. She is onthe editorial boards of several journals. She was instrumental in the passage of the GeneticInformation Nondiscrimination Act. In 2005, she received an honorary doctorate from IonaCollege for her work in community engagement; the first Patient Service Award from the UNCInstitute for Pharmacogenomics and Individualized Therapy in 2007; the Research!AmericaDistinguished Organization Advocacy Award in 2009; and the Clinical Research Forum andFoundation’s Annual Award for Leadership in Public Advocacy in 2011. She is an AshokaFellow.Andrew Vickers, Ph.D.Dr. Vickers is Attending Research Methodologist in the Department of Epidemiology andBiostatistics at Memorial Sloan-Kettering Cancer Center. He obtained a First in Natural Sciencefrom the University of Cambridge and has a doctorate in clinical medicine from the University ofOxford. 20
    • Forum on Drug Discovery, Development, and Translation Forum on Neuroscience and Nervous System Disorders Roundtable on Translating Genomic-Based Research for Health National Cancer Policy ForumDr. Vickers clinical research falls into three broad areas: randomized trials, surgical outcomesresearch and molecular marker studies. A particular focus of his work is the detection and initialtreatment of prostate cancer. Dr Vickers has analyzed the learning curve for radicalprostatectomy, is working on a series of studies demonstrating that a single measure ofProstate Specific Antigen (PSA) taken in middle age can predict prostate cancer up to 25 yearssubsequently and has developed a statistical model to predict the result. His work onrandomized trials focuses on methods for integrating randomized trials into routine surgicalpractice so as to compare different approaches to surgery. As part of this work, he haspioneered the use of web-interfaces for obtaining quality of life data from patients recoveringfrom radical prostatectomy.Dr. Vickers methodological research centers primarily on novel methods for assessing theclinical value of predictive tools. In particular, he has developed decision-analytic tools that canbe directly applied to a data set, without the need for data gathering on patient preferences orutilities.Dr. Vickers has a strong interest in teaching statistics. He is course leader for the MSKCCbiostatistics course, teaches on the undergraduate curriculum at Weill Medical College ofCornell University and he writes the statistics column for Medscape. He is author of theintroductory statistics textbook “What is a p-value anyway? 34 stories to help you actuallyunderstand statistics.”John A. Wagner, M.D., Ph.D.Dr. Wagner received his M.D. from Stanford University School of Medicine and his Ph.D. fromthe Johns Hopkins University School of Medicine. Postgraduate training included InternalMedicine Internship and Residency, as well as Molecular and Clinical PharmacologyPostdoctoral Fellowships. Currently, Dr. Wagner is Vice President and Head, EarlyDevelopment Pipeline and Projects; Head, Global Project Management at Merck & Co., Inc.Previous, he was Vice President and Head, Clinical Pharmacology, at Merck & Co., Inc andActing Modeling and Simulation Integrator, Strategically Integrated Modeling and Simulation.Dr. Wagner is also an Adjunct Assistant Professor, Division of Clinical Pharmacology,Department of Medicine, Jefferson Medical College, Thomas Jefferson University and VisitingClinical Scientist, Harvard - Massachusetts Institute of Technology Division of Health Sciencesand Technology, Center for Experimental Pharmacology and Therapeutics. He is the past chairof the PhRMA Clinical Pharmacology Technical Group, past chair of the adiponectin work groupfor the Biomarkers Consortium, past committee member of the National Academies Institute ofMedicine Committee on Qualification of Biomarkers and Surrogate Endpoints in ChronicDisease, current member of the Board of Directors and current chair of the 2012 ScientificProgram Committee for American Society for Clinical Pharmacology and Therapeutics, andcurrent full member of the National Academies Institute of Medicine National Cancer PolicyForum. Over 150 peer-reviewed publications include work in the areas of biomarkers andsurrogate endpoints, experimental medicine, M&S, pharmacokinetics, pharmacodynamics,precompetitive collaboration, and drug-drug interactions across a variety of therapeutic areas.John Wilbanks, B.A.Mr. Wilbanks is a Senior Fellow in Entrepreneurship with the Ewing Marion KauffmanFoundation. He works on open content, open data, and open innovation systems. Wilbanks alsoserves as a Research Fellow at Lybba. He’s worked at Harvard Law School, MIT’s ComputerScience and Artificial Intelligence Laboratory, the World Wide Web Consortium, the US Houseof Representatives, and Creative Commons, as well as starting a bioinformatics company. He 21
    • sits on the Board of Directors for Sage Bionetworks and iCommons, and the advisory board forBoundless Learning.Mr. Wilbanks holds a degree in philosophy from Tulane University and also studied modernletters at the University of Paris (La Sorbonne).Janet Woodcock, M.D.Dr. Woodcock is the Director, Center for Drug Evaluation and Research, Food and DrugAdministration (FDA). Dr. Woodcock held various leadership positions within the Office of theCommissioner, FDA including Deputy Commissioner and Chief Medical Officer, DeputyCommissioner for Operations and Chief Operating Officer and Director, Critical Path Programs.Previously, Dr. Woodcock served as Director, Center for Drug Evaluation and Research from1994-2005. She also held other positions at FDA including Director, Office of TherapeuticsResearch and Review and Acting Deputy Director, Center for Biologics Evaluation andResearch. A prominent FDA scientist and executive, Dr. Woodcock has received numerousawards, including a Presidential Rank Meritorious Executive Award, the American MedicalAssociations Nathan Davis Award, and Special Citations from FDA Commissioners. Dr.Woodcock received her M.D. from Northwestern Medical School, completed further training andheld teaching appointments at the Pennsylvania State University and the University of Californiain San Francisco. She joined FDA in 1986.Deborah A. Zarin, M.D.Dr. Zarin is the Director, ClinicalTrials.gov, and Assistant Director for Clinical Research Projectsat the Lister Hill National Center for Biomedical Communications in the National Library ofMedicine. In this capacity, Dr. Zarin oversees the development and operation of an internationalregistry of clinical trials. Previous positions held by Dr. Zarin include the Director, TechnologyAssessment Program, at the Agency for Healthcare Research and Quality, and the Director ofthe Practice Guidelines program at the American Psychiatric Association. In these positions, Dr.Zarin conducted systematic reviews and related analyses in order to develop clinical and policyrecommendations. Dr. Zarin’s academic interests are in the area of evidence based clinical andpolicy decision making, and she is the author of over 70 peer reviewed articles. Dr. Zaringraduated from Stanford University and received her doctorate in medicine from HarvardMedical School. She completed a clinical decision making fellowship, and is board certified ingeneral psychiatry, as well as in child and adolescent psychiatry. 22
    • visualizations. Research scientists need to work References and Notes 7. C. Bizer, T. Heath, K. Idehen, T. Berners-Lee, “Linked data more closely with their computing colleagues to 1. National Center for Atmospheric Research (NCAR), on the Web (LDOW2008),” in Proceedings of the 17th University Corporation for Atmospheric Research (UCAR), International Conference on World Wide Web, Beijing, 21 make sure that these needs are met and that the Towards a Robust, Agile, and Comprehensive Information to 25 April 2008. development of new analytic methods tied to sci- Infrastructure for the Geosciences: A Strategic Plan for 8. G. Williams, J. Weaver, M. Atre, J. Hendler, entific, as opposed to business, analysis is pursued. High-Performance Simulation (NCAR, UCAR, Boulder, CO, J. Web Semant. 8, 365 (2010). Finally, we must work together to explore 2000); www.ncar.ucar.edu/Director/plan.pdf. 9. R. Lengler, M. J. Epler, A Periodic Table of Visualization 2. T. Hey, S. Tansley, K. Tolle, Eds. The Fourth Paradigm: Methods, available at www.visual-literacy.org/ new ways of scaling easy-to-generate visualiza- Data-Intensive Scientific Discovery (Microsoft External periodic_table/periodic_table.html. tions to the data-intensive needs of current scien- Research, Redmond, WA, 2009). 10. We note that this is also a good example of our tific pursuits. (Figure 2 shows the use of standard 3. We use the term “immersive environments” to include general point about the need to integrate projection techniques as a low-cost alternative to a number of high-end technologies for the visualization of visualization into the exploration phase of science; complex, large-scale data. The term “Cave Automatic Virtual identifying this problem earlier would have expensive high-end technologies.) Although there enabled substantially higher-quality data collection Environment (CAVE)” is often used for these environments are scientific problems that do call for specialized based on the work of Cruz-Neira et al. (13). A summary of over the period shown. visualizers, many do not. By bringing visualiza- existing projects and technologies can be found at http:// 11. S. K. Card, J. D. Mackinlay, B. Shneiderman, Readings tion into everyday use in our laboratories, we can en.wikipedia.org/wiki/Cave_Automatic_Virtual_Environment. in Information Visualization: Using Vision to Think 4. 2010 Web usage estimate from Internet World Stats, (Morgan Kaufmann, San Francisco, 1999). better understand the requirements for the design www.internetworldstats.com/. 12. A. Perer, B. Shneiderman, “Integrating statistics and of new tool kits, and we can learn to share and visualization: Case studies of gaining clarity during Downloaded from www.sciencemag.org on February 13, 2012 5. This general term accounts for a number of different maintain visualization workflows and products the research approaches; the Web site http://nosql-database. exploratory data analysis,” ACM Conference on Human way we share other scientific systems. A side effect org/ maintains links to ongoing blogs and discussions on Factors in Computing Systems (CHI 2008), Florence, Italy, may well be the lowering of barriers (such as costs this topic. 5 to 10 April 2008. 6. R. Magoulas, B. Lorica, Introduction to Big Data, Release 13. C. Cruz-Neira, D. J. Sandin, T. A. DeFanti, R. V. Kenyon, and accessibility) to more sophisticated visualiza- 2.0, issue 11 (O’Reilly Media, Sebastopol, CA, 2009); J. C. Hart, Commun. ACM 35, 64 (1992). tion of increasingly larger data sets, a crucial func- http://radar.oreilly.com/2009/03/big-data-technologies- tionality for today’s data-intensive scientist. report.html. 10.1126/science.1197654 no matter how accomplished, a single neuron can PERSPECTIVE never perceive beauty, feel sadness, or solve a mathematical problem. These capabilities emerge Challenges and Opportunities only when networks of neurons work together. Ensembles of brain cells, often quite far-flung, form in Mining Neuroscience Data integrated neural circuits, and the activity of the network as a whole supports specific brain func- Huda Akil,1* Maryann E. Martone,2 David C. Van Essen3 tions such as perception, cognition, or emotions. Moreover, these circuits are not static. Environ- Understanding the brain requires a broad range of approaches and methods from the domains of mental events trigger molecular mechanisms of biology, psychology, chemistry, physics, and mathematics. The fundamental challenge is to decipher neuroplasticity that alter the morphology and con- the “neural choreography” associated with complex behaviors and functions, including thoughts, nectivity of brain cells. The strengths and pattern memories, actions, and emotions. This demands the acquisition and integration of vast amounts of of synaptic connectivity encode the “software” of data of many types, at multiple scales in time and in space. Here we discuss the need for brain function. Experience, by inducing changes neuroinformatics approaches to accelerate progress, using several illustrative examples. The nascent in that connectivity, can substantially alter the field of “connectomics” aims to comprehensively describe neuronal connectivity at either a macroscopic function of specific circuits during development level (in long-distance pathways for the entire brain) or a microscopic level (among axons, dendrites, and throughout the life span. and synapses in a small brain region). The Neuroscience Information Framework (NIF) encompasses A grand challenge in neuroscience is to eluci- all of neuroscience and facilitates the integration of existing knowledge and databases of many types. date brain function in relation to its multiple layers These examples illustrate the opportunities and challenges of data mining across multiple tiers of of organization that operate at different spatial and neuroscience information and underscore the need for cultural and infrastructure changes if temporal scales. Central to this effort is tackling neuroinformatics is to fulfill its potential to advance our understanding of the brain. “neural choreography”: the integrated functioning of neurons into brain circuits, including their spatial eciphering the workings of the brain is forming complex thoughts. They are also essen- organization, local, and long-distance connections; D the domain of neuroscience, one of the most dynamic fields of modern biology. Over the past few decades, our knowledge about tial for uncovering the causes of the vast array of brain disorders, whose impact on humanity is staggering (1). To accelerate progress, it is vital to their temporal orchestration; and their dynamic features, including interactions with their glial cell partners. Neural choreography cannot be under- the nervous system has advanced at a remarkable develop more powerful methods for capitalizing stood via a purely reductionist approach. Rather, it pace. These advances are critical for understand- on the amount and diversity of experimental data entails the convergent use of analytical and syn- ing the mechanisms underlying the broad range generated in association with these discoveries. thetic tools to gather, analyze, and mine informa- of brain functions, from controlling breathing to The human brain contains ~80 billion neurons tion from each level of analysis and capture the that communicate with each other via specialized emergence of new layers of function (or dysfunc- 1 The Molecular and Behavioral Neuroscience Institute, Uni- connections or synapses (2). A typical adult brain tion) as we move from studying genes and pro- versity of Michigan, Ann Arbor, MI, USA. 2National Center for has ~150 trillion synapses (3). The point of all this teins, to cells, circuits, thought, and behavior. Microscopy and Imaging Research, Center for Research in Bio- communication is to orchestrate brain activity. logical Systems, University of California, San Diego, La Jolla, CA, The Need for Neuroinformatics USA. 3Deparment of Anatomy and Neurobiology, Washington Each neuron is a piece of cellular machinery that University School of Medicine, St. Louis, MO 63110, USA. relies on neurochemical and electrophysiological The profoundly complex nature of the brain re- *To whom correspondence should be addressed. E-mail: mechanisms to integrate complicated inputs and quires that neuroscientists use the full spectrum akil@umich.edu communicate information to other neurons. But of tools available in modern biology: genetic,708 11 FEBRUARY 2011 VOL 331 1 SCIENCE www.sciencemag.org
    • SPECIALSECTIONcellular, anatomical, electrophysiological, behav- resources that provide a backbone for understand- grants to two consortia (18). The consortium ledioral, evolutionary, and computational. The ex- ing circuit biology and function (8–10). The chal- by Washington University in St. Louis and theperimental methods involve many spatial scales, lenge of integrating such data will dramatically University of Minnesota (19) aims to characterizefrom electron microscopy (EM) to whole-brain increase with the advent of high-throughput ana- whole-brain circuitry and its variability acrosshuman neuroimaging, and time scales ranging tomical methods, including those emerging from individuals in 1200 healthy adults (300 twin pairsfrom microseconds for ion channel gating to the nascent field of connectomics. and their nontwin siblings). Besides diffusionyears for longitudinal studies of human devel- A connectome is a comprehensive description imaging and R-fMRI, task-based fMRI dataopment and aging. An increasing number of in- of neural connectivity for a specified brain region will be acquired in all study participants, alongsights emerge from integration and synthesis at a specified spatial scale (11, 12). Connectomics with extensive behavioral testing; 100 partic-across these spatial and temporal domains. How- currently includes distinct subdomains for studying ipants will also be studied with magnetoenceph-ever, such efforts face impediments related to the the macroconnectome (long-distance pathways alography (MEG) and electroencephalographydiversity of scientific subcultures and differing ap- linking patches of gray matter) and the micro- (EEG). Acquired blood samples will enable geno-proaches to data acquisition, storage, description, connectome (complete connectivity within a single typing or full-genome sequencing of all partici-and analysis, and even to the language in which gray-matter patch). pants near the end of the 5-year project. Currently,they are described. It is often unclear how best to The Human Connectome Project. Until recent- data acquisition and analysis methods are beingintegrate the linear information of genetic se- ly, methods for charting neural circuits in the extensively refined using pilot data sets. Data ac- Downloaded from www.sciencemag.org on February 13, 2012quences, the highly visual data of neuroanatomy, human brain were sorely lacking (13). This sit- quisition from the main cohort will commence inthe time-dependent data of electrophysiology, and uation has changed dramatically with the advent mid-2012.the more global level of analyzing behavior andclinical syndromes. The great majority of neuroscientists carry outhighly focused, hypothesis-driven research thatcan be powerfully framed in the context of knowncircuits and functions. Such efforts are com-plemented by a growing number of projects thatprovide large data sets aimed not at testing aspecific hypothesis but instead enabling data-intensive discovery approaches by the communityat large. Notable successes include gene expressionatlases from the Allen Institute for Brain Sciences(4) and the Gene Expression Nervous SystemAtlas (GENSAT) Project (5), and disease-specific Fig. 1. Schematic illustration of online data mining capabilities envisioned for the HCP. Investigators willhuman neuroimaging repositories (6). However, be able to pose a wide range of queries (such as the connectivity patterns of a particular brain region ofthe neuroscience community is not yet fully en- interest averaged across a group of individuals, based on behavioral criteria) and view the search resultsgaged in exploiting the rich array of data cur- interactively on three-dimensional brain models. Data sets of interest will be freely available forrently available, nor is it adequately poised to downloading and additional offline analysis.capitalize on the forthcoming data explosion. Below we highlight several major endeavors of noninvasive neuroimaging methods. Two com- Neuroimaging and behavioral data from thethat provide complementary perspectives on the plementary modalities of magnetic resonance HCP will be made freely available to the neuro-challenges and opportunities in neuroscience data imaging (MRI) provide the most useful informa- science community via a database (20) and amining. One is a set of “connectome” projects that tion about long-distance connections. One mo- platform for visualization and user-friendly dataaim to comprehensively describe neural circuits dality uses diffusion imaging to determine the mining. This informatics effort involves majorat either the macroscopic or the microscopic level. orientation of axonal fiber bundles in white matter, challenges owing to the large amounts of dataAnother, the Neuroscience Information Framework based on the preferential diffusion of water mol- (expected to be ~1 petabyte), the diversity of data(NIF), encompasses all of neuroscience and pro- ecules parallel to these fiber bundles. Tractography types, and the many possible types of data min-vides access to existing knowledge and databases is an analysis strategy that uses this information to ing. Some investigators will drill deeply by ana-of many types. These and other efforts provide fresh estimate long-distance pathways linking different lyzing high-resolution connectivity maps betweenapproaches to the challenge of elucidating neural gray-matter regions (14, 15). A second modality, all gray-matter locations. Others will explore achoreography. resting-state functional MRI (R-fMRI), is based more compact “parcellated connectome” among on slow fluctuations in the standard fMRI BOLD all identified cortical and subcortical parcels. DataConnectomes: Macroscopic and Microscopic signal that occur even when people are at rest. mining options will reveal connectivity differencesBrain anatomy provides a fundamental three- The time courses of these fluctuations are corre- between subpopulations that are selected by be-dimensional framework around which many lated across gray-matter locations, and the spatial havioral phenotype (such as high versus low IQ)types of neuroscience data can be organized and patterns of the resultant functional connectivity and various other characteristics (Fig. 1). The util-mined. Decades of effort have revealed immense correlation maps are closely related but not iden- ity of HCP-generated data will be enhanced byamounts of information about local and long- tical to the known pattern of direct anatomical close links to other resources containing comple-distance connections in animal brains. A wide connectivity (16, 17). Diffusion imaging and mentary types of spatially organized data, such asrange of tools (such as immunohistochemistry R-fMRI each have important limitations, but to- the Allen Human Brain Atlas (21), which con-and in situ hybridization) have characterized the gether they offer powerful and complementary tains neural gene expression maps.biochemical nature of these circuits that are windows on human brain connectivity. Microconnectomes. Recent advances in serialstudied electrophysiologically, pharmacologically, To address these opportunities, the National section EM, high-resolution optical imagingand behaviorally (7). Several ongoing efforts aim Institutes of Health (NIH) recently launched the methods, and sophisticated image segmentationto integrate anatomical information into searchable Human Connectome Project (HCP) and awarded methods enable detailed reconstructions of the www.sciencemag.org SCIENCE 2 VOL 331 11 FEBRUARY 2011 709
    • microscopic connectome at the level of individ- daily. Over 2000 of these resources are databases tegration, requiring considerable human effort ual synapses, axons, dendrites, and glial processes that range in size from hundreds to millions of to access each resource, understand the context (22–24). Current efforts focus on the reconstruc- records. Many were created at considerable effort and content of the data, and determine the con- tion of local circuits, such as small patches of the and expense, yet most of them remain underused ditions under which they can be compared to cerebral cortex or retina, in laboratory animals. by the research community. other data sets of interest. As such data sets begin to emerge, a fresh set of Clearly, it is inefficient for individual re- To address the terminology problem, the NIF informatics challenges will arise in handling peta- searchers to sequentially visit and explore thou- has assembled an expansive lexicon and ontol- byte amounts of primary and analyzed data and in sands of databases, and conventional online ogy covering the broad domains of neuroscience providing data mining platforms that enable neuro- search engines are inadequate, insofar as they by synthesizing open-access community ontologies scientists to navigate complex local circuits and do not effectively in- (36). The Neurolex and examine interesting statistical characteristics. dex or search database accompanying NIFSTD Micro- and macroconnectomes exemplify content. To promote the (NIF-standardized) on- distinct data types within particular tiers of analy- discovery and use of on- tologies provide defi- sis that will eventually need to be linked. Effective line databases, the NIF nitions of over 50,000 interpretation of both macro- and microconnec- created a portal through concepts, using formal tomic approaches will require novel informatics which users can search languages to represent Downloaded from www.sciencemag.org on February 13, 2012 and computational approaches that enable these not only the NIF registry brain regions, cells, sub- two types of data to be analyzed in a common but the content of mul- cellular structures, mol- framework and infrastructure. Efforts such as the tiple databases simul- ecules, diseases, and Blue Brain Project (25) represent an important taneously. The current functions, and the rela- initial thrust in this direction, but the endeavor NIF federation includes tions among them. will entail decades of effort and innovation. more than 65 databases When users search for Powerful and complementary approaches such accessing ~30 million a concept through the as optogenetics operate at an intermediate (meso- records (35) in major NIF, it automatically ex- connectome) spatial scale by directly perturbing domains of relevance pands the query to in- neural circuits in vivo or in vitro with light-activated to neuroscience (Fig. 2). clude all synonymous ion channels inserted into selected neuronal types Besides very large ge- or closely related terms. (26). Other optical methods, such as calcium im- nomic collections, there For example, a query for aging with two-photon laser microscopy, enable are nearly 1 million anti- “striatum” will include analysis of the dynamics of ensembles of neurons body records, 23,000 “neostriatum, dorsal stri- in microcircuits (27, 28) and can lead to new con- brain connectivity records, atum, caudoputamen, ceptualizations of brain function (29). Such ap- and >50,000 brain activa- caudate putamen” and proaches provide an especially attractive window tion coordinates. Many other variants. on neural choreography as they assess or perturb of these areas are cov- Neurolex terms are the temporal patterns of macro- or microcircuit ered by multiple data- accessible through a activity. bases, which the NIF wiki (37) that allows knits together into a co- users to view, augment, The NIF herent view. Although and modify these con- Connectome-related projects illustrate ways in impressive, this repre- cepts. The goal is to pro- which neuroscience as a field is evolving at the sents only the tip of the vide clear definitions of level of neural circuitry. Other discovery efforts in- iceberg. Most individ- each concept that can clude genome-wide gene expression profiling [for ual databases are under- be used not only by hu- example, (30)] or epigenetic analyses across multi- populated because of mans but by automated ple brain regions in normal and diseased brains. insufficient communi- agents, such as the NIF, This wide range of efforts results in a sharp increase ty contributions. Entire to navigate the com- in the amount and diversity of data being generated, domains of neurosci- plexities of human neu- making it unlikely that neuroscience will be ade- ence (such as electrophys- roscience knowledge. quately served by only a handful of centralized iology and behavior) A key feature is the as- databases, as is largely the case for the genomics are underrepresented as Fig. 2. Current contents of the NIF. The NIF navi- signment of a unique and proteomics community (31). How, then, can compared to genomics gation bar displays the current contents of the NIF resource identifier to we access and explore these resources more effec- and neuroanatomy. data federation, organized by data type and level make it easier for search tively to support the data-intensive discovery en- Ideally, NIF users of the nervous system. The number of records in algorithms to distinguish visioned in The Fourth Paradigm (32)? should be able not only each category is displayed in parentheses. among concepts that Tackling this question was a prime motiva- to locate answers that share the same label. tion behind the NIF (33). The NIF was launched are known but to mine available data in ways For example, nucleus (part of cell) and nucleus in 2005 to survey the current ecosystem of neuro- that spur new hypotheses regarding what is not (part of brain) are distinguished by unique IDs. science resources (databases, tools, and materials) known. Perhaps the single biggest roadblock to Using these identifiers in addition to natural lan- and to establish a resource description framework this higher-order data mining is the lack of stan- guage to reference concepts in databases and and search strategy for locating, accessing, and dardized frameworks for organizing neuroscience publications, although conceptually simple, is an using digital neuroscience-related resources (34). data. Individual investigators often use terminol- especially powerful means for making data The NIF catalog, a human-curated registry of ogy or spatial coordinate systems customized for maximally discoverable and useful. known resources, currently includes more than their own particular analysis approaches. This These efforts to develop and deploy a seman- 3500 such resources, and new ones are added customization is a substantial barrier to data in- tic framework for neuroscience, spearheaded by710 11 FEBRUARY 2011 VOL 331 3 SCIENCE www.sciencemag.org
    • SPECIALSECTIONthe NIF and the International Neuroinformat- best practices for those producing new neuro- 8) Cultural changes are needed to promoteics Coordinating Facility (38), are comple- science data. Good planning and future invest- widespread participation in this endeavor. Thesemented by projects related to brain atlases and ment are needed to broaden and harden the ideas are not just a way to be responsible andspatial frameworks (39–41), providing tools overall framework for housing, analyzing, and collaborative; they may serve a vital role in attain-for referencing data to a standard coordinate integrating future neuroscience knowledge. The ing a deeper understanding of brain function andsystem based on the brain anatomy of a given International Neuroinformatics Coordinating Fa- dysfunction.organism. cility (INCF) plays an important role in coordi- With such efforts, and some luck, the ma- nating and promoting this framework at a global chinery that we have created, including powerfulNeuroinformatics as a Prelude to level. computers and associated tools, may provide usNew Discoveries But can neuroscience evolve so that neuro- with the means to comprehend this “most un-How might improved access to multiple tiers informatics becomes integral to how we study the accountable of machinery” (44), our own brain.of neurobiological data help us understand the brain? This would entail a cultural shift in the fieldbrain? Imagine that we are investigating the regarding the importance of data sharing and References and Notes 1. World Health Organization, Mental Health andneurobiology of bipolar disorder, an illness in mining. It would also require recognition that Development: Targeting People With Mental Healthwhich moods are normal for long periods of neuroscientists produce data not just for con- Conditions As a Vulnerable Group (World Healthtime, yet are labile and sometimes switch to sumption by readers of the conventional litera- Organization, Geneva, 2010). Downloaded from www.sciencemag.org on February 13, 2012mania or depression without an obvious external ture, but for automated agents that can find, 2. F. A. Azevedo et al., J. Comp. Neurol. 513, 532trigger. Although highly heritable, this disease relate, and begin to interpret data from databases (2009). 3. B. Pakkenberg et al., Exp. Gerontol. 38, 95 (2003).appears to be genetically very complex and pos- as well as the literature. Search technologies are 4. www.alleninstitute.org/sibly quite heterogeneous (42). We may dis- advancing rapidly, but the complexity of scien- 5. www.gensat.org/cover numerous genes that impart vulnerability tific data continues to challenge. To make neuro- 6. http://adni.loni.ucla.eduto the illness. Some may be ion channels, others science data maximally interoperable within a 7. A. Bjorklund, T. Hokfelt, Eds., Handbook of Chemical Neuroanatomy Book Series, vols. 1 to 21 (Elsevier,synaptic proteins or transcription factors. How global neuroscience information framework, we Amsterdam, 1983–2005).will we uncover how disparate genetic causes encourage the neuroscience community and the 8. http://cocomac.org/home.asplead to a similar clinical phenotype? Are they all associated funding agencies to consider the fol- 9. http://brancusi.usc.edu/bkms/affecting the morphology of certain cells; the lowing set of general and specific suggestions: 10. http://brainmaps.org/ 11. http://en.wikipedia.org/wiki/Connectomedynamics of specific microcircuits, for example, 1) Neuroscientists should, as much as is fea- 12. O. Sporns, G. Tononi, R. Kotter, PLoS Comput. Biol. 1,within the amygdala; the orchestration of infor- sible, share their data in a form that is machine- e42 (2005).mation across regions, for example, between the accessible, such as through a Web-based database 13. F. Crick, E. Jones, Nature 361, 109 (1993).amygdala and the prefrontal cortex? Can we or some other structured form that benefits from 14. H. Johansen-Berg, T. E. J. Behrens, Diffusion MRI: From Quantitative Measurement to in-vivo Neuroanatomycreate genetic mouse models of the various mu- increasingly powerful search tools. (Academic Press, London, ed. 1, 2009).tated genes and show a convergence at any of 2) Databases spanning a growing portion of 15. H. Johansen-Berg, M. F. Rushworth, Annu. Rev. Neurosci.these levels? Can we capture the critical changes the neuroscience realm need to be created, 32, 75 (2009).in neuronal and/or glial function (at any of the populated, and sustained. This effort needs ade- 16. J. L. Vincent et al., Nature 447, 83 (2007).levels) and find ways to prevent the illness? Dis- quate support from federal and other funding 17. D. Zhang, A. Z. Snyder, J. S. Shimony, M. D. Fox, M. E. Raichle, Cereb. Cortex 20, 1187 (2010).covering the common thread for such a disease mechanisms. 18. http://humanconnectome.org/consortiawill surely benefit from tools that facilitate navi- 3) Because databases become more useful 19. http://humanconnectome.orggation across the multiple tiers of data—genetics, as they are more densely populated (43), add- 20. D. S. Marcus, T. R. Olsen, M. Ramaratnam, R. L. Buckner,gene expression/epigenetics, changes in neuro- ing to existing databases may be preferable to Neuroinformatics 5, 11 (2007). 21. http://human.brain-map.org/nal activity, and differences in dynamics at the creating customized new ones. The NIF, INCF, 22. K. L. Briggman, W. Denk, Curr. Opin. Neurobiol. 16, 562micro and macro levels, depending on the mood and other resources provide valuable tools for (2006).state. No single focused level of analysis will suf- finding existing databases. 23. J. W. Lichtman, J. Livet, J. R. Sanes, Nat. Rev. Neurosci. 9,fice to achieve a satisfactory understanding of 4) Data consumption will increasingly involve 417 (2008). 24. S. J. Smith, Curr. Opin. Neurobiol. 17, 601the disease. In neural choreography terms, we machines first and humans second. Whether creat- (2007).need to identify the dancers, define the nature ing database content or publishing journal arti- 25. H. Markram, Nat. Rev. Neurosci. 7, 153 (2006).of the dance, and uncover how the disease cles, neuroscientists should annotate content using 26. K. Deisseroth, Nat. Methods 8, 26 (2011).disrupts it. community ontologies and identifiers. Coordi- 27. B. F. Grewe, F. Helmchen, Curr. Opin. Neurobiol. 19, 520 (2009). nates, atlas, and registration method should be spe- 28. B. O. Watson et al., Front. Neurosci. 4, 29 (2010).Recommendations cified when referencing spatial locations. 29. G. Buzsaki, Neuron 68, 362 (2010).Need for a cultural shift. To meet the grand chal- 5) Some types of published data (such as brain 30. R. Bernard et al., Mol. Psychiatry, 13 April 2010 (e-publenge of elucidating neural choreography, we coordinates in neuroimaging studies) should be ahead of print).need increasingly powerful scientific tools to reported in standardized table formats that facil- 31. M. E. Martone, A. Gupta, M. H. Ellisman, Nat. Neurosci. 7, 467 (2004).study brain activity in space and in time, to ex- itate data mining. 32. The Fourth Paradigm: Data Intensive Scientific Discovery,tract the key features associated with particular 6) Investment needs to occur in interdis- T. Hey, S. Tansler, K. Tolle, Eds. (Microsoft Researchevents, and to do so on a scale that reveals com- ciplinary research to develop computational, Publishing, Redmond, WA, 2009).monalities and differences between individual machine-learning, and visualization methods 33. http://neuinfo.org 34. D. Gardner et al., Neuroinformatics 6, 149 (2008).brains. This requires an informatics infrastructure for synthesizing across spatial and temporal 35. A. Gupta et al., Neuroinformatics 6, 205 (2008).that has built-in flexibility to incorporate new information tiers. 36. W. J. Bug et al., Neuroinformatics 6, 175 (2008).types of data and navigate across tiers and do- 7) Educational strategies from undergraduate 37. http://neurolex.orgmains of knowledge. through postdoctoral levels are needed to ensure 38. http://incf.org 39. http://www.brain-map.org The NIF currently provides a platform for that neuroscientists of the next generation are 40. http://wholebraincatalog.orgintegrating and systematizing existing neuro- proficient in data mining and using the data- 41. http://incf.org/core/programs/atlasingscience knowledge and has been working to define sharing tools of the future. 42. H. Akil et al., Science 327, 1580 (2010). www.sciencemag.org SCIENCE 4 VOL 331 11 FEBRUARY 2011 711
    • 43. G. A. Ascoli, Nat. Rev. Neurosci. 7, 318 (2006). and the Pritzker Neuropsychiatric Research Consortium Institute on Drug Abuse to M.E.M. We thank D. Marcus, 44. N. Nicolson, J. Trautmann, Eds., The Letters of Virginia (to H.A); and the HCP (1U54MH091657-01) from the R. Poldrack, S. Curtiss, A. Bandrowski, and Woolf, Volume V: 1932–1935 (Harcourt, Brace, 16 NIH Institutes and Centers that support the NIH S. J. Watson for discussions and comments on New York, 1982). Blueprint for Neuroscience Research (to D.C.V.E.). the manuscript. 45. Funded in part by grants from NIH (5P01-DA021633-02), The NIF is supported by NIH Neuroscience Blueprint the Office of Naval Research (ONR-N00014-02-1-0879), contract HHSN271200800035C via the National 10.1126/science.1199305 PERSPECTIVE CT and MRI have evolved across the widest range of applications. CT is sensitive to density. Its greatest strength is imaging dense materials The Disappearing Third Dimension like rocks, fossils, the bones in living organisms and, to a lesser extent, soft tissue. The first clin- Timothy Rowe1 and Lawrence R. Frank2 ical CT scan, made in 1971, used an 80 by 80 matrix of 3 mm by 3 mm by 13 mm voxels, each Three-dimensional computing is driving what many would call a revolution in scientific slice measuring 6.4 Kb and taking up to 20 min Downloaded from www.sciencemag.org on February 13, 2012 visualization. However, its power and advancement are held back by the absence of sustainable to acquire. Each slice image took 7 min to recon- archives for raw data and derivative visualizations. Funding agencies, professional societies, and struct on a mainframe computer (4). In 1984, the publishers each have unfulfilled roles in archive design and data management policy. first fossil, a 14-cm-long skull of the extinct mam- mal Stenopsochoerus, was scanned in its entirety hree-dimensional (3D) image acquisition, infrastructure and new policy to manage owner- (5), signaling CT’s impact beyond the clinical set- T analysis, and visualization are increas- ingly important in nearly every field of science, medicine, engineering, and even the ship and release (3). Technological advancements in both med- ical and industrial scanners have increased the ting and in digitizing entire object volumes. The complete data set measured 3.28 Mb. With a main- frame computer, rock matrix was removed to vi- fine arts. This reflects rapid growth of 3D scan- resolution of 3D volumetric scanners to the sualize underlying bone, and surface models ning instrumentation, visualization and analysis point that they can now volumetrically digi- were reconstructed in computations taking all algorithms, graphic displays, and graduate train- tize structures from the size of a cell to a blue night. By 1992, industrial adaptations brought ing. These capabilities are recognized as critical whale, with exquisite sensitivity toward scien- CT an order-of-magnitude higher resolution (high- for future advances in science, and the U.S. Na- tific targets ranging from tissue properties to the resolution x-ray computed tomography, HRXCT) tional Science Foundation (NSF) is one of many composition of meteorites. Voxel data sets are to inspect much smaller, denser objects. A fossil funding agencies increasing support for 3D im- generated by rapidly diversifying instruments skull of the stem-mammal Thrinaxodon (68 mm aging and computing. For example, one new that include x-ray computed tomographic (CT) long) was scanned in 0.2-mm-thick slices measur- initiative aims at digitizing biological research scanners (Fig. 1), magnetic resonance imaging ing 119 Kb each. Scanning the entire volume took collections, and NSF’s Earth Sciences program (MRI) scanners (Fig. 2), confocal microscopes, 6 hours, and the complete raw data set occupied will soon announce a new target in cyberinfra- synchrotron light sources, electron-positron scan- 18.4 Mb (6). The scans revealed all internal de- structure, with a spotlight on 3D imaging of nat- ners, and other innovative tools to digitize entire tails reported earlier, from destructive mechanical ural materials. object volumes. serial sectioning of a different specimen, and pushed Many consider the advent of 3D imaging a “scientific revolution” (1), but in many ways the revolution is still nascent and unfulfilled. Given the increasing ease of producing these data and rapidly increased funding for imaging in non- medical applications, a major unmet challenge to ensure maximal advancement is archiving and managing the science that will be, and even has already been, produced. To illustrate the prob- lems, we focus here on one domain, volume ele- ments or “voxels,” the 3D equivalents of pixels, but the argument applies more broadly across 3D computing. Useful, searchable archiving requires infrastructure and policy to enable disclosure by data producers and to guarantee quality to data consumers (2). Voxel data have generally lacked a coherent archiving and dissemination policy, and thus the raw data behind thousands of published reports are not released or available for validation and reuse. A solution requires new 1 Jackson School of Geosciences, The University of Texas, Austin, Fig. 1. (A) Photomicrograph of fossil tooth (Morganucodon sp.). (B) MicroXCT slice, 3.2-mm voxel size; TX 78712, USA. 2Department of Radiology, University of Cal- arrows show ring artifact, a correctable problem with raw data but an interpretive challenging with ifornia, San Diego, CA 92093, USA. compressed data. (C) Digital 3D reconstruction. (D) Slice through main cusp; red arrows show growth *To whom correspondence should be addressed. E-mail: bands in dentine, and blue arrows mark the enamel-dentine boundary, both observed in Morganucodon rowe@mail.utexas.edu (T.R.); lfrank@ucsd.edu (L.R.F.) for the first time in these scans.712 11 FEBRUARY 2011 VOL 331 5 SCIENCE www.sciencemag.org
    • COMPARATIVE EFFECTIVENESS Distributed Health Data Networks A Practical and Preferred Approach to Multi-Institutional Evaluations of Comparative Effectiveness, Safety, and Quality of Care Jeffrey S. Brown, PhD,* John H. Holmes, PhD,† Kiran Shah, BA,‡ Ken Hall, MDIV,§ Ross Lazarus, MBBS, MPH,* and Richard Platt, MD, MSc* were identified that would prevent development of a fully functional,Background: Comparative effectiveness research, medical product scalable network.safety evaluation, and quality measurement will require the ability to Conclusions: Distributed networks are capable of addressing nearlyuse electronic health data held by multiple organizations. There is no all anticipated uses of routinely collected electronic healthcare data.consensus about whether to create regional or national combined Distributed networks would obviate the need for centralized data-(eg, “all payer”) databases for these purposes, or distributed data bases, thus avoiding numerous obstacles.networks that leave most Protected Health Information and propri-etary data in the possession of the original data holders. Key Words: distributed health data network, all payer databases,Objectives: Demonstrate functions of a distributed research net- comparative effectiveness research networkwork that supports research needs and also address data holders (Med Care 2010;48: S45–S51)concerns about participation. Key design functions included stronglocal control of data uses and a centralized web-based queryinginterface.Research Design: We implemented a pilot distributed researchnetwork and evaluated the design considerations, utility for research,and the acceptability to data holders of methods for menu-drivenquerying. We developed and tested a central, web-based interface E lectronic health records and other data routinely collected during the delivery of, or payment for, health care, and disease or exposure-specific registries are important resourceswith supporting network software. Specific functions assessed in- to address the growing need for evidence about the effective-clude query formation and distribution, query execution and review, ness, safety, and quality of medical care.1– 6 Unfortunately,and aggregation of results. even very large individual healthcare databases and registriesResults: This pilot successfully evaluated temporal trends in med- are not big or diverse enough to address many of needs ofication use and diagnoses at 5 separate sites, demonstrating some of clinicians, health care delivery systems, or the public healththe possibilities of using a distributed research network. The pilot community. The Institute of Medicine and others have artic-demonstrated the potential utility of the design, which addressed the ulated the goal of using routinely collected health informationmajor concerns of both users and data holders. No serious obstacles for these “secondary” purposes.1,7–10 The Federal Coordi- nating Council for Comparative Effectiveness Research (FCCCER) noted the need for studies “with sufficient power to discern treatment effects and other impacts of interventionsFrom the *Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA; †Department of among patient subgroups.” In their priority recommendations, Biostatistics and Epidemiology, Center for Clinical Epidemiology and the FCCCER listed comparative effectiveness research data Biostatistics, University of Pennsylvania School of Medicine, Philadel- infrastructure as a primary investment that cuts-across all phia, PA; ‡Lincoln Peak Partners, Westborough, MA; and §Deloitte, other comparative effectiveness needs.5 The FCCCER also NCPHI/OD, Atlanta, GA.Supported by Agency for Healthcare Research and Quality Contract No. listed several considerations for investing in person-level 290 – 05– 0033, US Department of Health and Human Services as part of databases for comparative effectiveness research, including the Developing Evidence to Inform Decisions about Effectiveness the ability to link to external data sources, the research (DEcIDE) program. readiness of the databases, and the need to maintain securityThe authors of this report are responsible for its content. Statements in this and privacy of personally identifiable health information.5 report should not be construed as endorsement by the Agency for Healthcare Research and Quality or the US Department of Health and To address the need to evaluate the processes and Human Services. outcomes of care of large populations, some propose creationReprints: Jeffrey S. Brown, PhD, Department of Population Medicine, of large, centralized, multipayer claims databases.11 For ex- Harvard Medical School and Harvard Pilgrim Health Care Institute, ample, the Department of Health and Human Services issued 133 Brookline Ave, 6th floor, Boston, MA 02215. E-mail: jeff_brown@harvardpilgrim.org. a contract titled “Strategic Design for an All-Payor, All-Copyright © 2010 by Lippincott Williams & Wilkins Claims Database to Support Comparative Effectiveness Re-ISSN: 0025-7079/10/4800-0045 search.” In addition, several states have already implementedMedical Care • Volume 48, Number 6 Suppl 1, June 2010 www.lww-medicalcare.com | S45 6
    • Brown et al Medical Care • Volume 48, Number 6 Suppl 1, June 2010all-payer claims databases to address cost and quality con- cesses on the part of each data holder. Thus, the administra-cerns.12,13 This data centralization approach is alluring be- tive operation of networks can be cumbersome.cause, in theory, it mitigates the complications associated Many of the administrative, technical, and analyticwith conducting research across multiple data holders. In barriers to developing efficient and scalable distributed healthpractice, a centralized approach raises several serious secu- data networks to support population-level analyses can berity, proprietary, operational, legal, and patient privacy con- addressed through network features that support the needs ofcerns for data holders, patients, and funders.14 –17 As one users and data holders. We describe here the design and pilotexample, even if a centralized database omits explicit iden- implementation of a distributed research network infrastruc-tifying information like name and address, it is effectively ture intended to meet the broad needs of all parties forimpossible to prevent reidentification of individual level lon- comparative effectiveness evaluation and other uses. We notegitudinal data that contains enough detail to serve multiple the challenges and barriers identified, and provide a blueprintpurposes. In our experience, these limitations have severely for development of a comprehensive distributed researchconstrained the effective, coordinated use of data held by network as reusable national resource.multiple organizations. An alternative to centralized, all-payer, databases, is METHODSone or more distributed research networks that permitcomparative effectiveness and other evaluations across Background and Needs Assessmentmultiple databases without creation of a central data ware- The design, implementation, and evaluation plan of thehouse.14,15,18 –24 Several such networks18,22–29 currently con- pilot distributed network was based on findings from previousduct comparative effectiveness and pharmacoepidemiologic studies,14,15,21,34 coupled with our experience in operating aresearch using a distributed data approach in which data distributed network, the HMO Research Network Center forholders maintain control over their protected data and its Education and Research on Therapeutics, and participating inuses. These networks require data holders to transform their other networks,23,27,28,31 including a current study of thedata of interest into a common data model that enforces safety of the H1N1 vaccine.30 Our prior work investigated theuniform data element naming conventions, definitions, and needs of data holders (eg, health plans) and potential networkdata storage formats. The common data format allows data users (eg, federal agencies) with respect to making their datachecking, manipulation, and analysis via identical computer available for comparative effectiveness and other secondaryprograms that are shared by all data holders. Existing distrib- uses.14,15 Data holders identified several requirements for voluntary participation in a distributed network. These in-uted networks typically distribute these computer programs cluded: (1) complete control of, access to, and uses of, theirvia e-mail, data holders manually execute the programs, and data, (2) strong security and privacy features, (3) limitedreturn the output via secure e-mail, or another secure mech- impact on internal systems, (4) minimal data transfer, (5)anism to a coordinating center for aggregation and, possibly, auditable processes, (6) standardization of administrative andadditional analysis. Many studies require no transfer of pro- regulatory agreements, (7) transparent governance, and (8)tected health information. Several single-study networks con- ease of participation and use.sisting of 20 to 50 million members also have been developed Users’ needs assessments, by contrast, did not dependthat adhere to a distributed research network approach.30,31 on whether the underlying architecture was a distributed Existing and proposed distributed networks have tre- network or centralized database. Potential users identifiedmendous potential utility for addressing our current post other key elements of a network, including: menu-drivenmarketing evidence knowledge gap while benefiting from an querying, easy access for feasibility assessments and forapproach that is more acceptable to data holders.14,15,21 In public health surveillance and monitoring, the ability toaddition, a distributed approach keeps the data close to the specify and create subsets of the complete data via menu-people who know the data best and who can best consult on driven querying or complex programming code, and reuse ofproper use of the data and investigate findings or anomalies. network tools to improve efficiency (eg, reuse of validated Obstacles to effective implementation of both central- exposure and outcome algorithms).15 Nontechnical usersized and distributed approaches include differences in com- wanted the ability to ask simple questions without assistanceputing environments and information systems, the need for (eg, counts of people between the ages of 65 and 74 with adata standardization and checking, organization-by-organiza- positron emission tomography scan in 2008). More sophisti-tion variation in contracting policies and procedures, con- cated users wanted the ability to perform complex analysescerns related to the ethics of human subjects research and data (eg, compare risk adjusted survival curves for breast cancerprivacy, and cross-institution variation in the rules and guide- patients treated with tamoxifen as adjuvant chemotherapy, tolines related to privacy and proprietary issues.32,33 Distrib- those who were not treated). Potential users also noted that ituted networks can be built and tested in phases that allow the is often difficult to get rapid responses from existing systems,network to operate while being built, and network data and this was even noted by users who were also data holders.resources can be updated and enhanced without disruptingoverall network operations. Networks have the need for Implementationresponsible and consistent stewardship of clinical records, The design and rapid prototyping process focused onand they exchange the requirements of centralized database addressing the 8 specific data holder concerns noted above.administrative and computing infrastructure for similar pro- We used a phased approach to development and implemen-S46 | www.lww-medicalcare.com © 2010 Lippincott Williams & Wilkins 7
    • Medical Care • Volume 48, Number 6 Suppl 1, June 2010 Distributed Health Data Networks FIGURE 1. System architecture.TABLE 1. Description of System ComponentsComponent DescriptionPortal Access control manager Manages all aspects of security for portal including authentication, session management, policies, group permissions, user permissions, and access rights. Query manager Manages query entry, routing, and distribution. Results manager Manages receipt, organization, assembly, merging, and aggregation of result sets. Workflow manager Manages workflow (eg, for query approval) including request routing, alerting and notification, approval management, and tracking. User manager Manages user accounts. Audit manager Provides auditing functions including activity and error logging. Researcher user interface User interface for research users, including menu-driven and ad hoc query entry, query management, result status, and result set management. Data Mart administrator user User interface for Data Mart administrators including data mart setup and configuration, access control interface management, and workflow management. System administrator user User interface for System administrators including portal setup and configuration, access control interface management, and user management. Hub API Application Programming Interface (eg, web service API) for the DRN portal. Exposes portal functions for remote applications including query retrieval and results submission. Hub database Database for the portal. Message protocol Protocol for messaging between the portal and external applications including the Data Marts.Data holder Data Mart Data Mart security manager Manages all aspects of security for Data Mart including authentication, session management, policies, group permissions, user permissions, and access rights. Query execution manager Manages review and execution (or rejection) of queries including queue management, query translation, query engine interface, and results handling. Data Mart API Application Programming Interface (eg, web service API) for the Data Mart. Exposes functions for remote applications including query submission and results retrieval. Data Mart audit manager Provides auditing functions including activity and error logging. Data source manager Manages exchange of data between the Data Mart database and source systems. Data Mart database Database for the Data Mart.tation of the pilot distributed network. The first phase in- Health Systems, Kaiser Permanente Colorado, and Health-cluded a web-based portal system for menu-driven querying Partners Research Foundation). Figure 1 and Table 1 illus-of summary-level datasets held by 5 data holders (Harvard trate and describe the system architecture and features; FigurePilgrim Health Care, Group Health Cooperative, Geisinger 2 illustrates the current menu-driven query interface.© 2010 Lippincott Williams & Wilkins www.lww-medicalcare.com | S47 8
    • Brown et al Medical Care • Volume 48, Number 6 Suppl 1, June 2010FIGURE 2. Web-based Menu-drivenQuery Formation. It shows part ofthe query creation page that allowsa user to select specific exposures,diagnoses, or procedures, time peri-ods, and data sources. Development of the network software relied on a rapid mentation, and testing process we also informally evaluated dataprototyping approach that included multiple rounds of de- holder acceptance of the system and their willingness to con-signing, building, testing, and revising of the interface and tinue development and testing beyond the demonstration. Othersupporting portal, and the creation of a novel application network functions such as user-based access control (ie, usersprogram, the Query Execution Manager (QEM). Initial que- have different levels of permissions to submit queries), queryrying capability was built for drug utilization by generic name formation restrictions (ie, limit on the number and type of queryand drug class, diagnosis by 3-digit ICD-9-CM code, and parameters available for selection); and query results viewingprocedure (Healthcare Common Procedure Coding System) rules (ie, requirement of 2 data holder responses before a userqueries. Query results were stratifiable by age group, sex, and can view and export aggregated results) also were developed,year or year and quarter. These simple queries were chosen tested, and evaluated.for demonstration purposes, more complex queries are pos- Separately, we partnered with the National Center forsible within the network design. Public Health Informatics to design and pilot test an alterna- From the user perspective, query creation involved log- tive approach for securely transmitting queries and receivingging into the web portal, selecting a query type (eg, generic drug results. The use-case for this pilot test illustrated how anname), selecting query parameters and the data holders to query, authorized user could securely authenticate to a central portaland once submitted, reviewing the status of queries and the and securely distribute a computer program to each dataaggregated query results. Each data holder installed the QEM holders through their local firewall. The program was exe-software and responded to multiple test queries. The system used cuted and the results securely returned for aggregation. De-a “pull” query distribution mechanism that notified (via e-mail) tails of this work is described elsewhere.35data holders of waiting queries. The data holder user thenopened the QEM to review the query details (eg, submitter and RESULTSreason for submitting) and decides whether to execute, reject, ofhold the query for further review. If the data holder decided to Implementation and System Functioningrun the query, the QEM downloaded the query text from the The system was successfully implemented and tested atportal, executed it against a local database, and presented the each of the 5 participating data holders. Each data holder wasresults to the data holder for review. The data holder could then able to install and operate the QEM, retrieve and executeupload the results to the portal for aggregation with other results queries, and upload results. Installation took approximately 15and review by the submitter. minutes. Test scenarios were based on summary level data and Each of the sample queries was executed without error.included assessment of temporal trends in the use of genetic Figures 3A and B show how data holders interact with thetesting, influenza-related medical and pharmacy use, attention- system, specifically the functions that allow them to reviewdeficit hyperactivity disorder medication use by age, and urti- queries before executing them, and review results before upload-caria diagnoses by age and year, and rate of diabetes by age. We ing them to the central portal. Figure 4 illustrates results from aassessed the ability of the system to execute and perform each of set of sample queries regarding the use of attention-deficitthe user and data holder tasks described above for each of the hyperactivity disorder medications; this type of information cantest queries submitted. Throughout the rapid prototyping, imple- help identify trends that may warrant further evaluation or helpS48 | www.lww-medicalcare.com © 2010 Lippincott Williams & Wilkins 9
    • Medical Care • Volume 48, Number 6 Suppl 1, June 2010 Distributed Health Data NetworksFIGURE 3. Query Execution Manager: Query review and results screen. A, Shows query details that are presented to the dataholder for review. In this case the query text is submitted as structured query; alternatively, the design allows a user to sendan appropriately structured analytic program. B, Shows local query results for review and approval for upload to the centralportal for aggregation. FIGURE 4. Example of findings that could be obtained using the query tool output: Rate of attention-defi- cit-hyperactivity disorder drug use by age group and year.understand the adoption and diffusion of new products. The executions and data uploads, (2) an easy to use query inter-sample queries were completed within a day of submission; this face, (3) centralization of the network logic in a portal, andperiod included local review of incoming queries and of the (4) simple installation and light-weight software that did notresults. require any special expertise to install or use. Further, the “pull” mechanism for query distribution (ie, data holders areEvaluation of Acceptability to Data Holders notified of waiting queries and retrieve them) was also an All data holders found the system acceptable and were important favorable factor for data holders’ acceptance.willing to continue working on development and implemen-tation. The data holders found the approach acceptable be-cause the design directly addressed many of the data holder DISCUSSIONconcerns listed above. Specifically, the design illustrated: (1) Overall, this demonstration validated key design fea-data holder data autonomy that allowed approval of all query tures of a distributed health data network including: use of a© 2010 Lippincott Williams & Wilkins www.lww-medicalcare.com | S49 10
    • Brown et al Medical Care • Volume 48, Number 6 Suppl 1, June 2010central portal, a “pull” query distribution mechanism that centralized databases. In addition, distributed networks haveobviated the need to allow queries to pass through firewalls, these advantages, compared with centralized systems: (1)and local control that allows data holders to maintain physical They allow data holders to maintain physical control overcontrol of their data and all uses while simultaneously in- their data; without this control, in our experience, data hold-creasing research access for authorized users. These features ers are unlikely to voluntarily participate. (2) They ensurewere paramount to data holders. The use of summary-level ongoing participation of individuals who are knowledgeabledata for this demonstration obviated many data privacy con- about the systems and practices that underlie each datacerns, facilitating acceptance by data holders. The technical holder’s data. (3) They allow data holders to assess andplatform we developed is fully capable of supporting the use authorize query requests, or categories of requests, on aof patient and encounter-level utilization information, includ- user-by-user or case-by-case basis. (4) Distributed systemsing the ability to submit a SAS program as the query text.14 minimize the need to disclose protected health information This design minimizes data holder information technol- thus mitigating privacy concerns, many of which are regu-ogy responsibilities, leaves protected health information un- lated by the Privacy and Security Rules of the Health Insur-der the control of the data holder, provides for a more ance Portability and Accountability Act of 1996 (HIPAA).straightforward security implementation, and focuses net- (5) Distributed systems minimize the need to disclose andwork management tasks at the central portal. It supports lose control of proprietary data. (6) A distributed approachimportant capabilities such as secure communications and eliminates the need to create, secure, maintain, and managedata protection, auditable processes, a simple query interface, access to a complex central data warehouse. (7) Finally, aand locally managed query authorization. distributed network also avoids the need to repeatedly trans- The menu-driven interface facilitates the use of the fer and pool data to maintain a current database, which is adistributed network by users with limited technical expertise. costly undertaking each time updating is necessary.Network features streamline workflow among participating A phased approach is suggested for implementing asites and enable an authorized user to quickly assess the large scale distributed research network infrastructure forfeasibility of various comparative effectiveness studies in comparative effectiveness research and other purposes. Ques-larger populations than might not normally be readily avail- tions that leverage the most commonly used and best under-able. This demonstration project was relatively limited in stood data types, target large populations, and execute stan-scope to allow data holders to become comfortable with the dard statistical analyses using well-developed softwaretechnical design and the governance needed to manage and packages will be the best candidates for demonstration stud-adjudicate simple query requests. More complex and com- ies. The Agency for Healthcare Research and Quality’s recentplete demonstrations are currently underway that leverage the grant awards include several examples of research projectssame network infrastructure to support full comparative ef- that might benefit from the large study populations accessiblefectiveness studies. Enhancements will allow more sophisti- through distributed data networks, including the study ofcated authorization, security, and permission policies, a more outcomes resulting from depression treatments and asthmaflexible query interface, and access to additional data and treatments in pregnancy, the study of ACE inhibitors inquery types. In addition, the infrastructure is designed to African-American males, and the study of various treatmentsallow distributed multivariate analyses, either through use the for lumbar spine.39 Similarly, the Food and Drug Adminis-of iterative methods36,37 or by merging site-specific analysis tration’s planned Sentinel Initiative to monitor the safety offiles that omit PHI, for instance through the use of high medical products, could also use a distributed data network,dimensionality propensity scores.38 either identical to one developed for comparative effective- Finally, advances in governance are as important to ness, or very similar to it. Relatively small investmentsexpanding this network model as any of the technical capa- compared with the cost of developing the underlying elec-bilities. Network governance requires policies and procedures tronic health information will reduce individual study coststo address issues such as data holder protections, conflict of and demonstrate the value of creating reusable, distributedinterest, external communications, priority setting, by-laws, research networks.data security, accounting, network strategy, stakeholder is-sues, and HIPAA and human subjects protection. In addition, REFERENCESa coordinating center is needed to maintain network infra- 1. Baciu A, Stratton K, Burke SP, eds. The Future of Drug Safety:structure, documentation, coordination, monitoring of data Promoting and Protecting the Health of the Public. Washington, DC:resources and contacts, documentation of lessons learned, Institute of Medicine of the National Academies; 2006.data validity activities, and study implementation. 2. McClellan M. Drug safety reform at the FDA—pendulum swing or systematic improvement? N Engl J Med. 2007;356:1700 –1702. 3. Platt R, Wilson M, Chan KA, et al. The new Sentinel Network— improving the evidence of medical-product safety. N Engl J Med. CONCLUSION 2009;361:645– 647. In theory, either a distributed or a centralized (eg, 4. Strom BL. The future of pharmacoepidemiology. In: Strom BL, ed.all-payer) approach can meet the need to use the growing Pharmacoepidemiology. Chichester, United Kingdom: John Wiley &amount of electronic health data to address important societal Sons; 2005. 5. Federal Coordinating Council for Comparative Effectiveness Re-questions. However, a distributed network is preferred be- search. Report to the President and Congress. Department of Healthcause it can perform essentially all the functions desired of a and Human Services; 2009. Available at: www.hhs.gov/recovery/centralized database, while avoiding many disadvantage of prgrams/cer/cerannualrpt.pdf.S50 | www.lww-medicalcare.com © 2010 Lippincott Williams & Wilkins 11
    • Medical Care • Volume 48, Number 6 Suppl 1, June 2010 Distributed Health Data Networks 6. Gliklich RE, Dreyer NA, eds. Registries for Evaluating Patient Outcomes: 22. Platt R, Davis R, Finkelstein J, et al. Multicenter epidemiologic and A User’s Guide. (Prepared by Outcome DEcIDE Center ͓Outcome Sci- health services research on therapeutics in the HMO Research Network ences, Inc. dba Outcome͔ under Contract No. HHSA29020050035I TO1.). Center for Education and Research on Therapeutics. Pharmacoepide- AHRQ Publication No. 07-EHC001-1. Rockville, MD: Agency for Health- miol Drug Saf. 2001;10:373–377. care Research and Quality. April 2007. 23. Wagner EH, Greene SM, Hart G, et al. Building a research consortium 7. Institute of Medicine. Initial National Priorities for Comparative Effectiveness of large health systems: the Cancer Research Network. J Natl Cancer Research. The National Academies Press: 2009. Available at: http://www. Inst Monogr. 2005:3–11. iom.edu/Reports/2009/ComparativeEffectivenessResearchPriorities.aspx. 24. Chen RT, Glasser JW, Rhodes PH, et al. Vaccine Safety Datalink 8. Robert Wood Johnson Foundation. National Effort to Measure and project: a new tool for improving vaccine safety monitoring in the Report on Quality and Cost-Effectiveness of Health Care Unveiled. United States. The Vaccine Safety Datalink Team. Pediatrics. 1997;99: 2007. Available at: http://www.rwjf.org/pr/product.jsp?idϭ22371. Ac- 765–773. cessed February, 2010. 25. Vacine Safety Datalink. Available at: http://www.cdc.gov/vaccinesafety/vsd/. 9. Engelberg Center for Health Care Reform at Brookings. Using Data to Accessed January, 2010. Support Better Health Care: One Infrastructure with Many Uses. 2009. 26. Chan K; HMO Research Network. The HMO Research Network. In: Available at: http://www.brookings.edu/events/2009/1202_health_care_ Strom BL, ed. Pharmacoepediomology. Chichester, United Kingdom: data.aspx. Accessed February, 2010. John Wiley & Sons; 2005.10. Engelberg Center for Health Care Reform at Brookings. Implementing 27. Go AS, Magid DJ, Wells B, et al. The Cardiovascular Research Net- Comparative Effectiveness Research: Priorities, Methods, and Impact. work: a new paradigm for cardiovascular quality and outcomes research. June 2009. Available at: http://www.brookings.edu/ϳ/media/Files/ Circ Cardiovasc Qual Outcomes. 2008;1:138 –147. events/2009/0609_health_care_cer/0609_health_care_cer.pdf. 28. Magid DJ, Gurwitz JH, Rumsfeld JS, et al. Creating a research data11. Mosquera M. HHS awards $1M contract for effectiveness database. network for cardiovascular disease: the CVRN. Expert Rev Cardiovasc Government Health IT, 2010. Available at: http://govhealthit.com/ Ther. 2008;6:1043–1045. newsitem.aspx?nidϭ73035. Accessed January, 2010. 29. Davis RL, Kolczak M, Lewis E, et al. Active surveillance of vaccine12. Office for Oregon Health Policy and Research. POLICY BRIEF: All- safety: a system to detect early signs of adverse events. Epidemiology. Payer, All-Claims Data Base. 2009. Available at: http://www.oregon. 2005;16:336 –341. gov/OHPPR/HFB/docs/2009_Legislature_Presentations/Policy_Briefs/ 30. Federal Immunization Safety Task Force. Federal Plans to Monitor PolicyBrief_AllPayerAllClaimsDatabase_4.30.09.pdf. Immunization Safety for the Pandemic 2009 H1N1 Influenza Vaccina-13. Miller P, Schneider CD. State and National Efforts to Establish All- tion Program. 2009. Available at: http://www.flu.gov/professional/ Payer Claims Databases. Paper presented at: All-payer claims databases: federal/fed-plan-to-mon-h1n1-imm-safety.pdf. Accessed January 31, 2010. A key to healthcare reform. Massachusetts Health Data Consortium Fall 31. Velentgas P, Bohn R, Brown JS, et al. A distributed research network Workshop; 2009; Boston, MA.14. Brown JS, Holmes J, Maro J, et al. Design specifications for network model for post-marketing safety studies: the Meningococcal Vaccine prototype and cooperative to conduct population-based studies and Study. Pharmacoepidemiol Drug Saf. 2008;17:1226 –1234. safety surveillance. Effective Health Care Research Report No. 13. 32. Greene SM, Geiger AM. A review finds that multicenter studies face (Prepared by the DEcIDE Centers at the HMO Research Network Center substantial challenges but strategies exist to achieve Institutional Review for Education and Research on Therapeutics and the University of Board approval. J Clin Epidemiol. 2006;59:784 –790. Pennsylvania Under Contract No. HHSA29020050033I T05.) Agency 33. Greene SM, Geiger AM, Harris EL, et al. Impact of IRB requirements on for Healthcare Research and Quality: July 2009. Available at: www. a multicenter survey of prophylactic mastectomy outcomes. Ann Epide- effectivehealthcare.ahrq.gov/reports/final.cfm. miol. 2006;16:275–278.15. Maro JC, Platt R, Holmes JH, et al. Design of a national distributed 34. Brown JS, Moore KM, Braun MM, et al. Active influenza vaccine safety health data network. Ann Intern Med. 2009;151:341–344. surveillance: potential within a healthcare claims environment. Med16. Rosati K. Using electronic health information for pharmacovigilance: Care. 2009;47:1251–1257. the promise and the pitfalls. J Health Life Sci Law. 2009;2:171, 173– 35. National Center for Public Health Informatics Public Health Research 239. Grid. Pubic Health Grid Research and Development: DRN Demonstra-17. Rosati KB. HIPAA privacy: the compliance challenges ahead. J Health tion. 2009. Available at: http://phgrid.blogspot.com/search/label/DRN Law. 2002;35:45– 82. and http://sites.google.com/site/phgrid/. Accessed February, 2010.18. Hornbrook MC, Hart G, Ellis JL, et al. Building a virtual cancer research 36. Karr A, Lin X, Sanil AP, et al. Secure regression on distributed organization. J Natl Cancer Inst Monogr. 2005:12–25. databases. J Comput Graph Stat. 2005;14:263–279.19. Lazarus R, Yih K, Platt R. Distributed data processing for public health 37. Karr AF, Lin X, Reiter JP, et al. Secure regression on distributed surveillance. BMC Public Health. 2006;6:235. databases. J Comput Graph Stat. 2005;14:1–18.20. McMurry AJ, Gilbert CA, Reis BY, et al. A self-scaling, distributed 38. Rassen JA, Avorn J, Schneeweiss S. Multivariate-adjusted pharmaco- information architecture for public health, research, and clinical care. epidemiologic analyses of confidential information pooled from multiple J Am Med Inform Assoc. 2007;14:527–533. health care utilization databases. Pharmacoepidemiol Drug Saf. In press.21. Moore KM, Duddy A, Braun MM, et al. Potential population-based 39. Agency for Healthcare Research and Quality. Comparative Effectiveness Re- electronic data sources for rapid pandemic influenza vaccine adverse search Grant Awards. 2010. Available at: http://effectivehealthcare.ahrq.gov/ event detection: a survey of health plans. Pharmacoepidemiol Drug Saf. index.cfm/comparative-effectiveness-research-grant-awards/. Accessed Janu- 2008;17:1137–1141. ary, 2010.© 2010 Lippincott Williams & Wilkins www.lww-medicalcare.com | S51 12
    • Editor’s Perspective Open Science and Data Sharing in Clinical Research Basing Informed Decisions on the Totality of the Evidence Harlan M. Krumholz, MD, SMA patient tentatively enters the examination room. A decision looms, and the patient waits with some trepi-dation. A doctor soon appears and pulls up a chair, prepared remained unpublished at 5 years. Negative trials were less likely to be published, yet 34% of the positive trials were also unpublished at 5 years.for the discussion by having conscientiously reviewed the We tout systematic reviews, the linchpin of evidence-basedmedical literature, examined the evidence-based systematic medicine, as a method to distill and interpret the medicalreviews, and highlighted the relevant information. The ensu- literature. The Institute of Medicine recently produced stan-ing conversation reflects on the patient’s treatment goals and dards for these reviews and noted that, “Healthcare decisionthe published information about the benefits and harms of the makers, including clinicians and other healthcare providers,available options. The tension breaks with a decision, and increasingly turn to systematic reviews for reliable, evidence-plans for the next steps are put in place. based comparisons of health interventions.”8 Others have Every day, patients and their caregivers are faced with published checklists to promote high-quality methods9; how-difficult decisions about treatment. They turn to physicians and ever, the elegance of the methods cannot overcome theother healthcare professionals to interpret the medical evidence undermining effect of missing studies.and assist them in making individualized decisions. Missing studies are unlikely to be missing at random, and the Unfortunately, we are learning that what is published in the effect of the missing data on inferences about an intervention ismedical literature represents only a portion of the evidence that not easy to predict. Hart and colleagues investigated the effect ofis relevant to the risks and benefits of available treatments. In a unpublished data on the results of meta-analyses of drug trials.10profession that seeks to rely on evidence, it is ironic that we They showed that the unpublished data influenced the summarytolerate a system that enables evidence to be outside of public estimates such that the drug efficacy was lower in 46% of theview. Those who own data, usually scientists or industry, havethe choice of what, where, and when to publish. As a result, our meta-analyses and greater in 46%. The result was essentiallymedical literature portrays only a partial picture of the evidence unchanged in only 7% of the studies. Readers of meta-analyses,about clinical strategies, including drugs and devices. Experts without the benefit of these types of analyses, are left to wonderhave recently drawn attention to this issue, including contribu- how many missing trials might be relevant to the clinicaltions in this issue of our journal, but there is resistance to question and what their effect might be.change.1–5 In some cases, the missing clinical research data may hold Studies document that a remarkable percentage of trials are important information about risk. A classic example occurrednot published within a reasonable period after they are com- with Vioxx (rofecoxib). Merck had data, mostly unpublishedpleted. Ross and colleagues reported that fewer than half of trials and obtained several years before the drug was withdrawn fromwere published within 2 years of completion.4 Although the market,11 demonstrating that Vioxx likely increased the riskindustry-sponsored studies tended to have the lowest publication of an acute myocardial infarction. No systematic analysis couldrates, the problem spans all study sponsors. A subsequent study have detected this harm because most of the data were beyondfound that only 46% of clinical trials funded by the National public view. Similarly, GlaxoSmithKline had data, much of itInstitutes of Health were published within 30 months of com- unpublished, which indicated that Avandia (rosiglitazone) in-pletion.6 Lee and colleagues found that fewer than half of the creased the risk of an acute myocardial infarction. Nissen andnew drug trials submitted to the US Food and Drug Adminis- colleagues, accessing the data made available through litigation,tration were published within 5 years of drug approval.7 Al- revealed the concern that ultimately led to US Food and Drugthough pivotal trials were more likely to be published, 24% Administration restrictions on the use of the drug.12,13 Selective publication similarly skews the evidence base. For The opinions expressed in this article are not necessarily those of the example, Psaty and Kronmal, using information obtained duringAmerican Heart Association. From the Section of Cardiovascular Medicine and the Robert Wood litigation, reported that Merck and its consultants employedJohnson Clinical Scholars Program, Department of Medicine, Yale selective reporting in representing mortality results in trials ofUniversity School of Medicine; Section of Health Policy and Adminis- Vioxx in patients with Alzheimer disease or cognitive impair-tration, Yale School of Public Health; Center for Outcomes Research andEvaluation, Yale-New Haven Hospital, New Haven, Conn. ment.14 The published studies failed to report that an intention- Correspondence to Harlan Krumholz, 1 Church St, Suite 200, New to-treat analysis showed Vioxx to be associated with a signifi-Haven, CT 06510. E-mail harlan.krumholz@yale.edu cant increase in all-cause mortality. Turner and colleagues (Circ Cardiovasc Qual Outcomes. 2012;5:141-142.) © 2012 American Heart Association, Inc. showed that trials of antidepressant drugs highlighted positive results as if they were the primary outcomes, contrary to the Circ Cardiovasc Qual Outcomes is available athttp://circoutcomes.ahajournals.org original study protocols.15 They concluded that, “By altering the DOI: 10.1161/CIRCOUTCOMES.112.965848 apparent risk-benefit ratio of drugs, selective publication can lead 141 13 Downloaded from circoutcomes.ahajournals.org at FDA Library on June 22, 2012
    • 142 Circ Cardiovasc Qual Outcomes March 2012doctors to make inappropriate prescribing decisions that may not be would allow us to leverage existing investments to provide morein the best interest of their patients and, thus, the public health.” and better evidence that will increase the possibility that pa- The myriad of reasons why trials are selectively or never tients’ decisions will help them obtain the results they desire.published include professional bias, profit, journal bias againstnegative studies, loss of interest, lack of funding, and competing Sources of Fundinginterests. For example, when results do not confirm the beliefs of Dr Krumholz is supported by grant U01 HL105270-02 (Center forinvestigators, the motivation for their dissemination may Cardiovascular Outcomes Research at Yale University) from theweaken, as may the willingness of the investigator to devote National Heart, Lung, and Blood Institute.time and resources to the publication. Companies may also loseinterest when the findings are counter to embedded beliefs about Disclosuresa product. The end result is a scientific culture in which many Dr Krumholz discloses that he is the recipient of a research grant from Medtronic, Inc, through Yale University and is chair of aexperiments performed on patients are absent from the medical cardiac scientific advisory board for UnitedHealth.literature. Another issue relevant to selective and nonpublication is that Referencesaccess to a research database is commonly restricted to the 1. Gøtzsche P. Strengthening and opening up health research by sharing ourprincipal investigator or the funders. For many trials, there is no raw data. Circ Cardiovasc Qual Outcomes. 2012;5:236 –237.opportunity for independent replication of the analyses or eval- 2. Lehman R, Loder E. Missing clinical trial data. BMJ. 2012;344:d8158.uation of the raw data. The US Food and Drug Administration 3. Ross JS, Lehman R, Gross CP. The importance of clinical trial data sharing: toward more open science. Circ Cardiovasc Qual Outcomes.has access to these data, but other experts do not. 2012;5:238 –240. The sharing of trial data, as of yet uncommon except through 4. Ross JS, Mulvey GK, Hines EM, Nissen SE, Krumholz HM. Trialmechanisms by some funders such as the National Heart, Lung, publication after registration in ClinicalTrials.Gov: a cross-sectional anal-and Blood Institute,16 could provide an opportunity to leverage ysis. PLoS Med. 2009;6:e1000144. 5. Spertus JA. The double-edged sword of open access to research data. Circthe strength of the global community of investigators. Many Cardiovasc Qual Outcomes. 2012;5:143–144.trials yield only a fraction of the knowledge that could be 6. Ross JS, Tse T, Zarin DA, Xu H, Zhou L, Krumholz HM. Publication ofproduced with more resources and creativity. NIH funded trials registered in ClinicalTrials.gov: cross sectional analy- Now is the time to bring data sharing and open science into sis. BMJ. 2012;344:d7292. 7. Lee K, Bacchetti P, Sim I. Publication of clinical trials supporting suc-the mainstream of clinical research, particularly with respect cessful new drug applications: a literature analysis. PLoS Med. 2008;to trials that contain information about the risks and benefits 5:e191.of treatments in current use. This could be accomplished 8. Committee on Standards for Systematic Reviews of Comparative Effec-through the following steps: tiveness Research, Institute of Medicine. Finding what works in health care: standards for systematic reviews. The National Academies Press; 2011. http://www.iom.edu/Reports/2011/Finding-What-Works-in-Health- 1. Post, in the public domain, the study protocol for each Care-Standards-for-Systematic-Reviews.aspx. Accessed February 21, published trial. The protocol should be comprehensive 2012. and include policies and procedures relevant to actions 9. Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, taken in the trial. Moher D, Becker BJ, Sipe TA, Thacker SB. Meta-analysis of observa- 2. Develop mechanisms for those who own trial data to tional studies in epidemiology: a proposal for reporting. Meta-analysis Of share their raw data and individual patient data. Observational Studies in Epidemiology (MOOSE) group. JAMA. 2000; 283:2008 –2012. 3. Encourage industry to commit to place all its clinical 10. Hart B, Lundh A, Bero L. Effect of reporting bias on meta-analyses of research data relevant to approved products in the drug trials: reanalysis of meta-analyses. BMJ. 2012;344:d7202. public domain. This action would acknowledge that the 11. Ross JS, Madigan D, Hill KP, Egilman DS, Wang Y, Krumholz HM. privilege of selling products is accompanied by a Pooled analysis of rofecoxib placebo-controlled clinical trial data: lessons responsibility to share all the clinical research data for postmarket pharmaceutical safety surveillance. Arch Intern Med. relevant to the products’ benefits and harms. 2009;169:1976 –1985. 4. Develop a culture within academics that values data 12. Nissen SE, Wolski K. Effect of rosiglitazone on the risk of myocardial infarction and death from cardiovascular causes. N Engl J Med. 2007; sharing and open science. After a period in which the 356:2457–2471. original investigators can complete their funded studies, 13. United States Food and Drug Administration. FDA drug safety commu- the data should be de-identified and made available for nication: updated risk evaluation and mitigation strategy (REMS) to investigators globally. restrict access to rosiglitazone-containing medicines including Avandia, 5. Identify, within all systematic reviews, trials that are not Avandamet, and Avandaryl. http://www.fda.gov/Drugs/DrugSafety/ published, using sources such as clinicaltrials.gov and ucm255005.htm. May 18, 2011. Accessed February 21, 2012. regulatory postings to determine what is missing. 14. Psaty BM, Kronmal RA. Reporting mortality findings in trials of rofecoxib for Alzheimer disease or cognitive impairment: a case study 6. Share data. based on documents from rofecoxib litigation. JAMA. 2008;299: 1813–1817. The path is not easy. We lack an infrastructure and funding, 15. Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. Selectiveand incentives in the system run counter to sharing. Those who publication of antidepressant trials and its influence on apparent efficacy.share may be giving advantage to “competitors.” The impera- N Engl J Med. 2008;358:252–260. 16. National Heart, Lung, and Blood Institute. NHLBI policy for data sharingtive, however, derives from what is best for society and our from clinical trials and epidemiological studies. http://www.nhlbi.nih.gov/obligation to respect the contributions made by the subjects who funding/datasharing.htm. Accessed February 21, 2012.agreed to participate in the research studies. Patients facing a decision deserve information that is based on KEY WORDS: evidence-based medicine Ⅲ data sharing Ⅲ medicalall of the evidence. A new era of data sharing and open science decision-making Ⅲ open access 14 Downloaded from circoutcomes.ahajournals.org at FDA Library on June 22, 2012
    • ORIGINAL CONTRIBUTION Characteristics of Clinical Trials Registered in ClinicalTrials.gov, 2007-2010 Robert M. Califf, MD Context Recent reports highlight gaps between guidelines-based treatment recom- Deborah A. Zarin, MD mendations and evidence from clinical trials that supports those recommendations. Strengthened reporting requirements for studies registered with ClinicalTrials.gov en- Judith M. Kramer, MD, MS able a comprehensive evaluation of the national trials portfolio. Rachel E. Sherman, MD, MPH Objective To examine fundamental characteristics of interventional clinical trials reg- Laura H. Aberle, BSPH istered in the ClinicalTrials.gov database. Asba Tasneem, PhD Methods A data set comprising 96 346 clinical studies from ClinicalTrials.gov was downloaded on September 27, 2010, and entered into a relational database to ana- C LINICAL TRIALS ARE THE CEN- lyze aggregate data. Interventional trials were identified and analyses were focused tralmeansbywhichpreventive, on 3 clinical specialties—cardiovascular, mental health, and oncology—that together diagnostic, and therapeutic encompass the largest number of disability-adjusted life-years lost in the United States. strategiesareevaluated,1 butthe Main Outcome Measures Characteristics of registered clinical trials as reported US clinical trials enterprise has been data elements in the trial registry; how those characteristics have changed over time; marked by debate regarding funding pri- differences in characteristics as a function of clinical specialty; and factors associated oritiesforclinicalresearch,thedesignand with use of randomization, blinding, and data monitoring committees (DMCs). interpretation of studies, and protections Results The number of registered interventional clinical trials increased from 28 881 (Oc- forresearchparticipants.2-4 Untilrecently, tober 2004–September 2007) to 40 970 (October 2007–September 2010), and the number however, we have lacked tools for com- of missing data elements has generally declined. Most interventional trials registered be- prehensively assessing trials across the tween 2007 and 2010 were small, with 62% enrolling 100 or fewer participants. Many broader US clinical trial enterprise. clinical trials were single-center (66%; 24 788/37 520) and funded by organizations other In 1997, Congress mandated the cre- than industry or the National Institutes of Health (NIH) (47%; 17 592/37 520). Hetero- geneity in the reported methods by clinical specialty; sponsor type; and the reported use ation of the ClinicalTrials.gov registry to of DMCs, randomization, and blinding was evident. For example, reported use of DMCs assistpeoplewithseriousillnessesingain- was less common in industry-sponsored vs NIH-sponsored trials (adjusted odds ratio [OR], ing access to trials.5 In September 2004, 0.11; 95% CI, 0.09-0.14), earlier-phase vs phase 3 trials (adjusted OR, 0.83; 95% CI, 0.76- the International Committee of Medical 0.91), and mental health trials vs those in the other 2 specialties. In similar comparisons, Journal Editors (ICMJE) announced a randomization and blinding were less frequently reported in earlier-phase, oncology, and policy, which took effect in 2005, of re- device trials. quiring registration of clinical trials as a Conclusion Clinical trials registered in ClinicalTrials.gov are dominated by small trials prerequisite for publication.6,7 The Food and contain significant heterogeneity in methodological approaches, including re- andDrugAdministrationAmendmentAct ported use of randomization, blinding, and DMCs. (FDAAA)8 expanded the mandate of JAMA. 2012;307(17):1838-1847 www.jama.com ClinicalTrials.gov to include most non– phase 1 interventional drug and de- data elements (effective September 27, appropriate, validated outcome mea- vice trials, with interventional trials de- 2007), report basic results (Septem- sures,13,14 as well as oversight by insti- fined as “studies in human beings in ber 27, 2008), and report adverse events tutional review boards and data moni- which individuals are assigned by an in- (September 27, 2009).10 Author Affiliations: Duke Translational Medicine Insti- vestigator based on a protocol to re- Recent work11,12 highlights the inad- tute, Durham, North Carolina (Drs Califf and Kramer); ceive specific interventions”9 (eTable 1, equate evidence base of current prac- National Library of Medicine, National Institutes of Health, Bethesda, Maryland (Dr Zarin); Office of Medical Policy, available at http://www.jama.com). The tice, in which less than 15% of major Center for Drug Evaluation and Research, US Food and law obliges sponsors or their desig- guideline recommendations are based Drug Administration, Silver Spring, Maryland (Dr Sher- man); and Duke Clinical Research Institute, Durham (Drs nees to register trials and record key on high-quality evidence, often de- Kramer and Tasneem and Ms Aberle). fined as evidence that emanates from Corresponding Author: Robert M. Califf, MD, Duke Translational Medicine Institute, 200 Trent Dr, 1117 trials with appropriate designs; Davison Bldg, Durham, NC 27710 (robert.califf@duke For editorial comment see p 1861. sufficiently large sample sizes; and .edu). 1838 JAMA, May 2, 2012—Vol 307, No. 17 ©2012 American Medical Association. All rights reserved. 15Downloaded From: http://jama.jamanetwork.com/ by a National Academy of Sciences User on 07/17/2012
    • CLINICAL TRIALS REGISTERED IN CLINICALTRIALS.GOV toring committees (DMCs) to protect comprehensive data dictionaries, is Within these specialty data sets, a few participants and ensure the trial’s in- available at the Clinical Trials Trans- data elements are missing because of tegrity.14 formation Initiative website.19 limitations in the data set or logistical In this article, we examine funda- Our analysis was restricted to inter- problems in obtaining analyzable in- mental characteristics of interven- ventional studies registered with formation. Specifically, the data ele- tional clinical trials in 3 major thera- ClinicalTrials.gov between October 2007 ment “human subject review” is not peutic areas contained in the registry and September 2010. To identify inter- present in the public download, and (cardiovascular, mental health, and on- ventional studies, we used the “study data regarding primary outcomes and cology), focusing on study character- type” field from the ClinicalTrials.gov oversight authority are not readily ana- istics (data elements reported in trial registry, which included the following lyzable because of the presence of free- registration) that are desirable for gen- choices: interventional, observational, ex- text values. erating reliable evidence from clinical panded access, and not available (NA) trials, including factors associated with (eAppendix 1). Interventional trials were Analytical Methods use of DMCs, randomization, and defined as “studies in human beings in Clinical trial characteristics were first blinding. which individuals are assigned by an in- assessed overall, by interventional trials, vestigator based on a protocol to re- and by 2 temporal subsets: October METHODS ceive specific interventions.” In this 2004 through September 2007 and Oc- The methods used by ClinicalTrials.gov study, the terms clinical trial, interven- tober 2007 through September 2010. to register clinical studies have been de- tional trial, and interventional study are The percentage of trials registered be- scribed previously.15-17 Briefly, spon- synonomous. Interventional studies were fore and after enrollment of the first par- sors and investigators from around the regrouped within the downloaded, de- ticipant was also determined by com- world enter data through a web-based rivative database according to the 3 clini- paring the date of registration to the data entry system. The country ad- cal specialties—cardiovascular, oncol- date that the first participant was en- dress of each facility (ie, a site that can ogy, and mental health—that together rolled. Other assessments included potentially enroll participants) was used encompass the largest number of dis- clinical trial characteristics, enroll- to group sites into regions according to ability-adjusted life-years lost in the ment characteristics, funding source, rubrics used by ClinicalTrials.gov.18 (In- United States.20 For this regrouping, we and number of study sites for all clini- dividual countries included in each re- used submitted disease condition terms cal trials vs cardiovascular, mental gion are available.) The sample we ex- and Medical Subject Heading (MeSH) health, and oncology trials for the lat- amined includes studies registered to terms generated by a National Library of ter time period (October 2007– comply with legal obligations, as well Medicine (NLM) algorithm to develop September 2010). Funding sources in- as those registered voluntarily to meet a methodology to annotate, validate, ad- cluded industry, NIH, other US federal ICMJE requirements or for other rea- judicate, and implement disease condi- (excluding NIH), and other. Frequen- sons. Similarly, data for registered stud- tion terms (MeSH and non-MeSH) to cre- cies and percentages are provided for ies include both mandatory and op- ate specialty data sets. categorical characteristics; medians and tional elements. Over time, the types, A subset of the 2010 MeSH thesau- interquartile ranges (IQRs) are pro- definitions, and optional vs manda- rus from the NLM21 and a list of non- vided for continuous characteristics. tory status of data elements have MeSH disease condition terms pro- Logistic regression analysis was per- changed. Mandatory and optional data vided by data submitters that appeared formed to calculate adjusted odds ra- elements for registration as of August in 5 or more interventional studies in the tios (ORs) with Wald 95% confidence 2011 are shown in eAppendix 1. analysis data set were reviewed and an- intervals for factors associated with notated by clinical specialists at Duke trials that report use of DMCs, random- ClinicalTrials.gov Data Set University Medical Center (eAppendix ization, and blinding. A full model con- We downloaded an XML data set com- 2). Terms were annotated according to taining 9 prespecified characteristics prising all 96 346 clinical studies reg- their relevance to a given specialty was developed. The first of these was istered with ClinicalTrials.gov as of Sep- (Y = relevant, N = not relevant). Spe- lead sponsor, which the NLM defines as tember 27, 2010—1 decade after the cialty data sets were created and the re- the primary organization that over- registry’s launch and 3 years after en- sults of algorithmic classifications were sees study implementation and is re- actment of the FDAAA. We loaded the validated by comparison with classifica- sponsible for conducting data analy- data set into a relational database tions based on manual review. Clinical sis.19 Collaborators are defined as other (Oracle RDBMS version 11.1g, Oracle trials were classed according to date reg- organizations (if any) that provide ad- Corporation) to facilitate aggregate istered and by interventional status. De- ditional support, including funding, de- analysis. This resource, the Database for tails regarding the creation of these spe- sign, implementation, data analysis, Aggregate Analysis of ClinicalTrials.gov cialty data sets are provided in an article and reporting. The sponsor (or desig- (AACT), as well as data definitions, and describing the study methodology.22 nee) is responsible for confirming all ©2012 American Medical Association. All rights reserved. JAMA, May 2, 2012—Vol 307, No. 17 1839 16Downloaded From: http://jama.jamanetwork.com/ by a National Academy of Sciences User on 07/17/2012
    • CLINICAL TRIALS REGISTERED IN CLINICALTRIALS.GOV collaborators before listing them. groups as 1, the value of allocation was trials and 70 (IQR, 35-190) for trials ClinicalTrials.gov stores funding orga- designated as nonrandomized and the that have been registered but not com- nization information in 2 data ele- value of blinding was designated as pleted. ments: lead sponsor and collaborator. open. TABLE 2 shows selected characteris- The NLM classifies submitted agency SAS version 9 (SAS Institute) was tics of all interventional trials regis- names in these data elements as indus- used for all statistical analyses. tered from October 2007 through Sep- try, NIH, US federal (excluding NIH), tember 2010 (n = 40 970), as well as or other. We derived probable funding RESULTS characteristics for oncology, cardiovas- source from the lead sponsor and col- Basic characteristics of all studies reg- cular, and mental health trials com- laborator fields using the following al- istered with ClinicalTrials.gov as of Sep- pared with all other trials. Of these 3 gorithm: if the lead sponsor was from tember 27, 2010 (N=96 346), all inter- categories, oncology trials were most industry, or the NIH was neither a lead ventional trials registered during the numerous (n=8992, 21.9%) and com- sponsor nor collaborator and at least 1 same period (n=79 413), and 2 recent prised the largest proportion of trials collaborator was from industry, then the subsets of interventional trials (Octo- listed as currently recruiting: 31.5% vs study was categorized as industry ber 2004–September 2007 and Octo- 9.3% and 10% for cardiovascular and funded. If the lead sponsor was not from ber 2007–September 2010) are shown mental health trials, respectively. On- industry, and NIH was either a lead in TABLE 1. The number of trials sub- cology trials also constituted the larg- sponsor or a collaborator, then the mitted for registration increased from est proportion of trials that were ac- study was categorized as NIH funded. 28 881 to 40 970 during the 2 periods. tive but not yet recruiting (25.8% vs Otherwise, if the lead sponsor and col- A decline in the numbers of missing 7.4% for cardiovascular and 7.5% for laborator fields were nonmissing, then data elements occurred for some char- mental health) and that were oriented the study was considered to be funded acteristics. The rate of registered trials toward treatment (25.7% vs 8% for car- by other. not reporting use of DMCs decreased diovascular and 9.6% for mental Also included in the model were from 57.9% to 18% between the 2 time health). Among trials oriented toward phase (0, 1, 1/2, 2, 2/3, 3, 4, NA); num- periods; not reporting either enroll- prevention, cardiovascular trials com- ber of participants; trial specialty— ment number or type (anticipated or ac- prised the largest group: 10.4% vs 8.1% cardiovascular, oncology, or mental tual) decreased from 33.8% to 1.8%; not for oncology and 5.9% for mental health (yes/no); trial start year; inter- reporting randomization decreased health. Cardiovascular trials also ac- vention type (procedure/device, drug from 5.6% to 4.2%; and not reporting counted for the largest proportion of or biological, behavioral, dietary supple- blinding decreased from 3.5% to 2.7%. trials assessing medical devices: 20.2% ment, other); and primary purpose The rate of missing data for primary vs 7.0% for oncology and 3.8% for men- (treatment, prevention, diagnostic, purpose increased from 4.6% to 6.8% tal health. As expected, among trials in- other). For the purposes of this mod- during these periods. The proportion corporating behavioral interventions, eling, studies reporting multiple inter- of trials reporting an NIH lead spon- mental health trials were most com- vention types were categorized in the sor decreased from 6.3% to 2.7% dur- mon: 33.4% vs 8.1% for oncology and following hierarchy: 1, procedure/ ing the during the 2 periods, and the 7.2% for cardiovascular. device; 2, drug/biological; 3, behav- proportion of trials with at least 1 North Enrollment and design characteris- ioral; 4, dietary supplement; 5, other. American research site decreased from tics for all interventional trials regis- Studies missing a response to any of the 61.9% to 57.5%. Other characteristics tered from October 2007 through Sep- data elements used in the model were have not changed substantially. tember 2010 are displayed in TABLE 3. excluded. The model predicting trials The proportion of trials registered be- There was heterogeneity in median with DMC was also run in 2 addi- fore beginning participant enrollment anticipated trial size according to spe- tional ways: (1) assuming that those increased over the 2 time periods: from cialty. Cardiovascular trials (median an- trials missing a response to the ques- 33% (9041/27 667) in October 2004– ticipated enrollment, 100; IQR, 42- tion regarding DMC did in fact have a September 2007 to 48% (19 347/ 280) tended to be nearly twice as large DMC, and (2) assuming that those trials 40 333) in October 2007–September as oncology trials (median, 54; IQR, 30- missing a response to the question re- 2010. 120), with mental health trials (me- garding DMC did not in fact have a The majority of clinical trials were dian, 85; IQR, 40-200) residing be- DMC. small in terms of numbers of partici- tween these 2. Cardiovascular and When possible for all analyses, val- pants. Overall, 96% of these trials had mental health trials were more ori- ues of missing methodological trial an anticipated enrollment of 1000 or ented toward later-phase research (ie, characteristics were inferred based on fewer participants and 62% had 100 or phases 3 and 4) while oncology trials other available data. For example, for fewer participants (eFigure). The me- displayed a higher relative proportion studies reporting an interventional dian number of participants per trial of earlier-phase trials (ie, phases 0 model of single group and number of was 58 (IQR, 27-161) for completed through 2). Trials restricted to women 1840 JAMA, May 2, 2012—Vol 307, No. 17 ©2012 American Medical Association. All rights reserved. 17Downloaded From: http://jama.jamanetwork.com/ by a National Academy of Sciences User on 07/17/2012
    • CLINICAL TRIALS REGISTERED IN CLINICALTRIALS.GOV Table 1. Characteristics for All Studies Registered in ClinicalTrials.gov, All Interventional Studies, and Interventional Trials From October 2004 Through September 2007 and From October 2007 Through September 2010 No./Total No. (%) Interventional Trials All Studies All, 2000-2010 Oct 2004–Sept 2007 Oct 2007–Sept 2010 (N = 96 346) (n = 79 413) a (n = 28 881) (n = 40 970) Primary purpose Treatment 59 200/75 778 (78.1) 59 200/75 198 (78.7) 22 242/27 565 (80.7) 28 605/38 199 (74.9) Prevention 8092/75 778 (10.7) 8092/75 198 (10.8) 3313/27 565 (12.0) 4152/38 199 (10.9) Diagnostic 2655/75 778 (3.5) 2655/75 198 (3.5) 1013/27 565 (3.7) 1489/38 199 (3.9) Other b 5831/75 778 (7.7) 5251/75 198 (7.0) 997/27 565 (3.6) 3953/38 199 (10.3) Missing 20 568/96 346 (21.3) 4215/79 413 (5.3) 1316/28 881 (4.6) 2771/40 970 (6.8) Intervention type c Drug 53 441/84 614 (63.2) 52 162/79 410 (65.7) 19 851/28 881 (68.7) 24 751/40 970 (60.4) Procedural 10 911/84 614 (12.9) 9635/79 410 (12.1) 3807/28 881 (13.2) 4104/40 970 (10.0) Biological 6841/84 614 (8.1) 6657/79 410 (8.4) 2049/28 881 (7.1) 2948/40 970 (7.2) Behavioral 7134/84 614 (8.4) 6582/79 410 (8.3) 2781/28 881 (9.6) 3307/40 970 (8.1) Device 6662/84 614 (7.9) 6012/79 410 (7.6) 2092/28 881 (7.2) 3799/40 970 (9.3) Radiation 2361/84 614 (2.8) 2292/79 410 (2.9) 494/28 881 (1.7) 928/40 970 (2.3) Dietary supplement 2067/84 614 (2.4) 2036/79 410 (2.6) 330/28 881 (1.1) 1603/40 970 (3.9) Genetic 1096/84 614 (1.3) 712/79 410 (0.9) 261/28 881 (0.9) 381/40 970 (0.9) Other 8211/84 614 (9.7) 6625/79 410 (8.3) 1339/28 881 (4.6) 5110/40 970 (12.5) Region c Africa 2091/86 404 (2.4) 1904/72 149 (2.6) 892/25 924 (3.4) 817/37 520 (2.2) Asia and Pacific 11 659/86 404 (13.5) 9953/72 149 (13.8) 3725/25 924 (14.4) 5768/37 520 (15.4) Central and South America 3953/86 404 (4.6) 3565/72 149 (4.9) 1240/25 924 (4.8) 1793/37 520 (4.8) Europe 23 853/86 404 (27.6) 20 337/72 149 (28.2) 8006/25 924 (30.9) 11 311/37 520 (30.1) Middle East 3587/86 404 (4.2) 2852/72 149 (4.0) 1152/25 924 (4.4) 1545/37 520 (4.1) North America 54 257/86 404 (62.8) 45 745/72 149 (63.4) 16 044/25 924 (61.9) 21 581/37 520 (57.5) Missing 9942/96 346 (10.3) 7264/79 413 (9.1) 2957/28 881 (10.2) 3450/40 970 (8.4) Anticipated enrollment, No. of participants 1-100 29 510/51 066 (57.8) 25 405/41 177 (61.7) 6415/10 611 (60.5) 17 726/28 458 (62.3) 101-1000 18 252/51 066 (35.7) 13 997/41 177 (34.0) 3669/10 611 (34.6) 9629/28 458 (33.8) Ͼ1000 3304/51 066 (6.5) 1775/41 177 (4.3) 527/10 611 (5.0) 1103/28 548 (3.9) Study registration Before first participant enrolled 9041/27 667 (32.7) 19 347/40 333 (48.0) After first participant enrolled 18 626/27 667 (67.3) 20 986/40 333 (52.0) Missing enrollment number or type (anticipated or actual) 20 926/96 346 (21.7) 16 934/79 413 (21.3) 9757/28 881 (33.8) 756/40 970 (1.8) Blinding d Open 40 520/72 475 (55.9) 40 520/72 475 (55.9) 15 571/27 880 (55.9) 22 234/39 871 (55.8) Single blind 7006/72 475 (9.7) 7006/72 475 (9.7) 2364/27 880 (8.5) 4457/39 871 (11.2) Double blind 24 949/72 475 (34.4) 24 949/72 475 (34.4) 9945/27 880 (35.7) 13 180/39 871 (33.1) Missing 23 871/96 346 (24.8) 6983/79 413 (8.7) 1001/28 881 (3.5) 1099/40 970 (2.7) Allocation status Randomized d 49 762/71 046 (70.0) 49 762/70 953 (70.13) 19 218/27 265 (70.5) 27 027/39 240 (68.9) Nonrandomized d 21 191/71 046 (29.83) 21 191/70 953 (29.87) 8047/27 265 (29.5) 12 213/39 240 (31.1) Missing 25 300/93 346 (26.3) 8460/79 413 (10.7) 1616/28 881 (5.6) 1730/40 970 (4.2) DMC Study has DMC 22 572/56 856 (39.7) 19 886/46 699 (42.6) 5711/12 153 (47.0) 13 644/33 615 (40.6) Study does not have DMC 34 284/56 856 (60.3) 26 813/46 699 (57.4) 6442/12 153 (53.0) 19 971/33 615 (59.4) Missing 39 490/96 346 (41.0) 32 714/79 413 (41.2) 16 728/28 881 (57.9) 7355/40 970 (18.0) Lead sponsor Industry 31 173/96 346 (32.4) 28 264/79 413 (35.6) 10 783/28 881 (37.3) 15 248/40 970 (37.2) NIH 9215/96 346 (9.6) 5878/79 413 (7.4) 1825/28 881 (6.3) 1106/40 970 (2.7) US federal 1715/96 346 (1.8) 1473/79 413 (1.9) 644/28 881 (2.2) 547/40 970 (1.3) Other 54 243/96 346 (56.3) 43 798/79 413 (55.2) 15 629/28 881 (54.1) 24 069/40 970 (58.7) Abbreviations: DMC data monitoring committee; NIH, National Institutes of Health. a Includes 9562 interventional trials registered before October 2004. b Includes supportive care, screening, health services research, and basic science. c Percentages may not sum to 100% as categories are not mutually exclusive. d Only collected for interventional studies. ©2012 American Medical Association. All rights reserved. JAMA, May 2, 2012—Vol 307, No. 17 1841 18Downloaded From: http://jama.jamanetwork.com/ by a National Academy of Sciences User on 07/17/2012
    • CLINICAL TRIALS REGISTERED IN CLINICALTRIALS.GOV were almost twice as common as trials Geographical differences were also health), and the majority of oncology restricted to men (9.1% vs 5.4%), a dif- apparent. Cardiovascular trials showed trials (87.6%) were not blinded. Men- ference driven largely by oncology trials the smallest proportion of studies with tal health trials, on the other hand, were (13.8% exclude men, compared with at least 1 North American research site more likely to be blinded (60.0%, vs 2.0% [cardiovascular] and 5.8% [men- (47.9%, vs 65.1% for oncology and 12.4% for oncology and 49.0% for car- tal health]). 69.1% for mental health) and the most diovascular), to use parallel-group de- There were also differences in age dis- substantial proportion of trials with at sign (65.9% vs 32.5% for oncology and tribution among therapeutic areas. Men- least 1 European site (39.9% vs 27.6% 63.2% for cardiovascular), and to use tal health trials were most likely to per- and 20.9%, respectively). randomization (80.1%, vs 36.3% for on- mitinclusionofchildren(17.9%vs11.3% Differences in trial design were also cology and 73.7% for cardiovascular). for oncology and 10.5% for cardiovascu- evident among therapeutic areas Data on funding source and number lar) but were also most likely to exclude (Table 3). Oncology trials were more of sites were available for 37 520 of elderlyparticipants:56%ofmentalhealth likely to involve a single group of par- 40 970 clinical trials registered during the trials excluded participants older than 65 ticipants with no randomization of treat- 2007-2010 period (eTable 2). The larg- years compared with 8.1% for oncology ment assignment (64.7% vs 26.2% for est proportion of these trials were not and 13.3% for cardiovascular. cardiovascular and 20.8% for mental funded by industry or the NIH (47%, Table 2. Clinical Trial Attributes by Therapeutic Area for All Interventional Trials, October 2007–September 2010 No./Total No. (%) All Trials Oncology Cardiovascular Mental Health (n = 40 970) (n = 8992) (n = 3437) (n = 3695) Overall status Not yet recruiting 3725/40 970 (9.1) 744/8992 (8.3) 362/3437 (10.5) 366/3695 (9.9) Recruiting 17 348/40 970 (42.3) 5456/8992 (60.7) 1609/3437 (46.8) 1727/3695 (46.7) Completed 12 265/40 970 (29.9) 961/8992 (10.7) 855/3437 (24.9) 1005/3695 (27.2) Suspended 254/40 970 (0.6) 101/8992 (1.1) 16/3437 (0.5) 11/3695 (0.3) Terminated 1180/40 970 (2.9) 265/8992 (2.9) 91/3437 (2.6) 111/3695 (3.0) Withdrawn 307/40 970 (0.7) 84/8992 (0.9) 18/3437 (0.5) 22/3695 (0.6) Active, not recruiting 4985/40 970 (12.2) 1286/8992 (14.3) 371/3437 (10.8) 372/3695 (10.1) Enrolling by invitation 906/40 970 (2.2) 95/8992 (1.1) 115/3437 (3.3) 81/3695 (2.2) Primary completion type a Actual 12 428/37 912 (32.8) 1194/8373 (14.3) 881/3218 (27.4) 992/3435 (28.9) Anticipated 25 484/37 912 (67.2) 7179/8373 (85.7) 2337/3218 (72.6) 2443/3435 (71.1) Missing 3058/40 970 (7.5) 619/8992 (6.9) 219/3437 (6.4) 260/3695 (7.0) Primary purpose Treatment 28 605/38 199 (74.9) 7353/8744 (84.1) 2284/3285 (69.5) 2750/3489 (78.8) Prevention 4152/38 199 (10.9) 338/8744 (3.9) 431/3285 (13.1) 247/3489 (7.1) Diagnosis 1489/38 199 (3.9) 470/8744 (5.4) 244/3285 (7.4) 71/3489 (2.0) Supportive care 1290/38 199 (3.4) 377/8744 (4.3) 75/3285 (2.3) 119/3489 (3.4) Screening 195/38 199 (0.5) 63/8744 (0.7) 18/3285 (0.5) 21/3489 (0.6) Health services research 733/38 199 (1.9) 64/8744 (0.7) 73/3285 (2.2) 107/3489 (3.1) Basic science 1735/38 199 (4.5) 79/8744 (0.9) 160/3285 (4.9) 174/3489 (5.0) Missing 2771/40 970 (6.8) 248/8992 (2.8) 152/3437 (4.4) 206/3695 (5.6) Intervention type b Drug 24 751/40 970 (60.4) 6485/8992 (72.1) 1629/3437 (47.4) 2172/3695 (58.8) Procedure 4104/40 970 (10.0) 1420/8992 (15.8) 449/3437 (13.1) 124/3695 (3.4) Biological 2948/40 970 (7.2) 1109/8992 (12.3) 86/3437 (2.5) 36/3695 (1.0) Behavioral 3307/40 970 (8.1) 269/8992 (3.0) 238/3437 (6.9) 1103/3695 (29.9) Device 3799/40 970 (9.3) 267/8992 (3.0) 766/3437 (22.3) 145/3695 (3.9) Radiation 928/40 970 (2.3) 811/8992 (9.0) 20/3437 (0.6) 17/3695 (0.5) Dietary supplement 1603/40 970 (3.9) 171/8992 (1.9) 141/3437 (4.1) 99/3695 (2.7) Genetic 381/40 970 (0.9) 313/8992 (3.5) 13/3437 (0.4) 8/3695 (0.2) Other 5110/40 970 (12.5) 1369/8992 (15.2) 439/3437 (12.8) 434/3695 (11.7) a Allowable values for this field are actual and anticipated. “Primary completion date” denotes the date that the final research participant was examined or received an intervention for the purposes of final collection of data for the primary outcome, whether the clinical trial concluded according to the prespecified protocol or was terminated. b Percentages may not sum to 100% as categories are not mutually exclusive. 1842 JAMA, May 2, 2012—Vol 307, No. 17 ©2012 American Medical Association. All rights reserved.Downloaded From: http://jama.jamanetwork.com/ by a National Academy of Sciences User on 07/17/2012 19
    • CLINICAL TRIALS REGISTERED IN CLINICALTRIALS.GOV Table 3. Trial Characteristics and Summary of Designs for All Interventional Trials, Registered October 2007–September 2010 No./Total No. (%) All Clinical Trials Oncology Cardiovascular Mental Health (n = 40 970) (n = 8992) (n = 3437) (n = 3695) Actual enrollment, median (IQR), 58.0 (27.0-161.0) 43.0 (18.0-100.0) 73.5 (34.0-200.0) 61.0 (29.0-216.5) No. of participants 1-100 7566/11 671 (64.8) 853/1139 (74.9) 475/807 (58.9) 584/968 (60.3) 101-1000 3744/11 671 (32.1) 259/1139 (22.7) 291/807 (36.1) 359/968 (37.1) Ͼ1000 361/11 671 (3.1) 27/1139 (2.4) 41/807 (5.1) 25/968 (2.6) Anticipated enrollment, median (IQR), 70.0 (35.0-190.0) 54.0 (30.0-120.0) 100.0 (42.0-280.0) 85.0 (40.0-200.0) No. of participants a 1-100 17 726/28 467 (62.3) 5597/7728 (72.4) 1331/2569 (51.8) 1494/2675 (55.8) 101-1000 9629/28 467 (33.8) 1937/7728 (25.1) 1041/2569 (40.5) 1086/2675 (40.6) Ͼ1000 1103/28 467 (3.9) 194/7728 (2.5) 197/2569 (7.7) 95/2675 (3.6) Sex, % Female only 3730/40 970 (9.1) 1240/8992 (13.8) 68/3437 (2.0) 215/3695 (5.8) Male only 2228/40 970 (5.4) 542/8992 (6.0) 120/3437 (3.5) 170/3695 (4.6) Both 35 012/40 970 (85.5) 7210/8992 (80.2) 3249/3437 (94.5) 3310/3695 (89.6) Includes children (Ͻ18 y) 7113/40 970 (17.4) 1015/8992 (11.3) 362/3437 (10.5) 663/3695 (17.9) Excludes elderly (Ͼ65 y) 13 047/40 970 (31.8) 727/8992 (8.1) 458/3437 (13.3) 2071/3695 (56.0) Regional distribution b,c Africa 817/37 520 (2.2) 76/8358 (0.9) 53/3164 (1.7) 43/3384 (1.3) Asia and Pacific 5768/37 520 (15.4) 1347/8358 (16.1) 538/3164 (17.0) 361/3384 (10.7) Central and South America 1793/37 520 (4.8) 211/8358 (2.5) 172/3164 (5.4) 111/3384 (3.3) Europe 11 311/37 520 (30.1) 2308/8358 (27.6) 1261/3164 (39.9) 707/3384 (20.9) Middle East 1545/37 520 (4.1) 204/8358 (2.4) 134/3164 (4.2) 127/3384 (3.8) North America 21 581/37 520 (57.5) 5445/8358 (65.1) 1516/3164 (47.9) 2338/3384 (69.1) Interventional group Single group 12 144/38 969 (31.2) 4822/7451 (64.7) 890/3394 (26.2) 754/3620 (20.8) Parallel 21 782/38 969 (55.9) 2422/7451 (32.5) 2145/3394 (63.2) 2386/3620 (65.9) Crossover 4351/38 969 (11.2) 140/7451 (1.9) 295/3394 (8.7) 353/3620 (9.8) Factorial 692/38 969 (1.8) 67/7451 (0.9) 64/3394 (1.9) 127/3620 (3.5) Missing 2001/40 970 (4.9) 1541/8992 (17.1) 43/3437 (1.3) 75/3695 (2.0) Blinding Open 22 234/39 871 (55.8) 7342/8386 (87.6) 1731/3394 (51.0) 1451/3629 (40.0) Single blind 4457/39 871 (11.2) 288/8386 (3.4) 484/3394 (14.3) 538/3629 (14.8) Double blind 13 180/39 871 (33.1) 756/8386 (9.0) 1179/3394 (34.7) 1640/3629 (45.2) Missing 1099/40 970 (2.7) 606/8992 (6.7) 43/3437 (1.3) 66/3695 (1.8) Allocation Randomized 27 027/39 240 (68.9) 2919/8035 (36.3) 2481/3366 (73.7) 2893/3612 (80.1) Nonrandomized 12 213/39 240 (31.1) 5116/8035 (63.7) 885/3366 (26.3) 719/3612 (19.9) Missing 1730/40 970 (4.2) 957/8992 (10.6) 71/3437 (2.1) 83/3695 (2.2) Phase 0 316/40 970 (0.8) 69/8992 (0.8) 29/3437 (0.8) 47/3695 (1.3) 1 6222/40 970 (15.2) 1884/8992 (21.0) 210/3437 (6.1) 391/3695 (10.6) 1/2 2105/40 970 (5.1) 950/8992 (10.6) 109/3437 (3.2) 124/3695 (3.4) 2 8484/40 970 (20.7) 3436/8992 (38.2) 471/3437 (13.7) 678/3695 (18.3) 2/3 1055/40 970 (2.6) 124/8992 (1.4) 113/3437 (3.3) 118/3695 (3.2) 3 6197/40 970 (15.1) 1025/8992 (11.4) 531/3437 (15.4) 559/3695 (15.1) 4 5559/40 970 (13.6) 234/8992 (2.6) 761/3437 (22.1) 584/3695 (15.8) NA 11 032/40 970 (26.9) 1270/8992 (14.1) 1213/3437 (35.3) 1194/3695 (32.3) Study has DMC d 13 644/33 615 (40.6) 3322/6316 (52.6) 1578/3125 (50.5) 1334/3189 (41.8) Abbreviations: DMC, data monitoring committee; IQR, interquartile range; NA, not applicable. a Denominators represent number of trials with information available on enrollment. Out of 40 970 trials, 756 (1.8%) were missing information on enrollment or on whether it was actual or anticipated; 107/8992 (1.2%) oncology trials, 30/3437 (1.7%) cardiovascular trials, and 43/3695 (1.2%) mental health trials were missing information on enrollment or on whether it was actual or anticipated. b Denominators represent number of trials with information available on region. Out of 40 970 trials, 3450 (8.4%) were missing region information; 634/8992 (7.1%) of oncology trials, 273/3437 (7.9%) of cardiovascular trials, and 311/3695 (8.4%) mental health trials were missing region information. c Percentages may not sum to 100% as categories are not mutually exclusive. d Denominators represent number of clinical trials that reported information on DMCs. Out of 40 970 trials, 7355 (18.0%) were missing information on DMC status; 2676/8992 (29.8%) oncology trials, 312/3437 (9.1%) cardiovascular trials, and 506/3695 (13.7%) mental health trials were missing DMC information. ©2012 American Medical Association. All rights reserved. JAMA, May 2, 2012—Vol 307, No. 17 1843Downloaded From: http://jama.jamanetwork.com/ by a National Academy of Sciences User on 07/17/2012 20
    • CLINICAL TRIALS REGISTERED IN CLINICALTRIALS.GOV n=17 592) with 16 674 (44%) funded by Regression analyses comparing trial of behavioral interventions were less industry, 3254 (9%) funded by the NIH, characteristics as they relate to use of likely to report use of DMCs. and 757 (2.0%) funded by other US DMCs, blinding, and randomization are There were small differences in re- federal agencies. The majority of trials displayed in TABLE 4. Compared with porting of blinding or randomization by were single site (24 788, 66%); 12 732 trials in which industry was the lead different lead sponsor organizations. For (34%) were multisite trials. The largest sponsor, other types of lead sponsors example, trials in which a US federal proportion of trials (39%, 14 637/ were more likely to report use of DMCs agency (excluding the NIH) or another 37 520) comprised single-site trials that with DMCs most common among NIH- sponsor was the lead sponsor were less were not funded by the NIH or by in- sponsored trials (adjusted OR, 9.09; likely to report use of blinding (ad- dustry (see eTable 2; note: this ex- 95% CI, 7.28-11.34). Reported use of justed OR, 0.65; 95% CI, 0.51-0.83; and cluded 3450 trials [8%] with missing data DMCs was less common in industry- adjusted OR, 0.90; 95% CI, 0.84-0.96, re- on facility location). These single-site sponsored vs NIH-sponsored trials (ad- spectively). Relative to phase 3 trials, ear- trials not funded by the NIH or indus- justed OR, 0.11; 95% CI, 0.09-0.14). lier- and later-phase trials were also less try were typically small: approximately Relative to phase 3 trials, earlier- and likely to report use of blinding (ad- 70% had enrolled or planned to enroll later-phase trials were less likely to re- justed OR, 0.66; 95% CI, 0.60-0.72 [ear- fewer than 100 participants. They were port use of DMCs (adjusted OR, 0.83; lier phase]; adjusted OR, 0.50; 95% CI, characterized primarily by North Ameri- 95% CI, 0.76-0.91 [earlier phase]; ad- 0.45-0.55 [later phase]) and randomiza- can (46.1%) and European (30.1%) rep- justed OR, 0.52; 95% CI, 0.47-0.58 tion (adjusted OR, 0.28; 95% CI, 0.25- resentation. Industry-funded multi- [later phase]). Compared with cardio- 0.31[earlier phase]; adjusted OR, 0.37; center trials included Asian and Pacific vascular and oncology trials, mental 95% CI, 0.33-0.42 [later phase]). On- sites in 27% of trials and European sites health trials were less likely to report cology trials were less likely to use ran- in 41.2% of trials, with 33.4% of trials not use of DMCs. When compared with domization (adjusted OR, 0.20; 95% CI, involving any North American sites. trials evaluating drugs or biologics, trials 0.19-0.22) and blinding (adjusted OR, Table 4. Regression Analyses of Interventional Trials Registered in ClinicalTrials.gov, October 2007–September 2010, and the Reported Use of DMC, Blinding, and Randomization Adjusted OR (95% CI) P P P Variable DMC Value Blinding Value Randomization Value Sponsoring agency (vs industry) NIH 9.087 (7.279-11.343) Ͻ.001 1.055 (0.880-1.265) .02 1.337 (1.061-1.685) .02 Other 2.984 (2.773-3.211) .07 0.895 (0.836-0.959) .80 1.031 (0.959-1.109) .39 US federal 2.174 (1.697-2.784) .01 0.653 (0.512-0.831) Ͻ.001 0.975 (0.727-1.308) .38 Study phase (vs phase 3) NA 0.503 (0.452-0.559) Ͻ.001 0.539 (0.487-0.598) Ͻ.001 0.365 (0.322-0.413) .71 0 0.623 (0.440-0.881) .25 0.622 (0.432-0.895) .77 0.200 (0.136-0.295) Ͻ.001 1 0.685 (0.607-0.772) .11 0.382 (0.341-0.428) Ͻ.001 0.162 (0.143-0.183) Ͻ.001 1/2 0.961 (0.830-1.113) Ͻ.001 0.516 (0.442-0.601) Ͻ.001 0.185 (0.159-0.216) Ͻ.001 2 0.868 (0.786-0.958) Ͻ.001 0.843 (0.767-0.926) Ͻ.001 0.362 (0.325-0.404) .80 2/3 0.974 (0.808-1.174) Ͻ.001 1.172 (0.968-1.419) Ͻ.001 0.917 (0.712-1.179) Ͻ.001 4 0.525 (0.470-0.588) Ͻ.001 0.502 (0.453-0.555) Ͻ.001 0.375 (0.331-0.425) .35 Study size (per 1000 additional participants) 1.030 (1.009-1.051) .004 1.005 (0.993-1.018) .40 1.022 (1.000-1.045) .05 Cardiovascular (yes vs no) 1.745 (1.583-1.923) Ͻ.001 0.966 (0.881-1.060) .47 0.970 (0.869-1.081) .58 Oncology (yes vs no) 1.591 (1.470-1.723) Ͻ.001 0.102 (0.093-0.112) Ͻ.001 0.202 (0.187-0.218) Ͻ.001 Mental health (yes vs no) 1.079 (0.976-1.194) .14 1.428 (1.299-1.569) Ͻ.001 1.026 (0.915-1.151) .66 Intervention type (vs drug or biological) Behavioral 0.691 (0.611-0.782) Ͻ.001 0.629 (0.558-0.710) Ͻ.001 3.216 (2.691-3.844) Ͻ.001 Dietary supplement 0.901 (0.763-1.064) .88 2.945 (2.449-3.541) Ͻ.001 2.651 (2.097-3.352) Ͻ.001 Other 0.900 (0.797-1.016) .86 0.669 (0.591-0.758) Ͻ.001 1.078 (0.944-1.231) Ͻ.001 Procedure/device 1.009 (0.926-1.100) Ͻ.001 0.513 (0.471-0.558) Ͻ.001 0.756 (0.692-0.825) Ͻ.001 Start year (increment of 1 y) 1.063 (1.049-1.078) Ͻ.001 1.023 (1.012-1.035) Ͻ.001 1.005 (0.993-1.018) .41 Primary category (vs treatment) Diagnostic 0.639 (0.542-0.754) Ͻ.001 0.569 (0.477-0.679) Ͻ.001 0.229 (0.193-0.270) Ͻ.001 Other 0.737 (0.659-0.823) Ͻ.001 1.112 (1.000-1.237) Ͻ.001 1.013 (0.900-1.140) Ͻ.001 Prevention 1.172 (1.058-1.299) Ͻ.001 1.151 (1.045-1.268) Ͻ.001 1.449 (1.282-1.638) Ͻ.001 Abbreviations: DMC, data monitoring committee; NA, not applicable; NIH, National Institutes of Health; OR, odds ratio. 1844 JAMA, May 2, 2012—Vol 307, No. 17 ©2012 American Medical Association. All rights reserved.Downloaded From: http://jama.jamanetwork.com/ by a National Academy of Sciences User on 07/17/2012 21
    • CLINICAL TRIALS REGISTERED IN CLINICALTRIALS.GOV 0.10; 95% CI, 0.09-0.11) while mental of which are not funded by the NIH or earlier-phase drug evaluations, or in- health trials were not more likely to use industry. Many registered trials contain vestigations of biological or behav- randomization (adjusted OR, 1.03; 95% significant heterogeneity in methodologi- ioral mechanisms, rather than clinical CI, 0.92-1.15) but were more likely to cal approaches, including reported use outcomes). Particularly in oncology, use blinding (adjusted OR, 1.43; 95% CI, of randomization, blinding, and DMCs. there is a growing sense that small trials 1.30-1.57). When compared with trials Although ClinicalTrials.gov has a num- based on genetics or biomarkers can evaluating drugs or biologics, trials of be- ber of limitations, it is the largest aggre- yield definitive results.25 However, small havioral interventions were less likely to gate resource for informing policy analy- trials are unlikely to be informative in report use of blinding (adjusted OR, 0.63; sis about the US clinical trials enterprise. many other settings, such as establish- 95% CI, 0.56-0.71) but more likely to use We anticipate that the “sunshine” on the ing the effectiveness of treatments with randomization (adjusted OR, 3.22; 95% national clinical trials portfolio brought modest effects and comparing effec- CI, 2.69-3.84). Trials of dietary supple- about by ClinicalTrials.gov, coupled with tive treatments to enable better deci- ments were more likely to use blinding the greater ease of obtaining an analysis sions in practice.26-28 Preliminary ob- (adjusted OR, 2.95; 95% CI, 2.45-3.54) data set from the database for AACT,19 servations suggest that many small and randomization (adjusted OR, 2.65; will engender much-needed debate about clinical trials were designed to enroll 95% CI, 2.10-3.35) and procedure and clinical trial methodologies and fund- more participants, raising questions device trials were less likely to use blind- ing allocation. about their ultimate power (D. A. Za- ing (adjusted OR, 0.51; 95% CI, 0.47- Many of the differences noted in the rin, MD, written communication, 0.56) and randomization (adjusted OR, present study have been identified be- March 28, 2012), but an accurate de- 0.76; 95% CI, 0.69-0.83). fore and likely represent variation in ap- piction of these issues requires a more More recent trials (reference: per propriate approaches for particular dis- in-depth analysis. These findings raise 1-year increment) were more likely to eases. Reviews of samples from the important issues that should be ad- report use of DMCs (adjusted OR, 1.06; literature in 198023 and 200024 raised dressed by detailed, specialty- 95% CI, 1.05-1.08) and blinding (ad- similar questions, for which this re- oriented assessments of the utility of the justed OR, 1.02; 95% CI, 1.01-1.04) but port provides a contemporary and more large number of small trials. no more likely to use randomization comprehensive sample. Despite con- A comprehensive collection of all (adjusted OR, 1.01; 95% CI, 0.99- cerns previously articulated by Mein- clinical trials on a global basis would 1.02). Larger trials were more likely to ert et al23 and Chan and Altman24— enable the most effective examination report use of DMCs (adjusted OR, 1.03; concerns that included a relatively high of evidence to support medical deci- 95% CI, 1.01-1.05) and randomiza- prevalence of clinical trials with inad- sions. The effect of the globalization of tion (adjusted OR, 1.02; 95% CI, 1.00- equate sample sizes and insufficiently clinical research has been debated,29-32 1.05). Diagnostic trials were less likely described methodologies—disparities and emerging evidence of differential to report use of all 3 methods (DMCs: still remain across specialties. This in regional involvement as a function of adjusted OR, 0.64; 95% CI, 0.54-0.75; turn raises questions about why such therapeutic area also raises questions blinding: adjusted OR, 0.57; 95% CI, heterogeneity persists, whether the relevant to policy and strategy. Al- 0.48-0.68; randomization: adjusted OR, portfolio documented by this analysis though the World Health Organiza- 0.23; 95% CI, 0.19-0.27) while preven- suffices to address gaps in evidence, and tion (WHO) provides a portal for many tion trials were more likely to use all 3 the reasons underlying differences in trial registries from around the world, compared with treatment trials (DMCs: trial methodology. It is particularly im- unacknowledged duplicate entries adjusted OR, 1.17; 95% CI, 1.06-1.30; portant to identify cases in which such make it difficult to determine a unique blinding: adjusted OR, 1.15; 95% CI, methodological differences lack ad- list of clinical trials; in addition, the 1.05-1.27; randomization: adjusted OR, equate scientific justification, as they overall data set is not available for elec- 1.45; 95% CI, 1.28-1.64). Finally, an may present an opportunity for im- tronic download, rendering the data un- analysis of blinding using only random- proving the public health through ad- available for aggregate analysis.33 ized trials produced results similar to justments to research investment strat- Attention to standards for nondrug in- the blinding analysis using all inter- egies and methods. terventions (eg, biologics, devices, and ventional trials, and oncology trials in procedures) as well as study design particular were less likely to report use Implications for Policy would also enhance the ability to de- of blinding in the context of a random- and Strategy scribe and understand the clinical trials ized design (␹2 =933; P Ͻ.001). The fact that 50% of interventional stud- enterprise.34 Indeed, as Devereaux and ies registered from October 2007 to Sep- colleagues35 point out, concepts as fun- COMMENT tember 2010 by design include fewer damental as blinding are shrouded in ter- Clinical studies registered in the than 70 participants may have impor- minological confusion and ambiguity. ClinicalTrials.gov database are domi- tant policy implications. Small trials Furthermore, lack of clarity surround- nated by small, single-center trials, many may be appropriate in many cases (eg, ing the naming of devices and biolog- ©2012 American Medical Association. All rights reserved. JAMA, May 2, 2012—Vol 307, No. 17 1845 22Downloaded From: http://jama.jamanetwork.com/ by a National Academy of Sciences User on 07/17/2012
    • CLINICAL TRIALS REGISTERED IN CLINICALTRIALS.GOV ics makes examination of specific medi- under US jurisdiction. Also, although CONCLUSIONS cal technologies difficult. many trialists from other countries use The clinical trials enterprise as revealed Although the industry is the lead ClinicalTrials.gov to satisfy ICMJE reg- by the contents of ClinicalTrials.gov is sponsor in only about 36% of interven- istration requirements,7 other registries dominated by small clinical trials and tional trials in this study, these ac- aroundtheworldmaybeused.10 However, contains significant heterogeneity in counted for 59% of all trial partici- ClinicalTrials.gov still accounts for more methodological approaches, includ- pants. Further analysis of trials in each than80%ofallclinicalstudiesintheWHO ing the use of randomization, blind- specialty may help elucidate this com- portal, as based on comparisons of the ing, and DMCs. Our analysis raises plex mix of funding, trial size, and lo- number of clinical studies appearing in questions about the best methods for cation so that policies might be en- the ClinicalTrials.gov registry divided by generating evidence, as well as the acted to improve the responsiveness of the number of unique studies appearing capacity of the clinical trials enterprise trials to the needs of public health and in the WHO portal. to supply sufficient amounts of high- the overall research community. Second, there have been changes over quality evidence needed to ensure con- Methodological differences across time in the data collected, the defini- fidence in guideline recommenda- therapeutic areas are also of interest. tions used, and the rigor with which tions. Given the deficit in evidence to The greater focus on earlier-phase trials missing data are pursued. As de- support key decisions in clinical prac- and biomarker-based personalized scribed in the “Methods” section, some tice guidelines11,12 as well as concerns medicine25 may explain some of the dif- data elements were either missing or about insufficient numbers of volun- ferences in approach evident with on- unavailable because of practical or lo- teers for trials,36 the desire to provide cology trials, but substantial differ- gistical limitations. Some of these is- high-quality evidence for medical deci- ences in the use of randomization and sues can be addressed by focused analy- sions must include consideration of a blinding across specialties persist af- ses in which ancillary data sets are comprehensive redesign of the clinical ter adjustment for phase, raising fun- created or review of primary protocols trial enterprise. damental questions about the ability to and studies is done. In addition, the po- Author Contributions: Dr Califf had full access to all draw reliable inferences from clinical tential for serious sanctions for incom- of the data in the study and takes responsibility for research conducted in that arena. plete data under the FDAAA may have the integrity of the data and the accuracy of the data analysis. The reporting of use of a DMC is an improved data collection for those fields Study concept and design: Califf, Zarin, Sherman, optional data element within the in recent years. As noted earlier, we Tasneem. Acquisition of data: Zarin, Tasneem. ClinicalTrials.gov registry. The appro- used the study type field from the Analysis and interpretation of data: Califf, Zarin, priate criteria for determining when a ClinicalTrials.gov registry to identify in- Kramer, Aberle, Tasneem. Drafting of the manuscript: Califf, Sherman. DMC is useful or required remain con- terventional studies; however, we did Critical revision of the manuscript for important in- troversial. Yet the heterogeneity ob- not perform additional manual screen- tellectual content: Califf, Zarin, Kramer, Aberle, served by trial phase, disease cat- ing to identify and exclude possibly mis- Tasneem. Statistical analysis: Califf, Aberle. egory, and lead sponsor category in this classified observational studies. Obtained funding: Califf, Kramer. study (eg, industry vs government Third, the need for a standard ontol- Administrative, technical, or material support: Califf, Zarin, Tasneem. sponsorship) may represent an oppor- ogy to describe clinical research remains Study supervision: Califf, Zarin, Sherman. tunity for mutual learning and com- a pressing concern. Current definitions Conflict of Interest Disclosures: All authors have com- pleted and submitted the ICMJE Form for Disclosure promise among disparate views. The were developed to help individuals find of Potential Conflicts of Interest. Dr Califf reports re- trend toward increased reporting of use particular trials or were legally mandated ceiving research grants that partially support his sal- ary from Amylin, Johnson & Johnson (Scios), Merck, of DMCs over time in this study is no- without necessarily involving experts or Novartis Pharma, Schering Plough, Bristol-Myers table, but clear policies would be use- allowing time for testing. Consequently, Squibb Foundation, Aterovax, Bayer, Roche, and Lilly; ful to those researchers designing trials. some data remain ambiguous, compli- all grants are paid to Duke University. Dr Califf also consults for TheHeart.org, Johnson & Johnson (Scios), For example, many different arrange- cating efforts to combine and analyze re- Kowa Research Institute, Nile, Parkview, Orexigen ments can be made for monitoring sults in a given therapeutic area or across Therapeutics, Pozen, Servier International, WebMD, Bristol-Myers Squibb Foundation, AstraZeneca, Bayer- safety in clinical trials, and the cur- areas. For example, the terms interven- OrthoMcNeil, BMS, Boerhinger Ingelheim, Daiichi San- rent data only reflect the presence of a tional trial and clinical trial are critical kyo, GlaxoSmithKline, Li Ka Shing Knowledge Insti- tute, Medtronic, Merck, Novartis, sanofi-aventis, typical, well-defined DMC. for distinguishing purely observational XOMA, and University of Florida; all income from these studies from those that assign partici- consultancies is donated to nonprofit organizations, Limitations pants to an interventional therapy. Fur- with the majority going to the clinical research fel- lowship fund of the Duke Clinical Research Institute. Several limitations of our study should be ther refinement of this definition9 could Dr Califf holds equity in Nitrox LLC. Dr Kramer is the noted. First, ClinicalTrials.gov does not be helpful to those interested in differ- executive director of the Clinical Trials Transforma- tion Institute (CTTI), a public-private partnership. A includeallclinicaltrials.WithintheUnited entiatinghigh-riskinvasiveinterventions portion of Dr Kramer’s salary is supported by pooled States, legal requirements for registration from low-risk interventions or distin- funds from CTTI members (https://www.ctti- clinicaltrials.org/about/membership). Dr Kramer re- do not include phase 1 trials, trials not in- guishing specific types of behavioral, ports receiving a research grant from Pfizer that sup- volving a drug or device, and trials not drug, or device interventions. ports a small percentage of her salary; this grant is paid 1846 JAMA, May 2, 2012—Vol 307, No. 17 ©2012 American Medical Association. All rights reserved. 23Downloaded From: http://jama.jamanetwork.com/ by a National Academy of Sciences User on 07/17/2012
    • CLINICAL TRIALS REGISTERED IN CLINICALTRIALS.GOV to Duke University. Dr Kramer also served on an ad- trials. International Committee of Medical Journal Edi- 22. Tasneem A, Aberle L, Ananth H, et al. The data- visory board for the “Pharmacovigilance Center of Ex- tors. http://www.icmje.org/publishing_10register base for Aggregate Analysis of ClinicalTrials.gov (AACT) cellence” at GlaxoSmithKline, for which she received .html. Accessed January 3, 2012. and subsequent regrouping by clinical specialty. PLoS an honorarium. Financial disclosure information for Drs 8. Food and Drug Administration Amendments Act of One. 2012;7(3):e33677. Califf and Kramer is also publicly available at https: 2007: Public Law 110-85. US Food and Drug Admin- 23. Meinert CL, Tonascia S, Higgins K. Content of re- //www.dcri.org/about-us/conflict-of-interest. No other istration. http://www.fda.gov/RegulatoryInformation ports on clinical trials: a critical review. Control Clin disclosures were reported. /Legislation/FederalFoodDrugandCosmeticAct Trials. 1984;5(4):328-347. Funding/Support: Financial support for this project was FDCAct/SignificantAmendmentstotheFDCAct 24. Chan AW, Altman DG. Epidemiology and report- provided by grant U19FD003800 from the US Food /FoodandDrugAdministrationAmendmentsActof2007 ing of randomised trials published in PubMed journals. and Drug Administration awarded to Duke Univer- /FullTextofFDAAALaw/default.htm. Accessed May 31, Lancet. 2005;365(9465):1159-1162. sity for the Clinical Trials Transformation Initiative. 2011. 25. Kris MG, Meropol NJ, Winer EP, eds. Accelerat- Role of the Sponsors: The US Food and Drug Admin- 9. ClinicalTrials.gov protocol data element defini- ing progress against cancer: ASCO’s blueprint for trans- istration participated in the design and conduct of the tions (draft): August 2011. ClinicalTrials.gov. http: forming clinical and translational cancer research. study via one of the coauthors (R.E.S.); in the collec- //prsinfo.clinicaltrials.gov/definitions.html. Ac- American Society of Clinical Oncology. http://www tion, management, analysis, and interpretation of the cessed November 4, 2011. .asco.org/ascov2/department%20content/cancer data; and in the preparation, review, and approval of 10. Zarin DA, Tse T, Williams RJ, Califf RM, Ide NC. %20policy%20and%20clinical%20affairs/downloads the manuscript. The ClinicalTrials.gov results database: update and key /blueprint.pdf. Accessed February 5, 2012. Online-Only Material: The 2 eTables, 2 eAppen- issues. N Engl J Med. 2011;364(9):852-860. 26. Yusuf S, Collins R, Peto R. Why do we need some dixes, and eFigure are available at http://www.jama 11. Lee DH, Vielemeyer O. Analysis of overall level large, simple randomized trials? Stat Med. 1984; .com. of evidence behind Infectious Diseases Society of 3(4):409-422. Additional Contributions: We gratefully acknowl- America practice guidelines. Arch Intern Med. 2011; 27. Peto R, Collins R, Gray R. Large-scale random- edge the contributions of CTTI Project Leader Jean 171(1):18-22. ized evidence: large, simple trials and overviews of trials. Bolte, RN (Duke Clinical Research Institute), and Na- 12. Tricoci P, Allen JM, Kramer JM, Califf RM, Smith J Clin Epidemiol. 1995;48(1):23-40. tional Library of Medicine staff members Nicholas Ide, SC Jr. Scientific evidence underlying the ACC/AHA 28. Califf RM, DeMets DL. Principles from clinical trials MS; Rebecca Williams, PhD; and Tony Tse, PhD, to clinical practice guidelines. JAMA. 2009;301(8): relevant to clinical practice: part I. Circulation. 2002; this project. We also thank Jonathan McCall, BA (Duke 831-841. 106(8):1015-1021. Clinical Research Institute), for editorial assistance with 13. Guyatt GH, Oxman AD, Vist GE, et al; GRADE 29. Glickman SW, McHutchison JG, Peterson ED, et al. the manuscript. None received compensation for their Working Group. GRADE: an emerging consensus on Ethical and scientific implications of the globalization contributions besides their salaries. rating quality of evidence and strength of of clinical research. N Engl J Med. 2009;360(8): recommendations. BMJ. 2008;336(7650):924- 816-823. 926. 30. Pasquali SK, Burstein DS, Benjamin DK Jr, Smith REFERENCES 14. Moher D, Schulz KF, Altman D; CONSORT Group PB, Li JS. Globalization of pediatric research: analysis (Consolidated Standards of Reporting Trials). The of clinical trials completed for pediatric exclusivity. 1. Fair tests of treatments in health care. The James CONSORT statement: revised recommendations for Pediatrics. 2010;126(3):e687-e692. Lind Library. http://www.jameslindlibrary.org. Ac- improving the quality of reports of parallel-group ran- 31. Kim ES, Carrigan TP, Menon V. International par- cessed November 28, 2011. domized trials. JAMA. 2001;285(15):1987-1991. ticipation in cardiovascular randomized controlled trials 2. DeMets DL, Califf RM. A historical perspective on 15. Gillen JE, Tse T, Ide NC, McCray AT. Design, imple- sponsored by the National Heart, Lung, and Blood clinical trials innovation and leadership: where have mentation and management of a web-based data en- Institute. J Am Coll Cardiol. 2011;58(7):671-676. the academics gone? JAMA. 2011;305(7):713- try system for ClinicalTrials.gov. Stud Health Tech- 32. Califf RM, Harrington RA. American industry and 714. nol Inform. 2004;107(Pt 2):1466-1470. the US Cardiovascular Clinical Research Enterprise an 3. Sung NS, Crowley WF Jr, Genel M, et al. Central 16. Zarin DA, Tse T, Ide NC. Trial registration at appropriate analogy? J Am Coll Cardiol. 2011; challenges facing the national clinical research ClinicalTrials.gov between May and October 2005. 58(7):677-680. enterprise. JAMA. 2003;289(10):1278-1287. N Engl J Med. 2005;353(26):2779-2787. 33. International Clinical Trials Registry Platform Search 4. Menikoff J, Richards EP. What the Doctor Didn’t 17. Zarin DA, Ide NC, Tse T, Harlan WR, West JC, Portal. World Health Organization. http://apps.who Say: The Hidden Truth About Medical Research. New Lindberg DA. Issues in the registration of clinical trials. .int/trialsearch/. Accessed December 8, 2011. York, NY: Oxford University Press; 2006. JAMA. 2007;297(19):2112-2120. 34. DeAngelis CD, Drazen JM, Frizelle FA, et al; In- 5. Food and Drug Administration Modernization 18. Studies by topic: select a location. ClinicalTrials ternational Committee of Medical Journal Editors. Is Act of 1997 (FDAMA): Public Law 105-15. US .gov. http://www.clinicaltrials.gov/ct2/search/browse this clinical trial fully registered? a statement from the Food and Drug Administration. http://www ?brwse=locn_cat. Accessed January 3, 2012. International Committee of Medical Journal Editors. .fda.gov/RegulatoryInformation/Legislation 19. AACT database (Aggregate Analysis of JAMA. 2005;293(23):2927-2929. /FederalFoodDrugandCosmeticActFDCAct ClinicalTrials.gov). Clinical Trials Transformation Ini- 35. Devereaux PJ, Manns BJ, Ghali WA, et al. Phy- /SignificantAmendmentstotheFDCAct/FDAMA tiative. http://www.ctti-clinicaltrials.org/project-topics sician interpretations and textbook definitions of blind- /FullTextofFDAMAlaw/default.htm. Accessed January /clinical-trials.gov/aact-database. Accessed March 20, ing terminology in randomized controlled trials. JAMA. 3, 2012. 2012. 2001;285(15):2000-2003. 6. DeAngelis CD, Drazen JM, Frizelle FA, et al; Inter- 20. McKenna MT, Michaud CM, Murray CJ, Marks 36. English RA, Leibowitz Y, Giffin RB. Chapter 2: The national Committee of Medical Journal Editors. Clini- JS. Assessing the burden of disease in the United States state of clinical research in the United States: an cal trial registration: a statement from the Interna- using disability-adjusted life years. Am J Prev Med. overview. In: Transforming Clinical Research in the tional Committee of Medical Journal Editors. JAMA. 2005;28(5):415-423. United States: Challenges and Opportunities: Work- 2004;292(11):1363-1364. 21. Medical Subjects Headings (MeSH) thesaurus. shop Summary. National Academies Press. http: 7. Uniform requirements for manuscripts submitted http://www.nlm.nih.gov/mesh/2010/introduction //books.nap.edu/openbook.php?record_id=12900. to biomedical journals: obligation to register clinical /introduction.html. Accessed February 20, 2012. Accessed February 15, 2012. ©2012 American Medical Association. All rights reserved. JAMA, May 2, 2012—Vol 307, No. 17 1847Downloaded From: http://jama.jamanetwork.com/ by a National Academy of Sciences User on 07/17/2012 24
    • Policy ForumThe Imperative to Share Clinical Study Reports:Recommendations from the Tamiflu ExperiencePeter Doshi1*, Tom Jefferson2, Chris Del Mar31 Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America, 2 The Cochrane Collaboration, Roma, Italy, 3 Centre for Research inEvidence-Based Practice, Bond University, Gold Coast, Australia Regulatory approval of new drugs is during the late 1990s by the manufacturer FDA has never clarified the many discrep-assumed to reflect a judgment that a prior to US registration of the drug [8]. ancies in claims made over the effects ofmedication’s benefits outweigh its harms. This analysis, conducted by Kaiser and Tamiflu. Although it may have limitedDespite this, controversy over approved colleagues, proposed that oseltamivir treat- approval indications accordingly, the FDAdrugs is common. If sales can be consid- ment of influenza reduced both secondary has never challenged the US HHS or theered a proxy for utility, the controversies complications and hospital admission. In US CDC for making far more ambitioussurrounding even the most successful contrast, the Food and Drug Administra- claims. This means that critical analysis bydrugs (such as blockbuster drugs) seem all tion (FDA), which approved Tamiflu in an independent group such as a Cochranethe more paradoxical, and have revealed 1999 and was aware of these same clinical review group is essential.the extent to which the success of many trials, concluded that Tamiflu had not been But which data should be used? Indrugs has been driven by sophisticated shown to reduce complications, and re- updating our Cochrane review of neur-marketing rather than verifiable evidence quired an explicit statement in the drug’s aminidase inhibitors, we have become[1,2]. But even among institutions that label to that effect [9]. FDA even cited convinced that the answer lies in analyzingaim to provide the least biased, objective Roche, Tamiflu’s manufacturer, for viola- clinical study reports rather than theassessments of a drug’s effects, determining tion of the law for claims made to the traditional published trials appearing in‘‘the truth’’ can be extremely difficult. contrary [10]. biomedical journals [13]. Clinical study Consider the case of the influenza Nor did the FDA approve an indication reports contain the same information asantiviral Tamiflu (oseltamivir). Prior to for Tamiflu in the prevention of transmis- journal papers (usually in standardizedthe global outbreak of H1N1 influenza in sion of influenza [9,11]. This assumption sections, including an introduction, meth-2009, the United States alone had stock- was at the heart of the World Health ods, results, and conclusion [14]), but havepiled nearly US$1.5 billion dollars worth Organization’s (WHO) proposed plan to far more detail: the study protocol,of the antiviral [3]. As the only drug in its suppress an emergent pandemic through analysis plan, numerous tables, listings,class (neuraminidase inhibitors) available mass prophylaxis [12]. While the WHO and figures, among others. They are farin oral form, Tamiflu was heralded as the recently added Tamiflu to its Essential larger (hundreds or thousands of pages),key pharmacologic intervention for use Medicines list, if FDA is right, the drug’s and represent the most complete synthesisduring the early days of an influenza effectiveness may be no better than aspirin of the planning, execution, and results of apandemic when a vaccine was yet to be or acetaminophen (paracetemol). The clinical trial. Journal publications of clin-produced. It would cut hospitalizationsand save lives, said the US Department ofHealth and Human Services (HHS) [4]. Citation: Doshi P, Jefferson T, Del Mar C (2012) The Imperative to Share Clinical Study Reports: Recommendations from the Tamiflu Experience. PLoS Med 9(4): e1001201. doi:10.1371/journal.pmed.1001201The Advisory Committee on Immuniza- Published April 10, 2012tion Practices (ACIP, the group the USCenters for Disease Control and Preven- Copyright: ß 2012 Doshi et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium,tion [CDC] uses to form national influen- provided the original author and source are credited.za control policy) said it would reduce the Funding: Peter Doshi is funded by an institutional training grant from the Agency for Healthcare Research andchances of developing complications from Quality #T32HS019488. The funders had no role in study design, data collection and analysis, decision toinfluenza [5]. So, too, did the Australian publish, or preparation of the manuscript.Therapeutic Goods Administration [6] Competing Interests: All review authors have applied for and received competitive research grants. Alland the European Medicines Agency review authors are co-recipients of a UK National Institute for Health Research grant to carry out a Cochrane review of neuraminidase inhibitors (http://www.hta.ac.uk/2352). In addition: TJ was an ad hoc consultant for F.(EMA) [7]. Hoffman-La Roche Ltd in 1998–1999. TJ receives royalties from his books published by Blackwells and Il Most (perhaps all) of these claims can be Pensiero Scientifico Editore, none of which are on neuraminidase inhibitors. TJ is occasionally interviewed bytraced back to a single source: a meta- market research companies for anonymous interviews about Phase 1 or 2 products unrelated to neuraminidase inhibitors. Since submission of this article, TJ has been retained as an expert consultant in a legal case involvinganalysis published in 2003 that combined Tamiflu. CDM provided expert advice to GlaxoSmithKline about vaccination against acute otitis media in 2008–ten randomized clinical trials conducted 2009. CDM receives royalties from books published through Blackwells BMJ Books and Elsevier. PD declares no further conflicts of interest. Abbreviations: ACIP, Advisory Committee on Immunization Practices; CDC, US Centers for Disease ControlThe Policy Forum allows health policy makers and Prevention; EMA, European Medicines Agency; FDA, US Food and Drug Administration; HHS, USaround the world to discuss challenges and Department of Health and Human Services; WHO, World Health Organizationopportunities for improving health care in theirsocieties. * E-mail: pnd@jhu.edu Provenance: Not commissioned; externally peer reviewed. PLoS Medicine | www.plosmedicine.org 1 April 2012 | Volume 9 | Issue 4 | e1001201 25
    • Summary Points An Urgent Call for a Debate on the Ethics of Data Secrecy N Systematic reviews of published randomized clinical trials (RCTs) are considered Taken together, these experiences sug- the gold standard source of synthesized evidence for interventions, but their gest that any attempt at reliable evidence conclusions are vulnerable to distortion when trial sponsors have strong interests that might benefit from suppressing or promoting selected data. synthesis must begin with clinical study reports. Yet we are aware of only three N More reliable evidence synthesis would result from systematic reviewing of other groups of independent researchers clinical study reports—standardized documents representing the most conducting a systematic review based complete record of the planning, execution, and results of clinical trials, which are submitted by industry to government drug regulators. entirely (or mostly) on these and other regulatory documents [17–19]. We can N Unfortunately, industry and regulators have historically treated clinical study think of two major reasons this might be. reports as confidential documents, impeding additional scrutiny by indepen- First, outside of regulatory agencies, few dent researchers. researchers have ever heard of clinical N We propose clinical study reports become available to such scrutiny, and study reports. Second, clinical study re- describe one manufacturer’s unconvincing reasons for refusing to provide us ports are massive in size and difficult to access to full clinical study reports. We challenge industry to either provide obtain, traditionally shared only with open access to clinical study reports or publically defend their current position regulators. The first problem seems trac- of RCT data secrecy. table, but gaining better access to manu- facturers’ clinical study reports requires shifting the status quo from a defaultical trials may generate media attention through a Freedom of Information request position of confidentiality to one of[2], propel researchers’ careers, and gen- to the EMA, amounting to tens of thou- disclosure.erate some journals a revenue stream [15]. sands of pages. While extensive and In the mid-20th century, regulatoryHowever, when regulators decide whether detailed, it is important to note that what agencies became increasingly responsibleto register a new drug in a manufacturer’s we have obtained is just a subset of the full to the public for ensuring the safety andapplication, they review the trial’s clinical clinical study reports in Roche’s possession. effectiveness of approved medicines. Thisstudy report. Nonetheless, Box 1 provides a list of details rise paralleled an increase in the number In 2010, we began our Cochrane review we have already discovered—and would and complexity of clinical trials. Both inupdate using clinical study reports rather have never discovered without access to the United States and Europe, manufac-than published papers [16]. We obtained these documents. This information has turers would come to submit trial data tosome sections of these clinical study reports turned our understanding of the drug’s regulators with the assurance that author-for the ten trials appearing in the Kaiser effects on its head. Other drugs for which ities would treat all trade secret data as2003 meta-analysis from Tamiflu’s manu- previously unpublished, detailed clinical confidential. Even following passage of thefacturer, Roche—around 3,200 pages in trial data have radically changed public 1966 Freedom of Information Act (FOIA),total. In 2011, we obtained additional knowledge of safety and efficacy include the FDA maintained that safety andsections of clinical study reports for Tamiflu Avandia, Neurontin, and Vioxx (Table 1). effectiveness test data were not subject to the FOIA [20,21]. One sign things may be changing came in November 2010, when the EMA announced its intention to make Box 1. What Is Missed without Access to Tamiflu Clinical Study all industry clinical study reports about a Reports drug publicly available as a matter of standard practice after reaching a regula- 1. Knowledge of the total denominator. (How many trials were conducted on this tory decision [22,23]. The EMA has also drug that might fit the systematic review inclusion criteria?) [13] already improved its handling of data 2. Realization that serious adverse events (SAEs) occurred in trials for which SAEs under Freedom of Information requests. were not reported in published papers [13]. But as we learned was true for Tamiflu, 3. Understanding what happened in some trials that were published 10 years after regulators may not be in possession of full completion [43]. trial reports of all studies on a given 4. Vital details of trials (content and toxicity profile of placebos, mode of action of intervention, implying the necessity of drug, description and temporality of adverse events) [11]. obtaining reports from manufacturers [11]. But at present, industry seems 5. Authorship is not consistent with published papers [44] (although, if a review’s extremely reluctant to make its clinical inclusion criteria include clinical study reports, authorship is not an issue, as the study reports freely available. responsibility is clearly the manufacturer’s). In addition to clinical study reports, we 6. Rationale for alternatively classifying outcomes such as pneumonia as a also need access to regulatory informa- complication or an adverse event [16]. tion. By the very nature of their profes- 7. Ability to know whether key subgroup analysis (influenza-infected subjects) is sional mandate, regulators may conduct valid [11]. some of the most thorough evaluations of 8. Assessment of validity of previously released information on the drug (articles, a trial program, and efforts like ours reviews, conferences, media, etc.). aimed at up-to-date evidence synthesis 9. Realization that Roche’s claim of Tamiflu’s mode of action [45] appears are seriously deprived without access to inconsistent with the evidence from trials [11,46]. their reports. While the FDA has increas- ingly published its reviews, memos, and PLoS Medicine | www.plosmedicine.org 2 April 2012 | Volume 9 | Issue 4 | e1001201 26
    • Table 1. Different sources of data for the uncovering of failures in reporting of safety and effectiveness of some examples of new drugs. Type and Magnitude of the Drug (Trade Name; Manufacturer) Brief History Problem How the Problem Was Discovered Rosiglitazone (Avandia; GSK) FDA approval granted in 1999 for The primary trials had weaknesses 2005–2007 series of meta-analyses use in diabetes. EMA did not approve, of the methods, including excessive increasingly suggest harms, especially initially, over concerns about lack of loss to follow-up, and over-emphasis cardiovascular. evidence of efficacy, but granted on proxy outcome measures (blood Public disclosure of unpublished study approval in 2000. In 2001, FDA issued glucose levels rather than mortality results was critical to uncovering the new warnings over concerns about and morbidity outcomes) [27–29]. evidence of harm [30]. cardiovascular risks (especially acute myocardial infarction). In a separate lawsuit in 2004, GSK agreed to publicly post online details of all its clinical trials. In 2010, the EMA suspended rosiglitazone. The company is dealing with 10,000 to 20,000 cases of litigation in the United States alone. Gabapentin (Neurontin; Parke-Davis, FDA approval was granted in 1994 Delay in publishing company An internal whistleblower drew attention a Pfizer subsidiary) for a narrow indication: adjunctive sponsored trials, such as a 1998 to the problem, which resulted in legal treatment for epilepsy (partial Pfizer-funded study showing lack action taken by US authorities [34,35]. seizures). The company was accused of efficacy for bipolar disease, in of using aggressive marketing using which publication was delayed for systematic methods, including key two years [34]. opinion leaders (KOLs), to promote off-label use [31–33]. Most sales were for off-label use. In 2002, FDA approval was extended to post- herpetic neuralgia. However, in May 2004, Pfizer agreed to pay a US$240 million criminal fine in the United States for illegally promoting off- label use [34]. Rofecoxib (Vioxx; Merck) FDA approval granted in 1999. In Unexpected and unanticipated Concern from a regulator (FDA) that there 2000, the VIGOR study was published, cardiovascular events (principally, might be increased CVD events caused by apparently demonstrating superiority myocardial infarction). the drug. Trial proposed, but not over non selective inhibition [36]. Data A trial of rofecoxib conducted to conducted [39]. claimed to have been excluded in a check protection against colon Clinical study reports indirectly important critical editorial [37]. In 2004, Merck polyps established an increased in uncovering the scandal. withdrew the drug. Subsequently, cardiovascular disease (CVD) risk [39]. Merck has been defending against thousands of liability legal actions [38] after educational events had been attempted to declare rofecoxib was safe [39]. Oseltamivir (Tamiflu; Roche) FDA approval granted in 1999. Sales There was a public call for individual The latest Cochrane update relied on surged following concern over patient data, already provided clinical study reports obtained from avian and pandemic influenza in confidentially to regulators, to be regulators (freedom of information 2005 and 2009, leading to massive made available for public scrutiny [41]. requests), reviews, and other documents government stockpiles worldwide. Roche agreed to release ‘‘full study released by regulators and leaked Numerous governmental bodies reports’’, but (as discussed in this sources. The manufacturer provided assumed the drug would reduce article) later offered multiple reasons incomplete reports for some trials. Our the complications of influenza, and for not providing these data. review has led to the detection of hospitalizations, based on a Roche- numerous reporting biases and supported meta-analysis of some fundamental problems in trial design, and efficacy trials [8]. This claim was we have concluded that previous challenged by our 2009 Cochrane effectiveness claims were not supported review update in response to by the available evidence. criticism [40]. doi:10.1371/journal.pmed.1001201.t001other correspondence on its Drugs@FDA information. Ideally, we would also have appropriate authorities or individuals,’’website (http://www.accessdata.fda.gov/ details of the regulators’ deliberations, and publicly pledged to release ‘‘full studyscripts/cder/drugsatfda/), and additional which can serve as signposts to important reports’’ for ten trials ‘‘within the comingdocuments are accessible through FOIA issues that need investigating. days’’ [25]. But despite extensive corre-requests, the process can be very time- In December 2009, after we voiced spondence over the next year and a half,consuming, taking months or even years. serious concerns in the BMJ about Tami- Roche refused to provide any more thanMoreover, regulatory agencies’ lack of flu’s alleged ability to reduce complica- portions of the clinical study reports forpublic inventories of their documentary tions [24], Roche wrote that it was ‘‘very the ten Kaiser studies (Table 2) and noholdings complicates the retrieval of happy to have its data reviewed by reports for any of the additional Tamiflu PLoS Medicine | www.plosmedicine.org 3 April 2012 | Volume 9 | Issue 4 | e1001201 27
    • Table 2. Roche’s reasons for not sharing its clinical study reports and the authors’ response (for 10 trials in Kaiser meta-analysis[8]).Roche’s Basis for Reluctance or Refusal to Share Data Response‘‘Unfortunately we are unable to send you the data requested as a similar It is unclear why another group of independent researchers would prevent Rochemeta analysis is currently commencing with which there are concerns your from sharing the same data with our group.request may conflict. We have been approached by an independent expertinfluenza group and as part of their meta analysis we have provided accessto Roche’s study reports.’’ (Oct. 8, 2009)Cochrane reviewers were ‘‘unwilling to enter into the [confidentiality] The terms of Roche’s proposed contract were unacceptable to us. We declined toagreement with Roche.’’ (Dec. 8, 2009) sign for two reasons: 1) all data disclosed under the contract were to be regarded as confidential; and 2) signing the contract would also require us ‘‘not to disclose … the existence and terms of this Agreement’’. We judged that the requirement to keep all data, and the confidentiality agreement itself, secret would interfere with our explicit aim of openly and transparently systematically reviewing the trial data and accounting for their provenance.Roche says that it was ‘‘under the impression that you [Cochrane] were also We did not immediately realize that what Roche had provided was incomplete.satisfied with its provision based on our correspondence earlier this year Irrespective of whether we had at one point seemed ‘‘satisfied,’’ Roche had not(March 2010).’’ (June 1, 2010) delivered what it publicly promised in the BMJ on Dec. 8, 2009: ‘‘full study reports will also be made available on a password-protected site within the coming days to physicians and scientists undertaking legitimate analyses.’’‘‘… around 3,200 pages of information have already been provided by What is important is completeness, and 3,200 pages is a fraction of the full studyRoche for review by your group and the scientific community.’’ (Aug. 20, 2010) reports for the ten Kaiser trials Roche promised to make available.‘‘Roche undertook this action [release of 3,200 pages] to demonstrate our This implies that release of the promised-but-never-released data would impinge oncomplete confidence in the data and our commitment to transparency to the ‘‘patient confidentiality, data exclusivity and the protection of intellectual property’’.degree to which patient confidentiality, data exclusivity and the protection of This does not seem to apply to many elements of clinical study reports (e.g., the trialintellectual property allow.’’ (Aug. 20, 2010) protocol and reporting analysis plan), and it is unclear why personal data could not be anonymized.‘‘The amount of data already made accessible to the scientific community It is irrelevant what is ‘‘generally provided’’. What is relevant is what was promisedthrough our actions extends beyond what is generally provided to any third and the need for public disclosure of clinical study reports.party in the absence of a confidentiality agreement.’’ (Aug. 20, 2010)‘‘Over the last few months, we have witnessed a number of developments Despite Roche’s promise and our request for specifics, Roche never respondedwhich raise concerns that certain members of Cochrane Group involved directly.with the review of the neuraminidase inhibitors are unlikely to approach thereview with the independence that is both necessary and justified. Amongstothers, this includes incorrect statements concerning Roche/Tamiflu madeduring a recent official enquiry into the response to last year’s pandemicby a member of this Cochrane Review Group. Roche intends to follow upseparately to clarify this issue. We also note with concern that certaininvestigators, who the Cochrane Group is proposing will carry out theplanned review, have previously published articles covering Tamifluwhich we believe lack the appropriate scientific rigor and objectivity.’’(Aug. 20, 2010)‘‘We noted in our correspondence to the BMJ in December of last year our Our view is that Tamiflu is a global public health drug and the media have aconcern that the first requests for data to assist in your review did not come legitimate reason for helping independent reviewers obtain data, which includesfrom the Cochrane Group, but from the media apparently trying to obtain being informed of our efforts to do so.data following discussions with the Cochrane Review Group. This raisedserious questions regarding the motivation for the review from the outset.We note that in subsequent correspondence regarding your next plannedreview you have copied a number of journalists when responding to emailssent by Roche staff.’’ (Aug. 20, 2010)Cochrane reviewers have been provided with ‘‘all the trial data [they] We disagree. First, it is up to us to decide what we require. Second, we now knowrequire …’’ (Jan. 14, 2011) that what was provided was not enough. For example, Roche did not provide us with the trial protocols and full amendment history.‘‘You have all the detail you need to undertake a review and so we have We have still not received what was promised in December 2009, and we know thatdecided not to supply any more detailed information. We do not believe what we have received is deficient.the requested detail to be necessary for the purposes of a review ofneuraminidase inhibitors.’’ (April 26, 2011)‘‘It is the role of Global Regulatory Authorities to review detailed Independent researchers such as the Cochrane Collaboration share the goal ofinformation of medicines when assessing benefit/risk. This has occurred, assessing benefit/risk, and require all details necessary to competently perform thisand continues to occur with Tamiflu, as with all other medicines, through function.regular license updates.’’ (April 26, 2011)‘‘Roche has made full clinical study data available to health authorities Roche may have made full clinical study data ‘‘available’’ but that does not mean theyaround the world for their review as part of the licensing process.’’ [42] ‘‘provided’’ all regulators with full clinical study data. For at least 15 Tamiflu trials,(Jan. 20, 2012) Roche did not provide the European regulator (EMA) with full study reports, apparently because EMA did not expressly request the complete clinical study reports. (Correspondence with EMA, May 24 and Jul. 20, 2011)doi:10.1371/journal.pmed.1001201.t002 PLoS Medicine | www.plosmedicine.org 4 April 2012 | Volume 9 | Issue 4 | e1001201 28
    • Table 3. Roche’s reasons for not sharing its clinical study reports and the authors’ response (for other Tamiflu trials). Roche’s Basis for Not Sharing Data Response ‘‘With regards your recent request for additional data, we are taking We agree that ‘‘patient and physician confidentiality, legality and feasibility’’ are valid the request very seriously and as such must assess this within the wider concerns. However, Roche has never provided details. organization particularly with respect to our obligations around patient and physician confidentiality, legality and feasibility’’ (July 24, 2010) With respect to clinical study reports for additional studies beyond the This is a reasonable request. However, we shared our systematic review protocol [16] original ten meta-analyzed by Kaiser: ‘‘we request that you submit your full with Roche on Dec. 11, 2010, following peer-review and publication online-first on analysis plan for review by Roche and the scientific/medical community.’’ the cochrane.org website (the usual Cochrane systematic review practice). (Aug. 20, 2010) ‘‘Concerning access to the reports beyond the Kaiser analysis … many are What Roche has directed us to is insufficient (peer-reviewed publications and the pharmacokinetic studies conducted in healthy volunteers and so are unlikely extremely compressed summaries of trials on http://www.rochetrials.com) as the to be useful in a meta analysis. The studies in prevention and in children are topic of our review is unpublished data, most prominently clinical study reports. published in peer reviewed publications, or in summary form on www. rochetrials.com.’’ (Jan. 14, 2011) ‘‘The long list of studies you sent are either published in peer reviewed Roche continues to imply the evidentiary equivalence between unpublished clinical publications, in summary form on www.rochetrials.com, halted early, study reports and extremely compressed summary formats such as journal or ongoing.’’ (Feb. 14, 2011) publications—a position we reject. We have repeatedly stated and shown evidence (in our study protocol and elsewhere [13,16]) that reliable evidence synthesis requires access to clinical study reports. doi:10.1371/journal.pmed.1001201.t003trials we had subsequently identified and doing so. But until these policies go into mary Points into Danish by Andreasrequested (Table 3). Reasons for refusing effect—and perhaps even after they do— Lundh.to share the full reports on Tamiflu kept most drugs on the market will remain (DOC)changing, and none seemed credible those approved in an era in which(Tables 2 and 3). regulators protected industry’s data [26]. Alternative Language Summary There are strong ethical arguments for It is therefore vital to know where industry Points S6 Translation of the Sum-ensuring that all clinical study reports are stands. If drug companies have legitimate mary Points into German by Gerdpublicly accessible. It is the public who take reasons for maintaining the status quo of Antes.and pay for approved drugs, and therefore treating all of their data as trade secret, we (DOC)the public should have access to complete have yet to hear them. We are all ears. Alternative Language Summaryinformation about those drugs. We should Points S7 Translation of the Sum-also not lose sight of the fact that clinical Supporting Informationtrials are experiments conducted on hu- mary Points into Spanish by Juanmans that carry an assumption of contrib- Alternative Language Summary Andres Leon.uting to medical knowledge. Non-disclosure Points S1 Translation of the Sum- (DOC)of complete trial results undermines the mary Points into Russian by Vasiliy Vlassov. Alternative Language Summaryphilanthropy of human participants and sets Points S8 Translation of the Sum-back the pursuit of knowledge. (DOC) mary Points into Chinese by Han-Pu Potentially valid reasons for restricting Alternative Language Summary Tung.the full public release of clinical study Points S2 Translation of the Sum- (DOC)reports include: ensuring patient confiden- mary Points into Italian by Tomtiality (although this could be remedied Jefferson.with redaction); commercial secrets (al- (DOCX) Acknowledgmentsthough these should be clear after drugs We thank our co-authors of the Cochranehave been registered, and we found no Alternative Language Summary review of neuraminidase inhibitors. We alsocommercially sensitive information in the Points S3 Translation of the Sum- thank Yuko Hara for helpful comments on theclinical study reports received from EMA mary Points into Japanese by Yuko draft manuscript and Ted Postol for manywith minimal redactions); and finally, Hara and Yasuyuki Kato. valuable insights. We are also grateful to thoseindustry concerns over adversaries’ mali- (DOC) who helped with translations.cious ‘‘cherry picking’’ over large datasets Alternative Language Summary(which could be reduced by requiring the Author Contributions Points S4 Translation of the Sum-prospective registration of research proto- mary Points into Croatian by Mir- Analyzed the data: PD TJ CDM. Wrote the firstcols). None appear insurmountable. ´ jana Huic. draft of the manuscript: PD TJ CDM. Contrib- With the EMA’s stated intentions on far uted to the writing of the manuscript: PD TJ (DOC)wider data disclosure, we hope the debate CDM. ICMJE criteria for authorship read andmay soon shift from one of whether to Alternative Language Summary met: PD TJ CDM. Agree with manuscriptrelease regulatory data to the specifics of Points S5 Translation of the Sum- results and conclusions: PD TJ CDM. PLoS Medicine | www.plosmedicine.org 5 April 2012 | Volume 9 | Issue 4 | e1001201 29
    • References1. Brody H, Light DW (2011) The inverse benefit Pharmaceuticals for Human Use (1996) Guide- 30. Nissen SE, Wolski K (2007) Effect of rosiglitazone law: how drug marketing undermines patient line for industry: structure and content of Clinical on the risk of myocardial infarction and death safety and public health. Am J Public Health Study Reports (ICH E3). Available: http://www. from cardiovascular causes. N Engl J Med 101(3): 399–404. fda.gov/downloads/regulatoryinformation/ 356(24): 2457–2471.2. Smith R (2005) Medical journals are an extension guidances/ucm129456.pdf. Accessed 6 2012. 31. Lenzer J (2003) Whistleblower charges drug of the marketing arm of pharmaceutical compa- 15. Lundh A, Barbateskovic M, Hrobjartsson A, ´ company with deceptive practices. BMJ nies. Plos Med 2(5): e138. doi:10.1371/journal. Gøtzsche PC (2010) Conflicts of interest at 326(7390): 620. pmed.0020138. medical journals: the influence of industry- 32. Kesselheim AS, Mello MM, Studdert DM (2011)3. The New York Times (28 April 2009) Tamiflu supported randomised trials on journal impact Strategies and practices in off-label marketing of (drug). The New York Times. Available: http:// factors and revenue – cohort study. PLoS Med pharmaceuticals: a retrospective analysis of whis- topic s.nytim es.com/topics/news/health/ 7(10): e1000354. doi:10.1371/journal.pmed. tleblower complaints. PLoS Med 8: e1000431. diseasesconditionsandhealthtopics/tamiflu-drug/ 1000354. doi:10.1371/journal.pmed.1000431. index.html. Accessed 6 March 2012. 16. Jefferson T, Jones MA, Doshi P, Del Mar CB, 33. Steinman MA, Bero LA, Chren M-M,4. U.S. Department of Health and Human Services Heneghan CJ, et al. (2011) Neuraminidase Landefeld CS (2006) Narrative review: the (2005) HHS pandemic influenza plan. Available: inhibitors for preventing and treating influenza promotion of gabapentin: an analysis of internal http://www.hhs.gov/pandemicflu/plan/pdf/ in healthy adults and children - a review of industry documents. Ann Intern Med 145(4): HHSPandemicInfluenzaPlan.pdf. Accessed 6 unpublished data.;Cochrane Database Syst Rev 284–293. March 2012. 2011: CD008965. Available: http://doi.wiley. 34. Lenzer J (2004) Pfizer pleads guilty, but drug sales5. Harper SA, Fukuda K, Uyeki TM, Cox NJ, com/10.1002/14651858. continue to soar. BMJ 328(7450): 1217. Bridges CB (2004) Prevention and control of 17. Gøtzsche PC, Jørgensen AW (2011) Opening up 35. Steinman MA, Harper GM, Chren M-M, influenza: recommendations of the Advisory data at the European Medicines Agency. BMJ Landefeld CS, Bero LA (2007) Characteristics Committee on Immunization Practices (ACIP). 342: d2686. and impact of drug detailing for gabapentin. MMWR Recomm Rep 53(RR-6): 1–40. 18. Yale School of Medicine (2011) Yale University PLoS Med 4(4): e134. doi:10.1371/journal.6. Roche Products Pty Limited (2011) TAMIFLUH Open Data Access Project (YODA Project). pmed.0040134. capsules: consumer medicine information. Avail- Available: http://medicine.yale.edu/core/ 36. Bombardier C, Laine L, Reicin A, Shapiro D, able: http://www.roche-australia.com/fmfiles/ projects/yodap/index.aspx. Accessed 6 March Burgos-Vargas R, et al. (2000) Comparison of re7229005/downloads/anti-virals/tamiflu-cmi. 2012. upper gastrointestinal toxicity of rofecoxib and pdf. Accessed 6 March 2012. 19. Eyding D, Lelgemann M, Grouven U, Harter M, naproxen in patients with rheumatoid arthritis.7. European Medicines Agency (2011) Summary of Kromp M, et al. (2010) Reboxetine for acute N Engl J Med 343(21): 1520–1528. product characteristics (Tamiflu 30 mg hard treatment of major depression: systematic review 37. Curfman GD, Morrissey S, Drazen JM (2005) capsule). Available: http://www.ema.europa.eu/ and meta-analysis of published and unpublished Expression of concern: Bombardier et al., ‘‘Com- docs/en_GB/document_library/EPAR_-_Product_ placebo and selective serotonin reuptake inhibitor parison of upper gastrointestinal toxicity of Information/human/000402/WC500033106.pdf. controlled trials. BMJ 341: c4737. rofecoxib and naproxen in patients with rheuma- Accessed 5 February 2012. 20. Halperin RM (1979) FDA disclosure of safety and toid arthritis,’’ N Engl J Med 2000; 343:1520–8.8. Kaiser L, Wat C, Mills T, Mahoney P, Ward P, effectiveness data: a legal and policy analysis. N Engl J Med 353(26): 2813–2814. Hayden F (2003) Impact of oseltamivir treatment Duke Law Journal 1979(1): 286–326. 38. Wilson D (22 November 2011) Merck to pay on influenza-related lower respiratory tract com- 21. Kesselheim AS, Mello MM (2007) Confidentiality $950 million over Vioxx]. The New York Times. plications and hospitalizations. Arch Intern Med laws and secrecy in medical research: improving Available: https://www.nytimes.com/2011/11/ 163(14): 1667–1672. public access to data on drug safety. Health Aff 23/business/merck-agrees-to-pay-950-million-in-9. Hoffman-La Roche, Ltd. (2011) Product label. (Millwood) 26(2): 483–491. vioxx-case.html. Accessed 6 March 2012. Tamiflu (oseltamivir phosphate) capsules and for 22. European Medicines Agency (2010) Output of the oral suspension. Available: http://www. European Medicines Agency policy on access to 39. Topol EJ (2004) Failing the public health– accessdata.fda.gov/drugsatfda_docs/label/2011/ documents related to medicinal products for rofecoxib, Merck, and the FDA. N Engl J Med 021087s056_021246s039lbl.pdf. Accessed 6 human and veterinary use. Available: http:// 351(17): 1707–1709. March 2012. www.ema.europa.eu/docs/en_GB/document_ 40. Doshi P (2009) Neuraminidase inhibitors–the10. U.S. Food and Drug Administration (14 April library/Regulatory_and_procedural_guideline/ story behind the Cochrane review. BMJ 339: 2000) NDA 21-087 TAMIFLU (oseltamivir 2010/11/WC500099472.pdf. Accessed 6 March b5164. phosphate) MACMIS ID#8675. Available: 2012. 41. Godlee F (2009) We want raw data, now. BMJ http://www.fda.gov/downloads/Drugs/Guidance 23. Pott A (2011) EMA’s response to articles. BMJ 339: b5405. ComplianceRegulatoryInformation/Enforcement 342: d3838. 42. Melville NA (20 January 2012) Tamiflu data ActivitiesbyFDA/WarningLettersandNoticeofVio 24. Jefferson T, Jones M, Doshi P, Del Mar C (2009) called inconsistent, underreported. Available: lationLetterstoPharmaceuticalCompanies/UCM Neuraminidase inhibitors for preventing and http://www.medscape.com/viewarticle/757226 166329.pdf. Accessed 6 March 2012. treating influenza in healthy adults: systematic [requires free registration].11. Jefferson T, Jones MA, Doshi P, Del Mar CB, review and meta-analysis. BMJ 339: b5106. 43. Dutkowski R, Smith JR, Davies BE (2010) Safety Heneghan CJ, et al. (2012) Neuraminidase 25. Smith J, on behalf of Roche (2009) Point-by-point and pharmacokinetics of oseltamivir at standard inhibitors for preventing and treating influenza response from Roche to BMJ questions. BMJ 339: and high dosages. Int J Antimicrob Agents 35(5): in healthy adults and children. Cochrane Data- b5374. 461–467. base Syst Rev 2012(1): CD008965. 26. Abraham J (1995) Science, politics and the 44. Cohen D (2009) Complications: tracking down12. World Health Organization (2007) WHO interim pharmaceutical industry: controversy and bias in the data on oseltamivir. BMJ 339: b5387. protocol: rapid operations to contain the initial drug regulation. First edition. Routledge. 45. Welliver R, Monto AS, Carewicz O, Schatteman E, emergence of pandemic influenza. Available: 27. Cohen D (2010) Rosiglitazone: what went wrong? Hassman M, et al. (2001) Effectiveness of oselta- http://www.who.int/influenza/resources/ BMJ 341: c4848. mivir in preventing influenza in household con- documents/RapidContProtOct15.pdf. Accessed 28. Rosen CJ (2007) The rosiglitazone story—lessons tacts: a randomized controlled trial. JAMA 285(6): 6 March 2012. from an FDA advisory committee meeting. 748–754.13. Jefferson T, Doshi P, Thompson M, Heneghan C N Engl J Med 357(9): 844–846. 46. Gilead Sciences (1999) Roche announces new (2011) Ensuring safe and effective drugs: who can 29. Richter B, Bandeira-Echtler E, Bergerhoff K, data on recently approved TamifluTM, first pill to do what it takes? BMJ 342: c7258. Clar C, Ebrahim SH (2007) Rosiglitazone for treat the most common strains of influenza (A &14. International Conference on Harmonisation of type 2 diabetes mellitus. Cochrane Database Syst B). Available: http://www.gilead.com/pr_ Technical Requirements for Registration of Rev 2007(3): CD006063. 942082531. Accessed 6 March 2012. PLoS Medicine | www.plosmedicine.org 6 April 2012 | Volume 9 | Issue 4 | e1001201 30
    • Who Shares? Who Doesn’t? Factors Associated withOpenly Archiving Raw Research DataHeather A. Piwowar*¤Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America Abstract Many initiatives encourage investigators to share their raw datasets in hopes of increasing research efficiency and quality. Despite these investments of time and money, we do not have a firm grasp of who openly shares raw research data, who doesn’t, and which initiatives are correlated with high rates of data sharing. In this analysis I use bibliometric methods to identify patterns in the frequency with which investigators openly archive their raw gene expression microarray datasets after study publication. Automated methods identified 11,603 articles published between 2000 and 2009 that describe the creation of gene expression microarray data. Associated datasets in best-practice repositories were found for 25% of these articles, increasing from less than 5% in 2001 to 30%–35% in 2007–2009. Accounting for sensitivity of the automated methods, approximately 45% of recent gene expression studies made their data publicly available. First-order factor analysis on 124 diverse bibliometric attributes of the data creation articles revealed 15 factors describing authorship, funding, institution, publication, and domain environments. In multivariate regression, authors were most likely to share data if they had prior experience sharing or reusing data, if their study was published in an open access journal or a journal with a relatively strong data sharing policy, or if the study was funded by a large number of NIH grants. Authors of studies on cancer and human subjects were least likely to make their datasets available. These results suggest research data sharing levels are still low and increasing only slowly, and data is least available in areas where it could make the biggest impact. Let’s learn from those with high rates of sharing to embrace the full potential of our research output. Citation: Piwowar HA (2011) Who Shares? Who Doesn’t? Factors Associated with Openly Archiving Raw Research Data. PLoS ONE 6(7): e18657. doi:10.1371/ journal.pone.0018657 Editor: Cameron Neylon, Science and Technology Facilities Council, United Kingdom Received September 16, 2010; Accepted March 15, 2011; Published July 13, 2011 Copyright: ß 2011 Healther A. Piwowar. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Funding: This work was completed in the Department of Biomedical Informatics at the University of Pittsburgh and partially funded by an NIH National Library of Medicine training grant 5 T15 LM007059. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing Interests: The author has declared that no competing interests exist. * E-mail: hpiwowar@gmail.com ¤ Current address: NESCent, The National Evolutionary Synthesis Center, Durham, North Carolina, United States of AmericaIntroduction [5]. Several government whitepapers [6,7] and high-profile editorials [8,9] call for responsible data sharing and reuse. Sharing and reusing primary research datasets has the potential Large-scale collaborative science is increasing the need to shareto increase research efficiency and quality. Raw data can be used datasets [10,11], and many guidelines, tools, standards, andto explore related or new hypotheses, particularly when combined databases are being developed and maintained to facilitate datawith other available datasets. Real data are indispensable for sharing and reuse [12,13].developing and validating study methods, analysis techniques, and Despite these investments of time and money, we do not yetsoftware implementations. The larger scientific community also understand the impact of these initiatives. There is a well-knownbenefits: Sharing data encourages multiple perspectives, helps to adage: You cannot manage what you do not measure. For thoseidentify errors, discourages fraud, is useful for training new with a goal of promoting responsible data sharing, it would beresearchers, and increases efficient use of funding and population helpful to evaluate the effectiveness of requirements, recommen-resources by avoiding duplicate data collection. dations, and tools. When data sharing is voluntary, insights could Eager to realize these benefits, funders, publishers, societies, and be gained by learning which datasets are shared, on what topics,individual research groups have developed tools, resources, and by whom, and in what locations. When policies make data sharingpolicies to encourage investigators to make their data publicly mandatory, monitoring is useful to understand compliance andavailable. For example, some journals require the submission of unexpected consequences.detailed biomedical datasets to publicly available databases as a Dimensions of data sharing action and intention have beencondition of publication [1,2]. Many funders require data sharing investigated by a variety of studies. Manual annotations andplans as a condition of funding: Since 2003, the National Institutes systematic data requests have been used to estimate the frequencyof Health (NIH) in the USA has required a data sharing plan for of data sharing within biomedicine [14,15,16,17], though fewall large funding grants [3] and has more recently introduced attempts were made to determine patterns of sharing and withholdingstronger requirements for genome-wide association studies [4]. As within these samples. Blumenthal [18], Campbell [19], Hedstromof January 2011, the US National Science Foundation requires [20], and others have used survey results to correlate self-reportedthat data sharing plans accompany all research grant proposals instances of data sharing and withholding with self-reported attributes PLoS ONE | www.plosone.org 1 July 2011 | Volume 6 | Issue 7 | e18657 31
    • Who Shares? Who Doesn’t?like industry involvement, perceived competitiveness, career produc- NOT (‘‘tissue microarray*’’ [text] OR ‘‘cpg is-tivity, and anticipated data sharing costs. Others have used surveys land*’’ [text])and interviews to analyze opinions about the effectiveness ofmandates [21] and the value of various incentives [20,22,23,24]. A Retrieved articles were mapped to PubMed identifiers wheneverfew inventories list the data-sharing policies of funders [25,26] and possible; the union of the PubMed identifiers returned by the fulljournals [1,27], and some work has been done to correlate policy text portals was used as the definitive list of articles for analysis. Anstrength with outcome [2,28]. Surveys and case studies have been independent evaluation of this approach found that it identifiedused to develop models of information behavior in related domains, articles that created microarray data with a precision of 90% (95%including knowledge sharing within an organization [29,30], confidence interval, 86% to 93%) and a recall of 56% (52% tophysician knowledge sharing in hospitals [31], participation in open 61%), compared to manual identification of articles that createdsource projects [32], academic contributions to institutional archives microarray data [43].[33,34], the choice to publish in open access journals [35], sharing Because Google Scholar only displays the first 1000 results of asocial science datasets [20], and participation in large-scale query, I was not able to view all of its hits. I tried to identify asbiomedical research collaborations [36]. many Google Scholar search results as possible by iteratively Although these studies provide valuable insights and their appending a variety of attributes to the end of the query, includingmethods facilitate investigation into an author’s intentions and various publisher names, journal title words, and years ofopinions, they have several limitations. First, associations to an publication, thereby retrieving distinct subsets of the results 1000investigator’s intention to share data do not directly translate to hits at a time.associations with actually sharing data [37]. Second, associationsthat rely on self-reported data sharing and withholding likely suffer Data availabilityfrom underreporting and confounding, since people admit The dependent variable in this study was whether each genewithholding data much less frequently than they report having expression microarray research article had an associated dataset inexperienced the data withholding of others [18]. a best-practice public centralized data repository. A community I suggest a supplemental approach for investigating research letter encouraging mandatory archiving in 2004 [41] identifieddata-sharing behavior. I have collected and analyzed a large set of three best-practice repositories for storing gene expressionobserved data sharing actions and associated study, investigator, microarray data: NCBI’s Gene Expression Omnibus (GEO),journal, funding, and institutional variables. The reported analysis EBI’s ArrayExpress, and Japan’s CIBEX database. The first twoexplores common factors behind these attributes and looks at the were included in this analysis, since CIBEX was defunct untilassociation between these factors and data sharing prevalence. recently. I chose to study data sharing for one particular type of data: An earlier evaluation found that querying GEO and ArrayEx-biological gene expression microarray intensity values. Microarray press with article PubMed identifiers located a representative 77%studies provide a useful environment for exploring data sharing of all associated publicly available datasets [44]. I used the samepolicies and behaviors. Despite being a rich resource valuable for method for finding datasets associated with published articles inreuse [38], microarray data are often, but not yet, universally this study: I queried GEO for links to the PubMed identifiers inshared. Best-practice guidelines for sharing microarray data are the analysis sample using the ‘‘pubmed_gds [filter]’’ and queriedfairly mature [12,39]. Two centralized databases have emerged as ArrayExpress by searching for each PubMed identifier in abest-practice repositories: the Gene Expression Omnibus (GEO) downloaded copy of the ArrayExpress database. Articles linked[13] and ArrayExpress [40]. Finally, high-profile letters have from a dataset in either of these two centralized repositories werecalled for strong journal data-sharing policies [41], resulting in considered to have ‘‘shared their data’’ for the endpoint of thisunusually strong data sharing requirements in some journals [42]. study, and those without such a link were considered not to haveAs such, the results here represent data sharing in an environment shared their data.where it has been particularly encouraged and supported. Study attributesMethods For each study article I collected 124 attributes for use as In brief, I used a full-text query to identify a set of studies in independent variables. The attributes were collected automaticallywhich the investigators generated gene expression microarray from a wide variety of sources. Basic bibliometric metadata wasdatasets. Best-practice data repositories were searched for extracted from the MEDLINE record, including journal, year ofassociated datasets. Attributes of the studies were used to derive publication, number of authors, Medical Subject Heading (MeSH)factors related to the investigators, journals, funding, institutions, terms, number of citations from PubMed Central, inclusion inand topic of the studies. Associations between these study factors PubMed subsets for cancer, whether the journal is published withand the frequency of public data archiving were determined an open-access model and if it had data-submission links fromthrough multivariate regression. Genbank, PDB, and SwissProt. ISI Journal Impact Factors and associated metrics were extractedStudies for analysis from the 2008 ISI Journal Citation Reports. I quantified the content The set of ‘‘gene expression microarray creation’’ articles was of journal data-sharing policies based on the ‘‘Instruction foridentified by querying the title, abstract, and full-text of PubMed, Authors’’ for the most commonly occurring journals.PubMed Central, Highwire Press, Scirus, and Google Scholar with NIH grant details were extracted by cross-referencing grantportal-specific variants of the following query: numbers in the MEDLINE record with the NIH award information (http://report.nih.gov/award/state/state.cfm). From this informa- tion I tabulated the amount of total funding received for each of the (‘‘gene expression’’ [text] AND ‘‘microarray’’ fiscal years from 2003 to 2008. I also estimated the date of renewal [text] AND ‘‘cell’’ [text] AND ‘‘rna’’ [text]) by identifying the most recent year in which a grant number was AND (‘‘rneasy’’ [text] OR ‘‘trizol’’ [text] OR ‘‘real- prefixed by a ‘‘1’’ or ‘‘2’’ —indication that the grant is ‘‘new’’ or time pcr’’ [text]) ‘‘renewed,’’ respectively. PLoS ONE | www.plosone.org 2 July 2011 | Volume 6 | Issue 7 | e18657 32
    • Who Shares? Who Doesn’t? The corresponding address was parsed for institution and Using this complete dataset, I computed scores for each of thecountry, following the methods of Yu et al. [45]. Institutions were datapoints onto all of the first and second-order factors usingcross-referenced to the SCImago Institutions Rankings 2009 Bartlett’s algorithm as extracted from the factanal function. IWorld Report (http://www.scimagoir.com/) to estimate the submitted these factor scores to a logistic regression using the lrmrelative degree of research output and impact of the institutions. function in the rms package. Continuous variables were modeled Attributes of study authors were collected for first and last authors as cubic splines with 4 knots using the rcs function from the rms(in biomedicine, customarily, the first and last authors make the package, and all two-way interactions were explored.largest contributions to a study and have the most power in Finally, hierarchical supervised clustering on the datapoints waspublication decisions). The gender of the first and last authors were performed to learn which factors were most predictive and thenestimated using the Baby Name Guesser website (http://www. estimated the data sharing prevalence in a contingency table ofgpeters.com/names/baby-names.php). A list of prior publications these two clusters split at their medians.in MEDLINE was extracted from Author-ity clusters, 2009 edition[46], for the first and last author of each article in this study. To limit Resultsthe impact of extremely large ‘‘lumped’’ clusters that erroneouslycontain the publications of more than one actual author, I excluded Full-text queries for articles describing the creation of geneprior publication lists for first or last authors in the largest 2% of expression microarray datasets returned PubMed identifiers for 11,603 studies.clusters and instead considered these data missing. For each paperin an author’s publication history with PubMed identifiers MEDLINE fields were still ‘‘in process’’ for 512 records,numerically less than the PubMed identifier of the paper in resulting in missing data for MeSH-derived variables. Impact factors were found for all but 1,001 articles. Journal policyquestion, I queried to find if any of these prior publications had been variables were missing for 4,107 articles. The institution rankingpublished in an open source journal, were included in the ‘‘gene attributes were missing for 6,185. I cross-referenced NIH grantexpression microarray creation’’ subset themselves, or had reused details for 3,064 studies (some grant numbers could not be parsed,gene expression data. I recorded the date of the earliest publication because they were incomplete or strangely formatted). I was ableby the author and the number of citations to date that their earlier to determine the gender of the first and last authors, based on thepapers received in PubMed Central. forenames in the MEDLINE record, for all but 2,841 first authors I attempted to estimate if the paper itself reused publicly and 2,790 last authors. All but 1,765 first authors and 797 lastavailable gene expression microarray data by looking for its authors were found to have a publication history in the 2009inclusion in the list that GEO keeps of reuse at http://www.ncbi. Author-ity clusters.nlm.nih.gov/projects/geo/info/ucitations.html. PubMed identifiers were found in the ‘‘primary citation’’ field of Data collection scripts were coded in Python version 2.5.2 (many dataset records in GEO or ArrayExpress for 2,901 of the 11,603libraries were used, including EUtils, BeautifulSoup, pyparsing and articles in this dataset, indicating that 25% (95% confidencenltk [47]) and SQLite version 3.4. Data collection source code is intervals: 24% to 26%) of the studies deposited their data in GEOavailable at github (http://github.com/hpiwowar/pypub). or ArrayExpress and completed the citation fields with the primary article PubMed identifier. This is my estimate for the prevalence ofStatistical methods gene expression microarray data deposited into the two predom- Statistical analysis was performed in R version 2.10.1 [48]. P- inant, centralized, publicly accessible databases.values were two-tailed. The collected data were visually explored This data-sharing rate increased with each subsequent articleusing Mondrian version 1.1 [49] and the Hmisc package [50]. I publication year, as seen in Figure 1, increasing from less than 5%applied a square-root transformation to variables representing in 2001 to 30%–35% in 2007–2009. Accounting for the sensitivitycount data to improve their normality prior to calculating of my automated method for detecting open data anywhere on thecorrelations. internet (about 77% [44]), it could be estimated that approxi- To calculate variable correlations, I used the hector function in mately 45% (0.35/0.77) of recent gene expression studies havethe polycor library. This computes polyserial correlations between made their data publicly available.pairs of numeric and ordinal variables and polychoric correlations The data-sharing rate also varied across journals. Figure 2between two ordinal variables. I modified it to calculate Pearson shows the data-sharing rate across the 50 journals with the mostcorrelations between numeric variables using the rcorr function in articles in this study. Many of the other attributes were alsothe Hmisc library. I used a pairwise-complete approach to missing associated with the prevalence of data sharing in univariatedata and used the nearcor function in the sfsmisc library to make analysis, as seen in Figures S1, S2, S3, S4, S5.the correlation matrix positive definite. A correlation heatmap wasproduced using the gplots library. First-order factors I used the nFactors library to calculate and display the scree plot I tried to use a scree plot to determine the optimal number offor correlations. factors for first-order analysis. Since the scree plot did not have a Since the correlation matrix was not well-behaved enough for clear drop-off, I experimented with a range of factor counts nearmaximum-likelihood factor analysis, first-order exploratory factor the optimal coordinates index (as calculated by nScree in theanalysis was performed with the fa function in the psych library, using nFactors R-project library) and finalized on 15 factors. Thethe minimum residual (minres) solution and a promax oblique correlation matrix was not sufficiently well-behaved for maximum-rotation. Second-order factor analysis also used the minres solution likelihood factor analysis, so I used a minimum residual (minres)but a varimax rotation, since I wanted these factors to be orthogonal. solution. I chose to rotate the factors with the promax obliqueI computed the loadings on the original variables for the second-order algorithm, because first-order factors were expected to havefactors using the method described by Gorsuch [51]. significant correlations with one another. The rotated first-order Before computing the factor scores for the original dataset, factors are given in Table 1 with loadings larger than 0.4 or lessmissing values were imputed through Gibbs sampling with two than 20.4. Some of the loadings are greater than one. This is notiterations through the mice library. unexpected since the factors are oblique and thus the loadings in PLoS ONE | www.plosone.org 3 July 2011 | Volume 6 | Issue 7 | e18657 33
    • Who Shares? Who Doesn’t?Figure 1. Proportion of articles with shared datasets, by year (error bars are 95% confidence intervals of the proportions).doi:10.1371/journal.pone.0018657.g001the pattern matrix represent regression coefficients rather than Most of these factors were significantly associated with data-correlations. Correlations between attributes and the first-order sharing behavior in a multivariate logistic regression: p = 0.18 forfactors are given in the structure matrix in Table S1. The factors ‘‘Large NIH grant’’, p,0.05 for ‘‘No GEO reuse & YES highhave been named based on the variables they load most heavily, institution output’’ and ‘‘No K funding or P funding’’, andusing abbreviations for publishing in an Open Access journal (OA) p,0.005 for the other first-order factors. The increase in the oddsand previously depositing data in the Gene Expression Omnibus of data sharing is illustrated in Figure 4 as scores on each factor in(GEO) or ArrayExpress (AE) databases. the model are moved from their 25th percentile value to their 75th After imputing missing values, I calculated scores for each of the percentile value.15 factors for each of the 11,603 data collection studies. Many of the factor scores demonstrated a correlation with Second-order factorsfrequency of data sharing in univariate analysis, as seen in Figure 3. The heavy correlations between the first-order factors suggestedSeveral factors seemed to have a linear relationship with data sharing that second-order factors may be illuminating. Scree plot analysisacross their whole range. For example, whereas the data sharing rate of the correlations between the first-order factors suggested awas relatively low for studies with the lowest scores on the factor solution containing five second-order factors. I calculated therelated to the citation and collaboration rate of the corresponding factors using a ‘‘varimax’’ rotation to find orthogonal factors.author’s institution (in Figure 3, the first row under the heading Loadings on the first-order factors are given in Table 2.‘‘Institution high citation & collaboration’’), the data sharing rate was Since interactions make these second-order variables slightlyhigher for studies that scored within the 25th to 50th percentile on that difficult to interpret, I followed the method explained by Gorsuchfactor, higher still for studies the third quartile, and studies from [51] to calculate the loadings of the second-order variables directly onhighly-cited institutions, above the 75th percentile had a relatively the original variables. The results are listed in Table 3. I named thehigh rate of data sharing. A trend in the opposite direction can be second-order factors based on the loadings on the original variables.seen for the factor ‘‘Humans & cancer’’: the higher a study scored on I then calculated factor scores for each of these second-orderthat factor, the less likely it was to have shared its data. factors using the original attributes of the 11,603 datapoints. In PLoS ONE | www.plosone.org 4 July 2011 | Volume 6 | Issue 7 | e18657 34
    • Who Shares? Who Doesn’t?Figure 2. Proportion of articles with shared datasets, by journal (error bars are 95% confidence intervals of the proportions).doi:10.1371/journal.pone.0018657.g002univariate analysis, scores on several of the five factors showed a shared in a publicly accessible database. I found that 25% ofclear linear relationship with data sharing frequency, as illustrated studies that performed gene expression microarray experimentsin Figure 5. have deposited their raw research data in a primary public All five of the second-order factors were associated with data repository. The proportion of studies that shared their genesharing in multivariate logistic regression, p,0.001.The increase expression datasets increased over time, from less than 5% in earlyin odds of data sharing is illustrated in Figure 6 as each factor in years, before mature standards and repositories, to 30%–35% inthe model is moved from its 25th percentile value to its 75th 2007–2009. This suggests that perhaps 45% of recent genepercentile value. expression studies have made their data available somewhere on Finally, to understand which of these factors was most predictive the internet, after accounting for datasets overlooked by theof data sharing behaviour, I performed supervised hierarchical automated methods of discovery [44]. This estimate is consistentclustering using the second-order factors. Splits on ‘‘OA journal & with a previous manual inventory [15].previous GEO-AE sharing’’ and ‘‘Cancer & Humans’’ were Many factors derived from an experiment’s topic, impact,clearly the most informative, so I simply split these two factors at funding, publishing, institutional, and authorship environmentstheir medians and looked at the data sharing prevalence. As shown were associated with the probability of data sharing. In particular,in Table 4, studies that scored high on the ‘‘OA journal & previous authors publishing in an open access journal, or with a history ofGEO-AE sharing’’ factor and low on the ‘‘Cancer & Humans’’ sharing and reusing shared gene expression microarray data, werefactor were almost three times as likely to share their data as a most likely to share their data, and those studying cancer or‘‘Cancer & Humans’’ study published without a strong ‘‘OA human subjects were least likely to share.journal & previous GEO-AE sharing’’ environment. It is disheartening to discover that human and cancer studies have particularly low rates of data sharing. These data are surelyDiscussion some of the most valuable for reuse, to confirm, refute, inform and advance bench-to-bedside translational research [52] Further This study explored the association between attributes of a studies are required to understand the interplay of an investigator’spublished experiment and the probability that its raw dataset was motivation, opportunity, and ability to share their raw datasets PLoS ONE | www.plosone.org 5 July 2011 | Volume 6 | Issue 7 | e18657 35
    • Who Shares? Who Doesn’t?Table 1. First-order factor loadings. Table 1. Cont.Large NIH grant Journal impact 0.97 num.post2005.morethan1000k.tr 0.88 journal.5yr.impact.factor.log 0.96 num.post2005.morethan750k.tr 0.88 journal.impact.factor.log 0.92 num.post2004.morethan750k.tr 0.85 journal.immediacy.index.log 0.91 num.post2004.morethan1000k.tr 0.70 journal.policy.mentions.exceptions 0.91 num.post2005.morethan500k.tr 0.54 journal.num.articles.2008.tr 0.89 num.post2006.morethan1000k.tr 0.51 journal.policy.contains.word.miame.mged 0.89 num.post2006.morethan750k.tr 20.61 journal.policy.contains.word.arrayexpress 0.86 num.post2004.morethan500k.tr 20.48 pubmed.is.open.access 0.85 num.post2006.morethan500k.tr Last author num prev pubs & first year pub 0.84 num.post2003.morethan750k.tr 0.84 last.author.num.prev.pubs.tr 0.84 num.post2003.morethan1000k.tr 0.74 last.author.year.first.pub.ago.tr 0.80 num.post2003.morethan500k.tr 0.73 last.author.num.prev.pmc.cites.tr 0.74 has.U.funding 0.68 last.author.num.prev.other.sharing.tr 0.71 has.P.funding 0.48 country.japan 0.58 nih.sum.avg.dollars.tr 0.44 last.author.num.prev.microarray.creations.tr 0.56 nih.sum.sum.dollars.tr Journal policy consequences & long half-life 0.44 nih.max.max.dollars.tr 0.78 journal.policy.mentions.consequencesHas journal policy 0.73 journal.cited.halflife 1.00 journal.policy.contains..geo.omnibus 0.60 pubmed.is.bacteria 0.95 journal.policy.at.least.requests.sharing.array 0.42 journal.policy.requires.microarray.accession 0.95 journal.policy.mentions.any.sharing 20.54 pubmed.is.open.access 0.93 journal.policy.contains.word.microarray 20.45 journal.policy.general.statement 0.91 journal.policy.requests.sharing.other.data Institution high citations & collaboration 0.85 journal.policy.says.must.deposit 0.76 institution.mean.norm.citation.score 0.83 journal.policy.contains.word.arrayexpress 0.72 institution.international.collaboration 0.72 journal.policy.requires.microarray.accession 0.64 institution.mean.norm.impact.factor 0.71 journal.policy.requests.accession 0.41 country.germany 0.58 journal.policy.contains.word.miame.mged 20.67 country.china 0.48 journal.microarray.creating.count.tr 20.61 country.korea 0.45 journal.policy.mentions.consequences 20.56 last.author.gender.not.found 0.42 journal.policy.general.statement 20.43 country.japanNOT institution NCI or intramural NO geo reuse & YES high institution output 0.59 pubmed.is.funded.non.us.govt 0.66 institution.research.output.tr 0.55 institution.is.higher.ed 0.58 institution.harvard 20.89 institution.nci 0.46 has.K.funding 20.86 pubmed.is.funded.nih.intramural 0.42 institution.stanford 20.42 country.usa 20.79 pubmed.is.geo.reuseCount of R01 & other NIH grants 20.62 country.australia 1.15 has.R01.funding 20.46 institution.rank 1.14 has.R.funding NOT animals or mice 0.89 num.grants.via.nih.tr 0.51 pubmed.is.humans 0.86 nih.cumulative.years.tr 0.43 pubmed.is.diagnosis 0.82 num.grant.numbers.tr 0.40 pubmed.is.effectiveness 0.80 max.grant.duration.tr 20.93 pubmed.is.animals 0.66 pubmed.is.funded.nih 20.86 pubmed.is.mice 0.50 nih.max.max.dollars.tr Humans & cancer 0.45 num.nih.is.nigms.tr 0.84 pubmed.is.humans 0.44 country.usa 0.75 pubmed.is.cancer 0.42 has.T.funding 0.67 pubmed.is.cultured.cells 0.41 num.nih.is.niaid.tr 0.52 institution.is.medical PLoS ONE | www.plosone.org 6 July 2011 | Volume 6 | Issue 7 | e18657 36
    • Who Shares? Who Doesn’t? Table 1. Cont. or cancer studies may have relatively strong links to industry – two attributes previously associated with data withholding [58,59]. NIH funding levels were associated with increased prevalence of 0.47 pubmed.is.core.clinical.journal data sharing, though the overall probability of sharing remains low even in well-funded studies. Data sharing was infrequent even in 20.68 pubmed.is.plants studies funded by grants clearly covered by the NIH Data Sharing 20.49 pubmed.is.fungi Policy, such as those that received more than one million dollars Institution is government & NOT higher ed per year and were awarded or renewed since 2006. This result is 0.92 institution.is.govnt consistent with reports that the NIH Data Sharing Policy is often 0.70 country.germany not taken seriously because compliance is not enforced [54]. It is surprising how infrequently the NIH Data Sharing Policy applies 0.65 country.france to gene expression microarray studies (19% as per a pilot to this 0.46 institution.international.collaboration study [60]). The NIH may address these issues soon within its 20.78 institution.is.higher.ed renewed commitment to make data more available [61]. 20.56 country.canada I am intrigued that publishing in an open access journal, previously 20.51 institution.stanford sharing gene expression data, and previously reusing gene expression 20.42 institution.is.medical data were associated with data sharing outcomes. More research is required to understand the drivers behind this association. Does the NO K funding or P funding factor represent an attitude towards ‘‘openness’’ by the decision- 0.56 has.R01.funding making authors? Does the act of sharing data lower the perceived 0.49 has.R.funding effort of sharing data again? Does it dispel fears induced by possible 0.41 num.post2006.morethan500k.tr negative outcomes from sharing data? To what extent does 0.41 num.post2006.morethan750k.tr recognizing the value of shared data through data reuse motivate an author to share his or her own datasets? 0.40 num.post2006.morethan1000k.tr People often wonder whether the attitude towards data sharing 20.65 has.K.funding varies with age. Although I was not able to capture author age, I 20.63 has.P.funding did estimate the number of years since first and last authors had Authors prev GEOAE sharing & OA & arry creation published their first paper. The analysis suggests that first authors 0.83 last.author.num.prev.geoae.sharing.tr with many years in the field are less likely to share data than those 0.74 last.author.num.prev.microarray.creations.tr with fewer years of experience, but no such association was found for last authors. More work is needed to confirm this finding given 0.73 last.author.num.prev.oa.tr the confounding factor of previous data-sharing experience. 0.60 first.author.num.prev.geoae.sharing.tr Gene expression publications associated with Stanford Univer- 0.47 first.author.num.prev.oa.tr sity have a very high level of data sharing. The true level of open 0.46 first.author.num.prev.microarray.creations.tr data archiving is actually much higher than that reflected in this 0.40 institution.stanford study: Stanford University hosts a public microarray repository, and many articles that did not have a dataset link from GEO or 20.44 years.ago.tr ArrayExpress do mention submission to the Stanford Microarray First author num prev pubs & first year pub Database. If one were looking for a community on which to model 0.83 first.author.num.prev.pubs.tr best practices for data sharing adoption, Stanford would be a great 0.77 first.author.year.first.pub.ago.tr place to start. 0.73 first.author.num.prev.pmc.cites.tr Similarly, Physiological Genomics has very high rates of public 0.52 first.author.num.prev.other.sharing.tr archiving relative to other journals. Perhaps not coincidentally, to my knowledge Physiological Genomics is the only journal to have doi:10.1371/journal.pone.0018657.t001 published an evaluation of their author’s attitudes and experiences following the adoption of new data archiving requirements for gene expression microarray data [21].[53,54]. In the mean time, we can make some guesses: As is Analyzing data sharing through bibliometric and data-miningappropriate, concerns about privacy of human subjects’ data attributes has several advantages: We can look at a very large set ofundoubtedly affect a researcher’s willingness and ability (perceived studies and attributes, our results are not biased by survey responseor actual) to share raw study data. I do not presume to recommend self-selection or reporting bias, and the analysis can be repeateda proper balance between privacy and the societal benefit of data over time with little additional effort.sharing, but I will emphasize that researchers should assess the However, this approach does suffer its own limitations. Filtersdegree of re-identification risk on a study-by-study basis [55], for identifying microarray creation studies do not have perfectevaluate the risks and benefits across the wide range of stakeholder precision, so some non-data-creation studies may be included ininterests [56], and consider an ethical framework to make these the analysis. Because studies that do not create data will not havedifficult decisions [57]. Learning how to make these decisions well data deposits, their inclusion alters the composition of what Iis difficult: it is vital that we educate and mentor both new and consider to be studies that create but do not share data.experienced researchers in best practices. Given the low risk of re- Furthermore, my method for detecting data deposits overlooksidentification through gene expression microarray data (illustrated data deposits that are missing PubMed identifiers in GEO andby its inclusion in the Open-Access Data Tier at http://target. ArrayExpress, so the dataset misclassifies some studies that did incancer.gov/dataportal/access/policy.asp), data-sharing rates fact share their data in these repositories.could also be low for reasons other than privacy. Cancer I made decisions to facilitate analysis, such as assuming thatresearchers may perceive their field as particularly competitive, PubMed identifiers were monotonically increasing with publica- PLoS ONE | www.plosone.org 7 July 2011 | Volume 6 | Issue 7 | e18657 37
    • Who Shares? Who Doesn’t?Figure 3. Association between shared data and first-order factors. Percentage of studies with shared data is shown for each quartile for eachfactor. Univariate analysis.doi:10.1371/journal.pone.0018657.g003tion date and using the current journal data-sharing policy as a Previous work [58] found that investigator gender wassurrogate for the data-sharing policy in place when papers were correlated with data withholding. It is important to look at genderpublished. These decisions may have introduced errors. in multivariate analysis since male scientists are more likely than Missing data may have obscured important information. For women to have large NIH grants [62]. Because gender did notexample, articles published in journals with policies that I did not contribute heavily to any of the derived factors in this study,examine had a lower rate of data sharing than articles published in additional analysis will be necessary to investigate its associationjournals whose ‘‘Instructions to Authors’’ policies I did quantify. It with data sharing behaviour in this dataset. It should be noted thatis likely that a more comprehensive analysis of journal data- the source of gender data has limitations. The Baby Name Guessersharing policies would provide additional insight. Similarly, the algorithm empirically estimates gender by analyzing popular usageinformation on funders was limited: I only included funding data on the internet. Although coverage across names from diverseon NIH grants. Inclusion of more funders would help us ethnicities seems quite good, the algorithm is relatively unsuccess-understand the general role of funder policy and funding levels. ful in determining the gender of Asian names. This may have PLoS ONE | www.plosone.org 8 July 2011 | Volume 6 | Issue 7 | e18657 38
    • Who Shares? Who Doesn’t? Table 2. Second-order factor loadings, by first-order factors. Table 3. Second-order factor loadings, by original variables. Amount of NIH funding Amount of NIH funding 0.89 Count of R01 & other NIH grants 0.87 nih.cumulative.years.tr 0.49 Large NIH grant 0.85 num.grants.via.nih.tr 20.55 NO K funding or P funding 0.84 max.grant.duration.tr Cancer & humans 0.82 num.grant.numbers.tr 0.83 Humans & cancer 0.80 pubmed.is.funded.nih OA journal & previous GEO-AE sharing 0.79 nih.max.max.dollars.tr 0.59 Authors prev GEOAE sharing & OA & microarray creation 0.70 nih.sum.avg.dollars.tr 0.43 Institution high citations & collaboration 0.70 nih.sum.sum.dollars.tr 0.31 First author num prev pubs & first year pub 0.59 has.R.funding 20.36 Last author num prev pubs & first year pub 0.59 num.post2003.morethan500k.tr Journal impact factor and policy 0.58 country.usa 0.57 Journal impact 0.58 has.U.funding 0.51 Last author num prev pubs & first year pub 0.57 has.R01.funding Higher Ed in USA 0.55 num.post2003.morethan750k.tr 0.40 NO geo reuse+YES high institution output 0.53 has.T.funding 20.44 Institution is government & NOT higher ed 0.53 num.post2003.morethan1000k.tr doi:10.1371/journal.pone.0018657.t002 0.49 num.post2004.morethan500k.tr 0.45 num.post2004.morethan750k.trconfounded the gender analysis, and the ‘‘gender not found’’ 0.44 has.P.fundingvariable might have served as an unexpected proxy for author 0.43 num.post2004.morethan1000k.trethnicity. 0.43 num.nih.is.nci.tr The Author-ity system provides accurate author publication 0.35 num.post2005.morethan500k.trhistories: A previous evaluation on a different sample found thatonly 0.5% of publication histories erroneously included more than 0.32 num.nih.is.nigms.trone author, and about 2% of clusters contained a partial inventory 0.31 num.post2005.morethan750k.trof an author’s publication history due to splitting a given author Cancer & humansacross multiple clusters [46]. However, because the lumping does 0.60 pubmed.is.cancernot occur randomly, my attributes based on author publication 0.59 pubmed.is.humanshistories may have included some bias. For example, thedocumented tendency of Author-ity to erroneously lump common 0.52 pubmed.is.cultured.cellsJapanese names [46] may have confounded the author-history 0.43 pubmed.is.core.clinical.journalvariables with author-ethnicity and thereby influenced the findings 0.39 institution.is.medicalon first-author age and experience. 20.58 pubmed.is.plants In previous work I used h-index and a-index metrics to 20.50 pubmed.is.fungimeasure ‘‘author experience’’ for both the first and last author 20.37 pubmed.is.shared.other[60] (in biomedicine, customarily, the first and last authorsmake the largest contributions to a study and have the most 20.30 pubmed.is.bacteriapower in publication decisions). A recent paper [63] suggests OA journal & previous GEO-AE sharingthat a raw count of number of papers and number of citations 0.40 first.author.num.prev.geoae.sharing.tris functionally equivalent to the h-index and a-index, so I used 0.37 pubmed.is.open.accessthe raw counts in this study for computational simplicity. 0.37 first.author.num.prev.oa.trReliance on citations from PubMed Central (to enable scripteddata collection) meant that older studies and those published in 0.35 last.author.num.prev.geoae.sharing.trareas less well represented in PubMed Central were character- 0.32 pubmed.is.effectivenessized by an artificially low citation count. 0.32 last.author.num.prev.oa.tr The large sample of 11,603 studies captured a fairly diverse and 0.31 pubmed.is.geo.reuserepresentative subset of gene expression microarray studies, though it 20.38 country.japanis possible that gene expression microarray studies missed by the full- Journal impact factor and policytext filter differed in significant ways from those that used mainstreamvocabulary to describe their wetlab methods. Selecting a sample 0.48 journal.impact.factor.logbased on queries of non-subscription full-text content may have 0.47 jour.policy.requires.microarray.accessionintroduced a slight bias towards open access journals. It is worth 0.46 jour.policy.mentions.exceptionsnoting that this study demonstrates the value of open access and open 0.46 pubmed.num.cites.from.pmc.trfull-text resources for research evaluation. 0.45 journal.5yr.impact.factor.log In regression studies it is important to remember that associationsdo not imply causation. It is possible, for example, that receiving a 0.45 jour.policy.contains.word.miame.mged PLoS ONE | www.plosone.org 9 July 2011 | Volume 6 | Issue 7 | e18657 39
    • Who Shares? Who Doesn’t? Table 3. Cont. high level of NIH funding and deciding to share data are not causally related, but rather result from the exposure and excitement inherent in a ‘‘hot’’ subfield of study. 0.42 last.author.num.prev.pmc.cites.tr Importantly, this study did not consider directed sharing, such as peer-to-peer data exchange or sharing within a 0.41 jour.policy.requests.accession defined collaboration network, and thus underestimates the 0.40 journal.immediacy.index.log amount of data sharing in all its forms. Furthermore, this 0.40 journal.num.articles.2008.tr study underestimated public sharing of gene expression data 0.39 years.ago.tr on the Internet. It did not recognize data listed in journal 0.36 jour.policy.says.must.deposit supplementary information, on lab or personal web sites, or in 0.35 pubmed.num.cites.from.pmc.per.year institutional or specialized repositories (including the well- regarded and well-populated Stanford Microarray Database). 0.33 institution.mean.norm.citation.score Finally, the study methods did not recognize deposits into the 0.32 last.author.year.first.pub.ago.tr Gene Expression Omnibus or ArrayExpress unless the 0.31 country.usa database entry was accompanied by a citation to the research 0.31 last.author.num.prev.pubs.tr paper, complete with PubMed identifier. 0.31 jour.policy.contains.word.microarray Due to these limitations, care should be taken in interpreting 20.31 pubmed.is.open.access the estimated levels of absolute data sharing and the data- Higher Ed in USA sharing status of any particular study listed in the raw data. More research is needed to attain a deep understanding of 0.36 institution.stanford information behaviour around research data sharing, its costs 0.36 institution.is.higher.ed and benefits to science, society and individual investigators, and 0.35 country.usa what makes for effective policy. 0.35 has.R.funding That said, the results presented here argue for action. Even 0.33 has.R01.funding in a field with mature policies, repositories and standards, 0.30 institution.harvard research data sharing levels are low and increasing only slowly, 20.37 institution.is.govnt and data is least available in areas where it could make the biggest impact. Let’s learn from those with high rates of sharing doi:10.1371/journal.pone.0018657.t003 and work to embrace the full potential of our research output.Figure 4. Odds ratios of data sharing for first-order factor, multivariate model. Odd ratios are calculated as factor scores are each variedfrom their 25th percentile value to their 75th percentile value. Horizontal lines show the 95% confidence intervals of the odds ratios.doi:10.1371/journal.pone.0018657.g004 PLoS ONE | www.plosone.org 10 July 2011 | Volume 6 | Issue 7 | e18657 40
    • Who Shares? Who Doesn’t?Figure 5. Association between shared data and second-order factors. Percentage of studies with shared data is shown for each quartile foreach factor. Univariate analysis.doi:10.1371/journal.pone.0018657.g005Figure 6. Odds ratios of data sharing for second-order factor, multivariate model. Odd ratios are calculated as factor scores are eachvaried from their 25th percentile value to their 75th percentile value. Horizontal lines show the 95% confidence intervals of the odds ratios.doi:10.1371/journal.pone.0018657.g006 PLoS ONE | www.plosone.org 11 July 2011 | Volume 6 | Issue 7 | e18657 41
    • Who Shares? Who Doesn’t? Table 4. Data sharing prevalence of subgroups divided at medians of two second-order factors [95% confidence intervals]. number of studies with shared Above the median value for the Below the median value for the factor data/ number of studies factor ‘‘Cancer & Humans’’ ‘‘Cancer & Humans’’ Total Above the median value for 629/2624 = 24% [22%, 26%] 1184/3178 = 37% [36%, 39%] 1813/5802 = 31% [30%, 32%] the factor ‘‘OA and previous GEO-AE sharing’’ Below the median value for 440/3178 = 14% [13%, 15%] 648/2623 = 25% [23%, 26%] 1088/5801 = 19% [18%, 20%] the factor ‘‘OA and previous GEO-AE sharing’’ Total 1069/5802 = 18% [17%, 19%] 1832/5801 = 32% [30%, 33%] 2901/11603 = 25% [24%, 26%] doi:10.1371/journal.pone.0018657.t004Availability of dataset, statistical scripts, and data Figure S4 Associations between shared data and fund-collection source code ing attribute variables. Percentage of studies with shared data Raw data and statistical scripts are available in the Dryad data is shown for each quartile for continuous variables.repository at doi:10.5061/dryad.mf1sd [64]. Data collection (EPS)source code is available at http://github.com/hpiwowar/pypub. Figure S5 Associations between shared data and coun- try and institution attribute variables. Percentage of studiesSupporting Information with shared data is shown for each quartile for continuous variables.Figure S1 Associations between shared data and author (EPS)attribute variables. Percentage of studies with shared data isshown for each quartile for continuous variables. Table S1 Structure matrix with correlations between all(EPS) attributes and first-order factors. (TXT)Figure S2 Associations between shared data and journalattribute variables. Percentage of studies with shared data is Acknowledgmentsshown for each quartile for continuous variables.(EPS) I sincerely thank my doctoral dissertation advisor, Dr. Wendy Chapman, for her support and feedback.Figure S3 Associations between shared data and studyattribute variables. Percentage of studies with shared data is Author Contributionsshown for each quartile for continuous variables. Conceived and designed the experiments: HAP. Performed the experi-(EPS) ments: HAP. Analyzed the data: HAP. Contributed reagents/materials/ analysis tools: HAP. Wrote the paper: HAP.References 1. McCain K (1995) Mandating Sharing: Journal Policies in the Natural Sciences. 13. Barrett T, Troup D, Wilhite S, Ledoux P, Rudnev D, et al. (2007) NCBI GEO: Science Communication 16: 403–431. mining tens of millions of expression profiles–database and tools update. Nucleic 2. Piwowar H, Chapman W (2008) A review of journal policies for sharing research Acids Res 35. data ELPUB, Toronto Canada. 14. Noor MA, Zimmerman KJ, Teeter KC (2006) Data Sharing: How Much 3. National Institutes of Health (NIH) (2003) NOT-OD-03-032: Final NIH Doesn’t Get Submitted to GenBank? PLoS Biol 4: e228. Statement on Sharing Research Data. 15. Ochsner SA, Steffen DL, Stoeckert CJ, McKenna NJ (2008) Much room for 4. National Institutes of Health (NIH) (2007) NOT-OD-08-013: Implementation improvement in deposition rates of expression microarray datasets. Nature Guidance and Instructions for Applicants: Policy for Sharing of Data Obtained Methods 5: 991. in NIH-Supported or Conducted Genome-Wide Association Studies 16. Reidpath DD, Allotey PA (2001) Data sharing in medical research: an empirical (GWAS). investigation. Bioethics 15: 125–134. 5. Nation Science Foundation (NSF) (2011) Grant Proposal Guide, Chapter 17. Kyzas P, Loizou K, Ioannidis J (2005) Selective reporting biases in cancer II.C.2.j. prognostic factor studies. J Natl Cancer Inst 97: 1043–1055. 18. Blumenthal D, Campbell EG, Gokhale M, Yucel R, Clarridge B, et al. (2006) 6. Fienberg SE, Martin ME, Straf ML (1985) Sharing research data National Data withholding in genetics and the other life sciences: prevalences and Academy Press. predictors. Acad Med 81: 137–145. 7. Cech TR, Eddy SR, Eisenberg D, Hersey K, Holtzman SH, et al. (2003) 19. Campbell EG, Clarridge BR, Gokhale M, Birenbaum L, Hilgartner S, et al. Sharing Publication-Related Data and Materials: Responsibilities of Authorship (2002) Data withholding in academic genetics: evidence from a national survey. in the Life Sciences. Plant Physiol 132: 19–24. JAMA 287: 473–480. 8. (2007) Time for leadership. Nat Biotech 25: 821–821. 20. Hedstrom M (2006) Producing Archive-Ready Datasets: Compliance, Incen- 9. (2007) Got data? Nat Neurosci 10: 931–931. tives, and Motivation. IASSIST Conference 2006: Presentations.10. Kakazu KK, Cheung LW, Lynne W (2004) The Cancer Biomedical Informatics 21. Ventura B (2005) Mandatory submission of microarray data to public Grid (caBIG): pioneering an expansive network of information and tools for repositories: how is it working? Physiol Genomics 20: 153–156. collaborative cancer research. Hawaii Med J 63: 273–275. 22. Giordano R (2007) The Scientist: Secretive, Selfish, or Reticent? A Social11. The GAIN Collaborative Research Group (2007) New models of collaboration Network Analysis. E-Social Science. in genome-wide association studies: the Genetic Association Information 23. Hedstrom M, Niu J (2008) Research Forum Presentation: Incentives to Create Network. Nat Genet 39: 1045–1051. ‘‘Archive-Ready’’ Data: Implications for Archives and Records Management.12. Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, et al. (2001) Society of American Archivists Annual Meeting. Minimum information about a microarray experiment (MIAME)-toward 24. Niu J (2006) Incentive study for research data sharing. A case study on NIJ standards for microarray data. Nat Genet 29: 365–371. grantees. PLoS ONE | www.plosone.org 12 July 2011 | Volume 6 | Issue 7 | e18657 42
    • Who Shares? Who Doesn’t?25. Lowrance W Access to Collections of Data and Materials for Heath Research: A 45. Yu W, Yesupriya A, Wulf A, Qu J, Gwinn M, et al. (2007) An automatic method report to the Medical Research Council and the Wellcome Trust. to generate domain-specific investigator networks using PubMed abstracts. BMC26. University of Nottingham JULIET: Research funders’ open access policies. medical informatics and decision making 7: 17.27. Brown C (2003) The changing face of scientific discourse: Analysis of genomic 46. Torvik V, Smalheiser NR (2009) Author Name Disambiguation in MEDLINE. and proteomic database usage and acceptance. Journal of the American Society Transactions on Knowledge Discovery from Data. pp 1–37. for Information Science and Technology 54: 926–938. 47. Bird S, Loper E (2006) Natural Language Toolkit. Available: http://nltk.28. McCullough BD, McGeary KA, Harrison TD (2008) Do Economics Journal sourceforge.net/. Archives Promote Replicable Research? Canadian Journal of Economics 41: 48. R Development Core Team (2008) R: A Language and Environment for 1406–1420. Statistical Computing. Vienna, Austria: ISBN 3-900051-07-0.29. Constant D, Kiesler S, Sproull L (1994) What’s mine is ours, or is it? A study of 49. Theus M, Urbanek S (2008) Interactive Graphics for Data Analysis: Principles attitudes about information sharing. Information Systems Research 5: 400–421. and Examples (Computer Science and Data Analysis) Chapman & Hall/CRC.30. Matzler K, Renzl B, Muller J, Herting S, Mooradian T (2008) Personality traits 50. Harrell FE (2001) Regression Modeling Strategies Springer. and knowledge sharing. Journal of Economic Psychology 29: 301–313. 51. Gorsuch RL (1983) Factor Analysis, Second Edition Psychology Press.31. Ryu S, Ho SH, Han I (2003) Knowledge sharing behavior of physicians in 52. Vickers AJ (2008 January 22) Cancer Data? Sorry, Can’t Have It The New York hospitals. Expert Systems With Applications 25: 113–122. Times.32. Bitzer J, Schrettl W, Schroder PJH (2007) Intrinsic motivation in open source ¨ software development. Journal of Comparative Economics 35: 160–169. 53. Siemsen E, Roth A, Balasubramanian S (2008) How motivation, opportunity,33. Kim J (2007) Motivating and Impeding Factors Affecting Faculty Contribution and ability drive knowledge sharing: The constraining-factor model. Journal of to Institutional Repositories. Journal of Digital Information 8: 2. Operations Management 26: 426–445.34. Seonghee K, Boryung J (2008) An analysis of faculty perceptions: Attitudes 54. Tucker J (2009) Motivating Subjects: Data Sharing in Cancer Research [PhD toward knowledge sharing and collaboration in an academic institution. Library dissertation.] Virginia Polytechnic Institute and State University. 30: 282–290. 55. Malin B, Karp D, Scheuermann RH (2010) Technical and policy approaches to35. Warlick S, Vaughan K (2007) Factors influencing publication choice: why balancing patient privacy and data sharing in clinical and translational research. faculty choose open access. Biomed Digit Libr 4: 1. J Investig Med 58: 11–18.36. Lee C, Dourish P, Mark G (2006) The human infrastructure of cyberinfras- 56. Foster M, Sharp R (2007) Share and share alike: deciding how to distribute the tructure. Proceedings of the 2006 20th anniversary conference on Computer scientific and social benefits of genomic data. Nat Rev Genet 8: 633–639. supported cooperative work. Banff, Canada. 57. Navarro R (2008) An ethical framework for sharing patient data without37. Kuo F, Young M (2008) A study of the intention–action gap in knowledge consent. Inform Prim Care 16: 257–262. sharing practices. Journal of the American Society for Information Science and 58. Blumenthal D, Campbell E, Anderson M, Causino N, Louis K (1997) Technology 59: 1224–1237. Withholding research results in academic life science. Evidence from a national38. Rhodes DR, Yu J, Shanker K, Deshpande N, Varambally R, et al. (2004) Large- survey of faculty. JAMA 277: 1224–1228. scale meta-analysis of cancer microarray data identifies common transcriptional 59. Vogeli C, Yucel R, Bendavid E, Jones L, Anderson M, et al. (2006) Data profiles of neoplastic transformation and progression. Proc Natl Acad Sci U S A withholding and the next generation of scientists: results of a national survey. 101: 9309–9314. Acad Med 81: 128–136.39. Hrynaszkiewicz I, Altman D (2009) Towards agreement on best practice for 60. Piwowar HA, Chapman WW (2010) Public Sharing of Research Datasets: A publishing raw clinical trial data. Trials 10: 17. Pilot Study of Associations. Journal of Informetrics 4: 148–156.40. Parkinson H, Kapushesky M, Shojatalab M, Abeygunawardena N, Coulson R, 61. Wellcome Trust (2010) Sharing research data to improve public health: full joint et al. (2007) ArrayExpress–a public database of microarray experiments and statement by funders of health research. gene expression profiles. Nucleic Acids Res 35(Database issue): D747–D750. 62. Hosek SD, Cox AG, Ghosh-Dastidar B, Kofner A, Ramphal N, et al. (2005)41. Ball CA, Brazma A, Causton H, Chervitz S, Edgar R, et al. (2004) Submission of Gender Differences in Major Federal External Grant Programs RAND microarray data to public repositories. PLoS Biol 2: e317. Corporation.42. (2002) Microarray standards at last. Nature 419: 323. 63. Bornmann L, Mutz R, Daniel H-D (2009) Do we need the h index and its43. Piwowar HA (2010) Foundational studies for measuring the impact, prevalence, variants in addition to standard bibliometric measures? Journal of the American and patterns of publicly sharing biomedical research data. Doctoral Dissertation: Society for Information Science and Technology 60: 1286–1289. University of Pittsburgh.44. Piwowar H, Chapman W (2010) Recall and bias of retrieving gene expression 64. Piwowar HA (2011) Data From: Who Shares? Who Doesn’t? Factors Associated microarray datasets through PubMed identifiers. J Biomed Discov Collab 5: with Openly Archiving Raw Research Data. Dryad Digital Repository. 7–20. Available doi:10.5061/dryad.mf1sd. PLoS ONE | www.plosone.org 13 July 2011 | Volume 6 | Issue 7 | e18657 43
    • The n e w e ng l a n d j o u r na l of m e dic i n e Special article Selective Publication of Antidepressant Trials and Its Influence on Apparent Efficacy Erick H. Turner, M.D., Annette M. Matthews, M.D., Eftihia Linardatos, B.S., Robert A. Tell, L.C.S.W., and Robert Rosenthal, Ph.D. A BS T R AC T BACKGROUNDFrom the Departments of Psychiatry Evidence-based medicine is valuable to the extent that the evidence base is complete(E.H.T., A.M.M.) and Pharmacology and unbiased. Selective publication of clinical trials — and the outcomes within(E.H.T.), Oregon Health and Science Uni-versity; and the Behavioral Health and those trials — can lead to unrealistic estimates of drug effectiveness and alter theNeurosciences Division, Portland Veter- apparent risk–benefit ratio.ans Affairs Medical Center (E.H.T., A.M.M.,R.A.T.) — both in Portland, OR; the De-partment of Psychology, Kent State Uni- METHODSversity, Kent, OH (E.L.); the Department We obtained reviews from the Food and Drug Administration (FDA) for studies ofof Psychology, University of California– 12 antidepressant agents involving 12,564 patients. We conducted a systematic lit-Riverside, Riverside (R.R.); and HarvardUniversity, Cambridge, MA (R.R.). Address erature search to identify matching publications. For trials that were reported in thereprint requests to Dr. Turner at Portland literature, we compared the published outcomes with the FDA outcomes. We alsoVA Medical Center, P3MHDC, 3710 SW compared the effect size derived from the published reports with the effect size de-US Veterans Hospital Rd., Portland, OR97239, or at turnere@ohsu.edu. rived from the entire FDA data set.N Engl J Med 2008;358:252-60. RESULTSCopyright © 2008 Massachusetts Medical Society. Among 74 FDA-registered studies, 31%, accounting for 3449 study participants, were not published. Whether and how the studies were published were associated with the study outcome. A total of 37 studies viewed by the FDA as having positive results were published; 1 study viewed as positive was not published. Studies viewed by the FDA as having negative or questionable results were, with 3 exceptions, either not published (22 studies) or published in a way that, in our opinion, conveyed a posi- tive outcome (11 studies). According to the published literature, it appeared that 94% of the trials conducted were positive. By contrast, the FDA analysis showed that 51% were positive. Separate meta-analyses of the FDA and journal data sets showed that the increase in effect size ranged from 11 to 69% for individual drugs and was 32% overall. CONCLUSIONS We cannot determine whether the bias observed resulted from a failure to submit manuscripts on the part of authors and sponsors, from decisions by journal editors and reviewers not to publish, or both. Selective reporting of clinical trial results may have adverse consequences for researchers, study participants, health care profes- sionals, and patients.252 n engl j med 358;3  www.nejm.org  january 17, 2008 The New England Journal of Medicine 44 Downloaded from nejm.org at THE NATIONAL ACADEMIES on July 12, 2012. For personal use only. No other uses without permission. Copyright © 2008 Massachusetts Medical Society. All rights reserved.
    • Selective Publication of Antidepressant Trials M edical decisions are based on an Information Act.19 Reviews for the four newer understanding of publicly reported clin- antidepressants were available on the FDA Web ical trials.1,2 If the evidence base is bi- site.17,20 This study was approved by the Research ased, then decisions based on this evidence may and Development Committee of the Portland Vet- not be the optimal decisions. For example, selec- erans Affairs Medical Center; because of its na- tive publication of clinical trials, and the outcomes ture, informed consent from individual patients within those trials, can lead to unrealistic estimates was not required. of drug effectiveness and alter the apparent risk– From the FDA reviews of submitted clinical tri- benefit ratio.3,4 als, we extracted efficacy data on all randomized, Attempts to study selective publication are com- double-blind, placebo-controlled studies of drugs plicated by the unavailability of data from unpub- for the short-term treatment of depression. We in- lished trials. Researchers have found evidence for cluded data pertaining only to dosages later ap- selective publication by comparing the results of proved as safe and effective; data pertaining to published trials with information from surveys of unapproved dosages were excluded. authors,5 registries,6 institutional review boards,7,8 We extracted the FDA’s regulatory decisions — and funding agencies,9,10 and even with published that is, whether, for purposes of approval, the methods.11 Numerous tests are available to detect studies were judged to be positive or negative with selective-reporting bias, but none are known to be respect to the prespecified primary outcomes (or capable of detecting or ruling out bias reliably.12‑16 primary end points).21 We classified as question- In the United States, the Food and Drug Ad- able those studies that the FDA judged to be nei- ministration (FDA) operates a registry and a re- ther positive nor clearly negative — that is, stud- sults database.17 Drug companies must register ies that did not have significant findings on the with the FDA all trials they intend to use in sup- primary outcome but did have significant findings port of an application for marketing approval or on several secondary outcomes. Failed studies22 a change in labeling. The FDA uses this informa- were also classified as questionable (for more in- tion to create a table of all studies.18 The study formation, see the Methods section of the Supple- protocols in the database must prospectively iden- mentary Appendix, available with the full text of tify the exact methods that will be used to collect this article at www.nejm.org). For fixed-dose stud- and analyze data. Afterward, in their marketing ies (studies in which patients are randomly as- application, sponsors must report the results ob- signed to receive one of two or more dose levels tained using the prespecified methods. These or placebo) with a mix of significant and nonsig- submissions include raw data, which FDA statis- nificant results for different doses, we used the ticians use in corroborative analyses. This system FDA’s stated overall decisions on the studies. We prevents selective post hoc reporting of favorable used double data extraction and entry, as detailed trial results and outcomes within those trials. in the Methods section of the Supplementary Ap- How accurately does the published literature pendix. convey data on drug efficacy to the medical com- munity? To address this question, we compared Data from Journal Articles drug efficacy inferred from the published litera- Our literature-search strategy consisted of the ture with drug efficacy according to FDA reviews. following steps: a search of articles in PubMed, a search of references listed in review articles, and Me thods a search of the Cochrane Central Register of Con- trolled Trials; contact by telephone or e-mail with Data from FDA Reviews the drug sponsor’s medical-information depart- We identified the phase 2 and 3 clinical-trial pro- ment; and finally, contact by means of a certified grams for 12 antidepressant agents approved by letter sent to the sponsor’s medical-information the FDA between 1987 and 2004 (median, August department, including a deadline for responding 1996), involving 12,564 adult patients. For the eight in writing to our query about whether the study older antidepressants, we obtained hard copies of results had been published. If these steps failed statistical and medical reviews from colleagues to reveal any publications, we concluded that the who had procured them through the Freedom of study results had not been published. n engl j med 358;3  www.nejm.org  january 17, 2008 253 The New England Journal of Medicine 45Downloaded from nejm.org at THE NATIONAL ACADEMIES on July 12, 2012. For personal use only. No other uses without permission. Copyright © 2008 Massachusetts Medical Society. All rights reserved.
    • The n e w e ng l a n d j o u r na l of m e dic i n e We identified the best match between the FDA- the above calculation. Rather, P values were often reviewed clinical trials and journal articles on the indicated as being below or above a certain thresh- basis of the following information: drug name, old — for example, P<0.05 or “not significant” dose groups, sample size, active comparator (if (i.e., P>0.05). In these cases, we followed the pro- used), duration, and name of principal investiga- cedure described in the Supplementary Appendix. tor. We sought published reports on individual For each fixed-dose (multiple-dose) study, we studies; articles covering multiple studies were ex- computed a single study-level effect size weighted cluded. When the results of a trial were reported by the degrees of freedom for each dose group. in two or more primary publications, we selected On the basis of the study-level effect-size values the first publication. for both fixed-dose and flexible-dose studies, we Few journal articles used the term “primary ef- calculated weighted mean effect-size values for ficacy outcome” or a reasonable equivalent. There- each drug and for all drugs combined, using a fore, we identified the apparent primary efficacy random-effects model with the method of Der- outcome, or the result highlighted most promi- Simonian and Laird27 in Stata.28 nently, as the drug–placebo comparison reported Within the published studies, we compared the first in the text of the results section or in the table effect-size values derived from the journal articles or figure first cited in the text. As with the FDA with the corresponding effect-size values derived reviews, we used double data extraction and entry from the FDA reviews. Next, within the FDA data (see the Methods section of the Supplementary Ap- set, we compared the effect-size values for the pendix for details). published studies with the effect-size values for the unpublished studies. Finally, we compared the Statistical Analysis journal-based effect-size values with those derived We categorized the trials on the basis of the FDA from the entire FDA data set — that is, both pub- regulatory decision, whether the trial results were lished and unpublished studies. published, and whether the apparent primary out- We made these comparisons at the level of comes agreed or conflicted with the FDA decision. studies and again at the level of the 12 drugs. Be- We calculated risk ratios with exact 95% confi- cause the data were not normally distributed, we dence intervals and Pearson’s chi-square analysis, used the nonparametric rank-sum test for un- using Stata software, version 9. We used a simi- paired data and the signed-rank test for paired lar approach to examine the numbers of patients data. In these analyses, all the effect-size values within the studies. Sample sizes were compared were given equal weight. between published and unpublished studies with the use of the Wilcoxon rank-sum test. R e sult s For our major outcome indicator, we calculat- ed the effect size for each trial using Hedges’s g Study Outcome and Publication Status — that is, the difference between two means di- Of the 74 FDA-registered studies in the analysis vided by their pooled standard deviation.23 How- we could not find evidence of publication for 23 ever, because means and standard deviations (or (31%) (Table 1). The difference between the sam- standard errors) were inconsistently reported in ple sizes for the published studies (median, 153 both the FDA reviews and the journal articles, we patients) and the unpublished studies (median, used the algebraically equivalent computational 146 patients) was neither large nor significant equation24: (5% difference between medians; P = 0.29 by the rank-sum test). √ g = t ×  1 1 . The data in Table 1 are displayed in terms of ndrug + nplacebo the study outcome in Figure 1A. The questions of We calculated the t statistic25 using the precise whether the studies were published and, if so, how P value and the combined sample size as argu- the results were reported were strongly related ments in Microsoft Excel’s TINV (inverse T) func- to their overall outcomes. The FDA deemed 38 of tion, multiplying t by −1 when the study drug was the 74 studies (51%) positive, and all but 1 of the inferior to the placebo. Hedges’s correction for 38 were published. The remaining 36 studies (49%) small sample size was applied to all g values.26 were deemed to be either negative (24 studies) or Precise P values were not always available for questionable (12). Of these 36 studies, 3 were pub-254 n engl j med 358;3  www.nejm.org  january 17, 2008 The New England Journal of Medicine 46 Downloaded from nejm.org at THE NATIONAL ACADEMIES on July 12, 2012. For personal use only. No other uses without permission. Copyright © 2008 Massachusetts Medical Society. All rights reserved.
    • Selective Publication of Antidepressant Trials lished as not positive, whereas the remaining 33 Table 1. Overall Publication Status of FDA-Registered Antidepressant either were not published (22 studies) or were pub- Studies. lished, in our opinion, as positive (11) and therefore conflicted with the FDA’s conclusion. Overall, the No. of No. of Patients Publication Status Studies (%) in Studies (%) studies that the FDA judged as positive were ap- Published results agree with FDA 40 (54) 7,272 (58) proximately 12 times as likely to be published in decision a way that agreed with the FDA analysis as were Published results conflict with FDA 11 (15) 1,843 (15) studies with nonpositive results according to the decision (published as positive) FDA (risk ratio, 11.7; 95% confidence interval [CI], Results not published 23 (31) 3,449 (27) 6.2 to 22.0; P<0.001). This association of publi- Total 74 (100) 12,564 (100) cation status with study outcome remained sig- nificant when we excluded questionable studies and when we examined publication status with- specified primary outcome was nonsignificant, out regard to whether the published conclusions each publication highlighted a positive result as and the FDA conclusions were in agreement (for if it were the primary outcome. The nonsignificant details, see the Supplementary Appendix). results for the prespecified primary outcomes were Overall, 48 of the 51 published studies were either subordinated to nonprimary positive re- reported to have positive results (94%; binomial sults (in two reports) or omitted (in nine). (Study- 95% CI, 84 to 99). According to the FDA, 38 of the level methodologic differences are detailed in the 74 registered studies had positive results (51%; footnotes to Table B of the Supplementary Ap- 95% CI, 39 to 63). There was no overlap between pendix.) these two sets of confidence intervals. These data are broken down by drug and study Effect Size number in Figure 2A. For each of the 12 drugs, The effect-size values derived from the journal re- the results of at least one study either were unpub- ports were often greater than those derived from lished or were reported in the literature as posi- the FDA reviews. The difference between these two tive despite a conflicting judgment by the FDA. sets of values was significant whether the studies (P = 0.003) or the drugs (P = 0.012) were used as Number of Study Participants the units of analysis (see Table D in the Supple- As shown in Table 1, a total of 12,564 patients mentary Appendix). participated in these trials. The data from 3449 The effect sizes of the published and unpub- patients (27%) were not published. Data from an lished studies reviewed by the FDA are compared additional 1843 patients (15%) were reported in in Figure 3A. The overall mean weighted effect- journal articles in which the highlighted finding size value was 0.37 (95% CI, 0.33 to 0.41) for pub- conflicted with the FDA-defined primary outcome. lished studies and 0.15 (95% CI, 0.08 to 0.22) for Thus, the percentages for the patients closely mir- unpublished studies. The difference was signifi- rored those for the studies (Table 1). cant whether the studies (P<0.001) or the drugs Whether a patient’s data were reported in a way (P = 0.005) were used as the units of analysis that was in concert with the FDA review was as- (Table D in the Supplementary Appendix). sociated with the study outcome (Fig. 1B) (risk The mean effect-size values for all FDA stud- ratio, 27.1), which was consistent with the above- ies, both published and unpublished, are com- reported finding with the studies. Figure 2B shows pared with those for all published studies, as these same data according to the drug being shown in Figure 3B. Again, the differences were evaluated. significant whether the studies (P<0.001) or the drugs (P = 0.002) were used as units of analysis Qualitative Description of Selective (Table D in the Supplementary Appendix). Reporting within Trials For each of the 12 drugs, the effect size derived The methods reported in 11 journal articles ap- from the journal articles exceeded the effect size pear to depart from the prespecified methods re- derived from the FDA reviews (sign test, P<0.001) flected in the FDA reviews (Table B of the Supple- (Fig. 3B). The magnitude of the increases in effect mentary Appendix). Although for each of these size between the FDA reviews and the published studies the finding with respect to the protocol- reports ranged from 11 to 69%, with a median n engl j med 358;3  www.nejm.org  january 17, 2008 255 The New England Journal of Medicine 47Downloaded from nejm.org at THE NATIONAL ACADEMIES on July 12, 2012. For personal use only. No other uses without permission. Copyright © 2008 Massachusetts Medical Society. All rights reserved.
    • The n e w e ng l a n d j o u r na l of m e dic i n e Published, agrees with FDA decision Figure 2 (facing page). Publication Status and FDA Published, conflicts with FDA decision Regulatory Decision by Study and by Drug. Not published Panel A shows the publication status of individual studies. Nearly every study deemed positive by the A Studies (N=74) FDA (top row) was published in a way that agreed with FDA Decision the FDA’s judgment. By contrast, most studies Positive 37 deemed negative (bottom row) or questionable (mid- (N=38) (97%) dle row) by the FDA either were published in a way that 1 conflicted with the FDA’s judgment or were not pub- (3%) Questionable 6 6 lished. Numbers shown in boxes indicate individual (N=12) (50%) (50%) studies and correspond to the study numbers listed in Table A of the Supplementary Appendix. Panel B shows the numbers of patients participating in the Negative 5 16 individual studies indicated in Panel A. Data for pa- (N=24) (21%) (67%) tients who participated in studies deemed positive by 3 the FDA were very likely to be published in a way that (12%) agreed with the FDA’s judgment. By contrast, data for patients who participated in studies deemed negative 0 10 20 30 40 or questionable by the FDA tended either not to be No. of Studies published or to be published in a way that conflicted with the FDA’s judgment. B Patients in Studies (N=12,564) FDA Decision Positive 7075 A list of the study-level effect-size values used (N=7155) (99%) in the above analyses — derived from both the 80 FDA reviews and the published reports — is pro- (1%) Questionable 1180 1129 vided in Table C of the Supplementary Appendix. (N=2309) (51%) (49%) These effect-size values are based on P values and sample sizes shown in Table A of the Supplemen- Negative 2240 (N=3100) (72%) tary Appendix, which also lists reference infor- mation for the publications consulted. 197 663 (6%) (21%) 0 2000 4000 6000 Dis cus sion No. of Patients We found a bias toward the publication of posi- Figure 1. Effect of FDA Regulatory Decisions tive results. Not only were positive results more on Publication. ICM AUTHOR: Turner likely to be published, but studies that were not RETAKE 1st FIGURE: 1 of reviewed by the FDA (Panel A), 3 2nd AmongFthe 74 studies REG positive, in our opinion, were often published in 3rd 38 were deemed to have positive results, 37Revised CASE of which a way that conveyed a positive outcome. We ana- Line were published with positive results; the EMail 4-C remaining SIZE ARTIST: ts study was not published. Among theH/T H/T lyzed these data in terms of the proportion of studies deemed Enon 16p6 Combo positive studies and in terms of the effect size to have questionable or negative results by the FDA, AUTHOR, PLEASE NOTE: associated with drug treatment. Using both ap- there was a tendency toward nonpublication or publi- Figure has been redrawn and type has been reset. proaches, we found that the efficacy of this drug cation with positive results, conflicting with the con- Please check carefully. clusion of the FDA. Among the 12,564 patients in all class is less than would be gleaned from an ex- 74 studies (Panel B), data for patients who participated JOB: 35803 amination of the published literature alone. Ac- ISSUE: 01-17-08 in studies deemed positive by the FDA were very likely cording to the published literature, the results of to be published in a way that agreed with the FDA. In nearly all of the trials of antidepressants were contrast, data for patients participating in studies positive. In contrast, FDA analysis of the trial data deemed questionable or negative by the FDA tended either not to be published or to be published in a way showed that roughly half of the trials had positive that conflicted with the FDA’s judgment. results. The statistical significance of a study’s re- sults was strongly associated with whether and increase of 32%. A 32% increase was also ob- how they were reported, and the association was served in the weighted mean effect size for all independent of sample size. The study outcome drugs combined, from 0.31 (95% CI, 0.27 to also affected the chances that the data from a par- 0.35) to 0.41 (95% CI, 0.36 to 0.45). ticipant would be published. As a result of selec-256 n engl j med 358;3  www.nejm.org  january 17, 2008 The New England Journal of Medicine 48 Downloaded from nejm.org at THE NATIONAL ACADEMIES on July 12, 2012. For personal use only. No other uses without permission. Copyright © 2008 Massachusetts Medical Society. All rights reserved.
    • Selective Publication of Antidepressant Trials Published, agrees with FDA Published, conflicts with FDA Not published A Studies FDA Decision 42 43 26 44 Positive 9 21 27 45 66 10 17 22 28 36 46 67 4 11 18 23 29 37 47 58 68 72 1 5 12 19 24 30 38 48 59 61 69 73 13 14 Questionable 15 49 16 20 39 50 60 62 70 74 51 52 31 53 Negative 32 54 6 33 55 63 2 7 34 40 56 64 3 8 25 35 41 57 65 71 K R ) F m ba ul st) (L Esc i Lil e ap alo y) , F am za ox ) iL e ) to N an e ye fazo ) ui e xo ar b) , G Pa hKl ne Sm tin ) Kl R (Z Se ne) Pf e af ) W ne xo afa th) W XR ) (C Ci line t er Mir illy on la rox ine r th l n El in rg pin Sq on t, lin ith n S ith e C (P F res (E Ve ize l a, ra b , E eti it i r, xi ym D ore ye ye c, et et ro pr i ne of ra rs d ex op xo a , O za Sm io o lta ox Sm ox ol rt r X xi xo op el tal on ta ffe nl ro lu xo e R, ex it la pr la P l-M e ffe nl , G Bu (E Ve em G (C ris il, SR CR (R ax ,B (P rin il ne ax ut zo (P lb er el (S (W B Patients in Studies FDA Decision Positive N=1059 N=1046 N=1100 N=698 N=514 N=383 N=419 N=548 N=428 N=283 N=367 N=230 N=80 N=408 Questionable N=413 N=80 N=347 N=249 N=158 N=187 N=238 N=81 N=148 N=107 N=90 N=627 N=272 N=125 Negative N=501 N=387 185 N=185 N=42 N=299 N=241 N=224 K R ) F m ba ul st) (L Esc i Lil e ap alo y) , F am za ox ) i e ) to N ano e ye fazo ) ui e xo ar b) , G Pa hKl ne Sm tin ) Kl R (Z Se ne) Pf e ) (E Ve , W ine xo afa th) W XR ) (C Ci line t er Mir Lilly n la rox ine (E Ve izer th l n El in rg pin on t, lin ith n S ith e C (P F res l a, ra b , E eti it i ym D ore ye ye c, et et r x ro pr i ne of ra rs d ex op xo a , O za Sm io o lta ox Sm ox af Sq ol rt r X xi xo op el tal on ta ffe nl ro lu xo e R, ex it la pr la P l-M e ffe nl , G Bu em G (C ris il, SR CR (R ax ,B (P rin il ne ax ut zo (P lb er el (S (W ICM AUTHOR: Turner RETAKE 1st FIGURE: 2 of 3 2nd REG F 3rd CASE Revised Line 4-C n EMail engl SIZE j med 358;3 tswww.nejm.org  january 17, 2008 ARTIST: 257 H/T H/T 33p9 Enon Combo The New England Journal of Medicine AUTHOR, PLEASE NOTE: 49Downloaded from nejm.org at THE NATIONAL ACADEMIES on July 12, 2012. For personal use only. No other uses without permission. Figure has been redrawn and type has been reset. Copyright © 2008 Massachusetts Medical Society. All rights reserved. Please check carefully.
    • The n e w e ng l a n d j o u r na l of m e dic i n e A FDA-Based Effect Size B Overall Effect Size Unpublished Published FDA Journals Change g g in g Bupropion SR Bupropion SR 0.14 0.17 +55% (Wellbutrin SR, (Wellbutrin SR, GlaxoSmithKline) 0.27 GlaxoSmithKline) 0.27 0.01 0.24 +25% Citalopram Citalopram (Celexa, Forest) 0.31 (Celexa, Forest) 0.30 Duloxetine 0.15 Duloxetine 0.30 +33% (Cymbalta, Eli Lilly) 0.34 (Cymbalta, Eli Lilly) 0.40 Escitalopram 0.15 Escitalopram 0.31 +16% (Lexapro, Forest) 0.35 (Lexapro, Forest) 0.36 Fluoxetine Fluoxetine 0.26 +14% (Prozac, Eli Lilly) 0.26 (Prozac, Eli Lilly) 0.29 Mirtazapine 0.19 Mirtazapine 0.35 +61% (Remeron, Organon) 0.45 (Remeron, Organon) 0.57 Nefazodone 0.09 Nefazodone 0.26 +69% (Serzone, Bristol-Myers (Serzone, Bristol-Myers Squibb) 0.33 Squibb) 0.44 Paroxetine 0.20 Paroxetine 0.42 +40% (Paxil, GlaxoSmithKline) 0.55 (Paxil, GlaxoSmithKline) 0.59 Paroxetine CR Paroxetine CR 0.32 +11% (Paxil CR, GlaxoSmithKline) 0.32 (Paxil CR, GlaxoSmithKline) 0.36 0.18 0.26 +64% Sertraline Sertraline (Zoloft, Pfizer) 0.30 (Zoloft, Pfizer) 0.42 Venlafaxine 0.11 Venlafaxine 0.40 +28% (Effexor, Wyeth) 0.45 (Effexor, Wyeth) 0.51 Venlafaxine XR 0.19 Venlafaxine XR 0.40 +27% (Effexor XR, Wyeth) 0.52 (Effexor XR, Wyeth) 0.51 Overall mean weighted 0.15 Overall mean weighted 0.31 +32% effect size 0.37 effect size 0.41 −0.2 0.0 0.2 0.5 0.0 0.2 0.5 FDA-Based Effect Size Overall Effect Size Figure 3. Mean Weighted Effect Size According to Drug, Publication Status, and Data Source. AUTHOR: Turner RETAKE 1st Values for effect size are expressed as Hedges’s g (the difference between two means divided by their pooled standard deviation). Effect- ICM 2nd FIGURE: 3 of 3 29 Effect-size values for unpublished studies and published size values of 0.2 and 0.5 are considered to be small and medium, respectively. REG F 3rd studies, as extracted from data in FDA reviews, are CASE shown in Panel A. Horizontal lines Revised indicate 95% confidence intervals. There were no un- Line of the other antidepressants, the effect size for the published published studies for controlled-release paroxetine or fluoxetine. For each EMail 4-C SIZE ARTIST: ts H/T H/T 36p6 subgroup of studies was greater than the effect size for the unpublished subgroup of studies. Overall effect-size values (i.e., based on data Enon Combo from the FDA for published and unpublished studies combined), as compared with effect-size values based on data from corresponding AUTHOR, PLEASE NOTE: published reports, are shown in Panel B. For each drug, the effect-size value based been reset. Figure has been redrawn and type has on published literature was higher than the effect-size value based on FDA data, with increases ranging from 11 to 69%. For thecarefully.drug class, effect sizes increased by 32%. Please check entire JOB: 35803 ISSUE: 01-17-08 tive reporting, the published literature conveyed nals.3,30‑32 We built on this approach by compar- an effect size nearly one third larger than the ef- ing study-level data from the FDA with matched fect size derived from the FDA data. data from journal articles. This comparative ap- Previous studies have examined the risk–ben- proach allowed us to quantify the effect of selec- efit ratio for drugs after combining data from tive publication on apparent drug efficacy. regulatory authorities with data published in jour- Our findings have several limitations: they are258 n engl j med 358;3  www.nejm.org  january 17, 2008 The New England Journal of Medicine 50 Downloaded from nejm.org at THE NATIONAL ACADEMIES on July 12, 2012. For personal use only. No other uses without permission. Copyright © 2008 Massachusetts Medical Society. All rights reserved.
    • Selective Publication of Antidepressant Trials restricted to antidepressants, to industry-spon- the rate of false positive findings (type I error), sored trials registered with the FDA, and to issues and it prevents HARKing,36 or hypothesizing af- of efficacy (as opposed to “real-world” effective- ter the results are known. ness33). This study did not account for other factors It might be argued that some trials did not that may distort the apparent risk–benefit ratio, merit publication because of methodologic flaws, such as selective publication of safety issues, as including problems beyond the control of the in- has been reported with rofecoxib (Vioxx, Merck)34 vestigator. However, since the protocols were writ- and with the use of selective serotonin-reuptake ten according to international guidelines for ef- inhibitors for depression in children.3 Because ficacy studies37 and were carried out by companies we excluded articles covering multiple studies, we with ample financial and human resources, to be probably counted some studies as unpublished that fair to the people who put themselves at risk to were — technically — published. The prac­tice of participate, a cogent public reason should be given bundling negative and positive studies in a single for failure to publish. article has been found to be associated with du- Selective reporting deprives researchers of the plicate or multiple publication,35 which may also accurate data they need to estimate effect size re- influence the apparent risk–benefit ratio. alistically. Inflated effect sizes lead to underesti- There can be many reasons why the results of mates of the sample size required to achieve sta- a study are not published, and we do not know tistical significance. Underpowered studies — and the reasons for nonpublication. Thus, we cannot selectively reported studies in general — waste determine whether the bias observed resulted from resources and the contributions of investigators a failure to submit manuscripts on the part of and study participants, and they hinder the ad- authors and sponsors, decisions by journal editors vancement of medical knowledge. By altering the and reviewers not to publish submitted manu- apparent risk–benefit ratio of drugs, selective pub- scripts, or both. lication can lead doctors to make inappropriate We wish to clarify that nonsignificance in a prescribing decisions that may not be in the best single trial does not necessarily indicate lack of interest of their patients and, thus, the public efficacy. Each drug, when subjected to meta-analy- health. sis, was shown to be superior to placebo. On the Dr. Turner reports having served as a medical reviewer for the Food and Drug Administration. No other potential conflict of other hand, the true magnitude of each drug’s interest relevant to this article was reported. superiority to placebo was less than a diligent lit- We thank Emily Kizer, Marcus Griffith, and Tammy Lewis for erature review would indicate. clerical assistance; David Wilson, Alex Sutton, Ohidul Siddiqui, and Benjamin Chan for statistical consultation; Linda Ganzini, We do not mean to imply that the primary Thomas B. Barrett, and Daniel Hilfet-Hilliker for their com- methods agreed on between sponsors and the FDA ments on an earlier version of this manuscript; Arifula Khan, are necessarily preferable to alternative methods. Kelly Schwartz, and David Antonuccio for providing access to FDA reviews; Thomas B. Barrett, Norwan Moaleji and Samantha Nevertheless, when multiple analyses are con- Ruimy for double data extraction and entry; and Andrew Hamil- ducted, the principle of prespecification controls ton for literature database searches. References 1. Hagdrup N, Falshaw M, Gray RW, clinical trials. Control Clin Trials 1987;8: Altman DG. Outcome reporting bias in Carter Y. All members of primary care 343-53. randomized trials funded by the Canadi- team are aware of importance of evidence 6. Simes RJ. Confronting publication bias: an Institutes of Health Research. CMAJ based medicine. BMJ 1998;317:282. a cohort design for meta-analysis. Stat Med 2004;171:735-40. 2. Craig JC, Irwig LM, Stockler MR. Evi- 1987;6:11-29. 11. Chan AW, Altman DG. Identifying dence-based medicine: useful tools for de- 7. Stern JM, Simes RJ. Publication bias: outcome reporting bias in randomised cision making. Med J Aust 2001;174:248- evidence of delayed publication in a co- trials on PubMed: review of publications 53. hort study of clinical research projects. and survey of authors. BMJ 2005;330:753. 3. Whittington CJ, Kendall T, Fonagy P, BMJ 1997;315:640-5. 12. Lau J, Ioannidis JP, Terrin N, Schmid Cottrell D, Cotgrove A, Boddington E. Se- 8. Chan AW, Hróbjartsson A, Haahr MT, CH, Olkin I. The case of the misleading lective serotonin reuptake inhibitors in Gøtzsche PC, Altman DG. Empirical evi- funnel plot. BMJ 2006;333:597-600. childhood depression: systematic review of dence for selective reporting of outcomes 13. Hayashino Y, Noguchi Y, Fukui T. Sys- published versus unpublished data. Lan- in randomized trials: comparison of pro- tematic evaluation and comparison of sta- cet 2004;363:1341-5. tocols to published articles. JAMA 2004; tistical tests for publication bias. J Epide- 4. Kyzas PA, Loizou KT, Ioannidis JP. Se- 291:2457-65. miol 2005;15:235-43. lective reporting biases in cancer prog- 9. Ioannidis JP. Effect of the statistical 14. Pham B, Platt R, McAuley L, Klassen nostic factor studies. J Natl Cancer Inst significance of results on the time to com- TP, Moher D. Is there a “best” way to de- 2005;97:1043-55. pletion and publication of randomized tect and minimize publication bias? An 5. Dickersin K, Chan S, Chalmers TC, efficacy trials. JAMA 1998;279:281-6. empirical evaluation. Eval Health Prof Sacks HS, Smith H Jr. Publication bias and 10. Chan AW, Krleza-Jerić K, Schmid I, 2001;24:109-25. n engl j med 358;3  www.nejm.org  january 17, 2008 259 The New England Journal of Medicine 51Downloaded from nejm.org at THE NATIONAL ACADEMIES on July 12, 2012. For personal use only. No other uses without permission. Copyright © 2008 Massachusetts Medical Society. All rights reserved.
    • Selective Publication of Antidepressant Trials 15. Sterne JA, Gavaghan D, Egger M. Pub- Agency (EMEA). Topic E9: statistical prin- 31. Nissen SE, Wolski K, Topol EJ. Effect lication and related bias in meta-analysis: ciples for clinical trials. Rockville, MD: of muraglitazar on death and major ad- power of statistical tests and prevalence Food and Drug Administration. (Accessed verse cardiovascular events in patients with in the literature. J Clin Epidemiol 2000; December 20, 2007, at http://www.fda. type 2 diabetes mellitus. JAMA 2005;294: 53:1119-29. gov/cder/guidance/iche3.pdf.) 2581-6. 16. Ioannidis JP, Trikalinos TA. The ap- 22. Temple R, Ellenberg SS. Placebo-con- 32. Sackner-Bernstein JD, Kowalski M, propriateness of asymmetry tests for pub- trolled trials and active-control trials in Fox M, Aaronson K. Short-term risk of lication bias in meta-analyses: a large sur- the evaluation of new treatments. Part 1: death after treatment with nesiritide for vey. CMAJ 2007;176:1091-6. ethical and scientific issues. Ann Intern decompensated heart failure: a pooled 17. Turner EHA. A taxpayer-funded clini- Med 2000;133:455-63. analysis of randomized controlled trials. cal trials registry and results database. 23. Hedges LV. Estimation of effect size JAMA 2005;293:1900-5. PLoS Med 2004;1(3):e60. from a series of independent experiments. 33. Revicki DA, Frank L. Pharmacoeco- 18. Center for Drug Evaluation and Re- Psychol Bull 1982;92:490-9. nomic evaluation in the real world: effec- search. Manual of policies and procedures: 24. Rosenthal R. Meta-analytic procedures tiveness versus efficacy studies. Pharma- clinical review template. Rockville, MD: for social research. Newbury Park, CA: coeconomics 1999;15:423-34. Food and Drug Administration, 2004. Sage, 1991. 34. Topol EJ. Failing the public health — (Accessed December 20, 2007, at http:// 25. Whitley E, Ball J. Statistics review 5: rofecoxib, Merck, and the FDA. N Engl J www.fda.gov/cder/mapp/6010.3.pdf.) comparison of means. Crit Care 2002;6: Med 2004;351:1707-9. 19. Committee on Government Reform, 424-8. 35. Melander H, Ahlqvist-Rastad J, Meijer U.S. House of Representatives, 109th 26. Hedges LV, Olkin I. Statistical meth- G, Beermann B. Evidence b(i)ased medi- Congress, 1st Session. A citizen’s guide ods for meta-analysis. New York: Academic cine — selective reporting from studies on using the Freedom of Information Act Press, 1985. sponsored by pharmaceutical industry: re- and the Privacy Act of 1974 to request gov- 27. DerSimonian R, Laird N. Meta-analy- view of studies in new drug applications. ernment records. Report no. 109-226. sis in clinical trials. Control Clin Trials BMJ 2003;326:1171-3. Washington, DC: Government Printing 1986;7:177-88. 36. Kerr NL. HARKing: hypothesizing af- Office, 2005. (Also available at: http:// 28. Stata statistical software, release 9. ter the results are known. Pers Soc Psychol www.fas.org/sgp/foia/citizen.pdf.) College Station, TX: StataCorp, 2005. Rev 1998;2:196-217. 20. Center for Drug Evaluation and Re- 29. Cohen J. Statistical power analysis for 37. International Conference on Har- search. Drugs@FDA: FDA approved drug the behavioral sciences. 2nd ed. New York: monisation — Efficacy. Rockville, MD: products. Rockville, MD: Food and Drug Lawrence Erlbaum Associates, 1988. Food and Drug Administration. (Accessed Administration. (Accessed December 20, 30. Nissen SE, Wolski K. Effect of rosigli- December 20, 2007, at http://www.fda. 2007, at http://www.accessdata.fda.gov/ tazone on the risk of myocardial infarc- gov/cder/guidance/#ICH_efficacy.) scripts/cder/drugsatfda.) tion and death from cardiovascular causes. Copyright © 2008 Massachusetts Medical Society. 21. International Conference on Har- N Engl J Med 2007;356:2457-71. [Erratum, monisation (ICH), European Medicines N Engl J Med 2007;357:100.] view current job postings at the nejm careercenter Visit our online CareerCenter for physicians at www.nejmjobs.org to see the expanded features and services available. Physicians can conduct a quick search of the public database by specialty and view hundreds of current openings that are updated daily online at the CareerCenter.260 n engl j med 358;3  www.nejm.org  january 17, 2008 The New England Journal of Medicine 52 Downloaded from nejm.org at THE NATIONAL ACADEMIES on July 12, 2012. For personal use only. No other uses without permission. Copyright © 2008 Massachusetts Medical Society. All rights reserved.
    • perspec tivesing programs in which scientists enter-ing as physicists and clinicians/biologists, The Biomarkers Consortium:respectively, become network biologistsand systems biologists. Looking forward, there are several Practice and Pitfalls ofrequirements to build a functional con-tributor network of biologists/clinicians Open-Source Precompetitiveevolving probabilistic causal models ofdisease. More robust coherent global Collaborationgenomic data sets need to be gener-ated—probably hundreds or thousands, JA Wagner1, M Prince2, EC Wright3, MM Ennis4, J Kochan5,as opposed to the current number of less DJR Nunez6, B Schneider7, M-D Wang2, Y Chen1, S Ghosh8,than a hundred in the entire world today. BJ Musser1 and MT Vassileva9Standards and annotations will need tobe established and used through a col-lective process that corrects itself in ways Precompetitive collaboration is a growing driver for innovation andthat are similar to the semiconductor increased productivity in biomedical science and drug development.industry data-sharing efforts.5 Methods The Biomarkers Consortium, a public–private platform forfor rewarding scientists for sharing their precompetitive collaboration specific to biomarkers, demonstrateddata and also ways to recognize and citemodels will be essential. Privacy and that adiponectin has potential utility as a predictor of metabolicintellectual-property issues will need to responses to peroxisome proliferator–activated receptor (PPAR)be addressed in an iterative process. agonists in individuals with type 2 diabetes. Despite the challenges If these challenges were to be tackled overcome by this project, the most important lesson learned is thatby a government or a single industrial oracademic champion, they would prob- cross-company precompetitive collaboration is a feasible robustably fail. If, instead, scientists, patient approach to biomarker qualification.advocates, and industrial and governmentrepresentatives can align and ensure thatthese tasks are attacked by a distributed Biomedical research and drug devel- tage industry into a business model, withnetwork of scientists, software developers, opment remain uncertain, expensive, products such as Linux becoming main-modelers, and privacy and intellectual- time-consuming, and inefficient enter- stream alternatives to industry leaders.property experts in a contributor network prises.1 According to some estimates, it Biology has evolved into an informa-similar to that which drives other social costs more than $1.2 billion to develop tion-rich science, and the analogy withnetwork–driven efforts such as Twitter, a new drug.2 In fact, over the past dec- approaches to software could provide athere is a chance that we can build the ade drug development costs have sky- potential answer to productivity issuesprobabilistic causal models so needed to rocketed despite falling productivity for biomedical research and drug devel-develop biomarkers and new therapies evidenced by increasing drug devel- opment. Recent advances, includingmore effectively. opment timelines and decreasing new the implementation of network models molecular entity approvals. Increased and large-scale consortia, highlight theCONFLICT OF INTERESTThe author declared no conflict of interest. precompetitive collaboration is a poten- increasing role that precompetitive col- tial driver for innovation and increased laboration is assuming within biomedi-© 2010 ASCPT productivity.3 Precompetitive collabo- cal research and drug development. For ration, defined as competitors sharing example, Sage is an open-access net-1. Mathieu, R.G. IT systems perspective: top-down early stages of research that benefit all,3 work model platform that was recently approach to computing. IEEE Computer 35, 138–139 (2002). is a potential driver for innovation and launched with a vision to create open-2. Schadt, E.E., Friend, S.H. & Shaywitz, D.A. A increased productivity.4 Open-source access, integrative bionetworks, evolved network view of disease and compound software research has grown from a cot- from contributor scientists, to accelerate screening. Nat. Rev. Drug Discov. 8, 286–295 (2009). 1Merck, Rahway, New Jersey, USA; 2Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana, USA;3. Waters, K.M., Singhal, M., Webb-Robertson, 3National Institute of Diabetes and Digestive and Kidney Diseases/National Institutes of Health, Bethesda, B.J.M., Stephan, E.G., & Gephart, J.M. Breaking Maryland, USA; 4Quintiles, Inc., Austin, Texas, USA; 5Hoffmann–La Roche, Nutley, New Jersey, USA; 6Metabolic the high-throughput bottleneck: new tools Pathways Center of Excellence for Drug Discovery, GlaxoSmithKline, Research Triangle Park, North Carolina, help biologists integrate complex datasets. Sci. USA; 7Center for Biologics Evaluation and Research, US Food and Drug Administration, Rockville, Maryland, Comput. 23, 22–26 (2006). USA ; 8 Biomedical Biotechnology Research Institute, North Carolina Central University, Durham, North4. Winning through cooperation: An interview Carolina, USA; 9Foundation for the National Institutes of Health, Bethesda, Maryland, USA. Correspondence: with William Spencer. Technol. Rev. 100, M Vassileva (mvassileva@fnih.org) 22–27 <http://www.technologyreview.com/ computing/11501/> (1997). doi:10.1038/clpt.2009.227Clinical pharmacology & Therapeutics | VOLUME 87 NUMBER 5 | MAY 2010 53 539
    • perspec tivesresearch on human disease.5 Another April June Aug. Oct. Dec. Feb. April June Aug. Oct. Dec. Feb.important large-scale precompetitivecollaboration is the Biomarkers Consor- Project concept submitted to BCtium, a public–private partnership man-aged by the Foundation for the National Concept Project Clinical trials Data-sharing/ with relevant data confidentialityInstitutes of Health (FNIH). The goals approval team identified by four agreements Project launch by MDSC formedof the consortium are to promote the companies developed and executeddevelopment, qualification, and regula- Phase I Project and II datatory acceptance of biomarkers; to con- plan MDSC EC analysis developedduct research in precompetitive areas by by team review reviewteams that share vested interests in suc- Phase III Analysis of data all availablecess; and to speed the development of analysis datamedicines and therapies for detection, Manuscriptprevention, diagnosis, and treatment and presentationof disease through making consortium preparationproject results broadly available to thepublic so as to improve patient care.6 Development of the most useful 2007 2008 2009biomarkers is generally resource inten- Figure 1 The development and execution timeline of the adiponectin project. It was submitted to thesive; it took years to achieve the consen- Biomarkers Consortium in the spring of 2007; the data analysis was completed by January 2009; thesus status now enjoyed in research and results were published in the summer of 2009. BC, Biomarkers Consortium; EC, Executive Committe;medical practice for biomarkers such MDSC, Metabolic Disorders Steering Committee.as HbA1C. 7 One of the most difficulttasks facing biomarker assessment andevaluation has always been harmoniz- oxisome proliferator–activated receptor tic areas, as well as in Immunity anding the approaches of various stake- (PPAR) agonist treatment, which corre- Inflammation. At the same time, similarholders from government, industry, lated with decreases in glucose, glycated public–private partnerships in the areanonprofit organizations, providers, and hemoglobin (HbA1c), hematocrit, and of biomarkers, such as the Innovativeacademic institutions. There was a criti- triglycerides, and increases in blood urea Medicines Initiative and the Critical Pathcal need for a coordinated cross-sector nitrogen, creatinine, and high-density Institute, are starting to develop projectspartnership effort when the Biomarkers lipoprotein cholesterol concentrations. aimed at addressing bottlenecks in drugConsortium was officially launched in Early (6–8 weeks) increases in adiponec- development.10 Therefore, it becomesOctober 2006. tin concentrations after treatment with important to summarize the lessons In April 2007, three of the four ther- PPAR agonists negatively correlated with learned from the first completed projectapeutic-area Biomarkers Consortium subsequent changes in HbA1c. Logistic of the Biomarkers Consortium, espe-Steering Committees—Cancer, Neu- regression demonstrated that an increase cially because it managed to facilitate theroscience, and Metabolic Disorders— in adiponectin was correlated with a precompetitive sharing of preexistingbecame operational. One of the first decrease in HbA1c. These analyses con- data from four pharmaceutical compa-projects proposed to the consortium firmed previous relationships between nies. A summary of the lessons learned isin the field of metabolic disorders was adiponectin levels and metabolic param- displayed in Table 1.to evaluate the utility of adiponectin as eters, and supported the potential utility For the early days of the consortium,a marker of glycemic efficacy through of adiponectin across the spectrum of the trust among the decision makers whopooling existing data from clinical tri- glucose tolerance. Despite multiple chal- were willing to try this public-privateals conducted by multiple sponsors. lenges, this data-sharing project was com- partnership approach was invaluable.As shown in Figure 1, this project was pleted and answered many questions that Robust advice from the US Food andcompleted about two years later as the would otherwise be impossible to resolve Drug Administration (FDA) regard-results became available after having using the data sets of each individual ing the biomarker qualification processbeen approved for presentation at the company alone. remains a crucial incentive for sustainingAnnual Scientific Sessions of the Ameri- The adiponectin project represents the the interest and engagement of pharma-can Diabetes Association meeting8 and first completed project of the Biomark- ceutical companies in biomedical public–for subsequent publication.9 In short, ers Consortium. There are now six other private partnerships.this project analyzed blinded data on ongoing projects within the consortium: Before the consortium’s launch, the2,688 type 2 diabetes patients from ran- two in the area of cancer, two more in founding members from Pharmaceu-domized clinical trials conducted by four metabolic disorders, and two in neuro- tical Research and Manufacturers ofpharmaceutical companies. Adiponec- science. A few new projects are about America, the FDA, the NIH, and thetin concentrations increased after per- to be launched in these three therapeu- FNIH had multiple discussions not only540 54 VOLUME 87 NUMBER 5 | MAY 2010 | www.nature.com/cpt
    • perspec tivesTable 1 Lessons learned in development of the Biomarkers ConsortiumIssue Lesson MitigationFocus, organization, Although ultimately successful, the overall project was lengthy Robust project management with accountable leaders shouldand pace be instituted in all projects A clear and concise project plan with precise objectives is mandatory Setting and adhering to deadlines is important for achieving timely results Simultaneous work streams, where possible, will accelerate the project Creating a sense of urgency drives successOptimal collaboration A lack of collaboration tools hampered the project The Biomarkers Consortium created a collaboration Web portal Regular meetings and face-to-face meetings are important to uniformly institute optimal collaborationData-sharing principles A uniform data-sharing plan that was legally appropriate for all A single accountable legal liaison is crucialand standards parties was difficult to negotiate Adequate time and resources must be budgeted to data-sharing Standard definitions across clinic protocols were not always principles and standard definitions obvious and clearly important Much effort had to be put into blinding the data to ensure that It is very time-consuming to compile, clean, and blind the data, no company could identify which data sets contributed to any because every company performs clinical trials using different given result standard operating procedures In accordance with the Biomarkers Consortium’s general Limited institutional memory often remains after a given trial is intellectual property and data-sharing principles, preexisting completed at individual companies data from contributing companies were aggregated and Blinding the aggregated data so that no representative of the analyzed by independent third parties, and the results of that project team could identify individual sponsor data was possible process were shared with project team members but difficult The template for the Biomarkers Consortium data-sharing plan and confidentiality is now availableLimitations of existing The retrospective data set lacked time points earlier than 6 Acknowledge limitationsdata weeks of dosing, which limited the ability to make conclusions Carry out prospective follow-up when necessary related to the prognostic value of the biomarker Blinded aggregated data are inherently limited, including in this case difficulties with specifying dose response Biomarker assays differed among company databasesStatistical analyses The opportunity to utilize the expertise of two statisticians at For data-sharing projects, it is now clear that having a defined the same time for quality control purposes was very important budget, including funds specifically designated for the statistical for the credibility of the results analysis in the project to be performed by an independent third That the baseline analysis confirmed earlier findings, therefore party, is essential confirming the utility of the compiled data set, was also critical The statistical plan had to be revised a few times, and some The analysis plan evolved because the team could not fully interesting findings that were not expected arose at the end of understand all the questions possible to ask without knowing the analysis what data would be available after the compilationBiomarker qualification There was a lack of agreement on biomarker qualification at the Involvement of an FDA representative is important conclusion of the project Potential use of the FDA qualification pathway should be consideredData dissemination Is piecemeal or full-story distribution more appropriate? Address the data dissemination issues proactively with the Is a broad or narrow therapeutic area focus more appropriate? project teamaround the policies that should govern In a collaborative environment, iden- writing the project plan, reporting thethis novel partnership (such as intellec- tifying a principal investigator (a strong results in written form, and presentingtual property, confidentiality, and conflict project team leader) to spearhead the on the team’s behalf to outside audiencesof interest) but also around which spe- effort is one of the first steps toward a is also essential for operational success.cific questions in the therapeutic areas successful outcome for the project. Desig- Because the adiponectin project wasof focus might be of interest to all stake- nating a scientific program manager from going to involve sharing of preexistingholders in the consortium. The impor- the entity that acts as a neutral convener data, which required a complex data-tance of pre-project consensus building to assist the team through the develop- sharing plan and agreements to be devel-can never be overstated. It is the best way ment, internal approval, and execution of oped and executed, it became clear thatto identify what questions might benefit the project; to make sure that deadlines having each company designate the samemost by a collaborative approach while are met; and to keep the work of the team official legal liaison for all interactionsat the same time satisfying a significant cohesive by contributing project manage- with the consortium about data-sharingunmet medical need. ment skills and assisting the chair with issues, other contracts, and additionalClinical pharmacology & Therapeutics | VOLUME 87 NUMBER 5 | MAY 2010 55 541
    • perspec tiveslegal matters, could speed up the internal important conclusions of the project. oration will continue to drive innovativeapproval process, ensuring institutional Practical aspects of collaboration are open-source approaches to biomedicalmemory and continuing understanding. also important. Utilizing collaborative research and development.The lessons learned during the process resources, such as a website where team ACKNOWLEDGMENTSof developing the legal agreements for members can share and work on docu- The helpful input of Myrlene Staten and thethe adiponectin project became useful ments together as well as receive feedback Biomarkers Consortium staff is gratefullyfor the creation of templates for similar from other consortium members, can acknowledged.data-sharing projects developed later by speed the development and execution The findings and conclusions in this articlethe consortium. phase of any project through facilitat- have not been formally disseminated by the US Food and Drug Administration and should For data-sharing projects, it is now ing the collaboration of teams and mini- not be construed to represent any agencyclear that having a defined budget, mizing e-mail exchanges. Planning for determination or policy.including funds specifically designated at least two face-to-face meetings—onefor the statistical analysis to be per- at the inception of the project and one CONFLICT OF INTEREST J.A.W., Y.C., and B.J.M. are employees of Merckformed by an independent third party, is before the dissemination of the results— and may own stock and/or hold stock optionsessential. The opportunity to utilize the could also facilitate team collaboration in the company. M.P. and M.-D.W. are employeesexpertise of two statisticians at the same and ensure reaching consensus in a more of Eli Lilly and may own stock and/or hold stocktime for quality control purposes was efficient manner. Finding a designated options in the company. J.K. is an employeevery important for the credibility of the time every month that works for all team of Hoffmann–La Roche. D.J.R.N. and S.G. are employees of GlaxoSmithKline and may ownresults from the adiponectin project. It members and ensures their availability stock and/or hold stock options in the company.was also critical that the baseline analy- for recurring teleconferences, so they E.C.W., M.M.E., B.S., and M.T.V. declared no conflictsis reproduced earlier findings, therefore can contribute to the project develop- of interest.confirming the utility of the compiled ment and discussion process throughoutdata set. the year, is an essential part of maintain- © 2010 ASCPT Data sharing was a major focus of this ing strong team relations. 1. Kola, I. The state of innovation in drugproject. Key challenges associated with Putting the execution of processes in development. Clin. Pharmacol Ther. 83, 227–230the data-sharing process and the statisti- parallel always speeds up the timeline for (2008).cal analyses are summarized in Table 1. the project. Early in the planning phase 2. Kaitin, K.I. Obstacles and opportunities in new drug development. Clin. Pharmacol Ther. 83,The data specification table developed by the team members should decide whether 210–212 (2008).the project team for the project was very they would like to go for a wider dissemi- 3. Weber, S. The Success of Open Source (Harvarduseful but had its limitations, so the two nation of the results or instead would pre- University Press, Cambridge, MA, 2004). 4. Melese, T., Lin, S.M., Chang, J.L. & Cohen, N.H.statisticians needed to do considerable fer to distribute the results of their project Open innovation networks between academiawork in following up with each individ- only within the specific therapeutic area and industry: an imperative for breakthroughual company to clarify the data before without advertising the broader impli- therapies. Nat. Med. 15, 502–507 (2009). 5. Friend, S. & Schadt, E. Something wiki this waybeing able to compile it. The analysis cations of the project or the findings. comes [interview by Bryn Nelson]. Nature 458, 13plan evolved because the team could not Preparation of the results for dissemina- (2009).fully understand all the possible ques- tion requires significant effort and time 6. Altar, C.A. The Biomarkers Consortium: on the critical path of drug discovery. Clin. Pharmacoltions to ask without knowing what data on behalf of the project team leader, the Ther. 83, 361–364 (2008).would be available after the compila- scientific program manager, the statisti- 7. Wagner, J.A., Williams, S.A. & Webster, C.J.tion. The plan had to be revised several cians, and the entire project team. Biomarkers and surrogate end points for fit-for- purpose development and regulatory evaluationtimes. Some interesting but unexpected The most important lesson learned of new drugs. Clin. Pharmacol Ther. 81, 104–107findings came at the end of the analysis, from the first completed project of the (2007).although they were not necessarily an Biomarkers Consortium is that cross- 8. Wagner, J.A. et al. Utility of adiponectin as a biomarker predictive of glycemic efficacy isimportant part of the original questions company precompetitive collaboration demonstrated by collaborative pooling ofasked. For example, the potential utility is a feasible and powerful approach to data from clinical trials conducted by multipleof adiponectin across the spectrum of biomarker qualification. This experience sponsors. Clin. Pharmacol Ther. 86, 619–625 (2009).glucose tolerance, which was demon- will enable the consortium and other 9. American Diabetes Association. Abstracts of thestrated through the phase III analysis, biomedical research collaborations to 69th Scientific Session, New Orleans, LA, 5–9 Junewas not a finding that addressed a ques- approach future data-sharing exercises 2009 10. Hughes, B. Novel consortium to address shortfalltion that was originally in the analytical more effectively and efficiently. The wid- in innovative medicines for psychiatric disorders.plan, but it is nevertheless among the ening spectrum of precompetitive collab- Nat. Rev. Drug Discovery 8, 523–524 (2009).542 56 VOLUME 87 NUMBER 5 | MAY 2010 | www.nature.com/cpt
    • Draft as of September 25, 2012 Clinical Research Data Sharing ProjectsThis is an initial list of data sharing projects compiled by IOM staff. It is not an exhaustive list,and inclusion does not denote endorsement. The list includes projects that share different typesof data and health information (i.e., genetic information, observational data, etc.) as thepublicworkshop would like to consider all data sharing initiatives that have best practices/lessonslearned to be applied to initiatives focused on sharing data from preplanned interventionalstudies of human subjects. As indicated by the source line for each project, descriptions arebased largely on online resources or individual presentations. Alzheimer’s Disease Neuroimaging Initiative (ADNI)The Alzheimers Disease Neuroimaging Initiative is designed to validate the use of biomarkersincluding blood tests, tests of cerebrospinal fluid, and MRI/ PET imaging for Alzheimers disease(AD) clinical trials and diagnosis. An essential feature of the ADNI is that the clinical,neuropsychological, imaging, and biological data and biological samples will be made availableto all qualified scientific investigators as a public domain research resource.Among many goals, ADNI is aiming to define the rate of progress of mild cognitive impairment(MCI) and AD, to develop improved methods for clinical trials, and to provide a large databasewhich will improve design of clinical treatment trials. It is anticipated that ADNI will continue inthe tradition of providing information and methods that ultimately lead to effective treatment andprevention of AD.With better knowledge of the earliest stages of the disease, researchers may be able to testpotential therapies at earlier stages, when they may have the greatest promise for slowing downprogression of this devastating disease. The ADNI study is now in its third phase (ADNI, ADNIGO and ADNI 2). ADNI 2 is studying the rate of change of cognition, function, brain structure,function and biomarkers in 200 elderly controls, 400 subjects with mild cognitive impairment,and 200 with AD.ADNI continues an unprecedented policy of open access to ADNI clinical and imaging datathrough this website and a parallel website at the Laboratory of Neuroimaging. All data is madepublic, without any embargo.Source: http://adni-info.org Analgesic Clinical Trials Innovation, Opportunities and Networks (ACTION) InitiativeThe ACTION Initiative aims to streamline the discovery and development process for newanalgesic drug products for the benefit of the public health. This multi-year, multi-phasedInitiative will work to: 57
    • Draft as of September 25, 2012- Establish a scientific and administrative infrastructure to support a series of projects;-Establish relationships with key expert stakeholders, industry, professional organizations,academia and government agencies;-Coordinate scientific workshops with key experts in the field of anesthesia and analgesia;-Conduct in-depth and wide-ranging data analyses of analgesic clinical trial data to determine theeffects of specific research designs and analysis methods.Source:http://www.fda.gov/AboutFDA/PartnershipsCollaborations/PublicPrivatePartnershipProgram/ucm231130.htm Arch2POCM [Academics, regulators, citizens, health industry, through to Proof Of Clinical Mechanism (Phase IIa)]Arch2POCM is a new public private partnership (PPP) whose mission is to help invigorate andaccelerate the pharmaceutical industry’s development of new and effective medicines byconducting discovery through Phase II clinical trials on completely novel targets.Arch2POCM will focus on novel targets that are judged as too risky to be pursued in industry,but that have significant scientific potential (e.g., diseases of cancer, immunology, autism andschizophrenia). Safe drug-like compounds that interact with the selected targets will bedeveloped all the way through Phase IIb clinical trials in order to determine if modulation ofArch2POCM’s targets beneficially impacts disease in patients.Arch2POCM will produce information that empowers drug development. To do this,Arch2POCM will operate without filing any patents and will share all of its data and drug-likecompounds. The result will be a common stream of knowledge about the roles a selected targetplays in human biology and disease.Source: http://sagebase.org/WP/arch Biogrid AustraliaBioGrid Australia Limited is a secure research platform and infrastructure that provides access toreal-time clinical, imaging and biospecimen data across jurisdictions, institutions and diseases.The web-based platform provides ethical access while protecting both privacy and intellectualproperty.BioGrid Australia and the Australian Cancer Grid (ACG) allows life science researchers to linkdata across all states, regardless of their existing linkage and research platforms. 58
    • Draft as of September 25, 2012The BioGrid Australia/ACG is a virtual repository and will link to all participating teachinghospitals and cancer research centres in Australia as well as the integrated cancer centres inVictoria. The platform maximizes life science research and provides researchers with access tomultiple data sets, to data on clinical outcomes, to quality and audit data as well as genomic data,images and analytical tools.Source: http://www.biogrid.org.au/wps/portal Biomarkers Consortium Project on AdiponectinThe consortium is helping to create a new era of personalized medicine, with more highlypredictive markers that have an impact during a patient’s illness or lifespan. Their goal is tocombine the forces of the public and private sectors to accelerate the development of biomarker-based technologies, medicines, and therapies for the prevention, early detection, diagnosis, andtreatment of disease.The use of adiponectin, a hormone derived from fat cells, which is abundant in plasma and easyto measure through commercially available kits, was confirmed as a robust biomarker predictiveof glycemic efficacy in Type 2 diabetes and healthy subjects, after treatment with peroxisomeproliferator-activated receptor-agonists (PPAR), but not after treatment with non-PPAR drugssuch as metformin. The findings were the result of the first project to be completed by theBiomarkers Consortium, a public-private partnership managed by the Foundation for theNational Institutes of Health. The project conducted a statistical analysis of pooled and blindedpre-existing data from Phase II clinical trials contributed by four pharmaceutical companies andanalyzed under the direction of a diverse team of scientists from industry, the National Institutesof Health (NIH), U.S. Food & Drug Administration (FDA), and academic research institutions.Source:http://www.biomarkersconsortium.org/press_release_adiponectin_predictive_biomarker.php Biosense (CDC)BioSense is a program of the Centers for Disease Control and Prevention (CDC) that trackshealth problems as they evolve and provides public health officials with the data, informationand tools they need to better prepare for and coordinate responses to safeguard and improve thehealth of the American people. Mandated in the Public Health Security and BioterrorismPreparedness and Response Act of 2002, the CDC BioSense Program was launched in 2003 toestablish an integrated national public health surveillance system for early detection and rapidassessment of potential bioterrorism-related illness.BioSense 2.0 protects the health of the American people by providing timely insight into thehealth of communities, regions, and the nation by offering a variety of features to improve datacollection, standardization, storage, analysis, and collaboration. Using the latest technology,BioSense 2.0 integrates current health data shared by health departments from a variety ofsources to provide insight on the health of communities and the country. By getting more 59
    • Draft as of September 25, 2012information faster, local, state, and federal public health partners can detect and respond to moreoutbreaks and health events more quickly. BioSense 2.0 is community controlled and userdriven.Source: http://www.cdc.gov/biosense/ caBIG (Cancer Biomedical Informatics Grid)The mission of caBIG is to develop a collaborative information network that accelerates thediscovery of new approaches for the detection, diagnosis, treatment, and prevention of cancer.caBIG is sponsored by the National Cancer Institute (NCI) and administered by the NationalCancer Institute Center for Bioinformatics and Information Technology (NCI-CBIIT). Anyonecan participate in caBIG and there is no cost to join. The community includes Cancer Centers,other NCI-supported research endeavors, and a variety of federal, academic, not-for profit andindustry organizations. caBIG seeks to connect scientists and practitioners through a shareableand interoperable infrastructure; develop standard rules and a common language to more easilyshare information; and build or adapt tools for collecting, analyzing, integrating, anddisseminating information associated with cancer research and care.Since its start, caBIG has committed to the following principles: • Federated: caBIG software and resources are widely distributed, interlinked, and available to everyone in the cancer research community, but institutions maintain local control over their own resources and data. • Open-development: caBIG tools and infrastructure are being developed through an open, participatory process. caBIG leverages existing resources whenever possible, rather than building new tools in every case. • Open-access: caBIG resources are freely obtainable by the cancer community to ensure broad data-sharing and collaboration. • Open-source: The caBIG source code is available to view, alter, and redistribute.The caBIG program facilitates the integration of diverse research and data across the "bench tobedside continuum" through an interoperable infrastructure and collaborative leadership.Appropriate data sharing is a key element of this collaboration. Benefits of data sharing include: • The large volumes of research data created by the high throughput genomics and proteomics technologies can best be harvested by teams of individuals, rather than by single PIs. • Collaboration across and within disciplines is required to leverage the broad knowledge and skill bases needed to realize the scientific and public health benefits of translational and personalized medicine. • Data sharing raises the visibility of individual studies and data collections; it opens new avenues of data dissemination and validation, and points to opportunities for future collaboration. • Grants from NIH exceeding $500,000 require a plan for data sharing. 60
    • Draft as of September 25, 2012 CAMD (Coalition Against Major Diseases) and C-Path Alzheimer’s DatabaseA new database of more than 4,000 Alzheimer’s disease patients who have participated in 11industry-sponsored clinical trials was released in June 2010 by CAMD. This is the first databaseof combined clinical trials to be openly shared by pharmaceutical companies and made availableto qualified researchers around the world. It is also the first effort of its kind to create a voluntaryindustry data standard that will help accelerate new treatment research on brain disease, aspatients with other related brain diseases are expected to be added. The level of detail and scopeof this database will enable researchers to more accurately predict the true course ofAlzheimer’s, Parkinson’s, Huntington’s, and other neuro-degenerative diseases, thereby enablingthe design of more efficient clinical trials. Patient identifiers will not be included in the database,thereby ensuring patient privacy.Version 1.0 of an Alzheimer’s data standard, used to develop the C-Path CAMD groundbreakingAlzheimer’s Disease data repository, was developed through the CDISC process and posted tothe CDISC website in October 2011. Since this first major milestone, CDISC and C-Path andsuch partners as the Bill and Melinda Gates Foundation, the Michael J. Fox Foundation, the PKDFoundation, Tufts University, Rochester University, the National Institutes of Health (NIH) andothers have continued to make advancements in a number of therapeutic areas. Tuberculosis,Pain, Parkinson’s Disease, Polycystic Kidney Disease, Virology, Oncology, and CardiovascularDisease are therapeutic areas for which standards are currently in development. As of June 2012,C-Path and the Clinical Data Interchange Standards Consortium (CDISC) announced the signingof a partnership agreement to establish the Coalition For Accelerating Standards and Therapies(CFAST) an initiative to accelerate clinical research and medical product development bycreating and maintaining data standards, tools and methods for conducting research intherapeutic areas that are important to public health.CAMD is a formal consortium of pharmaceutical companies, research foundations and patientadvocacy/voluntary health associations, with advisors from government research and regulatoryagencies including the U.S. Food and Drug Administration (FDA), the European MedicinesAgency (EMA), the National Institute of Neurological Disorders and Stroke (NINDS), and theNational Institute on Aging (NIA). CAMD is led and managed by the non-profit Critical PathInstitute (C-Path), which is funded by a cooperative agreement with the FDA and a matchinggrant from Science Foundation Arizona.Source: http://www.c-path.org/News/CDISCTAStds%20PR-24June2012.pdf CDC’s Chronic Fatigue Syndrome Wichita Clinical StudyThe data sets were collected during a 2-day in-hospital clinical assessment study conducted fromDecember 2002 to July 2003 in Wichita, KS, USA. The study enrolled 227 people and classifiedthem into study groups. Assessment included clinical evaluation of each subjects medical andpsychiatric status, sleep characteristics and cognitive functioning. Laboratory testing evaluatedneuroendocrine status, autonomic nervous system function, systemic cytokine profiles, 61
    • Draft as of September 25, 2012peripheral blood gene expression patterns and polymorphisms in genes involved inneurotransmission and immune regulation.CDC approved the sharing of the chronic fatigue syndrome Wichita Clinical dataset with 25intramural and extramural investigators in the CFS Computational Challenge and with CAMDA(Critical Assessment of Microarray Data Analysis) in 2006 and 2007. Over the past 3 years,sharing the Wichita Clinical dataset and partnering with expert extramural scientists resulted inmore than 20 publications and there are more manuscripts delving into this dataset in the works.CDC’s CFS Research Program has published about 15 papers based on the Wichita Clinicalstudy in the past 3 years. In addition to yielding “more value for the money.” this data-sharingeffort generated new perspectives on CFS and confirmed previous observations aboutneuroendocrine and immune dysfunction in CFS.Sources: http://www.cdc.gov/rdc/B1DataType/Dt132.htm andhttp://www.cfids.org/advocacy/testimony-vernon-oct2008.pdf Cancer CommonsModern molecular biology supports the hypothesis that cancer is actually hundreds or thousandsof rare diseases, and that every patient’s tumor is, to some extent, unique. Although there is arapidly growing arsenal of targeted cancer therapies that can be highly effective in specificsubpopulations, especially when used in rational combinations to block complementarypathways, the pharmaceutical industry continues to rely on large-scale randomized clinical trialsthat test drugs individually in heterogeneous populations. Such trials can be an extremelyinefficient strategy for searching the combinational treatment space, and capture only a smallportion of the data needed to predict individual treatment responses. On the other hand, anestimated 70% of all cancer drugs are used off-label in cocktails based on each individualphysician’s experience, as if the nations 30,000 oncologists are engaged in a giganticuncontrolled and unobserved experiment, involving hundreds of thousands of patients sufferingfrom an undetermined number of diseases. These informal experiments could provide the basisfor what amounts to a giant adaptive search for better treatments, if only the genomic andoutcomes data could be captured and analyzed, and the findings integrated and disseminated.Toward this end, Cancer Commons was developed, a family of web-based open-sciencecommunities in which physicians, patients, and scientists collaborate on models of cancersubtypes to more accurately predict individual responses to therapy. The goals are to: 1) giveeach patient the best possible outcome by individualizing their treatment based on their tumor’sgenomic subtype; 2) learn as much as possible from each patient’s response, and 3) rapidlydisseminate what is learned. The key innovation is to run this adaptive search strategy in"realtime", continuously updating disease and treatment models so that the knowledge learnedfrom one patient is disseminated in time to help the next.In this way, Cancer Commons is attempting a new patient-centric paradigm for translationalmedicine, in which every patient receives personalized therapy based upon the best availablescience, and researchers continuously test and refine their models of cancer biology andtherapeutics based on the resulting clinical responses. 62
    • Draft as of September 25, 2012How It WorksModel: Editorial boards comprised of leading physicians and scientists in each cancer curatemolecular disease models (MDMs) that identify the most relevant tests, treatments and trials foreach molecular subtype of that cancer, based on the best current knowledge.Test and Treat: Patients and Physicians access the MDM through web based applications thattransform its knowledge into personalized actionable information that inform testing andtreatment decisions.Report and Analyze: Clinicians and researchers report clinical observations and outcomes, andcollaboratively analyze them to validate or refute the MDM’s knowledge and hypotheses.Learn: The editorial boards review these results and update the MDMs in real time.Source: http://cancercommons.org/wp-content/themes/cancer_commons/docs/cancer_commons_whitepaper.pdf andhttp://www.cancercommons.org/about/ ClinicalTrials.govClinicalTrials.gov offers information for locating federally and privately supported clinical trialsfor a wide range of diseases and conditions. A clinical trial (also clinical research) is a researchstudy in human volunteers to answer specific health questions. Interventional trials determinewhether experimental treatments or new ways of using known therapies are safe and effectiveunder controlled environments. Observational trials address health issues in large groups ofpeople or populations in natural settings.ClinicalTrials.gov currently contains 128,589 trials sponsored by the National Institutes ofHealth, other federal agencies, and private industry. Studies listed in the database are conductedin all 50 States and in 180 countries. ClinicalTrials.gov receives over 50 million page views permonth 65,000 visitors daily. NIH, through its National Library of Medicine (NLM), hasdeveloped this site in collaboration with the FDA as a result of the FDA Modernization Act,which was passed into law in November 1997.What information is in ClinicalTrials.gov?Summaries of Clinical Study Protocols that include the following information: • summary of the purpose of the study • recruiting status • disease or condition and medical product under study • research study design • phase of the trial • criteria for participation 63
    • Draft as of September 25, 2012 • location of the trial and contact informationSummaries of Clinical Study Results for some studies that include the following information: • description of study participants (e.g., number enrolled, demographic data) • overall outcomes of the study • summary of adverse events experienced by participantsLinks to Online Health Resources that help place clinical studies in the context of currentmedical knowledge, including MedlinePlus and PubMed.Source: http://www.nlm.nih.gov/pubs/factsheets/clintrial.html Clinical Trial Comparator Arm Partnership (CTCAP)This Pre-Competitive Disease Biology program has grown out of a transformative partnershipbetween two synergistic nonprofits, Sage Bionetworks and Genetic Alliance and many of theworld’s largest healthcare organizations. The program already has unprecedented commitmentsfrom leading pharmaceutical and healthcare corporations to contribute existing data from non-commercially-sensitive placebo and comparator clinical trial arms. These pre-competitive resultswill be made available to academic as well as industry researchers to enable the development ofcomputational models that more accurately define the molecular basis of disease and can betterguide the development of new therapies, better biomarkers and improved standards of clinicalcare.The core activities of the project include:1) Collecting, curating and hosting clinical trial datasets from the pharmaceutical industry at theSage Commons2) Developing and hosting network models of disease3) Establishing a framework for release into the public domain.Source: http://sagebase.org/partners/CTCAP.php DataSphere Project (CEO Roundtable on Cancer)DataSphere is a project to share oncology clinical trial sets. The project will include comparatorarm oncology datasets: protocols, case report forms, data descriptors. Datasets will be publiclyavailable via a file-sharing website (i.e., a secure database so that qualified researchers canaccess data). The rationale for the project is to increase transparency and improve data standards;improve research; and honor patient volunteers. Sanofi will donate two recent Phase III oncologydatasets by October 2012. Further dataset contributions expected from other members of the Life 64
    • Draft as of September 25, 2012Sciences Consortium of the CEO Roundtable on Cancer, as well as other pharmaceuticalcompanies, and academics, by the end of 2012. Ultimate goal is 30 datasets by the end of Q2,2013.Source: Presentation by Charles Hugh-Jones, M.D., MRCP, Sanofi, on behalf of DataSphereproject team to IOM staff (May 10, 2012) The database of Genotypes and Phenotypes (dbGaP)The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute theresults of studies that have investigated the interaction of genotype and phenotype. Such studiesinclude genome-wide association studies, medical sequencing, molecular diagnostic assays, aswell as association between genotype and non-clinical traits.dbGaP provides two levels of access - open and controlled - in order to allow broad release ofnon-sensitive data, while providing oversight and investigator accountability for sensitive datasets involving personal health information. Summaries of studies and the contents of measuredvariables as well as original study document text are generally available to the public, whileaccess to individual-level data including phenotypic data tables and genotypes require varyinglevels of authorization.The data in dbGaP will be pre-competitive, and will not be protected by intellectual propertypatents. Investigators who agree to the terms of dbGaP data use may not restrict otherinvestigators use of primary dbGaP data by filing intellectual property patents on it. However,the use of primary data from dbGaP to develop commercial products and tests to meet publichealth needs is encouraged.Source: http://www.ncbi.nlm.nih.gov/gap The “ePlacebo” DatabaseRecruiting patients for clinical trials is increasingly difficult, not least because of the logisticaland oftentimes ethical issues involved in randomizing patients to control or placebo arms.Creating a large-scale, pre-competitive, historical control (“ePlacebo”) database of patients fromthe placebo/control arms of multiple trials would provide a research utility with many potentialuses, including improved trial design and better context for baseline adverse event levels. Asopposed to the series of well-populated but disconnected databases for individual disease areas, alarge-scale, integrated resource would be of value across diseases. A more general databasewould permit aggregation across disease areas, and would leave investigators the option toanalyze data according to the needs of their particular research question.Pfizer has completed a feasibility assessment of pooling placebo data across multiple clinicaltrials. We created our internal “ePlacebo” test data mart with de-identified data from over 1,000 65
    • Draft as of September 25, 2012trials comprising over 20,000 patients, and then created a segment of approximately 3500patients from the Pain therapeutic area. We are currently performing internal proof of conceptstudies on this data in a test environment.Ideally, the ePlacebo database would contain pooled, de-identified data from several differentsources and would serve as a research utility for all contributors. Building a large-scaleplacebo/controls database would require the interest and participation of several companies, aframework to ensure data is being used appropriately with appropriate informed consent, thetechnical infrastructure for hosting the data, the accompanying trial metadata to bettercharacterize patients, the statistical and technical methods for pooling data, a steering committeeto manage and validate the use of the data, and an outside convener that could serve as the“neutral party” among the participating organizations. Developing these agreements andinfrastructure would produce a resource with broad applications that could help speed thedelivery of new treatments to patients.Source: Personal communication with Michael Cantor, Pfizer Clinical Informatics andInnovation Senior Director Genetic Association Information Network (GAIN)GAIN supports a series of Genome-Wide Association Studies (GWAS) designed to identifyspecific points of DNA variation associated with the occurrence of a particular common disease.Investigators from existing case-control or trio (parent-offspring) studies were invited to submitsamples and data on roughly 2,000 participants for assay of 300,000 to 500,000 single nucleotidepolymorphisms designed to capture roughly 80 percent of the common variation in the humangenome.Now that the GAIN initiative has officially concluded, the resulting data from the GAIN studieshave been deposited into the database of Genotype and Phenotype (dbGaP) within the NationalLibrary of Medicine at the NIH for the broad use of the research community. Access iscontrolled by the GAIN Data Access Committee.Source: http://www.genome.gov/19518664 Informatics for Integrating Biology and the Bedside (i2b2)i2b2 (Informatics for Integrating Biology and the Bedside) is an NIH-funded National Center forBiomedical Computing based at Partners HealthCare System. The i2b2 Center is developing ascalable informatics framework that will enable clinical researchers to use existing clinical datafor discovery research and, when combined with IRB-approved genomic data, facilitate thedesign of targeted therapies for individual patients with diseases having genetic origins. Thisplatform currently enjoys wide international adoption by the CTSA network, academic healthcenters, and industry. i2b2 is funded as a cooperative agreement with the National Institutes ofHealth. 66
    • Draft as of September 25, 2012 Innovative Medicines Initiative (IMI) European Medical Information Framework (EMIF)The new IMI initiative (not yet formally announced) is called EMIF (European MedicalInformation Framework) and is aimed at sharing individual patient level data across a variety ofdiseases. The Innovative Medicines Initiative (IMI) is Europes largest public-private partnershipaiming to improve the drug development process by supporting a more efficient discovery anddevelopment of better and safer medicines for patients.With a €2 billion euro budget, IMI supports collaborative research projects and builds networksof industrial and academic experts in Europe in order to boost innovation in healthcare. Acting asa neutral third party in creating innovative partnerships, IMI aims to build a more collaborativeecosystem for pharmaceutical research and development. IMI plans to provide socio-economicbenefits to European citizens, increase Europes competitiveness globally and establish Europe asthe most attractive place for pharmaceutical R&D.IMI supports research projects in the areas of safety and efficacy, knowledge management andeducation and training. Projects are selected through open Calls for proposals. The researchconsortia participating in IMI projects consist of large biopharmaceutical companies that aremembers of EFPIA, and a variety of other partners, such as: small- and medium-sizedenterprises, patients organizations, universities and other research organizations, hospitals,regulatory agencies, and any other industrial partners. IMI is a Joint Undertaking between theEuropean Union and the European Federation of Pharmaceutical Industries and Associations(EFPIA).Source: http://www.imi.europa.eu/content/home Kaiser Permanente Research Program on Genes, Environment, and Health (RPGEH) and UCSF BiobankThe RPGEH, a scientific research program at Kaiser Permanente in California, is one of thelargest research projects in the United States to examine the genetic and environmental factorsthat influence common diseases such as heart disease, cancer, diabetes, high blood pressure,Alzheimers disease, asthma and many others. The goal of the research program is to discoverwhich genes and environmental factors—the air we breathe, the water we drink, as well aslifestyles and habits—are linked to specific diseases.Based on the over six million-member Kaiser Permanente Medical Care Plan of NorthernCalifornia (KPNC) and Southern California (KPSC), the completed resource will link togethercomprehensive electronic medical records, data on relevant behavioral and environmentalfactors, and biobank data (genetic information from saliva and blood) from 500,000 consentinghealth plan members. 67
    • Draft as of September 25, 2012In addition to learning more about the genetic and environmental determinants of disease, KaiserPermanente research scientists, working in collaboration with other scientists across the nationand around the world, hope to translate research findings into improvements in health andmedical care. They also hope to develop a broader understanding of the ethical, legal, and socialimplications of using genetic information in health care.Source: http://www.dor.kaiser.org/external/DORExternal/rpgeh/index.aspx Mini-SentinelThe objective of the Mini-Sentinel pilot project is to inform and facilitate development of theSentinel System and to carry out mandates delineated in FDAAA. With Mini-Sentinel, the FDAseeks to support new activities intended to develop the scientific operations required to build anefficient, valid, and reliable Sentinel System. The Mini-Sentinel pilot funds development of asingle Coordinating Center that: provides the FDA a "laboratory" for developing and evaluatingscientific methods that might be used in a fully-operational Sentinel System; affords the FDA theopportunity to assess safety issues using existing electronic healthcare data systems; and allowsthe FDA to learn more about the barriers and challenges to building a viable and accurate systemof safety surveillance for FDA-regulated medical productsKey features of Mini-Sentinel include: 1. Active Surveillance: Mini-Sentinels foundational activities are intended to facilitate implementation of an active medical-product safety surveillance system for FDA- regulated medical products, both newly and previously approved. 2. Collaboration: Mini-Sentinel is a collaborative endeavor conducted by a large group of Data and Academic Partners, known collectively as Collaborating Institutions, that provide both access to healthcare data and ongoing scientific, technical, methodological, and organizational expertise necessary to meet the requirements of the pilot. Each Collaborating Institution has a Site Principal Investigator who leads participation by that organization. 3. Coordinating Center: The Mini-Sentinel Coordinating Center (MSCC) integrates the activities of the Collaborating Institutions. Within the MSCC, the Operations Center develops and maintains the scientific, technical, analytic, and administrative infrastructure required to facilitate distributed scientific assessments. The Planning Board provides a forum for active participation by representatives of the Collaborating Institutions. 4. Distributed Data Approach: Mini-Sentinel uses a distributed approach that enables healthcare data to remain in their local environments under the control of the participating Data Partners. This approach is enhanced through development of a common data model that enables creation of centralized analytic programs for distribution to Data Partners for local implementation. 68
    • Draft as of September 25, 2012 5. Data Sources: Initial data sources include administrative claims environments with pharmacy dispensing data. Data from outpatient and inpatient electronic health records (EHRs) and registries will be added subsequently. 6. Rapid Response to Queries: The Mini-Sentinel system is being designed to accommodate urgent FDA requests for information about medical product exposures and outcomes. 7. Communication of Results: Results that relate exposure of a medical product to a health outcome will be posted within 30 days of completion of standard checks for accuracy. 8. Methods Development: Mini-Sentinel supports development and testing of improved statistical and epidemiological approaches to active surveillance of regulated medical products. 9. Policy Development: Mini-Sentinel supports development of policies and procedures to promote productive engagement among Collaborating Institutions consistent with the custodial responsibilities of the Data Partners. 10. Impact of FDA Actions: Mini-Sentinel is designed to facilitate assessment of the impact of FDA regulatory actions on the use of regulated medical products. 11. Engagement with Related Efforts: Under the FDAs direction, Mini-Sentinel coordinates with other national initiatives to facilitate development of comparable standards and optimal methodologies, and to disseminate lessons learned from the work of Mini- Sentinel.Source: http://mini-sentinel.org/about_us/ NEWMEDSNovel Methods leading to New Medications in Depression and Schizophrenia (NEWMEDS) - isan international consortium of scientists that has launched one of the largest ever researchacademic-industry collaboration projects to find new methods for the development of drugs forschizophrenia and depression.NEWMEDS brings together top scientists from academic institutions with a wide range ofexpertise, and partners them with nearly all major biopharmaceutical companies. The project willfocus on developing new animal models which use brain recording and behavioural tests toidentify innovative and effective drugs for schizophrenia. NEWMEDS will develop standardisedparadigms, acquisition and analysis techniques to apply brain imaging, especially fMRI and PETimaging to drug development. It will examine how new genetic findings (duplication anddeletion or changes in genes) influence the response to various drugs and whether thisinformation can be used to choose the right drug for the right patient. And finally, it will try anddevelop new approaches for shorter and more efficient trials of new medication – trials that mayrequire fewer patients and give faster results.Source: http://www.newmeds-europe.com 69
    • Draft as of September 25, 2012 One Mind InitiativeOne Mind is a new-model non-profit organization that will take the lead role in the research,funding, marketing, and public awareness of mental illness and brain injury, by bringing togetherthe governmental, corporate, scientific, and philanthropic communities in a concerted effort todrastically reduce the social and economic effects of mental illness and brain injury within tenyears. One Mind for Research is working on several key programs and initiatives.Knowledge Integration Network for Traumatic Brain Injury and Post Traumatic Stress is a newPublic Private Partnership bringing best-in-class science, technology and expertise together toaccelerate research and realize new diagnostics, treatments and cures for TBI and PTS disorders.Goals of the Knowledge Integration Network: 1. Increase the rate of large-scale collection of clinical and biomarker information in order: • To help identify biological indicators of the causes and effects of diseases, or pathology. • To investigate disease progression for earlier, more accurate diagnosis and treatment. 2. Create new systems and approaches that will enable: • The sharing of data from disparate sources with academia, industry, non-profits and governments • The removal of barriers to effective scientific and clinical research that will speed diagnosis and treatment on an unprecedented level. 3. Empower patients to take a more active role in their own care and contribute to the acceleration of research by direct engagement and expanding the use of advanced biosensors & diagnostics.Brain Data Exchange Portal: One Mind in partnership with the International NeuroinformaticsCoordinating Facility is creating a digital environment for multiple-source data sharing, withopen analysis tools, and provenance tracking systems for working with complex data. They aredeveloping this in concert with leading Health Information Technologies partners. Several open-source technology platforms and tools will serve as candidate components for the shared dataportal ecosystem that will be built in two stages. Version 1.0 will be completed in 2012 and willserve as the prototype for a full-scale 2.0 model to be launched in 2013. This system will fosteran open science approach among academia, industry, non-profits and governments, to removethe barriers to effective scientific and clinical research.Virtual Biorepository Program: A dynamic, on-line international registry of brain-relateddisorder biorepositories supported by the Neuroscience Information Framework (NIF), AmericanBrain Coalition, and the NIH.Source: http://1mind4research.org/programs 70
    • Draft as of September 25, 2012 Parkinson’s Progression Markers Initiative (PPMI)The Parkinson’s Progression Markers Initiative is a landmark observational clinical study tocomprehensively evaluate a cohort of recently diagnosed PD patients and healthy subjects usingadvanced imaging, biologic sampling and clinical and behavioral assessments to identifybiomarkers of Parkinson’s disease progression.PPMI will be carried out over five years at 24 clinical sites in the United States, Europe, andAustralia, and requires the participation of 400 Parkinson’s patients and 200 control participants.The mission of PPMI is to identify one or more biomarkers of Parkinson’s disease (PD)progression, a critical next step in the development of new and better treatments for PD. PPMIwill establish a comprehensive, standardized, longitudinal PD data and biological samplerepository that will be made available to the research community.Specific aims to accomplish this objective include: 1. Develop a comprehensive and uniformly acquired clinical/imaging dataset with correlated biological samples that can be used in biomarker verification studies 2. Establish standardized protocols for acquisition, transfer and analysis of clinical and imaging data and biological samples that can be used by the research community 3. Investigate existing biomarkers and identify new clinical, imaging and biological markers to determine interval changes in these markers in PD patients compared with control participants.Source: http://ppmi-info.org PatientsLikeMePatientsLikeMe was co-founded in 2004 by three MIT engineers: brothers Benjamin and JamesHeywood and longtime friend Jeff Cole. Five years earlier, their brother and friend StephenHeywood was diagnosed with ALS (Lou Gehrig’s disease) at the age of 29. The Heywoodfamily soon began searching the world over for ideas that would extend and improve Stephen’slife. Inspired by Stephen’s experiences, the co-founders and team conceptualized and built ahealth data-sharing platform that they believe can transform the way patients manage their ownconditions, change the way industry conducts research and improve patient care.Today, PatientsLikeMe is a for-profit company, but not one with a “just for profit” mission. theyfollow four core values: putting patients first, promoting transparency (“no surprises”), fosteringopenness and creating “wow.” They’re guided by these values as they continually enhance theirplatform, where patients can share and learn from real-world, outcome-based health data.They’ve also centered their business around these values by aligning patient and industryinterests through data-sharing partnerships. They work with trusted nonprofit, research andindustry Partners who use this health data to improve products, services and care for patients.They envision a world where information exchange between patients, doctors, pharmaceuticalcompanies, researchers and the healthcare industry can be free and open; where, in doing so, 71
    • Draft as of September 25, 2012people do not have to fear discrimination, stigmatization or regulation; and where the free flowof information helps everyone. They envision a future where every patient benefits from thecollective experience of all, and where the risk and reward of each possible choice is transparentand known.http://www.patientslikeme.com/about Sage BionetworksSage Bionetworks is a new 501(c)(3) nonprofit biomedical research organization created torevolutionize how researchers approach the complexity of human biological information and thetreatment of disease. Sage’s mission has five interdependent themes: (1) research on networkmodels of disease; (2) pilot projects trialing disruptive models of cooperation; (3) rules andreward structures that promote data sharing; (4) building the computational platform for a digitalCommons; and (5) activating public engagement and access. They are driven by two recenttrends in biomedical research:1. The exponential increases in the quantity of molecular and genetic information without concomitant advances in the understanding of biology or disease, and;2. The need for development and validation of computational methods for integrating massive molecular and clinical datasets obtained across sizable populations into predictive disease models that can recapitulate complex biological systems with increasing accuracy.They believe that these problems are compounded by current systems and practices for sharingdatasets as well as the scale of data and computational capacity required to construct usefulintegrated genomic models of networks and systems.Source: http://sagebase.org/info/index.php The International Severe Adverse Events Consortium (iSAEC)The international Serious Adverse Event Consortium (iSAEC) is a nonprofit organizationfounded in 2007. It is comprised of leading pharmaceutical companies, the Wellcome Trust, andacademic institutions; with scientific and strategic input from the U.S. Food and DrugAdministration (FDA) and other international regulatory bodies. The mission of the iSAEC is toidentify DNA-variants useful in predicting the risk of drug-related serious adverse events(SAEs).Patients respond differently to medicines and all medicines can have side effects in certainindividuals. The iSAEC’s work is based on the hypothesis these differences have (in part) agenetic basis, and its research studies examine the impact of genetic variation on how individualsrespond to a large variety of medicines. The iSAEC’s initial studies have successfully identifiedgenetic variants associated with drug-related liver toxicity (DILI) and Serious Skin Rashes 72
    • Draft as of September 25, 2012(SSR). The majority of the iSAEC’s genetic findings have been specific to a given drug versusacross multiple drugs. However, a number of cross drug genetic alleles are starting to emergethat may provide important insights into the underlying biology/mechanism of drug inducedSAEs (e.g. HLA*5701 or UGT1A1*28). They believe their findings clearly demonstrate animportant role for the MHC genomic region (Chromosome 6), in the pathology ofimmunologically mediated SAEs such as DILI and SSR. And that their findings emphasize theimportance of immune regulation genes, in addition to a number of well characterized drugmetabolism (ADME) genes.In the iSAEC’s second phase (2010-2013) their goal is to deepen understanding of the geneticsof immunologic influenced SAEs, including SSR, DILI, drug induced kidney injury (DIKI) andacute hypersensitivity syndrome (AHSS). They will continue to apply state of the art genomicmethods to fulfill these aims, including whole genome genotyping and next generationsequencing to explore the role “rare” genetic variants in drug induced SAEsIn addition to supporting original genomic research on drug-related SAEs, the iSAEC is: Executing "open-use research practices and standards" in all its collaborations. Encouraging greater research efficiency and “speed to results” by pooling talent and resources, under a collaborative private sector leadership model, solely focused on research quality and productivity. Releasing all of its novel genetic discoveries/associations into the public domain uninhibited by intellectual property constraints. Enhancing the public’s understanding of how the industry, academia and government are partnering to address drug-related adverse events Structural Genomics Consortium (SGC)The SGC (Structural Genomics Consortium) is a not-for-profit, public-private partnership withthe directive to carry out basic science of relevance to drug discovery. The core mandate of theSGC is to determine 3D structures on a large scale and cost-effectively - targeting humanproteins of biomedical importance and proteins from human parasites that represent potentialdrug targets. In these two areas respectively, the SGC is now responsible for >25% and >50% ofall structures deposited into the Protein Data Bank each year; to date (Sep.2011) the SGC hasreleased the structures of over 1200 proteins with implications to the development of newtherapies for cancer, diabetes, obesity, and psychiatric disorders.Since its inception, the SGC has been engaged in pre-competitive research to facilitate thediscovery of new medicines. As part of its mission the SGC generates reagents and knowledgerelated to human proteins and proteins from human parasites. The SGC believes that its outputwill have maximal benefit if released into the public domain without restriction on use, and hasadopted an Open Access policy.Source: http://www.thesgc.org 73
    • Draft as of September 25, 2012 tranSMARTDeveloped by J&J and launched in 2009, tranSMART is a tool for translational R&D. Theplatform is heavily reliant on open source data and is designed to facilitate pre-competitivesharing.tranSMART is a knowledge management platform that they believe enables scientists to developand refine research hypotheses by investigating correlations between genetic and phenotypicdata, and assessing their analytical results in the context of published literature and other work.The integration, normalization, and alignment of data in tranSMART is designed to permit usersto explore data very efficiently to formulate new research strategies. Some of tranSMARTsspecific applications include: • Revalidating previous hypotheses • Testing and refining novel hypotheses • Conducting cross-study meta-analysis • Searching across multiple data sources to find associations of concepts, such as a genes involvement in biological processes or experimental results • Comparing biological processes and pathways among multiple data sets from related diseases or even across multiple therapeutic areasThe tranSMART Data Repository combines a data warehouse with access to federated sources ofopen and commercial databases. tranSMART accommodates: • Phenotypic data, such as demographics, clinical observations, clinical trial outcomes, and adverse events • High content biomarker data, such as gene expression, genotyping, pharmacokinetic and pharmaco-dynamics markers, metabolomics data, and proteomics data • Unstructured text-data, such as published journal articles, conference abstracts and proceedings, and internal studies and white papers • Reference data from sources such as MeSH, UMLS, Entrez, GeneGo, Ingenuity, etc. • Metadata providing context about datasets, allowing users to assess the relevance of results delivered by tranSMARTData in tranSMART is designed to allow identification and analysis of associations betweenphenotypic and biomarker data, and it is normalized to conform with CDISC and other standardsto facilitate search and analysis across different data sources. tranSMART also allowsinvestigators to search published literature and other text sources in order to evaluate theiranalysis in the context of the broader universe of reported research.External data can also be integrated into the tranSMART data repository, either from open dataprojects like GEO, EBI Array Express, GCOD, or GO, or from commercially available datasources. They believe that by making data accessible in tranSMART enables organizations toleverage investments in manual curation, development costs of automated ETL tools, orcommercial subscription fees across multiple research groups. 74
    • Draft as of September 25, 2012The tranSMART system also incorporates role-based security protections to enable use of theplatform across large organizations. User authentication for tranSMART can be integrated intoan organizations existing infrastructure to simplify user management.The tranSMART security model allows an organization to control data access and use inaccordance with internal policies governing the use of research data, as well as HIPAA, IRB,FDA, EMEA, and other regulatory requirements.Source: http://www.transmartproject.org/index.html 75
    • CDISC Contact: C-Path Contact:Andrea Vadakin Dianne Janis+1.316.558.0160 520.382-1399Avadakin@cdisc.org DJanis@c-path.org CDISC, C-Path and FDA Collaborate to Develop Data Standards to Streamline Path to New TherapiesAustin, TX – 21 June 2012 – The Clinical Data Interchange Standards Consortium (CDISC) and theCritical Path Institute (C-Path) announce the signing of a partnership agreement to establish theCoalition For Accelerating Standards and Therapies, or CFAST, an initiative to accelerate clinicalresearch and medical product development by creating and maintaining data standards, tools andmethods for conducting research in therapeutic areas that are important to public health.Version 1.0 of an Alzheimer’s data standard, used to develop the C-Path Coalition Against MajorDiseases’ (CAMD) groundbreaking Alzheimer’s Disease data repository, was developed through theCDISC process and posted to the CDISC website in October 2011. Since this first major milestone,CDISC and C-Path and such partners as the Bill and Melinda Gates Foundation, the Michael J. FoxFoundation, the PKD Foundation, Tufts University, Rochester University, the National Institutes ofHealth (NIH) and others have continued to make advancements in a number of therapeutic areas.Tuberculosis, Pain, Parkinson’s Disease, Polycystic Kidney Disease, Virology, Oncology, andCardiovascular Disease are therapeutic areas for which standards are currently in development.“We have enjoyed our collaboration to develop standards and research databases that are open toscientists around the world to seek new therapies,” says Dr. Carolyn Compton, President and CEO of C-Path. “We need a means to scale the process and manage the development of a very large number oftherapeutic area standards. The establishment of CFAST embodies our resolve to take what we havelearned from the initial projects to formulate a new process that will be significantly more efficientwhile retaining the rigor that is essential to this work.”Also in 2011, the Center for Drug Evaluation and Research (CDER) at the U.S. Food and DrugAdministration (FDA) released a list of priority therapeutic areas that could benefit from thestandardization of key study data specific to each area. The FDA website states: “FDA, CDISC and theCritical Path Institute [C-Path] are collaborating on efforts to support development of therapeutic areastandards. […] We encourage stakeholders to engage in and, where possible, support these datastandardization efforts.”Therapeutic area standards development has been identified by the CDISC Board of Directors and theCDISC community as a key strategic goal for CDISC. "Our stakeholders have expressed that CDISCprovides value to them through our free and open standards and that this value will be enhanced if wecan extend the foundational standards to cover therapeutic areas," stated Dr. Rebecca Kush, CDISC 76
    • President and CEO. "CFAST is a major new initiative that will allow the industry to speed the deliverynew therapies for patients, both in helping us gain more knowledge from past research and inimproving the overall process for new studies, from protocol design through analysis and reporting."The official launch of CFAST, including the roll out of the newly designed process and a Shared Healthand Research Electronic Library (SHARE) environment to make the standards more accessible, will beon 24 October at the CDISC International Interchange in Baltimore, MD. To learn more about thesetherapeutic area standards, interested parties are invited to attend the session “Standards for Patients:Collaborations to Innovate Therapy Development,” chaired by Bron Kisler, CDISC VP of StrategicInitiatives, taking place at the DIA Annual Meeting in Philadelphia on Monday morning, 25 June 2012.ABOUT CDISCCDISC is a 501(c)(3) global non-profit charitable organization, with ~300 supporting memberorganizations from across the clinical research and healthcare arenas. Through the efforts of volunteersaround the globe, CDISC catalyzes productive collaboration to develop industry-wide data standardsenabling the harmonization of clinical data and streamlining research processes from protocol throughanalysis and reporting, including the use of electronic health records to facilitate the collection of highquality research data. The CDISC standards and innovations can decrease the time and cost of medicalresearch and improve quality, thus contributing to the faster development of safer and more effectivemedical products and a learning healthcare system. The CDISC Vision is to inform patient care andsafety through higher quality medical research.ABOUT C-PATHAn independent, non-profit organization established in 2005 with public and private philanthropicsupport from the Arizona community, Science Foundation Arizona, and the U.S. Food and DrugAdministration (FDA), C-Path’s mission is to improve human health and well-being by developing newtechnologies and methods to accelerate the development and review of medical products. Aninternational leader in forming collaborations, C-Path has established global, public-privatepartnerships that currently include 1,000+ scientists from government regulatory agencies, academia,patient advocacy organizations, and dozens of major pharmaceutical companies. C-Path isheadquartered in Tucson, AZ and has an office in Rockville, MD. Visit www.c-path.org for moreinformation. # # # 77
    • Embargoed for Contact: Rick Myers rmyers@c-path.org12:01 a.m. 520-547-3440Friday, June 11, 2010 Lisa Romero lromero@c-path.org 520-777-2883 PUBLIC RELEASE OF ALZHEIMER’S CLINICAL TRIAL DATA BY PHARMACEUTICAL RESEARCHERS First Combined Pharmaceutical Trial Data on Neuro-degenerative Diseases;Shared Resource from Unique Public-Private Partnership Will Help Accelerate Alzheimer’s, Parkinson’s, and Other Brain Disease ResearchWashington, DC – A new database of more than 4,000 Alzheimer’s disease patients who have participated in11 industry-sponsored clinical trials will be released today by the Coalition Against Major Diseases (CAMD). This isthe first database of combined clinical trials to be openly shared by pharmaceutical companies and made available toqualified researchers around the world.It is also the first effort of its kind to create a voluntary industry data standard that will help accelerate new treatmentresearch on brain disease, as patients with other related brain diseases are expected to be added. The level of detailand scope of this database will enable researchers to more accurately predict the true course of Alzheimer’s,Parkinson’s, Huntington’s, and other neuro-degenerative diseases, thereby enabling the design of more efficient clinicaltrials. Patient identifiers will not be included in the database, thereby ensuring patient privacy.CAMD is a formal consortium of pharmaceutical companies, research foundations and patient advocacy/voluntaryhealth associations, with advisors from government research and regulatory agencies including the U.S. Food and DrugAdministration (FDA), the European Medicines Agency (EMA), the National Institute of Neurological Disorders andStroke (NINDS), and the National Institute on Aging (NIA). CAMD is led and managed by the non-profit Critical PathInstitute (C-Path), which is funded by a cooperative agreement with the FDA and a matching grant from ScienceFoundation Arizona.“The U.S. Food and Drug Administration has supported and actively participated in this innovative and unprecedentedpublic-private partnership from its inception,” said Joshua Sharfstein, MD, Principal Deputy Commissioner, FDA. “Theagency is strongly committed to CAMD and other regulatory science collaborations that can speed safe and effectivetreatments to the public.”In addition to sharing data, the pharmaceutical members of CAMD have agreed to use the new common data standardestablished for Alzheimer’s disease by the standard-setting organization, CDISC, in their future submissions for drugapprovals. The Clinical Data Interchange Standards Consortium (CDISC) is a global, multidisciplinary, non-profitorganization that has established standards to support the acquisition, exchange, submission, and archive of clinicalresearch data and metadata. CDISC standards are vendor-neutral, platform-independent, and freely available via theCDISC website. This will add greater efficiencies to the FDA’s review process and make it possible for new products toreach the market more quickly and with greater assurances of safety and effectiveness. 78
    • “This unprecedented data sharing is game-changing for companies that are developing new therapies for neuro-degenerative diseases,” said Raymond Woosley, MD, PhD, President and CEO of Critical Path Institute (C-Path).“Scientists around the world will be able to analyze this new combined data from pharmaceutical companies, add theirown data, and consequently better understand the course of these diseases.”Mark McClellan, MD, PhD, who launched FDA’s Critical Path Initiative during his tenure as FDA Commissioner, also notedthe need for better evidence. “Too many treatments fail in the last stages of research, wasting millions of dollars andyears of research time. To get to faster, more efficient development of safe and effective treatments, we must have abetter understanding of diseases at the molecular level. The CAMD database is a promising step in this process forneurodegenerative diseases,” said Dr. McClellan, who is now the director of the Engelberg Center for Health CareReform and Leonard D. Schaeffer Chair in Health Policy Studies at the Brookings Institution.Roughly 6.5 million people in the U.S. are afflicted with Alzheimer’s and Parkinson’s diseases, with costs reaching asmuch as $175 billion annually. Worldwide there are already an estimated 30 million people with dementia alone. By2050, the number will rise to over 100 million. Halting or slowing the progression of these diseases will prevent untoldsuffering and save tens of billions of dollars every year.“Data sharing is the backbone of several CAMD projects designed to identify patients who might develop brain diseases,i.e., before symptoms are apparent,” said Marc Cantillon, MD, Director of C-Path’s Coalition Against Major Diseases.“Our goal is to develop tools to prevent or slow these diseases so patients can maintain independence and quality oflife.”The CAMD database will allow researchers to design more efficient clinical trials that have the maximum chance ofdemonstrating if a new treatment is truly safe and effective. In addition, the coalition is identifying biomarkers thatidentify patients in the very early stages of Alzheimer’s disease and Parkinson’s disease.According to Maria Isaac, MASc, MD, PhD, Scientific Administrator, Scientific Advice, Human Medicines Special AreasSector of the European Medicines Agency (EMA), “Within the context of the Innovative Medicines Initiative (IMI) inEurope, the EMA is committed to similar goals as C-Path’s consortia, i.e., to help biopharmaceutical drug development,for the benefits of patients. The Agency is especially interested in reviewing CAMD’s Alzheimer’s biomarkers anddisease progression models.”“We are proud to be a member of this coalition,” said Frank Casty, MD, VP Technical Evaluations, AstraZenecaPharmaceuticals LP, and Co-Director of CAMD. “A healthier world must come from collaboration, in making better,deeper connections with all our stakeholders, and sharing skills and ideas to meet a common goal – improved health.”____________________________________________________________________________________CAMD Members: Abbott, Alliance for Aging Research, Alzheimers Association, Alzheimers Foundation of America,AstraZeneca Pharmaceuticals LP, Bristol-Myers Squibb Company, CHDI Foundation Inc, Eli Lilly and Company, F.Hoffmann-La Roche Ltd, Forest Research Institute, Genentech Inc., GlaxoSmithKline, Johnson & Johnson, National HealthCouncil, Novartis Pharmaceuticals Corporation, Parkinsons Action Network, Parkinsons Disease Foundation, Pfizer, Inc.,and sanofi-aventis US Inc.About CAMD: CAMD members fully share pre-competitive data and knowledge that will more efficiently and safelyspeed development of new therapies and preventions for Alzheimer’s, Parkinson’s, Huntington’s, and other debilitatingneuro-degenerative diseases. CAMD’s overall objective is to help scientists identify clinical and laboratorycharacteristics of patients who are pre-symptomatic and most likely to benefit from new therapies. For moreinformation on CAMD and the database, visit http://www.c-path.org/CAMD.cfm.About Critical Path Institute (C-Path): An independent, non-profit organization, C-Path’s mission is to serve as theimpartial facilitator of collaborative efforts among scientists from government, academia, patient advocacyorganizations, and the private sector to support the U.S. Food and Drug Administration’s regulatory science initiatives. 79
    • This involves creating faster, safer, and smarter pathways for innovative new drugs, diagnostics, and devices that willsignificantly improve public health. Established in 2005, C-Path is headquartered in Tucson, Arizona, with offices inPhoenix, Arizona, and Rockville, Maryland. Visit www.c-path.org for more information. ### 80
    • Get SmartKnowledge ManagementWinner: Centocor R&D nominated by Recombinant Data Corp.Project: tranSMARTBy Kevin DaviesJuly 29, 2010 | One of the most gratifying aspects of awarding a Best Practices Award to a majorpharmaceutical operation is to see that technology or platform find applications beyond the wallsof the pharma itself. That certainly seems to be the case with tranSMART, the research datamanagement project that was implemented at the Johnson & Johnson (J&J) subsidiary CentocorR&D, with help from Recombinant Data Corporation in Boston.As Bio•IT World reported earlier this year (see, “Running tranSMART for the DrugDevelopment Marathon,” Bio•IT World, Jan 2010), tranSMART provides a federatedconnectivity architecture that allows Centocor researchers to query the company’s wealth of drugdata using a single tool. The tranSMART architecture minimizes the complexity and resourcesrequired to aggregate disparate research data, improving the productivity of the company’sresearchers.Leading the charge for Centocor was Eric Perakslis who, just days after accepting the BestPractices Award, took up a new role as J&J’s vice president of Pharma R&D IT (a positionformerly held by John Reynders, who has taken a different role at J&J). Indeed, Perakslis says “alot of the innovation we tried to do [with tranSMART] were things that made them seek me outfor this job.”The impetus behind tranSMART was the notion that modern drug development is expensive,tedious, and very data intensive. tranSMART has a search engine that indexes many sources ofinternal and external data, as well as links for further analysis by applications such as AriadneGenomics Pathway Studio. Some text sources are hand-curated, such that the hits from thesecurated sources can be exported in semantic triples and visualized in network diagrams.A Dataset Explorer allows users to easily query clinical trial data and focus on scientificquestions rather than data processing, generating queries by phenotypes, genotypes, or acombination thereof. End users can view summary statistics about their queries, analyze geneexpression or proteomic data through a link to the Broad Institute’s Gene Pattern application orview SNP data using linkage disequilibrium plots.Both the search engine and the Dataset Explorer leverage the same underlying data warehouse—the Recombinant Data Trust. tranSMART reduces the delay between a query across disparatesource systems and a result from days or weeks to a matter of minutes. Within about 12 months,Perakslis and colleagues had tranSMART deployed, including a variety of open-sourcecomponents to help other institutions leverage the knowledge base. That is already happening,says Perakslis. 81
    • One example is the European Union’s Innovative Medicines Initiative (IMI), designed toeliminate hurdles in drug development. IMI has funded U-BIOPRED, involving thousands ofrespiratory disease patients, looking for drug targets and repositioned molecules, involvingmultiple biopharmas, academic groups and regulators coming together. Perakslis says J&J waseager to get involved, and has granted tranSMART to the consortium. Another outreach effort iswith St. Jude Children’s Research Hospital in Memphis, integrating tranSMART into the Curefor Kids and pediatric oncology database for 172 countries. “It’s great to see people taking it up,”says Perakslis, particularly as it is a “slightly disruptive innovation” for a big pharma. “It’s notjust free software, but you can tailor your investment to make it viral and useful, rather than justbuying licenses and getting it to work.”Another pharma has approached Perakslis about using tranSMART as a large, public repositoryfor cancer data. For J&J, “It’s good business,” says Perakslis. “If you want a friend, be a friend.”J&J isn’t looking for anything in return, but “If some of these foundations find a drug target oran interesting compound, and they care to show it to us first, it doesn’t bother us!”This article also appeared in the July-August 2010 issue of Bio-IT World Magazine.http://www.bio-itworld.com/2010/issues/jul-aug/BP-KM.html 82
    • Running tranSMART for the DrugDevelopment MarathonBy Kevin DaviesJanuary 20, 2010 | If you sense a certain irresistible Lance Armstrong-like determination whenmeeting Centocor’s Eric Perakslis, it is not a coincidence. Perakslis, the VP of R&D Informaticsat Centocor R&D (and newly appointed member of the Corporate Office of Science andTechnology at J&J) is a cancer survivor—he was diagnosed with stage III kidney cancer at 38—and has run marathons in support of Armstrong’s LiveStrong Foundation.Once every other month, Centocor graciously lets Perakslis travel to Jordan where he volunteersas the CIO and head of Biomedical Engineering of the King Hussein Institute of Biotechnologyand Cancer. It’s a noble gesture given that Perakslis has his hands full at Centocor R&D—thebiotechnology, immunology and oncology research arm of Johnson & Johnson (J&J). Trained asa biochemical engineer, Perakslis has spent the past decade in handling a variety of IT and labresponsibilities. “You’ve got to understand yours skills and weaknesses and then just kind of gowhere the fun is,” he says.Amazingly AdvancedThe latest Perakslis project has been to spearhead the development of tranSMART (TranslationalMedicine Mart), a powerful new translational informatics enterprise data warehouse that wentlive at J&J in June 2009. TranSMART helps investigators mine drug target, gene and clinicaltrial data to aid in predictive biomarker discovery, chiefly in immunology and oncology.Perakslis says it is an “amazingly advanced” data warehouse that compares favorably with manysuch efforts he’s seen in the pharma world.TranSMART is a full translational medicine warehouse, says Perakslis. When he wasconsidering the project, Perakslis insisted that a different approach was needed. “If I’m going tobe the head of informatics, all the data has to be mine—I don’t care where it is, what [J&Jsubsidiary] it’s in, I don’t care if its drug discovery or clinical or post-marketing—in order to doinformatics well, all the data has to be in scope.” Few pharmas follow that practice. “When Istarted recruiting, I could find some good bioinformaticists, but none of them had ever seenclinical data—because they weren’t allowed!” It also meant changing the mindset that sees someR&D leaders more worried about their peers seeing, and potentially misunderstanding their data,than a rival pharma executive.Breaking down those silos is like developing an internal informatics-without-borders ecosystem.“We take all the resources and capabilities and direct it toward the largest questions facing theorganization.” Instead of a traditional bioinformatician only supporting a process like targetidentification, Perakslis puts them all in one team to apply some scientific and medical rigor tothe key strategic decisions facing the development teams and R&D leadership.The goal is an informatics package that combines all the different ‘omics technologies and 83
    • informatics disciplines to support or guide a decision. “It is still early days and, at first, we addeda lot of value by ruling things out. I can’t tell [the team leader] what indications to pick, but outof five disease indications we’re considering for a novel therapeutic, I have evidence that threeare a bad idea. How’s that to start? We are then able to stop debating the five and focus onevolving the thinking about the two that were left.”Perakslis partnered with Recombinant Data to build tranSMART. He wanted a partner who knewhow to assemble data warehouses. Head architect Bruce Johnson worked on clinical datawarehousing at the Mayo Clinic, but before that, he worked on railroads. “He just understandsdata warehousing and doesn’t care what sort of data it is,” says Perakslis. The Boston firm hasworked with Partners Healthcare in designing and building i2B2. “Technology-wise,tranSMART is very basic. We haven’t brought in a lot of third-party software or heavy analytics.The system is about the data and the technology is based on open stuff. I don’t owe peoplemillions of dollars a year in contracts.” The closest commercial competition is probablyInforSense, says Perakslis. In addition, the system is the first production system from J&J to behosted externally on the Amazon cloud.“IT isn’t always a profession that gets the respect it deserves. Sometimes, the MD on staff thinkshe can do IT better than the IT staff, [but] I wanted to assemble a team that knew it down cold.”For example, Sandor Szalma, the project manager for tranSMART, brings the needed mix ofR&D knowledge, vision, software and social engineering experience and tenacity to the team.Perakslis is currently having discussions about a Red Hat approach where a lot of the advancedanalytics is made open source. “I’m not running a commercial software company,” he says. He’salready offered the software to Guna Rajagopal, a colleague at the Cancer Institute of NewJersey. In addition, St Jude Children’s Research Hospital, UCSF and other centers are allconsidering the adoption of the i2b2-based platform.Unlike another prominent and excellent J&J informatics resource, ABCD (see, “ABCD: TheRelentless Pursuit of Perfection,” Bio•IT World, Nov 2007), Perakslis says is very focused on thechemistry and computational technologies, tranSMART handles large molecules and everythingelse, capable of providing access to every trial the company has ever run, and allowing cliniciansto test hypotheses that enable the design of better clinical trials. In an effort to reinvent as little aspossible, tranSMART integrates information from numerous existing vendors at Centocor/J&J,including Jubilant, Ariadne, Ingenuity and GeneGo. There are links to public repositories such asGene Cards and PubMed as well as direct links to many key internal systems across J&J.Perakslis has loaded several years of clinical trial data into tranSMART, which can be searchedfor screening data, lab data, you name it. “We have not had a lot of success in asthma yet but wefeel the disease is strategic and we must keep trying. If the PI in asthma says, ‘I just saw a newsignature or a new gene in a paper. I wonder if that gene was amplified in any of the patientswe’ve seen in our studies?’ That’s an instant search.”All Kinds of DataThe data warehouse can be searched by almost anything—compound, disease, gene, gene list,pathway, gene signature. Type in a gene, and a researcher can see every relevant clinical trial,gene expression datasets where expression of that gene was significantly altered, literature 84
    • results, pathway analysis, and more.Perakslis says he is incorporating all kinds of data into the warehouse. “When we add new trials,we use data diversity as a selection criteria. If a certain trial will bring a new and necessary datamodality, great. We’re growing the mart and growing the content.” While it is meaningless andtoo expensive to do a whole genome on everybody right now, he says most trials areunderpowered with respect to sampling for molecular profiling. “So what is just right? What isenough data from a trial to meet objectives but learn something about the biology or thecompound that helps you design the next one in the event this one fails?”Since tranSMART went live last summer, the accolades have been glowing, particularly from theclinicians and scientists who have changed their study designs because of the data warehouse.For example, J&J’s Elliot Barnathan, executive director of clinical immunology research, says,“tranSMART gives the pharmaceutical physician/researcher the ability to use data alreadyobtained internally or from the public domain at his or her desktop to ask and quickly answerquestions with ease.” Hans Winkler, senior director of oncology biomarkers, says tranSMARTwill transform the way his team analyzes data. “Cross-trial meta-analyses and combined pre-clinical and clinical data analyses are at our fingertips,” he says. Direct cross-referencing withthe literature and multiple visualization tools down-stream are all available instantly. This isclearly a quantum leap in our ability to extract knowledge from data.”A big reason for tranSMART’s success is the social engineering—getting the datasets released,having a QA process, informed consent, etc. As a past CIO for Centocor R&D, Perakslis hadwritten a lot of those policies, on records management, the FDA Gateway project, eCTD, andknew where a lot of that stuff was. But he admits: “The truth is, there’s nothing you could do inthis warehouse you couldn’t do anyway if you had the time to do it.”Perakslis is trying to encourage the Google approach, spending a portion of one’s time thinkingcreatively about something other than your job. “In pharma, when things are tough and peopleget busy, they tend to start acting like technicians. They run the models they know how to run…This is about efficiency and innovation. It does save time and money—if people are thinkingdifferently and running better experiments, the quality of decision making should go up.”Perakslis praises J&J CIO John Reynders for being supportive and “getting out of the way andremoving barriers, that’s one of the best things he could do.” As good a tool as tranSMART is,Perakslis says, “it’s only good because we built all our processes around it. If you dropped that inan organization, if you don’t have central agreement about data standards, data availability,compliance and protection of patient rights, it’s not any use.”This article also appeared in the January-February 2010 issue of Bio-IT World Magazine.http://www.bio-itworld.com/issues/2010/jan/transmart.html 85
    • Genomic research cial website: weconsent.us. To make sure consent is truly informed,Consent 2.0 Mr Wilbanks and his team have gone to great lengths to explain the consequences to signatories. There is an online tutorial that cannot be bypassed. Uploaded infor- mâtion may be removed from the data- base on request, at any time, but the pro-NEW YORK vider is clearþ warned that it may haveAbetterway of signingup for studies of your genes already found its way into places fromTN AN age where people promiscuously according to John Wilbanks, the crea[or of which it cannot be erased.Ipost personal data on the web and regu- Portable Legal Consent, representatives of Sage Bionetworks hopes z5,ooo peoplelarþ click "I agTee" to reams of legalese several drug companies have expressed will sign up in the frrst year, either becausethey have never read, news of yet another enthusiasm about Sage Bionetworks ap- trial organisers choose to adopt the proto-electronic consent form might seem like a proach. And appealing features, such as so- col or volunteers insist on it. But to be reallybig yawn. But for the future of genomics- called syndicating technology, which auto- useful, the database would need to grow torelated research the Portable Legal Con- matically informs both researchers and ten or 1oo times this size. Mr Wilbanks hassent, to be announced shortlyby Sage Bio- volunteers about new data relevant to a therefore started discussions with severalnetworks, a non-profit research organisa- specific drug or disease, should reduce the frrms that offer commercial genetic teststion based in Seattle, is anything but resistance of individual researchers to the for a range of diseases. It is also linking upmundane. Indeed, by reversing the normal loss of control of what they used to think with PatientslikeMe, an organisation thatway consent to use personal data is ac- of as their own data. helps almost rio,ooo people find thosequired from patients in clinical trials, it Of course, sharing in this open-ended with similar illnesses, in order to sharecould spell a new relationship between way carries risks. The data involved are their experiences.scientists and the human subjects of their "de-identified"-meaning they cannot im-research, with potential benefrts that ex- mediately be traced to a specific individ- Are we all agreed?tend well beyond genomics. ual. But as Mr Wilbanks notes, this is not a So far, the Portable Legal Consent is valid foolproof guarantee of anonymity. only in America, although Sage Bionet-Shareandenjoy Data shared might, for example, be works is looking at ways of adapting it to fitThe heart of Portable Legal Consent is the traced back to their owner by sophisticat- the legal frameworks of China and thenotion that anyone who signs up.for a clin- ed search algorithms. Or some malevolent European Union. How quickly the ideaical trial, or simply has his genome read in hacker might expose them to the world. willcatch on iemains to be seen. But if itorder to anticipate the risk of disease, Those squeamish about sharing their per- does, other sorts of researchers who relyshould easily be able to share his genomic sonal information should probably not on gathering personal data-for exampleand health data not just with that research sign the consent form, Mr Wilbanks coun-- in sociology or in tracking energy use ingroup or company, but with all scientists sels. But those who believe the benefrt of homes-may find it attractive. And thatwho are prepared in turn to accept some doing so-accelerating the pace of medical would enable research of a sort that is nowsensible rules about how they may use the research-outweighs the risks can start to impossible, by opening up the field ofdata. The main one of these is that the re- pool their own data next month on a spe- quantifi able social science. rsults of investigations which include such"open source" data must, themselves, befreely and publicly available. In much thesame vein as the open-source-softwaremovement, the purpose of this is to in-crease the long-term value of the data, byallowing them to be reused in ways thatmay not even have been conceived of at the time they were collected. That approach contrasts with todays system of secretive data silos for particular studies of speciflc diseases. Patients some- times discover that they have even signed away their rights to see their own data. That may serve the narrow interests of a research group or drug company intent on keeping competitors at bay. But the po- tential of genomic data to provide further insights, perhaps in completely novel con- texts, is huge-especially when correlated with a persons medical record. Also, teas- ing out correlations between particular ge- notypes and diseases in a statistically meaningful way requires large sets of data; the larger the set, the more believable the correlations. Portable Legal Consent brings the promise of very large data sets indeed. Even for academia and industry, these benefrts should eventually outweigh the short-term drawbacks of sharing. Indeed, Sign here 86
    • PerspectivesReporting the findings of clinical trials: a discussion paperD Ghersi,a M Clarke,b J Berlin,c AM Gülmezoglu,a R Kush,d P Lumbiganon,e D Moher,f F Rockhold,g I Sim h & E Wager iBackground (available at: http://www.who.int/ A proposed position trialsearch).When researchers embark on a clinical The value of registration goes far The position proposed by the mem-trial, they make a commitment to con- beyond the administrative benefits bers of the WHO Registry Platformduct the trial and to report the findings of having a complete collection of all Working Group on the Reporting ofin accordance with basic ethical prin- trials. Trial registers may facilitate re- Findings of Clinical Trials is that “theciples. This includes preserving the ac- cruitment into clinical trials by raising findings of all clinical trials must becuracy of the results and making both awareness of their existence among made publicly available”. This paperpositive and negative results publicly discusses the principles underlying this potential participants and health-careavailable.1 However, a significant pro- position. Our goal is to contribute to practitioners. They may also lead toportion of health-care research remains the ongoing debate and to foster the more ethical and successful researchunpublished and, even when it is pub- collaboration that is necessary to en- by avoiding the unintentional dupli-lished, some researchers do not make sure that the findings of clinical trials cation of research already under wayall of their results available.2 Selective do not remain hidden from the people elsewhere. From the perspective of thisreporting, regardless of the reason for it, who need access to them. paper, however, the greatest benefit ofleads to an incomplete and potentiallybiased view of a trial and its results.3 trial registration is enhanced transpar- The consequences of publication ency; that is, making it clear which tri- What is a finding?bias and selective reporting have gained als are being conducted so that people can anticipate their results. The language used to discuss the re-the attention of health-care consum- porting of clinical trials usually focusesers, the media and politicians. All have The next step to informed deci- sion-making is to make the findings on “results” – often taken to mean therecognized the impact that undisclosed numerical results of an analysis for aresults can have on the ability of pa- of clinical trials available, since it is knowledge of the findings, rather than specific outcome. For example, a sum-tients, practitioners and policy-makers mary estimate such as a relative risk.to make well-informed decisions about of the existence of a trial, that is likely to have the greatest impact on people However, the reader or user of researchhealth care. Concern over the underre- also needs background informationporting of adverse events, in particular, trying to choose between alternative in- terventions. The arrival and growth of that will allow them to correctly inter-has increased the demand for more electronic publishing and the Internet pret the results for specific outcomes.transparent processes for registering as dissemination tools without page The type and amount of informationclinical trials and reporting their find- or length restrictions has greatly ex- required will depend on the nature ofings. In recent years, trial registration panded the ability of people to make the audience and how it will use thehas become increasingly widely ac- findings available and accessible in information received but will have fourcepted and implemented. There arenow well-established mechanisms to full. The recognition of the need for key elements:register trials, assign unique identifiers reliable evidence to improve healthand make this information publicly care and to facilitate the synthesis of 1. Methodological contextavailable. WHO’s International Clinical the results of research into systematic The user needs to know what the origi-Trials Registry Platform Search Portal reviews has fuelled the demand for ac- nal plans for the trial were and how ithelps bring the data from these regis- cess to the findings of all research, as was actually conducted. Some of theters together, making it much easier have the needs of the numerous other original plans will be available if theto search for the existence of a trial stakeholders in clinical research. trial was registered, but more completea World Health Organization, Geneva, Switzerland.b UK Cochrane Centre, Oxford, England.c Johnson & Johnson, Titusville, NJ, United States of America.d Clinical Data Interchange Standards Consortium (CDISC), Austin, TX, USA.e Khon Kaen University, Khon Kaen, Thailand.f Children’s Hospital of Eastern Ontario, Ottawa, ON, Canada.g GlaxoSmithKline, Philadelphia, PA, USA.h University of California, San Francisco, CA, USA.i Sideview, Princes Risborough, Buckinghamshire, England.Correspondence to Davina Ghersi (e-mail: ghersid@who.int).doi:10.2471/BLT.08.053769(Submitted: 4 April 2008 – Accepted: 8 April 2008 – Published online: 21 April 2008 )492 Bulletin of the World Health Organization | June 2008, 86 (6) 87
    • PerspectivesDavina Ghersi et al. Reporting the findings of clinical trialsinformation should be in the full trial 4. Interpretation in an environment where the end usersprotocol. The actual conduct of the The most contentious aspect of report- of research information now includetrial might, however, differ from that ing the findings of a clinical trial may health-care policy-makers, consumers,planned. Such differences may be valid be whether the report should include regulators and legislators who wantbut should be traceable, preferably via the researchers’ opinions and conclu- rapid access to high quality informa-a clinical trial registry. The methodologi- sions. It is in this section of a report tion in a “user-friendly” format. Incal context for a trial might also include where the objectivity of the research future, researchers may be legally re-information on the primary question is confronted by the subjectivity of quired to make their findings publiclythat the researchers set out to answer; authors who “use their power as own- available within a specific timeframethe hypothesis they were seeking to test; ers of their writing to emphasize one (assuming any legislation created doesthe methods they used to allocate or point of view more than another”. 4 not have escape clauses built in). Inselect participants for the interventions; the United States of America, such Alternatively, linguistic spin is consid-whether any blinding or masking was legislation is already in place (available ered by some to be essential to scientificdone; and a description of the statistical at; http://www.fda.gov/oc/initiatives/ communication, being the opportunitymethods planned and used, including a HR3580.pdf ). This may compromise for scientists to speculate and formulatejustification of the sample size. the ability of researchers to publish new hypotheses.5 Distinguishing opin- trial findings in a peer-reviewed journal. ions and conclusions from advocacy2. Population context or promotion can be difficult, and it Although some journal editors haveThe user needs to consider to whom acknowledged the changing climate may be preferable to leave the users ofthe results of the trial can be applied. around results registration and report- the findings to draw their own conclu-They need to know the characteristics ing (available at: http://www.icmje. sions and form their own opinions of org/clin_trial07.pdf ), they may haveof the population it was initially intended what the findings of a trial mean forto recruit, as well as the characteristics of a conflict of interest in that they will them and their decision-making. probably want the key (and potentiallythe population who were recruited. Inreporting their findings, the researchers most exciting) messages from a trial toshould therefore refer to the original Public availability appear first, and perhaps exclusively, ininclusion and exclusion criteria and pro- their publication. Clinical trial results should be availablevide information on the population that to everyone, regardless of where theyactually took part, describing the extent live. If the results are not made avail- Conclusionto which participants left the trial early able within a reasonable period, the People making decisions about health(with reasons for doing so) and whether reasons for delay and a date by which care need access to knowledge derivedinterventions were delivered as planned. the findings will be available should be from the findings of clinical trial re- submitted to the relevant clinical trials search. Following on from WHO’s3. A result for each outcome registry. It should be the responsibility position that all clinical trials mustA clinical trial will usually consider of researchers to produce their findings be registered,6 it is proposed that theseveral outcomes, each of which might in a format that can be accessed by findings of all clinical trials must bebe measured in multiple ways or at potential users, and it should be the made publicly available. This reportmultiple time points, and there might responsibility of the user to seek these is the start of a consultation processbe more than one way to analyse each results. The Internet could provide the on how this goal of transparency canof these measures. The findings should main means by which this shared re- be achieved, with the intention thatcontain the results for every outcome sponsibility can be satisfied. Access also greater accessibility to the findings ofmeasure and analysis specified in the needs to be considered in terms of the all clinical trials will lead to improve-register entry for the trial, and set out end user’s ability to both view and un- ments in health and health care. Toin the protocol or statistical analysis derstand the information obtained. contribute to the first phase of the con-plan. If any of these results are not Historically, access to the results sultation, please visit: http://www.who.reported, a reason should be provided of a trial has usually been achieved int/ictrp/results/consultation ■for their absence. Any findings from through publication in a peer-reviewedanalyses that were not pre-specified journal. This traditional publication Competing interests: None declared.should be clearly identified as such. model has its limitations, particularlyReferences1. Declaration of Helsinki: ethical principles for medical research involving 4. Horton R. The rhetoric of research. BMJ 1995;310:985-7. PMID:7728037 humans. Ferney-Voltaire, France: World Medical Association; 2000. Available 5. Greenhalgh T. Commentary: scientific heads are not turned by rhetoric. BMJ from: http://www.wma.net/e/policy/b3.htm [accessed on 9 April 2008]. 1995;310:987-8. PMID:77280382. Song F, Eastwood AJ, Gilbody S, Duley L, Sutton AJ. Publication and related 6. Sim I, Chan AW, Gulmezoglu M, Evans T, Pang T. Clinical trial registration: biases. Health Technol Assess 2000;4 (10). transparency is the watchword. Lancet 2006;367:1631-3. PMID:167141663. Chan AW, Hrobjartsson A, Haahr MT, Gotzsche PC, Altman DG. Empirical doi:10.1016/S0140-6736(06)68708-4 evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA 2004;291:2457-65. PMID:15161896 doi:10.1001/jama.291.20.2457Bulletin of the World Health Organization | June 2008, 86 (6) 493 88
    • NEWS FEATUREA BROKEN CONTRACTL AS RESEARCHERS FIND MORE USES ate in May, the direct-to-consumer gene-testing company 23andMe proudly FOR DATA, INFORMED CONSENT HAS announced the impending award of its first patent. The firm’s research on BECOME A SOURCE OF CONFUSION.Parkinson’s disease, which used data from sev-eral thousand customers, had led to a patent on SOMETHING HAS TO CHANGE.gene sequences that contribute to risk for thedisease and might be used to predict its course.Anne Wojcicki, co-founder of the company,which is based in Mountain View, California,wrote in a blog post that the patent would help B Y E R I K A C H E C K H AY D E Nto move the work “from the realm of academicpublishing to the world of impacting lives bypreventing, treating or curing disease”. data in a growing number of ways. Several undermined support for research. In 2004, Some customers were less than enthusiastic. US states, including California, are consider- for example, scandal erupted in the UnitedHolly Dunsworth, for example, posted a com- ing laws that would curtail the way in which Kingdom after parents found out that fromment two days later, asking: “When we agreed researchers, law-enforcement officials and pri- the late 1980s to the mid-1990s doctors andto the terms of service and then when some of vate companies can use a person’s DNA. researchers had removed and stored organsus consented to participate in research, were The research coordinators who develop and tissues from patients — including infantswe consenting to that research being used to consent forms cannot predict how such data and children — without parental consent. Newpatent genes? What’s the language that covers might be used in the future, nor can they laws were passed that required explicit consentthat use of our data? I can’t find it.” guarantee that the data will remain protected. for such collections. The language is there, in both places. To be Many people argue that participants should Then, in 2010, the Havasupai tribe of Ari-fair, the terms of service is a bear of a docu- have more control over how their data are zona won a US$700,000 settlement againstment —  the kind one might quickly click past used, and efforts are afoot to give them that Arizona State University in Phoenix. Indi-while installing software. But the consent form control. Researchers, meanwhile, often bristle viduals believed that they had provided bloodis compact and carefully worded, and approved at the added layers of bureaucracy wrought by for a study on the tribe’s high rate of diabetes,by an independent review board to lay out the protections, which sometimes provide no but the samples had also been used in mental-clearly the risks and benefits of participating real benefits to the participants. The result is illness research and population-genetics stud-in research. “If 23andMe develops intellectual a mess of opinions and procedures that sow ies that called into question the tribe’s beliefsproperty and/or commercializes products or confusion and risk deterring people from par- about its origins. In the settlement, the uni-services, directly or indirectly, based on the ticipating in research. versity’s board of regents said that it wanted toresults of this study, you will not receive any “A lot of times researchers will say, ‘Why “remedy the wrong that was done”.compensation,” the document reads. can’t we just go back to the way it was?’, which The cases illustrate the divide between The example points to a broad problem in was basically that we take these samples and researchers and the public over what peopleresearch on humans — that informed consent people do it for altruistic reasons and every- need to know before agreeing to participate inis often not very well informed (see ‘Reading thing’s lovely,” says Sharon Terry, president of research.between the lines’). Protections for participants the patient-advocacy group Genetic Alliance Many of the recent concerns over consenthave been cobbled together in the wake of past in Washington DC. “That worked in a prior are driven by the rapid growth of genomecontroversies and have always been difficult to age. I don’t think it works today.” analysis. Decades ago, researchers weren’tuphold. But they are proving even more prob- The concept of informed consent was able to glean much information from storedlematic in the ‘big data’ era, in which biomedi- first set out in the Nuremberg Code, a set of tissue; now, they can identify the donor, as wellcal scientists are gathering more information research-ethics principles adopted in the as his or her susceptibilities to many diseases.about more individuals than ever before. Many wake of revelations of torture by Nazi doctors Researchers try to protect the genetic datastudies now include the collection of genetic during the Second World War. But in recent through technological and legal mechanisms,data, and researchers can interrogate those years, a series of mishaps over consent have but both approaches have weaknesses.3 1 2 | NAT U R E | VO L 4 8 6 | 2 1 J U N E 2 0 1 2 89 © 2012 Macmillan Publishers Limited. All rights reserved
    • FEATURE NEWS It is not enough to stripping out any safe’, and now I was in a position where I was programmes. Patients don’t choose toVIKTOR KOEN information that would identify the donor, told, ‘No, that’s not true’,” she says. After participate; rather they are given the chance to such as names and full health records, before Costello’s day in court, in August 2004, the opt out. Patients are asked to sign a ‘consent to the data are stored. In 2008, geneticists showed records remained sealed, but mostly because treat’ form every year. It includes a box that they that they could easily identify individuals the judge did not believe that they would exon- can tick to keep their DNA out of the database. within pooled, anonymized data sets if they had erate JT. The result provided no clarity about That model helps BioVU to collect many more a small amount of identified genetic informa- patient protections. samples, and much more cheaply, than other tion for reference (N. Homer et al. PLoS Genet projects can. 4, e1000167; 2008). And it may become pos- BETTER MODELS FOR CONSENT The opt-out model — which is used in only a sible to identify a person in a public database One solution is to keep genetic information few other places — troubles Misha Angrist, a from other information collected during a separate from demographic data. The BioVU genome policy analyst at Duke University, who study, such as data on ethnic background, loca- databank at Vanderbilt University Medical says that it risks taking advantage of people tion and medical factors unique to the study Center in Nashville, Tennessee, for instance, when they are ill. “Even a routine visit to the participants, or to predict a person’s appearance contains DNA samples from patients treated clinic can be a vulnerable moment, and they’re from his or her DNA. at the hospital — 143,939 people as of 11 June. saying, ‘Would you mind doing this for future Even legal mechanisms have vulnerabilities. The DNA is linked to health records in a sec- generations, to help people just like you?’.” In 2004, Jane Costello, a social psychologist at ond database, called a ‘synthetic derivative’, in And legal challenges have shown the weak- Duke University in Durham, North Carolina, which the data are anonymized and scrambled nesses of opt-out policies. Health officials are was forced to go to court to defend the con- in ways that, its creators say, make it difficult now destroying millions of blood samples fidentiality of patient records from the Great for anyone to work back from the database to taken from newborn babies in Texas and Smoky Mountains Study. The study, which is verify a patient’s identity. Sample-collection Minnesota because the families were not ade- just going into its third decade, examines emo- dates are altered, for example, and some quately informed that the samples, collected to tional and behavioural problems in a cohort records are discarded at random, so that it is screen for specific inherited disorders, would of people who enrolled as adolescents. A par- not possible to know that someone is in the also be used in research. ticipant in the study was testifying against her database just because he or she was treated at Vanderbilt officials and researchers counter grandfather, John Trosper ‘JT’ Bradley, who the hospital. Even researchers who work with that they have run extensive public campaigns had been accused of sexual abuse. JT’s lawyers the data cannot determine whose data they to ensure that people in Nashville are aware subpoenaed the granddaughter’s records from are using. The databank expects to include as of BioVU and are comfortable with the way the study in hope that the information would many as 200,000 individuals by 2014, making it works. They regularly consult a commu- undermine her credibility as a witness. it one of the largest collections of linked genetic nity advisory board about the project. And It meant a major crisis of confidence for and health records in the world. Vanderbilt’s approach actually goes above and Costello. “I was telling 1,400-plus people every But when it comes to consent, BioVU takes beyond what is required by federal law; because time we saw them that ‘your data are absolutely a different approach from many other the synthetic derivative includes de-identified 90 2 1 J U N E 2 0 1 2 | VO L 4 8 6 | NAT U R E | 3 1 3 © 2012 Macmillan Publishers Limited. All rights reserved
    • NEWS FEATURE READING BETWEEN THE LINES within the research community about the need Despite the work that goes into making consent forms clear and detailed, some participants say they are to adopt things like broad consent, but that confused by the wording. hasn’t translated out to the legal community or to the public,” he says. 5. What are the benefits and risks of participating? Another solution might be called ‘radical ......If 23andMe develops intellectual property and/or commercializes products or services, directly or indirectly, honesty’. A US project called Consent to based on the results of this study, you will not receive any compensation. Research, which aims to provide a large pool of user-contributed genomic and health data, has 23andMe received a patent in May for its work on Parkinson’s disease. devised what it calls a ‘Portable Legal Consent’, Some participants did not expect it to seek intellectual-property rights. which allows anyone to upload information From 23andMe’s research consent form www.23andme.com/about/consent/ about him or herself, such as direct-to-con- 2. I have been informed that the purpose of the research is to study the causes of behavioral/medical disorders. sumer genetic results and lab tests ordered through medical providers, to an interface that Many participants in the Medical Genetics at Havasupai study in Arizona in the 1990s were strips the data of identifiers. It makes the data unaware of how broad the research goals were and what kind of studies would be performed. widely available to researchers under broad Courtesy of Pilar Ossorio guidelines, but also requires data donors to go through a much more rigorous consent pro- Re-identification cess than most studies do. The Portable Legal We are quickly learning that with powerful computers and good mathematicians, it is increasingly possible to uniquely identify people inside large data sets ... Researchers will sign a contract in which Consent specifically informs participants that they agree that, even if they were able to identify you, they won’t do it. researchers might be able to determine their identities, but that they are forbidden from In an attempt to be more transparent about privacy risks, the Consent to Research project’s ‘Portable Legal doing so under the project’s terms of use. Consent’ essentially states that although anonymity cannot be guaranteed, researchers must pledge to uphold it. Such approaches could help scientists by Courtesy of John Wilbanks Consent to Research giving them access to a trove of data with no restrictions on use. But the participant pro-data, it doesn’t legally require informed consent Last year, he began a project to sequence the tections system that is in place might not beat all. Last July, the US Department of Health exomes — the protein-coding regions of the ready for such frank dialogues, says Angrist,and Human Services signalled that it might be genome — of 500 children and adults, looking who serves on one of Duke’s institutionalrethinking the rules that exempt de-identified for the genetic causes of intellectual disabilities, review boards.data from the consent requirement, as part of a blindness, deafness and other disorders. Brun- While reviewing a research proposal for abroad overhaul of research ethics regulations. ner proposed allowing participants to choose large biobank, for example, Angrist suggested Irrespective of the outcome, obliterating from three options: they could learn everything that the researchers send the participants anpatient identities has drawbacks. Researchers that researchers had divined about disease sus- annual e-mail explaining how their samplescan’t perform some types of research on the ceptibility; just information relevant to the dis- were being used, and thanking them for donat-scrambled data. Because dates are changed, ease for which their genomes were examined; ing their time and tissue. The review boardstudies on the timing of influenza infections, or no information at all. Ethics reviewers shot voted this suggestion down after its chair arguedfor example, are impossible. And patients can’t down his proposal. “They said that in practice, that e-mailing the patients would create a prob-be told if the research has revealed that they it would be impossible for people to draw those lem in light of the Health Insurance Portabilitycarry individual genetic risks linked to disease. lines, because people giving consent cannot and Accountability Act (HIPAA) — the US law foresee all the possible outcomes of the study,” that guarantees the privacy of health records.FULL DISCLOSURE Brunner says. Instead, everyone participating “The irony is that the HIPAA is supposed toReturning study results to research participants in the studies must agree to learn all medically protect people, and what I was hearing was,has been another thorny issue for consent. Doc- relevant information arising from the analysis ‘We can’t talk to people because we’re too busytors might learn about genetic predispositions of their genomes. As a consequence, Brunner protecting them’,” Angrist says. “Institutions useto disease that are separate from the ailments recently had to tell the family of a child with a informed consent to mitigate their own liabil-that led a patient to participate in the research developmental disability that the child also has ity and to tell research participants about all thein the first place, but it is not clear what they a genetic predisposition to colon cancer. Not all things they cannot have, and all the ways theyshould do with this information. researchers endorse the idea of informing chil- can’t be involved. It borders on farcical.” UK researchers, for example, are forbidden dren about diseases that might affect them as But as patient data become more preciousfrom sharing genetic results with participants. adults. In this case, doctors recommended early to researchers, and as advocacy organizationsBut US research societies, such as the Ameri- screening, and Brunner says, “the family han- become more involved in driving researchcan College of Medical Genetics and Genom- dled it very well; they said, ‘This is not what we agendas and in funding the work, such pater-ics in Bethesda, Maryland, are moving towards anticipated, but it’s useful information’”. nalistic attitudes will probably not survive, saysadopting standards that would encourage the Many of the studies done now ask patients Terry. She adds that technologies that allowpractice for some types of findings, such as to give consent for research linked to particular research participants to control and track howthose that are medically relevant. investigators or diseases. But that means that researchers use their data will soon catch on. Some countries, such as Germany, Austria, researchers cannot pool data from separate These approaches could benefit patients, whoSwitzerland and Spain, are already feeding studies to tackle different research questions. gain transparency and control over their data,back such information. And some clinical Many researchers say that the obvious solu- and researchers, who gain access to richersequencing programmes are considering offer- tion is a broad consent document that gives data sets. “I think we’re going to have to geting patients ‘tiered’ consent, in which people researchers free rein with the data. But many to a place where consenting people becomescan decide whether to be told about their data non-scientists think participants should be able customizable easily through technology, andand how much they want to learn. to control how their data are used, says lawyer we’re not there yet,” Terry says. ■ SEE EDITORIAL P.293 This is what Han Brunner, a geneticist at Tim Caulfield of the University of Alberta inthe Radboud University Nijmegen Medical Calgary, Canada, who has surveyed patients Erika Check Hayden writes for Nature fromCentre in the Netherlands, had hoped to do. about this idea. “There’s an emerging consensus San Francisco, California.3 1 4 | NAT U R E | VO L 4 8 6 | 2 1 J U N E 2 0 1 2 91 © 2012 Macmillan Publishers Limited. All rights reserved
    • With Rise of Gene Sequencing, Ethical Puzzles - NYTimes.com Page 1 of 5 August 25, 2012 Genes Now Tell Doctors Secrets They Can’t Utter By GINA KOLATA Dr. Arul Chinnaiyan stared at a printout of gene sequences from a man with cancer, a subject in one of his studies. There, along with the man’s cancer genes, was something unexpected — genes of the virus that causes AIDS. It could have been a sign that the man was infected with H.I.V.; the only way to tell was further testing. But Dr. Chinnaiyan, who leads the Center for Translational Pathology at the University of Michigan, was not able to suggest that to the patient, who had donated his cells on the condition that he remain anonymous. In laboratories around the world, genetic researchers using tools that are ever more sophisticated to peer into the DNA of cells are increasingly finding things they were not looking for, including information that could make a big difference to an anonymous donor. The question of how, when and whether to return genetic results to study subjects or their families “is one of the thorniest current challenges in clinical research,” said Dr. Francis Collins, the director of the National Institutes of Health. “We are living in an awkward interval where our ability to capture the information often exceeds our ability to know what to do with it.” The federal government is hurrying to develop policy options. It has made the issue a priority, holding meetings and workshops and spending millions of dollars on research on how to deal with questions unique to this new genomics era. The quandaries arise from the conditions that medical research studies typically set out. Volunteers usually sign forms saying that they agree only to provide tissue samples, and that they will not be contacted. Only now have some studies started asking the participants whether 92http://www.nytimes.com/2012/08/26/health/research/with-rise-of-gene-sequencing-ethical-... 8/27/2012
    • With Rise of Gene Sequencing, Ethical Puzzles - NYTimes.com Page 2 of 5 they want to be contacted, but that leads to more questions: What sort of information should they get? What if the person dies before the study is completed? The complications are procedural as well as ethical. Often, the research labs that make the surprise discoveries are not certified to provide clinical information to patients. The consent forms the patients signed were approved by ethics boards, which would have to approve any changes to the agreements — if the patients could even be found. Sometimes the findings indicate that unexpected treatments might help. In a newly published federal study of 224 gene sequences of colon cancers, for example, researchers found genetic changes in 5 percent that were the same as changes in breast cancer patients whose prognosis is drastically improved with a drug, Herceptin. About 15 percent had a particular gene mutation that is common in melanoma. Once again, there is a drug, approved for melanoma, that might help. But under the rules of the study, none of the research subjects could ever know. Other times the findings indicate that the study subjects or their relatives who might have the same genes are at risk for diseases they had not considered. For example, researchers at the Mayo Clinic in Rochester, Minn., found genes predisposing patients to melanoma in cells of people in a pancreatic cancer study — but most of those patients had died, and their consent forms did not say anything about contacting relatives. One of the first cases came a decade ago, just as the new age of genetics was beginning. A young woman with a strong family history of breast and ovarian cancer enrolled in a study trying to find cancer genes that, when mutated, greatly increase the risk of breast cancer. But the woman, terrified by her family history, also intended to have her breasts removed prophylactically. Her consent form said she would not be contacted by the researchers. Consent forms are typically written this way because the purpose of such studies is not to provide medical care but to gain new insights. The researchers are not the patients’ doctors. But in this case, the researchers happened to know about the woman’s plan, and they also knew that their study indicated that she did not have her family’s breast cancer gene. They were horrified. 93http://www.nytimes.com/2012/08/26/health/research/with-rise-of-gene-sequencing-ethical-... 8/27/2012
    • With Rise of Gene Sequencing, Ethical Puzzles - NYTimes.com Page 3 of 5 “We couldn’t sit back and let this woman have her healthy breasts cut off,” said Barbara B. Biesecker, the director of the genetic counseling program at the National Human Genome Research Institute, part of the National Institutes of Health. After consulting the university’s lawyer and ethics committee, the researchers decided they had to breach the consent stipulations and offer the results to the young woman and anyone else in her family who wanted to know if they were likely to have the gene mutation discovered in the study. The entire family — about a dozen people — wanted to know. One by one, they went into a room to be told their result. “It was a heavy and intense experience,” Dr. Biesecker recalled. Around the same time, Dr. Gail Jarvik, now a professor of medicine and genome science at the University of Washington, had a similar experience. But her story had a very different ending. She was an investigator in a study of genes unrelated to breast cancer when the study researchers noticed that members of one family had a breast cancer gene. But because the consent form, which was not from the University of Washington, said no results would be returned, the investigators never told them, arguing that their hands were tied. The researchers said an ethics board — not they — made the rules. Dr. Jarvik argued that they should have tried to persuade the ethics board. But, she said, “I did not hold sway.” Such ethical quandaries grow more immediate year by year as genome sequencing gets cheaper and easier. More studies include gene sequencing and look at the entire genome instead of just one or two genes. Yet while some findings are clear-cut — a gene for colon cancer, for example, will greatly increase the disease risk in anyone who inherits it — more often the significance of a genetic change is not so clear. Or, even if it is, there is nothing to be done. Researchers are divided on what counts as an important finding. Some say it has to suggest prevention or treatment. Others say it can suggest a clinical trial or an experimental drug. Then there is the question of what to do if the genetic findings only sometimes lead to bad outcomes and there is nothing to do to prevent them. “If you are a Ph.D. in a lab in Oklahoma and think you made a discovery using a sample from 15 94http://www.nytimes.com/2012/08/26/health/research/with-rise-of-gene-sequencing-ethical-... 8/27/2012
    • With Rise of Gene Sequencing, Ethical Puzzles - NYTimes.com Page 4 of 5 years ago from a subject in California, what exactly are you supposed to do with that?” asked Dr. Robert C. Green, an associate professor of medicine at Harvard. “Are you supposed to somehow track the sample back?” Then there are the consent forms saying that no one would ever contact the subjects. “If you go back to them and ask them to re-consent, you are telling them something is there,” Dr. Green said. “There is a certain kind of participant who doesn’t want to know,” he added, and if a researcher contacts study subjects, “you are kind of invalidating the contract.” Other questions involve the lab that did the analysis. All labs providing clinical results to patients must have certification ensuring that they follow practices making it more likely that their results are accurate and reproducible. But most research labs lack this certification, and some of the latest genetic tests are so new that there are no certification standards for them. “I find it really hard to defend the notion that we are not going to give you something back because it was not done” in a certified lab, “even though we are 99 percent certain it is correct,” Dr. Jarvik said. Gloria M. Petersen, a genetic epidemiologist at the Mayo Clinic, and her colleagues ran into a disclosure problem in a study of genes that predispose people to pancreatic cancer. The 2,000 study patients had signed consents indicating whether they wanted to know about research findings that might be important to them. But the forms did not ask about sharing findings that might be important to their families, or about what the researchers should do if they discovered important information after the patients were dead. Seventy-three of the study patients, almost all of whom are now dead, had one of three clinically important mutations. One predisposed them mostly to melanoma but also to pancreatic cancer. A second predisposed them primarily to breast and ovarian cancer. The third, a cystic fibrosis gene, can increase the risk of pancreatic cancer and can also be important in family planning. If a man and a woman each have this gene, they have a one-in-four chance of having a child with the disease. When it comes to the family members, “I don’t know what my obligation is,” Dr. Petersen said. “There is an incredible burden to track down the relatives. Whose information is it, and who 95http://www.nytimes.com/2012/08/26/health/research/with-rise-of-gene-sequencing-ethical-... 8/27/2012
    • With Rise of Gene Sequencing, Ethical Puzzles - NYTimes.com Page 5 of 5 has a right to that information?” Dr. Petersen, along with Barbara Koenig, a professor of medical anthropology and bioethics at the University of California, San Francisco, and Susan M. Wolf, a professor of law, medicine and public policy at the University of Minnesota, got a federal grant to study the effects of offering to return the genetic results to the families of those 73 patients. The questions involved are tricky, Dr. Koenig said. Finding patients and their families can be expensive, and labs do not have money set aside for it. How would you find them? Even if they were found, whom would you tell? What if there had been a divorce, or if family members were estranged? “My gut feeling is that there is a moral obligation to return results,” Dr. Koenig said. “But that comes at an enormous cost. If you were in a study 20 years ago, where does my obligation end?” 96http://www.nytimes.com/2012/08/26/health/research/with-rise-of-gene-sequencing-ethical-... 8/27/2012
    • Empirical Study of Data Sharing by Authors Publishing inPLoS JournalsCaroline J. Savage, Andrew J. Vickers*Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, New York, United States of America Abstract Background: Many journals now require authors share their data with other investigators, either by depositing the data in a public repository or making it freely available upon request. These policies are explicit, but remain largely untested. We sought to determine how well authors comply with such policies by requesting data from authors who had published in one of two journals with clear data sharing policies. Methods and Findings: We requested data from ten investigators who had published in either PLoS Medicine or PLoS Clinical Trials. All responses were carefully documented. In the event that we were refused data, we reminded authors of the journal’s data sharing guidelines. If we did not receive a response to our initial request, a second request was made. Following the ten requests for raw data, three investigators did not respond, four authors responded and refused to share their data, two email addresses were no longer valid, and one author requested further details. A reminder of PLoS’s explicit requirement that authors share data did not change the reply from the four authors who initially refused. Only one author sent an original data set. Conclusions: We received only one of ten raw data sets requested. This suggests that journal policies requiring data sharing do not lead to authors making their data sets available to independent investigators. Citation: Savage CJ, Vickers AJ (2009) Empirical Study of Data Sharing by Authors Publishing in PLoS Journals. PLoS ONE 4(9): e7078. doi:10.1371/ journal.pone.0007078 Editor: Chris Mavergames, The Cochrane Collaboration, Germany Received May 8, 2009; Accepted August 5, 2009; Published September 18, 2009 Copyright: ß 2009 Savage, Vickers. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Funding: Supported in part by funds from David H. Koch provided through the Prostate Cancer Foundation, the Sidney Kimmel Center for Prostate and Urologic Cancers and P50-CA92629 SPORE grant from the National Cancer Institute to Dr. P. T. Scardino. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing Interests: The authors have declared that no competing interests exist. * E-mail: vickersa@mskcc.orgIntroduction requested by others for the purpose of academic, non-commercial research.’’[4] Several other highly regarded journals, such as Technology has dramatically improved the ways in which data Nature and Science, have also established clear data sharingcan be stored, analyzed and disseminated. The Internet facilitates requirements[5,6].almost instantaneous access to data and information, and now Unfortunately, the specific mechanisms for data sharing areplays a key role in many fields of medical research. Researchers in often unspecified and the implementation of such policies largelygenomics, for example, rely heavily on publicly available resources untested. Very few journals have an explicit statement regardingsuch as GenBank, a repository of annotated DNA sequences. how their data sharing policies are enforced, and thus it is unclearMany journals now state that submission of original data, such as what options are available to investigators who encounter authorsmicroarray results, into appropriate repositories is a requirement who are unwilling to share. To study how well authors complyfor publication[1]. with data sharing requests, we attempted to acquire original data Following the notable strides of genomics, and other fields such as sets from several researchers who had published in journals withthe open source movement in software, there has been a surge of explicit data sharing policies.awareness regarding data sharing in the biomedical field. The NIHrecently declared that the sharing of data is essential for translating Methodsresearch into knowledge and products that improve health[2]. Assuch, all investigators seeking more than $500,000 in grant support We chose to make our requests from authors who had publishedper year are now required to include a plan for data sharing[3]. in PLoS journals due to their exceedingly clear data sharing Several journals now require authors to share their raw data sets policies: all data relevant to publication should be deposited in anas a condition of publication. The Public Library of Science appropriate public repository, if appropriate, and if not, ‘‘data(PLoS) Journals, a collection of open access journals, specifically should be provided as supporting information with the publishedstates that open access applies to both the scientific literature and paper. If this is not practical, data should be made freely availablethe supporting data: ‘‘publication is conditional upon the upon reasonable request.’’[4]agreement of authors to make freely available any materials and Ten papers from two PLoS publications, PLoS Medicine (n = 4)information associated with their publication that are reasonably and PLoS Clinical Trials (n = 6). None included raw data as part PLoS ONE | www.plosone.org 1 September 2009 | Volume 4 | Issue 9 | e7078 97
    • Journal Data Sharing Policiesof the supporting information. For all papers, the data requested was also documented. If we did not receive a response to our initialwould have allowed us to test a specific pre-specified hypothesis request, a second request was sent after one month.about prediction modeling, a scientific interest of the seniorauthor. Our requests encompassed a wide range of diseases Results(cancer, malaria, HIV and others) and study designs (randomizedcontrolled trials, case-control and cohort studies). We do not think We emailed initial requests for original data to ten correspond-it is plausible that our selection of papers amenable to prediction ing authors. The responses are summarized in Figure 1. Twomodeling could represent a selection of authors importantly more corresponding authors’ email addresses listed on the paper wereor less likely to share data than a random selection from PLoS no longer valid as they had changed institutions since publication.publications. Data requests were emailed by CS to the After extensive investigation, we were able to track down ancorresponding author listed on the manuscript. Requests were updated email address for one author; however, she was away onsimilar in all cases: the stated reason was out of personal interest in maternity leave and we have been unable to establish contact. Anthe topic and the need for original data for master’s level updated email address for the second author was never found. Ofcoursework. the remaining eight investigators with functioning email addresses, With respect to ethical considerations, we pre-specified that any four authors replied that sharing their data was not possible, threerequest must not create undue work for the investigators and that authors did not respond, and another asked for further detailsany data received as part of a successful request would be analyzed regarding our request.as planned, the results of which would be sent to the original Of the four authors refusing to share their data in response to ourinvestigators. In the event that we did obtain data, we would not initial request, two authors did not give a reason, one stated he wasguarantee to publish, but would promise a co-authorship on currently too busy, and the fourth said that he no longer hadanything that was published. jurisdiction over the data as he had changed institutions. We sent a If the response to our initial request was ‘‘no’’, the stated reason follow up email to these four investigators and reminded them ofwas documented. A second request was then made by AV, who PLoS’s data sharing policies. From the two authors who initiallyidentified himself as an Editorial Board member for PLoS ONE. refused without providing a reason, one author said that we couldThe email included an explicit reminder of PLoS’s editorial submit a formal and lengthy proposal to the appropriate trialists’ group;policies, and stated clearly that, as the authors had published in a the other apologized that he had not been aware of the journal’s dataPLoS journal, they were required to provide free and open access sharing policy and, as he was forbidden to pass the data on to thirdto data sets used for analysis. The response to the second request parties, he would not have published in PLoS had he been aware of thisFigure 1. Summary of responses to the 10 initial requests for raw data.doi:10.1371/journal.pone.0007078.g001 PLoS ONE | www.plosone.org 2 September 2009 | Volume 4 | Issue 9 | e7078 98
    • Journal Data Sharing Policiesrequirement. To the investigator who said he was currently too busy, coded in such a way as to ensure subject’s anonymity. There are alsowe responded that we could wait to receive the data; his response was concerns about authorship and future publishing opportunities, andthat it was simply too much work. Our request to the investigator who a natural desire to retain exclusive access to data that may havehad changed institutions was forwarded to the appropriate researchers taken many years of hard work to collect. We have published somestill working at the institution. They claimed it would take too long to simple guidelines to protect the rights of investigators to exploit theirorganize and annotate the data set and would be too much work. data, such as an embargo period between publication and release of Three authors simply did not respond to our initial request. raw data.[9] Other investigators may fear that future researchersAfter a follow up email to these authors, one author replied that he will undermine the original authors’ conclusions, either bywas in favor of sharing data in general, but wished to conduct uncovering an error or by employing alternative analytic methods.more analyses before sharing. We did not receive a reply from the We believe that it is in the best interests of science to have a robustother two authors. debate on the merits of particular research findings. However, it is One author replied to our initial request by asking for more only right and fair that investigators should have a say in the use ofinformation about our proposed analysis. After correspondence to their data, so we have previously recommended[9] that originaldiscuss our analytic plan and the particular variables needed, we investigators should be included as co-authors on any publicationreceived a well annotated dataset within a few hours. resulting from re-analysis of raw data or, alternatively, be offered the opportunity to provide a response and commentary.Discussion Some of the authors we contacted claimed that it would take ‘‘too much work’’ to provide us with raw data. This suggests that We requested raw data from ten corresponding authors and researchers do not always develop a clean, well annotated data setreceived only one data set. Although our sample was small, our for analyses associated with a particular scientific paper. Thisresults are clear: explicit data sharing policies in journals do not strikes us as a problem in and of itself. We also found that that, aslead authors to share data. Our initial intention was to see if the time passes from the date of publication, authors sometimes loserates of data sharing were any higher in journals with data sharing access to the original data or switch institutions and withoutpolicies compared to those without. However, our initial results maintaining their the email address listed on the paper. A simplewere sufficiently clear that a comparison was deemed unnecessary. solution to both of these problems would be to require authors to We are aware of only one prior study that described the submit de-identified data sets to journals or public repositories atdifficultly of obtaining original data sets from published authors. the time of publication.Wicherts et al. wrote to more than one hundred authors who had Data was requested from articles published in only two journals,published in American Psychological Association (APA) journals to PLoS Medicine and PLoS Clinical Trials, and it is possible thatrequest data for reanalysis[7]. They experienced similar difficulty authors who publish in other journals are more likely to share datain obtaining original data, receiving only one quarter of the sets. This seems unlikely as open access journals with explicit datarequested data sets. Although our study was considerably smaller, sharing policies are likely to attract authors who support greaterit is importantly different for three reasons. First, the APA’s openness to scientific data. Our method of requesting data sets waspolicies are not as explicit as those from PLoS. The APA’s intentionally left vague as we were interested as much in theguidelines state that authors should share their data ‘‘provided that investigators responses as acquiring the actual data set; perhaps athe confidentiality of the participants can be protected and unless more detailed request would have garnered more positive responses.legal rights concerning proprietary data preclude their release,’’ a Again this is unlikely, as no authors claimed that our request wasstipulation that may have permitted many authors to avoid sharing unreasonable, inappropriate, or lacking in sufficient details, andunder the guise of patient confidentiality or legal rights[8]. Second, further information was provided upon request. A final limitationAPA policies don’t provide an incentive for original authors to was that our sample was small. Yet the results were so striking – onlyshare data, as data can only be used ‘‘to verify the substantive one in ten authors complied - that we can be fairly confident the trueclaims through reanalysis.’’ [8] Thus original authors stand only to rate of compliance is far from 100%. Indeed, it would be highlylose by having their conclusions challenged. In contrast, we asked unlikely to obtain only 1 in 10 data sets if the true rate of datainvestigators for data not to challenge their original conclusions, sharing was even as low as 50% (p = 0.01 from binomial test).but to test a new hypothesis. Third, the majority of data setsrequested by Wicherts et al. were survey research; we requested In conclusion, our findings suggest that explicit journal policiesdata from medical studies and clinical trials - research that has a requiring data sharing do not lead to authors making their datadirect and immediate relevance to individuals suffering ill-health. sets available to independent investigators. We acknowledge that there are numerous real and perceivedimpediments to sharing raw data. One of us has previously written Author Contributionsextensively on this issue[9]. Concerns about patient privacy are Conceived and designed the experiments: CS AV. Performed thefrequently cited, although data from most studies can easily be experiments: CS. Analyzed the data: CS AV. Wrote the paper: CS AV.References 1. Science Magazine. Authors and Contributors, General Policies. Available from: 6. Science Magazine. General Information for Authors. Conditions of Acceptance. http://www.sciencemag.org/help/authors/policies.dtl. Accessed March 4, 2009. Available from: www.sciencemag.org/about/authors/prep/gen_info.dtl# 2. National Institutes of Health. NIH Data Sharing Policy. Available from: http:// datadepwww.sciencemag.org/about/authors/prep/gen_info.dtl#datadep. Ac- grants.nih.gov/grants/policy/data_sharing/. Accessed March 4, 2009. cessed March 4, 2009. 3. National Institutes of Health. NIH Data Sharing Policy Brochure. Available from: 7. Wicherts JM, Borsboom D, Kats J, Molenaar D (2006) The poor availability of http://grants.nih.gov/grants/policy/data_sharing/data_sharing_brochure.pdf. psychological research data for reanalysis. Am Psychol 61: 726–728. Accessed March 4, 2009. 8. (2001) Ethical standards for the reporting and publishing of scientific 4. PLoS Medicine Editorial and Publishing Policies. 13. Sharing of Materials, information. Publication manual of the American Psychological Association. Methods and Data. Available from: http://journals.plos.org/plosmedicine/ Washington, DC. pp 387-396. policies.php#sharing. Accessed March 4, 2009. 9. Vickers AJ (2006) Whose data set is it anyway? Sharing raw data from 5. Nature Journal. Authors & Referees, Editorial Policies, Availability of data & randomized trials. Trials 7: 15. materials. Available from: http://www.nature.com/authors/editorial_policies/ availability.html. Accessed March 4, 2009. PLoS ONE | www.plosone.org 3 September 2009 | Volume 4 | Issue 9 | e7078 99
    • PerspectiveOpen Clinical Trial Data for All? A View from RegulatorsHans-Georg Eichler1*, Eric Abadie1,2, Alasdair Breckenridge3, Hubert Leufkens1,4, Guido Rasi1 ´curite Sanitaire des Produits de Sante (AFSSAPS) Saint-Denis, France, 3 Medicines1 European Medicines Agency (EMA), London, United Kingdom, 2 Agence Francaise de Se ¸ ´ ´and Healthcare products Regulatory Agency (MHRA), London, United Kingdom, 4 Medicines Evaluation Board (CBG-MEB), Den Haag, The Netherlands In this issue of PLoS Medicine, Doshi and research is a means to open healthcolleagues argue that the full clinical trial Linked Policy Forum research.reports of authorized drugs should bemade publicly available to enable inde- This Perspective discusses the fol- lowing new Policy Forum published Why Trial Data Should Not Bependent re-analysis of drugs’ benefits and in PLoS Medicine: Open for Allrisks [1]. We offer comments on their callfor openness from a European Union drug Doshi P, Jefferson T, Del Mar C There are indeed many good argumentsregulatory perspective. (2012) The Imperative to Share for unrestricted and easy access to full For the purpose of this discussion, we Clinical Study Reports: Recommen- RCT data. Yet, simply uploading all trialconsider ‘‘clinical study reports’’ to com- dations from the Tamiflu Experi- data on a website would entail its ownprise not just the protocol, summary ence. PLoS Med 9(4): e1001201. problems.tables, and figures of (mostly) randomized doi:10.1371/journal.pmed.1001201 First among those is the issue ofcontrolled trials (RCTs), but the full ‘‘raw’’ personal data protection or patient confi-data set, including data at the patient level Peter Doshi and colleagues de- dentiality, a concept that is very different[2]. We limit discussion to data on drugs scribe their experience trying and failing to access clinical study from commercial confidentiality. There is afor which the regulatory benefit-risk as- small risk that personal data could inad-sessment has been completed. reports from the manufacturer of Tamiflu and challenge industry to vertently be publicized. There is also a defend their current position of small risk that an individual patient couldWhy Trial Data Should Be Open RCT data secrecy. be identified from an anonymized dataset,for All for example, from trials in ultra-rare First and foremost, we agree with diseases. Achieving an adequate standardDoshi et al. that clinical trial data should and observational study data sets for of personal data protection is not annot be considered commercial confiden- better, individualized therapeutic decisions insurmountable obstacle, though, andtial information; most patients enrolling (L. Perez-Breva, personal communica- proposals for best practice for publishingin clinical trials do so with an assumption tion). raw data are available [2]. However,of contributing to medical knowledge, Large, information-rich datasets are implementation is not straightforward,and ‘‘non-disclosure of complete trial needed to support the computer science standards will need to be agreed upon upresults undermines the philanthropy’’ and artificial intelligence research re- front, and data redaction may in a few[1]. quired to develop and test these applica- cases be resource intensive. The potential benefits for public health tions. Developing such tools is usually not Our second caveat is likely moreof independent (re-)analysis of data are not a priority for, and often beyond the contentious. We do not dispute thatdisputed and, in an open society, trial capabilities and resources of, even the financial conflicts of interests (CoIs) maysponsors and regulators do not have a largest pharmaceutical companies. These render analyses and conclusions ‘‘vulner-monopoly on analyzing and assessing drug endeavors might best thrive in an envi- able to distortion’’ [1]. However, sur-trial results. Yet, the different responsibil- ronment that invites research from be- rounding the ongoing debate over spon-ities of regulators and independent ana- yond the current stakeholders in health sor-independent analyses is an implicitlysts have to be acknowledged. Regulators, [5]. Making rich datasets available for assumption that ‘‘analysis by independentunlike academicians, are legally obliged totake timely decisions on the availability of Citation: Eichler H-G, Abadie E, Breckenridge A, Leufkens H, Rasi G (2012) Open Clinical Trial Data for All? Adrugs for patients, even under conditions View from Regulators. PLoS Med 9(4): e1001202. doi:10.1371/journal.pmed.1001202of uncertainty. Published April 10, 2012 Going beyond the merits of indepen- Copyright: ß 2012 Eichler et al. This is an open-access article distributed under the terms of the Creativedent meta-analysis, we foresee other, Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium,potentially more important benefits from provided the original author and source are credited.public disclosure of raw trial data. For Funding: No specific funding was received for writing this article.example, RCT datasets enabled the de- Competing Interests: All authors work for drug regulatory agencies. The authors have declared that no othervelopment of predictive models for patient competing interests exist. The views expressed in this article are the personal views of the authors and may not be understood or quoted as being made on behalf of or reflecting the position of the regulatory agencies, orselection to appropriate treatments [3,4]. the committees or working parties of the regulatory agencies the authors work for.Taking this notion a step further, we Abbreviations: CoI, conflict of interest; RCT, randomized controlled trialenvisage machine learning systems thatwill allow clinicians to match a patient’s * E-mail: hans-georg.eichler@ema.europa.euelectronic health record directly to RCT Provenance: Commissioned; externally peer reviewed. PLoS Medicine | www.plosmedicine.org 1 April 2012 | Volume 9 | Issue 4 | e1001202 100
    • groups’’ is somehow free from CoIs. We level of scrutiny as sponsor-conducted of engagement follow the principle ofbeg to differ. Personal advancement in analyses do. maximum transparency whilst respect-academia, confirmation of previously de- Finally, re-analysis of trial data could be ing the need to guarantee data privacyfended positions, or simply raising one’s misused for competitive purposes. and to avert the potential for misuseown visibility within the scientific commu- [10]. Others have come up withnity may be powerful motivators. In a The Way Forward? broadly similar proposals [11]. Con-publish-or-perish environment, would the ceivably, analogous principles (e.g.,finding of an important adverse or favor- We consider it neither desirable nor data sharing only after receipt of a fullable drug effect at the p,0.05-level be realistic to maintain the status quo of analysis plan) could be applied tomore helpful to a researcher than not limited availability of regulatory trials regulatory RCT data [1].finding any new effects? Will society data. What is needed is a three-prongedalways be guaranteed that a finding that approach: Moreover, we take it as self-evident thatis reported as ‘‘confirmatory’’ was not the the same standard of openness shouldresult of multiple exploratory re-runs of a 1. Develop and agree upon adequate apply to all (drug) trial data, whetherdataset? We submit that analyses by standards for protection of personal sponsored by industry, investigator-initiat-sponsor-independent scientists are not data when publicizing RCT datasets. ed, or sponsored by public grant-givinggenerated in a CoI-free zone and, more Most stakeholders will likely agree that bodies. Likewise, the same standard ofoften than not, ego trumps money. adequate standards of data protection third party scrutiny