The ACP MS Repository: A Case Study

tranSMART Community Meeting, Nov. 2013

Stephen Wicks, Ph.D.
The ACP Repository: a Case Study

tranSMART Community Meeting 5-7 Nov 13 - Session 5: The Accelerated Cure Project MS Repository Dataset as a Case Study


The Accelerated Cure Project MS Repository Dataset as a Case Study
Stephen Wicks, Rancho Biosciences
The Accelerated Cure Project for Multiple Sclerosis is a non-profit focused on accelerating research for a cure for MS. One of their major projects over the last decade has been the generation of the ACP Repository, a collection of biological samples and associated clinical data from approximately 3200 case or control participants. More than 75 studies are underway or have been completed, in both industry and academic settings, using samples from the ACP Repository. Rancho BioSciences has partnered with ACP through Orion Bionetworks to curate and load these datasets and associated clinical CRFs into tranSMART. In this talk, we will describe the rich ACP dataset and discuss our experiences in preparing the data for analysis in tranSMART.

Published in: Health & Medicine, Technology
  • Ethos of ACP = removing barriers to research. The first challenge was to create a new type of infrastructure for research to get conducted within. They say the journey of a thousand miles begins with a single step – for us the beginning of the journey was with a single study.

    1. The ACP MS Repository: A Case Study. tranSMART Community Meeting, Nov. 2013. Stephen Wicks, Ph.D.
    2. What is Multiple Sclerosis?
        • Chronic inflammatory/demyelination disorder affecting the CNS (about 0.1%).
        • Leading cause of neurological disability in young adults.
        • Symptoms are variable and significant. They include vision, cognition, locomotion, pain, disorientation, dexterity, mood, bowel/bladder control, and others.
        • Generally progressive, but progression is idiosyncratic (CIS → RRMS → SPMS vs. CIS → PPMS, etc.).
        • Complex etiology.
    3. What is the cost of MS?
        • Difficult and costly to diagnose (MRI, symptom variability leads to extensive differential diagnosis).
        • Treatments can slow progression, but are expensive.
        • Many different drugs exist, but patient stratification for maximum efficacy and minimum side effects is non-existent. “Roll the dice.”
        • Often strikes early in life, and is a life-long disability.
        • Average age at diagnosis is about 30; 5% are diagnosed before 16.
    4. Orion Bionetworks
        • ACP is a founding member of Orion.
        • Orion seeks to cure MS by harnessing the power of computational modeling of disease progression.
        • ACP will provide its data to Orion in tranSMART to facilitate this goal.
        • Rancho BioSciences will curate and harmonize the ACP data for Orion.
    5. ACP and the MS Repository
        • Founded in 2001 by an MIT entrepreneur with MS.
        • ACP MS Repository started in 2006. The goal was to identify the cause of MS.
        • ACP MS Repository enrollment shut down this year. Approximately 3200 participants enrolled.
        • Biosamples, demographics, medical history, etc.
        • Research data
        • OPT-UP
    6. Repository Enrollment Status (6/21/2013)
        • 3,220 subjects enrolled; 467 longitudinal visits completed
        • Subject breakdown (Subjects: 3220; Cases: 2481; Controls: 739; MS: 1787):

                           Treatment Status               Time Since Diagnosis
          Group  Subjects | TX+  TX Naïve  TX-  Pending | < 5 yrs  5-10 yrs  > 10 yrs  Pending
          RRMS       1436 | 945       146  229      116 |     675       388       291       82
          SPMS        234 | 164        53   13        4 |      27        58       149        0
          PPMS        117 |  46        38   31        2 |      48        37        32        0
          CIS          88 |  19         2   62        5 |      74         5         5        4
          TM          175 |  12        17  131       15 |     106        32        24       13
          NMO         383 | 200        66   39       78 |     252        52        13       66
          ON           16 |   1         1   12        2 |      12         1         0        3
          ADEM         32 |   1         6   20        5 |      24         5         1        2

        • DNA, RNA, Plasma, Serum, PBMCs + data from 52-page CRF
    7. The ACP Engine (diagram labels)
        • “Matchmaker” Database Graphical User Interface: Allowing MS Researchers Worldwide to Explore the ACP Repository Database
        • MS Discovery Forum: Reviewing Developments in the MS Field; Communicating with MS Researchers
        • ACP Repository: $13 million invested; 3200+ participants
        • Biosamples & Datasets: 77 sets of biosamples + data; (b,m)illions of datapoints, from 36 studies so far
        • MS Researchers Worldwide: Academia & Industry
        • Insights and Results: Mechanisms, Diagnostics, Causes, Treatments
    8. ACP MS Repository
        • Open-access collection of highly annotated blood-derived samples plus data from MS, related diseases, & control subjects gathered from 2006-2013.
        • Requirement for research data derived from samples to be deposited (with a provision for IP protection).
        • Contributes to MS+ research in many ways:
            - Enables studies that might not be conducted otherwise (academic & commercial).
            - Creates a common results database for studies from multiple bio-analytical techniques on overlapping sets of subjects.
        • Approximately 3200 participants.
        • “Working with them (ACP) allowed us to obtain critical samples and confirm our results for only $20,000. If I had to obtain these samples from scratch, it would have cost $1 million and added 5 years to the project.” Thomas M. Aune, PhD, Molecular Biology, Vanderbilt University School of Medicine (from Scientific American)
    9. Case Report Form (CRF): Curation challenges
    10. ACP Case Report Form
        • 48-page (first visit) and 38-page (second visit) complete clinical workup
        • Form completed with the assistance of a clinical research associate over a several-hour interview (with sample draw and lab workup)
        • Broad data: 80 distinct tables in an SQL database
        • Deep data: in flat data files, more than 20 million cells
    11. CRF Sample Fields
        • Illustrates some of the problems associated with curating this dataset.
        • Study drugs of trial enrollment: “CS-0777”, or “BG00012 (FUMARATE) OR PLACEBO”.
        • 103 distinct textual responses.
        • “Betseron”, beta-seron, betaseron, BETASERON, etc.
        • “First Drug”, “Second Drug”, etc.: drug ordinal order was meaningless.
        • No consistent measure of drug frequency.
        • Inappropriate (sometimes lethally so) units.
    12. DMD Curation Solutions
        • We applied drug ontologies and mapping vocabularies where needed.
        • We repaired and consolidated dose, frequency, etc. to a single measure with 3 values (high, standard, low).
        • We re-formatted the data to eliminate the ambiguous cardinal ordering of reporting.
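The two curation steps on this slide (mapping spelling variants to one drug name, and collapsing dose into three levels) might look roughly like the sketch below. It is an illustration only, not the pipeline used for the ACP data: the variant table holds just the Betaseron spellings quoted in the slides, and `dose_level`, its `standard` reference dose, and the 10% `tolerance` are hypothetical choices.

```python
import re
from typing import Optional

# Hypothetical variant table: only the Betaseron spellings come from the slides.
CANONICAL_DRUGS = {
    "betaseron": "Betaseron",
    "betseron": "Betaseron",
}

def normalize_drug(raw: str) -> Optional[str]:
    """Lower-case, drop everything except letters and digits, then look up the name."""
    key = re.sub(r"[^a-z0-9]", "", raw.lower())
    return CANONICAL_DRUGS.get(key)  # None means "needs manual review"

def dose_level(dose: float, standard: float, tolerance: float = 0.1) -> str:
    """Collapse a numeric dose into 'low', 'standard', or 'high'.

    `standard` is an assumed reference dose in the same unit as `dose`;
    `tolerance` is an assumed relative band around it.
    """
    if dose < standard * (1 - tolerance):
        return "low"
    if dose > standard * (1 + tolerance):
        return "high"
    return "standard"
```

Responses that return `None` from `normalize_drug` (for example the study-drug codes) would still need a curator's decision; the lookup only removes the mechanical spelling noise.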
    13. CRF Sample Fields: Multiple Drugs (Observations) were addressed with…
    14. VISIT_NAME application
    15. Controlled Vocabularies (sports)
        • ~5000 responses
        • 779 distinct sports reported
        • When filtered by: “ski”, 29; “gym”, 45; “walk”, 30; “jog”, 17; “run”, 40
    16. Controlled Vocabularies (sports): All sports mapped to a 29-term vocabulary.
    17. Controlled Vocabularies (pets)
        • ~6500 pets reported
        • 600 distinct pets reported
        • When filtered by “dog”, 112; however, this misses misspellings (“diog”, “dot”, “pubs”), dog-like pets (“wolf”, “half-wolf”, “mutt”), and breeds (“poddle”, “poodle”, “Afghan Hound”, etc.)
        • 59 additional dog-like entries
    18. Controlled Vocabularies (pets): All pets mapped to a 31-category controlled vocabulary.
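A small sketch of why a plain substring filter undercounts, and how an explicit mapping table to a controlled vocabulary recovers the missed entries. The mapping entries and the single "Dog" category are assumptions for illustration (the real 31-category vocabulary is not given in the slides), as are the function names.

```python
from collections import Counter

# Hypothetical mapping from free-text responses to a controlled term.
# The raw spellings are quoted from the slide; the target category is assumed.
PET_VOCAB = {
    "dog": "Dog", "diog": "Dog", "mutt": "Dog",
    "poodle": "Dog", "poddle": "Dog", "afghan hound": "Dog",
    "wolf": "Dog", "half-wolf": "Dog",
}

def substring_hits(responses, needle):
    """Naive filter: keep responses that contain the needle."""
    return [r for r in responses if needle in r.lower()]

def map_to_vocab(responses, vocab, default="Unmapped"):
    """Count responses per controlled term; unknown responses go to `default`."""
    return Counter(vocab.get(r.strip().lower(), default) for r in responses)

responses = ["Dog", "diog", "poodle", "Afghan Hound", "half-wolf", "cat"]
naive = substring_hits(responses, "dog")       # finds only the literal match
counts = map_to_vocab(responses, PET_VOCAB)    # groups variants under one term
```

Anything that lands in the `Unmapped` bucket is a candidate for a new vocabulary entry or a manual decision.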
    19. Medication Curation Challenges
        • >10,000 medications listed.
        • 2703 distinct medications listed.
        • Mapped these to 614 real medications (e.g. Amitriptyline).
        • This was split into two tables:
            - Continuing Medications (541 entities)
            - Stopped Medications (317 entities)
        • VISIT_NAME was used to represent distinct observations across the whole study.
        • Truly longitudinal measures were reified in the tree hierarchy in the data mapping file.
        • Example of reported spellings: Amitriptaline, Amitriptylin, Amitriptyline, Amitriptyline HCL, Amitroptyline, Amitryetyline, Amitrypatiline, Amitryptailine, Amitryptaline, Amitryptilin, Amitryptiline, Amitryptilline, Amitryptylene, Amitryptyline
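One way to pre-screen spelling variants like these before manual review is approximate string matching against a list of known medication names. The sketch below uses `difflib.get_close_matches` from the Python standard library; the slides do not say which method was actually used, and the `known` list and the 0.8 cutoff are assumptions.

```python
import difflib
from typing import List, Optional

def canonical_medication(raw: str, known: List[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the closest known medication name, or None if nothing is close enough."""
    by_lower = {name.lower(): name for name in known}
    matches = difflib.get_close_matches(
        raw.strip().lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None

# Hypothetical reference list; a real one would come from a drug vocabulary.
known = ["Amitriptyline", "Gabapentin"]
```

A high cutoff keeps unrelated names apart but will not catch every variant, so the output is best treated as a suggestion for a curator rather than a final mapping.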
    20. ACP Repository Tree
    21. Date and Time Coding
        • Dates in multiple formats: 15/12/2001, 15/Dec/2001, Dec-2001, 2001, Dec./2001, 12/2001, --/--/-----/2001, ------------/12/2001
        • All dates converted to periods (Months, Years, or Days) prior to the relevant blood draw date.
        • Dates were represented by International Standard ISO 8601, i.e. YYYY-MM-DD (e.g. 2001-12-15)
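The date handling described here can be sketched as follows: try each known format in turn, remember how precise the value was, emit an ISO 8601 string at that precision, and compute the period before the blood draw. The function names, the rule that the period unit follows the date's precision, and the handling of only the unmasked formats are assumptions; the `%b` month abbreviations also assume an English locale.

```python
from datetime import date, datetime

# (strptime format, precision) pairs for the unmasked formats listed on the slide.
FORMATS = [
    ("%d/%m/%Y", "day"),
    ("%d/%b/%Y", "day"),
    ("%b-%Y", "month"),
    ("%b./%Y", "month"),
    ("%m/%Y", "month"),
    ("%Y", "year"),
]

def parse_partial(raw):
    """Return (date, precision), or None when no format matches."""
    for fmt, precision in FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date(), precision
        except ValueError:
            continue
    return None  # masked or unknown formats need separate handling

def to_iso(d, precision):
    """ISO 8601 string truncated to the known precision."""
    if precision == "day":
        return d.isoformat()
    return d.strftime("%Y-%m") if precision == "month" else d.strftime("%Y")

def period_before(d, precision, draw):
    """Period between an event and the blood draw, in a unit matching the precision."""
    if precision == "day":
        return (draw - d).days, "days"
    if precision == "month":
        return (draw.year - d.year) * 12 + (draw.month - d.month), "months"
    return draw.year - d.year, "years"
```

For example, `parse_partial("Dec-2001")` yields a month-precision value that `to_iso` renders as `2001-12`, so no day is invented for a date that never had one.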
    22. Repository Usage
        • 77 studies ongoing or completed
        • 36 studies have returned data to ACP
        • Data types:
            - Low-D biomarker (antibodies, metabolites, serum markers of inflammation, etc.)
            - Low-D genotype data
            - High-D SNP/GWAS data
            - Gene-expression studies
            - Whole-genome sequencing (2 distinct studies)
        • Study types:
            - Etiology
            - Diagnostics
            - Disease activity biomarkers
    23. Repository Usage
    24. Research Data Curation Challenges
        • Few guidelines provided to researchers for data formatting or treatment
        • Often little or no documentation describing how the data was generated or handled (e.g. raw vs. normalized, transformations)
        • Load study meta-data (contact info, description, etc.) at the node level
    25. Sample Study Results
        • Biogen gene expression study: Designed to identify gene-expression profiles that discriminate progressive forms of MS from relapsing-remitting forms of the disease.
    26. Future Directions
        • Rancho BioSciences is providing guidance to ACP for data-collection practices going forward (e.g. OPT-UP).
        • We loaded the clinical data and 6 sample study datasets into an Oracle-based tranSMART instance that we host in-house for QC purposes.
        • The full dataset is slated to be loaded into a 1.1 PostgreSQL-based tranSMART instance (hosted by Recombinant by Deloitte for Orion).
        • This and other data sources (Inst. For Neuroscience at B&W) will be analyzed and modeled by Orion.
    27. Thanks for your time! Questions?
