Traditional classification of human disease has been based on pathological analysis and clinical observation. However, such approaches are often flawed: many diseases are significantly heterogeneous in their etiology while presenting with similar symptoms and pathologies. This means that in any population of 'diseased' individuals we see significant variation in response to treatment. As well as being unsatisfactory from a patient health perspective, this has a profound impact on clinical trials: if the mechanism of the drug being tested is only effective in a subpopulation of disease sufferers, a poorly designed trial could be abandoned because of a perceived lack of efficacy.
Access to NGS, omic and other high-throughput bioassay techniques, combined with detailed patient information and advanced informatics, is allowing us to phenotype heterogeneous disease populations more precisely. These approaches will lead to improved prognosis and response to treatment, as well as identifying new diagnostic and disease progression markers for clinical trials.
Although patient stratification, and the molecular understanding of disease etiology behind it, is increasingly important in bringing new therapies to market, most clinical studies run by Pharma are RCTs to establish the safety/efficacy of NMEs: they are not designed to support biomarker discovery. The cost of running sufficiently powered, longitudinal, cross-sectional cohort observation studies, with sufficient depth of molecular data collection, is often prohibitive for single organisations. So, given their size and complexity, and that they are often not associated with any specific NME, these studies are increasingly conducted in partnership with other pharma, academia, SMEs, regulators and patient groups as pre-competitive endeavours.
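The effect of an unstratified population on a trial can be put in rough numbers. The sketch below is an illustration, not taken from the slides: it assumes a two-arm trial with a normally distributed endpoint, equal variance in both arms, and a drug that has a standardised effect only in a fraction of patients (the responder fraction). The function names and the example values are hypothetical.

```python
import math
from statistics import NormalDist


def diluted_effect(responder_fraction: float, effect_in_responders: float) -> float:
    """Average effect seen across the whole population when only a fraction responds."""
    return responder_fraction * effect_in_responders


def per_arm_sample_size(effect: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Normal-approximation sample size per arm for a two-sided, two-sample comparison.

    `effect` is the standardised mean difference (difference / standard deviation).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect ** 2)


# Assumed illustration: effect of 0.5 SD in responders, who make up 30% of patients.
stratified_n = per_arm_sample_size(0.5)
unstratified_n = per_arm_sample_size(diluted_effect(0.3, 0.5))
```

Under these assumptions the required sample size grows with the inverse square of the responder fraction, so a trial sized for the responder effect is heavily underpowered when run in the unstratified population.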
“Non-competitive” collaborative research for EFPIA companies. Competitive calls to select partners of EFPIA companies (IMI beneficiaries). Open collaboration in public-private consortia (data sharing, dissemination of results).
So what are the data challenges? E.g. ABPI/MRC RA-Map – over 26 data collection centres.
Tell story about the trade-off between costs of services and patients in UBIOPRED.
Talk about the value of data beyond the lifetime of the project, e.g. the Framingham Heart Study, started in 1948.
How do we provide a cost-effective knowledge management platform to IMI and similar projects?
Fits with emerging R&D strategies in Pharma
Project Name | Therapeutic Area | Data Type Summary (eTRIKS support)
IMI U-BIOPRED | Severe Asthma | Clinical, Animal Models, Transcriptomics, genetics, metabonomics, lipidomics
IMI OncoTrack | Colon Cancer | Clinical, Next Generation Sequencing, Protein Arrays, Cell-based Assays, Animal Models, Cancer Stem Cells
IMI ABI RISK | Biopharmaceutical Risk Assessment | Clinical observations, Legacy cohorts, Cell-based assays, Gene Expression, Long-term studies
IMI PREDECT | Prostate, Breast and Lung Cancer | Tissue Micro-Arrays, In Vitro Culture Models, GEMM Animal Models
IMI ND4BB | Combating Antimicrobial Resistance | Pharmacology, In vivo, Clinical, omics
MRC-ABPI RA-MAP | Rheumatoid Arthritis | Clinical, transcriptomics, proteomics, metabonomics, cell based assays, flow cytometry, genetics
IMI NEWMEDS | Depression & Schizophrenia | Clinical, Pre-Clinical
IMI Predict-TB | Tuberculosis | Clinical, Pre-Clinical PK/PD
IMI Biovacsafe | Vaccine Immunogenicity | Clinical, Transcriptomics, Metabonomics, protein assays
IMI QuIC-ConCePT | Oncology Imaging biomarkers | Animal model data management
Data Search & Analysis: the Dataset Explorer enables hypothesis generation and refinement across the experimental and published knowledge in the system. It incorporates the i2b2, Lucene and GenePattern applications, and enables the connection of many open & commercial analytical tools.
tranSMART is the core HBase
Clinical research centre. OpenClinica
Enables search of differentially expressed genes across studies.
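As an illustration of what a cross-study search for differentially expressed genes involves, here is a minimal sketch. It is not the tranSMART search API: the record layout, the function name, the thresholds and the toy records are all assumptions made up for this example.

```python
from typing import List, NamedTuple


class DiffExprRecord(NamedTuple):
    """One differential-expression result (hypothetical layout)."""
    study_id: str
    gene: str
    log2_fold_change: float
    p_value: float


def studies_with_gene(
    records: List[DiffExprRecord],
    gene: str,
    min_abs_log2fc: float = 1.0,
    max_p: float = 0.05,
) -> List[DiffExprRecord]:
    """Return hits for `gene` across studies, strongest fold change first."""
    hits = [
        r for r in records
        if r.gene == gene
        and abs(r.log2_fold_change) >= min_abs_log2fc
        and r.p_value <= max_p
    ]
    return sorted(hits, key=lambda r: abs(r.log2_fold_change), reverse=True)


# Toy records with placeholder identifiers; not real study data.
EXAMPLE_RECORDS = [
    DiffExprRecord("STUDY_A", "GENE1", 2.3, 0.001),
    DiffExprRecord("STUDY_B", "GENE1", -1.4, 0.02),
    DiffExprRecord("STUDY_C", "GENE1", 0.2, 0.60),
    DiffExprRecord("STUDY_A", "GENE2", 1.8, 0.004),
]

gene1_hits = studies_with_gene(EXAMPLE_RECORDS, "GENE1")
```

Filtering on the absolute fold change keeps both up- and down-regulated results, which is usually what a cross-study search wants; a real platform would also need consistent gene identifiers across studies.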
2013 04-10 eTRIKS overview
eTRIKS: A Knowledge Management Platform for Translational Research
Ian Dix, AstraZeneca R&D
Yike Guo, Imperial College London
On Behalf of eTRIKS
Ian.firstname.lastname@example.org email@example.com
Challenge of Drug Development: Complex Disease Phenotypes
How do we stratify these complex phenotypes?
Next Generation Platforms (WGS, RNAseq, Mass Spec, Imaging, RT Sensing) -> Data Explosion
Challenges in running internal biomarker programs
• Study population, design & data collection is defined by clinical development program
 – Typically not optimised for biomarker discovery
• Cost of running sufficiently powered Longitudinal Observational Studies designed for biomarker discovery (and validation) is prohibitive
• Collaborate…
[Figure labels: Industry, Academia, Public Private Consortium]
Sharing costs enables bigger studies
Pros:
• Shared costs: increase in scale – breadth & depth
• Diverse specialisms enabling increased complexity and insight
• Multiple centres: Improved recruitment
Cons:
• IP complexity
• Coordination
Example Translational Study Consortium
• RA-Map
Stratified Medicine:
• COPD-Map
• GAUCHERITE Consortium
• Stop HCV
• MATURA
22 CTMM research projects are active, involving a total of 119 partners and a research budget of 302.7 M€.
Translational Research Information Flow
[Diagram labels: Clinical Sites, Analytical Labs, Biobank (data and samples); eCRF software; LIMS / Sample Tracker; Databases; Ontology Management Service; Analytical Workflows; Central Cloud-based Platform; Collaboration and Visualisation; External KM platform; External Analytics; API]
Challenge 1: The Science Priority
[Figure labels: Science, Infrastructure]
Data infrastructure costs are consistently underestimated. Who pays?
Challenge 2: Data Collaboration
[Figure labels: Org 1, Org 2, Org 3, Org 4, … Org n]
Sharing data securely across the individual organisations… What tools and standards?
Challenge 3: Fixed Time Line
[Figure labels: Project Consortium (Org 1, Org 2, Org 3, Org 4, … Org n); 5 Years]
The value of data is long lived, virtual organisations are not: who stewards the data when the consortium ends?
Translational Research Information and Knowledge Management Service
[Timeline, 2008 to 2013. Labels shown: IMI initiates; EFPIA KM Group advises on need for KM; RDG accept proposal for KM in IMI call; Call published; First EoI selected; FPP agreed; Project onboarded; GSK/JnJ/ICL eTRIKS Pilot; tranSMART; eTRIKS initiates (Q4)]
Objectives
• Objectives:
 – Provision of a KM Service to support Private/Public Translational Research (TR) in IMI
 – Single access point to standardised, curated IMI TR study information, along with historic translational studies relevant to IMI projects
 – Establishing a common, open source, interoperable TR platform, based on openly agreed standards across the IMI TR projects
 – Development of an active TR analytics & informatics community across IMI
• Budget: €23.79m for 5 years (Oct 2012 to Sept 2017)
• Members:
 – 10 Pharma, 3 Academic, 1 standards, 2 Commercial Suppliers
Business Logic
• Reduced risk of data loss post IMI
• Improved operational efficiency for TR PPPs
• Improved access to secure data generated in TR PPPs
• Improved access to relevant historic TR study data
• Increased innovation in analytics tools and applications
Work Packages
WP Number | WP Name | WP Leads
WP1 | Platform Deployment | CNRS/JPNV
WP2 | Platform Development | Imperial/Sanofi/Pfizer
WP3 | Data Standards | Roche/IDBS/Merck/CDISC
WP4 | Curation and Analysis | Luxembourg/Sanofi
WP5 | Management and Sustainability | AstraZeneca/BioSci Consulting
WP6 | Community and Outreach | Janssen/BioSci Consulting
WP7 | Ethics | GSK/CNRS/Bayer/Sanofi
Engagement and Governance
[Diagram labels: IMI Client Project; Demand 1, Demand 2, Demand 3; Demand Decision (3-6 Month Cycle); eTRIKS Resources; Delivery Packages: Platform Deployment, Platform Development, Data Standards, Curation and Analysis, Community and Outreach, Ethics; Execution; Project Input; Progress Updates; Progress Reports; Deliveries]
Core Technology
tranSMART (GPLv3): ETL, Mining
• Clinical Cohort Phenotypic Data: Demographics, Clinical Observations, Clinical Trial Outcomes, Adverse Events
• High Content Biomarker Data: Gene Expression, Genotyping, Metabolomic, PK/PD Markers
• Reference Data: Literature, Pathway Data, Gene Metadata
eTRIKS Platform
Collaboration Platform
• Collaboration
• Research process
• IP capture and management
• Secure access
Research Analytics Environment
• Access to analytics tools
• Open API for public and commercial software to plug-in
TR Knowledge Hub
• Cloud Infrastructure
• Load procedures
• ‘Big Data’ storage
• Ontology management
[Diagram labels: Access Management, Study Book / Visual Analytics, Study Management, Ontology Management, Scientific Data Architecture]
tranSMART Foundation
http://www.transmartfoundation.org
Global non-profit organization devoted to realizing the promise of translational biomedical research through development of the tranSMART knowledge management platform.
Goals
1. Establish and sustain tranSMART as the preferred data sharing and analytics platform for translational biomedical research.
2. Link academic, non-profit and corporate research communities for collaborative research facilitated by tranSMART.
3. Align and grow a vibrant developer network around the scientific goals of the tranSMART community.
4. Reduce barriers to entry through use of advanced technologies and an active marketplace.
Community: Large scale KM consortia, AMCs, NFPs, Pharma, Regulators, Biotech service suppliers
Progress…
What have we done in the first 5 months?
2. Reinforcing the Foundation
• Working in partnership with TF to address core issues in tranSMART (2/6 FTEs)
• tranSMART 1.1 (June)
 – first stable postgres release (UMichigan) *
 – ETL procedures for postgres (Imperial/Recombinant) *
 – decoupling of i2b2 (TraIT/Imperial) *
 – plugin architecture for:
  • Ontology data (TraIT/Hyve)
  • Clinical data (Imperial)
• Data types for higher dimensional data
• API for CRC data
• tranSMART 1.2 (October) to be defined: Unit tests, API for analysis in Dataset Explorer, API for high dimensional data, GUI improvements, search improvements…
* Complete – currently in testing
3. Public eTRIKS Server
• Aim:
 – eTRIKS server enabling access to public studies of interest to eTRIKS community
• Progress:
 • Setup
  – tranSMART PostgreSQL Dev environment in place
  – Training and awareness of existing ETL processes for tranSMART
 • Population of Search Tool with EBI Atlas fold change data
  – ETL pipeline built for populating the tranSMART Search tool
  – ~2000 human, primate, mouse and rat Atlas studies selected
  – Beta version May 2013, with QC/QA prior to production release in June/July
 • Population of Dataset Explorer Tool with subset of GEO & ArrayExpress data
  – Selection of UBIOPRED relevant studies from GEO/ArrayExpress
  – Adaptation of Sanofi Dataset Explorer curation tool for PostgreSQL
  – Initial data sets being loaded now: production release June/July
4. First Supported Project: UBIOPRED
Key Facts
• Identification of novel biomarkers of severe asthma
• 40 Partners (20 Academic, 10 SME, 10 EFPIA)
• Novel Cohort and Biobank of Severe Asthma Patients
• Cross-Sectional Comparative Study with Longitudinal Follow up
• Profiling of Genomics, Proteomics, Lipidomic, Breathomic
• Matching with in-vivo and in-vitro models
• Systems Medicine approach to identify ‘handprint’ biomarkers across data
1,025 Patients; 175,000 Samples; 3,000,000 Data Points
http://www.ubiopred.european-lung-foundation.org/
04/14/13, R&D IT External Innovation
4. eTRIKS Support of UBIOPRED
Aim (to date): Stand up a secure UBIOPRED server, load clinical and first omic data sets
Progress:
• Dedicated U-BIOPRED server set-up at ICL running PostgreSQL TM
• tranSMART tutorials circulated to UBIOPRED project
• Anonymised clinical data loaded: 250 patients to date (baseline visit)
• Urine and Sputum Lipidomic data loaded (eicosanoid panel)
Next sprints:
• Animal Model (house dust mite-induced asthma, mouse) data
• Robust curation methodology to be developed
• Research into Longitudinal Data Model (load & querying)
• Loading of omics data
Immediate Next Steps
• Onboard 2-3 further projects
• Scope out Animal Model requirements across projects
• Work with Foundation to deliver TM1.1/1.2
• Release the public eTRIKS server: 2000+ studies
Ideal Future State (IMI and beyond)
• Accessible Common Infrastructure
• Federation of searchable archives of translational study information
• Ability to transfer data securely between organisations within consortia
• Healthy ecosystem of commercial and NFP service providers supporting projects and institutions
• Large and diverse innovative analytics & visualisation toolbox
[Figure: The new reality of Drug Research. Labels: Medical Centres, Analytics Specialists, CRO, Patient organization, Regulatory authorities, Disease Specialists, Assay Specialists, KM Support]
1. Ensure the legacy of project data/results
2. Facilitate dataset integration
3. Increase operational efficiency
4. Establish a common set of standards
www.eTRIKS.org
LinkedIn Discussion Group: eTRIKS
Twitter: @etriks1