eTRIKS: A Knowledge Management Service
for PPP Translational Research

Yike Guo
eTRIKS academic coordinator
tranSMART Community Meeting 5-7 Nov 13 - Session 5: eTRIKS - Science Driven Development

Yike Guo, Imperial College London

  • Cloud deployment
  • Traditional classification of human disease has been based on pathological analysis and clinical observation. However, such approaches are often flawed: many diseases have significant heterogeneity in their etiology although they present with similar symptoms and pathologies. This means that in any population of ‘diseased’ individuals we see significant variation in response to treatment. As well as being unsatisfactory from a patient health perspective, this has a profound impact on clinical trials: if the mechanism of the drug being tested is only effective in a sub-population of disease sufferers, a poorly designed trial could be abandoned because of perceived lack of efficacy.
  • “Non-competitive” collaborative research for EFPIA companies. Competitive calls to select partners of EFPIA companies (IMI beneficiaries). Open collaboration in public-private consortia (data sharing, dissemination of results).
  • Project Name | Therapeutic Area | Data Type Summary (eTRIKS support)
    IMI U-BIOPRED | Severe Asthma | Clinical, Animal Models, Transcriptomics, genetics, metabonomics, lipidomics
    IMI OncoTrack | Colon Cancer | Clinical, Next Generation Sequencing, Protein Arrays, Cell-based Assays, Animal Models, Cancer Stem Cells
    IMI ABIRISK | Biopharmaceutical Risk Assessment | Clinical observations, Legacy cohorts, Cell-based assays, Gene Expression, Long-term studies
    IMI PREDECT | Prostate, Breast and Lung Cancer | Tissue Micro-Arrays, In Vitro Culture Models, GEMM Animal Models
    IMI ND4BB | Combating Antimicrobial Resistance | Pharmacology, In vivo, Clinical, omics
    MRC-ABPI RA-MAP | Rheumatoid Arthritis | Clinical, transcriptomics, proteomics, metabonomics, cell-based assays, flow cytometry, genetics
    IMI NEWMEDS | Depression & Schizophrenia | Clinical, Pre-Clinical
    IMI Predict-TB | Tuberculosis | Clinical, Pre-Clinical PK/PD
    IMI Biovacsafe | Vaccine Immunogenicity | Clinical, Transcriptomics, Metabonomics, protein assays
    IMI QuIC-ConCePT | Oncology Imaging biomarkers | Animal model data management
  • Data Search & Analysis: Dataset explorer enables hypothesis generation and refinement across experimental and published knowledge in the system. It incorporates the i2b2, Lucene and GenePattern applications and enables the connection of many open & commercial analytical tools.
  • The knowledge “added value” of this pipeline is not fed back into a system that reflects the cumulative knowledge gained from this process and other similar processes.
  • The data tree is a feature familiar to all tranSMART users. To provide a longer-term solution to certain new user requirements, we will introduce a new tree hierarchy, in which data is organized according to the subject in which investigations are performed. This example shows three subject types: cell lines, animals and humans. Within each subject, nodes are grouped to represent attributes or properties of the subject. The most important feature of this new tree is that the hierarchy is extendable for new data types while maintaining the integrity of the rest of the tree. When all investigations can be viewed through the lens of this new data hierarchy, we will be able to perform more specific cross-study searches while minimizing errors, because every data type will be viewed in the context of the subject and its properties. In this example, we show a query builder for cohort selection. The user is able to build a complex filter for cohort selection; this task is made more intuitive because the data hierarchy provides the context for each data element. For example, we know that xenograft belongs to the animal subject because it is a node for that subject. We are also able to differentiate treatments given to the patient from treatments given to the animal because, although the nodes are similar (Treatment), they belong to different parts of the tree.
  • The ability to refer to a single vocabulary enables us to retrieve all relevant information across studies, despite obstacles such as the use of synonyms. In this example, the user is searching for all investigations related to Trastuzumab. eTRIKS should be able to retrieve all studies tagged with this label or with ‘Herceptin’, because Trastuzumab and Herceptin are the same. In addition, eTRIKS may search for Trastuzumab entries in other vocabularies, so that we can search for ‘trastuzumab in the context of a treatment’ OR ‘trastuzumab and adverse reactions related to it’. The extended data model enables eTRIKS to retrieve studies performed in cell lines, animals and humans. The consistent data recorded from each study enables us to collate a specific set of information about each study. This is a minimum information set; in this example, the minimum information is the title of the study, the disease, treatment, outcome measurements and data holder details. The icons show the types of data collected (in this case gene expression, but may be replaced with GWAS etc. when needed) and the subject in which the investigation was performed.
  • The eTRIKS user interface should be consistent across the entire workflow. This makes eTRIKS easy to learn. We will select a number of GUI functions and build the interface around these functionalities. These GUI elements will be kept in the same position throughout the workflow pages. Two GUI elements that we will maintain throughout eTRIKS are the data tree and the Drag&Drop functionality. The Drag&Drop function as introduced in tranSMART is familiar to current users. We suggest that the same functionality be introduced throughout the workflow. This means that in Step 1 users can Drag&Drop from the tree to form the criteria for cross-study searches, in Step 2 they Drag&Drop to select criteria for cohort selection, and in Step 3 they Drag&Drop to specify the types of data to be exported.
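The subject-centred data tree described in the notes above can be sketched as follows. This is a minimal illustration only, not tranSMART or eTRIKS code: the `Node` class, its `add` and `find` helpers, and the root label `Subjects` are hypothetical. Only the three subject types and the `Treatment` and `xenograft` nodes come from the notes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a hypothetical subject-centred data tree."""
    name: str
    children: dict = field(default_factory=dict)

    def add(self, *path):
        """Create (or reuse) the nodes along `path` and return the last one."""
        node = self
        for part in path:
            node = node.children.setdefault(part, Node(part))
        return node

    def find(self, name, prefix=()):
        """Return the full path of every node called `name`."""
        here = prefix + (self.name,)
        hits = [here] if self.name == name else []
        for child in self.children.values():
            hits.extend(child.find(name, here))
        return hits


root = Node("Subjects")
root.add("Human", "Treatment")
root.add("Animal", "Treatment")
root.add("Animal", "xenograft")
root.add("Cell line", "Gene expression")

# The same label resolves to distinct paths depending on the subject it belongs to.
for path in root.find("Treatment"):
    print(" / ".join(path))

# Extending the tree with a new data type leaves the existing branches untouched.
root.add("Human", "GWAS")
```

Because every node is addressed by its full path, a cohort filter can refer to `Human / Treatment` and `Animal / Treatment` separately even though both nodes share a label.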

    1. eTRIKS: A Knowledge Management Service for PPP Translational Research
       Yike Guo, eTRIKS academic coordinator
    2. Example Stratified Medicine Consortium
       • RA-Map
       • COPD-Map
       Stratified Medicine:
       • GAUCHERITE Consortium
       • Stop HCV
       • MATURA
       22 CTMM research projects are active, involving a total of 119 partners and a research budget of 302.7 M€.
    3. Strat Med Project Process (diagram labels): Patient enters medical center; Clinical Procedures; Electronic Health Record; Imaging; Samples; Experiments; Clinical database; Image database; Biobank database; Experimental data; Data Integration; External data; Scientific Output; Downstream analysis; Intellectual Property; Improved Healthcare
    4. Data Management Components
       • Clinical Data Capture (& Anonymisation)
       • Sample tracking
       • Biological assay data capture & processing
       • Consortia Data & KM Platform
       • Data Analytics tools
       • Consortia Collaboration toolbox
    5. The eTRIKS Project
       • Service Project – not Research. ~80% of project activities driven by demand from supported IMI projects (customer driven).
       • Mandate to support PPP Translational studies with data & KM services:
         – Open Platform development, enhancement and support
         – Installation support
         – Training
         – Curation
         – ETL support
         – Standards development
         – Data hosting
         – Limited retrospective content curation to support studies
       • Budget: €23.79m for 5 years (Oct 2012 to Sept 2017)
       • Members: 10 Pharma, 3 Academic, 1 standards, 2 Commercial Suppliers
    6. The Consortium… 10 Pharma + 6 Partners
    7. Work Packages
       Biosci Consulting (Collaboration Management)
       WP Number | WP Name | WP Leads
       WP1 | Platform Deployment | CNRS/Janssen
       WP2 | Platform Development | Imperial/Pfizer
       WP3 | Data Standards | Roche/IDBS/Merck/CDISC
       WP4 | Curation and Analysis | Luxembourg/Sanofi
       WP5 | Management and Sustainability | AstraZeneca/BioSci Consulting
       WP6 | Community and Outreach | Janssen/BioSci Consulting
       WP7 | Ethics | CNRS/Sanofi
    8. Projects Engaging eTRIKS (diagram labels): Oncology; Safety; RA-Map; Inflammation; ND4BB; Infection
    9. Business Logic
       • Discoverable Data – Basis for an IMI archive
       • Enables re-usable innovation – common plug’n’play interface. Enables entrepreneurial Biz Models
       • Minimises re-invention of the wheel by each project: e.g. solve ‘Big Data’ omic challenges or data security once.
       • Facilitates easy interoperation & integration for partners with each new consortium
       • Cost-effective use of taxpayers’ €s – operational efficiency
       • Drives standardisation in data capture and management
    10. This is what we are starting with (diagram labels): Data gets captured; Organized / Managed; Stored; Analyzed; Viewed / disseminated; Assimilation / Synthesis
    11. This is what we really need to support
       • A key to translational research advancement is allowing a continuous feedback loop between outcomes of basic and clinical research to accelerate translation of data into knowledge
    12. eTRIKS Platform: DM, KM
    13. Product Management (diagram labels): Platform Deployment; 3-6 Month Cycle; Demand 1; IMI Client Project; Demand 2; Data Standards; Decision; Demand; eTRIKS Resources; Demand 3; Delivery Packages; Progress Updates; Project Input; Platform Development; Curation and Analysis; Community and Outreach; Ethics; Execution; Progress Reports; Deliveries
    14. Product Management Process
       • All requests for new features (via forms) must be submitted 6 weeks before the Resource Team meeting in order to be considered. An appointed member of the PMP will consolidate the requests and place them on the eTRIKS PM wiki.
       • The PMP decision-making TC meeting is held 4 weeks before the upcoming Resource Team meeting, where the ranking proposal will be agreed upon.
       • The PMP reviews all requests for entry into the eTRIKS product backlog and selects a set of features from the backlog to be implemented in the following development period, subject to Resource Team meeting approval.
       • Potentially, there is an additional PMP meeting (TC) 3 weeks before the Resource Team meeting, in case the PMP decides it requires further information and/or a user demo of the requested feature.
       • The ranked list from the PMP is placed on the eTRIKS PM wiki 2 weeks prior to the Resource Team meeting, where all eTRIKS participants can comment.
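The deadlines on this slide are all counted back from the Resource Team meeting, so they can be derived from a single date. The sketch below shows that arithmetic in Python; the milestone labels are paraphrased from the slide, and the function name `pm_schedule` and the meeting date are hypothetical.

```python
from datetime import date, timedelta

# Weeks before the Resource Team meeting, as stated on the slide.
MILESTONES = {
    "Feature requests submitted": 6,
    "PMP decision-making TC": 4,
    "Optional additional PMP TC": 3,
    "Ranked list on PM wiki": 2,
}


def pm_schedule(resource_team_meeting: date) -> dict:
    """Return the date of each milestone, counted back from the meeting."""
    return {
        name: resource_team_meeting - timedelta(weeks=weeks)
        for name, weeks in MILESTONES.items()
    }


# Hypothetical meeting date, used only to show the arithmetic.
for name, deadline in pm_schedule(date(2014, 3, 3)).items():
    print(f"{deadline.isoformat()}  {name}")
```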
    15. Schematic Representation on PM (diagram labels): Competition Analysis; Internal Stakeholder Request; External Stakeholder Request; Consolidation of Project Requests by AM (1); Requirement Gathering (*3); PM wiki; Consolidation of all requests by PM (2); Ranking of requests by PMP; Proposal to Resource Team; Approval; Development Roadmap (*3); Requirement Consolidation; Requirement Ranking & Proposal; Development Roadmap. Footnotes: (1) AM: Account Manager; (2) PM: Product Manager; (3) Clarification of request by requesting stakeholder/AM.
    16. User Requirement Gathering Platform: http://requirements.etriks.org/twiki/bin/view/RequestManagement/
    17. User Requirement Gathering: Key fields
       • Benefit Estimate: For each request, stakeholders must provide an estimate of the relative benefit that the feature provides to the users and/or to achieving the eTRIKS objectives (e.g. establishing a centralized European database) on a scale from 1 to 5, with 1 indicating very little benefit and 5 being the maximum possible benefit.
       • Cost Estimate: Stakeholders interface with the WP2 Architect to provide a rough estimate of the effort (in person-months) or the required financial investment (in Euros). The PMP will transform absolute costs into a relative cost value, again on a scale ranging from a low of 1 to a high of 5. Cost ratings are based on factors such as the requirement complexity, the extent of user interface work required, the potential ability to reuse existing designs or code, and the levels of testing and documentation needed.
       • Risk Estimate / Mitigation Analysis: Stakeholders and the WP2 Architect should provide a brief description of possible risks associated with the feature development and a mitigation strategy for each risk. In addition, stakeholders and the WP2 Architect should estimate the relative degree of technical or other risk associated with each feature on a scale from 1 to 5. An estimate of 1 means you can program it in your sleep, while 5 indicates serious concerns about feasibility, the availability of staff with the needed expertise, or the use of unproven or unfamiliar tools and technologies.
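The three key fields lend themselves to a simple ranking calculation. The slides do not state how the PMP combines them, so the formula below (benefit scaled by the number of requests, divided by cost plus risk) is purely an assumption for illustration, as are the `FeatureRequest` class, the `priority` function and the example backlog entries.

```python
from dataclasses import dataclass


@dataclass
class FeatureRequest:
    title: str
    benefit: int          # 1 (very little benefit) .. 5 (maximum benefit)
    cost: int             # 1 (low) .. 5 (high), relative cost assigned by the PMP
    risk: int             # 1 (low) .. 5 (high)
    n_requests: int = 1   # how many stakeholders asked for the same feature

    def __post_init__(self):
        # Reject estimates that fall outside the 1-5 scale used for all three fields.
        for name in ("benefit", "cost", "risk"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")


def priority(req: FeatureRequest) -> float:
    """Hypothetical formula: benefit, scaled by demand, per unit of cost plus risk."""
    return req.benefit * req.n_requests / (req.cost + req.risk)


# Hypothetical backlog entries.
backlog = [
    FeatureRequest("New plotting module", benefit=4, cost=5, risk=4),
    FeatureRequest("Cross-study search", benefit=5, cost=4, risk=3, n_requests=3),
    FeatureRequest("Export to CSV", benefit=3, cost=1, risk=1),
]

ranked = sorted(backlog, key=priority, reverse=True)
for req in ranked:
    print(f"{priority(req):.2f}  {req.title}")
```

A ratio is used here so that cheap, low-risk features can outrank more valuable but expensive ones; a weighted sum would be an equally plausible choice.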
    18. eTRIKS DM Component
    19. Example: GUI Design for Study Repository
       • Consistent Data Organization
       • Consistent Vocabulary
       • Consistent Layout
       One data tree for all investigations
       – Cross study searches
       – Every data type viewed in context
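A consistent vocabulary is what makes cross-study search robust to synonyms such as Trastuzumab and Herceptin, the example used in the slide notes. A minimal sketch of the idea, assuming a hypothetical synonym table and hypothetical study records (neither reflects the actual eTRIKS data model):

```python
# Hypothetical synonym table: every known label maps to one preferred term.
PREFERRED_TERM = {
    "trastuzumab": "trastuzumab",
    "herceptin": "trastuzumab",
}

# Hypothetical study records; only the tagging matters for this sketch.
STUDIES = [
    {"title": "Study A", "subject": "Human", "treatments": ["Herceptin"]},
    {"title": "Study B", "subject": "Animal", "treatments": ["Trastuzumab"]},
    {"title": "Study C", "subject": "Cell line", "treatments": ["Placebo"]},
]


def normalise(label: str) -> str:
    """Map a label to its preferred term; unknown labels are only lower-cased."""
    key = label.strip().lower()
    return PREFERRED_TERM.get(key, key)


def search_by_treatment(query: str, studies=STUDIES) -> list:
    """Return titles of all studies tagged with the query term or a synonym of it."""
    wanted = normalise(query)
    return [
        s["title"]
        for s in studies
        if wanted in {normalise(t) for t in s["treatments"]}
    ]


print(search_by_treatment("Trastuzumab"))
print(search_by_treatment("herceptin"))
```

Both queries return the same two studies, because the query and the study tags are normalised to the same preferred term before comparison.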
    20. Progress (1)
       • Becoming functional – recruitment, op norms, reporting, legal docs, etc
       • Production tranSMART v1.1 released this month.
         – PostgreSQL open platform
       • Installations at: Imperial (UBIOPRED), Liverpool (PredictTB), Alacris (Oncotrack), QMUL (RAMap), Luxembourg (eTRIKS), and CC-IN2P3 (eTRIKS)
       • Curation of retrospective public content:
         – EBI Atlas gene fold change data (subset): SearchApp
         – Public Asthma & RA related GEO data: DataSetExplorer App
         – TCGA Datasets (Clinical and Gene Expression Data):
           • Breast invasive carcinoma [BRCA]
           • Colon adenocarcinoma [COAD]
           • Uterine Corpus Endometrial Carcinoma [UCEC]
           • Ovarian serous cystadenocarcinoma [OV]
       • Support of 5 IMI projects to date:
         – January: UBIOPRED: server set-up at ICL, 625 patients to date (screening & baseline), low-density eicosanoid lipidomic data, gene expression data, proteomic data and animal model clinical data loaded. Training provided.
         – May: Oncotrack, ABIRISK, PredictTB and ABPI/MRC RA-Map: tranSMART installations and training to date.
       • Active discussions with 4-5 other projects re requirements and support
    21. Progress (2)
       • Requirements gathering
         1. 2 x User requirement workshops:
            • tranSMART developer and user meeting, Amsterdam, June 2013 (tranSMART foundation collaboration)
            • eTRIKS User requirement session, London, 2013 (eTRIKS Request)
         2. Requirements reviewed and consolidated (identical requests merged into one)
         3. eTRIKS Product Management wiki implemented (with functions such as voting and automatic pre-prioritisation based on number of requests, benefit, cost and risk estimates enabled) and requirements uploaded to it
    22. Proposed Future Model
       Diagram labels: Metadata Query; ABIRISK; Secure Federated Search (data & samples); Patient Stratification; UBIOPRED; COPD - Predicting Therapy Response; IMI Archive; StratMed X; Local Instance; Data Transfer; IMI Instance; Biomarker Discovery: Correlating Signatures to Clinical Outcome; Animal: Human model validation; Disease Modelling
       Example query: What Longitudinal Arteriosclerosis studies have been run in the UK involving > 500 subjects?
       • Robust
       • Responsive
       • Fit for Purpose
       • Stable
       • Supported
       • Backed-up
       • Secure
       • Re-usable
       • Sustainable
       • Community Led
       • Efficient
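The example question on this slide is a query over study-level metadata rather than patient-level data. The sketch below shows what such a filter could look like in Python; the `StudyMetadata` fields, the `metadata_query` function and all catalogue entries are hypothetical and do not describe an existing eTRIKS interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyMetadata:
    title: str
    archive: str      # which instance holds the study, e.g. a local or IMI instance
    design: str
    disease: str
    country: str
    n_subjects: int


# Hypothetical metadata records exposed by two hypothetical archives.
CATALOGUE = [
    StudyMetadata("Cohort 1", "Local Instance", "longitudinal", "arteriosclerosis", "UK", 820),
    StudyMetadata("Cohort 2", "IMI Instance", "longitudinal", "arteriosclerosis", "UK", 310),
    StudyMetadata("Cohort 3", "IMI Instance", "cross-sectional", "arteriosclerosis", "UK", 1500),
    StudyMetadata("Cohort 4", "Local Instance", "longitudinal", "arteriosclerosis", "FR", 900),
]


def metadata_query(catalogue, *, design, disease, country, min_subjects):
    """Filter on study-level metadata only; no patient-level data is touched."""
    return [
        s for s in catalogue
        if s.design == design
        and s.disease == disease
        and s.country == country
        and s.n_subjects > min_subjects
    ]


# "What longitudinal arteriosclerosis studies have been run in the UK
#  involving > 500 subjects?"
hits = metadata_query(
    CATALOGUE,
    design="longitudinal",
    disease="arteriosclerosis",
    country="UK",
    min_subjects=500,
)
for s in hits:
    print(f"{s.title} ({s.archive}, {s.n_subjects} subjects)")
```

In a federated setting each archive would evaluate the same filter locally and return only matching metadata records, which is one way to keep the underlying data in place.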
    23. Ultimately…
       • Accessible Common Infrastructure
       • Federation of searchable archives of translational study information
       • Ability to transfer data securely between organisations within consortia
       • Healthy ecosystem of commercial and NFP service providers supporting projects and institutions
       • Large and diverse innovative analytics & visualisation toolbox
       Diagram labels (Fully Integrated Stratified Medicine Ecosystem): Medical Centres; Analytics Specialists; CRO; Patient organization; Regulatory authorities; Disease Specialists; Assay Specialists; IT Services
    24. Summary of objectives
       1. Ensure the legacy of project data/results
       2. Facilitate dataset integration
       3. Increase operational efficiency
       4. Establish a common set of standards
       www.eTRIKS.org; LinkedIn Discussion Group: eTRIKS; Twitter: @etriks1
    25. End
    26. Data Management Components
       • Clinical Data Capture:
         – EDCs for the capture and validation of the clinical assay and patient data. Needs to be standardised for the project across all recruiting centres, e.g. OpenClinica, REDCap.
       • Sample tracking:
         – Human and animal biopsy/fluid and cell line samples need tracking as they are stored and shipped between the consortia partners for assays. Operational logistics tool.
       • Biological assay data capture:
         – Multiple LIMS (laboratory information management systems) platforms for the different assay technologies (NGS, omic, etc). Either vendor supplied, open source or locally developed. Important that data pre-processing is transparent and results in processed data in standard formats.
       • Consortia Data & KM Platform:
         – A consortia-wide platform for normalized data storage, integration, querying, and long-term archiving. Requires multiple ETL processes to export data from the local EDCs & LIMS. Needs a pluggable interface to allow the integration of analytics tools. Archive – but what to keep???
       • Data Analytics tools:
         – Range of commercial, open and local data analysis tools to find signals in the phenotypic and biological information.
       • Consortia Collaboration toolbox:
         – Tools to support communication, project management and document sharing across the consortia partners. Basic collaboration tools such as calendar, document management, project management, TC facilities etc, including a common ELN (E-Lab Notebook) for the capture of experimental design, analysis processes, results and conclusions.
