AnalyzeGenomes.com: A Federated In-Memory Database Platform for Digital Health

The slide deck of the presentation "AnalyzeGenomes.com: A Federated In-Memory Database Platform for Digital Health", given at the 2017 BMBF All Hands Meeting in Karlsruhe, is now available online.

  1. AnalyzeGenomes.com: A Federated In-Memory Database Platform for Digital Health. Dr.-Ing. Matthieu-P. Schapranow, BMBF All Hands Meeting, Karlsruhe, Oct 11, 2017
  2. What is the Hasso Plattner Institute, Potsdam, Germany? Dr. Schapranow, BMBF All Hands, Oct 11, 2017. Analyze Genomes: A Federated In-Memory Database for Digital Health
  3. Hasso Plattner Institute: Key Facts
     ■ Founded as a public-private partnership in 1998 in Potsdam near Berlin, Germany
     ■ Institute belongs to the University of Potsdam
     ■ Ranked 1st in CHE since 2009
     ■ 500 B.Sc. and M.Sc. students
     ■ 12 professors/chairs, 150 PhD students
     ■ Apr 2017: Digital Engineering Faculty
     ■ Oct 2017: Opening of Digital Health Center
  4. Our Motivation: Turn Precision Medicine Into Clinical Routine
     ■ Can we enable clinicians to make their therapy decisions:
       □ Incorporating all available patient specifics,
       □ Referencing latest lab results and worldwide medical knowledge, and
       □ In an interactive manner during their ward round?
  7. Our Vision: Medical Board Incorporating Latest Medical Knowledge
  8. The Challenge: Distributed Heterogeneous Data Sources
     ■ Human genome/biological data: 600 GB per full genome; 15 PB+ in databases of leading institutes
     ■ Prescription data: 1.5B records from 10,000 doctors and 10M patients (100 GB)
     ■ Clinical trials: currently more than 30k recruiting on ClinicalTrials.gov
     ■ Human proteome: 160M data points (2.4 GB) per sample; >3 TB raw proteome data in ProteomicsDB
     ■ PubMed database: >23M articles
     ■ Hospital information systems: often more than 50 GB
     ■ Medical sensor data: scan of a single organ in 1 s creates 10 GB of raw data
     ■ Cancer patient records: >160k records at NCT
  9. Our Technology: In-Memory Database Technology
     ■ Combined column and row store
     ■ Map/Reduce
     ■ Single and multi-tenancy
     ■ Lightweight compression
     ■ Insert only for time travel
     ■ Real-time replication
     ■ Working on integers
     ■ SQL interface on columns and rows
     ■ Active/passive data store
     ■ Minimal projections
     ■ Group key
     ■ Reduction of software layers
     ■ Dynamic multi-threading
     ■ Bulk load of data
     ■ Object-relational mapping
     ■ Text retrieval and extraction engine
     ■ No aggregate tables
     ■ Data partitioning
     ■ Any attribute as index
     ■ No disk
     ■ On-the-fly extensibility
     ■ Analytics on historical data
     ■ Multi-core/parallelization
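Two of the features named on this slide, "Lightweight compression" and "Working on integers", are commonly realized in column stores through dictionary encoding. The following is a minimal textbook sketch of that idea, not the platform's actual implementation; the function names and the example column are hypothetical.

```python
from bisect import bisect_left


def dictionary_encode(column):
    """Return (sorted dictionary, integer value IDs) for one column."""
    dictionary = sorted(set(column))
    # Each cell is replaced by the position of its value in the dictionary.
    value_ids = [bisect_left(dictionary, value) for value in column]
    return dictionary, value_ids


def scan_equals(dictionary, value_ids, needle):
    """Return row positions equal to needle; the scan compares integers only."""
    pos = bisect_left(dictionary, needle)
    if pos == len(dictionary) or dictionary[pos] != needle:
        return []  # value does not occur in the column at all
    return [row for row, vid in enumerate(value_ids) if vid == pos]


# Hypothetical example column: one diagnosis code per patient row.
diagnoses = ["I50", "I10", "I50", "E55", "I10", "I50"]
dictionary, value_ids = dictionary_encode(diagnoses)
rows_with_i50 = scan_equals(dictionary, value_ids, "I50")
```

The string comparison happens once, in the dictionary lookup; the scan over the rows then touches only small integers, which is what makes the column compact and fast to scan.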
  10. Our Approach: AnalyzeGenomes.com, In-Memory Computing Platform for Big Medical Data
      ■ In-Memory Database
  11. Our Approach: AnalyzeGenomes.com, In-Memory Computing Platform for Big Medical Data
      ■ In-Memory Database
      ■ Combined and Linked Data: Genome Data, Cellular Pathways, Genome Metadata, Research Publications, Pipeline and Analysis Models, Drugs and Interactions
      ■ Indexed Sources
  12. Our Approach: AnalyzeGenomes.com, In-Memory Computing Platform for Big Medical Data
      ■ In-Memory Database
      ■ Extensions for Life Sciences: Data Exchange, App Store; Access Control, Data Protection; Fair Use; Statistical Tools; Real-time Analysis; App-spanning User Profiles
      ■ Combined and Linked Data: Genome Data, Cellular Pathways, Genome Metadata, Research Publications, Pipeline and Analysis Models, Drugs and Interactions
      ■ Indexed Sources
  13. Our Approach: AnalyzeGenomes.com, In-Memory Computing Platform for Big Medical Data
      ■ In-Memory Database
      ■ Extensions for Life Sciences: Data Exchange, App Store; Access Control, Data Protection; Fair Use; Statistical Tools; Real-time Analysis; App-spanning User Profiles
      ■ Combined and Linked Data: Genome Data, Cellular Pathways, Genome Metadata, Research Publications, Pipeline and Analysis Models, Drugs and Interactions
      ■ Indexed Sources
      ■ Apps: Drug Response Analysis, Pathway Topology Analysis, Medical Knowledge Cockpit, Oncolyzer, Clinical Trial Recruitment, Cohort Analysis, ...
  14. Reproducibility: Modeling of Data Analysis Pipelines
      1. Design time (researcher, process expert)
         □ Definition of parameterized process model
         □ Uses graphical editor and jobs from repository
      2. Configuration time (researcher, lab assistant)
         □ Select model and specify parameters, e.g. aln opts
         □ Results in model instance stored in repository
      3. Execution time (researcher)
         □ Select model instance
         □ Specify execution parameters, e.g. input files
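The three phases on this slide (design, configuration, execution) can be sketched as three small data structures. This is only an illustration under assumptions: the class and function names, the job names, and the parameter value are hypothetical, and `execute` merely builds a plan instead of running any tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessModel:
    """Design time: an ordered list of jobs and the parameters they expose."""
    name: str
    jobs: tuple
    parameters: frozenset


@dataclass(frozen=True)
class ModelInstance:
    """Configuration time: a process model with every parameter bound."""
    model: ProcessModel
    bound: tuple  # sorted (name, value) pairs, so instances can be stored and compared


def configure(model, **values):
    """Bind parameter values; reject missing or unknown parameter names."""
    missing = model.parameters - set(values)
    unknown = set(values) - model.parameters
    if missing or unknown:
        raise ValueError(f"missing={sorted(missing)} unknown={sorted(unknown)}")
    return ModelInstance(model, tuple(sorted(values.items())))


def execute(instance, input_files):
    """Execution time: combine a stored instance with per-run inputs into a plan."""
    return [
        {"job": job, "params": dict(instance.bound), "inputs": list(input_files)}
        for job in instance.model.jobs
    ]


# Hypothetical model with two jobs and one parameter ("aln_opts", cf. "aln opts" on the slide).
model = ProcessModel("alignment", ("align", "sort"), frozenset({"aln_opts"}))
instance = configure(model, aln_opts="-t 8")
plan = execute(instance, ["sample.fastq"])
```

Keeping the configured instance immutable and separate from the run-time inputs is what makes a run repeatable: the same instance applied to the same input files yields the same plan.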
  15. Systems Medicine Model of Heart Failure (SMART)
      ■ Integrated systems medicine based on real-time analysis of healthcare data
      ■ Initial funding period: Mar '15 – Feb '18
      ■ Funded consortium partners:
      Figure: factors linked to heart failure: sleeping disorder, fibrosis, blood pressure, blood volume, gene expression, hypertrophy, calcium metabolism, energy metabolism, iron deficiency, vitamin-D deficiency, gender, epigenetics
  16. Establish Systems Medicine Model for Improved Treatment of Heart Failure
      ■ Patient: 63 years, male, smoker, chronic heart insufficiency, stage III-IV
      1. Appointment I (pre-surgery): Acquire systemic patient details, e.g. physiological and blood markers
      2. Predict outcome using clinical model with patient specifics
      3. Select adequate option and conduct valve replacement
      4. Equip patient with sensors to allow regular monitoring
      5. Appointment II 6 weeks after surgery to validate outcome
  17. Requirements Engineering for Systems Medicine: Computer-aided Systems Medicine Process
      ■ Joint process definition
      ■ Identification of long-running steps
      ■ Aims
        □ Sharing of data
        □ Improved communication
        □ Reproducible data processing
        □ Analysis applications for interactive hypothesis validation
  18. Contributions
      ■ Structured data acquisition, e.g. IMDB as data integration platform
      ■ Improved communication, e.g. event-driven user notifications
      ■ Reproducible data processing, e.g. IMDB as processing platform for DNA and RNA data
      ■ Enables real-time data analysis
      Figure: pipeline model "RNA Seq Analysis_V2" with the jobs FASTQC, Trimmomatic, FASTQC 2, TopHat, STAR, and featureCounts, and the artifacts FASTQ - Reads, Pre-Trimming QC-Report, FASTQ - Trimmed Reads, BAM-File Aligned Reads, Post-Alignment QC-Report, and Counts Matrix
  20. Smart Analysis Health Research Access (SAHRA)
      ■ Interdisciplinary partners collaborate on enabling interactive health research
      ■ Current funding period: Aug 2015 – July 2018
      ■ Funded consortium partners:
        □ AOK: German healthcare insurance company
        □ data experts group: Technology operations
        □ Hasso Plattner Institute: Real-time data analysis, in-memory database technology
        □ Technology, Methods, and Infrastructure for Networked Medical Research: Legal and data protection
  21. Interactive Analysis Dashboard
      ■ Analysis dashboard combining functions per use case
      ■ Provides expert-facing entry point to individual apps
      ■ Provides application-wide authentication / single sign-on
  22. Stratification of Hypertension Patients and Longitudinal Data Analysis
      ■ Stratification of patient cohorts using patient specifics
      ■ Automatic matching of similar patients and patient anamnesis
      ■ Interactive graphical exploration of longitudinal patient data
  23. App Example: Medical Knowledge Cockpit for Patients and Clinicians
      ■ Query-oriented search interface
      ■ Seamless integration of patient specifics, e.g. from EMR
      ■ Parallel search in international knowledge bases, e.g. for biomarkers, literature, cellular pathways, and clinical trials
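The "parallel search in international knowledge bases" on this slide can be illustrated by fanning one query out to several sources at once. The sketch below is an assumption about the general pattern, not the cockpit's implementation: the source names, the made-up entries, and both function names are hypothetical, and real sources would be queried over the network rather than from in-memory lists.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up stand-ins for remote knowledge bases (illustration only).
SOURCES = {
    "literature": ["BRAF inhibitors in melanoma", "Heart failure review"],
    "clinical_trials": ["Trial on BRAF V600E melanoma"],
    "pathways": ["MAPK signaling (contains BRAF)"],
}


def search_source(name, query):
    """Case-insensitive substring search in a single source."""
    hits = [entry for entry in SOURCES[name] if query.lower() in entry.lower()]
    return name, hits


def federated_search(query):
    """Query all sources concurrently and collect the hits per source."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = [pool.submit(search_source, name, query) for name in SOURCES]
        return dict(future.result() for future in futures)


results = federated_search("braf")
```

With network-bound sources, the total latency of such a fan-out is roughly that of the slowest source instead of the sum of all of them.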
  24. Medical Knowledge Cockpit for Patients and Clinicians: Pathway Topology Analysis
      ■ Search in pathways is limited to "is a certain element contained" today
      ■ Integrated >1.5k pathways from international sources, e.g. KEGG, HumanCyc, and WikiPathways, into HANA
      ■ Implemented graph-based topology exploration and ranking based on patient specifics
      ■ Enables interactive identification of possible dysfunctions affecting the course of a therapy before its start
      Figure captions: Unified access to multiple formerly disjoint data sources; Pathway analysis of genetic variants with graph engine
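To show why topology matters beyond an "is a certain element contained" search, the sketch below ranks pathways by how many elements lie downstream of a patient's affected genes. The scoring rule, the function names, and the toy pathways are assumptions made for illustration; the slide does not specify the ranking the platform uses.

```python
from collections import deque


def downstream(graph, start):
    """Return all nodes reachable from start via directed edges, excluding start."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


def rank_pathways(pathways, affected_genes):
    """Rank pathways by the number of distinct elements downstream of affected genes."""
    scores = {}
    for name, graph in pathways.items():
        nodes = set(graph) | {n for targets in graph.values() for n in targets}
        impacted = set()
        for gene in affected_genes & nodes:
            impacted |= downstream(graph, gene)
        scores[name] = len(impacted)
    # Highest score first; ties are broken alphabetically by pathway name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy pathways as adjacency lists; both contain the affected gene "G1".
pathways = {
    "pathway_A": {"G1": ["G2", "G3"], "G2": ["G4"]},
    "pathway_B": {"G1": ["G5"], "G6": ["G7", "G8"]},
}
ranking = rank_pathways(pathways, {"G1"})
```

A pure containment search would treat both toy pathways as equal matches for "G1"; the graph traversal separates them because the gene sits upstream of three elements in one pathway and of only one in the other.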
  25. Medical Knowledge Cockpit for Patients and Clinicians: Publications
      ■ Interactively explore relevant publications, e.g. PDFs
      ■ Improved ease of exploration, e.g. by highlighted medical terms and relevant concepts
  26. What to Take Home?
      ■ For patients
        □ Identify relevant clinical trials and medical experts
        □ Become an informed patient
      ■ For clinicians
        □ Identify pharmacokinetic correlations
        □ Scan for similar patient cases, e.g. to evaluate therapy efficiency
      ■ For researchers
        □ Enable real-time analysis of medical data, e.g. assess pathways to identify impact of detected variants
        □ Combined mining in structured and unstructured data, e.g. publications, diagnosis, and EMR data
      Learn more and test-drive it yourself: AnalyzeGenomes.com
  27. Keep in contact with us!
      Dr.-Ing. Matthieu-P. Schapranow
      Program Manager E-Health & Life Sciences
      Hasso Plattner Institute
      August-Bebel-Str. 88
      14482 Potsdam, Germany
      schapranow@hpi.de
      http://we.analyzegenomes.com/
