A Federated In-Memory Database Computing Platform Enabling Real-time Analysis of Big Medical Data
Dr.-Ing. Matthieu-P. Schapranow
Hasso Plattner Institute, Potsdam, Germany
May 17, 2017
■  Can we enable clinicians to make their therapy decisions:
□  Incorporating all available patient specifics,
□  Referencing latest lab results and worldwide medical knowledge, and
□  In an interactive manner during their ward round?
Our Motivation
Turn Precision Medicine Into Clinical Routine
Analyze Genomes:
A Federated In-
Memory Database
Computing Platform
Dr. Schapranow, Intel
Tech Talk at SAPPHIRE,
May 17, 2017
Our Vision
Medical Board Incorporating Latest Medical Knowledge
Project Time Line
[Figure: timeline from 2009 to 2017 with the milestones IMDB Research, SAP HANA launched, Enterprise Software, Oncolyzer, Analyze Genomes Platform, Drug Response Analysis, Medical Knowledge Cockpit, and SORMAS]
The Challenge
Distributed Heterogeneous Data Sources
■  Human genome/biological data
□  600 GB per full genome
□  15 PB+ in databases of leading institutes
■  Prescription data
□  1.5B records from 10,000 doctors and 10M patients (100 GB)
■  Clinical trials
□  Currently more than 30k recruiting on ClinicalTrials.gov
■  Human proteome
□  160M data points (2.4 GB) per sample
□  >3 TB raw proteome data in ProteomicsDB
■  PubMed database
□  >23M articles
■  Hospital information systems
□  Often more than 50 GB
■  Medical sensor data
□  Scan of a single organ in 1 s creates 10 GB of raw data
■  Cancer patient records
□  >160k records at NCT
■  Requirements
□  Managed services
□  Reproducibility
□  Real-time data analysis
■  Restrictions
□  Data privacy
□  Data locality
□  Volume of big medical data
Software Requirements in Life Sciences
http://stevedempsen.blogspot.de/2013/08/agile-software-requirements-comic.html
Our Approach: AnalyzeGenomes.com
In-Memory Computing Platform for Big Medical Data
In-Memory Database
■  Extensions for Life Sciences
□  Data Exchange, App Store
□  Access Control, Data Protection
□  Fair Use
□  Statistical Tools
□  Real-time Analysis
□  App-spanning User Profiles
■  Combined and Linked Data
□  Genome Data
□  Cellular Pathways
□  Genome Metadata
□  Research Publications
□  Pipeline and Analysis Models
□  Drugs and Interactions
■  Apps
□  Drug Response Analysis
□  Pathway Topology Analysis
□  Medical Knowledge Cockpit
□  Oncolyzer
□  Clinical Trial Recruitment
□  Cohort Analysis
□  ...
■  Indexed Sources
Our Technology
In-Memory Database Technology
■  Combined column and row store
■  Map/Reduce
■  Single and multi-tenancy
■  Lightweight compression
■  Insert only for time travel
■  Real-time replication
■  Working on integers
■  SQL interface on columns and rows
■  Active/passive data store
■  Minimal projections
■  Group key
■  Reduction of software layers
■  Dynamic multi-threading
■  Bulk load of data
■  Object-relational mapping
■  Text retrieval and extraction engine
■  No aggregate tables
■  Data partitioning
■  Any attribute as index
■  No disk
■  On-the-fly extensibility
■  Analytics on historical data
■  Multi-core/parallelization
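Two of the listed building blocks, "Working on integers" and "Insert only", can be illustrated with a dictionary-encoded column. The following is a minimal sketch, assuming a simple append-only column; the class name `DictionaryColumn` and the sample values are hypothetical and do not reflect the internals of any specific database product.

```python
from dataclasses import dataclass, field


@dataclass
class DictionaryColumn:
    """Dictionary-encoded column: each distinct value is stored once, rows hold integer IDs."""
    dictionary: list = field(default_factory=list)   # value ID -> value
    index: dict = field(default_factory=dict)        # value -> value ID
    value_ids: list = field(default_factory=list)    # one integer per row

    def append(self, value):
        """Insert-only write: existing rows are never modified."""
        if value not in self.index:
            self.index[value] = len(self.dictionary)
            self.dictionary.append(value)
        self.value_ids.append(self.index[value])

    def count_equal(self, value):
        """Scan by comparing integers instead of the original strings."""
        vid = self.index.get(value)
        if vid is None:
            return 0
        return sum(1 for v in self.value_ids if v == vid)


# Hypothetical sample data: a chromosome column of a variant table.
chromosome = DictionaryColumn()
for value in ["chr1", "chr2", "chr1", "chrX", "chr1"]:
    chromosome.append(value)
```

Repeated values cost one small integer per row, which is where the lightweight compression comes from in this sketch.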
Scheduling and Execution of
Genome Data Processing Pipelines
■  Components: In-Memory Database, Scheduler, Workers, Webservice
■  Tasks
   ID  Pipeline  Params
   12  BWA       xyz.fastq
   13  Stanford  A_1.fastq
   14  Bowtie    xyz.fastq
■  Subtasks
   Task  ID  Job     Status  Params
   12    97  Split   done    xyz.fastq
   12    98  Import  todo    abc.vcf
   12    98  Import  done    abc.vcf
   . . .
1.  Trigger task execution
2.  Schedule subtasks
3.  Execute subtasks
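The three steps (trigger task execution, schedule subtasks, execute subtasks) can be sketched as follows. This is an illustrative sketch only, not the platform's implementation: the classes `Subtask` and `Scheduler`, the in-process queue, and the start value of the subtask IDs are assumptions; the task and job names are taken from the tables on the slide.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Subtask:
    task_id: int
    subtask_id: int
    job: str
    params: str
    status: str = "todo"


class Scheduler:
    def __init__(self):
        self.tasks = {}        # task ID -> (pipeline, params), mirrors the Tasks table
        self.subtasks = []     # all subtasks, mirrors the Subtasks table
        self.queue = deque()   # subtasks waiting for a worker
        self._next_id = 97     # arbitrary start value (assumption)

    def trigger(self, task_id, pipeline, params, jobs):
        """Steps 1 and 2: register the task and schedule one subtask per job."""
        self.tasks[task_id] = (pipeline, params)
        for job in jobs:
            sub = Subtask(task_id, self._next_id, job, params)
            self._next_id += 1
            self.subtasks.append(sub)
            self.queue.append(sub)

    def run_worker(self, execute):
        """Step 3: a worker drains the queue and marks each subtask as done."""
        while self.queue:
            sub = self.queue.popleft()
            execute(sub)
            sub.status = "done"


log = []
scheduler = Scheduler()
scheduler.trigger(12, "BWA", "xyz.fastq", jobs=["Split", "Import"])
scheduler.run_worker(lambda sub: log.append(sub.job))
```

Keeping the status on each subtask rather than on the task makes it possible to resume a partially executed task, which is one plausible reason for the two-table layout.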
Managed Services provided by
Federated In-Memory Database System (FIMDB)
[Figure: FIMDB deployment across two sites]
■  Sites
□  Cloud Service Provider (Shared Algorithms and Public Reference Data)
□  Hospital or Research Department (Sensitive/Patient Data)
■  Nodes i, j, k, m, n, ...: each runs several workers and an IMDB
■  Scheduler
■  Relay
■  Network: VPN, UDP, TCP
■  Storage: two Shared File Systems (Pool) and one Shared File System (Global)
■  Not standardized
■  Not exchangeable
■  Concatenation of bash scripts reading from and writing to files
■  Requires IT expertise for
□  Setup
□  Error handling, and
□  Efficient processing and parallelization
■  Objective: Model, configure, and execute pipelines without involving IT experts
Genome Data Processing Pipelines
State of the Art
bwa aln ref.fa sample.fastq | bwa samse ref.fa - sample.fastq | samtools view -Su - | samtools sort …
■  Graphical modeling notation
■  Compliant with BPMN 2.0 extended by
□  Modular structure
□  Degree of parallelization
□  Parameters and variables
■  Model descriptions (XPDL) are stored in IMDB
■  Model instances are transformed into graph structure
executed by our worker framework
Genome Data Processing Pipelines
Standardized Modeling
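How a model instance might be executed as a graph with a configurable degree of parallelization can be sketched with the Python standard library (`graphlib`, Python 3.9 or later). The job names, the `run_pipeline` helper, and the `parallelism` parameter are hypothetical; the sketch does not parse XPDL and is not the worker framework described above.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def run_pipeline(dependencies, run_job, parallelism=2):
    """Execute jobs in dependency order, running independent jobs concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    finished = []
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        while sorter.is_active():
            # Jobs whose predecessors are all done; sorted for a stable order.
            ready = sorted(sorter.get_ready())
            for job, _ in zip(ready, pool.map(run_job, ready)):
                finished.append(job)
                sorter.done(job)
    return finished


# Hypothetical model instance: job -> set of jobs it depends on.
model = {
    "split": set(),
    "align_part_1": {"split"},
    "align_part_2": {"split"},
    "merge": {"align_part_1", "align_part_2"},
    "import": {"merge"},
}
order = run_pipeline(model, run_job=lambda job: job.upper())
```

The two alignment jobs have no dependency on each other, so they are submitted to the pool in the same round.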
Genome Data Processing Pipelines
XML Process Definition Language
PIPELINES.MODELS
Database Structure
PIPELINES.PIPELINES
■  Results are imported into IMDB
■  Optimization reduced execution time by >50%
Genome Data Processing Pipelines
Traditional vs. Optimized Approach
Reproducibility
Modeling of Data Analysis Pipelines
1.  Design time (researcher, process expert)
□  Definition of parameterized process model
□  Uses graphical editor and jobs from repository
2.  Configuration time (researcher, lab assistant)
□  Select model and specify parameters, e.g. aln opts
□  Results in model instance stored in repository
3.  Execution time (researcher)
□  Select model instance
□  Specify execution parameters, e.g. input files
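The separation into design, configuration, and execution time can be sketched with three small building blocks. All names here (`ProcessModel`, `ModelInstance`, `execute`) and the sample parameter value are hypothetical; the sketch only illustrates why a stored, fully configured instance makes a run repeatable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessModel:
    """Design time: a parameterized model built from repository jobs."""
    name: str
    jobs: tuple
    parameters: tuple   # names of parameters that must be configured


@dataclass(frozen=True)
class ModelInstance:
    """Configuration time: a model with all parameters bound."""
    model: ProcessModel
    values: dict

    def __post_init__(self):
        missing = set(self.model.parameters) - set(self.values)
        if missing:
            raise ValueError(f"unbound parameters: {sorted(missing)}")


def execute(instance, input_files):
    """Execution time: combine a stored instance with run-specific inputs."""
    return [
        {"job": job, "input": path, **instance.values}
        for path in input_files
        for job in instance.model.jobs
    ]


model = ProcessModel("alignment", jobs=("align", "sort"), parameters=("aln_opts",))
instance = ModelInstance(model, {"aln_opts": "-t 4"})   # sample value, assumption
runs = execute(instance, ["sample.fastq"])
```

Because the instance is immutable and checked at configuration time, two executions with the same input files produce the same list of job invocations.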
■  Query-oriented search interface
■  Seamless integration of patient specifics, e.g. from EMR
■  Parallel search in international knowledge bases, e.g. for biomarkers, literature, cellular pathways, and clinical trials
App Example:
Medical Knowledge Cockpit for Patients and Clinicians
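The parallel search across several knowledge bases can be sketched with a thread pool. The in-memory `KNOWLEDGE_BASES` dictionary and its entries are invented placeholders that stand in for remote sources; the function names are hypothetical as well.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in knowledge bases with invented entries; real sources would be remote services.
KNOWLEDGE_BASES = {
    "literature": ["BRAF inhibitors in melanoma", "KRAS signalling review"],
    "pathways": ["MAPK signalling (BRAF, KRAS)"],
    "clinical_trials": ["Trial of a BRAF inhibitor"],
}


def search_one(source, query):
    """Case-insensitive substring match within a single source."""
    hits = [entry for entry in KNOWLEDGE_BASES[source] if query.lower() in entry.lower()]
    return source, hits


def search_all(query):
    """Query every source concurrently and collect the hits per source."""
    with ThreadPoolExecutor(max_workers=len(KNOWLEDGE_BASES)) as pool:
        futures = [pool.submit(search_one, source, query) for source in KNOWLEDGE_BASES]
        return dict(future.result() for future in futures)


results = search_all("braf")
```

With real remote sources the total latency of this pattern is bounded by the slowest source rather than by the sum of all sources.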
Medical Knowledge Cockpit for Patients and Clinicians
Pathway Topology Analysis
■  Search in pathways is limited to “is a certain
element contained” today
■  Integrated >1.5k pathways from international
sources, e.g. KEGG, HumanCyc, and WikiPathways,
into HANA
■  Implemented graph-based topology exploration and
ranking based on patient specifics
■  Enables interactive identification of possible
dysfunctions affecting the course of a therapy
before its start
Unified access to multiple formerly
disjoint data sources
Pathway analysis of genetic
variants with graph engine
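Going beyond a pure containment check, a graph-based exploration can ask which pathway elements lie downstream of a patient's variants and rank pathways accordingly. The following is a minimal sketch with invented toy pathways; the ranking criterion (number of downstream elements) is an assumption chosen for illustration, not the ranking used in the app.

```python
from collections import deque

# Toy pathways as directed graphs: element -> downstream elements (invented data).
PATHWAYS = {
    "pathway_a": {"G1": ["G2"], "G2": ["G3", "G4"], "G3": [], "G4": []},
    "pathway_b": {"G5": ["G1"], "G1": ["G6"], "G6": []},
    "pathway_c": {"G7": ["G8"], "G8": []},
}


def downstream(graph, start):
    """Breadth-first search for all elements reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def rank_pathways(variant_genes):
    """Rank pathways by how many elements lie downstream of the given variants."""
    scores = {}
    for name, graph in PATHWAYS.items():
        affected = set()
        for gene in variant_genes:
            if gene in graph:
                affected |= downstream(graph, gene)
        if affected:
            scores[name] = len(affected)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_pathways({"G1"})
```

Both `pathway_a` and `pathway_b` contain `G1`, so a containment search would treat them alike; the topology-aware ranking separates them by the size of the affected downstream region.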
■  Interactively explore relevant publications, e.g. PDFs
■  Improved ease of exploration, e.g. by highlighted medical terms and relevant
concepts
Medical Knowledge Cockpit for Patients and Clinicians
Publications
App Example:
Real-time Assessment of Clinical Trial Candidates
■  Supports trial design and recruitment process through
statistical data analysis
■  Real-time matching and clustering of patients and
clinical trial inclusion/exclusion criteria
■  Reassessment of already screened or participating
citizens to reduce recruitment costs
■  Integrates smoothly with the
Real-time assessment of
clinical trial candidates
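Matching patients against inclusion and exclusion criteria can be sketched as a set of predicates per trial. The patients, attributes, and criteria below are invented for illustration, and the `matches` helper is hypothetical; clustering and statistical analysis are out of scope for this sketch.

```python
def matches(patient, trial):
    """A patient qualifies if all inclusion and no exclusion criteria hold."""
    included = all(rule(patient) for rule in trial["inclusion"])
    excluded = any(rule(patient) for rule in trial["exclusion"])
    return included and not excluded


# Invented patients and criteria, for illustration only.
patients = [
    {"id": "P1", "age": 54, "diagnosis": "melanoma", "prior_chemo": False},
    {"id": "P2", "age": 71, "diagnosis": "melanoma", "prior_chemo": True},
    {"id": "P3", "age": 45, "diagnosis": "lymphoma", "prior_chemo": False},
]
trial = {
    "inclusion": [lambda p: p["diagnosis"] == "melanoma", lambda p: 18 <= p["age"] <= 75],
    "exclusion": [lambda p: p["prior_chemo"]],
}
candidates = [p["id"] for p in patients if matches(p, trial)]
```

Because the criteria are plain predicates, already screened patients can be reassessed against a new trial by calling `matches` again with a different `trial` dictionary.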
■  Online: Visit we.analyzegenomes.com for latest research
results, slides, videos, tools, and publications
■  Offline: High-Performance In-Memory Genome Data Analysis:
In-Memory Data Management Research, Springer,
ISBN: 978-3-319-03034-0, 2014
■  In Person: Visit us at the HPI booth 200!
■  Join us for Intel Tech Talks at SAPPHIRE booth 669!
□  May 17, 1:00 pm: A Federated In-Memory Database Computing Platform Enabling
Real-time Analysis of Big Medical Data
□  May 18, 3:00 pm: In-Memory Apps For Precision Medicine
Where to find additional information?
Keep in contact with us!
Dr. Matthieu-P. Schapranow
Program Manager E-Health & Life Sciences
Hasso Plattner Institute
August-Bebel-Str. 88
14482 Potsdam, Germany
schapranow@hpi.de
http://we.analyzegenomes.com/

Scanning the Internet for External Cloud Exposures via SSL Certs
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

Federated In-Memory Platform for Analyzing Big Medical Data

  • 1. A Federated In-Memory Database Computing Platform Enabling Real-time Analysis of Big Medical Data
      Dr.-Ing. Matthieu-P. Schapranow, Hasso Plattner Institute, Potsdam, Germany
      May 17, 2017
  • 2. Our Motivation: Turn Precision Medicine Into Clinical Routine
      ■ Can we enable clinicians to make their therapy decisions:
        □ Incorporating all available patient specifics,
        □ Referencing latest lab results and worldwide medical knowledge, and
        □ In an interactive manner during their ward round?
      Dr. Schapranow, Intel Tech Talk at SAPPHIRE, May 17, 2017. Analyze Genomes: A Federated In-Memory Database Computing Platform
  • 4. Our Vision: Medical Board Incorporating Latest Medical Knowledge
  • 5. Project Timeline
      Timeline figure, 2009 to 2017. Labeled milestones: SAP HANA launched, Oncolyzer, SORMAS, Drug Response Analysis, Enterprise Software, Medical Knowledge Cockpit, Analyze Genomes Platform, IMDB Research
  • 6. The Challenge: Distributed Heterogeneous Data Sources
      ■ Human genome/biological data: 600 GB per full genome; 15 PB+ in databases of leading institutes
      ■ Prescription data: 1.5B records from 10,000 doctors and 10M patients (100 GB)
      ■ Clinical trials: currently more than 30k recruiting on ClinicalTrials.gov
      ■ Human proteome: 160M data points (2.4 GB) per sample; >3 TB raw proteome data in ProteomicsDB
      ■ PubMed database: >23M articles
      ■ Hospital information systems: often more than 50 GB
      ■ Medical sensor data: scan of a single organ in 1 s creates 10 GB of raw data
      ■ Cancer patient records: >160k records at NCT
  • 7. Software Requirements in Life Sciences
      ■ Requirements
        □ Managed services
        □ Reproducibility
        □ Real-time data analysis
      ■ Restrictions
        □ Data privacy
        □ Data locality
        □ Volume of big medical data
      Image source: http://stevedempsen.blogspot.de/2013/08/agile-software-requirements-comic.html
  • 8. Our Approach: AnalyzeGenomes.com, an In-Memory Computing Platform for Big Medical Data
      ■ In-Memory Database
  • 9. Our Approach: AnalyzeGenomes.com, an In-Memory Computing Platform for Big Medical Data
      ■ Combined and Linked Data (Indexed Sources): Genome Data, Cellular Pathways, Genome Metadata, Research Publications, Pipeline and Analysis Models, Drugs and Interactions
      ■ In-Memory Database
  • 10. Our Approach: AnalyzeGenomes.com, an In-Memory Computing Platform for Big Medical Data
      ■ Extensions for Life Sciences: Data Exchange, App Store; Access Control, Data Protection; Fair Use; Statistical Tools; Real-time Analysis; App-spanning User Profiles
      ■ Combined and Linked Data (Indexed Sources): Genome Data, Cellular Pathways, Genome Metadata, Research Publications, Pipeline and Analysis Models, Drugs and Interactions
      ■ In-Memory Database
  • 11. Our Approach: AnalyzeGenomes.com, an In-Memory Computing Platform for Big Medical Data
      ■ Apps: Drug Response Analysis, Pathway Topology Analysis, Medical Knowledge Cockpit, Oncolyzer, Clinical Trial Recruitment, Cohort Analysis, ...
      ■ Extensions for Life Sciences: Data Exchange, App Store; Access Control, Data Protection; Fair Use; Statistical Tools; Real-time Analysis; App-spanning User Profiles
      ■ Combined and Linked Data (Indexed Sources): Genome Data, Cellular Pathways, Genome Metadata, Research Publications, Pipeline and Analysis Models, Drugs and Interactions
      ■ In-Memory Database
  • 12. Our Technology: In-Memory Database Technology
      ■ Combined column and row store
      ■ Map/Reduce
      ■ Single and multi-tenancy
      ■ Lightweight compression
      ■ Insert only for time travel
      ■ Real-time replication
      ■ Working on integers
      ■ SQL interface on columns and rows
      ■ Active/passive data store
      ■ Minimal projections
      ■ Group key
      ■ Reduction of software layers
      ■ Dynamic multi-threading
      ■ Bulk load of data
      ■ Object-relational mapping
      ■ Text retrieval and extraction engine
      ■ No aggregate tables
      ■ Data partitioning
      ■ Any attribute as index
      ■ No disk
      ■ On-the-fly extensibility
      ■ Analytics on historical data
      ■ Multi-core/parallelization
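Two of the building blocks named on slide 12, "Lightweight compression" together with "Working on integers" and "Insert only for time travel", can be illustrated with a small sketch. This is not how SAP HANA or the Analyze Genomes platform implements them; `dictionary_encode`, `InsertOnlyTable`, and all sample values are hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass, field


def dictionary_encode(values):
    """Map each distinct value to a small integer code (dictionary encoding).

    Returns the sorted dictionary and one integer code per input row, so
    scans and comparisons can work on integers instead of strings.
    """
    dictionary = sorted(set(values))
    position = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [position[value] for value in values]


@dataclass
class InsertOnlyTable:
    """Append-only store: an update adds a new version and never overwrites."""

    versions: list = field(default_factory=list)  # (valid_from, key, value)

    def put(self, valid_from, key, value):
        self.versions.append((valid_from, key, value))

    def get(self, key, as_of):
        """Return the newest value for `key` with valid_from <= as_of, else None."""
        candidates = [(t, v) for t, k, v in self.versions if k == key and t <= as_of]
        if not candidates:
            return None
        return max(candidates, key=lambda candidate: candidate[0])[1]


# Hypothetical column of repeated string values.
column = ["chr1", "chr7", "chr1", "chr17", "chr7"]
dictionary, codes = dictionary_encode(column)

# Hypothetical status history of one record; older versions stay queryable.
table = InsertOnlyTable()
table.put(valid_from=1, key="subtask-98", value="todo")
table.put(valid_from=5, key="subtask-98", value="done")
status_at_3 = table.get("subtask-98", as_of=3)
```

Because old versions are never deleted, a query "as of" an earlier point in time still sees the value that was valid then, which is the idea behind analytics on historical data.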
  • 13. Scheduling and Execution of Genome Data Processing Pipelines
      ■ Components: Webservice, Scheduler, Workers, In-Memory Database
      ■ Tasks table (ID, Pipeline, Params): (12, BWA, xyz.fastq); (13, Stanford, A_1.fastq); (14, Bowtie, xyz.fastq)
      ■ Subtasks table (Task, ID, Job, Status, Params): (12, 97, Split, done, xyz.fastq); (12, 98, Import, todo, abc.vcf); (12, 98, Import, done, abc.vcf)
      ■ Steps:
        1. Trigger task execution
        2. Schedule subtasks
        3. Execute subtasks
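The three steps on slide 13 (trigger a task, schedule its subtasks, let workers execute them) can be sketched with two tables that mirror the slide's Tasks and Subtasks tables. This is a minimal single-process sketch under stated assumptions: Python's built-in `sqlite3` in-memory database stands in for the IMDB, and the function names and handler mapping are hypothetical, not the platform's API.

```python
import sqlite3


def create_schema(db):
    """Create tables shaped like the Tasks and Subtasks tables on the slide."""
    db.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, pipeline TEXT, params TEXT)")
    db.execute(
        "CREATE TABLE subtasks (id INTEGER PRIMARY KEY, task INTEGER, "
        "job TEXT, status TEXT, params TEXT)"
    )


def trigger_task(db, task_id, pipeline, params):
    """Step 1: a web service call registers a task."""
    db.execute("INSERT INTO tasks VALUES (?, ?, ?)", (task_id, pipeline, params))


def schedule_subtasks(db, task_id, jobs):
    """Step 2: the scheduler stores one 'todo' subtask per (job, params) pair."""
    for job, params in jobs:
        db.execute(
            "INSERT INTO subtasks (task, job, status, params) VALUES (?, ?, 'todo', ?)",
            (task_id, job, params),
        )


def run_worker(db, handlers):
    """Step 3: a worker drains 'todo' subtasks in id order and marks them 'done'."""
    executed = []
    while True:
        row = db.execute(
            "SELECT id, job, params FROM subtasks "
            "WHERE status = 'todo' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return executed
        subtask_id, job, params = row
        handlers[job](params)  # run the job-specific code
        db.execute("UPDATE subtasks SET status = 'done' WHERE id = ?", (subtask_id,))
        executed.append(job)


db = sqlite3.connect(":memory:")
create_schema(db)
trigger_task(db, 12, "BWA", "xyz.fastq")
schedule_subtasks(db, 12, [("Split", "xyz.fastq"), ("Import", "abc.vcf")])
processed = []
run_worker(db, {"Split": processed.append, "Import": processed.append})
```

Keeping the subtask status in the database rather than in worker memory means any worker can pick up the next open subtask, which is one plausible reason for the table-based design shown on the slide.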
  • 14. Managed Services provided by Federated In-Memory Database System (FIMDB)
      ■ Cloud Service Provider (Shared Algorithms and Public Reference Data)
      ■ Hospital or Research Department (Sensitive/Patient Data)
      ■ Nodes i, j, k, m, n, ...: each with Workers and an IMDB
      ■ Scheduler, Relay
      ■ Connections: VPN, UDP, TCP
      ■ Shared File System (Pool), Shared File System (Global)
  • 15. Genome Data Processing Pipelines: State of the Art
      ■ Not standardized
      ■ Not exchangeable
      ■ Concatenation of bash scripts reading from and writing to files
      ■ Requires IT expertise for
        □ Setup
        □ Error handling, and
        □ Efficient processing and parallelization
      ■ Objective: Model, configure, and execute pipelines without involving IT experts
      Example: bwa aln ref.fa sample.fastq | bwa samse ref.fa - sample.fastq | samtools view -Su - | samtools sort …
  • 16. Genome Data Processing Pipelines: Standardized Modeling
      ■ Graphical modeling notation
      ■ Compliant with BPMN 2.0 extended by
        □ Modular structure
        □ Degree of parallelization
        □ Parameters and variables
      ■ Model descriptions (XPDL) are stored in IMDB
      ■ Model instances are transformed into graph structure executed by our worker framework
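Slide 16 states that model instances are transformed into a graph structure that the worker framework executes. The sketch below shows only that last idea: running the steps of a dependency graph in order, with independent steps of one stage dispatched in parallel. It does not parse BPMN or XPDL; the step names, the `execute_pipeline` function, and the `max_parallel` parameter are assumptions for illustration. It relies on `graphlib`, available in the standard library from Python 3.9.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def execute_pipeline(dependencies, actions, max_parallel=2):
    """Run every step after all of its predecessors have finished.

    `dependencies` maps a step name to the set of steps it depends on.
    Steps that become ready together are run as one batch on a thread pool.
    Returns the list of batches in execution order.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())
            # Wait for the whole batch before releasing dependent steps.
            list(pool.map(lambda step: actions[step](), ready))
            for step in ready:
                sorter.done(step)
            batches.append(ready)
    return batches


# Hypothetical pipeline: split the input, align two chunks, merge, import.
dependencies = {
    "align_1": {"split"},
    "align_2": {"split"},
    "merge": {"align_1", "align_2"},
    "import": {"merge"},
}
actions = {step: (lambda: None) for step in ["split", "align_1", "align_2", "merge", "import"]}
batches = execute_pipeline(dependencies, actions)
# batches == [["split"], ["align_1", "align_2"], ["merge"], ["import"]]
```

The two alignment steps land in the same batch because neither depends on the other, which corresponds to the "degree of parallelization" the modeling notation is extended with.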
  • 17. Genome Data Processing Pipelines: XML Process Definition Language
  • 18. Database Structure
      ■ Tables shown: PIPELINES.MODELS, PIPELINES.PIPELINES
  • 19. Genome Data Processing Pipelines: Traditional vs. Optimized Approach
      ■ Results are imported into IMDB
      ■ Optimization reduced execution time by >50%
  • 20. Reproducibility: Modeling of Data Analysis Pipelines
      1. Design time (researcher, process expert)
        □ Definition of parameterized process model
        □ Uses graphical editor and jobs from repository
      2. Configuration time (researcher, lab assistant)
        □ Select model and specify parameters, e.g. aln opts
        □ Results in model instance stored in repository
      3. Execution time (researcher)
        □ Select model instance
        □ Specify execution parameters, e.g. input files
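The three phases on slide 20 can be mirrored in a small data model: a parameterized process model (design time), a model instance with fixed parameters (configuration time), and execution parameters supplied per run (execution time). The class names, the single-command template, and the parameter names below are hypothetical; the platform stores graphical models, not command strings, so this is only a sketch of the separation of concerns.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class ModelInstance:
    """Configuration-time result: a model with its parameters fixed."""

    model_name: str
    command_template: str
    parameters: tuple  # sorted (name, value) pairs, immutable once stored

    def command_for(self, **execution_parameters):
        """Execution time: add per-run values such as the input file."""
        values = dict(self.parameters)
        values.update(execution_parameters)
        return Template(self.command_template).substitute(values)


@dataclass(frozen=True)
class ProcessModel:
    """Design-time result: a parameterized process description."""

    name: str
    command_template: str

    def configure(self, **parameters):
        """Configuration time: fix the model parameters, e.g. aligner options."""
        return ModelInstance(
            self.name, self.command_template, tuple(sorted(parameters.items()))
        )


# Hypothetical one-step model with placeholders for all three kinds of values.
model = ProcessModel("alignment", "bwa aln $aln_opts $reference $input_file")
instance = model.configure(aln_opts="-t 4", reference="ref.fa")
command = instance.command_for(input_file="sample.fastq")
# command == "bwa aln -t 4 ref.fa sample.fastq"
```

Because the instance is immutable, rerunning it with the same input file always yields the same command, which is the reproducibility property the slide is about.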
  • 21. App Example: Medical Knowledge Cockpit for Patients and Clinicians
      ■ Query-oriented search interface
      ■ Seamless integration of patient specifics, e.g. from EMR
      ■ Parallel search in international knowledge bases, e.g. for biomarkers, literature, cellular pathways, and clinical trials
  • 22. Medical Knowledge Cockpit for Patients and Clinicians: Pathway Topology Analysis
      ■ Search in pathways is limited to “is a certain element contained” today
      ■ Integrated >1.5k pathways from international sources, e.g. KEGG, HumanCyc, and WikiPathways, into HANA
      ■ Implemented graph-based topology exploration and ranking based on patient specifics
      ■ Enables interactive identification of possible dysfunctions affecting the course of a therapy before its start
      Figure captions: Unified access to multiple formerly disjoint data sources; Pathway analysis of genetic variants with graph engine
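The difference between a containment search and the graph-based exploration on slide 22 can be shown on toy data. The slide does not say how the ranking is computed, so the score used here, the number of pathway elements downstream of a patient's affected genes, is an assumption, as are the function names and the placeholder pathways and element identifiers.

```python
from collections import deque


def downstream(edges, start):
    """Return all nodes reachable from `start` by following directed edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in edges.get(node, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def rank_pathways(pathways, affected_genes):
    """Rank pathways by how many elements lie downstream of affected genes.

    `pathways` maps a pathway name to an adjacency dict (node -> targets).
    Returns (name, score) pairs, highest score first, ties broken by name.
    """
    scores = {}
    for name, edges in pathways.items():
        nodes = set(edges) | {t for targets in edges.values() for t in targets}
        impacted = set()
        for gene in affected_genes & nodes:
            impacted |= downstream(edges, gene)
        scores[name] = len(impacted)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Placeholder pathways: both contain G1, but only in the first one does
# G1 sit upstream of other elements.
pathways = {
    "pathway_1": {"G1": ["G2", "G3"], "G2": ["G4"]},
    "pathway_2": {"G5": ["G1"]},
}
ranking = rank_pathways(pathways, affected_genes={"G1"})
# ranking == [("pathway_1", 3), ("pathway_2", 0)]
```

A pure "is G1 contained" search would return both pathways as equal hits; taking the topology into account separates the pathway where the affected element can influence others from the one where it is only an endpoint.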
  • 23. Medical Knowledge Cockpit for Patients and Clinicians: Publications
      ■ Interactively explore relevant publications, e.g. PDFs
      ■ Improved ease of exploration, e.g. by highlighted medical terms and relevant concepts
  • 24. App Example: Real-time Assessment of Clinical Trial Candidates
      ■ Supports trial design and recruitment process through statistical data analysis
      ■ Real-time matching and clustering of patients and clinical trial inclusion/exclusion criteria
      ■ Reassessment of already screened or participating citizens to reduce recruitment costs
      ■ Integrates smoothly with the …
      Figure caption: Real-time assessment of clinical trial candidates
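Matching patients against inclusion and exclusion criteria, as named on slide 24, reduces to a simple rule: a patient is a candidate if every inclusion criterion holds and no exclusion criterion does. The sketch below shows only this rule, not the clustering or the statistical analysis; the patient fields, trial identifier, and criteria are invented for illustration.

```python
def is_eligible(patient, inclusion, exclusion):
    """True if the patient meets all inclusion and no exclusion criteria."""
    meets_all = all(rule(patient) for rule in inclusion)
    hits_any = any(rule(patient) for rule in exclusion)
    return meets_all and not hits_any


def match_candidates(patients, trials):
    """Map each trial id to the ids of eligible patients.

    `trials` maps a trial id to a pair (inclusion rules, exclusion rules),
    where each rule is a function taking a patient dict and returning a bool.
    """
    return {
        trial_id: [p["id"] for p in patients if is_eligible(p, inclusion, exclusion)]
        for trial_id, (inclusion, exclusion) in trials.items()
    }


# Invented patient records and one invented trial.
patients = [
    {"id": "P1", "age": 54, "diagnosis": "lung cancer", "pregnant": False},
    {"id": "P2", "age": 17, "diagnosis": "lung cancer", "pregnant": False},
    {"id": "P3", "age": 61, "diagnosis": "lung cancer", "pregnant": True},
]
trials = {
    "TRIAL-A": (
        [lambda p: p["age"] >= 18, lambda p: p["diagnosis"] == "lung cancer"],
        [lambda p: p["pregnant"]],
    )
}
candidates = match_candidates(patients, trials)
# candidates == {"TRIAL-A": ["P1"]}
```

Since the rules are plain functions over the current patient record, rerunning `match_candidates` after a record changes also covers the reassessment of already screened patients mentioned on the slide.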
  • 25. Where to find additional information?
      ■ Online: Visit we.analyzegenomes.com for latest research results, slides, videos, tools, and publications
      ■ Offline: High-Performance In-Memory Genome Data Analysis: In-Memory Data Management Research, Springer, ISBN: 978-3-319-03034-0, 2014
      ■ In Person: Visit us at the HPI booth 200!
      ■ Join us for Intel Tech Talks at SAPPHIRE booth 669!
        □ May 17, 1:00 pm: A Federated In-Memory Database Computing Platform Enabling Real-time Analysis of Big Medical Data
        □ May 18, 3:00 pm: In-Memory Apps For Precision Medicine
  • 26. Keep in contact with us!
      Dr. Matthieu-P. Schapranow, Program Manager E-Health & Life Sciences
      Hasso Plattner Institute, August-Bebel-Str. 88, 14482 Potsdam, Germany
      schapranow@hpi.de
      http://we.analyzegenomes.com/