Turning Big Data into Precision Medicine
1. Turning Big Data into Precision Medicine: Real-life Experiences
Dr. Matthieu-P. Schapranow
Festival of Genomics, Boston, MA
June 24, 2015
2. Important things first: Where do you find additional information?
■ Online: Visit we.analyzegenomes.com for latest research results, tools, and news
■ Offline: Read more about it, e.g. High-Performance In-Memory Genome Data Analysis: How In-Memory Database Technology Accelerates Personalized Medicine, In-Memory Data Management Research, Springer, ISBN: 978-3-319-03034-0, 2014
■ In Person: Join us for “Big Data in Medicine” July 1-2, 2015 in Potsdam, Germany
Schapranow, Festival of
Genomics, Boston, MA,
June 24, 2015
Turning Big Data into
Precision Medicine
3. What is the Hasso Plattner Institute, Potsdam, Germany?
4. Who are you dealing with?
■ Since 2009 Program Manager E-Health & Life Sciences
■ 2006-2014 Strategic Projects SAP HANA
■ Visiting Scientist at Charité, Berlin and V.A., Boston, MA
■ Software Engineer by training (PhD, M.Sc., B.Sc.)
5. The Setting: Actors in Oncology
■ Patients
□ Individual anamnesis, family history, and background
□ Require fast access to individualized therapy
■ Clinicians
□ Identify root and extent of disease using laboratory tests
□ Evaluate therapy alternatives, adapt existing therapy
■ Researchers
□ Conduct laboratory work, e.g. analyze patient samples
□ Create new research findings and come up with treatment alternatives
6. IT Challenges
Distributed Heterogeneous Data Sources
■ Human genome/biological data: 600GB per full genome, 15PB+ in databases of leading institutes
■ Prescription data: 1.5B records from 10,000 doctors and 10M patients (100 GB)
■ Clinical trials: currently more than 30k recruiting on ClinicalTrials.gov
■ Human proteome: 160M data points (2.4GB) per sample, >3TB raw proteome data in ProteomicsDB
■ PubMed database: >23M articles
■ Hospital information systems: often more than 50GB
■ Medical sensor data: scan of a single organ in 1s creates 10GB of raw data
■ Cancer patient records: >160k records at NCT
7. Our Approach
Analyze Genomes: Real-time Analysis of Big Medical Data
[Figure: platform architecture diagram. Labels recovered from the diagram:]
■ Drug Response Analysis, Pathway Topology Analysis, Medical Knowledge Cockpit, Oncolyzer, Clinical Trial Assessment, Cohort Analysis, ...
■ Extensions for Life Sciences: Data Exchange, App Store; Access Control, Data Protection; Fair Use; Statistical Tools; Real-time Analysis; App-spanning User Profiles
■ Combined and Linked Data: Genome Data, Cellular Pathways, Genome Metadata, Research Publications, Pipeline and Analysis Models, Drugs and Interactions
■ In-Memory Database
8. Case Vignette I
■ Patient: 48 years, female, non-smoker, smoke-free environment
■ Diagnosis: Non-Small Cell Lung Cancer (NSCLC), stage IV
■ Markers: KRAS, EGFR, BRAF, NRAS, (ERBB2)
■ Initial treatment: Surgery
■ Therapy: Palliative chemotherapy
Medical
Knowledge Cockpit
9. Medical Knowledge Cockpit for Patients and Clinicians
Linking Patient Specifics with International Knowledge
■ Query-oriented search interface
■ Seamless integration of patient specifics, e.g. from EMR
■ Parallel search in international knowledge bases, e.g. for biomarkers, literature, cellular pathway, and clinical trials
10. Medical Knowledge Cockpit for Patients and Clinicians
■ Search for affected genes in distributed and
heterogeneous data sources
■ Immediate exploration of relevant information, such as
□ Gene descriptions,
□ Molecular impact and related pathways,
□ Scientific publications, and
□ Suitable clinical trials.
■ No manual searching for hours or days:
In-memory technology translates searching into
interactive finding!
Automatic clinical trial matching built on text analysis features
Unified access to structured and unstructured data sources
11. Medical Knowledge Cockpit for Patients and Clinicians
Pathway Topology Analysis
■ Search in pathways is limited to “is a certain
element contained” today
■ Integrated >1.5k pathways from international
sources, e.g. KEGG, HumanCyc, and WikiPathways,
into HANA
■ Implemented graph-based topology exploration and
ranking based on patient specifics
■ Enables interactive identification of possible
dysfunctions affecting the course of a therapy
before its start
Unified access to multiple formerly
disjoint data sources
Pathway analysis of genetic
variants with graph engine
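The graph-based exploration and ranking described on this slide can be illustrated with a small sketch. It is not the HANA graph engine implementation: the two toy pathways, their edges, the function names, and the scoring rule (number of affected genes in a pathway, then number of genes downstream of them) are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy pathways as adjacency lists (gene -> downstream genes).
# The edges are illustrative only and are not taken from KEGG, HumanCyc,
# or WikiPathways.
pathways = {
    "toy_pathway_A": {"EGFR": ["KRAS"], "KRAS": ["BRAF"], "BRAF": ["MAP2K1"], "MAP2K1": []},
    "toy_pathway_B": {"TP53": ["CDKN1A"], "CDKN1A": []},
}


def downstream(graph, start):
    """Breadth-first search: all genes reachable from `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def rank_pathways(pathways, affected_genes):
    """Rank pathways by (affected genes contained, genes downstream of them)."""
    scores = []
    for name, graph in pathways.items():
        hits = [gene for gene in affected_genes if gene in graph]
        if not hits:
            continue  # pathway does not contain any affected gene
        reach = set()
        for gene in hits:
            reach |= downstream(graph, gene)
        reach -= set(hits)  # count only genes that are not affected themselves
        scores.append((name, len(hits), len(reach)))
    scores.sort(key=lambda s: (s[1], s[2]), reverse=True)
    return scores


ranking = rank_pathways(pathways, ["KRAS", "EGFR", "TP53"])
print(ranking)
```

Unlike a plain "is a certain element contained" check, the second score component uses the topology: a variant near the top of a signaling chain reaches more downstream elements than one at the end.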
12. Case Vignette II
■ Patient: 67 years, male, smoker, frequently consumes alcohol
■ Diagnosis: Squamous cell carcinoma of the oropharynx, T2N2bM0, stage IVa
■ Initial treatment: Surgery
■ After one year: Relapse with multiple metastatic nodules in the lung
■ Therapy: Palliative chemotherapy
Drug Response
Analysis
13. Real-time Data Analysis and Interactive Exploration
Drug Response Analysis
Data Sources
■ Patient-specific Data: smoking status, tumor classification, and age (1MB - 100MB)
■ Tumor-specific Data: raw DNA data and genetic variants (100MB - 1TB)
■ Compound Interaction Data: medication efficiency and wet lab results (10MB - 1GB)
16. Cetuximab might be more beneficial for the current case
18. Our Methodology
Design Thinking
Desirability
■ Portfolio of integrated services for clinicians, researchers, and patients
■ Include latest treatment options, e.g. most effective therapies
Viability
■ Enable precision medicine also in far-off
regions and developing countries
■ Involve worldwide experts (cost-saving)
■ Combine latest international data
(publications, annotations, genome data)
Feasibility
■ HiSeq 2500 enables high-coverage
whole genome sequencing in 20h
■ IMDB enables allele frequency
determination of 12B records within <1s
■ Cloud-based data processing services
reduce TCO
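The slide does not spell out how allele frequency is determined. As a minimal sketch, assuming diploid samples and one record per sample and variant, the usual definition is the number of alternate alleles divided by the total number of alleles observed. The record layout and the function name below are hypothetical and say nothing about how the in-memory database stores or scans its 12B records.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Compute alternate-allele frequency per variant.

    genotypes: iterable of (variant_id, alt_allele_count) records, where
    alt_allele_count is 0, 1, or 2 for one diploid sample (assumption).
    """
    alt = Counter()
    total = Counter()
    for variant_id, alt_count in genotypes:
        alt[variant_id] += alt_count
        total[variant_id] += 2  # two alleles per diploid sample
    return {variant_id: alt[variant_id] / total[variant_id] for variant_id in total}


# Hypothetical example records: four samples for var1, two for var2.
records = [
    ("var1", 0), ("var1", 1), ("var1", 2), ("var1", 1),
    ("var2", 0), ("var2", 0),
]
frequencies = allele_frequencies(records)
print(frequencies)
```

The computation is a single grouped aggregation over one column, which is the access pattern a column store handles well.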
19. Our Technology
In-Memory Database Technology
[Figure: grid of technology building blocks. Labels recovered from the diagram:]
■ Combined column and row store
■ Map/Reduce
■ Single and multi-tenancy
■ Lightweight compression
■ Insert only for time travel
■ Real-time replication
■ Working on integers
■ SQL interface on columns and rows
■ Active/passive data store
■ Minimal projections
■ Group key
■ Reduction of software layers
■ Dynamic multi-threading
■ Bulk load of data
■ Object-relational mapping
■ Text retrieval and extraction engine
■ No aggregate tables
■ Data partitioning
■ Any attribute as index
■ No disk
■ On-the-fly extensibility
■ Analytics on historical data
■ Multi-core/parallelization
20. In-Memory Database Technology
Hardware Characteristics at HPI FSOC Lab
■ 1,000 core cluster at Hasso Plattner Institute with 25 TB main memory
■ 25 nodes, each consisting of:
□ 40 cores
□ 1 TB main memory
□ Intel® Xeon® E7-4870
□ 2.40GHz
□ 30 MB Cache
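The cluster totals follow directly from the per-node figures. The short calculation below checks them and adds a rough capacity estimate using the 600GB-per-genome figure from the IT Challenges slide. The estimate assumes 1 TB = 1,000 GB, uncompressed data, and no memory reserved for processing, so it is an upper bound for illustration only. The `Node` class is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    cores: int
    memory_tb: int


# 25 identical nodes with the per-node figures from the slide.
cluster = [Node(cores=40, memory_tb=1) for _ in range(25)]

total_cores = sum(node.cores for node in cluster)
total_memory_tb = sum(node.memory_tb for node in cluster)

# Back-of-envelope assumption: 600 GB per full genome, 1 TB = 1,000 GB.
GENOME_GB = 600
genomes_in_memory = (total_memory_tb * 1000) // GENOME_GB

print(total_cores, total_memory_tb, genomes_in_memory)
```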
21. Lightweight Compression
■ Main memory access is the new bottleneck
■ Lightweight compression can reduce this bottleneck, i.e.
□ Lossless
□ Improved usage of data bus capacity
□ Work directly on compressed data

Table
RecId 1: C18.0  Colon   091487
RecId 2: C32.0  Larynx  357982
RecId 3: C00.9  Lip     123489
RecId 4: C18.0  Colon   998711
RecId 5: C20.0  Rectum  215678
RecId 6: C20.0  Rectum  647912
RecId 7: C50.9  Mamma   167898
RecId 8: C18.0  Colon   646470
…

Attribute Vector
RecId  ValueId
1      C18.0
2      C32.0
3      C00.9
4      C18.0
5      C20.0
6      C20.0
7      C50.9
8      C18.0

Data Dictionary
ValueId  Value
1        Larynx
2        Lip
3        Rectum
4        Colon
5        Mamma

Inverted Index
ValueId  RecIdList
1        2
2        3
3        5,6
4        1,4,8
5        7

• Typical compression factor of 10:1 for enterprise software
• In financial applications up to 50:1
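The three structures on this slide (data dictionary, attribute vector, inverted index) can be reproduced with a short sketch. It encodes the site-name column of the eight example records using the ValueIds from the slide's dictionary. The function names are hypothetical, and the code illustrates dictionary encoding in general, not the database's internal implementation.

```python
from collections import defaultdict

# Site names of the eight example records, in RecId order 1..8.
column = ["Colon", "Larynx", "Lip", "Colon", "Rectum", "Rectum", "Mamma", "Colon"]

# Data dictionary with the ValueIds used in the slide's example.
dictionary = {1: "Larynx", 2: "Lip", 3: "Rectum", 4: "Colon", 5: "Mamma"}


def encode(column, dictionary):
    """Return (attribute_vector, inverted_index) for one column."""
    value_to_id = {value: value_id for value_id, value in dictionary.items()}
    # Attribute vector: one small integer per record instead of a string.
    attribute_vector = [value_to_id[value] for value in column]
    # Inverted index: ValueId -> list of RecIds (1-based).
    inverted_index = defaultdict(list)
    for rec_id, value_id in enumerate(attribute_vector, start=1):
        inverted_index[value_id].append(rec_id)
    return attribute_vector, dict(inverted_index)


def find_records(value, dictionary, inverted_index):
    """Resolve the value once in the dictionary, then use integers only."""
    value_to_id = {v: k for k, v in dictionary.items()}
    value_id = value_to_id.get(value)
    return inverted_index.get(value_id, [])


def decode(attribute_vector, dictionary):
    """Lossless: the original column is recovered from the encoded form."""
    return [dictionary[value_id] for value_id in attribute_vector]


attribute_vector, inverted_index = encode(column, dictionary)
print(attribute_vector)
print(inverted_index)
print(find_records("Colon", dictionary, inverted_index))
```

Because equality lookups touch only integers, a query can run directly on the compressed representation, which is the "work directly on compressed data" point above.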
22. What to Take Home?
Test it Yourself: AnalyzeGenomes.com
■ For patients
□ Identify relevant clinical trials and medical experts
□ Become an informed patient
■ For clinicians
□ Identify pharmacokinetic correlations
□ Scan for similar patient cases, e.g. to evaluate therapy efficiency
■ For researchers
□ Enable real-time analysis of medical data, e.g. assess pathways to identify impact of detected variants
□ Combined mining in structured and unstructured data, e.g. publications, diagnosis, and EMR data
23. Keep in contact with us!
Hasso Plattner Institute
Enterprise Platform & Integration Concepts (EPIC)
Program Manager E-Health
Dr. Matthieu-P. Schapranow
August-Bebel-Str. 88
14482 Potsdam, Germany
schapranow@hpi.de
http://we.analyzegenomes.com/