"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
Gaining Time – Real-time Analysis of Big Medical Data
1. Gaining Time – Real-time Analysis of Big Medical Data
Prof. Dr. Hasso Plattner
Chairman of the Supervisory Board, SAP AG and
Professor, Hasso Plattner Institute
2. Growing Data Volumes in Diverse Healthcare Systems
• Human genome/biological data: 800 MB per full genome; 15 PB+ in databases of leading institutes
• Human proteome: 160 Mil. data points (2.4 GB) per sample; 3.7 TB raw proteome data in ProteomicsDB
• Clinical information management systems: often more than 50 GB
• PubMed biomedical article database: 23+ Mil. articles
• Cancer patient records: 160,000 at NCT Heidelberg
• Medical sensor data: scan of a single organ in 1 s creates 10 GB of raw data
• Prescription data: 1.5 Bil. records from 10,000 doctors and 10 Mil. patients (100 GB)
• Clinical trials: currently more than 30,000 recruiting on ClinicalTrials.gov
3. Innovation in Medicine can be Driven Using a Design Thinking Approach
[Diagram: design thinking as the overlap of three dimensions]
• Desirability (human factors): Clinicians, Researchers, Administration & Operations Staff
• Viability (business factors)
• Feasibility (technical factors)
4. Only a Collaborative Effort can be Viable From a Business Perspective
[Diagram: SAP HANA at the center of the healthcare ecosystem, with the Viability dimension highlighted]
• Patients & Consumers (Care Circles)
• Research
• Clinical
• Pharma
• Payers
• Providers
5. SAP HANA is the Technology Enabler for This Vision
Advances in Hardware
• Multi-core Architectures, e.g. 16 CPUs x 10 Cores on Each Node
• Scaling Across Servers, e.g. 100 Nodes x 160 Cores
• 64 bit Address Space – 12TB in Current Servers
• 25GB/s Data Throughput
• Cost-Performance Ratio Improving
Advances in Software
• Reduced Footprint
• Multi-Core Parallelization
• Compression
• No aggregate tables
• Federation
• Complex Algorithms
6. More Than Just a Faster Database, SAP HANA is a Revolutionary Computing Platform
7. Selected SAP HANA Usage Scenarios
Clinicians (Decision Support)
• Medical Knowledge Cockpit
• Medical Explorer
Researchers (Personalized medicine)
• Proteome Diagnostics
• Genomics for Personalized Medicine
Healthcare Administration (Optimized Operations)
• Prescription Analysis
• Patient Management (IS-H) Analytics
8. Genome Variant Analysis
Research
For personalized/preventative medicine
• Researchers want to identify and chart amount of variation in one gene across a population
• Full human genome is 3.2 billion characters long
• With SAP HANA, researchers can compare genetic variants of diseased & healthy cohorts in real-time
• Using SAP HANA, Stanford has seen “spectacular” findings: Type 2 diabetes disease risk is very different across populations
Callouts:
• Multi-Core Parallelization
• Analysis on 125 variants in 629 people in parallel; was not possible before
“We have been thrilled to work with SAP and HPI on a collaboration to accelerate DNA sequence analysis. In our pilot projects, we are seeing dramatic speedups in computing on human genome variation data from many samples. We are dreaming of what will soon be possible as we integrate phenotype, genomics, proteomics, and exposome data to empower complex trait mapping using millions of health records.”
- Professor Carlos D. Bustamante at the Stanford University School of Medicine
9. Proteome-based Cancer Diagnostics Platform for Researchers and Clinicians
Research
• Proteome analysis yields very large data sets (160Mil data points/sample)
• Diagnosis can be done by analysing proteome “fingerprint” from just one drop of blood
• Researchers can model a detection pipeline interactively on SAP HANA
• Researchers can manipulate the detection pipeline interactively
Callouts:
• Fingerprint recognition on high resolution data now possible
• Minimally invasive diagnostics made possible by large scale studies
• Intuitive interface for complex analysis pipeline
18. Medical Explorer
Clinic
Cancer patient treatment and research
• Clinical records and inclusion criteria are very complex
• Clinical data from different sources is combined in one SAP HANA system
• Oncologists need to find the best treatment option for patients
• Find patients eligible for clinical trials
• Doctors can filter patient cohorts based on any clinical attribute
• Patients eligible for clinical trials can be found in seconds
Callouts:
• Unified access to multiple formerly disjoint data sources
• Flexible Analytics on historical data
“In the future we would like to use SAP HANA at every diagnostic and therapeutic step in the fight against cancer as every cancer is different and can vary immensely from one patient to the next.”
- Prof. Dr. Christof von Kalle, Head of National Center for Tumor Diseases Heidelberg, Germany
19. Medical Knowledge Cockpit
Clinic
Relevant scientific findings at a glance
Search for affected genes in distributed and
heterogeneous data sources
Immediate exploration of relevant
information, such as
Gene descriptions,
Molecular impact and related pathways,
Scientific publications, and
Suitable clinical trials.
No manual search for hours or days –
SAP HANA translates manual searching into
interactive finding
Callouts:
• Unified access to structured and unstructured data sources
• Automatic clinical trial matching using HANA text analysis features
20. Patient Management (IS-H) Analytics
Admin
Real-time analysis of hospital patient management data
• Medical Controllers need to check occupancy for different wards frequently
• Current systems too slow for real-time analysis – no what-if scenarios possible
• New analytical applications can now help drive cost-savings and more efficient resource allocation
Callouts:
• HANA made sub-second response times possible
• Flexible analysis – no need for materialized aggregates
21. Prescription Data Analysis
Admin
Understanding the who, where, and what of drug prescriptions
• Which drug is prescribed, e.g. for migraine?
• Specialists might prescribe different drugs than general practitioners
• SAP HANA cloud system holds 1.5 Bil. prescription records for around 10 Mil. patients and 10,000 doctors
• Data can be explored and visualized interactively with SAP Lumira in seconds
Callouts:
• Answers in 1 sec. instead of 1 hour
• Intuitive analysis using data graphics
“SAP Health Data on Demand reduces the time it takes to analyze our more than 1.5 bn data records from 1 hour to 1 second. As a result, we are able to offer our customers new online services, establish a new business model and generate additional revenue.”
- Franz-Xaver Thalmeir, Managing Director, Medimed GmbH
22. Healthcare Projects on SAP HANA
HANA helps gain time and enables completely new scenarios
Speedups achieved
• Patient Management (IS-H) Analytics: 50x (55 seconds -> 800 milliseconds)
• Virtual Patient Platform: 5000x (4 hours -> 2-3 seconds)
• Prescription analysis: 3600x (1 hour -> 1 second)
• DNA Sequence Alignment: 17x (85 hours -> 5 hours)
• Proteome-based Cancer Diagnostics: 22x (15 minutes -> 40 seconds)
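The speedup factors can be re-derived from the before/after runtimes quoted on the slide. The following is a minimal sketch, not part of the original deck; the dictionary name, the helper function, and the use of the midpoint for the "2-3 seconds" range are assumptions made for illustration. Computed from the quoted times, the IS-H ratio comes out near 69x rather than the slide's 50x, so the slide's figures are presumably rounded or based on slightly different measurements.

```python
# Before/after runtimes in seconds, as quoted on the slide.
# Assumption: the "2-3 seconds" range is represented by its midpoint (2.5 s).
RUNTIMES = {
    "Patient Management (IS-H) Analytics": (55.0, 0.8),
    "Virtual Patient Platform": (4 * 3600.0, 2.5),
    "Prescription analysis": (3600.0, 1.0),
    "DNA Sequence Alignment": (85 * 3600.0, 5 * 3600.0),
    "Proteome-based Cancer Diagnostics": (15 * 60.0, 40.0),
}


def speedup(before_s: float, after_s: float) -> float:
    """Return how many times faster the new runtime is than the old one."""
    if after_s <= 0:
        raise ValueError("after_s must be positive")
    return before_s / after_s


SPEEDUPS = {name: speedup(before, after) for name, (before, after) in RUNTIMES.items()}

if __name__ == "__main__":
    for name, factor in SPEEDUPS.items():
        print(f"{name}: {factor:.1f}x")
```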
New usage scenarios
• Medical Explorer
• Genome Analysis
• Clinical Trial Matching
• ProteomicsDB
• Genome Browser
• Biological Pathway Analysis
• Large Patient Cohort Analysis
• HANA Data Scientist
• Genome Data Processing and Pipeline Modeling
24. The Power of Multidisciplinary Teams
Only Strong Partners Build Strong Co-Operative Success Stories
SAP: Global Software Vendor and Expert for Enterprise Technologies World-Wide
+
Hasso Plattner Institute: Academic Research Institute for IT Systems Engineering
+
Carlos Bustamante Lab: Leading Stanford Lab On Human Population Genomics and Global Health
+
Charité – Universitätsmedizin Berlin: One of the largest university hospitals in Europe
+
National Center for Tumor Diseases Heidelberg (NCT): One of the leading institutions for cancer research and patient care
Design Thinking Teams
You
Join Us!
25. New Ways of Real-Time Collaborative Personal Medicine
Thank you!
Editor's Notes
Goal of the Keynote
Investors, decision makers, and politicians shall be convinced of HANA’s capabilities. They shall be encouraged to invest in healthcare (in Germany), do research with the HPI, and co-develop commercial applications with SAP.
Overall Storyline/-flow
1. Time as decisive factor
Time is an absolutely critical resource, in healthcare (e.g. cancer therapy) probably more than in most other industries: nothing is more valuable than personal health! Every second that healthcare professionals cannot spend on their core tasks is therefore a waste of time.
Key question: How can IT support healthcare professionals in making optimal use of their time to
• take optimally informed decisions when treating patients, developing new therapies, and managing clinics’ business operations?
• spend more time on core tasks?
• spend time on core tasks more effectively?
2. IT-related challenges in healthcare -> slide 2
• Growing data volumes in distributed healthcare systems (clinical systems, research, administrative systems)
• Enable different types of professional healthcare users to perform complex analyses of massive amounts of data from diverse data sources comfortably and in real-time
3. Users/Desirability -> slide 3
• Clinicians: relevant information in real-time and from various sources for optimal support of treatment decisions
• Researchers: unified view and real-time analysis of scientific knowledge and patient cohort data
• Administrative Users/Operations Managers: instant overview of hospital KPIs to monitor operational excellence
4. Collaboration is Key/Viability -> slide 4
• SAP as Trusted Partner in Healthcare
• Broad range of collaboration partners
• More collaboration is needed -> extension of ecosystem/political support
5. SAP HANA/Feasibility -> slide 5
• SAP HANA is a key technology enabler of a Real-time Personalized Medicine
• In-Memory/SAP HANA basics -> developments in hardware + key software concepts applied in HANA
• SAP HANA Healthcare Platform
6. Proof Points -> slides 6-end
Introduce 3-4 SAP/HPI projects which make tangible the potential of SAP HANA in healthcare for
• researchers
• clinicians
• healthcare admin and operations
POV: The “Big Data” challenge in healthcare is two-fold:
• the mere volume of data
• the diversity of the data sources
Speaker Notes:
Raw biological data like genome or proteome data represent large volumes even for single patients. Clinical data like treatment records or prescription data add up to big data for large patient cohorts, e.g. when analyzed on a national level. Some relevant information is not available in a structured format and needs to be extracted from text documents, e.g. publications and trial proposals.
To really make a difference to users, we need to bring together data of many types and from many sources, and present an integrated view of them for optimal decision-support.
Transition: How do we make all these different information sources usable for different healthcare professionals? Design Thinking
POV: Design Thinking brings together multidisciplinary teams driving innovation -> desirability, viability and feasibility are essential dimensions to look at.
Speaker Notes:
Regarding desirability, we need to consider the needs of the different professionals working in healthcare:
• Researchers: Quickly translate the latest findings in Genomics and Proteomics research into new treatments. Need to formulate flexible ad-hoc questions to verify hypotheses, e.g. to discover genetic sources for a disease of interest in children compared to their healthy siblings and parents.
• Clinicians: Find the best treatments and clinical trials for each patient right away -> intuitive access to all data with interactive response time (<1s)
• Administration and Operations staff: Monitor and improve operational performance as a prerequisite for optimal medical care -> need real-time summaries of relevant KPIs and intuitive (graphical) drill-downs
Transition: What do we need to do to create a viable solution for these users? Collaboration with strong partners
SAP is working on all fronts of the healthcare spectrum.
• Patients & consumers: Care circles: extended care using social computing
• Research: Working with cutting edge research universities & institutes to enable new insights in genomic, proteomic, and other biological data
• Clinical: Enabling new insights with evidence based research -> from connected medical devices & integrating structured/unstructured data from patient data
• Payers: Identify patterns of specific illnesses & precursors to disease to offer individualized preemptive programs
• Providers: Outcome driven treatment based on integration of all relevant patient data (both biological and clinical)
POV: Main memory capacity and number of cores increase while costs decrease -> software needs to be optimized to incorporate available parallel power
Speaker Notes:
Main software concepts applied in HANA … create performance reserves and enable advanced concepts:
• Federation of data from heterogeneous sources, even unstructured sources like text documents
• Complex algorithmic pipelines acting directly on big data without incurring transformation or data transfer overhead
Transition: Following the design thinking approach, we are already collaborating with users to create new healthcare solutions
POV: HANA has core functions optimized for in-memory technology, which bring high-speed data analysis to a diverse set of enterprise applications.
Speaker notes:
• Core based on innovative in-memory technology
• Non-disruptive integration of real-time analytics of big data in new and existing enterprise applications
POV: Heterogeneous medical data from different sources is consolidated on SAP HANA to enable users to work with medical data in a completely new way.
Speaker Notes:
We are showing a selection of projects that illustrate how we can help the different user groups:
• Researchers can build genome and proteome analysis pipelines and make the results accessible to clinicians instantly
• Clinicians can use Medical Explorer to build patient cohorts by filtering on any available clinical attribute, for example to match them to suitable clinical trials
• Admin and operations staff use real-time analytics of administrative and clinical data to ensure optimal usage of healthcare resources
Transition: Let’s look at these use cases in a little more detail
The DNA of a person encodes a lot of data, which only becomes medically useful when connected with existing knowledge (annotations, literature).
Stanford used HANA to join Varimed data with the pre-phase 1 1000 genomes dataset (629 individuals). Then, by filtering for the variants that were associated with Type 2 Diabetes (125 variants), they were able to calculate the genetic risk of 629 individuals and segment it by population. The graph on the right shows that the genetic risk of getting Type 2 Diabetes is highest in people from the continental Americas. East Asians seem to have the lowest genetic risk of Type 2 Diabetes.
This query in HANA involved a database join on the 1000genomes data with the Varimed database, and took less than a minute. This type of query and join would otherwise take a really long time, and researchers typically focused on less than 20 variants at a time because they were unable to look at 100s of variants simultaneously in all genomes. With HANA, they were able to study 125 variants in 629 individuals in less than a minute.
1000 genomes project: international consortium aimed at sequencing whole genomes of 1000 anonymous individuals and making the data publicly available for research use (to date: sequenced a total of 2500 individuals)
Varimed: Stanford owned manually curated database that identifies genetic associations between traits and diseases
Chen, R., Corona, E., Sikora, M., Dudley, J. T., Morgan, A. A., Moreno-Estrada, A., ... & Butte, A. J. (2012). Type 2 diabetes risk alleles demonstrate extreme directional differentiation among human populations, compared to other diseases. PLoS genetics, 8(4), e1002621.
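The join-filter-aggregate pattern described in this note can be sketched with an in-memory SQLite database standing in for HANA. Everything below is illustrative: the table and column names, the toy rows, and the additive log-odds risk score are assumptions for the sketch, not the Varimed schema or the scoring method used by the Stanford team.

```python
import sqlite3

# Synthetic toy data (hypothetical identifiers, not real variants or people).
TOY_ASSOCIATIONS = [
    ("v1", "Type 2 Diabetes", 0.2),
    ("v2", "Type 2 Diabetes", 0.1),
    ("v3", "Other trait", 0.5),
]
TOY_GENOTYPES = [
    ("p1", "POP_A", "v1", 2),
    ("p1", "POP_A", "v2", 1),
    ("p1", "POP_A", "v3", 2),
    ("p2", "POP_A", "v1", 0),
    ("p2", "POP_A", "v2", 1),
    ("p3", "POP_B", "v1", 1),
    ("p3", "POP_B", "v2", 0),
]


def build_toy_db():
    """Create an in-memory database with a genotype table and an association table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE associations (variant_id TEXT, trait TEXT, log_odds REAL)"
    )
    conn.execute(
        "CREATE TABLE genotypes (individual_id TEXT, population TEXT, "
        "variant_id TEXT, risk_allele_count INTEGER)"
    )
    conn.executemany("INSERT INTO associations VALUES (?, ?, ?)", TOY_ASSOCIATIONS)
    conn.executemany("INSERT INTO genotypes VALUES (?, ?, ?, ?)", TOY_GENOTYPES)
    return conn


def mean_risk_by_population(conn, trait):
    """Join genotypes with trait associations, score each person, average per population.

    Assumption: the score is a simple sum of risk_allele_count * log_odds.
    """
    query = """
        SELECT population, AVG(score)
        FROM (
            SELECT g.individual_id, g.population,
                   SUM(g.risk_allele_count * a.log_odds) AS score
            FROM genotypes AS g
            JOIN associations AS a ON a.variant_id = g.variant_id
            WHERE a.trait = ?
            GROUP BY g.individual_id, g.population
        ) AS per_person
        GROUP BY population
        ORDER BY population
    """
    return dict(conn.execute(query, (trait,)).fetchall())


if __name__ == "__main__":
    print(mean_risk_by_population(build_toy_db(), "Type 2 Diabetes"))
```

The inner query corresponds to the join and the trait filter, the outer query to the segmentation by population.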
POV: Enable proteome-based diagnostics by running a complex analysis pipeline on data sets > 2GB per patient.
Speaker Notes:
HANA is helping drive tomorrow’s therapies – finding “disease fingerprints” across huge proteome datasets using sophisticated algorithms directly in HANA
Can lead to e.g. new tests to detect lung cancer early using only one drop of blood
Transition: Led to the insight that we need an easy way to build and modify data analysis pipelines using HANA-internal methods and external tools
POV: Explore patient data from many different sources within a comprehensive cancer center in absolute detail.
Speaker Notes:
• A patient’s record is a complex mixture of different data: doctor’s notes, biomarkers, treatment history, molecular data… - previously had to be manually searched and integrated
• Medical Explorer uses SAP HANA to combine different data types in large volumes on a single platform and lets the user search across all of them
• Users can easily build patient cohorts matching the inclusion and exclusion criteria of clinical trials, e.g. “Which patients have had two cycles of chemotherapy within the last two years, with a gap of 3-6 months in between?”
Transition: Can be connected to automatic clinical trial proposal
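The example cohort question from this note can be expressed as a small filter. This is only a sketch under stated assumptions: the function name and input layout are hypothetical, "two cycles" is read as "at least two cycles", two years is taken as 730 days, and the 3-6 month gap is approximated as 90-180 days. It does not reflect how Medical Explorer is implemented.

```python
from datetime import date, timedelta
from itertools import combinations


def eligible_patients(cycles, today, window_days=730, min_gap_days=90, max_gap_days=180):
    """Return sorted IDs of patients with two chemotherapy cycles inside the
    look-back window whose start dates lie min_gap_days..max_gap_days apart.

    cycles: dict mapping patient ID -> list of cycle start dates (hypothetical layout).
    """
    cutoff = today - timedelta(days=window_days)
    matches = []
    for patient_id, dates in cycles.items():
        # Keep only cycles that started within the look-back window.
        recent = sorted(d for d in dates if cutoff <= d <= today)
        # Check every pair of recent cycles for a gap in the allowed range.
        if any(
            min_gap_days <= (later - earlier).days <= max_gap_days
            for earlier, later in combinations(recent, 2)
        ):
            matches.append(patient_id)
    return sorted(matches)


# Synthetic toy records.
TOY_CYCLES = {
    "A": [date(2013, 1, 10), date(2013, 5, 10)],  # 120 days apart, inside window
    "B": [date(2013, 1, 10), date(2013, 2, 10)],  # gap too short
    "C": [date(2010, 1, 10), date(2010, 5, 10)],  # outside the two-year window
}

if __name__ == "__main__":
    print(eligible_patients(TOY_CYCLES, today=date(2014, 1, 1)))
```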
POV: Automated clinical trial matching based on patient specific data instead of tedious search through trial databases
Speaker notes:
Clinical trials are an important step in medical research and can help patients to get early access to new treatments. However, finding a matching trial for a patient is tedious work, as online databases only allow simple keyword searches, while trials list numerous inclusion or exclusion criteria, e.g. age limits, former treatments, or specific variants, in unstructured texts. HANA text analysis features enable the automatic extraction of these criteria in addition to a full text keyword search. As a result, all recruiting trials are matched with the detected variants and additional patient specific data. Thus, clinicians no longer need to search for matching trials by hand; instead they can choose from the proposed list.
Transition: Variant Calling provides results that can be used in trial matching
Background:
As one of the leading online databases for clinical trials, ClinicalTrials.gov lists more than 30,000 recruiting trials out of around 150,000 total trials. It provides search functionality that allows searching by location, age and keywords, but not by inclusion or exclusion criteria. Therefore, finding a matching study requires the clinician to search this database, and others, by providing the right keywords and then reading through various texts to find out whether the search results actually match the patient or not, tedious work that can take up to days.
In our prototype the recruiting trials of ClinicalTrials.gov were imported into HANA and text analysis features extract gene names, medical ingredients, start and end dates, age conditions, location and so on. The algorithm matches trials to a specific patient based on a list of detected variants, the affected genes, and additional patient data, e.g. age, gender or former diseases and treatments. It provides a ranked list of trials that could recruit the patient.
Thus our prototype proposes matching clinical trials in a matter of seconds and allows the clinician to provide the patient with trial proposals during his round.
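The matching step can be illustrated with a small ranking function that works on criteria already extracted from the trial texts. This is a hedged sketch, not the prototype's algorithm: the `Trial` and `Patient` classes, the trial IDs, and the ranking by number of overlapping genes are assumptions, and the text-analysis extraction itself is not shown.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """Criteria assumed to have been extracted from a trial description."""
    trial_id: str
    genes: set
    min_age: int = 0
    max_age: int = 200
    recruiting: bool = True


@dataclass
class Patient:
    age: int
    affected_genes: set


def rank_trials(patient, trials):
    """Return IDs of recruiting trials that fit the patient's age and share at
    least one affected gene, ordered by number of shared genes (descending)."""
    scored = []
    for trial in trials:
        if not trial.recruiting:
            continue
        if not (trial.min_age <= patient.age <= trial.max_age):
            continue
        overlap = len(trial.genes & patient.affected_genes)
        if overlap:
            scored.append((overlap, trial.trial_id))
    # Highest overlap first; ties broken by trial ID for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [trial_id for _, trial_id in scored]


# Synthetic toy data; trial IDs are placeholders.
TOY_PATIENT = Patient(age=54, affected_genes={"BRAF", "KRAS"})
TOY_TRIALS = [
    Trial("T1", {"BRAF", "KRAS"}, min_age=18, max_age=70),
    Trial("T2", {"KRAS"}, min_age=18, max_age=99),
    Trial("T3", {"BRAF"}, min_age=60, max_age=80),  # excluded by age
    Trial("T4", {"EGFR"}),                          # no shared gene
    Trial("T5", {"BRAF"}, recruiting=False),        # not recruiting
]

if __name__ == "__main__":
    print(rank_trials(TOY_PATIENT, TOY_TRIALS))
```

A real system would also weigh further criteria named in the notes, such as gender, former diseases, and treatments.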
POV: Real-time analysis of hospital patient management data
Speaker Notes:
- Improving healthcare through IT is not just about molecular data, but also about ensuring treatment quality and efficient allocation of resources within hospitals
- HANA turns a static reporting tool into an interactive exploration tool that can help hospital managers / controllers in identifying problems right away
Transition: Possible target for optimization: prescription of the optimal drugs
POV: Comparing prescription behavior to a benchmark dataset from 10,000 doctors
Speaker Notes: HANA makes it possible to explore prescriptions and other medical data for huge cohorts interactively. Example: Neurologists tend to prescribe different drugs for migraine than general practitioners.
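The comparison in the example boils down to grouping prescriptions by prescriber specialty and comparing drug shares. A minimal sketch follows, with a hypothetical record layout of (specialty, diagnosis, drug) and placeholder drug names; the real data set and its schema are not described in the notes.

```python
from collections import Counter, defaultdict


def prescription_shares(records, diagnosis):
    """For one diagnosis, return {specialty: {drug: share of prescriptions}}.

    records: iterable of (specialty, diagnosis, drug) tuples (hypothetical layout).
    """
    counts = defaultdict(Counter)
    for specialty, diag, drug in records:
        if diag == diagnosis:
            counts[specialty][drug] += 1
    shares = {}
    for specialty, drug_counts in counts.items():
        total = sum(drug_counts.values())
        shares[specialty] = {drug: n / total for drug, n in drug_counts.items()}
    return shares


# Synthetic toy records with placeholder drug names.
TOY_RECORDS = (
    [("neurologist", "migraine", "drug_x")] * 3
    + [("neurologist", "migraine", "drug_y")] * 1
    + [("general practitioner", "migraine", "drug_y")] * 3
    + [("general practitioner", "migraine", "drug_x")] * 1
    + [("general practitioner", "hypertension", "drug_z")] * 2
)

if __name__ == "__main__":
    for specialty, drugs in prescription_shares(TOY_RECORDS, "migraine").items():
        print(specialty, drugs)
```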
POV: We have a large set of healthcare projects on HANA, leading to significant and diverse benefits for users
Speaker Notes:
We have a diverse set of healthcare projects on SAP HANA, touching different data types, user groups, and functionalities.
These systems deliver many different benefits, e.g.
• Increased speed and interactivity/flexibility
• Making it easier to work with large and diverse data sets
• Enabling us to ask whole new questions
Transition: To give you a sense of what is possible with SAP HANA, we will demonstrate a few of these systems
POV: With our design thinking teams, our vision for the future brings new ways of collaborating globally
Transition: With SAP HANA, our vision will be realized
POV: With SAP HANA information is accessible within 1s. This revolutionizes how doctors and researchers can work together worldwide. - Moving to concrete examples of how HANA is doing this for medicine today & tomorrow.
Transition: Close session!