SlideShare a Scribd company logo
1 of 34
Download to read offline
From Advanced Queries to Algorithms to
Advanced ML: 3 Pharmaceutical Graph Use Cases
Dr. Alexander Jarasch
• 5 partners + assoc. partners


• 450 researchers


• bundles basic research and
clinical trials expertise


• => variety of data


=> unstructured


=> heterogeneous


=> not connected


=> unFAIR
DZD Data and Knowledge Management team
Dr. Alexander Jarasch
Justus Täger
Tim Bleimehl
Angela Dedie
Yaroslav Zdravomyslov
The Challenge


Connecting data (silos) -> get new insights
Easy question -> Difficult to answer
The Challenge


Variety of users / diversity of scientific questions
Scientists
Medical

Doctors
Data

Scientists
Graphdatabase
Biological question:


Are human T2D genes enzymes acting on metabolites which in turn are regulated in pig diabetes model?




The actual question (from a data-point-of-view):




Is there a connection between A and R?


=> 3s to look into the Excel sheet


Why graph? Easy scientific question


The actual question (from a data-point-of-view):




Is there a connection between A and R?


=> 3s to look into the graph


A
B
C
E
D
F
G
K
Q
R
S
W
Z
U
Why graph? Easy scientific question
Back to the question
Are human T2D genes enzymes acting on metabolites which in turn are regulated in pig diabetes model?
Genomics
Human diabetic data
Genes
SNPs
Proteins
Enzymes
Pathways
Metabolites
Metabolomics
Pre diabetic pig
Metabolites
List of SNPs
List of Genes of
(species 1)
List of Proteins of
(species 1)
List of loci
List of Enzymes of
(species 1)
List of Pathways of
(species 1)
List of Metabolites
of (species 1)
List of Metabolites
of (species 2)
graph
Why graph? -> why not relational
• biomedical data / healthcare data is highly connected


• => variety of data


=> unstructured


=> heterogeneous


=> not connected


=> unFAIR


• easy to model


• extremely flexible / easy adoptable („re-shaping the graph“) vs. static SQL model


• scalable (Billion of nodes+relationships on a single machine


• easy to query (cyclic dependencies)


• GraphDataScience library + graph embeddings
Alzheimer‘s
cancer
cardio
vascular
diseases
diabetes
Lung


diseases
infectious


diseases
new hypotheses
Diseases are connected
DZDconnect: Concept
DZD in-house data
Natural Language Processing


Inferring knowledge
Knowledge Graph
DZDconnect: stats
• PROD-Server: 323m nodes, 1.1bn relationships => 480GB


• DEV-Server: 1.1bn nodes, 4.8bn relationships


• Singleserver (60 CPUs, 256GB memory, only SSDs)


• 4 developers


• Neo4j enterprise (live backup, GDS)


• UI: flask web server, SemSpect, Neo4j browser


• Visualization for interactive browsing (SemSpect by derive GmbH)


• Bloom (semi-natural-language queries)
Strata Data


Award finalist 2019
bytes4diabetes Award
2020
Graphie Award 2018
We have


DB role model
DZDconnect:


data integration + ML
Gene RNA Protein
CODES CODES
CODES*
• Python


• Py2Neo, GraphIO


• Docker Pipeline for orchestration (open-source by DZD)


• Based on integrated data => annotate / enrich


• textmatching + Natural Language Processing


• „shortcuts“ for queries (reduce #hops)


• inferring knowledge
DZDconnect:


data model <-> human readable = easy to query
DZDconnect:


data model
The Challenge


User with a specific input => specific output
Scientist
multi-omics

experiment

output
Flask app
The Challenge


User ”start somewhere -> explore freely knowledge”
SemSpect
interactive
browsing
Start from any node
Scientist

or

Medical

Doctor
The Challenge


User with data analysis skills / computer scientist
Scientist
Start from any node
Cypher query language
Graph Data
Science
Use case 1


Handle mapping identifiers of molecular entities
Knowledge Graph
Query „friends of a friend“ on a gene level


Example: diabetes relevant gene ‚TCF7L2’
match path=(g:Gene{sid:'TCF7L2'})-[:MAPS|SYNONYM*0..2]-(g1:Gene) return path
Use case 2


Find information that is NOW connected
Knowledge Graph
Query for SNPs (mutations) associated to diabetes


Output: relevant protein and its function (ontology terms)
match (tr:Trait)


where tr.name contains ‚diabetes mellitus‘


with tr as disease


match path=(disease)<-[:ASSOCIATED_WITH_TRAIT]-(asso:Association)<-[:SNP_HAS_ASSOCIATION]-(snp:SNP)-
[:SNP_HAS_GENE]-(gene:Gene)-[:MAPS]-(g1:Gene)-[x:CODES]->(transcript:Transcript)-[:CODES]->
(prot:Protein)-[:ASSOCIATION]->(term:Term)—(o:Ontology)


return path
Use case 3


Using graph algorithms to infer new insights
Natural Language
Processing


Ontologies


Knowledge Graph
Google’s page rank algorithm - find the most relevant gene


finding ACE2 - the receptor the SARS-Cov2 virus uses to enter the cell
• 140’000 abstracts from


Covid19 related publications


• NamedEntityRecognition


of gene names


• Page Rank identified


‚ACE2‘ as the most relevant


gene
Who’s this ACE2-guy?
source: https://www.benaroyaresearch.org/blog/post/11-things-know-about-mrna-vaccines-covid-19
Use case 4


Using node embeddings to sub phenotype diabetic patients
Natural
DZDconnect


connect raw data of diabetic patients with cancer
Clinical data from 404 diabetic patients
DZDconnect


connect lipidomics fingerprint
Lipidomics
Lipidomics experiment with 116 specific lipids
DZDconnect


connect transcriptomics fingerprint
Transcriptomics experiment with 58’345 specific Transcripts (RNAs)
Transform patients


Fast random projections (fastRP)
CALL gds.fastRP.write
(

'patients'
,

{

embeddingDimension: 50
,

writeProperty: 'fastrp-
embedding'
}

)

YIELD nodePropertiesWritten
Lipido
k-nearest neighbour clustering with k=5


representing the 5 diabetes subtypes
patient 01 patient 02
patient 03
Graph

algorithms
patient 04
patient 05
patient 02
p
a
t
i
e
n
t
0
4
patient 03
patient 05
patient 01
subphenotyping of diabetic patients
DZDconnect


connect patient data with knowledge graph
Transcript
Gene
Synonyms
Abstract
PubMed


Article
Keyword


MeSH-term


Ontology term
Hello role-model :-)
Take home message
• Knowledge graph


• as single point of truth


• connect in-house data


• scalability


• infer new insights


• Use cases:


• simple and advanced (Cypher) queries


• Graph Data Science library (page rank, kNN)


• Node embeddings for complex data


• NLP
• Visualization of graph


• different users


• flask app, browser, SemSpect,…
Thanks to

More Related Content

What's hot

DNA and RNA , Structure, Functions, Types, difference, Similarities, Protein ...
DNA and RNA , Structure, Functions, Types, difference, Similarities, Protein ...DNA and RNA , Structure, Functions, Types, difference, Similarities, Protein ...
DNA and RNA , Structure, Functions, Types, difference, Similarities, Protein ...AKSHAYMAGAR17
 
Top 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & TricksTop 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & TricksNeo4j
 
The Neo4j Data Platform for Today & Tomorrow.pdf
The Neo4j Data Platform for Today & Tomorrow.pdfThe Neo4j Data Platform for Today & Tomorrow.pdf
The Neo4j Data Platform for Today & Tomorrow.pdfNeo4j
 
Amsterdam - The Neo4j Graph Data Platform Today & Tomorrow
Amsterdam - The Neo4j Graph Data Platform Today & TomorrowAmsterdam - The Neo4j Graph Data Platform Today & Tomorrow
Amsterdam - The Neo4j Graph Data Platform Today & TomorrowNeo4j
 
Knowledge Graphs for Network Digital Twins
Knowledge Graphs for Network Digital TwinsKnowledge Graphs for Network Digital Twins
Knowledge Graphs for Network Digital TwinsNeo4j
 
5. Building the Cancer Research Data Commons with Neo4j: The Bento Framework
5. Building the Cancer Research Data Commons with Neo4j: The Bento Framework5. Building the Cancer Research Data Commons with Neo4j: The Bento Framework
5. Building the Cancer Research Data Commons with Neo4j: The Bento FrameworkNeo4j
 
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...Neo4j
 
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data ScienceScaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data ScienceNeo4j
 
OpenBOM: Neo4j and Bill of Materials meetup, Boston
OpenBOM: Neo4j and Bill of Materials meetup, BostonOpenBOM: Neo4j and Bill of Materials meetup, Boston
OpenBOM: Neo4j and Bill of Materials meetup, BostonOleg Shilovitsky
 
Biotechnology information system {btis}
Biotechnology information system {btis}Biotechnology information system {btis}
Biotechnology information system {btis}KAUSHAL SAHU
 
Supply Chain Twin Demo - Companion Deck
Supply Chain Twin Demo - Companion DeckSupply Chain Twin Demo - Companion Deck
Supply Chain Twin Demo - Companion DeckNeo4j
 
An Overview to Protein bioinformatics
An Overview to Protein bioinformaticsAn Overview to Protein bioinformatics
An Overview to Protein bioinformaticsJoel Ricci-López
 
How Will Knowledge Graphs Improve Clinical Reporting Workflows
How Will Knowledge Graphs Improve Clinical Reporting WorkflowsHow Will Knowledge Graphs Improve Clinical Reporting Workflows
How Will Knowledge Graphs Improve Clinical Reporting WorkflowsNeo4j
 
Extending Machine Learning Algorithms with PySpark
Extending Machine Learning Algorithms with PySparkExtending Machine Learning Algorithms with PySpark
Extending Machine Learning Algorithms with PySparkDatabricks
 
Getting Started with Knowledge Graphs
Getting Started with Knowledge GraphsGetting Started with Knowledge Graphs
Getting Started with Knowledge GraphsPeter Haase
 
Workshop - Neo4j Graph Data Science
Workshop - Neo4j Graph Data ScienceWorkshop - Neo4j Graph Data Science
Workshop - Neo4j Graph Data ScienceNeo4j
 
Neo4j Innovation Lab – Bringing the Best of Data Science and Design Thinking ...
Neo4j Innovation Lab – Bringing the Best of Data Science and Design Thinking ...Neo4j Innovation Lab – Bringing the Best of Data Science and Design Thinking ...
Neo4j Innovation Lab – Bringing the Best of Data Science and Design Thinking ...Neo4j
 

What's hot (20)

DNA and RNA , Structure, Functions, Types, difference, Similarities, Protein ...
DNA and RNA , Structure, Functions, Types, difference, Similarities, Protein ...DNA and RNA , Structure, Functions, Types, difference, Similarities, Protein ...
DNA and RNA , Structure, Functions, Types, difference, Similarities, Protein ...
 
Top 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & TricksTop 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & Tricks
 
The Neo4j Data Platform for Today & Tomorrow.pdf
The Neo4j Data Platform for Today & Tomorrow.pdfThe Neo4j Data Platform for Today & Tomorrow.pdf
The Neo4j Data Platform for Today & Tomorrow.pdf
 
Biological databases
Biological databasesBiological databases
Biological databases
 
Amsterdam - The Neo4j Graph Data Platform Today & Tomorrow
Amsterdam - The Neo4j Graph Data Platform Today & TomorrowAmsterdam - The Neo4j Graph Data Platform Today & Tomorrow
Amsterdam - The Neo4j Graph Data Platform Today & Tomorrow
 
Knowledge Graphs for Network Digital Twins
Knowledge Graphs for Network Digital TwinsKnowledge Graphs for Network Digital Twins
Knowledge Graphs for Network Digital Twins
 
5. Building the Cancer Research Data Commons with Neo4j: The Bento Framework
5. Building the Cancer Research Data Commons with Neo4j: The Bento Framework5. Building the Cancer Research Data Commons with Neo4j: The Bento Framework
5. Building the Cancer Research Data Commons with Neo4j: The Bento Framework
 
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
Knowledge Graphs and Graph Data Science: More Context, Better Predictions (Ne...
 
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data ScienceScaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
 
Kegg database resources
Kegg database resourcesKegg database resources
Kegg database resources
 
Spark sql
Spark sqlSpark sql
Spark sql
 
OpenBOM: Neo4j and Bill of Materials meetup, Boston
OpenBOM: Neo4j and Bill of Materials meetup, BostonOpenBOM: Neo4j and Bill of Materials meetup, Boston
OpenBOM: Neo4j and Bill of Materials meetup, Boston
 
Biotechnology information system {btis}
Biotechnology information system {btis}Biotechnology information system {btis}
Biotechnology information system {btis}
 
Supply Chain Twin Demo - Companion Deck
Supply Chain Twin Demo - Companion DeckSupply Chain Twin Demo - Companion Deck
Supply Chain Twin Demo - Companion Deck
 
An Overview to Protein bioinformatics
An Overview to Protein bioinformaticsAn Overview to Protein bioinformatics
An Overview to Protein bioinformatics
 
How Will Knowledge Graphs Improve Clinical Reporting Workflows
How Will Knowledge Graphs Improve Clinical Reporting WorkflowsHow Will Knowledge Graphs Improve Clinical Reporting Workflows
How Will Knowledge Graphs Improve Clinical Reporting Workflows
 
Extending Machine Learning Algorithms with PySpark
Extending Machine Learning Algorithms with PySparkExtending Machine Learning Algorithms with PySpark
Extending Machine Learning Algorithms with PySpark
 
Getting Started with Knowledge Graphs
Getting Started with Knowledge GraphsGetting Started with Knowledge Graphs
Getting Started with Knowledge Graphs
 
Workshop - Neo4j Graph Data Science
Workshop - Neo4j Graph Data ScienceWorkshop - Neo4j Graph Data Science
Workshop - Neo4j Graph Data Science
 
Neo4j Innovation Lab – Bringing the Best of Data Science and Design Thinking ...
Neo4j Innovation Lab – Bringing the Best of Data Science and Design Thinking ...Neo4j Innovation Lab – Bringing the Best of Data Science and Design Thinking ...
Neo4j Innovation Lab – Bringing the Best of Data Science and Design Thinking ...
 

Similar to From Queries to Algorithms to Advanced ML: 3 Pharmaceutical Graph Use Cases

From Advanced Queries to Algorithms and Graph-Based ML: Tackling Diabetes wit...
From Advanced Queries to Algorithms and Graph-Based ML: Tackling Diabetes wit...From Advanced Queries to Algorithms and Graph-Based ML: Tackling Diabetes wit...
From Advanced Queries to Algorithms and Graph-Based ML: Tackling Diabetes wit...Neo4j
 
A Distributed Annotation Pipeline for MSSNG
A Distributed Annotation Pipeline for MSSNGA Distributed Annotation Pipeline for MSSNG
A Distributed Annotation Pipeline for MSSNGSimon Twigger
 
Neo4j_Cypher.pdf
Neo4j_Cypher.pdfNeo4j_Cypher.pdf
Neo4j_Cypher.pdfJaberRad1
 
Drug and Vaccine Discovery: Knowledge Graph + Apache Spark
Drug and Vaccine Discovery: Knowledge Graph + Apache SparkDrug and Vaccine Discovery: Knowledge Graph + Apache Spark
Drug and Vaccine Discovery: Knowledge Graph + Apache SparkDatabricks
 
Neo4j for Healthcare & Life Sciences
Neo4j for Healthcare & Life SciencesNeo4j for Healthcare & Life Sciences
Neo4j for Healthcare & Life SciencesNeo4j
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Ian Foster
 
Charleston Conference 2016
Charleston Conference 2016Charleston Conference 2016
Charleston Conference 2016Anita de Waard
 
Big Data in Pharma - Overview and Use Cases
Big Data in Pharma - Overview and Use CasesBig Data in Pharma - Overview and Use Cases
Big Data in Pharma - Overview and Use CasesJosef Scheiber
 
Fostering Serendipity through Big Linked Data
Fostering Serendipity through Big Linked DataFostering Serendipity through Big Linked Data
Fostering Serendipity through Big Linked DataMuhammad Saleem
 
EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017 EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017 EITESANGO
 
Cancer Analytics Poster
Cancer Analytics PosterCancer Analytics Poster
Cancer Analytics PosterMichael Atkins
 
Rescuing Data from Decaying and Moribund Clinical Information Systems
Rescuing Data from Decaying and Moribund Clinical Information SystemsRescuing Data from Decaying and Moribund Clinical Information Systems
Rescuing Data from Decaying and Moribund Clinical Information SystemsHealth Informatics New Zealand
 
Semantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical InformaticsSemantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical InformaticsAmit Sheth
 
Stephen Friend Dana Farber Cancer Institute 2011-10-24
Stephen Friend Dana Farber Cancer Institute 2011-10-24Stephen Friend Dana Farber Cancer Institute 2011-10-24
Stephen Friend Dana Farber Cancer Institute 2011-10-24Sage Base
 
The Power of Graphs to Analyze Biological Data
The Power of Graphs to Analyze Biological DataThe Power of Graphs to Analyze Biological Data
The Power of Graphs to Analyze Biological Datadatablend
 
The Power of Graphs to Analyze Biological Data - Davy Suvee @ GraphConnect Lo...
The Power of Graphs to Analyze Biological Data - Davy Suvee @ GraphConnect Lo...The Power of Graphs to Analyze Biological Data - Davy Suvee @ GraphConnect Lo...
The Power of Graphs to Analyze Biological Data - Davy Suvee @ GraphConnect Lo...Neo4j
 
Transparency in the Data Supply Chain
Transparency in the Data Supply ChainTransparency in the Data Supply Chain
Transparency in the Data Supply ChainPaul Groth
 

Similar to From Queries to Algorithms to Advanced ML: 3 Pharmaceutical Graph Use Cases (20)

From Advanced Queries to Algorithms and Graph-Based ML: Tackling Diabetes wit...
From Advanced Queries to Algorithms and Graph-Based ML: Tackling Diabetes wit...From Advanced Queries to Algorithms and Graph-Based ML: Tackling Diabetes wit...
From Advanced Queries to Algorithms and Graph-Based ML: Tackling Diabetes wit...
 
A Distributed Annotation Pipeline for MSSNG
A Distributed Annotation Pipeline for MSSNGA Distributed Annotation Pipeline for MSSNG
A Distributed Annotation Pipeline for MSSNG
 
Neo4j_Cypher.pdf
Neo4j_Cypher.pdfNeo4j_Cypher.pdf
Neo4j_Cypher.pdf
 
Drug and Vaccine Discovery: Knowledge Graph + Apache Spark
Drug and Vaccine Discovery: Knowledge Graph + Apache SparkDrug and Vaccine Discovery: Knowledge Graph + Apache Spark
Drug and Vaccine Discovery: Knowledge Graph + Apache Spark
 
Neo4j for Healthcare & Life Sciences
Neo4j for Healthcare & Life SciencesNeo4j for Healthcare & Life Sciences
Neo4j for Healthcare & Life Sciences
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
 
2013 alumni-webinar
2013 alumni-webinar2013 alumni-webinar
2013 alumni-webinar
 
Charleston Conference 2016
Charleston Conference 2016Charleston Conference 2016
Charleston Conference 2016
 
Big Data in Pharma - Overview and Use Cases
Big Data in Pharma - Overview and Use CasesBig Data in Pharma - Overview and Use Cases
Big Data in Pharma - Overview and Use Cases
 
Fostering Serendipity through Big Linked Data
Fostering Serendipity through Big Linked DataFostering Serendipity through Big Linked Data
Fostering Serendipity through Big Linked Data
 
Practical semantics in the pharmaceutical industry - the Open PHACTS project
Practical semantics in the pharmaceutical industry - the Open PHACTS projectPractical semantics in the pharmaceutical industry - the Open PHACTS project
Practical semantics in the pharmaceutical industry - the Open PHACTS project
 
EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017 EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017
 
Final-Presentation
Final-PresentationFinal-Presentation
Final-Presentation
 
Cancer Analytics Poster
Cancer Analytics PosterCancer Analytics Poster
Cancer Analytics Poster
 
Rescuing Data from Decaying and Moribund Clinical Information Systems
Rescuing Data from Decaying and Moribund Clinical Information SystemsRescuing Data from Decaying and Moribund Clinical Information Systems
Rescuing Data from Decaying and Moribund Clinical Information Systems
 
Semantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical InformaticsSemantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical Informatics
 
Stephen Friend Dana Farber Cancer Institute 2011-10-24
Stephen Friend Dana Farber Cancer Institute 2011-10-24Stephen Friend Dana Farber Cancer Institute 2011-10-24
Stephen Friend Dana Farber Cancer Institute 2011-10-24
 
The Power of Graphs to Analyze Biological Data
The Power of Graphs to Analyze Biological DataThe Power of Graphs to Analyze Biological Data
The Power of Graphs to Analyze Biological Data
 
The Power of Graphs to Analyze Biological Data - Davy Suvee @ GraphConnect Lo...
The Power of Graphs to Analyze Biological Data - Davy Suvee @ GraphConnect Lo...The Power of Graphs to Analyze Biological Data - Davy Suvee @ GraphConnect Lo...
The Power of Graphs to Analyze Biological Data - Davy Suvee @ GraphConnect Lo...
 
Transparency in the Data Supply Chain
Transparency in the Data Supply ChainTransparency in the Data Supply Chain
Transparency in the Data Supply Chain
 

More from Neo4j

Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...Neo4j
 
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosBBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosNeo4j
 
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Neo4j
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jNeo4j
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfRabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Neo4j
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeNeo4j
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsNeo4j
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j
 
Neo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j
 
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...Neo4j
 
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AIDeloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AINeo4j
 
Ingka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignIngka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignNeo4j
 
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Neo4j
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxNeo4j
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxNeo4j
 

More from Neo4j (20)

Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
ISDEFE - GraphSummit Madrid - ARETA: Aviation Real-Time Emissions Token Accre...
 
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafosBBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
BBVA - GraphSummit Madrid - Caso de éxito en BBVA: Optimizando con grafos
 
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
Graph Everywhere - Josep Taruella - Por qué Graph Data Science en tus modelos...
 
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4jGraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
GraphSummit Madrid - Product Vision and Roadmap - Luis Salvador Neo4j
 
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdfNeo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
Neo4j_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdfRabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
Rabobank_Exploring the Impact of Graph Technology on Financial Services.pdf
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG time
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge Graphs
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
 
Neo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with GraphNeo4j Jesus Barrasa The Art of the Possible with Graph
Neo4j Jesus Barrasa The Art of the Possible with Graph
 
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
 
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AIDeloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
 
Ingka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignIngka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by Design
 
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
 

Recently uploaded

AmsterdamJUG April 2024 - Going serverless with Quarkus GraalVM native images...
AmsterdamJUG April 2024 - Going serverless with Quarkus GraalVM native images...AmsterdamJUG April 2024 - Going serverless with Quarkus GraalVM native images...
AmsterdamJUG April 2024 - Going serverless with Quarkus GraalVM native images...Bert Jan Schrijver
 
Business Analyzopedia - Your Pocket Gita for Business Analysis
Business Analyzopedia - Your Pocket Gita for Business AnalysisBusiness Analyzopedia - Your Pocket Gita for Business Analysis
Business Analyzopedia - Your Pocket Gita for Business AnalysisDEEPRAJ PATHAK
 
Effort Estimation Techniques used in Software Projects
Effort Estimation Techniques used in Software ProjectsEffort Estimation Techniques used in Software Projects
Effort Estimation Techniques used in Software ProjectsDEEPRAJ PATHAK
 
Preparing BitVisor for Supporting Multiple Architectures
Preparing BitVisor for Supporting Multiple ArchitecturesPreparing BitVisor for Supporting Multiple Architectures
Preparing BitVisor for Supporting Multiple ArchitecturesAke Koomsin
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolsosttopstonverter
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flinkconfluent
 
Understanding Plagiarism: Causes, Consequences and Prevention.pptx
Understanding Plagiarism: Causes, Consequences and Prevention.pptxUnderstanding Plagiarism: Causes, Consequences and Prevention.pptx
Understanding Plagiarism: Causes, Consequences and Prevention.pptxSasikiranMarri
 
ETE PPT.pdf LMMKLMKLMLKMLLMJKBHJBHBNUIHBU
ETE PPT.pdf LMMKLMKLMLKMLLMJKBHJBHBNUIHBUETE PPT.pdf LMMKLMKLMLKMLLMJKBHJBHBNUIHBU
ETE PPT.pdf LMMKLMKLMLKMLLMJKBHJBHBNUIHBUsamruddhijedgule2004
 
Tech Tuesday Slides - Getting Started with the Portfolio Module.
Tech Tuesday Slides - Getting Started with the Portfolio Module.Tech Tuesday Slides - Getting Started with the Portfolio Module.
Tech Tuesday Slides - Getting Started with the Portfolio Module.OnePlan Solutions
 
ManageIQ - Sprint 234 Review - Slide Deck
ManageIQ - Sprint 234 Review - Slide DeckManageIQ - Sprint 234 Review - Slide Deck
ManageIQ - Sprint 234 Review - Slide DeckManageIQ
 
Key Steps in Agile Software Delivery Roadmap
Key Steps in Agile Software Delivery RoadmapKey Steps in Agile Software Delivery Roadmap
Key Steps in Agile Software Delivery RoadmapIshara Amarasekera
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfmaor17
 
OpenMetadata Community Meeting - 4th April, 2024
OpenMetadata Community Meeting - 4th April, 2024OpenMetadata Community Meeting - 4th April, 2024
OpenMetadata Community Meeting - 4th April, 2024OpenMetadata
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesVictoriaMetrics
 
Explore the Three Main Types of Logistics - Inbound Logistics, Outbound Logis...
Explore the Three Main Types of Logistics - Inbound Logistics, Outbound Logis...Explore the Three Main Types of Logistics - Inbound Logistics, Outbound Logis...
Explore the Three Main Types of Logistics - Inbound Logistics, Outbound Logis...Piyovi
 
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...kalichargn70th171
 
Leveraging the Expertise of a Social Media Fraud Analyst to Safeguard Brand R...
Leveraging the Expertise of a Social Media Fraud Analyst to Safeguard Brand R...Leveraging the Expertise of a Social Media Fraud Analyst to Safeguard Brand R...
Leveraging the Expertise of a Social Media Fraud Analyst to Safeguard Brand R...Milind Agarwal
 
Transform your Corporate Strategy Office - Harness OnePlan’s Strategic Portfo...
Transform your Corporate Strategy Office - Harness OnePlan’s Strategic Portfo...Transform your Corporate Strategy Office - Harness OnePlan’s Strategic Portfo...
Transform your Corporate Strategy Office - Harness OnePlan’s Strategic Portfo...OnePlan Solutions
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
oracle 23c new features for developer and dba
oracle 23c new features for developer and dbaoracle 23c new features for developer and dba
oracle 23c new features for developer and dbaRemote DBA Services
 

Recently uploaded (20)

AmsterdamJUG April 2024 - Going serverless with Quarkus GraalVM native images...
AmsterdamJUG April 2024 - Going serverless with Quarkus GraalVM native images...AmsterdamJUG April 2024 - Going serverless with Quarkus GraalVM native images...
AmsterdamJUG April 2024 - Going serverless with Quarkus GraalVM native images...
 
Business Analyzopedia - Your Pocket Gita for Business Analysis
Business Analyzopedia - Your Pocket Gita for Business AnalysisBusiness Analyzopedia - Your Pocket Gita for Business Analysis
Business Analyzopedia - Your Pocket Gita for Business Analysis
 
Effort Estimation Techniques used in Software Projects
Effort Estimation Techniques used in Software ProjectsEffort Estimation Techniques used in Software Projects
Effort Estimation Techniques used in Software Projects
 
Preparing BitVisor for Supporting Multiple Architectures
Preparing BitVisor for Supporting Multiple ArchitecturesPreparing BitVisor for Supporting Multiple Architectures
Preparing BitVisor for Supporting Multiple Architectures
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration tools
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flink
 
Understanding Plagiarism: Causes, Consequences and Prevention.pptx
Understanding Plagiarism: Causes, Consequences and Prevention.pptxUnderstanding Plagiarism: Causes, Consequences and Prevention.pptx
Understanding Plagiarism: Causes, Consequences and Prevention.pptx
 
ETE PPT.pdf LMMKLMKLMLKMLLMJKBHJBHBNUIHBU
ETE PPT.pdf LMMKLMKLMLKMLLMJKBHJBHBNUIHBUETE PPT.pdf LMMKLMKLMLKMLLMJKBHJBHBNUIHBU
ETE PPT.pdf LMMKLMKLMLKMLLMJKBHJBHBNUIHBU
 
Tech Tuesday Slides - Getting Started with the Portfolio Module.
Tech Tuesday Slides - Getting Started with the Portfolio Module.Tech Tuesday Slides - Getting Started with the Portfolio Module.
Tech Tuesday Slides - Getting Started with the Portfolio Module.
 
ManageIQ - Sprint 234 Review - Slide Deck
ManageIQ - Sprint 234 Review - Slide DeckManageIQ - Sprint 234 Review - Slide Deck
ManageIQ - Sprint 234 Review - Slide Deck
 
Key Steps in Agile Software Delivery Roadmap
Key Steps in Agile Software Delivery RoadmapKey Steps in Agile Software Delivery Roadmap
Key Steps in Agile Software Delivery Roadmap
 
Zer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdfZer0con 2024 final share short version.pdf
Zer0con 2024 final share short version.pdf
 
OpenMetadata Community Meeting - 4th April, 2024
OpenMetadata Community Meeting - 4th April, 2024OpenMetadata Community Meeting - 4th April, 2024
OpenMetadata Community Meeting - 4th April, 2024
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 
Explore the Three Main Types of Logistics - Inbound Logistics, Outbound Logis...
Explore the Three Main Types of Logistics - Inbound Logistics, Outbound Logis...Explore the Three Main Types of Logistics - Inbound Logistics, Outbound Logis...
Explore the Three Main Types of Logistics - Inbound Logistics, Outbound Logis...
 
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
 
Leveraging the Expertise of a Social Media Fraud Analyst to Safeguard Brand R...
Leveraging the Expertise of a Social Media Fraud Analyst to Safeguard Brand R...Leveraging the Expertise of a Social Media Fraud Analyst to Safeguard Brand R...
Leveraging the Expertise of a Social Media Fraud Analyst to Safeguard Brand R...
 
Transform your Corporate Strategy Office - Harness OnePlan’s Strategic Portfo...
Transform your Corporate Strategy Office - Harness OnePlan’s Strategic Portfo...Transform your Corporate Strategy Office - Harness OnePlan’s Strategic Portfo...
Transform your Corporate Strategy Office - Harness OnePlan’s Strategic Portfo...
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
oracle 23c new features for developer and dba
oracle 23c new features for developer and dbaoracle 23c new features for developer and dba
oracle 23c new features for developer and dba
 

From Queries to Algorithms to Advanced ML: 3 Pharmaceutical Graph Use Cases

  • 1. From Advanced Queries to Algorithms to Advanced ML: 3 Pharmaceutical Graph Use Cases Dr. Alexander Jarasch
  • 2. • 5 partners + assoc. partners • 450 researchers • bundles basic research and clinical trials expertise • => variety of data 
 => unstructured 
 => heterogeneous 
 => not connected 
 => unFAIR
  • 3. DZD Data and Knowledge Management team Dr. Alexander Jarasch Justus Täger Tim Bleimehl Angela Dedie Yaroslav Zdravomyslov
  • 4. The Challenge Connecting data (silos) -> get new insights Easy question -> Difficult to answer
  • 5. The Challenge Variety of users / diversity of scientific questions Scientists Medical
 Doctors Data
 Scientists Graphdatabase
  • 6. Biological question: Are human T2D genes enzymes acting on metabolites which in turn are regulated in pig diabetes model? 
 The actual question (from a data-point-of-view): 
 
 Is there a connection between A and R? => 3s to look into the Excel sheet Why graph? Easy scientific question
  • 7. 
 The actual question (from a data-point-of-view): 
 
 Is there a connection between A and R? => 3s to look into the graph A B C E D F G K Q R S W Z U Why graph? Easy scientific question
  • 8. Back to the question Are human T2D genes enzymes acting on metabolites which in turn are regulated in pig diabetes model? Genomics Human diabetic data Genes SNPs Proteins Enzymes Pathways Metabolites Metabolomics Pre diabetic pig Metabolites List of SNPs List of Genes of (species 1) List of Proteins of (species 1) List of loci List of Enzymes of (species 1) List of Pathways of (species 1) List of Metabolites of (species 1) List of Metabolites of (species 2) graph
  • 9. Why graph? -> why not relational • biomedical data / healthcare data is highly connected • => variety of data 
 => unstructured 
 => heterogeneous 
 => not connected 
 => unFAIR • easy to model • extremely flexible / easy adoptable („re-shaping the graph“) vs. static SQL model • scalable (Billion of nodes+relationships on a single machine • easy to query (cyclic dependencies) • GraphDataScience library + graph embeddings
  • 11. DZDconnect: Concept DZD in-house data Natural Language Processing Inferring knowledge Knowledge Graph
  • 12. DZDconnect: stats • PROD-Server: 323m nodes, 1.1bn relationships => 480GB • DEV-Server: 1.1bn nodes, 4.8bn relationships • Singleserver (60 CPUs, 256GB memory, only SSDs) • 4 developers 
 • Neo4j enterprise (live backup, GDS) • UI: flask web server, SemSpect, Neo4j browser • Visualization for interactive browsing (SemSpect by derive GmbH) • Bloom (semi-natural-language queries) Strata Data 
 Award finalist 2019 bytes4diabetes Award 2020 Graphie Award 2018 We have 
 DB role model
  • 13. DZDconnect: data integration + ML Gene RNA Protein CODES CODES CODES* • Python • Py2Neo, GraphIO • Docker Pipeline for orchestration (open-source by DZD) • Based on integrated data => annotate / enrich • textmatching + Natural Language Processing • „shortcuts“ for queries (reduce #hops) • inferring knowledge
  • 14. DZDconnect: data model <-> human readable = easy to query
  • 16. The Challenge User with a specific input => specific output Scientist multi-omics
 experiment
 output Flask app
  • 17. The Challenge User ”start somewhere -> explore freely knowledge” SemSpect interactive browsing Start from any node Scientist
 or
 Medical
 Doctor
  • 18. The Challenge User with data analysis skills / computer scientist Scientist Start from any node Cypher query language Graph Data Science
  • 19. Use case 1 Handle mapping identifiers of molecular entities Knowledge Graph
  • 20. Query „friends of a friend“ on a gene level 
 Example: diabetes relevant gene ‚TCF7L2’ match path=(g:Gene{sid:'TCF7L2'})-[:MAPS|SYNONYM*0..2]-(g1:Gene) return path
  • 21. Use case 2 Find information that is NOW connected Knowledge Graph
  • 22. Query for SNPs (mutations) associated to diabetes 
 Output: relevant protein and its function (ontology terms) match (tr:Trait) where tr.name contains ‚diabetes mellitus‘ with tr as disease match path=(disease)<-[:ASSOCIATED_WITH_TRAIT]-(asso:Association)<-[:SNP_HAS_ASSOCIATION]-(snp:SNP)- [:SNP_HAS_GENE]-(gene:Gene)-[:MAPS]-(g1:Gene)-[x:CODES]->(transcript:Transcript)-[:CODES]-> (prot:Protein)-[:ASSOCIATION]->(term:Term)—(o:Ontology) return path
  • 23. Use case 3 Using graph algorithms to infer new insights Natural Language Processing 
 Ontologies Knowledge Graph
  • 24. Google’s page rank algorithm - find the most relevant gene 
 finding ACE2 - the receptor the SARS-Cov2 virus uses to enter the cell • 140’000 abstracts from Covid19 related publications • NamedEntityRecognition 
 of gene names • Page Rank identified 
 ‚ACE2‘ as the most relevant 
 gene
  • 25. Who’s this ACE2-guy? source: https://www.benaroyaresearch.org/blog/post/11-things-know-about-mrna-vaccines-covid-19
  • 26. Use case 4 Using node embeddings to sub phenotype diabetic patients Natural
  • 27. DZDconnect connect raw data of diabetic patients with cancer Clinical data from 404 diabetic patients
  • 29. DZDconnect connect transcriptomics fingerprint Transcriptomics experiment with 58’345 specific Transcripts (RNAs)
  • 30. Transform patients Fast random projections (fastRP) CALL gds.fastRP.write ( 'patients' , { embeddingDimension: 50 , writeProperty: 'fastrp- embedding' } ) YIELD nodePropertiesWritten Lipido
  • 31. k-nearest neighbour clustering with k=5 representing the 5 diabetes subtypes patient 01 patient 02 patient 03 Graph
 algorithms patient 04 patient 05 patient 02 p a t i e n t 0 4 patient 03 patient 05 patient 01 subphenotyping of diabetic patients
  • 32. DZDconnect connect patient data with knowledge graph Transcript Gene Synonyms Abstract PubMed 
 Article Keyword 
 MeSH-term Ontology term Hello role-model :-)
  • 33. Take home message • Knowledge graph • as single point of truth • connect in-house data • scalability • infer new insights 
 • Use cases: • simple and advanced (Cypher) queries • Graph Data Science library (page rank, kNN) • Node embeddings for complex data • NLP • Visualization of graph • different users • flask app, browser, SemSpect,…