Cognitive and AI applications consume data that is far too complex for current databases. These systems require an expressive data model and an intelligent query language to perform knowledge engineering over complex datasets. GRAKN.AI is a database to organise such complex networks of data.
Systems biology is one domain that produces huge amounts of data, and the complexity of that data makes integration difficult. Because understanding the complex relationships among these biological data is one of the key goals in biology, solutions that speed up the integration and querying of such data are needed.
However, analysing large volumes of biological data with traditional database systems is challenging. In this talk, we will demonstrate how integrating a sequencing algorithm with a Grakn knowledge graph leads to valuable new insights into our data at scale.
1. THE KNOWLEDGE GRAPH
Join our community at grakn.ai/community
Using Grakn in Protein Sequence Alignment
By Tomas Sabat
COO of GRAKN.AI
@graknlabs
@tasabat
2. Follow us @GraknLabs
We push the boundary of intelligent systems forward,
starting at the database.
3. Follow us @GraknLabs
What problems have we found in bioinformatics?
Integrating datasets: difficult to integrate heterogeneous and flat datasets
Contextualisation of insights: difficult to ingest insights and connect them with the rest of the public/private data
Discovery of new relationships: difficult to reason and discover new explainable relationships
4. Follow us @GraknLabs
Grakn accelerates biomedical knowledge discovery
Faster ingestion and integration of data: Grakn’s hyper-relational data model enables fast and flexible integration of heterogeneous biomedical data sets
Contextualise newly generated insights: bring context to newly generated insights by understanding how they interact with the rest of the graph
Discover and explain new relationships: through automated deductive reasoning over data points, derive new relationships between biological components
5. Follow us @GraknLabs
GRAKN.AI: the knowledge base foundation for intelligent systems (a.k.a. GRAKN.AI is a knowledge graph)
Knowledge Storage System: novel knowledge representation system based on hypergraph theory
Knowledge Inference: OLTP reasoning engine
Knowledge Analytics: OLAP distributed analytics
10. Follow us @GraknLabs
Integrating data
[Diagram: schema with the entities Drug, Disease, Protein and Gene; a gene-protein-encoding relationship with the roles encoding-gene and encoded-protein; Kinase, Ion-channel, Nuclear-receptor and GPCR each shown as a subtype (sub)]
11. Follow us @GraknLabs
Integrating data
[Diagram: schema linking Protein, Drug, Disease and Gene through the relationships and roles listed below]
gene-protein-encoding: encoding-gene, encoded-protein
protein-disease-association: associated-protein, associated-disease
gene-disease-association: associated-gene, associated-disease
drug-protein-interaction: interacted-drug, target-protein
drug-gene-interaction: inhibitor, target-gene
drug-disease-association: therapeutic, affected-disease
Kinase, Ion-channel, Nuclear-receptor and GPCR each appear as a subtype (sub).
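The relationship and role labels on this slide can be written down as plain data. The following is a minimal sketch in Python, not Graql and not the Grakn client API; `RelationType`, `ENTITY_TYPES`, `RELATION_TYPES` and `relations_with_role` are hypothetical names introduced only for illustration, and the pairing of roles to relationships is read off the diagram labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationType:
    """A relationship type and the roles its participants play."""
    name: str
    roles: tuple


# Entity types named on the slide.
ENTITY_TYPES = {"drug", "disease", "protein", "gene"}

# Relationship types and role names as read from the diagram labels
# (the role-to-relationship pairing is an assumption).
RELATION_TYPES = [
    RelationType("gene-protein-encoding", ("encoding-gene", "encoded-protein")),
    RelationType("protein-disease-association", ("associated-protein", "associated-disease")),
    RelationType("gene-disease-association", ("associated-gene", "associated-disease")),
    RelationType("drug-protein-interaction", ("interacted-drug", "target-protein")),
    RelationType("drug-gene-interaction", ("inhibitor", "target-gene")),
    RelationType("drug-disease-association", ("therapeutic", "affected-disease")),
]


def relations_with_role(role):
    """Return the names of relationship types in which a role appears."""
    return [r.name for r in RELATION_TYPES if role in r.roles]
```

Calling `relations_with_role("associated-disease")` lists the two association relationships that a disease can take part in under that role.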
15. Follow us @GraknLabs
How do we ingest and contextualise sequencing data?
[Diagram: the schema from slide 11, unchanged: Protein, Drug, Disease and Gene with their encoding, association and interaction relationships, plus the Kinase, Ion-channel, Nuclear-receptor and GPCR subtypes]
16. Follow us @GraknLabs
How do we ingest and contextualise sequencing data?
[Diagram: the schema from slide 11, extended with a protein-protein-alignment relationship (roles: target-protein, matched-protein) and a new element labelled sequence]
17. Follow us @GraknLabs
How do we ingest and contextualise sequencing data?
[Diagram: the schema from slide 11 with protein-protein-alignment (roles: target-protein, matched-protein) and sequence, extended with a sequence-sequence-alignment relationship (roles: target-seq, matched-seq)]
18. Follow us @GraknLabs
How do we ingest and contextualise sequencing data?
[Diagram: the schema from slide 11 with protein-protein-alignment (roles: target-protein, matched-protein), sequence and sequence-sequence-alignment (roles: target-seq, matched-seq), extended with the labels sequence-positivity and sequence-identicality]
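To make the sequence-positivity and sequence-identicality labels concrete, here is a minimal Python sketch that computes both values for one pair of already aligned sequences and packages them as a record shaped like a sequence-sequence-alignment. It is an illustration under assumptions, not the pipeline used in the talk: `SIMILAR_PAIRS` is a hypothetical stand-in for a real substitution matrix, and `alignment_scores` and `alignment_record` are made-up helper names.

```python
# Hypothetical set of residue pairs treated as "similar"; a real pipeline
# would derive positives from a substitution matrix instead.
SIMILAR_PAIRS = {frozenset("IL"), frozenset("KR"), frozenset("DE"), frozenset("ST")}


def alignment_scores(target_seq, matched_seq):
    """Return (identicality, positivity) for two equal-length aligned sequences."""
    if len(target_seq) != len(matched_seq):
        raise ValueError("aligned sequences must have equal length")
    if not target_seq:
        raise ValueError("aligned sequences must not be empty")
    identical = 0
    positive = 0
    for a, b in zip(target_seq, matched_seq):
        if a == b and a != "-":
            # Identical residues count towards both scores; gaps count for neither.
            identical += 1
            positive += 1
        elif frozenset((a, b)) in SIMILAR_PAIRS:
            positive += 1
    n = len(target_seq)
    return identical / n, positive / n


def alignment_record(target_seq, matched_seq):
    """Build a dict shaped like a sequence-sequence-alignment with its two scores."""
    identicality, positivity = alignment_scores(target_seq, matched_seq)
    return {
        "relation": "sequence-sequence-alignment",
        "target-seq": target_seq,
        "matched-seq": matched_seq,
        "sequence-identicality": identicality,
        "sequence-positivity": positivity,
    }
```

A record like this is the kind of newly generated insight that would then be loaded into the graph and connected to the proteins that own the two sequences.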
20. Follow us @GraknLabs
How do we discover new relationships?
[Diagram: the full schema: Protein, Drug, Disease and Gene with their encoding, association and interaction relationships; protein-protein-alignment (roles: target-protein, matched-protein); sequence and sequence-sequence-alignment (roles: target-seq, matched-seq); sequence-positivity and sequence-identicality; and the Kinase, Ion-channel, Nuclear-receptor and GPCR subtypes]
21. Follow us @GraknLabs
Complex Query Example
What drugs are associated with “Asthma”?
[Diagram: query graph starting from the Disease “Asthma” and reaching Drug through Gene nodes, proteins (Nuclear Receptor, Kinase, Ion channel, GPCR) and their sequences (seq), connected by edges labelled ass, enc and align, with id and pos attached to the alignment]
22. Follow us @GraknLabs
Complex Query Example
What drugs are associated with “Asthma”?
[Diagram: the same query graph for the Disease “Asthma”, with Gene nodes, proteins (Nuclear Receptor, Kinase, Ion channel, GPCR), sequences (seq) and Drug connected by ass and enc edges, now showing an additional align edge, with id and pos attached to the alignment]
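One way to read the query on this slide is as a traversal: start at the disease, follow associations to genes, follow encodings to proteins, hop across alignments to similar proteins, and collect the drugs that target any of them. The Python sketch below illustrates that traversal over toy in-memory facts. It is not Graql, the identifiers are made up, and the choice of path is an assumption about what the diagram shows.

```python
# Toy facts with made-up identifiers; none of this is real biomedical data.
GENE_DISEASE = {("gene-1", "Asthma")}              # gene associated with disease
GENE_PROTEIN = {("gene-1", "protein-A")}           # gene encodes protein
PROTEIN_ALIGNMENT = {("protein-A", "protein-B")}   # proteins with aligned sequences
DRUG_PROTEIN = {("drug-X", "protein-B"), ("drug-Y", "protein-C")}  # drug targets protein


def drugs_for_disease(disease):
    """Return drugs reachable from a disease via genes, proteins and alignments."""
    genes = {g for g, d in GENE_DISEASE if d == disease}
    proteins = {p for g, p in GENE_PROTEIN if g in genes}
    # Alignment is treated as symmetric, so look in both directions.
    aligned = {b for a, b in PROTEIN_ALIGNMENT if a in proteins}
    aligned |= {a for a, b in PROTEIN_ALIGNMENT if b in proteins}
    candidates = proteins | aligned
    return sorted(d for d, p in DRUG_PROTEIN if p in candidates)
```

In this toy data, `drug-X` is only reachable because `protein-B` aligns with a protein encoded by an asthma-associated gene, which is the kind of indirect link the alignment data is meant to expose.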
23. Follow us @GraknLabs
Rule Example: Transitive Relationship
[Diagram: two Protein nodes linked by an align edge, with a Drug and a Disease attached through three edges labelled ass]
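The slide does not spell the rule out, so the following Python sketch shows one plausible reading of it: if a drug is associated with a protein, that protein aligns with a second protein, and the second protein is associated with a disease, then a drug-disease association is inferred. The function name and the tuple layout are hypothetical, and a real Grakn rule would be written in Graql rather than Python.

```python
def infer_drug_disease(drug_protein, protein_alignment, protein_disease):
    """Infer (drug, disease) pairs across aligned proteins.

    drug_protein:      iterable of (drug, protein) associations
    protein_alignment: iterable of (protein, protein) alignments, treated as symmetric
    protein_disease:   iterable of (protein, disease) associations
    """
    inferred = set()
    for drug, protein in drug_protein:
        for a, b in protein_alignment:
            # Find the protein on the other side of the alignment, if any.
            if protein == a:
                other = b
            elif protein == b:
                other = a
            else:
                continue
            for candidate, disease in protein_disease:
                if candidate == other:
                    inferred.add((drug, disease))
    return inferred
```

Because the result is derived from explicit facts, each inferred pair can be explained by pointing at the association, alignment and association that produced it.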
25. Follow us @GraknLabs
Rule Example: Chained Rules
[Diagram: two Protein nodes, each with a sequence; the two sequences are joined by a sequence-alignment with seq-pos > 0.9 and seq-id > 0.9, and the two proteins are joined by a protein-alignment]
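Read as a rule, the diagram suggests that a protein-alignment is derived whenever the sequences of two proteins are joined by a sequence-alignment whose positivity and identity both exceed 0.9. The Python sketch below illustrates that reading; the function name, the tuple layout and the one-sequence-per-protein simplification are assumptions, and the actual rule in Grakn would be expressed in Graql.

```python
THRESHOLD = 0.9  # cut-off shown on the slide for both seq-pos and seq-id


def infer_protein_alignments(protein_sequence, sequence_alignments, threshold=THRESHOLD):
    """Infer (target-protein, matched-protein) pairs from sequence alignments.

    protein_sequence:    iterable of (protein, sequence) ownership pairs
    sequence_alignments: iterable of (target_seq, matched_seq, positivity, identicality)
    """
    # Simplification: each sequence is assumed to belong to exactly one protein.
    owner = {seq: protein for protein, seq in protein_sequence}
    inferred = set()
    for target_seq, matched_seq, positivity, identicality in sequence_alignments:
        if positivity > threshold and identicality > threshold:
            target = owner.get(target_seq)
            matched = owner.get(matched_seq)
            if target is not None and matched is not None and target != matched:
                inferred.add((target, matched))
    return inferred
```

The rules chain because the inferred protein-alignment pairs can serve as the input to a further rule, such as one that carries drug or disease associations across aligned proteins.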
27. Follow us @GraknLabs
1. How do we integrate so much biomedical data?
2. How do we ingest and contextualise sequencing data?
3. How do we discover new relationships?