Collaboratively Creating the Knowledge Graph of Life
Chris Mungall
cjmungall@lbl.gov @chrismungall
JPM Graph Gang April 2021
Who am I and why am I here?
Education: University of Edinburgh (BSc & PhD, AI + Bioinformatics)
Current: Staff Scientist, Berkeley Lab, Environmental Genomics and Systems Biology
In between: Lots of hacking and occasional research papers
What I’m known for: Biological databases and ontologies
What I actually do: Write proposals and wrangle grants and let others do the work
My Interests
genes, environment, effects, mechanisms
biocuration, machine inference, experiments
Drugs, Protein structure, Clinical data, Omics, Literature, Protein function
Assembling the data is a huge challenge!
Biological data management is hard.
We have many named things.
Drugs: 10k
Chemicals: 4tn?
Species: ~9 million
Diseases and Phenotypes: 10-50k/species
Cells: 10,000s+ types (per species)
Experiments: Raw data, ?? exabytes
Genes: 20k per species
Genetic variants: 3m in human alone
The things are interconnected
Cirrhosis (MONDO:0005155)
Liver (UBERON:0002107)
Hepatocyte (CL:0000182)
Ethanol (CHEBI:16236)
It’s hard to find and integrate the things
I guess I have a lot of reading to do!
Ontologies and Knowledge Graphs to the rescue!
I can organize it all for you!
Ontolowhat?
genes diseases cell types
What is an ontology anyway?
??? how does this help me?
It is the branch of metaphysics dealing with the nature of being.
No, it’s a formal, explicit specification of a shared conceptualization
What is an ontology anyway?
...better, I guess
A graph (network) connecting all the things you care about
[Figure: example graph. Nodes: me, pizza, cheese, food, Victor, cat, human, mammal; label: fromage@fr. Edge labels: type, is a, has pet, depicted by, likes, has part, has role.]
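A minimal sketch of how the example graph could be held in code as subject-predicate-object triples. The node and edge names come from the slide, but the figure itself does not survive in this transcript, so which edge connects which pair of nodes is an assumption made for illustration.

```python
from collections import defaultdict

# Triples (subject, predicate, object). Names come from the slide;
# the pairing of edges to nodes is assumed for illustration.
TRIPLES = [
    ("me", "type", "human"),
    ("me", "has pet", "Victor"),
    ("me", "likes", "pizza"),
    ("Victor", "type", "cat"),
    ("cat", "is a", "mammal"),
    ("human", "is a", "mammal"),
    ("pizza", "has part", "cheese"),
    ("cheese", "has role", "food"),
]


def outgoing(triples):
    """Index triples by subject so each node's edges can be looked up directly."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[s].append((p, o))
    return index


def ancestors(node, triples):
    """Follow 'type' and 'is a' edges upward and collect every class reached."""
    index = outgoing(triples)
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for p, o in index[current]:
            if p in ("type", "is a") and o not in found:
                found.add(o)
                stack.append(o)
    return found


victor_classes = ancestors("Victor", TRIPLES)
```

Walking the classification edges is what lets a question about "mammals" find Victor even though nobody recorded that fact directly.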
Ontologies enable discovery
This is fun actually...
Do owners of different kinds of pets like different kinds of food? What do those foods have in common?
Lizard owners like spicy food [p=0.04]
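The pet-and-food question can be sketched as a grouping query in which the ontology supplies the categories. All owners, pets and foods below are hypothetical toy data invented for this example, and the sketch only counts co-occurrences; it does not compute a p-value.

```python
from collections import Counter, defaultdict

# Hypothetical toy data, invented for illustration only.
IS_A = {
    "gecko": "lizard", "iguana": "lizard", "siamese": "cat",
    "curry": "spicy food", "chili": "spicy food", "pizza": "mild food",
}
HAS_PET = {"ana": "gecko", "ben": "iguana", "cy": "siamese"}
LIKES = {"ana": "curry", "ben": "chili", "cy": "pizza"}


def generalize(term):
    """Map a specific term to its parent class, or to itself if it has none."""
    return IS_A.get(term, term)


def food_by_pet_kind():
    """Count liked food categories per pet category, grouping via is-a."""
    table = defaultdict(Counter)
    for owner, pet in HAS_PET.items():
        table[generalize(pet)][generalize(LIKES[owner])] += 1
    return table


summary = food_by_pet_kind()
```

Without the is-a links, gecko owners and iguana owners would be counted separately and the shared "lizard" pattern would not be visible.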
Some of the things you can do with ontologies
Standardize, organize, & communicate data
Filter & search for data
Connect & harmonize data
Infer knowledge & make suggestions
The Gene Ontology: An Ontology for Genes
Genes 20k/species
Gene Ontology (GO)
45k ontology classes
Each gene can be categorized with multiple GO terms describing the role of each gene
The Gene Ontology is the work of many people
● Manual curation forms the backbone of the GO
● AI can help but not replace!
id: GO:0043570
name: maintenance of DNA repeat elements
id: GO:0006915
name: apoptosis
id: GO:0016446
name: somatic hypermutation of immunoglobulin genes
Inferring GO classification based on evolutionary history of genes
Effects of space radiation on CSF molecular profiles
• Innate immune system overactivated
• Decreased nervous system development
○ axonal fasciculation, astrocyte & oligodendrocyte differentiation, synaptic plasticity, axonogenesis, …
• Many negative regulation processes impaired:
○ Neuron proliferation, differentiation and projection
○ Leukocyte proliferation and differentiation
○ Extrinsic and possibly intrinsic apoptotic signaling pathways
Goal: predict individual risks for behavior deficits and brain pathologies in astronauts
proteomic data → GO analysis → predict
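The slide does not say which kind of GO analysis was run. One common form is over-representation analysis, which asks whether a GO term appears among the proteins of interest more often than chance would predict. A minimal sketch using a one-sided hypergeometric test follows; the counts in the usage line are hypothetical, not taken from the study.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K carry the GO term."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical counts: 20000 background genes, 200 annotated to the term,
# 100 genes in the study set, 5 of which carry the term.
p_value = hypergeom_tail(k=5, n=100, K=200, N=20000)
```

In practice the test is repeated for every GO term, so the resulting p-values need multiple-testing correction before any term is reported as enriched.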
GO is used by researchers… and in the clinic!
doi:10.1038/nature24487
Transgenic replacement skin was tested for off-target mutations using GO
GO describes just one aspect of biology
Drugs: 10k
Chemicals: 4tn?
Species: ~9 million
Diseases and Phenotypes: 10-50k/species
Cells: 1000s+ core types (per species)
Experiments: Raw data, ?? exabytes
Genes: 20k per species
Gene Functions
Genetic variants: 3m in human alone
There are many ontologies to categorize the other things
many biological ontologies!
Problems:
● Duplication
● Silos
● Lack of interoperability
We can build the universal ontological map of life...
...but how do we put the pieces together?
Step 1: Agreeing to work together
Open Biological Ontologies (OBO)
http://obofoundry.org
1. Well-integrated modular ontologies (subset of BioPortal)
2. Provide technical and sociotechnological framework for cooperation
3. Provide tools, best practices and infrastructure for forging new ontologies
4. Allow us to describe all of the things
@obofoundry
• 160 active ontologies
○ Developed by different teams
• Millions of classes
• Wide variety
○ Specific
■ E.g. cephalopod
○ General
■ E.g. chemicals
http://obofoundry.org
The OBO Foundry
The OBO Dashboard
Step 2: Connecting the pieces
The original bio-ontologies were silos
GO: Biological Processes
glucan biosynthesis (GO:0009250) is_a polysaccharide biosynthesis (GO:0000271)
CHEBI: Chemical Entities
glucan (CHEBI:37163) is_a polysaccharide (CHEBI:18154)
No reuse or connection
OWL to the rescue!
MODULARITY
TOOLS + REASONING
Ontology Development Environment
Command line tool for operating on ontologies (http://robot.obolibrary.org)
ODK: ONTOLOGY DEVELOPMENT KIT
ODK container
● ROBOT: ontology operations (command line)
● Make: workflows that chain together operations
● odk.py: seed an ontology project (create a GitHub repository with workflows in place)
● dosdp-tools: build ontologies rapidly from Design Pattern templates
● Reasoners: includes ELK, HermiT, Konclude
● fastobo: validation of obo format files (Rust)
Complements ODEs (Protégé)
https://github.com/INCATools/ontology-development-kit
ROBOT is an OBO tool
http://robot.obolibrary.org/
Standard Command Line Operations
OWL Axiomatization
glucan biosynthesis (GO:0009250) ⊑ polysaccharide biosynthesis (GO:0000271)
glucan biosynthesis (GO:0009250) ≡ biosynthesis (GO:0009058) ⊓ ∃has_output.glucan (CHEBI:37163)
polysaccharide biosynthesis (GO:0000271) ≡ biosynthesis (GO:0009058) ⊓ ∃has_output.polysaccharide (CHEBI:18154)
Hill et al, Dovetailing biology and chemistry: integrating the Gene Ontology with the ChEBI chemical ontology. BMC genomics, 14(1):513
OWL Reasoning leverages modularity
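glucan biosynthesis (GO:0009250) ≡ biosynthesis (GO:0009058) ⊓ ∃has_output.glucan (CHEBI:37163)
polysaccharide biosynthesis (GO:0000271) ≡ biosynthesis (GO:0009058) ⊓ ∃has_output.polysaccharide (CHEBI:18154)
glucan (CHEBI:37163) ⊑ polysaccharide (CHEBI:18154)
Inferred by OWL reasoner: glucan biosynthesis (GO:0009250) ⊑ polysaccharide biosynthesis (GO:0000271)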
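The inference on this slide can be reproduced with a very small structural check. This is a sketch of the idea only, not how ELK, HermiT or any real OWL reasoner works: it handles one definition shape (a genus class plus one has_output filler) and nothing else. The identifiers are the ones on the slide; the function and variable names are made up for the example.

```python
# Asserted subclass links inside the chemical ontology (child -> parent).
SUBCLASS_OF = {
    "CHEBI:37163": "CHEBI:18154",  # glucan is_a polysaccharide
}

# Equivalence definitions: class == genus AND (has_output SOME filler).
DEFINITIONS = {
    "GO:0009250": ("GO:0009058", "CHEBI:37163"),  # glucan biosynthesis
    "GO:0000271": ("GO:0009058", "CHEBI:18154"),  # polysaccharide biosynthesis
}


def is_subclass(child, parent):
    """True if child equals parent or reaches it via asserted subclass links."""
    while child is not None:
        if child == parent:
            return True
        child = SUBCLASS_OF.get(child)
    return False


def infer_subsumptions():
    """Infer A subClassOf B when A's genus and filler are subsumed by B's."""
    inferred = []
    for a, (genus_a, filler_a) in DEFINITIONS.items():
        for b, (genus_b, filler_b) in DEFINITIONS.items():
            if a == b:
                continue
            if is_subclass(genus_a, genus_b) and is_subclass(filler_a, filler_b):
                inferred.append((a, b))
    return inferred


inferred_links = infer_subsumptions()
```

The GO hierarchy link is never written by hand here: it falls out of the two definitions plus the single is_a link maintained in ChEBI, which is the modularity benefit the slide describes.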
OBO Relation Ontology: glue within and between ontologies
http://obofoundry.org/ontology/ro
Connected Knowledge Graphs using Ontologies
So far, so good...
Challenges with OWL
Under-axiomatization
Over-axiomatization
OWL experts
● Author OWL templates
● Create Design Patterns
Ontology Developers
● Implement OWL templates
● Test against Design Patterns
Ontology Users
● Consume pre-reasoned hierarchies
Leverage the Expertise Pyramid
https://github.com/INCATools/dead_simple_owl_design_patterns
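The division of labour in the pyramid rests on templates: an expert writes a pattern once, and developers fill it in per term. A minimal sketch of that idea follows. It is not the DOSDP file format; the field names `label` and `equivalent_to` and the definition syntax are placeholders chosen for this example.

```python
from string import Template

# A hypothetical pattern: the label and the logical definition share a variable.
PATTERN = {
    "label": Template("$chemical biosynthesis"),
    "equivalent_to": Template("biosynthesis and (has_output some $chemical)"),
}


def instantiate(pattern, chemical):
    """Fill the pattern's variable to produce one term's label and definition."""
    return {name: tpl.substitute(chemical=chemical) for name, tpl in pattern.items()}


terms = [instantiate(PATTERN, c) for c in ("glucan", "polysaccharide")]
```

Because every term produced this way has the same logical shape, a reasoner can classify all of them consistently, and a developer never has to write OWL by hand.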
Can we make semantic tools easier?
Semantic engineer / ontologist: RDF, OWL, SPARQL, SHACL
Developer, Data Scientist, Scientists, Clinicians, ..: Python, SQL, Mongo, JSON, Pandas, BigTable, Spark, Scikit-learn, Excel, Web Portals
???
id: https://example.org/linkml/hello-world
title: Really basic LinkML model
name: hello-world
version: 0.0.1
prefixes:
  linkml: https://w3id.org/linkml/
  sdo: https://schema.org/
  ex: https://example.org/linkml/hello-world/
default_prefix: ex
default_curi_maps:
  - semweb_context
imports:
  - linkml:types
classes:
  Person:
    description: Minimal information about a person
    class_uri: sdo:Person
    attributes:
      id:
        identifier: true
        slot_uri: sdo:taxID
      first_name:
        required: true
        slot_uri: sdo:givenName
        multivalued: true
      last_name:
        required: true
        slot_uri: sdo:familyName
      knows:
        range: Person
        multivalued: true
        slot_uri: foaf:knows
LinkML: Linked Data Modeling Language
MyModel
Documentation
OWL
JSON Schema
ShEx Schema
Schema.py
GraphQL Schema
JSONLD Context
. . .
LinkML
schema
http://linkml.github.io
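To make "one schema, many generated artifacts" concrete, here is a hand-rolled sketch that derives a JSON Schema fragment from the Person class of the hello-world model. It is not LinkML's own generator and covers only the `identifier`, `required` and `multivalued` flags. It also assumes that a slot whose range is another class (here `knows`) is stored as that object's identifier, so every value is typed as a string.

```python
import json

# The Person class from the hello-world schema, transcribed as a Python dict.
PERSON = {
    "id": {"identifier": True},
    "first_name": {"required": True, "multivalued": True},
    "last_name": {"required": True},
    "knows": {"range": "Person", "multivalued": True},
}


def to_json_schema(name, attributes):
    """Translate a few LinkML-style attribute flags into a JSON Schema object."""
    properties, required = {}, []
    for slot, meta in attributes.items():
        item = {"type": "string"}  # references are assumed to be identifier strings
        if meta.get("multivalued"):
            properties[slot] = {"type": "array", "items": item}
        else:
            properties[slot] = item
        # Identifiers are treated as required, like explicitly required slots.
        if meta.get("required") or meta.get("identifier"):
            required.append(slot)
    return {"title": name, "type": "object", "properties": properties, "required": required}


schema = to_json_schema("Person", PERSON)
print(json.dumps(schema, indent=2))
```

The same dictionary could drive other targets in the list above (documentation, Python classes, and so on), which is why the model is authored once in a neutral form.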
Biolink Model
Biolink: Goals
The charge from NCATS:
● Create a Knowledge Graph Schema
● Encompass all biology from molecules through to clinical entities
● Get 20 different sites using the same data model
○ (oh: Only a handful of which use RDF/OWL)
● Do it quickly and break new ground in Translational Science
Where we are (year 2 or 5)
● All participants using common KG datamodel
● Early demonstrations of powerful federated queries
● LinkML advantages:
○ Edges are first-class citizens
○ Ontologies/OWL leveraged, but in background
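A sketch of what "edges are first-class citizens" can look like in code: each edge is its own record and can carry metadata, rather than being a bare link between two nodes. The subject, predicate and object fields follow the usual triple shape. The `sources` and `evidence` fields, the predicate names and the source name are hypothetical, and the relationships shown are placeholders rather than curated facts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A statement in the graph that can carry its own metadata."""
    subject: str
    predicate: str
    object: str
    # Hypothetical metadata fields, named for illustration only.
    sources: tuple = ()
    evidence: str = ""


# Placeholder statements using identifiers that appear earlier in the talk.
edges = [
    Edge("CHEBI:16236", "affects", "CL:0000182", sources=("example-db",), evidence="curated"),
    Edge("CL:0000182", "part_of", "UBERON:0002107", sources=("example-db",)),
]


def with_evidence(edge_list, kind):
    """Select the edges whose evidence field matches the requested kind."""
    return [e for e in edge_list if e.evidence == kind]


curated = with_evidence(edges, "curated")
```

Keeping provenance on the edge itself is what makes federated queries practical: results from different sites can be merged and still traced back to where each statement came from.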
National Microbiome Data Collaborative
Goal
● Make multi-omics microbiome data FAIR
○ Environments
○ Metagenomes
○ Metatranscriptomes
○ Metabolomics
○ Metaproteomics
● Leverage existing ontologies and standards
● Enable discovery in microbiome science
● Collaboration between multiple National Labs
Approach
● Formalize existing “checklist” standards
● Create modular schema
● Leverage MIxS, ENVO, PROV
Why LinkML
● Developers like JSON + JSON-Schema
● Biologists like spreadsheets
● “Semantic enums” work well
● Needed something that worked with traditional technology (Mongo, Postgres)
● “Stealth semantics”
○ Everything has URI
○ All JSON is transparently JSON-LD
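"Stealth semantics" can be sketched as follows: developers exchange ordinary JSON, and a separate context maps each key to a URI whenever semantics are needed. This is a hand-rolled illustration rather than a conforming JSON-LD processor. The record values are made up, and the context reuses the prefixes and slot URIs from the hello-world schema.

```python
# Plain JSON as a developer would exchange it (values are made up).
record = {"id": "ex:P1", "first_name": "Ada", "last_name": "Lovelace"}

# A JSON-LD style context mapping prefixes and keys to URIs.
context = {
    "sdo": "https://schema.org/",
    "ex": "https://example.org/linkml/hello-world/",
    "first_name": "sdo:givenName",
    "last_name": "sdo:familyName",
}


def expand_curie(value, ctx):
    """Expand prefix:local to a full URI when the prefix is in the context."""
    prefix, sep, local = value.partition(":")
    if sep and prefix in ctx:
        return ctx[prefix] + local
    return value


def to_statements(rec, ctx):
    """Turn a JSON record into (subject URI, predicate URI, value) statements."""
    subject = expand_curie(rec["id"], ctx)
    return [
        (subject, expand_curie(ctx[key], ctx), value)
        for key, value in rec.items()
        if key != "id" and key in ctx
    ]


statements = to_statements(record, context)
```

The JSON payload never changes; only the presence of the context decides whether a consumer sees plain keys or linked data.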
Where we are (year 2)
● Unified modular schema
● Easy for developers
○ System based mainly on JSON exchange
○ RDF can be leveraged
○ Currently Mongo + Postgres + TerminusDB
● Easy for biologists
○ Spreadsheets and validators created from the schema
● Everything has semantics
○ “On-the-fly” JSON-LD
○ Satisfies FAIR mandate
Take Homes
1. Building the graph of life requires collaboration, social engineering, and lots of curation
2. OWL is a powerful framework but it can be challenging to deploy effectively in an information system
3. Integrating data into cohesive ontologies/KGs is hard but the return on investment is high
4. LinkML provides a unifying layer over tooling… but more hands on deck required!
Some Links
● Open Bio Ontologies: http://obofoundry.org/resources
● ODK: https://github.com/INCATools/ontology-development-kit
● LinkML: https://linkml.github.io
● KG Hub: https://knowledge-graph-hub.github.io/
● GO: http://geneontology.org
● http://douroucouli.wordpress.com: My blog on all things OWL and Knowledge Graphs

G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxyaramohamed343013
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 

Recently uploaded (20)

Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 

Collaboratively Creating the Knowledge Graph of Life

  • 1. Collaboratively Creating the Knowledge Graph of Life Chris Mungall cjmungall@lbl.gov @chrismungall JPM Graph Gang April 2021
  • 2. Who am I and why am I here? Education: University of Edinburgh (BSc & PhD, AI + Bioinformatics) Current: Staff Scientist, Berkeley Lab, Environmental Genomics and Systems Biology In between: Lots of hacking and occasional research papers What I’m known for: Biological databases and ontologies What I actually do: Write proposals and wrangle grants and let others do the work
  • 6. Biological data management is hard. We have many named things. Drugs: 10k. Chemicals: 4tn? Species: ~9 million. Diseases and phenotypes: 10-50k/species. Cells: 10,000s+ types (per species). Experiments, raw data: ?? exabytes. Genes: 20k per species. Genetic variants: 3m in human alone.
  • 7. The things are interconnected Cirrhosis MONDO:0005155 Liver UBERON:0002107 Hepatocyte CL:0000182 Ethanol CHEBI:16236
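The identifiers on slide 7 are compact URIs (CURIEs): an ontology prefix, a colon, and a local ID. A minimal sketch of how such identifiers could be expanded into resolvable links, assuming the usual OBO PURL pattern `http://purl.obolibrary.org/obo/<PREFIX>_<ID>`; the function names are hypothetical and only the four terms from the slide are used:

```python
# Assumption: OBO PURLs follow http://purl.obolibrary.org/obo/<PREFIX>_<LOCAL_ID>.
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"

# Identifiers and labels exactly as they appear on the slide.
TERMS = {
    "MONDO:0005155": "Cirrhosis",
    "UBERON:0002107": "Liver",
    "CL:0000182": "Hepatocyte",
    "CHEBI:16236": "Ethanol",
}


def ontology_of(curie: str) -> str:
    """Return the prefix part of a CURIE, which names the source ontology."""
    return curie.split(":", 1)[0]


def curie_to_purl(curie: str) -> str:
    """Expand a CURIE such as 'CL:0000182' into an OBO-style PURL."""
    prefix, local_id = curie.split(":", 1)
    return f"{OBO_PURL_BASE}{prefix}_{local_id}"


if __name__ == "__main__":
    for curie, label in TERMS.items():
        print(f"{label}: {ontology_of(curie)} -> {curie_to_purl(curie)}")
```

Each of the four terms comes from a different ontology (disease, anatomy, cell type, chemical), which is why a shared identifier scheme matters for connecting them.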
  • 8. It’s hard to find and integrate the things I guess I have a lot of reading to do!
  • 9. Ontologies and Knowledge Graphs to the rescue! I can organize it all for you! Ontolowhat? genes diseases cell types
  • 10. What is an ontology anyway? ??? how does this help me? It is the branch of metaphysics dealing with the nature of being. No, it’s a formal, explicit specification of a shared conceptualization
  • 11. What is an ontology anyway? ...better, I guess A graph (network) connecting all the things you care about me pizza cheese food Victor cat human mammal fromage@fr type is a has pet depicted by likes has part has role type
  • 12. Ontologies enable discovery This is fun actually... Do owners of different kinds of pets like different kinds of food? What do those foods have in common? me pizza cheese food Victor cat human mammal fromage@fr type is a has pet depicted by likes has part has role type Lizard owners like spicy food [p=0.04]
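The question on slide 12 can be sketched as a query over a small set of triples. The flattened transcript does not say which nodes each edge connects, so the edge list below is a guess at the original diagram and should be read as an assumption; the helper names are hypothetical too.

```python
# Assumption: these subject/predicate/object pairings are a guess at the
# slide's diagram; the transcript only lists node and edge labels.
TRIPLES = [
    ("me", "type", "human"),
    ("Victor", "type", "cat"),
    ("cat", "is a", "mammal"),
    ("me", "has pet", "Victor"),
    ("me", "likes", "pizza"),
    ("pizza", "has part", "cheese"),
    ("cheese", "has role", "food"),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def ancestors(cls):
    """Transitive closure of 'is a' starting from cls (excluding cls itself)."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for parent in objects(current, "is a"):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def instance_of(individual, cls):
    """True if the individual has type cls, directly or via 'is a'."""
    return any(t == cls or cls in ancestors(t) for t in objects(individual, "type"))


def liked_by_owners_of(pet_class):
    """Things liked by anyone who has a pet of the given class."""
    liked = set()
    for s, p, o in TRIPLES:
        if p == "has pet" and instance_of(o, pet_class):
            liked.update(objects(s, "likes"))
    return liked


if __name__ == "__main__":
    print(liked_by_owners_of("mammal"))
```

Because `instance_of` walks the "is a" hierarchy, asking about owners of a mammal also finds owners of a cat, which is the kind of grouping the slide's question depends on.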
  • 13. Some of the things you can do with ontologies Standardize, organize, & communicate data Filter & search for data Connect & harmonize data Infer knowledge & make suggestions
  • 14. The Gene Ontology: An Ontology for Genes Genes 20k/species Gene Ontology (GO) 45k ontology classes Each gene can be categorized with multiple GO terms describing the role of each gene
  • 15. The Gene Ontology is the work of many people ● Manual curation forms the backbone of the GO ● AI can help but not replace!
  • 16. id: GO:0043570 name: maintenance of DNA repeat elements id: GO:0006915 name: apoptosis id: GO:0016446 name: somatic hypermutation of immunoglobulin genes Inferring GO classification based on evolutionary history of genes
  • 17. Effects of space radiation on CSF molecular profiles • Innate immune system overactivated • Decreased nervous system development • axonal fasciculation, astrocyte & oligodendrocyte differentiation, synaptic plasticity, axonogenesis, … • Many negative regulation processes impaired: • Neuron proliferation, differentiation and projection • Leukocyte proliferation and differentiation • Extrinsic and possibly intrinsic apoptotic signaling pathways Goal: predict individual risks for behavior deficits and brain pathologies in astronauts proteomic data GO analysis predict
  • 18. GO is used by researchers… and in the clinic! doi:10.1038/nature24487 Transgenic replacement skin was tested for off-target mutations using GO
  • 19. GO describes just one aspect of biology. Drugs: 10k. Chemicals: 4tn? Species: ~9 million. Diseases and phenotypes: 10-50k/species. Cells: 1000s+ core types (per species). Experiments, raw data: ?? exabytes. Genes: 20k per species. Gene functions. Genetic variants: 3m in human alone.
  • 20. There are many ontologies to categorize the other things many biological ontologies! Problems: ● Duplication ● Silos ● Lack of interoperability
  • 21. We can build the universal ontological map of life... ...but how do we put the pieces together?
  • 23. Open Biological Ontologies (OBO) http://obofoundry.org @obofoundry 1. Well-integrated modular ontologies (a subset of BioPortal) 2. Provide technical and sociotechnological framework for cooperation 3. Provide tools, best practices and infrastructure for forging new ontologies 4. Allow us to describe all of the things
  • 24. • 160 active ontologies ○ Developed by different teams • Millions of classes • Wide variety ○ Specific ■ E.g. cephalopod ○ General ■ E.g. chemicals http://obofoundry.org The OBO Foundry
  • 27. The original bio-ontologies were silos. GO (Biological Processes): glucan biosynthesis (GO:0009250) is_a polysaccharide biosynthesis (GO:0000271). CHEBI (Chemical Entities): glucan (CHEBI:37163) is_a polysaccharide (CHEBI:18154). No reuse or connection
  • 28. OWL to the rescue! MODULARITY TOOLS + REASONING
  • 32. ODK: Ontology Development Kit https://github.com/INCATools/ontology-development-kit Components of the ODK container: ROBOT: ontology operations (command line). Make: workflows, chains together operations. odk.py: seed an ontology project, creating a GitHub repository with workflows in place. dosdp-tools: build ontologies rapidly from Design Pattern templates. Reasoners: includes ELK, HermiT, Konclude. fastobo: validation of OBO format files (Rust). Complements ODEs (Protégé).
  • 33. ROBOT is an OBO tool http://robot.obolibrary.org/ Standard Command Line Operations
  • 35. OWL Reasoning leverages modularity. glucan biosynthesis (GO:0009250) ≡ biosynthesis (GO:0009058) ⊓ ∃has_output.glucan (CHEBI:37163). polysaccharide biosynthesis (GO:0000271) ≡ biosynthesis (GO:0009058) ⊓ ∃has_output.polysaccharide (CHEBI:18154). glucan (CHEBI:37163) ⊑ polysaccharide (CHEBI:18154). Inferred by OWL reasoner: glucan biosynthesis (GO:0009250) ⊑ polysaccharide biosynthesis (GO:0000271)
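The inference on slide 35 can be illustrated with a toy structural check. This is a sketch and not an OWL reasoner: it handles only definitions of the form "genus and has_output some X", which is an assumption made to keep the example small. Tools such as ELK or HermiT cover far more of OWL. The class and function names are hypothetical.

```python
from dataclasses import dataclass

# Asserted hierarchy from CHEBI: glucan is_a polysaccharide.
ASSERTED_IS_A = {"CHEBI:37163": {"CHEBI:18154"}}


@dataclass(frozen=True)
class Definition:
    """Equivalence axiom of the form: genus AND (has_output SOME output)."""
    genus: str
    has_output: str


# Logical definitions of the two GO processes, using IDs from the slide.
DEFINITIONS = {
    "GO:0009250": Definition("GO:0009058", "CHEBI:37163"),  # glucan biosynthesis
    "GO:0000271": Definition("GO:0009058", "CHEBI:18154"),  # polysaccharide biosynthesis
}


def is_subclass(child: str, parent: str) -> bool:
    """Reflexive, transitive subclass test over the asserted hierarchy."""
    if child == parent:
        return True
    return any(is_subclass(p, parent) for p in ASSERTED_IS_A.get(child, ()))


def infer_subsumptions():
    """A defined class is subsumed by another when both its genus and its
    output are subsumed by the other's genus and output."""
    inferred = []
    for sub_id, sub in DEFINITIONS.items():
        for sup_id, sup in DEFINITIONS.items():
            if sub_id == sup_id:
                continue
            if is_subclass(sub.genus, sup.genus) and is_subclass(sub.has_output, sup.has_output):
                inferred.append((sub_id, sup_id))
    return inferred


if __name__ == "__main__":
    for sub_id, sup_id in infer_subsumptions():
        print(f"{sub_id} is_a {sup_id} (inferred)")
```

The GO hierarchy link is derived from the CHEBI hierarchy rather than asserted by hand, so the two ontologies stay modular while remaining consistent.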
  • 36. OBO Relation Ontology: glue within and between ontologies http://obofoundry.org/ontology/ro
  • 37. Connected Knowledge Graphs using Ontologies
  • 38. So far, so good...
  • 40. Leverage the Expertise Pyramid. OWL experts: ● Author OWL templates ● Create Design Patterns. Ontology Developers: ● Implement OWL templates ● Test against Design Patterns. Ontology Users: ● Consume pre-reasoned hierarchies https://github.com/INCATools/dead_simple_owl_design_patterns
  • 41. Can we make semantic tools easier? RDF OWL SPARQL SHACL Semantic engineer / ontologist Developer Data Scientist Scientists, Clinicians, .. Python SQL Mongo JSON Pandas BigTable SPARK Scikit-learn Excel Web Portals ???
  • 42. LinkML: Linked Data Modeling Language http://linkml.github.io A LinkML schema (MyModel) generates: Documentation, OWL, JSON Schema, ShEx Schema, Schema.py, GraphQL Schema, JSON-LD Context, ... Example schema:

      id: https://example.org/linkml/hello-world
      title: Really basic LinkML model
      name: hello-world
      version: 0.0.1
      prefixes:
        linkml: https://w3id.org/linkml/
        sdo: https://schema.org/
        ex: https://example.org/linkml/hello-world/
      default_prefix: ex
      default_curi_maps:
        - semweb_context
      imports:
        - linkml:types
      classes:
        Person:
          description: Minimal information about a person
          class_uri: sdo:Person
          attributes:
            id:
              identifier: true
              slot_uri: sdo:taxID
            first_name:
              required: true
              slot_uri: sdo:givenName
              multivalued: true
            last_name:
              required: true
              slot_uri: sdo:familyName
            knows:
              range: Person
              multivalued: true
              slot_uri: foaf:knows
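To give a feel for what a Python view of the `Person` class on slide 42 might look like, here is a hand-written sketch. It is not the output of any LinkML generator; it only mirrors the slide's attributes (`id`, `first_name`, `last_name`, `knows`) and their `required` / `multivalued` settings with a plain dataclass. The `foaf` namespace is an assumption, since the slide's `prefixes` block does not declare it.

```python
from dataclasses import dataclass, field
from typing import List

# Prefixes copied from the slide, plus foaf (assumed: the slide uses
# foaf:knows without declaring the prefix).
PREFIXES = {
    "sdo": "https://schema.org/",
    "ex": "https://example.org/linkml/hello-world/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# slot_uri values as given in the schema.
SLOT_URIS = {
    "id": "sdo:taxID",
    "first_name": "sdo:givenName",
    "last_name": "sdo:familyName",
    "knows": "foaf:knows",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'sdo:givenName' into a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


@dataclass
class Person:
    id: str                                  # identifier: true
    first_name: List[str]                    # required, multivalued
    last_name: str                           # required
    knows: List["Person"] = field(default_factory=list)  # multivalued, range Person

    def __post_init__(self):
        # Enforce the schema's 'required: true' constraints by hand.
        if not self.first_name:
            raise ValueError("first_name is required")
        if not self.last_name:
            raise ValueError("last_name is required")


if __name__ == "__main__":
    alice = Person(id="ex:P1", first_name=["Alice"], last_name="Example")
    bob = Person(id="ex:P2", first_name=["Bob"], last_name="Example", knows=[alice])
    print(bob.knows[0].id, expand(SLOT_URIS["last_name"]))
```

The point of the slide is that the schema is written once and artifacts like this are derived from it, rather than maintained by hand as done here.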
  • 43. Biolink Model Biolink: Goals The charge from NCATS: ● Create a Knowledge Graph Schema ● Encompass all biology from molecules through to clinical entities ● Get 20 different sites using the same data model ○ (oh: Only a handful of which use RDF/OWL) ● Do it quickly and break new ground in Translational Science
  • 44. Biolink Model Where we are (year 2 or 5) ● All participants using common KG datamodel ● Early demonstrations of powerful federated queries ● LinkML advantages: ○ Edges are first-class citizens ○ Ontologies/OWL leveraged, but in background
  • 45. National Microbiome Data Collaborative Goal ● Make multi-omics microbiome data FAIR ○ Environments ○ Metagenomes ○ Metatranscriptomes ○ Metabolomics ○ Metaproteomics ● Leverage existing ontologies and standards ● Enable discovery in microbiome science ● Collaboration between multiple National Labs
  • 46. National Microbiome Data Collaborative Approach ● Formalize existing “checklist” standards ● Create modular schema ● Leverage MIxS, ENVO, PROV Why LinkML ● Developers like JSON + JSON-Schema ● Biologists like spreadsheets ● “Semantic enums” work well ● Needed something that worked with traditional technology (Mongo, Postgres) ● “Stealth semantics” ○ Everything has URI ○ All JSON is transparently JSON-LD
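The "stealth semantics" idea on slide 46 (plain JSON that is also JSON-LD) can be sketched with nothing but the standard library. The record, the field names, and the `https://example.org/nmdc-sketch/` namespace below are illustrative assumptions, not the project's actual schema or context, and `expand_key` handles only the simple prefixed-name case rather than the full JSON-LD expansion algorithm.

```python
import json

# Assumption: a hypothetical context mapping plain JSON keys to URIs.
CONTEXT = {
    "ex": "https://example.org/nmdc-sketch/",
    "id": "@id",
    "name": "ex:name",
    "env_broad_scale": {"@id": "ex:env_broad_scale", "@type": "@id"},
}


def as_jsonld(record: dict) -> dict:
    """Attach the context; the original keys and values are left untouched."""
    return {"@context": CONTEXT, **record}


def expand_key(key: str) -> str:
    """Resolve a JSON key to the URI (or JSON-LD keyword) the context gives it."""
    mapping = CONTEXT.get(key)
    if mapping is None:
        return key  # key not covered by the context
    if isinstance(mapping, dict):
        mapping = mapping["@id"]
    if mapping.startswith("@"):
        return mapping  # JSON-LD keyword such as @id
    prefix, _, local = mapping.partition(":")
    base = CONTEXT.get(prefix)
    return base + local if isinstance(base, str) else mapping


if __name__ == "__main__":
    sample = {"id": "ex:sample-1", "name": "soil core", "env_broad_scale": "ex:forest"}
    print(json.dumps(as_jsonld(sample), indent=2))
    print(expand_key("name"))
```

A developer who only knows JSON can ignore the `@context` entry entirely, while an RDF-aware consumer can use it to interpret every key as a URI.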
  • 47. National Microbiome Data Collaborative Where we are (year 2) ● Unified modular schema ● Easy for developers ○ System based mainly on JSON exchange ○ RDF can be leveraged ○ Currently Mongo + Postgres + TerminusDB ● Easy for biologists ○ Spreadsheets and validators created from the schema ● Everything has semantics ○ “On-the-fly” JSON-LD ○ Satisfies FAIR mandate
  • 48. Take Homes 1. Building the graph of life requires collaboration, social engineering, and lots of curation 2. OWL is a powerful framework but it can be challenging to deploy effectively in an information system 3. Integrating data into cohesive ontologies/KGs is hard but the return on investment is high 4. LinkML provides a unifying layer over tooling… but more hands on deck required!
  • 49. Some Links ● Open Bio Ontologies: http://obofoundry.org/resources ● ODK: https://github.com/INCATools/ontology-development-kit ● LinkML: https://linkml.github.io ● KG Hub: https://knowledge-graph-hub.github.io/ ● GO: http://geneontology.org ● http://douroucouli.wordpress.com: My blog on all things OWL and Knowledge Graphs