LinkML: A brief guide for Monarch team members
Chris Mungall
Monarch Initiative
cjmungall@lbl.gov
Monarch Data Call Sep 2021
Data modeling is ubiquitous in Monarch and Phen1
Ontology developer
● How do infectious diseases (MONDO) relate to infectious agents (NCBITaxon)? Or to treatments (MAXO) or exposures (ECTO)?
Phenopacket Team
● How do patients relate to diseases? Or to samples?
KG/Ingest Engineer
● How do diseases relate to genes?
UI developer
● How do disease pages relate to gene pages?
We love our stacks
Ontology developer
● I ❤️ OWL
● I ❤️ DOSDPs
● I ❤️ ROBOT templates
Phenopacket Team
● I ❤️ Protobuf
KG/Ingest Engineer / Graph ML
● I ❤️ KGX
● I ❤️ TSVs
● I ❤️ Neo4J
Monarch-adjacent and beyond
Translator
● How do genes relate to chemicals?
CRDC-H; NMDC
● How do I relate samples to ontological descriptors?
CD2H / N3C
● How do patients relate to drug treatments?
Allen / HCA
● How do cell types relate to genes?
OBO / RO / COB
● How do processes relate to inputs?
Monarch-adjacent and beyond
Semweb developer
● I ❤️ RDF + triplestores
● I ❤️ SHACL/ShEx
Clinical informatics
● I ❤️ FHIR
Biologists
● I ❤️ spreadsheets
CD2H / N3C
● I ❤️ OMOP
GA4GH, HCA, many devs
● I ❤️ JSON-Schema
● I ❤️ SQLite
LinkML: One ring to bind them….
https://linkml.io/ * https://github.com/linkml/linkml/
LinkML philosophy: Parasitize rather than compete
Strategy
● Be expressive enough to cover all our use cases
● Allow compilation to people’s favored stack (e.g. JSON-Schema)
● Simple to do simple things, but add-ons where necessary
● Stealth Semantics
○ Everything can be RDF/Linked Data - if you want it to be
LinkML parasitizes other toolchains
Your LinkML Schema (YAML) → LinkML parser → YourModel, emitted as:
● Documentation
● OWL
● JSON Schema
● ShEx Schema
● Python object model (Schema.py)
● GraphQL Schema
● JSON-LD Context
● . . .
Philosophy:
● Be expressive
● Parasitize
● Be developer-friendly
● Stealth semantics
● 80% rule
LinkML != Biolink Model
Biolink Model
● Expressed using LinkML
● An uber-data model for biology
○ Main types: gene, chemical, disease, …
○ Translator
○ Monarch KG
○ KG-COVID-19
○ KG-Microbe
○ KG-OBO
● Not appropriate for everything
○ Highly patient specific (CCDH, Pfx, FHIR)
○ Sample and omics data (CCDH, NMDC)
○ Single-cell data
LinkML
● A modeling language
● Can express multiple datamodels
A sample LinkML Schema
# Metadata
id: https://example.org/linkml/hello-world
title: Really basic LinkML model
name: hello-world
license: https://creativecommons.org/publicdomain/zero/1.0/
version: 0.0.1

# Namespaces
prefixes:
  linkml: https://w3id.org/linkml/
  sdo: https://schema.org/
  ex: https://example.org/linkml/hello-world/
default_prefix: ex
default_curi_maps:
  - semweb_context

# Dependencies
imports:
  - linkml:types

# Actual Model
classes:
  Person:
    description: Minimal information about a person
    class_uri: sdo:Person
    attributes:
      id:
        identifier: true
        slot_uri: sdo:taxID
      first_name:
        required: true
        slot_uri: sdo:givenName
        multivalued: true
      last_name:
        required: true
        slot_uri: sdo:familyName
      knows:
        range: Person
        multivalued: true
        slot_uri: foaf:knows
LinkML RDF is hidden in plain sight
● Reuse schema elements from core vocabularies
● FAIR (specifically: allows diverse data to be combined)
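The class_uri and slot_uri values in the sample schema are CURIEs that resolve against its prefixes block, which is where the RDF "hides". A minimal sketch of that expansion in plain Python follows. The helper name expand_curie and the hand-written prefix map are assumptions for illustration (the foaf entry is written out here, whereas the sample schema relies on default_curi_maps to supply it); this is not the LinkML runtime's own implementation.

```python
# Prefix map copied from the sample schema. The foaf entry is an assumption:
# the schema itself gets it from default_curi_maps (semweb_context).
PREFIXES = {
    "linkml": "https://w3id.org/linkml/",
    "sdo": "https://schema.org/",
    "ex": "https://example.org/linkml/hello-world/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Slot name -> slot_uri CURIE, as declared on the Person class.
PERSON_SLOT_URIS = {
    "id": "sdo:taxID",
    "first_name": "sdo:givenName",
    "last_name": "sdo:familyName",
    "knows": "foaf:knows",
}


def expand_curie(curie: str, prefixes: dict = PREFIXES) -> str:
    """Expand 'prefix:local' into a full URI (hypothetical helper)."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"cannot expand CURIE: {curie!r}")
    return prefixes[prefix] + local


if __name__ == "__main__":
    # Print each slot with its CURIE and the URI it stands for.
    for slot, curie in PERSON_SLOT_URIS.items():
        print(f"{slot:12} {curie:18} {expand_curie(curie)}")
```

Developers who only care about JSON or YAML can ignore the URIs entirely; they matter once data from different schemas needs to be combined.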
Sample model documentation output
https://hsolbrig.github.io/sample_model/docs
Sample LinkML generated schemas
Shape Expressions (ShEx) Schema
BASE <https://example.org/linkml/hello-world/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX sdo: <https://schema.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

<String> xsd:string

<Person> CLOSED {
    (  $<Person_tes> (  sdo:givenName @<String> + ;
          sdo:familyName @<String> ;
          foaf:knows @<Person> *
       ) ;
       rdf:type [ sdo:Person ]
    )
}
JSON Schema
{
  "$id": "https://example.org/linkml/person",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "definitions": {
    "Person": {
      "additionalProperties": false,
      "description": "Minimal information about a person",
      "properties": {
        "first_name": {
          "items": {
            "type": "string"
          },
          "type": "array"
        },
        "id": {
          "type": "string"
        },
        ...
GraphQL Schema
type Person
  {
    id: String!
    firstName: [String]!
    lastName: String!
    knows: [Person]
  }
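To make the idea of compiling one model into another stack's schema concrete, here is a toy sketch that turns a dict mirroring the Person class into a JSON Schema fragment shaped like the one above. The dict layout, the mapping rules, and the function names slot_to_property and to_json_schema are all assumptions for illustration; the real LinkML generators handle far more (ranges, inheritance, inlining) and this is not their implementation.

```python
import json

# Hand-written dict mirroring the Person class of the sample schema.
# This structure is an assumption made for the sketch.
PERSON = {
    "description": "Minimal information about a person",
    "attributes": {
        "id": {"identifier": True},
        "first_name": {"required": True, "multivalued": True},
        "last_name": {"required": True},
        "knows": {"range": "Person", "multivalued": True},
    },
}


def slot_to_property(slot: dict) -> dict:
    """Map one slot definition to a JSON Schema property (toy rules)."""
    # Every slot is rendered as a string here; a slot whose range is a class
    # (knows) is assumed to hold the identifier of the referenced object.
    item = {"type": "string"}
    if slot.get("multivalued"):
        return {"type": "array", "items": item}
    return item


def to_json_schema(name: str, cls: dict) -> dict:
    """Build a draft-07 style JSON Schema fragment for a single class."""
    attrs = cls["attributes"]
    # Identifier slots are treated as required alongside explicitly required ones.
    required = [n for n, s in attrs.items() if s.get("required") or s.get("identifier")]
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "definitions": {
            name: {
                "type": "object",
                "additionalProperties": False,
                "description": cls["description"],
                "properties": {n: slot_to_property(s) for n, s in attrs.items()},
                "required": required,
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(to_json_schema("Person", PERSON), indent=2))
```

The same walk over the attributes could emit GraphQL or ShEx instead; only the per-slot mapping changes, which is the point of keeping a single source model.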
LinkML can also emit OWL
Objects can be exported as JSON, YAML, or RDF
Python
from examples.basic import Person
# yaml_dumper is used below, so it is imported too.
# Note: in current LinkML releases these dumpers are imported from linkml_runtime.dumpers.
from linkml.dumpers import json_dumper, rdf_dumper, yaml_dumper

sam = Person("1172438", first_name=["Samual", "J"], last_name="Snooter")
ann = Person("17a3923", first_name="Jill", last_name="Jones", knows=[sam.id])
print(json_dumper.dumps(ann))
print(yaml_dumper.dumps(ann))
print(rdf_dumper.dumps(ann, contexts="../examples/jsonld/basic.context.jsonld"))
JSON output
{
  "id": "17a3923",
  "first_name": [
    "Jill"
  ],
  "last_name": "Jones",
  "knows": [
    "1172438"
  ],
  "@type": "Person"
}
YAML output
id: 17a3923
first_name:
- Jill
last_name: Jones
knows:
- '1172438'
RDF output (by way of JSON-LD)
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix sdo: <https://schema.org/> .

<https://example.org/linkml/hello-world/17a3923> a sdo:Person ;
    foaf:knows <https://example.org/linkml/hello-world/1172438> ;
    sdo:familyName "Jones" ;
    sdo:givenName "Jill" .
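For readers without the generated classes at hand, the JSON export can be approximated with the standard library alone. The sketch below uses a hand-written dataclass as a stand-in for the generated Person class and a hypothetical dumps function; both are assumptions and neither is the LinkML dumper API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Person:
    """Hand-written stand-in for the generated Person class (an assumption)."""
    id: str
    first_name: List[str]
    last_name: str
    knows: List[str] = field(default_factory=list)


def dumps(person: Person) -> str:
    """Serialize to JSON, adding an @type key like the JSON output on the slide."""
    data = asdict(person)
    data["@type"] = type(person).__name__
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    sam = Person("1172438", first_name=["Samual", "J"], last_name="Snooter")
    # knows holds the identifier of the referenced person, not the object itself.
    ann = Person("17a3923", first_name=["Jill"], last_name="Jones", knows=[sam.id])
    print(dumps(ann))
```

Unlike the generated class, this stand-in does not coerce a bare string such as first_name="Jill" into a list, so the list is passed explicitly.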
Objects can be imported from JSON, YAML, or RDF
Python code
from examples.basic import Person
# json_loader and rdf_loader are used below, so they are imported too.
from linkml.loaders import json_loader, rdf_loader, yaml_loader

fred = yaml_loader.load('input/fred.yaml', target_class=Person)
print(fred.first_name)    # ['Fred', 'William']
harvey = json_loader.load('https://raw.githubusercontent.com/hsolbrig/linkml-enhanced-template/master/tests/input/harvey.json', target_class=Person)
print(harvey.last_name)   # Mackerson
ann = rdf_loader.load('input/ann.xml', target_class=Person, fmt="xml")
print(ann.last_name)      # Richardson
input/fred.yaml
id: 118-28-3199
first_name:
- Fred
- William
last_name: Phillips
knows:
- '1172438'
- '1172438'
http://example.org/.../harvey.json
{
  "id": "118-78-0697",
  "first_name": [
    "Harvey"
  ],
  "last_name": "Mackerson"
}
input/ann.xml
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:sdo="https://schema.org/"
>
  <rdf:Description rdf:about="https://peoples.r.us">
    <sdo:givenName>Ann</sdo:givenName>
    <rdf:type rdf:resource="https://schema.org/Person"/>
    <sdo:familyName>Richardson</sdo:familyName>
    <sdo:givenName>Elizabeth</sdo:givenName>
  </rdf:Description>
</rdf:RDF>
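To show what the RDF import has to do with a file like input/ann.xml, here is a rough standard-library sketch that maps the schema.org properties back to the Person slot names. The function load_person is hypothetical, and the approach only covers this one flat rdf:Description layout; RDF/XML allows many equivalent serializations, so real loading should go through an RDF library as the LinkML loader does.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SDO_NS = "https://schema.org/"

# Same content as input/ann.xml, embedded as bytes so the example is self-contained.
ANN_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:sdo="https://schema.org/"
>
  <rdf:Description rdf:about="https://peoples.r.us">
    <sdo:givenName>Ann</sdo:givenName>
    <rdf:type rdf:resource="https://schema.org/Person"/>
    <sdo:familyName>Richardson</sdo:familyName>
    <sdo:givenName>Elizabeth</sdo:givenName>
  </rdf:Description>
</rdf:RDF>
"""


def load_person(xml_bytes: bytes) -> dict:
    """Read one rdf:Description into a dict keyed by Person slot names."""
    root = ET.fromstring(xml_bytes)
    desc = root.find(f"{{{RDF_NS}}}Description")
    if desc is None:
        raise ValueError("no rdf:Description element found")
    return {
        # The subject URI plays the role of the identifier slot.
        "id": desc.get(f"{{{RDF_NS}}}about"),
        # sdo:givenName is multivalued, so collect every occurrence in document order.
        "first_name": [e.text for e in desc.findall(f"{{{SDO_NS}}}givenName")],
        "last_name": desc.findtext(f"{{{SDO_NS}}}familyName"),
    }


if __name__ == "__main__":
    ann = load_person(ANN_XML)
    print(ann["last_name"])
```

The slot_uri declarations in the schema are what make this mapping possible: sdo:givenName and sdo:familyName identify the slots regardless of the names used in the Python or JSON view.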
LinkML has rich enum support
Simple Enums
classes:
  CyberAttack:
    attributes:
      has_attack_means:
        range: AttackMeansEnum
enums:
  AttackMeansEnum:
    description: |-
      Type of cyber attack
    permissible_values:
      BufferOverFlow:
      SynFlood:
      MaliciousCodeExecution:
      ...
Ontological Enums
prefixes:
  uco: http://ffrdc.ebiquity.umbc.edu/ns/ontology/#
classes:
  CyberAttack:
    attributes:
      has_attack_means:
        range: AttackMeansEnum
enums:
  AttackMeansEnum:
    description: |-
      Type of cyber attack
    permissible_values:
      BufferOverFlow:
        meaning: uco:BufferOverFlow
      SynFlood:
        meaning: uco:SynFlood
      MaliciousCodeExecution:
        meaning: uco:MaliciousCodeExecution
      ...
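One way to picture an ontological enum in application code is a closed set of names, each carrying the URI given as its meaning. The sketch below uses Python's standard Enum for this; the class layout, the meaning property, and the parse_attack_means helper are assumptions for illustration and do not reproduce the classes LinkML generates.

```python
from enum import Enum

# Namespace taken from the uco prefix in the schema above.
UCO = "http://ffrdc.ebiquity.umbc.edu/ns/ontology/#"


class AttackMeansEnum(Enum):
    """Type of cyber attack; each member's value is the URI of its meaning."""
    BufferOverFlow = UCO + "BufferOverFlow"
    SynFlood = UCO + "SynFlood"
    MaliciousCodeExecution = UCO + "MaliciousCodeExecution"

    @property
    def meaning(self) -> str:
        """Full URI of the ontology term this permissible value maps to."""
        return self.value


def parse_attack_means(text: str) -> AttackMeansEnum:
    """Look up a permissible value by name, rejecting anything else."""
    try:
        return AttackMeansEnum[text]
    except KeyError:
        allowed = ", ".join(m.name for m in AttackMeansEnum)
        raise ValueError(
            f"{text!r} is not a permissible value; expected one of: {allowed}"
        ) from None


if __name__ == "__main__":
    value = parse_attack_means("SynFlood")
    print(value.name, value.meaning)
```

Data files keep the short, readable name, while the URI remains available whenever the value has to be exported as RDF or joined with an ontology.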
Uses of LinkML
Simple → Complex
● Lightweight data dictionary for CSVs / Machine Learning
● Knowledge Graph schemas
● Standard Databases
https://linkml.io/linkml-registry/registry
Currently in use for: key projects
National Microbiome Data Collaborative
● Samples
● Omics datatypes
Genomics Standards Consortium
Translator (BioLink)
All our KGs (BioLink + Source modeling)
Alliance of Genome Resources
Center for Cancer Data Harmonization
Knowledge Graph Change Language (KGCL)
Chemical Ontology Schema
SSSOM
GHGPA
Critical Path Institute
...and more
Active development
https://github.com/topics/linkml
External tooling: DataHarmonizer driven by LinkML
https://genepio.org/DataHarmonizer/main.html -- spiritual successor to Phenote
Caveats / expectation management
The core is stable
● i.e. schemas won’t break in the LinkML 1.x.x series
● Used in production in multiple projects
Some things are currently incomplete
● Mapping to JSON-Schema
● Mapping to JSON-LD Contexts
● Documentation
New language features being added
● E.g. constraint language, mapping language
● These are extensions and won’t break existing schemas
The tool stack is constantly evolving
● Other frameworks -> LinkML
● Automated schema mapping
● Generators for other languages
○ New: javagen
● Binding to databases
○ SQL, SPARQL, Solr, MongoDB, ...
Tutorial
https://linkml.io/linkml/intro/tutorial.html

Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
 
A Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's ArchitectureA Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's Architecture
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
 
Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
 
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
"Scaling RAG Applications to serve millions of users",  Kevin Goedecke"Scaling RAG Applications to serve millions of users",  Kevin Goedecke
"Scaling RAG Applications to serve millions of users", Kevin Goedecke
 
"NATO Hackathon Winner: AI-Powered Drug Search", Taras Kloba
"NATO Hackathon Winner: AI-Powered Drug Search",  Taras Kloba"NATO Hackathon Winner: AI-Powered Drug Search",  Taras Kloba
"NATO Hackathon Winner: AI-Powered Drug Search", Taras Kloba
 
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
 
Christine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptxChristine's Supplier Sourcing Presentaion.pptx
Christine's Supplier Sourcing Presentaion.pptx
 
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptxPRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
 
Principle of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptxPrinciple of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptx
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 
Must Know Postgres Extension for DBA and Developer during Migration
Must Know Postgres Extension for DBA and Developer during MigrationMust Know Postgres Extension for DBA and Developer during Migration
Must Know Postgres Extension for DBA and Developer during Migration
 
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
 

LinkML Intro (for Monarch devs)

  • 1. LinkML: A brief guide for Monarch team members Chris Mungall Monarch Initiative cjmungall@lbl.gov Monarch Data Call Sep 2021
  • 2. Data modeling is ubiquitous in Monarch and Phen1 Ontology developer How do infectious diseases (MONDO) relate to infectious agents (NCBITaxon)? Or to treatments (MAXO) or exposures (ECTO) Phenopacket Team KG/Ingest Engineer How do diseases relate to genes? How do patients relate to diseases? Or to samples/. UI developer How do disease pages relate to gene pages?
  • 3. We love our stacks Ontology developer I ❤️ OWL I ❤️ DOSDPs I ❤️ ROBOT templates Phenopacket Team KG/Ingest Engineer / Graph ML I ❤️ KGX I ❤️ TSVs I ❤️ Neo4J I ❤️ Protobuf
  • 4. Monarch-adjacent and beyond Translator How do genes relate to chemicals CRDC-H; NMDC How do I relate samples to ontological descriptors CD2H / N3C How do patients relate to drug treatments? Allen / HCA How do cell types relate to genes? OBO / RO / COB How do processes relate to inputs
  • 5. Monarch-adjacent and beyond Semweb developer I ❤️ RDF + triplestores I ❤️ SHACL/ShEx Clinical informatics I ❤️ FHIR Biologists I ❤️ spreadsheets CD2H / N3C I ❤️ OMOP GA4GH, HCA, many devs I ❤️ JSON-Schema I ❤️ SQLite
  • 6. LinkML: One ring to bind them…. https://linkml.io/ * https://github.com/linkml/linkml/
  • 7.
  • 8. LinkML philosophy: Parasitize rather than compete
      Strategy
        ● Be expressive enough to cover all our use cases
        ● Allow compilation to people’s favored stack (e.g. JSON-Schema)
        ● Simple to do simple things, but add-ons where necessary
        ● Stealth Semantics
          ○ Everything can be RDF/Linked Data - if you want it to be
  • 9. LinkML parasitizes other toolchains YourModel Documentation OWL JSON Schema ShEx Schema Schema.py Object model GraphQL Schema Your LinkML Schema (YAML) JSONLD Context . . . LinkML parser Philosophy: ● Be expressive ● Parasitize ● Be developer-friendly ● Stealth semantics ● 80% rule
  • 10. LinkML != Biolink Model Biolink Model ● Expressed using LinkML ● An uber-data model for biology ○ Main types: gene, chemical, disease, … ○ Translator ○ Monarch KG ○ KG-COVID-19 ○ KG-Microbe ○ KG-OBO ● Not appropriate for everything ○ Highly patient specific (CCDH, Pfx, FHIR) ○ Sample and omics data (CCDH, NMDC) ○ Single-cell data LinkML ● A modeling language ● Can express multiple datamodels
  • 11. A sample LinkML Schema (slide callouts: Metadata, Namespaces, Dependencies, Actual Model)

        # Metadata
        id: https://example.org/linkml/hello-world
        title: Really basic LinkML model
        name: hello-world
        license: https://creativecommons.org/publicdomain/zero/1.0/
        version: 0.0.1

        # Namespaces
        prefixes:
          linkml: https://w3id.org/linkml/
          sdo: https://schema.org/
          ex: https://example.org/linkml/hello-world/
        default_prefix: ex
        default_curi_maps:
          - semweb_context

        # Dependencies
        imports:
          - linkml:types

        # Actual Model
        classes:
          Person:
            description: Minimal information about a person
            class_uri: sdo:Person
            attributes:
              id:
                identifier: true
                slot_uri: sdo:taxID
              first_name:
                required: true
                slot_uri: sdo:givenName
                multivalued: true
              last_name:
                required: true
                slot_uri: sdo:familyName
              knows:
                range: Person
                multivalued: true
                slot_uri: foaf:knows
  • 12. LinkML RDF is hidden in plain sight. The slide repeats the schema from slide 11 (Metadata, Namespaces, Dependencies, Actual Model) with two callouts: Reuse schema elements from core vocabularies; FAIR (specifically: allows diverse data to be combined)
  • 13. Sample model documentation output https://hsolbrig.github.io/sample_model/docs 13
  • 14. Sample LinkML generated schemas

      Shape Expressions (ShEx) Schema

        BASE <https://example.org/linkml/hello-world/>
        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
        PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
        PREFIX sdo: <https://schema.org/>
        PREFIX foaf: <http://xmlns.com/foaf/0.1/>

        <String> xsd:string

        <Person> CLOSED {
            ( $<Person_tes> ( sdo:givenName @<String> + ;
                  sdo:familyName @<String> ;
                  foaf:knows @<Person> *
               ) ;
              rdf:type [ sdo:Person ]
            )
        }

      JSON Schema

        "$id": "https://example.org/linkml/person",
        "$schema": "http://json-schema.org/draft-07/schema#",
        "definitions": {
           "Person": {
              "additionalProperties": false,
              "description": "Minimal information about a person",
              "properties": {
                 "first_name": {
                    "items": {
                       "type": "string"
                    },
                    "type": "array"
                 },
                 "id": {
                    "type": "string"
                 },
                 ...

      GraphQL Schema

        type Person
          {
            id: String!
            firstName: [String]!
            lastName: String!
            knows: [Person]
          }
  • 15. LinkML can also emit OWL 15
  • 16. LinkML can also emit OWL 16
  • 17. Objects can be exported as JSON, YAML, or RDF

      Python

        from examples.basic import Person
        from linkml.dumpers import json_dumper, rdf_dumper, yaml_dumper

        sam = Person("1172438", first_name=["Samual", "J"], last_name="Snooter")
        ann = Person("17a3923", first_name="Jill", last_name="Jones", knows=[sam.id])
        print(json_dumper.dumps(ann))
        print(yaml_dumper.dumps(ann))
        print(rdf_dumper.dumps(ann, contexts="../examples/jsonld/basic.context.jsonld"))

      JSON output

        {
           "id": "17a3923",
           "first_name": [
              "Jill"
           ],
           "last_name": "Jones",
           "knows": [
              "1172438"
           ],
           "@type": "Person"
        }

      YAML output

        id: 17a3923
        first_name:
        - Jill
        last_name: Jones
        knows:
        - '1172438'

      RDF output (by way of JSON-LD)

        @prefix foaf: <http://xmlns.com/foaf/0.1/> .
        @prefix sdo: <https://schema.org/> .

        <https://example.org/linkml/hello-world/17a3923> a sdo:Person ;
            foaf:knows <https://example.org/linkml/hello-world/1172438> ;
            sdo:familyName "Jones" ;
            sdo:givenName "Jill" .
  • 18. Objects can be imported from JSON, YAML, or RDF

      Python code

        from linkml.loaders import yaml_loader, json_loader, rdf_loader

        fred = yaml_loader.load('input/fred.yaml', target_class=Person)
        print(fred.first_name)     # ['Fred', 'William']

        harvey = json_loader.load('https://raw.githubusercontent.com/hsolbrig/linkml-enhanced-template/master/tests/input/harvey.json', target_class=Person)
        print(harvey.last_name)    # Mackerson

        ann = rdf_loader.load('input/ann.xml', target_class=Person, fmt="xml")
        print(ann.last_name)       # Richardson

      input/fred.yaml

        id: 118-28-3199
        first_name:
        - Fred
        - William
        last_name: Phillips
        knows:
        - '1172438'
        - '1172438'

      http://example.org/.../harvey.json

        {
           "id": "118-78-0697",
           "first_name": [
              "Harvey"
           ],
           "last_name": "Mackerson"
        }

      input/ann.xml

        <?xml version="1.0" encoding="UTF-8"?>
        <rdf:RDF
           xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:sdo="https://schema.org/"
        >
          <rdf:Description rdf:about="https://peoples.r.us">
            <sdo:givenName>Ann</sdo:givenName>
            <rdf:type rdf:resource="https://schema.org/Person"/>
            <sdo:familyName>Richardson</sdo:familyName>
            <sdo:givenName>Elizabeth</sdo:givenName>
          </rdf:Description>
        </rdf:RDF>
  • 19. LinkML has rich enum support: Simple Enums

        classes:
          CyberAttack:
            attributes:
              has_attack_means:
                range: AttackMeansEnum

        enums:
          AttackMeansEnum:
            description: |-
              Type of cyber attack
            permissible_values:
              BufferOverFlow:
              SynFlood:
              MaliciousCodeExecution:
              ...
  • 20. LinkML has rich enum support: Ontological Enums

        prefixes:
          uco: http://ffrdc.ebiquity.umbc.edu/ns/ontology/#

        classes:
          CyberAttack:
            attributes:
              has_attack_means:
                range: AttackMeansEnum

        enums:
          AttackMeansEnum:
            description: |-
              Type of cyber attack
            permissible_values:
              BufferOverFlow:
                meaning: uco:BufferOverFlow
              SynFlood:
                meaning: uco:SynFlood
              MaliciousCodeExecution:
                meaning: uco:MaliciousCodeExecution
              ...
  • 21. Uses of LinkML Lightweight data dictionary for CSVs / Machine Learning Knowledge Graph schemas Simple Complex Standard Databases
  • 23. Currently in use for: key projects
      National Microbiome Data Collaborative
        ● Samples
        ● Omics datatypes
      Genomics Standards Consortium
      Translator (BioLink)
      All our KGs (BioLink + Source modeling)
      Alliance of Genome Resources
      Center for Cancer Data Harmonization
      Knowledge Graph Change Language (KGCL)
      Chemical Ontology Schema
      SSSOM
      GHGPA
      Critical Path Institute
      ...and more
  • 25. External tooling: DataHarmonizer driven by LinkML https://genepio.org/DataHarmonizer/main.html -- spiritual successor to Phenote
  • 26. Caveats / expectation management
      The core is stable
        ● i.e. schemas won’t break in the linkml 1.x.x series
        ● Used in production in multiple projects
      Some things are currently incomplete
        ● Mapping to JSON-Schema
        ● Mapping to JSON-LD Contexts
        ● Documentation
      New language features being added
        ● e.g. constraint language, mapping language
        ● These are extensions and won’t break existing schemas
      The tool stack is constantly evolving
        ● Other frameworks -> LinkML
        ● Automated schema mapping
        ● Generators for other languages
          ○ new: javagen
        ● Binding to databases
          ○ SQL, SPARQL, Solr, MongoDB, ...
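
To make the Person example from slides 11 and 17 concrete without installing anything, here is a minimal hand-written sketch of the kind of object model that schema describes. It is an assumption-laden stand-in, not the code LinkML generates: the dataclass, the `to_json` helper, and the validation rules are hypothetical and only mirror the `required`, `multivalued`, and `identifier` settings shown on the slide.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Union


@dataclass
class Person:
    """Hand-written stand-in for the Person class in the slide 11 schema."""
    id: str                                # identifier: true
    last_name: str                         # required: true
    first_name: Union[str, List[str]]      # required: true, multivalued: true
    knows: List[str] = field(default_factory=list)  # multivalued, ids of other Persons

    def __post_init__(self):
        # Assumption: a single value for a multivalued slot is wrapped in a list,
        # which is what the JSON output on slide 17 suggests.
        if isinstance(self.first_name, str):
            self.first_name = [self.first_name]
        if not self.id:
            raise ValueError("id is required")
        if not self.first_name or not self.last_name:
            raise ValueError("first_name and last_name are required")


def to_json(person: Person) -> str:
    """Hypothetical helper: serialize a Person and tag it with its class name."""
    data = asdict(person)
    data["@type"] = "Person"
    return json.dumps(data, indent=2)


sam = Person("1172438", first_name=["Samual", "J"], last_name="Snooter")
ann = Person("17a3923", first_name="Jill", last_name="Jones", knows=[sam.id])
print(to_json(ann))
```

The point of the sketch is that the schema carries enough information (which slots are required, which are lists, which one is the identifier) for a tool to derive classes like this mechanically, which is what the generators on slide 9 do for each target stack.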

Editor's Notes

  1. (Shhh…. RDF is everywhere!)
  2. Web based model documentation can be emitted “out of the box”, and several LinkML users have added fancier tool-specific(?) documentation packages. Note that the above model is not UML -- YUML is a graphics tool that makes UML diagrams but not XMI
  3. Generating the schemas in various target languages enables the use of tooling and other resources developed for that particular language. LinkML emits JSON-Schema, ShEx and GraphQL today. SQL ORM work is underway and future plans include UML, SHACL and FHIR.
  4. LinkML OWL can generate the necessary “glue” to allow model instances (e.g. Person) to be used in reasoners.
  5. LinkML OWL can generate the necessary “glue” to allow model instances (e.g. Person) to be used in reasoners.
  6. Model instances can be constructed using Python and emitted as JSON, YAML or RDF. Columnar (CSV, TSV, Excel, …) is on the todo list. Others can be created as needed.
  7. One can also use YAML, JSON or RDF loaders to import information. Columnar input is on the horizon. Note that the ability to import RDF, potentially from a large graph, a SPARQL query or a ShEx “slurp”, allows us to work with an RDF data store (e.g. WikiData) or another source (schema.org annotated web pages). Note: as of 4/14/2021, we are still working through issues in the rdf_loader.
  8. Three permissible values -- no meaning connection. This works for basic models, but lacks the semantic (RDF) bridge necessary to do transformation
  9. Three permissible values -- no meaning connection. This works for basic models, but lacks the semantic (RDF) bridge necessary to do transformation
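
The "semantic bridge" that notes 8 and 9 say simple enums lack can be illustrated with a small sketch. This is plain Python written for illustration, not LinkML output: the `PREFIXES` map and the `expand_curie` and `meaning_uri` helpers are hypothetical, the `uco` namespace is copied from slide 20, and the code assumes each `meaning` is written as a CURIE such as `uco:SynFlood`.

```python
from enum import Enum

# Prefix map; the uco namespace is taken from slide 20.
PREFIXES = {"uco": "http://ffrdc.ebiquity.umbc.edu/ns/ontology/#"}


class AttackMeansEnum(Enum):
    """Permissible values, each carrying its ontology term as a CURIE."""
    BufferOverFlow = "uco:BufferOverFlow"
    SynFlood = "uco:SynFlood"
    MaliciousCodeExecution = "uco:MaliciousCodeExecution"


def expand_curie(curie: str) -> str:
    """Hypothetical helper: expand prefix:local into a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def meaning_uri(value: AttackMeansEnum) -> str:
    """Return the full URI behind a permissible value."""
    return expand_curie(value.value)


print(meaning_uri(AttackMeansEnum.SynFlood))
```

With a simple enum only the names `BufferOverFlow`, `SynFlood`, and `MaliciousCodeExecution` exist. Once each value has a `meaning`, the same string can be resolved to an ontology term, which is what makes transformation to RDF possible.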