One small(ish) step for modellers, one giant leap for mankind
Capturing the context
Mihai Glonț
Reproducible and Citable Data and Models
Warnemuende
September 2015
A simple(?) question

How easy is it to find reusable models?

Reusable should entail, at least
– Reproducible
– Friendly licence
– Understandable
Is this understandable?
Problems

How do we recognise concepts? Is
adenosine5PrimePhospate a better variable name than a?
Do all modellers know the same amount of information about
ATP?

How can we uniquely identify the concepts involved in a
modelling exercise?
A brief (and biased) history of the Web
Web 1.0 - basic HTML pages (personal web sites on GeoCities)
Web 2.0
● Prevalence of content generators
● Social media
● Rich user interfaces
● Folksonomies
● Software as a service
Web 3.0
● Semantic Web
● "The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries" (W3C)
● Machines understand the data on the web and can reason about it
● Implicit knowledge is captured in a machine-processable manner
● What holiday options are there for a family of four for 10 days, somewhere sunny and close to the sea, with good food and a budget of EUR 3000?
Semantic web overview
● Taxonomies and ontologies define concepts (resources) and the relationships between them
● Identification through URIs
● Data is exchanged as RDF
Ontologies
● Define concepts, instances, attributes and relationships
● Workshop is a kind of Thing
● Workshop hasA location
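The two statements on this slide can be sketched as a tiny data structure. This is only an illustration, not how any real ontology language is implemented: the `Concept` class, its `is_a` method and the `attributes` dictionary are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A named concept with an optional parent (is-a) and attributes (has-a)."""
    name: str
    parent: Optional["Concept"] = None
    attributes: dict = field(default_factory=dict)

    def is_a(self, other: "Concept") -> bool:
        """Walk up the is-a chain to test whether this concept is a kind of `other`."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


# "Workshop is a kind of Thing" and "Workshop hasA location"
thing = Concept("Thing")
location = Concept("Location", parent=thing)
workshop = Concept("Workshop", parent=thing, attributes={"location": location})

print(workshop.is_a(thing))   # True
print(thing.is_a(workshop))   # False
```

Real ontologies add much more (instances, typed relationships, reasoning rules), but the is-a chain and the has-a attribute are the two relationships the slide names.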
Linking ontologies
http://lod-cloud.net/
RDF Primer
● Resource Description Framework
● Documents consist of a series of statements
● Statements (triples) have the following structure
● Subject - Predicate - Object
Subject: https://sems.uni-rostock.de/reproducible-and-citable-data-and-models/
Predicate: http://example.com/someOntology/hasLocation
Object: https://en.wikipedia.org/wiki/Warnemunde
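As a rough illustration, the example statement can be written out as one line of N-Triples, a simple line-based RDF serialisation. The `Triple` class and `to_ntriples` method are names invented for this sketch, and it assumes all three parts are URIs (literal objects such as strings or numbers are not handled).

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate and object, all given as URIs."""
    subject: str
    predicate: str
    obj: str

    def to_ntriples(self) -> str:
        # N-Triples wraps each URI in angle brackets and ends the statement with " ."
        return f"<{self.subject}> <{self.predicate}> <{self.obj}> ."


statement = Triple(
    subject="https://sems.uni-rostock.de/reproducible-and-citable-data-and-models/",
    predicate="http://example.com/someOntology/hasLocation",
    obj="https://en.wikipedia.org/wiki/Warnemunde",
)

print(statement.to_ntriples())
```

In practice a library such as rdflib would be used to build and serialise graphs; the point here is only that a statement is three URIs in a fixed order.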
A selection of ontologies for life scientists
● ChEBI: http://www.ebi.ac.uk/chebi/
● GO: http://geneontology.org/
● BRENDA Tissue Ontology: http://www.brenda-enzymes.org/
● FMA: http://bioportal.bioontology.org/ontologies/FMA
● Human disease ontology: http://disease-ontology.org/
● TEDDY: http://purl.bioontology.org/ontology/TEDDY/
● KiSAO: http://co.mbine.org/standards/kisao
● SBO: http://www.ebi.ac.uk/sbo/
https://www.ebi.ac.uk/ontology-lookup/
Identifiers, identifiers, identifiers
● Is http://purl.uniprot.org/taxonomy/9606
the same as
http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=9606
or
http://taxonomy.bio2rdf.org/describe/?url=http://bio2rdf.org/taxonomy:9606
● What if the URIs change?
● What if the URIs don't point to anything?
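A small sketch of why this is a problem for software: compared as plain strings, the three URIs are all different, and only a hand-written, fragile heuristic reveals that they carry the same taxon number. The helper `trailing_id` is a hypothetical name, and "take the digits at the end" is an assumption that happens to work for these three URIs, not a general rule.

```python
import re

# The three URIs from the slide, each pointing at Homo sapiens (taxon 9606).
URIS = [
    "http://purl.uniprot.org/taxonomy/9606",
    "http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=9606",
    "http://taxonomy.bio2rdf.org/describe/?url=http://bio2rdf.org/taxonomy:9606",
]


def trailing_id(uri):
    """Return the run of digits at the end of the URI, or None if there is none."""
    match = re.search(r"(\d+)$", uri)
    return match.group(1) if match else None


# String comparison sees three unrelated resources...
print(len(set(URIS)))                      # 3
# ...while the ad hoc heuristic finds a single shared identifier.
print({trailing_id(uri) for uri in URIS})  # {'9606'}
```

The heuristic breaks as soon as a provider changes its URL layout, which is exactly the situation the last two questions on the slide describe.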
Introducing identifiers.org
● The aim of the identifiers.org project is to provide unique, stable, resolvable and location-independent URIs to identify and to locate scientific data
● Community-driven
● Free to use
Registry
500+ curated data collections
Creating unique URIs
• Homo sapiens in Taxonomy (9606)
http://identifiers.org/taxonomy/9606
http://identifiers.org/[Data collection]/[Entity identifier]
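The URI pattern on this slide (data collection, then entity identifier) can be sketched as two small helpers. The function names `make_uri` and `split_uri` are hypothetical; only the `http://identifiers.org/` prefix and the two-part layout come from the slide.

```python
PREFIX = "http://identifiers.org/"


def make_uri(collection: str, entity_id: str) -> str:
    """Compose a URI from a data collection name and an entity identifier."""
    return f"{PREFIX}{collection}/{entity_id}"


def split_uri(uri: str):
    """Split a URI of this form back into (collection, entity identifier)."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not an identifiers.org URI: {uri}")
    collection, _, entity_id = uri[len(PREFIX):].partition("/")
    if not collection or not entity_id:
        raise ValueError(f"missing collection or identifier: {uri}")
    return collection, entity_id


uri = make_uri("taxonomy", "9606")
print(uri)             # http://identifiers.org/taxonomy/9606
print(split_uri(uri))  # ('taxonomy', '9606')
```

Because the URI is built from two registry-controlled parts, two people annotating the same entity should end up with the same string.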
Creating resolvable URIs
http://identifiers.org/taxonomy/9606
• URI to identify the entity 'Homo sapiens' in the data collection Taxonomy
http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=9606
http://www.uniprot.org/taxonomy/9606
http://www.ebi.ac.uk/ena/data/view/Taxon:9606
[Diagram labels: Resource, Resource, Reference, Primary]
http://info.identifiers.org/taxonomy/9606
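Resolution can be pictured as a lookup from the collection name to the URL patterns of the resources that serve it. The sketch below is a local stand-in, not the identifiers.org service: the `REGISTRY` dictionary and the `resolve` function are hypothetical, and the list order carries no meaning (it does not say which resource is the primary one). The three URL patterns are taken from the slide.

```python
from typing import List

PREFIX = "http://identifiers.org/"

# Hypothetical in-memory registry: collection name -> resource URL templates.
REGISTRY = {
    "taxonomy": [
        "http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id={id}",
        "http://www.uniprot.org/taxonomy/{id}",
        "http://www.ebi.ac.uk/ena/data/view/Taxon:{id}",
    ],
}


def resolve(uri: str) -> List[str]:
    """Return the physical URLs at which the identified entity can be found."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not an identifiers.org URI: {uri}")
    collection, _, entity_id = uri[len(PREFIX):].partition("/")
    if collection not in REGISTRY:
        raise KeyError(f"unknown data collection: {collection}")
    return [template.format(id=entity_id) for template in REGISTRY[collection]]


for location in resolve("http://identifiers.org/taxonomy/9606"):
    print(location)
```

Keeping the mapping in a registry is what makes the identifier location-independent: if a provider moves, only the template changes, not the URIs stored in models.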
Inter-conversion of identifier schemes
• Registry records different identifier schemes
• Web service for inter-conversion between identifier schemes
http://purl.obolibrary.org/obo/GO_0005886
http://bio2rdf.org/go:0005886
http://identifiers.org/go/GO:0005886
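The idea of inter-conversion can be sketched locally with one pattern per scheme. This is not the web service mentioned above: the `SCHEMES` table, the scheme labels (`obo`, `bio2rdf`, `identifiers`) and the `convert` function are assumptions made for illustration, and the patterns cover only the three Gene Ontology URI forms shown on the slide.

```python
import re

# Hypothetical scheme table: label -> (pattern capturing the 7-digit GO
# accession, template for rebuilding a URI in that scheme).
SCHEMES = {
    "obo": (
        re.compile(r"^http://purl\.obolibrary\.org/obo/GO_(\d{7})$"),
        "http://purl.obolibrary.org/obo/GO_{acc}",
    ),
    "bio2rdf": (
        re.compile(r"^http://bio2rdf\.org/go:(\d{7})$"),
        "http://bio2rdf.org/go:{acc}",
    ),
    "identifiers": (
        re.compile(r"^http://identifiers\.org/go/GO:(\d{7})$"),
        "http://identifiers.org/go/GO:{acc}",
    ),
}


def convert(uri: str, target: str) -> str:
    """Rewrite a GO URI from any known scheme into the target scheme."""
    template = SCHEMES[target][1]
    for pattern, _ in SCHEMES.values():
        match = pattern.match(uri)
        if match:
            return template.format(acc=match.group(1))
    raise ValueError(f"unrecognised GO URI: {uri}")


print(convert("http://purl.obolibrary.org/obo/GO_0005886", "identifiers"))
# http://identifiers.org/go/GO:0005886
```

A registry-backed service generalises this by storing such patterns for every data collection rather than hard-coding them.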
Support for different formats
[Diagram: Taxonomy resources offering html, html, RDF and json]
• The Registry records the formats provided by the various data resources
BioModels
http://www.ebi.ac.uk/biomodels/
http://biomodels.caltech.edu/
Model workflow within BioModels
BioModels model display
BioModels model classification
Quo vadis?
● Model curation is hard
● Model annotation is laborious
● We moved from lack of methods to scalability and usability issues
● Towards semi-automated annotation based on model clustering
● User-friendly tools for annotating models
