Linking Linked Data CSHALS2013

  1. Linking Linked Data: Linked Data to Integrated Data. Expert Bioinformatics from Bioinformatics Experts
  2. Put your data on the web, make a pretty web site later.
  3. (No text on this slide.)
  4. Now we can ask questions like this: What members of a target pathway are already targeted in other diseases? [Diagram labels: Target, Pathway, Disease; Chembl, Uniprot, Reactome, OMIM; Protein, Target, Compound, Pathway, Disease]
  5. Because we have lots of data exposed as RDF: Uniprot:Protein, BioPAX:Protein, Mim:Phenotype
  6. What do you do when you have to add data...
  7. Or connect SPARQL endpoints? RDF != Linked Data
  8. Is your data 5*? Linked data is essential to actually connect the semantic web. It is quite easy to do with a little thought, and becomes second nature. Various common sense considerations determine when to make a link and when not to.
  9. Example: openflydata to BioCyc. What genes are differentially expressed in the hindgut and are there any pathways associated with those genes?
      ● Use FlyAtlas at openflydata.org for tissue specific expression profiles.
      ● Use FlyCyc from BioCyc.
      ● Then SPARQL
  10. Problem: Node URIs
      <http://openflydata.org/id/flyatlas/affyid/1616608_a_at> <http://purl.org/NET/flyatlas/schema#gene> <http://openflydata.org/id/flybase/feature/FBgn0001128> .
      <http://biocyc.org/biopax/biopax-level3#UnificationXref202209> <http://www.biopax.org/release/biopax-level3.owl#xref> <http://biocyc.org/biopax/biopax-level3#Protein202210> .
      <http://biocyc.org/biopax/biopax-level3#UnificationXref202209> <http://www.biopax.org/release/biopax-level3.owl#db> "FlyCyc" .
      <http://biocyc.org/biopax/biopax-level3#UnificationXref202209> <http://www.biopax.org/release/biopax-level3.owl#id> "FBGN0001128" .
  11. Integration Level 1: Use Identifiers.org
      CONSTRUCT {
        ?x rdfs:seeAlso `bif:sprintf_iri ("http://identifiers.org/flybase/%s", ?id)`
      }
      WHERE {
        ?x BP:unificationxref ?xref .
        ?xref BP:id ?id .
        ?xref BP:db "FlyCyc"^^xsd:string
      }
  12. Integration Level 2: adding property characteristics
      BP = <http://www.biopax.org/release/biopax-level3.owl#>
      BP:Protein BP:controls BP:Catalysis
      BP:Catalysis BP:controls BP:BiochemicalReaction
      BP:Protein BP:controls BP:BiochemicalReaction
      CONSTRUCT { ?y GB:controlledBy ?x }
      WHERE { ?x BP:controls ?catalysis . ?catalysis BP:controls ?y }
  13. Integration Level 3: class subsumption
      FlyA = <http://purl.org/NET/flyatlas/schema#>
      flywebflyatlas:1616608_a_at a flyatlas:ProbeData
      BP = <http://www.biopax.org/release/biopax-level3.owl#>
      flyatlas:ProbeData rdfs:subClassOf BP:DNARegion
      CONSTRUCT { ?x a BP:DNARegion }
      WHERE { ?x a flyatlas:ProbeData }
  14. Connect BiochemicalReactions to Expression Values
      SELECT ?name ?id ?mean
      WHERE {
        ?reaction a BP:BiochemicalReaction .
        ?reaction BP:standardName ?name .
        ?reaction GB:controlledBy ?protein .
        ?protein a BP:Protein .
        ?protein BP:xref ?id .
        ?probe a BP:DNARegion .
        ?probe BP:xref ?id .
        ?probe flyatlas:l_fatbody ?blank .
        ?blank flyatlas:mean ?mean
      }
      LIMIT 5
      No Reasoner – just a few SPARQL CONSTRUCTs
  15. (No text on this slide.)
  16. Client Architecture
  17. Vocabularies in Linked Data: What does the linked data cloud know about Drugs....
      SELECT distinct ?class
      WHERE {
        ?s a ?class .
        ?s ?p ?o
      }
      Result (>100): chembl:Activity, chembl:Assay, chembl:AssayCategory, chembl:AssayTargetLink, chembl:ChemicalCompound, chembl:DrugTarget, chembl:LiteratureCitation, dailymed:drugs, drugbank:Drug, drugbank:DrugInteraction, drugbank:EnzymeLink, drugbank:ExternalIdentifier, drugbank:ExternalLink, drugbank:LiteratureCitation, drugbank:Molecule, drugbank:OrganismSpecies, drugbank:Patent, drugbank:ProteinSequence, drugbank:TargetLink, entrez:EnsemblReference, entrez:Gene, pdb:Molecule, pdb:Structure, pubmed:Chemical, pubmed:Citation, pubmed:DatabankReference
  18. Create a tighter, more unified “view” under one schema
  19. Unified Vocabulary: What does the linked data cloud know about Drugs....
  20. Map Classes and Properties into a single instantiated view
  21. Before Query
      SELECT *
      WHERE {
        ?s drugb:calculatedInChIKey ?inchiD .
        ?s a drugb:Drug .
        ?c a chembl:ChemicalCompound .
        ?c chembl:standardInChIKey ?inchiC .
        FILTER regex(?inchiD, ?inchiC)
      }
  22. After Query
      SELECT *
      WHERE {
        ?s a GB:Drug .
        ?s GB:inchiKey ?inchi .
      }
  23. Linked Data Architecture
  24. Creating fixed “views” of Linked Data
      • When the use of integrated data is fixed, e.g. an API or application, Linked Data can be expensive:
        – Changes to data require significant recoding
        – Multiple schemas make queries long and inefficient
      • A view is a middle layer of data used by the API: changes to data are managed by the view and the API is minimally disturbed
        – Views are easier to query
        – Views are faster to query
      • The client gets the best of both worlds: a tight view of data for API queries while still having all the advantages of a linked data strategy.
  25. Summary
      ● Exposing data as RDF does not equal Linked Data
      ● Making data linked is not hard
        – Node IRIs
        – Unifying Classes
        – Transitive closure of Properties
      ● A little semantics goes a long way (no reasoner required)
      ● Creating “Views” from one schema to another is not hard.
        – But should be easier
  26. www.generalbioinformatics.com/science.html
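
The "Problem: Node URIs" and "Integration Level 1" slides turn on two sources naming the same gene differently: an openflydata.org node IRI ending in FBgn0001128 and a BioPAX xref literal FBGN0001128. As a minimal sketch of the same idea outside SPARQL, the Python below mints one shared identifiers.org IRI from either form. The function names are hypothetical, and the case normalisation to the `FBgn` spelling is an assumption prompted by the mismatch on the slide; the slide's CONSTRUCT itself only formats the id into the IRI.

```python
import re

# IRI prefixes taken from the slides.
IDENTIFIERS_ORG_FLYBASE = "http://identifiers.org/flybase/"
OPENFLYDATA_FEATURE = "http://openflydata.org/id/flybase/feature/"


def normalise_flybase_gene_id(raw):
    """Return a FlyBase gene id spelled 'FBgn' + 7 digits, whatever its case.

    Assumption: gene ids are 'FBgn' followed by exactly seven digits.
    """
    match = re.fullmatch(r"fbgn(\d{7})", raw.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a FlyBase gene id: {raw!r}")
    return "FBgn" + match.group(1)


def shared_iri_from_xref_id(xref_id):
    """Mint the shared identifiers.org IRI from a BioPAX xref id literal."""
    return IDENTIFIERS_ORG_FLYBASE + normalise_flybase_gene_id(xref_id)


def shared_iri_from_openflydata(node_iri):
    """Mint the same shared IRI from an openflydata.org feature IRI."""
    if not node_iri.startswith(OPENFLYDATA_FEATURE):
        raise ValueError(f"not an openflydata feature IRI: {node_iri!r}")
    return shared_iri_from_xref_id(node_iri[len(OPENFLYDATA_FEATURE):])


if __name__ == "__main__":
    print(shared_iri_from_xref_id("FBGN0001128"))
    print(shared_iri_from_openflydata(
        "http://openflydata.org/id/flybase/feature/FBgn0001128"))
```

Once both sources point at the same IRI (for example through `rdfs:seeAlso`), a join across them no longer needs string matching on identifiers.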

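"Integration Level 2" and "Integration Level 3" both materialise extra triples with a CONSTRUCT instead of running a reasoner. The sketch below imitates those two steps over a plain Python set of (subject, predicate, object) tuples, purely to make the logic concrete. The compact names stand in for the full IRIs, the `ex:` nodes are invented for illustration, and the direction of `GB:controlledBy` (reaction controlled by protein) follows its use in the "Connect BiochemicalReactions to Expression Values" query.

```python
# Compact stand-ins for the full IRIs used on the slides (assumed spellings).
CONTROLS = "BP:controls"
CONTROLLED_BY = "GB:controlledBy"
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def materialise_controlled_by(triples):
    """Add (target, GB:controlledBy, controller) for each two-step
    BP:controls chain: controller -> intermediate -> target."""
    triples = set(triples)
    controls = [(s, o) for s, p, o in triples if p == CONTROLS]
    shortcuts = {
        (target, CONTROLLED_BY, controller)
        for controller, middle in controls
        for start, target in controls
        if middle == start
    }
    return triples | shortcuts


def materialise_superclasses(triples):
    """Add (x, rdf:type, Super) whenever x has type Sub and
    Sub rdfs:subClassOf Super (a single step, not a full closure)."""
    triples = set(triples)
    parents = [(s, o) for s, p, o in triples if p == SUBCLASS_OF]
    inferred = {
        (instance, RDF_TYPE, sup)
        for instance, p, cls in triples if p == RDF_TYPE
        for sub, sup in parents if sub == cls
    }
    return triples | inferred


if __name__ == "__main__":
    graph = {
        ("ex:protein1", CONTROLS, "ex:catalysis1"),      # hypothetical nodes
        ("ex:catalysis1", CONTROLS, "ex:reaction1"),
        ("flyatlas:1616608_a_at", RDF_TYPE, "flyatlas:ProbeData"),
        ("flyatlas:ProbeData", SUBCLASS_OF, "BP:DNARegion"),
    }
    view = materialise_superclasses(materialise_controlled_by(graph))
    for triple in sorted(view - graph):
        print(triple)
```

Each function adds only what one CONSTRUCT would add, so the result is a fixed, pre-computed view rather than inference at query time.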