
Bio ontologies and semantic technologies

SPARQL


  1. 1. Introduction to Bio Ontologies and The Semantic Web M. Devisscher Biological Databases
  2. 2. Overview • Bio ontologies • Semantic technologies • SPARQL in practice
  3. 3. Introduction • Ontologies: what are ontologies ? • Ontologies in the bio domain: OBO Foundry • Ontologies in the semantic web • OBO • RDF, IRI, TTL, SPARQL, OWL
  4. 4. What is an ontology ? • Ontology = a specification of a conceptualization (Gruber 1993) • In practice: controlled vocabularies – Disambiguation (e.g. Bank, Running) – Language/species independence • Very useful in biology – complex hierarchies of terms
  5. 5. Ontologies in the bio Domain • OBO Foundry - open Biological and Biomedical Ontologies • Common principles • List of ontologies at http://www.obofoundry.org • OBO is also a data format .obo
  6. 6. SideTrack – The Gene Ontology • The mother of bio-ontologies: the GO – Oldest bio-ontology – Many practical applications: • Cross species studies • Term abundance studies • GO is an OBO ontology
  7. 7. SideTrack – The Gene Ontology • Collection of terms
  8. 8. SideTrack – The Gene Ontology • Relationships between terms: – Subsumption: is_a – Partonomic: part_of • These relationships are transitive • Terms form a DAG (directed acyclic graph) • Some information can be inferred
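The inference that transitivity allows can be sketched in a few lines of Python. This is a minimal illustration, not how GO tools are implemented: the term names are made up for the example (they are not real GO identifiers), and only is_a edges are followed.

```python
# Hypothetical mini-ontology: child term -> list of is_a parents.
# Term names are illustrative only, not real GO identifiers.
IS_A = {
    "mitochondrion": ["organelle"],
    "organelle": ["cellular_component"],
    "cellular_component": [],
}

def ancestors(term, parents):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen

# Because is_a is transitive, an entity annotated with "mitochondrion"
# is implicitly annotated with all of its ancestors as well.
print(sorted(ancestors("mitochondrion", IS_A)))
```

Since the terms form a DAG, the walk always terminates; the `seen` set only avoids visiting a shared ancestor twice.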
  9. 9. SideTrack – The Gene Ontology
  10. 10. SideTrack – The Gene Ontology
  11. 11. SideTrack – The Gene Ontology • Know more: www.geneontology.org • AMIGO : the GO browser
  12. 12. Gene Ontology Annotation • Gene ontology annotations GOA = entities labeled with GO terms – E.g. Uniprot-GOA
  13. 13. Semantic Technologies • The semantic web: Tim Berners Lee et al, Scientific American 2001
  14. 14. Semantic Technologies • W3C: a set of specifications http://www.w3.org/standards/semanticweb/ • A mature toolset – Dedicated data formats – Storage – Query language
  15. 15. Semantic Technologies • Basic data element = a Triple – A mini sentence – Contains three Terms: • Subject Predicate Object
  16. 16. Semantic Technologies • Representation of triples – Basic data format: RDF/XML – All data expressed in RDF (Resource Description Framework) – Several compatible syntaxes: Turtle (TTL, Terse RDF Triple Language) is the most human readable
  17. 17. Example
  18. 18. The Turtle Syntax • Basic Triple <http://bioinformatics.be/entities#martijn> <http://bioinformatics.be/relations#has_favorite_beer> <http://bioinformatics.be/entities#karmeliet>.
  19. 19. The Turtle Syntax • Prefix @prefix b4x: <http://bioinformatics.be/terms#> . b4x:martijn b4x:has_favorite_beer b4x:karmeliet.
  20. 20. The Turtle Syntax • Predicate lists @prefix b4x: <http://bioinformatics.be/terms#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . b4x:martijn b4x:has_favorite_beer b4x:karmeliet; foaf:name "Martijn Devisscher".
  21. 21. The Turtle Syntax • Object lists @prefix b4x: <http://bioinformatics.be/terms#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . b4x:martijn b4x:has_favorite_beer b4x:karmeliet, b4x:chimay_blauw; foaf:name "Martijn Devisscher".
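To make the shorthand concrete, here is a small Python sketch of what a Turtle parser conceptually does with prefixes, predicate lists and object lists. It is not a parser: the helper names `expand` and `expand_lists` are hypothetical, and the b4x namespace is assumed to be http://bioinformatics.be/terms#.

```python
# Assumed namespaces; the b4x IRI is taken to be http://bioinformatics.be/terms#.
PREFIXES = {
    "b4x": "http://bioinformatics.be/terms#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

def expand(prefixed_name, prefixes=PREFIXES):
    """Turn a prefixed name such as b4x:martijn into a full IRI."""
    prefix, local = prefixed_name.split(":", 1)
    return "<" + prefixes[prefix] + local + ">"

def expand_lists(subject, predicate_objects):
    """Expand a predicate list (;) with object lists (,) into plain triples."""
    return [
        (subject, predicate, obj)
        for predicate, objects in predicate_objects
        for obj in objects
    ]

triples = expand_lists(
    "b4x:martijn",
    [
        ("b4x:has_favorite_beer", ["b4x:karmeliet", "b4x:chimay_blauw"]),
        ("foaf:name", ['"Martijn Devisscher"']),
    ],
)
print(expand("b4x:martijn"))  # <http://bioinformatics.be/terms#martijn>
print(len(triples))           # 3
```

The object-list slide therefore encodes three separate triples that all share the subject b4x:martijn.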
  22. 22. IRIs and Literals • Terms can be IRIs, literals or blank nodes • IRI = Internationalized Resource Identifier • Unique id – a virtual URI – Example: <http://bioinformatics.be/terms#martijn> – There is no requirement for resolving – Now: Open Data initiatives: please do use resolvable URIs http://linkeddata.org – Unique identifiers can be registered on http://identifiers.org
  23. 23. Introduction • Literals: can be typed, allowed types from the XSD namespace: – E.g. "This is a string example"^^xsd:string – E.g. "5"^^xsd:integer • IRIs are used for entities and attributes • Literals are used for attribute values that aren't entities
  24. 24. The Turtle Syntax • Typed literals @prefix b4x: <http://bioinformatics.be/terms#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . b4x:martijn b4x:has_favorite_beer b4x:karmeliet, b4x:chimay_blauw; b4x:length "184"^^xsd:integer; foaf:name "Martijn Devisscher"^^xsd:string.
  25. 25. The Turtle Syntax • Blank nodes @prefix b4x: <http://bioinformatics.be/terms#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . b4x:martijn b4x:has_favorite_beer b4x:karmeliet, b4x:chimay_blauw; b4x:length "184"^^xsd:integer; foaf:name "Martijn Devisscher"^^xsd:string; b4x:owns_cat [ b4x:color "Gray" ].
  26. 26. Classes and Individuals • rdf:type @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix b4x: <http://bioinformatics.be/terms#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . b4x:martijn rdf:type foaf:Person.
  27. 27. Classes and Individuals • Shorthand: a @prefix b4x: <http://bioinformatics.be/terms#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . b4x:martijn a foaf:Person; foaf:knows b4x:geert. b4x:geert a foaf:Person.
  28. 28. Example <http://xmpl/entities#martijn> <http://xmpl/relations#has_favorite_beer> <http://xmpl/entities#karmeliet>.
  29. 29. Semantic Technologies • Sets of triples form a Graph
  30. 30. Graphs • Triples are building blocks of Graphs • Combining sets of triples allows the construction of arbitrarily complex graphs • Example edge: b4x:martijn b4x:has_favorite_beer b4x:karmeliet
  31. 31. Add meaning ! • Reuse terms from existing, well defined vocabularies – ontologies (foaf, dc, go, so) • Describe new terms = Ontologies • Contain – A crisp human definition – Some machine readable facts
  32. 32. Metadata • Ontologies are also described in RDF – RDFS: RDF - Schema – OWL: Web Ontology Language – Also expressed in RDF • For clarity, file extension can be .rdfs or .owl
  33. 33. RDFS Essentials • Descriptions – rdfs:label – rdfs:comment
  34. 34. RDFS • Relationships between properties, classes – rdfs:Class – rdfs:subClassOf – rdf:Property – rdfs:subPropertyOf – rdfs:range – rdfs:domain
  35. 35. RDFS: Example @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . @prefix foaf: <http://xmlns.com/foaf/0.1/> . @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . @prefix b4x: <http://bioinformatics.be/terms#> . b4x:karmeliet a b4x:Tripel . b4x:Beer a rdfs:Class . b4x:Tripel a rdfs:Class . b4x:Tripel rdfs:subClassOf b4x:Beer . b4x:has_favorite_beer a rdf:Property ; rdfs:domain foaf:Person ; rdfs:range b4x:Beer . b4x:Beer rdfs:subClassOf b4x:Drink .
  36. 36. Analogy • RDF = database = data • RDFS/OWL = schema = metadata • Both are described in RDF, but have a different scope
  37. 37. Semantic Technologies • Inference – Enhance dataset using knowledge from metadata (e.g. rdfs, owl) • Types of inference engines – RDFS inference • RDFS entailment regime – OWL inference • Under active research • Engines exist for specific subsets of OWL (OWL-DL)
  38. 38. RDFS Entailment
  39. 39. RDFS: Inference b4x:kevin b4x:has_favorite_beer b4x:stella Q: What can we infer from this using RDFS entailment ?
  40. 40. RDFS: Inference b4x:kevin b4x:has_favorite_beer b4x:stella Inferred triples: b4x:kevin a foaf:Person [from domain] b4x:stella a b4x:Beer [from range] b4x:stella a b4x:Drink [from subClassOf]
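The three inferred triples can be reproduced with a deliberately simplified forward-chaining sketch in Python. It applies only the domain, range and subClassOf rules used on this slide (full RDFS entailment has more rules, such as subPropertyOf), and it treats triples as plain tuples of prefixed names rather than using a real RDF library.

```python
TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
SUBCLASS = "rdfs:subClassOf"

# Schema statements taken from the RDFS example slide.
SCHEMA = {
    ("b4x:has_favorite_beer", DOMAIN, "foaf:Person"),
    ("b4x:has_favorite_beer", RANGE, "b4x:Beer"),
    ("b4x:Beer", SUBCLASS, "b4x:Drink"),
}
DATA = {("b4x:kevin", "b4x:has_favorite_beer", "b4x:stella")}

def rdfs_closure(triples):
    """Apply the domain, range and subClassOf rules until nothing new appears."""
    graph = set(triples)
    while True:
        new = set()
        for s, p, o in graph:
            for s2, p2, o2 in graph:
                if p2 == DOMAIN and s2 == p:
                    new.add((s, TYPE, o2))   # subject gets the domain class
                elif p2 == RANGE and s2 == p:
                    new.add((o, TYPE, o2))   # object gets the range class
                elif p2 == SUBCLASS and p == TYPE and o == s2:
                    new.add((s, TYPE, o2))   # instance of subclass is instance of superclass
        if new <= graph:
            return graph
        graph |= new

inferred = rdfs_closure(SCHEMA | DATA) - SCHEMA - DATA
for triple in sorted(inferred):
    print(triple)
```

The loop needs two passes: the range rule first types b4x:stella as a b4x:Beer, and only then can the subClassOf rule add b4x:Drink.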
  41. 41. Duck Typing • Watch out with inference! Example: you want to express that people can have lengths b4x:length a rdf:Property; rdfs:domain foaf:Person; rdfs:range xsd:integer.
  42. 42. Duck Typing • Problem: ex:VW_Transporter b4x:length "600"^^xsd:integer. • This would infer that VW_Transporter is a Person! • This is called duck typing: if it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck
  43. 43. Task • Find a solution: express in rdfs that people can have lengths
  44. 44. Task • Find a solution: express in rdfs that people can have lengths b4x:havingLength a rdfs:Class. b4x:length a rdf:Property; rdfs:domain b4x:havingLength; rdfs:range xsd:integer. foaf:Person rdfs:subClassOf b4x:havingLength.
  45. 45. Storing RDF • As an RDF file for download • In a Triplestore – Database optimised for storing triples – Examples: BlazeGraph, Fuseki, Sesame
  46. 46. Semantic Technologies • Querying over RDF data: SPARQL • Cool features: – Distributed querying = actual distribution of data and computing resources – SPARQL/Update: modify data • SPARQL endpoints: SPARQL over HTTP
  47. 47. SPARQL Query Syntax • First example: SELECT ?subject ?predicate ?object WHERE { ?subject ?predicate ?object. } (Generally not a good idea as it will pull down the whole dataset) • Key concepts: binding variables, graph matching
  48. 48. ? SELECT ?person WHERE { ?person b4x:has_favorite_beer b4x:karmeliet }
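How a single triple pattern binds variables can be sketched with a toy matcher in Python. This is an illustration of the idea only, not how a triplestore evaluates queries; the data set is hypothetical (in particular, b4x:geert liking karmeliet is invented for the example).

```python
# Hypothetical data set, written as (subject, predicate, object) tuples.
TRIPLES = {
    ("b4x:martijn", "b4x:has_favorite_beer", "b4x:karmeliet"),
    ("b4x:martijn", "b4x:has_favorite_beer", "b4x:chimay_blauw"),
    ("b4x:geert", "b4x:has_favorite_beer", "b4x:karmeliet"),
}

def match(pattern, triples):
    """Yield one {variable: value} binding per triple that fits the pattern."""
    for triple in triples:
        binding = {}
        for want, got in zip(pattern, triple):
            if want.startswith("?"):
                # A variable matches anything, but must stay consistent
                # if it occurs more than once in the pattern.
                if binding.setdefault(want, got) != got:
                    break
            elif want != got:
                break
        else:
            yield binding

# Equivalent in spirit to:
#   SELECT ?person WHERE { ?person b4x:has_favorite_beer b4x:karmeliet }
pattern = ("?person", "b4x:has_favorite_beer", "b4x:karmeliet")
people = sorted(b["?person"] for b in match(pattern, TRIPLES))
print(people)  # ['b4x:geert', 'b4x:martijn']
```

Passing a pattern made of three variables reproduces the "select everything" query from the first example, which is why it returns the whole data set.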
  49. 49. ?
  50. 50. SPARQL Query Syntax • Limit result size : SELECT ?subject ?predicate ?object WHERE { ?subject ?predicate ?object. } LIMIT 10
  51. 51. SPARQL Query Syntax • Find all classes: PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?class ?label WHERE { ?class a rdfs:Class. ?class rdfs:label ?label. } (This will only retrieve classes that have a label)
  52. 52. SPARQL Query Syntax • Find all classes: PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?class ?label WHERE { ?class a rdfs:Class. OPTIONAL { ?class rdfs:label ?label. } }
  53. 53. SPARQL Query Syntax • Find all classes that contain "duck" in the label: PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?class ?label WHERE { ?class a rdfs:Class. ?class rdfs:label ?label. FILTER( CONTAINS (str(?label) , "duck" ) ) }
  54. 54. SPARQL Query Syntax • Make it case insensitive: PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?class ?label WHERE { ?class a rdfs:Class. ?class rdfs:label ?label. FILTER( CONTAINS ( UCASE(str(?label)) , "DUCK" ) ) }
  55. 55. SPARQL Query Syntax • Search in specific graph: PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?class ?label FROM <http://example.org/animals> WHERE { ?class a rdfs:Class. ?class rdfs:label ?label. FILTER( CONTAINS ( UCASE(str(?label)) , "DUCK" ) ) }
  56. 56. SPARQL Query Syntax • Search in specific graph: PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?class ?label WHERE { GRAPH <http://example.org/animals> { ?class a rdfs:Class. ?class rdfs:label ?label. FILTER( CONTAINS ( UCASE(str(?label)) , "DUCK" ) ) } }
  57. 57. SPARQL Query Syntax • Can also search for graphs : PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?g WHERE { GRAPH ?g { ?class a rdfs:Class. ?class rdfs:label ?label. FILTER( CONTAINS ( UCASE(str(?label)) , "DUCK" ) ) } }
  58. 58. Summary: Querying RDF data RDF Data Inference Engine RDFS/OWL RDF Data Inferred SPARQL Endpoint
  59. 59. • Basic data element = a Triple – A mini sentence – Contains three Terms: – Subject Predicate Object • Example: <http://xmpl/entities#martijn> <http://xmpl/relations#has_favorite_beer> <http://xmpl/entities#karmeliet>. Take home Summary
  60. 60. • Combine triples to represent knowledge
  61. 61. • Use terms from ONTOLOGIES – COMMON VOCABULARIES – POSSIBLE TO INFER MEANING • OMIABIS • OBIB • SNOMED/ICD • MESH
  62. 62. ? • SPARQL searches for patterns
  63. 63. ?
  64. 64. Interoperability between OBO and Semantic Technologies • Originated from two separate academic worlds • Computing applications of OBO mainly consistency checking and overrepresentation analysis • Semantic Technologies: much broader toolset • Interoperability ? – Direct offering in both formats – Automated mapping
  65. 65. Where to find ontologies • OBO Foundry • Bioportal; NCBO • Biogateway • Bio2RDF
  66. 66. Where to find RDF data • Google for SPARQL endpoint • => e.g. EBI databases • Non biological: DBpedia
  67. 67. How about Tim Berners Lee’s vision • We’re not there yet, but for bio data we’re getting quite close – The explicitome – Crowd sourcing – Nanopublications
  68. 68. SPARQL in PRACTICE
  69. 69. SPARQL : Recap PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?label FROM <http://graphName> WHERE { ?x rdfs:label ?label. FILTER ( CONTAINS(?label, "dimethylalinine") ) } ORDER BY ?label LIMIT 10
  70. 70. SPARQL : Recap PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?label FROM <http://graphName> WHERE { ?x rdfs:label ?label. FILTER ( CONTAINS(?label, "dimethylalinine") ) } ORDER BY ?label LIMIT 10 • FIND the pattern ?x rdfs:label ?label.
  71. 71. SPARQL : Recap PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?label FROM <http://graphName> WHERE { ?x rdfs:label ?label. FILTER ( CONTAINS(?label, "dimethylalinine") ) } ORDER BY ?label LIMIT 10 • FIND the pattern ?x rdfs:label ?label. • BIND variables ?label, ?x
  72. 72. SPARQL : Recap PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?label FROM <http://graphName> WHERE { ?x rdfs:label ?label. FILTER ( CONTAINS(?label, "dimethylalinine") ) } ORDER BY ?label LIMIT 10 • FIND the pattern ?x rdfs:label ?label. • BIND variables ?label, ?x • RETRIEVE variable ?label
  73. 73. SPARQL : Recap PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?label FROM <http://graphName> WHERE { ?x rdfs:label ?label. FILTER ( CONTAINS(?label, "dimethylalinine") ) } ORDER BY ?label LIMIT 10 • FIND the pattern ?x rdfs:label ?label. • BIND variables ?label, ?x • RETRIEVE variable ?label • PREFIX: replace rdfs:label by <http://www.w3.org/2000/01/rdf-schema#label>
  74. 74. SPARQL : Recap PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?label FROM <http://graphName> WHERE { ?x rdfs:label ?label. FILTER ( CONTAINS(?label, "dimethylalinine") ) } ORDER BY ?label LIMIT 10 • FIND the pattern ?x rdfs:label ?label. • BIND variables ?label, ?x • RETRIEVE variable ?label • PREFIX: replace rdfs:label by <http://www.w3.org/2000/01/rdf-schema#label> • FILTER results to labels containing "dimethylalinine"
  75. 75. SPARQL : Recap PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?label FROM <http://graphName> WHERE { ?x rdfs:label ?label. FILTER ( CONTAINS(?label, "dimethylalinine") ) } ORDER BY ?label LIMIT 10 • FIND the pattern ?x rdfs:label ?label. • BIND variables ?label, ?x • RETRIEVE variable ?label • PREFIX: replace rdfs:label by <http://www.w3.org/2000/01/rdf-schema#label> • FILTER results to labels containing "dimethylalinine" • ORDER results by label, then LIMIT to the first 10 matches
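The recap steps can be mirrored, loosely, as a small Python pipeline. This is only a mental model, assuming an in-memory list of tuples; the data and the function name `recap_query` are invented for illustration. Note that sorting is applied before the result is cut to the limit.

```python
# Hypothetical data: (subject, predicate, object) tuples invented for illustration.
TRIPLES = [
    ("ex:c3", "rdfs:label", "zeta compound"),
    ("ex:c1", "rdfs:label", "alpha compound"),
    ("ex:c2", "rdfs:label", "beta salt"),
    ("ex:c1", "rdf:type", "ex:Chemical"),
]

def recap_query(triples, needle, limit):
    # FIND the pattern ?x rdfs:label ?label and BIND ?x and ?label.
    bindings = [(x, label) for x, p, label in triples if p == "rdfs:label"]
    # FILTER: keep only labels containing the search string.
    kept = [row for row in bindings if needle in row[1]]
    # ORDER BY ?label, and only then LIMIT.
    kept.sort(key=lambda row: row[1])
    # RETRIEVE: SELECT ?label projects away ?x.
    return [label for _, label in kept[:limit]]

print(recap_query(TRIPLES, "compound", 10))  # ['alpha compound', 'zeta compound']
print(recap_query(TRIPLES, "compound", 1))   # ['alpha compound']
```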
  76. 76. SPARQL : Recap DESCRIBE <http://rdf.wikipathways.org/Pathway/WP1425_r74390/WP/Interaction/e077e> • Useful short query to get direct links from/to a given node
  77. 77. SPARQL REFERENCE http://www.w3.org/TR/sparql11-overview/
  78. 78. Running SPARQL • From a web interface
  79. 79. • From a web interface • Using http – HTTP GET – HTTP POST : for larger query strings – Headers determine response type (JSON, XML, HTML) http://…/sparql?default-graph-uri=<http://graphName>&query=URLENCODEDQUERYSTRING Running SPARQL
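Building the GET request by hand can be sketched with the Python standard library. The endpoint URL below is a placeholder, not a real service, and the helper name `sparql_get_url` is hypothetical; the request object is only constructed here, not sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # placeholder endpoint, replace with a real one
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"

def sparql_get_url(endpoint, query, default_graph=None):
    """Return a GET URL carrying the URL-encoded query string."""
    params = {"query": query}
    if default_graph is not None:
        params["default-graph-uri"] = default_graph
    return endpoint + "?" + urlencode(params)

url = sparql_get_url(ENDPOINT, QUERY, default_graph="http://graphName")

# The Accept header asks the endpoint for JSON results.
request = Request(url, headers={"Accept": "application/sparql-results+json"})

# Round trip: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["query"][0] == QUERY)  # True
```

Passing `request` to `urllib.request.urlopen` would actually send it. For long queries, the same parameters can be sent in the body of a POST instead.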
  80. 80. BIO-ONTOLOGIES
  81. 81. BioPortal
  82. 82. Access • From the web interface ! • SPARQL endpoint: using API key; on request • Running a local copy: download VM image; on request
  83. 83. Exercises • Find a term • Find ontologies containing a term • Browse some ontologies • Check the NCBO annotator !
  84. 84. BIO-DATA
  85. 85. EBI RDF Resources
  86. 86. EBI RDF Resources
  87. 87. Ensembl
  88. 88. Exercise • From uniprot find proteins that are annotated with a given Gene Ontology term
  89. 89. PREFIX up:<http://purl.uniprot.org/core/> PREFIX taxon:<http://purl.uniprot.org/taxonomy/> PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#> PREFIX obo:<http://purl.obolibrary.org/obo/> SELECT * WHERE { ?protein up:classifiedWith obo:GO_0004499. ?protein up:organism taxon:9606. } http://sparql.uniprot.org
  90. 90. Exercise • From Expression Atlas find proteins that are differentially expressed (P < 1e-12) in Crohn’s disease
  91. 91. PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX dcterms: <http://purl.org/dc/terms/> PREFIX obo: <http://purl.obolibrary.org/obo/> PREFIX sio: <http://semanticscience.org/resource/> PREFIX efo: <http://www.ebi.ac.uk/efo/> PREFIX atlas: <http://rdf.ebi.ac.uk/resource/atlas/> PREFIX atlasterms: <http://rdf.ebi.ac.uk/terms/atlas/> PREFIX up:<http://purl.uniprot.org/core/> PREFIX biopax3:<http://www.biopax.org/release/biopax-level3.owl#> SELECT distinct ?protein ?expressionValue ?pvalue WHERE { ?factor rdf:type efo:EFO_0000384 . ?value atlasterms:hasFactorValue ?factor . ?value atlasterms:isMeasurementOf ?probe . ?value atlasterms:pValue ?pvalue . ?value rdfs:label ?expressionValue . ?probe atlasterms:dbXref ?protein . FILTER ( ?pvalue < 1e-12 ) FILTER ( strstarts(str(?protein),"http://purl.uniprot.org/uniprot/") ) }ORDER BY ASC (?pvalue) https://www.ebi.ac.uk/rdf/services/atlas/sparql
  92. 92. • Links pathways with genes, terms from Pathway, Cell line and Disease ontology, PubMed references • Models individual Interactions • Can be downloaded as RDF • Has an experimental SPARQL endpoint WikiPathways
  93. 93. • Define a query to find pathways linked to TNFalpha gene Exercise
  94. 94. PREFIX wp: <http://vocabularies.wikipathways.org/wp#> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX dcterms: <http://purl.org/dc/terms/> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT DISTINCT ?PathwayName where { ?geneProduct a wp:GeneProduct . ?geneProduct dc:identifier ?GeneID . ?geneProduct dcterms:isPartOf ?pathway . ?geneProduct rdfs:label ?geneName . ?pathway dc:identifier ?pathwayid . ?pathway dc:title ?PathwayName . FILTER(str(?geneName) = "TNFalpha" ) } http://sparql.wikipathways.org
  95. 95. • Try this, or another query – Using web interface – Using http get • Define a simple describe • Use a web tool to URLEncode the query • Submit query as a URL parameter Exercise
  96. 96. DisGeNet
  97. 97. • Find diseases linked to BRCA1 Exercise
  98. 98. PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX dcterms: <http://purl.org/dc/terms/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX void: <http://rdfs.org/ns/void#> PREFIX sio: <http://semanticscience.org/resource/> PREFIX ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#> PREFIX up: <http://purl.uniprot.org/core/> SELECT DISTINCT ?disease WHERE { ?gda a sio:SIO_000983. ?gda sio:SIO_000628 ?disease. ?disease a ncit:C7057. ?gda sio:SIO_000628 ?gene. ?gene a ncit:C16612. ?gene skos:exactMatch <http://identifiers.org/hgnc.symbol/BRCA1>} http://rdf.disgenet.org/lodestar/sparql
  99. 99. • Yields no results ????
  101. 101. PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX dcterms: <http://purl.org/dc/terms/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX void: <http://rdfs.org/ns/void#> PREFIX sio: <http://semanticscience.org/resource/> PREFIX ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#> PREFIX up: <http://purl.uniprot.org/core/> SELECT DISTINCT ?disease WHERE { ?gda a [(rdfs:subClassOf)* sio:SIO_000983]. ?gda sio:SIO_000628 ?disease. ?disease a ncit:C7057. ?gda sio:SIO_000628 ?gene. ?gene a ncit:C16612. ?gene skos:exactMatch <http://identifiers.org/hgnc.symbol/BRCA1>} http://rdf.disgenet.org/lodestar/sparql
  102. 102. Why ? • Inference cannot be assumed on a SPARQL endpoint, so take care when defining queries
  103. 103. • Define a query to find genes with important link to Crohn’s disease (score > 0.35) Exercise
  104. 104. PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX dcterms: <http://purl.org/dc/terms/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX void: <http://rdfs.org/ns/void#> PREFIX sio: <http://semanticscience.org/resource/> PREFIX ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#> PREFIX up: <http://purl.uniprot.org/core/> SELECT DISTINCT ?gene WHERE { ?gda sio:SIO_000628 ?gene,?disease . ?gene a ncit:C16612 . ?gene skos:exactMatch ?GeneID . ?disease a ncit:C7057 . ?disease dcterms:title ?DiseaseName . ?gda sio:SIO_000216 ?scoreIRI . ?scoreIRI sio:SIO_000300 ?score . FILTER (?score > "0.35"^^xsd:decimal) FILTER (contains(str(?DiseaseName),"Crohn")) } http://rdf.disgenet.org/lodestar/sparql
  105. 105. neXtProt
  106. 106. • Define a query to find proteins related with Cardio diseases • Define a query to find the genomic location of gene “TP53” Exercise
  107. 107. select distinct ?id where { ?entry skos:exactMatch ?id. ?entry :isoform ?isoform. ?isoform :medical ?medical_annotation. ?medical_annotation :term ?term. ?term :related ?disease. ?disease a :MeshCv. ?disease rdfs:label ?label. FILTER(CONTAINS(?label,"Cardio")). } https://snorql.nextprot.org/
  108. 108. select ?chrom ?start ?end where { ?gene rdf:type :Gene. ?gene :name ?name. ?gene :chromosome ?chrom. ?gene :begin ?start. ?gene :end ?end. FILTER (str(?name) = "TP53") } https://snorql.nextprot.org/
  109. 109. SPARQL and federated queries • Federated querying: include data from another endpoint using the SERVICE keyword • Example: find pathways (from WikiPathways) involving genes linked to Crohn’s disease (from DisGeNET)
  110. 110. PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX dcterms: <http://purl.org/dc/terms/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX void: <http://rdfs.org/ns/void#> PREFIX sio: <http://semanticscience.org/resource/> PREFIX ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#> PREFIX up: <http://purl.uniprot.org/core/> PREFIX wp: <http://vocabularies.wikipathways.org/wp#> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX dcterms: <http://purl.org/dc/terms/> http://rdf.disgenet.org/lodestar
  111. 111. SELECT DISTINCT ?PathwayName WHERE { ?gda sio:SIO_000628 ?gene, ?disease . ?gene a ncit:C16612 . ?disease a ncit:C7057 . ?disease dcterms:title ?DiseaseName . ?gda sio:SIO_000216 ?scoreIRI . ?scoreIRI sio:SIO_000300 ?score . FILTER (?score > "0.35"^^xsd:decimal) FILTER (contains(str(?DiseaseName),"Crohn")) SERVICE <http://sparql.wikipathways.org/> { ?geneProduct a wp:GeneProduct . ?geneProduct dc:identifier ?gene . ?geneProduct dcterms:isPartOf ?pathway . ?pathway dc:identifier ?pathwayid . ?pathway dc:title ?PathwayName . } } http://rdf.disgenet.org/lodestar/sparql
