The Semantic Web for Life Sciences

Published in: Health & Medicine

  1. 1. The Semantic Web for Life Sciences Egon Willighagen <http://chem-bla-ics.blogspot.com/> Bioclipse & Proteochemometric Group (Prof. J. Wikberg) Until 2010-09-30 Department of Pharmaceutical Biosciences (Prof. E. Brittebo) Uppsala University 2010-09-22
  2. 2. Drug-Protein Binding: patented:ELW00356 binds to uniprot:CYP1B1. Outline: Why RDF?; Applications; Online; Building Blocks; RDF; The Players; Conclusion. 2010-09-22 Bioclipse & Proteochemometric Group -2- Egon Willighagen | chem-bla-ics.blogspot.com
  3. 3. Drug-Protein Binding: patented:ELW00356 binds to uniprot:CYP1B1. Where was this published? How was it measured? Are there other measurements? What about similar molecules and proteins? What haplotype, SNP, or missense mutation?
  4. 4. Semantic Web. Current World Wide Web: hyperlinked web pages. Semantic Web: machine-readable hyperlinked (web) pages.
  5. 5. Semantic Web 4 Life Sciences? Is this relevant to drug discovery? Knowledge discovery, ...; data consistency checking; model validation.
  6. 6. Applications
  7. 7. OpenMolecules RDF: linked data
  8. 8. OpenMolecules RDF: http://rdf.openmolecules.net/?InChI=1/CH4/h1H4
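The address on the slide above consists of a fixed base plus an `InChI` query parameter. A minimal Python sketch of composing such an address follows. It assumes the same pattern holds for other InChIs, which the slides do not confirm; the helper name `openmolecules_uri` is hypothetical, and the service may no longer be online.

```python
from urllib.parse import quote

# Base address as shown on the slide.
BASE = "http://rdf.openmolecules.net/"


def openmolecules_uri(inchi: str) -> str:
    """Compose a resource address for an InChI (hypothetical helper)."""
    # Keep "/" unescaped so the result matches the form shown on the slide;
    # whether the service expects other characters escaped is an assumption.
    return BASE + "?InChI=" + quote(inchi, safe="/")


methane = openmolecules_uri("1/CH4/h1H4")
print(methane)
```

For methane this reproduces the address from the slide.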
  9. 9. Linked Data. [Figure: the Linked Data cloud diagram as of July 2009 (CC-BY-SA). Datasets shown include DBpedia, Freebase, DBLP, GeoNames, and, in the life sciences, UniProt, PubMed, PDB, KEGG, ChEBI, DrugBank, LinkedCT, Reactome, Gene Ontology, OMIM, Pfam, PROSITE, and HGNC.]
  10. 10. Linked Data: the Life Science corner. [Figure, CC-BY-SA]
  11. 11. Proteochemometrics. Data: protein sequences, molecular structures, binding affinities. E.L. Willighagen et al., J. Biomed. Sem., 2010, in print
  12. 12. Proteochemometrics: RDF input
  13. 13. Substructure mining: ChEMBL. Annsofie Andersson, M.Sc. project
  14. 14. OpenTox: Open Standards around Computational Toxicology. Web services. Public Data Repository. Bioclipse integration: downloading/uploading data; run descriptor calculation; future: build QSAR models. E.L. Willighagen, N. Jeliazkova, O. Spjuth, in preparation
  15. 15. OpenTox: downloading
  16. 16. Hyperlinked Data: How do we put our semantic data online?
  17. 17. XHTML+RDFa: Embedded in web pages
  18. 18. SPARQL end point: Query the data directly
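Querying an end point directly usually means sending the query text over HTTP. The Python sketch below only prepares such a request under the SPARQL Protocol convention of a `query` parameter; it does not contact any server. The end point address is the DBpedia one that appears later in the deck, the query is a placeholder, and the helper name is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def sparql_get_request(endpoint: str, query: str) -> Request:
    """Prepare (but do not send) an HTTP GET request for a SPARQL query."""
    # The query text travels URL-encoded in the "query" parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask the server for the standard JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = sparql_get_request(
    "http://dbpedia.org/sparql",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 5",  # placeholder query
)
print(request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual call; whether a given end point is reachable and accepts this form is not something the slides state.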
  19. 19. Semantic Wikis: Bootstrapping Life Sciences Knowledge Bases. Samuel Lampa et al., in preparation
  20. 20. Other Building Blocks
  21. 21. The Chemistry Development Kit. A Family of Projects: CDK-Taverna (chemoinformatics workflows), JChemPaint (semantic 2D editor), ChemoJava (GPL-ed extension). Goals: library of cheminformatics algorithms; educational. Usage: CDK cited 140+ times in scientific literature; Bioclipse, KNIME, CDK-Taverna, Jumbo (CML), AMBIT, ... C. Steinbeck et al., J. Chem. Inf. Comput. Sci., 2003; C. Steinbeck et al., Curr. Pharm. Design, 2006
  22. 22. More detail on the CDK: Tomorrow, during the presentation from 13:00-14:00
  23. 23. Bioclipse-RDF: Linking the Semantic Web to Cheminformatics. Local RDF storage (memory, on disk); read/write RDF/XML, N3; run SPARQL queries (local and remote); extract RDF from XHTML/RDFa. Thanks to Open Source projects including Jena, SWI-Prolog, and Pellet. E.L. Willighagen et al., J. Biomed. Sem., in press; O. Spjuth et al., BMC Bioinformatics 2007; O. Spjuth et al., BMC Bioinformatics 2010
  24. 24. MyExperiment: Bioclipse Scripting Language. myexperiment.search("RDF") myexperiment.downloadWorkflow(937)
  25. 25. Semantic Web: The new building block... It's already 10+ years old...
  26. 26. Why are Open Standards Important? Standards: We speak the same language. Open: Social contract: you can use it, now and in the future.
  27. 27. The Semantic Web Stack: The Semantic Web is more than RDF. [Figure: the Semantic Web stack (from Wikipedia)]
  28. 28. Resource Description Framework: hyperlinked, machine-readable knowledge. Hyperlinked: Uniform Resource Identifier (URI). Machine-readable: knowledge markup with triples.
  29. 29. RDF: the URI. Uniform Resource Identifier (URI), e.g. URL: http://www.pharmbio.org/ Too long? Use prefixes: http://www.semanticweb.org/ontologies/cheminf.owl#Molecule becomes cheminf:Molecule
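The prefix mechanism on the slide above is plain string substitution: a short prefix stands for a namespace, and the local name is appended to it. A minimal Python sketch, using only the `cheminf` namespace from the slide; the function names `expand` and `shorten` are hypothetical.

```python
# Prefix table: short prefix -> namespace URI (taken from the slide).
PREFIXES = {
    "cheminf": "http://www.semanticweb.org/ontologies/cheminf.owl#",
}


def expand(name: str, prefixes: dict = PREFIXES) -> str:
    """Turn a prefixed name such as cheminf:Molecule into a full URI."""
    prefix, _, local = name.partition(":")
    return prefixes[prefix] + local


def shorten(uri: str, prefixes: dict = PREFIXES) -> str:
    """Turn a full URI back into a prefixed name when a prefix applies."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return prefix + ":" + uri[len(namespace):]
    return uri  # no known prefix: leave the URI unchanged


full = expand("cheminf:Molecule")
print(full)
print(shorten(full))
```

RDF syntaxes such as Turtle and SPARQL declare these prefix tables at the top of a document or query, so both forms identify the same resource.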
  30. 30. RDF: the triple. Type 1: resource - predicate - resource (Ethane, heavier than, Methane). Type 2: resource - predicate - literal (Methane, boiling point, -161).
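The two triple types from the slide above can be written out in the line-based N-Triples syntax, where resources are URIs in angle brackets and literals are quoted. The Python sketch below does this with string formatting only. The `http://example.org/` namespace and the predicate names `heavierThan` and `boilingPoint` are placeholders invented for illustration; the slides do not say which vocabulary was used.

```python
EX = "http://example.org/"  # hypothetical namespace, not from the slides


def resource(name: str) -> str:
    """A resource is identified by a URI, written in angle brackets."""
    return f"<{EX}{name}>"


def literal(value) -> str:
    """A literal is a plain value, written in double quotes."""
    return f'"{value}"'


def ntriple(subject: str, predicate: str, obj: str) -> str:
    """One N-Triples statement: subject, predicate, object, full stop."""
    return f"{subject} {predicate} {obj} ."


# Type 1: resource - predicate - resource
type1 = ntriple(resource("Ethane"), resource("heavierThan"), resource("Methane"))
# Type 2: resource - predicate - literal
type2 = ntriple(resource("Methane"), resource("boilingPoint"), literal(-161))

print(type1)
print(type2)
```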
  31. 31. RDF: graphs. Linked triples create a graph: (Ethane, heavier than, Methane), (Ethane, boiling point, -89), (Methane, boiling point, -161).
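Because triples share resources, a set of them can be walked like a graph. The Python sketch below stores the three triples from the slide in an adjacency structure and follows the link from Ethane to Methane to read a value stated about Methane. Plain strings stand in for URIs, and the function name is hypothetical.

```python
from collections import defaultdict

# The three triples from the slide; plain strings stand in for URIs.
triples = [
    ("Ethane", "heavier than", "Methane"),
    ("Ethane", "boiling point", -89),
    ("Methane", "boiling point", -161),
]

# Adjacency structure: subject -> list of (predicate, object) edges.
graph = defaultdict(list)
for subject, predicate, obj in triples:
    graph[subject].append((predicate, obj))


def boiling_point_of_lighter(node: str) -> int:
    """Follow 'heavier than' to the next node, then read its boiling point."""
    lighter = dict(graph.get(node, []))["heavier than"]
    return dict(graph.get(lighter, []))["boiling point"]


print(boiling_point_of_lighter("Ethane"))
```

Converting the edge list to a `dict` assumes each predicate occurs at most once per subject, which holds for this small example but not for RDF in general.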
  32. 32. RDF Schema and Web Ontology Language. RDF Schema: taxonomies (rdfs:Class, rdf:Property; rdfs:label, rdfs:comment; rdfs:subClassOf, rdfs:subPropertyOf). Web Ontology Language (OWL): owl:equivalentClass, owl:sameAs, and a lot more ...
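A taxonomy built with rdfs:subClassOf lets software derive facts that were never stated explicitly: an instance of a class is also an instance of every superclass. The Python sketch below illustrates that one inference rule on a made-up hierarchy. The class names are hypothetical, each class is assumed to have at most one superclass, and a real reasoner (the deck mentions Pellet) does far more.

```python
# Hypothetical taxonomy: class -> its direct superclass (rdfs:subClassOf).
SUBCLASS_OF = {
    "Alkane": "Hydrocarbon",
    "Hydrocarbon": "Molecule",
}

# Hypothetical instance data: resource -> its stated class (rdf:type).
TYPE_OF = {"Methane": "Alkane"}


def inferred_types(resource: str) -> list:
    """Return the stated class followed by all superclasses, in order."""
    types = []
    current = TYPE_OF.get(resource)
    # Walk up the hierarchy; the membership check guards against cycles.
    while current is not None and current not in types:
        types.append(current)
        current = SUBCLASS_OF.get(current)
    return types


print(inferred_types("Methane"))
```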
  33. 33. SPARQL: SPARQL RDF Query Language

      SELECT DISTINCT * WHERE {
        SERVICE <http://uu3.org:8888/7tm_receptors> {
          ?iuphar iface:family ?family .
          ?iuphar iface:code ?code .
          ?iuphar iface:iupharName ?iupharNm .
          ?human iface:iuphar ?iuphar .
          ?human iface:geneName "GABBR1" .
          ?human iface:entrezGene ?humanEntrez .
        }
        SERVICE <http://dbpedia.org/sparql> {
          _:gene dbp:entrezgene ?humanEntrez ;
                 rdfs:label ?label ;
          FILTER (lang(?label) = "en")
        }
        GRAPH <http://hcls.deri.org/atag/data/gabab_example.html> {
          ?topic rdfs:label ?label .
          ?post sioc:topic ?topic
        }
      }
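The core of a query like the one above is a set of triple patterns whose variables (the `?name` terms) must be bound consistently across all patterns. The Python sketch below implements only that basic pattern matching over an in-memory list of triples. It is not a SPARQL engine, it ignores SERVICE, GRAPH, and FILTER, and the data values (`ex:receptor1`, `ex:id1`, and so on) are invented placeholders.

```python
def match(pattern, triple, binding):
    """Unify one triple pattern with one triple, extending the binding.

    Terms starting with '?' are variables. Returns the extended binding,
    or None when the triple does not fit.
    """
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if isinstance(term, str) and term.startswith("?"):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = match(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


# Invented placeholder data, loosely shaped like the first SERVICE block.
data = [
    ("ex:receptor1", "iface:geneName", "GABBR1"),
    ("ex:receptor1", "iface:entrezGene", "ex:id1"),
    ("ex:receptor2", "iface:geneName", "OTHER"),
    ("ex:receptor2", "iface:entrezGene", "ex:id2"),
]

results = query(
    [
        ("?human", "iface:geneName", "GABBR1"),
        ("?human", "iface:entrezGene", "?humanEntrez"),
    ],
    data,
)
print(results)
```

Sharing `?human` between the two patterns is what joins them, in the same way that `?humanEntrez` and `?label` join the separate blocks of the query on the slide.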
  34. 34. The Players: Who is working with RDF?
  35. 35. W3C: World Wide Web Consortium. Coordinates standard development. Builds user communities. Health Care and Life Sciences Interest Group: Linked Open Drug Data (LODD), Translational Medicine Ontology (TMO), Scientific Discourse.
  36. 36. Bio2RDF / Chem2Bio2RDF. Bio2RDF: Proteins, DNA, ...; Open Source; Canada, Virtuoso, ... Chem2Bio2RDF: More towards molecules...; Indiana University.
  37. 37. Talis, Virtuoso: The companies that provide triple store support. Very supportive of Open initiatives...
  38. 38. ACS Meeting: Boston, 22-23 August 2010. Topics: lipidomics, text mining, drug discovery, ontologies. Thematic Issue: submit before 2010-11-28. http://egonw.github.com/acsrdf2010/
  39. 39. Summary. RDF: New Open Standards for knowledge exchange; simplifies sharing data. RDFS + OWL: machine-readable knowledge; disambiguation. SPARQL, etc.: Application Programming Interfaces.
  40. 40. What does this bring us? Platform to integrate the RDF with the computation world. Bioclipse as single point of access. Scripting, sharing of scripts with MyExperiment.org. Bridging Names to Numbers.
  41. 41. Acknowledgements. Maris Lapins, Martin Eklund: statistics. Annsofie Andersson: ChEMBL + MoSS integration. Samuel Lampa: reasoning (Pellet/Prolog) and RDFIO. Nina Jeliazkova: OpenTox integration. John Overington: ChEMBL database. Ola Spjuth.
  42. 42. The Details: PharmBio course. http://www.pharmbio.org/ Book in preparation ...
  43. 43. The Details: Molecular Chemometrics. http://www.citeulike.org/user/egonw/tag/papers http://chem-bla-ics.blogspot.com http://egonw.github.com waveto: egon.willighagen@googlewave.com
