2008 Jun Zhao Eswc


Published in: Technology, Education

  1. 1. Building a Semantic Web Image Repository for Biological Research Images
     Jun Zhao, Graham Klyne and David Shotton
     Image Bioinformatics Research Group, Department of Zoology, University of Oxford, UK
  2. 2. FlyTED
     • The Drosophila Testis Gene Expression Image Database
     • Publish research data
     • Use existing tools to build a biological image repository
         • Browsable and searchable by research biologists
         • Accessible to wider communities, e.g. Semantic Web, Linked Data
     • A loosely coupled software architecture maximises the opportunity to replace or update the components used
  3. 3. Images
     • “… so that people can come and see the images, and they will notice something special about the genes”
       (Dr Helen White-Cooper, Drosophila testis expert)
  4. 4. Where we started from
     • In situ gene expression images of the ~1500 genes of the testis of Drosophila melanogaster
     • ≥1 image for each wild type gene
     • Possibly, ≥1 image for any of the 6 different mutant strains having defective sperm maturation
     • Images stored in the file system
     • Metadata in spreadsheets, but not expressed using controlled keywords
     • No way to search for images
     • Search through the hard disk whenever they need an image
  5. 5. The goal
     • Publish Drosophila gene expression images to the Web
     • Make them accessible and searchable to our biological researchers as well as third parties
     • Quick, easy and cost-effective approach
  6. 6. EPrints 3.0
     • A digital repository software system
     • Quick and easy to deploy
     • Built-in user interface
     • Programmatic data access
         • Repository-specific protocol: OAI-PMH
     • Support for domain-specific image metadata, e.g. Serpent Project
     [Image: a piglet squid from the Serpent Project]
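As a rough illustration of the programmatic access OAI-PMH offers, the sketch below builds a `ListRecords` request and extracts identifiers and titles from a response. The base URL, the sample response and the function names are assumptions made for this example; the slides do not give the repository's OAI-PMH address, and the example parses an embedded sample rather than contacting a server.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Hypothetical endpoint: the slides do not state the repository's OAI-PMH URL.
BASE_URL = "http://repository.example.org/cgi/oai2"

# Standard XML namespaces for OAI-PMH responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response used so the example runs offline.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example in situ image</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL."""
    query = urllib.parse.urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    )
    return f"{base_url}?{query}"


def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        records.append((identifier, title))
    return records


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also need to follow `resumptionToken` values to page through large result sets, which this sketch leaves out.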
  7. 7. Our gene expression images
     [Figure: image metadata fields]
     • Gene name
     • Strain name
     • Expression location (>1)
     • Slide name
     • Creation date
     • …
  8. 8. Adaptation of EPrints
     • Basic structure cannot hold the domain metadata
     • Customize underlying database:
         • Add additional metadata fields in the database schema
         • Keep both images and their metadata files as blobs
     • Customize the user interface: CSS
  9. 15. Issues
     • Difficult to query metadata programmatically
     • Limited flexibility in the user interface
  10. 16. FlyTED on the Semantic Web
      • Data become programmatically accessible
      • Images can be used in more flexible UIs
      • Semantic Web faceted browsers
          • Exhibit
              • from MIT SIMILE
              • JavaScript, runs in Web browsers
          • jSpace
              • from clarkparsia
              • Java Web application
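To make the step from queryable data to a faceted browser concrete, here is a minimal sketch that reshapes a SPARQL JSON result set into the `items` list that Exhibit reads. The variable names (`gene`, `strain`, `image`), the sample values and the `sparql_json_to_exhibit` helper are all assumptions for illustration, not the project's actual conversion code.

```python
import json


def sparql_json_to_exhibit(sparql_results, label_var, item_type="Image"):
    """Convert SPARQL JSON results into an Exhibit-style {"items": [...]} dict.

    Each result row becomes one item; the variable named by `label_var`
    is stored under Exhibit's "label" property.
    """
    items = []
    for binding in sparql_results["results"]["bindings"]:
        item = {"type": item_type}
        for var, cell in binding.items():
            key = "label" if var == label_var else var
            item[key] = cell["value"]
        items.append(item)
    return {"items": items}


# Made-up result set in the standard SPARQL JSON results layout.
SAMPLE_RESULTS = {
    "head": {"vars": ["image", "gene", "strain"]},
    "results": {
        "bindings": [
            {
                "image": {"type": "uri", "value": "http://example.org/images/1"},
                "gene": {"type": "literal", "value": "example-gene"},
                "strain": {"type": "literal", "value": "wild type"},
            }
        ]
    },
}

if __name__ == "__main__":
    exhibit_data = sparql_json_to_exhibit(SAMPLE_RESULTS, label_var="image")
    print(json.dumps(exhibit_data, indent=2))
```

Keeping the conversion as a pure function of the result set means the same data could feed a different front end later, in line with the loosely coupled design the slides describe.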
  11. 17. Publish FlyTED on the Semantic Web
      [Architecture diagram; labels listed in approximate flow order]
      • FlyTED Database (free-text metadata file + relational)
      • OAI-PMH
      • Local Harvesting & Transformer script
      • N3 metadata
      • Command-line Jena model loader
      • Jena RDF database
      • Joseki SPARQL Endpoint
      • HTTP SPARQL
      • jSpace
      • JSON data
      • HTTP-based Babel
      • Exhibit
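The transformer step in this pipeline turns harvested metadata into N3. A minimal sketch of that idea follows, assuming a record with the fields named on the metadata slide. The `ImageRecord` class, the `ex:` vocabulary namespace, the property names and the sample values are hypothetical, since the slides do not show the vocabulary the project used.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical vocabulary namespace, not the one used by the project.
EX = "http://example.org/flyted/terms#"


@dataclass
class ImageRecord:
    """One image with the metadata fields listed on the slides."""
    image_uri: str
    gene_name: str
    strain_name: str
    expression_locations: List[str]
    slide_name: str
    creation_date: str


def n3_literal(value):
    """Quote a string as an N3 literal, escaping backslashes, quotes and newlines."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def record_to_n3(record):
    """Serialise one record as an N3 document with a single subject."""
    pairs = [
        ("ex:geneName", record.gene_name),
        ("ex:strainName", record.strain_name),
        ("ex:slideName", record.slide_name),
        ("ex:creationDate", record.creation_date),
    ]
    # An image may have more than one expression location.
    pairs += [("ex:expressionLocation", loc) for loc in record.expression_locations]
    body = " ;\n".join(f"    {pred} {n3_literal(obj)}" for pred, obj in pairs)
    return f"@prefix ex: <{EX}> .\n\n<{record.image_uri}>\n{body} .\n"


if __name__ == "__main__":
    # All values below are made up for illustration.
    example = ImageRecord(
        image_uri="http://example.org/images/1",
        gene_name="example-gene",
        strain_name="wild type",
        expression_locations=["location-a", "location-b"],
        slide_name="slide-001",
        creation_date="2008-01-01",
    )
    print(record_to_n3(example))
```

Output of this kind could then be loaded into an RDF store; a production script would more likely use an RDF library than string formatting.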
  12. 18. FlyTED in Exhibit
  13. 20. Functionality Measurement
      [Table; row and column labels not recoverable. Cell values in reading order:]
      Yes, Yes, Partial, Yes, Yes, Yes, Yes, No, Yes, Partial, No, Yes, Partial, Yes, Yes, No, No, No
  14. 21. Performance
      [Figure: four panels labelled Exhibit, Exhibit, jSpace, jSpace]
  15. 22. Summary
      • We built an image repository based on EPrints
      • We used existing tools (OAIHarvester2 API and Joseki) to make metadata accessible through SPARQL
      • We consumed these images using existing faceted browsers (Exhibit and jSpace) in order to present them in more flexible user interfaces
      • Potentially, we can replace existing components with new tools, e.g. OAI2LOD, Joseki/SDB
  16. 23. To take home
      • Publish your data
      • Take a look at our “Exhibit” of FlyTED images
      • Play with and make a link to our SPARQL endpoint
      • VOID: Vocabulary of Interlinked Data
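For readers who want to try a SPARQL endpoint of this kind, the sketch below prepares a query request following the SPARQL protocol's HTTP GET binding. The endpoint URL, the `ex:` vocabulary and the query itself are placeholders, as the endpoint address does not survive in this transcript; the request is built but not sent.

```python
import urllib.parse
import urllib.request

# Placeholder endpoint: substitute the real SPARQL endpoint address.
ENDPOINT = "http://sparql.example.org/flyted"

# Hypothetical vocabulary; the project's actual property names may differ.
QUERY = """
PREFIX ex: <http://example.org/flyted/terms#>
SELECT ?image ?gene WHERE {
  ?image ex:geneName ?gene .
} LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Prepare a GET request carrying the query in the `query` parameter
    and asking for SPARQL JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


if __name__ == "__main__":
    request = build_sparql_request(ENDPOINT, QUERY)
    print(request.get_method(), request.full_url)
    # To run the query against a live endpoint:
    #   with urllib.request.urlopen(request) as response:
    #       results = json.load(response)   # requires `import json`
```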
  17. 24. Since then
      • More images
      • Enrich the Fly Anatomy Ontology
      • Link with others: the FlyWeb Project
          • BDGP: Drosophila gene expression images in embryos
          • FlyBase: genomic Drosophila database
          • Bio2RDF: RDFized PubMed, Medline, UniProt, GO database, etc.
  18. 25. Screen shot of FlyWeb
  19. 26. Acknowledgement
      • David Shotton, Graham Klyne, and Alistair Miles
      • Dr Helen White-Cooper and her research group
      • JISC and BBSRC
      • EPrints Southampton team
      • HP Labs, SIMILE project (MIT), and Clark&Parsia
  20. 27. Thank you!