2008 Jun Zhao Eswc

Transcript

  • 1. Building a Semantic Web Image Repository for Biological Research Images
    • Jun Zhao, Graham Klyne and David Shotton
    • Image Bioinformatics Research Group, Department of Zoology, University of Oxford, UK
  • 2. FlyTED
    • The Drosophila Testis Gene Expression Image Database
    • Publish research data
    • Use existing tools to build a biological image repository
      • Browse-able and searchable by research biologists
      • Accessible to wider communities, e.g. Semantic Web, Linked Data
    • A loosely coupled software architecture makes it easier to replace or update individual components
  • 3. Images
    • “… so that people can come and see the images, and they will notice something special about the genes”
    • --- Dr Helen White-Cooper (Drosophila testis expert)
  • 4. Where we started from
    • In situ gene expression images of the ~1500 genes of the testis of Drosophila melanogaster
    • At least one image for each wild-type gene
    • Possibly at least one image for any of the 6 mutant strains with defective sperm maturation
    • Images stored in the file system
    • Metadata in spreadsheets, but not expressed using controlled keywords
    • No way to search for images
    • Researchers had to search through the hard disk whenever they needed an image
  • 5. The goal
    • Publish Drosophila gene expression images to the Web
    • Make them accessible and searchable to our biological researchers as well as third parties
    • Quick, easy and cost-effective approach
  • 6. EPrints 3.0 (http://eprints.org/software/)
    • A digital repository software system
    • Quick and easy to deploy
    • Built-in user interface
    • Programmatic data access
      • Repository-specific protocol: OAI-PMH
    • Support for domain-specific image metadata, e.g. Serpent Project http://archive.serpentproject.com/
    • Image: a piglet squid from the Serpent Project
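
As an illustration of the programmatic access that OAI-PMH offers, here is a minimal Python sketch. It builds a standard `ListRecords` request URL and extracts Dublin Core fields from a response. The base URL and the sample response are hypothetical placeholders, not FlyTED's actual endpoint or data, and the slides do not say which metadata format the project harvested; `oai_dc` is assumed here because every OAI-PMH repository must support it.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical repository address, for illustration only.
BASE_URL = "http://repository.example.org/oai"

# Hypothetical response body, shaped like an OAI-PMH ListRecords reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example in situ image</dc:title>
          <dc:subject>example-gene</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return one dict per record: its OAI identifier and Dublin Core fields."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI_NS + "record"):
        ident = rec.findtext(f"{OAI_NS}header/{OAI_NS}identifier")
        fields = {}
        for el in rec.iter():
            if el.tag.startswith(DC_NS):
                name = el.tag[len(DC_NS):]
                fields.setdefault(name, []).append((el.text or "").strip())
        records.append({"identifier": ident, "fields": fields})
    return records


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also follow the `resumptionToken` that OAI-PMH uses for paging through large result sets; that is omitted here for brevity.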
  • 7. Our gene expression images
    • Gene name
    • Strain name
    • >1 Expression location
    • Slide name
    • Creation date
    • …
  • 8. Adaptation of EPrints
    • Basic structure cannot hold the domain metadata
    • Customize underlying database:
      • Add additional metadata fields in the database schema
      • Keep both images and their metadata files as blobs
    • Customize the user interface: CSS
    • https://milos2.zoo.ox.ac.uk/svn/ImageWeb/FlyTED/Trunk/
  • 9–14. (Image-only slides; no text in the transcript)
  • 15. Issues
    • Difficult to query metadata programmatically
    • Limited flexibility in the user interface
  • 16. FlyTED on the Semantic Web
    • Data become programmatically accessible:
      • http://www.fly-ted.org/sparql
    • Images can be used in more flexible UIs
    • Semantic Web faceted browsers
      • Exhibit
        • from MIT SIMILE
        • JavaScript, runs in Web browsers
        • http://simile.mit.edu/exhibit/
      • jSpace
        • from Clark & Parsia
        • Java Web Application
        • http://clarkparsia.com/jspace/
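
To make "programmatically accessible" concrete, here is a minimal Python sketch of querying a SPARQL endpoint over HTTP and flattening the standard SPARQL JSON results format. The endpoint URL is the one given on the slide and may no longer be live. The `ex:` vocabulary and the query are hypothetical, since the slides do not show FlyTED's actual RDF schema.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "http://www.fly-ted.org/sparql"  # from the slide; may no longer respond

# Hypothetical vocabulary: the real FlyTED property names are not in the slides.
QUERY = """
PREFIX ex: <http://example.org/flyted#>
SELECT ?image ?gene WHERE {
  ?image ex:geneName ?gene .
} LIMIT 10
"""


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows_from_results(doc):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    names = doc["head"]["vars"]
    return [
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in doc["results"]["bindings"]
    ]


def run_query(endpoint, query):
    """Send the query and return its rows (performs a network call)."""
    with urlopen(build_request(endpoint, query)) as resp:
        return rows_from_results(json.load(resp))
```

Calling `run_query(ENDPOINT, QUERY)` would return one dict per result row, provided the endpoint is reachable and the query matches the vocabulary it actually uses.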
  • 17. Publish FlyTED on the Semantic Web
    • FlyTED database (relational data plus free-text metadata files), accessed via OAI-PMH
    • Local harvesting and transformer script, producing N3 metadata
    • Command-line Jena model loader, populating the Jena RDF database
    • Joseki SPARQL endpoint
    • jSpace, querying over HTTP SPARQL
    • Exhibit, consuming JSON data produced by HTTP-based Babel
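
The transformation step in this pipeline, from harvested metadata to N3, can be sketched roughly as follows. This is not the project's transformer script; the `ex:` namespace, the property names, and the sample record are hypothetical and chosen only to mirror the metadata fields listed on slide 7.

```python
# Hypothetical namespace; the vocabulary FlyTED actually used is not shown in the slides.
PREFIX = "@prefix ex: <http://example.org/flyted#> ."


def n3_literal(text):
    """Quote a string as an N3 literal, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def to_n3(records):
    """Serialise (uri, {property: [values]}) pairs as one N3 triple per line."""
    out = [PREFIX, ""]
    for uri, metadata in records:
        for prop, values in metadata.items():
            for value in values:
                out.append(f"<{uri}> ex:{prop} {n3_literal(value)} .")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = [(
        "http://example.org/image/1",
        {"geneName": ["example-gene"], "expressionLocation": ["location A", "location B"]},
    )]
    print(to_n3(sample))
```

Emitting one triple per line keeps the output easy to diff and easy to feed to a bulk loader, at the cost of repeating the subject URI.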
  • 18. FlyTED in Exhibit
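
Exhibit reads its data from a JSON file containing an `items` array, where each item carries a `label` and usually a `type`. The project used Babel to produce that JSON; the hand-written Python sketch below is an alternative shown only to illustrate the target format. The record structure, the `Image` type name, and the field names are hypothetical.

```python
import json


def to_exhibit(records, item_type="Image"):
    """Convert records into the {"items": [...]} structure that Exhibit loads."""
    items = []
    for rec in records:
        item = {"label": rec["label"], "type": item_type}
        for key, values in rec["fields"].items():
            # Exhibit accepts a single value or an array of values per property.
            item[key] = values[0] if len(values) == 1 else list(values)
        items.append(item)
    return {"items": items}


if __name__ == "__main__":
    sample = [{
        "label": "example-image-1",
        "fields": {"geneName": ["example-gene"], "expressionLocation": ["location A", "location B"]},
    }]
    print(json.dumps(to_exhibit(sample), indent=2))
```

Multi-valued properties such as expression location are kept as arrays so that a faceted browser can filter on each value separately.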
  • 19.  
  • 20. Functionality Measurement (comparison table; row and column labels were lost in extraction): Yes, Yes, Partial, Yes, Yes, Yes, Yes, No, Yes, Partial, No, Yes, Partial, Yes, Yes, No, No, No
  • 21. Performance (figure panels: Exhibit, jSpace)
  • 22. Summary
    • We built an image repository based on EPrints
    • We used existing tools (OAIHarvester2 API and Joseki) to make metadata accessible through SPARQL
    • We consumed these images using existing faceted browsers (Exhibit and jSpace) in order to present them in more flexible user interfaces
    • Potentially, we can replace existing components with new tools, e.g. OAI2LOD, Joseki/SDB
  • 23. To take home
    • Publish your data
    • Take a look at our “Exhibit” of FlyTED images
      • http://www.fly-ted.org/exhibit/exhibit_flyted.html
    • Play with and make a link to our SPARQL endpoint
      • http://www.fly-ted.org/sparql
    • VoID: Vocabulary of Interlinked Datasets
  • 24. Since then
    • More images
    • Enrich the Fly Anatomy Ontology
    • Link with others: the FlyWeb Project
      • BDGP (http://www.fruitfly.org/):
        • Drosophila gene expression images in embryos
      • FlyBase (http://www.flybase.org/):
        • Genomic Drosophila database
      • Bio2RDF (http://bio2rdf.org/):
        • RDFized PubMed, Medline, UniProt, GO database, etc.
  • 25. Screen shot of FlyWeb
  • 26. Acknowledgement
    • David Shotton, Graham Klyne, and Alistair Miles
    • Dr Helen White-Cooper and her research group
    • JISC and BBSRC
    • EPrints Southampton team
    • HP Labs, SIMILE project (MIT), and Clark&Parsia
  • 27. Thank you!