Getting Serious About A Community Bio Service Catalogue


Published on

Why do we need a catalogue of bio Web Services?
Get the answer from these slides presented by Prof. Carole Goble

  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide
  • ; currently 850 databases are publicly web-accessible [1]. To get the most value out of these resources, however, they need to be able to integrate, interrogate and mine heterogeneous data and associated knowledge from distributed sources.
  • The service providers, decoupled from their consumers, have no obligation to supply such information and no great incentive either.
  • The automatic reconstruction of genome-scale yeast metabolic pathways from distributed data in the life sciences. Taverna extended to create and manipulate Systems Biology Markup Models.
  • Technical Infrastructure But its still not all joined up!! Feta keeps coming and going. Grid service descriptions are produced by annotating services with terms from the myGrid ontology, stored in a central registry, GRIMOIRES. Services are found using the Feta discovery service [5]. We have piloted expert manual annotation tools augmented by automated tools using information extraction techniques.
  • Semantic Infrastructure the controlled vocabulary for our annotations is an suite of ontologies written in OWL, and deployed as RDFS, that describe the bioinformatics research domain and the dimensions with which a service can be characterised from the perspective of the scientist , not the software developer [6].
  • Tools: Feta uses a “query and respond” registry interface, deployed chiefly as a plug-in to the Taverna workbench. However, for many users an Amazon-style web based browser is preferable, so scientists can “go shopping” for web services; we have piloted BioBay that adapts shopping metaphors to science service discovery [7]. Going further, our myExperiment activity is designing and building a “MySpace-like” collaboration environment, in partnership with an international focus group of users, for light-touch hosting of the registry and a user-controlled workflow and data exchange platform [8].
  • Would be preceded by Feta Screenshot
  • We did all sorts of fancy repository ideas and data sharing ideas. But actually something simple is what they needed. Would be preceeded by old MIR screenshot.
  • BioMOBY Dashboard GO tools for annotation and discovery Flickr style tagging tool!
  • BioMOBY Dashboard GO tools for annotation and discovery Flickr style tagging tool!
  • Capture and Curation effort: For the semantic discovery framework to be effective, a critical-mass of service descriptions have to be present. We have employed a full-time curator for the provision and maintenance of these annotations and to design annotation tools for future service providers, and begun to develop the Chameleon infrastructure to help cope with the impact of changes to the services and ontologies on the annotations [9].
  • Not just annotation, but checking the services still there and mirroring them.
  • How many services do we need to annotate to make it useful or make a difference? Don’t know. Key services everyone uses? Services that are “tipping points” in workflows? Services that are rare? What about mirroring the services? How reliable are they? How transient are they? Will they still be there tomorrow? If I can’t rely on them I won’t include them in a workflow. What about local services? They are exchange-blockers.
  • Getting Serious About A Community Bio Service Catalogue

    1. 1. Getting Serious about a Community Bio-Service Catalogue Carole Goble and Katy Wolstencroft OMII-UK my Grid Project The University of Manchester, UK
    2. 2. 12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt 12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt 12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct 12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt 12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt 12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt 12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg 12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga 12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc 12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa 12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa
    3. 3. [Mark Wilkinson, 2006 BioMOBY] 12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt 12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt 12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct 12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt 12421 taggtgactt gcctgttttt ttttaattgg
    4. 4. Taverna Workflow Workbench
    5. 5. <ul><li>3000+ services </li></ul><ul><li>Open domain services and resources. </li></ul><ul><li>Third party. </li></ul><ul><li>Enforce NO common data model. </li></ul><ul><li>No common typing. </li></ul><ul><li>Missing metadata. </li></ul><ul><li>Input0: string </li></ul><ul><li>Output0: string </li></ul><ul><li>. </li></ul>
    6. 6. <ul><li>Body </li></ul><ul><ul><li>Large experiments or large research groups/labs, possibly distributed </li></ul></ul><ul><ul><li>Large service provider institutes. </li></ul></ul><ul><ul><li>Tightly coupled provider-consumer of resources. </li></ul></ul><ul><ul><li>Commonly resource providers. </li></ul></ul><ul><ul><li>Lots of facilities. Some access to sys admin etc </li></ul></ul><ul><ul><li>Not many of them </li></ul></ul><ul><li>Long tail </li></ul><ul><ul><li>Small, isolated, independent, groups/individuals </li></ul></ul><ul><ul><li>Loosely coupled provider-consumer of resources. </li></ul></ul><ul><ul><li>Commonly resource consumers. </li></ul></ul><ul><ul><li>Boutique suppliers. </li></ul></ul><ul><ul><li>Poor access to facilities, instruments, sys admins. </li></ul></ul><ul><ul><li>Many of them – esp. in Life Sciences! </li></ul></ul>
    7. 7. Information Providers Information Consumers Decoupled suppliers and consumers
    8. 8. Shim Services
    9. 9. A Community Bio-Service Catalogue <ul><li>For my Grid. </li></ul><ul><li>For other middleware developers like BioMOBY. </li></ul><ul><li>For application developers in the community. </li></ul><ul><li>For bioinformaticians. </li></ul><ul><li>Semantic annotations for the community’s services. </li></ul><ul><li>Storage and discovery framework. </li></ul><ul><li>We have pieces. We have been investigating. We have been prototyping. Time to get serious! </li></ul><ul><li>SIX things: tech infrastructure, semantic infrastructure, tools, curation, community buy-in, institutional support </li></ul>
    10. 10. Importers Importers Ontology Editor Ontologists Domain Service Registry Shim Service Registry Workflow repository Registry Manager Repository Manager Service Providers Public Annotation Tool Shim Service Registry Domain Services Domain Services Shim Services Shim Services Extraction Importers Expert Annotation Tool Expert Annotator Chameleon change handler Feta Discovery Tool Feta GUI Text Discovery Scientists Ontology Exporter Publication Services Discovery Services Registry Services Ontology Services BioBay Browsing GUI MatchMaker Auto Annotation Khalid Belhajjame et al Automatic Annotation of Web Services based on Workflow Definitions , ISWC2006
    11. 11. Semantic Infrastructure <ul><li>Written in OWL, deployed as RDFS </li></ul><ul><li>The perspective of the scientist </li></ul>OPERATION name description task method resource PARAMETER name description namespace objectType SERVICE name description dc: format ORGANISATION name description dc: publisher hasInput hasOutput hasOperation hasOperation my Grid Service Ontology my Grid Domain Ontology
    12. 12. Tools, Tools, Tools Feta Search tool Pedro Annotation tool
    13. 13. webTaverna GUI - main
    14. 15. Adapt what is out there
    15. 16. Adapt what we have?!?
    16. 17. Adapt what we have?!?
    17. 18. Curation Ontology curation team OWL RDFS Service curation team Annotate Services for user discovery Annotate Services for shim discovery Service + Domain Ontology Shim Ontology and Mismatch Ontology Ontology curation team Community taggers Annotate Services for user discovery Scientists Service discovery Service + Domain Ontology OWL Workflow Composition Workflow matchmaker
    18. 19. Capture and Curation Effort Ontology and Annotation Curation Team Franck Tanoh and Katy Wolstencroft Community Service Providers Community Scientists
    19. 20. Community Buy-In <ul><li>The services are owned and developed by the community. </li></ul><ul><li>Partnership with the European Bioinformatics Institute to improve the metadata for services provided by major suppliers at source , and propagate best practice by example . </li></ul><ul><li>Enthusiasm for a catalogue in the community. </li></ul>
    20. 21. Institutional Support <ul><li>Hosting the registry, curating, resources to develop the infrastructure, partnering with suppliers. </li></ul><ul><li>Sustainability and impact. </li></ul><ul><li>OMII-UK and the EBI </li></ul>
    21. 22. So why is it taking so damn long? <ul><li>The final 9 yards and 80:20 rule. </li></ul><ul><li>All or nothing. </li></ul><ul><li>Dedicated resources and best intentions. </li></ul><ul><li>Content, content, content. </li></ul><ul><li>Being too damn, and unnecessarily, clever. </li></ul><ul><li>Can HCLS SIG help? </li></ul>
    22. 23. Let’s get serious <ul><li>Its time to get serious about building and managing a catalogue. </li></ul><ul><li>The outcome would be a living and useful resource, not just for bioinformaticians but also for those working in the Semantic Web for Life Sciences, or just the Semantic Web. </li></ul><ul><li>We have the parts and enough of the know-how. And a means to the content. </li></ul><ul><li>So lets. Just. Do. It. </li></ul>
    23. 24. <ul><li>my Grid project </li></ul><ul><li>BioMOBY project </li></ul><ul><li>ISPIDER project </li></ul><ul><li>GRIMOIRES project </li></ul><ul><li>OMII-UK </li></ul><ul><li>EPSRC </li></ul><ul><li>Genome Canada </li></ul>Acknowledgements