Paper presentation @ SWF 2010


Functional Units: Abstractions for Web Service Annotation, SWF 2010 workshop

Published in: Technology, Education

  • To annotate web services clearly we need another layer of abstraction that is independent of the technology used. This presentation uses a number of examples to define the Functional Unit (FU). The work stems from the observation that current annotation models force users to think in terms of the service interface rather than high-level functionality. FUs are the elementary units of information used to describe a service. Using widely used Life Science web services, we define FUs as configurations and compositions of the underlying service operations. An FU is limited to the set of operations that are part of the same service.
  • How many web services are there? What are the API submission statistics for 2008? Is there a graph showing an increase?
      o 3 million/month accesses to various WS APIs (MSD, BioModels, ES-compute jobs, etc.).
      o 1 million/month compute jobs, of which more than 50% are over WS (mostly by systematic users).
      o 20K unique IPs/month for the whole. Of these, ca. 5K are systematic users and account for the vast majority of job submissions.
      o User agents covering every single LS programming language have been detected (Perl, Python, C/C++, C#, Java, Ruby, PHP, etc.).
      o A guess for LSWS: >500 - <1000 worldwide, but growing as methods are specialised and split out of monolithic servers that offer more than 20 methods. This only includes SOAP (rpc & doclit). REST, JAX-WS and DAS are not included in this estimate. If you count DAS as a type of REST WS, you can say >700 - <1000. I'm being conservative.
  • Web service providers usually think about themselves first when building web services.
  • Despite a wealth of research over the past few years, service annotations still reflect an interface-oriented view rather than a functional view of the service.
      o WSMO. Ontologies: terminology used by other elements. Goals: service functionality. Web services: the services provided. Mediators: for interoperability between WSMO elements.
      o OWL-S. Service: web service declaration. Service profile: functionality and non-functional properties. Service model: service functionality. Service grounding: technical aspect of the service.
      o SAWSDL. W3C recommendation since 2007. Maps a WSDL document to a domain ontology.
  • Annotations apply to the entire service or to individual operations; they follow the WSDL structure. For the purpose of discovery in a registry such as BioCatalogue, this level of abstraction is not always suitable, because the operations exposed by a service are not always functional tasks.
  • BioCatalogue:
      o A means to pool metadata about services in the wild
      o A means to discover and reuse those services
      o A means to curate services
      o A platform for service monitoring and analytics
      o A generic service annotation model for community annotation
  • Services in the wild are worse than we think; we have come across these different types of service. Multiple operations -> 1 task: when these services are annotated on individual operations, a gap remains between the user's perspective of service operations as tasks with a well-defined function and the service provider's technological view. We argue that this gap can be filled by annotating at a higher level of abstraction, which is what we call the FU. KEGG: Kyoto Encyclopedia of Genes and Genomes.
  • ChEBI (meaning either Chemical Entities of Biological Interest or Chemistry at the EBI) is a database of molecular entities focused on 'small' chemical compounds. ... SABIO-RK (System for the Analysis of Biochemical Pathways - Reaction Kinetics) is a web-based application based on the SABIO relational database, which contains information about biochemical reactions, their kinetic equations with their parameters, and the experimental conditions under which these parameters were measured. This is a concrete example of an FU…
  • Notes: useful for finding alternative services and configuring services
  • To elicit FUs we can extract sub-workflows from tried and tested workflows held in a workflow repository such as myExperiment…. A single workflow may define multiple FUs. FUs are identified by parsing the workflow definition from myExperiment. Elicitation of an FU: identify the operations and the way they are combined. Annotation of an FU: annotate the inputs and outputs by relating them to concepts from a domain ontology… this can be automated using existing tools such as QuASAR, METEOR-S and ASSAM. QuASAR (Quality Assurance of Semantic Annotations for Services) aims to provide a toolkit to assist in the cost-effective creation and evolution of reliable semantic annotations for Web services. It can be used to infer new semantic annotations or to verify the quality of existing annotations. ASSAM: a tool for semi-automatically annotating semantic Web services.
  • Merge last 2 slides ….

    1. The Functional Units: Abstractions for Web Service Annotation
       Paolo Missier, Katy Wolstencroft, Franck Tanoh, Peter Li, Sean Bechhofer, Khalid Belhajjame, Steve Pettifer, Carole Goble
       School of Computer Science, University of Manchester (UK)
       SWF 2010
    2. Functional Unit (FU)
         • Service description abstraction
         • Services as functional tasks
         • Independent from technology used
         • FU within the boundary of a service
       [Figure: SERVICE (SOAP, REST, DAS, others) and FUNCTIONAL UNIT]
    3. Motivations: 1-Web Services in the Life Sciences
         • Useful for tools and resources integration
         • Automated Life Science applications and workflow systems (Taverna, Kepler, Trident, KNIME …)
         • An estimated 3500+ public web services
    4. Motivations: 2-Web Services issues
         • Not properly documented
         • Poorly annotated
         • Distributed
         • Hard to find
         • Hard to understand how they work
       e.g. a WSDL document:
         <wsdl:message name="getGlimmersResponse">
           <wsdl:part name="getGlimmersReturn" type="xsd:string"/>
         </wsdl:message>
         <wsdl:message name="aboutServiceRequest"/>
         <wsdl:message name="getGlimmersRequest">
           <wsdl:part name="in0" type="xsd:string"/>
           <wsdl:part name="in1" type="xsd:string"/>
           <wsdl:part name="in2" type="xsd:string"/>
           <wsdl:part name="in3" type="xsd:string"/>
           <wsdl:part name="in4" type="xsd:string"/>
           <wsdl:part name="in5" type="xsd:string"/>
           <wsdl:part name="in6" type="xsd:string"/>
           <wsdl:part name="in7" type="xsd:int"/>
           <wsdl:part name="in8" type="xsd:string"/>
    5. Motivations: 3-Existing annotation frameworks
         • SAWSDL (Semantic Annotations for WSDL)
         • WSMO (Web Service Modeling Ontology)
         • OWL-S
         • Feta
    6. Motivations: 4-Shortcomings of existing frameworks
         • Annotation at the interface level
         • Tied to the service technology
         • Difficult for users to understand
         • Wrong level of abstraction
    7. The BioCatalogue
    8. The BioCatalogue
         • A means to pool metadata about services
         • A means to discover and reuse services
         • A means to curate services
         • A platform for service monitoring and analytics
         • A generic service annotation model for community annotation
    9. Truth about web services
       Different types and different behaviors:
         • Each operation performs a single domain-related task, e.g. the KEGG web service
         • A single operation performs several domain-related tasks => polymorphic services, e.g. the searchSimple operation of the BLAST web service by DDBJ
         • Multiple operations of the same service are combined to perform a single domain-related task => operation patterns, e.g. the InterProScan web service by EBI
    10. FU by example: 1- One operation, one FU
         • 1 operation performs 1 domain-related task
             - KEGG
             - SABIO-RK
             - BioMoby
             - Etc…
       [Figure: Inputs, Outputs, Data resources]
       FU aligned with service operation
    11. FU by example: 2- One operation, multiple FUs
         • e.g. searchSimple from the BLAST web service by DDBJ
       searchSimple inputs: query, database, program
       PD: protein sequence database; ND: nucleotide sequence database
       5 FUs for searchSimple:
         proteinBlast                      blastp    protein      PD
         nucleotideBlast                   blastn    nucleotide   ND
         proteinNucleotideBlast            tblastn   nucleotide   ND
         nucleotideProteinBlast            blastx    protein      PD
         nucleotideBlastFrameTranslation   tblastx   nucleotide   ND
    12. FU by example: Multiple operations, one FU
         • Operation orchestration or pattern based
         • Asynchronous services
         • Server-like services (e.g. Soaplab)
       FU for InterProScan:
         Task: Protein motifs analysis
         Inputs: Protein sequence
         Outputs: Protein motifs
         SOAP operations: runInterProScan, CheckStatus, Get_XML_Result
       [Figure: Inputs, Outputs, Data resources; InterProScan FUNCTIONAL UNIT]
    13. FU by example: Composite FUs
         • Task A + Task B = Task C
         • 2 or more Functional Units, when combined, can produce another Functional Unit
       [Figure: Inputs, Outputs, Data resources]
    14. FU defined
       Composite of the service properties:
         • Input
         • Task
         • Output
         • Underlying data resources
       [Figure: Inputs, Outputs, Data resources]
    15. Specifying the FU
       We used:
         • An OWL-based semantic annotation
         • The myGrid ontology for core annotation vocabulary
    16. Specifying the FU by example
         • Describing a KEGG operation (getEnzymeByGene) as an FU
    17. Specifying the FU by example
         • Describing a pattern-based service (e.g. InterProScan) as an FU
    18. FU usefulness
         • Identify domain-relevant operations from a given service
         • Same level of abstraction as the user
         • Enhance service discovery
         • Conceal service technology
         • Workflow composition
    19. Cost of identifying the FU
         • Yet another ‘thing’ to annotate
         • Triples the curation effort
         • Relies on the curators to identify the FUs
         • Domain knowledge required
    20. To reduce the cost
         • Use the existing pool of workflows available at myExperiment
         • Use existing tools such as QuASAR
         • Encourage Web Service providers
    21. Summary
         • The abstraction handles the diversity of service types (SOAP, REST) and behaviors
         • Concrete example of how to use the service
         • Enhance web service annotation
         • Enhance web service discovery
         • Development of FU by the BioCatalogue team ongoing
    22. Acknowledgments
         • Jiten Bhagat, School of Computer Science, University of Manchester (UK)
         • Eric Nzuobontane, Thomas Laurent, Rodrigo Lopez, EMBL European Bioinformatics Institute, Cambridge (UK)
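
As a rough illustration of slides 11, 12 and 14, the sketch below models an FU as a small Python record and shows the two non-trivial cases: one operation configured into several FUs, and several operations hidden behind one FU. It is not the authors' OWL-based specification. The `FunctionalUnit` class and its field names, the `FakeInterProScanClient` stub, the job id and the "RUNNING"/"DONE" status strings are all hypothetical; only the FU names, the program/database pairs and the three InterProScan operation names are taken from the slides.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FunctionalUnit:
    """A domain-level task, described independently of the service technology."""
    name: str
    task: str
    inputs: List[str]
    outputs: List[str]
    data_resources: List[str]
    operations: List[str]  # operations of ONE service, in call order
    fixed_parameters: Dict[str, str] = field(default_factory=dict)


# Slide 11: one operation (searchSimple), five FUs that differ only in the
# pinned program/database pair. PD = protein db, ND = nucleotide db.
SEARCH_SIMPLE_CONFIGS = {
    "proteinBlast": ("blastp", "PD"),
    "nucleotideBlast": ("blastn", "ND"),
    "proteinNucleotideBlast": ("tblastn", "ND"),
    "nucleotideProteinBlast": ("blastx", "PD"),
    "nucleotideBlastFrameTranslation": ("tblastx", "ND"),
}


def search_simple_fus() -> List[FunctionalUnit]:
    """Build one FU per configuration of the single searchSimple operation."""
    return [
        FunctionalUnit(
            name=name,
            task=name,
            inputs=["query"],
            outputs=["search result"],  # assumption: not stated on the slide
            data_resources=[database],
            operations=["searchSimple"],
            fixed_parameters={"program": program, "database": database},
        )
        for name, (program, database) in SEARCH_SIMPLE_CONFIGS.items()
    ]


# Slide 12: several operations of the same service, one FU.
INTERPROSCAN_FU = FunctionalUnit(
    name="InterProScan",
    task="Protein motifs analysis",
    inputs=["Protein sequence"],
    outputs=["Protein motifs"],
    data_resources=[],  # left empty: not recoverable from the transcript
    operations=["runInterProScan", "CheckStatus", "Get_XML_Result"],
)


class FakeInterProScanClient:
    """Hypothetical in-memory stand-in for a SOAP client; not the real EBI API."""

    def __init__(self, polls_until_done: int = 2):
        self._remaining = polls_until_done

    def runInterProScan(self, sequence: str) -> str:
        return "job-1"

    def CheckStatus(self, job_id: str) -> str:
        self._remaining -= 1
        return "DONE" if self._remaining <= 0 else "RUNNING"

    def Get_XML_Result(self, job_id: str) -> str:
        return "<result job='%s'/>" % job_id


def run_interproscan_fu(client, sequence: str, poll_seconds: float = 0.0) -> str:
    """Hide the submit / poll / fetch pattern behind a single call."""
    job_id = client.runInterProScan(sequence)
    while client.CheckStatus(job_id) != "DONE":
        time.sleep(poll_seconds)
    return client.Get_XML_Result(job_id)
```

Calling `run_interproscan_fu(FakeInterProScanClient(), "MKV")` exercises the whole pattern against the stub. The point of the design is that a user sees one task with one input and one output, while the three operations and the polling loop stay an implementation detail of the FU.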