Automated Syntactic Mediation for Web Service Integration

  • The presentation concentrates on the problem that arises when service providers use different representations for conceptually equivalent information. I will present a solution using ontologies that capture data structure and semantics, and a mapping language to express the relation between data models and ontologies. This work is set against the background of a bioinformatics use case in the context of the myGrid project, with real datasets and real services.

Transcript

  • 1. Automated Syntactic Mediation for Web Service Integration Martin Szomszor ( [email_address] )
  • 2. Presentation Outline
    • Contemporary workflow design pattern
      • Using workflow to capture experimentation process
      • Discovery of services using semantics
    • Problem description
      • Syntactic incompatibility
    • Using ontologies for mediation
    • Architecture to support syntactic mediation
    • Mapping Language
      • Overview of mapping mechanics
      • Implementation description
    • Future work
      • Dynamic discovery of Mappings
  • 3. In Silico Experimentation
    • Computational experimentation
    • Access to resources provided by Web Services
    • Users map experimental process to workflow
    • Tasks are realised by service instances
  • 4. Service Discovery
    • Users need to find services to fulfill given tasks e.g.
      • Retrieve sequence data
      • Sequence alignment (Blast)
    • There are lots of services!
    • Interface definitions can be terse, often undocumented and sometimes cryptic
    • Limited semantic value
    • Manual discovery not ideal
  • 5. Semantic Discovery
    • Support users in the discovery of services according to domain specific terminology
    • Annotate service descriptions with concepts from an ontology (PEDRO annotation tool)
      • Input and output types assigned a semantic type by a reference to an ontology concept
    • Discover services by:
      • Task performed
      • Resources used
      • Input and output semantic types
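A minimal sketch of discovery by semantic annotation, assuming a flat in-memory registry. The service names, task labels and the `Accession_Id` and `Alignment_Report` concepts are hypothetical; only `Sequence_Data_Record` appears later in the slides. It is not the PEDRO or myGrid API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ServiceAnnotation:
    name: str         # hypothetical service name
    task: str         # task performed
    input_type: str   # ontology concept assigned to the input
    output_type: str  # ontology concept assigned to the output


# Hypothetical registry contents, for illustration only.
REGISTRY = [
    ServiceAnnotation("getSequence", "retrieval", "Accession_Id", "Sequence_Data_Record"),
    ServiceAnnotation("runBlast", "alignment", "Sequence_Data_Record", "Alignment_Report"),
]


def discover(registry: List[ServiceAnnotation],
             task: Optional[str] = None,
             input_type: Optional[str] = None,
             output_type: Optional[str] = None) -> List[ServiceAnnotation]:
    """Return services matching every criterion that was supplied."""
    return [
        s for s in registry
        if (task is None or s.task == task)
        and (input_type is None or s.input_type == input_type)
        and (output_type is None or s.output_type == output_type)
    ]


matches = discover(REGISTRY, input_type="Sequence_Data_Record")
print([s.name for s in matches])
```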
  • 6. Use Case
    • Common bioinformatics task:
      • Find sequence data for a given id (accession number)
      • Perform sequence alignment to discover similar sequence data
      • Obtain results
    • Itself a complete workflow, but likely to feature in larger workflows too
  • 7. Semantically Driven Workflow Design
    • When building workflows, users connect services because they are deemed semantically compatible:
      • Output semantic type equivalent to input semantic type
  • 8. Syntactic Compatibility
    • However, semantically compatible service interfaces may not be syntactically compatible (i.e. different data formats)
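The distinction between the two kinds of compatibility can be sketched as a check on a workflow link. The `Port` type and the `FASTA` format label are assumptions made for illustration; the slides only name the semantic and syntactic levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    semantic_type: str   # ontology concept, e.g. Sequence_Data_Record
    syntactic_type: str  # concrete data format or schema element


def needs_mediation(output: Port, input_: Port) -> bool:
    """True when a link is semantically valid but the data formats differ."""
    same_meaning = output.semantic_type == input_.semantic_type
    same_format = output.syntactic_type == input_.syntactic_type
    return same_meaning and not same_format


# Hypothetical ports: the producer emits DDBJ XML, the consumer expects another format.
producer = Port("Sequence_Data_Record", "DDBJXML")
consumer = Port("Sequence_Data_Record", "FASTA")
print(needs_mediation(producer, consumer))
```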
  • 9. Syntactic Mediation
    • When a mismatch in data formats occurs within a workflow, a translation component is required
    • Current solutions are manual
      • Identify when mismatch occurs
      • Derive conversion requirements
      • Find suitable conversion tool
      • Create new translation components if necessary
    • These conversion components come in a variety of guises
      • Translation Scripts (e.g. XSLT)
      • Bespoke Code (Java and Perl)
      • Web Services
  • 10. Conversion Approaches
    • Simple solution: Adaptor for each compatible data format
      • O(n²)
      • Poor Scalability
    • Alternative: Introduce intermediate representation
      • O(n)
      • Less effort introducing new formats
    • Data Integration problem
    • [Two diagrams, each over formats a to f]
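The growth rates above can be made concrete with a small count. This sketch assumes that every pair of formats needs a converter in each direction, and that the intermediate representation needs one mapping in and one mapping out per format; both function names are illustrative.

```python
def direct_adaptors(n: int) -> int:
    """Converters needed when each format translates directly to every other one."""
    return n * (n - 1)  # one per ordered pair of distinct formats


def mediated_mappings(n: int) -> int:
    """Mappings needed with a single intermediate representation."""
    return 2 * n  # one into and one out of the intermediate form per format


for n in (3, 6, 20):
    print(n, direct_adaptors(n), mediated_mappings(n))
```

Under these assumptions, six formats need 30 direct converters but 12 mappings through the intermediate form, and adding a seventh format costs 12 new converters in the first case and 2 mappings in the second.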
  • 11. Three Layer View
    • Physical Layer
      • Data can be stored in different formats:
        • E.g. binary, text, XML, relational database, etc.
    • Logical Layer
      • Organisation of data elements described by a schema:
        • E.g. XML Schema, relational database model
    • Conceptual Layer
      • What the data means (semantics)
        • E.g. Ontology, description logic, Entity Relation Diagram
  • 12. Intermediate Representation
    • Data integration field has used this solution in similar application domains:
      • TAMBIS Project [Stevens et al 2003]
        • Complex query formulation over diverse bioinformatics information sources
      • SEEK Project [Bowers and Ludascher 2004]
        • An ontology-driven framework for geographic data transformation in scientific workflows
    • Intermediate representation in the form of a conceptual model
      • E.g. Ontology, Description Logic
  • 13. Architecture Requirements
    • OWL ontologies capture data format structure and semantics:
      • Existing service ontologies [e.g. C. Wroe et al 2003] can be extended with concepts and properties to describe data contents
    • Modular and composable mapping language
      • Mapping overhead reduced when service providers expose multiple operations over single schemas
      • When schemas are combined to form new datasets, existing mappings can be reused
  • 14. Architecture Requirements
    • Invocation of arbitrary Web Services
      • Grid and WS applications pull resources from multiple providers into a dynamic and volatile environment
      • Must be able to invoke previously unseen services
    • Minimise annotation overhead
      • Reuse existing Semantic Web Service description methods
      • Input and output types are assigned a concept (semantic type)
  • 15. Mapping XML to OWL
    • Problem can be simplified by assuming a canonical XML representation for OWL concept instances [OWL-XI]
      • XML serialisations of OWL concepts commonly used
    • However, XML Schemas to validate individuals do not exist
    • To support validation, OWL instance Schemas [OWL-XIS] are generated from ontologies
      • Concept hierarchies computed
      • Jena + Java Implementation
    • Enables us to view the translation as an XML to XML transformation
  • 16. Architecture Diagram
    • Service providers describe their Web Service interfaces using WSDL. Data consumed and produced is defined using XML Schema.
    • OWL ontologies are created to describe the information contained within bioinformatics data structures.
    • Serialisation and Realisation Mappings describe how to transform XML documents to and from [OWL-XI].
    • Semantic annotations associate each WSDL message part with a concept from the ontology.
    • [OWL-XIS] are generated to validate ontology instances.
  • 17. Configurable Mediator
    • Input:
      • Source data instance
      • Source schema
      • Realisation Mapping (source format -> ontology)
      • Ontology Definition
      • Serialisation Mapping (ontology -> destination format)
      • Destination Schema
    • Output:
      • Destination data instance
    • Conversion performed via intermediate OWL concept instance
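The two-step conversion can be sketched as function composition. This is a toy model: documents are plain dictionaries rather than XML, schema validation is omitted, and the destination field name `id` is hypothetical. It illustrates the data flow only, not the mediator's actual implementation.

```python
from typing import Callable, Dict

Document = Dict[str, str]  # toy stand-in for an XML document


def realise(source: Document) -> Document:
    """Realisation mapping: source format -> ontology concept instance."""
    return {"accession_id": source["ACCESSION"]}


def serialise(instance: Document) -> Document:
    """Serialisation mapping: ontology concept instance -> destination format."""
    return {"id": instance["accession_id"]}  # 'id' is a hypothetical destination field


def mediate(source: Document,
            realisation: Callable[[Document], Document],
            serialisation: Callable[[Document], Document]) -> Document:
    """Convert source to destination via the intermediate concept instance."""
    concept_instance = realisation(source)
    return serialisation(concept_instance)


result = mediate({"ACCESSION": "AB000059"}, realise, serialise)
print(result)
```

Because the mediator only sees the two mappings as parameters, supporting a new destination format means supplying a new serialisation mapping and leaving the realisation side untouched.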
  • 18. Configurable Mediator
  • 19. Mapping Mechanics
    • Source Document:
        <S>
          <X>foo</X>
          <X>bar</X>
        </S>
    • Destination Document:
        <D>
          <Y>foo</Y>
          <Y>bar</Y>
        </D>
    • Mappings:
      • m1: S/X -> D/Y
      • m2: X/$ -> Y/$
  • 20. Mapping Mechanics
    • [Diagram: source tree S with two X children holding the xsd:string literals "foo" and "bar"; destination tree D with two Y children holding the xsd:string literals "foo" and "bar"]
    • Mappings:
      • m1: S/X -> D/Y
      • m2: X/$ -> Y/$
  • 21. Example M-Binding
      <binding xmlns="http://www.ecs.soton.ac.uk/~mns03r/mapping/example"
               xmlns:sns="http://jaco.ecs.soton.ac.uk/schema/source"
               xmlns:dns="http://jaco.ecs.soton.ac.uk/schema/destination">
        <mapping id="1">
          <source match="sns:S/sns:X"/>
          <destination create="dns:D[join]/dns:Y[branch]"/>
        </mapping>
        <mapping id="2">
          <source match="sns:X/$"/>
          <destination create="dns:Y[join]/$"/>
        </mapping>
      </binding>
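The effect of mappings 1 and 2 on the S/X example can be reproduced by hand. This is a hand-written equivalent rather than a mapping engine, with namespaces dropped for brevity. It assumes that `[join]` means "reuse the existing element" and `[branch]` means "create a new element per match"; that reading is inferred from the example output and is not stated in the slides.

```python
import xml.etree.ElementTree as ET


def translate(source_xml: str) -> str:
    """Apply S/X -> D/Y and X/$ -> Y/$ to a source document."""
    src = ET.fromstring(source_xml)
    dst = ET.Element("D")           # D[join]: a single shared D element
    for x in src.findall("X"):      # one match of S/X per X child
        y = ET.SubElement(dst, "Y")  # Y[branch]: a new Y for every match
        y.text = x.text             # X/$ -> Y/$: copy the literal value
    return ET.tostring(dst, encoding="unicode")


result = translate("<S><X>foo</X><X>bar</X></S>")
print(result)
```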
  • 22. Bio Example
    • Source document (DDBJ XML):
        <ddbj:DDBJXML>
          <ddbj:ACCESSION>AB000059</ddbj:ACCESSION>
          <ddbj:FEATURES>
            <ddbj:source>
              <ddbj:location>1..1755</ddbj:location>
              <ddbj:qualifiers name="isolate">Som1</ddbj:qualifiers>
              <ddbj:qualifiers name="lab_host">Felis domesticus</ddbj:qualifiers>
            </ddbj:source>
          </ddbj:FEATURES>
        </ddbj:DDBJXML>
    • Destination document (ontology instance):
        <ont:Sequence_Data_Record>
          <ont:accession_id>AB000059</ont:accession_id>
          <ont:has_feature>
            <ont:Feature_Source>
              <ont:isolate>Som1</ont:isolate>
              <ont:lab_host>Felis domesticus</ont:lab_host>
              <ont:location>
                <ont:Feature_Location>
                  <ont:start>1</ont:start>
                  <ont:end>1755</ont:end>
                </ont:Feature_Location>
              </ont:location>
            </ont:Feature_Source>
          </ont:has_feature>
        </ont:Sequence_Data_Record>
    • Diagram labels: Simple One-to-One; Element and literal; Many-to-Many; Split literal value; Predicate evaluation
  • 23. Example M-Binding
      <binding xmlns="http://www.ecs.soton.ac.uk/~mns03r/mapping/ddbj-to-ont-mapping"
               xmlns:sns="http://jaco.ecs.soton.ac.uk/schema/DDBJ"
               xmlns:dns="http://jaco.ecs.soton.ac.uk/ont/sequencedata">
        <mapping id="1">
          <source match="sns:DDBJXML/sns:ACCESSION"/>
          <destination create="dns:Sequence_Data_Record[join]/dns:accession_id[branch]/"/>
        </mapping>
        <mapping id="2">
          <source match="sns:ACCESSION/$"/>
          <destination create="dns:accession_id[join]/$"/>
        </mapping>
        <mapping id="3">
          <source match="sns:DDBJXML/sns:FEATURES/sns:source"/>
          <destination create="dns:Sequence_Data_Record[join]/dns:has_feature[branch]/dns:Feature_Source[branch]"/>
        </mapping>
        <mapping id="4">
          <source match='sns:source/sns:qualifiers[sns:qualifiers/sns:name/$ = "lab_host"]'/>
          <destination create="dns:Feature_Source[join]/dns:lab-host[branch]"/>
          <mapping>
            <source match="sns:qualifiers/$"/>
            <destination create="dns:lab-host[join]/$"/>
          </mapping>
        </mapping>
        <mapping id="5">
          <source match="sns:location/$^[^.]+"/>
          <destination create="dns:Location[join]/dns:start[branch]/$"/>
        </mapping>
      </binding>
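Predicate evaluation and literal splitting, as used by mappings 4 and 5, can be illustrated on the DDBJ `source` element. This sketch drops the namespaces and uses ElementTree's attribute predicate in place of the mapping language's own syntax. The pattern `^[^.]+` for the start value is taken from mapping 5; the pattern used for the end value is an assumption, since the slides show no mapping for it.

```python
import re
import xml.etree.ElementTree as ET

# Namespace prefixes removed for brevity; content follows the Bio Example slide.
SOURCE = """<source>
  <location>1..1755</location>
  <qualifiers name="isolate">Som1</qualifiers>
  <qualifiers name="lab_host">Felis domesticus</qualifiers>
</source>"""

src = ET.fromstring(SOURCE)

# Predicate evaluation: select the qualifiers element whose name is "lab_host".
lab_host = src.find("qualifiers[@name='lab_host']").text

# Split literal value: one source literal feeds two destination literals.
location = src.findtext("location")
start = re.match(r"^[^.]+", location).group(0)   # pattern from mapping 5
end = re.search(r"[^.]+$", location).group(0)    # assumed counterpart for the end

print(lab_host, start, end)
```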
  • 24. Conclusions
    • Provide infrastructure to support syntactic mediation:
      • OWL Ontologies to capture data format structure and semantics (reuse existing annotations)
      • Mapping Language to describe relationships between XML Schemas and OWL Ontologies
        • Modular and Composable
      • Configurable Mediator to consume mappings and perform document translation
      • Dynamic Web Service Invoker
  • 25. Future Work
    • Dynamic discovery of Mappings
      • Already implemented using the GRIMOIRES registry and WSDL to describe mapping capabilities
        • [Szomszor, Payne, Moreau 2006] UK All Hands, Nottingham
    • Annotation Tool
      • Mappings are complex and difficult to write by hand
      • Web based annotation tool
  • 26. Questions and Comments?