Automatic Semantic Web Services Composition

846 views

Published on

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
846
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
16
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Automatic Semantic Web Services Composition

  1. 1. Automatic Semantic Web Services Composition Zixin Wu, Ajith Ranabahu, Karthik Gomadam, Amit P. Sheth, John A. Miller LSDIS Lab, University of Georgia, Athens, Georgia {todo} @cs.uga.edu Abstract changed. Our solution involves very few human efforts. Only the specification of the task, i.e., initial state and Web service composition has quickly become an goal of the task, has to be changed, suppose all Web important area of research in the services oriented services are already semantically annotated. architecture community. One of the challenges in In our solution, we extend the AI planning algorithm composition is the existence of heterogeneities between GraphPlan to generate the control flow of a Web process independently created and autonomously managed Web automatically. Our extension is that besides the service requesters and Web service providers. Most of precondition and effect of the operations, we also take the previous work in this area either involve too much into consideration in the planning algorithm the structure human effort or overlook the problem of data and semantic of the input and output message. Our heterogeneities, thus cannot automatically generate approach for the problem of data heterogeneities is a executable workflow for real-world problems. In this dedicated data mediation Web service. Let us say that paper we present a planning-based approach for solving message M1 need to be converted into message M2 since the problem of protocol heterogeneity and a service- they have different structure and/or semantic. The data based approach for data heterogeneity, without human mediation service takes as input the message M1 and the intervention. Our system successfully outputs a BPEL semantically annotated schemas of M1 and M2, gives as file which correctly and completely solves the problem output the message of M2 automatically. By separating in SWS Challenge 2006, which is a typical real-world the data mediation from the protocol mediation, our problem. system is highly loosely coupled. The protocol mediation system may concentrate on generating the control flow, 1. Introduction and make it easier to analyze the control flow. We test our solution by solving the problem in SWS Web services are software systems designed to Challenge 2006, which is a typical real-world problem. support interoperable machine-to-machine interaction Our system generates a BPEL process automatically over a network. They are the preferred standards-based according to the specification of initial state and goal, way to realize Service Oriented Architecture computing and semantically annotated Web service descriptions. (SOA). A problem that has seen much interest from the The generated BPEL process can be executed research community is that of automated composition of successfully and accomplishes the task. Web services to realize Web service compositions or We consider our contributions as follows: Web processes by leveraging the functionality of 1. Our solution can generate executable workflow autonomously created services. While SOA is promising (BPEL) automatically, including the control flow for its flexibility of loosely coupling, it brings and data flow. heterogeneities to the table inevitably. These 2. We extend GraphPlan algorithm for Web services heterogeneities in general may exist at many different composition. levels, such as the data level or a 3. We propose and implement a loosely coupled data communication/protocol level. It is necessary and mediation Web service. critical to overcome the heterogeneities in order to 4. We propose a context-based ranking algorithm for organize autonomously created Web services into a Data Mediation (We will discuss this in the section process to aggregate their power. of data mediation). Previous work in this regard has considered various approaches to composition, and have included use of 5. We propose a pattern-based approach for loop HTN[1, 2], Golog[3-5], classic AI planning[6], Rule- generation in planning. based planning[7, 8], model checking[9-11], theorem The remainder of this paper is organized as follows. proving[12-15], etc. Some solutions involve too much We first give some background information of the human effort; some overlook the problem of data problem of Web service composition in section 2, and heterogeneities. Overcoming both protocol and data then introduce a motivating scenario in section 3. In heterogeneities is the key to automatic generation of section 4, we discuss our approach for automatic Web executable process. service composition. The system architecture and The way to measure the flexibility of a solution is to implementation is briefly introduced in section 5, and see how much human effort is needed if the scenario is the evaluation result is given in section 6. Finally we draw a conclusion in section 7.
  2. 2. Semantic Templates: While a semantic Web service 2. Background and related work description represents a service provider on a concrete level, a semantic template captures the semantic 2.1 Background capabilities of a service provider on an abstract level. It is the way a service requester defines its task There are two categories of partners that are described specifications. We again represent a semantic template within the web services domain, namely the service in SAWSDL, very similar to Web service description, provider and service requester. 1 A service provider except that it is the specifications of a task, not of a presents its web service functionality by providing a set specific Web service. We will discuss the formal model of operation specifications (or operations for of semantic templates in a later session. abbreviation). These operations allow service requesters to use these services by simply invoking them. These 2.2 Related work operations might be inter-dependent. The dependency can be represented as precondition, effect, input, and [6] is similar to our work. They also use GraphPlan output specification of an operation. Using these algorithm to generate the process. While it is good to available operations, a service requester performs one or consider the interaction with the users, their approach more inter-related steps, also known as tasks, to achieve suffers from the extent of automation. An important the desired goal. Tasks can be best viewed as activities difference of their work from ours is that they don’t in a process and can be divided into smaller and more consider the input/output message schema when concrete sub-tasks, and eventually invocations of generating the plan. This issue is important, as an concrete operations. These tasks may be organized into operation’s precondition may be satisfied while there is a process specification which may consist of one or more no suitable data for its input message. Another problem activities that may be executed one or more times. of their work is that the only workflow pattern their Service requesters and providers are oftentimes system can generate is Sequence, although the autonomously created. This causes heterogeneity to composition process may contain other patterns. As the exist between the requester and provider when Web read may find in the motivation scenario, other patterns services are aggregated to perform a task. These such as Loop are also frequently used. heterogeneities in general may exist at many different [17] discusses using the pre- and post-conditions of levels, such as the data level or at a actions and automatic synthesis of Web services by communication/protocol level. We say that protocol finding a backbone path as the first step. One problem of heterogeneity exists when the goal of the service their work is that they assume task predicates are requester cannot be achieved by atomically invoking associated with ranks (positive integer), thus their exactly one operation once. On the other hand, data algorithm gives priority to the tasks with higher rank. heterogeneity exists when the output message of an This is infeasible if the Web services (tasks) are operation has different structure or semantics from the developed by independent organizations, which is the input message of the consecutive operation. In the above common case and the main reason leading to cases, process composition is needed. heterogeneities. SAWSDL: We describe Web services and Semantic The authors in [18] found a correlation between Templates (see next item) in SAWSDL. SAWSDL[16] is Hierarchical task network (HTN) planning and web a W3C standard to add semantics in Web services. service representation in the OWL-S framework. HTN “SAWSDL does not specify a language for representing planning uses the approach of refining plans by applying the semantic models, e.g. ontologies. Instead, it provides action, or task, decompositions. The idea is to divide mechanisms by which concepts from the semantic high-level tasks into smaller and smaller sub-tasks until models that are defined either within or outside the more primitive tasks can be performed directly. The WSDL document can be referenced from within WSDL benefit of this approach is that planning complexity may components as annotations.” 2 Semantic annotations be reduced for tasks that require many actions. However, facilitate process composition by eliminating how to divide high-level tasks itself is a problem. If ambiguities. We annotate a Web service by specifying human’s intervention is needed, it also suffers from the the Model Reference attribute of its operations, the extent of automation. Model Reference and Schema Mappings of the input and [11] proposes an approach of planning as model output message of its operations. We also extend checking. They encode OWL-S process models as state SAWSDL by adding precondition and effect in operation transition systems, and claim their approach can handle for protocol mediation, which will be discussed in later nondeterminism, partial observability, and complex sessions. goals. But it relies on the specification of OWL-S process models, i.e. the user need to specify the 1 “Web Services Glossary” (http://www.w3.org/TR/ws-gloss/), and the interaction between the operations. discussion of terminologies (http://www.w3.org/TR/2004/NOTE-ws- arch-20040211/#wordonspr). 2 http://www.w3.org/2002/ws/sawsdl/
  3. 3. [19] refers to this problem of protocol mediation as 4. METEOR-S approach for automatic Web orchestration. [20] presents a graphical tool to guide the service composition user to composite a process. METEOR-S Configuration approach (Karthik) 4.1 Semantically annotated Web service description 3. Motivation scenario WSDL is a widely accepted industry standard for describing Web services, and SAWSDL is a W3C The SWS Challenge 2006 mediation scenario version standard to add semantics in Web services. SAWSDL is 1 is a typical real-world problem where distributed expressive for functional and data semantics, and organizations’ systems are trying to communication with sufficient to solve the problem of semantic discovery and each others3. A customer (depicted on the left side of the data mediation. We extend SAWSDL by adding figure) desires to purchase goods from a provider precondition and effect in the operations for protocol (depicted on the right side of the figure) offering retail mediation. Precondition and effect are necessary because services. The anticipated process is in the mediator not all the states of a Web service are represented by the (depicted on the middle of the figure). input/output message. For example, both a book buying service and book renting service may take as the input the user ID and the ISBN, and give as the output the status “succeed” or “fail”. Formal model of Web service: Before giving our formal model of Web service, we need to introduce some definition of the basic building block of our model. Most classical AI planning problems are defined by the STRIPS representational language (or its variants like ADL), which divides its representational scheme into three components, namely, states, goals, and actions. For the domain of Web service composition, we extend the STRIPS language as the base representational language of our method. − Extended State. We extend a traditional workflow Figure 1. SWS Challenge 2006 mediation Scenario state by adding a set of semantic data types in order to ensure the input of an operation is available before it Both protocol and data heterogeneities exist in this is invoked. An extended state s has two components: s scenario. For instance, from the point of view of the = <SF, SDT>, where: service requester, i.e. Blue, placing an order is a one-step • SF is a conjunction of a set of status flags job (send PO), while the service provider, i.e. Moon, representing the status of a process by using concepts involves 4 operations (searchCustomer, createNewOrder, in a domain ontology. We use propositional logic for addLineItem, closeOrder). The message schemas they status flags, thus a status flag is a propositional use are not exactly the same. For example, Blue uses variable. Status flags with False value are prefixed by “fromRole” to specify the partner who wants to place an ¬ . We also assume an open-world assumption, i.e., order, while Moon uses “billTo” to mean the same thing. any status flag not mentioned in the state is unknown. The structures of the message schemas are also different. • SDT is a set of semantic data types representing the To make it worse, an input message may involves availability of data by using concepts in a domain information from two or more output message, for ontology. We use description logic for semantic data example, the operation “addLineItem” requires types. An example state could be: <ontology1#orderPlaced & ¬ ontology1#orderPaid, information from the order request message by Blue and the newly created order ID from the output message of (ontology2#orderID)> operation “createNewOrder”. In order to solve this problem successfully and − Condition. A condition also has two components as a automatically, the composition system at least should be state, except that a condition is a requirement for a able to do the following: generate the control flow of the state. mediator that involves at least two workflow patterns − For example, (Sequence and Loop) based on the specification of the <ontology1#orderPaid, ontology2#paidAmount> task and the candidate Web service(s), and convert (and − Semantic Web Service[21]. Our definition of a combine if needed) to a format of an input message. semantic Web service is based on the definition of WSDL and proposed SWS specifications – OWL-S and WSDL-S, with Interaction Protocol included. 3The reader may find the detail on http://sws- challenge.org/wiki/index.php/Scenario:_Purchase_Order_Mediation.
  4. 4. operation. It is defined as a 6-tuple which is similar to a SWS = ( {sop } ) i i sop, except that it is the requirement for an operation. Where, SWS is the union of a set of semantic The following table shows the representation of a operations (sop). semantic operation template SendPO from the scenario sop = <op:FunctionalConcept, input:SemanticType, in section 2. output:SemanticType, Pre, Effects, fault:SemanticFault> Table 2. Representation of semantic template Operation name Precondition Input Effect Output Where, a semantic operation (or operation) (sop) is SendPO haveCompanyInfo OrderInformation completeOrder^ Acknowledgement defined as a 6- tuple of the following: closeOrder • op is an operation mapped to a functional concept in a 4.3 Semantic discovery (1/2 page) domain ontology. • Pre is the precondition. It is a conjunction of status Based on modelReference flags stating which status flags must be true (or false) Based on WS-Policy before an operation can be executed. • Eff is the effect. It is a conjunction of status flags 4.4 Automatic Web service composition. describing how the status flags in a state changes when the action is executed. Status flags prefixed by 4.4.1 Formal definition of Web service composition. A ¬ are called negative effects, whereas others are semantic Web service composition problem is called positive effects. composing a set of semantic Web services (SWSs) to fulfill the requirement of one specific semantic operation • input is mapped to a set of semantic data types stating template. Before we give the definition of the problem, what data are required in order to execute the we need to define the initial state, the goal, the “apply” operation. and “satisfy” operator for the problem. • output is mapped to a set of semantic types stating For convenience, we use Eff(a) to represent the effect what data are available after the operation is executed. of operation a, SF(s) for the set of status flags in state s, • fault is the exceptions of the operation represented SF(eff) for the set of status flags in eff, value(sfs) for the using concepts in a domain ontology. value of status flag sf in state s, value(sf eff) for the value Table 1 illustrates an example of the representation of of status flag sf in effect eff, ST(s) for the set of semantic part of the Order Management System Web service data types in state s, and ST(c) for the set of semantic described in our running scenario. Using our definition, data types in condition c. we allocate the state parameters for “CreateNewOrder”, − Initial state. One of the possible extended states when “AddLineItem”, “CloseOrder”, and “PlaceOrder” the service requester’s process runs to an activity 4 accordingly. which defined by the semantic operation template soptp. It is defined by the precondition and input of the Table 1. Representation of order management semantic operation template. system Web wervice s0 = <Pre(soptp), In(soptp)> Operation name Precondition Input Effect Output CreateNewOrder haveCustomerID CustomerIdentification haveOrderID^ OrderIdentification Remember that an extended states included a set of ¬ completeOrder^ status flags and a set of semantic data types, so does an ¬ closeOrder initial state, since it is an extended state. AddLineItem haveOrderID^ LineItemEntry completeOrder LineItemSubmission ¬ completeOrde − Goal. A goal is a condition which is defined by the CloseOrder r completeOrder OrderIdentification closeOrder ConfirmedOrder effect and output of the semantic operation template soptp. 4.2 Semantic Template g = < Eff(soptp), Out(soptp)> − “Apply” operator. We use the notation “+” for While a semantic Web service definition represents a “apply” operator which is a function mapping an service provider on a concrete level, a semantic template extended state s and an operation (or pseudo captures the semantic capabilities of a service provider operation) a to a new extended state s’: on an abstract level. It is the way a service requester +: (s, a)  s’ (Alternatively, we may write s + a = s’) defines its task specifications SF ( s ' ) = SF ( s )  SF ( Eff (a )) Formal model of Semantic Template ∀sf ∈ SF ( Eff (a )), value( sf s ' ) = value( sf Eff ( a ) ) ST =  {sopt i } i ∀sf ∈ SF ( s ) ∧ sf ∉ SF ( Eff ( a )), value( sf s ' ) = value( sf s ) sopt = <Action o, Io, Oo, Preo, Effo, Fo> Where, a semantic operation template (sopt) is an abstract representation of the functionality of an 4There may be more than one initial state for an activity because of the non-deterministic behavior of the activities before. For the time being, we only handle one initial state.
  5. 5. That is, a positive/negative status flag in eff is also in There are two basic operations in every state-space- s’ with the same value, while any status flag in s but not based planning approach. First, the precondition of an in eff is assumed to remain in s’ with the same value as action needs to be checked to make sure it is satisfied by in s. the current state before the operation can be a part of the ST(s’) = ST(s)∪out plan. Second, once the operation is put into the plan, its That is, a semantic type in out is added to s’ if it is effect should be applied to the current state and thus not in s, while other semantic data types in s are also in produce a consecutive state. s’. We will handle deleting a semantic type when we We consider the significant differences between AI consider scopes in a process in our future work. planning and semantic Web service composition as − “Satisfy” operator. We use the notation “→” for follows: “satisfy” operator which is a function mapping an 1. Actions in AI planning can be described completely extended state s and a condition c to T or F: by its name, precondition, and effect, while Web services also include input and/or output message →: (s, c)  {T, F} schema. This function maps to T (we call it “s satisfies c” and we may write it as: s→c) if and only if: 2. AI planning, it is assumed there has an agreement within an application on the terms in the • c s, that is, for each status flag sfc ∈ SF(c), there is a precondition and effect. Terms with same name semantically equivalent status flag sfs∈ SF(s), such (string) mean the same thing, while terms with that their values are the same. Semantic equivalency different name (string) mean different things. For can be represented as “equivalentClass” relationship example, in the famous block world scenario, if in OWL. And both “block” and “box” exist in the • ∀st c ∈ ST (c), ∃st s ∈ ST ( s) |sts stc∨(stc is part of precondition/effect, they are treated as different things. This obviously is not scalable in the scope sts), that is for each semantic data type stc ∈ ST(c), of the Web, thus it is necessary to introduce there exists a semantic data type sts∈ ST(s), such that semantic for semantic Web service composition. stc subsumes sts, or stc is part of sts. In such case, we As discussed in the previous sections, both Web say sts is a compatible data type of stc. services and the specification of the task, i.e. Semantic − A semantic Web service composition problem is the Template are described in extended SAWSDL standard, following function: so the terms in the precondition, effect, and input/output p: (sopt, SWSs)  plan messages reach an agreement which is captured by the Where, ontologies. For the two types of differences we mentioned above, • sopt is a semantic operation template. to apply AI planning techniques to semantic Web service • SWSs is the union of the given semantic Web services. composition, any state-space-based planning algorithm • plan is a partially ordered set of operations. Every need to be revised according to the following guideline. sequence (total order) of operations (say a1, a2, … an) 1. State space should include status flags, as in the that satisfies the partial order must conform to the existing AI planning approaches, and semantic data following restrictions: types to represent the availability of data. s0 → <Pre(a1), In(a1)> 2. For each candidate action, besides checking its s0 + a1 = s1 precondition against the status flags in the current si → <Pre(ai+1), In(ai+1)> state, it is also necessary to check its input message si + ai+1 = si+1 schema against the semantic data types in the sn → g current state. Where s0 is the initial state, g is the goal, and a i ∈ 3. Since the states and the actions/operations are SWSs. Remember that a condition has two components: semantically annotated by referring to ontologies, set of status flags and set of semantic data types, so the the checking in the previous step should also refer condition for s0 consists of the precondition and input of to the ontologies, not just compare the name of the a1. terms. 4.4.2 Planning for Protocol Mediation. AI planning is 4. Once an action/operation is added into the plan, not only the status flags are updated by applying the a way to generate a process automatically based on the effect, the semantic data types should also be specification of a problem. Planners typically use updated by adding a new data type based on the techniques such as progression (or forward state-space output message schema. search), regression (or backward state-space search), and The idea can be illustrated informally in picture 2. partial-ordering. These techniques attempt to use combinatorial exploration methods such as searching, backtracking, and/or branching techniques in order to extract such a solution.
  6. 6. An operation may only be invoked when its Status flags preconditions exist in the current state level of the graph (i.e. pre(op) ∈ current graph level StatusFlags) and there is an input data type (in(op) ∈ current graph level Update Update semantic type). When an operation is placed in the Update graph, its effects as well as output data types are added Operation Operation Operation to the existing conditions of the previous state. Thus, the Create Create new current state level becomes the new effects and Create output data of the operations in addition to the previous state’s defining literals and data types. Afterwards, mutex links between actions must be evaluated and Semantic data types placed so that they may be used when backtracking through the graph for the solution. Note that the creation of the mutex links should also consider the semantic data types accordingly. Picture 2. Illustration of protocol mediation Pattern-based approach for loop generation. GraphPlan algorithm may generate plans only with Extended GraphPlan algorithm. Although most AI sequence and AND-split workflow patterns [23]. planning algorithms are suitable for the task here, we use However, loops are also a frequently used pattern, such GraphPlan algorithm for its time and space efficiency. as in the motivation scenario, only one line item can be The reader may get basic knowledge of GraphPlan in added into the order. Loop generation (or iterative [22]. It is sound and complete, and its compact planning) itself is a difficult and open problem in AI. representation of the states makes it space efficient while Lots of work on iterative planning is based on theorem- doing a breadth-first style search. It also uses mutex proving[24]. It is believed by Stephan and Biundo[25] links to avoid exploring some irrelevant space. and other researchers that iterative planning cannot be The general idea of the GraphPlan algorithm is to carried out in a fully automatic way. [26] proposes a new represent the combination of possible states and actions way that is not tied to proving a theorem, but it is only that may exist in the plan as an expandable graph data correct for a given bound or a certain class of simple structure, where nodes may be represented as state planning problems. parameters (status flags and semantic types) and actions, Here we proposed a pattern-based approach for loop and edges representing relations between the state generation. It is based on the observation of frequently parameters and actions. The structure consists of a used patterns of iterations. For example, in the sequence of alternating state and action levels which motivation scenario, the order request includes multiple consist of state parameters and actions (operations in the line items (an array of line items) while the addLineItem context of Web services), respectively, that may exist at operation take as input one line item only. It is obviously that particular level. The graph is expanded by adding that the process needs to iterate all the line items in the the effects of actions to the existing state. When the order request. We may extract the pattern as follows. If goal literals are contained in a state level, the algorithm an operation has input message which has semantic data attempts to extract a plan by backtracking through the type SDTi (not an array), and the semantic data types in action levels of the graph starting from the goal state. the current state have an array of SDTi (or other suitable Our approach requires an extension to the SDT, see “satisfy” operator), a loop is needed for this EXPANDGRAPH function to accommodate the semantic operation to iterate this array. data types defined above. Pseudocode for this procedure 4.4.3 Data Mediation. Most of the previous work in this is shown in figure 3. area focus on the generation of the control flow, hence overlook the problem of data heterogeneities and assume function EXPANDGRAPH(graph) there are no such problems or it is handled automatically create new action level ai in an unspecified way. We consider data mediation is create new state level si critical for generating executable workflows for real- for all op ∈ A world problems. For convenience, let us say that we if pre(op) ∈ SF(si-1) and in(op) ∈ ST(si-1) need to convert a message M1 with schema MS1 into a add op to ai message M2 with schema MS2, and let us call M1 add eff(op) to si source message, M2 target message, etc. add out(op) to si There are two types of heterogeneities in the message for all op ∈ ai schema of Web services: semantic and structure. create mutex links (if applicable) Semantic heterogeneities in message schema. terms with different names may refer to the same concept, or Figure 3. Pseudocode for EXPANDGRAPH terms with the same name may refer to different concepts. For example,
  7. 7. <name>John</name> may be a name of a person or a name of a dog, and The basic idea is that we traverse the target schema <num_of_kids>5</ num_of_kids> and tree/DAG in a top-down direction, and try to fill up each <num_of_children>5</ num_of_ children > node by using the data in the source message. Let us say may mean the same thing. The solution is to annotate the that we are handling the node Nt in the target schema. Nt message schema by using ontology concepts. This makes is filled up if one of the following happens: sure different Web services reach an agreement of the  Nt has the annotation of schema mapping and there semantics of the terms. is another node in the source schema with Structure heterogeneities in message schema. compatible data type and schema mapping, thus we <name>John Smith</name> and assume it can be converted into the target format <name> according to the schema mapping and we don’t <first_name>John</first_name> look into the sub-tree anymore. <last_name>Smith</last_name>  Nt is a leaf and there is another leaf in the source </name> schema with compatible data type. They have the same information but organized in different ways (schemas). There are two solutions for  All the nodes in the sub-tree of Nt is filled up. this type of problem.  Nt is allowed to be empty in the target message. SchemaMapping in SAWSDL. Similar to the idea of Context-based ranking algorithm. In case more than capturing agreements by ontologies, the attribute one node in the source schema is suitable, we have the “LiftingSchemaMapping” and following context-based ranking algorithm to select the “LoweringSchemaMapping” are to transform different best one automatically. This is necessary because a XSD schemas to a shared and agreed schema. The read may element may refer to another element by using “ref” refer to [16] for more information. attribute. For example, Matching algorithm. If the SchemaMapping attribute does not exist, but all the information in the target schema exists in the source schema, we use the following algorithm to convert the source message into a target message. status, mapCopy, mapTransform <- convert(targetElement) It calls the following recursive program to traverse the target schema tree/DAG. convert(currentElement) modelRef <- getMOdelRef(currentElement) Figure 5. Semantic difference because of different nullable <- whether currentElement is allowed to be empty context if currentElement is a leaf if modelRef is not null if there is an element in source with compatible modelRef Besides the model reference of the element “name”, if currentElement has schema mapping AND we need also look at the model reference of its parents in srcElement has schema mapping order to get the most accurate meaning of the data. The handle by schema mapping algorithm is a variant of edit distance algorithm. else handle by copying value from source to target getXpathSim (srcXpath, tarXpath) else srcArray <- ExtractElement(srcXpath) return nullable tarArray <- ExtractElement(tarXpath) else for i<-1 to length(srcArray) for j<-1 to length(tarArray) return nullable sim[i][j] = 0; if currentElement is an internal node elementSim <- compareElement(srcArray[i], tarArray[j]) if modelRef is not null if elementSim >= 0 if there is an element in source with compatible modelRef x = (1-fadingFactor) * sim[i-1][j-1] + nameSimMetric; if currentElement has schema mapping AND if x > sim[i][j] srcElement has schema mapping sim[i][j] = x; handle by schema mapping x = (1-fadingFactor) * sim[i-1][j]; if (x > sim[i][j]) else sim[i][j] = x; return convert(children) x = (1- fadingFactor) * sim[i][j-1]; else if (x > sim[i][j]) return convert(children) sim[i][j] = x; else return convert(children) Figure 6. Context-based data type ranking algorithm Figure 4. Semantic data type algorithm
  8. 8. Where “fadingFactor” is to give more weight to the elements near the current one, while make the elements Ontolog far away less important. This program calls the function Data below. Mediation Service Calls compareElement (srcElement, tarElement) Candidate if srcElement has modelRef AND tarElement has modelRef Discove Web BPEL r Planne if srcModelRef = tarModelRef OR isSubClassOf (srcModelRef, Semanti Task Initial State Engine services r Engin tarModelRef) c Parse & Goal e return 1 Templat r else return 0 Figure 8. System architecture else if isCompareNameString nameSim<- nameStrWeight*compareNames(srcElement,tarElement) 6. Evaluation if (nameSim > nameSimThreshold) Our system generates a BPEL file according to the return nameSim semantic template we created (part of it is in Figure 9). else return 0 We run it on Oracle BPM engine, and get the graphical else result in Figure 10. It placed an order successfully as we return 0 see the record in our account in SWS challenge 2006 server. The only thing we can’t do is the Figure 7. Algorithm of comparing two annotated XSD “confirmLineItem” operation, since it use Solicit elements Response message pattern which is not supported by BPEL. “isCompareNameString” is set by the user to indicate whether compare the similarity of the name of the <wsdl:message name="PIP3A4POR"> elements if no model reference is specified. Different <wsdl:part element="por:Pip3A4PurchaseOrderRequest" name="Request" string comparison algorithms such as Soundex may be sawsdl:modelReference="Ontology1#PurchaseOrderRequest"/> employed in the function “compareNames”. The </wsdl:message> similarity of the string will be returned only if it is above <wsdl:message name="PORAck"> a preset threshold “nameSimThreshold”. <wsdl:part element="ack:ReceiptAcknowledgment" name="Ack" sawsdl:modelReference="Ontology1#PurchaseOrderAck"/> For the above example, if the target xpath is </wsdl:message> “customer/email”, where “customer” is an equivalent <wsdl:portType name="MediatorPortType"> class of “fromRole” and “email” is an equivalent class of <wsdl:operation name="sendPO"> “EmailAddress”, our system gives score 0.252 and <wsdl:input message="impl:PIP3A4POR"/> <wsdl:output message="impl:PORAck"/> 0.16666667 to the left and right xpath respectively, thus <sawsdl:effect expression="confirmedOrder"/> successfully selects the best matched xpath. </wsdl:operation> Dedicated Data Mediation Web Service. We deploy </wsdl:portType> the above program as a dedicated data mediation Web Figure 9. Part of the semantic template service which converts and combines messages at run- time, thus alleviate the burden of data mediation from the generated process and make it easier to analyze the control flow. Our system is highly loosely coupled, enhances large-scale reusability and facilitates dynamic partner binding, especially at run-time. 5. Implementation and System Architecture Picture 8 is the overview of our implemented system. We implement the system in Java, and use Jena to handle the ontology. Note that if more than one ontology is involved, ontology matching/mapping is needed. We develop our SAWSDL API (SA-WSDL4J) to parse semantic templates and annotated Web service descriptions. We use IBM BPWS4J API to generate BPEL, and run it on Oracle BPM engine. Figure 10. Part of the invocation result 7. Conclusion
  9. 9. This paper presents an automatic approach for Web source message if more than one element has compatible service composition, addresses the problem of protocol semantic with the target element. heterogeneities and data heterogeneities by a planner and Our experiment shows that our systems solved the a data mediation service. Specifically, an extended problem in SWS challenge 2006 mediation scenario GraphPlan algorithm is employed to generate a BPEL successfully, which is a typical real-world problem and process based on the task specification (Semantic involves protocol and data heterogeneities. We consider Template) and candidate Web services described in our approach is highly flexible, since the only thing a SAWSDL. When the BPEL process is running, it calls a user need to change for a new scenario is the task dedicated data mediation service to convert (and specification (semantic template), and perhaps the combine if necessary) the available messages into the discover rules/preferences. format of an input message in order to invoke an operation in the BPEL process. A context-based ranking 8. References algorithm is created to select the best element from the

×