I thank the members of CiSOFT for inviting me to USC. (depending on audience, LA sun mention). I am Karthik Gomadam, PhD candidate at the Kno.e.sis Center, Wright State University. Started my PhD at UGA and migrated north in '07. Interests include Semantic Web, SOA, social computing. This talk has three parts.
My research in semantic web services is largely motivated by the challenges enterprises face, as they evolve and adapt to a global market place.
Let us review what has happened to a farmer, Grandpa Joe, in an increasingly flat world, within a period of two decades. Initially, Joe tilled the farm in his village and Gramma Doe sold the produce to the residents of the village.
Small businesses like this started to grow in a local context, spreading their market to neighboring locations. Soon Grandpa Joe started hiring other services such as delivery, but this was still manageable and he had a small service fleet.
And as business expanded even more, there was an interplay of many services such as inventory management, delivery, and retail.
And today we are in a flat world, one where the interplay and synergy is between providers across the world, each offering a service that the other consumes. Many new services such as currency forecasting and market analysis started to play a significant role in Grandpa Joe's business.
His business is now a global enterprise with partners everywhere.
The marketplace is a complex, dynamic, on-the-fly environment with many players having a significant stake.
To be successful, organizations need to be agile and their processes need to be flexible. This environment has many moving parts, such as currency fluctuations and political climate; not everything can be controlled. SOA is a paradigm that allows one to create applications that are flexible and adaptable to change.
A service refers to a discretely defined set of contiguous and autonomous business or technical functionality.
One can have many kinds of services: data services, which expose and share data; software as a service, where a software functionality can be remotely utilized; and platform as a service, where one can provide a suite of tools and expose an integration platform as a service.
Changing a particular component can mean just rewiring the system to use a different service provider and addressing data mediation issues with the new provider. SOA also promises reuse, especially since data models defined in XML have no binding to the underlying languages or implementation platforms. Integration of services is easier largely due to standards compliance in description and discoverability: one can find the services one needs, and the uniform structure of the descriptions enables easier integration.
What do we mean by agility? Let us consider a production flow of an enterprise in the supply chain domain. This example was created after talking to business managers at IBM, from experiences at Accenture Labs (a colleague of mine is a researcher) and from publications by Dell. We see a production flow of a product in the supply chain. There are two places that require agility. One is where a new product line is to be added. How quickly we can identify suppliers, get contracts across and update inventory determines how soon production can begin. Another example is when the inventory levels are low for a component (especially when a product exceeds forecast expectations). Can the current supplier handle a request? If not, is there another supplier who can meet the demand?
However, despite the slew of standards and specs, agility is still hard. Each organization brings its own business processes and vocabularies, and its data formats are governed by its legacy systems. Challenges one needs to address include finding partner services that fulfill functional and non-functional requirements; data mediation, given discrepancies in how data is defined and what it means; and creating processes that are flexible and can adapt to a changing environment.
Semantic Web Services has been an area of research to address these shortcomings.
How does one describe services and data models, capturing their meaning in addition to the structure?
How can one find the right set of partners that fulfill both functional and non-functional requirements? While the functional requirements capture the desired functionality, important requirements such as service level agreements and business requirements also need to be fulfilled. An example from computer manufacturing: I want a supplier for memory chips, 533 MHz DDR 3, minimum quantity: 100000, price.., supply time.
Data mediation is very important for interoperability; it needs to be addressed before interop can be realized. Interop is a major promise of SOA; 90% of the time goes into data mediation. Seen in usage and learnt.
The last challenge is in integration and execution. How does one find and bind partners at run time? What happens if, say, a partner does not keep up his guarantees? How does one identify what events affect which process instances?
Semantic Web technologies play an important role in addressing a significant portion of these problems: semantic metadata for description; reasoning and contract advertisements for discovery; model-driven data mediation; and optimization and adaptation in dynamic integration.
The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. At the heart of semantic Web is an ontology, a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to define the domain.
Semantic interface contract; Contract first;
My research addresses many aspects of these problems. For my dissertation, I seek to create a flexible and dynamic services environment, leveraging Semantic Web techniques. Two popular approaches to realizing services have emerged. One is the Web services approach, with WSDL / SOAP and other standards and specifications. The other, the RESTful approach, is much more lightweight. I will discuss a few aspects of my work in the RESTful context and conclude the first part of my presentation.
Semantics for services can be classified into four types: Data, Functional, non-functional and execution.
Order : Data ; Supply: Functional; Fail: execution; 5 business days and secure: non-functional
How can one capture the different types of semantics in service descriptions?
WSDL is the standard for service description. It captures the operations, messages and invocation information. But how does one capture what a service does? SAWSDL is a way of adding semantic metadata to WSDL service descriptions. The metadata captures the meaning of an element in WSDL and comes from an ontology. It is an evolutionary and compatible upgrade of existing Web services standards.
SAWSDL is agnostic to ontology representation languages (although the W3C-recommended RDFS or OWL are likely to be often used); it enables reuse of existing domain models (in some domains, usable ontologies have been built, e.g. life science and health care); and it allows annotation using multiple ontologies (from the same or different domains).
Model references are at the heart of SAWSDL. A model reference is an attribute that grounds a WSDL element to a concept in the semantic model.
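To make the model reference concrete, here is a minimal Python sketch that reads the sawsdl:modelReference attribute from a schema fragment. The SAWSDL namespace URI is the one from the W3C recommendation; the element names and the example.org ontology URIs are made up for illustration and are not taken from any real service.

```python
import xml.etree.ElementTree as ET

SAWSDL_NS = "http://www.w3.org/ns/sawsdl"
XS_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment: element names and ontology URIs are illustrative only.
SCHEMA = f"""
<xs:schema xmlns:xs="{XS_NS}" xmlns:sawsdl="{SAWSDL_NS}">
  <xs:element name="OrderRequest"
      sawsdl:modelReference="http://example.org/purchaseorder#OrderRequest"/>
  <xs:element name="itemCode" type="xs:string"
      sawsdl:modelReference="http://example.org/purchaseorder#ItemCode"/>
  <xs:element name="comment" type="xs:string"/>
</xs:schema>
"""

def model_references(xml_text):
    """Map each annotated element name to its list of ontology concept URIs."""
    root = ET.fromstring(xml_text)
    attr = f"{{{SAWSDL_NS}}}modelReference"
    refs = {}
    for elem in root.iter(f"{{{XS_NS}}}element"):
        value = elem.get(attr)
        if value:
            # A modelReference value is a whitespace-separated list of URIs.
            refs[elem.get("name")] = value.split()
    return refs

annotations = model_references(SCHEMA)
```

The unannotated "comment" element is simply skipped, which reflects the evolutionary nature of the approach: annotations can be added to some elements without touching the rest of the description.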
These services expose various resources on the Web such as feeds and APIs.
Their descriptions are easy for humans to read and understand but hard for machines. The problems related to description and interop still remain.
During the standardization process of SAWSDL in 2005, we realized that there was an emerging paradigm of services, one that did not necessarily have a WSDL for description.
SA-REST is an approach to add rich semantic markup to web resource descriptions. It builds on top of microformats, which have become an easy way to add semantic markup for calendar entries, contact information, etc. I am currently editing the W3C submission of SA-REST, as a part of the W3C incubation group for SWS. From these markups, one can extract an RDF representation of the resource that can be used in search and data integration (when resources are used in a mashup). (Yahoo already has provision for using RDFa to extract additional semantic meta information while crawling.) Feedbooks allows you to browse using many facets such as theme and author. Let us look at an example markup of a simple Web page.
SA-REST can be extended and microformats can be created to mark up a specific kind of resource. We have done this with hRESTs, a microformat for RESTful service markup.
SAWSDL allows for service providers to capture semantics within service descriptions. Two parties in the SOA game are the provider and requestor. How can a requestor describe the requirements?
Semantic Templates: a way of capturing data / functional / non-functional / execution semantics for requirements. Techniques for adding semantics follow SAWSDL principles.
Template Term: Functional requirement (as Operation); Data requirement (as Inputs and Outputs). Term Policy: Non-Functional requirement; Assertions and constraints.
I will now discuss the impact of Semantic annotations in service discovery, data mediation and execution.
Selecting the best set of business partners, suppliers, and contractors is an indispensable component in business process infrastructure. Ideally, partner selection should take into account both functional requirements that describe what task the partner is expected to perform and non-functional requirements that may include business parameters such as the cost and quality of the service provided by the partner.
While the keyword-based search paradigm is extremely successful in the context of Web search, keywords are not sufficient to describe the desired functional and non-functional aspects of services.
The other paradigm supported by UDDI is interface-based discovery. In this approach, certain popular interfaces can be published in a registry and services conforming to them can be classified as such. This approach has the limitation that the interface itself is treated as a black box and there is no mechanism to compute relationships between the interfaces.
Our observation is that a number of services with similar functionality may have syntactically different interfaces, but similar or even equivalent semantic signatures.
SEMRE is a semantic Web services registry based on UDDI, the WS registry standard. There have been many approaches to SWS registries, notably many that use UDDI. The key benefits of SEMRE include:
SEMRE supports semantics natively. Many approaches exist that use SWS and UDDI; however, they force-fit semantic metadata into existing UDDI data structures. In SEMRE, we extend UDDI data structures to support semantics, while taking care that their external interfaces do not change. In fact, SEMRE can function as plain old UDDI as well. Plus, ontology management is incorporated into the registry.
For the registry to select services based on the functional and non-functional requirements, it must allow service providers to publish their non-functional capabilities.
The criteria for selection must not be too restrictive, since it may be very difficult to find services that exactly match the requirements. The requester must be able to specify the expected level of match for the different aspects of the request, such as data, functional and non-functional requirements.
The requester must be able to create requirements that describe his functional and non-functional needs. Such a requirement must also allow the requester to specify the level of match expected for the different elements of the requirement.
The SSI captures the semantic annotations of the domain and the operation information of a service interface (SI) or a template term in a semantic template: what does this service offer, and in what domain? We first calculate the SSI. $M_s$ is the semantic annotation on the interface.
$\theta_i$: a tuple consisting of the annotation on the operation and the data elements.
The interface relationship captures the relationship between two semantic interface signatures over a semantic meta-model: relationships over the domain, operation and data models. It is a fine-grained representation of the interface relation. For example, two services can offer the same functionality, but one offers a more generic data model.
For example, if a service is published with RPO for memory, it is added as an instance of the semantic service model for RPO Memory. The syntactic label on the operation is not used for indexing; instead we see the service as an implementation of the RPOMemory concept in the ontology. The interface relationship will capture the fact that this service is related to RPO storage services via subclass. This calculation is done lazily by the system and the relationships are indexed for efficient search.
We consider four levels of match: equivalence, generalization, and subsumption, as well as discovery based on named relationships. In life sciences, for example, one must look for services using named relationships. Example from SEMBOWSER or GlycO.
Start from the semantic interface signature of the request (the Semantic Template). The fulfillment set is the set of domain, operation, and data relationships which must exist between a semantic interface signature and the request. The fulfillment set over domain ($R_F^S$) consists of all domain relationships that are stronger than the expected level of match. The fulfillment sets over operations ($R_F^\theta$) and data elements ($R_F^D$) are similarly defined. We compute these in parallel (each independent of the others). A possible optimization also includes MapReduce.
The Cartesian product of these sets gives the set of all possible combinations of domain, operation and data relationships that fulfill the request.
Services that are instances of such interfaces are returned.
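The functional match described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the strength ordering of the match levels, the reading of "stronger than" as "at least as strong as", and the service names in the index are all hypothetical, and named relationships are left out.

```python
from itertools import product

# Assumed strength ordering of match levels (higher is stronger); the notes
# name the levels but do not fix this ordering.
STRENGTH = {"generalized": 1, "subsumption": 2, "equivalent": 3}

def fulfillment_set(expected):
    """Relationships at least as strong as the expected level of match."""
    return {r for r, s in STRENGTH.items() if s >= STRENGTH[expected]}

def fulfilling_relations(expected_domain, expected_operation, expected_data):
    """Cartesian product of the three independently computed fulfillment sets."""
    return set(product(
        fulfillment_set(expected_domain),
        fulfillment_set(expected_operation),
        fulfillment_set(expected_data),
    ))

def discover(index, expected_domain, expected_operation, expected_data):
    """Return services whose precomputed interface relation fulfills the template."""
    wanted = fulfilling_relations(expected_domain, expected_operation, expected_data)
    return sorted(svc for svc, relation in index.items() if relation in wanted)

# Hypothetical precomputed index: service -> (domain, operation, data) relation
# between its interface and the requested template.
INDEX = {
    "memory-supplier-a": ("equivalent", "equivalent", "equivalent"),
    "memory-supplier-b": ("equivalent", "subsumption", "generalized"),
    "storage-supplier": ("generalized", "generalized", "generalized"),
}

matches = discover(INDEX, "equivalent", "subsumption", "generalized")
```

Because the three fulfillment sets are computed independently, each could be evaluated in parallel before the product is taken; the lookup itself only consults relations that were indexed ahead of time.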
Non-functional match. We define a policy as a collection of alternatives, and each alternative is in turn a collection of assertions. On the set of services that fulfill the functional requirements, we further identify those that fulfill the non-functional requirements.
A policy is a disjunction of alternatives. Rules in the model can connect assertions: for example, production time + shipping time = supply time. A supplier can have assertions saying production time is 5 days and shipping time is 3 days. The requestor wants supply time to be less than 10 days. If this rule is captured in the model, we apply the rule to the assertion values. Each operation policy can be independently computed for efficiency and aggregated in the end.
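A small Python sketch of this non-functional match follows. It assumes the supplier asserts a production time of 5 days and a shipping time of 3 days, so that the supply time has to be derived through the rule; the dictionary layout, the function names and the additive form of the rule are illustrative choices, not the registry's actual data model.

```python
def effective_assertions(alternative, rules):
    """Extend one alternative with values derived through the model's rules."""
    derived = dict(alternative)
    for target, parts in rules.items():
        # Apply a rule only when the target is not asserted directly
        # and every element the rule needs is available.
        if target not in derived and all(p in derived for p in parts):
            derived[target] = sum(derived[p] for p in parts)
    return derived

def policy_matches(policy, constraints, rules):
    """A policy (a disjunction of alternatives) matches if any one
    alternative satisfies every requested constraint."""
    for alternative in policy:
        values = effective_assertions(alternative, rules)
        if all(name in values and check(values[name])
               for name, check in constraints.items()):
            return True
    return False

# Assumed rule from the example: supply time = production time + shipping time.
RULES = {"supply_time": ("production_time", "shipping_time")}

supplier_policy = [{"production_time": 5, "shipping_time": 3}]   # days
request = {"supply_time": lambda days: days < 10}

supplier_ok = policy_matches(supplier_policy, request, RULES)
```

Here the derived supply time is 8 days, so the supplier fulfills the request even though it never asserted a supply time directly.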
Significant work in SWS discovery has been on IOPE matching, so we compared our system against it. The evaluation consisted of creating a test scenario where 42 services matched a requirement (functional only, since other approaches do not consider non-functional requirements). IO match yielded 153 services and IOPE did better with 87, but IO with operation and domain using our approach was precise. We do note here that we tested on services that are completely annotated. More annotation meant more processing, and this test set served well for timing tests. We also note that all three approaches had the same benefit of high-quality annotation.
Discovery time vs. number of interfaces. Note that this is nearly constant; this is because of the interface relationship precomputation. The spikes are in cases where the match criterion was made very flexible.
Discovery and ranking for RESTful services. There are more than a few thousand APIs available. How does one find the right APIs for an application? apiHUT is a Web API search engine for faceted search. Faceted search allows for more directed and fine-grained search, for example: I want to find a mapping service that uses the REST protocol and supports JSON. One can search either via the Web interface or by using the RESTful service interface and posting semantic templates. Our ranking approach, called Serviut rank, borrows the ideas of inlinks and outlinks from PageRank. The four metrics we consider are: the mashups that use this service; the mashups that use other services with the same functionality; the total services in the domain; and the total mashups in the domain.
The SA-REST annotations of the APIs use the apiHUT taxonomy, illustrated here as the semantic model; the semantic model is available in RDFS.
Our search engine uses a hybrid of Bayesian statistics and TF-IDF for text analysis and classification, along with available semantic markups.
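The faceted query from the example (a mapping service that uses REST and supports JSON) can be illustrated with a short Python sketch. The catalog entries, facet names and the faceted_search function are hypothetical and only show the idea of filtering on several facets at once; this is not apiHUT's implementation and it does not cover ranking.

```python
# Hypothetical catalog entries; API names and facet values are illustrative.
CATALOG = [
    {"name": "maps-api-one", "category": "mapping",
     "protocols": {"REST"}, "formats": {"JSON", "XML"}},
    {"name": "maps-api-two", "category": "mapping",
     "protocols": {"SOAP"}, "formats": {"XML"}},
    {"name": "photo-api", "category": "photos",
     "protocols": {"REST"}, "formats": {"JSON"}},
]

def faceted_search(catalog, **facets):
    """Return names of entries that satisfy every requested facet.

    A set-valued facet must contain the requested value;
    a single-valued facet must equal it.
    """
    results = []
    for entry in catalog:
        ok = True
        for facet, wanted in facets.items():
            value = entry.get(facet)
            ok = wanted in value if isinstance(value, set) else value == wanted
            if not ok:
                break
        if ok:
            results.append(entry["name"])
    return results

hits = faceted_search(CATALOG, category="mapping", protocols="REST", formats="JSON")
```

Each additional facet narrows the result set, which is what makes the search more directed than a plain keyword query.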
A key SAWSDL upgrade is a systematic approach to data mediation.
Meta approach to data mediation.
Rather than mediate between individual schemas, one can write reusable mediation scripts between concepts in the metamodel. Once individual schemas are annotated, one can create a lifting schema mapping and a lowering schema mapping.
liftingSchemaMapping: This can be used to specify mappings between WSDL Type Definitions in XML and semantic data. loweringSchemaMapping: This can be used to specify mappings between semantic data and WSDL Type Definitions in XML.
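As a rough illustration of lifting and lowering, the Python sketch below mediates between two schemas by going through shared ontology concepts. In practice such mappings are often written in languages such as XSLT; plain Python functions are used here only to show the direction of each mapping. The element names, the ex: concept names and the sample message are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotations: element name -> ontology concept, one table per schema.
SOURCE_ANNOTATIONS = {"zip": "ex:PostalCode", "town": "ex:City"}
TARGET_ANNOTATIONS = {"postalCode": "ex:PostalCode", "cityName": "ex:City"}

def lift(xml_text, annotations):
    """Lifting: XML instance -> semantic data keyed by ontology concept."""
    root = ET.fromstring(xml_text)
    return {annotations[child.tag]: child.text
            for child in root if child.tag in annotations}

def lower(semantic_data, annotations, root_tag):
    """Lowering: semantic data -> XML instance in the annotated schema."""
    root = ET.Element(root_tag)
    for tag, concept in annotations.items():
        if concept in semantic_data:
            ET.SubElement(root, tag).text = semantic_data[concept]
    return ET.tostring(root, encoding="unicode")

source_message = "<address><zip>45435</zip><town>Dayton</town></address>"
semantic = lift(source_message, SOURCE_ANNOTATIONS)
target_message = lower(semantic, TARGET_ANNOTATIONS, "location")
```

Neither schema refers to the other: each is mapped only to the shared concepts, so adding a third schema needs one more pair of mappings rather than a mapping to every existing schema.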
Say we have discovered two services for image search: Live search and Yahoo image search. Given my target schema, which one should I choose so that mediation between this service and my mapping service is easier? My mapping service uses the target schema.
Mediatability is a metric that estimates the ease of mediation between two schemas for a human. It is a measure between 0 and 1, where 1 is hard to mediate and 0 is easy to mediate.
Much of the work in mediation has focused on automatic mediation, which is hard and still not satisfactory. We thought it might be better to aid manual mediation.
Two-step process.
A dynamic process is one in which partners of the process are bound during the execution. This process is created using semantic templates as process partners. Shown here is an example of a dynamic process in the domain of supply chain. The METEOR-S middleware is a semantic WS middleware capable of executing dynamic processes. This work was published in ICSOC 2005, ICWS 2006.
We divide the process into three states: prebinding, binding and post-binding. For each process instance, the middleware maintains the state of the process. For each semantic template, the middleware instantiates a service manager. The binding information of the service, along with its constraints, is maintained by the service manager. Implementations use Apache Axis2 and Apache Synapse.
Standalone middleware over any BPEL engine. Proxy-based approach.
For mashups, we adopt a declarative paradigm (keeping in mind the scripting differences and the lightweight paradigm). Snapshot of our first-generation integration tool, developed in collaboration with IBM Almaden services research. Available at IBM alphaWorks.
Smart mashups are applications generated from a high-level specification and are configured during their execution. Example: housing maps (Craigslist + Google Maps); I want to show neighborhood information using street view. Depending on the city the user picks, I will choose the mapping API that shows the street view.
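The run-time choice of mapping API in the housing example could look roughly like the Python sketch below. The provider names, the coverage table and the bind_mapping_api function are hypothetical; a real mashup would obtain this information from the service descriptions rather than a hard-coded table.

```python
# Hypothetical capability table; provider names and city coverage are illustrative.
MAPPING_APIS = [
    {"name": "maps-provider-a", "street_view_cities": {"Los Angeles", "New York"}},
    {"name": "maps-provider-b", "street_view_cities": {"Dayton"}},
]

def bind_mapping_api(city, apis=MAPPING_APIS):
    """Pick, at run time, the first mapping API offering street view for the city."""
    for api in apis:
        if city in api["street_view_cities"]:
            return api["name"]
    return None  # no provider fulfills the requirement for this city

chosen = bind_mapping_api("Dayton")
```

The mashup specification only states the requirement (street view for the chosen city); which provider satisfies it is decided when the user picks a city.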
Dynamic and Agile SOA
using Semantic Web
Wright State University.
Figure: production flow in the supply chain. Panels: (a) Production flow, (b) Add a new product line, (c) Inventory Management. Steps include identifying components, ordering components, updating inventory, adding to production, packaging, and inspection and testing, with decisions on whether inventory levels are low and whether to obtain the order from an alternate supplier.
Semantic Web + Web Services =
• Semantic markup of service description
• Use of reasoning for discovery and integration
Figure: contract-first publication, two panels. In the first panel, Manufacturer 1 and Manufacturer 2 each create and publish a service interface contract (1 and 2) in WSDL, and Service Provider 1 and Service Provider 2, each with a private registry, publish services that adhere to the service interface contracts of the manufacturers. In the second panel, the manufacturers create and publish semantic interface contracts (1 and 2) in SAWSDL, annotated with concepts from the ontology, and the service providers publish services that adhere to the ontology.
$IR(S_i, S_j) = (R_S, R_\theta, R_D)$
Figure: ontology fragment showing the concepts RequestPurchaseOrder, CancelOrder, RequestPurchaseOrderHardDisk, RequestPurchaseOrderMouse and RequestPurchaseOrderKeyBoard, connected by is_followed_by relationships.
1. semantic interface signature (Semantic Template)
2. identify fulfillment set: $R_F^S$, $R_F^\theta$ and $R_F^D$
3. all fulfilling interfaces:
4. Interface relation in the set defined above
1. Compute the effective policy of both request and
2. Comparison between policies is by comparing
assertions in each alternative
3. If there is no assertion with a matching capability, is
there a rule connecting assertion elements?
4. Aggregate matches.
Figure: Number of Matches, by match type: Data Only Match; Data and Operation; Data and Operation and ...
Figure: Service Publication Time, with interfaces published. Y axis: Time - Milliseconds. X axis: Number of services (0 to 1000).