Final Defense


  • Homer wants to drive from Athens, GA to NYC.
  • He would like to have his favorite burger all along the way.
  • He drives to the Apple store, gets his iPhone, and excitedly downloads the AroundMe application, which lets him find restaurants near him.
  • He then realizes that he has to do too much work before he can find a McDonald's!
  • He sits down to create his own application, because he has heard that even nine-year-olds can create apps.
  • But it is not that easy for an average user!
  • We break this task down into four steps.
  • The first is to find the right services that give Homer all the information he needs.
  • Second, when one service responds, we need to send its output to the next service. This is not easy, so we resolve the mismatches and make the services talk to each other.
  • Third, once we have made them talk, we make sure they are all integrated.
  • Finally, all is good, but when Homer is using the app, things can go wrong! How can we identify a problem when it happens?
  • In this talk, I will discuss my research in addressing these four questions. We have attempted to address significant parts of each problem and have used Semantic Web techniques throughout. First, I will very briefly talk about Service Oriented Architecture. SOA is at the heart of my work.
  • One can have many kinds of services: data services, which expose and share data; Software as a Service, where software functionality can be used remotely; and Platform as a Service, where one provides a suite of tools and exposes an integration platform as a service. SOA allows us to create software that is easy to configure, flexible, and agile.
  • Along with SOA, my work also employs Semantic Web techniques. The Semantic Web is an evolving development of the World Wide Web in which the semantics of information and services on the Web are defined, making it possible for both people and machines to understand and satisfy requests for Web content. In this work, we largely employ RDF as the framework for modeling metadata.
  • RDF expresses resources in the form of subject-predicate-object expressions, called triples. In the above example, the resource me is the subject, fullName is the predicate, and Eric Miller is the object; similarly for the other nodes and edges. RDF represents a labeled, directed multigraph. For those of you unfamiliar, this example is from the Wikipedia page on RDF and will be familiar when you look it up.
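To make the triple model concrete, here is a minimal sketch in plain Python (tuples in a set rather than a real RDF library); the URI is illustrative, following the Eric Miller example:

```python
# A tiny illustration of the RDF triple model: each statement is a
# (subject, predicate, object) tuple, and a graph is a set of triples.
EM = "http://www.w3.org/People/EM/contact#me"  # illustrative URI

graph = {
    (EM, "fullName", "Eric Miller"),
    (EM, "mailbox", "mailto:em@w3.org"),
    (EM, "personalTitle", "Dr."),
    (EM, "type", "Person"),
}

def objects(graph, subject, predicate):
    """All objects of triples matching (subject, predicate, ?o)."""
    return {o for s, p, o in graph if s == subject and p == predicate}

print(objects(graph, EM, "fullName"))  # {'Eric Miller'}
```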
  • The other area of research that this work borrows from, and has contributed to, is Semantic Web Services. Simply put, SWS, as it is called, is the addition of semantics to Web services. There are many approaches to doing this.
  • This bringing together of SW and WS can be done in two ways. In the first, services are added into the Semantic Web.
  • In doing so, we represent aspects of a service, such as its inputs, outputs, and operations, using semantic models.
  • Significant in this approach is OWL-S, which provides a semantic model to capture the profile of a service (read from the picture). OWL-S relies on description logic.
  • Goals: the client's objectives when consulting a Web service.
    Ontologies: a formal semantic description of the information used by all other components.
    Mediators: connectors between components with mediation facilities; they provide interoperability between different ontologies.
    Web Services: semantic descriptions of Web services, which may include functional (capability) and usage (interface) descriptions. WSMO relies on F-Logic.
  • The second approach is one where we ground service descriptions in semantic metadata. This is a bottom-up approach and does not require a significant change in how a service is perceived.
  • The METEOR-S project adopts this approach.
  • We classify the semantics for services into four types.
  • We capture these within a service using modelReference, an extensibility attribute. The WSDL schema allows extensibility attributes to represent additional properties beyond the WSDL description. We define an extensibility attribute called modelReference that allows the addition of semantic metadata to WSDL elements.
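A minimal sketch of reading such an annotation with Python's standard XML library; the schema element and the ontology URI are made up for illustration, while the sawsdl:modelReference attribute and its namespace follow the SAWSDL recommendation:

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated schema fragment: the sawsdl:modelReference
# extensibility attribute points the element at a concept in a
# semantic model. The example.org ontology URI is illustrative.
SAWSDL = "http://www.w3.org/ns/sawsdl"
fragment = """
<xs:element xmlns:xs="http://www.w3.org/2001/XMLSchema"
            xmlns:sawsdl="http://www.w3.org/ns/sawsdl"
            name="PurchaseOrderInput"
            sawsdl:modelReference="http://example.org/rosetta#PurchaseOrder_Input"/>
"""

elem = ET.fromstring(fragment)
# ElementTree exposes namespaced attributes as {namespace}localname
model_ref = elem.attrib["{%s}modelReference" % SAWSDL]
print(model_ref)  # http://example.org/rosetta#PurchaseOrder_Input
```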
  • Semantic templates are often more formal and have many enterprise-quality features, such as support for policy. So we will deviate a bit from Homer Simpson and look at a game manufacturer. (In this example, talk about the template.)
  • These services expose various resources on the Web, such as feeds and APIs.
  • These are easy for humans to read and understand, but hard for machines. The problems of description and interoperability still remain.
  • During the standardization process of SAWSDL in 2005, we realized that there was an emerging paradigm of services, one that did not necessarily have a WSDL description.
  • SA-REST is an approach to add rich semantic markup to Web resource descriptions. It builds on top of microformats, which have become an easy way to add semantic markup for calendar entries, contact information, etc. I am currently editing the W3C submission of SA-REST as part of the W3C Incubator Group for SWS. From these markups, one can extract an RDF representation of the resource that can be used in search and in data integration (when resources are used in a mashup). (Yahoo already has provision for using RDFa to extract additional semantic meta-information while crawling.) Feedbooks allows you to browse using many facets, such as theme and author. Let us look at an example markup of a simple Web page.
  • Everything in this site is related to basketball; we capture this information at this high level.
  • Really deep markups: I want to mark up, for each thumbnail link, information about that episode.
  • We look at the body element of the entry page of ESPN's basketball site and add markup that says that the site content belongs to the domain described in this semantic model (in this case, dbpedia's basketball).
  • When used without site-, domain-rel simply denotes the domain of a block of content; it is a block markup. Within a site, when domain-rel is used on an inner resource, the domain of that resource is what is given by this domain-rel.
  • In our example, the article talks not just about iPhone apps but also about unit conversion (which is what the app does).
  • In this case, we enumerate the domains (in no specific order).
  • Not all links are created by us; say we are blogrolling. There is no guarantee that the target resource of a link is marked up too. But being socially responsible, we want to give some hints about what to expect off a link. sem-rel also allows enumeration.
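A rough sketch of how such annotations can be pulled out of a page with Python's standard HTML parser; the markup string mirrors the domain-rel example from the slides, and the class/title convention follows SA-REST:

```python
from html.parser import HTMLParser

# Collect the title of any element whose class list includes
# domain-rel or sem-rel. This is a simplified extractor, not the
# full SA-REST-to-RDF pipeline.
class SaRestExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.annotations = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        classes = a.get("class", "").split()
        if "domain-rel" in classes or "sem-rel" in classes:
            # a title may enumerate several domains, space separated
            self.annotations.append(a.get("title", "").split())

page = ('<p class="domain-rel" '
        'title="dbpedia:iphone dbpedia:unit_conversion">review</p>')
extractor = SaRestExtractor()
extractor.feed(page)
print(extractor.annotations)  # [['dbpedia:iphone', 'dbpedia:unit_conversion']]
```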
  • We present a quick comparison of prior research and this thesis; in the dissertation I have discussed this in detail. I have compared the theses of Dr. Paolucci, Dr. Mocan from WSMO, and Dr. Zhao from UGA (who worked on RESTful composition with only control-flow considerations).
  • Two months ago, in my prospectus, I presented our approaches to these questions.
  • The first is to find the right services that give Homer all the information he needs.
  • I presented our approach to faceted API search, SWS discovery, and data mediation. I will briefly discuss these contributions today.
  • While the keyword-based search paradigm is extremely successful in the context of Web search, keywords are not sufficient to describe the desired functional and non-functional aspects of services.

    The other paradigm supported by UDDI is interface-based discovery. In this approach, certain popular interfaces can be published in a registry and services conforming to them can be classified as such. This approach has the limitation that the interface itself is treated as a black box and there is no mechanism to compute relationships between the interfaces.

    Our observation is that a number of services with similar functionality may have syntactically different interfaces, but similar or even equivalent semantic signatures.
  • The SA-REST annotations of the APIs used the apiHUT taxonomy, illustrated here as the semantic model; the semantic model is available in RDFS.

    Our search engine uses a hybrid of Bayesian statistics and TF-IDF for text analysis and classification, along with the available semantic markups.
  • Serviut rank is our approach to ranking. It is very similar to the popular PageRank, which is computed using inlinks and outlinks. Serviut rank employs a similar strategy, using the number of mashups that use a service as a positive referral, the total number of services that do a given task, and the total number of mashups. In addition, we also use the popularity of the application itself, as given by Alexa.
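A toy sketch of this intuition (not the published Serviut rank formula): mashup referrals relative to the category, blended with a traffic signal. The 0.7/0.3 weights and the example numbers are made up for illustration:

```python
# Toy sketch of the Serviut-rank intuition: a service that appears in
# more mashups, relative to competing services in its category, ranks
# higher; an Alexa-style traffic score contributes the rest.
def serviut_score(mashups_using_service, total_mashups_in_category,
                  traffic_score):
    referral = mashups_using_service / max(total_mashups_in_category, 1)
    return 0.7 * referral + 0.3 * traffic_score  # arbitrary weights

# two hypothetical mapping APIs competing in the same category
score_a = serviut_score(40, 50, traffic_score=0.9)
score_b = serviut_score(10, 50, traffic_score=0.4)
print(score_a > score_b)  # True: more referrals and more traffic
```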
  • This builds upon what is called contract-first development: WSDL is the contract that binds a provider and a requestor. Rather than a keyword-driven contract, SEMRE adopts SAWSDL and enables a semantic contract, which allows us to reason at the meta level while computing matches.
  • In parallel, each attribute fulfillment is computed independently: the map function computes the fulfillment, and reduce aggregates the results.
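The map/reduce split can be sketched as follows; the attributes, the equality-based scoring, and the averaging are stand-ins for the real semantic match computation:

```python
from functools import reduce

# Each attribute of a request template is scored independently (map),
# then per-attribute scores are aggregated into one match score
# (reduce). Attribute names and values are illustrative.
request = {"input": "rosetta:PurchaseOrder_Input",
           "output": "rosetta:PurchaseOrder_Output",
           "encryption": "RSA"}
candidate = {"input": "rosetta:PurchaseOrder_Input",
             "output": "rosetta:PurchaseOrder_Output",
             "encryption": "DES"}

def fulfillment(attr):
    # stand-in for the semantic match of a single attribute
    return 1.0 if request[attr] == candidate[attr] else 0.0

scores = list(map(fulfillment, request))   # independent, parallelizable
match = reduce(lambda a, b: a + b, scores) / len(scores)
print(match)  # 2 of 3 attributes fulfilled
```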
  • We published 200 services, of which 42 were relevant for a request. Our approach worked well: data-only matching did worst, and data-and-operation matching did slightly better. We do note that we tested on services that are completely annotated; more annotation meant more processing, so this test set served well for timing tests. We also note that all three approaches had the same benefit of high-quality annotation.
  • The second question is to make the services talk to each other.
  • This is our approach to data mediation. Charles Simonyi is the creator of Microsoft Office and the computer scientist who went to space.
  • Rather than mediating between individual schemas, one can write reusable mediation scripts between concepts in the metamodel. Once individual schemas are annotated, one can create lifting and lowering schema mappings.
  • Our approach is to calculate the ease of mediation: given two data models (schemas), how easily can a user mediate between the two?
  • We compute the similarity of the two schemas, which is both structural and semantic.
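A rough sketch of combining the two signals; the schemas, the annotation sets, the depth-based structural measure, and the 0.8/0.2 weights are all illustrative assumptions, not the dissertation's mediatability formula:

```python
# Score how easily a user could map schema A onto schema B by combining
# semantic overlap (shared annotations, Jaccard) with a simple
# structural signal (difference in nesting depth).
def mediatability(schema_a, schema_b):
    shared = set(schema_a["annotations"]) & set(schema_b["annotations"])
    union = set(schema_a["annotations"]) | set(schema_b["annotations"])
    semantic = len(shared) / len(union) if union else 0.0
    structural = 1.0 / (1 + abs(schema_a["depth"] - schema_b["depth"]))
    return 0.8 * semantic + 0.2 * structural  # arbitrary weights

# hypothetical search-result schemas
yahoo = {"annotations": {"query", "results", "url", "title"}, "depth": 3}
flickr = {"annotations": {"query", "results", "url", "owner"}, "depth": 4}
print(mediatability(yahoo, flickr))  # 0.8*0.6 + 0.2*0.5
```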
  • User evaluations: participants were asked to mediate between Yahoo Web search, Yahoo Image search, Live search, Google search, and Flickr. The system is conservative. A normal user is an average user with some Web development experience; expert users are Apache XML Schema committers and hardcore mashup developers.
  • The third question is to integrate the services once they can talk.
  • In the next part of my talk, I will discuss our work on a declarative approach to integration.
  • Our approach to composition is declarative. By this we mean that the input to the composition can be richly described with additional parameters, such as semantic annotations, semantic templates, and a smashmaker DSL.
  • For compositions involving non-functional requirements such as reliability and security, the ones found in enterprise scenarios, the declarative requirement is captured using a semantic template.
  • This is a declarative specification of a smart mashup in JSON. We chose JSON because it is easier to generate from a Web interface. The current platform does not have a UI, as creating a UI like Yahoo Pipes would require a considerable amount of effort, which we hope to pursue once the middleware is stable. (Going back to the example, explain the example.)
  • Our approach to composition is generic and does not fundamentally differentiate between SOAP and RESTful services. We ground
    services to a common model, something we call an abstract Web service.
  • Check: the precondition of the current operation holds, and the semantics of the available data entail a valid input to the operation.
  • Add: apply the transition function (the operation) to the input and to the status flags of the state, and transition to the next state.
  • That is, given a goal and an initial state, the planner applies the check and add operators until the goal is reached.
  • Many approaches can be adopted. We adopt Graphplan, since it is sound and complete: if there is a valid plan, we will find it.
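The check-and-add loop can be sketched as a naive forward-chaining planner; this is a simplification, not Graphplan itself, entailment is reduced to subset checking, and the three operations echo Homer's app rather than any real API:

```python
# Minimal check-and-add sketch: an operation is added to the plan when
# the available data entails its input; its outputs then extend the
# state, until the goal is entailed.
operations = [
    {"name": "geocode",     "needs": {"address"},         "gives": {"location"}},
    {"name": "restaurants", "needs": {"location"},        "gives": {"restaurant_list"}},
    {"name": "map",         "needs": {"restaurant_list"}, "gives": {"map_view"}},
]

def compose(initial, goal):
    state, plan = set(initial), []
    progress = True
    while not goal <= state and progress:
        progress = False
        for op in operations:
            if op["name"] not in plan and op["needs"] <= state:  # check
                plan.append(op["name"])                          # add
                state |= op["gives"]
                progress = True
    return plan if goal <= state else None

print(compose({"address"}, {"map_view"}))
# ['geocode', 'restaurants', 'map']
```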
  • Conventional AI planning looks at preconditions and effects; services also have data.
  • AI planning does not mandate anything beyond regular string matching; we go one step further and employ logic-based and graph-based semantic matching.
  • Unique to our approach is data-based loop generation: if the data is an enumeration, our planner generates a loop. For example, in the iPhone app, each address location is mapped.
  • Smashlets are a unique category of service that do data mediation. Following our user-driven approach, we allow users to share their mediation scripts using either SAWSDL or SA-REST annotations in APIHut. The task of data mediation is then similar to that of service discovery and check-and-add. A user can specify a smashlet as an iSmashlet (input smashlet) or an oSmashlet (output smashlet). The notion of smashlets is something we verified very recently and is not in the dissertation; however, mediation as a service is discussed.
  • A big thanks to Meena for laying this out so clearly. We intend to publish this as a guide for people who want to write smashlets; it will give them a clue about what kinds of mediation they can do and how easy it is to mediate.
  • In a distributed environment like SOA, with many components, events are commonplace.
  • We classify events into two types. Events of direct consequence result from an action by the provider or the requester, such as the event raised when location information is made available in our application: the location is obtained asynchronously using a delegate protocol, and when the information is available, the delegate raises an event.
  • Events of indirect consequence are things we did not intend or cause, such as a network outage.
  • The last part of my talk deals with how we can identify which events matter to our objectives and which do not.
  • The enterprise space offers us a rich example, so we bid adieu to Homer's burger here; however, we will bring him back when we walk through the algorithm. Here we have two models: a functional ontology that captures the different operations and their protocols, and a non-functional ontology that captures various QoS metrics. We reuse the popular OWL-QoS coalition ontology for the non-functional model and RosettaNet for the functional model. The approach is to find out what the goal is and find the events modeled in the functional ontology that relate to that goal. We take each such event and look at the non-functional ontology, which tells us what non-functional properties are affected by the event; we then check whether any of those are in our requirements.
  • To calculate this association, we extend the rho operator, proposed by Kemafor Anyanwu in 2003.
  • However, we need some bounds and constraints on rho.
  • Bounds, because in large graphs we can get into an endless search, which would leave running instances with timeouts.
  • Constraints, because we are not interested in all paths within a bound. For example, we want the path to contain at least one edge labeled notifies_event; otherwise the path tells us nothing about an event and is not useful for our purpose. Similarly, there can be many other constraints.
  • As a first step, we extend the rho operator to find paths between an entity and all entities belonging to a class.
  • For example, we can define a set that has classes of operations, their inputs, and the events raised.
  • In the context of the iPhone app, we have a fetch-location operation, which is of type maptask, and CLLocationDelegate, the delegate that is raised when the asynchronous response is obtained.
  • The semantic annotation on the operation must be in any of the paths, so that is added to the constraint. Next, we add the events that are related to this association: we fix the bounds and get the events. The events are related to an operation concept by a set of relationships that can be known a priori; we add those relationships. Events that belong to the bounded constrained path are the events of interest.
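A sketch of a bounded, constrained rho query as breadth-first search over a labeled graph; the graph echoes the iPhone example, and the node and edge names are illustrative, not a real ontology:

```python
from collections import deque

# Toy labeled graph: node -> [(edge_label, neighbor), ...]
graph = {
    "fetch_location": [("has_input", "gpsCoord"),
                       ("raises_delegate", "CLLocationDelegate")],
    "CLLocationDelegate": [("notifies_event", "location_obtained")],
    "gpsCoord": [],
    "location_obtained": [],
}

def rho(start, members, bound, required_label):
    """Paths within `bound` hops from `start` to any entity in `members`
    that contain at least one edge with `required_label`."""
    results = []
    queue = deque([(start, [])])  # (node, path of (label, node) pairs)
    while queue:
        node, path = queue.popleft()
        if len(path) > bound:
            continue  # bound: discard over-long paths
        if node in members and any(lbl == required_label for lbl, _ in path):
            results.append(path)  # constraint satisfied
        for label, nxt in graph.get(node, []):
            queue.append((nxt, path + [(label, nxt)]))
    return results

paths = rho("fetch_location", {"location_obtained"}, 4, "notifies_event")
print(paths)
```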
  • EICs are obtained in a similar manner, but by using the non-functional requirements.
  • Not all events have the same relevance; relevance is defined as a function of path length.
  • We calculate the relevance by factoring the shortest path we found against the path of an event. This gives the relative importance of the event compared with the most important event.
  • An event can affect many metrics, so we compute the cumulative relevance of an event.
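These two steps can be sketched as follows; the shortest-path-over-event-path ratio follows the description above, while the path lengths in the example are made-up numbers:

```python
# Relevance of an event with respect to one metric: the shortest
# observed path length over that event's path length. Cumulative
# relevance sums this over all metrics the event affects.
def relevance(shortest, path_length):
    return shortest / path_length

def cumulative_relevance(event_paths, shortest):
    # event_paths: path length from the event to each affected metric
    return sum(relevance(shortest, p) for p in event_paths)

# an event two hops from ResponseTime and four hops from Cost,
# where the shortest event path seen anywhere has length 2
print(cumulative_relevance([2, 4], shortest=2))  # 1.0 + 0.5 = 1.5
```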
  • Once this is done, we use normalization to identify the events of relevance.
  • We observe the reaction of the underlying system to an event and evaluate it against the event's calculated importance. If an event deemed non-relevant causes a reaction every time, we add a fixed delta to its overall relevance; similarly, we deduct delta in the opposite case.
  • After the adjustment, we see how many events change status from relevant to non-relevant. If many events change, we are way off base, so we adjust delta. We keep checking this entropy and stop the adjustment once the entropy is zero for a certain time interval.
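One pass of the fixed-delta feedback can be sketched like this; the threshold, delta value, event names, and relevance numbers are all illustrative, and "entropy" is reduced to a simple count of status flips:

```python
# If an event we classified as non-relevant keeps triggering reactions,
# bump its relevance by delta; if a "relevant" event never reacts,
# deduct delta. Count how many events flip status in the pass.
THRESHOLD = 0.5

def adjust(relevances, reacted, delta=0.1):
    flips = 0
    updated = {}
    for event, rel in relevances.items():
        new = rel + delta if reacted[event] else rel - delta
        if (rel >= THRESHOLD) != (new >= THRESHOLD):
            flips += 1  # status changed: contributes to the entropy signal
        updated[event] = new
    return updated, flips

rel = {"net_outage": 0.45, "cache_miss": 0.52}
updated, flips = adjust(rel, {"net_outage": True, "cache_miss": False})
print(flips)  # both events crossed the threshold
```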
  • We compare how many iterations it takes for the system to stabilize under both approaches. As we can see, the hybrid approach, since it constantly adjusts, maintains an almost constant convergence time and always converges faster. The second experiment measures the same metrics, this time varying the percentage of relevant events.
  • This experiment measures how many irrelevant events we find with both approaches for every relevant event. The flip side of the hybrid approach is that, on average, it finds more irrelevant events initially, as it first tries to find the optimal adjustment value.
  • Rather than addressing a single narrow problem, this dissertation outlines what is possible with what is out there today. We have created an environment that facilitates the various tasks involved in creating a service-oriented ecosystem, and we have addressed the problems of discovery, mediation, composition, and execution in a service-type-agnostic manner.
  • We have been actively involved in standardization and have contributed standards (SAWSDL, SA-REST) as well as reference implementations. By grounding our research in these standards, we have also demonstrated their usefulness and value. Our prototypes have always demonstrated comparable performance in terms of scalability, and better performance at the job at hand.
  • We are one of the first to publish and implement a unifying search engine framework to search both Web APIs as well as conventional services.
  • By supporting user-annotated services, sharing, and discovery in the context of mashups, we have demonstrated the value of incorporating social computing into services computing.
  • Beyond reference implementations, we have released open-source tooling that is used in the community, along with a number of easy-to-use Web-based systems; we will soon release RESTful APIs (we need to migrate to the cloud) that will make them easier to use.
  • I thank my committee for their invaluable time and suggestions.
  • There was technological freedom that allowed me to play with all sorts of areas.
  • Thanks to all the guys with whom I have had a chance to work and have fun, especially Cory and Delroy for all the football Saturdays.
  • I wish to really thank Topher, Meena, Cartic, Pablo, and Ajith for making sure I stand here today. Ajith was, and I hope will continue to be, a great friend.
  • I also thank three people who were instrumental in this progress and who are not here with me. My very good friend Badri, who convinced me to start this journey in 2002. Dr. Randy Pausch, from whom I learnt two valuable lessons: brick walls are there to keep out those who don't want it badly enough, and you beat the reaper not by living long but by living well and by inspiring people to live well. My uncle Kannan, who was a big inspiration throughout and whose support and encouragement in everything I have done have been instrumental; he was my biggest cheerleader, and I will miss his encouragement going forward in life.
  • Here is Homer Simpson

    1. How do we help Homer find his Big Mac?
    2. Four
    3. SEMANTICS ENRICHED SERVICES ENVIRONMENT Karthik Gomadam, Services Research Lab, kno.e.sis center.
    4. Semantic Web Services = Semantic Web + Web Services
    5. Semantic Web / Semantic Models
    6. Semantic Web / Services Semantic Models
    7. Service aspects represented in semantic models
    8. OWL-S
    9. WSMO Goals, Ontologies, Mediators, Web Services
    10. Services Semantic Web / Semantic Models
    11. METEOR-S
    12. [Slide diagram, the four types of semantics: Data (express your data), Functional (what does the service offer?), Non-Functional (response time, cost, QoS metrics), Execution (event identification, adaptation; not just model)]
    13. Semantic Templates
    14. Collection of template terms
    15. [Figure: snapshot of the RosettaNet PIP event ontology (Action_PIP and Notification_PIP, linked by has_input, has_output, has_notification, and notifies_event to messages such as PurchaseOrder_Input/Output and CancelOrder_Input/Output for PIP3A4), alongside a semantic template whose service-level metadata is Category = NAICS:Electronics, ProductCategory = DUNS:RAM, Location = Athens,GA, and whose two semantic operation templates (Rosetta:RequestPurchaseOrder and Rosetta:CancelOrder) each carry OLP = {Encryption = RSA, ResponseTime < 5 sec}]
    16. Resourceful Web
    17. described in X/HTML
    18. SA-REST*: semantic microformat for resource markup. *- Not an acronym
    19. inline semantic annotations that refer to a rich semantic model
    20. site-domain-rel markup the domain of an entire site markup to the entry page of the site applies to /*
    21. <body class='dooduh-main site-domain-rel' title='dbpedia:web2.0'>
    22. domain-rel
    23. multiple domains
    24. <p class="domain-rel" title="dbpedia:iphone dbpedia:unit_conversion">
    25. sem-rel
    26. <a href="*" class="sem-rel" title="metadata about the episode">
    27. In summary
    28. [Table comparing prior work with this thesis: OWL-S (description logic, IOPE, SOAP/WSDL services, very few evaluations), WSMO (F-Logic, either/or scenarios), and Haibo Zhao's work on RESTful composition (control-flow considerations only), versus this work: technology independent (any upper-level modeling), data flow via mediatability, control flow via declarative composition over both SOAP/WSDL and RESTful services, evaluated in real-world applications (APIHut, FoxyREST, Dooduh, AIR language)]
    29. Faceted API Search, Discovery and Mediatability
    30. Keyword based paradigms, Interface based techniques
    31. Serviut Rank
    32. Precision and recall of ApiHut, and precision comparison with pWeb and Google:
        Table 1 (ApiHut): Query1 P=0.89 R=0.75; Query2 P=0.83 R=0.69; Query3 P=0.54 R=0.71; Query4 P=0.82 R=0.21
        Table 2 (precision, pWeb / ApiHut / Google): Query1 0.48 / 0.89 / 0.20; Query2 0.61 / 0.83 / 0.13; Query3 0.25 / 0.54 / 0.23; Query4 0.70 / 0.82 / 0.37
    33. SEMRE: SEMantic services REGistry
    34. [Diagram contrasting registries: without semantics, manufacturers publish WSDL interface contracts to private registries and providers must agree on syntactic interfaces; with SEMRE, manufacturers publish SAWSDL interface contracts annotated with concepts from the ontology, and providers publish services that adhere to the ontology]
    35. 1. semantic interface signature (Semantic Template) 2. identify fulfillment set: RE, SθRE and DRE 3. all fulfilling interfaces 4. interface relation in the set defined above
    36. [Bar chart, matching comparison: Data Only Match: 153 matches; Data and Operation: 87; Data, Operation and Domain: 42]
    37. Think Meta!!! "Anything you can do, I can do Meta" Charles Simonyi, Intentional Software
    38. Many services for the same task
    39. Ease of mediation
    40. Mediatability
    41. Top down - Mediation Similarity
    42. Bottom up - Mediatability
    43. Declarative
    44. Semantic Templates
    45. Declarative smashup specification in JSON (slide callouts: "Get current location", "Find McD's near location", "Add to map"):
        { "smashup": { "device": "iPhone", "code": { "title": "Address Finder", "sketch": {
          "service": { "type": "component", "resource": "" },
          "service": { "type": "service-api", "resource": "", "endpoint": "", "method": "GET", "input-resource": "", "input-param": "address" },
          "service": { "type": "i-smashlet", "resource": "", "source": "" },
          "service": { "type": "o-smashlet", "resource": "", "target": "" },
          "service": { "type": "component", "resource": "" } } } } }
    46. Modeling an abstract Web service
    47. A collection of operations Each operation is a collection of 1. Input 2. Output 3. Preconditions 4. Effects 5. Faults
    48. Composition: Check and Add
    49. Check
    50. Does the output of Restaurant service return location?
    51. Add
    52. Extract location (transition function) add to composition
    53. plan is a DAG (Directed Acyclic Graph) of operations
    54. Graph plan
    55. What about DATA?
    56. Semantic approach to check
    57. Loops
    58. Data Mediation as a service
    59. smashlets
    60. events
    61. Event of direct consequence
    62. Event of indirect consequence
    63. what matters and what doesn't?
    64. built over our declarative models semantic template smashmaker DSL
    65. extends semantic association computation
    67. Path is a collection of vertices and edges in a graph
    68. ρpath: queries an ontology and returns a set of vertices and edges between two nodes, AKA a semantic association
    69. bounds and constraints
    70. Large graphs - endless search - timeouts
    71. really not all paths lead to Rome!!!
    72. Extend rho to find all paths between an entity and a class
    73. We define a set of classes and relationships of interest
    74. {maptask, coordinate, location, location obtained} {has input, has output, raises delegate}
    75. Path satisfiability: if every node returned is in the set of classes and every relationship is in the set of relationships.
    76. {fetch location, gpsCoord, address, CLLocationDelegate} {has input, has output, raises delegate}
    77. Bounded constrained rho is one where all relationships satisfy the constraint and are bounded
    78. {fetch location, gpsCoord, address, CLLocationDelegate} {has input, has output, raises delegate} if we add a bound of 4 to this satisfying constraint then this is a bounded constrained path of bounds 4
    79. Identifying events
    80. EDCs: get the semantic annotation on the operation
    81. EICs: get the concepts from the policy
    82. how relevant is an event?
    83. shortest path over an event path
    84. events can affect many metrics
    85. filtering
    86. adjusting importance
    87. fixed relevance adjustment
    88. variable/hybrid relevance adjustment
    89. [Figure 8.3: performance of hybrid and fixed adjustment schemes with variation in the total number of events. Figure 8.4: performance of hybrid and fixed adjustment schemes with variation in the percentage of relevant events.]
    90. [Figure 8.5: variation in the accuracy of both feedback schemes as the total number of events changes.]
    91. conclusions
    92. Environment
    93. Contributions to standards community
    94. marrying old school services with the cool restless RESTful crowd
    95. social approach
    96. tooling
    97. thank Dr. Michael Raymer, Dr. T.K. Prasad, Dr. Guozhu Dong, and Dr. Shu Schiller
    98. Dr. Amit Sheth, for letting me do what I want for the most part of my research; this is a priceless feature of this work environment. Dr. Lakshmish Ramaswamy, for guiding me through my research, including my first paper at ICWS 07. Dr. Kunal Verma, for mentoring me and for continued collaboration and friendship. Kaarthik Sivashanmugam, for getting me started on Web services.
    99. Wenbo, Ashu, Pramod, Raghava, Wes, Rob, John Lathem, Cory, Prateek, Delroy
    100. and one more thing...
    101. Resourceful many-to-many model, e.g. annotation as social contribution. APIHut and the RESTful aspects go towards incorporation of the social Web as part of the service Web framework. We used this for lightweight services; the same thing can be extended to WSDL/SAWSDL, wiki-based approaches, and such. (last slides)