Wi iat-bootstrapping the analysis of large-scale web service networks-v3
 

    Presentation Transcript

    • Bootstrapping the Analysis of Large-scale Web Service Networks
      Shahab Mokarizadeh, Royal Institute of Technology, Sweden
      Peep Küngas, Tartu University, Estonia
      Mihhail Matskin, Royal Institute of Technology, Sweden
      IEEE/WIC/ACM International Conference of Web Intelligence, 22-27 Aug 2011
    • BackgroundWhy web service analysis? Identifying Missing but Valuable Web service (to be implemented)  Discovering correlation among public , governmental and private sector web services Discovery of the most/least exploited concept(s)s, web service(s), we service provider(s) …..Initial challenge? Vast majority of available services are not semantically annotated or even come with any sort of documentation !2 22-27 Aug 2011 IEEE/WIC/ACM International Conference of Web Intelligence
    • Analysis Roadmap
      - Generate reference ontology
      - Initially only WSDL Web services
      - Web service annotation
      - Web service matching & network generation
      - Apply social network analysis algorithms
      - Information diffusion among Web service communities
      - Analyze the impact of services/concepts on other services or concepts
    • Reminder: WSDL Structure
      [Image from: Web Services and Security, 1/17/2006, Marco Cova]
    • Ontology Learning from WSDL Interfaces [1]
      Ontology learning input:
      - Message part names of input/output parameters
      - XML Schema leaf element names of complex types
      Ontology learning steps:
      - Information Elicitation: Term Extraction, Syntactic Refinement
      - Ontology Discovery: Pattern-based Semantic Analysis, Term Disambiguation, Class and Relation Determination
      - Ontology Organization: Adding Relations
      Output: Reference Ontology
      [1] "Ontology Learning for Cost-Effective Large-scale Semantic Annotation of XML Schemas and Web Service Interfaces", in Proc. EKAW 2010, LNAI 6317, pp. 401-410, 2010
    • Annotation Heuristics [2]
      entity_reference ← synset{…}
      (entity_reference: concept in the ontology; synset: instances in the ontology, i.e. terms)
      Examples:
      - Password ← {password, pwd, strPassword, authPassword, pass}
      - Address ← {addr, address1, postal_address}
      [2] P. Küngas and M. Dumas, "Cost-Effective Semantic Annotation of XML Schemas and Web Service Interfaces", Proc. IEEE Conference on Services Computing, 2009, pp. 372-379
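The synset heuristic on this slide can be read as a reverse lookup from terms to concepts. The following is a minimal sketch of that reading, not the authors' implementation: the `SYNSETS` table reuses the two examples from the slide, while `build_index`, `annotate` and the case-insensitive comparison are assumptions made for illustration.

```python
# Hypothetical synset table: concept -> terms (taken from the slide's examples).
SYNSETS = {
    "Password": {"password", "pwd", "strPassword", "authPassword", "pass"},
    "Address": {"addr", "address1", "postal_address"},
}


def build_index(synsets):
    """Invert concept -> terms into a term -> concept lookup.

    Lower-casing the terms is an assumption of this sketch.
    """
    return {
        term.lower(): concept
        for concept, terms in synsets.items()
        for term in terms
    }


def annotate(term, index):
    """Return the concept annotating `term`, or None if the term is unknown."""
    return index.get(term.lower())


INDEX = build_index(SYNSETS)
print(annotate("strPassword", INDEX))
print(annotate("unknownField", INDEX))
```

Terms that fall outside every synset stay unannotated, which is what the annotation coverage percentages later in the deck measure.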
    • Web Service Matching Scheme
      - Matching of basic elements of Web service input and output parameters (ontological instances)
      - Web service matching is simplified to instance matching
      - Rule-based matching scheme: a matching rule reveals the existence of some kind of semantic relation between two given instances
    • Instance Matching Rules (1)
      - Rule-1: Same concept. Example: (addr, addr_line): {addr, addr_line} instanceOf Address.
      - Rule-2: Synonym concepts. Example: (loc, place): {loc} instanceOf Location, {place} instanceOf Place, Place isSynonymOf Location.
      - Rule-3: Subclass concepts. Example: (loc, city): {loc} instanceOf Location, {city} instanceOf City, City isSubClassOf Location.
    • Instance Matching Rules (2)
      - Rule-4: Rule 2 + Rule 3. Example: (bidUId, id): {bidUId} instanceOf BidUniqueCode, {id} instanceOf ContractIdentifier, BidUniqueCode isSynonymOf ContractIdentifier.
      - Rule-5: Interrelated by an ontological relation (other than isSynonymOf). Example: Person hasPropertyXXX FirstName.
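The five rules can be sketched as a small lookup over a toy ontology. This is an illustrative sketch only: the dictionaries, the extra terms (`person`, `fname`) and the function `match_rule` are hypothetical, and Rule-4 is interpreted here as "a synonym of one concept stands in a subclass relation with the other", which is one possible reading of "Rule 2 + Rule 3".

```python
# Toy ontology; every entry below is an illustrative assumption.
INSTANCE_OF = {
    "addr": "Address", "addr_line": "Address",
    "loc": "Location", "place": "Place", "city": "City",
    "person": "Person", "fname": "FirstName",
}
SYNONYMS = {frozenset({"Place", "Location"})}   # isSynonymOf pairs
SUBCLASS_OF = {"City": "Location"}              # child -> parent
RELATED = {("Person", "FirstName")}             # other ontological relations


def is_synonym(c1, c2):
    return frozenset({c1, c2}) in SYNONYMS


def is_subclass(c1, c2):
    # True if either concept is a direct subclass of the other.
    return SUBCLASS_OF.get(c1) == c2 or SUBCLASS_OF.get(c2) == c1


def match_rule(term1, term2):
    """Return the number of the first rule that matches, or None."""
    c1, c2 = INSTANCE_OF.get(term1), INSTANCE_OF.get(term2)
    if c1 is None or c2 is None:
        return None
    if c1 == c2:
        return 1                                 # Rule-1: same concept
    if is_synonym(c1, c2):
        return 2                                 # Rule-2: synonym concepts
    if is_subclass(c1, c2):
        return 3                                 # Rule-3: subclass concepts
    for pair in SYNONYMS:                        # Rule-4: synonym + subclass
        if c1 in pair and is_subclass(next(iter(pair - {c1})), c2):
            return 4
        if c2 in pair and is_subclass(c1, next(iter(pair - {c2}))):
            return 4
    if (c1, c2) in RELATED or (c2, c1) in RELATED:
        return 5                                 # Rule-5: other relation
    return None


print(match_rule("loc", "city"))
```

Checking the rules in order from most to least specific means each pair of terms is labelled with the strongest relation that holds between them.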
    • Evaluate Matching Scheme - 1: Classical Approach (Precision, Recall, F-measure)
      1. A golden annotation/ontology is needed to compare with.
      2. Identify:
      - True Positives (TP): the annotations common to the golden and the generated ontology
      - False Positives (FP): annotations made only by the generated ontology
      - False Negatives (FN): annotations made by the golden ontology but not discovered by the generated ontology
      3. Compute: Precision, Recall, F-measure
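The formulas for step 3 did not survive in the transcript. The sketch below uses the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F-measure as their harmonic mean), which is an assumption about what the slide showed. Representing an annotation as a (term, concept) pair, and the sample data, are also assumptions made for illustration.

```python
def evaluate(golden, generated):
    """Compare two sets of (term, concept) annotations.

    Returns (precision, recall, f_measure) using the standard definitions.
    """
    tp = len(golden & generated)   # annotations found in both
    fp = len(generated - golden)   # annotations only in the generated set
    fn = len(golden - generated)   # golden annotations that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical sample annotations.
golden = {("pwd", "Password"), ("addr", "Address"), ("loc", "Location")}
generated = {("pwd", "Password"), ("addr", "Address"), ("zip", "Address")}
print(evaluate(golden, generated))
```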
    • Evaluate Matching Scheme - 2: Tracking Performance of the Matching Scheme in the Network Model
      - Generate a semantic network model out of the annotated Web service corpus.
      - Track the performance of the exploited annotation & matching scheme in the network properties.
      Web service (WSDL) networks (of small size) have been observed to exhibit:
      - Small-worldness model
      - Scale-free model
      - Correlation degree on nodes?
    • Web Service Network Models: Projecting Matching Scheme Accuracy in the Network Model
      [Diagram: an annotated Web service corpus shown in layers (Web services, Operations, Parameters, Concepts) and the resulting Semantic Network over concepts C1 - C5]
      - WS1 - WS3: Web services
      - OP1 - OP3: Web service operations
      - P1 - P6: basic elements of input/output parameters
      - C1 - C5: ontological concepts representing the parameters
    • Evaluating Network Properties: Small-Worldness
      Small-world networks are networks with the following characteristics:
      1. LRandom ≤ LActual (L: shortest path length)
      2. CRandom << CActual (C: clustering coefficient)
      Sindex: small-worldness index
      In other words: γ > 1, λ > 1, Sindex > 1
      Small-worldness scales linearly with network size.
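The slide does not spell out how Sindex is computed. A common definition, assumed here, is γ = CActual / CRandom, λ = LActual / LRandom and Sindex = γ / λ. The function name `small_world_index` is hypothetical; the sample values are the h25 row of the small-worldness table in this deck, for which this definition gives a value close to the reported 7.5769.

```python
def small_world_index(l_actual, c_actual, l_random, c_random):
    """Return (gamma, lambda, s_index) for a network and its random baseline.

    Assumes the common definition S = gamma / lambda, with
    gamma = C_actual / C_random and lambda = L_actual / L_random.
    """
    gamma = c_actual / c_random
    lam = l_actual / l_random
    return gamma, lam, gamma / lam


# h25 "Generated" network versus its Random-ER baseline (values from the deck).
gamma, lam, s_index = small_world_index(
    l_actual=2.4256, c_actual=0.259, l_random=2.4756, c_random=0.0348
)
print(round(gamma, 3), round(lam, 3), round(s_index, 3))
```

The small gap to the tabulated value is consistent with the table entries being rounded, but that is a guess rather than something the slides state.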
    • Evaluating Network Properties: Scale-free Networks
      - Scale-free networks are fitted to a power-law function y ≈ c·x^(-α)
      - Many nodes with few links, a few nodes with many links
      [Plot: number of nodes with M links (log) against number of links M (log)]
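One simple way to estimate a power-law exponent is a least-squares line fit in log-log space, sketched below. The slides do not say which fitting method was used, so this is an assumption; the function name `fit_power_law` and the synthetic degree distribution are hypothetical, and this estimator is known to be rough on real, noisy degree data.

```python
import math


def fit_power_law(degrees, counts):
    """Fit counts ~ c * degrees**(-alpha) by linear regression on logs.

    Returns (alpha, c). Points with a non-positive degree or count are
    skipped because their logarithm is undefined.
    """
    points = [(math.log(d), math.log(n))
              for d, n in zip(degrees, counts) if d > 0 and n > 0]
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


# Synthetic degree distribution that follows an exact power law.
degrees = list(range(1, 51))
counts = [100 * d ** -1.5 for d in degrees]
alpha, c = fit_power_law(degrees, counts)
print(round(alpha, 3), round(c, 3))
```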
    • Evaluating Network Properties: Assortativity of Node Degree (Correlation Degree on Nodes)
      - Positive correlation: vertices with a high number of connections tend to be connected to other nodes that also have many links. Observed in social networks, e.g. the network of actors.
      - Negative correlation: the preference is to attach to nodes having a small number of connections. Observed in technological and biological networks, e.g. the Internet, protein interactions.
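Degree correlation can be measured as the Pearson correlation between the degrees found at the two ends of each edge. The sketch below assumes an undirected graph given as an edge list, which may differ from the (possibly directed) networks analysed in the deck; the function name `degree_correlation` and the sample graphs are hypothetical.

```python
def degree_correlation(edges):
    """Pearson correlation of the degrees at both ends of each edge.

    `edges` is a list of (u, v) pairs of an undirected graph. Each edge is
    counted in both directions so that the measure is symmetric.
    Returns 0.0 when all degrees are equal (correlation undefined).
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


# A star graph: one hub linked to five leaves (strongly disassortative).
star = [("hub", leaf) for leaf in range(5)]
print(degree_correlation(star))
```

A hub connected only to low-degree leaves is the extreme case of the negative correlation described on the slide.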
    • Experimental Datasets
      - SOATrader dataset: 1,000,000 terms from the SOATrader collection of 15,000 WSDLs collected from different repositories on the Web between 2005 and 2007. SOATrader: http://www.soatrader.com/web-services
      - ASSAM dataset [3]: 146 WSDLs collected by Heß et al. and annotated with the ASSAM tools. We use all unique terms (approx. 375), regardless of frequency, from this collection. ASSAM: http://www.andreas-hess.info/projects/annotator/
      [3] A. Heß, N. Kushmerick, "Machine Learning for Annotating Semantic Web Services", AAAI Spring Symposium on Semantic Web Services, 2004
    • Golden Ontology
      - SOATrader dataset: the golden annotation is handcrafted by the authors based on the top 2000 recurrent terms.
      - ASSAM: we use the golden annotation developed by the ASSAM developers and used as the reference ontology in their experiments with the ASSAM Web service annotation tool.
    • Evaluation Result - 1: Precision, Recall, F-Measure
      [Bar chart: recall, precision and F-measure on the Top2000 and ASSAM datasets for Rule-1, Rules 1-4 and Rules 1-5; y-axis from 0 to 0.6]
    • Dataset for Network Evaluation
      Ideal: use the whole dataset of WSDL/XSD elements from the SOATrader collection (approx. 1,000,000 terms) and the ASSAM collection (approx. 10,000 terms).
      Problems with a large dataset:
      - The larger the dataset, the bigger the ontology, and the harder it is to verify and enhance the quality of annotation
      - Neither cost-effective (human and computation cost) nor scalable for analysis purposes
      Proposal: limit the SOATrader experimental dataset using the following four arbitrarily chosen thresholds (minimum frequency of occurrence of a term): 10, 15, 20 and 25 (h10, h15, h20, h25), covering the 30,000 (unique) most recurrent terms.
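The thresholding proposal amounts to keeping only terms that occur at least a given number of times. A minimal sketch, assuming the corpus is available as a flat list of term occurrences; the function name `frequent_terms` and the tiny sample corpus are hypothetical.

```python
from collections import Counter


def frequent_terms(term_occurrences, min_frequency):
    """Return the set of terms occurring at least `min_frequency` times."""
    counts = Counter(term_occurrences)
    return {term for term, n in counts.items() if n >= min_frequency}


# Hypothetical miniature corpus; the deck uses thresholds 10, 15, 20 and 25.
corpus = ["id", "id", "id", "zip", "zip", "name"]
for threshold in (1, 2, 3):
    print(threshold, sorted(frequent_terms(corpus, threshold)))
```

Raising the threshold shrinks the term set, which matches the smaller learned ontology sizes reported for h25 compared with h10.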
    • Annotation Progress
                              h25      h20      h15      h10
      Learned ontology size   4523     5614     7378     11610
      Annotated elements      588057   596625   621336   663618
      Total elements          998916   998916   998916   998916
      Percentage of total     59%      60%      62%      66%
    • Analysis of Small-Worldness
      Dataset            Network                 Variant     L         C         Sindex
      Entire SOATrader   Syntactic               Actual      3.283     0.2968    591.08
                                                 Random-ER   3.9229    0.00062
      h25                Generated               Actual      2.4256    0.259     7.5769
                                                 Random-ER   2.4756    0.0348
      h20                Generated               Actual      2.3882    0.2811    8.8148
                                                 Random-ER   2.4851    0.0331
      h15                Generated               Actual      2.3724    0.2805    8.2753
                                                 Random-ER   2.3396    0.0334
      h10                Generated               Actual      2.5322    0.2449    18.2709
                                                 Random-ER   2.7662    0.0146
      Top2000            Golden                  Actual      2.1895    0.3761    2.8404
                                                 Random-ER   1.8852    0.1146
                         Generated               Actual      2.08475   0.3209    3.3878
                                                 Random-ER   2.0667    0.0939
      ASSAM              Golden                  Actual      4.5653    0.2147    3.1464
                                                 Random-ER   3.546     0.05304
                         Generated (Rule 1)      Actual      3.0592    0.4803    21.4835
                                                 Random-ER   3.8451    0.0281
                         Generated (Rules 1-4)   Actual      2.5732    0.4057    8.5288
                                                 Random-ER   3.1267    0.0578
    • Analysis of Scale-free Properties & Correlation Degree
      Category   Network                 Power-law Degree Exponent   #Nodes   Degree Correlation
      Entire     Syntactic               1.3722                      67622    -0.0413
      h25        Generated               1.1945                      2086     -0.1993
                 Random Annotation       0.6332                      2086     0.019
      h20        Generated               1.1977                      2394     -0.2093
      h15        Generated               1.1448                      3239     -0.2222
      h10        Generated               1.2316                      4050     -0.1895
      Top2000    Golden                  1.1504                      856      -0.2238
                 Generated               1.1483                      936      -0.2137
                 Syntactic               1.1653                      828      -0.2229
      ASSAM      Golden                  1.5346                      170      -0.3079
                 Generated (Rule 1)      1.5574                      413      0.3642
                 Generated (Rules 1-4)   1.4566                      217      0.041
                 Random Annotation       1.0755                      170      0.1151
                 Syntactic               1.6105                      886      0.194
    • Plot of Degree Distribution
      [Plots: out-degree distribution of random annotation; out-degree distribution of actual annotation]
    • Conclusion & Future Work
      - The performance of a Web service annotation scheme can be tracked in the properties of Web service network models.
      - An efficient matching scheme eliminates or at least minimizes deviation from small-worldness conditions, shows strong negative degree correlation and follows the scale-free model.
      A major threat:
      - Network theories are incomplete: e.g. the emergence of power laws is too common to rely on!
      - The evaluated dataset may not represent the model governing the whole picture
      Future work:
      - Benchmarking other WS annotation & matching methods
      - Investigating other network properties
    • Thanks! Grateful for your questions, criticism and suggestions.
      SHAHABM@KTH.SE
    • Backup Slides
    • What Is Going To Be Annotated?
      Note: we annotate ONLY basic elements of Web service input and output parameters (message part names and XML Schema basic element names).
      WSDL:
        <wsdl:types>
          <complexType name="Address">
            <sequence>
              <element name="Zip" type="string"/>
              <element name="City" type="string"/>
            </sequence>
          </complexType>
          (…)
        </wsdl:types>
      Semantic annotation ontology:
      - Address hasZipCode ZipCode (annotates element "Zip")
      - Address hasCityName CityName (annotates element "City")
    • Example of Generated Ontology
      Input terms: "userId", "username", "Zip", "addr_line", "userPostalAddress", "online_usr", ...
      [Diagram of the generated ontology]
      - Concepts: OnlineUser, User, PostalAddress, Address, UserName, UserIdentifier, PostalCode, ZipCode, AddressLine
      - Relations: isSubClassOf, hasAddress, hasName, hasIdentifier, hasAddressLine, hasZipCode, isSynonymOf