
Approximate Semantic Matching of Heterogeneous Events

Event-based systems have loose coupling within space, time and synchronization, providing a scalable infrastructure for information exchange and distributed workflows. However, event-based systems are tightly coupled, via event subscriptions and patterns, to the semantics of the underlying event schema and values. The high degree of semantic heterogeneity of events in large and open deployments such as smart cities and the sensor web makes it difficult to develop and maintain event-based systems. In order to address semantic coupling within event-based systems, we propose vocabulary-free subscriptions together with the use of approximate semantic matching of events. This paper examines the requirement of event semantic decoupling and discusses approximate semantic event matching and the consequences it implies for event processing systems. We introduce a semantic event matcher and evaluate the suitability of an approximate hybrid matcher based on both thesauri-based and distributional semantics-based similarity and relatedness measures. The matcher is evaluated over a structured representation of Wikipedia and Freebase events. Initial evaluations show that the approach matches events with a maximal combined precision-recall F1 score of 75.89% on average in all experiments with a subscription set of 7 subscriptions. The evaluation shows how a hybrid approach to semantic event matching outperforms a single similarity measure approach.

Hasan S, O’Riain S, Curry E. Approximate Semantic Matching of Heterogeneous Events. In: 6th ACM International Conference on Distributed Event-Based Systems (DEBS 2012).

Published in: Technology, Education

Approximate Semantic Matching of Heterogeneous Events

  1. Approximate Semantic Matching of Heterogeneous Events
     Souleiman Hasan, Sean O’Riain, Edward Curry
     Digital Enterprise Research Institute (DERI), www.deri.ie
     National University of Ireland, Galway (NUIG)
     In proceedings of DEBS 2012, Berlin, Germany
     Stefan.Decker@deri.org http://www.StefanDecker.org/
     © Copyright 2010 Digital Enterprise Research Institute. All rights reserved.
  2. Further Reading
     • Hasan S, O’Riain S, Curry E. Approximate Semantic Matching of Heterogeneous Events. In: 6th ACM International Conference on Distributed Event-Based Systems (DEBS 2012)
     • www.edwardcurry.org
  3. Outline
     • Introduction
         • Smart Environments
         • Motivational Scenario
         • Related Work
     • Proposal
         • Approximate Semantic Matching
     • Experiments
         • Wikipedia
         • Freebase
     • Conclusions
     • Q&A
  4. Smart Environments
     • Smart Homes, Grids, Cities…
     • Internet-of-Things, Sensor Web…: by 2020, 50 billion devices connected to mobile networks (OECD, 2012)
     • Non-technical users
     • High heterogeneity
     • Trend for dynamic data-driven decision making
     • Example events/situations of interest: "Soccer match played in Berlin", "New free parking space near me", …
  5. Motivational Scenario: Enterprise
     • Roles: CIO, CSO, Helpdesk, Maintenance Personnel
     • Situations of interest:
         • Company CO2 emissions performance
         • Energy usage by global IT department
         • PUE of the Data Center in Dublin
         • kWhs used by server 172.16.0.8
     • Various terms used: energy consumption, energy usage…; room, space, zone…
     • Dynamic environments: new events from equipment joining and leaving
     • Event sources: Building, Data Center
  6. Requirements
     • Handling of semantically heterogeneous events
     • Handling of dynamic environments with event types by sources joining and leaving
     • Low cost of rules management
     • Usability
     • Precision
  7. Event Processing
     • Situation of interest: "When a floor is empty and its energy usage for an hour is above threshold w.r.t budget then it is an excessive usage"
     • Non-technical users with natural language needs; the user's need is translated into rules by a developer
     • Diagram: CEP engine with a UI separated from the engine, a rule interface and parser, a rules repository, event processing execution, a pattern matcher, a single event matcher and a templates repository
     • Rules are tied to the EPL vocabulary; high cost in case of heterogeneity or change
     • Example rule:
         INSERT INTO ExcessiveEnergyUsageByFloor
         SELECT a.floor as floor
         FROM PATTERN [(a=FloorEmptySensor -> every b=DeviceEnergyUsageSensor(a.floor=b.floor))].WIN:TIME(1 hour)
         GROUP BY a.floor
         WHERE (b.usage) > GetAcceptableThreshold(a.budgetValue)
     • Example events (sources shown: ERP, BMS):
         • PC NO XDG26359, Floor: 1st, usage: 3 kWh
         • VM: vmdgsit01.deri.ie, Floor: 1st, usage: 15 kWh
  8. Exact Event Processing Paradigm
     How the paradigm addresses each requirement:
     • Semantic Heterogeneity: does not scale out to highly heterogeneous environments
     • Dynamic Environment: does not scale out to highly dynamic environments
     • Rule Management: high cost on large heterogeneity and dynamicity
     • Usability: low
     • Precision: 100% (typically)
  9. Decoupling in Event Systems
     • Space: producers and consumers don’t know each other
     • Time: participants don’t need to be actively involved in the interaction at the same time
     • Synchronization: event producers and consumers don’t get blocked to send/receive events
     • Diagram: Event Producer and Event Consumer, decoupled in space, time and synchronization
  10. Decoupling in Event Systems
      • Principle: “Removal of explicit dependencies between participants” (Eugster et al., 2003)
      • Outcome: scalability
      • Diagram: Event Producer and Event Consumer, decoupled in space, time and synchronization
  11. Semantic Coupling
      • Current event-based systems keep explicit semantic dependency between participants
      • Limited scalability in highly heterogeneous and dynamic environments
      • Diagram: Event Producer and Event Consumer, decoupled in space, time and synchronization but still coupled semantically (event types, properties, values)
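
The semantic dependency described on the slide above can be made concrete with a minimal Python sketch. It is not taken from the slides: the `exact_match` helper and the event fields are hypothetical, and the terms are borrowed from the enterprise scenario only for illustration. It shows how an exact matcher misses an event that uses a different but equivalent vocabulary.

```python
# Hypothetical illustration: exact matching ties the consumer to the producer's vocabulary.

def exact_match(subscription: dict, event: dict) -> bool:
    """Return True only if every subscribed property appears in the event
    under exactly the same name and with exactly the same value."""
    return all(event.get(prop) == value for prop, value in subscription.items())


subscription = {"type": "energy usage", "room": "data center"}

# Same situation, described with the subscriber's vocabulary.
event_a = {"type": "energy usage", "room": "data center", "kwh": 15}
# Same situation, described with a different vocabulary ("consumption", "zone").
event_b = {"type": "energy consumption", "zone": "data center", "kwh": 15}

matched_a = exact_match(subscription, event_a)
matched_b = exact_match(subscription, event_b)
print(matched_a, matched_b)
```

Under exact matching, covering `event_b` would need a second rule written against the producer's terms, which is the maintenance cost the slides attribute to semantic coupling.
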
  12. Current Approaches
      • Ontology-based
          • (Petrovic et al., 2003), (Zhang & Ye, 2008)…
          • Does not “remove explicit dependency”
          • Hard to achieve ontology agreement a priori at large scale of heterogeneity and dynamism
          • Medium usability, typically 100% precision
      • Fuzzy sets
          • (Liu & Jacobsen, 2002)
          • Address only event numerical values vs. string values subscriptions
          • Medium usability, high precision
  13. Proposed Approach
      • Approximate semantic matching of events
      • Inputs: an event and a subscription, each with type(s), properties and values
      • Steps, in order:
          • Types & properties possible mappings
          • Values possible mappings
          • Pick best overall mapping
          • Post-matching event processing
  14. Background
      • Semantic similarity
          • f: Terms × Terms → [0,1], where term1 and term2 are Terms
          • f(term1, term2) = 0: absolute semantic mismatch
          • f(term1, term2) = 1: exact match
          • E.g. Football Match and Soccer Match are similar
      • Relatedness: a general case of similarity
          • E.g. Football Match and Referee are related but not similar
      • Thesaurus-based: e.g. WordNet-based
      • Distributional semantics-based: e.g. Wikipedia ESA
          • The more Wikipedia articles two terms occur in together, the more related they are
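
The co-occurrence intuition above can be sketched in a few lines of Python. This is a toy stand-in for distributional relatedness, not Wikipedia ESA itself (ESA uses weighted concept vectors built from Wikipedia). The `ARTICLES` index, its article ids and the `relatedness` function are hypothetical and invented for illustration. The score is the cosine between binary article-occurrence vectors, so it falls in [0, 1] as the definition of f requires.

```python
import math

# Hypothetical toy index: term -> ids of the articles that mention it.
ARTICLES = {
    "football match": {1, 2, 3, 4},
    "soccer match": {2, 3, 4, 5},
    "referee": {3, 4, 6},
    "parking space": {7, 8},
}


def relatedness(term1: str, term2: str) -> float:
    """Cosine similarity of binary article-occurrence vectors, in [0, 1]."""
    a, b = ARTICLES[term1], ARTICLES[term2]
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


for other in ("soccer match", "referee", "parking space"):
    print(other, round(relatedness("football match", other), 3))
```

With this toy index, terms that share more articles score higher, and terms that never co-occur score 0.
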
  15. Proposed Approach Instantiation
      • Example event:
          • type: Football Match
          • name: 2010 FIFA World Cup Final
          • referee: Howard Webb
          • team: Spain National Football Team
          • team: Netherlands National Football Team
          • location: Johannesburg
          • location: FNB stadium
      • Example subscription:
          • Event type “Soccer Match”
          • Event team “Spain”
          • Event place “South Africa”
  16. Proposed Approach Instantiation
      • Event properties: type, name, referee, team, location
      • Subscription properties: type, place, team
      • Figure: precision-recall curves (precision vs. recall, both 0 to 1) for the similarity measures Lin, Jiang&Conrath, Leacock&Chodorow, Lesk, Path, Resnik and Gloss Vector
  17. Proposed Approach Instantiation
      • Event properties: type, name, referee, team, location
      • Subscription properties: type, place, team
      • Determine top m correspondence candidates: RankSim_Jiang&Conrath(ps, pe)
      • Measure properties relatedness: fP = Min(1, m - RankSim_Jiang&Conrath(ps, pe) + 1) * WikipediaESA(ps, pe)
  18. Proposed Approach Instantiation
      • Event properties: type, name, referee, team, location
      • Subscription properties: type, place, team
      • Top 1 mapping (90%), event property to subscription property: type to type, location to place, team to team
      • Top 2 mapping (40%), event property to subscription property: type to type, name to place, referee to team
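
The property-matching step above can be sketched as follows, under stated assumptions. One reading of the slide's formula is that the Jiang&Conrath rank acts as a gate (only the top m candidates survive) and Wikipedia ESA supplies the score. For ranks beyond m the slide's expression becomes non-positive, so this sketch clamps the gate at 0, which is an assumption rather than something the slides state. The `SIM` and `REL` score tables are hypothetical placeholders for real Jiang&Conrath and ESA scores.

```python
# Hypothetical score tables standing in for Jiang&Conrath (SIM) and Wikipedia ESA (REL).
SIM = {("place", "location"): 0.9, ("place", "name"): 0.4, ("place", "type"): 0.3,
       ("place", "team"): 0.2, ("place", "referee"): 0.1}
REL = {("place", "location"): 0.8, ("place", "name"): 0.3, ("place", "type"): 0.25,
       ("place", "team"): 0.2, ("place", "referee"): 0.1}


def sim(ps, pe):
    return SIM.get((ps, pe), 0.0)


def rel(ps, pe):
    return REL.get((ps, pe), 0.0)


def rank_by_similarity(ps, event_props):
    """Map each event property to its rank (1 = most similar to ps)."""
    ordered = sorted(event_props, key=lambda pe: sim(ps, pe), reverse=True)
    return {pe: position + 1 for position, pe in enumerate(ordered)}


def property_relatedness(ps, pe, event_props, m):
    """Gate by similarity rank, then score by relatedness."""
    rank = rank_by_similarity(ps, event_props)[pe]
    gate = max(0, min(1, m - rank + 1))  # 1 if rank <= m; clamped to 0 otherwise (assumption)
    return gate * rel(ps, pe)


event_props = ["type", "name", "referee", "team", "location"]
scores = {pe: property_relatedness("place", pe, event_props, m=2) for pe in event_props}
print(scores)
```

With m = 2, only the two highest-ranked candidates for "place" keep a non-zero score in this sketch.
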
  19. Proposed Approach Instantiation
      • Event values: Football Match, Howard Webb, Spain National Football Team, Netherlands National Football Team, Johannesburg, FNB stadium
      • Subscription values: Soccer Match, Spain, South Africa
      • Measure values relatedness: fV = WikipediaESA(Vs, Ve)
  20. Proposed Approach Instantiation
      • Event values: Football Match, Howard Webb, Spain National Football Team, Netherlands National Football Team, Johannesburg, FNB stadium
      • Subscription values: Soccer Match, Spain, South Africa
      • Spain National Football Team vs. Spain: 95%
      • Netherlands National Football Team vs. Spain: 30%
  21. Proposed Approach Instantiation
      • Property mappings (event: type, name, referee, team, location; subscription: type, place, team) are combined with value mappings (event: Football Match, Howard Webb, Spain National Football Team, Netherlands National Football Team, Johannesburg, FNB stadium; subscription: Soccer Match, Spain, South Africa)
      • Calculate statements relatedness: fSTMT = fP(ps, pe) * fV(vs, ve)
  22. Proposed Approach Instantiation
      • Property mappings (event: type, name, referee, team, location; subscription: type, place, team) are combined with value mappings (event: Football Match, Howard Webb, Spain National Football Team, Netherlands National Football Team, Johannesburg, FNB stadium; subscription: Soccer Match, Spain, South Africa)
      • Determine correspondent event statement Corre by Max fSTMT
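
The two steps above (multiply property and value relatedness, then keep the event statement with the maximum product) can be sketched as below. The `F_P` and `F_V` tables are hypothetical lookups. The 0.95 and 0.30 value scores echo the percentages shown on the value-mapping slide, while every other number is invented for illustration.

```python
# Hypothetical relatedness lookups: f_P over properties, f_V over values.
F_P = {("team", "team"): 1.0, ("team", "referee"): 0.5}
F_V = {
    ("Spain", "Spain National Football Team"): 0.95,
    ("Spain", "Netherlands National Football Team"): 0.30,
    ("Spain", "Howard Webb"): 0.10,
}


def statement_relatedness(sub_stmt, event_stmt):
    """f_STMT = f_P(ps, pe) * f_V(vs, ve) for one subscription/event statement pair."""
    (ps, vs), (pe, ve) = sub_stmt, event_stmt
    return F_P.get((ps, pe), 0.0) * F_V.get((vs, ve), 0.0)


def best_correspondent(sub_stmt, event_stmts):
    """Return (event statement, score) with the maximum f_STMT."""
    scored = [(stmt, statement_relatedness(sub_stmt, stmt)) for stmt in event_stmts]
    return max(scored, key=lambda item: item[1])


subscription_stmt = ("team", "Spain")
event_stmts = [
    ("team", "Spain National Football Team"),
    ("team", "Netherlands National Football Team"),
    ("referee", "Howard Webb"),
]

best_stmt, best_score = best_correspondent(subscription_stmt, event_stmts)
print(best_stmt, best_score)
```

A multiplicative combination means a statement only scores well when both its property and its value are close to the subscription's.
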
  23. Proposed Approach Instantiation
      • Post-matching event processing
          • Rank within a window
          • Complex Event Processing
          • …
  24. Experiments Overview
      • Methodology
          • Prepare an event set that reflects the required semantic heterogeneity (Wikipedia events)
          • Prepare a gold standard set of subscriptions that stress multiple aspects of semantic coupling
          • Validate suitability of semantic approximation from a precision perspective
          • Use a different event set and the same subscriptions to validate low maintainability cost (Freebase events)
      • Evaluation Criteria
          • Average interpolated Precision-Recall Curve on 11 recall points
          • Maximal F1 Score over the average curve
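
The evaluation criteria above can be sketched in Python, assuming the standard information-retrieval definition of 11-point interpolated precision (the interpolated precision at recall level r is the highest precision observed at any recall of at least r). The slides do not spell out the definition, so this is an assumption, and the function names and sample points are hypothetical.

```python
def interpolated_curve(pr_points, levels=11):
    """pr_points: list of (recall, precision) pairs for one subscription.
    Returns [(recall_level, interpolated_precision)] at evenly spaced recall levels."""
    curve = []
    for i in range(levels):
        r = i / (levels - 1)
        candidates = [p for recall, p in pr_points if recall >= r - 1e-12]
        curve.append((r, max(candidates, default=0.0)))
    return curve


def average_curves(curves):
    """Average the interpolated precision across subscriptions, level by level."""
    count = len(curves)
    return [(curves[0][i][0], sum(c[i][1] for c in curves) / count)
            for i in range(len(curves[0]))]


def max_f1(curve):
    """Maximal F1 = 2PR / (P + R) over the points of a curve."""
    return max((2 * p * r / (p + r)) if (p + r) > 0 else 0.0 for r, p in curve)


# Hypothetical raw measurements for two subscriptions.
curve_a = interpolated_curve([(0.5, 1.0), (1.0, 0.5)])
curve_b = interpolated_curve([(1.0, 1.0)])
average = average_curves([curve_a, curve_b])
print(max_f1(curve_a), max_f1(average))
```

Taking the maximum F1 over the averaged curve gives a single summary number per matcher, which is the form in which the results slides report it.
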
  25. Experiment 1: Wikipedia Events
      Event set statistics:
      • Source: structured Wikipedia Infoboxes, DBpedia 31 August 2011
      • Collection: triples directly associated to instances of dbpedia-owl:Event class
      • Data model: RDF
      • Total # of events: 20,156
      • Total # of distinct event types: 4,950
      • Total # of distinct event properties: 1,459
      • Total # of distinct event values: 500,717
      • Total # of triples: 1,502,599
      • Average # of distinct types per event: 7.42
      • Average # of distinct properties per event: 30.52
      • Average # of distinct values per event: 54.16
      • Average # of triples per event: 64.67
  26. Experiment 1: Wikipedia Events
      • Example event types: Football Match, Race, Music Festival, Space Mission, Election, 10th-Century BC Conflicts, Academic Conference, Aviation Accident, …
  27. Experiment 1: Subscription Set
      • Manually created gold standard set of subscriptions

      ID | Description | Subscription | # of relevant events | # of needed exact rules | Event type approximation | Event properties approximation | Literals and resources approximation
      1 | Football matches played by Spain in the FNB stadium | event type "Football Match"; event team "Spain national football team"; event stadium "FNB Stadium" | 1 | 1 | NO | NO | NO
      2 | Football matches played in the FNB stadium | event type "Football Match"; event place "FNB Stadium" | 2 | 2 | NO | YES | NO
      3 | Events taking place in Wembley stadium | event type "Event"; event place "Wembley Stadium" | 219 | 5 | NO | YES | Syntactic
      4 | Charity events taking place in Wembley stadium | event type "Charity"; event place "Wembley Stadium" | 29 | 6 | YES | YES | Semantic + Syntactic
      5 | Charity Rock events taking place in Wembley stadium | event type "Charity"; event type "Rock"; event place "Wembley Stadium" | 2 | 2 | YES | YES | Semantic + Syntactic
      6 | Football matches played in the UK | event type "Football Match"; event stadium "United Kingdom" | 505 | 603 | NO | YES | Background Knowledge
      7 | Football matches played by a South American team in Europe | event type "Football Match"; event team "South America"; event stadium "Europe" | 20 | 123,774 | NO | YES | Background Knowledge
  28. Experiment 1: Subscription Set
      • Manually created gold standard set of subscriptions (same table as on the previous slide, with subscription 3 expanded)
      • Subscription 3, "Events taking place in Wembley stadium": 219 relevant events, 5 needed exact rules; event type approximation NO, event properties approximation YES, literals and resources approximation Syntactic
      • Subscription template: event type "Event"; event place "Wembley Stadium"
      • SPARQL pattern 1:
          ?event rdf:type dbpedia-owl:Event.
          ?event dbpprop:stadium dbpedia:Wembley_Stadium.
      • SPARQL pattern 2:
          ?event rdf:type dbpedia-owl:Event.
          ?event dbpedia-owl:location dbpedia:Wembley_Stadium.
      • …
  29. Experiment 1: Results
      • Figure: precision-recall curves (precision vs. recall, both 0 to 1) for the subscription "Events taking place in Wembley stadium", comparing Jiang&Conrath and Wikipedia ESA
      • Figure: precision-recall curves for the subscription "Football matches played in the UK", comparing Jiang&Conrath and Wikipedia ESA
      • Figure: frequency (0% to 45%) of semantic similarity or relatedness scores (log scale, from 0 and 2^-25 up to 1) for Jiang&Conrath and Wikipedia ESA
      • Need for a hybrid matcher that combines both
  30. Experiment 1: Results
      • Hybrid matcher outperforms a single similarity or relatedness measure matcher.
      • Jiang&Conrath: maximal F1 score 70.06% (recall 80%, precision 62.31%)
      • Wikipedia ESA: maximal F1 score 44.26% (recall 80%, precision 30.59%)
      • Hybrid: maximal F1 score 75.45% (recall 90%, precision 64.94%)
      • Figure: average precision-recall curves (precision vs. recall, both 0 to 1) for Jiang&Conrath, Wikipedia ESA and Hybrid
  31. Experiment 2: Freebase Event Set
      Event set statistics:
      • Source: Freebase events dump 1 December 2011, triples current
      • Collection: triples directly associated to instances of “fbase:time.event" class
      • Data model: RDF
      • Total # of events: 84,529
      • Total # of distinct event types: 858
      • Total # of distinct event properties: 1,242
      • Total # of distinct event values: 1,199,627
      • Total # of triples: 1,859,338
      • Average # of distinct types per event: 3.33
      • Average # of distinct properties per event: 10.67
      • Average # of distinct values per event: 21.66
      • Average # of triples per event: 21.99
  32. Experiment 2: Subscription Set
      • Same as in Experiment 1.

      ID | Description | Subscription | # of relevant events | # of needed exact rules | Event type approximation | Event properties approximation | Literals and resources approximation
      1 | Football matches played by Spain in the FNB stadium | event type "Football Match"; event team "Spain national football team"; event stadium "FNB Stadium" | 1 | 1 | YES | YES | NO
      2 | Football matches played in the FNB stadium | event type "Football Match"; event place "FNB Stadium" | 8 | 2 | YES | YES | NO
      3 | Events taking place in Wembley stadium | event type "Event"; event place "Wembley Stadium" | 29 | 5 | NO | YES | NO
      4 | Charity events taking place in Wembley stadium | event type "Charity"; event place "Wembley Stadium" | 0 | - | - | - | -
      5 | Charity Rock events taking place in Wembley stadium | event type "Charity"; event type "Rock"; event place "Wembley Stadium" | 0 | - | - | - | -
      6 | Football matches played in the UK | event type "Football Match"; event stadium "United Kingdom" | 34 | 1,398 | YES | YES | Background Knowledge
      7 | Football matches played by a South American team in Europe | event type "Football Match"; event team "South America"; event stadium "Europe" | 2 | 219,600 | YES | YES | Background Knowledge
  33. Experiment 2: Results
      • Hybrid matcher gives similar results in Freebase as in DBpedia
      • Jiang&Conrath: maximal F1 score 44.60% (recall 60%, precision 35.49%)
      • Wikipedia ESA: maximal F1 score 70.73% (recall 80%, precision 63.39%)
      • Hybrid: maximal F1 score 76.33% (recall 80%, precision 72.98%)
      • Figure: average precision-recall curves (precision vs. recall, both 0 to 1) for Jiang&Conrath, Wikipedia ESA and Hybrid
  34. Conclusions
      • Approximate semantic matcher addresses subscriptions/rules maintainability cost in heterogeneous and dynamic environments
      • Approximate semantic matcher is suitable when less than 100% precision is acceptable
          • Exact matcher: 345,000 required subscriptions, maximal F1 score 100%
          • Approximate semantic matcher: 7 required subscriptions, maximal F1 score 75.89%
      • A hybrid matcher outperforms a single similarity or relatedness measure matcher.
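
The headline numbers above can be cross-checked against the per-experiment slides with a short calculation. The sketch below recomputes F1 = 2PR / (P + R) from the reported precision and recall, averages the two hybrid F1 scores, and sums the "# of needed exact rules" columns of both subscription tables. Reading the 345,000 figure as roughly that sum is an observation from the tables, not something the slides state. The variable names are hypothetical.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (same units in, same units out)."""
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported maximal F1 %) as printed on the results slides.
reported = {
    ("Experiment 1", "Jiang&Conrath"): (62.31, 80, 70.06),
    ("Experiment 1", "Wikipedia ESA"): (30.59, 80, 44.26),
    ("Experiment 1", "Hybrid"): (64.94, 90, 75.45),
    ("Experiment 2", "Jiang&Conrath"): (35.49, 60, 44.60),
    ("Experiment 2", "Wikipedia ESA"): (63.39, 80, 70.73),
    ("Experiment 2", "Hybrid"): (72.98, 80, 76.33),
}

# Largest gap between recomputed and reported F1, in percentage points.
max_gap = max(abs(f1(p, r) - reported_f1) for p, r, reported_f1 in reported.values())

# Average of the two hybrid F1 scores.
hybrid_average = (75.45 + 76.33) / 2

# "# of needed exact rules" columns from the two subscription-set tables.
exact_rules_exp1 = [1, 2, 5, 6, 2, 603, 123_774]
exact_rules_exp2 = [1, 2, 5, 1_398, 219_600]
total_exact_rules = sum(exact_rules_exp1) + sum(exact_rules_exp2)

print(round(max_gap, 4), round(hybrid_average, 2), total_exact_rules)
```

The recomputed F1 values agree with the reported ones to within rounding, and the hybrid average matches the 75.89% quoted in the abstract and conclusions.
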
  35. Future Work
      • Need to enhance the subscription set for more representativeness.
      • Approximate semantic matcher generates “uncertain” results whose impact on further event processing functions such as CEP needs to be studied