Event processing – State of the art and research challenges  AAAI 2011 Tutorial,  San Francisco, August 7th, 2011  Opher...
Slides available at: ie.technion.ac.il/~yagile/EP_Tutorial.pdf
Imagine that… A driver gets notification on the car screen: the person crossing the street is an Alzheimer patient out of ...
Agenda Introduction and roots of event processing  Players and architecture of event processing Current state of the art i...
I: Introduction and roots of event processing
What is  “event processing” anyway? or Event processing is a form of computing that performs operations on  events
In computing we processed events since early days  Network and System Management
Emerging technologies in enterprise computing (Gartner Hype Cycle, Summer 2009)
What ’s new? The analog: moving from files to DBMS  In recent years – architectures, abstractions, and dedicated  commerci...
What is an event – three views  An event  is anything that happens, or is contemplated as  happening.  The happening view ...
In daily life we often react to events..
Many times we react to combination of events within a context  The house sensor detects that the child did not arrive home...
Event Patterns Pattern detection is one of the notable functions of event processing
What we actually want to react to are – situations  TOLL VIOLATOR  FRUSTRATED CUSTOMER  Sometimes the situation is  determ...
Event processing is being used for various reasons
Ancestor: Production Rules  When Precondition  Fire Action  The precondition is implicit event when activated in forward c...
Ancestor: active databases  On event When condition Do action With coupling mode  Composite events were inherited to event...
Ancestor: Data Stream management system Source: Ankur Jain ’s website
Event processing and Data stream management? Aliases? One of them subset of the other? Totally unrelated concepts?
Ancestor: Temporal databases  There is a substantial temporal  nature to event processing.  Recently – also spatial and  s...
Ancestor: Discrete event simulation
Ancestor: Formal Verification
Ancestor: Network and system management
Ancestor:  Messaging – pub/sub middleware
II:  Players and architecture of event processing
Event Driven Architecture   Event driven architecture: asynchronous, decoupled; each  component is autonomic.
Fast Flower Delivery Flower Store Van Driver Ranking and Reporting  System Bid Request  Delivery Bid  Assignments, Bid ale...
Event Processing Agent Context  Event Channel Event Consumer Event Type Event Producer Global State  The seven  Building b...
Event processing network
Example of EPN –  part of the FFD example
Event type definition  Detection time, Occurrence time, source, Certainty… Stock id, quote, volume… Free comments…
Producer – State Observer in workflows  State observer Push:  Instrumentation points; Pull: Query the state
Producer – Code instrumentation
Producer – syndication
Producers –  video streams to events
Producer – sensors
Producer and consumer - Sixth sense
Twitter as a producer and consumer
Consumer - Performance monitoring dashboard
Consumer - Ambient Orb
Event Processing Agent Filter Transform Detect Pattern Translate Aggregate Split  Compose Enrich Project Event Processing ...
The EPA picture
Filter EPA A filter EPA is an EPA that performs filtering only, and has no matching or derivation steps, so it does not tr...
Transform EPA sub types
Sample of pattern types <ul><li>all  pattern is satisfied when the relevant event set contains at least one instance of ea...
Pattern detection example  Pattern name:  Manual Assignment Preparation  Pattern Type: relative N highest  Context: Bid In...
Our entire culture is context sensitive  <ul><li>In the play  “The Tea house of the August Moon” one of the characters say...
Context has three distinct roles (which may be combined)  Partition the incoming events  The events that relate to each cu...
Context Definition  A   context  is a named specification of conditions that groups event instances so that they can be pr...
Context Types Context Fixed location Entity distance location Event distance location Spatial State Oriented Fixed interva...
Context Types Examples Spatial State Oriented Temporal Context “ Every day between 08:00 and 10:00 AM ” “ A week after bor...
III: The current state of the art in event processing
An Observation The Babylon Tower symbolizes the tendency Of humanity  to talk in multiple languages. <ul><li>The Event Pro...
The Babylon tower and current state of the practice
StreamBase Studio
StreamBase Pattern Matching
CCL Studio (Coral8    Sybase)
CCL – Pattern Matching <ul><li>RFID monitoring application  </li></ul><ul><ul><li>Checks if a tag has been seen by readers...
Microsoft StreamInsight  var topfive = (from window in inputStream.Snapshot() from e in window orderby e.f ascending, e.i...
Esper EPL  – FFD Example /* *  Not delivered up after 10 mins (600 secs) of the request target delivery time */ insert int...
ruleCore - Reakt <ul><li>Event stream view  - a unique context of events </li></ul><ul><ul><li>a view contains a window in...
Amit - Situation
IBM WebSphere Business Events
Apama EPL – FFD Examples
Performance benchmarks There is a large variance among applications, thus a collection of benchmarks should be devised, an...
Performance benchmarks – cont. Adi A., Etzion O. Amit - the situation manager. The VLDB Journal – The International Journa...
Throughput  Input throughput output throughput Processing throughput Measures: number of input events that the system can ...
Latency latency In the E2E level it is defined as the elapsed time  FROM the time-point when the producer emits an input e...
Performance goals and metrics  <ul><li>Multi-objective optimization function: </li></ul><ul><ul><li>min(  *avg latency + ...
Scalability in event processing: various dimensions # of producers   # of input events  # of EPA types # of  concurrent  r...
Scalability solutions Significant progress in scalability enablers that provides feasibility for a system based on large s...
IV: Challenges in event processing systems
Challenges <ul><li>Inexact Event Processing </li></ul><ul><li>Predictive Event Processing </li></ul><ul><li>Use of Machine...
Inexact event processing
Uncertain situations  False positive: The pattern is matched; The real-world situation  does not occur  False negative: Th...
Temporal indeterminacy  T1 T2
Challenges <ul><li>Inexact Event Processing </li></ul><ul><li>Predictive Event Processing </li></ul><ul><li>Use of Machine...
Predictive Event Processing (1) VS. Photo by Michael Gray, Flickr
Predictive Event Processing (2) VS. +
Predictive Event Patterns <ul><ul><li>Pattern    Future event, probability, time interval  </li></ul></ul><ul><ul><li>“ 4...
Limitations of the use of rules in specifying predictive event patterns  <ul><li>Limitations: </li></ul><ul><ul><li>Partia...
Dynamic event prediction Time Series  Prediction Graphical models  Temporal Graphical models
Graphical Model for Missing a Flight (Logistics Scenario)
Predictive Model for Missing a Flight (Logistics Scenario)
Predictive Model for Missing a Flight (Logistics Scenario)
Predictive Model for Missing a Flight (Logistics Scenario)
Continuous Time Bayesian Networks (CTBN, Nodelman et al, 2002) <ul><li>Can be used to model probabilistic and temporal rel...
Anomaly Detection in Networks (Xu and Shelton, 2008)
CTBN model (Xu and Shelton, 2008)
Challenges <ul><li>Inexact Event Processing </li></ul><ul><li>Predictive Event Processing </li></ul><ul><li>Use of Machine...
Machine Learning in EP Systems <ul><li>Requires for training predictive capabilities: </li></ul><ul><ul><li>Learn paramete...
Event Pattern Discovery <ul><li>Most (almost all) deployed systems today rely on user input to obtain complex event patter...
Requirements of Data Mining Algorithms <ul><li>What DM algorithms should be able to do? </li></ul><ul><ul><li>Low frequenc...
Low Frequency Patterns <ul><li>Detecting rare events: </li></ul><ul><ul><li>Frauds, attacks </li></ul></ul><ul><ul><li>Pre...
Temporal Windows <ul><li>Time window should be output of the DM process </li></ul><ul><li>Work by Mannila et al. 1997 : WI...
Assertions and Thresholds <ul><li>Pattern  “3 cash deposits in one day” may have no predictive value </li></ul><ul><li>BUT...
Other kinds of patterns <ul><li>We may be interested in patterns which are not sequential: </li></ul><ul><ul><li>“ All”, “...
Challenges <ul><li>Inexact Event Processing </li></ul><ul><li>Predictive Event Processing </li></ul><ul><li>Use of Machine...
From Reactive to Proactive
Example: Call Center Queue Assignment MDP Model: States (S) : queue status Actions (a) : assignments Reward (R) : penalty ...
Proactivity: Call Center Example
Proactive Event-Driven Computing (1) predict (states, events) Real-time decision  Proactive action events Event processing...
Energy Scenario Detect   Predict Decide   Act Consumption Level Production Level State Generator Failure Generator Fixed...
Detect Monitor shipment progress and various related alerts (traffic, cargo handling time at airport, carriers being late)...
Personal reschedule  Detect   I got out of the house 20 Minutes late; there are three spots of traffic congestion on the w...
Electric car – battery replacement overload  Detect Tracking the cars driving within a certain area and  their battery sta...
Portfolio tuning  Detect Track corporate actions, news, exchange prices, and  rumors about all securities in my portfolio ...
Predict <ul><li>Uncertain Rules </li></ul><ul><li>Bayesian Network </li></ul><ul><li>Classifiers: </li></ul><ul><ul><ul><l...
Event Processing DM vs. AI DM <ul><li>EP: scalable decision making, under large streams of online information </li></ul><ul...
Decision Making for Proactive E-D Computing <ul><li>Decision Rules: EPAs that react to future events </li></ul><ul><li>Mar...
Proactive EP: Challenges to the EPN <ul><li>Event Life Span </li></ul><ul><li>Response from Actuators </li></ul><ul><li>Mu...
Challenges <ul><li>Inexact Event Processing </li></ul><ul><li>Predictive Event Processing </li></ul><ul><li>Use of Machine...
Correctness  The ability of a developer to create correct implementation for all cases (including the boundaries)   Observ...
Some correctness topics The right interpretation of language constructs  The right order of events  The right classificati...
The right interpretation of language constructs – example All (E1, E2) – what do we mean? A customer both sells and buys t...
Fine tuning of the semantics (I) When should the derived event be emitted?  When the Pattern is matched ? At the window end?
Fine tuning of the semantics (II) How many instances of derived events should be emitted?  Only once?  Every time there is...
Fine tuning of the semantics (III) What happens if the same event happens several times?  Only one – first, last, higher/l...
Fine tuning of the semantics (IV) Can we consume or reuse events that participate in a match?
Fine tuning of semantics – conclusion  <ul><li>Some languages have explicit policies: </li></ul><ul><li>Example: CCL Keep ...
The right order of events - scenario <ul><li>Bid scenario- ground rules: </li></ul><ul><li>All bidders that issued a bid w...
Ordering in a distributed environment  -  possible issues Even if the occurrence time of an event is accurate,  it might a...
Clock accuracy in the source  Clock synchronization Time server,  example:  http:// tf.nist.gov/service/its.htm
Buffering technique <ul><li>Assumptions: </li></ul><ul><ul><li>Events are reported by the producers as soon as they occur;...
Retrospective compensation  Out of order event Recalculation Retraction of previous EPA results Not always possible!
Classification to windows - scenario Calculate Statistics for each Player (aggregate per quarter) Calculate Statistics for...
V: Summary
Event processing is an emerging  technology  Potential for mutually beneficial interaction with AI    Make the next gener...
REFERENCES (StoA of Event Processing) <ul><li>Opher Etzion and Peter Niblett,  Event Processing in Action , Manning, 2010....
REFERENCES (Challenges Section) <ul><li>H. Mannila, H. Toivonen, and A. Inkeri Verkamo,  Discovery of frequent episodes in...

Aaai 2011 event processing tutorial


AAAI 2011 - Tutorial: Introduction to event processing and challenges for the next generations of event processing of interest to the AI community


  1. 1. Event processing – State of the art and research challenges. AAAI 2011 Tutorial, San Francisco, August 7th, 2011. Opher Etzion ( [email_address] ), Yagil Engel ( [email_address] )
  2. 2. Slides available at: ie.technion.ac.il/~yagile/EP_Tutorial.pdf
  3. 3. Imagine that… A driver gets notification on the car screen: the person crossing the street is an Alzheimer patient out of his regular route, he lives in 5 King Street. A national park gets information on all cars heading to the park from the car computer; can open more parking lots and notify cars that the park will be closed.
  4. 4. Agenda Introduction and roots of event processing Players and architecture of event processing Current state of the art in event processing Challenges in event processing systems Summary I II III IV V
  5. 5. I: Introduction and roots of event processing
  6. 6. What is “event processing” anyway? or Event processing is a form of computing that performs operations on events
  7. 7. In computing we have processed events since the early days: Network and System Management
  8. 8. Emerging technologies in enterprise computing (Gartner Hype Cycle, Summer 2009)
  9. 9. What's new? The analogy: moving from files to DBMS. In recent years, architectures, abstractions, and dedicated commercial products have emerged to support functionality that was traditionally carried out within regular programming. For some applications this is an improvement in TCO; for others it breaks the cost-effectiveness barrier.
  10. 10. What is an event – three views. The happening view: an event is anything that happens, or is contemplated as happening. The state change view: an event is a state change of anything. The detectable condition view: an event is a detectable condition that can trigger a notification.
  11. 11. In daily life we often react to events..
  12. 12. Many times we react to a combination of events within a context. The house sensor detects that the child did not arrive home within 2 hours of the scheduled end of classes for the day. I want to be notified when my own investment portfolio is down 5% since the start of the trading day; have an agent call me when I am available, send an SMS when I am in a meeting, and an email when I am out of the office.
  13. 13. Event Patterns Pattern detection is one of the notable functions of event processing
  14. 14. What we actually want to react to are situations, e.g. TOLL VIOLATOR, FRUSTRATED CUSTOMER. Sometimes the situation is determined by detecting that some pattern occurred in the flowing events (toll violation). Sometimes the events can only approximate, or indicate with some certainty, that the situation has occurred (frustrated customer).
  15. 15. Event processing is being used for various reasons
  16. 16. Ancestor: Production rules. When Precondition, Fire Action. The precondition is an implicit event when the rule is activated in forward chaining.
  17. 17. Ancestor: Active databases. On event, When condition, Do action, With coupling mode. Composite events were inherited by event processing.
  18. 18. Ancestor: Data stream management systems. Source: Ankur Jain's website
  19. 19. Event processing and Data stream management? Aliases? One of them subset of the other? Totally unrelated concepts?
  20. 20. Ancestor: Temporal databases There is a substantial temporal nature to event processing. Recently – also spatial and spatio-temporal functions are being added
  21. 21. Ancestor: Discrete event simulation
  22. 22. Ancestor: Formal Verification
  23. 23. Ancestor: Network and system management
  24. 24. Ancestor: Messaging – pub/sub middleware
  25. 25. II: Players and architecture of event processing
  26. 26. Event Driven Architecture Event driven architecture: asynchronous, decoupled; each component is autonomic.
  27. 27. Fast Flower Delivery Flower Store Van Driver Ranking and Reporting System Bid Request Delivery Bid Assignments, Bid alerts, Assign Alerts Control System GPS Location Location Service Location Driver ’s Guild Ranking and reports Delivery confirmation Pick Up confirmation Ranked drivers / automatic assignment Bid System Store Preferences Delivery Request Assignment System Manual Assignment Assignment Assignments, Pick Up Alert Delivery Alert http://www.ep-ts.com/EventProcessingInAction
  28. 28. Event Processing Agent Context Event Channel Event Consumer Event Type Event Producer Global State The seven Building blocks
  29. 29. Event processing network
  30. 30. Example of EPN – part of the FFD example
  31. 31. Event type definition Detection time, Occurrence time, source, Certainty… Stock id, quote, volume… Free comments…
  32. 32. Producer – State Observer in workflows State observer Push: Instrumentation points; Pull: Query the state
  33. 33. Producer – Code instrumentation
  34. 34. Producer – syndication
  35. 35. Producers – video streams to events
  36. 36. Producer – sensors
  37. 37. Producer and consumer - Sixth sense
  38. 38. Twitter as a producer and consumer
  39. 39. Consumer - Performance monitoring dashboard
  40. 40. Consumer - Ambient Orb
  41. 41. Event Processing Agent Filter Transform Detect Pattern Translate Aggregate Split Compose Enrich Project Event Processing Agents
  42. 42. The EPA picture
  43. 43. Filter EPA: A filter EPA is an EPA that performs filtering only, and has no matching or derivation steps, so it does not transform the input event.
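The filter EPA definition above can be illustrated with a minimal Python sketch. The dictionary-based event representation, the function name `filter_epa`, and the sample stock quotes are assumptions made for illustration; they are not part of the tutorial or of any product API.

```python
from typing import Callable, Iterable, Iterator

Event = dict  # hypothetical event representation: attribute name -> value


def filter_epa(events: Iterable[Event],
               predicate: Callable[[Event], bool]) -> Iterator[Event]:
    """Pass through, unchanged, only the events that satisfy the predicate."""
    for event in events:
        if predicate(event):
            # The very same event object is forwarded: a filter performs
            # no matching and derives no new event.
            yield event


# Hypothetical input events with the stock attributes named on the
# event type definition slide (stock id, quote, volume).
quotes = [
    {"stock_id": "ABC", "quote": 101.5, "volume": 200},
    {"stock_id": "XYZ", "quote": 12.0, "volume": 50_000},
]
large_trades = list(filter_epa(quotes, lambda e: e["volume"] >= 10_000))
```

Because the function is a generator, it can sit in front of other agents in an event processing network and hand events on one at a time.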
  44. 44. Transform EPA sub types
  45. 45. Sample of pattern types <ul><li>all pattern is satisfied when the relevant event set contains at least one instance of each event type in the participant set </li></ul><ul><li>any pattern is satisfied if the relevant event set contains an instance of any of the event types in the participant set </li></ul><ul><li>absence pattern is satisfied when there are no relevant events </li></ul><ul><li>relative N highest values pattern is satisfied by the events which have the N highest values of a specific attribute over all the relevant events, where N is an argument </li></ul><ul><li>value average pattern is satisfied when the value of a specific attribute, averaged over all the relevant events, satisfies the value average threshold assertion. </li></ul><ul><li>always pattern is satisfied when all the relevant events satisfy the always pattern assertion </li></ul><ul><li>sequence pattern is satisfied when the relevant event set contains at least one event instance for each event type in the participant set, and the order of the event instances is identical to the order of the event types in the participant set. </li></ul><ul><li>increasing pattern is satisfied by an attribute A if for all the relevant events, e1 << e2 ⇒ e1.A < e2.A </li></ul><ul><li>relative max distance pattern is satisfied when the maximal distance between any two relevant events satisfies the max threshold assertion </li></ul><ul><li>moving toward pattern is satisfied when for any pair of relevant events e1, e2 we have e1 << e2 ⇒ the location of e2 is closer to a certain object than the location of e1. </li></ul>
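Several of the pattern types listed above can be read as predicates over the relevant event set. The Python sketch below is one possible reading, not the definition used by any particular engine. It assumes events are dictionaries with hypothetical `type` and `time` keys, that timestamps are distinct, and that the sequence pattern is satisfied by any ordered subsequence of instances.

```python
def all_pattern(events, participant_types):
    """At least one instance of every participant type is present."""
    present = {e["type"] for e in events}
    return set(participant_types) <= present


def any_pattern(events, participant_types):
    """An instance of at least one participant type is present."""
    wanted = set(participant_types)
    return any(e["type"] in wanted for e in events)


def absence_pattern(events):
    """There are no relevant events at all."""
    return len(events) == 0


def sequence_pattern(events, participant_types):
    """Instances of the participant types occur in the listed order."""
    ordered = sorted(events, key=lambda e: e["time"])
    matched = 0
    for e in ordered:
        if matched < len(participant_types) and e["type"] == participant_types[matched]:
            matched += 1
    return matched == len(participant_types)


def increasing_pattern(events, attribute):
    """For time-ordered events e1 << e2, e1.A < e2.A holds."""
    ordered = sorted(events, key=lambda e: e["time"])
    # Comparing neighbours is enough because "<" is transitive.
    return all(a[attribute] < b[attribute] for a, b in zip(ordered, ordered[1:]))
```

Each function takes the relevant event set of one context partition, so the context decides which events a pattern is evaluated over.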
  46. 46. Pattern detection example Pattern name: Manual Assignment Preparation Pattern Type: relative N highest Context: Bid Interval Relevant event types: Delivery Bid Pattern parameter: N = 5; value = Ranking Cardinality: Single deferred Find the five highest bids within the bid interval Taken from the Fast Flower Delivery use case
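As a rough illustration of the Manual Assignment Preparation example, the relative N highest pattern with N = 5 over the Ranking attribute can be sketched in Python with the standard `heapq.nlargest` function. The bid records and the attribute names `driver` and `ranking` are hypothetical. The "single deferred" cardinality is modelled by calling the function once, after the bid interval has closed.

```python
import heapq


def relative_n_highest(events, attribute, n):
    """Return the n events with the highest values of the given attribute."""
    return heapq.nlargest(n, events, key=lambda e: e[attribute])


# Hypothetical Delivery Bid events collected during one bid interval.
rankings = [3, 9, 1, 7, 5, 8, 2]
bids = [{"driver": f"d{i}", "ranking": r} for i, r in enumerate(rankings)]

# Evaluated once, when the bid interval ends (single deferred).
top_five = relative_n_highest(bids, "ranking", 5)
```

An engine that emitted results immediately would instead have to revise its answer as every new bid arrived, which is why the deferred cardinality suits this pattern.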
  47. 47. Our entire culture is context sensitive <ul><li>In the play “The Teahouse of the August Moon” one of the characters says: Pornography question of geography </li></ul><ul><li>This says that in different geographical contexts people view things differently </li></ul><ul><li>Furthermore, the syntax of the language (no verbs) is typical of the way the people of Okinawa talk </li></ul>When attending a concert, people do not talk or eat, and they keep their mobile phones on “silent”.
  48. 48. Context has three distinct roles (which may be combined). Partitioning the incoming events: the events that relate to each customer are processed separately. Grouping events together: grouping together events that happened in the same hour at the same location. Determining the processing: different processing for different context partitions.
  49. 49. Context Definition: A context is a named specification of conditions that groups event instances so that they can be processed in a related way. It assigns each event instance to one or more context partitions. A context may have one or more context dimensions: Temporal, Spatial, State Oriented, Segmentation Oriented.
  50. 50. Context Types: Temporal (fixed interval, event interval, sliding fixed interval, sliding event interval); Spatial (fixed location, entity distance location, event distance location); State Oriented; Segmentation Oriented.
  51. 51. Context Types Examples. Temporal: “Every day between 08:00 and 10:00 AM”, “A week after borrowing a disk”, “A time window bounded by TradingDayStart and TradingDayEnd events”. Spatial: “3 miles from the traffic accident location”, “Within an authorized zone in a manufactory”. Segmentation Oriented: “All children 2-5 years old”, “All platinum customers”. State Oriented: “Airport security level is red”, “Weather is stormy”.
  52. 52. III: The current state of the art in event processing
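To make the idea of context partitions concrete, here is a small Python sketch that combines a segmentation-oriented dimension (one partition per customer) with a fixed temporal interval (one partition per hour). The attribute names `customer` and `occurrence_time` and the one-hour interval are assumptions chosen for illustration. With non-overlapping intervals each event lands in exactly one partition; a sliding interval would assign an event to several.

```python
from collections import defaultdict


def partition_key(event, segment_attr, interval_seconds):
    """Composite context: segmentation value plus fixed-interval index."""
    window_index = int(event["occurrence_time"] // interval_seconds)
    return (event[segment_attr], window_index)


def assign_to_partitions(events, segment_attr="customer", interval_seconds=3600):
    """Group event instances by the context partition they belong to."""
    partitions = defaultdict(list)
    for event in events:
        key = partition_key(event, segment_attr, interval_seconds)
        partitions[key].append(event)
    return dict(partitions)
```

Each resulting list is the relevant event set that a pattern or aggregation would then be evaluated over, separately per partition.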
  53. 53. An Observation: The Babylon Tower symbolizes the tendency of humanity to talk in multiple languages. <ul><li>The event processing area is no different: most languages in the industry follow the hammer-and-nails syndrome and extend existing approaches: </li></ul><ul><li>imperative script languages </li></ul><ul><li>SQL extensions </li></ul><ul><li>extensions of inference rule languages </li></ul>The EPTS language analysis workgroup aims to understand the various styles and extract common functions that can be used to define what an event processing language is; this tutorial is an interim report. It does not seem that we will settle on a single programming style in the near future.
  54. 54. The Babylon tower and current state of the practice
  55. 55. StreamBase Studio
  56. 56. StreamBase Pattern Matching
  57. 57. CCL Studio (Coral8 → Sybase)
  58. 58. CCL – Pattern Matching <ul><li>RFID monitoring application </li></ul><ul><ul><li>Checks if a tag has been seen by readers A and B, then C, but not D, within a 10 second window. </li></ul></ul>Insert into StreamAlerts Select StreamA.id From StreamA a, StreamB b, StreamC c, StreamD d Matching [10 seconds: a && b, c, !d] On a.id = b.id = c.id = d.id
  59. 59. Microsoft StreamInsight var topfive = (from window in inputStream.Snapshot() from e in window orderby e.f ascending, e.i descending select e).Take(5); var avgCount = from v in inputStream group v by v.i % 4 into eachGroup from window in eachGroup.Snapshot() select new { avgNumber = window.Avg(e => e.number) };
  60. 60. Esper EPL – FFD Example /* Not delivered within 10 mins (600 secs) after the request target delivery time */ insert into AlertW(requestId, message, driver, timestamp) select a.requestId, "not delivered", a.driver, current_timestamp() from pattern[ every a=Assignment -> (timer:interval(600 + (a.deliveryTime-current_timestamp)/1000) and not DeliveryConfirmation(requestId = a.requestId) and not NoOneToReceiveMSG(requestId = a.requestId)) ];
  61. 61. ruleCore - Reakt <ul><li>Event stream view - a unique context of events </li></ul><ul><ul><li>a view contains a window into the inbound stream of events and contains commonly only semantically related events </li></ul></ul><ul><li>Situation - an interesting combination of multiple events as they occur over time </li></ul><ul><ul><li>An item with an RFID tag being picked up from the shelf and then moving past the checkout without being paid for </li></ul></ul><ul><li>Rule - an active event processing entity reacting to specific combinations of inbound events over time </li></ul><ul><li>Action - the last part of a rule's evaluation in response to a detected situation </li></ul>
  62. 62. Amit - Situation
  63. 63. IBM WebSphere Business Events
  64. 64. Apama EPL – FFD Examples
  65. 65. Performance benchmarks: There is a large variance among applications; thus a collection of benchmarks should be devised, and each application should be classified to a benchmark. Some classification criteria: application complexity, filtering rate, required performance metrics.
  66. 66. Performance benchmarks – cont. Adi A., Etzion O. Amit - the situation manager. The VLDB Journal – The International Journal on Very Large Databases, Volume 13, Issue 2, 2004. Mendes M., Bizarro P., Marques P. Benchmarking event processing systems: current state and future directions. WOSP/SIPEW 2010: 259-260. Previous studies indicate that there is a major performance degradation as application complexity increases.
  67. 67. Throughput. Input throughput measures the number of input events that the system can digest within a given time interval. Processing throughput measures total processing time / # of events processed within a given time interval. Output throughput measures the # of events that were emitted to consumers within a given time interval.
  68. 68. Latency. At the E2E level, latency is defined as the elapsed time FROM the time-point when the producer emits an input event TO the time-point when the consumer receives an output event. But an input event may not result in an output event: it may be filtered out, participate in a pattern that does not result in pattern detection, or participate in a deferred operation (e.g. aggregation). Similar definitions apply at the EPA level or path level.
  69. 69. Performance goals and metrics <ul><li>Multi-objective optimization function: </li></ul><ul><ul><li>min( α * avg latency + (1 - α) * (1 / throughput) ) </li></ul></ul>Other goals: max throughput; all / 80% have max/avg latency < δ; all / 90% of time units have throughput > Ω; min-max latency; min-avg latency; latency leveling
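The throughput and end-to-end latency definitions can be turned into simple measurements. The Python sketch below assumes timestamps in seconds and a hypothetical list of (emit time, receive time) pairs, recorded only for input events that actually led to an output event. Inputs that were filtered out or are still waiting in a deferred operation are left out, as the latency slide notes.

```python
from statistics import mean


def input_throughput(arrival_times, start, end):
    """Input events per second digested in the half-open interval [start, end)."""
    count = sum(1 for t in arrival_times if start <= t < end)
    return count / (end - start)


def end_to_end_latencies(pairs):
    """pairs: (producer_emit_time, consumer_receive_time) per output event."""
    return [received - emitted for emitted, received in pairs]


# Hypothetical measurements, in seconds.
arrivals = [0.1, 0.5, 0.9, 1.2]
latencies = end_to_end_latencies([(0.0, 0.5), (1.0, 1.25)])
avg_latency = mean(latencies)
max_latency = max(latencies)
```

Average and maximum latency are computed separately because the performance goals that follow distinguish between bounding the average and bounding the worst case.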
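The multi-objective function can be evaluated directly once a weight is chosen. In the Python sketch below the weight is called `alpha`, which is an assumed name for the weighting symbol in the formula, and the two candidate configurations with their latency and throughput figures are invented for illustration.

```python
def weighted_objective(avg_latency, throughput, alpha):
    """alpha * avg latency + (1 - alpha) * (1 / throughput); lower is better."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return alpha * avg_latency + (1.0 - alpha) * (1.0 / throughput)


# Hypothetical configurations: (average latency in s, throughput in events/s).
configs = {"A": (0.5, 100.0), "B": (0.25, 50.0)}
best = min(configs, key=lambda name: weighted_objective(*configs[name], alpha=0.5))
```

With equal weights, configuration B scores 0.135 against 0.255 for A, so halving the latency outweighs halving the throughput here. Setting `alpha` close to 0 reverses the choice, which shows how strongly the result depends on the weight.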
  70. 70. Scalability in event processing: various dimensions # of producers # of input events # of EPA types # of concurrent runtime instances # of concurrent runtime contexts Internal state size # of consumers # of derived events Processing complexity
  71. 71. Scalability solutions: Significant progress in scalability enablers makes systems based on large-scale event sources, event quantities, computations and actuators feasible. Smart placement of processing elements with dynamic load balancing. Fault tolerance techniques enable trustworthy automatic processing. Virtualization (scale-in). Use of parallel processing (multi-core and GPU processors) without extra programming effort.
  72. 72. IV: Challenges in event processing systems
  73. 73. Challenges <ul><li>Inexact Event Processing </li></ul><ul><li>Predictive Event Processing </li></ul><ul><li>Use of Machine Learning </li></ul><ul><li>From Reactive to Proactive </li></ul><ul><li>Correctness </li></ul>
  74. 74. Inexact event processing
  75. 75. Uncertain situations False positive: The pattern is matched; The real-world situation does not occur False negative: The pattern is not matched; The real-world situation occurs
  76. 76. Temporal indeterminacy T1 T2
  77. 77. Challenges <ul><li>Inexact Event Processing </li></ul><ul><li>Predictive Event Processing </li></ul><ul><li>Use of Machine Learning </li></ul><ul><li>From Reactive to Proactive </li></ul><ul><li>Correctness </li></ul>
  78. 78. Predictive Event Processing (1) VS. Photo by Michael Gray, Flickr
  79. 79. Predictive Event Processing (2) VS. +
  80. 80. Predictive Event Patterns <ul><ul><li>Pattern → Future event, probability, time interval </li></ul></ul><ul><ul><li>“4 high value deposits from different geographic locations within 3 days” </li></ul></ul><ul><ul><ul><li>→ “0.6 chance for a large transfer abroad, in 1 day” </li></ul></ul></ul>“Output event will occur with distribution D over interval (t1,t2)”. Stock decrease of > 5% in 3 hours → Good chance for 2% increase within 2 hours
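A predictive event pattern maps a detected pattern to a future event, a probability, and a time interval. One possible data structure for such a rule is sketched below in Python. The class names, field names, and the event type `LargeTransferAbroad` are hypothetical, and the probability is a fixed number here rather than a full distribution D over (t1, t2).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedEvent:
    event_type: str
    probability: float
    earliest: float  # start of the predicted interval, in seconds
    latest: float    # end of the predicted interval, in seconds


@dataclass(frozen=True)
class PredictiveRule:
    pattern_name: str
    predicted_type: str
    probability: float
    horizon_seconds: float

    def fire(self, detection_time: float) -> PredictedEvent:
        """Derive the predicted future event once the pattern is detected."""
        return PredictedEvent(
            self.predicted_type,
            self.probability,
            detection_time,
            detection_time + self.horizon_seconds,
        )


rule = PredictiveRule(
    pattern_name="4 high value deposits from different locations within 3 days",
    predicted_type="LargeTransferAbroad",
    probability=0.6,
    horizon_seconds=24 * 3600,
)
prediction = rule.fire(detection_time=1000.0)
```

The single hard-coded probability is exactly the "hard-coded probabilistic relationship" that makes rules limiting, and it is what a learned model would replace.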
  81. 81. Limitations of the use of rules in specifying predictive event patterns <ul><li>Limitations: </li></ul><ul><ul><li>Partial patterns </li></ul></ul><ul><ul><li>Uncertain input events </li></ul></ul><ul><ul><li>Complex relationship between random variables </li></ul></ul>Rule = hard-coded probabilistic Relationship
  82. 82. Dynamic event prediction Time Series Prediction Graphical models Temporal Graphical models
  83. 83. Graphical Model for Missing a Flight (Logistics Scenario)
  84. 84. Predictive Model for Missing a Flight (Logistics Scenario)
  85. 85. Predictive Model for Missing a Flight (Logistics Scenario)
  86. 86. Predictive Model for Missing a Flight (Logistics Scenario)
  87. 87. Continuous Time Bayesian Networks (CTBN, Nodelman et al, 2002) <ul><li>Can be used to model probabilistic and temporal relationships between events, e.g., applied to the problem of detecting host-level attacks in network traffic (Xu and Shelton, 2008) </li></ul>
  88. 88. Anomaly Detection in Networks (Xu and Shelton, 2008)
  89. 89. CTBN model (Xu and Shelton, 2008)
  90. 90. Challenges <ul><li>Inexact Event Processing </li></ul><ul><li>Predictive Event Processing </li></ul><ul><li>Use of Machine Learning </li></ul><ul><li>From Reactive to Proactive </li></ul><ul><li>Correctness </li></ul>
  91. 91. Machine Learning in EP Systems <ul><li>Required for training predictive capabilities: </li></ul><ul><ul><li>Learn parameters / structure of graphical models </li></ul></ul><ul><ul><li>Learn predictive rules </li></ul></ul><ul><li>Discover the patterns used by EPAs </li></ul>
  92. 92. Event Pattern Discovery <ul><li>Most (almost all) deployed systems today rely on user input to obtain complex event patterns </li></ul><ul><li>How can (business) users obtain these patterns? </li></ul><ul><ul><li>Users do not know all the patterns that are relevant </li></ul></ul><ul><ul><li>System must be built and maintained by domain experts </li></ul></ul>
  93. 93. Requirements of Data Mining Algorithms <ul><li>What DM algorithms should be able to do? </li></ul><ul><ul><li>Low frequency patterns </li></ul></ul><ul><ul><li>Temporal Windows </li></ul></ul><ul><ul><li>Assertions and Thresholds </li></ul></ul><ul><ul><li>Non-Standard patterns </li></ul></ul>
  94. 94. Low Frequency Patterns <ul><li>Detecting rare events: </li></ul><ul><ul><li>Frauds, attacks </li></ul></ul><ul><ul><li>Predict crashes </li></ul></ul><ul><ul><li>Equipment failure </li></ul></ul><ul><ul><li>Natural disasters </li></ul></ul><ul><li>Solutions: </li></ul><ul><ul><li>Low support mining </li></ul></ul><ul><ul><li>Unsupervised learning for anomaly detection </li></ul></ul>
  95. 95. Temporal Windows <ul><li>The time window should be an output of the DM process </li></ul><ul><li>Work by Mannila et al., 1997: WINEPI </li></ul>
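The window-based frequency idea behind WINEPI can be sketched as follows: slide a window of a given width over the event sequence and report the fraction of windows in which an episode occurs. The Python code below is a simplified illustration, not the published algorithm. It assumes integer, distinct timestamps, handles only serial (ordered) episodes, and recomputes every window from scratch.

```python
def occurs_serially(window_events, episode):
    """True if the episode's event types appear in order within the window."""
    matched = 0
    for _, event_type in window_events:  # events are already in time order
        if matched < len(episode) and event_type == episode[matched]:
            matched += 1
    return matched == len(episode)


def window_frequency(sequence, episode, width):
    """Fraction of sliding windows [start, start + width) containing the episode.

    sequence: list of (integer_time, event_type) pairs sorted by time.
    """
    first, last = sequence[0][0], sequence[-1][0]
    # Every window that overlaps the sequence, shifted one time unit at a time.
    starts = range(first - width + 1, last + 1)
    hits = 0
    for start in starts:
        window = [(t, e) for t, e in sequence if start <= t < start + width]
        if occurs_serially(window, episode):
            hits += 1
    return hits / len(starts)


# Hypothetical event sequence.
sequence = [(1, "A"), (2, "B"), (5, "A"), (9, "B")]
frequencies = {w: window_frequency(sequence, ("A", "B"), w) for w in (3, 5)}
```

Here the frequency of "A then B" is 2/11 for width 3 and 5/13 for width 5. Comparing frequencies across candidate widths is one simple way to let the window size come out of the mining process instead of being fixed in advance.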
  96. 96. Assertions and Thresholds <ul><li>Pattern “3 cash deposits in one day” may have no predictive value </li></ul><ul><li>BUT </li></ul><ul><li>“ 3 cash deposits above $10000 from 3 different locations” does </li></ul><ul><li>Multiattribute mining (Hellerstein et al.) </li></ul>
  97. 97. Other kinds of patterns <ul><li>We may be interested in patterns which are not sequential: </li></ul><ul><ul><li>“All”, “Absence”, “Max Value”, “Sometime” </li></ul></ul><ul><ul><li>“If there is no deposit to this account in the last year,…” </li></ul></ul><ul><ul><li>“If the maximal value of deposit to this account in the last year is $5,…” </li></ul></ul><ul><ul><li>“If at least one of the deposits were made from abroad,…” </li></ul></ul>
  98. 98. Challenges <ul><li>Inexact Event Processing </li></ul><ul><li>Predictive Event Processing </li></ul><ul><li>Use of Machine Learning </li></ul><ul><li>From Reactive to Proactive </li></ul><ul><li>Correctness </li></ul>
  99. 99. From Reactive to Proactive
  100. 100. Example: Call Center Queue Assignment MDP Model: <ul><li>States (S): queue status </li></ul><ul><li>Actions (a): assignments </li></ul><ul><li>Reward (R): penalty for waiting and blocking </li></ul><ul><li>Transition (T): call arrival, call ending </li></ul>
  101. 101. Proactivity: Call Center Example
  102. 102. Proactive Event-Driven Computing (1) Pipeline: events → Detect / Derive (event processing: filter, transform, match patterns) → Predict (states, events) → Decide (real-time decision) → Act (proactive action) → events. Proactive event-driven computing is a new paradigm aimed at predicting problems or opportunities before they occur, and changing the course of action to mitigate or leverage them.
  103. 103. Energy Scenario (Detect, Predict, Decide, Act). Diagram labels: Consumption Level; Production Level; State; Generator Failure; Generator Fixed; Weather Forecast (sun, wind, temp, storm); Consumption Forecast; Production Forecast; Outage Prediction; Many Failed Generators Prediction; Call for Urgent Generators Fix; Activate Expensive Diesel Generators; Declare “Peak Hours” for Tomorrow; Activate Rolling Blackout
  104. 104. Critical Shipment Logistics. Detect: Monitor shipment progress and various related alerts (traffic, cargo handling time at the airport, carriers being late). Predict: On the current route, the shipment will be 3 hours late and we will incur a high penalty. Decide: Find an alternative route which (given the new conditions) is faster than the previous route. Act: Generate cargo reservations, reroute the shipment.
  105. 105. Personal reschedule. Detect: I left the house 20 minutes late; there are three spots of traffic congestion on the way to the office; it is raining; and I have an important meeting in 25 minutes! Predict: I am not going to get to the meeting, not even close! Decide: Check whether there is a qualified person who can replace me in this meeting and has a lower-priority task for its duration, and reschedule his/her other obligations; alternatively, check whether there is another time slot later in the day to which the meeting can be moved, and get a decision! Act: Notify everyone involved of the rescheduling.
  106. 106. Electric car – battery replacement overload. Background: A company leases electric cars that can drive up to 100 miles; it provides both personal and public battery charge spots, and robotic battery replacement service stations as part of the lease. Detect: Track the cars driving within a certain area and their battery status. Predict: In 2 hours the service stations in the area will be out of charged batteries. Decide: Whether there are spare batteries available nearby that can be shipped by car, or a helicopter needs to be dispatched to ship batteries from the central store. Act: Load the batteries on the selected means of transportation and start the journey!
  107. 107. Portfolio tuning. Detect: Track corporate actions, news, exchange prices, and rumors about all securities in my portfolio. Predict: My portfolio is going to exceed my personal risk limit within 1 hour. Decide: Mark the securities to be sold and the best timing to sell; find an alternative to buy that keeps the portfolio within the risk limit. Act: Issue buy/sell orders.
  108. 108. Proactive Event-Driven Computing (2) Predict <ul><li>Uncertain Rules </li></ul><ul><li>Bayesian Network </li></ul><ul><li>Classifiers: </li></ul><ul><ul><ul><li>Decision trees </li></ul></ul></ul><ul><ul><ul><li>Naïve Bayes </li></ul></ul></ul><ul><ul><ul><li>… </li></ul></ul></ul><ul><li>… </li></ul>Decide <ul><li>Temporal Decision Process </li></ul><ul><li>Optimization tools (black box) </li></ul>Diagram labels: Events; Analytics; Probabilistic events; Actions
  109. 109. Event Processing DM vs. AI DM <ul><li>EP: scalable decision making over large streams of online information </li></ul><ul><li>AI: state-based, decision-theoretic deliberation </li></ul><ul><li>EP+AI: EP synthesizes streams into meaningful bits of information; AI operates on the reduced state space </li></ul>
  110. 110. Decision Making for Proactive E-D Computing <ul><li>Decision Rules: EPAs that react to future events </li></ul><ul><li>Markov Decision Process </li></ul><ul><ul><li>Model for policy optimization under uncertainty </li></ul></ul><ul><ul><li>Model must be updated when a predictive EP module predicts relevant future events </li></ul></ul><ul><ul><ul><li>Requires online adjustment of policy </li></ul></ul></ul><ul><ul><ul><ul><li>Brafman, Domshlak, Engel, and Feldman, AAAI 2011 </li></ul></ul></ul></ul><ul><li>External Optimization tools </li></ul><ul><ul><li>E.g., route planner for the logistics scenario </li></ul></ul><ul><ul><li>Parameterization, or shared resource information </li></ul></ul>
  111. 111. Proactive EP: Challenges to the EPN <ul><li>Event Life Span </li></ul><ul><li>Response from Actuators </li></ul><ul><li>Multiple Proactive Agents </li></ul><ul><li>State Driven vs. Event-Driven </li></ul>
  112. 112. Challenges <ul><li>Inexact Event Processing </li></ul><ul><li>Predictive Event Processing </li></ul><ul><li>Use of Machine Learning </li></ul><ul><li>From Reactive to Proactive </li></ul><ul><li>Correctness </li></ul>
  113. 113. Correctness: The ability of a developer to create a correct implementation for all cases (including the boundaries). Observation: A substantial amount of effort is invested today in many of the tools to work around the inability of the language to easily create correct solutions.
  114. 114. Some correctness topics <ul><li>The right interpretation of language constructs </li></ul><ul><li>The right order of events </li></ul><ul><li>The right classification of events to windows </li></ul>
  115. 115. The right interpretation of language constructs: example. All (E1, E2): what do we mean? <ul><li>A customer both sells and buys the same security, valued at more than $1M, within a single day </li></ul><ul><li>Deal fulfillment: package arrival and payment arrival </li></ul>Timeline labels: 6/3 10:00; 7/3 11:00; 8/3 11:00; 8/3 14:00
  116. 116. Fine tuning of the semantics (I) When should the derived event be emitted? When the pattern is matched? At the window end?
  117. 117. Fine tuning of the semantics (II) How many instances of derived events should be emitted? Only once? Every time there is a match?
  118. 118. Fine tuning of the semantics (III) What happens if the same event happens several times? Does only one participate (the first, the last, the one with the higher/lower value on some predicate)? Do all of them participate in a match?
  119. 119. Fine tuning of the semantics (IV) Can we consume or reuse events that participate in a match?
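The consume-versus-reuse question can be made concrete with a toy matcher for an All(E1, E2) pattern. This is a minimal sketch under stated assumptions: the function name `match_all`, the `reuse` flag, and the pairing of events in arrival order under the consume policy are illustrative choices, not the semantics of any particular product.

```python
from itertools import product


def match_all(e1_events, e2_events, reuse):
    """Return the (E1, E2) pairs that an All(E1, E2) pattern would emit.

    reuse=True  -> an event may participate in every match it qualifies for.
    reuse=False -> an event is consumed by its first match (paired in arrival order).
    """
    if reuse:
        return list(product(e1_events, e2_events))
    return list(zip(e1_events, e2_events))


sells = ["sell-1", "sell-2"]
buys = ["buy-1"]
with_reuse = match_all(sells, buys, reuse=True)
with_consume = match_all(sells, buys, reuse=False)
```

With reuse, the single buy event matches both sell events and two derived events are emitted; with consumption, only one is. The same input therefore yields different results depending on a policy that many languages leave implicit.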
  120. 120. Fine tuning of semantics – conclusion <ul><li>Some languages have explicit policies: </li></ul><ul><li>Example: CCL Keep policies </li></ul><ul><ul><li>KEEP LAST PER Id </li></ul></ul><ul><ul><li>KEEP 3 MINUTES </li></ul></ul><ul><ul><li>KEEP EVERY 3 MINUTES </li></ul></ul><ul><ul><li>KEEP UNTIL ("MON 17:00:00") </li></ul></ul><ul><ul><li>KEEP 10 ROWS </li></ul></ul><ul><ul><li>KEEP LAST ROW </li></ul></ul><ul><ul><li>KEEP 10 ROWS PER Symbol </li></ul></ul>In other cases, explicit programming and workarounds are used when the intended semantics differ from the default semantics.
  121. 121. The right order of events - scenario <ul><li>Bid scenario, ground rules: </li></ul><ul><li>All bidders that issued a bid within the validity interval participate in the bid. </li></ul><ul><li>The highest bid wins. In the case of a tie between bids, the first accepted bid wins the auction. </li></ul>Trace, input bids: <ul><li>Bid Start 12:55:00 </li><li>credit bid id=2, occurrence time=12:55:32, price=4 </li><li>cash bid id=29, occurrence time=12:55:33, price=4 </li><li>cash bid id=33, occurrence time=12:55:34, price=3 </li><li>credit bid id=66, occurrence time=12:55:36, price=4 </li><li>credit bid id=56, occurrence time=12:55:59, price=5 </li><li>Bid End 12:56:00 </li></ul>Trace, winning bid: <ul><li>cash bid id=29, occurrence time=12:55:33, price=4 </li></ul>Race conditions: between events; between events and window start/end
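To see what the ground rules would select if events were processed strictly in occurrence-time order, the rules can be applied directly to the bids listed in the trace. This is a sketch under assumptions: the `Bid` record and `winning_bid` function are hypothetical, the validity interval is treated as half-open (start inclusive, end exclusive), and "first accepted" is read as "earliest occurrence time".

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Bid:
    bid_id: int
    kind: str
    occurrence: time
    price: int


def winning_bid(bids, start, end):
    """Highest price wins; ties go to the earliest occurrence time."""
    valid = [b for b in bids if start <= b.occurrence < end]
    if not valid:
        return None
    return min(valid, key=lambda b: (-b.price, b.occurrence))


# Input bids copied from the slide's trace.
BIDS = [
    Bid(2, "credit", time(12, 55, 32), 4),
    Bid(29, "cash", time(12, 55, 33), 4),
    Bid(33, "cash", time(12, 55, 34), 3),
    Bid(66, "credit", time(12, 55, 36), 4),
    Bid(56, "credit", time(12, 55, 59), 5),
]

winner = winning_bid(BIDS, time(12, 55, 0), time(12, 56, 0))
```

Under these assumptions the rules select bid id=56 (price 5), and among the price-4 bids the earliest is id=2. The trace on the slide reports id=29 instead, which is presumably the kind of discrepancy that the race conditions between events, and between events and the window boundaries, can produce.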
  122. 122. Ordering in a distributed environment - possible issues <ul><li>Even if the occurrence time of an event is accurate, the event might arrive after some processing has already been done </li></ul><ul><li>If we use the occurrence time of an event as reported by the source, it might not be accurate because of clock inaccuracy at the source </li></ul><ul><li>Most systems order events by detection time, but events may switch their order on the way </li></ul>
  123. 123. Clock accuracy in the source. Clock synchronization: time server, example: http://tf.nist.gov/service/its.htm
  124. 124. Buffering technique <ul><li>Assumptions: </li></ul><ul><ul><li>Events are reported by the producers as soon as they occur; </li></ul></ul><ul><ul><li>The delay in reporting events to the system is relatively small and can be bounded by a time-out offset τ; </li></ul></ul><ul><ul><li>Events arriving after this time-out can be ignored. </li></ul></ul>Diagram: Producers → Sorted Buffer (by occurrence time) → Event Processing; an event with occurrence time To leaves the buffer once t > To + τ. <ul><li>Principles: </li></ul><ul><ul><li>Let τ be the time-out offset. Under the assumptions above, at any time-point t, all events whose occurrence time is earlier than t - τ have already arrived. </li></ul></ul><ul><ul><li>Each event whose occurrence time is To is kept in the buffer until To + τ, at which time the buffer can be sorted by occurrence time and the events processed in this sorted order. </li></ul></ul>
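The buffering principle can be sketched as a small reorder buffer. The class and method names below are hypothetical, times are plain numbers in arbitrary units, and a heap keyed on occurrence time stands in for the slide's "sorted buffer" (it yields the same release order without re-sorting on every call).

```python
import heapq
from itertools import count


class ReorderBuffer:
    """Holds each event until its occurrence time plus the time-out offset has passed."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._heap = []        # entries: (occurrence_time, arrival_sequence, event)
        self._seq = count()    # tie-breaker so equal timestamps keep arrival order

    def add(self, occurrence_time, event, now):
        """Buffer an event; return False if it arrived after its time-out and is ignored."""
        if occurrence_time + self.timeout < now:
            return False
        heapq.heappush(self._heap, (occurrence_time, next(self._seq), event))
        return True

    def release(self, now):
        """Return, in occurrence-time order, the events whose holding period has expired."""
        ready = []
        while self._heap and self._heap[0][0] + self.timeout <= now:
            occurrence_time, _, event = heapq.heappop(self._heap)
            ready.append((occurrence_time, event))
        return ready
```

For example, with `timeout=5`, an event that occurred at time 10 and one that occurred at time 8 but arrived later are released in the order 8, 10, while an event that occurred at time 3 and arrives at time 13 is dropped. The price of this technique is added latency: every event waits the full time-out before processing, even when nothing arrives out of order.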
  125. 125. Retrospective compensation: when an out-of-order event arrives, recalculate and retract the previous EPA results. Not always possible!
  126. 126. Classification to windows - scenario <ul><li>Calculate statistics for each player (aggregate per quarter) </li></ul><ul><li>Calculate statistics for each team (aggregate per quarter) </li></ul>Window classification: player statistics are calculated at the end of each quarter. Team statistics are calculated at the end of each quarter, based on the player events that arrived within the same quarter. All instances of player statistics that occur within a quarter window must be classified to the same window, even if they are derived after the window termination.
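One way to meet the window classification requirement is to assign each derived event to a window by its occurrence time rather than by the time at which it was derived. The sketch below assumes hypothetical 12-minute quarters expressed in seconds of game time and a dictionary-based event with an `occurrence_time` field; neither comes from the slides.

```python
from bisect import bisect_right

# Hypothetical quarter start times, in seconds of game time (four 12-minute quarters).
QUARTER_STARTS = [0, 720, 1440, 2160]


def classify(occurrence_time):
    """Return the 1-based quarter whose window contains the occurrence time."""
    index = bisect_right(QUARTER_STARTS, occurrence_time) - 1
    if index < 0:
        raise ValueError("occurrence time precedes the first window")
    return index + 1


def group_by_quarter(player_stats):
    """Group player-statistics events by the quarter in which they occurred."""
    windows = {}
    for stat in player_stats:
        windows.setdefault(classify(stat["occurrence_time"]), []).append(stat)
    return windows


stats = [
    {"player": "p7", "occurrence_time": 715, "derived_at": 725},   # derived after Q1 ended
    {"player": "p9", "occurrence_time": 730, "derived_at": 731},
]
by_quarter = group_by_quarter(stats)
```

The first statistic is derived five seconds after the first quarter closed, yet it is still grouped with quarter 1, so the team aggregate for that quarter sees it. A real system would also need to keep the window open (or reopen it) long enough for such late derivations to arrive.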
  127. 127. V: Summary
  128. 128. Event processing is an emerging technology <ul><li>Potential for mutually beneficial interaction with AI </li></ul><ul><li>Make the next generation a vehicle to substantially change the world </li></ul><ul><li>Already attracted coverage of analysts and all major software vendors </li></ul><ul><li>It has barely scratched the surface of its potential </li></ul>
  129. 129. REFERENCES (StoA of Event Processing) <ul><li>Opher Etzion and Peter Niblett, Event Processing in Action, Manning, 2010. </li></ul><ul><li>Mani Chandy and Roy Schulte, Event Processing: Designing IT Systems for Agile Companies, McGraw Hill, 2009. </li></ul><ul><li>David Luckham, The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, Addison-Wesley, 2002. </li></ul><ul><li>Gianpaolo Cugola and Alessandro Margara, Processing Flows of Information: From Data Stream to Complex Event Processing, to appear in ACM Computing Surveys. Available through: http://home.dei.polimi.it/margara/papers/survey.pdf </li></ul>
  130. 130. REFERENCES (Challenges Section) <ul><li>H. Mannila, H. Toivonen, and A. Inkeri Verkamo, Discovery of frequent episodes in event sequences, Data Mining and Knowledge Discovery, 1997. </li></ul><ul><li>J.L. Hellerstein, S. Ma, and C.S. Perng, Discovering actionable patterns in event data, IBM Systems Journal, 2002. </li></ul><ul><li>R.I. Brafman, C. Domshlak, Y. Engel, and Z. Feldman, Planning for Operational Control Systems with Predictable Exogenous Events, AAAI 2011. </li></ul><ul><li>Y. Engel and O. Etzion, Towards Proactive Event Driven Computing, DEBS 2011. </li></ul>