
RuleML 2015: When Processes Rule Events

Big data, with its four main characteristics (Volume, Velocity, Variety, and Veracity), poses challenges to the gathering, management, analytics, and visualization of events. These very same four characteristics, however, also hold great promise for unlocking the story behind the data. In this talk, we focus on the observation that event creation is guided by processes. For example, GPS information emitted by buses in an urban setting follows the buses' scheduled routes. Likewise, RTLS (real-time locating system) information about the whereabouts of patients and nurses in a hospital is guided by the predefined work schedule. With this observation at hand, we seek a method for mining not the data but the rules that guide data creation, and show how knowing such rules makes big data tasks more efficient and more effective. In particular, we demonstrate how, by knowing the rules that govern event creation, we can detect complex events sooner and use historical data to predict future behavior.

Published in: Science

  1. When Processes Rule Events. Avigdor Gal, Technion – Israel Institute of Technology
  2. Presentation Outline: Big Data: the New Playground; Events, Processes, and Anything in Between; Complex Event Processing Optimization; Process Mining with Schedules
  3. Big Data: is it a Storm in a Teacup?
  4. Big data is a game changer. From Theory to Systems: empirical evaluation counts. From Systems to Data: large scale empirical evaluation counts.
  5. Who is a Data Scientist? "The ability to take data – to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it – that’s going to be a hugely important skill in the next decades." (Hal Varian, Google’s Chief Economist)
  6. Data Volume: No Longer the Size of a Teacup. Big Data Cross Table row: Volume. Big data may be a single dataset with a lot of data.
  8. Data Velocity: Replacing a Teacup with a Tea Hose. Big Data Cross Table rows: Volume, Velocity. Big data may be data that rapidly changes.
  12. Data Variety: When One Tea Type is Just not Enough. Big Data Cross Table rows: Volume, Velocity, Variety. Big data may be a small dataset with many different schemata.
  14. Data Veracity: Is it Coffee or Black Tea with Milk? Big Data Cross Table rows: Volume, Velocity, Variety, Veracity. Big data may be data with varying levels of trustworthiness.
  16. Data Gathering: where and when to expect the fountain to burst. Big Data Cross Table (rows: Volume, Velocity, Variety, Veracity), column Gathering: Signal and Event Processing.
  18. Data Management: Not your typical DBA anymore. Big Data Cross Table, column Managing: Cloud Computing, NoSQL, NewSQL.
  19. Data Analytics: When Data Analysis Explodes Multi-Dimensionally. Big Data Cross Table, column Analyzing: Data & Process Mining, ML, IR, NLP.
  20. Data Visualization: The Machine Offering to Mankind. Big Data Cross Table, column Visualizing: User Experience.
  22. Big Data Cross Table. Columns: Gathering, Managing, Analyzing, Visualizing. Rows: Volume, Velocity, Variety, Veracity. The words "Events" and "Processes" are written vertically across the four rows.
  23. Event Processing: Events. An event e is an occurrence within a particular system or domain. It is something that has happened, or is contemplated as having happened in that domain [Etzion and Niblett, 2010]. Point-based semantics. An event type $E \in \mathcal{E}$ is a specification for a set of events that share the same semantic intent and structure. Complex Event Processing Systems: Amit [Adi and Etzion, 2004], SASE [Wu et al., 2006], Cayuga [Demers et al., 2007], CEDR [Barga et al., 2007], ESPER []. DEBS 2016: Orange County, California
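The definitions of event and event type on this slide can be illustrated with a minimal Python sketch. The class names, the `conforms` helper, and the GPS attribute names are assumptions made for illustration; they are not taken from the talk or from any of the cited CEP systems.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class EventType:
    """A specification for events sharing the same intent and structure."""
    name: str
    attributes: Tuple[str, ...]  # attribute names every event of this type carries


@dataclass
class Event:
    """One occurrence, stamped with a single time point (point-based semantics)."""
    event_type: EventType
    timestamp: float  # a single point in time, not an interval
    payload: Dict[str, Any] = field(default_factory=dict)

    def conforms(self) -> bool:
        """True if the payload has exactly the attributes the type specifies."""
        return set(self.payload) == set(self.event_type.attributes)


# Hypothetical event type for a bus GPS reading.
bus_gps = EventType("BusGps", ("bus_id", "lat", "lon"))
reading = Event(bus_gps, 10.0, {"bus_id": 7, "lat": 53.3, "lon": -6.2})
```

Keeping the type separate from the occurrence mirrors the slide's distinction: many events share one `EventType`, and a pattern can be written against the type rather than against individual events.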
  24. Event Processing: Urban Traffic Management
  25. Traffic Flow
  26. Bus Log
  27. Events and Big Data. Volume: 23 million records per month (∼4 GB). Velocity: 770,000 new records per day (an event every 2-6 seconds). Variety: homogeneous. Veracity: GPS locations.
  28. Processes. Process models describe time dependencies among activities: business processes; scheduled activities. They are used as a template for execution by a process engine. A process model can be modeled as a graph containing activity nodes and control nodes: Petri nets [Reisig, 1985]; BPMN [bpm, 2011].
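As a rough illustration of a process model as a graph of activity nodes and control nodes, the following Python sketch stores node kinds and directed edges and finds the activities that can directly follow a node. It is neither a Petri net nor a BPMN implementation; the class, its methods, and the review-process node names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProcessModel:
    """Directed graph whose nodes are either activities or control nodes."""
    kinds: Dict[str, str] = field(default_factory=dict)        # node -> kind
    edges: Dict[str, List[str]] = field(default_factory=dict)  # node -> successors

    def add_node(self, name: str, kind: str) -> None:
        if kind not in ("activity", "control"):
            raise ValueError(f"unknown node kind: {kind}")
        self.kinds[name] = kind
        self.edges.setdefault(name, [])

    def add_edge(self, src: str, dst: str) -> None:
        if src not in self.kinds or dst not in self.kinds:
            raise KeyError("both endpoints must be added before the edge")
        self.edges[src].append(dst)

    def next_activities(self, node: str) -> List[str]:
        """Activities reachable from `node` by passing only through control nodes."""
        found, seen, stack = [], set(), list(self.edges[node])
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if self.kinds[current] == "activity":
                found.append(current)
            else:
                stack.extend(self.edges[current])
        return sorted(found)


# Hypothetical model: after "submit", a decision node leads to "accept" or "reject".
model = ProcessModel()
for name, kind in [("submit", "activity"), ("decide", "control"),
                   ("accept", "activity"), ("reject", "activity")]:
    model.add_node(name, kind)
model.add_edge("submit", "decide")
model.add_edge("decide", "accept")
model.add_edge("decide", "reject")
```

Here `model.next_activities("submit")` returns `["accept", "reject"]`, the kind of ordering knowledge that a process engine or an event pattern matcher can exploit.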
  29. Process Models. Bus Log; Bus Model. [Figure: bus model, a route from s to d through the stops ω_2, ω_3, …, ω_i, …, ω_{n-1}.]
  30. Between Events and Processes. Given processes, detect (complex) events. Given events, discover processes.
  31. From Processes to CEP. Optimisation of event pattern matching on three levels. Approach based on domain knowledge. Results taken from: M. Weidlich, H. Ziekow, A. Gal, J. Mendling, M. Weske. Optimising Event Pattern Matching using Business Process Models. IEEE Transactions on Knowledge and Data Engineering (TKDE), accepted for publication, 2015.
  32. From Processes to CEP. Thanks to Matthias Weidlich for the slides.
  33. Optimization by Transformation: Sequentialization Rule
  34. Optimization by Plan Selection: Sequentialization Rule
  35. Optimization by Early Termination: Sequentialization Rule
  36. Performance Analysis: Datasets. (1) A publicly available process log that contains recorded execution sequences of a paper reviewing process (http://www.processmining.org/logs/start). The model defines 20 activities. The log comprises 3730 events that are related to 100 process instances. Each event is associated with a timestamp and a reference to an activity of the process model. (2) Process models of a German insurance company: 1021 process models, ranging from 4 to 339 nodes. The average size of the process models is around 23 nodes. The log was simulated using annotations of the process models.
  39. Complex Events Processing with Processes. Big Data Cross Table, columns Gathering, ...; rows: Volume; Velocity (Optimization); Variety (Optimisation in event processing networks); Veracity.
  40. Complex Events Processing with Processes. Big Data Cross Table, columns ..., Analysis; rows: Volume (Mining of constraints); Velocity; Variety; Veracity (Probabilistic mining of constraints).
  41. From Events to Processes. Online Traveling Time Prediction: when Processes Rule Events. Using information on bus stops, the prediction of the journey traveling time $T(\langle \omega_1, \ldots, \omega_n \rangle, t_{\omega_1})$ is traced back to the sum of traveling times per segment: $T(\langle \omega_1, \ldots, \omega_n \rangle, t_{\omega_1}) = T(\langle \omega_1, \omega_2 \rangle, t_{\omega_1}) + \ldots + T(\langle \omega_{n-1}, \omega_n \rangle, t_{\omega_{n-1}})$, where $t_{\omega_{n-1}} = t_{\omega_1} + T(\langle \omega_1, \omega_{n-1} \rangle, t_{\omega_1})$. Traveling Time = Drive Time + Delay Time + Stop Time. [Figure: route from s to d through the stops ω_2, ω_3, …, ω_i, …, ω_{n-1}.] (Thanks to Arik Senderovich for the slides)
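The per-segment decomposition can be sketched in Python as follows. This is only an illustration under stated assumptions: `segment_time` stands for any single-segment predictor (hypothetical here), stops are plain strings, and times are seconds. Each segment is evaluated at the predicted time of arrival at its first stop, which is how the start time propagates along the journey.

```python
from typing import Callable, Sequence

# (from_stop, to_stop, time_at_from_stop) -> predicted traveling time in seconds
SegmentPredictor = Callable[[str, str, float], float]


def journey_time(stops: Sequence[str], t_start: float,
                 segment_time: SegmentPredictor) -> float:
    """Sum per-segment predictions, advancing the clock after every segment."""
    total = 0.0
    t = t_start
    for src, dst in zip(stops, stops[1:]):  # consecutive stop pairs = segments
        dt = segment_time(src, dst, t)
        total += dt
        t += dt  # predicted arrival time at dst, used for the next segment
    return total


# Hypothetical time-dependent predictor: segments take longer after t = 50.
def rush_hour(src: str, dst: str, t: float) -> float:
    return 100.0 if t < 50 else 200.0
```

With three stops and a start time of 0, `journey_time(["a", "b", "c"], 0.0, rush_hour)` evaluates the first segment at time 0 (100 seconds) and the second at time 100 (200 seconds), giving 300.0.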
  43. Prediction: The Snapshot Principle in Single-Station Queues. The snapshot principle stems from a heavy-traffic approximation of a queueing system under limits of its parameters, as the workload converges to capacity. The principle states that the total time in the station (waiting + service) remains constant. In our context, a bus that passes through a segment, e.g., $\langle \omega_i, \omega_{i+1} \rangle \in S \times S$, will have the same traveling time as another bus that has just passed through that segment (not necessarily of the same type, line, etc.). [Figure: a single queueing station, Station1.]
  46. The Snapshot Principle in Single-Station Queues. Based on the above, we define a single-segment snapshot predictor, Last-Bus-to-Travel-Segment (LBTS), denoted by $\theta_{LBTS}(\langle \omega_i, \omega_{i+1} \rangle, t_{\omega_1})$. In real-life settings, the applicability of snapshot-principle predictors should be tested ad hoc. The snapshot principle was shown to be of empirical value in previous research, where queueing techniques were applied to predict delays.
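A minimal online sketch of an LBTS-style predictor in Python is given below. It assumes that segment traversals arrive as (from stop, to stop, enter time, exit time) observations; the class and method names are hypothetical and the code is not the authors' implementation. The predictor simply remembers the traveling time of the bus that most recently finished each segment.

```python
from typing import Dict, Optional, Tuple


class LastBusToTravelSegment:
    """Predict a segment's traveling time as that of the last bus to complete it."""

    def __init__(self) -> None:
        # (from_stop, to_stop) -> (exit_time, traveling_time) of the latest traversal
        self._last: Dict[Tuple[str, str], Tuple[float, float]] = {}

    def observe(self, src: str, dst: str, enter_time: float, exit_time: float) -> None:
        """Record a completed traversal; keep it only if it is the most recent one."""
        key = (src, dst)
        previous = self._last.get(key)
        if previous is None or exit_time >= previous[0]:
            self._last[key] = (exit_time, exit_time - enter_time)

    def predict(self, src: str, dst: str) -> Optional[float]:
        """Return the last observed traveling time, or None if no bus has passed yet."""
        record = self._last.get((src, dst))
        return None if record is None else record[1]


predictor = LastBusToTravelSegment()
predictor.observe("A", "B", enter_time=0.0, exit_time=90.0)
predictor.observe("A", "B", enter_time=100.0, exit_time=220.0)
```

After these two observations, `predictor.predict("A", "B")` returns 120.0, the traveling time of the later bus. A segment that no bus has traversed yet yields `None`, so a caller needs a fallback for that case.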
  48. Snapshot Principle in a Network. In our case, the LBTS predictor needs to be lifted to a network setting. The snapshot principle holds for networks of queues when the routing through the network is known in advance. In scheduled transportation such as buses this is the case, as the order of stops (and segments) is predefined. [Figure: a network of queueing stations, Station1 through Station7.]
  51. Snapshot Principle in a Network. We define a multi-segment (network) snapshot predictor that we refer to as the Last-Bus-to-Travel-Network, or $\theta_{LBTN}(\langle \omega_1, \ldots, \omega_n \rangle, t_{\omega_1})$, given a sequence of stops (with $\omega_1$ being the start stop and $\omega_n$ being the end stop). According to the snapshot principle in networks we get that: $\theta_{LBTN}(\langle \omega_1, \ldots, \omega_n \rangle, t_{\omega_1}) = \sum_{i=1}^{n-1} \theta_{LBTS}(\langle \omega_i, \omega_{i+1} \rangle, t_{\omega_1})$.
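The network predictor can be sketched in Python as a sum of single-segment snapshot values over consecutive stop pairs, all read at the same moment (the journey start). The function name, the mapping format, and the stop names are assumptions for illustration, not the authors' code.

```python
from typing import Mapping, Optional, Sequence, Tuple


def lbtn(stops: Sequence[str],
         last_segment_time: Mapping[Tuple[str, str], float]) -> Optional[float]:
    """Sum the latest traveling time of every segment along the route.

    `last_segment_time` is a snapshot taken at the journey start: for each
    segment, the traveling time of the last bus that completed it.
    Returns None if some segment has no observation yet.
    """
    total = 0.0
    for segment in zip(stops, stops[1:]):  # consecutive stop pairs
        if segment not in last_segment_time:
            return None
        total += last_segment_time[segment]
    return total


# Hypothetical snapshot for a three-stop route.
snapshot = {("s1", "s2"): 120.0, ("s2", "s3"): 95.0}
```

For this snapshot, `lbtn(["s1", "s2", "s3"], snapshot)` returns 215.0. Unlike a decomposition that re-evaluates each segment at its predicted arrival time, every term here uses the state of the network at the start of the journey, which is what makes the predictor cheap to compute online.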
  54. Performance Analysis: Data. 8 days of bus data, between September and October of 2014. Each day: approximately 11,500 traveled segments. First trip of each day: no associated last travel time. Prediction for line 046A. Data comes from all buses that share segments with line 046A.
  55. Performance Analysis. [Figure: sample square estimation error and root mean square error, plotted against the index of the segment in the trip.]
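The axis labels suggest that prediction error is reported per segment position within the trip. A hedged Python sketch of such an evaluation is shown below; the input format of (segment index, predicted time, actual time) triples and the function name are assumptions, and the numbers are made up for illustration rather than taken from the talk.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def rmse_by_segment_index(
        samples: Iterable[Tuple[int, float, float]]) -> Dict[int, float]:
    """Root mean square error of predictions, grouped by segment index."""
    squared_errors = defaultdict(list)
    for index, predicted, actual in samples:
        squared_errors[index].append((predicted - actual) ** 2)
    return {index: math.sqrt(sum(errs) / len(errs))
            for index, errs in sorted(squared_errors.items())}


# Made-up samples: two predictions for segment 1, one for segment 2.
result = rmse_by_segment_index([(1, 10.0, 13.0), (1, 10.0, 6.0), (2, 5.0, 5.0)])
```

Grouping by segment index shows whether the error grows for segments further along the trip, which is the question the plotted axes appear to address.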
  56. Process Mining with Schedules. Big Data Cross Table, columns ..., Analysis; rows: Volume (Better prediction); Velocity (Segmentation); Variety; Veracity.
  57. Process Mining with Schedules. Big Data Cross Table, columns ..., Management, ...; rows: Volume; Velocity; Variety; Veracity (Event Cleaning).
  58. Thank You. Avigdor Gal, Technion – Israel Institute of Technology
  59. A. Adi and O. Etzion. Amit - the situation manager. The International Journal on Very Large Data Bases, 13(2):177–203, May 2004.
      Roger S. Barga, Jonathan Goldstein, Mohamed H. Ali, and Mingsheng Hong. Consistent streaming through time: A vision for event stream processing. In CIDR [DBL, 2007], pages 363–374.
      [bpm, 2011] Business Process Model and Notation (BPMN) Version 2.0. Technical report, Object Management Group (OMG), January 2011.
      [DBL, 2007] CIDR 2007, Third Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 7-10, 2007, Online Proceedings. www.cidrdb.org, 2007.
      Alan J. Demers, Johannes Gehrke, Biswanath Panda, Mirek Riedewald, Varun Sharma, and Walker M. White. Cayuga: A general purpose event monitoring system. In CIDR [DBL, 2007], pages 412–422.
      Opher Etzion and Peter Niblett. Event Processing in Action. Manning Publications Company, 2010.
  60. Wolfgang Reisig. Petri Nets: An Introduction, volume 4 of Monographs in Theoretical Computer Science. An EATCS Series. Springer, 1985.
      Eugene Wu, Yanlei Diao, and Shariq Rizvi. High-performance complex event processing over streams. In SIGMOD ’06: Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pages 407–418, New York, NY, USA, 2006. ACM.
