SlideShare a Scribd company logo
Extending Complex Event Processing
to Graph-structured Information
Gala Barquero1, Loli Burgueño2, Javier Troya3, Antonio Vallecillo1
1Universidad de Málaga, Spain
2Universitat Oberta de Catalunya, Spain
3Universidad de Sevilla, Spain
Complex Event Processing
1. CEP is a method for data stream-processing for analyzing and correlating streams of
information about real-time events in order to derive conclusions from them.
2. CEP permits defining complex events on top of other events (primitive or complex)
3. CEP programs are composed of rules which are in charge of processing the events
2
Complex Event Processing
1. CEP is a method for data stream-processing for analyzing and correlating streams of
information about real-time events in order to derive conclusions from them.
2. CEP permits defining complex events on top of other events (primitive or complex)
3. CEP programs are composed of rules which are in charge of processing the events
3
Queries Data Results Data Results
Queries
(patterns)
Complex Event Processing
1. CEP is a method for data stream-processing for analyzing and correlating streams of
information about real-time events in order to derive conclusions from them.
2. CEP permits defining complex events on top of other events (primitive or complex)
3. CEP programs are composed of rules which are in charge of processing the events
4. CEP programs define (size or temporal) windows on the stream of events
4
Current CEP technologies
1. Efficient languages and technologies for processing huge streams of data
 6.5 zettabytes (10^21) in 2016
 15.3 zettabytes expected in 2020
2. Increasingly used (and useful) in applications for critical infrastructure monitoring,
real-time market trend analysis, plagues and natural disasters prediction, ...
5
However, real information is normally structured in more complex ways
6
However, real information is normally structured in more complex ways
1. The data is not only structured as a sequence of timed events, but as graphs that
combine transient (streams) and persistent (database) information
 Queries about social trends based on Twitter feeds and shared Flickr photos
 Monitoring tendencies via Twitter and Facebook posts
7
Our contribution
1. Extend CEP systems and languages to deal with graph-based information
 Able to deal both with streams of timed events and with graphs of persistent data
 Extend the concept of a CEP “sequential window” to a “spatial window”
 Keep up with the stringent requirements on performance and scalability of CEP
systems
2. For this we decided to:
 Generalize the structure of a CEP stream from a sequence of time-ordered events to a
Model (i.e., a graph of interrelated elements – time being just one dimension)
 Consider the behavior of a CEP system as a particular kind of in-place Model
Transformation
 Use the concept of “vicinity graphs” to define and implement spatial windows in
models (a generalization of CEP’s sequential windows)
 Use recent graph parallel computational technologies to provide the supporting
storage and access infrastructure for the models, and graph-processing systems to
implement the corresponding in-place model transformations
8
Case study: Twitter and Flicker
9
Case study: Twitter and Flicker
10
Q1
A HotTopic event is generated every time a hashtag has been used
by both Twitter and Flickr users at least 100 times in the last hour
Case study: Twitter and Flicker
Q1: A HotTopic event is generated every time a hashtag has been used by both
Twitter and Flickr users at least 100 times in the last hour.
Q2: A PopularTwitterPhoto element is created when the hashtag of a photo is
mentioned in a tweet that receives more than 30 likes in the last hour.
Q3: A PopularFlickrPhoto element is created when a photo is favored by more
than 50 Flickr users who have more than 50 followers.
Q4: We generate a NiceTwitterPhoto event when a user, with an h-index higher
than 50, posts three tweets in a row in the last hour containing a hashtag
that describes a photo.
Q5: A InfluencerTweeted event is generated, considering the 10K most recent
tweets, when a user with h-index higher than 70 and more than 50K
followers, sends a tweet.
11
Current Implementation
1. Models implemented with Apache Spark
 RDDs (resilient distributed dataset) used to store both model elements (graph vertices)
and their relations (edges)
 Models populated using the sources’ APIs to obtain the data
 One thread for each stream of events in case of streaming data
2. Model transformation rules (modeling the corresponding CEP rules) implemented in
Scala
 Implemented in terms of Spark and GraphX functions
 One dedicated running thread for each rule
 Produced events stored using RDDs too
3. Data lifecycle
 Transient data (and their relationships) have an “expiration date” (ED)
 The ED is determined by the largest window of the rules that deal with the event
 Once the ED of an element has passed, the element is removed from the system
12
Scala code for the “HotTopic” Rule
13
Analyses
1. Performance
 How fast are we?
 Is the performance of our
proposal acceptable for dealing
with large systems?
 How do we compare with CEP
systems? (when only
one-dimensional streams are
used)
2. Expressiveness
 Are we as expressive as CEP
languages?
 Can we write all CEP patterns
with GraphX?
 How easy is to write Rules with
our proposal?
14
Performance analysis
1. Performance Figures for the Twitter and Flickr case study (in milliseconds)
2. Comparison figures with other solutions (127K/6500K):
15
Performance analysis: comparison with streaming CEP systems
1. A different case study (Motorbike) implemented using both our solution and Esper
16
Expressiveness
1. We have been able to express all queries using Scala and GraphX
2. However, the expression of the queries is not simple
17
Scala code for the “DriverLeftSeat” rule:
Expressiveness
18
Esper code for the “DriverLeftSeat” rule:
Cypher code for the “DriverLeftSeat” rule:
Technology (and its rapid evolution) is an issue in this context
19
Technology In
memory
Query
Language
Pros Cons
Neo4j No Cypher * Expressiveness and usability of Cypher!!!
* Easy to install and to use
* Scalability
* Disk Access (R/W) very slow
* No in-memory implementation available
Spark +
Graphx
Yes Scala * Versatile and very expressive language.
* Easy to install
* Implements cluster mode (distributed)
* Cumbersome as query lang. for graphs
* Uses lazy evaluation
* Complex configuration in cluster mode
Viatra Yes Viatra * Speed and general performance
* Good language for querying models
* Very expressive
* Difficult to install and configure
* Documentation is scarce
Tinkergraph Yes Gremlin * Graph-native language and tools
* In-memory implementation
* Easy to install and to use
* Learning curve of Gremlin
CrateDB No SQL * Uses disk but very efficiently (scalability).
* SQL is well known and used
* Implements cluster mode (distributed)
* Easy to install and to use
* Writting graph queries in SQL is not easy
(specially those queries involving hops)
Conclusions and future work
Contribution: Extension of CEP systems to deal with graph-structured information:
 Able to deal both with streams of timed events and with graphs of persistent data
 Represent the information to manage as a Model
 Consider the behavior of a CEP system as an in-place Model Transformation
 Extend the concept of CEP windows to models’ spatial windows
 Use graph parallel computational technologies to provide the supporting storage
and access infrastructure, and
 Use of graph-processing languages and systems to implement the corresponding
model transformations
20
Future work
1. Performance:
 Experiment with other technologies, beyond Spark+GraphX
 Each one has pros and cons (expressiveness, performance, scalability, distribution)
 Volatility is an issue… They change too rapidly!
2. Expressiveness
 Compilers from Query languages to Storage technologies can be a solution
 For example, from Cypher to Gremlin or to Scala+GraphX
3. Correctness/Accuracy
 What is the error introduced by the use of spatial windows?
 Here we need to trade accuracy for performance
 Approximate queries and model transformations…
21
Q: A YoungInfluencer is a TwitterUser younger
than 25 years old, which has more than 30
followers older than 25 years old.
Extending Complex Event Processing
to Graph-structured Information
Gala Barquero1, Loli Burgueño2, Javier Troya3, Antonio Vallecillo1
1Universidad de Málaga, Spain
2Universitat Oberta de Catalunya, Spain
3Universidad de Sevilla, Spain

More Related Content

What's hot

Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Altinity Ltd
 
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Flink Forward
 
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Databricks
 
End to end todo list app with NestJs - Angular - Redux & Redux Saga
End to end todo list app with NestJs - Angular - Redux & Redux SagaEnd to end todo list app with NestJs - Angular - Redux & Redux Saga
End to end todo list app with NestJs - Angular - Redux & Redux Saga
Babacar NIANG
 
Intro to Kapacitor for Alerting and Anomaly Detection
Intro to Kapacitor for Alerting and Anomaly DetectionIntro to Kapacitor for Alerting and Anomaly Detection
Intro to Kapacitor for Alerting and Anomaly Detection
InfluxData
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Databricks
 
TiDB Introduction
TiDB IntroductionTiDB Introduction
TiDB Introduction
Morgan Tocker
 
Sizing MongoDB Clusters
Sizing MongoDB Clusters Sizing MongoDB Clusters
Sizing MongoDB Clusters
MongoDB
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
Flink Forward
 
Building an analytics workflow using Apache Airflow
Building an analytics workflow using Apache AirflowBuilding an analytics workflow using Apache Airflow
Building an analytics workflow using Apache Airflow
Yohei Onishi
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
Altinity Ltd
 
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTOClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
Altinity Ltd
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
Altinity Ltd
 
喬叔 Elasticsearch Index 管理技巧與效能優化
喬叔 Elasticsearch Index 管理技巧與效能優化喬叔 Elasticsearch Index 管理技巧與效能優化
喬叔 Elasticsearch Index 管理技巧與效能優化
Joe Wu
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
Altinity Ltd
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
Timothy Spann
 
eBPF in the view of a storage developer
eBPF in the view of a storage developereBPF in the view of a storage developer
eBPF in the view of a storage developer
Richárd Kovács
 
The Art of Database Experiments – PostgresConf Silicon Valley 2018 / San Jose
The Art of Database Experiments – PostgresConf Silicon Valley 2018 / San JoseThe Art of Database Experiments – PostgresConf Silicon Valley 2018 / San Jose
The Art of Database Experiments – PostgresConf Silicon Valley 2018 / San Jose
Nikolay Samokhvalov
 
VictoriaMetrics 2023 Roadmap
VictoriaMetrics 2023 RoadmapVictoriaMetrics 2023 Roadmap
VictoriaMetrics 2023 Roadmap
VictoriaMetrics
 
Concurrent Programming Using the Disruptor
Concurrent Programming Using the DisruptorConcurrent Programming Using the Disruptor
Concurrent Programming Using the Disruptor
Trisha Gee
 

What's hot (20)

Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
 
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
Streaming Event Time Partitioning with Apache Flink and Apache Iceberg - Juli...
 
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
 
End to end todo list app with NestJs - Angular - Redux & Redux Saga
End to end todo list app with NestJs - Angular - Redux & Redux SagaEnd to end todo list app with NestJs - Angular - Redux & Redux Saga
End to end todo list app with NestJs - Angular - Redux & Redux Saga
 
Intro to Kapacitor for Alerting and Anomaly Detection
Intro to Kapacitor for Alerting and Anomaly DetectionIntro to Kapacitor for Alerting and Anomaly Detection
Intro to Kapacitor for Alerting and Anomaly Detection
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
 
TiDB Introduction
TiDB IntroductionTiDB Introduction
TiDB Introduction
 
Sizing MongoDB Clusters
Sizing MongoDB Clusters Sizing MongoDB Clusters
Sizing MongoDB Clusters
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Building an analytics workflow using Apache Airflow
Building an analytics workflow using Apache AirflowBuilding an analytics workflow using Apache Airflow
Building an analytics workflow using Apache Airflow
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
 
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTOClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
 
喬叔 Elasticsearch Index 管理技巧與效能優化
喬叔 Elasticsearch Index 管理技巧與效能優化喬叔 Elasticsearch Index 管理技巧與效能優化
喬叔 Elasticsearch Index 管理技巧與效能優化
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
 
eBPF in the view of a storage developer
eBPF in the view of a storage developereBPF in the view of a storage developer
eBPF in the view of a storage developer
 
The Art of Database Experiments – PostgresConf Silicon Valley 2018 / San Jose
The Art of Database Experiments – PostgresConf Silicon Valley 2018 / San JoseThe Art of Database Experiments – PostgresConf Silicon Valley 2018 / San Jose
The Art of Database Experiments – PostgresConf Silicon Valley 2018 / San Jose
 
VictoriaMetrics 2023 Roadmap
VictoriaMetrics 2023 RoadmapVictoriaMetrics 2023 Roadmap
VictoriaMetrics 2023 Roadmap
 
Concurrent Programming Using the Disruptor
Concurrent Programming Using the DisruptorConcurrent Programming Using the Disruptor
Concurrent Programming Using the Disruptor
 

Similar to Extending Complex Event Processing to Graph-structured Information

Towards an Incremental Schema-level Index for Distributed Linked Open Data G...
Towards an Incremental Schema-level Index  for Distributed Linked Open Data G...Towards an Incremental Schema-level Index  for Distributed Linked Open Data G...
Towards an Incremental Schema-level Index for Distributed Linked Open Data G...
Till Blume
 
Time Series Analysis… using an Event Streaming Platform
Time Series Analysis… using an Event Streaming PlatformTime Series Analysis… using an Event Streaming Platform
Time Series Analysis… using an Event Streaming Platform
confluent
 
Time Series Analysis Using an Event Streaming Platform
 Time Series Analysis Using an Event Streaming Platform Time Series Analysis Using an Event Streaming Platform
Time Series Analysis Using an Event Streaming Platform
Dr. Mirko Kämpf
 
Time series-analysis-using-an-event-streaming-platform -_v3_final
Time series-analysis-using-an-event-streaming-platform -_v3_finalTime series-analysis-using-an-event-streaming-platform -_v3_final
Time series-analysis-using-an-event-streaming-platform -_v3_final
confluent
 
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.pptProto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
AnirbanBhar3
 
Shared time-series-analysis-using-an-event-streaming-platform -_v2
Shared   time-series-analysis-using-an-event-streaming-platform -_v2Shared   time-series-analysis-using-an-event-streaming-platform -_v2
Shared time-series-analysis-using-an-event-streaming-platform -_v2
confluent
 
Presentation iswc
Presentation iswcPresentation iswc
Presentation iswc
SydGillani
 
Proposal for google summe of code 2016
Proposal for google summe of code 2016 Proposal for google summe of code 2016
Proposal for google summe of code 2016
Mahesh Dananjaya
 
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Otávio Carvalho
 
Going deep (learning) with tensor flow and quarkus
Going deep (learning) with tensor flow and quarkusGoing deep (learning) with tensor flow and quarkus
Going deep (learning) with tensor flow and quarkus
Red Hat Developers
 
Ling liu part 02:big graph processing
Ling liu part 02:big graph processingLing liu part 02:big graph processing
Ling liu part 02:big graph processing
jins0618
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
confluent
 
Swift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowSwift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance Workflow
Daniel S. Katz
 
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
Raffaele Montella
 
Phenoflow: A Microservice Architecture for Portable Workflow-based Phenotype ...
Phenoflow: A Microservice Architecture for Portable Workflow-based Phenotype ...Phenoflow: A Microservice Architecture for Portable Workflow-based Phenotype ...
Phenoflow: A Microservice Architecture for Portable Workflow-based Phenotype ...
Martin Chapman
 
Sysml 2019 demo_paper
Sysml 2019 demo_paperSysml 2019 demo_paper
Sysml 2019 demo_paper
strange_loop
 
Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt...
Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt...Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt...
Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt...
Flink Forward
 
Data Stream Analytics - Why they are important
Data Stream Analytics - Why they are importantData Stream Analytics - Why they are important
Data Stream Analytics - Why they are important
Paris Carbone
 
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
Big Data Value Association
 

Similar to Extending Complex Event Processing to Graph-structured Information (20)

Towards an Incremental Schema-level Index for Distributed Linked Open Data G...
Towards an Incremental Schema-level Index  for Distributed Linked Open Data G...Towards an Incremental Schema-level Index  for Distributed Linked Open Data G...
Towards an Incremental Schema-level Index for Distributed Linked Open Data G...
 
Time Series Analysis… using an Event Streaming Platform
Time Series Analysis… using an Event Streaming PlatformTime Series Analysis… using an Event Streaming Platform
Time Series Analysis… using an Event Streaming Platform
 
Time Series Analysis Using an Event Streaming Platform
 Time Series Analysis Using an Event Streaming Platform Time Series Analysis Using an Event Streaming Platform
Time Series Analysis Using an Event Streaming Platform
 
Time series-analysis-using-an-event-streaming-platform -_v3_final
Time series-analysis-using-an-event-streaming-platform -_v3_finalTime series-analysis-using-an-event-streaming-platform -_v3_final
Time series-analysis-using-an-event-streaming-platform -_v3_final
 
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.pptProto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
 
Shared time-series-analysis-using-an-event-streaming-platform -_v2
Shared   time-series-analysis-using-an-event-streaming-platform -_v2Shared   time-series-analysis-using-an-event-streaming-platform -_v2
Shared time-series-analysis-using-an-event-streaming-platform -_v2
 
Presentation iswc
Presentation iswcPresentation iswc
Presentation iswc
 
Proposal for google summe of code 2016
Proposal for google summe of code 2016 Proposal for google summe of code 2016
Proposal for google summe of code 2016
 
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
Distributed Near Real-Time Processing of Sensor Network Data Flows for Smart ...
 
Going deep (learning) with tensor flow and quarkus
Going deep (learning) with tensor flow and quarkusGoing deep (learning) with tensor flow and quarkus
Going deep (learning) with tensor flow and quarkus
 
Ling liu part 02:big graph processing
Ling liu part 02:big graph processingLing liu part 02:big graph processing
Ling liu part 02:big graph processing
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
 
Swift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowSwift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance Workflow
 
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
 
Phenoflow: A Microservice Architecture for Portable Workflow-based Phenotype ...
Phenoflow: A Microservice Architecture for Portable Workflow-based Phenotype ...Phenoflow: A Microservice Architecture for Portable Workflow-based Phenotype ...
Phenoflow: A Microservice Architecture for Portable Workflow-based Phenotype ...
 
oracle-complex-event-processing-066421
oracle-complex-event-processing-066421oracle-complex-event-processing-066421
oracle-complex-event-processing-066421
 
Sysml 2019 demo_paper
Sysml 2019 demo_paperSysml 2019 demo_paper
Sysml 2019 demo_paper
 
Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt...
Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt...Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt...
Flink Forward Berlin 2017: Stephan Ewen - The State of Flink and how to adopt...
 
Data Stream Analytics - Why they are important
Data Stream Analytics - Why they are importantData Stream Analytics - Why they are important
Data Stream Analytics - Why they are important
 
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
ExtremeEarth: Hopsworks, a data-intensive AI platform for Deep Learning with ...
 

More from Antonio Vallecillo

Modeling Objects with Uncertain Behaviors
Modeling Objects with Uncertain BehaviorsModeling Objects with Uncertain Behaviors
Modeling Objects with Uncertain Behaviors
Antonio Vallecillo
 
Introducing Subjective Knowledge Graphs
Introducing Subjective Knowledge GraphsIntroducing Subjective Knowledge Graphs
Introducing Subjective Knowledge Graphs
Antonio Vallecillo
 
Using UML and OCL Models to realize High-Level Digital Twins
Using UML and OCL Models to realize High-Level Digital TwinsUsing UML and OCL Models to realize High-Level Digital Twins
Using UML and OCL Models to realize High-Level Digital Twins
Antonio Vallecillo
 
Modeling behavioral deontic constraints using UML and OCL
Modeling behavioral deontic constraints using UML and OCLModeling behavioral deontic constraints using UML and OCL
Modeling behavioral deontic constraints using UML and OCL
Antonio Vallecillo
 
Modeling and Evaluating Quality in the Presence of Uncertainty
Modeling and Evaluating Quality in the Presence of UncertaintyModeling and Evaluating Quality in the Presence of Uncertainty
Modeling and Evaluating Quality in the Presence of Uncertainty
Antonio Vallecillo
 
Research Evaluation - The current situation in Spain
Research Evaluation - The current situation in SpainResearch Evaluation - The current situation in Spain
Research Evaluation - The current situation in Spain
Antonio Vallecillo
 
Belief Uncertainty in Software Models
Belief Uncertainty in Software ModelsBelief Uncertainty in Software Models
Belief Uncertainty in Software Models
Antonio Vallecillo
 
Adding Random Operations to OCL
Adding Random Operations to OCLAdding Random Operations to OCL
Adding Random Operations to OCL
Antonio Vallecillo
 
Towards a Body of Knowledge for Model-Based Software Engineering
Towards a Body of Knowledge for Model-Based Software EngineeringTowards a Body of Knowledge for Model-Based Software Engineering
Towards a Body of Knowledge for Model-Based Software Engineering
Antonio Vallecillo
 
La Ingeniería Informática no es una Ciencia -- Reflexiones sobre la Educación...
La Ingeniería Informática no es una Ciencia -- Reflexiones sobre la Educación...La Ingeniería Informática no es una Ciencia -- Reflexiones sobre la Educación...
La Ingeniería Informática no es una Ciencia -- Reflexiones sobre la Educación...
Antonio Vallecillo
 
La Ética en la Ingeniería de Software de Pruebas: Necesidad de un Código Ético
La Ética en la Ingeniería de Software de Pruebas: Necesidad de un Código ÉticoLa Ética en la Ingeniería de Software de Pruebas: Necesidad de un Código Ético
La Ética en la Ingeniería de Software de Pruebas: Necesidad de un Código Ético
Antonio Vallecillo
 
La ingeniería del software en España: retos y oportunidades
La ingeniería del software en España: retos y oportunidadesLa ingeniería del software en España: retos y oportunidades
La ingeniería del software en España: retos y oportunidades
Antonio Vallecillo
 
Los Estudios de Posgrado de la Universidad de Málaga
Los Estudios de Posgrado de la Universidad de MálagaLos Estudios de Posgrado de la Universidad de Málaga
Los Estudios de Posgrado de la Universidad de Málaga
Antonio Vallecillo
 
El papel de los MOOCs en la Formación de Posgrado. El reto de la Universidad...
El papel de los MOOCs en la Formación de Posgrado. El reto de la Universidad...El papel de los MOOCs en la Formación de Posgrado. El reto de la Universidad...
El papel de los MOOCs en la Formación de Posgrado. El reto de la Universidad...
Antonio Vallecillo
 
La enseñanza digital y los MOOC en la UMA. Presentación en el XV encuentro de...
La enseñanza digital y los MOOC en la UMA. Presentación en el XV encuentro de...La enseñanza digital y los MOOC en la UMA. Presentación en el XV encuentro de...
La enseñanza digital y los MOOC en la UMA. Presentación en el XV encuentro de...
Antonio Vallecillo
 
El doctorado en Informática: ¿Nuevo vino en viejas botellas? (Charla U. Sevil...
El doctorado en Informática: ¿Nuevo vino en viejas botellas? (Charla U. Sevil...El doctorado en Informática: ¿Nuevo vino en viejas botellas? (Charla U. Sevil...
El doctorado en Informática: ¿Nuevo vino en viejas botellas? (Charla U. Sevil...
Antonio Vallecillo
 
Accountable objects: Modeling Liability in Open Distributed Systems
Accountable objects: Modeling Liability in Open Distributed SystemsAccountable objects: Modeling Liability in Open Distributed Systems
Accountable objects: Modeling Liability in Open Distributed Systems
Antonio Vallecillo
 
Models And Meanings
Models And MeaningsModels And Meanings
Models And Meanings
Antonio Vallecillo
 
Improving Naming and Grouping in UML
Improving Naming and Grouping in UMLImproving Naming and Grouping in UML
Improving Naming and Grouping in UML
Antonio Vallecillo
 
On the Combination of Domain Specific Modeling Languages
On the Combination of Domain Specific Modeling LanguagesOn the Combination of Domain Specific Modeling Languages
On the Combination of Domain Specific Modeling Languages
Antonio Vallecillo
 

More from Antonio Vallecillo (20)

Modeling Objects with Uncertain Behaviors
Modeling Objects with Uncertain BehaviorsModeling Objects with Uncertain Behaviors
Modeling Objects with Uncertain Behaviors
 
Introducing Subjective Knowledge Graphs
Introducing Subjective Knowledge GraphsIntroducing Subjective Knowledge Graphs
Introducing Subjective Knowledge Graphs
 
Using UML and OCL Models to realize High-Level Digital Twins
Using UML and OCL Models to realize High-Level Digital TwinsUsing UML and OCL Models to realize High-Level Digital Twins
Using UML and OCL Models to realize High-Level Digital Twins
 
Modeling behavioral deontic constraints using UML and OCL
Modeling behavioral deontic constraints using UML and OCLModeling behavioral deontic constraints using UML and OCL
Modeling behavioral deontic constraints using UML and OCL
 
Modeling and Evaluating Quality in the Presence of Uncertainty
Modeling and Evaluating Quality in the Presence of UncertaintyModeling and Evaluating Quality in the Presence of Uncertainty
Modeling and Evaluating Quality in the Presence of Uncertainty
 
Research Evaluation - The current situation in Spain
Research Evaluation - The current situation in SpainResearch Evaluation - The current situation in Spain
Research Evaluation - The current situation in Spain
 
Belief Uncertainty in Software Models
Belief Uncertainty in Software ModelsBelief Uncertainty in Software Models
Belief Uncertainty in Software Models
 
Adding Random Operations to OCL
Adding Random Operations to OCLAdding Random Operations to OCL
Adding Random Operations to OCL
 
Towards a Body of Knowledge for Model-Based Software Engineering
Towards a Body of Knowledge for Model-Based Software EngineeringTowards a Body of Knowledge for Model-Based Software Engineering
Towards a Body of Knowledge for Model-Based Software Engineering
 
La Ingeniería Informática no es una Ciencia -- Reflexiones sobre la Educación...
La Ingeniería Informática no es una Ciencia -- Reflexiones sobre la Educación...La Ingeniería Informática no es una Ciencia -- Reflexiones sobre la Educación...
La Ingeniería Informática no es una Ciencia -- Reflexiones sobre la Educación...
 
La Ética en la Ingeniería de Software de Pruebas: Necesidad de un Código Ético
La Ética en la Ingeniería de Software de Pruebas: Necesidad de un Código ÉticoLa Ética en la Ingeniería de Software de Pruebas: Necesidad de un Código Ético
La Ética en la Ingeniería de Software de Pruebas: Necesidad de un Código Ético
 
La ingeniería del software en España: retos y oportunidades
La ingeniería del software en España: retos y oportunidadesLa ingeniería del software en España: retos y oportunidades
La ingeniería del software en España: retos y oportunidades
 
Los Estudios de Posgrado de la Universidad de Málaga
Los Estudios de Posgrado de la Universidad de MálagaLos Estudios de Posgrado de la Universidad de Málaga
Los Estudios de Posgrado de la Universidad de Málaga
 
El papel de los MOOCs en la Formación de Posgrado. El reto de la Universidad...
El papel de los MOOCs en la Formación de Posgrado. El reto de la Universidad...El papel de los MOOCs en la Formación de Posgrado. El reto de la Universidad...
El papel de los MOOCs en la Formación de Posgrado. El reto de la Universidad...
 
La enseñanza digital y los MOOC en la UMA. Presentación en el XV encuentro de...
La enseñanza digital y los MOOC en la UMA. Presentación en el XV encuentro de...La enseñanza digital y los MOOC en la UMA. Presentación en el XV encuentro de...
La enseñanza digital y los MOOC en la UMA. Presentación en el XV encuentro de...
 
El doctorado en Informática: ¿Nuevo vino en viejas botellas? (Charla U. Sevil...
El doctorado en Informática: ¿Nuevo vino en viejas botellas? (Charla U. Sevil...El doctorado en Informática: ¿Nuevo vino en viejas botellas? (Charla U. Sevil...
El doctorado en Informática: ¿Nuevo vino en viejas botellas? (Charla U. Sevil...
 
Accountable objects: Modeling Liability in Open Distributed Systems
Accountable objects: Modeling Liability in Open Distributed SystemsAccountable objects: Modeling Liability in Open Distributed Systems
Accountable objects: Modeling Liability in Open Distributed Systems
 
Models And Meanings
Models And MeaningsModels And Meanings
Models And Meanings
 
Improving Naming and Grouping in UML
Improving Naming and Grouping in UMLImproving Naming and Grouping in UML
Improving Naming and Grouping in UML
 
On the Combination of Domain Specific Modeling Languages
On the Combination of Domain Specific Modeling LanguagesOn the Combination of Domain Specific Modeling Languages
On the Combination of Domain Specific Modeling Languages
 

Recently uploaded

Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
rickgrimesss22
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Globus
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Globus
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
Tier1 app
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
AMB-Review
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
vrstrong314
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Globus
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
Accelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessAccelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with Platformless
WSO2
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
Globus
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
Ortus Solutions, Corp
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
Donna Lenk
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
abdulrafaychaudhry
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
Ortus Solutions, Corp
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
kalichargn70th171
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar
 

Recently uploaded (20)

Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
Accelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessAccelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with Platformless
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 

Extending Complex Event Processing to Graph-structured Information

  • 1. Extending Complex Event Processing to Graph-structured Information Gala Barquero1, Loli Burgueño2, Javier Troya3, Antonio Vallecillo1 1Universidad de Málaga, Spain 2Universitat Oberta de Catalunya, Spain 3Universidad de Sevilla, Spain
  • 2. Complex Event Processing 1. CEP is a method for data stream-processing for analyzing and correlating streams of information about real-time events in order to derive conclusions from them. 2. CEP permits defining complex events on top of other events (primitive or complex) 3. CEP programs are composed of rules which are in charge of processing the events 2
  • 3. Complex Event Processing 1. CEP is a method for data stream-processing for analyzing and correlating streams of information about real-time events in order to derive conclusions from them. 2. CEP permits defining complex events on top of other events (primitive or complex) 3. CEP programs are composed of rules which are in charge of processing the events 3 Queries Data Results Data Results Queries (patterns)
  • 4. Complex Event Processing 1. CEP is a method for data stream-processing for analyzing and correlating streams of information about real-time events in order to derive conclusions from them. 2. CEP permits defining complex events on top of other events (primitive or complex) 3. CEP programs are composed of rules which are in charge of processing the events 4. CEP programs define (size or temporal) windows on the stream of events 4
  • 5. Current CEP technologies 1. Efficient languages and technologies for processing huge streams of data  6.5 zettabytes (10^21) in 2016  15.3 zettabytes expected in 2020 2. Increasingly used (and useful) in applications for critical infrastructure monitoring, real-time market trend analysis, plagues and natural disasters prediction, ... 5
  • 6. However, real information is normally structured in more complex ways 6
  • 7. However, real information is normally structured in more complex ways 1. The data is not only structured as a sequence of timed events, but as graphs that combine transient (streams) and persistent (database) information  Queries about social trends based on Twitter feeds and shared Flickr photos  Monitoring tendencies via Twitter and Facebook posts 7
  • 8. Our contribution 1. Extend CEP systems and languages to deal with graph-based information  Able to deal both with streams of timed events and with graphs of persistent data  Extend the concept of a CEP “sequential window” to a “spatial window”  Keep up with the stringent requirements on performance and scalability of CEP systems 2. For this we decided to:  Generalize the structure of a CEP stream from a sequence of time-ordered events to a Model (i.e., a graph of interrelated elements – time being just one dimension)  Consider the behavior of a CEP system as a particular kind of in-place Model Transformation  Use the concept of “vicinity graphs” to define and implement spatial windows in models (a generalization of CEP’s sequential windows)  Use recent graph parallel computational technologies to provide the supporting storage and access infrastructure for the models, and graph-processing systems to implement the corresponding in-place model transformations 8
  • 9. Case study: Twitter and Flicker 9
  • 10. Case study: Twitter and Flicker 10 Q1 A HotTopic event is generated every time a hashtag has been used by both Twitter and Flickr users at least 100 times in the last hour
  • 11. Case study: Twitter and Flicker Q1: A HotTopic event is generated every time a hashtag has been used by both Twitter and Flickr users at least 100 times in the last hour. Q2: A PopularTwitterPhoto element is created when the hashtag of a photo is mentioned in a tweet that receives more than 30 likes in the last hour. Q3: A PopularFlickrPhoto element is created when a photo is favored by more than 50 Flickr users who have more than 50 followers. Q4: We generate a NiceTwitterPhoto event when a user, with an h-index higher than 50, posts three tweets in a row in the last hour containing a hashtag that describes a photo. Q5: A InfluencerTweeted event is generated, considering the 10K most recent tweets, when a user with h-index higher than 70 and more than 50K followers, sends a tweet. 11
  • 12. Current Implementation 1. Models implemented with Apache Spark  RDDs (resilient distributed dataset) used to store both model elements (graph vertices) and their relations (edges)  Models populated using the sources’ APIs to obtain the data  One thread for each stream of events in case of streaming data 2. Model transformation rules (modeling the corresponding CEP rules) implemented in Scala  Implemented in terms of Spark and GraphX functions  One dedicated running thread for each rule  Produced events stored using RDDs too 3. Data lifecycle  Transient data (and their relationships) have an “expiration date” (ED)  The ED is determined by the largest window of the rules that deal with the event  Once the ED of an element has passed, the element is removed from the system 12
  • 13. Scala code for the “HotTopic” Rule 13
  • 14. Analyses 1. Performance  How fast are we?  Is the performance of our proposal acceptable for dealing with large systems?  How do we compare with CEP systems? (when only one-dimensional streams are used) 2. Expressiveness  Are we as expressive as CEP languages?  Can we write all CEP patterns with GraphX?  How easy is to write Rules with our proposal? 14
  • 15. Performance analysis 1. Performance Figures for the Twitter and Flickr case study (in milliseconds) 2. Comparison figures with other solutions (127K/6500K): 15
  • 16. Performance analysis: comparison with streaming CEP systems 1. A different case study (Motorbike) implemented using both our solution and Esper 16
  • 17. Expressiveness 1. We have been able to express all queries using Scala and GraphX 2. However, the expression of the queries is not simple 17 Scala code for the “DriverLeftSeat” rule:
  • 18. Expressiveness 18 Esper code for the “DriverLeftSeat” rule: Cypher code for the “DriverLeftSeat” rule:
  • 19. Technology (and its rapid evolution) is an issue in this context 19 Technology In memory Query Language Pros Cons Neo4j No Cypher * Expressiveness and usability of Cypher!!! * Easy to install and to use * Scalability * Disk Access (R/W) very slow * No in-memory implementation available Spark + Graphx Yes Scala * Versatile and very expressive language. * Easy to install * Implements cluster mode (distributed) * Cumbersome as query lang. for graphs * Uses lazy evaluation * Complex configuration in cluster mode Viatra Yes Viatra * Speed and general performance * Good language for querying models * Very expressive * Difficult to install and configure * Documentation is scarce Tinkergraph Yes Gremlin * Graph-native language and tools * In-memory implementation * Easy to install and to use * Learning curve of Gremlin CrateDB No SQL * Uses disk but very efficiently (scalability). * SQL is well known and used * Implements cluster mode (distributed) * Easy to install and to use * Writting graph queries in SQL is not easy (specially those queries involving hops)
  • 20. Conclusions and future work Contribution: Extension of CEP systems to deal with graph-structured information:  Able to deal both with streams of timed events and with graphs of persistent data  Represent the information to manage as a Model  Consider the behavior of a CEP system as an in-place Model Transformation  Extend the concept of CEP windows to models’ spatial windows  Use graph parallel computational technologies to provide the supporting storage and access infrastructure, and  Use of graph-processing languages and systems to implement the corresponding model transformations 20
  • 21. Future work 1. Performance:  Experiment with other technologies, beyond Spark+GraphX  Each one has pros and cons (expressiveness, performance, scalability, distribution)  Volatility is an issue… They change too rapidly! 2. Expressiveness  Compilers from Query languages to Storage technologies can be a solution  For example, from Cypher to Gremlin or to Scala+GraphX 3. Correctness/Accuracy  What is the error introduced by the use of spatial windows?  Here we need to trade accuracy for performance  Approximate queries and model transformations… 21 Q: A YoungInfluencer is a TwitterUser younger than 25 years old, which has more than 30 followers older than 25 years old.
  • 22. Extending Complex Event Processing to Graph-structured Information Gala Barquero1, Loli Burgueño2, Javier Troya3, Antonio Vallecillo1 1Universidad de Málaga, Spain 2Universitat Oberta de Catalunya, Spain 3Universidad de Sevilla, Spain