SlideShare a Scribd company logo
The data streaming processing paradigm and
its use in modern fog architectures
Vincenzo Gulisano, Ph.D.
Agenda
• Who we are (my group)
• Motivation
• The data streaming processing paradigm
• Challenges and research questions
• Conclusions
• Bibliography
2
Agenda
• Who we are (my group)
• Motivation
• The data streaming processing paradigm
• Challenges and research questions
• Conclusions
• Bibliography
3
Who we are
Chalmers university
Computer Science and Engineering
department
Distributed Computing and Systems
research group
4
Who we are
• ~15 PhD degrees awarded
• ~12 PhD students, 5 Postdocs and 6 faculty
• Acknowledged internationally as leading
group in practical multicore synchronization
algorithms and programming (results
adopted by Java JDK, C++, Intel, NVIDIA)
• Extensive network of academic and
industrial collaborations and has been
continuously supported by national and
international projects
5
Distributed Computing and Systems Research Group
Department of Computer Science and engineering
Chalmers University of Technology
At our research team:
Research expertise & projects
Cyber
Security
Efficient
parallel &
stream
computing
Distributed
systems
IoT & Sensor
Networks
6
Agenda
• Who we are
• Motivation
• The data streaming processing paradigm
• Challenges and research questions
• Conclusions
• Bibliography
7
• IoT
• Edge
• Fog
• Cloud
• Cyber-Physical System
• Big Data
...
BUZZ WORDS
8
9
10
IoT enables for increased awareness, security, power-efficiency, ...
large IoT systems are complex
traditional data analysis techniques alone are not adequate!
11
AMIs [1,2,3,4] VNs [5,6]
• demand-response
• scheduling [7]
• micro-grids
• detection of medium size blackouts [8]
• detection of non technical losses
• ...
• autonomous driving
• platooning
• accident detection [9]
• variable tolls [9]
• congestion monitoring [10]
• ...
IoT enables for increased awareness, security, power-efficiency, ...
12
AMIs VNs
large IoT systems are complex
Characteristics [15]:
1. edge location
2. location awareness
3. low latency
4. geographical distribution
5. large-scale
6. support for mobility
7. real-time interactions
8. predominance of wireless
9. heterogeneous
10. interoperability / federation
11. interaction with the cloud
13
traditional data analysis techniques alone are not adequate! [13,14]
1. does the infrastructure allow for billions of
readings per day to be transferred continuously?
2. the latency incurred while transferring data, does
that undermine the utility of the analysis?
3. is it secure to concentrate all the data in a single
place? [11]
4. is it smart to give away fine-grained data? [12]
14
Agenda
• Who we are
• Motivation
• The data streaming processing paradigm
• Challenges and research questions
• Conclusions
• Bibliography
15
Main Memory
Motivation
DBMS vs. DSMS
Disk
1 Data
Query Processing
3 Query
results
2 Query
Main Memory
Query Processing
Continuous
Query
Data
Query
results
16
Before we start... about data streaming and Stream Processing Engines (SPEs)
An incomplete list of SPEs (cf. related work in [18]):
time
Borealis
The Aurora Project
STanfordstREamdatAManager
NiagaraCQ
COUGAR
StreamCloud
Covering all of them / discussing which use cases are best for each one out of scope...
the following show connection between what is being presented and a certain SPE
17
data stream: unbounded sequence of tuples sharing the same schema
Example: vehicles’ speed reports
time
Field Field
vehicle id text
time (secs) text
speed (Km/h) double
X coordinate double
Y coordinate double
A 8:00 55.5 X1 Y1
Let’s assume each source
(e.g., vehicle) produces
and delivers a timestamp
sorted stream
A 8:07 34.3 X3 Y3
A 8:03 70.3 X2 Y2
18
continuous query (or simply query): Directed Acyclic Graph (DAG) of
streams and operators
OP
OP
OP
OP OP
OP OP
source op
(1+ out streams)
sink op
(1+ in streams)
stream
op
(1+ in, 1+ out streams)
19
data streaming operators
Two main types:
• Stateless operators
• do not maintain any state
• one-by-one processing
• if they maintain some state, such state does not evolve depending
on the tuples being processed
• Stateful operators
• maintain a state that evolves depending on the tuples being
processed
• produce output tuples that depend on multiple input tuples
OP
OP
20
stateless operators
Filter
...
Map
Union
...
Filter / route tuples based on one (or more) conditions
Transform each tuple
Merge multiple streams (with the same schema) into one
21
stateless operators
Filter
...
Map
Union
...
source: http://storm.apache.org/releases/2.0.0-SNAPSHOT/Trident-tutorial.html
22
stateless operators
Filter
...
Map
Union
...
source: https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/streaming/index.html
23
stateless operators
Filter
...
Map
Union
...
source: http://spark.apache.org/docs/latest/streaming-programming-guide.html#transformations-on-dstreams
24
stateful operators
Aggregate information from multiple tuples
(e.g., max, min, sum, ...)
Join tuples coming from 2 streams given a certain predicate
Aggregate
Join
25
stateful operators
source: http://storm.apache.org/releases/2.0.0-SNAPSHOT/Trident-tutorial.html
source: http://spark.apache.org/docs/latest/streaming-programming-
guide.html#transformations-on-dstreams
source: http://spark.apache.org/docs/latest/streaming-programming-
guide.html#transformations-on-dstreams
26
Wait a moment!
if streams are unbounded, how can we aggregate or join?
27
windows and stateful analysis [18]
Stateful operations are done over windows:
• Time-based (e.g., tuples in the last 10 minutes)
• Tuple-based (e.g., given the last 50 tuples)
time
[8:00,9:00)
[8:20,9:20)
[8:40,9:40)
Usually applications rely on time-based sliding windows
28
time-based sliding window aggregation (count)
Counter: 4
time
[8:00,9:00)
8:05 8:15 8:22 8:45 9:05
Output: 4
Counter: 1
Counter: 2
Counter: 3
Counter: 3
time
8:05 8:15 8:22 8:45 9:05
[8:20,9:20)
we assumed each source
produces and delivers a
timestamp sorted stream!
What happens if this is not
the case?
29
windows and stateful analysis
30
basic operators and user-defined operators
Besides a set of basic operators, SPEs usually allow the user to define
ad-hoc operators (e.g., when existing aggregation are not enough)
31
sample query
For each vehicle, raise an alert if the speed of the latest report is more
than 2 times higher than its average speed in the last 30 days.
time
A 8:00 55.5 X1 Y1 A 8:07 34.3 X3 Y3
A 8:03 70.3 X2 Y2
32
Remove
unused fields
Map
Field
vehicle id
time (secs)
speed (Km/h)
X coordinate
Y coordinate
Field
vehicle id
time (secs)
speed (Km/h)
Compute average
speed for each
vehicle during the
last 30 days
Aggregate
Field
vehicle id
time (secs)
avg speed (Km/h)
Join
Check
condition
Filter
Field
vehicle id
time (secs)
speed (Km/h)
Join on
vehicle id
Field
vehicle id
time (secs)
avg speed (Km/h)
speed (Km/h)
sample query
33
M A J F
sample query
Notice:
• the same semantics can be defined in several ways (using different
operators and composing them in different ways)
• Using many basic building blocks can ease the task of distributing
and parallelizing the analysis (more in the following...)
34
Why data streaming, then?
Expressive
Online
Parallel &
Distributed
35
M A J F
At the edge, in parallel at each vehicle
M A J F
36
M A J FAt the cloud
37
Distributed, edge + fog + cloud
A J
M
F
38
Agenda
• Who we are
• Motivation
• The data streaming processing paradigm
• Challenges and research questions
• Conclusions
• Bibliography
39
1. Distributed deployment
2. Parallel deployment
3. Ordering and determinism
4. Shared-nothing vs shared-memory parallelism
5. Load balancing
6. Elasticity
7. Fault tolerance
8. Data sharing
Challenges and research questions
40
Picture source: Modern OS, by A. Tanenbaum
Data analysts
Stream Processing Engine
OS / Hardware
How can research
support these layers?
Why is it challenging?
41
A minimal example
(... the data analyst)
tell me the average speed
(per week) of each car
Road Side Unit (RSU)
42
Parallelization challenges
compute average speed
(per week) of each car
compute average speed
(per week) of each car
Does this work?
What if a car travels close to
different RSUs (and sends data
to both)?
43
1 3 8 9 13 6.8
1 3 2
10
1 3 2
10
6
44
Parallelization challenges
compute average speed
(per week) of each car
Odd plate number
Even plate numberSend to other RSU
compute average speed
(per week) of each car
Even plate number
Odd plate numberSend to other RSU
They are now communicating!
45
Balancing challenges
compute average speed
(per week) of each car
Odd plate
number
Even plate
number
Send to other RSU
compute average speed
(per week) of each car
Even plate
number
Odd plate
number
Send to other RSU
Too much pollution, we will
introduce the alternate
license plates policy!
Thanks! Now I am
using only 50% of my
resources...
46
Balancing challenges
compute average speed
(per week) of each car
Dynamic
condition TRUE
Dynamic
condition FALSE
Send to other RSU
compute average speed
(per week) of each car
Dynamic
condition FALSE
Dynamic
condition TRUE
Send to other RSU
They are now communicating!
There must be monitoring!
We need state
transfer protocols!
47
Balancing challenges
Time
responsible for responsible for
dynamic condition change
(in between the week!)
What about the data already sent to ?
tell me the average speed
(per week) of each car
48
Adapting challenges
compute average speed
(per week) of each car
Dynamic
condition TRUE
Dynamic
condition FALSE
Send to other RSU
compute average speed
(per week) of each car
Dynamic
condition FALSE
Dynamic
condition TRUE
Send to other RSU
What about the data
already sent to ?
49
Balancing challenges
compute average speed
(per week) of each car
Dynamic
condition TRUE
Dynamic
condition FALSE
Send to other RSU
compute average speed
(per week) of each car
Dynamic
condition FALSE
Dynamic
condition TRUE
Send to other RSU
They are now communicating!
There must be monitoring!
We need state
transfer protocols!
We need backups!
What about the data sent
between and ?
Cars need to
buffer data too!
Cars need to
buffer data too! 50
Faulttolerance
Elasticity
Loadbalancing
Determinism
Parallel execution of streaming operators
51
Parallel execution of streaming applications
DDoSdetection
andmitigation
Intrusiondetection
Datavalidation
Differentially
privateaggregation
Vehicularnetworks
analysis
SmartGrids/AMI
analysis
Security and
privacy
IoT
Cyber-Physical
Systems
Provenance and custom scheduling
Synchronization / Data structures
Faulttolerance
Elasticity
Loadbalancing
Determinism
Parallel execution of streaming operators
52
Parallel execution of streaming applications
Many-core systems / FPGAs
DDoSdetection
andmitigation
Intrusiondetection
Datavalidation
Differentially
privateaggregation
Vehicularnetworks
analysis
SmartGrids/AMI
analysis
Parallel joins
Parallel
aggregates
Joins
modeling
Security and
privacy
IoT
Cyber-Physical
Systems
Provenance and custom scheduling
• Palyvos-Giannas, D., Gulisano, V., & Papatriantafilou, M. (2019). GeneaLog: Fine-
grained data streaming provenance in cyber-physical systems. Parallel Computing
• Najdataei, H., Nikolakopoulos, Y., Papatriantafilou, M., Tsigas, P., & Gulisano, V.
(2019). STRETCH: Scalable and Elastic Deterministic Streaming Analysis with Virtual
Shared-Nothing Parallelism. Proceedings of the 13th ACM International Conference
on Distributed and Event-based Systems, DEBS 2019
• Palyvos-Giannas, D., Gulisano, V., & Papatriantafilou, M. (2019). Haren: A Framework
for Ad-Hoc Thread Scheduling Policies for Data Streaming Applications. Proceedings
of the 13th ACM International Conference on Distributed and Event-based Systems,
DEBS 2019
• Havers, B., Duvignau, R., Najdataei, H., Gulisano, V., Koppisetty, A. C., &
Papatriantafilou, M. (2019). DRIVEN: a Framework for Efficient Data Retrieval and
Clustering in Vehicular Networks. 35th IEEE International Conference on Data
Engineering, ICDE 2019
• Duvignau, R., Gulisano, V., Papatriantafilou, M., & Savic, V. (2019). Streaming
piecewise linear approximation for efficient data management in edge computing.
Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, SAC 2019
• Walulya, I., Palyvos-Giannas, D., Nikolakopoulos, Y., Gulisano, V., Papatriantafilou,
M., & Tsigas, P. (2018). Viper: A module for communication-layer determinism and
scaling in low-latency stream processing. Future Generation Comp. Syst., 88
• Rooij, J. van, Gulisano, V., & Papatriantafilou, M. (2018). LoCoVolt: Distributed
Detection of Broken Meters in Smart Grids through Stream Processing. Proceedings
of the 12th ACM International Conference on Distributed and Event-based Systems,
DEBS 2018
Provenance Determinism
Vehicular networks
AMI
Parallelism Determinism Elasticity
53
Custom scheduling
Vehicular networks
Vehicular networks
DeterminismParallelism
Smart Grids / AMIs
• Najdataei, H., Nikolakopoulos, Y., Gulisano, V., & Papatriantafilou, M.
(2018). Continuous and Parallel LiDAR Point-Cloud Clustering. 38th IEEE
International Conference on Distributed Computing Systems, ICDCS 2018
• Gulisano, V., Nikolakopoulos, Y., Cederman, D., Papatriantafilou, M., &
Tsigas, P. (2017). Efficient Data Streaming Multiway Aggregation through
Concurrent Algorithmic Designs and New Abstract Data Types. TOPC
• Zacheilas, N., Kalogeraki, V., Nikolakopoulos, Y., Gulisano, V.,
Papatriantafilou, M., & Tsigas, P. (2017). Maximizing Determinism in
Stream Processing Under Latency Constraints. BT - Proceedings of the
11th ACM International Conference on Distributed and Event-based
Systems, DEBS 2017
• Gulisano, V., Papadopoulos, A. V., Nikolakopoulos, Y., Papatriantafilou, M.,
& Tsigas, P. (2017). Performance Modeling of Stream Joins. Proceedings
of the 11th ACM International Conference on Distributed and Event-
based Systems, DEBS 2017
• Geethakumari, P. R., Gulisano, V., Svensson, B. J., Trancoso, P., & Sourdis,
I. (2017). Single window stream aggregation using reconfigurable
hardware. International Conference on Field Programmable Technology,
ICFPT 2017
• Gulisano, V., Tudor, V., Almgren, M., & Papatriantafilou, M. (2016). BES:
Differentially Private and Distributed Event Aggregation in Advanced
Metering Infrastructures. Proceedings of the 2nd ACM International
Workshop on Cyber-Physical System Security, CPSS@AsiaCCS
• Gulisano, V., Callau-Zori, M., Fu, Z., Jiménez-Peris, R., Papatriantafilou, M.,
& Patiño-Martínez, M. (2015). STONE: A streaming DDoS defense
framework. Expert Syst. Appl., 42
54
Parallelism Clustering
Parallelism DeterminismStream aggregates
Parallelism Determinism
Synchronization / Data structures
ModelingStream joins
FPGAs DeterminismStream aggregates
Differentially
private aggregation
DDoS detection
and mitigation
• Gulisano, V., Nikolakopoulos, Y., Papatriantafilou, M., & Tsigas, P. (2015).
Scalejoin: A deterministic, disjoint-parallel and skew-resilient stream join. 2015
IEEE International Conference on Big Data, Big Data 2015
• Gulisano, V., Nikolakopoulos, Y., Walulya, I., Papatriantafilou, M., & Tsigas, P.
(2015). Deterministic real-time analytics of geospatial data streams through
ScaleGate objects. Proceedings of the 9th ACM International Conference on
Distributed Event-Based Systems, DEBS ’15
• Gulisano, V., Almgren, M., & Papatriantafilou, M. (2014). METIS: A Two-Tier
Intrusion Detection System for Advanced Metering Infrastructures.
International Conference on Security and Privacy in Communication Networks -
10th International ICST Conference, SecureComm 2014
• Gulisano, V., Jiménez-Peris, R., Patiño-Martínez, M., Soriente, C., & Valduriez, P.
(2012). StreamCloud: An Elastic and Scalable Data Streaming System. IEEE
Trans. Parallel Distrib. Syst., 23
• Gulisano, V., Jiménez-Peris, R., Patiño-Martínez, M., & Valduriez, P. (2010).
StreamCloud: A Large Scale Data Streaming System. BT - 2010 International
Conference on Distributed Computing Systems, ICDCS 2010
55
Parallelism DeterminismStream joins
Parallelism Determinism
Synchronization / Data structures
Intrusion detection
Parallelism Determinism
Parallelism Determinism
Agenda
• Who we are
• Motivation
• The data streaming processing paradigm
• Challenges and research questions
• Conclusions
• Bibliography
56
Millions of years
of evolution
Millions of
sensors
• Store information
• Iterate multiple times over data
• Think, do not rush through decisions
• ”Hard-wired” routines
• Real-time decisions
• High-throughput / low-latency
Should I (really)
have an extra
piece of cake?
Danger!!!
Run!!!
Humans
57
Years / Decades
of evolution
Millions of
sensors
What traffic
congestion
patterns can I
observe
frequently?
Don’t take
over, car in
opposite lane!
• Store information
• Iterate multiple times over data
• Think, do not rush through decisions
Databases, data mining
techniques...
Data streaming, distributed
and parallel analysis
• Continuous analysis
• Real-time decisions
• High-throughput / low-latency
Computers
(cyber-physical / IoT systems)
58
Agenda
• Motivation
• The data streaming processing paradigm
• Challenges and research questions
• Conclusions
• Bibliography
59
Bibliography
1. Zhou, Jiazhen, Rose Qingyang Hu, and Yi Qian. "Scalable distributed communication architectures to support advanced
metering infrastructure in smart grid." IEEE Transactions on Parallel and Distributed Systems 23.9 (2012): 1632-1642.
2. Gulisano, Vincenzo, et al. "BES: Differentially Private and Distributed Event Aggregation in Advanced Metering Infrastructures."
Proceedings of the 2nd ACM International Workshop on Cyber-Physical System Security. ACM, 2016.
3. Gulisano, Vincenzo, Magnus Almgren, and Marina Papatriantafilou. "Online and scalable data validation in advanced metering
infrastructures." IEEE PES Innovative Smart Grid Technologies, Europe. IEEE, 2014.
4. Gulisano, Vincenzo, Magnus Almgren, and Marina Papatriantafilou. "METIS: a two-tier intrusion detection system for advanced
metering infrastructures." International Conference on Security and Privacy in Communication Systems. Springer International
Publishing, 2014.
5. Yousefi, Saleh, Mahmoud Siadat Mousavi, and Mahmood Fathy. "Vehicular ad hoc networks (VANETs): challenges and
perspectives." 2006 6th International Conference on ITS Telecommunications. IEEE, 2006.
6. El Zarki, Magda, et al. "Security issues in a future vehicular network." European Wireless. Vol. 2. 2002.
7. Georgiadis, Giorgos, and Marina Papatriantafilou. "Dealing with storage without forecasts in smart grids: Problem
transformation and online scheduling algorithm." Proceedings of the 29th Annual ACM Symposium on Applied Computing.
ACM, 2014.
8. Fu, Zhang, et al. "Online temporal-spatial analysis for detection of critical events in Cyber-Physical Systems." Big Data (Big
Data), 2014 IEEE International Conference on. IEEE, 2014.
60
Bibliography
9. Arasu, Arvind, et al. "Linear road: a stream data management benchmark." Proceedings of the Thirtieth
international conference on Very large data bases-Volume 30. VLDB Endowment, 2004.
10. Lv, Yisheng, et al. "Traffic flow prediction with big data: a deep learning approach." IEEE Transactions on
Intelligent Transportation Systems 16.2 (2015): 865-873.
11. Grochocki, David, et al. "AMI threats, intrusion detection requirements and deployment
recommendations." Smart Grid Communications (SmartGridComm), 2012 IEEE Third International
Conference on. IEEE, 2012.
12. Molina-Markham, Andrés, et al. "Private memoirs of a smart meter." Proceedings of the 2nd ACM
workshop on embedded sensing systems for energy-efficiency in building. ACM, 2010.
13. Gulisano, Vincenzo, et al. "Streamcloud: A large scale data streaming system." Distributed Computing
Systems (ICDCS), 2010 IEEE 30th International Conference on. IEEE, 2010.
14. Stonebraker, Michael, Uǧur Çetintemel, and Stan Zdonik. "The 8 requirements of real-time stream
processing." ACM SIGMOD Record 34.4 (2005): 42-47.
15. Bonomi, Flavio, et al. "Fog computing and its role in the internet of things." Proceedings of the first
edition of the MCC workshop on Mobile cloud computing. ACM, 2012.
16. Himmelsbach, Michael, et al. "LIDAR-based 3D object perception." Proceedings of 1st international
workshop on cognition for technical systems. Vol. 1. 2008.
61
Bibliography
17. Geiger, Andreas, et al. "Vision meets robotics: The KITTI dataset." The International Journal of Robotics Research (2013):
0278364913491297.
18. Gulisano, Vincenzo Massimiliano. StreamCloud: An Elastic Parallel-Distributed Stream Processing Engine. Diss. Informatica,
2012.
19. Cardellini, Valeria, et al. "Optimal operator placement for distributed stream processing applications." Proceedings of the 10th
ACM International Conference on Distributed and Event-based Systems. ACM, 2016.
20. Costache, Stefania, et al. "Understanding the Data-Processing Challenges in Intelligent Vehicular Systems." Proceedings of the
2016 IEEE Intelligent Vehicles Symposium (IV16).
21. Cormode, Graham. "The continuous distributed monitoring model." ACM SIGMOD Record 42.1 (2013): 5-14.
22. Giatrakos, Nikos, Antonios Deligiannakis, and Minos Garofalakis. "Scalable Approximate Query Tracking over Highly Distributed
Data Streams." Proceedings of the 2016 International Conference on Management of Data. ACM, 2016.
23. Gulisano, Vincenzo, et al. "Streamcloud: An elastic and scalable data streaming system." IEEE Transactions on Parallel and
Distributed Systems 23.12 (2012): 2351-2365.
24. Shah, Mehul A., et al. "Flux: An adaptive partitioning operator for continuous query systems." Data Engineering, 2003.
Proceedings. 19th International Conference on. IEEE, 2003.
62
Bibliography
25. Cederman, Daniel, et al. "Brief announcement: concurrent data structures for efficient streaming aggregation." Proceedings of
the 26th ACM symposium on Parallelism in algorithms and architectures. ACM, 2014.
26. Ji, Yuanzhen, et al. "Quality-driven processing of sliding window aggregates over out-of-order data streams." Proceedings of
the 9th ACM International Conference on Distributed Event-Based Systems. ACM, 2015.
27. Ji, Yuanzhen, et al. "Quality-driven disorder handling for concurrent windowed stream queries with shared operators."
Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems. ACM, 2016.
28. Gulisano, Vincenzo, et al. "Scalejoin: A deterministic, disjoint-parallel and skew-resilient stream join." Big Data (Big Data), 2015
IEEE International Conference on. IEEE, 2015.
29. Ottenwälder, Beate, et al. "MigCEP: operator migration for mobility driven distributed complex event processing." Proceedings
of the 7th ACM international conference on Distributed event-based systems. ACM, 2013.
30. De Matteis, Tiziano, and Gabriele Mencagli. "Keep calm and react with foresight: strategies for low-latency and energy-efficient
elastic data stream processing." Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel
Programming. ACM, 2016.
31. Balazinska, Magdalena, et al. "Fault-tolerance in the Borealis distributed stream processing system." ACM Transactions on
Database Systems (TODS) 33.1 (2008): 3.
32. Castro Fernandez, Raul, et al. "Integrating scale out and fault tolerance in stream processing using operator state
management." Proceedings of the 2013 ACM SIGMOD international conference on Management of data. ACM, 2013.
63
Bibliography
33. Dwork, Cynthia. "Differential privacy: A survey of results." International Conference on Theory and Applications of
Models of Computation. Springer Berlin Heidelberg, 2008.
34. Dwork, Cynthia, et al. "Differential privacy under continual observation." Proceedings of the forty-second ACM
symposium on Theory of computing. ACM, 2010.
35. Kargl, Frank, Arik Friedman, and Roksana Boreli. "Differential privacy in intelligent transportation systems." Proceedings
of the sixth ACM conference on Security and privacy in wireless and mobile networks. ACM, 2013.
64

More Related Content

What's hot

When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Pro...
When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Pro...When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Pro...
When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Pro...
Anis Nasir
 
The Power of Both Choices: Practical Load Balancing for Distributed Stream Pr...
The Power of Both Choices: Practical Load Balancing for Distributed Stream Pr...The Power of Both Choices: Practical Load Balancing for Distributed Stream Pr...
The Power of Both Choices: Practical Load Balancing for Distributed Stream Pr...
Anis Nasir
 
defense_slides_pengdu
defense_slides_pengdudefense_slides_pengdu
defense_slides_pengdu
Peng Du
 
Streaming Algorithms
Streaming AlgorithmsStreaming Algorithms
Streaming Algorithms
Joe Kelley
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
Antonio Severien
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
Antonio Severien
 
18 Data Streams
18 Data Streams18 Data Streams
18 Data Streams
Pier Luca Lanzi
 
Mining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDTMining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDT
Davide Gallitelli
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
Albert Bifet
 
Mining big data streams with APACHE SAMOA by Albert Bifet
Mining big data streams with APACHE SAMOA by Albert BifetMining big data streams with APACHE SAMOA by Albert Bifet
Mining big data streams with APACHE SAMOA by Albert Bifet
J On The Beach
 
Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014
Raja Chiky
 
A Scalable Dataflow Implementation of Curran's Approximation Algorithm
A Scalable Dataflow Implementation of Curran's Approximation AlgorithmA Scalable Dataflow Implementation of Curran's Approximation Algorithm
A Scalable Dataflow Implementation of Curran's Approximation Algorithm
NECST Lab @ Politecnico di Milano
 
Stream data mining & CluStream framework
Stream data mining & CluStream frameworkStream data mining & CluStream framework
Stream data mining & CluStream framework
Yueshen Xu
 
ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...
ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...
ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...
Srinath Perera
 
Self-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesSelf-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policies
NECST Lab @ Politecnico di Milano
 
Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream mining
Albert Bifet
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Flink Forward
 
Big Graph Analytics Systems (Sigmod16 Tutorial)
Big Graph Analytics Systems (Sigmod16 Tutorial)Big Graph Analytics Systems (Sigmod16 Tutorial)
Big Graph Analytics Systems (Sigmod16 Tutorial)
Yuanyuan Tian
 
Joey gonzalez, graph lab, m lconf 2013
Joey gonzalez, graph lab, m lconf 2013Joey gonzalez, graph lab, m lconf 2013
Joey gonzalez, graph lab, m lconf 2013
MLconf
 
Hyperspectral Image Reduction
Hyperspectral Image ReductionHyperspectral Image Reduction
Hyperspectral Image Reduction
Sarah Hussein
 

What's hot (20)

When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Pro...
When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Pro...When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Pro...
When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Pro...
 
The Power of Both Choices: Practical Load Balancing for Distributed Stream Pr...
The Power of Both Choices: Practical Load Balancing for Distributed Stream Pr...The Power of Both Choices: Practical Load Balancing for Distributed Stream Pr...
The Power of Both Choices: Practical Load Balancing for Distributed Stream Pr...
 
defense_slides_pengdu
defense_slides_pengdudefense_slides_pengdu
defense_slides_pengdu
 
Streaming Algorithms
Streaming AlgorithmsStreaming Algorithms
Streaming Algorithms
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
 
18 Data Streams
18 Data Streams18 Data Streams
18 Data Streams
 
Mining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDTMining high speed data streams: Hoeffding and VFDT
Mining high speed data streams: Hoeffding and VFDT
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
 
Mining big data streams with APACHE SAMOA by Albert Bifet
Mining big data streams with APACHE SAMOA by Albert BifetMining big data streams with APACHE SAMOA by Albert Bifet
Mining big data streams with APACHE SAMOA by Albert Bifet
 
Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014
 
A Scalable Dataflow Implementation of Curran's Approximation Algorithm
A Scalable Dataflow Implementation of Curran's Approximation AlgorithmA Scalable Dataflow Implementation of Curran's Approximation Algorithm
A Scalable Dataflow Implementation of Curran's Approximation Algorithm
 
Stream data mining & CluStream framework
Stream data mining & CluStream frameworkStream data mining & CluStream framework
Stream data mining & CluStream framework
 
ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...
ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...
ACM DEBS Grand Challenge: Continuous Analytics on Geospatial Data Streams wit...
 
Self-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesSelf-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policies
 
Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream mining
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
 
Big Graph Analytics Systems (Sigmod16 Tutorial)
Big Graph Analytics Systems (Sigmod16 Tutorial)Big Graph Analytics Systems (Sigmod16 Tutorial)
Big Graph Analytics Systems (Sigmod16 Tutorial)
 
Joey gonzalez, graph lab, m lconf 2013
Joey gonzalez, graph lab, m lconf 2013Joey gonzalez, graph lab, m lconf 2013
Joey gonzalez, graph lab, m lconf 2013
 
Hyperspectral Image Reduction
Hyperspectral Image ReductionHyperspectral Image Reduction
Hyperspectral Image Reduction
 

Similar to The data streaming processing paradigm and its use in modern fog architectures

Data Streaming in IoT and Big Data Analytics
Data Streaming in  IoT and Big Data AnalyticsData Streaming in  IoT and Big Data Analytics
Data Streaming in IoT and Big Data Analytics
Vincenzo Gulisano
 
Clickstream data with spark
Clickstream data with sparkClickstream data with spark
Clickstream data with spark
Marissa Saunders
 
Stream Processing
Stream Processing Stream Processing
Stream Processing
FogGuru MSCA Project
 
From complex Systems to Networks: Discovering and Modeling the Correct Network"
From complex Systems to Networks: Discovering and Modeling the Correct Network"From complex Systems to Networks: Discovering and Modeling the Correct Network"
From complex Systems to Networks: Discovering and Modeling the Correct Network"
diannepatricia
 
distributed system lab materials about ad
distributed system lab materials about addistributed system lab materials about ad
distributed system lab materials about ad
milkesa13
 
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio VillanustreBig Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre
HPCC Systems
 
IoT beneath your feet - building smart roads and networks
IoT beneath your feet - building smart roads and networksIoT beneath your feet - building smart roads and networks
IoT beneath your feet - building smart roads and networks
Alcatel-Lucent Enterprise
 
The DEBS Grand Challenge 2017
The DEBS Grand Challenge 2017The DEBS Grand Challenge 2017
The DEBS Grand Challenge 2017
Holistic Benchmarking of Big Linked Data
 
Using Graph Analysis and Fraud Detection in the Fintech Industry
Using Graph Analysis and Fraud Detection in the Fintech IndustryUsing Graph Analysis and Fraud Detection in the Fintech Industry
Using Graph Analysis and Fraud Detection in the Fintech Industry
Stanka Dalekova
 
Using Graph Analysis and Fraud Detection in the Fintech Industry
Using Graph Analysis and Fraud Detection in the Fintech IndustryUsing Graph Analysis and Fraud Detection in the Fintech Industry
Using Graph Analysis and Fraud Detection in the Fintech Industry
Stanka Dalekova
 
Disadvantages Of Robotium
Disadvantages Of RobotiumDisadvantages Of Robotium
Disadvantages Of Robotium
Susan Tullis
 
introduction to advanced distributed system
introduction to advanced distributed systemintroduction to advanced distributed system
introduction to advanced distributed system
milkesa13
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
Mohit Garg
 
Stream Processing Overview
Stream Processing OverviewStream Processing Overview
Stream Processing Overview
Maycon Viana Bordin
 
Presentation mz
Presentation mzPresentation mz
Get Started with Data Science by Analyzing Traffic Data from California Highways
Get Started with Data Science by Analyzing Traffic Data from California HighwaysGet Started with Data Science by Analyzing Traffic Data from California Highways
Get Started with Data Science by Analyzing Traffic Data from California Highways
Aerospike, Inc.
 
WWV2015: Jibes Paul van der Hulst big data
WWV2015: Jibes Paul van der Hulst big dataWWV2015: Jibes Paul van der Hulst big data
WWV2015: Jibes Paul van der Hulst big data
webwinkelvakdag
 
Shikha fdp 62_14july2017
Shikha fdp 62_14july2017Shikha fdp 62_14july2017
Shikha fdp 62_14july2017
Dr. Shikha Mehta
 
TransPAC3/ACE Measurement & PerfSONAR Update
TransPAC3/ACE Measurement & PerfSONAR UpdateTransPAC3/ACE Measurement & PerfSONAR Update
TransPAC3/ACE Measurement & PerfSONAR Update
International Networking at Indiana University
 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging Environments
Paul Groth
 

Similar to The data streaming processing paradigm and its use in modern fog architectures (20)

Data Streaming in IoT and Big Data Analytics
Data Streaming in  IoT and Big Data AnalyticsData Streaming in  IoT and Big Data Analytics
Data Streaming in IoT and Big Data Analytics
 
Clickstream data with spark
Clickstream data with sparkClickstream data with spark
Clickstream data with spark
 
Stream Processing
Stream Processing Stream Processing
Stream Processing
 
From complex Systems to Networks: Discovering and Modeling the Correct Network"
From complex Systems to Networks: Discovering and Modeling the Correct Network"From complex Systems to Networks: Discovering and Modeling the Correct Network"
From complex Systems to Networks: Discovering and Modeling the Correct Network"
 
distributed system lab materials about ad
distributed system lab materials about addistributed system lab materials about ad
distributed system lab materials about ad
 
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio VillanustreBig Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre
 
IoT beneath your feet - building smart roads and networks
IoT beneath your feet - building smart roads and networksIoT beneath your feet - building smart roads and networks
IoT beneath your feet - building smart roads and networks
 
The DEBS Grand Challenge 2017
The DEBS Grand Challenge 2017The DEBS Grand Challenge 2017
The DEBS Grand Challenge 2017
 
Using Graph Analysis and Fraud Detection in the Fintech Industry
Using Graph Analysis and Fraud Detection in the Fintech IndustryUsing Graph Analysis and Fraud Detection in the Fintech Industry
Using Graph Analysis and Fraud Detection in the Fintech Industry
 
Using Graph Analysis and Fraud Detection in the Fintech Industry
Using Graph Analysis and Fraud Detection in the Fintech IndustryUsing Graph Analysis and Fraud Detection in the Fintech Industry
Using Graph Analysis and Fraud Detection in the Fintech Industry
 
Disadvantages Of Robotium
Disadvantages Of RobotiumDisadvantages Of Robotium
Disadvantages Of Robotium
 
introduction to advanced distributed system
introduction to advanced distributed systemintroduction to advanced distributed system
introduction to advanced distributed system
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
 
Stream Processing Overview
Stream Processing OverviewStream Processing Overview
Stream Processing Overview
 
Presentation mz
Presentation mzPresentation mz
Presentation mz
 
Get Started with Data Science by Analyzing Traffic Data from California Highways
Get Started with Data Science by Analyzing Traffic Data from California HighwaysGet Started with Data Science by Analyzing Traffic Data from California Highways
Get Started with Data Science by Analyzing Traffic Data from California Highways
 
WWV2015: Jibes Paul van der Hulst big data
WWV2015: Jibes Paul van der Hulst big dataWWV2015: Jibes Paul van der Hulst big data
WWV2015: Jibes Paul van der Hulst big data
 
Shikha fdp 62_14july2017
Shikha fdp 62_14july2017Shikha fdp 62_14july2017
Shikha fdp 62_14july2017
 
TransPAC3/ACE Measurement & PerfSONAR Update
TransPAC3/ACE Measurement & PerfSONAR UpdateTransPAC3/ACE Measurement & PerfSONAR Update
TransPAC3/ACE Measurement & PerfSONAR Update
 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging Environments
 

Recently uploaded

一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
74nqk8xf
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
Sachin Paul
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
v7oacc3l
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
aqzctr7x
 
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
zsjl4mimo
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
sameer shah
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
74nqk8xf
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
Social Samosa
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
soxrziqu
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 

Recently uploaded (20)

一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
 
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 

The data streaming processing paradigm and its use in modern fog architectures

  • 1. The data streaming processing paradigm and its use in modern fog architectures Vincenzo Gulisano, Ph.D.
  • 2. Agenda • Who we are (my group) • Motivation • The data streaming processing paradigm • Challenges and research questions • Conclusions • Bibliography 2
  • 3. Agenda • Who we are (my group) • Motivation • The data streaming processing paradigm • Challenges and research questions • Conclusions • Bibliography 3
  • 4. Who we are Chalmers university Computer Science and Engineering department Distributed Computing and Systems research group 4
  • 5. Who we are • ~15 PhD degrees awarded • ~12 PhD students, 5 Postdocs and 6 faculty • Acknowledged internationally as leading group in practical multicore synchronization algorithms and programming (results adopted by Java JDK, C++, Intel, NVIDIA) • Extensive network of academic and industrial collaborations and has been continuously supported by national and international projects 5
  • 6. Distributed Computing and Systems Research Group Department of Computer Science and engineering Chalmers University of Technology At our research team: Research expertise & projects Cyber Security Efficient parallel & stream computing Distributed systems IoT & Sensor Networks 6
  • 7. Agenda • Who we are • Motivation • The data streaming processing paradigm • Challenges and research questions • Conclusions • Bibliography 7
  • 8. • IoT • Edge • Fog • Cloud • Cyber-Physical System • Big Data ... BUZZ WORDS 8
  • 9. 9
  • 10. 10
  • 11. IoT enables for increased awareness, security, power-efficiency, ... large IoT systems are complex traditional data analysis techniques alone are not adequate! 11
  • 12. AMIs [1,2,3,4] VNs [5,6] • demand-response • scheduling [7] • micro-grids • detection of medium size blackouts [8] • detection of non technical losses • ... • autonomous driving • platooning • accident detection [9] • variable tolls [9] • congestion monitoring [10] • ... IoT enables for increased awareness, security, power-efficiency, ... 12
  • 13. AMIs VNs large IoT systems are complex Characteristics [15]: 1. edge location 2. location awareness 3. low latency 4. geographical distribution 5. large-scale 6. support for mobility 7. real-time interactions 8. predominance of wireless 9. heterogeneous 10. interoperability / federation 11. interaction with the cloud 13
  • 14. traditional data analysis techniques alone are not adequate! [13,14] 1. does the infrastructure allow for billions of readings per day to be transferred continuously? 2. the latency incurred while transferring data, does that undermine the utility of the analysis? 3. is it secure to concentrate all the data in a single place? [11] 4. is it smart to give away fine-grained data? [12] 14
  • 15. Agenda • Who we are • Motivation • The data streaming processing paradigm • Challenges and research questions • Conclusions • Bibliography 15
  • 16. Main Memory Motivation DBMS vs. DSMS Disk 1 Data Query Processing 3 Query results 2 Query Main Memory Query Processing Continuous Query Data Query results 16
  • 17. Before we start... about data streaming and Stream Processing Engines (SPEs) An incomplete list of SPEs (cf. related work in [18]): time Borealis The Aurora Project STanfordstREamdatAManager NiagaraCQ COUGAR StreamCloud Covering all of them / discussing which use cases are best for each one out of scope... the following show connection between what is being presented and a certain SPE 17
  • 18. data stream: unbounded sequence of tuples sharing the same schema Example: vehicles’ speed reports time Field Field vehicle id text time (secs) text speed (Km/h) double X coordinate double Y coordinate double A 8:00 55.5 X1 Y1 Let’s assume each source (e.g., vehicle) produces and delivers a timestamp sorted stream A 8:07 34.3 X3 Y3 A 8:03 70.3 X2 Y2 18
  • 19. continuous query (or simply query): Directed Acyclic Graph (DAG) of streams and operators OP OP OP OP OP OP OP source op (1+ out streams) sink op (1+ in streams) stream op (1+ in, 1+ out streams) 19
  • 20. data streaming operators Two main types: • Stateless operators • do not maintain any state • one-by-one processing • if they maintain some state, such state does not evolve depending on the tuples being processed • Stateful operators • maintain a state that evolves depending on the tuples being processed • produce output tuples that depend on multiple input tuples OP OP 20
  • 21. stateless operators Filter ... Map Union ... Filter / route tuples based on one (or more) conditions Transform each tuple Merge multiple streams (with the same schema) into one 21
  • 25. stateful operators Aggregate information from multiple tuples (e.g., max, min, sum, ...) Join tuples coming from 2 streams given a certain predicate Aggregate Join 25
  • 26. stateful operators source: http://storm.apache.org/releases/2.0.0-SNAPSHOT/Trident-tutorial.html source: http://spark.apache.org/docs/latest/streaming-programming- guide.html#transformations-on-dstreams source: http://spark.apache.org/docs/latest/streaming-programming- guide.html#transformations-on-dstreams 26
  • 27. Wait a moment! if streams are unbounded, how can we aggregate or join? 27
  • 28. windows and stateful analysis [18] Stateful operations are done over windows: • Time-based (e.g., tuples in the last 10 minutes) • Tuple-based (e.g., given the last 50 tuples) time [8:00,9:00) [8:20,9:20) [8:40,9:40) Usually applications rely on time-based sliding windows 28
  • 29. time-based sliding window aggregation (count) Counter: 4 time [8:00,9:00) 8:05 8:15 8:22 8:45 9:05 Output: 4 Counter: 1 Counter: 2 Counter: 3 Counter: 3 time 8:05 8:15 8:22 8:45 9:05 [8:20,9:20) we assumed each source produces and delivers a timestamp sorted stream! What happens if this is not the case? 29
  • 30. windows and stateful analysis 30
  • 31. basic operators and user-defined operators Besides a set of basic operators, SPEs usually allow the user to define ad-hoc operators (e.g., when existing aggregation are not enough) 31
  • 32. sample query For each vehicle, raise an alert if the speed of the latest report is more than 2 times higher than its average speed in the last 30 days. time A 8:00 55.5 X1 Y1 A 8:07 34.3 X3 Y3 A 8:03 70.3 X2 Y2 32
  • 33. Remove unused fields Map Field vehicle id time (secs) speed (Km/h) X coordinate Y coordinate Field vehicle id time (secs) speed (Km/h) Compute average speed for each vehicle during the last 30 days Aggregate Field vehicle id time (secs) avg speed (Km/h) Join Check condition Filter Field vehicle id time (secs) speed (Km/h) Join on vehicle id Field vehicle id time (secs) avg speed (Km/h) speed (Km/h) sample query 33
  • 34. M A J F sample query Notice: • the same semantics can be defined in several ways (using different operators and composing them in different ways) • Using many basic building blocks can ease the task of distributing and parallelizing the analysis (more in the following...) 34
  • 35. Why data streaming, then? Expressive Online Parallel & Distributed 35
  • 36. M A J F At the edge, in parallel at each vehicle M A J F 36
  • 37. M A J FAt the cloud 37
  • 38. Distributed, edge + fog + cloud A J M F 38
  • 39. Agenda • Who we are • Motivation • The data streaming processing paradigm • Challenges and research questions • Conclusions • Bibliography 39
  • 40. 1. Distributed deployment 2. Parallel deployment 3. Ordering and determinism 4. Shared-nothing vs shared-memory parallelism 5. Load balancing 6. Elasticity 7. Fault tolerance 8. Data sharing Challenges and research questions 40
  • 41. Picture source: Modern OS, by A. Tanenbaum Data analysts Stream Processing Engine OS / Hardware How can research support these layers? Why is it challenging? 41
  • 42. A minimal example (... the data analyst) tell me the average speed (per week) of each car Road Side Unit (RSU) 42
  • 43. Parallelization challenges compute average speed (per week) of each car compute average speed (per week) of each car Does this work? What if a car travels close to different RSUs (and sends data to both)? 43
  • 44. 1 3 8 9 13 6.8 1 3 2 10 1 3 2 10 6 44
  • 45. Parallelization challenges compute average speed (per week) of each car Odd plate number Even plate numberSend to other RSU compute average speed (per week) of each car Even plate number Odd plate numberSend to other RSU They are now communicating! 45
  • 46. Balancing challenges compute average speed (per week) of each car Odd plate number Even plate number Send to other RSU compute average speed (per week) of each car Even plate number Odd plate number Send to other RSU Too much pollution, we will introduce the alternate license plates policy! Thanks! Now I am using only 50% of my resources... 46
  • 47. Balancing challenges compute average speed (per week) of each car Dynamic condition TRUE Dynamic condition FALSE Send to other RSU compute average speed (per week) of each car Dynamic condition FALSE Dynamic condition TRUE Send to other RSU They are now communicating! There must be monitoring! We need state transfer protocols! 47
  • 48. Balancing challenges Time responsible for responsible for dynamic condition change (in between the week!) What about the data already sent to ? tell me the average speed (per week) of each car 48
  • 49. Adapting challenges compute average speed (per week) of each car Dynamic condition TRUE Dynamic condition FALSE Send to other RSU compute average speed (per week) of each car Dynamic condition FALSE Dynamic condition TRUE Send to other RSU What about the data already sent to ? 49
  • 50. Balancing challenges compute average speed (per week) of each car Dynamic condition TRUE Dynamic condition FALSE Send to other RSU compute average speed (per week) of each car Dynamic condition FALSE Dynamic condition TRUE Send to other RSU They are now communicating! There must be monitoring! We need state transfer protocols! We need backups! What about the data sent between and ? Cars need to buffer data too! Cars need to buffer data too! 50
  • 51. Faulttolerance Elasticity Loadbalancing Determinism Parallel execution of streaming operators 51 Parallel execution of streaming applications DDoSdetection andmitigation Intrusiondetection Datavalidation Differentially privateaggregation Vehicularnetworks analysis SmartGrids/AMI analysis Security and privacy IoT Cyber-Physical Systems Provenance and custom scheduling
  • 52. Synchronization / Data structures Faulttolerance Elasticity Loadbalancing Determinism Parallel execution of streaming operators 52 Parallel execution of streaming applications Many-core systems / FPGAs DDoSdetection andmitigation Intrusiondetection Datavalidation Differentially privateaggregation Vehicularnetworks analysis SmartGrids/AMI analysis Parallel joins Parallel aggregates Joins modeling Security and privacy IoT Cyber-Physical Systems Provenance and custom scheduling
  • 53. • Palyvos-Giannas, D., Gulisano, V., & Papatriantafilou, M. (2019). GeneaLog: Fine- grained data streaming provenance in cyber-physical systems. Parallel Computing • Najdataei, H., Nikolakopoulos, Y., Papatriantafilou, M., Tsigas, P., & Gulisano, V. (2019). STRETCH: Scalable and Elastic Deterministic Streaming Analysis with Virtual Shared-Nothing Parallelism. Proceedings of the 13th ACM International Conference on Distributed and Event-based Systems, DEBS 2019 • Palyvos-Giannas, D., Gulisano, V., & Papatriantafilou, M. (2019). Haren: A Framework for Ad-Hoc Thread Scheduling Policies for Data Streaming Applications. Proceedings of the 13th ACM International Conference on Distributed and Event-based Systems, DEBS 2019 • Havers, B., Duvignau, R., Najdataei, H., Gulisano, V., Koppisetty, A. C., & Papatriantafilou, M. (2019). DRIVEN: a Framework for Efficient Data Retrieval and Clustering in Vehicular Networks. 35th IEEE International Conference on Data Engineering, ICDE 2019 • Duvignau, R., Gulisano, V., Papatriantafilou, M., & Savic, V. (2019). Streaming piecewise linear approximation for efficient data management in edge computing. Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, SAC 2019 • Walulya, I., Palyvos-Giannas, D., Nikolakopoulos, Y., Gulisano, V., Papatriantafilou, M., & Tsigas, P. (2018). Viper: A module for communication-layer determinism and scaling in low-latency stream processing. Future Generation Comp. Syst., 88 • Rooij, J. van, Gulisano, V., & Papatriantafilou, M. (2018). LoCoVolt: Distributed Detection of Broken Meters in Smart Grids through Stream Processing. Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems, DEBS 2018 Provenance Determinism Vehicular networks AMI Parallelism Determinism Elasticity 53 Custom scheduling Vehicular networks Vehicular networks DeterminismParallelism Smart Grids / AMIs
  • 54. • Najdataei, H., Nikolakopoulos, Y., Gulisano, V., & Papatriantafilou, M. (2018). Continuous and Parallel LiDAR Point-Cloud Clustering. 38th IEEE International Conference on Distributed Computing Systems, ICDCS 2018 • Gulisano, V., Nikolakopoulos, Y., Cederman, D., Papatriantafilou, M., & Tsigas, P. (2017). Efficient Data Streaming Multiway Aggregation through Concurrent Algorithmic Designs and New Abstract Data Types. TOPC • Zacheilas, N., Kalogeraki, V., Nikolakopoulos, Y., Gulisano, V., Papatriantafilou, M., & Tsigas, P. (2017). Maximizing Determinism in Stream Processing Under Latency Constraints. BT - Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems, DEBS 2017 • Gulisano, V., Papadopoulos, A. V., Nikolakopoulos, Y., Papatriantafilou, M., & Tsigas, P. (2017). Performance Modeling of Stream Joins. Proceedings of the 11th ACM International Conference on Distributed and Event- based Systems, DEBS 2017 • Geethakumari, P. R., Gulisano, V., Svensson, B. J., Trancoso, P., & Sourdis, I. (2017). Single window stream aggregation using reconfigurable hardware. International Conference on Field Programmable Technology, ICFPT 2017 • Gulisano, V., Tudor, V., Almgren, M., & Papatriantafilou, M. (2016). BES: Differentially Private and Distributed Event Aggregation in Advanced Metering Infrastructures. Proceedings of the 2nd ACM International Workshop on Cyber-Physical System Security, CPSS@AsiaCCS • Gulisano, V., Callau-Zori, M., Fu, Z., Jiménez-Peris, R., Papatriantafilou, M., & Patiño-Martínez, M. (2015). STONE: A streaming DDoS defense framework. Expert Syst. Appl., 42 54 Parallelism Clustering Parallelism DeterminismStream aggregates Parallelism Determinism Synchronization / Data structures ModelingStream joins FPGAs DeterminismStream aggregates Differentially private aggregation DDoS detection and mitigation
  • 55. • Gulisano, V., Nikolakopoulos, Y., Papatriantafilou, M., & Tsigas, P. (2015). Scalejoin: A deterministic, disjoint-parallel and skew-resilient stream join. 2015 IEEE International Conference on Big Data, Big Data 2015 • Gulisano, V., Nikolakopoulos, Y., Walulya, I., Papatriantafilou, M., & Tsigas, P. (2015). Deterministic real-time analytics of geospatial data streams through ScaleGate objects. Proceedings of the 9th ACM International Conference on Distributed Event-Based Systems, DEBS ’15 • Gulisano, V., Almgren, M., & Papatriantafilou, M. (2014). METIS: A Two-Tier Intrusion Detection System for Advanced Metering Infrastructures. International Conference on Security and Privacy in Communication Networks - 10th International ICST Conference, SecureComm 2014 • Gulisano, V., Jiménez-Peris, R., Patiño-Martínez, M., Soriente, C., & Valduriez, P. (2012). StreamCloud: An Elastic and Scalable Data Streaming System. IEEE Trans. Parallel Distrib. Syst., 23 • Gulisano, V., Jiménez-Peris, R., Patiño-Martínez, M., & Valduriez, P. (2010). StreamCloud: A Large Scale Data Streaming System. BT - 2010 International Conference on Distributed Computing Systems, ICDCS 2010 55 Parallelism DeterminismStream joins Parallelism Determinism Synchronization / Data structures Intrusion detection Parallelism Determinism Parallelism Determinism
  • 56. Agenda • Who we are • Motivation • The data streaming processing paradigm • Challenges and research questions • Conclusions • Bibliography 56
  • 57. Millions of years of evolution Millions of sensors • Store information • Iterate multiple times over data • Think, do not rush through decisions • ”Hard-wired” routines • Real-time decisions • High-throughput / low-latency Should I (really) have an extra piece of cake? Danger!!! Run!!! Humans 57
  • 58. Years / Decades of evolution Millions of sensors What traffic congestion patterns can I observe frequently? Don’t take over, car in opposite lane! • Store information • Iterate multiple times over data • Think, do not rush through decisions Databases, data mining techniques... Data streaming, distributed and parallel analysis • Continuous analysis • Real-time decisions • High-throughput / low-latency Computers (cyber-physical / IoT systems) 58
  • 59. Agenda • Motivation • The data streaming processing paradigm • Challenges and research questions • Conclusions • Bibliography 59
  • 60. Bibliography 1. Zhou, Jiazhen, Rose Qingyang Hu, and Yi Qian. "Scalable distributed communication architectures to support advanced metering infrastructure in smart grid." IEEE Transactions on Parallel and Distributed Systems 23.9 (2012): 1632-1642. 2. Gulisano, Vincenzo, et al. "BES: Differentially Private and Distributed Event Aggregation in Advanced Metering Infrastructures." Proceedings of the 2nd ACM International Workshop on Cyber-Physical System Security. ACM, 2016. 3. Gulisano, Vincenzo, Magnus Almgren, and Marina Papatriantafilou. "Online and scalable data validation in advanced metering infrastructures." IEEE PES Innovative Smart Grid Technologies, Europe. IEEE, 2014. 4. Gulisano, Vincenzo, Magnus Almgren, and Marina Papatriantafilou. "METIS: a two-tier intrusion detection system for advanced metering infrastructures." International Conference on Security and Privacy in Communication Systems. Springer International Publishing, 2014. 5. Yousefi, Saleh, Mahmoud Siadat Mousavi, and Mahmood Fathy. "Vehicular ad hoc networks (VANETs): challenges and perspectives." 2006 6th International Conference on ITS Telecommunications. IEEE, 2006. 6. El Zarki, Magda, et al. "Security issues in a future vehicular network." European Wireless. Vol. 2. 2002. 7. Georgiadis, Giorgos, and Marina Papatriantafilou. "Dealing with storage without forecasts in smart grids: Problem transformation and online scheduling algorithm." Proceedings of the 29th Annual ACM Symposium on Applied Computing. ACM, 2014. 8. Fu, Zhang, et al. "Online temporal-spatial analysis for detection of critical events in Cyber-Physical Systems." Big Data (Big Data), 2014 IEEE International Conference on. IEEE, 2014. 60
  • 61. Bibliography 9. Arasu, Arvind, et al. "Linear road: a stream data management benchmark." Proceedings of the Thirtieth international conference on Very large data bases-Volume 30. VLDB Endowment, 2004. 10. Lv, Yisheng, et al. "Traffic flow prediction with big data: a deep learning approach." IEEE Transactions on Intelligent Transportation Systems 16.2 (2015): 865-873. 11. Grochocki, David, et al. "AMI threats, intrusion detection requirements and deployment recommendations." Smart Grid Communications (SmartGridComm), 2012 IEEE Third International Conference on. IEEE, 2012. 12. Molina-Markham, Andrés, et al. "Private memoirs of a smart meter." Proceedings of the 2nd ACM workshop on embedded sensing systems for energy-efficiency in building. ACM, 2010. 13. Gulisano, Vincenzo, et al. "Streamcloud: A large scale data streaming system." Distributed Computing Systems (ICDCS), 2010 IEEE 30th International Conference on. IEEE, 2010. 14. Stonebraker, Michael, Uǧur Çetintemel, and Stan Zdonik. "The 8 requirements of real-time stream processing." ACM SIGMOD Record 34.4 (2005): 42-47. 15. Bonomi, Flavio, et al. "Fog computing and its role in the internet of things." Proceedings of the first edition of the MCC workshop on Mobile cloud computing. ACM, 2012. 16. Himmelsbach, Michael, et al. "LIDAR-based 3D object perception." Proceedings of 1st international workshop on cognition for technical systems. Vol. 1. 2008. 61
  • 62. Bibliography 17. Geiger, Andreas, et al. "Vision meets robotics: The KITTI dataset." The International Journal of Robotics Research (2013): 0278364913491297. 18. Gulisano, Vincenzo Massimiliano. StreamCloud: An Elastic Parallel-Distributed Stream Processing Engine. Diss. Informatica, 2012. 19. Cardellini, Valeria, et al. "Optimal operator placement for distributed stream processing applications." Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems. ACM, 2016. 20. Costache, Stefania, et al. "Understanding the Data-Processing Challenges in Intelligent Vehicular Systems." Proceedings of the 2016 IEEE Intelligent Vehicles Symposium (IV16). 21. Cormode, Graham. "The continuous distributed monitoring model." ACM SIGMOD Record 42.1 (2013): 5-14. 22. Giatrakos, Nikos, Antonios Deligiannakis, and Minos Garofalakis. "Scalable Approximate Query Tracking over Highly Distributed Data Streams." Proceedings of the 2016 International Conference on Management of Data. ACM, 2016. 23. Gulisano, Vincenzo, et al. "Streamcloud: An elastic and scalable data streaming system." IEEE Transactions on Parallel and Distributed Systems 23.12 (2012): 2351-2365. 24. Shah, Mehul A., et al. "Flux: An adaptive partitioning operator for continuous query systems." Data Engineering, 2003. Proceedings. 19th International Conference on. IEEE, 2003. 62
  • 63. Bibliography 25. Cederman, Daniel, et al. "Brief announcement: concurrent data structures for efficient streaming aggregation." Proceedings of the 26th ACM symposium on Parallelism in algorithms and architectures. ACM, 2014. 26. Ji, Yuanzhen, et al. "Quality-driven processing of sliding window aggregates over out-of-order data streams." Proceedings of the 9th ACM International Conference on Distributed Event-Based Systems. ACM, 2015. 27. Ji, Yuanzhen, et al. "Quality-driven disorder handling for concurrent windowed stream queries with shared operators." Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems. ACM, 2016. 28. Gulisano, Vincenzo, et al. "Scalejoin: A deterministic, disjoint-parallel and skew-resilient stream join." Big Data (Big Data), 2015 IEEE International Conference on. IEEE, 2015. 29. Ottenwälder, Beate, et al. "MigCEP: operator migration for mobility driven distributed complex event processing." Proceedings of the 7th ACM international conference on Distributed event-based systems. ACM, 2013. 30. De Matteis, Tiziano, and Gabriele Mencagli. "Keep calm and react with foresight: strategies for low-latency and energy-efficient elastic data stream processing." Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, 2016. 31. Balazinska, Magdalena, et al. "Fault-tolerance in the Borealis distributed stream processing system." ACM Transactions on Database Systems (TODS) 33.1 (2008): 3. 32. Castro Fernandez, Raul, et al. "Integrating scale out and fault tolerance in stream processing using operator state management." Proceedings of the 2013 ACM SIGMOD international conference on Management of data. ACM, 2013. 63
  • 64. Bibliography 33. Dwork, Cynthia. "Differential privacy: A survey of results." International Conference on Theory and Applications of Models of Computation. Springer Berlin Heidelberg, 2008. 34. Dwork, Cynthia, et al. "Differential privacy under continual observation." Proceedings of the forty-second ACM symposium on Theory of computing. ACM, 2010. 35. Kargl, Frank, Arik Friedman, and Roksana Boreli. "Differential privacy in intelligent transportation systems." Proceedings of the sixth ACM conference on Security and privacy in wireless and mobile networks. ACM, 2013. 64

Editor's Notes

  1. Before we start... questions and please notice
  2. A like to be hands-on...
  3. say these are some examples...
  4. say these are some examples...
  5. say these are some examples...
  6. About the ordering we will come back later
  7. Need to explain a bit what is the approach I am following here
  8. maybe say the DB part is a bit of an oversimplifaction