The benefits of fine-grained synchronization in
deterministic and efficient stream processing
Vincenzo Gulisano, Yiannis Nikolakopoulos
2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 1
Chalmers University of Technology
Agenda
• Who we are
• Motivation and System Model
• ScaleJoin: a Deterministic, Disjoint-Parallel and
Skew-Resilient Stream Join
• The ScaleGate data structure
• Fine-grained synchronization and other
use-cases
• Research in progress
• Conclusions
Who we are
Chalmers University of Technology
Computer Science and Engineering department
Distributed Computing and Systems research group
Who we are
Distributed Computing and Systems research group
• 12 PhD degrees awarded
• ~20 researchers, 3 postdocs and 10 PhD students
• Acknowledged internationally as a leading group in practical multicore synchronization algorithms and programming (results adopted by Java JDK, C++, Intel, NVIDIA)
• Has an extensive network of academic and industrial collaborations and has been continuously supported by national and international projects
Who we are
Vincenzo Gulisano (PhD)
Assistant Professor at
Chalmers University of Technology
Research focus:
• scalable big data analysis in cyber-physical systems
(Smart Grids, Advanced Metering Infrastructures and
Vehicular Networks).
• streaming operators’ parallelization, load balancing,
provisioning, decommissioning and fault tolerance
protocols.
• enhancement of parallel stream operators by means of
streaming- and energy-aware concurrent data
structures that boost analysis performance (both in
throughput and latency).
• applied use of the streaming paradigm in data validation, intrusion detection systems, privacy-preserving online analysis, distributed embedded networks and cloud infrastructures.
Who we are
Yiannis Nikolakopoulos
PhD Student
Research interests:
• Parallel and distributed computing
• Shared memory concurrent data structures
• Multicore/Manycore systems
• Consistency problems
• Synchronization
• Data Streaming: operator parallelism & scalability
Motivation
Applications in sensor networks and cyber-physical systems:
• large and fluctuating volumes of data generated continuously
Demand for:
• Continuous processing of data streams
• In a real-time fashion
Store-then-process is not feasible
[Figure: store-then-process (1. data is stored on disk or in main memory, 2. a query is issued, 3. query results are returned) versus data streaming (a continuous query in main memory processes incoming data and produces query results).]
System Model
• Data Stream: unbounded sequence of tuples
– Example: Call Description Record (CDR)
Field        Type
Caller       text
Callee       text
Time (secs)  int
Price (€)    double
Example tuples along the time axis (Caller, Callee, Time, Price): <A, B, 8:00, 3>, <C, D, 8:20, 7>, <A, E, 8:35, 6>
System Model
• Operators:
– Stateless: 1 input tuple → 1 output tuple
– Stateful: 1+ input tuple(s) → 1 output tuple
System Model
• Continuous Query: graph of operators and streams
• Example: count the number of calls whose price is more than 10 dollars made by each caller
– Map (convert € → $) → Filter (only > 10 $) → Agg (count calls made by each caller number)
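The Map → Filter → Agg query above can be sketched with Python generators. This is only an illustration: the `CDR` type, the function names, the exchange rate and the sample prices are assumptions made for the example, and the aggregate counts over the whole input instead of over a window.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class CDR(NamedTuple):
    caller: str
    callee: str
    time: str
    price: float  # euros on input, dollars after the Map


EUR_TO_USD = 1.1  # hypothetical exchange rate, not taken from the slides


def to_dollars(stream: Iterable[CDR]) -> Iterator[CDR]:
    """Map (stateless): one input tuple produces one output tuple."""
    for t in stream:
        yield t._replace(price=t.price * EUR_TO_USD)


def only_above(stream: Iterable[CDR], threshold: float) -> Iterator[CDR]:
    """Filter (stateless): forward only tuples priced above the threshold."""
    for t in stream:
        if t.price > threshold:
            yield t


def count_per_caller(stream: Iterable[CDR]) -> Counter:
    """Agg (stateful): count the calls made by each caller."""
    return Counter(t.caller for t in stream)


# Hypothetical input; prices are chosen so that some calls pass the filter.
calls = [
    CDR("A", "B", "8:00", 3.0),
    CDR("C", "D", "8:20", 12.0),
    CDR("A", "E", "8:35", 10.0),
    CDR("A", "F", "8:50", 20.0),
]
counts = count_per_caller(only_above(to_dollars(calls), 10.0))
print(dict(counts))
```

With this input, caller A has two calls above 10 dollars and caller C has one.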
System Model
• Infinite sequence of tuples / bounded memory → windows
• Example: 1 hour windows along the time axis
– [8:00,9:00)
– [8:20,9:20)
– [8:40,9:40)
System Model
• Infinite sequence of tuples / bounded memory → windows
• Example: count tuples - 1 hour windows
– Tuples arrive at 8:05, 8:15, 8:22, 8:45 and 9:05
– Window [8:00,9:00): output 4
– Next window: [8:20,9:20)
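The window example can be reproduced with a few lines of Python. This is a sketch under assumptions: timestamps are minutes since midnight, windows are one hour long and advance by 20 minutes (read off the window boundaries shown on the slides), and all function names are made up for the illustration.

```python
from collections import Counter
from typing import Iterable, List


def minutes(hour: int, minute: int) -> int:
    """Timestamp as minutes since midnight."""
    return hour * 60 + minute


def window_starts(ts: int, size: int, advance: int, origin: int) -> List[int]:
    """Start times of all windows [start, start + size) that contain ts.

    Assumes ts >= origin and that windows start at origin, origin + advance, ...
    """
    start = origin + ((ts - origin) // advance) * advance
    starts = []
    while start > ts - size and start >= origin:
        starts.append(start)
        start -= advance
    return starts


def count_per_window(timestamps: Iterable[int], size: int, advance: int,
                     origin: int) -> Counter:
    """Number of tuples falling in each window, keyed by window start."""
    counts: Counter = Counter()
    for ts in timestamps:
        for start in window_starts(ts, size, advance, origin):
            counts[start] += 1
    return counts


arrivals = [minutes(8, 5), minutes(8, 15), minutes(8, 22),
            minutes(8, 45), minutes(9, 5)]
counts = count_per_window(arrivals, size=60, advance=20, origin=minutes(8, 0))
print(counts[minutes(8, 0)])   # window [8:00,9:00)
print(counts[minutes(8, 20)])  # window [8:20,9:20)
```

Window [8:00,9:00) contains the first four tuples, matching the output of 4 on the slide; window [8:20,9:20) contains the tuples at 8:22, 8:45 and 9:05.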
The evolution of Stream Processing Engines (SPEs)
Centralized SPEs → Distributed SPEs → Parallel-Distributed SPEs → Elastic SPEs
• Distributed SPEs: inter-operator parallelism
• Parallel-Distributed SPEs: intra-operator parallelism
• Elastic SPEs: scale up / down
What is a stream join?
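[Figure: two streams R and S delivering tuples t1, t2, t3, t4; a sliding window of size WS over each stream (windows WR and WS); tuples from opposite windows are matched according to predicate P.]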
Why parallel stream joins?
• WS = 600 seconds
• R receives 500 tuples/second
• S receives 500 tuples/second
• WR will contain 300,000 tuples
• WS will contain 300,000 tuples
• Each new tuple from R gets compared with
all the tuples in WS
• Each new tuple from S gets compared with
all the tuples in WR
… 300,000,000 comparisons/second!
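The numbers above follow from simple arithmetic, spelled out here as a small Python check (variable names are chosen for the illustration):

```python
window_size_s = 600  # WS, in seconds
rate_r = 500         # tuples/second received by R
rate_s = 500         # tuples/second received by S

tuples_in_wr = window_size_s * rate_r  # tuples held in window WR
tuples_in_ws = window_size_s * rate_s  # tuples held in window WS

# Every new R tuple is compared with all of WS, every new S tuple with all of WR.
comparisons_per_second = rate_r * tuples_in_ws + rate_s * tuples_in_wr

print(tuples_in_wr, tuples_in_ws, comparisons_per_second)
```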
What are the challenges of a parallel stream join?
• Scalability
• High throughput
• Low latency
• Disjoint parallelism
• Skew resilience
• Determinism
The 3-step procedure (sequential stream join)
For each incoming tuple t:
1. compare t with all tuples in opposite window given predicate P
2. add t to its window
3. remove stale tuples from t’s window
[Figure: producers Prod R and Prod S add tuples to R and S at a single processing unit (PU); a consumer (Cons) consumes the results.]
We assume each producer delivers tuples in timestamp order.
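A minimal single-threaded sketch of the 3-step procedure follows. The `Tup` type, the function name and the sample data are assumptions for the illustration. One detail is also an assumption of this sketch: since step 3 only purges t's own window, the comparison in step 1 skips tuples of the opposite window that are already older than the window size.

```python
from collections import deque
from typing import Callable, Deque, Dict, Iterable, List, NamedTuple, Tuple


class Tup(NamedTuple):
    ts: int      # timestamp
    stream: str  # "R" or "S"
    value: int


def sequential_join(tuples: Iterable[Tup], window_size: int,
                    predicate: Callable[[Tup, Tup], bool]) -> List[Tuple[Tup, Tup]]:
    """Join R and S tuples that arrive as one timestamp-ordered sequence."""
    windows: Dict[str, Deque[Tup]] = {"R": deque(), "S": deque()}
    results = []
    for t in tuples:
        opposite = windows["S" if t.stream == "R" else "R"]
        # Step 1: compare t with the tuples in the opposite window.
        for o in opposite:
            if t.ts - o.ts < window_size:  # ignore tuples that are already stale
                r, s = (t, o) if t.stream == "R" else (o, t)
                if predicate(r, s):
                    results.append((r, s))
        # Step 2: add t to its own window.
        own = windows[t.stream]
        own.append(t)
        # Step 3: remove stale tuples from t's window.
        while own and t.ts - own[0].ts >= window_size:
            own.popleft()
    return results


# Hypothetical input, already in timestamp order.
stream = [Tup(1, "R", 5), Tup(2, "S", 5), Tup(3, "R", 7),
          Tup(4, "S", 7), Tup(20, "S", 5)]
pairs = sequential_join(stream, window_size=10,
                        predicate=lambda r, s: r.value == s.value)
print([(r.ts, s.ts) for r, s in pairs])
```

The S tuple at timestamp 20 produces no output because the matching R tuple at timestamp 1 is more than one window in the past.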
The 3-step procedure, is it enough?
Challenges: scalability, high throughput, low latency, disjoint parallelism, skew resilience, determinism.
[Figure: streams R and S with windows WR and WS, each holding t1 and t2, and newly arriving tuples t3 and t4.]
Enforcing determinism in sequential stream joins
• Next tuple to process = earliest(tS,tR)
• The earliest(tS,tR) tuple is referred to as the next ready tuple
• Process ready tuples in timestamp order → Determinism
[Figure: the PU chooses between the pending tuples tS and tR.]
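The "next ready tuple" rule can be sketched for two timestamp-sorted input queues. The names and the tie-breaking rule (R before S on equal timestamps) are assumptions of this sketch. A tuple is returned only when both queues hold a pending tuple, since otherwise the empty side could still deliver an earlier one.

```python
from collections import deque
from typing import Deque, Optional, Tuple

StreamTuple = Tuple[int, str]  # (timestamp, stream name)


def next_ready(queue_r: Deque[StreamTuple],
               queue_s: Deque[StreamTuple]) -> Optional[StreamTuple]:
    """Return earliest(tS, tR), or None if no tuple is known to be ready."""
    if not queue_r or not queue_s:
        return None  # the empty input may still deliver an earlier tuple
    source = queue_r if queue_r[0][0] <= queue_s[0][0] else queue_s
    return source.popleft()


# Hypothetical pending tuples; each queue is in timestamp order.
queue_r = deque([(1, "R"), (4, "R"), (9, "R")])
queue_s = deque([(2, "S"), (3, "S")])

ready = []
while True:
    t = next_ready(queue_r, queue_s)
    if t is None:
        break
    ready.append(t)
print(ready)
```

The R tuple with timestamp 4 stays pending: S has delivered nothing later than 3, so an S tuple with a timestamp between 3 and 4 could still arrive.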
Deterministic 3-step procedure
Pick the next ready tuple t:
1. compare t with all tuples in opposite window given predicate P
2. add t to its window
3. remove stale tuples from t’s window
Shared-nothing parallel stream join (state-of-the-art)
• Producers (Prod R, Prod S): choose a PU, then add the tuple to PUi's R or PUi's S
• Each PU (PU1, PU2, …, PUN): pick the next ready tuple t, then
1. compare t with all tuples in opposite window given P
2. add t to its window
3. remove stale tuples from t’s window
• Merge: take the next ready output tuple
• Consumer (Cons): consume results
Challenges: scalability, high throughput, low latency, disjoint parallelism, skew resilience, determinism.
Shared-nothing parallel stream join (state-of-the-art)
[Figure: Prod R and Prod S feed PU1, PU2, …, PUN through queues accessed with enqueue() and dequeue(); a Merge step feeds Cons.]
From coarse-grained to fine-grained synchronization
[Figure: Prod R and Prod S, PU1, PU2, …, PUN and Cons.]
ScaleGate
addTuple(tuple,sourceID)
allows a tuple from sourceID to be merged by ScaleGate into the resulting timestamp-sorted stream of ready tuples.
getNextReadyTuple(readerID)
provides to readerID the next earliest ready tuple that this reader has not yet consumed.
https://github.com/dcs-chalmers/ScaleGate_Java
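The semantics of the two methods can be illustrated with a deliberately simple, lock-based stand-in. This is not the ScaleGate implementation (which is lock-free and is available in Java at the URL above); the class name `CoarseGate`, the Python method names and the helper `drain` are hypothetical. The sketch assumes each source adds tuples in timestamp order and, unlike ScaleGate, never frees consumed tuples.

```python
import bisect
import threading
from typing import Any, List, Optional, Tuple

GateTuple = Tuple[int, Any]  # (timestamp, payload)


class CoarseGate:
    """Coarse-grained sketch of the addTuple / getNextReadyTuple interface."""

    def __init__(self, num_sources: int, num_readers: int) -> None:
        self._lock = threading.Lock()
        self._timestamps: List[int] = []   # sorted, parallel to _tuples
        self._tuples: List[GateTuple] = []
        self._latest: List[Optional[int]] = [None] * num_sources
        self._next = [0] * num_readers     # per-reader position

    def add_tuple(self, tup: GateTuple, source_id: int) -> None:
        with self._lock:
            # Insert after any tuple with the same timestamp, so positions
            # already consumed by readers never shift.
            i = bisect.bisect_right(self._timestamps, tup[0])
            self._timestamps.insert(i, tup[0])
            self._tuples.insert(i, tup)
            self._latest[source_id] = tup[0]

    def get_next_ready_tuple(self, reader_id: int) -> Optional[GateTuple]:
        with self._lock:
            if any(ts is None for ts in self._latest):
                return None  # some source has not delivered anything yet
            i = self._next[reader_id]
            # Ready: no source can still deliver a tuple with a smaller timestamp.
            if i < len(self._tuples) and self._timestamps[i] <= min(self._latest):
                self._next[reader_id] = i + 1
                return self._tuples[i]
            return None


def drain(gate: CoarseGate, reader_id: int) -> List[GateTuple]:
    """Collect ready tuples for one reader until none is left."""
    out = []
    while True:
        t = gate.get_next_ready_tuple(reader_id)
        if t is None:
            return out
        out.append(t)


gate = CoarseGate(num_sources=2, num_readers=2)
gate.add_tuple((6, "x"), source_id=0)
before = gate.get_next_ready_tuple(0)  # source 1 has not delivered yet
gate.add_tuple((1, "y"), source_id=1)
gate.add_tuple((3, "z"), source_id=1)
seen_by_reader0 = drain(gate, 0)
seen_by_reader1 = drain(gate, 1)
print(before, seen_by_reader0, seen_by_reader1)
```

Both readers observe the same timestamp-sorted sequence of ready tuples, and the tuple with timestamp 6 is withheld until source 1 delivers a tuple with timestamp 6 or later. A single lock serializes every call, which is the kind of coarse-grained synchronization the talk argues against.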
ScaleJoin
• Producers (Prod R, Prod S): add tuple to SGin
• Steps for each PU (PU1, PU2, …, PUN): get next ready input tuple t from SGin, then
1. compare t with all tuples in opposite window given P
2. add t to its window in a round-robin fashion
3. remove stale tuples from t’s window
• Consumer (Cons): get next ready output tuple from SGout
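ScaleJoin (example)
[Figure: a sequential stream join keeps t1, t2 and t3 in a single window WR and compares the incoming S tuple t4 with all of them; ScaleJoin with 3 PUs spreads t1, t2 and t3 over three WR windows, one per PU, and each PU compares t4 with its own share.]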
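The per-PU steps can be simulated sequentially to check that the PUs together produce the same pairs as a sequential join. This is a sketch under assumptions: the function name, the per-stream round-robin counters and the sample data are made up for the illustration, every PU is fed the same timestamp-ordered list (standing in for SGin), and outputs are simply concatenated instead of being merged through SGout.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Tup = Tuple[int, str, int]  # (timestamp, stream "R"/"S", value)


def scalejoin_pu(pu_id: int, num_pus: int, ready_tuples: List[Tup],
                 window_size: int,
                 predicate: Callable[[Tup, Tup], bool]) -> List[Tuple[Tup, Tup]]:
    """Steps of one PU: compare every tuple, store only every num_pus-th one."""
    windows: Dict[str, Deque[Tup]] = {"R": deque(), "S": deque()}
    seen = {"R": 0, "S": 0}  # tuples observed per stream (assumed counter)
    results = []
    for t in ready_tuples:
        ts, stream, _ = t
        other = "S" if stream == "R" else "R"
        # 1. compare t with this PU's share of the opposite window
        for o in windows[other]:
            if ts - o[0] < window_size:
                r, s = (t, o) if stream == "R" else (o, t)
                if predicate(r, s):
                    results.append((r, s))
        # 2. add t to its window only when it is this PU's turn (round-robin)
        if seen[stream] % num_pus == pu_id:
            windows[stream].append(t)
        seen[stream] += 1
        # 3. remove stale tuples from t's window
        own = windows[stream]
        while own and ts - own[0][0] >= window_size:
            own.popleft()
    return results


def same_value(r: Tup, s: Tup) -> bool:
    return r[2] == s[2]


tuples = [(1, "R", 5), (2, "S", 5), (3, "R", 7),
          (4, "S", 7), (5, "R", 5), (20, "S", 5)]

sequential = scalejoin_pu(0, 1, tuples, 10, same_value)  # one PU stores everything
parallel = []
for pu in range(3):
    parallel.extend(scalejoin_pu(pu, 3, tuples, 10, same_value))

print(sorted(parallel) == sorted(sequential))
```

Each stored tuple lives in exactly one PU, so every matching pair is found by exactly one PU, regardless of how the values are distributed.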
ScaleJoin
Challenges: scalability, high throughput, low latency, disjoint parallelism, skew resilience, determinism.
[Figure: the ScaleJoin architecture (producers add tuples to SGin, PU1, PU2, …, PUN get ready input tuples from SGin, Cons gets ready output tuples from SGout), also shown with several producers per stream (Prod S, Prod S, Prod R).]
ScaleJoin Scalability – comparisons/second
[Figure: comparisons/second as a function of the number of PUs.]
4 billion comparisons/second!!!
ScaleGate: Main Functionality
• addTuple(tuple,sourceID)
– Insert tuple from sourceID, effectively in ts order
[Figure: tuples <6,…>, <1,…> and <3,…> are inserted concurrently.]
ScaleGate: Main Functionality
• getNextReadyTuple(readerID)
– readerID gets the next ready tuple t in ts order, i.e. no tuple with ts < t.ts will arrive afterwards
– returns null if there is no ready tuple
[Figure: readers get the ready tuples from a structure holding <1,…>, <3,…>, <6,…>, <7,…>, <8,…>, <9,…>.]
Implementation?
• getNextReadyTuple(readerID)
– we get tuples in ts order, so maybe an extractMin()? (priority queue)
– still no guarantee that t is ready
• addTuple(tuple, sourceID)
– sorted insertion might help
Is there a concurrent data structure with cheap extractMin() + insert()?
Binary search trees and heaps: expensive rebalancing and maintenance of the heap property
ScaleGate Anatomy (1)
• Inspired by lock-free skip lists
– randomized height of nodes
– expected cost for search/insertion O(log N)
• Search by traversing from higher to lower levels
Reference: BigData’15 [GNPT]
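For readers unfamiliar with skip lists, the two properties listed above (randomized node heights, search from higher to lower levels) can be seen in a plain sequential sketch. It is single-threaded and is neither lock-free nor the ScaleGate code; `MAX_LEVEL`, the class names and the promotion probability of 0.5 are assumptions of the example.

```python
import random
from typing import Iterator, List, Optional

MAX_LEVEL = 8  # hypothetical cap on node height


class Node:
    def __init__(self, key: float, height: int) -> None:
        self.key = key
        self.next: List[Optional["Node"]] = [None] * height


class SkipList:
    def __init__(self, seed: Optional[int] = None) -> None:
        self._rng = random.Random(seed)
        self.head = Node(float("-inf"), MAX_LEVEL)  # sentinel, present on all levels

    def _random_height(self) -> int:
        # Each extra level is kept with probability 1/2.
        height = 1
        while height < MAX_LEVEL and self._rng.random() < 0.5:
            height += 1
        return height

    def _predecessors(self, key: float) -> List[Node]:
        # Traverse from the highest level down, moving right while keys are smaller.
        preds = [self.head] * MAX_LEVEL
        node = self.head
        for level in range(MAX_LEVEL - 1, -1, -1):
            while node.next[level] is not None and node.next[level].key < key:
                node = node.next[level]
            preds[level] = node
        return preds

    def insert(self, key: float) -> None:
        preds = self._predecessors(key)
        node = Node(key, self._random_height())
        for level in range(len(node.next)):
            node.next[level] = preds[level].next[level]
            preds[level].next[level] = node

    def contains(self, key: float) -> bool:
        candidate = self._predecessors(key)[0].next[0]
        return candidate is not None and candidate.key == key

    def __iter__(self) -> Iterator[float]:
        node = self.head.next[0]
        while node is not None:
            yield node.key
            node = node.next[0]


timestamps = SkipList(seed=1)
for ts in (6, 1, 3):
    timestamps.insert(ts)
print(list(timestamps))
```

Iterating over the bottom level always yields the keys in sorted order, whatever heights the random generator picks.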
ScaleGate Anatomy (2)
• Reader-local view of "head" (also holds the minimum ts for that reader)
• Flagging mechanism:
– If "head" is not flagged, it can be safely returned
– Flag the last written tuple of each source
• Nodes are free to be garbage-collected after every reader has passed them (almost...)
[Figure: reader-local heads head0 and head1.]
Reference: BigData’15 [GNPT]
ScaleGate Properties
• Safety: Linearizable
– Every operation appears to take effect
instantaneously
• Progress: Lock-free
– At least one thread makes progress,
independently of the state of others
• Basic building block for providing
deterministic processing
(through the ready semantics)
Why not a Skip List?
Agenda
• Who we are
• Motivation and System Model
• ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join
• The ScaleGate data structure
• Fine-grained synchronization and other use-cases
– Streaming Aggregation
– Towards real-time analytics
– Some ongoing work
• Research in progress
• Conclusions
Multi-way Streaming Aggregation
• Input: timestamp-sorted tuples per stream (for example, three streams carrying timestamps 1, 5 / 3, 7 / 5, 9)
• Merge & sort streams, then Aggregate F
• When to process a tuple? Tuples are not in TS order across streams, so they are not ready for processing at arrival…
Multi-way Streaming Aggregation
(baseline [Streamcloud, Borealis])
• Merge & sort streams: when to process a tuple? Keep checking all buffers!!!
• Aggregate F: update windows and output
Can data structures do more than insert and extract?
Data Structures to the Rescue
[Figure: tuples <6,…>, <1,…> and <3,…> are inserted concurrently; ready tuples are retrieved; sorted windows are managed and updated concurrently; output tuples are produced.]
ScaleGate objects allow concurrent access with
- consistency/safety, progress guarantees
- determinism
- appropriate interface → parallelization of pipeline stages
Contribution: SPAA’14 [CGNPT], “Brief announcement: concurrent data structures for efficient streaming aggregation”
Aggregate Scaling
Latency with Increasing # Input Streams
Best Solution Award
ACM DEBS 2015 [GNWPT], “Deterministic Real-Time Analytics of Geospatial Data Streams through ScaleGate Objects”
Analyze taxi trip reports from NYC and compute:
System Architecture Overview
What’s happening in hardware?
• Networks on chip (NoC)
• Short distance between cores
• Message passing model support
• Shared memory support
• Eliminated cache coherency
• Limited support for synchronization primitives
Can we have data structures that are fast, scalable and have good progress guarantees?
[Figure: IA cores with caches and shared/local memory.]
Adapteva’s Epiphany Architecture:
• 16 or 64 cores
• On-chip memory: 32 KB per core
• Off-chip DRAM
• Distributed shared memory
• Cores communicate through the mesh by reading and writing the on-chip memory
Ongoing work with Ericsson:
ScaleGate on Epiphany
• Task-graph based processing
• (Baseband) signal processing applications
Can ScaleGate help?
Storm is a parallel-distributed Stream Processing Engine
• Data streaming operators → Spout / Bolt
• Splitting of work into tasks / workers → Intra-operator parallelism
Figures source: http://storm.apache.org/
Parallelization
• General approach: each operator (OPA, OPB) is deployed on its own subcluster
– Subcluster A runs OPA on nodes 1 … m; Subcluster B runs OPB on nodes 1 … n
– Each node runs an Input Merger (IM), the operator and a Load Balancer (LB)
Parallelization
• Stateful operators: Semantic awareness
– Aggregate: count within last hour, group-by caller number
– The load balancers (LB) of the previous subcluster send all tuples of Caller A, e.g. <A, B, 8:00, 3>, <A, C, 8:30, 5>, <A, D, 9:05, 2>, to the same instance among Agg1, Agg2, Agg3
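One common way to obtain this semantic awareness is to route tuples by a hash of the group-by field, sketched below. The function name `choose_instance`, the use of CRC32 and the fourth sample tuple are assumptions for the illustration; the one-hour window of the aggregate is left out to keep the example short.

```python
import zlib
from collections import Counter
from typing import List, Tuple

CDR = Tuple[str, str, str, int]  # (caller, callee, time, price)


def choose_instance(caller: str, num_instances: int) -> int:
    """Pick the aggregate instance for a caller.

    CRC32 is used instead of Python's built-in hash(), which is randomized
    per process for strings, so the routing is stable across load balancers.
    """
    return zlib.crc32(caller.encode("utf-8")) % num_instances


NUM_AGG = 3
tuples: List[CDR] = [
    ("A", "B", "8:00", 3),
    ("A", "C", "8:30", 5),
    ("A", "D", "9:05", 2),
    ("C", "D", "8:20", 7),
]

# One counter per aggregate instance (Agg1, Agg2, Agg3).
per_instance = [Counter() for _ in range(NUM_AGG)]
for caller, callee, time, price in tuples:
    per_instance[choose_instance(caller, NUM_AGG)][caller] += 1

instances_with_a = [i for i, c in enumerate(per_instance) if c["A"] > 0]
print(instances_with_a, per_instance[instances_with_a[0]]["A"])
```

All three tuples of Caller A reach a single instance, so that instance alone can produce the correct count for A.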
Parallelization
[Figure: load balancers (LB) of the previous subcluster feeding Agg1, Agg2 and Agg3 through their input mergers (IM), next to Storm's fields grouping and user-defined grouping. Source: http://storm.apache.org/]
On-going research
• Individual queues for executors
• 2 threads per executor
• Dedicated Receive and Send thread queues…
Coarse-grained synchronization!!!
… What about fine-grained synchronization?
Source: http://www.michael-noll.com/blog/2013/06/21/understanding-storm-internal-message-buffers/
1. Parallel stream joins
2. Parallel stream aggregation
3. “Ad-hoc” streaming applications
4. Determinism in Storm
…
https://github.com/dcs-chalmers/ScaleGate_Java
Thank you! Questions?
References
• [GNPT] Vincenzo Gulisano, Yiannis Nikolakopoulos, Marina Papatriantafilou, and Philippas Tsigas, “ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join”, to appear in IEEE BigData 2015
• [GNWPT] Vincenzo Gulisano, Yiannis Nikolakopoulos, Ivan Walulya, Marina Papatriantafilou, and Philippas Tsigas, “Deterministic Real-time Analytics of Geospatial Data Streams Through ScaleGate Objects”, 9th ACM International Conference on Distributed Event-Based Systems (DEBS '15)
• Yiannis Nikolakopoulos, Anders Gidenstam, Marina Papatriantafilou, and Philippas Tsigas, “A Consistency Framework for Iteration Operations in Concurrent Data Structures”, 2015 IEEE 29th International Symposium on Parallel & Distributed Processing (IPDPS)
• [CGNPT] Daniel Cederman, Vincenzo Gulisano, Yiannis Nikolakopoulos, Marina Papatriantafilou, and Philippas Tsigas, “Brief announcement: concurrent data structures for efficient streaming aggregation”, 2014 26th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '14)

Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfNAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistan
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptx
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 

The benefits of fine-grained synchronization in deterministic and efficient stream processing

  • 1. The benefits of fine-grained synchronization in deterministic and efficient stream processing • Vincenzo Gulisano, Yiannis Nikolakopoulos • Chalmers University of Technology • 2015-10-31
  • 2. Agenda • Who we are • Motivation and System Model • ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join • The ScaleGate data structure • Fine-grained synchronization and other use-cases • Research in progress • Conclusions
  • 3. Agenda (repeated)
  • 4. Who we are • Chalmers University • Computer Science and Engineering department • Distributed Computing and Systems research group
  • 5. Who we are: Distributed Computing and Systems research group • 12 PhD degrees awarded • ~20 researchers, 3 postdocs and 10 PhD students • acknowledged internationally as a leading group in practical multicore synchronization algorithms and programming (results adopted by Java JDK, C++, Intel, NVIDIA) • extensive network of academic and industrial collaborations, continuously supported by national and international projects
  • 6. Who we are: Vincenzo Gulisano (PhD), Assistant Professor at Chalmers University of Technology. Research focus: • scalable big data analysis in cyber-physical systems (Smart Grids, Advanced Metering Infrastructures and Vehicular Networks) • streaming operators’ parallelization, load balancing, provisioning, decommissioning and fault tolerance protocols • enhancement of parallel stream operators by means of streaming- and energy-aware concurrent data structures that boost analysis performance (both in throughput and latency) • applied use of the streaming paradigm in data validation, intrusion detection systems, privacy-preserving online analysis, distributed embedded networks and cloud infrastructures
  • 7. Who we are: Yiannis Nikolakopoulos, PhD student. Research interests: • Parallel and distributed computing • Shared memory concurrent data structures • Multicore/Manycore systems • Consistency problems • Synchronization • Data Streaming: operator parallelism & scalability
  • 8. Agenda (repeated)
  • 9. Motivation: applications in sensor networks and cyber-physical systems generate large and fluctuating volumes of data continuously. This demands: • continuous processing of data streams • in a real-time fashion. Store-then-process is not feasible. [Figure: store-then-process query processing (1 data, 2 query, 3 query results; main memory and disk) compared with continuous query processing in main memory]
  • 10. System Model • Data Stream: unbounded sequence of tuples – Example: Call Description Record (CDR) – Schema: Caller (text), Callee (text), Time in secs (int), Price in € (double) – Tuples, ordered by time: <A, B, 8:00, 3>, <C, D, 8:20, 7>, <A, E, 8:35, 6>
  • 11. System Model • Operators – Stateless: 1 input tuple → 1 output tuple – Stateful: 1+ input tuple(s) → 1 output tuple
  • 12. System Model • Continuous Query: graph of operators/streams • Example: count the number of calls whose price is more than 10 dollars made by each caller – Map: convert € → $ – Filter: only > 10$ – Agg: count calls made by each Caller number
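The Map, Filter and Agg chain of slide 12 can be sketched as three small functions. This is only an illustration, not code from the talk: the exchange rate, the sample prices and all function names are assumptions, and the count is kept over the whole input rather than over a window.

```python
from collections import Counter

# Hypothetical exchange rate, chosen only for illustration.
EUR_TO_USD = 1.1

def to_dollars(cdr):
    """Map: convert the price field of a CDR from euros to dollars."""
    caller, callee, time, price_eur = cdr
    return (caller, callee, time, price_eur * EUR_TO_USD)

def is_expensive(cdr):
    """Filter: keep only calls that cost more than 10 dollars."""
    return cdr[3] > 10

def count_expensive_calls(cdrs):
    """Agg: count the calls that pass the filter, grouped by caller."""
    counts = Counter()
    for cdr in cdrs:
        converted = to_dollars(cdr)
        if is_expensive(converted):
            counts[converted[0]] += 1
    return dict(counts)

# Sample CDRs (caller, callee, time, price in euros); the prices are made up.
cdrs = [("A", "B", "8:00", 3.0), ("C", "D", "8:20", 12.0), ("A", "E", "8:35", 20.0)]
expensive_per_caller = count_expensive_calls(cdrs)
```

In a streaming engine each function would run as its own operator and the count would be scoped to a window.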
  • 13. System Model • Infinite sequence of tuples / bounded memory → windows • Example: 1 hour windows: [8:00,9:00), [8:20,9:20), [8:40,9:40)
  • 14. System Model • Infinite sequence of tuples / bounded memory → windows • Example: count tuples, 1 hour windows – Tuples arrive at 8:05, 8:15, 8:22, 8:45 and 9:05 – Output for window [8:00,9:00): 4 – Next window: [8:20,9:20)
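The window example of slides 13 and 14 can be reproduced with a few lines. This is a sketch under assumptions: times are converted to minutes since midnight, windows are half-open intervals, and the 20-minute advance is read off the window boundaries shown on slide 13. The function names are illustrative.

```python
def minutes(hhmm):
    """Convert an 'H:MM' string to minutes since midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)

def count_per_window(timestamps, first_start, size, advance, n_windows):
    """Count the tuples that fall in each half-open window [start, start + size)."""
    counts = {}
    for i in range(n_windows):
        start = first_start + i * advance
        end = start + size
        counts[(start, end)] = sum(1 for ts in timestamps if start <= ts < end)
    return counts

arrivals = [minutes(t) for t in ["8:05", "8:15", "8:22", "8:45", "9:05"]]
counts = count_per_window(arrivals, minutes("8:00"), size=60, advance=20, n_windows=3)
```

The window [8:00,9:00) holds the first four tuples, which matches the output of 4 on the slide; the tuple at 9:05 only contributes to the later windows.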
  • 15. The evolution of SPEs • Centralized SPEs • Distributed SPEs (inter-operator parallelism) • Parallel-Distributed SPEs (intra-operator parallelism) • Elastic SPEs (scale up / down)
  • 16. Agenda (repeated)
  • 17. What is a stream join? [Figure: streams R and S, each carrying tuples t1 to t4; sliding windows WR and WS of window size WS; tuples are matched with predicate P]
  • 18. Why parallel stream joins? • WS = 600 seconds • R receives 500 tuples/second • S receives 500 tuples/second • WR will contain 300,000 tuples • WS will contain 300,000 tuples • Each new tuple from R gets compared with all the tuples in WS • Each new tuple from S gets compared with all the tuples in WR • … 300,000,000 comparisons/second!
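The figure on slide 18 follows from simple arithmetic, which the sketch below spells out. It assumes both windows are full (steady state) and that every incoming tuple is compared with every tuple in the opposite window; the function name is illustrative.

```python
def comparisons_per_second(window_seconds, rate_r, rate_s):
    """Comparisons per second of a nested-loop stream join in steady state."""
    tuples_in_wr = window_seconds * rate_r   # tuples held in R's window
    tuples_in_ws = window_seconds * rate_s   # tuples held in S's window
    # Each new R tuple scans WS, each new S tuple scans WR.
    return rate_r * tuples_in_ws + rate_s * tuples_in_wr

load = comparisons_per_second(window_seconds=600, rate_r=500, rate_s=500)
```

With the slide's numbers each window holds 600 × 500 = 300,000 tuples, and 1,000 incoming tuples per second each scan 300,000 tuples.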
  • 19. What are the challenges of a parallel stream join? • Scalability • High throughput • Low latency • Disjoint parallelism • Skew resilience • Determinism
  • 20. The 3-step procedure (sequential stream join). We assume each producer delivers tuples in timestamp order. For each incoming tuple t: 1. compare t with all tuples in the opposite window given predicate P 2. add t to its window 3. remove stale tuples from t’s window. [Figure: Prod R and Prod S add tuples to R and S, a single PU runs the procedure, Cons consumes results]
  • 21. The 3-step procedure, is it enough? Checklist: Scalability, High throughput, Low latency, Disjoint parallelism, Skew resilience, Determinism. [Figure: the same R and S tuples shown twice, with t3 and t4 arriving in different orders]
  • 22. Enforcing determinism in sequential stream joins • Next tuple to process = earliest(tS,tR) • The earliest(tS,tR) tuple is referred to as the next ready tuple • Process ready tuples in timestamp order → Determinism
  • 23. Deterministic 3-step procedure. Pick the next ready tuple t: 1. compare t with all tuples in the opposite window given predicate P 2. add t to its window 3. remove stale tuples from t’s window. [Figure: Prod R and Prod S add tuples to R and S, a single PU runs the procedure, Cons consumes results]
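The deterministic 3-step procedure of slides 22 and 23 can be sketched as a single-threaded function. This is an illustration under assumptions, not the authors' implementation: tuples are `(timestamp, value)` pairs, each input list is already in timestamp order, ties between R and S are broken in favour of R, and tuples of the opposite window that are already older than the window size are skipped during comparison. All names are hypothetical.

```python
import heapq

def deterministic_join(stream_r, stream_s, window_size, predicate):
    """Join two timestamp-sorted streams, always processing the earliest tuple next."""
    windows = {"R": [], "S": []}
    tagged_r = ((ts, "R", value) for ts, value in stream_r)
    tagged_s = ((ts, "S", value) for ts, value in stream_s)
    results = []
    # heapq.merge yields the next ready tuple: the earliest of the two stream heads.
    for ts, side, value in heapq.merge(tagged_r, tagged_s, key=lambda t: (t[0], t[1])):
        opposite = "S" if side == "R" else "R"
        # Step 1: compare t with all tuples in the opposite window.
        for other_ts, other_value in windows[opposite]:
            if ts - other_ts >= window_size:
                continue  # assumption: ignore tuples that already fell out of the window
            r_value, s_value = (value, other_value) if side == "R" else (other_value, value)
            if predicate(r_value, s_value):
                results.append((ts, r_value, s_value))
        # Step 2: add t to its own window.
        windows[side].append((ts, value))
        # Step 3: remove stale tuples from t's window.
        windows[side] = [(w_ts, w) for w_ts, w in windows[side] if ts - w_ts < window_size]
    return results

stream_r = [(1, 10), (4, 20), (9, 30)]
stream_s = [(2, 10), (5, 20), (20, 30)]
matches = deterministic_join(stream_r, stream_s, window_size=5,
                             predicate=lambda r, s: r == s)
```

Because the processing order depends only on the timestamps, repeated runs over the same inputs produce the same output regardless of how the two streams interleave on arrival.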
  • 24. Shared-nothing parallel stream join (state-of-the-art) • Prod R: choose a PU, add tuple to PUi’s R • Prod S: choose a PU, add tuple to PUi’s S • Each PU (PU1 … PUN): pick the next ready tuple t: 1. compare t with all tuples in the opposite window given P 2. add t to its window 3. remove stale tuples from t’s window • Merge: take the next ready output tuple • Cons: consume results. Checklist: Scalability, High throughput, Low latency, Disjoint parallelism, Skew resilience, Determinism
  • 25. Shared-nothing parallel stream join (state-of-the-art) [Figure: Prod R and Prod S feed PU1 … PUN, which feed Merge and Cons; the threads communicate through queues with enqueue() and dequeue()]
  • 26. From coarse-grained to fine-grained synchronization [Figure: Prod R, Prod S, PU1 … PUN and Cons]
  • 27. ScaleGate • addTuple(tuple,sourceID) allows a tuple from sourceID to be merged by ScaleGate into the resulting timestamp-sorted stream of ready tuples • getNextReadyTuple(readerID) provides readerID with the earliest ready tuple that it has not yet consumed • https://github.com/dcs-chalmers/ScaleGate_Java
  • 28. ScaleJoin • Prod R and Prod S: add tuple to SGin • Steps for each PU (PU1 … PUN): get the next ready input tuple t from SGin, then 1. compare t with all tuples in the opposite window given P 2. add t to its window in a round-robin fashion 3. remove stale tuples from t’s window • Cons: get the next ready output tuple from SGout
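The PU steps of slide 28 can be simulated without threads to see why the round-robin storage still yields every match exactly once. This is a sketch under assumptions: the ready stream is given as a list of `(timestamp, side, value)` tuples in timestamp order, every PU reads that same stream, the round-robin choice is taken on the position of the tuple in the ready stream, and sorting the collected results stands in for the output merge. None of the names come from the ScaleJoin code.

```python
def run_pu(pu_id, num_pus, ready_tuples, window_size, predicate):
    """Steps of one PU: it sees every ready tuple but stores only its share."""
    windows = {"R": [], "S": []}
    results = []
    for position, (ts, side, value) in enumerate(ready_tuples):
        opposite = "S" if side == "R" else "R"
        # Step 1: compare t with this PU's part of the opposite window.
        for other_ts, other_value in windows[opposite]:
            if ts - other_ts >= window_size:
                continue  # assumption: skip tuples already outside the window
            r_value, s_value = (value, other_value) if side == "R" else (other_value, value)
            if predicate(r_value, s_value):
                results.append((ts, other_ts, r_value, s_value))
        # Step 2: round-robin, exactly one PU stores each tuple.
        if position % num_pus == pu_id:
            windows[side].append((ts, value))
        # Step 3: remove stale tuples from t's window.
        windows[side] = [(w_ts, w) for w_ts, w in windows[side] if ts - w_ts < window_size]
    return results

def scalejoin(num_pus, ready_tuples, window_size, predicate):
    """Run all PUs over the same ready stream and merge their outputs."""
    merged = []
    for pu_id in range(num_pus):
        merged.extend(run_pu(pu_id, num_pus, ready_tuples, window_size, predicate))
    return sorted(merged)

ready = [(1, "R", 10), (2, "S", 10), (4, "R", 20), (5, "S", 20), (6, "S", 10), (7, "R", 10)]
same_value = lambda r, s: r == s
sequential = scalejoin(1, ready, 10, same_value)
parallel = scalejoin(3, ready, 10, same_value)
```

Each stored tuple lives in exactly one PU while every incoming tuple is seen by all PUs, so each pair of tuples is compared once and the comparison work is split evenly regardless of the data values.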
  • 29. ScaleJoin (example) [Figure: in the sequential stream join, t4 from S is compared against a window WR holding t1, t2 and t3; in ScaleJoin with 3 PUs, each PU compares t4 against its own part of WR, holding t1, t2 and t3 respectively]
  • 30. ScaleJoin • Multiple physical producers (Prod R, Prod S, Prod S, …): add tuple to SGin • Steps for PUi: get the next ready input tuple t from SGin, then 1. compare t with all tuples in the opposite window given P 2. add t to its window in a round-robin fashion 3. remove stale tuples from t’s window • Cons: get the next ready output tuple from SGout. Checklist: Scalability, High throughput, Low latency, Disjoint parallelism, Skew resilience, Determinism
  • 31. ScaleJoin scalability: comparisons/second [Plot: comparisons/second versus number of PUs] 4 billion comparisons/second!
  • 32. Agenda (repeated)
  • 33. ScaleGate • addTuple(tuple,sourceID) allows a tuple from sourceID to be merged by ScaleGate into the resulting timestamp-sorted stream of ready tuples • getNextReadyTuple(readerID) provides readerID with the earliest ready tuple that it has not yet consumed
  • 34. ScaleGate: Main Functionality • addTuple(tuple,sourceID) – Insert tuple from sourceID effectively in ts order [Figure: tuples <6,…>, <1,…> and <3,…> are inserted concurrently]
  • 35. ScaleGate: Main Functionality • getNextReadyTuple(readerID) – readerID gets the next ready tuple t in ts order, i.e. no tuple with ts < t.ts will arrive afterwards – returns null if there is no ready tuple [Figure: readers get ready tuples; tuples shown: <1,…>, <6,…>, <3,…>, <7,…>, <9,…>, <8,…>]
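To make the ready semantics of slides 34 and 35 concrete, here is a toy stand-in for the two methods. It is explicitly not the ScaleGate implementation: it uses one coarse lock and a heap instead of a lock-free skip list, it never reclaims consumed tuples, and it assumes every source delivers tuples in timestamp order. The class name, the Python method names and the payloads are hypothetical.

```python
import heapq
import threading

class ToyScaleGate:
    """Lock-based sketch of the addTuple / getNextReadyTuple interface."""

    def __init__(self, num_sources, num_readers):
        self._lock = threading.Lock()
        self._pending = []                      # min-heap of (ts, source, seq, payload)
        self._ready = []                        # ready tuples in timestamp order
        self._last_ts = [None] * num_sources    # latest timestamp seen per source
        self._next_index = [0] * num_readers    # per-reader position in _ready
        self._seq = 0                           # tie-breaker so payloads are never compared

    def add_tuple(self, ts, payload, source_id):
        with self._lock:
            heapq.heappush(self._pending, (ts, source_id, self._seq, payload))
            self._seq += 1
            self._last_ts[source_id] = ts
            if None in self._last_ts:
                return  # some source has not delivered anything yet
            # Sources deliver in timestamp order, so no tuple earlier than the
            # smallest "latest timestamp" can still arrive.
            watermark = min(self._last_ts)
            while self._pending and self._pending[0][0] <= watermark:
                ready_ts, _, _, ready_payload = heapq.heappop(self._pending)
                self._ready.append((ready_ts, ready_payload))

    def get_next_ready_tuple(self, reader_id):
        with self._lock:
            index = self._next_index[reader_id]
            if index == len(self._ready):
                return None  # no ready tuple for this reader
            self._next_index[reader_id] = index + 1
            return self._ready[index]

gate = ToyScaleGate(num_sources=2, num_readers=2)
gate.add_tuple(1, "a", source_id=0)
nothing_ready_yet = gate.get_next_ready_tuple(0)
gate.add_tuple(3, "b", source_id=1)
gate.add_tuple(6, "c", source_id=0)
first = gate.get_next_ready_tuple(0)
second = gate.get_next_ready_tuple(0)
third = gate.get_next_ready_tuple(0)
other_reader_first = gate.get_next_ready_tuple(1)
```

The tuple with timestamp 6 stays pending because source 1 has only reached timestamp 3 and could still deliver something earlier. Every reader consumes the same ready sequence independently, which is what lets all PUs of a join observe identical input.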
  • 36. Implementation? • getNextReadyTuple(readerID) – we get tuples in ts order, so maybe an extractMin()? (priority queue) – still no guarantee that t is ready • addTuple(tuple, sourceID) – sorted insertion could maybe help • Is there a concurrent data structure with cheap extractMin() + insert()? • Binary search trees, heaps: expensive rebalancing and heap-property maintenance [Figure: tuples <6,…>, <1,…>, <3,…>]
  • 37. ScaleGate Anatomy (1) • Inspired by lock-free skip lists – randomized height of nodes – expected cost for search/insertion O(log N) • Search by traversing from higher to lower levels (BigData’15 [GNPT])
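The skip-list ideas listed on slide 37 (randomized node heights, search from the highest level down) can be shown with a plain sequential skip list. This sketch is single-threaded and makes no attempt at the lock-free synchronization that ScaleGate relies on; the class names, the maximum height of 16 and the 0.5 promotion probability are assumptions.

```python
import random

class Node:
    def __init__(self, ts, payload, height):
        self.ts = ts
        self.payload = payload
        self.next = [None] * height   # one forward pointer per level

class SkipList:
    MAX_HEIGHT = 16

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = Node(float("-inf"), None, self.MAX_HEIGHT)

    def _random_height(self):
        # Each extra level is granted with probability 1/2.
        height = 1
        while height < self.MAX_HEIGHT and self._rng.random() < 0.5:
            height += 1
        return height

    def insert(self, ts, payload=None):
        # Traverse from the highest level to the lowest, remembering the
        # last node before the insertion point on every level.
        predecessors = [self._head] * self.MAX_HEIGHT
        node = self._head
        for level in reversed(range(self.MAX_HEIGHT)):
            while node.next[level] is not None and node.next[level].ts <= ts:
                node = node.next[level]
            predecessors[level] = node
        new_node = Node(ts, payload, self._random_height())
        for level in range(len(new_node.next)):
            new_node.next[level] = predecessors[level].next[level]
            predecessors[level].next[level] = new_node

    def timestamps(self):
        # Level 0 links every node, in timestamp order.
        out = []
        node = self._head.next[0]
        while node is not None:
            out.append(node.ts)
            node = node.next[0]
        return out

skip_list = SkipList(seed=7)
for ts in [6, 1, 3, 7, 9, 8]:
    skip_list.insert(ts)
ordered = skip_list.timestamps()
```

Unlike a balanced tree or a heap, an insertion only touches the predecessors of the new node, with no global rebalancing step.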
  • 38. ScaleGate Anatomy (2) • Reader-local view of "head" (also has the minimum ts for that reader) • Flagging mechanism: – if "head" is not flagged, it can be safely returned – flag the last written tuple of each source • Nodes are free to be garbage-collected after every reader passes (almost...) [Figure: reader-local heads head0 and head1] (BigData’15 [GNPT])
  • 39. ScaleGate Properties • Safety: Linearizable – every operation appears to take effect instantaneously • Progress: Lock-free – at least one thread makes progress, independently of the state of others • Basic building block for providing deterministic processing (through the ready semantics)
  • 40. Why not a Skip List?
  • 41. Agenda • Who we are • Motivation and System Model • ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join • The ScaleGate data structure • Fine-grained synchronization and other use-cases – Streaming Aggregation – Towards real-time analytics – Some ongoing work • Research in progress • Conclusions
  • 42. Multi-way Streaming Aggregation • Timestamp sorted tuples per stream • Tuples are not in TS order across streams, so they are not ready for processing at arrival • When to process a tuple? • Merge & sort streams, then Aggregate F [Figure: timestamp-sorted input streams (timestamps 1, 5, 3, 7, 5, 9 shown) and tuples <6,…>, <1,…>, <3,…> being merged and sorted before Aggregate F]
  • 43. Multi-way Streaming Aggregation (baseline [StreamCloud, Borealis]) • Merge & sort streams: when to process a tuple? Keep checking all buffers! • Aggregate F: update windows and output • Can data structures do more than insert and extract? [Figure: tuples <6,…>, <1,…>, <3,…>]
  • 44. Data Structures to the Rescue • Pipeline stages: insert concurrently, get ready tuples, concurrently update windows, manage sorted windows, output tuples • ScaleGate objects allow concurrent access with – consistency/safety and progress guarantees – determinism – an appropriate interface → parallelization of pipeline stages • Contribution: SPAA’14 [CGNPT] “Brief announcement: concurrent data structures for efficient streaming aggregation”
  • 45. Aggregate Scaling
  • 46. Latency with Increasing # Input Streams
  • 47. Best Solution Award • ACM DEBS 2015 [GNWPT] “Deterministic Real-Time Analytics of Geospatial Data Streams through ScaleGate Objects” • Analyze taxi trip reports from NYC and compute: …
  • 48. System Architecture Overview
  • 49. What’s happening in hardware? • Networks on chip (NoC) • Short distance between cores • Message passing model support • Shared memory support • Eliminated cache coherency • Limited support for synchronization primitives • Can we have data structures that are fast, scalable and offer good progress guarantees? [Figure: IA core with cache, shared and local memory]
  • 50. Adapteva’s Epiphany Architecture: 16 or 64 cores • On-chip: 32 KB per core • Off-chip DRAM • Cores communicate through the mesh by reading and writing the on-chip memory • Distributed shared memory
  • 51. Ongoing work with Ericsson: ScaleGate on Epiphany • Task-graph based processing • (Baseband) signal processing applications • Can ScaleGate help?
  • 52. Agenda (repeated)
  • 53. Storm is a parallel-distributed Stream Processing Engine • Data streaming operators → Spout / Bolt • Splitting of work into tasks / workers → intra-operator parallelism [Figures: source http://storm.apache.org/]
  • 54. Parallelization • General approach: each operator (OPA, OPB) runs on its own subcluster (Subcluster A with nodes 1 … m, Subcluster B with nodes 1 … n) • Every node runs an Input Merger (IM), an instance of the operator and a Load Balancer (LB)
  • 55. Parallelization • Stateful operators: semantic awareness – Aggregate: count within the last hour, group-by caller number [Figure: the LBs of the previous subcluster feed IM + Agg1, IM + Agg2 and IM + Agg3; example tuples of Caller A: <A, B, 8:00, 3>, <A, C, 8:30, 5>, <A, D, 9:05, 2>]
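Semantic awareness for a group-by aggregate means that all tuples of one caller must reach the same Aggregate instance. A common way to achieve this is hash partitioning on the group-by field, sketched below; the use of CRC32 as the hash and all function names are assumptions made for illustration, not details taken from the slides.

```python
import zlib

def route(caller, num_instances):
    """Pick the Aggregate instance for a caller with a hash that is stable across runs."""
    return zlib.crc32(caller.encode("utf-8")) % num_instances

def partition(cdrs, num_instances):
    """Load balancer sketch: send each CDR to the instance chosen by its caller."""
    buckets = [[] for _ in range(num_instances)]
    for cdr in cdrs:
        buckets[route(cdr[0], num_instances)].append(cdr)
    return buckets

# The three Caller A tuples shown on the slide.
cdrs = [("A", "B", "8:00", 3), ("A", "C", "8:30", 5), ("A", "D", "9:05", 2)]
buckets = partition(cdrs, num_instances=3)
```

All three tuples land in the same bucket, so a single instance can count the calls of Caller A within the last hour. The drawback is that a very active caller can overload one instance, which is the kind of skew the round-robin scheme of ScaleJoin avoids.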
  • 56. Parallelization [Figure: the LBs of the previous subcluster feed IM + Agg1, IM + Agg2 and IM + Agg3] • In Storm: fields grouping or user-defined grouping (source: http://storm.apache.org/)
  • 57. On-going research • Storm’s internal message buffers: coarse-grained synchronization! – Individual queues for executors – 2 threads per executor – Dedicated Receive and Send thread queues • What about fine-grained synchronization? (Source: http://www.michael-noll.com/blog/2013/06/21/understanding-storm-internal-message-buffers/)
  • 58. Agenda (repeated)
  • 59. 1. Parallel stream joins 2. Parallel stream aggregation 3. “Ad-hoc” streaming applications 4. Determinism in Storm … https://github.com/dcs-chalmers/ScaleGate_Java
  • 60. Thank you! Questions?
  • 61. References • Vincenzo Gulisano, Yiannis Nikolakopoulos, Marina Papatriantafilou, and Philippas Tsigas, “ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join”, to appear in IEEE BigData 2015 • Vincenzo Gulisano, Yiannis Nikolakopoulos, Ivan Walulya, Marina Papatriantafilou, and Philippas Tsigas, “Deterministic Real-time Analytics of Geospatial Data Streams Through ScaleGate Objects”, 9th ACM International Conference on Distributed Event-Based Systems (DEBS '15) • Yiannis Nikolakopoulos, Anders Gidenstam, Marina Papatriantafilou, and Philippas Tsigas, “A Consistency Framework for Iteration Operations in Concurrent Data Structures”, 2015 IEEE 29th International Symposium on Parallel & Distributed Processing (IPDPS) • Daniel Cederman, Vincenzo Gulisano, Yiannis Nikolakopoulos, Marina Papatriantafilou, and Philippas Tsigas, “Brief announcement: concurrent data structures for efficient streaming aggregation”, 26th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '14)

Editor's Notes

  1. So, why do we need scalable parallelization approaches for stream joins? Present the example.
  2. So, let’s see how stream joins are actually implemented. Example. But wait, what happens if we get tuples in another order? Mmmmhhh
  3. So, let’s see how stream joins are actually implemented. Example. But wait, what happens if we get tuples in another order? Mmmmhhh
  4. OK, so the deterministic 3-step procedure looks like this. Now let’s try to parallelize this, and let’s do it as it has been done before.
  5. OK, so the deterministic 3-step procedure looks like this. Now let’s try to parallelize this, and let’s do it as it has been done before. First, we need to do more operations, and this affects the latency for sure. But what’s worse is that we introduce a new bottleneck: the output thread and ready tuples. And this actually breaks disjoint parallelism too… Finally, it is not really skew-resilient. So, what’s the problem? Are we doing it the wrong way or are we forgetting something? Look at the data structures: we parallelize by parallelizing the computation, but what about the communication?
  6. The queues! We parallelized the computation, but overlooked the communication. We are still using a queue with its methods enqueue and dequeue.
  7. Let’s be creative: let’s assume they actually share something more powerful that lets them communicate and synchronize in a more efficient way. What do we want from such a communication and synchronization data structure?
  8. Then we can do something like that…
  9. Here we can basically discuss why this addresses the different challenges, one by one… It gets even better: you can even have multiple physical producers for S and R! And this is actually important because in the real world it will be like that!
  10. OK, so this is the benchmark we used… Implemented in Java. And we evaluated it with 2 different systems (SAY WHY TWO SYSTEMS IF THEY ASK OR JUST SAY IT?)…
  11. Here we want to check the number of comparisons per second sustained by ScaleJoin. After checking the ones obtained for a single thread, we computed the expected maximum and then the observed ones for 3 different window sizes. As you can see… Up to 4 billion comparisons/second!
  12. This is the processing latency we get (in milliseconds). As you can see, even when we have 48 PUs (and notice that this means more threads than cores, since we also have injectors and receivers…) it is less than 60. Actually, when we do not spawn too many threads we are talking about 30 milliseconds. It might seem counterintuitive that latency grows with the number of PUs, but that’s because of determinism!
  13. In this case we have two different rates for R and S (notice the multiple physical streams!) and then peaks over time. As you can see, comparisons of course increase when we have a peak, but nevertheless the overall work is very well balanced: the standard deviation among the PUs is less than 0.2% even during the spikes! Skew-resilience!
  14. Determinism
  15. When to process: when we are sure that a tuple with a lower TS will not come from the other input streams
  16. Remove red clouds
  17. Make sure to make clear that this is an "abstraction" of the results
  18. Number of input streams
  19. Here we could have some discussion about this general structure of the SPE. Lots of queues and threads: can we see problems in this? What could we do by doing things the right way?