Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
The benefits of fine-grained synchronization in
deterministic and efficient stream processing
Vincenzo Gulisano, Yiannis N...
Agenda
• Who we are
• Motivation and System Model
• ScaleJoin: a Deterministic, Disjoint-Parallel and
Skew-Resilient Strea...
Agenda
• Who we are
• Motivation and System Model
• ScaleJoin: a Deterministic, Disjoint-Parallel and
Skew-Resilient Strea...
Who we are
Chalmers university
Computer Science and Engineering
department
Distributed Computing and Systems
research grou...
Who we are
• 12 PhD degrees awarded
• ~20 researchers, 3 Postdocs and 10 PhD
students.
• acknowledged internationally as l...
Who we are
Vincenzo Gulisano (PhD)
Assistant Professor at
Chalmers University of Technology
Research focus:
• scalable big...
Who we are
Yiannis Nikolakopoulos
PhD Student
2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 7
Research interests:
...
Agenda
• Who we are
• Motivation and System Model
• ScaleJoin: a Deterministic, Disjoint-Parallel and
Skew-Resilient Strea...
Motivation
Applications in sensor networks, cyber-physical systems:
• large and fluctuating volumes of data generated
cont...
System Model
• Data Stream: unbounded sequence of tuples
– Example: Call Description Record (CDR)
time
Field Field
Caller ...
System Model
• Operators:
OP
Stateless
1 input tuple
1 output tuple
OP
Stateful
1+ input tuple(s)
1 output tuple
112015-10...
System Model
• Continuous Query: graph operators/streams
Convert
€  $
Only
> 10$
Count calls
made by each
Caller number
M...
System Model
• Infinite sequence of tuples / bounded memory
 windows
• Example: 1 hour windows
time
[8:00,9:00)
[8:20,9:2...
System Model
• Infinite sequence of tuples / bounded memory
 windows
• Example: count tuples - 1 hour windows
time
[8:00,...
The evolution of SPEs
Centralized
SPEs
Distributed
SPEs
Parallel-Distributed
SPEs
Elastic
SPEs
2015-10-31 Vincenzo Gulisan...
Agenda
• Who we are
• Motivation and System Model
• ScaleJoin: a Deterministic, Disjoint-Parallel and
Skew-Resilient Strea...
What is a stream join?
2015-10-31 17
t1
t2
t3
t4
t1
t2
t3
t4
R S
Sliding
window Window
size WS
WSWR
Predicate P
Why parallel stream joins?
• WS = 600 seconds
• R receives 500 tuples/second
• S receives 500 tuples/second
• WR will cont...
Which are the challenges of a parallel stream join?
Scalability
High
throughput
Low latency
Disjoint
parallelism
Skew
resi...
The 3-step procedure (sequential stream join)
For each incoming tuple t:
1. compare t with all tuples in opposite window g...
The 3-step procedure, is it enough?
Scalability
High
throughput
Low latency
Disjoint
parallelism
Skew
resilience
Determini...
Enforcing determinism in sequential stream joins
• Next tuple to process = earliest(tS,tR)
• The earliest(tS,tR) tuple is ...
Deterministic 3-step procedure
Pick the next ready tuple t:
1. compare t with all tuples in opposite window given predicat...
Shared-nothing parallel stream join
(state-of-the-art)
Prod
R
Prod
S
PU1
PU2
PUN
… Cons
Add tuple to PUi S
Add tuple to PU...
Shared-nothing parallel stream join
(state-of-the-art)
Prod
R
Prod
S
PU1
PU2
PUN
…
2015-10-31 25
enqueue()
dequeue()
ConsM...
From coarse-grained to fine-grained synchronization
Prod
R
Prod
S
PU1
PU2
PUN
…
Cons
2015-10-31 26
ScaleGate
2015-10-31 27
addTuple(tuple,sourceID)
allows a tuple from sourceID to be merged by ScaleGate in the
resulting t...
ScaleJoin
Prod
R
Prod
S
PU1
PU2
PUN
…
Cons
Add tuple SGin
Add tuple SGin
Get next ready
output tuple
from SGout
Get next r...
2015-10-31 29
t1
t2
R S
WR
t3
t4
R S
t4
t1
WR
R S
t4
t2
WR
R S
t4
WR
t3
Sequential stream join:
ScaleJoin with 3 PUs:
Scal...
ScaleJoin
Prod
R
Prod
S
PU1
PU2
PUN
… Cons
Add tuple SGin
Add tuple SGin
Get next ready
output tuple
from SGout
2015-10-31...
ScaleJoin Scalability – comparisons/second
2015-10-31 32
Number of PUs
4 billion comparison / second!!!
Agenda
• Who we are
• Motivation and System Model
• ScaleJoin: a Deterministic, Disjoint-Parallel and
Skew-Resilient Strea...
2015-10-31 36
addTuple(tuple,sourceID)
allows a tuple from sourceID to be merged by ScaleGate in the
resulting timestamp-s...
ScaleGate: Main Functionality
• addTuple(tuple,sourceID)
– Insert tuple from sourceID
effectively in ts order
Vincenzo Gul...
ScaleGate: Main Functionality
• getNextReadyTuple(readerID)
– readerID gets the next ready tuple t in ts order
i.e. no tup...
Implementation?
• getNextReadyTuple(readerID)
– we get tuples in ts order,
so maybe an extractMin()? (priority queue)
– st...
ScaleGate Anatomy (1)
• Inspired from
lock-free skip lists
– randomized height of nodes
– expected cost for search/inserti...
ScaleGate Anatomy (2)
• Reader-local view of "head"
(also has minimum ts for that reader)
• Flagging mechanism:
– If "head...
ScaleGate Properties
• Safety: Linearizable
– Every operation appears to take effect
instantaneously
• Progress: Lock-free...
Why not a Skip List?
Vincenzo Gulisano - Yiannis Nikolakopoulos
Agenda
• Who we are
• Motivation and System Model
• ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-
Resilient Stre...
Timestamp sorted
Multi-way Streaming Aggregation
1 5
Aggregate F
3 7
5 9
<6,…>
<1,…>
<3,…>
When to
process a tuple?
Tuples...
Multi-way Streaming Aggregation
(baseline [Streamcloud, Borealis])
Vincenzo Gulisano - Yiannis Nikolakopoulos
Aggregate F:...
Data Structures to the Rescue
<6,…>
<1,…>
<3,…>
Insert
concurrently
Concurrently
update
windows
Get ready
tuples
Manage
so...
Aggregate Scaling
Vincenzo Gulisano - Yiannis Nikolakopoulos
Latency with Increasing
# Input Streams
Vincenzo Gulisano - Yiannis Nikolakopoulos
Best Solution Award 
Vincenzo Gulisano - Yiannis Nikolakopoulos
Analyze taxi trip reports from NYC and compute:
ACM DEBS ...
System Architecture Overview
Vincenzo Gulisano - Yiannis Nikolakopoulos
• Networks on chip (NoC)
• Short distance
between cores
• Message passing
model support
• Shared memory support
Can we hav...
Off-chip DRAM
Adapteva’s Epiphany Architecture:
16 or 64 cores
core
core
On-chip
32kb per
core
core
core core
core
Cores
c...
Ongoing work with Ericsson:
ScaleGate on Epiphany
• Task-graph based processing
• (Baseband) signal processing application...
Agenda
• Who we are
• Motivation and System Model
• ScaleJoin: a Deterministic, Disjoint-Parallel and
Skew-Resilient Strea...
2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 56
Source: http://storm.apache.org/
Source: http://storm.apache.org/...
Parallelization
• General approach
LB: Load Balancer
IM: Input Merger
OPA OPB
OPA LBIM
Node 1
OPA LBIM
Node m
…
Subcluster...
Parallelization
• Stateful operators: Semantic awareness
– Aggregate: count within last hour, group-by caller number
Previ...
Parallelization
Previous Subcluster
LB…
LB…
IM Agg1
IM Agg2
IM Agg3
…
…
…
592015-10-31 Vincenzo Gulisano - Yiannis Nikolak...
On-going research
2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 60
Source: http://www.michael-noll.com/blog/2013/0...
Agenda
• Who we are
• Motivation and System Model
• ScaleJoin: a Deterministic, Disjoint-Parallel and
Skew-Resilient Strea...
2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 62
1.Parallel stream joins
2.Parallel stream aggregation
3.“Ad-hoc” ...
The benefits of fine-grained synchronization in
deterministic and efficient stream processing
Vincenzo Gulisano, Yiannis N...
References
• Vincenzo Gulisano, Yiannis Nikolakopoulos, Marina Papatriantafilou, and
Philippas Tsigas, “ScaleJoin: a Deter...
Upcoming SlideShare
Loading in …5
×

The benefits of fine-grained synchronization in deterministic and efficient stream processing

435 views

Published on

This talk, given by Vincenzo Gulisano and Yiannis Nikolakopoulos at Yahoo! discusses some of their latest research results in the field of deterministic and efficient parallelization of data streaming operators. It also present ScaleGate, the abstract data type at the core of their research and whose java-based lock-free implementation is available at https://github.com/dcs-chalmers/ScaleGate_Java

Published in: Science
  • Be the first to comment

  • Be the first to like this

The benefits of fine-grained synchronization in deterministic and efficient stream processing

  1. 1. The benefits of fine-grained synchronization in deterministic and efficient stream processing Vincenzo Gulisano, Yiannis Nikolakopoulos 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 1 Chalmers University of technology
  2. 2. Agenda • Who we are • Motivation and System Model • ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join • The ScaleGate data structure • Fine-grained synchronization and other use-cases • Research in progress • Conclusions 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 2
  3. 3. Agenda • Who we are • Motivation and System Model • ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join • The ScaleGate data structure • Fine-grained synchronization and other use-cases • Research in progress • Conclusions 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 3
  4. 4. Who we are Chalmers university Computer Science and Engineering department Distributed Computing and Systems research group 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 4
  5. 5. Who we are • 12 PhD degrees awarded • ~20 researchers, 3 Postdocs and 10 PhD students. • acknowledged internationally as leading group in practical multicore synchronization algorithms and programming (results adopted by Java JDK, C++, Intel, NVIDIA) • extensive network of academic and industrial collaborations and has been continuously supported by national and international projects 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 5 Distributed Computing and Systems research group
  6. 6. Who we are Vincenzo Gulisano (PhD) Assistant Professor at Chalmers University of Technology Research focus: • scalable big data analysis in cyber-physical systems (Smart Grids, Advanced Metering Infrastructures and Vehicular Networks). • streaming operators’ parallelization, load balancing, provisioning, decommissioning and fault tolerance protocols. • enhancement of parallel stream operators by means of streaming- and energy-aware concurrent data structures that boost analysis performance (both in throughput and latency). • applied use of the streaming paradigm in data validation, intrusion detection systems, privacy- preserving online analysis, distributed embedded networks and cloud infrastructures. 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 6
  7. 7. Who we are Yiannis Nikolakopoulos PhD Student 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 7 Research interests: • Parallel and distributed computing • Shared memory concurrent data structures • Multicore/Manycore systems • Consistency problems • Synchronization • Data Streaming: operator parallelism & scalability
  8. 8. Agenda • Who we are • Motivation and System Model • ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join • The ScaleGate data structure • Fine-grained synchronization and other use-cases • Research in progress • Conclusions 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 8
  9. 9. Motivation Applications in sensor networks, cyber-physical systems: • large and fluctuating volumes of data generated continuously demand for: • Continuous processing of data streams • In a real-time fashion Store-then-process is not feasible Main Memory 1 Data Query Processing 3 Query results 2 Query Main Memory Query Processing Disk Main Memory Query Processing Continuous Query Data Query results 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 9
  10. 10. System Model • Data Stream: unbounded sequence of tuples – Example: Call Description Record (CDR) time Field Field Caller text Callee text Time (secs) int Price (€) double A B 8:00 3 C D 8:20 7 A E 8:35 6 102015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos
  11. 11. System Model • Operators: OP Stateless 1 input tuple 1 output tuple OP Stateful 1+ input tuple(s) 1 output tuple 112015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos
  12. 12. System Model • Continuous Query: graph operators/streams Convert €  $ Only > 10$ Count calls made by each Caller number Map Filter Agg 122015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos Count the number of calls whose price is more than 10 dollars made by each caller
  13. 13. System Model • Infinite sequence of tuples / bounded memory  windows • Example: 1 hour windows time [8:00,9:00) [8:20,9:20) [8:40,9:40) 132015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos
  14. 14. System Model • Infinite sequence of tuples / bounded memory  windows • Example: count tuples - 1 hour windows time [8:00,9:00) 8:05 8:15 8:22 8:45 9:05 Output: 4 14 [8:20,9:20) 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos
  15. 15. The evolution of SPEs Centralized SPEs Distributed SPEs Parallel-Distributed SPEs Elastic SPEs 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 15 Inter-operator parallelism … … Intra-operator parallelism … … +/- +/- Scale up / down
  16. 16. Agenda • Who we are • Motivation and System Model • ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join • The ScaleGate data structure • Fine-grained synchronization and other use-cases • Research in progress • Conclusions 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 16
  17. 17. What is a stream join? 2015-10-31 17 t1 t2 t3 t4 t1 t2 t3 t4 R S Sliding window Window size WS WSWR Predicate P
  18. 18. Why parallel stream joins? • WS = 600 seconds • R receives 500 tuples/second • S receives 500 tuples/second • WR will contain 300,000 tuples • WS will contain 300,000 tuples • Each new tuple from R gets compared with all the tuples in WS • Each new tuple from S gets compared with all the tuples in WR … 300,000,000 comparisons/second! t1 t2 t3 t4 t1 t2 t3 t4 R S WSWR 2015-10-31 18
  19. 19. Which are the challenges of a parallel stream join? Scalability High throughput Low latency Disjoint parallelism Skew resilience Determinism 2015-10-31 19
  20. 20. The 3-step procedure (sequential stream join) For each incoming tuple t: 1. compare t with all tuples in opposite window given predicate P 2. add t to its window 3. remove stale tuples from t’s window Add tuples to S Add tuples to R Prod R Prod S Consume resultsConsPU 2015-10-31 20 We assume each producer delivers tuples in timestamp order
  21. 21. The 3-step procedure, is it enough? Scalability High throughput Low latency Disjoint parallelism Skew resilience Determinism 2015-10-31 21 t1 t2 t1 t2 R S WSWR t3 t1 t2 t1 t2 R S WSWR t4 t3
  22. 22. Enforcing determinism in sequential stream joins • Next tuple to process = earliest(tS,tR) • The earliest(tS,tR) tuple is referred to as the next ready tuple • Process ready tuples in timestamp order  Determinism PU tS tR 2015-10-31 22
  23. 23. Deterministic 3-step procedure Pick the next ready tuple t: 1. compare t with all tuples in opposite window given predicate P 2. add t to its window 3. remove stale tuples from t’s window Add tuples to S Add tuples to R Prod R Prod S Consume resultsConsPU 2015-10-31 23
  24. 24. Shared-nothing parallel stream join (state-of-the-art) Prod R Prod S PU1 PU2 PUN … Cons Add tuple to PUi S Add tuple to PUi R Consume results Pick the next ready tuple t: 1. compare t with all tuples in opposite window given P 2. add t to its window 3. remove stale tuples from t’s window Chose a PU Chose a PU Take the next ready output tuple Scalability High throughput Low latency Disjoint parallelism Skew resilience Determinism 2015-10-31 24 Merge
  25. 25. Shared-nothing parallel stream join (state-of-the-art) Prod R Prod S PU1 PU2 PUN … 2015-10-31 25 enqueue() dequeue() ConsMerge
  26. 26. From coarse-grained to fine-grained synchronization Prod R Prod S PU1 PU2 PUN … Cons 2015-10-31 26
  27. 27. ScaleGate 2015-10-31 27 addTuple(tuple,sourceID) allows a tuple from sourceID to be merged by ScaleGate in the resulting timestamp-sorted stream of ready tuples. getNextReadyTuple(readerID) provides to readerID the next earliest ready tuple that has not been yet consumed by the former. https://github.com/dcs-chalmers/ScaleGate_Java
  28. 28. ScaleJoin Prod R Prod S PU1 PU2 PUN … Cons Add tuple SGin Add tuple SGin Get next ready output tuple from SGout Get next ready input tuple from SGin 1. compare t with all tuples in opposite window given P 2. add t to its window in a round-robin fashion 3. remove stale tuples from t’s window 2015-10-31 28 SGin SGout Steps for PU
  29. 29. 2015-10-31 29 t1 t2 R S WR t3 t4 R S t4 t1 WR R S t4 t2 WR R S t4 WR t3 Sequential stream join: ScaleJoin with 3 PUs: ScaleJoin (example)
  30. 30. ScaleJoin Prod R Prod S PU1 PU2 PUN … Cons Add tuple SGin Add tuple SGin Get next ready output tuple from SGout 2015-10-31 30 SGin SGout Scalability High throughput Low latency Disjoint parallelism Skew resilience Determinism Prod S Prod S Prod R Get next ready input tuple from SGin 1. compare t with all tuples in opposite window given P 2. add t to its window in a round robin fashion 3. remove stale tuples from t’s window Steps for PUi
  31. 31. ScaleJoin Scalability – comparisons/second 2015-10-31 32 Number of PUs 4 billion comparison / second!!!
  32. 32. Agenda • Who we are • Motivation and System Model • ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join • The ScaleGate data structure • Fine-grained synchronization and other use-cases • Research in progress • Conclusions 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 35
  33. 33. 2015-10-31 36 addTuple(tuple,sourceID) allows a tuple from sourceID to be merged by ScaleGate in the resulting timestamp-sorted stream of ready tuples. getNextReadyTuple(readerID) provides to readerID the next earliest ready tuple that has not been yet consumed by the former.
  34. 34. ScaleGate: Main Functionality • addTuple(tuple,sourceID) – Insert tuple from sourceID effectively in ts order Vincenzo Gulisano - Yiannis Nikolakopoulos <6,…> <1,…> <3,…> Insert concurrently
  35. 35. ScaleGate: Main Functionality • getNextReadyTuple(readerID) – readerID gets the next ready tuple t in ts order i.e. no tuple with ts<t.ts will arrive afterwards – return null if no ready tuple Vincenzo Gulisano - Yiannis Nikolakopoulos <1,…> Get ready tuples <6,…> <3,…> <7,…> <9,…> <8,…>
  36. 36. Implementation? • getNextReadyTuple(readerID) – we get tuples in ts order, so maybe an extractMin()? (priority queue) – still no guarantee that t is ready • addTuple(tuple, sourceID) – sorted insertion maybe could help Vincenzo Gulisano - Yiannis Nikolakopoulos <6,…> <1,…><3,…> Is there a concurrent data structure with cheap extractMin() + insert() Binary Search Trees, Heaps: Expensive rebalancing and heap property
  37. 37. ScaleGate Anatomy (1) • Inspired from lock-free skip lists – randomized height of nodes – expected cost for search/insertion O(logN) • Search by traversing from higher to lower levels Vincenzo Gulisano - Yiannis Nikolakopoulos BigData’15 [GNPT]
  38. 38. ScaleGate Anatomy (2) • Reader-local view of "head" (also has minimum ts for that reader) • Flagging mechanism: – If "head" is not flagged can be safely returned – Flag the last written tuple of each source • Nodes free to be garbage-collected after every reader passes (almost...) Vincenzo Gulisano - Yiannis Nikolakopoulos head0 head1 BigData’15 [GNPT]
  39. 39. ScaleGate Properties • Safety: Linearizable – Every operation appears to take effect instantaneously • Progress: Lock-free – At least one thread makes progress, independently of the state of others • Basic building block for providing deterministic processing (through the ready semantics) Vincenzo Gulisano - Yiannis Nikolakopoulos
  40. 40. Why not a Skip List? Vincenzo Gulisano - Yiannis Nikolakopoulos
  41. 41. Agenda • Who we are • Motivation and System Model • ScaleJoin: a Deterministic, Disjoint-Parallel and Skew- Resilient Stream Join • The ScaleGate data structure • Fine-grained synchronization and other use-cases – Streaming Aggregation – Towards real-time analytics – Some ongoing work • Research in progress • Conclusions 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 44
  42. 42. Timestamp sorted Multi-way Streaming Aggregation 1 5 Aggregate F 3 7 5 9 <6,…> <1,…> <3,…> When to process a tuple? Tuples not in TS order across streams; not ready for processing at arrival… Merge & sort streams Timestamp sorted tuples per stream Vincenzo Gulisano - Yiannis Nikolakopoulos
  43. 43. Multi-way Streaming Aggregation (baseline [Streamcloud, Borealis]) Vincenzo Gulisano - Yiannis Nikolakopoulos Aggregate F: Update windows and output<6,…> <1,…> <3,…> When to process a tuple? Keep checking all buffers!!! Merge & sort streamsCan Data Structures do more than insert and extract?
  44. 44. Data Structures to the Rescue <6,…> <1,…> <3,…> Insert concurrently Concurrently update windows Get ready tuples Manage sorted windows Output tuples ScaleGate objects allow concurrent access with - consistency/safety , progress guarantees - determinism - appropriate interface  parallelization of pipeline stages Contribution: SPAA’14 [CGNPT] “Brief announcement: concurrent data structures for efficient streaming aggregation”
  45. 45. Aggregate Scaling Vincenzo Gulisano - Yiannis Nikolakopoulos
  46. 46. Latency with Increasing # Input Streams Vincenzo Gulisano - Yiannis Nikolakopoulos
  47. 47. Best Solution Award  Vincenzo Gulisano - Yiannis Nikolakopoulos Analyze taxi trip reports from NYC and compute: ACM DEBS 2015 [GNWPT] “Deterministic Real-Time Analytics of Geospatial Data Streams through ScaleGate Objects”
  48. 48. System Architecture Overview Vincenzo Gulisano - Yiannis Nikolakopoulos
  49. 49. • Networks on chip (NoC) • Short distance between cores • Message passing model support • Shared memory support Can we have Data Structures: Fast Scalable Good progress guarantees Cache Cache IA Core Shared Local • Eliminated cache coherency • Limited support for synchronization primitives What’s happening in hardware? Vincenzo Gulisano - Yiannis Nikolakopoulos
  50. 50. Off-chip DRAM Adapteva’s Epiphany Architecture: 16 or 64 cores core core On-chip 32kb per core core core core core Cores communicate through mesh by reading and writing the on-chip memory Vincenzo Gulisano - Yiannis Nikolakopoulos Distributed shared memory
  51. 51. Ongoing work with Ericsson: ScaleGate on Epiphany • Task-graph based processing • (Baseband) signal processing applications Vincenzo Gulisano - Yiannis Nikolakopoulos Can ScaleGate help?
  52. 52. Agenda • Who we are • Motivation and System Model • ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join • The ScaleGate data structure • Fine-grained synchronization and other use-cases • Research in progress • Conclusions 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 55
  53. 53. 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 56 Source: http://storm.apache.org/ Source: http://storm.apache.org/ … … Intra-operator parallelism Storm is a parallel-distributed Stream Processing Engine • Data streaming operators  Spout / Bolt • Splitting of work into tasks / workers  Intra-operator parallelism
  54. 54. Parallelization • General approach LB: Load Balancer IM: Input Merger OPA OPB OPA LBIM Node 1 OPA LBIM Node m … Subcluster A OPB LBIM Node 1 OPB LBIM Node n … Subcluster B 572015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos
  55. 55. Parallelization • Stateful operators: Semantic awareness – Aggregate: count within last hour, group-by caller number Previous Subcluster LB… LB… IM Agg1 IM Agg2 IM Agg3 … … … Caller A A B 8:00 3 58 A C 8:30 5 A D 9:05 2 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos
  56. 56. Parallelization Previous Subcluster LB… LB… IM Agg1 IM Agg2 IM Agg3 … … … 592015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos Source: http://storm.apache.org/ Fields grouping Source: http://storm.apache.org/ User defined
  57. 57. On-going research 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 60 Source: http://www.michael-noll.com/blog/2013/06/21/understanding-storm-internal-message-buffers/ Coarse-grained synchronization!!! • Individual queues for executors • 2 threads per executor • Dedicated Receive and Send thread queues… … What about fine-grained synchronization?
  58. 58. Agenda • Who we are • Motivation and System Model • ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join • The ScaleGate data structure • Fine-grained synchronization and other use-cases • Research in progress • Conclusions 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 61
  59. 59. 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 62 1.Parallel stream joins 2.Parallel stream aggregation 3.“Ad-hoc” streaming applications 4.Determinism in Storm … https://github.com/dcs-chalmers/ScaleGate_Java
  60. 60. The benefits of fine-grained synchronization in deterministic and efficient stream processing Vincenzo Gulisano, Yiannis Nikolakopoulos 2015-10-31 Vincenzo Gulisano - Yiannis Nikolakopoulos 63 Chalmers University of technology Thank you! Questions?
  61. 61. References • Vincenzo Gulisano, Yiannis Nikolakopoulos, Marina Papatriantafilou, and Philippas Tsigas, “ScaleJoin: a Deterministic, Disjoint-Parallel and Skew- Resilient Stream Join”, to appear in IEEE BigData 2015 • Vincenzo Gulisano, Yiannis Nikolakopoulos, Ivan Walulya, Marina Papatriantafilou, Philippas Tsigas, “Deterministic Real-time Analytics of Geospatial Data Streams Through ScaleGate Objects”, 9th ACM International Conference on Distributed Event-Based Systems (DEBS '15) • Yiannis Nikolakopoulos, Anders Gidenstam, Marina Papatriantafilou, Philippas Tsigas “A Consistency Framework for Iteration Operations in Concurrent Data Structures”, 2015 IEEE 29th International Symposium on Parallel & Distributed Processing (IPDPS) • Daniel Cederman, Vincenzo Gulisano, Yiannis Nikolakopoulos, Marina Papatriantafilou, and Philippas Tsigas “Brief announcement: concurrent data structures for efficient streaming aggregation”, 2014 26th ACM symposium on Parallelism in algorithms and architectures (SPAA '14) Vincenzo Gulisano - Yiannis Nikolakopoulos

×