
4th PFI System reading



An introduction to handshake joins, as presented in the paper "How Soccer Players Would do Stream Joins".


  1. How Soccer Players Would do Stream Joins. 3/4/2015, @nobu_k
  2. Who? 久保田展行 (@nobu_k), CTO @ Preferred Networks America, Inc. Specialties: DBMS, search engines, distributed systems (consensus). beatmania IIDX SP/DP Kaiden (mainly DP).
  3. How Soccer Players Would do Stream Joins: Jens Teubner, Rene Mueller, SIGMOD 2011. Handshake join: window-based stream joins supporting any join predicate. Very high degrees of parallelism: multi-core CPUs, FPGAs, Massively Parallel Processor Arrays (MPPAs).
  4. Joins: Joins (⋈) combine two or more relations (tables) in an RDBMS. A join is a cross product of relations followed by a selection (σ). Many methods: nested-loops joins, sort-merge joins, (recursive, hybrid) hash joins.
  5. Stream Joins: Problems. (a) Unbounded, "infinite" input data. Solution: (sliding) window-based joins, tuple-based or time-based. (b) Latency of the output. Solution: online, symmetric evaluation. How can it be scalable?
  6. Handshake Joins: Streams flow by each other in opposite directions. Each core locally evaluates tuples. [Figure: Core 1, Core 2, Core 3]
  7. Evaluation Strategy: A newly arrived tuple is compared to all tuples of the other stream that are in the same core. Any comparison algorithm (predicate) can be used.
  8. Synchronization: Tuples a and b can miss each other. Strategies: lock-step forwarding; two-phase forwarding using an asynchronous message queue (asymmetric protocol).
  9. Two-Phase Forwarding Using Asynchronous Message Queue: Leaving the tuple with a special mark. FIFO queue. [Figure: steps 1 and 2, tuples a and b]
  10. Two-Phase Forwarding: ACK. When the left core receives tuple b, it sends an ack to the right core before sending any other tuples. The right core deletes b when it receives the ack. [Figure: steps 2 and 3, tuples a and b]
  11. Two-Phase Forwarding: when a and b miss each other. a will be compared to tuples in the right core, including b. [Figure: steps 4 and 5, tuples a and b]
  12. Load Balancing: Automatic load balancing without centralized control. Each core can handle an arbitrary number of tuples. [Figure: Core 1, Core 2, Core 3]
  13. Software Implementation: AMD Opteron 6174, 2.2 GHz; libnuma. [Figure quoted from page 8 of the paper]
  14. Scalability: [Figures from pages 8 and 9 of the paper]
  15. FPGA Implementation: "Assume the system has to provide a throughput of 500 ktuples/sec with a window size of 100 tuples. Configurations with 1, 2, 5, and 10 join cores can guarantee this throughput if operated at clock frequencies of 50, 25, 10, or 5 MHz, respectively." (from page 10)
  16. Summary: Handshake join. Window-based stream join. Flexible and scalable. Works well with FPGAs.
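
Slides 4, 5, and 7 describe a join as a cross product followed by a selection, restricted to a sliding window and evaluated symmetrically as tuples arrive. A minimal single-threaded sketch of that idea, assuming a tuple-based window and a nested-loops scan, might look as follows. The names `window_join`, `events`, and `window_size` are hypothetical and do not come from the slides or the paper.

```python
from collections import deque

def window_join(events, window_size, predicate):
    """Symmetric, tuple-based sliding-window join (illustrative sketch).

    events: iterable of (side, tuple) pairs in arrival order, side is "R" or "S".
    window_size: number of most recent tuples kept per stream (assumption).
    predicate: any function (r, s) -> bool.
    """
    # deque(maxlen=...) silently drops the oldest tuple when the window is full.
    windows = {"R": deque(maxlen=window_size), "S": deque(maxlen=window_size)}
    results = []
    for side, tup in events:
        other = "S" if side == "R" else "R"
        # Nested-loops scan: compare the new tuple with the other stream's window.
        for candidate in windows[other]:
            r, s = (tup, candidate) if side == "R" else (candidate, tup)
            if predicate(r, s):
                results.append((r, s))
        # Only after the scan does the new tuple become visible to the other side.
        windows[side].append(tup)
    return results

if __name__ == "__main__":
    stream = [("R", 1), ("S", 1), ("S", 2), ("R", 2), ("R", 3), ("S", 3)]
    print(window_join(stream, 2, lambda r, s: r == s))
```

Because the predicate is an arbitrary function, the same loop handles equality, band, or any other join condition, at the cost of scanning the whole window for every arrival.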
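
The data flow on slides 6 and 7 can be simulated sequentially: stream R enters at the first core and moves right, stream S enters at the last core and moves left, and a tuple is compared with the other stream's tuples each time it enters a core. The sketch below is a toy model under stated assumptions, not the paper's implementation: it runs in one thread, so the missed-pair problem of slides 8 to 11 cannot occur, and the fixed per-core `segment_size` and the class name `HandshakeJoinSim` are hypothetical.

```python
from collections import deque

class HandshakeJoinSim:
    """Sequential toy simulation of the handshake join data flow (assumption)."""

    def __init__(self, num_cores, segment_size, predicate):
        self.n = num_cores
        self.cap = segment_size  # tuples per stream that one core holds (assumption)
        self.pred = predicate
        self.r_seg = [deque() for _ in range(num_cores)]
        self.s_seg = [deque() for _ in range(num_cores)]
        self.results = []

    def push_r(self, tup):
        self._enter(tup, core=0, step=+1)            # R flows left to right

    def push_s(self, tup):
        self._enter(tup, core=self.n - 1, step=-1)   # S flows right to left

    def _enter(self, tup, core, step):
        is_r = step == +1
        mine, other = (self.r_seg, self.s_seg) if is_r else (self.s_seg, self.r_seg)
        while 0 <= core < self.n:
            # Local evaluation: compare with the other stream's tuples in this core.
            for cand in other[core]:
                r, s = (tup, cand) if is_r else (cand, tup)
                if self.pred(r, s):
                    self.results.append((r, s))
            mine[core].append(tup)
            if len(mine[core]) <= self.cap:
                return
            # Segment is full: forward the oldest tuple to the neighboring core.
            tup = mine[core].popleft()
            core += step
        # The tuple left the last core on its path: it has expired from the window.

if __name__ == "__main__":
    sim = HandshakeJoinSim(num_cores=2, segment_size=1, predicate=lambda r, s: r == s)
    sim.push_r(1); sim.push_s(1); sim.push_r(2); sim.push_s(2); sim.push_r(3)
    print(sim.results)
```

In this model a pair is reported only once the two tuples reach the same core, so a match can appear some time after both tuples have arrived. No tuple ever crosses its counterpart without entering a shared core, which is why each pair is compared at most once.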
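
The clock frequencies quoted on slide 15 are consistent with a simple back-of-the-envelope model. The sketch below assumes that every arriving tuple is compared against a full window of the other stream, that each join core evaluates one predicate per clock cycle, and that the work divides evenly across cores. These assumptions are ours, chosen because they reproduce the quoted numbers, and the function name `required_clock_mhz` is hypothetical.

```python
def required_clock_mhz(tuples_per_sec, window_size, num_cores):
    """Clock frequency per core in MHz under the stated assumptions."""
    # Each arriving tuple triggers window_size comparisons in total.
    comparisons_per_sec = tuples_per_sec * window_size
    # Assumption: one comparison per cycle per core, work split evenly.
    return comparisons_per_sec / num_cores / 1e6

if __name__ == "__main__":
    for cores in (1, 2, 5, 10):
        print(cores, required_clock_mhz(500_000, 100, cores))
```

With 500,000 tuples/sec and a window of 100 tuples this gives 50, 25, 10, and 5 MHz for 1, 2, 5, and 10 cores, matching the figures on the slide.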