Apache Incubator Samza: Stream Processing at LinkedIn

This is the slide deck that was presented at the Hadoop Users Group at LinkedIn on November 5, 2013.

The presentation covers what Samza is, why we built it, and how it works.

Apache Incubator Samza: Stream Processing at LinkedIn

  1. 1. Apache Samza* Stream Processing at LinkedIn Chris Riccomini 9/27/2013 * Incubating
  2. 2. Stream Processing?
  3. 3. 0 ms Response latency
  4. 4. 0 ms Response latency Synchronous
  5. 5. 0 ms Response latency Synchronous Later. Possibly much later.
  6. 6. 0 ms Response latency Milliseconds to minutes Synchronous Later. Possibly much later.
  7. 7. Newsfeed
  8. 8. News
  9. 9. Ad Relevance
  10. 10. Email
  11. 11. Search Indexing Pipeline
  12. 12. Metrics and Monitoring
  13. 13. Motivation
  14. 14. Real-time Feeds • User activity • Metrics • Monitoring • Database changes
  15. 15. Real-time Feeds • 10+ billion writes per day • 172,000 messages per second (average) • 55+ billion messages per day to real-time consumers
  16. 16. Stream Processing is Hard • Partitioning • State • Re-processing • Failure semantics • Joins to services or database • Non-determinism
  17. 17. Samza Concepts & Architecture
  18. 18. Streams Partition 0 Partition 1 Partition 2
  19. 19. Streams Partition 0 1 2 3 4 5 6 Partition 1 1 2 3 4 5 Partition 2 1 2 3 4 5 6 7
  24. 24. Streams Partition 0 1 2 3 4 5 6 Partition 1 1 2 3 4 5 Partition 2 1 2 3 4 5 6 7 next append
  25. 25. Tasks Partition 0
  26. 26. Tasks Partition 0 Task 1
  27. 27. Tasks Partition 0
        class PageKeyViewsCounterTask implements StreamTask {
          public void process(IncomingMessageEnvelope envelope,
                              MessageCollector collector,
                              TaskCoordinator coordinator) {
            GenericRecord record = ((GenericRecord) envelope.getMsg());
            String pageKey = record.get("page-key").toString();
            int newCount = pageKeyViews.get(pageKey).incrementAndGet();
            collector.send(countStream, pageKey, newCount);
          }
        }
  37. 37. Tasks Partition 0 Task 1
  38. 38. Tasks Page Views - Partition 0 1 2 3 4 PageKeyViews CounterTask Partition 0 Partition 1 Output Count Stream
  41. 41. Tasks Page Views - Partition 0 1 2 3 4 PageKeyViews CounterTask Output Count Stream Partition 0 Partition 1
  46. 46. Tasks Page Views - Partition 0 1 2 3 4 PageKeyViews CounterTask Checkpoint Stream 2 Output Count Stream Partition 1 Partition 0 Partition 1
  55. 55. Jobs Stream A Task 1 Task 2 Stream B Task 3
  56. 56. Jobs Stream A Task 1 Stream B Task 2 Stream C Task 3
  57. 57. Jobs AdViews Task 1 AdClicks Task 2 AdClickThroughRate Task 3
  59. 59. Jobs Stream A Task 1 Stream B Task 2 Stream C Task 3
  60. 60. Dataflow Stream A Stream B Job 1 Stream D Job 2 Stream E Job 3 Stream B Stream C
  62. 62. YARN
  63. 63. Jobs Stream A Task 1 Task 2 Stream B Task 3
  64. 64. Containers Stream A Task 1 Task 2 Stream B Task 3
  65. 65. Containers Stream A Samza Container 1 Stream B Samza Container 2
  66. 66. Containers Samza Container 1 Samza Container 2
  67. 67. YARN Host 1 Samza Container 1 Host 2 Samza Container 2
  68. 68. YARN Host 1 Host 2 NodeManager NodeManager Samza Container 1 Samza Container 2
  69. 69. YARN Host 1 Host 2 NodeManager NodeManager Samza Container 1 Samza Container 2 Samza YARN AM
  70. 70. YARN Host 1 Host 2 NodeManager NodeManager Samza Container 1 Kafka Broker Samza Container 2 Samza YARN AM Kafka Broker
  71. 71. YARN Host 1 Host 2 NodeManager NodeManager MapReduce Container HDFS MapReduce YARN AM MapReduce Container HDFS
  72. 72. YARN Host 1 Stream A NodeManager Samza Container 1 Samza Container 1 Kafka Broker Stream C Samza Container 2
  76. 76. YARN Host 1 Host 2 NodeManager NodeManager Samza Container 1 Kafka Broker Samza Container 2 Samza YARN AM Kafka Broker
  77. 77. CGroups Host 1 Host 2 NodeManager NodeManager Samza Container 1 Kafka Broker Samza Container 2 Samza YARN AM Kafka Broker
  78. 78. (Not Running) Multi-Framework Host 1 Host 2 NodeManager NodeManager Samza Container 1 Kafka MapReduce Container Samza YARN AM HDFS
  79. 79. Stateful Processing
  80. 80. SELECT col1, count(*) FROM stream1 INNER JOIN stream2 ON stream1.col3 = stream2.col3 WHERE col2 > 20 GROUP BY col1 ORDER BY count(*) DESC LIMIT 50;
  83. 83. SELECT col1, count(*) FROM stream1 INNER JOIN stream2 ON stream1.col3 = stream2.col3 WHERE col2 > 20 GROUP BY col1 ORDER BY count(*) DESC LIMIT 10;
  84. 84. How do people do this?
  85. 85. Remote Stores Stream A Task 1 Task 2 Task 3 Key-Value Store Stream B
  86. 86. Remote RPC is slow • Stream: ~500k records/sec/container • DB: << less
  87. 87. Online vs. Async
  88. 88. No undo • Database state is non-deterministic • Can’t roll back mutations if task crashes
  89. 89. Tables & Streams put(a, w) put(b, x) Database put(a, y) put(b, z) Time
  90. 90. Stateful Tasks Stream A Task 1 Task 2 Stream B Task 3
  92. 92. Stateful Tasks Stream A Task 1 Task 2 Stream B Task 3 Changelog Stream
  104. 104. Key-Value Store • put(table_name, key, value) • get(table_name, key) • delete(table_name, key) • range(table_name, key1, key2)
  105. 105. Whew!
  106. 106. Let’s be Friends! • We are incubating, and you can help! • Get up and running in 5 minutes http://bit.ly/hello-samza • Grab some newbie JIRAs http://bit.ly/samza_newbie_issues
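
The stream, task, and checkpoint slides above can be sketched in plain Java. This is a minimal, in-memory illustration and not Samza's API: the class names `CounterTaskSketch` and `CheckpointReplaySketch`, the sample page keys, and the point at which the crash happens are all assumptions made for the example. It shows one task reading one partition by offset, checkpointing its position, and re-reading from the checkpoint after a restart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Samza task: counts views per page key.
// Its state lives only in memory, as in the slide code.
class CounterTaskSketch {
    private final Map<String, Integer> pageKeyViews = new HashMap<>();

    void process(String pageKey, List<String> outputStream) {
        int newCount = pageKeyViews.merge(pageKey, 1, Integer::sum);
        outputStream.add(pageKey + "=" + newCount);
    }
}

class CheckpointReplaySketch {
    // Runs a task over one partition, crashes it once, and restarts it
    // from the last checkpointed offset. Returns the output stream.
    static List<String> runDemo() {
        // One input partition: an ordered log addressed by offset (assumed data).
        List<String> partition0 = Arrays.asList("home", "jobs", "home", "profile");
        List<String> outputStream = new ArrayList<>();
        int checkpoint = 0; // next offset to read after a restart

        CounterTaskSketch task = new CounterTaskSketch();
        task.process(partition0.get(0), outputStream);
        task.process(partition0.get(1), outputStream);
        checkpoint = 2; // offsets 0 and 1 are recorded as done
        task.process(partition0.get(2), outputStream); // processed, not yet checkpointed
        // Simulated crash here: the task and its in-memory counts are lost.

        CounterTaskSketch restarted = new CounterTaskSketch();
        for (int offset = checkpoint; offset < partition0.size(); offset++) {
            restarted.process(partition0.get(offset), outputStream);
        }
        return outputStream;
    }

    public static void main(String[] args) {
        // Prints [home=1, jobs=1, home=2, home=1, profile=1]
        System.out.println(runDemo());
    }
}
```

The message at offset 2 reaches the output twice, which is the at-least-once behavior described in the notes. The second copy also carries a count of 1, because the restarted task begins with empty in-memory state; under the assumptions of this sketch, that is the gap the Stateful Processing slides go on to address.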

Editor's Notes

  • - stream processing for us = anything asynchronous, but not batch computed.
    - 25% of code is async. 50% is rpc/online. 25% is batch.
    - stream processing is worst supported.
  • - compute top shares, pull in, scrape, entity tag
    - language detection
    - send emails: friend was in the news
    - requirement: has to be fast, since news is trendy
  • - relevance pipeline
  • - we send relatively data-rich emails
    - some emails are time sensitive (need to be sent soon)
  • - time sensitive
    - data ingestion pattern
    - other systems that follow this pattern: real-time OLAP system, and social graph system
  • - ecosystem at LinkedIn (some unique traits)
    - hard unsolved problems in this space
  • - once we had all this data in Kafka, we wanted to do stuff with it.
    - persistent, reliable, distributed message queue
    - Kafka = first among equals, but stream systems are pluggable. Just like Hadoop with HDFS vs. S3.
  • - started with just a simple web service that consumes and produces Kafka messages.
    - realized that there are a lot of hard problems that needed to be solved.
    - reprocessing: what if my algorithm changes and I need to reprocess all events?
    - non-determinism: queries to external systems, time dependencies, ordering of messages.
  • - open area of research
    - been around for 20 years
  • partitioned
  • re-playable, ordered, fault tolerant, infinite; very heavyweight definition of a stream (vs. S4, Storm, etc.)
  • At least once messaging. Duplicates are possible. Future: exact semantics. Transparent to user. No ack’ing API.
  • connected by stream name only; fully buffered
  • - group by, sum, count
  • - stream to stream, stream to table, table to table
  • - buffered sorting
  • UDP is an over-optimization, since most processors try to remote join, which is very slow.
  • Changelog/redo log. State machine model.
  • Can also consume these streams from other jobs.
  • - can’t keep messages forever. - log compaction: delete over-written keys over time.
  • store API is pluggable: Lucene, buffered sort, external sort, bitmap index, bloom filters and sketches
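
The changelog and log compaction notes above can be illustrated with a small in-memory sketch. It is not Samza's store implementation: the class `ChangelogStoreSketch`, the `Change` record, and the use of a null value as a delete marker are assumptions made for this example, and the `table_name` argument from the Key-Value Store slide is left out for brevity. Each write goes to a local table and to a changelog; the table can be rebuilt by replaying the changelog, and compaction keeps only the latest entry per key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical local key-value store backed by a changelog.
class ChangelogStoreSketch {
    // One changelog entry; a null value marks a delete (an assumption of this sketch).
    static final class Change {
        final String key;
        final String value;

        Change(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private final TreeMap<String, String> table = new TreeMap<>();
    private final List<Change> changelog;

    // Rebuilds local state by replaying the given changelog in order.
    ChangelogStoreSketch(List<Change> changelog) {
        this.changelog = changelog;
        for (Change change : changelog) {
            apply(change);
        }
    }

    private void apply(Change change) {
        if (change.value == null) {
            table.remove(change.key);
        } else {
            table.put(change.key, change.value);
        }
    }

    void put(String key, String value) {
        Change change = new Change(key, value);
        changelog.add(change); // record the mutation first
        apply(change);         // then update the local table
    }

    String get(String key) {
        return table.get(key);
    }

    void delete(String key) {
        Change change = new Change(key, null);
        changelog.add(change);
        apply(change);
    }

    // Keys from key1 (inclusive) to key2 (exclusive), in sorted order.
    SortedMap<String, String> range(String key1, String key2) {
        return table.subMap(key1, key2);
    }

    // Log compaction: drop over-written entries, keeping the latest per key.
    static List<Change> compact(List<Change> log) {
        Map<String, Change> latest = new LinkedHashMap<>();
        for (Change change : log) {
            latest.remove(change.key); // re-insert so order follows the last write
            latest.put(change.key, change);
        }
        return new ArrayList<>(latest.values());
    }
}
```

Using the sequence from the Tables & Streams slide, put(a, w), put(b, x), put(a, y), put(b, z) leaves four changelog entries; compacting them leaves two, and a store rebuilt from the compacted log returns y for a and z for b.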