Apache Incubator Samza: Stream Processing at LinkedIn

  1. Apache Samza* Stream Processing at LinkedIn Chris Riccomini 11/13/2013 * Incubating
  2. Stream Processing?
  3. 0 ms Response latency
  4. 0 ms Response latency Synchronous
  5. 0 ms Response latency Synchronous Later. Possibly much later.
  6. Response latency: 0 ms (synchronous); milliseconds to minutes; later, possibly much later.
  7. Newsfeed
  8. News
  9. Ad Relevance
  10. Email
  11. Search Indexing Pipeline
  12. Metrics and Monitoring
  13. Motivation
  14. Real-time Feeds • User activity • Metrics • Monitoring • Database changes
  15. Real-time Feeds • 10+ billion writes per day • 172,000 messages per second (average) • 55+ billion messages per day to real-time consumers
  16. Stream Processing is Hard • Partitioning • State • Re-processing • Failure semantics • Joins to services or database • Non-determinism
  17. Samza Concepts & Architecture
  18. Streams Partition 0 Partition 1 Partition 2
  19. Streams Partition 0 1 2 3 4 5 6 Partition 1 1 2 3 4 5 Partition 2 1 2 3 4 5 6 7
  24. Streams Partition 0 1 2 3 4 5 6 Partition 1 1 2 3 4 5 Partition 2 1 2 3 4 5 6 7 next append
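
The stream slides above can be read as a set of append-only logs, one per partition, each with its own offsets. Below is a minimal sketch of that model, assuming hash partitioning on a message key and 0-based offsets (the slides number messages from 1). `PartitionedStream` and its methods are hypothetical names for illustration, not Samza or Kafka classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of a partitioned stream; not a Samza or Kafka class.
final class PartitionedStream {
    private final List<List<String>> partitions = new ArrayList<>();

    PartitionedStream(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    // The partition is chosen at write time from the key (assumed: hash partitioning).
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Appends to the end of the chosen partition and returns the new message's offset.
    long append(String key, String message) {
        List<String> log = partitions.get(partitionFor(key));
        log.add(message);
        return log.size() - 1;
    }

    // Reads are addressed by (partition, offset), so a consumer can replay from any offset.
    String read(int partition, long offset) {
        return partitions.get(partition).get((int) offset);
    }
}
```

Messages with the same key always land in the same partition and keep their relative order there; no ordering is implied across partitions.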
  25. Tasks Partition 0
  26. Tasks Partition 0 Task 1
  27. Tasks Partition 0
      class PageKeyViewsCounterTask implements StreamTask {
        // pageKeyViews (presumably page key -> AtomicInteger) and countStream
        // (the output SystemStream) are fields that the slide does not show.
        public void process(IncomingMessageEnvelope envelope,
                            MessageCollector collector,
                            TaskCoordinator coordinator) {
          GenericRecord record = (GenericRecord) envelope.getMessage();
          String pageKey = record.get("page-key").toString();
          int newCount = pageKeyViews.get(pageKey).incrementAndGet();
          collector.send(new OutgoingMessageEnvelope(countStream, pageKey, newCount));
        }
      }
  38. Tasks Page Views - Partition 0 1 2 3 4 PageKeyViews CounterTask Partition 0 Partition 1 Output Count Stream
  41. Tasks Page Views - Partition 0 1 2 3 4 PageKeyViews CounterTask Output Count Stream Partition 0 Partition 1
  46. Tasks Page Views - Partition 0 1 2 3 4 PageKeyViews CounterTask Checkpoint Stream 2 Output Count Stream Partition 1 Partition 0 Partition 1
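
The checkpoint stream in the diagram above records how far a task has read in its input partition. The sketch below models that with plain Java, assuming a checkpoint is written every N messages; `CheckpointedTask` and `runUntil` are hypothetical names, and the "crash" is simulated by simply stopping between checkpoints. It shows why delivery is at-least-once: messages processed after the last checkpoint are processed again on restart.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one task reading one input partition; not Samza code.
final class CheckpointedTask {
    private final List<String> input;      // input partition, offsets 0..n-1
    private final int checkpointInterval;  // assumed: checkpoint every N messages
    private long checkpoint = 0;           // offset to resume from after a restart
    final List<String> output = new ArrayList<>(); // stands in for the output stream

    CheckpointedTask(List<String> input, int checkpointInterval) {
        this.input = input;
        this.checkpointInterval = checkpointInterval;
    }

    long checkpoint() {
        return checkpoint;
    }

    // Resumes at the last checkpoint and processes up to (not including) stopBefore.
    // Stopping between checkpoints models a crash: the next call replays the
    // messages after the checkpoint, so their output is emitted a second time.
    void runUntil(long stopBefore) {
        for (long offset = checkpoint; offset < stopBefore; offset++) {
            output.add(input.get((int) offset));
            if ((offset + 1) % checkpointInterval == 0) {
                checkpoint = offset + 1;
            }
        }
    }
}
```

With input a, b, c, d, e and a checkpoint every 2 messages, stopping after c leaves the checkpoint at offset 2, so the restarted task emits c again.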
  55. Jobs Stream A Task 1 Task 2 Stream B Task 3
  56. Jobs Stream A Task 1 Stream B Task 2 Stream C Task 3
  57. Jobs AdViews Task 1 AdClicks Task 2 AdClickThroughRate Task 3
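
As an illustration of the AdViews / AdClicks / AdClickThroughRate job above, here is a minimal sketch of what a single task might do with the two inputs. It assumes both streams are partitioned by ad ID, so that one task sees every view and click for a given ad; the class and method names are hypothetical and the Samza wiring is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-task logic for a click-through-rate job; not taken from the talk.
final class ClickThroughRateTask {
    // adId -> {views, clicks}
    private final Map<String, long[]> counts = new HashMap<>();

    // Called for each message from the (assumed) AdViews input.
    void onAdView(String adId) {
        counts.computeIfAbsent(adId, k -> new long[2])[0]++;
    }

    // Called for each message from the (assumed) AdClicks input.
    void onAdClick(String adId) {
        counts.computeIfAbsent(adId, k -> new long[2])[1]++;
    }

    // Clicks divided by views; 0.0 when the ad has no recorded views.
    double clickThroughRate(String adId) {
        long[] c = counts.get(adId);
        return (c == null || c[0] == 0) ? 0.0 : (double) c[1] / c[0];
    }
}
```

The counts here live only in memory, which is exactly the state problem the later "Stateful Processing" slides address.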
  60. Dataflow Stream A Stream B Job 1 Stream D Job 2 Stream E Job 3 Stream B Stream C
  62. YARN
  63. YARN You: I want to run command X on two machines with 512M of memory.
  64. YARN You: I want to run command X on two machines with 512M of memory. YARN: Cool, where’s your code?
  65. YARN You: I want to run command X on two machines with 512M of memory. YARN: Cool, where’s your code? You: http://some-host/jobs/download/my.tgz
  66. YARN You: I want to run command X on two machines with 512M of memory. YARN: Cool, where’s your code? You: http://some-host/jobs/download/my.tgz YARN: I’ve run your command on grid-node-2 and grid-node-7.
  67. YARN Host 1 Host 2 Host 3
  68. YARN Host 1 Host 2 Host 3 NM NM NM
  69. YARN Host 0 RM Host 1 Host 2 Host 3 NM NM NM
  70. YARN Host 0 Client RM Host 1 Host 2 Host 3 NM NM NM
  73. YARN Host 0 Client Host 1 NM RM Host 2 AM Host 3 NM NM
  76. YARN Host 0 Client Host 1 NM RM Host 2 AM Host 3 NM NM Container
  83. Jobs Stream A Task 1 Task 2 Stream B Task 3
  84. Containers Stream A Task 1 Task 2 Stream B Task 3
  85. Containers Stream A Samza Container 1 Stream B Samza Container 2
  86. Containers Samza Container 1 Samza Container 2
  87. YARN Host 1 Samza Container 1 Host 2 Samza Container 2
  88. YARN Host 1 Host 2 NodeManager NodeManager Samza Container 1 Samza Container 2
  89. YARN Host 1 Host 2 NodeManager NodeManager Samza Container 1 Samza Container 2 Samza YARN AM
  90. YARN Host 1 Host 2 NodeManager NodeManager Samza Container 1 Kafka Broker Samza Container 2 Samza YARN AM Kafka Broker
  91. YARN Host 1 Host 2 NodeManager NodeManager MapReduce Container HDFS MapReduce YARN AM MapReduce Container HDFS
  92. YARN Host 1 Stream A NodeManager Samza Container 1 Samza Container 1 Kafka Broker Stream C Samza Container 2
  96. YARN Host 1 Host 2 NodeManager NodeManager Samza Container 1 Kafka Broker Samza Container 2 Samza YARN AM Kafka Broker
  97. CGroups Host 1 Host 2 NodeManager NodeManager Samza Container 1 Kafka Broker Samza Container 2 Samza YARN AM Kafka Broker
  98. (Not Running) Multi-Framework Host 1 Host 2 NodeManager NodeManager Samza Container 1 Kafka MapReduce Container Samza YARN AM HDFS
  99. Stateful Processing
  100. SELECT col1, count(*) FROM stream1 INNER JOIN stream2 ON stream1.col3 = stream2.col3 WHERE col2 > 20 GROUP BY col1 ORDER BY count(*) DESC LIMIT 50;
  103. SELECT col1, count(*) FROM stream1 INNER JOIN stream2 ON stream1.col3 = stream2.col3 WHERE col2 > 20 GROUP BY col1 ORDER BY count(*) DESC LIMIT 10;
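
To see why the query above needs state, consider only its WHERE, GROUP BY, ORDER BY and LIMIT parts evaluated over an unbounded stream. The sketch below keeps a running count per group and answers "top N so far" on demand. It is a simplified illustration under stated assumptions: the join is omitted, ties are broken by key, and `RunningTopN` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical streaming evaluation of: WHERE col2 > 20 GROUP BY col1
// ORDER BY count(*) DESC LIMIT n. The join from the slide is not modeled.
final class RunningTopN {
    private final Map<String, Long> counts = new HashMap<>();

    // Filter the row, then increment the count for its group.
    void onRow(String col1, int col2) {
        if (col2 > 20) {
            counts.merge(col1, 1L, Long::sum);
        }
    }

    // Returns the col1 values with the highest counts seen so far, best first.
    List<String> top(int n) {
        List<String> keys = new ArrayList<>(counts.keySet());
        keys.sort((a, b) -> {
            int byCount = Long.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b); // assumed tie-break: by key
        });
        return new ArrayList<>(keys.subList(0, Math.min(n, keys.size())));
    }
}
```

The `counts` map grows with the number of distinct groups and must survive restarts, which is the motivation for the slides that follow.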
  104. How do people do this?
  105. Remote Stores Stream A Task 1 Task 2 Task 3 Key-Value Store Stream B
  106. Remote RPC is slow • Stream: ~500k records/sec/container • DB: << less
  107. Online vs. Async
  108. No undo • Database state is non-deterministic • Can’t roll back mutations if task crashes
  109. Tables & Streams put(a, w) put(b, x) Database put(a, y) put(b, z) Time
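
The put sequence on the slide above can be replayed to show how a stream of changes determines a table. This is a minimal sketch with hypothetical names (`ChangelogReplay`, `Put`): applying the puts in order yields the table's state at that point in time.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of rebuilding a table from its stream of changes.
final class ChangelogReplay {
    // One change: the key that was written and its new value.
    static final class Put {
        final String key;
        final String value;

        Put(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Applies the changes in order; later puts to a key overwrite earlier ones.
    static Map<String, String> replay(List<Put> changelog) {
        Map<String, String> table = new LinkedHashMap<>();
        for (Put p : changelog) {
            table.put(p.key, p.value);
        }
        return table;
    }
}
```

Replaying put(a, w), put(b, x), put(a, y), put(b, z) gives a = y and b = z; replaying only the first two gives the table as it stood earlier, a = w and b = x.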
  110. Stateful Tasks Stream A Task 1 Task 2 Stream B Task 3
  112. Stateful Tasks Stream A Task 1 Task 2 Stream B Task 3 Changelog Stream
  124. Key-Value Store • put(table_name, key, value) • get(table_name, key) • delete(table_name, key) • range(table_name, key1, key2)
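
The four operations listed above can be sketched with sorted in-memory maps. This is an illustration only, not Samza's store implementation: `InMemoryTables` is a hypothetical name, keys and values are plain strings, and the range bounds are assumed to be key1 inclusive and key2 exclusive, which the slide does not specify.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the put/get/delete/range operations.
final class InMemoryTables {
    private final Map<String, TreeMap<String, String>> tables = new HashMap<>();

    // Looks up a table by name, creating it on first use.
    private TreeMap<String, String> table(String tableName) {
        return tables.computeIfAbsent(tableName, n -> new TreeMap<>());
    }

    void put(String tableName, String key, String value) {
        table(tableName).put(key, value);
    }

    // Returns null when the key is absent.
    String get(String tableName, String key) {
        return table(tableName).get(key);
    }

    void delete(String tableName, String key) {
        table(tableName).remove(key);
    }

    // Assumed bounds: key1 inclusive, key2 exclusive. Returns a sorted copy.
    SortedMap<String, String> range(String tableName, String key1, String key2) {
        return new TreeMap<>(table(tableName).subMap(key1, key2));
    }
}
```

A sorted structure is what makes `range` cheap; a hash map would support the other three operations but would need a full scan for ranges.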
  125. Stateful Stream Task
       public class SimpleStatefulTask implements StreamTask, InitableTask {
         private KeyValueStore<String, String> store;

         @SuppressWarnings("unchecked")
         public void init(Config config, TaskContext context) {
           // getStore is not typed to <String, String>, so a cast is needed.
           this.store = (KeyValueStore<String, String>) context.getStore("mystore");
         }

         public void process(IncomingMessageEnvelope envelope,
                             MessageCollector collector,
                             TaskCoordinator coordinator) {
           GenericRecord record = (GenericRecord) envelope.getMessage();
           // GenericRecord.get returns Object, so convert explicitly.
           String memberId = record.get("member_id").toString();
           String name = record.get("name").toString();
           System.out.println("old name: " + store.get(memberId));
           store.put(memberId, name);
         }
       }
  129. Whew!
  130. Let’s be Friends! • We are incubating, and you can help! • Get up and running in 5 minutes http://bit.ly/hello-samza • Grab some newbie JIRAs http://bit.ly/samza_newbie_issues

Editor's Notes

  1. - stream processing for us = anything asynchronous, but not batch computed.
     - 25% of code is async. 50% is rpc/online. 25% is batch.
     - stream processing is worst supported.
  5. - compute top shares, pull in, scrape, entity tag
     - language detection
     - send emails: friend was in the news
     - requirement: has to be fast, since news is trendy
  6. - relevance pipeline
  7. - we send relatively data-rich emails
     - some emails are time sensitive (need to be sent soon)
  8. - time sensitive
     - data ingestion pattern
     - other systems that follow this pattern: real-time OLAP system, and social graph system
  9. - ecosystem at LinkedIn (some unique traits)
     - hard unsolved problems in this space
  10. - once we had all this data in Kafka, we wanted to do stuff with it.
      - persistent, reliable, distributed message queue
      - Kafka = first among equals, but stream systems are pluggable. Just like Hadoop with HDFS vs. S3.
  11. - started with just a simple web service that consumes and produces Kafka messages.
      - realized that there are a lot of hard problems that needed to be solved.
      - reprocessing: what if my algorithm changes and I need to reprocess all events?
      - non-determinism: queries to external systems, time dependencies, ordering of messages.
  12. - open area of research
      - been around for 20 years
  13. partitioned
  14. re-playable, ordered, fault tolerant, infinite. Very heavyweight definition of a stream (vs. S4, Storm, etc.)
  15. partition assignment happens on write
  16. At-least-once messaging. Duplicates are possible. Future: exactly-once semantics. Transparent to user. No ack’ing API.
  17. connected by stream name only. Fully buffered.
  18. split job tracker up. Resource management, process isolation, fault tolerance, security.
  19. - group by, sum, count
  20. - stream to stream, stream to table, table to table
  21. - buffered sorting
  22. Changelog/redo log. State machine model.
  23. Can also consume these streams from other jobs.
  24. - can’t keep messages forever.
      - log compaction: delete over-written keys over time.
  26. store API is pluggable: Lucene, buffered sort, external sort, bitmap index, bloom filters and sketches
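
The log compaction note above ("delete over-written keys over time") can be sketched as follows. This is a simplified, single-pass illustration with hypothetical names; real compaction runs incrementally in the messaging system, but the result is the same idea: only the latest value per key survives, so the changelog stays bounded by the number of live keys.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical one-shot compaction of a changelog; entries are {key, value} pairs.
final class LogCompaction {
    static List<String[]> compact(List<String[]> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String[] entry : log) {
            // Remove before re-inserting so the key moves to its latest position.
            latest.remove(entry[0]);
            latest.put(entry[0], entry[1]);
        }
        List<String[]> compacted = new ArrayList<>();
        for (Map.Entry<String, String> e : latest.entrySet()) {
            compacted.add(new String[] {e.getKey(), e.getValue()});
        }
        return compacted;
    }
}
```

Replaying the compacted log restores the same final table as replaying the full log, which is what makes it safe to discard the over-written entries.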