
Twitter's Real Time Stack - Processing Billions of Events Using Distributed Log and Heron


Twitter generates billions and billions of events per day, and analyzing these events in real time presents a massive challenge. To meet this challenge, Twitter designed an end-to-end real-time stack consisting of DistributedLog, a distributed and replicated messaging system, and Heron, a streaming system for real-time computation. DistributedLog is a replicated log service built on top of Apache BookKeeper, providing infinite, ordered, append-only streams that can be used for building robust real-time systems. It is the foundation of Twitter's publish-subscribe system. Twitter Heron is the next-generation streaming system, built from the ground up to address our scalability and reliability needs. Both systems have been in production for nearly two years and are widely used at Twitter in a range of diverse applications such as the search ingestion pipeline, ad analytics, image classification, and more. These slides describe Heron and DistributedLog in detail, covering a few use cases in depth and sharing the experiences and challenges of operating large-scale real-time systems at Twitter.

Published in: Software

  1. 1. Twitter Real Time Stack: Processing Billions of Events Using Distributed Log and Heron. Karthik Ramasamy, Twitter, @karthikz
  2. 2. 2
  3. 3. Value of Data: it's contextual. The value of data to decision-making decays over time (its "information half-life"): time-critical decisions need real-time processing within seconds; actionable and reactive decisions follow within minutes to hours; predictive and preventive insights come from traditional "batch" business intelligence over days to months. [1] Courtesy Michael Franklin, BIRTE, 2015.
  4. 4. What is Real-Time? It's contextual. BATCH (high throughput, > 1 hour): monthly active users, relevance for ads, ad-hoc queries. NEAR REAL TIME / OLTP (latency sensitive, < 500 ms): fanout of Tweets, search for Tweets, deterministic workflows. REAL TIME (low latency): financial trading (< 1 ms); approximate analytics such as ad impression counts and hashtag trends (10 ms - 1 sec).
  5. 5. Why Real Time? Real-time trends: emerging breakout trends on Twitter (in the form of #hashtags). Real-time conversations: sports conversations related to a topic (a recent goal or touchdown). Real-time recommendations: product recommendations based on your behavior and profile. Real-time search of Tweets. ANALYZING BILLIONS OF EVENTS IN REAL TIME IS A CHALLENGE!
  6. 6. Real Time: Analytics. STREAMING: analyze data as it is being produced. INTERACTIVE: store data and provide results instantly when a query is posed.
  7. 7. Real Time Use Cases. Online services (10s of ms): transaction logs, queues, RPCs. Near real time (100s of ms): change propagation, streaming analytics. Data for batch analytics (secs to mins): log aggregation, client events.
  8. 8. Real Time Stack Components: many moving parts. The Twitter real-time stack: Scribe, Heron, Event Bus, DistributedLog.
  9. 9. Scribe. Open source log aggregation: originally from Facebook; Twitter made significant enhancements for real-time event aggregation. High throughput and scale: delivers 125M messages/min and provides tight SLAs on data reliability. Runs on every machine: simple, very reliable, and uses memory and CPU efficiently.
  10. 10. Event Bus & Distributed Log: Next Generation Messaging
  11. 11. Twitter Messaging (diagram of the legacy landscape): Kestrel queues serving core business logic (tweets, fanouts ...), HDFS, BookKeeper, MySQL, Kafka, Scribe, deferred RPC, Gizzard, database, and search.
  12. 12. Kestrel Limitations. Adding subscribers is expensive. Scales poorly as the number of queues increases. Durability is hard to achieve. Read-behind degrades performance: too many random I/Os. Cross-DC replication.
  13. 13. Kafka Limitations. Relies on the file system page cache. Performance degrades when subscribers fall behind: too much random I/O.
  14. 14. Rethinking Messaging. Durable writes, intra-cluster and geo-replication. Scale resources independently. Cost efficiency. Unified stack with tradeoffs for various workloads. Multi-tenancy. Ease of manageability.
  15. 15. Event Bus delivers these requirements: durable writes, intra-cluster and geo-replication; independently scalable resources; cost efficiency; a unified stack with tradeoffs for various workloads; multi-tenancy; ease of manageability.
  16. 16. Event Bus - Pub-Sub (diagram): Publisher → Write Proxy → Distributed Log → Read Proxy → Subscriber, coordinated through Metadata.
  17. 17. Distributed Log (diagram): Publisher → Write Proxy → Distributed Log → Read Proxy → Subscriber, coordinated through Metadata.
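The core abstraction in the diagram above is an infinite, ordered, append-only stream that writers append to through a proxy and readers consume from a sequence number. A minimal in-memory sketch of that abstraction (all names are illustrative; this is not the DistributedLog API, and a list stands in for replicated BookKeeper storage):

```python
# Sketch of an ordered, append-only log in the spirit of DistributedLog.
# Illustrative only: a Python list stands in for replicated BookKeeper ledgers.
class Log:
    def __init__(self):
        self._records = []  # ordered storage; real systems replicate this

    def append(self, payload):
        """Append a record and return its sequence number (DLSN-like)."""
        self._records.append(payload)
        return len(self._records) - 1

    def read_from(self, seq):
        """Read every record at or after a sequence number, in order."""
        return self._records[seq:]

log = Log()
first = log.append(b"tweet-1")
log.append(b"tweet-2")
assert log.read_from(first) == [b"tweet-1", b"tweet-2"]
```

Because records are totally ordered and immutable once appended, any subscriber can replay from any sequence number and see the same history, which is what makes the log a foundation for pub-sub.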
  18. 18. Distributed Log @Twitter use cases: 01 Manhattan key-value store; 02 durable deferred RPC; 03 real-time search indexing; 04 pub-sub system; 05 globally replicated log.
  19. 19. Distributed Log @Twitter: 400 TB/day in, 10 PB/day out, 2 trillion events/day processed, 100 ms latency.
  20. 20. ALGORITHMS Mining Streaming Data
  21. 21. Twitter Heron: Next Generation Streaming Engine
  22. 22. Twitter Heron: a better Storm. Container-based architecture. Separate monitoring and scheduling. Simplified execution model. Much better performance.
  23. 23. Twitter Heron Design Goals. Batching of tuples: amortizing the cost of transferring tuples. Task isolation: ease of debug-ability, isolation, and profiling. Fully API compatible with Storm: directed acyclic graphs of topologies, spouts, and bolts. Support for back pressure: topologies should be self-adjusting. Use of mainstream languages: C++, Java, and Python. Efficiency: reduced resource consumption.
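The first design goal above, batching of tuples, can be sketched in a few lines: instead of paying per-tuple transfer overhead, tuples are grouped and shipped a batch at a time. This is an illustrative sketch of the idea only, not Heron's internal code, and the function name is hypothetical:

```python
# Sketch of tuple batching: amortize per-transfer overhead by shipping
# fixed-size batches instead of individual tuples. Illustrative only.
def make_batches(tuples, size):
    """Group an in-order tuple stream into batches of at most `size`."""
    return [tuples[i:i + size] for i in range(0, len(tuples), size)]

stream = list(range(10))
batches = make_batches(stream, 4)
assert batches == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
# One network call per batch (3 here) instead of one per tuple (10),
# while preserving tuple order within and across batches.
```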
  24. 24. Twitter Heron: guaranteed message passing, horizontal scalability, robust fault tolerance, and concise code focused on logic.
  25. 25. Heron Terminology. Topology: a directed acyclic graph where vertices = computation and edges = streams of data tuples. Spouts: sources of data tuples for the topology (examples: Kafka, Kestrel, MySQL, Postgres). Bolts: process incoming tuples and emit outgoing tuples (examples: filtering, aggregation, join, any function).
  26. 26. Heron Topology (example diagram): Spout 1 and Spout 2 feeding Bolts 1 and 2, which in turn feed Bolts 3, 4, and 5.
  27. 27. Stream Groupings. 01 Shuffle grouping: random distribution of tuples. 02 Fields grouping: group tuples by one or more fields. 03 All grouping: replicate tuples to all tasks. 04 Global grouping: send the entire stream to one task.
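The four groupings above are routing policies that map each tuple to the downstream task(s) that receive it. A minimal sketch of each policy (function names are illustrative, not the Storm/Heron API; the essential property is that a fields grouping hashes the key so equal keys always reach the same task):

```python
# Illustrative sketches of the four stream groupings. Each function
# returns the downstream task index(es) chosen for a tuple.
import random

def shuffle_grouping(n_tasks, rng=random.Random(0)):
    return [rng.randrange(n_tasks)]          # random task per tuple

def fields_grouping(tup, n_tasks, key):
    return [hash(tup[key]) % n_tasks]        # same key -> same task, always

def all_grouping(n_tasks):
    return list(range(n_tasks))              # replicate to every task

def global_grouping(n_tasks):
    return [0]                               # entire stream to one task

words = [{"word": "heron"}, {"word": "storm"}, {"word": "heron"}]
routes = [fields_grouping(t, 4, "word")[0] for t in words]
assert routes[0] == routes[2]   # both "heron" tuples land on the same task,
                                # which is what makes per-key aggregation correct
```

The fields grouping is what makes stateful per-key bolts (counts, joins) work: partial state for a key never splits across tasks.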
  28. 28. Heron Architecture: High Level (diagram). Topologies 1 through N are submitted to the Scheduler.
  29. 29. Heron Architecture: Topology (diagram). A Topology Master syncs the logical plan, physical plan, and execution state with a ZK cluster. Each container runs a Stream Manager, a Metrics Manager, and Heron instances (I1-I4) executing the physical plan.
  30. 30. Stream Manager: Back Pressure (diagram). Spout S1 and bolts B2, B3, and B4 exchange tuples through the Stream Manager.
  31. 31. Stream Manager: Back Pressure (diagram). Four containers, each with a Stream Manager hosting S1, B2, B3, and B4; back pressure from a slow bolt propagates between the Stream Managers.
  32. 32. Stream Manager: Spout Back Pressure (diagram). Under back pressure, the Stream Managers throttle the S1 spout instances in every container rather than dropping tuples.
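The spout back pressure shown in the last three diagrams can be sketched with a bounded buffer: when downstream bolts fall behind, the full buffer blocks (throttles) the spout instead of letting tuples pile up or get dropped. This is a single-process illustration of the principle only, with hypothetical names; Heron's real mechanism works across Stream Managers:

```python
# Sketch of back pressure via a bounded queue: a full buffer blocks the
# producing spout until the consuming bolt catches up. Illustrative only.
import queue
import threading

buffer = queue.Queue(maxsize=2)  # tiny stand-in for a stream-manager buffer

def spout(n):
    for i in range(n):
        buffer.put(i)  # blocks while the buffer is full -> back pressure

consumed = []

def bolt(n):
    while len(consumed) < n:
        consumed.append(buffer.get())  # slow consumer draining the buffer

t = threading.Thread(target=bolt, args=(5,))
t.start()
spout(5)   # succeeds despite the 2-slot buffer, because put() throttles
t.join()
assert consumed == [0, 1, 2, 3, 4]  # nothing dropped, order preserved
```

The design choice this illustrates: throttling the source trades throughput for correctness, which suits topologies that guarantee message delivery.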
  34. 34. 34 Heron Sample Topologies
  35. 35. Heron @Twitter. Topologies range from 1 stage to 10 stages. 3x reduction in cores and memory. Heron has been in production for 2 years.
  36. 36. Heron Performance: Settings. Spouts, bolts, Heron containers, and Storm workers were each set to 25 (Expt #1), 100 (Expt #2), and 200 (Expt #3).
  37. 37. Heron Performance: At-Most-Once (charts). Throughput (million tuples/min) at spout parallelism 25/100/200: Heron (master) 1,920 / 5,820 / 10,200 vs Heron (paper) 249 / 965 / 1,545, a 5-6x improvement. CPU usage (# cores used) at the same parallelism: Heron (master) 54 / 217.5 / 397.5 vs Heron (paper) 32 / 137 / 261, i.e. 1.4-1.6x.
  38. 38. Heron Performance: CPU Usage (chart). Million tuples/min at spout parallelism 25/100/200, Heron (master) vs Heron (paper): a 4-5x improvement.
  39. 39. Heron @Twitter: > 400 real-time jobs, 500 billion events/day processed, 25-200 ms latency.
  40. 40. Tying It Together
  41. 41. Combining batch and real time: the Lambda Architecture (diagram). New data flows into both a batch layer and a real-time layer, whose results are merged for the client.
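The merge step in the lambda architecture can be sketched concretely: a periodically recomputed batch view is combined at query time with an incrementally maintained real-time view. The data and structure below are hypothetical, purely to show the query-time merge:

```python
# Sketch of the lambda-architecture merge: query = batch view + real-time
# delta. The views and counts here are hypothetical.
batch_view = {"#heron": 1000}                  # recomputed periodically (e.g. from HDFS)
realtime_view = {"#heron": 42, "#storm": 7}    # incremental counts from the stream

def query(key):
    """Serve a count by merging the stale batch view with the fresh delta."""
    return batch_view.get(key, 0) + realtime_view.get(key, 0)

assert query("#heron") == 1042   # batch baseline plus fresh increments
assert query("#storm") == 7      # key seen only by the real-time layer
```

Each batch recomputation resets the baseline and the real-time view restarts from zero, which bounds how much error the approximate streaming path can accumulate.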
  42. 42. Lambda Architecture - The Good (diagram): Scribe collection pipeline → Event Bus → Heron analytics pipeline → results.
  43. 43. Lambda Architecture - The Bad. Have to write everything twice! Subtle differences in semantics. Have to fix everything (maybe twice)! How much duct tape is required? What about graphs, ML, SQL, etc.?
  44. 44. Summingbird to the Rescue (diagram). A single Summingbird program compiles to both a Scalding/MapReduce job (HDFS → batch key-value result store) and a Heron topology (message broker → online key-value result store); the client reads from both stores.
  45. 45. Curious to Learn More? "Twitter Heron: Stream Processing at Scale" - Sanjeev Kulkarni, Nikunj Bhagat, Maosong Fu, Vikas Kedigehalli, Christopher Kellogg, Sailesh Mittal, Jignesh M. Patel, Karthik Ramasamy, Siddarth Taneja (Twitter, Inc. and University of Wisconsin - Madison). "Storm @Twitter" - Ankit Toshniwal, Siddarth Taneja, Amit Shukla, Karthik Ramasamy, Jignesh M. Patel, Sanjeev Kulkarni, Jason Jackson, Krishna Gade, Maosong Fu, Jake Donham, Nikunj Bhagat, Sailesh Mittal, Dmitriy Ryaboy (Twitter, Inc. and University of Wisconsin - Madison).
  48. 48. Any Questions? (What, why, where, when, who, how.)
  49. 49. Get in Touch: @karthikz
  50. 50. THANKS FOR ATTENDING!