QCon NYC 2012: Twitter's Real Time Architecture

With hundreds of millions of users, Twitter operates one of the world's largest real-time delivery systems, large enough and pervasive enough to exert noticeable "pressure" on the overall internet itself. At steady state, Twitter receives thousands of tweets a second that it needs to deliver to disks, in-memory timelines, email, and mobile devices. The name of the game for Twitter is "now", so those deliveries, which multiply according to the graph of who follows whom, need to occur in real-time. In this session, we will dive into both the "write path" and "read path" of Twitter to understand the architecture which supports those tweets, and also how Twitter serves them through one of the world's largest web sites.

Animated slides are available at http://www.youtube.com/watch?v=l55jFAGsgbs

Published in: Technology, Business

Transcript of "QCon NYC 2012: Twitter's Real Time Architecture"

  1. real-time delivery architecture, @raffi, qcon nyc 2012
  2. http://twitpic.com/135xa - There’s a plane in the Hudson. I’m on a ferry going to pick up the people. Crazy. (15 Jan 09, Janis Krums @jkrums)
  3. this is barack - lets get this started! -bo (24 May 12, The White House @whitehouse)
  4. what are the goals? ⇢ evolve from being solely a web stack
  5. (diagram labels) ROUTING, PRESENTATION, LOGIC, STORAGE & RETRIEVAL; T-Bird T-Flock + Haplo Monorail Darkwing Flock(s)
  6. what are the goals? ⇢ evolve from being solely a web stack ⇢ isolate responsibilities and concerns ⇢ site speed and reliability ⇢ developer innovation speed
  7. (table) Targeted, Pull: twitter.com, home_timeline API; Targeted, Push: User / Site Streams, Mobile Push (SMS, etc.); Queried, Pull: Search API; Queried, Push: Track / Follow Streams
  8. (architecture diagram) Write API, Ingester, Fanout, Batch Compute, Timeline Cache, Push Compute, Search Cache, HTTP Push, Mobile Push, Timeline Service, Blender; stores shown: Redis, Hadoop, Earlybird
  9. insert ⇢ keyed off “recipient” ⇢ pipelined 4k “destinations” at a time ⇢ replicated (diagram as on slide 8, with a Social Graph Service added)
  10. using redis ⇢ native list structure ⇢ RPUSHX to only add to cached timelines (entry layout: Tweet ID, 8 bytes; User ID, 8 bytes; Bits, 4 bytes)
  11. using redis ⇢ native list structure ⇢ RPUSHX to only add to cached timelines (diagram: a timeline drawn as a list of Tweet ID / User ID / Bits entries)
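
Slides 9 to 11 describe fanout into Redis lists: a 20-byte entry per tweet, RPUSHX so that only already-cached timelines grow, and 4k destinations handled at a time. A minimal sketch of that idea follows. The byte order, the `home_timeline:<id>` key format, and the `FakeTimelineCache` class (an in-memory stand-in for a Redis server, so the example runs without one) are assumptions made for illustration, not details from the talk.

```python
import struct

# Assumed layout: big-endian tweet ID (8 bytes), user ID (8 bytes), flag bits (4 bytes).
ENTRY = struct.Struct(">QQI")


class FakeTimelineCache:
    """Hypothetical in-memory stand-in for Redis; it only mimics RPUSHX."""

    def __init__(self):
        self.lists = {}

    def rpushx(self, key, value):
        # Like Redis RPUSHX: append only when the list already exists,
        # and return the new length (0 when nothing was pushed).
        if key not in self.lists:
            return 0
        self.lists[key].append(value)
        return len(self.lists[key])


def fanout(cache, tweet_id, author_id, follower_ids, bits=0, batch_size=4000):
    """Push one packed entry to each follower's cached timeline, batch by batch."""
    entry = ENTRY.pack(tweet_id, author_id, bits)
    delivered = 0
    for start in range(0, len(follower_ids), batch_size):
        for follower in follower_ids[start:start + batch_size]:
            if cache.rpushx(f"home_timeline:{follower}", entry):
                delivered += 1
    return delivered


cache = FakeTimelineCache()
# Follower 1 has a cached timeline with one older entry; follower 2 has none.
cache.lists["home_timeline:1"] = [ENTRY.pack(41, 9, 0)]
delivered = fanout(cache, tweet_id=42, author_id=7, follower_ids=[1, 2])
print(ENTRY.size, delivered)
```

With a real Redis client the inner loop would typically be sent as one pipeline per batch; the stand-in only keeps the batching shape visible.
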
  12. (architecture diagram, labels as on slide 8)
  13. (table, same as slide 7)
  14. (architecture diagram, as on slide 8 but with Search Index in place of Search Cache)
  15. blender ⇢ queries one replica of all indexes ⇢ merges & ranks results (diagram as on slide 14)
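
The blender slide can be read as a scatter-gather step: ask one replica of every index partition, then merge and rank what comes back. The sketch below follows that reading. The replica functions, the `(score, tweet_id)` hit format, and the random replica choice are hypothetical; the talk does not say how replicas are picked or how results are scored.

```python
import heapq
import random


def make_replica(docs):
    """Build a toy replica: docs maps tweet_id -> (score, text)."""
    def search(term):
        return [(score, tweet_id)
                for tweet_id, (score, text) in docs.items()
                if term in text.split()]
    return search


def blend(partitions, term, limit=3, rng=random):
    """Query one replica per index partition, then merge and rank the hits."""
    hits = []
    for replicas in partitions:
        replica = rng.choice(replicas)   # any one replica of this partition
        hits.extend(replica(term))
    return heapq.nlargest(limit, hits, key=lambda hit: hit[0])


shard_a = {1: (0.9, "hello world"), 2: (0.2, "hello there")}
shard_b = {3: (0.5, "hello again"), 4: (0.7, "goodbye world")}
partitions = [
    [make_replica(shard_a), make_replica(shard_a)],  # two replicas of partition A
    [make_replica(shard_b)],                         # one replica of partition B
]
top = blend(partitions, "hello", limit=2)
print(top)
```
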
  16. (architecture diagram, labels as on slide 14)
  17. (table, same as slide 7)
  18. (architecture diagram, labels as on slide 14)
  19. http push / hosebird ⇢ maintains persistent connections with end clients ⇢ processes tweet & social graph events ⇢ event-based “router”
  20. event propagation ⇢ write API sends all events into hosebird; sees content creation events, social graph changes, etc. ⇢ different queues for public tweets, protected tweets, social events, etc. (diagram: Write API; Hosebird boxes for Firehose, User Streams, Track / Follow)
  21. event cascading ⇢ bandwidth management ⇢ simultaneous connection management (~1m long lived & open connections to this cluster) (diagram as on slide 20)
  22. firehose ⇢ edge machine simply outputs the public tweet queue ⇢ only allow a limited number of firehoses per hosebird box for bandwidth management (diagram as on slide 20)
  23. track / follow ⇢ simple query based on tweet content ⇢ keeps list of terms / users of interest ⇢ parses public tweets at the edge, and if term matches a token, or user is of interest, then route (diagram as on slide 20)
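
The track / follow rule on slide 23 amounts to a set-membership test per connection. Here is a small sketch of it; the tokenizer, the `subscriptions` structure, and the connection IDs are assumptions chosen for illustration, since the slides do not describe how tweets are tokenized or how subscriptions are stored.

```python
import re


def tokens(text):
    """Lowercase the tweet and split it into simple word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def route(tweet, subscriptions):
    """Return the connections whose tracked terms or followed users match."""
    words = tokens(tweet["text"])
    matched = []
    for conn_id, sub in subscriptions.items():
        if words & sub["track"] or tweet["user"] in sub["follow"]:
            matched.append(conn_id)
    return matched


subscriptions = {
    "conn-1": {"track": {"hudson"}, "follow": set()},
    "conn-2": {"track": set(), "follow": {"whitehouse"}},
    "conn-3": {"track": {"qcon"}, "follow": set()},
}
tweet = {"user": "jkrums", "text": "There's a plane in the Hudson."}
print(route(tweet, subscriptions))
```
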
  24. user streams ⇢ replicate home timeline experience ⇢ upon login, obtain “following” list ⇢ keep cached following list coherent by seeing social graph updates ⇢ route tweet if from a followed user (diagram as on slide 20)
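
The user streams behaviour on slide 24 can be sketched as a per-connection object that caches the following list at login and updates it from social graph events. The class name, method names, and event kinds below are hypothetical; only the three steps (obtain list, keep it coherent, route by author) come from the slide.

```python
class UserStream:
    """Hypothetical per-connection state for one user stream."""

    def __init__(self, following):
        self.following = set(following)   # obtained once, at login
        self.delivered = []

    def on_graph_event(self, kind, target):
        # Keep the cached following list coherent with social graph updates.
        if kind == "follow":
            self.following.add(target)
        elif kind == "unfollow":
            self.following.discard(target)

    def on_tweet(self, author, text):
        # Route the tweet only if it comes from a followed user.
        if author in self.following:
            self.delivered.append((author, text))


stream = UserStream(following=["nelson"])
stream.on_tweet("danadanger", "missed")        # not followed yet
stream.on_graph_event("follow", "danadanger")
stream.on_tweet("danadanger", "seen")
stream.on_graph_event("unfollow", "nelson")
stream.on_tweet("nelson", "missed too")        # no longer followed
print(stream.delivered)
```
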
  25. (architecture diagram, labels as on slide 14)
  26. (table, same as slide 7)
  27. (architecture diagram, labels as on slide 14)
  28. (architecture diagram, as on slide 14 with a Social Graph Service added)
  29. (architecture diagram, labels as on slide 28)
  30. (architecture diagram, labels as on slide 14)
  31. (table, same as slide 7)
  32. (architecture diagram, labels as on slide 14)
  33. (architecture diagram, labels as on slide 14)
  34. (architecture diagram, as on slide 14 with three paths labeled: Synchronous Path, Asynchronous Path, Query Path)
  35. (architecture diagram, labels as on slide 34)
  36. (architecture diagram, labels as on slide 34)
  37. (architecture diagram, as on slide 14 with two paths labeled: Read Path, Write Path)
  38. (architecture diagram, labels as on slide 37)
  39. things we’re trying...
  40. (reduced architecture diagram) Write API, Ingester, Fanout, Timeline Cache, Search Index; stores shown: Redis, Earlybird
  41. search index ⇢ [‘hello’,‘world’]; fanout index ⇢ [@danadanger, ...] (diagram as on slide 40)
  42. (table) User Intent → Query Expansion: “Hello, world” → “Hello” AND “world”; @raffi’s home timeline → home_timeline:raffi
  43. (table) User Intent → Query Expansion: “Hello, world” → “Hello” AND “world”; @raffi’s home timeline → user_timeline:nelson OR user_timeline:danadanger
  44. fan-in ⇢ O(1) write ⇢ O(n) read; fan-out ⇢ O(n) write ⇢ O(1) read (diagram as on slide 40)
  45. (table, same as slide 42)
  46. (table) User Intent → Query Expansion: “Hello, world” → “Hello” AND “world”; @raffi’s home timeline → home_timeline:raffi OR user_timeline:taylorswift13
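
Slides 42 to 46 treat a home timeline as a query that gets expanded. Slide 43 shows a pure fan-in expansion (read every followed user's timeline), and slide 46 can be read as a hybrid: the precomputed home timeline plus a read-time lookup for selected accounts. The helper names and the `fan_in_accounts` set are assumptions; the slides do not say how such accounts would be chosen.

```python
def expand_fan_in(following):
    """O(1) write, O(n) read: query each followed user's timeline at read time."""
    return " OR ".join(f"user_timeline:{name}" for name in following)


def expand_hybrid(user, following, fan_in_accounts):
    """Precomputed home timeline, plus read-time lookups for selected accounts."""
    terms = [f"home_timeline:{user}"]
    terms += [f"user_timeline:{name}"
              for name in following if name in fan_in_accounts]
    return " OR ".join(terms)


print(expand_fan_in(["nelson", "danadanger"]))
print(expand_hybrid("raffi", ["nelson", "taylorswift13"], {"taylorswift13"}))
```

Under this reading, tweets from accounts in `fan_in_accounts` would skip the O(n) fanout write and be merged in when the timeline is read.
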
  47. streaming compute ⇢ continuous computation ⇢ driven by the events that come into twitter ⇢ generalizing the push mechanism
  48. (architecture diagram, labels as on slide 14)
  49. timeline query statistics ⇢ >150m active users worldwide ⇢ 300k qps poll-based timelines @ 1ms p50 / 4ms p99 ⇢ 30k qps search-based timelines
  50. tweet input ⇢ ~340m tweets per day ⇢ ~4K/sec daily average ⇢ ~6K/sec daily peak ⇢ >10K/sec during large events
  51. followed by following
  52. timeline delivery statistics ⇢ 26b deliveries / day (~18m / min) ⇢ 3.5 seconds @ p50 to deliver to 1m ⇢ ~300k deliveries / sec
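
The per-second and per-minute figures on slides 50 and 52 follow from the daily totals by plain division, which the short check below reproduces. The last line, average deliveries per tweet, is not stated on the slides; it is derived here from the two daily totals and should be read as a rough ratio only.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60

tweets_per_day = 340e6        # slide 50
deliveries_per_day = 26e9     # slide 52

tweets_per_sec = tweets_per_day / SECONDS_PER_DAY          # the "~4K/sec" average
deliveries_per_min = deliveries_per_day / MINUTES_PER_DAY  # the "~18m / min" figure
deliveries_per_sec = deliveries_per_day / SECONDS_PER_DAY  # the "~300k / sec" figure
avg_deliveries_per_tweet = deliveries_per_day / tweets_per_day  # derived, not on the slides

print(round(tweets_per_sec), round(deliveries_per_min), round(deliveries_per_sec),
      round(avg_deliveries_per_tweet))
```
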
  53. thanks!
