timelines at scale
@raffi
qcon sf 2012

Raffi Krikorian explains the architecture used by Twitter to deal with 300K queries per second - tweets, social graph mutations, and direct messages

  1. timelines at scale, @raffi, qcon sf 2012
  2. Pull vs. Push
     Targeted: twitter.com, home_timeline API (pull); User / Site Streams, Mobile Push (SMS, etc.) (push)
     Queried: Search API (pull); Track / Follow Streams (push)
  3. the challenge
     ⇢ > 150M worldwide active users
     ⇢ > 300K QPS for timelines
     ⇢ naïve timeline “materialization” can be slow
  4. [architecture diagram] Timeline Service, Ingester, SearchCache, Redis, Earlybird, Blender, PushCompute, HTTP Push, Mobile Push, BatchCompute, Hadoop, Write API, Fanout, TimelineCache
  5. [same diagram] adds Social Graph Service
  6. [same diagram, with Social Graph Service] insert
     ⇢ keyed off “recipient”
     ⇢ pipelined 4k “destinations” at a time
     ⇢ replicated
  7. [same diagram] using redis
     ⇢ native list structure
     entry: Tweet ID (8 bytes), User ID (8 bytes), Bits (4 bytes)
  8. [same diagram] using redis
     ⇢ native list structure
     ⇢ RPUSHX to only add to cached timelines
     [figure: a list of Tweet ID / User ID / Bits entries]
  9. [same diagram]
  10. [diagram] Timeline Service, Write API, Fanout, Redis, TimelineCache, TweetyPie, Gizmoduck
  11. Pull vs. Push
     Targeted: twitter.com, home_timeline API (pull); User / Site Streams, Mobile Push (SMS, etc.) (push)
     Queried: Search API (pull); Track / Follow Streams (push)
  12. [architecture diagram] Ingester, SearchCache, Redis, Earlybird, Blender, PushCompute, HTTP Push, Mobile Push, BatchCompute, Hadoop, Write API, Fanout, TimelineCache, Timeline Service
  13. [architecture diagram] PushCompute, HTTP Push, Mobile Push, BatchCompute, Hadoop, SearchIndex, Blender, Redis, Timeline Service, Ingester, Earlybird, Write API, Fanout, TimelineCache
     blender
     ⇢ queries one replica of all indexes
     ⇢ merges & ranks results
  14. [same diagram]
  15. [diagram] Write API and Read API, each shown in front of Redis (×3) and in front of Earlybird (×3), with an API Cache
     ⇢ Redis: O(n) write, O(1) read
     ⇢ Earlybird: O(1) write, O(n) read
  16. the challenge (part #2)
     ⇢ fanout can be really slow!
     ⇢ ...especially for high follower counts
  17. @barackobama: 23 million followers
     @ladygaga: 31 million followers
     @katyperry: 28 million followers
     @justinbieber: 28 million followers
     @raffi: 0.019 million followers
  18. there are over 400 million tweets a day
  19. ≈ 4600 tweets a second; 0.2 m… a twe…
  20. [diagram] Write API, Ingester, Fanout, SearchIndex, Redis, Earlybird, TimelineCache
     search index ⇢ [‘hello’, ‘world’]
     fanout index ⇢ [@danadanger, ...]
  21. User Intent → Query Expansion
     “Hello, world” → “Hello” AND “world”
     @raffi’s home timeline → home_timeline:raffi
  22. User Intent → Query Expansion
     “Hello, world” → “Hello” AND “world”
     @raffi’s home timeline → user_timeline:nelson OR user_timeline:danadanger
  23. User Intent → Query Expansion
     “Hello, world” → “Hello” AND “world”
     @raffi’s home timeline → home_timeline:raffi
  24. User Intent → Query Expansion
     “Hello, world” → “Hello” AND “world”
     @raffi’s home timeline → home_timeline:raffi OR user_timeline:taylorswift13
  25. [architecture diagram] BatchCompute, Hadoop, PushCompute, HTTP Push, Mobile Push, SearchIndex, Blender, Redis, Timeline Service, Ingester, Earlybird, Write API, Fanout, TimelineCache
  26. [same diagram, paths labelled] Asynchronous Path, Query Path, Synchronous Path
  27. [same diagram, paths labelled] Synchronous Path, Query Path, Asynchronous Path
  28. [same diagram, paths labelled] Asynchronous Path, Synchronous Path, Query Path
  29. timeline query statistics
     ⇢ > 150m active users worldwide
     ⇢ > 300k qps poll-based timelines @ 1ms p50 / 4ms p99
     ⇢ > 30k qps search-based timelines
  30. tweet input
     ⇢ ~400m tweets per day
     ⇢ ~5K/sec daily average
     ⇢ ~7K/sec daily peak
     ⇢ > 12K/sec during large events
  31. timeline delivery statistics
     ⇢ 30b deliveries / day (~21m / min)
     ⇢ 3.5 seconds @ p50 to deliver to 1m
     ⇢ ~300k deliveries / sec
  32. thanks!
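
Slides 6 to 8 describe the fanout insert: keyed off the recipient, pipelined 4k destinations at a time, stored in a Redis list, and written with RPUSHX so that only already-cached timelines grow. A minimal sketch of that idea follows. It is not Twitter's code. `FakeTimelineCache` is a hypothetical in-memory stand-in for a Redis client, the `home_timeline:<id>` key naming is an assumption, and so is the field order inside the 20-byte entry (the slides give only the field sizes).

```python
import struct

BATCH_SIZE = 4000  # slide 6: "pipelined 4k destinations at a time"


class FakeTimelineCache:
    """Hypothetical in-memory stand-in for Redis lists (not a real client)."""

    def __init__(self):
        self.lists = {}

    def rpushx(self, key, value):
        # Same contract as Redis RPUSHX: append only if the key already
        # exists, and return the new list length (0 when nothing was added).
        if key not in self.lists:
            return 0
        self.lists[key].append(value)
        return len(self.lists[key])


def pack_entry(tweet_id, user_id, bits):
    # 8-byte tweet ID, 8-byte user ID, 4 bytes of flag bits = 20 bytes.
    # The field order is an assumption made for this sketch.
    return struct.pack(">QQI", tweet_id, user_id, bits)


def fanout(cache, tweet_id, author_id, follower_ids, bits=0):
    """Append one entry to each follower's cached timeline, in batches."""
    entry = pack_entry(tweet_id, author_id, bits)
    delivered = 0
    for start in range(0, len(follower_ids), BATCH_SIZE):
        # With a real client, each batch would go out as one pipeline.
        for follower in follower_ids[start:start + BATCH_SIZE]:
            if cache.rpushx(f"home_timeline:{follower}", entry):
                delivered += 1
    return delivered
```

Followers without a cached timeline are skipped, so the write cost is paid only for timelines that are already materialized.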
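
Slide 15 contrasts an O(n) write with O(1) read against an O(1) write with O(n) read. The toy model below sketches both strategies over plain dictionaries so the trade-off is visible. The class names are hypothetical, and sorting by tweet ID assumes that IDs increase over time.

```python
from collections import defaultdict


class FanoutOnWrite:
    """O(n) write in the follower count, O(1) lookup on read."""

    def __init__(self, followers):
        self.followers = followers          # author -> list of followers
        self.home = defaultdict(list)       # user -> materialized timeline

    def write(self, author, tweet_id):
        for follower in self.followers.get(author, []):
            self.home[follower].append(tweet_id)

    def read(self, user):
        return list(self.home[user])


class FanoutOnRead:
    """O(1) write, O(n) read in the number of accounts followed."""

    def __init__(self, following):
        self.following = following          # user -> list of authors
        self.user_timeline = defaultdict(list)

    def write(self, author, tweet_id):
        self.user_timeline[author].append(tweet_id)

    def read(self, user):
        merged = []
        for author in self.following.get(user, []):
            merged.extend(self.user_timeline[author])
        return sorted(merged)               # assumes IDs are time-ordered
```

Both models return the same timeline for the same input. They differ only in where the work happens, which is why read-heavy traffic favors the first and accounts with very many followers favor the second.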
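
Slides 21 to 24 show a home timeline request expanding into a query such as `home_timeline:raffi OR user_timeline:taylorswift13`. One way to read this is that tweets from accounts with very many followers are merged in at read time instead of being fanned out. The sketch below builds such a query string. The function name, the threshold, and the follower counts in the usage example are illustrative assumptions, not values from the talk.

```python
def expand_home_timeline(user, following, follower_counts, threshold=1_000_000):
    """Build an OR query: the precomputed home timeline plus the user
    timelines of followed accounts at or above the follower threshold."""
    terms = [f"home_timeline:{user}"]
    for account in following:
        if follower_counts.get(account, 0) >= threshold:
            terms.append(f"user_timeline:{account}")
    return " OR ".join(terms)


# Usage with made-up follower counts:
query = expand_home_timeline(
    "raffi",
    ["nelson", "taylorswift13"],
    {"nelson": 5_000, "taylorswift13": 20_000_000},
)
```

With no high-follower accounts in the following list, the query reduces to the single `home_timeline:` term, which matches slides 21 and 23.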
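
The per-day figures on slides 18, 30, and 31 can be cross-checked with simple arithmetic. The sketch below converts the quoted daily totals into per-second and per-minute rates. The average fanout per tweet is a derived number, not one stated on the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60

TWEETS_PER_DAY = 400_000_000         # slides 18 and 30
DELIVERIES_PER_DAY = 30_000_000_000  # slide 31

tweets_per_sec = TWEETS_PER_DAY / SECONDS_PER_DAY          # about 4,630
deliveries_per_min = DELIVERIES_PER_DAY / MINUTES_PER_DAY  # about 20.8 million
deliveries_per_sec = DELIVERIES_PER_DAY / SECONDS_PER_DAY  # about 347,000

# Derived: average number of timeline deliveries caused by one tweet.
avg_fanout = DELIVERIES_PER_DAY / TWEETS_PER_DAY           # 75.0
```

These values line up with the rounded numbers on the slides: roughly 4600 tweets a second, about 21m deliveries a minute, and a delivery rate on the order of 300k a second.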
