Timelines at scale

Raffi Krikorian explains the architecture used by Twitter to deal with 300K queries per second - tweets, social graph mutations, and direct messages

Published in: Technology

Transcript

  • 1. timelines at scale, @raffi, QCon SF 2012
  • 2. [Grid: pull vs. push, targeted vs. queried] Targeted pull: twitter.com, home_timeline API. Targeted push: User / Site Streams, Mobile Push (SMS, etc.). Queried pull: Search API. Queried push: Track / Follow Streams.
  • 3. the challenge ⇢ >150M worldwide active users ⇢ >300K QPS for timelines ⇢ naïve timeline “materialization” can be slow
  • 4. [Architecture diagram: Write API, Ingester, Fanout, SearchCache, Earlybird, Blender, TimelineCache, Redis, Timeline Service, PushCompute, HTTP Push, Mobile Push, BatchCompute, Hadoop]
  • 5. [Architecture diagram: Write API, Ingester, Fanout, SearchCache, Earlybird, Blender, TimelineCache, Redis, Timeline Service, PushCompute, HTTP Push, Mobile Push, BatchCompute, Hadoop, Social Graph Service]
  • 6. [Same architecture diagram as slide 5, including Social Graph Service] insert ⇢ keyed off “recipient” ⇢ pipelined 4k “destinations” at a time ⇢ replicated
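The insert step on slide 6 can be sketched roughly as follows. This is a minimal in-memory illustration, not Twitter's implementation: the function names (`batches`, `fanout`) and the dict standing in for the timeline cache are hypothetical, and replication is left out. Only the "keyed off recipient" idea and the 4k batch size come from the slide.

```python
from collections import defaultdict
from itertools import islice
from typing import Dict, Iterable, Iterator, List

BATCH_SIZE = 4000  # "4k destinations at a time", taken from the slide


def batches(items: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def fanout(tweet_id: int,
           follower_ids: Iterable[int],
           timelines: Dict[int, List[int]],
           batch_size: int = BATCH_SIZE) -> int:
    """Append tweet_id to every recipient's timeline, one batch at a time.

    `timelines` is a hypothetical stand-in for the timeline cache, keyed by
    recipient id. Returns the number of batches that were processed.
    """
    batch_count = 0
    for chunk in batches(follower_ids, batch_size):
        # In a real system each chunk would be one pipelined round trip.
        for recipient in chunk:
            timelines[recipient].append(tweet_id)
        batch_count += 1
    return batch_count


timelines: Dict[int, List[int]] = defaultdict(list)
batch_count = fanout(42, range(10_000), timelines)
```

With 10,000 followers and a batch size of 4,000, the loop runs three batches (4,000 + 4,000 + 2,000).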
  • 7. [Same architecture diagram as slide 4] using redis ⇢ native list structure. Entry layout: Tweet ID (8 bytes), User ID (8 bytes), Bits (4 bytes)
  • 8. [Same architecture diagram as slide 4] using redis ⇢ native list structure ⇢ RPUSHX to only add to cached timelines. [Diagram: a list of entries, most showing Tweet ID, User ID, and Bits, the last few showing Tweet ID only]
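To make slides 7 and 8 concrete, here is a small sketch of a 20-byte timeline entry and of the RPUSHX behaviour (append only when the key already exists). It runs against a plain dict rather than a Redis server. The field order and byte order in the struct, the helper names, and the `home_timeline:raffi` cache key are assumptions for illustration; the slides give only the field sizes and the command name.

```python
import struct
from typing import Dict, List

# Tweet ID (8 bytes), User ID (8 bytes), Bits (4 bytes) = 20 bytes per entry.
# Big-endian and this field order are assumptions; the slide gives sizes only.
ENTRY = struct.Struct(">QQI")


def pack_entry(tweet_id: int, user_id: int, bits: int = 0) -> bytes:
    """Serialize one timeline entry into 20 bytes."""
    return ENTRY.pack(tweet_id, user_id, bits)


def rpushx(cache: Dict[str, List[bytes]], key: str, value: bytes) -> int:
    """Mimic Redis RPUSHX on a dict of lists.

    Appends only if `key` is already present and returns the new list
    length; returns 0 and creates nothing when the key is absent.
    """
    if key not in cache:
        return 0
    cache[key].append(value)
    return len(cache[key])


# Only @raffi's timeline is currently cached (hypothetical key name).
cache: Dict[str, List[bytes]] = {"home_timeline:raffi": [pack_entry(1, 10)]}

cached_len = rpushx(cache, "home_timeline:raffi", pack_entry(2, 11))
missing_len = rpushx(cache, "home_timeline:nobody", pack_entry(2, 11))
```

The second call is a no-op, which is the point of RPUSHX here: timelines that are not in the cache are not materialized by a write.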
  • 9. [Same architecture diagram as slide 4]
  • 10. [Diagram: Write API, Fanout, TimelineCache, Redis, Timeline Service, TweetyPie, Gizmoduck]
  • 11. [Grid: pull vs. push, targeted vs. queried] Targeted pull: twitter.com, home_timeline API. Targeted push: User / Site Streams, Mobile Push (SMS, etc.). Queried pull: Search API. Queried push: Track / Follow Streams.
  • 12. [Same architecture diagram as slide 4]
  • 13. [Architecture diagram: Write API, Ingester, Fanout, SearchIndex, Earlybird, Blender, TimelineCache, Redis, Timeline Service, PushCompute, HTTP Push, Mobile Push, BatchCompute, Hadoop] blender ⇢ queries one replica of all indexes ⇢ merges & ranks results
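The blender bullets on slide 13 describe a scatter-gather pattern, which might look roughly like the sketch below. It assumes that "one replica of all indexes" means one replica per index partition, that each replica returns (score, tweet id) pairs, and that ranking is a plain sort by score. The names `blend` and `make_replica` are hypothetical, and replica selection here is simply "take the first one".

```python
import heapq
from typing import Callable, Dict, List, Sequence, Tuple

Hit = Tuple[float, int]                 # (score, tweet_id)
Replica = Callable[[str], List[Hit]]    # a replica answers a query with hits


def make_replica(index: Dict[str, List[Hit]]) -> Replica:
    """Build a toy replica backed by a term -> hits dictionary."""
    return lambda query: index.get(query, [])


def blend(query: str,
          partitions: Sequence[Sequence[Replica]],
          limit: int = 10) -> List[Hit]:
    """Query one replica of every partition, then merge and rank the hits."""
    hits: List[Hit] = []
    for replicas in partitions:
        replica = replicas[0]  # assumed policy; the slide does not say which
        hits.extend(replica(query))
    # Rank by score, highest first, and keep the top `limit` results.
    return heapq.nlargest(limit, hits)


partition_a = [make_replica({"hello": [(0.9, 1), (0.2, 2)]})]
partition_b = [make_replica({"hello": [(0.5, 3)]}), make_replica({})]

top = blend("hello", [partition_a, partition_b], limit=2)
```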
  • 14. [Architecture diagram: Write API, Ingester, Fanout, SearchIndex, Earlybird, Blender, TimelineCache, Redis, Timeline Service, PushCompute, HTTP Push, Mobile Push, BatchCompute, Hadoop]
  • 15. [Diagram comparing the two paths] Write API → Redis ⇢ O(n) write; Write API → Earlybird ⇢ O(1) write. Read API → Redis ⇢ O(1) read; Read API → Earlybird ⇢ O(n) read
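The trade-off on slide 15 can be illustrated with two toy timeline stores. This is a sketch under assumptions: the class names are hypothetical, plain dicts stand in for Redis and Earlybird, and "n" is read as the number of followers on the write side and the number of followed accounts on the read side, which the slide does not spell out.

```python
from collections import defaultdict
from typing import Dict, List, Set


class FanoutOnWrite:
    """Materialize home timelines at write time: O(n) write, O(1) read."""

    def __init__(self, followers: Dict[str, Set[str]]) -> None:
        self.followers = followers
        self.home: Dict[str, List[int]] = defaultdict(list)

    def write(self, author: str, tweet_id: int) -> None:
        # One insert per follower of the author.
        for follower in self.followers.get(author, ()):
            self.home[follower].append(tweet_id)

    def read(self, user: str) -> List[int]:
        # A single lookup of the precomputed timeline.
        return list(self.home.get(user, []))


class FanoutOnRead:
    """Store each tweet once and assemble at read time: O(1) write, O(n) read."""

    def __init__(self, following: Dict[str, Set[str]]) -> None:
        self.following = following
        self.by_author: Dict[str, List[int]] = defaultdict(list)

    def write(self, author: str, tweet_id: int) -> None:
        self.by_author[author].append(tweet_id)

    def read(self, user: str) -> List[int]:
        # One lookup per followed account, then a merge.
        ids: List[int] = []
        for author in self.following.get(user, ()):
            ids.extend(self.by_author.get(author, []))
        return sorted(ids)


push = FanoutOnWrite({"nelson": {"raffi"}, "danadanger": {"raffi"}})
pull = FanoutOnRead({"raffi": {"nelson", "danadanger"}})
for author, tweet_id in [("nelson", 1), ("danadanger", 2), ("nelson", 3)]:
    push.write(author, tweet_id)
    pull.write(author, tweet_id)
```

Both stores return the same timeline for @raffi; they differ only in where the work happens.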
  • 16. the challenge (part #2) ⇢ fanout can be really slow! ⇢ ...especially for high follower counts
  • 17. @barackobama: 23 million followers; @ladygaga: 31 million followers; @katyperry: 28 million followers; @justinbieber: 28 million followers; @raffi: 0.019 million followers
  • 18. there are over 400 million tweets a day
  • 19. ≈ 4600 tweets a second ≈ 0.2 ms a tweet
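The conversion from slide 18 to slide 19 is simple arithmetic, reproduced here as a quick check. The input is the slide's "over 400 million tweets a day", treated as exactly 400 million; the helper name `per_second` is just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day: float) -> float:
    """Convert a daily count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


tweets_per_second = per_second(400_000_000)
ms_per_tweet = 1000 / tweets_per_second  # average gap between tweets

print(f"{tweets_per_second:.0f} tweets/s, {ms_per_tweet:.2f} ms per tweet")
# prints "4630 tweets/s, 0.22 ms per tweet"
```

That matches the slide's rounded figures of about 4600 tweets a second and about 0.2 ms a tweet.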
  • 20. [Diagram: Write API, Ingester, Fanout, SearchIndex, Earlybird, TimelineCache, Redis] search index ⇢ [‘hello’, ‘world’]; fanout index ⇢ [@danadanger, ...]
  • 21. User intent → query expansion. “Hello, world” → “Hello” AND “world”; @raffi’s home timeline → home_timeline:raffi
  • 22. User intent → query expansion. “Hello, world” → “Hello” AND “world”; @raffi’s home timeline → user_timeline:nelson OR user_timeline:danadanger
  • 23. User intent → query expansion. “Hello, world” → “Hello” AND “world”; @raffi’s home timeline → home_timeline:raffi
  • 24. User intent → query expansion. “Hello, world” → “Hello” AND “world”; @raffi’s home timeline → home_timeline:raffi OR user_timeline:taylorswift13
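The expansion on slide 24 combines a precomputed home timeline with the user timeline of one followed account. A rough sketch of how such a query could be built and evaluated is below. It assumes that accounts are split off because of their follower count (suggested by slides 16 and 17 but not stated on this slide), that a dict keyed by the query terms stands in for the stores, and that larger tweet IDs are newer. The function names and the example IDs are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def expand_query(user: str,
                 following: Set[str],
                 high_follower: Set[str]) -> List[str]:
    """Return the terms whose union forms the user's home timeline."""
    terms = [f"home_timeline:{user}"]
    # Followed accounts that are not fanned out are merged in at read time.
    terms += [f"user_timeline:{a}" for a in sorted(following & high_follower)]
    return terms


def read_timeline(terms: Iterable[str],
                  store: Dict[str, List[int]],
                  limit: int = 20) -> List[int]:
    """Evaluate the OR of all terms: union the tweet IDs, newest first."""
    ids: Set[int] = set()
    for term in terms:
        ids.update(store.get(term, []))
    return sorted(ids, reverse=True)[:limit]


terms = expand_query("raffi",
                     following={"nelson", "taylorswift13"},
                     high_follower={"taylorswift13"})
query = " OR ".join(terms)

store = {
    "home_timeline:raffi": [101, 103],
    "user_timeline:taylorswift13": [102, 104],
}
timeline = read_timeline(terms, store)
```

Here `query` reproduces the string shown on the slide, and `timeline` interleaves the two sources by tweet ID.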
  • 25. [Architecture diagram: Write API, Ingester, Fanout, SearchIndex, Earlybird, Blender, TimelineCache, Redis, Timeline Service, PushCompute, HTTP Push, Mobile Push, BatchCompute, Hadoop]
  • 26. [Same architecture diagram as slide 25, annotated with the Synchronous Path, Asynchronous Path, and Query Path]
  • 27. [Same architecture diagram as slide 25, annotated with the Synchronous Path, Asynchronous Path, and Query Path]
  • 28. [Same architecture diagram as slide 25, annotated with the Synchronous Path, Asynchronous Path, and Query Path]
  • 29. timeline query statistics ⇢ >150m active users worldwide ⇢ >300k qps poll-based timelines @ 1ms p50 / 4ms p99 ⇢ >30k qps search-based timelines
  • 30. tweet input ⇢ ~400m tweets per day ⇢ ~5K/sec daily average ⇢ ~7K/sec daily peak ⇢ >12K/sec during large events
  • 31. timeline delivery statistics ⇢ 30b deliveries / day (~21m / min) ⇢ 3.5 seconds @ p50 to deliver to 1m ⇢ ~300k deliveries / sec
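The delivery figures on slide 31 can be cross-checked against the tweet volume on slide 30 with a few divisions. The function name is hypothetical, and the average-deliveries-per-tweet value is derived here rather than stated on any slide.

```python
from typing import Dict


def delivery_rates(deliveries_per_day: float,
                   tweets_per_day: float) -> Dict[str, float]:
    """Break a daily delivery count down into smaller units."""
    return {
        "per_minute": deliveries_per_day / (24 * 60),
        "per_second": deliveries_per_day / (24 * 60 * 60),
        # Derived figure, not on the slides: deliveries per tweet on average.
        "avg_deliveries_per_tweet": deliveries_per_day / tweets_per_day,
    }


rates = delivery_rates(30_000_000_000, 400_000_000)
for name, value in rates.items():
    print(f"{name}: {value:,.0f}")
```

The per-minute result is about 20.8 million, in line with the slide's ~21m / min. The per-second result is about 347,000, somewhat above the slide's rounded ~300k. On these inputs each tweet is delivered to 75 timelines on average.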
  • 32. thanks!
