Raffi Krikorian, Twitter Timelines at Scale

Talk by Raffi Krikorian, Twitter's VP of Engineering, on the infrastructure behind Twitter's timelines

Published in: Technology

  1. timelines at scale @raffi qcon sf 2012
  2. Targeted, pull: twitter.com, home_timeline API. Targeted, push: User / Site Streams, Mobile Push (SMS, etc.). Queried, pull: Search API. Queried, push: Track / Follow Streams
  3. the challenge ⇢>150M worldwide active users ⇢>300K QPS for timelines ⇢naïve timeline “materialization” can be slow
  4. Timeline Service Ingester SearchCache Redis Redis Earlybird Blender PushCompute HTTP Push Mobile Push BatchCompute Hadoop Write API Fanout Redis Redis TimelineCache Redis
  5. Timeline Service Ingester SearchCache Redis Redis Earlybird Blender PushCompute HTTP Push Mobile Push BatchCompute Hadoop Write API Fanout Redis Redis TimelineCache Redis Social Graph Service
  6. Timeline Service Ingester SearchCache Redis Redis Earlybird Blender PushCompute HTTP Push Mobile Push BatchCompute Hadoop Write API Fanout Redis Redis TimelineCache Redis Social Graph Service insert ⇢keyed off “recipient” ⇢pipelined 4k “destinations” at a time ⇢replicated
  7. Timeline Service Ingester SearchCache Redis Redis Earlybird Blender PushCompute HTTP Push Mobile Push BatchCompute Hadoop Write API Fanout Redis Redis TimelineCache Redis using redis ⇢native list structure ⇢each entry: Tweet ID (8 bytes), User ID (8 bytes), Bits (4 bytes)
  8. Timeline Service Ingester SearchCache Redis Redis Earlybird Blender PushCompute HTTP Push Mobile Push BatchCompute Hadoop Write API Fanout Redis Redis TimelineCache Redis using redis ⇢native list structure ⇢RPUSHX to only add to cached timelines ⇢each list is a sequence of (Tweet ID, User ID, Bits) entries
  9. Ingester SearchCache Redis Redis Earlybird Blender PushCompute HTTP Push Mobile Push BatchCompute Hadoop Write API Fanout Redis Redis TimelineCache Timeline Service Redis
  10. Timeline Service Write API Fanout Redis Redis TimelineCache Redis TweetyPie Gizmoduck
  11. Targeted, pull: twitter.com, home_timeline API. Targeted, push: User / Site Streams, Mobile Push (SMS, etc.). Queried, pull: Search API. Queried, push: Track / Follow Streams
  12. Ingester SearchCache Redis Redis Earlybird Blender PushCompute HTTP Push Mobile Push BatchCompute Hadoop Write API Fanout Redis Redis TimelineCache Timeline Service Redis
  13. PushCompute HTTP Push Mobile Push BatchCompute Hadoop SearchIndex Blender Redis Timeline Service Ingester Earlybird Write API Fanout Redis Redis TimelineCache Redis Earlybird blender ⇢queries one replica of all indexes ⇢merges & ranks results
  14. PushCompute HTTP Push Mobile Push BatchCompute Hadoop SearchIndex Blender Redis Timeline Service Ingester Earlybird Write API Fanout Redis Redis TimelineCache Redis Earlybird
  15. Timelines: Write API, Redis Redis Redis, Read API, API Cache ⇢O(n) write ⇢O(1) read. Search: Write API, Earlybird Earlybird Earlybird, Read API, API Cache ⇢O(1) write ⇢O(n) read
  16. the challenge (part #2) ⇢fanout can be really slow! ⇢...especially for high follower counts
  17. @ladygaga 31 million followers, @justinbieber 28 million followers, @katyperry 28 million followers, @barackobama 23 million followers, @raffi 0.019 million followers
  18. there are over 400 million tweets a day
  19. ≈ 4600 tweets a second ≈ 0.2 ms a tweet
  20. Write API Ingester Fanout SearchIndex Redis Earlybird Earlybird Redis Redis Redis TimelineCache search index ⇢[‘hello’, ‘world’] fanout index ⇢[@danadanger, ...]
  21. User intent: “Hello, world” ⇢query expansion: “Hello” AND “world”. User intent: @raffi’s home timeline ⇢query expansion: home_timeline:raffi
  22. User intent: “Hello, world” ⇢query expansion: “Hello” AND “world”. User intent: @raffi’s home timeline ⇢query expansion: user_timeline:nelson OR user_timeline:danadanger
  23. User intent: “Hello, world” ⇢query expansion: “Hello” AND “world”. User intent: @raffi’s home timeline ⇢query expansion: home_timeline:raffi
  24. User intent: “Hello, world” ⇢query expansion: “Hello” AND “world”. User intent: @raffi’s home timeline ⇢query expansion: home_timeline:raffi OR user_timeline:taylorswift13
  25. BatchCompute Hadoop PushCompute HTTP Push SearchIndex Blender Redis Timeline Service Ingester Earlybird Write API Fanout Redis Redis TimelineCache Redis Earlybird Mobile Push
  26. Asynchronous Path Query Path BatchCompute Hadoop Synchronous Path PushCompute HTTP Push SearchIndex Blender Redis Timeline Service Ingester Earlybird Write API Fanout Redis Redis TimelineCache Redis Earlybird Mobile Push
  27. Synchronous Path Query Path BatchCompute Hadoop Asynchronous Path PushCompute HTTP Push SearchIndex Blender Redis Timeline Service Ingester Earlybird Write API Fanout Redis Redis TimelineCache Redis Earlybird Mobile Push
  28. Asynchronous Path Synchronous Path BatchCompute Hadoop Query Path PushCompute HTTP Push SearchIndex Blender Redis Timeline Service Ingester Earlybird Write API Fanout Redis Redis TimelineCache Redis Earlybird Mobile Push
  29. timeline query statistics ⇢>150m active users worldwide ⇢>300k qps poll-based timelines @ 1ms p50 / 4ms p99 ⇢>30k qps search-based timelines
  30. tweet input ⇢~400m tweets per day ⇢~5K/sec daily average ⇢~7K/sec daily peak ⇢>12K/sec during large events
  31. timeline delivery statistics ⇢30b deliveries / day (~21m / min) ⇢3.5 seconds @ p50 to deliver to 1m ⇢~300k deliveries / sec
  32. thanks!
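
Slide 6 describes the fanout insert as keyed off the recipient and pipelined 4k destinations at a time. The following is a minimal sketch of that batching idea only, not Twitter's implementation: the names `batched` and `fanout`, the exact batch size of 4000, and the plain dict standing in for the timeline cache are all assumptions made for illustration. Replication is not modeled.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List

BATCH_SIZE = 4000  # assumed reading of "4k destinations at a time"


def batched(follower_ids: Iterable[int], size: int = BATCH_SIZE) -> Iterator[List[int]]:
    """Yield successive lists of at most `size` follower IDs."""
    it = iter(follower_ids)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def fanout(tweet_id: int, follower_ids: Iterable[int],
           timelines: Dict[int, List[int]]) -> int:
    """Append tweet_id to every recipient's timeline, one batch at a time.

    `timelines` is a hypothetical in-memory stand-in for the cache,
    keyed by recipient ID. Returns the number of batches processed.
    """
    batches = 0
    for batch in batched(follower_ids):
        # In a networked setup, each batch would be one pipelined request.
        for recipient in batch:
            timelines.setdefault(recipient, []).append(tweet_id)
        batches += 1
    return batches
```

With 10,000 followers this sketch processes three batches (4000, 4000, 2000), so the number of round trips grows with follower count divided by the batch size.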
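
Slide 7 shows each timeline entry as a tweet ID (8 bytes), a user ID (8 bytes), and a bits field (4 bytes). A fixed-width record like that could be packed as sketched below. The field order, the big-endian byte order, and the function names are assumptions; the slide only gives the field sizes.

```python
import struct

# Assumed layout: big-endian, no padding.
# Q = unsigned 64-bit (tweet ID), Q = unsigned 64-bit (user ID), I = unsigned 32-bit (bits).
ENTRY = struct.Struct(">QQI")


def encode_entry(tweet_id: int, user_id: int, bits: int = 0) -> bytes:
    """Pack one timeline entry into a fixed-width byte string."""
    return ENTRY.pack(tweet_id, user_id, bits)


def decode_entry(raw: bytes) -> tuple:
    """Unpack a byte string back into (tweet_id, user_id, bits)."""
    return ENTRY.unpack(raw)
```

Under these assumptions each entry occupies `ENTRY.size` bytes, which is 8 + 8 + 4 = 20.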
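
Slide 8 mentions RPUSHX "to only add to cached timelines". The Redis RPUSHX command appends to a list only if the key already exists and returns the new list length, or 0 if nothing was appended. The sketch below imitates that behaviour in memory so it can be run without a Redis server; the class `TimelineCache` and the key format are hypothetical.

```python
from typing import Dict, List


class TimelineCache:
    """In-memory imitation of the two Redis list commands used here."""

    def __init__(self) -> None:
        self._lists: Dict[str, List[int]] = {}

    def rpush(self, key: str, value: int) -> int:
        """Append, creating the list if needed. Returns the new length."""
        lst = self._lists.setdefault(key, [])
        lst.append(value)
        return len(lst)

    def rpushx(self, key: str, value: int) -> int:
        """Append only if the list already exists. Returns the new length, or 0."""
        lst = self._lists.get(key)
        if lst is None:
            return 0  # timeline is not cached, so the write is skipped
        lst.append(value)
        return len(lst)

    def get(self, key: str) -> List[int]:
        """Return a copy of the list, or an empty list if the key is absent."""
        return list(self._lists.get(key, []))
```

The effect is that fanout never creates a timeline for a user whose timeline is not currently cached, so such a write costs nothing beyond the existence check.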
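
Slide 13 says the blender queries one replica of all indexes, then merges and ranks the results. A scatter-gather step of that shape might look like the sketch below. Everything specific here is an assumption: replicas are modeled as plain callables, each returns `(score, tweet_id)` pairs already sorted by descending score, and ranking is by that score alone.

```python
import heapq
import random
from itertools import islice
from typing import Callable, List, Sequence, Tuple

Hit = Tuple[float, int]                 # (score, tweet_id)
Replica = Callable[[str], List[Hit]]    # hypothetical index replica


def blend(shards: Sequence[Sequence[Replica]], query: str, k: int = 10) -> List[int]:
    """Query one replica per index shard, then merge and keep the top k tweet IDs."""
    partials = []
    for replicas in shards:
        replica = random.choice(replicas)   # any one replica of this shard
        partials.append(replica(query))     # hits sorted by descending score
    merged = heapq.merge(*partials, key=lambda hit: hit[0], reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, k)]
```

Because every shard must be consulted, the read cost grows with the number of index shards, which matches the O(n) read noted for the search path on slide 15.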
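
Slide 24 expands @raffi's home timeline into `home_timeline:raffi OR user_timeline:taylorswift13`, which suggests merging the precomputed home timeline with the user timelines of high-follower accounts at read time. The sketch below illustrates that merge under stated assumptions: timelines are lists of tweet IDs sorted newest first, larger IDs are newer, and the dict arguments and the name `read_home_timeline` are hypothetical.

```python
import heapq
from itertools import islice
from typing import Dict, List


def read_home_timeline(user: str,
                       home_timelines: Dict[str, List[int]],
                       user_timelines: Dict[str, List[int]],
                       merged_at_read: Dict[str, List[str]],
                       limit: int = 20) -> List[int]:
    """Merge the materialized home timeline with selected user timelines.

    `merged_at_read[user]` lists followed accounts whose tweets were not
    fanned out and must therefore be pulled in at read time.
    """
    sources = [home_timelines.get(user, [])]
    for account in merged_at_read.get(user, []):
        sources.append(user_timelines.get(account, []))
    # Every source is sorted newest first, so a reverse merge keeps that order.
    merged = heapq.merge(*sources, reverse=True)
    return list(islice(merged, limit))
```

For example, merging a home timeline `[90, 70, 50]` with a user timeline `[80, 60]` yields `[90, 80, 70, 60, 50]`. The trade-off is a slightly more expensive read in exchange for skipping a fanout to millions of followers.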
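
The rates on slides 18, 19, 30, and 31 can be cross-checked with simple arithmetic. The sketch below takes only the daily totals from the slides as input; the average fanout per tweet is a derived figure that does not appear in the deck.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
MINUTES_PER_DAY = 24 * 60        # 1,440

tweets_per_day = 400_000_000          # slides 18 and 30
deliveries_per_day = 30_000_000_000   # slide 31

tweets_per_sec = tweets_per_day / SECONDS_PER_DAY           # about 4,630
ms_per_tweet = 1000 / tweets_per_sec                        # about 0.22 ms
deliveries_per_min = deliveries_per_day / MINUTES_PER_DAY   # about 20.8 million
deliveries_per_sec = deliveries_per_day / SECONDS_PER_DAY   # about 347,000
avg_fanout = deliveries_per_day / tweets_per_day            # 75.0 (derived)
```

These values are consistent with the rounded figures on the slides: roughly 4600 tweets a second, about 21m deliveries a minute, and on the order of 300k deliveries a second.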
