Socialite, the Open Source Status Feed Part 3: Scaling the Data Feed
 

Scaling the delivery of posts and content to the follower networks of millions of users presents many challenges. In this section we look at the various approaches to fanning out posts and compare their performance. We highlight some tricks for caching the recent timelines of active users to drive down read latency, and we look at overall performance metrics from Socialite as we scale from a single replica set to a large sharded environment using MMS Automation.


  • For a social platform to store and deliver streaming timelines over long periods of time, careful attention must be paid to the way content is stored. We provide a detailed look into storing an infinite timeline of data while optimizing indexing and sharding configuration for accessing the most recent window of data. We will also look at some overall performance metrics from Socialite as we scale from a single replica set to a large sharded environment.
  • Image of the hat at https://dropwizard.github.io/dropwizard
  • BRUTAL!!!
  • Variants?
  • Should you embed the messages/content into the "cache"/buckets/etc., or just store references?
  • Which one did we implement in Socialite? All of them work with the async service (? or mention later). And we did benchmark them! -> Asya
  • Examining latency of reading content by fanout type; note two types of latency, one for the sender and one for the recipient. Scaling throughput: this will not scale linearly(!). *Rerun with several shards*; replace with new screenshot.
  • MongoDB as a cache. Storage amplification on a feed service: Justin Bieber makes a single post and we need to write it to 2 million timelines... Cache only for active users. Number of updates across all caches / number of documents updated.
  • Some kind of wrap-up.
  • Image of the hat at https://dropwizard.github.io/dropwizard

Socialite, the Open Source Status Feed Part 3: Scaling the Data Feed (Presentation Transcript)

  • Building a Social Platform Part 3: Scaling the Data Feed
  • Socialite • Reference Implementation – Various Fanout Feed Models – User Graph Implementation – Content storage • Configurable models and options • REST API in Dropwizard (Yammer) – https://dropwizard.github.io/dropwizard/ • Built-in benchmarking https://github.com/10gen-labs/socialite
  • Architecture (diagram): GraphServiceProxy, ContentProxy
  • Feed Service • Two main functions : – Aggregating “followed” content for a user – Forwarding user’s content to “followers” • Common implementation models : – Fanout on read • Query content of all followed users on fly – Fanout on write • Add to “cache” of each user’s timeline for every post • Various storage models for the timeline
  • Fanout On Read
  • Fanout On Read Pros Simple implementation No extra storage for timelines Cons – Timeline reads (typically) hit all shards – Often involves reading more data than required – May require additional indexing on Content
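As a rough illustration of the fanout-on-read model, the mongo-shell sketch below queries the content of all followed users on the fly. It is a minimal sketch, not Socialite's code: the following collection, its _f/_t field names, and the 50-post page size are assumptions made for this example; the content fields (_a for author, _m for message) follow the sample documents on the later slides.

    // Minimal fanout-on-read sketch (hypothetical graph collection and field names).
    // 1. Find everyone the reading user follows.
    var followedAuthors = db.following.find(
        { "_f": "jsr" },             // _f = follower (assumed field name)
        { "_t": 1, "_id": 0 }        // _t = followed user (assumed field name)
    ).map(function(edge) { return edge._t; });

    // 2. Pull their most recent posts straight from the content collection.
    //    On a sharded content collection this is typically a scatter-gather query.
    db.content.find({ "_a": { $in: followedAuthors } })
              .sort({ "_id": -1 })   // ObjectIds roughly order by insertion time
              .limit(50);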
  • Fanout On Write
  • Fanout On Write Pros Timeline can be single document read Dormant users easily excluded Working set minimized Cons – Fanout for large follower lists can be expensive – Additional storage for materialized timelines
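The fanout-on-write model can be sketched the same way. This is a simplified, synchronous illustration, not Socialite's implementation (Socialite routes these writes through its pluggable feed services, and the speaker notes mention an async service); the followers collection and its _t/_f field names are assumptions, and the cache trimming shown on the later cache slide is omitted here.

    // Minimal fanout-on-write sketch (hypothetical graph collection and field names).
    var post = { "_id": new ObjectId(), "_a": "djw", "_m": "message from daz" };
    db.content.insert(post);                      // 1. store the post once

    // 2. Push the post onto each follower's materialized timeline.
    db.followers.find({ "_t": "djw" }).forEach(function(edge) {
        db.timeline_cache.update(
            { "_u": edge._f },                    // one timeline document per follower
            { $push: { "_c": post } },
            { upsert: true }
        );
    });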
  • Fanout On Write • Three different approaches – Time buckets – Size buckets – Cache • Each has different pros & cons
  • Timeline Buckets - Time Upsert to time range buckets for each user > db.timed_buckets.find().pretty() { "_id" : {"_u" : "jsr", "_t" : 516935}, "_c" : [ {"_id" : ObjectId("...dc1"), "_a" : "djw", "_m" : "message from daz"}, {"_id" : ObjectId("...dd2"), "_a" : "ian", "_m" : "message from ian"} ] } { "_id" : {"_u" : "ian", "_t" : 516935}, "_c" : [ {"_id" : ObjectId("...dc1"), "_a" : "djw", "_m" : "message from daz"} ] } { "_id" : {"_u" : "jsr", "_t" : 516934 }, "_c" : [ {"_id" : ObjectId("...da7"), "_a" : "ian", "_m" : "earlier from ian"} ] }
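A sketch of the write path behind those documents: each post is pushed into a bucket keyed by user and time slot, and the upsert creates the bucket on demand. The hour-sized slot (and the helper that computes it) is an assumption for this example; the document shape follows the sample output above.

    // Sketch of the time-bucket upsert; the hour granularity is an assumption.
    function timeSlot() { return Math.floor(Date.now() / (1000 * 60 * 60)); }

    db.timed_buckets.update(
        { "_id": { "_u": "jsr", "_t": timeSlot() } },
        { $push: { "_c": { "_id": new ObjectId(), "_a": "djw", "_m": "message from daz" } } },
        { upsert: true }
    );

    // Reading a timeline walks the user's buckets newest-first until enough posts are found.
    db.timed_buckets.find({ "_id._u": "jsr" }).sort({ "_id._t": -1 }).limit(2);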
  • Timeline Buckets - Size More complex, but more consistently sized > db.sized_buckets.find().pretty() { "_id" : ObjectId("...122"), "_c" : [ {"_id" : ObjectId("...dc1"), "_a" : "djw", "_m" : "message from daz"}, {"_id" : ObjectId("...dd2"), "_a" : "ian", "_m" : "message from ian"}, {"_id" : ObjectId("...da7"), "_a" : "ian", "_m" : "earlier from ian"} ], "_s" : 3, "_u" : "jsr" } { "_id" : ObjectId("...011"), "_c" : [ {"_id" : ObjectId("...dc1"), "_a" : "djw", "_m" : "message from daz"} ], "_s" : 1, "_u" : "ian" }
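The size-bucket variant can be sketched as: push into a bucket that still has room, and start a new one when none does. The cap of three posts per bucket just mirrors the sample documents, and the update-then-insert logic (with no handling of concurrent writers) is illustrative only, not Socialite's implementation.

    // Sketch of a size-bucket write; the cap of 3 mirrors the sample documents above.
    var entry = { "_id": new ObjectId(), "_a": "ian", "_m": "message from ian" };

    var res = db.sized_buckets.update(
        { "_u": "jsr", "_s": { $lt: 3 } },             // a bucket that still has room
        { $push: { "_c": entry }, $inc: { "_s": 1 } }
    );

    if (res.nModified === 0) {
        // No open bucket for this user: start a new one.
        db.sized_buckets.insert({ "_u": "jsr", "_c": [ entry ], "_s": 1 });
    }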
  • Timeline - Cache Store a limited cache, fall back to fanout on read – Create single cache doc on demand with upsert – Limit size of cache with $slice – Timeout docs with TTL for inactive users > db.timeline_cache.find().pretty() { "_c" : [ {"_id" : ObjectId("...dc1"), "_a" : "djw", "_m" : "message from daz"}, {"_id" : ObjectId("...dd2"), "_a" : "ian", "_m" : "message from ian"}, {"_id" : ObjectId("...da7"), "_a" : "ian", "_m" : "earlier from ian"} ], "_u" : "jsr" } { "_c" : [ {"_id" : ObjectId("...dc1"), "_a" : "djw", "_m" : "message from daz"} ], "_u" : "ian" }
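The mechanics behind those bullet points can be sketched in the shell: one upsert creates the cache document on demand, $each/$slice caps its size, and a TTL index expires the documents of inactive users. The 50-entry cap, the _d last-active field, and the 30-day expiry are illustrative values, not Socialite's configuration.

    // Sketch of the capped timeline cache (cap, TTL window, and _d field are assumptions).
    db.timeline_cache.update(
        { "_u": "jsr" },
        {
            $push: { "_c": { $each: [ { "_id": new ObjectId(), "_a": "djw", "_m": "message from daz" } ],
                             $slice: -50 } },      // keep only the newest 50 entries
            $currentDate: { "_d": true }           // touch a last-active timestamp
        },
        { upsert: true }
    );

    // Inactive users' cache documents expire; their next read falls back to fanout on read.
    db.timeline_cache.ensureIndex({ "_d": 1 }, { expireAfterSeconds: 60 * 60 * 24 * 30 });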
  • Embedding vs Linking Content Embedded content for direct access – Great when it is small, predictable in size Link to content, store only metadata – Read only desired content on demand – Further stabilizes cache document sizes > db.timeline_cache.findOne({"_id" : "jsr"}) { "_c" : [ {"_id" : ObjectId("...dc1")}, {"_id" : ObjectId("...dd2")}, {"_id" : ObjectId("...da7")} ], "_id" : "jsr" }
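When only references are cached, a timeline read becomes a two-step lookup, sketched below against the sample document above; the page size and sort order are illustrative.

    // Sketch of the linked-content read path: fetch the cached _ids, then the posts themselves.
    var cached = db.timeline_cache.findOne({ "_id": "jsr" });
    var postIds = cached._c.map(function(ref) { return ref._id; });

    db.content.find({ "_id": { $in: postIds } })
              .sort({ "_id": -1 })   // newest first
              .limit(50);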
  • Socialite Feed Service • Implemented four models as plugins – FanoutOnRead – FanoutOnWrite – Buckets (size) – FanoutOnWrite – Buckets (time) – FanoutOnWrite - Cache • Switchable by config • Store content by reference or value • Benchmark-able back to back
  • Benchmark by feed type
  • Benchmarking the Feed • Biggest challenge: scaling the feed • High cost of "fanout on write" • Popular user posts => # operations: – Content collection insert: 1 – Timeline Cache: on average, 130+ cache document updates • SCATTER GATHER (slowest shard determines latency)
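As a back-of-the-envelope illustration using the figures quoted on these slides: if every post averaged the 130+ timeline cache document updates cited here, the benchmark's send rate of roughly 1,000 posts per second would translate into on the order of 1,000 × 130 = 130,000 timeline document updates per second, against only about 1,000 content inserts per second, which is why the fanout-on-write cost and the scatter-gather latency dominate the scaling picture.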
  • Benchmarking the Feed • Timeline is different from content! – "It's a Cache" IT CAN BE REBUILT!
  • Benchmarking the Feed • MongoDB as a cache
  • Benchmarking the Feed • IT CAN BE REBUILT! Effect of removing the cache, forcing a fall-back to fanout on read, and rebuilding the cache
  • Benchmarking the Feed
  • Benchmarking the Feed
  • Benchmarking the Feed • Results – last two weeks – ran load with one million users – ran load with ten million users (currently running) – used avg send rate 1K/s; 2K/s; reads 10K-20k/s – 22 AWS c3.2xlarge servers (7.5GB RAM) – 18 across six shards (3 content, 3 user graph) – 4 mongos and app machines – 2 c2x4xlarge servers (30GB RAM) – timeline feed cache (six shards)
  • Summary
  • Socialite • Real Working Implementation – Implements All Components – Configurable models and options • Built-in benchmarking • Questions? – We will be at "Ask The Experts" this afternoon! https://github.com/10gen-labs/socialite
  • https://github.com/10gen-labs/socialite Thank You!