Socialite, the Open Source Status Feed Part 1: Design Overview and Scaling for Infinite Content

  • give intuitive sense of how the presentation will be structured...

  • News/Social Status Feed: popular and common

    Internal goals: implement different schema options, built-in benchmarking for comparison

    External goals: low latency from end-user perspective, linear scaling from operational perspective
  • image at https://dropwizard.github.io/dropwizard of the hat 

Transcript

  • 1. Building a Social Platform Part 1: Design Overview; Storing Infinite Content
  • 2. Solutions Engineering • Identify Popular Use Cases – Directly from MongoDB Users – Addressing "limitations" • Go beyond documentation and blogs • Create open source project • Run it!
  • 3. Social Status Feed
  • 4. Agenda • What is a status feed and why build it w/MongoDB • Application overview (goals, non-goals) • Architecture overview (arch diagram) • Operational overview (benchmarks, automation) • Describe components – Describe options • For each component – Options tried – Results – Option chosen
  • 5. Socialite • News/Social Status Feed: popular and common • Appears deceptively simple, but good performance requires solving many tricky problems • We created a reference implementation – Configurable models and options – Built-in benchmarking • Used this implementation to test out different options • This talk will summarize
  • 6. Status Feed
  • 7. Status Feed
  • 8. Socialite • Open Source • Reference Implementation – Various Fanout Feed Models – User Graph Implementation – Content storage • Configurable models and options • REST API in Dropwizard (Yammer) – https://dropwizard.github.io/dropwizard/ • Built-in benchmarking https://github.com/10gen-labs/socialite
  • 9. Architecture GraphServiceProxy ContentProxy
  • 10. Pluggable Services • Major components each have an interface – see com.mongodb.socialite.services • Configuration selects implementation to use • ServiceManager organizes: – Default implementations – Lifecycle – Binding configuration – Wiring dependencies – see com.mongodb.socialite.ServiceManager
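The pluggable-services idea on slide 10 can be illustrated with a minimal sketch. This is not the Socialite `ServiceManager` (which is Java); every class, key, and implementation name below is hypothetical and only shows the pattern of "configuration selects the implementation, with a default when nothing is configured".

```python
class InMemoryContentService:
    """Hypothetical implementation: would keep content in process memory."""
    kind = "memory"


class MongoContentService:
    """Hypothetical implementation: placeholder only, opens no DB connection."""
    kind = "mongo"


# Assumed registry: service name -> {implementation name -> class}
IMPLEMENTATIONS = {
    "content": {"memory": InMemoryContentService, "mongo": MongoContentService},
}
# Assumed default implementation for each service
DEFAULTS = {"content": "memory"}


class SketchServiceManager:
    """Picks an implementation per service from config and caches the instance."""

    def __init__(self, config):
        self._config = config
        self._instances = {}

    def get(self, service):
        # Create the service lazily on first use, then reuse the same instance.
        if service not in self._instances:
            impl_name = self._config.get(service, DEFAULTS[service])
            self._instances[service] = IMPLEMENTATIONS[service][impl_name]()
        return self._instances[service]
```

With an empty config the default is used; `SketchServiceManager({"content": "mongo"}).get("content")` would return the other implementation instead.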
  • 11. Simple Interface
    GET /users/{user_id} : Get a user by their ID
    DELETE /users/{user_id} : Remove a user by their ID
    POST /users/{user_id}/posts : Send a message from this user
    GET /users/{user_id}/followers : Get a list of followers of a user
    GET /users/{user_id}/followers_count : Get the number of followers of a user
    GET /users/{user_id}/following : Get the list of users this user is following
    GET /users/{user_id}/following_count : Get the number of users this user follows
    GET /users/{user_id}/posts : Get the messages sent by a user
    GET /users/{user_id}/timeline : Get the timeline for this user
    PUT /users/{user_id} : Create a new user
    PUT /users/{user_id}/following/{target} : Follow a user
    DELETE /users/{user_id}/following/{target} : Unfollow a user
    https://github.com/10gen-labs/socialite
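To make the follow-related endpoints on slide 11 concrete, here is a small in-memory model of their semantics. It is a sketch under assumptions, not Socialite code: the class `InMemoryGraph` and its method names are hypothetical, and no HTTP or MongoDB is involved. Each method is annotated with the endpoint it stands in for.

```python
class InMemoryGraph:
    """Toy user graph mirroring the follow/unfollow endpoints (hypothetical)."""

    def __init__(self):
        self._following = {}  # user_id -> set of users they follow
        self._followers = {}  # user_id -> set of users following them

    def follow(self, user_id, target):
        # PUT /users/{user_id}/following/{target}
        self._following.setdefault(user_id, set()).add(target)
        self._followers.setdefault(target, set()).add(user_id)

    def unfollow(self, user_id, target):
        # DELETE /users/{user_id}/following/{target}
        self._following.get(user_id, set()).discard(target)
        self._followers.get(target, set()).discard(user_id)

    def followers(self, user_id):
        # GET /users/{user_id}/followers
        return sorted(self._followers.get(user_id, set()))

    def followers_count(self, user_id):
        # GET /users/{user_id}/followers_count
        return len(self._followers.get(user_id, set()))

    def following(self, user_id):
        # GET /users/{user_id}/following
        return sorted(self._following.get(user_id, set()))

    def following_count(self, user_id):
        # GET /users/{user_id}/following_count
        return len(self._following.get(user_id, set()))
```

The model stores both directions of each edge so that followers and following lookups are each a single dictionary access; whether the real graph service does the same is one of the schema options the talk compares.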
  • 12. Technical Decisions • User timeline cache • Schema • Indexing • Horizontal Scaling
  • 13. Operational Setup
  • 14. Operational Testing • Real-life validation of our choices • Most important criteria? – User-facing latency – Linear scaling of resources
  • 15. Scaling Goals • Realistic real-life-scale workload – compared to Twitter, etc. • Understanding of HW required – containing costs • Confirm architecture scales linearly – without loss of responsiveness
  • 16. Architecture GraphServiceProxy ContentProxy
  • 17. DB Architecture The storage layer is separate from Socialite services, and each service has its own URI: its own MongoDB server or cluster that can be configured differently from the others. This allows us to physically optimize each service's DB for the workload we'll be running on it. It also allows us to scale out the DB that's currently the limiting factor (the bottleneck) in our setup.
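The per-service URI arrangement on slide 17 might look like the following sketch. The service names, host names, and database names are all made up for illustration; only the idea that each service points at its own server or cluster comes from the slide.

```python
from urllib.parse import urlparse

# Hypothetical per-service connection strings; none of these hosts are real.
SERVICE_URIS = {
    "content": "mongodb://content-db.example:27017/socialite_content",
    "user_graph": "mongodb://graph-db.example:27017/socialite_graph",
    "feed": "mongodb://feed-db.example:27017/socialite_feed",
}


def hosts_by_service(uris):
    """Map each service name to the host its URI points at."""
    return {name: urlparse(uri).hostname for name, uri in uris.items()}


def services_sharing_a_host(uris):
    """Return True if any two services are configured against the same host."""
    hosts = list(hosts_by_service(uris).values())
    return len(hosts) != len(set(hosts))
```

Because the services do not share a host in this configuration, any one of them can be moved to a larger machine or a sharded cluster by changing a single URI.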
  • 18-30. Operational Testing
  • 31. Operational Testing Built-in benchmark capability
  • 32. Operational Testing • All hosts in AWS • Each service used its own DB, cluster or shards • All benchmarks through `mongos` (sharded config) • Used MMS monitoring for measuring throughput • Used internal benchmarks for measuring latency • Based volume tested on real life social metrics
  • 33. Scaling for Infinite Content
  • 34. Architecture GraphServiceProxy ContentProxy
  • 35. Socialite Content Service • System of record for all user content • Initially very simple (no search) • Mainly designed to support feed – Lookup/indexed by _id and userid – Time based anchors/pagination
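The "time based anchors/pagination" item on slide 35 can be sketched as follows. This is an assumption-laden illustration, not the Socialite content service: posts live in a plain list, integer `_id` values stand in for MongoDB ObjectIds (which begin with a creation timestamp, so sorting by `_id` approximates time order), and `get_posts` is a hypothetical name.

```python
# Hypothetical sample data: a larger _id means a newer post.
POSTS = [
    {"_id": 1, "userid": "alice", "message": "first"},
    {"_id": 2, "userid": "bob", "message": "hello"},
    {"_id": 3, "userid": "alice", "message": "second"},
    {"_id": 4, "userid": "bob", "message": "again"},
    {"_id": 5, "userid": "alice", "message": "third"},
]


def get_posts(posts, userid, limit=2, anchor=None):
    """Return up to `limit` posts by `userid`, newest first.

    If `anchor` is given, only posts strictly older than that _id are
    returned, so the last _id of one page is the anchor for the next.
    """
    matching = [
        p for p in posts
        if p["userid"] == userid and (anchor is None or p["_id"] < anchor)
    ]
    matching.sort(key=lambda p: p["_id"], reverse=True)
    return matching[:limit]
```

A caller fetches the first page with no anchor, then passes the `_id` of the last post it received to get the next page. Unlike offset-based paging, results do not shift when new posts arrive between requests.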
  • 36. Social Data Ages Fast • Half-life of most content is 1 day! • Popular content usually < 1 month • Access to old data is rare
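To get a feel for what a one-day half-life implies, the arithmetic can be worked through under the assumption that interest in a post decays exponentially. The slide states only the half-life; the exponential model and the function name below are assumptions made for illustration.

```python
def remaining_fraction(age_days, half_life_days=1.0):
    """Fraction of initial access rate left after `age_days`,
    assuming exponential decay with the given half-life."""
    return 0.5 ** (age_days / half_life_days)
```

Under this model, a post retains half of its initial access rate after one day and 0.5^7 = 1/128 (under 1%) after a week, which is consistent with the slide's point that access to old data is rare.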