Creating social features at BranchOut using MongoDB

Slides from the MongoDB MeetUp "IRC Bots and Activity Feeds with MongoDB - At BranchOut", presented by the San Francisco MongoDB User Group and 10gen.

http://www.meetup.com/San-Francisco-MongoDB-User-Group/events/95713262/

Over the past year, we've used MongoDB to power more and more of BranchOut's functionality, including some cool social features such as a Facebook-like activity feed. In this talk, I discuss the design decisions that went into developing these features and outline how Mongo is used under the hood. I discuss not only what makes Mongo a good technology choice, but also list a few things about Mongo that need to be worked around.

If you have any questions regarding these slides, feel free to reach out to me on Twitter: @nate510.

Thanks!

Published in: Technology
1 Comment
  • Thanks for sharing the slides; they're really interesting. My question is about the follow system: what operation do you use to check whether I follow a user, so that the UI displays an Unfollow button if true and a Follow button if false? Is it $elemMatch?

  1. Building Social Features with MongoDB. Nathan Smith, BranchOut.com. Jan. 22, 2013 (slide footer: Tuesday, January 22, 13)
  2. BranchOut: A more social professional network • Connect with your colleagues (follow) • Activity feed of their professional activity • Timeline of an individual’s posts
  3. BranchOut: A more social professional network • 30M installed users • 750MM total user records • Average 300 connections per installed user
  4-7. MongoDB @ BranchOut • 100% MySQL until ~July 2012 • Much of our data fits well into a document model • Our data design avoids RDBMS features
  8-12. Follow System: Business logic • Limit of 2000 followees (people you follow) • Unlimited followers • Both lists reflect updates in near-real time
  13-15. Follow System: Traditional RDBMS (i.e. MySQL)
         follower_uid  followee_uid  follow_time
         123           456           2013-01-22 15:43:00
         456           123           2013-01-22 15:52:00
         Advantage: Easy inserts, deletes
         Disadvantage: Data locality, index size
  16-18. Follow System: MongoDB (first pass)
         followee: { _id: 123, uids: [456, 567, 678] }
         Advantage: Compact data, read locality
         Disadvantage: Can’t display a user’s followers
  19-20. Follow System: Can’t display a user’s followers (easily)
         followee: { _id: 123, uids: [456, 567, 678] }
         ...with multi-key index on uids:
         db.follow.find({uids: 456}, {_id: 1});
         Expensive! Also, no guarantee of order.
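The asymmetry described in slides 16 to 20 can be sketched in a few lines of Python. This is an in-memory stand-in rather than real MongoDB access: the dict `followee_docs` and the users 234 and 345 are hypothetical additions for illustration. The sketch scans every document, which overstates the cost relative to an indexed query; it is only meant to show which direction is a single document read and which is not.

```python
# In-memory stand-in for the first-pass "followee" documents.
# Each document lists the uids that user `_id` follows.
followee_docs = {
    123: {"_id": 123, "uids": [456, 567, 678]},  # from the slide
    234: {"_id": 234, "uids": [456]},            # hypothetical extra user
    345: {"_id": 345, "uids": [567, 456]},       # hypothetical extra user
}


def followees_of(uid):
    """Cheap direction: one document lookup by _id."""
    doc = followee_docs.get(uid)
    return list(doc["uids"]) if doc else []


def followers_of(uid):
    """Reverse direction, mirroring db.follow.find({uids: uid}, {_id: 1}).

    Every document whose uids array contains `uid` matches. The result
    comes back in storage order, not in the order people followed.
    """
    return [doc["_id"] for doc in followee_docs.values() if uid in doc["uids"]]


print(followees_of(123))  # [456, 567, 678]
print(followers_of(456))  # [123, 234, 345]
```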
  21-23. Follow System: MongoDB (second pass)
         followee: { _id: 1, uids: [2, 3] }
         followee: { _id: 2, uids: [1, 3] }
         follower: { _id: 1, uids: [2] }
         follower: { _id: 2, uids: [1] }
         follower: { _id: 3, uids: [1, 2] }
         Advantages: Local data, fast selects
         Disadvantages: Follower doc size
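A minimal sketch of the second-pass idea, assuming each follow is written to both collections. The function names `follow` and `unfollow` and the two dicts are hypothetical; the 2000-followee cap comes from the business-logic slides. In a real deployment the two writes would be separate document updates, so the sketch says nothing about how the slides' system keeps them consistent.

```python
FOLLOWEE_LIMIT = 2000  # cap on people you follow, from the business-logic slides

followee = {}  # uid -> list of uids this user follows
follower = {}  # uid -> list of uids following this user


def follow(src, dst):
    """Record that src follows dst in both collections.

    Returns False when src has already reached the followee limit.
    """
    following = followee.setdefault(src, [])
    if dst in following:
        return True  # already following; nothing to write
    if len(following) >= FOLLOWEE_LIMIT:
        return False
    following.append(dst)
    follower.setdefault(dst, []).append(src)
    return True


def unfollow(src, dst):
    """Remove the relationship from both collections if it exists."""
    if dst in followee.get(src, []):
        followee[src].remove(dst)
        follower[dst].remove(src)


# Reproduce the documents shown on the slide.
for src, dst in [(1, 2), (1, 3), (2, 1), (2, 3)]:
    follow(src, dst)

print(followee)  # {1: [2, 3], 2: [1, 3]}
print(follower)  # {2: [1], 3: [1, 2], 1: [2]}
```

Both directions are now a single lookup by `_id`, at the cost of one unbounded list per popular user.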
  24-28. Follow System: Follower document size • Max Mongo doc size: 16MB • Number of people who follow our community manager: 30MM • 30MM uids × 8 bytes/uid = 240MB • Max followers per doc: ~2MM
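The arithmetic on slides 24 to 28 can be checked directly. The 8 bytes/uid figure is the one used on the slide; a real BSON array also stores a type byte and an index key per element, so the practical ceiling would be lower than the number computed here. The variable names are illustrative.

```python
BYTES_PER_UID = 8                 # figure used on the slide
MAX_DOC_BYTES = 16 * 1024 * 1024  # MongoDB's 16MB document limit
followers = 30_000_000            # "30MM" on the slide

needed_mb = followers * BYTES_PER_UID / 1_000_000
max_uids_per_doc = MAX_DOC_BYTES // BYTES_PER_UID
docs_needed = -(-followers // max_uids_per_doc)  # ceiling division

print(needed_mb)         # 240.0
print(max_uids_per_doc)  # 2097152, i.e. roughly 2MM
print(docs_needed)       # 15
```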
  29. Follow System: MongoDB (final pass)
      follower: { _id: "1", uids: [2, 3, 4, ...], count: 20001, next_page: 2 }
      follower: { _id: "1_p2", uids: [23, 24, 25, ...], count: 10000 }
      followee: { _id: 1, uids: [2, 3] }
      followee: { _id: 2, uids: [1, 3] }
  30-31. Follow System: MongoDB (final pass), same documents, with count: 10001 and next_page: 3 overlaid on count: 20001 and next_page: 2 in the first follower document
      Asynchronous thread manages follower documents
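One way to read slides 29 to 31 is that the head follower document absorbs new followers, and a background job periodically moves a block of older uids into a page document, with `next_page` pointing from each document to the next older page. The sketch below implements that reading only. The thresholds `HEAD_MAX` and `PAGE_SIZE`, the function names, and the linked-list interpretation of `next_page` are all assumptions suggested by the numbers on the slides, not details the talk confirms.

```python
HEAD_MAX = 20000   # assumed overflow threshold, suggested by count: 20001
PAGE_SIZE = 10000  # assumed page size, suggested by count: 10000


def add_follower(docs, uid, new_follower):
    """Append a follower to the head document for uid."""
    head = docs.setdefault(str(uid), {"_id": str(uid), "uids": [], "count": 0})
    head["uids"].append(new_follower)
    head["count"] += 1


def compact(docs, uid):
    """Background step: move the oldest PAGE_SIZE uids out of an
    overflowing head document into a new page document."""
    head = docs[str(uid)]
    if head["count"] <= HEAD_MAX:
        return
    prev_page = head.get("next_page")
    page_no = (prev_page or 1) + 1
    page = {
        "_id": f"{uid}_p{page_no}",
        "uids": head["uids"][:PAGE_SIZE],
        "count": PAGE_SIZE,
    }
    if prev_page is not None:
        page["next_page"] = prev_page  # link to the next older page
    docs[page["_id"]] = page
    head["uids"] = head["uids"][PAGE_SIZE:]
    head["count"] -= PAGE_SIZE
    head["next_page"] = page_no


def all_followers(docs, uid):
    """Yield followers newest first: head document, then older pages."""
    doc = docs.get(str(uid))
    while doc is not None:
        yield from reversed(doc["uids"])
        nxt = doc.get("next_page")
        doc = docs.get(f"{uid}_p{nxt}") if nxt is not None else None


docs = {}
for f in range(2, 20003):       # 20001 followers of user 1
    add_follower(docs, 1, f)
compact(docs, 1)                # head: count 10001, next_page 2
for f in range(20003, 30003):   # 10000 more followers
    add_follower(docs, 1, f)
compact(docs, 1)                # head: count 10001, next_page 3
```

Under these assumptions the head document stays small enough to read quickly, and a full follower listing walks the pages only when it is actually needed.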
  32. Activity Feed
  33-35. Activity Feed: Push vs Pull architecture
  36-41. Activity Feed: Business logic • All connections and followees appear in your feed • Reverse chron sort order (but should support other rankings) • Support for evolving set of feed event types • Tagging creates multiple feed events for the same underlying object • Feed events are not ephemeral -- Timeline
  42-44. Activity Feed: Traditional RDBMS (i.e. MySQL)
         activity_id  uid  event_time           type    oid1    oid2
         1            123  2013-01-22 15:43:00  photo   123abc  789ghi
         2            345  2013-01-22 15:52:00  status  456def  foobar
         Advantage: Easy inserts
         Disadvantages: Rigid schema adapts poorly to new activity types, doesn’t scale
  45. Activity Feed: MongoDB
      user_feed_card
      ufc: {
        _id: 123, // UID
        total_events: 18,
        2013_01_total: 4,
        2012_12_total: 8,
        2012_11_total: 6,
        ...other counts...
      }
      user_feed_month
      ufm: {
        _id: "123_2013_01",
        events: [
          {
            uid: 123,
            type: "photo_upload",
            content_id: "abcd9876",
            timestamp: 1358824502,
            ...more metadata...
          },
          ...more events...
        ]
      }
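A hedged sketch of how one event might update both documents from slide 45. The field names follow the slide; the helper names `month_suffix` and `record_event`, the in-memory dicts, and the choice to bucket months in UTC are assumptions.

```python
from datetime import datetime, timezone


def month_suffix(timestamp):
    """Map a Unix timestamp to a "YYYY_MM" bucket (UTC is an assumption)."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y_%m")


def record_event(cards, months, event):
    """Append the event to its user_feed_month document and bump the
    matching counters on the user's user_feed_card."""
    uid = event["uid"]
    suffix = month_suffix(event["timestamp"])
    month_id = f"{uid}_{suffix}"
    month = months.setdefault(month_id, {"_id": month_id, "events": []})
    month["events"].append(event)
    card = cards.setdefault(uid, {"_id": uid, "total_events": 0})
    card["total_events"] += 1
    card[f"{suffix}_total"] = card.get(f"{suffix}_total", 0) + 1


cards, months = {}, {}
record_event(cards, months, {
    "uid": 123,
    "type": "photo_upload",
    "content_id": "abcd9876",
    "timestamp": 1358824502,  # from the slide; falls in January 2013
})
print(cards[123])  # {'_id': 123, 'total_events': 1, '2013_01_total': 1}
```

Keeping per-month counts on the small card document is what lets a reader decide which month documents are worth loading before touching any of them.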
  46-52. Activity Feed: Algorithm
         1. Load user_feed_cards for all connections
         2. Calculate which user_feed_months to load
         3. Load user_feed_months
         4. Aggregate events that refer to the same story
         5. Sort (reverse chron)
         6. Load content, comments, etc. and build stories
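Steps 2 to 5 of the algorithm above can be sketched as follows, assuming the cards from step 1 are already loaded. Several details are guesses made for illustration: whole months are selected newest first until enough events are covered, events are treated as the same story when they share a `content_id`, and the sample data (user 345, the `photo_tag` and `status` events) is invented. Step 6 is left out.

```python
from collections import defaultdict


def months_to_load(cards, wanted):
    """Step 2: choose the newest months that together cover `wanted`
    events, then list the user_feed_month ids that have events in them."""
    per_month = defaultdict(int)
    for card in cards:
        for key, n in card.items():
            if key.endswith("_total"):  # skips _id and total_events
                per_month[key[: -len("_total")]] += n
    chosen, covered = [], 0
    for suffix in sorted(per_month, reverse=True):  # "YYYY_MM" sorts by date
        if covered >= wanted:
            break
        chosen.append(suffix)
        covered += per_month[suffix]
    return [
        f"{card['_id']}_{suffix}"
        for suffix in chosen
        for card in cards
        if card.get(f"{suffix}_total", 0) > 0
    ]


def build_feed(cards, months, wanted):
    ids = months_to_load(cards, wanted)                     # step 2
    events = [e for i in ids for e in months[i]["events"]]  # step 3
    stories = {}
    for e in events:                                        # step 4
        story = stories.setdefault(
            e["content_id"],
            {"content_id": e["content_id"], "uids": [], "timestamp": e["timestamp"]},
        )
        story["uids"].append(e["uid"])
        story["timestamp"] = max(story["timestamp"], e["timestamp"])
    ranked = sorted(stories.values(), key=lambda s: s["timestamp"], reverse=True)
    return ranked[:wanted]                                  # step 5


cards = [
    {"_id": 123, "total_events": 2, "2013_01_total": 1, "2012_12_total": 1},
    {"_id": 345, "total_events": 1, "2013_01_total": 1},
]
months = {
    "123_2013_01": {"events": [
        {"uid": 123, "type": "photo_upload", "content_id": "abcd9876", "timestamp": 1358824502}]},
    "345_2013_01": {"events": [
        {"uid": 345, "type": "photo_tag", "content_id": "abcd9876", "timestamp": 1358824600}]},
    "123_2012_12": {"events": [
        {"uid": 123, "type": "status", "content_id": "efgh1234", "timestamp": 1356000000}]},
}

feed = build_feed(cards, months, wanted=3)
print([s["content_id"] for s in feed])  # ['abcd9876', 'efgh1234']
```

Because sorting happens at read time, swapping the `key` function is enough to try a ranking other than reverse chronological, which matches the business-logic requirement to support other rankings.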
  53-56. Activity Feed: Performance • Response times average under 500 ms (98th percentile under 1 sec) • Design expected to scale well horizontally • Need to continue to optimize
  57. Building Social Features with MongoDB. Nathan Smith
      BrO: http://branchout.com/nate
      FB: http://facebook.com/neocortica
      Twitter: @nate510
      Email: nate@branchout.com
      Aditya Agarwal on Facebook’s architecture: http://www.infoq.com/presentations/Facebook-Software-Stack
      Dan McKinley on Etsy’s activity feed: http://www.slideshare.net/danmckinley/etsy-activity-feeds-architecture
      Good Quora questions on activity feeds:
      http://www.quora.com/What-are-the-scaling-issues-to-keep-in-mind-while-developing-a-social-network-feed
      http://www.quora.com/What-are-best-practices-for-building-something-like-a-News-Feed
