Mongo or Die: How MongoDB Powers Doodle or Die

"Doodle or Die" is a popular online drawing game built on Node.js and MongoDB. It started off as an entry to the 2011 Node Knockout competition and after winning the category for "Most Fun" has continued to grow into a game that has thousands of players play each day who have produced millions of drawings. This talk will use Doodle or Die as a vehicle to showcase the many strengths of using MongoDB as a go-to database for small and midsize applications. It will also cover lessons learned as we used MongoDB to rapidly develop and scale Doodle or Die.

  1. Mongo or Die! How MongoDB powers Doodle or Die. Aaron Silverman (@Zugwalt)
  2. Doodle or Die (@DoodleOrDie)
  3. What is Doodle or Die?
  4. Telephone. Phrase: "I'd like some beer!" Phrase: "I'd like some deer." Phrase: "I see no deer." Phrase: "I've no idea!"
  5. Doodle or Die. Doodle. Phrase: "Shining Apple." Doodle. Phrase: "Eat your fruit or DIE!"
  6. What Powers Doodle or Die?
  7. Started very, very small: 4 Cores, 128MB RAM, Node Server, MongoDB Server
  8. Got serious about our servers: Small - 2GB, 8 Cores, MongoDB Server, 256MB RAM, Node Server
  9. Called in some reinforcements: Large - 5GB, 12 Cores, MongoDB Server, 1GB RAM, Node Server
  10. In the last 30 days: 2,500,000 page views; 100,000 uniques; 35,000 active player accounts; 2,000,000 new doodles and descriptions
  11. "Small Data"
      MongoDB: 4 GB data size, <1 GB index size, ~10 queries/sec. Stores Player Info, Chain Info (excluding doodles), Group Info, Game State, Logs.
      Amazon: 170 GB data size, 8 GB in/month, 200 GB out/month. Stores Doodles, Static Content, Compressed Database Backups.
  12. MongoDB - $65 / month. Daily backups to Amazon S3: $16/mo
  13. Amazon S3 - $70 / month
  14. Node - $62/ month
  15. Total Cost To Host: $197/month (MongoDB $65 + Amazon S3 $70 + Node $62)
  16. PaaS Provides Easy Upgrade Path
  17. General Principles
  18. Custom _id: partially random string generated using the ShortId node module.
      ObjectId: ObjectId("4fd02d5d78315a502d15cdde"), ObjectId("4fd02d5a78315a502d15cddd"), ObjectId("4fd02d5878315a502d15cddc")
      ShortId: "8rOIwh2VD", "1qyY61Lu1", "5GQnbx-1"
  19. Shorter, less cumbersome in code and queries:
      db.players.findOne({_id: '58mwYlTKV'});
      db.chains.update({_id: '58mwYlTKV'}, {$set: {activePlayer_id: '88ueYaL6V'}});
      Randomness could help with sharding; more importantly, it makes it harder to cheat:
      http://doodleordie.com/c/5ONtvvSGH
      <span class="doodle" data-jsonp="http://doodles.s3.amazonaws.com/d2/Eh8-Po2R5/1Em5kj3LY.js">
  20. Question Oriented Subdocuments: What chain is this player working on right now? What are this player's stats? Which players are not eligible to be assigned this chain?
  21. Goal is for most "Questions" to be answerable in one query from one subdocument:
      db.players.findOne({_id: '58mwYlTKV'}, {'game.recentSkips': 1});
      Related "Questions" will share common ancestors:
      db.players.findOne({_id: '58mwYlTKV'}, {game: 1});
  22. Indexes are designed to make answering questions easy!
      Which player is working on this chain?
      db.players.ensureIndex({'game.activeChain_id': 1});
      What chains are recently awaiting a new doodle?
      db.chains.ensureIndex({inUse: -1, activeState: 1, lastModified: -1});
  23. Doodle or Die Collections
  24. Disclaimer: Much of the original Doodle or Die code and schema were created during a weekend-long hackathon!
  25. Primary: players, chains, groups. Support: sessions, log.
  26. Players
  27. players subdocuments, ordered by query frequency from often to rarely: game, chainHistory, stats, account, login, info
  28. players.game: activeState, activeChain_id, activeStepIndex, recentSkips. Answers the question: "What is this player working on right now?"
      db.players.findOne({_id: '58mwYlTKV'}, {game: 1});
  29. players.chainHistory: Chain1 … ChainN, each with datePlayed and dateViewed. Answers the question: "What has the player worked on?"
      db.players.findOne({_id: '58mwYlTKV'}, {chainHistory: 1});
  30. players.chainHistory: Chain1 … ChainN, each with datePlayed and dateViewed. YUCK! Plan to refactor out along with a refactor of how chains store steps
  31. players.stats: totalSteps, drawSteps, phraseSteps, numSkips, numLikes. Answers the question: "How active/good is this player?"
      db.players.findOne({_id: '58mwYlTKV'}, {stats: 1});
  32. Chains
  33. chains: activePlayer_id, activeState, numSteps, lastModified, ineligiblePlayer_ids, steps. Chains are assigned (not as many questions):
      db.chains.findOne({inUse: false, activeState: player.game.activeState, ineligiblePlayer_ids: {$ne: player._id}, lastModified: {$gte: timeRange}});
  34. chains.ineligiblePlayer_ids: [player_id1, player_id2, player_id3, …, player_idN]. Note: $addToSet and $pull work great in maintaining this array
  35. chains.steps: [Step1, …, StepN], each with player_id, state, date, content
  36. Description Step Content: phrase. Doodle Step Content: url (points to S3), time, numStrokes, numDistinctColors
  37. Description Step Content: phrase. Doodle Step Content: url (points to S3), time, numStrokes, numDistinctColors
  38. We plan to stop embedding steps in chains, and link to "doodles" and "descriptions" collections.
      doodles: player_id, date, url (points to S3), time, numStrokes, numDistinctColors
      descriptions: player_id, date, text
  39. How Does it all Work?
  40. players queried for users who are associated with the authenticated Twitter account:
      db.players.find({'login.twitter.uid': 'XXXXXXX'}, {game: 1, chainHistory: 1});
      Chain history used to load thumbnails; previous chain is loaded:
      db.chains.find({_id: {$in: [player.game.activeChain_id, player.game.lastChain_id]}});
  41. chains collection searched and atomically updated for an unclaimed eligible chain to be given to the player:
      db.chains.findAndModify(
        {inUse: false, activeState: player.game.activeState, ineligiblePlayer_ids: {$ne: player._id}, lastModified: {$gte: timeRange}},
        {$set: {inUse: true, activePlayer_id: player._id}, $addToSet: {ineligiblePlayer_ids: player._id}});
  42. Content saved to chain:
      db.chains.update(
        {_id: chain._id, activePlayer_id: player._id},
        {$inc: {numSteps: 1},
         $set: {inUse: false, lastModified: datePlayed},
         $unset: {activePlayer_id: 1},
         $push: {steps: {player_id: player._id, state: chain.activeState, content: content, date: datePlayed}}});
  43. Doodle strokes will be saved to S3 (but url to S3 and metadata saved to chain):
      {"color":"#000000","size":15,"path":[307,66,308,66,308,68,308,69,308,70,308,71,308,72,308,73,306,76,305,79,305,81,302,83,302,84,302,85,302,86,301,87,300,89,300,90,300,91,300,92,300,95,300,97,300,102,300,103,300,104,300,108,300,109,300,110,301,114,303,116,304,116,305,119,305,121,307,124,309,127,310,130,310,131,313,132,314,133,316,137,317,138,320,140,321,140,323,143,326,143,328,143,333,144,337,144,341,144,343,144,347,143,352,143,354,141,357,140,358,140,359,139,362,138,363,138,365,137,368,135,369,132,370,131,371,130,374,128,375,128,376,125,378,125,379,124,380,123,381,123,382,120,385,119,386,118,388,115,391,112,394,109,394,107,394,106,397,104,397,103,397,102,397,101,397,100,397,98,397,96,397,94,397,93,397,91,397,90,397,88,397,87,397,86,397,83,395,80,395,79,395,77,394,74,393,72,393,70,393,67,392,65,391,63,389,62,388,59,386,57,383,54,383,52,381,49,379,48,378,47,376,46,375,44,374,43,372,43,370,42,369,42,368,42,364,41,362,41,359,41,355,41,351,42,349,42,347,42,346,44,343,44,342,45,339,46,337,47,336,48,333,49,332,51,330,51,328,51,327,52,326,53,323,53,322,53,321,54,320,55,317,55,316,56,314,58,309,59,306,61,306,62,305,62,304,63]}
  44. Player's chainHistory retrieved:
      db.players.find({'login.urlSafe': urlSafeUid}, {chainHistory: 1});
      chainHistory filtered and sorted on server; applicable chains/steps retrieved:
      db.chains.find({_id: {$in: chain_idArr}});
      Retrieved chains ordered (on server) based on previously sorted chainHistory
  45. Stats are loaded from the counting log, which is essentially a bunch of counters incremented:
      db.log.find({_id: {$in: ['2012-06-20', '2012-06-20', '2012-06-20', '2012-06-20']}});
      Extremely simple implementation using nested subdocuments for organization
  46. Some Additional Doodle or Die Tricks
  47. Build assumptions into queries
  48. Alice and Bob are in an awesome group.
      groups: {
        _id: '8rOIwh2VD',
        members: [
          {name: 'Alice', dateJoined: '2012-05-24'},
          {name: 'Bob', dateJoined: '2012-05-25'}
        ],
        banned: [
          {name: 'Eve', dateBanned: '2012-05-25'}
        ]
      }
      Eve tries to join despite being banned!
  49. Bugs in our code fail to detect Eve's trickery and the update query is run!
      groups.update(
        {_id: '8rOIwh2VD'},
        {$push: {members: {name: 'Eve', dateJoined: new Date()}}},
        function(err) { if (err) throw err; callback(); });
  50. groups: {
        _id: '8rOIwh2VD',
        members: [
          {name: 'Alice', dateJoined: '2012-05-24'},
          {name: 'Bob', dateJoined: '2012-05-25'},
          {name: 'Eve', dateJoined: '2012-05-26'}
        ],
        banned: [
          {name: 'Eve', dateBanned: '2012-05-25'}
        ]
      }
      Blast! Eve got through! How can we help prevent this?
  51. Bake in our assumptions!
      groups.update(
        {_id: '8rOIwh2VD', 'members.name': {$ne: 'Eve'}, 'banned.name': {$ne: 'Eve'}},
        {$push: {members: {name: 'Eve', dateJoined: new Date()}}},
        function(err, updateCount) {
          if (err) throw err;
          if (updateCount !== 1) throw new Error('bugs!');
          callback(updateCount === 1);
        });
  52. Always Specify Fields
  53. To assign a player a new chain, we only need their _id and game information (players subdocuments: game, chainHistory, stats, account, login, info)
  54. To load a player's profile content, we just need their history, stats, and profile info (players subdocuments: game, chainHistory, stats, account, login, info)
  55. We need to assign Bob a new chain, so let's figure out what state he needs next:
      db.players.find({name: 'Bob'});
      This will fetch and send back the ENTIRE player object!
  56. Let's specify that we only want our "game" subdocument:
      db.players.find({name: 'Bob'}, {game: 1});
      Hooray! Much less to retrieve and send over the series of tubes.
  57. Store Everything!
  58. If you have information available, save it even if you don't plan on using it!
      db.chains.update({_id: '2VmVO18hs'},
        {$push: {steps: {player_id: '1qyY61Lu1', state: 'draw', date: new Date(),
                         content: {time: 106896, count: 27, width: 520, height: 390, step_id: '1i7RlFbgU'}}}});
  59. Some Pain Points
  60. Less obvious database structure
      mysql> DESC players;
      +----------------+---------+------+-----+---------+-------+
      | Field          | Type    | Null | Key | Default | Extra |
      +----------------+---------+------+-----+---------+-------+
      | id             | int(11) | NO   | PRI | NULL    |       |
      | activeStateId  | int(11) | YES  |     | NULL    |       |
      | activeStepId   | int(11) | YES  |     | NULL    |       |
      +----------------+---------+------+-----+---------+-------+
  61. Data integrity up to application
      mysql> ALTER TABLE players ADD CONSTRAINT fk_players_chains FOREIGN KEY (activeChainId) REFERENCES chains(id);
      /* Oh no! This is going to mess things up! */
      players.update({_id: 'c58D4'}, {$set: {'game.activeChain_id': 'bad_id'}});
  62. No mature GUI query development tools
  63. If we could start over would we still use MongoDB?
  64. Absolutely! NoSQL in general is great for rapid prototyping. Fantastic performance. Intuitive language keeps it easy to run one-off queries / refactor schema. Excellent (and now officially supported) Node driver.
  65. Questions? @Zugwalt @DoodleOrDie
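
The "bake in our assumptions" idea from slide 51 can be tried without a database. What follows is a minimal in-memory sketch, not MongoDB driver code: `guardedJoin` is a hypothetical helper, and the extra player "Carol" is made up for illustration. The filter plays the role of the query document (including the two `$ne` conditions) and the return value plays the role of `updateCount`.

```javascript
// Hypothetical in-memory stand-in for the guarded update on slide 51.
// The "query" encodes the assumptions; the "update" only runs if they hold.
function guardedJoin(group, groupId, name, dateJoined) {
  const matches =
    group._id === groupId &&
    group.members.every((m) => m.name !== name) && // 'members.name': {$ne: name}
    group.banned.every((b) => b.name !== name);    // 'banned.name': {$ne: name}
  if (!matches) {
    return 0; // no document matched, so nothing was modified
  }
  group.members.push({ name: name, dateJoined: dateJoined }); // $push
  return 1; // one document updated
}

const group = {
  _id: '8rOIwh2VD',
  members: [
    { name: 'Alice', dateJoined: '2012-05-24' },
    { name: 'Bob', dateJoined: '2012-05-25' },
  ],
  banned: [{ name: 'Eve', dateBanned: '2012-05-25' }],
};

// Eve is banned, so the guarded update matches nothing.
const eveCount = guardedJoin(group, '8rOIwh2VD', 'Eve', '2012-05-26');
// Carol (assumed, not in the slides) is neither a member nor banned.
const carolCount = guardedJoin(group, '8rOIwh2VD', 'Carol', '2012-05-26');
const memberNames = group.members.map((m) => m.name);
console.log(eveCount, carolCount, memberNames);
```

The point of the pattern is that a count other than 1 signals that an assumption was violated somewhere upstream, even if the application-level checks missed it.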

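
The chain-claiming step on slide 41 can be sketched the same way. This is an in-memory approximation under stated assumptions, not the game's actual code: `claimChain`, the ids `c1` to `c3` and `p1`/`p2`, and the numeric `lastModified` values are all hypothetical. It finds the first chain that passes the slide's filter and marks it as claimed in the same step, which is what `findAndModify` provides atomically on the server.

```javascript
// Hypothetical in-memory sketch of the claim on slide 41: find the first
// eligible chain and mark it as in use in a single step.
function claimChain(chains, player, timeRange) {
  const chain = chains.find(
    (c) =>
      c.inUse === false &&
      c.activeState === player.game.activeState &&
      !c.ineligiblePlayer_ids.includes(player._id) && // {$ne: player._id}
      c.lastModified >= timeRange                     // {$gte: timeRange}
  );
  if (!chain) {
    return null; // nothing eligible right now
  }
  chain.inUse = true;                 // $set
  chain.activePlayer_id = player._id; // $set
  if (!chain.ineligiblePlayer_ids.includes(player._id)) {
    chain.ineligiblePlayer_ids.push(player._id); // $addToSet
  }
  return chain;
}

// Sample data (all values assumed for illustration).
const player = { _id: 'p1', game: { activeState: 'draw' } };
const chains = [
  { _id: 'c1', inUse: true, activeState: 'draw', ineligiblePlayer_ids: [], lastModified: 10 },
  { _id: 'c2', inUse: false, activeState: 'draw', ineligiblePlayer_ids: ['p1'], lastModified: 10 },
  { _id: 'c3', inUse: false, activeState: 'draw', ineligiblePlayer_ids: ['p2'], lastModified: 10 },
];

const firstClaim = claimChain(chains, player, 5);  // c1 is in use, c2 is ineligible
const secondClaim = claimChain(chains, player, 5); // c3 is now claimed as well
console.log(firstClaim && firstClaim._id, secondClaim);
```

Adding the player to `ineligiblePlayer_ids` at claim time means the same player is not handed the same chain again later.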