MongoDB at NoSQL Now! 2012: Benefits and Challenges of Using MongoDB in the Enterprise

by Jared Rosoff

Published in: Technology

  1. Open source, high performance database. Jared Rosoff, @forjared
  2. Challenges
      • Variably typed data
      • Complex objects
      • High transaction rates
      • Large data size
      • High availability
      Opportunities
      • Agility
      • Cost reduction
      • Simplification
  3. AGILE DEVELOPMENT
      • Iterative and continuous
      • New and emerging apps
      VOLUME AND TYPE OF DATA
      • Trillions of records
      • 10’s of millions of queries per second
      • Volume of data
      • Semi-structured and unstructured data
      NEW ARCHITECTURES
      • Systems scaling horizontally, not vertically
      • Commodity servers
      • Cloud computing
  4. DEVELOPER PRODUCTIVITY DECREASES
      • Needed to add new software layers of ORM, Caching, Sharding and Message Queue
      • Polymorphic, semi-structured and unstructured data not well supported
      COST OF DATABASE INCREASES
      • Increased database licensing cost
      • Vertical, not horizontal, scaling
      • High cost of SAN
      [Timeline diagram. Markers: LAUNCH, +30 DAYS, +90 DAYS, +6 MONTHS, +1 YEAR. Milestones: PROJECT START, DENORMALIZE DATA MODEL, STOP USING JOINS, CUSTOM CACHING LAYER, CUSTOM SHARDING]
  5. How we got here
  6. • Variably typed data
      • Complex data objects
      • High transaction rates
      • Large data size
      • High availability
      • Agile development
  7. • Metadata management
      • EAV anti-pattern
      • Sparse table anti-pattern
  8. Entity   Attribute      Value
      1        Type           Book
      1        Title          Inside Intel
      1        Author         Andy Grove
      2        Type           Laptop
      2        Manufacturer   Dell
      2        Model          Inspiron
      2        RAM            8gb
      3        Type           Cereal
      • Difficult to query
      • Difficult to index
      • Expensive to query
  9. Type     Title           Author       Manufacturer   Model      RAM   Screen Width
      Book     Inside Intel    Andy Grove
      Laptop                                Dell           Inspiron   8gb   12”
      TV                                    Panasonic      Viera            52”
      MP3      Margin Walker   Fugazi       Dischord
      • Constant schema changes
      • Space inefficient
      • Overloaded fields
  10. Type     Manufacturer   Model      RAM   Screen Width
      Laptop   Dell           Inspiron   8gb   12”

      Manufacturer   Model   Screen Width
      Panasonic      Viera   52”

      Title           Author   Manufacturer
      Margin Walker   Fugazi   Dischord
      • Querying is difficult
      • Hard to add more types
  11. Complex objects
  12. Challenges
      • Constant load from client
      • Far in excess of single server capacity
      • Can never take the system down
      [Diagram: many concurrent queries arriving at a single database]
  13. Challenges
      • Adding more storage over time
      • Aging out data that’s no longer needed
      • Minimizing resource overhead of “cold” data
      [Diagram: recent data on fast storage, old data on archival storage, add capacity]
  16. Variably typed data
      • Rigid schemas
      Complex Objects
      • Normalization can be hard
      • Dependent on joins
      High transaction rate
      • Vertical scaling
      • Poor data locality
      High Availability
      • Difficult to maintain consistency & HA
      • HA a bolt-on to many RDBMS
      Agile Development
      • Schema changes
      • Monolithic data model
  17. A new data model
  18. var post = { author: "Jared",
                   date: new Date(),
                   text: "NoSQL Now 2012",
                   tags: ["NoSQL", "MongoDB"] }
      > db.posts.save(post)
  19. > db.posts.find()
      { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
        author : "Jared",
        date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
        text : "NoSQL Now 2012",
        tags : [ "NoSQL", "MongoDB" ] }
      Notes:
      - _id is unique, but can be anything you’d like
  20. Create index on any Field in Document
      // 1 means ascending, -1 means descending
      > db.posts.ensureIndex({author: 1})
      > db.posts.find({author: 'Jared'})
      { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
        author : "Jared", ... }
  21. • Conditional Operators
        – $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type
        – $lt, $lte, $gt, $gte
      // find posts with any tags
      > db.posts.find( {tags: {$exists: true }} )
      // find posts matching a regular expression
      > db.posts.find( {author: /^Jar*/i } )
      // count posts by author
      > db.posts.find( {author: 'Jared'} ).count()
  22. • $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit
      > comment = { author: "Brendan",
                    date: new Date(),
                    text: "I want a freakin pony" }
      // the update document must be wrapped in braces
      > db.posts.update( { _id: "..." },
                         { $push: {comments: comment} } );
  23. { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
        author : "Jared",
        date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
        text : "NoSQL Now 2012",
        tags : [ "NoSQL", "MongoDB" ],
        comments : [
          { author : "Brendan",
            date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
            text : "I want a freakin pony" }
        ] }
  24. // Index nested documents
      > db.posts.ensureIndex( { "comments.author": 1 } )
      > db.posts.find( { 'comments.author': 'Brendan' } )
      // Index on tags
      > db.posts.ensureIndex( { tags: 1 } )
      > db.posts.find( { tags: 'MongoDB' } )
      // geospatial index
      > db.posts.ensureIndex( { "author.location": "2d" } )
      > db.posts.find( { "author.location" : { $near : [22,42] } } )
  25. // collect the set of authors for each tag
      db.posts.aggregate(
        { $project : { author : 1, tags : 1 } },
        { $unwind : "$tags" },
        { $group : { _id : { tags : "$tags" },
                     authors : { $addToSet : "$author" } } }
      );
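To make the pipeline on slide 25 concrete, here is a minimal plain-JavaScript sketch of what `$unwind` followed by `$group` with `$addToSet` computes. It runs without a MongoDB server. The function name `authorsByTag` and the second sample post are assumptions for illustration and are not from the deck.

```javascript
// Sample documents; the "Brendan" post is hypothetical.
const posts = [
  { author: "Jared", tags: ["NoSQL", "MongoDB"] },
  { author: "Brendan", tags: ["MongoDB"] },
];

// Emulates: $unwind "$tags", then $group by tag with $addToSet of authors.
function authorsByTag(docs) {
  const groups = new Map(); // tag -> Set of authors
  for (const doc of docs) {
    for (const tag of doc.tags) {        // $unwind: one record per tag
      if (!groups.has(tag)) groups.set(tag, new Set());
      groups.get(tag).add(doc.author);   // $addToSet: duplicates are ignored
    }
  }
  return [...groups].map(([tag, authors]) => ({
    _id: { tags: tag },
    authors: [...authors],
  }));
}

const result = authorsByTag(posts);
console.log(JSON.stringify(result));
```

The real aggregation framework runs these stages inside the database, so only the grouped result travels back to the client.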
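  26. It’s highly available
  27. [Diagram: the Driver writes to and reads from the Primary; the Primary replicates asynchronously to two Secondaries, which can also serve reads]
  28. [Diagram: the Primary has no write or read arrows; the Driver reads from the two Secondaries]
  29. [Diagram: automatic leader election; one Secondary becomes the new Primary and takes writes and reads; the other Secondary serves reads]
  30. [Diagram: the former Primary is now a Secondary serving reads; the Driver writes to and reads from the new Primary; the other Secondary serves reads]
  31. With tunable consistency
  32. [Diagram: the Driver writes to and reads from the Primary only; two Secondaries]
  33. [Diagram: the Driver writes to the Primary and reads from the two Secondaries]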
  34. Durability
  35. • Fire and forget
      • Wait for error
      • Wait for fsync
      • Wait for journal sync
      • Wait for replication
  36. [Diagram: Driver sends write to Primary; Primary applies it in memory]
  37. [Diagram: Driver sends write, then getLastError; Primary applies the write in memory]
  38. [Diagram: Driver sends write, then getLastError with j:true; Primary applies the write in memory, then writes to the journal]
  39. [Diagram: Driver sends write, then getLastError with w:2; Primary applies the write in memory, then replicates to the Secondary]
  40. Value           Meaning
      <n:integer>     Replicate to N members of replica set
      “majority”      Replicate to a majority of replica set members
      <m:modeName>    Use custom error mode name
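The first two rows of the table on slide 40 can be illustrated with a small sketch. This is not the server's implementation; `isSatisfied` and its parameters are hypothetical names, and "majority" is simplified here to more than half of all members.

```javascript
// Hypothetical helper: does a write acknowledged by `acks` members
// satisfy the write concern value `w` in a set of `setSize` members?
function isSatisfied(w, acks, setSize) {
  if (w === "majority") {
    // Simplification: strictly more than half of the members.
    return acks >= Math.floor(setSize / 2) + 1;
  }
  if (Number.isInteger(w)) {
    return acks >= w; // replicate to at least N members
  }
  // Custom mode names depend on the replica set's getLastErrorModes.
  throw new Error("custom mode names are not handled in this sketch");
}

console.log(isSatisfied(2, 2, 3));          // w:2 with two acknowledgements
console.log(isSatisfied("majority", 2, 5)); // two of five is not a majority
```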
  41. { _id: "someSet",
        members: [
          { _id:0, host:"A", tags: { dc: "ny"}},
          { _id:1, host:"B", tags: { dc: "ny"}},
          { _id:2, host:"C", tags: { dc: "sf"}},
          { _id:3, host:"D", tags: { dc: "sf"}},
          { _id:4, host:"E", tags: { dc: "cloud"}}
        ],
        settings: {
          getLastErrorModes: {
            veryImportant: { dc: 3 },
            sortOfImportant: { dc: 2 }
          }
        }
      }
      These are the modes you can use in write concern.
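A custom mode such as `{ dc: 3 }` means the write must reach members carrying three distinct values of the `dc` tag. The sketch below checks that rule in plain JavaScript against the configuration on slide 41. `modeSatisfied` and the `ackedHosts` argument are hypothetical names used only for illustration.

```javascript
// Members and modes copied from the slide's replica set configuration.
const members = [
  { host: "A", tags: { dc: "ny" } },
  { host: "B", tags: { dc: "ny" } },
  { host: "C", tags: { dc: "sf" } },
  { host: "D", tags: { dc: "sf" } },
  { host: "E", tags: { dc: "cloud" } },
];
const modes = { veryImportant: { dc: 3 }, sortOfImportant: { dc: 2 } };

// Hypothetical check: a mode is satisfied when the acknowledging hosts
// cover at least the required number of distinct values for each tag.
function modeSatisfied(mode, ackedHosts) {
  return Object.entries(mode).every(([tag, needed]) => {
    const values = new Set(
      members
        .filter((m) => ackedHosts.includes(m.host) && tag in m.tags)
        .map((m) => m.tags[tag])
    );
    return values.size >= needed;
  });
}

console.log(modeSatisfied(modes.sortOfImportant, ["A", "B"])); // both in ny
console.log(modeSatisfied(modes.veryImportant, ["A", "C", "E"]));
```

Two acknowledgements from the same data center do not satisfy `sortOfImportant`, which is the point of tagging by `dc`.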
  42. • Between 0..1000
      • Highest member that is up to date wins
        – Up to date == within 10 seconds of primary
      • If a higher priority member catches up, it will force election and win
      [Diagram: Primary (priority = 3), Secondary (priority = 2), Secondary (priority = 1)]
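The priority rule on slide 42 can be sketched as a selection function. This is a simplified model, not the actual election protocol: `pickPrimary`, the `optime` field (seconds), and the `up` flag are assumptions for illustration. Members with priority 0 are excluded because they cannot become primary.

```javascript
// Hypothetical model: choose the highest-priority member that is up and
// within 10 seconds of the most recent primary optime.
function pickPrimary(members, primaryOptime) {
  const eligible = members.filter(
    (m) => m.up && m.priority > 0 && primaryOptime - m.optime <= 10
  );
  if (eligible.length === 0) return null;
  return eligible.reduce((best, m) => (m.priority > best.priority ? m : best)).host;
}

const set = [
  { host: "A", priority: 3, optime: 80, up: true },  // 20 seconds behind
  { host: "B", priority: 2, optime: 95, up: true },
  { host: "C", priority: 1, optime: 100, up: true },
];

console.log(pickPrimary(set, 100)); // A is too stale, so B wins
set[0].optime = 95;                 // A catches up
console.log(pickPrimary(set, 100)); // A now wins on priority
```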
  43. • Lags behind the primary by a configurable time delay
      • Automatically hidden from clients
      • Protects against operator errors
        – Accidentally delete database
        – Application corrupts data
  44. • Vote in elections
      • Don’t store a copy of data
      • Use as tie breaker
  45. [Diagram: one Data Center containing a single Primary]
  46. [Diagram: one Data Center with three zones. Zone 1: Primary. Zone 2: Secondary. Zone 3: Secondary]
  47. [Diagram: one Data Center with three zones. Zone 1: Primary. Zone 2: Secondary. Zone 3: Secondary with hidden = true, used for backups]
  48. [Diagram: Active Data Center with Zone 1: Primary (priority = 1) and Zone 2: Secondary (priority = 1); Standby Data Center with a Secondary (priority = 0)]
  49. [Diagram: West Coast DC with Zone 1: Primary (priority = 2) and Zone 2: Secondary (priority = 2); Central DC with an Arbiter; East Coast DC with Zone 1: Secondary (priority = 1) and Zone 2: Secondary (priority = 1)]
  50. Sharding
  51. [Diagram: four clients connect to two mongos routers; three config servers; three shards, each made up of three mongod processes]
  52. > db.runCommand( { shardcollection: "test.users",
                         key: { email: 1 }} )
      { name: "Jared", email: "jsr@10gen.com" }
      { name: "Scott", email: "scott@10gen.com" }
      { name: "Dan", email: "dan@10gen.com" }
  53. [Diagram: key range from -∞ to +∞]
  54. [Diagram: key range from -∞ to +∞ holding dan@10gen.com, jsr@10gen.com and scott@10gen.com]
  55. [Diagram: the same key range, labeled “Split!”]
  56. [Diagram: the key range split in two, each part labeled “This is a chunk”]
  57. [Diagram: key range from -∞ to +∞ holding dan@10gen.com, jsr@10gen.com and scott@10gen.com]
  58. [Diagram: key range from -∞ to +∞ holding dan@10gen.com, jsr@10gen.com and scott@10gen.com]
  59. [Diagram: the same key range, labeled “Split!”]
  60. Min Key            Max Key            Shard
      -∞                 dan@10gen.com      1
      dan@10gen.com      jsr@10gen.com      1
      jsr@10gen.com      scott@10gen.com    1
      scott@10gen.com    +∞                 1
      • Stored in the config servers
      • Cached in MongoS
      • Used to route requests and keep cluster balanced
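The chunk table on slide 60 is what a router consults to find where a shard key lives. Here is a minimal sketch of that lookup in plain JavaScript, assuming each chunk covers its min key inclusively and its max key exclusively. `MIN_KEY`, `MAX_KEY` and `findChunk` are hypothetical names standing in for -∞, +∞ and the routing step.

```javascript
// Sentinels standing in for -∞ and +∞ in the chunk table.
const MIN_KEY = Symbol("MinKey");
const MAX_KEY = Symbol("MaxKey");

// Chunk table copied from the slide: [min, max) ranges, all on shard 1.
const chunks = [
  { min: MIN_KEY, max: "dan@10gen.com", shard: 1 },
  { min: "dan@10gen.com", max: "jsr@10gen.com", shard: 1 },
  { min: "jsr@10gen.com", max: "scott@10gen.com", shard: 1 },
  { min: "scott@10gen.com", max: MAX_KEY, shard: 1 },
];

// key >= lower bound, treating MIN_KEY as below every key
function atOrAbove(key, bound) {
  if (bound === MIN_KEY) return true;
  if (bound === MAX_KEY) return false;
  return key >= bound;
}

// key < upper bound, treating MAX_KEY as above every key
function below(key, bound) {
  if (bound === MAX_KEY) return true;
  if (bound === MIN_KEY) return false;
  return key < bound;
}

// Return the chunk whose range contains the shard key value.
function findChunk(key) {
  return chunks.find((c) => atOrAbove(key, c.min) && below(key, c.max));
}

console.log(findChunk("jsr@10gen.com").shard);
```

A linear scan is enough for four chunks; a real router would more plausibly keep the ranges sorted and search them.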
  61. Chunks!
      [Diagram: mongos with balancer, three config servers. Shard 1: chunks 1-12. Shard 2: chunks 13-24. Shard 3: chunks 25-36. Shard 4: chunks 37-48]
  62. Imbalance
      [Diagram: mongos with balancer, three config servers. Shard 1: chunks 1-12. Shard 2: chunks 21-24. Shard 3: chunks 33-36. Shard 4: chunks 45-48]
  63. Move chunk 1 to Shard 2
      [Diagram: same chunk layout as slide 62]
  64. [Diagram: same chunk layout as slide 62, migration in progress]
  65. Chunk 1 now lives on Shard 2
      [Diagram: Shard 1: chunks 2-12. Shard 2: chunks 1 and 21-24. Shard 3: chunks 33-36. Shard 4: chunks 45-48]
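The balancing round shown on slides 62 to 65 can be sketched as follows. This is a toy model rather than the real balancer: `balanceOnce` and its `threshold` parameter are assumptions, and the actual migration threshold is not stated in the deck.

```javascript
// Chunk layout from the "Imbalance" slide.
const shards = {
  "Shard 1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
  "Shard 2": [21, 22, 23, 24],
  "Shard 3": [33, 34, 35, 36],
  "Shard 4": [45, 46, 47, 48],
};

// One balancing round: if the fullest and emptiest shards differ by at
// least `threshold` chunks (an assumed value), move one chunk across.
function balanceOnce(layout, threshold = 2) {
  const names = Object.keys(layout);
  const most = names.reduce((a, b) => (layout[b].length > layout[a].length ? b : a));
  const least = names.reduce((a, b) => (layout[b].length < layout[a].length ? b : a));
  if (layout[most].length - layout[least].length < threshold) return null;
  const chunk = layout[most].shift(); // take the first chunk of the fullest shard
  layout[least].push(chunk);
  return { chunk, from: most, to: least };
}

const move = balanceOnce(shards);
console.log(move); // chunk 1 moves from Shard 1 to Shard 2, as on the slides
```

Calling `balanceOnce` repeatedly until it returns `null` evens out the layout one chunk at a time.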
  66. Query type                Routing                   Example
      By shard key              Routed                    db.users.find({email: "jsr@10gen.com"})
      Sorted by shard key       Routed in order           db.users.find().sort({email: -1})
      Find by non shard key     Scatter Gather            db.users.find({state: "CA"})
      Sorted by non shard key   Distributed merge sort    db.users.find().sort({state: 1})
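The four rows of the table on slide 66 amount to a small decision rule. The sketch below encodes only those four cases in plain JavaScript. `targeting` is a hypothetical name, and real routing has more cases (compound keys, range predicates) that are out of scope here.

```javascript
// Classify a find/sort pair the way the slide's table does.
function targeting(shardKey, query = {}, sort = {}) {
  if (shardKey in query) return "routed";              // equality on the shard key
  const sortFields = Object.keys(sort);
  if (sortFields.length === 0) return "scatter gather"; // no shard key, no sort
  return sortFields[0] === shardKey
    ? "routed in order"
    : "distributed merge sort";
}

console.log(targeting("email", { email: "jsr@10gen.com" }));
console.log(targeting("email", {}, { email: -1 }));
console.log(targeting("email", { state: "CA" }));
console.log(targeting("email", {}, { state: 1 }));
```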
  67. Operation   Routing               Example
      Inserts     Requires shard key    db.users.insert({ name: "Jared", email: "jsr@10gen.com" })
      Removes     Routed                db.users.remove({ email: "jsr@10gen.com" })
                  Scattered             db.users.remove({ name: "Jared" })
      Updates     Routed                db.users.update({ email: "jsr@10gen.com" }, { $set: { state: "CA" } })
                  Scattered             db.users.update({ state: "FZ" }, { $set: { state: "CA" } }, false, true)
  68. 1. Query arrives at MongoS
      2. MongoS routes query to a single shard
      3. Shard returns results of query
      4. Results returned to client
      [Diagram: mongos in front of Shard 1, Shard 2 and Shard 3; only one shard is contacted]
  69. 1. Query arrives at MongoS
      2. MongoS broadcasts query to all shards
      3. Each shard returns results for query
      4. Results combined and returned to client
      [Diagram: mongos in front of Shard 1, Shard 2 and Shard 3; all shards are contacted]
  70. 1. Query arrives at MongoS
      2. MongoS broadcasts query to all shards
      3. Each shard locally sorts results
      4. Results returned to mongos
      5. MongoS merge sorts individual results
      6. Combined sorted result returned to client
      [Diagram: mongos in front of Shard 1, Shard 2 and Shard 3; all shards are contacted]
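Step 5 of slide 70, merging results that each shard has already sorted, can be sketched as a k-way merge. The function name `mergeSorted` and the sample `state` values are assumptions for illustration.

```javascript
// Merge per-shard result lists that are each already sorted by `field`.
function mergeSorted(shardResults, field) {
  const cursors = shardResults.map(() => 0); // read position per shard
  const merged = [];
  for (;;) {
    let pick = -1;
    for (let i = 0; i < shardResults.length; i++) {
      if (cursors[i] >= shardResults[i].length) continue; // shard exhausted
      if (
        pick === -1 ||
        shardResults[i][cursors[i]][field] < shardResults[pick][cursors[pick]][field]
      ) {
        pick = i; // smallest head element seen so far
      }
    }
    if (pick === -1) return merged; // every shard is exhausted
    merged.push(shardResults[pick][cursors[pick]++]);
  }
}

// Hypothetical locally sorted results from three shards.
const sorted = mergeSorted(
  [
    [{ state: "CA" }, { state: "NY" }],
    [{ state: "AZ" }, { state: "TX" }],
    [{ state: "FL" }],
  ],
  "state"
);
console.log(sorted.map((d) => d.state).join(","));
```

Because each shard sorts locally, the router only compares the head of each list and never re-sorts the full result.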
  71. Use cases
  72. • Content Management
      • Operational Intelligence
      • Meta Data Management
      • User Data Management
      • High Volume Data Feeds
  73. Machine Generated Data
      • More machines, more sensors, more data
      • Variably structured
      Stock Market Data
      • High frequency trading
      • Multiple sources of data
      Social Media Firehose
      • Each changes their format constantly
  74. • Flexible document model can adapt to changes in sensor format
      • Asynchronous writes
      • Write to memory with periodic disk flush
      • Scale writes over multiple shards
      [Diagram: several data sources writing into the cluster]
  75. Ad Targeting
      • Large volume of state about users
      • Very strict latency requirements
      Real time dashboards
      • Expose report data to millions of customers
      • Report on large volumes of data
      • Reports that update in real time
      Social Media Monitoring
      • What are people talking about?
  76. • Parallelize queries across replicas and shards
      • Low latency reads
      • In database aggregation
      • Flexible schema adapts to changing input data
      • Can use same cluster to collect, store, and report on data
      [Diagram: API and Dashboards reading from the cluster]
  77. Intuit relies on a MongoDB-powered real-time analytics tool for small businesses to derive interesting and actionable patterns from their customers’ website traffic
      Problem
      • Intuit hosts more than 500,000 websites
      • Wanted to collect and analyze data to recommend conversion and lead generation improvements to customers
      • With 10 years worth of user data, it took several days to process the information using a relational database
      Impact
      • In one week Intuit was able to become proficient in MongoDB development
      • Developed application features more quickly for MongoDB than for relational databases
      • MongoDB was 2.5 times faster than MySQL
      “We did a prototype for one week, and within one week we had made big progress. Very big progress. It was so amazing that we decided, ‘Let’s go with this.’” -Nirmala Ranganathan, Intuit
  78. • Rich profiles collecting multiple complex actions
      • Scale out to support high throughput of activities tracked
      • Dynamic schemas make it easy to track vendor specific attributes
      • Indexing and querying to support matching, frequency capping
      Steps: 1 See Ad, 2 See Ad, 3 Click, 4 Convert
      { cookie_id: "1234512413243",
        advertiser: {
          apple: {
            actions: [
              { impression: 'ad1', time: 123 },
              { impression: 'ad2', time: 232 },
              { click: 'ad2', time: 235 },
              { add_to_cart: 'laptop', sku: 'asdf23f', time: 254 },
              { purchase: 'laptop', time: 354 }
            ]
          }
        }
      }
  79. Data Archiving
      • Meta data about artifacts
      • Content in the library
      Information discovery
      • Have data sources that you don’t have access to
      • Stores meta-data on those stores and figure out which ones have the content
      Biometrics
      • Retina scans
      • Finger prints
  80. • Indexing and rich query API for easy searching and sorting
      • Flexible data model for similar, but different objects
      db.archives.find({ "country": "Egypt" });
      { type: "Artefact",
        medium: "Ceramic",
        country: "Egypt",
        year: "3000 BC" }
      { ISBN: "00e8da9b",
        type: "Book",
        country: "Egypt",
        title: "Ancient Egypt" }
  81. Shutterfly uses MongoDB to safeguard more than six billion images for millions of customers in the form of photos and videos, and turn everyday pictures into keepsakes
      Problem
      • Managing 20TB of data (six billion images for millions of customers) partitioning by function
      • Home-grown key value store on top of their Oracle database offered sub-par performance
      • Codebase for this hybrid store became hard to manage
      • High licensing, HW costs
      Why MongoDB
      • JSON-based data structure
      • Provided Shutterfly with an agile, high performance, scalable solution at a low cost
      • Works seamlessly with Shutterfly’s services-based architecture
      Impact
      • 500% cost reduction and 900% performance improvement compared to previous Oracle implementation
      • Accelerated time-to-market for nearly a dozen projects on MongoDB
      • Improved performance by reducing average latency for inserts from 400ms to 2ms
      The “really killer reason” for using MongoDB is its rich JSON-based data structure, which offers Shutterfly an agile approach to develop software. With MongoDB, the Shutterfly team can quickly develop and deploy new applications, especially Web 2.0 and social features. -Kenny Gorman, Director of Data Services
  82. News Site
      • Comments and user generated content
      • Personalization of content, layout
      Multi-Device rendering
      • Generate layout on the fly for each device that connects
      • No need to cache static pages
      Sharing
      • Store large objects
      • Simple modeling of metadata
  83. • Geo spatial indexing for location based searches
      • GridFS for large object storage
      • Flexible data model for similar, but different objects
      • Horizontal scalability for large data sets
      { camera: "Nikon d4",
        location: [ -122.418333, 37.775 ] }
      { camera: "Canon 5d mkII",
        people: [ "Jim", "Carol" ],
        taken_on: ISODate("2012-03-07T18:32:35.002Z") }
      { origin: "facebook.com/photos/xwdf23fsdf",
        license: "Creative Commons CC0",
        size: { dimensions: [ 124, 52 ],
                units: "pixels" } }
  84. Wordnik uses MongoDB as the foundation for its “live” dictionary that stores its entire text corpus: 3.5 TB of data in 20 billion records
      Problem
      • Analyze a staggering amount of data for a system built on a continuous stream of high-quality text pulled from online sources
      • Adding too much data too quickly resulted in outages; tables locked for tens of seconds during inserts
      • Initially launched entirely on MySQL but quickly hit performance road blocks
      Why MongoDB
      • Migrated 5 billion records in a single day with zero downtime
      • MongoDB powers every website request: 20m API calls per day
      • Ability to eliminate the memcached layer, creating a simplified system that required fewer resources and was less prone to error
      Impact
      • Reduced code by 75% compared to MySQL
      • Fetch time cut from 400ms to 60ms
      • Sustained insert speed of 8k words per second, with frequent bursts of up to 50k per second
      • Significant cost savings and 15% reduction in servers
      Life with MongoDB has been good for Wordnik. Our code is faster, more flexible and dramatically smaller. Since we don’t spend time worrying about the database, we can spend more time writing code for our application. -Tony Tam, Vice President of Engineering and Technical Co-founder
  85. Social Graphs
      • Scale out to large graphs
      • Easy to search and process
      Identity Management
      • Authentication, Authorization and Accounting
  86. • Native support for Arrays makes it easy to store connections inside user profile
      • Sharding partitions user profiles across available servers
      • Documents enable disk locality of all profile data for a user
      [Diagram: Social Graph]
  87. Review
  88. • Variety, Velocity and Volume make it difficult
      • Documents are easier for many use cases
  89. • Distributed by default
      • High availability
      • Cloud deployment
  90. • Document oriented data model
      • Highly available deployments
      • Strong consistency model
      • Horizontally scalable architecture