Scaling with MongoDB

Rick Copeland (@rick446)
Principal Consultant, Arborian Consulting, LLC

   Now a consultant, but formerly…

     Software engineer at SourceForge, early adopter of
     MongoDB (version 0.8)

     Wrote the SQLAlchemy book (I love SQL when it’s
     used well)

     Mainly write Python now, but have done C++, C#,
     Java, JavaScript, VHDL, Verilog, …
   You can do it with an RDBMS as long as you…
     Don’t use joins
     Don’t use transactions
     Use read-only slaves
     Use memcached
     Denormalize your data
     Use custom sharding/partitioning
     Do a lot of vertical scaling
      ▪ (we’re going to need a bigger box)
[Chart: hardware cost vs. capacity, year over year, when scaling vertically]
Scaling with MongoDB
   Use documents to improve locality

   Optimize your indexes

   Be aware of your working set

   Scaling your disks

   Replication for fault-tolerance and read scaling

   Sharding for read and write scaling
Relational (SQL)   MongoDB
----------------   -------
Database           Database
Table              Collection   (dynamic typing)
Index              Index        (B-tree, range-based)
Row                Document     (think JSON)
Column             Field        (primitive types + arrays, embedded documents)
{
    title: "Slides for Scaling with MongoDB",
    author: "Rick Copeland",
    date: ISODate("2012-02-29T19:30:00Z"),
    text: "My slides are available on speakerdeck.com",
    comments: [
      { author: "anonymous",
        date: ISODate("2012-02-29T19:30:01Z"),
        text: "Fristpsot!" },
      { author: "mark",
        date: ISODate("2012-02-29T19:45:23Z"),
        text: "Nice slides" } ] }

Embed comment data in the blog post document.
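Because the comments live inside the post document, one read returns everything needed to render the page. A minimal sketch in plain JavaScript (not the mongo shell) of the document shape above, illustrating that no second "comments" query or join is needed:

```javascript
// Sketch: with comments embedded, a single document lookup replaces a join.
// The post below mirrors the document shape from the slide.
const post = {
  title: "Slides for Scaling with MongoDB",
  author: "Rick Copeland",
  comments: [
    { author: "anonymous", text: "Fristpsot!" },
    { author: "mark", text: "Nice slides" },
  ],
};

// Everything arrives in one read; filtering comments is local work,
// not a second round trip to the database.
const commentsByMark = post.comments.filter(c => c.author === "mark");
```

In the mongo shell the equivalent is a single `db.posts.findOne(...)`: the post and all its comments come back in one seek.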
Seek = 5+ ms   Read = really really fast
[Diagram: a Post document with its Author and Comments embedded contiguously,
versus the same data scattered across the disk as separate records]
Find where x equals 7

[Diagram: linear scan over 1 2 3 4 5 6 7, looked at 7 objects]

[Diagram: B-tree lookup (root 4, children 2 and 6, leaves 1 3 5 7), looked at 3 objects]
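The two lookups above can be sketched in plain JavaScript: a binary search over a sorted array is the flat-file analogue of a B-tree descent, and counting comparisons reproduces the 7-vs-3 numbers from the slides.

```javascript
// Linear scan: examine every object until the target is found.
function linearFind(arr, target) {
  let looked = 0;
  for (const x of arr) {
    looked++;
    if (x === target) break;
  }
  return looked;
}

// Binary search: halve the search space each step, like descending a B-tree.
function binaryFind(arr, target) {
  let looked = 0, lo = 0, hi = arr.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    looked++;
    if (arr[mid] === target) break;
    if (arr[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return looked;
}

const keys = [1, 2, 3, 4, 5, 6, 7];
// linearFind(keys, 7) touches 7 objects; binaryFind(keys, 7) touches 3.
```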
[Diagram: with one index shape the entire index must fit in RAM;
with a right-aligned index only a small portion must be in RAM]
   Working set =
     sizeof(frequently used data)
     + sizeof(frequently used indexes)

   Right-aligned indexes reduce working set size

   Working set should fit in available RAM for best
    performance

   Page faults are the biggest cause of performance
    loss in MongoDB
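The working-set formula above is simple arithmetic; a sketch with hypothetical sizes (in practice the inputs come from `db.foo.stats()` and knowledge of your access pattern):

```javascript
// Sketch: estimating the working set per the formula on the slide.
// All numbers here are made-up examples, not measurements.
const frequentlyUsedDataBytes  = 40 * 1024 ** 3; // ~40 GB of hot documents
const frequentlyUsedIndexBytes = 12 * 1024 ** 3; // ~12 GB of hot indexes
const availableRamBytes        = 64 * 1024 ** 3; // 64 GB on the box

const workingSet = frequentlyUsedDataBytes + frequentlyUsedIndexBytes;

// If this is false, expect page faults -- the biggest performance killer.
const fitsInRam = workingSet <= availableRamBytes;
```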
> db.foo.stats()
{
  "ns" : "test.foo",
  "count" : 1338330,
  "size" : 46915928,                 // data size
  "avgObjSize" : 35.05557523181876,  // average doc size
  "storageSize" : 86092032,          // size on disk (or RAM!)
  "numExtents" : 12,
  "nindexes" : 2,
  "lastExtentSize" : 20872960,
  "paddingFactor" : 1,
  "flags" : 0,
  "totalIndexSize" : 99860480,       // size of all indexes
  "indexSizes" : {                   // size of each index
    "_id_" : 55877632,
    "x_1" : 43982848 },
  "ok" : 1
}
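Two derived numbers worth checking against the output above: `avgObjSize` is just `size / count`, and the footprint to weigh against RAM is roughly `storageSize + totalIndexSize`. A sketch using the figures copied from the stats output:

```javascript
// Numbers copied from the db.foo.stats() output above.
const stats = {
  count: 1338330,
  size: 46915928,           // data size in bytes
  storageSize: 86092032,    // size on disk (or RAM!)
  totalIndexSize: 99860480, // size of all indexes
};

// avgObjSize is simply data size divided by document count (~35.06 bytes).
const avgObjSize = stats.size / stats.count;

// A rough upper bound on what this collection wants resident in RAM.
const footprint = stats.storageSize + stats.totalIndexSize;
```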
~200 seeks / second  (single disk)

~200 seeks / second   ~200 seeks / second   ~200 seeks / second
   Striped across disks: faster, but less reliable

~400 seeks / second   ~400 seeks / second   ~400 seeks / second
   Striped and mirrored: faster and more reliable ($$$ though)
   Old and busted: master/slave replication

   The new hotness: replica sets with automatic
    failover
[Diagram: the application sends reads and writes to the Primary;
reads may also go to either of two Secondaries]
   Primary handles all
    writes

   Application optionally
    sends reads to slaves

   Heartbeat manages
    automatic failover
   Special collection (the oplog) records operations
    idempotently

   Secondaries read from primary oplog and replay
    operations locally

   Space is preallocated and fixed for the oplog
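"Records operations idempotently" means each oplog entry describes its *outcome*, not the original operation, so a secondary can replay an entry more than once without drifting. A sketch (plain JavaScript, with a hypothetical `applySet` helper standing in for the replay step):

```javascript
// Sketch: the client issued { $inc: { views: 1 } } against { views: 41 },
// but the oplog records the resulting $set instead, which is idempotent.
const oplogEntry = { op: "u", o: { $set: { views: 42 } } };

// Hypothetical replay helper: apply a $set to a document.
function applySet(doc, setFields) {
  return { ...doc, ...setFields };
}

let doc = { views: 41 };
doc = applySet(doc, oplogEntry.o.$set); // first replay
doc = applySet(doc, oplogEntry.o.$set); // accidental second replay: no drift
// doc.views is 42 either way; replaying an $inc twice would have given 43.
```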
{
  "ts" : Timestamp(1317653790000, 2),
  "h" : -6022751846629753359,
  "op" : "i",                        // insert
  "ns" : "confoo.People",            // collection name
  "o" : {                            // object to insert
    "_id" : ObjectId("4e89cd1e0364241932324269"),
    "first" : "Rick",
    "last" : "Copeland"
  }
}
   Use heartbeat signal to detect failure

   When primary can’t be reached, elect a new one

   Replica that’s the most up-to-date is chosen

   If there is skew, changes not on new primary are
    saved to a .bson file for manual reconciliation

   Application can require data to be replicated to a
    majority to ensure this doesn’t happen
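The election rule above ("the replica that's the most up-to-date is chosen") can be sketched as picking the reachable member with the highest oplog position. Field names here are illustrative only, not the real election protocol:

```javascript
// Sketch: electing a new primary when the old one is unreachable.
// Hypothetical member list; "optime" stands in for the oplog position.
const reachable = [
  { host: "db2", optime: 1317653790 },
  { host: "db3", optime: 1317653795 }, // most up-to-date
  { host: "db4", optime: 1317653788 },
];

// Choose the member that has replicated the furthest.
const newPrimary = reachable.reduce((a, b) => (b.optime > a.optime ? b : a));
```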
   Priority
     Slower nodes with lower priority
     Backup or read-only nodes to never be primary

   slaveDelay
     Fat-finger protection

   Data center awareness and tagging
     Application can ensure complex replication
     guarantees
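These knobs live in the replica-set configuration document. A hedged sketch of such a config, following the shape of `rs.conf()`; hostnames and tag values are made up:

```javascript
// Sketch: a replica-set config using priority, slaveDelay, and tags.
// Applied in the shell with rs.reconfig(config). Hosts/tags are examples.
const config = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db1:27017", priority: 2,    // preferred primary
      tags: { dc: "east" } },
    { _id: 1, host: "db2:27017", priority: 1,
      tags: { dc: "west" } },
    { _id: 2, host: "db3:27017", priority: 0,    // never becomes primary
      hidden: true,
      slaveDelay: 3600 },                        // 1-hour fat-finger window
  ],
};
```

The delayed member trails the primary by an hour, so an accidental `db.users.drop()` can still be recovered from it before the mistake replicates.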
   Reads scale nicely
     As long as the working set fits in RAM
     … and you don’t mind eventual consistency


   Sharding to the rescue!
     Automatically partitioned data sets
     Scale writes and reads
     Automatic load balancing between the shards
[Diagram: applications talk to two MongoS routers; three config servers
(Config 1, Config 2, Config 3) hold the cluster configuration; four shards
hold shard-key ranges 0..10, 10..20, 20..30, 30..40, and each shard is a
replica set with one Primary and two Secondaries]
   Sharding is per-collection and range-based

   The highest-impact (and hardest-to-change)
    decision you make is the shard key
     Random keys: good for writes, bad for reads
     Right-aligned index: bad for writes
     Small # of discrete keys: very bad
     Ideal: balance writes, make reads routable by mongos
     Optimal shard key selection is hard
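"Routable by mongos" means a query that includes the shard key is sent to exactly one shard, instead of being broadcast to all of them. A sketch of range-based routing, using the chunk ranges from the cluster diagram (0..10, 10..20, 20..30, 30..40):

```javascript
// Sketch: how a mongos routes by a range-based shard key.
// Chunk ranges taken from the cluster diagram; min inclusive, max exclusive.
const chunks = [
  { min: 0,  max: 10, shard: "shard1" },
  { min: 10, max: 20, shard: "shard2" },
  { min: 20, max: 30, shard: "shard3" },
  { min: 30, max: 40, shard: "shard4" },
];

// A query containing the shard key touches exactly one shard;
// a query without it would have to be broadcast to all four.
function route(shardKeyValue) {
  const c = chunks.find(c => shardKeyValue >= c.min && shardKeyValue < c.max);
  return c ? c.shard : null;
}
```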
[Diagram: three shards, each a replica set (RS1, RS2, RS3) with two
priority-1 members in the primary data center and one priority-0 member
in the secondary data center; Config 1 and Config 2 live in the primary
data center, Config 3 in the secondary data center]
   Writes and reads both scale (with good choice of
    shard key)

   Reads scale while remaining strongly consistent

   Partitioning ensures you get more usable RAM

   Pitfall: don’t wait too long to add capacity
Rick Copeland @rick446
Arborian Consulting, LLC

Editor's Notes

  1. You’d like to just ‘add capacity’, but you end up having to buy a bigger server. Build your own infrastructure and you pay more for less as you scale. The cloud can help with this, but only up to a point; what happens when you’re using the largest instance? Time to rearchitect.
  2. There are a lot of features that make RDBMSs attractive. But as we scale, we need to turn off a lot of them to get performance increases. We end up with something that scales, but it’s hard to use.
  3. RAM functions as a cache. Replication ends up caching documents in multiple locations. Sharding makes sure documents only have one ‘home’.
  4. A single shard is a replica set. MongoS is a router that determines where reads and writes go. Documents are ‘chunked’ into ranges; chunks can be split and migrated to other servers based on load. Configuration servers persist the location of particular shard key ranges. The cluster is alive when one or more config servers are down, but there can be no migration.