Scaling with MongoDB - SF Mongo User Group 7-19-2011

My talk from SF Mongo User Group on 7-19-2011

Published in: Technology, Business

  1. Scaling with MongoDB. Jared Rosoff (jsr@10gen.com) - @forjared
  2. How do we do it today? We use a relational database but … We don’t use joins. We don’t use transactions. We add read-only slaves. We added a caching layer. We de-normalized our data. We implemented custom sharding. We buy bigger servers.
  3. How’s that working out for you?
  4. Costs go up
  5. Productivity goes down
  6. By engineers, for engineers
  7. The landscape (chart of scalability & performance against depth of functionality, placing Memcached, key / value stores, and RDBMS)
  8. Scaling your app: use documents; indexes make me happy; knowing your working set; disks are the bottleneck; replication makes reading fun; sharding for profit
  9. Scaling your data model
  10. Documents
      {
        author : "roger",
        date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
        text : "Spirited Away",
        tags : [ "Tezuka", "Manga" ],
        comments : [
          {
            author : "Fred",
            date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
            text : "Best Movie Ever"
          }
        ]
      }
  11. Disk Seeks & Data Locality: read = really really fast; seek = 5+ ms
  12. Disk Seeks & Data Locality (diagram: Post, Comment, Author)
  13. Disk Seeks & Data Locality (diagram: Post, Author, Comment ×5)
  14. Optimized indexes
  15. Table scans: find where x equals 7; objects 1, 2, 3, 4, 5, 6, 7; looked at 7 objects
  16. Tree Lookup: find where x equals 7; tree nodes 4, 6, 2, 7, 5, 3, 1; looked at 3 objects
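The two slides above compare how many objects each lookup strategy inspects. A minimal sketch of that comparison follows, assuming the tree on the slide is a balanced binary search tree over the values 1 to 7; MongoDB's own indexes are B-trees, so this only illustrates the counting argument. The function names are illustrative.

```javascript
// Values from the slides: seven objects with x = 1..7.
const values = [1, 2, 3, 4, 5, 6, 7];

// Table scan: inspect objects in order until a match is found.
function tableScan(items, target) {
  let looked = 0;
  for (const x of items) {
    looked += 1;
    if (x === target) return { found: true, looked };
  }
  return { found: false, looked };
}

// Build a balanced binary search tree from a sorted array.
function buildTree(sorted, lo = 0, hi = sorted.length - 1) {
  if (lo > hi) return null;
  const mid = Math.floor((lo + hi) / 2);
  return {
    value: sorted[mid],
    left: buildTree(sorted, lo, mid - 1),
    right: buildTree(sorted, mid + 1, hi),
  };
}

// Tree lookup: descend left or right, counting each node visited.
function treeLookup(node, target) {
  let looked = 0;
  while (node !== null) {
    looked += 1;
    if (target === node.value) return { found: true, looked };
    node = target < node.value ? node.left : node.right;
  }
  return { found: false, looked };
}

console.log(tableScan(values, 7).looked);             // 7
console.log(treeLookup(buildTree(values), 7).looked); // 3
```

The tree lookup visits 4, then 6, then 7, which matches the "looked at 3 objects" figure on the slide.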
  17. Random Index: entire index must fit in RAM
  18. Right Aligned: only small portion in RAM
  19. Working set size
  20. Working Set: active documents + used indexes (diagram: RAM, Disk)
  21. Page Fault (diagram: App, RAM, Disk): (1) app requests document; (2) document not in memory; (3) evict a page from memory; (4) read block from disk; (5) return document from memory
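The page-fault steps above can be sketched as a small simulation. The `PageCache` class is hypothetical, and the least-recently-used eviction policy is an assumption made for illustration; the slide does not say which page gets evicted.

```javascript
// Hypothetical page cache holding at most `capacity` pages.
// Eviction is least-recently-used (an assumption, not stated on the slide).
class PageCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.pages = new Map(); // Map insertion order doubles as recency order
    this.faults = 0;
  }

  read(pageId) {
    if (this.pages.has(pageId)) {
      // Hit: move the page to the most-recently-used position.
      const data = this.pages.get(pageId);
      this.pages.delete(pageId);
      this.pages.set(pageId, data);
      return data;
    }
    // Page fault: the document is not in memory.
    this.faults += 1;
    if (this.pages.size >= this.capacity) {
      // Evict the least recently used page to make room.
      const oldest = this.pages.keys().next().value;
      this.pages.delete(oldest);
    }
    const data = `block-${pageId}`; // stands in for reading a block from disk
    this.pages.set(pageId, data);
    return data;
  }
}

const cache = new PageCache(2);
[1, 2, 1, 3, 2].forEach((id) => cache.read(id));
console.log(cache.faults); // 4
```

With a capacity of 2 the access pattern keeps evicting pages it will need again; when the cache can hold every page in use, each page faults only once.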
  22. Figuring out working set
      > db.foo.stats()
      {
        "ns" : "test.foo",
        "count" : 1338330,
        "size" : 46915928,                  // Size of data
        "avgObjSize" : 35.05557523181876,   // Average document size
        "storageSize" : 86092032,           // Size on disk (and in memory!)
        "numExtents" : 12,
        "nindexes" : 2,
        "lastExtentSize" : 20872960,
        "paddingFactor" : 1,
        "flags" : 0,
        "totalIndexSize" : 99860480,        // Size of all indexes
        "indexSizes" : {                    // Size of each index
          "_id_" : 55877632,
          "x_1" : 43982848
        },
        "ok" : 1
      }
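One way to read the stats output above is as an upper bound on the working set: if the on-disk data plus every index fits in RAM, the active subset certainly does. The sketch below does that arithmetic in plain JavaScript with the figures from the slide; the helper names and the RAM sizes are hypothetical.

```javascript
// Figures copied from the db.foo.stats() output on the slide.
const stats = {
  size: 46915928,
  storageSize: 86092032,
  totalIndexSize: 99860480,
  indexSizes: { _id_: 55877632, x_1: 43982848 },
};

// Upper bound on the working set: all data on disk plus all indexes.
function maxWorkingSetBytes(s) {
  return s.storageSize + s.totalIndexSize;
}

// True when that upper bound fits in the given amount of RAM.
// `ramBytes` is a hypothetical figure supplied by the caller.
function fitsInRam(s, ramBytes) {
  return maxWorkingSetBytes(s) <= ramBytes;
}

const MB = 1024 * 1024;
console.log(maxWorkingSetBytes(stats));  // 185952512
console.log(fitsInRam(stats, 256 * MB)); // true
console.log(fitsInRam(stats, 128 * MB)); // false
```

The real working set is usually smaller than this bound, since it counts only active documents and the indexes actually in use.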
  23. Disk configurations
  24. Single Disk: ~200 seeks / second
  25. RAID0: ~200 seeks / second (×3 in diagram)
  26. RAID10: ~400 seeks / second (×3 in diagram)
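The per-disk figure on these slides follows from the 5 ms seek time quoted earlier in the deck: 1000 ms / 5 ms = 200 seeks per second. The sketch below works that out and extends it to arrays under a simplifying assumption that random reads spread evenly over every disk. The aggregate totals are derived from that model and do not appear on the slides, and the function and parameter names are illustrative.

```javascript
// Seeks per second for one disk, given the time per seek in milliseconds.
function seeksPerSecond(seekMs) {
  return 1000 / seekMs;
}

// Aggregate read seeks per second for an array, assuming reads spread
// evenly across all disks. `stripes` and `mirrorsPerStripe` are
// illustrative parameters, not terms used on the slides.
function arraySeeksPerSecond(seekMs, stripes, mirrorsPerStripe) {
  return seeksPerSecond(seekMs) * stripes * mirrorsPerStripe;
}

console.log(seeksPerSecond(5));            // 200
console.log(arraySeeksPerSecond(5, 3, 1)); // 600  (RAID0 over three disks)
console.log(arraySeeksPerSecond(5, 3, 2)); // 1200 (RAID10 over three mirrored pairs)
```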
  27. Replication
  28. Replica Sets (diagram): Primary (read / write); two Secondaries (read)
  29. Replica Sets (diagram): Primary (read / write); four Secondaries (read)
  30. Sharding
  31. (diagram) MongoS ×2; Shard 1 (0..10), Shard 2 (10..20), Shard 3 (20..30), Shard 4 (30..40); four Primaries, eight Secondaries
  32. 400GB Index?
  33. 400GB Index? Shard 1 (0..10), Shard 2 (10..20), Shard 3 (20..30), Shard 4 (30..40): 100GB index on each!
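The range layout on the sharding slides can be sketched as a small routing table. This is a simplified model in plain JavaScript, not the mongos implementation: it treats each range as half-open (min inclusive, max exclusive) and assumes the index splits evenly across shards. The function names are illustrative.

```javascript
// Shard key ranges from the slide, treated as half-open: min <= key < max.
const shards = [
  { name: "Shard 1", min: 0, max: 10 },
  { name: "Shard 2", min: 10, max: 20 },
  { name: "Shard 3", min: 20, max: 30 },
  { name: "Shard 4", min: 30, max: 40 },
];

// Return the name of the shard owning `key`, or null if no range covers it.
function routeKey(key) {
  const owner = shards.find((s) => key >= s.min && key < s.max);
  return owner ? owner.name : null;
}

// Return the shards that a range query over [lo, hi) would have to visit.
function shardsForRange(lo, hi) {
  return shards.filter((s) => lo < s.max && hi > s.min).map((s) => s.name);
}

// Index size per shard when the total index is split evenly.
function indexPerShardGB(totalGB) {
  return totalGB / shards.length;
}

console.log(routeKey(7));                      // Shard 1
console.log(shardsForRange(8, 22).join(", ")); // Shard 1, Shard 2, Shard 3
console.log(indexPerShardGB(400));             // 100
```

A write touches one shard, and each shard only needs RAM for its own quarter of the index, which is the point of the 400GB example.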
  34. Summary
  35. Summary: Use documents to your advantage! Optimize your indexes. Understand your working set. Use a sane disk configuration. Use replicas to scale reads. Use sharding to scale writes & working RAM.
