MongoDB Miami Meetup 1/26/15: Introduction to WiredTiger

A high level overview of WiredTiger and the new storage engine API in MongoDB 3.0.

Published in: Technology

  1. Introduction to WiredTiger and the Storage Engine API • Valeri Karpov, NodeJS Engineer, MongoDB • www.thecodebarbarian.com • www.slideshare.net/vkarpov15 • github.com/vkarpov15 • @code_barbarian
  2. A Bit About Me • CI/NodeJS Engineer at MongoDB • Maintainer of mongoose ODM • Recently working on rewriting mongodump, etc.
  3. Talk Overview • What is the Storage Engine API? • What is WiredTiger (WT) and why you should care • Basic WT internals and gotchas • MMS Automation and WT • Some very basic performance numbers
  4. Introducing Storage Engines • How MongoDB persists data • MongoDB 2.6 and earlier: “mmapv1” storage engine • MongoDB 3.0 has a few options: mmapv1, in_memory, wiredTiger, devnull (now /dev/null does support sharding!) • Internal hack: Twitter storage engine
  5. Why a Storage Engine API? • Different performance characteristics • mmapv1 doesn’t handle certain workloads well • Consistent API on top of the storage layer • Can mix storage engines in a replica set or sharded cluster!
  6. Storage Engines Visually • mmapv1 persists data to dbpath
  7. Storage Engines Visually • WT persists data differently
  8. What is WiredTiger? • Storage engine company founded by Berkeley DB alums • Recently acquired by MongoDB • Available as a storage engine option in MongoDB 3.0
  9. Why is WiredTiger Awesome? • Document-level locking • Compression on disk • Consistency without journaling • Better performance on certain workloads
  10. Document-level Locking • The often-criticized global write lock was removed in 2.2 in favor of database-level locking • 3.0 with mmapv1 has collection-level locking • 3.0 with WT only locks at the document level • Writes no longer block all other writes • Better CPU usage: more cores ~= more writes
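
The effect of lock granularity on write concurrency can be sketched with a toy model. This only illustrates the four granularities named on the slide; it does not model how WiredTiger actually implements document-level concurrency, and the workload (database, collection, and document names) is hypothetical.

```python
# Lock granularities named on the slide, coarsest to finest.
GRANULARITIES = ("global", "database", "collection", "document")


def lock_key(granularity, db, coll, doc_id):
    """Return the resource a write must hold exclusively at this granularity."""
    if granularity == "global":
        return ("*",)
    if granularity == "database":
        return (db,)
    if granularity == "collection":
        return (db, coll)
    if granularity == "document":
        return (db, coll, doc_id)
    raise ValueError(f"unknown granularity: {granularity}")


def max_parallel_writes(writes, granularity):
    """Count distinct lock keys: writes with different keys never block each other."""
    return len({lock_key(granularity, *w) for w in writes})


# Hypothetical workload: four writes, each to a different document.
writes = [
    ("shop", "orders", 1),
    ("shop", "orders", 2),
    ("shop", "users", 1),
    ("logs", "events", 1),
]

for g in GRANULARITIES:
    print(f"{g:<10} -> up to {max_parallel_writes(writes, g)} concurrent writes")
```

In this model the same four writes go from fully serialized under a global lock to fully parallel under document-level locking, which is the intuition behind "more cores ~= more writes".
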
  11. Compression • WT uses snappy compression by default • Data is compressed on disk • 2 supported compression algorithms: snappy (default; good compression, relatively low overhead) and zlib (better compression, but at the cost of more overhead)
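
The size-versus-overhead trade-off can be felt with a small experiment. Snappy is not in the Python standard library, so this sketch uses zlib at its lowest and highest effort levels as a stand-in for "cheaper" versus "more thorough" compression. The documents are hypothetical, and they are serialized as JSON in one buffer rather than as BSON in WiredTiger's on-disk blocks, so the ratios are illustrative only.

```python
import json
import zlib

# Hypothetical analytics-style documents with repetitive field names.
docs = [
    {"event": "page_view", "user_id": i % 50, "path": "/products/%d" % (i % 10)}
    for i in range(1000)
]
raw = "\n".join(json.dumps(d, sort_keys=True) for d in docs).encode("utf-8")

fast = zlib.compress(raw, 1)   # lowest effort: less CPU, usually larger output
small = zlib.compress(raw, 9)  # highest effort: more CPU, usually smaller output

print(f"raw:          {len(raw)} bytes")
print(f"zlib level 1: {len(fast)} bytes")
print(f"zlib level 9: {len(small)} bytes")

# Compression is lossless: the original bytes come back unchanged.
assert zlib.decompress(small) == raw
```

Repetitive field names are why document data tends to compress well: every document repeats the same keys.
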
  12. Consistency without Journaling • mmapv1 uses a write-ahead log to guarantee consistency as well as durability • WT doesn’t have this problem: no in-place updates • Potentially good for insert-heavy workloads • Rely on replication for durability • More on this in the next section
  13. WiredTiger Internals: Running It • mongod now has a --storageEngine option
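
As a minimal sketch of what that looks like on the command line, the helper below builds a mongod invocation. The helper name, the dbpath, and the port are assumptions for illustration; --storageEngine, --dbpath, and --port are standard mongod flags. Only the two engine names most relevant to this talk are accepted here.

```python
import shlex

# Engine names accepted by this sketch (a deliberate subset of the options on slide 4).
SUPPORTED_ENGINES = ("mmapv1", "wiredTiger")


def mongod_command(dbpath, engine="wiredTiger", port=27017):
    """Build the argv list for starting mongod with a given storage engine."""
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported storage engine: {engine}")
    return ["mongod", "--storageEngine", engine, "--dbpath", dbpath, "--port", str(port)]


# Hypothetical dbpath; print the command instead of running it.
cmd = mongod_command("/data/wt")
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run` directly if you want to launch the server from a script.
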
  14. WiredTiger and dbpath • Don’t run WT when you have mmapv1 files in your dbpath (and vice versa)
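
A pre-flight check along these lines can catch the mistake before mongod starts. This is a rough sketch, and the file-name patterns are assumptions based on typical layouts: mmapv1 writes `<db>.ns` namespace files, while WiredTiger writes a `WiredTiger` metadata file plus `*.wt` table files. The function names are hypothetical.

```python
import os
import tempfile


def detect_engine_files(dbpath):
    """Guess which storage engines have left data files in dbpath."""
    found = set()
    for name in os.listdir(dbpath):
        if name == "WiredTiger" or name.endswith(".wt"):
            found.add("wiredTiger")
        elif name.endswith(".ns"):
            found.add("mmapv1")
    return found


def safe_to_start(dbpath, engine):
    """True if dbpath is empty or only holds files from the requested engine."""
    return detect_engine_files(dbpath) <= {engine}


# Demo against a throwaway directory that mimics an mmapv1 dbpath.
with tempfile.TemporaryDirectory() as demo:
    for name in ("test.ns", "test.0"):
        open(os.path.join(demo, name), "w").close()
    print(detect_engine_files(demo))
    print(safe_to_start(demo, "wiredTiger"))
```
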
  15. Upgrading to WT • Can’t copy database files • Can’t just restart with the same dbpath • Other methods for upgrading still work: initial sync from a replica set, or mongodump/mongorestore • Can still do a rolling upgrade of a replica set to WT: shut down a secondary, delete its dbpath, bring it back up with --storageEngine wiredTiger
  16. Other Configuration Options • Compression: --wiredTigerCollectionBlockCompressor
  17. Other Configuration Options • directoryperdb: doesn’t exist in WT • Databases are a higher-level abstraction with WT • The following options also have no WT equivalent: noprealloc, syncdelay, smallfiles, journalCommitInterval
  18. Configuration with YAML Files • MongoDB 2.6 introduced YAML config files • The storage.wiredTiger field lets you tweak WT options
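
To show the shape of such a file, the sketch below renders a nested dict as YAML text using only the standard library. The dbPath, cache size, and compressor choice are hypothetical values picked for illustration; `storage.engine`, `storage.wiredTiger.engineConfig.cacheSizeGB`, and `storage.wiredTiger.collectionConfig.blockCompressor` are option names from MongoDB's configuration file format. The tiny renderer only handles nested mappings of scalars, which is all this example needs.

```python
def to_yaml(node, indent=0):
    """Render a nested dict of scalars as a list of YAML lines."""
    lines = []
    pad = "  " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


# Hypothetical settings for a WiredTiger-backed mongod.
config = {
    "storage": {
        "dbPath": "/data/wt",
        "engine": "wiredTiger",
        "wiredTiger": {
            "engineConfig": {"cacheSizeGB": 1},
            "collectionConfig": {"blockCompressor": "zlib"},
        },
    }
}

print("\n".join(to_yaml(config)))
```

The printed text can be saved to a file and passed to mongod with `--config`.
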
  19. WiredTiger Journaling • Journaling in WT is a little different • Write-ahead log committed to disk at checkpoints • By default, checkpoint every 60 seconds or 2 GB written • Data files are always consistent; no journaling means you lose data since the last checkpoint • No journal commit interval: writes are written to the journal as they come in
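
The "lose data since the last checkpoint" point can be made concrete with a toy timeline. This is a simplified model under stated assumptions: checkpoints fire exactly every 60 seconds from time zero, the 2 GB trigger is ignored, and with journaling on every write before the crash is treated as recoverable. The write times and crash time are hypothetical.

```python
CHECKPOINT_INTERVAL_S = 60  # default time-based checkpoint interval from the slide


def last_checkpoint_before(crash_time):
    """Time of the most recent checkpoint at or before the crash."""
    return (crash_time // CHECKPOINT_INTERVAL_S) * CHECKPOINT_INTERVAL_S


def surviving_writes(write_times, crash_time, journaled):
    """Return the writes (by timestamp, in seconds) that survive a crash."""
    if journaled:
        # Model assumption: the journal lets us replay everything up to the crash.
        cutoff = crash_time
    else:
        # Without a journal, only data captured by the last checkpoint is on disk.
        cutoff = last_checkpoint_before(crash_time)
    return [t for t in write_times if t <= cutoff]


write_times = [5, 30, 59, 61, 90, 119]  # hypothetical write timestamps
crash_time = 100

print(surviving_writes(write_times, crash_time, journaled=False))  # [5, 30, 59]
print(surviving_writes(write_times, crash_time, journaled=True))   # [5, 30, 59, 61, 90]
```

In this model, running without a journal costs the writes at 61 s and 90 s, which is why the talk suggests relying on replication for durability in that setup.
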
  20. Gotcha: system.indexes, system.namespaces deprecated • Special collections in mmapv1 • Use explicit commands instead: db.collection.getIndexes(), db.getCollectionNames()
  21. Gotcha: No 32-bit Support • WT storage engine will not work on 32-bit platforms at all
  22. Using MMS Automation with WT • MMS automation allows you to manage and deploy MongoDB installations • Demo of upgrading a standalone to WT
  23. Some Basic Performance Numbers • My desired use case: MongoDB for analytics data • Write-heavy workloads aren’t mmapv1’s strong suit • Don’t care about durability as much, but do care about high throughput • Compression is a plus • How does WT without journaling do on insert-only? • Simple N=1 experiment
  24. Some Basic Performance Numbers • mongo shell’s benchRun function
  25. Some Basic Performance Numbers • mmapv1 vs. WT with --nojournal
  26. Some Basic Performance Numbers • How does the compression work in this example? • After the bench run, WT’s /data/db sums to ~23 MB • With --noprealloc, --nopreallocj, --smallfiles, mmapv1 has ~100 MB • Not really a fair comparison since the data is very small
  27. Thanks for Listening! • Slides on: Twitter: @code_barbarian; Slideshare: slideshare.net/vkarpov15
