MongoDB Miami Meetup 1/26/15: Introduction to WiredTiger
1. Introduction to WiredTiger and the
Storage Engine API
Valeri Karpov
NodeJS Engineer, MongoDB
www.thecodebarbarian.com
www.slideshare.net/vkarpov15
github.com/vkarpov15
@code_barbarian
2. *
A Bit About Me
•NodeJS Engineer at MongoDB
•Maintainer of mongoose ODM
•Recently working on rewriting mongodump, etc.
3. *
Talk Overview
•What is the Storage Engine API?
•What is WT and why you should care
•Basic WT internals and gotchas
•MMS Automation and WT
•Some very basic performance numbers
4. *
Introducing Storage Engines
•How MongoDB persists data
•<= MongoDB 2.6: “mmapv1” storage engine
•MongoDB 3.0 has a few options
• mmapv1
• in_memory
• wiredTiger
• devnull (now /dev/null does support sharding!)
•Internal hack: Twitter storage engine
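Picking one of these engines is a startup option on mongod. A minimal sketch (the dbpath is a placeholder, and the data files must have been created by the same engine):

```sh
# Start mongod with the WiredTiger storage engine (MongoDB 3.0+)
mongod --storageEngine wiredTiger --dbpath /data/wt
```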
5. *
Why Storage Engine API?
•Different performance characteristics
•mmapv1 doesn’t handle certain workloads well
•Consistent API on top of storage layer
• Can mix storage engines in a replset or sharded cluster!
8. *
What is WiredTiger?
•Storage engine company founded by BerkeleyDB alums
•Recently acquired by MongoDB
•Available as a storage engine option in MongoDB 3.0
9. *
Why is WiredTiger Awesome?
•Document-level locking
•Compression on disk
•Consistency without journaling
•Better performance on certain workloads
10. *
Document-level Locking
•The often-criticized global write lock was removed in 2.2
•Database-level locking
•3.0 with mmapv1 has collection-level locking
•3.0 with WT only locks at the document level
•Writes no longer block all other writes
•Better CPU usage: more cores ~= more writes
11. *
Compression
•WT uses snappy compression by default
•Data is compressed on disk
•2 supported compression algorithms:
• snappy: default. Good compression, relatively low overhead
• zlib: Better compression, but at cost of more overhead
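The block compressor set at startup applies server-wide, but 3.0 also lets you override it per collection when the collection is created. A sketch in the mongo shell (the collection name is made up):

```javascript
// Create a collection whose data files use zlib instead of the
// server-wide default (snappy). Run inside the mongo shell.
db.createCollection("events", {
  storageEngine: {
    wiredTiger: { configString: "block_compressor=zlib" }
  }
})
```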
12. *
Consistency without Journaling
•mmapv1 uses a write-ahead log to guarantee consistency as well as durability
•WT doesn’t have this problem: no in-place updates
•Potentially good for insert-heavy workloads
•Rely on replication for durability
•More on this in the next section
15. *
Upgrading to WT
•Can’t copy database files
•Can’t just restart with same dbpath
•Other methods for upgrading still work:
• Initial sync from replica set
• mongodump/mongorestore
•Can still do rolling upgrade of replica set to WT:
• Shut down secondary, delete dbpath, bring it back up with --storageEngine wiredTiger
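One node at a time, that rolling upgrade looks roughly like this (paths and the replica set name are placeholders, not a script to run verbatim):

```sh
# On the secondary being upgraded:
mongod --dbpath /data/db --shutdown      # stop the node cleanly
rm -rf /data/db/*                        # wipe the old mmapv1 files
mongod --dbpath /data/db --replSet rs0 \
       --storageEngine wiredTiger        # restart; node initial-syncs from the set
```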
17. *
Other Configuration Options
•directoryperdb: doesn’t exist in WT
•Databases are a higher level abstraction with WT
•Following options also have no WT equivalent
• noprealloc
• syncdelay
• smallfiles
• journalCommitInterval
18. *
Configuration with YAML Files
•MongoDB 2.6 introduced YAML config files
•The storage.wiredTiger field lets you tweak WT options
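A sketch of what that section of the YAML config file can look like (values are illustrative, not recommendations):

```yaml
storage:
  dbPath: /data/db
  engine: wiredTiger
  wiredTiger:
    engineConfig:
      cacheSizeGB: 4            # size of WT's internal cache
    collectionConfig:
      blockCompressor: snappy   # snappy | zlib | none
```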
19. *
WiredTiger Journaling
•Journaling in WT is a little different
•Write-ahead log committed to disk at checkpoints
•By default checkpoint every 60 seconds or 2GB written
•Data files are always consistent - running without journaling means you lose data since the last checkpoint
•No journal commit interval: writes are written to the journal as they come in
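For workloads that rely on replication rather than the journal for durability, journaling can be switched off in the same YAML config file (accepting that a crash loses anything since the last checkpoint):

```yaml
storage:
  engine: wiredTiger
  journal:
    enabled: false   # data files stay consistent; crash loses post-checkpoint writes
```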
21. *
Gotcha: No 32-bit Support
•WT storage engine will not work on 32-bit platforms at all
22. *
Using MMS Automation with WT
•MMS Automation allows you to manage and deploy MongoDB installations
•Demo of upgrading a standalone to WT
23. *
Some Basic Performance Numbers
•My desired use case: MongoDB for analytics data
•Write-heavy workloads aren’t mmapv1’s strong suit
•Don’t care about durability as much, but do care about high throughput
•Compression is a plus
•How does WT w/o journaling do on insert-only?
•Simple N=1 experiment
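The experiment was along these lines - a hedged sketch of an insert-only loop in the mongo shell (collection name and document shape are made up, not the actual benchmark):

```javascript
// Insert-only micro-benchmark, run in the mongo shell against a
// mongod started with --storageEngine wiredTiger --nojournal
var start = new Date();
for (var i = 0; i < 100000; ++i) {
  db.bench.insert({ _id: i, ts: new Date(), payload: new Array(101).join("x") });
}
print("elapsed ms: " + (new Date() - start));
```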
26. *
Some Basic Performance Numbers
•How does the compression work in this example?
•After the bench run, WT’s /data/db sums to ~23MB
•With --noprealloc, --nopreallocj, --smallfiles, mmapv1 has ~100MB
•Not a really fair comparison, since the data set is very small