3. Agenda
• Pluggable Storage Engines
• WiredTiger Storage Engine
– Document-Level Locking Concurrency Control
– Compression
– Installation & Upgrade
• Other New Stuff in 3.0
• Public Service Announcement
• There will be a test at the end
5. How does MongoDB persist data?
• MongoDB <= 2.6
– MMAPv1 Storage Engine
– Uses Memory Mapped Files
• MongoDB 3.0
– MMAPv1
• still the default
• now with collection-level locking!
– WiredTiger
6. Storage Engine
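[Diagram: pluggable storage engine architecture]
• Applications: Content Repo, IoT Sensor Backend, Ad Service, Customer Analytics, Archive
• MongoDB Query Language (MQL) + Native Drivers
• MongoDB Document Data Model
• Storage engines supported in MongoDB 3.0: MMAP V1, WT
• Future possible storage engines: In-Memory, ?, ?
• Alongside the stack: Management, Security
• Diagram labels: Example Future State, Experimental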
7. Storage Engine API
• Lets you plug in different storage engines
– Different workloads require different performance
characteristics
– MMAPv1 is not ideal for all workloads
– More flexibility: you can mix storage engines on the same
replica set/sharded cluster
• Opportunity to integrate further (HDFS, native encrypted,
hardware optimized …)
9. History
• Authors are former members of the Berkeley DB team
– WT product and team acquired by MongoDB
– Standalone engine already in use in large
deployments, including Amazon
10. Why is WiredTiger Awesome
• Document-level concurrency
• Compression
• Consistency without journaling
• Better performance on certain workloads
– write heavy
• Vertically scalable
– Allows full hardware utilization
– More tunable
11. Document-Level Concurrency
• Uses optimistic concurrency control to minimize
contention between threads
– One thread yields on write contention to same document
– Atomic update replaces latching/locking
• Writes no longer block all other writers
• CPU utilization directly correlates with
performance
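The "yield on write contention" idea can be illustrated with a toy model. This is a minimal sketch of optimistic, per-document updates, not WiredTiger's actual implementation: the `VersionedDoc` class, `increment`, and `run_demo` are hypothetical names, and a small lock stands in for the atomic compare-and-swap instruction that Python does not expose.

```python
import threading


class VersionedDoc:
    """Toy document: a write succeeds only if nobody else wrote first."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        # Assumption: this lock only emulates one atomic hardware
        # instruction; it is never held while a writer does its work.
        self._atomic = threading.Lock()

    def read(self):
        with self._atomic:
            return self.value, self.version

    def compare_and_swap(self, expected_version, new_value):
        with self._atomic:
            if self.version != expected_version:
                return False  # write conflict: another writer got there first
            self.value = new_value
            self.version += 1
            return True


def increment(doc, times):
    """Increment doc.value `times` times, retrying after each conflict."""
    for _ in range(times):
        while True:
            value, version = doc.read()
            if doc.compare_and_swap(version, value + 1):
                break  # our write won; otherwise loop and retry


def run_demo(num_threads=4, increments=1000):
    """Half the threads write document "a", the other half document "b"."""
    docs = {"a": VersionedDoc(0), "b": VersionedDoc(0)}
    threads = [
        threading.Thread(
            target=increment,
            args=(docs["a" if i % 2 == 0 else "b"], increments),
        )
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return docs["a"].value, docs["b"].value


a_total, b_total = run_demo()
print("a:", a_total, "b:", b_total)
```

Writers to document "a" never wait on writers to document "b"; only writers that collide on the same document retry.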
12. 50%-80% Less Storage via Compression
• Better storage utilization
• Higher I/O scalability
• Multiple compression options
– Snappy (default) - Good compression benefits
with little CPU/performance impact
– zlib - Extremely good compression at a cost of
additional CPU/degraded performance
– None
• Data and journal compressed on disk
• Indexes compressed on disk and in memory
• No more cryptic field names in documents!
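A rough way to see why shortened field names matter less once data is compressed: the sketch below compares raw and compressed sizes for the same records with long and short field names. It rests on several assumptions: JSON stands in for BSON, Python's `zlib` module stands in for the engine's block compressor, and the field names and record contents are made up, so the numbers are illustrative only.

```python
import json
import zlib

# Hypothetical field names, invented for this illustration.
LONG_NAMES = ("customer_first_name", "customer_last_name", "customer_email_address")
SHORT_NAMES = ("fn", "ln", "em")


def encoded_sizes(field_names, num_docs=1000):
    """Return (raw_bytes, zlib_bytes) for num_docs newline-joined JSON docs."""
    first, last, email = field_names
    docs = [
        json.dumps({
            first: f"first{i}",
            last: f"last{i}",
            email: f"user{i}@example.com",
        })
        for i in range(num_docs)
    ]
    raw = "\n".join(docs).encode("utf-8")
    return len(raw), len(zlib.compress(raw))


raw_long, zlib_long = encoded_sizes(LONG_NAMES)
raw_short, zlib_short = encoded_sizes(SHORT_NAMES)

print("raw bytes:  long names", raw_long, "short names", raw_short)
print("zlib bytes: long names", zlib_long, "short names", zlib_short)
```

Repeated field names compress to short back-references, so the size penalty for descriptive names shrinks considerably after compression.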
14. Filesystem Layout
• Data stored as conventional B+ tree on disk
• Each collection and index stored in own file
• WT fails to start if MMAPv1 files found in
dbpath
• No in-place updates
– Rewrites document every time, reuses space
– No more padding factor!
• Journal has own folder under dbpath
• You can now store indexes on separate
volumes!
15. Cache
• WT uses two caches
– WiredTiger cache stores uncompressed data
• ideally, working set fits in WT cache
– File system cache stores compressed data
– By default, the WT cache uses the larger of 50% of
system memory or 1GB
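The default sizing rule above amounts to a single `max`. A minimal sketch of the rule as stated on this slide for MongoDB 3.0 (the function name is hypothetical, and later releases use a different default):

```python
GIB = 1024 ** 3


def default_wt_cache_bytes(system_ram_bytes):
    """Larger of half the system memory or 1 GiB, per the slide's rule."""
    return max(system_ram_bytes // 2, 1 * GIB)


for ram_gib in (1, 2, 16, 64):
    cache_gib = default_wt_cache_bytes(ram_gib * GIB) / GIB
    print(f"{ram_gib} GiB RAM, default WT cache {cache_gib:g} GiB")
```

On small machines the 1GB floor wins, which can leave little room for the file system cache; the `--wiredTigerCacheSizeGB` option of mongod overrides the default.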
16. Supported Platforms
• Supported Platforms
– Linux
– Windows
– Mac OS X
• Non-Supported Platforms
– NO Solaris (yet)
– NO 32-bit (ever)
17. Gotchas
• MMAPv1-specific catalog metadata is deprecated
– system.indexes & system.namespaces
– System metadata should be accessed via
explicit commands going forward:
db.collection.getIndexes(), db.getCollectionNames()
• Cold start penalty
– due to separate WiredTiger cache
19. How Do I Install It?
• If starting from scratch, add one additional flag
when launching mongod:
--storageEngine=wiredTiger
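For scripted deployments, the launch command can be assembled programmatically. The sketch below only builds the argument list and does not start a server; the helper name `mongod_args` and the path `/data/wt` are hypothetical, while the flags themselves are mongod 3.0 options (compressor choice, cache size, and a separate directory for indexes).

```python
def mongod_args(dbpath, compressor="snappy", cache_size_gb=None,
                directory_for_indexes=False):
    """Build a mongod argument list that selects the WiredTiger engine."""
    if compressor not in ("snappy", "zlib", "none"):
        raise ValueError(f"unknown compressor: {compressor!r}")
    args = [
        "mongod",
        "--dbpath", dbpath,
        "--storageEngine", "wiredTiger",
        "--wiredTigerCollectionBlockCompressor", compressor,
    ]
    if cache_size_gb is not None:
        args += ["--wiredTigerCacheSizeGB", str(cache_size_gb)]
    if directory_for_indexes:
        # Keeps indexes in their own subdirectory, which can then be
        # placed on a separate volume.
        args.append("--wiredTigerDirectoryForIndexes")
    return args


# "/data/wt" is a made-up, empty dbpath used only for illustration.
print(" ".join(mongod_args("/data/wt", compressor="zlib", cache_size_gb=4)))
```

The resulting list can be passed to `subprocess.Popen` as-is, which avoids shell quoting issues.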
20. How Do I Upgrade to it?
• 2 ways:
1. mongodump/mongorestore
2. Initial sync a new replica set member running
WT
• Note: you can run replicas with mixed
storage engines
• CANNOT copy raw data files!
– WT will fail to start if wrong data format in
dbpath
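Because WT refuses to start on a dbpath in the wrong format, an upgrade script may want to check the directory first. The following is a heuristic sketch only: it assumes MMAPv1 directories contain `<db>.ns` namespace files and WiredTiger directories contain `*.wt` files, and the function name is hypothetical. mongod performs its own authoritative check at startup.

```python
import os
import tempfile


def guess_storage_engine(dbpath):
    """Guess which engine wrote the files in dbpath, or None if unclear."""
    names = os.listdir(dbpath)
    if any(name.endswith(".wt") for name in names):
        return "wiredTiger"
    if any(name.endswith(".ns") for name in names):
        return "mmapv1"
    return None  # empty or unrecognised directory


# Demonstration on a throwaway directory with a fake MMAPv1 namespace file.
with tempfile.TemporaryDirectory() as demo_path:
    print("empty dbpath:", guess_storage_engine(demo_path))
    open(os.path.join(demo_path, "local.ns"), "w").close()
    print("after adding local.ns:", guess_storage_engine(demo_path))
```

A result of `None` suggests the directory is safe to use for a fresh WT member that will be populated by mongorestore or initial sync.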
22. Native Auditing for Any Operation
• Essential for many compliance standards (e.g., PCI
DSS, HIPAA, NIST 800-53, European Union Data
Protection Directive)
• MongoDB Native Auditing
– Construct and filter audit trails for any operation
against the database, whether DML, DCL or DDL
– Can filter by user or action
– Audit log can be written to multiple destinations
24. Enhanced Query Language and Tools
• All tools rewritten in Go
– Smaller Package Size
– More rapid iteration
– Faster Loading and Export
• Easier Query Optimization
– Explain 2.0
• Improved Logging System
– Faster Debugging
• Aggregation Framework Improvements
• Geospatial Index Improvements
25. MMS & Ops Manager 1.6: The Best Way to Manage MongoDB
• Single-click provisioning, scaling &
upgrades, admin tasks
• Monitoring, with charts, dashboards and
alerts on 100+ metrics
• Backup and restore, with point-in-time
recovery, support for sharded clusters
• Up to 95% Reduction in Operational Overhead