Understanding and tuning WiredTiger, the new high performance database engine in MongoDB / Henrik Ingo (MongoDB)

MongoDB 3.0 introduced the concept of different storage engines. The new engine, WiredTiger, brings document-level concurrency control with MVCC, compression, and a choice between Btree and LSM indexes. In this talk you will learn about the storage engine architecture, and WiredTiger specifically, and how to tune and monitor it for best performance.

Published in: Engineering
  • Hi! Thanks for the great slides. In Slide #18 you point out an option I can't find in the official docs, "wiredTigerIndexConfigString", which can instruct indexes to use LSM trees instead of Btrees. Is it currently available in MongoDB? Can we use WiredTiger with LSM as the storage engine? Thanks!

  1. Understanding and tuning WiredTiger, the new high performance database engine in MongoDB. Henrik Ingo, Solutions Architect, MongoDB
  2. Agenda:
     - MongoDB and NoSQL
     - Storage Engine API
     - WiredTiger configuration + performance
  3. Most popular NoSQL database
  4. 5 NoSQL categories: Key Value (Redis, Riak), Wide Column (Cassandra), Document, Graph (Neo4j), Map Reduce (Hadoop)
  5. MongoDB is a Document Database
     Rich Queries
     • Find Paul's cars
     • Find everybody in London with a car built between 1970 and 1980
     Geospatial
     • Find all of the car owners within 5km of Trafalgar Sq.
     Text Search
     • Find all the cars described as having leather seats
     Aggregation
     • Calculate the average value of Paul's car collection
     Map Reduce
     • What is the ownership pattern of colors by geography over time? (is purple trending up in China?)
     Example document:
     { first_name: 'Paul', surname: 'Miller', city: 'London', location: [45.123, 47.232], cars: [ { model: 'Bentley', year: 1973, value: 100000, … }, { model: 'Rolls Royce', year: 1965, value: 330000, … } ] }
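The first two "rich queries" and the aggregation on slide 5 can be sketched in plain JavaScript. This is only an illustration: the `people` array, the second person in it, and the helper names are hypothetical, and the two filter objects show how such queries could be phrased for `find()` using the standard `$elemMatch`, `$gte` and `$lte` operators.

```javascript
// Hypothetical sample data in the shape of the slide's example document.
const people = [
  { first_name: "Paul", surname: "Miller", city: "London",
    cars: [ { model: "Bentley", year: 1973, value: 100000 },
            { model: "Rolls Royce", year: 1965, value: 330000 } ] },
  { first_name: "Anna", surname: "Smith", city: "London",
    cars: [ { model: "Mini", year: 1999, value: 5000 } ] },
];

// Filter documents as they could be passed to find() on a (hypothetical)
// "people" collection.
const paulsCarsFilter = { first_name: "Paul" };
const londonSeventiesFilter = {
  city: "London",
  cars: { $elemMatch: { year: { $gte: 1970, $lte: 1980 } } },
};

// Plain-JavaScript equivalents, so the semantics can be checked without a server.
function paulsCars(docs) {
  return docs.filter((d) => d.first_name === "Paul").flatMap((d) => d.cars);
}

function londonersWithSeventiesCar(docs) {
  return docs.filter(
    (d) => d.city === "London" &&
           d.cars.some((c) => c.year >= 1970 && c.year <= 1980)
  );
}

// "Calculate the average value of Paul's car collection."
function averageCarValue(cars) {
  return cars.reduce((sum, c) => sum + c.value, 0) / cars.length;
}

console.log(paulsCars(people).map((c) => c.model));
console.log(londonersWithSeventiesCar(people).map((d) => d.surname));
console.log(averageCarValue(paulsCars(people)));
```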
  6. Operational Database Landscape
  7. MongoDB 3.0 & storage engines
  8. MongoDB until 3.0
     Read-heavy apps
     • Great performance
     • B-tree
     • Low overhead
     • Good scale-out perf
     • Secondary reads
     • Sharding
     Write-heavy apps
     • Good scale-out perf
     • Sharding
     • Per-node efficiency wish-list:
     • Doc level locking
     • Write-optimized data structures (LSM)
     • Compression
     Other
     • Multi statement transactions
     • In-memory engine
     • SSD optimized engine
     • etc...
  9. Current state in MongoDB 2.6
     Read-heavy apps
     • Great performance
     • B-tree
     • Low overhead
     • Good scale-out perf
     • Secondary reads
     • Sharding
     Write-heavy apps
     • Good scale-out perf
     • Sharding
     • Per-node efficiency wish-list:
     • Doc level locking
     • Write-optimized data structures (LSM)
     • Compression
     Other
     • Complex transactions
     • In-memory engine
     • SSD optimized engine
     • etc...
     How to get all of the above?
  10. MongoDB 3.0 Storage Engine API
      • MMAP: Read-heavy app
      • WiredTiger: Write-heavy app
      • 3rd party: Special app
  11. MongoDB 3.0 Storage Engine API (MMAP: Read-heavy app; WiredTiger: Write-heavy app; 3rd party: Special app)
      • One at a time:
        – Many engines built into mongod
        – Choose 1 at startup
        – All data stored by the same engine
        – Incompatible on-disk data formats (obviously)
        – Compatible client API
      • Compatible Oplog & Replication
        – Same replica set can mix different engines
        – No-downtime migration possible
  12. Some existing engines
      • MMAPv1
        – Improved MMAP (collection-level locking)
      • WiredTiger
        – Discussed next
      • RocksDB
        – LSM style engine developed by Facebook
        – Based on LevelDB
      • TokuMXse
        – Fractal Tree indexing engine from Percona
  13. Some rumored engines
      • Heap
        – In-memory engine
      • Devnull
        – Write all data to /dev/null
        – Based on idea from famous flash animation...
      • SSD optimized engine (e.g. Fusion-IO)
      • KV simple key-value engine
      https://github.com/mongodb/mongo/tree/master/src/mongo/db/storage
  14. WiredTiger
  15. What is WiredTiger
      • Modern NoSQL database engine
        – flexible schema
      • Advanced database engine
        – Secondary indexes, MVCC, non-locking algorithms
        – Multi-statement transactions (not in MongoDB)
      • Very modular, tunable
        – Btree, LSM and columnar indexes
        – Snappy, Zlib, 3rd-party compression
        – Index prefix compression, etc...
        – Encryption at rest
      • Built by creators of BerkeleyDB
      • Acquired by MongoDB in 2014
      • source.wiredtiger.com, @WiredTigerInc
  16. Choosing WiredTiger at server startup
      mongod --storageEngine wiredTiger
      http://docs.mongodb.org/master/reference/program/mongod/#cmdoption--storageEngine
      Default engine: MongoDB 3.0 = MMAP, MongoDB 3.2 = WiredTiger
  17. Main tunables exposed as MongoDB options
      mongod --storageEngine wiredTiger \
        --wiredTigerCacheSizeGB 8 \
        --wiredTigerDirectoryForIndexes \
        --wiredTigerCollectionBlockCompressor zlib \
        --dbpath /data/datafiles
      (--wiredTigerDirectoryForIndexes is a switch and takes no path: it stores indexes and collections in separate subdirectories under --dbpath.)
      http://docs.mongodb.org/master/reference/program/mongod/#cmdoption--storageEngine
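As a rough sketch of how the slide 17 tunables fit together, the helper below assembles a mongod argument list from an options object. The function name and option keys are hypothetical; the flags are the ones named on the slide. `--wiredTigerDirectoryForIndexes` is treated as a boolean switch here, which is how the mongod reference documents it, and the compressor whitelist (none, snappy, zlib) is the set shown in the talk.

```javascript
// Compressors named in the talk; treat this list as an assumption for your version.
const KNOWN_COMPRESSORS = ["none", "snappy", "zlib"];

// Build the argument vector for a WiredTiger-backed mongod.
// `opts` keys are hypothetical names chosen for this sketch.
function buildMongodArgs(opts) {
  const args = ["--storageEngine", "wiredTiger"];
  if (opts.cacheSizeGB !== undefined) {
    args.push("--wiredTigerCacheSizeGB", String(opts.cacheSizeGB));
  }
  if (opts.directoryForIndexes) {
    // Boolean switch: no path argument follows it.
    args.push("--wiredTigerDirectoryForIndexes");
  }
  if (opts.collectionBlockCompressor !== undefined) {
    if (!KNOWN_COMPRESSORS.includes(opts.collectionBlockCompressor)) {
      throw new RangeError(`unknown compressor: ${opts.collectionBlockCompressor}`);
    }
    args.push("--wiredTigerCollectionBlockCompressor", opts.collectionBlockCompressor);
  }
  if (opts.dbpath !== undefined) {
    args.push("--dbpath", opts.dbpath);
  }
  return args;
}

const args = buildMongodArgs({
  cacheSizeGB: 8,
  directoryForIndexes: true,
  collectionBlockCompressor: "zlib",
  dbpath: "/data/datafiles",
});
console.log(["mongod", ...args].join(" "));
```

The resulting array could be handed to a process launcher such as Node's `child_process.spawn("mongod", args)`; that step is left out so the sketch runs without a MongoDB installation.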
  18. All WiredTiger options via configString (hidden)
      mongod --storageEngine wiredTiger \
        --wiredTigerEngineConfigString "cache_size=8GB,eviction=(threads_min=4,threads_max=8),checkpoint=(wait=30)" \
        --wiredTigerCollectionConfigString "block_compressor=zlib" \
        --wiredTigerIndexConfigString "type=lsm,block_compressor=zlib" \
        --wiredTigerDirectoryForIndexes
      See docs for wiredtiger_open() & WT_SESSION::create()
      http://source.wiredtiger.com/2.5.0/group__wt.html#ga9e6adae3fc6964ef837a62795c7840ed
      http://source.wiredtiger.com/2.5.0/struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb
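WiredTiger configuration strings like the ones on slide 18 are comma-separated `key=value` pairs, with nested groups written as `key=(...)`. A minimal sketch of a serializer for that shape follows; `toConfigString` is a hypothetical helper, not part of any MongoDB or WiredTiger API, and it does no validation of key names or values.

```javascript
// Serialize a plain object into WiredTiger's "key=value,key=(nested)" syntax.
// Nested objects become parenthesized groups; everything else is printed as-is.
function toConfigString(config) {
  return Object.entries(config)
    .map(([key, value]) =>
      value !== null && typeof value === "object"
        ? `${key}=(${toConfigString(value)})`
        : `${key}=${value}`
    )
    .join(",");
}

// The engine settings from slide 18, expressed as an object.
const engineConfig = {
  cache_size: "8GB",
  eviction: { threads_min: 4, threads_max: 8 },
  checkpoint: { wait: 30 },
};

console.log(toConfigString(engineConfig));
console.log(toConfigString({ type: "lsm", block_compressor: "zlib" }));
```

Building the string from an object makes it harder to drop an `=` or a parenthesis by hand, which is an easy mistake in the nested groups.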
  19. Also via createCollection(), createIndex()
      db.createCollection( "users", { storageEngine: { wiredTiger: { configString: "block_compressor=none" } } } )
      http://docs.mongodb.org/master/reference/method/db.createCollection/#db.createCollection
      http://docs.mongodb.org/master/reference/method/db.collection.createIndex/#db.collection.createIndex
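The per-collection and per-index form on slide 19 wraps the same kind of config string in a `storageEngine.wiredTiger.configString` document. The sketch below only builds that options document; `wiredTigerOptions` is a hypothetical helper, the `surname` index key is borrowed from the slide 5 example document, and whether a particular server version accepts a given config string is not something this code checks.

```javascript
// Wrap a WiredTiger config string in the options shape used by
// createCollection() and createIndex().
function wiredTigerOptions(configString) {
  return { storageEngine: { wiredTiger: { configString } } };
}

const collectionOptions = wiredTigerOptions("block_compressor=none");
const indexOptions = wiredTigerOptions("block_compressor=zlib");

// In the mongo shell these could then be used as, for example:
//   db.createCollection("users", collectionOptions)
//   db.users.createIndex({ surname: 1 }, indexOptions)
console.log(JSON.stringify(collectionOptions));
console.log(JSON.stringify(indexOptions));
```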
  20. More...
      • db.serverStatus()
      • db.collection.stats()
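One way to use `db.serverStatus()` for monitoring is to read the `wiredTiger.cache` section and compute how full the cache is. The sketch below works on a hand-written sample document rather than a live server. The three statistic names are quoted as I understand them to appear in serverStatus output, so treat them as assumptions and check them against your own server; the byte values are made up.

```javascript
// Compute cache fill and dirty ratios from a serverStatus()-shaped document.
function cacheUsage(serverStatus) {
  const cache = serverStatus.wiredTiger.cache;
  const used = cache["bytes currently in the cache"];
  const max = cache["maximum bytes configured"];
  const dirty = cache["tracked dirty bytes in the cache"];
  return { usedFraction: used / max, dirtyFraction: dirty / max };
}

// Hypothetical sample: 6 GiB used and 0.5 GiB dirty out of an 8 GiB cache.
const GiB = 2 ** 30;
const sampleStatus = {
  wiredTiger: {
    cache: {
      "bytes currently in the cache": 6 * GiB,
      "maximum bytes configured": 8 * GiB,
      "tracked dirty bytes in the cache": 0.5 * GiB,
    },
  },
};

console.log(cacheUsage(sampleStatus));
```

In the mongo shell the same function could be called as `cacheUsage(db.serverStatus())`.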
  21. Understanding and Optimizing WiredTiger
  22. Understanding WiredTiger architecture
      [Diagram layers: WiredTiger SE (Btree, LSM, Columnar) → Cache (default: 50%) → compression (None, Snappy, Zlib) → OS Disk Cache (default: 50%) → Physical disk]
  23. Covering 90% of your optimization needs
      [Layer diagram as in slide 22, annotated: Decompression time, Disk seek time]
  24. Strategy 1: fit working set in Cache
      [Layer diagram as in slide 22, annotated: cache_size = 80%]
  25. Strategy 2: fit working set in OS Disk Cache
      [Layer diagram as in slide 22, annotated: cache_size = 10%, OS Disk Cache (Remaining: 90%)]
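Strategies 1 and 2 differ only in how RAM is split between the WiredTiger cache and everything else. The arithmetic can be sketched as below; `planMemory` is a hypothetical helper, the 64 GB machine is an example figure rather than one from the talk, and the cache size is rounded down to whole gigabytes purely for simplicity. The "remaining" figure is what is left outside the WiredTiger cache, not all of which ends up as OS disk cache.

```javascript
// Split total RAM into a WiredTiger cache size (whole GB) and the remainder.
function planMemory(totalRamGB, cachePercent) {
  if (!(cachePercent > 0 && cachePercent < 100)) {
    throw new RangeError("cachePercent must be between 0 and 100 (exclusive)");
  }
  const wiredTigerCacheGB = Math.floor((totalRamGB * cachePercent) / 100);
  return {
    wiredTigerCacheGB,                                  // value for --wiredTigerCacheSizeGB
    remainingGB: totalRamGB - wiredTigerCacheGB,        // left for OS disk cache and other use
  };
}

// Strategy 1: working set in the WiredTiger cache (cache_size = 80%).
console.log(planMemory(64, 80));
// Strategy 2: working set in the OS disk cache (cache_size = 10%).
console.log(planMemory(64, 10));
```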
  26. Strategy 3: SSD disk + compression to save €
      [Layer diagram as in slide 22, with the physical disk labeled SSD]
  27. Strategy 4: SSD disk (no compression)
      [Layer diagram as in slide 22, with the physical disk labeled SSD]
  28. Compression benchmarks
  29. What problem is solved by LSM indexes?
      Performance:
      • Fast reads: Easy. Add indexes
      • Fast writes: Easy. No indexes
      • Both: Hard. Smart schema design (hire a consultant), LSM index structures (or columnar)
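To make the idea behind slide 29 concrete, here is a toy log-structured merge (LSM) store. It is a teaching sketch only and is not how WiredTiger implements LSM; the class name, the memtable limit and the string keys are all hypothetical. Writes go into an in-memory table and are flushed as sorted, immutable runs, so a write never updates an on-disk structure in place. Reads check the memtable and then each run from newest to oldest, which is the read cost that compaction later reduces.

```javascript
// Compare two string keys for sorting.
function compareKeys([a], [b]) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Binary search a sorted run of [key, value] pairs.
function searchRun(run, key) {
  let lo = 0;
  let hi = run.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const [k, v] = run[mid];
    if (k === key) return v;
    if (k < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return undefined;
}

class TinyLsm {
  constructor(memtableLimit = 4) {
    this.memtableLimit = memtableLimit;
    this.memtable = new Map(); // newest writes, cheap to update
    this.runs = [];            // flushed sorted runs, newest first
  }

  // Writes only touch the memtable; a full memtable is flushed as one run.
  put(key, value) {
    this.memtable.set(key, value);
    if (this.memtable.size >= this.memtableLimit) this.flush();
  }

  flush() {
    if (this.memtable.size === 0) return;
    this.runs.unshift([...this.memtable.entries()].sort(compareKeys));
    this.memtable = new Map();
  }

  // Reads check the memtable, then runs from newest to oldest.
  get(key) {
    if (this.memtable.has(key)) return this.memtable.get(key);
    for (const run of this.runs) {
      const hit = searchRun(run, key);
      if (hit !== undefined) return hit;
    }
    return undefined;
  }

  // Merge all runs into one; newer values overwrite older ones.
  compact() {
    const merged = new Map();
    for (let i = this.runs.length - 1; i >= 0; i--) {
      for (const [k, v] of this.runs[i]) merged.set(k, v);
    }
    this.runs = merged.size > 0 ? [[...merged.entries()].sort(compareKeys)] : [];
  }
}

const store = new TinyLsm(2);
store.put("a", 1);
store.put("b", 2);  // memtable full: flushed as the first run
store.put("a", 10); // newer value for "a"
store.put("c", 3);  // flushed as the second run
store.put("d", 4);  // stays in the memtable
console.log(store.get("a"), store.get("b"), store.get("d"), store.runs.length);
```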
  30. 2B inserts (with 3 secondary indexes)
      http://smalldatum.blogspot.fi/2014/12/read-modify-write-optimized.html
