WiredTiger In-Memory
vs WiredTiger B-Tree
October 5, 2016, Mövenpick Hotel, Amsterdam
Sveta Smirnova
∙ What is Percona Memory Engine for MongoDB?
∙ Typical use cases
∙ Advanced Memory Engine
Table of Contents
2
What is Percona Memory Engine for MongoDB?
3
∙ Up to 1000 times faster for OLTP workloads
∙ 10 times faster for read-only workloads
∙ Stable throughput
∙ No checkpointing
∙ No jitter
Extremely fast In-Memory storage
4
∙ Document-level locking
∙ B-Tree
∙ Practically WiredTiger, but without disk
access
Based on WiredTiger
5
∙ Doesn’t store data on disk
∙ Except a small amount of statistics
∙ You can control when to log statistics with
option --inMemoryStatisticsLogDelaySecs
∙ Still must specify --dbpath
sveta@Thinkie:~/mongo_tests$ ls -lh single/
total 40K
drwxrwxr-x 2 sveta sveta 4,0K Eyl 29 15:00 diagnostic.data
-rw-r--r-- 1 sveta sveta 6 Eyl 29 14:58 mongod.lock
-rw-rw-r-- 1 sveta sveta 93 Eyl 29 14:55 storage.bson
∙ Data does not persist between restarts
WiredTiger without storage
6
∙ --storageEngine=inMemory
∙ Can be the only engine on a MongoDB server
∙ MongoDB restriction, applicable to all engines
∙ Heterogeneous replication and sharding
setups supported
How to enable Memory Engine?
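As a minimal sketch, the options named on the slides above can be assembled into a mongod command line. The helper name inMemoryMongodArgs and the sample path are made up for illustration; only the option names come from the slides.

```javascript
// Hypothetical helper: builds the mongod argument list for the Memory engine.
// Only the option names come from the slides; the function is illustrative.
function inMemoryMongodArgs(dbpath, statisticsLogDelaySecs) {
  if (!dbpath) {
    // --dbpath is still required even though user data never reaches disk.
    throw new Error("dbpath is required");
  }
  const args = ["--storageEngine=inMemory", "--dbpath=" + dbpath];
  if (statisticsLogDelaySecs !== undefined) {
    args.push("--inMemoryStatisticsLogDelaySecs=" + statisticsLogDelaySecs);
  }
  return args;
}

// The path and the delay value are assumptions chosen for the example.
const mongodArgs = inMemoryMongodArgs("/tmp/mongo_tests/single", 60);
console.log(["mongod"].concat(mongodArgs).join(" "));
```

The resulting list can be handed to whatever launches the server (a shell script, a service file, or a process launcher in a test harness).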
7
∙ Engine can use up to --inMemorySizeGB
∙ If data exceeds this amount
∙ WT_CACHE_FULL error is returned for all
kinds of operations that cause user data size
to grow
INSERT
CREATE
UPDATE
∙ Reads are not affected
How to control memory usage
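One way an application might react to a full engine is to free space and retry the write once. The sketch below is written in a synchronous style, as in the mongo shell. It assumes the client exposes the engine's error text in err.errmsg or err.message; the helper names isCacheFull and writeWithEviction are hypothetical, and the write is simulated so the example runs without a server.

```javascript
// Assumption: the driver or shell exposes the engine's error text in
// err.errmsg or err.message; check what your client actually returns.
function isCacheFull(err) {
  const text = String((err && (err.errmsg || err.message)) || "");
  return text.includes("WT_CACHE_FULL");
}

// Run a write; if the engine reports it is full, free space once and retry.
// `write` and `freeSpace` are caller-supplied callbacks (hypothetical names).
function writeWithEviction(write, freeSpace) {
  try {
    return write();
  } catch (err) {
    if (!isCacheFull(err)) {
      throw err; // unrelated failure: do not hide it
    }
    freeSpace(); // for example, delete expired sessions
    return write();
  }
}

// Simulated usage: the first attempt fails, the retry succeeds.
let spaceFreed = false;
const outcome = writeWithEviction(
  () => {
    if (!spaceFreed) throw new Error("WT_CACHE_FULL: simulated");
    return "inserted";
  },
  () => { spaceFreed = true; }
);
console.log(outcome);
```

Because reads are not affected, the freeSpace callback can still query the collection to decide what to delete.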
8
∙ 100% Open Source
∙ Code available at GitHub
∙ Free for all Percona users and customers
Open Source
9
Typical use cases for Percona Memory Engine
10
∙ Session management
∙ Store active sessions in memory
∙ Users receive an answer almost immediately
∙ Reduce application response time
dramatically
∙ Various temporary collections
∙ All you used to store in memcached
Application cache
11
∙ Application runtime data which does not
require on-disk storage
∙ Intermediary results of calculations
∙ User-specific options
∙ Your idea
Transient Runtime State
12
∙ Thousand-line aggregations
∙ Temporary collections to store intermediary
data
∙ Complicated queries
Sophisticated data manipulation
13
∙ Large aggregations might be slow
∙ Especially if they use many collections
∙ Often this is unavoidable
∙ To calculate the number of distinct values you
need to read the whole index
∙ A fast dedicated server is a great solution
Real-Time Analytics
14
∙ Data sharing between multi-tier or
multi-language applications
English labels
∙ Articles
∙ Pictures
∙ Contact
information
∙ Other content
Russian labels
Multi-tier object sharing
15
∙ Tired of waiting for application test data
to load?
∙ Does any change in test data cause a delay?
∙ With the Memory engine you can reduce
turnaround time for automated application
tests
∙ And still use the same syntax
Application Testing
16
Advanced Percona Memory Engine
17
∙ Are you amazed by the speed of the Memory
engine?
∙ But still need data to persist between
restarts?
∙ You can combine both Memory and
WiredTiger in a Replica Set or Sharded Cluster
Best of both worlds
18
∙ Set up two or more Memory replicas which
can be Primary
∙ Let WiredTiger persist data on disk
∙ In rare cases, if all Memory replicas crash at
the same time, you will lose a few
transactions
∙ Number of transactions depends on the
latency between In-Memory Primary replica
and WiredTiger replica
Hidden WiredTiger, storing changes in Replica Set
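The dependence on latency can be made concrete with rough arithmetic: writes accepted by the Memory Primary during the replication lag have not yet reached the WiredTiger member. The function below is only a back-of-the-envelope estimate under that assumption; the name estimateWritesAtRisk and the sample numbers are invented for illustration.

```javascript
// Rough estimate, not a guarantee: writes acknowledged by the Memory Primary
// but not yet replicated to the WiredTiger member are the ones at risk.
function estimateWritesAtRisk(writesPerSecond, replicationLagMs) {
  return Math.ceil((writesPerSecond * replicationLagMs) / 1000);
}

// Sample numbers are assumptions: 5000 writes/s and 20 ms of lag.
console.log(estimateWritesAtRisk(5000, 20));
```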
19
[Diagram: replica set members Memory, WiredTiger, Memory]
Hidden WiredTiger, storing changes in Replica Set
20
rs.initiate(
  {
    "_id" : "rs",
    "members" : [
      {"_id" : 0, "host" : "inMemory1", "priority" : 1},
      {"_id" : 1, "host" : "inMemory2", "priority" : 1},
      {"_id" : 2, "host" : "WiredTiger", "priority" : 0, "hidden" : true}
    ]
  }
)
Hidden WiredTiger to store on disk: example setup
21
∙ Make WiredTiger Primary
∙ Move all reads to read-only Memory
replicas
∙ Writes will be slow
WiredTiger as Primary in Replica Set
22
[Diagram: replica set members Memory, WiredTiger, Memory]
WiredTiger as Primary in Replica Set
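A possible configuration for this layout, sketched under assumptions: the host names are placeholders in the style of the hidden-member example, and the helper readUri is hypothetical. Giving the Memory members priority 0 keeps them from being elected, and a read preference of secondary in the connection string moves reads onto them.

```javascript
// Host names are placeholders in the style of the hidden-member example.
const config = {
  _id: "rs",
  members: [
    // Only the WiredTiger member is electable, so it becomes Primary.
    { _id: 0, host: "WiredTiger", priority: 1 },
    // priority 0 keeps the Memory members as read-only secondaries.
    { _id: 1, host: "inMemory1", priority: 0 },
    { _id: 2, host: "inMemory2", priority: 0 },
  ],
};

// Build a connection string that sends reads to secondaries.
function readUri(cfg, dbName) {
  const hosts = cfg.members.map((m) => m.host).join(",");
  return "mongodb://" + hosts + "/" + dbName +
    "?replicaSet=" + cfg._id + "&readPreference=secondary";
}

// In the mongo shell the object would be passed as rs.initiate(config).
console.log(readUri(config, "blog"));
```

Note the trade-off in this sketch: with both Memory members at priority 0, the set has no Primary while the WiredTiger member is down.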
23
∙ Create Sharded Cluster using Memory
nodes only
∙ Split data between nodes
∙ Create copies of data to prevent data loss
Scaling beyond the RAM of a single server
24
Shard 1 Shard 2
Scaling beyond the RAM
25
[Diagram: Shard 1 with replicas R1, R2, R3; Shard 2 with replicas R4, R5, R6]
Scaling beyond the RAM: add redundancy
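When each shard is a replica set, it is added to the cluster with a seed string of the form replicaSetName/host1,host2,... passed to sh.addShard. The sketch below only builds those strings. It assumes R1 to R3 belong to the first shard and R4 to R6 to the second, and it uses the diagram labels as stand-in host names; the replica set names shard1 and shard2 are invented.

```javascript
// Assumed grouping of the diagram's replicas; names are stand-ins for hosts.
const shards = {
  shard1: ["R1", "R2", "R3"],
  shard2: ["R4", "R5", "R6"],
};

// Seed string format accepted by sh.addShard: "<replicaSetName>/<host>,<host>".
function shardSeed(replicaSetName, hosts) {
  return replicaSetName + "/" + hosts.join(",");
}

// Print the commands one would run in a mongos session.
for (const [name, hosts] of Object.entries(shards)) {
  console.log('sh.addShard("' + shardSeed(name, hosts) + '")');
}
```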
26
∙ You can use both engines in the Sharded
Cluster
∙ Split data
∙ Session data on Memory nodes
∙ Persistent data on WiredTiger node(s)
∙ Duplicate Memory shards to avoid losing
data
Memory and WiredTiger in Sharded Cluster
27
∙ You can have sharded nodes which use the
Memory engine
∙ Make them part of a Replica Set
∙ Let a hidden WiredTiger member persist
data on disk
Sharded Cluster Memory and Replica Sets
28
∙ Rarely changing user posts are stored on
disk
∙ Session data is stored using the Memory engine
∙ Active comments (last 24 hours) and
actively accessed posts are cached on a
Memory node
Example: blog application
29
mongos> sh.addShardTag("shard01", "memory") // Memory node
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
mongos> sh.addShardTag("shard02", "memory") // Memory node
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
mongos> sh.addShardTag("shard03", "persist") // WiredTiger node
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
mongos> sh.addShardTag("shard04", "persist") // WiredTiger node
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
Example: tag shards
30
mongos> sh.addTagRange("blog.sessions", { sid: 0 }, { sid: 1000000 }, "memory")
WriteResult({
"nMatched" : 0,
"nUpserted" : 1,
"nModified" : 0,
"_id" : {
"ns" : "blog.sessions",
"min" : {
"sid" : 0
}
}
})
Example: split data
31
mongos> sh.addTagRange("blog.comments", { store: "persist", cid: 0 },
... { store: "persist", cid: 1000000 }, "persist")
WriteResult({
...
"_id" : {
"ns" : "blog.comments",
"min" : {
"store" : "persist",
"cid" : 0
}}})
mongos> sh.addTagRange("blog.comments", { store: "memory", cid: 0 },
... { store: "memory", cid: 1000000 }, "memory")
WriteResult({
...
Example: split data
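The tag ranges on blog.comments rely on a store field in the shard key, so the application has to set that field when it writes a comment. The slides do not show how this is done; the sketch below is one assumption, using the 24-hour rule from the blog example. The helper names storeFor and commentDocument and the extra fields text and createdAt are hypothetical.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Comments younger than 24 hours count as active and go to the Memory shards.
function storeFor(createdAt, now) {
  return now - createdAt < DAY_MS ? "memory" : "persist";
}

// Build the document to insert; `store` and `cid` match the tag range keys.
function commentDocument(cid, text, createdAt, now) {
  return { store: storeFor(createdAt, now), cid: cid, text: text, createdAt: createdAt };
}

// Fixed dates keep the example deterministic.
const now = new Date("2016-10-05T12:00:00Z");
const fresh = commentDocument(1, "Nice talk!", new Date("2016-10-05T09:00:00Z"), now);
const old = commentDocument(2, "First!", new Date("2016-10-01T09:00:00Z"), now);
console.log(fresh.store, old.store);
```

Moving a comment from the memory range to the persist range once it ages is a separate step that this sketch does not cover.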
31
mongos> sh.addTagRange("blog.posts", { pid: 0 }, { pid: 1000000 }, "persist")
WriteResult({
"nMatched" : 0,
"nUpserted" : 1,
"nModified" : 0,
"_id" : {
"ns" : "blog.posts",
"min" : {
"pid" : 0
}
}
})
Example: split data
31
∙ Percona Memory Engine replaces
WiredTiger when you need better speed
and can afford losing data
∙ Can be used in setups which combine both
high performance of the Memory engine
and data persistence of WiredTiger
∙ Open Source
Summary
32
∙ David Bennett
∙ David Murphy
∙ Fernando Ipar
∙ Denis Protyvenskyi
Special thanks
33
∙ Benchmarks for Percona Memory Engine
∙ Introducing Percona Memory Engine for
MongoDB
∙ Percona Server for MongoDB manual
∙ Source code for Percona Memory Engine
More information
34
Rate My Session!
35
???
Place for your questions
36
http://www.slideshare.net/SvetaSmirnova
https://twitter.com/svetsmirnova
Thank you!
37
