1. MongoDB – Roma
12 July 2012
Replication and Sharding:
Hands on
Guglielmo Incisa
2. Replication
• What is it
– Data is replicated (cloned) into at least two nodes
– Updates are sent to one node (Primary) and automatically propagated
to the others (Secondary)
– Connections can go through a router or directly to the Primary (Secondaries
are read-only)
• If we connect our app server to the Primary we must deal with its failure and
reconnect to the new Primary
[Diagram: App server connects through the Router to the Primary DB]
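Once a replica set is up (the setup steps follow on slides 9 and 10), the mongo shell can show which node is Primary. A minimal check; the port matches the setup used later in this deck:
./mongodb-linux-x86_64-2.0.4/bin/mongo --port 27018
db.isMaster()
(the "ismaster" field is true on the Primary, and the "primary" field names the current Primary; rs.status() shows the state of every member)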
3. Replication
• Why we need it
– If one node fails the application server can still work without any
impact
– The router automatically redirects connections to the remaining nodes
(though the router itself can also fail)
[Diagram: App server, Router, and replicated DB nodes; the Router routes around the failed Primary]
4. Replication
• Why we need it
– More and more IT departments are moving from
• Big, proprietary, reliable and expensive servers
– To
• Commodity Hardware (smaller, less reliable, inexpensive servers: PC)
– Commodity hardware is less reliable but our users demand that our
applications be always available: the replication can help.
– Example: how many servers do I need to reach 99.999% availability?
• If a single PC has 98% availability (roughly 7 days of downtime per year, i.e. a 2%
probability of being down at any moment)
• -> Two replicated PCs give 99.96% availability
• -> Three replicated PCs give more than 99.999% (Telecom Grade / Core Network).
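The arithmetic behind these numbers, as a small JavaScript sketch that runs in the mongo shell (the 2% per-node downtime is the assumption from the slide):
// availability of n independent replicas, each down with probability p
avail = function(p, n) { return 1 - Math.pow(p, n) }
avail(0.02, 1) // 0.98
avail(0.02, 2) // 0.9996 -> 99.96%
avail(0.02, 3) // 0.999992 -> better than 99.999%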
5. Sharding
• What is it
– Data is partitioned and distributed to different nodes
• Some records are in node 1, others in node 2 etc…
– MongoDB sharding: the partition is based on a field, the shard key.
• Database: test2
– Collection: testSchema1
– Fields:
» owner: owner of the file, key and shard key (string)
» date (string)
» tags (list of string)
» keywords: words in the document, created by Java code (list of string)
» fileName (string)
» content: the file (binary)
» ascii: the file (string)
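A document in test2.testSchema1 might look like this (a hypothetical example: the values are invented and the binary content field is omitted for brevity):
db.testSchema1.insert({
owner : "alice", // shard key
date : "2012-07-12",
tags : ["demo", "slides"],
keywords : ["hello", "world"],
fileName : "hello.txt",
ascii : "hello world"
})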
6. Sharding
• Why we need it
– Each node only stores part of the data, so servers with smaller storage suffice
– To increase responsiveness by increasing parallelism
[Diagram: Router distributing documents to three shards by owner: A-H, I-O, P-Z]
7. Replication and Sharding
• Can we have both?
– MongoDB: yes!
• Our example:
– Shard A: 2 nodes + arbiter
– Shard B: 2 nodes + arbiter
– Shard C: 2 nodes + arbiter
– Config process
– Router (mongos)
8. Replication and Sharding
• Replication:
– Two nodes and an arbiter
• The arbiter is needed when an even number of data nodes is used: it votes in the election of the
Primary and enables automatic failover when a node goes down
• Sharding
– Three sets: A, B, C
– Config Process:
• "The config servers store the cluster's metadata, which includes basic information on each shard server
and the chunks contained therein."
– Routing Process:
• "The mongos process can be thought of as a routing and coordination process that makes the various
components of the cluster look like a single system. When receiving client requests, the mongos process
routes the request to the appropriate server(s) and merges any results to be sent back to the client."
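The two processes described above can be started along these lines (a sketch: ports 27019 and 27017 are assumptions, while the --configsvr and --configdb flags are standard):
./mongodb-linux-x86_64-2.0.4/bin/mongod --configsvr --dbpath /data/configdb --port 27019
./mongodb-linux-x86_64-2.0.4/bin/mongos --configdb hostname:27019 --port 27017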
9. Setup 1
• Start Servers and arbiters
– Create /data/db, db2, db3, db4, db5, db6, db7, db8, db9, configdb
– --nojournal speeds up the startup (journaling is on by default in 64-bit builds)
• Replica set A
– ./mongodb-linux-x86_64-2.0.4/bin/mongod --shardsvr --replSet DSSA --nojournal
– ./mongodb-linux-x86_64-2.0.4/bin/mongod --shardsvr --replSet DSSA --dbpath /data/db2 --port 27021 --nojournal
– Arbiter:
– ./mongodb-linux-x86_64-2.0.4/bin/mongod --shardsvr --replSet DSSA --dbpath /data/db7 --port 27031 --nojournal
• Replica set B
– ./mongodb-linux-x86_64-2.0.4/bin/mongod --shardsvr --replSet DSSB --dbpath /data/db3 --port 27023 --nojournal
– ./mongodb-linux-x86_64-2.0.4/bin/mongod --shardsvr --replSet DSSB --dbpath /data/db4 --port 27025 --nojournal
– Arbiter:
– ./mongodb-linux-x86_64-2.0.4/bin/mongod --shardsvr --replSet DSSB --dbpath /data/db8 --port 27035 --nojournal
• Replica set C
– ./mongodb-linux-x86_64-2.0.4/bin/mongod --shardsvr --replSet DSSC --dbpath /data/db5 --port 27027 --nojournal
– ./mongodb-linux-x86_64-2.0.4/bin/mongod --shardsvr --replSet DSSC --dbpath /data/db6 --port 27029 --nojournal
– Arbiter:
– ./mongodb-linux-x86_64-2.0.4/bin/mongod --shardsvr --replSet DSSC --dbpath /data/db9 --port 27039 --nojournal
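A quick sanity check that each mongod answers (the ping command and the --eval flag are standard; repeat for each of the other ports):
./mongodb-linux-x86_64-2.0.4/bin/mongo --port 27018 --eval "db.runCommand({ping:1}).ok"
(prints 1 if the server on 27018 is up)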
10. Setup 2
• Set up the replica sets: connect to the first node of each set and load the configuration
• Set replica A
./mongodb-linux-x86_64-2.0.4/bin/mongo --port 27018
cfg = {
_id : "DSSA",
members : [
{_id : 0, host : "hostname:27018"},
{_id : 1, host : "hostname:27021"},
{_id : 2, host : "hostname:27031", arbiterOnly:true}
]
}
rs.initiate(cfg)
db.getMongo().setSlaveOk()
• Set replica B
./mongodb-linux-x86_64-2.0.4/bin/mongo --port 27023
cfg = {
_id : "DSSB",
members : [
{_id : 0, host : "hostname:27023"},
{_id : 1, host : "hostname:27025"},
{_id : 2, host : "hostname:27035", arbiterOnly:true}
]
}
rs.initiate(cfg)
db.getMongo().setSlaveOk()
• Set replica C
./mongodb-linux-x86_64-2.0.4/bin/mongo --port 27027
cfg = {
_id : "DSSC",
members : [
{_id : 0, host : "hostname:27027"},
{_id : 1, host : "hostname:27029"},
{_id : 2, host : "hostname:27039", arbiterOnly:true}
]
}
rs.initiate(cfg)
db.getMongo().setSlaveOk()
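With the config server and mongos running (see the sketch after slide 8), the shards and the shard key still have to be registered from the mongos shell. These are the standard MongoDB 2.0 commands; hostname is a placeholder:
./mongodb-linux-x86_64-2.0.4/bin/mongo --port 27017 admin
db.runCommand({ addshard : "DSSA/hostname:27018,hostname:27021" })
db.runCommand({ addshard : "DSSB/hostname:27023,hostname:27025" })
db.runCommand({ addshard : "DSSC/hostname:27027,hostname:27029" })
db.runCommand({ enablesharding : "test2" })
db.runCommand({ shardcollection : "test2.testSchema1", key : { owner : 1 } })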
12. MapReduce
• "Map" step: The master node takes the input, divides it into smaller sub-
problems, and distributes them to worker nodes. A worker node may do
this again in turn, leading to a multi-level tree structure. The worker node
processes the smaller problem, and passes the answer back to its master
node.
• "Reduce" step: The master node then collects the answers to all the sub-
problems and combines them in some way to form the output – the
answer to the problem it was originally trying to solve.
• Source: Wikipedia
13. MapReduce
• map = function(){
// skip documents that have no keywords
if(!this.keywords){
return;
}
// emit each keyword with a partial count of 1
for (var index in this.keywords){
emit(this.keywords[index],1);
}
}
• reduce = function(key, values){
// sum the partial counts emitted for this keyword
var count = 0;
for (var index in values) {
count += values[index];
}
return count;
}
• result = db.runCommand({
"mapreduce" : "testSchema1",
"map":map,
"reduce":reduce,
"out":"keywords"})
db.keywords.find()
mongos> db.keywords.find({_id:"hello"})
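The out collection holds one document per keyword; a result might look like this (the count shown is hypothetical):
{ "_id" : "hello", "value" : 3 }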
14. Check Sharding
• Connect to the router and count the records:
./mongodb-linux-x86_64-2.0.4/bin/mongo admin
mongos>use test2
mongos>db.testSchema1.count()
11
• Connect to each primary (and see the number of records in each shard):
./mongodb-linux-x86_64-2.0.4/bin/mongo --port 27018
mongo>use test2
mongo>db.testSchema1.count()
4
./mongodb-linux-x86_64-2.0.4/bin/mongo --port 27023
mongo>use test2
mongo>db.testSchema1.count()
4
./mongodb-linux-x86_64-2.0.4/bin/mongo --port 27027
mongo>use test2
mongo>db.testSchema1.count()
3
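The chunk ranges behind these counts can also be inspected from the router with a standard shell helper:
mongos> db.printShardingStatus()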
15. Check Replication
• Kill Server 1 (= Primary A)
• Connect to the router and count the records:
mongos>use test2
mongos>db.testSchema1.count()
11
• Check that (Server 2) Secondary A is now Primary
• Load a new chunk of data (another 11 records)
• The count will be 22
• Restart the killed server (Server 1), wait
• Kill the other one (Server 2, currently Primary A)
• Check that Server 1 is Primary again
• The count will still be 22
• Restart Server 2
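To watch the failover itself, poll the set status from the surviving node; rs.status() is the standard helper (port 27021 is Server 2 of shard A):
./mongodb-linux-x86_64-2.0.4/bin/mongo --port 27021
rs.status()
(the stateStr field shows PRIMARY, SECONDARY or ARBITER for each member)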