Use Your MySQL Knowledge to
Become a MongoDB Guru
Percona Live London 2013

Robert Hodges
CEO
Continuent

Tim Callaghan
VP/Engineering
Tokutek
®

Tuesday, November 12, 13
Our Companies
Robert Hodges
• CEO at Continuent
• Database nerd since 1982 starting with M204, RDBMS since
1990, NoSQL since 2012; designed Continuent Tungsten

• Continuent offers clustering and replication for MySQL and
other fine DBMS types

Tim Callaghan
• VP/Engineering at Tokutek
• Long time database consumer (Oracle) and producer (VoltDB,
Tokutek)

• Tokutek offers Fractal Tree indexes in MySQL (TokuDB) and
MongoDB (TokuMX)

MongoDB -- The New MySQL

One Bad Thing about
MongoDB
One Good Thing about
MongoDB

One Bad Thing about MongoDB
MySQL
> select * from table1 where column1 > column2;
> ... 5 row(s) returned
MongoDB
> db.collection1.find({$field1: {gt: $field2}});
> ReferenceError: $field2 is not defined
[current] MongoDB query language is
<field> <operator> <literal>
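
One possible workaround, sketched under assumptions: since a plain query document compares a field with a literal, the field-to-field comparison can be done in application code after fetching the documents. The Ruby below uses an in-memory array as a stand-in for collection1, and the sample values are invented for illustration. MongoDB also offers a $where operator that evaluates a JavaScript expression on the server, but it cannot use indexes.

```ruby
# In-memory stand-in for collection1; the values are made up for illustration.
collection1 = [
  { "_id" => 1, "field1" => 10, "field2" => 3 },
  { "_id" => 2, "field1" => 2,  "field2" => 8 },
  { "_id" => 3, "field1" => 7,  "field2" => 7 },
  { "_id" => 4, "field1" => 9,  "field2" => 1 }
]

# Client-side equivalent of:
#   select * from table1 where column1 > column2;
matches = collection1.select { |doc| doc["field1"] > doc["field2"] }

matches.each { |doc| puts doc.inspect }
puts "#{matches.size} row(s) returned"
```

Filtering on the client means every document crosses the network first, so this only suits small collections.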

One Good Thing about MongoDB

Robert’s “ease of use”
demo

Today’s Question

How can you use your
MySQL knowledge to get
up to speed on MongoDB?

Topic:
Schema Design

How Do I Find Things in MongoDB?

mongod server    == mysqld
database         == MySQL schema
collection       == MySQL table
BSON document    ~  Sort of like a MySQL row
key/value pair   != MySQL column

mongod server
  database
    collection
      BSON document
        key/value pair
        key/value pair
        key/value pair
      BSON document...

How Do I Create a Table and Insert Data?
# Ruby Code (mongo gem 1.x driver API)
require 'mongo'
include Mongo

MongoClient.new("localhost").           # Connect
  db("mydb").                           # Use database
  collection("sample").                 # Choose collection
  insert({"data" => "hello world"})     # Insert data to materialize database and collection

Primary key generated automatically

How Do I Change the Schema?

# Ruby Code

MongoClient.new("localhost").
  db("mydb").
  collection("sample").
  insert({"data" => "hello again!",
          "author" => "robert"})        # Just add more data

How Do I Validate Schema?

rs0:PRIMARY> db.samples.find()
{ "_id" : 1, "data" : "hello world" }
{ "_id" : 2, "daata" : "bye world" }
{ "_id" : 3, "data" : 26.44 }

Software bugs?

rs0:PRIMARY> show databases
local   2.0771484375GB
mydb    7.9501953125GB
mydb1   0.203125GB      <-- Typo from an early run
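
Since the server accepts all three documents, any schema checking has to live in the application. A minimal sketch, assuming a hypothetical expected schema of an integer _id and a string data field; the helper name schema_problems is invented for this example.

```ruby
# Returns a list of human-readable problems for one document.
# expected maps field name => required Ruby class (a hypothetical schema).
def schema_problems(doc, expected)
  problems = []
  expected.each do |name, type|
    if !doc.key?(name)
      problems << "missing field #{name}"
    elsif !doc[name].is_a?(type)
      problems << "field #{name} should be #{type}, got #{doc[name].class}"
    end
  end
  (doc.keys - expected.keys).each do |name|
    problems << "unexpected field #{name}"
  end
  problems
end

expected = { "_id" => Integer, "data" => String }

# The three documents from the find() output above.
samples = [
  { "_id" => 1, "data" => "hello world" },
  { "_id" => 2, "daata" => "bye world" },
  { "_id" => 3, "data" => 26.44 }
]

samples.each do |doc|
  problems = schema_problems(doc, expected)
  puts "_id #{doc['_id']}: #{problems.empty? ? 'ok' : problems.join('; ')}"
end
```

Running a check like this before insert would catch both the misspelled key and the numeric value.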

How Do I Remove Data? (Part 1)

Drop a database

rs0:PRIMARY> db.dropDatabase()
{ "dropped" : "mydb", "ok" : 1 }
Drop a collection

rs0:PRIMARY> db.samples.drop()
true
Drop a column?

rs0:PRIMARY> db.foo.update(
{ author: { $exists: true }},
{ $unset: { author: 1 } },
false, true )

How Do I Remove Data? (Part 2)

(Remove documents based on TTL index)

> db.samples.ensureIndex(
{"inserted": 1},
{"expireAfterSeconds": 60})
> db.samples.insert(
{"data": "hello world",
inserted: new Date()})
> db.samples.count()
1
...
> db.samples.count()
0
(Capped collections do same with space)

How Does MongoDB Do Joins?

It Doesn’t!
(It is your job to denormalize or do
application level joins. This includes
thinking about storage.)
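
An application-level join can be sketched as a lookup table built from one collection and applied to the other. The arrays below are in-memory stand-ins for two collections; the collection names, field names and values are all hypothetical.

```ruby
# In-memory stand-ins for two collections; names and values are hypothetical.
customers = [
  { "_id" => 1, "name" => "Ann" },
  { "_id" => 2, "name" => "Bob" }
]
orders = [
  { "_id" => 100, "customer_id" => 1, "total" => 25.0 },
  { "_id" => 101, "customer_id" => 2, "total" => 10.0 },
  { "_id" => 102, "customer_id" => 1, "total" => 5.5 }
]

# Build the lookup table once, then attach the customer name to each order.
customers_by_id = {}
customers.each { |c| customers_by_id[c["_id"]] = c }

joined = orders.map do |order|
  customer = customers_by_id[order["customer_id"]]
  order.merge("customer_name" => customer && customer["name"])
end

joined.each { |row| puts row.inspect }
```

Denormalizing would instead store customer_name inside each order document at write time, trading extra storage and update work for reads that need no second lookup.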

Topic:
Data Storage
and Organization

How is My Data Stored, Logically?
MongoDB storage is very similar to MyISAM
[Diagram: _id index, secondary index(es), etc., alongside the collection data (documents)]

How is My Data Stored, Physically?
But it does look different in the file system.
MyISAM
<db>/<table>.frm
<db>/<table>.myd
<db>/<table>.myi

MongoDB
<db1>.ns
<db1>.1 .. <db1>.n
<db2>.ns
<db2>.1 .. <db2>.n

• start MongoDB with “--directoryperdb” to put
files in database folders
• pro-tip : do this to gain IOPs by database

How Much Memory Does It Use?

All of it!

How does MongoDB Manage Memory?
• MyISAM
– key_cache_size determines index caching
– data is cached in Operating System buffers

• InnoDB
– innodb_buffer_pool_size determines index/data
caching

• MongoDB
– memory mapped files
– mongod grows to consume available RAM
– good : no knob
– bad : operating system is in charge of cache
– bad : available RAM may change over time

How Will It Perform for My Workload?
• It depends...
– Determine your “working set”
o The portion of your data that clients access most often
o db.runCommand( { serverStatus: 1, workingSet: 1 } )

– If working set <= RAM
o Performance generally very good
o Be careful in high-concurrent-write use cases

– If working set > RAM
o Likely IO bound
o Sharding to the rescue!

How Can Schema Affect Working Set?
• Field names are stored with the document
– On disk and in memory

• Plan ahead, especially for large collections

BAD!

{ first_name: "Timothy",
  middle_initial: "M",
  last_name: "Callaghan",
  address_line_1: "555 Main Street",
  address_line_2: "Apt. 9" }

GOOD!

{ fn: "Timothy",
  mi: "M",
  ln: "Callaghan",
  al1: "555 Main Street",
  al2: "Apt. 9" }
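
A rough way to see the cost of field names is to add up the encoded size of each document. The sketch below follows the BSON layout for string-only documents (int32 length, then per element a type byte, a null-terminated name, an int32 string length, the bytes and a null, then a trailing null). It ignores the _id field and any storage-level padding, so treat the numbers as an estimate; the helper name is invented for this example.

```ruby
# Size in bytes of a BSON document whose values are all UTF-8 strings.
def bson_size_of_string_doc(doc)
  elements = doc.inject(0) do |total, (name, value)|
    # type byte + name + NUL + int32 string length + string bytes + NUL
    total + 1 + name.to_s.bytesize + 1 + 4 + value.bytesize + 1
  end
  4 + elements + 1 # int32 document length + elements + trailing NUL
end

long_doc = { first_name: "Timothy", middle_initial: "M", last_name: "Callaghan",
             address_line_1: "555 Main Street", address_line_2: "Apt. 9" }
short_doc = { fn: "Timothy", mi: "M", ln: "Callaghan",
              al1: "555 Main Street", al2: "Apt. 9" }

long_size = bson_size_of_string_doc(long_doc)
short_size = bson_size_of_string_doc(short_doc)
puts "long names:  #{long_size} bytes"
puts "short names: #{short_size} bytes"
puts "saved per document: #{long_size - short_size} bytes"
```

The saving is exactly the difference in field-name lengths, repeated in every document, which is why it matters most for large collections.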

Topic:
Query Optimization

How Does the Query Optimizer Work?
• MySQL
– Optimizer finds usable indexes for the query
– For each index, optimizer asks the storage engine
o What is the cardinality for the given keys?
o What is the estimated cost?

– The “best” plan is chosen and used for the query

• This occurs for every single query

How Does the Query Optimizer Work?
• MongoDB
– All candidate indexes run the query in parallel
o “candidate” meaning it contains useful keys

– As matching results are found they are placed in a
shared buffer
– When one of the parallel runs completes, all
others are stopped
– This “plan” is used for future executions of the
same query
o Until the collection has 1,000 writes, mongod restarts, or
there is an index change to the collection
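
The race-then-cache behavior described above can be modeled with a toy class. This is not MongoDB's implementation: each candidate index is reduced to a hypothetical number of work units needed to finish, the candidates are interleaved one unit at a time, and the winner is cached until 1,000 writes have been recorded. The class and method names are invented for this sketch.

```ruby
# Toy model of plan selection: race the candidates, cache the winner.
class ToyPlanCache
  WRITE_LIMIT = 1000 # writes after which the cached plan is discarded

  attr_reader :cached_plan

  def initialize
    @cached_plan = nil
    @writes = 0
  end

  # candidates: { index_name => work units needed to finish the query }
  def choose(candidates)
    return @cached_plan if @cached_plan
    return nil if candidates.empty?
    remaining = candidates.dup
    loop do
      # Interleave the candidates: each does one unit of work per round.
      remaining.each_key { |name| remaining[name] -= 1 }
      finished = remaining.find { |_name, left| left <= 0 }
      if finished
        @cached_plan = finished.first # the others are stopped
        return @cached_plan
      end
    end
  end

  def record_write
    @writes += 1
    return if @writes < WRITE_LIMIT
    @cached_plan = nil
    @writes = 0
  end
end

cache = ToyPlanCache.new
puts cache.choose("idx_a" => 40, "idx_b" => 5, "idx_c" => 12) # runs the race
puts cache.choose("idx_a" => 1)          # reuses the cached plan, no new race
1000.times { cache.record_write }
puts cache.cached_plan.inspect           # cache was cleared by the writes
```

Note that the cached plan is reused even when a different index would now be cheaper, which is the trade-off this approach makes.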

A Simple Yet Elegant Solution?
• No more wrestling with the optimizer
• Hints are supported ($hint)
– Force a particular index
– http://docs.mongodb.org/manual/reference/operator/meta/hint/

• Easier since MongoDB does not support joins

Topic:
Transactions

MySQL Transactions and Isolation
InnoDB creates MVCC view of data; locks updated rows, commits atomically

mysql> BEGIN;
...
mysql> INSERT INTO sample(data) VALUES ("Hello world!");
mysql> INSERT INTO sample(data) VALUES ("Goodbye world!");
...
mysql> COMMIT;

MyISAM locks table and commits each row immediately

How Does MongoDB Implement Locking?
# Update data ranges of documents to
# show effects of database lock.
@col.update(                            # Locks database
  {key =>
    {"$gte" => first.to_s,
     "$lt" => last.to_s}
  },
  { "$set" =>
    { "data.x" => rand(@rows)}})

Test                                                   Total Requests/Sec
Single thread updating single collection               197
Two threads updating two collections, same DB          80 + 80 = 160
Four threads updating two collections, same DB         29+29+30+30 = 118
Two threads updating two collections, different DBs    190 + 179 = 369

How Does MongoDB Implement Isolation?
• MongoDB does not prevent threads from
seeing partially committed data
• Example: Index changes can result in “double
read” of data if query uses index while index
is changing
• Experiment: Construct a test to:
• Select from numeric index and count rows
• Simultaneously update index to shift lower
values past end of previous high value

How Does MongoDB Implement Isolation?

# Select values.
count = 0
@col.find("k1" => {"$gte" => 120000}).each do |doc|
  count += 1
end
puts "Count=#{count}"

# Run update to increase.
@col.update(
  {"_id" => {"$exists" => true}},
  {"$inc" => {"k1" => increment}},
  {:multi => true})

Count=50000
Count=50000
Count=100000 <--Index shifts over tail
Count=50000
Count=50000
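
The double read can be reproduced without a server using a toy model. This is not how MongoDB's cursors are implemented: the scan below only remembers the last index position it returned, and an update that lands mid-scan moves every key past that position, so documents that were already read are returned again. All names and numbers are invented for the sketch.

```ruby
# Toy index scan over [k1, id] entries that resumes after the last position
# returned. update_after is the read count at which a simulated concurrent
# multi-document $inc is applied; nil means no update happens.
def scan_count(docs, min_key, update_after, increment)
  count = 0
  position = nil # [k1, id] of the last index entry returned
  loop do
    entry = docs
      .select { |d| d[:k1] >= min_key }
      .map { |d| [d[:k1], d[:id]] }
      .select { |key| position.nil? || (key <=> position) > 0 }
      .min
    break if entry.nil?
    count += 1
    position = entry
    docs.each { |d| d[:k1] += increment } if count == update_after
  end
  count
end

# Ten documents with k1 = 1..10, rebuilt for each run because scans mutate them.
fresh = -> { (1..10).map { |i| { id: i, k1: i } } }

puts "Count=#{scan_count(fresh.call, 1, nil, 0)}"  # no concurrent update
puts "Count=#{scan_count(fresh.call, 1, 5, 100)}"  # update lands after 5 reads
```

In the second run the first five documents are counted twice, once at their old keys and once at their new ones, even though the collection never held more than ten documents.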

Topic:
Replication and HA

Review of MySQL Replication

[Diagram: Master and Slave in a master-master configuration for fast failover; each has a Binlog and a Relay Log; one side runs "set global read_only=1;"]

How Does MongoDB Set Up Replication?

[Diagram: one PRIMARY with Replication links to two SECONDARY members, and Heartbeat links between members]

Where Is The Replica Set Defined?
$ mongo localhost
...
rs0:PRIMARY> rs.config()
{
    "_id" : "rs0",
    "version" : 8,
    "members" : [
        {
            "_id" : 0,
            "host" : "mongodb1:27017"
        },
        {
            "_id" : 1,
            "host" : "mongodb2:27017"
        },
        {
            "_id" : 2,
            "host" : "mongodb3:27017"
        }
    ]
}

How Do Applications Connect?

# Connect to MongoDB replica set.
client = MongoReplicaSetClient.new(
['mongodb1', 'mongodb2', 'mongodb3'])
# Access a collection and add data
db = client.db("xacts")
col = db.collection("data")
col.insert({"data" => "hello world"})

How Do You Read From a Slave?

# Connect to MongoDB replica set.
client = MongoReplicaSetClient.new(
['mongodb1', 'mongodb2', 'mongodb3'],
:slave_ok => true)
# Access a collection and select documents.
db = client.db("xacts")
col = db.collection("data")
col.find()

Where’s the Binlog?
Find last document in
the OpLog
rs0:PRIMARY> use local
rs0:PRIMARY> db.oplog.rs.find().
... sort({ts:-1}).limit(1)
{ "ts" : Timestamp(1383980308, 1),
"h" : NumberLong("9112507265624716453"),
"v" : 2, "op" : "i", "ns" : "xacts.data",
"o" : {
"_id" :
ObjectId("527ddd116244f28f4592f6a8"),
"data" : "hello world!"
}
}

How Do You Lock the DB to Back Up?
(= FLUSH TABLES WITH READ LOCK)
rs0:SECONDARY> db.fsyncLock()
{
    "info" : "now locked against writes, use db.fsyncUnlock() to unlock",
    "seeAlso" : "http://dochub.mongodb.org/core/fsynccommand",
    "ok" : 1
}
...
(tar or rsync data)
...
(= UNLOCK TABLES)
rs0:SECONDARY> db.fsyncUnlock()
{ "ok" : 1, "info" : "unlock completed" }

How Do You Fail Over?

• Planned failover: update rs.config and save:
rs0:SECONDARY> cfg = rs.conf()
rs0:SECONDARY> cfg.members[0].priority = 1
rs0:SECONDARY> cfg.members[1].priority = 1
rs0:SECONDARY> cfg.members[2].priority = 2
rs0:SECONDARY> rs.reconfig(cfg)

• Unplanned failover: kill or stop mongod

Topic:
Sharding

How is Partitioning Like Sharding?
• MySQL partitioning breaks a table into <n>
tables
– “PARTITION” is actually a storage engine

• Tables can be partitioned by hash or range
– Hash = random distribution
– Range = user controlled distribution (date range)

• Helpful in “big data” use-cases
• Partitions can usually be dropped efficiently
– Unlike “delete from table1 where timeField < '12/31/2012';”

How Does Partitioning Help Queries?
Partitioned big_table on dateCol by month.

select * from big_table where column1 = 5;

[Diagram: partitions Aug-2013, Sep-2013, Oct-2013, Nov-2013]

select * from big_table where dateCol = '10/12/2013';

Can I Finally Scale My Workload Horizontally?
• MySQL partitioning is helpful, but is still
constrained to a single machine
• MongoDB supports cross-server sharding
– huge plus: it’s “in the box”
– MySQL Fabric is bringing something, we’ll see
– Many other 3rd Party MySQL options exist

• Only shard the collections that require it
• Each MongoDB shard is a replica set (1
primary and 1+ secondaries)

What Does MongoDB Sharding Look Like?

[Diagram: client app1, client app2, ... connect to mongos1 ... mongosn, which route to shard1, shard2, ... shardn; each shard has a Master and a Slave]

How Does Sharding Help Queries?
Sharded big_table on dateCol by month.

select * from big_table where column1 = 5;

[Diagram: shard1 = Aug-2013, shard2 = Sep-2013, shard3 = Oct-2013, shard4 = Nov-2013, with further shards on either side]

select * from big_table where dateCol = '10/12/2013';

How Do I Pick a Shard Key?
• MongoDB shards on one or more fields
• Simple example
– “orders” collection (customerId and productId)
– 1: shard on customerId
o each order writes to a single shard
o reads by customer on single shard
o reads by product on entire cluster

– 2: shard on productId
o each order writes to several shards
o reads by customer on entire cluster
o reads by product on single shard

– 3: store everything twice and shard both ways
o worst case for writes
o best case for reads (either is single shard)
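
The read side of the shard-key trade-off above can be sketched with a toy router. It is a simplification: MongoDB routes on ranges of the shard key (or on hashed keys), while this sketch uses a CRC32 modulo just to make the point that a filter on the shard key reaches one shard and any other filter fans out to all of them. Function names and the shard count are hypothetical.

```ruby
require "zlib"

# Hypothetical routing: which shard owns a given shard-key value.
def shard_for(value, shard_count)
  Zlib.crc32(value.to_s) % shard_count
end

# Shards a query must visit: one if it filters on the shard key, else all.
def shards_to_query(filter, shard_key, shard_count)
  if filter.key?(shard_key)
    [shard_for(filter[shard_key], shard_count)]
  else
    (0...shard_count).to_a
  end
end

shard_count = 4

# "orders" sharded on customerId (option 1 above).
by_customer = shards_to_query({ "customerId" => 42 }, "customerId", shard_count)
by_product  = shards_to_query({ "productId" => 7 }, "customerId", shard_count)

puts "read by customer touches #{by_customer.size} shard(s)"
puts "read by product touches #{by_product.size} shard(s)"
```

Swapping the shard key to productId reverses the two results, which is the choice between options 1 and 2.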

Topic:
Security

How Secure is It?
• basic username/password
• by database
• roles
– read = read any collection
– readWrite = read/write any collection
– dbAdmin = create index, create collection, rename
collection, etc.

What About Advanced Security?
• Kerberos support in MongoDB Enterprise
Edition
• SSL is supported, but
– Note: The default distribution of MongoDB does
not contain support for SSL. To use SSL, you must
either build MongoDB locally passing the “--ssl”
option to scons or use MongoDB Enterprise.

What Else is There to Learn?

• Tools - mongostat, mongo[export/import], mongo[dump/restore]
• Aggregation Framework
  • Think SQL aggregate functionality
• Map/Reduce

What Should You Do?

Summary
We liked...
• Ease of install
• Ability to just “jump in”
Look [out] for...
• Query language (Tim says hang in there!)
• You have to think about storage and queries in advance
Highly Recommended Reading
• Karl Seguin’s “The Little MongoDB Book”
• http://openmymind.net/mongodb.pdf
• MongoDB’s “SQL to MongoDB Mapping Chart”
• http://docs.mongodb.org/manual/reference/sql-comparison/

Questions?

Robert Hodges
CEO, Continuent
robert.hodges@continuent.com
@continuent

Tim Callaghan
VP/Engineering, Tokutek
tim@tokutek.com
@tmcallaghan
