How MongoDB works

Uladzimir Mihura
Senior Software Engineer, EPAM Systems




Software Engineering Conference 2013
Data Model



{name: 'mongo', type: 'DB'}
Data Model: Document oriented



{
  _id: 1,
  name: {
    first: 'Michael',
    last: 'Faraday'
  },
  birth: new Date('Sep 22, 1791'),
  death: new Date('Aug 25, 1867'),
  contribs: ['Chemistry', 'Electricity', 'Diamagnetism']
}
Data Model: BSON



{hello: 'SEC 2013'}

\x19\x00\x00\x00 \x02 hello\x00
\x09\x00\x00\x00 SEC 2013\x00 \x00
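
The byte layout above can be reproduced with a few lines of plain JavaScript. This is a minimal sketch that handles only flat documents with string values; the helper names `encodeStringDocument` and `int32le` are made up for illustration and are not part of any MongoDB driver.

```javascript
// Write a 32-bit integer as 4 little-endian bytes.
function int32le(n) {
  return [n & 0xff, (n >> 8) & 0xff, (n >> 16) & 0xff, (n >> 24) & 0xff];
}

// Encode a flat document whose values are all strings into BSON bytes.
function encodeStringDocument(doc) {
  const encoder = new TextEncoder();
  const body = [];
  for (const [key, value] of Object.entries(doc)) {
    const keyBytes = encoder.encode(key);
    const valueBytes = encoder.encode(value);
    body.push(0x02);                               // element type: UTF-8 string
    body.push(...keyBytes, 0x00);                  // field name as a C string
    body.push(...int32le(valueBytes.length + 1));  // string length, including its NUL
    body.push(...valueBytes, 0x00);                // string bytes plus NUL
  }
  const total = 4 + body.length + 1;               // size prefix + elements + terminator
  return Uint8Array.from([...int32le(total), ...body, 0x00]);
}

const bytes = encodeStringDocument({ hello: 'SEC 2013' });
console.log(bytes.length); // 25, i.e. 0x19
```

The first four bytes hold the total document size, which is why the dump starts with \x19.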
Data Model: Structure


MongoDB Instance
  Database
    Collection
      Document
Reading



db.foo.find({name: 'Niels', surname: 'Bohr'})
Reading: Dynamic Queries


•   find, findOne
•   accepts a conditions object
    •   regex, strings, numbers, etc...

• rich operators
    •   $lt, $gt, $or, $and, $in, $ne, ...

• projections
    •   specify fields to return
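
To make the shape of a conditions object and a projection concrete, here is a toy in-memory matcher. It is only an illustration of how such objects can be interpreted, not MongoDB's query engine; the `born` field and the helper names are hypothetical, and unlike the real `find` the toy projection does not add `_id`.

```javascript
// Toy subset of the operators listed above.
const operators = {
  $lt: (value, arg) => value < arg,
  $gt: (value, arg) => value > arg,
  $ne: (value, arg) => value !== arg,
  $in: (value, arg) => arg.includes(value),
};

// Test one field value against a literal, a regex, or an operator object.
function matchesField(value, condition) {
  if (condition instanceof RegExp) {
    return typeof value === 'string' && condition.test(value);
  }
  if (condition !== null && typeof condition === 'object') {
    return Object.entries(condition).every(([op, arg]) => {
      const fn = operators[op];
      if (!fn) throw new Error('unsupported operator: ' + op);
      return fn(value, arg);
    });
  }
  return value === condition;
}

// A document matches when every top-level condition holds.
function matches(doc, query) {
  return Object.entries(query).every(([key, condition]) => {
    if (key === '$or') return condition.some(sub => matches(doc, sub));
    if (key === '$and') return condition.every(sub => matches(doc, sub));
    return matchesField(doc[key], condition);
  });
}

// Filter, then keep only the projected fields (if a projection is given).
function find(docs, query, projection) {
  const hits = docs.filter(doc => matches(doc, query));
  if (!projection) return hits;
  return hits.map(doc => {
    const out = {};
    for (const field of Object.keys(projection)) {
      if (projection[field] && field in doc) out[field] = doc[field];
    }
    return out;
  });
}

const people = [
  { name: 'Niels', surname: 'Bohr', born: 1885 },
  { name: 'Georg', surname: 'Ohm', born: 1789 },
  { name: 'Michael', surname: 'Faraday', born: 1791 },
];
const result = find(people, { born: { $lt: 1800 }, surname: /^F/ }, { surname: 1 });
console.log(result); // [ { surname: 'Faraday' } ]
```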
Reading: Indexes


• B-Tree
• multiple fields
• options
    •   unique
    •   sparse
    •   2d
    •   full text search (≥2.4)
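
The interaction between the unique and sparse options is easy to miss, so here is a small sketch of it. It models an index as a plain Map rather than a B-tree, and `buildIndex` is a hypothetical helper, not a MongoDB call. The assumption being illustrated is that a non-sparse index stores a missing field as null, so a unique index then tolerates at most one document without that field.

```javascript
// Toy single-field index: a Map from key to the documents holding that key.
function buildIndex(docs, field, { unique = false, sparse = false } = {}) {
  const index = new Map();
  for (const doc of docs) {
    const hasField = field in doc;
    if (!hasField && sparse) continue;          // sparse: skip documents lacking the field
    const key = hasField ? doc[field] : null;   // non-sparse: a missing field is indexed as null
    if (unique && index.has(key)) {
      throw new Error('duplicate key: ' + key);
    }
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }
  return index;
}

const users = [
  { name: 'a', email: 'a@example.com' },
  { name: 'b' },   // no email
  { name: 'c' },   // no email
];

// unique + sparse: the two documents without an email are simply not indexed.
const sparseUnique = buildIndex(users, 'email', { unique: true, sparse: true });
console.log(sparseUnique.size); // 1
```

With `{ unique: true }` alone, the same call would throw, because both documents without an email collide on the null key.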
Reading: Query Optimizer


db.foo.find({ x: 10, y: 'bar' })

full scan      ✗
index on x     ✗
index on y     → remember
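
The slide suggests that candidate plans compete and the winner is remembered for later queries of the same shape. The following is a rough sketch of that idea only: the plan names come from the slide, but the costs, `choosePlan`, and `queryShape` are invented for illustration, and picking the lowest cost stands in for actually running the plans side by side.

```javascript
// Cache of winning plans, keyed by query shape (field names only, not values).
const planCache = new Map();

function queryShape(query) {
  return Object.keys(query).sort().join(',');
}

function choosePlan(query, candidatePlans) {
  const shape = queryShape(query);
  if (planCache.has(shape)) return planCache.get(shape);   // remembered winner
  // Stand-in for racing the plans: the cheapest plan "finishes first".
  const winner = candidatePlans.reduce((best, plan) =>
    plan.cost(query) < best.cost(query) ? plan : best);
  planCache.set(shape, winner);
  return winner;
}

// Made-up costs, e.g. the number of documents each plan would have to examine.
const plans = [
  { name: 'full scan', cost: () => 1000 },
  { name: 'index on x', cost: () => 120 },
  { name: 'index on y', cost: () => 15 },
];

const first = choosePlan({ x: 10, y: 'bar' }, plans).name;
const second = choosePlan({ y: 'baz', x: 7 }, plans).name;  // same shape, served from the cache
console.log(first, '|', second);
```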
Writing



db.foo.insert({name: 'Georg', surname: 'Ohm'})
Writing: Write Concerns


• Errors ignored (even network errors)
• Unacknowledged (fire & forget)
• Acknowledged (write accepted)
• Journaled
• Propagated to the replica set members
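
As a rough guide, the levels above correspond to the `w` and `j` options documented for getLastError in MongoDB 2.4. The key names in this map are labels chosen here for illustration, and `waitsForServer` is a hypothetical helper.

```javascript
// Write concern levels expressed as getLastError-style options (2.4-era values).
const writeConcerns = {
  errorsIgnored:       { w: -1 },            // not even network errors are reported
  unacknowledged:      { w: 0 },             // fire and forget
  acknowledged:        { w: 1 },             // the server confirms it accepted the write
  journaled:           { w: 1, j: true },    // confirmed after the journal write
  replicaAcknowledged: { w: 2 },             // confirmed by 2 members; 'majority' also works
};

// True when the client blocks until the server responds.
function waitsForServer(concern) {
  return concern.w === 'majority' || concern.w >= 1;
}

for (const [level, concern] of Object.entries(writeConcerns)) {
  console.log(level, JSON.stringify(concern), waitsForServer(concern));
}
```

Each step down the list trades write latency for a stronger durability guarantee.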
Writing: Isolation



db.foo.update(
  { field1: 1, $isolated: 1 },
  { $inc: { field2: 1 } },
  { multi: true }
)
Storage Management



{name: 'mongo', type: 'DB'}
Storage: Padding factor


          ...

        Padding
        Header

       BSON Data

        Padding
        Header

          ...
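
A small worked example may help show why the padding is there. This sketch assumes the allocated space is simply the document size multiplied by the padding factor and ignores the record header; the function names are hypothetical.

```javascript
// Space reserved for a record, assuming size * paddingFactor (header ignored).
function allocatedSize(documentBytes, paddingFactor) {
  if (paddingFactor < 1) throw new RangeError('padding factor must be at least 1');
  return Math.ceil(documentBytes * paddingFactor);
}

// A grown document can be rewritten in place only if it still fits in its slot.
function fitsInPlace(grownBytes, allocatedBytes) {
  return grownBytes <= allocatedBytes;
}

const slot = allocatedSize(1000, 1.5);
console.log(slot);                     // 1500
console.log(fitsInPlace(1400, slot));  // true: updated in place
console.log(fitsInPlace(1600, slot));  // false: the record has to move
```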
Storage: Directory Layout

-rw-------  1 trnl  trnl   64M Jan 28 14:38 foo.0
-rw-------  1 trnl  trnl  128M Jan 28 14:26 foo.1
-rw-------  1 trnl  trnl  256M Jan 28 14:31 foo.2
-rw-------  1 trnl  trnl  512M Jan 28 14:38 foo.3
-rw-------  1 trnl  trnl  1.0G Jan 28 14:38 foo.4
-rw-------  1 trnl  trnl  2.0G Jan 28 14:38 foo.5
-rw-------  1 trnl  trnl   16M Jan 28 14:38 foo.ns

• Separate files per database
• Aggressive preallocation
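
The listing shows each data file doubling in size, from 64M up to 2.0G. A short sketch of that sequence follows; the assumption that later files stay at 2 GB is not visible in the listing, and `dataFileSizesMB` is a name made up for this example.

```javascript
// Sizes in MB of the first `fileCount` data files (foo.0, foo.1, ...).
function dataFileSizesMB(fileCount, firstMB = 64, capMB = 2048) {
  const sizes = [];
  let size = firstMB;
  for (let i = 0; i < fileCount; i++) {
    sizes.push(size);
    size = Math.min(size * 2, capMB);   // double each time, up to the assumed cap
  }
  return sizes;
}

const sizes = dataFileSizesMB(7);
console.log(sizes.join(', ')); // 64, 128, 256, 512, 1024, 2048, 2048
```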
Storage: Memory mapped files



          Virtual Memory




 foo.1   foo.2   foo.3     ...   foo.n
Scaling



{name: 'mongo', type: 'DB'}
Scaling: Replication

Node 1: Secondary (replicates from Node 3)
Node 2: Secondary (replicates from Node 3)
Node 3: Primary

client: writes to Node 3; reads from Node 1, Node 2 and Node 3
Scaling: Replication

Node 1: Primary (replicates to Node 2)
Node 2: Secondary
Node 3: Primary (shown with no connections)

client: reads and writes on Node 1; reads from Node 2
Scaling: Replication

Node 1: Primary (replicates to Node 2 and Node 3)
Node 2: Secondary
Node 3: Secondary

client: reads and writes on Node 1; reads from Node 2 and Node 3
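
The three replication slides can be read as a failover sequence: the primary drops out, another node takes over, and the old primary returns as a secondary. Below is a toy model of that sequence. It is not MongoDB's election protocol, which also weighs member priority and data freshness; the "first reachable node wins" rule and the function name are assumptions made for the example.

```javascript
// Decide which node is primary, given which nodes are currently reachable.
function currentPrimary(nodes, previousPrimary) {
  const reachable = nodes.filter(n => n.up);
  if (reachable.length * 2 <= nodes.length) return null;   // no majority, no primary
  const stillUp = reachable.find(n => n.name === previousPrimary);
  if (stillUp) return stillUp.name;                         // no election needed
  return reachable[0].name;                                 // assumption: first reachable wins
}

const nodes = [
  { name: 'Node 1', up: true },
  { name: 'Node 2', up: true },
  { name: 'Node 3', up: true },
];

let primary = currentPrimary(nodes, 'Node 3');   // slide 1: Node 3 is primary
const history = [primary];

nodes[2].up = false;                             // slide 2: Node 3 drops out
primary = currentPrimary(nodes, primary);
history.push(primary);

nodes[2].up = true;                              // slide 3: Node 3 rejoins as a secondary
primary = currentPrimary(nodes, primary);
history.push(primary);

console.log(history.join(' -> ')); // Node 3 -> Node 1 -> Node 1
```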
Scaling: Sharding
Client
  → mongos, mongos, ...

Config Servers: mongod, mongod, mongod

Shards: Replica Set, Replica Set, Replica Set
        (three mongod processes in each replica set)
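
To illustrate what a router such as mongos has to do, here is a minimal sketch of range-based routing on a numeric shard key. The chunk boundaries, the shard names `rs0` to `rs2`, and `routeToShard` are all hypothetical; in a real cluster the chunk table lives on the config servers.

```javascript
// Hypothetical chunk table: each chunk covers [min, max) of the shard key.
const chunks = [
  { min: -Infinity, max: 100, shard: 'rs0' },
  { min: 100, max: 200, shard: 'rs1' },
  { min: 200, max: Infinity, shard: 'rs2' },
];

// Find the shard whose chunk contains the given shard key value.
function routeToShard(key) {
  const chunk = chunks.find(c => key >= c.min && key < c.max);
  if (!chunk) throw new Error('no chunk covers key ' + key);
  return chunk.shard;
}

console.log(routeToShard(42));   // rs0
console.log(routeToShard(150));  // rs1
console.log(routeToShard(200));  // rs2
```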
MongoDB Drivers Team @ 10gen
Hosted Solutions


• MongoLab.com
• MongoHQ.com
• HostedMongo.com
• MongoMachine.com
• ObjectRocket
• etc
http://try.mongodb.org/
http://on.fb.me/bymongo/




Thank you. Let’s ask.
