Schema Design with MongoDB
  Dwight Merriman
  CEO, 10gen
What is document-oriented?
  JSON objects
  Not relational
  Not OODB
  Database schema != program “schema”
Terms
  Row -> JSON document
  Tables -> collections
  Indexes -> index
  Join -> embedding and linking
Choose a schema that
  makes queries easy
  makes queries fast
  facilitates atomicity
  facilitates sharding
Key question: embed vs. link
  “Contains relationship” : embed
  Embed = “pre-joined”
  Links: client/server turnarounds
  On a close call, embed. Use rich documents.
  Note the 4MB object size limit
  Arbitrary limit but pushes one towards good designs
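To make the embed/link trade-off concrete, here is a minimal in-memory sketch in plain JavaScript (no server involved). The names `posts`, `comments`, `post_id`, `loadEmbedded`, and `loadLinked` are made up for illustration and are not from the talk; arrays stand in for collections, and each array scan stands in for one client/server turnaround.

```javascript
// Embedded ("pre-joined"): the comments live inside the post document,
// so a single fetch returns everything needed to render the page.
const embeddedPost = {
  _id: 1,
  title: "Schema design",
  comments: [
    { by: "mathias", text: "..." },
    { by: "eliot", text: "..." }
  ]
};

// Linked: comments are separate documents that point back at the post.
// Arrays stand in for collections here (hypothetical names).
const posts = [{ _id: 1, title: "Schema design" }];
const comments = [
  { _id: 10, post_id: 1, by: "mathias", text: "..." },
  { _id: 11, post_id: 1, by: "eliot", text: "..." }
];

// Embedded layout: one lookup.
function loadEmbedded() {
  return { post: embeddedPost, lookups: 1 };
}

// Linked layout: one lookup for the post, a second for its comments.
function loadLinked(postId) {
  const post = posts.find(p => p._id === postId);
  const postComments = comments.filter(c => c.post_id === postId);
  return {
    post: Object.assign({}, post, { comments: postComments }),
    lookups: 2
  };
}

console.log(loadEmbedded().lookups);  // 1
console.log(loadLinked(1).lookups);   // 2
```

Both layouts end up with the same data on the client; the linked one simply pays for the join at read time.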
Map-Reduce
{ _id : … }
  Should be:
    Unique
    Invariant
    Ideally, not reused (delete; insert)
  ObjectID type often best for a sharded collection
Trees
  mongodb.org/display/DOCS/Trees+in+MongoDB
Single “Table” Inheritance Works Well
  > t.find()
  { type:'irregular-shape', area:99 }
  { type:'circle', area:3.14, radius:1 }
  { type:'square', area:4, d:2 }
  { type:'rect', area:8, x:2, y:4 }
  > t.find( { radius : { $gt : 2.0 } } )
  > t.ensureIndex( { radius : 1 } ) // fine
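A rough in-memory sketch of why querying on `radius` is fine even though most of the documents lack that field: a document without the field simply does not match. The `matchesGt` helper is hypothetical and only mimics `$gt` on a single numeric field; it is not MongoDB's matcher.

```javascript
// Same four documents as in the t.find() output above.
const shapes = [
  { type: "irregular-shape", area: 99 },
  { type: "circle", area: 3.14, radius: 1 },
  { type: "square", area: 4, d: 2 },
  { type: "rect", area: 8, x: 2, y: 4 }
];

// Hypothetical helper: true only if the field exists, is a number,
// and is greater than the bound.
function matchesGt(doc, field, bound) {
  return typeof doc[field] === "number" && doc[field] > bound;
}

const big = shapes.filter(d => matchesGt(d, "radius", 2.0));
const any = shapes.filter(d => matchesGt(d, "radius", 0.5));

console.log(big.length);            // 0
console.log(any.map(d => d.type));  // [ 'circle' ]
```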
(1) Full tree in one document
  {
    comments: [
      {by: "mathias", text: "...", replies: []},
      {by: "eliot", text: "...", replies: [
        {by: "mike", text: "...", replies: []}
      ]}
    ]
  }
  Pros:
    Single document to fetch per page
    One location on disk for whole tree
    You can see full structure easily
  Cons:
    Hard to search
    Hard to get back partial results
    4MB limit
(2) Parent Links
  > t = db.tree1;
  > t.find()
  { "_id" : 1 }
  { "_id" : 2, "parent" : 1 }
  { "_id" : 3, "parent" : 1 }
  { "_id" : 4, "parent" : 2 }
  { "_id" : 5, "parent" : 4 }
  { "_id" : 6, "parent" : 4 }
  > // find children of node 4
  > t.ensureIndex({parent:1})
  > t.find( {parent : 4 } )
  { "_id" : 5, "parent" : 4 }
  { "_id" : 6, "parent" : 4 }
  // hard to get all descendants
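The closing comment is the catch with parent links: collecting every descendant takes one query per tree level. A minimal in-memory sketch of that walk follows. `findByParents` is a hypothetical stand-in for a query such as `t.find({ parent : { $in : ids } })`, and the `roundTrips` counter is only there to show the cost.

```javascript
// Same documents as the tree1 example.
const tree1 = [
  { _id: 1 },
  { _id: 2, parent: 1 },
  { _id: 3, parent: 1 },
  { _id: 4, parent: 2 },
  { _id: 5, parent: 4 },
  { _id: 6, parent: 4 }
];

// Hypothetical stand-in for a query by parent; each call models one round trip.
let roundTrips = 0;
function findByParents(ids) {
  roundTrips += 1;
  return tree1.filter(d => ids.includes(d.parent));
}

// Walk the tree one level at a time until a level comes back empty.
function descendants(rootId) {
  const found = [];
  let frontier = [rootId];
  while (frontier.length > 0) {
    const children = findByParents(frontier);
    found.push(...children.map(d => d._id));
    frontier = children.map(d => d._id);
  }
  return found;
}

console.log(descendants(2)); // [ 4, 5, 6 ]
console.log(roundTrips);     // 3
```

The number of round trips grows with the depth of the subtree, which is the cost the array-of-ancestors layout avoids.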
(3) Array of Ancestors
  > t = db.mytree;
  > t.find()
  { "_id" : "a" }
  { "_id" : "b", "ancestors" : [ "a" ], "parent" : "a" }
  { "_id" : "c", "ancestors" : [ "a", "b" ], "parent" : "b" }
  { "_id" : "d", "ancestors" : [ "a", "b" ], "parent" : "b" }
  { "_id" : "e", "ancestors" : [ "a" ], "parent" : "a" }
  { "_id" : "f", "ancestors" : [ "a", "e" ], "parent" : "e" }
  { "_id" : "g", "ancestors" : [ "a", "b", "d" ], "parent" : "d" }
  > t.ensureIndex( { ancestors : 1 } )
  > // find all descendants of b:
  > t.find( { ancestors : 'b' })
  { "_id" : "c", "ancestors" : [ "a", "b" ], "parent" : "b" }
  { "_id" : "d", "ancestors" : [ "a", "b" ], "parent" : "b" }
  { "_id" : "g", "ancestors" : [ "a", "b", "d" ], "parent" : "d" }
  > // get all ancestors of f:
  > anc = db.mytree.findOne({_id:'f'}).ancestors
  [ "a", "e" ]
  > db.mytree.find( { _id : { $in : anc } } )
  { "_id" : "a" }
  { "_id" : "e", "ancestors" : [ "a" ], "parent" : "a" }
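The slide shows the queries but not how the `ancestors` array gets filled in. One plausible approach, sketched in memory here, is to copy the parent's `ancestors` and append the parent's `_id` when a node is inserted. `insertNode` is a hypothetical helper and the `Map` stands in for `db.mytree`; neither is a MongoDB call.

```javascript
const mytree = new Map(); // _id -> document, standing in for db.mytree

// Hypothetical helper: derive the new node's ancestors from its parent's.
function insertNode(id, parentId) {
  if (parentId === undefined) {
    mytree.set(id, { _id: id }); // root: no parent, no ancestors
    return;
  }
  const parent = mytree.get(parentId);
  if (!parent) throw new Error("unknown parent: " + parentId);
  const ancestors = (parent.ancestors || []).concat(parentId);
  mytree.set(id, { _id: id, ancestors: ancestors, parent: parentId });
}

insertNode("a");
insertNode("b", "a");
insertNode("d", "b");
insertNode("g", "d");

console.log(mytree.get("g").ancestors); // [ 'a', 'b', 'd' ]

// In-memory equivalent of t.find({ ancestors : 'b' }): all descendants of b.
const underB = [...mytree.values()]
  .filter(doc => (doc.ancestors || []).includes("b"))
  .map(doc => doc._id);
console.log(underB); // [ 'd', 'g' ]
```

The trade-off is that moving a subtree means rewriting the `ancestors` array of every node beneath it.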
Atomicity
  Atomicity at the document level
  $operators
  Compare and swap
Compare and Swap
  > t=db.inventory
  > s = t.findOne( {sku:'abc'} )
  > qty_old = s.qty;
  > --s.qty;
  > t.update({_id:s._id, qty:qty_old}, s);
  > db.getLastError()
  {"err" : null , "updatedExisting" : true , "n" : 1 , "ok" : 1} // it worked
Compare and Swap
  > t=db.inventory
  > s = t.findOne( {sku:'abc'} )
  > qty_old = s.qty;
  > --s.qty;
  > t.update({_id:s._id, qty:qty_old}, s);
  > db.getLastError()
  {"err" : null , "updatedExisting" : true , "n" : 1 , "ok" : 1} // it worked
  Oops?
Compare and Swap - Better
  > t=db.inventory
  > s = t.findOne( {sku:'abc'} )
  > obj_old = Object.extend({}, s);
  > --s.qty;
  > // t.update({_id:s._id, qty:qty_old}, s);
  > t.update( obj_old , s);
  > print( db.getLastError().ok ? "worked" : "try again" );
Compare and Swap – versions
  update( { _id : myid, ver : last_ver },
          { $set : { x : "abc", y : 99 },
            $inc : { ver : 1 }
          } )
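The version-based variant can be sketched in memory to show the retry loop that normally surrounds it. `casUpdate` is a hypothetical stand-in for an update whose query includes `ver`; it returns 1 or 0, loosely mirroring the `n` field of the `getLastError` output. The `interfere` callback only simulates a concurrent writer.

```javascript
const store = new Map([["abc", { _id: "abc", qty: 5, ver: 1 }]]);

// Hypothetical stand-in for:
//   update({ _id: id, ver: lastVer }, { $set: fields, $inc: { ver: 1 } })
// Returns 1 if the document still had the expected version, else 0.
function casUpdate(id, lastVer, fields) {
  const doc = store.get(id);
  if (!doc || doc.ver !== lastVer) return 0;
  Object.assign(doc, fields);
  doc.ver += 1;
  return 1;
}

// Read, compute, write; retry if another writer bumped the version in between.
function decrementQty(id, maxTries, interfere) {
  for (let attempt = 1; attempt <= maxTries; attempt++) {
    const snapshot = Object.assign({}, store.get(id));
    if (interfere && attempt === 1) interfere(); // simulated concurrent writer
    if (casUpdate(id, snapshot.ver, { qty: snapshot.qty - 1 }) === 1) {
      return attempt;
    }
  }
  throw new Error("gave up after " + maxTries + " tries");
}

// Another writer changes the document between our read and our write.
const tries = decrementQty("abc", 3, () => casUpdate("abc", 1, { qty: 10 }));
console.log(tries);            // 2
console.log(store.get("abc")); // { _id: 'abc', qty: 9, ver: 3 }
```

The first attempt fails because the stored version no longer matches the snapshot; the second attempt re-reads and succeeds, so the concurrent change to `qty` is not lost.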
