SCHEMA DESIGN WORKSHOP
Jeremy Mikola
  @jmikola
AGENDA
1.   Basic schema design principles for MongoDB
2.   Schema design over an application's lifetime
3.   Common design patterns
4.   Sharding
GOALS
Learn the schema design process in MongoDB
Practice applying common principles via exercises
Understand the implications of sharding
WHAT IS A SCHEMA AND WHY IS IT IMPORTANT?
SCHEMA
Map concepts and relationships to data
Set expectations for the data
Minimize overhead of iterative modifications
Ensure compatibility
NORMALIZATION
users        ←   books        →   authors
username         title            first_name
first_name       isbn             last_name
last_name        language
                 created_by
                 author
DENORMALIZATION
users        ←   books
username         title
first_name       isbn
last_name        language
                 created_by
                 author
                   first_name
                   last_name
WHAT IS SCHEMA DESIGN LIKE IN MONGODB?
 Schema is defined at the application-level
 Design is part of each phase in its lifetime
 There is no magic formula
MONGODB DOCUMENTS
        Storage in BSON → BSONSpec.org

Scalars                     Rich types
  Doubles                     Objects
  Integers (32 or 64-bit)     Arrays
  UTF-8 strings
  UTC Date, timestamp
  Binary, regex, code
  Object ID
  Null
TERMINOLOGY
{
  "mongodb": "relational db",
  "database": "database",
  "collection": "table",
  "document": "row",
  "index": "index",
  "sharding": {
    "shard": "partition",
    "shard key": "partition key"
  }
}
THREE CONSIDERATIONS IN MONGODB SCHEMA DESIGN
 1. The data your application needs
 2. Your application's read usage of the data
 3. Your application's write usage of the data
CASE STUDY
LIBRARY WEB APPLICATION
 Different schemas are possible
AUTHOR SCHEMA
{
  "_id": int,
  "first_name": string,
  "last_name": string
}
USER SCHEMA
{
  "_id": int,
  "username": string,
  "password": string
}
BOOK SCHEMA
{
  "_id": int,
  "title": string,
  "slug": string,
  "author": int,
  "available": boolean,
  "isbn": string,
  "pages": int,
  "publisher": {
    "city": string,
    "date": date,
    "name": string
  },
  "subjects": [ string, string ],
  "language": string,
  "reviews": [
    { "user": int, "text": string },
    { "user": int, "text": string }
  ]
}
EXAMPLE DOCUMENTS
AUTHOR DOCUMENT
> db.authors.findOne()
{
  _id: 1,
  first_name: "F. Scott",
  last_name: "Fitzgerald"
}
USER DOCUMENT
> db.users.findOne()
{
  _id: 1,
  username: "emily@10gen.com",
  password: "slsjfk4odk84k209dlkdj90009283d"
}
BOOK DOCUMENT
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  available: true,
  isbn: "9781857150193",
  pages: 176,
  publisher: {
    name: "Everyman's Library",
    date: ISODate("1991-09-19T00:00:00Z"),
    city: "London"
  },
  subjects: ["Love stories", "1920s", "Jazz Age"],
  language: "English",
  reviews: [
    { user: 1, text: "One of the best…" },
    { user: 2, text: "It's hard to…" }
  ]
}
EMBEDDED OBJECTS
      AKA EMBEDDED OR SUB-DOCUMENTS
  What advantages do they have?
   When should they be used?
EMBEDDED OBJECTS
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  available: true,
  isbn: "9781857150193",
  pages: 176,
  publisher: {
    name: "Everyman's Library",
    date: ISODate("1991-09-19T00:00:00Z"),
    city: "London"
  },
  subjects: ["Love stories", "1920s", "Jazz Age"],
  language: "English",
  reviews: [
    { user: 1, text: "One of the best…" },
    { user: 2, text: "It's hard to…" }
  ]
}
EMBEDDED OBJECTS
Great for read performance
One seek to load the entire document
One round trip to the database
Writes can be slow if constantly adding to objects
LINKED DOCUMENTS
What advantages does this approach have?
      When should they be used?
LINKED DOCUMENTS
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  available: true,
  isbn: "9781857150193",
  pages: 176,
  publisher: {
    publisher_name: "Everyman's Library",
    date: ISODate("1991-09-19T00:00:00Z"),
    publisher_city: "London"
  },
  subjects: ["Love stories", "1920s", "Jazz Age"],
  language: "English",
  reviews: [
    { user: 1, text: "One of the best…" },
    { user: 2, text: "It's hard to…" }
  ]
}
LINKED DOCUMENTS
More, smaller documents
Can make queries by ID very simple
Accessing linked document data requires extra read
What effect does this have on the system?
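The trade-off between linking and embedding can be sketched in plain JavaScript without a database. This is only an illustration: the in-memory Map, the read counter, and the helper names are assumptions made for the example, and the embedded author object stands in for any data copied into the book document.

```javascript
// Hypothetical in-memory stand-ins for the authors collection and two book shapes.
const authors = new Map([
  [1, { _id: 1, first_name: "F. Scott", last_name: "Fitzgerald" }],
]);
const linkedBook = { _id: 1, title: "The Great Gatsby", author: 1 };
const embeddedBook = {
  _id: 1,
  title: "The Great Gatsby",
  author: { first_name: "F. Scott", last_name: "Fitzgerald" },
};

let reads = 0;
function fetchAuthor(id) {
  reads += 1; // each call stands for one extra read against the database
  return authors.get(id);
}

// Linked: the book is already loaded, but the author's name costs one more read.
function authorNameLinked(book) {
  const author = fetchAuthor(book.author);
  return author.first_name + " " + author.last_name;
}

// Embedded: the name travels with the book, so no further read is needed.
function authorNameEmbedded(book) {
  return book.author.first_name + " " + book.author.last_name;
}

const linkedName = authorNameLinked(linkedBook);
const readsAfterLinked = reads;
const embeddedName = authorNameEmbedded(embeddedBook);
const readsAfterEmbedded = reads;
```

Both calls return the same name; only the linked variant increases the read counter.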
DATA, RAM AND DISK
ARRAYS
When should they be used?
ARRAY OF SCALARS
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  available: true,
  isbn: "9781857150193",
  pages: 176,
  publisher: {
    name: "Everyman's Library",
    date: ISODate("1991-09-19T00:00:00Z"),
    city: "London"
  },
  subjects: ["Love stories", "1920s", "Jazz Age"],
  language: "English",
  reviews: [
    { user: 1, text: "One of the best…" },
    { user: 2, text: "It's hard to…" }
  ]
}
ARRAY OF OBJECTS
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  available: true,
  isbn: "9781857150193",
  pages: 176,
  publisher: {
    name: "Everyman's Library",
    date: ISODate("1991-09-19T00:00:00Z"),
    city: "London"
  },
  subjects: ["Love stories", "1920s", "Jazz Age"],
  language: "English",
  reviews: [
    { user: 1, text: "One of the best…" },
    { user: 2, text: "It's hard to…" }
  ]
}
EXERCISE #1
 Design a schema for users and their book reviews

Users                     Reviews
  username (string)         text (string)
  email (string)            rating (integer)
                            created_at (date)
            Usernames are immutable
EXERCISE #1: SOLUTION A
     Reviews may be queried by user or book
// db.users (one document per user)
{ _id: ObjectId("…"),
  username: "bob",
  email: "bob@example.com"
}

// db.reviews (one document per review)
{ _id: ObjectId("…"),
  user: ObjectId("…"),
  book: ObjectId("…"),
  rating: 5,
  text: "This book is excellent!",
  created_at: ISODate("2012-10-10T21:14:07.096Z")
}
EXERCISE #1: SOLUTION B
       Optimized to retrieve reviews by user
// db.users (one document per user with all reviews)
{ _id: ObjectId("…"),
  username: "bob",
  email: "bob@example.com",
  reviews: [
    { book: ObjectId("…"),
      rating: 5,
      text: "This book is excellent!",
      created_at: ISODate("2012-10-10T21:14:07.096Z")
    }
  ]
}
EXERCISE #1: SOLUTION C
      Optimized to retrieve reviews by book
// db.users (one document per user)
{ _id: ObjectId("…"),
  username: "bob",
  email: "bob@example.com"
}

// db.books (one document per book with all reviews)
{ _id: ObjectId("…"),
  // Other book fields…
  reviews: [
    { user: ObjectId("…"),
      rating: 5,
      text: "This book is excellent!",
      created_at: ISODate("2012-10-10T21:14:07.096Z")
    }
  ]
}
SCHEMA DESIGN OVER AN APPLICATION'S LIFETIME
           Development
           Production
           Iterative Modifications
DEVELOPMENT PHASE
    Basic CRUD functionality
CRUD: CREATE
author = {
  _id: 2,
  first_name: "Arthur",
  last_name: "Miller"
};

db.authors.insert(author);

 The _id field is unique and automatically indexed
 MongoDB will generate an ObjectId if not provided
CRUD: READ
> db.authors.find({ "last_name": "Miller" })
{
  _id: 2,
  first_name: "Arthur",
  last_name: "Miller"
}
READS AND INDEXING
    Examine the query after creating an index.
> db.books.ensureIndex({ "slug": 1 })

> db.books.find({ "slug": "the-great-gatsby" }).explain()
{
  "cursor": "BtreeCursor slug_1",
  "isMultiKey": false,
  "n": 1,
  "nscannedObjects": 1,
  "nscanned": 1,
  "scanAndOrder": false,
  "indexOnly": false,
  "nYields": 0,
  "nChunkSkips": 0,
  "millis": 0,
  // Other fields follow…
}
MULTI-KEY INDEXES
          Index all values in an array field.
> db.books.ensureIndex({ "subjects": 1 });
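What "index all values in an array field" means can be sketched with a plain Map: every array element becomes its own index entry pointing back at the document. This is an illustration only, not how MongoDB stores its indexes; the second book and the helper name are made up for the example.

```javascript
// Hypothetical sample data; only _id and subjects matter here.
const books = [
  { _id: 1, subjects: ["Love stories", "1920s", "Jazz Age"] },
  { _id: 3, subjects: ["1920s", "Coming of age"] },
];

// Build one index entry per array element, so a book with three subjects
// is reachable through three different keys.
function buildMultiKeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    for (const value of doc[field] || []) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

const subjectIndex = buildMultiKeyIndex(books, "subjects");
const booksFrom1920s = subjectIndex.get("1920s"); // ids of books tagged "1920s"
```

A lookup by any single subject then finds the matching books without scanning every document.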
INDEXING EMBEDDED FIELDS
         Index an embedded object's field.

> db.books.ensureIndex({ "publisher.name": 1 })
QUERY OPERATORS
Conditional operators
  $gt, $gte, $lt, $lte, $ne, $all, $in, $nin, $size,
  $and, $or, $nor, $mod, $type, $exists
Regular expressions
Value in an array
  $elemMatch
Cursor methods and modifiers
  count(), limit(), skip(), snapshot(), sort(),
  batchSize(), explain(), hint()
CRUD: UPDATE
review = {
  user: 1,
  text: "I did NOT like this book."
};

db.books.update(
  { _id: 1 },
  { $push: { reviews: review } }
);
ATOMIC MODIFIERS
Update specific fields within a document
           $set, $unset
           $push, $pushAll
           $addToSet, $pop
           $pull, $pullAll
           $rename
           $bit
CRUD: DELETE
> db.books.remove({ _id: 1 })
PRODUCTION PHASE
Evolve schema to meet the application's read and write patterns
READ USAGE
      Finding books by an author's first name
authors = db.authors.find({ first_name: /^f.*/i }, { _id: 1 });

authorIds = authors.map(function(x) { return x._id; });

db.books.find({ author: { $in: authorIds } });
READ USAGE
"Cache" the author name in an embedded document
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  author: {
    first_name: "F. Scott",
    last_name: "Fitzgerald"
  }
  // Other fields follow…
}

              Queries are now one step
> db.books.find({ "author.first_name": /^f.*/i })
WRITE USAGE
              Users can review a book
review = {
  user: 1,
  text: "I thought this book was great!",
  rating: 5
};

> db.books.update(
  { _id: 3 },
  { $push: { reviews: review } }
);




 Document size limit (16MB)
 Storage fragmentation after many updates/deletes
EXERCISE #2
Display the 10 most recent reviews by a user
Make efficient use of memory and disk seeks
EXERCISE #2: SOLUTION
      Store users' reviews in monthly buckets
// db.reviews (one document per user per month)
{ _id: "bob-2012-10",
  reviews: [
    { _id: ObjectId("…"),
      rating: 5,
      text: "This book is excellent!",
      created_at: ISODate("2012-10-10T21:14:07.096Z")
    },
    { _id: ObjectId("…"),
      rating: 2,
      text: "I didn't really enjoy this book.",
      created_at: ISODate("2012-10-11T20:12:50.594Z")
    }
  ]
}
EXERCISE #2: SOLUTION
  Adding a new review to the appropriate bucket
myReview = {
  _id: ObjectId("…"),
  rating: 3,
  text: "An average read.",
  created_at: ISODate("2012-10-13T12:26:11.502Z")
};

> db.reviews.update(
  { _id: "bob-2012-10" },
  { $push: { reviews: myReview } }
);
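Choosing "the appropriate bucket" comes down to deriving the bucket _id from the username and the review's creation time. A minimal sketch follows, assuming the "username-YYYY-MM" format used in the update above; the helper name is hypothetical and the month is taken in UTC.

```javascript
// Hypothetical helper: derive the bucket _id ("<username>-<YYYY>-<MM>")
// from a username and the review's creation time (UTC).
function reviewBucketId(username, createdAt) {
  const year = createdAt.getUTCFullYear();
  // getUTCMonth() is zero-based; pad to two digits so ids sort chronologically.
  const month = String(createdAt.getUTCMonth() + 1).padStart(2, "0");
  return username + "-" + year + "-" + month;
}

const bucketId = reviewBucketId("bob", new Date("2012-10-13T12:26:11.502Z"));
```

Because usernames are immutable in this exercise, they are safe to embed in an _id. The zero-padded month keeps string order equal to chronological order, which a sort on _id relies on.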
EXERCISE #2: SOLUTION
   Display the 10 most recent reviews by a user
cursor = db.reviews.find(
  { _id: /^bob-/ },
  { reviews: { $slice: 10 } }
).sort({ _id: -1 });

num = 0;

while (cursor.hasNext() && num < 10) {
  doc = cursor.next();

  for (var i = 0; i < doc.reviews.length && num < 10; ++i, ++num) {
    printjson(doc.reviews[i]);
  }
}
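The order in which reviews come back depends on how each bucket's array is filled. The sketch below works on plain arrays and assumes reviews are appended with $push, so each bucket is oldest-first and has to be read from the end to get the newest entries first. The sample buckets, the day field, and the helper name are made up for illustration.

```javascript
// Hypothetical buckets for one user, already sorted newest month first,
// as a descending sort on _id would return them.
const buckets = [
  { _id: "bob-2012-10", reviews: [{ rating: 5, day: 10 }, { rating: 2, day: 11 }, { rating: 3, day: 13 }] },
  { _id: "bob-2012-09", reviews: [{ rating: 4, day: 2 }, { rating: 1, day: 28 }] },
];

// Assumes each reviews array is oldest-first (appended with $push), so it is
// walked backwards; stops as soon as `limit` reviews have been collected.
function mostRecentReviews(sortedBuckets, limit) {
  const result = [];
  for (const bucket of sortedBuckets) {
    for (let i = bucket.reviews.length - 1; i >= 0 && result.length < limit; i--) {
      result.push(bucket.reviews[i]);
    }
    if (result.length >= limit) break;
  }
  return result;
}

const latestFour = mostRecentReviews(buckets, 4);
```

With monthly buckets, a user's ten most recent reviews usually sit in one or two documents, which keeps the number of disk seeks small.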
EXERCISE #2: SOLUTION
                  Deleting a review
cursor = db.reviews.update(
  { _id: "bob-2012-10" },
  { $pull: { reviews: { _id: ObjectId("…") } } }
);
ITERATIVE MODIFICATIONS
 Schema design is evolutionary
ALLOW USERS TO BROWSE BY BOOK SUBJECT
> db.subjects.findOne()
{
  _id: 1,
  name: "American Literature",
  sub_category: {
    name: "1920s",
    sub_category: { name: "Jazz Age" }
  }
}




   How can you search this collection?
   Be aware of document size limitations
   Benefit from hierarchy being in same document
TREE STRUCTURES
> db.subjects.find()
{ _id: "American Literature" }

{ _id: "1920s",
  ancestors: ["American Literature"],
  parent: "American Literature"
}

{ _id: "Jazz Age",
  ancestors: ["American Literature", "1920s"],
  parent: "1920s"
}

{ _id: "Jazz Age in New York",
  ancestors: ["American Literature", "1920s", "Jazz Age"],
  parent: "Jazz Age"
}
TREE STRUCTURES
       Find sub-categories of a given subject
> db.subjects.find({ ancestors: "1920s" })
{
  _id: "Jazz Age",
  ancestors: ["American Literature", "1920s"],
  parent: "1920s"
}

{
  _id: "Jazz Age in New York",
  ancestors: ["American Literature", "1920s", "Jazz Age"],
  parent: "Jazz Age"
}
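The ancestors array can be derived from parent links when a subject is created, and the query above then matches any element of that array. A plain JavaScript sketch follows, assuming a hypothetical parentOf lookup table; unlike the root document in the slides, the root here carries an empty ancestors array and a null parent.

```javascript
// Hypothetical parent links matching the subjects shown above.
const parentOf = {
  "American Literature": null,
  "1920s": "American Literature",
  "Jazz Age": "1920s",
  "Jazz Age in New York": "Jazz Age",
};

// Walk up the parent links and return the ancestors, root first.
function ancestorsOf(subject) {
  const chain = [];
  for (let p = parentOf[subject]; p; p = parentOf[p]) {
    chain.unshift(p);
  }
  return chain;
}

// Build documents in roughly the shape used in the slides, then filter the
// way a query on { ancestors: "1920s" } would: match any array element.
const subjects = Object.keys(parentOf).map((name) => ({
  _id: name,
  ancestors: ancestorsOf(name),
  parent: parentOf[name],
}));
const under1920s = subjects
  .filter((s) => s.ancestors.includes("1920s"))
  .map((s) => s._id);
```

Storing the full ancestor chain trades some duplication for the ability to fetch a whole subtree with one query.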
EXERCISE #3
Allow users to borrow library books
   User sends a loan request
   Library approves or not
   Requests time out after seven days
Approval process is asynchronous
Requests may be prioritized
EXERCISE #3: SOLUTION
           Need to maintain order and state
           Ensure that updates are atomic
// Create a new loan request
> db.loans.insert({
  _id: { borrower: "bob", book: ObjectId("…") },
  pending: false,
  approved: false,
  priority: 1
});

// Find the highest priority request and mark as pending approval
request = db.loans.findAndModify({
  query: { pending: false },
  sort: { priority: -1 },
  update: { $set: { pending: true, started: new ISODate() } },
  new: true
});
EXERCISE #3: SOLUTION
           Updated and added fields
           Modified document was returned
{
  _id: { borrower: "bob", book: ObjectId("…") },
  pending: true,
  approved: false,
  priority: 1,
  started: ISODate("2012-10-11T22:09:42.542Z")
}
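The claim step (find the highest-priority request that is not pending, mark it, return it) can be mimicked on a plain array to see the selection logic in isolation. This is a single-threaded sketch with made-up data and a hypothetical helper name; the atomicity that the database call provides is not reproduced here, only the query, sort, and update behaviour.

```javascript
// Hypothetical in-memory loan requests (ObjectIds replaced by strings).
const loans = [
  { _id: { borrower: "bob", book: "book-1" }, pending: false, approved: false, priority: 1 },
  { _id: { borrower: "amy", book: "book-2" }, pending: false, approved: false, priority: 3 },
];

// Pick the highest-priority request that is not pending, mark it as pending
// with a start time, and return the modified request (or null if none match).
function claimNextRequest(requests, now) {
  const candidates = requests
    .filter((r) => !r.pending)
    .sort((a, b) => b.priority - a.priority);
  if (candidates.length === 0) return null;
  const request = candidates[0];
  request.pending = true;
  request.started = now;
  return request;
}

const first = claimNextRequest(loans, new Date("2012-10-11T22:09:42.542Z"));
const second = claimNextRequest(loans, new Date("2012-10-11T22:10:00.000Z"));
const third = claimNextRequest(loans, new Date("2012-10-11T22:11:00.000Z"));
```

Each request is handed out once: the second call skips the request already marked as pending, and the third finds nothing left to claim.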
EXERCISE #3: SOLUTION
// Library approves the loan request
> db.loans.update(
  { _id: { borrower: "bob", book: ObjectId("…") } },
  { $set: { pending: false, approved: true } }
);
EXERCISE #3: SOLUTION
// Request times out after seven days
limit = new Date();
limit.setDate(limit.getDate() - 7);

> db.loans.update(
  { pending: true, started: { $lt: limit } },
  { $set: { pending: false, approved: false } }
);
EXERCISE #4
   Allow users to recommend books
Users can recommend each book only once
Display a book's current recommendations
EXERCISE #4: SOLUTION
// db.recommendations (one document per user per book)
> db.recommendations.insert({
  book: ObjectId("…"),
  user: ObjectId("…")
});

// Unique index ensures users can't recommend twice
> db.recommendations.ensureIndex(
  { book: 1, user: 1 },
  { unique: true }
);

// Count the number of recommendations for a book
> db.recommendations.count({ book: ObjectId("…") });
EXERCISE #4: SOLUTION
        Indexes in MongoDB do not maintain counts
        Counts are computed via index scans
        Denormalize totals on books
> db.books.update(
  { _id: ObjectId("…") },
  { $inc: { recommendations: 1 } }
);
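How the unique index and the denormalized total work together can be sketched in memory: the total is incremented only when the insert is not rejected as a duplicate. The Set, the totals object, and the function name are hypothetical stand-ins for the collection, the counter field, and application code.

```javascript
// Hypothetical in-memory stand-ins: a Set plays the unique {book, user}
// index, and a plain object holds the denormalized per-book totals.
const recommendationKeys = new Set();
const recommendationTotals = {};

function recommend(bookId, userId) {
  const key = bookId + ":" + userId;
  if (recommendationKeys.has(key)) {
    return false; // duplicate: the unique index would reject this insert
  }
  recommendationKeys.add(key);
  // Same idea as incrementing the recommendations field on the book.
  recommendationTotals[bookId] = (recommendationTotals[bookId] || 0) + 1;
  return true;
}

const firstTry = recommend("book-1", "bob");
const secondTry = recommend("book-1", "bob");
const otherUser = recommend("book-1", "amy");
```

Displaying a book's current recommendations then reads one stored number instead of scanning the index for every page view.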
COMMON DESIGN PATTERNS
ONE-TO-ONE RELATIONSHIP
Let's pretend that authors only write one book.
LINKING
  Either side, or both, can track the relationship.
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  // Other fields follow…
}

> db.authors.findOne({ _id: 1 })
{
  _id: 1,
  first_name: "F. Scott",
  last_name: "Fitzgerald",
  book: 1
}
EMBEDDED OBJECT
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: {
    first_name: "F. Scott",
    last_name: "Fitzgerald"
  }
  // Other fields follow…
}
ONE-TO-MANY RELATIONSHIP
In reality, authors may write multiple books.
ARRAY OF ID'S
       The "one" side tracks the relationship.
> db.authors.findOne()
{
  _id: 1,
  first_name: "F. Scott",
  last_name: "Fitzgerald",
  books: [1, 3, 20]
}




     Flexible and space-efficient
     Additional query needed for non-ID lookups
SINGLE FIELD WITH ID
      The "many" side tracks the relationship.
> db.books.find({ author: 1 })
{
  _id: 1,
  title: "The Great Gatsby",
  slug: "9781857150193-the-great-gatsby",
  author: 1,
  // Other fields follow…
}

{
  _id: 3,
  title: "This Side of Paradise",
  slug: "9780679447238-this-side-of-paradise",
  author: 1,
  // Other fields follow…
}
ARRAY OF OBJECTS
> db.authors.findOne()
{
  _id: 1,
  first_name: "F. Scott",
  last_name: "Fitzgerald",
  books: [
    { _id: 1, title: "The Great Gatsby" },
    { _id: 3, title: "This Side of Paradise" }
  ]
  // Other fields follow…
}

 Use the $slice operator to return a subset of books
MANY-TO-MANY RELATIONSHIP
Some books may also have co-authors.
ARRAY OF ID'S ON BOTH SIDES
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  authors: [1, 5]
  // Other fields follow…
}

> db.authors.findOne()
{
  _id: 1,
  first_name: "F. Scott",
  last_name: "Fitzgerald",
  books: [1, 3, 20]
}
ARRAY OF ID'S ON BOTH SIDES
       Query for all books by a given author
> db.books.find({ authors: 1 });

       Query for all authors of a given book
> db.authors.find({ books: 1 });
ARRAY OF ID'S ON ONE SIDE
> db.books.findOne()
{
  _id: 1,
  title: "The Great Gatsby",
  authors: [1, 5]
  // Other fields follow…
}

> db.authors.find({ _id: { $in: [1, 5] } })
{
  _id: 1,
  first_name: "F. Scott",
  last_name: "Fitzgerald"
}

{
  _id: 5,
  first_name: "Unknown",
  last_name: "Co-author"
}
ARRAY OF ID'S ON ONE SIDE
        Query for all books by a given author
> db.books.find({ authors: 1 });

        Query for all authors of a given book
book = db.books.findOne(
  { title: "The Great Gatsby" },
  { authors: 1 }
);

db.authors.find({ _id: { $in: book.authors } });
EXERCISE #5
Tracking time series data
  Graph recommendations per unit of time
  Count by: day, hour, minute
EXERCISE #5: SOLUTION A
// db.rec_ts (time series buckets, hour and minute sub-docs)
> db.rec_ts.insert({
  book: ObjectId("…"),
  day: ISODate("2012-10-11T00:00:00.000Z"),
  total: 0,
  hour:   { "0": 0, "1": 0, /* … */ "23": 0 },
  minute: { "0": 0, "1": 0, /* … */ "1439": 0 }
});

// Record a recommendation created one minute before midnight
> db.rec_ts.update(
  { book: ObjectId("…"), day: ISODate("2012-10-11T00:00:00.000Z") },
  { $inc: { total: 1, "hour.23": 1, "minute.1439": 1 } }
);
BSON STORAGE
               Sequence of key/value pairs
               Not a hash map
               Optimized to scan quickly

                        minute
                     [0][1]…[1439]


What is the cost of updating the minute before midnight?
BSON STORAGE
   We can skip sub-documents

     hour0          …     hour23
   [0][1]…[59]            [1380]…[1439]


How could this change the schema?
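A rough way to compare the two layouts is to count how many keys must be stepped over before the target key is reached. The cost model below is an assumption made for illustration: one step per earlier sibling key, and one step to skip an entire sub-document. It ignores any other fields stored alongside the counters.

```javascript
// Flat layout: all 1440 minute keys are siblings, so every earlier key
// has to be stepped over.
function flatSteps(minuteOfDay) {
  return minuteOfDay; // keys "0" .. minuteOfDay-1 come first
}

// Nested layout: skip the earlier hour sub-documents, then step over the
// earlier minutes within the target hour.
function nestedSteps(minuteOfDay) {
  const hour = Math.floor(minuteOfDay / 60);
  const minuteInHour = minuteOfDay % 60;
  return hour + minuteInHour;
}

const lastMinute = 23 * 60 + 59; // 1439, one minute before midnight
const flatCost = flatSteps(lastMinute);
const nestedCost = nestedSteps(lastMinute);
```

Under this model, the worst case drops from 1439 steps to 82 when the minutes are grouped by hour.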
EXERCISE #5: SOLUTION B
// db.rec_ts (time series buckets, each hour a sub-doc)
> db.rec_ts.insert({
  book: ObjectId("…"),
  day: ISODate("2012-10-11T00:00:00.000Z"),
  total: 148,
  hour: {
    "0": { total: 7, "0": 0, /* … */ "59": 2 },
    "1": { total: 3, "60": 1, /* … */ "119": 0 },
    // Other hours…
    "23": { total: 12, "1380": 0, /* … */ "1439": 3 }
  }
});

// Record a recommendation created one minute before midnight
> db.rec_ts.update(
  { book: ObjectId("…"), day: ISODate("2012-10-11T00:00:00.000Z") },
  { $inc: { total: 1, "hour.23.total": 1, "hour.23.1439": 1 } }
);
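In an application, the field paths in the update would be computed from the recommendation's timestamp rather than written by hand. A sketch follows, assuming UTC day boundaries and minute-of-day keys nested under each hour, as in the document above; both helper names are hypothetical.

```javascript
// Hypothetical helper: build the $inc document for the per-hour layout
// from the time a recommendation was created (UTC).
function recommendationIncrement(createdAt) {
  const hour = createdAt.getUTCHours();
  const minuteOfDay = hour * 60 + createdAt.getUTCMinutes();
  const inc = { total: 1 };
  inc["hour." + hour + ".total"] = 1;
  inc["hour." + hour + "." + minuteOfDay] = 1;
  return { $inc: inc };
}

// Midnight UTC of the same day, used to select the day's bucket document.
function dayBucket(createdAt) {
  return new Date(Date.UTC(
    createdAt.getUTCFullYear(), createdAt.getUTCMonth(), createdAt.getUTCDate()
  ));
}

const createdAt = new Date("2012-10-11T23:59:30.000Z");
const update = recommendationIncrement(createdAt);
const day = dayBucket(createdAt);
```

The resulting object has the same three increments as the hand-written update for the minute before midnight.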
SINGLE-COLLECTION INHERITANCE
  Take advantage of MongoDB's features
 Documents need not all have the same fields
 Sparsely index only present fields
SCHEMA FLEXIBILITY
> db.books.findOne()
{
  _id: 47,
  title: "The Wizard Chase",
  type: "series",
  series_title: "The Wizard's Trilogy",
  volume: 2
  // Other fields follow…
}

       Find all books that are part of a series
> db.books.find({ type: "series" });

> db.books.find({ series_title: { $exists: true } });

> db.books.find({ volume: { $gt: 0 } });
INDEX ONLY PRESENT FIELDS
Documents without these fields will not be indexed.
> db.books.ensureIndex({ series_title: 1 }, { sparse: true })

> db.books.ensureIndex({ volume: 1 }, { sparse: true })
EXERCISE #6
Users can recommend at most 10 books
EXERCISE #6: SOLUTION
// db.user_recs (track user's remaining and given recommendations)
> db.user_recs.insert({
  _id: "bob",
  remaining: 8,
  books: [3, 10]
});

// Record a recommendation if possible
> db.user_recs.update(
  { _id: "bob", remaining: { $gt: 0 }, books: { $ne: 4 } },
  { $inc: { remaining: -1 }, $push: { books: 4 } }
);
EXERCISE #6: SOLUTION
  One less unassigned recommendation remaining
  Newly-recommended book is now linked
> db.user_recs.findOne()
{
  _id: "bob",
  remaining: 7,
  books: [3, 10, 4]
}
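The guard conditions in the update's query can be read as a precondition: the change is applied only when the user still has recommendations left and has not recommended this book before. A plain JavaScript sketch of that logic on an in-memory copy of the document; the function name is hypothetical.

```javascript
// Hypothetical in-memory copy of bob's document from the slides.
const userRecs = { _id: "bob", remaining: 8, books: [3, 10] };

// Mirrors the update's query conditions: apply the change only when the
// user has recommendations left and has not recommended this book yet.
function tryRecommend(doc, bookId) {
  if (doc.remaining <= 0 || doc.books.includes(bookId)) {
    return false; // no document would match, so nothing is modified
  }
  doc.remaining -= 1;
  doc.books.push(bookId);
  return true;
}

const accepted = tryRecommend(userRecs, 4);
const repeated = tryRecommend(userRecs, 4);
```

Putting the check and the change in a single update means there is no window between "check the limit" and "record the recommendation" for another request to slip through.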
EXERCISE #7
Statistic buckets
  Each book has a listing page in our application
  Record referring website domains for each book
  Count each domain independently
EXERCISE #7: SOLUTION A
> db.book_refs.findOne()
{ book: 1,
  referrers: [
    { domain: "google.com", count: 4 },
    { domain: "yahoo.com", count: 1 }
  ]
}

> db.book_refs.update(
  { book: 1, "referrers.domain": "google.com" },
  { $inc: { "referrers.$.count": 1 } }
);
EXERCISE #7: SOLUTION A
Update the position of the first matched element.
> db.book_refs.update(
  { book: 1, "referrers.domain": "google.com" },
  { $inc: { "referrers.$.count": 1 } }
);

> db.book_refs.findOne()
{ book: 1,
  referrers: [
    { domain: "google.com", count: 5 },
    { domain: "yahoo.com", count: 1 }
  ]
}




      What if a new referring website is used?
EXERCISE #7: SOLUTION B
> db.book_refs.findOne()
{ book: 1,
  referrers: {
    "google_com": 5,
    "yahoo_com": 1
  }
}

> db.book_refs.update(
  { book: 1 },
  { $inc: { "referrers.bing_com": 1 } },
  true
);




    Replace dots with underscores for key names
    Increment to add a new referring website
    Upsert in case this is the book's first referrer
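The three points above can be sketched on a plain object: sanitize the domain into a key, create the document on first use, and increment the counter whether or not it already exists. The Map and both function names are hypothetical stand-ins for the collection and application code.

```javascript
// Hypothetical in-memory stand-in for the book_refs collection, keyed by book.
const bookRefs = new Map();

// Dots have a special meaning in field paths, so they are replaced
// before the domain is used as a key.
function referrerKey(domain) {
  return domain.replace(/\./g, "_");
}

// Mimics an upsert with an increment: create the document and the counter
// on first use, otherwise add one.
function recordReferrer(bookId, domain) {
  if (!bookRefs.has(bookId)) {
    bookRefs.set(bookId, { book: bookId, referrers: {} });
  }
  const referrers = bookRefs.get(bookId).referrers;
  const key = referrerKey(domain);
  referrers[key] = (referrers[key] || 0) + 1;
  return referrers[key];
}

recordReferrer(1, "google.com");
recordReferrer(1, "google.com");
const bingCount = recordReferrer(1, "bing.com");
```

One caveat of this key scheme: two domains that differ only by a dot versus an underscore would share a counter.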
SHARDING
SHARDING
Ad-hoc partitioning
Consistent hashing
  Amazon DynamoDB
Range based partitioning
  Google BigTable
  Yahoo! PNUTS
  MongoDB
SHARDING IN MONGODB
Automated management
Range based partitioning
Convert to sharded system with no downtime
Fully consistent
SHARDING A COLLECTION
> db.runCommand({ addshard: "shard1.example.com" });

> db.runCommand({
  shardCollection: "library.books",
  key: { _id: 1 }
});




             Keys range from −∞ to +∞
             Ranges are stored as chunks
SHARDING DATA BY CHUNKS
> db.books.save({ _id: 35, title: "Call of the Wild" });
> db.books.save({ _id: 40, title: "Tropic of Cancer" });
> db.books.save({ _id: 45, title: "The Jungle" });
> db.books.save({ _id: 50, title: "Of Mice and Men" });

    [−∞, +∞)   →   [−∞, 40)   →   [−∞, 40)
                   [40, +∞)       [40, 50)
                                  [50, +∞)

  Ranges are split into chunks as data is inserted
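Range-based partitioning can be sketched as a list of half-open ranges plus a lookup and a split operation. This is a simplified model with hypothetical function names; in a real cluster, when and where a chunk is split is decided by the server, not by code like this.

```javascript
// Chunks from the diagram above, as half-open ranges [min, max).
const chunks = [
  { min: -Infinity, max: 40 },
  { min: 40, max: 50 },
  { min: 50, max: Infinity },
];

// Return the chunk whose range contains the shard key value.
function chunkFor(key) {
  return chunks.find((c) => key >= c.min && key < c.max);
}

// Split the chunk containing `at` into [min, at) and [at, max).
function splitChunk(at) {
  const i = chunks.findIndex((c) => at > c.min && at < c.max);
  if (i === -1) return false; // `at` is already a chunk boundary
  const old = chunks[i];
  chunks.splice(i, 1, { min: old.min, max: at }, { min: at, max: old.max });
  return true;
}

const chunkOf45 = chunkFor(45);
splitChunk(60);
const chunkOf65 = chunkFor(65);
```

Every key falls into exactly one chunk, and a split only replaces one range with two adjacent ones, so lookups stay valid while the data grows.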
ADDING NEW SHARDS
      shard1
      [−∞, 40)
      [40, 50)
      [50, 60)
      [60, +∞)
ADDING NEW SHARDS
> db.runCommand({ addshard: "shard2.example.com" });

                shard1         shard2
                [−∞, 40)
                               [40, 50)
                [50, 60)
                               [60, +∞)

      Chunks are migrated to balance shards
ADDING NEW SHARDS
> db.runCommand({ addshard: "shard3.example.com" });

        shard1          shard2          shard3
        [−∞, 40)
                        [40, 50)
                                        [50, 60)
                        [60, +∞)
SHARDING COMPONENTS
     mongos
     Config servers
     Shards
       mongod
       Replica sets
SHARDED WRITES
Inserts
   Shard key required
   Routed
Updates and removes
   Shard key optional
   May be routed or scattered
SHARDED READS
Queries
  By shard key: routed
  Without shard key: scatter/gather
Sorted queries
  By shard key: routed in order
  Without shard key: distributed merge sort
EXERCISE #8
    Users can upload images for books

                 images
                 image_id: ???
                 data: binary

The collection will be sharded by image_id.
       What should image_id be?
EXERCISE #8: SOLUTIONS
What's the best shard key for our use case?
         Auto-increment (ObjectId)
         MD5 of data
         Time (e.g. month) and MD5
Right-balanced Access
Random Access
Segmented Access
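Two of the candidate keys can be computed with Node's built-in crypto module. The sketch assumes the three access patterns listed pair up, in order, with the three candidates: a hash alone scatters keys across the range, while a month prefix groups each month's uploads together and scatters only within the month. The function names and the "YYYY-MM-" prefix format are made up for the example.

```javascript
const crypto = require("crypto");

// Candidate 2: MD5 of the image data, as a 32-character hex string.
function md5Key(data) {
  return crypto.createHash("md5").update(data).digest("hex");
}

// Candidate 3: a month prefix followed by the MD5, e.g. "2012-10-<md5>".
function monthAndMd5Key(data, uploadedAt) {
  const month = String(uploadedAt.getUTCMonth() + 1).padStart(2, "0");
  return uploadedAt.getUTCFullYear() + "-" + month + "-" + md5Key(data);
}

// Placeholder bytes standing in for an uploaded image.
const imageData = Buffer.from("hypothetical image bytes");
const randomKey = md5Key(imageData);
const segmentedKey = monthAndMd5Key(imageData, new Date("2012-10-11T00:00:00Z"));
```

Which key fits best depends on how images are read back: if recent uploads are requested far more often than old ones, the month prefix keeps the active part of the index small.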
SUMMARY
Schema design is different in MongoDB.
Basic data design principles apply.
It's about your application.
It's about your data and how it's used.
It's about the entire lifetime of your application.
THANKS!
 QUESTIONS?

More Related Content

What's hot

Forking Oryx at Intalio
Forking Oryx at IntalioForking Oryx at Intalio
Forking Oryx at IntalioAntoine Toulme
 
deepjs - tools for better programming
deepjs - tools for better programmingdeepjs - tools for better programming
deepjs - tools for better programmingnomocas
 
CSS: A Slippery Slope to the Backend
CSS: A Slippery Slope to the BackendCSS: A Slippery Slope to the Backend
CSS: A Slippery Slope to the BackendFITC
 
NoSQL & MongoDB
NoSQL & MongoDBNoSQL & MongoDB
NoSQL & MongoDBShuai Liu
 
Great BigTable and my toys
Great BigTable and my toysGreat BigTable and my toys
Great BigTable and my toysmseki
 
Interactive Visualization With Bokeh (SF Python Meetup)
Interactive Visualization With Bokeh (SF Python Meetup)Interactive Visualization With Bokeh (SF Python Meetup)
Interactive Visualization With Bokeh (SF Python Meetup)Peter Wang
 
Bokeh Tutorial - PyData @ Strata San Jose 2015
Bokeh Tutorial - PyData @ Strata San Jose 2015Bokeh Tutorial - PyData @ Strata San Jose 2015
Bokeh Tutorial - PyData @ Strata San Jose 2015Peter Wang
 
Schema design
Schema designSchema design
Schema designchristkv
 
MongoDB .local Munich 2019: Tips and Tricks++ for Querying and Indexing MongoDB
MongoDB .local Munich 2019: Tips and Tricks++ for Querying and Indexing MongoDBMongoDB .local Munich 2019: Tips and Tricks++ for Querying and Indexing MongoDB
MongoDB .local Munich 2019: Tips and Tricks++ for Querying and Indexing MongoDBMongoDB
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseMike Dirolf
 
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...MongoDB
 
ELK Stack - Turn boring logfiles into sexy dashboard
ELK Stack - Turn boring logfiles into sexy dashboardELK Stack - Turn boring logfiles into sexy dashboard
ELK Stack - Turn boring logfiles into sexy dashboardGeorg Sorst
 

What's hot (20)

Forking Oryx at Intalio
Forking Oryx at IntalioForking Oryx at Intalio
Forking Oryx at Intalio
 
deepjs - tools for better programming
deepjs - tools for better programmingdeepjs - tools for better programming
deepjs - tools for better programming
 
Elastic search 검색
Elastic search 검색Elastic search 검색
Elastic search 검색
 
CSS: A Slippery Slope to the Backend
CSS: A Slippery Slope to the BackendCSS: A Slippery Slope to the Backend
CSS: A Slippery Slope to the Backend
 
NoSQL & MongoDB
NoSQL & MongoDBNoSQL & MongoDB
NoSQL & MongoDB
 
Css selectors
Css selectorsCss selectors
Css selectors
 
Great BigTable and my toys
Great BigTable and my toysGreat BigTable and my toys
Great BigTable and my toys
 
Canvas - The Cure
Canvas - The CureCanvas - The Cure
Canvas - The Cure
 
PHP Tutorial (funtion)
PHP Tutorial (funtion)PHP Tutorial (funtion)
PHP Tutorial (funtion)
 
Interactive Visualization With Bokeh (SF Python Meetup)
Interactive Visualization With Bokeh (SF Python Meetup)Interactive Visualization With Bokeh (SF Python Meetup)
Interactive Visualization With Bokeh (SF Python Meetup)
 
PHP 1
PHP 1PHP 1
PHP 1
 
Bokeh Tutorial - PyData @ Strata San Jose 2015
Bokeh Tutorial - PyData @ Strata San Jose 2015Bokeh Tutorial - PyData @ Strata San Jose 2015
Bokeh Tutorial - PyData @ Strata San Jose 2015
 
Mongo db presentation
Mongo db presentationMongo db presentation
Mongo db presentation
 
Sposoby na internet
Sposoby na internetSposoby na internet
Sposoby na internet
 
Mongo scaling
Mongo scalingMongo scaling
Mongo scaling
 
Schema design
Schema designSchema design
Schema design
 
MongoDB .local Munich 2019: Tips and Tricks++ for Querying and Indexing MongoDB
MongoDB .local Munich 2019: Tips and Tricks++ for Querying and Indexing MongoDBMongoDB .local Munich 2019: Tips and Tricks++ for Querying and Indexing MongoDB
MongoDB .local Munich 2019: Tips and Tricks++ for Querying and Indexing MongoDB
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
 
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...
MongoDB .local Munich 2019: Aggregation Pipeline Power++: How MongoDB 4.2 Pip...
 
ELK Stack - Turn boring logfiles into sexy dashboard
ELK Stack - Turn boring logfiles into sexy dashboardELK Stack - Turn boring logfiles into sexy dashboard
ELK Stack - Turn boring logfiles into sexy dashboard
 

20121023 mongodb schema-design

  • 2. AGENDA
      1. Basic schema design principles for MongoDB
      2. Schema design over an application's lifetime
      3. Common design patterns
      4. Sharding
  • 3. GOALS
      Learn the schema design process in MongoDB
      Practice applying common principles via exercises
      Understand the implications of sharding
  • 4. WHAT IS A SCHEMA AND WHY IS IT IMPORTANT?
  • 5. SCHEMA
      Map concepts and relationships to data
      Set expectations for the data
      Minimize overhead of iterative modifications
      Ensure compatibility
  • 6. NORMALIZATION users ← books → authors
      users: username, first_name, last_name
      books: title, isbn, language, created_by, author
      authors: first_name, last_name
  • 7. DENORMALIZATION users ← books
      users: username, first_name, last_name
      books: title, isbn, language, created_by, author (first_name, last_name)
  • 8. WHAT IS SCHEMA DESIGN LIKE IN MONGODB?
      Schema is defined at the application level
      Design is part of each phase in its lifetime
      There is no magic formula
  • 9. MONGODB DOCUMENTS Storage in BSON → BSONSpec.org
      Scalars: Doubles, Integers (32 or 64-bit), UTF-8 strings, UTC Date, timestamp, Binary, regex, code, Object ID, null
      Rich types: Objects, Arrays
  • 10. TERMINOLOGY
      {
        "mongodb": "relational db",
        "database": "database",
        "collection": "table",
        "document": "row",
        "index": "index",
        "sharding": {
          "shard": "partition",
          "shard key": "partition key"
        }
      }
  • 11. THREE CONSIDERATIONS IN MONGODB SCHEMA DESIGN
      1. The data your application needs
      2. Your application's read usage of the data
      3. Your application's write usage of the data
  • 12. CASE STUDY LIBRARY WEB APPLICATION Different schemas are possible
  • 13. AUTHOR SCHEMA
      {
        "_id": int,
        "first_name": string,
        "last_name": string
      }
  • 14. USER SCHEMA
      {
        "_id": int,
        "username": string,
        "password": string
      }
  • 15. BOOK SCHEMA
      {
        "_id": int,
        "title": string,
        "slug": string,
        "author": int,
        "available": boolean,
        "isbn": string,
        "pages": int,
        "publisher": {
          "city": string,
          "date": date,
          "name": string
        },
        "subjects": [string, string],
        "language": string,
        "reviews": [
          { "user": int, "text": string },
          { "user": int, "text": string }
        ]
      }
  • 17. AUTHOR DOCUMENT
      > db.authors.findOne()
      {
        _id: 1,
        first_name: "F. Scott",
        last_name: "Fitzgerald"
      }
  • 18. USER DOCUMENT
      > db.users.findOne()
      {
        _id: 1,
        username: "emily@10gen.com",
        password: "slsjfk4odk84k209dlkdj90009283d"
      }
  • 19. BOOK DOCUMENT
      > db.books.findOne()
      {
        _id: 1,
        title: "The Great Gatsby",
        slug: "9781857150193-the-great-gatsby",
        author: 1,
        available: true,
        isbn: "9781857150193",
        pages: 176,
        publisher: {
          name: "Everyman's Library",
          date: ISODate("1991-09-19T00:00:00Z"),
          city: "London"
        },
        subjects: ["Love stories", "1920s", "Jazz Age"],
        language: "English",
        reviews: [
          { user: 1, text: "One of the best…" },
          { user: 2, text: "It's hard to…" }
        ]
      }
  • 20. EMBEDDED OBJECTS AKA EMBEDDED OR SUB-DOCUMENTS What advantages do they have? When should they be used?
  • 21. EMBEDDED OBJECTS
      > db.books.findOne()
      {
        _id: 1,
        title: "The Great Gatsby",
        slug: "9781857150193-the-great-gatsby",
        author: 1,
        available: true,
        isbn: "9781857150193",
        pages: 176,
        publisher: {
          name: "Everyman's Library",
          date: ISODate("1991-09-19T00:00:00Z"),
          city: "London"
        },
        subjects: ["Love stories", "1920s", "Jazz Age"],
        language: "English",
        reviews: [
          { user: 1, text: "One of the best…" },
          { user: 2, text: "It's hard to…" }
        ]
      }
  • 22. EMBEDDED OBJECTS
      Great for read performance
      One seek to load the entire document
      One round trip to the database
      Writes can be slow if constantly adding to objects
  • 23. LINKED DOCUMENTS What advantages does this approach have? When should they be used?
  • 24. LINKED DOCUMENTS
      > db.books.findOne()
      {
        _id: 1,
        title: "The Great Gatsby",
        slug: "9781857150193-the-great-gatsby",
        author: 1,
        available: true,
        isbn: "9781857150193",
        pages: 176,
        publisher: {
          publisher_name: "Everyman's Library",
          date: ISODate("1991-09-19T00:00:00Z"),
          publisher_city: "London"
        },
        subjects: ["Love stories", "1920s", "Jazz Age"],
        language: "English",
        reviews: [
          { user: 1, text: "One of the best…" },
          { user: 2, text: "It's hard to…" }
        ]
      }
  • 25. LINKED DOCUMENTS
      More, smaller documents
      Can make queries by ID very simple
      Accessing linked document data requires an extra read
      What effect does this have on the system?
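The trade-off between embedded objects and linked documents can be sketched without a database. The following is a minimal illustration, assuming two in-memory `Map` objects stand in for the `books` and `authors` collections and that every lookup by `_id` counts as one read; the `findById` helper and the `reads` counter are hypothetical names, not part of MongoDB.

```javascript
// In-memory stand-ins for two collections (an assumption for illustration).
const authors = new Map([
  [1, { _id: 1, first_name: "F. Scott", last_name: "Fitzgerald" }],
]);
const books = new Map([
  [1, {
    _id: 1,
    title: "The Great Gatsby",
    author: 1, // linked: only the author's _id is stored
    publisher: { name: "Everyman's Library", city: "London" }, // embedded
  }],
]);

// Hypothetical helper: each call models one read against a collection.
let reads = 0;
function findById(collection, id) {
  reads += 1;
  return collection.get(id);
}

// Embedded data arrives together with the book: one read.
const book = findById(books, 1);
const publisherName = book.publisher.name;
const readsForEmbedded = reads;

// Linked data needs a second read against the other collection.
const author = findById(authors, book.author);
const readsForLinked = reads;

console.log(publisherName, readsForEmbedded);  // Everyman's Library 1
console.log(author.last_name, readsForLinked); // Fitzgerald 2
```

The counter only models the number of lookups; real costs also depend on document size, indexes, and network round trips.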
  • 32. ARRAY OF SCALARS
      > db.books.findOne()
      {
        _id: 1,
        title: "The Great Gatsby",
        slug: "9781857150193-the-great-gatsby",
        author: 1,
        available: true,
        isbn: "9781857150193",
        pages: 176,
        publisher: {
          name: "Everyman's Library",
          date: ISODate("1991-09-19T00:00:00Z"),
          city: "London"
        },
        subjects: ["Love stories", "1920s", "Jazz Age"],
        language: "English",
        reviews: [
          { user: 1, text: "One of the best…" },
          { user: 2, text: "It's hard to…" }
        ]
      }
  • 33. ARRAY OF OBJECTS
      > db.books.findOne()
      {
        _id: 1,
        title: "The Great Gatsby",
        slug: "9781857150193-the-great-gatsby",
        author: 1,
        available: true,
        isbn: "9781857150193",
        pages: 176,
        publisher: {
          name: "Everyman's Library",
          date: ISODate("1991-09-19T00:00:00Z"),
          city: "London"
        },
        subjects: ["Love stories", "1920s", "Jazz Age"],
        language: "English",
        reviews: [
          { user: 1, text: "One of the best…" },
          { user: 2, text: "It's hard to…" }
        ]
      }
  • 34. EXERCISE #1 Design a schema for users and their book reviews
      Users: username (string), email (string)
      Reviews: text (string), rating (integer), created_at (date)
      Usernames are immutable
  • 35. EXERCISE #1: SOLUTION A Reviews may be queried by user or book / bues(n ouetprue) /d.sr oedcmn e sr { _d betd"",   i:OjcI(…)   uenm:"o"   srae bb,   eal bbeapecm   mi:"o@xml.o" } / brves(n ouetprrve) /d.eiw oedcmn e eiw { _d betd"",   i:OjcI(…)   ue:OjcI(…)   sr betd"",   bo:OjcI(…)   ok betd"",   rtn:5   aig ,   tx:"hsbo seclet"   et Ti oki xeln!,   cetda:IOae"021‐02:40.9Z)   rae_t SDt(21‐01T11:706" }
  • 36. EXERCISE #1: SOLUTION B Optimized to retrieve reviews by user / bues(n ouetprue ihalrves /d.sr oedcmn e srwt l eiw) { _d betd"",   i:OjcI(…)   uenm:"o"   srae bb,   eal bbeapecm,   mi:"o@xml.o"   rves    eiw:[     { bo:OjcI(…)       ok betd"",       rtn:5       aig ,       tx:"hsbo seclet"       et Ti oki xeln!,       cetda:IOae"021‐02:40.9Z)       rae_t SDt(21‐01T11:706"     }        ]    }
  • 37. EXERCISE #1: SOLUTION C Optimized to retrieve reviews by book
      // db.users (one document per user)
      {
        _id: ObjectId("…"),
        username: "bob",
        email: "bob@example.com"
      }
      // db.books (one document per book with all reviews)
      {
        _id: ObjectId("…"),
        // Other book fields…
        reviews: [
          {
            user: ObjectId("…"),
            rating: 5,
            text: "This book is excellent!",
            created_at: ISODate("2012-10-10T21:14:07.096Z")
          }
        ]
      }
  • 38. SCHEMA DESIGN OVER AN APPLICATION'S LIFETIME Development Production Iterative Modifications
  • 39. DEVELOPMENT PHASE Basic CRUD functionality
  • 42. READS AND INDEXING Examine the query after creating an index.
      > db.books.ensureIndex({ "slug": 1 })
      > db.books.find({ "slug": "the-great-gatsby" }).explain()
      {
        "cursor": "BtreeCursor slug_1",
        "isMultiKey": false,
        "n": 1,
        "nscannedObjects": 1,
        "nscanned": 1,
        "scanAndOrder": false,
        "indexOnly": false,
        "nYields": 0,
        "nChunkSkips": 0,
        "millis": 0,
        // Other fields follow…
      }
  • 43. MULTI-KEY INDEXES Index all values in an array field.
      > db.books.ensureIndex({ "subjects": 1 });
  • 44. INDEXING EMBEDDED FIELDS Index an embedded object's field.
      > db.books.ensureIndex({ "publisher.name": 1 })
  • 45. QUERY OPERATORS
    Conditional operators: $gt, $gte, $lt, $lte, $ne, $all, $in, $nin, $size, $and, $or, $nor, $mod, $type, $exists
    Regular expressions
    Value in an array: $elemMatch
    Cursor methods and modifiers: count(), limit(), skip(), snapshot(), sort(), batchSize(), explain(), hint()
  • 47. ATOMIC MODIFIERS Update specific fields within a document
    $set, $unset
    $push, $pushAll
    $addToSet, $pop
    $pull, $pullAll
    $rename
    $bit
  • 49. PRODUCTION PHASE Evolve schema to meet the application's read and write patterns
  • 50. READ USAGE Finding books by an author's first name
      authors = db.authors.find({ first_name: /^f.*/i }, { _id: 1 });
      authorIds = authors.map(function(x) { return x._id; });
      db.books.find({ author: { $in: authorIds } });
  • 51. READ USAGE "Cache" the author name in an embedded document
      > db.books.findOne()
      {
        _id: 1,
        title: "The Great Gatsby",
        author: {
          first_name: "F. Scott",
          last_name: "Fitzgerald"
        }
        // Other fields follow…
      }
    Queries are now one step (dotted field names must be quoted)
      > db.books.find({ "author.first_name": /^f.*/i })
  • 52. WRITE USAGE Users can review a book
      review = {
        user: 1,
        text: "I thought this book was great!",
        rating: 5
      };
      > db.books.update(
        { _id: 3 },
        { $push: { reviews: review } }
      );
    Document size limit (16MB)
    Storage fragmentation after many updates/deletes
  • 53. EXERCISE #2 Display the 10 most recent reviews by a user Make efficient use of memory and disk seeks
  • 54. EXERCISE #2: SOLUTION Store users' reviews in monthly buckets
      // db.reviews (one document per user per month)
      {
        _id: "bob-2012-10",
        reviews: [
          {
            _id: ObjectId("…"),
            rating: 5,
            text: "This book is excellent!",
            created_at: ISODate("2012-10-10T21:14:07.096Z")
          },
          {
            _id: ObjectId("…"),
            rating: 2,
            text: "I didn't really enjoy this book.",
            created_at: ISODate("2012-10-11T20:12:50.594Z")
          }
        ]
      }
  • 55. EXERCISE #2: SOLUTION Adding a new review to the appropriate bucket
      myReview = {
        _id: ObjectId("…"),
        rating: 3,
        text: "An average read.",
        created_at: ISODate("2012-10-13T12:26:11.502Z")
      };
      > db.reviews.update(
        { _id: "bob-2012-10" },
        { $push: { reviews: myReview } }
      );
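The bucket `_id` in the update above has to be derived from the username and the review's date. A minimal sketch of that derivation follows, written as plain JavaScript so it runs without a database. The helper names `reviewBucketId` and `buildAddReviewOp` are hypothetical, and the use of UTC for the month boundary is an assumption; the slides show neither.

```javascript
// Hypothetical helper (not from the slides): builds a
// "<username>-<YYYY>-<MM>" bucket _id from a review date, using UTC.
function reviewBucketId(username, createdAt) {
  var year = createdAt.getUTCFullYear();
  var month = String(createdAt.getUTCMonth() + 1);
  if (month.length < 2) {
    month = "0" + month; // zero-pad so string order matches date order
  }
  return username + "-" + year + "-" + month;
}

// Builds the query and update documents for pushing a review
// into the bucket that matches its created_at month.
function buildAddReviewOp(username, review) {
  return {
    query: { _id: reviewBucketId(username, review.created_at) },
    update: { $push: { reviews: review } }
  };
}

var op = buildAddReviewOp("bob", {
  rating: 3,
  text: "An average read.",
  created_at: new Date("2012-10-13T12:26:11.502Z")
});
console.log(JSON.stringify(op.query));
```

A caller might pass `op.query` and `op.update` to `db.reviews.update()` with the upsert flag so that a month's first review creates its bucket; that choice is an assumption and is not shown on the slide.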
  • 56. EXERCISE #2: SOLUTION Display the 10 most recent reviews by a user
      // $push appends, so the newest reviews sit at the end of each bucket
      cursor = db.reviews.find(
        { _id: /^bob-/ },
        { reviews: { $slice: -10 } }
      ).sort({ _id: -1 });
      num = 0;
      while (cursor.hasNext() && num < 10) {
        doc = cursor.next();
        for (var i = doc.reviews.length - 1; i >= 0 && num < 10; --i, ++num) {
          printjson(doc.reviews[i]);
        }
      }
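The traversal order is the part that is easy to get wrong: buckets are visited newest month first, and within a bucket the newest reviews are the ones appended last. The sketch below reproduces that walk over plain in-memory arrays. The function name `mostRecentReviews` is hypothetical, and it assumes each bucket's `reviews` array is in insertion (chronological) order.

```javascript
// In-memory stand-in for the bucket walk (hypothetical helper).
// Assumes bucket _ids sort lexicographically by month and that each
// bucket's reviews were appended in chronological order.
function mostRecentReviews(buckets, limit) {
  // Newest bucket first, mirroring sort({ _id: -1 }).
  var sorted = buckets.slice().sort(function (a, b) {
    if (a._id < b._id) { return 1; }
    if (a._id > b._id) { return -1; }
    return 0;
  });
  var result = [];
  for (var b = 0; b < sorted.length && result.length < limit; b++) {
    var reviews = sorted[b].reviews;
    // Walk each bucket from its end, where the newest reviews are.
    for (var i = reviews.length - 1; i >= 0 && result.length < limit; i--) {
      result.push(reviews[i]);
    }
  }
  return result;
}

var sampleBuckets = [
  { _id: "bob-2012-09", reviews: [{ text: "a" }, { text: "b" }] },
  { _id: "bob-2012-10", reviews: [{ text: "c" }, { text: "d" }] }
];
var recent = mostRecentReviews(sampleBuckets, 3).map(function (r) {
  return r.text;
});
console.log(recent.join(","));
```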
  • 57. EXERCISE #2: SOLUTION Deleting a review
      > db.reviews.update(
        { _id: "bob-2012-10" },
        { $pull: { reviews: { _id: ObjectId("…") } } }
      );
  • 59. ALLOW USERS TO BROWSE BY BOOK SUBJECT
      > db.subjects.findOne()
      {
        _id: 1,
        name: "American Literature",
        sub_category: {
          name: "1920s",
          sub_category: { name: "Jazz Age" }
        }
      }
    How can you search this collection?
    Be aware of document size limitations
    Benefit from hierarchy being in same document
  • 60. TREE STRUCTURES
      > db.subjects.find()
      { _id: "American Literature" }
      { _id: "1920s",
        ancestors: ["American Literature"],
        parent: "American Literature"
      }
      { _id: "Jazz Age",
        ancestors: ["American Literature", "1920s"],
        parent: "1920s"
      }
      { _id: "Jazz Age in New York",
        ancestors: ["American Literature", "1920s", "Jazz Age"],
        parent: "Jazz Age"
      }
  • 61. TREE STRUCTURES Find sub-categories of a given subject
      > db.subjects.find({ ancestors: "1920s" })
      {
        _id: "Jazz Age",
        ancestors: ["American Literature", "1920s"],
        parent: "1920s"
      }
      {
        _id: "Jazz Age in New York",
        ancestors: ["American Literature", "1920s", "Jazz Age"],
        parent: "Jazz Age"
      }
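The `ancestors` array is what makes the sub-category query a single lookup, so it has to be maintained whenever a subject is added. One way to do that is to copy the parent's ancestors and append the parent itself. This is a sketch over plain objects; `makeChildSubject` and `findDescendants` are hypothetical names, and `findDescendants` only imitates what `find({ ancestors: ... })` would match.

```javascript
// Hypothetical helper: derive a child subject document from its parent.
function makeChildSubject(parentDoc, childId) {
  var parentAncestors = parentDoc.ancestors || []; // the root has none
  return {
    _id: childId,
    ancestors: parentAncestors.concat([parentDoc._id]),
    parent: parentDoc._id
  };
}

// In-memory stand-in for db.subjects.find({ ancestors: subjectId }).
function findDescendants(subjects, subjectId) {
  return subjects.filter(function (doc) {
    return (doc.ancestors || []).indexOf(subjectId) !== -1;
  });
}

var root = { _id: "American Literature" };
var twenties = makeChildSubject(root, "1920s");
var jazzAge = makeChildSubject(twenties, "Jazz Age");
var jazzNy = makeChildSubject(jazzAge, "Jazz Age in New York");

var allSubjects = [root, twenties, jazzAge, jazzNy];
var under1920s = findDescendants(allSubjects, "1920s").map(function (d) {
  return d._id;
});
console.log(under1920s.join(" | "));
```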
  • 62. EXERCISE #3 Allow users to borrow library books
    User sends a loan request
    Library approves or not
    Requests time out after seven days
    Approval process is asynchronous
    Requests may be prioritized
  • 63. EXERCISE #3: SOLUTION
    Need to maintain order and state
    Ensure that updates are atomic
      // Create a new loan request
      > db.loans.insert({
        _id: { borrower: "bob", book: ObjectId("…") },
        pending: false,
        approved: false,
        priority: 1
      });
      // Find the highest priority request and mark as pending approval
      request = db.loans.findAndModify({
        query: { pending: false },
        sort: { priority: -1 },
        update: { $set: { pending: true, started: new ISODate() } },
        new: true
      });
  • 64. EXERCISE #3: SOLUTION
    Updated and added fields
    Modified document was returned
      {
        _id: { borrower: "bob", book: ObjectId("…") },
        pending: true,
        approved: false,
        priority: 1,
        started: ISODate("2012-10-11T22:09:42.542Z")
      }
  • 65. EXERCISE #3: SOLUTION
      // Library approves the loan request
      > db.loans.update(
        { _id: { borrower: "bob", book: ObjectId("…") } },
        { $set: { pending: false, approved: true } }
      );
  • 66. EXERCISE #3: SOLUTION
      // Request times out after seven days
      limit = new Date();
      limit.setDate(limit.getDate() - 7);
      > db.loans.update(
        { pending: true, started: { $lt: limit } },
        { $set: { pending: false, approved: false } }
      );
  • 67. EXERCISE #4 Allow users to recommend books
    Users can recommend each book only once
    Display a book's current recommendations
  • 68. EXERCISE #4: SOLUTION
      // db.recommendations (one document per user per book)
      > db.recommendations.insert({
        book: ObjectId("…"),
        user: ObjectId("…")
      });
      // Unique index ensures users can't recommend twice
      > db.recommendations.ensureIndex(
        { book: 1, user: 1 },
        { unique: true }
      );
      // Count the number of recommendations for a book
      > db.recommendations.count({ book: ObjectId("…") });
  • 69. EXERCISE #4: SOLUTION
    MongoDB's B-tree indexes do not store counts
    Counts are computed via index scans
    Denormalize totals on books
      > db.books.update(
        { _id: ObjectId("…") },
        { $inc: { recommendations: 1 } }
      );
  • 70. COMMON DESIGN PATTERNS
  • 71. ONE-TO-ONE RELATIONSHIP Let's pretend that authors only write one book.
  • 72. LINKING Either side, or both, can track the relationship.
      > db.books.findOne()
      {
        _id: 1,
        title: "The Great Gatsby",
        slug: "9781857150193-the-great-gatsby",
        author: 1,
        // Other fields follow…
      }
      > db.authors.findOne({ _id: 1 })
      {
        _id: 1,
        first_name: "F. Scott",
        last_name: "Fitzgerald",
        book: 1
      }
  • 73. EMBEDDED OBJECT
      > db.books.findOne()
      {
        _id: 1,
        title: "The Great Gatsby",
        slug: "9781857150193-the-great-gatsby",
        author: {
          first_name: "F. Scott",
          last_name: "Fitzgerald"
        }
        // Other fields follow…
      }
  • 74. ONE-TO-MANY RELATIONSHIP In reality, authors may write multiple books.
  • 75. ARRAY OF IDs The "one" side tracks the relationship.
      > db.authors.findOne()
      {
        _id: 1,
        first_name: "F. Scott",
        last_name: "Fitzgerald",
        books: [1, 3, 20]
      }
    Flexible and space-efficient
    Additional query needed for non-ID lookups
  • 76. SINGLE FIELD WITH ID The "many" side tracks the relationship.
      > db.books.find({ author: 1 })
      {
        _id: 1,
        title: "The Great Gatsby",
        slug: "9781857150193-the-great-gatsby",
        author: 1,
        // Other fields follow…
      }
      {
        _id: 3,
        title: "This Side of Paradise",
        slug: "9780679447238-this-side-of-paradise",
        author: 1,
        // Other fields follow…
      }
  • 77. ARRAY OF OBJECTS
      > db.authors.findOne()
      {
        _id: 1,
        first_name: "F. Scott",
        last_name: "Fitzgerald",
        books: [
          { _id: 1, title: "The Great Gatsby" },
          { _id: 3, title: "This Side of Paradise" }
        ]
        // Other fields follow…
      }
    Use the $slice operator to return a subset of books
  • 78. MANY-TO-MANY RELATIONSHIP Some books may also have co-authors.
  • 79. ARRAY OF IDs ON BOTH SIDES
      > db.books.findOne()
      {
        _id: 1,
        title: "The Great Gatsby",
        authors: [1, 5]
        // Other fields follow…
      }
      > db.authors.findOne()
      {
        _id: 1,
        first_name: "F. Scott",
        last_name: "Fitzgerald",
        books: [1, 3, 20]
      }
  • 80. ARRAY OF IDs ON BOTH SIDES
    Query for all books by a given author
      > db.books.find({ authors: 1 });
    Query for all authors of a given book
      > db.authors.find({ books: 1 });
  • 81. ARRAY OF IDs ON ONE SIDE
      > db.books.findOne()
      {
        _id: 1,
        title: "The Great Gatsby",
        authors: [1, 5]
        // Other fields follow…
      }
      > db.authors.find({ _id: { $in: [1, 5] } })
      {
        _id: 1,
        first_name: "F. Scott",
        last_name: "Fitzgerald"
      }
      {
        _id: 5,
        first_name: "Unknown",
        last_name: "Co-author"
      }
  • 82. ARRAY OF IDs ON ONE SIDE
    Query for all books by a given author
      > db.books.find({ authors: 1 });
    Query for all authors of a given book
      book = db.books.findOne(
        { title: "The Great Gatsby" },
        { authors: 1 }
      );
      db.authors.find({ _id: { $in: book.authors } });
  • 83. EXERCISE #5 Tracking time series data
    Graph recommendations per unit of time
    Count by: day, hour, minute
  • 84. EXERCISE #5: SOLUTION A
      // db.rec_ts (time series buckets, hour and minute sub-docs)
      > db.rec_ts.insert({
        book: ObjectId("…"),
        day: ISODate("2012-10-11T00:00:00.000Z"),
        total: 0,
        hour: { "0": 0, "1": 0, /* … */ "23": 0 },
        minute: { "0": 0, "1": 0, /* … */ "1439": 0 }
      });
      // Record a recommendation created one minute before midnight
      > db.rec_ts.update(
        { book: ObjectId("…"), day: ISODate("2012-10-11T00:00:00.000Z") },
        { $inc: { total: 1, "hour.23": 1, "minute.1439": 1 } }
      );
  • 85. BSON STORAGE
    Sequence of key/value pairs
    Not a hash map
    Optimized to scan quickly
    minute: [0][1]…[1439]
    What is the cost of updating the minute before midnight?
  • 86. BSON STORAGE We can skip sub-documents
    hour 0: [0][1]…[59]
    …
    hour 23: [1380]…[1439]
    How could this change the schema?
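A rough way to compare the two layouts is to count how many sibling keys the scanner has to step past before it reaches the minute before midnight. The model below is deliberately simplified and is an assumption for illustration, not a measurement of MongoDB: it charges one step per key or sub-document skipped and ignores everything else. The function names are hypothetical.

```javascript
// Simplified cost model (an assumption, not a benchmark): reaching a key
// means stepping past every sibling key stored before it.

// Flat layout: one "minute" sub-document with keys "0" through "1439".
function flatSkips(minuteOfDay) {
  return minuteOfDay;
}

// Nested layout: 24 hour sub-documents, each holding that hour's 60 minutes.
// Whole hour sub-documents can be skipped in one step each.
function nestedSkips(minuteOfDay) {
  var hour = Math.floor(minuteOfDay / 60);
  var minuteInHour = minuteOfDay % 60;
  return hour + minuteInHour;
}

var lastMinute = 23 * 60 + 59; // 1439, one minute before midnight
console.log("flat:", flatSkips(lastMinute), "nested:", nestedSkips(lastMinute));
```

Under this model the worst case drops from 1439 skipped keys to 23 skipped hours plus 59 skipped minutes.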
  • 87. EXERCISE #5: SOLUTION B
      // db.rec_ts (time series buckets, each hour a sub-doc)
      > db.rec_ts.insert({
        book: ObjectId("…"),
        day: ISODate("2012-10-11T00:00:00.000Z"),
        total: 148,
        hour: {
          "0": { total: 7, "0": 0, /* … */ "59": 2 },
          "1": { total: 3, "60": 1, /* … */ "119": 0 },
          // Other hours…
          "23": { total: 12, "1380": 0, /* … */ "1439": 3 }
        }
      });
      // Record a recommendation created one minute before midnight
      > db.rec_ts.update(
        { book: ObjectId("…"), day: ISODate("2012-10-11T00:00:00.000Z") },
        { $inc: { total: 1, "hour.23.total": 1, "hour.23.1439": 1 } }
      );
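The dotted `$inc` keys in Solution B depend on the event's timestamp, so an application would normally compute them. A sketch of that computation follows, assuming UTC day boundaries and the key layout shown on the slide (minute keys are minute-of-day, not minute-of-hour). The helper names `dayBucket` and `buildRecommendationInc` are hypothetical.

```javascript
// Hypothetical helper: midnight (UTC) of the day containing createdAt,
// used as the bucket's "day" value.
function dayBucket(createdAt) {
  return new Date(Date.UTC(
    createdAt.getUTCFullYear(),
    createdAt.getUTCMonth(),
    createdAt.getUTCDate()
  ));
}

// Hypothetical helper: builds the $inc document for one recommendation.
function buildRecommendationInc(createdAt) {
  var hour = createdAt.getUTCHours();
  var minuteOfDay = hour * 60 + createdAt.getUTCMinutes();
  var inc = { total: 1 };
  inc["hour." + hour + ".total"] = 1;
  inc["hour." + hour + "." + minuteOfDay] = 1;
  return { $inc: inc };
}

var when = new Date("2012-10-11T23:59:30.000Z");
console.log(dayBucket(when).toISOString());
console.log(JSON.stringify(buildRecommendationInc(when)));
```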
  • 88. SINGLE-COLLECTION INHERITANCE Take advantage of MongoDB's features
    Documents need not all have the same fields
    Sparsely index only present fields
  • 89. SCHEMA FLEXIBILITY
      > db.books.findOne()
      {
        _id: 47,
        title: "The Wizard Chase",
        type: "series",
        series_title: "The Wizard's Trilogy",
        volume: 2
        // Other fields follow…
      }
    Find all books that are part of a series
      > db.books.find({ type: "series" });
      > db.books.find({ series_title: { $exists: true } });
      > db.books.find({ volume: { $gt: 0 } });
  • 90. INDEX ONLY PRESENT FIELDS Documents without these fields will not be indexed.
      > db.books.ensureIndex({ series_title: 1 }, { sparse: true })
      > db.books.ensureIndex({ volume: 1 }, { sparse: true })
  • 91. EXERCISE #6 Users can recommend at most 10 books
  • 93. EXERCISE #6: SOLUTION
    One fewer unassigned recommendation remaining
    Newly recommended book is now linked
      > db.user_recs.findOne()
      {
        _id: "bob",
        remaining: 7,
        books: [3, 10, 4]
      }
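The slide with the update itself is not part of this transcript, so what follows is only a guess at one way to reach the document above: a conditional update that matches only while `remaining` is positive and the book is not already listed, then decrements the counter and pushes the book in the same atomic operation. The function names, the `$ne` guard against duplicates, and the starting state of 8 remaining are all assumptions.

```javascript
// Hypothetical reconstruction; field names follow the db.user_recs
// document shown on the slide, everything else is assumed.
function buildRecommendOp(username, bookId) {
  return {
    // Matches only if the user has recommendations left and has not
    // already recommended this book.
    query: { _id: username, remaining: { $gt: 0 }, books: { $ne: bookId } },
    update: { $inc: { remaining: -1 }, $push: { books: bookId } }
  };
}

// In-memory stand-in for applying that update to a single document.
// Returns the updated copy, or null when the query would not match.
function applyRecommend(doc, bookId) {
  if (doc.remaining <= 0 || doc.books.indexOf(bookId) !== -1) {
    return null;
  }
  return {
    _id: doc._id,
    remaining: doc.remaining - 1,
    books: doc.books.concat([bookId])
  };
}

var before = { _id: "bob", remaining: 8, books: [3, 10] }; // assumed state
var after = applyRecommend(before, 4);
console.log(JSON.stringify(after));
```

Putting the limit check in the query rather than in application code keeps the check and the write in one step, so two concurrent requests cannot both spend the last remaining recommendation.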
  • 94. EXERCISE #7 Statistic buckets
    Each book has a listing page in our application
    Record referring website domains for each book
    Count each domain independently
  • 95. EXERCISE #7: SOLUTION A
      > db.book_refs.findOne()
      {
        book: 1,
        referrers: [
          { domain: "google.com", count: 4 },
          { domain: "yahoo.com", count: 1 }
        ]
      }
      > db.book_refs.update(
        { book: 1, "referrers.domain": "google.com" },
        { $inc: { "referrers.$.count": 1 } }
      );
  • 96. EXERCISE #7: SOLUTION A Update the position of the first matched element.
      > db.book_refs.update(
        { book: 1, "referrers.domain": "google.com" },
        { $inc: { "referrers.$.count": 1 } }
      );
      > db.book_refs.findOne()
      {
        book: 1,
        referrers: [
          { domain: "google.com", count: 5 },
          { domain: "yahoo.com", count: 1 }
        ]
      }
    What if a new referring website is used?
  • 97. EXERCISE #7: SOLUTION B
      > db.book_refs.findOne()
      {
        book: 1,
        referrers: {
          "google_com": 5,
          "yahoo_com": 1
        }
      }
      > db.book_refs.update(
        { book: 1 },
        { $inc: { "referrers.bing_com": 1 } },
        true
      );
    Replace dots with underscores for key names
    Increment to add a new referring website
    Upsert in case this is the book's first referrer
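Because a dot in an update path means "descend into a sub-document", the domain has to be rewritten before it can serve as a key. A small sketch of that step and of assembling the upsert arguments is below. The names `referrerKey` and `buildReferrerOp` are hypothetical, and the sketch does not handle domains that already contain underscores, which would become ambiguous.

```javascript
// Hypothetical helper: turn a domain into a key that is safe to use in a
// dotted update path ("bing.com" becomes "bing_com").
function referrerKey(domain) {
  return domain.replace(/\./g, "_");
}

// Hypothetical helper: the three arguments of the update on the slide
// (query, update, upsert flag) for a given book and referring domain.
function buildReferrerOp(bookId, domain) {
  var inc = {};
  inc["referrers." + referrerKey(domain)] = 1;
  return {
    query: { book: bookId },
    update: { $inc: inc },
    upsert: true
  };
}

var refOp = buildReferrerOp(1, "bing.com");
console.log(JSON.stringify(refOp.update));
```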
  • 99. SHARDING
    Ad-hoc partitioning
    Consistent hashing: Amazon DynamoDB
    Range based partitioning: Google BigTable, Yahoo! PNUTS, MongoDB
  • 100. SHARDING IN MONGODB
    Automated management
    Range based partitioning
    Convert to sharded system with no downtime
    Fully consistent
  • 102. SHARDING DATA BY CHUNKS
      > db.books.save({ _id: 35, title: "Call of the Wild" });
      > db.books.save({ _id: 40, title: "Tropic of Cancer" });
      > db.books.save({ _id: 45, title: "The Jungle" });
      > db.books.save({ _id: 50, title: "Of Mice and Men" });
    [−∞, +∞) → [−∞, 40) [40, +∞) → [−∞, 40) [40, 50) [50, +∞)
    Ranges are split into chunks as data is inserted
  • 103. ADDING NEW SHARDS
    Shards: shard1
    Chunks: [−∞, 40) [40, 50) [50, 60) [60, +∞)
  • 104. ADDING NEW SHARDS
      > db.runCommand({ addshard: "shard2.example.com" });
    Shards: shard1, shard2
    Chunks: [−∞, 40) [40, 50) [50, 60) [60, +∞)
    Chunks are migrated to balance shards
  • 105. ADDING NEW SHARDS
      > db.runCommand({ addshard: "shard3.example.com" });
    Shards: shard1, shard2, shard3
    Chunks: [−∞, 40) [40, 50) [50, 60) [60, +∞)
  • 108. SHARDING COMPONENTS
    mongos
    Config servers
    Shards: mongod, replica sets
  • 109. SHARDED WRITES
    Inserts: shard key required; routed
    Updates and removes: shard key optional; may be routed or scattered
  • 110. SHARDED READS
    Queries. By shard key: routed. Without shard key: scatter/gather
    Sorted queries. By shard key: routed in order. Without shard key: distributed merge sort
  • 111. EXERCISE #8 Users can upload images for books
    images: image_id: ???, data: binary
    The collection will be sharded by image_id.
    What should image_id be?
  • 112. EXERCISE #8: SOLUTIONS What's the best shard key for our use case?
    Auto-increment (ObjectId)
    MD5 of data
    Time (e.g. month) and MD5
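The third option combines a coarse time prefix with a content hash. A sketch of how such a key could be built is shown below using Node's built-in `crypto` module. The function name `imageId`, the `YYYY-MM-<md5>` string layout, and the use of UTC are assumptions; the slides only name the ingredients.

```javascript
var crypto = require("crypto");

// Hypothetical helper: "<YYYY>-<MM>-<md5 of data>" as a shard key value.
// The month prefix groups images uploaded around the same time, while the
// MD5 suffix spreads writes within that month across the key range.
function imageId(data, uploadedAt) {
  var year = uploadedAt.getUTCFullYear();
  var month = String(uploadedAt.getUTCMonth() + 1);
  if (month.length < 2) {
    month = "0" + month;
  }
  var md5 = crypto.createHash("md5").update(data).digest("hex");
  return year + "-" + month + "-" + md5;
}

var sampleId = imageId(Buffer.from("fake image bytes"), new Date("2012-10-11T08:00:00.000Z"));
console.log(sampleId);
```

Under these assumptions, a purely auto-incrementing key would send every new insert to the chunk holding the highest range, and a bare MD5 would scatter recently uploaded images across all chunks; the combined key sits between the two.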
  • 116. SUMMARY Schema design is different in MongoDB. Basic data design principles apply. It's about your application. It's about your data and how it's used. It's about the entire lifetime of your application.