So you want to liberate your data?
Published in: Technology
  1. d60 (developing smart software solutions): So you want to liberate your data? April 2012
  2. Mogens Heller Grabe • mhg@d60.dk • @mookid8000 • http://mookid.dk/oncode
  3. Agenda • Data, queries, etc. • Concurrency • Aggregation • Deployment • Durability • Things to be aware of
  4. MongoDB • Document database • Currently in v. 2.0.4 • Developed by 10gen • Open source – server is GNU AGPL v3 – official clients are Apache v2 • Absolutely free to use – a commercial version of the db is available, though, with support, SSL, and more security features
  5. Conceptual data organization • process → database → collection → document • process → database → table → row
  6. Data
  7. Example 1 • Install • Mongo Shell • Show database contents • Add and show a document
  8. Queries, including several other query operators: $gt, $gte, $lt, $lte, $exists, $all, etc.
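A query in the shell is itself a document, so the operators on this slide can be tried out without a server. The sketch below is a plain-JavaScript stand-in for a small subset of the matching rules (equality, $gt, $gte, $lt, $lte, $exists); it is not MongoDB's implementation, and the `matches` helper and the sample documents are made up for illustration.

```javascript
// Illustrative stand-in for a small subset of MongoDB's query matching.
const comparisons = {
  $gt: (value, bound) => value > bound,
  $gte: (value, bound) => value >= bound,
  $lt: (value, bound) => value < bound,
  $lte: (value, bound) => value <= bound,
};

// Returns true when every field condition in `query` holds for `doc`.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const isOperatorDoc = condition !== null && typeof condition === "object";
    if (!isOperatorDoc) return doc[field] === condition; // plain equality
    return Object.entries(condition).every(([op, arg]) => {
      if (op === "$exists") return (field in doc) === arg;
      return field in doc && comparisons[op](doc[field], arg);
    });
  });
}

// Hypothetical sample data.
const crew = [
  { name: "Ann", age: 25 },
  { name: "Bob", age: 29, email: "bob@example.com" },
  { name: "Carl", age: 70 },
];

// In the shell these would be passed to find(), e.g. db.crew.find({age: {$gte: 18, $lt: 65}}).
const workingAge = crew.filter((d) => matches(d, { age: { $gte: 18, $lt: 65 } }));
const noEmail = crew.filter((d) => matches(d, { email: { $exists: false } }));

console.log(workingAge.map((d) => d.name));
console.log(noEmail.map((d) => d.name));
```

Several operators on the same field are combined with AND, which is how a range such as 18 to 65 is expressed.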
  9. Indexes
  10. Updates, including several other update modifiers: $inc, $set, $addToSet, $rename, etc.
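To make the modifiers concrete, here is a minimal in-memory sketch of what $inc, $set, $addToSet and $rename do to a single document. It only imitates the behaviour for top-level fields; the `applyUpdate` helper and the sample document are assumptions for illustration, not part of MongoDB.

```javascript
// Illustrative stand-ins for four update modifiers (top-level fields only).
const modifiers = {
  $inc: (doc, field, by) => { doc[field] = (doc[field] || 0) + by; },
  $set: (doc, field, value) => { doc[field] = value; },
  $addToSet: (doc, field, value) => {
    const current = doc[field] || [];
    // Append only if the value is not already present.
    doc[field] = current.includes(value) ? current : [...current, value];
  },
  $rename: (doc, field, newName) => {
    if (field in doc) {
      doc[newName] = doc[field];
      delete doc[field];
    }
  },
};

// Returns a modified copy of `doc`; the update document has the same
// shape that would be passed as the second argument to update() in the shell.
function applyUpdate(doc, update) {
  const result = { ...doc };
  for (const [modifier, fields] of Object.entries(update)) {
    for (const [field, arg] of Object.entries(fields)) {
      modifiers[modifier](result, field, arg);
    }
  }
  return result;
}

// Hypothetical sample document and update.
const before = { _id: 1, visits: 1, tags: ["crew"], nick: "Ann" };
const after = applyUpdate(before, {
  $inc: { visits: 1 },
  $set: { planet: "Earth" },
  $addToSet: { tags: "delivery" },
  $rename: { nick: "nickname" },
});

console.log(after);
```

The point of the modifiers is that the change is described relative to the stored document, so the client does not have to read the document first.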
  11. Example 2 • Import some data • Query • Update • Index • Query
  12. ACID? • Atomic: Yeah well, per document. • Consistent: Yeah well, can be. • Isolated: Yeah well, per document. • Durable: Yeah well, can be – not default though...
  13. Concurrency • Pushing it down the stack
  14. Concurrency • Preserve invariants with update preconditions
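An update precondition means putting the invariant into the query part of the update, so that the check and the change happen as one per-document operation. The sketch below imitates that with an in-memory array; `updateIf`, `withdraw` and the account data are hypothetical, and the comments show the shape the same update would plausibly take in the shell.

```javascript
// Hypothetical in-memory stand-in for a collection; not the MongoDB API.
const accounts = [{ _id: 1, balance: 120 }];

// Applies `change` to the first document that satisfies `precondition`.
// Returns n, the number of documents updated (0 or 1).
function updateIf(collection, precondition, change) {
  const doc = collection.find(precondition);
  if (!doc) return 0;
  change(doc);
  return 1;
}

// Invariant to preserve: the balance never goes negative.
function withdraw(id, amount) {
  return updateIf(
    accounts,
    (d) => d._id === id && d.balance >= amount, // shell: {_id: id, balance: {$gte: amount}}
    (d) => { d.balance -= amount; }             // shell: {$inc: {balance: -amount}}
  );
}

const firstAttempt = withdraw(1, 100);  // precondition holds
const secondAttempt = withdraw(1, 100); // only 20 left, precondition fails

console.log(firstAttempt, secondAttempt, accounts[0].balance);
```

The caller learns from n whether the update went through, and nothing is written when the precondition does not hold.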
  15. Concurrency • Use optimistic locking when replacing document (and then check whether n is 0 or 1...)
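One common way to do optimistic locking is to keep a version number in the document and only replace it if the stored version is still the one that was read. The following is a sketch of that idea against an in-memory Map; the `version` field, `replaceIfUnchanged` and the data are assumptions for illustration, since the slide does not say which field the check uses.

```javascript
// Hypothetical in-memory collection keyed by _id; not the MongoDB API.
const store = new Map();
store.set(1, { _id: 1, version: 1, name: "Bob" });

// Replaces the stored document only if its version still equals the
// version the caller read. Returns n: 1 if replaced, 0 if someone else won.
// Shell equivalent of the filter: {_id: newDoc._id, version: readVersion}.
function replaceIfUnchanged(readVersion, newDoc) {
  const current = store.get(newDoc._id);
  if (!current || current.version !== readVersion) return 0;
  store.set(newDoc._id, { ...newDoc, version: readVersion + 1 });
  return 1;
}

// Two writers read the same version of the document...
const writerA = { ...store.get(1) };
const writerB = { ...store.get(1) };
writerA.name = "Alice";
writerB.name = "Carol";

// ...and both try to replace it. Only the first one succeeds.
const nA = replaceIfUnchanged(writerA.version, writerA);
const nB = replaceIfUnchanged(writerB.version, writerB);

console.log(nA, nB, store.get(1));
```

When n is 0 the writer has to re-read the document and retry or report a conflict; the stale write is never applied silently.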
  16. Concurrency • Use FindAndModify to “check out” documents
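"Checking out" a document means finding one that nobody has claimed and marking it as claimed in a single atomic step, so two workers never get the same one. The sketch below imitates this with an in-memory array (JavaScript's single thread stands in for the server-side atomicity); the job documents, the `status` and `owner` fields and the `checkOut` helper are hypothetical.

```javascript
// Hypothetical in-memory job queue; not the MongoDB API.
const jobs = [
  { _id: 1, status: "pending" },
  { _id: 2, status: "pending" },
];

// Finds one pending job, marks it as taken, and returns the modified copy.
// In the shell this would roughly correspond to:
//   db.jobs.findAndModify({
//     query: {status: "pending"},
//     update: {$set: {status: "processing", owner: worker}},
//     new: true
//   })
function checkOut(worker) {
  const job = jobs.find((j) => j.status === "pending");
  if (!job) return null; // nothing left to check out
  job.status = "processing";
  job.owner = worker;
  return { ...job };
}

const jobForA = checkOut("worker-a");
const jobForB = checkOut("worker-b");
const jobForC = checkOut("worker-c"); // queue is empty by now

console.log(jobForA, jobForB, jobForC);
```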
  17. Aggregation • Map/reduce
  18. Aggregation • Map/reduce – Map: for each document: emit 0 or more (key, value) tuples – Reduce: given a (key, value[]), return 1 value
  19. Aggregation
      m = function() {
          var doc = this;
          doc.appearances.forEach(function(a) {
              emit(a, {
                  count: 1,
                  names: [doc.firstName + " " + doc.lastName]
              });
          });
      }
      r = function(key, values) {
          var count = 0;
          var names = [];
          values.forEach(function(v) {
              count += v.count;
              names = names.concat(v.names);
          });
          return {count: count, names: names};
      }
  20. Example 3 • Use map/reduce to collect information on who appeared in each episode
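The map and reduce functions from slide 19 can be exercised without a server by supplying `emit` by hand. The harness below is a rough sketch of the flow (map every document, group emitted values by key, reduce each group); `runMapReduce` and the sample documents are made up for illustration, and a real server may additionally call reduce several times per key on partial results.

```javascript
let emit; // assigned by the harness below; MongoDB supplies emit() itself

// Map: emit one (episode, {count, names}) pair per appearance.
const m = function () {
  const doc = this;
  doc.appearances.forEach(function (a) {
    emit(a, { count: 1, names: [doc.firstName + " " + doc.lastName] });
  });
};

// Reduce: merge all values emitted for one episode into a single value.
const r = function (key, values) {
  let count = 0;
  let names = [];
  values.forEach(function (v) {
    count += v.count;
    names = names.concat(v.names);
  });
  return { count: count, names: names };
};

// Minimal stand-in for the map/reduce flow; not MongoDB's implementation.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  docs.forEach((doc) => map.call(doc)); // `this` is the current document
  const result = {};
  for (const [key, values] of groups) {
    // A key with a single value is passed through without calling reduce.
    result[key] = values.length === 1 ? values[0] : reduce(key, values);
  }
  return result;
}

// Hypothetical sample data.
const people = [
  { firstName: "Ann", lastName: "Smith", appearances: ["E01", "E02"] },
  { firstName: "Bob", lastName: "Jones", appearances: ["E01"] },
];

const appearancesByEpisode = runMapReduce(people, m, r);
console.log(appearancesByEpisode);
```

Note that the reduce function returns a value with the same shape as the values map emits. That is what makes it safe for reduce to be fed its own earlier output.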
  21. Aggregation • Aggregation framework (not available until 2.2) – declarative syntax for construction of an aggregation pipeline
  22. Aggregation • Aggregation framework (not available until 2.2)
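For comparison with the map/reduce version, a pipeline that counts appearances per episode could plausibly look like the `pipeline` array below: one $unwind stage followed by one $group stage. This is only a sketch under assumptions. The pipeline is shown as data and is not executed; the two helper functions spell out in plain JavaScript what those two stages do, and the sample documents and field names are hypothetical.

```javascript
// The declarative pipeline, as it might be passed to aggregate() in 2.2+.
const pipeline = [
  { $unwind: "$appearances" },
  { $group: { _id: "$appearances", count: { $sum: 1 }, names: { $push: "$firstName" } } },
];

// What $unwind does: one output document per array element.
const unwind = (docs, field) =>
  docs.flatMap((doc) => doc[field].map((item) => ({ ...doc, [field]: item })));

// What this particular $group stage does: count documents and collect
// first names per distinct value of `field`.
function groupByField(docs, field) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc[field];
    const group = groups.get(key) || { _id: key, count: 0, names: [] };
    group.count += 1;
    group.names.push(doc.firstName);
    groups.set(key, group);
  }
  return [...groups.values()];
}

// Hypothetical sample data.
const cast = [
  { firstName: "Ann", appearances: ["E01", "E02"] },
  { firstName: "Bob", appearances: ["E01"] },
];

const perEpisode = groupByField(unwind(cast, "appearances"), "appearances");

console.log(JSON.stringify(pipeline));
console.log(perEpisode);
```

The appeal of the declarative form is that each stage is a small document describing what to compute, rather than a JavaScript function the server has to run.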
  23. Deployment • Several configurations – we’ll check out replica sets and sharding
  24. Replica sets • Master-slave with automatic failover – Each mongod should be started with the --replSet argument – Additional nodes added from the shell – Make sure the number of nodes is odd, possibly by adding an arbiter
  25. Replica sets • Higher availability • Scale out reads • Backup without interfering with the primary
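As a rough illustration of the "odd number of nodes" advice, a replica set configuration document for two data-bearing nodes plus an arbiter might look like the object below. The set name `rs0` and the host names are placeholders; the set name has to match the value each mongod was started with via --replSet, and in the shell a document of this shape would be handed to rs.initiate().

```javascript
// Hypothetical replica set configuration; host names are placeholders.
const replicaSetConfig = {
  _id: "rs0", // must match the --replSet value of every member
  members: [
    { _id: 0, host: "db1.example.net:27017" },
    { _id: 1, host: "db2.example.net:27017" },
    // The arbiter holds no data; it only takes part in elections.
    { _id: 2, host: "arbiter.example.net:27017", arbiterOnly: true },
  ],
};

// An odd member count means an election cannot end in a tie.
function hasOddMemberCount(config) {
  return config.members.length % 2 === 1;
}

const dataNodeCount = replicaSetConfig.members.filter((m) => !m.arbiterOnly).length;

console.log(hasOddMemberCount(replicaSetConfig), dataNodeCount);
```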
  26. Sharding • Auto-sharding – happens by user-defined shard key – can be defined per collection – requires special nodes: mongos (the load balancer) and a mongod that is configured to be a configuration server
  27. Sharding • Scale out writes • Limitations: – Shard key is immutable – All inserts/updates must include the shard key – Cannot enforce (arbitrary) uniqueness across shards, only for shard key
  28. Sharding + replica sets
  29. MongoDB’s durability story • Memory-mapped files. • fsync. • Durability through replication – pre 1.8 • Durability through journaling – an option since 1.8 – replica sets still cool though – default since 2.0
  30. MongoDB’s durability story • Inserts and updates are unsafe by default!! – only purpose: get awesome benchmarks – bad: bites you in the a** • Exposed differently on drivers, but always maps to db.getLastError()
  31. MongoDB’s durability story • Conclusion: It’s cool that you can tweak it per operation, but it’s uncool that it’s unsafe.
  32. Things to be aware of • Safe mode off • 32/64 bit • Memory-mapped file • Global write lock • Indexes should always fit in RAM
  33. Thanks for listening! • mhg@d60.dk • @mookid8000 • http://mookid.dk/oncode
  34. Image credits • The world’s most interesting man: http://i.qkme.me/3mwy.jpg • Bison: http://www.flickr.com/photos/johan-gril/5632513228/ • Tired Fry: http://cdn.memegenerator.net/instances/400x/18731987.jpg • Thanks for letting me borrow your awesome images – if you ever meet me, I’ll buy you a beer. Seriously, I will.
