So you want to liberate your data?

Published in: Technology


  1. d60: developing smart software solutions • So you want to liberate your data? • April 2012
  2. Mogens Heller Grabe • mhg@d60.dk • @mookid8000 • http://mookid.dk/oncode
  3. Agenda • Data, queries, etc. • Concurrency • Aggregation • Deployment • Durability • Things to be aware of
  4. MongoDB • Document database • Currently in v. 2.0.4 • Developed by 10gen • Open source – server is GNU AGPL v3 – the official clients are Apache v2 • Absolutely free to use – you can get a commercial version of the db though – has support, SSL, and more security features
  5. Conceptual data organization • MongoDB: process → database → collection → document • Relational database: process → database → table → row
  6. Data
  7. Example 1 • Install • Mongo Shell • Show database contents • Add and show a document
  8. Queries • including several other query operators: $gt, $gte, $lt, $lte, $exists, $all, etc...
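
The operators on this slide go inside the query document, for example `db.people.find({age: {$gte: 18}})` in the shell (the collection name `people` is made up here). As a rough illustration of what a few of them mean, here is a small in-memory sketch in plain JavaScript. The `matches` helper and the sample documents are hypothetical, only top-level fields are handled, and this is not how the server evaluates queries.

```javascript
// In-memory illustration of a few MongoDB-style query operators.
// A teaching sketch only: top-level fields, no indexes, no type ordering.
const operators = {
  $gt: (field, arg) => field > arg,
  $gte: (field, arg) => field >= arg,
  $lt: (field, arg) => field < arg,
  $lte: (field, arg) => field <= arg,
  // {$exists: true} matches documents that have the field, false the rest
  $exists: (field, arg) => (field !== undefined) === arg,
  // $all: the array field must contain every listed value
  $all: (field, arg) => Array.isArray(field) && arg.every(v => field.includes(v)),
};

// Hypothetical helper: does `doc` satisfy `query`?
function matches(doc, query) {
  return Object.keys(query).every(name => {
    const condition = query[name];
    const value = doc[name];
    if (condition !== null && typeof condition === "object" && !Array.isArray(condition)) {
      // Operator document, e.g. {age: {$gte: 18, $lt: 65}}
      return Object.keys(condition).every(op => operators[op](value, condition[op]));
    }
    return value === condition; // plain equality, e.g. {name: "Ann"}
  });
}

// Made-up sample data
const people = [
  { name: "Ann", age: 34, tags: ["admin", "dev"] },
  { name: "Bob", age: 12, tags: ["dev"] },
  { name: "Cy", tags: [] },
];

const adults = people.filter(p => matches(p, { age: { $gte: 18 } }));
const noAge = people.filter(p => matches(p, { age: { $exists: false } }));
const adminDevs = people.filter(p => matches(p, { tags: { $all: ["admin", "dev"] } }));
```

Several operators on the same field are combined with AND, which is why `{age: {$gte: 18, $lt: 65}}` reads as a range.
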
  9. Indexes
  10. Updates • including several other update modifiers: $inc, $set, $addToSet, $rename, etc...
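
Update modifiers change parts of a document in place instead of replacing it, as in `db.posts.update({_id: 1}, {$inc: {views: 1}})` (the collection name `posts` is made up). The sketch below mimics four of them on a plain JavaScript object. `applyUpdate` is a hypothetical helper, it handles top-level fields only, and the real server has more rules (for instance around types) than are shown here.

```javascript
// In-memory illustration of a few MongoDB-style update modifiers.
// A teaching sketch: top-level fields only, scalar values in sets.
const modifiers = {
  // Add to a number, treating a missing field as 0
  $inc: (doc, field, amount) => { doc[field] = (doc[field] || 0) + amount; },
  // Overwrite (or create) a field
  $set: (doc, field, value) => { doc[field] = value; },
  // Append to an array only if the value is not already in it
  $addToSet: (doc, field, value) => {
    if (doc[field] === undefined) doc[field] = [];
    if (!doc[field].includes(value)) doc[field].push(value);
  },
  // Move a value from one field name to another
  $rename: (doc, field, newName) => {
    if (field in doc) {
      doc[newName] = doc[field];
      delete doc[field];
    }
  },
};

// Hypothetical helper: apply an update document to `doc` in place.
function applyUpdate(doc, update) {
  Object.keys(update).forEach(mod => {
    Object.keys(update[mod]).forEach(field => {
      modifiers[mod](doc, field, update[mod][field]);
    });
  });
  return doc;
}

// Made-up sample document with a misspelled field name
const post = applyUpdate(
  { _id: 1, views: 1, tags: ["db"], titel: "Hello" },
  {
    $inc: { views: 2 },
    $set: { author: "mhg" },
    $addToSet: { tags: "db" },
    $rename: { titel: "title" },
  }
);
```
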
  11. Example 2 • Import some data • Query • Update • Index • Query
  12. ACID? • Atomic: Yeah well, per document. • Consistent: Yeah well, can be. • Isolated: Yeah well, per document. • Durable: Yeah well, can be – not by default though...
  13. Concurrency • Pushing it down the stack
  14. Concurrency • Preserve invariants with update preconditions
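
An update precondition means putting the invariant into the query part of the update, so that the check and the change happen as one per-document operation, for example `db.accounts.update({_id: 1, balance: {$gte: 50}}, {$inc: {balance: -50}})`. The collection, the field names and the `updateIf` helper below are assumptions for illustration. The sketch runs in memory in single-threaded JavaScript, so it only shows the shape of the pattern, not real concurrency.

```javascript
// In-memory stand-in for a collection. In MongoDB the check and the
// modification are applied atomically per document on the server.
const accounts = [{ _id: 1, balance: 80 }];

// Hypothetical helper: apply `change` to the first document that
// satisfies `precondition`. Returns the number of documents modified.
function updateIf(collection, precondition, change) {
  const doc = collection.find(precondition);
  if (!doc) return 0;
  change(doc);
  return 1;
}

// Invariant to preserve: a balance never goes below zero.
function withdraw(id, amount) {
  return updateIf(
    accounts,
    doc => doc._id === id && doc.balance >= amount, // the precondition
    doc => { doc.balance -= amount; }               // like {$inc: {balance: -amount}}
  );
}

const first = withdraw(1, 50);  // balance goes from 80 to 30
const second = withdraw(1, 50); // precondition fails, nothing changes
```

A result of 0 tells the caller that the invariant would have been violated, without ever reading the balance into application code first.
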
  15. Concurrency • Use optimistic locking when replacing a document (and then check whether n is 0 or 1...)
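
One common way to do this is to keep a version counter in the document and include the version you read in the query of the replacing update. If somebody else got there first, nothing matches and n is 0. The sketch below is an in-memory illustration of that idea. The `version` field and the `replaceIfUnchanged` helper are assumptions, not something shown on the slide.

```javascript
// In-memory stand-in for a collection, keyed by _id.
const store = new Map([[1, { _id: 1, version: 3, title: "Draft" }]]);

// Hypothetical helper: replace the stored document only if its version
// is still the one the caller read. Returns n (0 or 1 documents replaced).
function replaceIfUnchanged(edited) {
  const current = store.get(edited._id);
  if (!current || current.version !== edited.version) return 0; // stale copy
  store.set(edited._id, Object.assign({}, edited, { version: edited.version + 1 }));
  return 1;
}

// Two clients read the same version of the document...
const copyA = Object.assign({}, store.get(1));
const copyB = Object.assign({}, store.get(1));
copyA.title = "Edited by A";
copyB.title = "Edited by B";

// ...and both try to write it back.
const nA = replaceIfUnchanged(copyA); // succeeds, stored version becomes 4
const nB = replaceIfUnchanged(copyB); // stale: still carries version 3
```

When n is 0 the losing client would typically reload the document, reapply its change and try again.
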
  16. Concurrency • Use FindAndModify to “check out” documents
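
Checking out a document means finding one that is free and marking it as taken in a single step, so that two workers never pick up the same one. The sketch below imitates that with an in-memory array. The `status` and `owner` fields, the job documents and this local `findAndModify` function are made up for illustration, and the atomicity that the server provides is only implied here by JavaScript being single-threaded.

```javascript
// In-memory stand-in for a collection of work items.
const jobs = [
  { _id: 1, status: "pending" },
  { _id: 2, status: "pending" },
];

// Hypothetical helper: find one document matching `query`, apply `update`,
// and return the document as it looked before the update (the default
// behaviour of findAndModify), or null when nothing matches.
function findAndModify(collection, query, update) {
  const doc = collection.find(query);
  if (!doc) return null;
  const before = Object.assign({}, doc);
  update(doc);
  return before;
}

// Claim the next pending job for `worker`.
function checkOut(worker) {
  return findAndModify(
    jobs,
    doc => doc.status === "pending",
    doc => { doc.status = "processing"; doc.owner = worker; }
  );
}

const jobA = checkOut("worker-a"); // gets job 1
const jobB = checkOut("worker-b"); // gets job 2
const jobC = checkOut("worker-c"); // nothing left to check out
```
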
  17. Aggregation • Map/reduce
  18. Aggregation • Map/reduce – Map: for each document: emit 0 or more (key, value) tuples – Reduce: given a (key, value[]), return 1 value
  19. Aggregation
        // Map: emit one (episode, {count, names}) pair per appearance
        m = function() {
            var doc = this;
            doc.appearances.forEach(function(a) {
                emit(a, {
                    count: 1,
                    names: [doc.firstName + " " + doc.lastName]
                });
            });
        }

        // Reduce: sum the counts and concatenate the names for one key
        r = function(key, values) {
            var count = 0;
            var names = [];
            values.forEach(function(v) {
                count += v.count;
                names = names.concat(v.names);
            });
            return {count: count, names: names};
        }
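
To see what these two functions produce without a running server, they can be driven by a tiny in-memory harness. The `mapReduce` driver and the sample documents below are hypothetical. The driver mimics two things the server does: it binds `this` to each document for the map call, and it skips reduce for keys that only received a single value.

```javascript
let emit; // assigned by mapReduce before the map function runs

// Map and reduce functions as on the slide
const m = function () {
  const doc = this;
  doc.appearances.forEach(function (a) {
    emit(a, { count: 1, names: [doc.firstName + " " + doc.lastName] });
  });
};

const r = function (key, values) {
  let count = 0;
  let names = [];
  values.forEach(function (v) {
    count += v.count;
    names = names.concat(v.names);
  });
  return { count: count, names: names };
};

// Hypothetical in-memory driver standing in for the server.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  docs.forEach(doc => map.call(doc)); // `this` is the current document
  const results = {};
  groups.forEach((values, key) => {
    // A key with a single emitted value is passed through unreduced
    results[key] = values.length === 1 ? values[0] : reduce(key, values);
  });
  return results;
}

// Made-up sample data: who appeared in which episode
const characters = [
  { firstName: "Ann", lastName: "Lee", appearances: ["S01E01", "S01E02"] },
  { firstName: "Bob", lastName: "Ray", appearances: ["S01E02"] },
];

const perEpisode = mapReduce(characters, m, r);
console.log(JSON.stringify(perEpisode, null, 2));
```

Because the value returned by `r` has the same shape as the values emitted by `m`, the reduce function can safely be applied again to its own output, which the server may do when it reduces in several rounds.
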
  20. Example 3 • Use map/reduce to collect information on who appeared in each episode
  21. Aggregation • Aggregation framework (not available until 2.2) – declarative syntax for construction of an aggregation pipeline
  22. Aggregation • Aggregation framework (not available until 2.2)
  23. Deployment • Several configurations – we’ll check out replica sets and sharding
  24. Replica sets • Master-slave with automatic failover – Each mongod should be started with the --replSet argument – Additional nodes added from the shell – Make sure the number of nodes is odd, possibly by adding an arbiter
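
The advice about an odd number of nodes follows from how a primary is elected: it needs the votes of a strict majority of the voting members. The arithmetic can be sketched in a few lines of plain JavaScript (the function names are made up for this illustration).

```javascript
// Votes needed to elect a primary: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many members can be lost while a primary can still be elected.
function toleratedFailures(votingMembers) {
  return votingMembers - majority(votingMembers);
}

[2, 3, 4, 5].forEach(n => {
  console.log(
    n + " members: majority " + majority(n) +
    ", tolerates " + toleratedFailures(n) + " failure(s)"
  );
});
```

A set of four members tolerates one failure, the same as a set of three, so the extra data-bearing node buys no additional availability. An arbiter is a cheap way to get from an even to an odd number of votes.
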
  25. Replica sets • Higher availability • Scale out reads • Backup without interfering with the primary
  26. Sharding • Auto-sharding – happens by user-defined shard key – can be defined per collection – requires special nodes: mongos (the load balancer) and a mongod that is configured to be a configuration server
  27. Sharding • Scale out writes • Limitations: – Shard key is immutable – All inserts/updates must include the shard key – Cannot enforce (arbitrary) uniqueness across shards, only for shard key
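
The limitations make more sense with a picture of how a write finds its shard. Each shard owns ranges of shard key values, and the router looks the key up to decide where a document goes. The sketch below is a simplified in-memory illustration: the chunk boundaries, the shard names, the `userId` key and the `route` function are all made up, and real chunk management (splitting, balancing) is left out.

```javascript
// Hypothetical chunk table: each chunk covers [min, max) of the shard key.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 2000, shard: "shard1" },
  { min: 2000, max: Infinity, shard: "shard2" },
];

// Hypothetical router: pick the shard whose chunk contains the key.
function route(doc, shardKey) {
  const key = doc[shardKey];
  if (key === undefined) {
    // Without the shard key there is no way to pick a single shard
    throw new Error("document lacks shard key '" + shardKey + "'");
  }
  return chunks.find(c => key >= c.min && key < c.max).shard;
}

const target = route({ userId: 1500, name: "Ann" }, "userId");

let rejected = false;
try {
  route({ name: "Bob" }, "userId");
} catch (e) {
  rejected = true;
}
```

Seen this way, changing a document's shard key would mean moving it to another shard, and a unique constraint on any other field would have to be checked on every shard at once.
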
  28. Sharding + replica sets
  29. MongoDB’s durability story • Memory-mapped files. • fsync. • Durability through replication – pre 1.8 • Durability through journaling – an option since 1.8 – replica sets still cool though – default since 2.0
  30. MongoDB’s durability story • Inserts and updates are unsafe by default!! – only purpose: get awesome benchmarks – bad: bites you in the a** • Exposed differently on drivers, but always maps to db.getLastError()
  31. MongoDB’s durability story • Conclusion: It’s cool that you can tweak it per operation, but it’s uncool that it’s unsafe.
  32. Things to be aware of • Safe mode off • 32/64 bit • Memory-mapped file • Global write lock • Indexes should always fit in RAM
  33. Thanks for listening! • mhg@d60.dk • @mookid8000 • http://mookid.dk/oncode
  34. Image credits • The world’s most interesting man: http://i.qkme.me/3mwy.jpg • Bison: http://www.flickr.com/photos/johan-gril/5632513228/ • Tired Fry: http://cdn.memegenerator.net/instances/400x/18731987.jpg • Thanks for letting me borrow your awesome images – if you ever meet me, I’ll buy you a beer. Seriously, I will.
