  6. ONE DOC OR MULTIPLE DOCS?
  17. SAMPLE DOCS (IN 2.0): handy for quick doc “schema” referencing ...
Published in: Technology
  1. Designing Couchbase Documents
     Benjamin Young, @bigbluehat
  2. SCHEMA-LESS DATABASE
     • ad hoc data store
       – no need to define schema before adding data
     • document structure comes into play
       – but at the query level, not at the entry level
  3. HOWEVER!
     There are constraints.
  4. INHERENT SCHEMA
     sort of
  5. UNIQUENESS
     • Document ID is the only (DB-side) way to make something unique
       – UUIDs don’t cut it for this
     • App could de-dup from map/reduce
       – but that can be tricky
     • Be prepared to handle conflicting IDs
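One way to lean on the document ID for uniqueness is to derive the ID from the value that must be unique. The following is a minimal sketch, not something shown in the slides: the `makeDocId` and `insertUnique` helpers, the `type:value` ID layout, and the in-memory `Map` standing in for the database are all assumptions made for illustration.

```javascript
// Hypothetical helper: build a deterministic document ID from a type
// and the value that must be unique (assumption: one user per email).
function makeDocId(type, uniqueValue) {
  // Normalize so that "Bob@Example.com" and "bob@example.com " collide on purpose.
  const normalized = String(uniqueValue).trim().toLowerCase();
  return type + ":" + normalized;
}

// In-memory stand-in for the database, used only to show the conflict path.
function insertUnique(store, doc) {
  if (store.has(doc._id)) {
    return { ok: false, error: "conflict" }; // the ID is already taken
  }
  store.set(doc._id, doc);
  return { ok: true };
}

const store = new Map();
const first = insertUnique(store, {
  _id: makeDocId("user", "Bob@Example.com"),
  name: "Bob",
});
const second = insertUnique(store, {
  _id: makeDocId("user", "bob@example.com "),
  name: "Robert",
});
```

The second insert is rejected because both documents map to the same ID, which is the "be prepared to handle conflicting IDs" case: the application has to decide whether to merge, retry, or report the duplicate.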
  7. DECISION MAKERS
     • what does this document look like in real life?
     • how often will I update this?
     • does this need its own revision/transaction path?
       – does all this data need updating together?
       – or rolled back together?
     • Side Note:
       – revisions should never be used for versioning
       – compaction will remove them, and you’ll be sad
  8. JSON DOCUMENTS
     {
       "json": "key / value pairs",
       "_id": "some uuid",
       "_rev": "mvcc key",
       "string keys": [1, 2, 3, "four", null],
       "schema free": true
     }
  9. KEY NAMES
     • JSON Object restrictions
       – they’re all strings
     • Couchbase reserves these prefixes on top-level keys
       – "_" underscore (also reserved by CouchDB)
       – "$" dollar signs
     • Consider how you’ll be using it in your app
       – template system constraints?
       – objects vs. arrays
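A quick pre-save check can catch top-level keys that use the reserved prefixes. This is a small sketch under assumptions: the `reservedTopLevelKeys` name and the default allow-list of `_id` and `_rev` are illustrative choices, not part of the talk.

```javascript
// Return the top-level keys that start with a reserved prefix ("_" or "$").
// Assumption: "_id" and "_rev" are the only reserved keys the app sets itself.
function reservedTopLevelKeys(doc, allowed = ["_id", "_rev"]) {
  return Object.keys(doc).filter(
    (key) =>
      (key.startsWith("_") || key.startsWith("$")) && !allowed.includes(key)
  );
}

const offending = reservedTopLevelKeys({
  _id: "abc",
  _private: 1, // reserved prefix at the top level
  $price: 2, // reserved prefix at the top level
  title: "Home",
  nested: { _ok: true }, // nested keys are not checked
});
```

Only top-level keys are inspected, matching the slide's wording that the prefixes are reserved on top-level keys.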
  10. VALUES
      • JSON restrictions
        – objects, arrays, strings, numbers, booleans, null
      • Be careful of numbers as strings
        – running _stats (or _sum) on strings will ruin your day
        – might use Number() if you’re unsure
      • Date formats
        – unix timestamps
        – output as an array for grouping reductions
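Both value tips can be combined in a single map function: coerce the amount with `Number()` and emit the date as an array key so a reduce can group by year, month, or day. This is a sketch only. The `sale` type, the `amount` field, and the assumption that `created_at` is a unix timestamp in seconds are made up for the example, and `emit` is stubbed here so the code runs outside a view engine.

```javascript
// Stand-in for the emit() that the view engine normally provides.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Map function sketch: key is [year, month, day], value is a real number.
function map(doc) {
  if (doc.type !== "sale") return;
  const date = new Date(doc.created_at * 1000); // seconds to milliseconds
  const amount = Number(doc.amount); // "19.50" becomes 19.5
  if (Number.isNaN(amount)) return; // skip values that are not numeric
  emit(
    [date.getUTCFullYear(), date.getUTCMonth() + 1, date.getUTCDate()],
    amount
  );
}

map({ type: "sale", created_at: 1213801425, amount: "19.50" });
map({ type: "sale", created_at: 1213801425, amount: "n/a" }); // skipped
```

With array keys like these, a grouped reduce can be asked for totals at different levels (per year, per month, per day) from the same view.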
  11. QUERYING
      can I get at the doc’s data easily?
  12. UPDATING
      • When things change, do I want to update the doc?
        – or put in a new doc and “collapse” things later
          • the accounting model
      • Frequently (re)written docs might make replication harder
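The accounting model can be pictured as a ledger: every change is written as its own small document, and a later step collapses them into a total. The sketch below assumes hypothetical `entry` documents with `account` and `amount` fields, and uses a plain function in place of a grouped summing reduce.

```javascript
// Each change is its own small document instead of an update to one big doc.
const entries = [
  { _id: "e1", type: "entry", account: "checking", amount: 100 },
  { _id: "e2", type: "entry", account: "checking", amount: -30 },
  { _id: "e3", type: "entry", account: "savings", amount: 50 },
];

// "Collapse" the entries into one balance per account, the way a summing
// reduce grouped by account would.
function collapseBalances(docs) {
  const balances = {};
  for (const doc of docs) {
    if (doc.type !== "entry") continue;
    balances[doc.account] = (balances[doc.account] || 0) + doc.amount;
  }
  return balances;
}

const balances = collapseBalances(entries);
```

Because no existing document is rewritten, two writers adding entries at the same time never touch the same document, which is the property that makes this pattern friendlier to replication.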
  13. REPLICATION
      • The biggie!
      • Avoid conflicts (if possible)
      • Leverage small pieces where possible/sensible
      • Keep uniqueness and conflicts in balance
  14. TOOLS
  15. VALIDATE_DOC_UPDATE
      • function(newDoc, storedDoc, userCtx)
      • optionally enforced schema
      • throw errors to prevent save
      • cannot modify newDoc
      • can enforce field types, values
      • can prevent docs or fields from being updated again (created_at, user)
      • runs every time a document is updated
        – even during replication
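A validation function along these lines might look like the sketch below. The signature matches the slide, and throwing an object with a `forbidden` message follows the CouchDB convention for rejecting a write. The specific rules (a string `type`, a numeric and immutable `created_at`) and the `check` wrapper are assumptions chosen to mirror the bullets, not rules from the talk.

```javascript
// Sketch of a validation function with the signature from the slide.
// userCtx is unused here; it would carry the user's name and roles.
function validateDocUpdate(newDoc, storedDoc, userCtx) {
  if (newDoc._deleted) return; // let deletions through

  // Enforce field types.
  if (typeof newDoc.type !== "string") {
    throw { forbidden: "type must be a string" };
  }
  if (typeof newDoc.created_at !== "number") {
    throw { forbidden: "created_at must be a number" };
  }

  // Prevent a field from being changed once the doc exists.
  if (storedDoc && storedDoc.created_at !== newDoc.created_at) {
    throw { forbidden: "created_at cannot change" };
  }
}

// Hypothetical wrapper for trying the function outside the database.
function check(newDoc, storedDoc) {
  try {
    validateDocUpdate(newDoc, storedDoc, { name: "demo", roles: [] });
    return "ok";
  } catch (err) {
    return err.forbidden;
  }
}

const created = check({ type: "contact", created_at: 1 }, null);
const changed = check(
  { type: "contact", created_at: 2 },
  { type: "contact", created_at: 1 }
);
```

Note that the function only inspects `newDoc` and throws; it never assigns to it, in line with the "cannot modify newDoc" bullet.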
  16. ?INCLUDE_DOCS=TRUE
      • super handy for “joining” map/reduce results
      • can help you “accept” using multiple, smaller docs
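The "join" can be emulated in a few lines to show what the server does with `include_docs=true`. In CouchDB, when a view row's value is an object containing an `_id`, that linked document is attached to the row instead of the emitting document; whether a given Couchbase release behaves the same way should be treated as an assumption. The `db` object, the document IDs, and the `includeDocs` function below are all hypothetical.

```javascript
// Tiny in-memory "database" keyed by document ID (illustrative data).
const db = {
  "page-1": { _id: "page-1", type: "page", title: "Home" },
  "item-a": { _id: "item-a", type: "html", title: "Welcome" },
};

// View rows as a map function might have emitted them.
const viewRows = [
  { id: "page-1", key: ["page-1", 0], value: { _id: "item-a" } }, // linked doc
  { id: "page-1", key: ["page-1"], value: null }, // the page itself
];

// Emulates include_docs=true: attach the linked doc if the value names one,
// otherwise attach the doc that emitted the row.
function includeDocs(rows, database) {
  return rows.map((row) => {
    const targetId = row.value && row.value._id ? row.value._id : row.id;
    return { ...row, doc: database[targetId] || null };
  });
}

const joined = includeDocs(viewRows, db);
```

One query then returns the page and the content it references, which is what makes splitting data across several smaller documents easier to accept.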
  18. MORE TOOLS
      • Update handlers
      • Output functions
        – _show/{show_function_name}/{doc_id}
          • runs a single doc through an additional “display” function
        – _list/{list_function_name}/{view_name}
          • same as _show, but for map/reduce results
      • research these later
        – or join me in the Lounge after this talk
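As a rough idea of what a "display" function does, here is a CouchDB-style show function sketch: it receives one document plus the request and returns a response object. The `showTitle` and `escapeHtml` names and the HTML output are illustrative assumptions, and support for show functions in a particular Couchbase version is not something the slides confirm.

```javascript
// Escape the characters that matter when placing text inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Show function sketch: one doc in, one response object out.
function showTitle(doc, req) {
  if (!doc) {
    return { code: 404, body: "Not found" };
  }
  return {
    headers: { "Content-Type": "text/html" },
    body: "<h1>" + escapeHtml(doc.title) + "</h1>",
  };
}

const response = showTitle({ _id: "abc", title: "Fish & Chips" }, {});
const missing = showTitle(null, {});
```

A list function plays the same role for view results, formatting many rows into a single response instead of one document.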
  19. CONVENTIONS & GOOD HABITS
      • "type": "contact"
      • "created_at": timestamp
      • "status": some status for this doc (published)
      • "tags": ["couch", "db", "nosql"]
  20. MORE CONVENTIONS
      • "created_by": username (from _users typically)
      • "profile": CouchApp profile contents (from _users) stored on doc for convenience
  21. EXAMPLES
  22. BLUEINK
      • “page” documents reference IDs (UUIDs) in various page areas
      • map/reduce aggregates general page data, site-wide settings, the chosen template, and page items
      • _list applies the template, builds final page, all in one GET
  23. BLUEINK (CONT)
      • content item docs (type “html”, “contact”, etc) hold content separate from doc for independent updates and reuse
      • other special docs use non-UUID IDs: site, sitemap, template docs
  24. BLUEINK PAGE
      • page document contains: page_items key, which is a multi-dimensional array containing page areas and the associated content items
      [Diagram: a page document holding numbered page areas, each holding numbered content items]
  25. COMPLETE PAGE DOC
      {"_id": "a9c276de2a064836ab306b095f000f8a",
       "_rev": "174-5cf651f7b944b1a352bc10103e018652",
       "type": "page",
       "title": "Home", "nav_label": "Home", "url": "home",
       "page_items": [
         [ {"_id": "13a54b6e52123745cced243d620003e0", "display_title": true},
           {"_id": "8dd982de76e8b5959e10e6d4360067ce", "display_title": false},
           {"_id": "8dd982de76e8b5959e10e6d43600709a", "display_title": true} ],
         [ {"_id": "90cb972de2a11045be18a3a88c001bad"} ] ] }
  26. PAGE ITEMS SECTION of previous “page” doc
      {"page_items": [
        [                                                                             (page area 1)
          {"_id": "13a54b6e52123745cced243d620003e0", "display_title": true},    (content item 1)
          {"_id": "8dd982de76e8b5959e10e6d4360067ce", "display_title": false}    (content item 2)
        ],
        [                                                                             (page area 2)
          {"_id": "90cb972de2a11045be18a3a88c001bad"}                             (content item 1)
        ]
      ] }
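To make the nested `page_items` array queryable, a map function can walk the page areas and content items and emit one row per item. The sketch below is an assumption about how such a view could be written, not BlueInk's actual code; `emit` is stubbed so the example runs on its own, and the key layout `[page id, area index, item index]` is an illustrative choice.

```javascript
// Stand-in for the emit() that the view engine normally provides.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// One row per content item, keyed so rows sort by page, area, then position.
function map(doc) {
  if (doc.type !== "page" || !Array.isArray(doc.page_items)) return;
  doc.page_items.forEach((area, areaIndex) => {
    area.forEach((item, itemIndex) => {
      emit([doc._id, areaIndex, itemIndex], {
        _id: item._id, // lets include_docs pull in the content item doc
        display_title: item.display_title === true,
      });
    });
  });
}

map({
  _id: "a9c276de2a064836ab306b095f000f8a",
  type: "page",
  page_items: [
    [
      { _id: "13a54b6e52123745cced243d620003e0", display_title: true },
      { _id: "8dd982de76e8b5959e10e6d4360067ce", display_title: false },
    ],
    [{ _id: "90cb972de2a11045be18a3a88c001bad" }],
  ],
});
```

Querying such a view for keys starting with one page ID would return that page's items in display order, ready to be formatted by a list function.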
  27. “HTML” ITEM DOC (“included” & formatted via _list function)
      {"_id": "8dd982de76e8b5959e10e6d43600615d",
       "_rev": "174-aa7546e6308324d4b3ad84469fcbd773",
       "type": "html",
       "created": "2008-06-18 15:03:45",
       "updated": "2009-08-08 13:28:58",
       "title": "Welcome",
       "content": "<p>The <a href=\"http:// \"><strong>BlueInk Content Management System (CMS)</strong></a> gives clients hassle-free control over their content. Through a simple interface youll be able to edit and organize text, photos, contact information, and other components-all while looking right at your website! The software is easy to learn, but dont take our word for it-sign up for a <a href=\"http://\">free demo</a>!</p>"}
  28. MAP/REDUCE OUTPUT
  29. ANY QUESTIONS?
      • catch me in the lounge
      • or online:
        • @bigbluehat
        • bigbluehat on IRC (freenode)
        • ben...