Best Practices for Good Document Design
Dustin Sallings @dlsspy
CouchConf-Chicago-Best-Practices-Good-Document-design

Notes
  • Slide 5: UUIDs don’t cut it for this. The same doc POSTed twice gets 2 different UUIDs.
  • Slide 9: Running _stats (or _sum) on strings will ruin your day. Use Number() if you’re unsure.
  • Slide 11: It’s not everything you know…
  • Slide 12: It’s not half of what you know…
  • Slide 13: It’s not even everything you know about a particular subject…
  • Slide 14: Who knows what it is? The concept is quite complicated, so there isn't an easy answer, just like there isn't an easy answer for "what is a good data model," for your app or RDBMS schema. http://polaris.gseis.ucla.edu/gleazer/260_readings/Buckland.pdf
  • Slide 15: Side note: revisions should never be used for versioning, as compaction will remove them and you’ll be sad.
  • Slide 17: The accounting model requires a “reconciliation” (via MapReduce) to take place to produce the canonical document. Denormalized data example: author data when updating a blog post.
  • Slide 21: Caution: someone else likely uses this type name. Maybe “namespace” them: yourapp.contact. Can turn this into whatever other format you need.
  • Slide 28: Simple document representing a single reading from a single thermometer.
  • Slide 29: Complex document with lots of fields (exif abbreviated, more attachments, etc...). Real life image from my photo album. ID is an MD5 of the original image.

    1. Best Practices for Good Document Design
       Dustin Sallings @dlsspy
    2. SCHEMA-FREE DATABASE
       • schema definition is optional at write time
         – no need to define schema before adding data
         – write any sort of JSON you’d like
       • schemas can be enforced
       • but (by default) only matter when writing queries
    3. HOWEVER!
       There are constraints.
    4. INHERENT SCHEMA
       sort of
    5. UNIQUENESS
       • Document ID is the only (DB-side) way to make something unique
       • App could de-duplicate from map/reduce
         – but that can be tricky
       • So, be prepared to handle conflicting IDs
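Since the document ID is the only database-side uniqueness guarantee, one option is to build the ID from the fields that make a record unique. The thermometer reading shown later in the deck does this with its timestamp and sensor serial number. A minimal sketch, assuming a hypothetical helper named readingId:

```javascript
// Hypothetical helper: build a deterministic _id from the fields that make
// a reading unique (timestamp plus sensor serial number).
function readingId(reading) {
  return reading.ts + "_" + reading.sn;
}

const first = {
  type: "reading",
  sn: "101D8A2A000000F7",
  ts: "2011-10-20T00:32:58",
  reading: 22.98
};
// The same reading submitted a second time, e.g. after a client retry.
const retry = Object.assign({}, first);

const firstId = readingId(first);
const retryId = readingId(retry);
// Both attempts target the same _id, so the second write surfaces as a
// conflict instead of silently creating a duplicate document.
```

With auto-generated IDs the two submissions would be stored as two unrelated documents; with a derived ID the application has to be ready to handle the resulting ID conflict.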
    6. IMPLICIT BASICS
    7. JSON DOCUMENTS
       {
         "json": "key / value pairs",
         "_id": "specified id or auto generated UUID",
         "_rev": "mvcc",
         "keys are strings": [1, 2, 3, "four", null],
         "schema free": true
       }
    8. KEY NAMES
       • JSON Object restrictions
         – they’re all strings
       • Couchbase Server reserves these prefixes on top-level keys
         – "_" underscore (also reserved by CouchDB)
         – "$" dollar sign
       • cannot have duplicate key names on a single level
         – {"key": 1, "key": 2} is invalid (thankfully)
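The reserved-prefix rule can be checked in the application before a write. The sketch below is illustrative only: the function name reservedKeys is hypothetical, and the allow-list contains just the system fields that appear in this deck (_id, _rev, _attachments), so it should not be read as exhaustive.

```javascript
// System fields seen in this deck; an assumption, not a complete list.
const ALLOWED_SYSTEM_KEYS = ["_id", "_rev", "_attachments"];

// Return the top-level keys that start with a reserved prefix ("_" or "$")
// and are not on the allow-list. Nested keys are deliberately not checked.
function reservedKeys(doc) {
  return Object.keys(doc).filter(function (key) {
    const prefix = key.charAt(0);
    const reserved = prefix === "_" || prefix === "$";
    return reserved && ALLOWED_SYSTEM_KEYS.indexOf(key) === -1;
  });
}

const offenders = reservedKeys({
  _id: "contact-1",
  _rev: "1-abc",
  $set: 1,
  _private: true,
  name: "Dustin"
});
```

Here offenders holds "$set" and "_private", the two keys an application would need to rename.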
    9. VALUES
       • JSON restrictions
         – objects, arrays, strings, numbers, booleans, null
       • Consider how you’ll be using it in your app
         – template system constraints
         – Mustache needs arrays of objects vs. arrays of arrays
       • Be careful of numbers as strings
       • Date formats
         – ISO8601
         – unix timestamps
         – output as an array for grouping reductions
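Two of the points above, numbers stored as strings and dates emitted as arrays, can be sketched in a single map function. This is an illustration rather than code from the talk: emit is stubbed so the example runs under Node, and the field names follow the thermometer reading shown later in the deck.

```javascript
// Stand-in for the emit() that the view server provides; it collects rows
// so the sketch can run outside the database.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function in the CouchDB style.
function map(doc) {
  if (doc.type !== "reading") return;

  // Coerce defensively: a reading stored as "22.98" becomes 22.98.
  const value = Number(doc.reading);
  if (isNaN(value)) return; // skip values that are not numeric

  // "2011-10-20T00:32:58" -> [2011, 10, 20, 0]
  const parts = /^(\d{4})-(\d{2})-(\d{2})T(\d{2})/.exec(doc.ts);
  if (!parts) return;
  emit([Number(parts[1]), Number(parts[2]), Number(parts[3]), Number(parts[4])], value);
}

map({ type: "reading", ts: "2011-10-20T00:32:58", reading: "22.98" });
map({ type: "reading", ts: "2011-10-20T00:47:58", reading: 23.5 });
```

With an array key of year, month, day and hour, a reduce such as _stats can be grouped per day by querying with group_level=3, or per hour with group_level=4.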
    10. ONE DOC OR MULTIPLE DOCS?
    11. THIS IS NOT A DOCUMENT
    12. THIS IS NOT A DOCUMENT
    13. NOPE
    14. (no text on this slide)
    15. DECISION MAKERS
        • what does this document look like in real life?
        • how often will I update this?
        • does this need its own revision/transaction path?
          – does all this data need updating together?
          – or rolled back together?
    16. QUERYING
        can I get at the doc’s data easily?
    17. UPDATING
        • When things change, do I want to update the doc?
          – or record the document’s changes as individual docs
        • Frequently written docs might make replication harder due to higher conflict probability
        • Will I have de-normalized portions of data on hand in the client app when updating?
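Recording changes as individual docs means the current state has to be rebuilt from them. The sketch below does that reconciliation in plain JavaScript to show the idea; the change-document shape (target, created_at, set) and the function name reconcile are assumptions made for the example, and in practice the fold could live in a view.

```javascript
// Hypothetical change documents: each records one edit to a target doc.
const changes = [
  { type: "change", target: "post-1", created_at: 1319070778, set: { title: "Draft" } },
  { type: "change", target: "post-1", created_at: 1319074378, set: { status: "published" } },
  { type: "change", target: "post-1", created_at: 1319072578, set: { title: "Good Document Design" } }
];

// Apply the changes for one target in timestamp order to produce the
// canonical document. Later changes overwrite earlier ones field by field.
function reconcile(changeDocs, target) {
  return changeDocs
    .filter(function (change) { return change.target === target; })
    .sort(function (a, b) { return a.created_at - b.created_at; })
    .reduce(function (doc, change) { return Object.assign(doc, change.set); }, { _id: target });
}

const post = reconcile(changes, "post-1");
```

Because every change is its own small document, two writers never update the same document, which trades conflict handling for the cost of rebuilding state on read.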
    18. REPLICATION
        the biggie!
    19. REPLICATION
        • Where possible...
          – avoid conflicts
          – leverage small pieces
        • Keep uniqueness and conflicts in balance
    20. CONVENTIONS
        paving the cattle trails
    21. CONVENTIONS & GOOD HABITS
        • "type": "contact"
        • "created_at": Unix timestamp
        • "status": some status for this doc (ex: published)
        • "tags": ["couch", "db", "nosql"]
    22. MORE CONVENTIONS
        • "created_by": username
          – typically from _users database
        • "profile": CouchApp profile contents
          – from _users database
          – stored on the doc for convenience
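Put together, the conventions on the last two slides might look like this when a document is created. The helper name newContact, the name field, and the chosen status and tags are illustrative assumptions.

```javascript
// Hypothetical factory that stamps the conventional fields onto a new doc.
// nowMs is passed in (milliseconds since the epoch) to keep the function
// deterministic; created_at is stored as a Unix timestamp in seconds.
function newContact(name, username, nowMs) {
  return {
    type: "contact",
    created_at: Math.floor(nowMs / 1000),
    created_by: username,
    status: "published",
    tags: ["couch", "db", "nosql"],
    name: name
  };
}

const contact = newContact("Dustin Sallings", "dustin", 1319070778000);
```

A consistent type field is what lets a view select only the documents it cares about, for example with a check such as doc.type === "contact" at the top of a map function.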
    23. TOOLS
    24. VALIDATE_DOC_UPDATE
        • optional schema enforcement
        • function(newDoc, storedDoc, userCtx)
        • throw errors to prevent save
        • cannot modify newDoc
        • can enforce field types, values, formats
        • can prevent docs or fields from being changed (created_at, user)
        • runs every time a document is updated
          – even during replication
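A validation function along the lines described above could look like the sketch below. The specific field rules are assumptions chosen for illustration; throwing an object with a forbidden property follows CouchDB's convention for rejecting a write. The check helper is only a local harness so the example runs under Node.

```javascript
// Sketch of a validate_doc_update function. userCtx is part of the
// signature but unused here, since these rules do not depend on the user.
function validateDocUpdate(newDoc, storedDoc, userCtx) {
  function ensure(condition, message) {
    if (!condition) throw { forbidden: message };
  }

  if (newDoc._deleted) return; // let deletions through

  // Enforce field types.
  ensure(typeof newDoc.type === "string", "type must be a string");
  ensure(typeof newDoc.created_at === "number", "created_at must be a Unix timestamp");

  // On updates, lock the fields that should never change.
  if (storedDoc) {
    ensure(newDoc.created_at === storedDoc.created_at, "created_at is read-only");
    ensure(newDoc.created_by === storedDoc.created_by, "created_by is read-only");
  }
}

// Local harness (not part of the database): returns null when the write
// would be accepted, or the rejection message otherwise.
function check(newDoc, storedDoc, userCtx) {
  try {
    validateDocUpdate(newDoc, storedDoc, userCtx);
    return null;
  } catch (err) {
    return err.forbidden;
  }
}

const user = { name: "dustin", roles: [] };
const stored = { type: "contact", created_at: 1319070778, created_by: "dustin" };

const accepted = check(stored, null, user);
const changedTimestamp = check(
  { type: "contact", created_at: 1, created_by: "dustin" }, stored, user);
const badType = check(
  { type: 42, created_at: 1319070778, created_by: "dustin" }, null, user);
```

Because the function also runs for replicated writes, rules written here apply to documents arriving from other nodes as well as to direct saves.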
    25. SAMPLE DOCS (IN 2.0)
        HANDY FOR QUICK DOC “SCHEMA” REFERENCING
    26. ADVANCED DOCUMENT DESIGN
        more tools and tricks in this session
    27. EXAMPLES
    28. {
            "_id": "2011-10-20T00:32:58_101D8A2A000000F7",
            "_rev": "1-0c9914a4695b67a4f38cb5f8e345d28f",
            "reading": 22.98,
            "sn": "101D8A2A000000F7",
            "ts": "2011-10-20T00:32:58",
            "type": "reading"
        }
    29. {
            "_id": "0017dcf0149c130229f35b537df48073",
            "_rev": "7-ef6c5723edafb9828dbc36467493341d",
            "old_id": 7343,
            "height": 1800,
            "keywords": [
                "wrx"
            ],
            "cat": "Public",
            "size": 1711223,
            "tnwidth": 194,
            "exif": {
                "EXIF ApertureValue": "367/100",
                /* ... */
                "EXIF SensingMethod": "One-chip color area",
                "MakerNote AEWarning": "Off",
                "Thumbnail Orientation": "Horizontal (normal)"
            },
            "descr": "My car had fun this weekend.  It got all dirty in the snow and then ran into a sign. ",
            "ts": "2006-02-22T17:47:16",
            "addedby": "dustin",
            "width": 2400,
            "extension": "jpg",
            "tnheight": 146,
            "taken": "2006-02-21",
            "type": "photo",
            "annotations": [
            ],
            "_attachments": {
                "800x600.jpg": {
                    "content_type": "image/jpeg",
                    "revpos": 4,
                    "digest": "md5-HB0NfWVLWeQJn8j79214Fw==",
                    "length": 84388,
                    "stub": true
                },
                // [...]
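The speaker note for the photo document says its ID is an MD5 of the original image. A content-derived ID of that kind can be sketched with Node's built-in crypto module; the helper name contentId is hypothetical, and short text buffers stand in for image bytes.

```javascript
const crypto = require("crypto");

// Hypothetical helper: derive a document ID from the bytes of the original
// file, so that uploading the same file twice maps to the same document.
function contentId(bytes) {
  return crypto.createHash("md5").update(bytes).digest("hex");
}

const firstUpload = contentId(Buffer.from("hello"));
const sameFileAgain = contentId(Buffer.from("hello"));
const differentFile = contentId(Buffer.from("hello!"));
```

Identical bytes always produce the same 32-character hex ID, so a repeated upload shows up as an ID conflict rather than as a second photo document.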
    30. Q&A Panel with Couchbase Experts
        Submit your questions for our Couchbase Q&A at the end of the conference to: couchconfchicago@couchbase.com