CouchConf-Bangalore-Best-Practices-for-good-document-design

Published in: Technology
Speaker notes

  • Slide 2: It’s not everything you know…
  • Slide 3: It’s not half of what you know…
  • Slide 4: It’s not even everything you know about a particular subject…
  • Slide 5: Who knows what it is?
    http://polaris.gseis.ucla.edu/gleazer/260_readings/Buckland.pdf
  • Slide 6: IMO, schemas always exist, but you may be in the process of discovering them while you’re developing your application.
  • Slide 9:
      - UUIDs don’t cut it for this
      - same doc POSTed twice gets 2 different UUIDs
  • Slide 13:
      * running _stats (or _sum) on strings will ruin your day
      * use Number() if you’re unsure
  • Slide 15: Side Note: revisions should never be used for versioning as compaction will remove them and you’ll be sad
  • Slide 17: the accounting model - requires a “reconciliation” (via MapReduce) to take place to produce the canonical document
    Denormalized data example: author data when updating a blog post
  • Slide 21: caution: someone else likely uses this type name
    maybe “namespace” them: yourapp.contact
    can turn this into whatever other format you need
  • Slide 28: Simple document representing a single reading from a single thermometer.
  • Slide 30: India train station data.
  • Slide 31: India train data.
  • Slide 32: India train data. This is modified from the original form. Problems:
      1. departure_days was in the form of “---W---”
      2. Uses “T” and “F” for true and false
      3. Can tell starting and ending stations, but not middle stations (redundant, but important for querying)
      4. Had fields for each day of the week (e.g. “monday”: “F”).
  • Slide 33: Complex document with lots of fields (exif abbreviated, more attachments, etc...). Real life image from my photo album. ID is an MD5 of the original image.

    1. Best Practices for Good Document Design
        Dustin Sallings @dlsspy
    2. DOCUMENTS ARE NOT…
    3. DOCUMENTS ARE NOT…
    4. DOCUMENTS ARE NOT…
    5. (no text on slide)
    6. SCHEMA-FREE DATABASE
        • schema definition is optional at write time
            – no need to define schema before adding data
            – write any sort of JSON you’d like
        • schemas can be enforced
        • but (by default) only matter when writing queries
    7. HOWEVER!
        There are constraints.
    8. INHERENT SCHEMA
        sort of
    9. UNIQUENESS
        • Document ID is the only (DB-side) way to make something unique
        • App could de-duplicate from map/reduce
            – but that can be tricky
        • So, be prepared to handle conflicting IDs
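Because the document ID is the only database-side uniqueness guarantee, one option is to derive the ID from the data itself instead of accepting a generated UUID. The following is a minimal sketch, modeled on the thermometer reading example later in this deck, whose `_id` joins the timestamp and the sensor serial number. The helper name `readingId` is hypothetical and not part of any Couchbase or CouchDB API.

```javascript
// Hypothetical helper: build a deterministic _id from a reading's natural key.
// Field names (ts, sn) follow the thermometer example in this deck.
function readingId(reading) {
  if (typeof reading.ts !== "string" || typeof reading.sn !== "string") {
    throw new Error("reading needs string ts and sn fields");
  }
  return reading.ts + "_" + reading.sn;
}

const reading = {
  type: "reading",
  ts: "2011-10-20T00:32:58",
  sn: "101D8A2A000000F7",
  reading: 22.98,
};

// The same reading always maps to the same _id, however often it is submitted.
const doc = Object.assign({ _id: readingId(reading) }, reading);
console.log(doc._id); // 2011-10-20T00:32:58_101D8A2A000000F7
```

With an ID like this, submitting the same reading twice collides on the ID instead of silently creating a second document, which is the conflicting-ID case the slide says to be prepared for.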
    10. IMPLICIT BASICS
    11. JSON DOCUMENTS
        {
          "json": "key / value pairs",
          "_id": "specified id or auto generated UUID",
          "_rev": "mvcc",
          "keys are strings": [1, 2, 3, "four", null],
          "schema free": true
        }
    12. KEY NAMES
        • JSON Object restrictions
            – they’re all strings
        • Couchbase Server reserves these prefixes on top-level keys
            – “_” underscore (also reserved by CouchDB)
            – “$” dollar signs
        • cannot have duplicate key names on a single level
            – {"key": 1, "key": 2} is invalid (thankfully)
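The reserved-prefix rule can be checked in application code before a document is written. This is a rough sketch only: the function name `reservedKeys` is hypothetical, and the list of permitted system fields is an assumption made for illustration, not an exhaustive list taken from the deck.

```javascript
// Assumption: these system fields are legitimate uses of the "_" prefix.
// The list is illustrative, not exhaustive.
const SYSTEM_FIELDS = new Set(["_id", "_rev", "_attachments", "_deleted"]);

// Return the top-level keys that use a reserved prefix ("_" or "$").
// Nested keys are not inspected, since the slide only restricts top-level keys.
function reservedKeys(doc) {
  return Object.keys(doc).filter(
    (key) =>
      (key.startsWith("_") || key.startsWith("$")) && !SYSTEM_FIELDS.has(key)
  );
}

const offending = reservedKeys({
  _id: "contact_1",
  _private: 1,
  $ref: 2,
  nested: { _ok: true },
});
console.log(JSON.stringify(offending)); // ["_private","$ref"]
```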
    13. VALUES
        • JSON restrictions
            – objects, arrays, strings, numbers, booleans, null
        • Consider how you’ll be using it in your app
            – template system constraints
            – Mustache needs arrays of objects vs. arrays of arrays
        • Be careful of numbers as strings
        • Date formats
            – ISO 8601
            – unix timestamps
            – output as an array for grouping reductions
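The last two points, numbers stored as strings and dates emitted as arrays, can be illustrated with a CouchDB-style map function. This is a sketch under assumptions: inside the database `emit` is supplied by the view server, so a stand-in is defined here to make the example run on its own, and the field names come from the thermometer reading example in this deck.

```javascript
// Stand-in for the view server's emit(); it only collects rows so the sketch runs standalone.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: the key is [year, month, day, hour] so reductions can group by prefix,
// and the value is coerced with Number() in case the reading was stored as a string.
function map(doc) {
  if (doc.type !== "reading") return;
  const value = Number(doc.reading);
  if (isNaN(value)) return; // skip unparseable readings instead of feeding them to a reduce
  const m = /^(\d{4})-(\d{2})-(\d{2})T(\d{2})/.exec(doc.ts || "");
  if (!m) return; // skip docs without an ISO 8601 timestamp
  emit([Number(m[1]), Number(m[2]), Number(m[3]), Number(m[4])], value);
}

map({ type: "reading", ts: "2011-10-20T00:32:58", reading: "22.98" });
console.log(JSON.stringify(rows)); // [{"key":[2011,10,20,0],"value":22.98}]
```

With array keys like these, a grouped reduction can be cut at any prefix, for example per day by grouping on the first three elements.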
    14. ONE DOC OR MULTIPLE DOCS?
    15. DECISION MAKERS
        • what does this document look like in real life?
        • how often will I update this?
        • does this need its own revision/transaction path?
            – does all this data need updating together?
            – or rolled back together?
    16. QUERYING
        can I get at the doc’s data easily?
    17. UPDATING
        • When things change, do I want to update the doc?
            – or record the document’s changes as individual docs
        • Frequently written docs might make replication harder due to higher conflict probability
        • Will I have de-normalized portions of data on hand in the client app when updating?
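Recording changes as individual docs means the current state has to be rebuilt from those docs, which the speaker notes call a reconciliation. The sketch below shows the idea in plain JavaScript rather than as a real MapReduce view. The shape of the change docs (`target`, `field`, `value`, `created_at`) and the function name `reconcile` are hypothetical, chosen only for illustration.

```javascript
// Hypothetical change docs: each one records a single field update for a target document.
const changes = [
  { type: "change", target: "post_1", field: "title", value: "Draft", created_at: 100 },
  { type: "change", target: "post_1", field: "status", value: "published", created_at: 300 },
  { type: "change", target: "post_1", field: "title", value: "Final", created_at: 200 },
  { type: "change", target: "post_2", field: "title", value: "Other", created_at: 150 },
];

// Replay the changes for one target in timestamp order; later writes to a field win.
function reconcile(targetId, changeDocs) {
  const result = { _id: targetId };
  changeDocs
    .filter((c) => c.type === "change" && c.target === targetId)
    .sort((a, b) => a.created_at - b.created_at)
    .forEach((c) => {
      result[c.field] = c.value;
    });
  return result;
}

console.log(JSON.stringify(reconcile("post_1", changes)));
// {"_id":"post_1","title":"Final","status":"published"}
```

Each change is its own document with its own ID, so two writers never update the same document. That is the trade the slide describes: fewer conflicts at the cost of a reconciliation step when reading.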
    18. REPLICATION
        the biggie!
    19. REPLICATION
        • Where possible...
            – avoid conflicts
            – leverage small pieces
        • Keep uniqueness and conflicts in balance
    20. CONVENTIONS
        paving the cattle trails
    21. CONVENTIONS & GOOD HABITS
        • "type": "contact"
        • "created_at": Unix timestamp
        • "status": some status for this doc (ex: published)
        • "tags": ["couch", "db", "nosql"]
    22. MORE CONVENTIONS
        • "created_by": username
            – typically from _users database
        • "profile": CouchApp profile contents
            – from _users database
            – stored on the doc for convenience
    23. TOOLS
    24. VALIDATE_DOC_UPDATE
        • optional schema enforcement
        • function(newDoc, storedDoc, userCtx)
        • throw errors to prevent save
        • cannot modify newDoc
        • can enforce field types, values, formats
        • can prevent docs or fields from being changed (created_at, user)
        • runs every time a document is updated
            – even during replication
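A validation function along these lines might look like the sketch below. It follows the CouchDB convention of throwing an object with a `forbidden` property to reject a write. The specific rules (a string `type`, a numeric `created_at` that cannot change) are assumptions based on the conventions slides, and the function is given a name (`validate`) and a small local harness (`tryValidate`, hypothetical) only so it can be exercised outside the database.

```javascript
// Sketch of a validate_doc_update function as it might appear in a design document.
// CouchDB convention: throw {forbidden: message} to reject the write.
function validate(newDoc, storedDoc, userCtx) {
  if (newDoc._deleted) return; // let deletions through
  if (typeof newDoc.type !== "string") {
    throw { forbidden: "type must be a string" };
  }
  if (typeof newDoc.created_at !== "number") {
    throw { forbidden: "created_at must be a Unix timestamp (number)" };
  }
  // storedDoc is null when the document is being created.
  if (storedDoc && storedDoc.created_at !== newDoc.created_at) {
    throw { forbidden: "created_at cannot be changed" };
  }
}

// Local harness (hypothetical): returns "ok" or the rejection message.
function tryValidate(newDoc, storedDoc) {
  try {
    validate(newDoc, storedDoc, { name: "dustin", roles: [] });
    return "ok";
  } catch (err) {
    return err.forbidden;
  }
}

console.log(tryValidate({ type: "contact", created_at: 1319070778 }, null)); // ok
console.log(tryValidate({ type: "contact", created_at: 2 }, { type: "contact", created_at: 1 }));
// created_at cannot be changed
```

The function only inspects `newDoc` and either returns or throws. It never assigns to `newDoc`, in line with the "cannot modify newDoc" point.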
    25. SAMPLE DOCS (IN 2.0)
        HANDY FOR QUICK DOC “SCHEMA” REFERENCING
    26. ADVANCED DOCUMENT DESIGN
        more tools and tricks in this session
    27. EXAMPLES
    28. {
          "_id": "2011-10-20T00:32:58_101D8A2A000000F7",
          "_rev": "1-0c9914a4695b67a4f38cb5f8e345d28f",
          "reading": 22.98,
          "sn": "101D8A2A000000F7",
          "ts": "2011-10-20T00:32:58",
          "type": "reading"
        }
    29. (no text on slide)
    30. {
          "_id": "station_724",
          "_rev": "1-35f3b06a85f2997f365d5e41bcf6967a",
          "code": "BIH",
          "name": "Bairagarh",
          "zone": "WR",
          "doctype": "station",
          "state": "Madhya Pradesh",
          "address": "Bhopal, Madhya Pradesh",
          "id": 724
        }
    31. {
          "_id": "sched_284908",
          "_rev": "1-776a7ceeea990c8eb84d57dc01ea4d2f",
          "arrival": "14:40",
          "halt": "10m",
          "stop_number": "27",
          "station_code": "KOTA",
          "departure": "14:50",
          "train_number": "19039",
          "day": 2,
          "doctype": "schedule",
          "station_name": "Kota Junction",
          "id": 284908,
          "distance_travelled": 909
        }
    32. {
          "_id": "train_97",
          "_rev": "2-640c3360c86405167e3b59a8f463d1c0",
          "return_train": "06617",
          "number": "06618",
          "duration": "11h 45m",
          "id": 97,
          "zone": "SR",
          "date_from": "Nov 23",
          "to_station_code": "CBE",
          "number_of_halts": 13,
          "sleeper": "T",
          "type": "Exp",
          "arrival": "08:00",
          "from_station_code": "NCJ",
          "doctype": "train",
          "departure_days": [
            "Wednesday"
          ],
          "date_to": "Jan 18",
          "first_class": "F",
          "distance": "497 km",
          "third_ac": "T",
          "name": "Nagercoil-Coimbatore Special",
          "from_station_name": "Nagercoil Junction",
          "departure": "20:15",
          "second_ac": "T",
          "classes": "SL 3A 2A",
          "second_sitting": "F",
          "to_station_name": "Coimbatore Main Junction",
          "first_ac": "F"
        }
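The speaker notes for the train document list two encodings that make querying awkward: a day mask such as “---W---” and “T”/“F” strings for booleans. A conversion might look like the sketch below. Both helper names are hypothetical, and the mask layout is an assumption: seven positions starting on Sunday, with any character other than "-" marking a departure day. That layout matches “---W---” mapping to Wednesday, but the deck does not specify it.

```javascript
// Assumption: seven positions, Sunday first; any character other than "-" marks a departure day.
const DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];

// Turn a mask like "---W---" into an array of day names.
function parseDepartureDays(mask) {
  if (typeof mask !== "string" || mask.length !== 7) {
    throw new Error("expected a 7-character day mask");
  }
  return DAY_NAMES.filter((_, i) => mask[i] !== "-");
}

// Convert "T"/"F" flags to real booleans; leave any other value untouched.
function toBoolean(flag) {
  if (flag === "T") return true;
  if (flag === "F") return false;
  return flag;
}

console.log(JSON.stringify(parseDepartureDays("---W---"))); // ["Wednesday"]
console.log(toBoolean("T"), toBoolean("F")); // true false
```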
    33. {
          "_id": "0017dcf0149c130229f35b537df48073",
          "_rev": "7-ef6c5723edafb9828dbc36467493341d",
          "old_id": 7343,
          "height": 1800,
          "keywords": [
            "wrx"
          ],
          "cat": "Public",
          "size": 1711223,
          "tnwidth": 194,
          "exif": {
            "EXIF ApertureValue": "367/100",
            /* ... */
            "EXIF SensingMethod": "One-chip color area",
            "MakerNote AEWarning": "Off",
            "Thumbnail Orientation": "Horizontal (normal)"
          },
          "descr": "My car had fun this weekend.  It got all dirty in the snow and then ran into a sign. ",
          "ts": "2006-02-22T17:47:16",
          "addedby": "dustin",
          "width": 2400,
          "extension": "jpg",
          "tnheight": 146,
          "taken": "2006-02-21",
          "type": "photo",
          "annotations": [
          ],
          "_attachments": {
            "800x600.jpg": {
              "content_type": "image/jpeg",
              "revpos": 4,
              "digest": "md5-HB0NfWVLWeQJn8j79214Fw==",
              "length": 84388,
              "stub": true
            },
        // [...]
    34. ANY QUESTIONS?
        • submit to couchconfbangalore@couchbase.com
        • or ask me:
            • @dlsspy
            • dsal on irc.freenode.net: #couchdb, #couchbase, #membase
            • dustin@couchbase.com
