Entity Relationships in a Document Database at CouchConf Boston

Unlike relational databases, document databases like CouchDB and Couchbase do not directly support entity relationships. This talk will explore patterns of modeling one-to-many and many-to-many entity relationships in a document database. These patterns include using an embedded JSON array, relating documents using identifiers, using a list of keys, and using relationship documents.

Published in: Technology, Business

  1. Entity Relationships in a Document Database: MapReduce Views for SQL Users
  2. Entity: An object defined by its identity and a thread of continuity [1]
      [1] "Entity", Domain-driven Design Community <http://domaindrivendesign.org/node/109>.
  3. Entity Relationship Model
  4. Join vs. Collation
  5. SQL Query Joining Publishers and Books
      SELECT `publisher`.`id`, `publisher`.`name`, `book`.`title`
      FROM `publisher`
      FULL OUTER JOIN `book` ON `publisher`.`id` = `book`.`publisher_id`
      ORDER BY `publisher`.`id`, `book`.`title`;
  6. Joined Result Set (columns: publisher.id | publisher.name | book.title)
      oreilly | OReilly Media | Building iPhone Apps with HTML, CSS, and JavaScript
      oreilly | OReilly Media | CouchDB: The Definitive Guide
      oreilly | OReilly Media | DocBook: The Definitive Guide
      oreilly | OReilly Media | RESTful Web Services
  7. Joined Result Set, labeled Publisher (“left”); same rows as slide 6
  8. Joined Result Set, labeled Publisher (“left”) and Book (“right”); same rows as slide 6
  9. Collated Result Set (columns: key | id | value)
      ["oreilly",0] | "oreilly" | "OReilly Media"
      ["oreilly",1] | "oreilly" | "Building iPhone Apps with HTML, CSS, and JavaScript"
      ["oreilly",1] | "oreilly" | "CouchDB: The Definitive Guide"
      ["oreilly",1] | "oreilly" | "DocBook: The Definitive Guide"
      ["oreilly",1] | "oreilly" | "RESTful Web Services"
  10. Collated Result Set, with the ["oreilly",0] row labeled Publisher; same rows as slide 9
  11. Collated Result Set, with the ["oreilly",0] row labeled Publisher and the ["oreilly",1] rows labeled Books; same rows as slide 9
  12. View Result Sets
      Made up of columns and rows
      Every row has the same three columns:
      • key
      • id
      • value
      Columns can contain a mixture of logical data types
  13. One to Many Relationships
  14. Embedded Entities: Nest related entities within a document
  15. Embedded Entities
      A single document represents the “one” entity
      Nested entities (a JSON array) represent the “many” entities
      Simplest way to create a one to many relationship
  16. Example: Publisher with Nested Books
      {
        "_id":"oreilly",
        "collection":"publisher",
        "name":"OReilly Media",
        "books":[
          { "title":"CouchDB: The Definitive Guide" },
          { "title":"RESTful Web Services" },
          { "title":"DocBook: The Definitive Guide" },
          { "title":"Building iPhone Apps with HTML, CSS, and JavaScript" }
        ]
      }
  17. Map Function
      function(doc) {
        if ("publisher" == doc.collection) {
          emit([doc._id, 0], doc.name);
          for (var i in doc.books) {
            emit([doc._id, 1], doc.books[i].title);
          }
        }
      }
  18. Result Set (columns: key | id | value)
      ["oreilly",0] | "oreilly" | "OReilly Media"
      ["oreilly",1] | "oreilly" | "Building iPhone Apps with HTML, CSS, and JavaScript"
      ["oreilly",1] | "oreilly" | "CouchDB: The Definitive Guide"
      ["oreilly",1] | "oreilly" | "DocBook: The Definitive Guide"
      ["oreilly",1] | "oreilly" | "RESTful Web Services"
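The map function on slide 17 can be tried outside CouchDB with a small stand-in for the view engine. This is only a sketch: `runView` and the module-level `emit` variable are hypothetical helpers (CouchDB supplies `emit` itself), the sort is a simplified collation that handles only `[string, number]` keys, and the publisher document is shortened to two books.

```javascript
// Hypothetical stand-in for the view engine; CouchDB provides emit() itself.
let emit;

function runView(map, docs) {
  const rows = [];
  for (const doc of docs) {
    // Every row carries the _id of the document that was being mapped.
    emit = (key, value) => rows.push({ key, id: doc._id, value });
    map(doc);
  }
  // Simplified collation for [string, number] keys; ties fall back to the row id.
  const cmp = (a, b) => (a < b ? -1 : a > b ? 1 : 0);
  return rows.sort((a, b) =>
    cmp(a.key[0], b.key[0]) || a.key[1] - b.key[1] || cmp(a.id, b.id));
}

// Map function as shown on slide 17.
function map(doc) {
  if ("publisher" == doc.collection) {
    emit([doc._id, 0], doc.name);
    for (var i in doc.books) {
      emit([doc._id, 1], doc.books[i].title);
    }
  }
}

const publisher = {
  _id: "oreilly",
  collection: "publisher",
  name: "OReilly Media",
  books: [
    { title: "CouchDB: The Definitive Guide" },
    { title: "RESTful Web Services" },
  ],
};

const rows = runView(map, [publisher]);
rows.forEach((row) => console.log(JSON.stringify(row)));
```

Because the books are nested, every row has the publisher's `_id` in its id column; the `0` and `1` in the key are what separate the publisher row from its book rows.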
  19. Limitations
      Only works if there aren’t a large number of related entities:
      • Too many nested entities can result in very large documents
      • Slow to transfer between client and server
      • Unwieldy to modify
      • Time-consuming to index
  20. Related Documents: Reference an entity by its identifier
  21. Related Documents
      A document representing the “one” entity
      Separate documents for each “many” entity
      Each “many” entity references its related “one” entity by the “one” entity’s document identifier
      Makes for smaller documents
      Reduces the probability of document update conflicts
  22. Example: Publisher
      { "_id":"oreilly", "collection":"publisher", "name":"OReilly Media" }
  23. Example: Related Book
      { "_id":"9780596155896", "collection":"book", "title":"CouchDB: The Definitive Guide", "publisher":"oreilly" }
  24. Map Function
      function(doc) {
        if ("publisher" == doc.collection) {
          emit([doc._id, 0], doc.name);
        }
        if ("book" == doc.collection) {
          emit([doc.publisher, 1], doc.title);
        }
      }
  25. Result Set (columns: key | id | value)
      ["oreilly",0] | "oreilly" | "OReilly Media"
      ["oreilly",1] | "9780596155896" | "CouchDB: The Definitive Guide"
      ["oreilly",1] | "9780596529260" | "RESTful Web Services"
      ["oreilly",1] | "9780596805791" | "Building iPhone Apps with HTML, CSS, and JavaScript"
      ["oreilly",1] | "9781565925809" | "DocBook: The Definitive Guide"
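A sketch of how the Related Documents view collates, again using a hypothetical `runView` helper and `emit` variable in place of CouchDB's view engine and a simplified sort for `[string, number]` keys. The documents are taken from the slides; the input order is deliberately shuffled to show that the key, not insertion order, decides where rows land.

```javascript
let emit; // hypothetical stand-in; CouchDB provides emit() to map functions

function runView(map, docs) {
  const rows = [];
  for (const doc of docs) {
    emit = (key, value) => rows.push({ key, id: doc._id, value });
    map(doc);
  }
  const cmp = (a, b) => (a < b ? -1 : a > b ? 1 : 0);
  return rows.sort((a, b) =>
    cmp(a.key[0], b.key[0]) || a.key[1] - b.key[1] || cmp(a.id, b.id));
}

// Map function from slide 24: books are keyed by their publisher's id.
function map(doc) {
  if ("publisher" == doc.collection) {
    emit([doc._id, 0], doc.name);
  }
  if ("book" == doc.collection) {
    emit([doc.publisher, 1], doc.title);
  }
}

const docs = [
  { _id: "9780596529260", collection: "book", title: "RESTful Web Services", publisher: "oreilly" },
  { _id: "oreilly", collection: "publisher", name: "OReilly Media" },
  { _id: "9780596155896", collection: "book", title: "CouchDB: The Definitive Guide", publisher: "oreilly" },
];

const rows = runView(map, docs);
rows.forEach((row) => console.log(JSON.stringify(row)));
```

Unlike the embedded pattern, the id column now differs from the key: each book row keeps its own document id while sorting under its publisher.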
  26. Limitations
      When retrieving the entity on the “right” side of the relationship, one cannot include any data from the entity on the “left” side of the relationship without the use of an additional query
      Only works for one to many relationships
  27. Many to Many Relationships
  28. List of Keys: Reference entities by their identifiers
  29. List of Keys
      A document representing each “many” entity on the “left” side of the relationship
      Separate documents for each “many” entity on the “right” side of the relationship
      Each “many” entity on the “right” side of the relationship maintains a list of document identifiers for its related “many” entities on the “left” side of the relationship
  30. Books and Related Authors
  31. Example: Book
      { "_id":"9780596805029", "collection":"book", "title":"DocBook 5: The Definitive Guide" }
  32. Example: Book
      { "_id":"9781565920514", "collection":"book", "title":"Making TeX Work" }
  33. Example: Book
      { "_id":"9781565925809", "collection":"book", "title":"DocBook: The Definitive Guide" }
  34. Example: Author
      { "_id":"muellner", "collection":"author", "name":"Leonard Muellner", "books":[ "9781565925809" ] }
  35. Example: Author
      { "_id":"walsh", "collection":"author", "name":"Norman Walsh", "books":[ "9780596805029", "9781565925809", "9781565920514" ] }
  36. Map Function
      function(doc) {
        if ("book" == doc.collection) {
          emit([doc._id, 0], doc.title);
        }
        if ("author" == doc.collection) {
          for (var i in doc.books) {
            emit([doc.books[i], 1], doc.name);
          }
        }
      }
  37. Result Set (columns: key | id | value)
      ["9780596805029",0] | "9780596805029" | "DocBook 5: The Definitive Guide"
      ["9780596805029",1] | "walsh" | "Norman Walsh"
      ["9781565920514",0] | "9781565920514" | "Making TeX Work"
      ["9781565920514",1] | "walsh" | "Norman Walsh"
      ["9781565925809",0] | "9781565925809" | "DocBook: The Definitive Guide"
      ["9781565925809",1] | "muellner" | "Leonard Muellner"
      ["9781565925809",1] | "walsh" | "Norman Walsh"
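A client usually has to fold a collated result set like the one on slide 37 back into entities. The sketch below shows one way to do that; `groupByEntity` is a hypothetical client-side helper, not a CouchDB API, and it assumes the rows arrive already sorted, with each `[id, 0]` row directly followed by its `[id, 1]` rows.

```javascript
// Rows copied from the result set on slide 37 (key, id, value).
const rows = [
  { key: ["9780596805029", 0], id: "9780596805029", value: "DocBook 5: The Definitive Guide" },
  { key: ["9780596805029", 1], id: "walsh", value: "Norman Walsh" },
  { key: ["9781565920514", 0], id: "9781565920514", value: "Making TeX Work" },
  { key: ["9781565920514", 1], id: "walsh", value: "Norman Walsh" },
  { key: ["9781565925809", 0], id: "9781565925809", value: "DocBook: The Definitive Guide" },
  { key: ["9781565925809", 1], id: "muellner", value: "Leonard Muellner" },
  { key: ["9781565925809", 1], id: "walsh", value: "Norman Walsh" },
];

// Hypothetical helper: a 0 in the key starts a new entity, a 1 attaches a
// related value to the entity that shares the same first key element.
function groupByEntity(sortedRows) {
  const entities = [];
  for (const row of sortedRows) {
    const last = entities[entities.length - 1];
    if (row.key[1] === 0) {
      entities.push({ id: row.key[0], value: row.value, related: [] });
    } else if (last && last.id === row.key[0]) {
      last.related.push(row.value);
    }
    // Related rows whose entity row is missing are skipped.
  }
  return entities;
}

const books = groupByEntity(rows);
console.log(JSON.stringify(books, null, 2));
```

Skipping orphaned rows is a design choice for this sketch; since nothing enforces referential integrity, an author may list a book identifier that has no document.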
  38. Authors and Related Books
  39. Map Function
      function(doc) {
        if ("author" == doc.collection) {
          emit([doc._id, 0], doc.name);
          for (var i in doc.books) {
            emit([doc._id, 1], {"_id":doc.books[i]});
          }
        }
      }
  40. Result Set (columns: key | id | value)
      ["muellner",0] | "muellner" | "Leonard Muellner"
      ["muellner",1] | "muellner" | {"_id":"9781565925809"}
      ["walsh",0] | "walsh" | "Norman Walsh"
      ["walsh",1] | "walsh" | {"_id":"9780596805029"}
      ["walsh",1] | "walsh" | {"_id":"9781565920514"}
      ["walsh",1] | "walsh" | {"_id":"9781565925809"}
  41. Including Docs: include_docs=true (columns: key | id | value | doc (truncated))
      ["muellner",0] | "muellner" | … | {"name":"Leonard Muellner"}
      ["muellner",1] | "muellner" | … | {"title":"DocBook: The Definitive Guide"}
      ["walsh",0] | "walsh" | … | {"name":"Norman Walsh"}
      ["walsh",1] | "walsh" | … | {"title":"DocBook 5: The Definitive Guide"}
      ["walsh",1] | "walsh" | … | {"title":"Making TeX Work"}
      ["walsh",1] | "walsh" | … | {"title":"DocBook: The Definitive Guide"}
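The book titles in slide 41 appear because, with include_docs=true, CouchDB returns the document named by the `_id` inside an emitted object value rather than the document that emitted the row. The following is a rough in-memory imitation of that lookup; `includeDocs` and `docsById` are hypothetical names, and only two documents from the slides are loaded.

```javascript
// Hypothetical in-memory "database", keyed by document id.
const docsById = {
  muellner: { _id: "muellner", collection: "author", name: "Leonard Muellner" },
  "9781565925809": { _id: "9781565925809", collection: "book", title: "DocBook: The Definitive Guide" },
};

// First two rows of the result set on slide 40.
const rows = [
  { key: ["muellner", 0], id: "muellner", value: "Leonard Muellner" },
  { key: ["muellner", 1], id: "muellner", value: { _id: "9781565925809" } },
];

// Imitates include_docs=true: follow value._id when the value is an object
// that has one, otherwise fall back to the emitting document's id.
function includeDocs(viewRows, docs) {
  return viewRows.map((row) => {
    const linked =
      row.value !== null && typeof row.value === "object" && "_id" in row.value;
    const docId = linked ? row.value._id : row.id;
    return { ...row, doc: docs[docId] || null };
  });
}

const withDocs = includeDocs(rows, docsById);
withDocs.forEach((row) => console.log(JSON.stringify(row.key), JSON.stringify(row.doc)));
```

The first row resolves to the author document and the second to the referenced book, which is how one view can return data from both sides of the relationship.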
  42. Or, we can reverse the references…
  43. Example: Author
      { "_id":"muellner", "collection":"author", "name":"Leonard Muellner" }
  44. Example: Author
      { "_id":"walsh", "collection":"author", "name":"Norman Walsh" }
  45. Example: Book
      { "_id":"9780596805029", "collection":"book", "title":"DocBook 5: The Definitive Guide", "authors":[ "walsh" ] }
  46. Example: Book
      { "_id":"9781565920514", "collection":"book", "title":"Making TeX Work", "authors":[ "walsh" ] }
  47. Example: Book
      { "_id":"9781565925809", "collection":"book", "title":"DocBook: The Definitive Guide", "authors":[ "muellner", "walsh" ] }
  48. Map Function
      function(doc) {
        if ("author" == doc.collection) {
          emit([doc._id, 0], doc.name);
        }
        if ("book" == doc.collection) {
          for (var i in doc.authors) {
            emit([doc.authors[i], 1], doc.title);
          }
        }
      }
  49. Result Set (columns: key | id | value)
      ["muellner",0] | "muellner" | "Leonard Muellner"
      ["muellner",1] | "9781565925809" | "DocBook: The Definitive Guide"
      ["walsh",0] | "walsh" | "Norman Walsh"
      ["walsh",1] | "9780596805029" | "DocBook 5: The Definitive Guide"
      ["walsh",1] | "9781565920514" | "Making TeX Work"
      ["walsh",1] | "9781565925809" | "DocBook: The Definitive Guide"
  50. Limitations
      Queries from the “right” side of the relationship cannot include any data from entities on the “left” side of the relationship (without the use of include_docs)
      A document representing an entity with lots of relationships could become quite large
  51. Relationship Documents: Create a document to represent each individual relationship
  52. Relationship Documents
      A document representing each “many” entity on the “left” side of the relationship
      Separate documents for each “many” entity on the “right” side of the relationship
      Neither the “left” nor the “right” side of the relationship contains any direct reference to the other
      For each distinct relationship, a separate document includes the document identifiers for both the “left” and “right” sides of the relationship
  53. Example: Book
      { "_id":"9780596805029", "collection":"book", "title":"DocBook 5: The Definitive Guide" }
  54. Example: Book
      { "_id":"9781565920514", "collection":"book", "title":"Making TeX Work" }
  55. Example: Book
      { "_id":"9781565925809", "collection":"book", "title":"DocBook: The Definitive Guide" }
  56. Example: Author
      { "_id":"muellner", "collection":"author", "name":"Leonard Muellner" }
  57. Example: Author
      { "_id":"walsh", "collection":"author", "name":"Norman Walsh" }
  58. Example: Relationship Document
      { "_id":"44005f2c", "collection":"book-author", "book":"9780596805029", "author":"walsh" }
  59. Example: Relationship Document
      { "_id":"44005f72", "collection":"book-author", "book":"9781565920514", "author":"walsh" }
  60. Example: Relationship Document
      { "_id":"44006720", "collection":"book-author", "book":"9781565925809", "author":"muellner" }
  61. Example: Relationship Document
      { "_id":"44006b0d", "collection":"book-author", "book":"9781565925809", "author":"walsh" }
  62. Books and Related Authors
  63. Map Function
      function(doc) {
        if ("book" == doc.collection) {
          emit([doc._id, 0], doc.title);
        }
        if ("book-author" == doc.collection) {
          emit([doc.book, 1], {"_id":doc.author});
        }
      }
  64. Result Set (columns: key | id | value)
      ["9780596805029",0] | "9780596805029" | "DocBook 5: The Definitive Guide"
      ["9780596805029",1] | "44005f2c" | {"_id":"walsh"}
      ["9781565920514",0] | "9781565920514" | "Making TeX Work"
      ["9781565920514",1] | "44005f72" | {"_id":"walsh"}
      ["9781565925809",0] | "9781565925809" | "DocBook: The Definitive Guide"
      ["9781565925809",1] | "44006720" | {"_id":"muellner"}
      ["9781565925809",1] | "44006b0d" | {"_id":"walsh"}
  65. Including Docs: include_docs=true (columns: key | id | value | doc (truncated))
      ["9780596805029",0] | … | … | {"title":"DocBook 5: The Definitive Guide"}
      ["9780596805029",1] | … | … | {"name":"Norman Walsh"}
      ["9781565920514",0] | … | … | {"title":"Making TeX Work"}
      ["9781565920514",1] | … | … | {"name":"Norman Walsh"}
      ["9781565925809",0] | … | … | {"title":"DocBook: The Definitive Guide"}
      ["9781565925809",1] | … | … | {"name":"Leonard Muellner"}
      ["9781565925809",1] | … | … | {"name":"Norman Walsh"}
  66. Authors and Related Books
  67. Map Function
      function(doc) {
        if ("author" == doc.collection) {
          emit([doc._id, 0], doc.name);
        }
        if ("book-author" == doc.collection) {
          emit([doc.author, 1], {"_id":doc.book});
        }
      }
  68. Result Set (columns: key | id | value)
      ["muellner",0] | "muellner" | "Leonard Muellner"
      ["muellner",1] | "44006720" | {"_id":"9781565925809"}
      ["walsh",0] | "walsh" | "Norman Walsh"
      ["walsh",1] | "44005f2c" | {"_id":"9780596805029"}
      ["walsh",1] | "44005f72" | {"_id":"9781565920514"}
      ["walsh",1] | "44006b0d" | {"_id":"9781565925809"}
  69. Including Docs: include_docs=true (columns: key | id | value | doc (truncated))
      ["muellner",0] | … | … | {"name":"Leonard Muellner"}
      ["muellner",1] | … | … | {"title":"DocBook: The Definitive Guide"}
      ["walsh",0] | … | … | {"name":"Norman Walsh"}
      ["walsh",1] | … | … | {"title":"DocBook 5: The Definitive Guide"}
      ["walsh",1] | … | … | {"title":"Making TeX Work"}
      ["walsh",1] | … | … | {"title":"DocBook: The Definitive Guide"}
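With relationship documents, the same four book-author documents feed both views; only the key and the linked `_id` swap places. The sketch below runs just the relationship branch of the two map functions from slides 63 and 67. `collect` and the module-level `emit` are hypothetical stand-ins for the view engine, and the rows are left unsorted.

```javascript
let emit; // hypothetical stand-in; CouchDB provides emit() to map functions

function collect(map, docs) {
  const rows = [];
  for (const doc of docs) {
    emit = (key, value) => rows.push({ key, id: doc._id, value });
    map(doc);
  }
  return rows;
}

// Relationship branch of the "Books and Related Authors" map function.
function authorsByBook(doc) {
  if ("book-author" == doc.collection) {
    emit([doc.book, 1], { _id: doc.author });
  }
}

// Relationship branch of the "Authors and Related Books" map function.
function booksByAuthor(doc) {
  if ("book-author" == doc.collection) {
    emit([doc.author, 1], { _id: doc.book });
  }
}

// Relationship documents from slides 58 to 61.
const relationships = [
  { _id: "44005f2c", collection: "book-author", book: "9780596805029", author: "walsh" },
  { _id: "44005f72", collection: "book-author", book: "9781565920514", author: "walsh" },
  { _id: "44006720", collection: "book-author", book: "9781565925809", author: "muellner" },
  { _id: "44006b0d", collection: "book-author", book: "9781565925809", author: "walsh" },
];

const byBook = collect(authorsByBook, relationships);
const byAuthor = collect(booksByAuthor, relationships);
console.log(byBook.length, byAuthor.length);
```

In both views the id column holds the relationship document's id, so neither a book nor an author document has to change when a relationship is added or removed.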
  70. Limitations
      Queries can only contain data from the “left” or “right” side of the relationship (without the use of include_docs)
      Maintaining relationship documents may require more work
  71. Final Thoughts
  72. Document Databases Compared to Relational Databases
      Document databases have no tables (and therefore no columns)
      Indexes (views) are queried directly, instead of being used to optimize more generalized queries
      Result set columns can contain a mix of logical data types
      No built-in concept of relationships between documents
      Related entities can be embedded in a document, referenced from a document, or both
  73. Caveats
      No referential integrity
      No atomic transactions across document boundaries
      Some patterns may involve denormalized (i.e. redundant) data
      Data inconsistencies are inevitable (i.e. eventual consistency)
      Consider the implications of replication: what may seem consistent with one database may not be consistent across nodes (e.g. referencing entities that don’t yet exist on the node)
  74. Additional Techniques
      Use the startkey and endkey parameters to retrieve one entity and its related entities: startkey=["9781565925809"]&endkey=["9781565925809",{}]
      Define a reduce function and use grouping levels
      Use UUIDs rather than natural keys for better performance
      Use the bulk document API when writing Relationship Documents
      When using the List of Keys or Relationship Documents patterns, denormalize data so that you can have data from the “right” and “left” side of the relationship within your query results
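The startkey and endkey pair from slide 74 has to be JSON-encoded and then URL-encoded before it goes into a request. A minimal sketch of that step follows; `viewRangeQuery` is a hypothetical helper, and the database, design document, and view names in the path are placeholders rather than names from the talk.

```javascript
// Hypothetical helper: builds the query string that selects one entity and
// all of its related rows from a view keyed by [entityId, 0 or 1].
function viewRangeQuery(entityId) {
  const startkey = JSON.stringify([entityId]);
  // In CouchDB's collation, objects sort after numbers and strings, so
  // [entityId, {}] lies beyond every [entityId, 0] and [entityId, 1] key.
  const endkey = JSON.stringify([entityId, {}]);
  return (
    "startkey=" + encodeURIComponent(startkey) +
    "&endkey=" + encodeURIComponent(endkey)
  );
}

const query = viewRangeQuery("9781565925809");

// Placeholder path: database "books", design document "example",
// view "books_and_authors" are assumptions made for illustration.
const url = "/books/_design/example/_view/books_and_authors?" + query;
console.log(url);
console.log(decodeURIComponent(query));
```

The one-element start key works because a shorter array collates before any longer array that begins with the same elements.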
  75. Cheat Sheet (columns: Embedded Entities | Related Documents | List of Keys | Relationship Documents)
      One to Many     | ✓ | ✓ |   |
      Many to Many    |   |   | ✓ | ✓
      <= N* Relations | ✓ |   | ✓ |
      > N* Relations  |   | ✓ |   | ✓
      * where N is a large number for your system
  76. http://oreilly.com/catalog/9781449303129/
      http://oreilly.com/catalog/9781449303433/
  77. Thank You
      @BradleyHolt
      http://bradley-holt.com
      bradley.holt@foundline.com
      Copyright © 2011-2012 Bradley Holt. All rights reserved.