CouchConf London: Developing II Advanced Document Design

  • Modeling real-world data in software is nothing new. With Couchbase, a document-oriented system, data modeling is fairly easy: we aren’t constrained by schemas or by the need to fit everything into relational algebra.
    Instead, we only need to think about
  • No easy way to query: How often do you need to run a complex join query? When data is denormalized for speed, how many complex queries are you really running?
    "Stop thinking in terms of joins and queries; that is the ticket to speed."
    Not handling bank transactions: We can live with a small percentage of concurrency issues. Err on the side of making the player happy.
  • If you expect a very large number of comments, or want to display them threaded, you can do so by extending the list technique discussed earlier. This lets us build quite complex arrangements of data across various keys. Because the keys are distributed throughout the cluster, the load is spread across the cluster nodes.
    In contrast, with a typical relational model you may have to colocate the comments and blog posts on a single shard so that you can use join queries. This creates hotspots in the system, and resharding to redistribute the data becomes a manual process.
    Because of the active cache management in Couchbase Server, the hottest data stays in memory, so popular items are served very quickly.
  • Using a view over all transactions (assuming they are in a separate bucket or carry type information), we can easily query for individual balances.

Transcript

  • 1. Advanced Document Design Jan Lehnardt jan@couchbase.com
  • 2. First Principles
  • 3. First Principles: All Hail JSON. Model your data for your app, not for the database. Group data together in types of documents.
  • 4. Example of A User Profile
  • 5. Example: Data Profile for a User
    Start with a username, "user7778", as the key. Use this as a common prefix for related data.
    {
      "_id": "user7778_profile",
      "user_id": 7778,
      "password": "a3191b7",
      "name": "Alan User",
      "nicknames": ["Al", "Fred"],
      "sign_up": 1224612317,
      "last_login": 1245613101
    }
    When building the user’s session information, fetch these items.
  • 6. Example: Data Profile for a User
    Create related documents by prefixing with the same username.
    {
      "_id": "user7778_friends",
      "friends": ["joe", "ingenthr", "toru"]
    }
    Can extend this concept further.
    {
      "_id": "user7778_history",
      "history": [
        {"action": "post", "timestamp": 1245614000},
        {"action": "comment", "timestamp": 1245623563}
      ]
    }
    Other data records for this user, building out
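The key-prefix idea from slides 5 and 6 can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the `bucket` Map stands in for a real Couchbase bucket, and `relatedKeys` and `loadSession` are hypothetical helper names, not part of any client library.

```javascript
// In-memory stand-in for a bucket (assumption); a real client would do network gets.
const bucket = new Map([
  ["user7778_profile", { user_id: 7778, name: "Alan User", nicknames: ["Al", "Fred"] }],
  ["user7778_friends", { friends: ["joe", "ingenthr", "toru"] }],
  ["user7778_history", { history: [{ action: "post", timestamp: 1245614000 }] }],
]);

// Hypothetical helper: derive every related key from the common prefix.
function relatedKeys(userKey, suffixes = ["profile", "friends", "history"]) {
  return suffixes.map((suffix) => `${userKey}_${suffix}`);
}

// Hypothetical helper: collect the related documents needed for a session.
function loadSession(userKey) {
  const session = {};
  for (const key of relatedKeys(userKey)) {
    // Missing documents are skipped rather than treated as errors.
    if (bucket.has(key)) {
      session[key.slice(userKey.length + 1)] = bucket.get(key);
    }
  }
  return session;
}

const session = loadSession("user7778");
console.log(Object.keys(session));
```

Because every key is computable from the username alone, no lookup query is needed to find a user's documents; the application only has to know the suffix convention.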
  • 7. The Famous Blog Example
  • 8. Modeling a Blog Post. Blog posts to comments: a 1:N relationship. [Diagram: one blog post linked to several Comment boxes]
  • 9. Blog Data in Couchbase
    Start out easy:
    - Make model objects as JSON documents
    - Use the previously learned _id
    - Add a unique part
    - Add the rest of your data
    {
      "_id": "user7778_post_e11eadea",
      "title": "It’s a dogs life in the British Army",
      "body": "Well, well well…",
      "posted": 1234533356
    }
  • 10. The Front Page
    A list document with references:
    {
      "_id": "user7778_blog",
      "posts": [
        "user7778_post_e11eadea",
        "user7778_post_44ff337c",
        "user7778_post_037ababa"
      ]
    }
  • 11. Comments as Lists of Documents
    Two simple steps:
    - List items as documents
    - A list document with references
    {"_id": "user7778_comment_f4ea7b17", "text": "First"}
    {"_id": "user7778_comment_f00db44d", "text": "No!"}
    {"_id": "user7778_comment_deadbeef", "text": "I agree."}
    {
      "_id": "user7778_post_e11eadea_comments",
      "comments": [
        "user7778_comment_f4ea7b17",
        "user7778_comment_f00db44d",
        "user7778_comment_deadbeef"
      ]
    }
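The list-document pattern from slides 10 and 11 might look roughly like this in application code. It is a minimal sketch, assuming an in-memory `bucket` Map in place of a real cluster; `getMulti`, `commentsForPost`, and `addComment` are hypothetical names chosen for the example.

```javascript
// In-memory stand-in for a bucket (assumption), seeded with the slide's documents.
const bucket = new Map([
  ["user7778_comment_f4ea7b17", { text: "First" }],
  ["user7778_comment_f00db44d", { text: "No!" }],
  ["user7778_post_e11eadea_comments", {
    comments: ["user7778_comment_f4ea7b17", "user7778_comment_f00db44d"],
  }],
]);

// Hypothetical multi-get: resolve a list of keys, dropping any that are missing.
function getMulti(keys) {
  return keys.map((key) => bucket.get(key)).filter((doc) => doc !== undefined);
}

// Step 1: fetch the list document. Step 2: fetch the documents it references.
function commentsForPost(postKey) {
  const list = bucket.get(`${postKey}_comments`);
  return list ? getMulti(list.comments) : [];
}

// Store the comment itself, then append its key to the post's list document.
function addComment(postKey, commentKey, text) {
  bucket.set(commentKey, { text });
  const listKey = `${postKey}_comments`;
  const list = bucket.get(listKey) || { comments: [] };
  bucket.set(listKey, { comments: [...list.comments, commentKey] });
}

addComment("user7778_post_e11eadea", "user7778_comment_deadbeef", "I agree.");
console.log(commentsForPost("user7778_post_e11eadea").map((c) => c.text));
```

The read-modify-write in `addComment` is safe here only because the sketch is single-threaded; against a real cluster, concurrent writers would need a compare-and-swap or similar guard on the list document.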
  • 12. Threaded Comments
  • 13–25. Threaded Comments
    You can imagine how to take this to a threaded list.
    [Diagram, built up step by step across slides 13–25: Blog, List, First comment, Reply to comment, List, More comments]
    Advantages:
    • Only fetch the data when you need it
    • For example, rendering part of the page, jQuery style
    • Spread the data (and thus the load) across the entire cluster
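One way to extend the list technique to threads is to give each comment its own list document of replies. The sketch below assumes a naming convention the slides do not spell out, namely that replies to a comment live under `<commentKey>_comments`; the `store` Map and the `loadThread` helper are likewise hypothetical.

```javascript
// In-memory stand-in for the cluster (assumption).
const store = new Map([
  ["user7778_post_e11eadea_comments", {
    comments: ["user7778_comment_f4ea7b17", "user7778_comment_deadbeef"],
  }],
  ["user7778_comment_f4ea7b17", { text: "First" }],
  // Assumed convention: replies to a comment are listed under <commentKey>_comments.
  ["user7778_comment_f4ea7b17_comments", { comments: ["user7778_comment_f00db44d"] }],
  ["user7778_comment_f00db44d", { text: "No!" }],
  ["user7778_comment_deadbeef", { text: "I agree." }],
]);

// Recursively resolve a list document into a tree of comments.
// maxDepth bounds the recursion so a malformed cycle cannot loop forever.
function loadThread(parentKey, depth = 0, maxDepth = 10) {
  const list = store.get(`${parentKey}_comments`);
  if (!list || depth >= maxDepth) return [];
  return list.comments
    .filter((key) => store.has(key))
    .map((key) => ({
      key,
      text: store.get(key).text,
      replies: loadThread(key, depth + 1, maxDepth),
    }));
}

const thread = loadThread("user7778_post_e11eadea");
console.log(JSON.stringify(thread, null, 2));
```

Passing a smaller `maxDepth` fetches only the top levels, which matches the "only fetch the data when you need it" advantage: deeper replies can be loaded later, when the reader expands a thread.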
  • 26. Schema-less Modeling of Data
    ● No ALTER TABLE
    ● Add new "fields" to any object at any time
      ● Improves development speed
    ● JSON data is interoperable
      ● Exchange with other systems
      ● Web-native
    ● 2.0 features
      ● Dynamic queries with views
  • 27. Couchbase Server 2.0: Querying and Aggregation with Views
  • 28. The Bank
  • 29. The Bank. Dun, dun, dun duuuhn.
  • 30. Exchanging Virtual Currency
    • Simulate the way federated systems work
    • Examples: checking accounts, credit card transactions
    • Create a record per transaction
    • Use views to reconcile the results of the transaction
    • If results don’t reconcile, there is a missing transaction or
    { "from": "matt", "to": "jchris", "coins": 10 }
    { "from": "matt", "to": "perry", "coins": 20 }
    { "from": "jchris", "to": "perry", "coins": 30 }
    { "from": "jchris", "to": "matt", "coins": 10 }
    { "from": "perry", "to": "matt", "coins": 30 }
    { "from": "jchris", "to": "perry", "coins": 20 }
  • 31. Exchanging Virtual Currency
    { "from": "matt", "to": "jchris", "coins": 10 }
    { "from": "matt", "to": "perry", "coins": 20 }
    { "from": "jchris", "to": "perry", "coins": 30 }
    Mapper:
    function(transaction) {
      // Debit the sender and credit the receiver by the same number of coins.
      emit(transaction.from, transaction.coins * -1);
      emit(transaction.to, transaction.coins);
    }
    Reduce:
    _sum
  • 32. The View Result {"key":"jchris", "value": 10} {"key":"jchris", "value":-30} {"key":"jchris", "value":-10} {"key":"jchris", "value":-20} {"key":"matt", "value":-10} {"key":"matt", "value":-20} {"key":"matt", "value": 10} {"key":"matt", "value": 30} {"key":"perry", "value": 20} {"key":"perry", "value": 30} {"key":"perry", "value":-30} {"key":"perry", "value": 20}
  • 33. Group! {"key":"jchris", "value":-50} {"key":"matt", "value":10} {"key":"perry", "value":40}
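The map, `_sum`, and group steps from slides 31 to 33 can be reproduced outside the server to check the numbers. This is a plain JavaScript simulation, not the Couchbase view engine: `emit` is passed in as a parameter, and `runView` and `groupSum` are hypothetical helpers that mimic key-sorted rows and a grouped sum. It reads the `coins` field that the transaction documents carry.

```javascript
// The six transaction documents from slide 30.
const transactions = [
  { from: "matt", to: "jchris", coins: 10 },
  { from: "matt", to: "perry", coins: 20 },
  { from: "jchris", to: "perry", coins: 30 },
  { from: "jchris", to: "matt", coins: 10 },
  { from: "perry", to: "matt", coins: 30 },
  { from: "jchris", to: "perry", coins: 20 },
];

// Map step: one debit row for the sender, one credit row for the receiver.
function map(doc, emit) {
  emit(doc.from, doc.coins * -1);
  emit(doc.to, doc.coins);
}

// Hypothetical stand-in for the view engine: run map over all docs, sort rows by key.
function runView(docs) {
  const rows = [];
  for (const doc of docs) {
    map(doc, (key, value) => rows.push({ key, value }));
  }
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return rows;
}

// Hypothetical stand-in for _sum with grouping: add up the values per key.
function groupSum(rows) {
  const totals = new Map();
  for (const { key, value } of rows) {
    totals.set(key, (totals.get(key) || 0) + value);
  }
  return [...totals].map(([key, value]) => ({ key, value }));
}

const balances = groupSum(runView(transactions));
console.log(balances);
// Every coin debited is credited elsewhere, so the balances should sum to zero.
const net = balances.reduce((sum, row) => sum + row.value, 0);
console.log("net:", net);
```

The grouped result matches slide 33: jchris at -50, matt at 10, perry at 40. A non-zero `net` would indicate that the records do not reconcile.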
  • 34. Questions?
  • 35. Thanks! Jan Lehnardt jan@couchbase.com