Developing II:  Advanced Document Design

  • Modeling real-world data in software is nothing new. In the case of Couchbase, a document-oriented system, data modeling is fairly easy. We aren’t constrained by schemas or by the need to fit things into relational algebra. Instead, we only need to think about
  • No easy way to query: How often do you need to run a complex join query? When data is denormalized for speed, how many complex queries are you really running? “To stop thinking in terms of joins and queries is the ticket to speed.” Not handling bank transactions: We can live with a small percentage of concurrency issues. Err on the side of making the player happy.
  • To represent game data in our system, we simply represent objects as JSON. We then determine the key for an object using the class name or type of the object and a unique ID. In fact, Couchbase Server can serve up sequence numbers pretty easily by using its built-in increment function. To represent a one-to-many relationship, we can keep a small list of the related keys. This lets us stay close to normalized while being slightly denormalized. The code for building out our graph of related items will be quite simple, and because the data is distributed and Couchbase caches hot items, it should be very fast.
  • TODO: add artwork from screenshots.
  • Here we see the three different objects in their JSON document form. These are very simple documents, but they show the concept. Each document’s key (also known as the _id) is the object’s class followed by a serial number. Each player has a plant list (we can simply create one if the player does not yet have plants), and we store that plant list as an array.
  • On this slide, we see a sample blog post in JSON. It has most of the fields you’d expect to have in a blog entry. The one field that is a little different is the comments field. One approach here would be to store all comments on a blog post in the post itself. This is simple, denormalized, and lets us get the data in one shot. There are a couple of downsides though. One is that we may not want to display all of the comments. If I’m showing multiple blogs, maybe blog summaries on a given page, I don’t want to display the comments. The other is that some popular blogs, from popular bloggers, may have hundreds or thousands of comments. We don’t want to display them all at once, and we may not want to grab such a large amount of data. We can reapply the same denormalization technique we encountered earlier.
  • As you see here, rather than storing comments inline, we can separate them to a comment list, and then from there to individual comments. Comments in this case can be threaded. You may wonder about the performance of such an arrangement because of all of the traffic across the wire. First off, in a distributed system the data may not be local anyway, so we’ll just make it easier by having the client system fetch the data from the server.
  • If you’re expecting a very large number of comments, or want to display them threaded, you can easily imagine doing so by extending the list technique discussed earlier. This lets us build very complex arrangements of the data across various keys. Since the keys distribute throughout the cluster, we spread the load out among the cluster nodes. In contrast, with a typical relational model, you may have to colocate the comments and blogs on a single shard so you can use join queries. This creates hotspots in the system, and resharding to redistribute the data becomes a manual process. Because of the active cache management in Couchbase Server, the hottest data will be in memory, so popular items are served very quickly.
  • Using a view over all transactions (assuming they’re in a separate bucket or carry type information), we can easily query for individual balances.
  • First we’ll do a demonstration of finding all of the items owned by a particular player through a view. Then we will do a demonstration of showing a leaderboard from the gamesim data previously shown.

Transcript

  • 1. Advanced Document Design J Chris Anderson Mobile
  • 2. Modeling Data
    When considering how to model data for a given application, you should…
      • Think of a logical container for the data
      • Think of how data groups together
    You may notice that…
      • From a software development standpoint, this maps better to the way you structure the data
      • It’s a more natural way of representing the entities the application will be handling
  • 3. Example of A User Profile
  • 4. Example: Data Profile for a User
      • Key selection: start with the username
          • Use this as a common prefix for related data
          • When building the user’s session information, fetch these items
      • Create related documents by prefixing with the same username
          • Can extend this concept further
    {
      "_id": "auser_profile",
      "user_id": 7778,
      "password": "a1004cdcaa3191b7",
      "common_name": "Alan User",
      "nicknames": ["Al", "Buddy"],
      "sign_up_timestamp": 1224612317,
      "last_login_timestamp": 1245613101
    }
    {
      "_id": "auser_friends",
      "friends": ["joe", "ingenthr", "toru"]
    }
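The key-prefix convention on slide 4 can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory `Map` stands in for a bucket, and `store`, `keyFor`, and `loadSession` are hypothetical names, not part of any Couchbase SDK.

```javascript
// In-memory stand-in for a bucket (assumption: a real client would
// perform get/set against the cluster instead).
const store = new Map();

// Build a key from the username prefix and a suffix such as "profile".
function keyFor(username, suffix) {
  return `${username}_${suffix}`;
}

store.set(keyFor("auser", "profile"), {
  user_id: 7778,
  common_name: "Alan User",
  nicknames: ["Al", "Buddy"],
});
store.set(keyFor("auser", "friends"), {
  friends: ["joe", "ingenthr", "toru"],
});

// Gather the related documents for a session by walking known suffixes.
// Suffixes with no stored document are simply skipped.
function loadSession(username, suffixes = ["profile", "friends"]) {
  const session = {};
  for (const suffix of suffixes) {
    const doc = store.get(keyFor(username, suffix));
    if (doc !== undefined) {
      session[suffix] = doc;
    }
  }
  return session;
}
```

Because every related key can be computed from the username alone, building the session needs no query, only direct key lookups.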
  • 5. blog example
  • 6. Modeling a Blog
    Blog post structure:
      • Main content itself
      • User to blogs is a 1:N relationship
      • Set of comments, also a 1:N relationship
    [Diagram: one Post linked to several Comments]
  • 7. Representing Blog Data in Couchbase
    Representing Objects
      ● Simply treat an object as a JSON document
          ● Could also represent it with serialized objects from the higher-level language
      ● Determine the key for an object using the class name (or type) of the object and a unique ID
    Representing Object Lists
      ● Denormalization
      ● Save an array of object IDs in this list
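As a rough sketch of the two ideas on slide 7 (type-plus-ID keys and a list document holding object IDs), assuming an in-memory `Map` in place of a bucket. The helper names (`increment`, `saveObject`, `appendToList`, `loadList`) and the `ids` field are hypothetical and chosen only for this example.

```javascript
// In-memory stand-in for a bucket; not a real Couchbase client.
const bucket = new Map();

// Stand-in for a server-side counter: returns the next integer
// for the named counter key.
function increment(counterKey) {
  const next = (bucket.get(counterKey) || 0) + 1;
  bucket.set(counterKey, next);
  return next;
}

// Store an object under "<type>_<serial>" and return that key.
function saveObject(type, fields) {
  const id = `${type}_${increment(type + "_counter")}`;
  bucket.set(id, { _id: id, type, ...fields });
  return id;
}

// Append an object ID to a list document, creating the list if needed.
function appendToList(listKey, objectId) {
  const list = bucket.get(listKey) || { _id: listKey, ids: [] };
  list.ids.push(objectId);
  bucket.set(listKey, list);
}

// Resolve a list document into the objects it points at.
function loadList(listKey) {
  const list = bucket.get(listKey);
  return list ? list.ids.map((id) => bucket.get(id)) : [];
}
```

In a real cluster the read-modify-write in `appendToList` would need some form of concurrency control; the sketch ignores that.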
  • 8. Modeling the Example Blog Data
      • Three entities (for this simple example)
          • Users
              • Hold the user’s profile, enough to instantiate the needed data for a session
          • Blog lists
              • A simple list of blogs written by a given user (semi-normalized)
          • Blogs
              • The blogs authored by a given user
  • 9. Blog Data in Couchbase
    User Object
    Key: 'JChris'
    JSON:
    {
      "_id": "JChris",
      "nid": 1,
      "name": "Chris",
      "password": "sldkfjslkdfj"
    }
    Blog Object
    Key: 'blog_spoke_at_couchconf'
    JSON:
    {
      "_id": "blog_spoke_at_couchconf",
      "user_id": 1,
      "content-md": "blah de blah"
    }
    User Blog List
    Key: 'JChris_BlogList'
    JSON:
    {
      "_id": "JChris_BlogList",
      "blogtitles": ["spoke_at_couchconf", "(didn’t)_miss_sxsw", "next_to_couch_st"]
    }
  • 10. Schema-less Modeling of Data
      ● No need to “ALTER TABLE”
      ● Add new “fields” to all objects at any time
          – Specify default value for missing fields
          – Increased development speed
      ● Using JSON for data objects
      ● This will allow future capabilities with Couchbase Server 2.0
      ● Offers the ability to query and analyze arbitrary fields with views
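One way to read “specify default value for missing fields” is that the application fills in defaults when it reads a document, so older documents never need to be rewritten. A minimal sketch, assuming hypothetical names `BLOG_DEFAULTS` and `withDefaults` and default values picked only for illustration:

```javascript
// Hypothetical defaults for fields that older blog documents may lack.
const BLOG_DEFAULTS = { format: "markdown", comments: [] };

// Return a copy of the document with any missing fields filled in.
// The stored document itself is left untouched.
function withDefaults(doc, defaults) {
  const result = { ...doc };
  for (const [field, value] of Object.entries(defaults)) {
    if (result[field] === undefined) {
      // Copy arrays so documents never share one default instance.
      result[field] = Array.isArray(value) ? [...value] : value;
    }
  }
  return result;
}

// An "old" document written before the format field existed.
const oldPost = { _id: "blog_1", body: "hi" };
const upgraded = withDefaults(oldPost, BLOG_DEFAULTS);
```

Here `upgraded` carries the new fields while `oldPost` is unchanged, which is the application-side counterpart of skipping an `ALTER TABLE`.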
  • 11. Modifying Blog Data in Couchbase
    Creating a new blog post
    // Create the new blog post
    Blog aBlog = new Blog("ready_for_lunch");
    // do more
    cbclient.set(aBlog.getId(), aBlog);
    // Update the user's blog list
    User theUser = user.fetch(cbclient.get(username));
    // Add the post to the user
    theUser.addPost(aBlog);
    // Store the user's new blog list
    cbclient.set(username + "_bloglist", theUser.getBlogList());
  • 12. Adding Comments to a Blog
      • User profile
          • Main pointer into the user data:
              • Blog entries
              • Badge settings, like a Twitter badge
      • Blog posts
          • Contains the blogs themselves
      • Blog comments
          • Comments from other users
  • 13. Blog Document Sample
    {
      "_id": "jchris_Hello_World",
      "author": "jchris",
      "type": "post",
      "title": "Hello World",
      "format": "markdown",
      "body": "Hello from [Couchbase](http://couchbase.com).",
      "html": "<p>Hello from <a href=\"http: …",
      "comments": [
        { "format": "markdown", "body": "Awesome post!" },
        { "format": "markdown", "body": "Like it." }
      ]
    }
  • 14. Blog Document Sample, Broken Up
    {
      "_id": "jchris_Hello_World",
      "author": "jchris",
      "type": "post",
      "title": "Hello World",
      "format": "markdown",
      "body": "Hello from [Couchbase](http://couchbase.com).",
      "html": "<p>Hello from <a href=\"http: …",
      "comments": ["comment1_jchris_Hello_World"]
    }
    {
      "_id": "comment1_jchris_Hello_World",
      "format": "markdown",
      "body": "Awesome post!"
    }
  • 15. Threaded Comments
    You can imagine how to take this to a threaded list.
    [Diagram: Blog → List → First comment → List → Reply to comment; More Comments]
    Advantages:
      • Only fetch the data when you need it
          • For example, rendering part of the page, jQuery style
      • Spread the data (and thus the load) across the entire cluster
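The “only fetch the data when you need it” point can be illustrated with a small sketch. It assumes an in-memory `Map` named `docs` in place of the cluster, and a hypothetical `replies` field on each comment holding the keys of its child comments; the second comment document is invented for the example.

```javascript
// In-memory stand-in for documents spread across a cluster.
const docs = new Map([
  ["jchris_Hello_World",
    { title: "Hello World", comments: ["comment1_jchris_Hello_World"] }],
  ["comment1_jchris_Hello_World",
    { body: "Awesome post!", replies: ["comment2_jchris_Hello_World"] }],
  ["comment2_jchris_Hello_World",
    { body: "Thanks!", replies: [] }],
]);

// Summary view of a post: one lookup, no comment documents fetched.
function loadSummary(postKey) {
  const post = docs.get(postKey);
  return { title: post.title, commentCount: post.comments.length };
}

// Resolve a list of comment keys into a threaded tree, fetching each
// comment (and its replies) only when the thread is actually rendered.
function loadThread(commentKeys) {
  return commentKeys.map((key) => {
    const comment = docs.get(key);
    return {
      body: comment.body,
      replies: loadThread(comment.replies || []),
    };
  });
}
```

A page listing several posts would call only `loadSummary`; the comment documents are touched when a reader opens the thread.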
  • 16. Couchbase Server 2.0: Querying and Aggregation with Views
  • 17. Exchanging Virtual Currency
      • Simulate the way federated systems work:
          • Examples: checking accounts, credit card transactions
      • Create a record per transaction
          • Leverage views to reconcile the results of the transaction
          • If results don’t reconcile, there is a missing transaction or a flaw in the business
    { "from": "matt", "to": "jchris", "coins": 30 }
    { "from": "matt", "to": "perry", "coins": 30 }
    { "from": "jchris", "to": "perry", "coins": 30 }
  • 18. Exchanging Virtual Currency
    { "from": "matt", "to": "jchris", "coins": 30 }
    { "from": "matt", "to": "perry", "coins": 30 }
    { "from": "jchris", "to": "perry", "coins": 30 }
    Mapper
    function(transaction) {
      // Debit the sender and credit the receiver by the same amount.
      emit(transaction.from, transaction.coins * -1);
      emit(transaction.to, transaction.coins);
    }
    Reduce
    function(keys, values) {
      return sum(values);
    }
      • Query the balance view with key == user to get the balance for that user.
      • Query the sum of the entire view; the value should be 0.
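To see why the whole view sums to zero, the map and reduce steps can be simulated in plain JavaScript without a server. This is a sketch only: it assumes the transaction fields are named `from`, `to`, and `coins`, and `runView`, `reduceSum`, and `balanceFor` are hypothetical helpers that mimic, rather than call, the view engine.

```javascript
const transactions = [
  { from: "matt", to: "jchris", coins: 30 },
  { from: "matt", to: "perry", coins: 30 },
  { from: "jchris", to: "perry", coins: 30 },
];

// Map step: debit the sender and credit the receiver.
function mapTransaction(tx, emit) {
  emit(tx.from, -tx.coins);
  emit(tx.to, tx.coins);
}

// Reduce step: add up the emitted values.
function reduceSum(values) {
  return values.reduce((total, value) => total + value, 0);
}

// Run the map step over every document and collect the emitted rows.
function runView(documents) {
  const rows = [];
  for (const doc of documents) {
    mapTransaction(doc, (key, value) => rows.push({ key, value }));
  }
  return rows;
}

// Balance for one user: reduce only the rows emitted under that key.
function balanceFor(user, rows) {
  return reduceSum(rows.filter((r) => r.key === user).map((r) => r.value));
}

const rows = runView(transactions);
console.log(balanceFor("matt", rows));            // -60
console.log(balanceFor("jchris", rows));          // 0
console.log(balanceFor("perry", rows));           // 60
console.log(reduceSum(rows.map((r) => r.value))); // 0
```

Every transaction emits one negative and one positive row of equal size, so a nonzero grand total signals a missing or malformed record.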
  • 19. Common Questions
    Q: How can I implement transactions across multiple documents?
    A: Transactions in this distributed system can be re-imagined with views. (example to follow…)
    Q: Can I write more complex aggregation logic? For example, apply a transform or a filter when reducing?
    A: While the built-in reduce functions are often enough and highly tuned, Couchbase Server can execute arbitrary JavaScript functions.
  • 20. Demo: Game Simulator Views
      • Player items
          • Find all the weapons owned by an individual player
      • Leader board view
          • Showing who has the highest level in the system
  • 21. Questions?