CouchConf Tokyo Developing with Couchbase Part II
 

  • Modeling data from the real world to software is nothing new. In the case of Couchbase, as a document-oriented system, data modeling is pretty easy. We aren’t constrained by schemas or by the need to fit things into relational algebra. Instead, we only need to think about
  • Tribal Crossing set out to design a new game. They planned for a large audience, running in a cloud environment. For a previous experiment deployed as a Facebook app, My Polls (based on an RDBMS), they had planned for the traditional sharding approach. The problem was that, at the time, Tribal was only a few engineers with no real operations staff. If the app took off and became popular, they would have to reshard the database to keep up with the load. It turned out that My Polls became popular very quickly over a weekend. They spent the entire weekend adding nodes, resharding, and repeating, just to try to keep up with the new users. Once sharded out across those systems, shrinking the system would also be an issue.
  • No easy way to query: How often do you need to run a complex join query? When data is denormalized for speed, how many complex queries are you really running? “To stop thinking in terms of joins and queries is the ticket to speed.” Not handling bank transactions: we can live with a small percentage of concurrency issues. Err on the side of making the player happy.
  • To represent game data in our system, we simply represent objects as JSON. We then determine the key for an object using the class name or type of the object and a unique ID. In fact, Couchbase Server can serve up sequence numbers pretty easily by using its built-in increment function. To represent a one-to-many relationship, we can keep a small list that holds the IDs of the related objects. This keeps us close to normalized, but slightly denormalized. The code for building out our graph of related items will be quite simple, and because the data is distributed and Couchbase caches hot items, it should be very fast.
  • TODO: add artwork from screenshots.
  • Here we see the three different objects in their JSON document form. These are very simple documents, but they show the concept. Each document’s key (also known as the _id) is the object’s class, followed by a serial number. Each player has a plant list (we can simply create one if the player does not yet have plants), and we store that plant list as an array.
  • On this slide, we see a sample blog post in JSON. It has most of the fields you’d expect to have in a blog entry. The one field that is a little different is the comments field. One approach here would be to store all comments on this post in the post document itself. This is simple, denormalized and lets us get the data in one shot. There are a couple of downsides though. One is that we may not want to display all of the comments. If I’m showing multiple blogs, maybe blog summaries on a given page, I don’t want to display the comments. The other is that some popular blogs, from popular bloggers, may have 100s or 1000s of comments. We don’t want to display them all at once, and may not want to have to grab such a large amount of data. We can reapply the same denormalization technique we’d encountered earlier.
  • As you see here, rather than storing comments inline, we can separate them to a comment list, and then from there to individual comments. Comments in this case can be threaded. You may wonder about the performance of such an arrangement because of all of the traffic across the wire. First off, in a distributed system the data may not be local anyway, so we’ll just make it easier by having the client system fetch the data from the server.
  • If you’re expecting a very large number of comments, or want to display them threaded, you can easily imagine doing so by extending the list technique discussed earlier. This allows us to build very complex arrangements of the data across various keys. Since the keys distribute throughout the cluster, we spread load out among the cluster nodes. In contrast, with a typical relational model, you may have to colocate the comments and blogs on a single shard so you can use join queries. This creates hotspots in the system, and resharding to redistribute the data becomes a manual process. Because of the active cache management in Couchbase Server, the hottest data will be in memory, so popular items are served very quickly.
  • First we’ll do a demonstration of finding all of the items owned by a particular player through a view. Then we will do a demonstration of showing a leaderboard from the gamesim data previously shown.
  • Using a view over all transactions, say they’re in a separate bucket or have type information on them, we can easily query for individual balances.

Presentation Transcript

  • Advanced Document Design J Chris Anderson Mobile
    • When considering how to model data for a given application, you should…
    • Think of a logical container for the data
    • Think of how data groups together
    • You may notice that…
    • From a software development standpoint, this maps better to the way you structure the data
    • It’s a more natural way of representing the entities the application will be handling
    Modeling Data
  • Example of A User Profile
    • Key selection: start with the username
      • Use this as a common prefix for related data
      • When building user’s session information, fetch these items
    • Create related documents by prefixing with the same username
      • Can extend this concept further
        • Other data records for this user, building out object graph, etc.
    Example: Data Profile for a User
      {
        "_id": "auser_profile",
        "user_id": 7778,
        "password": "a1004cdcaa3191b7",
        "common_name": "Alan User",
        "nicknames": ["Al", "Buddy"],
        "sign_up_timestamp": 1224612317,
        "last_login_timestamp": 1245613101
      }
      {
        "_id": "auser_friends",
        "friends": ["joe", "ingenthr", "toru"]
      }
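    The shared-prefix idea can be sketched in a few lines of JavaScript. This is only an illustration: a plain in-memory `Map` stands in for the bucket, and the helpers `userKey` and `loadSession` are hypothetical names, not part of any Couchbase SDK.

```javascript
// Assumption: a plain Map stands in for a Couchbase bucket (key -> JSON document).
const bucket = new Map();

// Hypothetical helper: every document for a user shares the username as key prefix.
function userKey(username, suffix) {
  return `${username}_${suffix}`;
}

bucket.set(userKey("auser", "profile"), {
  user_id: 7778,
  common_name: "Alan User",
  nicknames: ["Al", "Buddy"],
});
bucket.set(userKey("auser", "friends"), {
  friends: ["joe", "ingenthr", "toru"],
});

// Hypothetical helper: fetch the related documents needed to build a session.
// Documents that do not exist yet are simply left out of the result.
function loadSession(username, suffixes = ["profile", "friends"]) {
  const session = {};
  for (const suffix of suffixes) {
    const doc = bucket.get(userKey(username, suffix));
    if (doc !== undefined) session[suffix] = doc;
  }
  return session;
}

const session = loadSession("auser");
console.log(session.profile.common_name, session.friends.friends.length);
```

    Because the keys are derived from the username, no query is needed to find a user's related data; the client only has to know the suffixes.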
  • Social Game Example
    • As they prepped their new game in 2011, the developers at Tribal Crossing recognized they needed to plan for this game a bit differently:
    • Planned to scale
    • Deploys to cloud servers
    • Be able to easily scale and monitor the overall system
    • Previous experience:
    • My polls went from 100 users to 1M users in a very long weekend of resharding!
    Tribal Crossing
    • But, there are some differences in using Couchbase to handle the game data:
        • Direct key or _id queries only
        • Transactions do not span objects
    • add speed and distribution
    • Tribal questioned: Can this work for an online game?
        • Decided they could, just needed to tackle the modeling issues
        • Decided that for their deployment, they could relax some constraints to allow for higher concurrency and “repair” issues when encountered by erring on the side of player happiness
    Modeling Data
    • Representing Objects
    • Simply treat an object as a JSON document
      • Could also represent it with serialized objects from the higher level language
    • Determine the key for an object using the class name (or type) of the object and a unique ID
    • Representing Object Lists
    • Denormalization
    • Save an array of object IDs in this list
    Tribal Crossing: Representing Game Data in Couchbase
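    A minimal sketch of the key scheme above, assuming an in-memory `Map` in place of the bucket. The `increment` function here only imitates the idea of a server-side counter; `storeObject` and the `Plant_counter` key are hypothetical names chosen for illustration.

```javascript
// Assumption: a plain Map stands in for a Couchbase bucket.
const bucket = new Map();

// Hypothetical stand-in for a server-side counter: bump it and return the new value.
function increment(counterKey, initial = 1) {
  const next = bucket.has(counterKey) ? bucket.get(counterKey) + 1 : initial;
  bucket.set(counterKey, next);
  return next;
}

// Key = type name followed by a unique ID, for example "Plant1".
function storeObject(type, fields) {
  const nid = increment(`${type}_counter`);
  const key = `${type}${nid}`;
  bucket.set(key, { nid, ...fields });
  return key;
}

const k1 = storeObject("Plant", { name: "Starflower" });
const k2 = storeObject("Plant", { name: "Mushroom" });

// Object list: store only the IDs of the related objects, not the objects themselves.
bucket.set("Player1_PlantList", {
  plants: [bucket.get(k1).nid, bucket.get(k2).nid],
});

console.log(k1, k2, bucket.get("Player1_PlantList").plants);
```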
    • Three entities (for this simple example)
      • Players
        • Hold the player’s profile, enough to instantiate the needed data for a session
      • Plant lists
        • A simple list of plants owned by a given player; semi-normalized
      • Plants
        • The plants owned by a given player
    Modeling the Example Game Data
    • Player Object
    • Key: 'Player1'
    • JSON
      {
        "_id": "Player1",
        "nid": 1,
        "name": "Shawn"
      }
    Social Game Data in Couchbase
    Plant Object
    Key: 'Plant201'
    JSON
      {
        "_id": "Plant201",
        "nid": 201,
        "player_id": 1,
        "name": "Starflower"
      }
    PlayerPlant List
    Key: 'Player1_PlantList'
    JSON
      {
        "_id": "Player1_PlantList",
        "plants": [201, 202, 204]
      }
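    To show how a client might walk from a player to its plants, here is a small sketch. It assumes a `Map` as a stand-in for the bucket and a hypothetical helper `plantsFor`; only the Plant201 document from the slide is stored, so the other IDs in the list resolve to nothing and are skipped.

```javascript
// Assumption: a plain Map stands in for a Couchbase bucket.
const bucket = new Map([
  ["Player1", { nid: 1, name: "Shawn" }],
  ["Plant201", { nid: 201, player_id: 1, name: "Starflower" }],
  ["Player1_PlantList", { plants: [201, 202, 204] }],
]);

// Hypothetical helper: resolve a player's plant list into plant documents.
// A missing list means the player has no plants yet; missing plants are skipped.
function plantsFor(playerKey) {
  const list = bucket.get(`${playerKey}_PlantList`);
  if (list === undefined) return [];
  return list.plants
    .map((nid) => bucket.get(`Plant${nid}`))
    .filter((plant) => plant !== undefined);
}

const plants = plantsFor("Player1");
console.log(plants.map((plant) => plant.name));
```

    Each lookup is a direct key fetch, so the object graph is built without any query.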
    • No need to “ALTER TABLE”
    • Add new “fields” to all objects at any time
        • Specify default value for missing fields
        • Increased development speed
    • Using JSON for data objects
      • This will allow future capabilities with Couchbase Server 2.0
      • Offers the ability to query and analyze arbitrary fields with views
    Tribal Crossing: Schema-less Modeling of Data
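    One way to "specify a default value for missing fields" is to merge defaults in when a document is read. The sketch below is an assumption about how that could look; the fields `level` and `coins`, the player "Perry" and the helper `withDefaults` are hypothetical.

```javascript
// Hypothetical defaults for fields that were added after older documents were written.
const PLAYER_DEFAULTS = { level: 1, coins: 0 };

// Fields present in the stored document win; missing ones fall back to the defaults.
function withDefaults(doc, defaults = PLAYER_DEFAULTS) {
  return { ...defaults, ...doc };
}

const oldDoc = { nid: 1, name: "Shawn" };           // written before "level" existed
const newDoc = { nid: 2, name: "Perry", level: 7 }; // written with the new field

const upgradedOld = withDefaults(oldDoc);
const upgradedNew = withDefaults(newDoc);
console.log(upgradedOld, upgradedNew);
```

    Old documents never have to be rewritten in bulk; they pick up the new fields the next time they are read and saved.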
    • Give a player a new plant
      // Create the new plant
      Plant givenPlant = new Plant(100, "Mushroom");
      cbclient.set("Plant100", givenPlant);
      // Update the player plant list
      Player thePlayer = player.fetch(cbclient.get("Player1"));
      // Add the plant to the player
      thePlayer.receivePlant(givenPlant);
      // Store the player's new plant list
      cbclient.set("Player1_PlantList",
          thePlayer.getPlantsArray());
    Modifying Game Data in Couchbase
  • When to break up a document: blog example
    • User profile
      • Main pointer into the user data:
        • Blog entries
        • Badge settings, like a twitter badge
    • Blog posts
      • Contains the blogs themselves
    • Blog comments
      • Comments from other users
    Entities for a Blog
    {
      "_id": "jchris_Hello_World",
      "author": "jchris",
      "type": "post",
      "title": "Hello World",
      "format": "markdown",
      "body": "Hello from [Couchbase](http://couchbase.com).",
      "html": "<p>Hello from <a href="http: …
      "comments": [
        { "format": "markdown", "body": "Awesome post!" },
        { "format": "markdown", "body": "Like it." }
      ]
    }
    Blog Document Sample
  • Blog Document Sample, Broken up
      {
        "_id": "jchris_Hello_World",
        "author": "jchris",
        "type": "post",
        "title": "Hello World",
        "format": "markdown",
        "body": "Hello from [Couchbase](http://couchbase.com).",
        "html": "<p>Hello from <a href="http: …
        "comments": [ "comment1_jchris_Hello_World" ]
      }
      {
        "_id": "comment1_jchris_Hello_World",
        "format": "markdown",
        "body": "Awesome post!"
      }
    • You can imagine how to take this to a threaded list
    Threaded Comments
    [Diagram; labels: Blog, List, First comment, Reply to comment, More Comments, List]
    • Advantages:
    • Only fetch the data when you need it
      • For example, rendering part of the page, jQuery style
    • Spread the data (and thus the load) across the entire cluster
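    The "only fetch the data when you need it" point can be sketched as follows. A `Map` stands in for the bucket, and `postSummary` and `loadComments` are hypothetical helpers; a summary page reads one document, while the comment documents are fetched only when a post is opened.

```javascript
// Assumption: a plain Map stands in for a Couchbase bucket.
const bucket = new Map([
  ["jchris_Hello_World", {
    author: "jchris",
    type: "post",
    title: "Hello World",
    comments: ["comment1_jchris_Hello_World"],
  }],
  ["comment1_jchris_Hello_World", { format: "markdown", body: "Awesome post!" }],
]);

// Summary view of a post: one fetch, no comment bodies loaded.
function postSummary(postKey) {
  const post = bucket.get(postKey);
  return {
    title: post.title,
    author: post.author,
    commentCount: post.comments.length,
  };
}

// Fetch at most `limit` comments by following the keys stored in the post.
function loadComments(postKey, limit = 10) {
  const post = bucket.get(postKey);
  return post.comments.slice(0, limit).map((key) => bucket.get(key));
}

const summary = postSummary("jchris_Hello_World");
const comments = loadComments("jchris_Hello_World");
console.log(summary, comments);
```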
  • Couchbase Server 2.0: Querying and Aggregation with Views
    • Player items
      • Find all the weapons owned by an individual player
    • Leader board view
      • Showing who has the highest level in the system
    Demo: Game Simulator Views
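    The demo data itself is not reproduced in this transcript, so the sketch below uses made-up documents and field names (`type`, `level`, `owner`) purely to illustrate the two views. `runView` is a hypothetical local simulation that collects emitted rows and sorts them by key; in a real view, `emit` is provided by the server rather than passed in.

```javascript
// Hypothetical documents; the names and fields are illustrative only.
const docs = [
  { type: "player", name: "Shawn", level: 12 },
  { type: "player", name: "Aaron", level: 30 },
  { type: "weapon", name: "Sword", owner: "Shawn" },
  { type: "weapon", name: "Bow", owner: "Aaron" },
  { type: "weapon", name: "Axe", owner: "Shawn" },
];

// Local simulation of a view: run the map function over every document,
// collect the emitted rows, and sort them by key.
function runView(allDocs, mapFn) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  allDocs.forEach((doc) => mapFn(doc, emit));
  return rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Player items: key each weapon by its owner.
function mapWeaponsByOwner(doc, emit) {
  if (doc.type === "weapon") emit(doc.owner, doc.name);
}

// Leaderboard: key each player by level, so reading in descending order ranks them.
function mapLeaderboard(doc, emit) {
  if (doc.type === "player") emit(doc.level, doc.name);
}

const shawnWeapons = runView(docs, mapWeaponsByOwner)
  .filter((row) => row.key === "Shawn")
  .map((row) => row.value);
const leaderboard = runView(docs, mapLeaderboard).reverse();
console.log(shawnWeapons, leaderboard);
```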
    • Q: How can I implement transactions across multiple documents?
    • A: Transactions in this distributed system can be re-imagined with views. (example to follow…)
    • Q: Can I write more complex aggregation logic? For example apply a transform or a filter when reducing?
    • A: While the built-in reduce functions are often enough and highly tuned, Couchbase Server can execute arbitrary JavaScript functions.
    Common Questions
    • Simulate the way federated systems work:
      • Examples: checking accounts, credit card transactions
    • Create a record per transaction
      • Leverage views to reconcile the results of the transaction
      • If results don’t reconcile, there is a missing transaction or a flaw in the business logic
    Exchanging Virtual Currency
      { "from": "matt", "to": "jchris", "coins": 30 }
      { "from": "matt", "to": "perry", "coins": 30 }
      { "from": "jchris", "to": "perry", "coins": 30 }
  • Exchanging Virtual Currency
      { "from": "matt", "to": "jchris", "coins": 30 }
      { "from": "matt", "to": "perry", "coins": 30 }
      { "from": "jchris", "to": "perry", "coins": 30 }
    Mapper
      function(transaction) {
        emit(transaction.from, transaction.coins * -1);
        emit(transaction.to, transaction.coins);
      }
    Reduce
      function(keys, values) {
        return sum(values);
      }
    Query the balance view with key == user to get the balance for that user. Query the sum of the entire view; the value should be 0.
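    The reconciliation idea can be checked locally with a small simulation. This sketch assumes the transaction documents use the lowercase field names `from`, `to` and `coins`; `runMap`, `balance` and `total` are hypothetical helpers that mimic what the view's map and reduce steps compute, not server code.

```javascript
// One record per transaction; field names are an assumption for this sketch.
const transactions = [
  { from: "matt", to: "jchris", coins: 30 },
  { from: "matt", to: "perry", coins: 30 },
  { from: "jchris", to: "perry", coins: 30 },
];

// Simulated map step: each transaction emits a debit row and a credit row.
function runMap(docs) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  for (const tx of docs) {
    emit(tx.from, -tx.coins);
    emit(tx.to, tx.coins);
  }
  return rows;
}

// Simulated reduce step: sum the values of the rows for one key.
function balance(rows, user) {
  return rows
    .filter((row) => row.key === user)
    .reduce((sum, row) => sum + row.value, 0);
}

// Reducing over the whole view: every debit has a matching credit.
function total(rows) {
  return rows.reduce((sum, row) => sum + row.value, 0);
}

const rows = runMap(transactions);
console.log(balance(rows, "matt"), balance(rows, "jchris"), balance(rows, "perry"));
console.log(total(rows)); // 0 when the records reconcile
```

    If the total is ever non-zero, a transaction record is missing or the business logic has a flaw, which is exactly the check described on the slide.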
  • Questions?