Couchbase Seoul Data Engineering Conference (SDEC) 2011
 

    Presentation Transcript

    • Using Couchbase for Social Game Scaling and Speed
      Chiyoung Seo, Couchbase Inc.
      Matt Ingenthron, Couchbase Inc.
    • Agenda
      Introduction
      What is Couchbase Server?
      Simple, Fast, Elastic
      Technology Overview (Architecture, data flow, rebalancing)
      Tribal Crossing Inc: Animal Party
      Challenges before Couchbase
      Original Architecture
      Why Couchbase?
      Simplicity
      Performance
      Flexibility
      Deploying Couchbase
      New Architecture
      EC2
      Data Model
      Accessing data in Couchbase
      Product Roadmap
      Q&A
    • Couchbase Inc.
      Membase and CouchOne have merged to form Couchbase Inc. (headquartered in Silicon Valley)
      Team
      Brings together the creators and core contributors of Memcached, Membase and CouchDB technologies
      Doubles technical team size, accelerates roadmaps by over a year
      Products
      Couchbase Server (Formerly Membase)
      Couchbase Single Server
      Mobile Couchbase (iPhone and Android)
      Technology
      Most mature, reliable and widely deployed NoSQL technologies
      Fully featured, open source document datastore
      First complete, end-to-end NoSQL database product
    • Modern Interactive Web Application Architecture
      Application Scales Out
      Just add more commodity web servers
      www.facebook.com/animalparty
      Load Balancer
      WebServers
      Database Scales Up
      Get a bigger, more complex server
      Relational Database
      - Expensive and disruptive sharding
      - Doesn’t perform at Web Scale
    • Couchbase Server is a distributed database
      Couchbase Web Console
      Application user
      Web application server
      Couchbase Servers
    • Couchbase data layer scales like the application logic tier: the data layer now scales with linear cost and constant performance.
      Application Scales Out
      Just add more commodity web servers
      www.facebook.com/animalparty
      Load Balancer
      Web Servers
      Couchbase Servers
      Database Scales Out
      Just add more commodity data servers
      Horizontally scalable, schema-less, auto-sharding, high-performance at Web Scale
      Scaling out flattens the cost and performance curves.
    • Couchbase Server is Simple, Fast, Elastic
      Five minutes or less to a working cluster
      Downloads for Windows, Linux and OSX
      Start with a single node
      One button press joins nodes to a cluster
      Easy to develop against
      Just SET and GET – no schema required
      Drop it in. 10,000+ existing applications already “speak Couchbase” (via memcached)
      Practically every language and application framework is supported, out of the box
      Easy to manage
      One-click failover and cluster rebalancing
      Graphical and programmatic interfaces
      Configurable alerting
    • Couchbase Server is Simple, Fast, Elastic
      Predictable
      “Never keep an application waiting”
      Quasi-deterministic latency and throughput
      Low latency
      Built-in Memcached technology
      Auto-migration of hot data to lowest latency storage technology (RAM, SSD, Disk)
      Selectable write behavior – asynchronous, synchronous (on replication, persistence)
      High throughput
      Multi-threaded
      Low lock contention
      Asynchronous wherever possible
      Automatic write de-duplication
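
      The "automatic write de-duplication" point above can be illustrated with a small sketch. This is not Couchbase's implementation: DedupWriteQueue and its methods are hypothetical names, and the idea shown is only that repeated writes to one key waiting for persistence collapse into a single disk write of the latest value.

```python
class DedupWriteQueue:
    """Hypothetical pending-write queue that keeps only the latest value per key."""

    def __init__(self):
        self._pending = {}  # key -> most recent value awaiting persistence

    def enqueue(self, key, value):
        # A later write to the same key overwrites the earlier pending one.
        self._pending[key] = value

    def flush(self, persist):
        """Persist each pending key once and return the number of writes issued."""
        batch, self._pending = self._pending, {}
        for key, value in batch.items():
            persist(key, value)
        return len(batch)


written = []
queue = DedupWriteQueue()
queue.enqueue("Player1", {"Coins": 10})
queue.enqueue("Player1", {"Coins": 15})
queue.enqueue("Plant201", {"Name": "Starflower"})
writes = queue.flush(lambda k, v: written.append((k, v)))
```

      Three SET operations arrive, but only two values reach the persist callback, one per distinct key.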
    • Couchbase Server is Simple, Fast, Elastic
      Zero-downtime elasticity
      Spread I/O and data across commodity servers (or VMs)
      Consistent performance with linear cost
      Dynamic rebalancing of a live cluster
      All nodes are created equal
      No special case nodes
      Clone to grow
      Extensible
      Change feeds
      Real-time map-reduce
      RESTful interface for management
      Couchbase Web Console
    • Proven at Small, and Extra Large Scale
      Leading cloud service (PaaS) provider
      Over 150,000 hosted applications
      Couchbase Server serving over 6,200 Heroku customers
      Social game leader – FarmVille, Mafia Wars, Empires and Allies, Café World, FishVille
      Over 230 million monthly users
      Couchbase Server is the primary database behind key Zynga properties
    • Customers and Partners
      Customers (partial listing)
      Partners
    • Couchbase Server Architecture
      [Diagram: components and ports]
      Ports: 11211 (moxi, memcapable 1.0); 11210 (memcached, memcapable 2.0); 8091 (HTTP: REST management API/Web UI); 4369 (Erlang port mapper); 21100 – 21199 (distributed Erlang)
      Data Manager: memcached protocol listener/sender, engine interface, Couchbase Storage Engine
      Cluster Manager (Erlang/OTP), with components marked "on each node" or "one per cluster": REST management API/Web UI, vBucket state and replication manager, Rebalance orchestrator, Node health monitor, Heartbeat, Process monitor, Global singleton supervisor, Configuration manager
    • Couchbase “write” Data Flow – application view
      1. User action results in the need to change the VALUE of KEY
      2. Application updates key’s VALUE, performs SET operation
      3. Couchbase client hashes KEY, identifies KEY’s master server
      4. SET request sent over network to master server
      5. Couchbase replicates KEY-VALUE pair, caches it in memory and stores it to disk
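
      The client-side step in this flow (hash KEY, then look up KEY's master server) can be sketched as follows. This is a simplified illustration: the CRC32-modulo hash is an assumption and not the exact function real Couchbase clients use, and NUM_VBUCKETS, the server addresses and the round-robin map are hypothetical.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed vBucket count, for illustration only


def vbucket_for(key, num_vbuckets=NUM_VBUCKETS):
    """Map a key to a vBucket ID (simplified: CRC32 modulo the vBucket count)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def master_for(key, vbucket_map):
    """Look up the server that currently owns the key's vBucket."""
    return vbucket_map[vbucket_for(key, len(vbucket_map))]


# Hypothetical two-node cluster; the vBucket map assigns vBuckets round-robin.
servers = ["node1:11210", "node2:11210"]
vbucket_map = [servers[i % len(servers)] for i in range(NUM_VBUCKETS)]

target = master_for("Player1", vbucket_map)
```

      Because the key maps to a vBucket rather than directly to a server, moving a vBucket to another node only requires updating the map, not rehashing keys.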
    • Couchbase Data Flow – under the hood
      SET request arrives at KEY’s master server
      SET acknowledgement returned to application
      [Diagram, steps numbered 1 to 4: the master server for KEY (Listener-Sender, RAM*, Couchbase storage engine, SSD, Disk) alongside Replica Server 1 and Replica Server 2 for KEY]
    • Elasticity - Rebalancing
      Before
      • Adding Node 3
      • Node 3 is in pending state
      • Clients talk to Nodes 1 and 2 only
      During
      • Rebalancing orchestrator recalculates the vBucket map (including replicas)
      • Migrate vBuckets to the new server
      • Finalize migration
      After
      • Node 3 is balanced
      • Clients are reconfigured to talk to Node 3
      [Diagram: vBuckets 1 to 12 are held by Node 1 and Node 2 before the rebalance; vBucket migrators move a share of them to Node 3, after which all three nodes hold vBuckets]
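
      The "recalculate the vBucket map" step can be illustrated with a minimal sketch. This is not the rebalance orchestrator's actual algorithm: it ignores replicas, and the rebalance function and node names are hypothetical. It evens out vBucket counts while moving only the vBuckets that have to move. Indexes 0 to 11 stand for vBuckets 1 to 12 on the slide.

```python
from collections import Counter


def rebalance(vbucket_map, nodes):
    """Return a new vBucket -> node list with counts evened out across `nodes`."""
    base, extra = divmod(len(vbucket_map), len(nodes))
    # Target count per node; the first `extra` nodes take one additional vBucket.
    target = {n: base + (1 if i < extra else 0) for i, n in enumerate(nodes)}
    counts = Counter(vbucket_map)
    # One entry per vBucket that an under-filled node still needs.
    needy = [n for n in nodes for _ in range(max(0, target[n] - counts[n]))]
    new_map = list(vbucket_map)
    for vb, owner in enumerate(vbucket_map):
        if not needy:
            break
        # Move a vBucket only if its owner left the cluster or holds too many.
        if owner not in target or counts[owner] > target[owner]:
            dest = needy.pop()
            new_map[vb] = dest
            counts[owner] -= 1
            counts[dest] += 1
    return new_map


before = ["node1"] * 6 + ["node2"] * 6
after = rebalance(before, ["node1", "node2", "node3"])
moved = sum(1 for old, new in zip(before, after) if old != new)
```

      In this example each node ends with four vBuckets, and only the four handed to node3 change owner.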
    • Data buckets are secure Couchbase “slices”
      Application user
      Web application server
      Bucket 1
      Bucket 2
      Aggregate Cluster Memory and Disk Capacity
      Couchbase data servers
      In the data center
      On the administrator console
    • Couchbase and Hadoop Integration
      Support large-scale analytics on application data by streaming data from Couchbase to Hadoop
      Real-time integration using Flume
      Batch integration using Sqoop
      Examples
      Various game statistics (e.g., monthly / daily / hourly rankings)
      Analyze game patterns from users to enhance various game metrics
      [Diagram labels: Flume, Sqoop, TAP, memcached protocol listener/sender, engine interface, Couchbase Storage Engine]
    • Tribal Crossing: Challenges
      Common steps in scaling up a database:
      Tune queries (indexing, explain query)
      Denormalization
      Cache data (APC / Memcache)
      Tune MySQL configuration
      Replication (read slaves)
      Where do we go from here to prepare for the scale of a successful social game?
    • Tribal Crossing: Challenges
      Write-heavy requests
      Caching does not help
      MySQL / InnoDB limitation (Percona)
      Need to scale drastically overnight
      My Polls – 100 to 1M users over a weekend
      Small team, no dedicated sysadmin
      Focus on what we do best – making games
      Keeping cost down
    • Tribal Crossing: “Old” Architecture and Options
      MySQL with master-to-master replication and sharding
      Complex to set up, high administration cost
      Requires application-level changes
      Cassandra
      High write, but low read throughput
      Live cluster reconfiguration and rebalance is quite complicated
      Eventual consistency puts too much burden on application developers
      MongoDB
      High read/write, but unpredictable latency
      Live cluster rebalance for existing nodes only
      Eventual consistency with slave nodes
    • Tribal Crossing: Why Couchbase Server?
      SPEED, SPEED, SPEED
      Immediate consistency
      Interface is dead simple to use
      We are already using Memcache
      Low sysadmin overhead
      Schema-less data store
      Used and proven by big guys like Zynga
      … and lastly, because Tribal CAN
      Bigger firms with legacy code base = hard to adapt
      Small team = ability to get on the cutting edge
    • Tribal Crossing: New Challenges With Couchbase
      But there are some different challenges in using Couchbase (currently 1.7) to handle the game data:
      No easy way to query data
      No transaction / rollback
      Couchbase Server 2.0 resolves them by using CouchDB as the underlying database engine
      Can this work for an online game?
      Break out of the old ORM / relational paradigm!
      We are not handling bank transactions
    • Tribal Crossing: Deploying Couchbase in EC2
      Basic production environment setup
      Dev/Stage environment – feel free to install Couchbase on your web server
      [Diagram labels: Web Server (Apache, client-side Moxi), Requests, Cluster Mgmt., DNS Entry, Couchbase Cluster (Couchbase, Couchbase)]

    • Tribal Crossing: Deploying Couchbase in EC2
      Amazon Linux AMI, 64-bit, EBS-backed instance
      Set up swap space
      Install Couchbase’s Membase Server 1.7
      Access the web console at http://<hostname>:8091
      Start the new cluster with a single node
      Add the other nodes to the cluster and rebalance

    • Tribal Crossing: Deploying Couchbase in EC2
      Moxi figures out which node in the cluster holds data for a given key.
      On each web server, install the Moxi proxy
      Start Moxi by pointing it to the DNS entry you created
      Web apps connect to the Moxi that is running locally:
      $memcache->addServer('localhost', 11211);

    • Tribal Crossing: Representing Game Data in Couchbase
      Use case - simple farming game:
      A player can have a variety of plants on their farm.
      A player can add or remove plants from their farm.
      A player can see what plants are on another player's farm.
    • Tribal Crossing: Representing Game Data in Couchbase
      Representing Objects
      Simply treat an object as an associative array
      Determine the key for an object using the class name (or type) of the object and a unique ID
      Representing Object Lists
      Denormalization
      Save a comma-separated list or an array of object IDs
    • Tribal Crossing: Representing Game Data in Couchbase
      Player Object
      Key: 'Player1'
      Array
      (
      [Id] => 1
      [Name] => Shawn
      )
      Plant Object
      Key: 'Plant201'
      Array
      (
      [Id] => 201
      [Player_Id] => 1
      [Name] => Starflower
      )
      PlayerPlant List
      Key: 'Player1_PlantList'
      Array
      (
      [0] => 201
      [1] => 202
      [2] => 204
      )
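
      The key-naming scheme on these slides (type name plus ID for objects, a separate list document for relationships) can be sketched in Python. The dict named store is an in-memory stand-in for a Couchbase bucket, and object_key, list_key, save and load are hypothetical helpers, not part of any client library.

```python
import json

store = {}  # in-memory stand-in for a Couchbase bucket (assumption)


def object_key(type_name, object_id):
    """Build an object key such as 'Plant201' from the type name and ID."""
    return f"{type_name}{object_id}"


def list_key(owner_type, owner_id, item_type):
    """Build a list key such as 'Player1_PlantList'."""
    return f"{owner_type}{owner_id}_{item_type}List"


def save(key, value):
    store[key] = json.dumps(value)  # values are stored as serialized documents


def load(key):
    raw = store.get(key)
    return None if raw is None else json.loads(raw)


save(object_key("Player", 1), {"Id": 1, "Name": "Shawn"})
save(object_key("Plant", 201), {"Id": 201, "Player_Id": 1, "Name": "Starflower"})
save(list_key("Player", 1, "Plant"), [201])

# Resolve the player's plant list into plant objects, one lookup per ID.
plants = [load(object_key("Plant", pid)) for pid in load(list_key("Player", 1, "Plant"))]
```

      Keeping the ID list in its own document is the denormalization step: the farm can be loaded by key lookups alone, with no query over plant objects.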
    • Tribal Crossing: Schema-less Game Data
      No need to “ALTER TABLE”
      Add new “fields” to all objects at any time
      Specify default values for missing fields
      Increased development speed
      Using JSON for data objects, though, because Couchbase 2.0 adds the ability to query on arbitrary fields
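
      One way to "specify default values for missing fields" is to merge defaults in when an object is read. The sketch below is an illustration only: PLANT_DEFAULTS, the Growth_Stage field and with_defaults are hypothetical and do not come from the talk.

```python
# Hypothetical defaults for plant objects; Growth_Stage is an invented field
# standing in for one added after older objects were already stored.
PLANT_DEFAULTS = {"Name": "", "Growth_Stage": 0}


def with_defaults(doc, defaults):
    """Return a copy of doc with any missing fields filled from defaults."""
    merged = dict(defaults)
    merged.update(doc)  # stored values win over defaults
    return merged


old_plant = {"Id": 201, "Player_Id": 1, "Name": "Starflower"}  # saved before the new field existed
new_plant = {"Id": 305, "Player_Id": 1, "Name": "Mushroom", "Growth_Stage": 2}

old_view = with_defaults(old_plant, PLANT_DEFAULTS)
new_view = with_defaults(new_plant, PLANT_DEFAULTS)
```

      Old objects never need rewriting: they pick up the default the next time they are loaded, which is what removes the need for an ALTER TABLE step.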
    • Tribal Crossing: Accessing Game Data in Couchbase
      Get all plants belonging to a given player
      Request: GET /player/1/farm
      // Load the player's list of plant IDs, then fetch each plant by key
      $plant_ids = $couchbase->get('Player1_PlantList');
      $response = array();
      foreach ($plant_ids as $plant_id)
      {
          $plant = $couchbase->get('Plant' . $plant_id);
          $response[] = $plant;
      }
      echo json_encode($response);
    • Tribal Crossing: Modifying Game Data in Couchbase
      Give a player a new plant
      // Create the new plant
      $new_plant = array (
          'id' => 100,
          'name' => 'Mushroom'
      );
      $couchbase->set('Plant100', $new_plant);
      // Update the player plant list
      $plant_ids = $couchbase->get('Player1_PlantList');
      $plant_ids[] = $new_plant['id'];
      $couchbase->set('Player1_PlantList', $plant_ids);
    • Tribal Crossing: Concurrency
      Concurrency issues can occur when multiple requests are working with the same piece of data.
      Solutions:
      CAS (check-and-set)
      The client can tell whether someone else has modified the data while it is trying to update
      Implements optimistic concurrency control
      Locking (try/wait cycle)
      GETL (get with lock + timeout) operations
      Pessimistic concurrency control
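
      The CAS approach can be sketched as a read-modify-write retry loop. FakeBucket below is a hypothetical in-memory stand-in that mimics gets/cas semantics (a version token that changes on every write); it is not a real client API, and add_plant is an illustrative helper.

```python
import copy


class FakeBucket:
    """Hypothetical in-memory store with memcached-style gets/cas semantics."""

    def __init__(self):
        self._data = {}  # key -> (value, cas token)

    def set(self, key, value):
        _, cas = self._data.get(key, (None, 0))
        self._data[key] = (copy.deepcopy(value), cas + 1)

    def gets(self, key):
        value, cas = self._data[key]
        return copy.deepcopy(value), cas

    def cas(self, key, value, cas):
        """Store value only if the key is unchanged since the matching gets()."""
        _, current = self._data[key]
        if current != cas:
            return False  # someone else modified the key in the meantime
        self._data[key] = (copy.deepcopy(value), current + 1)
        return True


def add_plant(bucket, list_key, plant_id, max_retries=5):
    """Append plant_id to the list at list_key, retrying on CAS conflicts."""
    for _ in range(max_retries):
        plant_ids, cas = bucket.gets(list_key)
        plant_ids.append(plant_id)
        if bucket.cas(list_key, plant_ids, cas):
            return True
    return False


bucket = FakeBucket()
bucket.set("Player1_PlantList", [201])

# Simulate a conflict: another request updates the list between our read and write.
stale_ids, stale_cas = bucket.gets("Player1_PlantList")
bucket.set("Player1_PlantList", [201, 202])
conflict_detected = not bucket.cas("Player1_PlantList", stale_ids + [100], stale_cas)

added = add_plant(bucket, "Player1_PlantList", 100)
```

      The stale write is rejected instead of silently dropping plant 202, and the retry loop re-reads the list before appending. GETL-style locking would instead block other writers for the duration of the update.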
    • Tribal Crossing: Data Relationship
      Record object relationships both ways
      Example: Plots and Plants
      Plot object stores id of the plant that it hosts
      Plant object stores id of the plot that it grows on
      Resolution in case of mismatch
      Don't sweat the extra calls to load data in a one-to-many relationship
      Use multiGet
      We can still cache aggregated results in a Memcache bucket if needed
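
      A two-way relationship can be checked for mismatches with a pair of batched reads. The sketch below assumes hypothetical field names (Plant_Id on plots, Plot_Id on plants) and uses a plain dict with a get_multi helper as a stand-in for a multiGet call. It only detects mismatches; how to resolve them is left to the application, as the talk does not specify a policy.

```python
def get_multi(store, keys):
    """Stand-in for a multiGet: fetch several keys in one call, skipping missing ones."""
    return {k: store[k] for k in keys if k in store}


def find_mismatches(store, plot_ids):
    """Return IDs of plots whose plant does not point back at them."""
    plots = get_multi(store, [f"Plot{i}" for i in plot_ids])
    plant_keys = [f"Plant{p['Plant_Id']}" for p in plots.values()
                  if p.get("Plant_Id") is not None]
    plants = get_multi(store, plant_keys)  # second batched read
    mismatched = []
    for plot in plots.values():
        plant_id = plot.get("Plant_Id")
        if plant_id is None:
            continue  # empty plot, nothing to check
        plant = plants.get(f"Plant{plant_id}")
        if plant is None or plant.get("Plot_Id") != plot["Id"]:
            mismatched.append(plot["Id"])
    return mismatched


store = {
    "Plot1": {"Id": 1, "Plant_Id": 201},
    "Plot2": {"Id": 2, "Plant_Id": 202},
    "Plot3": {"Id": 3, "Plant_Id": None},
    "Plant201": {"Id": 201, "Plot_Id": 1},
    "Plant202": {"Id": 202, "Plot_Id": 9},  # points at the wrong plot
}
bad_plots = find_mismatches(store, [1, 2, 3])
```

      Two batched reads cover every plot and plant involved, which is why the extra lookups in a one-to-many relationship stay cheap.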
    • Tribal Crossing: Migrating to Couchbase Servers
      First migrated large or slow-performing tables and frequently updated fields from MySQL to Couchbase
      [Diagram labels: Web Server (Apache + PHP, client-side Moxi), MySQL, memcached protocol listener/sender, TAP, engine interface, Couchbase Storage Engine, TAP Client, Reporting Applications]
    • Tribal Crossing: Deployment
    • Tribal Crossing: Conclusion
      Significantly reduced the cost incurred by scaling up database servers and managing them.
      Achieved significant improvements in various performance metrics (e.g., read, write, latency, etc.)
      Allowed them to focus more on game development and optimizing key metrics
      Plan to use real-time MapReduce, querying, and indexing abilities provided by the upcoming Elastic Couchbase 2.0
    • Product Roadmap: Couchbase Server 2.0
      Mobile to cloud data synchronization
      Cross data center replication
      [Diagram labels: US West Coast Data Center, US East Coast Data Center, Couchbase Server, Couchbase Single Server, CouchSync]
    • Product Roadmap: Couchbase Server 2.0
      Replace SQLite-based storage engine with CouchDB
      Support indexing and querying on values
      Integrate real-time MapReduce into Couchbase Server
      SDK for Couchbase Server
      Membase Server 1.7
      CouchDB 1.1
      Couchbase Server 2.0
      The world’s leading caching and clustering technology
      The fastest, most complete and most reliable database on the planet
      The most reliable and full-featured document database
    • Couchbase Product Download
      Community Edition
      Open source build
      Free forum support
      Enterprise Edition
      Free for non-production use
      Certified, QA tested version of open source
      Case tracking and guaranteed SLA for production environments
      Partner in Korea
      N2M Inc. (http://www.n2m.co.kr)
    • Q&A
      Matt Ingenthron, Couchbase Inc.
      (matt@couchbase.com, @ingenthr)
      Chiyoung Seo, Couchbase Inc.
      (chiyoung@couchbase.com, @chiyoungseo)