Couchbase Seoul Data Engineering Conference (SDEC) 2011

Transcript

  • 1.
  • 2. Chiyoung Seo, Couchbase Inc.
    Matt Ingenthron, Couchbase Inc.
    Using Couchbase For Social game scaling and speed
  • 3. Agenda
    Introduction
    What is Couchbase Server?
    Simple, Fast, Elastic
    Technology Overview (Architecture, data flow, rebalancing)
    Tribal Crossing Inc: Animal Party
    Challenges before Couchbase
    Original Architecture
    Why Couchbase?
    Simplicity
    Performance
    Flexibility
    Deploying Couchbase
    New Architecture
    EC2
    Data Model
    Accessing data in Couchbase
    Product Roadmap
    Q&A
  • 4. Couchbase Inc.
    Membase and CouchOne have merged to form Couchbase Inc. (headquartered in Silicon Valley)
    Team
    - Brings together the creators and core contributors of Memcached, Membase and CouchDB technologies
    - Doubles technical team size, accelerates roadmaps by over a year
    Products
    - Couchbase Server (formerly Membase)
    - Couchbase Single Server
    - Mobile Couchbase (iPhone and Android)
    Technology
    - Most mature, reliable and widely deployed NoSQL technologies
    - Fully featured, open source document datastore
    - First complete, end-to-end NoSQL database product
  • 5. Modern Interactive Web Application Architecture
    Application Scales Out
    - Just add more commodity web servers
    Database Scales Up
    - Get a bigger, more complex server
    - Expensive and disruptive sharding
    - Doesn’t perform at Web Scale
    [Diagram: www.facebook.com/animalparty, Load Balancer, Web Servers, Relational Database]
  • 6. Couchbase Server is a distributed database
    Couchbase Web Console
    Application user
    Web application server
    Couchbase Servers
  • 7. Couchbase data layer scales like application logic tier
    Data layer now scales with linear cost and constant performance.
    Application Scales Out
    - Just add more commodity web servers
    Database Scales Out
    - Just add more commodity data servers
    - Horizontally scalable, schema-less, auto-sharding, high-performance at Web Scale
    Scaling out flattens the cost and performance curves.
    [Diagram: www.facebook.com/animalparty, Load Balancer, Web Servers, Couchbase Servers]
  • 8. Couchbase Server is Simple, Fast, Elastic
    Five minutes or less to a working cluster
    Downloads for Windows, Linux and OSX
    Start with a single node
    One button press joins nodes to a cluster
    Easy to develop against
    Just SET and GET – no schema required
    Drop it in. 10,000+ existing applications already “speak Couchbase” (via memcached)
    Practically every language and application framework is supported, out of the box
    Easy to manage
    One-click failover and cluster rebalancing
    Graphical and programmatic interfaces
    Configurable alerting
  • 9. Couchbase Server is Simple, Fast, Elastic
    Predictable
    “Never keep an application waiting”
    Quasi-deterministic latency and throughput
    Low latency
    Built-in Memcached technology
    Auto-migration of hot data to lowest latency storage technology (RAM, SSD, Disk)
    Selectable write behavior – asynchronous, synchronous (on replication, persistence)
    High throughput
    Multi-threaded
    Low lock contention
    Asynchronous wherever possible
    Automatic write de-duplication
  • 10. Couchbase Server is Simple, Fast, Elastic
    Zero-downtime elasticity
    Spread I/O and data across commodity servers (or VMs)
    Consistent performance with linear cost
    Dynamic rebalancing of a live cluster
    All nodes are created equal
    No special case nodes
    Clone to grow
    Extensible
    Change feeds
    Real-time map-reduce
    RESTful interface for management
    Couchbase Web Console
  • 11. Proven at Small, and Extra Large Scale
    Leading cloud service (PaaS) provider
    • Over 150,000 hosted applications
    • Couchbase Server serving over 6,200 Heroku customers
    Social game leader – FarmVille, Mafia Wars, Empires and Allies, Café World, FishVille
    • Over 230 million monthly users
    • Couchbase Server is the primary database behind key Zynga properties
  • Customers and Partners
    Customers (partial listing)
    Partners
  • 14. Couchbase Server Architecture
    Data Manager
    - moxi: port 11211 (memcapable 1.0)
    - memcached: port 11210 (memcapable 2.0)
    - protocol listener/sender
    - engine interface
    - Couchbase Storage Engine
    Cluster Manager (Erlang/OTP)
    - REST management API/Web UI (http)
    - vBucket state and replication manager
    - Rebalance orchestrator
    - Node health monitor
    - Heartbeat
    - Process monitor
    - Global singleton supervisor
    - Configuration manager
    Ports
    - HTTP: 8091
    - erlang port mapper: 4369
    - distributed erlang: 21100 – 21199
    [Diagram legend: on each node / one per cluster]
  • 16. Couchbase “write” Data Flow – application view
    1. User action results in the need to change the VALUE of KEY
    2. Application updates key’s VALUE, performs SET operation
    3. Couchbase client hashes KEY, identifies KEY’s master server
    4. SET request sent over network to master server
    5. Couchbase replicates KEY-VALUE pair, caches it in memory and stores it to disk
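The client-side hashing step in this data flow can be sketched roughly as follows. This is a minimal illustration, not the actual client implementation: it assumes a plain CRC32 hash, only 12 vBuckets (the number drawn on the rebalancing slide), and a hypothetical vBucket map with made-up server names.

```python
import zlib

NUM_VBUCKETS = 12  # assumption for illustration only

# Hypothetical vBucket map: index = vBucket id, value = master server for it.
VBUCKET_MAP = ["node1:11210"] * 6 + ["node2:11210"] * 6


def vbucket_for_key(key: str) -> int:
    """Hash the key to a vBucket id (CRC32 modulo the vBucket count)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def master_for_key(key: str) -> str:
    """Look up the master server that owns the key's vBucket."""
    return VBUCKET_MAP[vbucket_for_key(key)]


if __name__ == "__main__":
    for key in ("Player1", "Plant201", "Player1_PlantList"):
        print(key, "-> vBucket", vbucket_for_key(key), "->", master_for_key(key))
```

Because the key maps to a vBucket rather than directly to a server, only the vBucket map has to change when servers are added or removed; the hash of a key stays the same.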
  • 17. Couchbase Data Flow – under the hood
    SET request arrives at KEY’s master server (step 1)
    SET acknowledgement returned to application (step 3)
    [Diagram: steps 1 to 4 across the Listener-Sender, RAM*, SSD and Disk tiers of the Couchbase storage engine, on the master server for KEY and on Replica Servers 1 and 2 for KEY]
  • 18. Elasticity - Rebalancing
    Before
    • Adding Node 3
    • Node 3 is in pending state
    • Clients talk to Nodes 1 and 2 only
    During
    • Rebalancing orchestrator recalculates the vBucket map (including replicas)
    • Migrate vBuckets to the new server
    • Finalize migration
    After
    • Node 3 is balanced
    • Clients are reconfigured to talk to Node 3
    [Diagram: vBuckets 1 to 12 spread over Node 1 and Node 2 before, moved by vBucket migrators during rebalancing, and spread over Node 1, Node 2 and Node 3 after]
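One way to picture the "recalculate the vBucket map" step is the sketch below. It is not the rebalance orchestrator's actual algorithm: it ignores replicas, the starting layout and node names are hypothetical, and the function `rebalance` is introduced here purely for illustration. It spreads vBuckets evenly over the new node list while moving as few as possible.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def rebalance(vbucket_map: Dict[int, str], nodes: List[str]) -> Dict[int, str]:
    """Return a new vBucket -> node map spread evenly over `nodes`,
    moving as few vBuckets as possible (replicas are ignored)."""
    base, extra = divmod(len(vbucket_map), len(nodes))

    # Group the vBuckets by their current owner.
    owned = defaultdict(list)
    for vb, node in sorted(vbucket_map.items()):
        owned[node].append(vb)

    # Nodes that already own the most vBuckets get the "extra" slots.
    ranked = sorted(nodes, key=lambda n: -len(owned[n]))
    quota = {n: base + (1 if i < extra else 0) for i, n in enumerate(ranked)}

    # Each node keeps vBuckets up to its quota; the rest become surplus.
    new_map, surplus = {}, []
    for node, vbs in owned.items():
        keep = quota.get(node, 0)  # nodes no longer in the cluster keep nothing
        for vb in vbs[:keep]:
            new_map[vb] = node
        surplus.extend(vbs[keep:])

    # Hand the surplus to nodes that are below their quota.
    counts = Counter(new_map.values())
    for node in nodes:
        while counts[node] < quota[node]:
            new_map[surplus.pop(0)] = node
            counts[node] += 1
    return new_map


# Hypothetical starting layout: 12 vBuckets split over two nodes.
before = {vb: ("node1" if vb <= 6 else "node2") for vb in range(1, 13)}
after = rebalance(before, ["node1", "node2", "node3"])
moved = sorted(vb for vb in before if before[vb] != after[vb])
print(moved)  # [5, 6, 11, 12]
```

Only the four vBuckets that end up on the new node are moved; the other eight stay where they were, which is why clients can keep talking to the existing nodes during the migration.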
  • 24. Data buckets are secure Couchbase “slices”
    [Diagram: Application user, Web application server, Bucket 1 and Bucket 2 over the aggregate cluster memory and disk capacity of the Couchbase data servers; shown in the data center and on the administrator console]
  • 25. Couchbase and Hadoop Integration
    Support large-scale analytics on application data by streaming data from Couchbase to Hadoop
    - Real-time integration using Flume
    - Batch integration using Sqoop
    Examples
    - Various game statistics (e.g., monthly / daily / hourly rankings)
    - Analyze game patterns from users to enhance various game metrics
    [Diagram: Flume, Sqoop, TAP, memcached protocol listener/sender, engine interface, Couchbase Storage Engine]
  • 27. Tribal Crossing: Challenges
    Common steps for scaling up a database:
    - Tune queries (indexing, explain query)
    - Denormalization
    - Cache data (APC / Memcache)
    - Tune MySQL configuration
    - Replication (read slaves)
    Where do we go from here to prepare for the scale of a successful social game?
  • 28. Tribal Crossing: Challenges
    Write-heavy requests
    Caching does not help
    MySQL / InnoDB limitation (Percona)
    Need to scale drastically overnight
    My Polls – 100 to 1 million users over a weekend
    Small team, no dedicated sysadmin
    Focus on what we do best – making games
    Keeping cost down
  • 29. Tribal Crossing: “Old” Architecture and Options
    MySQL with master-to-master replication and sharding
    - Complex to set up, high administration cost
    - Requires application-level changes
    Cassandra
    - High write, but low read throughput
    - Live cluster reconfiguration and rebalance is quite complicated
    - Eventual consistency puts too much burden on application developers
    MongoDB
    - High read/write, but unpredictable latency
    - Live cluster rebalance for existing nodes only
    - Eventual consistency with slave nodes
  • 30. Tribal Crossing: Why Couchbase Server?
    SPEED, SPEED, SPEED
    Immediate consistency
    Interface is dead simple to use
    We are already using Memcache
    Low sysadmin overhead
    Schema-less data store
    Used and proven by big guys like Zynga
    … and lastly, because Tribal CAN
    Bigger firms with legacy code base = hard to adapt
    Small team = ability to get on the cutting edge
  • 31. Tribal Crossing: New Challenges With Couchbase
    But there are some different challenges in using Couchbase (currently 1.7) to handle the game data:
    - No easy way to query data
    - No transaction / rollback
    Couchbase Server 2.0 resolves them by using CouchDB as the underlying database engine
    Can this work for an online game?
    - Break out of the old ORM / relational paradigm!
    - We are not handling bank transactions
  • 32. Tribal Crossing: Deploying Couchbase in EC2
    Basic production environment setup
    Dev/Stage environment – feel free to install Couchbase on your web server
    [Diagram: Web Server (Apache, Client-side Moxi); Cluster Mgmt. and Requests arrows; DNS Entry; Couchbase Cluster of Couchbase nodes]

  • 33. Tribal Crossing: Deploying Couchbase in EC2
    Amazon Linux AMI, 64-bit, EBS backed instance
    Set up swap space
    Install Couchbase’s Membase Server 1.7
    Access web console at http://<hostname>:8091
    Start the new cluster with a single node
    Add the other nodes to the cluster and rebalance
    [Diagram: same deployment diagram as slide 32]

  • 34. Tribal Crossing: Deploying Couchbase in EC2
    Moxi figures out which node in the cluster holds data for a given key.
    On each web server, install Moxi proxy
    Start Moxi by pointing it to the DNS entry you created
    Web apps connect to Moxi that is running locally
    $memcache->addServer('localhost', 11211);
    [Diagram: same deployment diagram as slide 32]

  • 35. Tribal Crossing: Representing Game Data in Couchbase
    Use case - simple farming game:
    A player can have a variety of plants on their farm.
    A player can add or remove plants from their farm.
    A player can see what plants are on another player's farm.
  • 36. Tribal Crossing: Representing Game Data in Couchbase
    Representing Objects
    - Simply treat an object as an associative array
    - Determine the key for an object using the class name (or type) of the object and a unique ID
    Representing Object Lists
    - Denormalization
    - Save a comma-separated list or an array of object IDs
  • 37. Tribal Crossing: Representing Game Data in Couchbase
    Player Object
    Key: 'Player1'
    Array
    (
    [Id] => 1
    [Name] => Shawn
    )
    Plant Object
    Key: 'Plant201'
    Array
    (
    [Id] => 201
    [Player_Id] => 1
    [Name] => Starflower
    )
    PlayerPlant List
    Key: 'Player1_PlantList'
    Array
    (
    [0] => 201
    [1] => 202
    [2] => 204
    )
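The key convention from these slides (type name plus unique ID, and a separate list object holding IDs) might look like this in code. This is only a sketch: a plain dictionary stands in for the Couchbase bucket, and the helper names `object_key` and `save` are hypothetical.

```python
import json

store = {}  # in-memory stand-in for a Couchbase bucket (illustration only)


def object_key(type_name: str, object_id: int) -> str:
    """Build a key from the object's type name and unique ID, e.g. 'Plant201'."""
    return f"{type_name}{object_id}"


def save(type_name: str, obj: dict) -> str:
    """Serialize the object as JSON and store it under its derived key."""
    key = object_key(type_name, obj["Id"])
    store[key] = json.dumps(obj)
    return key


save("Player", {"Id": 1, "Name": "Shawn"})
save("Plant", {"Id": 201, "Player_Id": 1, "Name": "Starflower"})

# The denormalized list of plant IDs lives under its own key.
store["Player1_PlantList"] = json.dumps([201, 202, 204])

print(json.loads(store["Plant201"])["Name"])  # Starflower
```

Since every key can be computed from a type and an ID, the application never needs a query to find an object it already knows about.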
  • 38. Tribal Crossing: Schema-less Game Data
    No need to “ALTER TABLE”
    Add new “fields” to all objects at any time
    Specify default value for missing fields
    Increased development speed
    Using JSON for data objects, though, owing to the ability to query on arbitrary fields in Couchbase 2.0
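Filling in default values for missing fields can be done at read time, roughly as below. The field `Growth_Stage` and the defaults table are hypothetical examples of a field added after older objects were already stored.

```python
import json

# Hypothetical defaults for a Plant object; "Growth_Stage" was added later.
PLANT_DEFAULTS = {"Id": None, "Player_Id": None, "Name": "", "Growth_Stage": 0}


def load_plant(raw_json: str) -> dict:
    """Decode a stored plant and fill in defaults for fields it predates."""
    return {**PLANT_DEFAULTS, **json.loads(raw_json)}


# An object written before "Growth_Stage" existed still loads cleanly.
old_record = '{"Id": 201, "Player_Id": 1, "Name": "Starflower"}'
plant = load_plant(old_record)
print(plant["Growth_Stage"])  # 0
```

Stored values always win over defaults, so no bulk migration of existing objects is needed when a field is introduced.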
  • 39. Tribal Crossing: Accessing Game Data in Couchbase
    Get all plants belonging to a given player
    Request: GET /player/1/farm
    // Load the list of plant IDs, then fetch each plant object by key
    $plant_ids = $couchbase->get('Player1_PlantList');
    $response = array();
    foreach ($plant_ids as $plant_id)
    {
        $plant = $couchbase->get('Plant' . $plant_id);
        $response[] = $plant;
    }
    echo json_encode($response);
  • 40. Tribal Crossing: Modifying Game Data in Couchbase
    Give a player a new plant
    // Create the new plant
    $new_plant = array (
        'id' => 100,
        'name' => 'Mushroom'
    );
    $couchbase->set('Plant100', $new_plant);
    // Update the player plant list
    $plant_ids = $couchbase->get('Player1_PlantList');
    $plant_ids[] = $new_plant['id'];
    $couchbase->set('Player1_PlantList', $plant_ids);
  • 41. Tribal Crossing: Concurrency
    Concurrency issues can occur when multiple requests are working with the same piece of data.
    Solution:
    CAS (check-and-set)
    - The client can tell whether someone else has modified the data while it is trying to update
    - Implements optimistic concurrency control
    Locking (try/wait cycle)
    - GETL (get with lock + timeout) operations
    - Pessimistic concurrency control
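The CAS approach can be sketched as a read, modify, conditional-write loop. The `InMemoryStore` class below is a tiny stand-in written for this example, not a real client library; its `gets` and `cas` methods only mimic the semantics of the memcached commands of the same names.

```python
class InMemoryStore:
    """Tiny stand-in for a CAS-capable key-value store (not a real client)."""

    def __init__(self):
        self._data = {}  # key -> (value, cas_version)

    def set(self, key, value):
        """Unconditional write; bumps the version."""
        _, version = self._data.get(key, (None, 0))
        self._data[key] = (value, version + 1)

    def gets(self, key):
        """Return (value, cas_version) for the key."""
        return self._data[key]

    def cas(self, key, value, expected_version):
        """Write only if nobody changed the key since `expected_version`."""
        _, version = self._data[key]
        if version != expected_version:
            return False
        self._data[key] = (value, version + 1)
        return True


def add_plant(store, list_key, plant_id, max_retries=5):
    """Optimistically append plant_id to the list, retrying on CAS conflicts."""
    for _ in range(max_retries):
        plant_ids, version = store.gets(list_key)
        if store.cas(list_key, plant_ids + [plant_id], version):
            return True
    return False


store = InMemoryStore()
store.set("Player1_PlantList", [201, 202, 204])
add_plant(store, "Player1_PlantList", 100)
print(store.gets("Player1_PlantList")[0])  # [201, 202, 204, 100]
```

If another request updates the list between `gets` and `cas`, the write is rejected and the loop re-reads the fresh list, so neither update is lost.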
  • 42. Tribal Crossing: Data Relationship
    Record object relationships both ways
    - Example: Plots and Plants
    - Plot object stores id of the plant that it hosts
    - Plant object stores id of the plot that it grows on
    - Resolution in case of mismatch
    Don't sweat the extra calls to load data in a one-to-many relationship
    - Use multiGet
    - We can still cache aggregated results in a Memcache bucket if needed
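A mismatch between the two sides of a relationship could be detected and repaired along these lines. Everything here is an assumption made for illustration: the field names `Plant_Id` and `Plot_Id`, the dictionary standing in for the bucket, the `multi_get` helper, and the rule that the plot is treated as authoritative.

```python
# In-memory stand-in for the bucket, with one stale back-reference.
store = {
    "Plot7": {"Id": 7, "Plant_Id": 201},
    "Plant201": {"Id": 201, "Plot_Id": 7},
    "Plant202": {"Id": 202, "Plot_Id": 7},  # stale: Plot7 hosts Plant201
}


def multi_get(keys):
    """Fetch several keys in one call; missing keys are simply left out."""
    return {k: store[k] for k in keys if k in store}


def repair_plant_links(plot_key, plant_keys):
    """Clear Plot_Id on plants that point at a plot which does not point back.
    Treating the plot as authoritative is an assumption of this sketch."""
    plot = store[plot_key]
    repaired = []
    for key, plant in multi_get(plant_keys).items():
        if plant["Plot_Id"] == plot["Id"] and plot["Plant_Id"] != plant["Id"]:
            plant["Plot_Id"] = None
            repaired.append(key)
    return repaired


print(repair_plant_links("Plot7", ["Plant201", "Plant202", "Plant999"]))  # ['Plant202']
```

Fetching all candidate plants in one batched call keeps the number of round trips constant, which is the point of the multiGet advice on the slide.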
  • 43. Tribal Crossing: Migrating to Couchbase Servers
    First migrated large or slow performing tables and frequently updated fields from MySQL to Couchbase
    [Diagram: Web Server (Apache + PHP, Client-side Moxi), MySQL, Couchbase (memcached protocol listener/sender, TAP, engine interface, Couchbase Storage Engine), TAP Client, Reporting Applications]
  • 44. Tribal Crossing: Deployment
  • 45. Tribal Crossing: Deployment
  • 46. Tribal Crossing: Conclusion
    Significantly reduced the cost incurred by scaling up database servers and managing them.
    Achieved significant improvements in various performance metrics (e.g., read, write, latency)
    Allowed them to focus more on game development and optimizing key metrics
    Plan to use real-time MapReduce, querying, and indexing abilities provided by the upcoming Elastic Couchbase 2.0
  • 48. Product Roadmap: Couchbase Server 2.0
    Mobile to cloud data synchronization
    Cross data center replication
    [Diagram: Couchbase Server in the US West Coast and US East Coast data centers, CouchSync links, Couchbase Single Server instances]
  • 49. Product Roadmap: Couchbase Server 2.0
    Replace SQLite-based storage engine with CouchDB
    Support indexing and querying on values
    Integrate real-time MapReduce into Couchbase Server
    SDK for Couchbase Server
    Membase Server 1.7: the world’s leading caching and clustering technology
    CouchDB 1.1: the most reliable and full-featured document database
    Couchbase Server 2.0: the fastest, most complete and most reliable database on the planet
  • 50. Couchbase Product Download
    Community Edition
    - Open source build
    - Free forum support
    Enterprise Edition
    - Free for non-production use
    - Certified, QA tested version of open source
    - Case tracking and guaranteed SLA for production environments
    Partner in Korea
    - N2M Inc. (http://www.n2m.co.kr)
  • 51. Q&A
    Matt Ingenthron, Couchbase Inc.
    (matt@couchbase.com, @ingenthr)
    Chiyoung Seo, Couchbase Inc.
    (chiyoung@couchbase.com, @chiyoungseo)