1) The document discusses using Couchbase NoSQL technology to store data for social network games, which have huge concurrent requests but require low response times.
2) Traditional SQL databases have limitations for these workloads, as they are centralized and have processing overhead. Couchbase is distributed, stores active data in RAM for fast access, and allows horizontal scaling.
3) However, moving to Couchbase from SQL presented design and architecture challenges. The document then describes the SNS Storage Engine (SSE), a PHP library that provides a layer on top of Couchbase to address these challenges through features like concurrency control and high-level data structures.
4. SNS games characteristics
• Huge number of concurrent requests, each requiring a low response time
• Accounts can be stored separately
– No need for centralized storage
– In most cases, no need to put strict constraints on
data relationships
5. Native limitations of SQL-based DBMS
• Fundamentally centralized
– Limited to vertical scaling (scale-up)
• Schema
– High risk (and cost) for updates
• Normalized data
– Unnecessary overhead: table joins, locking, data
constraint checks, …
7. Native limitations of SQL-based DBMS
• SQL processing overhead on both the DBMS and the
client side
• Most data accesses end up hitting the hard disk
– Very challenging to meet low response time
– Internal caching does not help much
• Hard to distribute data across multiple servers
9. NoSQL technology
• Persistent distributed hash-table
• Active set resides in RAM
– Extremely fast response time
• Horizontal scaling (scale-out)
• Raw and direct data access
– set, get, add, inc, dec: no SQL processing overhead
10. NoSQL technology
Key         Value
Jack.Gold   50123
Jack.Exp    4670
Jack.Coin   700
Peter.Gold  7050
Peter.Exp   20005
Peter.Coin  1
Key         Value    Key        Value    Key         Value
Peter.Gold  7050     Jack.Gold  50123    Peter.Coin  1
Jack.Exp    4670     Jack.Coin  700
Peter.Exp   20005
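
The split of one key-value table across several servers can be sketched as follows. This is a minimal illustration in Python, not Couchbase's actual placement scheme: the node names, the `node_for` helper, and the `ShardedStore` class are all hypothetical, and a plain hash-modulo rule stands in for whatever mapping the server really uses.

```python
import hashlib

NODES = ["node-0", "node-1", "node-2"]  # hypothetical server names


def node_for(key, nodes=NODES):
    """Map a key to one node using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]


class ShardedStore:
    """One hash table per node; each key lives on exactly one node."""

    def __init__(self, nodes=NODES):
        self.nodes = list(nodes)
        self.tables = {name: {} for name in self.nodes}

    def set(self, key, value):
        self.tables[node_for(key, self.nodes)][key] = value

    def get(self, key):
        return self.tables[node_for(key, self.nodes)].get(key)


store = ShardedStore()
for key, value in [("Jack.Gold", 50123), ("Jack.Exp", 4670), ("Jack.Coin", 700),
                   ("Peter.Gold", 7050), ("Peter.Exp", 20005), ("Peter.Coin", 1)]:
    store.set(key, value)

# Show which keys ended up on which node.
for name, table in store.tables.items():
    print(name, sorted(table))
```

Because the client can compute the owning node from the key alone, no central server is needed to route a request.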
11. Active set on RAM
CLIENT <-> ACTIVE SET IN RAM --(lazy write)--> HDD
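
The "active set in RAM, lazy write to HDD" idea can be sketched as a write-behind cache. This is a toy model under stated assumptions: `LazyWriteStore` and its methods are hypothetical names, a dict stands in for the disk, and `flush` is called by hand where a real server would persist in the background.

```python
class LazyWriteStore:
    """Serve reads and writes from RAM; persist changed keys later."""

    def __init__(self):
        self.ram = {}        # active set
        self.disk = {}       # stands in for the HDD
        self.dirty = set()   # keys changed in RAM but not yet persisted

    def set(self, key, value):
        # The client is answered as soon as RAM is updated.
        self.ram[key] = value
        self.dirty.add(key)

    def get(self, key):
        # A key outside the active set is loaded back from disk on demand.
        if key not in self.ram and key in self.disk:
            self.ram[key] = self.disk[key]
        return self.ram.get(key)

    def flush(self):
        """Write all dirty keys to disk and return how many were written."""
        for key in self.dirty:
            self.disk[key] = self.ram[key]
        written = len(self.dirty)
        self.dirty.clear()
        return written

    def evict(self, key):
        """Drop a persisted key from RAM to make room for hotter data."""
        if key in self.dirty:
            raise RuntimeError("cannot evict a key that is not persisted yet")
        self.ram.pop(key, None)


store = LazyWriteStore()
store.set("Jack.Gold", 50123)   # fast path: RAM only
store.flush()                   # lazy write reaches the disk
```

The trade-off is visible in the model: anything still in `dirty` would be lost if the process died before the next flush.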
12. Couchbase server
• Based on Membase technology
• Distributed
• Replication
• Since 1.8, has a native client for PHP
• Bucket types
– Couchbase (persistent)
– Memcached (memory only)
15. Architecture and design issues
• Transition from relational database design to
key-value design
– Account data => keys: how?
• Only minimal support for
locking and concurrency control
– add: fails if the key exists (usable as a mutex)
– cas: a read returns a CAS value; a write fails if that CAS value is outdated
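
The two primitives above can be sketched with a small in-memory model. This is not the Couchbase client API: `KVStore`, `gets`, and `add_gold` are hypothetical names, and the model only mirrors the semantics described on the slide (add fails if the key exists; a cas write fails if the value changed since it was read).

```python
class KVStore:
    """Toy single-process model of add / cas semantics."""

    def __init__(self):
        self._data = {}      # key -> (value, cas token)
        self._next_cas = 1

    def _store(self, key, value):
        self._data[key] = (value, self._next_cas)
        self._next_cas += 1  # every successful write gets a new token

    def add(self, key, value):
        """Create the key; fail if it already exists (usable as a mutex)."""
        if key in self._data:
            return False
        self._store(key, value)
        return True

    def gets(self, key):
        """Return (value, cas token), or (None, None) if the key is missing."""
        return self._data.get(key, (None, None))

    def cas(self, key, value, token):
        """Write only if nobody has written since the token was read."""
        current = self._data.get(key)
        if current is None or current[1] != token:
            return False
        self._store(key, value)
        return True

    def delete(self, key):
        self._data.pop(key, None)


def add_gold(store, key, amount, retries=10):
    """Optimistic read-modify-write: retry when another writer got in first."""
    for _ in range(retries):
        value, token = store.gets(key)
        if value is None:
            if store.add(key, amount):
                return amount
            continue
        if store.cas(key, value + amount, token):
            return value + amount
    raise RuntimeError("too much contention on " + key)
```

A lock built on add would create a key such as `lock:Jack`, do the work, then delete the key; a second caller's add on the same key fails until the delete happens.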
16. Architecture and design issues
• No transaction support
– Data corruption becomes so easy!
• No high-level data support (e.g. list, queue, …)
• No tools for raw data viewing / editing
17. Pitfalls
• Too much freedom for developers
– Anyone can add / modify any key any time
• "Epic key" design mindset
– One key for everything: bad performance, and concurrency
control becomes a true nightmare
• Abusing the power of set
– It never fails! Developers LOVE it!
20. What is SSE ?
• A thin “layer” between developers and the
almighty Couchbase
– SSE is simply a PHP library
• Provides better support for locking and
concurrency control
– Basic support for: begin, update, commit
• Provides high-level data structures
– Collection, queue, stack, integer (gold), inc-only
integer (exp), binary flags (quest), …
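
Two of the high-level types listed above can be sketched as thin wrappers over raw keys. This is an illustrative guess at the idea, written in Python rather than the PHP of SSE itself; the class names, method names, and the dict used as a backing store are all assumptions, not SSE's real interface.

```python
class IncOnlyInteger:
    """Integer that may only grow, e.g. experience points."""

    def __init__(self, store, key):
        self.store = store   # any dict-like key-value store
        self.key = key

    def get(self):
        return self.store.get(self.key, 0)

    def inc(self, delta):
        if delta < 0:
            raise ValueError("an inc-only value cannot decrease")
        self.store[self.key] = self.get() + delta
        return self.store[self.key]


class BinaryFlags:
    """Set of on/off flags packed into one integer, e.g. finished quests."""

    def __init__(self, store, key):
        self.store = store
        self.key = key

    def set(self, bit):
        self.store[self.key] = self.store.get(self.key, 0) | (1 << bit)

    def is_set(self, bit):
        return bool((self.store.get(self.key, 0) >> bit) & 1)


store = {}                               # stands in for the server
exp = IncOnlyInteger(store, "Jack.Exp")
exp.inc(4670)
quests = BinaryFlags(store, "Jack.Quest")
quests.set(3)                            # mark quest number 3 as done
```

Wrapping the key this way means game code can no longer overwrite experience with a smaller number through a blind set; the rule lives in one place instead of in every caller.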
21. What is SSE ?
• Minimize the risk of weak concurrency support
– Ability to roll back pending writes
• Schema
– Limits the freedom of developers!
– No more nightmares when backing up or
viewing/editing raw data
• Buffers to eliminate repeated read / writes
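
The combination of begin / update / commit, rollback of pending writes, a schema, and read/write buffers can be sketched as below. This is a hypothetical Python model, not SSE's PHP interface: `Session`, its methods, the `Account.Field` key convention, and the dict backend are assumptions. The commit here is also not atomic against a real server; it only shows how buffering keeps unfinished changes away from the store.

```python
class Session:
    """Buffer reads and writes for one request; nothing is stored until commit."""

    def __init__(self, backend, schema):
        self.backend = backend   # dict standing in for the server
        self.schema = schema     # field name -> required Python type
        self.reads = {}          # read buffer: each key is fetched once
        self.pending = {}        # write buffer: last value per key wins
        self.active = False

    def begin(self):
        self.reads.clear()
        self.pending.clear()
        self.active = True

    def get(self, key):
        if key in self.pending:          # see our own uncommitted writes
            return self.pending[key]
        if key not in self.reads:
            self.reads[key] = self.backend.get(key)
        return self.reads[key]

    def update(self, key, value):
        if not self.active:
            raise RuntimeError("call begin() first")
        field = key.rsplit(".", 1)[-1]   # "Jack.Gold" -> "Gold"
        if field not in self.schema:
            raise KeyError("field not in schema: " + field)
        if not isinstance(value, self.schema[field]):
            raise TypeError("wrong type for field " + field)
        self.pending[key] = value

    def commit(self):
        """Send each changed key to the backend once; return the count."""
        written = len(self.pending)
        self.backend.update(self.pending)
        self.pending.clear()
        self.active = False
        return written

    def rollback(self):
        """Discard pending writes; the backend is left untouched."""
        self.pending.clear()
        self.active = False


backend = {"Jack.Gold": 50123}
session = Session(backend, {"Gold": int, "Exp": int})
session.begin()
session.update("Jack.Gold", session.get("Jack.Gold") - 100)
session.rollback()   # e.g. the purchase failed later in the request
```

Rejecting keys that are not in the schema is what removes the "anyone can add any key" freedom, and it also gives backup and editing tools a known list of fields to work with.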
25. Multi-instance architecture
• Replication is too costly for performance
• One failed node means the cluster fails
• Adding nodes requires a rebalance
– Only worthwhile for clusters with a large
number of nodes (more than 20 nodes)
26. Multi-instance architecture
• One instance for the index (user-to-instance
mapping)
– Use APC on logic servers to cache the mapping and reduce load
on the index instance
• Many data instances
– Dynamically adjust the weight of each instance based
on its average load
– A node failure affects only part of the user base
27. Multi-instance architecture
Game Logic   Game Logic   Game Logic   Game Logic
   APC          APC          APC          APC

Index        Data         Data         Data
Instance     Instance 1   Instance 2   Instance 3
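
The routing implied by this architecture can be sketched as follows: a logic server looks up which data instance owns a user, caches the answer locally, and new users are placed by weighted random choice. Everything here is a hypothetical Python model; `Router`, the instance names, and the two dicts standing in for the index instance and APC are assumptions, and how weights are derived from load is left out.

```python
import random


class Router:
    """Map each user to one data instance and remember the mapping."""

    def __init__(self, weights, rng=None):
        self.weights = dict(weights)   # data instance name -> weight
        self.index = {}                # stands in for the index instance
        self.local_cache = {}          # stands in for APC on a logic server
        self.index_reads = 0           # counts round trips to the index
        self.rng = rng or random.Random()

    def set_weight(self, instance, weight):
        """Lower the weight of a busy instance so fewer new users land there."""
        self.weights[instance] = weight

    def instance_for(self, user_id):
        if user_id in self.local_cache:        # no index round trip needed
            return self.local_cache[user_id]
        self.index_reads += 1
        instance = self.index.get(user_id)
        if instance is None:                   # new user: weighted placement
            names = list(self.weights)
            instance = self.rng.choices(
                names, weights=[self.weights[n] for n in names])[0]
            self.index[user_id] = instance
        self.local_cache[user_id] = instance
        return instance


router = Router({"data-1": 1, "data-2": 1, "data-3": 1})
home = router.instance_for("Jack")
assert router.instance_for("Jack") == home   # the mapping is sticky
```

Changing a weight only affects where new users are placed; existing users stay on their instance, which is why a failed data instance takes down only the users mapped to it.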
29. How good is SSE for us ?
• No more data loss due to concurrency
• No more data corruption
• No mysterious bugs due to unintended
writes
• Server developers' workload reduced by more than
3 times