Data storage solutions for SNS game

Usage Rights

© All Rights Reserved

Data storage solutions for SNS game: Presentation Transcript

  • Data Storage Solutions for SNS game
    Dinh Nguyen Anh Dung – P2S – G6 – VNG
  • CONTENT
    • SNS games and SQL-based databases
    • NoSQL technology and Couchbase
    • NoSQL does not come without challenges
    • SNS Storage Engine (SSE)
  • SNS games AND SQL-based databases
  • SNS games characteristics
    • Huge number of concurrent requests, yet low response time is required
    • Accounts can be stored separately
        – No need for centralized storage
        – In most cases, no need to put strict constraints on data relationships
  • Native limitations of SQL-based DBMS
    • Fundamentally centralized
        – Scaling is vertical (scale-up), which is a problem
    • Schema
        – High risk (and cost) for updates
    • Normalized data
        – Unnecessary overhead: table joins, locking, data constraint checks, …
  • Native limitations of SQL-based DBMS (figure; source: NoSQL - WhitePaper)
  • Native limitations of SQL-based DBMS
    • SQL processing overhead on both the DBMS and the client side
    • Most data accesses end up at the hard disk
        – Very challenging to meet low response time
        – Internal caching does not help much
    • Hard to distribute data across multiple servers
  • NoSQL technology and Couchbase
  • NoSQL technology
    • Persistent distributed hash table
    • Active set resides in RAM
        – Extremely fast response time
    • Horizontal scaling
    • Raw and direct data access
        – set, get, add, inc, dec: no overhead
  • NoSQL technology
    All pairs in one key-value table:
        Key          Value
        Jack.Gold    50123
        Jack.Exp     4670
        Jack.Coin    700
        Peter.Gold   7050
        Peter.Exp    20005
        Peter.Coin   1
    The same pairs split across three key-value tables:
        Key          Value  |  Key         Value  |  Key          Value
        Peter.Gold   7050   |  Jack.Gold   50123  |  Peter.Coin   1
        Jack.Exp     4670   |  Jack.Coin   700    |  Peter.Exp    20005
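The split of one key-value table across several tables can be sketched as follows. This is a minimal illustration in Python, not the slides' code and not Couchbase's real placement scheme: `node_for_key`, `partition`, `get`, and the MD5-modulo rule are hypothetical choices, and the resulting placement will not match the slide's picture.

```python
import hashlib

NUM_NODES = 3  # assumption: three nodes, as in the slide's picture


def node_for_key(key: str, num_nodes: int = NUM_NODES) -> int:
    """Map a key to a node index with a stable hash (hypothetical scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_nodes


def partition(pairs: dict, num_nodes: int = NUM_NODES) -> list:
    """Split key-value pairs into one dict per node."""
    nodes = [{} for _ in range(num_nodes)]
    for key, value in pairs.items():
        nodes[node_for_key(key, num_nodes)][key] = value
    return nodes


# Sample data taken from the slide.
ACCOUNT_DATA = {
    "Jack.Gold": 50123, "Jack.Exp": 4670, "Jack.Coin": 700,
    "Peter.Gold": 7050, "Peter.Exp": 20005, "Peter.Coin": 1,
}
NODES = partition(ACCOUNT_DATA)


def get(key: str):
    """Read a key by going straight to the node that owns it."""
    return NODES[node_for_key(key)].get(key)
```

Because the owner of a key is computed from the key itself, a client can reach the right node without asking a central server, which is the property that makes horizontal scaling possible.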
  • Active set on RAM (diagram: client ↔ active set in RAM → lazy write → HDD)
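The "active set in RAM, lazy write to HDD" idea can be sketched as a write-behind cache. This is a simplified Python illustration under stated assumptions: `LazyWriteStore` is a hypothetical class, a plain dict stands in for the hard disk, and `flush` is called by hand where a real server would flush in the background.

```python
class LazyWriteStore:
    """Write-behind cache: reads and writes hit RAM, disk is updated later."""

    def __init__(self):
        self.ram = {}       # active set
        self.disk = {}      # stand-in for the HDD (assumption)
        self.dirty = set()  # keys changed in RAM but not yet on disk

    def set(self, key, value):
        # Writes return as soon as RAM is updated.
        self.ram[key] = value
        self.dirty.add(key)

    def get(self, key):
        if key in self.ram:
            return self.ram[key]
        # Miss: fall back to disk and pull the value into the active set.
        value = self.disk.get(key)
        if value is not None:
            self.ram[key] = value
        return value

    def flush(self):
        """Persist all dirty keys; returns how many were written."""
        count = len(self.dirty)
        for key in self.dirty:
            self.disk[key] = self.ram[key]
        self.dirty.clear()
        return count
```

The trade-off is visible in the sketch: anything still in `dirty` is lost if the process dies before `flush` runs.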
  • Couchbase server
    • Based on Membase technology
    • Distributed
    • Replication
    • Since 1.8, has a native client for PHP
    • Bucket types
        – Couchbase (persistent)
        – Memcached (memory only)
  • NoSQL does not come without challenges
  • Our first SNS game with Couchbase
  • Architecture and design issues
    • Transition from relational database design to key-value design
        – Account data => keys: how?
    • Only minimal support for locking and concurrency control
        – add: fails if the key already exists, so it can serve as a mutex
        – cas: a read returns a cas value; a write fails if that cas value is out of date
  • Architecture and design issues
    • No transaction support
        – Data corruption becomes so easy!
    • No high-level data support (e.g. list, queue, …)
    • No tools for raw data viewing / editing
  • Pitfalls
    • Too much freedom for developers
        – Anyone can add / modify any key at any time
    • Epic key design mindset
        – One key for everything: bad performance, and concurrency control is a true nightmare
    • Abusing the power of set
        – It never fails! Developers LOVE it!
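The two primitives named on this slide, add and cas, can be illustrated with a small in-memory model. This is a Python sketch of the semantics only: `KeyValueStore`, `gets`, and `add_gold` are hypothetical names, not the Couchbase client API.

```python
class KeyValueStore:
    """In-memory model of add / cas semantics (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}     # key -> (value, cas token)
        self._next_cas = 1

    def _store(self, key, value):
        # Every successful write gives the key a fresh cas token.
        self._data[key] = (value, self._next_cas)
        self._next_cas += 1

    def add(self, key, value):
        """Create the key; fail if it already exists (usable as a mutex)."""
        if key in self._data:
            return False
        self._store(key, value)
        return True

    def gets(self, key):
        """Return (value, cas token) for an existing key."""
        return self._data[key]

    def cas(self, key, value, cas):
        """Write only if the key still carries the cas token we read."""
        if key not in self._data or self._data[key][1] != cas:
            return False
        self._store(key, value)
        return True

    def delete(self, key):
        return self._data.pop(key, None) is not None


def add_gold(store, key, amount):
    """Optimistic update: re-read and retry until our cas token is current."""
    while True:
        value, cas = store.gets(key)
        if store.cas(key, value + amount, cas):
            return value + amount
```

A lock built on add would take the form `store.add("lock:Jack", 1)` before the update and `store.delete("lock:Jack")` afterwards; the key name is made up for the example. A plain set has neither check, which is why it never fails.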
  • SSE – SNS Storage Engine
  • Our second SNS game with Couchbase
  • What is SSE?
    • A thin “layer” between developers and the all-mighty Couchbase
        – SSE is simply a PHP library
    • Provides better support for locking and concurrency control
        – Basic support for: begin – update – commit
    • Provides high-level data structures
        – Collection, queue, stack, integer (gold), inc-only integer (exp), binary flags (quest), …
  • What is SSE?
    • Minimizes the risk of weak concurrency support
        – Ability to roll back pending writes
    • Schema
        – Limits the freedom of developers!
        – No more nightmares for backup and raw data viewing / editing
    • Buffers to eliminate repeated reads / writes
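The slides do not show SSE's code, so the following is only a guess at the shape of a begin / update / commit layer with buffered reads and pending writes that can be rolled back. It is written in Python rather than PHP; `Session` and all its methods are hypothetical, and a plain dict stands in for the Couchbase bucket.

```python
class Session:
    """Hypothetical begin/update/commit wrapper over a dict-like backend."""

    def __init__(self, backend):
        self.backend = backend
        self.backend_reads = 0   # counts reads that reached the backend
        self._read_buffer = {}   # values already fetched in this session
        self._pending = {}       # writes not yet sent to the backend
        self._active = False

    def begin(self):
        self._read_buffer.clear()
        self._pending.clear()
        self._active = True

    def get(self, key):
        # Pending writes win, then the read buffer, then the backend.
        if key in self._pending:
            return self._pending[key]
        if key not in self._read_buffer:
            self._read_buffer[key] = self.backend.get(key)
            self.backend_reads += 1
        return self._read_buffer[key]

    def set(self, key, value):
        if not self._active:
            raise RuntimeError("call begin() before set()")
        self._pending[key] = value

    def commit(self):
        """Send all pending writes to the backend; returns how many."""
        written = len(self._pending)
        self.backend.update(self._pending)
        self._pending.clear()
        self._active = False
        return written

    def rollback(self):
        """Discard pending writes; the backend is left untouched."""
        self._pending.clear()
        self._active = False
```

In this sketch nothing reaches the backend until `commit`, so a failed game action can call `rollback` and leave the stored account as it was. A real layer would also need a lock or cas check at commit time, which is omitted here.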
  • Raw account view / editing tool
  • Multi-instance architecture
    • Replication costs too much performance
    • One failed node means a failed cluster
    • Adding nodes requires a rebalance
        – Only worthwhile for clusters with a large number of nodes (more than 20)
  • Multi-instance architecture
    • One instance for the index (user-to-instance mapping)
        – APC on the logic servers caches the mapping and reduces load on the index instance
    • Many data instances
        – The weight of each instance is adjusted dynamically based on its average load
        – A node failure only affects part of the user base
  • Multi-instance architecture (diagram: four game logic servers, each with an APC cache, in front of one index instance and data instances 1, 2 and 3)
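The routing described for the multi-instance architecture can be sketched as below. This is a Python illustration under several assumptions: `InstanceRouter` is a hypothetical class, a shared dict stands in for the index instance, a per-router dict stands in for APC, and new users are placed by weighted random choice, which is only one possible reading of "adjust weight on each instance".

```python
import random


class InstanceRouter:
    """One router per game logic server (hypothetical sketch)."""

    def __init__(self, index, weights, rng=None):
        self.index = index             # shared user -> instance map (index instance)
        self.weights = dict(weights)   # instance name -> placement weight
        self.local_cache = {}          # stand-in for APC on this logic server
        self.index_lookups = 0         # how often the index instance was hit
        self.rng = rng or random.Random()

    def _pick_instance(self):
        # Weighted random choice; a weight of 0 takes an instance out of rotation.
        names = list(self.weights)
        return self.rng.choices(names, weights=[self.weights[n] for n in names], k=1)[0]

    def set_weight(self, name, weight):
        """Adjust placement weight, e.g. lower it for a heavily loaded instance."""
        self.weights[name] = weight

    def instance_for(self, user_id):
        if user_id in self.local_cache:
            return self.local_cache[user_id]
        self.index_lookups += 1
        if user_id not in self.index:
            # New user: place once, then the mapping is permanent.
            self.index[user_id] = self._pick_instance()
        instance = self.index[user_id]
        self.local_cache[user_id] = instance
        return instance
```

Weights only affect where new users are placed; existing users stay on their instance, so changing a weight moves no data. That also matches the disadvantage noted in the slides: accesses may still be unevenly spread across instances.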
  • Disadvantages
    • Lower multi-get performance
    • Accesses are not well balanced between instances
  • How good is SSE for us?
    • No more data loss due to concurrency
    • No more data corruption
    • No mysterious bugs due to unintended writes
    • Workload of server developers reduced by more than 3 times