
Data storage solutions for SNS game



Published in: Technology, Design

  1. Data Storage Solutions for SNS game. Dinh Nguyen Anh Dung – P2S – G6 – VNG
  2. CONTENT
     • SNS games and SQL-based databases
     • NoSQL technology and Couchbase
     • NoSQL does not come without challenges
     • SNS Storage Engine (SSE)
  3. SNS games and SQL-based databases
  4. SNS game characteristics
     • A huge number of concurrent requests, each requiring a low response time
     • Accounts can be stored separately
       – No need for centralized storage
       – In most cases, no need to put strict constraints on data relationships
  5. Native limitations of SQL-based DBMS
     • Fundamentally centralized
       – Vertical scale-up issue
     • Schema
       – High risk (and cost) for updates
     • Normalized data
       – Unnecessary overhead: table joins, locking, data constraint checks, …
  6. Native limitations of SQL-based DBMS. Source: NoSQL - WhitePaper
  7. Native limitations of SQL-based DBMS
     • SQL processing overhead on both the DBMS and the client side
     • Most data accesses end up at the hard disk
       – Very challenging to meet low response times
       – Internal caching does not help much
     • Hard to distribute data across multiple servers
  8. NoSQL technology and Couchbase
  9. NoSQL technology
     • Persistent distributed hash table
     • Active set resides in RAM
       – Extremely fast response time
     • Horizontal scaling
     • Raw and direct data access
       – set, get, add, inc, dec: no overhead
  10. NoSQL technology
      One logical key-value table:

      Key         Value
      Jack.Gold   50123
      Jack.Exp    4670
      Jack.Coin   700
      Peter.Gold  7050
      Peter.Exp   20005
      Peter.Coin  1

      The same data distributed over three tables:

      Key         Value  |  Key        Value  |  Key         Value
      Peter.Gold  7050   |  Jack.Gold  50123  |  Peter.Coin  1
      Jack.Exp    4670   |  Jack.Coin  700    |  Peter.Exp   20005
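The split of one key-value table across several servers can be sketched as follows. This is a minimal illustration, not Couchbase's actual mapping: the function name `node_for_key`, the MD5-plus-modulo scheme, and the node count of 3 are all assumptions made for the example, so the resulting placement will not necessarily match the slide.

```python
import hashlib


def node_for_key(key: str, node_count: int) -> int:
    """Map a key to a node index with a stable hash (hypothetical scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % node_count


def distribute(records: dict, node_count: int) -> list:
    """Place every key-value pair on the node chosen by node_for_key."""
    nodes = [dict() for _ in range(node_count)]
    for key, value in records.items():
        nodes[node_for_key(key, node_count)][key] = value
    return nodes


# The six pairs shown on the slide.
records = {
    "Jack.Gold": 50123, "Jack.Exp": 4670, "Jack.Coin": 700,
    "Peter.Gold": 7050, "Peter.Exp": 20005, "Peter.Coin": 1,
}

nodes = distribute(records, 3)
for index, content in enumerate(nodes):
    print(f"node {index}: {content}")
```

Because the node is derived from the key alone, any client can locate a value without consulting a central server, which is what makes horizontal scaling possible.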
  11. Active set on RAM (diagram): CLIENT → ACTIVE SET ON RAM → (lazy write) → HDD
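The "active set on RAM, lazy write to HDD" idea can be illustrated with a small write-back cache. This is a sketch under assumptions: the class `WriteBackCache` and its methods are hypothetical names, a plain dict stands in for the hard disk, and a real server would flush in the background rather than on an explicit call.

```python
class WriteBackCache:
    """Serve reads and writes from RAM; persist dirty keys later."""

    def __init__(self, disk: dict):
        self.disk = disk      # stand-in for the HDD
        self.ram = {}         # the active set
        self.dirty = set()    # keys changed in RAM but not yet on disk

    def set(self, key, value):
        # The write returns as soon as RAM is updated.
        self.ram[key] = value
        self.dirty.add(key)

    def get(self, key):
        # Load from disk only on a RAM miss.
        if key not in self.ram:
            self.ram[key] = self.disk[key]
        return self.ram[key]

    def flush(self) -> int:
        """Write all dirty keys to disk; return how many were written."""
        written = len(self.dirty)
        for key in self.dirty:
            self.disk[key] = self.ram[key]
        self.dirty.clear()
        return written


disk = {"Jack.Gold": 50123}
cache = WriteBackCache(disk)
cache.set("Jack.Gold", 50223)   # the client sees the new value immediately
print(cache.get("Jack.Gold"), disk["Jack.Gold"])
cache.flush()                   # the lazy write catches up
print(disk["Jack.Gold"])
```

The trade-off is that data written to RAM but not yet flushed can be lost if the node fails in between.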
  12. Couchbase server
      • Based on Membase technology
      • Distributed
      • Replication
      • Native PHP client available since 1.8
      • Bucket types
        – Couchbase (persistent)
        – Memcached (memory only)
  13. NoSQL does not come without challenges
  14. Our first SNS game with Couchbase
  15. Architecture and design issues
      • Transition from relational database design to key-value design
        – Account data => keys: how?
      • Only minimal support for locking and concurrency control
        – add: fails if the key already exists, so it can serve as a mutex
        – cas: a read returns a CAS value; the write fails if that CAS value is outdated
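The two primitives named on this slide can be sketched with a tiny in-memory store. This is an illustration only: `TinyStore` and its method names are hypothetical and imitate memcached-style `add` and `cas` semantics; it is not the Couchbase client API.

```python
class TinyStore:
    """In-memory stand-in for a key-value server with add and cas."""

    def __init__(self):
        self._data = {}      # key -> (value, cas_token)
        self._next_cas = 1

    def _store(self, key, value):
        # Every successful write gives the key a fresh CAS token.
        self._data[key] = (value, self._next_cas)
        self._next_cas += 1

    def add(self, key, value) -> bool:
        """Store only if the key does not exist yet."""
        if key in self._data:
            return False
        self._store(key, value)
        return True

    def gets(self, key):
        """Return (value, cas_token) for an existing key."""
        return self._data[key]

    def cas(self, key, value, token) -> bool:
        """Store only if nobody has written the key since `token` was read."""
        if key not in self._data or self._data[key][1] != token:
            return False
        self._store(key, value)
        return True

    def delete(self, key):
        self._data.pop(key, None)


store = TinyStore()

# add as a mutex: only the first caller gets the lock.
first_lock = store.add("lock:Jack", 1)     # True
second_lock = store.add("lock:Jack", 1)    # False, the lock is taken
store.delete("lock:Jack")                  # release

# cas as optimistic concurrency control.
store.add("Jack.Gold", 50123)
value, token = store.gets("Jack.Gold")
fresh_write = store.cas("Jack.Gold", value + 100, token)   # True
stale_write = store.cas("Jack.Gold", value - 50, token)    # False, token is outdated
```

A caller whose `cas` fails is expected to read the key again and retry. A plain `set` skips this check entirely, which is why it never fails and can silently overwrite another writer's update.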
  16. Architecture and design issues
      • No transaction support
        – Data corruption becomes so easy!
      • No high-level data support (e.g. list, queue, …)
      • No tools for raw data viewing / editing
  17. Pitfalls
      • Too much freedom for developers
        – Anyone can add / modify any key at any time
      • Epic key design mindset
        – One key for all: bad performance, and concurrency control is a true nightmare
      • Abusing the power of set
        – It never fails! Developers LOVE it!
  18. SSE – SNS Storage Engine
  19. Our second SNS game with Couchbase
  20. What is SSE?
      • A thin "layer" between developers and the all-mighty Couchbase
        – SSE is simply a PHP library
      • Provides better support for locking and concurrency control
        – Basic support for: begin – update – commit
      • Provides high-level data structures
        – Collection, queue, stack, integer (gold), inc-only integer (exp), binary flags (quest), …
  21. What is SSE?
      • Minimizes the risk of weak concurrency support
        – Ability to roll back pending writes
      • Schema
        – Limits the freedom of developers!
        – No more nightmares for backup and raw data viewing / editing
      • Buffers to eliminate repeated reads / writes
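The begin – update – commit flow, with buffered reads and writes that can be rolled back, might look like the sketch below. The slides do not show SSE's real interface (and SSE is a PHP library), so `Session`, `buy_item`, and every method name here are hypothetical, and a plain dict stands in for Couchbase.

```python
class Session:
    """Buffer reads and pending writes in front of a key-value backend."""

    def __init__(self, backend: dict):
        self._backend = backend
        self._read_cache = {}   # avoids repeated reads of the same key
        self._pending = {}      # writes held back until commit

    def get(self, key, default=None):
        if key in self._pending:
            return self._pending[key]
        if key not in self._read_cache:
            self._read_cache[key] = self._backend.get(key, default)
        return self._read_cache[key]

    def set(self, key, value):
        # Nothing reaches the backend yet; repeated writes collapse into one.
        self._pending[key] = value

    def commit(self):
        self._backend.update(self._pending)
        self._read_cache.update(self._pending)
        self._pending.clear()

    def rollback(self):
        # Drop all pending writes; the backend was never touched.
        self._pending.clear()


def buy_item(session: Session, user: str, price: int, exp_reward: int) -> bool:
    """Grant exp and deduct gold together, or change nothing at all."""
    session.set(f"{user}.Exp", session.get(f"{user}.Exp", 0) + exp_reward)
    gold = session.get(f"{user}.Gold", 0)
    if gold < price:
        session.rollback()
        return False
    session.set(f"{user}.Gold", gold - price)
    session.commit()
    return True


backend = {"Jack.Gold": 50123, "Jack.Exp": 4670}
print(buy_item(Session(backend), "Jack", 100, 10), backend)
print(buy_item(Session(backend), "Jack", 10**6, 10), backend)
```

In this sketch the commit is only a dict update. Against a real server, the commit step would still need a lock or CAS check per key, since the store itself offers no multi-key transaction.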
  22. Raw account view / editing tool
  23. What is SSE?
  24. What is SSE?
  25. Multi-instance architecture
      • Replication costs too much performance
      • One failed node means a failed cluster
      • Adding nodes requires a rebalance
        – Only good for clusters with a large number of nodes (more than 20 nodes)
  26. Multi-instance architecture
      • One instance for the index (user-to-instance mapping)
        – Use APC on the logic servers as a cache to reduce load on the index instance
      • Many data instances
        – Dynamically adjust the weight of each instance based on its average load
        – A node failure only affects part of the user base
  27. Multi-instance architecture (diagram)
      • Four Game Logic servers, each with its own APC cache
      • One Index Instance
      • Three data instances: Data Instance 1, Data Instance 2, Data Instance 3
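The user-to-instance mapping with a local cache and per-instance weights can be sketched as follows. All names are assumptions for illustration: `InstanceRouter` is hypothetical, one dict stands in for the index instance and another for the APC cache on a game logic server, and the weights are set by hand rather than derived from measured load.

```python
import random


class InstanceRouter:
    """Resolve which data instance holds a user's account."""

    def __init__(self, weights: dict):
        self.weights = dict(weights)   # instance name -> weight for new users
        self.index = {}                # stand-in for the index instance
        self.local_cache = {}          # stand-in for APC on a logic server
        self.index_reads = 0           # how often the index instance was hit

    def _pick_instance(self, rng) -> str:
        names = list(self.weights)
        chosen = rng.choices(names, weights=[self.weights[n] for n in names], k=1)
        return chosen[0]

    def instance_for(self, user_id: str, rng=random) -> str:
        # Most lookups are answered locally and never reach the index.
        if user_id in self.local_cache:
            return self.local_cache[user_id]
        self.index_reads += 1
        if user_id not in self.index:
            # New user: assign an instance according to the current weights.
            self.index[user_id] = self._pick_instance(rng)
        self.local_cache[user_id] = self.index[user_id]
        return self.local_cache[user_id]


# Data Instance 2 is overloaded, so its weight is set to 0 for new users.
router = InstanceRouter({"data1": 1, "data2": 0, "data3": 1})
rng = random.Random(42)
placement = {f"user{i}": router.instance_for(f"user{i}", rng) for i in range(100)}
print(sorted(set(placement.values())))
```

Existing users stay where the index says they are; weights only steer where new accounts land, which is how load can be shifted without a cluster rebalance.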
  28. Disadvantages
      • Lower multi-get performance
      • Accesses are not well balanced between instances
  29. How good is SSE for us?
      • No more data loss due to concurrency
      • No more data corruption
      • No mysterious bugs due to unintended writes
      • Server developers' workload reduced by more than 3 times