Zing Database – Distributed Key-Value Database
Presentation Transcript

    • Zing Database – Distributed Key-Value Database. Nguyễn Quang Nam, Zing Web-Technical Team
    • Content: 1. Introduction 2. Why 3. Overview architecture 4. Single Server/Storage 5. Distribution
    • Introduction
    • Some statistics: - Feeds: 1.6 B items, 700 GB of disk across 4 DB instances, 8 caching servers, 136 GB of memory cache in use. - User profiles: 44.5 M registered accounts, 2 database instances, 30 GB memory cache. - Comments: 350 M items, 50 GB of disk across 2 DB instances, 20 GB memory cache.
    • Why
    • Access time (by Jeff Dean, http://labs.google.com/people/jeff):
      L1 cache reference: 0.5 ns
      Branch mispredict: 5 ns
      L2 cache reference: 7 ns
      Mutex lock/unlock: 100 ns
      Main memory reference: 100 ns
      Compress 1K bytes with Zippy: 10,000 ns
      Send 2K bytes over 1 Gbps network: 20,000 ns
      Read 1 MB sequentially from memory: 250,000 ns
      Round trip within same datacenter: 500,000 ns
      Disk seek: 10,000,000 ns
      Read 1 MB sequentially from network: 10,000,000 ns
      Read 1 MB sequentially from disk: 30,000,000 ns
      Send packet CA->Netherlands->CA: 150,000,000 ns
    • Standard & real requirements: - Time to load a page: < 200 ms - Read rate: ~12K ops/sec - Write rate: ~8K ops/sec - Caching service/database recovery time: < 5 minutes
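    Not from the slides, but a quick back-of-the-envelope check using Jeff Dean's numbers above shows why these targets force a memory-centric design: a 10 ms disk seek caps one spindle at roughly 100 random reads per second, so

        \[ \frac{12{,}000\ \text{reads/s}}{\sim 100\ \text{seeks/s per disk}} \approx 120\ \text{disks} \]

    would be needed to serve the read rate from disk alone, while a 100 ns main-memory reference makes the same rate trivial for a RAM cache.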
    • Existing options: - RDBMS (MySQL, MSSQL): writes are too slow; reads are so-so on a small DB, too bad on a huge DB - Cassandra (by Facebook): difficult to operate/maintain, and performance is not so good - HBase/Hadoop: we use this for the log system - MongoDB, Membase, Tokyo Tyrant, ..: OK! we use these in several cases, but they are not suitable for everything
    • Overview architecture
    • Server/Storage
    • ZNonblockingServer - Based on TNonblockingServer (Apache Thrift) - 185K reqs/sec (the original TNonblockingServer reaches just 45K reqs/sec) - Serializes/deserializes data - Prevents server overload - Data is not secured in transit - Protects the service from invalid requests (see the sketch below)
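    The slides show no code, so the following is only a sketch of the last two bullets: validating a Thrift-style framed request (4-byte big-endian length prefix) and shedding load once too many requests are in flight. The constants and names (kMaxFrameBytes, kMaxInflight, admit_request) are invented for illustration, not ZNonblockingServer internals.

        #include <arpa/inet.h>
        #include <atomic>
        #include <cstddef>
        #include <cstdint>
        #include <cstring>

        // Hypothetical limits; the real thresholds are not published.
        constexpr uint32_t kMaxFrameBytes = 1 << 20;  // reject frames over 1 MB
        constexpr int kMaxInflight = 10000;           // shed load past this depth

        std::atomic<int> g_inflight{0};  // decremented when a request completes

        // Returns true if a framed request may be dispatched to a worker.
        bool admit_request(const uint8_t* buf, std::size_t len) {
            if (len < 4) return false;                // need the length prefix
            uint32_t frame_len;
            std::memcpy(&frame_len, buf, 4);
            frame_len = ntohl(frame_len);             // Thrift framing is big-endian
            if (frame_len == 0 || frame_len > kMaxFrameBytes)
                return false;                         // malformed or oversized
            if (g_inflight.load(std::memory_order_relaxed) >= kMaxInflight)
                return false;                         // overload: reject, don't queue
            ++g_inflight;
            return true;
        }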
    • ICache - Least Recently Used / time-based expiration strategy - zlru_table<key_type, value_type>: hash table data structure - Rewritten malloc/free functions, instead of the standard glibc malloc/free, to reduce memory fragmentation - Supports dirty-item marking => enables lazy DB flushes
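    A minimal sketch of the LRU-with-dirty-marking idea, assuming the usual list + hash-map layout; zlru_table's real interface and custom allocator are not shown in the slides.

        #include <cstddef>
        #include <list>
        #include <unordered_map>

        // Illustrative LRU cache with dirty marking for lazy DB flush; not zlru_table.
        template <typename K, typename V>
        class lru_table {
            struct entry { K key; V value; bool dirty; };
            std::list<entry> order_;                     // front = most recently used
            std::unordered_map<K, typename std::list<entry>::iterator> index_;
            std::size_t capacity_;
        public:
            explicit lru_table(std::size_t capacity) : capacity_(capacity) {}

            void put(const K& k, const V& v, bool dirty = true) {
                auto it = index_.find(k);
                if (it != index_.end()) order_.erase(it->second);
                order_.push_front({k, v, dirty});
                index_[k] = order_.begin();
                if (order_.size() > capacity_) {         // evict the LRU victim
                    // A real cache would flush the victim to ZiDB here if it is dirty.
                    index_.erase(order_.back().key);
                    order_.pop_back();
                }
            }

            V* get(const K& k) {                         // a read refreshes recency
                auto it = index_.find(k);
                if (it == index_.end()) return nullptr;
                order_.splice(order_.begin(), order_, it->second);
                return &it->second->value;
            }
        };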
    • ZiDB - Separated into a DataFile & an IndexFile - 1 seek for a read, 1-2 seeks for a write - The IndexFile (hash structure) is loaded into memory as a mapped file (shared memory) to reduce system calls - Write-ahead log to avoid data loss - Data magic-padding - Checksums & checkpoints for data repair - DB partitioning for easier maintenance
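    As an illustration of the mapped-index idea (the record layout and names are assumptions, not ZiDB's on-disk format), the sketch below mmaps the index file, so lookups cost no read() system calls, and answers a read with a single pread on the data file.

        #include <fcntl.h>
        #include <sys/mman.h>
        #include <sys/stat.h>
        #include <unistd.h>
        #include <cstddef>
        #include <cstdint>
        #include <vector>

        // Hypothetical index slot: hashed key -> (offset, length) in the data file.
        struct index_slot { uint64_t key_hash; uint64_t offset; uint32_t length; };

        // Map the whole index file into memory (shared, read-only).
        const index_slot* map_index(const char* path, std::size_t& nslots) {
            int fd = open(path, O_RDONLY);
            if (fd < 0) return nullptr;
            struct stat st;
            if (fstat(fd, &st) < 0) { close(fd); return nullptr; }
            void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
            close(fd);                                  // the mapping survives close()
            if (p == MAP_FAILED) return nullptr;
            nslots = st.st_size / sizeof(index_slot);
            return static_cast<const index_slot*>(p);
        }

        // One disk seek per read: the index lookup itself is pure memory work.
        bool read_value(int data_fd, const index_slot& slot, std::vector<char>& out) {
            out.resize(slot.length);
            return pread(data_fd, out.data(), slot.length, slot.offset)
                   == static_cast<ssize_t>(slot.length);
        }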
    • Distribution
    • Key requirements: - Scalability - Load balance - Availability - Consistency
    • 2 models: - Centralized: 1 addressing server & multiple storage servers => bottleneck & single point of failure - Peer-to-peer: each server includes both an addressing module & storage. 2 types of routing: - Client routing: each client does the addressing itself and queries the data - Server routing: the addressing is done at the server
    • Operation flows (* in the peer-to-peer model, the addressing module moves into each storage node): (1) the Business Logic Server requests key locations from the Addressing Server (DHT); (2) the Addressing Server returns the key locations; (3) the Business Logic Server issues Get & Set operations against the storage layer (Storage Nodes 1..N, each running ICache, ZiDB & a Storage Module); (4) the operations return.
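    A toy end-to-end version of this flow (all names invented; it uses simple modulo placement where the real system uses the consistent hashing described next, and direct calls where the real system uses Thrift RPC):

        #include <functional>
        #include <iostream>
        #include <map>
        #include <string>
        #include <vector>

        struct StorageNode {               // stands in for ICache + ZiDB + Storage Module
            std::map<std::string, std::string> data;
        };

        struct AddressingServer {          // stands in for the DHT
            std::vector<StorageNode>* nodes;
            StorageNode& locate(const std::string& key) {      // steps (1)+(2)
                return (*nodes)[std::hash<std::string>{}(key) % nodes->size()];
            }
        };

        int main() {
            std::vector<StorageNode> nodes(4);
            AddressingServer dht{&nodes};
            const std::string key = "user:42";
            dht.locate(key).data[key] = "Nam";                 // step (3): Set
            std::cout << dht.locate(key).data[key] << "\n";    // steps (3)+(4): Get
        }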
    • Addressing: - Provides the key locations of resources - Basically a Distributed Hash Table, using consistent hashing - Hashing: Jenkins, Murmur, or any algorithm that satisfies two conditions: - uniform distribution of generated keys over the key space - consistency (MD5 and SHA are bad choices because of their performance)
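    A minimal consistent-hashing ring, assuming std::map as the sorted ring and std::hash in place of Jenkins/Murmur (which the slides name but do not show):

        #include <cstdint>
        #include <functional>
        #include <map>
        #include <string>

        // Illustrative ring; Zing's node IDs and hash functions differ.
        class hash_ring {
            std::map<uint64_t, std::string> ring_;   // ring position -> node address
        public:
            void add_node(const std::string& node) {
                ring_[std::hash<std::string>{}(node)] = node;
            }
            // Each node owns the arc of hashed keys ending at its position.
            // Assumes at least one node has been added.
            const std::string& locate(const std::string& key) const {
                uint64_t h = std::hash<std::string>{}(key);
                auto it = ring_.lower_bound(h);            // first node at or after key
                if (it == ring_.end()) it = ring_.begin(); // wrap around the ring
                return it->second;
            }
        };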
    • Addressing - Node location: each node is assigned a contiguous range of IDs (hashed keys)
    • Addressing - Node location: golden-ratio principle, splitting ranges so that a/b = (a+b)/a - Initial ratio = 1.618 - Max ratio ~ 2.6 - Easy to implement - Easy routing from the client
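    A short worked gloss (mine, not from the slides) of why the max ratio stays near 2.6: splitting an arc of length \ell at the golden point yields pieces \ell/\varphi and \ell/\varphi^2, so if each new node always splits the currently largest arc, arc lengths differ by at most

        \[ \varphi^2 = \varphi + 1 \approx 2.618, \qquad \varphi = \tfrac{1+\sqrt{5}}{2} \approx 1.618, \]

    which matches the "max ratio ~ 2.6" figure on the slide.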
    • Addressing - Node location: virtual nodes - Each real server places multiple virtual nodes on the ring (e.g. Server 1: virtual nodes 1, 2, 3; Server 2: 4, 5, 6, 7; Server 3: 8, 9) - More virtual nodes => better load balance - Harder to maintain the table of nodes
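    Extending the hash_ring sketch above (still an illustration, not Zing's code), virtual nodes are simply several ring positions derived from one server name:

        #include <string>

        // Registers `vnodes` positions for one physical server on the ring above.
        // Suffixing the name ("10.0.0.1#0", "10.0.0.1#1", ...) scatters its arcs.
        void add_server(hash_ring& ring, const std::string& server, int vnodes) {
            for (int i = 0; i < vnodes; ++i)
                ring.add_node(server + "#" + std::to_string(i));
        }

    A lookup then returns one of the suffixed names, which the client maps back to the owning physical server; more virtual nodes smooth the load at the cost of a larger node table, as the slide notes.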
    • Addressing – Multi-layer rings - Store the change history of the system - Provide availability/reconfigurability - A node can be placed on a ring manually * Write: data is located on the highest ring * Read: data is looked up on the highest ring, then on lower rings if not found
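    The layered read path might look like the following sketch, reusing the hash_ring type from above; the has_key callback stands in for an RPC asking a node whether it holds the key (an assumption, since the slides state only the highest-ring-first rule):

        #include <optional>
        #include <string>
        #include <vector>

        // rings are ordered highest (newest layout) first; writes target rings.front().
        std::optional<std::string> locate_for_read(
                const std::vector<hash_ring>& rings, const std::string& key,
                bool (*has_key)(const std::string& node, const std::string& key)) {
            for (const auto& ring : rings) {          // highest ring first
                const std::string& node = ring.locate(key);
                if (has_key(node, key)) return node;  // found on this layer
            }
            return std::nullopt;                      // miss on every ring
        }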
    • Replication & Backup - Each node has one primary range of IDs and some secondary ranges of IDs - Each real node needs a backup instance to take over in case it goes down * Data is queried from the primary node first, then from secondary nodes
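    Replica fallback on reads could follow the same pattern (illustrative only; the slides do not give the replica-selection details, and get_from stands in for the storage-node RPC):

        #include <optional>
        #include <string>
        #include <vector>

        // Try the primary owner of the key's range first, then each secondary holder.
        std::optional<std::string> replicated_get(
                const std::string& primary, const std::vector<std::string>& secondaries,
                std::optional<std::string> (*get_from)(const std::string& node,
                                                       const std::string& key),
                const std::string& key) {
            if (auto v = get_from(primary, key)) return v;  // normal path
            for (const auto& node : secondaries)            // primary down or missing
                if (auto v = get_from(node, key)) return v;
            return std::nullopt;
        }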
    • Configuration: finding the best parameters to configure the DB, or choosing a suitable DB type: - How many reads/writes per second? - Deviation of data lengths: are records roughly the same size, or do they vary widely? - Are there updates/deletions? - How important is the data: is loss acceptable or not? - Can old data be recycled?
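    Those questions could be collected into a sizing profile such as the hypothetical struct below (every field name is invented for illustration; the slides define no such structure):

        #include <cstdint>

        // Hypothetical workload profile answering the configuration checklist above.
        struct workload_profile {
            uint32_t reads_per_sec;        // expected read rate
            uint32_t writes_per_sec;       // expected write rate
            uint32_t min_value_bytes;      // length deviation: uniform vs variable
            uint32_t max_value_bytes;
            bool     has_updates_deletes;  // append-only data allows a simpler store
            bool     loss_acceptable;      // if not, write-ahead logging is mandatory
            bool     old_data_recyclable;  // enables time-based expiration in ICache
        };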
    • Q & A Contact: Nguyễn Quang Nam [email_address] http://me.zing.vn/nam.nq