DISTRIBUTED STORAGE
SYSTEM
Mr. Dương Công Lợi
Company: VNG-Corp
Tel: +84989510016
Email: loiduongcong@gmail.com
CONTENTS
 1. What is a distributed-computing system?
 2. Principle of distributed database/storage systems
 3. Distributed storage system paradigm
 4. UniversalDistributedStorage
1. WHAT IS A DISTRIBUTED-COMPUTING
SYSTEM?
 Distributed-Computing is the process of solving a
computational problem using a distributed
system.
 A distributed system is a computing system in
which a number of components on multiple
computers cooperate by communicating over a
network to achieve a common goal.
DISTRIBUTED DATABASE/STORAGE
SYSTEM
 In a distributed database system, the database is
stored on several computers.

 A distributed database is a collection of multiple,
logically interrelated databases distributed over a computer network.
DISTRIBUTED SYSTEM ADVANTAGES
 Advantages
 Avoids bottlenecks and single points of failure
 Better scalability
 Higher availability
 Routing models
 Client routing: the client sends each request directly to the
appropriate server to read/write data
 Server routing: a server forwards the client's request to the
appropriate server and returns the result to the client
* The two models above can be combined in one system
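The two routing models can be sketched as follows. This is a minimal in-process illustration, not the author's implementation: the `Server` class, the three server names, and the modulo placement rule in `owner_of` are all assumptions made for the example.

```python
# Hypothetical three-server cluster; the modulo placement rule is an assumption.
SERVERS = ["server1", "server2", "server3"]

def owner_of(key: int) -> str:
    """Return the server responsible for `key` (illustrative placement rule)."""
    return SERVERS[key % len(SERVERS)]

class Server:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster  # name -> Server, shared by all servers
        self.data = {}

    def handle_put(self, key, value):
        """Store locally if this server owns the key, otherwise forward.
        Returns the number of servers the request passed through."""
        target = owner_of(key)
        if target == self.name:
            self.data[key] = value
            return 1
        return 1 + self.cluster[target].handle_put(key, value)

cluster = {}
for name in SERVERS:
    cluster[name] = Server(name, cluster)

def client_routed_put(key, value):
    """Client routing: the client works out the owner and contacts it directly."""
    return cluster[owner_of(key)].handle_put(key, value)

def server_routed_put(key, value, entry="server1"):
    """Server routing: the client always contacts `entry`, which forwards if needed."""
    return cluster[entry].handle_put(key, value)

print(client_routed_put(4, "a"))  # 1: server2 owns key 4 and is contacted directly
print(server_routed_put(5, "b"))  # 2: enters at server1, forwarded to server3
```

Client routing saves a hop but requires every client to know the placement rule; server routing keeps clients simple at the cost of an extra forward.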
DISTRIBUTED STORAGE SYSTEM
 Store some data {1,2,3,4,6,7,8} on 1 server
 Or store the same data across 3 distributed servers
[Figure: one server holding {1,2,3,4,6,7,8}; three servers holding {1,2,3}, {4,6}, and {7,8}]
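The slide does not say how the items are assigned to the three servers. A minimal sketch, assuming contiguous ranges of near-equal size (the function name `split_into_ranges` is hypothetical), reproduces the split shown:

```python
def split_into_ranges(items, n_servers):
    """Split sorted items into n_servers contiguous, near-equal chunks."""
    items = sorted(items)
    base, extra = divmod(len(items), n_servers)
    chunks, start = [], 0
    for i in range(n_servers):
        size = base + (1 if i < extra else 0)  # the first `extra` servers get one more item
        chunks.append(items[start:start + size])
        start += size
    return chunks

data = [1, 2, 3, 4, 6, 7, 8]
print(split_into_ranges(data, 1))  # [[1, 2, 3, 4, 6, 7, 8]]
print(split_into_ranges(data, 3))  # [[1, 2, 3], [4, 6], [7, 8]]
```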
2. PRINCIPLE OF DISTRIBUTED
DATABASE/STORAGE SYSTEM
 Shard data by key and store each item on the appropriate
server using a Distributed Hash Table (DHT)
 The DHT must use consistent hashing:
 Uniform distribution of generated hash values
 Consistent
 Jenkins and Murmur are good choices; MD5 and SHA are
slower
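As an illustration of the uniform-distribution requirement, here is a sketch of Bob Jenkins's one-at-a-time hash together with a quick bucket count. The key names `item-0`, `item-1`, ... and the bucket count are assumptions for the example; the slides do not say which Jenkins variant is meant.

```python
from collections import Counter

def jenkins_one_at_a_time(key: bytes) -> int:
    """Bob Jenkins's one-at-a-time hash, returned as an unsigned 32-bit integer."""
    h = 0
    for byte in key:
        h = (h + byte) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h

def bucket_counts(n_keys: int, n_buckets: int) -> Counter:
    """Hash hypothetical keys 'item-0', 'item-1', ... and count keys per bucket."""
    counts = Counter()
    for i in range(n_keys):
        counts[jenkins_one_at_a_time(f"item-{i}".encode()) % n_buckets] += 1
    return counts

print(hex(jenkins_one_at_a_time(b"a")))  # 0xca2e9442
print(sorted(bucket_counts(30000, 3).items()))
```

A hash suitable for sharding should give bucket counts close to one third each in the second line of output.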
CANONICAL PROBLEMS IN DISTRIBUTED
SYSTEMS
 Distributed data independence
 Distributed transactions: ACID (Atomicity,
Consistency, Isolation, Durability) requirements
 Fault tolerance
 Transparency
3. DISTRIBUTED STORAGE SYSTEM
PARADIGM
 Data Hashing/Addressing
 Determine which server each data item is stored on
 Data Replication
 Store data on multiple server nodes for higher availability and
fault tolerance
DISTRIBUTED STORAGE SYSTEM
ARCHITECTURE
 Data Hashing/Addressing
 Use the DHT to map each server (by its server name) to a
number, and place it on a circle called the key
space
 Use the DHT to address each data key k and find the server that stores it:
successor(k) = ceiling(addressing(k))
 successor(k): the server that stores k, i.e. the first server at or
after addressing(k) on the circle
[Figure: key-space circle starting at 0, with server1, server2, and server3 placed on it]
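The successor(k) rule can be sketched with a sorted list and a binary search. This is an illustration under stated assumptions: the `Ring` class and the 32-bit key space are hypothetical, and MD5 is used only because it ships with Python's standard library (a Jenkins or Murmur hash would be dropped in the same way).

```python
import bisect
import hashlib

KEY_SPACE = 2 ** 32  # size of the circular key space (an assumption)

def addressing(name: str) -> int:
    """Map a server name or data key to a point on the circle.
    MD5 is a standard-library stand-in for Jenkins/Murmur."""
    return int.from_bytes(hashlib.md5(name.encode()).digest()[:4], "big") % KEY_SPACE

class Ring:
    def __init__(self, servers):
        # Sorted (position, server) pairs form the circle.
        self.points = sorted((addressing(s), s) for s in servers)

    def successor(self, key: str) -> str:
        """Return the first server at or after addressing(key), wrapping past 0."""
        positions = [p for p, _ in self.points]
        i = bisect.bisect_left(positions, addressing(key))
        return self.points[i % len(self.points)][1]

ring = Ring(["server1", "server2", "server3"])
for k in ["k1", "k2", "k3"]:
    print(k, "->", ring.successor(k))
```

With this layout, removing one server only moves the keys that server held; every other key keeps its successor.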
DISTRIBUTED STORAGE SYSTEM
ARCHITECTURE
 Addressing: virtual nodes
 Each server is assigned several node IDs (virtual nodes) so that
keys are distributed evenly and load is balanced
Server1: n1, n4, n6
Server2: n2, n7
Server3: n3, n5
[Figure: key-space circle starting at 0, with the virtual nodes n1 to n7 of server1, server2, and server3 interleaved around it]
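A rough way to see the effect of virtual nodes is to count how many keys land on each server with one node ID per server and with many. In this sketch the `server#i` naming of virtual nodes, the key names, and the use of MD5 as a stand-in for Jenkins/Murmur are all assumptions.

```python
import bisect
import hashlib
from collections import Counter

def addressing(name: str) -> int:
    """Point on a 32-bit circle; MD5 is a stand-in for Jenkins/Murmur."""
    return int.from_bytes(hashlib.md5(name.encode()).digest()[:4], "big")

def build_ring(servers, vnodes_per_server):
    """Place `vnodes_per_server` virtual nodes per server on the circle.
    The 'server#i' naming of virtual nodes is a hypothetical convention."""
    return sorted(
        (addressing(f"{s}#{i}"), s)
        for s in servers
        for i in range(vnodes_per_server)
    )

def lookup(ring, key: str) -> str:
    """Return the server owning the first virtual node at or after the key."""
    positions = [p for p, _ in ring]
    i = bisect.bisect_left(positions, addressing(key))
    return ring[i % len(ring)][1]

def load(ring, n_keys=10000):
    """Count how many of n_keys hypothetical keys land on each server."""
    return Counter(lookup(ring, f"key-{i}") for i in range(n_keys))

servers = ["server1", "server2", "server3"]
print("1 node id per server:   ", dict(load(build_ring(servers, 1))))
print("100 node ids per server:", dict(load(build_ring(servers, 100))))
```

The counts in the second line are typically much closer to each other than in the first, which is the load-balancing effect the slide describes.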
DISTRIBUTED STORAGE SYSTEM
ARCHITECTURE
 Data Replication
 Data k1 is stored on server1 as master and on
server2 as slave
[Figure: key-space circle starting at 0, with server1, server2, and server3; key k1 is placed on the circle]
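One common way to pick the slave is to take the next distinct server clockwise after the master. The slide does not state the rule used, so the sketch below is an assumption, as are the `replicas` function and the MD5 stand-in hash.

```python
import bisect
import hashlib

def addressing(name: str) -> int:
    """Point on a 32-bit circle; MD5 is a stand-in for Jenkins/Murmur."""
    return int.from_bytes(hashlib.md5(name.encode()).digest()[:4], "big")

def replicas(servers, key: str, n_copies: int = 2):
    """Return [master, slave, ...]: the successor of the key, then the
    following distinct servers clockwise on the circle."""
    ring = sorted((addressing(s), s) for s in servers)
    positions = [p for p, _ in ring]
    start = bisect.bisect_left(positions, addressing(key))
    chosen = []
    for step in range(len(ring)):
        server = ring[(start + step) % len(ring)][1]
        if server not in chosen:
            chosen.append(server)
        if len(chosen) == n_copies:
            break
    return chosen

master, slave = replicas(["server1", "server2", "server3"], "k1")
print("master:", master, "slave:", slave)
```

If the master fails, the slave already holds a copy of k1 and is also the server that the successor rule would select next.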
UNIVERSALDISTRIBUTEDSTORAGE
a distributed storage system
4. UNIVERSALDISTRIBUTEDSTORAGE
 UniversalDistributedStorage is a distributed
storage system developed for:
 Distributed data independence
 Distributed transactions (ACID)
 Fault tolerance
 Leader election (to decide on server nodes joining or leaving)
 Multi-master replication
 Transparency
UNIVERSALDISTRIBUTEDSTORAGE
ARCHITECTURE
 Overview
[Figure: architecture overview. Three stacks, each consisting of a Business Layer, a Distributed Layer, and a Storage Layer]
UNIVERSALDISTRIBUTEDSTORAGE
FEATURE
 Data hashing/addressing
 Use Murmur hashing function
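The slides do not say which Murmur variant is used. As an illustration only, the sketch below assumes MurmurHash3 (x86, 32-bit) and implements it in pure Python; the function name `murmur3_32` and the three-server modulo at the end are hypothetical.

```python
MASK = 0xFFFFFFFF

def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK

def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 x86 32-bit, returned as an unsigned integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK
    n_blocks = len(data) // 4
    # Body: mix in the input four bytes at a time (little-endian).
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & MASK
        k = _rotl32(k, 15)
        k = (k * c2) & MASK
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK
    # Tail: the remaining 1 to 3 bytes, if any.
    tail = data[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK
        k = _rotl32(k, 15)
        k = (k * c2) & MASK
        h ^= k
    # Finalization: fold in the length and force avalanche.
    h ^= len(data) & MASK
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK
    h ^= h >> 16
    return h

print(hex(murmur3_32(b"", seed=1)))  # 0x514e28b7
print(murmur3_32(b"k1") % 3)         # index of a hypothetical server for key "k1"
```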
UNIVERSALDISTRIBUTEDSTORAGE
FEATURE
 Leader election
 Use Bully Leader Election algorithm
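In the Bully algorithm, the live node with the highest ID becomes leader. The sketch below simulates one election in a single process; the node IDs and the `alive` set are assumptions, and real message passing, timeouts, and failure detection are left out.

```python
def bully_election(node_ids, alive, initiator):
    """Run one Bully election started by `initiator` and return the new leader id.

    node_ids: all node ids in the cluster (higher id = higher priority)
    alive:    set of ids that currently respond to messages
    """
    if initiator not in alive:
        raise ValueError("the initiator must be alive")
    candidate = initiator
    while True:
        # Send ELECTION to every node with a higher id; live ones answer OK.
        responders = [n for n in node_ids if n > candidate and n in alive]
        if not responders:
            # Nobody higher answered: the candidate wins and announces itself
            # to all other nodes as COORDINATOR.
            return candidate
        # In the real protocol every responder starts its own election in
        # parallel; following one of them is enough to find the outcome here.
        candidate = min(responders)

node_ids = [1, 2, 3, 4, 5]
alive = {1, 2, 3, 4}                        # node 5, the old leader, has left
print(bully_election(node_ids, alive, 2))   # 4
alive.add(5)                                # node 5 rejoins and starts an election
print(bully_election(node_ids, alive, 5))   # 5
```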
UNIVERSALDISTRIBUTEDSTORAGE
FEATURE
 Multi-master replication
 Problem of multi-master replication
UNIVERSALDISTRIBUTEDSTORAGE
FEATURE
 Multi-master replication
 Data is written to the main master (called the sub-leader), then
posted to a queue to be synced to the other masters.
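The write path described above can be sketched as follows. The `Master` and `ReplicationGroup` classes are hypothetical, and the queue is an in-memory `deque`; a real system would sync over the network and handle failures and conflicts.

```python
from collections import deque

class Master:
    def __init__(self, name):
        self.name = name
        self.data = {}

class ReplicationGroup:
    """One sub-leader plus other masters kept in sync through a queue."""

    def __init__(self, sub_leader, others):
        self.sub_leader = sub_leader
        self.others = others
        self.sync_queue = deque()  # pending (key, value) updates, in write order

    def put(self, key, value):
        # 1. The write is applied on the sub-leader first.
        self.sub_leader.data[key] = value
        # 2. The update is then queued for the other masters.
        self.sync_queue.append((key, value))

    def sync(self):
        """Drain the queue, applying each update to every other master in order."""
        while self.sync_queue:
            key, value = self.sync_queue.popleft()
            for master in self.others:
                master.data[key] = value

group = ReplicationGroup(Master("m1"), [Master("m2"), Master("m3")])
group.put("k1", "v1")
print(group.others[0].data)  # {} (not synced yet)
group.sync()
print(group.others[0].data)  # {'k1': 'v1'}
```

Funnelling every write for a key through one sub-leader gives the other masters a single, ordered stream of updates to apply.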
UNIVERSALDISTRIBUTEDSTORAGE
STATISTICS
 System information:
 3 machines, each with 8 GB RAM and a Core i5 at 3.220 GHz
 LAN/WAN network
 7 physical servers on the 3 machines above
 Concurrent write: 16,500,000 items in 3,680 s, rate ~
4,480 req/sec (measured at the client)
 Concurrent read: 16,500,000 items in 1,458 s, rate ~
11,320 req/sec (measured at the client)
* This is not the limit of the system; the limit is at the clients (this
test used 3 client threads)
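The quoted rates follow directly from the item counts and durations; a quick check (the helper name `rate` is just for illustration):

```python
def rate(items: int, seconds: int) -> float:
    """Average throughput in requests per second."""
    return items / seconds

ITEMS = 16_500_000
print(round(rate(ITEMS, 3680), 1))  # 4483.7 (write)
print(round(rate(ITEMS, 1458), 1))  # 11316.9 (read)
```

Rounded to the nearest ten, these give the 4,480 and 11,320 req/sec figures reported above.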
Q & A
Contact:
Duong Cong Loi
loiduongcong@gmail.com
https://www.facebook.com/duongcong.loi
