Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Distributed storage system

1,020 views

Published on

Distributed storage system

Published in: Technology
  • Be the first to comment

Distributed storage system

  1. 1. DISTRIBUTED STORAGE SYSTEM Mr. Dương Công Lợi Company: VNG-Corp Tel: +84989510016 Email:loiduongcong@gmail.com
  2. 2. CONTENTS  1. What is distributed-computing system?  2. Principle of distributed database/storage system  3. Distributed storage system paradigm  4. UniversalDistributedStorage
  3. 3. 1. WHAT IS DISTRIBUTED-COMPUTING SYSTEM?  Distributed-Computing is the process of solving a computational problem using a distributed system.  A distributed system is a computing system in which a number of components on multiple computers cooperate by communicating over a network to achieve a common goal.
  4. 4. DISTRIBUTED DATABASE/STORAGE SYSTEM  A distributed database system, the database is stored on several computers .   A distributed database is a collection of multiple , Logic computer network .
  5. 5. DISTRIBUTED SYSTEM ADVANCE  Advance  Avoid bottleneck & single-point-of-failure  More Scalability  More Availability  Routing model  Client routing: client request to appropriate server to read/write data  Server routing: server forward request of client to appropriate server and send result to this client * can combine the two model above into a system
  6. 6. DISTRIBUTED STORAGE SYSTEM  Store some data {1,2,3,4,6,7,8} into 1 server  And store them into 3 distributed server 1,2,3,4, 6,7,8 1,2,3 4,6 7,8
  7. 7. 2. PRINCIPLE OF DISTRIBUTED DATABASE/STORAGE SYSTEM  Shard data key and store it to appropriate server use Distributed Hash Table (DHT)  DHT must be consistent hashing:  Uniform distribution of generation  Consistent  Jenkins, Murmur are the good choice; MD5, SHA slower
  8. 8. CANONICAL PROBLEMS IN DISTRIBUTED SYSTEMS  Distributed data independence  Distributed transactions: ACID (Atomicity, Consistency, Isolation, Durability) requirement  Fault tolerance  Transparency
  9. 9. 3. DISTRIBUTED STORAGE SYSTEM PARADIGM  Data Hashing/Addressing  Determine server for data store in  Data Replication  Store data into multi server node for more available, fault-tolerance
  10. 10. DISTRIBUTED STORAGE SYSTEM ARCHITECT  Data Hashing/Addressing  Use DHT to addressing server (use server-name) to a number, performing it on one circle called the keys space  Use DHT to addressing data and find server store it by successor(k)=ceiling(addressing(k))  successor(k): server store k 0 server3 server1 server2
  11. 11. DISTRIBUTED STORAGE SYSTEM ARCHITECT  Addressing – Virtual node  Each server node is generated to more node-id for evenly distributed, load balance Server1: n1, n4, n6 Server2: n2, n7 Server3: n3, n5 0 server3 server1 server2 n7 n1 n5 n2 n4 n6 n3 n6
  12. 12. DISTRIBUTED STORAGE SYSTEM ARCHITECT  Data Replication Data k1 store in server1 as master and store in server2 as slave 0 server3 server1 server2 k1
  13. 13. UNIVERSALDISTRIBUTEDSTORAGE a distributed storage system
  14. 14. 4. UNIVERSALDISTRIBUTEDSTORAGE  UniversalDistributedStorage is a distributed storage system develop for:  Distributed data independence  Distributed transactions (ACID)  Fault tolerance  Leader election (decision for join or leave server node)  Replicate with multiple master replication  Transparency
  15. 15. UNIVERSALDISTRIBUTEDSTORAGE ARCHITECTURE  Overview Bussiness Layer Distrib uted Layer Storage Layer Bussiness Layer Distrib uted Layer Storage Layer Bussiness Layer Distrib uted Layer Storage Layer
  16. 16. ARCHITECTURE OVERVIEW
  17. 17. UNIVERSALDISTRIBUTEDSTORAGE FEATURE  Data hashing/addressing  Use Murmur hashing function
  18. 18. UNIVERSALDISTRIBUTEDSTORAGE FEATURE  Leader election  Use Bully Leader Election algorithm
  19. 19. UNIVERSALDISTRIBUTEDSTORAGE FEATURE  Multi-master replication  Problem of multi-master replication
  20. 20. UNIVERSALDISTRIBUTEDSTORAGE FEATURE  Multi-master replication  Data store to main master (called sub-leader), then this data post to queue to sync to other master.
  21. 21. UNIVERSALDISTRIBUTEDSTORAGE STATISTIC  System information:  3 machine 8GB Ram, core i5 3,220GHz  LAN/WAN network  7 physical servers on 3 above mechine  Concurrence write 16500000 items in 3680s, rate~ 4480req/sec (at client computing)  Concurrence read 16500000 items in 1458s, rate~ 11320req/sec (at client computing) * It doesn’t limit of this system, it limit at clients (this test using 3 client thread)
  22. 22. Q & A Contact: Duong Cong Loi loiduongcong@gmail.com https://www.facebook.com/duongcong.loi

×