High Availability Cloud Storage as a Software Package with Social Graph, Throughput Awareness, and Smart Distribution Features


Cloud platforms today increasingly rely on vertical integration to save cost and effort. This leads to well-known incidents such as June 2013, when a power outage in one Amazon data center caused a prolonged outage across entire ecosystems such as Heroku. Businesses are extremely sensitive to prolonged outages in storage services. This paper proposes a client-side solution that addresses the problem by distributing storage among multiple service providers, referred to as substores; the abstraction accommodates any storage technology, such as over-the-network APIs or local HDD and SSD disks. The tool also implements a social metadata layer and throughput awareness, which together allow a new formulation of smart distribution.

Transcript

  • 1. High Availability
  • 2. Mission Statement
    1. high availability business-level cloud data store
    2. federated clouds = diversification
    3. many DCs and/or cloud providers
    4. we care mostly about performance = high availability
    5. practical solutions are needed
    Marat Zhanikeev -- maratishe@gmail.com High Availability Cloud Storage: Social, Throughput, Smart -- http://tinyurl.com/marat140417
  • 3. haStore: The Short Story
  • 4. haStore: One DC is Not Enough
    • remember June 2013?
    • most services today use vertical integration -- no diversity
    • Hitachi does not share DCs with NEC
    • regional diversity of one provider is bad
      ◦ how many Amazon DCs in Japan?
    (The only possible) solution is to sign contracts with multiple DCs and manage them on the client side
  • 5. haStore: One DC is Not Enough
    [Figure: locations (Kansai, Kyushu, Okinawa), data centers (DC1, DC2), and offices (Osaka, Naha) joined by a storage network with varying network distance; the proposed software (content / social metadata, high availability data store, store APIs) connects Employee A, including on a business trip, to the DCs]
  • 6. haStore: Store Diversification
    • store = sum of multiple substores
    • in software: not a priority list -- optimization engine!
    • realtime performance monitoring, read/write optimization, etc.
    • sub-file data unit -- chunks
    [Figure: substores at growing network distance from the user: SSD, HDD, then DC1, DC2, … over the network]
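
The contrast between a fixed priority list and measurement-driven ordering can be illustrated with a minimal sketch. It is not the haStore optimization engine: the `Substore` class, its fields, the 20-sample window, and the mean-based estimate are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Substore:
    """Hypothetical substore record; names and fields are illustrative only."""
    name: str
    # Recent throughput samples in bytes per second (newest last).
    samples: deque = field(default_factory=lambda: deque(maxlen=20))

    def record(self, nbytes: int, seconds: float) -> None:
        """Store one throughput observation from a finished transfer."""
        if seconds > 0:
            self.samples.append(nbytes / seconds)

    def estimate(self) -> float:
        """Mean of recent samples; 0.0 when nothing has been measured yet."""
        return sum(self.samples) / len(self.samples) if self.samples else 0.0


def rank_substores(substores):
    """Order substores by estimated throughput, fastest first."""
    return sorted(substores, key=lambda s: s.estimate(), reverse=True)


ssd, dc1, dc2 = Substore("ssd"), Substore("dc1"), Substore("dc2")
ssd.record(100_000, 0.001)  # local disk
dc1.record(100_000, 0.5)    # nominally "closer" data center, currently slow
dc2.record(100_000, 0.2)    # farther data center, currently faster
ranked = [s.name for s in rank_substores([dc1, ssd, dc2])]
```

With these made-up measurements, `dc2` ranks ahead of `dc1`, which a static priority list would not capture.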
  • 7. haStore: Socially Aware Store
    • content relevance based on social graph
    • relevance is a distribution
    • individual redundancy based on distribution
    • other link types: same time, location, filetype, ...
    • link strength != 1
    [Figure: relevance distribution over content in descending order, up to end of content; redundancy (user setting) and physical limit of redundancy]
    [Table: columns "There is a link", "When a file is …", "Between"; rows Created, Viewed, Edited, Deleted]
  • 8. haStore: Software Design
  • 9. Design: Specs
    • many substores, heterogeneous e2e performance and capacity
    • each substore has its own API (Dropbox, GDrive, SSD, etc.), but haStore exports a generic API
    • data unit: sub-file blobs, for now fixed 100 KB size
    • social graph is used to define priority lists of files
      ◦ different for each user
    • optimization is the key element of the software engines
      1. sync logic
      2. redundancy logic
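
The generic API and the fixed-size blobs can be sketched as follows. This is only an illustration under stated assumptions: `SubstoreAPI`, `MemorySubstore`, `write_file`, and `read_file` are hypothetical names, the blob size is read as 100,000 bytes, and round-robin placement stands in for the actual optimization logic, which the slides do not specify.

```python
from abc import ABC, abstractmethod

CHUNK_SIZE = 100 * 1000  # assumption: the fixed blob size read as 100,000 bytes


class SubstoreAPI(ABC):
    """Hypothetical generic interface that each substore driver implements."""

    @abstractmethod
    def put(self, key: str, blob: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class MemorySubstore(SubstoreAPI):
    """In-memory stand-in for a real driver (Dropbox, GDrive, SSD, ...)."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, blob):
        self._blobs[key] = blob

    def get(self, key):
        return self._blobs[key]


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE):
    """Cut data into fixed-size blobs; the last one may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def write_file(name, data, substores):
    """Place chunks round-robin (placeholder policy); return the placement."""
    placement = []
    for i, chunk in enumerate(split_into_chunks(data)):
        key = f"{name}#{i}"
        idx = i % len(substores)
        substores[idx].put(key, chunk)
        placement.append((key, idx))
    return placement


def read_file(placement, substores):
    """Reassemble a file from its chunk placement."""
    return b"".join(substores[idx].get(key) for key, idx in placement)


stores = [MemorySubstore(), MemorySubstore()]
payload = bytes(250_000)
layout = write_file("report.pdf", payload, stores)
restored = read_file(layout, stores)
```

Because callers only see `put` and `get`, a driver for any storage technology can be swapped in without changing the chunking code.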
  • 10. Design: API Stack
    • Generic API starts from Level 2, similar to drivers
    • the stack is implemented by each client = each user
    [Figure: API stack; labels: Employee A, Content / Social Metadata, High Availability Data Store, DC1, DC2, …, Store, Proposed Software]
  • 11. Design: Sync Engine
    • optimization for throughput minimization
    • same logic for SSD, HDD and over-the-network
    [Figure: haStore sync engine; labels: Storage, Sync Engine, Optimization, Local Cache, Check, Use, steps 1 and 2, GUI, Clients]
  • 12. Design: Sync Engine Logic
    [Figure; surviving labels: Bulk Throughput History Data, Increase timeout, Performance Tradeoff]
  • 13. Design: Redundancy Logic (1)
    [Figure: relevance distribution over content in descending order, up to end of content; redundancy (user setting) and physical limit of redundancy]
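
One way to read the figure is that each file gets a copy count that follows its relevance, bounded by the user setting and by the physical limit. The sketch below encodes that reading. It is an assumption, not the slide's actual rule: the linear scaling, the minimum of one copy, the use of the substore count as the physical limit, and the function name `redundancy_plan` are all hypothetical.

```python
import math


def redundancy_plan(relevance, user_max, n_substores):
    """Map per-file relevance scores to copy counts.

    relevance   -- {file name: non-negative score} (hypothetical scores)
    user_max    -- redundancy requested by the user
    n_substores -- taken here as the physical limit: one copy per substore
    """
    if not relevance:
        return {}
    limit = max(1, min(user_max, n_substores))
    top = max(relevance.values())
    plan = {}
    # Walk files in descending order of relevance, as in the figure.
    for name, score in sorted(relevance.items(), key=lambda kv: kv[1], reverse=True):
        share = score / top if top > 0 else 0.0
        plan[name] = max(1, math.ceil(share * limit))  # never fewer than one copy
    return plan


scores = {"notes.txt": 3, "budget.xlsx": 8, "old_logo.png": 1}
plan = redundancy_plan(scores, user_max=5, n_substores=3)
```

Here the user asks for up to 5 copies, but only 3 substores exist, so the most relevant file is capped at 3 copies and the least relevant keeps a single copy.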
  • 14. Design: Redundancy Logic (2)
  • 15. haStore: Social Graph
  • 16. Social Graph: Basics
    • current version: only simple types of links
    • no link strength
    [Table: columns "There is a link", "When a file is …", "Between"; rows Created, Viewed, Edited, Deleted]
  • 17. Social Graph: Advanced
    • community detection
    • files that could be linked:
      1. touched at roughly the same time
      2. touched by the same user
      3. same location, filetype, size, etc.
    • link strength, different for each kind of relation, variable e2e cost on paths
    • discovery based on e2e cost, not hop count
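
Discovery by e2e cost rather than hop count can be sketched with a cost-bounded shortest-path search. The slides do not name an algorithm, so Dijkstra's algorithm is used here as a stand-in. The graph, the file names, the `discover` function, and the idea that a stronger link means a lower cost are assumptions for illustration.

```python
import heapq


def discover(graph, start, max_cost):
    """Return {file: cheapest path cost} for files within max_cost of start.

    graph -- {file: {neighbor: link cost}}; lower cost is assumed to mean
             a stronger link (hypothetical convention).
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry, a cheaper path was already found
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost <= max_cost and new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    del best[start]
    return best


graph = {
    "a.doc": {"b.doc": 1.0, "c.doc": 5.0},
    "b.doc": {"a.doc": 1.0, "c.doc": 1.0, "d.doc": 1.0},
    "c.doc": {"a.doc": 5.0, "b.doc": 1.0},
    "d.doc": {"b.doc": 1.0, "e.doc": 4.0},
    "e.doc": {"d.doc": 4.0},
}
related = discover(graph, "a.doc", max_cost=3.0)
```

In this example `c.doc` is a direct neighbor of `a.doc`, but its weak direct link (cost 5.0) exceeds the budget; it is still discovered through `b.doc` at cost 2.0, which a hop-count rule would rank differently.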
  • 18. Implementation, Tests
  • 19. Performance: Demo
    [Figure: demo screenshot; labels: A-san, B-san, DBX, GDR; timestamp 2014-01-22 12:13:30; legend: Block DONE, Block UPLOAD, Block DOWNLOAD]
    • also demo
  • 20. Wrapup
    • haStore: high availability cloud store
    • main features
      ◦ throughput-aware sync/redundancy optimization
      ◦ sub-file blocks, smart distribution
      ◦ social graph
    • current status: v1.0 in operation, v2.0 on the way
  • 21. That’s all, thank you ...