
More Efficient Object Replication in OpenStack Summit Juno


This slide is related to http://junodesignsummit.sched.org/event/7ae1af936b54b937a92db9c4344dfe66#.U3m1OPl_t8E


  1. Developing More Efficient Object Replication on OpenStack Swift
     2014/05/16 (OpenStack Juno Design Summit)
     Kota Tsuyuzaki, Developer (Swift ATC), Advanced Information Processing Technology SE Project, NTT Software Innovation Center
  2. Outline
     1. Global Distributed Cluster
     2. More Efficient Object Replication
     3. Benchmark Analysis
     Extra: ssync issue
     Etherpad: https://etherpad.openstack.org/p/juno_swift_object_replication
  3. 1. Global Distributed Cluster
     Demands:
     • World Wide Services
     • Capacity Optimization
     • Disaster Recovery
     Solution:
     • Global Distributed Cluster
  4. 1. Global Distributed Cluster: Network Issues
     • High Latency: tens of ms to ~100 ms
     • Narrow: 1-10 Gbps
     • Expensive: $15,000/Gbps/month
  5. 1. Global Distributed Cluster: Network Issues
     • High Latency: Excellent (handled by Regions and Affinity Controls; a configuration example follows this slide)
     (Diagram: Region1 / Region2, from the SwiftStack Blog, https://swiftstack.com/blog/)
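     For reference, the Regions / Affinity Controls mentioned above are driven by the proxy configuration. A minimal proxy-server.conf sketch follows; the option names are Swift's, but the regions and values are illustrative, not taken from the talk:

       [app:proxy-server]
       use = egg:swift#proxy
       sorting_method = affinity
       # prefer the local Region1 for reads, fall back to Region2
       read_affinity = r1=100, r2=200
       # write all copies into Region1 first; the replicator later
       # moves the extra copies to Region2
       write_affinity = r1
       write_affinity_node_count = 2 * replicas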
  6. 1. Global Distributed Cluster: Network Issues
     • Narrow / Expensive: Not Good Enough -> ??? -> ???
       • Large Amounts of Transfer
       • Replication Delay
  7. 2. More Efficient Object Replication
     Objective: reduce the amount of replication network transfer between regions (focus on the narrow network).
  8. 2. More Efficient Object Replication: Current Behavior
  9. 2. More Efficient Object Replication: Current Behavior
     Model: 2 Regions, 3 Replicas, with Write Affinity.
     (Diagram: a user PUTs an object over the Internet; with write affinity, the copies land on primary and handoff nodes in Region1, on the near side of the network between regions.)
  10. 2. More Efficient Object Replication: Current Behavior
      Model: 2 Regions, 3 Replicas, with Write Affinity.
      (Diagram: replication later pushes the copies from Region1's primary and handoff nodes to Region2.) Unfortunately, the same object crosses the network between regions twice or more.
  11. 2. More Efficient Object Replication: Proposed Approach
  12. 2. More Efficient Object Replication: Proposed Approach
      • Push to only one remote node, chosen based on affinity
      • Have that remote node sync the data to the other nodes in its region
      • Requires only small code changes in the object-replicator and object-server
      (Diagram: Region1 pushes a single copy over the network between regions; the receiving node in Region2 syncs to the others.)
  13. 2. More Efficient Object Replication: Additional code (pseudocode; a Python sketch follows this slide)
      [Object-Replicator]
        find local part suffixes
        for each: find other primary locations
          check remote
          if not in remote:
            if (remote region is local) or (remote region not in synced regions):
              push data
              create remote suffix with request to sync in remote region
              add remote region to synced regions
      [Object-Server (REPLICATE)]
        create local suffix
        if sync request in header:
          push data to requested remotes
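     A minimal Python sketch of the replicator-side bookkeeping in the pseudocode above. It is illustrative only: has_suffix and push_suffix are hypothetical callables standing in for the REPLICATE check and the ssync/rsync transfer, not real Swift APIs.

       def replicate_suffix(suffix, local_node, primary_nodes,
                            has_suffix, push_suffix):
           """Push `suffix` to at most one node per remote region.

           has_suffix(node, suffix) -> bool: does the node already have it?
           push_suffix(node, suffix, sync_to): send the data; `sync_to` lists
               the peers the receiving object-server should sync to itself.
           """
           synced_regions = set()
           for node in primary_nodes:
               if node['id'] == local_node['id'] or has_suffix(node, suffix):
                   continue
               same_region = node['region'] == local_node['region']
               if same_region or node['region'] not in synced_regions:
                   # For a remote region, hand the receiver the list of its
                   # in-region peers so it pushes the data onward itself.
                   peers = [] if same_region else [
                       n for n in primary_nodes
                       if n['region'] == node['region'] and n['id'] != node['id']]
                   push_suffix(node, suffix, sync_to=peers)
                   if not same_region:
                       synced_regions.add(node['region'])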
  14. 3. Performance Analysis
      Objective: analyze replication performance
      • Total transferred data amount
      • Average network bandwidth between regions
      • One pass time
  15. 3. Benchmark Scenario
      Model:
      • 2 Regions, 3 Replicas
      • 1 Gateway Node (GW) between Regions
      Scenario (commands sketched after this slide):
      • Shape the GW network to 1 Gbps
      • Stop the object-replicator
      • Load objects with Write Affinity: 8 MB x 5,000 objects (40 GB total)
      • Run the object-replicator in once mode (32 concurrency)
      Benchmark Patterns:
      • Original (ssync)
      • Proposed (ssync, rsync)
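     A rough sketch of how such a run can be driven. swift-init and tc are standard tools, but the interface name and the exact shaping parameters are assumptions, not taken from the talk.

       # On the GW node: shape the inter-region link to 1 Gbps
       # (eth0 is a placeholder for the WAN-facing interface).
       tc qdisc add dev eth0 root tbf rate 1gbit burst 32kbit latency 400ms

       # On the storage nodes: stop background replication, load the objects
       # through the write-affinity proxy, then run one replication pass with
       # [object-replicator] concurrency = 32 set in object-server.conf.
       swift-init object-replicator stop
       # ... PUT 5,000 x 8 MB objects ...
       swift-init object-replicator once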
  16. 3. Benchmark Environment
      Topology: a Client reaches the Proxy over Ethernet; Region 1 holds Storage1 and Storage2 on an Infiniband LAN switch, Region 2 holds Storage3 and Storage4 on an Infiniband LAN switch; the regions are linked through a GW node over 20 Gbps Infiniband, shaped to 1 Gbps.
      Storage nodes: 2 x Intel X5650 2.67 GHz (6 cores, HT), 48 GB RAM, 20 Gbps Infiniband NIC, 36 x 3 TB SATA disks (7,200 rpm).
      GW node: 2 x Intel X5650 2.67 GHz (6 cores, HT), 64 GB RAM, 2 x 20 Gbps Infiniband NICs (shaped to 1 Gbps).
  17. 3. Result (with 1 Gbps shaping)
      (Charts: One Replication Pass Time in seconds, Transferred Data on One Pass in GB, and Average Network Bandwidth in Gbps, each comparing Original, Proposed (ssync) and Proposed (rsync).)
      • Good reduction in the transferred data amount: the original moves roughly 40 GB x 3 replicas / 2 = 60 GB (about 1/3 of the objects have 2 copies in Region2), while the proposed approach stays near the 40 GB theoretical value of one copy per object.
      • Only a small decrease appeared in the average network bandwidth.
      • Good reduction in the one-pass time.
      -- ssync is more efficient than rsync here.
      -- The proposed algorithm adds a small overhead from waiting for node syncing.
      -- It can ensure that all primary nodes are synced, in a shorter time and with a smaller amount of data transfer.
  18. Conclusion
      1. Global Distributed Cluster: efficient replication is needed
      2. More Efficient Object Replication: an affinity-based approach; only push to one remote
      3. Benchmark Analysis: good reduction of data transfer; little overhead in the one-pass time
      Acknowledgment: SwiftStack members, Ken Igarachi, Yohei Hayashi, Takashi Shito, Hiromichi Ito, Naoto Nishizono
  19. Discussions
      • Is ensuring that all nodes are synced actually needed?
        • Request the sync at replication time (current approach):
          Pros: able to ensure all replicas are synced. Cons: a little overhead from waiting for the sync.
        • Do not request the sync; update the replicas asynchronously:
          Pros: simpler. Cons: unable to ensure all replicas are synced.
      • What is a good way to sync the other nodes from the Object-Server? (a rough sketch follows this slide)
        • Naïve, but very simple (current approach): reuse an object-replicator instance, which carries unnecessary, wasted information (e.g. the Ring).
        • Complex: create a dedicated syncing function or class for the object-server.
      • Are there more efficient ways?
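     A rough sketch of what the "complex" option could look like: a small, dedicated helper the object-server might use to fan a stored suffix out to the peers named in the sync request, shelling out to rsync much as the replicator does. The class, paths and rsync module layout are illustrative assumptions, not the actual Swift code.

       import os
       import subprocess


       class SuffixSyncer(object):
           """Push one suffix directory to a list of in-region peers."""

           def __init__(self, devices_root='/srv/node', rsync_timeout=30):
               self.devices_root = devices_root
               self.rsync_timeout = rsync_timeout

           def sync(self, device, partition, suffix, peers):
               """peers: dicts with at least 'ip' and 'device' keys."""
               src = os.path.join(self.devices_root, device, 'objects',
                                  str(partition), suffix)
               if not os.path.isdir(src):
                   return []
               failures = []
               for peer in peers:
                   dest = '%s::object/%s/objects/%s' % (
                       peer['ip'], peer['device'], partition)
                   cmd = ['rsync', '--recursive', '--whole-file', '--times',
                          '--timeout=%d' % self.rsync_timeout, src, dest]
                   if subprocess.call(cmd) != 0:
                       failures.append(peer)
               return failures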
  20. Contact: Kota Tsuyuzaki (IRC: Kota), tsuyuzaki.kota@lab.ntt.co.jp
  21. Extra: ssync issue
      Ssync:
      • A replication process improvement based on HTTP
      • A replacement for rsync (designed to be slimmer)
      • Sender / Receiver model
      Issue:
      • Parallel I/O performance, which might be caused by eventlet
        • Cannot access the local disk in parallel (maybe a constraint of the Python VM)
        • Slower than rsync in my experiment
      • Possible solution (sketched below):
        • Launch the sender as a subprocess so that another CPU core can be used for disk reads, similar to rsync.
        • When using os.fork(), performance improved to around the same level as rsync.
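     A minimal sketch of the os.fork() idea above. send_suffixes is a hypothetical callable standing in for the ssync sender's disk-read and send loop; the point is only that the blocking work moves into a child process, off the eventlet hub in the parent.

       import os


       def send_in_child(send_suffixes, *args):
           """Run send_suffixes(*args) in a forked child; return its exit code."""
           pid = os.fork()
           if pid == 0:
               # Child process: do the blocking disk reads and the network
               # send on its own CPU core, then exit without cleanup.
               try:
                   ok = send_suffixes(*args)
                   os._exit(0 if ok else 1)
               except Exception:
                   os._exit(1)
           # Parent: wait for the child. A real daemon would poll with
           # os.waitpid(pid, os.WNOHANG) so the eventlet hub keeps running.
           _, status = os.waitpid(pid, 0)
           return os.WEXITSTATUS(status)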
