
Colony for-openstack-grizzly-summit


The presentation slides at OpenStack Grizzly summit on Oct.15th, 2012 in San Diego


Transcript of "Colony for-openstack-grizzly-summit"

  1. Inter-cloud object storage: Colony. 15 Oct 2012, NTT DATA INTELLILINK, Motonobu Ichimura. Copyright © 2012 NTT DATA Corporation
  2. EtherPad: http://etherpad.openstack.org/grizzly-colony (Copyright © 2012 NTT DATA INTELLILINK Corporation)
  3. Agenda
     • What is Colony?
       – Our goal
       – Use case
     • How to make Swift network (or region) aware
       – Problems with the original Swift code
       – Our modification
       – Investigation
       – Conclusion
     • Future plan
       – Problems to tackle (and being tackled)
       – Collaboration
  4. What is Colony?
  5. Goal: academic community cloud
     [Diagram: an Academic Community Cloud made up of an Education Cloud, a Research Cloud, and university clouds (Univ.-A, Univ.-B, ..., Univ.-X), with intercloud services on top of the Science Information Network.]
  6. Intercloud object storage service
     Colony federates cloud object storage services, such as Swift, to achieve an intercloud object storage service.
     [Diagram: several clouds running Nova, Glance, and a Swift for local use, each also attached to a Swift for intercloud use.]
  7. Users' points of view
     [Diagram: cloud services as seen by users. Cloud-A has Swift-A with containers A1 to A3 and objects A1-1 to A1-3; Cloud-B has Swift-B with containers B1 to B3 and objects B1-1 to B1-3. Each cloud also sees inter-cloud containers (such as I1, I4, I8, I10, I13) and their objects (I1-1 to I1-3, I4-1 to I4-3). These are held in the geographically distributed Swift-I (inter-cloud containers I1 to I4). The whole is labeled "Inter-cloud object storage service: colony".]
  8. Colony achieves the federation
     • Users authenticate with a Shibboleth IdP.
     • Colony provides seamless access to multiple Swifts.
     [Diagram labels: Cloud-A user; Colony host on Ubuntu with Apache (mod_wsgi, mod_shib), Colony-horizon, Colony-keystone, Colony-dispatcher, Squid, and slapd; Swift-I and Swift-A, each with its own Colony-Keystone and slapd.]
  9. Use case
     We plan to use Colony as:
     • Object storage for cloud-to-cloud migration
     • Object storage to deliver VM images around Japan
     • Object storage to store big data
  10. Software components developed in Colony
     • Colony-Horizon: based on diablo/stable Horizon with some enhancements
       – Multi-region support: users can choose which Swift is used to store and retrieve objects
       – Swift container ACL and metadata support
       – Swift object metadata support
       – >5G segment upload support …
     • Colony-Keystone: based on diablo/stable Keystone with some enhancements
       – Authentication with Shibboleth
       – %{tenant_name} can be used in endpointTemplates, in addition to %{tenant_id}, to federate cloud services
     • Colony-Dispatcher (new)
       – Relays requests to multiple object services (and merges the responses for clients)
       – Relays requests to a specific object service indicated by the URI
       – Chooses the "nearest" swift-proxy server to relay requests to
       – Copies objects among different Swifts
     • Utilities (new)
       – Tools to simplify admin tasks for federating object storage services
  11. Colony-horizon
     Users can choose which Swift to use (Swift-I or Swift-A).
  12. Colony-keystone
     Modifications to Keystone:
     • Add an ePPN field to the Keystone schema
     • Add REST API services to create a token by ePPN (/token_by/eppn) and by email address (/token_by/email)
     • Add a REST API service to register or update an ePPN (/users/{user_id}/eppn)
     [Diagram: Shibboleth IdP, Shibboleth SP, Colony-Horizon, and Colony-Keystone, with these numbered steps.]
     0-1. User registration by mail_addr
     0-2. Associate the ePPN with mail_addr on initial access
     1. ID/passwd
     2. Attributes: ePPN, mail_addr
     3. Attribute: ePPN
     4. auth_token
  13. Colony-dispatcher
     1. A Swift client can send requests to Swift-A and Swift-I through the Swift Dispatcher.
     2. The Swift Dispatcher merges the responses from each Swift and sends the result to the Swift client.
     • Requests modified for merging responses: account info, container list, X-Copy-from/to
     • A response merged by the Colony Dispatcher has a prefix to indicate which Swift is used for storage (A:container1, A:container2, I:container1, I:container2).
     [Diagram: Swift client, Colony Dispatcher, Swift proxies, Swift-A (local), Swift-I (intercloud).]
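The merge-and-prefix behaviour described on this slide can be sketched as follows. This is only an illustration, not the Colony-Dispatcher code: the function names, the dictionary input, and the exact `prefix:name` format (suggested by the `A:container1` and `I:container1` labels in the diagram) are assumptions.

```python
def merge_container_listings(listings):
    """Merge per-Swift container lists into one prefixed listing.

    `listings` maps a prefix such as "A" or "I" (hypothetical labels for
    Swift-A and Swift-I) to the container names returned by that Swift.
    """
    merged = []
    for prefix in sorted(listings):
        for name in listings[prefix]:
            merged.append("%s:%s" % (prefix, name))
    return merged


def split_prefixed_name(prefixed):
    """Return (prefix, container) so a request can be routed to one Swift."""
    prefix, _, container = prefixed.partition(":")
    if not container:
        raise ValueError("missing prefix or container in %r" % prefixed)
    return prefix, container


listing = merge_container_listings({
    "I": ["container1", "container2"],
    "A": ["container1", "container2"],
})
```

A client that later names `I:container2` can be routed by splitting the prefix off again with `split_prefixed_name`.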
  14. Caching
     The Swift Dispatcher can use a cache proxy (such as Squid) per Swift proxy to retrieve objects from remote Swifts.
     [Diagram: Swift client, Colony Dispatcher with a cache (proxy) in front of the Swift proxies, Swift-A (local), Swift-I (intercloud); containers A:container1, A:container2, I:container1, I:container2.]
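The idea of a cache in front of a remote Swift can be sketched as a read-through cache. This is a toy model under stated assumptions, not how Squid or Colony-Dispatcher work internally: the class name, the `fetch` callback, and the unbounded in-memory dictionary are all hypothetical, and a real HTTP cache would also honour cache headers, expiry, and eviction.

```python
class ReadThroughCache(object):
    """Serve repeated GETs locally; go to the remote Swift only on a miss."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable(path) -> object body (assumed)
        self._store = {}         # path -> cached body
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self._store:
            self.hits += 1
            return self._store[path]
        self.misses += 1
        body = self._fetch(path)  # slow, remote retrieval
        self._store[path] = body
        return body


remote_calls = []

def fake_remote_get(path):
    """Stand-in for a GET against a remote swift-proxy."""
    remote_calls.append(path)
    return b"body of " + path.encode("utf-8")

cache = ReadThroughCache(fake_remote_get)
first = cache.get("/I:container1/object1")
second = cache.get("/I:container1/object1")
```

Only the first request reaches the remote side; the second is answered from the local store.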
  15. How to make Swift network aware
  16. Current implementation
  17. Problems in the original Swift code
     • PUT/GET performance
       – The Swift proxy waits until all objects have been put to the storage servers.
       – The Swift proxy randomly chooses the node from which to retrieve an object.
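Why these two behaviours hurt in a multi-site deployment can be shown with a small latency model. It follows the slide's description only (the proxy waits for every replica on PUT and picks a replica uniformly at random on GET); it does not model Swift's actual quorum or retry logic, and the latency figures are made-up illustration values.

```python
def put_latency(replica_latencies):
    """If the proxy waits for every replica, PUT takes as long as the slowest one."""
    return max(replica_latencies)


def expected_get_latency(replica_latencies):
    """With a uniformly random replica choice, the expected latency is the mean."""
    return sum(replica_latencies) / float(len(replica_latencies))


# Hypothetical per-replica latencies in milliseconds: two local, one remote.
latencies_ms = [1.0, 1.0, 20.0]
slowest = put_latency(latencies_ms)
average = expected_get_latency(latencies_ms)
```

In this model a single remote replica sets the PUT time, and it also raises the expected GET time well above the local latency.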
  18. Test environments
     [Diagram: two sites, Sapporo and Tokyo, with the following labels.]
     • CPU: Intel(R) Xeon(R) CPU E7-8870 (40 cores); Mem: 126GB; NIC: 1000baseT/Full x2
     • CPU: AMD Opteron 6128 2000MHz (16 cores); Mem: 32GB; NIC: 10000baseT/Full x2
     • Link labels: 900MBps (0.1msec), 9900MBps
  19. PUT operation
     An object PUT operation is always affected by the worst case.
     [Diagram: three storage servers in Sapporo; three storage servers, the proxy, and the client in Tokyo.]
  20. Objects location
  21. PUT objects throughput @Tokyo (Bytes/sec)
  22. GET operation
     [Diagram: three storage servers in Sapporo; three storage servers, the proxy, and the client in Tokyo. Labels: "High-bandwidth, low-latency" (twice) and "1/replications".]
  23. Objects location
  24. GET objects throughput @Tokyo (Bytes/sec)
     Performance is degraded by the network between Sapporo and Tokyo.
  25. Our modification
  26. How to solve: basic idea
     • Limitations
       – Don't modify data structures (including the ring)
       – Minimize customization
     • Add some rules to the ring's data structure
       – Zone information is treated as a decimal number, so the difference between zoneA and zoneB represents the distance between zoneA and zoneB
     • Add some zone hints to the Swift proxy servers
     • Change the order of nodes for the proxy server
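The "zone number as distance" rule can be written as a one-line predicate. The sketch below mirrors the comparison used by the `isnearby` helper shown later in the talk (`one <= other and one + distance > other`); the function name `is_nearby` and the sample zone numbers are chosen here for illustration.

```python
def is_nearby(own_zone, node_zone, near_distance):
    """True when node_zone lies in [own_zone, own_zone + near_distance)."""
    return own_zone <= node_zone < own_zone + near_distance


# A Tokyo proxy with own_zone=100 and near_distance=10:
tokyo_sees_tokyo = is_nearby(100, 102, 10)      # storage in zone 102
tokyo_sees_sapporo = is_nearby(100, 200, 10)    # storage in zone 200
upper_bound_included = is_nearby(100, 110, 10)  # zone exactly own_zone + distance
```

Note that with this comparison the upper bound is exclusive, so zone 110 is not counted as near a proxy configured with zone 100 and distance 10.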
  27. How to solve
     Proxy configuration:
       [app:proxy-server]
       nearby_mode = false
       own_zone = 100
       near_distance = 10
     • Sapporo (zones 200-202): the proxy, which has zone info (200) and zone distance (10), considers storage servers between zones 200-210 to be located near the proxy.
     • Tokyo (zones 100-102): the proxy, which has zone info (100) and zone distance (10), considers storage servers between zones 100-110 to be located near the proxy.
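As a rough illustration of how the three options on this slide could be read into typed values, here is a stand-alone sketch using Python's standard `configparser`. The loader name `load_nearby_settings` is hypothetical, and Swift itself loads its proxy configuration through its own machinery rather than this code.

```python
import configparser

SAMPLE_CONF = """
[app:proxy-server]
nearby_mode = false
own_zone = 100
near_distance = 10
"""


def load_nearby_settings(text):
    """Parse the three zone-hint options into Python values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["app:proxy-server"]
    return {
        # Treat a missing nearby_mode as disabled (an assumption of this sketch).
        "nearby_mode": section.getboolean("nearby_mode", fallback=False),
        "own_zone": section.getint("own_zone"),
        "near_distance": section.getint("near_distance"),
    }


settings = load_nearby_settings(SAMPLE_CONF)
```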
  28. PUT operation
     The proxy initially puts objects to the nearest storage servers, using zone information and zone distance. The object replicator then replicates them to the proper position asynchronously.
     [Diagram: Sapporo storage servers D, F, G; Tokyo storage servers A, B, C; proxy with zone_info: 100 and zone_distance: 10; client.]
  29. PUT operation
     This is the same situation as when all storage servers located in Sapporo are broken.
     [Diagram: Sapporo storage servers D, E, F marked ×; Tokyo storage servers A, B, C; proxy; client. Label: "Hinted hand off".]
  30. GET operation
     1. First, try to retrieve the object from a storage server near the proxy.
     2. After that, try to retrieve the object from a storage server indicated as a primary zone.
     [Diagram: three storage servers in Sapporo; three storage servers, the proxy, and the client in Tokyo.]
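The two-step GET order above amounts to sorting the candidate nodes so that nearby ones come first. The sketch below assumes node dictionaries with a `zone` key, as in the ring code shown in the talk; the function name, sample IP addresses, and zone values are illustrative only.

```python
def order_nodes_for_get(primary_nodes, own_zone, near_distance):
    """Return the primary nodes with those near the proxy first.

    Relative order inside the near group and the far group is preserved.
    """
    def is_nearby(zone):
        return own_zone <= zone < own_zone + near_distance

    near = [node for node in primary_nodes if is_nearby(node["zone"])]
    far = [node for node in primary_nodes if not is_nearby(node["zone"])]
    return near + far


# Hypothetical primaries: two in Sapporo (zones 200, 201), one in Tokyo (zone 100).
primaries = [
    {"zone": 200, "ip": "10.0.2.1"},
    {"zone": 100, "ip": "10.0.1.1"},
    {"zone": 201, "ip": "10.0.2.2"},
]
ordered = order_nodes_for_get(primaries, own_zone=100, near_distance=10)
```

A Tokyo proxy would then try the zone-100 node first and fall back to the Sapporo primaries only if that attempt fails.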
  31. DELETE operation
     1. First, try to delete the object from a storage server near the proxy.
     2. After that, try to delete the object from a storage server indicated as a primary zone.
     [Diagram: three storage servers in Sapporo; three storage servers, the proxy, and the client in Tokyo.]
  32. Code
     Adding get_near_nodes() to ring.py:

         def get_near_nodes(self, account, container, obj, own_zone, near_distance):
             """
             Get the partition and nodes, same as get_nodes, but keep only nearby nodes.

             :param account: account name
             :param container: container name
             :param obj: object name
             :param own_zone: lowest zone number treated as near this proxy
             :param near_distance: zones from own_zone up to (but not including)
                                   own_zone + near_distance are treated as near
             :returns: a tuple of (partition, list of node dicts)
             """
             part, nodes = self.get_nodes(account, container, obj)

             def isnearby(one, other, distance):
                 if one <= other and one + distance > other:
                     return True
                 return False

             near_nodes = []
             for node in nodes:
                 if isnearby(own_zone, node['zone'], near_distance):
                     near_nodes.append(node)
             # Top up with nearby handoff nodes.
             if len(near_nodes) <= self.replica_count:
                 for node in self.get_more_nodes(part):
                     if isnearby(own_zone, node['zone'], near_distance):
                         near_nodes.append(node)
                     if len(near_nodes) >= self.replica_count:
                         break
             return part, near_nodes

     and then modify proxy/server.py to use get_near_nodes() for each method:

         @@ -1044,6 +1056,14 @@ def POST(self, req):
                  container_partition, containers, _junk, req.acl, _junk = \
                      self.container_info(self.account_name, self.container_name,
                          account_autocreate=self.app.account_autocreate)
         +        if self.app.nearby_mode:
         +            partition, near_nodes = self.app.object_ring.get_near_nodes(
         +                self.account_name, self.container_name, self.object_name,
         +                self.app.own_zone, self.app.near_distance)
         +            print 'before nodes: %s' % containers
         +            containers = near_nodes + \
         +                [cont for cont in containers
         +                 if cont['zone'] not in [c['zone'] for c in near_nodes]]
         +            print 'after nodes: %s' % containers
                  if 'swift.authorize' in req.environ:
                      aresp = req.environ['swift.authorize'](req)
                      if aresp:
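To see how the nearby selection and the handoff top-up interact, the logic can be exercised against a tiny stand-in ring. `FakeRing` below is hypothetical and only imitates the `get_nodes` / `get_more_nodes` calls used on the slide; it is not `swift.common.ring.Ring`. The sketch also uses a strict `<` before topping up, so it never returns more than `replica_count` nodes, whereas the slide's version compares with `<=`.

```python
class FakeRing(object):
    """Minimal stand-in for a Swift ring, for illustration only."""

    def __init__(self, primaries, handoffs, replica_count=3):
        self._primaries = primaries
        self._handoffs = handoffs
        self.replica_count = replica_count

    def get_nodes(self, account, container=None, obj=None):
        return 0, list(self._primaries)  # partition is fixed to 0 here

    def get_more_nodes(self, part):
        return iter(self._handoffs)

    def get_near_nodes(self, account, container, obj, own_zone, near_distance):
        part, nodes = self.get_nodes(account, container, obj)

        def is_nearby(zone):
            return own_zone <= zone < own_zone + near_distance

        # Keep nearby primaries, then top up with nearby handoff nodes.
        near_nodes = [node for node in nodes if is_nearby(node["zone"])]
        if len(near_nodes) < self.replica_count:
            for node in self.get_more_nodes(part):
                if is_nearby(node["zone"]):
                    near_nodes.append(node)
                if len(near_nodes) >= self.replica_count:
                    break
        return part, near_nodes


# Hypothetical layout: one primary in Tokyo (zone 100), two in Sapporo.
ring = FakeRing(
    primaries=[{"zone": 100}, {"zone": 200}, {"zone": 201}],
    handoffs=[{"zone": 202}, {"zone": 101}, {"zone": 102}, {"zone": 103}],
)
part, near = ring.get_near_nodes("acct", "cont", "obj", own_zone=100, near_distance=10)
near_zones = [node["zone"] for node in near]
```

With this layout a Tokyo proxy ends up with one primary and two nearby handoff nodes, which matches the "hinted hand off" picture in the PUT slides.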
  33. Investigation
     [Charts: "PUT Average (bytes/sec) @Sapporo" and "PUT Average (bytes/sec) @Tokyo", each comparing Original and Patched for object sizes 1K, 1M, 10M, 100M, and 1G.]
  34. Using cache
     What about the case where all objects are located in remote areas?
     [Diagram: storage servers in Sapporo; storage servers and a proxy in Tokyo; a proxy and the client in Kyushu.]
  35. Colony-Dispatcher as a cache
     Colony-Dispatcher can act as a swift-proxy-proxy with a cache mechanism.
  36. Investigation: cache effectiveness
     Using Colony-Dispatcher as a cache, the performance of retrieving objects from a remote area could be good.
     [Charts: "GET average (bytes/sec) @Sapporo" and "GET average (bytes/sec) @Tokyo" for object sizes 1K, 1M, 10M, 100M, and 1G; the series labels were lost in extraction.]
  37. Conclusion
     • Re-ordering the nodes by region for the proxy resolves the GET/PUT performance issues
       – This feature can be implemented with minimal customization (<50 lines of code).
     • Using a cache is a good idea for inter-cloud use
  38. Our future plan
  39. Problems to tackle
     • Object location
     • Adding region concepts to the ring structure might help.
       – Primary nodes isolated by region
     • Replication performance: key factor
     • We aggressively used the hinted hand-off mechanism to
       – Using UDT instead of TCP for replication
       – Using pyinotify for I/O-event-driven replication
       – Separating the network for replication
       – Hop-by-hop replication
  40. Are you interested in Colony?
     • Please contact me if you are interested in the Colony project.
       – We want to collaborate with people who want to use or develop Swift as an inter-cloud object store.
  41. Are you interested in academic clouds?
     • If you are interested in how to integrate clouds using Dodai and Colony:
       – My colleague (Guan-san) will give a presentation about Dodai (Cluster as a Service) at 17:20 @Manchester A
       – Yokoyama-san (a member of NII) might talk about how to integrate both Colony and Dodai in an LT
  42. Thank you.
  43. Q&A
     • Please phrase your question using simple grammar if possible.