Swift in the Small
OpenStack Meetup, June 29, 2011
Computer History Museum, Mountain View, CA
Joe Arnold - Cloudscaling
twitter: @joearnold
blog: http://joearnold.com

- The theme of tonight is Corporate IT.
- The promise of OpenStack for Corporate IT is the ability to take advantage of
-- all the great tooling,
-- all the great services,
-- all the compatible applications that use infrastructure cloud services as a platform.
- It gives the ability to deploy cloud infrastructure in-house.
- Tonight I'll be covering OpenStack Object Storage, Swift -- In the Small.
- Raise of hands: how many have downloaded and installed Swift?
- Swift is an Object Storage system that was designed for scale.
- This was one of the first clusters we deployed.
- It's a petabyte of usable storage. It can serve a lot of users.
- For the spinning disks of aluminum, bent sheet metal, forged iron for the racks, strands of glass, silicon wafers, etc., a deployment like this is a great deal at between $500,000 and a million dollars.
- But not everyone needs a petabyte out of the gate.
- Even for these deployments, we have staging clusters in the range of 80-100 TB.
- The challenge for this 'Corporate IT' theme is what a small-scale Object Storage (Swift) cluster would look like.
- What does it take, and what compromises are made, when scaling down something designed for large scale?
- This, for example, is a 4U, 36-drive system from ComputerLINK. ComputerLINK was nice enough to provide a demo unit for the meetup tonight.
- I'll be powering it up in a few minutes and, if you're interested, you can come over and we can start pulling drives and watch data get replicated around.
[Slide: diagram of five zones]

Why is this a challenge? Zones.
- Swift is designed for large-scale deployments.
- The mechanisms for replication and data distribution are built on the concept that data is distributed across isolated failure boundaries. These isolated failure boundaries are called zones.
- Unlike RAID systems, data isn't chopped up and distributed throughout the system.
- With Swift, whole files are distributed throughout the system. Each copy of the data resides in a different zone.
- Swift stores 3 copies of the data, so at least 4 zones are required (in case 1 zone fails).
- Preferably 5 zones (so that 2 zones can fail).
- In the big clusters, failure boundaries can be separate racks with their own networking components.
- In medium deployments, a physical node can represent a zone.
- For smaller deployments with fewer than 4 nodes, drives need to be grouped together to form pseudo-failure boundaries.
- A grouping of drives is simply declared a zone.
- Here is a scheme for starting small and growing the cluster bit-by-bit (well... terabyte-by-terabyte); see the ring-builder sketch after this list.
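- To make the zone idea concrete, here is a minimal ring-builder sketch using the standard swift-ring-builder CLI. The IP address, port, device name, and weight are hypothetical placeholders, not values from the demo cluster.

    # Build a 3-replica object ring: 2^18 partitions, 3 copies of each
    # partition, and at least 1 hour between moves of a given partition.
    swift-ring-builder object.builder create 18 3 1

    # Zone membership is declared per device as it is added; the ring
    # builder then places each of the 3 replicas in a different zone.
    # General form: z<zone>-<ip>:<port>/<device> <weight>
    swift-ring-builder object.builder add z1-10.0.0.1:6000/sdb1 100

    # After devices are added, distribute partitions across the zones.
    swift-ring-builder object.builder rebalance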
4 Disks 4 Zones
- For a single storage node, the minimum configuration would have 4 drives for data + 1 boot drive.
- Each disk is a zone.
- If a single drive fails, its data will be replicated to the remaining 3 drives in the system.
- The system would grow, 4 disks at a time (one in each zone), until the chassis was full; a sketch of this layout follows.
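- A sketch of the single-node, one-zone-per-disk layout described above (the 10.0.0.1 address and the sdb1-sde1 device names are illustrative):

    # One storage node, four data drives, each drive its own zone:
    swift-ring-builder object.builder add z1-10.0.0.1:6000/sdb1 100
    swift-ring-builder object.builder add z2-10.0.0.1:6000/sdc1 100
    swift-ring-builder object.builder add z3-10.0.0.1:6000/sdd1 100
    swift-ring-builder object.builder add z4-10.0.0.1:6000/sde1 100
    swift-ring-builder object.builder rebalance

    # Growing 4 disks at a time means adding one new drive to each zone
    # (sdf1 to z1, sdg1 to z2, and so on), then rebalancing again.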
Zone 1 Zone 2 Zone 3 Zone 4
- The strategy here is to split the zones evenly across the two nodes (see the sketch after this list).
- The addition of a second node does increase availability (assuming that load balancing is configured),
- but it does not create a master-slave configuration. If one of the nodes is down, half of your zones are unavailable.
- The good news is that if one of the nodes is down (half of your zones), data is still accessible.
- This is because at least one of the zones holding a copy will still be up on the remaining node.
- The bad news is that there is still a 1 in 2 chance that writes will fail,
- because at least two of three zones need to be written to for the write to be considered successful.
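- A sketch of splitting the four zones evenly across two nodes (addresses and device names are again made up for illustration):

    # Node 1 (10.0.0.1) holds zones 1 and 2;
    # node 2 (10.0.0.2) holds zones 3 and 4.
    swift-ring-builder object.builder add z1-10.0.0.1:6000/sdb1 100
    swift-ring-builder object.builder add z2-10.0.0.1:6000/sdc1 100
    swift-ring-builder object.builder add z3-10.0.0.2:6000/sdb1 100
    swift-ring-builder object.builder add z4-10.0.0.2:6000/sdc1 100
    swift-ring-builder object.builder rebalance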
Zone 1 + ⅓ Zone 4 | Zone 2 + ⅓ Zone 4 | Zone 3 + ⅓ Zone 4
- The addition of a third node further enables distribution of zones across the nodes.
- Something strange is going on here: whole zones are placed on each node,
- but zone 4 is broken up into thirds and distributed across the three nodes (see the sketch after this list).
- This is done to enable smoother rebalancing when going to 4 nodes.
- Again, if a single node is down, data will be available, but there will be a 1 in 5 chance that a write would fail.
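- One way to express the thirds-of-zone-4 layout in the ring: each node gets one whole zone plus one zone-4 drive (hypothetical addresses and device names as before):

    # Each node carries one whole zone plus a third of zone 4:
    swift-ring-builder object.builder add z1-10.0.0.1:6000/sdb1 100
    swift-ring-builder object.builder add z4-10.0.0.1:6000/sdc1 100
    swift-ring-builder object.builder add z2-10.0.0.2:6000/sdb1 100
    swift-ring-builder object.builder add z4-10.0.0.2:6000/sdc1 100
    swift-ring-builder object.builder add z3-10.0.0.3:6000/sdb1 100
    swift-ring-builder object.builder add z4-10.0.0.3:6000/sdc1 100
    swift-ring-builder object.builder rebalance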
Zone 1 Zone 2 Zone 3 Zone 4
- The strategy of breaking up zone 4 into thirds with 3 nodes is to make this transition easier.
- The cluster can be reconfigured with zone 4 entirely on the new server,
- then the remaining zones can slowly be rebalanced to fold in the newly vacated drives on their nodes (a sketch of this migration follows).
- Now, if a single node fails, writes will be successful, as at least two zones will be available.
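- A rough sketch of the migration, under the assumption that the old zone-4 thirds are drained by zeroing their weights and rebalancing over time (standard swift-ring-builder commands; addresses and devices remain hypothetical):

    # Add the new fourth node's drives, all in zone 4:
    swift-ring-builder object.builder add z4-10.0.0.4:6000/sdb1 100

    # Drain the old zone-4 thirds by setting their weights to zero;
    # each rebalance gradually moves their partitions to the new node.
    swift-ring-builder object.builder set_weight z4-10.0.0.1:6000/sdc1 0
    swift-ring-builder object.builder set_weight z4-10.0.0.2:6000/sdc1 0
    swift-ring-builder object.builder set_weight z4-10.0.0.3:6000/sdc1 0
    swift-ring-builder object.builder rebalance

    # Once drained, remove each old device and re-add the freed drive
    # to the zone already living on its node (z1 here, for example):
    swift-ring-builder object.builder remove z4-10.0.0.1:6000/sdc1
    swift-ring-builder object.builder add z1-10.0.0.1:6000/sdc1 100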
- Why small-scale Swift?
- Using OpenStack Object Storage is a private-cloud alternative to S3, Cloud Files, etc.
- This enables private cloud builders to start out with a single machine in their own data center and scale up as their needs grow.
- Why not use RAID?
- Why not use a banana? :) It's a different storage system, used for different purposes.
- Going with a private deployment of Object Storage gives something that looks and feels just like Rackspace Cloud Files.
- App developers don't need to attach a volume to use the storage system, and assets can be served directly to end users or to a CDN.
- The bottom line is that a small deployment can transition smoothly into a larger deployment.
- The great thing about OpenStack being open-source software is that it gives us the freedom to build and design systems however we see fit.