Scale-Out Block Storage

Existing approaches to delivering persistent block storage in OpenStack focus on integrating existing SAN/NAS hardware solutions, using Distributed File Systems (DFS), or using simple Direct Attached Storage (DAS) with Cinder. There is another alternative: scale-out block storage nodes with intelligent scheduling. This is the same approach that Amazon Web Services (AWS) uses for Elastic Block Store (EBS), and it's worth taking a close look at the pros and cons. This presentation will explore the differences between SAN, NAS, DFS, DAS, and EBS. We will look at the implicit and explicit contracts that users and operators get from the different approaches and look at a variety of failure conditions. EBS may not be right for some clouds, but for many it's an important and viable alternative to the existing approaches.

Usage Rights: CC Attribution-NoDerivs License


    Scale-Out Block Storage Presentation Transcript

    • Eric Windisch, Cloudscaling Principal Engineer • OpenStack Dev since Cactus • Building PaaS since 2002 • Building IaaS since 2006 • Engineering lead for KT’s second-generation elastic block storage solution. Thursday, April 18, 13
    • Scale Out Block Storage
    • Amazon’s Architecture (or what we can infer)
    • Control-plane + Storage. Diagram labels: EBS Control-plane, EBS Storage, EC2.
    • Distributed Storage: “EBS is a distributed, replicated block data store that is optimized for consistency and low latency read and write access from EC2 instances.”
    • Brewer’s Theorem: “It is impossible for a distributed system to simultaneously provide all three of the following guarantees.” C: consistency; A: availability; P: partition tolerance.
    • Brewer’s Theorem (continued). Annotations: “Essential”; availability = reliability; partition tolerance = reliability.
    • Brewer’s Theorem (continued). To have consistency, EBS gives up reliability.
    • Snapshots: “Amazon EBS also provides the ability to create point-in-time snapshots of volumes... These snapshots... protect data for long-term durability.” For reliability, use snapshots.
    • Cloud #FAIL
    • Real-life failures: “STUFF”
    • Service Provider Failures - 100%. Hardware rarely fails, operators fail, software fails. (names withheld to protect the guilty)

          Type        Year  Why      Duration
          Switch      2005  Bug      2 hrs
          SAN         2007  Ops Err  24 hrs
          NAS         2008  Ops Err  8 hrs
          SAN         2011  Bug      72 hrs
          Datacenter  2012  Bug+Ops  2 hrs

    • What is Scale-out? Scale-up: make boxes bigger (usually an HA pair). Scale-out: make moar boxes. Diagram labels: A, B for the pair; A, B, C, D ... N for the scaled-out set.
    • Scaling out is a mindset. Scaling up is like treating your servers as pets (bowzer.company.com). Servers *are* cattle (web001.company.com).
    • Big failure domains vs. small. Would you rather have the whole cloud down or just a small bit for a short period of time? Still a scale-up pattern ... wouldn’t you rather scale-out?
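
The failure-domain question on the slide above can be put into rough numbers. The following is a minimal sketch, assuming volumes are spread evenly across storage nodes and that a failed node takes only its own volumes offline; the function name and the node counts are hypothetical and do not come from the talk.

```python
def fraction_affected(total_nodes: int, failed_nodes: int = 1) -> float:
    """Fraction of volumes offline when `failed_nodes` of `total_nodes` fail.

    Assumes (hypothetically) an even spread of volumes and no replication
    between nodes, so each node is its own failure domain.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    if not 0 <= failed_nodes <= total_nodes:
        raise ValueError("failed_nodes must be between 0 and total_nodes")
    return failed_nodes / total_nodes


# One really-big storage system: a single failure is the whole cloud.
scale_up = fraction_affected(total_nodes=1)

# A hypothetical 100 small storage nodes: a single failure is a small bit.
scale_out = fraction_affected(total_nodes=100)

print(f"scale-up:  {scale_up:.0%} of volumes affected")
print(f"scale-out: {scale_out:.0%} of volumes affected")
```

The model ignores correlated failures such as a shared switch or a bad software rollout, which the failure table earlier in the deck suggests are the common case, so it is an upper bound on the benefit rather than a prediction.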
    • Cinder Architecture
    • Cinder: Different than EBS? Diagram labels: Cinder Control-plane, Storage, Compute.
    • Cinder Control Plane. Diagram labels: Control Plane, Storage.
    • Not just a control-plane. Diagram labels: Cinder Control-plane, Storage, Compute.
    • Storage Plugins: emc, coraid, sheepdog, huawei, glusterfs, solidfire, netapp, lvm, storwize, nexenta, nfs, windows, san, rbd (ceph), xiv, xenapi, scality, zadara. 18 different architectures.
    • Storage Plugin Choices really are: NAS, DFS, BLOCK.
    • Breaking the Cinder Control Plane
    • Scale-up, Scale-up (I hope nobody does this). Diagram labels: one-really-big-server (twice), really-big storage system (twice), and two X marks.
    • Diagram: Multi-backend pattern. One big Cinder control-plane (Cinder vol-manager) in front of several really-big storage systems.
    • Diagram: Multi-backend pattern. Many really-big control-plane nodes, many really-big storage systems.
    • Scale-out, Scale-out. Complex. Split-brain problems. Cinder doesn’t natively support this. To fix this, let’s talk about the storage backend.
    • Breaking the Cinder Storage Backend. This section needs a lot of work yet...
    • Scale-out, Scale-out. Complex. Split-brain problems. Cinder doesn’t natively support this.
    • Scale-out, then scale-up. Diagram labels: Transport Failure, Failed server, Failed Storage Backend.
    • “HA” Replication. Diagram labels: Cinder Control-plane, HA PAIR.
    • Scale-out, Scale-out, HA. Really Complex. Where is my data? More split-brain problems. Cinder still doesn’t support it.
    • Networking complexity. Imagine this was 4 or 6 nodes in the cluster.
    • Brewer’s Theorem: “It is impossible for a distributed system to simultaneously provide all three of the following guarantees.” C: consistency; A: availability; P: partition tolerance. Annotation: “HA loses this”. Inconsistent block storage? REALLY SCARY.
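
The split-brain problem named on the HA slides can be illustrated with a toy model. This is a minimal sketch, assuming a two-node HA pair that keeps accepting writes on both sides when the replication link is down. The `Replica` class and `diverged` helper are hypothetical and are not part of Cinder or any storage product.

```python
class Replica:
    """One side of a hypothetical HA pair holding block number -> data."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, block_no, data, peer=None):
        """Write locally; mirror to `peer` only while the link is up."""
        self.blocks[block_no] = data
        if peer is not None:
            peer.blocks[block_no] = data


def diverged(a, b):
    """Return the sorted block numbers on which the two replicas disagree."""
    keys = set(a.blocks) | set(b.blocks)
    return sorted(k for k in keys if a.blocks.get(k) != b.blocks.get(k))


a, b = Replica("a"), Replica("b")

# Healthy pair: the write is mirrored, so both sides agree.
a.write(0, b"v1", peer=b)

# Network partition: each side assumes the other is dead and keeps
# serving writes (peer=None models the broken replication link).
a.write(0, b"from-a")
b.write(0, b"from-b")

# After the partition heals there is no single correct value for block 0.
print(diverged(a, b))
```

Staying available on both sides of the partition is what costs the pair its consistency, which matches the "Inconsistent block storage?" warning on the slide.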
    • Solution!
    • We deploy Cinder like Nova. Diagram: an HTTP Proxy in front of two sets of API, Scheduler, and Compute services, with “volume” inserted at each Compute label.
    • Brokerless Messaging With ZeroMQ. Distributed messaging with 0MQ: avoiding RabbitMQ’s single point of failure. RabbitMQ (brokered): a centralized broker between Nova-Compute, Nova-Scheduler, and Nova-API is a single point of failure. ZeroMQ (peer to peer): distributed broker. (Embedded slide footer: Thursday, October 18, 12.)
    • Scale-together. Simple. Deterministic. Won’t lose all data together. Can lose leaf-node data, but snapshots hedge against this.
    • Brewer’s Theorem: “It is impossible for a distributed system to simultaneously provide all three of the following guarantees.” C: consistency; A: availability; P: partition tolerance.
    • Brewer’s Theorem (continued). With snapshots, we don’t need availability & partition tolerance.
    • Snapshots save us. Diagram labels: Swift / S3, Snapshots.
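
The "scale-together" idea from this section (a scheduler places each volume on one storage node, and snapshots in an object store hedge against losing that node) can be sketched as below. This is an illustration under stated assumptions, not Cinder's scheduler or snapshot API: the `StorageNode` class, the "most free capacity" placement rule, and the dict standing in for Swift / S3 are all hypothetical.

```python
class StorageNode:
    """A hypothetical leaf storage node that owns whole volumes."""

    def __init__(self, name, capacity_gb):
        self.name = name
        self.capacity_gb = capacity_gb
        self.volumes = {}  # volume id -> {"size_gb": int, "blocks": dict}

    @property
    def free_gb(self):
        return self.capacity_gb - sum(v["size_gb"] for v in self.volumes.values())


def schedule(nodes, size_gb):
    """Pick the node with the most free space (an assumed placement rule)."""
    candidates = [n for n in nodes if n.free_gb >= size_gb]
    if not candidates:
        raise RuntimeError("no storage node has enough free capacity")
    return max(candidates, key=lambda n: n.free_gb)


def create_volume(nodes, vol_id, size_gb):
    node = schedule(nodes, size_gb)
    node.volumes[vol_id] = {"size_gb": size_gb, "blocks": {}}
    return node


def snapshot(node, vol_id, object_store):
    """Copy a point-in-time image of the volume into the object store."""
    vol = node.volumes[vol_id]
    object_store[vol_id] = {"size_gb": vol["size_gb"], "blocks": dict(vol["blocks"])}


def restore(nodes, vol_id, object_store):
    """Recreate the volume from its snapshot on whichever node is scheduled."""
    snap = object_store[vol_id]
    node = schedule(nodes, snap["size_gb"])
    node.volumes[vol_id] = {"size_gb": snap["size_gb"], "blocks": dict(snap["blocks"])}
    return node


nodes = [StorageNode("node-a", 100), StorageNode("node-b", 200)]
object_store = {}  # stands in for Swift / S3

home = create_volume(nodes, "vol-1", 50)
home.volumes["vol-1"]["blocks"][0] = b"hello"
snapshot(home, "vol-1", object_store)
home.volumes["vol-1"]["blocks"][1] = b"written after the snapshot"

# The home node is lost: only snapshotted data survives.
survivors = [n for n in nodes if n is not home]
new_home = restore(survivors, "vol-1", object_store)
print(new_home.name, new_home.volumes["vol-1"]["blocks"])
```

In this model the block written after the snapshot is gone after the restore, which is the "can lose leaf-node data" trade-off the Scale-together slide accepts in exchange for small, independent failure domains.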
    • Q&A. http://engineering.cloudscaling.com/portland13/ Eric Windisch, Principal Engineer - OpenStack, Cloudscaling, @ewindisch. CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution*. * All unlicensed or borrowed works retain their original licenses.