Architecting an Enterprise Storage Platform Using Object Stores
Presented at SNIA SDC 2013


  1. Architecting an Enterprise Storage Platform Using Object Stores. Niraj Tolia, Chief Architect, Maginatics, @nirajtolia. © mekuria getinet / www.mekuriageti.net
  2. A Whirlwind Tour
  3. Awesome Questions == Awesome T-shirts
  4. 80% YoY Growth in Unstructured Data; 41% Growth in IaaS Systems through 2016. Sources: Gartner, IT Marketing Clock for Storage, Sep 2011; Gartner, Forecast Overview: Public Cloud Services, Worldwide, 2011-2016, Feb 2013
  5. MagFS: The File System for the Cloud. Consistent, Elastic, Secure, Mobile-Enabled; Layered on Object Stores; “Software-Defined”
  6. No (Initial) Legacy Support (NFS/CIFS); Native Clients: Push Intelligence to Edges; Strong Consistency w/ Full-Spectrum Caching
  7. File System Design Goals: Low Cost, High Scale; Intelligent Clients; Span Devices and Networks; Support Rapid Iteration
  8. Use Cases: In-Cloud File System; NAS Replacement and Consolidation; Enterprise File Sharing
  9. 10,000-Foot View: Object Storage (public, on-premises, or hybrid); Data; Metadata; Metadata Servers; Clients
  10. Heavy (Data) Lifting via Clients: Encryption; Inline Deduplication; Compression; Persistent Data Caching; Bulk Data Transfers. Photo: Koukouvaya / flickr.com/photos/jackoughton/6535137981/
  11. Cloud Object Storage: Scale Out, Low Cost; Handles Placement + Replication; Tolerates Failures; High Aggregate Performance
  12. Virtualized Metadata Servers: Enforce Strong Consistency; Enforce Authentication and Integrity; Runtime Performance Optimization; Share-level Deduplication; Data Scrubbing & Garbage Collection
  13. Architecture
  14. Client Architecture
  15. Client Architecture: Application; Redirector (e.g., FUSE); File System; OS Glue; Data Manager; Metadata Transport Layer; Local; Remote; Userspace; Kernel; Deduplication; Encryption; Compression; Locking; Leases
  16. Simplified Write: Deduplication + Encryption. Data Manager; File System Layer; Write Request; Plaintext; Variable-Length Chunking; Encrypted Text (E); AES-256 (K); Object Name (N); SHA-256; Local Cache; Remote Transfer; Encryption Key (K); SHA-256
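Slide 16 names variable-length chunking as the first step of the write path, but the slides do not say which algorithm MagFS uses. The following is a minimal sketch of content-defined chunking, assuming a simple rolling-sum boundary test; `WINDOW`, `MASK`, `MIN_CHUNK`, and `MAX_CHUNK` are hypothetical values chosen only for illustration.

```python
from typing import Iterator

WINDOW = 16       # bytes in the rolling window (assumed)
MASK = 0x3F       # cut when (rolling sum & MASK) == MASK (assumed)
MIN_CHUNK = 32    # assumed lower bound on chunk size
MAX_CHUNK = 1024  # assumed upper bound on chunk size


def chunk(data: bytes) -> Iterator[bytes]:
    """Yield variable-length chunks whose boundaries depend on content."""
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= WINDOW:
            # Drop the byte that just left the window.
            rolling -= data[i - WINDOW]
        length = i - start + 1
        at_boundary = length >= MIN_CHUNK and (rolling & MASK) == MASK
        if at_boundary or length >= MAX_CHUNK:
            yield data[start:i + 1]
            start = i + 1
            rolling = 0
    if start < len(data):
        # Trailing bytes form the final (possibly short) chunk.
        yield data[start:]
```

Because a cut depends only on the bytes inside the window, an insertion early in a file tends to move only nearby boundaries, which leaves later chunks byte-identical and therefore deduplicable.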
  17. Simplified Write: Deduplication + Encryption. Data Manager; File System Layer; Write Request; Plaintext; Variable-Length Chunking; Encrypted Text (E); AES-256 (K); Object Name (N); SHA-256; <File, Offset, N, K>; Optional(<URI>); Local Cache; Remote Transfer; <N, E>; <URI, E>; No Encryption Keys in the Cloud; No Encryption Keys in Local Cache; Encryption Key (K); SHA-256; <E>
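One plausible reading of the slide 17 diagram is a content-derived scheme: the key K is the SHA-256 of the plaintext chunk, E is the chunk encrypted under K, and the object name N is the SHA-256 of E. The sketch below follows that reading. It is an assumption, not a confirmed description of MagFS. To stay runnable with only the standard library it uses a hypothetical `keystream_xor` stand-in where the slide specifies AES-256, and plain dicts and lists stand in for the file system layer, local cache, and object store.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Stand-in cipher, NOT AES-256 and not for production use:
    XOR the data with a SHA-256 counter-mode keystream."""
    out = bytearray()
    for block in range(0, len(data), 32):
        pad = hashlib.sha256(key + block.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[block:block + 32], pad))
    return bytes(out)


def prepare_chunk(plaintext: bytes):
    key = hashlib.sha256(plaintext).digest()      # K (assumed: derived from content)
    encrypted = keystream_xor(key, plaintext)     # E (the slide uses AES-256 under K)
    name = hashlib.sha256(encrypted).hexdigest()  # N (content-based object name)
    return name, key, encrypted


def write_chunk(file_id, offset, plaintext, fs_records, object_store, local_cache):
    name, key, encrypted = prepare_chunk(plaintext)
    fs_records.append((file_id, offset, name, key))  # <File, Offset, N, K>
    local_cache[name] = encrypted                    # <N, E>: no key in the cache
    if name not in object_store:                     # identical chunk already stored
        object_store[name] = encrypted               # only E reaches the cloud
    return name
```

Under this reading, identical plaintext chunks always yield the same N, so deduplication works on encrypted data while K stays out of both the cache and the object store.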
  18. Simplified Read: Deduplication + Encryption. Data Manager; File System Layer; Read Request <File, Offset, Range>; Local Cache; Remote Transfer; <N, URI>; Encryption Key (K); <N, K, URI>; Encrypted Text (E); <E>; <URI>; <E>; <URI>; <E>; Plaintext; AES-256 (K)
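The read path on slide 18 appears to work as follows: the file system layer resolves a request to <N, K, URI>, the data manager looks for E in the local cache by N and falls back to a remote transfer by URI, and the result is decrypted with K. A minimal sketch under that reading is shown here; the dict-based `fs_records`, `local_cache`, and `object_store` and the pluggable `decrypt` callable are hypothetical stand-ins.

```python
from typing import Callable, Dict, Tuple


def read_chunk(
    file_id: str,
    offset: int,
    fs_records: Dict[Tuple[str, int], Tuple[str, bytes, str]],
    local_cache: Dict[str, bytes],
    object_store: Dict[str, bytes],
    decrypt: Callable[[bytes, bytes], bytes],
) -> bytes:
    # The file system layer maps <File, Offset> to <N, K, URI>.
    name, key, uri = fs_records[(file_id, offset)]
    encrypted = local_cache.get(name)
    if encrypted is None:
        # Cache miss: fetch E from the object store and keep it locally.
        encrypted = object_store[uri]
        local_cache[name] = encrypted
    # Only the caller holds K; the cache and the store hold E alone.
    return decrypt(key, encrypted)
```

Passing `decrypt` in keeps the sketch independent of any particular cipher; the slide itself specifies AES-256.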
  19. The Client in Real Life Does a Lot More!
      • File and Directory Leases (data and metadata caching)
      • Asynchronous Operations (including writes)
      • Operation Compounding
      • Runtime Optimizations (e.g., read ahead)
      • Optimizing for High Bandwidth Delay Product (BDP)
      • …
  20. Communication Details: Object Storage (public, on-premises, or hybrid); Data; Metadata; Metadata Servers; Clients; Thrift (HTTPS); REST (HTTPS)
  21. Server Architecture
  22. Metadata Server Internals: Metadata Storage Layer; Storage Core; Backups; Production; Development; GC; Scrubbing; Quotas; Dedup; Leases; Security; HA; MagFS; Ext. Sharing; Multi-Cloud; Versioning; Offline Mode; Cloud Abstraction Layer; Legend
  23. Bootstrapping: Virtualized Namespaces. Legacy: \\server.example.com\share (HOST FQDN, FOLDER). MagFS: \\server.example.com\share, with dynamic mapping to host:port
  24. Discovery Service; Metadata Server; Metadata Server (HA); Metadata Server; ZooKeeper; ZooKeeper; ZooKeeper; Monitoring; Management Console; Config + Scheduler; Virtual Filer → Host:Port Mapping
  25. Leases: Performance and Strong Consistency. Read; Write; Handle; Lease Types; Read; Read + Handle; Read + Write + Handle; Lease States; Valid File Leases; Valid Directory Leases
  26. Cloud Storage Interaction
  27. Object Storage (public, on-premises, or hybrid)
  28. Object Storage systems are like snowflakes!
  29. Object Store API Compatibility. Q: Has anyone come across a near 100% Amazon S3 API compatible object storage system? A: It is hard to find a near-100% compatible product… (Vendor w/ S3 Compatible Product)
  30. Direct Client Access: Security Problem? Object Storage (public, on-premises, or hybrid); Data; Metadata; Metadata Servers; Clients
  31. Request Signing
  32. Server-Driven Request Signing: SignString = HTTP-Verb + "\n" + Content-MD5 + "\n" + Content-Type + "\n" + Date + "\n" + Resource + "\n" + ...
  33. Server-Driven Request Signing: SignString = PUT + "\n" + Content-MD5 + "\n" + Content-Type + "\n" + Date + "\n" + Resource + "\n" + ...
  34. Server-Driven Request Signing: SignString = PUT + "\n" + 07BzhNET7exJ6qYjitX/AA== + "\n" + Content-Type + "\n" + Date + "\n" + Resource + "\n" + ...
  35. Server-Driven Request Signing: SignString = PUT + "\n" + 07BzhNET7exJ6qYjitX/AA== + "\n" + image/jpeg + "\n" + Date + "\n" + Resource + "\n" + ...
  36. Server-Driven Request Signing: SignString = PUT + "\n" + 07BzhNET7exJ6qYjitX/AA== + "\n" + image/jpeg + "\n" + Tue, 11 Jun 2013 00:27:41 + "\n" + Resource + "\n" + ...
  37. Server-Driven Request Signing: SignString = PUT + "\n" + 07BzhNET7exJ6qYjitX/AA== + "\n" + image/jpeg + "\n" + Tue, 11 Jun 2013 00:27:41 + "\n" + /container/example.jpeg + "\n" + ...
  38. Server-Driven Request Signing: SignString = PUT + "\n" + 07BzhNET7exJ6qYjitX/AA== + "\n" + image/jpeg + "\n" + Tue, 11 Jun 2013 00:27:41 + "\n" + /container/example.jpeg + "\n" + ... HMAC-SHA1(secret key, SignString)
  39. Server-Driven Request Signing: SignString = PUT + "\n" + 07BzhNET7exJ6qYjitX/AA== + "\n" + image/jpeg + "\n" + Tue, 11 Jun 2013 00:27:41 + "\n" + /container/example.jpeg + "\n" + ... Signature = Base64(HMAC-SHA1(secret key, SignString))
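The signing steps built up across slides 32 to 39 can be sketched with the Python standard library. This is an illustrative sketch only: the secret key is a hypothetical placeholder, the fields elided as "..." on the slides are left out, and each object store defines its own exact string-to-sign, so the result is not meant to validate against any particular service.

```python
import base64
import hashlib
import hmac


def sign_request(secret_key: bytes, verb: str, content_md5: str,
                 content_type: str, date: str, resource: str) -> str:
    """Return Base64(HMAC-SHA1(secret_key, SignString))."""
    # SignString: the request fields joined by newlines, as on the slides.
    sign_string = "\n".join([verb, content_md5, content_type, date, resource])
    digest = hmac.new(secret_key, sign_string.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Values taken from the slides; the key is a made-up placeholder.
signature = sign_request(
    b"hypothetical-secret-key",
    "PUT",
    "07BzhNET7exJ6qYjitX/AA==",
    "image/jpeg",
    "Tue, 11 Jun 2013 00:27:41",
    "/container/example.jpeg",
)
```

Because the signature covers the verb, content hash, type, date, and resource, a server that computes it authorizes exactly one request, and the client never needs to hold the secret key.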
  40. Safe Direct Client Access via Request Signing. Object Storage (public, on-premises, or hybrid); Data; Metadata; Metadata Servers; Clients. 1. Read/Write Request; 2. HTTP Request + Signature; 3. HTTP Request + Signature + Encrypted Data
  41. Dealing with Lost Client Writes
      • Clients can lose connectivity or, in the worst case, be malicious
      • Naïvely trusting client writes can “corrupt” data under global dedup
      • MagFS server scrubs all writes:
        • Client acknowledges write
        • Server verifies object existence (object store performed MD5 at PUT)
        • Server can also read and verify object data (stronger SHA-256 check)
      • The object will be available for deduplication only after scrubbing
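The scrubbing rule on slide 41 can be sketched as a server-side check that runs after the client acknowledges a write. The sketch assumes object names are the hex SHA-256 of the stored bytes, which is one reading of the write-path slides; `object_store` and `dedup_index` are hypothetical in-memory stand-ins.

```python
import hashlib
from typing import Dict, Set


def scrub(name: str, object_store: Dict[str, bytes], dedup_index: Set[str]) -> bool:
    """Admit an acknowledged write to the dedup index only if it verifies."""
    data = object_store.get(name)
    if data is None:
        # The client acknowledged a write that never reached the store.
        return False
    if hashlib.sha256(data).hexdigest() != name:
        # Stored bytes do not match the content-based name.
        return False
    # Only now may other clients deduplicate against this object.
    dedup_index.add(name)
    return True
```

Keeping unverified names out of the index means a lost or forged write can affect only the file that referenced it, not every file that would later deduplicate against the same name.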
  42. Handling Object Store Eventual Consistency
      • Treat objects as immutable (even if modifications are allowed)
      • Use content-based names (generated using cryptographic hashes)
      • Tombstone names after Garbage Collection
      • Suffix generation number to content-based names in case of resurrection
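The naming rules on slide 42 can be illustrated with a small allocator. This is a sketch under assumptions: the `NameAllocator` class, its in-memory maps, and the `hash.generation` name format are hypothetical, since the slides do not show how MagFS stores tombstones or formats the suffix.

```python
import hashlib
from typing import Dict


class NameAllocator:
    def __init__(self) -> None:
        self.live: Dict[str, int] = {}        # content hash -> generation in use
        self.tombstones: Dict[str, int] = {}  # content hash -> last collected generation

    def object_name(self, encrypted: bytes) -> str:
        base = hashlib.sha256(encrypted).hexdigest()
        if base not in self.live:
            # Resurrected content gets the next generation, so a name that
            # was garbage collected is never written again.
            self.live[base] = self.tombstones.get(base, -1) + 1
        return f"{base}.{self.live[base]}"

    def garbage_collect(self, encrypted: bytes) -> None:
        base = hashlib.sha256(encrypted).hexdigest()
        generation = self.live.pop(base, None)
        if generation is not None:
            # Tombstone the name instead of forgetting it.
            self.tombstones[base] = generation
```

Since every name is written at most once and never reused after deletion, a stale listing or a delayed delete in an eventually consistent store cannot be mistaken for the current version of an object.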
  43. Security Architecture
  44. Recap: On-Premises Security Model
      • User authentication and permissions derived from native Active Directory setup
      • Encryption keys are never exposed to the cloud
      • Data and metadata are always encrypted: at rest and in flight
  45. Slides (with speaker notes) at http://tolia.org. Try MagFS at http://maginatics.com
