Public clouds were initially popularized on the premise that workloads are dynamic: you can match compute resources to the peaks and troughs in your consumption, rather than maintaining mostly idle buffer capacity to meet peak user demand. It has become apparent, however, that this does not necessarily hold for storage. Production environments typically show continual growth across all data sets: those actively used for decision making or transactional processing, those maintained as training data for AI/ML, those kept for archival purposes, and backups of critical data. In this talk we discuss how Ceph can be deployed cost-effectively adjacent to public clouds, and examine the financial implications of both approaches.
6. What is Ceph?
A Software Defined Storage solution designed to provide massively scalable block, object and file storage from a single resilient storage cluster
10. Cluster Hardware Specs
● 12x Storage Nodes
○ 2x Intel® Xeon® Gold 5218 2.3G, 16C/32T
○ 256GB RAM
○ 1x Dual Port 25Gb NIC
○ Onboard Gb NIC
○ 2x 480GB SSD for Boot/OS (RAID-1)
○ 4x 3.84TB SSD for Index/Caching
○ 20x 18TB SATA HDDs
● 3x Management Nodes
○ Intel® Xeon® Silver 4214 2.2G, 12C/24T
○ 128GB RAM
○ Onboard Gb NIC
○ 2x 960GB SSD (RAID-1)
● Networking
○ 2x 25Gb Top of rack switches
○ 1x 1Gb Management Switch
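As a rough sizing check, the raw HDD capacity of the storage nodes above can be worked out directly. The sketch below also applies the 4+2 erasure-coding profile named on the Managed Ceph slide. It is only an approximation: it assumes decimal terabytes and ignores metadata overhead and the free-space headroom a real cluster needs, and the variable names are illustrative.

```python
# Raw and usable HDD capacity for the storage nodes listed above.
NODES = 12
HDDS_PER_NODE = 20
HDD_TB = 18            # decimal terabytes, as drives are marketed
EC_K, EC_M = 4, 2      # erasure coding 4+2: 4 data chunks plus 2 coding chunks

raw_tb = NODES * HDDS_PER_NODE * HDD_TB

# With k+m erasure coding, k/(k+m) of the raw space holds user data.
usable_tb = raw_tb * EC_K / (EC_K + EC_M)

print(f"raw HDD capacity: {raw_tb} TB")
print(f"usable with EC {EC_K}+{EC_M} (before overheads): {usable_tb:.0f} TB")
```

The SSDs are left out because the slide lists them for boot and index/caching rather than bulk data.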
11. How did we come up with these numbers?
Public Cloud
Item                               Unit price       Monthly Cost
First 50 TB                        $0.023 /GB       $1,177.60
Next 450 TB                        $0.022 /GB       $10,137.60
Over 500 TB                        $0.021 /GB       $54,423.18
Requests: PUTs etc. (per 1,000)    $0.005           $2,654.29
Requests: GETs etc. (per 1,000)    $0.0004          $1,061.72
Monthly                                             $65,800.31
5 Years                                             $3,948,018.64
● List prices!
● Leading cloud provider
● General purpose object storage tier
● WORM dataset
○ 10MB object size
○ Allow for 2x full rewrites
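The tiered arithmetic behind the public cloud figures can be sketched as follows. Two things here are inferred from the printed numbers rather than stated on the slide: the tier costs only reproduce with 1 TB = 1024 GB, and the monthly total only matches if the two request charges are treated as totals spread over the 60 months. The dataset size is back-solved from the top-tier row. Function and variable names are illustrative.

```python
GB_PER_TB = 1024   # inferred: 50 TB * 1024 * $0.023 = $1,177.60
MONTHS = 60        # 5 years

# (tier size in TB, price per GB-month); None marks the unbounded top tier
TIERS = [(50, 0.023), (450, 0.022), (None, 0.021)]

def monthly_storage_cost(total_gb, tiers=TIERS):
    """Charge each slice of the dataset at the price of the tier it falls in."""
    remaining = total_gb
    cost = 0.0
    for size_tb, price in tiers:
        tier_gb = remaining if size_tb is None else min(remaining, size_tb * GB_PER_TB)
        cost += tier_gb * price
        remaining -= tier_gb
        if remaining <= 0:
            break
    return cost

# Dataset size back-solved from the table: top-tier cost / top-tier price
total_gb = 500 * GB_PER_TB + 54_423.18 / 0.021

storage = monthly_storage_cost(total_gb)
requests_total = 2_654.29 + 1_061.72       # PUT and GET charges as printed
monthly = storage + requests_total / MONTHS  # assumption: spread over 60 months
five_years = monthly * MONTHS

print(f"dataset:  {total_gb:,.0f} GB")
print(f"monthly:  ${monthly:,.2f}")
print(f"5 years:  ${five_years:,.2f}")
```

Under this reading the result agrees with the printed monthly figure to the cent and with the 5-year figure to within rounding. Simply adding all five rows gives a higher monthly total than the one printed.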
12. How did we come up with these numbers?
Managed Ceph
Item                                       5-Year Cost
Hardware                                   $857,282.02
Remote hands                               $20,000.00
Professional Services                      $50,000.00
Cluster management & Support               $266,266.00
Cloud connectivity charges & x-connects    $227,100.00
Cloud data transfer                        $103,683.19
Co-location                                $150,000.00
5 Years                                    $1,674,331.21
● List prices!
● RGW w/ S3 Only
● Storage config
○ Erasure Coding 4+2
○ Indexes/Bcache on NVMe
● WORM dataset
○ 10MB object size
○ Allow for 2x full rewrites
● Co-located in metro-reach
● 2x 10GbE private circuits
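The Managed Ceph total is a plain sum of the line items, which the sketch below reproduces. It then converts the total to an effective price per GB-month so it can be read against the cloud tier prices. That conversion assumes the same dataset size as the public cloud case (back-solved from that slide at 1 TB = 1024 GB), which the slide implies but does not state.

```python
# 5-year line items as printed on the Managed Ceph slide
LINE_ITEMS = {
    "Hardware": 857_282.02,
    "Remote hands": 20_000.00,
    "Professional Services": 50_000.00,
    "Cluster management & Support": 266_266.00,
    "Cloud connectivity charges & x-connects": 227_100.00,
    "Cloud data transfer": 103_683.19,
    "Co-location": 150_000.00,
}
MONTHS = 60

total = sum(LINE_ITEMS.values())
per_month = total / MONTHS

# Assumption: same dataset as the public cloud slide, back-solved from
# its top pricing tier.
DATASET_GB = 500 * 1024 + 54_423.18 / 0.021
per_gb_month = per_month / DATASET_GB

for name, cost in LINE_ITEMS.items():
    print(f"{name:<42} ${cost:>12,.2f}  {cost / total:6.1%}")
print(f"{'5 Years':<42} ${total:>12,.2f}")
print(f"effective price: ${per_gb_month:.4f} per GB-month")
```

On these assumptions the effective price comes out just under one cent per GB-month, with hardware accounting for roughly half of the total.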
13. How did we come up with these numbers?
Infrequent/Cool Tiers
Item                               Unit price       Monthly Cost
First 50 TB                        $0.0125 /GB      $640.00
Next 450 TB                        $0.0125 /GB      $5,760.00
Over 500 TB                        $0.0125 /GB      $32,394.75
Requests: PUTs etc. (per 1,000)    $0.010           $5,308.58
Requests: GETs etc. (per 1,000)    $0.0010          $2,654.29
Monthly                                             $38,927.46
5 Years                                             $2,335,647.77
● List prices!
● Same dataset and access pattern
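Putting the three 5-year totals side by side makes the gap concrete. The sketch below uses only the totals printed on the three cost slides; the labels are illustrative.

```python
# 5-year totals as printed on the three cost slides
FIVE_YEAR = {
    "Public cloud, general purpose tier": 3_948_018.64,
    "Public cloud, infrequent/cool tier": 2_335_647.77,
    "Managed Ceph": 1_674_331.21,
}

ceph = FIVE_YEAR["Managed Ceph"]

# Absolute and relative saving of Managed Ceph against each cloud option
savings = {
    name: (cost - ceph, (cost - ceph) / cost)
    for name, cost in FIVE_YEAR.items()
    if name != "Managed Ceph"
}

for name, (absolute, relative) in savings.items():
    print(f"vs {name}: ${absolute:,.2f} saved ({relative:.1%})")
```

At list prices this works out to a saving of a little under 58% against the general purpose tier and a little over 28% against the infrequent/cool tier.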
14. Do I need to commit long term?
Chart: TCO by year, years 1 to 5
Even a <1 year commitment yields savings
15. Apart from the cost, why do this?
● Agility to choose cloud provider
● Controlled ingress/egress costs
● No longer anchored to one cloud
● Bridge private and public clouds
● Regain control of your most important asset