The safe way to make Ceph storage enterprise ready!
Build your own [disaster]?
Copyright 2015 FUJITSU
Paul von Stamwitz
Sr. Storage Architect
Storage Planning, R&D Center
2015-07-16
1
 The safe and convenient way to make Ceph storage enterprise ready
 ETERNUS CD10k integrated in OpenStack
 mSHEC Erasure Code from Fujitsu
 Contribution to performance enhancements
2
Building Storage with Ceph looks simple
Copyright 2015 FUJITSU
Ceph
+ some servers
+ network
= storage
3
Building Storage with Ceph looks simple – but…
Many new complexities
 Rightsizing servers, disk types, and network bandwidth
 Silos of management tools (HW, SW..)
 Keeping Ceph versions in sync with server HW, OS, connectivity, and driver versions
 Management of maintenance and support contracts for all components
 Troubleshooting
Copyright 2015 FUJITSU
Build Ceph open source storage yourself
4
The challenges of software defined storage
 What users want
 Open standards
 High scalability
 High reliability
 Lower costs
 No vendor lock-in
 What users may get
 A self-developed storage system based on open / industry-standard HW & SW components
 High scalability and reliability? Only if the stack works!
 Lower investments but higher operational efforts
 Lock-in to your own stack
Copyright 2015 FUJITSU
5
ETERNUS CD10000 – Making Ceph enterprise ready
Build Ceph open source storage yourself vs. out of the box ETERNUS CD10000 (incl. support and maintenance)
ETERNUS CD10000 combines open source storage with enterprise-class quality of service:
+ E2E Solution Contract by Fujitsu based on Red Hat Ceph Enterprise
+ Easy Deployment / Management by Fujitsu
+ Lifecycle Management for Hardware & Software by Fujitsu
6
Fujitsu Maintenance, Support and Professional Services
ETERNUS CD10000: A complete offer
Copyright 2015 FUJITSU
7
Unlimited Scalability
 Cluster of storage nodes
 Capacity and performance scales by
adding storage nodes
 Three different node types enable
differentiated service levels
 Density, capacity optimized
 Performance optimized
 Optimized for small scale dev & test
 1st version of CD10000 (Q3.2014) is
released for a range of 4 to 224 nodes
 Scales up to >50 Petabyte
Copyright 2015 FUJITSU
Basic node 12 TB, Performance node 35 TB, Capacity node 252 TB
8
Immortal System
Copyright 2015 FUJITSU
[Diagram: Node1 + Node2 + … + Node(n), adding nodes with new generations of hardware]
 Non-disruptive add / remove / exchange of hardware (disks and nodes)
 Mix of nodes/disks of different generations, online technology refresh
 Very long lifecycle reduces migration efforts and costs
9
TCO optimized
 Based on x86 industry standard architectures
 Based on open source software (Ceph)
 High-availability and self-optimizing functions are part
of the design at no extra costs
 Highly automated and fully integrated management
reduces operational efforts
 Online maintenance and technology refresh reduce
costs of downtime dramatically
 Extremely long lifecycle delivers investment protection
 End-to-end design and maintenance from Fujitsu reduce evaluation, integration, and maintenance costs
Copyright 2015 FUJITSU
Better service levels at reduced costs – business centric storage
10
One storage – seamless management
 ETERNUS CD10000 delivers one seamless
management for the complete stack
 Central Ceph software deployment
 Central storage node management
 Central network management
 Central log file management
 Central cluster management
 Central configuration, administration and
maintenance
 SNMP integration of all nodes and
network components
Copyright 2015 FUJITSU
11
Seamless management (2)
Dashboard – Overview of cluster status
Server Management – Management of cluster hardware – add/remove server
(storage node), replace storage devices
Cluster Management – Management of cluster resources – cluster and pool creation
Monitoring the cluster – Monitoring overall capacity, pool utilization, status of OSD,
Monitor, and MDS processes, Placement Group status, and RBD status
Managing OpenStack interoperation – Connection to the OpenStack server and placement of pools in Cinder multi-backend
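To illustrate what pool placement in a Cinder multi-backend setup involves on the OpenStack side, here is a minimal, hypothetical cinder.conf fragment that maps two Ceph pools to two Cinder backends; the section names, pool names, and cephx user are placeholders, not CD10000 defaults:

# Hypothetical cinder.conf fragment: two Ceph pools exposed as separate Cinder backends
# (pool names, cephx user, and backend section names are assumptions)
cat >> /etc/cinder/cinder.conf <<'EOF'
[DEFAULT]
enabled_backends = ceph-performance,ceph-capacity

[ceph-performance]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = perf-pool
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
volume_backend_name = ceph-performance

[ceph-capacity]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = capacity-pool
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
volume_backend_name = ceph-capacity
EOF

Each backend can then be tied to a Cinder volume type via its volume_backend_name, so new volumes land in the intended pool.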
12
Optional use of Calamari Management GUI
13
Example: Replacing an HDD
 Plain Ceph
 taking the failed disk offline in Ceph
 taking the failed disk offline on OS /
Controller Level
 identify (right) hard drive in server
 exchanging hard drive
 partitioning hard drive on OS level
 Make and mount file system
 bring the disk up in Ceph again
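For comparison, a rough command-line sketch of the plain-Ceph steps above; the OSD id (12), device name, and CRUSH host are placeholders, and the exact sequence varies by Ceph release and deployment tool:

# Take the failed OSD out of the cluster and stop its daemon
ceph osd out osd.12
service ceph stop osd.12
# Remove it from CRUSH, auth, and the OSD map
ceph osd crush remove osd.12
ceph auth del osd.12
ceph osd rm osd.12
# After swapping the drive: partition, make and mount the file system
parted /dev/sdx --script mklabel gpt mkpart primary xfs 0% 100%
mkfs.xfs /dev/sdx1
mount /dev/sdx1 /var/lib/ceph/osd/ceph-12
# Re-create the OSD (ceph osd create returns an id, assumed to be 12 again here)
ceph osd create
ceph-osd -i 12 --mkfs --mkkey
ceph auth add osd.12 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-12/keyring
ceph osd crush add osd.12 1.0 host=node1
service ceph start osd.12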
 On ETERNUS CD10000
 vsm_cli <cluster> replace-disk-out
<node> <dev>
 exchange hard drive
 vsm_cli <cluster> replace-disk-in
<node> <dev>
14
Example: Adding a Node
 Plain Ceph
 Install hardware
 Install OS
 Configure OS
 Partition disks (OSDs, Journals)
 Make filesystems
 Configure network
 Configure ssh
 Configure Ceph
 Add node to cluster
 On ETERNUS CD10000
 Install hardware
• hardware will automatically PXE boot
and install the current cluster
environment including current
configuration
 Node automatically available to GUI
 Add node to cluster with mouse click
on GUI
• Automatic PG adjustment if needed
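The "automatic PG adjustment" replaces a step that is otherwise done by hand after the cluster has grown; a minimal sketch, assuming an rbd pool that should move to 1024 placement groups:

# Grow the placement-group count of a pool after adding OSDs (pool name and value are placeholders)
ceph osd pool set rbd pg_num 1024
ceph osd pool set rbd pgp_num 1024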
15
Adding and Integrating Apps
 The ETERNUS CD10000 architecture
enables the integration of apps
 Fujitsu is working with customers and
software vendors to integrate selected
storage apps
 E.g. archiving, sync & share, data
discovery, cloud apps…
Copyright 2015 FUJITSU
[Architecture diagram: apps such as Cloud Services, Sync & Share, Archive, and iRODS data discovery run on top of ETERNUS CD10000, which provides object-, block-, and file-level access, central management, the Ceph storage system S/W with Fujitsu extensions, a 10GbE frontend network, a fast interconnect network, and performance and capacity nodes.]
16
ETERNUS CD10000 at University Mainz
 Large university in Germany
 Uses iRODS Application for library services
 iRODS is open-source data management software in use at research organizations and government agencies worldwide
 Organizes and manages large depots of distributed digital data
 Customer has built an interface from iRODS to Ceph
 Stores raw data of measurement instruments (e.g. research in chemistry and
physics) for 10+ years meeting compliance rules of the EU
 Need to provide extensive and rapidly growing data volumes online at
reasonable costs
 Will implement a sync & share service on top of ETERNUS CD10000
17
Summary ETERNUS CD10k – Key Values
Copyright 2015 FUJITSU
[Key values: ETERNUS CD10000, the new unified system, combining Unlimited Scalability, an Immortal System with Zero Downtime, and TCO optimization]
ETERNUS CD10000 combines open source storage with enterprise-class quality of service
18
 The safe way to make Ceph storage enterprise ready
 ETERNUS CD10k integrated in OpenStack
 mSHEC Erasure Code from Fujitsu
 Contribution to performance enhancements
19
What is OpenStack
Free open source (Apache license) software governed by a non-profit foundation
(corporation) with a mission to produce the ubiquitous Open Source Cloud
Computing platform that will meet the needs of public and private clouds
regardless of size, by being simple to implement and massively scalable.
[OpenStack Foundation sponsor tiers: Platinum, Gold, Corporate, …]
 Massively scalable cloud operating system that
controls large pools of compute, storage, and
networking resources
 Community OSS with contributions from 1000+
developers and 180+ participating organizations
 Open web-based APIs, programmatic IaaS
 Plug-in architecture allows different hypervisors, block storage systems, and network implementations; hardware agnostic, etc.
http://www.openstack.org/foundation/companies/
20
Attained fast-growing customer interest
 VMware clouds dominate
 OpenStack clouds already #2
 Worldwide adoption
Source: OpenStack User Survey and Feedback Nov 3rd 2014
Source: OpenStack User Survey and Feedback May 13th 2014
21
Why are Customers so interested?
Source: OpenStack User Survey and Feedback Nov 3rd 2014
Greatest industry & community support
compared to alternative open platforms:
Eucalyptus, CloudStack, OpenNebula
“Ability to Innovate” jumped from #6 to #1
22
OpenStack.org User Survey Paris: Nov. 2014
23
OpenStack Cloud Layers
OpenStack and ETERNUS CD10000
[Layered diagram, bottom to top: physical servers (CPU, memory, SSD, HDD) and network; base operating system (CentOS) with OAM functions (DHCP, deployment, lifecycle management); hypervisors (KVM, ESXi, Hyper-V); OpenStack services and cloud APIs – Compute (Nova), Network (Neutron) + plugins, Dashboard (Horizon), Billing Portal, Volume (Cinder), Object (Swift), Manila (File), Authentication (Keystone), Images (Glance), EC2 API, Metering (Ceilometer); Ceph/RADOS back ends – Block (RBD), S3 (RADOS Gateway), File (CephFS) – together forming Fujitsu Open Cloud Storage.]
24
 The safe way to make Ceph storage enterprise ready
 ETERNUS CD10k integrated in OpenStack
 mSHEC Erasure Code from Fujitsu
 Contribution to performance enhancements
25
Backgrounds (1)
 Erasure codes for content data
 Content data for ICT services is ever-growing
 Demand for higher space efficiency and durability
 Reed Solomon code (de facto erasure code) improves both
[Diagram: Triple Replication stores the content data plus two full copies (3x space); Reed Solomon code (old style) stores the content data plus parities (1.5x space).]
However, Reed Solomon code is not so recovery-efficient
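For context, the Reed Solomon baseline that the following slides compare against corresponds to a standard Ceph erasure-code profile; a sketch with illustrative k/m values (the profile name and failure-domain setting are assumptions):

# Reed Solomon profile: 10 data + 5 parity chunks, i.e. 1.5x space overhead
ceph osd erasure-code-profile set rs-k10-m5 \
  plugin=jerasure technique=reed_sol_van k=10 m=5 \
  ruleset-failure-domain=host
ceph osd erasure-code-profile get rs-k10-m5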
26
Backgrounds (2)
 Local parity improves recovery efficiency
 Data recovery should be as efficient as possible
• in order to avoid multiple disk failures and data loss
 Reed Solomon code was improved by local parity methods
• data read from disks is reduced during recovery
[Diagram: compared with Reed Solomon code (no local parities), a local parity method reduces the data read from disks when recovering data and parity chunks.]
However, multiple disk failures are not considered
27
 Local parity method for multiple disk failures
 Existing methods are optimized for a single disk failure
• e.g. Microsoft MS-LRC, Facebook Xorbas
 However, their recovery overhead is large in case of multiple disk failures
• because they have a chance to use global parities for recovery
Our Goal
Our goal is a method that efficiently handles multiple disk failures
[Diagram: a local parity method under multiple disk failures]
28
 SHEC (= Shingled Erasure Code)
 An erasure code only with local parity groups
• to improve recovery efficiency in case of multiple disk failures
 The calculation ranges of local parities are shifted and partly overlap with each
other (like the shingles on a roof)
• to keep enough durability
Our Proposed Method (SHEC)
[Diagram of the shingled layout: k = data chunks (= 10), m = parity chunks (= 6), l = calculation range of each local parity (= 5)]
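A hedged sketch of how such a layout can be selected once the plugin is available (Hammer or later); k and m follow the figure, while c (the plugin's durability estimator), the profile name, and the pool parameters are assumptions:

# SHEC profile: 10 data chunks and 6 shingled local parities
ceph osd erasure-code-profile set shec-k10-m6 \
  plugin=shec k=10 m=6 c=3 \
  ruleset-failure-domain=host
ceph osd pool create shec-pool 256 256 erasure shec-k10-m6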
29
1. mSHEC is more adjustable than Reed Solomon code, because SHEC provides many recovery-efficient layouts, including Reed Solomon codes
2. mSHEC's recovery time was ~20% faster than Reed Solomon code in the case of double disk failures
3. The mSHEC erasure code is included in the Hammer release
4. For more information see
https://wiki.ceph.com/Planning/Blueprints/Hammer/Shingled_Erasure_Code_(SHEC)
or ask Fujitsu
Summary mSHEC
30
 The safe way to make Ceph storage enterprise ready
 ETERNUS CD10k integrated in OpenStack
 mSHEC Erasure Code from Fujitsu
 Contribution to performance enhancements
31
Areas to improve Ceph performance
Ceph has adequate performance today, but there are performance issues that prevent us from taking full advantage of our hardware resources.
Two main goals for improvement:
(1) Decrease latency in the Ceph code path
(2) Enhance large-cluster scalability with many nodes / OSDs
32
LTTng general http://lttng.org/
General
 open source tracing framework for Linux
 trace Linux kernel and user space applications
 low overhead and therefore usable on
production systems
 activate tracing at runtime
 Ceph code contains LTTng trace points already
Our LTTng based profiling
 activate within a function, collect timestamp information at the interesting places
 save collected information in a single trace point at the end of the function
 transaction profiling instead of function profiling: use Ceph transaction IDs to correlate trace points
 focused on primary and secondary write operations
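For reference, a minimal sketch of driving LTTng around such a test run from the command line; the session name and event wildcard are placeholders, and the transaction-level trace points described above are Fujitsu-specific additions:

# Record the userspace trace points compiled into ceph-osd during a write test
lttng create ceph-osd-profiling
lttng enable-event --userspace 'osd:*'
lttng start
# ... run the write workload against the cluster ...
lttng stop
lttng view > ceph-osd-trace.txt
lttng destroy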
33
Turn Around Time of a single Write IO
34
LTTng data evaluation: Replication Write
Observation:
 replication write latency suffers from the "large variance problem"
 minimum and average differ by a factor of 2
 This is a common problem visible for many ceph-osd components.
Why is variance so large?
 Observation: No single hotspot visible.
 Observation: Active processing steps do not differ between minimum and average
sample as much as the total latency does.
Additional latency penalty mostly at the switch from
 sub_op_modify_commit to Pipe::writer
 no indication that queue length is the cause
Question: Can the overall thread load on the system and Linux scheduling be the
reason for the delayed start of the Pipe::writer thread?
35
Thread classes and ceph-osd CPU usage
The number of threads per ceph-osd depends on the complexity of the Ceph cluster: 3 nodes with 4 OSDs each give ~700 threads per node; 9 nodes with 40 OSDs each give >100k threads per node
 ThreadPool::WorkThread is a hot spot = work in the ObjectStore / FileStore
Total CPU usage during test: 43.17 CPU seconds
Thread class – CPU seconds – share of total:
Pipe::Writer 4.59 10.63%
Pipe::Reader 5.81 13.45%
ShardedThreadPool::WorkThreadSharded 8.08 18.70%
ThreadPool::WorkThread 15.56 36.04%
FileJournal::Writer 2.41 5.57%
FileJournal::WriteFinisher 1.01 2.33%
Finisher::finisher_thread_entry 2.86 6.63%
36
FileStore benchmarking
 most of the work is done in FileStore::do_transactions
 each write transaction consists of
 3 calls to omap_setkeys,
 the actual call to write to the file system
 2 calls to setattr
 Proposal: coalesce the calls to omap_setkeys
 1 function call instead of 3; set 5 key-value pairs instead of 6 (one duplicate key)
 The official change was to coalesce at the higher PG layer
37
Other areas of investigation and improvement
 Lock analysis
 RWLock instead of mutex
 Start with CRC locks
 Bufferlist tuning
 Optimize for jumbo packets
 malloc issues
Copyright 2015 FUJITSU
38
Summary and Conclusion
ETERNUS CD10k is the safe way to make Ceph enterprise ready
 Unlimited Scalability: 4 to 224 nodes, scales up to >50 Petabyte
 Immortal System with Zero downtime: Non-disruptive add / remove /
exchange of hardware (disks and nodes) or Software update
 TCO optimized: Highly automated and fully integrated management reduces
operational efforts
 Tight integration in OpenStack with own GUI
Fujitsu will continue to enhance ease-of-use and performance
 This is important!
 As Ceph’s popularity increases, competitors will attack Ceph in these areas.
39 Copyright 2015 FUJITSU
pvonstamwitz@us.fujitsu.com