CEPH@DeutscheTelekom
A 2+ Years Production Liaison
Ievgen Nelen, Gerd Pruessmann - Deutsche Telekom AG, DBU Cloud Services, P&I
07.05.2015 2
Speakers
Ievgen Nelen
• Cloud Operations Engineer
• Ceph cuttlefish
• OpenStack diablo
• @eugene_nelen
• i.nelen@telekom.de
Gerd Prüßmann
• Head of Platform Engineering
• Ceph argonaut
• OpenStack cactus
• @2digitsLeft
• g.pruessmann@telekom.de
Overview
the business case
Overview
Business Marketplace
• https://portal.telekomcloud.com/
• SaaS applications from software partners (ISVs) and DT, offered to SME customers
• e.g. Saperion, Sage, PadCloud, Teamlike, Fastbill, Imeet, Weclapp, SilverERP, Teamdisk ...
• Complements other cloud offerings from Deutsche Telekom (Enterprise cloud from T-Systems, Cisco Intercloud, Mediencenter etc.)
• IaaS platform based only on Open Source technologies like OpenStack, CEPH and Linux
• Project started in 2012 with OpenStack Essex, CEPH in production since 3/2013 (Bobtail)
Overview
why open source? Why Ceph?
• no vendor lock-in!
• easier to change and adopt new technologies / concepts - more independent from vendor priorities
• low cost of ownership and operation, utilizing commodity hardware and Open Source
• no license fees - but professional support
• modular and horizontally scalable platform
• automation and flexibility allow for faster deployment cycles than in traditional hosting
• control over open source code - faster bug fixing and feature delivery
DETAILS
BASICS
DETAILS
ceph basics
• Bobtail > Cuttlefish > Dumpling > Firefly (0.80.9)
• Multiple CEPH clusters
• overall raw capacity 4.8 PB
• One S3 cluster (~810 TB raw capacity - 15 storage nodes - 3 MONs)
• multiple smaller RBD clusters for REF, LIFE and DEV
• S3 storage for cloud native apps (Teamdisk, Teamlike) and for backups (e.g. RBD)
• RBD for persistent volumes / data via OpenStack Cinder (e.g. DB volumes)
Details
ceph basics
DETAILS
Hardware
DETAILS
hardware
• Supermicro
  2x Intel Xeon E5-2640 v2 @ 2.00GHz
  64GB RAM
  7x SSDs
  18x HDDs
• Seagate Terascale ST4000NC000 4TB HDDs
• LSI MegaRAID SAS 9271-8i
• 18 OSDs per node: RAID1 with 2 SSDs for /, 3 RAID0 with 1 SSD each for journals, 18 RAID0 with 1 HDD each for OSDs
• 2x 10Gb network adapters
DETAILS
hardware
• Supermicro
  1x Intel Xeon E5-2650L @ 1.80GHz
  64GB RAM
  36x HDDs
• Seagate Barracuda ST3000DM001 3TB HDDs
• LSI MegaRAID SAS 9271-8i
• 10 OSDs per node: RAID1 for /, 10 RAID0 with 1 HDD each for journals, 10 RAID0 with 2 HDDs each for OSDs
• 2x 10Gb network adapters
Details
Configuration & deployment
details
configuration & deployment
• Razor
• Puppet
• https://github.com/TelekomCloud/puppet-ceph
• dm-crypt disk encryption
• osd location
• XFS
• 3 replica
• OMD/Check_mk http://omdistro.org/
• ceph-dash https://github.com/TelekomCloud/ceph-dash for dashboard and API
• check_mk plugins (cluster health, OSDs, S3)
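A minimal sketch of what a cluster health check could look like as a check_mk local check. This is not the plugin used on the platform; the service name `Ceph_Health`, the function names and the sample health string are assumptions made for illustration. It maps the first word of `ceph health` output to a check_mk status code.

```python
import subprocess

# Map Ceph health states to check_mk / Nagios status codes.
STATUS_CODES = {"HEALTH_OK": 0, "HEALTH_WARN": 1, "HEALTH_ERR": 2}


def format_local_check(health_output, service="Ceph_Health"):
    """Turn the text printed by `ceph health` into one local check line.

    Local check lines have the form: <status> <service> <perfdata> <text>.
    "-" means no performance data. The service name is a made-up example.
    """
    text = health_output.strip()
    state = text.split(" ", 1)[0] if text else ""
    code = STATUS_CODES.get(state, 3)  # 3 = UNKNOWN
    return "{} {} - {}".format(code, service, text or "no output from ceph health")


def read_ceph_health():
    """Run `ceph health` on a host with cluster access and return its output."""
    return subprocess.check_output(["ceph", "health"]).decode()


# Illustrative input only; on a real node use read_ceph_health() instead.
print(format_local_check("HEALTH_WARN 1 pgs inconsistent"))
```

Keeping the formatting separate from the `ceph` call makes the mapping easy to test without a running cluster.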
Details
performance tuning
details
performance tuning
• Problem - low IOPS, IOPS drops
• fio
• Enable RAID0 writeback cache
• Use separate disks for Ceph journals (better: SSDs, as in the scale out project)
• Problem - recovery/backfilling consumes a lot of CPU, decrease of performance
• osd_recovery_max_active 1: number of active recovery requests per OSD at one time
• osd_max_backfills 1: maximum number of backfills allowed to or from a single OSD
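The two throttling options above would typically live in the `[osd]` section of ceph.conf. The sketch below only renders such a fragment; the function name is hypothetical and the slides do not show how the settings were actually rolled out (the platform uses Puppet for configuration).

```python
import configparser
import io


def render_osd_throttle_section(max_active=1, max_backfills=1):
    """Render an [osd] section with the two recovery throttles from the slide."""
    config = configparser.ConfigParser()
    config["osd"] = {
        "osd recovery max active": str(max_active),
        "osd max backfills": str(max_backfills),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


print(render_osd_throttle_section())
```

Lower values slow down recovery but leave more CPU and disk bandwidth for client IO, which is the trade-off the slide describes.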
details
performance Tests – current hardware / IO
details
performance Tests – current hardware / bandwidth
lessons learned
lessons learned
operational experience
• Choose your hardware well!
• e.g. RAID and hard disks -> enterprise-grade disks (desktop HDDs are missing important features like TLER/ERC)
• CPU/RAM planning: calculate 1 GHz of CPU power and 2 GB of RAM per single OSD
• pick nodes with low storage capacity density for smaller clusters
• At least 5 nodes for a 3-replica cluster (i.e. for PoC, testing and development purposes)
• Cluster configuration “adjustments”:
• increasing PG num -> impact on cluster because of massive data migration
• Rolling software updates / upgrades worked perfectly
• CEPH has a character, but is highly reliable: never lost data
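The CPU/RAM rule of thumb above (1 GHz and 2 GB RAM per OSD) can be sketched as a quick sizing check. The function names are hypothetical, and the core count used in the example (8 cores per socket) is an assumption that does not appear on the slides; the rule covers OSD daemons only, not the operating system or other services.

```python
def osd_node_requirements(osd_count, ghz_per_osd=1.0, ram_gb_per_osd=2.0):
    """Apply the rule of thumb: 1 GHz of CPU and 2 GB of RAM per OSD."""
    return {
        "cpu_ghz": osd_count * ghz_per_osd,
        "ram_gb": osd_count * ram_gb_per_osd,
    }


def node_is_sufficient(osd_count, sockets, cores_per_socket, core_ghz, ram_gb):
    """Compare the aggregate clock rate and RAM of a node against the rule."""
    need = osd_node_requirements(osd_count)
    have_ghz = sockets * cores_per_socket * core_ghz
    return have_ghz >= need["cpu_ghz"] and ram_gb >= need["ram_gb"]


# 18 OSDs, 2 sockets at 2.0 GHz and 64 GB RAM are taken from the hardware slide;
# cores_per_socket=8 is an assumption for illustration.
print(osd_node_requirements(18))
print(node_is_sufficient(18, sockets=2, cores_per_socket=8, core_ghz=2.0, ram_gb=64))
```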
lessons learned
operational experience
• Failed / “slow” disks
• Inconsistent PGs
• Incomplete PGs
• RBD pool configured with min_size=2
• Blocks IO operations to the pool / cluster
• fixed in Hammer (allows PG replication while replica level below min_size pool/OSD)
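A deliberately simplified model of the min_size setting mentioned above: a placement group serves IO only while at least min_size replicas are available. This ignores peering, recovery and the Hammer change; the function name is hypothetical and the sketch only illustrates why min_size=2 on a 3-replica pool blocks IO once two copies are gone.

```python
def pg_accepts_io(available_replicas, min_size):
    """Return True if a PG with this many available replicas may serve IO."""
    return available_replicas >= min_size


# A 3-replica pool with min_size=2: losing one copy is tolerated,
# losing two copies blocks IO for the affected PGs.
for available in (3, 2, 1):
    print(available, pg_accepts_io(available, min_size=2))
```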
/var/log/syslog.log
Apr 12 04:59:47 cephosd5 kernel: [12473860.669262] sd 6:2:10:0: [sdk] Unhandled error code
root@cephosd5:/var/log# mount | grep sdk
/dev/mapper/cephosd5-journal-sdk on /var/lib/ceph/osd/journal-disk9
root@cephosd5:/var/log# grep journal-disk9 /etc/ceph/ceph.conf
osd journal = /var/lib/ceph/osd/journal-disk9/osd.151-journal
/var/log/ceph/ceph-osd.151.log.1.gz
2015-04-12 04:59:47.891284 7f8a10c76700 -1 journal FileJournal::do_write: pwrite(fd=25, hbp.length=4096) failed :(5) Input/output error
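The manual steps above (failing device -> journal mount point -> OSD) can be sketched as a small lookup over ceph.conf. This assumes journal paths follow the `osd.<id>-journal` naming seen in the excerpt; the function name and the second entry in the sample config are made up for illustration.

```python
import re

# Matches lines such as:
#   osd journal = /var/lib/ceph/osd/journal-disk9/osd.151-journal
JOURNAL_LINE = re.compile(
    r"osd journal\s*=\s*(?P<path>\S*/osd\.(?P<osd_id>\d+)-journal)"
)


def osds_with_journal_on(conf_text, mount_point):
    """Return the ids of all OSDs whose journal lives below mount_point."""
    prefix = mount_point.rstrip("/") + "/"
    ids = []
    for line in conf_text.splitlines():
        match = JOURNAL_LINE.search(line)
        if match and match.group("path").startswith(prefix):
            ids.append(int(match.group("osd_id")))
    return ids


# First entry is from the excerpt above; the second one is hypothetical.
SAMPLE_CONF = """
[osd.151]
osd journal = /var/lib/ceph/osd/journal-disk9/osd.151-journal
[osd.152]
osd journal = /var/lib/ceph/osd/journal-disk8/osd.152-journal
"""

print(osds_with_journal_on(SAMPLE_CONF, "/var/lib/ceph/osd/journal-disk9"))
```

Returning a list matters when several OSDs share one journal device, as in the SSD journal layout on the hardware slide.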
lessons learned
operational experience
lessons learned
incomplete PGs - what happened?
[Diagram: three OSD nodes, each with two OSDs and their journals; placement groups (pg) are spread across the nodes]
glimpse of the future
Overview
SCALE OUT Project
+40%
Current overall capacity:
• ~60 storage nodes
• 5.4 PB gross storage
• ~0.5 PB net S3 storage
Planned capacity for 2015:
• ~90 storage nodes
• 7.5 PB gross storage
• ~1.5 PB net S3 storage
Future setup
scale out project
• 2 physically separated rooms
• Data distributed according to the rule:
• not more than 2 replicas in one room, not more than 1 replica in one rack
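The placement rule above can be written down as a check over the locations of one object's replicas. This is a sketch for reasoning about the rule, not part of Ceph; the function name and the room/rack labels are hypothetical.

```python
from collections import Counter


def placement_ok(replica_locations, max_per_room=2, max_per_rack=1):
    """Check one object's replicas against the room and rack limits.

    replica_locations is a list of (room, rack) tuples, one per replica.
    """
    per_room = Counter(room for room, _rack in replica_locations)
    per_rack = Counter(replica_locations)  # a rack is identified by (room, rack)
    rooms_ok = all(count <= max_per_room for count in per_room.values())
    racks_ok = all(count <= max_per_rack for count in per_rack.values())
    return rooms_ok and racks_ok


# Two replicas in room1 (different racks) and one in room2 satisfies the rule.
print(placement_ok([("room1", "rackA"), ("room1", "rackB"), ("room2", "rackC")]))
# Three replicas in the same room violates it.
print(placement_ok([("room1", "rackA"), ("room1", "rackB"), ("room1", "rackC")]))
```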
Future setup
New crushmap rules
Listing 1: crushmap rule
rule myrule {
    ruleset 3
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 2 type room
    step chooseleaf firstn 2 type rack
    step emit
}
Listing 2: Simulate 1024 objects
crushtool -i real7 --test --show-statistics --rule 3 --min-x 1 --max-x 1024 --num-rep 3 --show-mappings
CRUSH rule 3 x 1 [12,19,15]
CRUSH rule 3 x 2 [14,16,13]
CRUSH rule 3 x 3 [3,0,7]
…
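A small sketch for post-processing the simulation output, assuming the `CRUSH rule <n> x <input> [<osd ids>]` line format shown in the listing. It collects the OSD set per input and flags inputs that received fewer OSDs than requested; the function names are hypothetical.

```python
import re

# Matches lines such as: CRUSH rule 3 x 1 [12,19,15]
MAPPING_LINE = re.compile(r"CRUSH rule (\d+) x (\d+) \[([\d,]*)\]")


def parse_mappings(output):
    """Return {input number: [osd ids]} for every mapping line in the output."""
    mappings = {}
    for line in output.splitlines():
        match = MAPPING_LINE.match(line.strip())
        if match:
            osds = [int(osd) for osd in match.group(3).split(",") if osd]
            mappings[int(match.group(2))] = osds
    return mappings


def incomplete_inputs(mappings, num_rep):
    """Inputs that were mapped to fewer than num_rep OSDs."""
    return sorted(x for x, osds in mappings.items() if len(osds) < num_rep)


# The three mapping lines shown in the listing.
SAMPLE_OUTPUT = """CRUSH rule 3 x 1 [12,19,15]
CRUSH rule 3 x 2 [14,16,13]
CRUSH rule 3 x 3 [3,0,7]
"""

mappings = parse_mappings(SAMPLE_OUTPUT)
print(mappings)
print(incomplete_inputs(mappings, num_rep=3))
```

Combined with a lookup from OSD id to room and rack, the parsed mappings can be checked against the placement rule before the new crushmap goes into production.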
Future setup
dreams
• cache tiering
• make use of shiny new SSDs in a hot zone / cache pool
• SSD pools
• OpenStack live migration for VMs (boot from RBD volume)
Q & A
QUESTIONS & ANSWERS
• Ievgen Nelen
• @eugene_nelen
• i.nelen@telekom.de
• Gerd Prüßmann
• @2digitsLeft
• g.pruessmann@telekom.de

More Related Content

What's hot

Build an affordable Cloud Stroage
Build an affordable Cloud StroageBuild an affordable Cloud Stroage
Build an affordable Cloud StroageAlex Lau
 
Ceph Day Beijing- Ceph Community Update
Ceph Day Beijing- Ceph Community UpdateCeph Day Beijing- Ceph Community Update
Ceph Day Beijing- Ceph Community UpdateDanielle Womboldt
 
openSUSE storage workshop 2016
openSUSE storage workshop 2016openSUSE storage workshop 2016
openSUSE storage workshop 2016Alex Lau
 
Ceph Day Bring Ceph To Enterprise
Ceph Day Bring Ceph To EnterpriseCeph Day Bring Ceph To Enterprise
Ceph Day Bring Ceph To EnterpriseAlex Lau
 
Ceph Day San Jose - From Zero to Ceph in One Minute
Ceph Day San Jose - From Zero to Ceph in One Minute Ceph Day San Jose - From Zero to Ceph in One Minute
Ceph Day San Jose - From Zero to Ceph in One Minute Ceph Community
 
Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server
Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server
Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server Ceph Community
 
Ceph Day San Jose - Red Hat Storage Acceleration Utlizing Flash Technology
Ceph Day San Jose - Red Hat Storage Acceleration Utlizing Flash TechnologyCeph Day San Jose - Red Hat Storage Acceleration Utlizing Flash Technology
Ceph Day San Jose - Red Hat Storage Acceleration Utlizing Flash TechnologyCeph Community
 
Ceph Day Tokyo -- Ceph on All-Flash Storage
Ceph Day Tokyo -- Ceph on All-Flash StorageCeph Day Tokyo -- Ceph on All-Flash Storage
Ceph Day Tokyo -- Ceph on All-Flash StorageCeph Community
 
Ceph Day Beijing - SPDK for Ceph
Ceph Day Beijing - SPDK for CephCeph Day Beijing - SPDK for Ceph
Ceph Day Beijing - SPDK for CephDanielle Womboldt
 
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph Ceph Community
 
Ceph Day Beijing - Storage Modernization with Intel and Ceph
Ceph Day Beijing - Storage Modernization with Intel and CephCeph Day Beijing - Storage Modernization with Intel and Ceph
Ceph Day Beijing - Storage Modernization with Intel and CephDanielle Womboldt
 
SLE12 SP2 : High Availability et Geo Cluster
SLE12 SP2 : High Availability et Geo ClusterSLE12 SP2 : High Availability et Geo Cluster
SLE12 SP2 : High Availability et Geo ClusterSUSE
 
Ceph Day San Jose - HA NAS with CephFS
Ceph Day San Jose - HA NAS with CephFSCeph Day San Jose - HA NAS with CephFS
Ceph Day San Jose - HA NAS with CephFSCeph Community
 
Ambedded - how to build a true no single point of failure ceph cluster
Ambedded - how to build a true no single point of failure ceph cluster Ambedded - how to build a true no single point of failure ceph cluster
Ambedded - how to build a true no single point of failure ceph cluster inwin stack
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing GuideJose De La Rosa
 
Ceph optimized Storage / Global HW solutions for SDS, David Alvarez
Ceph optimized Storage / Global HW solutions for SDS, David AlvarezCeph optimized Storage / Global HW solutions for SDS, David Alvarez
Ceph optimized Storage / Global HW solutions for SDS, David AlvarezCeph Community
 
Walk Through a Software Defined Everything PoC
Walk Through a Software Defined Everything PoCWalk Through a Software Defined Everything PoC
Walk Through a Software Defined Everything PoCCeph Community
 

What's hot (17)

Build an affordable Cloud Stroage
Build an affordable Cloud StroageBuild an affordable Cloud Stroage
Build an affordable Cloud Stroage
 
Ceph Day Beijing- Ceph Community Update
Ceph Day Beijing- Ceph Community UpdateCeph Day Beijing- Ceph Community Update
Ceph Day Beijing- Ceph Community Update
 
openSUSE storage workshop 2016
openSUSE storage workshop 2016openSUSE storage workshop 2016
openSUSE storage workshop 2016
 
Ceph Day Bring Ceph To Enterprise
Ceph Day Bring Ceph To EnterpriseCeph Day Bring Ceph To Enterprise
Ceph Day Bring Ceph To Enterprise
 
Ceph Day San Jose - From Zero to Ceph in One Minute
Ceph Day San Jose - From Zero to Ceph in One Minute Ceph Day San Jose - From Zero to Ceph in One Minute
Ceph Day San Jose - From Zero to Ceph in One Minute
 
Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server
Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server
Ceph Day San Jose - All-Flahs Ceph on NUMA-Balanced Server
 
Ceph Day San Jose - Red Hat Storage Acceleration Utlizing Flash Technology
Ceph Day San Jose - Red Hat Storage Acceleration Utlizing Flash TechnologyCeph Day San Jose - Red Hat Storage Acceleration Utlizing Flash Technology
Ceph Day San Jose - Red Hat Storage Acceleration Utlizing Flash Technology
 
Ceph Day Tokyo -- Ceph on All-Flash Storage
Ceph Day Tokyo -- Ceph on All-Flash StorageCeph Day Tokyo -- Ceph on All-Flash Storage
Ceph Day Tokyo -- Ceph on All-Flash Storage
 
Ceph Day Beijing - SPDK for Ceph
Ceph Day Beijing - SPDK for CephCeph Day Beijing - SPDK for Ceph
Ceph Day Beijing - SPDK for Ceph
 
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
Ceph Day Shanghai - Hyper Converged PLCloud with Ceph
 
Ceph Day Beijing - Storage Modernization with Intel and Ceph
Ceph Day Beijing - Storage Modernization with Intel and CephCeph Day Beijing - Storage Modernization with Intel and Ceph
Ceph Day Beijing - Storage Modernization with Intel and Ceph
 
SLE12 SP2 : High Availability et Geo Cluster
SLE12 SP2 : High Availability et Geo ClusterSLE12 SP2 : High Availability et Geo Cluster
SLE12 SP2 : High Availability et Geo Cluster
 
Ceph Day San Jose - HA NAS with CephFS
Ceph Day San Jose - HA NAS with CephFSCeph Day San Jose - HA NAS with CephFS
Ceph Day San Jose - HA NAS with CephFS
 
Ambedded - how to build a true no single point of failure ceph cluster
Ambedded - how to build a true no single point of failure ceph cluster Ambedded - how to build a true no single point of failure ceph cluster
Ambedded - how to build a true no single point of failure ceph cluster
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing Guide
 
Ceph optimized Storage / Global HW solutions for SDS, David Alvarez
Ceph optimized Storage / Global HW solutions for SDS, David AlvarezCeph optimized Storage / Global HW solutions for SDS, David Alvarez
Ceph optimized Storage / Global HW solutions for SDS, David Alvarez
 
Walk Through a Software Defined Everything PoC
Walk Through a Software Defined Everything PoCWalk Through a Software Defined Everything PoC
Walk Through a Software Defined Everything PoC
 

Viewers also liked

Accumulo Summit 2015: Ambari and Accumulo: HDP 2.3 Upcoming Features [Sponsored]
Accumulo Summit 2015: Ambari and Accumulo: HDP 2.3 Upcoming Features [Sponsored]Accumulo Summit 2015: Ambari and Accumulo: HDP 2.3 Upcoming Features [Sponsored]
Accumulo Summit 2015: Ambari and Accumulo: HDP 2.3 Upcoming Features [Sponsored]Accumulo Summit
 
методичні рекомендації по_дпа_з_хімії
методичні рекомендації по_дпа_з_хіміїметодичні рекомендації по_дпа_з_хімії
методичні рекомендації по_дпа_з_хіміїInna Pavlova
 
Finpron kansainvälistymispalvelut
Finpron kansainvälistymispalvelut Finpron kansainvälistymispalvelut
Finpron kansainvälistymispalvelut K2HEL
 
Horisontti 2020 -ohjelma, Pk-instrumentti Horisontto 2020 -ohjelmassa, COSME
Horisontti 2020 -ohjelma, Pk-instrumentti Horisontto 2020 -ohjelmassa, COSMEHorisontti 2020 -ohjelma, Pk-instrumentti Horisontto 2020 -ohjelmassa, COSME
Horisontti 2020 -ohjelma, Pk-instrumentti Horisontto 2020 -ohjelmassa, COSMEK2HEL
 
Jurang pencapaian matematik di sekolah menengah
Jurang pencapaian matematik di sekolah menengahJurang pencapaian matematik di sekolah menengah
Jurang pencapaian matematik di sekolah menengahFarah Waheeda
 
الفصل الرابع من مانجا الرجل دو اللكمة الواحدة - one punch man
الفصل الرابع من مانجا الرجل دو اللكمة الواحدة - one punch manالفصل الرابع من مانجا الرجل دو اللكمة الواحدة - one punch man
الفصل الرابع من مانجا الرجل دو اللكمة الواحدة - one punch manSidi Mohamed
 
Control of Dvr with Battery Energy Storage System Using Srf Theory
Control of Dvr with Battery Energy Storage System Using Srf TheoryControl of Dvr with Battery Energy Storage System Using Srf Theory
Control of Dvr with Battery Energy Storage System Using Srf TheoryIJERA Editor
 
Rezultati NALED-a 2014.
Rezultati NALED-a 2014.Rezultati NALED-a 2014.
Rezultati NALED-a 2014.NALED Serbia
 
Ceph Day Berlin: Scaling an Academic Cloud
Ceph Day Berlin: Scaling an Academic CloudCeph Day Berlin: Scaling an Academic Cloud
Ceph Day Berlin: Scaling an Academic CloudCeph Community
 
Tips on How to be More Productive at Work
Tips on How to be More Productive at WorkTips on How to be More Productive at Work
Tips on How to be More Productive at WorkBuzz Marketing Pros
 
Οιογένεια και σχολείο: Συμβουλές επικοινωνίας - 1
Οιογένεια και σχολείο: Συμβουλές επικοινωνίας - 1Οιογένεια και σχολείο: Συμβουλές επικοινωνίας - 1
Οιογένεια και σχολείο: Συμβουλές επικοινωνίας - 1parentbook
 
Referaadi vormistamise qr koodid
Referaadi vormistamise qr koodidReferaadi vormistamise qr koodid
Referaadi vormistamise qr koodidTriinu Pääsik
 
Joseph J Paczelt Resume
Joseph J Paczelt ResumeJoseph J Paczelt Resume
Joseph J Paczelt ResumeJoseph Paczelt
 
Углеводы, минеральные вещества
Углеводы, минеральные веществаУглеводы, минеральные вещества
Углеводы, минеральные веществаqwer78
 

Viewers also liked (20)

C-2014-4-Meijer-EN
C-2014-4-Meijer-ENC-2014-4-Meijer-EN
C-2014-4-Meijer-EN
 
Accumulo Summit 2015: Ambari and Accumulo: HDP 2.3 Upcoming Features [Sponsored]
Accumulo Summit 2015: Ambari and Accumulo: HDP 2.3 Upcoming Features [Sponsored]Accumulo Summit 2015: Ambari and Accumulo: HDP 2.3 Upcoming Features [Sponsored]
Accumulo Summit 2015: Ambari and Accumulo: HDP 2.3 Upcoming Features [Sponsored]
 
методичні рекомендації по_дпа_з_хімії
методичні рекомендації по_дпа_з_хіміїметодичні рекомендації по_дпа_з_хімії
методичні рекомендації по_дпа_з_хімії
 
Finpron kansainvälistymispalvelut
Finpron kansainvälistymispalvelut Finpron kansainvälistymispalvelut
Finpron kansainvälistymispalvelut
 
Horisontti 2020 -ohjelma, Pk-instrumentti Horisontto 2020 -ohjelmassa, COSME
Horisontti 2020 -ohjelma, Pk-instrumentti Horisontto 2020 -ohjelmassa, COSMEHorisontti 2020 -ohjelma, Pk-instrumentti Horisontto 2020 -ohjelmassa, COSME
Horisontti 2020 -ohjelma, Pk-instrumentti Horisontto 2020 -ohjelmassa, COSME
 
Jurang pencapaian matematik di sekolah menengah
Jurang pencapaian matematik di sekolah menengahJurang pencapaian matematik di sekolah menengah
Jurang pencapaian matematik di sekolah menengah
 
الفصل الرابع من مانجا الرجل دو اللكمة الواحدة - one punch man
الفصل الرابع من مانجا الرجل دو اللكمة الواحدة - one punch manالفصل الرابع من مانجا الرجل دو اللكمة الواحدة - one punch man
الفصل الرابع من مانجا الرجل دو اللكمة الواحدة - one punch man
 
Презентация Re flame
Презентация Re flameПрезентация Re flame
Презентация Re flame
 
Control of Dvr with Battery Energy Storage System Using Srf Theory
Control of Dvr with Battery Energy Storage System Using Srf TheoryControl of Dvr with Battery Energy Storage System Using Srf Theory
Control of Dvr with Battery Energy Storage System Using Srf Theory
 
Rezultati NALED-a 2014.
Rezultati NALED-a 2014.Rezultati NALED-a 2014.
Rezultati NALED-a 2014.
 
Ceph Day Berlin: Scaling an Academic Cloud
Ceph Day Berlin: Scaling an Academic CloudCeph Day Berlin: Scaling an Academic Cloud
Ceph Day Berlin: Scaling an Academic Cloud
 
AdWords UI と API
AdWords UI と APIAdWords UI と API
AdWords UI と API
 
Kundemålskiven
KundemålskivenKundemålskiven
Kundemålskiven
 
Tips on How to be More Productive at Work
Tips on How to be More Productive at WorkTips on How to be More Productive at Work
Tips on How to be More Productive at Work
 
Οιογένεια και σχολείο: Συμβουλές επικοινωνίας - 1
Οιογένεια και σχολείο: Συμβουλές επικοινωνίας - 1Οιογένεια και σχολείο: Συμβουλές επικοινωνίας - 1
Οιογένεια και σχολείο: Συμβουλές επικοινωνίας - 1
 
Referaadi vormistamise qr koodid
Referaadi vormistamise qr koodidReferaadi vormistamise qr koodid
Referaadi vormistamise qr koodid
 
Joseph J Paczelt Resume
Joseph J Paczelt ResumeJoseph J Paczelt Resume
Joseph J Paczelt Resume
 
Углеводы, минеральные вещества
Углеводы, минеральные веществаУглеводы, минеральные вещества
Углеводы, минеральные вещества
 
«Рейнська Сивілла»
«Рейнська Сивілла» «Рейнська Сивілла»
«Рейнська Сивілла»
 
Cabinet
CabinetCabinet
Cabinet
 

Similar to Ceph Day Berlin: CEPH@DeutscheTelekom - a 2+ years production liaison

Xap memory xtend-tutorial-2014
Xap memory xtend-tutorial-2014Xap memory xtend-tutorial-2014
Xap memory xtend-tutorial-2014Shay Hassidim
 
Azure VM 101 - HomeGen by CloudGen Verona - Marco Obinu
Azure VM 101 - HomeGen by CloudGen Verona - Marco ObinuAzure VM 101 - HomeGen by CloudGen Verona - Marco Obinu
Azure VM 101 - HomeGen by CloudGen Verona - Marco ObinuMarco Obinu
 
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraBackup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraCeph Community
 
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...Ceph Community
 
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architectureCeph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architectureCeph Community
 
2013 09-02 senzations-bimschas-part4-setting-up-your-own-testbed
2013 09-02 senzations-bimschas-part4-setting-up-your-own-testbed2013 09-02 senzations-bimschas-part4-setting-up-your-own-testbed
2013 09-02 senzations-bimschas-part4-setting-up-your-own-testbedDaniel Bimschas
 
Red Hat Storage Day Dallas - Red Hat Ceph Storage Acceleration Utilizing Flas...
Red Hat Storage Day Dallas - Red Hat Ceph Storage Acceleration Utilizing Flas...Red Hat Storage Day Dallas - Red Hat Ceph Storage Acceleration Utilizing Flas...
Red Hat Storage Day Dallas - Red Hat Ceph Storage Acceleration Utilizing Flas...Red_Hat_Storage
 
Running E-Business Suite Database on Oracle Database Appliance
Running E-Business Suite Database on Oracle Database ApplianceRunning E-Business Suite Database on Oracle Database Appliance
Running E-Business Suite Database on Oracle Database ApplianceMaris Elsins
 
Hostvn ceph in production v1.1 dungtq
Hostvn   ceph in production v1.1 dungtqHostvn   ceph in production v1.1 dungtq
Hostvn ceph in production v1.1 dungtqViet Stack
 
Cisco connect montreal 2018 compute v final
Cisco connect montreal 2018   compute v finalCisco connect montreal 2018   compute v final
Cisco connect montreal 2018 compute v finalCisco Canada
 
Red Hat Ceph Storage Acceleration Utilizing Flash Technology
Red Hat Ceph Storage Acceleration Utilizing Flash Technology Red Hat Ceph Storage Acceleration Utilizing Flash Technology
Red Hat Ceph Storage Acceleration Utilizing Flash Technology Red_Hat_Storage
 
PostgreSQL High Availability in a Containerized World
PostgreSQL High Availability in a Containerized WorldPostgreSQL High Availability in a Containerized World
PostgreSQL High Availability in a Containerized WorldJignesh Shah
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...Glenn K. Lockwood
 
Are your ready for in memory applications?
Are your ready for in memory applications?Are your ready for in memory applications?
Are your ready for in memory applications?G2MCommunications
 
Scylla Summit 2018: The Short and Straight Road That Leads from Cassandra to ...
Scylla Summit 2018: The Short and Straight Road That Leads from Cassandra to ...Scylla Summit 2018: The Short and Straight Road That Leads from Cassandra to ...
Scylla Summit 2018: The Short and Straight Road That Leads from Cassandra to ...ScyllaDB
 
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User StoreAzure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User StoreDataStax Academy
 
The Forefront of the Development for NVDIMM on Linux Kernel
The Forefront of the Development for NVDIMM on Linux KernelThe Forefront of the Development for NVDIMM on Linux Kernel
The Forefront of the Development for NVDIMM on Linux KernelYasunori Goto
 

Similar to Ceph Day Berlin: CEPH@DeutscheTelekom - a 2+ years production liaison (20)

Xap memory xtend-tutorial-2014
Xap memory xtend-tutorial-2014Xap memory xtend-tutorial-2014
Xap memory xtend-tutorial-2014
 
Azure VM 101 - HomeGen by CloudGen Verona - Marco Obinu
Azure VM 101 - HomeGen by CloudGen Verona - Marco ObinuAzure VM 101 - HomeGen by CloudGen Verona - Marco Obinu
Azure VM 101 - HomeGen by CloudGen Verona - Marco Obinu
 
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix BarbeiraBackup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
Backup management with Ceph Storage - Camilo Echevarne, Félix Barbeira
 
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto...
 
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architectureCeph Day Beijing - Ceph all-flash array design based on NUMA architecture
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture
 
2013 09-02 senzations-bimschas-part4-setting-up-your-own-testbed
2013 09-02 senzations-bimschas-part4-setting-up-your-own-testbed2013 09-02 senzations-bimschas-part4-setting-up-your-own-testbed
2013 09-02 senzations-bimschas-part4-setting-up-your-own-testbed
 
IaaS for DBAs in Azure
IaaS for DBAs in AzureIaaS for DBAs in Azure
IaaS for DBAs in Azure
 
Red Hat Storage Day Dallas - Red Hat Ceph Storage Acceleration Utilizing Flas...
Red Hat Storage Day Dallas - Red Hat Ceph Storage Acceleration Utilizing Flas...Red Hat Storage Day Dallas - Red Hat Ceph Storage Acceleration Utilizing Flas...
Red Hat Storage Day Dallas - Red Hat Ceph Storage Acceleration Utilizing Flas...
 
Running E-Business Suite Database on Oracle Database Appliance
Running E-Business Suite Database on Oracle Database ApplianceRunning E-Business Suite Database on Oracle Database Appliance
Running E-Business Suite Database on Oracle Database Appliance
 
Hostvn ceph in production v1.1 dungtq
Hostvn   ceph in production v1.1 dungtqHostvn   ceph in production v1.1 dungtq
Hostvn ceph in production v1.1 dungtq
 
Hostvn ceph in production v1.1 dungtq
Hostvn   ceph in production v1.1 dungtqHostvn   ceph in production v1.1 dungtq
Hostvn ceph in production v1.1 dungtq
 
Cisco connect montreal 2018 compute v final
Cisco connect montreal 2018   compute v finalCisco connect montreal 2018   compute v final
Cisco connect montreal 2018 compute v final
 
Red Hat Ceph Storage Acceleration Utilizing Flash Technology
Red Hat Ceph Storage Acceleration Utilizing Flash Technology Red Hat Ceph Storage Acceleration Utilizing Flash Technology
Red Hat Ceph Storage Acceleration Utilizing Flash Technology
 
PostgreSQL High Availability in a Containerized World
PostgreSQL High Availability in a Containerized WorldPostgreSQL High Availability in a Containerized World
PostgreSQL High Availability in a Containerized World
 
Stabilizing Ceph
Stabilizing CephStabilizing Ceph
Stabilizing Ceph
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
 
Are your ready for in memory applications?
Are your ready for in memory applications?Are your ready for in memory applications?
Are your ready for in memory applications?
 
Scylla Summit 2018: The Short and Straight Road That Leads from Cassandra to ...
Scylla Summit 2018: The Short and Straight Road That Leads from Cassandra to ...Scylla Summit 2018: The Short and Straight Road That Leads from Cassandra to ...
Scylla Summit 2018: The Short and Straight Road That Leads from Cassandra to ...
 
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User StoreAzure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
 
The Forefront of the Development for NVDIMM on Linux Kernel
The Forefront of the Development for NVDIMM on Linux KernelThe Forefront of the Development for NVDIMM on Linux Kernel
The Forefront of the Development for NVDIMM on Linux Kernel
 

Recently uploaded

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...


Ceph Day Berlin: CEPH@DeutscheTelekom - a 2+ years production liaison

  • 1. CEPH@DeutscheTelekom: A 2+ Years Production Liaison. Ievgen Nelen, Gerd Pruessmann - Deutsche Telekom AG, DBU Cloud Services, P&I
  • 2. 07.05.2015 Speakers: Ievgen Nelen & Gerd Prüßmann • Cloud Operations Engineer • Ceph cuttlefish • OpenStack diablo • @eugene_nelen • i.nelen@telekom.de • Head of Platform Engineering • Ceph argonaut • OpenStack cactus • @2digitsLeft • g.pruessmann@telekom.de
  • 4. Overview: Business Marketplace • https://portal.telekomcloud.com/ • SaaS applications from software partners (ISVs) and DT, offered to SME customers • e.g. Saperion, Sage, PadCloud, Teamlike, Fastbill, Imeet, Weclapp, SilverERP, Teamdisk ... • Complements other cloud offerings from Deutsche Telekom (Enterprise cloud from T-Systems, Cisco Intercloud, Mediencenter etc.) • IaaS platform based only on open source technologies like OpenStack, Ceph and Linux • Project started in 2012 with OpenStack Essex, Ceph in production since 3/2013 (bobtail)
  • 5. Overview: Why open source? Why Ceph? • no vendor lock-in! • easier to change and adopt new technologies / concepts - more independent from vendor priorities • low cost of ownership and operation, utilizing commodity hardware and open source • no license fees - but professional support • modular and horizontally scalable platform • automation and flexibility allow for faster deployment cycles than in traditional hosting • control over open source code - faster bug fixing and feature delivery
  • 7. Details: Ceph basics • Bobtail > Cuttlefish > Dumpling > Firefly (0.80.9) • Multiple Ceph clusters • overall raw capacity 4.8 PB • One S3 cluster (~810 TB raw capacity - 15 storage nodes - 3 MONs) • multiple smaller RBD clusters for REF, LIFE and DEV • S3 storage for cloud native apps (Teamdisk, Teamlike) and for backups (e.g. RBD) • RBD for persistent volumes / data via OpenStack Cinder (e.g. DB volumes)
  • 11. Details: hardware • Supermicro, 2x Intel Xeon E5-2640 v2 @ 2.00 GHz, 64 GB RAM, 7x SSDs, 18x HDDs • Seagate Terascale ST4000NC000 4 TB HDDs • LSI MegaRAID SAS 9271-8i • 18 OSDs per node: RAID1 with 2 SSD for /, 3 RAID0 with 1 SSD for journals, 18 RAID0 with 1 HDD for OSD • 2x 10Gb network adapters • Supermicro, 1x Intel Xeon E5-2650L @ 1.80 GHz, 64 GB RAM, 36x HDDs • Seagate Barracuda ST3000DM001 3 TB HDDs • LSI MegaRAID SAS 9271-8i • 10 OSDs per node: RAID1 for /, 10 RAID0 with 1 HDD for journals, 10 RAID0 with 2 HDD for OSD • 2x 10Gb network adapters
  • 13. Details: configuration & deployment • Razor • Puppet • https://github.com/TelekomCloud/puppet-ceph • dm-crypt disk encryption • osd location • XFS • 3 replica • OMD/Check_mk http://omdistro.org/ • ceph-dash https://github.com/TelekomCloud/ceph-dash for dashboard and API • check_mk plugins (cluster health, OSDs, S3)
  • 15. Details: performance tuning • Problem - low IOPS, IOPS drops • fio • Enable RAID0 writeback cache • Use separate disks for Ceph journals (better: use SSDs, as in the scale-out project) • Problem - recovery/backfilling consumes a lot of CPU, decrease of performance • osd_recovery_max_active 1 (number of active recovery requests per OSD at one time) • osd_max_backfills 1 (maximum number of backfills allowed to or from a single OSD)
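The two recovery throttles named on slide 15 are ordinary OSD options, so they can be kept in the `[osd]` section of `ceph.conf`. The following is a minimal sketch, assuming the cluster reads an INI-style `ceph.conf`; the helper name `render_osd_throttle` is hypothetical and the output is only a fragment, not the speakers' actual configuration.

```python
import configparser
import io


def render_osd_throttle(recovery_max_active: int = 1, max_backfills: int = 1) -> str:
    """Render an [osd] section with the two throttles from slide 15.

    Hypothetical helper: it only produces text and does not touch a cluster.
    """
    parser = configparser.ConfigParser()
    parser["osd"] = {
        # number of active recovery requests per OSD at one time
        "osd_recovery_max_active": str(recovery_max_active),
        # maximum number of backfills allowed to or from a single OSD
        "osd_max_backfills": str(max_backfills),
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_osd_throttle())
```

Both values are set to 1 here because that is what the slide shows; higher values speed up recovery at the cost of client I/O.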
  • 16. Details: performance tests, current hardware / IO
  • 17. Details: performance tests, current hardware / bandwidth
  • 19. Lessons learned: operational experience • Choose your hardware well! • e.g. RAID and hard disks -> enterprise-grade disks (desktop HDDs are missing important features like TLER/ERC) • CPU/RAM planning: calculate 1 GHz of CPU power and 2 GB of RAM per single OSD • pick nodes with low storage capacity density for smaller clusters • At least 5 nodes for a 3-replica cluster (e.g. for PoC, testing and development purposes) • Cluster configuration “adjustments”: • increasing PG num -> impact on cluster because of massive data migration • Rolling software updates / upgrades worked perfectly • Ceph has a character, but is highly reliable - never lost data
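The sizing rule on slide 19 (1 GHz of CPU and 2 GB of RAM per OSD) can be checked against the two node types from slide 11. This is a rough sketch under stated assumptions: the core counts (8 per socket) are not given in the slides and are assumed here, and "CPU power" is read as sockets x cores x clock rate. The names `Node` and `sizing_report` are hypothetical.

```python
from dataclasses import dataclass

GHZ_PER_OSD = 1.0      # rule of thumb from slide 19
RAM_GB_PER_OSD = 2.0   # rule of thumb from slide 19


@dataclass
class Node:
    name: str
    osds: int
    sockets: int
    cores_per_socket: int  # ASSUMPTION: not stated in the slides
    ghz: float
    ram_gb: float


def sizing_report(node: Node) -> dict:
    """Compare available CPU/RAM with the per-OSD rule of thumb."""
    have_ghz = node.sockets * node.cores_per_socket * node.ghz
    need_ghz = node.osds * GHZ_PER_OSD
    need_ram = node.osds * RAM_GB_PER_OSD
    return {
        "need_ghz": need_ghz,
        "have_ghz": have_ghz,
        "need_ram_gb": need_ram,
        "have_ram_gb": node.ram_gb,
        "ok": have_ghz >= need_ghz and node.ram_gb >= need_ram,
    }


if __name__ == "__main__":
    nodes = [
        Node("2x E5-2640 v2, 18 OSDs", osds=18, sockets=2, cores_per_socket=8, ghz=2.0, ram_gb=64),
        Node("1x E5-2650L, 10 OSDs", osds=10, sockets=1, cores_per_socket=8, ghz=1.8, ram_gb=64),
    ]
    for n in nodes:
        print(n.name, sizing_report(n))
```

Under these assumptions both node types satisfy the rule; the check says nothing about recovery load, where CPU demand per OSD rises.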
  • 20. Lessons learned: operational experience • Failed / “slow” disks • Inconsistent PGs • Incomplete PGs • RBD pool configured with min_size=2 • Blocks IO operations to the pool / cluster • fixed in Hammer (allows PG replication while replica level below min_size pool/OSD)
  • 21. Lessons learned: operational experience. /var/log/syslog.log: Apr 12 04:59:47 cephosd5 kernel: [12473860.669262] sd 6:2:10:0: [sdk] Unhandled error code. root@cephosd5:/var/log# mount | grep sdk -> /dev/mapper/cephosd5-journal-sdk on /var/lib/ceph/osd/journal-disk9. root@cephosd5:/var/log# grep journal-disk9 /etc/ceph/ceph.conf -> osd journal = /var/lib/ceph/osd/journal-disk9/osd.151-journal. /var/log/ceph/ceph-osd.151.log.1.gz: 2015-04-12 04:59:47.891284 7f8a10c76700 -1 journal FileJournal::do_write: pwrite(fd=25, hbp.length=4096) failed :(5) Input/output error
  • 22. Lessons learned: incomplete PGs - what happened? (Diagram: three OSD nodes, each with OSDs, their journals and PGs)
  • 24. Overview: scale-out project, +40% • Current overall capacity: ~60 storage nodes, 5.4 PB storage gross, ~0.5 PB S3 storage net • Planned capacity for 2015: ~90 storage nodes, 7.5 PB storage gross, ~1.5 PB S3 storage net
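As a quick arithmetic check of the scale-out figures on slide 24, the relative growth of each quantity can be computed directly. This is a small sketch using only the numbers on the slide; the function name `growth_pct` is hypothetical. The "+40%" headline appears to refer to gross storage, which is an interpretation rather than something the slide states.

```python
def growth_pct(current: float, planned: float) -> float:
    """Relative growth from current to planned, in percent (one decimal)."""
    return round((planned - current) / current * 100, 1)


# Figures from slide 24: (current, planned for 2015)
FIGURES = {
    "storage nodes": (60, 90),
    "gross storage (PB)": (5.4, 7.5),
    "S3 net storage (PB)": (0.5, 1.5),
}

if __name__ == "__main__":
    for label, (cur, plan) in FIGURES.items():
        print(f"{label}: {cur} -> {plan} ({growth_pct(cur, plan):+.1f}%)")
```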
  • 25. Future setup: scale-out project • 2 physically separated rooms • Data distributed according to the rule: • not more than 2 replicas in one room • not more than 1 replica in one rack
  • 26. Future setup: new crushmap rules. Listing 1 (crushmap rule): rule myrule { ruleset 3 type replicated min_size 1 max_size 10 step take default step choose firstn 2 type room step chooseleaf firstn 2 type rack step emit } Listing 2 (simulate 1024 objects): crushtool -i real7 --test --show-statistics --rule 3 --min-x 1 --max-x 1024 --num-rep 3 --show-mappings CRUSH rule 3 x 1 [12,19,15] CRUSH rule 3 x 2 [14,16,13] CRUSH rule 3 x 3 [3,0,7] …
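The rule in Listing 1 picks 2 rooms and then 2 racks in each, so with 3 replicas the first three of up to four candidate OSDs are used; that is what gives "at most 2 replicas per room, at most 1 per rack". Mappings such as those printed by `crushtool --show-mappings` can be checked against that constraint with a small validator. The sketch below assumes a purely hypothetical OSD-to-(room, rack) table; the real topology is not in the slides, so the sample OSD ids are illustrative only.

```python
from collections import Counter

# HYPOTHETICAL topology: OSD id -> (room, rack). Not the real cluster map.
TOPOLOGY = {
    0: ("room1", "rack1"), 1: ("room1", "rack1"),
    2: ("room1", "rack2"), 3: ("room1", "rack2"),
    4: ("room2", "rack3"), 5: ("room2", "rack3"),
    6: ("room2", "rack4"), 7: ("room2", "rack4"),
    8: ("room1", "rack5"),
}


def placement_ok(osds, topology=TOPOLOGY, max_per_room=2, max_per_rack=1):
    """Return True if a replica set respects the room and rack limits."""
    rooms = Counter(topology[o][0] for o in osds)
    racks = Counter(topology[o] for o in osds)  # key is (room, rack)
    return (
        max(rooms.values(), default=0) <= max_per_room
        and max(racks.values(), default=0) <= max_per_rack
    )


if __name__ == "__main__":
    for mapping in ([0, 2, 4], [0, 1, 4], [0, 2, 8]):
        print(mapping, placement_ok(mapping))
```

Feeding every mapping from a 1024-object simulation through such a check is a cheap way to confirm a new rule before injecting the crushmap.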
  • 27. Future setup: dreams • cache tiering • make use of shiny new SSDs in a hot zone / cache pool • SSD pools • OpenStack live migration for VMs (boot from RBD volume)
  • 28. Q & A
  • 29. Questions & Answers • Ievgen Nelen • @eugene_nelen • i.nelen@telekom.de • Gerd Prüßmann • @2digitsLeft • g.pruessmann@telekom.de