Ceph for Big Science
Dan van der Ster, CERN IT
Cephalocon APAC 2018
23 March 2018 | Beijing
27 km circumference
5
~30MHz interactions filtered to ~1kHz recorded collisions
6
~30MHz interactions filtered to ~1kHz recorded collisions
ATLAS Detector, 100m underground
Higgs Boson Candidate
300 petabytes storage, 230 000 CPU cores
Worldwide LHC Compute Grid
Worldwide LHC Compute Grid
Beijing IHEP
WLCG Centre
Ceph at CERN: Yesterday & Today
12
14
First production cluster built in mid-to-late 2013 for OpenStack Cinder block storage.
3 PB: 48 × 24 × 3 TB drives, 200 journaling SSDs
Ceph dumpling v0.67 on Scientific Linux 6
We were very cautious: 4 replicas! (now 3)
History
• March 2013: 300TB proof of concept
• Dec 2013: 3PB in prod for RBD
• 2014-15: EC, radosstriper, radosfs
• 2016: 3PB to 6PB, no downtime
• 2017: 8 prod clusters
15
CERN Ceph Clusters                      Size     Version
OpenStack Cinder/Glance Production      5.5 PB   jewel
Satellite data centre (1000 km away)    0.4 PB   luminous
CephFS (HPC+Manila) Production          0.8 PB   luminous
Manila testing cluster                  0.4 PB   luminous
Hyperconverged HPC                      0.4 PB   luminous
CASTOR/XRootD Production                4.2 PB   luminous
CERN Tape Archive                       0.8 PB   luminous
S3+SWIFT Production                     0.9 PB   luminous
16
CephFS
17
CephFS: Filer Evolution
• Virtual NFS filers are stable and perform well:
• nfsd, ZFS, zrep, OpenStack VMs, Cinder/RBD
• We have ~60TB on ~30 servers
• High performance, but not scalable:
• Quota management tedious
• Labour-intensive to create new filers
• Can’t scale performance horizontally
18
CephFS: Filer Evolution
• OpenStack Manila (with CephFS) has most of the needed features:
• Multi-tenant with security isolation + quotas
• Easy self-service share provisioning
• Scalable performance (add more MDSs or OSDs as needed)
• Successful testing with preproduction users since mid-2017.
• Single MDS was seen as a bottleneck. Luminous has stable multi-MDS.
• Manila + CephFS now in production:
• One user already asked for 2000 shares
• Also using for Kubernetes: we are working on a new CSI CephFS plugin
• Really need kernel quota support! (quota-setting sketch after this slide)
19
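CephFS quotas are set per directory through virtual extended attributes. A minimal sketch of capping a share at 1 TB from a client that already has the filesystem mounted; the mount path is hypothetical and the client needs permission to set CephFS vxattrs (e.g. ceph-fuse with the appropriate MDS caps):

    import os

    SHARE_PATH = "/cephfs/volumes/_nogroup/myshare"  # hypothetical Manila share path
    ONE_TB = str(1024 ** 4).encode()                 # xattr values must be bytes

    # ceph.quota.max_bytes is the CephFS per-directory quota attribute.
    # Kernel clients only enforce it once the kernel driver gains quota support,
    # which is the gap noted above.
    os.setxattr(SHARE_PATH, "ceph.quota.max_bytes", ONE_TB)

    # Read it back to confirm.
    print(os.getxattr(SHARE_PATH, "ceph.quota.max_bytes"))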
Multi-MDS in Production
• ~20 tenants on our pre-prod environment for several months
• 2 active MDSs since luminous
• Enabled multi-MDS on our production cluster on Jan 24
• Currently have 3 active MDSs
• default balancer and pinning (pinning sketch after this slide)
20
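Directory pinning assigns a subtree to one active MDS rank, taking it out of the hands of the default balancer. A minimal sketch with a hypothetical directory, again via a vxattr on a mounted client:

    import os

    PROJECT_DIR = "/cephfs/projects/hpc-team"   # hypothetical pinned subtree

    # ceph.dir.pin = <rank> pins the subtree to that active MDS rank;
    # a value of -1 removes the pin and hands the subtree back to the balancer.
    os.setxattr(PROJECT_DIR, "ceph.dir.pin", b"1")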
HPC on CephFS?
• CERN is mostly a high-throughput computing lab:
• File-based parallelism
• Several smaller HPC use-cases exist within our lab:
• Beams, plasma, CFD, QCD, ASICs
• Need full POSIX, consistency, parallel IO
21
“Software Defined HPC”
• CERN's approach is to build HPC clusters with commodity parts: "Software Defined HPC"
• Compute side is solved with HTCondor & SLURM
• Typical HPC storage is not very attractive (missing expertise + budget)
• 200-300 HPC nodes accessing ~1PB CephFS since mid-2016:
• Manila + HPC use-cases on the same clusters. HPC is just another user.
• Quite stable but not super high performance
22
IO-500
• Storage benchmark announced by John Bent on the ceph-users ML (from SuperComputing 2017)
• « goal is to improve parallel file systems by ensuring that sites publish results of both "hero" and "anti-hero" runs and by sharing the tuning and configuration »
• We have just started testing on our CephFS clusters:
• IOR throughput tests, mdtest + find metadata test
• Easy/hard mode for shared/unique file tests
23
IO-500 First Look… No tuning!

Test                  Result
ior_easy_write        2.595 GB/s
ior_hard_write        0.761 GB/s
ior_easy_read         4.951 GB/s
ior_hard_read         0.944 GB/s

Test                  Result
mdtest_easy_write     1.774 kiops
mdtest_hard_write     1.512 kiops
find                  50.00 kiops
mdtest_easy_stat      8.722 kiops
mdtest_hard_stat      7.076 kiops
mdtest_easy_delete    0.747 kiops
mdtest_hard_read      2.521 kiops
mdtest_hard_delete    1.351 kiops

Luminous v12.2.4 -- Tested March 2018
411 OSDs: 800TB SSDs, 2 per server
OSDs running on same HW as clients
2 active MDSs running on VMs

[SCORE] Bandwidth 1.74 GB/s : IOPS 3.47 kiops : TOTAL 2.46 (arithmetic reproduced in the sketch below)
24
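The TOTAL above is the geometric mean of the bandwidth and IOPS sub-scores, each of which is itself the geometric mean of its individual results. A small Python check that reproduces the [SCORE] line from the numbers in the table (values copied from this slide; this is not the official IO-500 scoring script):

    from math import prod   # Python 3.8+

    # Per-test results from the table above.
    bw_gbps = [2.595, 0.761, 4.951, 0.944]                                # ior_* tests, GB/s
    md_kiops = [1.774, 1.512, 50.00, 8.722, 7.076, 0.747, 2.521, 1.351]   # mdtest_* and find, kiops

    def gmean(xs):
        """Geometric mean of a list of positive numbers."""
        return prod(xs) ** (1.0 / len(xs))

    bw_score = gmean(bw_gbps)                 # ~1.74 GB/s
    iops_score = gmean(md_kiops)              # ~3.47 kiops
    total = (bw_score * iops_score) ** 0.5    # ~2.46

    print(f"Bandwidth {bw_score:.2f} GB/s : IOPS {iops_score:.2f} kiops : TOTAL {total:.2f}")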
RGW
26
S3 @ CERN
• Ceph luminous cluster with VM gateways. Single region.
• 4+2 erasure coding. Physics data for small objects, volunteer computing, some backups.
• Pre-signed URLs and object expiration working well (pre-signed URL sketch after this slide).
• HAProxy is very useful:
• High-availability & mapping special buckets to dedicated gateways
27
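Pre-signed URLs let us hand out time-limited access to a single object without sharing credentials. A minimal sketch against an RGW S3 endpoint using boto3; the endpoint, credentials, bucket and key below are placeholders, not our actual configuration:

    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.example.cern.ch",   # placeholder RGW endpoint
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
    )

    # Pre-signed GET URL valid for one hour; the holder needs no credentials.
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "physics-objects", "Key": "run1234/event.root"},
        ExpiresIn=3600,
    )
    print(url)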
RBD
28
RBD: Ceph + OpenStack
29
Cinder Volume Types
Volume Type    Size (TB)   Count
standard       871         4,758
io1            440         608
cp1            97          118
cpio1          269         107
wig-cp1        26          19
wig-cpio1      106         13
io-test10k     20          1
Totals:        1,811       5,624
30
RBD @ CERN
• OpenStack Cinder + Glance use-cases continue to be highly reliable:
• QoS via IOPS/BW throttles is essential.
• Spectre/Meltdown reboots updated all clients to luminous!
• Ongoing work:
• Recently finished an expanded rbd trash feature (usage sketch after this slide)
• Just starting work on a persistent cache for librbd
• CERN Openlab collaboration with Rackspace!
• Writing a backup driver for glance (RBD to S3)
31
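To illustrate the rbd trash workflow mentioned above: instead of deleting an image immediately, it can be moved to the trash and restored later if needed. A sketch using the rados/rbd Python bindings; the pool and image names are placeholders, and the trash_* calls assume the luminous-era bindings:

    import rados
    import rbd

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    ioctx = cluster.open_ioctx("volumes")            # placeholder pool name
    try:
        r = rbd.RBD()
        # Move an image to the trash rather than deleting it outright,
        # keeping it recoverable for (here) 7 days.
        r.trash_move(ioctx, "volume-deadbeef", delay=7 * 24 * 3600)

        # Inspect the trash; trash_restore() would bring an image back.
        for entry in r.trash_list(ioctx):
            print(entry)
    finally:
        ioctx.close()
        cluster.shutdown()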
Hyperconverged Ceph/Cloud
• Experimenting with co-located ceph-osd on HVs and HPC:
• New cluster with 384 SSDs on HPC nodes
• Minor issues related to server isolation:
• cgroups or NUMA pinning are options but not yet used.
• Issues are related to our operations culture:
• We (Ceph team) don't own the servers – need to co-operate with the cloud/HPC teams.
• E.g. when is it ok to reboot a node? How to drain a node? Software upgrade procedures.
32
User Feedback:
From Jewel to Luminous
33
Jewel to Luminous Upgrade Notes
• In general upgrades went well with no big problems.
• New/replaced OSDs are BlueStore (ceph-volume lvm)
• Preparing a FileStore conversion script for our infrastructure
• ceph-mgr balancer is very interesting:
• Actively testing the crush-compat mode
• Python module can be patched in place for quick fixes (module skeleton sketched after this slide)
34
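ceph-mgr modules are plain Python running inside the mgr daemon, which is why they can be patched in place for quick fixes. A rough skeleton of what such a module looks like, loosely based on the luminous-era mgr_module interface (structure only, not one of the shipped modules):

    import time
    from mgr_module import MgrModule

    class Module(MgrModule):
        def __init__(self, *args, **kwargs):
            super(Module, self).__init__(*args, **kwargs)
            self.run = True

        def serve(self):
            # Background thread entry point: poll cluster state periodically.
            while self.run:
                osd_map = self.get("osd_map")        # cluster maps exposed as dicts
                self.log.info("osdmap epoch %d", osd_map["epoch"])
                time.sleep(60)

        def shutdown(self):
            self.run = False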
How to replace many OSDs?
35
Fully replaced 3PB of block storage with 6PB of new hardware over several weeks, transparent to users (gradual reweighting sketch below).
cernceph/ceph-scripts
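Our actual tooling lives in cernceph/ceph-scripts. As a rough illustration of the approach, old OSDs can be drained gradually by stepping their CRUSH weight down and letting backfill settle between steps; the OSD ids, starting weights, step size and sleep below are illustrative, not the production script:

    import json
    import time
    import rados

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()

    def crush_reweight(osd_id, weight):
        # Equivalent to `ceph osd crush reweight osd.<id> <weight>`.
        cmd = {"prefix": "osd crush reweight",
               "name": "osd.%d" % osd_id, "weight": weight}
        ret, out, errs = cluster.mon_command(json.dumps(cmd), b"")
        if ret != 0:
            raise RuntimeError(errs)

    OLD_OSDS = {123: 2.73, 124: 2.73}    # illustrative: osd id -> current CRUSH weight
    try:
        for osd_id, weight in OLD_OSDS.items():
            while weight > 0:
                weight = max(0.0, round(weight - 0.2, 2))   # small steps limit backfill load
                crush_reweight(osd_id, weight)
                time.sleep(3600)   # placeholder: really wait for backfill / HEALTH_OK
    finally:
        cluster.shutdown()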
Current Challenges
• RBD / OpenStack Cinder:
• Ops: how to identify active volumes?
• “rbd top”
• Performance: µs latencies and kHz IOPS.
• Need persistent SSD caches.
• On the wire encryption, client-side volume encryption
• OpenStack: volume type / availability zone coupling for hyper-converged clusters
36
Current Challenges
• CephFS HPC:
• HPC: parallel MPI I/O and single-MDS metadata perf (IO-500!)
• Copying data across /cephfs: need "rsync --ceph"
• CephFS general use-case:
• Scaling to 10,000 (or 100,000!) clients:
• client throttles, tools to block/disconnect noisy users/clients.
• Need “ceph client top”
• native Kerberos (without NFS gateway), group accounting and quotas
• HA CIFS and NFS gateways for non-Linux clients
• How to backup a 10 billion file CephFS?
• e.g. how about binary diff between snaps, similar to ZFS send/receive?
37
Current Challenges
• RADOS:
• How to phase in new features on old clusters
• e.g. we have 3PB of RBD data with hammer tunables
• Pool-level object backup (convert from replicated to EC, copy to non-Ceph)
• rados export the diff between two pool snapshots?
• Areas we cannot use Ceph yet:
• Storage for large enterprise databases (are we close?)
• Large scale batch processing
• Single filesystems spanning multiple sites
• HSM use-cases (CephFS with tape backend?, Glacier for S3?)
38
Future…
39
HEP Computing for the 2020s
• Run-2 (2015-18): ~50-80PB/year
• Run-3 (2020-23): ~150PB/year
• Run-4: ~600PB/year?!
“Data Lakes” – globally distributed, flexible placement, ubiquitous access
https://arxiv.org/abs/1712.06982
Ceph Bigbang Scale Testing
• Bigbang scale tests mutually benefit CERN & the Ceph project
• Bigbang I: 30PB, 7200 OSDs, Ceph hammer. Several osdmap limitations
• Bigbang II: similar size, Ceph jewel. Scalability limited by OSD/MON messaging. Motivated ceph-mgr
• Bigbang III: 65PB, 10800 OSDs
41
https://ceph.com/community/new-luminous-scalability/
Thanks…
42
Thanks to my CERN Colleagues
• Ceph team at CERN
• Hervé Rousseau, Teo Mouratidis, Roberto Valverde, Paul Musset, Julien Collet
• Massimo Lamanna / Alberto Pace (Storage Group Leadership)
• Andreas-Joachim Peters (Intel EC)
• Sebastien Ponce (radosstriper)
• OpenStack & Containers teams at CERN
• Tim Bell, Jan van Eldik, Arne Wiebalck (also co-initiator of Ceph at CERN), Belmiro Moreira, Ricardo Rocha, Jose Castro Leon
• HPC team at CERN
• Nils Hoimyr, Carolina Lindqvist, Pablo Llopis
43
!