42on Ceph Month 2021
A quick update on '10 ways to break your Ceph cluster', originally by Wido den Hollander.
https://youtu.be/-FOYXz3Bz3Q
https://www.slideshare.net/ShapeBlue/widoden-hollander-10-ways-to-break-your-ceph-cluster
Five more ways to break your Ceph cluster
Break your Ceph cluster in these five ways
o Completing an update too soon
o Not completing an update
o Under- or over-estimating your automation tool
o Running with min_size=1
o Running multiple rgws with the same id
o Blindly trusting the PG autoscaler
Under- or over-estimating
your automation tool
o Due to a missing variable in a script that was part of the automation tooling, the number of monitors was set from 3 to 0. The very thorough tool nicely cleaned up the mons, including the directory structures of all three monitors.
o A similar case was found while using cephadm. While we did not find the root cause, it was clearly NOT cephadm's mistake. All monitors were scrapped due to the “mon means monitoring” misunderstanding. New, clean monitors were deployed, but that does not work: without the original monitor store they know nothing about the cluster. Two ways out:
1. Recreated the monitor db by scraping the OSDs (a sketch of that procedure follows below).
2. Found the original mon directories hiding on the filesystem.
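A rough sketch of the "recover the monitor store from the OSDs" route, following the upstream documented procedure; the store path, OSD paths and keyring location are illustrative, not the exact ones from this case:

  ms=/root/mon-store
  mkdir $ms
  # on every OSD host, with the OSDs stopped, pull the cluster maps out of the OSDs
  for osd in /var/lib/ceph/osd/ceph-*; do
      ceph-objectstore-tool --data-path $osd --op update-mon-db --mon-store-path $ms
  done
  # rebuild a monitor store from the collected maps, supplying a keyring with the mon. and admin keys
  ceph-monstore-tool $ms rebuild -- --keyring /path/to/admin.keyring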
Impact example case:
- availability: high.
- durability: low.
Running with min_size=1
o We still recommend that you run with at least
'size=3' in all cases if you value your data.
o We revised our earlier views a little bit though. Never, ever, in any case, run production with min_size=1.
o In a good case you'll see recovery_unfound; in a bad case you will see 'unknown' PGs.
o A better statement: make sure that you can only write data if a redundant copy of the object can also be written (see the pool settings sketch below).
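In practice that means keeping min_size at 2 for a size=3 replicated pool. A minimal sketch, assuming a replicated pool named 'mypool' (the pool name is only an example):

  ceph osd pool set mypool size 3
  ceph osd pool set mypool min_size 2
  ceph osd pool get mypool min_size   # verify

With min_size=2 a write only succeeds when at least two replicas can be written, so a single surviving copy is never the only one taking new data.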
Impact example case:
- availability: high.
- durability: high.
Not completing an update
At least 4 cases:
o Customer upgraded to Nautilus and enabled msgr v2. They didn't update the required osd version (the finishing steps are sketched below).
o This is a common mistake with the Nautilus upgrade.
o Sometimes the cluster survives into Octopus.
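A sketch of the steps that finish a Nautilus upgrade (standard upstream upgrade steps; double-check the release notes for the exact versions involved):

  ceph versions                            # confirm every daemon actually runs the new release
  ceph osd require-osd-release nautilus    # raise the minimum required OSD release
  ceph mon enable-msgr2                    # only once all mons and daemons are on Nautilus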
Impact example case:
- availability: high.
- durability: low.
Not completing an update
  cluster:
    health: HEALTH_WARN
            Reduced data availability: 512 pgs inactive, 143 pgs peering, 29 pgs stale
            3 slow requests are blocked > 32 sec
            3 slow ops, oldest one blocked for 864 sec, daemons [osd.24,osd.34] have slow ops.
            1/6 mons down, quorum mon-02,mon-03,mon-05,mon-06,mon-08

  data:
    pools:   4 pools, 1664 pgs
    objects: 2.68M objects, 10 TiB
    usage:   20 TiB used, 150 TiB / 170 TiB avail
    pgs:     22.055% pgs unknown
             8.714% pgs not active
             1152 active+clean
             367  unknown
             58   peering
             58   remapped+peering
             27   stale+peering
             2    stale

  io:
    client:  24 KiB/s rd, 24 MiB/s wr, 0 op/s rd, 797 op/s wr
Completing an update
too soon
o Example 14.2.19 -> 14.2.20.
o Setting auth_allow_insecure_global_id_reclaim to false before upgrading all clients and daemons.
o This makes the not-yet-upgraded clients unable to connect (the intended order is sketched below).
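A sketch of the intended order (this is the CVE-2021-20288 mitigation shipped in 14.2.20; the health check name is shown for illustration):

  # 1. upgrade all daemons and clients to a patched release first
  ceph versions
  ceph health detail   # AUTH_INSECURE_GLOBAL_ID_RECLAIM warnings list clients still on the old behaviour
  # 2. only then disallow the insecure reclaim
  ceph config set mon auth_allow_insecure_global_id_reclaim false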
Impact example case:
- availability: medium.
- durability: low.
Running multiple rgws
with the same id behind
a load balancer
o 9 Ceph Object Gateways (rgws) installed, reusing 3 names.
o Ceph only sees 3 rgws, based on those 3 names.
o The rgws keep switching over which 3 are active
in the service map.
o The load balancer in front of them kept trying to do new uploads.
o Result: bad performance and millions of failed multipart uploads. The cluster filled up faster than expected; new hardware was ordered and installed.
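A minimal sketch of what unique ids look like in a plain ceph.conf style deployment (section names and ports are examples): every gateway process gets its own client name, even when they all sit behind one load balancer.

  [client.rgw.gw-host1-a]
      rgw_frontends = beast port=8080
  [client.rgw.gw-host1-b]
      rgw_frontends = beast port=8081

With 9 distinct names the service map ('ceph -s', 'ceph service dump') reports 9 active rgw daemons instead of 3 fighting over the same identity.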
Impact example case:
- availability: medium.
- durability: medium.
Bonus:
Blindly trusting
the PG autoscaler
o Installed a reasonably large cluster for rgw with only hdd, no ssd for bluefs_db_dev <- mistake nr. 1
o Only tested with a very small dataset; pools were created with the default number of pgs (32).
o They then started to ingest a large amount of data. The autoscaler kept splitting pgs and never caught up; the cluster stayed at ~5% misplaced for a very long time.
o Performance was poor and customers were unhappy.
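A sketch of the alternative: size the data pool up front instead of letting the autoscaler chase the ingest. The pool name 'default.rgw.buckets.data' is the usual rgw default and is assumed here, not taken from the original case.

  ceph osd pool autoscale-status                                       # review what the autoscaler intends to do
  ceph osd pool set default.rgw.buckets.data pg_autoscale_mode warn    # or 'off' to take manual control
  ceph osd pool set default.rgw.buckets.data pg_num 1024               # pre-split before the bulk ingest starts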
Impact example case:
- availability: medium.
- durability: low.
So, which of the original 10 no longer break your
Ceph cluster?
Thank you!
