4. … and few care or are aware
(thank you cloud providers!)
5. Cloud providers have duped a generation of
operators into thinking bitrot does not exist.
It does. The reality has been hidden from plain sight,
but it’s there… lurking… silently, waiting…
13. TL;DR: 4.2% to 34% of SSDs hit at least one uncorrectable bit error (UBER) per year
How many SSDs in that shiny box ya got there?
How many boxes are running?
1 - (1 - uberRate)^numDisks = probability of at least one UBER per server per year
1 - (1 - 0.042)^20 = 58%
1 - (1 - 0.34)^20 = 99.975%
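The arithmetic above is easy to rerun for your own fleet. A minimal sketch, assuming every disk fails independently at the same annual rate (the same simplification the formula makes); `p_uber_server` is a hypothetical helper name, and it prints the result as a percentage:

```shell
#!/bin/sh
# Hypothetical helper: chance (in percent) that at least one of N SSDs in a
# server hits an uncorrectable error within a year.
# Assumes independent disks that all share the same per-disk annual rate.
#   $1 = per-disk annual rate (e.g. 0.042), $2 = number of disks
p_uber_server() {
    awk -v rate="$1" -v disks="$2" \
        'BEGIN { printf "%.3f\n", (1 - (1 - rate) ^ disks) * 100 }'
}

p_uber_server 0.042 20   # low end of the range, 20 disks
p_uber_server 0.34 20    # high end of the range, 20 disks
```

Swap in your own disk count as the second argument; the two calls should land on roughly the 58% and 99.975% figures shown above.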
15. External Factors for UBER on SSDs:
• Temperature
• Bus Power Consumption
• Data Written by the System Software
• Workload changes due to SSD failure
25. PostgreSQL and ZFS were meant for each other
• Because bitrot happens
• Because it's fast
2-4µs/pwrite(2)!!
P.S. This was observed on 10K RPM spinning rust.
• Because restoring from backups is AWFUL!
# zfsnap snapshot -rv -a 25h tank/pgdata
# zfs list -r -t snapshot
# zfs rollback -r tank/pgdata@hourly-2016-09-14_14.52.00--25h
This happens in seconds!
It’s YUGE, people, absolutely YUGE!
• Because compression is a performance win,
even on SSDs
# zfs set compression=lz4 tank/pgdata
(wtb publishing of benchmarks any year now…
you know who you are…)
• Because compression is a space win (2.2:1
compression for most PG data)
• Because zfs snap; zfs send; ssh… ; zfs recv
• Because caching compressed records is a win
https://www.illumos.org/issues/6950
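The `zfs snap; zfs send; ssh… ; zfs recv` bullet can be sketched roughly as follows. Only `tank/pgdata` comes from the slides; the receiving host `standby.example.com`, the destination dataset `backup/pgdata`, the `xfer-` snapshot prefix, and the `replicate` function name are all placeholders made up for illustration. This is a one-shot full send under those assumptions, not a complete replication setup:

```shell
#!/bin/sh
# Sketch only. Every name here except tank/pgdata is a placeholder.
SRC=tank/pgdata
DST=backup/pgdata              # hypothetical dataset on the receiving pool
REMOTE=standby.example.com     # hypothetical receiving host
SNAP="$SRC@xfer-$(date -u +%Y-%m-%d_%H.%M.%S)"

replicate() {
    # -r: snapshot the dataset and its children in one atomic operation
    zfs snapshot -r "$SNAP" &&
    # -R: send the dataset tree as of the snapshot (a replication stream)
    # -u: do not mount the received file systems on the remote side
    zfs send -R "$SNAP" | ssh "$REMOTE" zfs receive -u "$DST"
}
```

Run `replicate` as a user with the needed ZFS permissions on the sending host. Because the snapshot is atomic, the copy on the receiver should look to PostgreSQL like a crash-consistent data directory, provided the data files and WAL all live under the snapshotted dataset tree.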