SlideShare a Scribd company logo
1 of 25
Download to read offline
What Ever Happened to
Durability?
Tom Lyon
Founder & Chief Scientist, DriveScale
@aka_pugs
Durability & ACID
§  Atomicity
§  Consistency
§  Isolation
§  Durability
Durability Formalized
Durability Defined
§  If written data is acknowledged, it must
be forever readable
§  If written data is read once [before it is
acknowledged], it must be forever
readable
Nothing is Forever
§  Hardware eventually fails
§  Software eventually (?) works
§  Durability is a matter of degree
§  What is good enough?
Estimating Durability
https://www.backblaze.com/blog/cloud-storage-durability/
Performance is the Enemy
§  “The only good write is an O_SYNC write”
§  Write-behind, caching, background
compaction/migration can all lead to
hidden errors
§  fsync(2) can and should return errors, but
misses some
§  See https://wiki.postgresql.org/wiki/Fsync_Errors
§  PostgreSQL: Caring about durability since 1986
§  “commit intervals”?
Can’t trust a File System
“We analyze 11 applications, and find 60 vulnerabilities,
some of which result in severe consequences like
corruption or data loss.”
Can’t trust an SSD
‘Surprisingly, we find that 13 out of the 15 devices,
including the supposedly “enterprise-class” devices,
exhibit failure behavior contrary to our expectations’
Servers and Mayflies
§  Back in the day, when “the” computer
crashed, you just waited for repair
§  Now you remove or re-image the server –
with the drives
§  Local durability is really hard,
but no longer adequate
Replication
§  Backups? Not timely
§  Synchronous mirroring? Very expensive
§  Just use the network! Make copies! Go
forth and replicate!
§  Losing a disk or server no longer causes
lost data. Right? Who needs fsync?
Correlated Failures
§  AWS can lose a data center, you can too
§  Rack power problems are common
§  The smaller your cluster, the more
vulnerable it is
https://xkcd.com/1737
It’s a Distributed System!
CAP Theorem
§  You will have Partitioning.
§  You must choose between Availability
and Consistency.
§  Your users will hate your choice.
§  Availability can be improved by brute
force and $$$ - to reduce partitioning.
§  Consistency requires consensus.
Consensus is Hard
Jepsen breaks everything
“Use Zookeeper. It’s mature, well-designed, and battle-tested.”
“The etcd and Consul teams both take consistency seriously…”
Kyle Kingsbury, https://jepsen.io
Logs & Journals
§  Application first writes to log, then to
where the data “really lives”
§  FS writes to journal, then to where the
data “really lives”
§  Device writes to log, then to where the
data “really lives”
§  What if “the truth” “really lived” in the log?
§  The other places become read caches
Table and Stream Duality
§  “A table is just a cache of the latest value
for each key in a stream” – P. Helland
§  Logs are great for streaming data
§  What if the log itself is distributed and
allows many writers and readers?
Streaming Systems
§  Apache Kafka
§  60 second “commit interval?”
§  Apache Pulsar
§  Uses Apache Bookkeeper
§  Distributed Logs:
§  Apache DistributedLog – uses Bookkeeper
§  Facebook LogDevice
Apache Bookkeeper™
§  “A scaleable, fault-tolerant, and low-
latency storage service optimized for
real-time workloads”
§  Guarantees:
§  “If an entry has been acknowledged, it must
be readable”
§  “If an entry has been read once, it must
always be readable”
Bookkeeper Components
§  Client-side library
§  Distributed Ledger Abstraction
§  “Bookie” – very simple storage nodes
§  Bookies do NOT talk to each other
§  Zookeeper coordination, consensus,
cluster membership, and quorums
Bookkeeper Data Flow
Bookies
Apps
Planet Java
§  Zookeeper and Bookkeeper are both
from planet Java
§  How about something more friendly to
Planet Linux?
§  Use etcd, rewrite Bookkeeper like
ScyllaDB did for Cassandra?
Take-aways
§  Durability is Hard
§  Distributed Durability is Very Hard
§  Be Up-Front about your durability model
§  Logs as Truth & Streaming are the future
§  Apache Bookkeeper is awesome
§  Don’t re-invent the wheel!
Q & A
Software Composable Infrastructure
for modern workloads
and commodity hardware.

More Related Content

What's hot

VMUG St Louis - SDN in the Real World
VMUG St Louis - SDN in the Real WorldVMUG St Louis - SDN in the Real World
VMUG St Louis - SDN in the Real WorldChris Wahl
 
lock, block & two smoking barrels
lock, block & two smoking barrelslock, block & two smoking barrels
lock, block & two smoking barrelsMark Broadbent
 
Rapid RESTful Web Applications with Apache Sling and Jackrabbit
Rapid RESTful Web Applications with Apache Sling and JackrabbitRapid RESTful Web Applications with Apache Sling and Jackrabbit
Rapid RESTful Web Applications with Apache Sling and JackrabbitCraig Dickson
 
Ye Olde Cluster Curiosity Shoppe
Ye Olde Cluster Curiosity ShoppeYe Olde Cluster Curiosity Shoppe
Ye Olde Cluster Curiosity ShoppeMark Broadbent
 
Divide and Conquer: Easier Continuous Delivery using Micro-Services
Divide and Conquer: Easier Continuous Delivery using Micro-ServicesDivide and Conquer: Easier Continuous Delivery using Micro-Services
Divide and Conquer: Easier Continuous Delivery using Micro-ServicesCarlos Sanchez
 
Make your Ansible playbooks maintainable, flexible, and scalable
Make your Ansible playbooks maintainable, flexible, and scalableMake your Ansible playbooks maintainable, flexible, and scalable
Make your Ansible playbooks maintainable, flexible, and scalableJeff Geerling
 
Lockless in Seattle - Using In-Memory OLTP for Transaction Processing
Lockless in Seattle -  Using In-Memory OLTP for Transaction ProcessingLockless in Seattle -  Using In-Memory OLTP for Transaction Processing
Lockless in Seattle - Using In-Memory OLTP for Transaction ProcessingMark Broadbent
 
Automation for Anyone at Nutanix NEXT 2017 US
Automation for Anyone at Nutanix NEXT 2017 USAutomation for Anyone at Nutanix NEXT 2017 US
Automation for Anyone at Nutanix NEXT 2017 USChris Wahl
 
Kubernetes on Top of Mesos on Top of DCOS
Kubernetes on Top of Mesos on Top of DCOSKubernetes on Top of Mesos on Top of DCOS
Kubernetes on Top of Mesos on Top of DCOSStefan Schimanski
 
VMUG - My Journey to Full Stack Engineering
VMUG - My Journey to Full Stack EngineeringVMUG - My Journey to Full Stack Engineering
VMUG - My Journey to Full Stack EngineeringChris Wahl
 
Node Security: The Good, Bad & Ugly
Node Security: The Good, Bad & UglyNode Security: The Good, Bad & Ugly
Node Security: The Good, Bad & UglyBishan Singh
 
Updates on Offline: “My AppCache won’t come back” and “ServiceWorker Tricks ...
Updates on Offline: “My AppCache won’t come back” and  “ServiceWorker Tricks ...Updates on Offline: “My AppCache won’t come back” and  “ServiceWorker Tricks ...
Updates on Offline: “My AppCache won’t come back” and “ServiceWorker Tricks ...Natasha Rooney
 

What's hot (13)

VMUG St Louis - SDN in the Real World
VMUG St Louis - SDN in the Real WorldVMUG St Louis - SDN in the Real World
VMUG St Louis - SDN in the Real World
 
lock, block & two smoking barrels
lock, block & two smoking barrelslock, block & two smoking barrels
lock, block & two smoking barrels
 
Rapid RESTful Web Applications with Apache Sling and Jackrabbit
Rapid RESTful Web Applications with Apache Sling and JackrabbitRapid RESTful Web Applications with Apache Sling and Jackrabbit
Rapid RESTful Web Applications with Apache Sling and Jackrabbit
 
Ye Olde Cluster Curiosity Shoppe
Ye Olde Cluster Curiosity ShoppeYe Olde Cluster Curiosity Shoppe
Ye Olde Cluster Curiosity Shoppe
 
HTTPS and Ansible
HTTPS and AnsibleHTTPS and Ansible
HTTPS and Ansible
 
Divide and Conquer: Easier Continuous Delivery using Micro-Services
Divide and Conquer: Easier Continuous Delivery using Micro-ServicesDivide and Conquer: Easier Continuous Delivery using Micro-Services
Divide and Conquer: Easier Continuous Delivery using Micro-Services
 
Make your Ansible playbooks maintainable, flexible, and scalable
Make your Ansible playbooks maintainable, flexible, and scalableMake your Ansible playbooks maintainable, flexible, and scalable
Make your Ansible playbooks maintainable, flexible, and scalable
 
Lockless in Seattle - Using In-Memory OLTP for Transaction Processing
Lockless in Seattle -  Using In-Memory OLTP for Transaction ProcessingLockless in Seattle -  Using In-Memory OLTP for Transaction Processing
Lockless in Seattle - Using In-Memory OLTP for Transaction Processing
 
Automation for Anyone at Nutanix NEXT 2017 US
Automation for Anyone at Nutanix NEXT 2017 USAutomation for Anyone at Nutanix NEXT 2017 US
Automation for Anyone at Nutanix NEXT 2017 US
 
Kubernetes on Top of Mesos on Top of DCOS
Kubernetes on Top of Mesos on Top of DCOSKubernetes on Top of Mesos on Top of DCOS
Kubernetes on Top of Mesos on Top of DCOS
 
VMUG - My Journey to Full Stack Engineering
VMUG - My Journey to Full Stack EngineeringVMUG - My Journey to Full Stack Engineering
VMUG - My Journey to Full Stack Engineering
 
Node Security: The Good, Bad & Ugly
Node Security: The Good, Bad & UglyNode Security: The Good, Bad & Ugly
Node Security: The Good, Bad & Ugly
 
Updates on Offline: “My AppCache won’t come back” and “ServiceWorker Tricks ...
Updates on Offline: “My AppCache won’t come back” and  “ServiceWorker Tricks ...Updates on Offline: “My AppCache won’t come back” and  “ServiceWorker Tricks ...
Updates on Offline: “My AppCache won’t come back” and “ServiceWorker Tricks ...
 

Similar to What Ever Happened to Durability?

Log analysis with the elk stack
Log analysis with the elk stackLog analysis with the elk stack
Log analysis with the elk stackVikrant Chauhan
 
Non Relational Databases
Non Relational DatabasesNon Relational Databases
Non Relational DatabasesChris Baglieri
 
Apache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage ServiceApache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage ServiceSijie Guo
 
Cassandra For Online Systems, What actually works
Cassandra For Online Systems, What actually worksCassandra For Online Systems, What actually works
Cassandra For Online Systems, What actually worksjdsumsion
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...Glenn K. Lockwood
 
BigData as a Platform: Cassandra and Current Trends
BigData as a Platform: Cassandra and Current TrendsBigData as a Platform: Cassandra and Current Trends
BigData as a Platform: Cassandra and Current TrendsMatthew Dennis
 
Espresso_DevOps_BLR_Meetup_28_Mar_2015
Espresso_DevOps_BLR_Meetup_28_Mar_2015Espresso_DevOps_BLR_Meetup_28_Mar_2015
Espresso_DevOps_BLR_Meetup_28_Mar_2015Shahnawaz Saifi
 
MongoDB and server performance
MongoDB and server performanceMongoDB and server performance
MongoDB and server performanceAlon Horev
 
Spark 101 - First steps to distributed computing
Spark 101 - First steps to distributed computingSpark 101 - First steps to distributed computing
Spark 101 - First steps to distributed computingDemi Ben-Ari
 
Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Dave Gardner
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsJonas Bonér
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Federico Panini
 
Racing The Web - Hackfest 2016
Racing The Web - Hackfest 2016Racing The Web - Hackfest 2016
Racing The Web - Hackfest 2016Aaron Hnatiw
 
How to fix IO problems for faster SQL Server performance
How to fix IO problems for faster SQL Server performanceHow to fix IO problems for faster SQL Server performance
How to fix IO problems for faster SQL Server performanceSolarWinds
 
Data Streaming Technology Overview
Data Streaming Technology OverviewData Streaming Technology Overview
Data Streaming Technology OverviewDan Lynn
 
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...Ensar Basri Kahveci
 
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018Ensar Basri Kahveci
 
Faster Time to Science - Scaling BioMedical Research in the Cloud with SciOps...
Faster Time to Science - Scaling BioMedical Research in the Cloud with SciOps...Faster Time to Science - Scaling BioMedical Research in the Cloud with SciOps...
Faster Time to Science - Scaling BioMedical Research in the Cloud with SciOps...Amazon Web Services
 
Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016
Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016
Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016Alluxio, Inc.
 

Similar to What Ever Happened to Durability? (20)

Log analysis with the elk stack
Log analysis with the elk stackLog analysis with the elk stack
Log analysis with the elk stack
 
Non Relational Databases
Non Relational DatabasesNon Relational Databases
Non Relational Databases
 
Apache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage ServiceApache BookKeeper: A High Performance and Low Latency Storage Service
Apache BookKeeper: A High Performance and Low Latency Storage Service
 
Cassandra For Online Systems, What actually works
Cassandra For Online Systems, What actually worksCassandra For Online Systems, What actually works
Cassandra For Online Systems, What actually works
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
 
Amazon Aurora: Under the Hood
Amazon Aurora: Under the HoodAmazon Aurora: Under the Hood
Amazon Aurora: Under the Hood
 
BigData as a Platform: Cassandra and Current Trends
BigData as a Platform: Cassandra and Current TrendsBigData as a Platform: Cassandra and Current Trends
BigData as a Platform: Cassandra and Current Trends
 
Espresso_DevOps_BLR_Meetup_28_Mar_2015
Espresso_DevOps_BLR_Meetup_28_Mar_2015Espresso_DevOps_BLR_Meetup_28_Mar_2015
Espresso_DevOps_BLR_Meetup_28_Mar_2015
 
MongoDB and server performance
MongoDB and server performanceMongoDB and server performance
MongoDB and server performance
 
Spark 101 - First steps to distributed computing
Spark 101 - First steps to distributed computingSpark 101 - First steps to distributed computing
Spark 101 - First steps to distributed computing
 
Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)
 
Racing The Web - Hackfest 2016
Racing The Web - Hackfest 2016Racing The Web - Hackfest 2016
Racing The Web - Hackfest 2016
 
How to fix IO problems for faster SQL Server performance
How to fix IO problems for faster SQL Server performanceHow to fix IO problems for faster SQL Server performance
How to fix IO problems for faster SQL Server performance
 
Data Streaming Technology Overview
Data Streaming Technology OverviewData Streaming Technology Overview
Data Streaming Technology Overview
 
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
 
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
 
Faster Time to Science - Scaling BioMedical Research in the Cloud with SciOps...
Faster Time to Science - Scaling BioMedical Research in the Cloud with SciOps...Faster Time to Science - Scaling BioMedical Research in the Cloud with SciOps...
Faster Time to Science - Scaling BioMedical Research in the Cloud with SciOps...
 
Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016
Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016
Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016
 

Recently uploaded

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 

Recently uploaded (20)

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

What Ever Happened to Durability?

  • 1. What Ever Happened to Durability? Tom Lyon Founder & Chief Scientist, DriveScale @aka_pugs
  • 2. Durability & ACID §  Atomicity §  Consistency §  Isolation §  Durability
  • 4. Durability Defined §  If written data is acknowledged, it must be forever readable §  If written data is read once [before it is acknowledged], it must be forever readable
  • 5. Nothing is Forever §  Hardware eventually fails §  Software eventually (?) works §  Durability is a matter of degree §  What is good enough?
  • 7. Performance is the Enemy §  “The only good write is an O_SYNC write” §  Write-behind, caching, background compaction/migration can all lead to hidden errors §  fsync(2) can and should return errors, but misses some §  See https://wiki.postgresql.org/wiki/Fsync_Errors §  PostgreSQL: Caring about durability since 1986 §  “commit intervals”?
  • 8. Can’t trust a File System “We analyze 11 applications, and find 60 vulnerabilities, some of which result in severe consequences like corruption or data loss.”
  • 9. Can’t trust an SSD ‘Surprisingly, we find that 13 out of the 15 devices, including the supposedly “enterprise-class” devices, exhibit failure behavior contrary to our expectations’
  • 10. Servers and Mayflies §  Back in the day, when “the” computer crashed, you just waited for repair §  Now you remove or re-image the server – with the drives §  Local durability is really hard, but no longer adequate
  • 11. Replication §  Backups? Not timely §  Synchronous mirroring? Very expensive §  Just use the network! Make copies! Go forth and replicate! §  Losing a disk or server no longer causes lost data. Right? Who needs fsync?
  • 12. Correlated Failures §  AWS can lose a data center, you can too §  Rack power problems are common §  The smaller your cluster, the more vulnerable it is https://xkcd.com/1737
  • 14. CAP Theorem §  You will have Partitioning. §  You must choose between Availability and Consistency. §  Your users will hate your choice. §  Availability can be improved by brute force and $$$ - to reduce partitioning. §  Consistency requires consensus.
  • 16. Jepsen breaks everything “Use Zookeeper. It’s mature, well-designed, and battle-tested.” “The etcd and Consul teams both take consistency seriously…” Kyle Kingsbury, https://jepsen.io
  • 17. Logs & Journals §  Application first writes to log, then to where the data “really lives” §  FS writes to journal, then to where the data “really lives” §  Device writes to log, then to where the data “really lives” §  What if “the truth” “really lived” in the log? §  The other places become read caches
  • 18. Table and Stream Duality §  “A table is just a cache of the latest value for each key in a stream” – P. Helland §  Logs are great for streaming data §  What if the log itself is distributed and allows many writers and readers?
  • 19. Streaming Systems §  Apache Kafka §  60 second “commit interval?” §  Apache Pulsar §  Uses Apache Bookkeeper §  Distributed Logs: §  Apache DistributedLog – uses Bookkeeper §  Facebook LogDevice
  • 20. Apache Bookkeeper™ §  “A scaleable, fault-tolerant, and low- latency storage service optimized for real-time workloads” §  Guarantees: §  “If an entry has been acknowledged, it must be readable” §  “If an entry has been read once, it must always be readable”
  • 21. Bookkeeper Components §  Client-side library §  Distributed Ledger Abstraction §  “Bookie” – very simple storage nodes §  Bookies do NOT talk to each other §  Zookeeper coordination, consensus, cluster membership, and quorums
  • 23. Planet Java §  Zookeeper and Bookkeeper are both from planet Java §  How about something more friendly to Planet Linux? §  Use etcd, rewrite Bookkeeper like ScyllaDB did for Cassandra?
  • 24. Take-aways §  Durability is Hard §  Distributed Durability is Very Hard §  Be Up-Front about your durability model §  Logs as Truth & Streaming are the future §  Apache Bookkeeper is awesome §  Don’t re-invent the wheel!
  • 25. Q & A Software Composable Infrastructure for modern workloads and commodity hardware.