Human fault-tolerance


                  Nathan Marz
                  Twitter
What is a data system?

   A system that manages the storage and
   querying of data with a lifetime measured in
   years encompassing every version of the
   application to ever exist, every hardware
   failure, and every human mistake ever
   made
We don’t know how to make
     perfect software
Machine fault-tolerance
•   Paxos
•   replication
•   CRCs
•   BASE
•   quorum
•   tunable consistency
•   active anti-entropy
•   Merkle trees
•   read repair
•   consistent hashing
The worst consequence is
data loss or data corruption
As long as an error doesn’t
 lose or corrupt good data,
you can fix what went wrong
This brings us to mutability
Mutability
The U and D in CRUD
Mutable systems inherently
lack human fault-tolerance
Very easy for a mistake to
   corrupt or lose data
Immutability
•   An immutable system captures a historical record of events
•   Each event happens at a particular time and is always true
Capturing change with mutable data model

Before:
   Person   Location
   Sally    Philadelphia
   Bob      Chicago

Sally moves to New York

After:
   Person   Location
   Sally    New York
   Bob      Chicago
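The mutable model above can be sketched in a few lines. This is a minimal illustration, assuming a plain in-memory dictionary stands in for the mutable database; the `move` helper is hypothetical and not from the slides.

```python
# Hypothetical in-memory stand-in for a mutable database: one row per person.
locations = {"Sally": "Philadelphia", "Bob": "Chicago"}

def move(db, person, new_location):
    """Update in place: the previous value is overwritten and cannot be recovered."""
    db[person] = new_location

# Sally moves to New York; her time in Philadelphia is no longer recorded.
move(locations, "Sally", "New York")

# A human mistake (wrong person) silently destroys good data.
move(locations, "Bob", "New York")
print(locations)
```

After the mistaken call, nothing in the store says that Bob ever lived in Chicago, so the error cannot be repaired from the data alone.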
Capturing change with immutable data model

Before:
   Person   Location       Time
   Sally    Philadelphia   1318358351
   Bob      Chicago        1327928370

Sally moves to New York

After:
   Person   Location       Time
   Sally    Philadelphia   1318358351
   Bob      Chicago        1327928370
   Sally    New York       1338469380
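The immutable model above can be sketched as an append-only list of timestamped facts. This is an illustration under assumptions: the names `LocationFact`, `record_move`, and `current_locations` are hypothetical, and a Python list stands in for real storage.

```python
from typing import NamedTuple

class LocationFact(NamedTuple):
    person: str
    location: str
    time: int  # Unix timestamp, as in the table above

# Append-only log: facts are added, never updated or deleted.
facts = [
    LocationFact("Sally", "Philadelphia", 1318358351),
    LocationFact("Bob", "Chicago", 1327928370),
]

def record_move(log, person, location, time):
    """Capture a change by appending a new fact; existing facts stay untouched."""
    log.append(LocationFact(person, location, time))

def current_locations(log):
    """Derive each person's current location from the most recent fact."""
    latest = {}
    for fact in log:
        prev = latest.get(fact.person)
        if prev is None or fact.time > prev.time:
            latest[fact.person] = fact
    return {person: fact.location for person, fact in latest.items()}

record_move(facts, "Sally", "New York", 1338469380)
print(current_locations(facts))
```

The current location is a query over the history rather than a stored value, so the fact that Sally lived in Philadelphia remains true and available after she moves.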
Immutability greatly restricts the
 range of errors that can cause
  data loss or data corruption
Vastly more human fault-tolerant
Plus much easier to reason about
 systems based on immutability
What can we conclude?
Your source of truth should be
         immutable
[Diagram: an Application backed by a Mutable database, which is labeled SOURCE OF TRUTH]

Rather than build systems like this...
[Diagram: an Application backed by MySQL, Cassandra, HBase, Voldemort, Riak, ..., which are labeled SOURCE OF TRUTH]

Rather than build systems like this...
[Diagram: Immutable data, labeled SOURCE OF TRUTH; a View; and the Application]

...build them like this
[Diagram: HDFS, labeled SOURCE OF TRUTH; Cassandra, HBase, Voldemort, Riak, ElephantDB, ...; and the Application]

...build them like this
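One way to read this architecture: the view is derived from the immutable source of truth, so a human mistake in the view-building code can be fixed and the view rebuilt. The sketch below is hypothetical. A tuple stands in for the immutable store and a dictionary for the view, and both function names are invented for illustration.

```python
# Immutable master dataset (the source of truth): (person, location, time) facts.
MASTER = (
    ("Sally", "Philadelphia", 1318358351),
    ("Bob", "Chicago", 1327928370),
    ("Sally", "New York", 1338469380),
)

def build_view_buggy(facts):
    """Human mistake: iterates newest-first, so the OLDEST fact per person wins."""
    view = {}
    for person, location, _time in sorted(facts, key=lambda f: f[2], reverse=True):
        view[person] = location
    return view

def build_view_fixed(facts):
    """Corrected logic: iterates oldest-first, so the newest fact per person wins."""
    view = {}
    for person, location, _time in sorted(facts, key=lambda f: f[2]):
        view[person] = location
    return view

# The buggy code ships and produces a wrong view...
view = build_view_buggy(MASTER)

# ...but the master data was never modified, so the fix is to rebuild the view.
view = build_view_fixed(MASTER)
print(view)
```

The bug corrupted only derived data. Because the source of truth was not touched, recomputing the view is enough to recover.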
Long live immutability
           (forgive the pun)
