When Bad Things
Happen to Good Data
Understanding Anti-Entropy in Cassandra
#cassandra13
Jason Brown
@jasobrown jasedbrown@gmail.com
About me
• Senior Software Engineer, Netflix
• Apache Cassandra committer
• E-commerce Architect, Major League Baseball Advanced
Media
• Wireless Developer (J2ME and BREW)
Maintaining consistent state is hard in a distributed system
CAP theorem is working against you
Inconsistencies creep in
• Node is down
• Network partition
• Dropped Mutations
• Process crash before flush
• File corruption
Anti-Entropy Overview
• Write time
  • Tunable consistency
  • Atomic batches
  • Hinted handoff
• Read time
  • Consistent reads
  • Read repair
• Maintenance time
  • Node repair
Write Time
C* Write Basics
• Determine all replica nodes, in all DCs
• Send to all replicas in local DC
• Send to one replica in remote DCs
• It will forward to peers
• All respond back to coordinator
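The fan-out described above can be sketched roughly as follows. This is only an illustration: `plan_write` and the `(node, dc)` tuples are hypothetical names, not Cassandra's internals.

```python
from collections import defaultdict


def plan_write(replicas, local_dc):
    """Map each node the coordinator contacts directly to the peers it forwards to.

    replicas: list of (node, dc) tuples (hypothetical input shape).
    """
    by_dc = defaultdict(list)
    for node, dc in replicas:
        by_dc[dc].append(node)

    plan = {}
    for dc, nodes in by_dc.items():
        if dc == local_dc:
            # Local DC: send to every replica directly, nothing to forward.
            for node in nodes:
                plan[node] = []
        else:
            # Remote DC: send to one replica, which forwards to its peers.
            first, *rest = nodes
            plan[first] = rest
    return plan
```

In this sketch every replica, whether contacted directly or via forwarding, would still acknowledge back to the coordinator.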
Writes – request path
Writes – response path
Tunable consistency
Coordinator blocks until the specified number of replicas respond
Consistency levels:
• ANY
• ONE / TWO / THREE
• LOCAL_QUORUM
• EACH_QUORUM
• ALL
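A minimal sketch of how the number of responses the coordinator waits for could follow from the consistency level. The function names and the `rf_by_dc` input (datacenter name to replication factor) are assumptions made for illustration, not Cassandra's API.

```python
def quorum(replication_factor):
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_acks(level, rf_by_dc, local_dc):
    """Number of replica responses the coordinator blocks for (illustrative)."""
    if level == "ANY":
        # A stored hint counts, so a single acknowledgement of any kind is enough.
        return 1
    if level in ("ONE", "TWO", "THREE"):
        return {"ONE": 1, "TWO": 2, "THREE": 3}[level]
    if level == "LOCAL_QUORUM":
        return quorum(rf_by_dc[local_dc])
    if level == "EACH_QUORUM":
        # A quorum in every datacenter, summed here into one count.
        return sum(quorum(rf) for rf in rf_by_dc.values())
    if level == "ALL":
        return sum(rf_by_dc.values())
    raise ValueError(f"unknown consistency level: {level}")
```

For example, with a replication factor of 3 in each of two datacenters, LOCAL_QUORUM waits for 2 responses and ALL waits for 6.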
Hinted Handoff
Save a copy of the write for down nodes, and replay later
Hint = target replica ID + mutation data
Hinted Handoff - storing
• On the coordinator, store a hint for each replica that is not up
• Also, if a replica doesn’t respond within
write_request_timeout_in_ms, store a hint
• max_hint_window_in_ms – max time a node will create
hints for a dead node
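The storing rules above can be sketched as a single decision. The `Hint` class and `maybe_hint` function are hypothetical; only `max_hint_window_in_ms` and the timeout behaviour come from the slide.

```python
from dataclasses import dataclass


@dataclass
class Hint:
    # Hint = target replica ID + mutation data
    target_replica_id: str
    mutation: bytes


def maybe_hint(replica_id, mutation, is_up, downtime_ms, acked_in_time,
               max_hint_window_in_ms):
    """Return a Hint if the coordinator should store one, else None."""
    if is_up and acked_in_time:
        return None  # the write reached the replica, nothing to replay
    if not is_up and downtime_ms > max_hint_window_in_ms:
        return None  # replica has been dead too long, stop creating hints
    # Replica is down (within the window) or did not respond within the timeout.
    return Hint(replica_id, mutation)
```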
Hinted Handoff - replay
• Try to send hints to nodes
• Runs every ten minutes
• Multithreaded (C* 1.2)
• Throttleable (KB per second)
Hinted Handoff – down node
Hinted Handoff – replay
What if coordinator dies?
Atomic Batches
• Coordinator first stores the incoming batch on two peers in the same DC
• Coordinator deletes the batch from the peers on successful completion
• Peers will replay the batch if it is not deleted
• Runs every 60 seconds
• With C* 1.2, batch mutations are atomic by default
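A rough sketch of the store, apply, delete sequence and of what happens when the coordinator dies in between. `BatchlogPeer` and `write_batch` are hypothetical names, and the periodic replay task is modelled as a plain method call.

```python
class BatchlogPeer:
    """Hypothetical stand-in for a peer holding a copy of the batch."""

    def __init__(self):
        self.pending = {}  # batch_id -> list of mutations

    def store(self, batch_id, mutations):
        self.pending[batch_id] = list(mutations)

    def delete(self, batch_id):
        self.pending.pop(batch_id, None)

    def replay_pending(self, apply):
        """Called by a periodic task: replay every batch that was never deleted."""
        for batch_id in list(self.pending):
            for mutation in self.pending.pop(batch_id):
                apply(mutation)


def write_batch(batch_id, mutations, peers, apply, crash_before_done=False):
    # 1. Store the batch on the peers before applying anything.
    for peer in peers:
        peer.store(batch_id, mutations)
    if crash_before_done:
        return False  # simulated coordinator death; peers still hold the batch
    # 2. Apply the mutations.
    for mutation in mutations:
        apply(mutation)
    # 3. Success: remove the batch from the peers.
    for peer in peers:
        peer.delete(batch_id)
    return True
```

Because both peers hold the batch, both may replay it; the sketch relies on mutations being safe to apply more than once.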
Read time
Cassandra reads - setup
• Determine replicas to invoke
• consistency level vs. read repair
• First data node responds with full data set, others send digests
• Coordinator waits for consistency_level nodes to respond
LOCAL_QUORUM read
Consistent reads
• Compare digests
• If any mismatches
  • re-request to same nodes (full data set)
  • compare full data sets, send updates
  • block until out of date replicas respond successfully
• Return merged data set to client
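The digest comparison and merge can be sketched as below. The row shape (column name mapped to a `(value, timestamp)` pair), the function names, and the use of MD5 as the hash are assumptions for illustration; the merge keeps the cell with the newest timestamp and ignores tie-breaking.

```python
import hashlib


def digest(row):
    """Hash of a row; row maps column -> (value, timestamp)."""
    h = hashlib.md5()
    for col in sorted(row):
        value, ts = row[col]
        h.update(f"{col}={value}@{ts};".encode())
    return h.hexdigest()


def merge(rows):
    """Merge full data sets, keeping the newest cell for each column."""
    merged = {}
    for row in rows:
        for col, (value, ts) in row.items():
            if col not in merged or ts > merged[col][1]:
                merged[col] = (value, ts)
    return merged


def repair_updates(rows_by_node, merged):
    """Cells each out-of-date node is missing relative to the merged result."""
    updates = {}
    for node, row in rows_by_node.items():
        missing = {col: cell for col, cell in merged.items() if row.get(col) != cell}
        if missing:
            updates[node] = missing
    return updates
```

In this sketch the coordinator would compare `digest` values first and only call `merge` and `repair_updates` when they differ, returning the merged row to the client once the updates are acknowledged.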
Read repair
• Synchronizes the client-requested data amongst all
replicas
• Piggy-backs on normal reads, but waits for all replicas to respond (asynchronously)
• Compares the digests and follows the same algorithm as a consistent read
Read Repair
Green lines = LOCAL_QUORUM nodes
Blue lines = nodes for read repair
Read repair configuration
• Setting per column family
• Percentage of all reads to CF
• Local DC vs. Global
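One way to picture the per-read decision is a single random roll checked against two percentages. The slide does not name the settings; the parameter names below follow the table options `read_repair_chance` and `dclocal_read_repair_chance` and should be read as assumptions, as should the function itself.

```python
import random


def read_repair_scope(read_repair_chance, dclocal_read_repair_chance,
                      rng=random.random):
    """Decide, for one read, whether to read-repair globally, locally, or not at all."""
    roll = rng()  # uniform in [0, 1)
    if roll < read_repair_chance:
        return "GLOBAL"    # involve replicas in all datacenters
    if roll < dclocal_read_repair_chance:
        return "DC_LOCAL"  # involve only replicas in the local datacenter
    return "NONE"
```

With chances of 0.1 and 0.5 in this sketch, roughly 10% of reads would repair globally and a further 40% within the local DC.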
Read repair fixes data that is actually
requested,
…but what about data that isn’t requested?
Node repair - introduction
• Repairs inconsistencies across all replicas for a given
range
• nodetool repair
  • repairs the ranges the node contains
  • one or more column families (within the same keyspace)
  • can choose local datacenter only (C* 1.2)
Node Repair - cautions
• Repair should be part of standard C* operations
• Especially if you delete data
• Repair is IO and CPU intensive
Node Repair – details, 1
• Determine peer nodes with matching ranges
• Triggers a major (validation) compaction on peer nodes
  • read and generate hash for every row in CF
  • add result to a Merkle Tree
  • return tree to initiator
Node Repair – details, 2
• Initiator awaits trees from participating nodes
• Compares every tree to every other tree
• If any differences detected, the differing nodes exchange
conflicting range(s)
• Written out as new, local SSTables
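The tree building and comparison can be sketched as follows. This is a toy version under stated assumptions: rows are a plain dict, keys are bucketed into a fixed power-of-two number of leaves by hashing, and SHA-256 stands in for whatever hash is really used. All function names are hypothetical.

```python
import hashlib


def bucket_of(key, num_leaves):
    """Assign a row key to a leaf (a stand-in for a token range)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big") % num_leaves


def build_tree(rows, num_leaves=4):
    """Hash every row into its leaf, then hash pairs upward to a single root.

    Returns a list of levels: tree[0] is the leaves, tree[-1] is [root].
    num_leaves must be a power of two.
    """
    leaves = [hashlib.sha256() for _ in range(num_leaves)]
    for key in sorted(rows):
        leaves[bucket_of(key, num_leaves)].update(f"{key}:{rows[key]}".encode())
    level = [leaf.digest() for leaf in leaves]
    tree = [level]
    while len(level) > 1:
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
        tree.append(level)
    return tree


def differing_leaves(tree_a, tree_b):
    """Walk down from the root, descending only into subtrees whose hashes differ."""
    def descend(level, index):
        if tree_a[level][index] == tree_b[level][index]:
            return []
        if level == 0:
            return [index]
        return descend(level - 1, 2 * index) + descend(level - 1, 2 * index + 1)

    return descend(len(tree_a) - 1, 0)
```

Only the leaves returned by `differing_leaves` correspond to ranges the differing nodes would need to exchange; matching subtrees are skipped without inspecting their rows.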
Read Repair – example
Anti-Entropy – Wrap Up
• CAP theorem lives; tradeoffs must be understood and made
• C* contains processes to make diverging data sets consistent
• Tunable controls exist at write and read times, as well as on demand
Thank you!
Q & A time
@jasobrown
