SlideShare a Scribd company logo
1 of 36
When Bad Things
Happen to Good Data
Understanding Anti-Entropy in Cassandra
#cassandra13
Jason Brown
@jasobrown jasedbrown@gmail.com
About me
• Senior Software Engineer, Netflix
• Apache Cassandra committer
• E-commerce Architect, Major League Baseball Advanced
Media
• Wireless Developer (J2ME and BREW)
#cassandra13
Maintaining consistent state is hard in a distributed system
CAP theorem is working against you
#cassandra13
Inconsistencies creep in
• Node is down
• Network partition
• Dropped Mutations
• Process crash before flush
• File corruption
#cassandra13
Anti-Entropy Overview
• Write time
• Tunable consistency
• Atomic batches
• Hinted handoff
• Read time
• Consistent reads
• Read repair
• Maintenance time
• Node repair
#cassandra13
Write Time
#cassandra13
C* Write Basics
• Determine all replica nodes, in all DCs
• Send to all replicas in local DC
• Send to one replica in remote DCs
• It will forward to peers
• All respond back to coordinator
#cassandra13
Writes – request path
#cassandra13
Writes – response path
#cassandra13
Tunable consistency
Coordinator blocks for specified count of replicas to respond
consistency levels:
• ANY
• ONE / TWO / THREE
• LOCAL_QUORUM
• EACH_QUORUM
• ALL
#cassandra13
Hinted Handoff
Save a copy of the write for down nodes, and replay later
Hint = target replica ID + mutation data
#cassandra13
Hinted Handoff - storing
• On coordinator, store hint for nodes not up
• Also, if a replica doesn’t respond within
write_request_timeout_in_ms, store a hint
• max_hint_window_in_ms – max time a node will create
hints for a dead node
#cassandra13
Hinted Handoff - replay
• Try to send hints to nodes
• Runs every ten minutes
• Multithreaded (c* 1.2)
• Throttleable (kb per second)
#cassandra13
Hinted Handoff – down node
#cassandra13
Hinted Handoff – replay
#cassandra13
What if coordinator dies?
#cassandra13
Atomic Batches
• Coordinator stores incoming mutation to two peers in
same DC
• Deletes batch from peers on successful completion
• Peers will play batch if not deleted
• Runs every 60 seconds
• With c* 1.2, all mutates use atomic batch
#cassandra13
Read time
#cassandra13
Cassandra reads - setup
• Determine replicas to invoke
• consistency level vs. read repair
• First data node responds with full data set, other send
digest
• Coordinator waits for consistency_level nodes to respond
#cassandra13
LOCAL_QUORUM read
#cassandra13
Consistent reads
• Compare digests
• If any mismatches
• re-request to same nodes (full data set)
• compare full data sets, send updates
• block until out of date replicas respond successfully
• Return merged data set to client
#cassandra13
Read repair
• Synchronizes the client-requested data amongst all
replicas
• Piggy-backs on normal reads, but waits for all replicas to
responds (asynchronously)
• Compares the digests and follow same alg as consistent
read
#cassandra13
Read Repair
#cassandra13
Green lines = LOCAL_QUORUM nodes
Blue lines = nodes for read repair
Read repair configuration
• Setting per column family
• Percentage of all reads to CF
• Local DC vs. Global
#cassandra13
Read repair fixes data that is actually
requested,
…but what about data that isn’t requested?
#cassandra13
Node repair - introduction
• Repairs inconsistencies across all replicas for a given
range
• nodetool repair
• repairs the ranges the node contains
• one or more column families (within the same keyspace)
• can choose local datacenter only (c* 1.2)
#cassandra13
Node Repair - cautions
• Should be part of standard c* operations
• Especially if you delete data
• Repair is IO and CPU intensive
#cassandra13
Node Repair – details, 1
• Determine peer nodes with matching ranges
• Triggers a major (validation) compaction on peer nodes
• read and generate hash for every row in CF
• add result to a Merkle Tree
• return tree to initiator
#cassandra13
Node Repair – details, 2
• Initiator awaits trees from participating nodes
• Compares every tree to every other tree
• If any differences detected, the differing nodes exchange
conflicting range(s)
• Written out as new, local SSTables
#cassandra13
Read Repair – example
#cassandra13
#cassandra13
#cassandra13
#cassandra13
#cassandra13
Anti-Entropy – Wrap Up
• CAP Theorem lives, tradeoffs must be understood and
made
• C* contains processes to make diverging data sets
consistent
• Tunable controls exist at write and read times, as well on-
demand
#cassandra13
Thank you!
Q & A time
@jasobrown
#cassandra13

More Related Content

What's hot

Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...
Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...
Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...
confluent
 
Analyzing 1.2 Million Network Packets per Second in Real-time
Analyzing 1.2 Million Network Packets per Second in Real-timeAnalyzing 1.2 Million Network Packets per Second in Real-time
Analyzing 1.2 Million Network Packets per Second in Real-time
DataWorks Summit
 
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
DataStax
 

What's hot (20)

Introduction to Spark Streaming
Introduction to Spark StreamingIntroduction to Spark Streaming
Introduction to Spark Streaming
 
Introduction to Kafka Cruise Control
Introduction to Kafka Cruise ControlIntroduction to Kafka Cruise Control
Introduction to Kafka Cruise Control
 
Cassandra & puppet, scaling data at $15 per month
Cassandra & puppet, scaling data at $15 per monthCassandra & puppet, scaling data at $15 per month
Cassandra & puppet, scaling data at $15 per month
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions
 
Kafka High Availability in multi data center setup with floating Observers wi...
Kafka High Availability in multi data center setup with floating Observers wi...Kafka High Availability in multi data center setup with floating Observers wi...
Kafka High Availability in multi data center setup with floating Observers wi...
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introduction
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
 
Real-Time Event Processing
Real-Time Event ProcessingReal-Time Event Processing
Real-Time Event Processing
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Deploying Flink on Kubernetes - David Anderson
 Deploying Flink on Kubernetes - David Anderson Deploying Flink on Kubernetes - David Anderson
Deploying Flink on Kubernetes - David Anderson
 
Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...
Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...
Ingesting and Processing IoT Data Using MQTT, Kafka Connect and Kafka Streams...
 
PostgreSQL Query Cache - "pqc"
PostgreSQL Query Cache - "pqc"PostgreSQL Query Cache - "pqc"
PostgreSQL Query Cache - "pqc"
 
Analyzing 1.2 Million Network Packets per Second in Real-time
Analyzing 1.2 Million Network Packets per Second in Real-timeAnalyzing 1.2 Million Network Packets per Second in Real-time
Analyzing 1.2 Million Network Packets per Second in Real-time
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
Apache Cassandra at Macys
Apache Cassandra at MacysApache Cassandra at Macys
Apache Cassandra at Macys
 
Migrating Airflow-based Apache Spark Jobs to Kubernetes – the Native Way
Migrating Airflow-based Apache Spark Jobs to Kubernetes – the Native WayMigrating Airflow-based Apache Spark Jobs to Kubernetes – the Native Way
Migrating Airflow-based Apache Spark Jobs to Kubernetes – the Native Way
 
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
 
Alfresco Content Modelling and Policy Behaviours
Alfresco Content Modelling and Policy BehavioursAlfresco Content Modelling and Policy Behaviours
Alfresco Content Modelling and Policy Behaviours
 
Cluster management with Kubernetes
Cluster management with KubernetesCluster management with Kubernetes
Cluster management with Kubernetes
 
Best Practices for Streaming IoT Data with MQTT and Apache Kafka
Best Practices for Streaming IoT Data with MQTT and Apache KafkaBest Practices for Streaming IoT Data with MQTT and Apache Kafka
Best Practices for Streaming IoT Data with MQTT and Apache Kafka
 

Viewers also liked

Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
DataStax
 

Viewers also liked (7)

Spotify: Automating Cassandra repairs
Spotify: Automating Cassandra repairsSpotify: Automating Cassandra repairs
Spotify: Automating Cassandra repairs
 
Cassandra London March 2016 - Lightening talk - introduction to incremental ...
Cassandra London March 2016  - Lightening talk - introduction to incremental ...Cassandra London March 2016  - Lightening talk - introduction to incremental ...
Cassandra London March 2016 - Lightening talk - introduction to incremental ...
 
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
 
Linux Performance Analysis and Tools
Linux Performance Analysis and ToolsLinux Performance Analysis and Tools
Linux Performance Analysis and Tools
 
LISA17 Container Performance Analysis
LISA17 Container Performance AnalysisLISA17 Container Performance Analysis
LISA17 Container Performance Analysis
 
Core java slides
Core java slidesCore java slides
Core java slides
 
Learn 90% of Python in 90 Minutes
Learn 90% of Python in 90 MinutesLearn 90% of Python in 90 Minutes
Learn 90% of Python in 90 Minutes
 

Similar to Understanding AntiEntropy in Cassandra

Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
Sean Murphy
 
Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011
Boris Yen
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandra
Axel Liljencrantz
 

Similar to Understanding AntiEntropy in Cassandra (20)

Apache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsApache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentials
 
Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentials
 
Webinar Back to Basics 3 - Introduzione ai Replica Set
Webinar Back to Basics 3 - Introduzione ai Replica SetWebinar Back to Basics 3 - Introduzione ai Replica Set
Webinar Back to Basics 3 - Introduzione ai Replica Set
 
Hindsight is 20/20: MySQL to Cassandra
Hindsight is 20/20: MySQL to CassandraHindsight is 20/20: MySQL to Cassandra
Hindsight is 20/20: MySQL to Cassandra
 
C* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael Kjellman
C* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael KjellmanC* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael Kjellman
C* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael Kjellman
 
Cassandra for mission critical data
Cassandra for mission critical dataCassandra for mission critical data
Cassandra for mission critical data
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
 
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
 
Devops kc
Devops kcDevops kc
Devops kc
 
Apache cassandra
Apache cassandraApache cassandra
Apache cassandra
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
 
Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011
 
Talk About Apache Cassandra
Talk About Apache CassandraTalk About Apache Cassandra
Talk About Apache Cassandra
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandra
 
Dynamo cassandra
Dynamo cassandraDynamo cassandra
Dynamo cassandra
 
Making Cassandra Perform as a Time Series Database - Cassandra Summit 15
Making Cassandra Perform as a Time Series Database - Cassandra Summit 15Making Cassandra Perform as a Time Series Database - Cassandra Summit 15
Making Cassandra Perform as a Time Series Database - Cassandra Summit 15
 
SignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series DatabaseSignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series Database
 
Introduction to Cassandra - Denver
Introduction to Cassandra - DenverIntroduction to Cassandra - Denver
Introduction to Cassandra - Denver
 
Cassandra Day Denver 2014: Introduction to Apache Cassandra
Cassandra Day Denver 2014: Introduction to Apache CassandraCassandra Day Denver 2014: Introduction to Apache Cassandra
Cassandra Day Denver 2014: Introduction to Apache Cassandra
 

Recently uploaded

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

Understanding AntiEntropy in Cassandra

  • 1. When Bad Things Happen to Good Data Understanding Anti-Entropy in Cassandra #cassandra13 Jason Brown @jasobrown jasedbrown@gmail.com
  • 2. About me • Senior Software Engineer, Netflix • Apache Cassandra committer • E-commerce Architect, Major League Baseball Advanced Media • Wireless Developer (J2ME and BREW) #cassandra13
  • 3. Maintaining consistent state is hard in a distributed system CAP theorem is working against you #cassandra13
  • 4. Inconsistencies creep in • Node is down • Network partition • Dropped Mutations • Process crash before flush • File corruption #cassandra13
  • 5. Anti-Entropy Overview • Write time • Tunable consistency • Atomic batches • Hinted handoff • Read time • Consistent reads • Read repair • Maintenance time • Node repair #cassandra13
  • 7. C* Write Basics • Determine all replica nodes, in all DCs • Send to all replicas in local DC • Send to one replica in remote DCs • It will forward to peers • All respond back to coordinator #cassandra13
  • 8. Writes – request path #cassandra13
  • 9. Writes – response path #cassandra13
  • 10. Tunable consistency Coordinator blocks for specified count of replicas to respond consistency levels: • ANY • ONE / TWO / THREE • LOCAL_QUORUM • EACH_QUORUM • ALL #cassandra13
  • 11. Hinted Handoff Save a copy of the write for down nodes, and replay later Hint = target replica ID + mutation data #cassandra13
  • 12. Hinted Handoff - storing • On coordinator, store hint for nodes not up • Also, if a replica doesn’t respond within write_request_timeout_in_ms, store a hint • max_hint_window_in_ms – max time a node will create hints for a dead node #cassandra13
  • 13. Hinted Handoff - replay • Try to send hints to nodes • Runs every ten minutes • Multithreaded (c* 1.2) • Throttleable (kb per second) #cassandra13
  • 14. Hinted Handoff – down node #cassandra13
  • 15. Hinted Handoff – replay #cassandra13
  • 16. What if coordinator dies? #cassandra13
  • 17. Atomic Batches • Coordinator stores incoming mutation to two peers in same DC • Deletes batch from peers on successful completion • Peers will play batch if not deleted • Runs every 60 seconds • With c* 1.2, all mutates use atomic batch #cassandra13
  • 19. Cassandra reads - setup • Determine replicas to invoke • consistency level vs. read repair • First data node responds with full data set, other send digest • Coordinator waits for consistency_level nodes to respond #cassandra13
  • 21. Consistent reads • Compare digests • If any mismatches • re-request to same nodes (full data set) • compare full data sets, send updates • block until out of date replicas respond successfully • Return merged data set to client #cassandra13
  • 22. Read repair • Synchronizes the client-requested data amongst all replicas • Piggy-backs on normal reads, but waits for all replicas to responds (asynchronously) • Compares the digests and follow same alg as consistent read #cassandra13
  • 23. Read Repair #cassandra13 Green lines = LOCAL_QUORUM nodes Blue lines = nodes for read repair
  • 24. Read repair configuration • Setting per column family • Percentage of all reads to CF • Local DC vs. Global #cassandra13
  • 25. Read repair fixes data that is actually requested, …but what about data that isn’t requested? #cassandra13
  • 26. Node repair - introduction • Repairs inconsistencies across all replicas for a given range • nodetool repair • repairs the ranges the node contains • one or more column families (within the same keyspace) • can choose local datacenter only (c* 1.2) #cassandra13
  • 27. Node Repair - cautions • Should be part of standard c* operations • Especially if you delete data • Repair is IO and CPU intensive #cassandra13
  • 28. Node Repair – details, 1 • Determine peer nodes with matching ranges • Triggers a major (validation) compaction on peer nodes • read and generate hash for every row in CF • add result to a Merkle Tree • return tree to initiator #cassandra13
  • 29. Node Repair – details, 2 • Initiator awaits trees from participating nodes • Compares every tree to every other tree • If any differences detected, the differing nodes exchange conflicting range(s) • Written out as new, local SSTables #cassandra13
  • 30. Read Repair – example #cassandra13
  • 35. Anti-Entropy – Wrap Up • CAP Theorem lives, tradeoffs must be understood and made • C* contains processes to make diverging data sets consistent • Tunable controls exist at write and read times, as well on- demand #cassandra13
  • 36. Thank you! Q & A time @jasobrown #cassandra13