SlideShare a Scribd company logo
1 of 36
When Bad Things
Happen to Good Data
Understanding Anti-Entropy in Cassandra
#cassandra13
Jason Brown
@jasobrown jasedbrown@gmail.com
About me
• Senior Software Engineer, Netflix
• Apache Cassandra committer
• E-commerce Architect, Major League Baseball Advanced
Media
• Wireless Developer (J2ME and BREW)
#cassandra13
Maintaining consistent state is hard in a distributed system
CAP theorem is working against you
#cassandra13
Inconsistencies creep in
• Node is down
• Network partition
• Dropped Mutations
• Process crash before flush
• File corruption
#cassandra13
Anti-Entropy Overview
• Write time
• Tunable consistency
• Atomic batches
• Hinted handoff
• Read time
• Consistent reads
• Read repair
• Maintenance time
• Node repair
#cassandra13
Write Time
#cassandra13
C* Write Basics
• Determine all replica nodes, in all DCs
• Send to all replicas in local DC
• Send to one replica in remote DCs
• It will forward to peers
• All respond back to coordinator
#cassandra13
Writes – request path
#cassandra13
Writes – response path
#cassandra13
Tunable consistency
Coordinator blocks for specified count of replicas to respond
consistency levels:
• ANY
• ONE / TWO / THREE
• LOCAL_QUORUM
• EACH_QUORUM
• ALL
#cassandra13
Hinted Handoff
Save a copy of the write for down nodes, and replay later
Hint = target replica ID + mutation data
#cassandra13
Hinted Handoff - storing
• On coordinator, store hint for nodes not up
• Also, if a replica doesn’t respond within
write_request_timeout_in_ms, store a hint
• max_hint_window_in_ms – max time a node will create
hints for a dead node
#cassandra13
Hinted Handoff - replay
• Try to send hints to nodes
• Runs every ten minutes
• Multithreaded (c* 1.2)
• Throttleable (kb per second)
#cassandra13
Hinted Handoff – down node
#cassandra13
Hinted Handoff – replay
#cassandra13
What if coordinator dies?
#cassandra13
Atomic Batches
• Coordinator stores incoming mutation to two peers in
same DC
• Deletes batch from peers on successful completion
• Peers will play batch if not deleted
• Runs every 60 seconds
• With c* 1.2, all mutates use atomic batch
#cassandra13
Read time
#cassandra13
Cassandra reads - setup
• Determine replicas to invoke
• consistency level vs. read repair
• First data node responds with full data set, other send
digest
• Coordinator waits for consistency_level nodes to respond
#cassandra13
LOCAL_QUORUM read
#cassandra13
Consistent reads
• Compare digests
• If any mismatches
• re-request to same nodes (full data set)
• compare full data sets, send updates
• block until out of date replicas respond successfully
• Return merged data set to client
#cassandra13
Read repair
• Synchronizes the client-requested data amongst all
replicas
• Piggy-backs on normal reads, but waits for all replicas to
responds (asynchronously)
• Compares the digests and follow same alg as consistent
read
#cassandra13
Read Repair
#cassandra13
Green lines = LOCAL_QUORUM nodes
Blue lines = nodes for read repair
Read repair configuration
• Setting per column family
• Percentage of all reads to CF
• Local DC vs. Global
#cassandra13
Read repair fixes data that is actually
requested,
…but what about data that isn’t requested?
#cassandra13
Node repair - introduction
• Repairs inconsistencies across all replicas for a given
range
• nodetool repair
• repairs the ranges the node contains
• one or more column families (within the same keyspace)
• can choose local datacenter only (c* 1.2)
#cassandra13
Node Repair - cautions
• Should be part of standard c* operations
• Especially if you delete data
• Repair is IO and CPU intensive
#cassandra13
Node Repair – details, 1
• Determine peer nodes with matching ranges
• Triggers a major (validation) compaction on peer nodes
• read and generate hash for every row in CF
• add result to a Merkle Tree
• return tree to initiator
#cassandra13
Node Repair – details, 2
• Initiator awaits trees from participating nodes
• Compares every tree to every other tree
• If any differences detected, the differing nodes exchange
conflicting range(s)
• Written out as new, local SSTables
#cassandra13
Read Repair – example
#cassandra13
#cassandra13
#cassandra13
#cassandra13
#cassandra13
Anti-Entropy – Wrap Up
• CAP Theorem lives, tradeoffs must be understood and
made
• C* contains processes to make diverging data sets
consistent
• Tunable controls exist at write and read times, as well on-
demand
#cassandra13
Thank you!
Q & A time
@jasobrown
#cassandra13

More Related Content

What's hot

Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
DataStax
 

What's hot (20)

Common linux ubuntu commands overview
Common linux  ubuntu commands overviewCommon linux  ubuntu commands overview
Common linux ubuntu commands overview
 
Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandra
 
HDFS on Kubernetes—Lessons Learned with Kimoon Kim
HDFS on Kubernetes—Lessons Learned with Kimoon KimHDFS on Kubernetes—Lessons Learned with Kimoon Kim
HDFS on Kubernetes—Lessons Learned with Kimoon Kim
 
Workshop on Git and GitHub
Workshop on Git and GitHubWorkshop on Git and GitHub
Workshop on Git and GitHub
 
Disaster Recovery with MirrorMaker 2.0 (Ryanne Dolan, Cloudera) Kafka Summit ...
Disaster Recovery with MirrorMaker 2.0 (Ryanne Dolan, Cloudera) Kafka Summit ...Disaster Recovery with MirrorMaker 2.0 (Ryanne Dolan, Cloudera) Kafka Summit ...
Disaster Recovery with MirrorMaker 2.0 (Ryanne Dolan, Cloudera) Kafka Summit ...
 
Let's talk about Failures with Kubernetes - Hamburg Meetup
Let's talk about Failures with Kubernetes - Hamburg MeetupLet's talk about Failures with Kubernetes - Hamburg Meetup
Let's talk about Failures with Kubernetes - Hamburg Meetup
 
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
Apache Cassandra Multi-Datacenter Essentials (Julien Anguenot, iLand Internet...
 
How Kubernetes helps Devops
How Kubernetes helps DevopsHow Kubernetes helps Devops
How Kubernetes helps Devops
 
Apache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
Apache Spark on Kubernetes Anirudh Ramanathan and Tim ChenApache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
Apache Spark on Kubernetes Anirudh Ramanathan and Tim Chen
 
Introduction to Git and GitHub
Introduction to Git and GitHubIntroduction to Git and GitHub
Introduction to Git and GitHub
 
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DMUpgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
 
Microservices, Containers and Docker
Microservices, Containers and DockerMicroservices, Containers and Docker
Microservices, Containers and Docker
 
Linux - Introductions to Linux Operating System
Linux - Introductions to Linux Operating SystemLinux - Introductions to Linux Operating System
Linux - Introductions to Linux Operating System
 
Kubernetes Basics
Kubernetes BasicsKubernetes Basics
Kubernetes Basics
 
Docker - Containervirtualisierung leichtgemacht
Docker - Containervirtualisierung leichtgemachtDocker - Containervirtualisierung leichtgemacht
Docker - Containervirtualisierung leichtgemacht
 
Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016Bucket your partitions wisely - Cassandra summit 2016
Bucket your partitions wisely - Cassandra summit 2016
 
Application Monitoring using Open Source: VictoriaMetrics - ClickHouse
Application Monitoring using Open Source: VictoriaMetrics - ClickHouseApplication Monitoring using Open Source: VictoriaMetrics - ClickHouse
Application Monitoring using Open Source: VictoriaMetrics - ClickHouse
 
Containerd Internals: Building a Core Container Runtime
Containerd Internals: Building a Core Container RuntimeContainerd Internals: Building a Core Container Runtime
Containerd Internals: Building a Core Container Runtime
 
Delphix for DBAs by Jonathan Lewis
Delphix for DBAs by Jonathan LewisDelphix for DBAs by Jonathan Lewis
Delphix for DBAs by Jonathan Lewis
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internals
 

Viewers also liked

Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
DataStax
 

Viewers also liked (7)

Spotify: Automating Cassandra repairs
Spotify: Automating Cassandra repairsSpotify: Automating Cassandra repairs
Spotify: Automating Cassandra repairs
 
Cassandra London March 2016 - Lightening talk - introduction to incremental ...
Cassandra London March 2016  - Lightening talk - introduction to incremental ...Cassandra London March 2016  - Lightening talk - introduction to incremental ...
Cassandra London March 2016 - Lightening talk - introduction to incremental ...
 
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
Real World Tales of Repair (Alexander Dejanovski, The Last Pickle) | Cassandr...
 
Linux Performance Analysis and Tools
Linux Performance Analysis and ToolsLinux Performance Analysis and Tools
Linux Performance Analysis and Tools
 
LISA17 Container Performance Analysis
LISA17 Container Performance AnalysisLISA17 Container Performance Analysis
LISA17 Container Performance Analysis
 
Core java slides
Core java slidesCore java slides
Core java slides
 
Learn 90% of Python in 90 Minutes
Learn 90% of Python in 90 MinutesLearn 90% of Python in 90 Minutes
Learn 90% of Python in 90 Minutes
 

Similar to Understanding AntiEntropy in Cassandra

Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
Sean Murphy
 
Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011
Boris Yen
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandra
Axel Liljencrantz
 

Similar to Understanding AntiEntropy in Cassandra (20)

Apache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentialsApache Cassandra multi-datacenter essentials
Apache Cassandra multi-datacenter essentials
 
Cassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentialsCassandra multi-datacenter operations essentials
Cassandra multi-datacenter operations essentials
 
Webinar Back to Basics 3 - Introduzione ai Replica Set
Webinar Back to Basics 3 - Introduzione ai Replica SetWebinar Back to Basics 3 - Introduzione ai Replica Set
Webinar Back to Basics 3 - Introduzione ai Replica Set
 
Hindsight is 20/20: MySQL to Cassandra
Hindsight is 20/20: MySQL to CassandraHindsight is 20/20: MySQL to Cassandra
Hindsight is 20/20: MySQL to Cassandra
 
C* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael Kjellman
C* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael KjellmanC* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael Kjellman
C* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael Kjellman
 
Cassandra for mission critical data
Cassandra for mission critical dataCassandra for mission critical data
Cassandra for mission critical data
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
 
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO...
 
Devops kc
Devops kcDevops kc
Devops kc
 
Apache cassandra
Apache cassandraApache cassandra
Apache cassandra
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
C* Summit 2013: Netflix Open Source Tools and Benchmarks for Cassandra by Adr...
 
Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011
 
Talk About Apache Cassandra
Talk About Apache CassandraTalk About Apache Cassandra
Talk About Apache Cassandra
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandra
 
Dynamo cassandra
Dynamo cassandraDynamo cassandra
Dynamo cassandra
 
SignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series DatabaseSignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series Database
 
Making Cassandra Perform as a Time Series Database - Cassandra Summit 15
Making Cassandra Perform as a Time Series Database - Cassandra Summit 15Making Cassandra Perform as a Time Series Database - Cassandra Summit 15
Making Cassandra Perform as a Time Series Database - Cassandra Summit 15
 
Cassandra Day Denver 2014: Introduction to Apache Cassandra
Cassandra Day Denver 2014: Introduction to Apache CassandraCassandra Day Denver 2014: Introduction to Apache Cassandra
Cassandra Day Denver 2014: Introduction to Apache Cassandra
 
Introduction to Cassandra - Denver
Introduction to Cassandra - DenverIntroduction to Cassandra - Denver
Introduction to Cassandra - Denver
 

Recently uploaded

Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
FIDO Alliance
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
panagenda
 

Recently uploaded (20)

TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024
 
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties ReimaginedEasier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджера
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
Using IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & IrelandUsing IESVE for Room Loads Analysis - UK & Ireland
Using IESVE for Room Loads Analysis - UK & Ireland
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
 
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
Human Expert Website Manual WCAG 2.0 2.1 2.2 Audit - Digital Accessibility Au...
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptx
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentation
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
 
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
 
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
(Explainable) Data-Centric AI: what are you explaininhg, and to whom?
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
Event-Driven Architecture Masterclass: Engineering a Robust, High-performance...
 

Understanding AntiEntropy in Cassandra

  • 1. When Bad Things Happen to Good Data Understanding Anti-Entropy in Cassandra #cassandra13 Jason Brown @jasobrown jasedbrown@gmail.com
  • 2. About me • Senior Software Engineer, Netflix • Apache Cassandra committer • E-commerce Architect, Major League Baseball Advanced Media • Wireless Developer (J2ME and BREW) #cassandra13
  • 3. Maintaining consistent state is hard in a distributed system CAP theorem is working against you #cassandra13
  • 4. Inconsistencies creep in • Node is down • Network partition • Dropped Mutations • Process crash before flush • File corruption #cassandra13
  • 5. Anti-Entropy Overview • Write time • Tunable consistency • Atomic batches • Hinted handoff • Read time • Consistent reads • Read repair • Maintenance time • Node repair #cassandra13
  • 7. C* Write Basics • Determine all replica nodes, in all DCs • Send to all replicas in local DC • Send to one replica in remote DCs • It will forward to peers • All respond back to coordinator #cassandra13
  • 8. Writes – request path #cassandra13
  • 9. Writes – response path #cassandra13
  • 10. Tunable consistency Coordinator blocks for specified count of replicas to respond consistency levels: • ANY • ONE / TWO / THREE • LOCAL_QUORUM • EACH_QUORUM • ALL #cassandra13
  • 11. Hinted Handoff Save a copy of the write for down nodes, and replay later Hint = target replica ID + mutation data #cassandra13
  • 12. Hinted Handoff - storing • On coordinator, store hint for nodes not up • Also, if a replica doesn’t respond within write_request_timeout_in_ms, store a hint • max_hint_window_in_ms – max time a node will create hints for a dead node #cassandra13
  • 13. Hinted Handoff - replay • Try to send hints to nodes • Runs every ten minutes • Multithreaded (c* 1.2) • Throttleable (kb per second) #cassandra13
  • 14. Hinted Handoff – down node #cassandra13
  • 15. Hinted Handoff – replay #cassandra13
  • 16. What if coordinator dies? #cassandra13
  • 17. Atomic Batches • Coordinator stores incoming mutation to two peers in same DC • Deletes batch from peers on successful completion • Peers will play batch if not deleted • Runs every 60 seconds • With c* 1.2, all mutates use atomic batch #cassandra13
  • 19. Cassandra reads - setup • Determine replicas to invoke • consistency level vs. read repair • First data node responds with full data set, other send digest • Coordinator waits for consistency_level nodes to respond #cassandra13
  • 21. Consistent reads • Compare digests • If any mismatches • re-request to same nodes (full data set) • compare full data sets, send updates • block until out of date replicas respond successfully • Return merged data set to client #cassandra13
  • 22. Read repair • Synchronizes the client-requested data amongst all replicas • Piggy-backs on normal reads, but waits for all replicas to responds (asynchronously) • Compares the digests and follow same alg as consistent read #cassandra13
  • 23. Read Repair #cassandra13 Green lines = LOCAL_QUORUM nodes Blue lines = nodes for read repair
  • 24. Read repair configuration • Setting per column family • Percentage of all reads to CF • Local DC vs. Global #cassandra13
  • 25. Read repair fixes data that is actually requested, …but what about data that isn’t requested? #cassandra13
  • 26. Node repair - introduction • Repairs inconsistencies across all replicas for a given range • nodetool repair • repairs the ranges the node contains • one or more column families (within the same keyspace) • can choose local datacenter only (c* 1.2) #cassandra13
  • 27. Node Repair - cautions • Should be part of standard c* operations • Especially if you delete data • Repair is IO and CPU intensive #cassandra13
  • 28. Node Repair – details, 1 • Determine peer nodes with matching ranges • Triggers a major (validation) compaction on peer nodes • read and generate hash for every row in CF • add result to a Merkle Tree • return tree to initiator #cassandra13
  • 29. Node Repair – details, 2 • Initiator awaits trees from participating nodes • Compares every tree to every other tree • If any differences detected, the differing nodes exchange conflicting range(s) • Written out as new, local SSTables #cassandra13
  • 30. Read Repair – example #cassandra13
  • 35. Anti-Entropy – Wrap Up • CAP Theorem lives, tradeoffs must be understood and made • C* contains processes to make diverging data sets consistent • Tunable controls exist at write and read times, as well on- demand #cassandra13
  • 36. Thank you! Q & A time @jasobrown #cassandra13