Distributed Counters in Cassandra




Friday, August 13, 2010
I: Goal
II: Design
III: Implementation




I: Goal




Goal




       Low-Latency, Highly Available Counters




II: Design




I: Traditional Counter Design
II: Abstract Strategy
III: Distributed Counter Design




Design



                 I: Traditional Counter Design




Traditional Counter Design
       Atomic Counters


       1. single machine
       2. one order of execution
       3. strongly consistent



Traditional Counter Design
       Problems


       1. SPOF / single master
       2. high latency
       3. manually sharded



Traditional Counter Design
       Question




                          What constraints can we relax?




Design



               II: Abstract Strategy




Abstract Strategy
       Constraints to Relax



       1. one order of execution
       2. strong consistency




Abstract Strategy
       Relax: One Order of Execution



       commutative operation:
         - operations must be re-orderable



Abstract Strategy
       Relax: Strong Consistency

       partitioned work:
         - each op must occur once
         - unique partition identifier
       idempotent repair:
         - recognize ops from other partitions

Design



            III: Distributed Counter Design




Distributed Counter Design
       Requirements


       1. commutative operation
       2. partitioned work
       3. idempotent repair



Distributed Counter Design
       Commutative Operation


       addition:
         - commutative operation
         - sum ops performed by all replicas
         - a + b = b + a
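
As a small illustration of why commutativity matters here, the sketch below applies the same set of deltas in two different orders and arrives at the same count. The function name and the sample deltas are made up for this example.

```python
import random


def apply_increments(deltas):
    """Apply counter deltas in the given order and return the final count."""
    total = 0
    for delta in deltas:
        total += delta
    return total


deltas = [5, -2, 7, 1, -3]            # hypothetical deltas sent by clients
shuffled = deltas[:]
random.Random(0).shuffle(shuffled)    # some other arrival order

in_order = apply_increments(deltas)
reordered = apply_increments(shuffled)
```

Because addition is commutative and associative, `in_order` and `reordered` are equal for any permutation of the deltas.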

Distributed Counter Design
       Partitioned Work



       each op assigned to a replica:
         - every replica sums all of its ops
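
One way to picture partitioned work is the sketch below: every op is handled by exactly one replica, each replica keeps the sum of its own ops, and the counter value is the sum of those partial sums. The replica ids and the `partition_ops` helper are assumptions for illustration, not names from Cassandra.

```python
from collections import defaultdict


def partition_ops(ops):
    """Sum the deltas handled by each replica.

    ops is an iterable of (replica_id, delta) pairs, where replica_id names
    the one replica that the op was assigned to.
    """
    per_replica = defaultdict(int)
    for replica_id, delta in ops:
        per_replica[replica_id] += delta
    return dict(per_replica)


# Hypothetical ops, each assigned to a single replica.
ops = [("A", 3), ("B", 1), ("A", 2), ("C", 4), ("B", -1)]

partial_sums = partition_ops(ops)
counter_value = sum(partial_sums.values())
```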



Distributed Counter Design
       Idempotent Repair


       save counts from remote replicas:
         - keep highest count seen
       prevent multiple execution:
         - do not transfer the target replica’s count
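
To see why "keep highest count seen" makes repair idempotent, the sketch below contrasts it with a naive repair that adds the received count. Both function names and the sample numbers are hypothetical; the example assumes the same repair message is delivered twice.

```python
def repair_by_sum(stored, received):
    """Naive repair: add the received count. Replaying a message double counts."""
    return stored + received


def repair_by_max(stored, received):
    """Repair rule from the slide: keep the highest count seen."""
    return max(stored, received)


stored_count_for_b = 4   # what this replica last saw for replica B
repair_message = 7       # replica B's newer count, delivered by repair

once_max = repair_by_max(stored_count_for_b, repair_message)
twice_max = repair_by_max(once_max, repair_message)

once_sum = repair_by_sum(stored_count_for_b, repair_message)
twice_sum = repair_by_sum(once_sum, repair_message)
```

With the max rule a replayed repair leaves the stored count unchanged, while the additive version drifts further with each replay.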


III: Implementation




I: Data Structure
II: Single Node
III: Eventual Consistency




I: Data Structure




Data Structure
       Requirements


       local counts:
         - incrementally update
       remote counts:
         - independently track partitions

Data Structure
       Context Format



       list of (replica id, count) tuples:
                 [(replica A, count), (replica B, count), ...]




Data Structure
       Context Mutations


       local write:
         add the write's delta to the local count
         note: done in the memtable



Data Structure
       Context Mutations


       remote repair:
         for each replica,
         keep highest count seen
         (local or from repair)
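
The two context mutations can be sketched on the list-of-tuples format as follows. This is a minimal model, not Cassandra's code: `local_write`, `remote_repair`, and the replica ids are hypothetical, and skipping the local replica's entry during repair is an assumption made to match the rule that a replica's own count is never transferred to it.

```python
def local_write(context, local_id, delta):
    """Return a new context with delta added to the local replica's count."""
    counts = dict(context)
    counts[local_id] = counts.get(local_id, 0) + delta
    return sorted(counts.items())


def remote_repair(context, repaired, local_id):
    """Return a new context keeping the highest count seen per remote replica."""
    counts = dict(context)
    for replica_id, count in repaired:
        if replica_id == local_id:
            continue  # assumption: the local count only changes via local writes
        counts[replica_id] = max(counts.get(replica_id, count), count)
    return sorted(counts.items())


# Replica "A" is the local replica in this example.
context = [("A", 10), ("B", 4)]
context = local_write(context, "A", 3)
context = remote_repair(context, [("B", 6), ("C", 2), ("A", 99)], "A")
```

After both mutations the context holds A's summed local count, the higher of the two counts seen for B, and a new entry for C.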


II: Single Node




Single Node
       Write Path

       client
          1. construct column
             - value: delta (big-endian long)
             - clock: empty
          2. thrift: insert / batch_mutate
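
For the column value, a big-endian long is an 8-byte signed integer with the most significant byte first. A client could build it with Python's `struct` module as sketched below; the helper names are made up, and the Thrift call itself is left out.

```python
import struct


def encode_delta(delta):
    """Pack a counter delta as an 8-byte, big-endian, signed integer."""
    return struct.pack(">q", delta)


def decode_delta(value):
    """Unpack an 8-byte, big-endian, signed integer back into a delta."""
    (delta,) = struct.unpack(">q", value)
    return delta


encoded = encode_delta(1)
round_trip = decode_delta(encode_delta(-5))
```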

Single Node
       Write Path

       coordinator
         1. choose partition
            - choose target replica
            - requirement: ConsistencyLevel.ONE
         2. construct clock
            - context format: [(target replica id, count delta)]


Single Node
       Write Path


       target replica
       insert:
                 1. memtable does not contain column
                 2. insert column into memtable



Single Node
       Write Path
       target replica
       update:
                 1. memtable contains column
                 2. retrieve existing column
                 3. create new column
                    - context: sum local count w/ delta from write
                 4. replace column in ConcurrentSkipListMap
                 5. if failed to replace column, go to step 2.
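
The update steps above amount to an optimistic compare-and-replace loop. The sketch below models it in Python: `CasMap` is a small lock-based stand-in for the conditional replace that Java's ConcurrentSkipListMap offers, columns are modelled as one-element tuples, and all names are hypothetical. Unlike the Java map, this stand-in compares by object identity rather than equality.

```python
import threading


class CasMap:
    """Tiny stand-in for a concurrent map with compare-and-replace."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def put_if_absent(self, key, value):
        """Insert value if key is missing; return the existing value or None."""
        with self._lock:
            existing = self._data.get(key)
            if existing is None:
                self._data[key] = value
            return existing

    def replace(self, key, expected, new):
        """Swap in new only if the current value is still the expected object."""
        with self._lock:
            if self._data.get(key) is expected:
                self._data[key] = new
                return True
            return False


def apply_delta(memtable, name, delta):
    """Insert the column, or sum the delta into it, retrying on conflict."""
    while True:
        existing = memtable.get(name)                    # retrieve existing column
        if existing is None:
            if memtable.put_if_absent(name, (delta,)) is None:
                return                                   # plain insert succeeded
            continue                                     # lost the race; retry
        updated = (existing[0] + delta,)                 # create new column
        if memtable.replace(name, existing, updated):    # replace column
            return                                       # otherwise go around again


def worker(memtable, name, n):
    for _ in range(n):
        apply_delta(memtable, name, 1)


memtable = CasMap()
threads = [
    threading.Thread(target=worker, args=(memtable, "hits", 1000))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_count = memtable.get("hits")[0]
```

No increment is lost even though the writers never hold a lock across the read-modify-write sequence; a writer that loses the race simply re-reads the column and tries again.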


Single Node
       Write Path
       Interesting Note:
       Memtables (MTs) are serialized to SSTables (SSTs) as-is
                 - each SST holds only the updates applied
                   while it was an MT
                 - the local count total must therefore be
                   aggregated across the MT and all SSTs

Single Node
       Read Path
       target replica
       read:
                 1. construct collating iterator over:
                    - frozen snapshot of MT
                    - all relevant SSTs
                 2. resolve column
                    - local counts: sum
                    - remote counts: keep max
                 3. construct value
                    - sum local and remote counts (big-endian long)
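
The resolve step of the read path might look like the following sketch, which merges the contexts found in the memtable snapshot and the SSTables. The `resolve` and `counter_value` helpers and the sample contexts are assumptions for illustration; replica "A" plays the local replica.

```python
def resolve(contexts, local_id):
    """Merge contexts: sum the local replica's counts, keep the max of remote ones."""
    resolved = {}
    for context in contexts:
        for replica_id, count in context:
            if replica_id == local_id:
                # Each MT/SST holds only the local updates made during its lifetime.
                resolved[replica_id] = resolved.get(replica_id, 0) + count
            else:
                # Remote counts are totals, so the highest one seen wins.
                resolved[replica_id] = max(resolved.get(replica_id, count), count)
    return sorted(resolved.items())


def counter_value(resolved):
    """The value returned to the client: local plus remote counts."""
    return sum(count for _, count in resolved)


memtable_ctx = [("A", 2), ("B", 5)]
sstable_1 = [("A", 7), ("B", 3), ("C", 1)]
sstable_2 = [("A", 1), ("C", 4)]

resolved = resolve([memtable_ctx, sstable_1, sstable_2], "A")
value = counter_value(resolved)
```

The same resolution rule (local: sum, remote: max) is what the compaction slide applies across SSTables.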

Single Node
       Compaction

       replica
       compaction:
                 1. construct collating iterator over all SSTs
                 2. resolve every column in the CF
                    - local counts: sum
                    - remote counts: keep max
                 3. write out resolved CF



III: Eventual Consistency




Eventual Consistency
       Read Repair


       coordinator / replica
       read repair:
                 1. calculate resolved (superset) CF
                    - resolve every column (local: sum, remote: max)
                 2. return resolved CF to client




Eventual Consistency
       Read Repair

       coordinator / replica
       read repair:
                 1. calculate repair CF for each replica
                    - calculate diff CF between resolved and received
                    - modify columns to remove target replica’s counts
                 2. send repair CF to each replica
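
A per-replica repair can be sketched as a diff between the resolved context and what that replica returned, with the target replica's own count stripped out. The `repair_for` helper and the sample contexts are hypothetical, and a single column's context stands in for a whole CF.

```python
def repair_for(resolved, received, target_id):
    """Entries the target replica is missing or behind on, minus its own count."""
    seen = dict(received)
    repair = []
    for replica_id, count in resolved:
        if replica_id == target_id:
            continue  # never send a replica its own count back
        if replica_id not in seen or seen[replica_id] < count:
            repair.append((replica_id, count))
    return repair


resolved = [("A", 10), ("B", 5), ("C", 4)]

received_from_b = [("A", 10), ("B", 5), ("C", 1)]
received_from_c = [("A", 8), ("C", 4)]

repair_b = repair_for(resolved, received_from_b, "B")
repair_c = repair_for(resolved, received_from_c, "C")
```

Replica B only needs the newer count for C, while replica C needs the counts for A and B; neither repair carries the target's own entry.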



Eventual Consistency
       Anti-Entropy Service


       sending replica
       AES:
                 1. follow normal AES code path
                    - calculate repair SST based on shared ranges
                    - send repair SST



Eventual Consistency
       Anti-Entropy Service

       receiving replica
       AES:
                 1. post-process streamed SST
                    - re-build streamed SST
                    - note: strip out local replica’s counts
                 2. remove temporary descriptor
                 3. add to SSTableTracker



Questions?




More Information
       Issues:
       #580: Vector Clocks
       #1072: Distributed Counters

       Related Work:
       Helland and Campbell, Building on Quicksand, CIDR (2009),
       Sections 5 & 6.


       My email address:
       kakugawa@gmail.com



Distributed Counters in Cassandra (Cassandra Summit 2010)

  • 1. Distributed Counters in Cassandra Friday, August 13, 2010
  • 2. I: Goal; II: Design; III: Implementation.
  • 3. I: Goal.
  • 4. Goal: Low Latency, Highly Available Counters.
  • 5. II: Design.
  • 6. I: Traditional Counter Design; II: Abstract Strategy; III: Distributed Counter Design.
  • 7. Design. I: Traditional Counter Design.
  • 8. Traditional Counter Design. Atomic Counters: 1. single machine; 2. one order of execution; 3. strongly consistent.
  • 9. Traditional Counter Design. Problems: 1. SPOF / single master; 2. high latency; 3. manually sharded.
  • 10. Traditional Counter Design. Question: What constraints can we relax?
  • 11. Design. II: Abstract Strategy.
  • 12. Abstract Strategy. Constraints to Relax: 1. one order of execution; 2. strong consistency.
  • 13. Abstract Strategy. Relax: One Order of Execution. Commutative operation: operations must be re-orderable.
  • 14. Abstract Strategy. Relax: Strong Consistency. Partitioned work: each op must occur once; unique partition identifier. Idempotent repair: recognize ops from other partitions.
  • 15. Design. III: Distributed Counter Design.
  • 16. Distributed Counter Design. Requirements: 1. commutative operation; 2. partitioned work; 3. idempotent repair.
  • 17. Distributed Counter Design. Commutative Operation. Addition: commutative operation; sum ops performed by all replicas; a + b = b + a.
  • 18. Distributed Counter Design. Partitioned Work. Each op assigned to a replica: every replica sums all of its ops.
  • 19. Distributed Counter Design. Idempotent Repair. Save counts from remote replicas: keep highest count seen. Prevent multiple execution: do not transfer the target replica’s count.
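
The three requirements on slides 16 to 19 can be illustrated with a minimal Python sketch. It is not the Cassandra implementation: the function names `apply_op`, `repair`, and `total` are hypothetical, counts live in a plain dict keyed by replica id, and deltas are assumed to be non-negative so that the highest count seen is also the most recent one.

```python
# Hypothetical model of slides 16-19: one running count per replica id.
# Assumes deltas are non-negative, so "highest count seen" is also the newest.

def apply_op(counts, replica_id, delta):
    """Partitioned work: the op is summed into the count of its assigned replica."""
    updated = dict(counts)
    updated[replica_id] = updated.get(replica_id, 0) + delta
    return updated


def repair(local_id, local_counts, remote_counts):
    """Idempotent repair: keep the highest count seen for each remote replica.

    The local replica's own entry is never taken from the remote side.
    """
    merged = dict(local_counts)
    for replica_id, count in remote_counts.items():
        if replica_id == local_id:
            continue
        merged[replica_id] = max(merged.get(replica_id, 0), count)
    return merged


def total(counts):
    """Counter value: addition is commutative, so summation order is irrelevant."""
    return sum(counts.values())


replica_a = apply_op(apply_op({}, "A", 3), "A", 2)  # {"A": 5}
replica_b = apply_op({}, "B", 4)                    # {"B": 4}

once = repair("A", replica_a, replica_b)
twice = repair("A", once, replica_b)  # replaying the same repair changes nothing
```

Because the merge takes a maximum rather than adding, a repair message that is delivered twice or out of order leaves the state unchanged, which is the property the slides call idempotent repair.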
  • 20. III: Implementation.
  • 21. I: Data Structure; II: Single Node; III: Eventual Consistency.
  • 22. I: Data Structure.
  • 23. Data Structure. Requirements. Local counts: incrementally update. Remote counts: independently track partitions.
  • 24. Data Structure. Context Format. List of (replica id, count) tuples: [(replica A, count), (replica B, count), ...]
  • 25. Data Structure. Context Mutations. Local write: sum local count and write delta. Note: memtable.
  • 26. Data Structure. Context Mutations. Remote repair: for each replica, keep highest count seen (local or from repair).
  • 27. II: Single Node.
  • 28. Single Node. Write Path, client: 1. construct column (value: delta, big-endian long; clock: empty); 2. thrift: insert / batch_mutate.
  • 29. Single Node. Write Path, coordinator: 1. choose partition (choose target replica; requirement: ConsistencyLevel.ONE); 2. construct clock (context format: [(target replica id, count delta)]).
  • 30. Single Node. Write Path, target replica, insert: 1. memtable does not contain column; 2. insert column into memtable.
  • 31. Single Node. Write Path, target replica, update: 1. memtable contains column; 2. retrieve existing column; 3. create new column (context: sum local count w/ delta from write); 4. replace column in ConcurrentSkipListMap; 5. if failed to replace column, go to step 2.
  • 32. Single Node. Write Path, Interesting Note: memtables (MTs) are serialized to SSTables (SSTs) as-is; each SST encapsulates the updates made while it was an MT; the local count total must be aggregated across the MT and all SSTs.
  • 33. Single Node. Read Path, target replica, read: 1. construct collating iterator over a frozen snapshot of the MT and all relevant SSTs; 2. resolve column (local counts: sum; remote counts: keep max); 3. construct value (sum local and remote counts; big-endian long).
  • 34. Single Node. Compaction, replica compaction: 1. construct collating iterator over all SSTs; 2. resolve every column in the column family (CF) (local counts: sum; remote counts: keep max); 3. write out resolved CF.
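
The resolve step shared by the read path and compaction (slides 32 to 34) can be sketched as follows. This is an illustration under assumptions, not Cassandra's code: `resolve`, `value`, and `encode_value` are hypothetical names, each context is a list of (replica id, count) tuples as on slide 24, and each fragment stands in for the column as stored in the memtable snapshot or in one SSTable.

```python
import struct


def resolve(local_id, fragments):
    """Merge the context fragments that one replica holds for a single column.

    Local counts are partial sums spread across the memtable and SSTables,
    so they are added. Remote counts are snapshots, so the highest one wins.
    """
    resolved = {}
    for context in fragments:
        for replica_id, count in context:
            if replica_id not in resolved:
                resolved[replica_id] = count
            elif replica_id == local_id:
                resolved[replica_id] += count
            else:
                resolved[replica_id] = max(resolved[replica_id], count)
    return sorted(resolved.items())


def value(context):
    """Counter value: the sum of the local count and all remote counts."""
    return sum(count for _, count in context)


def encode_value(total):
    """Encode the value as a big-endian signed 64-bit integer (a Java long)."""
    return struct.pack(">q", total)


memtable = [("A", 2)]
sstable_1 = [("A", 3), ("B", 4)]
sstable_2 = [("B", 7), ("C", 1)]

context = resolve("A", [memtable, sstable_1, sstable_2])
# context == [("A", 5), ("B", 7), ("C", 1)] and value(context) == 13
```

Compaction would apply the same rule and write the resolved context back out, so that later reads have fewer fragments to merge.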
  • 35. III: Eventual Consistency.
  • 36. Eventual Consistency. Read Repair, coordinator / replica: 1. calculate resolved (superset) CF: resolve every column (local: sum, remote: max); 2. return resolved CF to client.
  • 37. Eventual Consistency. Read Repair, coordinator / replica: 1. calculate repair CF for each replica (calculate diff CF between resolved and received; modify columns to remove target replica’s counts); 2. send repair CF to each replica.
  • 38. Eventual Consistency. Anti-Entropy Service (AES), sending replica: 1. follow normal AES code path (calculate repair SST based on shared ranges; send repair SST).
  • 39. Eventual Consistency. Anti-Entropy Service (AES), receiving replica: 1. post-process streamed SST (re-build streamed SST; note: strip out local replica’s counts); 2. remove temporary descriptor; 3. add to SSTableTracker.
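
The read repair steps on slides 36 and 37 can be sketched as below. The sketch makes a simplifying assumption that the slides do not state: every replica's response is already fully resolved, so the coordinator only needs a per-replica maximum to build the superset. The names `superset` and `repair_for` are hypothetical.

```python
def superset(responses):
    """Resolved context: the highest count reported for each replica id."""
    merged = {}
    for context in responses.values():
        for replica_id, count in context.items():
            merged[replica_id] = max(merged.get(replica_id, count), count)
    return merged


def repair_for(target_id, resolved, received):
    """Repair context for one replica: the entries it is missing or has stale.

    The target's own count is always stripped. A replica sums its local
    counts, so sending its own count back would apply those ops twice.
    """
    return {
        replica_id: count
        for replica_id, count in resolved.items()
        if replica_id != target_id and received.get(replica_id) != count
    }


# Contexts returned by three replicas for the same column.
responses = {
    "A": {"A": 5, "B": 4},
    "B": {"B": 7, "A": 3},
    "C": {"C": 1},
}

resolved = superset(responses)
repairs = {rid: repair_for(rid, resolved, ctx) for rid, ctx in responses.items()}
```

The same stripping rule appears on the receiving side of the Anti-Entropy Service on slide 39, where the streamed SSTable is rebuilt without the local replica's counts.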
  • 40. Questions?
  • 41. More Information. Issues: #580: Vector Clocks; #1072: Distributed Counters. Related Work: Helland and Campbell, Building on Quicksand, CIDR (2009), Sections 5 & 6. My email address: kakugawa@gmail.com