Deconstructing Cassandra
Fault tolerance at scale
Cluster architecture concepts
Cassandra node
A Cassandra node is a server within a
Cassandra ring.
A Cassandra node is independent of, but
cooperates with, other nodes within a
Cassandra ring.
Cassandra ring
A Cassandra ring is a collection of nodes that
form a complete token range.
Cassandra logical
data center
When more than one Cassandra ring exists
within a cluster, each ring is called a
Cassandra logical data center and is given a
name (for example DC1 and DC2).
Cassandra cluster
A Cassandra cluster is a collection of
Cassandra logical data centers.
A cluster could consist of a single Cassandra
ring or many Cassandra logical data centers.
In this example the cluster consists of six
Cassandra logical data centers in five physical
data centers. The two Cassandra logical data
centers in Australia sit in two different
Amazon availability zones.
Logical data centers: US1, US2, EU1, JP1, AU1 (AZ1), AU2 (AZ2)
Data distribution
Keyspace
Data in Cassandra is stored in a keyspace. A
keyspace is roughly analogous to a database in
the RDBMS world.
Each node in a Cassandra ring owns part of
the keyspace.
Tokens
Each node owns part of the keyspace via
ownership of a range of tokens.
A Cassandra ring, regardless of the number of
nodes, always forms a complete token range.
So if you had 9 nodes with evenly assigned
tokens, each node would own 1/9th of the
token range.
If you had only 3 nodes, each node would own
1/3rd of the token range.
The actual token range is fixed. With the
default Murmur3Partitioner it runs from
-2^63 to 2^63 - 1 (roughly -9.2 x 10^18 to
9.2 x 10^18).
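Illustrative token ranges for a nine-node ring (CLDC), one range per node:
0 - 9, 10 - 19, 20 - 29, 30 - 39, 40 - 49, 50 - 59, 60 - 69, 70 - 79, 80 - 89
A three-node ring (nodes 1, 2, 3) splits the same token range three ways.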
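The ownership arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the Murmur3Partitioner bounds, gives every node exactly one contiguous range, and the function name `split_token_range` is made up for this example. Real clusters often assign several smaller ranges to each node instead.

```python
# Token bounds of the Murmur3Partitioner (assumed to be the partitioner in use).
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_token_range(node_count):
    """Return one contiguous, inclusive (start, end) token range per node.

    Hypothetical helper: one range per node, as in the slide's diagram.
    """
    total = MAX_TOKEN - MIN_TOKEN + 1  # 2**64 tokens in the whole ring
    ranges = []
    start = MIN_TOKEN
    for i in range(node_count):
        # Spread the remainder over the first ranges so no token is left unowned.
        size = total // node_count + (1 if i < total % node_count else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


if __name__ == "__main__":
    for node, (start, end) in enumerate(split_token_range(9), start=1):
        print(f"node {node}: {start} .. {end}")
```

Whatever the node count, the ranges always start at the minimum token, end at the maximum token, and leave no gaps, which is what "a complete token range" means here.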
Partition keys
Partition keys (a data design concept) map to
tokens.
I may have chosen in a table design that my
partition key is “email_address”.
When we read or write to Cassandra with my
email_address (alex@site.com), Cassandra
first hashes the email address to produce a
token. With that token and the ring’s topology,
Cassandra knows which node owns my data.
Note also that my data is replicated to other
nodes. From the same token Cassandra can
work out where my replicas are.
CREATE TABLE users (
    email_address text,
    password text,
    age int,
    PRIMARY KEY (email_address)
);
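The lookup described on this slide (hash the partition key, find the owning node, then find the replicas) might look roughly like the Python sketch below. Several things here are assumptions made for illustration: MD5 from the standard library stands in for Cassandra's Murmur3 hash, each node holds a single evenly spaced token, replicas are simply the next nodes clockwise with no awareness of racks or data centers, and the names `token_for`, `build_ring` and `replicas_for` are hypothetical.

```python
import bisect
import hashlib

MIN_TOKEN = -2**63


def token_for(partition_key):
    """Hash a partition key to a signed 64-bit token.

    MD5 is a stand-in so the sketch needs only the standard library;
    it will not produce the tokens a real cluster computes.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def build_ring(node_names):
    """Give each node one evenly spaced token; a node owns tokens up to its own."""
    step = 2**64 // len(node_names)
    return sorted(
        (MIN_TOKEN + (i + 1) * step - 1, name)
        for i, name in enumerate(node_names)
    )


def replicas_for(partition_key, ring, replication_factor=3):
    """Return the owning node followed by the next nodes clockwise on the ring."""
    tokens = [token for token, _ in ring]
    # First node whose token is >= the key's token, wrapping past the last node.
    first = bisect.bisect_left(tokens, token_for(partition_key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(first + i) % len(ring)][1] for i in range(count)]


if __name__ == "__main__":
    ring = build_ring([f"node{i}" for i in range(1, 10)])
    print(replicas_for("alex@site.com", ring))
```

The point of the sketch is that any node holding the ring topology can run the same calculation and reach the same answer, so no central lookup service is needed.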
Resiliency: the battle to produce a
data storage platform with no
single point of failure
Requirements
● No single point of failure within a
Cassandra ring
● Able to handle failure of nodes within
a Cassandra ring with zero
intervention and zero impact
● Able to handle failure of Cassandra
logical data centers with zero
intervention and zero impact
● Able to handle failure of physical
data centers with zero intervention and
zero impact
● Able to handle failure of entire
geographic locations with zero
intervention and zero impact
Ask me about 30ms, go on, dare you.
No SPOF within a
Cassandra ring
To have no SPOF within a ring, each node must
be able to survive independently of all other
nodes, but at the same time cooperate to
complete tasks.
This is not just a problem that Cassandra has
to answer; it is a wider problem of distributed
computing.
So how do you do that?
No SPOF within a
Cassandra ring
First, to have no SPOF within a ring:
● you cannot have a single point of control,
which means you cannot introduce a
controller (e.g. ZooKeeper)
● you cannot use DNS
● you cannot use shared storage (e.g.
SAN, NAS)
● data must be replicated to multiple
nodes
Node
independence
So how do you do that?
Each node needs to be able to learn the
following independently, with NO central
location to ask:
● the topology of the ring
● the distributed schema design
● the discovery of other nodes
● the changing state of other nodes
Gossip protocol
Each Cassandra node continually gossips
with other nodes about the state of the cluster:
the distributed schema design, the topology of
the ring, the latency of other nodes, the state
of other nodes and so on.
Each node runs a gossip round every second,
and state typically converges across the
cluster within seconds, even in large clusters.
The gossip protocol is extremely lightweight
from a network perspective.
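To get a feel for why gossip converges quickly, here is a toy simulation in Python. It is not Cassandra's gossip implementation: the rule that every node talks to exactly one random peer per round, the version numbers, and the names `gossip_round` and `rounds_to_converge` are all assumptions chosen to keep the sketch small.

```python
import random


def gossip_round(views, rng):
    """One round: every node exchanges state with one randomly chosen peer."""
    nodes = list(views)
    for node in nodes:
        peer = rng.choice([n for n in nodes if n != node])
        # Merge both views, keeping the highest version seen for each endpoint.
        merged = dict(views[node])
        for endpoint, version in views[peer].items():
            if version > merged.get(endpoint, -1):
                merged[endpoint] = version
        views[node] = dict(merged)
        views[peer] = dict(merged)


def rounds_to_converge(node_count, seed=0, max_rounds=1000):
    """Count rounds until every node knows about every other node."""
    if node_count < 2:
        return 0  # a single node has nobody to gossip with
    rng = random.Random(seed)
    # Each node starts out knowing only its own state, at version 1.
    views = {n: {n: 1} for n in range(node_count)}
    for round_no in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(len(view) == node_count for view in views.values()):
            return round_no
    return None  # did not converge within max_rounds


if __name__ == "__main__":
    for size in (9, 100, 1000):
        print(size, "nodes converged after", rounds_to_converge(size), "rounds")
```

Running it for a few cluster sizes shows the round count growing far more slowly than the node count, which is the property that keeps gossip cheap as a cluster grows.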
Gossip spans
logical DCs
Gossip does not just work within a single ring;
it spans all Cassandra logical data centers that
form a cluster.
So any Cassandra node in one DC is aware of
the state of nodes in all other DCs.
Summary
Each node in a Cassandra cluster is
independent but cooperative.
You could kill 8 of the 9 nodes and the single
remaining node would stay up and active,
answering for the part of the token range for
which it is authoritative.
Data replication
Replication
Replication is configured at the
keyspace level.
