Apache Cassandra 
Philip Thompson 
Software Engineer 
DataStax 
©2014 DataStax. Do not distribute without consent. 
Who I am 
• Philip Thompson 
• Software Engineer at DataStax 
• Contributor to Apache Cassandra 
• A maintainer of CCM, the Cassandra Cluster Manager
Apache Cassandra™ 
• Apache Cassandra™ is a massively scalable, open source, NoSQL, distributed database built for modern, mission-critical online applications.
• Written in Java; a hybrid of Amazon Dynamo and Google Bigtable
• Masterless, with no single point of failure
• Distributed and data centre aware
• Designed for 100% uptime
• Predictable scaling
http://techblog.netflix.com/2012/07/lessons-netflix-learned-from-aws-storm.html 
Cluster Architecture 
Data Distribution 
[Figure: token ring labelled 0, 25, 50, 75]
Murmur3_Hash_Function(Partition Key) >> Token
Cassandra - More than one server 
• All nodes participate in a cluster
• Shared nothing
• Add or remove as needed
• More capacity? Add a server
• Each node owns a number of tokens
• Tokens denote a range of keys
• 4 nodes? -> Key range/4
• Each node owns 1/4 of the data
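The token ownership described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token space is shrunk to 0..99 to match the ring labels, MD5 stands in for Murmur3 (which is not in the standard library), and the node names are hypothetical.

```python
import bisect
import hashlib

RING_SIZE = 100  # toy token space 0..99 (assumption; the real token range is far larger)
NODE_TOKENS = {0: "node-a", 25: "node-b", 50: "node-c", 75: "node-d"}  # hypothetical names
SORTED_TOKENS = sorted(NODE_TOKENS)


def token_for(partition_key: str) -> int:
    """Hash a partition key to a token. MD5 is a stand-in for Murmur3 here."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % RING_SIZE


def owner_of(token: int) -> str:
    """A node owns the range (previous node's token, its own token]."""
    idx = bisect.bisect_left(SORTED_TOKENS, token)
    if idx == len(SORTED_TOKENS):  # past the highest token: wrap to the first node
        idx = 0
    return NODE_TOKENS[SORTED_TOKENS[idx]]


def owner_of_key(partition_key: str) -> str:
    return owner_of(token_for(partition_key))
```

With four evenly spaced tokens, every node ends up owning a quarter of the toy token space.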
Cassandra - Locally Distributed 
• Client writes to any node
• Node coordinates with others
• Data replicated in parallel
• Replication factor (RF): how many copies of your data?
• RF = 3 here
• With four nodes, each node stores 3/4 of the cluster's total data.
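A rough sketch of how RF = 3 leads to each of four nodes holding 3/4 of the data. It assumes the simplest placement rule (the owner of the token plus the next nodes clockwise on the ring), a toy 0..99 token space, and hypothetical node names.

```python
import bisect
from collections import Counter

RING_SIZE = 100                      # toy token space (assumption)
TOKENS = [0, 25, 50, 75]             # one token per node, sorted
NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical names


def replicas_for(token: int, rf: int) -> list:
    """Return the owner of the token plus the next rf-1 nodes clockwise."""
    rf = min(rf, len(NODES))         # cannot place more copies than there are nodes
    start = bisect.bisect_left(TOKENS, token) % len(TOKENS)
    return [NODES[(start + i) % len(NODES)] for i in range(rf)]


def share_per_node(rf: int) -> dict:
    """Fraction of the token space that each node stores a copy of."""
    counts = Counter()
    for token in range(RING_SIZE):
        counts.update(replicas_for(token, rf))
    return {node: counts[node] / RING_SIZE for node in NODES}
```

Calling `share_per_node(3)` gives 0.75 for every node, and `share_per_node(1)` gives 0.25.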
Cassandra - Geographically Distributed 
• Client writes to its local datacenter
• Data syncs across the WAN
• Replication factor is set per datacenter (DC)
• Single coordinator
Cassandra - Replication Factor 
• Replication factor (RF): how many copies of your data?
• Replication factor is set per keyspace
• Can be altered by the operator
• RF = 3
Cassandra - Consistency 
• Consistency Level (CL)
• Client specifies per read or write
• ALL = all replicas ack
• QUORUM = a majority (more than 50%) of replicas ack
• LOCAL_QUORUM = a majority of replicas in the local DC ack
• ONE = only one replica acks
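The consistency levels listed above translate into a number of required acknowledgements. A minimal sketch, assuming the replication factor is given per datacenter as a plain dictionary (the function and argument names are illustrative, not a driver API):

```python
def quorum(replica_count: int) -> int:
    """Smallest number of replicas that is a strict majority."""
    return replica_count // 2 + 1


def required_acks(level: str, rf_per_dc: dict, local_dc: str) -> int:
    """Number of replica acks a coordinator waits for at a given level."""
    total = sum(rf_per_dc.values())
    if level == "ALL":
        return total
    if level == "QUORUM":
        return quorum(total)
    if level == "LOCAL_QUORUM":
        return quorum(rf_per_dc[local_dc])
    if level == "ONE":
        return 1
    raise ValueError(f"unsupported consistency level: {level}")
```

For example, with RF = 3 in each of two datacenters, QUORUM needs 4 acks across the cluster while LOCAL_QUORUM needs only 2 from the local datacenter.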
Cassandra - Transparent to the application 
• A single node failure shouldn't cause a request to fail
• Replication Factor + Consistency Level = Success
• This example:
  • RF = 3
  • CL = QUORUM
  • A majority of replicas ack, so we are good!
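The arithmetic behind this slide can be checked with a tiny sketch. It assumes a request succeeds whenever the number of live replicas is at least the number of acks the consistency level requires; the helper names are illustrative.

```python
def quorum(rf: int) -> int:
    """Strict majority of rf replicas."""
    return rf // 2 + 1


def request_succeeds(rf: int, acks_needed: int, replicas_down: int) -> bool:
    """True if enough replicas remain alive to satisfy the consistency level."""
    return rf - replicas_down >= acks_needed


def tolerated_failures(rf: int, acks_needed: int) -> int:
    """How many replicas can be down before requests start failing."""
    return rf - acks_needed
```

With RF = 3 and QUORUM, `tolerated_failures(3, quorum(3))` is 1: one replica can be down and requests still succeed, whereas ALL tolerates none.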
Cassandra - Scaling 
• Take a cluster of four nodes 
• Where does the fifth node go? 
• Rebalancing is costly 
[Figure: token ring labelled 0, 25, 50, 75]
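Why the fifth node is awkward can be seen by computing ownership on a toy ring. The sketch assumes one token per node, a 0..99 token space, and an arbitrary choice of token 87 for the new node.

```python
RING_SIZE = 100  # toy token space (assumption)


def ownership(tokens: list) -> dict:
    """Share of the ring per token; each token owns (previous token, token]."""
    ordered = sorted(tokens)
    shares = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]  # for i == 0 this wraps to the highest token
        width = (tok - prev) % RING_SIZE or RING_SIZE
        shares[tok] = width / RING_SIZE
    return shares


def balanced_tokens(node_count: int) -> list:
    """Evenly spaced tokens for a given number of nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def nodes_to_move(old_tokens: list, new_tokens: list) -> int:
    """Existing nodes whose token must change to reach the new layout."""
    return len(set(old_tokens) - set(new_tokens))
```

Dropping the new node in at token 87 leaves three nodes at 25% each and two at roughly half that, while an even five-way split (`balanced_tokens(5)`) means three of the four existing nodes must change tokens and stream data.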
Gossip 
• Manages cluster state 
• Nodes up/down 
• Nodes joining/leaving 
• Decentralized 
• “Heartbeat” every second 
• Every node contacts 1-3 other nodes
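A toy simulation of the gossip behaviour described above. It is a simplification, not Cassandra's actual protocol: each round stands for one second, every node bumps its own heartbeat counter, and then swaps state with 1 to 3 randomly chosen peers. All names are hypothetical.

```python
import random


def gossip_round(views: dict, rng: random.Random) -> None:
    """One round: bump own heartbeat, then merge state with 1-3 random peers."""
    nodes = list(views)
    for node in nodes:
        views[node][node] = views[node].get(node, 0) + 1
        others = [n for n in nodes if n != node]
        peers = rng.sample(others, k=rng.randint(1, min(3, len(others))))
        for peer in peers:
            known = set(views[node]) | set(views[peer])
            # Keep the highest heartbeat seen for every known node.
            merged = {n: max(views[node].get(n, 0), views[peer].get(n, 0)) for n in known}
            views[node] = dict(merged)
            views[peer] = dict(merged)


def rounds_until_converged(node_count: int = 8, seed: int = 42, max_rounds: int = 100):
    """Rounds until every node has heard of every other node, or None."""
    rng = random.Random(seed)
    names = [f"node-{i}" for i in range(node_count)]
    views = {name: {name: 0} for name in names}  # each node starts knowing only itself
    for round_no in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(len(view) == node_count for view in views.values()):
            return round_no
    return None
```

Even though no node talks to more than three peers per round, knowledge of the whole cluster spreads within a handful of rounds, which is why no central coordinator is needed.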
Snitch 
• Responsible for determining cluster topology 
• Datacenter awareness 
• Tracks node responsiveness 
• Many snitches provided out of the box 
  • SimpleSnitch
  • GossipingPropertyFileSnitch (recommended for production)
  • Ec2Snitch and Ec2MultiRegionSnitch
    • For use with AWS
    • Comparable GCE snitch has just been added
• Custom snitches can be added
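To make the snitch's role concrete, here is a deliberately simplified sketch of ordering replicas by proximity: same rack first, then same datacenter, then everything else, with recent latency as the tie-breaker. This is an illustration of the idea, not Cassandra's implementation; the `Endpoint` type and `latency_ms` field are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    address: str
    datacenter: str
    rack: str
    latency_ms: float  # recently observed latency (hypothetical measurement)


def sort_by_proximity(local: Endpoint, replicas: list) -> list:
    """Order replicas from 'closest' to 'farthest' as seen from the local node."""

    def rank(ep: Endpoint):
        if ep.datacenter == local.datacenter and ep.rack == local.rack:
            tier = 0  # same rack
        elif ep.datacenter == local.datacenter:
            tier = 1  # same datacenter, different rack
        else:
            tier = 2  # remote datacenter
        return (tier, ep.latency_ms)

    return sorted(replicas, key=rank)
```

A coordinator using an ordering like this would prefer a slower replica in its own datacenter over a faster one across the WAN.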
Anti-Entropy - Read Repair
Anti-Entropy - Hinted Handoff 
• Three-hour hint window (default)
• Hints are replayed when the node is restored
• Stored in the system.hints table on the coordinator
• Cassandra does not copy Dynamo's "sloppy quorum"
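A minimal sketch of hinted handoff from the coordinator's point of view, assuming time is passed in as seconds and that hints stop being written once a replica has been down longer than the three-hour window. The class and method names are illustrative only.

```python
HINT_WINDOW_SECONDS = 3 * 60 * 60  # the three-hour window from the slide


class Coordinator:
    def __init__(self):
        self.hints = {}       # replica -> mutations waiting to be replayed
        self.down_since = {}  # replica -> time it was first seen down

    def write(self, replica: str, mutation: str, replica_is_up: bool, now: float) -> str:
        if replica_is_up:
            self.down_since.pop(replica, None)
            return "delivered"
        first_down = self.down_since.setdefault(replica, now)
        if now - first_down > HINT_WINDOW_SECONDS:
            return "dropped"  # outside the window: only repair can fix this replica
        self.hints.setdefault(replica, []).append(mutation)
        return "hinted"

    def replay(self, replica: str) -> list:
        """Called when the replica comes back: hand over and forget its hints."""
        self.down_since.pop(replica, None)
        return self.hints.pop(replica, [])
```

Writes that arrive after the window has passed are not hinted, which is one reason regular repair is still required.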
Anti-Entropy - Repair 
• nodetool repair
• Uses Merkle trees for data comparison
• Should be run weekly
• Cassandra 2.1 has drastically improved repair times, thanks to incremental repair
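The Merkle tree comparison can be sketched as follows. This is a generic illustration under assumptions: each leaf is the serialized content of one token range, the number of leaves is a power of two, and SHA-256 is used as the hash. It does not mirror Cassandra's internal tree layout.

```python
import hashlib


def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves: list) -> list:
    """Return a list of levels: level 0 holds leaf hashes, the last level the root."""
    levels = [[_hash(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([_hash(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def differing_ranges(tree_a: list, tree_b: list) -> list:
    """Walk down from the root, descending only into subtrees whose hashes differ."""
    top = len(tree_a) - 1
    if tree_a[top][0] == tree_b[top][0]:
        return []  # identical roots: replicas agree, nothing to stream
    suspects = [0]
    for level in range(top - 1, -1, -1):
        next_suspects = []
        for idx in suspects:
            for child in (2 * idx, 2 * idx + 1):
                if tree_a[level][child] != tree_b[level][child]:
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects  # indices of the leaf ranges that need repair
```

Two replicas only need to exchange hashes; matching subtrees are skipped entirely, so only the ranges that actually differ are streamed.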
Node Architecture 
Write Path 
[Diagram labels: Write, commit log, Memtable, SSTable; Memory vs. Disk]
• By default, the commit log is fsynced to disk every 10 s
• This can be configured in cassandra.yaml (commitlog_sync and commitlog_sync_period_in_ms)
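The write path (commit log, then memtable, then SSTable) can be mimicked with plain Python containers. A minimal sketch, assuming a memtable that flushes after a fixed number of keys; real flush triggers and on-disk formats are more involved, and the class name is hypothetical.

```python
class ToyNode:
    def __init__(self, memtable_limit: int = 3):
        self.commit_log = []  # append-only; lives on disk in a real node
        self.memtable = {}    # in-memory, holds the most recent writes
        self.sstables = []    # immutable sorted tables; on disk in a real node
        self.memtable_limit = memtable_limit

    def write(self, key: str, value: str) -> None:
        self.commit_log.append((key, value))  # 1. record the write for durability
        self.memtable[key] = value            # 2. apply it in memory
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 3. spill to an SSTable when full

    def flush(self) -> None:
        """Write the memtable out as a sorted, immutable table and start afresh."""
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs replaying after a crash
```

Writes never modify an existing SSTable; they only append to the log and update memory, which is what keeps the write path fast.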
Read Path 
[Diagram labels: Read, Memtable, SSTable, SSTable; Memory vs. Disk]
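A read has to consult the memtable and every SSTable that may hold the key, then keep the newest value. A minimal sketch, assuming each source maps a key to a `(write_timestamp, value)` pair; the function name is illustrative.

```python
def read(key: str, memtable: dict, sstables: list):
    """Return the value with the highest write timestamp, or None if absent."""
    candidates = [source[key] for source in [memtable, *sstables] if key in source]
    if not candidates:
        return None
    timestamp, value = max(candidates, key=lambda pair: pair[0])
    return value
```

Because a key can be spread over several SSTables, reads tend to get slower as SSTables accumulate, which is the problem compaction addresses.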
Compaction
Debugging your data model 
• Tracing 
cqlsh> tracing on; 
Now tracing requests. 
cqlsh:foo> INSERT INTO test (a, b) VALUES (1, 'example'); 
Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9 
activity | timestamp | source | source_elapsed 
-------------------------------------+--------------+-----------+---------------- 
execute_cql3_query | 00:02:37,015 | 127.0.0.1 | 0 
Parsing statement | 00:02:37,015 | 127.0.0.1 | 81 
Preparing statement | 00:02:37,015 | 127.0.0.1 | 273 
Determining replicas for mutation | 00:02:37,015 | 127.0.0.1 | 540 
Sending message to /127.0.0.2 | 00:02:37,015 | 127.0.0.1 | 779 
Messsage received from /127.0.0.1 | 00:02:37,016 | 127.0.0.2 | 63 
Applying mutation | 00:02:37,016 | 127.0.0.2 | 220 
Acquiring switchLock | 00:02:37,016 | 127.0.0.2 | 250 
Appending to commitlog | 00:02:37,016 | 127.0.0.2 | 277 
Adding to memtable | 00:02:37,016 | 127.0.0.2 | 378 
Enqueuing response to /127.0.0.1 | 00:02:37,016 | 127.0.0.2 | 710 
Sending message to /127.0.0.1 | 00:02:37,016 | 127.0.0.2 | 888 
Messsage received from /127.0.0.2 | 00:02:37,017 | 127.0.0.1 | 2334 
Processing response from /127.0.0.2 | 00:02:37,017 | 127.0.0.1 | 2550 
Request complete | 00:02:37,017 | 127.0.0.1 | 2581
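The source_elapsed column is the time in microseconds since the request started on that source node, so per-step cost is the difference between consecutive rows from the same source. The sketch below parses a few rows copied from the trace above (pipe-separated, as cqlsh prints them) and finds the slowest step; the function names are illustrative.

```python
TRACE = """\
Sending message to /127.0.0.2 | 00:02:37,015 | 127.0.0.1 | 779
Messsage received from /127.0.0.1 | 00:02:37,016 | 127.0.0.2 | 63
Applying mutation | 00:02:37,016 | 127.0.0.2 | 220
Appending to commitlog | 00:02:37,016 | 127.0.0.2 | 277
Adding to memtable | 00:02:37,016 | 127.0.0.2 | 378
Messsage received from /127.0.0.2 | 00:02:37,017 | 127.0.0.1 | 2334
"""


def parse_trace(text: str) -> list:
    """Turn cqlsh trace rows into (activity, source, elapsed_microseconds) tuples."""
    events = []
    for line in text.strip().splitlines():
        activity, _timestamp, source, elapsed = [cell.strip() for cell in line.split("|")]
        events.append((activity, source, int(elapsed)))
    return events


def step_durations(events: list) -> list:
    """Time spent since the previous event on the same source node."""
    last_seen = {}
    durations = []
    for activity, source, elapsed in events:
        durations.append((activity, source, elapsed - last_seen.get(source, 0)))
        last_seen[source] = elapsed
    return durations


def slowest_step(events: list) -> tuple:
    return max(step_durations(events), key=lambda step: step[2])
```

For these rows the largest gap is on the coordinator, between sending the mutation and receiving the replica's response, which points at the network round trip rather than the local write.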
Nodetool 
• Command-line interface for monitoring Cassandra and performing routine database operations
• Commands for viewing detailed metrics for tables, server metrics, and compaction statistics:
  • cfstats: statistics for each table and keyspace
  • cfhistograms: statistics about a table, including read/write latency, row size, column count, and number of SSTables
  • netstats: statistics about network operations and connections
  • tpstats: statistics about the number of active, pending, and completed tasks for each stage of Cassandra operations by thread pool
Try it out 
Cassandra 
• Download from source: 
• git clone git://git.apache.org/cassandra.git 
• Packaged install and tarballs available: 
• http://www.datastax.com/documentation/cassandra/2.1/cassandra/install/install_cassandraTOC.html
CCM 
• CCM - Cassandra Cluster Manager 
• https://github.com/pcmanus/ccm 
• Warning: not lightweight
• Example: 
• ccm create test -v 2.0.1 
• ccm populate -n 3 
• ccm start
Clients 
• cqlsh
• Bundled with Cassandra 
• Drivers 
• java: https://github.com/datastax/java-driver 
• python: https://github.com/datastax/python-driver 
• .net: https://github.com/datastax/csharp-driver 
• and more: http://www.datastax.com/download/clientdrivers 
• Ruby, C/C++, NodeJS
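As a rough idea of what using one of these drivers looks like, here is a sketch with the DataStax Python driver that repeats the earlier INSERT at consistency level QUORUM. It assumes the driver is installed (`pip install cassandra-driver`), that a cluster is reachable at the given contact points, and that the keyspace already contains the `test` table; the function name is hypothetical.

```python
def insert_with_quorum(contact_points, keyspace):
    """Insert one row into the 'test' table at consistency level QUORUM."""
    # Imported inside the function so the sketch can be read without the driver installed.
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    cluster = Cluster(contact_points)
    try:
        session = cluster.connect(keyspace)
        statement = SimpleStatement(
            "INSERT INTO test (a, b) VALUES (%s, %s)",
            consistency_level=ConsistencyLevel.QUORUM,  # chosen per statement
        )
        session.execute(statement, (1, "example"))
    finally:
        cluster.shutdown()
```

Against a local three-node CCM cluster this could be called as `insert_with_quorum(["127.0.0.1"], "foo")`, assuming a keyspace named `foo` as in the tracing example.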
Get Help 
• IRC: #cassandra on freenode 
• Mailing Lists 
• Subscribe at cassandra.apache.org 
• Stack Overflow 
• DataStax Docs 
• http://www.datastax.com/docs 
Questions? 
HIRE A HACKER FOR CHEATING HUSBAND/WIFE)
josephinedrea942
 
Mobile App Development Company in Noida - Drona Infotech.
Mobile App Development Company in Noida - Drona Infotech.Mobile App Development Company in Noida - Drona Infotech.
Mobile App Development Company in Noida - Drona Infotech.
Mobile App Development Company in Noida - Drona Infotech
 
Vip Girls Call ServiCe Hyderabad 0000000000 Pooja Best High Class Hyderabad A...
Vip Girls Call ServiCe Hyderabad 0000000000 Pooja Best High Class Hyderabad A...Vip Girls Call ServiCe Hyderabad 0000000000 Pooja Best High Class Hyderabad A...
Vip Girls Call ServiCe Hyderabad 0000000000 Pooja Best High Class Hyderabad A...
ashiklo9823
 
Verified Girls Call Mumbai 👀 9820252231 👀 Cash Payment With Room DeliveryDeli...
Verified Girls Call Mumbai 👀 9820252231 👀 Cash Payment With Room DeliveryDeli...Verified Girls Call Mumbai 👀 9820252231 👀 Cash Payment With Room DeliveryDeli...
Verified Girls Call Mumbai 👀 9820252231 👀 Cash Payment With Room DeliveryDeli...
87tomato
 
Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...
Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...
Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...
kiara pandey
 
High Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 ...
High Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 ...High Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 ...
High Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 ...
singhlata50dh
 
A Step-by-Step Guide to Selecting the Right Automated Software Testing Tools.pdf
A Step-by-Step Guide to Selecting the Right Automated Software Testing Tools.pdfA Step-by-Step Guide to Selecting the Right Automated Software Testing Tools.pdf
A Step-by-Step Guide to Selecting the Right Automated Software Testing Tools.pdf
kalichargn70th171
 
AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.
AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.
AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.
Srinivas Dukka
 
Amadeus Travel API, Amadeus Booking API, Amadeus GDS
Amadeus Travel API, Amadeus Booking API, Amadeus GDSAmadeus Travel API, Amadeus Booking API, Amadeus GDS
Amadeus Travel API, Amadeus Booking API, Amadeus GDS
aadhiyaeliza
 
NYGGS 360: A Complete ERP for Construction Innovation
NYGGS 360: A Complete ERP for Construction InnovationNYGGS 360: A Complete ERP for Construction Innovation
NYGGS 360: A Complete ERP for Construction Innovation
NYGGS Construction ERP Software
 
Girls Call Jogeshwari 9967584737 Provide Best And Top Girl Service And No1 in...
Girls Call Jogeshwari 9967584737 Provide Best And Top Girl Service And No1 in...Girls Call Jogeshwari 9967584737 Provide Best And Top Girl Service And No1 in...
Girls Call Jogeshwari 9967584737 Provide Best And Top Girl Service And No1 in...
simran hot girls
 

Recently uploaded (20)

Comprehensive Vulnerability Assessments Process _ Aardwolf Security.docx
Comprehensive Vulnerability Assessments Process _ Aardwolf Security.docxComprehensive Vulnerability Assessments Process _ Aardwolf Security.docx
Comprehensive Vulnerability Assessments Process _ Aardwolf Security.docx
 
Cisco Live Announcements: New ThousandEyes Release Highlights - July 2024
Cisco Live Announcements: New ThousandEyes Release Highlights - July 2024Cisco Live Announcements: New ThousandEyes Release Highlights - July 2024
Cisco Live Announcements: New ThousandEyes Release Highlights - July 2024
 
VVIP Girls Call Mumbai 9910780858 Provide Best And Top Girl Service And No1 i...
VVIP Girls Call Mumbai 9910780858 Provide Best And Top Girl Service And No1 i...VVIP Girls Call Mumbai 9910780858 Provide Best And Top Girl Service And No1 i...
VVIP Girls Call Mumbai 9910780858 Provide Best And Top Girl Service And No1 i...
 
Russian Girls Call Mumbai 🎈🔥9930687706 🔥💋🎈 Provide Best And Top Girl Service ...
Russian Girls Call Mumbai 🎈🔥9930687706 🔥💋🎈 Provide Best And Top Girl Service ...Russian Girls Call Mumbai 🎈🔥9930687706 🔥💋🎈 Provide Best And Top Girl Service ...
Russian Girls Call Mumbai 🎈🔥9930687706 🔥💋🎈 Provide Best And Top Girl Service ...
 
bangalore Girls call 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
bangalore Girls call  👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Deliverybangalore Girls call  👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
bangalore Girls call 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
 
Celebrity Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Servic...
Celebrity Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Servic...Celebrity Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Servic...
Celebrity Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Servic...
 
GT degree offer diploma Transcript
GT degree offer diploma TranscriptGT degree offer diploma Transcript
GT degree offer diploma Transcript
 
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ...
 
Independent Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class H...
Independent Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class H...Independent Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class H...
Independent Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class H...
 
HIRE A HACKER FOR CHEATING HUSBAND/WIFE)
HIRE A HACKER FOR CHEATING HUSBAND/WIFE)HIRE A HACKER FOR CHEATING HUSBAND/WIFE)
HIRE A HACKER FOR CHEATING HUSBAND/WIFE)
 
Mobile App Development Company in Noida - Drona Infotech.
Mobile App Development Company in Noida - Drona Infotech.Mobile App Development Company in Noida - Drona Infotech.
Mobile App Development Company in Noida - Drona Infotech.
 
Vip Girls Call ServiCe Hyderabad 0000000000 Pooja Best High Class Hyderabad A...
Vip Girls Call ServiCe Hyderabad 0000000000 Pooja Best High Class Hyderabad A...Vip Girls Call ServiCe Hyderabad 0000000000 Pooja Best High Class Hyderabad A...
Vip Girls Call ServiCe Hyderabad 0000000000 Pooja Best High Class Hyderabad A...
 
Verified Girls Call Mumbai 👀 9820252231 👀 Cash Payment With Room DeliveryDeli...
Verified Girls Call Mumbai 👀 9820252231 👀 Cash Payment With Room DeliveryDeli...Verified Girls Call Mumbai 👀 9820252231 👀 Cash Payment With Room DeliveryDeli...
Verified Girls Call Mumbai 👀 9820252231 👀 Cash Payment With Room DeliveryDeli...
 
Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...
Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...
Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...
 
High Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 ...
High Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 ...High Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 ...
High Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 ...
 
A Step-by-Step Guide to Selecting the Right Automated Software Testing Tools.pdf
A Step-by-Step Guide to Selecting the Right Automated Software Testing Tools.pdfA Step-by-Step Guide to Selecting the Right Automated Software Testing Tools.pdf
A Step-by-Step Guide to Selecting the Right Automated Software Testing Tools.pdf
 
AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.
AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.
AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.
 
Amadeus Travel API, Amadeus Booking API, Amadeus GDS
Amadeus Travel API, Amadeus Booking API, Amadeus GDSAmadeus Travel API, Amadeus Booking API, Amadeus GDS
Amadeus Travel API, Amadeus Booking API, Amadeus GDS
 
NYGGS 360: A Complete ERP for Construction Innovation
NYGGS 360: A Complete ERP for Construction InnovationNYGGS 360: A Complete ERP for Construction Innovation
NYGGS 360: A Complete ERP for Construction Innovation
 
Girls Call Jogeshwari 9967584737 Provide Best And Top Girl Service And No1 in...
Girls Call Jogeshwari 9967584737 Provide Best And Top Girl Service And No1 in...Girls Call Jogeshwari 9967584737 Provide Best And Top Girl Service And No1 in...
Girls Call Jogeshwari 9967584737 Provide Best And Top Girl Service And No1 in...
 

Devops kc

  • 1. Apache Cassandra • Philip Thompson, Software Engineer, DataStax • ©2014 DataStax. Do not distribute without consent.
  • 2. Who I am • Philip Thompson • Software Engineer at DataStax • Contributor to Apache Cassandra • A maintainer of CCM, the Cassandra Cluster Manager
  • 3. Apache Cassandra™ • Apache Cassandra™ is a massively scalable, open source, NoSQL, distributed database built for modern, mission-critical online applications. • Written in Java; a hybrid of Amazon Dynamo and Google Bigtable • Masterless with no single point of failure • Distributed and data centre aware • 100% uptime • Predictable scaling
  • 9. Data Distribution • Ring diagram with tokens at 0, 25, 50 and 75 • Murmur3_Hash_Function(Partition Key) >> Token
  • 10. Cassandra - More than one server • All nodes participate in a cluster • Shared nothing • Add or remove as needed • More capacity? Add a server • Each node owns a number of tokens • Tokens denote a range of keys • 4 nodes? -> Key range/4 • Each node owns 1/4 of the data
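The token ownership described on slides 9 and 10 can be sketched in a few lines. This is a minimal illustration, not Cassandra's implementation: it assumes a toy ring of 100 tokens with nodes at 0, 25, 50 and 75 as in the diagram, uses MD5 from the standard library as a stand-in for Murmur3, and the node names and helper functions are hypothetical.

```python
import bisect
import hashlib

RING_SIZE = 100  # toy token space 0..99 (assumption; the real token range is far larger)

# One token per node, matching the four-node ring diagram. Node names are made up.
NODE_TOKENS = {0: "node1", 25: "node2", 50: "node3", 75: "node4"}
SORTED_TOKENS = sorted(NODE_TOKENS)


def token_for(partition_key: str) -> int:
    """Hash a partition key to a token. MD5 is only a stand-in for Murmur3."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner_of(token: int) -> str:
    """Return the node holding the first token at or after `token`,
    wrapping around to the lowest token at the end of the ring."""
    index = bisect.bisect_left(SORTED_TOKENS, token)
    if index == len(SORTED_TOKENS):
        index = 0  # wrap around the ring
    return NODE_TOKENS[SORTED_TOKENS[index]]


def ownership_share() -> dict:
    """Fraction of the token space owned by each node."""
    counts = {}
    for token in range(RING_SIZE):
        node = owner_of(token)
        counts[node] = counts.get(node, 0) + 1
    return {node: count / RING_SIZE for node, count in counts.items()}
```

With four evenly spaced tokens, `ownership_share()` reports a quarter of the ring for each node, and `owner_of(token_for(key))` gives the node responsible for a key.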
  • 11. Cassandra - Locally Distributed • Client writes to any node • Node coordinates with others • Data replicated in parallel • Replication factor (RF): How many copies of your data? • RF = 3 here • Each node stores 3/4 of the cluster's total data
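The 3/4 figure follows from RF = 3 on four nodes. The sketch below assumes the simplest placement rule, where replicas go to the owning node and the next nodes clockwise around the ring; real placement strategies can also take racks and data centres into account. The node names are hypothetical.

```python
RING = ["node1", "node2", "node3", "node4"]  # nodes in ring (token) order
REPLICATION_FACTOR = 3


def replicas_for(primary_index: int, rf: int = REPLICATION_FACTOR) -> list:
    """Primary owner plus the next rf - 1 nodes clockwise."""
    return [RING[(primary_index + step) % len(RING)] for step in range(rf)]


def stored_fraction(node: str, rf: int = REPLICATION_FACTOR) -> float:
    """Fraction of all data held by `node`, assuming each node is the
    primary owner of an equal 1/N slice of the ring."""
    slices_held = sum(
        1 for primary in range(len(RING)) if node in replicas_for(primary, rf)
    )
    return slices_held / len(RING)
```

Each node holds its own slice plus the slices of the two nodes before it, so `stored_fraction` returns 0.75 for every node when RF = 3 and 0.25 when RF = 1.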
  • 12. Cassandra - Geographically Distributed • Client writes to the local DC • Data syncs across the WAN • Replication factor is set per DC • Single coordinator
  • 13. Cassandra - Replication Factor • Replication factor (RF): How many copies of your data? • Replication Factor is set per keyspace • Can be altered by operator • RF = 3
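Because the replication factor is a keyspace property, it is given when the keyspace is created and changed later with ALTER KEYSPACE. The helper below is a sketch that only renders the CQL text. The keyspace name `demo` and the data centre names `dc1` and `dc2` are made up, and the statement would still have to be run through cqlsh or a driver.

```python
def replication_cql(keyspace: str, rf_per_dc: dict, alter: bool = False) -> str:
    """Render a CREATE or ALTER KEYSPACE statement that uses
    NetworkTopologyStrategy with one replication factor per data centre."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {rf}" for dc, rf in rf_per_dc.items()]
    verb = "ALTER" if alter else "CREATE"
    return f"{verb} KEYSPACE {keyspace} WITH replication = {{{', '.join(parts)}}};"


# Hypothetical usage: three copies in dc1, later adding two copies in dc2.
create_statement = replication_cql("demo", {"dc1": 3})
alter_statement = replication_cql("demo", {"dc1": 3, "dc2": 2}, alter=True)
```

After raising a replication factor, a repair is normally needed so that the new replicas receive the existing data.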
  • 14. Cassandra - Consistency • Consistency Level (CL) • Client specifies per read or write • ALL = All replicas ack • QUORUM = A majority (more than 50%) of replicas ack • LOCAL_QUORUM = A majority (more than 50%) of replicas in the local DC ack • ONE = Only one replica acks
  • 15. Cassandra - Transparent to the application • A single node failure shouldn’t cause an application failure • Replication Factor + Consistency Level = Success • This example: • RF = 3 • CL = QUORUM • A majority of replicas (2 of 3) still ack, so we are good!
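The arithmetic behind slides 14 and 15 is small enough to write down. A quorum is a majority of the replicas, which for an integer replication factor is `RF // 2 + 1`. The function names below are illustrative only.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def request_succeeds(replication_factor: int, replicas_down: int, required_acks: int) -> bool:
    """True if enough replicas are still up to satisfy the consistency level."""
    return replication_factor - replicas_down >= required_acks


# The example from the slide: RF = 3, CL = QUORUM, one replica down.
survives_one_failure = request_succeeds(3, replicas_down=1, required_acks=quorum(3))
```

With RF = 3 a quorum is 2, so one replica can be down and QUORUM requests still succeed, while the same failure would cause a request at ALL (3 acks) to fail.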
  • 16. Cassandra - Scaling • Take a cluster of four nodes • Where does the fifth node go? • Rebalancing is costly • Ring diagram with tokens at 0, 25, 50 and 75
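A rough way to see why rebalancing is costly is to count how many tokens change owner when a fifth node joins. The sketch assumes a toy ring of 100 tokens and one token per node; the node names, the token 87 chosen for the bisecting case, and the helper functions are hypothetical.

```python
import bisect

RING_SIZE = 100  # toy token space 0..99 (assumption)


def evenly_spaced(node_count: int) -> dict:
    """Give node i the token i * RING_SIZE // node_count."""
    return {i * RING_SIZE // node_count: f"node{i + 1}" for i in range(node_count)}


def owner_of(token: int, node_tokens: dict) -> str:
    """Node holding the first token at or after `token`, wrapping around."""
    tokens = sorted(node_tokens)
    index = bisect.bisect_left(tokens, token) % len(tokens)
    return node_tokens[tokens[index]]


def fraction_moved(before: dict, after: dict) -> float:
    """Share of the token space whose owner differs between two layouts."""
    moved = sum(
        1 for token in range(RING_SIZE)
        if owner_of(token, before) != owner_of(token, after)
    )
    return moved / RING_SIZE


four_nodes = evenly_spaced(4)

# Option A: drop the fifth node into the middle of one existing range.
bisected = dict(four_nodes)
bisected[87] = "node5"

# Option B: move every token so that all five ranges are equal again.
rebalanced = evenly_spaced(5)
```

In this toy model, bisecting one range moves 12% of the tokens but leaves two nodes with about half the load of the others, while restoring an even layout changes the owner of 35% of the tokens, well above the 20% that the new node ends up holding.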
  • 19. Gossip • Manages cluster state • Nodes up/down • Nodes joining/leaving • Decentralized • “Heartbeat” every second • Every node contacts 1-3 other nodes
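How quickly state spreads under this kind of gossip can be seen in a toy simulation. The sketch below is an assumption-laden model rather than Cassandra's protocol: each round stands for one heartbeat interval, every node bumps its own heartbeat version and then swaps its full view with one to three randomly chosen peers, and the highest version wins.

```python
import random


def gossip_round(views: dict, rng: random.Random) -> None:
    """Run one round of gossip over `views` (node name -> {node name: heartbeat})."""
    nodes = list(views)
    for node in nodes:
        # Bump this node's own heartbeat version.
        views[node][node] = views[node].get(node, 0) + 1
        others = [name for name in nodes if name != node]
        if not others:
            continue
        peer_count = rng.randint(1, min(3, len(others)))
        for peer in rng.sample(others, peer_count):
            # Merge both views, keeping the highest heartbeat seen for each node.
            merged = dict(views[peer])
            for name, version in views[node].items():
                merged[name] = max(version, merged.get(name, 0))
            views[node] = merged
            views[peer] = dict(merged)


def rounds_until_converged(node_count: int, seed: int = 0, max_rounds: int = 100) -> int:
    """Number of rounds until every node has heard of every other node."""
    rng = random.Random(seed)
    views = {f"node{i}": {} for i in range(node_count)}
    for round_number in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(len(view) == node_count for view in views.values()):
            return round_number
    raise RuntimeError("views did not converge")
```

Calling `rounds_until_converged(50)` with different seeds gives a feel for how few rounds a decentralized exchange needs before every node knows about every other node.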
  • 20. Snitch • Responsible for determining cluster topology • Datacenter awareness • Tracks node responsiveness • Many snitches provided out of the box • SimpleSnitch • GossipingPropertyFileSnitch (recommended for production) • Ec2Snitch and Ec2MultiRegionSnitch, for use with AWS • A comparable GCE snitch has just been added • Custom snitches can be added
  • 22. Anti-Entropy - Hinted Handoff • Three-hour hint window by default • Hints are replayed when the node is restored • Stored in the system.hints table on the coordinator • Cassandra does not copy Dynamo’s “sloppy quorum”
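The hinted handoff flow can be modelled with a small coordinator class. This is a sketch under stated assumptions: times are plain seconds, the three-hour window is read as "stop recording hints once a replica has been down longer than that", and `ToyCoordinator` and the `deliver` callback are hypothetical names, not Cassandra APIs.

```python
HINT_WINDOW_SECONDS = 3 * 60 * 60  # the three-hour window from the slide


class ToyCoordinator:
    """Keeps hints for replicas that are down and replays them on recovery."""

    def __init__(self):
        self.down_since = {}  # replica -> time it was marked down
        self.hints = {}       # replica -> writes it missed, in order

    def mark_down(self, replica, now):
        self.down_since.setdefault(replica, now)

    def write(self, replica, mutation, now, deliver):
        if replica not in self.down_since:
            deliver(replica, mutation)  # replica is up: send the write directly
        elif now - self.down_since[replica] <= HINT_WINDOW_SECONDS:
            self.hints.setdefault(replica, []).append(mutation)  # store a hint
        # Otherwise the window has passed: no hint is kept, and the
        # replica has to be brought back in sync by repair.

    def mark_up(self, replica, deliver):
        """Replay stored hints once the replica is reachable again."""
        self.down_since.pop(replica, None)
        for mutation in self.hints.pop(replica, []):
            deliver(replica, mutation)
```

Writes that fall outside the window are the reason hinted handoff is paired with repair rather than replacing it.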
  • 23. Anti-Entropy - Repair • Run with nodetool repair • Uses Merkle trees for data comparison • Should be run weekly • Cassandra 2.1 has drastically improved repair times, thanks to incremental repair
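The Merkle tree comparison that repair relies on can be illustrated as follows. This is a simplified sketch: each leaf here is the hash of a single string standing in for a range of rows, the number of leaves is assumed to be a power of two, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib


def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaf_values: list) -> list:
    """Return the tree as a list of levels, from the leaves (level 0) up to
    the root (last level). Assumes len(leaf_values) is a power of two."""
    level = [_hash(value.encode("utf-8")) for value in leaf_values]
    levels = [level]
    while len(level) > 1:
        level = [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(tree_a: list, tree_b: list) -> list:
    """Walk down from the root, descending only into subtrees whose hashes differ,
    and return the indexes of the leaves that do not match."""
    suspects = [0]  # indexes at the current level, starting with the root
    for depth in range(len(tree_a) - 1, 0, -1):
        suspects = [
            child
            for index in suspects
            if tree_a[depth][index] != tree_b[depth][index]
            for child in (2 * index, 2 * index + 1)
        ]
    return [i for i in suspects if tree_a[0][i] != tree_b[0][i]]


# Two hypothetical replicas that disagree on one range.
replica_a = ["row-a", "row-b", "row-c", "row-d"]
replica_b = ["row-a", "row-b", "row-X", "row-d"]
out_of_sync = differing_leaves(build_tree(replica_a), build_tree(replica_b))
```

Replicas only need to exchange hashes, and matching subtrees are skipped entirely, so only the ranges that actually differ have to be streamed.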
  • 25. Write Path • Diagram: Write -> commit log (disk) and Memtable (memory) -> SSTable (disk)
  • 26. Write Path • By default the commit log is fsynced to disk every 10 s • This can be configured in cassandra.yaml (commitlog_sync and commitlog_sync_period_in_ms)
  • 27. Read Path • Diagram: Read -> Memtable (memory) and SSTables (disk)
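The write and read paths on slides 25 to 27 can be modelled with a toy storage engine. This is a deliberately simplified sketch: everything lives in Python lists and dicts, the memtable is flushed after a fixed number of keys, and reads return the first match from newest to oldest, whereas a real node merges results by write timestamp. The class and its names are hypothetical.

```python
class ToyStorageEngine:
    """Toy model: a commit log and memtable in front of immutable SSTables."""

    def __init__(self, memtable_limit: int = 2):
        self.commit_log = []   # append-only record of writes (stands in for the on-disk log)
        self.memtable = {}     # in-memory table holding the most recent writes
        self.sstables = []     # flushed, immutable tables; newest last
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. record the write for durability
        self.memtable[key] = value            # 2. apply it to the in-memory table
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable as a new SSTable and start a fresh memtable."""
        self.sstables.append(dict(self.memtable))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs the log

    def read(self, key):
        """Check the memtable first, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None
```

Writes never modify an existing SSTable, which is why a read may have to consult the memtable and several SSTables before it finds the latest value.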
  • 31. Debugging your data model • Tracing
      cqlsh> tracing on;
      Now tracing requests.
      cqlsh:foo> INSERT INTO test (a, b) VALUES (1, 'example');
      Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9

       activity                            | timestamp    | source    | source_elapsed
      -------------------------------------+--------------+-----------+----------------
       execute_cql3_query                  | 00:02:37,015 | 127.0.0.1 |              0
       Parsing statement                   | 00:02:37,015 | 127.0.0.1 |             81
       Preparing statement                 | 00:02:37,015 | 127.0.0.1 |            273
       Determining replicas for mutation   | 00:02:37,015 | 127.0.0.1 |            540
       Sending message to /127.0.0.2       | 00:02:37,015 | 127.0.0.1 |            779
       Messsage received from /127.0.0.1   | 00:02:37,016 | 127.0.0.2 |             63
       Applying mutation                   | 00:02:37,016 | 127.0.0.2 |            220
       Acquiring switchLock                | 00:02:37,016 | 127.0.0.2 |            250
       Appending to commitlog              | 00:02:37,016 | 127.0.0.2 |            277
       Adding to memtable                  | 00:02:37,016 | 127.0.0.2 |            378
       Enqueuing response to /127.0.0.1    | 00:02:37,016 | 127.0.0.2 |            710
       Sending message to /127.0.0.1       | 00:02:37,016 | 127.0.0.2 |            888
       Messsage received from /127.0.0.2   | 00:02:37,017 | 127.0.0.1 |           2334
       Processing response from /127.0.0.2 | 00:02:37,017 | 127.0.0.1 |           2550
       Request complete                    | 00:02:37,017 | 127.0.0.1 |           2581
  • 32. Nodetool • Command line interface for monitoring Cassandra and performing routine database operations • Commands for viewing detailed metrics for tables, server metrics, and compaction statistics: • cfstats: statistics for each table and keyspace • cfhistograms: statistics about a table, including read/write latency, row size, column count, and number of SSTables • netstats: statistics about network operations and connections • tpstats: statistics about the number of active, pending, and completed tasks for each stage of Cassandra operations by thread pool
  • 33. Try it out
  • 34. Cassandra • Download from source: • git clone git://git.apache.org/cassandra.git • Packaged installs and tarballs available: • http://www.datastax.com/documentation/cassandra/2.1/cassandra/install/install_cassandraTOC.html
  • 35. CCM • CCM - Cassandra Cluster Manager • https://github.com/pcmanus/ccm • Warning: not lightweight • Example: • ccm create test -v 2.0.1 • ccm populate -n 3 • ccm start
  • 36. Clients • cqlsh • Bundled with Cassandra • Drivers • Java: https://github.com/datastax/java-driver • Python: https://github.com/datastax/python-driver • .NET: https://github.com/datastax/csharp-driver • and more (Ruby, C/C++, Node.js): http://www.datastax.com/download/clientdrivers
  • 37. Get Help • IRC: #cassandra on freenode • Mailing Lists • Subscribe at cassandra.apache.org • Stack Overflow • DataStax Docs • http://www.datastax.com/docs