SlideShare a Scribd company logo
Apache Cassandra
Harshit Daga
Software Consultant
Knoldus Software LLP
Agenda
● What is Cassandra
● Gossip communication protocol
● Cassandra- Data Model
● Cassandra- Architecture
● Reading/Writing a node
● Data consistency
Cassandra
● Cassandra is massively scalable schemaless database.
● Open source database, licensed under Apache.
● Originally, developed by Facebok for inbox search.
● Data model based upon Google’s BigTable.
● Distributed design is based upon Amazon Dynamo.
● Promoted massively by Datastax.
Gossip Communication Protocol
● Peer to peer communication protocol.
● Nodes are arranged in ring format.
● Data is replicated to multiple nodes.
● Nodes periodically exchange info. they have.
● Nodes also exchange their own info.
● Each message has its associated version.
● No master-slave concept, and hence no single point of failure.
Cassandra- Data Model
● Column data is stored as in key/value pair.
● Collection of column makes a Row.
● Column family is then becomes as collection of all rows.
● In RDBMS, each column must have some value else NULL,
but not in case of cassandra database.
Cassandra- Data Model
● Consider following example,
● Now inserting a new row:
● Above insertion would not fail.
Cassandra- Data Model
● It means, data are stored as multi-dimensional sparse array.
Cassandra- Architecture
● A ring has several nodes.
● Each node is assigned a Partition value.
● Data processing is based on the Partition Key.
● When a client makes a request to a node, it becomes the
coordinator for that request.
● The coordinator determines which node in the ring should
process upon that request.
Cassandra- Architecture
● Virtual Nodes (Vnodes)
– Responsible for assigning the partition token range.
– Tokens are automatically calculated & assigned to each
node.
– Cluster re-balancing is done automatically.
Cassandra- Architecture
● Which node gets what data is based on the partition key.
● Cassandra assigns a hash value to each partition key.
● And data gets to a node as per the hash value
Cassandra- Architecture
● How write request gets fulfilled:-
Data Replication
● Data replication
– Simple Strategy
● Used for only one cluster
– Network Topology Strategy
● Used for multiple clusters in multiple data centers.
Writing data in a Node
● Write an entry in the commit log
● Write data to memtable.
● When memtable is full, Store data on disk in SSTables.
● SSTables are immutable data structure.
● Also has a support for TTL.
Cassandra is the fastest db in concern with the write operation
Reading data from a Node
● First, checks the memtable using Bloom filter.
● If found, then data is sent as response.
● Else, fetch the data from the SSTables.
Cassandra may write many versions of the same row, then
how to identify the latest one?
Update/Delete data from Node
● Data is not immediately deleted.
● It is marked to be deleted/updated in memtables.
● This process is called tombstone.
● Tombstone, runs at configured interval of time.
● During each interval, it collects all the SSTables and updates
the marked record and discards the old SSTables.
Data Consistency
● Data is not necessarily on every node all the time.
● For maintaining consistency, no. of replicas should respond:
– ONE
– QUORUM
– ALL
● Consistency has major impact on performance.
● For strong consistency:
R + W > N
References
● O’reilly- Cassandra Definitive Guide
● https://cassandra.apache.org/doc/latest/
● http://docs.datastax.com/en/cassandra/3.0/
Thank You !!

More Related Content

What's hot

Hands-on Lab: Migrating Oracle to PostgreSQL
Hands-on Lab: Migrating Oracle to PostgreSQL Hands-on Lab: Migrating Oracle to PostgreSQL
Hands-on Lab: Migrating Oracle to PostgreSQL
Amazon Web Services
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
Alexey Grishchenko
 
Introduction to Pig
Introduction to PigIntroduction to Pig
Introduction to Pig
Prashanth Babu
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
Manuel Correa
 
Key-Value NoSQL Database
Key-Value NoSQL DatabaseKey-Value NoSQL Database
Key-Value NoSQL Database
Heman Hosainpana
 
Spark architecture
Spark architectureSpark architecture
Spark architecture
GauravBiswas9
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
Robert Sanders
 
Hadoop HDFS.ppt
Hadoop HDFS.pptHadoop HDFS.ppt
Hadoop HDFS.ppt
6535ANURAGANURAG
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
Harri Kauhanen
 
cassandra
cassandracassandra
cassandra
Akash R
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra
Nikiforos Botis
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth
Fabio Fumarola
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
DataStax Academy
 
Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
 Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit... Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
Databricks
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
Markus Klems
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
Saurav Haloi
 
Memory Management in Apache Spark
Memory Management in Apache SparkMemory Management in Apache Spark
Memory Management in Apache Spark
Databricks
 
Hadoop Map Reduce
Hadoop Map ReduceHadoop Map Reduce
Hadoop Map Reduce
VNIT-ACM Student Chapter
 
Spark SQL
Spark SQLSpark SQL
Spark SQL
Joud Khattab
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
colorant
 

What's hot (20)

Hands-on Lab: Migrating Oracle to PostgreSQL
Hands-on Lab: Migrating Oracle to PostgreSQL Hands-on Lab: Migrating Oracle to PostgreSQL
Hands-on Lab: Migrating Oracle to PostgreSQL
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
Introduction to Pig
Introduction to PigIntroduction to Pig
Introduction to Pig
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Key-Value NoSQL Database
Key-Value NoSQL DatabaseKey-Value NoSQL Database
Key-Value NoSQL Database
 
Spark architecture
Spark architectureSpark architecture
Spark architecture
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
Hadoop HDFS.ppt
Hadoop HDFS.pptHadoop HDFS.ppt
Hadoop HDFS.ppt
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
cassandra
cassandracassandra
cassandra
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
 Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit... Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Memory Management in Apache Spark
Memory Management in Apache SparkMemory Management in Apache Spark
Memory Management in Apache Spark
 
Hadoop Map Reduce
Hadoop Map ReduceHadoop Map Reduce
Hadoop Map Reduce
 
Spark SQL
Spark SQLSpark SQL
Spark SQL
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
 

Similar to Introduction to Apache Cassandra

Cassandra
CassandraCassandra
Cassandra
Carbo Kuo
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
Sean Murphy
 
On Rails with Apache Cassandra
On Rails with Apache CassandraOn Rails with Apache Cassandra
On Rails with Apache Cassandra
Stu Hood
 
An Introduction to Apache Cassandra
An Introduction to Apache CassandraAn Introduction to Apache Cassandra
An Introduction to Apache Cassandra
Saeid Zebardast
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
VitsRangannavar
 
Apache cassandra
Apache cassandraApache cassandra
Apache cassandra
Adnan Siddiqi
 
Cassandra Talk: Austin JUG
Cassandra Talk: Austin JUGCassandra Talk: Austin JUG
Cassandra Talk: Austin JUG
Stu Hood
 
Cassandra
CassandraCassandra
Cassandra
Upaang Saxena
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
Eduard Tudenhoefner
 
cassandra.pptx
cassandra.pptxcassandra.pptx
cassandra.pptx
BRINDHA256909
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandra
PL dream
 
Cassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideCassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction Guide
Mohammed Fazuluddin
 
Cassandra - A Distributed Database System
Cassandra - A Distributed Database System Cassandra - A Distributed Database System
Cassandra - A Distributed Database System
Md. Shohel Rana
 
Cassandra vs Databases
Cassandra vs Databases Cassandra vs Databases
Cassandra vs Databases
Anant Corporation
 
Cassandra Insider
Cassandra InsiderCassandra Insider
Cassandra Insider
Knoldus Inc.
 
Introduction to AWS Big Data
Introduction to AWS Big Data Introduction to AWS Big Data
Introduction to AWS Big Data
Omid Vahdaty
 
Cassandra advanced part-ll
Cassandra advanced part-llCassandra advanced part-ll
Cassandra advanced part-ll
achudhivi
 
Cassandra presentation
Cassandra presentationCassandra presentation
Cassandra presentation
Sergey Enin
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overview
PritamKathar
 
Cassandra tutorial
Cassandra tutorialCassandra tutorial
Cassandra tutorial
Ramakrishna kapa
 

Similar to Introduction to Apache Cassandra (20)

Cassandra
CassandraCassandra
Cassandra
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
 
On Rails with Apache Cassandra
On Rails with Apache CassandraOn Rails with Apache Cassandra
On Rails with Apache Cassandra
 
An Introduction to Apache Cassandra
An Introduction to Apache CassandraAn Introduction to Apache Cassandra
An Introduction to Apache Cassandra
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
Apache cassandra
Apache cassandraApache cassandra
Apache cassandra
 
Cassandra Talk: Austin JUG
Cassandra Talk: Austin JUGCassandra Talk: Austin JUG
Cassandra Talk: Austin JUG
 
Cassandra
CassandraCassandra
Cassandra
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
 
cassandra.pptx
cassandra.pptxcassandra.pptx
cassandra.pptx
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandra
 
Cassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideCassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction Guide
 
Cassandra - A Distributed Database System
Cassandra - A Distributed Database System Cassandra - A Distributed Database System
Cassandra - A Distributed Database System
 
Cassandra vs Databases
Cassandra vs Databases Cassandra vs Databases
Cassandra vs Databases
 
Cassandra Insider
Cassandra InsiderCassandra Insider
Cassandra Insider
 
Introduction to AWS Big Data
Introduction to AWS Big Data Introduction to AWS Big Data
Introduction to AWS Big Data
 
Cassandra advanced part-ll
Cassandra advanced part-llCassandra advanced part-ll
Cassandra advanced part-ll
 
Cassandra presentation
Cassandra presentationCassandra presentation
Cassandra presentation
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overview
 
Cassandra tutorial
Cassandra tutorialCassandra tutorial
Cassandra tutorial
 

More from Knoldus Inc.

Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.
Knoldus Inc.
 
Facilitation Skills - When to Use and Why.pptx
Facilitation Skills - When to Use and Why.pptxFacilitation Skills - When to Use and Why.pptx
Facilitation Skills - When to Use and Why.pptx
Knoldus Inc.
 
Performance Testing at Scale Techniques for High-Volume Services
Performance Testing at Scale Techniques for High-Volume ServicesPerformance Testing at Scale Techniques for High-Volume Services
Performance Testing at Scale Techniques for High-Volume Services
Knoldus Inc.
 
Snowflake and its features (Presentation)
Snowflake and its features (Presentation)Snowflake and its features (Presentation)
Snowflake and its features (Presentation)
Knoldus Inc.
 
Terratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructureTerratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructure
Knoldus Inc.
 
Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)
Knoldus Inc.
 
Secure practices with dot net services.pptx
Secure practices with dot net services.pptxSecure practices with dot net services.pptx
Secure practices with dot net services.pptx
Knoldus Inc.
 
Distributed Cache with dot microservices
Distributed Cache with dot microservicesDistributed Cache with dot microservices
Distributed Cache with dot microservices
Knoldus Inc.
 
Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)
Knoldus Inc.
 
Using InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in JmeterUsing InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in Jmeter
Knoldus Inc.
 
Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)
Knoldus Inc.
 
Stakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) PresentationStakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) Presentation
Knoldus Inc.
 
Introduction To Kaniko (DevOps) Presentation
Introduction To Kaniko (DevOps) PresentationIntroduction To Kaniko (DevOps) Presentation
Introduction To Kaniko (DevOps) Presentation
Knoldus Inc.
 
Efficient Test Environments with Infrastructure as Code (IaC)
Efficient Test Environments with Infrastructure as Code (IaC)Efficient Test Environments with Infrastructure as Code (IaC)
Efficient Test Environments with Infrastructure as Code (IaC)
Knoldus Inc.
 
Exploring Terramate DevOps (Presentation)
Exploring Terramate DevOps (Presentation)Exploring Terramate DevOps (Presentation)
Exploring Terramate DevOps (Presentation)
Knoldus Inc.
 
Clean Code in Test Automation Differentiating Between the Good and the Bad
Clean Code in Test Automation  Differentiating Between the Good and the BadClean Code in Test Automation  Differentiating Between the Good and the Bad
Clean Code in Test Automation Differentiating Between the Good and the Bad
Knoldus Inc.
 
Integrating AI Capabilities in Test Automation
Integrating AI Capabilities in Test AutomationIntegrating AI Capabilities in Test Automation
Integrating AI Capabilities in Test Automation
Knoldus Inc.
 
State Management with NGXS in Angular.pptx
State Management with NGXS in Angular.pptxState Management with NGXS in Angular.pptx
State Management with NGXS in Angular.pptx
Knoldus Inc.
 
Authentication in Svelte using cookies.pptx
Authentication in Svelte using cookies.pptxAuthentication in Svelte using cookies.pptx
Authentication in Svelte using cookies.pptx
Knoldus Inc.
 
OAuth2 Implementation Presentation (Java)
OAuth2 Implementation Presentation (Java)OAuth2 Implementation Presentation (Java)
OAuth2 Implementation Presentation (Java)
Knoldus Inc.
 

More from Knoldus Inc. (20)

Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.Managing State & HTTP Requests In Ionic.
Managing State & HTTP Requests In Ionic.
 
Facilitation Skills - When to Use and Why.pptx
Facilitation Skills - When to Use and Why.pptxFacilitation Skills - When to Use and Why.pptx
Facilitation Skills - When to Use and Why.pptx
 
Performance Testing at Scale Techniques for High-Volume Services
Performance Testing at Scale Techniques for High-Volume ServicesPerformance Testing at Scale Techniques for High-Volume Services
Performance Testing at Scale Techniques for High-Volume Services
 
Snowflake and its features (Presentation)
Snowflake and its features (Presentation)Snowflake and its features (Presentation)
Snowflake and its features (Presentation)
 
Terratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructureTerratest - Automation testing of infrastructure
Terratest - Automation testing of infrastructure
 
Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)Getting Started with Apache Spark (Scala)
Getting Started with Apache Spark (Scala)
 
Secure practices with dot net services.pptx
Secure practices with dot net services.pptxSecure practices with dot net services.pptx
Secure practices with dot net services.pptx
 
Distributed Cache with dot microservices
Distributed Cache with dot microservicesDistributed Cache with dot microservices
Distributed Cache with dot microservices
 
Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)Introduction to gRPC Presentation (Java)
Introduction to gRPC Presentation (Java)
 
Using InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in JmeterUsing InfluxDB for real-time monitoring in Jmeter
Using InfluxDB for real-time monitoring in Jmeter
 
Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)Intoduction to KubeVela Presentation (DevOps)
Intoduction to KubeVela Presentation (DevOps)
 
Stakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) PresentationStakeholder Management (Project Management) Presentation
Stakeholder Management (Project Management) Presentation
 
Introduction To Kaniko (DevOps) Presentation
Introduction To Kaniko (DevOps) PresentationIntroduction To Kaniko (DevOps) Presentation
Introduction To Kaniko (DevOps) Presentation
 
Efficient Test Environments with Infrastructure as Code (IaC)
Efficient Test Environments with Infrastructure as Code (IaC)Efficient Test Environments with Infrastructure as Code (IaC)
Efficient Test Environments with Infrastructure as Code (IaC)
 
Exploring Terramate DevOps (Presentation)
Exploring Terramate DevOps (Presentation)Exploring Terramate DevOps (Presentation)
Exploring Terramate DevOps (Presentation)
 
Clean Code in Test Automation Differentiating Between the Good and the Bad
Clean Code in Test Automation  Differentiating Between the Good and the BadClean Code in Test Automation  Differentiating Between the Good and the Bad
Clean Code in Test Automation Differentiating Between the Good and the Bad
 
Integrating AI Capabilities in Test Automation
Integrating AI Capabilities in Test AutomationIntegrating AI Capabilities in Test Automation
Integrating AI Capabilities in Test Automation
 
State Management with NGXS in Angular.pptx
State Management with NGXS in Angular.pptxState Management with NGXS in Angular.pptx
State Management with NGXS in Angular.pptx
 
Authentication in Svelte using cookies.pptx
Authentication in Svelte using cookies.pptxAuthentication in Svelte using cookies.pptx
Authentication in Svelte using cookies.pptx
 
OAuth2 Implementation Presentation (Java)
OAuth2 Implementation Presentation (Java)OAuth2 Implementation Presentation (Java)
OAuth2 Implementation Presentation (Java)
 

Recently uploaded

A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
kalichargn70th171
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Stork Product Overview: An AI-Powered Autonomous Delivery Fleet
Stork Product Overview: An AI-Powered Autonomous Delivery FleetStork Product Overview: An AI-Powered Autonomous Delivery Fleet
Stork Product Overview: An AI-Powered Autonomous Delivery Fleet
Vince Scalabrino
 
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
The Third Creative Media
 
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
widenerjobeyrl638
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Peter Caitens
 
Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...
Paul Brebner
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
dakas1
 
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Paul Brebner
 
How GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdfHow GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdf
Zycus
 
ACE - Team 24 Wrapup event at ahmedabad.
ACE - Team 24 Wrapup event at ahmedabad.ACE - Team 24 Wrapup event at ahmedabad.
ACE - Team 24 Wrapup event at ahmedabad.
Maitrey Patel
 
Optimizing Your E-commerce with WooCommerce.pptx
Optimizing Your E-commerce with WooCommerce.pptxOptimizing Your E-commerce with WooCommerce.pptx
Optimizing Your E-commerce with WooCommerce.pptx
WebConnect Pvt Ltd
 
Beginner's Guide to Observability@Devoxx PL 2024
Beginner's  Guide to Observability@Devoxx PL 2024Beginner's  Guide to Observability@Devoxx PL 2024
Beginner's Guide to Observability@Devoxx PL 2024
michniczscribd
 
Penify - Let AI do the Documentation, you write the Code.
Penify - Let AI do the Documentation, you write the Code.Penify - Let AI do the Documentation, you write the Code.
Penify - Let AI do the Documentation, you write the Code.
KrishnaveniMohan1
 
TMU毕业证书精仿办理
TMU毕业证书精仿办理TMU毕业证书精仿办理
TMU毕业证书精仿办理
aeeva
 
Building API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructureBuilding API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructure
confluent
 
WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...
WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...
WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...
Luigi Fugaro
 
Migration From CH 1.0 to CH 2.0 and Mule 4.6 & Java 17 Upgrade.pptx
Migration From CH 1.0 to CH 2.0 and  Mule 4.6 & Java 17 Upgrade.pptxMigration From CH 1.0 to CH 2.0 and  Mule 4.6 & Java 17 Upgrade.pptx
Migration From CH 1.0 to CH 2.0 and Mule 4.6 & Java 17 Upgrade.pptx
ervikas4
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
gapen1
 

Recently uploaded (20)

A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
Stork Product Overview: An AI-Powered Autonomous Delivery Fleet
Stork Product Overview: An AI-Powered Autonomous Delivery FleetStork Product Overview: An AI-Powered Autonomous Delivery Fleet
Stork Product Overview: An AI-Powered Autonomous Delivery Fleet
 
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
 
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
美洲杯赔率投注网【​网址​🎉3977·EE​🎉】
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
 
Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...Superpower Your Apache Kafka Applications Development with Complementary Open...
Superpower Your Apache Kafka Applications Development with Complementary Open...
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
 
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
 
How GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdfHow GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdf
 
ACE - Team 24 Wrapup event at ahmedabad.
ACE - Team 24 Wrapup event at ahmedabad.ACE - Team 24 Wrapup event at ahmedabad.
ACE - Team 24 Wrapup event at ahmedabad.
 
Optimizing Your E-commerce with WooCommerce.pptx
Optimizing Your E-commerce with WooCommerce.pptxOptimizing Your E-commerce with WooCommerce.pptx
Optimizing Your E-commerce with WooCommerce.pptx
 
Beginner's Guide to Observability@Devoxx PL 2024
Beginner's  Guide to Observability@Devoxx PL 2024Beginner's  Guide to Observability@Devoxx PL 2024
Beginner's Guide to Observability@Devoxx PL 2024
 
Penify - Let AI do the Documentation, you write the Code.
Penify - Let AI do the Documentation, you write the Code.Penify - Let AI do the Documentation, you write the Code.
Penify - Let AI do the Documentation, you write the Code.
 
TMU毕业证书精仿办理
TMU毕业证书精仿办理TMU毕业证书精仿办理
TMU毕业证书精仿办理
 
Building API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructureBuilding API data products on top of your real-time data infrastructure
Building API data products on top of your real-time data infrastructure
 
WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...
WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...
WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...
 
Migration From CH 1.0 to CH 2.0 and Mule 4.6 & Java 17 Upgrade.pptx
Migration From CH 1.0 to CH 2.0 and  Mule 4.6 & Java 17 Upgrade.pptxMigration From CH 1.0 to CH 2.0 and  Mule 4.6 & Java 17 Upgrade.pptx
Migration From CH 1.0 to CH 2.0 and Mule 4.6 & Java 17 Upgrade.pptx
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
 

Introduction to Apache Cassandra

  • 1. Apache Cassandra Harshit Daga Software Consultant Knoldus Software LLP
  • 2. Agenda ● What is Cassandra ● Gossip communication protocol ● Cassandra- Data Model ● Cassandra- Architecture ● Reading/Writing a node ● Data consistency
  • 3. Cassandra ● Cassandra is massively scalable schemaless database. ● Open source database, licensed under Apache. ● Originally, developed by Facebok for inbox search. ● Data model based upon Google’s BigTable. ● Distributed design is based upon Amazon Dynamo. ● Promoted massively by Datastax.
  • 4. Gossip Communication Protocol ● Peer to peer communication protocol. ● Nodes are arranged in ring format. ● Data is replicated to multiple nodes. ● Nodes periodically exchange info. they have. ● Nodes also exchange their own info. ● Each message has its associated version. ● No master-slave concept, and hence no single point of failure.
  • 5. Cassandra- Data Model ● Column data is stored as in key/value pair. ● Collection of column makes a Row. ● Column family is then becomes as collection of all rows. ● In RDBMS, each column must have some value else NULL, but not in case of cassandra database.
  • 6. Cassandra- Data Model ● Consider following example, ● Now inserting a new row: ● Above insertion would not fail.
  • 7. Cassandra- Data Model ● It means, data are stored as multi-dimensional sparse array.
  • 8. Cassandra- Architecture ● A ring has several nodes. ● Each node is assigned a Partition value. ● Data processing is based on the Partition Key. ● When a client makes a request to a node, it becomes the coordinator for that request. ● The coordinator determines which node in the ring should process upon that request.
  • 9. Cassandra- Architecture ● Virtual Nodes (Vnodes) – Responsible for assigning the partition token range. – Tokens are automatically calculated & assigned to each node. – Cluster re-balancing is done automatically.
  • 10. Cassandra- Architecture ● Which node gets what data is based on the partition key. ● Cassandra assigns a hash value to each partition key. ● And data gets to a node as per the hash value
  • 11. Cassandra- Architecture ● How write request gets fulfilled:-
  • 12. Data Replication ● Data replication – Simple Strategy ● Used for only one cluster – Network Topology Strategy ● Used for multiple clusters in multiple data centers.
  • 13. Writing data in a Node ● Write an entry in the commit log ● Write data to memtable. ● When memtable is full, Store data on disk in SSTables. ● SSTables are immutable data structure. ● Also has a support for TTL. Cassandra is the fastest db in concern with the write operation
  • 14. Reading data from a Node ● First, checks the memtable using Bloom filter. ● If found, then data is sent as response. ● Else, fetch the data from the SSTables. Cassandra may write many versions of the same row, then how to identify the latest one?
  • 15. Update/Delete data from Node ● Data is not immediately deleted. ● It is marked to be deleted/updated in memtables. ● This process is called tombstone. ● Tombstone, runs at configured interval of time. ● During each interval, it collects all the SSTables and updates the marked record and discards the old SSTables.
  • 16. Data Consistency ● Data is not necessarily on every node all the time. ● For maintaining consistency, no. of replicas should respond: – ONE – QUORUM – ALL ● Consistency has major impact on performance. ● For strong consistency: R + W > N
  • 17. References ● O’reilly- Cassandra Definitive Guide ● https://cassandra.apache.org/doc/latest/ ● http://docs.datastax.com/en/cassandra/3.0/