SlideShare a Scribd company logo
1 of 16
CASSANDRA
What is Cassandra
• Apache Cassandra is highly scalable, high performance, distributed NoSQL database.
Cassandra is designed to handle huge amount of data across many commodity servers,
providing high availability without a single point of failure.
• Cassandra has a distributed architecture which is capable to handle a huge amount of
data. Data is placed on different machines with more than one replication factor to attain a
high availability without a single point of failure.
Features of Cassandra
There are a lot of outstanding technical features which makes Cassandra very popular.
Following is a list of some popular features of Cassandra:
• High Scalability
Cassandra is highly scalable which facilitates you to add more hardware to attach more
customers and more data as per requirement.
• Rigid Architecture
Cassandra has not a single point of failure and it is continuously available for business-critical
applications that cannot afford a failure.
• Fast Linear-scale Performance
Cassandra is linearly scalable. It increases your throughput because it facilitates you to
increase the number of nodes in the cluster. Therefore it maintains a quick response time.
• Fault tolerant
Cassandra is fault tolerant. Suppose, there are 4 nodes in a cluster, here each node has a
copy of same data. If one node is no longer serving then other three nodes can served as per
request.
Cassandra Architecture
Cassandra was designed to handle big data workloads across multiple nodes without a
single point of failure. It has a peer-to-peer distributed system across its nodes, and data is
distributed among all the nodes in a cluster.
•In Cassandra, each node is independent and at the same time interconnected to other
nodes. All the nodes in a cluster play the same role.
•Every node in a cluster can accept read and write requests, regardless of where the data is
actually located in the cluster.
•In the case of failure of one node, Read/Write requests can be served from other nodes in
the network.
Write Operations
Every write activity of nodes is captured by the commit logs written in the nodes. Later the data
will be captured and stored in the mem-table. Whenever the mem-table is full, data will be
written into the SStable data file. All writes are automatically partitioned and replicated
throughout the cluster. Cassandra periodically consolidates the SSTables, discarding
unnecessary data.
Read Operations
In Read operations, Cassandra gets values from the mem-table and checks the bloom filter to
find the appropriate SSTable which contains the required data.
There are three types of read request that is sent to replicas by coordinators.
•Direct request
•Digest request
•Read repair request
The coordinator sends direct request to one of the replicas. After that, the coordinator sends the
digest request to the number of replicas specified by the consistency level and checks if the
returned data is an updated data.
Data Replication in Cassandra
In Cassandra, nodes in a cluster act as replicas for a given piece of data. If some of the nodes
are responded with an out-of-date value, Cassandra will return the most recent value to the
client. After returning the most recent value, Cassandra performs a read repair in the
background to update the stale values.
See the following image to understand the schematic view of how Cassandra uses data
replication among the nodes in a cluster to ensure no single point of failure.
Components of Cassandra
•Node: A Cassandra node is a place where data is stored.
•Data center: Data center is a collection of related nodes.
•Cluster: A cluster is a component which contains one or more data centers.
•Commit log: In Cassandra, the commit log is a crash-recovery mechanism. Every write
operation is written to the commit log.
•Mem-table: A mem-table is a memory-resident data structure. After commit log, the data will
be written to the mem-table. Sometimes, for a single-column family, there will be multiple
mem-tables.
•SSTable: It is a disk file to which the data is flushed from the mem-table when its contents
reach a threshold value.
•Bloom filter: These are nothing but quick, nondeterministic, algorithms for testing whether
an element is a member of a set. It is a special kind of cache. Bloom filters are accessed after
every query.
Cassandra data model
Cassandra Data Model Components
Cassandra data model provides a mechanism for data storage. The components of Cassandra
data model are keyspaces , tables, and columns.
Keyspaces
Cassandra data model consists of keyspaces at the highest level. Keyspaces are the containers
of data, similar to the schema or database in a relational database. Typically, keyspaces contain
many tables.
Tables
Within the keyspaces, the tables are defined. Tables are also referred to as Column Families in the
earlier versions of Cassandra. Tables contain a set of columns and a primary key, and they store
data in a set of rows.
Columns
Columns define the structure of data in a table. Each column has an associated type, such as
integer, text, double, and Boolean.
•A keyspace needs to be defined before creating tables, as there is no default keyspace.
•A keyspace can contain any number of tables, and a table belongs only to one keyspace. This
represents a one-to-many relationship.
•Replication is specified at the keyspace level. For example, replication of three implies that each
data row in the keyspace will have three copies.
•Further, you need to specify the replication factor during the creation of keyspace. However, the
replication factor can be modified later.
Tables
A table contains data in the horizontal and vertical formats, referred to as rows and columns
respectively. Some of the features of tables are:
•Tables have multiple rows and columns. As mentioned earlier, a table is also called Column
Family in the earlier versions of Cassandra.
•It is still referred to as column family in some of the error messages and documents of
Cassandra.
•It is important to define a primary key for a table.
Columns
Column represents a single piece of data in Cassandra and has a type defined.
Some of its features are:
•Columns consist of various types, such as integer, big integer, text, float, double, and Boolean.
•Cassandra also provides collection types such as set, list, and map.
•Further, column values have an associated time stamp representing the time of update.
•This timestamp can be retrieved using the function writetime.
Thank you

More Related Content

Similar to cassandra.pptx

Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011
Boris Yen
 

Similar to cassandra.pptx (20)

Appache Cassandra
Appache Cassandra  Appache Cassandra
Appache Cassandra
 
Cassndra (4).pptx
Cassndra (4).pptxCassndra (4).pptx
Cassndra (4).pptx
 
Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandra
 
Data Storage Management
Data Storage ManagementData Storage Management
Data Storage Management
 
Cassandra Tutorial
Cassandra Tutorial Cassandra Tutorial
Cassandra Tutorial
 
cassandra_presentation_final
cassandra_presentation_finalcassandra_presentation_final
cassandra_presentation_final
 
The No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra ModelThe No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra Model
 
5266732.ppt
5266732.ppt5266732.ppt
5266732.ppt
 
Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011Talk about apache cassandra, TWJUG 2011
Talk about apache cassandra, TWJUG 2011
 
Talk About Apache Cassandra
Talk About Apache CassandraTalk About Apache Cassandra
Talk About Apache Cassandra
 
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
Big Data Storage Concepts from the "Big Data concepts Technology and Architec...
 
Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandra
 
White paper on cassandra
White paper on cassandraWhite paper on cassandra
White paper on cassandra
 
NoSQL - Cassandra & MongoDB.pptx
NoSQL -  Cassandra & MongoDB.pptxNoSQL -  Cassandra & MongoDB.pptx
NoSQL - Cassandra & MongoDB.pptx
 
Cassandra for mission critical data
Cassandra for mission critical dataCassandra for mission critical data
Cassandra for mission critical data
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon Redshift
 
Dsm project-h base-cassandra
Dsm project-h base-cassandraDsm project-h base-cassandra
Dsm project-h base-cassandra
 
Cassandra from the trenches: migrating Netflix (update)
Cassandra from the trenches: migrating Netflix (update)Cassandra from the trenches: migrating Netflix (update)
Cassandra from the trenches: migrating Netflix (update)
 
Apache cassandra
Apache cassandraApache cassandra
Apache cassandra
 
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMCASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
 

Recently uploaded

如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
acoha1
 
Simplify hybrid data integration at an enterprise scale. Integrate all your d...
Simplify hybrid data integration at an enterprise scale. Integrate all your d...Simplify hybrid data integration at an enterprise scale. Integrate all your d...
Simplify hybrid data integration at an enterprise scale. Integrate all your d...
varanasisatyanvesh
 
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
pwgnohujw
 
Displacement, Velocity, Acceleration, and Second Derivatives
Displacement, Velocity, Acceleration, and Second DerivativesDisplacement, Velocity, Acceleration, and Second Derivatives
Displacement, Velocity, Acceleration, and Second Derivatives
23050636
 
Huawei Ransomware Protection Storage Solution Technical Overview Presentation...
Huawei Ransomware Protection Storage Solution Technical Overview Presentation...Huawei Ransomware Protection Storage Solution Technical Overview Presentation...
Huawei Ransomware Protection Storage Solution Technical Overview Presentation...
LuisMiguelPaz5
 
Abortion pills in Jeddah |+966572737505 | get cytotec
Abortion pills in Jeddah |+966572737505 | get cytotecAbortion pills in Jeddah |+966572737505 | get cytotec
Abortion pills in Jeddah |+966572737505 | get cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get CytotecAbortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
jk0tkvfv
 
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
zifhagzkk
 

Recently uploaded (20)

如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(WashU毕业证书)圣路易斯华盛顿大学毕业证成绩单本科硕士学位证留信学历认证
 
Simplify hybrid data integration at an enterprise scale. Integrate all your d...
Simplify hybrid data integration at an enterprise scale. Integrate all your d...Simplify hybrid data integration at an enterprise scale. Integrate all your d...
Simplify hybrid data integration at an enterprise scale. Integrate all your d...
 
DAA Assignment Solution.pdf is the best1
DAA Assignment Solution.pdf is the best1DAA Assignment Solution.pdf is the best1
DAA Assignment Solution.pdf is the best1
 
Bios of leading Astrologers & Researchers
Bios of leading Astrologers & ResearchersBios of leading Astrologers & Researchers
Bios of leading Astrologers & Researchers
 
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
原件一样(UWO毕业证书)西安大略大学毕业证成绩单留信学历认证
 
Displacement, Velocity, Acceleration, and Second Derivatives
Displacement, Velocity, Acceleration, and Second DerivativesDisplacement, Velocity, Acceleration, and Second Derivatives
Displacement, Velocity, Acceleration, and Second Derivatives
 
Huawei Ransomware Protection Storage Solution Technical Overview Presentation...
Huawei Ransomware Protection Storage Solution Technical Overview Presentation...Huawei Ransomware Protection Storage Solution Technical Overview Presentation...
Huawei Ransomware Protection Storage Solution Technical Overview Presentation...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Abortion pills in Jeddah |+966572737505 | get cytotec
Abortion pills in Jeddah |+966572737505 | get cytotecAbortion pills in Jeddah |+966572737505 | get cytotec
Abortion pills in Jeddah |+966572737505 | get cytotec
 
Northern New England Tableau User Group (TUG) May 2024
Northern New England Tableau User Group (TUG) May 2024Northern New England Tableau User Group (TUG) May 2024
Northern New England Tableau User Group (TUG) May 2024
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Pentesting_AI and security challenges of AI
Pentesting_AI and security challenges of AIPentesting_AI and security challenges of AI
Pentesting_AI and security challenges of AI
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Predictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting TechniquesPredictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting Techniques
 
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get CytotecAbortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
 
Introduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptxIntroduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptx
 
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
如何办理(UCLA毕业证书)加州大学洛杉矶分校毕业证成绩单学位证留信学历认证原件一样
 
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
如何办理(Dalhousie毕业证书)达尔豪斯大学毕业证成绩单留信学历认证
 
Credit Card Fraud Detection: Safeguarding Transactions in the Digital Age
Credit Card Fraud Detection: Safeguarding Transactions in the Digital AgeCredit Card Fraud Detection: Safeguarding Transactions in the Digital Age
Credit Card Fraud Detection: Safeguarding Transactions in the Digital Age
 
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATIONCapstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
 

cassandra.pptx

  • 2. What is Cassandra • Apache Cassandra is highly scalable, high performance, distributed NoSQL database. Cassandra is designed to handle huge amount of data across many commodity servers, providing high availability without a single point of failure. • Cassandra has a distributed architecture which is capable to handle a huge amount of data. Data is placed on different machines with more than one replication factor to attain a high availability without a single point of failure.
  • 3. Features of Cassandra There are a lot of outstanding technical features which makes Cassandra very popular. Following is a list of some popular features of Cassandra: • High Scalability Cassandra is highly scalable which facilitates you to add more hardware to attach more customers and more data as per requirement. • Rigid Architecture Cassandra has not a single point of failure and it is continuously available for business-critical applications that cannot afford a failure. • Fast Linear-scale Performance Cassandra is linearly scalable. It increases your throughput because it facilitates you to increase the number of nodes in the cluster. Therefore it maintains a quick response time. • Fault tolerant Cassandra is fault tolerant. Suppose, there are 4 nodes in a cluster, here each node has a copy of same data. If one node is no longer serving then other three nodes can served as per request.
  • 4. Cassandra Architecture Cassandra was designed to handle big data workloads across multiple nodes without a single point of failure. It has a peer-to-peer distributed system across its nodes, and data is distributed among all the nodes in a cluster. •In Cassandra, each node is independent and at the same time interconnected to other nodes. All the nodes in a cluster play the same role. •Every node in a cluster can accept read and write requests, regardless of where the data is actually located in the cluster. •In the case of failure of one node, Read/Write requests can be served from other nodes in the network.
  • 5. Write Operations Every write activity of nodes is captured by the commit logs written in the nodes. Later the data will be captured and stored in the mem-table. Whenever the mem-table is full, data will be written into the SStable data file. All writes are automatically partitioned and replicated throughout the cluster. Cassandra periodically consolidates the SSTables, discarding unnecessary data.
  • 6. Read Operations In Read operations, Cassandra gets values from the mem-table and checks the bloom filter to find the appropriate SSTable which contains the required data. There are three types of read request that is sent to replicas by coordinators. •Direct request •Digest request •Read repair request The coordinator sends direct request to one of the replicas. After that, the coordinator sends the digest request to the number of replicas specified by the consistency level and checks if the returned data is an updated data.
  • 7.
  • 8. Data Replication in Cassandra In Cassandra, nodes in a cluster act as replicas for a given piece of data. If some of the nodes are responded with an out-of-date value, Cassandra will return the most recent value to the client. After returning the most recent value, Cassandra performs a read repair in the background to update the stale values. See the following image to understand the schematic view of how Cassandra uses data replication among the nodes in a cluster to ensure no single point of failure.
  • 9. Components of Cassandra •Node: A Cassandra node is a place where data is stored. •Data center: Data center is a collection of related nodes. •Cluster: A cluster is a component which contains one or more data centers. •Commit log: In Cassandra, the commit log is a crash-recovery mechanism. Every write operation is written to the commit log. •Mem-table: A mem-table is a memory-resident data structure. After commit log, the data will be written to the mem-table. Sometimes, for a single-column family, there will be multiple mem-tables. •SSTable: It is a disk file to which the data is flushed from the mem-table when its contents reach a threshold value. •Bloom filter: These are nothing but quick, nondeterministic, algorithms for testing whether an element is a member of a set. It is a special kind of cache. Bloom filters are accessed after every query.
  • 11.
  • 12. Cassandra Data Model Components Cassandra data model provides a mechanism for data storage. The components of Cassandra data model are keyspaces , tables, and columns. Keyspaces Cassandra data model consists of keyspaces at the highest level. Keyspaces are the containers of data, similar to the schema or database in a relational database. Typically, keyspaces contain many tables. Tables Within the keyspaces, the tables are defined. Tables are also referred to as Column Families in the earlier versions of Cassandra. Tables contain a set of columns and a primary key, and they store data in a set of rows. Columns Columns define the structure of data in a table. Each column has an associated type, such as integer, text, double, and Boolean.
  • 13. •A keyspace needs to be defined before creating tables, as there is no default keyspace. •A keyspace can contain any number of tables, and a table belongs only to one keyspace. This represents a one-to-many relationship. •Replication is specified at the keyspace level. For example, replication of three implies that each data row in the keyspace will have three copies. •Further, you need to specify the replication factor during the creation of keyspace. However, the replication factor can be modified later.
  • 14. Tables A table contains data in the horizontal and vertical formats, referred to as rows and columns respectively. Some of the features of tables are: •Tables have multiple rows and columns. As mentioned earlier, a table is also called Column Family in the earlier versions of Cassandra. •It is still referred to as column family in some of the error messages and documents of Cassandra. •It is important to define a primary key for a table.
  • 15. Columns Column represents a single piece of data in Cassandra and has a type defined. Some of its features are: •Columns consist of various types, such as integer, big integer, text, float, double, and Boolean. •Cassandra also provides collection types such as set, list, and map. •Further, column values have an associated time stamp representing the time of update. •This timestamp can be retrieved using the function writetime.