Cassandra

Outline
I. Why Cassandra?
II. Basic Operations
III. The Cassandra Architecture
IV. Clients
V. Maintenance

CAP theorem
• Consistency: all nodes see the same data at the same time.
• Availability: a guarantee that every request receives a
response about whether it succeeded or failed.
• Partition Tolerance: the system continues to operate
despite arbitrary message loss or failure of part of the
system.
Ref: http://uzigood.blogspot.com/2016/06/cap-theorem.html

Partition Tolerance of Mongo
Ref: https://docs.mongodb.com/manual/replication/
Ref: https://docs.mongodb.com/manual/core/read-preference/

Partition Tolerance of Cassandra
Cassandra uses consistent hashing to
determine which nodes in your
cluster must manage the data you are
passing in. You set a replication factor,
which states how many nodes hold
a copy of your data.
How big can it scale? Cassandra can handle the load of
applications like Instagram that have roughly 80 million
photos uploaded to the database every day.

Ref: https://blog.panoply.io/cassandra-vs-mongodb
Cassandra vs Mongo

Ref: https://scalegrid.io/blog/cassandra-vs-mongodb/
Cassandra vs Mongo

IMS, RDBMSs, NoSQL. The horse, the car, the plane.

Terminology
1. Apache Cassandra Cluster: A cluster acts as a single database
server spread across a number of machines.
2. Keyspaces: A keyspace is a logical grouping of Apache
Cassandra tables.
3. Tables: An Apache Cassandra table is similar to an
RDBMS table.
4. Primary Key: A primary key uniquely identifies an
Apache Cassandra row. A primary key can be a simple
key or a composite key. A composite key is made up of
two parts, a partition key and a clustering key. The partition
key determines data distribution in the cluster while the
clustering key determines sort order within a partition.

Get started
cqlsh> DESCRIBE CLUSTER;
cqlsh> DESCRIBE KEYSPACES;
cqlsh> CREATE KEYSPACE my_keyspace WITH replication = {
'class': 'SimpleStrategy',
'replication_factor': 1};
cqlsh> USE my_keyspace;
cqlsh:my_keyspace> CREATE TABLE user (
first_name text,
last_name text,
PRIMARY KEY (first_name));
Ref: http://abiasforaction.net/cassandra-query-language-cql-tutorial/

cqlsh> DESCRIBE KEYSPACES;
cqlsh:my_keyspace> DESCRIBE KEYSPACES;
cqlsh:my_keyspace> DESCRIBE KEYSPACE my_keyspace;
cqlsh:my_keyspace> DESCRIBE TABLE user;
DESCRIBE

INSERT
cqlsh:my_keyspace> INSERT INTO user (first_name , last_name ) VALUES ('ben', 'liu');
cqlsh:my_keyspace> SELECT * FROM user;
cqlsh:my_keyspace> SELECT * FROM user WHERE first_name='ben';
cqlsh:my_keyspace> SELECT COUNT (*) FROM user;
DELETE
cqlsh:my_keyspace> DELETE last_name FROM user WHERE first_name ='ben';

Exercises
1. Create a keyspace named mifly. Use the SimpleStrategy class and set
replication_factor to 1.
2. Create a table named employees with two columns, first_name
and last_name, both of type text. Set first_name as
the primary key of the table.
3. To check that first_name is the primary key, use DESCRIBE to show the
definition of employees.
4. Insert the data shown below into employees.
first_name  last_name
ben         liu
maka        long

Exercises
5. Select all columns and all rows from employees.
6. Delete the employee whose first name is maka.
7. Drop the table employees.
8. Drop the keyspace mifly.

Cassandra’s Data Model
cqlsh:my_keyspace> INSERT INTO user (first_name , last_name ) VALUES ( 'doggy', 'wang');

cqlsh:my_keyspace> UPDATE user SET last_name = 'liu' WHERE first_name ='white' ;
UPDATE

ALTER
ALTER TABLE user ADD phone text ;
ALTER TABLE user DROP phone ;

Timestamps
cqlsh:my_keyspace> SELECT first_name,last_name, writetime(last_name) from user;
Cassandra uses these timestamps for resolving any conflicting changes that are
made to the same value. Generally, the last timestamp wins.
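The last-write-wins rule can be illustrated with a minimal Python sketch. This is not Cassandra's own code: the `Cell` type and `resolve` function are hypothetical names used only for illustration, and the sketch covers just the basic rule that the value with the highest write timestamp is kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One stored value plus its write timestamp (hypothetical type)."""
    value: str
    writetime: int  # microseconds since the epoch, like writetime() in CQL


def resolve(cells):
    """Return the cell with the latest write timestamp (last write wins)."""
    return max(cells, key=lambda c: c.writetime)


# Two conflicting versions of the same column, written five seconds apart.
older = Cell("liu", 1_700_000_000_000_000)
newer = Cell("liou", 1_700_000_005_000_000)
print(resolve([older, newer]).value)
```

The order in which the versions arrive does not matter; only the timestamps do.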

TTL (time to live)
cqlsh:my_keyspace> SELECT first_name, last_name, TTL(last_name) FROM user;
cqlsh:my_keyspace> UPDATE user USING TTL 30 SET last_name='liou' WHERE first_name ='white' ;

Exercises
1. Create a keyspace named mifly. Use the SimpleStrategy class and set
replication_factor to 1.
2. Create a table named employees with two columns, first_name
and last_name, both of type text. Set first_name as
the primary key of the table.
3. To check that first_name is the primary key, use DESCRIBE to show the
definition of employees.
4. Insert the data shown below into employees. Leave the last_name of feifei empty.
first_name  last_name
ben         liu
maka        long
feifei

Exercises
5. Update the row for feifei, setting last_name to king.
6. Add an email column of type text to the table.
7. Select first_name, last_name, and the TTL of email.
8. Set the email address of ben to mifly@gmail.com with a TTL of 30 seconds.
9. Drop the table employees.
10. Drop the keyspace mifly.

Data Types
cqlsh:my_keyspace> CREATE TABLE user (
first_name text,
last_name text,
PRIMARY KEY (first_name));
first_name (text)  last_name (text)
ben                liu
maka               long

Textual Data Types
Other Simple Data Types
• boolean: This is a simple true/false value.
• blob: A binary large object (blob) is a colloquial computing term for an arbitrary array
of bytes.
• inet: This type represents IPv4 or IPv6 Internet addresses.
• counter: The counter data type provides a 64-bit signed integer, whose value cannot be set
directly, but only incremented or decremented.

Time and Identity Data Types
• timestamp: A date and time, stored as a 64-bit signed integer of milliseconds since the epoch.
Values can be entered in ISO 8601 date formats
(e.g. 2015-06-15 20:05-0700, 2015-06-15 20:05:07.013-0700).
• date, time: The 2.2 release introduced date and time types that allowed these to be represented
independently.
• uuid: This is a Type 4 UUID (universally unique identifier) which is a 128-bit value based entirely
on random numbers (e.g. 1a6300ca-0572-4736-a393-c0b7229e193e).
• timeuuid: This is a Type 1 UUID, which is based on the MAC address of the computer, the
system time, and a sequence number used to prevent duplicates.

uuid
cqlsh:my_keyspace> ALTER TABLE user ADD id uuid;
cqlsh:my_keyspace> UPDATE user SET id = uuid() WHERE first_name ='ben' ;
Ref: https://docs.datastax.com/en/cql/3.3/cql/cql_reference/timeuuid_functions_r.html

Collections
• set: The set data type stores a collection of unique elements.
• list: The list data type contains an ordered list of elements.
• map: The map data type contains a collection of key/value pairs.

set
cqlsh:my_keyspace> ALTER TABLE user ADD email set<text> ;
UPDATE user SET email = {'a@email.com', 'b@emai.com'} WHERE first_name ='ben';
UPDATE user SET email= email + {'dog@email.com'} WHERE first_name='white';

list
cqlsh:my_keyspace> ALTER TABLE user ADD phone list<text> ;
cqlsh:my_keyspace> UPDATE user SET phone =['1234567'] WHERE first_name ='fei' ;
cqlsh:my_keyspace> UPDATE user SET phone[0] = null WHERE first_name ='fei';

map
cqlsh:my_keyspace> ALTER TABLE user ADD food map<text, boolean > ;
cqlsh:my_keyspace> UPDATE user SET food = {'beef': false} WHERE first_name = 'white';

User-Defined Types
cqlsh:my_keyspace> CREATE TYPE address (
... street text,
... city text,
... state text);
cqlsh:my_keyspace> ALTER TABLE user ADD addresses map<text, frozen<address>>;
cqlsh:my_keyspace> UPDATE user SET addresses = {
...'home': { street:'ooo', city: 'xxx' } } WHERE first_name='ben' ;

Secondary Indexes
cqlsh:my_keyspace> CREATE INDEX ON user (last_name);
cqlsh:my_keyspace> SELECT * FROM user WHERE last_name = 'liu' ;

Defining Application Queries
Each box on the diagram represents a step in the application workflow,
with arrows indicating the flows between steps and the associated query.

Introducing Chebotko Diagrams
Chebotko diagrams mark partition key columns with K and clustering columns
with C↑ or C↓, for ascending or descending sort order.

Hotel Logical Data Model
Our first query Q1 is to find hotels near a point of interest, so we’ll call our table hotels_by_poi.

Reservation Logical Data Model

Physical Data Modeling
To draw physical models, we need to be able
to add the typing information for each
column.

Reservation Physical Data Model

Calculating Partition Size
N_r = 5000 hotels × 100 rooms/hotel × 730 days = 365,000,000 rows

Calculating Size on Disk
Partition size = 16 bytes + 0 bytes + 2.56 GB + 2.92 GB = 5.48 GB
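The arithmetic on these two slides can be checked with a short Python sketch. The variable names are illustrative, and treating 1 GB as 10**9 bytes is an assumption made here; the slides do not say which definition they use.

```python
# Row count: hotels x rooms per hotel x days, as on the slide.
hotels = 5000
rooms_per_hotel = 100
days = 730  # two years

rows = hotels * rooms_per_hotel * days
print(f"{rows:,} rows")

# Size on disk: sum the four terms from the slide.
GB = 10**9  # assumption: decimal gigabytes
terms_in_bytes = [16, 0, 2.56 * GB, 2.92 * GB]
partition_size_gb = sum(terms_in_bytes) / GB
print(f"{partition_size_gb:.2f} GB")
```

The 16-byte term is negligible next to the two gigabyte-scale terms, which is why the total rounds to their sum.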

III. The Cassandra Architecture

The Design Pattern of Cassandra Cluster
1. The efficiency and availability of the network topology.
2. Data is distributed to the different nodes with rings and tokens.
3. Making data durable and available.

Data Centers and Racks
Cassandra tries to store copies of your data in multiple data centers to maximize availability and partition
tolerance, while preferring to route queries to nodes in the local data center to maximize performance.

Gossip and Failure Detection
1. Once per second, the gossiper will choose a random node in the cluster and initiate
a gossip session with it.
2. The gossip initiator sends its chosen friend a GossipDigestSynMessage.
3. When the friend receives this message, it returns a GossipDigestAckMessage.
4. When the initiator receives the ack message from the friend, it sends the friend a
GossipDigestAck2Message to complete the round of gossip.
Failure detection is implemented by the org.apache.cassandra.gms.FailureDetector class.
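The three-message exchange can be sketched in Python as a simplified simulation. Each node's state is reduced here to a dictionary of endpoint name to version number, which is an assumption for illustration; real gossip messages carry more (generation numbers and application state), and the function name `gossip_round` is hypothetical.

```python
def gossip_round(initiator: dict, friend: dict) -> None:
    """Simulate one SYN / ACK / ACK2 round; both dicts are updated in place."""
    # 1. SYN: the initiator sends digests (endpoint -> version it knows).
    syn = dict(initiator)

    # 2. ACK: the friend returns the states it knows that are newer, and
    #    lists the endpoints for which the initiator's digest was newer.
    ack_updates = {ep: v for ep, v in friend.items() if v > syn.get(ep, -1)}
    ack_requests = [ep for ep, v in syn.items() if v > friend.get(ep, -1)]
    initiator.update(ack_updates)

    # 3. ACK2: the initiator sends the states the friend asked for.
    ack2 = {ep: initiator[ep] for ep in ack_requests}
    friend.update(ack2)


node_a = {"n1": 5, "n2": 1}
node_b = {"n2": 3, "n3": 7}
gossip_round(node_a, node_b)
print(node_a == node_b)
```

After one round both nodes hold the newest version known to either of them for every endpoint.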

Snitches
The snitch determines where nodes are in relation to other nodes.
1. Your selected snitch is wrapped with another snitch called the DynamicEndpointSnitch.
2. The dynamic snitch gets its basic understanding of the topology from the selected snitch.
3. It then monitors the performance of requests to the other nodes, even keeping track of things like
which nodes are performing compaction. The performance data is used to select the best
replica for each query.

Rings and Tokens
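• A token is an integer ID used to identify each partition. With the default Murmur3Partitioner it is a
64-bit value; the older RandomPartitioner uses values from 0 to 2^127 - 1.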
• A node claims ownership of the range of values less than or equal to each token and
greater than the token of the previous node.
• Data is assigned to nodes by using a hash function (partitioner) to calculate a token for the
partition key.
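The ownership rule above can be sketched in Python. This is an illustration under stated assumptions, not Cassandra's implementation: tokens are taken to be signed 64-bit integers, MD5 stands in for the real partitioner, and the `Ring` class, `token_for` function, and node token values are all hypothetical.

```python
import bisect
import hashlib


def token_for(partition_key: str) -> int:
    """Stand-in partitioner: hash a key to a signed 64-bit token with MD5."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


class Ring:
    def __init__(self, node_tokens: dict):
        # Keep (token, node) pairs sorted so owners are found by binary search.
        self._entries = sorted((t, n) for n, t in node_tokens.items())
        self._tokens = [t for t, _ in self._entries]

    def owner(self, token: int) -> str:
        # A node owns the range (previous node's token, its own token].
        # bisect_left finds the first node token >= the given token;
        # past the last token, the range wraps to the first node.
        i = bisect.bisect_left(self._tokens, token)
        return self._entries[i % len(self._entries)][1]


ring = Ring({
    "node1": -6_000_000_000_000_000_000,
    "node2": 0,
    "node3": 6_000_000_000_000_000_000,
})
print(ring.owner(token_for("ben")))
```

A token equal to a node's own token belongs to that node, and tokens beyond the largest node token wrap around to the first node on the ring.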

Virtual Nodes
Ref: http://docs.basho.com/riak/kv/2.2.3/learn/concepts/vnodes/
Cassandra’s 1.2 release introduced the concept of virtual nodes, also called vnodes for short. Instead of
assigning a single token to a node, the token range is broken up into multiple smaller ranges.

Replication Strategies
1. The SimpleStrategy places replicas at consecutive nodes around the ring, starting with the node
indicated by the partitioner.
2. The NetworkTopologyStrategy allows you to specify a different replication factor for each data center.
Within a data center, it allocates replicas to different racks in order to maximize availability.

SimpleStrategy
The SimpleStrategy places replicas at consecutive nodes around the ring, starting with the
node indicated by the partitioner.
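A minimal Python sketch of this placement rule follows. It assumes one token per node, small made-up token values, and a hypothetical function name `simple_strategy_replicas`; it shows only the idea of walking clockwise around the ring from the owning node.

```python
import bisect


def simple_strategy_replicas(ring, token, replication_factor):
    """ring: list of (token, node) sorted by token.

    Returns the replica nodes in ring order, starting at the node that
    owns the given token and continuing with its successors.
    """
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    replicas = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas


ring = [(0, "node1"), (100, "node2"), (200, "node3"), (300, "node4")]
print(simple_strategy_replicas(ring, 150, 3))  # ['node3', 'node4', 'node1']
```

Token 150 falls in the range owned by node3, so with a replication factor of 3 the copies go to node3 and the next two nodes on the ring.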

NetworkTopologyStrategy
The total number of replicas that will be stored is equal to the sum of the replication factors for each data
center.
The NetworkTopologyStrategy allows you to
specify a different replication factor for each data
center. Within a data center, it allocates replicas to
different racks in order to maximize availability.

Consistency Levels
For read queries, the consistency level specifies how many replica nodes must respond to a read request
before returning the data.
For write operations, the consistency level specifies how many replica nodes must respond for the write to
be reported as successful to the client.
Setting consistency levels:
(1) ONE, TWO, and THREE each specify an absolute number of replica nodes that must respond to a request.
(2) The QUORUM consistency level requires a response from a majority of the replica nodes
(replication factor / 2 + 1, rounded down).
(3) The ALL consistency level requires a response from all of the replicas.
(4) The ANY consistency level applies to writes only: the write succeeds once any node has accepted it,
even if that node only stored a hint for an unavailable replica.
R + W > N gives strong consistency, where R and W are the numbers of replicas read and written and N is the
replication factor.
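The quorum size and the strong-consistency condition can be worked through in a few lines of Python. The function names are illustrative only; the formulas are the ones stated on the slide.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: replication factor / 2 + 1, using integer division."""
    return replication_factor // 2 + 1


def is_strongly_consistent(r: int, w: int, n: int) -> bool:
    """True when read replicas + write replicas exceed the replication factor."""
    return r + w > n


n = 3
q = quorum(n)
print(q)                                  # 2
print(is_strongly_consistent(q, q, n))    # True: QUORUM reads and writes overlap
print(is_strongly_consistent(1, 1, n))    # False: ONE + ONE can miss the latest write
```

With a replication factor of 3, QUORUM reads and QUORUM writes always share at least one replica, while reading and writing at ONE does not guarantee that.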

Read/Write Data from Nodes
A client may connect to any node in the
cluster to initiate a read or write query.
This node is known as the coordinator
node.
For a read, the coordinator contacts
enough replicas to ensure the required
consistency level is met, and returns the
data to the client.

Read/Write Data from Nodes
For a write, the coordinator node
contacts all replicas, as determined
by the consistency level and
replication factor, and considers
the write successful when a
number of replicas commensurate
with the consistency level
acknowledge the write.

Cassandra node
Cassandra stores data both in memory and on disk to provide both high performance and durability.

Commit Logs
When you perform a write operation, it’s immediately
written to a commit log.
The commit log is replayed if the database crashes
unexpectedly.

Memtables
After it’s written to the commit log, the value is written
to a memory-resident data structure called the
memtable. Each memtable contains data for a specific
table.
When the number of objects stored in the memtable
reaches a threshold, the contents of the memtable are
flushed to disk in a file called an SSTable and a new
memtable is then created.

SSTables
Each commit log maintains an internal bit flag to
indicate whether it needs flushing.
When a write operation is first received, it is
written to the commit log and its bit flag
is set to 1.
Once the memtable has been properly flushed
to disk, the corresponding commit log’s bit flag
is set to 0, indicating that the commit log no
longer has to maintain that data for durability
purposes.
On reads, Cassandra will read both SSTables and
memtables to find data values.
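The write path described on the last three slides can be sketched as a toy Python class. This is a simplification under stated assumptions: the `Node` class and its method names are hypothetical, everything lives in memory, and reads simply check the newest data first instead of merging values by timestamp as Cassandra does.

```python
class Node:
    def __init__(self, flush_threshold=3):
        self.commit_log = []   # writes not yet flushed, kept for crash recovery
        self.memtable = {}     # in-memory view of recent writes
        self.sstables = []     # flushed tables, oldest first, never modified
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))   # durability first
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the memtable out as a sorted, immutable table, start a new
        # memtable, and release the commit log entries it covered.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()

    def read(self, key):
        # Check the memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None

    def replay(self):
        """Rebuild the memtable from the commit log after a crash."""
        self.memtable = dict(self.commit_log)


node = Node(flush_threshold=2)
node.write("a", 1)
node.write("b", 2)   # reaches the threshold and triggers a flush
node.write("a", 3)   # newer value stays in the memtable
print(node.read("a"), node.read("b"))
```

If the memtable is lost before a flush, `replay` restores the unflushed writes from the commit log, which is the durability role the slides describe.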

Caching
The key cache stores a map of partition keys to row index
entries, facilitating faster read access into SSTables
stored on disk. The key cache is stored on the JVM heap.
The row cache caches entire rows and can greatly speed
up read access for frequently accessed rows, at the cost
of more memory usage. The row cache is stored in
off-heap memory.

Cassandra Cluster Manager
Cassandra Cluster Manager or ccm is a set of Python scripts that allow you to run a multi-
node cluster on a single machine.
$ sudo pip3 install ccm
$ sudo service cassandra stop
$ ccm create -v 3.0.0 -n 3 my_cluster --vnodes
$ ccm list
$ ccm start
$ ccm status
Cluster: 'my_cluster'
---------------------
node1: UP
node3: UP
node2: UP

This is equivalent to running the command nodetool status on the individual node.

We can run the nodetool ring command in order to get a list of the tokens owned by each node.

Adding a Node to a Cluster
$ ccm add node4 -i 127.0.0.4 -j 7400
The tokens will be reallocated across all of the nodes.

Cluster Configuration
$ cd ~/.ccm; ls
CURRENT my_cluster repository
$ cd my_cluster; ls
cluster.conf node1 node2 node3
$ cd ~/.ccm/my_cluster
$ diff node1/conf/ node2/conf/

Seed Nodes
A seed node is used as a contact point for other nodes, so Cassandra can learn the topology of the
cluster—that is, what hosts have what ranges.
For example, if node A acts as a seed for node C, when node C comes online, it will use node A as a
reference point from which to get the topology. This process is known as bootstrapping.
Seed nodes do not auto bootstrap because it is assumed that they will be the first nodes in the cluster.
cassandra.yaml in node1 to node3
node1 - seeds: 127.0.0.1
node2 - seeds: 127.0.0.1,127.0.0.2
node3 - seeds: 127.0.0.1,127.0.0.2,127.0.0.3

Snitches
Snitches gather some information about your network topology so that Cassandra can efficiently
route requests.
• SimpleSnitch: it is unsuitable for multi-data center deployments. If you choose to use this snitch, you
should also use the SimpleStrategy replication strategy for your keyspaces.
• PropertyFileSnitch: it uses information you provide about the topology of your cluster in a standard Java
key/value properties file called cassandra-topology.properties.
• GossipingPropertyFileSnitch: each node exchanges information about its own rack and data center
location with other nodes via gossip. The rack and data center locations are defined in the
cassandra-rackdc.properties file.

Snitches
You configure the endpoint snitch implementation to use by updating the endpoint_snitch property in
the cassandra.yaml file.

Exercise
1. Use ccm to create a pseudo Cassandra cluster with 3 nodes. Set the Cassandra version of the nodes to
3.0.0 and use vnodes to divide the token range.
2. Before starting the cluster, configure the settings of each node. Use GossipingPropertyFileSnitch
to assign the data center and rack of each node.
3. Stop the pseudo cluster. Change the snitch setting to SimpleSnitch and restart the cluster.
What happens after you switch from GossipingPropertyFileSnitch to SimpleSnitch? Try to resolve
the error.

Tokens and Virtual Nodes
You configure the number of tokens by updating the num_tokens property in the cassandra.yaml file.
With num_tokens set to 1, each node holds just one token, as shown in the figure below.

Network Interfaces
Node IP
• listen_address: the IP address of the node.
• storage_port: the port used for inter-node communications, typically 7000.
Thrift transport (a Remote Procedure Call interface that will be removed entirely in a future release)
• rpc_port: default 9160.
• rpc_address: the IP address of the node.
Native transport (the CQL binary protocol, introduced in Cassandra 1.2)
• start_native_transport: set it to true to enable native transport (the native transport handles
the communication between client and server).
• native_transport_port: designate the port used for native transport, typically 9042.

Data Storage
• commitlog_directory: the directory to store the commit logs.
• data_file_directories: the directory to store SSTables.
• disk_failure_policy, commit_failure_policy: set the failure response.

Ref: https://twgame.wordpress.com/2015/02/16/real-machine-cassandra-cluster/
Building a Cassandra Cluster

Scala client
build.sbt
libraryDependencies += "com.datastax.cassandra" % "cassandra-driver-core" % "3.5.1"
libraryDependencies += "org.slf4j" % "slf4j-simple" % "1.6.4"
libraryDependencies += "org.apache.logging.log4j" % "log4j-core" % "2.11.1"
