This document provides an overview of Apache Cassandra and DataStax Enterprise. It discusses what Cassandra is, how it is used across different industries, and its key features, such as scalability and availability. It also covers Cassandra terminology, data distribution, replication strategies, consistency levels, and how reads and writes work in Cassandra.
Data Quality With or Without Apache Spark and Its Ecosystem (Databricks)
Few solutions exist in the open-source community, either as libraries or complete stand-alone platforms, that can be used to assure a certain level of data quality, especially when continuous imports happen. Organisations may consider picking one of the available options: Apache Griffin, Deequ, DDQ, and Great Expectations. In this presentation we'll compare these open-source products across dimensions such as maturity, documentation, extensibility, and features like data profiling and anomaly detection.
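As a minimal illustration of the kind of checks these libraries automate, a hand-rolled profiling pass might look like the sketch below. The column name, thresholds, and sample rows are illustrative assumptions, not the API of any of the tools named above.

```python
# Minimal hand-rolled data-quality checks: null rate and range validation.
# Column names and thresholds are illustrative, not from any library above.

def profile(rows, column, min_value=None, max_value=None):
    """Return (null_rate, out_of_range_count) for one column of dict rows."""
    values = [r.get(column) for r in rows]
    nulls = sum(1 for v in values if v is None)
    null_rate = nulls / len(values) if values else 0.0
    out_of_range = sum(
        1 for v in values
        if v is not None
        and ((min_value is not None and v < min_value)
             or (max_value is not None and v > max_value))
    )
    return null_rate, out_of_range

rows = [{"temp": 21.5}, {"temp": None}, {"temp": 19.0}, {"temp": 999.0}]
print(profile(rows, "temp", min_value=-50, max_value=60))  # (0.25, 1)
```

The libraries compared in the talk wrap this sort of logic in declarative rule sets plus reporting and anomaly detection on top.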
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang (Databricks)
As a general computing engine, Spark can process data from various data management/storage systems, including HDFS, Hive, Cassandra and Kafka. For flexibility and high throughput, Spark defines the Data Source API, which is an abstraction of the storage layer. The Data Source API has two requirements.
1) Generality: support reading/writing most data management/storage systems.
2) Flexibility: customize and optimize the read and write paths for different systems based on their capabilities.
Data Source API V2 is one of the most important features coming with Spark 2.3. This talk will dive into the design and implementation of Data Source API V2, comparing it with Data Source API V1. We also demonstrate how to implement a file-based data source using the Data Source API V2 to show its generality and flexibility.
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai (Databricks)
Catalyst is becoming one of the most important components of Apache Spark, as it underpins all the major new APIs in Spark 2.0 and later versions, from DataFrames and Datasets to Streaming. At its core, Catalyst is a general library for manipulating trees.
In this talk, Yin explores a modular compiler frontend for Spark based on this library that includes a query analyzer, optimizer, and an execution planner. Yin offers a deep dive into Spark SQL's Catalyst optimizer, introducing the core concepts of Catalyst and demonstrating how developers can extend it. You'll leave with a deeper understanding of how Spark analyzes, optimizes, and plans a user's query.
Storing time series data with Apache Cassandra (Patrick McFadin)
If you are looking to collect and store time series data, it's probably not going to be small. Don't get caught without a plan! Apache Cassandra has proven itself a solid choice, and now you can learn how to do it. We'll look at possible data models and the choices you have to make to be successful. Then, let's open the hood and learn about how data is stored in Apache Cassandra. You don't need to be an expert in distributed systems to make this work, and I'll show you how. I'll give you real-world examples and work through the steps. Give me an hour and I will upgrade your time series game.
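A common Cassandra time-series pattern of the kind this talk covers is bucketing rows by a time window so no partition grows without bound. A minimal sketch, with an assumed daily bucket scheme and illustrative names:

```python
from datetime import datetime, timezone

def partition_key(sensor_id, ts, bucket="day"):
    """Derive a (sensor_id, bucket) partition key so one partition holds
    at most one day of readings for one sensor, keeping partitions bounded."""
    assert bucket == "day"  # only daily buckets in this sketch
    return (sensor_id, ts.strftime("%Y-%m-%d"))

ts = datetime(2014, 1, 15, 13, 45, tzinfo=timezone.utc)
print(partition_key("sensor-42", ts))  # ('sensor-42', '2014-01-15')
```

In a real table the bucket would be part of the partition key and the reading timestamp a clustering column, so a day of readings is one sequential read.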
Writing Continuous Applications with Structured Streaming in PySpark (Databricks)
We are in the midst of a Big Data Zeitgeist in which data comes at us fast, in myriad forms and formats, at intermittent intervals or in a continuous stream, and we need to respond to streaming data immediately. This need has created the notion of writing a streaming application that reacts and interacts with data in real time. We call this a continuous application. In this talk we will explore the concepts and motivations behind continuous applications and how the Structured Streaming Python APIs in Apache Spark 2.x enable writing them. We will also examine the programming model behind Structured Streaming and the APIs that support it. Through a short demo and code examples, Jules will demonstrate how to write an end-to-end Structured Streaming application that reacts and interacts with both real-time and historical data to perform advanced analytics using the Spark SQL, DataFrames, and Datasets APIs.
Building a data lake is a daunting task. The promise of a virtual data lake is to provide the advantages of a data lake without consolidating all data into a single repository. With Apache Arrow and Dremio, companies can, for the first time, build virtual data lakes that provide full access to data no matter where it is stored and no matter what size it is.
This presentation briefly describes key features of Apache Cassandra. It was held at the Apache Cassandra Meetup in Vienna in January 2014. You can access the meetup here: http://www.meetup.com/Vienna-Cassandra-Users/
MaxScale switchover, failover, and auto rejoin (Wagner Bianchi)
How the MariaDB MaxScale switchover, failover, and rejoin work under the hood, by Esa Korhonen and Wagner Bianchi.
You can watch the video of the presentation at
https://www.linkedin.com/feed/update/urn:li:activity:6381185640607809536
- Understanding Time Series
- What's the Fundamental Problem
- Prometheus Solution (v1.x)
- New Design of Prometheus (v2.x)
- Data Compression Algorithm
A brief history of Instagram's adoption cycle of the open-source distributed database Apache Cassandra, in addition to details about its use case and implementation. This was presented at the San Francisco Cassandra Meetup at the Disqus HQ in August 2013.
Presenter: Robbie Strickland, Software Development Manager at The Weather Channel
As a reformed CQL critic, I'd like to help dispel the myths around CQL and extol its awesomeness. Most criticism comes from people like me who were early Cassandra adopters and are concerned about the SQL-like syntax, the apparent lack of control, and the reliance on a defined schema. I'll pop open the hood, showing just how the various CQL constructs translate to the underlying storage layer, and in the process I hope to give novices and old-timers alike a reason to love CQL.
Replication and Consistency in Cassandra... What Does it All Mean? (Christopher Bradford, DataStax)
Many users set the replication strategy on their keyspaces to NetworkTopologyStrategy and move on with modeling their data or developing the next big application. But what does that replication strategy really mean? Let's explore replication and consistency in Cassandra.
How are replicas chosen?
Where does node topology (location in a cluster) come into play?
What can I expect when nodes are down and I'm querying with a consistency level of LOCAL_QUORUM?
If a rack goes down, can I still respond to quorum queries?
These questions may be simple to test, but have nuances that should be understood. This talk will dive into these topics in a visual and technical manner. Seasoned Cassandra veterans and new users alike stand to gain knowledge about these critical Cassandra components.
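The quorum questions above reduce to simple arithmetic: a quorum of N replicas is floor(N/2) + 1, so availability depends on how many replicas remain reachable. A toy check (a hedged sketch with invented function names, not driver code):

```python
def quorum(replication_factor):
    """A quorum is a strict majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1

def can_satisfy_quorum(replication_factor, replicas_up):
    """True if enough replicas are reachable to answer a QUORUM request."""
    return replicas_up >= quorum(replication_factor)

# RF=3: quorum is 2, so losing one replica still answers QUORUM requests.
print(quorum(3))                 # 2
print(can_satisfy_quorum(3, 2))  # True
# Losing two of three replicas breaks quorum.
print(can_satisfy_quorum(3, 1))  # False
```

This is why RF=3 is such a common choice: it tolerates one unreachable replica per datacenter while still serving LOCAL_QUORUM reads and writes.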
About the Speaker
Christopher Bradford Solutions Architect, DataStax
High performance drives Christopher Bradford. He has worked across various industries, including the federal government, higher education, social news syndication, low-latency HD video delivery, and usability research. Chris combines application engineering principles and systems administration experience to design and implement performant systems. He has architected applications and systems to create highly available, fault-tolerant, distributed services in a myriad of environments.
Solr & Cassandra: Searching Cassandra with DataStax Enterprise (DataStax Academy)
Wait! Back away from the Cassandra secondary index. It's OK for some use cases, but it's not an easy button. "But I need to search through a bunch of columns to look for the data… and I can't model that in C*, even after watching all of Patrick McFadin's data modeling videos. What do I do?" The answer, dear developer, is in DSE Search. With its easy Solr API and Lucene indexes (and fault tolerance), you can search data stored in your Cassandra database to your heart's content. Take my hand. I will show you how.
Patrick Guillebert – IT-Tage 2015 – Cassandra NoSQL - Architektur und Anwendu... (Informatik Aktuell)
What makes the NoSQL database Cassandra so scalable and highly available?
Cassandra is a distributed, column-oriented NoSQL database that is easy to evolve; it uses Dynamo's replication mechanisms while exposing BigTable's data structure externally. It is designed for high scalability and fault tolerance in large, distributed systems. Data is stored as key-value relations.
Fault tolerance at scale: a look at how Apache Cassandra achieves fault tolerance via multi-datacenter and/or multi-region replication. Presented by Alex Thompson at the Sydney Cassandra Meetup.
Designing Resilient Application Platforms with Apache Cassandra - Hayato Shim... (jaxLondonConference)
Presented at JAX London 2013
All too often, infrastructure design for deploying Java applications comes as an afterthought for businesses, technical analysts, and application developers. Technology choices are frequently made with no final deployment infrastructure being discussed. The talk will cover design considerations for building resilient applications and application deployment platforms across multiple data centres, and how organisations can leverage technologies such as Apache Cassandra to achieve this.
Apache Cassandra operations have a reputation for being simple on single-datacenter deployments and/or low-volume clusters, but they become far more complex on high-latency multi-datacenter clusters with high volume and/or high throughput: basic Apache Cassandra operations such as repairs, compactions, or hints delivery can have dramatic consequences even on a healthy high-latency multi-datacenter cluster.
In this presentation, Julien will first go through Apache Cassandra multi-datacenter concepts, then show multi-datacenter operations essentials in detail: bootstrapping new nodes and/or datacenters, repair strategy, Java GC tuning, OS tuning, Apache Cassandra configuration, and monitoring.
Based on his three years' experience managing a multi-datacenter cluster running Apache Cassandra 2.0, 2.1, 2.2, and 3.0, Julien will give you tips on how to anticipate and prevent or mitigate issues related to basic Apache Cassandra operations in a multi-datacenter cluster.
About the Speaker
Julien Anguenot VP Software Engineering, iland Internet Solutions, Corp
Julien currently serves as iland's Vice President of Software Engineering. Prior to joining iland, Mr. Anguenot held tech leadership positions at several open-source content management vendors and tech startups in Europe and in the U.S. Julien is a long-time open-source software advocate, contributor, and speaker: a Zope, ZODB, and Nuxeo contributor and a member of the Zope and OpenStack foundations, he has spoken at ApacheCon, the Cassandra Summit, the OpenStack Summit, The WWW Conference, and EuroPython.
Apache Cassandra and The Multi-Cloud by Amanda Moran (Data Con LA)
Distributed databases, and more specifically cloud-native databases, were created to face many of the issues with a traditional relational database. Having a low-latency and highly available database is the key to preventing a multitude of issues. This talk will focus on what distributed databases provide and why it's important. This talk will also focus on how cloud-native databases like Apache Cassandra are the perfect match for multi-cloud architectures, and why multi-cloud is important.
Apache Cassandra operations have a reputation for being quite simple on single-datacenter and/or low-volume clusters, but they become far more complex on high-latency multi-datacenter clusters: basic operations such as repair, compaction, or hints delivery can have dramatic consequences even on a healthy cluster.
In this presentation, Julien will go through Cassandra operations in detail: bootstrapping new nodes and/or datacenters, repair strategies, compaction strategies, GC tuning, OS tuning, removal of large batches of data, and Apache Cassandra upgrade strategy.
Julien will give you tips and techniques on how to anticipate issues inherent to a multi-datacenter cluster: how and what to monitor, hardware and network considerations, as well as data-model and application-level bad designs and anti-patterns that can affect your multi-datacenter cluster's performance.
One of our presentations on the Cassandra database. Aruman implements big-data projects for its many clients; RDBMS-to-Cassandra conversion is one of the tasks undertaken by Aruman.
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft (DataStax Academy)
Companies today are innovating with real-time data to deliver truly amazing customer experiences in the moment. Real-time data management for real-time customer experience is core to staying ahead of the competition and driving revenue growth. Join Trays to learn how Comcast is differentiating itself from its own historical reputation with customer experience strategies.
Introduction to DataStax Enterprise Graph Database (DataStax Academy)
DataStax Enterprise (DSE) Graph is built to manage, analyze, and search highly connected data. DSE Graph, built on Apache Cassandra, delivers continuous uptime along with predictable performance and scale for modern systems dealing with complex and constantly changing data.
Download DataStax Enterprise: Academy.DataStax.com/Download
Start free training for DataStax Enterprise Graph: Academy.DataStax.com/courses/ds332-datastax-enterprise-graph
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra (DataStax Academy)
DataStax Enterprise Advanced Replication supports one-way distributed data replication from remote database clusters that might experience periods of network or internet downtime, benefiting use cases that require a 'hub and spoke' architecture.
Learn more at http://www.datastax.com/2016/07/stay-100-connected-with-dse-advanced-replication
Advanced Replication docs: https://docs.datastax.com/en/latest-dse/datastax_enterprise/advRep/advRepTOC.html
Data modeling is one of the first things to sink your teeth into when trying out a new database. That's why we are going to cover this foundational topic in enough detail for you to get dangerous. Data modeling for relational databases is more than a touch different from the way it's approached with Cassandra. We will address the quintessential query-driven methodology through a couple of different use cases, including working with time series data for IoT. We will also demo a new tool to get you bootstrapped quickly with MovieLens sample data. This talk should give you the basics you need to get serious with Apache Cassandra.
Hear about how Coursera uses Cassandra as the core of its scalable online education platform. I'll discuss the strengths of Cassandra that we leverage, as well as some limitations that you might run into as well in practice.
In the second part of this talk, we'll dive into how best to use the DataStax Java drivers effectively. We'll dig into how the driver is architected, and use this understanding to develop best practices to follow. I'll also share a couple of interesting bugs we've run into at Coursera.
Cassandra @ Sony: The good, the bad, and the ugly part 1 (DataStax Academy)
This talk covers scaling Cassandra for a fast-growing user base. Alex and Isaias will cover new best practices and how to work with the strengths and weaknesses of Cassandra at large scale. They will discuss how to adapt to bottlenecks while providing a rich feature set to the PlayStation community.
Cassandra @ Sony: The good, the bad, and the ugly part 2 (DataStax Academy)
This talk covers scaling Cassandra for a fast-growing user base. Alex and Isaias will cover new best practices and how to work with the strengths and weaknesses of Cassandra at large scale. They will discuss how to adapt to bottlenecks while providing a rich feature set to the PlayStation community.
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI Powered automation technology capabilities of UiPath. Also, hosted by our local partners Marc Ellis, you will enjoy a half-day packed with industry insights and automation peers networking.
Curious about our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35 Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 Discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services.
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
Securing your Kubernetes cluster: a step-by-step guide to success! (KatiaHIMEUR1)
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 (Albert Hoitingh)
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Generative AI Deep Dive: Advancing from Proof of Concept to Production (Aggregage)
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
The Metaverse and AI: how can decision-makers harness the Metaverse for their... (Jen Stirrup)
The Metaverse is popularized in science fiction, and now it is becoming closer to being a part of our daily lives through the use of social media and shopping companies. How can businesses survive in a world where Artificial Intelligence is becoming the present as well as the future of technology, and how does the Metaverse fit into business strategy when futurist ideas are developing into reality at accelerated rates? How do we do this when our data isn't up to scratch? How can we move towards success with our data so we are set up for the Metaverse when it arrives?
How can you help your company evolve, adapt, and succeed using Artificial Intelligence and the Metaverse to stay ahead of the competition? What are the potential issues, complications, and benefits that these technologies could bring to us and our organizations? In this session, Jen Stirrup will explain how to start thinking about these technologies as an organisation.
UiPath Test Automation using UiPath Test Suite series, part 4 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimizing testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Enhancing Performance with Globus and the Science DMZ (Globus)
ESnet has led the way in helping national facilities, and many other institutions in the research community, configure Science DMZs and troubleshoot network issues to maximize data transfer performance. In this talk we will present a summary of approaches and tips for getting the most out of your network infrastructure using Globus Connect Server.
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries: Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns, and DIAR helps you find such seeds.
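The seed-slimming idea can be sketched with a toy loop. This is a naive illustration under invented assumptions, not the actual DIAR technique: behaviour() here is a stand-in for a real coverage signal such as AFL's edge coverage.

```python
def behaviour(data: bytes):
    """Stand-in for a coverage signature: which 'interesting' byte values
    the target reacts to. Real fuzzers would observe edge coverage instead."""
    return frozenset(b for b in data if b in (ord('<'), ord('>')))

def prune_seed(seed: bytes) -> bytes:
    """Drop bytes whose removal leaves the behaviour signature unchanged,
    so later mutations are spent only on bytes that matter."""
    baseline = behaviour(seed)
    kept = bytearray()
    for i, b in enumerate(seed):
        candidate = bytes(kept) + seed[i + 1:]
        if behaviour(candidate) != baseline:
            kept.append(b)  # removing this byte changed behaviour: keep it
    return bytes(kept)

print(prune_seed(b"aa<bb>cc"))  # b'<>'
```

The payoff is the same as described above: a leaner seed means fewer wasted mutations on bytes the target never distinguishes.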
- These are the slides of the talk given at the IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW 2022).
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
4. What is Apache Cassandra?
Apache Cassandra™ is a massively scalable NoSQL database. Cassandra is designed to handle big data workloads across multiple data centers with no single point of failure, providing enterprises with continuous availability without compromising performance.
16. Overview of Data Partitioning in Cassandra
There are two basic data partitioning strategies:
1. Random partitioning – the default and recommended strategy. Partitions data as evenly as possible across all nodes using a hash of every column family row key.
2. Ordered partitioning – stores column family row keys in sorted order across the nodes in a database cluster.
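The hashing behind random partitioning can be sketched as a toy model in Python (not a driver API). Cassandra's RandomPartitioner hashes the row key with MD5 into a token space of 0 .. 2^127 − 1; the evenly sized per-node ranges below are an illustrative assumption:

```python
import hashlib

def token(row_key: bytes) -> int:
    # Random partitioning: hash the row key. Cassandra's RandomPartitioner
    # uses MD5, producing tokens in the range 0 .. 2**127 - 1.
    return int.from_bytes(hashlib.md5(row_key).digest(), "big") % (2**127)

def node_for_key(row_key: bytes, num_nodes: int) -> int:
    # Toy placement: evenly sized contiguous token ranges, one per node.
    range_size = 2**127 // num_nodes
    return min(token(row_key) // range_size, num_nodes - 1)

# Hashing spreads keys roughly evenly across a 4-node cluster:
counts = [0] * 4
for i in range(10_000):
    counts[node_for_key(f"user-{i}".encode(), 4)] += 1
```

Because the hash output is effectively uniform, each node ends up with close to a quarter of the 10,000 keys, which is exactly the "as evenly as possible" property the slide describes.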
20. Data Distribution
[Diagram: token ring with node tokens -9223372036854775808, -5534023222112865485, -1844674407370955162, 1844674407370955161, and 5534023222112865484. The highlighted node owns the token range 1844674407370955162 to 5534023222112865484.]
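Token-range ownership on a ring like this one can be sketched in Python (a toy model using the token values from the slide; in Cassandra, each node owns the range from the previous node's token, exclusive, up to its own token, inclusive):

```python
from bisect import bisect_left

# Node tokens from the ring on the slide (the Murmur3 token space spans
# -2**63 .. 2**63 - 1). Each node owns (previous_token, its_token].
ring = sorted([-9223372036854775808, -5534023222112865485,
               -1844674407370955162, 1844674407370955161,
               5534023222112865484])

def owner(token: int) -> int:
    """Index of the node whose range covers `token`."""
    i = bisect_left(ring, token)
    return i % len(ring)  # past the last token, wrap to the first node
```

A token of 3000000000000000000 falls between 1844674407370955161 and 5534023222112865484, so it lands on the node holding the latter token, matching the ownership shown in the diagram.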
21. Overview of Replication in Cassandra
• Replication is controlled by the replication factor. A replication factor of 1 means there is only one copy of each row in a cluster; a replication factor of 2 means there are two copies of each row stored in a cluster
• Replication is controlled at the keyspace level in Cassandra
[Diagram: ring showing an original row on one node and a copy of the row on another]
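Replica placement can be sketched in the style of Cassandra's SimpleStrategy: the first copy lives on the node that owns the row's token, and the remaining copies go to the next nodes walking clockwise around the ring (a toy model, not the real strategy class):

```python
def replicas(owner_index: int, num_nodes: int, replication_factor: int) -> list[int]:
    # SimpleStrategy-style placement: first replica on the owning node,
    # the rest on the next nodes clockwise around the ring.
    rf = min(replication_factor, num_nodes)  # RF cannot exceed node count
    return [(owner_index + i) % num_nodes for i in range(rf)]

# Replication factor 3 on a 5-node ring, row owned by node 4:
print(replicas(4, 5, 3))  # -> [4, 0, 1]
```

The modulo wrap is what makes the placement "circular": a row owned by the last node on the ring still gets its extra copies, on the first nodes.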
37. Reading and Writing to Cassandra Nodes
• Cassandra has a "location independence" architecture, which allows any user to connect to any node in any data center and read/write the data they need
• All writes are automatically partitioned and replicated throughout the cluster
43. Tunable Data Consistency
• Choose between strong and eventual consistency (one to all replicas responding) depending on the need
• Can be set on a per-operation basis, for both reads and writes
• Handles multi-data center operations

Writes: Any, One, Quorum, Local_Quorum, Each_Quorum, All
Reads: One, Quorum, Local_Quorum, Each_Quorum, All
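The arithmetic behind these levels can be sketched in a few lines (a simplification: QUORUM is a majority of replicas, and a read is guaranteed to see the latest write whenever the read and write replica sets must overlap):

```python
def quorum(replication_factor: int) -> int:
    # QUORUM = a majority of the replicas: floor(RF / 2) + 1
    return replication_factor // 2 + 1

def strongly_consistent(write_replicas: int, read_replicas: int, rf: int) -> bool:
    # Strong consistency when the write and read replica sets are forced
    # to overlap in at least one node: W + R > RF
    return write_replicas + read_replicas > rf

rf = 3
# QUORUM writes + QUORUM reads: 2 + 2 > 3, so reads see the latest write.
# ONE writes + ONE reads: 1 + 1 = 2, not > 3, so only eventual consistency.
```

This is why QUORUM/QUORUM is the common "strong but available" choice: with RF = 3 it tolerates one node being down for both reads and writes.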
61. Writes (what happens within each node)
• Data is first written to a commit log for durability; your data is safe in Cassandra
• Then written to a memtable in memory
• Once the memtable becomes full, it is flushed to an SSTable (sorted strings table)
• Writes are atomic at the row level: all columns are written or updated, or none are. RDBMS-style transactions are not supported

INSERT INTO… → commit log → memtable → SSTable

Cassandra is known for being one of the fastest databases in the industry where write operations are concerned.
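The commit log → memtable → SSTable sequence above can be sketched as a toy per-node model (the class, field names, and size-based flush trigger are illustrative assumptions, not Cassandra's implementation):

```python
class Node:
    """Toy model of the per-node write path: commit log, memtable, SSTables."""

    def __init__(self, memtable_limit: int = 3):
        self.commit_log = []        # append-only log, written first for durability
        self.memtable = {}          # in-memory, mutable, newest data
        self.sstables = []          # immutable tables, sorted by row key on flush
        self.memtable_limit = memtable_limit

    def write(self, row_key, columns):
        self.commit_log.append((row_key, columns))   # 1. commit log first
        self.memtable[row_key] = columns             # 2. then the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                             # 3. flush when "full"

    def flush(self):
        # SSTable = "sorted strings table": rows sorted by key, never mutated
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
```

Writes never touch an SSTable in place, which is a large part of why the write path is so fast: every step is an append or an in-memory update.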
64. Reads (what happens within each node)
• Depending on the frequency of inserts and updates, a record may exist in multiple places; each place must be read to retrieve the entire record
• Data is read from the memtable in memory
• Multiple SSTables may also be read
• Bloom filters prevent excessive reading of SSTables

SELECT * FROM… → memtable, plus Bloom filter → SSTable(s)
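The read path can be sketched as merging the memtable with every SSTable whose Bloom filter says the key might be present (a toy model: real Cassandra reconciles columns by timestamp, while here "newest source wins" via merge order, and the Bloom filter is a minimal hand-rolled one):

```python
import hashlib

class BloomFilter:
    # Tiny Bloom filter: k hash positions in an m-bit array.
    # False positives are possible; false negatives are not.
    def __init__(self, m: int = 1024, k: int = 3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, key: str):
        for i in range(self.k):
            digest = hashlib.md5(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, key: str):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key: str) -> bool:
        return all(self.bits >> p & 1 for p in self._positions(key))

def read(row_key, memtable, sstables):
    # Merge oldest SSTables first, memtable last, so newer values win.
    row = {}
    for bloom, table in sstables:
        if bloom.might_contain(row_key):   # skip SSTables that can't match
            row.update(table.get(row_key, {}))
    row.update(memtable.get(row_key, {}))
    return row
```

The Bloom filter check is what keeps reads cheap: an SSTable on disk is only touched when its filter says the key might be inside, so most non-matching tables are skipped without any I/O.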
69. Security in Cassandra
FEATURES AND BENEFITS

Internal Authentication – manages login IDs and passwords inside the database
+ Ensures only authorized users can access a database system using internal validation
+ Simple to implement and easy to understand
+ No learning curve from the relational world

Object Permission Management – controls who has access to what, and who can do what, in the database
+ Provides granular control over who can add/change/delete/read data
+ Uses familiar GRANT/REVOKE from relational systems
+ No learning curve

Client-to-Node Encryption – protects data in flight to and from a database cluster
+ Ensures data cannot be captured/stolen en route to a server
+ Data is safe both in flight to/from a database and on the database; complete coverage is ensured
70. Advanced Security in DataStax Enterprise
FEATURES AND BENEFITS

External Authentication – uses external security software systems to control security
+ Only authorized users have access to a database system using external validation
+ Uses the most trusted external security systems (Kerberos, LDAP, AD), mainstays in government and finance
+ Single sign-on to all data domains

Transparent Data Encryption – encrypts data at rest
+ Protects sensitive data at rest from theft and from being read at the file system level
+ No changes needed at the application level

Data Auditing – provides a trail of who did and looked at what, and when
+ Supplies admins with an audit trail of all accesses and changes
+ Granular control to audit only what's needed
+ Uses the log4j interface to ensure performant and efficient audit operations
72. DataStax OpsCenter
• Visual, browser-based user interface negates the need to install client software
• Administration tasks carried out in point-and-click fashion
• Allows for visual rebalancing of data across a cluster when new nodes are added
• Contains proactive alerts that warn of impending issues
• Built-in external notification abilities
• Visually perform and schedule backup operations
73. DataStax OpsCenter
A new, 10-node Cassandra (or Hadoop) DSE cluster with OpsCenter, running on AWS in 3 minutes…
1 → 2 → 3 → Done
75. Enterprise Search
• Built-in enterprise search on Cassandra data via Solr integration
• Facets, filtering, geospatial search, text analysis, etc.
• Near-real-time search operations
• Search queries from CQL and REST/Solr
• Addresses standalone Solr shortcomings:
  • No bottleneck: clients can read/write to any Solr node
  • Search index partitioning and replication for scalability and availability
  • Multi-DC support
  • Data durability (standalone Solr lacks a write-ahead log, so data can be lost)
76. [Diagram: Cassandra replication flowing between customer-facing nodes and search nodes]