INTRODUCTION TO
APACHE CASSANDRA
Gökhan Atıl
GÖKHAN ATIL
➤ Database Administrator
➤ Oracle ACE Director (2016), ACE (2011)
➤ 10g/11g and R12 Oracle Certified Professional (OCP)
➤ Co-author of Expert Oracle Enterprise Manager 12c
➤ Founding Member and Vice President of TROUG
➤ Blogger (since 2008) gokhanatil.com
➤ Twitter: @gokhanatil
INTRODUCTION TO APACHE CASSANDRA
➤ What is Apache Cassandra? Why use it?
➤ Cassandra Architecture
➤ Cassandra Query Language (CQL)
➤ Cassandra Data Modeling
➤ How to install and run Cassandra?
➤ Cassandra nodetool
➤ Backup and Recovery
WHAT IS APACHE CASSANDRA? WHY USE IT?
➤ Fast Distributed (Column Family NoSQL) Database
   ➤ High availability
   ➤ Linear Scalability
   ➤ High Performance
➤ Fault tolerant on Commodity Hardware
➤ Multi-Data Center Support
➤ Easy to operate
➤ Proven: CERN, Netflix, eBay, GitHub, Instagram, Reddit
HIGH AVAILABILITY: CAP THEOREM AND CASSANDRA
➤ CAP theorem: Consistency, Availability, Partition Tolerance
➤ RDBMS (ACID): Atomicity, Consistency, Isolation, Durability
HIGH AVAILABILITY: THE RING
➤ No master, no slave: peer to peer
➤ Nodes announce their state to each other through gossip ("I'm online!")
LINEAR SCALABILITY
CASSANDRA ARCHITECTURE
CASSANDRA PARTITIONS
EMAIL      NAME     PHONE
gokhan@    Gokhan   542xxxxxxx
aylin@     Aylin    532xxxxxxx
ilayda@    Ilayda   532xxxxxxx
➤ The partitioner hashes the partition key to decide which node stores the row
➤ PRIMARY KEY = (PARTITION KEY, CLUSTERING KEY)
REPLICATION FACTOR
➤ EMAIL gokhan@ → Murmur3Partitioner → # 60
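A minimal sketch of how a partition key could be turned into a token and then into a set of replica nodes. It assumes a toy ring of 100 tokens and four hypothetical nodes, and it uses MD5 from the Python standard library as a stand-in for Murmur3, so the tokens it prints will not match a real cluster.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 100  # toy token range, chosen only for illustration

# Hypothetical nodes; each one owns the token range that ends at its token.
NODES = {25: "node1", 50: "node2", 75: "node3", 99: "node4"}
TOKENS = sorted(NODES)


def token_for(partition_key):
    """Hash the partition key to a token (MD5 stands in for Murmur3)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % RING_SIZE


def replicas_for(token, replication_factor):
    """Start at the token's owner and walk the ring clockwise."""
    start = bisect_left(TOKENS, token) % len(TOKENS)
    count = min(replication_factor, len(TOKENS))
    return [NODES[TOKENS[(start + i) % len(TOKENS)]] for i in range(count)]


if __name__ == "__main__":
    token = token_for("gokhan@")
    print(token, replicas_for(token, replication_factor=3))
```

With these assumed nodes, token 60 falls in the range owned by node3, so a replication factor of 3 places the row on node3, node4 and node1.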
WRITE PATH (CLUSTER)
➤ Diagram: client → coordinator node → replica nodes, with hinted handoff
WRITE PATH (NODE)
➤ Logging data in the commit log
➤ Writing data to the memtable
➤ Flushing to (immutable) SSTables (Sorted Strings Table)
➤ In memory: memtable; on disk: commit log, SSTables
➤ Flush writes the memtable to a new SSTable; compaction merges SSTables
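The node-level write path can be sketched as a toy model. The class name, the two-entry memtable limit and the way the commit log is cleared are assumptions made for illustration, not how Cassandra is implemented.

```python
class Node:
    """Toy model of the node-level write path; names and sizes are illustrative."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only log, kept for crash recovery
        self.memtable = {}     # in memory, mutable
        self.sstables = []     # "on disk", immutable, sorted by key
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. log the write
        self.memtable[key] = value             # 2. apply it to the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                       # 3. flush a full memtable

    def flush(self):
        # An SSTable is written once, sorted by key, and never modified.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log = []   # flushed data no longer needs the log

    def compact(self):
        # Merge all SSTables; newer tables win when a key appears twice.
        merged = {}
        for table in self.sstables:   # oldest first
            merged.update(table)
        self.sstables = [tuple(sorted(merged.items()))]
```

Because SSTables are never updated in place, an overwritten key lives in several SSTables until compaction keeps only the newest value.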
READ PATH (CLUSTER)
➤ The client sends the read to a coordinator node
➤ One replica returns the data, the other replicas return a digest
➤ Read Repair: repair during read path using digest and timestamp
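Read repair can be sketched roughly as follows. The replicas are modelled as plain dictionaries and the function names are hypothetical; the sketch only shows the idea of comparing digests and letting the newest timestamp win.

```python
import hashlib


def digest(value):
    """Hash of a replica's value; cheaper to ship than the value itself."""
    return hashlib.md5(repr(value).encode("utf-8")).hexdigest()


def read_with_repair(replicas, key):
    """replicas: list of dicts mapping key -> (value, write_timestamp)."""
    responses = [replica[key] for replica in replicas]
    if len({digest(value) for value, _ in responses}) > 1:
        # Digests disagree: the value with the newest timestamp wins ...
        newest = max(responses, key=lambda response: response[1])
        # ... and is written back to every stale replica.
        for replica in replicas:
            if replica[key] != newest:
                replica[key] = newest
        return newest[0]
    return responses[0][0]
```

After one such read, every replica that took part holds the newest value.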
READ PATH (NODE)
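➤ In memory: memtable, row (read) cache, bloom filter, partition key cache, partition summary
➤ On disk: partition index, SSTable
➤ The bloom filter answers "maybe" or "no"; if the partition key cache finds the key, the read goes straight to the SSTable, otherwise through the partition summary and partition index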
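The "maybe or no" answer of a bloom filter can be illustrated with a small sketch. The bit-array size, the number of hashes and the use of SHA-256 are assumptions chosen for readability, not Cassandra's actual parameters.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, key):
        # Derive several bit positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False means "definitely not here"; True only means "maybe".
        return all(self.bits[pos] for pos in self._positions(key))
```

A "no" lets the read skip an SSTable entirely; a "maybe" still has to be checked against the SSTable, because a bloom filter can return false positives.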
CONSISTENCY LEVELS
➤ Formula for Strong Consistency: R + W > N
➤ ANY (write only): at least one node
➤ ONE, TWO, THREE: at least one/two/three replica nodes
➤ QUORUM: a quorum (N/2+1) of replica nodes across all datacenters
➤ LOCAL_QUORUM: a quorum (N/2+1) of replica nodes in the same datacenter
➤ ALL: all replica nodes
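The quorum size and the R + W > N rule can be checked with a few lines. The function names are hypothetical; N is the replication factor, R and W are the number of replicas that must answer a read or a write.

```python
def quorum(replication_factor):
    """Quorum size: N/2 + 1, using integer division."""
    return replication_factor // 2 + 1


def is_strongly_consistent(read_replicas, write_replicas, replication_factor):
    """R + W > N: every read set overlaps every write set in at least one replica."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    n = 3
    q = quorum(n)
    print(q)                                  # 2
    print(is_strongly_consistent(q, q, n))    # True  (QUORUM reads, QUORUM writes)
    print(is_strongly_consistent(1, 1, n))    # False (ONE reads, ONE writes)
```

With a replication factor of 3, QUORUM reads plus QUORUM writes satisfy the rule, while ONE plus ONE does not.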
CASSANDRA QUERY LANGUAGE (CQL)
➤ Create a Keyspace (Database):

create keyspace demo with replication = { 'class' :
'SimpleStrategy', 'replication_factor' :1 };
➤ Remove a keyspace:

drop keyspace demo;
➤ Select a keyspace to operate:

use demo;
➤ Create a table:

create table demo.democlients ( email text, name text,
phone text, primary key (email, name));
➤ Alter a table:

alter table democlients add money int;
➤ Remove a table:

drop table democlients;
➤ Remove all rows in a table:

truncate table democlients;
➤ In democlients, email is the partition key and name is the clustering key
➤ Retrieve rows:

select * from democlients where name='Gokhan Atil'
ALLOW FILTERING; -- or create a secondary index
➤ Retrieve distinct values:

select DISTINCT email from democlients;
➤ Limit the number of rows returned:

select * from democlients LIMIT 1;
➤ Sort the result:

select * from democlients where email='gokhan@gokhanatil.com' ORDER by name DESC;
➤ Retrieve the results in the JSON format:

select JSON * from democlients;
➤ Insert a row:

insert into democlients (email, name, phone) values ('gokhan@gokhanatil.com','Gokhan Atil','542' ) IF NOT EXISTS;
➤ Insert a row with TTL (time to live, in seconds):

insert into democlients (email, name, phone) values ('info@gokhanatil.com','Information','542' ) USING TTL 10;
➤ Update records:

update democlients set phone='535' where email='gokhan@gokhanatil.com' and name='Gokhan' IF EXISTS;
➤ Update records with a condition:

update democlients set money=20 where email='gokhan@gokhanatil.com' and name='Gokhan Atil' IF phone='542';
➤ Delete rows:

delete from democlients where email='gokhan@gokhanatil.com' IF EXISTS;
➤ Delete row with a condition:

delete from democlients where email='gokhan@gokhanatil.com' and name='Gokhan Atil' IF money > 10;
➤ Delete columns in a row:

delete money from democlients where email='gokhan@gokhanatil.com' and name='Gokhan Atil';
CASSANDRA DATA MODELING
➤ Query-Driven Data Modeling
➤ Spread data evenly across the cluster
➤ Use Denormalization
➤ Be careful about using secondary indexes
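Query-driven modeling with denormalization can be sketched as one "table" per query. The dictionaries and function names below are hypothetical and only stand in for Cassandra tables; the point is that every write goes to each table, so every read becomes a single key lookup.

```python
clients_by_email = {}   # answers: "find the client with this email"
clients_by_name = {}    # answers: "find all clients with this name"


def add_client(email, name, phone):
    """Write the same row to every table that serves a query."""
    row = {"email": email, "name": name, "phone": phone}
    clients_by_email[email] = row
    clients_by_name.setdefault(name, []).append(row)


def find_by_email(email):
    return clients_by_email.get(email)


def find_by_name(name):
    return clients_by_name.get(name, [])
```

The data is stored twice, which costs space and extra writes, but neither query needs a scan or a secondary index.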
HOW TO INSTALL AND RUN CASSANDRA?
HOW TO INSTALL AND RUN CASSANDRA CLUSTER?
➤ Make sure you have JDK (8u40 or newer) installed
➤ Download apache-cassandra-VERSION-bin.tar.gz
➤ Extract the file to a folder
➤ Make data and logs directories in cassandra folder
➤ Run bin/cassandra
➤ Edit the configuration file (conf/cassandra.yaml)
➤ Give a name to cluster, change listening address, data and logs
directory locations, enable authentication and authorization.
➤ Use Docker to pull the latest image:

docker pull cassandra
➤ Run it as standalone:

docker run --name cas1 -p 9042:9042 -e CASSANDRA_CLUSTER_NAME=MyCluster -d cassandra
➤ Connect using cqlsh:

docker exec -it cas1 cqlsh
➤ Run nodetool (e.g. to check status):

docker exec -it cas1 nodetool status
CASSANDRA NODETOOL
➤ Get a quick summary of the node:

nodetool info
➤ Get version of Cassandra:

nodetool version
➤ Get status of the cluster/keyspace:

nodetool status <keyspace_name>
➤ View the network statistics of the node:

nodetool netstats
➤ Get information of a table:

nodetool cfstats <keyspace_name.table_name>
➤ Repair a node (you can run it weekly on non-peak hours):

nodetool repair
➤ Cleanup of keys no longer belonging to a node:

nodetool cleanup
➤ Start a major compaction process:

nodetool compact
➤ Check the compaction process:

nodetool compactionstats
➤ Decommission a node (run it on the node you want to remove; it takes no node argument):

nodetool decommission
➤ Remove a dead or decommissioned node from the cluster:

nodetool removenode <node_UUID>
➤ Take a snapshot (for backup):

nodetool snapshot
➤ Remove previous snapshots:

nodetool clearsnapshot
BACKUP AND RECOVERY
➤ Back up a cluster:
1. Take a snapshot of each node.
2. Move the snapshots to another storage (S3 bucket?)
3. Clean all the snapshots
➤ Restore node(s):
➤ Make sure schema exists
➤ Truncate table
➤ Copy most recent snapshots to a directory. Its name should
be formatted as "keyspace/tablename". Run:

sstableloader -d <nodeip> keyspace/tablename
BUILD A BACKUP NODE
➤ Use multi-DC replication:

CREATE KEYSPACE "MyKeyspace"

WITH replication = { 

'class' : 'NetworkTopologyStrategy',

'datacenter1' : 3, 'datacenter2' : 1 };
➤ Diagram labels: RF=3, client, snapshots
QUESTIONS?
Blog: www.gokhanatil.com Twitter: @gokhanatil

 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 

Introduction to Cassandra

  • 2. GÖKHAN ATIL
    ➤ Database Administrator
    ➤ Oracle ACE Director (2016), ACE (2011)
    ➤ 10g/11g and R12 Oracle Certified Professional (OCP)
    ➤ Co-author of Expert Oracle Enterprise Manager 12c
    ➤ Founding Member and Vice President of TROUG
    ➤ Blogger (since 2008) gokhanatil.com
    ➤ Twitter: @gokhanatil
  • 3. INTRODUCTION TO APACHE CASSANDRA
    ➤ What is Apache Cassandra? Why to use it?
    ➤ Cassandra Architecture
    ➤ Cassandra Query Language (CQL)
    ➤ Cassandra Data Modeling
    ➤ How to install and run Cassandra?
    ➤ Cassandra nodetool
    ➤ Backup and Recovery
  • 4. WHAT IS APACHE CASSANDRA? WHY TO USE IT?
  • 5. WHAT IS APACHE CASSANDRA? WHY TO USE IT?
    ➤ Fast Distributed (Column Family NoSQL) Database
      High Availability
      Linear Scalability
      High Performance
    ➤ Fault tolerant on Commodity Hardware
    ➤ Multi-Data Center Support
    ➤ Easy to operate
    ➤ Proven: CERN, Netflix, eBay, GitHub, Instagram, Reddit
  • 6. HIGH AVAILABILITY: CAP THEOREM AND CASSANDRA
    [Diagram: CAP triangle with corners Consistency, Availability, Partition Tolerance; RDBMS is labeled with ACID: Atomicity, Consistency, Isolation, Durability]
  • 7. HIGH AVAILABILITY: THE RING
    [Diagram: peer-to-peer ring, no master and no slave; nodes announce themselves via gossip ("I'm online!")]
  • 10. CASSANDRA PARTITIONS
    EMAIL     NAME     PHONE
    gokhan@   Gokhan   542xxxxxxx
    aylin@    Aylin    532xxxxxxx
    ilayda@   Ilayda   532xxxxxxx
    [Diagram labels: partitioner; PRIMARY KEY = PARTITION KEY, CLUSTERING KEY]
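The role of the partitioner can be illustrated with a small sketch. It is not Cassandra's implementation: MD5 stands in for Murmur3 (which is not in the Python standard library), each node gets a single token derived from its name, and the `Ring` class and node names are hypothetical. Replicas are picked by walking clockwise from the key's token, in the spirit of SimpleStrategy.

```python
import bisect
import hashlib


def token(partition_key: str) -> int:
    """Hash a partition key to a 64-bit token (MD5 is a stand-in for Murmur3)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Toy token ring: one token per node, no virtual nodes."""

    def __init__(self, nodes):
        # Assumption: a node's position on the ring is the hash of its name.
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key, replication_factor=1):
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(self.tokens, token(partition_key)) % len(self.entries)
        count = min(replication_factor, len(self.entries))
        # The remaining replicas are the next nodes clockwise.
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(["node1", "node2", "node3", "node4"])
for email in ["gokhan@", "aylin@", "ilayda@"]:
    print(email, "->", ring.replicas(email, replication_factor=3))
```

Because the owner is derived only from the hash of the partition key, any node can work out where a row lives without asking a master.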
  • 13. WRITE PATH (NODE)
    ➤ Logging data in the commit log
    ➤ Writing data to the memtable
    ➤ Flushing to (immutable) SSTables (Sorted Strings Table)
    [Diagram: memtable in memory; commit log and SSTables on disk; the memtable is flushed to an SSTable; SSTables are merged by compaction]
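The three steps above can be sketched as a toy in-memory model. The `Node` class, its `memtable_limit` parameter, and the flush trigger are illustrative assumptions. A real node merges SSTables by write timestamp and handles tombstones, while this sketch simply lets newer SSTables override older ones.

```python
class Node:
    """Toy model of the per-node write path: commit log, memtable, SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only log (on disk in a real node)
        self.memtable = {}     # mutable, in memory
        self.sstables = []     # immutable, sorted tuples of (key, value), oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. log the write
        self.memtable[key] = value             # 2. write to the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                       # 3. flush when the memtable is full

    def flush(self):
        if not self.memtable:
            return
        # SSTables are sorted by key and never modified after they are written.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()   # flushed data no longer needs the log

    def compact(self):
        if not self.sstables:
            return
        merged = {}
        for sstable in self.sstables:   # later (newer) SSTables win
            merged.update(sstable)
        self.sstables = [tuple(sorted(merged.items()))]


node = Node(memtable_limit=2)
node.write("gokhan@", "542")
node.write("aylin@", "532")    # memtable full: first flush
node.write("gokhan@", "535")
node.write("ilayda@", "532")   # second flush
node.compact()
print(node.sstables)
```

After compaction only the newest value for `gokhan@` survives, which is why an update never has to modify an existing SSTable in place.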
  • 14. READ PATH (CLUSTER)
    ➤ Read Repair: repair during read path using digest and timestamp
    [Diagram: the client contacts a coordinator node, which requests the data from one replica and digests from the others]
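Read repair can be sketched as follows. This is a simplified illustration, not the actual protocol: replicas are plain dictionaries, every replica is assumed to hold the key, and the `digest` and `read_with_repair` helpers are hypothetical names.

```python
import hashlib


def digest(cell):
    """Digest of a (value, write_timestamp) cell."""
    value, timestamp = cell
    return hashlib.md5(f"{value}:{timestamp}".encode("utf-8")).hexdigest()


def read_with_repair(replicas, key):
    """replicas: list of dicts mapping key -> (value, write_timestamp)."""
    data = replicas[0][key]                            # full data from one replica
    digests = [digest(r[key]) for r in replicas[1:]]   # only digests from the rest
    if all(d == digest(data) for d in digests):
        return data[0]                                 # replicas agree
    # Mismatch: compare the full cells; the newest write timestamp wins.
    newest = max((r[key] for r in replicas), key=lambda cell: cell[1])
    for r in replicas:
        if r[key] != newest:
            r[key] = newest                            # repair the stale replica
    return newest[0]


phone_replicas = [
    {"gokhan@": ("542", 1)},
    {"gokhan@": ("535", 2)},   # only this replica received the latest update
    {"gokhan@": ("542", 1)},
]
print(read_with_repair(phone_replicas, "gokhan@"))
print(phone_replicas)
```

Sending digests instead of full rows keeps the common case cheap: the full data is only compared when the replicas disagree.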
  • 15. READ PATH (NODE)
    [Diagram labels: memtable, row (read) cache, bloom filter (answers "maybe" or "no"), partition key cache, partition summary, partition index, SSTable; lookup outcomes: found, maybe found, no; layers: mem, disk]
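The "maybe or no" answer of the bloom filter can be illustrated with a minimal sketch. The sizes, the use of SHA-256, and the `BloomFilter` class are assumptions for illustration, not Cassandra's implementation.

```python
import hashlib


class BloomFilter:
    """Minimal bloom filter: answers "maybe present" or "definitely absent"."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)   # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = 1

    def might_contain(self, key):
        # False means "no": the key was never added.
        # True means "maybe": all positions are set, possibly by other keys.
        return all(self.bits[p] for p in self._positions(key))


sstable_filter = BloomFilter()
sstable_filter.add("gokhan@")
print(sstable_filter.might_contain("gokhan@"))   # True: an added key is never reported absent
```

A "no" lets the node skip an SSTable without touching disk; a "maybe" still has to be confirmed through the index and the SSTable itself.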
  • 16. CONSISTENCY LEVELS
    ➤ Formula for Strong Consistency: R + W > N
      (R: replicas that must answer a read, W: replicas that must acknowledge a write, N: replication factor)
    ANY (write only)   at least one node
    ONE, TWO, THREE    at least one/two/three replica node(s)
    QUORUM             a quorum (N/2+1, rounded down) of replica nodes across all datacenters
    LOCAL_QUORUM       a quorum (N/2+1, rounded down) of replica nodes in the same datacenter
    ALL                all replica nodes
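The formula can be checked with a few lines of arithmetic. The function names below are illustrative; the quorum size uses integer division, as in N/2+1.

```python
def quorum(replication_factor: int) -> int:
    """Quorum size for N replicas: N/2 + 1 with integer division."""
    return replication_factor // 2 + 1


def is_strongly_consistent(r: int, w: int, n: int) -> bool:
    """True when every read set overlaps every write set: R + W > N."""
    return r + w > n


n = 3
q = quorum(n)
print(q)                                 # 2
print(is_strongly_consistent(q, q, n))   # True: QUORUM reads + QUORUM writes (2 + 2 > 3)
print(is_strongly_consistent(1, 1, n))   # False: ONE reads + ONE writes (1 + 1 <= 3)
print(is_strongly_consistent(1, n, n))   # True: ONE reads + ALL writes (1 + 3 > 3)
```

With N = 3, QUORUM on both reads and writes tolerates one unavailable replica while still guaranteeing that a read sees the latest acknowledged write.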
  • 18. CASSANDRA QUERY LANGUAGE (CQL)
    ➤ Create a keyspace (database):
      create keyspace demo with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
    ➤ Remove a keyspace:
      drop keyspace demo;
    ➤ Select a keyspace to operate on:
      use demo;
  • 19. CASSANDRA QUERY LANGUAGE (CQL)
    ➤ Create a table (EMAIL: PARTITION KEY, NAME: CLUSTERING KEY):
      create table demo.democlients ( email text, name text, phone text, primary key (email, name));
    ➤ Alter a table:
      alter table democlients add money int;
    ➤ Remove a table:
      drop table democlients;
    ➤ Remove all rows in a table:
      truncate table democlients;
  • 20. CASSANDRA QUERY LANGUAGE (CQL)
    ➤ Retrieve rows (NAME: CLUSTERING KEY):
      select * from democlients where name='Gokhan Atil' ALLOW FILTERING; -- or create a secondary index
    ➤ Retrieve distinct values (EMAIL: PARTITION KEY):
      select DISTINCT email from democlients;
    ➤ Limit the number of rows returned:
      select * from democlients LIMIT 1;
    ➤ Sort the result:
      select * from democlients where email='gokhan at gokhanatil.com' ORDER by name DESC;
  • 21. CASSANDRA QUERY LANGUAGE (CQL)
    ➤ Retrieve the results in JSON format:
      select JSON * from democlients;
    ➤ Insert a row:
      insert into democlients (email, name, phone) values ('gokhan at gokhanatil.com','Gokhan Atil','542' ) IF NOT EXISTS;
    ➤ Insert a row with TTL (time to live, in seconds):
      insert into democlients (email, name, phone) values ('info at gokhanatil.com','Information','542' ) USING TTL 10;
  • 22. CASSANDRA QUERY LANGUAGE (CQL)
    ➤ Update records:
      update democlients set phone='535' where email='gokhan at gokhanatil.com' and name='Gokhan Atil' IF EXISTS;
    ➤ Update records with a condition:
      update democlients set money=20 where email='gokhan at gokhanatil.com' and name='Gokhan Atil' IF phone='542';
    ➤ Delete rows:
      delete from democlients where email='gokhan at gokhanatil.com' IF EXISTS;
  • 23. CASSANDRA QUERY LANGUAGE (CQL)
    ➤ Delete a row with a condition:
      delete from democlients where email='gokhan at gokhanatil.com' and name='Gokhan Atil' IF money > 10;
    ➤ Delete columns in a row:
      delete money from democlients where email='gokhan at gokhanatil.com' and name='Gokhan Atil';
  • 24. CASSANDRA DATA MODELING
    ➤ Query-Driven Data Modeling
    ➤ Spread data evenly across the cluster
    ➤ Use Denormalization
    ➤ Be careful about using secondary indexes
  • 25. HOW TO INSTALL AND RUN CASSANDRA?
  • 26. HOW TO INSTALL AND RUN CASSANDRA CLUSTER?
    ➤ Make sure you have a JDK (8u40 or newer) installed
    ➤ Download apache-cassandra-VERSION-bin.tar.gz
    ➤ Extract the file to a folder
    ➤ Create data and logs directories in the Cassandra folder
    ➤ Run bin/cassandra
    ➤ Edit the configuration file (conf/cassandra.yaml)
    ➤ Name the cluster, change the listen address and the data and logs directory locations, and enable authentication and authorization.
  • 27. HOW TO INSTALL AND RUN CASSANDRA CLUSTER?
    ➤ Use Docker to pull the latest image:
      docker pull cassandra
    ➤ Run it as a standalone node:
      docker run --name cas1 -p 9042:9042 -e CASSANDRA_CLUSTER_NAME=MyCluster -d cassandra
    ➤ Connect using cqlsh:
      docker exec -it cas1 cqlsh
    ➤ Run nodetool (e.g. to check the status):
      docker exec -it cas1 nodetool status
  • 29. CASSANDRA NODETOOL
    ➤ Get a quick summary of the node:
      nodetool info
    ➤ Get the version of Cassandra:
      nodetool version
  • 30. CASSANDRA NODETOOL
    ➤ Get the status of the cluster/keyspace:
      nodetool status <keyspace_name>
    ➤ View the network statistics of the node:
      nodetool netstats
    ➤ Get information about a table:
      nodetool cfstats <keyspace_name.table_name>
  • 31. CASSANDRA NODETOOL
    ➤ Repair a node (you can run it weekly during off-peak hours):
      nodetool repair
    ➤ Clean up keys that no longer belong to a node:
      nodetool cleanup
    ➤ Start a major compaction process:
      nodetool compact
    ➤ Check the compaction process:
      nodetool compactionstats
  • 32. CASSANDRA NODETOOL
    ➤ Decommission a node (run it on the live node you want to remove; it takes no node argument):
      nodetool decommission
    ➤ Remove a dead node from the cluster (run it from another node):
      nodetool removenode <node_UUID>
    ➤ Take a snapshot (for backup):
      nodetool snapshot
    ➤ Remove previous snapshots:
      nodetool clearsnapshot
  • 34. BACKUP AND RECOVERY
    ➤ Back up a cluster:
      1. Take a snapshot of each node.
      2. Move the snapshots to another storage location (an S3 bucket?).
      3. Clear the snapshots on the nodes.
    ➤ Restore node(s):
      ➤ Make sure the schema exists
      ➤ Truncate the table
      ➤ Copy the most recent snapshots to a directory whose path ends in "keyspace/tablename", then run:
        sstableloader -d <nodeip> keyspace/tablename
  • 35. BUILD A BACKUP NODE
    ➤ Use multi-DC replication:
      CREATE KEYSPACE "MyKeyspace"
      WITH replication = {
        'class' : 'NetworkTopologyStrategy',
        'datacenter1' : 3, 'datacenter2' : 1 };
    [Diagram labels: RF=3, client, snapshots]
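As a rough illustration of what this replication map implies for consistency levels, the sketch below computes quorum sizes per datacenter and cluster-wide. The dictionary mirrors the keyspace definition above; the helper name is illustrative.

```python
def quorum(replication_factor: int) -> int:
    """Quorum size: N/2 + 1 with integer division."""
    return replication_factor // 2 + 1


# Replication factors per datacenter, as in the keyspace definition above.
replication = {"datacenter1": 3, "datacenter2": 1}

# LOCAL_QUORUM only counts replicas in the coordinator's datacenter.
local_quorum = {dc: quorum(rf) for dc, rf in replication.items()}
# QUORUM counts replicas across all datacenters.
cluster_quorum = quorum(sum(replication.values()))

print(local_quorum)     # {'datacenter1': 2, 'datacenter2': 1}
print(cluster_quorum)   # 3
```

With this layout, clients in datacenter1 using LOCAL_QUORUM never wait for the single backup replica in datacenter2, whereas QUORUM needs three of the four replicas overall.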