Apache Cassandra
Overview and Basics
© Oleg Magazov
omagazov@t-online.de
Learning Targets

• Get an introduction to Big Data
• Understand the driving forces behind NoSQL development
• Map known RDBMS concepts to the corresponding NoSQL paradigms
• Get an overview of the Apache Cassandra™ architecture
• Get an overview of the Cassandra™ data model
• Get first experience with Cassandra™ packaging and the CLI
Agenda

• Big Data
• NoSQL. Main Technologies
• NoSQL. Products
• Apache Cassandra™ Features
• Apache Cassandra™ Architecture
• Apache Cassandra™ Data Modeling
• Apache Cassandra™ CLI
Big Data
Origin

• April 1998: John R. Mashey (SGI), Usenix talk “Big Data and the Next Wave of Infrastress”
• Big Data refers to huge data volumes, continuously increasing data sources, the velocity of data generation, data analysis, and related technology solutions
IDC Analysis
Global Forecast for 2020

• 40 zettabytes of data on Earth
• 5,247 GB of data for every man, woman, and child on Earth in 2020
Cisco Forecasts
Some Facts...
Big Data Driving Forces

• Continued growth of Internet usage, social networks, and smartphones
• Falling costs of the technology for creating, capturing, and storing information
• Migration from analog TV to digital TV
• Growth of machine-to-machine communication
Main Producer

• Machine-generated data is a key factor behind the expansion
• Its share grows from 11% of the digital universe in 2005 to more than 40% in 2020
  – Machine logs
  – RFID readers
  – Sensor networks
  – Vehicle GPS traces
  – Retail transactions
The Analysis Gap

Currently only 3% of the potentially useful data is
tagged, and even less is analyzed
Storage Capacities

• I/O for HDDs is time-consuming
• Reading a full 1 TB drive at a transfer speed of 300 MB/s (SATA) takes ~1 h
• SSDs are about 5 times faster on average
• SSDs are more expensive
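The read-time figure on this slide can be checked with simple arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant transfer rate, and takes the "5 times faster" SSD factor from the slide as given; the function name is made up for illustration.

```python
# Back-of-the-envelope check of the sequential read time quoted on the slide.
TB = 10**12  # decimal terabyte (assumption)
MB = 10**6   # decimal megabyte (assumption)


def full_scan_seconds(capacity_bytes: int, rate_bytes_per_s: int) -> float:
    """Time to stream the whole device once at a constant transfer rate."""
    return capacity_bytes / rate_bytes_per_s


hdd_seconds = full_scan_seconds(1 * TB, 300 * MB)
ssd_seconds = hdd_seconds / 5  # uses the slide's "about 5 times faster" factor

print(f"HDD: {hdd_seconds / 60:.1f} min")
print(f"SSD: {ssd_seconds / 60:.1f} min")
```

The HDD result comes out a little under an hour, which matches the slide's "~1 h" rounding.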
Random Seeks

• Seek time is improving more slowly than transfer rate
• Random seeks are expensive
• Inherent to most RDBMS
Structure

• Data is becoming increasingly semi-structured and unstructured
• Unstructured data is data without a schema
• Semi-structured data
  – does not conform to relational database structures
  – is self-describing, containing tags or other structure-related markers
Limitations of RDBMS

• Up-front schema declaration is needed
• Referential integrity is necessary
• Mainly use B-tree indexes
• Non-linear scaling
• Are built around OLTP and OLAP approaches
• Many solutions are really expensive
ACID

• Atomicity
• Consistency
• Isolation
• Durability
Isolation Levels

• Read Uncommitted
• Read Committed
• Repeatable Read
• Serializable
ACID in Distributed Systems

• Two-phase commit (2PC)
• Two-phase locking (2PL)
Roadmap

• Parallel processing
• Sharding and shared-nothing architecture
• Reliability through replication
• Advanced algorithms for parallel processing
• Advanced storage structures addressing the seek problem
NoSQL
CAP Theorem
BASE

• BASE: Basically Available, Soft state, Eventual consistency
R: number of nodes that are read from
W: number of nodes that are written to
N: total number of nodes in the cluster
R + W = 2N (every read and every write touches all N nodes): ACID compliant
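The relationship between R, W, and N can be explored by brute force. The sketch below is an illustration, not part of the original slides: it enumerates every possible read set and write set and checks whether they always share at least one node. The slide's case R + W = 2N is the strictest setting; by a pigeonhole argument the overlap already holds whenever R + W > N. The function name is hypothetical.

```python
from itertools import combinations


def always_overlap(n: int, r: int, w: int) -> bool:
    """True if every r-node read set shares a node with every w-node write set."""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)  # an empty intersection is falsy
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )


n = 3
for r in range(1, n + 1):
    for w in range(1, n + 1):
        print(f"N={n} R={r} W={w} R+W={r + w}: overlap guaranteed = {always_overlap(n, r, w)}")
```

When a read set always overlaps the latest write set, at least one node in every read holds the most recent value; when it does not, a read may return stale data until the replicas converge.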
Sharding
Sharding

• Feature-based sharding (functional segmentation)
• Key-based sharding
• Lookup table
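The three sharding approaches listed above can be contrasted in a few lines. This is a minimal sketch under stated assumptions: the shard names, the keys, and the use of MD5 as the hash are all made up for illustration and do not describe any particular product.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]  # hypothetical shard names


def key_based_shard(key: str) -> str:
    """Key-based sharding: hash the key and take it modulo the shard count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest, "big") % len(SHARDS)]


# Lookup-table sharding: an explicit directory decides where each key lives.
LOOKUP_TABLE = {"alice": "shard-2", "bob": "shard-0"}  # hypothetical assignments


def lookup_table_shard(key: str) -> str:
    return LOOKUP_TABLE[key]


# Feature-based sharding: each functional area gets its own store.
FEATURE_SHARDS = {"users": "shard-0", "orders": "shard-1"}  # hypothetical features


def feature_based_shard(feature: str) -> str:
    return FEATURE_SHARDS[feature]


print(key_based_shard("alice"), lookup_table_shard("bob"), feature_based_shard("orders"))
```

Key-based sharding needs no directory but reshuffles most keys when the shard count changes; a lookup table allows arbitrary placement at the cost of maintaining the directory.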
Setting Context

• “The Google File System”, October 2003
• “MapReduce: Simplified Data Processing on Large Clusters”, December 2004
• “Bigtable: A Distributed Storage System for Structured Data”, November 2006
• “The Chubby Lock Service for Loosely-Coupled Distributed Systems”, November 2006
MapReduce

• Created by Google
• Parallel processing model
• Data locality
• Allows distributed processing of large data sets in a cluster
• Derives its ideas from functional programming
• Works with semi-structured data
MapReduce

• map(key1, value1) -> list<key2, value2>
• reduce(key2, list<value2>) -> list<value3>
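The two signatures above can be made concrete with the classic word-count example. The sketch below runs everything in a single process, so it only illustrates the data flow (map, group by key, reduce), not the distribution across a cluster; all function and variable names are assumptions chosen for this example.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_fn(doc_id: str, text: str) -> Iterator[tuple[str, int]]:
    """map(key1, value1) -> list<key2, value2>: emit (word, 1) for every word."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word: str, counts: Iterable[int]) -> list[int]:
    """reduce(key2, list<value2>) -> list<value3>: sum the counts for one word."""
    return [sum(counts)]


def run_mapreduce(inputs: dict[str, str]) -> dict[str, list[int]]:
    # Shuffle phase: group all intermediate values by their intermediate key.
    groups = defaultdict(list)
    for key1, value1 in inputs.items():
        for key2, value2 in map_fn(key1, value1):
            groups[key2].append(value2)
    # Reduce phase: one reduce call per intermediate key, in sorted key order.
    return {key2: reduce_fn(key2, values) for key2, values in sorted(groups.items())}


docs = {"d1": "big data big clusters", "d2": "data locality"}
result = run_mapreduce(docs)
print(result)
```

Because each map call sees only its own input and each reduce call sees only one key, both phases can in principle run in parallel on different machines.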
Amazon Dynamo

• “Dynamo: Amazon’s Highly Available Key-value Store”, October 2007
• Introduces the notion of eventual consistency
• There can be short intervals of inconsistency between replicated nodes
• Eventual consistency does not mean inconsistency
Amazon Dynamo

• Masterless
• Physical nodes are peers organized into a ring
• Automatic partitioning mechanism
• Written in Java
Apache Hadoop

• 2004: initial versions of the Hadoop Distributed Filesystem and MapReduce implemented
• January 2006: Doug Cutting joins Yahoo!
• February 2006: Apache Hadoop project officially started
NoSQL Features

• Advocate horizontal scalability over vertical scalability
• Promise linear scalability
• Use new, advanced technologies for parallel processing
• Often use a custom file system implementation or advanced storage techniques
• Optionally schema-free
• Have no concept of locking, or make locking a design choice
NoSQL Databases Classification

• Sorted Ordered Column-Oriented Stores
• Key/Value Stores
• Document Databases
• Graph Databases
Ordered Column-Oriented Stores

• Store data sets (column families) as sections of columns
  – Set of key (column)/value pairs
  – Sorted by row key (primary key)
• Units of data are sorted and ordered on the basis of the row key
Column-Oriented Stores
Key/Value Stores

• Idea
  – HashMap: fast O(1) access
• The key of a key/value pair is unique within the set and can easily be looked up to access the data
• Eventual consistency
Products
Document Databases

• Keep documents as loosely structured sets of key/value pairs, typically JSON (JavaScript Object Notation)
• Treat a document as a whole and avoid splitting it into its constituent name/value pairs
• Allow documents to be indexed not only by their primary identifier but also by their properties
Products
Graph Databases
Graph Databases

• Use graph structures with nodes, edges, and properties to represent and store data
• Are based on graph theory
• Are faster for associative data sets
• Do not require expensive join operations
• Are best suited to graph-like queries
Products
Apache Cassandra™
History

• Originated at Facebook in 2007 to solve the company’s inbox search problem
• July 2008: open-sourced as a Google Code project
• March 2009: Apache Incubator project
• February 2010: top-level Apache project
• November 2013: version 2.0.3 released
Cassandra Features (Part I)

• High availability
• Linear and elastic scalability
• Distributed and decentralized
• Peer-to-peer
• No single point of failure
Cassandra Features (Part II)

• Fault tolerance and built-in failure detection
• Tunable consistency
• Supports a basic subset of SQL via CQL
• Command-line access to the store
• Basic security support
Cassandra Features (Part III)

• Thrift interface and an internal Java API
• Clients for multiple languages and frameworks: Java, Python, Grails, PHP, .NET, Ruby, Scala
• Support for JMX interfaces
• Built-in benchmarking
• Hadoop and MapReduce integration
Cassandra in CAP Triangle
Architecture. Big Picture
Architecture Components. Part I

• Consistent hashing
• Virtual nodes
• Gossip and failure detection
• Hinted handoff
• Anti-entropy and read repair
Architecture Components. Part II

• Ring topology
• Staged Event-Driven Architecture (SEDA)
• Compaction
• Tombstones
• Memtables, SSTables, and commit logs
Architecture Components. Part III

• Row and key caches
• Bloom filters
• Merkle trees
• Compression
• Atomic batches
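Of the components listed above, the Bloom filter is easy to illustrate in isolation. Cassandra is commonly described as keeping a Bloom filter per SSTable so that a read can skip SSTables that cannot contain the requested key. The sketch below shows only the general technique; the class, its parameters, and the SHA-256-based hashing are assumptions for illustration and do not reflect Cassandra's actual implementation.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, key: str) -> Iterator[int]:
        # Derive num_hashes positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


sstable_filter = BloomFilter()
for row_key in ["Cabrio", "Corolla", "fit", "focus"]:
    sstable_filter.add(row_key)

print(sstable_filter.might_contain("Cabrio"))
```

A negative answer lets a read skip the corresponding file entirely, which is why the structure helps with the seek problem discussed earlier in the deck.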
Tunable Consistency

• Replication Factor (RF)
• Quorum
  – R + W > RF
  – Quorum = floor(RF / 2) + 1
• The consistency level is chosen per read and write operation
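The quorum formula and the R + W > RF condition can be tried out directly. This is a small sketch with made-up function names; it treats R and W simply as the number of replicas that must acknowledge a read or a write.

```python
def quorum(rf: int) -> int:
    """Quorum = floor(RF / 2) + 1, using integer division."""
    return rf // 2 + 1


def reads_see_latest_write(r: int, w: int, rf: int) -> bool:
    """R + W > RF: every read set overlaps every write set in at least one replica."""
    return r + w > rf


for rf in (1, 2, 3, 5):
    q = quorum(rf)
    print(f"RF={rf}: quorum={q}, quorum reads + quorum writes overlap: {reads_see_latest_write(q, q, rf)}")

# Reading and writing a single replica with RF=3 does not guarantee overlap.
print("RF=3, R=1, W=1:", reads_see_latest_write(1, 1, 3))
```

With RF = 3, quorum reads and quorum writes each touch 2 replicas, so one replica can be down without giving up the overlap guarantee.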
Replication Strategy

• SimpleStrategy
• NetworkTopologyStrategy
• The replica placement strategy is specified when a keyspace is created
Simple Strategy

• For single data center clusters
• The first replica is placed on the node determined by the partitioner
• Additional replicas are placed on the next nodes clockwise in the ring
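The placement rule above can be sketched as a walk around a token ring. The example below is a simplified illustration: each node owns exactly one token (virtual nodes are not modeled), MD5 stands in for the partitioner, and the node names are hypothetical.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (MD5 is an illustrative choice)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest(), "big")


class Ring:
    def __init__(self, nodes: list[str]) -> None:
        # Sorting by token gives the clockwise order of nodes on the ring.
        self.entries = sorted((token(node), node) for node in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, row_key: str, replication_factor: int) -> list[str]:
        # First replica: first node whose token is >= the key's token, wrapping around.
        start = bisect.bisect_left(self.tokens, token(row_key)) % len(self.entries)
        count = min(replication_factor, len(self.entries))
        # Additional replicas: the next nodes clockwise.
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])  # hypothetical node names
print(ring.replicas("Cabrio", replication_factor=3))
```

Because only the key's token decides where the walk starts, any node can compute the replica set for a key without consulting a central directory.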
Simple Strategy
Data Distribution and Replication

• How do data distribution and replication work in Cassandra?
Consistent Hashing
Client Request Workflow
Network Topology

• Data center: a grouping of nodes configured together for replication purposes
• Rack: a similar physical grouping of nodes
• The snitch maps IPs to racks and data centers
  – All nodes in a cluster must use the same snitch configuration
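Conceptually, a snitch is a mapping from a node's IP address to its data center and rack. The sketch below models that mapping with a plain dictionary; the addresses, data center names, rack names, and function names are all invented for illustration and do not correspond to a real snitch implementation or configuration file format.

```python
from typing import NamedTuple


class Location(NamedTuple):
    data_center: str
    rack: str


# Hypothetical topology; every address and name here is made up.
TOPOLOGY = {
    "10.0.1.11": Location("DC1", "RAC1"),
    "10.0.1.12": Location("DC1", "RAC2"),
    "10.0.2.11": Location("DC2", "RAC1"),
}


def snitch(ip: str) -> Location:
    """Resolve a node's IP address to its data center and rack."""
    return TOPOLOGY[ip]


def nodes_by_data_center(topology: dict[str, Location]) -> dict[str, list[str]]:
    """Group node addresses by data center, as a replication strategy might need."""
    grouped: dict[str, list[str]] = {}
    for ip, location in topology.items():
        grouped.setdefault(location.data_center, []).append(ip)
    return grouped


print(snitch("10.0.1.12"))
print(nodes_by_data_center(TOPOLOGY))
```

If two nodes resolved the same address to different locations, they would disagree about replica placement, which is why every node has to share the same mapping.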
Cassandra Client API

• Cassandra CLI, Thrift-based
• CQL3, native protocol
• cqlsh, which depends on Python
• Drivers for multiple languages
• Java: CQL3 via the DataStax 1.0 driver
DataStax Java Driver

• Works only with CQL3
• Layered architecture
• Relies on Netty for non-blocking I/O, giving a fully asynchronous architecture
• Connection pooling, node discovery
• Automatic failover, load balancing
• Supports prepared statements
Some Services

• Daemon
• Storage
• Gossip
• Messaging
• Load Balancing
Data Model

• RDBMS vs. Cassandra terminology
RDBMS View
Cassandra View
Cassandra vs. RDBMS (Part I)

• No referential integrity
• No support for joins
• Limited SQL support
• Denormalization
Cassandra vs. RDBMS (Part II)

• Collections can be stored in a field
• Row size is a design issue
• Comparators for column families
• Ordering is a design issue
Cassandra View
Keyspaces

• Replication factor
• Replica placement strategy
• Column families
• Usually one keyspace per application
Column Families

• Serve as containers for an ordered collection of columns/rows
• Are not equivalent to RDBMS tables
• Column families have to be defined up front; the individual columns do not
• Entries in column families are grouped by row key
• All data for a single row must fit on a single machine in the cluster
Column Families
Static Column Families

• Use a relatively static set of column names
• Are more similar to a relational database table
• Have a metadata definition for individual columns
Dynamic Column Families

• Allow result sets to be pre-computed and stored in a single row for efficient data retrieval
• Define the type information for column names and values (comparators and validators)
• Actual column names and values are set by the application when a column is inserted
Column

• Row keys and column names can be any kind of byte array
• Useful data can be stored in the key itself, not only in the value
• Up to 2 billion columns per (physical) row
Legacy: Super Columns
Composite Columns

• Are used under the hood to store clustered rows
• All logical rows with the same partition key are stored as a single, physical wide row
• Can be created and queried using CQL 3
• Support range queries
• Replace super columns
Skinny Rows

• Are like traditional RDBMS rows
• Each row contains a similar set of column names
• But all columns are optional
Wide Rows

• Have lots (potentially millions) of columns
• Typically contain automatically generated names (like UUIDs or timestamps)
• Are used to store lists of things
• All logical rows with the same partition key are stored as a single, physical row
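The way logical rows collapse into one physical wide row can be sketched with nested dictionaries. The example below is illustrative only: the sensor readings, key names, and helper functions are invented, and a sorted dictionary stands in for on-disk column ordering.

```python
from collections import defaultdict

# Hypothetical logical rows: (partition key, clustering key, value).
logical_rows = [
    ("sensor-1", "2013-11-25T10:00", 21.5),
    ("sensor-2", "2013-11-25T10:00", 19.0),
    ("sensor-1", "2013-11-25T10:05", 21.7),
    ("sensor-1", "2013-11-25T09:55", 21.4),
]


def to_wide_rows(rows):
    """Group logical rows by partition key; order columns by clustering key."""
    physical = defaultdict(dict)
    for partition_key, clustering_key, value in rows:
        physical[partition_key][clustering_key] = value
    return {pk: dict(sorted(cols.items())) for pk, cols in physical.items()}


def range_query(wide_row, start, end):
    """Return the columns whose names fall within [start, end]."""
    return {name: value for name, value in wide_row.items() if start <= name <= end}


wide_rows = to_wide_rows(logical_rows)
print(wide_rows["sensor-1"])
print(range_query(wide_rows["sensor-1"], "2013-11-25T10:00", "2013-11-25T10:05"))
```

Using timestamps as column names keeps each sensor's readings sorted inside one row, so a time-range read only has to look at a contiguous slice of that row.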
Practice Drive
Download and Install

• Cassandra requires at least the Java 1.7 JDK (http://www.oracle.com/technetwork/java/javase/downloads/index.html)
• Download from http://cassandra.apache.org/download/
• Extract into a directory of your choice
• Customize cassandra.yaml in the conf directory
• Start with bin/cassandra -f
Create Schema

cassandra-cli -host localhost -port 9160
create keyspace TestsDataStore;
show keyspaces;
use TestsDataStore;
create column family Cars
  with comparator = UTF8Type;
update column family Cars with
  column_metadata =
  [
  {column_name: make, validation_class: UTF8Type},
  {column_name: model, validation_class: UTF8Type}
  ];
Populate With Data

assume Cars keys as utf8;
set Cars['Cabrio']['make'] = 'bmw';
set Cars['Cabrio']['model'] = '640i';
set Cars['Corolla']['make'] = 'toyota';
set Cars['Corolla']['model'] = 'le';
set Cars['fit']['make'] = 'honda';
set Cars['fit']['model'] = 'fit sport';
set Cars['focus']['make'] = 'ford';
set Cars['focus']['model'] = 'sel';
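The structure these set commands build can be pictured as a map of maps: row key to column name to value. The sketch below models that with plain Python dictionaries; it is an analogy for the data model only, with helper names invented for this example, and it does not talk to Cassandra.

```python
# Row key -> {column name -> value}; plain dicts stand in for the column family.
cars: dict[str, dict[str, str]] = {}


def set_column(row_key: str, column: str, value: str) -> None:
    """Analogue of: set Cars[row_key][column] = value;"""
    cars.setdefault(row_key, {})[column] = value


def get_row(row_key: str) -> dict[str, str]:
    """Return a row's columns ordered by name, mirroring a UTF8Type comparator."""
    return dict(sorted(cars.get(row_key, {}).items()))


def del_column(row_key: str, column: str) -> None:
    """Remove a single column from a row; missing rows or columns are ignored."""
    cars.get(row_key, {}).pop(column, None)


set_column("Cabrio", "model", "640i")
set_column("Cabrio", "make", "bmw")
set_column("Corolla", "make", "toyota")
set_column("Corolla", "model", "le")

print(get_row("Cabrio"))  # {'make': 'bmw', 'model': '640i'}
```

Note that columns are addressed by name within a row, so deleting a column takes the column name (for example make), not its value.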
Data Manipulation

get Cars['Cabrio'];
get Cars['Cabrio']['make'];
update column family Cars with comparator = UTF8Type
  and column_metadata = [
  {column_name: make, validation_class: UTF8Type, index_type: KEYS},
  {column_name: model, validation_class: UTF8Type}
  ];
del Cars['Cabrio']['make'];
drop column family Cars;
drop keyspace TestsDataStore;
show keyspaces;
Agile Development with Cassandra

• Facilitates agile development by providing a schema-free data model and a query-first paradigm
• Makes TDD easier by providing built-in test tools
• Is built around multiple design patterns, facilitating a Clean Code approach
• Its decentralized nature makes distributed work easier (including geographical distribution)
Use Cases

• Large deployments
• Lots of writes, statistics, and analysis
• Geographical distribution
• Very large data volumes
• High reliability requirements for data storage
Some Users

More Related Content

What's hot

Evaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseEvaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseDataStax
 
Cassandra basics 2.0
Cassandra basics 2.0Cassandra basics 2.0
Cassandra basics 2.0Asis Mohanty
 
Apache Cassandra overview
Apache Cassandra overviewApache Cassandra overview
Apache Cassandra overviewElifTech
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architectureT Jake Luciani
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overviewSean Murphy
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinChristian Johannsen
 
Intro to cassandra
Intro to cassandraIntro to cassandra
Intro to cassandraAaron Ploetz
 
Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Benoit Perroud
 
Pythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterPythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterDataStax Academy
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...DataStax Academy
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real WorldJeremy Hanna
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architectureMarkus Klems
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internalsnarsiman
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to CassandraSoftwareMill
 

What's hot (20)

Evaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseEvaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud Database
 
Cassandra basics 2.0
Cassandra basics 2.0Cassandra basics 2.0
Cassandra basics 2.0
 
Cassandra
CassandraCassandra
Cassandra
 
Apache Cassandra overview
Apache Cassandra overviewApache Cassandra overview
Apache Cassandra overview
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architecture
 
Cassandra ppt 2
Cassandra ppt 2Cassandra ppt 2
Cassandra ppt 2
 
Cassandra
CassandraCassandra
Cassandra
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 
Intro to cassandra
Intro to cassandraIntro to cassandra
Intro to cassandra
 
Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26
 
Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to Cassandra
 
Pythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterPythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra Cluster
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internals
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 

Viewers also liked

Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra ExplainedEric Evans
 
Voice and Video on the Web
Voice and Video on the WebVoice and Video on the Web
Voice and Video on the WebKundan Singh
 
Talk About Apache Cassandra
Talk About Apache CassandraTalk About Apache Cassandra
Talk About Apache CassandraJacky Chu
 
Apache Cassandra
Apache CassandraApache Cassandra
Apache CassandraSperasoft
 
Real-time Analytics with Cassandra, Spark, and Shark
Real-time Analytics with Cassandra, Spark, and SharkReal-time Analytics with Cassandra, Spark, and Shark
Real-time Analytics with Cassandra, Spark, and SharkEvan Chan
 
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016DataStax
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012jbellis
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basicsnickmbailey
 
Indexing in Cassandra
Indexing in CassandraIndexing in Cassandra
Indexing in CassandraEd Anuff
 
How Do I Cassandra?
How Do I Cassandra?How Do I Cassandra?
How Do I Cassandra?Rick Branson
 
Understanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraUnderstanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraDataStax
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache CassandraRobert Stupp
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraPatrick McFadin
 
Cassandra under the hood
Cassandra under the hoodCassandra under the hood
Cassandra under the hoodAndriy Rymar
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Eric Evans
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Sparkdatamantra
 
Reactive app using actor model & apache spark
Reactive app using actor model & apache sparkReactive app using actor model & apache spark
Reactive app using actor model & apache sparkRahul Kumar
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & FeaturesDataStax Academy
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Helena Edelson
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache SparkRahul Jain
 

Viewers also liked (20)

Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
 
Voice and Video on the Web
Voice and Video on the WebVoice and Video on the Web
Voice and Video on the Web
 
Talk About Apache Cassandra
Talk About Apache CassandraTalk About Apache Cassandra
Talk About Apache Cassandra
 
Apache Cassandra
Apache CassandraApache Cassandra
Apache Cassandra
 
Real-time Analytics with Cassandra, Spark, and Shark
Real-time Analytics with Cassandra, Spark, and SharkReal-time Analytics with Cassandra, Spark, and Shark
Real-time Analytics with Cassandra, Spark, and Shark
 
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basics
 
Indexing in Cassandra
Indexing in CassandraIndexing in Cassandra
Indexing in Cassandra
 
How Do I Cassandra?
How Do I Cassandra?How Do I Cassandra?
How Do I Cassandra?
 
Understanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraUnderstanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache Cassandra
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and Cassandra
 
Cassandra under the hood
Cassandra under the hoodCassandra under the hood
Cassandra under the hood
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Reactive app using actor model & apache spark
Reactive app using actor model & apache sparkReactive app using actor model & apache spark
Reactive app using actor model & apache spark
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 

Similar to Apache Cassandra Overview and Basics

Sa introduction to big data pipelining with cassandra &amp; spark west mins...
Sa introduction to big data pipelining with cassandra &amp; spark   west mins...Sa introduction to big data pipelining with cassandra &amp; spark   west mins...
Sa introduction to big data pipelining with cassandra &amp; spark west mins...Simon Ambridge
 
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...Fwdays
 
NoSql Data Management
NoSql Data ManagementNoSql Data Management
NoSql Data Managementsameerfaizan
 
Big data berlin
Big data berlinBig data berlin
Big data berlinkammeyer
 
Bigdata antipatterns
Bigdata antipatternsBigdata antipatterns
Bigdata antipatternsAnurag S
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesDataWorks Summit
 
Spark volume requirements 2018
Spark volume requirements 2018Spark volume requirements 2018
Spark volume requirements 2018Rachit Arora
 
Hadoop for Bioinformatics: Building a Scalable Variant Store
Hadoop for Bioinformatics: Building a Scalable Variant StoreHadoop for Bioinformatics: Building a Scalable Variant Store
Hadoop for Bioinformatics: Building a Scalable Variant StoreUri Laserson
 
Data Pipelines with Spark & DataStax Enterprise
Data Pipelines with Spark & DataStax EnterpriseData Pipelines with Spark & DataStax Enterprise
Data Pipelines with Spark & DataStax EnterpriseDataStax
 
No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageBethmi Gunasekara
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservicesBigstep
 
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...Michael Rys
 
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.MaharajothiP
 
SpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople
 
NoSQL Intro with cassandra
NoSQL Intro with cassandraNoSQL Intro with cassandra
NoSQL Intro with cassandraBrian Enochson
 

Similar to Apache Cassandra Overview and Basics (20)

Sa introduction to big data pipelining with cassandra &amp; spark west mins...
Sa introduction to big data pipelining with cassandra &amp; spark   west mins...Sa introduction to big data pipelining with cassandra &amp; spark   west mins...
Sa introduction to big data pipelining with cassandra &amp; spark west mins...
 
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
 
NoSql Data Management
NoSql Data ManagementNoSql Data Management
NoSql Data Management
 
Big data berlin
Big data berlinBig data berlin
Big data berlin
 
Bigdata antipatterns
Bigdata antipatternsBigdata antipatterns
Bigdata antipatterns
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on Kubernetes
 
Spark volume requirements 2018
Spark volume requirements 2018Spark volume requirements 2018
Spark volume requirements 2018
 
Hadoop for Bioinformatics: Building a Scalable Variant Store
Hadoop for Bioinformatics: Building a Scalable Variant StoreHadoop for Bioinformatics: Building a Scalable Variant Store
Hadoop for Bioinformatics: Building a Scalable Variant Store
 
Architecting Your First Big Data Implementation
Architecting Your First Big Data ImplementationArchitecting Your First Big Data Implementation
Architecting Your First Big Data Implementation
 
Nosql databases
Nosql databasesNosql databases
Nosql databases
 
Data Pipelines with Spark & DataStax Enterprise
Data Pipelines with Spark & DataStax EnterpriseData Pipelines with Spark & DataStax Enterprise
Data Pipelines with Spark & DataStax Enterprise
 
NoSQL_Night
NoSQL_NightNoSQL_Night
NoSQL_Night
 
CC -Unit4.pptx
CC -Unit4.pptxCC -Unit4.pptx
CC -Unit4.pptx
 
Oracle OpenWo2014 review part 03 three_paa_s_database
Oracle OpenWo2014 review part 03 three_paa_s_databaseOracle OpenWo2014 review part 03 three_paa_s_database
Oracle OpenWo2014 review part 03 three_paa_s_database
 
No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data Storage
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservices
 
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
Best practices on Building a Big Data Analytics Solution (SQLBits 2018 Traini...
 
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
 
SpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud Computing
 
NoSQL Intro with cassandra
NoSQL Intro with cassandraNoSQL Intro with cassandra
NoSQL Intro with cassandra
 

Apache Cassandra Overview and Basics