Learning Cassandra NoSQL
- Pankaj Khattar
What are we going to learn today?
 New Problems which can’t be handled by traditional RDBMS
 Tradeoff between Consistency, Availability, Partition Tolerance (CAP theorem)
 What are the different solutions available?
 What is Cassandra?
 Use-Cases for Cassandra
 Cassandra Features – Tunable Consistency, P2P Architecture, Elastic Scalability, Column Orientation
 Data Model for Cassandra
 Demo Application using Cassandra
Twitter – Massive Scale, High Availability
Travel Booking – Scale and Availability
Movie Booking – Consistency and Scale
Facebook Graph Search – Fast, Complex Querying
Facebook Messenger – Consistency and Scale
So, What Is Common?
 Huge Data
 Fast Random access
 Variable Schema
 Need of Compression
 High Availability
 Need for Consistency
 Need of Distribution (Sharding)
Brewer’s CAP Theorem
 Consistency (C), Availability (A), Partition Tolerance (P)
 CA: RDBMS
 CP: MongoDB, HBase, Redis
 AP: CouchDB, Cassandra, DynamoDB, Riak
Source: http://www.w3resource.com/mongodb/nosql.php
NoSQL Landscape
 Big Table Clones: BigTable (Google), Cassandra, HBase, Hypertable
 Key-Value Stores: Dynamo (Amazon), Voldemort (LinkedIn), Citrusleaf, Membase, Riak, Tokyo Cabinet
 Document Databases: CouchOne, MongoDB, Terrastore, OrientDB
 Graph Databases: FlockDB (Twitter), AllegroGraph, DEX, InfoGrid, Neo4J, Sones
[Diagram axes: Performance; Query and Navigational Complexity; Scalability & Speed]
Cassandra Usecase – Deep Dive
[Diagram labels: 5000 TPS, 300 ~ 500 SQL transactions; Web Application; Caching Layer; Elastic Scale; 1000 TPS, 100 ~ 200 SQL transactions; Applications Changing Data; RDBMS 1; RDBMS 2]
Using Cassandra
[Diagram labels: 5000 TPS, 300 ~ 500 SQL transactions; Web Application; Elastic Scale; Cassandra; 1000 TPS, 100 ~ 200 SQL transactions; Applications Changing Data]
Cassandra Usecase - Summary
E-Commerce (Travel Portal)
 Both B2B & B2C consumers
 High volume of shopping transactions (> 500 million visits/day)
 High volume of supply changes (manual and system generated)
 Huge inventory database (millions of hotels)
 High read/write load (thousands of reads and writes per second)
 Application has to be 99.99% available
 Fault tolerant and reliable
 Fast shopping experience
 Elastic scale
 Innovative recommendations and algorithms
 Should be fast to adapt to new changes
 Should be cost effective to maintain
Development Approaches
 Legacy Way (Pure RDBMS)
 Augmented (RDBMS + Caching, Heavy Database Hardware)
 Using Cassandra
What is Apache Cassandra?
Apache Cassandra is an open source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tunably consistent, column-oriented database.
Cassandra Features
 Open Source
 Column Oriented
 Decentralized
 Tunably Consistent
 Elastically Scalable
 Distributed
 Highly Scalable
 Fault Tolerant
Distributed and Decentralized
[Diagram: post office analogy comparing a centralised setup with a decentralised one. Labels: Post Office; CCY Exchange; Stationery; Letter/Couriers]
Distributed and Decentralized
 Every node is identical.
 Nodes talk peer to peer and use a gossip protocol to keep the list of nodes in sync.
 No special host coordinates activities.
 No single point of failure.
 Easier to operate and maintain because all nodes are the same.
Elastic Scalability
Types of Scalability
 Vertical Scalability
 Horizontal Scalability
What is Elastic Scalability?
 Elastic scalability is a special property of horizontal scalability.
 The cluster can seamlessly scale up and scale back down without major disruption.
Elastic Scalability
 The cluster must accept new nodes without major disruption or reconfiguration.
 The process does not have to be restarted.
 Application queries do not have to change.
 Data does not have to be rebalanced manually.
ADD A NODE AND MOVE ON!!
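Why adding a node causes so little disruption can be illustrated with a toy hash ring. The following is a minimal sketch, not Cassandra's implementation: the `Ring` class, the `keys_moved_by_adding` helper, the node names and the use of MD5 as the hash are all assumptions made for the example (Cassandra's default partitioner is Murmur3-based and typically assigns several tokens per node).

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is only a stand-in hash)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy consistent-hash ring with one token per node."""

    def __init__(self, nodes):
        self._tokens = sorted((token(n), n) for n in nodes)

    def add_node(self, node: str) -> None:
        # Insert the new node's token while keeping the ring sorted.
        bisect.insort(self._tokens, (token(node), node))

    def owner(self, key: str) -> str:
        # The owner is the first node whose token is >= the key's token,
        # wrapping around to the start of the ring if there is none.
        idx = bisect.bisect_left(self._tokens, (token(key), ""))
        return self._tokens[idx % len(self._tokens)][1]


def keys_moved_by_adding(nodes, new_node, keys):
    """Return {key: new_owner} for every key whose owner changes."""
    ring = Ring(nodes)
    before = {k: ring.owner(k) for k in keys}
    ring.add_node(new_node)
    return {k: ring.owner(k) for k in keys if ring.owner(k) != before[k]}


if __name__ == "__main__":
    keys = [f"hotel-{i}" for i in range(1000)]
    moved = keys_moved_by_adding(["node-a", "node-b", "node-c"], "node-d", keys)
    print(f"{len(moved)} of {len(keys)} keys moved, all to:", set(moved.values()))
```

In this model the only keys that change owner are the ones the new node takes over; every other key stays where it was, which is the intuition behind "add a node and move on".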
High Availability and Fault Tolerance
Highly Available
 No Downtime
Tunable Consistency
 Cassandra enables us to define consistency as per application requirements
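One way to reason about tunable consistency is the replica-count rule R + W > N: if the replicas read (R) plus the replicas written (W) exceed the replication factor (N), every read overlaps the latest acknowledged write. The sketch below only models that arithmetic. The function names are hypothetical; ONE, QUORUM and ALL are Cassandra consistency level names, but the other levels are left out here.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge an operation at the given level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"level not covered by this sketch: {level}")


def reads_see_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > N)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    rf = 3
    for write_level, read_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ALL", "ONE")]:
        print(write_level, read_level, reads_see_latest_write(write_level, read_level, rf))
```

With a replication factor of 3, ONE/ONE trades consistency for speed and availability, while QUORUM/QUORUM keeps reads consistent and still tolerates one replica being down.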
High Performance
 Cassandra was designed specifically from the ground up to take full advantage
of multiprocessor/ multicore machines, and to run across many dozens of
these machines housed in multiple data centres.
 It scales consistently and seamlessly to hundreds of terabytes.
 Shows exceptional performance under heavy loads.
 Consistently shows very fast throughput for writes per second on a basic
commodity workstation.
Where to Use Cassandra?
Use if your application has:
 Big Data (Billions of Records, Rows & Columns)
 Very High Velocity Random Reads & Writes
 Flexible Sparse / Wide Column Requirements
 No Multiple Secondary Index Needs
 Low Latency
Use Cases:
 eCommerce Inventory Cache
 Time Series / Events
 Feed-Based Activities
Where NOT to Use Cassandra?
Don’t use if your application needs:
 Secondary indexes
 Relational data
 Transactions (rollback, commit)
 Primary and financial records
 Stringent security and authorization on data
 Dynamic queries on columns
 Searching column data
 Low latency
Data Model
RDBMS vs Cassandra
 In RDBMS,
 Define Schema
 Define tables with defined columns
 The table defines the column names and their data types
 Add rows conforming to that schema: each row contains the same fixed set of
columns.
 In Cassandra,
 Define Keyspaces
 Define column families/tables
 Column families can define metadata about the columns
 Each row can have a different set of columns
Data Model – Column Families
Designing Column Families/Tables
 Static Column Families,
 Static set of column names
 Similar to a relational database table
 Rows are not required to have all of the columns defined
 Dynamic Column Families,
 Use arbitrary column names to store data
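The difference between static and dynamic column families can be pictured as a map of row keys to per-row column maps. This is a rough mental model only, not how Cassandra stores data; the table names, row keys and values below are made up for illustration.

```python
from typing import Dict, List

Row = Dict[str, str]           # column name -> column value
ColumnFamily = Dict[str, Row]  # row key -> row

# Static column family: column names come from a known set,
# but a row is free to leave some of them unset.
users: ColumnFamily = {
    "alice": {"name": "Alice", "email": "alice@example.com"},
    "bob": {"name": "Bob"},  # no email column for this row
}

# Dynamic column family: the column names themselves carry data
# (here, a date), so every row can have a different set of columns.
hotel_price_by_date: ColumnFamily = {
    "hotel-42": {"2024-01-01": "120", "2024-01-02": "135"},
    "hotel-77": {"2024-03-15": "99"},
}


def column_names(table: ColumnFamily, row_key: str) -> List[str]:
    """Return the sorted column names that a given row actually holds."""
    return sorted(table.get(row_key, {}))


if __name__ == "__main__":
    print(column_names(users, "bob"))
    print(column_names(hotel_price_by_date, "hotel-42"))
```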
Data Model - Keys
Type of Keys
 Primary Key
create table test ( key text PRIMARY KEY, data text );
 Composite Primary (or Compound) Key
create table test ( key_part_one text, key_part_two int, data text, PRIMARY KEY(key_part_one, key_part_two) );
 In the statement above, the first part of the key (key_part_one) is the Partition Key and the second part (key_part_two) is the Clustering Key
 The Partition Key is responsible for data distribution across your nodes
 The Clustering Key is responsible for data sorting within the partition
 The Primary Key is equivalent to the Partition Key in a single-field-key table
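The two roles of a compound key can be sketched in a few lines, reusing the column names from the test table above. This is a simplified model under stated assumptions: the `Table` class, `node_for` and the node names are hypothetical, MD5 stands in for Cassandra's partitioner, and modulo placement stands in for the token ring.

```python
import hashlib
from collections import defaultdict

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster


def node_for(partition_key: str) -> str:
    """Pick a node from the partition key alone (data distribution)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


class Table:
    """Rows grouped by partition key and ordered by clustering key."""

    def __init__(self):
        # partition key -> {clustering key -> data}
        self._partitions = defaultdict(dict)

    def insert(self, key_part_one: str, key_part_two: int, data: str) -> None:
        self._partitions[key_part_one][key_part_two] = data

    def select(self, key_part_one: str):
        """Return one partition's rows sorted by clustering key."""
        partition = self._partitions.get(key_part_one, {})
        return [(ck, partition[ck]) for ck in sorted(partition)]


if __name__ == "__main__":
    table = Table()
    table.insert("sensor-1", 3, "third")
    table.insert("sensor-1", 1, "first")
    table.insert("sensor-2", 2, "other partition")
    print(node_for("sensor-1"), table.select("sensor-1"))
```

All rows that share key_part_one land on the same node and come back ordered by key_part_two, regardless of insertion order.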
Data Model - Columns
Type of Columns
 Standard Columns
 A tuple containing a name, a value and a timestamp
 Expiring Columns
 optional expiration date called TTL (time to live)
 defined in seconds
 Counter Columns
 Store a number that incrementally counts the occurrences of an event
 Example: page views
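The three column types can be modelled with a small data class. The sketch below is illustrative only: `Column`, `reconcile` and `increment` are hypothetical names, timestamps are plain seconds for readability, and tie-breaking on equal timestamps is simplified.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Column:
    """A standard column: name, value and write timestamp, plus optional TTL."""
    name: str
    value: str
    timestamp: float            # write time in seconds (assumption of this sketch)
    ttl: Optional[int] = None   # time to live in seconds; None means no expiry

    def is_live(self, now: float) -> bool:
        # An expiring column stops being visible once its TTL has elapsed.
        return self.ttl is None or now < self.timestamp + self.ttl


def reconcile(a: Column, b: Column) -> Column:
    """Resolve two versions of one column: the later write wins."""
    return a if a.timestamp >= b.timestamp else b


def increment(counters: Dict[str, int], name: str, delta: int = 1) -> int:
    """Counter column: add delta to the stored count and return the new total."""
    counters[name] = counters.get(name, 0) + delta
    return counters[name]


if __name__ == "__main__":
    session = Column("session", "abc123", timestamp=100.0, ttl=30)
    print(session.is_live(now=120.0), session.is_live(now=130.0))
    page_views: Dict[str, int] = {}
    increment(page_views, "/home")
    print(increment(page_views, "/home"))
```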
Data Model - Columns
Type of Columns
 Composite Columns
CREATE TABLE timeline ( user_id varchar, tweet_id uuid, author varchar, body varchar, PRIMARY KEY (user_id, tweet_id) );
Data Model – Data Types
Data Types (Comparators & Validators)
 The data type for a column (or row key) value is called a validator.
 The data type for a column name is called a comparator.
 Data types can be defined when the column family schema is created, but this is not required.
 Internally, Cassandra stores column names and values as hex byte arrays (BytesType).
Data Model - Indexes
Types of Indexes
 Primary Indexes
 The primary index for a column family is the index of its row keys.
 Each node maintains this index for the data it manages.
 Secondary Indexes
 Indexes on column values.
 Implemented as a hidden table, separate from the main table.
 Do not use secondary indexes to query a huge volume of records for a small number of results.
 In that case it is more efficient to manually maintain a lookup column family than to use a secondary index.
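A manually maintained lookup column family boils down to writing each record twice: once under its row key and once under the value you want to search by. The sketch below shows the idea with two in-memory maps; the class and field names are hypothetical, and a real application would issue two writes against two Cassandra tables instead.

```python
from typing import Dict, Optional


class UsersWithEmailLookup:
    """Main table keyed by user id plus a hand-maintained lookup by email."""

    def __init__(self) -> None:
        self.users_by_id: Dict[str, Dict[str, str]] = {}
        self.user_id_by_email: Dict[str, str] = {}

    def upsert(self, user_id: str, email: str, name: str) -> None:
        previous = self.users_by_id.get(user_id)
        if previous is not None and previous["email"] != email:
            # The application is responsible for removing the stale entry.
            self.user_id_by_email.pop(previous["email"], None)
        self.users_by_id[user_id] = {"email": email, "name": name}
        self.user_id_by_email[email] = user_id

    def find_by_email(self, email: str) -> Optional[Dict[str, str]]:
        # One key lookup in each table; no scan over all users.
        user_id = self.user_id_by_email.get(email)
        return None if user_id is None else self.users_by_id[user_id]


if __name__ == "__main__":
    store = UsersWithEmailLookup()
    store.upsert("u1", "ann@example.com", "Ann")
    store.upsert("u1", "ann@example.org", "Ann")
    print(store.find_by_email("ann@example.org"), store.find_by_email("ann@example.com"))
```

The cost of this design is that the application must keep both maps in step on every write, which is the trade-off against letting the database maintain a secondary index.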
Cassandra – Writes/Reads
Writes in Cassandra
 Writes go first to a commit log (for durability) and then to an in-memory table structure called a memtable.
 There is minimal disk I/O at the time of the write.
 Writes are batched in memory and periodically flushed to disk into a persistent table structure called an SSTable (sorted string table).
 Memtables and SSTables are maintained per column family.
 Memtables are kept sorted by row key and flushed to SSTables sequentially.
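The write path described above can be sketched as three in-memory structures. This is a toy model and not Cassandra's storage engine: the `WritePath` class and its `flush_threshold` parameter are assumptions for the example, and the "disk" is just a Python list.

```python
class WritePath:
    """Toy write path: commit log append, memtable update, flush to SSTable."""

    def __init__(self, flush_threshold: int = 3) -> None:
        self.commit_log = []   # append-only record of writes, kept for durability
        self.memtable = {}     # in-memory table: row key -> value
        self.sstables = []     # flushed tables, each an immutable sorted tuple
        self.flush_threshold = flush_threshold

    def write(self, row_key: str, value: str) -> None:
        # 1. Append to the commit log first.
        self.commit_log.append((row_key, value))
        # 2. Apply the write to the memtable.
        self.memtable[row_key] = value
        # 3. Flush once enough rows have accumulated in memory.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        if not self.memtable:
            return
        # Rows are written out in row-key order, in one sequential pass.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # The flushed writes no longer need the commit log for recovery.
        self.commit_log.clear()


if __name__ == "__main__":
    path = WritePath(flush_threshold=2)
    path.write("row-b", "1")
    path.write("row-a", "2")
    print(path.sstables, path.memtable)
```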
Reads in Cassandra
 At read time, a row must be combined from all SSTables on disk (as well as unflushed
memtables) to produce the requested data.
 Each SSTable has a Bloom filter associated with it that checks if a requested row key exists
in the SSTable before doing any disk seeks.
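A compact way to see the read path is to give each SSTable a small Bloom filter and merge the row fragments it returns. The sketch below makes several assumptions: the `BloomFilter`, `SSTable` and `read_row` names, the filter size and the SHA-256 based hashing are all illustrative, and fragments are merged by table age rather than by per-column timestamps.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 1024, hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.hashes = hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: str):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SSTable:
    """Immutable table on 'disk' guarded by a Bloom filter."""

    def __init__(self, rows) -> None:
        self.rows = dict(rows)  # row key -> {column name: value}
        self.bloom = BloomFilter()
        for row_key in self.rows:
            self.bloom.add(row_key)
        self.disk_reads = 0

    def get(self, row_key: str):
        if not self.bloom.might_contain(row_key):
            return None  # definitely absent, so the disk seek is skipped
        self.disk_reads += 1
        return self.rows.get(row_key)


def read_row(row_key: str, memtable, sstables):
    """Combine a row from all SSTables (oldest first) and the memtable."""
    merged = {}
    for table in sstables:
        fragment = table.get(row_key)
        if fragment:
            merged.update(fragment)
    merged.update(memtable.get(row_key, {}))
    return merged


if __name__ == "__main__":
    old = SSTable({"user-1": {"name": "Ann"}})
    new = SSTable({"user-1": {"email": "ann@example.com"}})
    print(read_row("user-1", {"user-1": {"name": "Anne"}}, [old, new]))
```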
Application Demo
Cassandra Installation & Configuration
 conf/cassandra.yaml
 Tools
Key Space Setup
Column Family / Data Model Setup
 Key
 Columns & Data Types
 Indexes (Primary & Secondary)
 Programmatic Consistency
Thrift Hector API
CQL3 API
Questions?
Thanks

Learning Cassandra NoSQL

  • 2. What are we going to learn today?  New Problems which can’t be handled by traditional RDBMS  Tradeoff between Consistency, Availability, Partition Tolerance (CAP theorem)  What are the different solutions available?  What is Cassandra?  Use-Cases for Cassandra Cassandra Features – Tunable Consistency, P2P Architecture, Elastic Scalability, Column Orientation Data Model for Cassandra Demo Application using Cassandra   
  • 3. Twitter – Massive Scale, High Availability
  • 4. Travel Booking – Scale and Availability
  • 5. Movie Booking – Consistency and Scale
  • 6. Facebook Graph Search – Fast, Complex Querying
  • 7. Facebook Messenger – Consistency and Scale
  • 8. So, What Is Common?  Huge Data  Fast Random access  Variable Schema  Need of Compression  High Availability  Need for Consistency  Need of Distribution (Sharding)
  • 9. y Brewer’s CAP Theorem MongoDB HBase Redis RDBMS Consistenc CA CP Partition Tolerance Availability AP CouchDB Cassandra DynamoDB Riak http://www.w3resource.com/mongodb/nosql.php
  • 10. P NoSQL Landscape Big Table Clones BigTable (Google), Cassandra, HBase, Hypertable Key-Value Stores Dynamo (Amazon), Voldemort (LinkedIn), Citrusleaf, Membase, Riak, Tokyo Cabinet Document Database CouchOne, MongoDB, Terrastore, OrientDB Graph Databases FlockDB (Twitter), AllegroGraph, DEX, InfoGrid, Neo4J, Sones Performance Query and Navigational Complexity Scalability&Speed
  • 11. Cassandra Usecase – Deep Drive 5000 TPS 300 ~ 500 SQL Transaction WEB APPLICATION Caching Layer Elastic Scale 1000 TPS 100 ~ 200 SQL Transaction Applications Changing Data RDBMS 2RDBMS 1
  • 12. Using Cassandra 5000 TPS 300 ~ 500 SQL Transaction WEB APPLICATIONElastic Scale Elastic Scale CASSANDRA 1000 TPS 100 ~ 200 SQL Transaction Applications Changing Data
  • 13. Cassandra Usecase - Summary  E-Commerce (Travel Portal)  Development Approaches   Both B2B & B2C Consumers High volume of shopping transactions (> 500 Million Visits / Day) High volume supply changes (Manual & System) generated. Huge Inventory Database (Millions of hotels) High Read/Write (Thousands Reads & Writes/Second) Application has to 99.99% Available Fault Tolerant & Reliable. Fast & Quick Shopping Experience. Elastic Scale Innovative Recommendations & Algorithms. Should be fast for new changes Should be cost effective for maintenance.   Legacy Way (Pure RDBMS) Augmented (RDBMS + Caching, Heavy Database Hardware) Using Cassandra          
  • 14. What is Apache Cassandra? Apache Cassandra is an open source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, Tuneably consistent, column-oriented database. Open Source Column Oriented Decentralized Cassandra Features Tuneably Consistent Elastically Scalable Distributed Highly Scalable Fault Tolerant
  • 15. Distributed and Decentralized Post Office Post Office CCY Exchange CCY, Stationary, Letter/Couriers CCY, Stationary, Letter/Couriers CCY, Stationary, Letter/Couriers stationary Letter/Couriers Ccy Courier StationaryCcy Courier Stationary DecentralisedCentralised
  • 16. Distributed and Decentralized  Every Node Is Identical.  Peer to Peer Protocol and uses Gossip Protocol to maintain and keep the List of nodes in Sync.CCY, Stationary, Letter/Couriers CCY, Stationary, Letter/Couriers CCY, Stationary, Letter/Couriers  No Special Host to Coordinate Activities.  No Single Point of Failure.  Easier to Operate and Maintain because all nodes are same. Ccy Courier Stationary
  • 17. Elastic Scalability Types of Scalability  Vertical Scalability  Horizontal Scalability What is Elastic Scalability? This is special property of Horizontal Scalability.  The cluster can seamlessly scale up and scale back down without major disruption.
  • 18. Elastic Scalability CCY, Stationary, Letter/Couriers CCY, Stationary, Letter/Couriers CCY, Stationary, Letter/Couriers CCY, Stationary, Letter/Couriers  Cluster must accept new nodes without major disruption or reconfiguration. Process should not be restarted Do not have to change application charges Don’t have to rebalance data    ADD A NODE AND MOVE ON!! Ccy Courier Stationary
  • 19. High Availability and Fault Tolerance
        - Highly available
        - No downtime
      [Diagram: identical post-office counters, each handling currency (CCY), stationery, and letters/couriers.]
  • 20. Tunable Consistency
        - Cassandra lets us define the consistency level according to application requirements.
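As a rough illustration of what "tunable" means in slide 20, the sketch below computes how many replicas must acknowledge a request at the consistency levels ONE, QUORUM, and ALL, and checks the usual rule of thumb that reads see the latest write when read replicas + write replicas > replication factor. The helper names are hypothetical; only the three level names come from Cassandra.

```python
def quorum(replication_factor):
    """A majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level, replication_factor):
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported level in this sketch: {level}")


def is_strongly_consistent(write_level, read_level, replication_factor):
    """True when every read set must overlap every write set (R + W > RF)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


# With three replicas, QUORUM writes plus QUORUM reads overlap on at least
# one replica, while ONE plus ONE trades that guarantee for lower latency.
strong = is_strongly_consistent("QUORUM", "QUORUM", 3)
fast_but_eventual = is_strongly_consistent("ONE", "ONE", 3)
```

The application chooses the level per operation, which is how consistency can be traded against latency and availability.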
  • 21. High Performance
        - Cassandra was designed from the ground up to take full advantage of multiprocessor/multicore machines and to run across many dozens of these machines housed in multiple data centres.
        - It scales consistently and seamlessly to hundreds of terabytes.
        - It shows exceptional performance under heavy load.
        - It consistently shows very fast write throughput on a basic commodity workstation.
  • 22. Where to Use Cassandra?
      Use if your application has:
        - Big data (billions of records, rows and columns)
        - Very high-velocity random reads and writes
        - Flexible sparse / wide column requirements
        - No need for multiple secondary indexes
        - Low latency requirements
      Use cases:
        - eCommerce inventory cache
        - Time series / events
        - Feed-based activities
  • 23. Where NOT to Use Cassandra?
      Don't use if your application has:
        - Secondary indexes
        - Relational data
        - Transactional needs (rollback, commit)
        - Primary and financial records
        - Stringent security and authorization needs on data
        - Dynamic queries on columns
        - Low-latency searching of column data
  • 24. Data Model - RDBMS vs Cassandra
      In an RDBMS:
        - Define a schema.
        - Define tables with defined columns.
        - The table defines the column names and their data types.
        - Add rows conforming to that schema: each row contains the same fixed set of columns.
      In Cassandra:
        - Define keyspaces.
        - Define column families/tables.
        - Column families can define metadata about the columns.
        - Each row can have a different set of columns.
  • 25. Data Model – Column Families
      Designing column families/tables
        - Static column families
            - Use a static set of column names.
            - Similar to a relational database table.
            - Rows are not required to have all of the columns defined.
        - Dynamic column families
            - Use arbitrary column names to store data.
  • 26. Data Model - Keys
      Types of keys
        - Primary key

              create table test (
                  key text PRIMARY KEY,
                  data text
              );

        - Composite primary (or compound) key

              create table test (
                  key_part_one text,
                  key_part_two int,
                  data text,
                  PRIMARY KEY(key_part_one, key_part_two)
              );

        - In the compound key above, the first part of the key is the partition key (key_part_one) and the second part is the clustering key (key_part_two).
        - The partition key is responsible for data distribution across your nodes.
        - The clustering key is responsible for data sorting within the partition.
        - In a single-field-key table, the primary key is equivalent to the partition key.
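The division of labour between partition key and clustering key in slide 26 can be sketched in Python as follows. This is a simplified model under stated assumptions: MD5 stands in for Cassandra's real partitioner, node tokens are spaced evenly around a 64-bit ring, there is no replication, and the `Table` class and node names are hypothetical.

```python
import bisect
import hashlib
from collections import defaultdict

RING_SIZE = 2 ** 64


def token(partition_key):
    """Hash a partition key onto the ring (MD5 used only as a stand-in)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_ring(node_names):
    """Assign each node one evenly spaced token; returns sorted (token, node)."""
    step = RING_SIZE // len(node_names)
    return sorted((i * step, name) for i, name in enumerate(node_names))


def owner(ring, partition_key):
    """The first node whose token is >= the key's token, wrapping around."""
    tokens = [t for t, _ in ring]
    index = bisect.bisect_left(tokens, token(partition_key)) % len(ring)
    return ring[index][1]


class Table:
    """Rows keyed by (key_part_one, key_part_two), as in the compound-key table."""

    def __init__(self, ring):
        self.ring = ring
        # node -> partition key -> rows kept sorted by clustering key
        self.nodes = defaultdict(lambda: defaultdict(list))

    def insert(self, key_part_one, key_part_two, data):
        rows = self.nodes[owner(self.ring, key_part_one)][key_part_one]
        # Writing the same primary key again replaces the earlier row.
        rows[:] = [r for r in rows if r[0] != key_part_two]
        bisect.insort(rows, (key_part_two, data))

    def partition(self, key_part_one):
        return self.nodes[owner(self.ring, key_part_one)][key_part_one]


ring = build_ring(["node-a", "node-b", "node-c"])
table = Table(ring)
table.insert("alice", 3, "third")
table.insert("alice", 1, "first")
table.insert("alice", 2, "second")
```

All rows sharing `key_part_one` land on the same node, and within that partition they come back ordered by `key_part_two` regardless of insertion order.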
  • 27. Data Model - Columns
      Types of columns
        - Standard columns
            - A tuple containing a name, a value, and a timestamp.
        - Expiring columns
            - Carry an optional expiration time called TTL (time to live).
            - The TTL is defined in seconds.
        - Counter columns
            - Store a number that incrementally counts occurrences.
            - Example: page views.
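The three column types in slide 27 can be modelled with a short Python sketch. It assumes timestamps in whole seconds for readability and resolves conflicting writes by keeping the higher timestamp; the `Column` class and helper functions are hypothetical names, not part of any Cassandra client API.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Column:
    """A standard column: name, value, and write timestamp (seconds here)."""
    name: str
    value: Any
    timestamp: int
    ttl: Optional[int] = None  # set only for expiring columns, in seconds

    def is_expired(self, now):
        # A column without a TTL never expires.
        return self.ttl is not None and now >= self.timestamp + self.ttl


def reconcile(a, b):
    """Pick the surviving version of a column: the later write wins."""
    return a if a.timestamp >= b.timestamp else b


# Counter column sketch: a number that only moves by increments.
page_views = Counter()


def record_view(page):
    page_views[page] += 1


session = Column("session_token", "abc123", timestamp=100, ttl=60)
old_email = Column("email", "old@example.com", timestamp=100)
new_email = Column("email", "new@example.com", timestamp=250)
record_view("/home")
record_view("/home")
```

The timestamp is what lets replicas agree on the latest value without coordination, and the TTL lets data such as sessions disappear without an explicit delete.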
  • 28. Data Model - Columns
      Types of columns
        - Composite columns

              CREATE TABLE timeline (
                  user_id varchar,
                  tweet_id uuid,
                  author varchar,
                  body varchar,
                  PRIMARY KEY (user_id, tweet_id)
              );
  • 29. Data Model – Data Types
      Data types (comparators and validators)
        - The data type for a column (or row key) value is called a validator.
        - The data type for a column name is called a comparator.
        - Data types can be defined when the column family schema is created, but this is not required.
        - Internally, Cassandra stores column names and values as hex byte arrays (BytesType).
  • 30. Data Model - Indexes
      Types of indexes
        - Primary indexes
            - The primary index for a column family is the index of its row keys.
            - Each node maintains this index for the data it manages.
        - Secondary indexes
            - Indexes on column values.
            - Implemented as a hidden table, separate from the main table.
            - Do not use secondary indexes to query a huge volume of records for a small number of results.
            - In that case it is more efficient to manually maintain a lookup column family instead of using a secondary index.
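The last point on slide 30, maintaining a lookup column family by hand, might look like the following Python sketch. Plain dictionaries stand in for the two column families, and the names `users`, `users_by_email`, `save_user`, and `find_by_email` are hypothetical.

```python
# Main column family: row key (user id) -> columns.
users = {}
# Manually maintained lookup column family: email -> user id.
users_by_email = {}


def save_user(user_id, name, email):
    """Write the user row and keep the lookup row in step with it."""
    previous = users.get(user_id)
    if previous is not None and previous["email"] != email:
        # The email changed, so drop the stale lookup entry.
        users_by_email.pop(previous["email"], None)
    users[user_id] = {"name": name, "email": email}
    users_by_email[email] = user_id


def find_by_email(email):
    """Two key lookups instead of scanning a secondary index."""
    user_id = users_by_email.get(email)
    return None if user_id is None else users[user_id]


save_user("u1", "Asha", "asha@example.com")
save_user("u2", "Ravi", "ravi@example.com")
save_user("u1", "Asha", "asha@work.example.com")  # email update
```

The application pays for the extra write, but each query becomes a direct read by row key, which is the access pattern Cassandra handles best.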
  • 31. Cassandra – Writes/Reads
      Writes in Cassandra
        - Writes are first recorded in a commit log (for durability) and then in an in-memory table structure called a memtable.
        - There is very little disk I/O at the time of the write.
        - Writes are batched in memory and periodically written to disk in a persistent table structure called an SSTable (sorted string table).
        - Memtables and SSTables are maintained per column family.
        - Memtables are organized in sorted order by row key and flushed to SSTables sequentially.
      Reads in Cassandra
        - At read time, a row must be combined from all SSTables on disk (as well as unflushed memtables) to produce the requested data.
        - Each SSTable has an associated Bloom filter that checks whether a requested row key exists in the SSTable before any disk seeks are done.
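The write and read paths on slide 31 can be walked through with a small in-memory Python model. It is a teaching sketch under several assumptions: lists and dictionaries stand in for disk files, newer SSTables simply override older ones instead of comparing per-column timestamps, and the Bloom filter sizes and all class names are made up for the example.

```python
import hashlib


class BloomFilter:
    """Answers 'definitely absent' or 'possibly present' for a row key."""

    def __init__(self, size=256, hashes=3):
        self.size, self.hashes, self.bits = size, hashes, 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:4], "big") % self.size

    def add(self, key):
        for position in self._positions(key):
            self.bits |= 1 << position

    def might_contain(self, key):
        return all((self.bits >> p) & 1 for p in self._positions(key))


class SSTable:
    """An immutable snapshot of a memtable, sorted by row key."""

    def __init__(self, rows):
        self.rows = dict(sorted(rows.items(), key=lambda item: item[0]))
        self.bloom = BloomFilter()
        for row_key in self.rows:
            self.bloom.add(row_key)


class Store:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of writes
        self.memtable = {}     # row key -> columns, held in memory
        self.sstables = []     # oldest first
        self.memtable_limit = memtable_limit

    def write(self, row_key, columns):
        self.commit_log.append((row_key, dict(columns)))  # durability first
        self.memtable.setdefault(row_key, {}).update(columns)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        self.sstables.append(SSTable(self.memtable))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs replaying

    def read(self, row_key):
        row = {}
        for sstable in self.sstables:
            # Skip SSTables whose Bloom filter rules the key out.
            if sstable.bloom.might_contain(row_key):
                row.update(sstable.rows.get(row_key, {}))
        row.update(self.memtable.get(row_key, {}))  # unflushed data is newest
        return row


store = Store(memtable_limit=2)
store.write("u1", {"name": "Asha"})
store.write("u2", {"name": "Ravi"})   # second row triggers a flush
store.write("u1", {"city": "Pune"})   # stays in the memtable for now
```

Reading `u1` has to merge the flushed SSTable with the live memtable, which is why the slide says a row is combined from several sources at read time.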
  • 32. Application Demo
        - Cassandra installation and configuration
            - conf/cassandra.yaml
            - Tools
        - Keyspace setup
        - Column family / data model setup
            - Key
            - Columns and data types
            - Indexes (primary and secondary)
            - Programmatic consistency
        - Thrift
        - Hector API
        - CQL3 API