Ministry of Education and Training
University of Science, Ho Chi Minh City
Faculty of Information Technology
NoSQL
Column-Family Stores
Advisor: Dr. Nguyễn Trần Minh Thư
Group 07:
1. 19C11015 - Đỗ Huy Gia Cát
2. 21C12003 - Đào Thanh Danh
3. 21C11026 - Nguyễn Thành Thái
Report for the course Advanced Database Systems
NoSQL - Not Only SQL
CONTENTS
• Column-Family Stores NoSQL
o Overview
o Column-Family Databases
• Cassandra's Structure and Features
• Comparing Column-Family Data Stores with Others
• Query Features
• Extended Analysis
• Scaling
• Comparing Cassandra and HBase
• Suitable Use Cases
Introduction
Wide Column / Column Family Database
• Column-family stores are databases that store data as key-value
mappings, with the values grouped into multiple column families, each
of which is a map of data
• Terminology comparison between RDBMS and Cassandra:
RDBMS | Cassandra
Database instance | Cluster
Database | Keyspace
Table | Column family
Row | Row
Column (same for all rows) | Column (can differ per row)
Column Family Database
Column Family Unit Structure Storage
• Column: the basic storage unit, consisting of a name-value pair in
which the name also acts as the key, stored together with a timestamp
• Super column: a column whose value is a map of columns
Column Family Unit Structure Storage
• Standard column family: a column family in which all columns are
simple columns
• Super column family: a column family that contains at least one
super column
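The column and super column definitions above can be sketched in Python. This is only an illustrative model, not Cassandra's storage code: the class names `Column` and `SuperColumn`, the helper `write`, and the integer timestamp unit are assumptions made for the example. It also shows why the timestamp is stored: when the same column is written twice, the newest version wins (ties are resolved here by simply accepting the later call, which is a simplification).

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Column:
    """A simple column: a name (the key), a value, and a write timestamp."""
    name: str
    value: str
    timestamp: int  # assumed unit for this sketch: any monotonically growing integer


@dataclass
class SuperColumn:
    """A super column: its value is a map of simple columns."""
    name: str
    columns: Dict[str, Column] = field(default_factory=dict)


def write(row: Dict[str, Column], col: Column) -> None:
    """Store the column unless the row already holds a newer version of it."""
    current = row.get(col.name)
    if current is None or col.timestamp >= current.timestamp:
        row[col.name] = col


row: Dict[str, Column] = {}
write(row, Column("city", "Hue", timestamp=100))
write(row, Column("city", "Da Nang", timestamp=200))
write(row, Column("city", "stale", timestamp=150))  # older than the stored version, ignored
print(row["city"].value)  # Da Nang
```

A column family whose rows hold only `Column` values corresponds to a standard column family; one whose rows hold at least one `SuperColumn` corresponds to a super column family.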
Cassandra's Features
• Consistency
• Transactions
• Availability
• Scaling
Consistency
• Cassandra stores replicas on multiple nodes to ensure reliability
• Commonly used consistency levels (Cassandra defines others as well):
ONE: only one replica needs to respond to the request; suited to
workloads that need high write performance
QUORUM: a majority of the replicas must respond to the request
ALL: all replicas must respond to the request
• If a node is down, the data it missed is delivered when it comes
back, via hints (hinted handoff) or the repair command
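As a rough illustration of the three levels, the number of replicas that must respond can be computed from the replication factor. The function name `required_responses` is hypothetical; the quorum rule used is the usual majority, floor(N / 2) + 1.

```python
def required_responses(level: str, replication_factor: int) -> int:
    """Return how many replicas must respond for the given consistency level."""
    level = level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level in this sketch: {level}")


for level in ("ONE", "QUORUM", "ALL"):
    print(level, required_responses(level, replication_factor=3))
# ONE 1
# QUORUM 2
# ALL 3
```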
Transactions
• Cassandra has no transactions in the traditional sense; a write is
atomic and isolated at the row level
• Atomic: inserting or updating columns in a row is treated as a single
write operation that either succeeds or fails as a whole
• Isolated: a write to a row is isolated to the client performing it
and is not visible to other users until it completes
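Availability
• Availability is governed by the formula
(R + W) > N
R, W: minimum number of nodes that must respond successfully to a
read/write request; N: number of replicas of the data
• When R + W > N, every read overlaps the latest successful write;
lowering R or W relaxes consistency in exchange for higher availability
• Keyspaces should be set up according to your needs: higher
availability for reads or for writes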
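The trade-off behind the formula can be made concrete with a little arithmetic. The sketch below assumes hypothetical helper names (`reads_see_latest_write`, `tolerated_failures`); it only restates R + W > N and counts how many replicas may be down before a request of a given size can no longer be served.

```python
def reads_see_latest_write(r: int, w: int, n: int) -> bool:
    """True when the read set and the write set must share at least one replica."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return r + w > n


def tolerated_failures(required: int, n: int) -> int:
    """Replicas that may be unavailable while `required` responses are still possible."""
    return n - required


n = 3
for r, w in [(1, 1), (2, 2), (1, 3)]:
    print(
        f"R={r} W={w} N={n}: overlap={reads_see_latest_write(r, w, n)}, "
        f"reads tolerate {tolerated_failures(r, n)} down, "
        f"writes tolerate {tolerated_failures(w, n)} down"
    )
```

With N = 3, choosing R = 1 and W = 3 favors read availability, while R = W = 2 balances the two; R = W = 1 maximizes availability but no longer guarantees that a read sees the latest write.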
Scaling
• Cassandra scales by adding nodes to the cluster
• Clusters can be scaled on the fly without interrupting operations
=> maximum uptime
Cassandra and HBase: shared characteristics
Database | Open-source NoSQL, column family; stores non-relational data in the column-family model
Scalability | Scalable by adding nodes
Replication | Data replicated across multiple nodes
Cassandra and HBase: differences
Criterion | Cassandra | HBase
Infrastructure | Independent design; can integrate with other DBMSs and with Storm and Hadoop | Based on Hadoop; can integrate with ZooKeeper; HBase master, HBase data node, name node
Ordered partitioning | Supported | Not supported
Node | Multiple seed nodes in the cluster | A master node monitors and coordinates the nodes
Query language | Cassandra Query Language (CQL); Cassandra Query Language Shell (CQLSH) | HBase shell only
Basic Queries
Cassandra:
• Cassandra Query Language (CQL)
• Cassandra Query Language Shell (CQLSH)
HBase:
• Supports only the HBase shell
• Apache Phoenix -> query engine
https://data-flair.training/blogs/hbase-shell-commands/
Cassandra Query Language
• CREATE KEYSPACE <identifier> WITH <properties>
• CREATE KEYSPACE videodb WITH REPLICATION =
{ 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
• Replication strategies: SimpleStrategy, NetworkTopologyStrategy
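How SimpleStrategy picks replicas can be sketched as a walk around a token ring: hash the partition key to a token, find the first node at or after that token, then continue clockwise until `replication_factor` nodes are collected. The sketch below is a simplified model under stated assumptions: the `Ring` class and `token` function are hypothetical, each node gets a single token derived from its name, and MD5 stands in for the partitioner's hash.

```python
import bisect
import hashlib
from typing import List


def token(key: str) -> int:
    """Map a key onto the ring. MD5 is an assumption made for this sketch."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """A token ring with one token per node (no virtual nodes)."""

    def __init__(self, nodes: List[str]) -> None:
        self._ring = sorted((token(node), node) for node in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, partition_key: str, replication_factor: int) -> List[str]:
        """Return the nodes holding the key: the owner, then clockwise neighbors."""
        if replication_factor > len(self._ring):
            raise ValueError("replication factor exceeds the number of nodes")
        # First node whose token is >= the key's token, wrapping past the end.
        start = bisect.bisect_left(self._tokens, token(partition_key)) % len(self._ring)
        return [
            self._ring[(start + i) % len(self._ring)][1]
            for i in range(replication_factor)
        ]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("video-42", replication_factor=3))
```

NetworkTopologyStrategy follows the same idea but places replicas per data center, which this single-ring sketch does not model.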
Cassandra Query Language
USE videodb;
-- Video metadata, one row per video
CREATE TABLE videos (
    videoid uuid,
    videoname varchar,
    username varchar,
    description varchar,
    location map<varchar,varchar>,
    tags set<varchar>,
    upload_date timestamp,
    PRIMARY KEY (videoid)
);
-- Rating counters; every non-key column in a counter table is a counter
CREATE TABLE video_rating (
    videoid uuid,
    rating_counter counter,
    rating_total counter,
    PRIMARY KEY (videoid)
);
-- Playback events; partitioned by (videoid, username) and ordered
-- within a partition by event_timestamp, then event
CREATE TABLE video_event (
    videoid uuid,
    username varchar,
    event varchar,
    event_timestamp timeuuid,
    video_timestamp bigint,
    PRIMARY KEY ((videoid, username), event_timestamp, event)
) WITH CLUSTERING ORDER BY (event_timestamp DESC, event ASC);
General syntax:
CREATE (TABLE | COLUMNFAMILY) <tablename>
(<column-definition>, <column-definition>, ...)
(WITH <option> AND <option>)
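The primary key of `video_event` has two parts: the partition key `(videoid, username)` decides which partition a row belongs to, and the clustering columns `event_timestamp, event` decide the order of rows inside that partition. A small in-memory model can make this visible. It is only a sketch: the class `VideoEventTable` is hypothetical, and a plain integer stands in for the `timeuuid` column.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (event_timestamp, event, video_timestamp); an int replaces timeuuid in this sketch
Event = Tuple[int, str, int]
PartitionKey = Tuple[str, str]  # (videoid, username)


class VideoEventTable:
    """Rows grouped by partition key and kept in clustering order."""

    def __init__(self) -> None:
        self._partitions: Dict[PartitionKey, List[Event]] = defaultdict(list)

    def insert(self, videoid: str, username: str, event_timestamp: int,
               event: str, video_timestamp: int) -> None:
        rows = self._partitions[(videoid, username)]
        # A row with the same full primary key is overwritten (upsert).
        rows[:] = [r for r in rows if (r[0], r[1]) != (event_timestamp, event)]
        rows.append((event_timestamp, event, video_timestamp))
        # CLUSTERING ORDER BY (event_timestamp DESC, event ASC)
        rows.sort(key=lambda r: (-r[0], r[1]))

    def select(self, videoid: str, username: str) -> List[Event]:
        """Read one partition; rows come back already sorted."""
        return list(self._partitions.get((videoid, username), []))


table = VideoEventTable()
table.insert("v1", "alice", 10, "start", 0)
table.insert("v1", "alice", 20, "stop", 95)
table.insert("v1", "alice", 20, "pause", 60)
table.insert("v1", "bob", 15, "start", 0)
print(table.select("v1", "alice"))
# [(20, 'pause', 60), (20, 'stop', 95), (10, 'start', 0)]
```

Reading the newest events first for one viewer of one video touches a single partition, which is the access pattern this key design favors.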
Cassandra Query Language
• Built-in data types: boolean, int, bigint, varint, float, double, decimal,
ascii, varchar, text, timestamp, blob, inet, timeuuid, uuid, ...
• Collection data types: LIST, SET, MAP
• User-defined data types
Cassandra Query Language
• User-Defined Data Type
CREATE TYPE <keyspace>.<type name>
(<field name> <data type>, <field name> <data type>, ...)
CREATE TYPE records (
    name text,
    branch text,
    phone int,
    city text,
    id set<int>
);
Cassandra Query Language
SELECT Clause, WHERE Clause & ORDER BY
INSERT INTO <table name>
(<field name 1>, <field name 2>, <field name 3>, ...)
VALUES ('value 1', 'value 2', 'value 3', ...)
USING <update parameter>;
UPDATE <table name> USING <update parameter>
SET <field name 1> = <value 1>,
<field name 2> = <value 2>,
<field name 3> = <value 3>, ...
WHERE <field> = <value>;
Cassandra Query Language
DELETE FROM <table name>
USING <update parameter>
WHERE <field> = <value>;
BEGIN BATCH
// data manipulation statements: INSERT, UPDATE, DELETE
APPLY BATCH;
Cassandra Query Language
• Advanced Queries and Indexing
• CREATE INDEX <index name> ON <table name> (<column name>);
Secondary indexes perform well for low-cardinality column values.
• USE, CREATE, ALTER, DROP, TRUNCATE, ...
Write path (storing data)
• In memory: memtable
• On disk: SSTable
Read path (retrieving data)
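The interplay of memtable and SSTable can be sketched as follows: writes land in an in-memory table, a full memtable is flushed to disk as an immutable sorted table, and reads check the memtable first and then the SSTables from newest to oldest. This is a deliberately reduced model with a hypothetical `TinyStore` class; the commit log, timestamps, bloom filters, and compaction that a real node uses are left out.

```python
from typing import Dict, List, Optional, Tuple

SSTable = Tuple[Tuple[str, str], ...]  # immutable, sorted by key


class TinyStore:
    """Memtable plus a list of flushed, immutable SSTables (oldest first)."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable_limit = memtable_limit
        self.memtable: Dict[str, str] = {}
        self.sstables: List[SSTable] = []

    def write(self, key: str, value: str) -> None:
        """Writes only touch memory; a full memtable triggers a flush."""
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Persist the memtable as a sorted, immutable table and start a new one."""
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def read(self, key: str) -> Optional[str]:
        """Newest data wins: memtable first, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None


store = TinyStore(memtable_limit=2)
store.write("a", "1")
store.write("b", "2")   # memtable is full, flushed to the first SSTable
store.write("a", "3")   # newer value for "a" lives in the memtable
print(store.read("a"), store.read("b"), store.read("z"))  # 3 2 None
```

Because SSTables are never modified in place, an update is just a newer entry that shadows the older one at read time.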
Suitable Use Cases
• Event logging: a good fit for storing event information, such as application
state or errors encountered by the application
• Content management systems and blogging platforms
=> store blog entries with tags, categories, links, and trackbacks
• Counters: count and categorize page visitors to calculate analytics
• Expiring data: data that is valid only for a specific time, such as ad
banners on a website
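The time-limited data case (an ad banner shown only for a while) relies on columns that carry a time-to-live. A minimal sketch of that rule, assuming a hypothetical `TimedColumn` class and timestamps in seconds: a column without a TTL never expires, and one with a TTL stops being visible once `written_at + ttl` is reached.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TimedColumn:
    """A column value with an optional time-to-live, both in seconds."""
    name: str
    value: str
    written_at: int
    ttl: Optional[int] = None  # None means the column never expires

    def is_live(self, now: int) -> bool:
        """A column is readable until its TTL has fully elapsed."""
        return self.ttl is None or now < self.written_at + self.ttl


banner = TimedColumn("banner", "spring-sale", written_at=1_000, ttl=86_400)  # one day
print(banner.is_live(now=1_000 + 86_399))  # True
print(banner.is_live(now=1_000 + 86_400))  # False
```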
When Not to Use
• Systems that require ACID transactions for writes and reads
• Workloads that need the database to aggregate data with queries (such
as SUM or AVG)
• Early product prototypes or initial tech spikes
Conclusion


Editor's Notes

  1. Distributed and non-relational. https://www.scnsoft.com/blog/cassandra-vs-hbase
  2. The first release of Google's Bigtable in 2005 laid the foundation. Apache Cassandra is open source: it was developed at Facebook, open-sourced in 2008, and continued under Apache from 2009. HBase is modeled after Google's Bigtable and written in Java.
  3. Column-family stores allow you to store data with keys mapped to values and the values grouped into multiple column families. Each column is a tuple (triplet) consisting of a column name, a value, and a timestamp. In a relational database table, this data would be grouped together within a table with other non-related data.
  4. A combination of the key-value and table forms.
  5. With the consistent hashing algorithm, each node is assigned a token, and data is distributed to the nodes based on these tokens. The gossip protocol is the communication protocol between the nodes in a cluster; it is essentially peer-to-peer communication.
  6. Peer-to-peer architecture: all server nodes in the system have the same role and there is no master. This avoids the failure mode of a traditional master-slave setup, where the whole system goes down when the master goes down.
  7. https://www.techtarget.com/searchdatamanagement/tip/NoSQL-database-types-explained-Key-value-store
  8. Choosing_the_right_NoSQL_database_for_the_job_a_qu.pdf
  9. SimpleStrategy -> replicas are placed in clockwise direction around the node ring. NetworkTopologyStrategy -> replicates across multiple data centers in a cluster.
  10. blob: byte data. ascii: ASCII string; text: UTF-8 string. bigint: 64-bit signed integer, -2^63 to 2^63 - 1. counter: 64-bit integer; tables with counter columns have no INSERT, only UPDATE, which is used to increment or decrement the value; counters cannot be indexed. inet: represents an IPv4 or IPv6 address string. timestamp: yyyy-mm-dd HH:mm or yyyy-mm-dd HH:mm:ss. SET: sorted.
  11. USING <update parameter> is optional, for example USING TTL 86400. INSERT and UPDATE are keyed on the row key, which is similar to a primary key, but inserting a duplicate key does not raise an error; instead the versions are distinguished by timestamp, and the most recent one wins.
  12. USING <update parameter> is optional, for example USING TTL 86400.
  13. Thousands of companies have adopted Cassandra, including Apple, Instagram, Uber, Spotify, Twitter, Cisco, Rackspace, eBay, and Netflix.
  14. Comments can be either stored in the same row or moved to a different keyspace; similarly, blog users and the actual blogs can be put into different column families.
  15. In Cassandra, the cost of changing queries may be higher than the cost of changing the schema. Cassandra offers little support for computation on the storage side: joins are not supported, and aggregation (sum, group, max, min) is limited.