Cassandra database design best practices

Sandeep Sharma IIMK Smart City,IoT,Bigdata,Cloud,BI,DW

Important considerations for Cassandra NoSQL data design

Key points to look at while designing:
1. Distributed, shared-nothing storage with no single point of failure (SPOF)
2. Linear scalability
3. ACID vs. the CAP theorem
4. Database writes are faster than reads
5. No sequences, joins, or aggregation: mitigate with Hive, or with Spark for speed
6. Polyglot persistence

• Expiring columns with TTL
• Design the row key for fast lookups and use wide rows for efficiency
• Row-level isolation (row key, wide rows)
• Atomic batches (use timestamps to preserve write order across SSTables)
• Log-structured storage with regular compaction: commit log (durability) → memtable (in memory; flushed when full) → SSTable (compaction)
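
The write path in the last bullet can be sketched as a toy in-memory store. This is only an illustration of the commit log → memtable → SSTable flow, not Cassandra's implementation; the class name `ToyStore` and the `memtable_limit` parameter are hypothetical.

```python
class ToyStore:
    """Toy log-structured store: commit log -> memtable -> SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only list standing in for the durable log
        self.memtable = {}     # in-memory table: key -> (timestamp, value)
        self.sstables = []     # flushed, immutable tables, oldest first
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def write(self, key, value, timestamp):
        # 1. Append to the commit log first (durability).
        self.commit_log.append((key, value, timestamp))
        # 2. Update the memtable; the newest timestamp wins.
        current = self.memtable.get(key)
        if current is None or timestamp >= current[0]:
            self.memtable[key] = (timestamp, value)
        # 3. Flush to a new SSTable when the memtable is full.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def read(self, key):
        # Collect every version of the key and return the newest by timestamp.
        versions = [table[key] for table in self.sstables if key in table]
        if key in self.memtable:
            versions.append(self.memtable[key])
        if not versions:
            return None
        return max(versions, key=lambda version: version[0])[1]

    def compact(self):
        # Merge all SSTables into one, keeping the newest version of each key.
        merged = {}
        for table in self.sstables:
            for key, (ts, value) in table.items():
                if key not in merged or ts >= merged[key][0]:
                    merged[key] = (ts, value)
        self.sstables = [merged] if merged else []
```

Reads have to reconcile versions spread over several SSTables, which is one reason writes tend to be cheaper than reads in this design; compaction reduces the number of tables a read must consult.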

• Tunable data consistency via the consistency level {ANY | ONE | QUORUM | LOCAL_QUORUM | EACH_QUORUM}; quorum = floor(replication_factor / 2) + 1
• Super column limitation: all sub-columns must be deserialized for processing
• Counter columns: incrementally count occurrences of particular events
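
A quorum is a majority of the replicas, that is floor(replication_factor / 2) + 1. The short sketch below computes it for a few replication factors; the function names are illustrative and not part of any driver API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM operation."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (1, 3, 4, 5):
        print(rf, quorum(rf), tolerated_failures(rf))
```

Note that an even replication factor tolerates no more failures than the odd one below it: a replication factor of 4 needs 3 replicas for a quorum and so tolerates one failure, the same as a replication factor of 3.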

• Study the queries and denormalize by access pattern; identify repeating columns as keys (depends on requirements)
• One-to-many relationship modelling, e.g.
RDBMS: User table → Video table
C*: break into 2 column families (CF)
CF1: Video (key: videoid UUID)
CF2: Username_video_index (key: (username, videoid))
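
The two column families can be mimicked with plain dictionaries to show how the denormalized index serves the "videos for a user" query without a join. This is a sketch only; the `video_name` column and the helper functions are assumptions added for illustration.

```python
import uuid
from collections import defaultdict

# CF1: Video, keyed by videoid (UUID).
videos = {}
# CF2: Username_video_index, keyed by (username, videoid).
# Modelled as username -> {videoid: video_name}, i.e. one wide row per user.
username_video_index = defaultdict(dict)


def add_video(username, video_name):
    """Write the video once to each column family (denormalized write)."""
    videoid = uuid.uuid4()
    videos[videoid] = {"username": username, "video_name": video_name}
    username_video_index[username][videoid] = video_name
    return videoid


def videos_for_user(username):
    """Answer the one-to-many query from the index row alone."""
    return sorted(username_video_index.get(username, {}).values())
```

The cost of this layout is that every new video is written twice, which fits a store where writes are cheap relative to reads.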

Many-to-many relationship modelling
Example:
Tables: Student m:n Teacher
CF1: Student_with_Teacherid
CF2: Teacher_with_Studentid
The model depends on the query pattern
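
A minimal sketch of the many-to-many layout, assuming each column family is one wide row per key: the relationship is stored twice so that either direction can be read with a single key lookup. The `enroll`, `teachers_of`, and `students_of` helpers are hypothetical names.

```python
from collections import defaultdict

# CF1: Student_with_Teacherid, one row per student holding teacher ids.
student_with_teacherid = defaultdict(set)
# CF2: Teacher_with_Studentid, one row per teacher holding student ids.
teacher_with_studentid = defaultdict(set)


def enroll(student_id, teacher_id):
    """Record the relationship in both column families."""
    student_with_teacherid[student_id].add(teacher_id)
    teacher_with_studentid[teacher_id].add(student_id)


def teachers_of(student_id):
    """Query pattern 1: which teachers does this student have?"""
    return sorted(student_with_teacherid.get(student_id, ()))


def students_of(teacher_id):
    """Query pattern 2: which students does this teacher have?"""
    return sorted(teacher_with_studentid.get(teacher_id, ()))
```

If only one of the two queries is ever issued, the other column family can be dropped, which is the sense in which the model depends on the query pattern.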

• Watch the serialization cost of collections (set | map | list) and the 64K limit.
• Time series: CF ordered by timestamp plus access pattern.
• Partial word indexes: create partition + partial indexes
• Bitmap index example: Vehicle {make, model, color, vech_id, lot_id}; primary key {make, model, color, vech_id}; row key {make, model, color}
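
The vehicle example can be sketched as follows, assuming the row key {make, model, color} selects a partition and vech_id orders the entries inside it. The sample data and helper names are made up for illustration.

```python
from collections import defaultdict

# Row key (make, model, color) -> {vech_id: lot_id}
# Together with vech_id this forms the primary key {make, model, color, vech_id}.
partitions = defaultdict(dict)


def insert_vehicle(make, model, color, vech_id, lot_id):
    """Store one vehicle under its row key."""
    partitions[(make, model, color)][vech_id] = lot_id


def find_vehicles(make, model, color):
    """Return (vech_id, lot_id) pairs for one row key, ordered by vech_id."""
    rows = partitions.get((make, model, color), {})
    return sorted(rows.items())
```

A query such as "all red Ford Focus vehicles" then reads exactly one row; a query on color alone would need a different row key.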

• Index interval: a smaller interval gives faster seek times (at the cost of more memory); keep the index_interval property at an optimum value
• Immediate consistency: nodes_written + nodes_read > replication_factor
• Do not use indexes on high-cardinality columns, counter columns, or frequently updated columns.
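
The immediate-consistency rule can be checked with a one-line function: when the replicas written plus the replicas read exceed the replication factor, every read set overlaps the most recent write set. The function name is illustrative.

```python
def is_immediately_consistent(nodes_written: int, nodes_read: int,
                              replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap."""
    return nodes_written + nodes_read > replication_factor


if __name__ == "__main__":
    rf = 3
    # QUORUM writes and QUORUM reads: 2 + 2 > 3
    print(is_immediately_consistent(2, 2, rf))
    # ONE write and ONE read: 1 + 1 is not greater than 3
    print(is_immediately_consistent(1, 1, rf))
    # Write to ALL, read ONE: 3 + 1 > 3
    print(is_immediately_consistent(3, 1, rf))
```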
