NoSQL with Cassandra

Cassandra training presentation for R&D. Training objectives:
http://www.datastax.com/what-we-offer/products-services/training/objectives-developer
http://www.datastax.com/what-we-offer/products-services/training/objectives-administrator

NoSQL with Cassandra
and a little bit of MongoDB
Marek Koniew

Cassandra basics
SEDA architecture
Key ranges
Replication Factor
Gossip
Bootstrap & Repair
CQL: DDL & DML
54:00
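
The key-range and replication-factor bullets can be illustrated with a toy ring. This is a minimal sketch, not Cassandra's actual partitioner: it assumes MD5-based tokens, one token per node, and a SimpleStrategy-like placement (the owner plus the next nodes clockwise). The node names and the `Ring` class are hypothetical.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key to a position on the ring (assumed MD5-based token)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest(), "big")


class Ring:
    def __init__(self, nodes, replication_factor):
        self.rf = replication_factor
        # Each node owns the key range that ends at its own token.
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, key: str):
        """Return the owner of `key` plus the next RF-1 nodes clockwise."""
        start = bisect.bisect_left(self.tokens, token(key)) % len(self.entries)
        count = min(self.rf, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
print(ring.replicas("user:42"))
```

Any node can compute the same replica list from the key alone, which is what lets every node act as a coordinator.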

Data Center aware & Rack aware
Data Center aware
Rack Aware
Different topology strategies
48:00

Read / Write consistency
ANY (hinted handoff)
ONE
TWO
QUORUM
LOCAL_QUORUM
EACH_QUORUM
ALL
SERIAL (lightweight transactions)
39:00
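
How the levels above combine can be checked with a little arithmetic: a read is guaranteed to see the latest acknowledged write when the read and write replica counts add up to more than the replication factor (R + W > RF). The sketch below assumes a single data center and covers only a few levels; the function names are made up for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for QUORUM: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def required_acks(level: str, replication_factor: int) -> int:
    """Replica acknowledgements needed for a few single-data-center levels."""
    table = {
        "ONE": 1,
        "TWO": 2,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return table[level]


def read_sees_latest_write(write_level: str, read_level: str, replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > RF)."""
    w = required_acks(write_level, replication_factor)
    r = required_acks(read_level, replication_factor)
    return r + w > replication_factor


for w_level, r_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
    print(w_level, r_level, read_sees_latest_write(w_level, r_level, 3))
```

With a replication factor of 3, ONE/ONE gives no overlap guarantee, while QUORUM/QUORUM and ONE/ALL do.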

Cassandra blazing fast writes
No read before write
Commit Log
Memtable
SSTable (immutable)
Flush
42:00
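
The write path listed above can be sketched as a toy model. This is an illustration under simplifying assumptions, not Cassandra's implementation: everything lives in memory, a logical counter stands in for write timestamps, and the flush threshold counts keys rather than bytes. The `ToyWritePath` class is hypothetical.

```python
class ToyWritePath:
    def __init__(self, flush_threshold=3):
        self.commit_log = []   # append-only log, replayed after a crash
        self.memtable = {}     # in-memory table: key -> (value, timestamp)
        self.sstables = []     # immutable, key-sorted snapshots
        self.flush_threshold = flush_threshold
        self.clock = 0         # logical clock standing in for timestamps

    def write(self, key, value):
        # No read before write: existing data is never inspected.
        self.clock += 1
        self.commit_log.append((key, value, self.clock))
        self.memtable[key] = (value, self.clock)
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Freeze the memtable into an immutable, sorted SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # Flushed data no longer needs the log for recovery.
        self.commit_log.clear()


db = ToyWritePath(flush_threshold=2)
db.write("k1", "v1")
db.write("k2", "v2")  # the second distinct key triggers a flush
print(db.memtable, db.sstables)
```

A write costs one sequential append plus one in-memory update, which is why the slide calls writes fast.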

Write inconsistency
An odd number of nodes is not required
Cassandra counters – not working
Schema updates
30:00

Expensive Read
Caches (Row, Key)
Merging SSTables
Bloom Filter
Compaction
Read repair
36:00
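
Why reads cost more than writes can be seen in a small sketch: a read may have to collect versions of a row from the memtable and several SSTables and keep the newest one, while compaction does the same merge ahead of time. The sketch assumes each table is a dict of key -> (value, timestamp); caches, Bloom filters and tombstones are left out, and the sample data is invented.

```python
def read(key, memtable, sstables):
    """Return the newest value for `key` across the memtable and all SSTables."""
    candidates = []
    if key in memtable:
        candidates.append(memtable[key])
    for sstable in sstables:
        if key in sstable:
            candidates.append(sstable[key])
    if not candidates:
        return None
    # The version with the highest timestamp wins.
    value, _timestamp = max(candidates, key=lambda versioned: versioned[1])
    return value


def compact(sstables):
    """Merge several SSTables into one, keeping the newest version of each key."""
    merged = {}
    for sstable in sstables:
        for key, (value, ts) in sstable.items():
            if key not in merged or ts > merged[key][1]:
                merged[key] = (value, ts)
    return merged


old = {"sting": ("Roxanne", 1), "u2": ("One", 2)}
new = {"sting": ("Fields of Gold", 5)}
memtable = {"u2": ("Beautiful Day", 7)}
print(read("sting", memtable, [old, new]))
print(compact([old, new]))
```

After compaction a read touches one table instead of two, which is the point of merging SSTables in the background.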

Bloom Filter
Creating the filter: logical OR
Checking the filter: logical AND
False positives are possible
False negatives are NOT possible
33:00
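
The OR/AND description above maps directly onto a few lines of code. This is a minimal sketch, not the filter Cassandra ships: the bit array is a single Python integer, and the size, the number of hashes and the SHA-256-based hashing are arbitrary choices made for the example.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # the bit array, stored in one integer

    def _mask(self, item: str) -> int:
        """Bits that `item` maps to, one per hash function."""
        mask = 0
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            position = int.from_bytes(digest[:8], "big") % self.size
            mask |= 1 << position
        return mask

    def add(self, item: str) -> None:
        # Creating the filter: OR the item's bits into the array.
        self.bits |= self._mask(item)

    def might_contain(self, item: str) -> bool:
        # Checking the filter: AND with the item's bits; all must be set.
        mask = self._mask(item)
        return (self.bits & mask) == mask


bf = BloomFilter()
bf.add("row-1")
print(bf.might_contain("row-1"))
print(bf.might_contain("row-2"))
```

An added item always has all of its bits set, so a negative answer is definitive; a positive answer may come from bits set by other items, which is the false-positive case.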

Atomic Operations
Cassandra "Lightweight Transactions" in 2.0 using Paxos
UPDATE users
SET reset_token = null, password = 'newpassword'
IF reset_token = 'some-generated-reset-token'
MongoDB atomic operations
db.runCommand(
  {
    findAndModify: "people",
    query: { name: "Tom", state: "active", rating: { $gt: 10 } },
    sort: { rating: 1 },
    update: { $inc: { score: 1 } }
  }
)
27:00

Tunable Consistency
Cassandra AP
Cassandra consistency level ALL = AC
MongoDB CP
24:00

Scalability
Cassandra – one additional server adds performance to the whole ring
MongoDB – one more server in a replica set increases read performance
Adding a shard requires adding a whole replica set
21:00

Indexing
Cassandra secondary indexes
MongoDB secondary, geospatial, unique
Every node holds its own part of the index
Does not scale!
18:00

Schema denormalization
Create additional table and duplicate data
Use instead of indexes and joins
select * from audiofile where id = 1
select * from audiofile where artist = 'Sting'
15:00
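
The idea of duplicating data instead of indexing it can be sketched with two in-memory "tables", one per query. This is an illustration only: the dicts stand in for Cassandra tables, and the table names, the `title` column and the sample rows are assumptions made for the example.

```python
from collections import defaultdict

audiofile_by_id = {}                     # serves "where id = ..."
audiofile_by_artist = defaultdict(list)  # serves "where artist = ..."


def insert_audiofile(file_id, artist, title):
    """Write the same row to both tables; the duplication is intentional."""
    row = {"id": file_id, "artist": artist, "title": title}
    audiofile_by_id[file_id] = row
    audiofile_by_artist[artist].append(row)


insert_audiofile(1, "Sting", "Englishman in New York")
insert_audiofile(2, "Sting", "Fragile")
print(audiofile_by_id[1])
print([row["title"] for row in audiofile_by_artist["Sting"]])
```

Each query becomes a single-key lookup in the table built for it; the price is an extra write per table on every insert.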

CQL: Cassandra query language (v3.1.1)
http://cassandra.apache.org/doc/cql3/CQL.html
DDL: Data definition language
DML: Data manipulation language
CREATE TABLE monkeySpecies (
  species text PRIMARY KEY,
  common_name text,
  population varint,
  average_size int
);
CREATE KEYSPACE Excelsior
  WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 3 };
SELECT time, value FROM events
  WHERE event_type = 'myEvent'
  AND time > '2011-02-03'
  AND time <= '2012-01-01';
INSERT INTO NerdMovies (movie, director, main_actor, year)
  VALUES ('Serenity', 'Joss Whedon', 'Nathan Fillion', 2005)
  USING TTL 86400;
12:00

Time series example
CREATE TABLE timeseries (
pkey date,
skey time,
temperature int,
PRIMARY KEY (pkey, skey)
)
select * from timeseries
9:00
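
The PRIMARY KEY (pkey, skey) layout above can be pictured as one partition per date, with rows inside it kept sorted by time. The sketch below is a toy model of that layout; it assumes integer temperatures and, unlike Cassandra, does not overwrite a row written twice with the same key. The helper names are hypothetical.

```python
import bisect
from collections import defaultdict
from datetime import date, time

partitions = defaultdict(list)  # pkey (date) -> rows sorted by skey (time)


def insert(pkey: date, skey: time, temperature: int) -> None:
    """Place the row in its partition, keeping the partition sorted by skey."""
    bisect.insort(partitions[pkey], (skey, temperature))


def slice_query(pkey: date, start: time, end: time):
    """Rows of one partition with start <= skey < end, found by binary search."""
    rows = partitions[pkey]
    lo = bisect.bisect_left(rows, (start,))
    hi = bisect.bisect_left(rows, (end,))
    return rows[lo:hi]


day = date(2014, 3, 1)
insert(day, time(12, 0), 21)
insert(day, time(6, 0), 19)
insert(day, time(18, 0), 17)
print(slice_query(day, time(6, 0), time(18, 0)))
```

A time-range query within one day reads a contiguous slice of a single partition, which is what makes this layout a good fit for time series.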

Map-Reduce
Hadoop MapReduce is used
Advanced Task Tracker balancing
Use Pig & Hive; writing jobs in plain Java is almost impossible
6:00

Solr Integration
Hadoop – strong analytics
Solr – sharded index
Automatic replication
1:00

Full text search
Solr sharding: the same problem as with secondary indexes
MongoDB full text search
db.articles.ensureIndex( { subject: "text" } )
db.articles.runCommand( "text", { search: "bake coffee -cake" } )

4. Cassandra Ring - Coordinator role 51:00

6. MongoDB deployment schema 45:00

20. MongoDB map-reduce 3:00

23. Questions
