Sergey Titov
Software Architect
@sergtitov
Agenda
• Cassandra Architecture
• CAP theorem and Consistency
• Scalability
• Astyanax client
• Data Modeling
• Queries
• DataStax OpsCenter
• Resources
Cassandra architecture
• Ring
• P2P
• Gossip
• Key hash-based sharding
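The ring and key hash-based sharding bullets can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the class name `TokenRing`, the node names, the 0..99 token space, and the use of `String.hashCode()` as the hash. Cassandra's own partitioners and replication strategies are more involved and are not reproduced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy token ring (hypothetical): a node owns the key range that ends at its token. */
final class TokenRing {
    private final TreeMap<Integer, String> nodesByToken = new TreeMap<>();

    void addNode(int token, String node) {
        nodesByToken.put(token, node);
    }

    /** Hypothetical hash into [0, 100); a real partitioner hashes the row key bytes. */
    static int tokenFor(String rowKey) {
        return Math.floorMod(rowKey.hashCode(), 100);
    }

    /** Walks the ring clockwise from the token and collects up to rf distinct nodes. */
    List<String> replicasForToken(int token, int rf) {
        List<String> replicas = new ArrayList<>();
        if (nodesByToken.isEmpty()) {
            return replicas;
        }
        Integer start = nodesByToken.ceilingKey(token);
        if (start == null) {
            start = nodesByToken.firstKey(); // wrap around the ring
        }
        List<String> clockwise = new ArrayList<>(nodesByToken.tailMap(start, true).values());
        clockwise.addAll(nodesByToken.headMap(start, false).values());
        for (String node : clockwise) {
            if (replicas.size() < rf && !replicas.contains(node)) {
                replicas.add(node);
            }
        }
        return replicas;
    }

    List<String> replicasFor(String rowKey, int rf) {
        return replicasForToken(tokenFor(rowKey), rf);
    }

    /** Four-node example ring used in main. */
    static TokenRing sample() {
        TokenRing ring = new TokenRing();
        ring.addNode(0, "A");
        ring.addNode(25, "B");
        ring.addNode(50, "C");
        ring.addNode(75, "D");
        return ring;
    }

    public static void main(String[] args) {
        TokenRing ring = sample();
        System.out.println(ring.replicasForToken(30, 2)); // [C, D]
        System.out.println(ring.replicasForToken(80, 3)); // [A, B, C]
    }
}
```

Because every node can compute the same mapping from key to owners, no central lookup service is needed, which fits the P2P design listed above.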
CAP Theorem
Consistency in Cassandra
• ACID - Atomicity Consistency Isolation Durability
• BASE - Basically Available Soft-state Eventual consistency
• Isolation on the row level
• Atomic batches starting Cassandra 1.2
• Consistency level for READs and WRITEs set for every request
• Tunable consistency
• Log: CL_WRITE = ANY or ONE
• Strong: CL_READ + CL_WRITE > REPLICATION_FACTOR
• Recommended default: LOCAL_QUORUM
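The "Strong" rule above is simple arithmetic, sketched below. The class and method names are hypothetical; the only facts used are the quorum formula (half the replication factor, rounded down, plus one) and the overlap condition CL_READ + CL_WRITE > REPLICATION_FACTOR from the slide.

```java
/** Hypothetical helper for reasoning about tunable consistency. */
final class ConsistencyMath {
    /** Quorum size for a given replication factor: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /**
     * A read is guaranteed to see the latest acknowledged write when the
     * replicas read and the replicas written must overlap: R + W > RF.
     */
    static boolean isStrong(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println(q);                   // 2
        System.out.println(isStrong(q, q, rf));  // true  (QUORUM reads + QUORUM writes)
        System.out.println(isStrong(1, 1, rf));  // false (ONE reads + ONE writes)
    }
}
```

With RF = 3, QUORUM on both sides gives 2 + 2 > 3, while ONE on both sides gives 1 + 1 = 2, which is only eventually consistent.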
Consistency in Cassandra - continued
Level: Description
• ANY: A write must be written to at least one node. If all replica nodes
for the given row key are down, the write can still succeed once
a hinted handoff has been written. Note that if all replica nodes
are down at write time, an ANY write will not be readable until
the replica nodes for that row key have recovered.
• ONE: A write must be written to the commit log and memory table of
at least one replica node.
• QUORUM: A write must be written to the commit log and memory table on
a quorum of replica nodes.
• LOCAL_QUORUM: A write must be written to the commit log and memory table on
a quorum of replica nodes in the same data center as the
coordinator node. Avoids the latency of inter-data center
communication.
• EACH_QUORUM: A write must be written to the commit log and memory table on
a quorum of replica nodes in each data center.
• ALL: A write must be written to the commit log and memory table on
all replica nodes in the cluster for that row key.
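To make the level descriptions concrete, the sketch below counts how many acknowledgements a write needs at each level in a multi-data-center cluster. The enum, the method names, and the data center names are hypothetical and are not a driver API; the counts follow the descriptions above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical calculator: acknowledgements required per write consistency level. */
final class WriteAcks {
    enum Level { ANY, ONE, QUORUM, LOCAL_QUORUM, EACH_QUORUM, ALL }

    static int quorum(int rf) {
        return rf / 2 + 1;
    }

    /** rfByDc maps data center name to its replication factor. */
    static int required(Level level, Map<String, Integer> rfByDc, String localDc) {
        int totalRf = 0;
        int quorumPerDcSum = 0;
        for (int rf : rfByDc.values()) {
            totalRf += rf;
            quorumPerDcSum += quorum(rf);
        }
        switch (level) {
            case ANY:
                return 1; // may be satisfied by a stored hint instead of a replica
            case ONE:
                return 1;
            case QUORUM:
                return quorum(totalRf);
            case LOCAL_QUORUM: {
                Integer localRf = rfByDc.get(localDc);
                if (localRf == null) {
                    throw new IllegalArgumentException("unknown data center: " + localDc);
                }
                return quorum(localRf);
            }
            case EACH_QUORUM:
                return quorumPerDcSum;
            case ALL:
                return totalRf;
            default:
                throw new IllegalArgumentException("unknown level: " + level);
        }
    }

    /** Example topology: two data centers with three replicas each. */
    static Map<String, Integer> sample() {
        Map<String, Integer> rfByDc = new LinkedHashMap<>();
        rfByDc.put("dc1", 3);
        rfByDc.put("dc2", 3);
        return rfByDc;
    }

    public static void main(String[] args) {
        for (Level level : Level.values()) {
            System.out.println(level + " -> " + required(level, sample(), "dc1"));
        }
    }
}
```

In this example LOCAL_QUORUM waits for 2 replicas in the coordinator's data center, while QUORUM and EACH_QUORUM both wait for 4, some of which are remote.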
Write Data Flow
Multiple Data Centers
Scalability
Astyanax client
• Based on Hector
• High level, simple object oriented interface to Cassandra.
• Fail-over behavior on the client side.
• Connection pool abstraction (round robin connection pool)
• Monitoring to get event notification from the connection pool.
• Complete encapsulation of the underlying Thrift API.
• Automatic retry of downed hosts.
• Automatic discovery of additional hosts in the cluster.
• Suspension of hosts for a short period of time after timeouts.
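The round-robin pool and host suspension ideas can be sketched in plain Java. This is not Astyanax code: the class `RoundRobinHosts` and its methods are hypothetical, the clock is passed in explicitly to keep the example deterministic, and the sketch is not thread-safe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical round-robin host selector with temporary suspension. */
final class RoundRobinHosts {
    private final List<String> hosts;
    private final Map<String, Long> suspendedUntil = new HashMap<>();
    private int next = 0;

    RoundRobinHosts(List<String> hosts) {
        this.hosts = new ArrayList<>(hosts);
    }

    /** Takes a host out of rotation, e.g. after a timeout. */
    void suspend(String host, long nowMillis, long durationMillis) {
        suspendedUntil.put(host, nowMillis + durationMillis);
    }

    /** Returns the next host that is not suspended, or null if all are suspended. */
    String nextHost(long nowMillis) {
        for (int i = 0; i < hosts.size(); i++) {
            String host = hosts.get(next);
            next = (next + 1) % hosts.size();
            Long until = suspendedUntil.get(host);
            if (until == null || nowMillis >= until) {
                return host;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        RoundRobinHosts pool = new RoundRobinHosts(List.of("a", "b", "c"));
        System.out.println(pool.nextHost(0));    // a
        System.out.println(pool.nextHost(0));    // b
        pool.suspend("c", 0, 1000);              // "c" timed out: skip it for one second
        System.out.println(pool.nextHost(0));    // a
        System.out.println(pool.nextHost(0));    // b
        System.out.println(pool.nextHost(1000)); // c (suspension has expired)
    }
}
```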
Astyanax – token aware client
Data Modeling in Cassandra
• Column Families are NOT tables!
• Map<RowKey, SortedMap<ColumnKey, ColumnValue>>
• Values could be and often are stored in column names
• Number of columns could be different for different rows
• There could be 2 billion columns in one row!
• Use UUIDs
• Separate read-heavy from write-heavy data
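The Map<RowKey, SortedMap<ColumnKey, ColumnValue>> view above can be tried out directly with Java collections. This is an in-memory analogy only; the class name, row keys, and column names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** In-memory analogy of a column family: rows of sorted columns. */
final class ColumnFamilySketch {
    private final Map<String, SortedMap<String, String>> rows = new HashMap<>();

    void put(String rowKey, String columnName, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(columnName, value);
    }

    SortedMap<String, String> row(String rowKey) {
        return rows.getOrDefault(rowKey, new TreeMap<>());
    }

    static ColumnFamilySketch sample() {
        ColumnFamilySketch cf = new ColumnFamilySketch();
        // Rows may have different numbers of columns.
        cf.put("user:1", "name", "Ann");
        cf.put("user:1", "email", "ann@example.com");
        cf.put("user:2", "name", "Bob");
        // The data lives in the column names; the values are empty.
        cf.put("followers:ann", "carol", "");
        cf.put("followers:ann", "bob", "");
        return cf;
    }

    public static void main(String[] args) {
        ColumnFamilySketch cf = sample();
        System.out.println(cf.row("user:1").keySet());        // [email, name]
        System.out.println(cf.row("user:2").keySet());        // [name]
        System.out.println(cf.row("followers:ann").keySet()); // [bob, carol]
    }
}
```

Columns come back sorted by name within a row, which is what makes storing values in column names useful.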
Data Modeling in Cassandra - continued
• Client joins
• Denormalize data
• Wide rows
• Materialized views
• Model around queries
• Row key is “shard” key
Modeling nested entities and documents
Motivation
• Parent-child decomposition performs poorly in Cassandra.
• No JOIN operator in CQL!
• The only solution is to store a tree-like structure with nested “children”
• Cassandra doesn’t have built-in support for a document object
Solution
• Column Families are NOT tables
• Domain object fields are traversed along with the nested entities
• Collection and Map fields (at any nesting depth) are unwrapped
into plain key-value pairs (mapped to Cassandra column name – value)
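Modeling nested entities and documents. Example
// @Id, @Column and @NestedCollection belong to the object-mapping layer
// used in this talk; their definitions are not shown here.
import java.util.List;
import java.util.Map;
import java.util.UUID;

class Parent {
    @Id
    private UUID id;
    @Column
    private String stringField1;
    @NestedCollection                      // unwrapped into "imageMap:<key>" columns
    private Map<String, byte[]> imageMap;
    @NestedCollection                      // unwrapped into "children:<index>:<field>" columns
    private List<Child> children;
}

class Child {
    @Column
    private Integer kidsNumber;
}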
Modeling nested entities and documents. Example
Let's use JSON notation:
If Parent is
{
  "id" : "edc39a6c-355f-4ad0-a4de-b2103dbd610d",
  "stringField1" : "value1",
  "imageMap" : {
    "name1" : "SW1hZ2VEYXRhMQ==",
    "name2" : "SW1hZ2VEYXRhMg=="
  },
  "children" : [
    { "kidsNumber" : 1 },
    { "kidsNumber" : 2 }
  ]
}
the corresponding Cassandra columns will be:
• "id" -> "edc39a6c-355f-4ad0-a4de-b2103dbd610d"
• "stringField1" -> "value1"
• "imageMap:name1" -> "SW1…MQ=="
• "imageMap:name2" -> "SW1…Mg=="
• "children:0:kidsNumber" -> 1
• "children:1:kidsNumber" -> 2
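The unwrapping shown above can be sketched as a small recursive walk. This is an illustration, not the mapper used in the talk: it works on plain `Map` and `List` values instead of annotated classes, and the class name `NestedFlattener` and the ":" separator handling are assumptions based on the column names in the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical flattener: nested maps and lists become "a:b:0:c" column names. */
final class NestedFlattener {
    static Map<String, Object> flatten(Map<String, ?> document) {
        Map<String, Object> columns = new LinkedHashMap<>();
        for (Map.Entry<String, ?> field : document.entrySet()) {
            walk(field.getKey(), field.getValue(), columns);
        }
        return columns;
    }

    private static void walk(String path, Object value, Map<String, Object> out) {
        if (value instanceof Map) {
            // Map entries extend the path with their key.
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                walk(path + ":" + e.getKey(), e.getValue(), out);
            }
        } else if (value instanceof List) {
            // List elements extend the path with their index.
            List<?> list = (List<?>) value;
            for (int i = 0; i < list.size(); i++) {
                walk(path + ":" + i, list.get(i), out);
            }
        } else {
            out.put(path, value); // leaf: one column name -> value pair
        }
    }

    /** The Parent document from the slide, built from plain collections. */
    static Map<String, Object> sampleParent() {
        Map<String, Object> images = new LinkedHashMap<>();
        images.put("name1", "SW1hZ2VEYXRhMQ==");
        images.put("name2", "SW1hZ2VEYXRhMg==");
        Map<String, Object> child1 = new LinkedHashMap<>();
        child1.put("kidsNumber", 1);
        Map<String, Object> child2 = new LinkedHashMap<>();
        child2.put("kidsNumber", 2);
        Map<String, Object> parent = new LinkedHashMap<>();
        parent.put("id", "edc39a6c-355f-4ad0-a4de-b2103dbd610d");
        parent.put("stringField1", "value1");
        parent.put("imageMap", images);
        parent.put("children", Arrays.asList(child1, child2));
        return parent;
    }

    public static void main(String[] args) {
        flatten(sampleParent()).forEach((name, value) -> System.out.println(name + " -> " + value));
    }
}
```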
Range queries in Cassandra
Motivation
• No CQL equivalent for the SQL clause:
WHERE "field_name" >= value1 AND "field_name" <= value2
• For indexed fields the only possible query is
WHERE "field_name" [<, >, <=, >=, =] "value", and "field_name" can be
specified only once per CQL query
Solution
• Every Cassandra column name is a byte buffer: byte[] columnName
• Column names (unlike column values)
can be filtered by a specified range,
i.e. if two boundary values
• byte[] lowMargin,
• byte[] highMargin
are defined, it is possible to select the columns
WHERE columnName >= lowMargin AND columnName <= highMargin
• Since about 2 billion columns can be stored under the same row key,
a single row can hold a quickly searchable list of up to 2 * 10^9 entries
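A column-name range can be imitated in memory with a sorted map over byte arrays. This is an analogy under stated assumptions, not a Cassandra client call: the class `ColumnSlice` is hypothetical, names are compared as unsigned bytes in lexicographic order, and `Arrays.compareUnsigned` requires Java 9 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical single row whose columns are sorted by raw byte[] name. */
final class ColumnSlice {
    // Unsigned lexicographic order over the raw bytes of the column name.
    private static final Comparator<byte[]> UNSIGNED_BYTES = Arrays::compareUnsigned;

    private final NavigableMap<byte[], String> columns = new TreeMap<>(UNSIGNED_BYTES);

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    void put(String columnName, String value) {
        columns.put(utf8(columnName), value);
    }

    /** Values of all columns with lowMargin <= columnName <= highMargin. */
    List<String> slice(String lowMargin, String highMargin) {
        return new ArrayList<>(
                columns.subMap(utf8(lowMargin), true, utf8(highMargin), true).values());
    }

    static ColumnSlice sample() {
        ColumnSlice row = new ColumnSlice();
        row.put("2001", "a");
        row.put("2005", "b");
        row.put("2010", "c");
        row.put("2015", "d");
        return row;
    }

    public static void main(String[] args) {
        System.out.println(sample().slice("2003", "2010")); // [b, c]
    }
}
```

The lookup is a range scan over an already sorted structure, which is why the slide treats a wide row as a quickly searchable list.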
Composite Column Families
Motivation
• Raw untyped column names are inconvenient to process.
• If two or more components of a column name are serialized
into the same byte buffer, it is hard to search quickly on a single component.
For instance, take a column name consisting of two components:
• person_name: String
• time_stamp: Date
How do we build a column range that returns all the previously persisted
combinations with person_name = "Tom" and time_stamp >= "1999-01-01" and
time_stamp <= "2012-01-01"?
Solution
Cassandra has a built-in CompositeType comparator that can be defined for
any number of components; it sorts columns by component 0 first, then by component 1, and so on.
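The component-by-component ordering can be sketched with an ordinary Java comparator. This is an in-memory analogy rather than Cassandra's CompositeType itself; the class names and the sample data are hypothetical, and the query is the "Tom between 1999-01-01 and 2012-01-01" question from the slide.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical row whose column names have two components. */
final class CompositeColumns {
    /** Component 0 = personName, component 1 = timeStamp. */
    static final class Name {
        final String personName;
        final LocalDate timeStamp;

        Name(String personName, LocalDate timeStamp) {
            this.personName = personName;
            this.timeStamp = timeStamp;
        }
    }

    // Sort by component 0 first, then by component 1.
    static final Comparator<Name> BY_COMPONENTS =
            Comparator.comparing((Name n) -> n.personName).thenComparing(n -> n.timeStamp);

    private final NavigableMap<Name, String> columns = new TreeMap<>(BY_COMPONENTS);

    void put(String personName, String isoDate, String value) {
        columns.put(new Name(personName, LocalDate.parse(isoDate)), value);
    }

    /** Values for one person with from <= timeStamp <= to. */
    List<String> range(String personName, LocalDate from, LocalDate to) {
        return new ArrayList<>(columns.subMap(
                new Name(personName, from), true,
                new Name(personName, to), true).values());
    }

    static CompositeColumns sample() {
        CompositeColumns row = new CompositeColumns();
        row.put("Tom", "1998-05-01", "t1998");
        row.put("Tom", "2000-01-01", "t2000");
        row.put("Tom", "2011-12-31", "t2011");
        row.put("Tom", "2013-01-01", "t2013");
        row.put("Ann", "2005-01-01", "a2005");
        return row;
    }

    public static void main(String[] args) {
        List<String> hits = sample().range(
                "Tom", LocalDate.parse("1999-01-01"), LocalDate.parse("2012-01-01"));
        System.out.println(hits); // [t2000, t2011]
    }
}
```

Fixing component 0 and bounding component 1 turns the question into a single contiguous range, which is the property the composite comparator provides.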
Composite Column Families - mapping
// @Id, @Component and @Value belong to the object-mapping layer
// used in this talk; their definitions are not shown here.
import java.util.UUID;

public class ReferenceCategoryValue {
    @Id
    private String category;       // maps to the row key
    @Component(ordinal = 0)        // the following three fields are serialized
    private UUID id;               // into a column name, in ordinal order
    @Component(ordinal = 1)
    private String description;
    @Component(ordinal = 2)
    private String code;
    @Value
    private String value;          // the value which is saved for the column
}
DataStax OpsCenter
Resources
• DataStax Documentation
• Free Cassandra Academy
• Tutorials
• Apache Cassandra Home Page
• Cassandra Summit Presentations
• 2014 Summit Videos
• Netflix blog
• Astyanax
• eBay Cassandra Data Modeling best practices part 1 and part 2

Cassandra Overview
