Introduction to
  Cassandra




    Shimi Kiviti
    @shimi_k
Motivation

            Scaling

How do you scale your database?
 ● reads
 ● writes
Influential Papers

 ● Bigtable: A distributed storage system for structured data,
   2006
 ● Dynamo: Amazon's highly available key-value store, 2007


Cassandra:
 ● partitioning and replication - from Dynamo
 ● log-structured column family storage - from Bigtable
Cassandra Highlights

● Symmetric - all nodes are exactly the same
   ○ No single point of failure
   ○ Linearly scalable
   ○ Ease of administration
● High availability with multiple datacenters
● Consistency vs Latency
● Read/Write anywhere
● Flexible Schema
● Column TTL
● Distributed Counters
DHT - Distributed Hash Table
DHT

● O(1) node lookup
● Explicit replication
● Linear Scalability
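
A minimal sketch of how a hash ring can map a key to its replicas. It assumes every node holds the full token map (so a request is routed in one hop), MD5-derived tokens, and a simple "walk clockwise" replica placement. The names `RingSketch`, `tokenFor`, and `replicasForToken` are illustrative, not Cassandra's API.

```scala
import java.security.MessageDigest
import scala.collection.immutable.TreeMap

object RingSketch {
  // Hash a key to a non-negative token (MD5 is an assumption for illustration).
  def tokenFor(key: String): BigInt = {
    val digest = MessageDigest.getInstance("MD5").digest(key.getBytes("UTF-8"))
    BigInt(1, digest)
  }

  // nodes maps each node's token to the node name, sorted by token.
  final case class Ring(nodes: TreeMap[BigInt, String]) {
    // Walk clockwise from the token (wrapping around) and collect n distinct nodes.
    def replicasForToken(token: BigInt, n: Int): List[String] = {
      val clockwise = nodes.iteratorFrom(token) ++ nodes.iterator
      clockwise.map(_._2).toList.distinct.take(n)
    }

    def replicasFor(key: String, n: Int): List[String] =
      replicasForToken(tokenFor(key), n)
  }
}
```

Adding a node only moves the keys between its token and its predecessor's, which is what lets the cluster grow without reshuffling everything.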
Consistency

N = Replication factor
R = Number of replicas a read waits for (R <= N)
W = Number of replicas a write waits for (W <= N)
Quorum = N/2 + 1 (integer division)

When W + R > N, every read overlaps the latest acknowledged write (full consistency)
Examples:
 ● W = 1, R = N
 ● W = N, R = 1
 ● W = Quorum, R = Quorum
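
The arithmetic above can be checked with a few lines of Scala. This is only a sketch of the rule on the slide; `ConsistencySketch`, `quorum`, and `overlaps` are names made up for the example.

```scala
object ConsistencySketch {
  // Quorum is a strict majority of the N replicas (integer division).
  def quorum(n: Int): Int = n / 2 + 1

  // The read set and the write set share at least one replica when R + W > N.
  def overlaps(n: Int, r: Int, w: Int): Boolean = {
    require(r >= 1 && r <= n && w >= 1 && w <= n, "R and W must be in 1..N")
    r + w > n
  }
}
```

With N = 3, Quorum is 2, so quorum reads plus quorum writes overlap (2 + 2 > 3), while R = 1 with W = 1 does not.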
Consistency Level

● Every request defines its consistency level
   ○ Any
   ○ One
   ○ Two
   ○ Three
   ○ Quorum
   ○ Local Quorum
   ○ Each Quorum
   ○ All
Data Model

● Keyspace ~ schema
● ColumnFamilies ~ table
● Rows
● Columns
Column Family

Key1   Column   Column   Column


Key2   Column   Column
Column Family

ColumnFamily: {
  TOK: {
    chen: 1,
    ronen: 7
  }
  CityPath: {
    yuval: 5
  }
}
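
One way to picture the structure above is a map of row keys to sorted maps of columns. The sketch below models it with plain Scala collections; `ColumnFamilySketch` and `get` are illustrative names, and using `Int` values and `String` names is an assumption taken from the example data.

```scala
import scala.collection.immutable.SortedMap

object ColumnFamilySketch {
  // row key -> (column name -> value); columns stay sorted by name.
  type Row = SortedMap[String, Int]
  type ColumnFamily = Map[String, Row]

  val cf: ColumnFamily = Map(
    "TOK"      -> SortedMap("ronen" -> 7, "chen" -> 1),
    "CityPath" -> SortedMap("yuval" -> 5)
  )

  // Rows need not share the same columns; a missing row or column is just None.
  def get(family: ColumnFamily, key: String, column: String): Option[Int] =
    family.get(key).flatMap(_.get(column))
}
```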
Super Column Family
          Super1   Column Column Column
Key
          Super2   Column Column Column

 ColumnFamily: {
   Key: {
     super1: {
       name: value,
       name: value
     }
     super2: {
       name: value
     }
   }
 }
Write

● Any node
● Partitioner
● Commit log, memtable
● Wait for W responses
Write

● No reads
● No seeks
● Sequential disk access
● Atomic per row key within a column family
● Fast
● Always writeable (hinted hand-off)
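
The write path can be sketched as an append to a log followed by an in-memory update. This is a simplified single-row model, not Cassandra's implementation: the in-memory `commitLog` buffer stands in for the on-disk log, and `Node`, `Cell`, and `flushThreshold` are hypothetical names.

```scala
import scala.collection.immutable.SortedMap
import scala.collection.mutable.ArrayBuffer

object WritePathSketch {
  final case class Cell(value: String, timestamp: Long)

  final class Node(flushThreshold: Int) {
    // Stand-in for the on-disk commit log: append only, never read on the write path.
    val commitLog = ArrayBuffer.empty[(String, Cell)]
    // In-memory table, kept sorted by column name.
    var memtable = SortedMap.empty[String, Cell]
    // Flushed, immutable tables; newest first.
    var sstables = List.empty[SortedMap[String, Cell]]

    def write(column: String, value: String, timestamp: Long): Unit = {
      val cell = Cell(value, timestamp)
      commitLog += ((column, cell))             // 1. sequential append for durability
      memtable = memtable.updated(column, cell) // 2. update in memory, no disk read
      if (memtable.size >= flushThreshold) flush()
    }

    // Write the memtable out as one new immutable table and start a fresh one.
    def flush(): Unit = {
      sstables = memtable :: sstables
      memtable = SortedMap.empty[String, Cell]
    }
  }
}
```

Because nothing is read or updated in place, every disk operation in this model is a sequential append or a sequential flush.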
Read

● Choose any node
● Partitioner
● Wait for R responses
● Tunable read repair in the background
Read




A read may have to consult multiple SSTables
Slower than writes
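
A rough sketch of why reads cost more: the same column may exist in the memtable and in several SSTables, so a read has to check all of them and keep the newest version. The types and the `read` function here are illustrative, and picking the winner by highest timestamp is the assumption this sketch makes.

```scala
object ReadPathSketch {
  final case class Cell(value: String, timestamp: Long)
  type Table = Map[String, Cell]

  // Look the column up in the memtable and in every SSTable, then keep the newest version.
  def read(column: String, memtable: Table, sstables: Seq[Table]): Option[Cell] = {
    val candidates = (memtable +: sstables).flatMap(_.get(column))
    if (candidates.isEmpty) None else Some(candidates.maxBy(_.timestamp))
  }
}
```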
Cache

● There is no need to use memcached
● There is an internal configurable cache
   ○ Key cache
   ○ Row cache
Sorting

When you perform a get, the result is sorted
 ● Rows are sorted according to the partitioner
 ● Columns in a row are sorted according to the type of the
   column name
Partitioner

● RandomPartitioner - Uses hash values as tokens. Useful for
  distributing the load across all nodes.
  If you use it, set the nodes' tokens manually

● OrderPreservingPartitioner - You can get sorted rows, but it will
  cost you an evenly balanced cluster
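
Setting tokens manually usually means spacing them evenly around the ring. The sketch below assumes the RandomPartitioner token space runs from 0 up to 2^127 and gives node i the token i * 2^127 / nodeCount; `TokenSketch` and `initialTokens` are names invented for the example.

```scala
object TokenSketch {
  // Assumed size of the RandomPartitioner token space.
  val RingSize: BigInt = BigInt(2).pow(127)

  // Evenly spaced token for each node i in 0 until nodeCount.
  def initialTokens(nodeCount: Int): Seq[BigInt] =
    (0 until nodeCount).map(i => RingSize * i / nodeCount)
}
```

For example, `TokenSketch.initialTokens(4)` gives each of four nodes a quarter of the ring.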
Column Types

Available types:
 ● Bytes
 ● UTF8
 ● Ascii
 ● Long
 ● Date
 ● UUID
 ● Composite - <Type1>:<Type2>
Column Types

Examples:

Sort1:
8            10
9      vs    8
10           9

Sort2:
dan:8             dan:10
dan:10      vs    dan:8
shimi:1           shimi:1
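
The two orderings can be reproduced with ordinary Scala sorts. Reading the left column as a Long (or UTF8:Long composite) comparator and the right column as a plain UTF8 comparator is an interpretation of the slide. The function names are illustrative.

```scala
object ComparatorSketch {
  // Long-typed names sort numerically; UTF8-typed names sort as text.
  def sortAsLong(names: Seq[String]): Seq[String] = names.sortBy(_.toLong)
  def sortAsUtf8(names: Seq[String]): Seq[String] = names.sorted

  // Composite <UTF8>:<Long>: compare the text part first, then the number.
  def sortAsComposite(names: Seq[String]): Seq[String] =
    names.sortBy { name =>
      val parts = name.split(":")
      (parts(0), parts(1).toLong)
    }
}
```

As text, "10" sorts before "8" because the comparison goes character by character, which is why the column name type matters when names carry numbers.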
Clients

● Thrift - Cassandra's low-level (driver-level) interface
● CQL - Cassandra Query Language (SQL-like)
● High level clients:
   ○ Python
   ○ Java
   ○ Scala
   ○ Clojure
   ○ .Net
   ○ Ruby
   ○ PHP
   ○ Perl
   ○ C++
   ○ Haskell
Cascal - Scala client

Insert column:

session.insert("app" \ "users" \ "shimi" \ "passwd" \ "mypass")

val key = "app" \ "users" \ "shimi"
session.insert(key \ "email" \ "shimi.k@...")


Get column value:

val pass = session.get(key \ "passwd")
Cascal

Get multiple columns:

val row = session.list(key)
val cols = session.list(key, RangePredicate("email", "passwd"))
val cols = session.list(key, ColumnPredicate( List("passwd", "email") ))
Cascal

Get multiple rows:

val family = "app" \ "users"
val rows = session.list(family, RangePredicate("dan", "shimi"))
val rows = session.list(family, KeyPredicate("dan", "shimi"))
Cascal

Remove column:
session.remove("app" \ "users" \ "shimi" \ "passwd")


Remove row:
session.remove("app" \ "users" \ "shimi")


Batch operations:

val deleteCols = Delete(key, ColumnPredicate("age" :: "sex" :: Nil))
val insertEmail = Insert(key \ "email" \ "shimi.k@...")
session.batch(insertEmail :: deleteCols :: Nil)
Guidelines

● Keep together the data you query together
● Think about your use case and how you should fetch your
  data.
● Don't try to normalize your data
● You can't beat the disk
● Be ready to get your hands dirty
● There is no single solution for everything. You might
  consider using different solutions together
The End

Useful links:
 ● Cassandra, http://cassandra.apache.org/
 ● Wiki http://wiki.apache.org/cassandra/
 ● Cassandra mailing list
 ● IRC
 ● Bigtable, http://labs.google.com/papers/bigtable.html
 ● Dynamo, http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
 ● Cascal, https://github.com/shimi/cascal
