SlideShare a Scribd company logo
Cassandra Explained


    Berlin Buzzwords
      June 6, 2010

            Eric Evans
    eevans@rackspace.com
           @jericevans
    http://blog.sym-link.com
Outline
●   Background
●   Description
●   API
●   Examples
Background
Influential Papers
●   BigTable
    ● Strong consistency
    ● Sparse map data model


    ● GFS, Chubby, et al


●   Dynamo
    ●   O(1) distributed hash table (DHT)
    ●   BASE (aka eventual consistency)
    ●   Client tunable consistency/availability
NoSQL
●   HBase          ●   Hypertable
●   MongoDB        ●   HyperGraphDB
●   Riak           ●   Memcached
●   Voldemort      ●   Tokyo Cabinet
●   Neo4J          ●   Redis
●   Cassandra      ●   CouchDB
NoSQL Big data
●   HBase           ●   Hypertable
●   MongoDB         ●   HyperGraphDB
●   Riak            ●   Memcached
●   Voldemort       ●   Tokyo Cabinet
●   Neo4J           ●   Redis
●   Cassandra       ●   CouchDB
Bigtable / Dynamo
        Bigtable              Dynamo
●   HBase          ●   Riak
●   Hypertable     ●   Voldemort



            Cassandra ??
Dynamo-Bigtable Lovechild
CAP Theorem “Pick Two”
●   CP               ●   AP
    ●   Bigtable         ●   Dynamo
    ●   Hypertable       ●   Voldemort
    ●   HBase            ●   Cassandra
CAP Theorem “Pick Two”



   ●   Consistency
   ●   Availability
   ●   Partition Tolerance
Description
Properties
●   Symmetric
    ● No single point of failure
    ● Linearly scalable


    ● Ease of administration


●   Flexible partitioning, replica placement
●   Automated provisioning
●   High availability (eventual consistency)
P2P Routing
P2P Routing
Partitioning
●   Random
    ●   128bit namespace, (MD5)
    ●   Good distribution
●   Order Preserving
    ●   Tokens determine namespace
    ●   Natural order (lexicographical)
    ●   Range / cover queries
●   Yours ??
Replica Placement
●   SimpleSnitch
    ●   Default
    ●   N-1 successive nodes
●   RackInferringSnitch
    ●   Infers DC/rack from IP
●   PropertyFileSnitch
    ●   Configured w/ a properties file
Bootstrap
Bootstrap
Bootstrap
Choosing Consistency

         Write                      Read
Level     Description      Level     Description
ZERO      Hail Mary        ZERO      N/A
ANY       1 replica (HH)   ANY       N/A
ONE       1 replica        ONE       1 replica
QUORUM    (N / 2) +1       QUORUM    (N / 2) +1
ALL       All replicas     ALL       All replicas

                       R+W>N
Quorum ((N/2) + 1)
Quorum ((N/2) + 1)
Data Model
Overview
●   Keyspace
    ●   Uppermost namespace
    ●   Typically one per application
●   ColumnFamily
    ●   Associates records of a similar kind
    ●   Record-level Atomicity
    ●   Indexed
●   Column
    ●   Basic unit of storage
Sparse Table
Column
●   name
    ●   byte[]
    ●   Queried against (predicates)
    ●   Determines sort order
●   value
    ●   byte[]
    ●   Opaque to Cassandra
●   timestamp
    ●   long
    ●   Conflict resolution (Last Write Wins)
Column Comparators
●    Bytes
●    UTF8
●    TimeUUID
●    Long
●    LexicalUUID
●    Composite (third-party)


    http://github.com/edanuff/CassandraCompositeType
API
Low / High
●    Thrift
      ●   Compact binary RPC framework
      ●   12 different languages
●    Idiomatic
      ●   Hector (Java)
      ●   Pycassa (Python)
      ●   Others...


    http://wiki.apache.org/cassandra/ClientOptions
Thrift Read Methods
●   get() → Column
●   get_slice() → list<Column>
●   mulitget_slice() → map<key, list<Column>>
●   get_count() → int
●   multiget_count() → map<key, int>
●   get_range_slices()
Thrift Write Methods
●   insert()
●   batch_insert()
●   remove()
●   batch_mutate()
Examples
Pycassa – Python Client API
●    connect() → Thrift proxy
●    cf = ColumnFamily(proxy, ksp, cfname)
●    cf.insert() → long
●    cf.get() → dict
●    cf.get_range() → dict




    http://github.com/vomjom/pycassa
Address Book – Setup
<!-- conf/storage-conf.xml -->
<Keyspace Name=”AddressBook”>
  <ColumnFamily Name=”Addresses”
                CompareWith=”BytesType”
                RowsCached=”10000”
                KeysCached=”50%”
                Comment=”Too lame” />
</Keyspace>
Adding an entry
key = uuid()

columns = {
    'first':   'Eric',
    'last':    'Evans',
    'email':   'eevans@rackspace.com',
    'city':    'Austin',
    'zip':     78250
}

addresses.insert(key, columns)
Fetching a record
# fetching the record by key
record = addresses.get(key)

# accessing columns by name
zipcode = record['zip']
city = record['city']
Indexing
<!-- conf/storage-conf.xml -->
<Keyspace Name=”AddressBook”>
  <ColumnFamily Name=”Addresses”
                CompareWith=”BytesType”
                RowsCached=”10000”
                KeysCached=”50%”
                Comment=”Too lame” />
  <ColumnFamily Name=”ByCity”
                CompareWith=”UTF8Type” />
</Keyspace>
Updating the index
key = uuid()

columns = {
    'first':   'Eric',
    'last':    'Evans',
    'email':   'eevans@rackspace.com',
    'city':    'Austin',
    'zip':     78250
}

addresses.insert(key, columns)
byCity.insert('Austin', {key: ''})
Timeseries
<!-- conf/storage-conf.xml -->
<Keyspace Name=”Sites”>
  <ColumnFamily Name=”Stats”
                CompareWith=”LongType”/>
</Keyspace>
Logging values
# time as a long, binary, network-order
ts = pack('>d', long(time() * 1e6))

stats.insert('org.apache', {ts: value})
Slicing
begin = pack('>d', long(s * 1e6))

stats.get_range('org.apache',
                column_start=begin)

end = pack('>d', long((s + 86400) * 1e6))

stats.get_range(start='org.apache',
                finish='org.debian',
                column_start=begin,
                column_finish=end)
Questions?

More Related Content

What's hot

Introduction to redis - version 2
Introduction to redis - version 2Introduction to redis - version 2
Introduction to redis - version 2
Dvir Volk
 
Caching solutions with Redis
Caching solutions   with RedisCaching solutions   with Redis
Caching solutions with Redis
George Platon
 
Disperse xlator ramon_datalab
Disperse xlator ramon_datalabDisperse xlator ramon_datalab
Disperse xlator ramon_datalab
Gluster.org
 
Building Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::ClientBuilding Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::Client
Mike Friedman
 
Lcna example-2012
Lcna example-2012Lcna example-2012
Lcna example-2012
Gluster.org
 
Data file handling in python binary & csv files
Data file handling in python binary & csv filesData file handling in python binary & csv files
Data file handling in python binary & csv files
keeeerty
 
A Brief Introduction to Redis
A Brief Introduction to RedisA Brief Introduction to Redis
A Brief Introduction to Redis
Charles Anderson
 
Work WIth Redis and Perl
Work WIth Redis and PerlWork WIth Redis and Perl
Work WIth Redis and Perl
Brett Estrade
 
What Reika Taught us
What Reika Taught usWhat Reika Taught us
Lcna 2012-tutorial
Lcna 2012-tutorialLcna 2012-tutorial
Lcna 2012-tutorial
Gluster.org
 
Bulk Loading Data into Cassandra
Bulk Loading Data into CassandraBulk Loading Data into Cassandra
Bulk Loading Data into Cassandra
DataStax
 
"Metrics: Where and How", Vsevolod Polyakov
"Metrics: Where and How", Vsevolod Polyakov"Metrics: Where and How", Vsevolod Polyakov
"Metrics: Where and How", Vsevolod Polyakov
Yulia Shcherbachova
 
Dexador Rises
Dexador RisesDexador Rises
Dexador Rises
fukamachi
 
Kubernetes
KubernetesKubernetes
Kubernetes
Diego Pacheco
 
Scalable XQuery Processing with Zorba on top of MongoDB
Scalable XQuery Processing with Zorba on top of MongoDBScalable XQuery Processing with Zorba on top of MongoDB
Scalable XQuery Processing with Zorba on top of MongoDB
William Candillon
 
Fluentd and AWS at classmethod
Fluentd and AWS at classmethodFluentd and AWS at classmethod
Fluentd and AWS at classmethod
Treasure Data, Inc.
 
Scale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_glusterScale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_gluster
Gluster.org
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdose
Gluster.org
 
Gluster d2
Gluster d2Gluster d2
Gluster d2
Gluster.org
 
Handling 20 billion requests a month
Handling 20 billion requests a monthHandling 20 billion requests a month
Handling 20 billion requests a monthDmitriy Dumanskiy
 

What's hot (20)

Introduction to redis - version 2
Introduction to redis - version 2Introduction to redis - version 2
Introduction to redis - version 2
 
Caching solutions with Redis
Caching solutions   with RedisCaching solutions   with Redis
Caching solutions with Redis
 
Disperse xlator ramon_datalab
Disperse xlator ramon_datalabDisperse xlator ramon_datalab
Disperse xlator ramon_datalab
 
Building Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::ClientBuilding Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::Client
 
Lcna example-2012
Lcna example-2012Lcna example-2012
Lcna example-2012
 
Data file handling in python binary & csv files
Data file handling in python binary & csv filesData file handling in python binary & csv files
Data file handling in python binary & csv files
 
A Brief Introduction to Redis
A Brief Introduction to RedisA Brief Introduction to Redis
A Brief Introduction to Redis
 
Work WIth Redis and Perl
Work WIth Redis and PerlWork WIth Redis and Perl
Work WIth Redis and Perl
 
What Reika Taught us
What Reika Taught usWhat Reika Taught us
What Reika Taught us
 
Lcna 2012-tutorial
Lcna 2012-tutorialLcna 2012-tutorial
Lcna 2012-tutorial
 
Bulk Loading Data into Cassandra
Bulk Loading Data into CassandraBulk Loading Data into Cassandra
Bulk Loading Data into Cassandra
 
"Metrics: Where and How", Vsevolod Polyakov
"Metrics: Where and How", Vsevolod Polyakov"Metrics: Where and How", Vsevolod Polyakov
"Metrics: Where and How", Vsevolod Polyakov
 
Dexador Rises
Dexador RisesDexador Rises
Dexador Rises
 
Kubernetes
KubernetesKubernetes
Kubernetes
 
Scalable XQuery Processing with Zorba on top of MongoDB
Scalable XQuery Processing with Zorba on top of MongoDBScalable XQuery Processing with Zorba on top of MongoDB
Scalable XQuery Processing with Zorba on top of MongoDB
 
Fluentd and AWS at classmethod
Fluentd and AWS at classmethodFluentd and AWS at classmethod
Fluentd and AWS at classmethod
 
Scale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_glusterScale out backups-with_bareos_and_gluster
Scale out backups-with_bareos_and_gluster
 
Gluster intro-tdose
Gluster intro-tdoseGluster intro-tdose
Gluster intro-tdose
 
Gluster d2
Gluster d2Gluster d2
Gluster d2
 
Handling 20 billion requests a month
Handling 20 billion requests a monthHandling 20 billion requests a month
Handling 20 billion requests a month
 

Viewers also liked

NoSQL Yes, But YesCQL, No?
NoSQL Yes, But YesCQL, No?NoSQL Yes, But YesCQL, No?
NoSQL Yes, But YesCQL, No?Eric Evans
 
Outside The Box With Apache Cassnadra
Outside The Box With Apache CassnadraOutside The Box With Apache Cassnadra
Outside The Box With Apache Cassnadra
Eric Evans
 
Cassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQLCassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQLEric Evans
 
An Introduction To Cassandra
An Introduction To CassandraAn Introduction To Cassandra
An Introduction To Cassandra
Eric Evans
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed Database
Eric Evans
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra ExplainedEric Evans
 

Viewers also liked (6)

NoSQL Yes, But YesCQL, No?
NoSQL Yes, But YesCQL, No?NoSQL Yes, But YesCQL, No?
NoSQL Yes, But YesCQL, No?
 
Outside The Box With Apache Cassnadra
Outside The Box With Apache CassnadraOutside The Box With Apache Cassnadra
Outside The Box With Apache Cassnadra
 
Cassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQLCassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQL
 
An Introduction To Cassandra
An Introduction To CassandraAn Introduction To Cassandra
An Introduction To Cassandra
 
The Cassandra Distributed Database
The Cassandra Distributed DatabaseThe Cassandra Distributed Database
The Cassandra Distributed Database
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
 

Similar to Cassandra Explained

On Rails with Apache Cassandra
On Rails with Apache CassandraOn Rails with Apache Cassandra
On Rails with Apache Cassandra
Stu Hood
 
Cassandra Talk: Austin JUG
Cassandra Talk: Austin JUGCassandra Talk: Austin JUG
Cassandra Talk: Austin JUG
Stu Hood
 
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
Omid Vahdaty
 
Cassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + DynamoCassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + Dynamo
jbellis
 
Taming NoSQL with Spring Data
Taming NoSQL with Spring DataTaming NoSQL with Spring Data
Taming NoSQL with Spring Data
Sergi Almar i Graupera
 
NoSQL, no Limits, lots of Fun!
NoSQL, no Limits, lots of Fun!NoSQL, no Limits, lots of Fun!
NoSQL, no Limits, lots of Fun!
Otávio Santana
 
Online Analytics with Hadoop and Cassandra
Online Analytics with Hadoop and CassandraOnline Analytics with Hadoop and Cassandra
Online Analytics with Hadoop and Cassandra
Robbie Strickland
 
Your Database Cannot Do this (well)
Your Database Cannot Do this (well)Your Database Cannot Do this (well)
Your Database Cannot Do this (well)
javier ramirez
 
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Data Con LA
 
Heterogenous Persistence
Heterogenous PersistenceHeterogenous Persistence
Heterogenous Persistence
Jervin Real
 
JDD 2016 - Michal Matloka - Small Intro To Big Data
JDD 2016 - Michal Matloka - Small Intro To Big DataJDD 2016 - Michal Matloka - Small Intro To Big Data
JDD 2016 - Michal Matloka - Small Intro To Big Data
PROIDEA
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandrashimi_k
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_data
Roger Xia
 
Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentationMurat Çakal
 
Cassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A ComparisonCassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A Comparison
shsedghi
 
Introduction to AWS Big Data
Introduction to AWS Big Data Introduction to AWS Big Data
Introduction to AWS Big Data
Omid Vahdaty
 
JPoint'15 Mom, I so wish Hibernate for my NoSQL database...
JPoint'15 Mom, I so wish Hibernate for my NoSQL database...JPoint'15 Mom, I so wish Hibernate for my NoSQL database...
JPoint'15 Mom, I so wish Hibernate for my NoSQL database...
Alexey Zinoviev
 
TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...
TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...
TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...
tdc-globalcode
 
Sorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at SpotifySorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at Spotify
Neville Li
 

Similar to Cassandra Explained (20)

On Rails with Apache Cassandra
On Rails with Apache CassandraOn Rails with Apache Cassandra
On Rails with Apache Cassandra
 
Cassandra Talk: Austin JUG
Cassandra Talk: Austin JUGCassandra Talk: Austin JUG
Cassandra Talk: Austin JUG
 
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive AWS Big Data Demystified #2 |  Athena, Spectrum, Emr, Hive
AWS Big Data Demystified #2 | Athena, Spectrum, Emr, Hive
 
Cassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + DynamoCassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + Dynamo
 
Taming NoSQL with Spring Data
Taming NoSQL with Spring DataTaming NoSQL with Spring Data
Taming NoSQL with Spring Data
 
NoSQL, no Limits, lots of Fun!
NoSQL, no Limits, lots of Fun!NoSQL, no Limits, lots of Fun!
NoSQL, no Limits, lots of Fun!
 
Online Analytics with Hadoop and Cassandra
Online Analytics with Hadoop and CassandraOnline Analytics with Hadoop and Cassandra
Online Analytics with Hadoop and Cassandra
 
Cassandra
CassandraCassandra
Cassandra
 
Your Database Cannot Do this (well)
Your Database Cannot Do this (well)Your Database Cannot Do this (well)
Your Database Cannot Do this (well)
 
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
Big Data Day LA 2015 - Compiling DSLs for Diverse Execution Environments by Z...
 
Heterogenous Persistence
Heterogenous PersistenceHeterogenous Persistence
Heterogenous Persistence
 
JDD 2016 - Michal Matloka - Small Intro To Big Data
JDD 2016 - Michal Matloka - Small Intro To Big DataJDD 2016 - Michal Matloka - Small Intro To Big Data
JDD 2016 - Michal Matloka - Small Intro To Big Data
 
Introduction to Cassandra
Introduction to CassandraIntroduction to Cassandra
Introduction to Cassandra
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_data
 
Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentation
 
Cassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A ComparisonCassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A Comparison
 
Introduction to AWS Big Data
Introduction to AWS Big Data Introduction to AWS Big Data
Introduction to AWS Big Data
 
JPoint'15 Mom, I so wish Hibernate for my NoSQL database...
JPoint'15 Mom, I so wish Hibernate for my NoSQL database...JPoint'15 Mom, I so wish Hibernate for my NoSQL database...
JPoint'15 Mom, I so wish Hibernate for my NoSQL database...
 
TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...
TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...
TDC2017 | Florianopolis - Trilha DevOps How we figured out we had a SRE team ...
 
Sorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at SpotifySorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at Spotify
 

More from Eric Evans

Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)
Eric Evans
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
Eric Evans
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
Eric Evans
 
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)Eric Evans
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache CassandraEric Evans
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
Eric Evans
 
It's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRDIt's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRD
Eric Evans
 
Time series storage in Cassandra
Time series storage in CassandraTime series storage in Cassandra
Time series storage in Cassandra
Eric Evans
 
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra
Eric Evans
 
Cassandra by Example: Data Modelling with CQL3
Cassandra by Example:  Data Modelling with CQL3Cassandra by Example:  Data Modelling with CQL3
Cassandra by Example: Data Modelling with CQL3Eric Evans
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
Eric Evans
 
Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)Eric Evans
 
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraEric Evans
 
Castle enhanced Cassandra
Castle enhanced CassandraCastle enhanced Cassandra
Castle enhanced Cassandra
Eric Evans
 
CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)
Eric Evans
 
Cassandra In A Nutshell
Cassandra In A NutshellCassandra In A Nutshell
Cassandra In A Nutshell
Eric Evans
 

More from Eric Evans (16)

Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
 
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
 
It's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRDIt's not you, it's me: Ending a 15 year relationship with RRD
It's not you, it's me: Ending a 15 year relationship with RRD
 
Time series storage in Cassandra
Time series storage in CassandraTime series storage in Cassandra
Time series storage in Cassandra
 
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra
 
Cassandra by Example: Data Modelling with CQL3
Cassandra by Example:  Data Modelling with CQL3Cassandra by Example:  Data Modelling with CQL3
Cassandra by Example: Data Modelling with CQL3
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
 
Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)Rethinking Topology In Cassandra (ApacheCon NA)
Rethinking Topology In Cassandra (ApacheCon NA)
 
Virtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in CassandraVirtual Nodes: Rethinking Topology in Cassandra
Virtual Nodes: Rethinking Topology in Cassandra
 
Castle enhanced Cassandra
Castle enhanced CassandraCastle enhanced Cassandra
Castle enhanced Cassandra
 
CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)
 
Cassandra In A Nutshell
Cassandra In A NutshellCassandra In A Nutshell
Cassandra In A Nutshell
 

Recently uploaded

From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 

Recently uploaded (20)

From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 

Cassandra Explained

  • 1. Cassandra Explained Berlin Buzzwords June 6, 2010 Eric Evans eevans@rackspace.com @jericevans http://blog.sym-link.com
  • 2. Outline ● Background ● Description ● API ● Examples
  • 4. Influential Papers ● BigTable ● Strong consistency ● Sparse map data model ● GFS, Chubby, et al ● Dynamo ● O(1) distributed hash table (DHT) ● BASE (aka eventual consistency) ● Client tunable consistency/availability
  • 5. NoSQL ● HBase ● Hypertable ● MongoDB ● HyperGraphDB ● Riak ● Memcached ● Voldemort ● Tokyo Cabinet ● Neo4J ● Redis ● Cassandra ● CouchDB
  • 6. NoSQL Big data ● HBase ● Hypertable ● MongoDB ● HyperGraphDB ● Riak ● Memcached ● Voldemort ● Tokyo Cabinet ● Neo4J ● Redis ● Cassandra ● CouchDB
  • 7. Bigtable / Dynamo Bigtable Dynamo ● HBase ● Riak ● Hypertable ● Voldemort Cassandra ??
  • 9. CAP Theorem “Pick Two” ● CP ● AP ● Bigtable ● Dynamo ● Hypertable ● Voldemort ● HBase ● Cassandra
  • 10. CAP Theorem “Pick Two” ● Consistency ● Availability ● Partition Tolerance
  • 12. Properties ● Symmetric ● No single point of failure ● Linearly scalable ● Ease of administration ● Flexible partitioning, replica placement ● Automated provisioning ● High availability (eventual consistency)
  • 15. Partitioning ● Random ● 128bit namespace, (MD5) ● Good distribution ● Order Preserving ● Tokens determine namespace ● Natural order (lexicographical) ● Range / cover queries ● Yours ??
  • 16. Replica Placement ● SimpleSnitch ● Default ● N-1 successive nodes ● RackInferringSnitch ● Infers DC/rack from IP ● PropertyFileSnitch ● Configured w/ a properties file
  • 20. Choosing Consistency Write Read Level Description Level Description ZERO Hail Mary ZERO N/A ANY 1 replica (HH) ANY N/A ONE 1 replica ONE 1 replica QUORUM (N / 2) +1 QUORUM (N / 2) +1 ALL All replicas ALL All replicas R+W>N
  • 24. Overview ● Keyspace ● Uppermost namespace ● Typically one per application ● ColumnFamily ● Associates records of a similar kind ● Record-level Atomicity ● Indexed ● Column ● Basic unit of storage
  • 26. Column ● name ● byte[] ● Queried against (predicates) ● Determines sort order ● value ● byte[] ● Opaque to Cassandra ● timestamp ● long ● Conflict resolution (Last Write Wins)
  • 27. Column Comparators ● Bytes ● UTF8 ● TimeUUID ● Long ● LexicalUUID ● Composite (third-party) http://github.com/edanuff/CassandraCompositeType
  • 28. API
  • 29. Low / High ● Thrift ● Compact binary RPC framework ● 12 different languages ● Idiomatic ● Hector (Java) ● Pycassa (Python) ● Others... http://wiki.apache.org/cassandra/ClientOptions
  • 30. Thrift Read Methods ● get() → Column ● get_slice() → list<Column> ● mulitget_slice() → map<key, list<Column>> ● get_count() → int ● multiget_count() → map<key, int> ● get_range_slices()
  • 31. Thrift Write Methods ● insert() ● batch_insert() ● remove() ● batch_mutate()
  • 33. Pycassa – Python Client API ● connect() → Thrift proxy ● cf = ColumnFamily(proxy, ksp, cfname) ● cf.insert() → long ● cf.get() → dict ● cf.get_range() → dict http://github.com/vomjom/pycassa
  • 34. Address Book – Setup <!-- conf/storage-conf.xml --> <Keyspace Name=”AddressBook”> <ColumnFamily Name=”Addresses” CompareWith=”BytesType” RowsCached=”10000” KeysCached=”50%” Comment=”Too lame” /> </Keyspace>
  • 35. Adding an entry key = uuid() columns = { 'first': 'Eric', 'last': 'Evans', 'email': 'eevans@rackspace.com', 'city': 'Austin', 'zip': 78250 } addresses.insert(key, columns)
  • 36. Fetching a record # fetching the record by key record = addresses.get(key) # accessing columns by name zipcode = record['zip'] city = record['city']
  • 37. Indexing <!-- conf/storage-conf.xml --> <Keyspace Name=”AddressBook”> <ColumnFamily Name=”Addresses” CompareWith=”BytesType” RowsCached=”10000” KeysCached=”50%” Comment=”Too lame” /> <ColumnFamily Name=”ByCity” CompareWith=”UTF8Type” /> </Keyspace>
  • 38. Updating the index key = uuid() columns = { 'first': 'Eric', 'last': 'Evans', 'email': 'eevans@rackspace.com', 'city': 'Austin', 'zip': 78250 } addresses.insert(key, columns) byCity.insert('Austin', {key: ''})
  • 39. Timeseries <!-- conf/storage-conf.xml --> <Keyspace Name=”Sites”> <ColumnFamily Name=”Stats” CompareWith=”LongType”/> </Keyspace>
  • 40. Logging values # time as a long, binary, network-order ts = pack('>d', long(time() * 1e6)) stats.insert('org.apache', {ts: value})
  • 41. Slicing begin = pack('>d', long(s * 1e6)) stats.get_range('org.apache', column_start=begin) end = pack('>d', long((s + 86400) * 1e6)) stats.get_range(start='org.apache', finish='org.debian', column_start=begin, column_finish=end)