Apache Cassandra
http://cassandra.apache.org



        Benoit Perroud
    Software Engineer @Verisign
        & Apache Committer
  JUG Lausanne, 14.06.2012
Agenda

• NoSQL Quick Overview
• Apache Cassandra Fundamentals
   – Design principles
   – Data & Query Model
• Real Life Use Cases
   – Doodle clone
   – Heavy Write Load
   – Bulk Loading (write once data)
• Client side implementation
• Q&A

NoSQL

• [Wikipedia] NoSQL is a term used to designate database
  management systems that differ from classic relational
  database management systems (RDBMS) in some way.
  These data stores may not require fixed table schemas,
  usually avoid join operations, do not attempt to provide
  ACID properties and typically scale horizontally.

• Pioneers : Google BigTable, Amazon Dynamo, etc.




Scalability

• [Wikipedia] Scalability is a desirable property of a
  system, a network, or a process, which indicates its
  ability to either handle growing amounts of work in a
  graceful manner or to be readily enlarged.

• Scalability in two dimensions :
   – Scale up → scale vertically (increase RAM in an existing node)
   – Scale out → scale horizontally (add a node to the cluster)


• In summary : handle load and peaks.


Availability

• [Wikipedia] Availability refers to the ability of the users to
  access and use the system. If a user cannot access the
  system, it is said to be unavailable. Generally, the term
  downtime is used to refer to periods when a system is
  unavailable.
• In summary : minimize downtime.




CAP Theorem

• Consistency : all nodes see the same data at the same
  time
• Availability : node failures do not prevent survivors from
  continuing to operate
• Partition Tolerance : the system continues to operate
  despite arbitrary message loss

• According to the theorem, a distributed system can
  satisfy any two of these guarantees at the same time, but
  not all three.

NoSQL Promises

• Scale horizontally
   – Double computational power or storage by doubling size of the
     cluster (tight provisioning)
   – Adding nodes to the cluster in constant time
• High availability
   – No, few, or well-controlled single points of failure (SPoF)
• On commodity hardware

• Let's see how Cassandra achieves all of these



Apache Cassandra

• Apache Cassandra can be summarized as a scalable,
  distributed, sparse and eventually consistent hash
  map. But it's actually much more.
• Originally developed at Facebook and open-sourced in 2008,
  entered the Apache (ASF) Incubator in 2009, version 1.0 in 2011
• Inspired by Amazon Dynamo and Google BigTable
• Versions at the time of this talk : 1.0.10, 1.1.1
• Under active development by several companies : DataStax,
  Acunu, Netflix, Twitter, Rackspace, …



Apache Cassandra is a scalable distributed,
     sparse, eventually consistent hash map

• Gossip protocol (spreading states like a rumor)
• Consistent hashing
   – Node responsible for key range and replica sets
• No single point of failure
• Key space is 128 bits wide (keys are hashed with MD5)

[Figure: token ring covering 100% of the key space, with nodes at tokens 0, 12, 25, 37, 50, 62, 75 and 87. A new node takes half of the key range of the most loaded node. Explicitly set your node's token !]
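
The ring lookup can be sketched in a few lines of Python. This is only an illustration under assumptions: the tokens are the eight values drawn on the ring (key space scaled to 0..99), a node is taken to own the keys up to and including its own token, and replicas are simply the next nodes clockwise. The function names `owner` and `replicas` are made up for this example and are not Cassandra APIs.

```python
from bisect import bisect_left

# Tokens of the eight nodes drawn on the ring (key space scaled to 0..99).
TOKENS = [0, 12, 25, 37, 50, 62, 75, 87]

def owner(key_token, tokens=TOKENS):
    """Return the token of the first node at or after key_token, wrapping around."""
    idx = bisect_left(tokens, key_token)
    return tokens[idx % len(tokens)]

def replicas(key_token, replication_factor, tokens=TOKENS):
    """Owner plus the next nodes clockwise (a simple, topology-unaware placement)."""
    start = bisect_left(tokens, key_token) % len(tokens)
    return [tokens[(start + i) % len(tokens)] for i in range(replication_factor)]

print(owner(22))        # node with token 25
print(owner(90))        # wraps around to the node with token 0
print(replicas(22, 3))  # [25, 37, 50]
```

Because only the neighbouring ranges change when a node joins or leaves, the rest of the ring is untouched, which is the point of consistent hashing.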
Apache Cassandra is a scalable distributed,
     sparse, eventually consistent hash map

• Schemaless
   – A schema (metadata) may be determined for convenience
   – Column names are stored for every row
• [Wikipedia] Bloom filter is a space-efficient probabilistic
  data structure that is used to test whether an element is a
  member of a set.
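
A Bloom filter is small enough to sketch directly. The version below is a toy, not Cassandra's implementation: the sizes, the use of MD5 with a seed prefix as hash family, and the class and method names are all assumptions chosen for readability.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: may return false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive num_hashes positions by prefixing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.md5(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest, "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("key1")
print(bf.might_contain("key1"))  # True
```

A negative answer is always right, so a store can skip a file entirely when its filter says the key is absent.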




Apache Cassandra is a scalable distributed,
     sparse, eventually consistent hash map

• [Wikipedia] A quorum is the minimum number of votes
  that a distributed transaction has to obtain in order to be
  allowed to perform an operation in a distributed system.
  A quorum-based technique is implemented to enforce
  consistent operation in a distributed system.

• Quorum : R + W > N
   – N : number of replicas, R : number of nodes read, W : number of
     nodes written.
   – R = 1, W = N
   – R = N, W = 1
   – R = W = ⌊N/2⌋ + 1 (a majority of the replicas)
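
The overlap rule is easy to check by hand or in code. The helper names below are hypothetical; the sketch only restates R + W > N and the majority formula.

```python
def is_strongly_consistent(n, r, w):
    """True when every read set overlaps every write set, i.e. R + W > N."""
    return r + w > n

def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1

n = 3
print(is_strongly_consistent(n, 1, n))                  # R = 1, W = N
print(is_strongly_consistent(n, n, 1))                  # R = N, W = 1
print(is_strongly_consistent(n, quorum(n), quorum(n)))  # R = W = 2
print(is_strongly_consistent(n, 1, 1))                  # no overlap guaranteed
```

With N = 3, reading and writing at quorum touches 2 + 2 = 4 > 3 replicas, so at least one node read has seen the latest write.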
Apache Cassandra is a scalable distributed,
     sparse, eventually consistent hash map

• Key space [0,99], previously put(22, 1)
• Replication factor 2
• Consistency : ONE
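
[Figure: ring with nodes at tokens 0, 20, 40, 60 and 80. The client sends Put (22, 2) to a coordinator node, which forwards it to the owner of the key; the put(22,2) to the replica is asynchronous.]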
Apache Cassandra is a scalable distributed,
     sparse, eventually consistent hash map

• Key space [0,49], previously put(13, 1)
• Replication factor 3
• Consistency : QUORUM (R = 2, W = 2)

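
[Figure: ring with nodes at tokens 0, 20, 40, 60 and 80. Put (13, 2, t2) is sent to the replicas. A quorum read then gets Read(13) = 2, t2 from one replica and the stale Read(13) = 1, t1 from another; the newest timestamp wins and read repair updates the stale replica.]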
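
The quorum read with read repair can be sketched as follows. This is a simplified model, not Cassandra code: each replica is a plain dict mapping a key to a (value, timestamp) pair, timestamps t1 and t2 are the integers 1 and 2, and `quorum_read` is a name invented for the example.

```python
def quorum_read(replica_states, key, read_count=2):
    """Read `key` from the first read_count replicas, keep the newest
    (value, timestamp) pair and repair any contacted replica that is stale."""
    contacted = replica_states[:read_count]
    answers = [replica[key] for replica in contacted if key in replica]
    newest = max(answers, key=lambda value_ts: value_ts[1])
    for replica in contacted:
        if replica.get(key) != newest:
            replica[key] = newest  # read repair
    return newest[0]

# Put (13, 2, t2) reached two of the three replicas; one still holds (1, t1).
replicas = [{13: (2, 2)}, {13: (1, 1)}, {13: (2, 2)}]
print(quorum_read(replicas, 13))  # 2
print(replicas[1][13])            # (2, 2) after read repair
```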
Apache Cassandra is a scalable distributed,
     sparse, eventually consistent hash map

• Can be seen as a multilevel hash map : Hash of Hash (of
  Hash) Map
   – 2 (to 3) levels of keys.
       • Let's focus on 2, the 3rd level (SuperColumn) usage is no longer
         recommended
• Keyspace > column family > row > column name = value
   – # use Keyspace1;
   – # set ColumnFamily1['key1']['columName1'] = 'value1';
   – # get ColumnFamily1['key1']['columName1'];
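
The "hash of hash" view maps directly onto nested dictionaries. The sketch below mirrors the two CLI commands above; the helper names `set_column` and `get_column` are made up, and nothing here talks to a real cluster.

```python
# keyspace > column family > row key > column name = value
keyspace1 = {"ColumnFamily1": {}}

def set_column(keyspace, column_family, row_key, column_name, value):
    """Create the row on first use, then store the column value."""
    keyspace[column_family].setdefault(row_key, {})[column_name] = value

def get_column(keyspace, column_family, row_key, column_name):
    return keyspace[column_family][row_key][column_name]

set_column(keyspace1, "ColumnFamily1", "key1", "columName1", "value1")
print(get_column(keyspace1, "ColumnFamily1", "key1", "columName1"))  # value1
```

Rows are sparse in this model: each row holds only the columns that were actually set.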




Data Model : Keyspace

• Equivalent to database name in SQL world
• Define replication factor and network topology
   – Network topology includes multi-datacenter topologies
   – Replication factor can be defined per datacenter




Data Model : Column Family

• Equivalent to table name in SQL world
   – Term may change in upcoming releases to stop confusing users
• Define
   – Type of the keys
   – Column name comparator
   – Additional metadata (types of certain known columns)




Data Model : Row

• Defined by the key.
   – Eventually stored on a node and its replicas
• Keys are typed
• 2 strategies of key partitioner on the key space
   – Random partitioner :
       • md5(key), evenly distribute keys on nodes
   – Byte Ordered partitioner :
       • Keep order while iterating through the keys, may lead to hot spots
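
The effect of the random partitioner can be illustrated with a short sketch. It is a simplification: the token is taken as the raw 128-bit MD5 value, which is not exactly how Cassandra derives tokens, and the four evenly spaced node tokens, the `user<i>` keys and the function names are assumptions for the example.

```python
import hashlib
from bisect import bisect_left

KEY_SPACE = 2 ** 128  # number of possible 128-bit MD5 values

def md5_token(key):
    """Hash the key to a position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")

def node_for(key, node_tokens):
    """Index of the first node whose token is at or after the key's token."""
    return bisect_left(node_tokens, md5_token(key)) % len(node_tokens)

# Four nodes with evenly spaced tokens.
node_tokens = [i * KEY_SPACE // 4 for i in range(4)]

counts = [0] * len(node_tokens)
for i in range(1000):
    counts[node_for(f"user{i}", node_tokens)] += 1
print(counts)  # roughly 250 keys per node
```

Sequential keys such as user1, user2, … end up scattered over the ring, which is why key ranges are not meaningful with this partitioner, whereas a byte-ordered partitioner would keep them adjacent and could overload one node.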




Data Model : Column Name

    • Could be seen as column in SQL world
    • Not mandatory to be declared
         – If declared, their corresponding values have types
         – Or secondary index
    • Ordered (!)
    • Column Names are often used as values (!)

    Column family : Event1

    Row key          Column names :   07:00      08:00
    24.04.2012       Values :         239        255
Data Model : Value

• Can be typed, seen as array of bytes otherwise
• Existing types include
   –   Bytes
   –   Strings (ASCII or UTF-8 strings)
   –   Integer, Long, Float, Double, Decimal
   –   UUID, dates
   –   Counters (of long)
• Can expire
• No foreign keys (!)



Query Model

• 2 interfaces to interact with Cassandra
   – Native API
      • Thrift, CLI
      • Higher level third-party libraries
          –   Hector
          –   Pycassa
          –   Phpyandra
          –   Astyanax
          –   Helenus
   – CQL (Cassandra Query Language)




Query Model

• Cassandra is more than a key – value store.
   –   Get
   –   Put
   –   Delete
   –   Update
   –   But also various range queries
        • Key range
        • Column range (slice)
   – Secondary indexes




Query Model : Get

   • Get single key
       – Give me key ‘a’
   • Get multiple keys
       – Give me keys ‘a’, ‘c’, ‘d’ and ‘f’
   • Columns are ordered according to the column name comparator
   • Rows are ordered according to the partitioner (byte-ordered here)

               ‘1’       ‘2’       ‘3’          ‘4’   ‘5’
     ‘a’       8         9         10                 11
     ‘b’                 12        13                 14
     ‘c’       15                               16    17
     ‘d’                 18
     ‘e’       19        20                           20
     ‘f’       22        23        24           25    26
Query Model : Get Range

• Range
  – Query for a range of keys
      • Give me all keys between ‘a’ and ‘c’.
      • Mind the partitioner.



               ‘1’       ‘2’       ‘3’          ‘4’   ‘5’
     ‘a’       8         9         10                 11
     ‘b’                 12        13                 14
     ‘c’       15                               16    17
     ‘d’                 18
     ‘e’       19        20                           20
     ‘f’       22        23        24           25    26
Query Model : Get Slice

• Slice
   – Query for a slice of columns
       • For key ‘a’, give me all columns between ‘3’ and ‘5’
       • For key ‘f’, give me all columns between ‘3’ and ‘5’



                ‘1’       ‘2’       ‘3’        ‘4’       ‘5’
      ‘a’       8         9         10                   11
      ‘b’                 12        13                   14
      ‘c’       15                             16        17
      ‘d’                 18
      ‘e’       19        20                             20
      ‘f’       22        23        24         25        26
Query Model : Get Range Slice

• Range and Slice can be combined : rangeSliceQuery
   – For keys between ‘b’ and ‘d’, give me columns between ‘2’ and ‘4’




               ‘1’      ‘2’      ‘3’     ‘4’      ‘5’
      ‘a’      8        9        10               11
      ‘b’               12       13               14
      ‘c’      15                        16       17
      ‘d’               18
      ‘e’      19       20                        20
      ‘f’      22       23       24      25       26
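
The range, slice and range slice queries can be mimicked on the example table with plain dictionaries. This is a sketch of the semantics only: it assumes a byte-ordered partitioner (so key ranges are meaningful), compares keys and column names as strings, treats both bounds as inclusive, and uses function names invented for the example rather than any client API.

```python
# The example table: row key -> {column name -> value}; missing cells are absent.
ROWS = {
    "a": {"1": 8, "2": 9, "3": 10, "5": 11},
    "b": {"2": 12, "3": 13, "5": 14},
    "c": {"1": 15, "4": 16, "5": 17},
    "d": {"2": 18},
    "e": {"1": 19, "2": 20, "5": 20},
    "f": {"1": 22, "2": 23, "3": 24, "4": 25, "5": 26},
}

def get_slice(row, start_col, end_col):
    """Columns of one row whose names fall within [start_col, end_col]."""
    return {name: row[name] for name in sorted(row) if start_col <= name <= end_col}

def get_range(rows, start_key, end_key):
    """Rows whose keys fall within [start_key, end_key]."""
    return {key: rows[key] for key in sorted(rows) if start_key <= key <= end_key}

def get_range_slice(rows, start_key, end_key, start_col, end_col):
    """Combine both: a slice of columns for each row of a key range."""
    selected = get_range(rows, start_key, end_key)
    return {key: get_slice(row, start_col, end_col) for key, row in selected.items()}

print(get_slice(ROWS["a"], "3", "5"))               # {'3': 10, '5': 11}
print(list(get_range(ROWS, "a", "c")))              # ['a', 'b', 'c']
print(get_range_slice(ROWS, "b", "d", "2", "4"))
```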
Query Model : Secondary Index

• Secondary Index
  – Give me all rows where value for column ‘2’ is ‘12’




              ‘1’      ‘2’      ‘3’      ‘4’       ‘5’
     ‘a’      8        9        10                 11
     ‘b’               12       13                 14
     ‘c’      15                         16        17
     ‘d’               18
     ‘e’      19       20                          20
     ‘f’      22       23       24       25        26
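
Conceptually, a secondary index is an inverted map from a column's value to the row keys holding it. The sketch below shows only that idea on the example table; how Cassandra stores and distributes its indexes is not modelled, and `build_index` is a hypothetical helper.

```python
from collections import defaultdict

ROWS = {
    "a": {"1": 8, "2": 9, "3": 10, "5": 11},
    "b": {"2": 12, "3": 13, "5": 14},
    "c": {"1": 15, "4": 16, "5": 17},
    "d": {"2": 18},
    "e": {"1": 19, "2": 20, "5": 20},
    "f": {"1": 22, "2": 23, "3": 24, "4": 25, "5": 26},
}

def build_index(rows, column_name):
    """Map each value of column_name to the list of row keys holding it."""
    index = defaultdict(list)
    for key in sorted(rows):
        if column_name in rows[key]:
            index[rows[key][column_name]].append(key)
    return index

index_on_2 = build_index(ROWS, "2")
print(index_on_2[12])  # ['b']: all rows where column '2' is 12
```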
cassandra-cli and nodetool

• ./bin/cassandra-cli -p 9160 -h localhost
• ./bin/nodetool -p 7199 -h localhost




                Quick demo !




Write path

1.   Write to commit log
2.   Update MemTable
3.   Write is acked to client
4.   If the MemTable reaches its threshold,
     flush to disk as an SSTable

[Figure: in memory, one MemTable per column family (CF1, CF2, …, CFn). On disk, the commit log and, for each column family, one or more SSTables, each made of a Bloom filter, an index and data.]
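
The four steps of the write path can be sketched with an in-memory model. Everything here is an assumption made for illustration: the class name, the threshold counted in columns, and the flush running inline before the acknowledgement (a real node flushes in the background).

```python
class ColumnFamilyStore:
    """Toy model of one column family: commit log, MemTable and SSTables."""

    def __init__(self, memtable_threshold=3):
        self.commit_log = []   # append-only, for durability
        self.memtable = {}     # (row key, column name) -> (value, timestamp)
        self.sstables = []     # immutable, sorted snapshots of flushed MemTables
        self.memtable_threshold = memtable_threshold

    def write(self, key, column, value, timestamp):
        self.commit_log.append((key, column, value, timestamp))  # step 1
        self.memtable[(key, column)] = (value, timestamp)        # step 2
        if len(self.memtable) >= self.memtable_threshold:        # step 4
            self.flush()
        return True                                              # step 3: ack

    def flush(self):
        # An SSTable is written once, sorted, and never modified afterwards.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}

store = ColumnFamilyStore(memtable_threshold=2)
store.write("key1", "col1", "v1", timestamp=1)
store.write("key1", "col2", "v2", timestamp=2)
print(len(store.sstables), len(store.memtable))  # 1 0
```

Writes only append to the log and update memory, which is why they are cheap.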
Read path

• Versions of the same column
  can be spread at the same time
   – In the MemTable
   – In the MemTable being flushed
   – In one or multiple SSTables
• All versions are read, and resolved /
  merged using timestamps
   – Bloom filters allow skipping the read of
     unnecessary files
   – SSTables are indexed
   – Compaction keeps things
     reasonable

[Figure: in memory, one MemTable per column family (CF1, CF2, …, CFn). On disk, the commit log and, for each column family, one or more SSTables, each made of a Bloom filter, an index and data.]
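
Resolving the versions found on the read path amounts to keeping the one with the highest timestamp. The sketch below assumes the MemTable and each SSTable are dicts mapping (row key, column name) to (value, timestamp); the membership test stands in for the Bloom filter and index lookups, and `read_column` is a hypothetical name.

```python
def read_column(memtable, sstables, key, column):
    """Collect every version of a column and return the newest value."""
    candidates = []
    for source in [memtable, *sstables]:
        if (key, column) in source:  # stand-in for Bloom filter + index lookup
            candidates.append(source[(key, column)])
    if not candidates:
        return None
    newest = max(candidates, key=lambda value_ts: value_ts[1])
    return newest[0]

memtable = {("k", "c"): ("new", 3)}
sstables = [{("k", "c"): ("old", 1)}, {("k", "c"): ("mid", 2)}]
print(read_column(memtable, sstables, "k", "c"))  # new
```

The more SSTables hold a version of the column, the more sources a read has to consult, which is what compaction keeps in check.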
Compaction

•   Runs regularly as a background operation
•   Merge SSTables together
•   Remove expired and deleted values
•   Has impact on general I/O availability (and thus
    performance)
     – This is where most of tuning happens
     – Can be throttled
•   Two types of compaction
     – Size-tiered
            • Low I/O consumption → write-heavy workload
     – Leveled
            • Guarantees reading from fewer SSTables → read-heavy workload
•   See http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra for complete details.
Other Advanced Features

•   Super Columns (no longer recommended)
•   Composite column names
•   Integration with Hadoop
•   Bulk Loading
•   Compression
•   Multi tenancy




Real Life Use Case : Doodle Clone

• Live demo http://doodle.noisette.ch
• Naïve data model
   – Polls { id, label, [options], email, limit }
   – Subscribers (super) { polls.id { id, label, [options] } }
• Id generation
   – TimeUUID is your friend
• Avoid super column families
   – Use composite, or serialized/encoded subscribers
• Subscriber.label uniqueness per poll ?
   – Cassandra anti-pattern (read-after-write)
• Limit to n subscribers per option ?
   – Cassandra anti-pattern (read-after-write)
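
For the id generation point, a time-based (version 1) UUID can be produced with Python's standard library. This is a sketch of the idea rather than the code of the demo application; `new_poll_id` and `timeuuid_to_datetime` are names invented here.

```python
import uuid
from datetime import datetime, timedelta

def new_poll_id():
    """Version 1 UUID: embeds a timestamp, no coordination between nodes needed."""
    return uuid.uuid1()

def timeuuid_to_datetime(time_uuid):
    """Recover the creation time; uuid1 counts 100 ns intervals since 1582-10-15."""
    return datetime(1582, 10, 15) + timedelta(microseconds=time_uuid.time // 10)

poll_id = new_poll_id()
print(poll_id, timeuuid_to_datetime(poll_id))
```

Ids can be generated on any client without reading from the cluster first, which avoids the read-after-write pattern flagged above.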
Real Life Use Case : Heavy Writes

• Cassandra is a really good fit when the ratio read / write
  is close to 0
   – Event logging / redo logs
   – Time series
• It’s a best practice to write data in its raw format AND in
  aggregated forms at the same time
• But compaction needs tuning
   – {min,max}_compaction_threshold
   – memtable_flush_writers
   – … no magic solution here, only a pragmatic approach
       • change the configuration on one node, and measure the difference (load, latency, …)



Real Life Use Case : Counters

• Cassandra >= 0.8 (CASSANDRA-1072)
   create column family counterCF with
      default_validation_class=CounterColumnType
      AND key_validation_class=UTF8Type AND comparator=UTF8Type;
   incr counterCF['key']['columnName'] by 1;

• Example
   – Query per entity : number of hits for ‘entity1’
     between 18:30:00 and 19:00:00
       counterCF[‘entity1’][2012-06-14 18:30:00]
       counterCF[‘entity1’][2012-06-14 18:30:05]
       counterCF[‘entity1’][2012-06-14 18:30:10]
       …
       counterCF[‘entity2’][2012-06-14 18:30:05]
   – Query per date range : all entities being hit between
     18:30:00 and 19:00:00
       counterCF[2012-06-14 18:30:00][‘entity1’]
       counterCF[2012-06-14 18:30:00][‘entity2’]
       counterCF[2012-06-14 18:30:00][‘entity3’]
       …
       counterCF[2012-06-14 18:30:05][‘entity1’]
       ! need complete date enumeration
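
The "complete date enumeration" for the per-date layout means the client has to list every time bucket in the interval and fetch each row key. A minimal sketch, assuming 5-second buckets as in the example above, a key format of `YYYY-MM-DD HH:MM:SS` and an exclusive end bound (all of which are assumptions):

```python
from datetime import datetime, timedelta

def bucket_keys(start, end, step_seconds=5):
    """Enumerate every row key (one per time bucket) in [start, end)."""
    keys = []
    current = start
    while current < end:
        keys.append(current.strftime("%Y-%m-%d %H:%M:%S"))
        current += timedelta(seconds=step_seconds)
    return keys

keys = bucket_keys(datetime(2012, 6, 14, 18, 30, 0), datetime(2012, 6, 14, 19, 0, 0))
print(len(keys))          # 360 row keys to fetch
print(keys[0], keys[-1])  # 2012-06-14 18:30:00 2012-06-14 18:59:55
```

The per-entity layout avoids this enumeration, because the time buckets are column names and can be read with a single column slice.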




Real Life Use Case : Bulk Loading

• Data is transformed (e.g. using MapReduce)
• Then bulk loaded into the cluster
   – ColumnFamilyOutputFormat (Cassandra 1.0)
      • Not real bulk loading
   – BulkOutputFormat (Cassandra 1.1)
      • SSTables generated during the transformation, and streamed
• Prefer Leveled Compaction Strategy
   – Reduce read latency
   – Size sstable_size_in_mb to your data




Conclusion

• Cassandra is not a general purpose solution
• But Cassandra is doing a really good job if used
  accordingly
   – Really good scalability
   – Low operational cost
   – Advanced data and query model




Thanks for your attention

• Questions?




• Next Swiss BigData User Group : July 16 in Zurich
   – More information to come, @SwissScale




GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 

Recently uploaded (20)

Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 

Cassandra talk @JUG Lausanne, 2012.06.14

  • 1. Apache Cassandra http://cassandra.apache.org Benoit Perroud Software Engineer @Verisign & Apache Committer JUG Lausanne, 14.06.2012
  • 2. Agenda • NoSQL Quick Overview • Apache Cassandra Fundamentals – Design principles – Data & Query Model • Real Life Use Cases – Doodle clone – Heavy Write Load – Bulk Loading (write once data) • Client side implementation • Q&A
  • 3. NoSQL • [Wikipedia] NoSQL is a term used to designate database management systems that differ from classic relational database management systems (RDBMS) in some way. These data stores may not require fixed table schemas, usually avoid join operations, do not attempt to provide ACID properties and typically scale horizontally. • Pioneers: Google BigTable, Amazon Dynamo, etc.
  • 4. Scalability • [Wikipedia] Scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner or to be readily enlarged. • Scalability in two dimensions: – Scale up → scale vertically (increase RAM in an existing node) – Scale out → scale horizontally (add a node to the cluster) • In summary: handle load and peaks.
  • 5. Availability • [Wikipedia] Availability refers to the ability of the users to access and use the system. If a user cannot access the system, it is said to be unavailable. Generally, the term downtime is used to refer to periods when a system is unavailable. • In summary: minimize downtime.
  • 6. CAP Theorem • Consistency: all nodes see the same data at the same time • Availability: node failures do not prevent survivors from continuing to operate • Partition Tolerance: the system continues to operate despite arbitrary message loss • According to the theorem, a distributed system can satisfy any two of these guarantees at the same time, but not all three.
  • 7. NoSQL Promises • Scale horizontally – Double computational power or storage by doubling the size of the cluster (tight provisioning) – Adding nodes to the cluster in constant time • High availability – No, few, or controlled single points of failure (SPoF) • On commodity hardware • Let's see how Cassandra achieves all of these
  • 8. Apache Cassandra • Apache Cassandra could be simplified as a scalable, distributed, sparse and eventually consistent hash map. But it's actually way more. • Originally developed by Facebook and open-sourced in 2008, entered the ASF Incubator in 2009, version 1.0 in 2011 • Inspired by Amazon Dynamo and Google BigTable • Versions at time of speaking: 1.0.10, 1.1.1 • Under active development by several companies: Datastax, Acunu, Netflix, Twitter, Rackspace, …
  • 9. Apache Cassandra is a scalable distributed, sparse, eventually consistent hash map • Gossip protocol (spreading states like a rumor) • Consistent hashing – Node responsible for key range and replica sets • No single point of failure • Key space is 128 bits wide • Explicitly set your node's token! [Figure: token ring covering 100% of the key space, with nodes at 0, 12, 25, 37, 50, 62, 75 and 87; a node joining without an explicit token takes half of the key range of the most loaded node]
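
The consistent hashing idea on this slide can be sketched in a few lines of Python. This is an illustrative toy, not Cassandra's implementation: the `Ring` class, the node names and the ring size of 100 are assumptions made for the example, and it assumes each node owns the key range ending at its token, with replicas placed on the next nodes clockwise.

```python
import bisect
import hashlib


class Ring:
    """Toy token ring (hypothetical helper, not a Cassandra API)."""

    def __init__(self, tokens, ring_size=100):
        # tokens: mapping of node name -> explicitly assigned token
        self.ring_size = ring_size
        self.entries = sorted((token, node) for node, token in tokens.items())
        self.token_list = [token for token, _ in self.entries]

    def token_for(self, key):
        # Hash the key onto the toy ring; the real token space is far larger.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % self.ring_size

    def replicas(self, token, replication_factor=1):
        # Owner: first node whose token is >= the key token (wrapping around).
        # Replicas: the following nodes, walking the ring clockwise.
        start = bisect.bisect_left(self.token_list, token) % len(self.entries)
        count = min(replication_factor, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring({"n0": 0, "n1": 25, "n2": 50, "n3": 75})
print(ring.replicas(ring.token_for("some-key"), replication_factor=2))
```

Adding a node only changes ownership of the range between its token and the previous node's token, which is why explicit, evenly spaced tokens keep the load balanced.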
  • 10. Apache Cassandra is a scalable distributed, sparse, eventually consistent hash map • Schemaless – A schema (metadata) may be determined for convenience – Column names are stored for every row • [Wikipedia] Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set.
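
To make the Bloom filter definition concrete, here is a minimal sketch. The class name, the bit-array size and the use of SHA-256 with a seed prefix are assumptions chosen for readability; they do not reflect how Cassandra builds its filters.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


row_keys = BloomFilter()
row_keys.add("key1")
print(row_keys.might_contain("key1"))
```

A negative answer is always correct, which is what lets a reader skip a file without touching the disk.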
  • 11. Apache Cassandra is a scalable distributed, sparse, eventually consistent hash map • [Wikipedia] A quorum is the minimum number of votes that a distributed transaction has to obtain in order to be allowed to perform an operation in a distributed system. A quorum-based technique is implemented to enforce consistent operation in a distributed system. • Quorum: R + W > N – N: number of replicas, R: number of nodes read, W: number of nodes written. – R = 1, W = N – R = N, W = 1 – R = N/2, W = N/2 (+1 if N is even)
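
The R + W > N rule is easy to check numerically. The helper names below are made up for this sketch, and `quorum` assumes the usual majority definition, floor(N/2) + 1.

```python
def quorum(n):
    # Majority of n replicas.
    return n // 2 + 1


def reads_see_latest_write(n, r, w):
    # The read set and the write set must overlap in at least one replica.
    return r + w > n


n = 3
print(quorum(n))                                          # majority size for 3 replicas
print(reads_see_latest_write(n, quorum(n), quorum(n)))    # QUORUM reads + QUORUM writes
print(reads_see_latest_write(n, 1, 1))                    # ONE + ONE may miss the write
```

With N = 3, reading and writing at quorum touches 2 + 2 = 4 replica slots, so at least one replica is in both sets.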
  • 12. Apache Cassandra is a scalable distributed, sparse, eventually consistent hash map • Key space [0,99], previously put(22, 1) • Replication factor 2 • Consistency: ONE [Figure: ring with nodes at 0, 20, 40, 60 and 80, labelled coordinator, owner and replica; arrows labelled Put(22, 2) and Async put(22, 2)]
  • 13. Apache Cassandra is a scalable distributed, sparse, eventually consistent hash map • Key space [0,49], previously put(13, 1) • Replication factor 3 • Consistency: QUORUM (R = 2, W = 2) [Figure: ring with nodes at 0, 20, 40, 60 and 80; two replicas receive Put(13, 2, t2); a read gets Read(13) = 2, t2 from one replica and Read(13) = 1, t1 from another, which triggers a read repair]
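
The read repair in this slide boils down to "newest timestamp wins, stale replicas get fixed". A minimal sketch under that assumption follows; the function name, the replica names and the response shape are hypothetical.

```python
def resolve_read(responses):
    """responses: mapping of replica name -> (value, timestamp)."""
    # The version with the highest timestamp is returned to the client.
    winner = max(responses.values(), key=lambda version: version[1])
    # Replicas that answered with an older timestamp need a repair write.
    stale = sorted(name for name, version in responses.items() if version[1] < winner[1])
    return winner[0], stale


value, to_repair = resolve_read({"replica_a": (2, 2), "replica_b": (1, 1)})
print(value, to_repair)
```

Here the timestamps 1 and 2 stand for t1 and t2 from the slide: the read returns 2 and the replica still holding the t1 value is scheduled for repair.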
  • 14. Apache Cassandra is a scalable distributed, sparse, eventually consistent hash map • Can be seen as a multilevel hash map: Hash of Hash (of Hash) Map – 2 (to 3) levels of keys. • Let's focus on 2; using the 3rd level (SuperColumn) is no longer recommended • Keyspace > column family > row > column name = value – # use Keyspace1; – # set ColumnFamily1['key1']['columName1'] = 'value1'; – # get ColumnFamily1['key1']['columName1'];
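
The "hash of hash" view maps directly onto nested dictionaries. The sketch below mirrors the `set` and `get` CLI statements from the slide; the helper functions are illustrative only and ignore distribution, ordering and timestamps.

```python
# keyspace -> column family -> row key -> column name -> value
keyspace1 = {"ColumnFamily1": {}}


def set_column(keyspace, column_family, row_key, column_name, value):
    # Rows are created lazily and may each hold different columns (sparse).
    keyspace[column_family].setdefault(row_key, {})[column_name] = value


def get_column(keyspace, column_family, row_key, column_name):
    return keyspace[column_family].get(row_key, {}).get(column_name)


set_column(keyspace1, "ColumnFamily1", "key1", "columName1", "value1")
print(get_column(keyspace1, "ColumnFamily1", "key1", "columName1"))
```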
  • 15. Data Model: Keyspace • Equivalent to database name in SQL world • Defines replication factor and network topology – Network topology includes multi-datacenter topologies – Replication factor can be defined per datacenter
  • 16. Data Model: Column Family • Equivalent to table name in SQL world – Term may change in upcoming releases to stop confusing users • Defines – Type of the keys – Column name comparator – Additional metadata (types of certain known columns)
  • 17. Data Model: Row • Defined by the key. – Eventually stored on a node and its replicas • Keys are typed • 2 partitioner strategies for placing keys on the key space – Random partitioner: • md5(key), evenly distributes keys on nodes – Byte Ordered partitioner: • Keeps order while iterating through the keys, may lead to hot spots
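
The difference between the two partitioner strategies can be shown with two small functions. This is a sketch of the idea only: the function names are invented, and the real random partitioner's exact token derivation from the MD5 digest differs in detail.

```python
import hashlib


def random_partitioner_token(key: bytes) -> int:
    # md5(key): neighbouring keys land far apart, spreading load evenly.
    return int.from_bytes(hashlib.md5(key).digest(), "big")


def byte_ordered_token(key: bytes) -> bytes:
    # The key bytes are the token: order is preserved, hot spots are possible.
    return key


keys = [b"apple", b"banana", b"cherry"]
print(sorted(keys, key=byte_ordered_token))        # same order as the keys
print(sorted(keys, key=random_partitioner_token))  # order depends on the hashes
```

Key range queries only return keys in a meaningful order with the byte-ordered strategy, which is the trade-off against even distribution.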
  • 18. Data Model: Column Name • Could be seen as column in SQL world • Not mandatory to be declared – If declared, their corresponding values have types – Or secondary index • Ordered (!) • Column Names are often used as values (!) [Figure: column family Event1 with row key 24.04.2012, column names 07:00 and 08:00, and values 239 and 255]
  • 19. Data Model: Value • Can be typed, seen as array of bytes otherwise • Existing types include – Bytes – Strings (ASCII or UTF-8 strings) – Integer, Long, Float, Double, Decimal – UUID, dates – Counters (of long) • Can expire • No foreign keys (!)
  • 20. Query Model • 2 interfaces to interact with Cassandra – Native API • Thrift, CLI • Higher level third-party libraries – Hector – Pycassa – Phpyandra – Astyanax – Helenus – CQL (Cassandra Query Language)
  • 21. Query Model • Cassandra is more than a key-value store. – Get – Put – Delete – Update – But also various range queries • Key range • Column range (slice) – Secondary indexes
  • 22. Query Model: Get • Get single key – Give me key ‘a’ • Get multiple keys – Give me keys ‘a’, ‘c’, ‘d’ and ‘f’ [Example grid: columns ‘1’ to ‘5’, ordered by the column name comparator; rows ordered by the partitioner (byte-ordered here); sparse rows ‘a’: 8 9 10 11, ‘b’: 12 13 14, ‘c’: 15 16 17, ‘d’: 18, ‘e’: 19 20 20, ‘f’: 22 23 24 25 26]
  • 23. Query Model: Get Range • Range – Query for a range of keys • Give me all keys between ‘a’ and ‘c’. • Mind the partitioner. [Figure: same example grid as in slide 22]
  • 24. Query Model: Get Slice • Slice – Query for a slice of columns • For key ‘a’, give me all columns between ‘3’ and ‘5’ • For key ‘f’, give me all columns between ‘3’ and ‘5’ [Figure: same example grid as in slide 22]
  • 25. Query Model: Get Range Slice • Range and Slice can be combined: rangeSliceQuery – For keys between ‘b’ and ‘d’, give me columns between ‘2’ and ‘4’ [Figure: same example grid as in slide 22]
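
The get, range, slice and range slice queries from the last few slides can be imitated on an in-memory dictionary. The sketch assumes byte-ordered keys so that key ranges are meaningful; the `range_slice` function is hypothetical, and the placement of values in columns is an assumption, since the transcript does not preserve which cells of the example grid were empty.

```python
# Sparse rows: each row only stores the columns it actually has.
rows = {
    "a": {"1": 8, "2": 9, "3": 10, "5": 11},
    "b": {"2": 12, "3": 13, "4": 14},
    "c": {"1": 15, "3": 16, "5": 17},
    "d": {"3": 18},
}


def range_slice(data, key_start, key_end, col_start, col_end):
    """Rows with key_start <= key <= key_end, restricted to a column slice."""
    result = {}
    for key in sorted(data):
        if key_start <= key <= key_end:
            result[key] = {
                name: value
                for name, value in sorted(data[key].items())
                if col_start <= name <= col_end
            }
    return result


# For keys between 'b' and 'd', give me columns between '2' and '4'.
print(range_slice(rows, "b", "d", "2", "4"))
```

A plain slice is the same call with `key_start == key_end`, and a plain key range is the same call with a column interval wide enough to cover every column.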
  • 26. Query Model: Secondary Index • Secondary Index – Give me all rows where value for column ‘2’ is ‘12’ [Figure: same example grid as in slide 22]
  • 27. cassandra-cli and nodetool • ./bin/cassandra-cli -p 9160 -h localhost • ./bin/nodetool -p 7199 -h localhost • Quick demo!
  • 28. Write path 1. Write to commit log 2. Update MemTable 3. Write is acked to client 4. If MemTable reaches threshold, flush to disk as SSTable [Figure: memory holds one MemTable per column family (CF1, CF2, …, CFn); disks hold the commit log and, per column family, several SSTables, each made of a Bloom filter, an index and data]
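
The four steps of the write path can be sketched as a small class. Everything here is a simplification under stated assumptions: the class name and `flush_threshold` are invented, the threshold counts cells rather than bytes, the flush runs synchronously (a real node flushes in the background after acknowledging), and the commit log is simply cleared on flush.

```python
class ColumnFamilyStore:
    """Toy write path for one column family."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # append-only durability log
        self.memtable = {}     # (row key, column name) -> (value, timestamp)
        self.sstables = []     # immutable, sorted snapshots of past memtables
        self.flush_threshold = flush_threshold

    def write(self, key, column, value, timestamp):
        # 1. Write to the commit log.
        self.commit_log.append((key, column, value, timestamp))
        # 2. Update the memtable (last write wins in this sketch).
        self.memtable[(key, column)] = (value, timestamp)
        # 4. If the memtable reaches the threshold, flush it as an SSTable.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()
        # 3. Acknowledge the write to the client.
        return "ack"

    def flush(self):
        # An SSTable is written once, sorted by key, and never modified.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()


store = ColumnFamilyStore(flush_threshold=2)
store.write("key1", "col1", "value1", timestamp=1)
store.write("key2", "col1", "value2", timestamp=2)
print(len(store.sstables), store.memtable)
```

Writes never read or seek on disk: they append to a log and update memory, which is why the write path is cheap.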
  • 29. Read path • Versions of the same column can be spread over several places at the same time – In the MemTable – In the MemTable being flushed – In one or multiple SSTables • All versions are read, then resolved / merged using timestamps – Bloom filters allow skipping unnecessary files – SSTables are indexed – Compaction keeps things reasonable [Figure: same memory and disk layout as in the write path slide]
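
Resolving several versions of a column by timestamp can be sketched as follows. The function name and the data layout, a dictionary from (row key, column name) to (value, timestamp), are assumptions for the example; Bloom filters and indexes are left out.

```python
def read_column(key, column, memtable, sstables):
    """Collect every stored version of a column and keep the newest one."""
    versions = []
    for table in [memtable, *sstables]:
        version = table.get((key, column))
        if version is not None:
            versions.append(version)
    if not versions:
        return None
    # Resolve using the timestamp: the most recent write wins.
    return max(versions, key=lambda version: version[1])[0]


memtable = {("key1", "col1"): ("newest", 3)}
sstables = [
    {("key1", "col1"): ("oldest", 1)},
    {("key1", "col1"): ("older", 2), ("key2", "col1"): ("other", 2)},
]
print(read_column("key1", "col1", memtable, sstables))
```

The more SSTables hold a version of the column, the more lookups a read needs, which is the cost that Bloom filters and compaction keep down.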
  • 30. Compaction • Runs regularly as a background operation • Merges SSTables together • Removes expired and deleted values • Has impact on general I/O availability (and thus performance) – This is where most of the tuning happens – Can be throttled • Two types of compaction – Size-tiered • Low I/O consumption → write-heavy workload – Leveled • Guarantees reading from fewer SSTables → read-heavy workload • See http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra for complete details.
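
Merging SSTables while dropping deleted and expired values can be sketched like this. The `TOMBSTONE` marker, the `expires_at` field and the `compact` function are assumptions for the example; a real node also keeps deletion markers around for a grace period before purging them, which the sketch ignores.

```python
TOMBSTONE = object()  # marker stored in place of a value when a column is deleted


def compact(sstables, now):
    """Merge SSTables into one; cells map to (value, timestamp, expires_at)."""
    merged = {}
    for table in sstables:
        for cell, version in table.items():
            current = merged.get(cell)
            # Keep only the newest version of each cell.
            if current is None or version[1] > current[1]:
                merged[cell] = version
    live = {}
    for cell in sorted(merged):
        value, timestamp, expires_at = merged[cell]
        if value is TOMBSTONE:
            continue  # deleted value
        if expires_at is not None and expires_at <= now:
            continue  # expired value
        live[cell] = (value, timestamp, expires_at)
    return live


old = {("k", "a"): ("v1", 1, None), ("k", "b"): ("x", 1, None), ("k", "c"): ("ttl", 1, 50)}
new = {("k", "a"): ("v2", 2, None), ("k", "b"): (TOMBSTONE, 2, None)}
print(compact([old, new], now=100))
```

After compaction a read for any of these cells touches one file instead of two.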
  • 31. Other Advanced Features • Super Columns (no longer recommended) • Composite column names • Integration with Hadoop • Bulk Loading • Compression • Multi tenancy
  • 32. Real Life Use Case: Doodle Clone • Live demo http://doodle.noisette.ch • Naïve data model – Polls { id, label, [options], email, limit } – Subscribers (super) { polls.id { id, label, [options] } } • Id generation – TimeUUID is your friend • Avoid super column families – Use composite, or serialized/encoded subscribers • Subscriber.label uniqueness per poll? – Cassandra anti-pattern (read-after-write) • Limit to n subscribers per option? – Cassandra anti-pattern (read-after-write)
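
A TimeUUID is a version 1 (time-based) UUID, so identifiers can be generated on any client without coordination and still carry their creation time. A minimal sketch with Python's standard `uuid` module follows; the function names are invented for the example.

```python
import uuid

# Number of 100 ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01.
GREGORIAN_OFFSET = 0x01B21DD213814000


def new_poll_id():
    # Version 1 UUIDs embed a timestamp plus node and clock-sequence bits.
    return uuid.uuid1()


def unix_seconds(time_uuid):
    # Recover the creation time embedded in the identifier.
    return (time_uuid.time - GREGORIAN_OFFSET) / 1e7


poll_id = new_poll_id()
print(poll_id, unix_seconds(poll_id))
```

Because no read is needed to pick the next id, this avoids the read-after-write anti-pattern mentioned on the slide.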
  • 33. Real Life Use Case: Heavy Writes • Cassandra is a really good fit when the read / write ratio is close to 0 – Event logging / redo logs – Time series • It's a best practice to write data in its raw format AND in aggregated forms at the same time • But needs compaction tuning – {min,max}_compaction_threshold – memtable_flush_writers – … no magic solution here, only a pragmatic approach • change configuration on one node, and measure the difference (load, latency, …)
  • 34. Real Life Use Case: Counters • Cassandra >= 0.8 (CASSANDRA-1072) – create column family counterCF with default_validation_class=CounterColumnType AND key_validation_class=UTF8Type AND comparator=UTF8Type; – incr counterCF['key']['columnName'] by 1; • Example, query per entity (number of hits for 'entity1' between 18:30:00 and 19:00:00): counterCF['entity1'][2012-06-14 18:30:00], counterCF['entity1'][2012-06-14 18:30:05], counterCF['entity1'][2012-06-14 18:30:10], …, counterCF['entity2'][2012-06-14 18:30:05] • Example, query per date range (all entities being hit between 18:30:00 and 19:00:00): counterCF[2012-06-14 18:30:00]['entity1'], counterCF[2012-06-14 18:30:00]['entity2'], counterCF[2012-06-14 18:30:00]['entity3'], …, counterCF[2012-06-14 18:30:05]['entity1'] – Needs complete date enumeration!
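
The two counter layouts can be compared with an in-memory sketch. The 5-second bucket size follows the timestamps on the slide; the function names and the nested dictionaries standing in for the counter column family are assumptions for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

BUCKET = timedelta(seconds=5)

by_entity = defaultdict(lambda: defaultdict(int))  # row = entity, column = bucket
by_time = defaultdict(lambda: defaultdict(int))    # row = bucket, column = entity


def bucket_of(ts):
    # Round a timestamp down to the start of its 5-second bucket.
    return ts - timedelta(seconds=ts.second % 5, microseconds=ts.microsecond)


def hit(entity, ts):
    # Each hit increments one counter in both layouts.
    bucket = bucket_of(ts)
    by_entity[entity][bucket] += 1
    by_time[bucket][entity] += 1


def hits_for_entity(entity, start, end):
    # One row, one column slice: columns are ordered by bucket.
    return sum(count for bucket, count in by_entity[entity].items() if start <= bucket < end)


def entities_hit(start, end):
    # Row keys are not ordered, so every bucket in the range is enumerated.
    found = set()
    bucket = start
    while bucket < end:
        found.update(by_time.get(bucket, {}))
        bucket += BUCKET
    return found


hit("entity1", datetime(2012, 6, 14, 18, 30, 1))
hit("entity1", datetime(2012, 6, 14, 18, 30, 7))
hit("entity2", datetime(2012, 6, 14, 18, 30, 6))
start, end = datetime(2012, 6, 14, 18, 30), datetime(2012, 6, 14, 19, 0)
print(hits_for_entity("entity1", start, end), entities_hit(start, end))
```

The per-entity query is a single slice, while the per-date query issues one lookup per bucket, which is the "complete date enumeration" cost noted on the slide.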
  • 35. Real Life Use Case: Bulk Loading • Data is transformed (e.g. using MapReduce) • Then bulk loaded into the cluster – ColumnFamilyOutputFormat (Cassandra 1.0) • Not real bulk loading – BulkOutputFormat (Cassandra 1.1) • SSTables generated during the transformation, and streamed • Prefer Leveled Compaction Strategy – Reduces read latency – Size sstable_size_in_mb to your data
  • 36. Conclusion • Cassandra is not a general purpose solution • But Cassandra is doing a really good job if used accordingly – Really good scalability – Low operational cost – Advanced data and query model
  • 37. Thanks for your attention • Questions? • Next Swiss BigData User Group: July 16 in Zurich – More information to come, @SwissScale