An introduction
 to Cassandra

                           Pedro Gomes
              pedrogomes@lsd.di.uminho.pt
          ...
Context
•   NoSQL movement - Not only SQL
    • unstructured data
    • web oriented interfaces
    • scale problems
         ...
Cassandra - introduction
• From the Greek prophetess Cassandra.
• Based on Amazon Dynamo and Google
  BigTable
• Built on ...
Why Cassandra?
•   Highly available
•   Eventually consistent
•   Decentralized
•   Elastic
•   Fault tolerant
•   Flexible Sc...
A little internals...
• Built for scale - consistent hashing
  [Figure: ring of nodes, with a new node joining]
       ...
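A minimal Python sketch of the consistent hashing idea on this slide, not Cassandra's actual implementation. The `HashRing` class, the `token` helper and the use of MD5 for ring positions are assumptions made for illustration; the node names are borrowed from the labels in the slide's figure.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a ring position (MD5 is an illustrative choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical ring: each node owns the arc that ends at its token."""

    def __init__(self, nodes=()):
        self._tokens = []   # sorted node positions on the ring
        self._owners = {}   # position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        t = token(node)
        bisect.insort(self._tokens, t)
        self._owners[t] = node

    def node_for(self, key: str) -> str:
        # First node at or after the key's position, wrapping around the ring.
        i = bisect.bisect_left(self._tokens, token(key))
        return self._owners[self._tokens[i % len(self._tokens)]]


ring = HashRing(["A", "B", "F", "I", "M"])
keys = [f"post-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("N")  # the "new node" of the figure
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only keys that fall on the new node's arc change owner; every other key stays where it was, which is why adding a node does not reshuffle the whole data set.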
Partitioners
• Order preserving
• Random
• Custom...
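A small sketch of the difference between the first two partitioners, under the assumption that a partitioner is just a function from row key to token. The function names are hypothetical. The random variant hashes the key with MD5 and the order-preserving variant uses the key bytes themselves.

```python
import hashlib


def random_token(key: str) -> int:
    # Hashing spreads keys evenly over the ring but destroys key order.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def order_preserving_token(key: str) -> bytes:
    # The key itself is the token, so neighbouring keys stay neighbours.
    return key.encode("utf-8")


keys = ["apple", "banana", "cherry"]
by_order = sorted(keys, key=order_preserving_token)
by_hash = sorted(keys, key=random_token)
```

With the order-preserving tokens, a range of keys maps to a contiguous arc of the ring, so range scans are cheap but load can be uneven. With hashed tokens the load is balanced but key ranges are scattered.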
Consistency
• CAP theorem
  [Figure labels: Availability, Consistency]
• Trade consistency for availability
...
Consistency - N,W,R
• Define your Consistency:
 • Define the replication factor N
 • For writes and reads choose the number
 ...
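The W + R > N rule from this slide can be checked with a few lines of Python. This is only an arithmetic sketch: the helper names are hypothetical, and QUORUM is assumed to mean a strict majority of the N replicas.

```python
def quorum(n: int) -> int:
    """Smallest strict majority of n replicas."""
    return n // 2 + 1


def read_sees_latest_write(n: int, w: int, r: int) -> bool:
    """True when every read set must overlap every write set (W + R > N)."""
    return w + r > n


N = 3  # replication factor
levels = {"ONE": 1, "QUORUM": quorum(N), "ALL": N}

# Try every combination of write level and read level.
for write_level, w in levels.items():
    for read_level, r in levels.items():
        ok = read_sees_latest_write(N, w, r)
        print(f"W={write_level:6} R={read_level:6} consistent={ok}")
```

With N = 3, QUORUM writes plus QUORUM reads satisfy the rule (2 + 2 > 3), while ONE plus ONE does not (1 + 1 <= 3), so the latter is only eventually consistent.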
Data model
• KeySpaces - collection of your unique keys
• Column Families - groups of columns
• Columns - a tuple with col...
Data model - Column Families
• Using the blog example:
 • Posts
   Keys       Columns

        Geek          Title:       Auth...
Data model - Super Columns
          • Comments
 Keys       SuperColumns

 Geek        4/5/2010   Author:    Comment:     ...
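One way to picture the two structures from the blog example is as nested maps. The Python sketch below is a mental model only, not how Cassandra stores data. The `Column` class is hypothetical, the timestamps are placeholder values, and the truncated texts from the slides are kept as they appear.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """A column is a (name, value, timestamp) tuple."""
    name: str
    value: str
    timestamp: int  # placeholder values below, not real clocks


# Column family: row key -> {column name -> Column}
blog_entries = {
    "Geek Nights": {
        "Title": Column("Title", "Geek Nights", 1),
        "Author": Column("Author", "Pedro", 1),
        "Body": Column("Body", "The...", 1),
    },
}

# Super column family: row key -> {super column name -> {column name -> Column}}
comments = {
    "Geek Nights": {
        "4/5/2010 20:00": {
            "Author": Column("Author", "Ricardo", 1),
            "Comment": Column("Comment", "I think...", 1),
        },
        "4/5/2010 19:00": {
            "Author": Column("Author", "Jack", 1),
            "Comment": Column("Comment", "IMO ...", 1),
        },
    },
}

author = comments["Geek Nights"]["4/5/2010 20:00"]["Author"].value
```

A super column simply adds one more level of nesting: the comment's date is the super column name and the comment's fields are its sub-columns.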
Data model
<Keyspace Name="BloggyAppy">

   <!-- CF definitions -->
   <ColumnFamily CompareWith="BytesType" Name="BlogEnt...
API

• Thrift RPC
 • Java, PHP, C++....
API
•   insert(KeySpace, Key, Column_path, Value, Timestamp, Consistency_level)

•   get(KeySpace, Key,Column_path,Consistenc...
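To get a feel for these two calls without a cluster, here is a toy in-memory stand-in written in Python. It is not the Thrift client: `ToyStore` is hypothetical, the column path is simplified to a `(column_family, column)` pair, and the consistency level is accepted but ignored because there is only one process. It does show the role of the timestamp, where the write with the higher timestamp wins.

```python
class ToyStore:
    """Hypothetical single-process model of insert/get; not a real client."""

    def __init__(self):
        # (keyspace, column_family, row key) -> {column: (value, timestamp)}
        self._rows = {}

    def insert(self, keyspace, key, column_path, value, timestamp,
               consistency_level="ONE"):
        column_family, column = column_path
        row = self._rows.setdefault((keyspace, column_family, key), {})
        current = row.get(column)
        # Keep the write only if it is newer than what is already stored.
        if current is None or timestamp > current[1]:
            row[column] = (value, timestamp)

    def get(self, keyspace, key, column_path, consistency_level="ONE"):
        column_family, column = column_path
        return self._rows[(keyspace, column_family, key)][column]


store = ToyStore()
path = ("BlogEntries", "Title")
store.insert("BloggyAppy", "Geek Nights", path, "Geek Nights", timestamp=2)
store.insert("BloggyAppy", "Geek Nights", path, "stale title", timestamp=1)
value, written_at = store.get("BloggyAppy", "Geek Nights", path)
```

The second insert carries an older timestamp, so it is ignored and the read still returns the first value.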
Have fun

• Clients for many languages
• Lucandra
• Hadoop support
• ...
End


• Questions?
Cassandra presentation - Geek Nights Braga

My presentation about Cassandra at Braga Geek Nights

Published in: Technology
  • Cassandra presentation - Geek Nights Braga

    1. An introduction to Cassandra
       Pedro Gomes, pedrogomes@lsd.di.uminho.pt
       Braga Geek Nights - April 2010
    2. Context
       • NoSQL movement - Not only SQL
         • unstructured data
         • web oriented interfaces
         • scale problems
       • +20 emerging non-relational databases
         • Document stores
         • Graph databases
         • Key-Value and Wide Column Stores
       [Image label: Voldemort]
    3. Cassandra - introduction
       • From the Greek prophetess Cassandra.
       • Based on Amazon Dynamo and Google BigTable
       • Built at Facebook, open sourced in 2008
       • Scalable, decentralized and structured data store
    4. Why Cassandra?
       • Highly available
       • Eventually consistent
       • Decentralized
       • Elastic
       • Fault tolerant
       • Flexible schema
    5. A little internals...
       • Built for scale - consistent hashing
       [Figure: ring of nodes, with a new node joining; labels A, F, N, M, I, B]
    6. Partitioners
       • Order preserving
       • Random
       • Custom...
    7. Consistency
       • CAP theorem
         [Figure labels: Availability, Consistency, Partition Tolerance]
       • Trade consistency for availability
       • Eventual consistency
         • Read repair, hinted handoff, proactive repair
       • A choice, not an obligation
    8. Consistency - N,W,R
       • Define your consistency:
         • Define the replication factor N
         • For writes and reads choose the number of nodes, W or R
         • ALL, ONE, QUORUM, ZERO.
       • W + R > N = consistency (every read overlaps the latest write)
    9. Data model
       • KeySpaces - collection of your unique keys
       • Column Families - groups of columns
       • Columns - a tuple with column name, value, and timestamp
       • Super columns - a column that is a set of columns
       • I will show pictures next, don’t worry.
    10. Data model - Column Families
        • Using the blog example:
        • Posts

          Keys          Columns
          Geek Nights   Title: Geek Nights | Author: Pedro | Body: The...
          Cassandra     Title: Cassandra | Author: Pedro | Body: This... | Tags: Data, ...
          Stuff         Title: Stuff | Author: Someone | Body: Something

    11. Data model - Super Columns
        • Comments

          Keys          SuperColumns
          Geek Nights   4/5/2010 20:00   Author: Ricardo | Comment: I think... | email: email@
                        4/5/2010 19:00   Author: Jack | Comment: IMO ... | email: email@
          Cassandra     1/4/2010 14:00   Author: Filipe | Comment: My POV.. | email: email@
                        1/4/2010 14:00   Author: Jon | Comment: ... | email: email@
          Stuff         1/4/2010 14:00   Author: Filipe | Comment: Great... | email: email@

    12. Data model

        <Keyspace Name="BloggyAppy">
           <!-- CF definitions -->
           <ColumnFamily CompareWith="BytesType" Name="BlogEntries"/>
           <ColumnFamily CompareWith="TimeUUIDType" Name="Comments"
                         CompareSubcolumnsWith="BytesType" ColumnType="Super"/>
        </Keyspace>

        • Think about your schema
    13. API
        • Thrift RPC
          • Java, PHP, C++....
    14. API
        • insert(KeySpace, Key, Column_path, Value, Timestamp, Consistency_level)
        • get(KeySpace, Key, Column_path, Consistency_level)
        • batch_mutate
        • multi_get
        • range
        • ...
    15. Have fun
        • Clients for many languages
        • Lucandra
        • Hadoop support
        • ...
    16. End
        • Questions?