33rd degree conference


    1. The Apache Cassandra storage engine. Sylvain Lebresne
    2. About me
        • Sylvain Lebresne
        • sylvain@datastax.com
        • @pcmanus
        • Work at
    3. Outline: 1. What is Apache Cassandra, 2. Data Model, 3. The storage engine
    4. Outline: 1. What is Apache Cassandra, 2. Data Model, 3. The storage engine
    5. about:project
        • Distributed data store aimed at big data.
        • Apache project since 2010.
        • Version 1.0 released last October.
        • Proven in production (Netflix, Twitter, Reddit, Cisco, ...). Largest known cluster has over 300 TB on over 400 machines.
    6. Apache Cassandra
    7. Apache Cassandra. A database:
    8. Apache Cassandra. A database:
        • distributed / decentralized
    9. Apache Cassandra. A database:
        • distributed / decentralized
        • replicated & durable
    10. Apache Cassandra. A database:
        • distributed / decentralized
        • replicated & durable
        • scalable / elastic
    11. Apache Cassandra. A database:
        • distributed / decentralized
        • replicated & durable
        • scalable / elastic
    12. Apache Cassandra. A database:
        • distributed / decentralized
        • replicated & durable
        • scalable / elastic
        • fault-tolerant / no SPOF
    13. Apache Cassandra. A database:
        • distributed / decentralized
        • replicated & durable
        • scalable / elastic
        • fault-tolerant / no SPOF
        • highly available
    14. Apache Cassandra. A database:
        • distributed / decentralized
        • replicated & durable
        • scalable / elastic
        • fault-tolerant / no SPOF
        • highly available
    15. Apache Cassandra. A database:
        • distributed / decentralized
        • replicated & durable
        • scalable / elastic
        • fault-tolerant / no SPOF
        • highly available
        • data center aware (diagram: US, Europe)
    16. Outline: 1. What is Apache Cassandra, 2. Data Model, 3. The storage engine
    17. Data Model
        • Not SQL (no transactions, no joins), but more than key/value.
        • Inspired by Google BigTable.
        • Based on column families.
    18. Ex: user profiles. “For each user, holds profile info.” Column family: Users
        • Row 50e8-e29b: birth_year = 1994, fname = Justin, lname = Bieber
    19. Ex: user profiles. “For each user, holds profile info.” Column family: Users
        • Row 50e8-e29b: birth_year = 1994, fname = Justin, lname = Bieber
        • Row 2ab1-f1b7: birth_year = 1978, email = a@kutcher.com, fname = Ashton, lname = Kutcher
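One way to picture the Users column family from the two slides above is as a map from row key to a map of column name to value, where rows need not share the same columns. The sketch below is plain Python for illustration only, not Cassandra's API; `users` and `get_columns` are hypothetical names.

```python
# Illustrative only: a column family modelled as row key -> {column name: value}.
# Rows in the same column family may hold different sets of columns.
users = {
    "50e8-e29b": {"birth_year": 1994, "fname": "Justin", "lname": "Bieber"},
    "2ab1-f1b7": {
        "birth_year": 1978,
        "email": "a@kutcher.com",
        "fname": "Ashton",
        "lname": "Kutcher",
    },
}


def get_columns(column_family, row_key):
    """Return the (name, value) pairs of one row, sorted by column name."""
    return sorted(column_family.get(row_key, {}).items())


for key in users:
    print(key, get_columns(users, key))
```

Only the second row carries an `email` column, which a fixed relational schema would not allow without a nullable column.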
    20. Ex: user’s Tweets. “For each user, tweets he has made.” Column family: Timeline
        • Row 50e8-e29b: (no columns yet)
    21. Ex: user’s Tweets. “For each user, tweets he has made.” Column family: Timeline, row 50e8-e29b
        • Column 0: @LiveLoveKary glad you had a good birthday #muchlove
    22. Ex: user’s Tweets. “For each user, tweets he has made.” Column family: Timeline, row 50e8-e29b
        • Column 1: @NickDeMoura happy bday my dude.
        • Column 0: @LiveLoveKary glad you had a good birthday #muchlove
    23. Ex: user’s Tweets. “For each user, tweets he has made.” Column family: Timeline, row 50e8-e29b
        • Column 2: @MickyArison @miamiHEAT thanks for the gam tonight
        • Column 1: @NickDeMoura happy bday my dude.
        • Column 0: @LiveLoveKary glad you had a good birthday #muchlove
    24. Ex: user’s Tweets. “For each user, tweets he has made.” Column family: Timeline, row 50e8-e29b
        • Column 3: still a little tired. back in the studio today with Timbaland
        • Column 2: @MickyArison @miamiHEAT thanks for the gam tonight
        • Column 1: @NickDeMoura happy bday my dude.
        • Column 0: @LiveLoveKary glad you had a good birthday #muchlove
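The Timeline slides show one row per user that gains a column per tweet, with the newest column drawn on top. A rough Python sketch of such a wide row follows, assuming columns are kept sorted by name and read back with a reversed slice. `WideRow`, `insert` and `slice` are hypothetical names for illustration, not Cassandra's API, and the tweet texts are shortened.

```python
import bisect


class WideRow:
    """One row whose columns are kept sorted by column name."""

    def __init__(self):
        self._names = []    # sorted column names
        self._values = {}   # column name -> value

    def insert(self, name, value):
        # Keep the name list sorted; re-inserting a name overwrites its value.
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = value

    def slice(self, count, reverse=False):
        """Return up to `count` columns, in ascending or descending name order."""
        names = self._names[::-1] if reverse else self._names
        return [(name, self._values[name]) for name in names[:count]]


timeline = {"50e8-e29b": WideRow()}
row = timeline["50e8-e29b"]
row.insert(0, "@LiveLoveKary glad you had a good birthday")
row.insert(1, "@NickDeMoura happy bday my dude.")
row.insert(2, "@MickyArison @miamiHEAT thanks")
row.insert(3, "still a little tired. back in the studio today")

latest_two = row.slice(2, reverse=True)  # the two most recent tweets
```

Because the columns of a row stay sorted, "the latest N tweets" is a contiguous range of the row rather than a scan.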
    25. There’s more
        • Secondary indexes
        • Distributed counters
        • Composite columns
    26. Outline: 1. What is Apache Cassandra, 2. Data Model, 3. The storage engine
    27. Goal
        • Writes are harder than reads to scale
        • Spinning disks aren’t good with random I/O
        • Goal: minimize random I/O
    28. A write’s journey: write(k1, c1:v1)
        • Memory: Memtable
        • Hard drive: Commit log
    29. A write’s journey: write(k1, c1:v1)
        • Memory, Memtable: k1 c1:v1
        • Hard drive, Commit log: k1 c1:v1
    30. A write’s journey: ack
        • Memory, Memtable: k1 c1:v1
        • Hard drive, Commit log: k1 c1:v1
    31. A write’s journey: write(k2, c1:v1 c2:v2)
        • Memory, Memtable: k1 c1:v1; k2 c1:v1 c2:v2
        • Hard drive, Commit log: k1 c1:v1; k2 c1:v1 c2:v2
    32. A write’s journey: write(k1, c1:v4 c3:v3 c2:v2)
        • Memory, Memtable: k1 c1:v4 c2:v2 c3:v3; k2 c1:v1 c2:v2
        • Hard drive, Commit log: k1 c1:v1; k2 c1:v1 c2:v2; k1 c1:v4 c3:v3 c2:v2
    33. A write’s journey: flush
        • Memory: the Memtable is flushed
        • Hard drive, Commit log: cleanup
        • Hard drive, SSTable (with index): k1 c1:v4 c2:v2 c3:v3; k2 c1:v1 c2:v2
    34. A write’s journey: more updates
        • Memory, Memtable: k1 c1:v5 c4:v4; k2 c1:v2 c3:v3
        • Hard drive, Commit log: k2 c1:v2 c3:v3; k1 c1:v5 c4:v4
        • Hard drive, SSTable (with index): k1 c1:v4 c2:v2 c3:v3; k2 c1:v1 c2:v2
    35. A write’s journey: flush
        • Hard drive, SSTable (with index): k1 c1:v4 c2:v2 c3:v3; k2 c1:v1 c2:v2
        • Hard drive, SSTable (with index): k1 c1:v5 c4:v4; k2 c1:v2 c3:v3
    36. Write properties
        • No reads or seeks
        • Only sequential I/O
        • Immutable SSTables: easy snapshots
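The write path walked through above (append to the commit log, update the memtable, acknowledge, later flush to an SSTable and clean up the log) can be sketched in a few lines of Python. This is a toy model under stated assumptions: in-memory lists stand in for files on disk, the flush is triggered by hand, and conflicts are resolved by arrival order rather than by timestamps. `StorageEngineSketch` and its methods are hypothetical names, not Cassandra code.

```python
class StorageEngineSketch:
    """Toy model of the write path; in-memory lists stand in for files on disk."""

    def __init__(self):
        self.commit_log = []   # stand-in for the append-only commit log on disk
        self.memtable = {}     # row key -> {column name: value}, held in memory
        self.sstables = []     # flushed tables, oldest first; never modified

    def write(self, key, columns):
        # 1. Append to the commit log (sequential I/O, provides durability).
        self.commit_log.append((key, dict(columns)))
        # 2. Update the memtable; nothing is read from disk.
        self.memtable.setdefault(key, {}).update(columns)
        return "ack"

    def flush(self):
        # Write the memtable out as one immutable table sorted by key and column.
        sstable = tuple(
            (key, tuple(sorted(cols.items())))
            for key, cols in sorted(self.memtable.items())
        )
        self.sstables.append(sstable)
        # The flushed data no longer needs its commit log entries.
        self.memtable = {}
        self.commit_log.clear()


# Replay the three writes from the slides, then flush.
engine = StorageEngineSketch()
engine.write("k1", {"c1": "v1"})
engine.write("k2", {"c1": "v1", "c2": "v2"})
engine.write("k1", {"c1": "v4", "c3": "v3", "c2": "v2"})
engine.flush()
```

After the flush, `engine.sstables[0]` holds the same two rows as the first SSTable on the slides, and neither `write` nor `flush` ever reads or rewrites existing data.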
    37. A read’s journey: read(k1)
        • Memory: ?
        • Hard drive, SSTable (with index): k1 c1:v4 c2:v2 c3:v3; k2 c1:v1 c2:v2
        • Hard drive, SSTable (with index): k1 c1:v5 c4:v4; k2 c1:v2 c3:v3
    38. A read’s journey: merge
        • Result: k1 c1:v5 c2:v2 c3:v3 c4:v4
        • Hard drive, SSTable (with index): k1 c1:v4 c2:v2 c3:v3; k2 c1:v1 c2:v2
        • Hard drive, SSTable (with index): k1 c1:v5 c4:v4; k2 c1:v2 c3:v3
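The merge step of the read can be sketched as follows. The sketch assumes that a newer SSTable always wins over an older one and that the memtable wins over both; a real implementation would more plausibly compare per-column timestamps. The function name `read` and the dict-based SSTables are illustrative choices, not Cassandra's API.

```python
def read(key, memtable, sstables):
    """Merge every version of a row.

    `sstables` is a list of {row key: {column: value}} dicts, oldest first.
    Later sources overwrite earlier ones column by column.
    """
    merged = {}
    for sstable in sstables:                 # oldest to newest
        merged.update(sstable.get(key, {}))
    merged.update(memtable.get(key, {}))     # the memtable holds the latest writes
    return merged


# State from the slides: two SSTables on disk, nothing for k1 in memory.
older = {
    "k1": {"c1": "v4", "c2": "v2", "c3": "v3"},
    "k2": {"c1": "v1", "c2": "v2"},
}
newer = {
    "k1": {"c1": "v5", "c4": "v4"},
    "k2": {"c1": "v2", "c3": "v3"},
}
result = read("k1", {}, [older, newer])
```

Here `result` combines `c1` from the newer SSTable with `c2` and `c3` from the older one, which is why a read may have to touch several SSTables.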
    39. Compaction
        • Goal: keep the number of SSTables low
        • Merge sort against multiple SSTables
        • Sequential I/O
    40. Compaction (same bullets as slide 39), with two input SSTables:
        • SSTable (with index): k1 c1:v4 c2:v2 c3:v3; k2 c1:v1 c2:v2
        • SSTable (with index): k1 c1:v5 c4:v4; k2 c1:v2 c3:v3
    41. Compaction (same bullets as slide 39), inputs and output:
        • Input SSTable (with index): k1 c1:v4 c2:v2 c3:v3; k2 c1:v1 c2:v2
        • Input SSTable (with index): k1 c1:v5 c4:v4; k2 c1:v2 c3:v3
        • Compacted SSTable (with index): k1 c1:v5 c2:v2 c3:v3 c4:v4; k2 c1:v2 c2:v2 c3:v3
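Because each SSTable is already sorted by row key, compaction can stream through its inputs with a merge sort and write one sorted output. A minimal Python sketch using `heapq.merge` follows; it assumes newer SSTables overwrite older ones column by column (no timestamps, no deletions), and `compact` is a hypothetical helper name.

```python
import heapq
from itertools import groupby


def compact(sstables):
    """Merge sorted SSTables into one.

    `sstables` is a list, oldest first, of lists of (row key, columns dict)
    pairs, each list sorted by row key.
    """
    # Tag each entry with its SSTable's position so ties on the row key
    # come out oldest first.
    tagged = (
        [(key, generation, columns) for key, columns in sstable]
        for generation, sstable in enumerate(sstables)
    )
    merged = heapq.merge(*tagged, key=lambda entry: (entry[0], entry[1]))

    result = []
    for key, entries in groupby(merged, key=lambda entry: entry[0]):
        columns = {}
        for _, _, cols in entries:   # ascending generation: newer overwrites older
            columns.update(cols)
        result.append((key, dict(sorted(columns.items()))))
    return result


older = [("k1", {"c1": "v4", "c2": "v2", "c3": "v3"}), ("k2", {"c1": "v1", "c2": "v2"})]
newer = [("k1", {"c1": "v5", "c4": "v4"}), ("k2", {"c1": "v2", "c3": "v3"})]
compacted = compact([older, newer])
```

`heapq.merge` consumes its inputs lazily and in order, which mirrors the sequential I/O pattern named on the slide.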
    42. Optimizations
        • Row cache
        • Bloom filters: eliminate whole SSTables
        • Key cache
        • Row and column indexes
        • ...
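A Bloom filter lets a read skip an SSTable that certainly does not contain the requested key, at the cost of occasional false positives. The sketch below shows the general technique only; the bit-array size, the number of hashes and the salted SHA-256 hashing are arbitrary assumptions, not Cassandra's actual parameters or hash function.

```python
import hashlib


class BloomFilter:
    """Set membership with possible false positives but no false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, key):
        # Derive several bit positions from one key by salting the hash input.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


# One filter per SSTable, built from the keys it holds.
sstable_filter = BloomFilter()
for row_key in ("k1", "k2"):
    sstable_filter.add(row_key)
```

A read for a key would consult `might_contain` first and open the SSTable only when it returns `True`.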
    43. Other features
        • Compression
        • Checksums
        • Time to live
    44. Questions?
    45. Links
        • Cassandra 1.1 scheduled in a couple of weeks
        • http://cassandra.apache.org/
        • http://wiki.apache.org/cassandra/
        • http://www.datastax.com/docs/1.0