33rd degree conference

Published in: Technology

  1. 1. The Apache Cassandra storage engine. Sylvain Lebresne
  2. 2. About me • Sylvain Lebresne • sylvain@datastax.com • @pcmanus • Work at
  3. 3. 1. What is Apache Cassandra 2. Data Model 3. The storage engine
  4. 4. 1. What is Apache Cassandra 2. Data Model 3. The storage engine
  5. 5. about:project • Distributed data store aimed at big data. • Apache project since 2010. • Version 1.0 released last October. • Proven in production (Netflix, Twitter, Reddit, Cisco, ...). The largest known cluster holds over 300 TB on more than 400 machines.
  15. 15. Apache Cassandra. A database: • distributed / decentralized • replicated & durable • scalable / elastic • fault-tolerant / no SPOF • highly available • data center aware (diagram: US and Europe data centers)
  16. 16. 1. What is Apache Cassandra 2. Data Model 3. The storage engine
  17. 17. Data Model • Not SQL (no transactions, no joins) but more than key/value. • Inspired by Google BigTable. • Based on column families.
  19. 19. Ex: user profiles. “For each user, holds profile infos.” Column family Users: row 50e8-e29b → birth_year: 1994, fname: Justin, lname: Bieber; row 2ab1-f1b7 → birth_year: 1978, email: a@kutcher.com, fname: Ashton, lname: Kutcher
  24. 24. Ex: user’s Tweets. “For each user, tweets he has made.” Column family Timeline: row 50e8-e29b → 0: “@LiveLoveKary glad you had a good birthday #muchlove”; 1: “@NickDeMoura happy bday my dude.”; 2: “@MickyArison @miamiHEAT thanks for the game tonight”; 3: “still a little tired. back in the studio today with Timbaland”
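
The two examples above can be pictured as nested maps: row key, then column name, then value. The sketch below is only an illustration of that shape in plain Python dicts; it is not Cassandra's API, and `get_slice` is a hypothetical helper showing why ordered column names (0, 1, 2, ...) make a timeline easy to range over.

```python
# Illustrative model only: a column family as {row key: {column name: value}}.
# Plain dicts stand in for Cassandra's sorted structures.

users = {
    "50e8-e29b": {"birth_year": 1994, "fname": "Justin", "lname": "Bieber"},
    # Rows in the same column family need not share the same columns.
    "2ab1-f1b7": {"birth_year": 1978, "email": "a@kutcher.com",
                  "fname": "Ashton", "lname": "Kutcher"},
}

timeline = {
    "50e8-e29b": {
        0: "@LiveLoveKary glad you had a good birthday #muchlove",
        1: "@NickDeMoura happy bday my dude.",
        2: "@MickyArison @miamiHEAT thanks for the game tonight",
        3: "still a little tired. back in the studio today with Timbaland",
    },
}


def get_slice(column_family, row_key, start, end):
    """Hypothetical helper: columns of one row with start <= name <= end, in name order."""
    row = column_family.get(row_key, {})
    return [(name, row[name]) for name in sorted(row) if start <= name <= end]
```

For instance, `get_slice(timeline, "50e8-e29b", 1, 2)` would return the second and third tweets in column order.
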
  25. 25. There’s more • Secondary indexes • Distributed counters • Composite columns
  26. 26. 1. What is Apache Cassandra 2. Data Model 3. The storage engine
  27. 27. Goal • Writes are harder than reads to scale • Spinning disks aren’t good with random I/O • Goal: minimize random I/O
  28. 28. A write’s journey. write(k1, c1:v1) arrives. Memory: Memtable (empty). Hard drive: Commit log (empty).
  29. 29. A write’s journey. write(k1, c1:v1). Memory: Memtable [k1 → c1:v1]. Hard drive: Commit log [k1 c1:v1].
  30. 30. A write’s journey. ack. Memory: Memtable [k1 → c1:v1]. Hard drive: Commit log [k1 c1:v1].
  31. 31. A write’s journey. write(k2, c1:v1 c2:v2). Memory: Memtable [k1 → c1:v1; k2 → c1:v1 c2:v2]. Hard drive: Commit log [k1 c1:v1; k2 c1:v1 c2:v2].
  32. 32. A write’s journey. write(k1, c1:v4 c3:v3 c2:v2). Memory: Memtable [k1 → c1:v4 c2:v2 c3:v3; k2 → c1:v1 c2:v2]. Hard drive: Commit log [k1 c1:v1; k2 c1:v1 c2:v2; k1 c1:v4 c3:v3 c2:v2].
  33. 33. A write’s journey. flush. Memory: Memtable flushed. Hard drive: Commit log cleanup; SSTable with index [k1 → c1:v4 c2:v2 c3:v3; k2 → c1:v1 c2:v2].
  34. 34. A write’s journey. more updates. Memory: Memtable [k1 → c1:v5 c4:v4; k2 → c1:v2 c3:v3]. Hard drive: Commit log [k2 c1:v2 c3:v3; k1 c1:v5 c4:v4]; SSTable with index [k1 → c1:v4 c2:v2 c3:v3; k2 → c1:v1 c2:v2].
  35. 35. A write’s journey. flush. Memory: Memtable flushed. Hard drive: SSTable 1 with index [k1 → c1:v4 c2:v2 c3:v3; k2 → c1:v1 c2:v2]; SSTable 2 with index [k1 → c1:v5 c4:v4; k2 → c1:v2 c3:v3].
  36. 36. Writes properties • No reads or seeks • Only sequential I/O • Immutable SSTables: easy snapshots
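
The write path walked through above (append to the commit log, update the memtable, later flush to an immutable SSTable) can be sketched as a toy. This is a minimal sketch under stated assumptions, not Cassandra's implementation: the `Store` class, the JSON line format, and the file names are all hypothetical, and the flush is triggered by hand rather than by memory pressure.

```python
import json
import os


class Store:
    """Toy write path: commit log append, memtable update, explicit flush."""

    def __init__(self, directory):
        self.directory = directory
        self.memtable = {}   # row key -> {column name: value}
        self.sstables = []   # SSTable file paths, oldest first
        self.log_path = os.path.join(directory, "commit.log")
        self.log = open(self.log_path, "a")

    def write(self, key, columns):
        # 1. Sequential append to the commit log, for durability.
        self.log.write(json.dumps([key, columns]) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Update the memtable in memory; nothing is read from disk.
        self.memtable.setdefault(key, {}).update(columns)
        # At this point the write can be acknowledged.

    def flush(self):
        # Write the memtable out in one sequential pass, sorted by row key
        # and column name. The file is never modified afterwards.
        path = os.path.join(self.directory, f"sstable-{len(self.sstables)}.json")
        with open(path, "w") as f:
            for key in sorted(self.memtable):
                row = dict(sorted(self.memtable[key].items()))
                f.write(json.dumps([key, row]) + "\n")
        self.sstables.append(path)
        self.memtable = {}
        # Commit log cleanup: the flushed writes no longer need replaying.
        self.log.close()
        self.log = open(self.log_path, "w")

    def close(self):
        self.log.close()
```

Replaying slides 28 to 33 would mean three `write` calls (k1, k2, then k1 again) followed by one `flush`. A real memtable also keeps rows sorted as they arrive; sorting at flush time here is a simplification.
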
  37. 37. A read’s journey. read(k1). Memory: ? Hard drive: SSTable 1 with index [k1 → c1:v4 c2:v2 c3:v3; k2 → c1:v1 c2:v2]; SSTable 2 with index [k1 → c1:v5 c4:v4; k2 → c1:v2 c3:v3].
  38. 38. A read’s journey. Result: k1 → c1:v5 c2:v2 c3:v3 c4:v4. Memory: merge. Hard drive: SSTable 1 with index [k1 → c1:v4 c2:v2 c3:v3; k2 → c1:v1 c2:v2]; SSTable 2 with index [k1 → c1:v5 c4:v4; k2 → c1:v2 c3:v3].
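
The merge step of a read can be sketched as follows. This is a simplified illustration: the `read` function is hypothetical, SSTables are modelled as in-memory dicts, and "newer source wins" stands in for the per-column timestamps Cassandra actually compares.

```python
def read(key, memtable, sstables):
    """Merge one row across SSTables (oldest first) and then the memtable."""
    merged = {}
    for source in list(sstables) + [memtable]:
        # Later sources are newer, so their columns overwrite older ones.
        merged.update(source.get(key, {}))
    return dict(sorted(merged.items()))


# The two SSTables from the slides, with an empty memtable.
sstable_1 = {"k1": {"c1": "v4", "c2": "v2", "c3": "v3"},
             "k2": {"c1": "v1", "c2": "v2"}}
sstable_2 = {"k1": {"c1": "v5", "c4": "v4"},
             "k2": {"c1": "v2", "c3": "v3"}}

print(read("k1", {}, [sstable_1, sstable_2]))
# {'c1': 'v5', 'c2': 'v2', 'c3': 'v3', 'c4': 'v4'}
```

The more SSTables a row is spread over, the more sources a read has to consult.
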
  41. 41. Compaction • Goal: keep the number of SSTables low • Merge sort against multiple SSTables • Sequential I/O. Input: SSTable 1 with index [k1 → c1:v4 c2:v2 c3:v3; k2 → c1:v1 c2:v2]; SSTable 2 with index [k1 → c1:v5 c4:v4; k2 → c1:v2 c3:v3]. Output: one SSTable with index [k1 → c1:v5 c2:v2 c3:v3 c4:v4; k2 → c1:v2 c2:v2 c3:v3].
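
Because every SSTable is already sorted by row key, compaction can be a single merge pass over its inputs. The sketch below illustrates that idea with `heapq.merge`; the `compact` function and the list-of-tuples representation are assumptions made for the example, and "newer SSTable wins" again replaces timestamp comparison.

```python
import heapq
from itertools import groupby


def compact(sstables):
    """Merge sorted SSTables (oldest first) into one sorted SSTable.

    Each SSTable is a list of (row key, columns) pairs sorted by row key.
    """
    # Tag every row with the age of its SSTable so that, for equal keys,
    # older rows come out of the merge before newer ones.
    streams = [
        [(key, age, columns) for key, columns in table]
        for age, table in enumerate(sstables)
    ]
    merged = heapq.merge(*streams, key=lambda row: (row[0], row[1]))

    output = []
    for key, rows in groupby(merged, key=lambda row: row[0]):
        combined = {}
        for _, _, columns in rows:
            combined.update(columns)  # newer columns overwrite older ones
        output.append((key, dict(sorted(combined.items()))))
    return output


sstable_1 = [("k1", {"c1": "v4", "c2": "v2", "c3": "v3"}),
             ("k2", {"c1": "v1", "c2": "v2"})]
sstable_2 = [("k1", {"c1": "v5", "c4": "v4"}),
             ("k2", {"c1": "v2", "c3": "v3"})]

for key, columns in compact([sstable_1, sstable_2]):
    print(key, columns)
# k1 {'c1': 'v5', 'c2': 'v2', 'c3': 'v3', 'c4': 'v4'}
# k2 {'c1': 'v2', 'c2': 'v2', 'c3': 'v3'}
```

The inputs are only read front to back and the output is only appended to, which is what keeps the I/O sequential.
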
  42. 42. Optimizations • Row cache • Bloom filters: rule out whole SSTables that cannot contain the requested key • Key cache • Row & column indexes • ...
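
To show how a Bloom filter lets a read skip an SSTable entirely, here is a minimal sketch. The class, its default sizes, and the use of SHA-256 are choices made for this example only and are not taken from Cassandra's code.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means the key is definitely absent: skip this SSTable.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))
```

One filter would be kept per SSTable and filled with its row keys at flush or compaction time; a read then touches only the SSTables whose filter answers `True`.
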
  43. 43. Other features • Compression • Checksums • Time to live
  44. 44. Questions?
  45. 45. • Cassandra 1.1 scheduled in a couple of weeks • http://cassandra.apache.org/ • http://wiki.apache.org/cassandra/ • http://www.datastax.com/docs/1.0