Cassandra 101

Introduction to Cassandra (Operational)

Published in: Technology

Cassandra 101

  1. 1. Cassandra 101: Introduction to Apache Cassandra
  2. 2. What is Cassandra? ● A distributed, column-family (wide-column) database ● Data model inspired by Google Bigtable (2006) ● Distribution model inspired by Amazon Dynamo (2007) ● Open-sourced by Facebook in 2008 ● Monolithic kernel written in Java ● Used by Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick and others
  3. 3. Etymology ● In Greek mythology Cassandra (Also known as Alexandra) was the daughter of King Priam and Queen Hecuba of Troy ● Her beauty caused Apollo to grant her the gift of prophecy ● When she did not return his love, Apollo placed a curse on her so that no one would ever believe her predictions
  4. 4. Why Cassandra? ● Minimal administration ● No single point of failure ● Scales horizontally ● Writes are durable ● Optimized for writes ● Consistency is flexible and can be chosen per request ● Schema is flexible and can be updated online ● Handles failure gracefully ● Replication is easy, and is rack and data center (DC) aware
  5. 5. Commercial Support
  6. 6. Data Model: a Column is the basic unit, consisting of a Key, a Value and a Timestamp
  7. 7. Data Model: a Column is the basic unit, consisting of a Key, a Value and a Timestamp
  8. 8. RDBMS vs Cassandra Map<RowKey, SortedMap<ColumnKey, ColumnValue>>
  9. 9. Cassandra is good at reading data from a row in the order it is stored, i.e. by Column Name! Understand the queries your application requires before building the data model
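The nested-map view of the data model, Map<RowKey, SortedMap<ColumnKey, ColumnValue>>, can be sketched in Python. This is only an illustrative model, not Cassandra's storage code: the `Row` class, its method names and the sample keys are hypothetical. It keeps a row's columns sorted by name, which is why a slice over a range of column names is a cheap contiguous read.

```python
import bisect


class Row:
    """Toy row whose columns are kept sorted by column name (hypothetical helper)."""

    def __init__(self):
        self._names = []    # sorted column names
        self._values = {}   # column name -> value

    def put(self, name, value):
        # Insert new names in sorted position; overwrite existing values in place.
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = value

    def slice(self, start, end):
        """Return (name, value) pairs with start <= name <= end, in stored order."""
        lo = bisect.bisect_left(self._names, start)
        hi = bisect.bisect_right(self._names, end)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


# Map<RowKey, SortedMap<ColumnKey, ColumnValue>>
table = {}
row = table.setdefault("sensor-1", Row())
for name, value in [("2013-03", 7), ("2013-01", 5), ("2013-02", 6), ("2013-04", 8)]:
    row.put(name, value)

first_quarter = row.slice("2013-01", "2013-03")
```

Columns come back ordered by name regardless of insertion order, so the column names should be chosen to match the order in which the application reads them.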
  10. 10. Consistent Hashing Load Balancing in a changing world ... ● Evenly map keys to nodes ● Minimize key movement when nodes join or leave
  11. 11. The Partitioner: ● RandomPartitioner transforms Keys to Tokens using MD5 ● In C* 1.2 the default partitioner (Murmur3Partitioner) hashes keys with the Murmur3 algorithm instead
  12. 12. Keys and Tokens? (Diagram: a token ring running from 0 to 99, with ‘fop’ placed at 10 and ‘foo’ at 90.) MD5 hash for ‘fop’ is 89de73aaae8c956fb7c9379be7978e5b; MD5 hash for ‘foo’ is d3b07384d113edec49eaa6238ad5ff00
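A minimal sketch of turning a key into a token, assuming a RandomPartitioner-style scheme in which the MD5 digest is read as a signed 128-bit integer and made non-negative. The function names and the 0..99 toy ring are assumptions for illustration; the toy tokens on the slides (10 and 90) are diagram values and will not match what this code computes. The digest listed for ‘foo’ appears to be that of the key followed by a newline (as `echo foo | md5sum` would produce), so hashing the bare bytes gives a different hex string.

```python
import hashlib


def md5_token(key: bytes) -> int:
    """Approximate a RandomPartitioner-style token: the MD5 digest read as a
    signed 128-bit big-endian integer, then made non-negative."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


def toy_token(key: bytes, ring_size: int = 100) -> int:
    """Shrink the token onto a 0..ring_size-1 ring like the one on the slides."""
    return md5_token(key) % ring_size


foo_hex = hashlib.md5(b"foo").hexdigest()              # digest of the bare key
foo_newline_hex = hashlib.md5(b"foo\n").hexdigest()    # digest of key + newline
foo_token = md5_token(b"foo")
```

The same key always maps to the same token, which is what lets any node work out where a row lives without a central lookup.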
  13. 13. Token Ring. (Diagram: ring running from 0 to 99.) ‘fop’ token: 10; ‘foo’ token: 90
  14. 14. Token Ranges (pre-1.2): Node 1 (token 0) owns range 76-0; Node 2 (token 25) owns range 1-25; Node 3 (token 50) owns range 26-50; Node 4 (token 75) owns range 51-75. ‘foo’ has token 90
  15. 15. Token Ranges with Virtual Nodes in 1.2 (Diagram: Node 1, Node 2, Node 3) ● Easier to enlarge or shrink the cluster ● The cluster can grow in steps of 1 node ● Node recovery is much faster
  16. 16. Replication Strategy: selects Replication Factor number of nodes for a row. (Diagram: Node 1 token 0, range 76-0; Node 2 token 25, range 1-25; Node 3 token 50, range 26-50; Node 4 token 75, range 51-75; ‘foo’ token 90.)
  17. 17. Replication Strategy: SimpleStrategy with RF 3. (Diagram: the same four-node ring; ‘foo’ token 90.)
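The ring walk behind SimpleStrategy can be sketched as follows, using the four-node toy ring from the slides. This is a simplified model under stated assumptions (one token per node, no racks or data centers); `RING` and `replicas` are hypothetical names, not Cassandra APIs.

```python
import bisect

# Toy ring from the slides: node name -> token (pre-1.2, one token per node).
RING = {"Node 1": 0, "Node 2": 25, "Node 3": 50, "Node 4": 75}


def replicas(token, ring=RING, rf=3):
    """Walk the ring clockwise from the token's owner, collecting rf nodes."""
    nodes = sorted(ring, key=ring.get)           # node names ordered by token
    tokens = [ring[n] for n in nodes]
    # The owner is the first node whose token is >= the key's token,
    # wrapping around to the lowest token past the end of the ring.
    start = bisect.bisect_left(tokens, token) % len(nodes)
    count = min(rf, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


foo_replicas = replicas(90)   # 'foo' has token 90 on the slides
```

With token 90 the walk starts at Node 1 (which owns range 76-0) and continues to Node 2 and Node 3, so Node 4 holds no copy of ‘foo’ at RF 3.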
  18. 18. Replication Strategy: NetworkTopologyStrategy uses a Replication Factor per Data Center. (Diagram: two copies of the four-node ring, labelled EAST and WEST; ‘foo’ token 90.)
  19. 19. SimpleSnitch: places all nodes in the same DC and rack (default)
  20. 20. Ec2Snitch/Ec2MultiRegionSnitch: the DC is set to the AWS Region and the rack to the Availability Zone
  21. 21. PropertyFileSnitch: each node's DC and rack are maintained in a property file
  22. 22. GossipingPropertyFileSnitch: uses gossip as the first source of node info and falls back to the property file if it is not available
  23. 23. The Client and the Coordinator (Diagram: Client; Nodes 1-4; ‘foo’ token 90.)
  24. 24. Multi DC Client and Coordinator (Diagram: Client; Nodes 1-4; Node 10 and Node 20; ‘foo’ token 90.)
  25. 25. Gossip: nodes share information with a small number of neighbours, who in turn share it with a small number of other neighbours … ● Used for intra-cluster communication ● Routes client requests ● Detects node failure ● Initial contact peers are called seeds and are listed in the config file.
  26. 26. Cassandra Objects ● CommitLog ● MemTable ● SSTable ● Index ● Bloom Filter
  27. 27. Consistency ● CAP theorem ○ Trade consistency for availability ○ Consistency is a choice * It doesn't matter if you are good at something, as long as you are consistent. (Diagram labels: Partition, Consistency OR Availability.)
  28. 28. Consistency Level ● Specified for each request ● Sets the number of replicas to wait for
      WRITE levels: ZERO = cross fingers; ANY = 1st to respond (including Hinted Handoff); ONE, TWO, THREE = nth replica to respond; QUORUM = N/2+1 replicas; ALL = all replicas
      READ levels: ZERO = N/A; ANY = N/A; ONE, TWO, THREE = nth replica to respond; QUORUM* = N/2+1 replicas; ALL = all replicas
      * QUORUM, LOCAL_QUORUM, EACH_QUORUM
  29. 29. Write ‘foo’ at Quorum with Hinted Handoff: Node 3 is down; Node 4 holds ‘foo’ for Node 3. (Diagram: Client; Nodes 1-4; ‘foo’ token 90.)
  30. 30. Read ‘foo’ at Quorum: Node 3 is down; Node 4 holds ‘foo’ for Node 3. (Diagram: Client; Nodes 1-4; ‘foo’ token 90.)
  31. 31. Column Timestamps: used to resolve differences ● Stored for each column value ● 64-bit integers
      Column    | Node 1                    | Node 2                    | Node 3
      ----------+---------------------------+---------------------------+---------------------------
      Vegetable | ‘cucumber’ (timestamp 10) | ‘cucumber’ (timestamp 10) | <missing>
      Fruit     | ‘Apple’ (timestamp 10)    | ‘banana’ (timestamp 15)   | ‘Apple’ (timestamp 10)
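How timestamps resolve differences between replicas can be sketched as a "highest timestamp wins" rule. This is a simplified illustration, assuming each replica answers with a (value, timestamp) pair or nothing at all; the `reconcile` function is hypothetical and does not model how ties between equal timestamps are broken.

```python
def reconcile(replica_values):
    """Pick the value with the highest timestamp among replica responses.

    replica_values: list of (value, timestamp) tuples, or None where a
    replica has no copy of the column.
    """
    present = [v for v in replica_values if v is not None]
    if not present:
        return None
    return max(present, key=lambda pair: pair[1])


# The two columns from the slide, as seen by Node 1, Node 2 and Node 3.
vegetable = reconcile([("cucumber", 10), ("cucumber", 10), None])
fruit = reconcile([("Apple", 10), ("banana", 15), ("Apple", 10)])
```

For the Fruit column the single replica holding ‘banana’ at timestamp 15 outvotes the two older ‘Apple’ copies, because recency rather than majority decides the winner.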
  32. 32. Strong Consistency: W + R > N (#Write Nodes + #Read Nodes > Replication Factor) ● QUORUM Read + QUORUM Write ● ALL Read + ONE Write ● ONE Read + ALL Write
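The W + R > N rule and the N/2+1 quorum size are simple arithmetic, sketched here for a replication factor of 3. The function names are illustrative, and the division is assumed to be integer division.

```python
def quorum(replication_factor: int) -> int:
    """QUORUM as on the slides: N/2 + 1, using integer division."""
    return replication_factor // 2 + 1


def is_strongly_consistent(write_nodes: int, read_nodes: int, replication_factor: int) -> bool:
    """True when every read set must overlap every write set (W + R > N)."""
    return write_nodes + read_nodes > replication_factor


rf = 3
q = quorum(rf)   # 2 replicas for RF 3
checks = {
    "QUORUM write + QUORUM read": is_strongly_consistent(q, q, rf),
    "ONE write + ALL read": is_strongly_consistent(1, rf, rf),
    "ALL write + ONE read": is_strongly_consistent(rf, 1, rf),
    "ONE write + ONE read": is_strongly_consistent(1, 1, rf),
}
```

Only the last combination fails the test: with one replica written and one read out of three, the read may land on a replica that has not yet seen the write.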
  33. 33. Achieving Consistency ● Consistency Level ● Hinted Handoff ● Read Repair ● Anti Entropy (User triggered Repairs)
  34. 34. Write Path ● Append to the Commit Log file ● Merge columns into the Memtable ● Asynchronously flush the Memtable to a new file (never update existing files) ● Data is stored in immutable files called SSTables (Sorted String Tables)
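The write path steps can be modelled with a toy in-memory node. This is a sketch under loud assumptions: `ToyNode` and its flush threshold are invented for illustration, the flush here is synchronous where the real one is asynchronous, and nothing touches disk.

```python
class ToyNode:
    """Toy write path: commit log append, memtable merge, immutable flush."""

    def __init__(self, flush_threshold=2):
        self.commit_log = []    # append-only record of every write
        self.memtable = {}      # column name -> (value, timestamp)
        self.sstables = []      # list of immutable, name-sorted tuples
        self.flush_threshold = flush_threshold   # hypothetical size trigger

    def write(self, name, value, timestamp):
        self.commit_log.append((name, value, timestamp))    # 1. durability first
        current = self.memtable.get(name)
        if current is None or timestamp >= current[1]:       # 2. merge by timestamp
            self.memtable[name] = (value, timestamp)
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # 3. Write a new sorted, immutable table; existing ones are never modified.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}


node = ToyNode()
node.write("fruit", "apple", 10)
node.write("vegetable", "cucumber", 15)   # second column triggers a flush
node.write("vegetable", "pepper", 20)     # lands in a fresh memtable
```

After these writes the older ‘cucumber’ value still sits untouched in the first flushed table while the newer ‘pepper’ value lives in the memtable, which is why reads have to consult several places and compare timestamps.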
  35. 35. SSTables Files *-Data.db *-Index.db *-Filter.db (And others)
  36. 36. Read Path (Diagram.) Memory: a Bloom Filter (cache) and an Index/Key Cache for each SSTable. Disk: SSTable-1-Data.db holding foo: fruit (ts:10) apple, vegetable (ts:15) cucumber, alongside SSTable-1-Index.db and its Bloom Filter; SSTable-2-Data.db holding foo: fruit (ts:10) apple, vegetable (ts:10) Pepper, alongside SSTable-2-Index.db and its Bloom Filter
  37. 37. Compactions: a compaction merges the truth from multiple SSTables into one SSTable with the same truth (can be run manually and also runs as a continuous background process)
      Column    | SSTable 1                 | SSTable 2                  | New
      ----------+---------------------------+----------------------------+---------------------------
      Vegetable | ‘cucumber’ (timestamp 10) | ‘cucumber’ (timestamp 10)  | ‘cucumber’ (timestamp 10)
      Fruit     | ‘Apple’ (timestamp 10)    | <tombstone> (timestamp 15) | <tombstone> (timestamp 15)
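The merge shown in the compaction table can be sketched as a newest-entry-wins merge over column maps. This is a simplified illustration: `TOMBSTONE` and `compact` are hypothetical names, and the tombstone is carried into the merged table as on the slide, whereas a real cluster eventually purges tombstones once a grace period (`gc_grace_seconds`) has passed.

```python
TOMBSTONE = object()   # sentinel marking a deleted column (hypothetical)


def compact(*sstables):
    """Merge column -> (value, timestamp) maps, keeping the newest entry."""
    merged = {}
    for table in sstables:
        for name, (value, ts) in table.items():
            if name not in merged or ts > merged[name][1]:
                merged[name] = (value, ts)
    # Emit columns in sorted order, as an SSTable would store them.
    return dict(sorted(merged.items()))


sstable_1 = {"Vegetable": ("cucumber", 10), "Fruit": ("Apple", 10)}
sstable_2 = {"Vegetable": ("cucumber", 10), "Fruit": (TOMBSTONE, 15)}
new_sstable = compact(sstable_1, sstable_2)
```

The delete at timestamp 15 shadows the older ‘Apple’ value, so the merged table answers reads exactly as the two inputs did together while holding one entry per column.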
  38. 38. Writes and Reads
  39. 39. Managing Cassandra ● Single configuration file: /etc/cassandra/cassandra.yaml ● Single control command: /usr/bin/nodetool ● Monitoring is done with DataStax OpsCenter
  40. 40. Troubleshooting Cassandra Always inspect these files: ● /var/log/cassandra/cassandra.log (Startup) ● /var/log/cassandra/system.log (Normal work)
  41. 41. Backup Use Cassandra snapshots... And God said to Noah, Noah make me a backup ... 'cause I shall format
  42. 42. Client (API) Choices ● Thrift, the original and still fully supported API: ○ Java: Thrift, Hector, Astyanax, DataStax Driver, Kundera… ○ Python: Pycassa, Telephus, … ○ Ruby: Fauna ○ PHP: PHP Client Library ○ C# ○ Node.js ○ Go ○ Simba ODBC ○ C++: LibQtCassandra ○ ORM ○ …. ● CQL3: a table-oriented, schema-driven data model, similar to SQL
  43. 43. CQL3 Create KeySpace ● Using CQL3 via the cqlsh command-line tool ($CASSANDRA_HOME/bin/cqlsh) ● Create a new Keyspace with a Replication Factor of 3 and NetworkTopologyStrategy:
      CREATE KEYSPACE kenshoo_cass_fans WITH replication = {'class': 'NetworkTopologyStrategy', 'us_east_dc': 3};
  44. 44. CQL3 Working with Tables ● CQL3 Example ● A table is a sparse collection of well-known, ordered columns
      CREATE TABLE User ( user_name text, password text, real_name text, PRIMARY KEY (user_name) );
      INSERT INTO User (user_name, password, real_name) VALUES ('nader', 'sekr8t', 'MR NADER');
      SELECT * FROM User WHERE user_name = 'nader';

       user_name | password | real_name
      -----------+----------+-----------
           nader |   sekr8t |  MR NADER
