Published in: Software, Technology

  1. Cassandra. Carbo, 2012-07-18
  2. Why talk about Cassandra
     ● Audience Platform
     ● We use Hadoop and HBase for batch processing.
     ● We need a real-time database for online random access.
       – HBase
       – MySQL
       – MongoDB
       – Cassandra
  3. Our needs
     ● Provide an interface that supports random access.
     ● Periodically synchronize data from HBase.
     ● Eliminate single points of failure.
     ● Store data in multiple data centers for data locality and disaster recovery.
  4. Our selections
     ● Hadoop and HBase for processing prediction data in batch.
     ● Cassandra for real-time queries.
  5. Apache Cassandra
     ● Inspired by Amazon Dynamo and Google BigTable
     ● Originally developed by Facebook
     ● Apache top-level project
     ● Used by Twitter, Netflix, Rackspace, ...
  6. What defines Cassandra
     ● High availability
     ● Eventually consistent
     ● Decentralized cluster
     ● BigTable-like data model
     ● High write throughput
  7. Why we chose Cassandra
     ● Write performance is excellent; writes are faster than reads.
     ● The architecture is decentralized, with no single point of failure.
     ● Managing a Cassandra cluster is simple.
     ● Replication configuration is flexible and supports clusters that span multiple data centers.
     ● Occasional inconsistency can be tolerated.
  8. Cassandra vs other DBs

                     Cassandra       MongoDB                               HBase           MySQL
     Data model      BigTable-like   Document                              BigTable-like   Table and row
     CAP             AP              CP                                    CP              CA
     Cluster         P2P             M-S replication & built-in sharding   Hadoop          M-S replication
     Optimized for   Write           Read                                  Batch job       Read
     Query           By key or scan  Multi-indexed                         By key or scan  Multi-indexed
     Protocol        Thrift          Client                                REST or Thrift  Client
  9. CAP theorem
  10. Data model
      ● Cassandra has a BigTable-like data model, but it is not identical.
      [Diagram: a Keyspace contains Column Families and a Super Column Family. In a column family, each row holds columns, each with a value. In a super column family, each row holds super columns, each of which holds columns with values.]
  11. Comparison with MySQL

      Cassandra       MySQL
      Keyspace        Database (or schema)
      Column Family   Table
      Row             Row
      Column          Field
      Value           Value
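
One way to picture the keyspace / column family / row / column / value hierarchy is as nested maps. The sketch below uses plain Python dictionaries only; the column family, row keys, and column names are invented for illustration, and `get_column` is a hypothetical helper, not a Cassandra client call.

```python
# Illustrative only: nested dicts standing in for the data model hierarchy.
# "Users", "user:1", "name", "city" are hypothetical names.
keyspace = {
    "Users": {  # column family (roughly a MySQL table)
        "user:1": {"name": "alice", "city": "Taipei"},  # row key -> columns
        "user:2": {"name": "bob"},  # this row simply has fewer columns
    },
}


def get_column(ks, column_family, row_key, column):
    """Look up one value by column family, row key, and column name."""
    return ks[column_family][row_key].get(column)


print(get_column(keyspace, "Users", "user:1", "city"))  # Taipei
```
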
  12. Cluster Communication
      ● Cassandra uses a gossip protocol to discover location and state information about the other nodes in the cluster.
      ● When a node first starts up, it contacts seed nodes to begin gossiping. A node remembers the other nodes it has gossiped with.
      ● Failure states are tracked automatically through the heartbeats carried by gossip.
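
The seed-based discovery described on this slide can be sketched as a toy simulation. It models only how membership knowledge spreads: every node starts out knowing itself and one seed, then repeatedly swaps its view with one known peer. Heartbeats and failure detection are left out, and the function and node names are assumptions made for the example, not Cassandra internals.

```python
import random


def gossip_round(views, rng):
    """Each node exchanges its membership view with one randomly chosen known peer."""
    for node in sorted(views):
        peers = sorted(views[node] - {node})
        if not peers:
            continue  # knows nobody else yet; waits until another node contacts it
        peer = rng.choice(peers)
        merged = views[node] | views[peer]
        views[node] = set(merged)
        views[peer] = set(merged)


def run_until_converged(nodes, seed_node, max_rounds=50, rng=None):
    """Return the number of rounds until every node knows every other node."""
    rng = rng or random.Random(0)
    # At startup a node knows only itself and the seed.
    views = {n: {n, seed_node} for n in nodes}
    everyone = set(nodes)
    for round_no in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(view == everyone for view in views.values()):
            return round_no
    return None  # did not converge within max_rounds


rounds = run_until_converged(["a-seed", "n1", "n2", "n3", "n4"], "a-seed")
print("converged after", rounds, "rounds")
```
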
  13. Data Partitioning
      ● Data in the cluster is represented as a ring.
      ● The ring is divided into as many ranges as there are nodes. Each node is responsible for one (or more) ranges of the overall data.
      ● Each node has a token. The token determines the node's position on the ring.
      ● In the configuration file, you can set the initial token for each node, or let it be generated automatically.
  14. Data Partitioning
      ● For example, a 4-node cluster with a token range of 0 to 100.
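
The 4-node example can be sketched in a few lines. The slide's figure is not available here, so the evenly spaced tokens 0, 25, 50, 75 and the node names are assumptions. The sketch also assumes that a node owns the range ending at its own token, and uses an MD5-based hash reduced to the toy 0 to 100 range to place keys.

```python
import bisect
import hashlib

RING_SIZE = 100  # toy range from the slide; a real token space is far larger
# Assumed token assignment: one evenly spaced token per node.
TOKENS = {0: "node-A", 25: "node-B", 50: "node-C", 75: "node-D"}


def key_token(key):
    """Hash a row key onto the toy ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner(token, tokens=TOKENS):
    """Return the node whose token is the first one at or after `token`, wrapping around."""
    ring = sorted(tokens)
    i = bisect.bisect_left(ring, token)
    return tokens[ring[i % len(ring)]]


print(owner(30))  # node-C: 30 falls in the range (25, 50]
print(owner(80))  # node-A: past the last token, so it wraps around the ring
```
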
  15. Multiple Datacenters
      ● In multi-datacenter deployments, the cluster is set up so that each data center holds a complete copy of the data.
  16. Partitioning with Replication
      ● The total number of replicas across the cluster is referred to as the replication factor.
      ● Example: replication_factor = 3
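
A minimal sketch of replica placement with replication_factor = 3, assuming the simplest possible policy: the first replica goes to the node that owns the token, and the remaining replicas go to the next nodes clockwise on the ring. The token assignment and node names are the same assumed values as in a toy 4-node ring; topology-aware placement across racks or data centers is not modeled.

```python
import bisect

# Assumed toy ring: four nodes with evenly spaced tokens in the range 0 to 100.
TOKENS = {0: "node-A", 25: "node-B", 50: "node-C", 75: "node-D"}


def replicas(token, replication_factor, tokens=TOKENS):
    """Pick the owning node plus the next (replication_factor - 1) nodes clockwise."""
    ring = sorted(tokens)
    if replication_factor > len(ring):
        raise ValueError("replication factor exceeds the number of nodes")
    start = bisect.bisect_left(ring, token) % len(ring)
    return [tokens[ring[(start + i) % len(ring)]] for i in range(replication_factor)]


print(replicas(30, 3))  # ['node-C', 'node-D', 'node-A']
```
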
  17. Snitch
      ● The snitch defines how nodes are grouped together within the network topology.
      ● Cassandra uses this information to route inter-node requests.
      ● The snitch does not affect requests between the client application and Cassandra. It does not control which node a client connects to.
  18. Write in Cassandra
      ● Consistency level: ZERO, ANY, ONE, QUORUM, ALL
  19. Read in Cassandra
      ● Consistency level: ONE, QUORUM, ALL
      ● Read repair: stale replicas are repaired lazily, at read time.
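
To make the read and write levels concrete, the sketch below computes how many replica acknowledgements ONE, QUORUM, and ALL require for a given replication factor, and checks the usual overlap rule: a read is guaranteed to touch at least one replica holding the latest write when the read and write counts together exceed the replication factor. ZERO and ANY are left out on the assumption that they do not wait for an acknowledgement from a replica that owns the row. The function names are hypothetical.

```python
def required_acks(level, replication_factor):
    """Number of replicas that must respond for the given consistency level."""
    levels = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,  # strict majority of replicas
        "ALL": replication_factor,
    }
    return levels[level]


def read_sees_latest_write(write_level, read_level, replication_factor):
    """True when the read set and the write set must overlap in at least one replica."""
    w = required_acks(write_level, replication_factor)
    r = required_acks(read_level, replication_factor)
    return w + r > replication_factor


print(required_acks("QUORUM", 3))                      # 2
print(read_sees_latest_write("QUORUM", "QUORUM", 3))   # True
print(read_sees_latest_write("ONE", "ONE", 3))         # False
```
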
  20. Underlying Storage
      ● Cassandra is optimized for write throughput. Writes go first to a commit log, and then to a memtable.
      ● Writes are batched in memory and periodically flushed to disk in a persistent table structure called an SSTable.
      ● A row may be stored across multiple SSTable files. At read time, the row must be assembled from all SSTables on disk that contain part of it.
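
The write path above can be sketched as a tiny in-memory store. Everything here is a simplification under stated assumptions: the commit log is a Python list rather than a file, an SSTable is an immutable-by-convention dict sorted by row key, and the memtable is flushed once it holds `memtable_limit` rows. The class and its names are hypothetical.

```python
class TinyStore:
    """Toy model of the commit log -> memtable -> SSTable write path."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of every write (never truncated here)
        self.memtable = {}     # row_key -> {column: value}, held in memory
        self.sstables = []     # flushed tables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, row_key, column, value):
        self.commit_log.append((row_key, column, value))  # log first
        self.memtable.setdefault(row_key, {})[column] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Persist the memtable as a new SSTable sorted by row key."""
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

    def read(self, row_key):
        """Merge row fragments from every SSTable and the memtable, newest wins."""
        row = {}
        for table in self.sstables + [self.memtable]:  # oldest to newest
            row.update(table.get(row_key, {}))
        return row


store = TinyStore(memtable_limit=2)
store.write("u1", "name", "alice")
store.write("u2", "name", "bob")      # second row triggers a flush
store.write("u1", "city", "Taipei")
store.flush()                          # row u1 now spans two SSTables
print(store.read("u1"))                # {'name': 'alice', 'city': 'Taipei'}
```
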
  21. Deletes
      ● Deleted data is not immediately removed from disk.
      ● Instead, a marker called a tombstone is written to indicate the new column status.
      ● Columns marked with a tombstone remain for a configured time period, and are permanently deleted by the compaction process once that period has expired.
  22. Compaction
      ● Since SSTables are immutable, Cassandra periodically merges SSTables together using a process called compaction.
      ● Compaction merges row fragments together and removes expired tombstones. SSTables are sorted by row key, so this merge is efficient.
      ● During compaction, there is a temporary spike in disk space usage and disk I/O.
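
Tombstones and compaction can be illustrated together. In this sketch each SSTable maps a row key to columns, and each column carries a (value, timestamp) pair; a delete is recorded as the sentinel `TOMBSTONE`. Compaction keeps the newest version of every column and drops tombstones older than `gc_grace`, a hypothetical parameter standing in for the configured time period. Timestamps are plain integers for readability.

```python
TOMBSTONE = object()  # sentinel marking a deleted column


def compact(sstables, now, gc_grace):
    """Merge SSTables (oldest first) into one, dropping tombstones older than gc_grace."""
    merged = {}
    for table in sstables:
        for row_key, columns in table.items():
            row = merged.setdefault(row_key, {})
            for column, (value, ts) in columns.items():
                # Keep whichever version of the column has the newest timestamp.
                if column not in row or ts >= row[column][1]:
                    row[column] = (value, ts)
    result = {}
    for row_key in sorted(merged):  # output stays sorted by row key
        kept = {
            column: (value, ts)
            for column, (value, ts) in merged[row_key].items()
            if not (value is TOMBSTONE and now - ts >= gc_grace)
        }
        if kept:
            result[row_key] = kept
    return result


old = {"u1": {"name": ("alice", 1), "city": ("Taipei", 1)}}
new = {"u1": {"city": (TOMBSTONE, 5)}, "u2": {"name": ("bob", 6)}}

early = compact([old, new], now=6, gc_grace=3)   # tombstone still within its grace period
late = compact([old, new], now=10, gc_grace=3)   # tombstone expired, so it is purged
print(sorted(late["u1"]))  # ['name']
```
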
  23. Limitations of Cassandra
      ● All data for a single row is stored on a single machine in the cluster, so the amount of data associated with one key is bounded by what that machine can hold.
      ● A single column value may not be larger than 2 GB.
      ● The maximum number of columns per row is 2,000,000,000 (2 billion).
      ● Keys and column names must be smaller than 64 KB.
  24. Other features
      ● Secondary index
      ● Load balancer
      ● Column TTL
      ● Thrift, Avro and CQL
      ● Hadoop integration
  25. Links
      ● papers/lakshman-ladis2009.pdf
  26. Thank you. 郭家寶