Cassandra 1.2 by Eddie Satterly

Eddie Satterly on Cassandra 1.2


  • 1. Cassandra 1.2 (C*). Eddie Satterly, Splunk, Chief Evangelist, Office of CTO, esatterly@splunk.com. PlanetCassandra.org 2013
  • 2. Highlighted Features in 1.2
      • Concurrent schema changes
      • Virtual nodes (vnodes)
      • Off-heap bloom filters & compression metadata
      • Improved JBOD functionality
      • Query Profiling & Tracing
      • Atomic batches
      • CQL 3
      • Collections
  • 3. Concurrent Schema Changes
  • 4. Vnodes Advantages
      • Rather than just one or a few nodes participating in bootstrapping a new node, all nodes participate
      • There is no longer a need to stay with thin nodes, as the impact on each node is much smaller
      • Vnodes automatically maintain the data distribution (balance) of the cluster, so there is no need to rebalance after the cluster is modified
      • Instead of each node having a single token, a new num_tokens configuration setting controls how many tokens a node owns
      • It is simple to upgrade a cluster for vnodes support
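The balancing claim on the vnodes slide can be illustrated with a small simulation. This is only a sketch, not Cassandra's partitioner: the ring size, the random token placement for both cases, and the function names `ownership` and `imbalance` are assumptions made for the example.

```python
import random

RING_SIZE = 2**64  # hypothetical token space, chosen only for this sketch


def ownership(num_nodes, num_tokens, seed=0):
    """Return the fraction of the ring owned by each node.

    Every node picks `num_tokens` random positions on the ring
    (an assumption of this sketch, not Cassandra's exact algorithm).
    """
    rng = random.Random(seed)
    tokens = sorted(
        (rng.randrange(RING_SIZE), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owned = [0] * num_nodes
    # A token owns the range from the previous token up to itself;
    # the first token wraps around and follows the last one.
    prev = tokens[-1][0] - RING_SIZE
    for position, node in tokens:
        owned[node] += position - prev
        prev = position
    return [o / RING_SIZE for o in owned]


def imbalance(fractions):
    """Ratio between the most loaded and the least loaded node."""
    return max(fractions) / min(fractions)
```

Comparing `imbalance(ownership(6, 1))` with `imbalance(ownership(6, 256))` shows how many small ranges per node even out ownership without manual rebalancing. In practice single tokens were usually calculated by hand rather than drawn at random, so the one-token case here overstates the imbalance of a carefully managed cluster.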
  • 7. Off-heap bloom filters & metadata
      • Reduces Java heap requirements for large datasets, which lessens the impact of garbage collection on performance and stability (1-2GB per billion rows)
      • Moves the memory used for Bloom filters and compression metadata into native memory (~20GB per TB of compressed data)
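To get a feel for how much memory moves off the heap, the slide's two figures can be turned into a rough estimate. This is a back-of-the-envelope sketch: it takes the slide's numbers at face value and assumes the first figure applies to Bloom filters and the second to compression metadata. The function name `offheap_estimate_gb` is made up for the example.

```python
def offheap_estimate_gb(rows_in_billions, compressed_tb):
    """Rough memory estimate using the figures quoted on the slide.

    Assumption: 1-2 GB per billion rows is the Bloom filter cost, and
    ~20 GB per TB of compressed data is the compression metadata cost.
    """
    bloom_low = 1 * rows_in_billions
    bloom_high = 2 * rows_in_billions
    metadata = 20 * compressed_tb
    return {
        "bloom_filters_gb": (bloom_low, bloom_high),
        "compression_metadata_gb": metadata,
    }
```

For example, `offheap_estimate_gb(2, 0.5)` estimates a node holding two billion rows and half a terabyte of compressed data.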
  • 8. Improved JBOD functionality
      • In prior versions, a single disk failure could make a node unavailable for I/O
      • 1.2 introduces a new disk_failure_policy setting that lets you choose how a node deals with disk failure. Policies include:
          • best_effort: if Cassandra can't write to a disk, the disk is blacklisted for writes and the node continues writing elsewhere. If Cassandra can't read from a disk, it marks the disk as unreadable and continues serving data from readable sstables only
          • ignore: Cassandra behaves in exactly the same way as in previous versions
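The best_effort behavior described above can be modeled in a few lines. This is a toy model, not Cassandra code: the `Disk` and `BestEffortNode` classes and the `healthy` flag are hypothetical, and a real node detects failures through I/O errors rather than a flag.

```python
class Disk:
    """Hypothetical stand-in for one data directory in a JBOD setup."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.sstables = []


class BestEffortNode:
    """Toy model of the best_effort policy as the slide describes it."""

    def __init__(self, disks):
        self.disks = list(disks)
        self.write_blacklist = set()  # disks no longer used for writes
        self.unreadable = set()       # disks no longer used for reads

    def write(self, sstable):
        for disk in self.disks:
            if disk.name in self.write_blacklist:
                continue
            if not disk.healthy:
                # Write failed: blacklist the disk and try the next one.
                self.write_blacklist.add(disk.name)
                continue
            disk.sstables.append(sstable)
            return disk.name
        raise IOError("no writable disk left")

    def read_all(self):
        data = []
        for disk in self.disks:
            if disk.name in self.unreadable:
                continue
            if not disk.healthy:
                # Read failed: mark the disk unreadable, serve the rest.
                self.unreadable.add(disk.name)
                continue
            data.extend(disk.sstables)
        return data
```

The node keeps accepting writes and serving reads as long as one healthy disk remains, at the cost of no longer serving whatever was stored only on the failed disk.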
  • 9. Query Profiling & Tracing
      • All-new performance diagnostic utilities aimed at helping you understand, diagnose and troubleshoot CQL statements
      • You can:
          • Interrogate individual CQL statements in an ad-hoc manner
          • Perform a system-wide collection of all queries that are sent to a cluster
      • Cassandra provides a description of each step, along with which node(s) are affected, the time per step and the total time of the request (example follows)
  • 10. Tracing Example
  • 11. Atomic Batches
      • In 1.2, batches are atomic by default and are handled differently than in earlier versions. The steps are:
          • The batch is first written to a new system table, which stores the serialized batch as blob data
          • After the rows in the batch have been successfully written and persisted (or hinted), the system table entry is removed
      • This is the new default behavior, and it carries a performance penalty. Use cases that do not need atomicity can issue BEGIN UNLOGGED BATCH instead
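The two steps on the atomic batches slide amount to a write-ahead log for the batch. A minimal single-process sketch of that idea follows. It is not how Cassandra implements it: `ToyNode`, the `fail_after` crash switch and the use of `pickle` for the blob are all assumptions made for illustration.

```python
import pickle
import uuid


class ToyNode:
    """Toy model of a logged batch: log first, apply, then clear the log."""

    def __init__(self):
        self.batchlog = {}  # batch id -> serialized batch (the "blob")
        self.rows = {}      # key -> value

    def apply_logged_batch(self, mutations, fail_after=None):
        batch_id = uuid.uuid4()
        # Step 1: persist the whole serialized batch before touching rows.
        self.batchlog[batch_id] = pickle.dumps(mutations)
        for i, (key, value) in enumerate(mutations):
            if fail_after is not None and i >= fail_after:
                raise RuntimeError("simulated crash mid-batch")
            self.rows[key] = value
        # Step 2: every row is written, so the log entry can go.
        del self.batchlog[batch_id]

    def replay_batchlog(self):
        """Finish any batch whose log entry survived a failure."""
        for batch_id, blob in list(self.batchlog.items()):
            for key, value in pickle.loads(blob):
                self.rows[key] = value
            del self.batchlog[batch_id]
```

If the node fails halfway through, the surviving log entry lets the remaining rows be written later, so the batch ends up applied as a whole. The extra write in step 1 is where the performance penalty mentioned on the slide comes from.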
  • 12. CQL3
      • There are several enhancements to CQL 3 in 1.2; review the full documentation for a complete overview
      • Some of the main new features are:
          • ALTER KEYSPACE
          • Commands to view TTL time remaining
          • Conditional operators
          • Limited ordering
          • New metadata tables in the system keyspace:
              • Keyspace: quick reference for keyspace metadata
              • Local: supplies demographic information for the local node
              • Peer: supplies demographic information for peer nodes in the cluster
  • 13. Collections
      • Collections are an all-new mechanism for storing data that makes it easier to insert and manipulate multiple items in a single column
      • There are 3 types of collections to choose from:
          • Sets: a group of elements that are returned in sorted order when queried
          • Lists: a group of elements that supports append and prepend and allows reference by index
          • Maps: map one thing to another as a pair
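The behavior of the three collection types can be approximated with Python's built-in containers. This is an analogy only, not CQL syntax or a driver API; the function names and sample values are made up for the example.

```python
def set_example():
    """Set: no duplicates, returned in sorted order when queried."""
    emails = {"b@example.com", "a@example.com"}
    emails.add("c@example.com")
    emails.add("a@example.com")  # already present, so nothing changes
    return sorted(emails)


def list_example():
    """List: supports append, prepend and reference by index."""
    places = ["home"]
    places.append("work")    # append to the end
    places.insert(0, "gym")  # prepend to the front
    return places, places[1]


def map_example():
    """Map: pairs each key with one value."""
    todo = {}
    todo["2013-01-01"] = "write slides"
    todo["2013-01-02"] = "give talk"
    return todo
```

Python sets are unordered, which is why the sketch sorts before returning; the slide says a CQL set comes back already sorted.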