Seattle Cassandra Meetup - Cassandra 1.2 - Eddie Satterly


Published in: Technology

  1. 2013 C*: Cassandra 1.2
     Eddie Satterly, Splunk - Chief Evangelist, Office of
  2. Highlighted Features in 1.2
     • Concurrent schema changes
     • Virtual nodes (vnodes)
     • Off-heap bloom filters & compression metadata
     • Improved JBOD functionality
     • Query profiling & tracing
     • Atomic batches
     • CQL 3
     • Collections
  3. Concurrent Schema Changes
  4. Vnodes Advantages
     • Rather than just one or a few nodes participating in bootstrapping a new node, all nodes participate
     • There is no longer a need to stay with thin nodes, as the impact on each node is much smaller
     • Vnodes automatically maintain the data distribution (balance) of the cluster, so there is no need to rebalance after modifying the cluster
     • Instead of assigning each node a single token, a new num_tokens setting specifies how many tokens each node holds
     • Simple to upgrade a cluster for vnode support
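
The first vnode advantage above (all nodes participate in bootstrapping) can be illustrated with a toy token ring. This is a minimal sketch, not Cassandra's actual partitioner: the 32-bit token space, the random token assignment, the six-node cluster, and the names `build_ring`, `owner`, and `donors_when_adding` are all assumptions made for illustration.

```python
import bisect
import random

RING_SIZE = 2**32  # toy token space (assumption, not Cassandra's real range)


def build_ring(nodes, num_tokens, rng):
    """Return a sorted list of (token, node) pairs with num_tokens random tokens per node."""
    ring = [(rng.randrange(RING_SIZE), node) for node in nodes for _ in range(num_tokens)]
    ring.sort()
    return ring


def owner(ring, key_token):
    """A key belongs to the node holding the first token >= key_token, wrapping around."""
    i = bisect.bisect_left(ring, (key_token,)) % len(ring)
    return ring[i][1]


def donors_when_adding(num_tokens, existing=6, samples=20000, seed=1):
    """Return the set of existing nodes that hand keys over to a newly added node."""
    rng = random.Random(seed)
    nodes = [f"node{i}" for i in range(existing)]
    before = build_ring(nodes, num_tokens, rng)
    after = sorted(before + build_ring(["new"], num_tokens, rng))
    keys = [rng.randrange(RING_SIZE) for _ in range(samples)]
    # Keys that move to the new node were previously owned by a "donor" node.
    return {owner(before, k) for k in keys if owner(after, k) == "new"}


if __name__ == "__main__":
    print("one token per node:", sorted(donors_when_adding(1)))
    print("256 tokens per node:", sorted(donors_when_adding(256)))
```

With a single token per node, the new node takes over one contiguous range, so at most one existing node streams data to it. With many tokens per node, the new node's ranges are scattered around the ring and every existing node contributes a small share.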
  7. Off-heap bloom filters & metadata
     • Reduces the Java heap requirements for large datasets, which reduces garbage collection impact on performance and stability (1-2GB per billion rows)
     • Moves memory used for bloom filters & compression metadata into native memory (~20GB per TB compressed data)
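
As a rough sanity check on the "1-2GB per billion rows" figure above, the standard Bloom filter sizing formula m = -n ln(p) / (ln 2)^2 bits can be evaluated for a billion keys. This is a back-of-the-envelope sketch only: the false-positive rates below are illustrative assumptions, not values taken from the slides, and the function name is hypothetical.

```python
import math


def bloom_filter_bytes(n_keys, false_positive_rate):
    """Size in bytes of an optimally sized Bloom filter: m = -n * ln(p) / (ln 2)^2 bits."""
    bits = -n_keys * math.log(false_positive_rate) / (math.log(2) ** 2)
    return bits / 8


if __name__ == "__main__":
    # Assumed false-positive rates, chosen only to show the order of magnitude.
    for p in (0.01, 0.001):
        gb = bloom_filter_bytes(1_000_000_000, p) / 1e9
        print(f"p={p}: {gb:.2f} GB per billion rows")
```

For these assumed rates the result lands between 1 and 2 GB per billion rows, which is why keeping the filters on the Java heap becomes costly for large datasets.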
  8. Improved JBOD functionality
     • In prior versions a single disk failure could make a node unavailable for I/O
     • 1.2 introduces a new disk_failure_policy setting whose options include two policies for handling disk failure:
       • best_effort: if Cassandra can't write to a disk, the disk is blacklisted for writes and the node continues writing elsewhere. If Cassandra can't read from a disk, it marks the disk as unreadable and continues serving data from readable sstables only.
       • ignore: Cassandra behaves in exactly the same way as in previous versions
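
The write side of best_effort described above can be modeled in a few lines. This is a toy model under stated assumptions, not Cassandra code: the `Disk` and `BestEffortNode` classes and their methods are hypothetical names introduced only to show the blacklist-and-continue behavior.

```python
class Disk:
    """Hypothetical data directory that may fail on write."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = []

    def write(self, item):
        if not self.healthy:
            raise OSError(f"{self.name} failed")
        self.data.append(item)


class BestEffortNode:
    """Toy model of best_effort: blacklist a failed disk and keep writing elsewhere."""

    def __init__(self, disks):
        self.disks = list(disks)
        self.blacklisted = set()

    def write(self, item):
        for disk in self.disks:
            if disk.name in self.blacklisted:
                continue  # never retry a disk that already failed a write
            try:
                disk.write(item)
                return disk.name
            except OSError:
                self.blacklisted.add(disk.name)
        raise OSError("no writable disks left")


if __name__ == "__main__":
    node = BestEffortNode([Disk("disk1", healthy=False), Disk("disk2")])
    print(node.write("sstable-1"), node.blacklisted)
```

The node stays available for writes as long as at least one data directory is still healthy.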
  9. Query Profiling & Tracing
     • All-new performance diagnostic utilities aimed at helping understand, diagnose and troubleshoot CQL statements
     • You can:
       • Interrogate an individual CQL statement in an ad-hoc manner
       • Perform a system-wide collection of all queries that are sent to a cluster
     • Cassandra provides a description of each step along with which node(s) are affected, the time per step and the total time of the request (example follows)
  10. Tracing Example
  11. Atomic Batches
     • In 1.2 batches are guaranteed by default to be atomic and are handled differently than in earlier versions. The steps are:
       • The batch is first written to a new system table that stores the serialized batch as blob data
       • After the rows in the batch have been successfully written and persisted (or hinted), the system table entry is removed
     • This is the new default behavior, and it carries a performance penalty, so for use cases that don't need atomicity a BEGIN UNLOGGED BATCH can be issued
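
The two steps above (log the serialized batch first, remove the entry once every row is written) can be sketched as a small simulation. This is a toy model, not Cassandra's implementation: `ToyNode`, its in-memory dictionaries, the JSON serialization, and the `fail_after` crash switch are all assumptions made for illustration.

```python
import json
import uuid


class ToyNode:
    """Toy single-node model of a logged (atomic) batch."""

    def __init__(self):
        self.batchlog = {}  # batch id -> serialized batch (stands in for the system table)
        self.tables = {}    # table name -> {key: value}

    def apply(self, mutation):
        table, key, value = mutation
        self.tables.setdefault(table, {})[key] = value

    def logged_batch(self, mutations, fail_after=None):
        batch_id = str(uuid.uuid4())
        # Step 1: persist the whole batch as a blob before touching any table.
        self.batchlog[batch_id] = json.dumps(mutations).encode()
        for i, mutation in enumerate(mutations):
            if fail_after is not None and i >= fail_after:
                raise RuntimeError("simulated crash mid-batch")
            self.apply(mutation)
        # Step 2: every row is written, so the log entry is no longer needed.
        del self.batchlog[batch_id]

    def replay_batchlog(self):
        """Re-apply any batch whose log entry was never removed."""
        for batch_id in list(self.batchlog):
            for mutation in json.loads(self.batchlog[batch_id].decode()):
                self.apply(mutation)
            del self.batchlog[batch_id]


if __name__ == "__main__":
    node = ToyNode()
    batch = [("users", "u1", "alice"), ("emails", "alice@example.com", "u1")]
    try:
        node.logged_batch(batch, fail_after=1)
    except RuntimeError:
        pass
    node.replay_batchlog()
    print(node.tables)
```

Replaying works here because each write is idempotent: applying the same mutation twice leaves the table in the same state. The extra log write and delete is the overhead that an unlogged batch avoids.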
  12. CQL3
     • There are several enhancements to CQL 3 in 1.2; review the full documentation for a complete overview.
     • Some of the main new features are:
       • ALTER KEYSPACE
       • Commands to view TTL time remaining
       • Conditional operators
       • Limited ordering
       • New metadata tables in the system keyspace:
         • Keyspace: quick reference for keyspace metadata
         • Local: supplies demographic information for the local node
         • Peer: supplies demographic information for peer nodes in the cluster
  13. Collections
     • All-new mechanisms for storing data, called collections, which provide easier methods of inserting and manipulating data with multiple items in a single column
     • There are 3 different types of collections you can select from:
       • Sets: group of elements that are returned in sorted order when queried
       • Lists: group of elements that allows append and prepend and allows reference by index
       • Maps: maps one thing to another in a pair
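
The behavior of the three collection types listed above can be mimicked with Python's built-in containers. This is only an analogy to show the semantics, not CQL and not a driver API; the variable names and sample values are made up for illustration.

```python
# Set: unique elements, returned in sorted order when read.
emails = set()
emails.add("z@example.com")
emails.add("a@example.com")
emails.add("a@example.com")        # adding a duplicate changes nothing
emails_as_read = sorted(emails)    # models the sorted order of a set column

# List: keeps insertion order, supports append, prepend and access by index.
top_places = ["rivendell"]
top_places.append("mordor")        # append
top_places.insert(0, "the shire")  # prepend
second_place = top_places[1]       # reference by index

# Map: pairs each key with one value.
todo = {}
todo["2013-01-01"] = "review vnodes"
todo["2013-01-02"] = "try collections"

if __name__ == "__main__":
    print(emails_as_read, top_places, second_place, todo)
```

In each case the multiple items live in a single column of one row, rather than being spread across several rows or serialized by the application.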