London + Dublin Cassandra 2.0

3,014 views

Published on

Published in: Technology
0 Comments
5 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
3,014
On SlideShare
0
From Embeds
0
Number of Embeds
6
Actions
Shares
0
Downloads
22
Comments
0
Likes
5
Embeds 0
No embeds

No notes for slide

London + Dublin Cassandra 2.0

  1. 1. ©2013 DataStax Confidential. Do not distribute without consent. @spyced Jonathan Ellis CTO, DataStax / Project Chair, Apache Cassandra Cassandra 2.0
  2. 2. Five Years of Cassandra Jul-09 May-10 Feb-11 Dec-11 Oct-12 Jul-13 0.1 0.3 0.6 0.7 1.0 1.2 ... 2.0 DSE Jul-08
  3. 3. Core values 0 20000 40000 60000 80000 0 2 4 6 8 10 12 Cassandra HBase Redis MySQL • Massive scalablility • High performance • Reliability/Availability
  4. 4. CREATE TABLE users ( id uuid PRIMARY KEY, name text, state text, birth_date int ); CREATE INDEX ON users(state); SELECT * FROM users WHERE state=‘Texas’ AND birth_date > 1950; New Core Value • Massive scalablility • High performance • Reliability/Availability • Ease of use
  5. 5. *Key concepts? *Data Modeling section of documentation: http://www.datastax.com/ documentation/cassandra/1.2/index.html#cassandra/ddl/ ddl_anatomy_table_c.html CQL delivers "Coming from a relational database background we found the transition to Cassandra to be very straightforward. There are a few simple key concepts one must grasp at first but ever since it's been smooth sailing for us." Boris Wolf, Comcast
  6. 6. 1.2 for Developers • CQL3 • Thrift compatibility • Collections • Data dictionary • Auth support • Hadoop support • Native drivers • Tracing • Atomic batches
  7. 7. [cassandra.yaml] authenticator: PasswordAuthenticator # DSE offers KerberosAuthenticator as well CREATE USER robin WITH PASSWORD 'manager' SUPERUSER; ALTER USER cassandra WITH PASSWORD 'newpassword'; LIST USERS; DROP USER cassandra; Authentication
  8. 8. [cassandra.yaml] authorizer: CassandraAuthorizer GRANT select ON audit TO jonathan; GRANT modify ON users TO robin; GRANT all ON ALL KEYSPACES TO lara; Authorization
  9. 9. Native drivers • CQL native protocol: efficient, lightweight, asynchronous • Java (GA): https://github.com/datastax/java-driver • .NET (Beta): https://github.com/datastax/csharp-driver • Python (Beta): https://github.com/datastax/python-driver • Coming soon: PHP, Ruby, others
  10. 10. 1.2 for Operators • Virtual nodes • “Dense node” support (5-10TB/machine) • JBOD improvements • Off-heap bloom filters, compression metadata • Parallel leveled compaction
  11. 11. 1.2.5+
  12. 12. 1.2.5+ • ~1/2 memory usage in partition summary
  13. 13. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle
  14. 14. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction
  15. 15. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction • Removed cell-name bloom filters
  16. 16. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction • Removed cell-name bloom filters • Thread-local allocation
  17. 17. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction • Removed cell-name bloom filters • Thread-local allocation • LZ4 compression (default in 2.0)
  18. 18. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction • Removed cell-name bloom filters • Thread-local allocation • LZ4 compression (default in 2.0) • (1.2.7) CQL Input/Output for Hadoop
  19. 19. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction • Removed cell-name bloom filters • Thread-local allocation • LZ4 compression (default in 2.0) • (1.2.7) CQL Input/Output for Hadoop • (1.2.7) Range tombstone performance
  20. 20. 1.2.5+ • ~1/2 memory usage in partition summary • Improved compaction throttle • Parallel leveled compaction • Removed cell-name bloom filters • Thread-local allocation • LZ4 compression (default in 2.0) • (1.2.7) CQL Input/Output for Hadoop • (1.2.7) Range tombstone performance • (1.2.9) Larger default LCS filesize (160MB > 5MB)
  21. 21. Cassandra 2.0
  22. 22. 2.0 • Lightweight transactions • Triggers (experimental) • Improved compaction • CQL cursors
  23. 23. SELECT * FROM users WHERE username = ’jbellis’ [empty resultset] INSERT INTO users (...) VALUES (’jbellis’, ...) Session 1 SELECT * FROM users WHERE username = ’jbellis’ [empty resultset] INSERT INTO users (...) VALUES (’jbellis’, ...) Session 2 Lightweight transactions: the problem
  24. 24. Paxos • All operations are quorum-based • Each replica sends information about unfinished operations to the leader during prepare • Paxos made Simple
  25. 25. LWT: details • 4 round trips vs 1 for normal updates • Paxos state is durable • Immediate consistency with no leader election or failover • ConsistencyLevel.SERIAL • http://www.datastax.com/dev/blog/lightweight-transactions-in- cassandra-2-0
  26. 26. LWT: Use with caution • Great for 1% of your application • Eventual consistency is your friend • http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency- hopeful-consistency-by-christos-kalantzis
  27. 27. UPDATE USERS SET email = ’jonathan@datastax.com’, ... WHERE username = ’jbellis’ IF email = ’jbellis@datastax.com’; INSERT INTO USERS (username, email, ...) VALUES (‘jbellis’, ‘jbellis@datastax.com’, ... ) IF NOT EXISTS; Using LWT
  28. 28. Triggers CREATE TRIGGER <name> ON <table> USING <classname>;
  29. 29. Trigger implementation class MyTrigger implements ITrigger { public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update) { ... } }
  30. 30. Experimental! • Relies on internal RowMutation, ColumnFamily classes • [partition] key is a ByteBuffer • Expect changes in 2.1
  31. 31. Compaction • Single-pass, always • LCS performs STCS in L0
  32. 32. Healthy leveled compaction
  33. 33. Sad leveled compaction
  34. 34. STCS in L0
  35. 35. Cursors (before) SELECT * FROM timeline WHERE (user_id = :last_key AND tweet_id > :last_tweet) OR token(user_id) > token(:last_key) LIMIT 100 CREATE TABLE timeline (   user_id uuid,   tweet_id timeuuid,   tweet_author uuid, tweet_body text,   PRIMARY KEY (user_id, tweet_id) );
  36. 36. Cursors (after) SELECT * FROM timeline
  37. 37. Misc. performance improvements
  38. 38. Misc. performance improvements • Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path
  39. 39. Misc. performance improvements • Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path • New half-synchronous, half-asynchronous Thrift server based on LMAX Disruptor
  40. 40. Misc. performance improvements • Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path • New half-synchronous, half-asynchronous Thrift server based on LMAX Disruptor • Faster partition index lookups and cache reads by improving performance of off-heap memory
  41. 41. Misc. performance improvements • Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path • New half-synchronous, half-asynchronous Thrift server based on LMAX Disruptor • Faster partition index lookups and cache reads by improving performance of off-heap memory • Faster reads of compressed data by switching from CRC32 to Adler checksums
  42. 42. Misc. performance improvements • Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path • New half-synchronous, half-asynchronous Thrift server based on LMAX Disruptor • Faster partition index lookups and cache reads by improving performance of off-heap memory • Faster reads of compressed data by switching from CRC32 to Adler checksums • JEMalloc support for off-heap allocation
  43. 43. Spring cleaning
  44. 44. Spring cleaning • Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema
  45. 45. Spring cleaning • Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema • The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric
  46. 46. Spring cleaning • Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema • The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric • The on-heap partition cache (“row cache”) has been removed
  47. 47. Spring cleaning • Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema • The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric • The on-heap partition cache (“row cache”) has been removed • Vnodes are on by default
  48. 48. Spring cleaning • Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema • The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric • The on-heap partition cache (“row cache”) has been removed • Vnodes are on by default • the old token range bisection code for non-vnode clusters is gone
  49. 49. Spring cleaning • Removed compatibility with pre-1.2.5 sstables and pre-1.2.9 schema • The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric • The on-heap partition cache (“row cache”) has been removed • Vnodes are on by default • the old token range bisection code for non-vnode clusters is gone • Removed emergency memory pressure valve logic
  50. 50. Operational concerns
  51. 51. Operational concerns • Java7 is now required!
  52. 52. Operational concerns • Java7 is now required! • Leveled compaction level information has been moved into sstable metadata
  53. 53. Operational concerns • Java7 is now required! • Leveled compaction level information has been moved into sstable metadata • Kernel page cache skipping has been removed in favor of optional row preheating (preheat_kernel_page_cache)
  54. 54. Operational concerns • Java7 is now required! • Leveled compaction level information has been moved into sstable metadata • Kernel page cache skipping has been removed in favor of optional row preheating (preheat_kernel_page_cache) • Streaming has been rewritten to be more transparent and robust.
  55. 55. Operational concerns • Java7 is now required! • Leveled compaction level information has been moved into sstable metadata • Kernel page cache skipping has been removed in favor of optional row preheating (preheat_kernel_page_cache) • Streaming has been rewritten to be more transparent and robust. • Streaming support for old-version sstables
  56. 56. ©2013 DataStax Confidential. Do not distribute without consent. 31

×