Cassandra Summit EU 2013
Cassandra Summit EU 2013 Presentation Transcript

  • 1. #CASSANDRAEU Cassandra 2.0 and 2.1 Jonathan Ellis CTO, DataStax
  • 2. Five years of Cassandra: 0.1 (Jul-08), ..., 0.3 (Jul-09), 0.6 (May-10), 0.7 (Feb-11), 1.0 (Dec-11), DSE, 1.2 (Oct-12), 2.0 (Jul-13)
  • 3. Core values
    • Massive scalability
    • High performance
    • Reliability/Availability
  • 4. VLDB benchmark (RWS): chart of throughput (ops/sec) against number of nodes (0 to 12) for Cassandra, MySQL, HBase, and Redis
  • 5. Endpoint benchmark (RW): chart of throughput (ops/sec) against number of nodes (1 to 32) for Cassandra, HBase, and MongoDB
  • 7. New core value
    • Massive scalability
    • High performance
    • Reliability/Availability
    • Ease of use
    CREATE TABLE users (
        id uuid PRIMARY KEY,
        name text,
        state text,
        birth_date int
    );
    CREATE INDEX ON users(state);
    SELECT * FROM users WHERE state='Texas' AND birth_date > 1950;
  • 8. Native Drivers
    • CQL native protocol: efficient, lightweight, asynchronous
    • Java (GA): https://github.com/datastax/java-driver
    • .NET (GA): https://github.com/datastax/csharp-driver
    • Python (Beta): https://github.com/datastax/python-driver
    • Coming soon: PHP, Ruby
  • 9. Tracing
    cqlsh:foo> INSERT INTO bar (i, j) VALUES (6, 2);
    Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9

     activity                            | timestamp    | source    | source_elapsed
    -------------------------------------+--------------+-----------+----------------
     Determining replicas for mutation   | 00:02:37,015 | 127.0.0.1 |            540
     Sending message to /127.0.0.2       | 00:02:37,015 | 127.0.0.1 |            779
     Message received from /127.0.0.1    | 00:02:37,016 | 127.0.0.2 |             63
     Applying mutation                   | 00:02:37,016 | 127.0.0.2 |            220
     Acquiring switchLock                | 00:02:37,016 | 127.0.0.2 |            250
     Appending to commitlog              | 00:02:37,016 | 127.0.0.2 |            277
     Adding to memtable                  | 00:02:37,016 | 127.0.0.2 |            378
     Enqueuing response to /127.0.0.1    | 00:02:37,016 | 127.0.0.2 |            710
     Sending message to /127.0.0.1       | 00:02:37,016 | 127.0.0.2 |            888
     Message received from /127.0.0.2    | 00:02:37,017 | 127.0.0.1 |           2334
     Processing response from /127.0.0.2 | 00:02:37,017 | 127.0.0.1 |           2550
  • 10. Authentication
    [cassandra.yaml]
    authenticator: PasswordAuthenticator
    # DSE offers KerberosAuthenticator
  • 11. Authentication
    [cassandra.yaml]
    authenticator: PasswordAuthenticator
    # DSE offers KerberosAuthenticator
    CREATE USER robin WITH PASSWORD 'manager' SUPERUSER;
    ALTER USER cassandra WITH PASSWORD 'newpassword';
    LIST USERS;
    DROP USER cassandra;
  • 12. Authorization
    [cassandra.yaml]
    authorizer: CassandraAuthorizer
    GRANT select ON audit TO jonathan;
    GRANT modify ON users TO robin;
    GRANT all ON ALL KEYSPACES TO lara;
  • 13. Lightweight transactions
    Session 1: SELECT * FROM users WHERE username = 'jbellis'  [empty resultset]
    Session 2: SELECT * FROM users WHERE username = 'jbellis'  [empty resultset]
    Session 1: INSERT INTO users (...) VALUES ('jbellis', ...)
    Session 2: INSERT INTO users (...) VALUES ('jbellis', ...)
  • 14. Paxos
    • All operations are quorum-based
    • Each replica sends information about unfinished operations to the leader during prepare
    • Paxos Made Simple
  • 15. Details
    • 4 round trips vs 1 for normal updates
    • Paxos state is durable
    • Immediate consistency with no leader election or failover
    • ConsistencyLevel.SERIAL
    • http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
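The prepare step described on slide 14 can be illustrated with a minimal single-decree Paxos sketch. This is a simplification under stated assumptions, not Cassandra's implementation: the names `Acceptor` and `propose` are hypothetical, state is held in memory rather than made durable, and only the prepare and accept exchanges are modeled (the slides mention four round trips in total).

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class Acceptor:
    """One replica's Paxos state (in memory here; durable in a real system)."""
    promised: int = -1          # highest ballot this replica has promised
    accepted_ballot: int = -1   # ballot of the proposal it accepted, if any
    accepted_value: Any = None

    def prepare(self, ballot: int) -> Optional[Tuple[int, Any]]:
        """Promise to ignore older ballots and report any unfinished proposal."""
        if ballot <= self.promised:
            return None  # rejected: a newer or equal ballot was already promised
        self.promised = ballot
        return (self.accepted_ballot, self.accepted_value)

    def accept(self, ballot: int, value: Any) -> bool:
        """Accept the proposal unless a newer ballot has been promised."""
        if ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted_ballot = ballot
        self.accepted_value = value
        return True


def propose(acceptors: List[Acceptor], ballot: int, value: Any) -> Optional[Any]:
    """Run prepare then accept; return the chosen value, or None on failure."""
    quorum = len(acceptors) // 2 + 1
    promises = [p for p in (a.prepare(ballot) for a in acceptors) if p is not None]
    if len(promises) < quorum:
        return None
    # If any replica reported an unfinished proposal, finish that one instead.
    prior_ballot, prior_value = max(promises, key=lambda p: p[0])
    chosen = prior_value if prior_ballot >= 0 else value
    acks = sum(a.accept(ballot, chosen) for a in acceptors)
    return chosen if acks >= quorum else None
```

A proposer with a higher ballot that arrives after a value was accepted ends up re-proposing the earlier value, which is why no separate leader election or failover step is needed.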
  • 16. Use with caution
    • Great for 1% of your application
    • Eventual consistency is your friend
    • http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency-hopeful-consistency-by-christos-kalantzis
  • 17. Syntax
    INSERT INTO USERS (username, email, ...)
    VALUES ('jbellis', 'jbellis@datastax.com', ... )
    IF NOT EXISTS;
    UPDATE USERS
    SET email = 'jonathan@datastax.com', ...
    WHERE username = 'jbellis'
    IF email = 'jbellis@datastax.com';
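The semantics of the two conditional statements can be sketched in plain Python. This is an in-memory illustration only: the class `UsersTable` and its methods are hypothetical, and a local lock stands in for the Paxos round that Cassandra uses across replicas.

```python
import threading
from typing import Dict, Optional


class UsersTable:
    """In-memory stand-in for the users table; a lock plays the role of Paxos."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}  # username -> email
        self._lock = threading.Lock()

    def insert_if_not_exists(self, username: str, email: str) -> bool:
        """Like INSERT ... IF NOT EXISTS: True only if the row was applied."""
        with self._lock:
            if username in self._rows:
                return False
            self._rows[username] = email
            return True

    def update_if(self, username: str, new_email: str, expected_email: str) -> bool:
        """Like UPDATE ... IF email = expected: apply only if the condition holds."""
        with self._lock:
            if self._rows.get(username) != expected_email:
                return False
            self._rows[username] = new_email
            return True

    def get(self, username: str) -> Optional[str]:
        with self._lock:
            return self._rows.get(username)
```

With the check and the write performed as one atomic step, only one of the two sessions on slide 13 can create the 'jbellis' row; the other is told that its insert was not applied.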
  • 18. Triggers
    CREATE TRIGGER <name> ON <table> USING <classname>;
  • 19. Trigger implementation
    class MyTrigger implements ITrigger
    {
        public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update)
        {
            ...
        }
    }
  • 20. Experimental!
    • Relies on internal RowMutation, ColumnFamily classes
    • [partition] key is a ByteBuffer
    • Expect changes in 2.1
  • 21. Cursors (before)
    CREATE TABLE timeline (
        user_id uuid,
        tweet_id timeuuid,
        tweet_author uuid,
        tweet_body text,
        PRIMARY KEY (user_id, tweet_id)
    );
    SELECT * FROM timeline
    WHERE (user_id = :last_key AND tweet_id > :last_tweet)
       OR token(user_id) > token(:last_key)
    LIMIT 100
  • 22. Cursors (after)
    SELECT * FROM timeline
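The bookkeeping behind the "before" query can be sketched in Python to show what a client had to do by hand. Everything here is illustrative: `token` is a stand-in hash rather than Cassandra's partitioner, string and integer keys replace uuid and timeuuid, and `fetch_page` and `scan_all` are hypothetical helpers operating on an in-memory dict.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, int]  # (user_id, tweet_id)


def token(user_id: str) -> int:
    """Stand-in for the partitioner's token(); any stable hash works here."""
    return int.from_bytes(hashlib.md5(user_id.encode()).digest()[:8], "big")


def fetch_page(timeline: Dict[str, List[int]], last: Optional[Row], limit: int) -> List[Row]:
    """Return up to `limit` rows that follow `last` in (token, tweet_id) order."""
    rows: List[Row] = []
    for user_id in sorted(timeline, key=token):
        for tweet_id in sorted(timeline[user_id]):
            if last is not None:
                last_key, last_tweet = last
                same_partition = user_id == last_key and tweet_id > last_tweet
                later_partition = token(user_id) > token(last_key)
                if not (same_partition or later_partition):
                    continue  # already returned on an earlier page
            rows.append((user_id, tweet_id))
            if len(rows) == limit:
                return rows
    return rows


def scan_all(timeline: Dict[str, List[int]], page_size: int) -> List[Row]:
    """Drive fetch_page until a short page signals the end of the table."""
    out: List[Row] = []
    last: Optional[Row] = None
    while True:
        page = fetch_page(timeline, last, page_size)
        out.extend(page)
        if len(page) < page_size:
            return out
        last = page[-1]
```

In the "after" version the client issues the plain SELECT and the paging state is tracked for it, so the resume condition no longer appears in application code.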
  • 23. Other CQL improvements
  • 24. Other CQL improvements
    • SELECT DISTINCT pk
  • 25. Other CQL improvements
    • SELECT DISTINCT pk
    • CREATE TABLE IF NOT EXISTS table
  • 26. Other CQL improvements
    • SELECT DISTINCT pk
    • CREATE TABLE IF NOT EXISTS table
    • SELECT ... AS
      • SELECT event_id, dateOf(created_at) AS creation_date
  • 27. Other CQL improvements
    • SELECT DISTINCT pk
    • CREATE TABLE IF NOT EXISTS table
    • SELECT ... AS
      • SELECT event_id, dateOf(created_at) AS creation_date
    • ALTER TABLE DROP column
  • 28. On-Heap/Off-Heap. Java process: On-Heap (managed by GC), Off-Heap (not managed by GC)
  • 29. Read path (per sstable). Memory: bloom filter
  • 30. Read path (per sstable). Memory: bloom filter, partition key cache
  • 31. Read path (per sstable). Memory: bloom filter, partition key cache, partition summary
  • 32. Read path (per sstable). Memory: bloom filter, partition key cache, partition summary. Disk: partition index
  • 33. Read path (per sstable). Memory: bloom filter, partition key cache, partition summary, compression offsets. Disk: partition index
  • 34. Read path (per sstable). Memory: bloom filter, partition key cache, partition summary, compression offsets. Disk: partition index, data
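The lookup order in the read path slides can be illustrated with a toy model. This is a sketch under assumptions, not Cassandra's on-disk format: `ToySSTable` and `bit_positions` are hypothetical names, the "disk" structures are Python lists, the filter uses one byte per bit for clarity, and compression offsets are left out.

```python
import bisect
import hashlib
from typing import Dict, List, Optional


def bit_positions(key: str, k: int, m: int) -> List[int]:
    """Derive k bloom-filter positions in [0, m) from one SHA-256 digest."""
    digest = hashlib.sha256(key.encode()).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big")
    return [(h1 + i * h2) % m for i in range(k)]


class ToySSTable:
    """Sorted keys plus a bloom filter, key cache, sampled summary, and index."""

    def __init__(self, rows: Dict[str, str], bits: int = 1024, k: int = 3, sample: int = 4) -> None:
        self.bits, self.k, self.sample = bits, k, sample
        self.index: List[str] = sorted(rows)                      # partition index ("disk")
        self.data: List[str] = [rows[key] for key in self.index]  # data file ("disk")
        self.summary: List[str] = self.index[::sample]            # every Nth key ("memory")
        self.key_cache: Dict[str, int] = {}                       # key -> data position
        self.index_reads = 0                                      # index entries examined
        self.bloom = bytearray(bits)
        for key in self.index:
            for pos in bit_positions(key, k, bits):
                self.bloom[pos] = 1

    def might_contain(self, key: str) -> bool:
        return all(self.bloom[pos] for pos in bit_positions(key, self.k, self.bits))

    def get(self, key: str) -> Optional[str]:
        if not self.might_contain(key):
            return None  # definitely absent: skip this sstable entirely
        if key in self.key_cache:
            return self.data[self.key_cache[key]]
        # The summary narrows the search to one block of `sample` index entries.
        block = bisect.bisect_right(self.summary, key) - 1
        if block < 0:
            return None
        start = block * self.sample
        for offset in range(start, min(start + self.sample, len(self.index))):
            self.index_reads += 1
            if self.index[offset] == key:
                self.key_cache[key] = offset
                return self.data[offset]
        return None  # bloom filter false positive
```

A second read of the same key is served from the key cache without touching the index, which is the shortcut the partition key cache box in the diagram represents.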
  • 35. Off heap in 2.0: partition key bloom filter, 1-2GB per billion partitions
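The "1-2GB per billion partitions" figure is consistent with the textbook bloom filter sizing formula, m = -n ln(p) / (ln 2)^2 bits for n keys at false-positive rate p. The sketch below applies that formula; the rates of 1% and 0.1% are assumptions chosen for illustration, and the function name is hypothetical. Cassandra's own sizing may round differently.

```python
import math


def bloom_filter_bytes(n_keys: int, fp_rate: float) -> float:
    """Optimal bloom filter size in bytes for n_keys at the given false-positive rate."""
    bits = -n_keys * math.log(fp_rate) / (math.log(2) ** 2)
    return bits / 8


if __name__ == "__main__":
    for rate in (0.01, 0.001):
        gb = bloom_filter_bytes(10**9, rate) / 1e9
        print(f"fp_rate={rate}: {gb:.2f} GB per billion partitions")
```

Under these assumptions one billion keys need roughly 1.2 GB at a 1% false-positive rate and roughly 1.8 GB at 0.1%.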
  • 36. Off heap in 2.0: compression metadata, ~1-3GB per TB compressed
  • 37. Off heap in 2.0: partition index summary (depends on rows per partition)
  • 38. Compaction
    • Single-pass, always
    • LCS performs STCS in L0
  • 39. Healthy leveled compaction (diagram of levels L0 to L5)
  • 40. Sad leveled compaction (diagram of levels L0 to L5)
  • 41. STCS in L0 (diagram of levels L0 to L5)
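The size-tiered grouping that LCS falls back to in L0 can be sketched as follows. This is an approximation for illustration: the function names are hypothetical, and the parameter values (0.5, 1.5, and a minimum of 4 sstables) are assumptions modeled on the size-tiered strategy's bucket_low, bucket_high, and min_threshold options.

```python
from typing import List


def size_tiered_buckets(sizes: List[int], bucket_low: float = 0.5,
                        bucket_high: float = 1.5) -> List[List[int]]:
    """Group sstable sizes so each bucket holds sstables of similar size."""
    buckets: List[List[int]] = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if bucket_low * avg <= size <= bucket_high * avg:
                bucket.append(size)
                break
        else:
            buckets.append([size])  # no similar bucket found: start a new one
    return buckets


def compaction_candidates(sizes: List[int], min_threshold: int = 4) -> List[List[int]]:
    """Buckets with enough similar-sized sstables to be worth compacting together."""
    return [b for b in size_tiered_buckets(sizes) if len(b) >= min_threshold]
```

For example, sizes of 10, 11, 9, 10, 100, 105, and 1000 MB form three buckets, and only the bucket of four small sstables reaches the assumed threshold.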
  • 42. Rapid Read Protection: NONE
  • 43. Cassandra 2.1
  • 44. User defined types
    CREATE TYPE address (
        street text,
        city text,
        zip_code int,
        phones set<text>
    );
    CREATE TABLE users (
        id uuid PRIMARY KEY,
        name text,
        addresses map<text, address>
    );
    SELECT id, name, addresses.city, addresses.phones FROM users;

     id       | name    | addresses.city | addresses.phones
    ----------+---------+----------------+--------------------------
     63bf691f | jbellis | Austin         | {'512-4567', '512-9999'}
  • 45. Collection indexing
    CREATE TABLE songs (
        id uuid PRIMARY KEY,
        artist text,
        album text,
        title text,
        data blob,
        tags set<text>
    );
    CREATE INDEX song_tags_idx ON songs(tags);
    SELECT * FROM songs WHERE 'blues' IN tags;

     id       | album         | artist            | tags                  | title
    ----------+---------------+-------------------+-----------------------+------------------
     5027b27e | Country Blues | Lightnin' Hopkins | {'acoustic', 'blues'} | Worrying My Mind
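The idea behind indexing a set column can be shown with a small inverted index. This is a conceptual sketch only: `build_tag_index` is a hypothetical helper over an in-memory dict, and it does not reflect how Cassandra stores or distributes its secondary indexes.

```python
from collections import defaultdict
from typing import Dict, Set


def build_tag_index(songs: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each tag to the ids of the songs whose tag set contains it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for song_id, tags in songs.items():
        for tag in tags:
            index[tag].add(song_id)
    return dict(index)


if __name__ == "__main__":
    songs = {"5027b27e": {"acoustic", "blues"}, "a1b2c3d4": {"rock"}}
    print(build_tag_index(songs).get("blues", set()))
```

Each element of the collection becomes its own index entry, so a lookup by a single tag finds the row without scanning every song.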
  • 46. Inefficient bloom filters + =?
  • 47. Inefficient bloom filters + =
  • 48. Inefficient bloom filters + =
  • 49. Inefficient bloom filters
  • 50. HyperLogLog applied
  • 51. HLL and compaction
  • 52. HLL and compaction
  • 53. HLL and compaction
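The slide images are not part of this transcript, so the following rests on an assumption: that the point is sizing the bloom filter of a compacted sstable, where adding up the key counts of overlapping inputs overestimates the output, while mergeable HyperLogLog sketches estimate the size of the union. The code is a textbook HyperLogLog with 1024 registers, not Cassandra's implementation, and all names are hypothetical.

```python
import hashlib
import math
from typing import Iterable, List

P = 10       # number of index bits
M = 1 << P   # 1024 registers


def _hash64(item: str) -> int:
    return int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")


def hll_registers(items: Iterable[str]) -> List[int]:
    """Build HyperLogLog registers for a collection of keys."""
    regs = [0] * M
    for item in items:
        h = _hash64(item)
        idx = h >> (64 - P)                        # first P bits pick a register
        rest = h & ((1 << (64 - P)) - 1)           # remaining 54 bits
        rank = (64 - P) - rest.bit_length() + 1    # position of the leftmost 1-bit
        regs[idx] = max(regs[idx], rank)
    return regs


def hll_merge(a: List[int], b: List[int]) -> List[int]:
    """Registers of the union are the element-wise maximum."""
    return [max(x, y) for x, y in zip(a, b)]


def hll_estimate(regs: List[int]) -> float:
    """Estimate the number of distinct keys seen by these registers."""
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in regs)
    zeros = regs.count(0)
    if raw <= 2.5 * M and zeros:
        return M * math.log(M / zeros)  # small-range (linear counting) correction
    return raw
```

Two sstables of 10,000 keys each that share 5,000 keys compact to 15,000 distinct keys; merging their registers gives an estimate near that figure rather than the 20,000 obtained by adding the input counts.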
  • 54. More-efficient repair
  • 55. More-efficient repair
  • 56. More-efficient repair
  • 57. More-efficient repair
  • 58. More-efficient repair
  • 59. More-efficient repair
  • 60. More-efficient repair
  • 61. More-efficient repair
  • 62. More-efficient repair
  • 63. 2.1 roadmap
    • January 2014
  • 64. Questions?