Cassandra Summit 2013 Keynote

  1. 1. Cassandra Summit 2013
      Jonathan Ellis | DataStax CTO | Project Chair, Apache Cassandra
  2. 2. Five Years of Cassandra
      [Timeline, Jul-08 to Jul-13: releases 0.1, 0.3, 0.6, 0.7, 1.0, 1.2 ... 2.0, and DSE]
  3. 3. Core Values
      * Massive scalability
      * High performance
      * Reliability / Availability
      [Chart: Cassandra, HBase, VoltDB, Redis, MySQL]
  4. 4. VLDB Benchmark (RWS)
      [Chart: throughput (ops/sec) vs. number of nodes for Cassandra, HBase, VoltDB, Redis, MySQL; Cassandra called out]
  5. 5. Endpoint Benchmark (RW)
      [Chart: Cassandra, HBase, MongoDB; x-axis ticks 1, 2, 4, 8, 16, 32; Cassandra called out]
  6. 6. Vox Populi
      #Cassandra13
  7. 7. New Core Value
      * Massive scalability
      * High performance
      * Reliability / Availability
      * Ease of use
      CREATE TABLE users (
        id uuid PRIMARY KEY,
        name text,
        state text,
        birth_date int
      );
      CREATE INDEX ON users(state);
      SELECT * FROM users
      WHERE state = 'Texas'
      AND birth_date > 1950;
  8. 8. CQL is working
      "Coming from a relational database background we found the transition to Cassandra to be very straightforward. There are a few simple key concepts one must grasp at first but ever since its been smooth sailing for us."
      Boris Wolf, Comcast
      * Key concepts?
      * The next Top Data Model (Tomorrow, 11:00, Festival)
      * The State of CQL (Tomorrow, 3:10, Marina)
  9. 9. 1.2 for Developers
      * CQL3
        - Thrift compatibility
        - Collections
        - Data dictionary
        - Auth support
        - Hadoop support
        - Native drivers
      * Tracing
      * Atomic batches
  10. 10. CQL/Thrift compatibility
      * http://www.datastax.com/dev/blog/cql3-for-cassandra-experts
      * http://www.datastax.com/dev/blog/thrift-to-cql3
      * http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
      * TLDR: Yes
  11. 11. Collections
      CREATE TABLE users (
        id uuid PRIMARY KEY,
        name text,
        state text,
        birth_date int
      );
  12. 12. Collections
      CREATE TABLE users (
        id uuid PRIMARY KEY,
        name text,
        state text,
        birth_date int
      );
      CREATE TABLE users_addresses (
        user_id uuid REFERENCES users,
        email text
      );
      SELECT *
      FROM users NATURAL JOIN users_addresses;
  13. 13. Collections
      CREATE TABLE users (
        id uuid PRIMARY KEY,
        name text,
        state text,
        birth_date int
      );
      CREATE TABLE users_addresses (
        user_id uuid REFERENCES users,
        email text
      );
      SELECT *
      FROM users NATURAL JOIN users_addresses;
      [X: the REFERENCES / JOIN approach is crossed out]
  14. 14. Collections
      CREATE TABLE users (
        id uuid PRIMARY KEY,
        name text,
        state text,
        birth_date int,
        email_addresses set<text>
      );
  15. 15. Collections
      CREATE TABLE users (
        id uuid PRIMARY KEY,
        name text,
        state text,
        birth_date int,
        email_addresses set<text>
      );
      UPDATE users
      SET email_addresses = email_addresses +
        {'jbellis@gmail.com', 'jbellis@datastax.com'};
  16. 16. Data Dictionary
      cqlsh:system> use system;
      cqlsh:system> select columnfamily_name from schema_columnfamilies
                    where keyspace_name = 'system';
       columnfamily_name
      -----------------------
       batchlog
       hints
       local
       peer_events
       peers
       schema_columnfamilies
       schema_columns
       schema_keyspaces
  17. 17. Authentication
      [cassandra.yaml]
      authenticator: PasswordAuthenticator
      # DSE offers KerberosAuthenticator as well
  18. 18. Authentication
      [cassandra.yaml]
      authenticator: PasswordAuthenticator
      # DSE offers KerberosAuthenticator as well
      CREATE USER robin WITH PASSWORD 'manager' SUPERUSER;
      ALTER USER cassandra WITH PASSWORD 'newpassword';
      LIST USERS;
      DROP USER cassandra;
  19. 19. Authorization
      [cassandra.yaml]
      authorizer: CassandraAuthorizer
      GRANT select ON audit TO jonathan;
      GRANT modify ON users TO robin;
      GRANT all ON ALL KEYSPACES TO lara;
  20. 20. Native Drivers
      * CQL native protocol: efficient, lightweight, asynchronous
      * Java (GA): https://github.com/datastax/java-driver
      * .NET (Beta): https://github.com/datastax/csharp-driver
      * Coming soon: Python, PHP, Ruby
      * Java and .NET Client Drivers (Tomorrow, 4:10, Marina)
  21. 21. Tracing
      cqlsh:foo> INSERT INTO bar (i, j) VALUES (6, 2);
      Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9
       activity                            | timestamp    | source    | source_elapsed
      -------------------------------------+--------------+-----------+----------------
       Determining replicas for mutation   | 00:02:37,015 | 127.0.0.1 | 540
       Sending message to /127.0.0.2       | 00:02:37,015 | 127.0.0.1 | 779
       Message received from /127.0.0.1    | 00:02:37,016 | 127.0.0.2 | 63
       Applying mutation                   | 00:02:37,016 | 127.0.0.2 | 220
       Acquiring switchLock                | 00:02:37,016 | 127.0.0.2 | 250
       Appending to commitlog              | 00:02:37,016 | 127.0.0.2 | 277
       Adding to memtable                  | 00:02:37,016 | 127.0.0.2 | 378
       Enqueuing response to /127.0.0.1    | 00:02:37,016 | 127.0.0.2 | 710
       Sending message to /127.0.0.1       | 00:02:37,016 | 127.0.0.2 | 888
       Message received from /127.0.0.2    | 00:02:37,017 | 127.0.0.1 | 2334
       Processing response from /127.0.0.2 | 00:02:37,017 | 127.0.0.1 | 2550
  22. 22. Tracing an Antipattern
      CREATE TABLE queues (
        id text,
        created_at timeuuid,
        value blob,
        PRIMARY KEY (id, created_at)
      );
       id      | created_at | value
       myqueue | 3092e86f   | 9b0450d30de9
       myqueue | 0867f47c   | fc7aee5f6a66
       myqueue | 5fc74be0   | 668fdb3a2196
  26. 26. 10000 events, 9999 dequeued
      cqlsh:foo> SELECT * FROM queues WHERE id = 'myqueue' ORDER BY created_at LIMIT 1;
      Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9
       activity                                 | timestamp    | source    | source_elapsed
      ------------------------------------------+--------------+-----------+----------------
       execute_cql3_query                       | 19:31:05,650 | 127.0.0.1 | 0
       Sending message to /127.0.0.3            | 19:31:05,651 | 127.0.0.1 | 541
       Message received from /127.0.0.1         | 19:31:05,651 | 127.0.0.3 | 39
       Executing single-partition query         | 19:31:05,652 | 127.0.0.3 | 943
       Acquiring sstable references             | 19:31:05,652 | 127.0.0.3 | 973
       Merging memtable contents                | 19:31:05,652 | 127.0.0.3 | 1020
       Merging data from memtables and sstables | 19:31:05,652 | 127.0.0.3 | 1081
       Read 1 live cells and 19998 tombstoned   | 19:31:05,686 | 127.0.0.3 | 35072
       Enqueuing response to /127.0.0.1         | 19:31:05,687 | 127.0.0.3 | 35220
       Sending message to /127.0.0.1            | 19:31:05,687 | 127.0.0.3 | 35314
       Message received from /127.0.0.3         | 19:31:05,687 | 127.0.0.1 | 36908
       Processing response from /127.0.0.3      | 19:31:05,688 | 127.0.0.1 | 37650
       Request complete                         | 19:31:05,688 | 127.0.0.1 | 38047
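
The trace above spends most of its time reading tombstones before it reaches the single live row. A minimal Python sketch of that effect follows. It is a toy in-memory model, not Cassandra's storage engine: `QueuePartition`, `Cell`, and `peek` are hypothetical names, and it counts one tombstone per dequeued row, whereas the trace counts cells, so its figure is larger.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    created_at: int          # stands in for the timeuuid clustering column
    value: bytes
    deleted: bool = False    # True once dequeued: a tombstone, not a physical removal


class QueuePartition:
    """Toy model of one partition, with cells kept in created_at order."""

    def __init__(self):
        self.cells = []

    def enqueue(self, created_at, value):
        # Assumes callers append in increasing created_at order.
        self.cells.append(Cell(created_at, value))

    def peek(self):
        """Return (first live cell or None, number of tombstones scanned)."""
        tombstones = 0
        for cell in self.cells:
            if cell.deleted:
                tombstones += 1   # must be read and skipped on every query
                continue
            return cell, tombstones
        return None, tombstones


queue = QueuePartition()
for i in range(10000):
    queue.enqueue(i, b"payload")

# 9999 dequeues, each leaving a tombstone at the head of the partition.
for cell in queue.cells[:9999]:
    cell.deleted = True

head, tombstones_scanned = queue.peek()
print(head.created_at, tombstones_scanned)
```

In this model the cost of reading the head of the queue grows with the number of rows already consumed, which is the pattern the trace makes visible.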
  27. 27. 1.2 for Operators
      * Concurrent CREATE TABLE
      * Virtual nodes
      * "Fat node" support (5-10TB)
      * JBOD improvements
      - Off-heap bloom filters, compression metadata
      - Improved compaction throttle
      - Parallel leveled compaction
  28. 28. Memory Usage
      [Diagram: Java process memory under the JVM. Java heap: on-heap, managed by GC. Native memory: off-heap, not managed by GC]
  29. 29. Read Path (per SSTable)
      [Diagram: Memory and Disk]
  30. 30. Read Path (per SSTable)
      [Diagram: Memory and Disk; components: Bloom filter]
  31. 31. Read Path (per SSTable)
      [Diagram: Memory and Disk; components: Bloom filter, partition key cache]
  32. 32. Read Path (per SSTable)
      [Diagram: Memory and Disk; components: Bloom filter, partition key cache, partition summary]
  33. 33. Read Path (per SSTable)
      [Diagram: Memory and Disk; components: Bloom filter, partition key cache, partition summary, partition index]
  34. 34. Read Path (per SSTable)
      [Diagram: Memory and Disk; components: Bloom filter, partition key cache, partition summary, partition index, compression offsets]
  35. 35. Read Path (per SSTable)
      [Diagram: Memory and Disk; components: Bloom filter, partition key cache, partition summary, partition index, compression offsets, data]
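
The first component in the read path above is the Bloom filter. As a rough illustration of why it sits first, here is a minimal Python sketch of a generic Bloom filter guarding a lookup. It is a textbook structure under assumed parameters (10 bits per key, 7 hash positions, SHA-256 double hashing), not Cassandra's implementation, and `SSTable` here is only a hypothetical in-memory stand-in.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive all positions from two halves of one digest (double hashing).
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class SSTable:
    """Hypothetical stand-in: a dict plays the role of the on-disk data."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.bloom = BloomFilter(num_bits=10 * max(1, len(self.rows)), num_hashes=7)
        for key in self.rows:
            self.bloom.add(key)
        self.disk_reads = 0

    def get(self, key):
        if not self.bloom.might_contain(key):
            return None              # definitely absent: skip this SSTable entirely
        self.disk_reads += 1         # stands in for the index lookup and data read
        return self.rows.get(key)    # may still be None on a false positive


table = SSTable({b"k%d" % i: b"v" for i in range(1000)})
print(table.get(b"k1"), table.get(b"absent"))
```

A negative answer from the filter is always correct, so a read for a key that is not in the SSTable usually costs no disk access at all.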
  36. 36. Off Heap in 1.2+
      * Partition key bloom filter
        1-2GB per billion partitions
      [Diagram: read path components: data, partition summary, Bloom filter, partition index, compression offsets, partition key cache; Memory and Disk]
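
The "1-2GB per billion partitions" figure is in the range the standard Bloom filter sizing formula gives, bits per key = -ln(p) / (ln 2)^2 for a target false-positive rate p. The Python sketch below works through that arithmetic. It assumes a textbook Bloom filter and illustrative false-positive rates of 1% and 0.1%; the rates a given cluster actually uses depend on its configuration, and the function names are hypothetical.

```python
import math


def bloom_bits_per_key(false_positive_rate):
    """Optimal bits per key for a textbook Bloom filter."""
    return -math.log(false_positive_rate) / (math.log(2) ** 2)


def bloom_gb(num_keys, false_positive_rate):
    """Approximate filter size in GB (10^9 bytes) for num_keys keys."""
    return num_keys * bloom_bits_per_key(false_positive_rate) / 8 / 1e9


for p in (0.01, 0.001):
    print(p, round(bloom_bits_per_key(p), 2), round(bloom_gb(1_000_000_000, p), 2))
# 0.01 9.59 1.2
# 0.001 14.38 1.8
```

At that size the filter alone is a sizeable share of a typical heap, which is the motivation for keeping it outside the garbage-collected heap.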
  37. 37. Off Heap in 1.2+
      * Compression metadata
        ~1-3GB per TB compressed
      [Diagram: read path components: data, partition summary, Bloom filter, partition index, compression offsets, partition key cache; Memory and Disk]
  38. 38. Not off Heap until 2.0
      * Partition index summary
        (Size cut in ~half in 1.2.5+)
      [Diagram: read path components: data, partition summary, Bloom filter, partition index, compression offsets, partition key cache; Memory and Disk]
  39. 39. Compaction Throttling
      [Chart 1: Throttling on partition boundaries; MB/s over time; partitions of 1000 and 10000 rows]
      [Chart 2: Throttling using a constant RateLimiter; MB/s over time; partitions of 1000 and 10000 rows]
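
One reading of the two charts above is that throttle granularity determines how bursty the writes are: checking the limit only between partitions lets a large partition go out at full disk speed, while a limiter consulted at small, constant intervals keeps the rate flat. The Python sketch below simulates that in virtual time. It is an illustration under assumed numbers (rows as the unit instead of MB, a target of 100 rows/s, a disk capable of 1000 rows/s), and `simulate` is a hypothetical helper, not Cassandra code.

```python
def simulate(partition_rows, target_rate, disk_rate, chunk_rows=None):
    """Virtual-time simulation of a throttled writer.

    chunk_rows=None: the throttle is checked only at partition boundaries.
    chunk_rows=N:    the throttle is checked after every N rows.
    Returns (largest unthrottled burst in rows, total elapsed seconds).
    """
    now = 0.0
    written = 0
    largest_burst = 0
    for rows in partition_rows:
        step = rows if chunk_rows is None else chunk_rows
        remaining = rows
        while remaining > 0:
            burst = min(step, remaining)
            now += burst / disk_rate      # this burst goes out at full disk speed
            written += burst
            remaining -= burst
            largest_burst = max(largest_burst, burst)
            # Throttle check: wait until the average rate is back at the target.
            now = max(now, written / target_rate)
    return largest_burst, now


partitions = [1000, 10000, 1000, 1000]
print(simulate(partitions, target_rate=100, disk_rate=1000))
# (10000, 130.0)
print(simulate(partitions, target_rate=100, disk_rate=1000, chunk_rows=100))
# (100, 130.0)
```

Both runs take the same total time, since the average rate is identical. Only the size of the largest full-speed burst differs.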
  40. 40. DSE 3.1
      * Cassandra 1.2 shipping in DataStax Enterprise 3.1 on June 30
      * Updated with CQL and composite column support for Hive and Solr
      * Includes Solr 4.3
  41. 41. DataStax DevCenter
  42. 42. Cassandra 2.0
  43. 43. Removed in 2.0
  44. 44. Removed in 2.0
      #CASSANDRA13
  46. 46. Removed in 2.0
      * Token range bisection on bootstrap
  47. 47. Removed in 2.0
      * Token range bisection on bootstrap
      * Supercolumns (only internally)
  48. 48. Removed in 2.0
      * Token range bisection on bootstrap
      * Supercolumns (only internally)
        public List<ColumnOrSuperColumn> get_slice(...)
  49. 49. Removed in 2.0
      * Token range bisection on bootstrap
      * Supercolumns (only internally)
        public List<ColumnOrSuperColumn> get_slice(...)
      * Disk compatibility for < 1.2.5
  50. 50. Removed in 2.0
      * Token range bisection on bootstrap
      * Supercolumns (only internally)
        public List<ColumnOrSuperColumn> get_slice(...)
      * Disk compatibility for < 1.2.5
      * Network compatibility for < 1.2
  51. 51. New in 2.0
      * CAS (Compare-and-set = lightweight transactions)
      * Eager retries
      * Improved compaction
      * Triggers (experimental)
      * CQL cursors
  52. 52. CAS: The Problem
      Session 1:
        SELECT * FROM users
        WHERE username = 'jbellis'
        [empty resultset]
        INSERT INTO users (...)
        VALUES ('jbellis', ...)
      Session 2:
        SELECT * FROM users
        WHERE username = 'jbellis'
        [empty resultset]
        INSERT INTO users (...)
        VALUES ('jbellis', ...)
  53. 53. Why Locking Doesn't Work
      [Diagram: Client (locks) -> request -> Coordinator -> internal request -> Replica]
  54. 54. Why Locking Doesn't Work
      [Diagram: Client (locks) -> request -> Coordinator -> internal request -> Replica; X marks a failure]
  55. 55. Why Locking Doesn't Work
      [Diagram: Client (locks) -> request -> Coordinator -> internal request -> Replica; X marks a failure; hint]
  56. 56. Why Locking Doesn't Work
      [Diagram: Client (locks) -> request -> Coordinator -> internal request -> Replica; X marks a failure; hint; timeout response]
  57. 57. Paxos
      * All operations are quorum-based
      * Each replica sends information about unfinished operations to the leader during prepare
      * Paxos Made Simple
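
The two properties on the Paxos slide (quorum-based operations, and replicas reporting unfinished operations during prepare) can be seen in a minimal single-decree Paxos sketch. The Python below is a teaching model in the spirit of "Paxos Made Simple", not Cassandra's implementation: there is no networking or durability, ballots are plain integers, and `Acceptor` and `propose` are hypothetical names.

```python
class Acceptor:
    """One replica's Paxos state for a single decision."""

    def __init__(self):
        self.promised = 0      # highest ballot promised so far
        self.accepted = None   # (ballot, value) of the last accepted proposal, if any

    def prepare(self, ballot):
        if ballot <= self.promised:
            return False, None
        self.promised = ballot
        return True, self.accepted   # report any unfinished operation

    def accept(self, ballot, value):
        if ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted = (ballot, value)
        return True


def propose(acceptors, ballot, value):
    """Run one round. Return the chosen value, or None if no quorum was reached."""
    quorum = len(acceptors) // 2 + 1

    # Phase 1: prepare. Collect promises plus any previously accepted proposals.
    replies = [a.prepare(ballot) for a in acceptors]
    promises = [accepted for ok, accepted in replies if ok]
    if len(promises) < quorum:
        return None

    # If a replica reports an unfinished operation, finish it instead of our own.
    in_progress = [acc for acc in promises if acc is not None]
    if in_progress:
        value = max(in_progress, key=lambda bv: bv[0])[1]

    # Phase 2: accept. The value is chosen once a quorum accepts it.
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= quorum else None


replicas = [Acceptor(), Acceptor(), Acceptor()]

# An earlier proposer reached only one replica with "A" and then stalled.
replicas[0].prepare(1)
replicas[0].accept(1, "A")

# A later proposer wants "B" but learns of "A" during prepare and completes it.
print(propose(replicas, 2, "B"))   # A
print(propose(replicas, 1, "C"))   # None (stale ballot)
```

The second proposer never gets to write "B": once it sees the in-progress "A" it is obliged to finish that operation first, which is what keeps concurrent writers from silently overwriting each other.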
  58. 58. CAS Details
      * 3 round trips vs 1 for normal updates
      * Paxos state is durable
      * Immediate consistency with no leader election or failover
      * ConsistencyLevel.SERIAL
  59. 59. Use with Caution
      * Great for 1% of your application
      * Eventual consistency is your friend
        Eventual Consistency != Hopeful Consistency (Today, 1:30, Golden Gate)
  60. 60. Using CAS
      UPDATE USERS
      SET email = 'jonathan@datastax.com', ...
      WHERE username = 'jbellis'
      IF email = 'jbellis@datastax.com';
      INSERT INTO USERS (username, email, ...)
      VALUES ('jbellis', 'jbellis@datastax.com', ... )
      IF NOT EXISTS;
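
To make the semantics of the two statements above concrete, here is a small Python sketch contrasting the select-then-insert race from "CAS: The Problem" with conditional writes that report whether they were applied. It is a single-process stand-in: a local lock plays the role that a Paxos round plays in a cluster, which only works here because everything runs in one process. `UsersTable` and its methods are hypothetical names, and `other@example.com` is a made-up address for the second session.

```python
import threading


class UsersTable:
    """Toy in-memory table keyed by username."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def select(self, username):
        return self._rows.get(username)

    def insert(self, username, email):
        # Plain write: last writer wins.
        self._rows[username] = email

    def insert_if_not_exists(self, username, email):
        # Analogue of INSERT ... IF NOT EXISTS: check and write as one step.
        with self._lock:
            if username in self._rows:
                return False
            self._rows[username] = email
            return True

    def update_if(self, username, expected_email, new_email):
        # Analogue of UPDATE ... IF email = <expected>.
        with self._lock:
            if self._rows.get(username) != expected_email:
                return False
            self._rows[username] = new_email
            return True


# The problem: both sessions see an empty result, then both insert.
racy = UsersTable()
seen_by_1 = racy.select("jbellis")
seen_by_2 = racy.select("jbellis")
racy.insert("jbellis", "jbellis@datastax.com")   # session 1
racy.insert("jbellis", "other@example.com")      # session 2 silently overwrites
print(racy.select("jbellis"))                    # other@example.com

# With conditional writes, the second session is told it lost.
safe = UsersTable()
first = safe.insert_if_not_exists("jbellis", "jbellis@datastax.com")
second = safe.insert_if_not_exists("jbellis", "other@example.com")
print(first, second)                             # True False
```

The caller has to check the returned flag and decide what to do when a conditional write is not applied.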
  61. 61. Triggers
      CREATE TRIGGER <name> ON <table> EXECUTE <classname>;
  62. 62. Trigger Implementation
      class MyTrigger implements ITrigger
      {
          public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update)
          {
              ...
          }
      }
  63. 63. Experimental!
      * Relies on internal RowMutation, ColumnFamily classes
      * [partition] key is a ByteBuffer
      * Expect changes in 2.1
  64. 64. Follow Up Discussion
      #CASSANDRA13
      * After What were they Thinking? (DataStax Lounge)
      * Meet the Experts (Today, 3:00, C370)
      * Happy Hour (Tonight, 6:15)
  65. 65. Thank You
      Cassandra Summit 2013
