Cassandra 2.0 to 2.1

Talk given at Codebits 2014 on Cassandra - features and enhancements in 2.0 and upcoming features in 2.1

Cassandra 2.0 to 2.1 Presentation Transcript

  • 1. Cassandra 2.0 to 2.1
    Codebits, Lisbon, April 2014
    www.datastax.com @DataStaxEU
  • 2. About Me
    Johnny Miller, Solutions Architect
    •  @CyanMiller
    •  www.linkedin.com/in/johnnymiller
    We are hiring: www.datastax.com/careers @DataStaxCareers
    ©2014 DataStax. Do not distribute without consent. @DataStaxEU
  • 3. DataStax - Introduction
    •  Founded in April 2010
    •  We drive Apache Cassandra™
    •  400+ customers (25 of the Fortune 100)
    •  200+ employees
    •  Home to Apache Cassandra™ Chair & most committers
    •  Contribute ~90% of code into the Apache Cassandra™ code base
    •  Headquartered in San Francisco Bay area
    •  European headquarters established in London
    •  Offices in France and Germany
    Our Goal: To be the first and best database choice for online applications
  • 4. Why DataStax?
    DataStax supports both the open source community and enterprises.
    Open Source/Community
    •  Apache Cassandra (employ Cassandra chair and 90+% of the committers)
    •  DataStax Community Edition
    •  DataStax OpsCenter
    •  DataStax DevCenter
    •  DataStax Drivers/Connectors
    •  Online Documentation
    •  Online Training
    •  Mailing lists and forums
    Enterprise Software
    •  DataStax Enterprise Edition
    •  Certified Cassandra
    •  Built-in Analytics
    •  Built-in Enterprise Search
    •  Enterprise Security
    •  DataStax OpsCenter
    •  Expert Support
    •  Consultative Help
    •  Professional Training
  • 5. History of Cassandra
  • 6. Cassandra Adoption
    Source: http://db-engines.com/en/ranking, April 2014
  • 7. Core Values
    •  Massive Scalability
    •  High Performance
    •  Reliability/Availability
  • 8. Performance and Scale
    “In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments with a linear increasing throughput.”
    Solving Big Data Challenges for Enterprise Application Performance Management, Tilmann Rabl, et al., August 2012. Benchmark paper presented at the Very Large Database Conference, 2012.
    http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf
    End Point Independent NoSQL Benchmark
    Lowest in latency…
    Netflix Cloud Benchmark…
    Highest in throughput…
    http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
    http://www.datastax.com/wp-content/uploads/2013/02/WP-Benchmarking-Top-NoSQL-Databases.pdf
  • 9. Performance and Scale
    Cassandra works for small to huge deployments.
    •  Cassandra @ Netflix
        •  80+ Clusters
        •  2500+ nodes
        •  4 Data Centres (Amazon Regions)
        •  > 1 Trillion transactions per day
    •  Cassandra @ eBay
        •  > 250TB of data, dozens of nodes, multiple data centres
        •  > 6 billion writes, > 5 billion reads per day
    Source: http://planetcassandra.org
  • 10. Availability
    •  Cassandra was designed with the understanding that system/hardware failures can and do occur
    •  Peer-to-peer, distributed system
    •  All nodes the same – masterless with no single point of failure
    •  Read/Write-anywhere and across data centres
    “Cassandra, our distributed cloud persistence store which is distributed across all zones and regions, dealt with the loss of one third of its regional nodes without any loss of data or availability”.
    http://techblog.netflix.com/2012/07/lessons-netflix-learned-from-aws-storm.html
    “During Hurricane Sandy, we lost an entire data center. Completely. Lost. It. Our application fail-over resulted in us losing just a few moments of serving requests for a particular region of the country, but our data in Cassandra never went offline.”
    http://planetcassandra.org/blog/post/outbrain-touches-over-80-of-all-us-online-users-with-help-from-cassandra/
  • 11. Cassandra 1.2
  • 12. New Core Value
    •  Massive Scalability
    •  High Performance
    •  Reliability/Availability
    •  Ease of Use

        CREATE TABLE users (
          id uuid PRIMARY KEY,
          name text,
          country text,
          birth_date int
        );

        CREATE INDEX ON users(country);

        -- birth_date is not indexed, so this query needs ALLOW FILTERING
        SELECT * FROM users
        WHERE country='Portugal'
        AND birth_date > 1950
        ALLOW FILTERING;

        Cluster cluster = Cluster.builder()
            .addContactPoints("10.158.02.40", "10.158.02.44")
            .build();
        Session session = cluster.connect("akeyspace");
        session.execute(
            "INSERT INTO user (username, password) " +
            "VALUES('johnny', 'password1234')"
        );
  • 13. CQL3 Delivers
    “Coming from a relational database background we found the transition to Cassandra to be very straightforward. There are a few simple key concepts one must grasp at first but ever since it’s been smooth sailing for us.” - Boris Wolf, Comcast
    Find out more:
    •  Introduction to CQL3 and Data Modeling. Slides: http://bit.ly/jpm_003, Video: http://bit.ly/jpm_004 [Cassandra Meetup, Helsinki, Feb 2014]
  • 14. Native Drivers and Protocol
    Traditionally, Cassandra clients (Hector, Astyanax¹, etc.) were developed using Thrift. Cassandra 1.2 introduced CQL3, the CQL native protocol and native drivers, which together offer a new, easier way of using Cassandra.
    Why?
    •  Easier to develop and model
    •  Best practices for building modern distributed applications
    •  Integrated tools and experience
    •  Enable Cassandra to evolve more easily and support new features
    ¹ Astyanax is being updated to include the native driver: https://github.com/Netflix/astyanax/wiki/Astyanax-over-Java-Driver
  • 15. Native Drivers
    •  Java
    •  C#
    •  Python
    •  C++ (beta)
    •  ODBC (beta)
    •  Clojure
    •  Erlang
    •  Node.js
    •  Ruby
    •  Plus many, many more…
    Get them here: http://www.datastax.com/download
    Find out more:
    •  Going Native With Apache Cassandra http://bit.ly/jpm_001 [QCon, London 2014]
  • 16. Asynchronous Read

        ResultSetFuture future = session.executeAsync(
            "SELECT * FROM user");
        // get() blocks until the result is available
        for (Row row : future.get()) {
            String userName = row.getString("username");
            String password = row.getString("password");
        }

    Note: The future returned implements Guava's ListenableFuture interface. This means you can use all of Guava's Futures¹ methods!
    ¹ http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/util/concurrent/Futures.html
  • 17. Read with Callbacks

        final ResultSetFuture future =
            session.executeAsync("SELECT * FROM user");
        // The listener runs on the supplied executor once the query completes
        future.addListener(new Runnable() {
            public void run() {
                // getUninterruptibly() avoids the checked exceptions of get(),
                // which run() is not allowed to throw
                for (Row row : future.getUninterruptibly()) {
                    String userName = row.getString("username");
                    String password = row.getString("password");
                }
            }
        }, executor);
  • 18. Parallelize Calls

        int queryCount = 99;
        List<ResultSetFuture> futures = new ArrayList<ResultSetFuture>();
        // Issue all queries without waiting for any of them
        for (int i = 0; i < queryCount; i++) {
            futures.add(
                session.executeAsync("SELECT * FROM user "
                    + "WHERE username = '" + i + "'"));
        }
        // Then collect the results, blocking on each future in turn
        for (ResultSetFuture future : futures) {
            for (Row row : future.getUninterruptibly()) {
                // do something
            }
        }
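The fan-out pattern on slide 18 (start every query first, then collect the results) can be tried without a cluster. The sketch below uses only `java.util.concurrent`; `fakeQuery` is a hypothetical stand-in for `session.executeAsync(...)` and `FanOutSketch` is a name invented for this illustration, not part of any driver.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class FanOutSketch {
    // Hypothetical stand-in for session.executeAsync(...): returns a future
    // that completes with a fake "row" for the given username.
    static CompletableFuture<String> fakeQuery(int username) {
        return CompletableFuture.supplyAsync(() -> "row-for-" + username);
    }

    // Start every query first, then wait for the results in submission order.
    static List<String> queryAll(int queryCount) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (int i = 0; i < queryCount; i++) {
            futures.add(fakeQuery(i));   // non-blocking: queries run concurrently
        }
        List<String> rows = new ArrayList<>();
        for (CompletableFuture<String> future : futures) {
            rows.add(future.join());     // blocks only until this query is done
        }
        return rows;
    }
}
```

Calling `FanOutSketch.queryAll(99)` mirrors the slide's `queryCount = 99`: the total wait is bounded by the slowest query rather than by the sum of all of them.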
  • 19. Query Tracing
    •  You can turn tracing on or off for queries with the TRACING ON | OFF command.
    •  This can help you understand what Cassandra is doing and identify any performance problems.
    Find out more:
    •  http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2
  • 20. Also worth noting…
    •  Atomic Batches
    •  CQL3 Authentication Support
    •  CQL3 Collections Data Type
    •  Virtual Nodes (vnodes)
    •  JBOD improvements
    •  Parallel leveled compaction
    •  LZ4 compression
    Plus much, much more…
  • 21. Cassandra 2.0
    DataStax Enterprise 4.0
  • 22. Lightweight Transactions (LWT)
    Why?
    •  Solve a class of race conditions in Cassandra that you would otherwise need to install an external locking manager to solve.
    Syntax:

        INSERT INTO customer_account (customerID, customer_email)
        VALUES ('Johnny', 'jmiller@datastax.com')
        IF NOT EXISTS;

        UPDATE customer_account
        SET customer_email='jmiller@datastax.com'
        WHERE customerID='Johnny'
        IF customer_email='jmiller@datastax.com';

    Example Use Case:
    •  Registering a user
  • 23. Race Condition
    Client 1:

        SELECT name
        FROM users
        WHERE username = 'johnny';
        (0 rows)

        INSERT INTO users
        (username, name, email,
        password, created_date)
        VALUES ('johnny',
        'Johnny Miller',
        ['jmiller@datastax.com'],
        'ba27e03fd9...',
        '2011-06-20 13:50:00');

    Client 2:

        SELECT name
        FROM users
        WHERE username = 'johnny';
        (0 rows)

        INSERT INTO users
        (username, name, email,
        password, created_date)
        VALUES ('johnny',
        'Johnny Miller',
        ['jmiller@datastax.com'],
        'ea24e13ad9...',
        '2011-06-20 13:50:01');

    This one wins!
  • 24. Lightweight Transactions

        INSERT INTO users
        (username, name, email,
        password, created_date)
        VALUES ('johnny',
        'Johnny Miller',
        ['jmiller@datastax.com'],
        'ba27e03fd9...',
        '2011-06-20 13:50:00')
        IF NOT EXISTS;

         [applied]
        -----------
              True

        INSERT INTO users
        (username, name, email,
        password, created_date)
        VALUES ('johnny',
        'Johnny Miller',
        ['jmiller@datastax.com'],
        'ea24e13ad9...',
        '2011-06-20 13:50:01')
        IF NOT EXISTS;

         [applied] | username | created_date   | name
        -----------+----------+----------------+----------------
             False | johnny   | 2011-06-20 ... | Johnny Miller
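The difference between slides 23 and 24 is the difference between check-then-act and an atomic compare-and-set. The following is a single-process analogy only, assuming an in-memory map in place of the `users` table; it says nothing about how Cassandra implements the feature, and `RegistrationSketch` and its method names are invented for this illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class RegistrationSketch {
    // username -> password hash; stands in for the users table
    private final Map<String, String> users = new ConcurrentHashMap<>();

    // SELECT ... WHERE username = ? returning (0 rows)
    boolean isFree(String username) {
        return !users.containsKey(username);
    }

    // Plain INSERT: an upsert, so a later write silently replaces an earlier one
    void insert(String username, String passwordHash) {
        users.put(username, passwordHash);
    }

    // INSERT ... IF NOT EXISTS: true corresponds to "[applied] True"
    boolean insertIfNotExists(String username, String passwordHash) {
        return users.putIfAbsent(username, passwordHash) == null;
    }

    String passwordOf(String username) {
        return users.get(username);
    }
}
```

If two clients both call `isFree("johnny")` before either calls `insert`, both see a free name and the second `insert` wins, as on slide 23. With `insertIfNotExists`, only the first caller gets `true` and the stored hash stays that of the first writer.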
  • 25. Lightweight Transactions
    •  Uses the Paxos algorithm
    •  All operations are quorum-based, i.e. we can lose nodes and it's still going to work!
    •  See Paxos Made Simple - http://bit.ly/paxosmadesimple
    •  Consequences of Lightweight Transactions
        •  4 round trips vs. 1 for normal updates
        •  Operations are done on a per-partition basis
        •  Will be going across data centres to obtain consensus
        •  Cassandra user will need read and write access, i.e. you get back the row!
    Great for 1% of your app, but eventual consistency is still your friend!
    Find out more:
    •  http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
    •  Eventual Consistency != Hopeful Consistency http://www.youtube.com/watch?v=A6qzx_HE3EU
  • 26. Batch Statements and LWT

        BEGIN BATCH
          UPDATE foo SET z = 1 WHERE x = 'a' AND y = 1;
          UPDATE foo SET z = 2 WHERE x = 'a' AND y = 2 IF t = 4;
        APPLY BATCH;

    •  Allows you to group multiple conditional updates in a batch as long as all those updates apply to the same partition
  • 27. Triggers

        CREATE TRIGGER <name> ON <table> USING <classname>;

        class MyTrigger implements ITrigger {
            public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update) {
                ...
            }
        }

    •  The trigger defined on a table fires before a requested DML statement occurs
    •  You place the trigger code in a lib/triggers subdirectory of the Cassandra installation directory
    •  A full working example can be found in the Cassandra examples/triggers directory
    •  EXPERIMENTAL: Expect changes in Cassandra 2.1
    Find out more:
    •  http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-0-prototype-triggers-support
  • 28. In-Memory Tables (DataStax Enterprise 4.0)

        CREATE TABLE users (
          uid text,
          fname text,
          lname text,
          PRIMARY KEY (uid)
        ) WITH compaction={'class': 'MemoryOnlyStrategy', 'size_limit_in_mb':100}
        AND memtable_flush_period_in_ms=3600000;

    •  We expect that in-memory column families will be on average 20-50% faster, with significantly less observed variance on read queries.
    •  A great use case is workloads with a lot of overwrites
    •  Caution: more tables = more memory = GC death spiral
    Find out more:
    •  http://www.datastax.com/2014/02/why-we-added-in-memory-to-cassandra
  • 29. Static Columns
    A static column is a special column that is shared by all the rows of the same partition

        CREATE TABLE foo (
          x text,
          y bigint,
          t bigint static,
          z bigint,
          PRIMARY KEY (x, y) );

        INSERT INTO foo (x,y,t, z) VALUES ('a', 1, 1, 10);
        INSERT INTO foo (x,y,t, z) VALUES ('a', 2, 2, 20);

        SELECT * from foo;

         x | y | t | z
        ---+---+---+----
         a | 1 | 2 | 10
         a | 2 | 2 | 20
  • 30. Static Columns
    •  Considerations
        •  Use them when you want to store some per-partition “static” information alongside clustered rows and still want to be able to query both of those with a single SELECT.
        •  Only columns not part of the PRIMARY KEY can be static.
        •  Only tables with at least one clustering column can have static columns.
        •  Tables with the COMPACT STORAGE option cannot have static columns.
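Why both rows on slide 29 show `t = 2` becomes clearer with a toy model of one partition: a single slot for the static column plus one entry per clustering key. This is a sketch for intuition only, not Cassandra's storage layout; the class and field names are invented, and the `TreeMap` over partition keys is a simplification (Cassandra orders partitions by token).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class StaticColumnSketch {
    // One partition of table foo: a single static value t, plus rows keyed by y.
    static final class Partition {
        Long t;                                        // static column: one value per partition
        final Map<Long, Long> zByY = new TreeMap<>();  // regular column z: one value per row
    }

    private final Map<String, Partition> partitions = new TreeMap<>();

    // INSERT INTO foo (x, y, t, z) VALUES (...)
    void insert(String x, long y, long t, long z) {
        Partition p = partitions.computeIfAbsent(x, k -> new Partition());
        p.t = t;             // overwrites the value seen by every row of the partition
        p.zByY.put(y, z);
    }

    // SELECT * FROM foo, rendered as "x | y | t | z" lines
    List<String> selectAll() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Partition> e : partitions.entrySet()) {
            Partition p = e.getValue();
            for (Map.Entry<Long, Long> row : p.zByY.entrySet()) {
                out.add(e.getKey() + " | " + row.getKey() + " | " + p.t + " | " + row.getValue());
            }
        }
        return out;
    }
}
```

Replaying the two inserts from slide 29 yields `a | 1 | 2 | 10` and `a | 2 | 2 | 20`: the second insert replaced the partition-wide `t`, so the first row reports the new value too.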
  • 31. No more CQL2
    •  CQL2 is not supported any more.
    •  CQL2 has been discouraged for a while, and if you are still using it, do not upgrade until you have rewritten your application to use CQL3.
  • 32. Clustered columns can be indexed

        CREATE TABLE foo (
          a int,
          b int,
          c int,
          PRIMARY KEY (a, b)
        );

    •  It was previously impossible to create an index on the ‘b’ column, since that column was a special clustered column.
    •  This restriction has now been lifted and you can create indexes on clustered columns just as if they were regular CQL columns.

        CREATE INDEX ON foo (b);
  • 33. Conditional create/drop ks/table/index statements in CQL3
    •  You can now use IF EXISTS and IF NOT EXISTS conditionals for dropping and creating tables and keyspaces.
  • 34. Automatic Paging
    •  This is great!
    •  It has historically been difficult to get huge result sets out of Cassandra. It has generally been necessary to explicitly enumerate your row keys in reasonably small batches (1000 rows or so per batch would be common).
    •  This feature now allows you to get huge result sets (including “select * from table”), and have the server automatically page the results, while the client is just able to trivially iterate over the entire result set.
    •  This should remove a very common cause of OOMs (out of memory exceptions), and should make data exploration much easier.
  • 35. Paging (before)

        CREATE TABLE timeline (
          user_id uuid,
          tweet_id timeuuid,
          tweet_author uuid,
          tweet_body text,
          PRIMARY KEY (user_id, tweet_id)
        );

        SELECT *
        FROM timeline
        WHERE (user_id = :last_key
        AND tweet_id > :last_tweet)
        OR token(user_id) > token(:last_key)
        LIMIT 100
  • 36. Paging (after)

        SELECT * FROM timeline
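From the client's point of view, automatic paging means iterating over a result set while pages are fetched behind the scenes. A minimal sketch of that idea, assuming an in-memory list in place of the server: `PagingSketch`, `fetchSize` and `pagesFetched` are names chosen for this illustration and do not describe the native protocol or any driver class.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class PagingSketch implements Iterator<Integer> {
    private final List<Integer> table;   // stand-in for the rows held by the server
    private final int fetchSize;         // rows per page, must be > 0
    private List<Integer> page = new ArrayList<>();
    private int nextOffset = 0;          // plays the role of the server's paging state
    private int posInPage = 0;
    int pagesFetched = 0;                // exposed only to make the behaviour observable

    PagingSketch(List<Integer> table, int fetchSize) {
        this.table = table;
        this.fetchSize = fetchSize;
    }

    // Fetch the next page only when the current one has been fully consumed.
    private void fetchIfNeeded() {
        if (posInPage == page.size() && nextOffset < table.size()) {
            int end = Math.min(nextOffset + fetchSize, table.size());
            page = new ArrayList<>(table.subList(nextOffset, end));
            nextOffset = end;
            posInPage = 0;
            pagesFetched++;
        }
    }

    @Override
    public boolean hasNext() {
        fetchIfNeeded();
        return posInPage < page.size();
    }

    @Override
    public Integer next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.get(posInPage++);
    }
}
```

The caller just loops with `hasNext()`/`next()`; at most `fetchSize` rows are held at a time, which is the property that removes the out-of-memory risk described on slide 34.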
  • 37. Thrift
    •  Replace Thrift HsHa with LMAX Disruptor based implementation
    •  Because of the substantial changes at the Thrift transport layer, be sure to update your app to use Thrift clients compatible with Cassandra 2.0, and test your application thoroughly before going to production.
  • 38. Streaming
    •  This is a major rewrite of the Cassandra streaming protocol, and should be much more robust and reliable than the previous implementation.
    •  It includes:
        •  several performance optimizations
        •  multiple parallel sstable streaming
        •  better logging
        •  more metrics
  • 39. Reduce request latency with rapid retry protection/eager retries
    •  This should substantially help with your 95%-99% latency.
    •  By rapidly detecting that a query was sent to a slow node, this feature will greatly speed up performing a retry on another node.
    •  There is new metadata associated with each table:
        speculative_retry='99.0PERCENTILE' //default
    •  Be careful – retries will have an effect on what throughput you can achieve in your cluster.

        ALTER TABLE users WITH speculative_retry = '10ms';

        ALTER TABLE users WITH speculative_retry = '99percentile';
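The idea behind a fixed-threshold setting such as `'10ms'` can be sketched as: ask one replica, and if it has not answered within the threshold, ask a second one and take whichever answers first. The code below is an illustration of that idea under simplifying assumptions (two replicas, no error handling), not Cassandra's coordinator logic; `SpeculativeRetrySketch` and its parameters are invented names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class SpeculativeRetrySketch {
    // Ask the primary replica; if it has not answered within thresholdMs,
    // also ask the backup replica. The first answer wins.
    static <T> T read(Supplier<T> primary, Supplier<T> backup, long thresholdMs) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        try {
            CompletableFuture<T> result = new CompletableFuture<>();
            pool.execute(() -> result.complete(primary.get()));
            pool.schedule(() -> {
                if (!result.isDone()) {
                    result.complete(backup.get());   // speculative second request
                }
            }, thresholdMs, TimeUnit.MILLISECONDS);
            return result.join();
        } finally {
            pool.shutdownNow();                      // abandon the slower request
        }
    }
}
```

The extra request is exactly the throughput cost the slide warns about: a lower threshold trims tail latency but sends more duplicate reads to the cluster.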
  • 40. Official way to disable compactions
    •  nodetool disableautocompaction
    •  nodetool enableautocompaction
  • 41. Remove row-level bloom filters
    •  This should be a largely invisible change since there was never a noticeable performance improvement from having these bloom filters.
    •  However, you will see a reduction in memory usage as a result.
  • 42. add default_time_to_live
    •  This has been a long-requested Cassandra feature and makes auto-expiring data easier.
    •  You can have a single per-table TTL that will always be set unless overridden by the client.
    •  It also allows for significant performance optimizations on the server side.
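The rule "a per-table TTL applies unless the client overrides it" can be written down as a two-line resolution function. This is a sketch of the rule as stated on the slide, with invented names; how an explicit TTL of 0 interacts with a table default is deliberately not modelled here.

```java
final class TtlSketch {
    // Table default applies when the write carries no TTL of its own (null);
    // an explicit per-write TTL takes precedence.
    static int effectiveTtlSeconds(int tableDefaultTtl, Integer writeTtl) {
        return writeTtl != null ? writeTtl : tableDefaultTtl;
    }

    // A TTL of 0 means "no expiry"; otherwise the cell is live until
    // writtenAtSeconds + ttlSeconds (exclusive).
    static boolean isLive(long writtenAtSeconds, int ttlSeconds, long nowSeconds) {
        return ttlSeconds == 0 || nowSeconds < writtenAtSeconds + ttlSeconds;
    }
}
```

For example, with a table default of one day (86400 seconds), a write without a TTL expires after a day, while a write that specifies 60 seconds expires after a minute.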
  • 43. New network topology snitch for mixed ec2/other envs
    •  There is a new snitch (YamlFileNetworkTopologySnitch) and a new yaml file (cassandra-topology.yaml) that will be used if you select it.
    •  This snitch should probably be used for any cluster that spans both EC2 and non-EC2 environments.
  • 44. Removed compatibility with pre-1.2.5 sstables and network messages
    •  This is very important as it means that you must upgrade to Cassandra 1.2.6 (or equivalent DSE) or later before upgrading to Cassandra 2.0.x or DSE 4.0.x.
  • 45. Improve memory use defaults
    •  Memtables now use ¼ of your heap by default instead of ⅓.
    •  Additionally, the write timeout has been dramatically lowered to 2 seconds from 10 seconds, and the read timeout has been changed to 5 seconds.
  • 46. add SHOW SESSION <tracing-session> command
    •  If you aren’t already using tracing to debug your dev and production clusters, then start doing so.
    •  It’s one of the most powerful tools that you have at your disposal to understand what is going on.
    •  This lets you explicitly specify which session you want to display the output for.
    •  Previously you would have had to manually query it from the system_traces.sessions and system_traces.events tables.
  • 47. Single-pass compaction
    •  This should noticeably improve the performance of compaction since Cassandra no longer has to read through each sstable twice.
  • 48. Compact hottest sstables first and optionally omit coldest from compaction entirely
    •  Read-coldness (how [in]frequently a row is read) is now taken into consideration during compaction.
    •  If you have a lot of cold data, this could greatly reduce the amount of unnecessary re-compaction.
  • 49. Leveled compaction performs size-tiered compactions in L0
    •  If LCS gets behind, read performance deteriorates as we have to check bloom filters on many sstables in L0.
    •  For wide rows, this can mean having to seek for each one since the BF doesn't help us reject much.
    •  Performing size-tiered compaction in L0 will mitigate this until we can catch up on merging it into higher levels
  • 50. New CQL-aware SSTableWriter
    •  Prior to Cassandra 2.0.4, it was possible to write SSTables for CQL3 tables, but only with a lot of difficulty.
    •  Particularly with complex schemas, this is very complicated and error prone, and should be deprecated as an approach.
    •  Instead the new CQL3-aware SSTableWriter should be used:

        String schema = "CREATE TABLE foo (c1 int, c2 text, c3 float, PRIMARY KEY (c1, c2))";
        String insert = "INSERT INTO foo(c1, c2, c3) VALUES (?, ?, ?)";
        CQLSSTableWriter writer = CQLSSTableWriter.builder()
            .inDirectory("path/to/directory")   // output directory (placeholder path)
            .forTable(schema)
            .using(insert)
            .build();

        writer.addRow(3, "foo", 2.3f);
        writer.addRow(1, "bar", 0.0f);
        writer.close();
  • 51. Plus more…
    •  Java 7 is now required!
    •  Tracking statistics on clustered columns allows eliminating unnecessary sstables from the read path
    •  Faster partition index lookups and cache reads by improving performance of off-heap memory
    •  Faster reads of compressed data by switching from CRC32 to Adler checksums
    •  JEMalloc support for off-heap allocation
    •  The potentially dangerous countPendingHints JMX call has been replaced by a Hints Created metric
    •  The on-heap partition cache (“row cache”) has been removed
    •  Vnodes are on by default in Cassandra (off by default in DataStax Enterprise).
    And more…
  • 52. Find out more…
    •  Cassandra 2.0 documentation http://www.datastax.com/documentation/cassandra/2.0/
    •  DataStax Enterprise 4.0 documentation http://www.datastax.com/documentation/datastax_enterprise/4.0/
    •  What’s new in Cassandra 2.0 http://www.datastax.com/wp-content/uploads/2013/09/WP-DataStax-WhatsNewC2.0.pdf
    •  New CQL features in Cassandra 2.0.6 http://www.datastax.com/dev/blog/cql-in-2-0-6
    •  What’s under the hood in Cassandra 2.0 http://www.datastax.com/dev/blog/whats-under-the-hood-in-cassandra-2-0
    •  Facebook’s Cassandra paper, annotated and compared to Apache Cassandra 2.0 http://www.datastax.com/documentation/articles/cassandra/cassandrathenandnow.html
  • 53. Cassandra 2.1
  • 54. User Defined Types

        CREATE TYPE address (
          street text,
          city text,
          zip_code int,
          phones set<text>
        );

        CREATE TABLE users (
          id uuid PRIMARY KEY,
          name text,
          addresses map<text, address>
        );

        SELECT id, name, addresses.city, addresses.phones FROM users;

         id       | name   | addresses.city | addresses.phones
        ----------+--------+----------------+------------------------------
         63bf691f | johnny | London         | {'0201234567', '0796622222'}
  • 55. User Defined Types
    Considerations
    •  You cannot update only parts of a UDT value; you have to overwrite the whole thing every time (a limitation of the current implementation that may change).
    •  A UDT value is always read in its entirety under the hood (as of the current implementation at least).
    •  UDTs are not meant to store large and complex "documents" as of their current implementation, but rather to help make the denormalization of small amounts of data more convenient and flexible.
    •  It is possible to use a UDT as the type of any CQL column, including clustering ones.
    Find out more:
    •  http://www.datastax.com/dev/blog/cql-in-2-1
  • 56. Secondary indexes on collections

        CREATE TABLE songs (
          id uuid PRIMARY KEY,
          artist text,
          album text,
          title text,
          data blob,
          tags set<text>
        );

        CREATE INDEX song_tags_idx ON songs(tags);

        SELECT * FROM songs WHERE tags CONTAINS 'blues';

         id       | album         | artist            | tags                  | title
        ----------+---------------+-------------------+-----------------------+------------------
         5027b27e | Country Blues | Lightnin' Hopkins | {'acoustic', 'blues'} | Worrying My Mind
  • 57. Secondary indexes on map keys
    •  If you prefer indexing the map keys, you can do so by creating a KEYS index and by using CONTAINS KEY

        CREATE TABLE products (
          id int PRIMARY KEY,
          description text,
          price int,
          categories set<text>,
          features map<text, text>
        );

        CREATE INDEX feat_key_index ON products(KEYS(features));

        SELECT id, description
        FROM products
        WHERE features CONTAINS KEY 'refresh-rate';

         id    | description
        -------+-----------------------------
         34134 | 120-inch 1080p 3D plasma TV
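Conceptually, an index on map keys is an inverted index: for each key it records which rows contain that key, so `CONTAINS KEY` becomes a lookup rather than a scan. The sketch below illustrates that concept in memory; it is not how Cassandra stores secondary indexes, the names are invented, and it does not remove stale entries when a row is overwritten.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class CollectionIndexSketch {
    // products.id -> features map (the base table)
    private final Map<Integer, Map<String, String>> featuresById = new HashMap<>();
    // feature key -> ids of products whose features map has that key
    private final Map<String, Set<Integer>> idsByFeatureKey = new HashMap<>();

    // INSERT INTO products (id, features) VALUES (?, ?)
    void insert(int id, Map<String, String> features) {
        featuresById.put(id, features);
        for (String key : features.keySet()) {
            idsByFeatureKey.computeIfAbsent(key, k -> new TreeSet<>()).add(id);
        }
    }

    // SELECT id FROM products WHERE features CONTAINS KEY ?
    Set<Integer> containsKey(String key) {
        return idsByFeatureKey.getOrDefault(key, Collections.emptySet());
    }
}
```

An index on collection values (slide 56, `CONTAINS 'blues'`) follows the same shape, with the set elements or map values as the lookup key instead of the map keys.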
  • 58. Counters++
    •  Simpler implementation, no more edge cases
    •  Possible to properly repair now
    •  Significantly less garbage and internode traffic generated
    •  Better performance for 99% of uses
  • 59. Row Cache

        CREATE TABLE notifications (
          target_user text,
          notification_id timeuuid,
          source_id uuid,
          source_type text,
          activity text,
          PRIMARY KEY (target_user, notification_id)
        )
        WITH CLUSTERING ORDER BY (notification_id DESC)
        AND caching = 'rows_only'
        AND rows_per_partition_to_cache = '3';
  • 60. Thrift post-Cassandra 2.1
    •  There is a proposal to freeze Thrift starting with 2.1.0
    •  http://bit.ly/freezethrift
    •  Thrift will be retained for backwards compatibility, but there will be no new features or changes to the Thrift API after 2.1.0
    “CQL3 is almost two years old now and has proved to be the better API that Cassandra needed. CQL drivers have caught up with and passed the Thrift ones in terms of features, performance, and usability. CQL is easier to learn and more productive than Thrift.” - Jonathan Ellis, Apache Cassandra Chair
  • 61. 2.1 Roadmap
    •  Beta1 - 20th Feb
    •  Beta2 - ?
    •  RC - ?
    •  Final release currently expected mid-2014
  • 62. Find Out More
    DataStax:
    •  http://www.datastax.com
    Getting Started:
    •  http://www.datastax.com/documentation/gettingstarted/index.html
    Training:
    •  http://www.datastax.com/training
    Downloads:
    •  http://www.datastax.com/download
    Documentation:
    •  http://www.datastax.com/docs
    Developer Blog:
    •  http://www.datastax.com/dev/blog
    Community Site:
    •  http://planetcassandra.org
    Webinars:
    •  http://planetcassandra.org/Learn/CassandraCommunityWebinars