Fears, Misconceptions, and Accepted Anti-patterns of a first time Cassandra Adopter
Ben Christenson
Cassandra is a great strategy even if you aren’t looking for infinite OPS

Published in: Technology

  1. Fears, Misconceptions, and Accepted Anti-patterns of a first time Cassandra Adopter
     Ben Christenson
  2. Who am I?
     ● Full Stack Developer / Architect at Kinetic Data
     ● MSP Cassandra Meetups since August 2014
     ● I like cool technology, whisky, and Brazilian jiu-jitsu.
  3. Why am I presenting?
     ● Cassandra is a great strategy even if you aren’t looking for infinite OPS
        ○ Lots of articles for newbies and experts, not a lot of content on non-extreme use
     ● Give back to the meetup
        ○ I enjoy hearing about real implementations
        ○ Meetup is one of the reasons we chose Cassandra
     ● A little symbiotic selfishness
        ○ There may be better patterns that I don’t know about
        ○ Developing a more technical version of the presentation with sample code, metrics, etc.
  4. About Kinetic Data
     ● About 50 employees
     ● Main office in St. Paul, secondary office in Sydney, Australia, satellite offices throughout the US
     ● Develop software to improve service experience
  5. Cassandra Adoption
     ● What is Kinetic Request?
     ● Why Cassandra?
  6. What is Kinetic Request?
  7. What is Kinetic Request?
  8. What is Kinetic Request?
     Workflow automation through Kinetic Task
  9. What is Kinetic Request?
     ● Originally developed on the BMC Action Request System (ARS)
     ● Evolved into a Java webapp
     ● Planning for ARS decoupling for about 5 years
     ● January 2015 - Started Request CE development
     ● May 2015 - Demoed Request CE prototype
     ● March 2016 - Released Request CE v1.0.0
  10. Why Cassandra?
     ● Multi-datacenter replication
        ○ Significantly improved performance for global customers
     ● Durability
     ● Scalability
        ○ Easier to start at current scale
        ○ Scale out without migrating
        ○ Scale size and throughput
     ● Community
  11. Fears and Misconceptions
     ● StackOverflow-itis
     ● Kinetic Request is a deployed solution
     ● Cassandra is for write-heavy workloads
     ● Cassandra is for time series data
     ● Cassandra requires Java and Linux experts
  12. Stack Overflow-itis
     Fears
     ● ALLOW FILTERING
        ○ Don’t ever use that!
     ● Secondary indexes
        ○ Don’t use those.
     ● Collections
        ○ Probably shouldn’t use those either…
     ● Counters
        ○ Don’t you want to use something that works?
     ● Tombstones, Tombstones?!, TOMBSTONES!
     Reality
     ● Thank you MSP Cassandra Meetup for being the cure!
        ○ Everything was included for a reason
        ○ “You probably don’t want to use Xyz for that, but here is when you would.”
  13. Kinetic Request is a deployed solution
     Fear
     ● Very hard to find anyone using Cassandra for customer-managed solutions
     Reality
     ● Many of our customers already pay for Cassandra support
     ● Many of our customers understand the benefits
     ● As a deployed solution, our data usage and schemas don’t change frequently, and potential issues are (hopefully) caught before reaching the customer
     ● Possible future talk?
  14. Cassandra is for write-heavy workloads
     Misconception
     ● Cassandra is only for write-heavy workloads
     Reality
     ● Cassandra is extremely good at write-heavy workloads
     ● Cassandra can be implemented to be good at read-heavy workloads
     ● Even with heavy delete-and-insert updates, reads are still outperforming previous versions of Kinetic Request
  15. Cassandra is for time series data
     Misconception
     ● Cassandra is only for time series data
     Reality
     ● Cassandra is extremely good at time series data
     ● Cassandra is extremely good at replicating all data
     ● Just because Cassandra can handle extreme operations per second doesn’t mean it isn’t suited for lower OPS usage (and you can get away with a lot more)
  16. Cassandra requires Java and Linux experts
     Misconception
     ● We were going to need to become Java and Linux experts to use Cassandra
     Reality
     ● We needed to have a computer to use Cassandra
     ● We needed to be willing to learn more about Java, Linux commands, and Cassandra internals as we went
  17. Accepted Anti-patterns
     ● Atomicity and Read-Before-Write
     ● Distributed Joins
     ● Lookup Tables
     ● Delete-And-Insert Updates
     ● Queues
  18. Atomicity and Read before Write
     ● Read before write is often described as an anti-pattern
        ○ Leads to potential inconsistency or requires Compare and Set (CAS) / Lightweight Transaction (LWT) operations
        ○ Event sourcing may be an alternative
     ● There isn’t always an alternative
     ● Kinetic Request uses LWTs for “Optimistic Locking” (and uniqueness)
     ● Even at an order of magnitude slower than regular writes, LWTs are more than fast enough at our scale
        ○ < 10ms with Replication Factor 3
        ○ Order of magnitude faster than what is necessary for us
  19. Atomicity and Read before Write
     Sample Schema
        CREATE TABLE IF NOT EXISTS widgets (
          name text,
          tenant_id timeuuid,
          value text,
          version_id timeuuid,
          PRIMARY KEY ((tenant_id), name)
        ) WITH CLUSTERING ORDER BY (name ASC);
     Uniqueness
        INSERT INTO widgets (name, tenant_id, value, version_id)
        VALUES (:name, :tenant_id, :value, :version_id)
        IF NOT EXISTS;
     Optimistic Locking
        UPDATE widgets SET value = :value
        WHERE tenant_id = :tenant_id AND name = :name
        IF version_id = :version_id;
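The two conditional statements above can be pictured with a small in-memory sketch. This is not a Cassandra client: `WidgetStore` and its method names are hypothetical stand-ins for the widgets table, and rotating `version_id` on every successful update is an assumption (the UPDATE on the slide only sets `value`).

```python
import uuid


class WidgetStore:
    """In-memory stand-in for the widgets table (hypothetical, not a driver API)."""

    def __init__(self):
        # (tenant_id, name) -> {"value": ..., "version_id": ...}
        self._rows = {}

    def insert_if_not_exists(self, tenant_id, name, value):
        """Mimics INSERT ... IF NOT EXISTS: True only when the row was created."""
        key = (tenant_id, name)
        if key in self._rows:
            return False
        self._rows[key] = {"value": value, "version_id": uuid.uuid1()}
        return True

    def read(self, tenant_id, name):
        """Return a copy of the row so callers hold the version they read."""
        return dict(self._rows[(tenant_id, name)])

    def update_if_version(self, tenant_id, name, value, expected_version):
        """Mimics UPDATE ... IF version_id = :version_id."""
        row = self._rows.get((tenant_id, name))
        if row is None or row["version_id"] != expected_version:
            return False  # someone else changed the row since it was read
        row["value"] = value
        row["version_id"] = uuid.uuid1()  # assumption: each write gets a new version
        return True


store = WidgetStore()
tenant = uuid.uuid1()
created = store.insert_if_not_exists(tenant, "gear", "v1")
duplicate = store.insert_if_not_exists(tenant, "gear", "other")

stale_version = store.read(tenant, "gear")["version_id"]
first_update = store.update_if_version(tenant, "gear", "v2", stale_version)
second_update = store.update_if_version(tenant, "gear", "v3", stale_version)
```

The second update fails because it still carries the version that was read before the first update, which is the behaviour optimistic locking relies on.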
  20. Distributed Joins
     ● Cassandra doesn’t support joins, but you can do them in memory
        ○ Requires multiple sequential reads and/or multi-partition queries
        ○ Embedded or denormalized content is sometimes an alternative
     ● We use a distributed join between Submissions and Forms
        ○ Allows us to rename the form and maintain the link
        ○ Acceptable because forms are finite enough to keep in memory
     ● Christopher Batey has a great blog post on this: http://christopher-batey.blogspot.com/2015/02/cassandra-anti-pattern-distributed.html
  21. Distributed Joins
     Submissions Schema
        CREATE TABLE IF NOT EXISTS submissions (
          form_id timeuuid,
          id timeuuid,
          tenant_id timeuuid,
          ...
          PRIMARY KEY ((tenant_id), id)
        ) WITH CLUSTERING ORDER BY (id ASC);
     Forms Schema
        CREATE TABLE IF NOT EXISTS forms (
          name text,
          tenant_id timeuuid,
          ...
          PRIMARY KEY ((tenant_id), name)
        ) WITH CLUSTERING ORDER BY (name ASC);
     Queries
        SELECT * FROM submissions WHERE tenant_id = :tenant_id AND id = :id;
        SELECT * FROM forms WHERE tenant_id = :tenant_id AND id = :submission_form_id;
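As a rough illustration of the in-memory join between Submissions and Forms, here is a minimal sketch using plain dictionaries in place of query results. The function name, the `form_name` field, and the sample data are all hypothetical; the only idea taken from the slides is that submissions carry a `form_id` and forms are small enough to hold in memory.

```python
def join_submissions_with_forms(submissions, forms_by_id):
    """Attach the current form name to each submission via its form_id."""
    joined = []
    for submission in submissions:
        form = forms_by_id.get(submission["form_id"])
        joined.append({**submission, "form_name": form["name"] if form else None})
    return joined


# Forms kept in memory, keyed by id (hypothetical sample data).
forms_by_id = {"f1": {"id": "f1", "name": "Laptop Request"}}
submissions = [{"id": "s1", "form_id": "f1"}, {"id": "s2", "form_id": "f1"}]

joined = join_submissions_with_forms(submissions, forms_by_id)

# Renaming the form does not touch any submission row; the link still resolves.
forms_by_id["f1"]["name"] = "Hardware Request"
renamed = join_submissions_with_forms(submissions, forms_by_id)
```

Because submissions store only the form's id, a rename is a single write to the form and every later join picks up the new name.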
  22. Lookup Tables
     ● Lookup Tables are another form of Distributed Join
        ○ Table contains only data necessary for the query and an id used to look up the record from the source of truth
        ○ Requires a “multi-get” to retrieve actual records
        ○ Often considered an anti-pattern for similar reasons as distributed joins
     ● We use lookup tables for Webhooks and Submissions
        ○ Duplicating data would lead to storage requirements orders of magnitude higher
  23. Lookup Tables
        CREATE TABLE IF NOT EXISTS webhooks (
          id timeuuid,
          scheduled_at timestamp,
          tenant_id timeuuid,
          ...
          PRIMARY KEY ((tenant_id), id)
        ) WITH CLUSTERING ORDER BY (id ASC);

        CREATE TABLE IF NOT EXISTS webhooks_index (
          bucket text,
          id timeuuid,
          index_type text, // Tenant, Webhook, Parent
          index_key text,
          scheduled_at timestamp,
          tenant_id timeuuid,
          PRIMARY KEY ((tenant_id, bucket, index_type, index_key), scheduled_at, id)
        ) WITH CLUSTERING ORDER BY (scheduled_at DESC, id DESC) ...;
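The lookup-then-multi-get flow might look roughly like the sketch below, with lists and dictionaries standing in for the two tables. The helper names, the `partition` tuple layout, the bucket value, and the sample rows are assumptions made for illustration only.

```python
from datetime import datetime


def lookup_ids(index_rows, partition_key):
    """Return ids from one index partition, newest scheduled_at first
    (mirrors CLUSTERING ORDER BY (scheduled_at DESC, id DESC))."""
    rows = [row for row in index_rows if row["partition"] == partition_key]
    rows.sort(key=lambda row: (row["scheduled_at"], row["id"]), reverse=True)
    return [row["id"] for row in rows]


def multi_get(source_of_truth, ids):
    """Fetch the full records for the ids found in the index, preserving order."""
    return [source_of_truth[i] for i in ids if i in source_of_truth]


# Hypothetical source-of-truth records keyed by id.
webhooks = {
    "w1": {"id": "w1", "payload": "first"},
    "w2": {"id": "w2", "payload": "second"},
    "w3": {"id": "w3", "payload": "third"},
}

# Hypothetical index rows: (tenant_id, bucket, index_type, index_key) is the partition.
tenant_partition = ("t1", "2016-03", "Tenant", "t1")
index_rows = [
    {"partition": tenant_partition, "scheduled_at": datetime(2016, 3, 1), "id": "w1"},
    {"partition": tenant_partition, "scheduled_at": datetime(2016, 3, 3), "id": "w2"},
    {"partition": ("t1", "2016-03", "Webhook", "other"),
     "scheduled_at": datetime(2016, 3, 2), "id": "w3"},
]

ids = lookup_ids(index_rows, tenant_partition)
records = multi_get(webhooks, ids)
```

The index rows carry only the sort key and the id, so the full webhook data is stored once and fetched in a second step.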
  24. Delete-And-Insert Updates
     ● Fundamental problem:
        ○ Cassandra retrieves by primary key
        ○ Users want to search by values that change
        ○ Updating a primary key is done as a DELETE and INSERT (which leads to tombstones)
        ○ Want to minimize environmental complexity
     ● No simple solution for us
        ○ Try to minimize number of deletes for a given query path
        ○ Try to optimize for tombstones
  25. Delete-And-Insert Updates
     ● The biggest source of our DELETE-AND-INSERT usage is supporting ad-hoc querying of submissions
     ● Example ad-hoc query:
          values[Foo] IN ("Bar", "Baz")
          AND (
            values[Requested By]="ben.christenson"
            OR values[Requested For]="ben.christenson"
          )
     ● Our solution is similar to the C* Summit presentation on multi-criteria queries: http://fr.slideshare.net/ippontech/multi-criteria-queries-on-a-cassandra-application
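One way to think about the example query is as set operations over the submission IDs returned for each criterion. The sketch below is only an illustration of that idea: the tuple-based query tree, the `evaluate` function, and the sample index contents are hypothetical, and a dictionary stands in for the index table.

```python
def ids_for(index, key, value):
    """IDs of submissions whose indexed field `key` equals `value`."""
    return set(index.get((key, value), ()))


def evaluate(node, index):
    """Evaluate a query tree built from ("=", key, value), ("IN", key, values),
    ("AND", child, ...) and ("OR", child, ...) nodes."""
    op = node[0]
    if op == "=":
        _, key, value = node
        return ids_for(index, key, value)
    if op == "IN":
        _, key, values = node
        return set().union(*(ids_for(index, key, v) for v in values))
    if op == "AND":
        return set.intersection(*(evaluate(child, index) for child in node[1:]))
    if op == "OR":
        return set().union(*(evaluate(child, index) for child in node[1:]))
    raise ValueError(f"unsupported operator: {op}")


# Hypothetical index contents: (field, value) -> submission ids.
index = {
    ("Foo", "Bar"): {"s1", "s2"},
    ("Foo", "Baz"): {"s3"},
    ("Requested By", "ben.christenson"): {"s1"},
    ("Requested For", "ben.christenson"): {"s3", "s4"},
}

query = (
    "AND",
    ("IN", "Foo", ["Bar", "Baz"]),
    ("OR",
     ("=", "Requested By", "ben.christenson"),
     ("=", "Requested For", "ben.christenson")),
)

matches = evaluate(query, index)
```

Each leaf corresponds to one index lookup, so the leaves can be fetched independently before the sets are combined.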
  26. Delete-And-Insert Updates
     Writing
     ● Read record from Cassandra (including version_id)
     ● An “Indexer” class generates index sets from original and updated model
     ● Optimistically update the source of truth record
     ● Asynchronously create/delete necessary index records
     Reading
     ● Each criterion is a separate async query
     ● An in-memory evaluator aggregates the lookup table IDs
     ● The submissions associated with the resulting IDs are each retrieved asynchronously
     ● The in-memory evaluator “double checks” that the submissions match the query and may re-execute another search to fill in gaps for submissions that have been updated since the initial index queries (very rare)
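The “Indexer” step in the write path can be sketched as a set difference between the index entries of the original and updated records. This is a guess at the general shape, not the actual class: `index_entries`, `index_changes`, and the field names in the sample data are hypothetical.

```python
def index_entries(submission):
    """One (key, value) index entry per indexed field of a submission."""
    return {(key, value) for key, value in submission["values"].items()}


def index_changes(original, updated):
    """Work out which index rows to delete and which to insert for an update.
    Unchanged fields produce no writes, which keeps tombstones to a minimum."""
    before = index_entries(original) if original else set()
    after = index_entries(updated)
    return {"delete": before - after, "insert": after - before}


original = {"id": "s1", "values": {"Status": "Open", "Requested By": "ben.christenson"}}
updated = {"id": "s1", "values": {"Status": "Closed", "Requested By": "ben.christenson"}}

changes = index_changes(original, updated)
first_write = index_changes(None, original)
```

Only the changed field generates a delete and an insert; a brand new record generates inserts only.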
  27. Delete-And-Insert Updates
        CREATE TABLE IF NOT EXISTS submissions_index (
          tenant_id timeuuid,
          timeline text,
          bucket text, // '' for active or 'YYYY-mm'
          key text,
          value text,
          timestamp timestamp,
          submission_id timeuuid,
          PRIMARY KEY ((tenant_id, timeline, bucket, key), value, timestamp, submission_id)
        ) WITH CLUSTERING ORDER BY (value DESC, timestamp DESC, submission_id DESC)
        AND COMPACTION = {
          'class': 'LeveledCompactionStrategy',
          'sstable_size_in_mb': '256',
          'tombstone_threshold': '0.05',
          'unchecked_tombstone_compaction': 'true',
          'tombstone_compaction_interval': '3600'
        };
  28. Delete-And-Insert Updates
     ● Even with Delete-And-Insert and lookup tables, performance is acceptable
        ○ Supports queries that were previously impossible
        ○ Extremely complicated search queries still return in < 150ms
     ● Does have some caveats
        ○ Only supports AND, OR, IN, and = (would like to support !=, starts with, ends with, etc.)
        ○ Whenever an AND is used, at least one of the criteria must return fewer than 1000 matches
        ○ In order to support pagination, sort orders must be indexed independently (combination of date and uuid; we index multiple date properties)
  29. Queues
     ● Queues are one of the most commonly cited anti-patterns
     ● Problem comes down to tombstones again
        ○ Can be improved by truncating, knowing where live data begins, or complicated rotations
        ○ Can be improved by including additional technologies (real message queue)
  30. Queues
     ● One of the queue-like structures used by Kinetic Request is for Webhooks
        ○ Webhooks can fail to connect and should be automatically retried (put back on queue)
        ○ They happen often and have a very specific query path, so tombstones are worrisome
     ● In this case, the queue “event” can be processed initially by the server in memory
        ○ Write directly to the source of truth / historical index and avoid tombstones for normal executions
        ○ Only if the initial webhook fails is it added to the queue index
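The “try in memory first, queue only on failure” idea for webhooks could be sketched as follows. Lists stand in for the historical index and the queue index, and the `dispatch` function, the sender callables, and the use of `ConnectionError` as the failure signal are all assumptions for illustration.

```python
def dispatch(webhook, send, history, retry_queue):
    """Attempt the webhook immediately; only a failed attempt touches the queue."""
    try:
        send(webhook)
    except ConnectionError:
        retry_queue.append(webhook["id"])  # stands in for a write to the queue index
        return False
    history.append(webhook["id"])  # stands in for the source of truth / historical index
    return True


def working_sender(webhook):
    """Hypothetical sender that always succeeds."""
    return None


def failing_sender(webhook):
    """Hypothetical sender that simulates an unreachable endpoint."""
    raise ConnectionError("endpoint unreachable")


history, retry_queue = [], []
ok = dispatch({"id": "w1"}, working_sender, history, retry_queue)
failed = dispatch({"id": "w2"}, failing_sender, history, retry_queue)
```

In the normal case nothing is ever written to, or deleted from, the queue structure, so successful webhooks produce no tombstones there.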
  31. Queues
     ● Other styles of queues can’t necessarily be processed by the event server
        ○ Scheduled for the future
        ○ Handled by
     ● For this case, we are experimenting with an in-memory distributed queue
        ○ Hazelcast or Ignite (which have many other coordination benefits)
        ○ Avoids hitting tombstones by using Cassandra as a persistence mechanism only queried at startup
  32. Takeaways
     ● Cassandra has many benefits, even if you are not using it at extreme scales
     ● The barrier to entry is not as scary as it seems
     ● Play, play, play, test, test, test
     ● Find good resources
  33. Questions?
