At Sky, we use Cassandra for database persistence in our Online Video Platform - the system which delivers all OTT video content to both Sky and NOW TV customers - and yes, that includes handling huge spikes in traffic both when there's a big Premier League football match and when a new Game of Thrones season comes online!
This talk covers the following topics:
- A brief introduction to Cassandra, including what it’s good for, what it’s not good for, and why. We'll dig into how storage, reads, writes and conflict resolution work.
- Gotchas in an eventually-consistent DB - some interesting problems we encountered and the lessons we learned the hard way.
- Performing database schema and data evolution in Cassandra for a production app.
- Why this is important, and what we did at Sky to ensure consistency of our database schema.
Presented at Geecon Prague on 20th October 2016.
Apache Cassandra: building a production app on an eventually-consistent DB
1. Apache Cassandra:
building a production app on an
eventually-consistent DB
Oliver Lockwood
Prague, 20-21 October 2016
2. Agenda
• Brief introduction to Cassandra
• Gotchas when using an eventually-consistent DB
• Performing DB schema and data evolution in Cassandra for a production app
3. Introduction to Cassandra
What it is, and what it’s good for
• NoSQL database
• Distributed architecture with no “master” – highly scalable and resilient
• Write-optimised
• Eventual consistency
http://www.datastax.com/dbas-guide-to-nosql
4. Introduction to Cassandra
How storage, reads, writes and conflict resolution work
• Replication factor = how many copies
• Replication strategy determines storage location
• Contact points used initially
• Client connection is to cluster
• Co-ordinator could be any node (based on load balancing policy)
• Storage is independent of co-ordinator
• Last Write Wins for conflicts
http://www.slideshare.net/DataStax/understanding-data-consistency-in-apache-cassandra
[Diagram: two clients connecting to the cluster, each via a different co-ordinator node, separate from the storage nodes for a given row]
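The "Last Write Wins" bullet can be illustrated with a small sketch (Python, purely illustrative — real Cassandra reconciles per-cell write timestamps internally, this is not its code):

```python
# Illustrative sketch of Last Write Wins (LWW): when two replicas hold
# different values for the same cell, the higher write timestamp wins.
def reconcile(cell_a, cell_b):
    """Each cell is a (value, timestamp_micros) pair; highest timestamp wins."""
    return cell_a if cell_a[1] >= cell_b[1] else cell_b

replica_1 = ("v1", 1_000_000)  # older write
replica_2 = ("v2", 1_000_500)  # newer write
assert reconcile(replica_1, replica_2) == ("v2", 1_000_500)
```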
5. Introduction to Cassandra
What it’s not good for
http://planetcassandra.org/blog/flite-breaking-down-the-cql-where-clause/
6. Gotchas
Lessons we learned the hard way
• Distributable nature of Cassandra depends on synchronized clocks
• What happens if clocks drift?
• INSERT, DELETE, READ from a single client.
• What if Node 3’s clock is slow?
https://blog.logentries.com/2014/03/synchronizing-clocks-in-a-cassandra-cluster-pt-1-the-problem/
http://datascale.io/how-to-create-a-cassandra-cluster-in-aws/
[Diagram: a single client issuing (1) INSERT then (2) DELETE, each operation co-ordinated by a different node]
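The failure mode on this slide can be sketched as follows (a Python simulation of Last Write Wins under clock drift, not real Cassandra code; the node roles and the 4.5-second skew are illustrative):

```python
# Simulate the INSERT -> DELETE -> READ anomaly when one node's clock is slow.
# Under Last Write Wins, a DELETE stamped by a slow coordinator clock can carry
# an OLDER timestamp than the preceding INSERT, so the INSERT "wins" and the
# supposedly-deleted row remains readable.

def read(cell):
    value, ts_micros, is_tombstone = cell
    return None if is_tombstone else value

# Node 1 (correct clock) coordinates the INSERT at t = 10.0s
insert = ("some row", 10_000_000, False)   # (value, ts_micros, tombstone?)

# Node 3 (clock running slow) coordinates the DELETE; it happens later in
# wall-clock time, but gets stamped with an older timestamp.
delete = (None, 5_500_000, True)

# LWW reconciliation: higher timestamp wins -> the INSERT beats the DELETE
winner = insert if insert[1] >= delete[1] else delete
assert read(winner) == "some row"   # the "deleted" row is still visible!
```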
7. Gotchas
Lessons we learned the hard way
Demo!
http://stackoverflow.com/questions/17474830/configuring-cassandra-with-private-ip-for-internode-communications
https://github.com/oliverlockwood/aws-ansible-cassandra
8. Gotchas
Lessons we learned the hard way - resolution
• Node 3’s clock is slow
• Use client-side timestamps? CQL native protocol v3 supports this.
• Avoid time-sensitive query patterns
http://www.datastax.com/dev/blog/java-driver-2-1-2-native-protocol-v3
[Diagram: the same single-client (1) INSERT, (2) DELETE scenario, with Node 3's clock slow]
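With client-specified timestamps (e.g. `USING TIMESTAMP` in CQL, supported by native protocol v3), ordering is decided by the single client's clock rather than by each coordinator, so the anomaly disappears. A sketch under the same simulated assumptions as before:

```python
# Same INSERT -> DELETE scenario, but the client stamps both operations from
# its own monotonic clock, so coordinator clock drift no longer matters.
import itertools

client_clock = itertools.count(10_000_000)        # monotonic client timestamps

insert = ("some row", next(client_clock), False)  # (value, ts_micros, tombstone?)
delete = (None, next(client_clock), True)         # stamped later by the client

# LWW reconciliation: the DELETE's timestamp is now guaranteed to be higher
winner = insert if insert[1] >= delete[1] else delete
assert winner[2]   # the DELETE correctly wins, so a READ sees no row
```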
9. Schema evolution in Cassandra
Introduction
• DB schemas evolve – accept it!
• Automation is better than manual processes
• For RDBMS: Flyway, Liquibase etc.
• For Cassandra…
… cqlmigrate!
https://flywaydb.org/
http://www.liquibase.org/
10. Schema evolution in Cassandra
Introducing cqlmigrate
https://github.com/sky-uk/cqlmigrate
http://developers.sky.com/internal/ovp/cassandra/schema/evolution/2016/07/05/cqlmigrate/
11. Schema evolution in Cassandra
Diving deeper into cqlmigrate
• Schema update operations are recorded, so each CQL file is applied only once
• Locking mechanism uses LWT to avoid race conditions
http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
http://www.cs.utexas.edu/users/lorenzo/corsi/cs380d/past/03F/notes/paxos-simple.pdf
12. Schema evolution in Cassandra
Diving deeper into cqlmigrate
Demo!
https://github.com/oliverlockwood/cqlmigrate-example-app
13. In conclusion
Takeaway menu
http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
http://www.cs.utexas.edu/users/lorenzo/corsi/cs380d/past/03F/notes/paxos-simple.pdf
Editor's Notes
*****Ask staff – can I invite any questions at the midpoint??*****
Before we get started – show of hands:
- how many people are familiar with Cassandra?
- how many people are actively using Cassandra?
1) Get AWS console logged in
2) Mirror displays hotkey (Cmd-F1)
3) Warm up ansible cache
Brief introduction to Cassandra – also covering what it’s good for, what it’s not good for and why
Gotchas in an eventually-consistent DB – lessons we learned the hard way
Performing DB schema and data evolution in Cassandra for a production app
NoSQL – data modelled in a non-relational manner!
Eventual consistency – consistency is actually tunable for each type of operation, but stronger consistency levels impact performance.
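One way to make "tunable" concrete (my own illustration, not from the slides): with replication factor RF, write consistency level W and read consistency level R, reads are guaranteed to overlap the latest write whenever W + R > RF.

```python
# Strong-consistency condition in Cassandra: W + R > RF guarantees the read
# and write quorums overlap in at least one replica.
def is_strongly_consistent(rf, w, r):
    return w + r > rf

assert is_strongly_consistent(rf=3, w=2, r=2)       # QUORUM/QUORUM: strong
assert not is_strongly_consistent(rf=3, w=1, r=1)   # ONE/ONE: eventual only
```

The flip side is the performance cost mentioned above: higher W and R mean waiting for more replicas on every operation.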
Contact points
Show how different coordinator nodes work – completely separate from storage nodes for a given row
Example of multiple consecutive updates to a particular row – explain Last Write Wins (LWW)
What makes Cassandra so highly distributable also makes it vulnerable – the whole deployment must run on synchronized clocks. Clock drift can easily occur – even with NTP installed – and can expose problems.
Let’s take the example of an INSERT, DELETE, READ query pattern from a single client. Although it’s not necessarily the most common pattern, intuitively you’d think it should work – after all, as we covered earlier, we’re in a “Last Write Wins” environment, and the operation order is clearly defined.
Unfortunately, this is not necessarily the case. Let’s take a look at how this query pattern would progress.
- Can everyone see?
- Show AWS
- Set clock back for single node
- Show cluster state (explain nodetool if needed)
- Show test
- Run test
i2cssh -m `ansible -vvvv -i ec2.py eu-west-1 --list-hosts | grep -v hosts | grep -v "config" | awk '{print $1}' | paste -s -d, -` -p LargeFont -b
date +"%Y-%m-%d %H:%M:%S.%3N"
Cmd-Alt-I for broadcast
cmd-f1 for mirror display toggle!
date +"%Y-%m-%d %H:%M:%S.%3N"; sudo date --set `date -d '-5 second' "+%H:%M:%S.%3N"`; date +"%Y-%m-%d %H:%M:%S.%3N"
nodetool status
curl http://169.254.169.254/latest/meta-data/public-ipv4
- Cassandra query tracing for details – can look at the `system_traces` keyspace
Cmd + or Cmd – for font size in IntelliJ
Version 3 of the native protocol (and the Java driver for the past couple of years!) supports allowing client-specified timestamps.
Takeaways:
Avoid time-sensitive query patterns!
If a single client will be performing multiple consecutive Cassandra operations, use client-specified timestamps.
PAUSE – any questions at this point before we move on?
Now for a slight segue.
- sometimes we have to make changes to our schema (e.g. adding a new table) or provisioned data (as distinct from user-generated data)
- sometimes we have to spin up a new deployment from scratch (e.g. creating new data center / environment)
- In both cases we need a reliable way to create and update our DB schema and data.
I don’t know how you feel, but as a developer, I don’t like:
Doing manual changes
Having to ask Operations, DevOps or anyone else to make manual changes
Complex application deployments – I want to install the new version of my app, and have it “just work”.
If you’re using a relational DB, then there are a number of tools you can use to aid your schema evolution. You may have heard of Liquibase or Flyway (if you haven’t, do look them up).
What about Cassandra? When the team responsible for user authentication and entitlements on Sky’s online video platform came to tackle this problem, there didn’t seem to be any such tooling available for Cassandra. So they created one, and called it cqlmigrate.
To introduce cqlmigrate, let’s start with the concepts behind its design.
1) Versioning the evolution of your schema into discrete steps. Open-closed - don’t change past steps, but can add steps. Fairly standard practice.
2) Including this evolution into the same VCS as your app itself - so that every version of the app has the full DB setup that’s needed for that version of the app to run.
3) Handling deltas (including full bootstrap if necessary!) as part of application startup, to minimise external dependencies.
Although cqlmigrate can be run in a standalone manner, running it as part of app startup reduces the complexity of your application deployment, as no extra steps are necessary.
We’ll take a look at a demo in a bit, but it’s really simple to invoke cqlmigrate – all you do is pass it a collection of Java `Path`s containing the CQL files you want to run, and they are applied in alphanumeric order.
(Cassandra uses CQL – the Cassandra Query Language – which is similar to SQL)
-----
For each CQL file it applies, cqlmigrate creates a row in a “schema_updates” table in your keyspace, containing both the name of the file and the SHA1 checksum of it. If the row for a given CQL file already exists then cqlmigrate will skip applying it at runtime. It’s important to re-iterate how the open-closed principle applies here – if you change a previously-applied CQL file (even just changing whitespace!), it’ll get run again, which may cause problems.
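The behaviour described here can be sketched roughly as follows (a Python sketch of the idea, not cqlmigrate’s actual implementation; the file names are made up, and the `schema_updates` dict stands in for the table of the same name):

```python
# Sketch: apply CQL files in alphanumeric order, once each, tracked by the
# SHA1 checksum of their contents. A changed file (even whitespace) no longer
# matches its recorded checksum, so it would get run again.
import hashlib

def plan_migration(cql_files, schema_updates):
    """cql_files: {filename: contents}; schema_updates: {filename: sha1 hex}.
    Returns the list of files that would be (re-)applied, in order."""
    to_apply = []
    for name in sorted(cql_files):                       # alphanumeric order
        checksum = hashlib.sha1(cql_files[name].encode()).hexdigest()
        if schema_updates.get(name) == checksum:
            continue                                     # already applied: skip
        to_apply.append(name)                            # new OR changed file
    return to_apply

applied = {"001_create.cql":
           hashlib.sha1(b"CREATE TABLE t (id int PRIMARY KEY);").hexdigest()}
files = {
    "001_create.cql": "CREATE TABLE t (id int PRIMARY KEY);",
    "002_add_col.cql": "ALTER TABLE t ADD name text;",
}
assert plan_migration(files, applied) == ["002_add_col.cql"]
```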
------
On the one hand, you don’t want multiple nodes trying to change your DB schema at the same time – recipe for pain.
I don’t want my schema evolution to have any dependency on how the application is started up.
cqlmigrate allows concurrent startup of multiple application nodes by making use of a `locks` table. Cassandra’s lightweight transactions, based on the Paxos consensus algorithm, allow us to do an atomic test-and-set to ensure that only one instance can take the lock at a given time.
The instance that first gets the lock will perform the schema evolution; all others will block until that’s complete, and then each in turn will get the lock, realise there’s nothing further to be done, and release the lock again.
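This lock dance can be sketched as a compare-and-set loop (a Python simulation of the idea; in real cqlmigrate the test-and-set is a CQL lightweight transaction along the lines of `INSERT ... IF NOT EXISTS` against the `locks` table, and the instance names below are made up):

```python
# Simulate several app instances starting up; a dict stands in for the locks
# table, and insert-if-not-exists stands in for Cassandra's LWT.
locks = {}
migrated = []

def try_acquire(lock_name, client_id):
    if lock_name not in locks:       # "IF NOT EXISTS": atomic in real Cassandra
        locks[lock_name] = client_id
        return True
    return False

def startup(client_id):
    while not try_acquire("example.schema_migration", client_id):
        pass                         # real code would sleep and retry
    if not migrated:                 # first lock holder performs the evolution
        migrated.append(client_id)
    del locks["example.schema_migration"]    # release for the next instance

for instance in ("node-a", "node-b", "node-c"):
    startup(instance)                # sequential here; concurrent in reality

assert migrated == ["node-a"]        # exactly one instance ran the migration
```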
To demonstrate:
date +"%Y-%m-%d %H:%M:%S.%3N"; sudo date --set `date -d '+5 second' "+%H:%M:%S.%3N"`; date +"%Y-%m-%d %H:%M:%S.%3N"
- Tour of cqlmigrate-example-app
- Simple DropWizard Application
- Configuration including Cassandra stuff (show yaml)
- MigrateSchemaBundle
- Show Cassandra cluster (same one we had earlier?)
- cqlsh in to it (explain cqlsh if necessary!)
- ensure `example` keyspace is absent
- show `cqlmigrate.locks` table
- Start up application – show log lines detailing which scripts have been run
- Rename “notyet” file, rebuild and rerun application
- Demo what happens if it fails during cqlmigrate – interrupt and re-run
If needed:
- DELETE FROM cqlmigrate.locks WHERE name = 'example.schema_migration'
As mentioned previously – time-sensitive query patterns should generally be avoided. If you have to have them in a single-client context, then specifying timestamps on the client side can help you get out of trouble.
I’d really recommend trying out cqlmigrate for your schema evolution – and I’d also invite you to contribute to its development. It’s already in use by multiple production apps within Sky; we’ve made it open source and I hope that the broader development community – that’s you lot! - will find it useful and help it to grow.
Answers:
- No JOINs etc – “not enough functionality in the NoSQL world”? It’s against Cassandra principles to use JOINs – instead, store the data in denormalised form, in whatever shape you’d want to query it.
- Proved the theory by pinning the co-ordinator to a single node – not generally good practice for production, but useful for debugging.
- Cassandra query tracing for details – can look at the `system_traces` keyspace
- Cassandra versions 2.1.x and 3.0.x tested. Latter defaults to using client-side generation of timestamps.
- Time-sensitive query patterns arose mainly in testing – e.g. verifying that records have been deleted.
- Standalone nature of cqlmigrate - how?