A glimpse of cassandra 4.0 features netflix

A glimpse of
Cassandra 4.0
Vinay Chella
Cloud Database Engineering

Vinay Chella
Cloud Data Architect
Cloud Database Engineering @ Netflix

State of C* @ Netflix
Notable 4.0+ Changes
● Reliability
● Features
○ Audit Logging
● Correctness
○ Scheduled Repair
Overview

There are a lot of exciting features coming in 4.0,
but this talk covers some of the features that we
at Netflix are particularly excited about and
looking forward to.
There are thousands of improvements shipping
soon in 4.0, and some in later 4.x. This is just a
sample of the goodness.
Disclaimer

● In the process of migrating to Apache Cassandra 3.0
● Majority on Cassandra 2.1
● Cassandra is the source of truth for 99%+ of streaming
persistent data
State of C* @ Netflix

Cassandra 4.0 for Netflix
● Reliability
● Correctness
● New Features
Why 4.0?

Reliability
Your database should be available
(aka fast)

Internode Networking
CASSANDRA-8457!!
● No more thread per peer, fully async
server-server communication
● Streaming 20% faster (12229)
● Access to critical OS networking
features

● Gossip slows down (8457, 12966)
● Restarted nodes coordinate before they
have functional connections (13993,
14297)
● Non-restarted nodes will continue
sending on dead connections for a while
(14358)
● DynamicEndpointSnitch sends to latent
nodes after restart (14459)
Restarting Cassandra

● Meet SLOs with Hybrid Speculation (14293)
○ MIN(99PERCENTILE,10MS) ~= “only speculate if I am slower than P99 SLO”
○ MAX(99PERCENTILE,100MS) ~= “stop speculating if the cluster is hosed”
● Reduce default number of vnodes (13701)
○ vnodes reduce availability, so fewer is better (see the mailing list discussion)
○ Context: https://github.com/jolynch/python_performance_toolkit/tree/master/notebooks/cassandra_availability
● Which queries are slow/huge? (13001, 14347)
● Circuit break queries of death (12106)
Some Other Improvements
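The hybrid speculation bullets above combine an observed latency percentile with a fixed bound. A minimal sketch of that arithmetic follows, assuming the policy strings look exactly like the ones on the slide; the function names and the parser are hypothetical illustrations, not Cassandra's implementation.

```python
import re

# Matches policy strings written the way the slide writes them,
# e.g. "MIN(99PERCENTILE,10MS)". This grammar is an assumption for the sketch.
_POLICY = re.compile(r"^(MIN|MAX)\((\d+)PERCENTILE,\s*(\d+)MS\)$", re.IGNORECASE)


def speculation_threshold_ms(policy, observed_percentile_ms):
    """Return how long to wait (ms) before sending a speculative request.

    observed_percentile_ms is the latency currently measured at the
    percentile named in the policy (for example the live P99).
    """
    match = _POLICY.match(policy.strip())
    if match is None:
        raise ValueError(f"unrecognised policy: {policy!r}")
    combine = min if match.group(1).upper() == "MIN" else max
    fixed_bound_ms = float(match.group(3))
    return combine(observed_percentile_ms, fixed_bound_ms)


def should_speculate(policy, observed_percentile_ms, elapsed_ms):
    """True once a request has been outstanding for at least the threshold."""
    return elapsed_ms >= speculation_threshold_ms(policy, observed_percentile_ms)


if __name__ == "__main__":
    # MIN: never wait longer than the 10 ms bound, even if P99 has drifted up.
    print(speculation_threshold_ms("MIN(99PERCENTILE,10MS)", 25.0))
    # MAX: never speculate sooner than 100 ms, and back off further as P99 grows.
    print(speculation_threshold_ms("MAX(99PERCENTILE,100MS)", 400.0))
```

With MIN the fixed bound acts as a ceiling on the wait; with MAX it acts as a floor, so the wait grows along with P99 when the whole cluster is slow.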

Features
Your database should help
you build great apps

CDC improvements
Materialized views in other datastores
CREATE TABLE foo (a int, b text, PRIMARY KEY(a)) WITH cdc=true;
ALTER TABLE foo WITH cdc=true;
ALTER TABLE foo WITH cdc=false;

Pluggable Storage Engine
CASSANDRA-13474
● Unlocking great performance
improvements
○ Rocksandra (13476)
○ Persistent memory (13981)

Where should it go in the first place?
● Why isn’t it the client’s job?
● Why not via dynamic tracing (e.g. cqltrace)?
Design.

Once we’re in the database
● Why not log to database itself?
● Why is it in files?
○ Should it be restricted to one type of implementation?
● Why not log everything?
Design. (continued)

● Audits everything
● YAML-based configuration
● Highly performant
● Pluggable
● Supports FQL
● BinLog
● Default implementations
○ BinAuditLogger
○ FileAuditLogger
● Droppable jars for custom loggers
Audit Logging.

● User
● Host
● Source IP address
● Source port
● Timestamp
● Type
○ SELECT, INSERT, etc.
● Category
○ DDL, DML, etc.
● Keyspace
● Scope
○ Table name, function name, etc.
● Operation
○ Select * from tbl1 limit 2;
What does it log.

$ ./nodetool enableauditlog
$ ./nodetool disableauditlog
StorageService.java:5420 - AuditLog is enabled with logger: [BinAuditLogger], included_keyspaces:
[movies, thumbs_ratings], excluded_keyspaces: [star_ratings], included_categories: [DDL, DML, DCL],
excluded_categories: [QUERY],included_users: [prod_user1, prod_user2], excluded_users: [admin_user,
ops_user]
Ease of use with NodeTool
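The log line above lists include and exclude sets for keyspaces, categories and users. A minimal sketch of how such a filter could decide whether an event is audited is shown below. It assumes that an exclusion always wins and that an empty include set means "include everything"; the class and function names are hypothetical and do not mirror Cassandra's classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditFilter:
    """Include/exclude sets as they appear in the log line (hypothetical type)."""
    included_keyspaces: frozenset = frozenset()
    excluded_keyspaces: frozenset = frozenset()
    included_categories: frozenset = frozenset()
    excluded_categories: frozenset = frozenset()
    included_users: frozenset = frozenset()
    excluded_users: frozenset = frozenset()


def _passes(value, included, excluded):
    # Assumption: exclusion takes precedence over inclusion.
    if value in excluded:
        return False
    # Assumption: an empty include set places no restriction.
    if included and value not in included:
        return False
    return True


def is_audited(flt, keyspace, category, user):
    """An event is audited only if keyspace, category and user all pass."""
    return (
        _passes(keyspace, flt.included_keyspaces, flt.excluded_keyspaces)
        and _passes(category, flt.included_categories, flt.excluded_categories)
        and _passes(user, flt.included_users, flt.excluded_users)
    )


# The values from the log line on the slide.
SLIDE_FILTER = AuditFilter(
    included_keyspaces=frozenset({"movies", "thumbs_ratings"}),
    excluded_keyspaces=frozenset({"star_ratings"}),
    included_categories=frozenset({"DDL", "DML", "DCL"}),
    excluded_categories=frozenset({"QUERY"}),
    included_users=frozenset({"prod_user1", "prod_user2"}),
    excluded_users=frozenset({"admin_user", "ops_user"}),
)

if __name__ == "__main__":
    print(is_audited(SLIDE_FILTER, "movies", "DML", "prod_user1"))
    print(is_audited(SLIDE_FILTER, "movies", "QUERY", "prod_user1"))
```

Under these assumptions, a DML statement by prod_user1 against movies is audited, while any QUERY, anything touching star_ratings, and anything issued by admin_user or ops_user is not.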

● Per table write metrics (14232)
○ Very important for multi-tenant clusters
● Virtual tables (7622)
○ Ask Cassandra for system state, config via CQL
● NetworkTopologyStrategy takes a default replication count (14303)
○ CREATE KEYSPACE test WITH replication = {'class':
'NetworkTopologyStrategy', 'default_datacenter_replication': 3};
○ The released 4.0 syntax names this option 'replication_factor'
● Operate your database with HTTP (14395)
Some Other New Features

Correctness
Your database should give
you the correct responses

Incremental Repair Works!
CASSANDRA-9143 et al. !!
● Incremental repair running super quickly
on petabyte datasets
● Preview data inconsistency without
streaming (13257)
● Without repair Cassandra is hopefully
consistent. Probably not what you
expect from a database.

Repair Scheduling*
CASSANDRA-14346
● Decentralized
● Fault Tolerant
● “Just works”
● Rough design
consensus
○ Hopefully will be
merged in 4.x

● Ideal Consistency Level (13289)
○ Having metrics on inconsistency is useful
● Cassandra should be correct when nodes fail (5901)
○ Do you run repair after node failures?
● Continuous repair (13924)
○ Repair only inconsistent data, no wasted work.
Some Other Improvements

Easy Repair Operations
# Get repair configuration or status
curl localhost:7007/v1/repair/config
curl localhost:7007/v1/repair/status
# Mutate configuration live, or start or pause repair
curl -XPOST localhost:7007/v1/repair/config
curl -XPOST localhost:7007/v1/repair/start
curl -XPOST localhost:7007/v1/repair/stop
If CASSANDRA-14346 is finished and reviewed in time for 4.x, we can:
● Make repair easy to schedule and run
● Allow non-Java interactions via HTTP
● Ship repair out of the box, not in add-ons
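To illustrate the "non-Java interactions via HTTP" point, here is a minimal Python sketch that builds requests for the proposed endpoints shown in the curl commands above. The paths and port 7007 are taken from the slide and describe a proposal, not a shipped API; the helper name repair_request is hypothetical, and the sketch only constructs requests without sending them.

```python
from urllib.request import Request

# Endpoint names as they appear in the curl examples on the slide.
_READ_ACTIONS = {"config", "status"}
_WRITE_ACTIONS = {"config", "start", "stop"}


def repair_request(action, mutate=False, host="localhost", port=7007):
    """Build (but do not send) a request for a proposed repair endpoint.

    mutate=False builds a GET (read config or status);
    mutate=True builds a POST (change config, start or stop repair).
    """
    allowed = _WRITE_ACTIONS if mutate else _READ_ACTIONS
    if action not in allowed:
        verb = "POST" if mutate else "GET"
        raise ValueError(f"{verb} is not listed for /v1/repair/{action}")
    url = f"http://{host}:{port}/v1/repair/{action}"
    return Request(url, method="POST" if mutate else "GET")


if __name__ == "__main__":
    for req in (repair_request("status"), repair_request("start", mutate=True)):
        print(req.get_method(), req.full_url)
```

Passing one of these objects to urllib.request.urlopen would issue the call, which only makes sense against a node that actually exposes such a service.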
