Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

CASSANDRA SF 2015
REPEATABLE, SCALABLE, RELIABLE,
OBSERVABLE CASSANDRA
Aaron Morton
@aaronmorton
Co-Founder & Principal Consultant
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License

About The Last Pickle.
Work with clients to deliver and improve Apache Cassandra
based solutions.
Apache Cassandra Committer, DataStax MVP, Apache
Usergrid Committer.
Based in New Zealand, Australia, & USA.

Scalable Data Model
Use no look writes to avoid
unnecessary reads.

No Look Writes
CREATE TABLE user_visits (
user text,
day int, // YYYYMMDD
PRIMARY KEY (user, day)
);

No Look Writes
// Bad
SELECT *
FROM user_visits
WHERE user = 'aaron' AND day = 20150924;
INSERT INTO user_visits (user, day)
VALUES ('aaron', 20150924);

No Look Writes
// Better
INSERT INTO user_visits (user, day)
VALUES ('aaron', 20150924);

Limit Partition size by
bounding it in time or space.

Limit Partition Size
// Bad
CREATE TABLE user_visits (
user text,
visit_time timestamp,
data blob, // up to 100K
PRIMARY KEY (user, visit_time)
);

Limit Partition Size
// Better
CREATE TABLE user_visits (
user text,
day_bucket int, // YYYYMMDD
visit_time timestamp,
data blob, // up to 100K
PRIMARY KEY ( (user, day_bucket), visit_time)
);
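
A minimal sketch of how a client might derive the day_bucket value before writing. The class and method names are hypothetical, and using UTC for the day boundary is an assumption, not something the slides specify.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

final class DayBucket {
    private DayBucket() {}

    // Returns the UTC calendar day of the visit as an int in YYYYMMDD form,
    // suitable for the day_bucket part of the partition key.
    // Assumption: buckets roll over at midnight UTC.
    static int of(Instant visitTime) {
        LocalDate day = visitTime.atZone(ZoneOffset.UTC).toLocalDate();
        return day.getYear() * 10000 + day.getMonthValue() * 100 + day.getDayOfMonth();
    }
}
```

Every write for the same user on the same UTC day lands in one partition, so a partition holds at most one day of visits.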

Avoid mixed workloads on a
single Table to reduce impact
of fragmentation.

Mixed Workloads
// Bad
CREATE TABLE user (
user text,
password text, // when password changed
last_visit timestamp, // each page request
PRIMARY KEY (user)
);

Mixed Workloads
// Better
CREATE TABLE user_password (
user text,
password text,
PRIMARY KEY (user)
);
CREATE TABLE user_last_visit (
user text,
last_visit timestamp,
PRIMARY KEY (user)
);

Use
LeveledCompactionStrategy
when there are overwrites or
tombstones.

Use LCS for Overwrites
CREATE TABLE user_visits (
user text,
day int, // YYYYMMDD
PRIMARY KEY (user, day)
)
WITH
COMPACTION =
{
'class' : 'LeveledCompactionStrategy'
};

Create parallel data models so
throughput increases with
node count.

Parallel Data Models
// Bad
CREATE TABLE hotel_price (
checkin_day int, // YYYYMMDD
hotel_name text,
price_data blob,
PRIMARY KEY (checkin_day, hotel_name)
);

Parallel Data Models
// Better
CREATE TABLE hotel_price (
checkin_day int, // YYYYMMDD
city text,
hotel_name text,
price_data blob,
PRIMARY KEY ( (checkin_day, city), hotel_name)
);

Use concurrent asynchronous
requests to complete tasks.

Concurrent Asynchronous Requests
CREATE TABLE hotel_price (
checkin_day int, // YYYYMMDD
city text,
hotel_name text,
price_data blob,
PRIMARY KEY ( (checkin_day, city), hotel_name)
);

Concurrent Asynchronous Requests
// request for cities concurrently
SELECT *
FROM hotel_price
WHERE checkin_day = 20150924 AND city = 'Santa Clara';
SELECT *
FROM hotel_price
WHERE checkin_day = 20150924 AND city = 'San Jose';
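
A rough sketch of the fan-out pattern: issue every per-city request first, then wait for the results. CompletableFuture stands in for the driver's future type here, and ConcurrentCityLookup, fetchAll and queryCity are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class ConcurrentCityLookup {
    private ConcurrentCityLookup() {}

    // Starts one asynchronous query per city, then waits for all of them
    // and concatenates the rows in the order the cities were given.
    // queryCity is a stand-in for an async call such as session.executeAsync(...).
    static <R> List<R> fetchAll(List<String> cities,
                                Function<String, CompletableFuture<List<R>>> queryCity) {
        List<CompletableFuture<List<R>>> inFlight = new ArrayList<>();
        for (String city : cities) {
            inFlight.add(queryCity.apply(city)); // issue every request before waiting on any
        }
        List<R> rows = new ArrayList<>();
        for (CompletableFuture<List<R>> future : inFlight) {
            rows.addAll(future.join()); // blocks only until this request completes
        }
        return rows;
    }
}
```

Because each (checkin_day, city) pair is its own partition, the requests can be served by different nodes at the same time.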

Document when Eventual
Consistency, Strong
Consistency or Linearizable
Consistency is required.

Smoke Test the data model.

Data Model Smoke Test
/*
* Get Pricing Data
*/
// Load Data
INSERT INTO city_distances (city, distance, nearby_city)
VALUES ('Santa Clara', 0, 'Santa Clara');
INSERT INTO city_distances (city, distance, nearby_city)
VALUES ('Santa Clara', 1, 'San Jose');
INSERT INTO hotel_price (checkin_day, city, hotel_name, price_data)
VALUES (20150924, 'Santa Clara', 'Hilton Santa Clara', 0xFF);
INSERT INTO hotel_price (checkin_day, city, hotel_name, price_data)
VALUES (20150924, 'San Jose', 'Hyatt San Jose', 0xFF);

Data Model Smoke Test
// Step 1
// Get the nearby cities for the one selected by the user
SELECT nearby_city
FROM city_distances
WHERE city = 'Santa Clara' and distance < 2;
// Step 2
// Parallel requests for each city returned.
SELECT city, hotel_name, price_data
FROM hotel_price
WHERE checkin_day = 20150924 AND city = 'Santa Clara';
SELECT city, hotel_name, price_data
FROM hotel_price
WHERE checkin_day = 20150924 AND city = 'San Jose';

Application Development
Ensure read requests are
bound and know what the size
is.
(hint: use auto-paging in 2.0)

Auto Paging
PreparedStatement prepStmt = session.prepare(CQL);
BoundStatement boundStmt = new BoundStatement(prepStmt);
// Fetch at most 100 rows per page; the driver requests further pages as needed.
boundStmt.setFetchSize(100);

Use appropriate Consistency
Level.
(see Data Model Smoke Test)

Use Token Aware
Asynchronous requests with
CL ONE where possible.

Token Aware Policy
cluster = Cluster.builder()
.addContactPoints("10.10.10.10")
.withLoadBalancingPolicy(new TokenAwarePolicy(
new DCAwareRoundRobinPolicy("DC1")))
.build();

Asynchronous Requests
ResultSetFuture f = ses.executeAsync(stmt.bind("fo"));
Row row = f.getUninterruptibly().one();

Avoid DDOS’ing the cluster.

Monitoring and Alerting
Use what you like and what
works for you.

Monitoring and Alerting
Some suggestions: OpsCenter,
Riemann, Grafana, Logstash,
Sensu.

How To Monitor
Cluster wide aggregate.
All nodes (if possible).
Top 3 & Bottom 3 Nodes.
Individual Nodes.

How To Monitor Rates
1 Minute Rate
Derivative of Counts
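
One way to read "Derivative of Counts": sample a monotonically increasing counter twice and divide the change by the elapsed time. The sketch below is illustrative only; CounterRate is a hypothetical name, and treating a lower second sample as a counter reset (for example a node restart) is an assumption.

```java
final class CounterRate {
    private CounterRate() {}

    // Per-second rate between two samples of a monotonically increasing counter.
    // Assumption: if the counter went backwards it was reset to zero between
    // samples, so the current value is used as the delta.
    static double perSecond(long previousCount, long currentCount, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        long delta = currentCount - previousCount;
        if (delta < 0) {
            delta = currentCount;
        }
        return delta * 1000.0 / elapsedMillis;
    }
}
```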

How To Monitor Latency
75th Percentile
95th Percentile
99th Percentile
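
For reference, a small sketch of what a percentile means, using the nearest-rank method over raw samples. This is not how Cassandra's metrics compute their percentiles (those come from sampled histograms); Percentile and nearestRank are hypothetical names.

```java
import java.util.Arrays;

final class Percentile {
    private Percentile() {}

    // Nearest-rank percentile: the smallest sample such that at least
    // `percentile` percent of all samples are less than or equal to it.
    static long nearestRank(long[] samples, double percentile) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (percentile <= 0 || percentile > 100) {
            throw new IllegalArgumentException("percentile must be in (0, 100]");
        }
        long[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(percentile * sorted.length / 100.0);
        return sorted[rank - 1];
    }
}
```

Tracking the 95th or 99th percentile surfaces slow requests that an average would hide.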

Monitoring Cluster Throughput
.o.a.c.m.ClientRequest.
Write.Latency.1MinuteRate
Read.Latency.1MinuteRate

Monitoring Local Table Throughput
.o.a.c.m.ColumnFamily.
KEYSPACE.TABLE.WriteLatency.1MinuteRate
KEYSPACE.TABLE.ReadLatency.1MinuteRate

Monitoring Request Latency
Write.Latency.75percentile
Read.Latency.75percentile…

Monitoring Request Latency Per Table
KEYSPACE.TABLE.CoordinatorWriteLatency.95percentile
KEYSPACE.TABLE.CoordinatorReadLatency.95percentile

Monitoring Local Table Latency
KEYSPACE.TABLE.WriteLatency.95percentile
KEYSPACE.TABLE.ReadLatency.95percentile

Monitoring Read Path
.o.a.c.m.ColumnFamily.KEYSPACE.TABLE.
LiveScannedHistogram.95percentile
TombstoneScannedHistogram.95percentile
SSTablesPerReadHistogram.95percentile

Monitoring Inconsistency
.o.a.c.m.
Storage.TotalHints.count
HintedHandOffManager.
Hints_created-IP_ADDRESS.count
.o.a.c.m.Connection.TotalTimeouts.1MinuteRate

Monitoring Eventual Consistency
.o.a.c.m.
ReadRepair.RepairedBackground.1MinuteRate
ReadRepair.RepairedBlocking.1MinuteRate

Monitoring Client Errors
Write.Unavailables.1MinuteRate
Read.Unavailables.1MinuteRate
Write.Timeouts.1MinuteRate
Read.Timeouts.1MinuteRate

Monitoring Errors
.o.a.c.m.
Storage.Exceptions.count

Monitoring Disk Usage
.o.a.c.m.
Storage.Load.count
ColumnFamily.KEYSPACE.TABLE.TotalDiskSpaceUsed.count

Monitoring Pending Compactions
.o.a.c.m.
Compaction.PendingTasks.value
ColumnFamily.KEYSPACE.TABLE.PendingCompactions.value
Compaction.TotalCompactionsCompleted.1MinuteRate

Monitoring Node Performance
.o.a.c.m.ThreadPools.request.
MutationStage.PendingTasks.value
ReadStage.PendingTasks.value
ReplicateOnWriteStage.PendingTasks.value
RequestResponseStage.PendingTasks.value

Monitoring Node Performance
.o.a.c.m.DroppedMessage.
MUTATION.Dropped.1MinuteRate
READ.Dropped.1MinuteRate

Design
Development
Provisioning

Smoke Tests
“preliminary testing to reveal
simple failures severe enough
to reject a prospective
software release.”

Disk Smoke Tests
“Disk Latency and Other
Random Numbers”
Al Tobey
http://tobert.github.io/post/2014-11-13-slides-disk-latency-and-other-random-numbers.html

Cassandra Smoke Test
cassandra-stress write cl=quorum -schema "replication(factor=3)" -mode native prepared cql3
cassandra-stress read cl=quorum -mode native prepared cql3
cassandra-stress mixed cl=quorum "ratio(read=1,write=4)" -mode native prepared cql3

Run Books
Why are we doing this?
What are we doing?
How will we do it?

Fire Drill: Short Term Single Node Failure
Down for less than Hint Window.
Available for QUORUM.
No action necessary on return.

Fire Drill: Short Term Multi Node Failure (Break the cluster)
Available for ONE (maybe).
Repair on return.

Fire Drill: Availability Zone / Rack Partition
Maybe repair on return.

Fire Drill: Medium Term Single Node Failure
Down between Hint Window and
gc_grace_seconds.
Repair on return.

Fire Drill: Long Term Single Node Failure
Down longer than
gc_grace_seconds.
Replace node.
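
The three single node drills (short, medium and long term) reduce to comparing downtime against the hint window and gc_grace_seconds. A run book could encode that rule roughly as below; NodeReturnPlan and its members are hypothetical names, and how to treat a downtime exactly equal to either threshold is an assumption.

```java
import java.time.Duration;

final class NodeReturnPlan {
    enum Action { NO_ACTION, REPAIR, REPLACE_NODE }

    private NodeReturnPlan() {}

    // Picks the follow-up action for a single node that was down for `downtime`.
    // hintWindow corresponds to max_hint_window_in_ms, gcGrace to gc_grace_seconds.
    static Action forOutage(Duration downtime, Duration hintWindow, Duration gcGrace) {
        if (downtime.compareTo(hintWindow) < 0) {
            return Action.NO_ACTION;    // hints cover the missed writes
        }
        if (downtime.compareTo(gcGrace) <= 0) {
            return Action.REPAIR;       // hints incomplete, tombstones still present
        }
        return Action.REPLACE_NODE;     // tombstones may have been purged elsewhere
    }
}
```

With the default settings of a 3 hour hint window and 10 days of gc_grace_seconds, a 1 hour outage needs no action, a 2 day outage needs a repair, and an 11 day outage means replacing the node.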

Fire Drill: Rolling Upgrade
Repeated short term failure.

Fire Drill: Scale Up
Repeated short term failure.

Fire Drill: Scale Out
Available for ALL.

Aaron Morton
@aaronmorton
Co-Founder & Principal Consultant
www.thelastpickle.com
