CASSANDRA SUMMIT 2013
IN CASE OF EMERGENCY
BREAK GLASS
Aaron Morton
@aaronmorton
www.thelastpickle.com
#Cassandra13
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
About Me
Freelance Cassandra Consultant
Based in Wellington, New Zealand
Apache Cassandra Committer
#Cassandra13
Platform
Tools
Problems
Maintenance
#Cassandra13
The Platform
#Cassandra13
The Platform & Clients
#Cassandra13
The Platform & Running Clients
#Cassandra13
The Platform & Reality
Consistency
Availability
Partition Tolerance
#Cassandra13
The Platform & Consistency
Strong Consistency
(R + W > N)
Eventual Consistency
(R + W <= N)
#Cassandra13
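The two inequalities on the Consistency slide can be checked mechanically. This is a minimal sketch, assuming N is the replication factor and R and W are the number of replicas that must answer a read and a write. The function names are illustrative and not part of any Cassandra client.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for QUORUM: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def is_strongly_consistent(r: int, w: int, n: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    return r + w > n


rf = 3
# QUORUM reads with QUORUM writes.
print(is_strongly_consistent(quorum(rf), quorum(rf), rf))
# ONE reads with ONE writes.
print(is_strongly_consistent(1, 1, rf))
```

With RF 3, QUORUM is 2 replicas, so QUORUM reads and writes overlap on at least one replica, while ONE and ONE do not have to.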
What Price Consistency?
In a Multi DC cluster QUORUM
and EACH_QUORUM involve
cross DC latency.
#Cassandra13
The Platform & Availability
Maintain Consistency Level UP
nodes for each Token Range.
#Cassandra13
Best Case Failure with N=9 and RF 3, 100% Availability
Replica 1
Replica 2
Replica 3
Range A
#Cassandra13
Worst Case Failure with N=9 and RF 3, 78% Availability
Range B
Range A
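The 100% and 78% figures can be reproduced with a small model. This sketch assumes each of the 9 nodes owns one token range, replicas are the owning node plus the next two nodes clockwise, and requests run at QUORUM. The choice of failed nodes (three evenly spaced nodes for the best case, two adjacent nodes for the worst case) is an assumption, because the slide diagrams are not part of this transcript.

```python
def available_fraction(nodes: int, rf: int, down: set[int], required: int) -> float:
    """Fraction of token ranges that still have `required` replicas up."""
    ok = 0
    for primary in range(nodes):
        # Replicas are the primary node and the next rf - 1 nodes on the ring.
        replicas = [(primary + i) % nodes for i in range(rf)]
        up = sum(1 for node in replicas if node not in down)
        if up >= required:
            ok += 1
    return ok / nodes


QUORUM = 3 // 2 + 1
best = available_fraction(9, 3, down={0, 3, 6}, required=QUORUM)
worst = available_fraction(9, 3, down={0, 1}, required=QUORUM)
print(f"{best:.0%} {worst:.0%}")  # 100% 78%
```

In the worst case two of the nine ranges lose two of their three replicas, which leaves 7/9 of the ranges available at QUORUM.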
#Cassandra13
The Platform & Partition Tolerance
A failed node does not create
a partition.
#Cassandra13
The Platform & Partition Tolerance
#Cassandra13
The Platform & Partition Tolerance
Partitions occur when the
network fails.
#Cassandra13
The Platform & Partition Tolerance
#Cassandra13
The Storage Engine
Optimised for
Writes.
#Cassandra13
Write Path
Append to Write Ahead Log.
(fsync every 10s by default, other options available)
#Cassandra13
Write Path
Merge new Columns into
Memtable.
(Lock free, always in memory.)
#Cassandra13
Write Path... Later
Asynchronously flush
Memtable to a new SSTable on
disk.
(May be 10s or 100s of MB in size.)
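The three write path steps can be put together in a toy model. This is only a sketch: the class name, the row-count flush threshold and the inline flush are assumptions made for illustration, and real Cassandra flushes asynchronously based on memory use.

```python
class ToyNode:
    """Toy model of the write path; real Cassandra is far more involved."""

    def __init__(self, flush_threshold: int = 3):
        self.commit_log = []   # stand-in for the Write Ahead Log
        self.memtable = {}     # row key -> {column: (timestamp, value)}
        self.sstables = []     # each flush appends one immutable dict
        self.flush_threshold = flush_threshold  # hypothetical: rows per memtable

    def write(self, row_key, column, value, timestamp):
        # 1. Append to the log first so the write can be replayed after a crash.
        self.commit_log.append((row_key, column, value, timestamp))
        # 2. Merge the column into the in-memory row; the newest timestamp wins.
        row = self.memtable.setdefault(row_key, {})
        current = row.get(column)
        if current is None or timestamp >= current[0]:
            row[column] = (timestamp, value)
        # 3. Later: flush. Done inline here to keep the sketch short.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.memtable:
            self.sstables.append(self.memtable)  # never modified after this
            self.memtable = {}
            self.commit_log.clear()  # flushed data no longer needs replay


node = ToyNode(flush_threshold=2)
node.write("foo", "dishwasher", "tomato", 10)
node.write("bar", "purple", "cromulent", 10)
print(len(node.sstables), node.memtable)
```

Because every flush creates a new SSTable and nothing is updated in place, the columns of one row end up spread over several files.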
#Cassandra13
SSTable Files
*-Data.db
*-Index.db
*-Filter.db
(And others)
#Cassandra13
Row Fragmentation
SSTable 1
foo:
dishwasher (ts 10):
tomato
purple (ts 10):
cromulent
SSTable 2
foo:
frink (ts 20):
flayven
monkey (ts 10):
embiggins
SSTable 3 SSTable 4
foo:
dishwasher (ts 15):
tomacco
SSTable 5
#Cassandra13
Read Path
Read columns from each
SSTable, then merge results.
(Roughly speaking.)
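The merge step can be sketched with the fragments of row foo from the Row Fragmentation slide. This assumes the usual reconciliation rule that the column with the highest timestamp wins. The function name is illustrative, and tombstones and the memtable are left out.

```python
def merge_fragments(fragments):
    """Merge row fragments; for each column the highest timestamp wins."""
    merged = {}
    for fragment in fragments:
        for column, (timestamp, value) in fragment.items():
            if column not in merged or timestamp > merged[column][0]:
                merged[column] = (timestamp, value)
    return merged


# Fragments of row "foo", one dict per SSTable that holds part of the row.
fragments = [
    {"dishwasher": (10, "tomato"), "purple": (10, "cromulent")},
    {"frink": (20, "flayven"), "monkey": (10, "embiggins")},
    {"dishwasher": (15, "tomacco")},
]
row = merge_fragments(fragments)
print(row["dishwasher"])  # (15, 'tomacco')
```

The dishwasher column written at ts 15 shadows the one written at ts 10, but the read still had to touch both SSTables to find that out.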
#Cassandra13
Read Path
Use Bloom Filter to
determine if a row key does
not exist in a SSTable.
(In memory)
#Cassandra13
Read Path
Search for prior key in
*-Index.db sample.
(In memory)
#Cassandra13
Read Path
Scan *-Index.db from
prior key to find the search
key and its *-Data.db
offset.
(On disk.)
#Cassandra13
Read Path
Read *-Data.db from
offset, all columns or specific
pages.
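The index lookup steps above (find the prior key in the in-memory sample, then scan the on-disk index from there) can be sketched with a binary search. The keys, offsets and sampling rate below are made up, and plain string ordering is an assumption. Cassandra samples every index_interval entries and orders keys by token.

```python
from bisect import bisect_right

# Full on-disk index: (key, offset into the data file), sorted by key.
index = [("a", 0), ("c", 120), ("f", 300), ("k", 450), ("p", 700), ("t", 910)]
# In-memory sample: every 2nd entry, stored as (key, position in the index).
SAMPLE_EVERY = 2
sample = [(key, pos) for pos, (key, _) in enumerate(index) if pos % SAMPLE_EVERY == 0]
sample_keys = [key for key, _ in sample]


def find_offset(search_key):
    """Return the data file offset for search_key, or None if absent."""
    # In memory: binary search for the last sampled key <= search_key.
    i = bisect_right(sample_keys, search_key) - 1
    if i < 0:
        return None  # search_key sorts before every key in this SSTable
    # On disk: scan forward from the sampled position.
    for key, offset in index[sample[i][1]:]:
        if key == search_key:
            return offset
        if key > search_key:
            break
    return None


print(find_offset("k"), find_offset("g"))
```

A larger sampling interval uses less memory but makes the on-disk scan longer, which is the trade-off behind the index_interval setting.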
#Cassandra13
Read purple, monkey, dishwasher
(Diagram: SSTables 1 to 5, with row foo fragmented across the *-Data.db files as on the Row Fragmentation slide.
Memory: a Bloom Filter and an Index Sample per SSTable.
Disk: an *-Index.db and a *-Data.db file per SSTable.)
#Cassandra13
Read With Key Cache
(Diagram: SSTables 1 to 5, with row foo fragmented across the *-Data.db files as on the Row Fragmentation slide.
Memory: a Bloom Filter, a Key Cache and an Index Sample per SSTable.
Disk: an *-Index.db and a *-Data.db file per SSTable.)
#Cassandra13
Read with Row Cache
(Diagram: SSTables 1 to 5, with row foo fragmented across the *-Data.db files as on the Row Fragmentation slide.
Memory: a Row Cache, plus a Bloom Filter, a Key Cache and an Index Sample per SSTable.
Disk: an *-Index.db and a *-Data.db file per SSTable.)
#Cassandra13
Performant Reads
Design queries to read from a
small number of SSTables.
#Cassandra13
Performant Reads
Read a small number of
named columns or a slice of
columns.
#Cassandra13
Performant Reads
Design data model to support
current application
requirements.
#Cassandra13
Platform
Tools
Problems
Maintenance
#Cassandra13
Logging
Configure via
log4j-server.properties
and
StorageServiceMBean
#Cassandra13
DEBUG Logging For One Class
log4j.logger.org.apache.cassandra.thrift.CassandraServer=DEBUG
#Cassandra13
Reading Logs
INFO [OptionalTasks:1] 2013-04-20 14:03:50,787
MeteredFlusher.java (line 62) flushing high-traffic column
family CFS(Keyspace='KS1', ColumnFamily='CF1') (estimated
403858136 bytes)
INFO [OptionalTasks:1] 2013-04-20 14:03:50,787
ColumnFamilyStore.java (line 634) Enqueuing flush of Memtable-
CF1@1333396270(145839277/403858136 serialized/live bytes,
1742365 ops)
INFO [FlushWriter:42] 2013-04-20 14:03:50,788 Memtable.java
(line 266) Writing Memtable-CF1@1333396270(145839277/403858136
serialized/live bytes, 1742365 ops)
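Flush lines like these are easy to mine for numbers. The sketch below pulls the serialized and live sizes out of the "Enqueuing flush" message. The regular expression is written against the line shown on this slide (with the slide's line wrapping removed) and is an assumption about the format, not a stable log contract.

```python
import re

LINE = ("Enqueuing flush of Memtable-CF1@1333396270"
        "(145839277/403858136 serialized/live bytes, 1742365 ops)")

PATTERN = re.compile(
    r"Memtable-(?P<cf>\w+)@\d+\((?P<serialized>\d+)/(?P<live>\d+) "
    r"serialized/live bytes, (?P<ops>\d+) ops\)"
)

match = PATTERN.search(LINE)
serialized = int(match["serialized"])
live = int(match["live"])
ops = int(match["ops"])

# Ratio of estimated in-memory size to serialized size for this memtable.
print(match["cf"], f"live/serialized ratio {live / serialized:.2f}")
print(f"about {serialized / ops:.0f} serialized bytes per op")
```

Tracking the live to serialized ratio over time shows how much heap a memtable is estimated to cost compared with what ends up on disk.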
#Cassandra13
GC Logs
cassandra-env.sh
# GC logging options -- uncomment to enable
# JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
# JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
# JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
# JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
# JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
# JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"
# JVM_OPTS="$JVM_OPTS -XX:PrintFLSStatistics=1"
# JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc-`date +%s`.log"
#Cassandra13
ParNew GC Starting
{Heap before GC invocations=224115 (full 111):
par new generation total 873856K, used 717289K ...)
eden space 699136K, 100% used ...)
from space 174720K, 10% used ...)
to space 174720K, 0% used ...)
#Cassandra13
Tenuring Distribution
240217.053: [ParNew
Desired survivor size 89456640 bytes, new threshold 4 (max 4)
- age 1: 22575936 bytes, 22575936 total
- age 2: 350616 bytes, 22926552 total
- age 3: 4380888 bytes, 27307440 total
- age 4: 1155104 bytes, 28462544 total
#Cassandra13
ParNew GC Finishing
Heap after GC invocations=224116 (full 111):
par new generation total 873856K, used 31291K ...)
eden space 699136K, 0% used ...)
from space 174720K, 17% used ...)
to space 174720K, 0% used ...)
#Cassandra13
nodetool info
Token : 0
Gossip active : true
Load : 130.64 GB
Generation No : 1369334297
Uptime (seconds) : 29438
Heap Memory (MB) : 3744.27 / 8025.38
Data Center : east
Rack : rack1
Exceptions : 0
Key Cache : size 104857584 (bytes), capacity 104857584
(bytes), 25364985 hits, 34874180 requests, 0.734 recent hit
rate, 14400 save period in seconds
Row Cache : size 0 (bytes), capacity 0...
#Cassandra13
nodetool ring
Note: Ownership information does not include topology, please specify a keyspace.
Address DC Rack Status State Load Owns Token
10.1.64.11 east rack1 Up Normal 130.64 GB 12.50% 0
10.1.65.8 west rack1 Up Normal 88.79 GB 0.00% 1
10.1.64.78 east rack1 Up Normal 52.66 GB 12.50% 212...216
10.1.65.181 west rack1 Up Normal 65.99 GB 0.00% 212...217
10.1.66.8 east rack1 Up Normal 64.38 GB 12.50% 425...432
10.1.65.178 west rack1 Up Normal 77.94 GB 0.00% 425...433
10.1.64.201 east rack1 Up Normal 56.42 GB 12.50% 638...648
10.1.65.59 west rack1 Up Normal 74.5 GB 0.00% 638...649
10.1.64.235 east rack1 Up Normal 79.68 GB 12.50% 850...864
10.1.65.16 west rack1 Up Normal 62.05 GB 0.00% 850...865
10.1.66.227 east rack1 Up Normal 106.73 GB 12.50% 106...080
10.1.65.226 west rack1 Up Normal 79.26 GB 0.00% 106...081
10.1.66.247 east rack1 Up Normal 66.68 GB 12.50% 127...295
10.1.65.19 west rack1 Up Normal 102.45 GB 0.00% 127...297
10.1.66.141 east rack1 Up Normal 53.72 GB 12.50% 148...512
10.1.65.253 west rack1 Up Normal 54.25 GB 0.00% 148...513
#Cassandra13
nodetool ring KS1
Address DC Rack Status State Load Effective-Ownership Token
10.1.64.11 east rack1 Up Normal 130.72 GB 12.50% 0
10.1.65.8 west rack1 Up Normal 88.81 GB 12.50% 1
10.1.64.78 east rack1 Up Normal 52.68 GB 12.50% 212...216
10.1.65.181 west rack1 Up Normal 66.01 GB 12.50% 212...217
10.1.66.8 east rack1 Up Normal 64.4 GB 12.50% 425...432
10.1.65.178 west rack1 Up Normal 77.96 GB 12.50% 425...433
10.1.64.201 east rack1 Up Normal 56.44 GB 12.50% 638...648
10.1.65.59 west rack1 Up Normal 74.57 GB 12.50% 638...649
10.1.64.235 east rack1 Up Normal 79.72 GB 12.50% 850...864
10.1.65.16 west rack1 Up Normal 62.12 GB 12.50% 850...865
10.1.66.227 east rack1 Up Normal 106.72 GB 12.50% 106...080
10.1.65.226 west rack1 Up Normal 79.28 GB 12.50% 106...081
10.1.66.247 east rack1 Up Normal 66.73 GB 12.50% 127...295
10.1.65.19 west rack1 Up Normal 102.47 GB 12.50% 127...297
10.1.66.141 east rack1 Up Normal 53.75 GB 12.50% 148...512
10.1.65.253 west rack1 Up Normal 54.24 GB 12.50% 148...513
#Cassandra13
nodetool status
$ nodetool status
Datacenter: ams01 (Replication Factor 3)
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.70.48.23 38.38 GB 256 19.0% 7c5fdfad-63c6-4f37-bb9f-a66271aa3423 RAC1
UN 10.70.6.78 58.13 GB 256 18.3% 94e7f48f-d902-4d4a-9b87-81ccd6aa9e65 RAC1
UN 10.70.47.126 53.89 GB 256 19.4% f36f1f8c-1956-4850-8040-b58273277d83 RAC1
Datacenter: wdc01 (Replication Factor 3)
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.24.116.66 65.81 GB 256 22.1% f9dba004-8c3d-4670-94a0-d301a9b775a8 RAC1
UN 10.55.104.90 63.31 GB 256 21.2% 4746f1bd-85e1-4071-ae5e-9c5baac79469 RAC1
UN 10.55.104.27 62.71 GB 256 21.2% 1a55cfd4-bb30-4250-b868-a9ae13d81ae1 RAC1
#Cassandra13
nodetool cfstats
Keyspace: KS1
Column Family: CF1
SSTable count: 11
Space used (live): 32769179336
Space used (total): 32769179336
Number of Keys (estimate): 73728
Memtable Columns Count: 1069137
Memtable Data Size: 216442624
Memtable Switch Count: 3
Read Count: 95
Read Latency: NaN ms.
Write Count: 1039417
Write Latency: 0.068 ms.
Bloom Filter False Postives: 345
Bloom Filter False Ratio: 0.00000
Bloom Filter Space Used: 230096
Compacted row minimum size: 150
Compacted row maximum size: 322381140
Compacted row mean size: 2072156
#Cassandra13
nodetool cfhistograms
$ nodetool cfhistograms KS1 CF1
Offset SSTables Write Latency Read Latency Row Size Column Count
1 67264 0 0 0 1331591
2 19512 0 0 0 4241686
3 35529 0 0 0 474784
...
10 10299 1150 0 0 21768
12 5475 3569 0 0 3993135
14 1986 9098 0 0 1434778
17 258 30916 0 0 366895
20 0 52980 0 0 186524
24 0 104463 0 0 25439063
...
179 0 93 1823 1597 1284167
215 0 84 3880 1231655 1147150
258 0 170 5164 209282 956487
#Cassandra13
nodetool proxyhistograms
$ nodetool proxyhistograms
Offset Read Latency Write Latency Range Latency
60 0 15 0
72 0 51 0
86 0 241 0
103 2 2003 0
124 9 5798 0
149 67 7348 0
179 222 6453 0
215 184 6071 0
258 134 5436 0
310 104 4936 0
372 89 4997 0
446 39 6383 0
535 76797 7518 0
642 9364748 96065 0
770 16406421 152663 0
924 7429538 97612 0
1109 6781835 176829 0
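A histogram like this is easier to reason about as percentiles. The sketch below finds the bucket that contains a given fraction of the read requests, using only the rows shown on this slide. It assumes the Offset column is the bucket's latency in microseconds, and the function name is illustrative.

```python
def percentile_bucket(histogram, fraction):
    """Return the offset of the bucket containing the given fraction of requests."""
    total = sum(count for _, count in histogram)
    if total == 0:
        return None
    target = fraction * total
    running = 0
    for offset, count in histogram:
        running += count
        if running >= target:
            return offset
    return histogram[-1][0]


# (offset, read count) pairs copied from the proxyhistograms output above.
read_latency = [
    (60, 0), (72, 0), (86, 0), (103, 2), (124, 9), (149, 67), (179, 222),
    (215, 184), (258, 134), (310, 104), (372, 89), (446, 39), (535, 76797),
    (642, 9364748), (770, 16406421), (924, 7429538), (1109, 6781835),
]
print(percentile_bucket(read_latency, 0.50))
print(percentile_bucket(read_latency, 0.95))
```

For the rows shown, half of the reads completed in the 770 bucket or faster. Any rows missing from the slide would shift the higher percentiles.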
#Cassandra13
JMX via JConsole
#Cassandra13
JMX via MX4J
#Cassandra13
JMX via JMXTERM
$ java -jar jmxterm-1.0-alpha-4-uber.jar
Welcome to JMX terminal. Type "help" for available commands.
$>open localhost:7199
#Connection to localhost:7199 is opened
$>bean org.apache.cassandra.db:type=StorageService
#bean is set to org.apache.cassandra.db:type=StorageService
$>info
#mbean = org.apache.cassandra.db:type=StorageService
#class name = org.apache.cassandra.service.StorageService
# attributes
%0 - AllDataFileLocations ([Ljava.lang.String;, r)
%1 - CommitLogLocation (java.lang.String, r)
%2 - CompactionThroughputMbPerSec (int, rw)
...
# operations
%1 - void bulkLoad(java.lang.String p1)
%2 - void clearSnapshot(java.lang.String p1,[Ljava.lang.String; p2)
%3 - void decommission()
#Cassandra13
JVM Heap Dump via JMAP
jmap -dump:format=b,file=heap.bin <pid>
#Cassandra13
JVM Heap Dump with YourKit
#Cassandra13
Platform
Tools
Problems
Maintenance
#Cassandra13
Corrupt SSTable
(Very rare.)
#Cassandra13
Compaction Error
ERROR [CompactionExecutor:36] 2013-04-29 07:50:49,060 AbstractCassandraDaemon.java (line 132) Exception in thread Thread[CompactionExecutor:36,1,main]
java.lang.RuntimeException: Last written key DecoratedKey(138024912283272996716128964353306009224, 61386330356130622d616666362d376330612d666531662d373738616630636265396535) >= current key DecoratedKey(127065377405949402743383718901402082101, 64323962636163652d646561372d333039322d386166322d663064346132363963386131) writing into *-tmp-hf-7372-Data.db
    at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:134)
    at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:153)
    at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:160)
    at org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
    at org.apache.cassandra.db.compaction.CompactionManager$2.runMayThrow(CompactionManager.java:164)
#Cassandra13
Cause
Change in KeyValidator or
bug in older versions.
#Cassandra13
Fix
nodetool scrub
#Cassandra13
Dropped Messages
#Cassandra13
Logs
MessagingService.java (line 658) 173 READ messages dropped in last 5000ms
StatusLogger.java (line 57) Pool Name Active Pending
StatusLogger.java (line 72) ReadStage 32 284
StatusLogger.java (line 72) RequestResponseStage 1 254
StatusLogger.java (line 72) ReadRepairStage 0 0
#Cassandra13
nodetool tpstats
Message type Dropped
RANGE_SLICE 0
READ_REPAIR 0
BINARY 0
READ 721
MUTATION 1262
REQUEST_RESPONSE 196
#Cassandra13
Causes
Excessive GC.
Overloaded IO.
Overloaded Node.
Wide Reads / Large Batches.
#Cassandra13
High Read Latency
#Cassandra13
nodetool info
Token : 113427455640312814857969558651062452225
Gossip active : true
Thrift active : true
Load : 291.13 GB
Generation No : 1368569510
Uptime (seconds) : 1022629
Heap Memory (MB) : 5213.01 / 8025.38
Data Center : 1
Rack : 20
Exceptions : 0
Key Cache : size 104857584 (bytes), capacity 104857584 (bytes), 13436862
hits, 16012159 requests, 0.907 recent hit rate, 14400 save period in seconds
Row Cache : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN
recent hit rate, 0 save period in seconds
#Cassandra13
nodetool cfstats
Column Family: page_views
SSTable count: 17
Space used (live): 289942843592
Space used (total): 289942843592
Number of Keys (estimate): 1071416832
Memtable Columns Count: 2041888
Memtable Data Size: 539015124
Memtable Switch Count: 83
Read Count: 267059
Read Latency: NaN ms.
Write Count: 10516969
Write Latency: 0.054 ms.
Pending Tasks: 0
Bloom Filter False Positives: 128586
Bloom Filter False Ratio: 0.00000
Bloom Filter Space Used: 802906184
Compacted row minimum size: 447
Compacted row maximum size: 3973
Compacted row mean size: 867
#Cassandra13
nodetool cfhistograms KS1 CF1
Offset SSTables Write Latency Read Latency Row Size Column Count
1 178437 0 0 0 0
2 20042 0 0 0 0
3 15275 0 0 0 0
4 11632 0 0 0 0
5 4771 0 0 0 0
6 4942 0 0 0 0
7 5540 0 0 0 0
8 4967 0 0 0 0
10 10682 0 0 0 284155
12 8355 0 0 0 15372508
14 1961 0 0 0 137959096
17 322 3 0 0 625733930
20 61 253 0 0 252953547
24 53 15114 0 0 39109718
29 18 255730 0 0 0
35 1 1532619 0 0 0
...
#Cassandra13
nodetool cfhistograms KS1 CF1
Offset SSTables Write Latency Read Latency Row Size Column Count
446 0 120 233 0 0
535 0 155 261 21361 0
642 0 127 284 19082720 0
770 0 88 218 498648801 0
924 0 86 2699 504702186 0
1109 0 22 3157 48714564 0
1331 0 18 2818 241091 0
1597 0 15 2155 2165 0
1916 0 19 2098 7 0
2299 0 10 1140 56 0
2759 0 10 1281 0 0
3311 0 6 1064 0 0
3973 0 4 676 3 0
...
#Cassandra13
jmx-term
$ java -jar jmxterm-1.0-alpha-4-uber.jar 
Welcome to JMX terminal. Type "help" for available commands.
$>open localhost:7199
#Connection to localhost:7199 is opened
$>bean org.apache.cassandra.db:columnfamily=CF2,keyspace=KS2,type=ColumnFamilies
#bean is set to
org.apache.cassandra.db:columnfamily=CF2,keyspace=KS2,type=ColumnFamilies
$>get BloomFilterFalseRatio
#mbean =
org.apache.cassandra.db:columnfamily=CF2,keyspace=KS2,type=ColumnFamilies:
BloomFilterFalseRatio = 0.5693801541828607;
#Cassandra13
Back to cfstats
Column Family: page_views
Read Count: 270075
Bloom Filter False Positives: 131294
#Cassandra13
Cause
bloom_filter_fp_chance had been set to 0.1
to reduce memory requirements when
storing 1+ Billion rows per Node.
#Cassandra13
Fix
Changed read queries to select by column
name to limit SSTables per query.
Long term, migrate to Cassandra v1.2 for
off-heap Bloom Filters.
#Cassandra13
GC Problems
#Cassandra13
WARN
WARN [ScheduledTasks:1] 2013-03-29 18:40:48,158
GCInspector.java (line 145) Heap is 0.9355130159566108 full.
You may need to reduce memtable and/or cache sizes.
INFO [ScheduledTasks:1] 2013-03-26 16:36:06,383
GCInspector.java (line 122) GC for ConcurrentMarkSweep: 207 ms
for 1 collections, 10105891032 used; max is 13591642112
INFO [ScheduledTasks:1] 2013-03-28 22:18:17,113
GCInspector.java (line 122) GC for ParNew: 256 ms for 1
collections, 6504905688 used; max is 13591642112
#Cassandra13
Serious GC Problems
INFO [ScheduledTasks:1] 2013-04-30 23:21:11,959
GCInspector.java (line 122) GC for ParNew: 1115 ms for 1
collections, 9355247296 used; max is 12801015808
#Cassandra13
Flapping Node
INFO [GossipTasks:1] 2013-03-28 17:42:07,944 Gossiper.java
(line 830) InetAddress /10.1.20.144 is now dead.
INFO [GossipStage:1] 2013-03-28 17:42:54,740 Gossiper.java
(line 816) InetAddress /10.1.20.144 is now UP
INFO [GossipTasks:1] 2013-03-28 17:46:00,585 Gossiper.java
(line 830) InetAddress /10.1.20.144 is now dead.
INFO [GossipStage:1] 2013-03-28 17:46:13,855 Gossiper.java
(line 816) InetAddress /10.1.20.144 is now UP
INFO [GossipStage:1] 2013-03-28 17:48:48,966 Gossiper.java
(line 830) InetAddress /10.1.20.144 is now dead.
#Cassandra13
“GC Problems are the result
of workload and
configuration.”
Aaron Morton, Just Now.
#Cassandra13
Workload Correlation?
Look for wide rows, large
writes, wide reads, unbounded
multi row reads or
writes.
#Cassandra13
Compaction Correlation?
Slow down Compaction to improve stability.
concurrent_compactors: 2
compaction_throughput_mb_per_sec: 8
in_memory_compaction_limit_in_mb: 32
(Monitor and reverse when resolved.)
#Cassandra13
GC Logging Insights
Slow down rate of tenuring and enable full
GC logging.
HEAP_NEWSIZE="1200M"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=4"
#Cassandra13
GC’ing Objects in ParNew
{Heap before GC invocations=7937 (full 205):
par new generation total 1024000K, used 830755K ...)
eden space 819200K, 100% used ...)
from space 204800K, 5% used ...)
to space 204800K, 0% used ...)
Desired survivor size 104857600 bytes, new threshold 4 (max 4)
- age 1: 8090240 bytes, 8090240 total
- age 2: 565016 bytes, 8655256 total
- age 3: 330152 bytes, 8985408 total
- age 4: 657840 bytes, 9643248 total
#Cassandra13
GC’ing Objects in ParNew
{Heap before GC invocations=7938 (full 205):
par new generation total 1024000K, used 835015K ...)
eden space 819200K, 100% used ...)
from space 204800K, 7% used ...)
to space 204800K, 0% used ...)
Desired survivor size 104857600 bytes, new threshold 4 (max 4)
- age 1: 1315072 bytes, 1315072 total
- age 2: 541072 bytes, 1856144 total
- age 3: 499432 bytes, 2355576 total
- age 4: 316808 bytes, 2672384 total
#Cassandra13
Cause
Nodes had wide rows & 1.3+
Billion rows and 3+GB of
Bloom Filters.
(Using older bloom_filter_fp_chance of 0.000744.)
#Cassandra13
Fix
Increased FP chance to 0.1 on
one CF and 0.01 on the others.
(One CF reduced from 770MB to 170MB of Bloom Filters.)
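The memory effect of the false positive chance can be estimated with the textbook Bloom filter sizing formula, bits per key = -ln(p) / (ln 2)^2. This is a rough sketch and not Cassandra's own sizing code, which chooses bucket and hash counts differently, so real sizes such as the 770MB and 170MB above will not match it exactly.

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Bits per key for an optimally sized Bloom filter (textbook formula)."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_size_gb(keys: int, fp_chance: float) -> float:
    """Approximate filter size in GB for the given number of keys."""
    return keys * bloom_bits_per_key(fp_chance) / 8 / 1e9


for fp_chance in (0.000744, 0.01, 0.1):
    bits = bloom_bits_per_key(fp_chance)
    size = bloom_size_gb(1_300_000_000, fp_chance)
    print(f"fp_chance={fp_chance}: {bits:.1f} bits/key, {size:.2f} GB for 1.3 billion keys")
```

At 0.000744 the formula gives roughly 15 bits per key, which is in the same range as the multi-GB Bloom Filters reported for 1.3+ billion rows. Raising the chance to 0.1 cuts that to under 5 bits per key at the cost of more wasted SSTable reads.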
#Cassandra13
Fix
Increased
index_interval from 128
to 512.
(Increased key_cache_size_in_mb to 200.)
#Cassandra13
Fix
MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="1000M"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=2"
#Cassandra13
Anatomy of a Partition.
(From a 1.0 cluster)
#Cassandra13
Node 23 Was Up
cassandra23# bin/nodetool -h localhost info
Token : 28356863910078205288614550619314017621
Gossip active : true
Load : 275.44 GB
Generation No : 1762556151
Uptime (seconds) : 67548
Heap Memory (MB) : 2926.44 / 8032.00
Data Center : DC1
Rack : RAC_unknown
Exceptions : 0
#Cassandra13
Other Nodes Saw It Down
cassandra20# nodetool -h localhost ring
Address DC Rack Status State Load
10.37.114.8 DC1 RAC20 Up Normal 285.86 GB
10.29.60.10 DC2 RAC23 Down Normal 277.86 GB
10.6.130.70 DC1 RAC21 Up Normal 244.9 GB
10.29.60.14 DC2 RAC24 Up Normal 296.85 GB
10.37.114.10 DC1 RAC22 Up Normal 255.81 GB
10.29.60.12 DC2 RAC25 Up Normal 316.88 GB
#Cassandra13
And Node 23 Saw Them Up
cassandra23# nodetool -h localhost ring
Address DC Rack Status State Load
10.37.114.8 DC1 RAC20 Up Normal 285.86 GB
10.29.60.10 DC2 RAC23 Up Normal 277.86 GB
10.6.130.70 DC1 RAC21 Up Normal 244.9 GB
10.29.60.14 DC2 RAC24 Up Normal 296.85 GB
10.37.114.10 DC1 RAC22 Up Normal 255.81 GB
10.29.60.12 DC2 RAC25 Up Normal 316.88 GB
#Cassandra13
Still Available
Node 23 could serve requests at
LOCAL_QUORUM, QUORUM and ALL
Consistency.
Other nodes could serve requests at
LOCAL_QUORUM and QUORUM but not ALL
Consistency.
#Cassandra13
Relax
The application was up.
#Cassandra13
Gossip?
cassandra20# bin/nodetool -h localhost gossipinfo
...
/10.29.60.10
LOAD:2.98347080902E11
STATUS:NORMAL,28356863910078205288614550619314017621
RPC_ADDRESS:10.29.60.10
SCHEMA:fe933880-19bd-11e1-0000-5ff37d368cb6
RELEASE_VERSION:1.0.5
#Cassandra13
Gossip Logs On Node 20?
log4j.logger.org.apache.cassandra.gms.Gossiper=TRACE
TRACE [GossipStage:1] 2011-12-13 00:58:49,636 Gossiper.java
(line 647) local heartbeat version 526912 greater than 7951
for /10.29.60.10
#Cassandra13
More Gossip Logs On Node 20?
log4j.logger.org.apache.cassandra.gms.GossipDigestSynVerbHandler=TRACE
log4j.logger.org.apache.cassandra.gms.FailureDetector=TRACE
TRACE [GossipStage:1] 2011-12-13 02:14:37,033 GossipDigestSynVerbHandler.java
(line 46) Received a GossipDigestSynMessage from /10.29.60.10
TRACE [GossipStage:1] 2011-12-13 02:14:37,033 GossipDigestSynVerbHandler.java
(line 76) Gossip syn digests are : /10.29.60.10:1762556151:12552 /
10.29.60.14:1323732392:10208 /10.37.114.8:1323731527:11082 /
10.37.114.10:1323736718:5830 /10.6.130.70:1323732220:10379 /
10.29.60.12:1323733099:9493
//Expected call to the FailureDetector
TRACE [GossipStage:1] 2011-12-13 02:14:37,033 GossipDigestSynVerbHandler.java
(line 90) Sending a GossipDigestAckMessage to /10.29.60.10
#Cassandra13
Cause.
Generation is initialised at bootstrap to
seconds past the Epoch.
1762556151 is Fri, 07 Nov 2025 22:55:51
GMT.
cassandra23# bin/nodetool -h localhost info
Generation No : 1762556151
TRACE [GossipStage:1] 2011-12-13 02:14:37,033 GossipDigestSynVerbHandler.java
(line 76) Gossip syn digests are : /10.29.60.10:1762556151:12552 /
#Cassandra13
Fix.
[default@system] get LocationInfo['L'];
=> (column=ClusterName, value=737069, timestamp=1320437246450000)
=> (column=Generation, value=690e78f6, timestamp=1762556150811000)
#Cassandra13
Platform
Tools
Problems
Maintenance
#Cassandra13
Maintenance
Expand to Multi DC
#Cassandra13
Expand to Multi DC
Update Snitch
Update Replication Strategy
Add Nodes
Update Replication Factor
Rebuild
#Cassandra13
DC Aware Snitch?
SimpleSnitch puts all
nodes in rack1 and
datacenter1.
#Cassandra13
More Snitches?
PropertyFileSnitch
RackInferringSnitch
#Cassandra13
Gossip Based Snitch?
Ec2Snitch
Ec2MultiRegionSnitch
GossipingPropertyFileSnitch*
#Cassandra13
Changing the Snitch
Do Not change the DC or
Rack for an existing node.
(Cassandra will not be able to find your data.)
#Cassandra13
Moving to the GossipingPropertyFileSnitch
Update cassandra-topology.properties
on existing nodes with existing DC/Rack
settings for all existing nodes.
Set default to new DC.
#Cassandra13
Moving to the GossipingPropertyFileSnitch
Update cassandra-rackdc.properties
on existing nodes with existing DC/Rack for
the node.
#Cassandra13
Moving to the GossipingPropertyFileSnitch
Use a rolling restart to upgrade existing nodes
to GossipingPropertyFileSnitch
#Cassandra13
Expand to Multi DC
Update Snitch
Update Replication Strategy
Add Nodes
Update Replication Factor
Rebuild
#Cassandra13
Got NTS ?
Must use
NetworkTopologyStrategy
for Multi DC deployments.
#Cassandra13
SimpleStrategy
Order Token Ranges.
Start with range that contains
Row Key.
Count to RF.
#Cassandra13
SimpleStrategy
"foo"
#Cassandra13
NetworkTopologyStrategy
Order Token Ranges in the DC.
Start with range that contains the Row Key.
Add first unselected Token Range from each
Rack.
Repeat until RF selected.
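The NetworkTopologyStrategy steps for a single DC can be sketched the same way. This is a simplified reading of the slide, not the actual implementation. The ring, rack names and function name are made up.

```python
from bisect import bisect_left

# Hypothetical single-DC ring: token -> (node, rack).
ring = {
    0: ("n1", "rack1"), 20: ("n2", "rack1"), 40: ("n3", "rack2"),
    60: ("n4", "rack2"), 80: ("n5", "rack3"), 100: ("n6", "rack3"),
}
tokens = sorted(ring)


def nts_replicas(key_token: int, rf: int) -> list[str]:
    """Pick replicas in one DC, spreading them across racks where possible."""
    start = bisect_left(tokens, key_token) % len(tokens)
    # Nodes in ring order, starting with the range that contains the key.
    ordered = [ring[tokens[(start + i) % len(tokens)]] for i in range(len(tokens))]
    chosen: list[str] = []
    target = min(rf, len(ordered))
    while len(chosen) < target:
        racks_this_pass = set()
        # One pass takes the first unselected node from each rack.
        for node, rack in ordered:
            if len(chosen) == target:
                break
            if node not in chosen and rack not in racks_this_pass:
                chosen.append(node)
                racks_this_pass.add(rack)
    return chosen


print(nts_replicas(5, rf=3))  # ['n2', 'n3', 'n5']
```

For the same key SimpleStrategy would pick n2, n3 and n4, putting two replicas in rack2. If every node is in one rack, each pass selects a single node and the result is the same consecutive nodes SimpleStrategy would choose.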
#Cassandra13
NetworkTopologyStrategy
"foo"
Rack 1
Rack 2
Rack 3
#Cassandra13
NetworkTopologyStrategy & 1 Rack
"foo"
Rack 1
#Cassandra13
Changing the Replication Strategy
Be careful if the existing
configuration has multiple
Racks.
(Cassandra may not be able to find your data.)
#Cassandra13
Changing the Replication Strategy
Update Keyspace configuration to use
NetworkTopologyStrategy with
datacenter1:3 and new_dc:0.
#Cassandra13
Preparing The Client
Disable auto node discovery or use DC
aware methods.
Use LOCAL_QUORUM or EACH_QUORUM.
#Cassandra13
Expand to Multi DC
Update Snitch
Update Replication Strategy
Add Nodes
Update Replication Factor
Rebuild
#Cassandra13
Configuring New Nodes
Add auto_bootstrap: false to
cassandra.yaml.
Use GossipingPropertyFileSnitch.
Three Seeds from each DC.
(Use cluster_name as a safety.)
#Cassandra13
Configuring New Nodes
Update cassandra-rackdc.properties
on new nodes with new DC/Rack for the
node.
(Ignore cassandra-topology.properties)
#Cassandra13
Start The New Nodes
New Nodes in the Ring in the
new DC without data or
traffic.
#Cassandra13
Expand to Multi DC
Update Snitch
Update Replication Strategy
Add Nodes
Update Replication Factor
Rebuild
#Cassandra13
Change the Replication Factor
Update Keyspace configuration to use
NetworkTopologyStrategy with
datacenter1:3 and new_dc:3.
#Cassandra13
Change the Replication Factor
New DC nodes will start
receiving writes from old DC
coordinators.
#Cassandra13
Expand to Multi DC
Update Snitch
Update Replication Strategy
Add Nodes
Update Replication Factor
Rebuild
#Cassandra13
Y U No Bootstrap?
DC 1 DC 2
#Cassandra13
nodetool rebuild DC1
DC 1 DC 2
#Cassandra13
Rebuild Complete
New Nodes now performing Strong
Consistency reads.
(If EACH_QUORUM used for writes.)
#Cassandra13
Summary
Relax.
Understand the Platform and
the Tools.
Always maintain Availability.
#Cassandra13
Thanks.
#Cassandra13
Aaron Morton
@aaronmorton
www.thelastpickle.com
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License

More Related Content

What's hot

SSL Failing, Sharing, and Scheduling
SSL Failing, Sharing, and SchedulingSSL Failing, Sharing, and Scheduling
SSL Failing, Sharing, and SchedulingDavid Evans
 
Varnish @ Velocity Ignite
Varnish @ Velocity IgniteVarnish @ Velocity Ignite
Varnish @ Velocity IgniteArtur Bergman
 
How to Troubleshoot OpenStack Without Losing Sleep
How to Troubleshoot OpenStack Without Losing SleepHow to Troubleshoot OpenStack Without Losing Sleep
How to Troubleshoot OpenStack Without Losing SleepSadique Puthen
 
Disk reports predicted failure event
Disk reports predicted failure eventDisk reports predicted failure event
Disk reports predicted failure eventAshwin Pawar
 
Oracle Latch and Mutex Contention Troubleshooting
Oracle Latch and Mutex Contention TroubleshootingOracle Latch and Mutex Contention Troubleshooting
Oracle Latch and Mutex Contention TroubleshootingTanel Poder
 
Debugging Ruby
Debugging RubyDebugging Ruby
Debugging RubyAman Gupta
 
Debugging Ruby Systems
Debugging Ruby SystemsDebugging Ruby Systems
Debugging Ruby SystemsEngine Yard
 
Безопасность интернет-приложений осень 2013 лекция 7
Безопасность интернет-приложений осень 2013 лекция 7Безопасность интернет-приложений осень 2013 лекция 7
Безопасность интернет-приложений осень 2013 лекция 7Technopark
 
Troubleshooting Complex Oracle Performance Problems with Tanel Poder
Troubleshooting Complex Oracle Performance Problems with Tanel PoderTroubleshooting Complex Oracle Performance Problems with Tanel Poder
Troubleshooting Complex Oracle Performance Problems with Tanel PoderTanel Poder
 
Data Mining with Splunk
Data Mining with SplunkData Mining with Splunk
Data Mining with SplunkDavid Carasso
 
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев, Михаил Тюр...
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев,  Михаил Тюр...Мастер-класс "Логическая репликация и Avito" / Константин Евтеев,  Михаил Тюр...
Мастер-класс "Логическая репликация и Avito" / Константин Евтеев, Михаил Тюр...Ontico
 
Openstack on Fedora, Fedora on Openstack: An Introduction to cloud IaaS
Openstack on Fedora, Fedora on Openstack: An Introduction to cloud IaaSOpenstack on Fedora, Fedora on Openstack: An Introduction to cloud IaaS
Openstack on Fedora, Fedora on Openstack: An Introduction to cloud IaaSSadique Puthen
 
Tensorflow and python : fault detection system - PyCon Taiwan 2017
Tensorflow and python : fault detection system - PyCon Taiwan 2017Tensorflow and python : fault detection system - PyCon Taiwan 2017
Tensorflow and python : fault detection system - PyCon Taiwan 2017Eric Ahn
 
Doing Horrible Things with DNS - Web Directions South
Doing Horrible Things with DNS - Web Directions SouthDoing Horrible Things with DNS - Web Directions South
Doing Horrible Things with DNS - Web Directions SouthTom Croucher
 
Montreal User Group - Cloning Cassandra
Montreal User Group - Cloning CassandraMontreal User Group - Cloning Cassandra
Montreal User Group - Cloning CassandraAdam Hutson
 
Being closer to Cassandra by Oleg Anastasyev. Talk at Cassandra Summit EU 2013
Being closer to Cassandra by Oleg Anastasyev. Talk at Cassandra Summit EU 2013Being closer to Cassandra by Oleg Anastasyev. Talk at Cassandra Summit EU 2013
Being closer to Cassandra by Oleg Anastasyev. Talk at Cassandra Summit EU 2013odnoklassniki.ru
 
.NET Fest 2019. Łukasz Pyrzyk. Daily Performance Fuckups
.NET Fest 2019. Łukasz Pyrzyk. Daily Performance Fuckups.NET Fest 2019. Łukasz Pyrzyk. Daily Performance Fuckups
.NET Fest 2019. Łukasz Pyrzyk. Daily Performance FuckupsNETFest
 
dns-sec-4-slides
dns-sec-4-slidesdns-sec-4-slides
dns-sec-4-slideskj teoh
 
CS4344 Lecture 9: Traffic Analysis
CS4344 Lecture 9: Traffic AnalysisCS4344 Lecture 9: Traffic Analysis
CS4344 Lecture 9: Traffic AnalysisWei Tsang Ooi
 

Cassandra SF 2013 - In Case Of Emergency Break Glass

  • 1. CASSANDRA SUMMIT 2013 IN CASE OF EMERGENCY BREAK GLASS Aaron Morton @aaronmorton www.thelastpickle.com #Cassandra13 Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
  • 2. About Me: Freelance Cassandra Consultant. Based in Wellington, New Zealand. Apache Cassandra Committer.
  • 5. The Platform & Clients
  • 6. The Platform & Running Clients
  • 7. The Platform & Reality: Consistency, Availability, Partition Tolerance
  • 8. The Platform & Consistency: Strong Consistency (R + W > N); Eventual Consistency (R + W <= N)
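The R + W > N rule on slide 8 can be checked with a few lines of Python. This is a minimal sketch, assuming N is the number of replicas for a key (the replication factor), R and W are the number of replicas that must acknowledge a read and a write, and QUORUM means a majority of replicas. The function names are illustrative and are not part of any Cassandra API.

```python
def quorum(n: int) -> int:
    """Majority of n replicas (assumed definition of QUORUM)."""
    return n // 2 + 1


def is_strongly_consistent(r: int, w: int, n: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    return r + w > n


rf = 3
q = quorum(rf)
# QUORUM reads with QUORUM writes overlap on at least one replica.
print("QUORUM/QUORUM:", is_strongly_consistent(q, q, rf))
# ONE reads with ONE writes may touch disjoint replicas.
print("ONE/ONE:", is_strongly_consistent(1, 1, rf))
```

With a replication factor of 3, QUORUM is 2, so 2 + 2 > 3 holds while 1 + 1 > 3 does not.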
  • 9. What Price Consistency? In a multi-DC cluster, QUORUM and EACH_QUORUM involve cross-DC latency.
  • 10. The Platform & Availability: For each Token Range, keep at least as many replica nodes UP as the Consistency Level requires.
  • 11. Best Case Failure with N=9 and RF 3, 100% Availability [Diagram: Range A with Replica 1, Replica 2, Replica 3]
  • 12. Worst Case Failure with N=9 and RF 3, 78% Availability [Diagram: Range A and Range B]
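The 100% and 78% figures on slides 11 and 12 can be reproduced with a small simulation. This is a sketch under assumptions the slides do not spell out: one token range per node, each range replicated on the owning node and the next RF - 1 nodes around the ring, and requests that need a QUORUM (2 of 3) of replicas. The particular sets of failed nodes are also chosen here for illustration.

```python
def available_fraction(n_nodes, rf, down, required):
    """Fraction of token ranges that still have `required` replicas up.

    Assumes range i is stored on nodes i, i+1, ..., i+rf-1 (mod n_nodes).
    """
    ok = 0
    for token_range in range(n_nodes):
        replicas = [(token_range + i) % n_nodes for i in range(rf)]
        up = sum(1 for node in replicas if node not in down)
        if up >= required:
            ok += 1
    return ok / n_nodes


QUORUM = 2  # majority of RF 3

# Failures spread so that every range loses at most one replica.
best = available_fraction(9, 3, down={0, 3, 6}, required=QUORUM)
# Two adjacent failures: two ranges lose two replicas each.
worst = available_fraction(9, 3, down={0, 1}, required=QUORUM)

print(f"best case:  {best:.0%}")
print(f"worst case: {worst:.0%}")
```

In the second case the two ranges that contain both failed nodes cannot reach QUORUM, leaving 7 of 9 ranges available, which rounds to 78%.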
  • 13. The Platform & Partition Tolerance: A failed node does not create a partition.
  • 14. The Platform & Partition Tolerance
  • 15. The Platform & Partition Tolerance: Partitions occur when the network fails.
  • 16. The Platform & Partition Tolerance
  • 17. The Storage Engine: Optimised for writes.
  • 18. Write Path: Append to Write Ahead Log. (fsync every 10s by default, other options available)
  • 19. Write Path: Merge new Columns into Memtable. (Lock free, always in memory.)
  • 20. Write Path... Later: Asynchronously flush Memtable to a new SSTable on disk. (May be tens or hundreds of MB in size.)
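The three write path steps on slides 18 to 20 can be sketched as a toy in-memory model. This is not Cassandra's implementation: the class name `ToyNode`, the `flush_threshold` parameter, and the choice to flush synchronously after a fixed number of writes are all assumptions made for illustration (the slides describe the real flush as asynchronous).

```python
class ToyNode:
    """Toy model of the write path: log append, memtable merge, flush."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # stand-in for the write ahead log
        self.memtable = {}     # row key -> {column: (timestamp, value)}
        self.sstables = []     # flushed, never-modified snapshots, oldest first
        self.flush_threshold = flush_threshold
        self._ops = 0

    def write(self, row, column, timestamp, value):
        # Step 1: append to the log before touching the memtable.
        self.commit_log.append((row, column, timestamp, value))
        # Step 2: merge the column into the memtable, newest timestamp wins.
        columns = self.memtable.setdefault(row, {})
        current = columns.get(column)
        if current is None or timestamp >= current[0]:
            columns[column] = (timestamp, value)
        self._ops += 1
        # Step 3 (later, asynchronous in the real system): flush.
        if self._ops >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.memtable:
            self.sstables.append(self.memtable)
            self.memtable = {}
            self.commit_log.clear()  # flushed writes no longer need the log
        self._ops = 0


node = ToyNode(flush_threshold=2)
node.write("foo", "dishwasher", 10, "tomato")
node.write("foo", "purple", 10, "cromulent")
print(len(node.sstables), node.memtable)
```

Each flush produces a new snapshot rather than updating an old one, which is why a single row can end up spread over several SSTables.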
  • 22. Row Fragmentation
      SSTable 1: foo: dishwasher (ts 10): tomato; purple (ts 10): cromulent
      SSTable 2: foo: frink (ts 20): flayven; monkey (ts 10): embiggins
      SSTable 3: (no entry shown)
      SSTable 4: foo: dishwasher (ts 15): tomacco
      SSTable 5: (no entry shown)
  • 23. Read Path: Read columns from each SSTable, then merge results. (Roughly speaking.)
  • 24. Read Path: Use Bloom Filter to determine if a row key does not exist in an SSTable. (In memory.)
  • 25. Read Path: Search for prior key in *-Index.db sample. (In memory.)
  • 26. Read Path: Scan *-Index.db from prior key to find the search key and its *-Data.db offset. (On disk.)
  • 27. Read Path: Read *-Data.db from offset, all columns or specific pages.
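The "read from each SSTable, then merge" idea can be sketched with the row fragments from slide 22. This is a simplified model: each SSTable is a plain dict, a dict membership test stands in for the Bloom filter (a real Bloom filter can return false positives, a dict cannot), and the index and data file steps are skipped entirely. The names `SSTABLES` and `read` are illustrative.

```python
# Row fragments for key "foo", as shown on slide 22.
# Each column maps to (timestamp, value).
SSTABLES = [
    {"foo": {"dishwasher": (10, "tomato"), "purple": (10, "cromulent")}},
    {"foo": {"frink": (20, "flayven"), "monkey": (10, "embiggins")}},
    {},
    {"foo": {"dishwasher": (15, "tomacco")}},
    {},
]


def read(row, wanted):
    """Collect the wanted columns from every SSTable, newest timestamp wins."""
    merged = {}
    for sstable in SSTABLES:
        if row not in sstable:  # stand-in for the Bloom filter check
            continue
        for name, (ts, value) in sstable[row].items():
            if name not in wanted:
                continue
            if name not in merged or ts > merged[name][0]:
                merged[name] = (ts, value)
    return {name: value for name, (ts, value) in merged.items()}


result = read("foo", {"purple", "monkey", "dishwasher"})
print(result)
```

Three of the five SSTables have to be consulted for this one row, and the dishwasher column resolves to the value with the higher timestamp (ts 15).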
  • 28. Read purple, monkey, dishwasher [Diagram: SSTables 1 to 5 holding the row fragments from slide 22. Memory: a Bloom Filter and Index Sample per SSTable. Disk: the *-Index.db and *-Data.db files.]
  • 29. Read With Key Cache [Diagram: as slide 28, with a Key Cache shown alongside the Index Sample and Bloom Filter for each SSTable.]
  • 30. Read with Row Cache [Diagram: as slide 29, with a Row Cache added.]
  • 31. Performant Reads Design queries to read from a small number of SSTables. #Cassandra13
  • 32. Performant Reads Read a small number of named columns or a slice of columns. #Cassandra13
  • 33. Performant Reads Design data model to support current application requirements. #Cassandra13
  • 36. DEBUG Logging For One Class log4j.logger.org.apache.cassandra.thrift. CassandraServer=DEBUG #Cassandra13
  • 37. Reading Logs
      INFO [OptionalTasks:1] 2013-04-20 14:03:50,787 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='KS1', ColumnFamily='CF1') (estimated 403858136 bytes)
      INFO [OptionalTasks:1] 2013-04-20 14:03:50,787 ColumnFamilyStore.java (line 634) Enqueuing flush of Memtable-CF1@1333396270(145839277/403858136 serialized/live bytes, 1742365 ops)
      INFO [FlushWriter:42] 2013-04-20 14:03:50,788 Memtable.java (line 266) Writing Memtable-CF1@1333396270(145839277/403858136 serialized/live bytes, 1742365 ops)
  • 38. GC Logs cassandra-env.sh
      # GC logging options -- uncomment to enable
      # JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
      # JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
      # JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
      # JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
      # JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
      # JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"
      # JVM_OPTS="$JVM_OPTS -XX:PrintFLSStatistics=1"
      # JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc-`date +%s`.log"
  • 39. ParNew GC Starting {Heap before GC invocations=224115 (full 111): par new generation total 873856K, used 717289K ...) eden space 699136K, 100% used ...) from space 174720K, 10% used ...) to space 174720K, 0% used ...) #Cassandra13
  • 40. Tenuring Distribution 240217.053: [ParNew Desired survivor size 89456640 bytes, new threshold 4 (max 4) - age 1: 22575936 bytes, 22575936 total - age 2: 350616 bytes, 22926552 total - age 3: 4380888 bytes, 27307440 total - age 4: 1155104 bytes, 28462544 total #Cassandra13
  • 41. ParNew GC Finishing Heap after GC invocations=224116 (full 111): par new generation total 873856K, used 31291K ...) eden space 699136K, 0% used ...) from space 174720K, 17% used ...) to space 174720K, 0% used ...) #Cassandra13
  • 42. nodetool info Token : 0 Gossip active : true Load : 130.64 GB Generation No : 1369334297 Uptime (seconds) : 29438 Heap Memory (MB) : 3744.27 / 8025.38 Data Center : east Rack : rack1 Exceptions : 0 Key Cache : size 104857584 (bytes), capacity 104857584 (bytes), 25364985 hits, 34874180 requests, 0.734 recent hit rate, 14400 save period in seconds Row Cache : size 0 (bytes), capacity 0... #Cassandra13
  • 43. nodetool ring Note: Ownership information does not include topology, please specify a keyspace. Address DC Rack Status State Load Owns Token 10.1.64.11 east rack1 Up Normal 130.64 GB 12.50% 0 10.1.65.8 west rack1 Up Normal 88.79 GB 0.00% 1 10.1.64.78 east rack1 Up Normal 52.66 GB 12.50% 212...216 10.1.65.181 west rack1 Up Normal 65.99 GB 0.00% 212...217 10.1.66.8 east rack1 Up Normal 64.38 GB 12.50% 425...432 10.1.65.178 west rack1 Up Normal 77.94 GB 0.00% 425...433 10.1.64.201 east rack1 Up Normal 56.42 GB 12.50% 638...648 10.1.65.59 west rack1 Up Normal 74.5 GB 0.00% 638...649 10.1.64.235 east rack1 Up Normal 79.68 GB 12.50% 850...864 10.1.65.16 west rack1 Up Normal 62.05 GB 0.00% 850...865 10.1.66.227 east rack1 Up Normal 106.73 GB 12.50% 106...080 10.1.65.226 west rack1 Up Normal 79.26 GB 0.00% 106...081 10.1.66.247 east rack1 Up Normal 66.68 GB 12.50% 127...295 10.1.65.19 west rack1 Up Normal 102.45 GB 0.00% 127...297 10.1.66.141 east rack1 Up Normal 53.72 GB 12.50% 148...512 10.1.65.253 west rack1 Up Normal 54.25 GB 0.00% 148...513 #Cassandra13
  • 44. nodetool ring KS1 Address DC Rack Status State Load Effective-Ownership Token 10.1.64.11 east rack1 Up Normal 130.72 GB 12.50% 0 10.1.65.8 west rack1 Up Normal 88.81 GB 12.50% 1 10.1.64.78 east rack1 Up Normal 52.68 GB 12.50% 212...216 10.1.65.181 west rack1 Up Normal 66.01 GB 12.50% 212...217 10.1.66.8 east rack1 Up Normal 64.4 GB 12.50% 425...432 10.1.65.178 west rack1 Up Normal 77.96 GB 12.50% 425...433 10.1.64.201 east rack1 Up Normal 56.44 GB 12.50% 638...648 10.1.65.59 west rack1 Up Normal 74.57 GB 12.50% 638...649 10.1.64.235 east rack1 Up Normal 79.72 GB 12.50% 850...864 10.1.65.16 west rack1 Up Normal 62.12 GB 12.50% 850...865 10.1.66.227 east rack1 Up Normal 106.72 GB 12.50% 106...080 10.1.65.226 west rack1 Up Normal 79.28 GB 12.50% 106...081 10.1.66.247 east rack1 Up Normal 66.73 GB 12.50% 127...295 10.1.65.19 west rack1 Up Normal 102.47 GB 12.50% 127...297 10.1.66.141 east rack1 Up Normal 53.75 GB 12.50% 148...512 10.1.65.253 west rack1 Up Normal 54.24 GB 12.50% 148...513 #Cassandra13
  • 45. nodetool status $ nodetool status Datacenter: ams01 (Replication Factor 3) ================= Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns Host ID Rack UN 10.70.48.23 38.38 GB 256 19.0% 7c5fdfad-63c6-4f37-bb9f-a66271aa3423 RAC1 UN 10.70.6.78 58.13 GB 256 18.3% 94e7f48f-d902-4d4a-9b87-81ccd6aa9e65 RAC1 UN 10.70.47.126 53.89 GB 256 19.4% f36f1f8c-1956-4850-8040-b58273277d83 RAC1 Datacenter: wdc01 (Replication Factor 3) ================= Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns Host ID Rack UN 10.24.116.66 65.81 GB 256 22.1% f9dba004-8c3d-4670-94a0-d301a9b775a8 RAC1 UN 10.55.104.90 63.31 GB 256 21.2% 4746f1bd-85e1-4071-ae5e-9c5baac79469 RAC1 UN 10.55.104.27 62.71 GB 256 21.2% 1a55cfd4-bb30-4250-b868-a9ae13d81ae1 RAC1 #Cassandra13
  • 46. nodetool cfstats Keyspace: KS1 Column Family: CF1 SSTable count: 11 Space used (live): 32769179336 Space used (total): 32769179336 Number of Keys (estimate): 73728 Memtable Columns Count: 1069137 Memtable Data Size: 216442624 Memtable Switch Count: 3 Read Count: 95 Read Latency: NaN ms. Write Count: 1039417 Write Latency: 0.068 ms. Bloom Filter False Postives: 345 Bloom Filter False Ratio: 0.00000 Bloom Filter Space Used: 230096 Compacted row minimum size: 150 Compacted row maximum size: 322381140 Compacted row mean size: 2072156 #Cassandra13
  • 47. nodetool cfhistograms $nodetool cfhistograms KS1 CF1 Offset SSTables Write Latency Read Latency Row Size Column Count 1 67264 0 0 0 1331591 2 19512 0 0 0 4241686 3 35529 0 0 0 474784 ... 10 10299 1150 0 0 21768 12 5475 3569 0 0 3993135 14 1986 9098 0 0 1434778 17 258 30916 0 0 366895 20 0 52980 0 0 186524 24 0 104463 0 0 25439063 ... 179 0 93 1823 1597 1284167 215 0 84 3880 1231655 1147150 258 0 170 5164 209282 956487 #Cassandra13
  • 48. nodetool proxyhistograms $nodetool proxyhistograms Offset Read Latency Write Latency Range Latency 60 0 15 0 72 0 51 0 86 0 241 0 103 2 2003 0 124 9 5798 0 149 67 7348 0 179 222 6453 0 215 184 6071 0 258 134 5436 0 310 104 4936 0 372 89 4997 0 446 39 6383 0 535 76797 7518 0 642 9364748 96065 0 770 16406421 152663 0 924 7429538 97612 0 1109 6781835 176829 0 #Cassandra13
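Each histogram row pairs a bucket offset with the number of samples that fell in that bucket, so a percentile can be read off by accumulating counts. A minimal sketch, assuming the (offset, count) pairs are copied by hand from the output; `histogram_percentile` is a hypothetical helper, and the sample data is only the five largest Read Latency buckets shown on the slide, not a full distribution.

```python
def histogram_percentile(buckets, fraction):
    """Return the smallest offset whose cumulative count reaches `fraction` of all samples.

    `buckets` is an iterable of (offset, count) pairs as printed by the histogram.
    """
    ordered = sorted(buckets)
    total = sum(count for _, count in ordered)
    if total == 0:
        raise ValueError("histogram is empty")
    target = fraction * total
    running = 0
    for offset, count in ordered:
        running += count
        if running >= target:
            return offset
    return ordered[-1][0]


# Read Latency column, offsets 535 to 1109, from the proxyhistograms slide.
read_latency = [
    (535, 76797),
    (642, 9364748),
    (770, 16406421),
    (924, 7429538),
    (1109, 6781835),
]

median_offset = histogram_percentile(read_latency, 0.5)
print("median read latency bucket:", median_offset)
```

The same helper can be pointed at the SSTables column of cfhistograms to see how many SSTables a typical read touches.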
  • 51. JMX via JMXTERM $ java -jar jmxterm-1.0-alpha-4-uber.jar Welcome to JMX terminal. Type "help" for available commands. $>open localhost:7199 #Connection to localhost:7199 is opened $>bean org.apache.cassandra.db:type=StorageService #bean is set to org.apache.cassandra.db:type=StorageService $>info #mbean = org.apache.cassandra.db:type=StorageService #class name = org.apache.cassandra.service.StorageService # attributes %0 - AllDataFileLocations ([Ljava.lang.String;, r) %1 - CommitLogLocation (java.lang.String, r) %2 - CompactionThroughputMbPerSec (int, rw) ... # operations %1 - void bulkLoad(java.lang.String p1) %2 - void clearSnapshot(java.lang.String p1,[Ljava.lang.String; p2) %3 - void decommission() #Cassandra13
  • 52. JVM Heap Dump via JMAP jmap -dump:format=b,file=heap.bin <pid>
  • 53. JVM Heap Dump with YourKit
  • 56. Compaction Error
      ERROR [CompactionExecutor:36] 2013-04-29 07:50:49,060 AbstractCassandraDaemon.java (line 132) Exception in thread Thread[CompactionExecutor:36,1,main]
      java.lang.RuntimeException: Last written key DecoratedKey(138024912283272996716128964353306009224, 61386330356130622d616666362d376330612d666531662d373738616630636265396535) >= current key DecoratedKey(127065377405949402743383718901402082101, 64323962636163652d646561372d333039322d386166322d663064346132363963386131) writing into *-tmp-hf-7372-Data.db
        at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:134)
        at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:153)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:160)
        at org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:50)
        at org.apache.cassandra.db.compaction.CompactionManager$2.runMayThrow(CompactionManager.java:164)
  • 57. Cause Change in KeyValidator or bug in older versions. #Cassandra13
  • 60. Logs
      MessagingService.java (line 658) 173 READ messages dropped in last 5000ms
      StatusLogger.java (line 57) Pool Name Active Pending
      StatusLogger.java (line 72) ReadStage 32 284
      StatusLogger.java (line 72) RequestResponseStage 1 254
      StatusLogger.java (line 72) ReadRepairStage 0 0
  • 61. nodetool tpstats
      Message type        Dropped
      RANGE_SLICE         0
      READ_REPAIR         0
      BINARY              0
      READ                721
      MUTATION            1262
      REQUEST_RESPONSE    196
  • 62. Causes Excessive GC. Overloaded IO. Overloaded Node. Wide Reads / Large Batches. #Cassandra13
  • 64. nodetool info Token : 113427455640312814857969558651062452225 Gossip active : true Thrift active : true Load : 291.13 GB Generation No : 1368569510 Uptime (seconds) : 1022629 Heap Memory (MB) : 5213.01 / 8025.38 Data Center : 1 Rack : 20 Exceptions : 0 Key Cache : size 104857584 (bytes), capacity 104857584 (bytes), 13436862 hits, 16012159 requests, 0.907 recent hit rate, 14400 save period in seconds Row Cache : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds #Cassandra13
  • 65. nodetool cfstats Column Family: page_views SSTable count: 17 Space used (live): 289942843592 Space used (total): 289942843592 Number of Keys (estimate): 1071416832 Memtable Columns Count: 2041888 Memtable Data Size: 539015124 Memtable Switch Count: 83 Read Count: 267059 Read Latency: NaN ms. Write Count: 10516969 Write Latency: 0.054 ms. Pending Tasks: 0 Bloom Filter False Positives: 128586 Bloom Filter False Ratio: 0.00000 Bloom Filter Space Used: 802906184 Compacted row minimum size: 447 Compacted row maximum size: 3973 Compacted row mean size: 867 #Cassandra13
  • 66. nodetool cfhistograms KS1 CF1 Offset SSTables Write Latency Read Latency Row Size Column Count 1 178437 0 0 0 0 2 20042 0 0 0 0 3 15275 0 0 0 0 4 11632 0 0 0 0 5 4771 0 0 0 0 6 4942 0 0 0 0 7 5540 0 0 0 0 8 4967 0 0 0 0 10 10682 0 0 0 284155 12 8355 0 0 0 15372508 14 1961 0 0 0 137959096 17 322 3 0 0 625733930 20 61 253 0 0 252953547 24 53 15114 0 0 39109718 29 18 255730 0 0 0 35 1 1532619 0 0 0 ... #Cassandra13
  • 67. nodetool cfhistograms KS1 CF1 Offset SSTables Write Latency Read Latency Row Size Column Count 446 0 120 233 0 0 535 0 155 261 21361 0 642 0 127 284 19082720 0 770 0 88 218 498648801 0 924 0 86 2699 504702186 0 1109 0 22 3157 48714564 0 1331 0 18 2818 241091 0 1597 0 15 2155 2165 0 1916 0 19 2098 7 0 2299 0 10 1140 56 0 2759 0 10 1281 0 0 3311 0 6 1064 0 0 3973 0 4 676 3 0 ... #Cassandra13
  • 68. jmx-term $ java -jar jmxterm-1.0-alpha-4-uber.jar  Welcome to JMX terminal. Type "help" for available commands. $>open localhost:7199 #Connection to localhost:7199 is opened $>bean org.apache.cassandra.db:columnfamily=CF2,keyspace=KS2,type=ColumnFamilies #bean is set to org.apache.cassandra.db:columnfamily=CF2,keyspace=KS2,type=ColumnFamilies $>get BloomFilterFalseRatio #mbean = org.apache.cassandra.db:columnfamily=CF2,keyspace=KS2,type=ColumnFamilies: BloomFilterFalseRatio = 0.5693801541828607; #Cassandra13
  • 69. Back to cfstats Column Family: page_views Read Count: 270075 Bloom Filter False Positives: 131294 #Cassandra13
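As a rough gauge, the two cfstats counters above can be compared directly. This is only back-of-the-envelope arithmetic on the slide's numbers; it is not the same definition as the `BloomFilterFalseRatio` JMX attribute, and the variable names are illustrative.

```python
# Counters copied from the cfstats output for page_views.
read_count = 270075
bloom_filter_false_positives = 131294

# Average number of wasted SSTable lookups per read request.
false_positives_per_read = bloom_filter_false_positives / read_count
print(f"{false_positives_per_read:.2f} bloom filter false positives per read")
```

Roughly one false positive for every two reads means a large share of the disk seeks were for SSTables that did not contain the row.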
  • 70. Cause bloom_filter_fp_chance had been set to 0.1 to reduce memory requirements when storing 1+ Billion rows per Node. #Cassandra13
  • 71. Fix Changed read queries to select by column name to limit SSTables per query. Long term, migrate to Cassandra v1.2 for off heap Bloom Filters. #Cassandra13
  • 73. WARN WARN [ScheduledTasks:1] 2013-03-29 18:40:48,158 GCInspector.java (line 145) Heap is 0.9355130159566108 full. You may need to reduce memtable and/or cache sizes. INFO [ScheduledTasks:1] 2013-03-26 16:36:06,383 GCInspector.java (line 122) GC for ConcurrentMarkSweep: 207 ms for 1 collections, 10105891032 used; max is 13591642112 INFO [ScheduledTasks:1] 2013-03-28 22:18:17,113 GCInspector.java (line 122) GC for ParNew: 256 ms for 1 collections, 6504905688 used; max is 13591642112 #Cassandra13
  • 74. Serious GC Problems INFO [ScheduledTasks:1] 2013-04-30 23:21:11,959 GCInspector.java (line 122) GC for ParNew: 1115 ms for 1 collections, 9355247296 used; max is 12801015808 #Cassandra13
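GCInspector lines like the ones above are regular enough to scan with a script when looking for long pauses. A minimal sketch, assuming the log format shown on these slides; `parse_gc_line` and the regular expression are illustrative, not part of Cassandra.

```python
import re

# Matches the GCInspector message format shown on the slides.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<pause_ms>\d+) ms for (?P<collections>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)


def parse_gc_line(line):
    """Return collector, pause and heap usage for a GCInspector line, or None if it does not match."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    used = int(match.group("used"))
    max_heap = int(match.group("max"))
    return {
        "collector": match.group("collector"),
        "pause_ms": int(match.group("pause_ms")),
        "heap_fraction": used / max_heap,
    }


line = (
    "INFO [ScheduledTasks:1] 2013-04-30 23:21:11,959 GCInspector.java (line 122) "
    "GC for ParNew: 1115 ms for 1 collections, 9355247296 used; max is 12801015808"
)
gc_event = parse_gc_line(line)
print(gc_event)
```

Filtering on `pause_ms` for ParNew events is a quick way to find the pauses that are long enough to make a node look dead to its peers.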
  • 75. Flapping Node INFO [GossipTasks:1] 2013-03-28 17:42:07,944 Gossiper.java (line 830) InetAddress /10.1.20.144 is now dead. INFO [GossipStage:1] 2013-03-28 17:42:54,740 Gossiper.java (line 816) InetAddress /10.1.20.144 is now UP INFO [GossipTasks:1] 2013-03-28 17:46:00,585 Gossiper.java (line 830) InetAddress /10.1.20.144 is now dead. INFO [GossipStage:1] 2013-03-28 17:46:13,855 Gossiper.java (line 816) InetAddress /10.1.20.144 is now UP INFO [GossipStage:1] 2013-03-28 17:48:48,966 Gossiper.java (line 830) InetAddress /10.1.20.144 is now dead. #Cassandra13
  • 76. “GC Problems are the result of workload and configuration.” Aaron Morton, Just Now. #Cassandra13
  • 77. Workload Correlation? Look for wide rows, large writes, wide reads, unbounded multi row reads or writes.
  • 78. Compaction Correlation? Slow down Compaction to improve stability.
      concurrent_compactors: 2
      compaction_throughput_mb_per_sec: 8
      in_memory_compaction_limit_in_mb: 32
      (Monitor and reverse when resolved.)
  • 79. GC Logging Insights Slow down rate of tenuring and enable full GC logging.
      HEAP_NEWSIZE="1200M"
      JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4"
      JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=4"
  • 80. GC’ing Objects in ParNew {Heap before GC invocations=7937 (full 205): par new generation total 1024000K, used 830755K ...) eden space 819200K, 100% used ...) from space 204800K, 5% used ...) to space 204800K, 0% used ...) Desired survivor size 104857600 bytes, new threshold 4 (max 4) - age 1: 8090240 bytes, 8090240 total - age 2: 565016 bytes, 8655256 total - age 3: 330152 bytes, 8985408 total - age 4: 657840 bytes, 9643248 total #Cassandra13
  • 81. GC’ing Objects in ParNew {Heap before GC invocations=7938 (full 205): par new generation total 1024000K, used 835015K ...) eden space 819200K, 100% used ...) from space 204800K, 7% used ...) to space 204800K, 0% used ...) Desired survivor size 104857600 bytes, new threshold 4 (max 4) - age 1: 1315072 bytes, 1315072 total - age 2: 541072 bytes, 1856144 total - age 3: 499432 bytes, 2355576 total - age 4: 316808 bytes, 2672384 total #Cassandra13
  • 82. Cause Nodes had wide rows & 1.3+ Billion rows and 3+GB of Bloom Filters. (Using older bloom_filter_fp_chance of 0.000744.) #Cassandra13
  • 83. Fix Increased FP chance to 0.1 on one CF and 0.01 on others. (One CF reduced from 770MB to 170MB of Bloom Filters.)
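To see why raising the false positive chance shrinks the filters, the textbook sizing formula for an ideal Bloom filter is enough. This is a sketch under that assumption; Cassandra's own sizing uses its own bucket and hash-count choices, so real numbers (such as the 770MB to 170MB on the slide) will differ. The function names are illustrative.

```python
import math


def bits_per_key(fp_chance):
    """Bits per key for an ideal Bloom filter with the given false positive chance."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


def filter_size_mb(num_keys, fp_chance):
    """Approximate filter size in MB for num_keys keys (textbook estimate only)."""
    return num_keys * bits_per_key(fp_chance) / 8 / 1024 / 1024


for fp_chance in (0.000744, 0.01, 0.1):
    print(f"fp_chance={fp_chance}: {bits_per_key(fp_chance):.1f} bits per key")

# 1.3 billion rows at the older default, as on the Cause slide.
print(f"{filter_size_mb(1_300_000_000, 0.000744):.0f} MB at 0.000744")
```

At 0.000744 the ideal filter needs about 15 bits per key, which for 1.3 billion rows is in the same ballpark as the multi-gigabyte figure on the Cause slide; at 0.1 it needs under 5 bits per key.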
  • 84. Fix Increased index_interval from 128 to 512. (Increased key_cache_size_in_mb to 200.) #Cassandra13
  • 86. Anatomy of a Partition. (From a 1.0 cluster) #Cassandra13
  • 87. Node 23 Was Up cassandra23# bin/nodetool -h localhost info Token : 28356863910078205288614550619314017621 Gossip active : true Load : 275.44 GB Generation No : 1762556151 Uptime (seconds) : 67548 Heap Memory (MB) : 2926.44 / 8032.00 Data Center : DC1 Rack : RAC_unknown Exceptions : 0 #Cassandra13
  • 88. Other Nodes Saw It Down cassandra20# nodetool -h localhost ring Address DC Rack Status State Load 10.37.114.8 DC1 RAC20 Up Normal 285.86 GB 10.29.60.10 DC2 RAC23 Down Normal 277.86 GB 10.6.130.70 DC1 RAC21 Up Normal 244.9 GB 10.29.60.14 DC2 RAC24 Up Normal 296.85 GB 10.37.114.10 DC1 RAC22 Up Normal 255.81 GB 10.29.60.12 DC2 RAC25 Up Normal 316.88 GB #Cassandra13
  • 89. And Node 23 Saw Them Up cassandra23# nodetool -h localhost ring Address DC Rack Status State Load 10.37.114.8 DC1 RAC20 Up Normal 285.86 GB 10.29.60.10 DC2 RAC23 Up Normal 277.86 GB 10.6.130.70 DC1 RAC21 Up Normal 244.9 GB 10.29.60.14 DC2 RAC24 Up Normal 296.85 GB 10.37.114.10 DC1 RAC22 Up Normal 255.81 GB 10.29.60.12 DC2 RAC25 Up Normal 316.88 GB
  • 90. Still Available Node 23 could serve requests at LOCAL_QUORUM, QUORUM and ALL Consistency. Other nodes could serve requests at LOCAL_QUORUM and QUORUM but not ALL Consistency.
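The availability claim above follows from counting the replicas a coordinator believes are up. A minimal sketch, assuming a replication factor of 3 (the slide does not state it) and ignoring the per-DC accounting that LOCAL_QUORUM and EACH_QUORUM add; `quorum` and `achievable_levels` are illustrative names.

```python
def quorum(replication_factor):
    """Number of replicas that make up a quorum."""
    return replication_factor // 2 + 1


def achievable_levels(replication_factor, live_replicas):
    """Consistency levels a coordinator can satisfy given the replicas it sees as up."""
    levels = []
    if live_replicas >= 1:
        levels.append("ONE")
    if live_replicas >= quorum(replication_factor):
        levels.append("QUORUM")
    if live_replicas >= replication_factor:
        levels.append("ALL")
    return levels


# Node 23 sees every replica up; the other nodes see node 23 as down.
print("node 23 as coordinator:", achievable_levels(3, 3))
print("other nodes as coordinator:", achievable_levels(3, 2))
```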
  • 91. Relax The application was up. #Cassandra13
  • 92. Gossip? cassandra20# bin/nodetool -h localhost gossipinfo ... /10.29.60.10 LOAD:2.98347080902E11 STATUS:NORMAL,28356863910078205288614550619314017621 RPC_ADDRESS:10.29.60.10 SCHEMA:fe933880-19bd-11e1-0000-5ff37d368cb6 RELEASE_VERSION:1.0.5 #Cassandra13
  • 93. Gossip Logs On Node 20? log4j.logger.org.apache.cassandra.gms.Gossiper=TRACE TRACE [GossipStage:1] 2011-12-13 00:58:49,636 Gossiper.java (line 647) local heartbeat version 526912 greater than 7951 for /10.29.60.10 #Cassandra13
  • 94. More Gossip Logs On Node 20?
      log4j.logger.org.apache.cassandra.gms.GossipDigestSynVerbHandler=TRACE
      log4j.logger.org.apache.cassandra.gms.FailureDetector=TRACE
      TRACE [GossipStage:1] 2011-12-13 02:14:37,033 GossipDigestSynVerbHandler.java (line 46) Received a GossipDigestSynMessage from /10.29.60.10
      TRACE [GossipStage:1] 2011-12-13 02:14:37,033 GossipDigestSynVerbHandler.java (line 76) Gossip syn digests are : /10.29.60.10:1762556151:12552 /10.29.60.14:1323732392:10208 /10.37.114.8:1323731527:11082 /10.37.114.10:1323736718:5830 /10.6.130.70:1323732220:10379 /10.29.60.12:1323733099:9493
      //Expected call to the FailureDetector
      TRACE [GossipStage:1] 2011-12-13 02:14:37,033 GossipDigestSynVerbHandler.java (line 90) Sending a GossipDigestAckMessage to /10.29.60.10
  • 95. Cause. Generation is initialised at bootstrap to seconds past the Epoch. 1762556151 is Fri, 07 Nov 2025 22:55:51 GMT. cassandra23# bin/nodetool -h localhost info Generation No : 1762556151 TRACE [GossipStage:1] 2011-12-13 02:14:37,033 GossipDigestSynVerbHandler.java (line 76) Gossip syn digests are : /10.29.60.10:1762556151:12552 / #Cassandra13
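Because the generation is seconds past the Epoch, converting the values from the gossip digests to dates makes the outlier obvious. A small sketch using only the Python standard library; `generation_to_utc` is an illustrative helper name.

```python
from datetime import datetime, timezone


def generation_to_utc(generation):
    """Interpret a gossip generation number as seconds past the Unix Epoch."""
    return datetime.fromtimestamp(generation, tz=timezone.utc)


# Node 23 (10.29.60.10) versus one of its peers (10.29.60.14) from the digest list.
node_23 = generation_to_utc(1762556151)
peer = generation_to_utc(1323732392)
print("node 23:", node_23.isoformat())
print("peer   :", peer.isoformat())
```

The peer's generation falls in December 2011, matching the log timestamps, while node 23's falls in November 2025.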
  • 96. Fix. [default@system] get LocationInfo['L']; => (column=ClusterName, value=737069, timestamp=1320437246450000) => (column=Generation, value=690e78f6, timestamp=1762556150811000) #Cassandra13
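The stored Generation column can be checked against the gossiped value. A sketch, assuming the value shown by the CLI is the hex encoding of a 32-bit integer and that the column timestamp is in microseconds.

```python
stored_hex = "690e78f6"              # Generation column value from LocationInfo
column_timestamp = 1762556150811000  # timestamp on that column, assumed microseconds

stored_generation = int(stored_hex, 16)
written_at_seconds = column_timestamp // 1_000_000

print("stored generation:", stored_generation)
print("column written at:", written_at_seconds)
```

Both values land on the same second in 2025, one second before the generation seen in gossip, which suggests the system clock was wrong when the node saved its generation.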
  • 98. Maintenance Expand to Multi DC #Cassandra13
  • 99. Expand to Multi DC Update Snitch Update Replication Strategy Add Nodes Update Replication Factor Rebuild #Cassandra13
  • 100. DC Aware Snitch? SimpleSnitch puts all nodes in rack1 and datacenter1. #Cassandra13
  • 103. Changing the Snitch Do not change the DC or Rack for an existing node. (Cassandra will not be able to find your data.)
  • 104. Moving to the GossipingPropertyFileSnitch Update cassandra-topology.properties on existing nodes with existing DC/Rack settings for all existing nodes. Set default to new DC.
  • 105. Moving to the GossipingPropertyFileSnitch Update cassandra-rackdc.properties on existing nodes with existing DC/Rack for the node.
  • 106. Moving to the GossipingPropertyFileSnitch Use a rolling restart to upgrade existing nodes to GossipingPropertyFileSnitch #Cassandra13
  • 107. Expand to Multi DC Update Snitch Update Replication Strategy Add Nodes Update Replication Factor Rebuild #Cassandra13
  • 108. Got NTS? Must use NetworkTopologyStrategy for Multi DC deployments.
  • 109. SimpleStrategy Order Token Ranges. Start with range that contains Row Key. Count to RF. #Cassandra13
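The three steps above can be sketched in a few lines. This is an illustration of the slide's description, not Cassandra's implementation; it assumes one token per node (no virtual nodes), that a node owns the range ending at its token, and the ring data is made up.

```python
from bisect import bisect_left


def simple_strategy_replicas(ring, key_token, replication_factor):
    """Pick replicas by walking the ring clockwise from the range that holds the key.

    `ring` is a list of (token, node) pairs with one token per node.
    """
    ordered = sorted(ring)
    tokens = [token for token, _ in ordered]
    # The owning range ends at the first token >= key_token, wrapping past the last token.
    start = bisect_left(tokens, key_token) % len(ordered)
    count = min(replication_factor, len(ordered))
    return [ordered[(start + i) % len(ordered)][1] for i in range(count)]


# Hypothetical four node ring.
example_ring = [(0, "node_a"), (25, "node_b"), (50, "node_c"), (75, "node_d")]
print(simple_strategy_replicas(example_ring, key_token=30, replication_factor=3))
```

Nothing here looks at the data center or rack, which is why this strategy is not suitable once a second DC is added.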
  • 111. NetworkTopologyStrategy Order Token Ranges in the DC. Start with range that contains the Row Key. Add first unselected Token Range from each Rack. Repeat until RF selected. #Cassandra13
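The rack-aware selection can be sketched the same way for a single data center. This follows the slide's wording only and is not Cassandra's actual code; it assumes one token per node, and the ring and rack names are invented.

```python
from bisect import bisect_left


def nts_replicas_for_dc(dc_ring, key_token, replication_factor):
    """Select replicas inside one DC, taking one node per rack on each pass of the ring.

    `dc_ring` is a list of (token, node, rack) tuples for the nodes in that DC.
    """
    ordered = sorted(dc_ring)
    tokens = [token for token, _, _ in ordered]
    start = bisect_left(tokens, key_token) % len(ordered)
    # Ring order starting at the range that contains the row key.
    walk = [ordered[(start + i) % len(ordered)] for i in range(len(ordered))]

    target = min(replication_factor, len(ordered))
    selected = []
    while len(selected) < target:
        racks_this_pass = set()
        for _, node, rack in walk:
            if node in selected or rack in racks_this_pass:
                continue
            selected.append(node)
            racks_this_pass.add(rack)
            if len(selected) == target:
                break
    return selected


# Hypothetical DC with two racks.
two_racks = [(0, "a", "rack1"), (25, "b", "rack1"), (50, "c", "rack2"), (75, "d", "rack2")]
print(nts_replicas_for_dc(two_racks, key_token=10, replication_factor=3))

# With a single rack the result is the same order SimpleStrategy would give.
one_rack = [(0, "a", "rack1"), (25, "b", "rack1"), (50, "c", "rack1")]
print(nts_replicas_for_dc(one_rack, key_token=10, replication_factor=2))
```

With more than one rack, the replicas are no longer simply the next nodes on the ring.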
  • 113. NetworkTopologyStrategy & 1 Rack "foo" Rack 1 #Cassandra13
  • 114. Changing the Replication Strategy Be careful if the existing configuration has multiple Racks. (Cassandra may not be able to find your data.)
  • 115. Changing the Replication Strategy Update Keyspace configuration to use NetworkTopologyStrategy with datacenter1:3 and new_dc:0. #Cassandra13
  • 116. Preparing The Client Disable auto node discovery or use DC aware methods. Use LOCAL_QUORUM or EACH_QUORUM.
  • 117. Expand to Multi DC Update Snitch Update Replication Strategy Add Nodes Update Replication Factor Rebuild #Cassandra13
  • 118. Configuring New Nodes Add auto_bootstrap: false to cassandra.yaml. Use GossipingPropertyFileSnitch. Three Seeds from each DC. (Use cluster_name as a safety.) #Cassandra13
  • 119. Configuring New Nodes Update cassandra-rackdc.properties on new nodes with new DC/Rack for the node. (Ignore cassandra-topology.properties)
  • 120. Start The New Nodes New Nodes in the Ring in the new DC without data or traffic.
  • 121. Expand to Multi DC Update Snitch Update Replication Strategy Add Nodes Update Replication Factor Rebuild #Cassandra13
  • 122. Change the Replication Factor Update Keyspace configuration to use NetworkTopologyStrategy with datacenter1:3 and new_dc:3.
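On clusters that speak CQL 3, this keyspace update can be expressed as an `ALTER KEYSPACE` statement. The sketch below only builds the statement text; the keyspace name `KS1` is a placeholder, `alter_replication_cql` is an illustrative helper, and older clusters would make the equivalent change through cassandra-cli instead.

```python
def alter_replication_cql(keyspace, dc_replication):
    """Build a CQL 3 statement that sets per-DC replication with NetworkTopologyStrategy."""
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in dc_replication.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


statement = alter_replication_cql("KS1", {"datacenter1": 3, "new_dc": 3})
print(statement)
```

The data center names must match what the snitch reports for each node, otherwise no replicas will be placed in that DC.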
  • 123. Change the Replication Factor New DC nodes will start receiving writes from old DC coordinators. #Cassandra13
  • 124. Expand to Multi DC Update Snitch Update Replication Strategy Add Nodes Update Replication Factor Rebuild #Cassandra13
  • 125. Y U No Bootstrap? DC 1 DC 2 #Cassandra13
  • 126. nodetool rebuild DC1 DC 1 DC 2 #Cassandra13
  • 127. Rebuild Complete New Nodes now performing Strong Consistency reads. (If EACH_QUORUM used for writes.)
  • 128. Summary Relax. Understand the Platform and the Tools. Always maintain Availability. #Cassandra13
  • 130. Aaron Morton @aaronmorton www.thelastpickle.com Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License