CASSANDRA @ INSTAGRAM 2016
Dikang Gu
Software Engineer @ Facebook
ABOUT ME
• @dikanggu
• Software Engineer
• Instagram core infra, 2014 — present
• Facebook Data Infra, 2012 — 2014
AGENDA
1 Overview
2 Improvements
3 Challenges
OVERVIEW
OVERVIEW
Cluster Deployment
• Cassandra Nodes: 1,000+
• Data Size: 100s of TeraBytes
• Ops/sec: in the millions
• Largest Cluster: 100+
• Regions: multiple
OVERVIEW
• Client: Python/C++/Java/PHP
• Protocol: mostly thrift, some CQL
• Versions: 2.0.x - 2.2.x
• Use LCS for most tables.
TEAM
USE CASE 1
Feed
PUSH
When posting, we push the media information to the followers' feed store.
When reading, we fetch the feed ids from the viewer's feed store.
USE CASE 1
Feed
• Write QPS: 1M+
• Avg/P99 Read Latency: 20ms/100ms
• Data Model:

user_id —> List(media_id)
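
The feed mapping above, expressed as a minimal CQL sketch for illustration only (the deck notes most access goes through Thrift, and the table and column names here are hypothetical):

    -- Hypothetical CQL shape of the feed store: one partition per follower,
    -- clustered by media_id so a viewer's feed can be read as a single slice.
    CREATE TABLE user_feed (
        user_id  bigint,
        media_id bigint,
        PRIMARY KEY (user_id, media_id)
    ) WITH CLUSTERING ORDER BY (media_id DESC)
      AND compaction = {'class': 'LeveledCompactionStrategy'};

    -- Push on write: fan the new media id out to each follower's partition.
    INSERT INTO user_feed (user_id, media_id) VALUES (?, ?);

    -- On read: fetch the feed ids for the viewer.
    SELECT media_id FROM user_feed WHERE user_id = ? LIMIT 100;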
USE CASE 2
Metadata store
Applications use C* as a key-value store: they store a list of blobs associated with a key, and do point or range queries at read time.
USE CASE 2
Metadata store
• Read/Write QPS: 100K+
• Avg read size: 50KB
• Avg/P99 Read Latency: 7ms/50ms
• Data Model:

user_id —> List(Blob)
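
A minimal CQL sketch of that shape, assuming a sortable per-blob item id (names are illustrative, not the production schema):

    -- Hypothetical CQL shape of the metadata store: a list of blobs per key,
    -- supporting both point and range reads within the same partition.
    CREATE TABLE metadata (
        user_id bigint,
        item_id timeuuid,
        payload blob,
        PRIMARY KEY (user_id, item_id)
    ) WITH compaction = {'class': 'LeveledCompactionStrategy'};

    -- Point query: a single blob under the key.
    SELECT payload FROM metadata WHERE user_id = ? AND item_id = ?;

    -- Range query: a slice of blobs under the key.
    SELECT payload FROM metadata WHERE user_id = ? AND item_id >= ? AND item_id < ?;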
USE CASE 3
Counter
Applications issue bump/get counter operations for each user request.
USE CASE 3
Counter
• Read/Write QPS: 50K+
• Avg/P99 Read Latency: 3ms/50ms
• C* 2.2
• Data Model:

some_id —> Counter
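
As a sketch, the same model in CQL counter-table form (C* 2.2 counters; the table and column names are made up for illustration):

    -- Hypothetical counter table: every non-key column must be a counter.
    CREATE TABLE counters (
        some_id bigint PRIMARY KEY,
        value   counter
    );

    -- Bump: counters are only modified through UPDATE.
    UPDATE counters SET value = value + 1 WHERE some_id = ?;

    -- Get.
    SELECT value FROM counters WHERE some_id = ?;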
IMPROVEMENTS
1. PROXY NODES
PROXY NODE
Problem
• Thrift client, NOT token aware
• Data node coordinates the requests
• High latency and timeouts when a data node is hot.
PROXY NODE
Solution
• join_ring: false
• act as coordinator
• do not store data locally
• client only talks to proxy node
• 2X latency drop
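
For reference, a coordinator-only node is typically brought up with the join_ring startup property disabled; a minimal sketch (the cassandra-env.sh placement is an assumption, and operational details vary by version):

    # cassandra-env.sh on the proxy/coordinator node (sketch):
    # the node gossips and coordinates client requests, but never joins the
    # ring, owns no token ranges, and stores no data locally.
    JVM_OPTS="$JVM_OPTS -Dcassandra.join_ring=false"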
(CASSANDRA-9258)
2. PENDING RANGES
PENDING RANGES
Problem
• CPU usage +30% when bootstrapping new nodes.
• Client request latency jumps and timeouts
• Multimap<Range<Token>, InetAddress> PendingRange
• Inefficient O(n) pendingRanges lookup per request
PENDING RANGES
Solution
• CASSANDRA-9258
• Use two NavigableMaps to implement the pending ranges
• We can expand or shrink the cluster without affecting requests
• Thanks to Branimir Lambov for patch review and feedback
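
To illustrate the idea (a standalone sketch, not the actual Cassandra patch): keying pending (start, end] token ranges by their end token in a NavigableMap turns the per-request lookup into an O(log n) ceiling search. The sketch assumes non-wrapping, non-overlapping ranges; the real change also has to cover wrap-around ranges, hence the second map.

    import java.util.Map;
    import java.util.NavigableMap;
    import java.util.TreeMap;

    // Sketch of the CASSANDRA-9258 idea: index pending (start, end] token
    // ranges by end token, so the endpoint owning a token is found with one
    // ceilingEntry() call instead of scanning every pending range.
    public class PendingRangeIndex {
        static final class PendingRange {
            final long start, end;      // (start, end]
            final String endpoint;
            PendingRange(long start, long end, String endpoint) {
                this.start = start; this.end = end; this.endpoint = endpoint;
            }
        }

        private final NavigableMap<Long, PendingRange> byEnd = new TreeMap<>();

        public void add(long start, long end, String endpoint) {
            byEnd.put(end, new PendingRange(start, end, endpoint));
        }

        // Returns the pending endpoint whose range contains the token, or null.
        public String pendingEndpointFor(long token) {
            Map.Entry<Long, PendingRange> e = byEnd.ceilingEntry(token);
            if (e == null) return null;
            PendingRange r = e.getValue();
            // A (start, end] range contains the token iff start < token <= end.
            return (r.start < token && token <= r.end) ? r.endpoint : null;
        }

        public static void main(String[] args) {
            PendingRangeIndex idx = new PendingRangeIndex();
            idx.add(0, 100, "10.0.0.1");
            idx.add(100, 200, "10.0.0.2");
            System.out.println(idx.pendingEndpointFor(150)); // 10.0.0.2
            System.out.println(idx.pendingEndpointFor(250)); // null
        }
    }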
(CASSANDRA-6908)
3. DYNAMIC SNITCH
DYNAMIC SNITCH
• High read latency during peak time.
• Unnecessary cross region requests.
• dynamic_snitch_badness_threshold: 50
• 10X P99 latency drop
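
A sketch of the corresponding cassandra.yaml setting: raising the badness threshold keeps reads pinned to the preferred (local) replicas unless they score dramatically worse, which cuts the unnecessary cross-region requests.

    # cassandra.yaml (sketch)
    # Default is 0.1; a much larger value effectively stops the dynamic
    # snitch from rerouting reads away from the closest replicas.
    dynamic_snitch_badness_threshold: 50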
4. COMPACTION
COMPACTION IMPROVEMENTS
• Track the write amplification. (CASSANDRA-11420)
• Optimize the overlapping lookup. (CASSANDRA-11571)
• Optimize the isEOF() checking. (CASSANDRA-12013)
• Avoid searching for column index. (CASSANDRA-11450)
• Persist last compacted key per level. (CASSANDRA-6216)
• Compact tables before making available in L0. (CASSANDRA-10862)
5. BIG HEAP SIZE
BIG HEAP SIZE
• 64G max heap size
• 16G new gen size
• -XX:MaxTenuringThreshold=6
• Young GC every 10 seconds
• Avoid full GC
• 2X P99 latency drop
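
In cassandra-env.sh terms, the slide's settings look roughly like the following sketch (only the values called out above; the remaining GC flags depend on the collector in use):

    # cassandra-env.sh (sketch)
    MAX_HEAP_SIZE="64G"
    HEAP_NEWSIZE="16G"
    # Let objects age through six young collections before promotion, so
    # short-lived request state dies in the new gen instead of filling the
    # old gen and triggering full GCs.
    JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=6"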
(CASSANDRA-10406)
6. NODETOOL REBUILD RANGE
NODETOOL REBUILD
• rebuild may fail for nodes with TBs of data
• CASSANDRA-10406
• support rebuilding the failed token ranges
• Thanks to Yuki Morishita for reviewing
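
A usage sketch of the post-CASSANDRA-10406 interface; the flag names and range syntax below are quoted from memory of the patched nodetool help and should be checked against `nodetool help rebuild` on your version, and the keyspace, ranges, and DC name are purely illustrative:

    # Re-stream only the token ranges that failed, instead of restarting the
    # whole multi-TB rebuild from scratch.
    nodetool rebuild -ks my_keyspace -ts "(1000,2000],(5000,6000]" -- us-east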
CHALLENGES
PERFORMANCE
P99 Read Latency
High P99 read latency on the C* nodes, and even higher on the client side.
PERFORMANCE
Compaction has difficulty catching up,
which impacts read latency.
PERFORMANCE
Compaction uses too much CPU (40%+)
PERFORMANCE
Tombstone
SCALABILITY
Gossip: nodes see an inconsistent ring
(CASSANDRA-11709, CASSANDRA-11740)
FEATURES
Counter: problems with repair
(CASSANDRA-11432, CASSANDRA-10862)
SSTables in each level: [966/4, 20/10, 152/100, 33, 0, 0, 0, 0, 0]
CLIENT
Access C* from different languages
(Diagram: multiple services in front of the Cassandra cluster)
OPERATIONS
Cluster expansion takes a long time:
15 days to bootstrap 30 nodes
RECAP
Improvements
• Proxy Node
• Pending Ranges
• Dynamic Snitch
• Compaction
• Big heap size
• Nodetool rebuild token range
Challenges
• P99 Read latency
• Compaction
• Tombstone
• Gossip
• Counter
• Client
• Cluster expansion
QUESTIONS?
Cassandra at Instagram 2016 (Dikang Gu, Facebook) | Cassandra Summit 2016

At Instagram, our mission is to capture and share the world's moments. Our app is used by over 400M people monthly, which creates a lot of challenging data needs. We use Cassandra heavily, as a general key-value store. In this presentation, I will talk about how we use Cassandra to serve our critical use cases; the improvements/patches we made to make sure Cassandra can meet our low-latency, high-scalability requirements; and some pain points we have.

About the Speaker
Dikang Gu, Software Engineer, Facebook

I'm a software engineer on the Instagram core infra team, working on scaling Instagram's infrastructure, especially on building a generic key-value store based on Cassandra. Prior to this, I worked on the development of HDFS at Facebook. I received my master's degree in Computer Science from Shanghai Jiao Tong University in China.
