Cassandra at Lithium
Paul Cichonski, Senior Software Engineer
@paulcichonski

Lithium?
• Helping companies build social communities for their customers
• Founded in 2001
• ~300 customers
• ~84 million users
• ~5 million unique logins in the past 20 days

Use Case: Notification Service
1. Stores subscriptions
2. Processes community events
3. Generates notifications when events match against subscriptions
4. Builds user activity feed out of notifications

Notification Service → System View

The Cluster (v1.2.6)
• 4 nodes, each node:
  – CentOS 6.4
  – 8 cores, 2 TB for the commit log, 3x 512 GB SSD for data
• Average writes/s: 100-150, peak: 2,000
• Average reads/s: 100, peak: 1,500
• Astyanax on the client side

Data Model

Data Model: Subscriptions Fulfillment
[schema diagram, annotated: "identifies target of subscription" and "identifies entity that is subscribed"]

standard_subscription_index row, stored as a wide row:
  row key: 66edfdb7-6ff7-458c-94a8-421627c1b6f5:message:13
  columns: user:2:creationtimestamp = 1390939665, user:53:creationtimestamp = 1390939670, user:88:creationtimestamp = 1390939660
maps to (cqlsh): [cqlsh output shown on the slide]

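For concreteness, here is a minimal sketch (not from the deck) of how a write into this wide row might look with Astyanax, which the deck says is used on the client side. The CF name and column layout come from the slide; the keyspace wiring, class name, and plain string serializers are assumptions (the real CF likely uses composite column names).

    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.MutationBatch;
    import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.serializers.StringSerializer;

    public class SubscriptionIndexWriter {
        // Wide row keyed by the subscription target; one column per subscribed entity.
        private static final ColumnFamily<String, String> CF_SUBSCRIPTION_INDEX =
                new ColumnFamily<String, String>("standard_subscription_index",
                        StringSerializer.get(), StringSerializer.get());

        private final Keyspace keyspace;

        public SubscriptionIndexWriter(Keyspace keyspace) {
            this.keyspace = keyspace;
        }

        // Record that subscriber (e.g. "user:2") is subscribed to target
        // (e.g. the "...:message:13" row key above), with the creation time as the column value.
        public void recordSubscription(String targetRowKey, String subscriber, long creationTimestamp)
                throws ConnectionException {
            MutationBatch batch = keyspace.prepareMutationBatch();
            batch.withRow(CF_SUBSCRIPTION_INDEX, targetRowKey)
                 .putColumn(subscriber + ":creationtimestamp", String.valueOf(creationTimestamp), null);
            batch.execute();
        }
    }

Fulfillment then becomes a single-row read: fetch the row for the event's target, and every column is a subscriber to notify.
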
Data Model: Subscription Display (time series)

subscriptions_for_entity_by_time row, stored as a wide row:
  row key: 66edfdb7-6ff7-458c-94a8-421627c1b6f5:user:2:0
  columns: 1390939670:label:testlabel, 1390939665:board:53, 1390939660:message:13
maps to (cqlsh): [cqlsh output shown on the slide]

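A matching read sketch for this display CF: page the newest subscriptions for an entity by slicing its time-ordered row in reverse. The CF name is from the slide; the page size, serializers, and helper names are assumptions.

    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.model.ColumnList;
    import com.netflix.astyanax.serializers.StringSerializer;
    import com.netflix.astyanax.util.RangeBuilder;

    public class SubscriptionTimeline {
        // Wide row keyed by the subscribed entity; column names start with the creation timestamp,
        // so a reversed slice returns the most recent subscriptions first.
        private static final ColumnFamily<String, String> CF_SUBS_BY_TIME =
                new ColumnFamily<String, String>("subscriptions_for_entity_by_time",
                        StringSerializer.get(), StringSerializer.get());

        private final Keyspace keyspace;

        public SubscriptionTimeline(Keyspace keyspace) {
            this.keyspace = keyspace;
        }

        public ColumnList<String> newestSubscriptions(String entityRowKey, int pageSize)
                throws ConnectionException {
            return keyspace.prepareQuery(CF_SUBS_BY_TIME)
                    .getKey(entityRowKey)
                    .withColumnRange(new RangeBuilder().setReversed(true).setLimit(pageSize).build())
                    .execute()
                    .getResult();
        }
    }
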
Data Model: Subscription Display (content browsing)

subscriptions_for_entity_by_type row, stored as a wide row:
  row key: 66edfdb7-6ff7-458c-94a8-421627c1b6f5:user:2
  columns: message:13:creationtimestamp = 1390939660, board:53:creationtimestamp = 1390939665, label:testlabel:creationtimestamp = 1390939670
maps to (cqlsh): [cqlsh output shown on the slide]

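The by-type row serves content browsing ("what messages, boards, or labels is this user subscribed to?"), which maps to a column slice restricted to one target-type prefix. A rough sketch; the start/end strings assume a plain string comparator, whereas a composite comparator would slice on its first component instead.

    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.model.ColumnList;
    import com.netflix.astyanax.serializers.StringSerializer;

    public class SubscriptionBrowser {
        private static final ColumnFamily<String, String> CF_SUBS_BY_TYPE =
                new ColumnFamily<String, String>("subscriptions_for_entity_by_type",
                        StringSerializer.get(), StringSerializer.get());

        private final Keyspace keyspace;

        public SubscriptionBrowser(Keyspace keyspace) {
            this.keyspace = keyspace;
        }

        // Slice only the columns for one target type, e.g. targetType = "message" returns the
        // message:* columns for this entity. ";" sorts just after ":" so it closes the prefix range.
        public ColumnList<String> subscriptionsOfType(String entityRowKey, String targetType)
                throws ConnectionException {
            return keyspace.prepareQuery(CF_SUBS_BY_TYPE)
                    .getKey(entityRowKey)
                    .withColumnRange(targetType + ":", targetType + ";", false, 1000)
                    .execute()
                    .getResult();
        }
    }
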
Data Model: Activity Feed (fan-out writes)
[diagram, annotated: "JSON blob representing activity"]

activity_for_entity row, stored as a wide row:
  row key: 66edfdb7-6ff7-458c-94a8-421627c1b6f5:user:2:0
  columns:
    31aac580-8550-11e3-ad74-000c29351b9d:moderationAction:event_summary = {moderation_json}
    f4efd590-82ca-11e3-ad74-000c29351b9d:badge:event_summary = {badge_json}
    1571b680-7254-11e3-8d70-000c29351b9d:kudos:event_summary = {kudos_json}
maps to (cqlsh): [cqlsh output shown on the slide]

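A hedged sketch of the fan-out write behind this CF: the same activity JSON is written once into every subscriber's feed row, keyed by a TimeUUID so the feed stays time-ordered, with the 30-day TTL mentioned later in the deck. CF name and TTL come from the slides; the helper names and exact column format are assumptions.

    import java.util.List;

    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.MutationBatch;
    import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.serializers.StringSerializer;
    import com.netflix.astyanax.util.TimeUUIDUtils;

    public class ActivityFeedWriter {
        private static final int THIRTY_DAYS_IN_SECONDS = 30 * 24 * 60 * 60;

        // One wide row per feed owner; one column per activity, named by TimeUUID + activity type.
        private static final ColumnFamily<String, String> CF_ACTIVITY =
                new ColumnFamily<String, String>("activity_for_entity",
                        StringSerializer.get(), StringSerializer.get());

        private final Keyspace keyspace;

        public ActivityFeedWriter(Keyspace keyspace) {
            this.keyspace = keyspace;
        }

        // Fan the same activity JSON out to every subscriber's feed row in one batch.
        public void fanOut(List<String> feedRowKeys, String activityType, String activityJson)
                throws ConnectionException {
            String timeUuid = TimeUUIDUtils.getUniqueTimeUUIDinMillis().toString();
            MutationBatch batch = keyspace.prepareMutationBatch();
            for (String rowKey : feedRowKeys) {
                batch.withRow(CF_ACTIVITY, rowKey)
                     .putColumn(timeUuid + ":" + activityType + ":event_summary",
                                activityJson, THIRTY_DAYS_IN_SECONDS);
            }
            batch.execute();
        }
    }

The write amplification this creates for very active users is what bites later in Production Issue #2.
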
Migration Strategy (MySQL → Cassandra)

Data Migration: Trust, but Verify
Fully repeatable due to idempotent writes.
1) Bulk migrate all subscription data (HTTP, lia → NS)
2) Consistency check all subscription data (HTTP); this also runs after migration to verify the shadow writes

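Step 2 is easy to sketch because the step 1 writes are idempotent: for each target, read its Cassandra row and check that every subscription the legacy store knows about is present; anything missing can be fixed by simply re-running the bulk migration. A minimal sketch, with the CF name from the slides and everything else assumed.

    import java.util.ArrayList;
    import java.util.List;

    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.model.ColumnList;
    import com.netflix.astyanax.serializers.StringSerializer;

    public class SubscriptionConsistencyChecker {
        private static final ColumnFamily<String, String> CF_SUBSCRIPTION_INDEX =
                new ColumnFamily<String, String>("standard_subscription_index",
                        StringSerializer.get(), StringSerializer.get());

        private final Keyspace keyspace;

        public SubscriptionConsistencyChecker(Keyspace keyspace) {
            this.keyspace = keyspace;
        }

        // Returns the subscribers the legacy store expects for this target that Cassandra is missing.
        public List<String> missingSubscribers(String targetRowKey, List<String> expectedSubscribers)
                throws ConnectionException {
            ColumnList<String> row = keyspace.prepareQuery(CF_SUBSCRIPTION_INDEX)
                    .getKey(targetRowKey).execute().getResult();
            List<String> missing = new ArrayList<String>();
            for (String subscriber : expectedSubscribers) {
                if (row.getColumnByName(subscriber + ":creationtimestamp") == null) {
                    missing.add(subscriber);
                }
            }
            return missing;
        }
    }
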
Verify: Consistency Checking

Subscription Write Strategy
[diagram: a user's subscription_write goes to lia, which persists to mysql; a shadow subscription_write crosses the NS system boundary over activemq to the Notification Service, which writes to Cassandra]
Reads for subscription fulfillment happen in the NS.
Reads for the UI are fulfilled by legacy mysql (temporary).

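A minimal sketch of the shadow-write idea on the lia side, assuming a JMS producer for the ActiveMQ hop (the deck only shows the boxes, not the code): the legacy MySQL write stays exactly as it was, and the same subscription_write is also published for the Notification Service to apply to Cassandra. All names here are hypothetical.

    import javax.jms.JMSException;
    import javax.jms.MessageProducer;
    import javax.jms.Session;

    public class SubscriptionWriteFanout {
        private final LegacySubscriptionDao mysqlDao;  // existing write path; still serves UI reads for now
        private final MessageProducer shadowProducer;  // producer on the queue the Notification Service consumes
        private final Session jmsSession;

        public SubscriptionWriteFanout(LegacySubscriptionDao mysqlDao,
                                       MessageProducer shadowProducer,
                                       Session jmsSession) {
            this.mysqlDao = mysqlDao;
            this.shadowProducer = shadowProducer;
            this.jmsSession = jmsSession;
        }

        public void write(String subscriptionWriteJson) throws JMSException {
            // 1) legacy write, unchanged
            mysqlDao.save(subscriptionWriteJson);
            // 2) shadow write toward Cassandra, applied asynchronously by the Notification Service
            shadowProducer.send(jmsSession.createTextMessage(subscriptionWriteJson));
        }

        // Hypothetical wrapper around the existing MySQL persistence code.
        public interface LegacySubscriptionDao {
            void save(String subscriptionWriteJson);
        }
    }
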
Path to Production: QA Issue #1 (many writes to the same row kill the cluster)

Problem: CQL INSERTs
Single-threaded writes were slow, even with BATCH (multi-second latency for writing chunks of 1,000 subscriptions).
The largest customer (~20 million subscriptions) would have taken weeks to migrate.

Just Use More Threads? Not Quite

Cluster Essentially Died

Mutations Could Not Keep Up

Solution: Work Closer to the Storage Layer
Work here: the thrift storage view of the row (row key 66edfdb7-6ff7-458c-94a8-421627c1b6f5:message:13, columns user:2:creationtimestamp, user:53:creationtimestamp, user:88:creationtimestamp).
Not here: the CQL (cqlsh) view of the same data.

Solution: Thrift batch_mutate
More details: http://thelastpickle.com/blog/2013/09/13/CQL3-to-Astyanax-Compatibility.html
Allowed us to write 200,000 subscriptions to 3 CFs in ~45 seconds with almost no impact on the cluster.
NOTE: supposedly fixed in 2.0: CASSANDRA-4693

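For reference, a rough sketch of what the chunked bulk write looks like through Astyanax: one MutationBatch carries many rows across all three subscription CFs and reaches the cluster as a single thrift batch_mutate call, instead of a CQL BATCH of row-at-a-time INSERTs. The CF names match the data-model slides; the chunking, column formats, and value object are assumptions.

    import java.util.List;

    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.MutationBatch;
    import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.serializers.StringSerializer;

    public class SubscriptionBulkWriter {
        private static final ColumnFamily<String, String> CF_INDEX =
                new ColumnFamily<String, String>("standard_subscription_index",
                        StringSerializer.get(), StringSerializer.get());
        private static final ColumnFamily<String, String> CF_BY_TIME =
                new ColumnFamily<String, String>("subscriptions_for_entity_by_time",
                        StringSerializer.get(), StringSerializer.get());
        private static final ColumnFamily<String, String> CF_BY_TYPE =
                new ColumnFamily<String, String>("subscriptions_for_entity_by_type",
                        StringSerializer.get(), StringSerializer.get());

        private final Keyspace keyspace;

        public SubscriptionBulkWriter(Keyspace keyspace) {
            this.keyspace = keyspace;
        }

        // Write one chunk of subscriptions to all three CFs in a single thrift batch_mutate call.
        public void writeChunk(List<Subscription> chunk) throws ConnectionException {
            MutationBatch batch = keyspace.prepareMutationBatch();
            for (Subscription s : chunk) {
                batch.withRow(CF_INDEX, s.targetRowKey)
                     .putColumn(s.subscriber + ":creationtimestamp", String.valueOf(s.createdAt), null);
                batch.withRow(CF_BY_TIME, s.entityRowKey)
                     .putColumn(s.createdAt + ":" + s.target, "", null);
                batch.withRow(CF_BY_TYPE, s.entityRowKey)
                     .putColumn(s.target + ":creationtimestamp", String.valueOf(s.createdAt), null);
            }
            batch.execute();
        }

        // Hypothetical value object for a single subscription.
        public static class Subscription {
            public String targetRowKey;  // e.g. "<communityUuid>:message:13"
            public String entityRowKey;  // e.g. "<communityUuid>:user:2:0"
            public String target;        // e.g. "message:13"
            public String subscriber;    // e.g. "user:2"
            public long createdAt;       // creation timestamp
        }
    }

Usage is one writeChunk call per chunk, with chunk sizes kept modest (the earlier slides talk in chunks of about a thousand subscriptions).
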
Path to Production: QA Issue #2 (read timeouts)

Tombstone Buildup and Timeouts
The CF holding notification settings was rewritten every 30 minutes; eventually tombstone build-up caused reads to time out.

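The fix on the next slide isn't captured in this transcript, but the Lessons Learned slide points at the relevant knobs (gc_grace_seconds, compaction) and at longevity testing. As an illustration of the latter, here is a hedged sketch of a longevity test that replays the delete-and-rewrite pattern so tombstone build-up, and the read latency it causes, shows up in QA rather than production. CF name, column names, and counts are made up.

    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.MutationBatch;
    import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.serializers.StringSerializer;

    public class SettingsRewriteLongevityTest {
        // Hypothetical CF standing in for the notification-settings CF from the slide.
        private static final ColumnFamily<String, String> CF_SETTINGS =
                new ColumnFamily<String, String>("notification_settings",
                        StringSerializer.get(), StringSerializer.get());

        // Replays the "delete the row, then rewrite it" cycle many times. Each delete leaves a
        // tombstone that reads must skip until gc_grace_seconds passes and compaction purges it.
        public void simulateRewrites(Keyspace keyspace, String rowKey, int iterations)
                throws ConnectionException {
            for (int i = 0; i < iterations; i++) {
                MutationBatch delete = keyspace.prepareMutationBatch();
                delete.withRow(CF_SETTINGS, rowKey).delete();
                delete.execute();

                MutationBatch rewrite = keyspace.prepareMutationBatch();
                rewrite.withRow(CF_SETTINGS, rowKey)
                       .putColumn("digest_frequency", "30m", null)
                       .putColumn("email_enabled", "true", null);
                rewrite.execute();
                // Periodically time a read of rowKey here; latency climbs as tombstones accumulate.
            }
        }
    }
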
Solution

Production Issue #1 (dead cluster)

Hard Drive Failure on All Nodes
Four days after release, we started seeing this in /var/log/cassandra/system.log: [log excerpt]
After following a bunch of dead ends, we also found this in /var/messages.log: [log excerpt]
This cascaded to all nodes and within an hour the cluster was dead.

TRIM Support to the Rescue
* http://www.slideshare.net/rbranson/cassandra-and-solid-state-drives

Production Issue #2 (repair causing tornadoes of destruction)

Activity Feed Data Explosion
• Activity data written with a TTL of 30 days.
• Users in the 99th percentile were receiving multiple thousands of writes per day.
• Compacted row maximum size: ~85 MB (after 30 days).
Here be dragons:
– CASSANDRA-5799: Column can expire while lazy compacting it...

Problem Did Not Surface for 30 Days
• Repairs started taking up to a week
• Created thousands of SSTables
• High latency: [latency graph]

Solution: Trim Feeds Manually

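A rough sketch of what a manual trim could look like: cap each feed row at its newest N entries and delete the rest, so rows stop growing toward the ~85 MB worst case. The cap, the scan limit, and the helper names are assumptions; note the deletes themselves create tombstones, so the trim still has to respect gc_grace_seconds and compaction.

    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.MutationBatch;
    import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
    import com.netflix.astyanax.model.Column;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.model.ColumnList;
    import com.netflix.astyanax.serializers.StringSerializer;
    import com.netflix.astyanax.util.RangeBuilder;

    public class ActivityFeedTrimmer {
        private static final ColumnFamily<String, String> CF_ACTIVITY =
                new ColumnFamily<String, String>("activity_for_entity",
                        StringSerializer.get(), StringSerializer.get());

        private final Keyspace keyspace;

        public ActivityFeedTrimmer(Keyspace keyspace) {
            this.keyspace = keyspace;
        }

        // Keep only the newest maxEntries activities in one feed row; delete everything older.
        // A real job would page through very large rows instead of reading one big slice.
        public void trim(String feedRowKey, int maxEntries, int maxScan) throws ConnectionException {
            ColumnList<String> newestFirst = keyspace.prepareQuery(CF_ACTIVITY)
                    .getKey(feedRowKey)
                    .withColumnRange(new RangeBuilder().setReversed(true).setLimit(maxScan).build())
                    .execute()
                    .getResult();

            MutationBatch batch = keyspace.prepareMutationBatch();
            int seen = 0;
            int deletes = 0;
            for (Column<String> column : newestFirst) {
                if (seen++ >= maxEntries) {
                    batch.withRow(CF_ACTIVITY, feedRowKey).deleteColumn(column.getName());
                    deletes++;
                }
            }
            if (deletes > 0) {
                batch.execute();
            }
        }
    }
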
activity_for_entity cfstats
[nodetool cfstats output shown on the slide]

How We Monitor in Prod
• nodetool, OpsCenter, and JMX to monitor the cluster
• Yammer Metrics at every layer of the Notification Service, with Graphite to visualize
• Netflix Hystrix in the Notification Service to guard against cluster failure

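The deck names the tools but does not show code. As a hedged sketch, "guard against cluster failure" typically means wrapping each Cassandra call in a HystrixCommand so a sick cluster trips the circuit breaker and callers get a fallback instead of piling up blocked threads, with a Yammer Metrics timer around the read. The group key, metric names, and fallback behavior here are assumptions.

    import java.util.concurrent.TimeUnit;

    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.model.ColumnList;
    import com.netflix.astyanax.serializers.StringSerializer;
    import com.netflix.hystrix.HystrixCommand;
    import com.netflix.hystrix.HystrixCommandGroupKey;
    import com.yammer.metrics.Metrics;
    import com.yammer.metrics.core.Timer;
    import com.yammer.metrics.core.TimerContext;

    // Reads one activity feed row behind a circuit breaker so a sick cluster degrades the feature
    // (empty feed) instead of taking the Notification Service down with it.
    public class ReadActivityFeedCommand extends HystrixCommand<ColumnList<String>> {
        private static final ColumnFamily<String, String> CF_ACTIVITY =
                new ColumnFamily<String, String>("activity_for_entity",
                        StringSerializer.get(), StringSerializer.get());
        private static final Timer READ_TIMER =
                Metrics.newTimer(ReadActivityFeedCommand.class, "feed-reads",
                        TimeUnit.MILLISECONDS, TimeUnit.SECONDS);

        private final Keyspace keyspace;
        private final String feedRowKey;

        public ReadActivityFeedCommand(Keyspace keyspace, String feedRowKey) {
            super(HystrixCommandGroupKey.Factory.asKey("cassandra"));
            this.keyspace = keyspace;
            this.feedRowKey = feedRowKey;
        }

        @Override
        protected ColumnList<String> run() throws Exception {
            TimerContext timed = READ_TIMER.time();
            try {
                return keyspace.prepareQuery(CF_ACTIVITY).getKey(feedRowKey).execute().getResult();
            } finally {
                timed.stop();
            }
        }

        @Override
        protected ColumnList<String> getFallback() {
            return null; // caller treats this as "feed temporarily unavailable"
        }
    }

Call sites would run new ReadActivityFeedCommand(keyspace, rowKey).execute() and handle the fallback result.
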
Lessons Learned
• Have a migration strategy that allows both systems to stay live until you have proven Cassandra in prod
• Longevity tests are key, especially if you will have tombstones
• Understand how gc_grace_seconds and compaction affect tombstone cleanup
• Test with production data loads if you can

Questions?
@paulcichonski
