
How Cassandra Deletes Data (Alain Rodriguez, The Last Pickle) | Cassandra Summit 2016

How does Cassandra delete data when the files on disk are immutable? How does it make sure deletes are distributed around the cluster? The answer is tombstones, a "soft delete" marker that solves these problems and creates others by inserting more data when you ask for data to be deleted. This can result in serious problems for some data models, and headaches for developers and operations teams. With the correct settings and workload, however, it can mean that Cassandra efficiently removes old data from disk.

In this talk, Alain Rodriguez, Consultant at The Last Pickle, will explain why Cassandra uses tombstones, how they work, and when they are purged from disk. He will also discuss the best data models and configuration settings to ensure efficient purging, and what to do when it goes wrong.

About the Speaker
Alain Rodriguez, Consultant, The Last Pickle

Alain has been working with Apache Cassandra since version 0.8. He was the first engineer at teads.tv, which had grown to 400+ employees by the time he left. During his time at Teads, Alain managed and scaled Cassandra clusters across multiple AWS Regions, fully on his own, taking care of the data modeling as well as the troubleshooting and tuning. Alain frequently contributes to the Apache Cassandra users mailing list.

Published in: Software

  1. 1. HOW CASSANDRA DELETES DATA. Alain Rodriguez. Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License.
  2. 2. • Tombstone issues • Why tombstones • Tombstone removal
  3. 3. Introduction
  4. 4. About The Last Pickle
  5. 5. About The Last Pickle and Alain Rodriguez
  6. 6. About The Last Pickle and Alain Rodriguez
  7. 7. About deletes in Cassandra: Deleted data in Cassandra does not just disappear,
  8. 8. About deletes in Cassandra: Deleted data in Cassandra does not just disappear; instead, a tombstone is added.
  9. 9. OK, so what’s the matter, why this talk? Tombstones are needed in Cassandra, not an issue…
  10. 10. OK, so what’s the matter, why this talk? Tombstones are needed in Cassandra, not an issue… …until an SSTable or a query result looks like this…
  11. 11. OK, so what’s the matter, why this talk? Then we see that on the user mailing list or in other community tools.
  12. 12. OK, so what’s the matter, why this talk? Then we see that on the user mailing list or in other community tools. So I thought I could share about this topic: thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
  13. 13. Tombstone issues
  14. 14. Tombstone issues: impacts. The read path: reading tombstones induces latencies, timeouts or exceptions.
  15. 15. Tombstone issues: impacts. The read path: reading tombstones induces latencies, timeouts or exceptions. The disk space: tombstones can fill up the disk (100%).
  16. 16. Tombstone issues: impacts. The read path: reading tombstones induces latencies, timeouts or exceptions. The disk space: tombstones can fill up the disk (100%). I am facing one of these issues; is it caused by tombstones?
  17. 17. Tombstone issues: Read Path grep -i -e "ERROR" -e "WARN" /var/log/cassandra/system.log
  18. 18. Tombstone issues: Read Path grep -i -e "ERROR" -e "WARN" /var/log/cassandra/system.log WARN [SharedPool-Worker-7] 2016-07-16 16:31:09,048 SliceQueryFilter.java:319 - Read 276 live and 1104 tombstone cells in mykeyspace.mytable for key: ItV9kZC8mFNiSvYM8AwufBU8tTtJkW5dUH5MNcq1H18 (see tombstone_warn_threshold). 500 columns were requested, slices=[-]
  19. 19. Tombstone issues: Read Path grep -i -e "ERROR" -e "WARN" /var/log/cassandra/system.log WARN [SharedPool-Worker-7] 2016-07-16 16:31:09,048 SliceQueryFilter.java:319 - Read 276 live and 1104 tombstone cells in mykeyspace.mytable for key: ItV9kZC8mFNiSvYM8AwufBU8tTtJkW5dUH5MNcq1H18 (see tombstone_warn_threshold). 500 columns were requested, slices=[-] ERROR [ReadStage:290729] 2016-07-16 17:00:18,708 SliceQueryFilter.java (line 206) Scanned over 100000 tombstones in mykeyspace.mytable; query aborted (see tombstone_failure_threshold) ERROR [ReadStage:290729] 2016-04-22 17:00:18,709 CassandraDaemon.java (line 258) Exception in thread Thread[ReadStage:290729,5,main] java.lang.RuntimeException: org.apache.cassandra.db.filter.TombstoneOverwhelmingException
  20. 20. Tombstone issues: Read Path. The tombstoneScannedHistogram metric, available through nodetool cfstats, JMX…
  21. 21. Tombstone issues: Read Path. The tombstoneScannedHistogram metric, available through nodetool cfstats, JMX, or a plugged-in monitoring tool (commercial or free) such as Datadog, Grafana, SPM, OpsCenter…
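The grep on the read-path slides can be pushed one step further by extracting the counts from each warning. The following is a minimal sketch, assuming the WARN message is worded exactly as in the line quoted on the slide; other Cassandra versions may phrase it differently, and the helper names are illustrative only.

```python
import re

# Pattern modelled on the WARN line quoted in the slides; other Cassandra
# versions may word this message differently (assumption).
TOMBSTONE_WARN = re.compile(
    r"Read (?P<live>\d+) live and (?P<tombstones>\d+) tombstone cells? "
    r"in (?P<table>[\w.]+)"
)


def tombstone_warnings(lines):
    """Yield (table, live_cells, tombstone_cells) for each matching log line."""
    for line in lines:
        match = TOMBSTONE_WARN.search(line)
        if match:
            yield (
                match.group("table"),
                int(match.group("live")),
                int(match.group("tombstones")),
            )


def worst_offenders(lines):
    """Return the highest tombstone count seen per table, largest first."""
    worst = {}
    for table, _live, tombstones in tombstone_warnings(lines):
        worst[table] = max(worst.get(table, 0), tombstones)
    return sorted(worst.items(), key=lambda item: item[1], reverse=True)


sample = [
    "WARN [SharedPool-Worker-7] 2016-07-16 16:31:09,048 SliceQueryFilter.java:319 - "
    "Read 276 live and 1104 tombstone cells in mykeyspace.mytable for key: "
    "ItV9kZC8mFNiSvYM8AwufBU8tTtJkW5dUH5MNcq1H18 (see tombstone_warn_threshold).",
    "INFO [main] 2016-07-16 16:31:10,000 unrelated line",
]
print(worst_offenders(sample))  # [('mykeyspace.mytable', 1104)]
```

In practice the lines would come from reading /var/log/cassandra/system.log rather than from an in-memory sample.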
  22. 22. Tombstone issues: Disk space. The DroppableTombstoneRatio metric provides interesting info.
  23. 23. Tombstone issues: Disk space. The DroppableTombstoneRatio metric provides interesting info. It is available through the sstablemetadata tool, JMX and plugged-in monitoring tools such as Datadog, Grafana, SPM, OpsCenter, etc. It is possible to write a script to check the ratio of the biggest SSTables, for example.
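The script mentioned on the disk-space slide could look roughly like the sketch below. It assumes that sstablemetadata prints a line of the form "Estimated droppable tombstones: <ratio>", which should be checked against the Cassandra version in use. The file names and tool outputs are hypothetical, and running the tool itself (for example through subprocess) is left out.

```python
import re

# Assumption: sstablemetadata prints a line of the form
# "Estimated droppable tombstones: <ratio>"; verify against your version.
RATIO_LINE = re.compile(r"Estimated droppable tombstones:\s*([0-9.eE+-]+)")


def droppable_ratio(metadata_output):
    """Extract the droppable tombstone ratio from one tool output, or None."""
    match = RATIO_LINE.search(metadata_output)
    return float(match.group(1)) if match else None


def rank_sstables(outputs_by_file):
    """Sort SSTables by droppable tombstone ratio, highest first."""
    ratios = {}
    for path, output in outputs_by_file.items():
        ratio = droppable_ratio(output)
        if ratio is not None:
            ratios[path] = ratio
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Hypothetical file names and outputs, for illustration only.
example = {
    "mytable-ka-1-Data.db": "Estimated droppable tombstones: 0.05\n",
    "mytable-ka-2-Data.db": "Estimated droppable tombstones: 0.62\n",
    "mytable-ka-3-Data.db": "no ratio reported\n",
}
for path, ratio in rank_sstables(example):
    print(f"{ratio:.2f}  {path}")
```

To focus on the biggest SSTables, as the slide suggests, the input could first be filtered by file size before ranking.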
  24. 24. Why tombstones? I want to remove data!
  25. 25. Why Tombstones: Cassandra write path. [Diagram: Cassandra node write path: a client write goes to the memtable (memory) and the commit log (disk); the memtable is flushed to immutable SSTables on disk]
  26. 26. Why Tombstones: Cassandra write path. [Diagram: the same write path, plus a client read]
  27. 27. Why Tombstones: Distributed system. Cassandra is a distributed system. Distributed deletes are tricky!
  28. 28. Why Tombstones: Cassandra consistency. [Diagram: Cassandra cluster, 4 nodes, RF = 3, Write CL = Quorum = 2, Read CL = Quorum = 2: strong consistency]
  29. 29. Why Tombstones: Cassandra consistency. [Diagram: client writes “A”; replicas: ?, ?, ?]
  30. 30. Why Tombstones: Cassandra consistency. [Diagram: client writes “A”; replicas: A, A, ?; two acks]
  31. 31. Why Tombstones: Cassandra consistency. [Diagram: client writes “A” (two acks), then client reads “A”; replicas: A, A, ?]
  32. 32. Why Tombstones: Cassandra consistency & availability. [Diagram: replicas: A, A, down; client write “A” and client read “A” still get two acks: high availability]
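The numbers on the consistency slides follow from simple arithmetic: a quorum is a majority of the replicas, and a read is guaranteed to overlap the latest write when the read and write replica counts together exceed the replication factor. A small sketch, with function names chosen here for illustration:

```python
def quorum(replication_factor):
    """Number of replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1


def read_sees_latest_write(replication_factor, write_replicas, read_replicas):
    """True when every read set must overlap every write set (R + W > RF)."""
    return write_replicas + read_replicas > replication_factor


rf = 3
q = quorum(rf)
print(q)                                  # 2, as on the slides
print(read_sees_latest_write(rf, q, q))   # True: strong consistency
print(rf - q)                             # 1 replica can be down
```

With RF = 3, quorum reads and writes keep working with one replica down, which is the availability case shown on the last consistency slide.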
  33. 33. Why Tombstones: Distributed deletes. [Diagram, WITHOUT tombstones: 4 nodes, RF = 3, Write CL = Quorum = 2, Read CL = Quorum = 2; replicas: A, A, A]
  34. 34. Why Tombstones: Distributed deletes. [Diagram, WITHOUT tombstones: client deletes “A”; replicas: A, A, A]
  35. 35. Why Tombstones: Distributed deletes. [Diagram, WITHOUT tombstones: client deletes “A”, two acks; one replica still holds A]
  36. 36. Why Tombstones: Distributed deletes. [Diagram, WITHOUT tombstones: after the delete (two acks), one replica still holds A; client read returns “A”: wrong]
  37. 37. Why Tombstones: Distributed deletes. [Diagram, WITHOUT tombstones: after the delete (two acks), one replica still holds A; client read returns “empty”: correct]
  38. 38. Why Tombstones: Distributed deletes. [Diagram, WITH tombstones: 4 nodes, RF = 3, Write CL = Quorum = 2, Read CL = Quorum = 2; replicas: A, A, A]
  39. 39. Why Tombstones: Distributed deletes. [Diagram, WITH tombstones: client deletes “A”; replicas: A, A, A]
  40. 40. Why Tombstones: Distributed deletes. [Diagram, WITH tombstones: A* = tombstone on A; client deletes “A”, two acks; replicas: A*, A*, A]
  41. 41. Why Tombstones: Distributed deletes. [Diagram, WITH tombstones: replicas: A*, A*, A; client read returns “A*”, meaning “empty”: correct. WITHOUT tombstones, the same read could return “A”: wrong]
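The distributed-delete slides can be replayed with a toy model. This is only a sketch of the idea, not Cassandra's implementation: replicas are plain dictionaries, conflicts are resolved by picking the newest timestamp, and all names are made up for the illustration. The read deliberately contacts the replica that missed the delete, which is the "wrong" case from the slides.

```python
TOMBSTONE = object()  # marker standing in for A*


def write(replicas, reachable, key, value, timestamp):
    """Apply a timestamped write to the reachable replicas only."""
    for index in reachable:
        replicas[index][key] = (timestamp, value)


def hard_delete(replicas, reachable, key):
    """Delete WITHOUT tombstones: the cell is simply dropped."""
    for index in reachable:
        replicas[index].pop(key, None)


def soft_delete(replicas, reachable, key, timestamp):
    """Delete WITH tombstones: a newer marker is written instead."""
    write(replicas, reachable, key, TOMBSTONE, timestamp)


def read(replicas, contacted, key):
    """Return the newest value among contacted replicas; None means empty."""
    cells = [replicas[i][key] for i in contacted if key in replicas[i]]
    if not cells:
        return None
    _timestamp, value = max(cells, key=lambda cell: cell[0])
    return None if value is TOMBSTONE else value


def scenario(delete):
    replicas = [{}, {}, {}]                       # RF = 3
    write(replicas, [0, 1, 2], "k", "A", timestamp=1)
    delete(replicas, [0, 1], "k")                 # only a quorum of 2 acks
    return read(replicas, [1, 2], "k")            # quorum read hits the stale replica


without = scenario(hard_delete)
with_tombstones = scenario(lambda r, reach, k: soft_delete(r, reach, k, timestamp=2))
print(without)           # A    -> wrong, the deleted value comes back
print(with_tombstones)   # None -> correct, the tombstone wins
```

Without a marker, a replica that has nothing for the key cannot be told apart from one that never received the write, so the stale “A” wins. With a tombstone, the newer timestamp settles the conflict.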
  42. 42. Cool story, but I really want to remove the data! Tombstone removal!
  43. 43. When are tombstones removed? When should tombstones be removed? • Once the tombstone is fully replicated • When deleted data has been removed
  44. 44. When are tombstones removed? When should tombstones be removed? • Once the tombstone is fully replicated • When deleted data has been removed When are tombstones actually removed? • After gc_grace_seconds • During compactions, IF all the deleted data and the tombstone itself are involved
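The two conditions for actual removal can be written as a single predicate. The sketch below is a simplification under stated assumptions: the set of SSTables that may still hold shadowed data is passed in directly as a hypothetical input, and the example uses 864000 seconds (10 days) for gc_grace_seconds, the usual default.

```python
def can_purge(tombstone_deleted_at, now, gc_grace_seconds,
              compacting_sstables, sstables_with_shadowed_data):
    """Sketch of the two conditions on the slide; all times in epoch seconds.

    compacting_sstables: set of SSTable names in the current compaction.
    sstables_with_shadowed_data: set of SSTables that may still hold data
    covered by this tombstone (hypothetical input for this illustration).
    """
    grace_elapsed = now >= tombstone_deleted_at + gc_grace_seconds
    all_data_involved = sstables_with_shadowed_data <= compacting_sstables
    return grace_elapsed and all_data_involved


GC_GRACE = 864000  # 10 days

# Grace period over and every relevant SSTable is being compacted.
print(can_purge(0, GC_GRACE, GC_GRACE, {"a", "b"}, {"a"}))      # True
# One second too early.
print(can_purge(0, GC_GRACE - 1, GC_GRACE, {"a", "b"}, {"a"}))  # False
# Shadowed data also lives in SSTable "c", which is not in this compaction.
print(can_purge(0, 900000, GC_GRACE, {"a"}, {"a", "c"}))        # False
```

The third case is the one behind "overlapping SSTables = no eviction": the tombstone must stay as long as data it shadows could survive elsewhere.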
  45. 45. How tombstones are removed: Compaction! [Diagram: Cassandra node with client write and client read; memtable in memory; commit log and four immutable SSTables on disk]
  46. 46. How tombstones are removed: Compaction! [Diagram: the same node, compacting 4 SSTables]
  47. 47. How tombstones are removed: Compaction! [Diagram: the same node after compaction, with a single SSTable on disk]
  48. 48. Implications in the real world • No compaction = no eviction • No compaction + TTLs or deletes = tombstones stack up (up to 100%)
  49. 49. Implications in the real world • No compaction = no eviction • No compaction + TTLs or deletes = tombstones stack up (up to 100%) • Overlapping SSTables = no eviction • Fragmented data = eviction unlikely • LCS: tombstone level ≠ data level = no eviction
  50. 50. Implications in the real world • No compaction = no eviction • No compaction + TTLs or deletes = tombstones stack up (up to 100%) • Overlapping SSTables = no eviction • Fragmented data = eviction unlikely • LCS: tombstone level ≠ data level = no eviction • TTL << gc_grace_seconds = high % of useless data
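The last bullet can be made concrete with a back-of-the-envelope estimate. The sketch assumes a constant write rate, the same TTL on every cell, and compaction evicting expired data as soon as gc_grace_seconds has passed (the best case). Under those assumptions the disk holds TTL seconds of live data plus gc_grace_seconds of expired data; the function name is made up for the illustration.

```python
def expired_share(ttl_seconds, gc_grace_seconds):
    """Steady-state share of on-disk data that is expired but not yet purgeable.

    Assumes a constant write rate, a uniform TTL, and eviction exactly when
    gc_grace_seconds has elapsed after expiry (best case).
    """
    return gc_grace_seconds / (ttl_seconds + gc_grace_seconds)


ONE_DAY = 86400
TEN_DAYS = 864000

print(f"{expired_share(ONE_DAY, TEN_DAYS):.0%}")   # 91%
print(f"{expired_share(TEN_DAYS, TEN_DAYS):.0%}")  # 50%
```

With a 1-day TTL and a 10-day gc_grace_seconds, roughly ten elevenths of the stored data is already expired, which is why lowering gc_grace_seconds on TTL-only tables is often considered, provided repairs still complete within that window.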
  51. 51. Some tuning! Good news: the Cassandra community and committers are awesome!
  52. 52. Some tuning! Issue: No compaction = no eviction. CASSANDRA-3442: tombstone_threshold (C* 1.2.b1). Compaction option, default: tombstone_threshold = 0.2 (ratio: 20% has been deleted). Single-SSTable compaction triggered based on an estimate! Low risk: worst case -> no-op.
  53. 53. Some tuning! Issue: Tombstone compaction loop! CASSANDRA-4022: Check for key overlaps (C* 1.2.b1). Internals improvement, not an option: the estimate of droppable tombstones is improved, now considering key overlap with other SSTables.
  54. 54. Some tuning! Issue: Tombstone compaction loop! CASSANDRA-4781: tombstone_compaction_interval (C* 1.2.b2). Compaction option, default: tombstone_compaction_interval = 86400 (in seconds = 1 day). Definitely prevents loops.
  55. 55. Some tuning! Issue: Compacting to remove tombstones is expensive. CASSANDRA-5228: Expired SSTables (C* 2.0.b1). Internals improvement, not an option. Effective with time series, DTCS / TWCS and TTLs!
  56. 56. Some tuning! Issue: Tombstone compactions not triggering. CASSANDRA-6563: unchecked_tombstone_compaction (C* 2.0.9). Compaction option, default: unchecked_tombstone_compaction = false. CASSANDRA-4022 becomes an option.
  57. 57. Some tuning! Issue: Overlapping preventing efficient tombstone compactions. CASSANDRA-7019: provide_overlapping_tombstones (C* 3.10). Compaction option, default: provide_overlapping_tombstones = NONE (CELL / ROW / NONE). Risky: • Not yet released, so not really tested • Heavier tombstone compactions
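The compaction options named on these slides are set per table. As a hedged illustration, the helper below (a made-up function, not part of any driver) builds the corresponding ALTER TABLE statement as a string. The table name and the SizeTieredCompactionStrategy class are example choices, and the strategy class is restated on the assumption that the compaction map is set as a whole.

```python
def cql_value(value):
    """Render a Python value as the text of a CQL map entry."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def alter_compaction_cql(table, strategy, **options):
    """Build an ALTER TABLE statement setting compaction sub-options."""
    settings = {"class": strategy}
    settings.update({name: cql_value(value) for name, value in options.items()})
    body = ", ".join(f"'{name}': '{value}'" for name, value in settings.items())
    return f"ALTER TABLE {table} WITH compaction = {{{body}}};"


statement = alter_compaction_cql(
    "mykeyspace.mytable",                  # example table from the log slides
    "SizeTieredCompactionStrategy",        # example strategy (assumption)
    tombstone_threshold=0.2,               # default value per the slides
    tombstone_compaction_interval=86400,   # default value per the slides
    unchecked_tombstone_compaction=True,   # default is false per the slides
)
print(statement)
```

The resulting string can be pasted into cqlsh or passed to a driver session. Values should be tested on a non-production table first, since more aggressive tombstone compactions cost extra I/O.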
  58. 58. Some tuning - Tombstone distribution! [Diagram, WITH tombstones: 4 nodes, RF = 3, Write CL = Quorum = 2, Read CL = Quorum = 2; client deletes “A”, two acks; tombstones not replicated to the third replica: A*, A*, A; client read returns “A*”: correct]
  59. 59. Some tuning - Tombstone distribution! Case where a node fails + no repair = case without tombstones. [Diagram: tombstones not replicated; A* removed; remaining replicas: A*, A; client read returns “A”: wrong]
  60. 60. Some tuning - Tombstone distribution! Case where a node fails + no repair = case without tombstones = zombie data! [Diagram: tombstones not replicated; A* removed; remaining replica: A; client read returns “A”: wrong]
  61. 61. Some tuning - Tombstone distribution! CASSANDRA-6434 (C* 3.0.b1): only_purge_repaired_tombstones (default: false). [Diagram: tombstones not replicated; replicas: A*, A*, A; A* not removed; client read returns “A*”, meaning “empty”: correct]
  62. 62. Some tuning - Tombstone distribution! CASSANDRA-6434 (C* 3.0.b1): only_purge_repaired_tombstones (default: false). Limitation: repair failing or no repair = permanent tombstone. [Diagram: tombstones not replicated; replicas: A*, A*, A; A* not removed; client read returns “A*”, meaning “empty”: correct]
  63. 63. Conclusion
  64. 64. Things we know about tombstones • Tombstones are due to deletes and TTLs • Tombstones fit with the Cassandra write path • Tombstones ensure consistency • Reading tombstones is expensive and can produce failures • Tombstones take space on disk and might be tricky to remove • Tombstones need to be distributed before being removed
  65. 65. Takeaways • Model data and workflow to avoid reading many tombstones • Deleted data = repair the table within gc_grace_seconds • Monitor tombstones, keep control! (Set some alerts?) • Use compaction options to tackle problems; there is always a way. • Is there no way? Ask, or create a Jira and keep improving Cassandra!
  66. 66. Thank you. Questions? thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
