Large partition in Cassandra
Shogo Hoshii
Yahoo! Japan Corp.
Summary of the problems caused by large partitions and their solutions

Published in: Technology

  1. Large partition in Cassandra
     Shogo Hoshii
     Yahoo! Japan Corp.
  2. About me
     • Cassandra operator at Yahoo! Japan Corp.
     • https://issues.apache.org/jira/browse/CASSANDRA-5977
  3. Remark
     • This is a summary of the following tickets:
       – https://issues.apache.org/jira/browse/CASSANDRA-11206
       – https://issues.apache.org/jira/browse/CASSANDRA-9738
  4. Agenda
     • Recap the read path
     • What’s the problem?
     • Solutions
  5. High level: read path
     [Diagram: Row Cache, Key Cache, SSTables, MemTable]
     1. Check the row cache before going to the key cache
     2. Check the key cache to get the offsets to the data
     3. Find the offsets to the data and retrieve the data
     4. Merge data from the SSTables and the memtable
     5. Populate the row cache with the new row returned
     http://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlAboutReads.html
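The five steps on this slide can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the `Node` and `SSTable` classes, their fields, and the "later fragment wins" merge are hypothetical simplifications, not Cassandra's actual classes or its timestamp-based reconciliation.

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    """Toy SSTable (hypothetical): row fragments plus a key -> offset index."""
    data: list    # row fragments (dicts) in file order
    index: dict   # partition key -> offset into `data`


@dataclass
class Node:
    """Toy node (hypothetical) holding the four structures from the slide."""
    memtable: dict = field(default_factory=dict)
    sstables: list = field(default_factory=list)
    row_cache: dict = field(default_factory=dict)
    key_cache: dict = field(default_factory=dict)  # (sstable id, key) -> offset

    def read(self, key):
        # 1. Check the row cache before going to the key cache.
        if key in self.row_cache:
            return self.row_cache[key]

        fragments = []
        for sstable_id, sstable in enumerate(self.sstables):
            # 2. Check the key cache to get the offset to the data.
            offset = self.key_cache.get((sstable_id, key))
            if offset is None:
                # 3. On a miss, find the offset through the SSTable index.
                offset = sstable.index.get(key)
                if offset is None:
                    continue  # this SSTable does not contain the key
                self.key_cache[(sstable_id, key)] = offset
            fragments.append(sstable.data[offset])

        # 4. Merge data from the SSTables and the memtable.
        #    Simplification: later fragments overwrite earlier ones.
        if key in self.memtable:
            fragments.append(self.memtable[key])
        if not fragments:
            return None
        row = {}
        for fragment in fragments:
            row.update(fragment)

        # 5. Populate the row cache with the merged row.
        self.row_cache[key] = row
        return row
```

A second `read` of the same key returns straight from `row_cache`, which is Pattern 1 on the next slide.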
  6. Pattern 1. The row is in row cache
     [Diagram: Heap, Off Heap, Disk; MemTable, Row Cache, Key Cache, Bloom Filter, Partition Summary, Compression Offsets, Partition Index, Data]
     1. Read request
     2. Return the row if it is in the row cache
  7. Pattern 2. The key is in key cache
     [Diagram: Heap, Off Heap, Disk; MemTable, Row Cache, Key Cache, Bloom Filter, Partition Summary, Compression Offsets, Partition Index, Data]
     1. Read request
     2. Check bloom filters
     3. Check whether the partition key is in the key cache
     4. Find the offset to the result set
     5. Access the result set
  8. Pattern 3. The key is not cached
     [Diagram: Heap, Off Heap, Disk; MemTable, Row Cache, Key Cache, Bloom Filter, Partition Summary, Compression Offsets, Partition Index, Data]
     1. Read request
     2. Miss -> Check bloom filters
     3. Check whether the partition key is in the key cache
     4. Miss -> Binary search for the approximate location in the index
     5. Disk scan to find the offsets
     6. Find the offset into the result set
     7. Access the result set
     8. Update key cache
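The key-cache-miss path above can be sketched as follows. The `ToySSTableIndex` class and its `sample_every` parameter are hypothetical; a plain Python set stands in for the Bloom filter (so, unlike a real Bloom filter, it never gives false positives), and a sorted list stands in for the on-disk partition index.

```python
import bisect


class ToySSTableIndex:
    """Hypothetical model: bloom filter -> key cache -> summary -> index scan."""

    def __init__(self, entries, sample_every=4):
        # "Partition index": sorted (partition_key, data_offset) pairs.
        self.index = sorted(entries)
        self.sample_every = sample_every
        # "Partition summary": every Nth key and its position in the index.
        self.summary_keys = [key for key, _ in self.index[::sample_every]]
        self.summary_positions = list(range(0, len(self.index), sample_every))
        # Stand-in for the Bloom filter (assumption: exact membership).
        self.bloom = {key for key, _ in self.index}
        self.key_cache = {}

    def lookup(self, key):
        # Check the Bloom filter first; skip this SSTable on a negative.
        if key not in self.bloom:
            return None
        # Check whether the partition key is in the key cache.
        if key in self.key_cache:
            return self.key_cache[key]
        # Miss: binary search the summary for the nearest preceding sample.
        i = bisect.bisect_right(self.summary_keys, key) - 1
        if i < 0:
            return None
        start = self.summary_positions[i]
        # "Disk scan": walk the index from that position to find the offset.
        for candidate, offset in self.index[start:start + self.sample_every]:
            if candidate == key:
                # Update the key cache so the next read is Pattern 2.
                self.key_cache[key] = offset
                return offset
        return None
```

After the first `lookup` of a key, the same key is served from `key_cache` without touching the summary or the index.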
  9. What’s the problem?
     • GC pressure from the key cache when a large partition is read
  10. Partition Index Recap
      • http://distributeddatastore.blogspot.jp/2013/08/cassandra-sstable-storage-format.html
  11. RowIndexEntry
      • Partition size < 64 KB
        – RowIndexEntry
          • Position
          • Serialized size of data
      • Partition size > 64 KB
        – IndexedEntry
          • Position
          • Serialized size of data
          • IndexInfo[]
            – Serialize method
            – Offset
            – Width
            – Etc.
      Approximation with 16-byte values:
        Partition size | Index size | Objects
        1 MB           | 3 KB       | > 200
        4 MB           | 11 KB      | > 800
        64 MB          | 180 KB     | > 13k
        512 MB         | 1.4 MB     | > 106k
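The approximation figures can be reproduced with simple arithmetic. The sketch below assumes one IndexInfo entry per 64 KB of partition data, in line with the 64 KB threshold on the slide. The per-entry constants (about 180 bytes and about 13 objects per IndexInfo) are not stated in the deck; they are back-computed here from the slide's own numbers and should be read as rough assumptions.

```python
KB = 1024
MB = 1024 * KB

COLUMN_INDEX_SIZE = 64 * KB    # assumption: one IndexInfo per 64 KB of data
BYTES_PER_INDEX_INFO = 180     # assumption: back-computed from the slide
OBJECTS_PER_INDEX_INFO = 13    # assumption: back-computed from the slide


def estimate_index(partition_bytes):
    """Return (IndexInfo entries, index bytes, heap objects) for a partition."""
    entries = partition_bytes // COLUMN_INDEX_SIZE
    return (
        entries,
        entries * BYTES_PER_INDEX_INFO,
        entries * OBJECTS_PER_INDEX_INFO,
    )


for size_mb in (1, 4, 64, 512):
    entries, index_bytes, objects = estimate_index(size_mb * MB)
    print(
        f"{size_mb:>4} MB partition: {entries:>5} IndexInfo entries, "
        f"~{index_bytes / KB:.0f} KB of index, ~{objects} objects"
    )
```

Under these assumptions the object counts come out at 208, 832, 13312 and 106496, which matches the "> 200", "> 800", "> 13k" and "> 106k" figures on the slide. The point is that both numbers grow linearly with partition size.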
  12. Pattern 3. The key is not cached
      [Diagram: Heap, Off Heap, Disk; MemTable, Row Cache, Key Cache, Bloom Filter, Partition Summary, Compression Offsets, Partition Index, Data]
      1. Read request
      2. Miss -> Check bloom filters
      3. Check whether the partition key is in the key cache
      4. Miss -> Binary search for the approximate location in the index
      5. Disk scan to find the offsets
      6. Find the offsets into the result set
      7. Access the result set
      8. Update key cache
      9. GC, GC, GC…
  13. Current solution
      • If the serialized index of a partition is smaller than column_index_cache_size_in_kb (configurable)
        – IndexedEntry is kept on heap
      • Otherwise
        – Always read from disk when needed
      • https://issues.apache.org/jira/browse/CASSANDRA-11206
      • https://www.youtube.com/watch?v=qa84vABqftM
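The threshold rule can be sketched as a small decision function. The function name, arguments, and returned dictionary are hypothetical and only illustrate the idea. The comparison uses the serialized size of the index entries, since the comments in cassandra.yaml describe `column_index_cache_size_in_kb` as referring to the serialized index information rather than the size of the partition itself.

```python
def key_cache_entry(position, index_infos, serialized_index_size, threshold_bytes):
    """Decide what a toy key cache would hold for one partition.

    position:              offset of the partition's index entry in the index file
    index_infos:           the IndexInfo-like entries for the partition
    serialized_index_size: serialized size in bytes of those entries
    threshold_bytes:       column_index_cache_size_in_kb * 1024
    """
    if serialized_index_size < threshold_bytes:
        # Small index: keep the full entries on heap for fast access.
        return {"position": position, "index_infos": list(index_infos)}
    # Large index: keep only the position; the entries are read
    # from disk each time they are needed.
    return {"position": position, "index_infos": None}


# Usage with a hypothetical 2 KB threshold.
small = key_cache_entry(0, [(0, 65536)], 180, 2 * 1024)
large = key_cache_entry(4096, [(0, 65536)] * 1024, 184320, 2 * 1024)
```

The trade-off is that the large case gives up some read speed in exchange for a bounded on-heap footprint per key cache entry.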
  14. Other possible solutions
      • Never keep IndexInfo on heap
        – Read from disk when needed
        – Degrades performance when small partitions are read
  15. Other possible solutions
      • Migrate key cache to be fully off heap
        – https://issues.apache.org/jira/browse/CASSANDRA-9738
        – Serialization and deserialization are costly when a large partition is read
      • Will Birch help us to solve this problem?
        – https://issues.apache.org/jira/browse/CASSANDRA-9754
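The serialization cost of a fully off-heap key cache can be illustrated with a toy round trip. Here a `bytes` buffer stands in for off-heap memory, and each IndexInfo is reduced to a hypothetical `(offset, width)` pair of 64-bit integers; real entries also carry clustering values, so this understates the work involved.

```python
import struct

# Hypothetical fixed-size layout: (offset, width) as two little-endian int64s.
ENTRY = struct.Struct("<qq")


def to_off_heap(index_infos):
    """Serialize all entries into one flat buffer (the 'off-heap' copy)."""
    buf = bytearray(ENTRY.size * len(index_infos))
    for i, (offset, width) in enumerate(index_infos):
        ENTRY.pack_into(buf, i * ENTRY.size, offset, width)
    return bytes(buf)


def from_off_heap(buf):
    """Deserialize every entry back into Python objects on each access."""
    return [ENTRY.unpack_from(buf, pos) for pos in range(0, len(buf), ENTRY.size)]


# A 512 MB partition with one entry per 64 KB has 8192 entries.
infos = [(i * 65536, 65536) for i in range(8192)]
buf = to_off_heap(infos)
restored = from_off_heap(buf)
```

Every read that goes through `from_off_heap` recreates all 8192 entries as short-lived objects, which is the cost this slide points at for large partitions.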
  16. Thank you