Bulk exporting data from Cassandra (Carlo Cabanilla, @clofresh)
Why export?
snapshot
sstable2json
Killing IO on live cluster
sstable2json sstable2csv, with filters
ionice -c 3
Need a place to put it
EBS to the rescue
gzipped
S3cmd
Need to dedupe
Hadoop
numpy pickles
Haderp Mortar Data
numpy pickles msgpack lz4
gzipped lzod
Haderp file naming! 2010-07-27~org-1018~m-48778.csv-1,316.gz
S3 copy

Bulk Exporting from Cassandra - Carlo Cabanilla

Carlo gives his perspective on the challenges of doing large exports from Cassandra.

Published in: Technology, Business