SOUTH BAY CASSANDRA USERS MARCH 2016
BACKUP AND RESTORE FOR
APACHE CASSANDRA
Aaron Morton
@aaronmorton
CEO
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
About The Last Pickle.
Work with clients to deliver and improve Apache Cassandra
based solutions.
Apache Cassandra Committer and DataStax MVPs.
Based in New Zealand, Australia, France & USA.
Why Backup
Commit Log Archiving
Table Snap
Why Backup?
Replication is for Availability.
Why Backup?
Replicate good data as fast as
bad data.
Three Reasons To Backup…
Business Continuity Planning /
Disaster Recovery Planning
(AKA Data Centre is on fire.)
Three Reasons To Backup…
Environment Cloning
(AKA Let’s make a new Data Centre.)
Three Reasons To Backup…
Point In Time Recovery
(AKA Bad deploy.)
Why Backup
Commit Log Archiving
Table Snap
Commit Log
Writes are first written to the
Commit Log (on each node).
Commit Log
Commit Log can grow up to
8GB in size.
Commit Log
Commit Log is made up of 32
MB Segments.
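
Taking the two figures above at face value, a quick calculation gives the number of full segments that fit under the cap. This is only arithmetic on the slide's numbers (8 GB and 32 MB), not a statement about how Cassandra allocates segments internally.

```python
# Figures taken from the slides: an 8 GB commit log made of 32 MB segments.
MAX_COMMIT_LOG_MB = 8 * 1024
SEGMENT_MB = 32


def max_segments(total_mb=MAX_COMMIT_LOG_MB, segment_mb=SEGMENT_MB):
    """Return how many whole segments fit inside the commit log cap."""
    return total_mb // segment_mb


print(max_segments())  # 256
```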
Commit Log
Commit Log contains
Mutations, which have row
fragments.
Commit Log
Mutations are serialised in the
form they are sent over the
wire.
Commit Log Archiving
Archive Segment when full.
Restore Segments at startup
(if specified).
commitlog_archiving.properties
archive_command=
Run this command when a Segment
is full.
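
A minimal sketch of what an archive step might do with one full Segment. The function name `archive_segment` and the archive directory are hypothetical; the sketch assumes the command is handed the segment's full path and its file name, which is how I understand the `%path` and `%name` placeholders in the stock properties file to work.

```python
import shutil
from pathlib import Path


def archive_segment(segment_path, segment_name, archive_dir):
    """Copy one full commit log segment into archive_dir and return the copy's path."""
    target_dir = Path(archive_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / segment_name
    # Never overwrite a segment that has already been archived.
    if not target.exists():
        shutil.copy2(segment_path, target)
    return target
```

A small command-line wrapper (not shown) could pass the two placeholder values to this function. Copying rather than moving leaves the original segment in place for Cassandra to manage.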
commitlog_archiving.properties
restore_directories=
Read all files in this CSV list of
directories at startup and run
restore_command for each.
commitlog_archiving.properties
restore_point_in_time=
Stop processing mutations with a
timestamp higher than this.
commitlog_archiving.properties
precision=MICROSECONDS
Precision used for timestamps.
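
The point-in-time cut-off can be illustrated with a short sketch. It assumes the timestamp is written as `yyyy:MM:dd HH:mm:ss` in GMT (the format I recall from the stock properties file, so treat it as an assumption) and that precision is MICROSECONDS. Mutations are modelled here as hypothetical `(timestamp_micros, payload)` tuples; this is not Cassandra's replay code.

```python
from datetime import datetime, timezone


def to_micros(point_in_time):
    """Convert 'yyyy:MM:dd HH:mm:ss' (GMT) to microseconds since the epoch."""
    parsed = datetime.strptime(point_in_time, "%Y:%m:%d %H:%M:%S")
    seconds = int(parsed.replace(tzinfo=timezone.utc).timestamp())
    return seconds * 1_000_000


def replayable(mutations, point_in_time):
    """Keep only mutations whose timestamp is not higher than the cut-off."""
    limit = to_micros(point_in_time)
    return [m for m in mutations if m[0] <= limit]
```

If the application writes timestamps in a different unit, the comparison silently goes wrong, which is why the precision setting has to match what clients actually use.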
Cassandra Parameter
-Dcassandra.replayList=
CSV white list of keyspace.table to
replay.
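
To make the white list format concrete, here is a sketch that parses a CSV of `keyspace.table` entries and checks a table against it. The function names and the example keyspace and table names are hypothetical, and treating an empty list as "replay everything" is an assumption of this sketch.

```python
def parse_replay_list(value):
    """Split a CSV of keyspace.table names into a set of (keyspace, table) pairs."""
    pairs = set()
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        keyspace, _, table = item.partition(".")
        pairs.add((keyspace, table))
    return pairs


def should_replay(keyspace, table, replay_list):
    """Return True if the table is on the white list (an empty list allows all)."""
    return not replay_list or (keyspace, table) in replay_list


allowed = parse_replay_list("shop.orders, shop.customers")
print(should_replay("shop", "orders", allowed))  # True
```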
Why Backup
Commit Log Archiving
Table Snap
Table Snap
Table Snap
Continually Backup and
Restore SSTables to S3.
tablesnap
Watch for files closed or
moved into the data
directories.
tablesnap
Upload all SSTable
components, splitting large
files, using multiple threads.
tablesnap
Includes a list of SSTables in
the directory.
tablesnap
Skips file if it was removed by
compaction during processing.
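
The behaviour described on these slides can be sketched without any S3 or file-watching library. This is not tablesnap's code: `backup_sstable` and the `upload` callable are hypothetical, and chunking large files and threading are left out.

```python
import os


def backup_sstable(path, upload):
    """Upload one file together with a listing of its directory.

    Returns False when the file disappeared before it could be opened,
    for example because compaction removed it.
    """
    directory = os.path.dirname(path)
    # The listing records which files were in the directory at backup time.
    listing = sorted(os.listdir(directory))
    try:
        handle = open(path, "rb")
    except FileNotFoundError:
        return False
    with handle:
        upload(path, handle, listing)
    return True
```

Storing the directory listing alongside each upload is what lets a later restore work out which files made up the table at that moment.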
tablechop
Deletes old files from the
backup set to implement a
rolling window.
tablechop
Specify how many days to
keep.
tablechop
Use --debug to reduce the
stress.
(AKA Dry Run, does not delete the files.)
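
A rolling window with a dry-run switch might look like the sketch below. The `chop` function, its arguments, and the `delete` callable are hypothetical. It is deliberately simplified: it looks only at upload age and ignores whether a newer backup set still refers to an older file, which a real tool has to consider.

```python
from datetime import datetime, timedelta


def chop(backups, days_to_keep, now, delete, dry_run=False):
    """Delete backup files older than the window and return their keys.

    backups maps a file key to the datetime it was uploaded.
    With dry_run=True nothing is deleted; the keys are only reported.
    """
    cutoff = now - timedelta(days=days_to_keep)
    expired = sorted(key for key, uploaded in backups.items() if uploaded < cutoff)
    if not dry_run:
        for key in expired:
            delete(key)
    return expired
```

Running once with `dry_run=True` and reading the returned keys is the cautious way to check the window before anything is removed.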
tableslurp
Slurp SSTables from S3 to a
local directory for restoring.
tableslurp
Restores the latest backup set,
or a named backup set.
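
Choosing between the latest and a named backup set comes down to a small selection step, sketched here. The data layout (a dict of set name to creation time and file list) and the function name are assumptions made for illustration, not tableslurp's actual structures.

```python
def choose_backup_set(backup_sets, name=None):
    """Return (set_name, files) for the named set, or the newest set if name is None.

    backup_sets maps a set name to a (created, files) tuple.
    """
    if name is not None:
        return name, backup_sets[name][1]
    latest = max(backup_sets, key=lambda n: backup_sets[n][0])
    return latest, backup_sets[latest][1]
```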
Table Snap Pros
Simple.
Table Snap Cons
No monitoring.
Manual restore into cluster.
No support for topology
change.
Thanks.
Aaron Morton
@aaronmorton
Co-Founder & Principal Consultant
www.thelastpickle.com

Cassandra South Bay Meetup - Backup And Restore For Apache Cassandra


Discussion with Datos.io on how to backup and restore Apache Cassandra.

Published in: Technology

Cassandra South Bay Meetup - Backup And Restore For Apache Cassandra

  1. SOUTH BAY CASSANDRA USERS MARCH 2016 BACKUP AND RESTORE FOR APACHE CASSANDRA Aaron Morton @aaronmorton CEO Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
  2. About The Last Pickle. Work with clients to deliver and improve Apache Cassandra based solutions. Apache Cassandra Committer and DataStax MVPs. Based in New Zealand, Australia, France & USA.
  3. Why Backup Commit Log Archiving Table Snap
  4. Why Backup? Replication is for Availability.
  5. Why Backup? Replicate good data as fast as bad data.
  6. Three Reasons To Backup… Business Continuity Planning / Disaster Recovery Planning (AKA Data Centre is on fire.)
  7. Three Reasons To Backup… Environment Cloning (AKA Let’s make a new Data Centre.)
  8. Three Reasons To Backup… Point In Time Recovery (AKA Bad deploy.)
  9. Why Backup Commit Log Archiving Table Snap
  10. Commit Log Writes are first written to the Commit Log (on each node).
  11. Commit Log Commit Log can grow up to 8GB in size.
  12. Commit Log Commit Log is made up of 32 MB Segments.
  13. Commit Log Commit Log contains Mutations, which have row fragments.
  14. Commit Log Mutations are serialised in the form they are sent over the wire.
  15. Commit Log Archiving Archive Segment when full. Restore Segments at startup (if specified).
  16. commitlog_archiving.properties archive_command= Run this command when a Segment is full.
  17. commitlog_archiving.properties restore_directories= Read all files in this CSV list of directories at startup and run restore_command for each.
  18. commitlog_archiving.properties restore_point_in_time= Stop processing mutations with a timestamp higher than this.
  19. commitlog_archiving.properties precision=MICROSECONDS Precision used for timestamps.
  20. Cassandra Parameter -Dcassandra.replayList= CSV white list of keyspace.table to replay.
  21. Why Backup Commit Log Archiving Table Snap
  22. Table Snap
  23. Table Snap Continually Backup and Restore SSTables to S3.
  24. tablesnap Watch for files closed or moved into the data directories.
  25. tablesnap Upload all SSTable components, splitting large files, using multiple threads.
  26. tablesnap Includes a list of SSTables in the directory.
  27. tablesnap Skips file if it was removed by compaction during processing.
  28. tablechop Deletes old files from the backup set to implement a rolling window.
  29. tablechop Specify how many days to keep.
  30. tablechop Use --debug to reduce the stress. (AKA Dry Run, does not delete the files.)
  31. tableslurp Slurp SSTables from S3 to a local directory for restoring.
  32. tableslurp Restores the latest backup set, or a named backup set.
  33. Table Snap Pros Simple.
  34. Table Snap Cons No monitoring. Manual restore into cluster. No support for topology change.
  35. Thanks.
  36. Aaron Morton @aaronmorton Co-Founder & Principal Consultant www.thelastpickle.com
