A solid backup strategy is a DBA's bread and butter. Cassandra's nodetool snapshot makes it easy to back up the SSTable files, but there remains the question of where to put them and how. Knewton's backup strategy uses Ansible for distributed backups and stores them in S3.
Unfortunately, it's all too easy to store backups that are essentially useless because there is no coherent restoration strategy. Restoration proved much more difficult and nuanced than taking the backups themselves. I will discuss Knewton's restoration strategy, which again leverages Ansible, but I will focus on general principles and pitfalls to avoid. In particular, supporting restores required modifying our backup strategy to generate cluster-wide metadata that is critical for a smooth automated restoration. These lessons indicate that a restore-focused backup design leads to faster and more deterministic recovery.
About the Speaker
Joshua Wickman, Database Engineer, Knewton
Dr. Joshua Wickman is currently part of the database team at Knewton, a NYC tech company focused on adaptive learning. He earned his PhD at the University of Delaware in 2012, where he studied particle physics models of the early universe. After a brief stint teaching college physics, he entered the New York tech industry in 2014 working with NoSQL, first with MongoDB and then Cassandra. He was certified in Cassandra at his first Cassandra Summit in 2015.