S3 & Glacier
Amazon S3 & Glacier - the only backup solution you’ll ever need
Matthew Boeckman, VP - DevOps, Craftsy @matthewboeckman
http://enginerds.craftsy.com http://devops.com/author/matthewboeckman
Who is Craftsy
● Instructor led training videos for passionate hobbyists
● #19 on Forbes’ Most Promising Companies 2014
Why do backups matter?
… they don’t, until they do.
At small scale, backup can be, and is, as easy as iCloud, Google Drive, or GitHub.
As you grow, the situation gets more complex.
In a content creation business,
backups are vital.
RESTORE
Define your exposure
● daily backups - protect against accidental deletion (users) and system corruption/crash
● DR - your building catches on fire. an employee goes rogue. the SAN suffers a total existence failure.
● permanent archive - data at rest, legal/business permanence
what are my options
LTO - hope you like lifecycle hardware refreshes! hope your data rests on a 1.5 or 3TB boundary. hope you enjoy swapping tapes… 4-17 year durability
~$0.008/GB
** does not include operational costs **
DISK - good luck offsiting, hope you enjoy hardware maintenance, hope you’ve got a big checkbook
~$5/GB
** does not include operational costs **
hobo as a service
● not scalable
● expensive
● not addressable in code
● requires software systems
welcome to system administrationville, population: you
why did we look for alternatives?
What are the alternatives?
s3
why s3?
● $0.03/GB
● 99.999999999% durable
● easy interface with glacier
● addressable with code
“If you store 10,000 objects with us, on average we
may lose one of them every 10 million years or so.
This storage is designed in such a way that we can
sustain the concurrent loss of data in two separate
storage facilities.”
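The quoted figure follows from the durability number on the slide. A minimal sketch of the arithmetic, assuming 99.999999999% is an annual per-object durability and that losses are independent (both are assumptions of this sketch, and the function names are made up for illustration):

```python
from fractions import Fraction

# Durability figure from the slide: 99.999999999% (eleven nines).
# Assumption: this is the chance a single object survives one year.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_losses_per_year(num_objects: int) -> Fraction:
    """Expected number of objects lost per year, treating losses as independent."""
    return num_objects * (1 - DURABILITY)


def years_per_lost_object(num_objects: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_losses_per_year(num_objects)


if __name__ == "__main__":
    # 10,000 objects, as in the quote above.
    print(years_per_lost_object(10_000))
```

Exact fractions are used instead of floats so that eleven nines do not get rounded away.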
except...
this is a busy space
free tools, one thread, small files
http://s3tools.org/s3cmd
● great sync support
● supports multipart
● Does not support multithread
● tons of other features (du, cp, rm, setpolicy, cloudfront…)
https://github.com/boto
● the aws python library
● includes basic cli s3put, great reference point
great, but not fast
free tools, multipass, big files
https://github.com/mumrah/s3-multipart
● zomg multithreaded, multipart up and down
● saturated 100M pipe for a single file transfer
● does not handle small files
https://github.com/aws/aws-cli
● zomg multithreaded, multipart up and down
● saturated 500M pipe for a single file transfer
● perfect in every imaginable way
problems
approach: single archive (tar)
le good
● fast transfer
● simple
● no filename pain
le bad
● restore all
● can’t nav your backup
● processing overhead
approach: lots o files, some big some small
le good
● restore & navigation granularity
le bad
● users name things...
our first backup script ... testy.sh
*shitty speed, restore costs, let’s just get to the next slide
to mp or not to mp...
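# file size in bytes (BSD/macOS stat syntax; GNU stat would be: stat -c %s)
size=$(stat -f %z "$x")
if [ "$size" -gt 100000000 ]; then
  # big file (over ~100 MB): multipart upload, 10 processes, 100 MB parts
  s3-mp-upload.py ${S3MPT_OW} -np 10 -s 100 "$x" \
    "s3://${BUCKET}/$CID/${x}" >> "$LOG" 2>&1
else
  # small file: plain single-request upload with boto's s3put
  s3put ${S3PUT_OW} -s "$AWS_ACCESS_SECRET" -a "$AWS_ACCESS_KEY" \
    -b "$BUCKET" -p "$CPATH" "$x" >> "$LOG" 2>&1
fi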
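What "multipart" buys you is that a big file becomes a list of independent byte ranges that separate workers can upload in parallel. A rough sketch of that split, assuming 100 MB parts (which is how the -s 100 above is read here) and decimal megabytes; plan_parts is a hypothetical helper, not part of s3-multipart or boto:

```python
from typing import List, Tuple

MB = 1_000_000  # assumption: decimal megabytes, matching the 100000000-byte threshold


def plan_parts(file_size: int, part_size: int = 100 * MB) -> List[Tuple[int, int, int]]:
    """Return (part_number, offset, length) tuples covering the whole file.

    Part numbers start at 1, as they do in S3 multipart uploads.
    """
    if file_size < 0 or part_size <= 0:
        raise ValueError("file_size must be >= 0 and part_size must be > 0")
    parts = []
    offset = 0
    part_number = 1
    while offset < file_size:
        # The last part is simply whatever is left over.
        length = min(part_size, file_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts


if __name__ == "__main__":
    for part in plan_parts(250 * MB):
        print(part)
```

Each tuple can be handed to a different process or thread, which is why the multipart tools can fill a pipe with a single file while single-stream tools cannot.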
aws s3 cli
aws s3 sync "$CID" "s3://${BUCKET}/${CID}" --delete
aws s3 cp "$SRC" "s3://${BUCKET}/${CID}" --recursive
simple, multipart, multithreaded and saturates 500 Mbps both ways
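To get a feel for what a saturated pipe means for a backup window, here is a back-of-the-envelope sketch. It assumes the link really is full the whole time, uses decimal units, and the 2 TB batch size is only an illustrative number:

```python
TB = 10**12    # bytes, decimal
MBPS = 10**6   # bits per second


def transfer_hours(size_bytes: float, link_bits_per_second: float) -> float:
    """Hours needed to move size_bytes over a fully saturated link."""
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    seconds = size_bytes * 8 / link_bits_per_second
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical 2 TB batch over a 500 Mbps link.
    print(round(transfer_hours(2 * TB, 500 * MBPS), 1), "hours")
```

Real transfers will take longer once request overhead and lots of small files get involved.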
hooray! now what
why glacier?
● $0.01/GB
● durable like s3
● data sent to glacier via s3 policies cannot be
accidentally deleted
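The price gap is easier to see with numbers plugged in. A small sketch using the two prices from the slides, assuming they are per GB per month (the slides only say /GB) and ignoring request and restore fees:

```python
from decimal import Decimal

# Prices as quoted on the slides; assumed here to be per GB per month.
S3_PRICE_PER_GB = Decimal("0.03")
GLACIER_PRICE_PER_GB = Decimal("0.01")


def storage_cost(gigabytes: int, price_per_gb: Decimal) -> Decimal:
    """Storage cost for the given number of gigabytes at a flat per-GB price."""
    return gigabytes * price_per_gb


if __name__ == "__main__":
    gb = 1000  # illustrative amount, roughly 1 TB
    print("s3:     ", storage_cost(gb, S3_PRICE_PER_GB))
    print("glacier:", storage_cost(gb, GLACIER_PRICE_PER_GB))
```

Decimal keeps the cents exact, which matters once the gigabyte counts get large.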
s3->glacier
prefix, delete
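The same prefix-and-delete policy can be written down as data instead of clicked together in the console. A minimal sketch that builds a lifecycle configuration in the shape boto3's put_bucket_lifecycle_configuration accepts; the prefix and the day counts are made-up values, and nothing here talks to AWS:

```python
import json


def glacier_lifecycle_rule(prefix: str, archive_after_days: int, delete_after_days: int) -> dict:
    """Build a lifecycle config: move objects under prefix to Glacier, then expire them."""
    if delete_after_days <= archive_after_days:
        raise ValueError("objects must be archived before they are deleted")
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/') or 'all'}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": delete_after_days},
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical policy: archive after 30 days, delete after 10 years.
    print(json.dumps(glacier_lifecycle_rule("finished/", 30, 3650), indent=2))
```

With boto3 installed and credentials configured, the returned dict could be passed as the LifecycleConfiguration argument; that call is left out so the sketch runs offline.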
restore from glacier
restore, our UI
restore from glacier as code
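// request a restore of each archived object for $days days
foreach ($objects as $flerbs) {
    $key = $flerbs->Key;
    $restore = $s3->restore_archived_object($bucket, $key, $days);
    //var_dump($restore);
    $http_code = $restore->status;    // HTTP status of the restore request
    $xml_resp = $restore->body;       // parsed XML response body
    $message = $xml_resp->Message;    // error text, if any
    $header = $restore->header;       // response headers, kept for logging
    $req_id = $header['x-amz-request-id'];
    $amz_id2 = $header['x-amz-id-2'];
    $req_date = $header['date'];
}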
oooh, a progress gui
restore status as code
$header = $s3->get_object_headers($bucket, $key)->header;
// x-amz-restore is absent until a restore has been requested
$restore_hdr = isset($header['x-amz-restore']) ? $header['x-amz-restore'] : '';
$restore = explode(',', $restore_hdr);
// first field reads ongoing-request="true" while the restore is still running
$progress = strpos($restore[0], 'true');
…
if ($sclass == 'GLACIER') {
    if ($restore_hdr == '') {
        $style = '"background-color: #0ce3ff;"'; // not restored!
        $note = '';
    } elseif ($progress !== false) {
        $style = '"background-color: #ffa391;"'; // restore in progress!
        $note = 'Restoration in progress';
    } else {
        $style = '"background-color: #95fed0;"'; // restored!
        $note = "Restored until $expirydate";
    }
}
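The three states in that check can be pulled out into one small function. A sketch in Python, assuming the x-amz-restore header looks like ongoing-request="false", expiry-date="..." once a restore has finished; restore_state and its return strings are illustrative names, not part of any SDK. The expiry date itself contains a comma, so this version uses regular expressions rather than splitting on commas:

```python
import re
from typing import Optional


def restore_state(storage_class: str, x_amz_restore: Optional[str]) -> str:
    """Classify an object as not archived, not restored, in progress, or restored."""
    if storage_class != "GLACIER":
        return "not archived"
    if not x_amz_restore:
        # Header missing or empty: nobody has asked for a restore yet.
        return "not restored"
    ongoing = re.search(r'ongoing-request="(true|false)"', x_amz_restore)
    if ongoing is None:
        raise ValueError(f"unrecognised x-amz-restore value: {x_amz_restore!r}")
    if ongoing.group(1) == "true":
        return "restoration in progress"
    expiry = re.search(r'expiry-date="([^"]+)"', x_amz_restore)
    until = expiry.group(1) if expiry else "unknown"
    return f"restored until {until}"


if __name__ == "__main__":
    # Sample header value made up for illustration.
    sample = 'ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"'
    print(restore_state("GLACIER", sample))
```

Keeping the classification separate from the colours makes it easy to reuse the same logic in a UI, a cron job, or a test.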
versioning, another approach
● versioning operates on the bucket level
● each version of an object counts as storage
● versioning happens automatically with any PUT
● deletion of an object does not explicitly delete versions (!!)
GET /my-image.jpg?versionId=L4kqtJlcpXroDTDmpUMLUo
GET /my-image.jpg?versions HTTP/1.1
DELETE /my-image.jpg?versionId=UIORUnfnd89493jJFJ
restoring a version
Restore to the previous version?
1. delete the current version!
Restore to an older version?
1. GET the desired version
2. PUT it back on top of the object you’re restoring
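Both restore paths are easier to follow against a toy model of one versioned key. This is an in-memory sketch only: VersionedObject and its v1, v2 style ids are invented for illustration, and it ignores delete markers and everything else a real bucket does:

```python
from typing import List, Optional, Tuple


class VersionedObject:
    """Toy stand-in for a single key in a versioned bucket."""

    def __init__(self) -> None:
        self._versions: List[Tuple[str, bytes]] = []  # oldest first, newest last
        self._next_id = 1

    def put(self, body: bytes) -> str:
        """Every PUT creates a new version and returns its id."""
        version_id = f"v{self._next_id}"
        self._next_id += 1
        self._versions.append((version_id, body))
        return version_id

    def get(self, version_id: Optional[str] = None) -> bytes:
        """Return the current version, or a specific one if version_id is given."""
        if not self._versions:
            raise KeyError("no versions stored")
        if version_id is None:
            return self._versions[-1][1]
        for vid, body in self._versions:
            if vid == version_id:
                return body
        raise KeyError(version_id)

    def delete_version(self, version_id: str) -> None:
        """Permanently remove one version; older versions stay put."""
        remaining = [(v, b) for v, b in self._versions if v != version_id]
        if len(remaining) == len(self._versions):
            raise KeyError(version_id)
        self._versions = remaining


if __name__ == "__main__":
    obj = VersionedObject()
    v1 = obj.put(b"good")
    v2 = obj.put(b"better")
    v3 = obj.put(b"oops")
    obj.delete_version(v3)   # previous version becomes current again
    print(obj.get())
    obj.put(obj.get(v1))     # GET the old version, PUT it back on top
    print(obj.get())
```

Note that restoring an older version this way adds a new version on top rather than rewinding history, so every intermediate version is still there and still counts as storage.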
where does Craftsy live?
1. Bi-weeklies of ACTIVE content to a non-versioned, non-glaciered bucket (~200TB total, +2TB/wk)
2. Weekly new finished content to glacier-backed s3 bucket (~500TB, +45TB/mo)
3. All code artifacts (dev, staging, prod) to a versioned s3 bucket
4. Postgres backup segments to a non-glaciered bucket
most important, we do all of it in code. we don’t shuffle tapes,
upgrade hardware, or system administrate!
Thank you
QUESTIONS!
Matthew Boeckman
@matthewboeckman
http://enginerds.craftsy.com
(deck will be there & slideshare)
http://devops.com/author/matthewboeckman

More Related Content

What's hot

Boosting MongoDB performance
Boosting MongoDB performanceBoosting MongoDB performance
Boosting MongoDB performance
Alexei Panin
 
CHI - YAPC NA 2012
CHI - YAPC NA 2012CHI - YAPC NA 2012
CHI - YAPC NA 2012
jonswar
 
CHI-YAPC-2009
CHI-YAPC-2009CHI-YAPC-2009
CHI-YAPC-2009
jonswar
 
Day 2-some fun coding
Day 2-some fun codingDay 2-some fun coding
Day 2-some fun coding
Open Knowledge Nepal
 
Mongodb
MongodbMongodb
Mongodb
Scott Motte
 
Mysqlnd uh
Mysqlnd uhMysqlnd uh
Mysqlnd uh
natmchugh
 
Microsoft SQL Server with Linux and Docker
Microsoft SQL Server with Linux and DockerMicrosoft SQL Server with Linux and Docker
Microsoft SQL Server with Linux and Docker
roskakori
 
Drupal 8 Theme System: The Backend of Frontend
Drupal 8 Theme System: The Backend of FrontendDrupal 8 Theme System: The Backend of Frontend
Drupal 8 Theme System: The Backend of Frontend
Acquia
 
MongoDB: How it Works
MongoDB: How it WorksMongoDB: How it Works
MongoDB: How it Works
Mike Dirolf
 
Persistence patterns for containers
Persistence patterns for containersPersistence patterns for containers
Persistence patterns for containers
Stephen Watt
 
Cookies
CookiesCookies
Cookies
amuthadeepa
 
Configuration surgery with Augeas (OggCamp 12)
Configuration surgery with Augeas (OggCamp 12)Configuration surgery with Augeas (OggCamp 12)
Configuration surgery with Augeas (OggCamp 12)
Dominic Cleal
 
Memory Management in WordPress
Memory Management in WordPressMemory Management in WordPress
Memory Management in WordPress
Konstantin Kovshenin
 
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
IndicThreads
 
Responsive Design with WordPress
Responsive Design with WordPressResponsive Design with WordPress
Responsive Design with WordPress
Joe Casabona
 
Tutorial Puppet
Tutorial PuppetTutorial Puppet
Tutorial Puppet
Daniel Sobral
 
Responsive Design with WordPress (WCPHX)
Responsive Design with WordPress (WCPHX)Responsive Design with WordPress (WCPHX)
Responsive Design with WordPress (WCPHX)
Joe Casabona
 
Python-CouchDB Training at PyCon PL 2012
Python-CouchDB Training at PyCon PL 2012Python-CouchDB Training at PyCon PL 2012
Python-CouchDB Training at PyCon PL 2012
Stefan Kögl
 
Device Synchronization with Javascript and PouchDB
Device Synchronization with Javascript and PouchDBDevice Synchronization with Javascript and PouchDB
Device Synchronization with Javascript and PouchDB
Frank Rousseau
 

What's hot (19)

Boosting MongoDB performance
Boosting MongoDB performanceBoosting MongoDB performance
Boosting MongoDB performance
 
CHI - YAPC NA 2012
CHI - YAPC NA 2012CHI - YAPC NA 2012
CHI - YAPC NA 2012
 
CHI-YAPC-2009
CHI-YAPC-2009CHI-YAPC-2009
CHI-YAPC-2009
 
Day 2-some fun coding
Day 2-some fun codingDay 2-some fun coding
Day 2-some fun coding
 
Mongodb
MongodbMongodb
Mongodb
 
Mysqlnd uh
Mysqlnd uhMysqlnd uh
Mysqlnd uh
 
Microsoft SQL Server with Linux and Docker
Microsoft SQL Server with Linux and DockerMicrosoft SQL Server with Linux and Docker
Microsoft SQL Server with Linux and Docker
 
Drupal 8 Theme System: The Backend of Frontend
Drupal 8 Theme System: The Backend of FrontendDrupal 8 Theme System: The Backend of Frontend
Drupal 8 Theme System: The Backend of Frontend
 
MongoDB: How it Works
MongoDB: How it WorksMongoDB: How it Works
MongoDB: How it Works
 
Persistence patterns for containers
Persistence patterns for containersPersistence patterns for containers
Persistence patterns for containers
 
Cookies
CookiesCookies
Cookies
 
Configuration surgery with Augeas (OggCamp 12)
Configuration surgery with Augeas (OggCamp 12)Configuration surgery with Augeas (OggCamp 12)
Configuration surgery with Augeas (OggCamp 12)
 
Memory Management in WordPress
Memory Management in WordPressMemory Management in WordPress
Memory Management in WordPress
 
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
 
Responsive Design with WordPress
Responsive Design with WordPressResponsive Design with WordPress
Responsive Design with WordPress
 
Tutorial Puppet
Tutorial PuppetTutorial Puppet
Tutorial Puppet
 
Responsive Design with WordPress (WCPHX)
Responsive Design with WordPress (WCPHX)Responsive Design with WordPress (WCPHX)
Responsive Design with WordPress (WCPHX)
 
Python-CouchDB Training at PyCon PL 2012
Python-CouchDB Training at PyCon PL 2012Python-CouchDB Training at PyCon PL 2012
Python-CouchDB Training at PyCon PL 2012
 
Device Synchronization with Javascript and PouchDB
Device Synchronization with Javascript and PouchDBDevice Synchronization with Javascript and PouchDB
Device Synchronization with Javascript and PouchDB
 

Viewers also liked

JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門
JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門
JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門
Kazuki Ueki
 
S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...
S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...
S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...
Scality
 
(STG203) Simplified Storage Management & Backup Using S3 & Glacier
(STG203) Simplified Storage Management & Backup Using S3 & Glacier(STG203) Simplified Storage Management & Backup Using S3 & Glacier
(STG203) Simplified Storage Management & Backup Using S3 & Glacier
Amazon Web Services
 
AWS re:Invent 2016 - Scality's Open Source AWS S3 Server
AWS re:Invent 2016 - Scality's Open Source AWS S3 ServerAWS re:Invent 2016 - Scality's Open Source AWS S3 Server
AWS re:Invent 2016 - Scality's Open Source AWS S3 Server
Scality
 
IBM Cloud Storage - Cleversafe
IBM Cloud Storage - CleversafeIBM Cloud Storage - Cleversafe
IBM Cloud Storage - Cleversafe
Michael Beatty
 
Black Belt Online Seminar AWS Amazon S3
Black Belt Online Seminar AWS Amazon S3Black Belt Online Seminar AWS Amazon S3
Black Belt Online Seminar AWS Amazon S3
Amazon Web Services Japan
 

Viewers also liked (6)

JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門
JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門
JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門
 
S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...
S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...
S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...
 
(STG203) Simplified Storage Management & Backup Using S3 & Glacier
(STG203) Simplified Storage Management & Backup Using S3 & Glacier(STG203) Simplified Storage Management & Backup Using S3 & Glacier
(STG203) Simplified Storage Management & Backup Using S3 & Glacier
 
AWS re:Invent 2016 - Scality's Open Source AWS S3 Server
AWS re:Invent 2016 - Scality's Open Source AWS S3 ServerAWS re:Invent 2016 - Scality's Open Source AWS S3 Server
AWS re:Invent 2016 - Scality's Open Source AWS S3 Server
 
IBM Cloud Storage - Cleversafe
IBM Cloud Storage - CleversafeIBM Cloud Storage - Cleversafe
IBM Cloud Storage - Cleversafe
 
Black Belt Online Seminar AWS Amazon S3
Black Belt Online Seminar AWS Amazon S3Black Belt Online Seminar AWS Amazon S3
Black Belt Online Seminar AWS Amazon S3
 

Similar to S3 & Glacier - The only backup solution you'll ever need

Think_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptx
Think_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptxThink_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptx
Think_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptx
Payal Singh
 
Challenges when building high profile editorial sites
Challenges when building high profile editorial sitesChallenges when building high profile editorial sites
Challenges when building high profile editorial sites
Yann Malet
 
PHP CLI: A Cinderella Story
PHP CLI: A Cinderella StoryPHP CLI: A Cinderella Story
PHP CLI: A Cinderella Story
Mike Lively
 
Scaling PHP apps
Scaling PHP appsScaling PHP apps
Scaling PHP apps
Matteo Moretti
 
Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2
PgTraining
 
Pure Speed Drupal 4 Gov talk
Pure Speed Drupal 4 Gov talkPure Speed Drupal 4 Gov talk
Pure Speed Drupal 4 Gov talk
Bryan Ollendyke
 
Embulk, an open-source plugin-based parallel bulk data loader
Embulk, an open-source plugin-based parallel bulk data loaderEmbulk, an open-source plugin-based parallel bulk data loader
Embulk, an open-source plugin-based parallel bulk data loader
Sadayuki Furuhashi
 
Mojolicious
MojoliciousMojolicious
Mojolicious
Marcos Rebelo
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System Administrators
Command Prompt., Inc
 
Backup, Restore, and Disaster Recovery
Backup, Restore, and Disaster RecoveryBackup, Restore, and Disaster Recovery
Backup, Restore, and Disaster Recovery
MongoDB
 
Introduction to docker
Introduction to dockerIntroduction to docker
Introduction to docker
Justyna Ilczuk
 
Using filesystem capabilities with rsync
Using filesystem capabilities with rsyncUsing filesystem capabilities with rsync
Using filesystem capabilities with rsync
Hazel Smith
 
Source Code Management systems
Source Code Management systemsSource Code Management systems
Source Code Management systems
xSawyer
 
Malcon2017
Malcon2017Malcon2017
Gdb basics for my sql db as (percona live europe 2019)
Gdb basics for my sql db as (percona live europe 2019)Gdb basics for my sql db as (percona live europe 2019)
Gdb basics for my sql db as (percona live europe 2019)
Valerii Kravchuk
 
10 Problems with your RMAN backup script
10 Problems with your RMAN backup script10 Problems with your RMAN backup script
10 Problems with your RMAN backup script
Yury Velikanov
 
Drupal Deployment Troubles and Problems
Drupal Deployment Troubles and ProblemsDrupal Deployment Troubles and Problems
Drupal Deployment Troubles and Problems
Andrii Lundiak
 
Mobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Mobile & Desktop Cache 2.0: How To Create A Scriptable CacheMobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Mobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Blaze Software Inc.
 
03 pig intro
03 pig intro03 pig intro
03 pig intro
Subhas Kumar Ghosh
 
What's New and Newer in Apache httpd-24
What's New and Newer in Apache httpd-24What's New and Newer in Apache httpd-24
What's New and Newer in Apache httpd-24
Jim Jagielski
 

Similar to S3 & Glacier - The only backup solution you'll ever need (20)

Think_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptx
Think_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptxThink_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptx
Think_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptx
 
Challenges when building high profile editorial sites
Challenges when building high profile editorial sitesChallenges when building high profile editorial sites
Challenges when building high profile editorial sites
 
PHP CLI: A Cinderella Story
PHP CLI: A Cinderella StoryPHP CLI: A Cinderella Story
PHP CLI: A Cinderella Story
 
Scaling PHP apps
Scaling PHP appsScaling PHP apps
Scaling PHP apps
 
Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2
 
Pure Speed Drupal 4 Gov talk
Pure Speed Drupal 4 Gov talkPure Speed Drupal 4 Gov talk
Pure Speed Drupal 4 Gov talk
 
Embulk, an open-source plugin-based parallel bulk data loader
Embulk, an open-source plugin-based parallel bulk data loaderEmbulk, an open-source plugin-based parallel bulk data loader
Embulk, an open-source plugin-based parallel bulk data loader
 
Mojolicious
MojoliciousMojolicious
Mojolicious
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System Administrators
 
Backup, Restore, and Disaster Recovery
Backup, Restore, and Disaster RecoveryBackup, Restore, and Disaster Recovery
Backup, Restore, and Disaster Recovery
 
Introduction to docker
Introduction to dockerIntroduction to docker
Introduction to docker
 
Using filesystem capabilities with rsync
Using filesystem capabilities with rsyncUsing filesystem capabilities with rsync
Using filesystem capabilities with rsync
 
Source Code Management systems
Source Code Management systemsSource Code Management systems
Source Code Management systems
 
Malcon2017
Malcon2017Malcon2017
Malcon2017
 
Gdb basics for my sql db as (percona live europe 2019)
Gdb basics for my sql db as (percona live europe 2019)Gdb basics for my sql db as (percona live europe 2019)
Gdb basics for my sql db as (percona live europe 2019)
 
10 Problems with your RMAN backup script
10 Problems with your RMAN backup script10 Problems with your RMAN backup script
10 Problems with your RMAN backup script
 
Drupal Deployment Troubles and Problems
Drupal Deployment Troubles and ProblemsDrupal Deployment Troubles and Problems
Drupal Deployment Troubles and Problems
 
Mobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Mobile & Desktop Cache 2.0: How To Create A Scriptable CacheMobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Mobile & Desktop Cache 2.0: How To Create A Scriptable Cache
 
03 pig intro
03 pig intro03 pig intro
03 pig intro
 
What's New and Newer in Apache httpd-24
What's New and Newer in Apache httpd-24What's New and Newer in Apache httpd-24
What's New and Newer in Apache httpd-24
 

More from Matthew Boeckman

Useful flakes - The Value of Common Tools
Useful flakes - The Value of Common ToolsUseful flakes - The Value of Common Tools
Useful flakes - The Value of Common Tools
Matthew Boeckman
 
All Day DevOps 2017 - There is No Root Cause
All Day DevOps 2017 - There is No Root CauseAll Day DevOps 2017 - There is No Root Cause
All Day DevOps 2017 - There is No Root Cause
Matthew Boeckman
 
Rewriting DevOps - Lessons from a 15 month software rewrite
Rewriting DevOps - Lessons from a 15 month software rewriteRewriting DevOps - Lessons from a 15 month software rewrite
Rewriting DevOps - Lessons from a 15 month software rewrite
Matthew Boeckman
 
Top 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsTop 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management Teams
Matthew Boeckman
 
Many hands make light work
Many hands make light workMany hands make light work
Many hands make light work
Matthew Boeckman
 
Sandstorm or Significant? The evolving role of situational context in inciden...
Sandstorm or Significant? The evolving role of situational context in inciden...Sandstorm or Significant? The evolving role of situational context in inciden...
Sandstorm or Significant? The evolving role of situational context in inciden...
Matthew Boeckman
 
Rewriting DevOps
Rewriting DevOpsRewriting DevOps
Rewriting DevOps
Matthew Boeckman
 
Go Rin no Show - DevOpsDays Rockies
Go Rin no Show - DevOpsDays RockiesGo Rin no Show - DevOpsDays Rockies
Go Rin no Show - DevOpsDays Rockies
Matthew Boeckman
 
The promise of NoOps
The promise of NoOpsThe promise of NoOps
The promise of NoOps
Matthew Boeckman
 
Ops, DevOps, NoOps and AWS Lambda
Ops, DevOps, NoOps and AWS LambdaOps, DevOps, NoOps and AWS Lambda
Ops, DevOps, NoOps and AWS Lambda
Matthew Boeckman
 
Vpc aws meetup
Vpc   aws meetupVpc   aws meetup
Vpc aws meetup
Matthew Boeckman
 

More from Matthew Boeckman (11)

Useful flakes - The Value of Common Tools
Useful flakes - The Value of Common ToolsUseful flakes - The Value of Common Tools
Useful flakes - The Value of Common Tools
 
All Day DevOps 2017 - There is No Root Cause
All Day DevOps 2017 - There is No Root CauseAll Day DevOps 2017 - There is No Root Cause
All Day DevOps 2017 - There is No Root Cause
 
Rewriting DevOps - Lessons from a 15 month software rewrite
Rewriting DevOps - Lessons from a 15 month software rewriteRewriting DevOps - Lessons from a 15 month software rewrite
Rewriting DevOps - Lessons from a 15 month software rewrite
 
Top 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsTop 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management Teams
 
Many hands make light work
Many hands make light workMany hands make light work
Many hands make light work
 
Sandstorm or Significant? The evolving role of situational context in inciden...
Sandstorm or Significant? The evolving role of situational context in inciden...Sandstorm or Significant? The evolving role of situational context in inciden...
Sandstorm or Significant? The evolving role of situational context in inciden...
 
Rewriting DevOps
Rewriting DevOpsRewriting DevOps
Rewriting DevOps
 
Go Rin no Show - DevOpsDays Rockies
Go Rin no Show - DevOpsDays RockiesGo Rin no Show - DevOpsDays Rockies
Go Rin no Show - DevOpsDays Rockies
 
The promise of NoOps
The promise of NoOpsThe promise of NoOps
The promise of NoOps
 
Ops, DevOps, NoOps and AWS Lambda
Ops, DevOps, NoOps and AWS LambdaOps, DevOps, NoOps and AWS Lambda
Ops, DevOps, NoOps and AWS Lambda
 
Vpc aws meetup
Vpc   aws meetupVpc   aws meetup
Vpc aws meetup
 

Recently uploaded

TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Zilliz
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 

Recently uploaded (20)

TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 

S3 & Glacier - The only backup solution you'll ever need

  • 1. S3 & Glacier Amazon S3 & Glacier - the only backup solution you’ll ever need Matthew Boeckman, VP - DevOps, Craftsy @matthewboeckman http://enginerds.craftsy.com http://devops.com/author/matthewboeckman
  • 2. Who is Craftsy ● Instructor led training videos for passionate hobbyists ● #19 on Forbes’ Most Promising Companies 2014
  • 3. Why do backups matter? … they don’t, until they do. At small scale, backup can be (and is) as easy as iCloud, Google Drive, or GitHub. As you grow, the situation gets more complex. In a content creation business, backups are vital.
  • 5. Define your exposure
       ● daily backups - protect against accidental deletion (users) and system corruption/crash
       ● DR - your building catches on fire. an employee goes rogue. the SAN suffers a total existence failure.
       ● permanent archive - data at rest, legal/business permanence
  • 6. what are my options
       ● LTO - hope you like lifecycle hardware refreshes! hope your data rests on a 1.5 or 3 TB boundary. hope you enjoy swapping tapes… 4-17 year durability. ~$0.008/GB ** does not include operational costs **
       ● DISK - good luck offsiting, hope you enjoy hardware maintenance, hope you’ve got a big checkbook. ~$5/GB ** does not include operational costs **
  • 7. hobo as a service ● not scalable ● expensive ● not addressable in code ● requires software systems. welcome to system administrationville, population: you
  • 8. why did we look for alternatives?
  • 9. What are the alternatives? s3
  • 10. why s3?
        ● $0.03/GB per month
        ● 99.999999999% durable
        ● easy interface with glacier
        ● addressable with code
        “If you store 10,000 objects with us, on average we may lose one of them every 10 million years or so. This storage is designed in such a way that we can sustain the concurrent loss of data in two separate storage facilities.”
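The "one object every 10 million years" line in the quote follows directly from the eleven-nines figure. A minimal sketch of that arithmetic, assuming the durability figure is an annual, per-object number and that losses are independent (the function names are illustrative, not part of any AWS tooling):

```python
# Durability figure from the slide, read as an annual per-object probability.
ANNUAL_DURABILITY = 0.99999999999


def expected_annual_losses(object_count: int,
                           durability: float = ANNUAL_DURABILITY) -> float:
    """Expected number of objects lost in one year, assuming independent losses."""
    return object_count * (1.0 - durability)


def years_per_lost_object(object_count: int,
                          durability: float = ANNUAL_DURABILITY) -> float:
    """Average number of years between single-object losses."""
    return 1.0 / expected_annual_losses(object_count, durability)


if __name__ == "__main__":
    # 10,000 objects, as in the quote: about ten million years per lost object.
    print(f"{years_per_lost_object(10_000):.3g} years")
```

The same function shows how the picture changes with scale: at a billion objects the expected interval drops to roughly a century, which is still a far cry from tape.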
  • 12. this is a busy space
  • 13. free tools, one thread, small files
        http://s3tools.org/s3cmd
        ● great sync support
        ● supports multipart
        ● Does not support multithread
        ● tons of other features (du, cp, rm, setpolicy, cloudfront…)
        https://github.com/boto
        ● the aws python library
        ● includes basic cli s3put, great reference point
        great, but not fast
  • 14. free tools, multipass, big files
        https://github.com/mumrah/s3-multipart
        ● zomg multithreaded, multipart up and down
        ● saturated 100M pipe for a single file transfer
        ● does not handle small files
        https://github.com/aws/aws-cli
        ● zomg multithreaded, multipart up and down
        ● saturated 500M pipe for a single file transfer
        ● perfect in every imaginable way
  • 15. problems approach: single archive (tar) / lots o files, some big some small
        le good ● fast transfer ● simple ● no filename pain ● restore & navigation granularity
        le bad ● restore all ● can’t nav your backup ● processing overhead ● users name things...
  • 16. our first backup script ... testy.sh *shitty speed, restore costs, let’s just get to the next slide
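  • 17. to mp or not to mp...
        # BSD/macOS stat syntax: file size in bytes
        size=$(stat -f %z "$x")
        if [ "$size" -gt "100000000" ]; then
            # big file: multipart upload, 10 processes, 100 MB chunks
            s3-mp-upload.py ${S3MPT_OW} -np 10 -s 100 "$x" "s3://${BUCKET}/$CID/${x}" >> "$LOG" 2>&1
        else
            # small file: plain single-request upload via boto's s3put
            s3put ${S3PUT_OW} -s "$AWS_ACCESS_SECRET" -a "$AWS_ACCESS_KEY" -b "$BUCKET" -p "$CPATH" "$x" >> "$LOG" 2>&1
        fi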
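The size test on this slide can be sketched as a small planning function. This is an illustration only, not the script Craftsy ran: `plan_upload` and its constants are hypothetical names, the 100 MiB part size assumes that `-s 100` means 100 × 1024 × 1024 bytes, and the 10,000-part ceiling is S3's documented limit for a multipart upload.

```python
import math
from typing import Tuple

MULTIPART_THRESHOLD = 100_000_000   # bytes; same cut-off as the shell test
PART_SIZE = 100 * 1024 * 1024       # assumed meaning of "-s 100" (100 MiB)
S3_MAX_PARTS = 10_000               # S3 limit on parts per multipart upload


def plan_upload(size_bytes: int) -> Tuple[str, int]:
    """Return (strategy, part_count) for a file of the given size."""
    if size_bytes <= MULTIPART_THRESHOLD:
        # Small file: one PUT, no multipart bookkeeping.
        return ("single", 1)
    parts = math.ceil(size_bytes / PART_SIZE)
    if parts > S3_MAX_PARTS:
        raise ValueError("file needs a larger part size to stay under 10,000 parts")
    return ("multipart", parts)


if __name__ == "__main__":
    for size in (4_096, 100_000_000, 1_000_000_000):
        print(size, plan_upload(size))
```

With these constants a 1 GB file is planned as 10 parts, which lines up with the ten worker processes requested by `-np 10`.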
  • 18. aws s3 cli
        aws s3 sync "$CID" "s3://${BUCKET}/${CID}" --delete
        aws s3 cp "$SRC" "s3://${BUCKET}/${CID}" --recursive
        simple, multipart, multithreaded and saturates 500mbps both ways
  • 20. why glacier?
        ● $0.01/GB per month
        ● durable like s3
        ● data sent to glacier via s3 policies cannot be accidentally deleted
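The "s3 policies" here are bucket lifecycle rules. A minimal sketch of one such rule as a Python dictionary, in the shape that boto3's `put_bucket_lifecycle_configuration` accepts; the deck predates boto3 and does not show its own policy, so the rule ID, prefix, and day count below are assumptions for illustration:

```python
def glacier_lifecycle(days: int, prefix: str = "",
                      rule_id: str = "archive-to-glacier") -> dict:
    """Build a lifecycle configuration that moves objects to Glacier.

    days    -- object age (in days) at which the transition happens
    prefix  -- key prefix the rule applies to ("" means the whole bucket)
    rule_id -- hypothetical label for the rule
    """
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


if __name__ == "__main__":
    import json
    print(json.dumps(glacier_lifecycle(days=30, prefix="finished/"), indent=2))
```

Applying it would look something like `boto3.client("s3").put_bucket_lifecycle_configuration(Bucket=name, LifecycleConfiguration=glacier_lifecycle(30))`, which needs credentials and a real bucket and is therefore left out of the runnable part.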
  • 25. restore from glacier as code
        foreach ($objects as $flerbs) {
            $key = $flerbs->Key;
            $restore = $s3->restore_archived_object($bucket, $key, $days);
            //var_dump($restore);
            $http_code = $restore->status;
            $xml_resp = $restore->body;
            $message = $xml_resp->Message;
            $header = $restore->header;
            $req_id = $header['x-amz-request-id'];
            $amz_id2 = $header['x-amz-id-2'];
            $req_date = $header['date'];
        }
  • 27. restore status as code
        $header = $s3->get_object_headers($bucket, $key)->header;
        $restore = explode(',', $header['x-amz-restore']);
        // strpos() returns false when 'true' is absent, so compare strictly below
        $progress = strpos($restore[0], 'true');
        …
        if ($sclass == 'GLACIER') {
            if ($header['x-amz-restore'] == '') {
                $style = '"background-color: #0ce3ff;"'; // not restored!
                $note = '';
            } elseif ($progress !== false) {
                $style = '"background-color: #ffa391;"'; // restore in progress!
                $note = 'Restoration in progress';
            } else {
                $style = '"background-color: #95fed0;"'; // restored!
                $note = "Restored until $expirydate";
            }
        }
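The three-way branch on this slide can be sketched as a pure function over the `x-amz-restore` header value. This is a Python illustration of the same decision, not a port of the author's page: `restore_state` and its return labels are made-up names, and the header format assumed is the documented `ongoing-request="true"` or `ongoing-request="false", expiry-date="..."`.

```python
import re
from typing import Optional, Tuple


def restore_state(storage_class: str,
                  restore_header: Optional[str]) -> Tuple[str, Optional[str]]:
    """Classify an object as available, archived, in_progress, or restored.

    Returns (state, expiry_date); expiry_date is only set for "restored".
    """
    if storage_class != "GLACIER":
        return ("available", None)       # never archived, nothing to restore
    if not restore_header:
        return ("archived", None)        # in Glacier, no restore requested
    if 'ongoing-request="true"' in restore_header:
        return ("in_progress", None)     # restore requested, not finished yet
    match = re.search(r'expiry-date="([^"]+)"', restore_header)
    return ("restored", match.group(1) if match else None)


if __name__ == "__main__":
    header = 'ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"'
    print(restore_state("GLACIER", header))
```

Matching on the whole `ongoing-request="true"` token avoids depending on where the comma split lands, since the expiry date itself contains a comma.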
  • 28. versioning, another approach
        ● versioning operates on the bucket level
        ● each version of an object counts as storage
        ● versioning happens automatically with any PUT
        ● deletion of an object does not explicitly delete versions (!!)
        GET /my-image.jpg?versionId=L4kqtJlcpXroDTDmpUMLUo
        GET /my-image.jpg?versions HTTP/1.1
        DELETE /my-image.jpg?versionId=UIORUnfnd89493jJFJ
  • 29. restoring a version
        Restore to the most recent previous version?
        1. delete the current version!
        Restore to an older version?
        1. GET the desired version
        2. PUT it back on top of the object you’re restoring
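Both restore recipes can be walked through on a toy in-memory model of a single versioned key. This sketch does not call S3 at all; `VersionedObject` and its `v1`, `v2` style IDs are hypothetical stand-ins that only mimic the behaviour described on the slide:

```python
from typing import List, Optional, Tuple


class VersionedObject:
    """Toy stand-in for one key in a versioned bucket (oldest version first)."""

    def __init__(self) -> None:
        self._versions: List[Tuple[str, bytes]] = []
        self._counter = 0

    def put(self, body: bytes) -> str:
        """Every PUT creates a new version and makes it current."""
        self._counter += 1
        version_id = f"v{self._counter}"
        self._versions.append((version_id, body))
        return version_id

    def get(self, version_id: Optional[str] = None) -> bytes:
        """GET the current version, or a specific one by ID."""
        if not self._versions:
            raise KeyError("no versions stored")
        if version_id is None:
            return self._versions[-1][1]
        for vid, body in self._versions:
            if vid == version_id:
                return body
        raise KeyError(version_id)

    def delete_version(self, version_id: str) -> None:
        """DELETE with a version ID removes exactly that version."""
        self._versions = [(v, b) for v, b in self._versions if v != version_id]


if __name__ == "__main__":
    obj = VersionedObject()
    v1 = obj.put(b"first cut")
    v2 = obj.put(b"good edit")
    v3 = obj.put(b"oops")

    obj.delete_version(v3)       # recipe 1: drop the current version
    print(obj.get())             # the previous version is current again

    obj.put(obj.get(v1))         # recipe 2: GET an old version, PUT it on top
    print(obj.get())
```

Note that recipe 2 adds a new version rather than rewinding history, so the intermediate versions stay in the bucket and keep counting as storage.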
  • 30. where does Craftsy live?
        1. Bi-weeklies of ACTIVE content to a non-versioned, non-glaciered bucket (~200TB total, +2TB/wk)
        2. Weekly new finished content to glacier-backed s3 bucket (~500TB, +45TB/mo)
        3. All code artifacts (dev, staging, prod) to a versioned s3 bucket
        4. Postgres backup segments to a non-glaciered bucket
        most important, we do all of it in code. we don’t shuffle tapes, upgrade hardware, or system administrate!
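For a sense of scale, the volumes on this slide can be multiplied by the per-GB prices quoted earlier in the deck. This is a back-of-envelope sketch only: it assumes flat list prices ($0.03 for S3, $0.01 for Glacier, per GB per month), 1 TB = 1,000 GB, and it ignores request, retrieval, and transfer charges as well as volume discounts. `monthly_cost_usd` is an illustrative helper, and the results are not figures reported by the author.

```python
GB_PER_TB = 1_000  # decimal terabytes, an assumption


def monthly_cost_usd(terabytes: float, price_per_gb_month: float) -> float:
    """Flat-rate monthly storage cost for a given volume."""
    return terabytes * GB_PER_TB * price_per_gb_month


if __name__ == "__main__":
    s3_active = monthly_cost_usd(200, 0.03)      # ~200TB of active content in S3
    glacier_done = monthly_cost_usd(500, 0.01)   # ~500TB of finished content in Glacier
    print(f"S3: ${s3_active:,.0f}/month, Glacier: ${glacier_done:,.0f}/month")
```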
  • 31. Thank you QUESTIONS! Matthew Boeckman @matthewboeckman http://enginerds.craftsy.com (deck will be there & slideshare) http://devops.com/author/matthewboeckman