© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Lee Atkinson, Solutions Architect, AWS
Jey Jeyasingam, CTO, Y-Cam
7 July 2016
Amazon S3
Deep Dive
AWS storage services
File: Amazon EFS
Block: Amazon EBS, Amazon EC2 instance store
Object: Amazon S3, Amazon Glacier
Data transfer: AWS Direct Connect, Snowball, ISV connectors, Amazon Kinesis Firehose, Transfer Acceleration, AWS Storage Gateway
Innovation for Amazon S3 (1/2)
• Cross-region replication
• Amazon CloudWatch metrics for Amazon S3 & AWS CloudTrail support
• VPC endpoint for Amazon S3
• Read-after-write consistency in all regions
• Event notifications
• Amazon S3 bucket limit increase
Innovation for Amazon S3 (2/2)
• Amazon S3 Standard-IA
• Transfer Acceleration
• Incomplete multipart upload expiration
• Expired object delete marker
Choice of storage classes on Amazon S3
Storage classes, from active data to archive data: Standard, Standard - Infrequent Access, Amazon Glacier
Some use cases have different requirements
• File sync and share / consumer file storage
• Backup and archive / disaster recovery
• Long retained data
Standard-Infrequent Access storage
Durable
• 11 9s of durability
Available
• Designed for 99.9% availability
High performance
• Same throughput as Amazon S3 Standard storage
Secure
• Server-side encryption
• Use your encryption keys
• KMS-managed encryption keys
Integrated
• Lifecycle management
• Versioning
• Event notifications
• Metrics
Easy to use
• No impact on user experience
• Simple REST API
• Single bucket
Management policies
Lifecycle policies
Automatic tiering and cost controls
Includes two possible actions:
• Transition: to Standard-IA or Glacier after
specified time
• Expiration: deletes objects after specified time
Allows for actions to be combined
Set policies at the key prefix level
Lifecycle Policy
<LifecycleConfiguration>
<Rule>
<ID>sample-rule</ID>
<Prefix>documents/</Prefix>
<Status>Enabled</Status>
<Transition>
<Days>30</Days>
<StorageClass>STANDARD_IA</StorageClass>
</Transition>
<Transition>
<Days>365</Days>
<StorageClass>GLACIER</StorageClass>
</Transition>
</Rule>
</LifecycleConfiguration>
Standard Storage -> Standard-IA (after 30 days)
Standard-IA Storage -> Glacier (after 365 days)
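The lifecycle policy above can also be generated rather than written by hand. The following is a minimal sketch using only the Python standard library; the function name `build_tiering_policy` and its parameters are hypothetical and not part of any AWS SDK. It emits the same elements as the slide, using the `STANDARD_IA` spelling of the storage class, and the ordering check is only a local sanity check, not a statement of S3's own validation rules.

```python
import xml.etree.ElementTree as ET


def build_tiering_policy(prefix, ia_days, glacier_days, rule_id="sample-rule"):
    """Return lifecycle XML that tiers objects under `prefix` to Standard-IA, then Glacier."""
    # Local sanity check only (an assumption of this sketch).
    if not 0 < ia_days < glacier_days:
        raise ValueError("expected 0 < ia_days < glacier_days")

    root = ET.Element("LifecycleConfiguration")
    rule = ET.SubElement(root, "Rule")
    ET.SubElement(rule, "ID").text = rule_id
    ET.SubElement(rule, "Prefix").text = prefix
    ET.SubElement(rule, "Status").text = "Enabled"

    # One Transition element per storage class, in the order objects age.
    for days, storage_class in ((ia_days, "STANDARD_IA"), (glacier_days, "GLACIER")):
        transition = ET.SubElement(rule, "Transition")
        ET.SubElement(transition, "Days").text = str(days)
        ET.SubElement(transition, "StorageClass").text = storage_class

    return ET.tostring(root, encoding="unicode")


policy_xml = build_tiering_policy("documents/", ia_days=30, glacier_days=365)
print(policy_xml)
```

Generating the document from parameters makes it easier to keep one rule per key prefix, since policies are set at the key prefix level.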
Versioning S3 buckets
Protects from accidental overwrites and
deletes
New version with every upload
Easy retrieval of deleted objects and roll back
Three states of an Amazon S3 bucket
• Unversioned (Default)
• Versioning-enabled
• Versioning-suspended
Versioning + lifecycle policies
Expired object delete marker policy
Deleting a versioned object makes a delete
marker the current version of the object
No storage charge for delete marker
Removing delete marker can improve list
performance
Lifecycle policy to automatically remove the
current version delete marker when previous
versions of the object no longer exist
Example lifecycle policy to remove current versions
<LifecycleConfiguration>
<Rule>
...
<Expiration>
<Days>60</Days>
</Expiration>
<NoncurrentVersionExpiration>
<NoncurrentDays>30</NoncurrentDays>
</NoncurrentVersionExpiration>
</Rule>
</LifecycleConfiguration>
Leverage lifecycle to expire current
and non-current versions
S3 Lifecycle will automatically remove any
expired object delete markers
Example lifecycle policy for non-current version expiration
A lifecycle configuration with the
NoncurrentVersionExpiration action
removes all the noncurrent versions.
<LifecycleConfiguration>
<Rule>
...
<Expiration>
<ExpiredObjectDeleteMarker>true</ExpiredObjectDeleteMarker>
</Expiration>
<NoncurrentVersionExpiration>
<NoncurrentDays>30</NoncurrentDays>
</NoncurrentVersionExpiration>
</Rule>
</LifecycleConfiguration>
The ExpiredObjectDeleteMarker element
removes expired object delete markers.
Restricting deletes with MFA
Bucket policies can restrict deletes
For additional security, enable MFA (multi-factor
authentication) delete, which requires additional
authentication to:
• Change the versioning state of your bucket
• Permanently delete an object version
MFA delete requires both your security credentials and a
code from an approved authentication device
Performance optimization for S3
Parallel PUTs with Multipart Uploads
Increase throughput by parallelizing PUTs
Increase resiliency to network errors
Fewer large restarts on error-prone
networks
A balance between part size & number of
parts:
• Small parts increase connection overhead
• Large parts provide fewer of the benefits of multipart
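The balance between part size and number of parts can be sketched as a small calculation. This is an illustration only: the function name `choose_part_size` and the 16 MiB default target are assumptions, while the two constants reflect the documented S3 multipart limits (parts of at least 5 MiB except the last, and at most 10,000 parts per upload).

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB   # S3 minimum for every part except the last
MAX_PARTS = 10_000        # S3 maximum number of parts per upload


def ceil_div(a, b):
    """Integer ceiling division, avoiding float rounding on large sizes."""
    return -(-a // b)


def choose_part_size(object_size, target_part_size=16 * MIB):
    """Return (part_size, part_count) for a multipart upload of object_size bytes."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Never go below the S3 minimum, even if the caller asks for smaller parts.
    part_size = max(target_part_size, MIN_PART_SIZE)
    # Grow the part size if the target would need more than MAX_PARTS parts.
    if ceil_div(object_size, part_size) > MAX_PARTS:
        part_size = ceil_div(object_size, MAX_PARTS)
    return part_size, ceil_div(object_size, part_size)


size, count = choose_part_size(1024 * MIB)
print(f"{count} parts of {size // MIB} MiB")
```

A larger target means fewer connections but less parallelism; a smaller one means the opposite, which is the trade-off the slide describes.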
Incomplete multipart upload expiration policy
Multipart upload feature improves PUT
performance
Partial upload does not appear in bucket list
Partial upload does incur storage charges
Set a lifecycle policy to automatically expire
incomplete multipart uploads after a predefined
number of days
Example lifecycle policy
Abort incomplete multipart
uploads seven days after
initiation
<LifecycleConfiguration>
<Rule>
<ID>sample-rule</ID>
<Prefix>SomeKeyPrefix/</Prefix>
<Status>Enabled</Status>
<AbortIncompleteMultipartUpload>
<DaysAfterInitiation>7</DaysAfterInitiation>
</AbortIncompleteMultipartUpload>
</Rule>
</LifecycleConfiguration>
Parallel GETs
Use range-based GETs to get multithreaded
performance when downloading objects
Compensates for unreliable networks
Benefits of multithreaded parallelism
Align your ranges with your parts!
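Range-based GETs can be sketched as follows. The helper names (`byte_ranges`, `range_header`, `download_in_parallel`) are hypothetical, and `fetch_range` stands in for whatever client call issues a GET with a `Range` header; no particular SDK is assumed. Passing the same part size that was used for the multipart upload keeps the ranges aligned with the parts.

```python
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(object_size, part_size):
    """Return inclusive (start, end) byte offsets covering object_size bytes."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size > 0")
    return [
        (start, min(start + part_size, object_size) - 1)
        for start in range(0, object_size, part_size)
    ]


def range_header(start, end):
    """Format an HTTP Range header value; both offsets are inclusive."""
    return f"bytes={start}-{end}"


def download_in_parallel(fetch_range, object_size, part_size, workers=4):
    """Fetch every range concurrently and reassemble the chunks in order."""
    ranges = byte_ranges(object_size, part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so a plain join is safe.
        chunks = pool.map(lambda r: fetch_range(*r), ranges)
        return b"".join(chunks)


# Stand-in for a real ranged GET: slices an in-memory object.
data = bytes(range(256)) * 10
result = download_in_parallel(lambda s, e: data[s:e + 1], len(data), part_size=100)
print(result == data)
```

Because each range is fetched independently, a failed range can be retried on its own instead of restarting the whole download.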
Parallel LISTs
Parallelize LIST when you need a sequential
list of your keys
Secondary index to get a faster alternative to
LIST
• Sorting by metadata
• Searchability
• Objects by timestamp
Distributing object keys
Most important if you regularly exceed 100 TPS on a
bucket
Distribute keys uniformly across keyspace
Use a key-naming scheme with randomness at the
beginning
Distributing object keys
Don’t do this…
<my_bucket>/2013_11_13-164533125.jpg
<my_bucket>/2013_11_13-164533126.jpg
<my_bucket>/2013_11_13-164533127.jpg
<my_bucket>/2013_11_13-164533128.jpg
<my_bucket>/2013_11_12-164533129.jpg
<my_bucket>/2013_11_12-164533130.jpg
<my_bucket>/2013_11_12-164533131.jpg
<my_bucket>/2013_11_12-164533132.jpg
<my_bucket>/2013_11_11-164533133.jpg
<my_bucket>/2013_11_11-164533134.jpg
<my_bucket>/2013_11_11-164533135.jpg
<my_bucket>/2013_11_11-164533136.jpg
Distributing object keys
…because this is going to happen
[Figure: requests 1, 2, ... N mapped onto partitions]
Distributing object keys
Add randomness to the beginning of the key name…
<my_bucket>/521335461-2013_11_13.jpg
<my_bucket>/465330151-2013_11_13.jpg
<my_bucket>/987331160-2013_11_13.jpg
<my_bucket>/465765461-2013_11_13.jpg
<my_bucket>/125631151-2013_11_13.jpg
<my_bucket>/934563160-2013_11_13.jpg
<my_bucket>/532132341-2013_11_13.jpg
<my_bucket>/565437681-2013_11_13.jpg
<my_bucket>/234567460-2013_11_13.jpg
<my_bucket>/456767561-2013_11_13.jpg
<my_bucket>/345565651-2013_11_13.jpg
<my_bucket>/431345660-2013_11_13.jpg
Distributing object keys
…so your transactions can be distributed across the partitions
[Figure: requests 1, 2, ... N mapped onto partitions]
Techniques for distributing keys
Store as a hash:
• 83d02a66a0fee41b5767e4f4dd377d29
Prepend with short hash:
• 83d02013_11_13-164533125.jpg
Reverse:
• 521335461-31_11_3102.jpg
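The three techniques above can be sketched in a few lines. The function names are hypothetical, and MD5 is an assumption: the slides do not say which hash function produced the example values. The four-character prefix length matches the "prepend with short hash" example.

```python
import hashlib


def hashed_key(key):
    """Store as a hash: replace the whole key with its hex digest."""
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def prefixed_key(key, prefix_len=4):
    """Prepend a short hash so the original name stays readable."""
    return hashed_key(key)[:prefix_len] + key


def reversed_key(key):
    """Reverse the name but keep the file extension in place."""
    stem, dot, ext = key.rpartition(".")
    if not dot:
        return key[::-1]  # no extension to preserve
    return stem[::-1] + dot + ext


key = "2013_11_13-164533125.jpg"
print(hashed_key(key))
print(prefixed_key(key))
print(reversed_key(key))
```

Prepending a short hash keeps the original name recoverable from the key, whereas storing the full hash requires a separate index that maps hashes back to names.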
Data ingestion into Amazon S3
AWS Import/Export Snowball
• Accelerate PBs with AWS-provided appliances
• 80 TB and global availability
AWS Storage Gateway
• Up to 120 MB/s cloud upload rate (4x improvement)
• 10 Gb networking for VMware
Data ingestion into Amazon S3
Amazon Kinesis Firehose
• Ingest data streams directly into
AWS data stores
AWS Direct Connect
ISV connectors
Transfer Acceleration
• Move data up to 300% faster
using the AWS network
S3 Transfer Acceleration
Introducing Amazon S3 Transfer Acceleration
Up to 300% faster
Change your endpoint, not
your code
56 global edge locations
No firewall exceptions
No client software required
[Diagram: Uploader -> AWS Edge Location -> S3 Bucket, with optimized throughput]
How fast is Transfer Acceleration?
[Chart: time in hours for a 500 GB upload from these edge locations to a bucket in Singapore, compared with the public internet: Rio de Janeiro, Warsaw, New York, Atlanta, Madrid, Virginia, Melbourne, Paris, Los Angeles, Seattle, Tokyo, Singapore]
S3 Transfer Acceleration
Getting Started
1. Enable S3 transfer acceleration on
your S3 bucket
2. Update your application/destination URL to
<bucket-name>.s3-accelerate.amazonaws.com
3. Done!
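"Change your endpoint, not your code" can be illustrated with a small URL helper. This is a sketch only: `accelerate_url` is a hypothetical name, and real requests still need to be signed by whichever client is in use. The check on periods reflects the documented requirement that bucket names used with Transfer Acceleration are DNS-compliant and contain no periods.

```python
from urllib.parse import quote


def accelerate_url(bucket, key, scheme="https"):
    """Build the Transfer Acceleration URL for an object in `bucket`."""
    if "." in bucket:
        raise ValueError("Transfer Acceleration bucket names must not contain periods")
    # quote() leaves '/' unescaped by default, so key prefixes stay readable.
    return f"{scheme}://{bucket}.s3-accelerate.amazonaws.com/{quote(key)}"


print(accelerate_url("my-bucket", "videos/clip 1.mp4"))
```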
How much will it help me?
Use the Amazon S3 Transfer Acceleration Speed Comparison page:
http://s3-accelerate-speedtest.s3-accelerate.amazonaws.com/en/accelerate-speed-comparsion.html
By Jey Jeyasingam
Y-cam Solutions Ltd
Confidential and proprietary
Who we are...
Initially used S3 just to store videos and thumbnails, 6 years ago
But now we also use S3 for so much more
120 million objects
2 million videos
Our Architecture
Challenges
• Handling the expiration of videos
• Legacy scripts
• Reducing servers, cutting costs
Video Expiration
Create multiple buckets with different lifecycle policies
Improve code to decide which bucket to save each video in
Legacy Script
Move thumbnail creation and the DynamoDB update from the script to a Lambda function
Lambda triggered by S3 event notification
Extra benefits of using Lambda
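A Lambda function triggered by an S3 event notification receives the bucket name and object key in the event payload. The following is a minimal sketch of that entry point, not Y-cam's actual code: `objects_from_event` and `handler` are illustrative names, and the thumbnail and DynamoDB steps are left as a comment. Object keys arrive URL-encoded in the notification, so they are decoded before use.

```python
from urllib.parse import unquote_plus


def objects_from_event(event):
    """Extract (bucket, key) pairs from an S3 event notification payload."""
    return [
        (record["s3"]["bucket"]["name"], unquote_plus(record["s3"]["object"]["key"]))
        for record in event.get("Records", [])
    ]


def handler(event, context):
    """Lambda entry point: one invocation may carry several records."""
    objects = objects_from_event(event)
    for bucket, key in objects:
        # Thumbnail creation and the DynamoDB update would go here.
        print(f"new object: s3://{bucket}/{key}")
    return {"processed": len(objects)}


sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-videos"},
                "object": {"key": "videos/clip+1.mp4"}}}
    ]
}
print(handler(sample_event, None))
```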
Future Plans
Reducing number of servers
• Servers only serving web app JS code
• Moved this to be hosted by S3
• Reduced cost
Moving towards serverless architecture
Summary
Amazon S3 Standard-Infrequent Access
Amazon S3 management policies
Versioning for Amazon S3 + MFA Delete
Amazon S3 Transfer Acceleration
Please remember to rate this
session under ‘My Agenda’ on
https://awssummit.london
Deep Dive on Amazon S3

More Related Content

What's hot

What's hot (20)

Getting started with aws io t.compressed.compressed
Getting started with aws io t.compressed.compressedGetting started with aws io t.compressed.compressed
Getting started with aws io t.compressed.compressed
 
Deep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECSDeep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECS
 
AWS re:Invent 2016: Scaling Up to Your First 10 Million Users (ARC201)
AWS re:Invent 2016: Scaling Up to Your First 10 Million Users (ARC201)AWS re:Invent 2016: Scaling Up to Your First 10 Million Users (ARC201)
AWS re:Invent 2016: Scaling Up to Your First 10 Million Users (ARC201)
 
Automating Security in Cloud Workloads with DevSecOps
Automating Security in Cloud Workloads with DevSecOps Automating Security in Cloud Workloads with DevSecOps
Automating Security in Cloud Workloads with DevSecOps
 
Building Your First Big Data Application on AWS
Building Your First Big Data Application on AWSBuilding Your First Big Data Application on AWS
Building Your First Big Data Application on AWS
 
AWS IoT - Introduction - Pop-up Loft
AWS IoT - Introduction - Pop-up LoftAWS IoT - Introduction - Pop-up Loft
AWS IoT - Introduction - Pop-up Loft
 
Getting Started with Windows Workloads on Amazon EC2
Getting Started with Windows Workloads on Amazon EC2Getting Started with Windows Workloads on Amazon EC2
Getting Started with Windows Workloads on Amazon EC2
 
ENT309 scaling up to your first 10 million users
ENT309 scaling up to your first 10 million usersENT309 scaling up to your first 10 million users
ENT309 scaling up to your first 10 million users
 
ENT308 Best Practices for Microsoft Architectures on AWS
ENT308 Best Practices for Microsoft Architectures on AWSENT308 Best Practices for Microsoft Architectures on AWS
ENT308 Best Practices for Microsoft Architectures on AWS
 
Serverless Geospatial Mobile Apps with AWS
Serverless Geospatial Mobile Apps with AWSServerless Geospatial Mobile Apps with AWS
Serverless Geospatial Mobile Apps with AWS
 
Deep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECSDeep Dive on Microservices and Amazon ECS
Deep Dive on Microservices and Amazon ECS
 
Getting Started with AWS Security
Getting Started with AWS SecurityGetting Started with AWS Security
Getting Started with AWS Security
 
ENT307 VMware and AWS Together - VMware Cloud on AWS
ENT307 VMware and AWS Together - VMware Cloud on AWSENT307 VMware and AWS Together - VMware Cloud on AWS
ENT307 VMware and AWS Together - VMware Cloud on AWS
 
SEC302 Becoming an AWS Policy Ninja using AWS IAM and AWS Organizations
SEC302 Becoming an AWS Policy Ninja using AWS IAM and AWS OrganizationsSEC302 Becoming an AWS Policy Ninja using AWS IAM and AWS Organizations
SEC302 Becoming an AWS Policy Ninja using AWS IAM and AWS Organizations
 
Getting Started with AWS IoT
Getting Started with AWS IoTGetting Started with AWS IoT
Getting Started with AWS IoT
 
Storage with Amazon S3 and Amazon Glacier
Storage with Amazon S3 and Amazon GlacierStorage with Amazon S3 and Amazon Glacier
Storage with Amazon S3 and Amazon Glacier
 
Deploying a Disaster Recovery Site on AWS: Minimal Cost with Maximum Efficiency
Deploying a Disaster Recovery Site on AWS: Minimal Cost with Maximum EfficiencyDeploying a Disaster Recovery Site on AWS: Minimal Cost with Maximum Efficiency
Deploying a Disaster Recovery Site on AWS: Minimal Cost with Maximum Efficiency
 
Getting Started with Amazon EC2 and Compute Services
Getting Started with Amazon EC2 and Compute ServicesGetting Started with Amazon EC2 and Compute Services
Getting Started with Amazon EC2 and Compute Services
 
AWS re:Invent 2016: Scaling Your Web Applications with AWS Elastic Beanstalk ...
AWS re:Invent 2016: Scaling Your Web Applications with AWS Elastic Beanstalk ...AWS re:Invent 2016: Scaling Your Web Applications with AWS Elastic Beanstalk ...
AWS re:Invent 2016: Scaling Your Web Applications with AWS Elastic Beanstalk ...
 
AWS Innovate: Smart Deployment on AWS - Andy Kim
AWS Innovate: Smart Deployment on AWS - Andy KimAWS Innovate: Smart Deployment on AWS - Andy Kim
AWS Innovate: Smart Deployment on AWS - Andy Kim
 

Viewers also liked

Amazon Elastic Load Balancing
Amazon Elastic Load BalancingAmazon Elastic Load Balancing
Amazon Elastic Load Balancing
Duy Tan Geek
 

Viewers also liked (20)

Deep Dive on Amazon S3
Deep Dive on Amazon S3Deep Dive on Amazon S3
Deep Dive on Amazon S3
 
Amazon S3 Masterclass
Amazon S3 MasterclassAmazon S3 Masterclass
Amazon S3 Masterclass
 
Self-Service Supercomputing
Self-Service SupercomputingSelf-Service Supercomputing
Self-Service Supercomputing
 
How to Scale to Millions of Users with AWS
How to Scale to Millions of Users with AWSHow to Scale to Millions of Users with AWS
How to Scale to Millions of Users with AWS
 
(DAT407) Amazon ElastiCache: Deep Dive
(DAT407) Amazon ElastiCache: Deep Dive(DAT407) Amazon ElastiCache: Deep Dive
(DAT407) Amazon ElastiCache: Deep Dive
 
AWS re:Invent 2016: Elastic Load Balancing Deep Dive and Best Practices (NET403)
AWS re:Invent 2016: Elastic Load Balancing Deep Dive and Best Practices (NET403)AWS re:Invent 2016: Elastic Load Balancing Deep Dive and Best Practices (NET403)
AWS re:Invent 2016: Elastic Load Balancing Deep Dive and Best Practices (NET403)
 
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
Choosing the Right EC2 Instance and Applicable Use Cases - AWS June 2016 Webi...
 
Securing Serverless Architectures
Securing Serverless ArchitecturesSecuring Serverless Architectures
Securing Serverless Architectures
 
Cost Optimization at Scale
Cost Optimization at ScaleCost Optimization at Scale
Cost Optimization at Scale
 
DevOps on AWS: Deep Dive on Continuous Delivery and the AWS Developer Tools
DevOps on AWS: Deep Dive on Continuous Delivery and the AWS Developer ToolsDevOps on AWS: Deep Dive on Continuous Delivery and the AWS Developer Tools
DevOps on AWS: Deep Dive on Continuous Delivery and the AWS Developer Tools
 
Deep Dive on Amazon Relational Database Service
Deep Dive on Amazon Relational Database ServiceDeep Dive on Amazon Relational Database Service
Deep Dive on Amazon Relational Database Service
 
Deep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDBDeep Dive on Amazon DynamoDB
Deep Dive on Amazon DynamoDB
 
Amazon Elastic Load Balancing
Amazon Elastic Load BalancingAmazon Elastic Load Balancing
Amazon Elastic Load Balancing
 
All You Need to Know about AWS Elastic Load Balancer
All You Need to Know about AWS Elastic Load BalancerAll You Need to Know about AWS Elastic Load Balancer
All You Need to Know about AWS Elastic Load Balancer
 
Amazon ElastiCache (Dan Zamansky) - AWS DB Day
Amazon ElastiCache (Dan Zamansky) - AWS DB DayAmazon ElastiCache (Dan Zamansky) - AWS DB Day
Amazon ElastiCache (Dan Zamansky) - AWS DB Day
 
Understanding The Benefits Of Amazon EC2
Understanding The Benefits Of Amazon EC2Understanding The Benefits Of Amazon EC2
Understanding The Benefits Of Amazon EC2
 
(CMP401) Elastic Load Balancing Deep Dive and Best Practices
(CMP401) Elastic Load Balancing Deep Dive and Best Practices(CMP401) Elastic Load Balancing Deep Dive and Best Practices
(CMP401) Elastic Load Balancing Deep Dive and Best Practices
 
(SDD402) Amazon ElastiCache Deep Dive | AWS re:Invent 2014
(SDD402) Amazon ElastiCache Deep Dive | AWS re:Invent 2014(SDD402) Amazon ElastiCache Deep Dive | AWS re:Invent 2014
(SDD402) Amazon ElastiCache Deep Dive | AWS re:Invent 2014
 
Introduction to CloudFront
Introduction to CloudFrontIntroduction to CloudFront
Introduction to CloudFront
 
Availability & Scalability with Elastic Load Balancing & Route 53 (CPN204) | ...
Availability & Scalability with Elastic Load Balancing & Route 53 (CPN204) | ...Availability & Scalability with Elastic Load Balancing & Route 53 (CPN204) | ...
Availability & Scalability with Elastic Load Balancing & Route 53 (CPN204) | ...
 

Similar to Deep Dive on Amazon S3

Similar to Deep Dive on Amazon S3 (20)

AWS April 2016 Webinar Series - S3 Best Practices - A Decade of Field Experience
AWS April 2016 Webinar Series - S3 Best Practices - A Decade of Field ExperienceAWS April 2016 Webinar Series - S3 Best Practices - A Decade of Field Experience
AWS April 2016 Webinar Series - S3 Best Practices - A Decade of Field Experience
 
Deep Dive on Amazon S3
Deep Dive on Amazon S3Deep Dive on Amazon S3
Deep Dive on Amazon S3
 
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierSRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
 
Deep Dive on Amazon S3 - AWS Online Tech Talks
Deep Dive on Amazon S3 - AWS Online Tech TalksDeep Dive on Amazon S3 - AWS Online Tech Talks
Deep Dive on Amazon S3 - AWS Online Tech Talks
 
Deep Dive on Amazon S3 - March 2017 AWS Online Tech Talks
Deep Dive on Amazon S3 - March 2017 AWS Online Tech TalksDeep Dive on Amazon S3 - March 2017 AWS Online Tech Talks
Deep Dive on Amazon S3 - March 2017 AWS Online Tech Talks
 
Deep Dive on Amazon S3
Deep Dive on Amazon S3Deep Dive on Amazon S3
Deep Dive on Amazon S3
 
Deep Dive On Object Storage: Amazon S3 and Amazon Glacier - AWS PS Summit Can...
Deep Dive On Object Storage: Amazon S3 and Amazon Glacier - AWS PS Summit Can...Deep Dive On Object Storage: Amazon S3 and Amazon Glacier - AWS PS Summit Can...
Deep Dive On Object Storage: Amazon S3 and Amazon Glacier - AWS PS Summit Can...
 
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
 
Building a Data Lake on AWS
Building a Data Lake on AWSBuilding a Data Lake on AWS
Building a Data Lake on AWS
 
February 2016 Webinar Series - Use AWS Cloud Storage as the Foundation for Hy...
February 2016 Webinar Series - Use AWS Cloud Storage as the Foundation for Hy...February 2016 Webinar Series - Use AWS Cloud Storage as the Foundation for Hy...
February 2016 Webinar Series - Use AWS Cloud Storage as the Foundation for Hy...
 
Deep Dive on Object Storage: Amazon S3 and Amazon Glacier | AWS Public Sector...
Deep Dive on Object Storage: Amazon S3 and Amazon Glacier | AWS Public Sector...Deep Dive on Object Storage: Amazon S3 and Amazon Glacier | AWS Public Sector...
Deep Dive on Object Storage: Amazon S3 and Amazon Glacier | AWS Public Sector...
 
2016 Utah Cloud Summit: AWS S3
2016 Utah Cloud Summit: AWS S32016 Utah Cloud Summit: AWS S3
2016 Utah Cloud Summit: AWS S3
 
Builders' Day - Best Practises for S3 - BL
Builders' Day - Best Practises for S3 - BLBuilders' Day - Best Practises for S3 - BL
Builders' Day - Best Practises for S3 - BL
 
AWS re:Invent 2016: Workshop: AWS S3 Deep-Dive Hands-On Workshop: Deploying a...
AWS re:Invent 2016: Workshop: AWS S3 Deep-Dive Hands-On Workshop: Deploying a...AWS re:Invent 2016: Workshop: AWS S3 Deep-Dive Hands-On Workshop: Deploying a...
AWS re:Invent 2016: Workshop: AWS S3 Deep-Dive Hands-On Workshop: Deploying a...
 
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierSRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
 
Amazon S3 Deep Dive
Amazon S3 Deep DiveAmazon S3 Deep Dive
Amazon S3 Deep Dive
 
Deep Dive on Amazon S3 (May 2016)
Deep Dive on Amazon S3 (May 2016)Deep Dive on Amazon S3 (May 2016)
Deep Dive on Amazon S3 (May 2016)
 
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierSRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon Glacier
 
Supercharging the Value of Your Data with Amazon S3
Supercharging the Value of Your Data with Amazon S3Supercharging the Value of Your Data with Amazon S3
Supercharging the Value of Your Data with Amazon S3
 
AWS Journey through the AWS Cloud: Disaster Recovery
AWS Journey through the AWS Cloud: Disaster RecoveryAWS Journey through the AWS Cloud: Disaster Recovery
AWS Journey through the AWS Cloud: Disaster Recovery
 

More from Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Recently uploaded (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Artificial Intelligence Chap.5 : Uncertainty

Deep Dive on Amazon S3

  • 1. Amazon S3 Deep Dive. Lee Atkinson, Solutions Architect, AWS; Jey Jeyasingam, CTO, Y-Cam. 7 July 2016. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 2. AWS storage services. File: Amazon EFS. Block: Amazon EBS, Amazon EC2 instance store. Object: Amazon S3, Amazon Glacier. Data transfer: AWS Direct Connect, Snowball, ISV connectors, Amazon Kinesis Firehose, Transfer Acceleration, AWS Storage Gateway.
  • 3. Innovation for Amazon S3 (1/2): Cross-region replication; Amazon CloudWatch metrics for Amazon S3 & AWS CloudTrail support; VPC endpoint for Amazon S3; Read-after-write consistency in all regions; Event notifications; Amazon S3 bucket limit increase
  • 4. Innovation for Amazon S3 (2/2): Amazon S3 Standard-IA; Transfer Acceleration; Incomplete multipart upload expiration; Expired object delete marker
  • 5. Choice of storage classes on Amazon S3. Classes: Standard; Standard - Infrequent Access; Amazon Glacier. Data labels: Active data; Archive data; Active Archive.
  • 6. Some use cases have different requirements: File sync and share / consumer file storage; Backup and archive / disaster recovery; Long retained data
  • 7. Standard-Infrequent Access storage. Durable: 11 9s of durability. Available: Designed for 99.9% availability. High performance: Same throughput as Amazon S3 Standard storage. Secure: • Server-side encryption • Use your encryption keys • KMS-managed encryption keys. Integrated: • Lifecycle management • Versioning • Event notifications • Metrics. Easy to use: • No impact on user experience • Simple REST API • Single bucket
  • 9. Lifecycle policies: Automatic tiering and cost controls. Includes two possible actions: • Transition: to Standard-IA or Glacier after specified time • Expiration: deletes objects after specified time. Allows for actions to be combined. Set policies at the key prefix level.
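A minimal sketch of how the two actions might be combined in one rule, built with Python's standard `xml.etree.ElementTree`. The prefix `documents/`, the rule ID, and the 30/90/365-day thresholds are illustrative assumptions, not values from the talk. The S3 API spells the Standard-IA storage class `STANDARD_IA`.

```python
import xml.etree.ElementTree as ET


def build_lifecycle_xml(prefix, ia_days, glacier_days, expire_days):
    """Return a lifecycle configuration that tiers, then expires, objects under prefix."""
    config = ET.Element("LifecycleConfiguration")
    rule = ET.SubElement(config, "Rule")
    ET.SubElement(rule, "ID").text = "sample-rule"  # hypothetical rule name
    ET.SubElement(rule, "Prefix").text = prefix
    ET.SubElement(rule, "Status").text = "Enabled"

    # Two Transition actions and one Expiration action combined in a single rule.
    for days, storage_class in ((ia_days, "STANDARD_IA"), (glacier_days, "GLACIER")):
        transition = ET.SubElement(rule, "Transition")
        ET.SubElement(transition, "Days").text = str(days)
        ET.SubElement(transition, "StorageClass").text = storage_class

    expiration = ET.SubElement(rule, "Expiration")
    ET.SubElement(expiration, "Days").text = str(expire_days)
    return ET.tostring(config, encoding="unicode")


# Illustrative values only: tier at 30 and 90 days, delete after a year.
policy_xml = build_lifecycle_xml("documents/", 30, 90, 365)
print(policy_xml)
```

Because the rule is scoped by `Prefix`, a different retention schedule can be expressed by adding another rule for another key prefix.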
  • 11. Versioning S3 buckets: Protects from accidental overwrites and deletes. New version with every upload. Easy retrieval of deleted objects and roll back. Three states of an Amazon S3 bucket: • Unversioned (Default) • Versioning-enabled • Versioning-suspended
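As a rough sketch of "easy retrieval of deleted objects": in a versioning-enabled bucket, removing the current delete marker makes the previous version current again. The functions below assume `s3` is a boto3 S3 client (for example `boto3.client("s3")`) passed in by the caller; the function names are hypothetical, and only the first page of `list_object_versions` results is examined.

```python
def enable_versioning(s3, bucket):
    """Move a bucket into the versioning-enabled state."""
    s3.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )


def undelete(s3, bucket, key):
    """Remove the current delete marker for key, if any.

    Returns the version ID of the removed marker, or None if the key
    is not currently hidden behind a delete marker.
    """
    response = s3.list_object_versions(Bucket=bucket, Prefix=key)
    for marker in response.get("DeleteMarkers", []):
        if marker["Key"] == key and marker["IsLatest"]:
            # Deleting the marker itself (by VersionId) restores the prior version.
            s3.delete_object(Bucket=bucket, Key=key, VersionId=marker["VersionId"])
            return marker["VersionId"]
    return None
```

Taking the client as a parameter keeps the sketch free of credentials and region settings, which are left to the caller.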
  • 13. Expired object delete marker policy: Deleting a versioned object makes a delete marker the current version of the object. No storage charge for delete marker. Removing delete marker can improve list performance. Lifecycle policy to automatically remove the current version delete marker when previous versions of the object no longer exist.
  • 14. Expired object delete marker policy: Example lifecycle policy to remove current versions. Leverage lifecycle to expire current and non-current versions. S3 Lifecycle will automatically remove any expired object delete markers.
      <LifecycleConfiguration>
        <Rule>
          ...
          <Expiration>
            <Days>60</Days>
          </Expiration>
          <NoncurrentVersionExpiration>
            <NoncurrentDays>30</NoncurrentDays>
          </NoncurrentVersionExpiration>
        </Rule>
      </LifecycleConfiguration>
  • 15. Expired object delete marker policy: Example lifecycle policy for non-current version expiration. Lifecycle configuration with NoncurrentVersionExpiration action removes all the noncurrent versions. ExpiredObjectDeleteMarker element removes expired object delete markers.
      <LifecycleConfiguration>
        <Rule>
          ...
          <Expiration>
            <ExpiredObjectDeleteMarker>true</ExpiredObjectDeleteMarker>
          </Expiration>
          <NoncurrentVersionExpiration>
            <NoncurrentDays>30</NoncurrentDays>
          </NoncurrentVersionExpiration>
        </Rule>
      </LifecycleConfiguration>
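The same rule could be submitted from code. This sketch assumes `s3` is a boto3 S3 client supplied by the caller; the rule ID and the empty prefix (meaning the whole bucket) are illustrative assumptions, and the function names are hypothetical.

```python
def versioned_cleanup_rule(noncurrent_days=30, prefix=""):
    """Build one lifecycle rule in the dictionary shape boto3 expects."""
    return {
        "ID": "cleanup-old-versions",  # hypothetical rule name
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Remove delete markers that no longer have any versions behind them.
        "Expiration": {"ExpiredObjectDeleteMarker": True},
        # Permanently remove versions this many days after they become noncurrent.
        "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
    }


def apply_lifecycle(s3, bucket, rules):
    """Replace the bucket's lifecycle configuration with the given rules."""
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": rules},
    )
```

Note that the call replaces the whole configuration, so any existing rules that should survive need to be included in `rules`.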
  • 16. Restricting deletes with MFA: Bucket policies can restrict deletes. For additional security, enable MFA (multi-factor authentication) delete, which requires additional authentication to: • Change the versioning state of your bucket • Permanently delete an object version. MFA delete requires both your security credentials and a code from an approved authentication device.
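A hedged sketch of the two operations that MFA delete protects, assuming `s3` is a boto3 S3 client passed in by the caller. The device serial and the one-time code are supplied together as a single space-separated string; the function names are hypothetical.

```python
def mfa_token(device_serial, code):
    """Combine the device serial number and the current code into one value."""
    return f"{device_serial} {code}"


def enable_mfa_delete(s3, bucket, device_serial, code):
    """Turn on versioning together with MFA delete for a bucket."""
    s3.put_bucket_versioning(
        Bucket=bucket,
        MFA=mfa_token(device_serial, code),
        VersioningConfiguration={"Status": "Enabled", "MFADelete": "Enabled"},
    )


def delete_version(s3, bucket, key, version_id, device_serial, code):
    """Permanently delete one object version, presenting an MFA code."""
    s3.delete_object(
        Bucket=bucket,
        Key=key,
        VersionId=version_id,
        MFA=mfa_token(device_serial, code),
    )
```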
  • 18. Parallel PUTs with Multipart Uploads: Increase throughput by parallelizing PUTs. Increase resiliency to network errors. Fewer large restarts on error-prone networks. A balance between part size & number of parts: • Small parts increase connection overhead • Large parts provide fewer of the benefits of multipart
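The balance between part size and part count can be worked through numerically. This sketch uses the documented multipart limits (parts of 5 MiB to 5 GiB, at most 10,000 parts per upload); the 16 MiB target is an arbitrary assumption for illustration, not a recommendation from the talk.

```python
MIB = 1024 ** 2
MIN_PART = 5 * MIB         # smallest allowed part (the last part may be smaller)
MAX_PART = 5 * 1024 * MIB  # largest allowed part: 5 GiB
MAX_PARTS = 10_000         # most parts allowed in one multipart upload


def ceil_div(a, b):
    """Integer division rounded up, without floating point."""
    return -(-a // b)


def choose_part_size(object_size, target=16 * MIB):
    """Return (part_size, part_count) for an object of object_size bytes.

    Uses the target size when possible, and grows the part size only as
    far as needed to stay within MAX_PARTS.
    """
    smallest_legal = ceil_div(object_size, MAX_PARTS)
    part_size = max(MIN_PART, target, smallest_legal)
    if part_size > MAX_PART:
        raise ValueError("object too large for a single multipart upload")
    return part_size, ceil_div(object_size, part_size)


print(choose_part_size(1024 * MIB))  # a 1 GiB object with the default target
```

Smaller targets give more parallelism and cheaper retries at the cost of more requests; larger targets do the opposite.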
  • 19. Incomplete multipart upload expiration policy: Multipart upload feature improves PUT performance. Partial upload does not appear in bucket list. Partial upload does incur storage charges. Set a lifecycle policy to automatically expire incomplete multipart uploads after a predefined number of days.
  • 20. Incomplete multipart upload expiration policy: Example lifecycle policy. Abort incomplete multipart uploads seven days after initiation.
      <LifecycleConfiguration>
        <Rule>
          <ID>sample-rule</ID>
          <Prefix>SomeKeyPrefix/</Prefix>
          <Status>rule-status</Status>
          <AbortIncompleteMultipartUpload>
            <DaysAfterInitiation>7</DaysAfterInitiation>
          </AbortIncompleteMultipartUpload>
        </Rule>
      </LifecycleConfiguration>
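Since partial uploads do not show up in a normal bucket listing, it can be useful to see which ones such a rule would target. The sketch below approximates the same seven-day cutoff on the client side; it assumes `s3` is a boto3 S3 client supplied by the caller, reads only the first page of `list_multipart_uploads` results, and uses hypothetical function names. The lifecycle rule remains the simpler way to do the actual cleanup.

```python
from datetime import datetime, timedelta, timezone


def stale_uploads(uploads, now, max_age_days=7):
    """Return (key, upload_id) pairs for uploads initiated at least max_age_days ago."""
    cutoff = now - timedelta(days=max_age_days)
    return [(u["Key"], u["UploadId"]) for u in uploads if u["Initiated"] <= cutoff]


def abort_stale_uploads(s3, bucket, now=None, max_age_days=7):
    """Abort old incomplete multipart uploads and return what was aborted."""
    now = now or datetime.now(timezone.utc)
    uploads = s3.list_multipart_uploads(Bucket=bucket).get("Uploads", [])
    stale = stale_uploads(uploads, now, max_age_days)
    for key, upload_id in stale:
        # Aborting releases the storage held by the already-uploaded parts.
        s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
    return stale
```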
  • 21. Parallel GETs: Use range-based GETs to get multithreaded performance when downloading objects. Compensates for unreliable networks. Benefits of multithreaded parallelism. Align your ranges with your parts!
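A small sketch of range-based GETs: split the object into byte ranges that match the upload part size, fetch them on a thread pool, and join the chunks in order. `fetch` is a caller-supplied function (an assumption of this sketch) that takes an HTTP `Range` header value and returns the bytes. With a boto3 client it might be `lambda r: s3.get_object(Bucket=bucket, Key=key, Range=r)["Body"].read()`.

```python
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(object_size, part_size):
    """Yield HTTP Range header values covering the object in part_size chunks."""
    for start in range(0, object_size, part_size):
        end = min(start + part_size, object_size) - 1  # Range ends are inclusive
        yield f"bytes={start}-{end}"


def parallel_get(fetch, object_size, part_size, workers=8):
    """Download all ranges concurrently and return the reassembled bytes."""
    ranges = list(byte_ranges(object_size, part_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the chunks join back in sequence.
        return b"".join(pool.map(fetch, ranges))


print(list(byte_ranges(10, 4)))
```

A failed range can be retried on its own, which is where the resilience on unreliable networks comes from.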
  • 22. Parallel LISTs: Parallelize LIST when you need a sequential list of your keys. Secondary index to get a faster alternative to LIST: • Sorting by metadata • Searchability • Objects by timestamp
  • 23. Distributing object keys: Most important if you regularly exceed 100 TPS on a bucket. Distribute keys uniformly across keyspace. Use a key-naming scheme with randomness at the beginning.
  • 24. Distributing object keys: Don’t do this…
      <my_bucket>/2013_11_13-164533125.jpg
      <my_bucket>/2013_11_13-164533126.jpg
      <my_bucket>/2013_11_13-164533127.jpg
      <my_bucket>/2013_11_13-164533128.jpg
      <my_bucket>/2013_11_12-164533129.jpg
      <my_bucket>/2013_11_12-164533130.jpg
      <my_bucket>/2013_11_12-164533131.jpg
      <my_bucket>/2013_11_12-164533132.jpg
      <my_bucket>/2013_11_11-164533133.jpg
      <my_bucket>/2013_11_11-164533134.jpg
      <my_bucket>/2013_11_11-164533135.jpg
      <my_bucket>/2013_11_11-164533136.jpg
  • 25. Distributing object keys: …because this is going to happen (diagram: requests 1, 2, N and four partitions)
  • 26. Distributing object keys: Add randomness to the beginning of the key name…
      <my_bucket>/521335461-2013_11_13.jpg
      <my_bucket>/465330151-2013_11_13.jpg
      <my_bucket>/987331160-2013_11_13.jpg
      <my_bucket>/465765461-2013_11_13.jpg
      <my_bucket>/125631151-2013_11_13.jpg
      <my_bucket>/934563160-2013_11_13.jpg
      <my_bucket>/532132341-2013_11_13.jpg
      <my_bucket>/565437681-2013_11_13.jpg
      <my_bucket>/234567460-2013_11_13.jpg
      <my_bucket>/456767561-2013_11_13.jpg
      <my_bucket>/345565651-2013_11_13.jpg
      <my_bucket>/431345660-2013_11_13.jpg
  • 27. Distributing object keys: …so your transactions can be distributed across the partitions (diagram: requests 1, 2, N and four partitions)
  • 28. Techniques for distributing keys Store as a hash: • 83d02a66a0fee41b5767e4f4dd377d29 Prepend with short hash: • 83d02013_11_13-164533125.jpg Reverse: • 521335461-31_11_3102.jpg
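The three techniques can be sketched in a few lines. The slides do not say which hash function produced their examples, so MD5 here is an assumption, as are the four-character prefix length and the function names.

```python
import hashlib


def hashed_key(name):
    """Store as a hash: replace the name with its 32-character hex digest."""
    return hashlib.md5(name.encode("utf-8")).hexdigest()


def prefixed_key(name, prefix_len=4):
    """Prepend a short hash so the original name stays readable."""
    return hashed_key(name)[:prefix_len] + name


def reversed_key(name):
    """Reverse the part before the extension so the fast-changing digits come first."""
    stem, dot, extension = name.rpartition(".")
    if not dot:  # no extension: reverse the whole name
        return name[::-1]
    return stem[::-1] + dot + extension


original = "2013_11_13-164533125.jpg"
print(hashed_key(original))
print(prefixed_key(original))
print(reversed_key(original))
```

Hashing the whole name loses the ability to list by date, while the prefix and reverse forms keep the original name recoverable from the key.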
  • 29. Data ingestion into Amazon S3
  • 30. Data ingestion into Amazon S3. AWS Import/Export Snowball: • Accelerate PBs with AWS-provided appliances • 80TB and global availability. AWS Storage Gateway: • Up to 120 MB/s cloud upload rate (4x improvement), and • 10 Gb networking for VMware. Amazon Kinesis Firehose: • Ingest data streams directly into AWS data stores. Transfer Acceleration: • Move data up to 300% faster using the AWS network. AWS Direct Connect. ISV connectors.
  • 32. Introducing Amazon S3 Transfer Acceleration: Up to 300% faster. Change your endpoint, not your code. 56 global edge locations. No firewall exceptions. No client software required. (Diagram labels: Uploader, AWS Edge Location, S3 Bucket; Optimized Throughput!)
  • 33. How fast is Transfer Acceleration? 500 GB upload from these edge locations to a bucket in Singapore: Rio De Janeiro, Warsaw, New York, Atlanta, Madrid, Virginia, Melbourne, Paris, Los Angeles, Seattle, Tokyo, Singapore. (Chart: time in hours, S3 Transfer Acceleration compared with public Internet.)
  • 34. Getting Started: 1. Enable S3 transfer acceleration on your S3 bucket 2. Update your application/destination URL to <bucket-name>.s3-accelerate.amazonaws.com 3. Done!
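The two steps above can be sketched as follows. `enable_acceleration` assumes `s3` is a boto3 S3 client passed in by the caller; `accelerate_url` only builds the endpoint URL and does not sign the request. The bucket and key names in the example are hypothetical. Transfer Acceleration requires a DNS-compliant bucket name without periods, which the check below enforces.

```python
import re
from urllib.parse import quote

# 3 to 63 characters: lowercase letters, digits, and hyphens, no periods.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def enable_acceleration(s3, bucket):
    """Step 1: switch Transfer Acceleration on for the bucket."""
    s3.put_bucket_accelerate_configuration(
        Bucket=bucket,
        AccelerateConfiguration={"Status": "Enabled"},
    )


def accelerate_url(bucket, key):
    """Step 2: point the application at the accelerate endpoint."""
    if not _BUCKET_RE.fullmatch(bucket):
        raise ValueError(f"bucket name not usable with Transfer Acceleration: {bucket!r}")
    return f"https://{bucket}.s3-accelerate.amazonaws.com/{quote(key)}"


print(accelerate_url("my-videos", "clips/a b.mp4"))
```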
  • 35. How much will it help me? Use the Amazon S3 Transfer Acceleration Speed Comparison page: http://s3-accelerate-speedtest.s3-accelerate.amazonaws.com/en/accelerate-speed-comparsion.html
  • 37. Who we are... Initially used S3 just to store videos and thumbnails, 6 years ago. 120 million objects. But now we also use S3 for so much more. 2 million videos. (Y-cam Solutions Ltd, Confidential and proprietary)
  • 38. Our Architecture
  • 39. Challenges: Handling the expiration of videos. Legacy scripts. Reducing servers, cutting costs.
  • 40. Video Expiration: Create multiple buckets with different lifecycles. Improve code to decide which bucket to save the video in.
  • 41. Legacy Script: Move thumbnail creation and the DynamoDB update from the script to a Lambda function. Extra benefits of using Lambda. Lambda triggered by S3 event notification.
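A minimal sketch of a Lambda function triggered by an S3 event notification. The event parsing follows the standard S3 notification structure, in which object keys arrive URL-encoded. `create_thumbnail` and `update_index` are hypothetical callables standing in for the thumbnail and DynamoDB steps; this is not Y-cam's actual code.

```python
from urllib.parse import unquote_plus


def objects_in_event(event):
    """Yield (bucket, key) for every record in an S3 event notification."""
    for record in event.get("Records", []):
        s3_info = record["s3"]
        # Keys are URL-encoded in the event, with spaces sent as '+'.
        yield s3_info["bucket"]["name"], unquote_plus(s3_info["object"]["key"])


def make_handler(create_thumbnail, update_index):
    """Build a Lambda handler around the two caller-supplied steps."""
    def handler(event, context):
        processed = []
        for bucket, key in objects_in_event(event):
            thumbnail_key = create_thumbnail(bucket, key)  # hypothetical step
            update_index(key, thumbnail_key)               # hypothetical step
            processed.append(key)
        return {"processed": processed}
    return handler
```

Injecting the two steps keeps the event handling testable without AWS access.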
  • 42. Future Plans: Reducing number of servers. Servers only serving web app JS code. Moved this to be hosted by S3. Reduced cost. Moving towards serverless architecture.
  • 43. Summary: Amazon S3 Standard-Infrequent Access; Amazon S3 management policies; Versioning for Amazon S3 + MFA Delete; Amazon S3 Transfer Acceleration
  • 44. Please remember to rate this session under ‘My Agenda’ on https://awssummit.london