In this session, we explore the features and functions of AWS storage services. We give context on the portfolio, cover the most common use cases for AWS offerings for object, file, block, and migration technologies, including the partner ecosystem, and then go into each service with customer case study examples. Leave this session with an understanding of how to select storage and start moving workloads or building new ones.
Building Hybrid Cloud Storage Architectures with AWS @scale | Amazon Web Services
The document discusses building hybrid cloud storage architectures with AWS. It provides an overview of AWS storage services including Amazon S3, Glacier, EBS, and EFS. It also describes the AWS Storage Gateway family of on-premises appliances that enable hybrid storage between on-premises and AWS cloud storage. Specifically, it covers the File Gateway for accessing S3 storage as files, Volume Gateway for iSCSI volumes, and Tape Gateway for migrating tape backups to S3.
The introductory morning session will discuss big data challenges and provide an overview of the AWS Big Data Platform. We will also cover:
• How AWS customers leverage the platform to manage massive volumes of data from a variety of sources while containing costs.
• Reference architectures for popular use cases, including: connected devices (IoT), log streaming, real-time intelligence, and analytics.
• The AWS big data portfolio of services, including Amazon S3, Kinesis, DynamoDB, Elastic MapReduce (EMR) and Redshift.
• The latest relational database engine, Amazon Aurora - a MySQL-compatible, highly available relational database engine that provides up to five times better performance than MySQL at one-tenth the cost of a commercial database.
• Amazon Machine Learning – the latest big data service from AWS, which provides visualization tools and wizards that guide you through the process of creating machine learning (ML) models without having to learn complex ML algorithms and technology.
AWS Summit Singapore - Architecting a Serverless Data Lake on AWS | Amazon Web Services
Unni Pillai, Specialist Solution Architect, ASEAN, AWS.
Daniel Muller, Head of Cloud Infrastructure, Spuul.
As the volume and types of data continue to grow, customers often have valuable data that is not easily discoverable and available for analytics. A common challenge for data engineering teams is architecting a data lake that can cater to the needs of diverse users - from developers to business analysts to data scientists.
In this session, we will dive deep into building a data lake using Amazon S3, Amazon Kinesis, Amazon Athena and AWS Glue. We will also see how AWS Glue crawlers can automatically discover your data, extracting and cataloguing relevant metadata to reduce the operational work of preparing your data for downstream consumers.
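The crawler setup described above can be sketched with the AWS SDK for Python. This is a minimal illustration, not code from the session; the crawler name, IAM role, database, and S3 path are all assumed placeholders.

```python
# Illustrative request body for glue.create_crawler (boto3).
# All names, ARNs, and paths here are hypothetical placeholders.

def build_crawler_request(name, role_arn, database, s3_path):
    """Build the request for registering a Glue crawler over an S3 prefix."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        "Schedule": "cron(0 2 * * ? *)",  # crawl daily at 02:00 UTC
        "SchemaChangePolicy": {
            # Log schema changes on re-crawl rather than rewriting tables.
            "UpdateBehavior": "LOG",
            "DeleteBehavior": "LOG",
        },
    }

request = build_crawler_request(
    name="raw-events-crawler",
    role_arn="arn:aws:iam::123456789012:role/GlueCrawlerRole",
    database="datalake_raw",
    s3_path="s3://example-datalake/raw/events/",
)
# With AWS credentials configured, this dict is passed straight through:
#   import boto3
#   boto3.client("glue").create_crawler(**request)
```

Once the crawler runs, the resulting catalog tables are immediately queryable from Athena and visible to downstream Glue ETL jobs.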
Furthermore, learn from our customer Spuul how they moved from data-warehouse-based analytics to a serverless data lake. Why and how did Spuul undertake this journey? Hear about the benefits and challenges they encountered.
by Drew Meyer, Sr. Product Marketing Manager, AWS
This session will provide an overview of the AWS storage portfolio, including block, file, object, and cloud data migration services. We will touch on new offerings, outline some of the most common use cases, and prepare you for the individual deep dive sessions, customer sessions, and new announcements. The session will also address our partner network and what it means for a storage provider to have the APN Storage Competency.
This document provides an overview of AWS databases and analytics services. It discusses AWS's broad portfolio of purpose-built databases including relational databases like RDS and Aurora, non-relational databases like DynamoDB and Neptune, data lakes with S3 and Glue, data movement services, and analytics services like Redshift, EMR, and Athena. It also covers key concepts around relational and non-relational data models and provides examples of common use cases for different database types.
Visualize your data in Data Lake with AWS Athena and AWS Quicksight Hands-on ... | Amazon Web Services
Level 200: Visualize Your Data in Data Lake with AWS Athena and AWS Quicksight
Nowadays, enterprises are building data lakes that store large amounts of structured and unstructured data for analysis. But building the required data modeling and infrastructure takes a lot of time. How to run quick data queries without managing servers and databases is the next big question for every enterprise.
In this workshop, eCloudvalley, the first and only Premier Consulting Partner in GCR, will demonstrate how to use serverless architecture to visualize your data using Amazon Athena and Amazon Quicksight.
You can easily query and visualize the data in your S3 buckets and gain business insights by combining these two services. You can also build business reports with other services such as AWS IoT and Amazon Kinesis Firehose.
Reason to Attend:
Learn how to quickly query thousands of objects on S3 with the serverless Amazon Athena
Learn how to use Amazon QuickSight to retrieve information from your data quickly and create detailed reports
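As a rough illustration of the Athena side of this workshop (not code from the session itself), a serverless query needs only a SQL string and two settings; the database, table, and results bucket below are assumed placeholders.

```python
# Illustrative Athena query setup; database, table, and buckets are
# hypothetical. Athena reads data directly from S3 and writes query
# results to the S3 location you name.

query = """
SELECT user_id, COUNT(*) AS views
FROM weblogs.page_views        -- table created by a Glue crawler or DDL
WHERE year = '2018' AND month = '06'
GROUP BY user_id
ORDER BY views DESC
LIMIT 10
"""

params = {
    "QueryString": query,
    "QueryExecutionContext": {"Database": "weblogs"},
    "ResultConfiguration": {"OutputLocation": "s3://example-athena-results/"},
}
# With AWS credentials configured:
#   import boto3
#   qid = boto3.client("athena").start_query_execution(**params)["QueryExecutionId"]
```

QuickSight can then point at the same Athena database as a data source, so the query logic and the visualization share one catalog.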
Hybrid as a Stepping Stone: It’s Not All or Nothing for Your Cloud Transforma... | Amazon Web Services
The implementation of highly scalable, easy-to-deploy technology is transforming the public sector, but it's not a one-size-fits-all approach. Organizations begin their cloud adoption journeys in many ways. Some start with pilot projects and others jump into mission critical programs, but they are all starting with an existing infrastructure. Adopting cloud doesn't mean scrapping it all and starting over. This session explores how organizations can extend their existing IT platforms into the cloud to enable hybrid capabilities capable of supporting every phase of their transformation. Learn More: https://aws.amazon.com/government-education/
"Wipro is one of India's largest publicly traded companies and the seventh largest IT services firm in the world. In this session, we showcase the structured methods that Wipro has used in enabling enterprises to take advantage of the cloud. These cover identifying workloads and application profiles that could benefit, re-structuring enterprise application and infrastructure components for migration, rapid and thorough verification and validation, and modifying component monitoring and management.
Several of these methods can be tailored to the individual client or functional context, so specific client examples are presented. We also discuss the enterprise experience of enabling many non-IT functions to benefit from the cloud, such as sales and training. More functions included in the cloud increase the benefit drawn from a cloud-enabled IT landscape.
Session sponsored by Wipro."
Instructor: Ivan Cheng, Solution Architect, AWS
Join us for a series of introductory and technical sessions on AWS Big Data solutions. Gain a thorough understanding of what Amazon Web Services offers across the big data lifecycle and learn architectural best practices for applying those solutions to your projects.
We will kick off this technical seminar in the morning with an introduction to the AWS Big Data platform, including a discussion of popular use cases and reference architectures. In the afternoon, we will deep dive into Machine Learning and Streaming Analytics. We will then walk everyone through building your first Big Data application with AWS.
This session provides a foundational overview of the AWS storage portfolio, including block, file, object, and cloud data migration services. This session will touch on the significant new offerings, outline some of the most common use cases and prepare you for the individual deep dive sessions, customer sessions and new announcements.
We will cover the core AWS storage services, which include Amazon Simple Storage Service (Amazon S3), Amazon Glacier, Amazon Elastic File System (Amazon EFS), and Amazon Elastic Block Store (Amazon EBS). We also discuss data transfer services such as AWS Snowball, Snowball Edge, and AWS Snowmobile, and hybrid storage solutions such as AWS Storage Gateway.
Building Serverless Web Applications - DevDay Los Angeles 2017 | Amazon Web Services
The document provides information about building serverless web applications using AWS Lambda and other AWS services. It begins with an overview of serverless computing using AWS Lambda and how it avoids the need to provision and manage servers. It then discusses various AWS compute offerings and when to use EC2, ECS, or Lambda. The rest of the document discusses serverless design patterns, demonstrates building a serverless web application using services like API Gateway and DynamoDB, and how to define and manage serverless applications using the AWS Serverless Application Model (SAM).
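A minimal sketch of the Lambda piece of this pattern, assuming an API Gateway proxy integration. The event shape is the standard proxy format, while the handler's response contents are invented for illustration (a real handler would typically read the item from DynamoDB via boto3).

```python
import json

def handler(event, context):
    """API Gateway (proxy integration) -> Lambda handler sketch."""
    # Proxy events carry URL path parameters under "pathParameters".
    item_id = (event.get("pathParameters") or {}).get("id", "unknown")
    body = {"id": item_id, "message": "hello from Lambda"}
    # The proxy integration expects statusCode/headers/body in the response.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }

# Local invocation with a fabricated proxy event:
resp = handler({"pathParameters": {"id": "42"}}, None)
```

In the SAM model described above, this function plus its API route would be declared together in the template, so the whole application deploys as one unit.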
AWS Big Data and Analytics Services Speed Innovation | AWS Public Sector Summ... | Amazon Web Services
Data-driven agencies face extreme data integration and analytics challenges. Decades of point solutions have solved specific mission problems while creating valuable data stores. However, these data stores are not integrated and are stored in information silos. AWS's powerful data ingestion and integration services now allow agencies to rapidly store more in data lakes for deeper analytics. Join this discussion on how FAA and other agencies have leveraged AWS data integration and analytic services to optimize and innovate with their previously untapped information silos. Learn More: https://aws.amazon.com/government-education/
ENT305 Migrating Your Databases to AWS: Deep Dive on Amazon Relational Databa... | Amazon Web Services
Amazon RDS allows you to launch an optimally configured, secure and highly available database with just a few clicks. It provides cost-efficient and resizable capacity, automates time-consuming database administration tasks, and provides you with six familiar database engines to choose from: Amazon Aurora, Oracle, Microsoft SQL Server, PostgreSQL, MySQL and MariaDB. In this session, we will take a close look at the capabilities of Amazon RDS and explain how it works. We’ll also discuss the AWS Database Migration Service and AWS Schema Conversion Tool, which help you migrate databases and data warehouses with minimal downtime from on-premises and cloud environments to Amazon RDS and other Amazon services. Gain your freedom from expensive, proprietary databases while providing your applications with the fast performance, scalability, high availability, and compatibility they need.
AWS offers numerous services to migrate data at a petabyte scale. You can easily move large volumes of data from onsite to the cloud and utilize the cloud as a backup target using data transfer services, such as AWS Snowball, AWS Snowball Edge, or AWS Storage Gateway. Learn about available data migration options and which one is the right fit for your requirements.
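A rough way to reason about which option fits is to compare how long the network transfer would take against shipping a device. The helper below is a back-of-the-envelope sketch; the 7-day rule of thumb and the device capacities are illustrative assumptions, not official sizing guidance.

```python
# Back-of-the-envelope chooser for data migration options. The 7-day
# threshold and the device capacities are illustrative assumptions.

def transfer_days(data_tb, link_mbps, utilization=0.8):
    """Days to push `data_tb` terabytes over a `link_mbps` megabit/s link."""
    bits = data_tb * 1e12 * 8                     # terabytes -> bits
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86400

def suggest(data_tb, link_mbps):
    """Suggest a migration path based on estimated transfer time."""
    if transfer_days(data_tb, link_mbps) <= 7:
        return "network (e.g. Direct Connect or AWS Storage Gateway)"
    if data_tb <= 800:                            # a rack of ~80 TB Snowballs
        return "AWS Snowball / Snowball Edge"
    return "AWS Snowmobile"
```

For example, 100 TB over a 1 Gb/s link works out to roughly 11 days of sustained transfer, which is why a shippable appliance often wins at that scale.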
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR | Amazon Web Services
Customers are migrating their analytics, data processing (ETL), and data science workloads running on Apache Hadoop, Spark, and data warehouse appliances from on-premises deployments to Amazon EMR in order to save costs, increase availability, and improve performance. Amazon EMR is a managed service that lets you process and analyze extremely large data sets using the latest versions of over 15 open-source frameworks in the Apache Hadoop and Spark ecosystems. This session will focus on identifying the components and workflows in your current environment and providing the best practices to migrate these workloads to Amazon EMR. We will explain how to move from HDFS to Amazon S3 as a durable storage layer, and how to lower costs with Amazon EC2 Spot instances and Auto Scaling. Additionally, we will go over common security recommendations and tuning tips to accelerate the time to production.
Join us for an in-depth look at the current state of big data at AWS. Learn about the latest big data trends and industry use cases. Hear how other organizations are using the AWS big data platform to innovate and remain competitive. Take a look at some of the most recent AWS big data developments.
Backup and Recovery with Cloud-Native Deduplication and Use Cases from the Fi... | Amazon Web Services
by Hugh Emberson, CTO, StorReduce
Designing and deploying cloud-enabled backup & recovery solutions often leads to opportunities for reducing storage requirements and increasing efficiencies. Having effective cloud-native deduplication capabilities as part of your backup & recovery strategy can optimize migration, reduce the need for purpose-built backup appliances such as Data Domain and large tape archives, and enable cost reductions of up to 95%. In this session, StorReduce will provide best practices around data deduplication in relation to designing and deploying solutions around backup, archive, and general unstructured file data. They will also demonstrate how using a cloud-native interface with scale-out deduplication enables generic cloud services, like search, inside all backups moved to the cloud. They will guide the audience through two customer use cases from the financial services and healthcare industries.
by PD Dutta, Sr. Product Manager, Object Storage, AWS
We will explain how to design and build an IoT cloud platform on top of Amazon S3. You will review the best practices for architecting a cost-effective, durable, and secure storage solution to store and analyze your IoT data on Amazon S3. In addition, we’ll cover how to collect, ingest, and analyze the data in place using different AWS services such as AWS IoT, Amazon Kinesis, Amazon Athena, and Amazon Redshift Spectrum.
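One concrete detail behind analyzing data in place is key layout: date-partitioned S3 prefixes let Athena and Redshift Spectrum prune partitions instead of scanning everything. A minimal sketch, with an assumed key scheme and device naming:

```python
from datetime import datetime, timezone

def telemetry_key(device_id, ts=None):
    """Build a Hive-style partitioned S3 key for one telemetry record.

    The year=/month=/day= prefix layout lets query engines restrict a
    one-day query to a single S3 prefix instead of the whole bucket.
    """
    ts = ts or datetime.now(timezone.utc)
    return (
        f"telemetry/year={ts:%Y}/month={ts:%m}/day={ts:%d}/"
        f"{device_id}-{ts:%H%M%S}.json"
    )
```

An ingestion path such as Kinesis Firehose can emit objects under this kind of prefix scheme, so the data lands query-ready without a separate load step.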
We will introduce key concepts for a data lake and present aspects related to its implementation. We also discuss critical success factors, pitfalls to avoid, operational aspects, and insights on how AWS enables a serverless data lake architecture.
Speaker: Sebastien Menant, Solutions Architect, Amazon Web Services
A Look Under the Hood – How Amazon.com Uses AWS Services for Analytics at Mas... | Amazon Web Services
Amazon’s consumer business continues to grow, and so does the volume of data and the number and complexity of the analytics done in support of the business. In this session, we talk about how Amazon.com uses AWS technologies to build a scalable environment for data and analytics. We look at how Amazon is evolving the world of data warehousing with a combination of a data lake and parallel, scalable compute engines such as Amazon EMR and Amazon Redshift.
The AWS platform has developed rapidly over the past few years through continuous iteration and innovation. In this session, we provide a high-level overview of the AWS platform and how customers leverage it to create highly available and scalable infrastructure. This session provides the knowledge required to get started with AWS.
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL | Amazon Web Services
Citrix moved large amounts of customer usage data to Amazon Redshift for analytics using Matillion ETL. Initially, Citrix built custom workflows to transform and load the data, but this required more maintenance. Using Matillion, Citrix can now load millions of rows into Redshift in minutes, allowing faster and more granular analysis of user data to optimize their applications. The speed and simplicity of Matillion has increased the efficiency of Citrix's analytics initiatives.
Eugene Kim takes us on a detailed overview of the AWS Cloud and how SAP ERP workloads can be implemented. He discusses instance sizing in terms of SAPS, along with high availability and disaster recovery scenarios. SAP HANA and certified solutions are presented as well.
Compared to storing long-term datasets on premises, archiving in the cloud is a smart alternative whether you’re looking for an active archive solution, tape replacement, or to fulfill a compliance requirement. Learn how AWS customers are simplifying their archiving strategies and meeting compliance needs using Amazon Glacier.
Optimizing Data Management Using AWS Storage and Data Migration Products | AW... | Amazon Web Services
DigitalGlobe, Inc., the world’s leading provider of high-resolution Earth imagery, data, and analysis, is migrating its IT infrastructure, supporting imagery production and storage as well as satellite flight operations, to AWS with plans to close its commercial data centers within four years. DigitalGlobe has utilized AWS Snowmobile to move its 100PB image archive to the cloud. DigitalGlobe built its Geospatial Big Data platform, GBDX, natively on AWS. GBDX utilizes the image archive and combines geospatial big data and analytic tools, partner and customer data and tools, and dynamic cloud compute all in one place. This session will explore cost optimization for data management on AWS, highlighting various storage tiers and data import opportunities. We will focus on cost optimal usage of S3, S3-IA, Glacier, Snowball Edge and Snowmobile – balancing imagery access time with storage costs. Hear how DigitalGlobe utilized some of the newest features of the AWS platform to minimize their costs from storage. Learn More: https://aws.amazon.com/government-education/
This document summarizes a presentation on data lifecycle and storage management techniques for Amazon S3. It discusses lifecycle management rules for transitioning or expiring objects based on age, S3 inventory for listing objects, object tagging for classification and policy filtering, storage class analysis for monitoring usage and optimizing storage, and monitoring tools like CloudWatch and CloudTrail. The presentation provides an overview and best practices for these S3 management features.
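The lifecycle rules described above are expressed as a simple JSON document. Below is a hedged sketch of one such configuration in the boto3 request shape; the bucket, prefix, and day counts are assumptions for illustration, not recommendations.

```python
# Illustrative S3 lifecycle configuration in the shape expected by
# boto3's put_bucket_lifecycle_configuration. Prefix and day counts
# are hypothetical.

lifecycle = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # infrequent access
                {"Days": 90, "StorageClass": "GLACIER"},      # archive
            ],
            "Expiration": {"Days": 365},                      # then delete
        }
    ]
}
# With AWS credentials configured:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=lifecycle)
```

Storage class analysis, also covered in the presentation, is the usual way to pick the day thresholds: it reports how access frequency falls off with object age.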
Overview of AWS Services for Data Storage and Migration - SRV205 - Anaheim AW... | Amazon Web Services
In this session, we explore the features and functions of AWS storage services. We provide context on the portfolio, and we cover the most common use cases for AWS offerings for object, file, block, and migration technologies, including the partner ecosystem. We then describe each service through customer case studies. Expect to leave this session understanding how to select a storage service and start moving workloads or building new ones.
An Overview of AWS Services for Data Storage and Migration - SRV205 - Atlanta... | Amazon Web Services
In this session, we explore the features and functions of AWS storage services. We provide context on the AWS storage portfolio, and we cover the most common use cases for AWS offerings for object, file, block, and migration technologies, including the AWS Partner Network (APN) ecosystem. Then we examine each service, using customer case studies as examples. You gain an understanding of how to select storage and start moving workloads or building new ones.
"Wipro is one of India's largest publicly traded companies and the seventh largest IT services firm in the world. In this session, we showcase the structured methods that Wipro has used in enabling enterprises to take advantage of the cloud. These cover identifying workloads and application profiles that could benefit, re-structuring enterprise application and infrastructure components for migration, rapid and thorough verification and validation, and modifying component monitoring and management.
Several of these methods can be tailored to the individual client or functional context, so specific client examples are presented. We also discuss the enterprise experience of enabling many non-IT functions to benefit from the cloud, such as sales and training. More functions included in the cloud increase the benefit drawn from a cloud-enabled IT landscape.
Session sponsored by Wipro."
講師: Ivan Cheng, Solution Architect, AWS
Join us for a series of introductory and technical sessions on AWS Big Data solutions. Gain a thorough understanding of what Amazon Web Services offers across the big data lifecycle and learn architectural best practices for applying those solutions to your projects.
We will kick off this technical seminar in the morning with an introduction to the AWS Big Data platform, including a discussion of popular use cases and reference architectures. In the afternoon, we will deep dive into Machine Learning and Streaming Analytics. We will then walk everyone through building your first Big Data application with AWS.
This session provides a foundational overview of the AWS storage portfolio, including block, file, object, and cloud data migration services. This session will touch on the significant new offerings, outline some of the most common use cases and prepare you for the individual deep dive sessions, customer sessions and new announcements.
We will cover the core AWS storage services, which include Amazon Simple Storage Service (Amazon S3), Amazon Glacier, Amazon Elastic File System (Amazon EFS), and Amazon Elastic Block Store (Amazon EBS). We also discuss data transfer services such as AWS Snowball, Snowball Edge, and AWS Snowmobile, and hybrid storage solutions such as AWS Storage Gateway.
Building Serverless Web Applications - DevDay Los Angeles 2017Amazon Web Services
The document provides information about building serverless web applications using AWS Lambda and other AWS services. It begins with an overview of serverless computing using AWS Lambda and how it avoids the need to provision and manage servers. It then discusses various AWS compute offerings and when to use EC2, ECS, or Lambda. The rest of the document discusses serverless design patterns, demonstrates building a serverless web application using services like API Gateway and DynamoDB, and how to define and manage serverless applications using the AWS Serverless Application Model (SAM).
AWS Big Data and Analytics Services Speed Innovation | AWS Public Sector Summ...Amazon Web Services
Data-driven agencies face extreme data integration and analytics challenges. Decades of point solutions have solved specific mission problems while creating valuable data stores. However, these data stores are not integrated and are stored in information silos. AWS's powerful data ingestion and integration services now allow agencies to rapidly store more in data lakes for deeper analytics. Join this discussion on how FAA and other agencies have leveraged AWS data integration and analytic services to optimize and innovate with their previously untapped information silos. Learn More: https://aws.amazon.com/government-education/
ENT305 Migrating Your Databases to AWS: Deep Dive on Amazon Relational Databa...Amazon Web Services
Amazon RDS allows you to launch an optimally configured, secure and highly available database with just a few clicks. It provides cost-efficient and resizable capacity, automates time-consuming database administration tasks, and provides you with six familiar database engines to choose from: Amazon Aurora, Oracle, Microsoft SQL Server, PostgreSQL, MySQL and MariaDB. In this session, we will take a close look at the capabilities of Amazon RDS and explain how it works. We’ll also discuss the AWS Database Migration Service and AWS Schema Conversion Tool, which help you migrate databases and data warehouses with minimal downtime from on-premises and cloud environments to Amazon RDS and other Amazon services. Gain your freedom from expensive, proprietary databases while providing your applications with the fast performance, scalability, high availability, and compatibility they need.
AWS offers numerous services to migrate data at a petabyte scale. You can easily move large volumes of data from onsite to the cloud and utilize the cloud as a backup target using data transfer services, such as AWS Snowball, AWS Snowball Edge, or AWS Storage Gateway. Learn about available data migration options and which one is the right fit for your requirements.
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRAmazon Web Services
Customers are migrating their analytics, data processing (ETL), and data science workloads running on Apache Hadoop, Spark, and data warehouse appliances from on-premise deployments to Amazon EMR in order to save costs, increase availability, and improve performance. Amazon EMR is a managed service that lets you process and analyze extremely large data sets using the latest versions of over 15 open-source frameworks in the Apache Hadoop and Spark ecosystems. This session will focus on identifying the components and workflows in your current environment and providing the best practices to migrate these workloads to Amazon EMR. We will explain how to move from HDFS to Amazon S3 as a durable storage layer, and how to lower costs with Amazon EC2 Spot instances and Auto Scaling. Additionally, we will go over common security recommendations and tuning tips to accelerate the time to production.
Join us for an in-depth look at the current state of big data at AWS. Learn about the latest big data trends and industry use cases. Hear how other organizations are using the AWS big data platform to innovate and remain competitive. Take a look at some of the most recent AWS big data developments.
Backup and Recovery with Cloud-Native Deduplication and Use Cases from the Fi...Amazon Web Services
by Hugh Emberson, CTO, StorReduce
Designing and deploying cloud-enabled backup & recovery solutions often leads to opportunities for reducing storage requirements and increasing efficiencies. Having effective cloud-native deduplication capabilities as part of your backup & recovery strategy can optimize migration, decrease the need for purpose built backup appliances like Data Domains, large tape archives, and enable cost reductions of up to 95%. In this session, StorReduce will provide best practices around data deduplication in relation to designing and deploying solutions around backup, archive, and general unstructured file data. They will also demonstrate how using a cloud native interface with scale-out deduplication enables generic cloud services like search inside all backups moved to cloud. They will guide the audience through two customer use cases from the financial services and healthcare industries.
by PD Dutta, Sr. Product Manager, Object Storage, AWS
We will explain how to design and build an IoT cloud platform on top of Amazon S3. You will get to review the best practices for architecting a cost-effective, durable, and secure storage solution to store and analyze your IoT data on Amazon S3. In addition, we’ll cover how to collect, ingest and analyze the data in-place using different AWS Services such as AWS IoT, Amazon Kinesis, Amazon Athena, and Amazon Redshift Spectrum.
We will introduce key concepts for a data lake and present aspects related to its implementation. We also discuss critical success factors, pitfalls to avoid, operational aspects, and insights on how AWS enables a serverless data lake architecture.
Speaker: Sebastien Menant, Solutions Architect, Amazon Web Services
A Look Under the Hood – How Amazon.com Uses AWS Services for Analytics at Mas...Amazon Web Services
Amazon’s consumer business continues to grow, and so does the volume of data and the number and complexity of the analytics done in support of the business. In this session, we talk about how Amazon.com uses AWS technologies to build a scalable environment for data and analytics. We look at how Amazon is evolving the world of data warehousing with a combination of a data lake and parallel, scalable compute engines such as Amazon EMR and Amazon Redshift.
The AWS platform has developed rapidly over the past few years through continuous iteration and innovation. In this session we provide a high-level overview of the AWS platform and how customers leverage it to create highly available and scalable infrastructure. This session provides the required knowledge on how to get started with AWS.
Citrix Moves Data to Amazon Redshift Fast with Matillion ETLAmazon Web Services
Citrix moved large amounts of customer usage data to Amazon Redshift for analytics using Matillion ETL. Initially, Citrix built custom workflows to transform and load the data, but this required more maintenance. Using Matillion, Citrix can now load millions of rows into Redshift in minutes, allowing faster and more granular analysis of user data to optimize their applications. The speed and simplicity of Matillion has increased the efficiency of Citrix's analytics initiatives.
Eugene Kim takes us on a detailed overview of the AWS Cloud, and how SAP ERP workloads can be implemented. He discusses instance sizing in terms of SAPS, High Availability, and Disaster Recovery scenarios. SAP HANA and certified solutions are presented as well.
Compared to storing long-term datasets on-premises, archiving in the cloud is a smart alternative whether you’re looking for an active archive solution, tape replacement, or to fulfill a compliance requirement. Learn how AWS customers are simplifying their archiving strategies and meeting compliance needs using Amazon Glacier.
Optimizing Data Management Using AWS Storage and Data Migration Products | AW...Amazon Web Services
DigitalGlobe, Inc., the world’s leading provider of high-resolution Earth imagery, data, and analysis, is migrating its IT infrastructure, supporting imagery production and storage as well as satellite flight operations, to AWS with plans to close its commercial data centers within four years. DigitalGlobe has utilized AWS Snowmobile to move its 100PB image archive to the cloud. DigitalGlobe built its Geospatial Big Data platform, GBDX, natively on AWS. GBDX utilizes the image archive and combines geospatial big data and analytic tools, partner and customer data and tools, and dynamic cloud compute all in one place. This session will explore cost optimization for data management on AWS, highlighting various storage tiers and data import opportunities. We will focus on cost optimal usage of S3, S3-IA, Glacier, Snowball Edge and Snowmobile – balancing imagery access time with storage costs. Hear how DigitalGlobe utilized some of the newest features of the AWS platform to minimize their costs from storage. Learn More: https://aws.amazon.com/government-education/
This document summarizes a presentation on data lifecycle and storage management techniques for Amazon S3. It discusses lifecycle management rules for transitioning or expiring objects based on age, S3 inventory for listing objects, object tagging for classification and policy filtering, storage class analysis for monitoring usage and optimizing storage, and monitoring tools like CloudWatch and CloudTrail. The presentation provides an overview and best practices for these S3 management features.
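Several of the S3 management features summarized above (age-based lifecycle transitions, tag filters for policy scoping) are expressed as a single lifecycle configuration document. A hedged sketch of that document's shape, as accepted by boto3's `put_bucket_lifecycle_configuration`; the day thresholds and the tag name are illustrative assumptions:

```python
def lifecycle_configuration(ia_days: int = 30, glacier_days: int = 90,
                            expire_days: int = 365) -> dict:
    """Build an S3 lifecycle configuration: transition aging objects to
    Standard-IA, then to Glacier, and finally expire them.

    The dict matches the shape boto3's
    s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=...)
    expects; the thresholds here are illustrative, not prescriptive.
    """
    return {
        "Rules": [
            {
                "ID": "archive-then-expire",
                "Status": "Enabled",
                # A tag filter, as described in the presentation, scopes
                # the rule to objects classified for archiving.
                "Filter": {"Tag": {"Key": "class", "Value": "archive"}},
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_days},
            }
        ]
    }

cfg = lifecycle_configuration()
```

Keeping the rule builder pure makes the policy easy to review and test before it is applied to a bucket.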
Overview of AWS Services for Data Storage and Migration - SRV205 - Anaheim AW...Amazon Web Services
In this session, we explore the features and functions of AWS storage services. We provide context on the portfolio, and we cover the most common use cases for AWS offerings for object, file, block, and migration technologies, including the partner ecosystem. We then describe each service through customer case studies. Expect to leave this session understanding how to select a storage service and start moving workloads or building new ones.
An Overview of AWS Services for Data Storage and Migration - SRV205 - Atlanta...Amazon Web Services
In this session, we explore the features and functions of AWS storage services. We provide context on the AWS storage portfolio, and we cover the most common use cases for AWS offerings for object, file, block, and migration technologies, including the AWS Partner Network (APN) ecosystem. Then we examine each service, using customer case studies as examples. You gain an understanding of how to select storage and start moving workloads or building new ones.
This document provides an overview and summary of AWS storage services that can be used for migrating data to AWS. It discusses AWS Snowball and Snowmobile appliances that can physically move large amounts of data to AWS storage services like S3. It also describes the AWS Storage Gateway, which allows on-premises applications to access AWS storage using standard storage protocols. Additional services covered include Amazon Kinesis Firehose for loading streaming data, AWS Direct Connect for private connectivity, and AWS Migration Hub and Application Discovery Service for discovery and tracking of servers and databases during migration.
Data is gravity. Your workloads and processing are dependent on where your data is and how it is stored. With AWS, you have a host of storage options, and the key to leveraging them successfully is knowing when to use which option. This session explains each of the AWS storage offerings in detail, along with data ingestion options into the cloud using Snowball and Snowmobile.
Marc Trimuschat,
Head - Business Development, AWS Storage, AWS APAC
Eric Durand once again takes us on a journey through storage solutions for digital media, using the AWS Cloud.
This presentation was delivered at AWS Toronto, during the Media and Entertainment Symposium.
Darry Osborne takes us on a journey across the AWS Cloud-based storage solutions. He explains S3, Glacier, Snowball and ends with Snowmobile, petabyte-scale data migration. He also talks about use cases, and customer stories. Presented in Montreal at the AWS Innovate show.
Storage is the clearest requirement for digital media. The AWS Cloud has customized solutions that cater to digital media storage and presents an array of options to ingest, store, and move digital media, using the cloud as a transport and storage mechanism.
Erik Durand, the Principal Business Development Manager for AWS Storage, takes us on this analysis of the options, benefits and characteristics of each one.
Presented during the AWS Media and Entertainment Symposium in Toronto
SRV403 Deep Dive on Object Storage: Amazon S3 and Amazon GlacierAmazon Web Services
In this session, storage experts will walk you through Amazon S3 and Amazon Glacier, bulk data repositories that can deliver 99.999999999% durability and scale past trillions of objects worldwide – with cost points competitive against tape archives. Learn about the different ways you can accelerate data transfer into S3 and get a close look at new tools to secure and manage your data more efficiently. See how Amazon Athena runs serverless analytics on your data and hear about expedited and bulk retrievals from Amazon Glacier. Learn how AWS customers have built solutions that turn their data from a cost into a strategic asset, and bring your toughest questions straight to our experts.
Active Archiving with Amazon S3 and Tiering to Amazon Glacier - March 2017 AW...Amazon Web Services
Most organizations have data that they need to retain but access infrequently, if ever. In cases where this data needs to be accessible at a moment’s notice, it’s hard to save money by moving to archival storage, because access times on these platforms are slower. Now, customers are using Amazon S3 & Glacier for “Active Archiving” to reduce storage costs while maintaining the flexibility of instant access. In this tech talk, we’ll show you how to implement Active Archiving with AWS object storage services, and we’ll provide some real-world examples of how AWS customers are saving money with these capabilities today.
Learning Outcomes:
• Define Active Archiving, and understand how it is different from traditional cold archiving
• Review the cost modeling tools available to determine if Active Archiving is a good fit for your organization
• Learn about best practices for using AWS Object Storage features & functionality to enable Active Archiving
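The cost-modeling outcome above can be sanity-checked with the per-GB prices listed later in this deck ($0.023/GB/mo for S3 Standard, $0.0125 for Standard-IA, $0.004 for Glacier). A back-of-the-envelope sketch, covering storage cost only; retrieval and request fees, which also matter for archiving decisions, are deliberately omitted:

```python
# Published per-GB-month prices from the storage class overview in this deck.
PRICES = {"S3 Standard": 0.023, "S3 Standard-IA": 0.0125, "Glacier": 0.004}

def monthly_cost(gb: float, storage_class: str) -> float:
    """Monthly storage cost in USD (storage only; request and retrieval
    fees are omitted in this sketch)."""
    return gb * PRICES[storage_class]

def archive_savings(gb: float, src: str = "S3 Standard", dst: str = "Glacier") -> float:
    """Fractional savings from moving data between storage classes."""
    return 1 - monthly_cost(gb, dst) / monthly_cost(gb, src)

# 100 TB of cold data moved from S3 Standard to Glacier:
saving = archive_savings(100_000)  # roughly 0.83, i.e. ~83% lower storage cost
```

The same two functions can model intermediate moves (Standard to Standard-IA) to see whether an active-archive tier pays off for a given data set.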
Introduction to Storage on AWS - AWS Summit Cape Town 2017Amazon Web Services
With AWS, you can choose the right storage service for the right use case. This session shows the range of AWS choices that are available to you: Amazon S3, Amazon EBS, Amazon EFS, Amazon Glacier and Cloud Data Migration solutions.
When evaluating and planning a migration of your data from on-premises environments to the cloud, you might encounter physical limitations. Amazon offers a suite of tools to help you surmount these limitations by moving data using networks, roads, and technology partners. In this session, we discuss how to move large amounts of data into and out of the cloud in batches, increments, and streams.
This document discusses various options for migrating large data sets to AWS, including AWS Snowball, Snowball Edge, Snowmobile, Storage Gateway, and S3 Transfer Acceleration. It provides an overview of each solution's key capabilities and features. Example use cases are given to illustrate how customers can use these tools to migrate data to AWS cost effectively and simplify ongoing data transfer processes.
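The physical limitations behind the appliance options above come down to simple arithmetic: at typical WAN speeds, bulk transfers take weeks, which is when shipping a Snowball wins. A quick sanity check with illustrative numbers:

```python
def transfer_days(terabytes: float, mbps: float, utilization: float = 0.8) -> float:
    """Days needed to push a data set over a network link.

    terabytes:   data size in TB (decimal, 1 TB = 8e12 bits)
    mbps:        nominal link speed in megabits per second
    utilization: fraction of the link you can actually sustain
    """
    bits = terabytes * 8e12
    seconds = bits / (mbps * 1e6 * utilization)
    return seconds / 86400  # seconds per day

# 100 TB over a sustained 1 Gbps link at 80% utilization:
days = transfer_days(100, 1000)  # about 11.6 days of continuous transfer
# The same 100 TB fits in two 50 TB-class Snowball appliances shipped in parallel.
```

Plugging in your own link speed and data size is usually enough to decide between network transfer, Transfer Acceleration, and a Snow Family device.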
This document summarizes various AWS storage services that can be used for migrating large data sets to the cloud, including Snowball, Snowball Edge, Snowmobile, Storage Gateway, Amazon S3 Transfer Acceleration, Amazon EFS over AWS Direct Connect, and Kinesis. It provides descriptions of each service and examples of how customers have used them for applications like active archive transport, hybrid storage solutions, data backup and disaster recovery, and streaming data ingestion and analysis. The document also discusses partner solutions that can help with storage tiering and migration by extending on-premises storage capabilities to AWS.
Matt Nowina, AWS Toronto Enterprise Solutions Architect, takes us on an overview of Cloud Storage solutions, data migration solutions and strategies, and practical examples of how to move data to the Cloud.
AWS Data Transfer Services - AWS Storage Gateway, AWS Snowball, AWS Snowball ...Amazon Web Services
AWS offers a suite of tools to help you surmount the limitations associated with data migration from on-premises environments to the cloud. Attend this session to learn about moving data by using networks, roads, and AWS technology partners. We will also discuss how to move data into and out of the cloud in batches, increments, and streams.
AWS re:Invent 2016: Strategic Planning for Long-Term Data Archiving with Amaz...Amazon Web Services
Without careful planning, data management can quickly turn complex, with a runaway cost structure. Enterprise customers are turning to the cloud to solve long-term data archive needs such as reliability, compliance, and agility while optimizing overall cost. Come to this session and hear how AWS customers are using Amazon Glacier to simplify their archiving strategy. Learn how customers architect their cloud archiving applications and share integrations to streamline their organization's data management and establish successful IT best practices.
Learn how AWS customers save money, time and effort by using AWS's backup and archive services. Organizations of all sizes rely on AWS services to durably safeguard their data off-premises at a surprisingly low cost. This session will illustrate backup and archive architectures that AWS customers are benefitting from today.
• Overview of AWS Storage Services including block, file, and object
• AWS data migration tools and approaches
• Description of AWS data migration programs aimed at accelerating your journey to the cloud
As the volume and types of data continue to grow, customers often have valuable data that is not easily discoverable and available for analytics. A common challenge for data engineering teams is architecting a data lake that can cater to the needs of diverse users - from developers to business analysts to data scientists. In this session, dive deep into building a data lake using Amazon S3, Amazon Kinesis, Amazon Athena and AWS Glue. Learn how AWS Glue crawlers can automatically discover your data, extracting and cataloguing relevant metadata to reduce operations in preparing your data for downstream consumers.
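The crawler workflow described above is driven by a small request payload naming the data location, an IAM role, and a target catalog database. A sketch of that payload as it would be passed to boto3's `glue.create_crawler`; the crawler name, role ARN, bucket, and schedule are placeholder assumptions:

```python
def crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Build the argument dict for glue.create_crawler(**crawler_request(...)).

    The crawler scans s3_path, infers schemas, and writes table
    definitions into the named Glue Data Catalog database, making the
    data queryable from Athena without hand-written DDL.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        # Re-crawl daily at 02:00 UTC to pick up new partitions
        # (the cron expression is illustrative).
        "Schedule": "cron(0 2 * * ? *)",
    }

req = crawler_request(
    "datalake-crawler",                       # hypothetical names
    "arn:aws:iam::123456789012:role/GlueRole",
    "datalake_db",
    "s3://example-datalake/raw/",
)
```

Because the function only builds the request, it can be validated in tests before the real `create_crawler` call is made.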
Similar to An Overview of AWS services for Data Storage and Migration - SRV205 - Toronto AWS Summit
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
Forecasting is an important process for many companies and is used in many areas to accurately predict the growth and distribution of a product, the resources needed on production lines, financial projections, and much more. Amazon uses advanced forecasting techniques, and some of these services have been made available to all AWS customers.
In this session we show how to pre-process data containing a temporal component and then use an algorithm that, starting from the type of data analyzed, produces an accurate forecast.
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
The variety and volume of data created every day is accelerating ever faster and represents a unique opportunity to innovate and create new startups.
However, managing large amounts of data can seem complex: building large-scale big data clusters appears to be an investment accessible only to established companies. But the elasticity of the cloud, and serverless services in particular, lets us break through these limits.
Let's see, then, how it is possible to develop big data applications quickly, without worrying about infrastructure, devoting all our resources to developing our ideas to create innovative products.
You can now use Amazon Elastic Kubernetes Service (EKS) to run Kubernetes pods on AWS Fargate, the serverless compute engine built for containers on AWS. This makes it easier than ever to build and run your Kubernetes applications in the AWS cloud. In this session we present the main features of the service and how to deploy your application in a few steps.
Twenty years ago, Amazon went through a radical transformation aimed at increasing the pace of innovation. Over this period we learned how changing our approach to application development allowed us to greatly increase agility and release velocity and, ultimately, enabled us to build more reliable and scalable applications. In this session we describe how we define modern applications and how building modern apps affects not only application architecture but also organizational structure, development release pipelines, and even the operating model. We also describe common approaches to modernization, including the one used by Amazon.com itself.
How to spend up to 90% less with containers and Spot InstancesAmazon Web Services
Container usage continues to grow.
When properly designed, container-based applications are very often stateless and flexible.
AWS ECS, EKS, and Kubernetes on EC2 can take advantage of Spot Instances, leading to average savings of 70% compared to On-Demand Instances. In this session we explore the characteristics of Spot Instances and how they can easily be used on AWS. We will also learn how Spreaker uses Spot Instances to run applications of various kinds, in production, at a fraction of the on-demand cost!
In recent months, many customers have been asking us the question – how to monetise Open APIs, simplify Fintech integrations, and accelerate adoption of various Open Banking business models. Therefore, AWS and FinConecta would like to invite you to the Open Finance marketplace presentation on October 20th.
Event Agenda :
Open banking so far (short recap)
• PSD2, OB UK, OB Australia, OB LATAM, OB Israel
Intro to Open Finance marketplace
• Scope
• Features
• Tech overview and Demo
The role of the Cloud
The Future of APIs
• Complying with regulation
• Monetizing data / APIs
• Business models
• Time to market
One platform for all: a Strategic approach
Q&A
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
To create value and build a differentiated, recognizable offering, successful startups know how to combine established technologies with innovative, purpose-built components.
AWS provides ready-to-use services and, at the same time, lets you customize and create the differentiating elements of your own offering.
Focusing on machine learning technologies, we will see how to select the artificial intelligence services offered by AWS and, with the help of a demo, how to build custom machine learning models using SageMaker Studio.
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
With the traditional approach to IT, implementing DevOps techniques was difficult for many years; they often involved manual activities, occasionally leading to application downtime that interrupted users' work. With the advent of the cloud, DevOps techniques are now within everyone's reach at low cost for any kind of workload, ensuring greater system reliability and delivering significant improvements in business continuity.
AWS provides AWS OpsWorks as a configuration management tool that aims to automate and simplify the management and deployment of EC2 instances through Chef and Puppet workloads.
Learn how to use AWS OpsWorks to guarantee the reliability of your application running on EC2 instances.
Microsoft Active Directory on AWS to support your Windows WorkloadsAmazon Web Services
Do you want to know the options for running Microsoft Active Directory on AWS? When moving Microsoft workloads to AWS, it is important to consider how to deploy Microsoft Active Directory to support group policy management, authentication, and authorization. In this session, we discuss options for deploying Microsoft Active Directory on AWS, including AWS Directory Service for Microsoft Active Directory and running Active Directory on Windows on Amazon Elastic Compute Cloud (Amazon EC2). We cover topics such as integrating your on-premises Microsoft Active Directory environment into the cloud and using SaaS applications, such as Office 365, with AWS Single Sign-On.
From facial recognition to detecting fraud or manufacturing defects, image and video analysis leveraging artificial intelligence techniques is evolving and being refined at a rapid pace. In this webinar we explore the possibilities offered by AWS services for applying state-of-the-art computer vision techniques to real-world scenarios.
Amazon Web Services and VMware are hosting a free virtual event on Wednesday, October 14 from 12:00 to 13:00 dedicated to VMware Cloud™ on AWS, the on-demand service that lets you run applications in cloud environments based on VMware vSphere® and access a wide range of AWS services, fully exploiting the potential of the AWS cloud while protecting existing VMware investments.
Build your first serverless ledger-based app with QLDB and NodeJSAmazon Web Services
Many companies today build applications with ledger-style functionality, for example to verify the history of credits and debits in banking transactions, or to track the supply chain flow of their products.
At the heart of these solutions are ledger databases, which provide a transparent, immutable, and cryptographically verifiable transaction log, but they are complex and costly tools to manage.
Amazon QLDB eliminates the need to build custom, complex systems by providing a fully managed serverless ledger database.
In this session we will see how to build a complete serverless application that uses QLDB's capabilities.
With the rise of microservice architectures and rich mobile and web applications, APIs are more important than ever for delivering an exceptional user experience to end users. In this session we learn how to address modern API design challenges with GraphQL, an open-source API query language used by Facebook, Amazon, and others, and how to use AWS AppSync, a managed serverless GraphQL service on AWS. We dive into several scenarios, understanding how AppSync can help solve these use cases by building modern APIs with real-time and offline data update capabilities.
We also learn how Sky Italia uses AWS AppSync to deliver real-time sports updates to users of its web portal.
Oracle databases and VMware Cloud™ on AWS: myths debunkedAmazon Web Services
Many organizations take advantage of the cloud by migrating their Oracle workloads, securing significant benefits in agility and cost efficiency.
Migrating these workloads can create complexity during application modernization and refactoring, and performance risks can be introduced when moving applications out of local data centers.
In these slides, AWS and VMware experts present simple, practical measures to facilitate and simplify the migration of Oracle workloads, accelerating the transformation toward the cloud; they explore the architecture in depth and demonstrate how to fully exploit the potential of VMware Cloud™ on AWS.
1) The document discusses building a minimum viable product (MVP) using Amazon Web Services (AWS).
2) It provides an example of an MVP for an omni-channel messenger platform, built starting in 2017, that connects ecommerce stores to customers via web chat, Facebook Messenger, WhatsApp, and other channels.
3) The founder discusses how they started with an MVP in 2017 with 200 ecommerce stores in Hong Kong and Taiwan, and have since expanded to over 5000 clients across Southeast Asia using AWS for scaling.
This document discusses pitch decks and fundraising materials. It explains that venture capitalists will typically spend only 3 minutes and 44 seconds reviewing a pitch deck. Therefore, the deck needs to tell a compelling story to grab their attention. It also provides tips on tailoring different types of decks for different purposes, such as creating a concise 1-2 page teaser, a presentation deck for pitching in-person, and a more detailed read-only or fundraising deck. The document stresses the importance of including key information like the problem, solution, product, traction, market size, plans, team, and ask.
This document discusses building serverless web applications using AWS services like API Gateway, Lambda, DynamoDB, S3 and Amplify. It provides an overview of each service and how they can work together to create a scalable, secure and cost-effective serverless application stack without having to manage servers or infrastructure. Key services covered include API Gateway for hosting APIs, Lambda for backend logic, DynamoDB for database needs, S3 for static content, and Amplify for frontend hosting and continuous deployment.
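The API Gateway + Lambda portion of the stack described above reduces to a handler function that receives a proxy-integration event and returns a status/body dict. A minimal, framework-free sketch; the route, the query parameter, and the response payload are illustrative, and the DynamoDB lookup the document mentions is stubbed out:

```python
import json

def handler(event, context=None):
    """Minimal AWS Lambda handler for an API Gateway proxy integration.

    API Gateway passes the HTTP request as `event`; the returned dict
    (statusCode, headers, body) is mapped back onto the HTTP response.
    In a real stack the body would come from DynamoDB; here it is stubbed.
    """
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }

resp = handler({"queryStringParameters": {"name": "AWS"}})
```

Because the handler is a plain function, it can be unit-tested locally before being deployed behind API Gateway.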
This document provides tips for fundraising from startup founders Roland Yau and Sze Lok Chan. It discusses generating competition to create urgency for investors, fundraising in parallel rather than sequentially, having a clear fundraising narrative focused on what you do and why it's compelling, and prioritizing relationships with people over firms. It also notes how the pandemic has changed fundraising, with examples of deals done virtually during this time. The tips emphasize being fully prepared before fundraising and cultivating connections with investors in advance.
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
This document discusses Amazon's machine learning services for building conversational interfaces and extracting insights from unstructured text and audio. It describes Amazon Lex for creating chatbots, Amazon Comprehend for natural language processing tasks like entity extraction and sentiment analysis, and how they can be used together for applications like intelligent call centers and content analysis. Pre-trained APIs simplify adding machine learning to apps without requiring ML expertise.
Amazon Elastic Container Service (Amazon ECS) is a highly scalable container management service that simplifies the management of Docker containers through an orchestration layer controlling deployment and lifecycle. In this session we present the service's main features, reference architectures for different workloads, and the simple steps needed to quickly migrate one or more of your containers.
2. 163 ZB by 2025
- IDC
This is 10x the amount of data generated in 2016.
1 Zettabyte = 10^21 = 1 000 000 000 000 000 000 000 bytes
Data growth is not slowing
3. Data as a flywheel for innovation
• Build or migrate an application
• Deliver new insights (data lakes, analytics)
• Accelerate innovation (active archive, IoT, artificial intelligence)
• Realize benefits (cost, management, scale)
The cycle revolves around DATA.
5. Common storage workloads on AWS
• Compliance: industry certifications; lockable with audit trails; secure
• Enterprise applications: easier lift-and-shift migrations; integrated with major vendors; fully managed infrastructure
• Active archive: media workflows; tape replacement; public sector, FinServ, healthcare/life sciences
• Databases & analytics: tailored database or Hadoop workloads; bespoke database lift-and-shift projects
• Backup & restore: non-disruptive; easy place to start; integrated with all major vendors
• Data lakes & IoT: 400% faster queries; built for streaming data; optional data visualization
6. Why AWS Storage?
• The best reliability and largest scale
• The most complete portfolio
• The most data movement choices
• The most comprehensive support and consulting
• More than twice the partners
• The most secure, compliant, and auditable
7. Storage with analytics capabilities
“…AWS has made considerable efforts to make S3 smarter for analytics workloads by building query capabilities into the storage layer itself.”
- Gartner’s Critical Capabilities for Public Cloud Storage Services, Worldwide; Raj Bala, Julie Palmer, August 27, 2018
Amazon S3 holds trillions of objects and regularly peaks at millions of requests per second.
[Chart: objects stored in S3, growing over time]
8. Twice as many partnerships
Partner solutions span primary storage, backup & restore, archive, disaster recovery, analytics, and enterprise applications.
Complete partner list at https://aws.amazon.com/backup-recovery/partner-solutions/
9. AWS storage services
Data movement: AWS Snow Family, AWS Storage Gateway, AWS Direct Connect, Amazon EFS File Sync, Amazon S3 Transfer Acceleration, third-party applications, Amazon Kinesis Firehose
Data security and management: AWS KMS, AWS IAM, Amazon CloudWatch, AWS CloudTrail, AWS CloudFormation, AWS Lambda, Amazon Macie, Amazon QuickSight
Storage: Amazon EFS, Amazon EBS, Amazon S3, Amazon Glacier
10. Getting started moving data
• AWS Direct Connect: a private connection between your data center, office, or colocation environment and AWS
• AWS Snow Family (Snowball, Snowball Edge, and Snowmobile): secure, physical transport appliances that can pre-process and move up to exabytes of data into and out of AWS
• AWS Storage Gateways: hybrid storage that seamlessly connects on-premises applications to AWS storage; ideal for backup, DR, bursting, tiering, or migration
• Amazon Kinesis Firehose: capture, transform, and load streaming data into Amazon S3 for use with Amazon business intelligence and analytics tools
• Amazon EFS File Sync: up to 5x faster file transfers than open-source tools; ideal for migrating data into Amazon EFS or moving between cloud file systems
• Amazon S3 Transfer Acceleration: up to 300% faster transfers into and out of Amazon S3; ideal when working with long geographic distances
• APN Competency Partners: integrations between third-party vendors and AWS services; ideal for leveraging existing software licenses and skills
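Of the options above, S3 Transfer Acceleration needs only a client-side switch: requests are routed to the nearest edge location. With boto3 the switch is a botocore `Config` flag; the sketch below builds that setting as a plain dict so it stays dependency-free (the boto3 usage is shown only in comments):

```python
# S3 Transfer Acceleration is enabled per bucket, after which clients opt in.
# The dict below is exactly what botocore's Config(s3=...) receives.

def accelerated_s3_config() -> dict:
    """Client-side settings that route S3 requests through the
    s3-accelerate endpoint (the nearest edge location)."""
    return {"s3": {"use_accelerate_endpoint": True}}

# Usage with boto3 (kept as a comment so the sketch has no dependencies):
#   from botocore.config import Config
#   s3 = boto3.client("s3", config=Config(**accelerated_s3_config()))
cfg = accelerated_s3_config()
```

The flag only changes the endpoint the client talks to; bucket names, keys, and the rest of the S3 API are unchanged.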
11. Amazon S3
Collect, store, analyze:
• More than a decade of experience and continuous innovation
• Multiple storage classes and integrated lifecycle management
• Reporting on object metadata, compliance, and usage with S3 Inventory
• Multiple storage ingestion options and partner integrations
• Multiple encryption options, security integrations, and compliance
• Increase data access performance by up to 400% with S3 Select
• Query data in place with Amazon Athena and Amazon Redshift Spectrum
• Optimize storage utilization with S3 Analytics
Built for: backup & restore, data lakes & analytics, cloud-native applications
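S3 Select, mentioned above, pushes a SQL projection and filter down into S3 so only the matching bytes leave the storage layer. The boto3 call is `select_object_content`; here is a sketch of its parameters, with the bucket, key, and columns as placeholder assumptions:

```python
def select_params(bucket: str, key: str, sql: str) -> dict:
    """Build the argument dict for s3.select_object_content(**select_params(...)).

    S3 runs the SQL expression server-side against a CSV object and
    streams back only the matching rows, which is where the up-to-400%
    speedup cited in the deck comes from.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": sql,
        # Treat the first CSV line as a header so columns can be
        # referenced by name in the SQL expression.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"JSON": {}},
    }

params = select_params(
    "example-bucket", "logs/2018/requests.csv",  # hypothetical object
    "SELECT s.ip, s.status FROM s3object s WHERE s.status = '500'",
)
```

The response from the real call is an event stream of `Records` payloads, which the caller concatenates to obtain the result rows.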
12. Amazon Glacier
Cost-effective, secure, durable:
• Certifications supporting compliance requirements for virtually every regulatory agency
• Locking, encryption, audit, and alerting tools to prevent tampering
• Built on AWS’s global infrastructure
• Withstands multiple facility failures
• Replication options across global regions
• Designed for archives and backup
• Expedited retrievals in minutes, bulk retrievals in hours
• Opens archives to analytics applications with Glacier Select
Built for: active archive, tape replacement, regulatory compliance
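The expedited and bulk retrievals mentioned above are requested through S3's `restore_object` call, where the `Tier` field selects the speed/cost trade-off. A sketch of that request; the bucket and key names are placeholders:

```python
VALID_TIERS = {"Expedited", "Standard", "Bulk"}  # minutes / hours / cheapest

def restore_request(bucket: str, key: str, days: int, tier: str) -> dict:
    """Build the argument dict for s3.restore_object(**restore_request(...)).

    `days` is how long the temporary restored copy stays available in S3;
    `tier` picks the retrieval class: Expedited (minutes), Standard, or
    Bulk (hours, lowest cost), matching the options described in the deck.
    """
    if tier not in VALID_TIERS:
        raise ValueError(f"unknown retrieval tier: {tier}")
    return {
        "Bucket": bucket,
        "Key": key,
        "RestoreRequest": {"Days": days, "GlacierJobParameters": {"Tier": tier}},
    }

req = restore_request("example-archive", "backups/2017.tar", 7, "Bulk")
```

After the restore job completes, the object is read with an ordinary `get_object` until the temporary copy expires.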
13. Object lifecycle management
Automated lifecycle policies move objects between storage classes:
• S3 Standard: active data; millisecond access; $0.023/GB/mo
• S3 Standard - Infrequent Access: infrequently accessed data; millisecond access; $0.0125/GB/mo
• S3 One Zone - Infrequent Access: infrequently accessed data kept in a single Availability Zone; millisecond access; $0.01/GB/mo
• Glacier: archive data; access in minutes to hours; $0.004/GB/mo
14. S3 & Glacier data durability
S3 Standard, S3 - IA, S3 One Zone - IA, and Glacier are all designed for 99.999999999% (eleven nines) durability.
By contrast, a traditional model with two copies in one site is typically designed for about 99.99% durability, and a traditional model with copies in two sites for about 99.999% durability.
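The eleven-nines figure above translates into an expected-loss calculation: with annual per-object durability d, the expected number of objects lost from a fleet of n is n·(1 - d), assuming independent losses (the standard way the figure is interpreted; the fleet size below is illustrative):

```python
def expected_annual_loss(objects: int, durability: float = 0.99999999999) -> float:
    """Expected number of objects lost per year, assuming independent
    per-object annual durability (the usual reading of the 11-nines figure)."""
    return objects * (1 - durability)

# Storing 10 million objects at eleven nines of durability:
loss = expected_annual_loss(10_000_000)
# About 0.0001 objects per year, i.e. on average one lost object
# every ~10,000 years.
```

The same arithmetic applied to the traditional 99.99% model gives about 1,000 expected losses per year for the same fleet, which is the comparison the slide is making.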
15. Object storage availability & durability
[Diagram: an AWS Region contains multiple Availability Zones. The slide contrasts this correct model (“This:”) with incorrect mental models (“Not this:” and “Or worse, this:”) that treat a “Region” or “Zones” as single, undifferentiated units.]
16. AWS Storage Gateway family
Cost-effective
Hybrid
Cloud
Integrated
• Connects on-premises applications to AWS storage services
• Low-latency access with local caching and cloud scalability
• Optimized, fully managed data transfer between appliance and AWS
• Standard storage protocols for file, block & tape
• Stores data in Amazon S3, Amazon Glacier and EBS Snapshots
• AWS-native: Simplifying management, monitoring and automation
• Reduces on-premises storage and backup systems and management
• Unlocks cloud economics for data storage, processing and recovery
Built for: Backup & archiving · File storage for apps & content · Hybrid workloads
17. Canada’s largest biotech firm
Data sovereignty required local hot files
and tape archives in each of 10 global offices
• AWS Volume Gateway eliminated 50-hour
backup windows and tape archive systems
• Cut on-premises storage CAPEX 40%; dropped
RTO from 48 hours to 10 minutes
• Meets cloud strategy while retaining local
ownership and data sovereignty
• Enabled a data center exit within the next 12 months
“It made no sense to keep buying
big disk silos, especially as we opened up
new global offices, and now we can
recover in the cloud from a snapshot if we
ever had to.”
- Adam Leggett
IT Manager
Hybrid Cloud Storage & Restore
18. On-premises infrastructure took weeks to produce
customer content
Needed performant, secure,
economical media distribution solution
• Workflow pipelines are now highly parallel and
elastic
• New one-hour content delivery SLA
• Fully migrating away from on-premises
infrastructure and its cost model
Active Archive
“We have 20 petabytes of content on AWS, the
equivalent of more than 800,000 hours of video,
available on our platform. We can only move all
that content around the world with the
scalability we’re getting on the AWS Cloud.”
- Andy Shenkler
Chief Solutions and Technology Officer
19. Threat analysis company ingesting
and analyzing 50 TB daily
Right-sizing clusters took weeks and caused data loss
• Saved 95% through re-architecting to a “hot” index
on Amazon EBS with an analytics data lake on
Amazon S3
• Amazon EBS shortened indexing times from weeks
to hours while cutting OPEX
• Now getting consistent 1–3 sec. search response
times across 5 PB of growing data in Amazon S3
• Managing 1 billion Amazon S3 objects and 2,500
instances with just six engineers
“AWS storage completely changed our business
operations, time to market and manpower. EBS
volumes cut our cluster indexing times from weeks to
hours. Moving data into Amazon S3 saved us 95% and
our data lake now outperforms our clusters—the
harder we push it the faster it gets for extremely large
datasets. We simply could not do this anywhere else.”
- Gene Stevens
CTO and Cofounder
Amazon S3 Data Lake
Data Lakes
20. Security-as-a-Service for 4000 customers
using 25 PB and growing 110% per year
Colocation not agile enough or cost effective
• Built an Amazon S3 data lake and avoided $1.6M
CAPEX in the first year alone
• Stress-tested 100x larger load with zero CAPEX
• 4x better “I/O per $” ratio
• Gained new insights into their customers through
Amazon S3 data management capabilities
• No 40-Gbps network infrastructure worries
“AWS storage is fully redundant, multi-Region, more secure, and faster at less than half the cost.”
- Paul Fisher
Technical Fellow
Amazon S3 Data Lake
Data Lakes
21. Proofpoint controls and enforces policy on over 1M daily
social media posts for corporate customers
Needed to integrate this social media content
into their regulated email archive solution
• Built fully compliant archive/purge workflows
using Amazon S3 and Amazon Glacier
• Created a compliant two-step legal hold with
vault-level tags and Glacier Vault Lock
“What would it have cost us to build
a WORM data store,
get it certified for SEC Rule 17a-4(f)
and CFTC Rule 1.31(b)-(c),
and then scale it?”
- Rich Sutton
VP of Engineering
Regulatory Compliance
22. Amazon EBS
Performant · Persistent · Reliable
• Dedicated, detachable volumes for EC2 instances
• Helps customers manage compute and storage separately
• Highly secure Multi-AZ design
• Built-in backup options
• Performance options to fit most workloads
• Optimized for latency, throughput, or cost
• Elastic volumes expand capacity on the fly
Built for: Hadoop/Amazon EMR, relational and NoSQL databases, log processing, and data warehousing
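The "elastic volumes expand capacity on the fly" bullet refers to the `ModifyVolume` API, which resizes or retypes a volume while it stays attached. A sketch with a hypothetical volume ID and illustrative sizes; the live call is shown only as a comment.

```python
# Sketch of an Elastic Volumes resize. Volume ID, size, and IOPS
# values are hypothetical.
def grow_volume_params(volume_id, new_size_gib, volume_type=None, iops=None):
    """Build kwargs for boto3's ec2.modify_volume (online, no detach)."""
    params = {"VolumeId": volume_id, "Size": new_size_gib}
    if volume_type:
        params["VolumeType"] = volume_type  # e.g. switch "gp2" -> "io1"
    if iops:
        params["Iops"] = iops               # only for provisioned-IOPS types
    return params

params = grow_volume_params("vol-0123456789abcdef0", 500, "io1", 10000)
# A client would then call:
#   ec2 = boto3.client("ec2")
#   ec2.modify_volume(**params)
# followed by growing the filesystem inside the instance (e.g. resize2fs).
```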
23. Databases and analytics
Global broadband service operator processing 17 TB
of daily device data streams at 200 MB/s
Modifying Kafka clusters required an
8-hour resync every time
• Moved from instance stores to EBS volumes
• Cut storage costs by 25%
• Cut production cluster node count by 33%
• Dropped resync times to 20 minutes
“Our AWS service use is about making the
necessary easy. Storage should be as boring
as possible—it should just work. Amazon
EBS makes it trivial to do things that were
impractical before, driving experimentation,
creativity, and faster delivery.”
- Daniel Woodlins
Software Engineer
24. Amazon EFS
Scalable · Simple · Elastic
• Share files between EC2 instances in minutes
• True file system interface with file system semantics
• Fully managed – no capacity planning surprises
• Pay-as-you-go consumption and pricing
• Automatically grows and shrinks
• Much lower TCO than DIY or third-party workarounds
• Consistent performance even as data grows
Built for: web serving, content management, media and entertainment workflows, home directories, container storage, big data, and analytics
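For context on the "true file system interface" point: an EFS file system is mounted over NFSv4.1 from an EC2 instance. A sketch that assembles the mount command with the options AWS documents for EFS; the file system ID, region, and mount point are hypothetical.

```python
# Sketch assembling the NFSv4.1 mount command for an EFS file system.
# File system ID, region, and mount point are hypothetical.
def efs_mount_command(fs_id, region, mount_point="/mnt/efs"):
    """Build the shell command to mount EFS with AWS's documented options."""
    dns_name = f"{fs_id}.efs.{region}.amazonaws.com"
    options = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2"
    return f"sudo mount -t nfs4 -o {options} {dns_name}:/ {mount_point}"

cmd = efs_mount_command("fs-12345678", "us-east-1")
```

Once mounted, every instance in the VPC sees the same POSIX file tree, which is what makes the two-hour lift-and-shift in the next case study plausible.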
25. Newly acquired streaming media product depended
on a local file server
Had to launch at global scale in 90 days – with
minimal changes
• DIY was too complex and took too long
• Lift-and-shift to Amazon EFS took 2 hours
• EFS with EC2 autoscaling met global scale agility
needs
• Seamless integration between partner
application and existing AWS systems
• Post-mortem TCO analysis showed that EFS was
still the best choice
Enterprise applications
“Good, fast, and cheap. We picked two and got
all three with Amazon EFS. It gave us the agility
to deliver a new product on schedule, eliminated
scale and performance concerns, and operates
below our
OPEX expectations.”
- Chris DeAcosta
Sr. Director Software Engineering
26. “Prior to Amazon EFS, we experienced timeouts for
up to 10% of uploads over 100 MB. Now, all of the
JFrog build artifacts (from infrastructure-as-code
components to Docker images) are in one place,
and we’ve increased large file transfer speeds by
38%.”
- Suresh Prem, Murty Chitti,
and Rajesh Sivaraman
System Engineers
Enterprise applications
Builds 3D digital maps relying on 28 TB of
waypoints generated daily
Unreliable on-premises repository and
high maintenance DIY cloud version
• Amazon EFS dropped infrastructure provisioning
time from 90 days to 7
• Now handling 800,000 daily file transfers up to
38% faster with zero failures
• Seamless JFrog workflow integration
• Gained high availability at no extra cost
• Also tiering JFrog backups into Amazon S3 and
Amazon Glacier
27. AWS Snow family
Cost-effective · Secure · Portable
• Certifications supporting nearly any regulatory compliance program
• Encryption, audit and alerting tools to flag tampering
• Ruggedized for capturing data in remote locations
• Optional computing power (Snowball Edge models)
• No networking modifications required
• Automatic return shipping labels
• Integrates with existing protocols and workflows
Built for: Populating data lakes · Media migration · Edge computing
28. “The AWS solution with Storage Gateway helps us
reduce backend service costs; our estimated costs for
this new hybrid cloud service are one tenth of the
prior on-premises infrastructure costs. As we pass
those savings on to our customers, we become
significantly more competitive in the energy-
exploration and production market.”
- William Rivera
Manager of Global Cloud Operations
Landmark provides technology solutions for oil & gas
exploration and production
Clients use Landmark to store & manage TBs of data
for energy discovery research
• Initial bulk data transfer to Amazon S3 done with
AWS Snowball
• Created hybrid storage infrastructure with AWS
Storage Gateway – File Gateway
• Automated operations on the stored data with
AWS Lambda functions
• Able to query and organize data according to a
specific file structure
Hybrid Cloud Storage & Active Archive
29. Halliburton Landmark Use Case
Halliburton data center: PetroBank application servers, LTO tape, and NAS, connected to AWS over AWS Direct Connect
AWS side: File Gateway fronting Amazon S3 Standard, S3 - Infrequent Access, and Amazon Glacier
1. Use AWS Snowball to ship data from on-premises offline archives
2. Online access to all data through AWS Storage Gateway - File Gateway
Minimal on-premises storage reduces cost
Time-to-data reduced by days or weeks
30. Let’s get started: which is top of mind?
• Compliance: industry certifications; lockable with audit trails; secure
• Enterprise applications: easier lift-and-shift migrations; integrated with major vendors; fully managed infrastructure
• Active archive: media workflows; tape replacement; public sector, FinServ, healthcare/life sciences
• Databases & analytics: tailored database or Hadoop workloads; bespoke database lift-and-shift projects
• Backup & restore: non-disruptive; easy place to start; integrated with all major vendors
• Data lakes & IoT: 400% faster queries; built for streaming data; optional data visualization