Cloud storage is a critical component of cloud computing: it holds the information that applications use. Big data analytics, data warehouses, the Internet of Things, databases, and backup and archive applications all depend on some form of data storage architecture. Cloud storage is generally more reliable, scalable, and secure than traditional on-premises storage systems.
1. Storage Services on AWS
Henry Alvarado
Solutions Architect, AWS Colombia
gomhenry@amazon.com
Experience Day, Cali
2. Block vs File vs Object
Block Storage
Raw Storage
Data organized as an array of unrelated blocks
Host File System places data on disk
e.g.: Microsoft NTFS, Unix ZFS
File Storage
Unrelated data blocks managed by a file (serving) system
Native file system places data on disk
Object Storage
Stores Virtual containers that encapsulate the data, data attributes, metadata and Object IDs
API Access to data
Metadata Driven, Policy-based, etc
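To make the object storage model above concrete, here is a minimal in-memory sketch in Python. The names `StoredObject` and `ToyObjectStore` are hypothetical and only illustrate the idea of an object ID, data, and metadata bundled together behind an API; this is not how any AWS service is implemented.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: an ID, the raw bytes, and free-form metadata."""
    object_id: str
    data: bytes
    metadata: dict = field(default_factory=dict)


class ToyObjectStore:
    """Hypothetical in-memory object store; all access goes through its API."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        """Store the bytes with their metadata and return a new object ID."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = StoredObject(object_id, data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> StoredObject:
        """Fetch an object by ID (there are no block addresses or file paths)."""
        return self._objects[object_id]

    def find(self, **criteria) -> list:
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            oid for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]
```

A caller never sees blocks or a directory tree: it stores bytes, gets an ID back, and can later select objects by metadata, which is what "metadata driven" refers to.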
3. Storage - Characteristics
Durability: Measure of expected data loss
Availability: Measure of expected downtime
Security: Security measures in place
Cost: Amount per storage unit, e.g. $ / GB
Scalability: Upward flexibility
Performance: Performance metrics
Integration: Ability to interact with
Some of the ways we look at storage
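Durability and availability are usually quoted as percentages, and a little arithmetic shows what they mean in practice. The sketch below assumes that a durability of N nines is an annual, per-object loss probability of 10^-N, and that availability applies uniformly over a 365-day year; both are simplifying assumptions, and the function names are illustrative.

```python
def expected_annual_object_loss(object_count: int, nines: int) -> float:
    """Expected number of objects lost per year at the given durability.

    Assumes durability of `nines` nines means an annual per-object loss
    probability of 10 ** -nines (for example, 11 nines -> 1e-11).
    """
    loss_probability = 10.0 ** -nines
    return object_count * loss_probability


def expected_annual_downtime_minutes(availability_percent: float) -> float:
    """Expected minutes of downtime per 365-day year at the given availability."""
    minutes_per_year = 365 * 24 * 60
    return minutes_per_year * (1 - availability_percent / 100)
```

Under these assumptions, 10 million objects at 11 nines of durability give an expected loss of about 0.0001 objects per year, and 99.99% availability corresponds to roughly 53 minutes of downtime per year.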
4. AWS has a variety of storage options
Amazon EBS (Elastic Block Store)
Amazon Elastic File System (EFS)
Amazon S3 (Simple Storage Service)
Amazon Glacier
AWS Storage Gateway
AWS Snowball & Snowball Edge
AWS Snowmobile
5. Amazon EBS
• Persistent block level storage for EC2
• Pay only for what you provision
• Native redundancy and write cache
• Consistent and low-latency performance
• Optimized for random I/O
• Native support for encryption at rest (data volumes)
6. Amazon EBS Features
Durable
• Designed for five 9’s reliability
• Redundant storage across multiple devices within an AZ
Secure
• Identity and Access Policies
• Encryption
Scalable
• Unlimited capacity when you need it
• Easily scale up and down
Performance
• Low-latency SSD
• Consistent I/O Performance
• Stripe multiple volumes for higher I/O performance
Backup
• Point-in-time Snapshots
• Copy snapshots across AZs and Regions
7. EBS Volume Types

Solid-State Drives (SSD)

General Purpose SSD (gp2)*
– Description: General purpose SSD volume that balances price and performance for a wide variety of transactional workloads
– Use cases: Recommended for most workloads; system boot volumes; virtual desktops; low-latency interactive apps; dev and test environments
– Volume size: 1 GiB - 16 TiB
– Max. IOPS**/volume: 10,000
– Max. throughput/volume†: 160 MiB/s
– Dominant performance attribute: IOPS

Provisioned IOPS SSD (io1)
– Description: Highest-performance SSD volume designed for mission-critical applications
– Use cases: Critical business applications that require sustained IOPS performance, or more than 10,000 IOPS or 160 MiB/s of throughput per volume; large database workloads
– Volume size: 4 GiB - 16 TiB
– Max. IOPS**/volume: 20,000
– Max. throughput/volume†: 320 MiB/s
– Dominant performance attribute: IOPS

Hard Disk Drives (HDD)

Throughput Optimized HDD (st1)
– Description: Low cost HDD volume designed for frequently accessed, throughput-intensive workloads
– Use cases: Streaming workloads requiring consistent, fast throughput at a low price; big data; data warehouses; log processing
– Cannot be a boot volume
– Volume size: 500 GiB - 16 TiB
– Max. IOPS**/volume: 500
– Max. throughput/volume†: 500 MiB/s
– Dominant performance attribute: MiB/s

Cold HDD (sc1)
– Description: Lowest cost HDD volume designed for less frequently accessed workloads
– Use cases: Throughput-oriented storage for large volumes of data that is infrequently accessed; scenarios where the lowest storage cost is important
– Cannot be a boot volume
– Volume size: 500 GiB - 16 TiB
– Max. IOPS**/volume: 250
– Max. throughput/volume†: 250 MiB/s
– Dominant performance attribute: MiB/s

All volume types
– Max. IOPS/instance: 65,000
– Max. throughput/instance: 1,250 MiB/s

*Default volume type
**gp2/io1 based on 16 KiB I/O size, st1/sc1 based on 1 MiB I/O size
† To achieve this throughput, you must have an instance that supports it, such as r3.8xlarge or x1.32xlarge.
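The per-volume limits on this slide can be turned into a simple filter. The following Python sketch hard-codes the slide's figures (which may since have changed) and returns the volume types whose limits cover a requirement. The function name, the cheapest-first ordering, and the direct comparison of IOPS across SSD and HDD types (which the slide measures at different I/O sizes) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeType:
    name: str
    min_size_gib: int
    max_iops: int
    max_throughput_mibs: int
    bootable: bool


# Figures copied from the slide; ordered roughly cheapest-first (an assumption).
VOLUME_TYPES = [
    VolumeType("sc1", 500, 250, 250, False),
    VolumeType("st1", 500, 500, 500, False),
    VolumeType("gp2", 1, 10_000, 160, True),
    VolumeType("io1", 4, 20_000, 320, True),
]
MAX_SIZE_GIB = 16 * 1024  # 16 TiB upper bound shared by all four types


def candidate_volume_types(size_gib, iops, throughput_mibs, boot=False):
    """Return names of volume types whose per-volume limits satisfy the request."""
    if size_gib > MAX_SIZE_GIB:
        return []
    return [
        v.name
        for v in VOLUME_TYPES
        if size_gib >= v.min_size_gib
        and iops <= v.max_iops
        and throughput_mibs <= v.max_throughput_mibs
        and (v.bootable or not boot)
    ]
```

For example, a 100 GiB boot volume needing 3,000 IOPS leaves only the SSD types, while a 2,000 GiB volume needing 400 MiB/s of throughput leaves only st1. A request that matches nothing is a hint to stripe multiple volumes.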
9. Amazon EFS is Simple
• Fully managed
- No hardware, network, file layer
- Create a scalable file system in seconds!
• Seamless integration with existing tools and apps
- NFS v4.1—widespread, open
- Standard file system access semantics
- Works with standard OS file system APIs
• Simple pricing = simple forecasting
10. Amazon EFS is Elastic
• File systems grow and shrink automatically as
you add and remove files
• No need to provision storage capacity or
performance
• You pay only for the storage space you use,
with no minimum fee
11. Amazon EFS is Scalable
• File systems can grow to petabyte scale
• Throughput and IOPS scale automatically as file systems grow
• Consistent low latencies regardless of file system size
• Support for thousands of concurrent NFS connections
12. Amazon S3 (Simple Storage Service)
• Web accessible object store
• Pay for exactly what you use
• Highly durable (99.999999999% design)
• Limitlessly scalable
• Natively online
• Two flavors:
– Standard Storage – $0.023* per GB / mo
– Standard – Infrequent Access Storage (min size 128KB) – $0.0125* per GB / mo + data retrieval cost
* (US East (N. Virginia) pricing)
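The trade-off between the two flavors comes down to how much of the stored data is read back each month. The Python sketch below uses the per-GB prices quoted on this slide and the $0.01/GB S3-IA retrieval fee quoted on the tiering slide later in the deck. It ignores request fees, minimum object size, and minimum storage duration, so treat it as a rough estimate only; the function names are illustrative.

```python
# Prices quoted in the deck (US East, as of the deck's date); they may have changed.
STANDARD_PER_GB_MONTH = 0.023
IA_PER_GB_MONTH = 0.0125
IA_RETRIEVAL_PER_GB = 0.01


def monthly_cost_standard(stored_gb: float) -> float:
    """Storage-only monthly cost for S3 Standard."""
    return stored_gb * STANDARD_PER_GB_MONTH


def monthly_cost_ia(stored_gb: float, retrieved_gb: float) -> float:
    """Monthly cost for S3 Standard-IA: storage plus per-GB retrieval."""
    return stored_gb * IA_PER_GB_MONTH + retrieved_gb * IA_RETRIEVAL_PER_GB


def breakeven_retrieval_ratio() -> float:
    """Retrieved-GB per stored-GB per month at which both flavors cost the same."""
    return (STANDARD_PER_GB_MONTH - IA_PER_GB_MONTH) / IA_RETRIEVAL_PER_GB
```

With these figures, 1,000 GB costs about $23.00 per month in Standard, and about $13.50 in Standard-IA if 100 GB are read back. On storage and retrieval fees alone, Standard-IA stays cheaper until monthly retrievals slightly exceed the amount stored.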
13. Amazon S3 (Simple Storage Service)
• Parallel I/O for max speed (Multipart Upload, Ranged GETs)
• Resource-level IAM permissions
• Bucket Policies & ACLs
• Direct access through APIs
• Server Side Encryption
• Static Website Hosting
• Data Lifecycle Rules
• Amazon Athena – New
– Interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL
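Parallel I/O with ranged GETs works by splitting an object into byte ranges that can be fetched concurrently. As a rough sketch, the hypothetical Python helper below computes the HTTP `Range` header values for a given object size and part size; issuing the requests (and choosing a sensible part size) is left to the caller.

```python
def byte_ranges(total_size: int, part_size: int) -> list:
    """Split an object of total_size bytes into inclusive HTTP Range values."""
    if total_size < 0 or part_size <= 0:
        raise ValueError("total_size must be >= 0 and part_size must be > 0")
    ranges = []
    start = 0
    while start < total_size:
        # HTTP byte ranges are inclusive, so the last byte is end, not end + 1.
        end = min(start + part_size, total_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges
```

Each returned string can be sent as the `Range` header of a separate GET request, and the parts are reassembled in order on the client.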
14. Object Storage Tiering
S3 Standard
• Primary data
• Big Data Analytics
• Small objects
• Temporary scratch space
S3 - IA
• File sync and share
• Active Archive
• Enterprise backup
• Media transcoding
• Geo-redundancy/DR
Glacier
• Deep/offline archives
• Tape vaulting replacement
• WORM-compliant data
Data tiering using S3 Life Cycle Policies
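A lifecycle policy is expressed as a set of rules that transition objects between storage classes as they age. The Python sketch below builds such a rule in the dictionary shape accepted by boto3's `put_bucket_lifecycle_configuration`. The helper name, the rule ID, and the 30-day and 90-day thresholds are illustrative assumptions, and S3 may reject intervals that are too short.

```python
def build_lifecycle_configuration(prefix: str,
                                  ia_after_days: int = 30,
                                  glacier_after_days: int = 90) -> dict:
    """Build a lifecycle configuration: Standard -> Standard-IA -> Glacier.

    The default thresholds are example values, not recommendations.
    """
    if not 0 <= ia_after_days < glacier_after_days:
        raise ValueError("expected 0 <= ia_after_days < glacier_after_days")
    return {
        "Rules": [
            {
                "ID": f"tier-{prefix or 'all'}",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


# Applying it would look roughly like this (requires boto3 and credentials;
# "example-bucket" is a placeholder):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket",
#       LifecycleConfiguration=build_lifecycle_configuration("logs/"),
#   )
```

Keeping the rule construction separate from the API call makes the policy easy to review and test before it touches a real bucket.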
15. Amazon Glacier
• Low-Cost Archival Storage
• Secure
– SSL & AES-256
• Durable
– Designed for 99.999999999% durability
• Optimized for data archiving and backup
• Suitable for RTO measured in hours
• Includes storage costs and retrieval costs
• Three retrieval options: Expedited, Standard, Bulk
• As little as $0.004 per GB/Month (US East pricing)
• Integrated with S3
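Choosing among the three retrieval options is a trade-off between how soon the data is needed and the per-GB fee. The Python sketch below uses the typical retrieval times and per-GB prices quoted elsewhere in this deck (Expedited: 1-5 minutes at $0.03/GB; Standard: 3-5 hours at $0.01/GB; Bulk: 5-12 hours at $0.0025/GB). It ignores per-request fees, and the function names are hypothetical.

```python
# (tier name, upper bound of typical retrieval time in hours, price per GB)
RETRIEVAL_TIERS = [
    ("Expedited", 5 / 60, 0.03),
    ("Standard", 5.0, 0.01),
    ("Bulk", 12.0, 0.0025),
]


def retrieval_cost(gb: float, tier: str) -> float:
    """Per-GB retrieval fee for the given tier (request fees not included)."""
    for name, _max_hours, per_gb in RETRIEVAL_TIERS:
        if name == tier:
            return gb * per_gb
    raise KeyError(tier)


def cheapest_tier_within(deadline_hours: float):
    """Cheapest tier whose typical worst-case time fits the deadline, or None."""
    fitting = [t for t in RETRIEVAL_TIERS if t[1] <= deadline_hours]
    if not fitting:
        return None
    return min(fitting, key=lambda t: t[2])[0]
```

For instance, a restore that can wait a day would use Bulk, while one needed within six hours would fall back to Standard.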
16. Storage Tiered To Your Requirements
S3: “Hot” Data (Active and/or Temporary Data)
– Minimum storage duration: ≥ 0 Days
– Minimum object size: > 0K
– Starts at $0.023 / GB per month
S3-IA: “Warm” Data (Infrequently Accessed Data)
– Minimum storage duration: ≥ 30 Days
– Minimum object size: ≥ 128K
– $0.0125 / GB per month
– $0.01/GB retrieval
Glacier: “Cold” Data (Archive and Compliance Data)
– Minimum storage duration: ≥ 90 Days
– Minimum object size: > 0K
– $0.004 / GB per month
– 3 new retrieval options:
  Expedited: 1-5 mins, $0.03 / GB
  Standard: 3–5 hrs, $0.01 / GB
  Bulk: 5–12 hrs, $0.0025 / GB
Lifecycle policies move data between the tiers (S3, S3-IA, Glacier)
Available
– S3: 99.99%
– S3-IA: 99.9%
Performant
– Low Latency
– High Throughput
Durable
– 99.999999999%
Scalable
– Elastic capacity
– No preset limits
17. Amazon CloudFront
• Content delivery network (CDN)
• Distribute content to end users with low latency and high
data transfer rates
• Supports cookie and query string forwarding
• Accelerate data uploaded from end users
• Multi-format live streaming
18. AWS Import/Export
• Accelerates moving large amounts of data into and out of the AWS cloud by shipping portable storage devices such as eSATA/SATA-based hard drives or USB flash drives
• Faster than Internet transfer and more cost effective than upgrading your connectivity
• Supports data transfer into Amazon S3 buckets, Amazon EBS snapshots, and Amazon Glacier
• Common use cases are database migrations, offsite backups, and disaster recovery
19. AWS Import/Export Snowball
• Petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of AWS
• Snowball client encrypts and compresses data before transferring the data to the Snowball appliances
• Supports 1-Gigabit Ethernet, 10-Gigabit Ethernet, and 10-Gigabit SFP+
• If it takes more than a week to upload your data to AWS, then consider using Snowball
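The one-week rule of thumb is easy to check with a back-of-the-envelope calculation. The Python sketch below estimates upload time from data size and link speed. The 80% link utilization default, the use of decimal terabytes, and the function names are assumptions for illustration.

```python
def upload_days(data_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Estimated days to upload data_tb (decimal TB) over a link_mbps link.

    `utilization` is the assumed fraction of the link actually available
    for the transfer.
    """
    bits = data_tb * 1e12 * 8
    effective_bits_per_second = link_mbps * 1e6 * utilization
    return bits / effective_bits_per_second / 86_400


def consider_snowball(data_tb: float, link_mbps: float,
                      utilization: float = 0.8, threshold_days: float = 7) -> bool:
    """Apply the slide's rule of thumb: more than a week online -> consider Snowball."""
    return upload_days(data_tb, link_mbps, utilization) > threshold_days
```

Under these assumptions, 100 TB over a 1 Gbps link takes about 11.6 days, so the rule of thumb points to Snowball. 1 TB over the same link takes under three hours and does not.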
20. Introducing AWS Snowmobile
• 45-foot long ruggedized shipping container
• Up to 100PB of capacity
• Load data into S3 or Glacier
• Dedicated security personnel, GPS tracking,
alarm monitoring, 24/7 video surveillance,
and optional escort security while in transit
• Data encrypted with 256-bit encryption keys,
managed through KMS
21. AWS Storage Gateway
• Connect an on-premises software appliance VM with cloud-based storage like Amazon S3 or Amazon Glacier
• VM runs on VMware ESXi or Microsoft Hyper-V
• Mount as an iSCSI device, and expose volumes as Common Internet File System (CIFS) or Network File System (NFS) mount points to client machines
• Securely upload data to the AWS cloud for cost-effective backup and rapid disaster recovery
Welcome to AWS for Digital Advertising! Our goal today is to tell you about why Digital Advertising companies are rapidly adopting AWS. We will take a look at some common scenarios in this business, review the business value proposition and take a look at AWS technologies enabling these scenarios.
Each storage option has a unique combination of performance, durability, cost, and interface.
AWS Snowball is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of the AWS cloud. Snowball addresses common challenges with large-scale data transfers, including high network costs, long transfer times, and security concerns. Transferring data with Snowball is simple, fast, and secure, and can cost as little as one-fifth as much as high-speed Internet.
AWS Snowmobile is a new, secure, exabyte-scale data transfer service used to move large amounts of data into and out of AWS. Each Snowmobile can transfer up to 100 PB. When you order a Snowmobile, it comes to your site and AWS personnel connect a removable, high-speed network switch from the Snowmobile to your local network, so that the Snowmobile appears as a network-attached data store. Once it is connected, secure, high-speed data transfer begins. After your data is transferred, the Snowmobile is driven back to AWS, where the data is loaded into the AWS service you select, including S3, Glacier, Redshift, and others. This lets customers with large amounts of data migrate to AWS faster and more easily.
High-level description of EBS: network-based virtual disks, pay for what you provision, built-in redundancy (essentially RAID 10), optimized for random I/O
Amazon Web Services gives you reliable, durable backup storage without the up-front capital expenditures and complex capacity-planning burden of on-premises storage. Amazon storage services remove the need for complex and time-consuming capacity planning, ongoing negotiations with multiple hardware and software vendors, specialized training, and maintenance of offsite facilities or transportation of storage media to third-party offsite locations.
This table describes the use cases and performance characteristics for each volume type: Source: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html
Free with your EC2 Instance
SAS and SSD options
Size/type based on instance type
Zero Network Overhead; local, direct attached resource.
Consistent performance for sequential reads and writes
Volatile
Currently (09/22/2015) in Preview mode
Athena detailed slide in Appendix
Updated pricing as of Dec 23, 2016
Amazon Glacier provides three ways to retrieve your archives to meet varying access time and cost requirements: Expedited, Standard, and Bulk retrievals. Archives requested using Expedited retrievals are typically available within 1 – 5 minutes, allowing you to quickly access your data when occasional urgent requests for a subset of archives are required. With Standard retrievals, archives typically become accessible within 3 – 5 hours. Or you can use Bulk retrievals to cost-effectively access significant portions of your data, even petabytes, for just a quarter-of-a-cent per GB.
Highlight customer architecture and how durability, avail, performance, and scalability relate to application type
The team will consider and assess global requests as well