Each storage option has a unique combination of performance, durability, cost, and interface.
AWS Snowball is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of the AWS cloud. Using Snowball addresses common challenges with large-scale data transfers, including high network costs, long transfer times, and security concerns. Transferring data with Snowball is simple, fast, and secure, and can be as little as one-fifth the cost of transferring the same data over high-speed Internet.
AWS Snowmobile is a new, secure, exabyte-scale data transfer service used to transfer large amounts of data into and out of AWS. Each Snowmobile can transfer up to 100 PB. When you order a Snowmobile, it comes to your site and AWS personnel connect a removable, high-speed network switch from the Snowmobile to your local network, making the Snowmobile appear as a network-attached data store. Once it is connected, secure, high-speed data transfer begins. After your data is transferred to the Snowmobile, it is driven back to AWS, where the data is loaded into the AWS service you select, including S3, Glacier, Redshift, and others. Snowmobile lets customers with large amounts of data migrate to AWS much faster and more easily.
Amazon Web Services gives you reliable, durable backup storage without the up-front capital expenditure and complex capacity-planning burden of on-premises storage. Amazon storage services remove the need for complex and time-consuming capacity planning, ongoing negotiations with multiple hardware and software vendors, specialized training, and maintenance of offsite facilities or transportation of storage media to third-party offsite locations.
Amazon EBS has a 99.99% availability SLA.
See https://aws.amazon.com/about-aws/whats-new/2017/11/announcing-an-increased-monthly-service-commitment-for-amazon-ec2/ & https://aws.amazon.com/compute/sla/
Network device
Data lifecycle is independent from EC2 instance lifecycle
Each volume is like a hard drive on a physical server
Attach multiple volumes to an EC2 instance, but only one EC2 instance per volume (see the sketch after this list)
POSIX-compliant file systems
Virtual disk ideal for: OS boot device; file systems
Raw block devices
Ideal for databases (e.g., Oracle Automatic Storage Management)
Other raw block devices
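To make the volume model concrete, here is a minimal boto3 sketch (the Availability Zone, size, instance ID, and device name are hypothetical placeholders, not values from this deck):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create a 100 GiB General Purpose SSD volume in one Availability Zone.
# A volume lives in a single AZ and exists independently of any instance.
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=100,
    VolumeType="gp2",
)

# Wait until the volume is available, then attach it to a single instance.
ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])
ec2.attach_volume(
    VolumeId=volume["VolumeId"],
    InstanceId="i-0123456789abcdef0",  # hypothetical instance ID
    Device="/dev/sdf",
)
```

Note that the volume keeps its data independently of the instance lifecycle: detaching it or terminating the instance does not delete it.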
Describe EBS standard volumes as “best effort” and Provisioned IOPS (PIOPS) volumes as providing consistent performance. Mention that the most predictable performance comes from using EBS-Optimized instances to obtain dedicated storage throughput.
The chart on the left shows the expected throughput and maximum expected 16 KB IOPS for various instance sizes.
Describe how EBS snapshots work.
Describe how snapshots are incremental, yet allow a full recovery from any individual snapshot, based on this three-snapshot scenario.
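A hedged sketch of the snapshot workflow in boto3 (volume ID and description are placeholders): each snapshot uploads only changed blocks, yet any single snapshot can seed a complete new volume.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Take a point-in-time snapshot; only blocks changed since the previous
# snapshot are uploaded, but each snapshot can restore the full volume.
snap = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",  # hypothetical volume ID
    Description="nightly backup",
)
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

# Restore: any individual snapshot is sufficient to build a new volume.
ec2.create_volume(
    AvailabilityZone="us-east-1a",
    SnapshotId=snap["SnapshotId"],
)
```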
Free with your EC2 Instance
SAS and SSD options
Size/type based on instance type
Zero network overhead; a local, direct-attached resource.
Consistent performance for sequential reads and writes
Volatile
Currently (09/22/2015) in Preview mode
Describe how EFS works
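As a minimal sketch of the moving parts (the subnet ID is a placeholder, and in practice you would wait for the file system to become available before adding mount targets):

```python
import boto3

efs = boto3.client("efs", region_name="us-east-1")

# Create a shared, elastically sized file system (no capacity to provision).
fs = efs.create_file_system(CreationToken="demo-efs")

# Expose it in a subnet; instances in the VPC then mount the target
# over NFS to get a common, POSIX-compliant file system.
efs.create_mount_target(
    FileSystemId=fs["FileSystemId"],
    SubnetId="subnet-0123456789abcdef0",  # hypothetical subnet
)
```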
What we decided to deliver were Windows file systems – built on fully managed Windows Servers
And integrated richly with other AWS services
Built on Windows Server, you get native Windows file storage, delivering the compatibility, features, and performance needed to run Windows applications.
Provides NTFS file systems that can be accessed from up to thousands of compute instances using the SMB protocol.
Fully supports key NTFS file system features, including what other solutions commonly don’t support, like extended attributes, file change notifications, alternate data streams, hard links, and soft links
Fully supports SMB 3.1.1 (backwards compatible with all SMB 2.0+ clients)
SMB performance features like SMB multichannel (a single client can establish multiple TCP connections to drive higher throughput) and SMB directory leasing (a client can own the cache for a particular directory)
Amazon FSx works with Microsoft Active Directory (AD) to integrate your file system with your Windows environments, including trust relationships with on-prem ADs
It supports the use of Distributed File System (DFS) Replication to enable multi-AZ deployments, and DFS Namespaces to enable namespaces that span many file servers
Built on high-performance SSD storage
Provides high throughput and IOPS
Up to 2 GB/s of throughput and hundreds of thousands of IOPS per file system
Up to tens of GB/s of throughput and hundreds of thousands of IOPS using DFS Namespaces across file systems
Choose throughput independently of storage size, which lets you tailor the price/performance ratio to your application's needs (see the sketch below)
Highly consistent, sub-millisecond latencies
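A hedged boto3 sketch of that independent sizing (the subnet and directory IDs are hypothetical):

```python
import boto3

fsx = boto3.client("fsx", region_name="us-east-1")

# Storage capacity (GiB) and throughput capacity (MB/s) are set
# independently, so price/performance can be tuned per workload.
fsx.create_file_system(
    FileSystemType="WINDOWS",
    StorageCapacity=300,
    SubnetIds=["subnet-0123456789abcdef0"],   # hypothetical subnet
    WindowsConfiguration={
        "ActiveDirectoryId": "d-0123456789",  # hypothetical AWS Managed AD
        "ThroughputCapacity": 32,
    },
)
```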
11 Nines of durability: 99.999999999%
A chief information security officer (CISO) is the senior-level executive within an organization responsible for establishing and maintaining the enterprise vision, strategy, and program to ensure information assets and technologies are adequately protected.
On-demand analytics: Redshift Spectrum, QuickSight, Athena
Query in place:
One of the most important capabilities of a data lake that is built on AWS is the ability to do in-place transformation and querying of data assets without having to provision and manage clusters.
AWS Glue, as described in the previous sections, provides the data discovery and ETL capabilities, and Amazon Athena and Amazon Redshift Spectrum provide the in-place querying capabilities.
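For example, a minimal in-place query via Athena might look like the following (the database, table, and result bucket are hypothetical):

```python
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Run SQL directly against data sitting in S3; no cluster to manage.
athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) FROM web_logs GROUP BY status",
    QueryExecutionContext={"Database": "datalake"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
```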
Glacier Select
Amazon Glacier Select allows queries to run directly on data stored in Amazon Glacier without having to retrieve the entire archive. Glacier Select changes the value proposition of archive storage by letting you process and retrieve only the bytes you need from an archive for analytics.
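A rough boto3 sketch of a Glacier Select job (the vault name, archive ID, and the exact SelectParameters serialization shape are assumptions to verify against the Glacier API reference):

```python
import boto3

glacier = boto3.client("glacier", region_name="us-east-1")

# Run SQL against a single archive in place; only matching bytes are
# written to the S3 output location, not the whole archive.
glacier.initiate_job(
    vaultName="my-vault",                   # hypothetical vault
    jobParameters={
        "Type": "select",
        "ArchiveId": "EXAMPLE-ARCHIVE-ID",  # hypothetical archive
        "Tier": "Standard",
        "SelectParameters": {
            "ExpressionType": "SQL",
            "Expression": "SELECT s._1 FROM archive s WHERE s._2 = 'error'",
            "InputSerialization": {"csv": {}},
            "OutputSerialization": {"csv": {}},
        },
        "OutputLocation": {
            "S3": {"BucketName": "my-select-results", "Prefix": "out/"}
        },
    },
)
```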
Highlight customer architecture and how durability, availability, performance, and scalability relate to application type
Amazon Glacier provides three ways to retrieve your archives to meet varying access time and cost requirements: Expedited, Standard, and Bulk retrievals. Archives requested using Expedited retrievals are typically available within 1–5 minutes, allowing you to quickly access your data when occasional urgent requests for a subset of archives are required. With Standard retrievals, archives typically become accessible within 3–5 hours. Or you can use Bulk retrievals to cost-effectively access significant portions of your data, even petabytes, for just a quarter of a cent per GB.
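For example, restoring an archived object with an explicit retrieval tier via boto3 (bucket and key are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Restore an object archived to Glacier; Tier is one of
# "Expedited" (minutes), "Standard" (3-5 hours), or "Bulk" (cheapest).
s3.restore_object(
    Bucket="my-archive-bucket",        # hypothetical bucket
    Key="backups/2017/db.dump",        # hypothetical key
    RestoreRequest={
        "Days": 7,                     # keep the restored copy for 7 days
        "GlacierJobParameters": {"Tier": "Bulk"},
    },
)
```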
Object Tags are key-value pairs applied to S3 objects, which can be created, updated, or deleted at any time during the lifetime of the object.
provide the ability to create Identity and Access Management (IAM) policies, set up S3 Lifecycle policies, and customize storage metrics.
manage transitions between storage classes and expire objects in the background (see the sketch below).
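A minimal sketch of tag-driven lifecycle management (bucket, key, and tag values are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Tag an object at any point in its lifetime...
s3.put_object_tagging(
    Bucket="my-data-bucket",           # hypothetical bucket
    Key="reports/q3.csv",
    Tagging={"TagSet": [{"Key": "project", "Value": "alpha"}]},
)

# ...then drive lifecycle transitions off that tag in the background.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-data-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "tier-alpha",
            "Status": "Enabled",
            "Filter": {"Tag": {"Key": "project", "Value": "alpha"}},
            "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
        }]
    },
)
```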
automatically identifies the optimal lifecycle policy to transition less frequently accessed storage to S3 Standard-Infrequent Access (Standard-IA).
configure a storage class analysis policy to monitor an entire bucket, a prefix, or an object tag. Once an infrequent access pattern is observed, easily create a new lifecycle age policy based on the results.
provides daily visualizations of your storage usage in the AWS Management Console that can be exported to an S3 bucket to analyze using the business intelligence tools of your choice, such as Amazon QuickSight.
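A hedged sketch of enabling storage class analysis with a daily CSV export (bucket names, IDs, and prefixes are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Watch access patterns under a prefix and export results as CSV
# that tools like Amazon QuickSight can visualize.
s3.put_bucket_analytics_configuration(
    Bucket="my-data-bucket",                      # hypothetical bucket
    Id="logs-analysis",
    AnalyticsConfiguration={
        "Id": "logs-analysis",
        "Filter": {"Prefix": "logs/"},
        "StorageClassAnalysis": {
            "DataExport": {
                "OutputSchemaVersion": "V_1",
                "Destination": {
                    "S3BucketDestination": {
                        "Format": "CSV",
                        "Bucket": "arn:aws:s3:::my-analytics-results",
                        "Prefix": "storage-class-analysis/",
                    }
                },
            }
        },
    },
)
```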
provides a CSV (Comma-Separated Values) flat-file output of your objects and their corresponding metadata on a daily or weekly basis for an S3 bucket or a shared prefix.
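For example, a daily inventory configuration might be set up like this (bucket ARNs and IDs are hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Schedule a daily CSV listing of objects and metadata under a prefix.
s3.put_bucket_inventory_configuration(
    Bucket="my-data-bucket",                      # hypothetical bucket
    Id="daily-inventory",
    InventoryConfiguration={
        "Id": "daily-inventory",
        "IsEnabled": True,
        "IncludedObjectVersions": "Current",
        "Filter": {"Prefix": "reports/"},
        "Schedule": {"Frequency": "Daily"},
        "Destination": {
            "S3BucketDestination": {
                "Bucket": "arn:aws:s3:::my-inventory-results",
                "Format": "CSV",
            }
        },
        "OptionalFields": ["Size", "LastModifiedDate", "StorageClass"],
    },
)
```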
monitoring and alarming on 13 new S3 CloudWatch Metrics
receive 1-minute CloudWatch Metrics, set CloudWatch alarms, and access CloudWatch dashboards to view real-time operations and performance such as bytes downloaded and the 4xx HTTP response count of your Amazon S3 storage.
For web and mobile applications that depend on cloud storage, these let you quickly identify and act on operational issues.
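A minimal alarm sketch on one of those metrics (the bucket name is hypothetical, and the EntireBucket filter assumes request metrics are already enabled on the bucket):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when 4xx responses spike for a bucket's request metrics.
cloudwatch.put_metric_alarm(
    AlarmName="s3-4xx-spike",
    Namespace="AWS/S3",
    MetricName="4xxErrors",
    Dimensions=[
        {"Name": "BucketName", "Value": "my-data-bucket"},  # hypothetical
        {"Name": "FilterId", "Value": "EntireBucket"},
    ],
    Statistic="Sum",
    Period=60,                  # 1-minute metrics
    EvaluationPeriods=1,
    Threshold=100,
    ComparisonOperator="GreaterThanThreshold",
)
```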
The AWS Storage Gateway (SGW) is typically deployed in your existing storage environment as a VM.
You connect your existing applications, storage systems, or devices to the SGW. The SGW provides standard storage protocol interfaces so apps can connect to it without changes.
The gateway in turn connects to AWS so you can store data securely and durably in Amazon S3 and Amazon Glacier.
The gateway optimizes data transfer from on-premises to AWS. It also provides low-latency access through a local cache, so your apps can access frequently used data locally. The service is also integrated with CloudWatch, CloudTrail, IAM, and other services, so you get an extension of AWS management services locally.
---
“Enable cloud storage on-premises as part of your AWS platform”
Native access
Industry standard protocols for file, block, and tape
Secure and durable storage in Amazon S3 and Glacier
Optimized data transfer from on-premises to AWS
Low-latency access to frequently used data
Integrated with AWS security and management services
The tape gateway presents the Storage Gateway to your existing backup application as an industry-standard iSCSI-based virtual tape library (VTL), consisting of a virtual media changer and virtual tape drives. You can continue to use your existing backup applications and workflows while writing to a nearly limitless collection of virtual tapes. Each virtual tape is stored in Amazon S3. When you no longer require immediate or frequent access to data contained on a virtual tape, you can have your backup application archive it from the virtual tape library into Amazon Glacier, further reducing storage costs.
Storage Gateway is currently compatible with most leading backup applications. The VTL interface eliminates large upfront tape automation capital expenses, multi-year maintenance contract commitments and ongoing media costs. You pay only for the capacity you use and scale as your needs grow. The need to transport storage media to offsite facilities and handle tape media manually goes away, and your archives benefit from the design and durability of the AWS cloud platform.
In your VTL you can configure up to 1,500 tapes
… up to 2.5 TB each (LTO-6 size)
… for a total of 1 PB per VTL
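A hedged sketch of provisioning virtual tapes via boto3 (the gateway ARN, token, and barcode prefix are hypothetical):

```python
import boto3

sgw = boto3.client("storagegateway", region_name="us-east-1")

# Provision virtual tapes in the VTL; the backup application sees them
# through the standard iSCSI media changer and tape drives.
sgw.create_tapes(
    GatewayARN="arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-12345678",
    TapeSizeInBytes=2500 * 1024**3,   # roughly an LTO-6-sized tape
    ClientToken="tape-batch-001",
    NumTapesToCreate=10,
    TapeBarcodePrefix="TST",
)
```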
The file gateway enables you to store and retrieve objects in Amazon S3 using industry-standard file protocols. Files are stored as objects in your S3 buckets, accessed through a Network File System (NFS) mount point. Ownership, permissions, and timestamps are durably stored in S3 in the user-metadata of the object associated with the file. Once objects are transferred to S3, they can be managed as native S3 objects, and bucket policies such as versioning, lifecycle management, and cross-region replication apply directly to objects stored in your bucket.
Customers use the file interface to migrate file data into S3 for use by object-based workloads, as a cost-effective storage target for traditional backup applications, and as a tier in the cloud for on-premises file storage.
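A minimal sketch of creating an NFS file share backed by S3 (all ARNs are hypothetical):

```python
import boto3

sgw = boto3.client("storagegateway", region_name="us-east-1")

# Expose an S3 bucket as an NFS share; files land as native S3 objects,
# so bucket policies like versioning and lifecycle apply directly.
sgw.create_nfs_file_share(
    ClientToken="share-001",
    GatewayARN="arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-12345678",
    Role="arn:aws:iam::111122223333:role/SgwS3Access",
    LocationARN="arn:aws:s3:::my-file-share-bucket",
)
```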
The volume gateway presents your applications with disk volumes using the iSCSI block protocol. Data written to these volumes can be asynchronously backed up as point-in-time snapshots of your volumes, and stored in the cloud as Amazon EBS snapshots. You can set the schedule for when snapshots occur or create them via the AWS Management Console or service API. Snapshots are incremental backups that capture only changed blocks. All snapshot storage is also compressed to minimize your storage charges.
When connecting with the block interface, you can run the gateway in two modes: cached and stored.
In cached mode, you store your primary data in Amazon S3 and retain your frequently accessed data locally. With this mode, you can achieve substantial cost savings on primary storage, minimizing the need to scale your storage on-premises, while retaining low-latency access to your frequently accessed data. You can configure up to 32 volumes of 32 TB each, for a total of 1 PB of storage per gateway (see the sketch below).
In stored mode, you store your entire data set locally, while performing asynchronous backups of this data in Amazon S3. This mode provides durable and inexpensive offsite backups that you can recover locally or from Amazon EC2.
Example applications such as databases or computational workloads
… where low latency is critical and the working set of data is large, ill-defined, or constantly changing.
On a stored volume gateway you can configure up to 32 volumes
… up to 16 TB each
… for a total of 512 TB per gateway
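Closing out the volume gateway, a hedged sketch of carving out a cached-mode volume via boto3 (all identifiers are hypothetical):

```python
import boto3

sgw = boto3.client("storagegateway", region_name="us-east-1")

# Create a cached iSCSI volume: primary data lives in S3, while hot data
# is served from the gateway's local cache.
sgw.create_cached_iscsi_volume(
    GatewayARN="arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-12345678",
    VolumeSizeInBytes=1024**4,      # a 1 TiB volume
    TargetName="app-volume-1",      # becomes part of the iSCSI target name
    NetworkInterfaceId="10.0.0.5",  # the gateway VM's local IP
    ClientToken="vol-001",
)
```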
We see 3 broad categories of hybrid storage where SGW helps customers.
Let’s look in a little more detail at each of these.
The original Snowball has 50 TB and 80 TB capacity options. AWS Snowball Edge, like the original Snowball, is a terabyte-scale data transfer solution, but transports more data, up to 100 TB, and retains the same embedded cryptography and security as the original Snowball. However, Snowball Edge hosts a file server and an S3-compatible endpoint that allow you to use the NFS protocol, the S3 SDK, or the S3 CLI to transfer data directly to the device without specialized client software. Multiple units may be clustered together, forming a temporary data collection storage tier in your datacenter so you can work as data is generated without managing copies. As storage needs scale up and down, devices can be easily added to or removed from the local cluster and returned to AWS.
What is AWS Import/Export Snowball?
Snowball is a new AWS Import/Export offering that provides a terabyte-scale data transfer service using Amazon-provided storage devices for transport. Previously, customers purchased their own portable storage devices and used them to ship their data. With the launch of Snowball, customers can now use highly secure, rugged, Amazon-owned network-attached storage (NAS) devices, called Snowballs, to ship their data. Once a device is received and set up, customers can copy up to 50 TB of data from their on-premises file system to the Snowball using the Snowball client software over a 10 Gbps network interface. Prior to transfer to the Snowball, all data is encrypted with 256-bit GCM encryption by the client. When customers finish transferring data to the device, they simply ship it back to an AWS facility, where the data is ingested at high speed into Amazon S3.
Compare and contrast Internet vs. 5x Snowball (see the back-of-the-envelope sketch below).
Note that with the 80 TB Snowball or 100 TB Snowball Edge, fewer devices are needed
Snowball Edge supports network connections of up to 40 Gb/s (QSFP+)
Also note that you can use Snowball Edge to run edge computing workloads, such as performing local analysis of data on a Snowball Edge cluster and writing the results to the S3-compatible endpoint
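To ground the comparison, a back-of-the-envelope sketch; the link speed and 80% utilization figure are assumptions, not measurements:

```python
# Rough transfer-time arithmetic for "Internet vs. Snowball" discussions.
def transfer_days(terabytes: float, gbps: float, utilization: float = 0.8) -> float:
    """Days to move `terabytes` over a `gbps` link at a given utilization."""
    bits = terabytes * 1e12 * 8
    seconds = bits / (gbps * 1e9 * utilization)
    return seconds / 86400

# Example: 250 TB over a dedicated 1 Gbps link at 80% utilization takes
# roughly a month, while five 50 TB Snowballs can turn around in days.
print(f"{transfer_days(250, 1.0):.1f} days")  # ~28.9 days
```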