[NEW LAUNCH!] [REPEAT 1] Amazon FSx for Lustre: How to build and deploy file systems for compute-intensive workloads, HPC, and machine learning applications (STG320-R1) - AWS re:Invent 2018
If you have compute-intensive workloads like high-performance computing, machine learning, and media processing, then this is the workshop for you. Our new file storage service, Amazon FSx for Lustre, provides high-performance storage with fully managed Lustre file systems that can deliver hundreds of gigabytes per second of throughput and consistent low latencies. You will learn how to spin up an FSx for Lustre file system in minutes, feed data to it from an S3 data lake automatically, run analyses while periodically writing results back to S3, and then spin down the file system once the workload is finished.
All this support has allowed us to innovate at a rapid pace over the last decade. We are excited that this has been recognized by Gartner and their clients in the annual Magic Quadrant, where we have been positioned furthest on both axes of any provider in the upper right since day 1.
One of many reasons that we’ve held this leadership position in the industry for so long is the breadth of our portfolio. No matter where you are in your cloud journey, we have a storage solution that will fit your applications. Whether you’re simply re-hosting, or lifting and shifting, an application with the same type of architecture you’ve been using on-prem; re-platforming using a hybrid architecture; or re-architecting to leverage all of the benefits of the cloud, we have a storage service that fits the bill. These are just a very small sample of the great customers that have migrated to the cloud.
But we talked with a lot of you, and you told us you needed more. You told us you needed an SMB file storage solution. You told us you needed a better solution to run Lustre for your HPC and Machine Learning workloads. You told us you wanted a solution to back up key AWS resources. Our roadmap is driven by your feedback.
So, as you’ve heard this week, we’re expanding our portfolio with 3 new storage classes and 2 new file storage services.
1/ Amazon S3 Intelligent-Tiering is a new S3 storage class that automatically optimizes customers’ storage costs for data with unknown or changing access patterns by moving data to the most cost-effective storage tier.
2/ Amazon S3 Glacier Deep Archive is a new storage class that delivers the lowest cost of any storage service, at less than 1/10th of one cent per gigabyte per month.
3/ Amazon FSx for Windows File Server provides fully managed Windows-based shared file storage designed to help customers lift-and-shift their applications to AWS.
4/ Amazon FSx for Lustre is a fully managed file system that is optimized for compute-intensive workloads, such as high-performance computing, machine learning, and media data processing workflows.
5/ Amazon EFS IA is a new storage class for Amazon EFS that is designed for files accessed less frequently, enabling customers to reduce storage costs by up to 85% compared to the EFS Standard storage class.
When we refer to file storage, we’re really talking about network file storage
Why is network file storage so useful?
Files and directories appear and work just like they would on local storage…
…while multiple users, computers, applications can access the same set of files…
…with strong data consistency even if multiple users or applications are editing the same file concurrently
These network file systems work natively with operating system APIs for working with files…
…so they work natively with existing applications and IT environments
And they generally provide high levels of throughput and IOPS…
…since many users/computers/apps accessing data at the same time…
…while providing near-local latencies so they really do appear like local file storage
And that’s why we built Amazon FSx for Lustre
On Wednesday we announced Amazon FSx, which provides fully managed third-party file systems that are optimized for a variety of workloads.
Amazon FSx provides you with the native compatibility of third-party file systems, with the feature sets for workloads such as Windows-based enterprise storage, high performance computing (HPC), and machine learning.
Simple and fully managed: You no longer need to manage file servers and storage for these file systems, as Amazon FSx automates the time-consuming administration tasks such as hardware provisioning, software configuration, patching, and backups.
Native compatibility, features, performance
Rich integrations with other cloud-native AWS services, making these file systems even more useful for a broader set of workloads.
Cost-optimized for particular workloads, like short-term compute-intensive workloads that don’t require replicated storage.
FSx for Lustre is one file system that’s part of our overall FSx service. Amazon FSx also provides Amazon FSx for Windows File Server, for Windows-based storage.
Amazon FSx for Lustre is a fully managed, high performance parallel file system on AWS
[Read through the three points]
Lustre is one of the most popular parallel file systems, and it’s open source. OpenSFS
It’s highly scalable and can be accessed from tens of thousands of compute instances, and is designed to store petabytes of data and deliver hundreds of GBs of data per second. Since it’s a parallel file system, with direct communication between clients and servers, it provides low, consistent latencies.
It was started in 1999 at CMU, and has matured into a file system that’s heavily used by businesses, research institutions, and government agencies for a wide variety of use cases including [read icons on the bottom].
And in fact 60% of the top 100 fastest supercomputers leverage Lustre for data storage
Variety of HPC workloads, including seismic processing and geospatial analysis
Financial modeling
…
With Amazon FSx for Lustre, you get a fully managed Lustre parallel file system.
Because it’s a Lustre file system, its performance is ideal for compute-intensive workloads with high-throughput and low-latency needs, like high performance computing, machine learning workloads, and media processing/rendering workflows.
[Read through icons]
Data repositories: S3 + on-premises data stores
I’ll now talk about each of these in turn.
Fitting in with the typical model of S3 data moving to a shared file system accessed by a compute cluster
You can link your file system to an S3 bucket
When that happens, all of your objects appear as directories and files on your Lustre file system.
However, the data itself is not moved until it’s needed. As your compute workload requests a file, it gets pulled automatically from S3 onto the file system.
This lazy load approach is really useful, because your bucket may have petabytes of data in it, but you need only a portion of that data for a given compute job.
You can then write results back at any point with a simple command from your compute instances.
Only incremental changes are written back.
So it’s really designed for the common compute-intensive workload, where you’re running your analysis for hours, days, or weeks, against a larger data set that’s in S3.
[Walk through file1.txt access, how it’s lazy loaded]
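The lazy-load behavior described above can be illustrated with a toy model. This is only a sketch of the idea, not how FSx for Lustre is implemented: the `LazyLoadedFileSystem` class, the in-memory `bucket` dictionary, and the `fetch_count` counter are all hypothetical names made up for the illustration.

```python
from typing import Dict, List, Set


class LazyLoadedFileSystem:
    """Toy model: file names are visible at link time, contents load on first read."""

    def __init__(self, bucket: Dict[str, bytes]) -> None:
        self._bucket = bucket                 # stands in for the linked S3 bucket
        self._names: Set[str] = set(bucket)   # metadata imported when the bucket is linked
        self._loaded: Dict[str, bytes] = {}   # data that has actually been pulled over
        self.fetch_count = 0                  # how many objects were fetched from the bucket

    def listdir(self) -> List[str]:
        # Listing needs only metadata, so it never triggers a fetch.
        return sorted(self._names)

    def read(self, name: str) -> bytes:
        if name not in self._names:
            raise FileNotFoundError(name)
        if name not in self._loaded:
            # First access: pull the object from the bucket onto the file system.
            self._loaded[name] = self._bucket[name]
            self.fetch_count += 1
        return self._loaded[name]


bucket = {"file1.txt": b"hello", "file2.txt": b"world"}
fs = LazyLoadedFileSystem(bucket)

names_before_any_read = fs.listdir()   # both files are visible right away
fetches_before_any_read = fs.fetch_count

first_read = fs.read("file1.txt")      # triggers one fetch
second_read = fs.read("file1.txt")     # served from the file system, no new fetch
```

In this model, `file2.txt` is never fetched because nothing reads it, which mirrors the point that only the portion of the bucket a job touches needs to be moved.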
Each Amazon FSx for Lustre file system is built on a cluster of file servers, each with one or more disks
Data movement to/from S3 is done in parallel across each of the file servers hosting your data
Use multi-part upload/download to move large files quickly
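As a rough illustration of why splitting a large file into parts helps, here is a minimal sketch that divides a byte range into parts and copies them with a thread pool. It runs entirely in memory and does not talk to S3 or Lustre; `split_into_parts`, `parallel_copy`, the part size, and the worker count are hypothetical choices for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def split_into_parts(size: int, part_size: int) -> List[Tuple[int, int]]:
    """Return (start, end) byte ranges, end exclusive, that cover `size` bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [(start, min(start + part_size, size)) for start in range(0, size, part_size)]


def parallel_copy(source: bytes, part_size: int, workers: int = 4) -> bytes:
    """Copy `source` part by part using a pool of workers, then reassemble in order."""
    parts = split_into_parts(len(source), part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the order of `parts`, so reassembly is a simple join.
        chunks = list(pool.map(lambda r: source[r[0]:r[1]], parts))
    return b"".join(chunks)


data = bytes(range(256)) * 40          # 10,240 bytes of sample data
copied = parallel_copy(data, part_size=1024)
```

Each part is independent, so in a real transfer different parts could be handled by different file servers at the same time.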
Lots of folks want to burst to the cloud, meaning they want to spin up large compute clusters to run compute-heavy jobs, and then spin down the compute when it’s no longer needed.
For lots of these scenarios, the data is on-premises, and you need it on AWS while the job is running so the compute cluster has local access to it
Mount your Lustre file system from a computer or computers on-premises over DX or VPN. Then you can move data from your on-premises data store to Lustre.
[For first two icons] We really are putting the power of Lustre within reach for anybody by fully managing it.
The Lustre file system is POSIX-compliant, so your applications can work with files and directories just as they do with a local file system.
Included as part of that is a read-after-write/close consistency model. This is super important: you’ll commonly have many compute instances accessing the same file, so the file system needs to provide consistency guarantees
Also important for workloads with lots of compute instances accessing the same files – supports file locking
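To make the file-locking point concrete, here is a minimal sketch of several writers appending to one shared results file under an exclusive POSIX advisory lock. It uses a local temporary directory as a stand-in for a path on the mounted file system, and `append_with_lock` and `results.log` are hypothetical names. Whether locks are honored across clients on a real Lustre mount depends on how the file system is mounted, which this sketch does not cover.

```python
import fcntl
import os
import tempfile


def append_with_lock(path: str, line: str) -> None:
    """Append one line to `path` while holding an exclusive advisory lock."""
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)      # block until no other writer holds the lock
        try:
            f.write(line + "\n")
            f.flush()
            os.fsync(f.fileno())           # push the data out before releasing the lock
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a shared file on the mounted file system.
    results_path = os.path.join(tmp, "results.log")
    for worker_id in range(3):
        append_with_lock(results_path, f"worker {worker_id} done")
    with open(results_path) as f:
        lines = f.read().splitlines()
```

Advisory locks only coordinate writers that also take the lock, so every process touching the shared file needs to follow the same pattern.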
Your data is automatically encrypted at-rest
[Walk through other icons]
[Icon needs to say S3 or on-premises]
[Change “pay only for the resources you use” to “billed per second”, and remove “billed per second” from below]
[Put in parentheses after FSx for Lustre pricing that it’s (high-performance SSD)”]
Amazon FSx for Lustre is cost-optimized for short-term compute-intensive workloads by providing nonreplicated storage.
That’s because it’s designed to work with your long-term, durable data stores – and serve as the storage for when you need to run compute
You can spin up and spin down file systems as needed, and store long term data in S3 or in your on-premises data store.
You’re billed per second.
The price for the high-performance SSD is $0.14/GB-mo.
The more relevant way to think about it, since it’s for short-term processing workloads, is $0.20/TB-hour.
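The conversion between the two prices is simple arithmetic, sketched below. The 1,024 GB per TB and 730 hours per month used here are assumptions for the illustration (the talk does not state which conventions the rounded figure is based on), and the job size in the example is made up.

```python
def price_per_tb_hour(price_per_gb_month: float,
                      gb_per_tb: int = 1024,
                      hours_per_month: float = 730.0) -> float:
    """Convert a $/GB-month price into $/TB-hour under the stated assumptions."""
    return price_per_gb_month * gb_per_tb / hours_per_month


def job_cost(capacity_tb: float, hours: float, rate_per_tb_hour: float) -> float:
    """Storage cost of keeping `capacity_tb` provisioned for `hours`."""
    return capacity_tb * hours * rate_per_tb_hour


rate = price_per_tb_hour(0.14)         # roughly 0.196, which rounds to the quoted $0.20
example_cost = job_cost(capacity_tb=3.6, hours=8, rate_per_tb_hour=0.20)

print(f"${rate:.3f} per TB-hour")
print(f"${example_cost:.2f} for a 3.6 TB file system kept for 8 hours")
```

Thinking in TB-hours makes it easy to estimate the storage cost of a single job: multiply the provisioned capacity by the job duration and the hourly rate.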
In addition to FSx for Lustre, we announced today FSx for Windows File Server.
EFS is our cloud-native Linux file system that we launched in 2016
FSx for Windows File Server joins that as our two file systems for supporting business applications: One for Linux workloads, the other for Windows
[TODO: Provide a better short description of Windows File Server]
Lustre is a highly popular open-source parallel file system that’s used heavily in the high-performance computing space. We’re offering that as a fully managed file system that’s fully integrated with S3. You can use it to process data at hundreds of GB per second of throughput, millions of IOPS, and sub-millisecond latencies.
[TODO: Add better short description of Lustre that makes compute-intensive/S3/redundancy points]
EFS:
Easily shared between multiple applications, instances, and on-premises servers simultaneously
Achieve petabyte scale from a distributed design that avoids the constraints imposed by traditional file servers
FSx for Windows:
Built on Windows Server with native support for Windows file system features you use today
SSD storage for high throughput, IOPS, and sub-millisecond latencies
FSx for Lustre:
Built on the highly popular, open source parallel file system Lustre
Process data at hundreds of GB per second of throughput, millions of IOPS, and sub-millisecond latencies
Revise this slide to focus on use cases instead of just a list of services.
Amazon EFS – show enterprise with files for large applications
Amazon EBS
Portfolio slide….show enterprise….
Need object, file, gateway…