Breaking IO Performance Barriers: Scalable Parallel File System for AWS


Across all industries worldwide, HPC is helping innovative users achieve breakthrough results, from leading-edge academic research to data-intensive applications such as weather prediction and large-scale manufacturing in the aerospace and automotive sectors. As HPC-powered simulations continue to grow ever larger and more complex, scientists are looking for cost-effective high performance compute resources that are available when they need them. Access to on-demand infrastructure creates opportunities to experiment and try new speculative models. AWS provides computing infrastructure that allows scientists and engineers to solve complex science, engineering, and business problems using applications that require high bandwidth, low latency networking, and very high compute capabilities. Driven by its flexibility and affordability, many HPC and big data workloads are transitioning from on-premises infrastructure entirely onto AWS.

But like on-premises HPC, maximizing the performance of "HPC cloud" workloads requires fast and highly scalable storage.

Intel® Cloud Edition for Lustre Software has been purpose-built for use with the dynamic computing resources available from Amazon Web Services to provide the fast, massively scalable storage software resources needed to accelerate performance, even on complex workloads.



  1. © 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
     Breaking IO Performance Barriers: Scalable Parallel File System for AWS
     Paresh G. Pattani, Ph.D., Sr. Director, High Performance Data Solutions, Intel Corporation
     July 10, 2014
  2. The need for parallel storage
  3. Parallel Storage Needs
     • Performance: time spent storing and retrieving data is time not spent on compute. Fast storage maximizes processing utilization.
     • Scalability: growing datasets require greater amounts of storage and the ability to expand existing storage.
     • Reliability: large clusters and critical workloads require a comprehensive focus on data availability.
  4. Scale Out Storage Using Lustre*
     • Purpose-built for HPC
     • Distributed, Parallel, Vast Global Namespace
     • Linux server based
     • Linux, Windows and Mac client support
     • Support for 100,000+ Clients
     • Designed for Reliable Storage
     • Now available on AWS Marketplace: lustre.intel.com/cloudedition
     * Some names and brands may be claimed as the property of others.
  5. Intel Strategy for Lustre* Storage
     Extend core Lustre* for use across HPC and enterprise applications.
     1. Open Source: a powerful storage foundation for exascale applications, with open-source innovation driving performance at scale
        • Increased scale and streaming bandwidth
        • Accelerate maturity, lower risk and grow the ecosystem
     2. Intel Enhanced Lustre* for HPC clouds
        • Extend core Lustre* with key features for new markets and use cases
        • Push Lustre* onto HPC cloud infrastructure
     * Some names and brands may be claimed as the property of others.
  6. Use Models: Cloud Resources for HPC
     1. Augment: burst peak workloads and supplement resources
     2. Transition: move on-premises HPC to cloud infrastructure
     3. Deploy: launch new applications exclusively to the cloud
  7. Key HPC Markets Using Lustre* Today
     Large-scale Manufacturing, Weather and Climate, Life Sciences, Energy, Finance
     * Some names and brands may be claimed as the property of others.
  8. What Does Intel® Cloud Edition for Lustre* Software Look Like?
     * Other names and brands may be claimed as the property of others.
  9. Lustre* Components
     • Management (MGS, MGT): Lustre* mount service; initial point of contact for clients
     • Metadata (MDS, MDT): namespace of the file system; file layouts, no data; scalable
     • Object Storage (OSS, OST): file content stored as objects, striped across targets; scales to 100+
     * Other names and brands may be claimed as the property of others.
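Slide 9's point that file content is stored as objects striped across targets can be illustrated with a toy model (a hypothetical sketch of round-robin striping, not actual Lustre code; `ost_for_offset` is an invented helper):

```python
def ost_for_offset(offset, stripe_size, stripe_count):
    """Return which OST (index within the file's stripe set) holds the
    byte at `offset`, assuming simple round-robin striping."""
    return (offset // stripe_size) % stripe_count

# A file striped over 4 OSTs with a 1 MiB stripe size: successive 1 MiB
# chunks land on OSTs 0, 1, 2, 3 and then wrap around.
MiB = 1024 * 1024
layout = [ost_for_offset(off, MiB, 4) for off in range(0, 6 * MiB, MiB)]
print(layout)  # [0, 1, 2, 3, 0, 1]
```

Because consecutive chunks live on different servers, a single large file can be read or written by many OSSs in parallel, which is what the large-file results later in the deck exercise.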
  10. Deploying a Storage Cluster
  11. Deploying a Storage Cluster
  12. Deploying a Storage Cluster
  13. Deploying a Storage Cluster
  14. Monitoring & Command Line Interface
  15. Performance…
  16. Large File Benchmark
     Comparing 3 Lustre* cluster configurations, increasing the number of OSSs: 4, 8 and 16 OSS. The MGS and MDS configurations are the same in all three; we use 32 clients.
     • MGS: m1.medium, 94 MB/sec
     • MDS: m3.2xlarge, EBS Optimized, RAID0 of 8x 40GB Standard EBS, 110 MB/sec
     • OSS: m3.2xlarge, EBS Optimized, 8x 100GB Standard EBS, 110 MB/sec
     • Client: m3.2xlarge, 110 MB/sec
     * Other names and brands may be claimed as the property of others.
  17. IOR Sequential Read FPP
     Aggregate read throughput (MB/sec) vs. number of clients (1, 2, 4, 8, 16, 32) for the 4 OSS, 8 OSS and 16 OSS configurations. At low client counts the clients' network is the bottleneck; at higher counts the OSSs' network is, and the measured throughput comes close to the OSS network limit.
  18. IOR Sequential Write FPP
     Aggregate write throughput (MB/sec) vs. number of clients (1, 2, 4, 8, 16, 32) for the same three configurations. The clients' network is again the bottleneck at low client counts and the OSSs' network at higher counts. Oops…
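The bottleneck annotations on slides 17 and 18 suggest a simple capacity model (an illustrative back-of-the-envelope sketch using the 110 MB/sec per-instance network figures from slide 16, not a measurement; `aggregate_limit` is an invented name): aggregate throughput is capped by whichever side of the network saturates first.

```python
def aggregate_limit(n_clients, n_oss, client_bw=110, oss_bw=110):
    """Upper bound on aggregate throughput (MB/sec): whichever saturates
    first, the clients' combined network or the OSSs' combined network."""
    return min(n_clients * client_bw, n_oss * oss_bw)

# With 4 OSSs, throughput grows with the client count until the OSS side
# caps it; adding OSSs raises the plateau.
limits = [aggregate_limit(n, n_oss=4) for n in (1, 2, 4, 8, 16, 32)]
print(limits)  # [110, 220, 440, 440, 440, 440]
```

This matches the shape of the IOR curves: linear scaling while the clients are the constraint, then a plateau near the OSS network limit.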
  19. Aggregate Performance During Run
     • LTOP is available, and we use it to record the OSTs' activity during the IOR run.
     • With a simple Python script we turn that record into a graph of aggregate performance vs. time to analyze the problem: the run peaks at 1920 MB/sec but ends in a long tail.
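Slide 19's "aggregate performance vs. time" graph takes only a few lines of Python to build; the sketch below assumes a hypothetical list of (timestamp, OST, MB/sec) samples rather than LTOP's actual output format.

```python
from collections import defaultdict

def aggregate_by_time(samples):
    """Sum per-OST throughput samples (timestamp, ost, mb_per_sec) into an
    aggregate-throughput-vs-time series, sorted by timestamp."""
    totals = defaultdict(float)
    for t, _ost, mbs in samples:
        totals[t] += mbs
    return sorted(totals.items())

# One slow OST drags the aggregate down near the end of the run (long tail):
samples = [(0, "OST0", 480.0), (0, "OST1", 500.0),
           (1, "OST0", 470.0), (1, "OST1", 110.0)]
print(aggregate_by_time(samples))  # [(0, 980.0), (1, 580.0)]
```

Plotting the resulting series exposes exactly the long-tail behavior the slide describes, where a straggling OST holds the whole run below its peak.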
  20. Compare Lustre* and NFS
     * Other names and brands may be claimed as the property of others.
  21. Small File Benchmark: Simulated EDA Benchmark
     • Simulate the workload by compiling a package: untar; configure; make
     • A Python wrapper parallelizes the jobs across the cluster using MPI
     • Calculate a score based on (total workload / runtime)
     • 32 clients: Linux, c3.xlarge
     • Compare with NFS: Linux, i2.4xlarge, 4x EBS RAID0
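The scoring rule on slide 21, a score based on (total workload / runtime), reduces to a one-liner; this is a hedged sketch, since the slide does not define how "total workload" is weighted, and `edabench_score` is an invented name.

```python
def edabench_score(jobs_completed, runtime_sec, work_per_job=1.0):
    """Score a run as total completed work divided by wall-clock runtime,
    so doubling throughput at a fixed runtime doubles the score.
    The work_per_job weighting is a hypothetical stand-in."""
    return jobs_completed * work_per_job / runtime_sec

# 128 compile jobs finished in the same time as 64 scores twice as high:
print(edabench_score(128, 60.0) / edabench_score(64, 60.0))  # 2.0
```

A throughput-style score like this is why the Lustre* curves on the next chart keep rising with process count while a single NFS server flattens out.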
  22. Lustre* Configuration
     • 1 MGT: m3.medium
     • 1-4 MDTs: m3.2xlarge, 8x 40GB EBS
     • 4 OSTs: c3.xlarge, 8x 40GB EBS
     * Other names and brands may be claimed as the property of others.
  23. EDABench – Lustre* vs. NFS
     EDABench score (compile) vs. number of processes (1 to 128, on 32 clients) for 1 MDT, 2 MDTs, 4 MDTs and NFS.
     * Other names and brands may be claimed as the property of others.
  24. Storage Instance Cost Comparison
     • EBS Optimized for all storage instances
     • Global Support for Lustre*
     • Does not include EBS cost

     Cluster Option             Total Cost / Hour
     Lustre* – 1xMDT + 4xOSS    $2.00
     Lustre* – 2xMDT + 4xOSS    $2.69
     Lustre* – 4xMDT + 4xOSS    $4.07
     NFS – i2.4xlarge           $3.51

     * Other names and brands may be claimed as the property of others.
  25. Intel® Cloud Edition for Lustre* software
     * Other names and brands may be claimed as the property of others.
  26. Status Today
     • Available on AWS Marketplace
     • Setup in less than 10 minutes
     • Try for yourself: lustre.intel.com/cloudedition or lustre.intel.com/contactus
  27. Thank You.