(STG306) EFS: How to store 8 Exabytes & look good doing it


In this session we review the world's first cloud-scale network-attached file system and its targeted use cases. Attendees will learn about the benefits of Amazon EFS, how to identify applications that are a good fit for it, and the details of its performance and security models. The target audience is file system administrators, application developers, and application owners who operate or build file-based applications.

  1. Amazon Elastic File System (STG306). Timothy Harder, October 2015. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. What to expect from the session (STG306). Advanced level: 300
     • Motivations for creating the world’s first cloud-scale NAS
     • How to set up and administer a highly scalable file system
     • Awareness of security mechanisms available to your file systems
     • View of the Amazon EFS performance model
  3. Agenda
     1. Overview of Amazon EFS
     2. Amazon EFS technical concepts
     3. Walk through experience of creating a file system
     4. Guest presenter: ClearSky Data
     5. Discuss file system security mechanisms
     6. Review the Amazon EFS performance model
     7. Explore the Amazon EFS regional availability and durability model
     8. Q&A
  4. What if you never had to worry about file system space again?
  5. Overview of Amazon EFS
  6. The AWS storage portfolio
     • Amazon S3: object storage; data presented as buckets of objects
     • Amazon EFS: file storage (analogous to NAS); data presented as a file system
     • Amazon Elastic Block Store (EBS): block storage (analogous to SAN); data presented as disk volumes
     • Amazon Glacier: archival storage; data presented as vaults/archives of objects
  7. What is Amazon EFS?
     • Fully managed file system for Amazon EC2 instances
     • Provides standard file system semantics
     • Works with standard operating system APIs
     • Sharable across thousands of instances
     • Elastically grows to petabyte scale
     • Delivers performance for a wide variety of workloads
     • Highly available and durable
     • NFSv4-based
  8. Why did we build Amazon Elastic File System?
     • Compute + storage + file system + multi-AZ replication + 24x7 + management = hard, expensive, wobbly
     • Prepackaged appliances = easy, but...
     We built Amazon EFS so that you do not need to manage the discrete infrastructure elements for your file systems.
  9. We focused on changing the game
     1. Amazon EFS is simple
     2. Amazon EFS is elastic
     3. Amazon EFS is scalable
  10. (1) Amazon EFS is simple
     • Fully managed
       - No hardware, network, file layer
       - Create a scalable file system in seconds!
     • Seamless integration with existing tools and apps
       - NFS v4: widespread, open
       - Standard file system semantics
       - Works with standard OS file system APIs
     • Simple pricing = simple forecasting
  11. (2) Amazon EFS is elastic
     • File systems grow and shrink automatically as you add and remove files
     • No need to provision storage capacity or performance
     • You pay only for the storage space you use, with no minimum fee
  12. (3) Amazon EFS is scalable
     • File systems can grow to petabyte scale
     • Throughput and IOPS scale automatically as file systems grow
     • Consistent low latencies regardless of file system size
     • Support for thousands of concurrent NFS connections
  13. Why does this matter?
     … to app owners and developers?
     • Easy to move existing code, applications, and tools used today with existing NFS servers to the AWS cloud
     • Simple shared file storage solution for new cloud-native applications
     … to your business?
     • Predictable pricing with no up-front investment
     • Increased agility
     • Spend less time managing file storage and more time focusing on your business
     … to IT administrators?
     • Eliminates need to manage and maintain file system storage at scale
  14. Diving in
  15. What is a file system?
     • The primary resource in Amazon EFS
     • Where you store files and directories
     • You can create multiple file systems per account
  16. How to access a file system from an instance
     • You “mount” a file system on an Amazon EC2 instance (standard command); the file system then appears like a local set of directories and files
     • An NFSv4 client is standard on Linux distributions
     mount -t nfs4 [file system DNS name]:/ /[user’s target directory]
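As a minimal sketch of the mount step, the Python helper below assembles the same `mount -t nfs4` command from a DNS name and a target directory. The function name `build_mount_command` and the example DNS name are hypothetical placeholders, not values from the talk. The sketch only builds and prints the command; actually running it needs root privileges on the instance.

```python
import shlex
from typing import List


def build_mount_command(dns_name: str, target_dir: str) -> List[str]:
    """Return the argv for mounting an NFSv4 file system at target_dir."""
    if not target_dir.startswith("/"):
        raise ValueError("target_dir must be an absolute path")
    # "<dns name>:/" mounts the root of the file system, as in the slide.
    return ["mount", "-t", "nfs4", f"{dns_name}:/", target_dir]


# Hypothetical mount target DNS name, for illustration only.
cmd = build_mount_command("fs-example.efs.example.internal", "/mnt/efs")
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on an instance where the NFS client is installed.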
  17. What is a mount target?
     • To access your file system from instances in an Amazon VPC, you create mount targets in the VPC
     • A mount target is an NFSv4 endpoint in your VPC
     • A mount target has an IP address and a DNS name you use in your mount command
     [Diagram: a region with Availability Zones 1 to 3, a VPC containing EC2 instances, and a mount target]
  18. How does it all fit together?
     [Diagram: a region with Availability Zones 1 to 3, a VPC containing EC2 instances, and the customer’s file system]
  19. There are three ways to set up and manage a file system
     • AWS Management Console
     • AWS Command Line Interface (CLI)
     • AWS Software Development Kit (SDK)
  20. The AWS Management Console, CLI, and SDK each allow you to perform a variety of management tasks
     • Create a file system
     • Create and manage mount targets
     • Tag a file system
     • Delete a file system
     • View details on file systems in your AWS account
  21. Setting up and mounting a file system takes under a minute
     1. Create a file system
     2. Create a mount target in each Availability Zone from which you want to access the file system
     3. Enable the NFS client on your instances
     4. Run the mount command
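Steps 1 and 2 can be sketched through the SDK. The function below takes an EFS client as a parameter and calls `create_file_system` and `create_mount_target`, which are the operation names the boto3 EFS client exposes. Everything else is an assumption made for illustration: the function name, the subnet and security group IDs, and the `_DryRunEfs` stand-in that lets the sketch run without AWS credentials.

```python
from typing import Any, Dict, List


def provision_file_system(efs: Any, token: str, subnet_ids: List[str],
                          security_group_id: str) -> Dict[str, Any]:
    """Create a file system, then one mount target per subnet (one per AZ)."""
    fs_id = efs.create_file_system(CreationToken=token)["FileSystemId"]
    # A real script should wait here until the file system reports the
    # "available" life cycle state before creating mount targets.
    mount_target_ids = []
    for subnet_id in subnet_ids:
        mt = efs.create_mount_target(
            FileSystemId=fs_id,
            SubnetId=subnet_id,
            SecurityGroups=[security_group_id],
        )
        mount_target_ids.append(mt["MountTargetId"])
    return {"file_system_id": fs_id, "mount_target_ids": mount_target_ids}


class _DryRunEfs:
    """Hypothetical stand-in client that records calls instead of calling AWS."""

    def __init__(self) -> None:
        self.calls: List[tuple] = []

    def create_file_system(self, **kwargs: Any) -> Dict[str, str]:
        self.calls.append(("create_file_system", kwargs))
        return {"FileSystemId": "fs-00000000"}

    def create_mount_target(self, **kwargs: Any) -> Dict[str, str]:
        self.calls.append(("create_mount_target", kwargs))
        return {"MountTargetId": f"fsmt-{len(self.calls):08d}"}


client = _DryRunEfs()
result = provision_file_system(client, "demo-token",
                               ["subnet-a", "subnet-b"], "sg-allowed")
print(result)
```

With boto3 installed and credentials configured, passing `boto3.client("efs")` instead of the stand-in should issue the same two calls against the real service.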
  22. It takes 35 seconds or so.
  23. Multi-exabyte file system available for use. Don’t worry: we only bill for the space you use.
  24. Securing your file system
  25. Control surfaces for Amazon EFS security
     • Control network traffic to and from file systems (mount targets) by using VPC security groups and network ACLs
     • Control file and directory access by using standard Linux/Windows directory- and file-level permissions
     • Control administrative access (API access) to file systems by using AWS Identity and Access Management (IAM)
  26. Only EC2 instances in the VPC you specify can access your Amazon EFS file system
     [Diagram: two VPCs, each containing EC2 instances, and the customer’s file system]
  27. Security groups control which instances in your VPC can connect to your mount targets
     [Diagram: a VPC with two EC2 instances, one in security group “sg-allowed” and one in “sg-not-allowed”; the security group on the customer’s file system permits inbound traffic from “sg-allowed”]
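The rule on the slide can be written down as data. The sketch below builds an ingress rule in the shape of the EC2 `IpPermissions` structure, allowing TCP port 2049 (the standard NFS port) from a source security group. The helper names and the `is_allowed` check are illustrative assumptions: the check is a deliberately simplified model of the idea, not how AWS evaluates security groups.

```python
from typing import Any, Dict, Iterable

NFS_PORT = 2049  # standard NFS port


def nfs_ingress_rule(source_group_id: str) -> Dict[str, Any]:
    """Ingress rule permitting NFS traffic from members of source_group_id."""
    return {
        "IpProtocol": "tcp",
        "FromPort": NFS_PORT,
        "ToPort": NFS_PORT,
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


def is_allowed(instance_group_ids: Iterable[str], rule: Dict[str, Any]) -> bool:
    """Simplified model: allowed if the instance is in a permitted group."""
    permitted = {pair["GroupId"] for pair in rule["UserIdGroupPairs"]}
    return bool(permitted & set(instance_group_ids))


rule = nfs_ingress_rule("sg-allowed")
print(is_allowed(["sg-allowed"], rule))
print(is_allowed(["sg-not-allowed"], rule))
```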
  28. Amazon EFS supports user-level file and directory access permissions
     • Set file/directory permissions to specify read-write-execute permissions for users and groups
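Because these are ordinary POSIX-style permissions, they can be set with the standard operating system APIs. The sketch below uses a temporary directory as a stand-in for a mounted file system (an assumption, so that it runs anywhere on Linux) and gives the owner read/write and the group read-only access. The function name is hypothetical.

```python
import os
import stat
import tempfile


def set_owner_rw_group_r(path: str) -> int:
    """Set mode 0o640 on path and return the resulting permission bits."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP)
    return stat.S_IMODE(os.stat(path).st_mode)


# Temporary directory used in place of a real EFS mount point.
with tempfile.TemporaryDirectory() as root:
    report = os.path.join(root, "report.txt")
    with open(report, "w") as handle:
        handle.write("hello\n")
    mode = set_owner_rw_group_r(report)
    print(oct(mode))
```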
  29. Integration with AWS IAM provides administrative security
     • Use IAM policies to control who can use the administrative APIs to create, manage, and delete file systems
     • Amazon EFS supports action-level and resource-level permissions
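An action-level and resource-level policy might look like the sketch below, which builds an IAM policy document granting read-only describe access to a single file system. The `elasticfilesystem:` action prefix and the ARN layout follow the EFS conventions as I understand them; the account ID, region, file system ID, and function name are placeholders, and the exact set of actions that accept resource-level restrictions should be checked against the IAM documentation.

```python
import json
from typing import Any, Dict


def describe_only_policy(file_system_arn: str) -> Dict[str, Any]:
    """Policy document allowing describe calls on one file system only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Action-level permissions: only these API actions.
                "Action": [
                    "elasticfilesystem:DescribeFileSystems",
                    "elasticfilesystem:DescribeMountTargets",
                ],
                # Resource-level permission: only this file system.
                "Resource": file_system_arn,
            }
        ],
    }


# Placeholder ARN, for illustration only.
arn = "arn:aws:elasticfilesystem:us-west-2:111122223333:file-system/fs-12345678"
policy = describe_only_policy(arn)
print(json.dumps(policy, indent=2))
```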
  30. FAQ on adjacent security topics
     • Does EFS support ACLs?
     • Does EFS support or need NIS/NIS+?
     • Does EFS support Kerberized authentication?
     • Does EFS support encryption?
     • Does EFS support Windows?
     Stay tuned.
  31. Using AWS for an Enterprise Storage Service. Laz Vekiarides, CTO & Co-Founder, October 2015
  32. What does 1PB of data look like? [Images: “Today” versus “With ClearSky”]
  33. The ClearSky Global Storage Network
     • Metro-based fully managed service
     • SLA-guaranteed for enterprise workloads
     • Complete lifecycle management: Primary, Backup, Recovery
  34. Metro coverage: “Always within 2ms of the Customer” [Map legend: Tier 1 Location, Tier 2 Location]
  35. The ClearSky solution: A hybrid cloud storage offering
     • Next to your apps: ClearSky Edge Appliance in the enterprise data center, alongside enterprise apps
     • In your metro area: ClearSky POPs
     • Regional: distributed & optimized storage
  36. Leveraging AWS
     [Diagram: Customer SAN (iSCSI/NFS/Fiber Channel); Edge with edge cache and data services; N x Metro E; Metro POP with ClearSky Metro Cache; VPC with EFS]
  37. Hybrid cloud mobility
     • Automatic and optimized data migration to AWS enables workload portability to EC2
     • Large and distributed network protects users from latency issues
     [Diagram: Edge with edge cache and data services; 2x 1GbE; Metro POP with ClearSky Metro Cache]
  38. Customer use cases
     • Managed/cloud service provider
       - Data centers in Philadelphia & Las Vegas
       - Using EC2 for cloud workloads
       - 1PB+ storage, currently running on EqualLogic, Nimble
       - Heavy users of VMware, SQL
       - Chose ClearSky for workload portability, cloud economics & scale
       - Xtium will no longer need to replicate data cross-country
     • Boston-based biopharma
       - 100TB storage, currently on Dell Compellent
       - Running full range of enterprise apps on ClearSky: VMware, SQL, Oracle
       - Using EC2 for DR
       - Chose ClearSky for simplicity and cost effectiveness
       - Momenta will no longer need a secondary site
  39. Thank You
  41. Amazon EFS performance model
  42. Amazon EFS aggregate performance is based on a throughput bursting model that scales as a file system grows
     • As a file system gets larger, it needs access to more throughput
     • Many file workloads are spiky, with peak throughput well above average levels
     • The Amazon EFS scalable bursting model is designed to make performance available when you need it
  43. Throughput bursting model based on earning and spending “bursting credits”
     • Accumulate up to 12 hours of continuous bursting
     • Earn credits at a “baseline rate” of 0.05 MiB/s per GiB stored
     • Spend credits by reading/writing at up to:
       - 100 MiB/s for file systems < 1 TiB
       - 100 MiB/s per TiB for file systems > 1 TiB
     • All file systems can drive sustained baseline throughput (i.e., 50 MiB/s per TiB stored)
     • File systems with a positive bursting credit balance are able to “burst” to higher levels
     • New file systems start with a full credit balance
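The two rates in this model are simple functions of file system size. The sketch below encodes them using the figures on the slide; the function and constant names are my own. One caveat: 0.05 MiB/s per GiB works out to 51.2 MiB/s per TiB when 1 TiB is taken as 1024 GiB, which the slide rounds to 50 MiB/s.

```python
GIB_PER_TIB = 1024
BASELINE_MIB_S_PER_GIB = 0.05   # credit earn rate from the slide
BURST_FLOOR_MIB_S = 100.0       # burst ceiling for file systems under 1 TiB
BURST_MIB_S_PER_TIB = 100.0     # burst ceiling per TiB above 1 TiB


def baseline_mib_s(size_gib: float) -> float:
    """Sustained throughput, which is also the rate credits are earned at."""
    return BASELINE_MIB_S_PER_GIB * size_gib


def burst_mib_s(size_gib: float) -> float:
    """Maximum throughput while the credit balance is positive."""
    size_tib = size_gib / GIB_PER_TIB
    return max(BURST_FLOOR_MIB_S, BURST_MIB_S_PER_TIB * size_tib)


for size_gib in (100, 1024, 10 * 1024):
    print(size_gib, baseline_mib_s(size_gib), burst_mib_s(size_gib))
```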
  44. Bursting model examples (file system size and read/write throughput)
     • A 100 GiB EFS file system can…
       - Drive up to 5 MiB/s continuously, or
       - Burst to 100 MiB/s for up to 72 minutes each day*
     • A 1 TiB EFS file system can…
       - Drive up to 50 MiB/s continuously, or
       - Burst to 100 MiB/s for up to 12 hours each day*
     • A 10 TiB EFS file system can…
       - Drive up to 500 MiB/s continuously, or
       - Burst to 1 GiB/s for up to 12 hours each day*
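The daily burst durations follow from balancing credits earned against credits spent. As a rough check, assume the file system earns credits at its baseline rate for all 24 hours, spends them only while bursting at the full burst rate, and is otherwise idle; then burst time per day is 24 hours times baseline divided by burst rate. The function name is hypothetical, and the inputs are the slide's rounded figures (1 GiB/s is taken as 1000 MiB/s).

```python
MINUTES_PER_DAY = 24 * 60


def burst_minutes_per_day(baseline_mib_s: float, burst_mib_s: float) -> float:
    """Minutes of full-rate bursting that one day of earned credits sustains."""
    if burst_mib_s <= 0:
        raise ValueError("burst_mib_s must be positive")
    return MINUTES_PER_DAY * baseline_mib_s / burst_mib_s


small = burst_minutes_per_day(5, 100)       # 100 GiB file system
medium = burst_minutes_per_day(50, 100)     # 1 TiB file system
large = burst_minutes_per_day(500, 1000)    # 10 TiB file system
print(small, medium / 60, large / 60)
```

Under these assumptions the 100 GiB case gives 72 minutes and the two larger cases give 12 hours, matching the slide.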
  45. Amazon EFS is designed for a wide spectrum of use cases
     [Diagram: a spectrum running from “High throughput / parallel IO” to “Low latency / serial IO”, annotated “We started here”, “Ready now”, and “We are actively working here now”. Workloads shown: genomics, big data, scale-out jobs, homedir, CMS, web serving, SW builds, metadata-intensive jobs]
  46. Regional availability and durability
  47. In what regions can I use Amazon EFS?
     • US-West-2 (Oregon)
     • US-East-1 (Northern Virginia)
     • EU-West-1 (Ireland)
  48. Data is stored in multiple Availability Zones for high availability and durability
     • Every file system object (directory, file, and link) is redundantly stored across multiple Availability Zones in a region
     [Diagram: Amazon EFS spanning Availability Zones 1 to 3 in a region]
  49. Data can be accessed from any Availability Zone in the region while maintaining full consistency
     • Your EC2 instances can connect to your EFS file system from any Availability Zone in a region
     • All reads are fully consistent in all Availability Zones; that is, a read in one Availability Zone is guaranteed to have the latest data, even if the data is being written in another Availability Zone
     [Diagram: EC2 instances in a VPC across Availability Zones 1 to 3; one instance writes while an instance in another zone reads]
  50. Wrapping up
  51. TCO: 1TB example
     • User managed: $1.00 / GB
       - Storage, compute, inter-Availability Zone
       - M3 xlarge x 3 x 3TB EBS GP2 + inter-AZ replication
     • Appliance: $4.00 / GB
       - Per-hour charge, storage, inter-Availability Zone
       - M3 xlarge x 3 x TB EBS GP2 + inter-AZ replication
     • On-premises AFA: $0.60 / GB
       - Raw to usable, cost of funds, utilization, colocation
       - Storage only, mirrored 2-site configuration
     • Amazon EFS: $0.30 / GB
       - Elastic, simple, predictable
  52. What to do next?
     • Learn more at
     • Request an invite for our Preview
