
Deep Dive on Amazon S3

Come learn about new and existing Amazon S3 features that can help you better protect your data, save on cost, and improve usability, security, and performance. We will cover a wide variety of Amazon S3 features and go into depth on several newer features with configuration and code snippets, so you can apply the learnings on your object storage workloads.

Published in: Technology

  1. 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Susan Chan Senior Product Manager, Amazon S3 August 2016 Deep Dive on Amazon S3
  2. 2. Recent innovations on S3 • New storage offering: Standard - Infrequent Access • Visibility & control of your data: Amazon CloudWatch integration, AWS CloudTrail integration, new lifecycle policies, event notifications, bucket limit increases, read-after-write consistency, IPv6 support • More data ingestion options: AWS Snowball (80 TB), S3 Transfer Acceleration, Amazon Kinesis Firehose, partner integration
  3. 3. Choice of storage classes on S3 • Standard: active data • Standard - Infrequent Access: infrequently accessed data • Amazon Glacier: archive data
  4. 4. Use cases for Standard - Infrequent Access • File sync and share + consumer file storage • Backup and archive + disaster recovery • Long-retained data
  5. 5. Standard - Infrequent Access storage • Durable: designed for 11 9s of durability • Available: designed for 99.9% availability • High performance: same as Standard storage • Secure: bucket policies, AWS Identity and Access Management (IAM) policies, many encryption options • Integrated: lifecycle management, versioning, event notifications, metrics • Easy to use: no impact on user experience, simple REST API
  6. 6. Standard - Infrequent Access storage. Integrated: lifecycle management • Directly PUT to Standard - IA • Transition Standard to Standard - IA • Transition Standard - IA to Amazon Glacier storage • Expiration lifecycle policy • Versioning support
  7. 7. Transition older objects to Standard - IA
  8. 8. Standard - Infrequent Access storage: lifecycle policy. Standard storage -> Standard - IA <LifecycleConfiguration> <Rule> <ID>sample-rule</ID> <Prefix>documents/</Prefix> <Status>Enabled</Status> <Transition> <Days>30</Days> <StorageClass>STANDARD_IA</StorageClass> </Transition> <Transition> <Days>365</Days> <StorageClass>GLACIER</StorageClass> </Transition> </Rule> </LifecycleConfiguration>
  9. 9. Standard - Infrequent Access storage: lifecycle policy. Standard storage -> Standard - IA (first Transition, after 30 days), then Standard - IA storage -> Amazon Glacier (second Transition, after 365 days) <LifecycleConfiguration> <Rule> <ID>sample-rule</ID> <Prefix>documents/</Prefix> <Status>Enabled</Status> <Transition> <Days>30</Days> <StorageClass>STANDARD_IA</StorageClass> </Transition> <Transition> <Days>365</Days> <StorageClass>GLACIER</StorageClass> </Transition> </Rule> </LifecycleConfiguration>
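
The effect of the two Transition elements can be sketched in a few lines of Python. This is only an illustrative model, not how S3 evaluates rules internally: the function name `storage_class_for_age` is hypothetical, and real transitions are processed asynchronously, so exact timing can differ.

```python
# Transitions from the sample rule: (minimum object age in days, storage class).
TRANSITIONS = [(30, "STANDARD_IA"), (365, "GLACIER")]


def storage_class_for_age(age_days, transitions=TRANSITIONS):
    """Return the storage class an object of this age would occupy.

    Hypothetical helper: assumes the object was uploaded to Standard storage
    and matches the rule's prefix (documents/).
    """
    current = "STANDARD"
    # Walk the transitions from youngest to oldest threshold; the last
    # threshold the object has passed determines its class.
    for min_days, storage_class in sorted(transitions):
        if age_days >= min_days:
            current = storage_class
    return current


if __name__ == "__main__":
    for age in (0, 29, 30, 364, 365):
        print(age, storage_class_for_age(age))
```
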
  10. 10. S3 support for IPv6 Dual-stack endpoints support both IPv4 and IPv6 Same high performance Integrated with most S3 features Manage access with IPv6 addresses Easy to adopt, just change your endpoint. No additional charges
  11. 11. IPv6 - Getting started Update your endpoint to • virtual hosted style address http://bucketname.s3.dualstack.aws-region.amazonaws.com Or • path style address http://s3.dualstack.aws-region.amazonaws.com/bucketname
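
As a small sketch of the two address styles above, the helper below builds both dual-stack URLs from a bucket name and a region. The function name `dualstack_urls` and the example values are assumptions for illustration; the slide shows `http`, and the default here is `https`, which the same endpoints also accept.

```python
def dualstack_urls(bucket, region, scheme="https"):
    """Build the virtual-hosted-style and path-style dual-stack URLs.

    Hypothetical helper; `bucket` and `region` are placeholders supplied by
    the caller (for example "examplebucket" and "us-west-2").
    """
    host = f"s3.dualstack.{region}.amazonaws.com"
    return {
        "virtual_hosted": f"{scheme}://{bucket}.{host}",
        "path": f"{scheme}://{host}/{bucket}",
    }


if __name__ == "__main__":
    for style, url in dualstack_urls("examplebucket", "us-west-2").items():
        print(style, url)
```
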
  12. 12. Restricting access by IP addresses: bucket policy with IPv4 { "Version": "2012-10-17", "Id": "S3PolicyId1", "Statement": [ { "Sid": "IPAllow", "Effect": "Allow", "Principal": "*", "Action": "s3:*", "Resource": "arn:aws:s3:::examplebucket/*", "Condition": { "IpAddress": {"aws:SourceIp": "54.240.143.0/24"}, "NotIpAddress": {"aws:SourceIp": "54.240.143.188/32"} } } ] }
  13. 13. Updating bucket policy with IPv6 { "Version": "2012-10-17", "Id": "S3PolicyId1", "Statement": [ { "Sid": "IPAllow", "Effect": "Allow", "Principal": "*", "Action": "s3:*", "Resource": "arn:aws:s3:::examplebucket/*", "Condition": { "IpAddress": {"aws:SourceIp": ["54.240.143.0/24", "2001:DB8:1234:5678::/64"]}, "NotIpAddress": {"aws:SourceIp": ["54.240.143.128/30", "2001:DB8:1234:5678:ABCD::/80"]} } } ] }
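
To see which source addresses such a condition block would match, the sketch below reproduces the IpAddress / NotIpAddress logic with Python's standard `ipaddress` module. It is a local approximation for reasoning about CIDR ranges, not the IAM policy evaluator; the names `ALLOW`, `DENY`, and `condition_matches` are hypothetical.

```python
import ipaddress

# CIDR ranges copied from the IPv6-enabled bucket policy above.
ALLOW = ["54.240.143.0/24", "2001:DB8:1234:5678::/64"]
DENY = ["54.240.143.128/30", "2001:DB8:1234:5678:ABCD::/80"]


def _in_any(addr, cidrs):
    """Return True if addr falls inside any of the CIDR ranges."""
    ip = ipaddress.ip_address(addr)
    for cidr in cidrs:
        net = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if ip.version == net.version and ip in net:
            return True
    return False


def condition_matches(source_ip):
    """Mimic the statement's condition: in an allowed range, not in an excluded one."""
    return _in_any(source_ip, ALLOW) and not _in_any(source_ip, DENY)


if __name__ == "__main__":
    for ip in ("54.240.143.10", "54.240.143.129",
               "2001:db8:1234:5678::1", "2001:db8:1234:5678:abcd::1"):
        print(ip, condition_matches(ip))
```
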
  14. 14. John Brzozowski Fellow and Chief Architect, IPv6
  15. 15. 15 – COMCAST IPV6 @ COMCAST "Route 6 runs uncertainly from nowhere to nowhere, scarcely to be followed from one end to the other, except by some devoted eccentric" (George R. Stewart) AWS NYC 2016
  16. 16. 16 – COMCAST BACKGROUND • The IPv6 program at Comcast began in 2005 • Seamlessness is a cornerstone of our program • Motivation • IPv4 is not adequate, could not support near or long term growth requirements • IPv6 is inevitable • Scope • Everything, over time!
  17. 17. 17 – COMCAST THE FIRST IPV6 ONLY SERVICE… • 98+% of devices are managed using IPv6 only • Management use of IPv6 (only) is one of the largest deployments of IPv6 worldwide • Trending towards 100% of all new and existing devices managed using IPv6 only, no IPv4 GROWTH
  18. 18. 18 – COMCAST BROADBAND 89 %
  19. 19. 19 – COMCAST X1 ~50 %
  20. 20. 20 – COMCAST NEXT… • Minimizing and reducing IPv4 dependencies • IPv6 is used to manage the majority (and growing) of our business needs today • IPv6 utilization continues to grow • Currently ~30% of our Internet facing communications is over IPv6 • Leverage IPv6 as a platform for innovation
  21. 21. 21 – COMCAST STAY TUNED…
  22. 22. Data ingestion into S3
  23. 23. S3 Transfer Acceleration S3 Bucket AWS Edge Location Uploader Optimized Throughput! Typically 50%-400% faster Change your endpoint, not your code No firewall exceptions No client software required 59 global edge locations
  24. 24. S3 Transfer Acceleration: How fast is S3 Transfer Acceleration? [Chart: time in hours for a 500 GB upload to a bucket in Singapore, S3 Transfer Acceleration vs. public Internet, from edge locations in Rio de Janeiro, Warsaw, New York, Atlanta, Madrid, Virginia, Melbourne, Paris, Los Angeles, Seattle, Tokyo, and Singapore]
  25. 25. Getting started 1. Enable S3 Transfer Acceleration on your S3 bucket. 2. Update your endpoint to <bucket-name>.s3-accelerate.amazonaws.com. 3. Done!
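
Step 2 amounts to swapping the hostname. A minimal sketch, assuming a hypothetical helper name `accelerate_endpoint`; it also rejects bucket names containing periods, since Transfer Acceleration requires DNS-compliant bucket names without periods.

```python
def accelerate_endpoint(bucket):
    """Return the Transfer Acceleration hostname for a bucket.

    Hypothetical helper; enabling acceleration on the bucket (step 1)
    still has to be done separately.
    """
    if "." in bucket:
        raise ValueError("bucket names used with Transfer Acceleration must not contain periods")
    return f"{bucket}.s3-accelerate.amazonaws.com"


if __name__ == "__main__":
    print(accelerate_endpoint("examplebucket"))
```
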
  26. 26. How much will it help me? s3speedtest.com
  27. 27. Tip: Parallelizing PUTs with multipart uploads • Increase aggregate throughput by parallelizing PUTs on high-bandwidth networks • Move the bottleneck to the network, where it belongs • Increase resiliency to network errors; fewer large restarts on error-prone networks Best Practice
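
The parallelization idea can be sketched as two steps: split the object into parts, then hand the parts to a pool of workers. The code below is a sketch under stated assumptions: `plan_parts` and `upload_all` are hypothetical names, and `upload_part` stands in for whatever function actually sends one part (for example an SDK call). The 5 MiB minimum part size and 10,000-part maximum are S3 limits.

```python
from concurrent.futures import ThreadPoolExecutor

MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for every part except the last
MAX_PARTS = 10_000               # S3 maximum number of parts per upload


def plan_parts(total_size, part_size=8 * 1024 * 1024):
    """Split total_size bytes into (part_number, offset, length) tuples."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    parts = []
    offset = 0
    number = 1  # S3 part numbers start at 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; increase part_size")
    return parts


def upload_all(parts, upload_part, max_workers=8):
    """Call the caller-supplied upload_part(number, offset, length) in parallel.

    Results come back in part order, which is the order needed to
    complete the multipart upload.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: upload_part(*p), parts))


if __name__ == "__main__":
    plan = plan_parts(20 * 1024 * 1024)
    # Stub uploader for demonstration only: echoes the part number.
    print(upload_all(plan, lambda number, offset, length: number))
```

Because each part is retried independently, a network error costs at most one part rather than the whole object.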
  28. 28. Best practice: Incomplete multipart upload expiration policy • Partial uploads do incur storage charges • Set a lifecycle policy to automatically expire incomplete multipart uploads after a predefined number of days
  29. 29. Enable policy with the AWS Management Console
  30. 30. Example lifecycle policy <LifecycleConfiguration> <Rule> <ID>sample-rule</ID> <Prefix>MyKeyPrefix/</Prefix> <Status>rule-status</Status> <AbortIncompleteMultipartUpload> <DaysAfterInitiation>7</DaysAfterInitiation> </AbortIncompleteMultipartUpload> </Rule> </LifecycleConfiguration> Or enable a policy with the API
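
What the AbortIncompleteMultipartUpload rule selects can be modeled locally as follows. This is an illustrative sketch, not the S3 implementation: `uploads_to_abort` is a hypothetical name, the upload IDs and dates are made up, and S3 applies lifecycle actions asynchronously, so actual cleanup time may lag the cutoff.

```python
from datetime import datetime, timedelta, timezone


def uploads_to_abort(uploads, now, days_after_initiation=7):
    """Return IDs of in-progress uploads initiated at least N days ago.

    uploads: iterable of (upload_id, initiated_at) pairs, where
    initiated_at is a timezone-aware datetime.
    """
    cutoff = now - timedelta(days=days_after_initiation)
    return [upload_id for upload_id, initiated in uploads if initiated <= cutoff]


if __name__ == "__main__":
    now = datetime(2016, 8, 15, tzinfo=timezone.utc)
    in_progress = [
        ("upload-a", datetime(2016, 8, 1, tzinfo=timezone.utc)),
        ("upload-b", datetime(2016, 8, 10, tzinfo=timezone.utc)),
    ]
    print(uploads_to_abort(in_progress, now))
```
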
  31. 31. Tip #1: Use versioning • Protects from accidental overwrites and deletes • New version with every upload • Easy retrieval of deleted objects and roll back to previous versions Best Practice Versioning
  32. 32. Tip #2: Use lifecycle policies • Automatic tiering and cost controls • Includes two possible actions: • Transition: moves objects to Standard - IA or Amazon Glacier based on the object age you specify • Expiration: deletes objects after a specified time • Actions can be combined • Set policies at the bucket or prefix level • Set policies for current or noncurrent versions
  33. 33. Versioning + lifecycle policies
  34. 34. Expired object delete marker policy • Deleting a versioned object makes a delete marker the current version of the object • Removing expired object delete marker can improve list performance • Lifecycle policy automatically removes the current version delete marker when previous versions of the object no longer exist Expired object delete marker
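
The delete-marker behavior described above can be illustrated with a toy model of one object's version stack. This is a simplified sketch for intuition only; the dictionary layout and the function names are hypothetical and do not mirror any S3 API.

```python
def current_version(versions):
    """versions is a list ordered oldest first; the last entry is current."""
    return versions[-1] if versions else None


def delete_object(versions):
    """A simple delete on a versioned object adds a delete marker on top."""
    return versions + [{"delete_marker": True}]


def is_expired_delete_marker(versions):
    """True when a delete marker is the only version left for the object."""
    return len(versions) == 1 and versions[0].get("delete_marker", False)


if __name__ == "__main__":
    stack = [{"id": "v1"}]
    stack = delete_object(stack)
    print(current_version(stack))           # the delete marker is now current
    print(is_expired_delete_marker(stack))  # previous version still exists
    stack = stack[1:]                       # previous version removed, e.g. by expiration
    print(is_expired_delete_marker(stack))  # marker can now be cleaned up
```
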
  35. 35. Enable policy with the console
  36. 36. Tip #3: Restrict deletes • Bucket policies can restrict deletes • For additional security, enable MFA (multi-factor authentication) delete, which requires additional authentication to: • Change the versioning state of your bucket • Permanently delete an object version • MFA delete requires both your security credentials and a code from an approved authentication device Best Practice
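
As one possible shape for a bucket policy that restricts deletes, the sketch below builds a Deny statement for the two delete actions and serializes it to JSON. The helper name `deny_delete_policy` and the `DenyDeletes` Sid are assumptions; as written the statement applies to every principal, so in practice the Principal or a Condition would usually be narrowed. MFA delete is a separate bucket versioning setting and is not shown here.

```python
import json


def deny_delete_policy(bucket):
    """Build a bucket policy document that denies object deletes.

    Hypothetical helper; the bucket name is a placeholder supplied by
    the caller.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyDeletes",
                "Effect": "Deny",
                "Principal": "*",
                # Covers deleting the current object and specific versions.
                "Action": ["s3:DeleteObject", "s3:DeleteObjectVersion"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_delete_policy("examplebucket"), indent=2))
```
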
  37. 37. Tip #4: Distribute key names • Use a key-naming scheme with randomness at the beginning for high TPS • Most important if you regularly exceed 100 TPS on a bucket • Avoid starting with a date • Consider adding a hash or reversed timestamp (ssmmhhddmmyy) Don't do this… <my_bucket>/2013_11_13-164533125.jpg <my_bucket>/2013_11_13-164533126.jpg <my_bucket>/2013_11_13-164533127.jpg <my_bucket>/2013_11_13-164533128.jpg <my_bucket>/2013_11_12-164533129.jpg <my_bucket>/2013_11_12-164533130.jpg <my_bucket>/2013_11_12-164533131.jpg <my_bucket>/2013_11_12-164533132.jpg <my_bucket>/2013_11_11-164533133.jpg <my_bucket>/2013_11_11-164533134.jpg <my_bucket>/2013_11_11-164533135.jpg <my_bucket>/2013_11_11-164533136.jpg
  38. 38. Distributing key names Add randomness to the beginning of the key name… <my_bucket>/521335461-2013_11_13.jpg <my_bucket>/465330151-2013_11_13.jpg <my_bucket>/987331160-2013_11_13.jpg <my_bucket>/465765461-2013_11_13.jpg <my_bucket>/125631151-2013_11_13.jpg <my_bucket>/934563160-2013_11_13.jpg <my_bucket>/532132341-2013_11_13.jpg <my_bucket>/565437681-2013_11_13.jpg <my_bucket>/234567460-2013_11_13.jpg <my_bucket>/456767561-2013_11_13.jpg <my_bucket>/345565651-2013_11_13.jpg <my_bucket>/431345660-2013_11_13.jpg
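
Two ways to put variation at the start of a key, sketched in Python: a short hash prefix derived from the original name, and the reversed timestamp (ssmmhhddmmyy) mentioned on the slide. The function names, the choice of SHA-256, and the 4-character prefix length are assumptions for illustration; the slides do not prescribe a particular hash.

```python
import hashlib
from datetime import datetime


def hashed_key(key, prefix_len=4):
    """Prefix the key with the first hex characters of its SHA-256 digest.

    The prefix is deterministic, so the full key can be recomputed from
    the original name when reading the object back.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{key}"


def reversed_timestamp(ts):
    """Format a datetime as ssmmhhddmmyy so the fastest-changing field leads."""
    return ts.strftime("%S%M%H%d%m%y")


if __name__ == "__main__":
    print(hashed_key("2013_11_13-164533125.jpg"))
    print(reversed_timestamp(datetime(2013, 11, 13, 16, 45, 33)))
```
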
  39. 39. Remember to complete your evaluations!
  40. 40. Thank you!
