STG303 Building Scalable Applications on Amazon S3 - AWS re:Invent 2012

Want to build an application that requires minimal up-front investment and scales seamlessly from hundreds to millions of users? Amazon S3 is a powerful building block that lets you focus your time on the value and functionality of your application rather than on the challenges of scaling it. In this session we'll cover techniques for getting the most out of the platform. We'll discuss how to structure your key naming convention for consistent performance and how to optimize your upload and download throughput. We'll learn how to eliminate proxies between your application and Amazon S3 and how to use the platform for your logging needs. Finally, we'll cover simple techniques for efficiently managing the billions of objects your highly scaled application may accumulate.

  1. Buckets, Objects, Keys
  2. (1) Hashing Keys, (2) Lifecycle Policies, (3) Multipart Upload, (4) Object Aggregation, (5) Content Distribution, (6) Server Access Logging, (7) Bucket Policies, (8) Pre-signed URLs, (9) DNS Time To Live, (10) TCP Window Scaling
  3. mybucket/2012-11-29-15-06-03/cust1234234/photo1.jpg
     mybucket/2012-11-29-15-06-03/cust8234233/photo2.jpg
     mybucket/2012-11-29-15-06-03/cust1234234/photo2.jpg
     mybucket/2012-11-29-15-06-03/cust7433234/photo1.jpg
     mybucket/2012-11-29-15-06-04/cust8234233/photo5.jpg
     mybucket/2012-11-29-15-06-04/cust2123234/photo1.jpg
     mybucket/2012-11-29-15-06-04/cust8234233/photo8.jpg
     mybucket/2012-11-29-15-06-04/cust3241234/photo7.jpg
  4. mybucket/2012-11-29-15-06-03/cust1234234/photo1.jpg
     mybucket/2012-11-29-15-06-03/cust8234233/photo2.jpg
     mybucket/2012-11-29-15-06-03/cust1234234/photo2.jpg
     mybucket/2012-11-29-15-06-03/cust7433234/photo1.jpg
     mybucket/2012-11-29-15-06-04/cust8234233/photo5.jpg
     mybucket/2012-11-29-15-06-04/cust2123234/photo1.jpg
     mybucket/2012-11-29-15-06-04/cust8234233/photo8.jpg
     mybucket/2012-11-29-15-06-04/cust3241234/photo7.jpg
  5. mybucket/2012-11-29-15-06-03/cust1234234/photo1.jpg
  6. Host-D, Host-A, Host-C, Host-B
  7. Partition-A, Partition-B, Partition-C, Partition-D
  8. Partition-A, Partition-B, Partition-C, Partition-D
  9. Partition-A, Partition-B, Partition-C, Partition-D
  10. Partition-A, Partition-B, Partition-C, Partition-D, Partition-E
  11. mybucket/232A-2012-11-29-15-06-03/cust1234234/photo1.jpg
      mybucket/7B54-2012-11-29-15-06-03/cust8234233/photo2.jpg
      mybucket/921C-2012-11-29-15-06-03/cust1234234/photo2.jpg
      mybucket/BA65-2012-11-29-15-06-03/cust7433234/photo1.jpg
      mybucket/8761-2012-11-29-15-06-04/cust8234233/photo5.jpg
      mybucket/2E4F-2012-11-29-15-06-04/cust2123234/photo1.jpg
      mybucket/9810-2012-11-29-15-06-04/cust8234233/photo8.jpg
      mybucket/7E34-2012-11-29-15-06-04/cust3241234/photo7.jpg
  12. Partition-A, Partition-B, Partition-C, Partition-D
  13. Partition-A, Partition-B, Partition-C, Partition-C’, Partition-D
  14. MyOtherBucket/1290052.obj  MyOtherBucket/2500921.obj
      MyOtherBucket/1290053.obj  MyOtherBucket/3500921.obj
      MyOtherBucket/1290054.obj  MyOtherBucket/4500921.obj
      MyOtherBucket/1290055.obj  MyOtherBucket/5500921.obj
      MyOtherBucket/1290056.obj  MyOtherBucket/6500921.obj
      MyOtherBucket/1290057.obj  MyOtherBucket/7500921.obj
      MyOtherBucket/1290058.obj  MyOtherBucket/8500921.obj
      MyOtherBucket/1290059.obj  MyOtherBucket/9500921.obj
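
Slide 11 shows the timestamp-first keys from slide 3 with a four-character hex prefix added, so that keys written in the same second no longer share a leading string. The slides do not say how the prefix is derived. The sketch below assumes it is taken from an MD5 digest of the original key; the function name `hashed_key` and the choice of MD5 are illustrative assumptions, not something stated in the deck.

```python
import hashlib


def hashed_key(key: str, prefix_len: int = 4) -> str:
    """Prepend a short hex prefix derived from an MD5 digest of the key.

    Assumption: MD5 is used only to spread keys across the keyspace,
    not for any security purpose.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest().upper()
    return f"{digest[:prefix_len]}-{key}"


if __name__ == "__main__":
    for key in (
        "2012-11-29-15-06-03/cust1234234/photo1.jpg",
        "2012-11-29-15-06-03/cust8234233/photo2.jpg",
    ):
        print(hashed_key(key))
```

Because the prefix is computed from the key itself, a reader that knows the original key can recompute the stored key without a lookup table.
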
  15. Rule: log objects expire after 30 days
  16. log/
  17. 30
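
Slides 15 to 17 describe a lifecycle rule that expires objects under the `log/` prefix after 30 days. As a rough sketch, the same rule can be written as a Python dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The rule ID, the helper name `expire_rule`, and the bucket name in the trailing comment are assumptions made for illustration; the block only builds and prints the configuration and does not call AWS.

```python
import json


def expire_rule(prefix: str, days: int, rule_id: str) -> dict:
    """Build a lifecycle configuration containing one expiration rule."""
    return {
        "Rules": [
            {
                "ID": rule_id,                  # hypothetical rule name
                "Filter": {"Prefix": prefix},   # which keys the rule covers
                "Status": "Enabled",
                "Expiration": {"Days": days},   # delete after this many days
            }
        ]
    }


config = expire_rule(prefix="log/", days=30, rule_id="expire-logs-30d")
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, this could be applied as:
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="mybucket", LifecycleConfiguration=config)
```
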
  18. Scaling Large Uploads: MyHugeDataBlob.obj
  19. Scaling Large Uploads: MyHugeDataBlob.obj
  20. Scaling Large Uploads: MyHugeDataBlob.obj X
  21. Multipart Uploads: Parallel uploads (MyHugeDataBlob.obj)
  22. Multipart Uploads: Recovery from network issues (MyHugeDataBlob.obj)
  23. Multipart Uploads: Unknown Object Length (MyLogFile.txt)
  24. Upload a Single Object
  25. Upload using Multipart Upload
  26. Retrieve via Ranged Gets
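
Slides 21 to 26 rely on splitting one large object into byte ranges that can be uploaded as parts, or fetched with ranged GETs, independently of each other. The sketch below only computes those ranges and the matching HTTP `Range` header values; it makes no network calls. The 8 MiB default part size and the helper names are assumptions, while the 5 MiB minimum part size (for every part except the last) and the 10,000-part limit are S3's documented multipart constraints.

```python
from typing import List, Tuple

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB   # S3 minimum for every part except the last
MAX_PARTS = 10_000        # S3 limit on parts per multipart upload


def part_ranges(total_size: int, part_size: int = 8 * MIB) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte offsets for each part."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    ranges = [
        (start, min(start + part_size, total_size) - 1)
        for start in range(0, total_size, part_size)
    ]
    if len(ranges) > MAX_PARTS:
        raise ValueError("too many parts; increase part_size")
    return ranges


def range_header(start: int, end: int) -> str:
    """Format an HTTP Range header value for a ranged GET."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    # A hypothetical 20 MiB object split into 8 MiB parts.
    for number, (start, end) in enumerate(part_ranges(20 * MIB), start=1):
        print(number, range_header(start, end))
```

Each range can be handed to a separate worker, which is what makes parallel transfer and per-part retry after a network failure possible.
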
  27. S3
  28. S3, CloudFront
  29. 68 ms round trip
  30. Incoming, Outgoing
  31. IAM Users Restricted Partner Private
  32. Scaling Customer Uploads
  33. Scaling Customer Uploads
  34. Hand out credentials -- NO
  35. Pre-signed URLs
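
Slides 32 to 35 recommend pre-signed URLs over handing credentials to customers: the server signs a request description with its secret key, and the client receives only a time-limited URL. The slides do not show the signing step. The sketch below assumes the legacy query-string authentication scheme (Signature Version 2) that was in use around the time of the talk; the function name, the path-style endpoint, and the placeholder credentials are illustrative. Current SDKs sign with Signature Version 4, so in practice an SDK call such as boto3's `generate_presigned_url` is the safer route.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def presign_get_v2(bucket: str, key: str, access_key: str,
                   secret_key: str, expires: int) -> str:
    """Build a query-string-authenticated GET URL (legacy Signature Version 2).

    `expires` is an absolute Unix timestamp in seconds.
    """
    resource = f"/{bucket}/{quote(key)}"
    # Verb, Content-MD5, Content-Type, Expires, then the canonical resource.
    string_to_sign = f"GET\n\n\n{expires}\n{resource}"
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    signature = quote(base64.b64encode(digest).decode("ascii"), safe="")
    return (
        f"https://s3.amazonaws.com{resource}"
        f"?AWSAccessKeyId={access_key}&Expires={expires}&Signature={signature}"
    )


if __name__ == "__main__":
    # Placeholder credentials, not real keys.
    print(presign_get_v2("mybucket", "cust1234234/photo1.jpg",
                         "AKIAEXAMPLE", "example-secret", 1354200000))
```

The secret key never leaves the server; the client sees only the access key ID, the expiry time, and the signature.
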
  36. S3 has many IP addresses; IP addresses come and go; DNS resolution is cached; S3 has a service SLA, not an IP address SLA
  37. Java DNS Caching Behavior: http://docs.oracle.com/javase/1.5.0/docs/api/java/net/InetAddress.html
  38. T1: 1.544 Mbit/s (188 KiB/s), 100 ms, BDP = 18.8 KiB
  39. 1 Gbit/s LAN (119.2 MiB/s), 1 ms, BDP = 122 KiB
  40. 1 Gbit/s LAN (119.2 MiB/s), 100 ms, BDP = 11.9 MiB
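
The BDP figures on slides 38 to 40 follow from the bandwidth-delay product: link bandwidth multiplied by the delay, which is taken here to be the round-trip time. The short sketch below reproduces that arithmetic; the helper name `bdp_bytes` is an assumption made for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes: bytes in flight needed to fill the link."""
    return bandwidth_bits_per_s / 8 * rtt_s


if __name__ == "__main__":
    for name, bandwidth, rtt in [
        ("T1, 100 ms", 1.544e6, 0.100),
        ("1 Gbit/s, 1 ms", 1e9, 0.001),
        ("1 Gbit/s, 100 ms", 1e9, 0.100),
    ]:
        print(f"{name}: {bdp_bytes(bandwidth, rtt) / KIB:.1f} KiB")
```
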
  41. http://tools.ietf.org/html/rfc1323
  42. 1 Gbit/s fiber (119.2 MiB/s), 100 ms, 64 KiB window: 640 KiB/s throughput
  43. 1 Gbit/s fiber (119.2 MiB/s), 100 ms, 64 KiB window: 640 KiB/s throughput
  44. 1 Gbit/s fiber (119.2 MiB/s), 100 ms, 64 KiB window: 640 KiB/s throughput
  45. 1 Gbit/s fiber (119.2 MiB/s), 100 ms, 64 KiB window: 640 KiB/s throughput
  46. 1 Gbit/s fiber (119.2 MiB/s), 100 ms, 64 KiB window: 640 KiB/s throughput
  47. WSCALE  Size in Bytes   Size
      0       65,536          64 KiB
      1       131,072         128 KiB
      2       262,144         256 KiB
      3       524,288         512 KiB
      4       1,048,576       1 MiB
      5       2,097,152       2 MiB
      6       4,194,304       4 MiB
      7       8,388,608       8 MiB
      8       16,777,216      16 MiB
      9       33,554,432      32 MiB
      10      67,108,864      64 MiB
      11      134,217,728     128 MiB
      12      268,435,456     256 MiB
      13      536,870,912     512 MiB
      14      1,073,741,824   1 GiB
  48. 100 Mbit/s (11.9 MiB/s), 335 ms, 4 MiB window
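
Slides 42 to 48 tie throughput to the TCP window: a single connection can have at most one window of data in flight per round trip, so a 64 KiB window over a 100 ms path tops out at 640 KiB/s regardless of link speed. The sketch below reproduces that bound, the window sizes in the slide 47 table, and the scale factor needed to fill the link on slide 48. The function names are assumptions, and the window sizes follow the slide's table (64 KiB shifted left by the scale factor).

```python
KIB = 1024
MIB = 1024 * KIB


def max_throughput(window_bytes: int, rtt_s: float) -> float:
    """Upper bound in bytes/s for one connection: one window per round trip."""
    return window_bytes / rtt_s


def window_for_scale(wscale: int) -> int:
    """Window size in bytes for a scale factor, following the slide 47 table."""
    if not 0 <= wscale <= 14:
        raise ValueError("RFC 1323 limits the shift count to 14")
    return 65_536 << wscale


def min_scale(bandwidth_bits_per_s: float, rtt_s: float) -> int:
    """Smallest scale factor whose window covers the bandwidth-delay product."""
    bdp = bandwidth_bits_per_s / 8 * rtt_s
    for wscale in range(15):
        if window_for_scale(wscale) >= bdp:
            return wscale
    raise ValueError("path needs a window larger than 1 GiB")


if __name__ == "__main__":
    print(max_throughput(64 * KIB, 0.100) / KIB, "KiB/s")
    scale = min_scale(100e6, 0.335)
    print("WSCALE", scale, "window", window_for_scale(scale) // MIB, "MiB")
```
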
  49. 2000 1400 1000 0
  50. 2000 1400 1000 0; ACK 1000
  51. 2000 1400 1000 0; 1500-2000, 1100-1400, 0-1000; http://tools.ietf.org/html/rfc2018
  52. (1) Hashing Keys, (2) Lifecycle Policies, (3) Multipart Upload, (4) Object Aggregation, (5) Content Distribution, (6) End User Logging, (7) Bucket Policies, (8) Pre-signed URLs, (9) DNS Time To Live, (10) TCP Window Scaling
