MED201 Media Ingest and Storage Solutions with AWS - AWS re:Invent 2012

In this session we discuss the many ways to ingest data into AWS, including physical media import and AWS Direct Connect. We also cover policy-based Hierarchical Storage Management (HSM) in the cloud, total cost of ownership, the importance of storage durability, and the virtually unlimited scalability of Amazon S3. Finally, Alan Schaaf, founder of the photo-sharing sensation Imgur, speaks about the company's migration to AWS.


Transcript

  • 1. tweet #reinvent
  • 2. ••••••
  • 3. SIMPLE STORAGE SERVICE
  • 4. DURABILITY
  • 5. UNIQUE CUSTOMER OBJECTS
  • 6. PEAK TRANSACTIONS PER SECOND
  • 7. 99.99% vs. 99.999999999% durability of objects over a given year
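The gap between the two durability figures on slide 7 is easier to see as expected losses. The sketch below is only an illustration: it assumes each object is lost independently with annual probability (100% - durability), which is a simplification, and the function names are hypothetical.

```python
from decimal import Decimal


def expected_annual_losses(num_objects, durability_percent):
    """Expected number of objects lost per year.

    durability_percent is passed as a string (e.g. "99.99") so that Decimal
    keeps every digit; a float would round the eleven-nines figure.
    Assumption: losses are independent across objects.
    """
    loss_probability = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return Decimal(num_objects) * loss_probability


def years_per_lost_object(num_objects, durability_percent):
    """Average number of years between single-object losses."""
    return Decimal(1) / expected_annual_losses(num_objects, durability_percent)


if __name__ == "__main__":
    for durability in ("99.99", "99.999999999"):
        losses = expected_annual_losses(10_000, durability)
        years = years_per_lost_object(10_000, durability)
        print(f"{durability}%: {losses} objects/year, one loss every {years} years")
```

Under these assumptions, someone storing 10,000 objects would expect to lose about one object per year at 99.99%, versus one object every ten million years at 99.999999999%.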
  • 8. Customer Decides Where Applications and Data Reside
  • 9. • Disaster Recovery • Backup • Distribution • Archive • Load for Elastic MapReduce • Migrate / Sync (Enterprise, Region 1, Region 2) www.attunitycloudbeam.com • SaaS-based solution • Simplicity, “click and replicate” • Speed, optimized and elastic transfer • Safe, secure and guaranteed delivery
  • 10. • LOW-COST ARCHIVING SERVICE
  • 11. PER GB / MONTH
  • 12. PER TB / YEAR
  • 13. DURABILITY
  • 14. DATA RETRIEVAL
  • 15. LOW-COST ARCHIVING SERVICE
  • 16. Archive after 30 days: logs/file1, logs/file2, logs/file3 (My S3 bucket → Amazon Glacier)
  • 17. Expire after 365 days: logs/file1, logs/file2, logs/file3 (My S3 bucket → Amazon Glacier)
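Slides 16 and 17 describe a lifecycle policy on the logs/ prefix: archive to Amazon Glacier after 30 days, expire after 365 days. A minimal sketch of such a rule follows, written in the dictionary shape that the present-day boto3 SDK accepts for `put_bucket_lifecycle_configuration`. The rule ID, function names, and bucket name are hypothetical, and this is not necessarily how the presenters configured it in 2012.

```python
def build_log_lifecycle(prefix="logs/", archive_after_days=30, expire_after_days=365):
    """Build a lifecycle configuration: transition to Glacier, then expire."""
    if expire_after_days <= archive_after_days:
        raise ValueError("objects must be archived before they expire")
    return {
        "Rules": [
            {
                "ID": "archive-then-expire-logs",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


def apply_lifecycle(bucket_name, lifecycle):
    """Send the configuration to S3 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without the SDK

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name, LifecycleConfiguration=lifecycle
    )


if __name__ == "__main__":
    print(build_log_lifecycle())
    # apply_lifecycle("my-s3-bucket", build_log_lifecycle())  # hypothetical bucket
```

Keeping the builder separate from the API call lets the policy be reviewed or unit-tested before anything touches a real bucket.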
  • 18. Storage Options Amazon Reduced Data Glacier Redundancy Storage Standard Storage Amazon Glacier Amazon S3
  • 19. Glacier Transforms tiered storage
  • 20. HSM with AWS: SAN (Corporate Data Center), Amazon S3 and Amazon Glacier (AWS Cloud) versus Traditional Approach to HSM: SAN storage, tier 2 disk storage, offsite tape backup (Corporate Data Center)
  • 21. Compliance with AWS: O/S image, Amazon S3 and Amazon Glacier (AWS Cloud) versus Traditional Approach: O/S image, disk storage, off-site tape backup (Corporate Data Center)
  • 22. Options to consider
  • 23. Workflow diagram (labels: HTTP multipart, fasp, Parallel Transcoding, 14 instances: 3 min, Herndon, VA): 1. Video broadcast capture 2. High-speed upload direct to S3 3. Scale-out parallel transcode 4. Deliver back to S3 5. High-speed download from S3 to UFC 6. Insert into CMS for streaming to mobile devices
  • 24. Common mistakes
  • 25. What is Imgur? • A simple image sharer • Has the most viral images on the Internet • Anyone can upload as many images as they want – without an account • 2,000,000 images uploaded per day • That’s 23 images per second • Can be embedded and shared on any site
  • 26. The greatest image site. Full of all the wonders and magic of the interwebs. Be forewarned, time has been known to quicken in this realm. “I spent half a day on Imgur, and it was the greatest 6 hours of my life.” - Urban Dictionary
  • 27. • Started as a side project while at Ohio University • Redditors needed a place to host their images • Organically grew into a business • Alan was the only developer for 3 years • Moved to San Francisco • Now a team of 7 • (600 million pageviews per engineer)
  • 28. • Every month, there are: • 11 minutes average visit duration • 2.9 billion page views • 11 pages per visit • 38 billion image views (images loaded) • 46th biggest site in the US (according to alexa.com) • 54 million unique visitors • 4.7 petabytes of bandwidth used • 600 million objects stored in S3 • 62 million images uploaded * All data as of Nov 2012
  • 29. • Pageviews are growing 15% every month. • How are we able to support this kind of growth?
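To put the 15% monthly growth from slide 29 in perspective, the compounding can be worked out directly. This is a back-of-the-envelope sketch that assumes the rate stays constant, which the slides do not claim; the function names are hypothetical.

```python
import math


def months_to_double(monthly_growth):
    """Months needed for traffic to double at a constant monthly growth rate."""
    return math.log(2) / math.log(1 + monthly_growth)


def growth_factor(monthly_growth, months):
    """Total multiplier after compounding for the given number of months."""
    return (1 + monthly_growth) ** months


if __name__ == "__main__":
    rate = 0.15  # 15% per month, from slide 29
    print(f"doubling time: {months_to_double(rate):.1f} months")
    print(f"growth over 12 months: {growth_factor(rate, 12):.2f}x")
```

At a steady 15% per month, traffic doubles in just under five months and grows a little more than fivefold in a year, which is why capacity planning on fixed hardware becomes difficult.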
  • 30. User makes a request for an image (Don’t do this!)
  • 31. User makes a request for an image
  • 32. User makes a request for an image
  • 33. • Site traffic is increasing more than ever. How many more servers do we need? • Hardware failures • Tweaking every little thing is really hard and easy to get wrong, but necessary • There’s only one man doing all this; how can we make his life easier, while scaling the site at the same time?
  • 34. • Autoscaling is awesome • Automated DB backups are awesome • Security features are awesome • Much easier to manage in the long run • Because everything’s managed, you require fewer admins to look over everything all the time • AWS has managed solutions for all the core services your website needs (server, database, cache, backups, security, etc.)
  • 35. • Lots of new stuff to learn and set up • Possible downtime during migration • Very time consuming at first because you’re reconfiguring your entire stack
  • 36. • AWS has a lot of services; find out which ones can work for you and how • Use the price calculator: http://calculator.s3.amazonaws.com/calc5.html • Read the docs: http://aws.amazon.com • Set up a test environment • Install the AWS SDK • Call AWS if you have questions. • (You don’t need AWS Support to call in) • Start coding!
  • 37. • How do you get all your data to S3? • Duplicate writes: 1 to native, 1 to S3. • Upload all your data to S3 in parallel. We had 12 background processes running around the clock, each uploading a different subset of data to S3 – it took 2 weeks to finish. • No need to store more than one copy • Turn on versioning for even more protection • Very similar process for Amazon RDS
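The migration pattern on slide 37 (duplicate every new write, and backfill existing data with 12 parallel workers that each take a different subset) can be sketched as follows. This is not Imgur's actual code: the hash-based partitioning, the thread pool standing in for their background processes, and the `write_native`, `write_s3`, and `upload_one` callables are all assumptions made for illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 12  # the slide mentions 12 background processes


def worker_for(key, num_workers=NUM_WORKERS):
    """Assign a key to one worker by hashing it (assumed partitioning scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_workers


def partition(keys, num_workers=NUM_WORKERS):
    """Split keys into disjoint subsets, one per worker."""
    subsets = [[] for _ in range(num_workers)]
    for key in keys:
        subsets[worker_for(key, num_workers)].append(key)
    return subsets


def upload_subset(keys, upload_one):
    """Upload every key in one subset; returns how many were uploaded."""
    for key in keys:
        upload_one(key)
    return len(keys)


def backfill(keys, upload_one, num_workers=NUM_WORKERS):
    """Upload all existing keys in parallel, one subset per worker thread."""
    subsets = partition(keys, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        counts = pool.map(lambda subset: upload_subset(subset, upload_one), subsets)
        return sum(counts)


def dual_write(key, data, write_native, write_s3):
    """During migration, send each new write to both the old store and S3."""
    write_native(key, data)
    write_s3(key, data)
```

In a real migration `upload_one` would wrap an S3 upload call; with dual writes enabled first, the backfill only has to cover data that existed before the switch.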
  • 38. • There’s no web interface • Have to do everything from command line • Confusing terminology • Hard to verify that it’s working as intended • But in the end, it’s amazing • If you’re not using it, you’d better have a really good reason why
  • 39. • EC2: • Maximum performance with RAID 0 Elastic Block Store • RAID 0 EBS requires a pretty significant amount of maintenance overhead • Have to come up with your own backup plan • RDS: • Will provide very good performance out of the box (but not maximum) • Management console is fantastic • Easy to upgrade instances • High availability and read-only slaves are a click away • Managed service, which makes it more expensive • If you enjoy tuning every last little bit for maximum performance, then you can consider EC2 + EBS RAID 0 • Still on the fence? Go with RDS
  • 40. • There’s no access to the underlying file system • Migrating requires a dump and an import of your data, which is extremely time consuming for large databases • No access to the logs when things break • We were able to do it live – without taking the site down – but with lots of headaches
  • 41. • Wed (1:00 p.m.–1:50 p.m.) MED203: Scalable Media Processing with AWS • Wed (2:05 p.m.–2:55 p.m.) MED202: Netflix’s Transcoding Transformation • Wed (3:25 p.m.–4:15 p.m.) MED303: Addressing Security in Media Workflow • Thu (10:30 a.m.–11:20 a.m.) STG205: Amazon S3: Reduce costs, save time, and better protect your data • Thu (11:35 a.m.–12:25 p.m.) STG203: Cloud Storage War Stories: From the front lines of some of the biggest battles • Thu (4:05 p.m.–4:55 p.m.) STG302: Archive in the Cloud with Amazon Glacier • Wed (1:00 p.m.–1:50 p.m.) STG201: Understanding AWS Storage Options http://aws.amazon.com/s3/ http://aws.amazon.com/glacier/faqs/ http://aws.amazon.com/digital-media/
  • 42. We are sincerely eager to hear your feedback on this presentation and on re:Invent. Please fill out an evaluation form when you have a chance. tweet #reinvent