Media Content Ingest, Storage, and Archiving with AWS - John Downey, Amazon Web Services
 

Presentation Transcript

  • Media Content Ingest, Storage, and Archiving with AWS – MED301 John Downey, Amazon Web Services, Business Development Manager - Storage November 13, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  • Agenda – Content Ingest, Storage, and Archiving •  AWS components –  Ingest –  Storage –  Archive •  Partner components –  Ingest –  Storage –  Archive •  TCO / ROI considerations
  • AWS Global Infrastructure 9 Regions 25 Availability Zones 46 Edge locations
  • AWS Regions and Availability Zones •  US East (N. Virginia) •  US West (Oregon) •  US West (N. California) •  AWS GovCloud (US) •  South America (Sao Paulo) •  EU (Ireland) •  Asia Pacific (Tokyo) •  Asia Pacific (Singapore) •  Asia Pacific (Sydney) Each region contains multiple Availability Zones. Customer decides where applications and data reside
  • AWS Ingest Options •  AWS Direct Connect: dedicated bandwidth between your site and AWS •  AWS Import/Export: physical transfer of media into and out of AWS •  AWS Storage Gateway: on-premises storage federation with Amazon S3 and Amazon Glacier
  • AWS Ingest Options – One Common Theme: Parallel Uploads 1.  Multipart upload 2.  Request rate optimization 3.  TCP window scaling 4.  TCP selective acknowledgement AWS has customers that ingest roughly 1 PB per day
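The parallel, multipart idea on this slide can be illustrated with a minimal sketch. It only shows how a payload might be cut into numbered parts and sent on a thread pool. The `upload_part` callable and the `fake_upload_part` stand-in are hypothetical names introduced here, not part of any AWS SDK, and the part size and worker count are arbitrary assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_parts(total_size, part_size):
    """Return (part_number, offset, length) tuples covering total_size bytes."""
    parts = []
    offset = 0
    number = 1  # multipart part numbers conventionally start at 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


def upload_in_parallel(data, part_size, upload_part, workers=4):
    """Send every part concurrently; return receipts ordered by part number.

    upload_part is a hypothetical callable (part_number, chunk) -> receipt
    standing in for whatever SDK call actually transmits one part.
    """
    parts = plan_parts(len(data), part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(upload_part, number, data[offset:offset + length])
            for number, offset, length in parts
        ]
        # Futures are collected in submission order, so receipts stay sorted.
        return [future.result() for future in futures]


def fake_upload_part(part_number, chunk):
    # Stand-in for a network call: report which part and how many bytes.
    return (part_number, len(chunk))
```

Calling `upload_in_parallel(payload, part_size, fake_upload_part)` returns one receipt per part. In a real transfer the callable would wrap the SDK's part-upload request, and the receipts would feed the final "complete upload" step.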
  • AWS Ingest Options AWS Direct Connect •  Reduces costs for bandwidth-heavy workloads •  Private connectivity to AWS –  Physical connection – 1 Gbps or 10 Gbps port –  Logical connections (802.1Q VLANs) §  Public: To AWS cloud (Amazon EC2, Amazon S3, etc.) §  Private: To VPCs •  Consistent network performance •  Compatible with all AWS services
  • AWS Ingest Options AWS Direct Connect Cost •  1 Gbps port = $0.30/hour •  10 Gbps port = $2.25/hour •  Data transfer IN = $0 •  Data transfer OUT = $0.02 – 0.11 per GB depending upon location –  Can be a significant savings vs. Internet bandwidth out Locations •  CoreSite 32 Avenue of the Americas, NY •  CoreSite One Wilshire & 900 North Alameda, LA •  Equinix DC1 – DC6 & DC10 - DC11, Ashburn, VA •  Equinix SV1 & SV5, San Jose, CA •  Equinix SE2 & SE3, Seattle, WA •  Equinix SG2, Singapore •  Equinix SY3, Sydney •  Equinix TY2, Tokyo •  Eircom, Clonshaugh •  TelecityGroup Docklands, London •  Terremark NAP do Brasil, Sao Paulo
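As a rough worked example of the Direct Connect pricing on this slide, the sketch below adds port-hours to outbound transfer charges. The 730-hour month and the sample traffic volume are assumptions made for illustration, and the rates are the 2013 figures quoted on the slide.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def monthly_direct_connect_cost(port_rate_per_hour, gb_out, out_rate_per_gb):
    """Port-hours plus outbound transfer; inbound transfer is $0 per the slide."""
    port_cost = port_rate_per_hour * HOURS_PER_MONTH
    transfer_cost = gb_out * out_rate_per_gb
    return port_cost + transfer_cost


# Hypothetical workload: a 1 Gbps port moving 10,000 GB out at the $0.02 rate.
example = monthly_direct_connect_cost(0.30, 10_000, 0.02)
print(f"Estimated monthly cost: ${example:,.2f}")
```

Under these assumptions the port itself is a small fixed cost, so the outbound per-GB rate dominates once traffic volumes grow.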
  • AWS Ingest Options AWS Import/Export •  Rapidly move data into and out of AWS •  Portable storage device shipment to AWS –  eSATA –  USB 2.0 and 3.0 (including USB flash drives) –  2.5 and 3.5 inch internal SATA hard drives •  Supports –  Amazon Elastic Block Store (EBS) –  Amazon Simple Storage Service (S3) –  Amazon Glacier •  Use cases –  Initial content migration –  Content distribution via portable devices –  Disaster recovery
  • AWS Ingest Options AWS Import/Export •  Cost –  $80 per storage device handled –  $2.49 per data loading hour –  Standard pricing for •  Amazon S3 •  Amazon EBS •  Amazon Glacier
  • AWS Ingest Options AWS Storage Gateway •  On-premises, virtual iSCSI storage appliance •  Local cache enables low latency access to data –  Gateway – stored volumes –  Gateway – cached volumes •  Copies data in the form of Amazon EBS snapshots to Amazon S3 •  Leverage Amazon S3 server-side encryption •  Recent patch results in up to 5 TB of throughput per day •  Recover to Amazon EBS / Amazon EC2
  • AWS Ingest Options AWS Storage Gateway •  Cost (N. Virginia – varies per region) –  Gateway pricing •  $125 per activated gateway/mo. –  Volume pricing •  $0.095 per GB per month of data stored –  Snapshot pricing •  $0.095 per GB per month of data stored –  Tiered data transfer pricing model •  Free inbound •  $0.12 - $0.05 per GB outbound depending on tier
  • AWS Ingest Options Gateway-Virtual Tape Library (Gateway VTL) •  On-premises, virtual tape library storage appliance •  10 virtual tape drives / 1500 virtual tape slots •  150 TB local cache –  VTL – virtual tape library –  VTS – virtual tape shelf •  Restore in seconds from VTL •  24 hour retrieval from VTS •  Encryption in transit and at rest •  Gateway VTL-AMI •  In lab we achieved 55 MB/s upload throughput and 90 MB/s iSCSI ingest rate per gateway
  • AWS Ingest Options Gateway-Virtual Tape Library (Gateway VTL) •  Cost (N. Virginia – varies per region) –  Gateway pricing •  $125 per activated gateway/mo. –  Virtual tape shelf storage •  $0.01 per GB per month of data stored –  Virtual tape library storage •  $0.095 per GB per month of data stored –  Retrieval from virtual tape shelf •  $0.30 per GB –  Virtual tape deletes •  Free
  • AWS Storage and Archive Options •  Amazon Elastic Block Store (EBS): high-performance block storage device; 1 GB to 1 TB in size; mount as drives to instances with snapshot/cloning functionalities •  Amazon Simple Storage Service (S3): highly scalable object storage; 1 byte to 5 TB in size; 99.999999999% durability •  Amazon Glacier: long-term object archive; extremely low cost per gigabyte; 99.999999999% durability
  • AWS Storage Options Amazon Elastic Block Store (EBS) •  High I/O block storage for Amazon EC2 •  Predictably scale to 1000s of IOPS per Amazon EC2 instance •  Automatic replication within the Availability Zone •  10x more reliable than commodity disk drives •  Point-in-time snapshots •  Amazon S3 durability (11-9s) •  Point-in-time snapshots across regions •  Amazon CloudWatch •  Exposes Amazon EBS performance metrics
  • AWS Storage Options Amazon Elastic Block Store (EBS) Costs (US East) Amazon EBS standard volumes §  $0.10 per GB-month of provisioned storage §  $0.10 per 1 million I/O requests Amazon EBS provisioned IOPS volumes §  $0.125 per GB-month of provisioned storage §  $0.10 per provisioned IOPS-month Amazon EBS snapshots to Amazon S3 §  $0.095 per GB-month of data stored
  • AWS Storage Options Amazon Simple Storage Service (S3) •  Synchronous in and synchronous out object storage •  Designed for 99.999999999% durability •  Authentication mechanisms ensure data is kept secure •  Multiple encryption options –  Amazon server-side encryption •  Standard storage •  Reduced redundancy storage (RRS)
  • AWS Storage Options Amazon S3: Over 2 Trillion Total Objects
  • AWS Storage Options Amazon Simple Storage Service (S3) Costs (US East)
    Monthly storage tier    Standard Storage    Reduced Redundancy Storage
    First 1 TB              $0.095 per GB       $0.076 per GB
    Next 49 TB              $0.080 per GB       $0.064 per GB
    Next 450 TB             $0.070 per GB       $0.056 per GB
    Next 500 TB             $0.065 per GB       $0.052 per GB
    Next 4000 TB            $0.060 per GB       $0.048 per GB
    Over 5000 TB            $0.055 per GB       $0.037 per GB
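Tiered pricing like this is charged tier by tier, not at a single flat rate. The sketch below walks the tiers from the slide to estimate a monthly storage bill. It assumes 1 TB = 1024 GB and covers storage only (no request or transfer charges), and the function name is introduced here for illustration.

```python
# (tier size in TB, standard $/GB, RRS $/GB); None marks the open-ended top tier.
S3_TIERS = [
    (1, 0.095, 0.076),
    (49, 0.080, 0.064),
    (450, 0.070, 0.056),
    (500, 0.065, 0.052),
    (4000, 0.060, 0.048),
    (None, 0.055, 0.037),
]
GB_PER_TB = 1024  # assumption: binary terabytes


def monthly_storage_cost(stored_gb, reduced_redundancy=False):
    """Estimate the monthly storage charge by filling each tier in order."""
    remaining = stored_gb
    total = 0.0
    for tier_tb, standard_rate, rrs_rate in S3_TIERS:
        rate = rrs_rate if reduced_redundancy else standard_rate
        # The last tier absorbs whatever is left; others are capped by size.
        tier_gb = remaining if tier_tb is None else min(remaining, tier_tb * GB_PER_TB)
        total += tier_gb * rate
        remaining -= tier_gb
        if remaining <= 0:
            break
    return total
```

For instance, 2 TB of standard storage is billed as the first 1 TB at $0.095 per GB plus the next 1 TB at $0.080 per GB.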
  • AWS Archive Options Amazon Glacier •  $0.01 per GB per month •  Retrievals: –  5% of monthly average storage (pro-rated daily) free •  Synchronous in •  3–5 hour asynchronous retrieval •  Designed for 99.999999999% durability •  AES-256 encryption at rest •  Highly scalable •  Reliable •  Authentication mechanisms ensure data is kept secure
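To make the "5% of monthly average storage (pro-rated daily)" retrieval allowance concrete, here is a small arithmetic sketch. It reads "pro-rated daily" as the monthly allowance divided evenly across the days of the month. That reading and the 30-day month are assumptions; actual billing rules should be checked against the pricing page.

```python
def free_retrieval_allowance_gb(avg_stored_gb, days_in_month=30):
    """Return (monthly, daily) free retrieval allowance in GB.

    Assumes the 5% monthly allowance is spread evenly over the month.
    """
    monthly = 0.05 * avg_stored_gb
    return monthly, monthly / days_in_month


# Hypothetical archive of 300,000 GB (roughly 300 TB).
monthly_gb, daily_gb = free_retrieval_allowance_gb(300_000)
print(f"Free per month: {monthly_gb:,.0f} GB, per day: {daily_gb:,.0f} GB")
```

Under these assumptions an archive of about 300 TB could retrieve roughly 15 TB per month, or about 0.5 TB per day, before retrieval fees apply.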
  • AWS Archive Options Object Lifecycle Management: Amazon S3 → Amazon Glacier •  Seamlessly move data from Amazon S3 → Amazon Glacier •  3–5 hour asynchronous retrieval •  Data lifecycle policies •  Amazon Glacier storage costs $0.01 per GB per month
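A lifecycle policy of this kind is expressed as a rule that transitions objects to the Glacier storage class after a set number of days. The sketch below builds such a rule as a plain dictionary in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. The prefix, the 30-day threshold, the rule ID, and the bucket name are hypothetical values chosen for illustration.

```python
def glacier_transition_rule(prefix, days, rule_id="archive-to-glacier"):
    """Build a lifecycle configuration with a single transition rule."""
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},  # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


# Hypothetical policy: archive everything under "masters/" after 30 days.
config = glacier_transition_rule("masters/", 30)

# With boto3 installed and credentials configured, this could be applied via:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```

Keeping the rule as data makes it easy to review or version before it is applied to a bucket.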
  • Partner Ingest Options •  Aspera: up to 1 Gb/s per instance to AWS •  CloudBeam: SaaS-based file transfer into and out of AWS •  Signiant: high-speed, network-efficient file transfer – up to 200X faster than FTP with 95+% network efficiency
  • Partner Ingest Options Aspera On-Demand •  Achieve file transfer speeds that are 1000s of times faster than FTP •  In, out, and across the cloud with enterprise-grade security •  End-to-end security •  Speeds of up to 1 Gbps per AWS instance •  10 TB per 24 hours •  Scale-out architecture •  Web, mobile, embedded clients
  • Partner Ingest Options Attunity CloudBeam •  Simplifies, automates, and accelerates the loading and replication of files from on-premises, heterogeneous sources to and from Amazon S3 •  Common Use Cases: –  Content availability and distribution –  Data analysis (Amazon EMR Hadoop) –  Backup, disaster recovery, and archiving –  Region-to-region replication
  • Partner Ingest Options Signiant Media Shuttle •  AWS-based fast file transfer as a service •  200X faster than FTP •  Separates control layer from the data layer •  Multiple sources and targets including Amazon S3 •  Firewall-friendly transfers with auto-selecting UDP, TCP, and HTTP transport options (diagram labels: NAS (CIFS, NFS); DAS / SAN)
  • Partner Ingest Options Cycle Computing DataManager •  Move data from any NAS / file system to Amazon S3 and/or Amazon Glacier •  Clean up expensive, on-premises disk •  Maintain full access to all content •  Reduce or eliminate future data migrations upon hardware refresh
  • Partner Storage and Archive Solutions •  Avere Systems: record performance, scale-out, single file system NAS •  Panzura: cloud-integrated local NAS capabilities for the globally distributed enterprise
  • Partner Storage and Archive Solutions – Cloud Storage Gateway Solutions
  • Partner Storage Options Example: Cloud Storage Gateway – Global Namespace NAS
  • Partner Storage Options Avere Systems – Comparing 1,000,000 IOPS Solutions •  Add high-performance, scale-out clustering with any NAS •  Automated tiering •  Separates performance scaling from capacity •  Avere offers the leading $ per IOPS for NAS –  $2.3/IOPS •  80% less total equipment than traditional NAS systems •  Fastest scale-out, single file system (NAS) available •  Linear scaling to millions of operations/sec •  Tens of GB/sec of throughput (chart labels: Avere $2.3 / IOPS; 150 ms cloud latency)
  • Partner Storage Options Avere Systems •  Amazon S3 integration by EOY 2013 –  3-step process: 1.  Leverage Avere to accelerate current NAS System 2.  Nondisruptive migration to Amazon S3 / Amazon Glacier –  FlashMove 3.  Switch mode in Avere to enable primary NAS operations –  Retire older NAS gear (diagram labels: Client Workstations; Compute Farm; Avere FXT Edge Filer – purpose-built for cloud, enterprise-class scaling, lowest TCO; Core Filers; Legacy NAS, shown as complex w/ RAID, volume limits, low utilization, mirror schedules, etc.; WAN; Amazon S3; Amazon Glacier; On Premise; AWS)
  • Partner Storage Options Panzura •  Panzura enables: •  Global sharing – On-premises, hybrid, across AWS regions •  Panzura Amazon Machine Image (AMI) •  Small physical footprint •  Separation of data and metadata •  Data protection •  NAS centralization •  Shift ratio of Opex vs. Capex
  • TCO: On-Premises Cost Considerations 1.  Primary storage hardware (primary / remote site) 2.  DR / Remote site storage hardware 3.  Raw to utilized storage (both primary and DR) 4.  Storage growth (cost of upgrades) 5.  Storage management software and 3rd party tools 6.  Professional services 7.  Hardware maintenance 8.  Software maintenance 9.  Backup software 10. Backup hardware (primary / remote site) 11. Offsite tape storage / vault 12. Archive software 13. Archive hardware 14. Power 15. Cooling 16. Space 17. Labor 18. Cost of capital 19. Training 20. Asset depreciation 21. Migration 22. Decommission / remove 23. Recycle
  • Summary AWS ingest, storage, and archive solutions: •  AWS Import/Export + Amazon S3, Amazon EBS, Amazon Glacier •  AWS Storage Gateway + Amazon S3 •  AWS VTL + Amazon S3 + Amazon Glacier Partner-based ingest solutions: •  Aspera on-demand solution + Amazon S3 •  Attunity + Amazon S3 •  Signiant Media Shuttle + Amazon S3 •  Cycle Computing’s DataManager + Amazon S3 + Amazon Glacier Partner-based storage / archive solutions: •  Avere Systems + Amazon S3 and Amazon Glacier •  Panzura + Amazon S3 and Amazon Glacier
  • Thank you! John Downey jdowney@amazon.com 646.276.1635
  • Glacier at iN DEMAND Michael Raposa, iN DEMAND mraposa@indemand.com
  • iN DEMAND Intro •  World leader in providing transactional entertainment delivered through television •  Joint Venture – owned by Comcast, Time Warner, & Cox •  Pay-per-view programming – MLB, NBA, NHL, boxing, MMA, & Howard Stern
  • The Problem •  1.5 PB video archive –  World War Z –  Ice Pirates –  Titanic II … Tsunami AND an Iceberg •  Tape storage –  Tape corruption and bit rot –  Lost tapes –  Physical storage – 1.5 PB is a lot of tapes –  Legacy tape formats – LTO-1, 2, 3, 4, 5, etc.
  • The Problem (cont.) •  Manual asset tracking –  Typical backup system stores file name, date, size –  Important metadata is tracked separately, e.g., bit rate, aspect ratio, closed captioning, dual language, codec –  Inventory issues – What bit rates do we have for Spider Man? –  Multiple storage – “Put it on tape just to be sure” •  Manual archive and restore –  Wait for operator to handle restores – not 24x7
  • The Problem (cont.) •  Expensive –  Tape operator –  Tape storage –  Yearly tape library maintenance •  Limited scale –  Limited by tape library capacity –  Limited by physical space
  • The Solution – Mini-DAM •  Limited digital asset management system for Glacier –  Web UI –  Glacier storage –  $50 K –  Hosted at AWS – EC2, Amazon RDS, Amazon SNS, Amazon SES –  Over 300 TB in Glacier to date –  Adding about 2 TB / day
  • Tips & Tricks •  Concurrent downloader required –  Users want files FAST!!! –  The .NET and Java AWS SDKs have only a single-threaded downloader – max download ca. 160 Mbps –  iN DEMAND wrote a multithreaded downloader –  Added to AWS SDK for Python (boto) – max download 600 Mbps •  Per-archive Glacier overhead –  Every Glacier archive has a 32 KB overhead for metadata –  You are charged for this overhead –  For small files that 32 KB starts to add up –  Zip up small files before uploading
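The concurrent downloader idea can be sketched as fetching byte ranges on a thread pool and stitching them back together in order. This is a generic illustration, not the downloader iN DEMAND contributed to boto. The `fetch_range` callable is a hypothetical wrapper around whatever ranged-read call the SDK in use provides, and the chunk size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(total_size, chunk_size):
    """Return inclusive (start, end) byte ranges covering total_size bytes."""
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


def download_concurrently(total_size, chunk_size, fetch_range, workers=8):
    """Fetch every range on a thread pool and reassemble the chunks in order.

    fetch_range is a hypothetical callable (start, end) -> bytes.
    """
    ranges = byte_ranges(total_size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so chunks line up by offset.
        chunks = pool.map(lambda r: fetch_range(*r), ranges)
        return b"".join(chunks)
```

Because the work is network-bound, threads are usually enough to keep several range requests in flight at once. For very large archives each chunk would more likely be written to disk at its offset than joined in memory.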
  • Tips & Tricks •  Download request timeouts –  24 hours to download archive –  Queue up requests to ensure that files are downloaded within the 24-hour timeout •  Add the extra encryption to make management happy –  The MPAA loves encryption –  Management loves encryption –  AWS automatically encrypts files at rest in Glacier
  • Tips & Tricks •  Checksum files before you upload –  Save MD5 checksum –  Check that file hasn’t already been uploaded to Glacier –  Avoid file duplication •  Track who requests downloads to manage costs –  Fee associated with each download –  Keep employees honest
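The checksum-before-upload tip can be sketched with Python's standard `hashlib`. The file is streamed through MD5 in blocks and the digest is compared with checksums already recorded for uploaded archives. Holding the known checksums in an in-memory set is an assumption made for illustration; a real asset system would look them up in its database.

```python
import hashlib


def md5_of_file(path, block_size=1024 * 1024):
    """Stream the file through MD5 so large video files never sit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def should_upload(path, known_checksums):
    """Return (upload_needed, checksum); known_checksums is a set of hex digests."""
    checksum = md5_of_file(path)
    return checksum not in known_checksums, checksum
```

Storing the returned checksum alongside the archive ID after a successful upload lets later runs skip duplicates and verify files after they are restored.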