Scalable Media Processing in the Cloud (MED302) | AWS re:Invent 2013

The cloud empowers you to process media at scale in ways that were previously not possible, enabling you to make business decisions that are no longer constrained by infrastructure availability. Hear about best practices to architect scalable, highly available, high-performance workflows for digital media processing. In addition, this session covers AWS and partner solutions for transcoding, content encryption (watermarking and DRM), QC, and other processing topics.

    Presentation Transcript

    • Scalable Media Processing
      Phil Cluff, British Broadcasting Corporation
      David Sayed, Amazon Web Services
      November 13, 2013
      © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
    • Agenda
      – Media workflows
      – Where AWS fits
      – Cloud media processing approaches
      – BBC iPlayer in the cloud
    • Media Workflows (diagram: source materials – 2D movie, 3D movie, archive materials, stills, featurettes, networks, interviews – feed media workflows that deliver to theatrical, DVD/BD, online, MSOs, and mobile apps)
    • Where AWS Fits Into Media Processing (diagram: AWS underpins the chain of ingest, index, process, QC, package, protect, auth., track, and playback, alongside media asset management and analytics and monetization)
    • Media Processing Approaches: 3 Phases
    • Cloud Media Processing Approaches
      – Phase 1: Lift processing from the premises and shift to the cloud
    • Lift and Shift (diagram: on-premises stacks of media processing operation, OS, and storage are moved as-is onto EC2 instances with attached storage)
    • The Problem with Lift and Shift (diagram: the monolithic media processing operation on EC2 still bundles ingest, the processing operation itself, post-processing, export, workflow, and parameters in one unit)
    • Cloud Media Processing Approaches: Phase 2
      – Phase 1: Lift processing from the premises and shift to the cloud
      – Phase 2: Refactor and optimize to leverage cloud resources
    • Refactoring and Optimization Opportunities
      "Deconstruct monolithic media processing operations"
      – Ingest
      – Atomic media processing operation
      – Post-processing
      – Export
      – Workflow
      – Parameters
    • Refactoring and Optimization Example (diagram: Amazon SWF coordinates EC2 instances with EBS volumes via API calls; they read from a source S3 bucket and write to an output S3 bucket)
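The decomposed pattern in this example can be approximated as a plain activity worker: Amazon SWF hands out tasks, and a small EC2 worker performs the atomic processing step against S3. The sketch below uses the AWS SDK for Java as it looked around 2013; the domain, task list, and transcode() helper are hypothetical placeholders, not part of the original presentation.

```java
import com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflow;
import com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient;
import com.amazonaws.services.simpleworkflow.model.ActivityTask;
import com.amazonaws.services.simpleworkflow.model.PollForActivityTaskRequest;
import com.amazonaws.services.simpleworkflow.model.RespondActivityTaskCompletedRequest;
import com.amazonaws.services.simpleworkflow.model.TaskList;

public class TranscodeActivityWorker {

    private final AmazonSimpleWorkflow swf = new AmazonSimpleWorkflowClient();

    public void pollOnce() {
        // Long-poll SWF for the next transcode activity task.
        ActivityTask task = swf.pollForActivityTask(new PollForActivityTaskRequest()
                .withDomain("media-processing")                        // hypothetical domain
                .withTaskList(new TaskList().withName("transcode")));  // hypothetical task list

        if (task.getTaskToken() == null || task.getTaskToken().isEmpty()) {
            return; // poll timed out with no work
        }

        // task.getInput() would carry the source S3 key and parameters;
        // the atomic processing step reads from the source bucket and
        // writes its result to the output bucket.
        String outputKey = transcode(task.getInput());

        swf.respondActivityTaskCompleted(new RespondActivityTaskCompletedRequest()
                .withTaskToken(task.getTaskToken())
                .withResult(outputKey));
    }

    private String transcode(String input) {
        // Placeholder for the atomic media processing operation.
        return "output/" + input;
    }
}
```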
    • Cloud Media Processing Approaches
      – Phase 1: Lift processing from the premises and shift to the cloud
      – Phase 2: Refactor and optimize to leverage cloud resources
      – Phase 3: Decomposed, modular cloud-native architecture
    • Decomposition and Modularization Ideas for Media Processing
      – Decouple *everything* that is not part of the atomic media processing operation
      – Use managed services where possible for workflow, queues, databases, etc.
      – Manage
        • Capacity
        • Redundancy
        • Latency
        • Security
    • BBC iPlayer in the Cloud, AKA "Video Factory"
      Phil Cluff, Principal Software Engineer & Team Lead, BBC Media Services
    • BBC iPlayer
      – The UK's biggest video & audio on-demand service – and it's free!
      – Over 7 million requests every day (~2% of overall consumption of BBC output)
      – Over 500 unique hours of content every week, available immediately after broadcast, for at least 7 days
      – Available on over 1,000 devices including PC, iOS, Android, Windows Phone, Smart TVs, Cable Boxes…
      – Both streaming and download (iOS, Android, PC)
      – 20 million app downloads to date
      Sources: BBC iPlayer Performance Pack August 2013, http://www.bbc.co.uk/blogs/internet/posts/Video-Factory
    • Video: “Where Next?”
    • What Is Video Factory?
      – Complete in-house rebuild of ingest, transcode, and delivery workflows for BBC iPlayer
      – Scalable, message-driven cloud-based architecture
      – The result of 1 year of development by ~18 engineers
    • And here they are!
    • Why Did We Build Video Factory?
      – Old system
        • Monolithic
        • Slow
        • Couldn’t cope with spikes
        • Mixed ownership with third party
      – Video Factory
        • Highly scalable, reliable
        • Completely elastic transcode resource
        • Complete ownership
    • Why Use the Cloud?
      – Background of 6 channels, spikes up to 24 channels, 6 days a week
      – A perfect pattern for an elastic architecture
      (chart: off-air transcode requests for 1 week)
    • Video Factory – Architecture
      – Entirely message driven
        • Amazon Simple Queue Service (SQS), with some Amazon Simple Notification Service (SNS)
        • We use lots of classic message patterns
      – ~20 small components
        • Singular responsibility – “Do one thing, and do it well”
        • Share libraries if components do things that are alike – control bloat
        • Components have contracts of behavior – easy to test
    • Video Factory – Workflow (diagram: 24 SDI broadcast video feeds with SMPTE timecode pass through broadcast encoders and an RTP chunker into mezzanine video capture, landing in an Amazon S3 mezzanine / time-addressable media store; live ingest logic, driven by a playout data feed, sends work to the transcode abstraction layer over Amazon Elastic Transcoder and Elemental Cloud, alongside DRM, QC, editorial clipping, and MAM, producing distribution renditions in Amazon S3)
    • Detail
      – Mezzanine video capture
      – Transcode abstraction
      – Eventing demonstration
    • Mezzanine Video Capture
    • Mezzanine Capture (diagram: 24 SDI broadcast video feeds (3 GB HD / 1 GB SD) with SMPTE timecode go through broadcast-grade encoders producing MPEG-2 transport streams (H.264) on RTP multicast; an RTP chunker cuts these into chunks (30 MB HD / 10 MB SD), a chunk uploader puts the chunks into Amazon S3, and a chunk concatenator, driven by control messages, assembles them into the Amazon S3 mezzanine)
    • Concatenating Chunks
      – Build the file using Amazon S3 multipart requests
        • A 10 GB mezzanine file is constructed in under 10 seconds
      – Amazon S3 multipart APIs are very helpful
        • The component only makes REST API calls, so small instances still give very high performance
      – Be careful – Amazon S3 isn’t immediately consistent when dealing with multipart-built files
        • Mitigated with rollback logic in message-based applications
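A minimal sketch of this concatenation approach with the Amazon S3 multipart copy APIs (AWS SDK for Java, circa 2013). Bucket names and keys are placeholders, and the real Video Factory component wraps this in message handling and rollback logic.

```java
import java.util.ArrayList;
import java.util.List;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest;
import com.amazonaws.services.s3.model.CopyPartRequest;
import com.amazonaws.services.s3.model.CopyPartResult;
import com.amazonaws.services.s3.model.InitiateMultipartUploadRequest;
import com.amazonaws.services.s3.model.PartETag;

public class ChunkConcatenator {

    private final AmazonS3 s3 = new AmazonS3Client();

    // Stitch already-uploaded chunk objects into one mezzanine file without
    // downloading any bytes: every part is a server-side copy of a chunk.
    public void concatenate(String chunkBucket, List<String> chunkKeys,
                            String destBucket, String destKey) {
        String uploadId = s3.initiateMultipartUpload(
                new InitiateMultipartUploadRequest(destBucket, destKey)).getUploadId();

        List<PartETag> parts = new ArrayList<PartETag>();
        int partNumber = 1;
        for (String chunkKey : chunkKeys) {
            // Chunks are ~30 MB HD / ~10 MB SD, comfortably above the 5 MB
            // minimum part size that applies to all but the last part.
            CopyPartResult copied = s3.copyPart(new CopyPartRequest()
                    .withSourceBucketName(chunkBucket).withSourceKey(chunkKey)
                    .withDestinationBucketName(destBucket).withDestinationKey(destKey)
                    .withUploadId(uploadId)
                    .withPartNumber(partNumber++));
            parts.add(new PartETag(copied.getPartNumber(), copied.getETag()));
        }

        s3.completeMultipartUpload(
                new CompleteMultipartUploadRequest(destBucket, destKey, uploadId, parts));
    }
}
```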
    • By Numbers – Mezzanine Capture
      – 24 channels (6 HD, 18 SD)
        • 16 TB of mezzanine data every day, per capture
      – 200,000 chunks every day
        • And Amazon S3 has never lost one
        • That’s ~2 (UK) billion RTP packets every day… per capture
      – Broadcast-grade resiliency
        • Several data centers / 2 copies each
    • Transcode Abstraction
    • Transcode Abstraction
      – Abstract away from a single supplier
        • Avoid vendor lock-in
        • Choose suppliers based on performance, quality, and broadcaster-friendly feature sets
        • BBC: Elemental Cloud (GPU), Amazon Elastic Transcoder, in-house for subtitles
      – Smart routing & smart bundling
        • Save money on non–time-critical transcode
        • Save time & money by bundling together “like” outputs
      – Hybrid cloud friendly
        • Route a baseline of transcode to local encoders, and spike to the cloud
      – Who has the next game changer?
    • Transcode Abstraction (diagram: transcode requests arrive on SQS at a transcode router, which dispatches over SQS to a subtitle extraction backend, an Amazon Elastic Transcoder backend, and an Elemental backend (REST to Elemental Cloud); the backends read the Amazon S3 mezzanine and write distribution renditions to Amazon S3)
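The transcode router in this diagram can be approximated with plain SQS calls: read a request, decide which backend should handle it, forward it to that backend's queue, and delete the original. This is a hypothetical sketch – the queue URLs and the chooseBackendQueue() routing rule are invented for illustration, not the BBC's actual code.

```java
import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClient;
import com.amazonaws.services.sqs.model.DeleteMessageRequest;
import com.amazonaws.services.sqs.model.Message;
import com.amazonaws.services.sqs.model.ReceiveMessageRequest;
import com.amazonaws.services.sqs.model.SendMessageRequest;

public class TranscodeRouter {

    private final AmazonSQS sqs = new AmazonSQSClient();
    private final String requestQueueUrl;           // transcode request queue (placeholder)
    private final String elasticTranscoderQueueUrl; // backend queues (placeholders)
    private final String elementalQueueUrl;

    public TranscodeRouter(String requestQueueUrl,
                           String elasticTranscoderQueueUrl,
                           String elementalQueueUrl) {
        this.requestQueueUrl = requestQueueUrl;
        this.elasticTranscoderQueueUrl = elasticTranscoderQueueUrl;
        this.elementalQueueUrl = elementalQueueUrl;
    }

    public void routeOnce() {
        ReceiveMessageRequest receive = new ReceiveMessageRequest(requestQueueUrl)
                .withMaxNumberOfMessages(10)
                .withWaitTimeSeconds(20); // long polling

        for (Message message : sqs.receiveMessage(receive).getMessages()) {
            // Forward the request to whichever backend should handle it,
            // then remove it from the request queue.
            sqs.sendMessage(new SendMessageRequest(
                    chooseBackendQueue(message.getBody()), message.getBody()));
            sqs.deleteMessage(new DeleteMessageRequest(
                    requestQueueUrl, message.getReceiptHandle()));
        }
    }

    private String chooseBackendQueue(String requestBody) {
        // Invented routing rule: time-critical or GPU-heavy jobs to Elemental,
        // everything else to Elastic Transcoder.
        return requestBody.contains("priority=high") ? elementalQueueUrl
                                                     : elasticTranscoderQueueUrl;
    }
}
```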
    • Transcode Abstraction – Future (diagram: the same routing architecture with an additional, unknown future backend X alongside the subtitle extraction, Amazon Elastic Transcoder, and Elemental backends)
    • Example – A Simple Elastic Transcoder Backend (diagram: within an SQS message transaction, the backend gets the XML transcode request message from the queue, unmarshals and validates it, initializes the transcode with a POST to Amazon Elastic Transcoder, then waits for the SNS callback over HTTP that carries the XML transcode status message)
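The "initialize transcode" step amounts to creating a job on an Elastic Transcoder pipeline; status messages later arrive via the SNS notification topic configured on that pipeline. A minimal sketch using the AWS SDK for Java rather than raw REST calls – the pipeline ID, object keys, and preset ID are placeholders:

```java
import com.amazonaws.services.elastictranscoder.AmazonElasticTranscoder;
import com.amazonaws.services.elastictranscoder.AmazonElasticTranscoderClient;
import com.amazonaws.services.elastictranscoder.model.CreateJobOutput;
import com.amazonaws.services.elastictranscoder.model.CreateJobRequest;
import com.amazonaws.services.elastictranscoder.model.CreateJobResult;
import com.amazonaws.services.elastictranscoder.model.JobInput;

public class ElasticTranscoderBackend {

    private final AmazonElasticTranscoder transcoder = new AmazonElasticTranscoderClient();

    // "Initialize transcode": create a job on an existing pipeline. Progress and
    // completion notifications arrive later on the pipeline's SNS topic, which
    // the backend listens to over HTTP.
    public String initializeTranscode(String pipelineId, String mezzanineKey,
                                      String outputKey, String presetId) {
        CreateJobResult result = transcoder.createJob(new CreateJobRequest()
                .withPipelineId(pipelineId)              // placeholder pipeline
                .withInput(new JobInput().withKey(mezzanineKey))
                .withOutput(new CreateJobOutput()
                        .withKey(outputKey)
                        .withPresetId(presetId)));       // system or custom preset

        return result.getJob().getId();                  // correlate with the SNS callback
    }
}
```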
    • Example – Add Error Handling (diagram: the same flow with a dead letter queue off message consumption, a bad message queue off unmarshalling and validation, and a fail queue off the transcode and callback steps)
    • Example – Add Monitoring Eventing (diagram: the same flow again, now emitting monitoring events at each step – message consumption, unmarshal/validate, transcode initialization, and the SNS callback wait – alongside the dead letter, bad message, and fail queues)
    • BBC eventing framework
      – Key-value pairs pushed into Splunk
        • Business-level events, e.g. message consumed, transcode started
        • System-level events, e.g. HTTP call returned status 404, application’s heap size, unhandled exception
      – Fixed model for “context” data
        • Identifiable workflows, grouping of events; transactions
        • Saves us a LOT of time diagnosing failures
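The talk does not show the eventing framework's internals; one plausible shape, assuming events are emitted as key=value log lines that Splunk indexes and that a workflow ID provides the "context" grouping, is sketched below. The field names and event names are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class EventLogger {

    private static final Logger EVENTS = LoggerFactory.getLogger("events");

    // Emit one event as a single key=value line; a Splunk forwarder tailing the
    // log indexes the pairs, and workflowId ties related events together.
    public static void event(String name, String workflowId, Map<String, String> fields) {
        StringBuilder line = new StringBuilder()
                .append("event=").append(name)
                .append(" workflowId=").append(workflowId);
        for (Map.Entry<String, String> field : fields.entrySet()) {
            line.append(' ').append(field.getKey()).append('=').append(field.getValue());
        }
        EVENTS.info(line.toString());
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        fields.put("backend", "elastic-transcoder");
        fields.put("presetId", "example-preset");
        event("transcode.started", "workflow-1234", fields);
    }
}
```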
    • Component Development – General Development & Architecture
      – Java applications
        • Run inside Apache Tomcat on m1.small EC2 instances
        • Run at least 3 of everything
        • Autoscale on queue depth
      – Built on top of the Apache Camel framework
        • A platform for building message-driven applications
        • Reliable, well-tested SQS backend
        • Camel route builders – a Java DSL full of messaging patterns
      – Developed with Behavior-Driven Development (BDD) & Test-Driven Development (TDD)
        • Cucumber
      – Deployed continuously
        • Many times a day, 5 days a week
    • Error Handling Messaging Patterns
      – We use several message patterns
        • Bad message queue
        • Dead letter queue
        • Fail queue
      – Key concept: never lose a message
        • A message is either in-flight, done, or in an error queue somewhere
      – All require human intervention for the workflow to continue
        • Not necessarily a bad thing
    • Message Patterns – Bad Message Queue
      The message doesn’t unmarshal to the object it should, OR we could unmarshal the object, but it doesn’t meet our validation rules
      – Wrapped in a message wrapper which contains context
      – Never retried
      – Very rare in production systems
      – Implemented as an exception handler on the route builder
    • Message Patterns – Dead Letter Queue
      We tried processing the message a number of times, and something we weren’t expecting went wrong each time
      – Message is an exact copy of the input message
      – Retried several times before being put on the DLQ
      – Can be common, even in production systems
      – Implemented as a bean in the route builder for SQS
    • Message Patterns – Fail Queue
      Something I knew could go wrong went wrong
      – Wrapped in a message wrapper that contains context
      – Requires some level of knowledge of the system to be retried
      – Often evolves from understanding the causes of DLQ’d messages
      – Implemented as an exception handler on the route builder
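Putting the three patterns together in a Camel route builder might look roughly like the sketch below. The exception types, bean names, and queue names are hypothetical, and this sketch uses Camel's deadLetterChannel error handler for the DLQ rather than the SQS-bean approach the slide mentions; the bad message and fail queues hang off exception handlers, as described above.

```java
import org.apache.camel.builder.RouteBuilder;

public class TranscodeRequestRoute extends RouteBuilder {

    // Hypothetical exception types; names are not from the talk.
    public static class InvalidMessageException extends RuntimeException {}
    public static class TranscodeFailedException extends RuntimeException {}

    @Override
    public void configure() throws Exception {
        // Dead letter queue: unexpected failures are retried a few times,
        // then parked for human intervention.
        errorHandler(deadLetterChannel("aws-sqs://transcode-dead-letter")
                .maximumRedeliveries(3));

        // Bad message queue: the message won't unmarshal or fails validation; never retried.
        onException(InvalidMessageException.class)
                .handled(true)
                .to("aws-sqs://transcode-bad-message");

        // Fail queue: a known failure mode; retried later by someone who
        // understands the system.
        onException(TranscodeFailedException.class)
                .handled(true)
                .to("aws-sqs://transcode-fail");

        // Queue URIs abbreviated; a real route also configures the SQS client
        // and credentials. The referenced beans are placeholders registered
        // elsewhere: one unmarshals/validates, one initializes the transcode.
        from("aws-sqs://transcode-request")
                .bean("transcodeRequestValidator")   // may throw InvalidMessageException
                .bean("elasticTranscoderBackend");   // may throw TranscodeFailedException
    }
}
```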
    • Demonstration – Eventing Framework
    • Questions?
      philip.cluff@bbc.co.uk – @GeneticGenesis
      dsayed@amazon.com – @dsayed
    • Please give us your feedback on this presentation (MED302). As a thank you, we will select prize winners daily for completed surveys!