Glitch-Free A/V Encoding (CocoaConf Atlanta, November 2013)

The iPhone is the best iPod Apple's ever made, and the iPad has replaced the TV for many users. And while developers can use documentation and books to master the media frameworks (AV Foundation, Core Audio, and the rest), there's nothing in Xcode that will keep your audio from dropping out, fix artifacting on video with a lot of motion, or properly balance performance on the most capable new Retina devices with backward compatibility for older ones. This session offers a ground-level intro to what's actually in your iTunes songs and streaming videos, and how best to encode them for the realities of iOS devices, their storage capacities, and the networks they live on. We'll shoot, compress, and stream, all from a MacBook Air, and take a close look at, and listen to, the results.

Usage Rights

© All Rights Reserved

    Presentation Transcript

    • Glitch-Free A/V Encoding Chris Adamson • @invalidname CocoaConf Atlanta • November 2013
    • Glitches • Bitrate too high for network • Bitrate too low for contents • Keyframe interval too low / encoder error
    • More Glitches • Audio and Video out of sync • Media doesn’t play at all • …or plays on some devices but not others • Media consumes too much of a resource: battery, filesystem storage, etc.
    • Beating the Glitches (what we’ll learn today) • How digital media works: tradeoffs • Codecs, compression, and containers • Different approaches for different needs • iOS / Mac encoding APIs
    • Digital Media
    • A/V Encoding • Representing time-based media digitally • “Show this image at this time” • “Codec” – from “coder / decoder”
    • Analog media • Telephone – air pressure against mic / from speaker reproduced as line voltage • Radio – amplitude of sound wave modulated atop carrier signal • Film – series of distinct images presented for fraction of a second each
    • Simple Digital Media • Captions / Subtitles – Series of samples that indicate text / color / location and timing • PCM audio – audio wave form reproduced as numeric samples • M-JPEG – Series of JPEG frames with timing information
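To put a number on the PCM bullet above: because PCM stores every sample as a plain number, its data rate follows directly from sample rate, bit depth, and channel count. The sketch below is plain C (which also compiles as Objective-C). The function name `pcm_bytes_per_second` is a hypothetical helper, not part of Core Audio, and the CD-style figures in the usage note are illustrative assumptions rather than values from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second of uncompressed linear PCM audio.
   Hypothetical helper for illustration; not an Apple API. */
uint64_t pcm_bytes_per_second(uint32_t sample_rate_hz,
                              uint32_t bits_per_sample,
                              uint32_t channels)
{
    /* This sketch only handles sample sizes that are whole bytes. */
    assert(bits_per_sample % 8 == 0);
    return (uint64_t)sample_rate_hz * (bits_per_sample / 8) * channels;
}
```

For example, `pcm_bytes_per_second(44100, 16, 2)` gives 176,400 bytes per second for 16-bit stereo at 44.1 kHz, which is the kind of figure compressed codecs are trying to reduce.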
    • Compression • Advanced codecs can reduce bandwidth by • Eliminating redundant information within groups of samples • Eliminating data that won’t be missed by human eyes/ears • “Lossless” codecs reproduce their source media perfectly; “lossy” codecs don’t
    • Tradeoffs
    • Choose at most 4 (2, more likely) • CPU, storage/bandwidth, cost, time, quality
    • Blu-Ray Author • Cares most about image quality, and fitting into a specific size range ("bitrate budgeting") • Does not care about render time, CPU requirements, or expense • The author's pay may itself be an expense for someone else
    • Streaming Site • Server-side transcoder cares most about bitrates that work for clients • Cares somewhat about time for uploaded files; critically important for livestreams • Cost/CPU/storage/bandwidth may be issues as the site scales, but they're the kinds of problems you want to have
    • Video Editor / Effect Artist • Cares most about image quality (don't want to degrade with each edit or effect), and CPU (heavily-compressed video is slow to scrub through, composite, etc.) • Does not care about storage/bandwidth or cost
    • FaceTime Users • Care most about encoding time (must be in real time) and cost (end users expect services to be free) • Care about CPU only to the degree it works at all on their device • Don't care about image quality; it's expected to scale with available bandwidth
    • So what's the best video codec?
    • One Size Doesn't Fit All • Editors and artists need uncompressed files, or "mezzanine" codecs that have high quality with light (preferably lossless) compression • Blu-Ray authors will take uncompressed or mezzanine files and crunch them down to a highly efficient delivery codec • FaceTime users need something that can be compressed in real time on consumer devices
    • iOS / Mac video codecs • For capture / editing / effects: Uncompressed or ProRes • For end-user distribution: H.264 • On iOS, pretty much H.264 for everything
    • Codec Frame Types • Intra (I) frame – all image data is included in the frame • Predicted (P) frame – some image data depends on one or more earlier frames • Bi-directionally predicted (B) frame – some image data depends on one or more earlier and/or later frames
    • I/P/B Frames I frame P frame B frame I frame From http://en.wikipedia.org/wiki/Video_compression_picture_types
    • Codec Configurations • Bitrate: how much data is consumed per second • More bits = higher image quality • Keyframe interval: can force an I-frame at a specific interval to "clean up" the image • Image size, frame rate, etc.
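The two main knobs on the slide above, bitrate and keyframe interval, both reduce to simple arithmetic. The following is a minimal sketch in plain C, assuming an average (not peak) bitrate, ignoring container overhead, and assuming the encoder forces an I-frame exactly every `keyframe_interval` frames starting at frame 0. Both function names are hypothetical helpers for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate encoded payload size in bytes for a stream with the given
   average bitrate (bits per second) and duration (seconds).
   Container overhead is ignored. Hypothetical helper. */
uint64_t encoded_size_bytes(uint32_t bitrate_bps, uint32_t duration_sec)
{
    return (uint64_t)bitrate_bps * duration_sec / 8;
}

/* Number of forced I-frames in a clip of total_frames frames when an
   I-frame is forced at frames 0, interval, 2*interval, and so on.
   Assumes the encoder inserts no additional I-frames on its own. */
uint32_t forced_keyframe_count(uint32_t total_frames,
                               uint32_t keyframe_interval)
{
    assert(keyframe_interval > 0);
    return (total_frames + keyframe_interval - 1) / keyframe_interval;
}
```

Two minutes at an average of 200,000 bits per second comes to about 3,000,000 bytes, and a 300-frame clip with a keyframe interval of 30 contains 10 forced I-frames. A shorter interval cleans up the image more often but spends more of the bit budget on I-frames, which are the most expensive frame type.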
    • H.264 “Profiles” • Define which parts of the H.264 video specs are / aren’t available • On Apple devices: baseline, main, and high • Baseline: iPhone 4 and earlier, video iPods • High: Apple TV 2, iPad 3rd Gen, iPhone 5 • Biggest difference: baseline doesn’t have B-frames
    • Tradeoff Example
    • Compressor 4 Demo
    • Original 720p HD From http://trailers.apple.com/trailers/dreamworks/howtotrainyourdragon2/
    • 200 kbps Note: Compressor apparently won’t output such a low bitrate for this frame size; actual video bitrate is around 750 kbps
    • 200 kbps, 10 frames/sec
    • 200 kbps, resized to 640x272 Note: original HD was 1280x544
    • 200 kbps, 640x272, H.264 “main” profile Note: incompatible with iPhone 4 and earlier
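One way to see why the demo settings above trade off against each other is to compute how many encoded bits are available per pixel per frame. This is only a coarse proxy for quality, and the sketch below is an illustration rather than anything shown in the talk. The helper name `bits_per_pixel` is hypothetical, and the 24 frames/sec used in the usage note is an assumption, since the slides do not state the source frame rate.

```c
#include <assert.h>
#include <stdint.h>

/* Average encoded bits available per pixel per frame.
   Hypothetical helper; a rough indicator of how hard the encoder
   has to squeeze each frame at a given bitrate. */
double bits_per_pixel(uint32_t bitrate_bps, uint32_t width, uint32_t height,
                      double frames_per_sec)
{
    assert(width > 0 && height > 0 && frames_per_sec > 0.0);
    return (double)bitrate_bps / ((double)width * height * frames_per_sec);
}
```

At a fixed 200,000 bits per second, halving both dimensions from 1280x544 to 640x272 leaves four times as many bits for each pixel, and dropping from an assumed 24 frames/sec to 10 frames/sec leaves 2.4 times as many. Both are ways of buying back image quality without raising the bitrate.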
    • More Compression Considerations • May want to filter media before encoding • Audio: dynamic compression and normalization of levels (see "The Levelator") • Video: some codecs change your colors and luminance ("crushed blacks"); you can adjust them prior to compression to lessen this effect
    • NTSC Color Bars 3.5%, 7.5%, 11.5% black
    • Interlacing Don't compress video that looks like this until you de-interlace it, please.
    • Compressor 4 Image Controls
    • A brief aside
    • What Do The Following Have In Common? • Most “Second Doctor” (Patrick Troughton) episodes of Doctor Who • Most US soap operas prior to 1970 • Most US game shows prior to 1975 • Nearly all DuMont Network (1946-1956) programming • Television broadcasts of Super Bowls I & II • …and much more
    • They no longer exist
    • Loss • Encoding never makes media better. When image or sound data is lost, it is lost forever • When master tapes or films are destroyed, they can never be brought back, and copies are inherently inferior • In previous decades, reuse of video tape (“wiping”) and destruction of film (“junking”) were common practice
    • Case Study: Filmation • Major US producer of TV/movie animation (Superman, Fat Albert, The Archies, He-Man) • Bought and shut down in 1989. Archive converted to PAL and films destroyed • Due to framerate differences, PAL-to-NTSC conversions will always have sped-up audio • Can never be released in HD
    • Moral of Story Always preserve pristine master recordings
    • Container Formats
    • Containers • Allow you to combine and synchronize multiple audio/video/other media streams • Files: QuickTime (.mov), MPEG-4 (.mp4, .m4v, .m4a, .aac, etc.), Flash (.flv), Windows Media (.wmv), Ogg (.ogg), etc. • Network streams: Shoutcast, RTSP, HLS, MPEG-2 Transport, etc.
    • QuickTime File Format • Content agnostic: can handle any kind of codecs • Internal tree structure of "atoms": 4 byte size counter, 4 character code type, and then internals specific to that type • "moov" atom at the top level, contains "trak" atoms, which contain "mdia" atoms, which point to media samples
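The atom layout described above (4-byte size, then a 4-character type code) can be read with very little code. The following is a minimal sketch in plain C, not Apple's parser. The `AtomHeader` struct and `read_atom_header` function are hypothetical names, the size field is read as big-endian, and the sketch deliberately does not handle the two special size values in the QuickTime format (0, meaning the atom runs to the end of the file, and 1, meaning a 64-bit extended size follows).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct for illustration. */
typedef struct {
    uint32_t size;     /* total atom size in bytes, including the 8-byte header */
    char     type[5];  /* four-character code, NUL-terminated for easy printing */
} AtomHeader;

/* Reads one atom header from the start of buf.
   Returns 1 on success, 0 if fewer than 8 bytes are available.
   Sketch only: special size values 0 and 1 are not interpreted. */
int read_atom_header(const uint8_t *buf, size_t len, AtomHeader *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 8) {
        return 0;
    }
    /* The size field is stored big-endian. */
    out->size = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    memcpy(out->type, buf + 4, 4);
    out->type[4] = '\0';
    return 1;
}
```

To walk the top level of a file, a caller would read a header, act on `type` (for example, descend into "moov"), then skip ahead by `size` bytes to reach the next sibling atom.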
    • Editing with QuickTime • Sample references may or may not be in the same file • If they are, it's a "self-contained movie", suitable for distribution to end users • If not, it's a "reference movie", suitable for non-destructive editing
    • Streaming Containers • Streams can't offer random access like files • Simple example: Shoutcast • Just an endless stream of MP3 data over a socket, rate-controlled by server • Metadata (song titles) are inserted periodically in stream and must be removed by client before passing to audio decoder
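The metadata-removal step in the Shoutcast bullet above can be sketched as follows. The slides do not spell out the wire format, so this relies on the commonly documented ICY convention as an assumption: the server announces an interval in its `icy-metaint` response header, and after every that many audio bytes it sends one length byte followed by 16 times that many bytes of metadata. The function name `strip_icy_metadata` is hypothetical, and the sketch assumes the input buffer starts exactly at the beginning of an audio chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies only the audio bytes from an ICY (Shoutcast-style) stream buffer
   into out, skipping the interleaved metadata blocks.
   metaint is the number of audio bytes between metadata blocks.
   out must have room for at least in_len bytes.
   Returns the number of audio bytes written. Hypothetical helper. */
size_t strip_icy_metadata(const uint8_t *in, size_t in_len,
                          size_t metaint, uint8_t *out)
{
    size_t i = 0;        /* read position in the input */
    size_t written = 0;  /* audio bytes copied so far */

    assert(in != NULL && out != NULL && metaint > 0);
    while (i < in_len) {
        /* Copy up to one interval of audio. */
        size_t n = (in_len - i < metaint) ? (in_len - i) : metaint;
        memcpy(out + written, in + i, n);
        written += n;
        i += n;
        if (n < metaint || i >= in_len) {
            break;  /* buffer ended before the next length byte */
        }
        /* One length byte, then length * 16 bytes of metadata to skip. */
        size_t meta_len = (size_t)in[i] * 16;
        i += 1 + meta_len;
    }
    return written;
}
```

A real client would also parse the skipped block (typically a string such as a stream title) instead of discarding it, and would carry its position within the interval across successive socket reads.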
    • HTTP Live Streaming • Required format for most iOS streaming • Not actually a stream, but a series of small (~10 sec) files and a periodically-refreshed playlist of segment files • Can provide different bitrates via a playlist-of-playlists; client will figure out if it's getting data fast enough and switch up or down as needed
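The "switch up or down as needed" behavior can be illustrated with a deliberately simplified selection rule. This is not how Apple's player decides; real clients use their own heuristics, buffering state, and safety margins. The sketch below, in plain C with a hypothetical function name, just picks the highest-bitrate variant that fits within the measured throughput and falls back to the lowest variant when nothing fits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the highest-bitrate variant whose bitrate does not
   exceed measured_bps. If no variant fits, returns the index of the
   lowest-bitrate variant. Hypothetical, simplified selection rule. */
size_t choose_variant(const uint32_t *variant_bps, size_t count,
                      uint32_t measured_bps)
{
    size_t best = 0;    /* best fitting variant found so far */
    size_t lowest = 0;  /* lowest-bitrate variant, used as the fallback */
    int found = 0;

    assert(variant_bps != NULL && count > 0);
    for (size_t i = 0; i < count; i++) {
        if (variant_bps[i] < variant_bps[lowest]) {
            lowest = i;
        }
        if (variant_bps[i] <= measured_bps &&
            (!found || variant_bps[i] > variant_bps[best])) {
            best = i;
            found = 1;
        }
    }
    return found ? best : lowest;
}
```

With variants at 64, 200, 600, and 1,200 kbps, a measured throughput of 700 kbps selects the 600 kbps variant, while 50 kbps falls back to the 64 kbps one. Re-running the choice as each segment finishes downloading is what produces the switching described on the slide.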
    • Creating Media for Apps
    • Consider Your Goals • Playback: cut scenes, media player • Capture/editing • Messaging: VoIP, video chat • Other: livestreaming, screencasting, etc.
    • Consider Your Priorities • Picture/sound quality • Performance • Usefulness
    • Consider Your Constraints • Device storage / network bandwidth • iOS apps over 100MB cannot be downloaded over cellular network • CPU/GPU performance • Device support • Are you encoding for non-Apple devices too?
    • Choose at most 4 (2, more likely)
    • Creating Media In Apps
    • Encoding APIs • Core Audio • Audio Converter Services, Extended Audio File Services • AV Foundation • AVAssetExportSession, AVAssetWriter • Video Toolbox (Mac only)
    • Core Audio codecs • LPCM (uncompressed) • MP3 (read-only) • AAC • iLBC • Apple Lossless • Audible Not a complete list. Not all Mac types available on iOS.
    • AVAssetExportSession • Used to export an AVAsset (one or more audio/video tracks) to a file • Takes a “preset” for configurations. Can be: • QuickTime at various quality settings • QuickTime at various sizes • iTunes-compatible .m4a • “Pass Through”
    • AVAssetWriter • Lets you write one of several file formats (.mov, .mp4, Core Audio types) and specify encoding parameters (codec, size, bitrate, keyframe interval, etc) • You manually append CMSampleBuffers, usually single frames of video or small audio buffers
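    • NSError *assetWriterError = nil;
      // Write an MPEG-4 file at tempFileURL.
      self.assetWriter = [AVAssetWriter assetWriterWithURL:tempFileURL
                                                  fileType:AVFileTypeMPEG4
                                                     error:&assetWriterError];
      NSDictionary *videoInputSettings = nil;
      // Encoder-level settings: bitrate, keyframe interval, H.264 profile.
      // The profile constant is kept as it appears in the transcript; AV Foundation's
      // main-profile constants have names such as AVVideoProfileLevelH264Main30,
      // AVVideoProfileLevelH264Main31, AVVideoProfileLevelH264Main32, and
      // AVVideoProfileLevelH264Main41.
      NSDictionary *videoCompressionSettings = @{
          AVVideoAverageBitRateKey : @([self.captureSettings.videoBitRate floatValue] * 1024.0),
          AVVideoMaxKeyFrameIntervalKey : self.captureSettings.videoKeyframeInterval,
          AVVideoProfileLevelKey : AVVideoProfileLevelH264Main1 };
      // Track-level settings: codec and output dimensions.
      videoInputSettings = @{
          AVVideoCodecKey : AVVideoCodecH264,
          AVVideoWidthKey : @(self.outputSize.width),
          AVVideoHeightKey : @(self.outputSize.height),
          AVVideoCompressionPropertiesKey : videoCompressionSettings };
      self.assetWriterVideoInput =
          [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
                                             outputSettings:videoInputSettings];
      // Frames arrive live from capture, so the input must keep up in real time.
      self.assetWriterVideoInput.expectsMediaDataInRealTime = YES;
      if ([self.assetWriter canAddInput:self.assetWriterVideoInput]) {
          [self.assetWriter addInput:self.assetWriterVideoInput];
      }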
    • AVF Video Codecs • H.264 • JPEG • ProRes 4:4:4:4 or 4:2:2 (Mac only) • iFrame – H.264 I-Frame-only (as a format received from AVCaptureSession only, iOS only) Note: The numbers in ProRes codecs refer to the color/alpha fidelity; see http://en.wikipedia.org/wiki/Chroma_subsampling for more information
    • HTTP Live Streaming • Create with command-line tools or Pro apps (Compressor, FCPX, Motion, etc.) • Or use a server-side service to do it for you (UStream, Wowza, etc.) • Use variant playlists to target different devices and network conditions • Must provide a 64 kbps variant, either audio-only or audio with a single image
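The 64 kbps requirement on the slide above is easy to check mechanically before shipping a set of variants. The following is a small sketch in plain C, assuming 1 kbps means 1,000 bits per second and that the caller already has each variant's declared bitrate. The function name is hypothetical, and the check says nothing about whether the low variant is audio-only or audio with a single image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if at least one variant has a bitrate at or below limit_bps,
   0 otherwise. Hypothetical pre-flight check for a variant playlist. */
int has_low_bandwidth_variant(const uint32_t *variant_bps, size_t count,
                              uint32_t limit_bps)
{
    assert(variant_bps != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (variant_bps[i] <= limit_bps) {
            return 1;
        }
    }
    return 0;
}
```

Calling `has_low_bandwidth_variant(bitrates, count, 64000)` as part of a build or upload script would flag a variant set that is missing its low-bandwidth fallback.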
    • TN2224 – 16:9 “Best Practices for Creating and Deploying HTTP Live Streaming Media for the iPhone and iPad” http://developer.apple.com/library/ios/#technotes/tn2224/_index.html
    • TN2224 - 4:3
    • Additional Streaming Considerations • Are you encoding for other platforms? • Macs, Roku, Android (4.1+) support HLS • Desktops get Flash instead of <video> (but H.264 in .flv works great too!)
    • Takeaways • Encoding is about tradeoffs: know what matters to you, and what you can compromise on • CPU, storage/bandwidth, cost, time, quality
    • Q&A Slides will be posted to the CocoaConf Glassboard, and announced on my Twitter & app.net (@invalidname)