3. AVKit
Create view-level services for media playback, complete with user
controls, chapter navigation, and support for subtitles and closed
captioning.
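For example, a minimal sketch (assuming an iOS app; the URL is a placeholder) of presenting AVKit's AVPlayerViewController, which supplies these controls without any custom UI:

import AVKit
import AVFoundation
import UIKit

// A minimal sketch of AVKit's built-in player UI, which provides transport
// controls, chapter navigation, and subtitle/closed-caption selection for free.
func presentPlayer(from presenter: UIViewController) {
    guard let url = URL(string: "https://example.com/sample.mp4") else { return }

    let player = AVPlayer(url: url)
    let playerViewController = AVPlayerViewController()
    playerViewController.player = player

    presenter.present(playerViewController, animated: true) {
        player.play()
    }
}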
4. What is AVFoundation
The AVFoundation framework combines four major technology areas:
● Media Capture
- Capture photos and record video and audio; configure built-in cameras and
microphones or external capture devices.
● Audio
- Configure audio sessions; play, record, mix, and process audio
● Playback and Editing
- Access media assets and inspect their content; queue media for playback and
customize playback behavior; edit and combine assets with compositions; import and export
raw media streams.
5. What is AVFoundation
● Speech
- Convert text to spoken audio via speech synthesis (see the sketch below)
Together, these areas cover a wide range of tasks for capturing, processing, synthesizing,
controlling, importing, and exporting audiovisual media on Apple platforms.
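For the Speech area, a minimal sketch of text-to-speech with AVSpeechSynthesizer (the text and language are illustrative):

import AVFoundation

// A minimal speech-synthesis sketch: converting text to spoken audio.
// Keep a strong reference to the synthesizer while it is speaking.
let synthesizer = AVSpeechSynthesizer()

let utterance = AVSpeechUtterance(string: "Hello from AVFoundation")
utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
utterance.rate = AVSpeechUtteranceDefaultSpeechRate

synthesizer.speak(utterance)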
6. Cameras and Media Capture
To capture media we need the following entities; a sketch at the end of this section wires them together:
Session -
AVCaptureSession
An object that manages capture activity and coordinates the flow of data from input devices to capture
outputs.
Input -
AVCaptureInput
The abstract superclass for objects that provide input data to a capture session.
AVCaptureScreenInput - A capture input for recording from a screen in macOS.
7. Cameras and Media Capture
Output -
AVCaptureOutput
The abstract superclass for objects that output the media recorded in a capture session.
AVCaptureStillImageOutput - A capture output for capturing still photos in macOS.
Preview -
AVCaptureVideoPreviewLayer
A Core Animation layer that can display video as it is being captured.
AVCaptureAudioPreviewOutput
A capture output that provides preview playback for audio being recorded in a capture session.
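Putting the four entities together, a minimal sketch of a working capture pipeline (camera permission, threading, and error handling are omitted; AVCaptureMovieFileOutput stands in for the output role):

import AVFoundation

// A minimal sketch wiring session, input, output, and preview together.
// Assumes camera permission is already granted; in a real app, call
// startRunning() from a background queue because it blocks.
func makeCaptureSession() -> (AVCaptureSession, AVCaptureVideoPreviewLayer)? {
    let session = AVCaptureSession()

    // Input: the default video device (a built-in camera) wrapped in a device input.
    guard let camera = AVCaptureDevice.default(for: .video),
          let input = try? AVCaptureDeviceInput(device: camera),
          session.canAddInput(input) else { return nil }
    session.addInput(input)

    // Output: record the captured media to a movie file.
    let output = AVCaptureMovieFileOutput()
    guard session.canAddOutput(output) else { return nil }
    session.addOutput(output)

    // Preview: a Core Animation layer that displays frames as they are captured.
    let previewLayer = AVCaptureVideoPreviewLayer(session: session)
    previewLayer.videoGravity = .resizeAspectFill

    session.startRunning()
    return (session, previewLayer)
}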
9. Media Assets, Playback and Editing
What is AVAsset?
● AVAsset is an abstract, immutable class used to model timed audiovisual media
such as videos and sounds.
● An asset may contain one or more tracks.
● A track may contain audio, video, text, closed captions, and subtitles.
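A minimal sketch of loading and inspecting an asset (the file path is a placeholder; the async load(_:) API requires iOS 15/macOS 12 or later):

import AVFoundation

// Model a media file as an AVAsset and inspect its tracks.
func inspectAsset() async throws {
    let url = URL(fileURLWithPath: "movie.mov")
    let asset = AVURLAsset(url: url)  // AVURLAsset is a concrete subclass of AVAsset

    let (duration, tracks) = try await asset.load(.duration, .tracks)
    print("Duration: \(CMTimeGetSeconds(duration)) seconds")

    for track in tracks {
        // Each track carries one kind of media: audio, video, text, subtitles, etc.
        print("Track \(track.trackID): \(track.mediaType.rawValue)")
    }
}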
10. Media Assets, Playback and Editing
What is CMTime?
● In general, a CMTime represents a position, such as a frame, within an asset.
● A struct representing a time value such as a timestamp or duration.
● A CMTime is represented as a rational number, with a numerator (an int64_t value) and a
denominator (an int32_t timescale).
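For example, with a timescale of 30, a value of 1 is one thirtieth of a second, i.e. one frame of 30 fps video:

import CoreMedia

// A minimal sketch of CMTime as a rational number: value / timescale.
let frameDuration = CMTime(value: 1, timescale: 30)   // one frame at 30 fps
let tenFrames = CMTimeMultiply(frameDuration, multiplier: 10)

print(CMTimeGetSeconds(frameDuration))  // 0.0333...
print(CMTimeGetSeconds(tenFrames))      // 0.3333...

// Arithmetic stays exact because times are stored as integers, not floats.
let total = CMTimeAdd(frameDuration, tenFrames)       // 11/30 of a second
print(total.value, total.timescale)                   // 11 30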
11. Media Assets, Playback and Editing
What is a Frame?
A video frame often has information associated with it that is useful to the system that displays it. In
Core Video, this information is associated with a video frame as an attachment. Attachments are Core
Foundation objects representing various types of data as following:
● Color space: the model used to represent an image, such as RGB or CMYK.
● Square versus rectangular pixels: digital video on computers typically uses square pixels. However, TV
uses rectangular pixels, so you need to compensate for this discrepancy if you are creating video for broadcast.
● Timestamps: a timestamp represents when a particular frame appears in a movie. Timestamps make it easy to
isolate particular movie frames, and they simplify synchronization of multiple video and audio tracks.
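A minimal sketch of reading a frame's timestamp and one such attachment from a CMSampleBuffer, as delivered by a capture or asset-reader callback (newer systems prefer CVBufferCopyAttachment over CVBufferGetAttachment):

import CoreMedia
import CoreVideo

// Read a frame's presentation timestamp and a Core Video attachment.
func inspectFrame(_ sampleBuffer: CMSampleBuffer) {
    // The presentation timestamp says when this frame appears in the movie.
    let pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    print("Frame at \(CMTimeGetSeconds(pts)) s")

    if let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) {
        // Attachments carry per-frame metadata, such as color information.
        let matrix = CVBufferGetAttachment(pixelBuffer,
                                           kCVImageBufferYCbCrMatrixKey,
                                           nil)?.takeUnretainedValue()
        print("YCbCr matrix: \(String(describing: matrix))")
    }
}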
12. Media Assets, Playback and Editing
How does asset editing work?
● Output
-AVAssetWriter
We use an AVAssetWriter object to write media data to a new file of a specified
audiovisual container type, such as a QuickTime movie file or an MPEG-4 file, with support for automatic
interleaving of media data for multiple concurrent tracks.
● Input
-AVAssetWriterInput
We use an AVAssetWriterInput to append media samples packaged as
CMSampleBuffer objects (see CMSampleBuffer), or collections of metadata, to a single track of the
output file of an AVAssetWriter object.
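A minimal sketch of the output and input halves together (the file path and encoding settings are illustrative; error handling is trimmed):

import AVFoundation

// An AVAssetWriter plus one video AVAssetWriterInput, ready to accept samples.
func makeWriter() throws -> (AVAssetWriter, AVAssetWriterInput) {
    let outputURL = URL(fileURLWithPath: "output.mov")
    let writer = try AVAssetWriter(outputURL: outputURL, fileType: .mov)

    // One input per track; this one describes a 1280x720 H.264 video track.
    let settings: [String: Any] = [
        AVVideoCodecKey: AVVideoCodecType.h264,
        AVVideoWidthKey: 1280,
        AVVideoHeightKey: 720
    ]
    let input = AVAssetWriterInput(mediaType: .video, outputSettings: settings)
    input.expectsMediaDataInRealTime = false

    if writer.canAdd(input) { writer.add(input) }
    guard writer.startWriting() else { throw writer.error ?? CocoaError(.fileWriteUnknown) }
    writer.startSession(atSourceTime: .zero)
    return (writer, input)
}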
13. Media Assets, Playback and Editing
● Input Adapter
-AVAssetWriterInputPixelBufferAdaptor
We use an AVAssetWriterInputPixelBufferAdaptor to append video samples
packaged as CVPixelBuffer objects to a single AVAssetWriterInput object.
● Pixel Buffer
- We use CVPixelBufferRef to create Core Video buffers that hold pixels in main
memory.
● Rendering Context
- We use CGContextRef to draw the image and other elements into the
context. The context is created from the CVPixelBufferRef's pixel memory.
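Continuing the writer sketch above, a minimal example of the adaptor, pixel buffer, and rendering context working together (sizes and the solid-color frame are illustrative; the adaptor must be created before writing starts):

import AVFoundation
import CoreGraphics
import CoreVideo

// The adaptor must be attached to the video input before startWriting() is called.
func makeAdaptor(for input: AVAssetWriterInput) -> AVAssetWriterInputPixelBufferAdaptor {
    AVAssetWriterInputPixelBufferAdaptor(
        assetWriterInput: input,
        sourcePixelBufferAttributes: [
            kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32ARGB,
            kCVPixelBufferWidthKey as String: 1280,
            kCVPixelBufferHeightKey as String: 720
        ])
}

// Draw one solid-color 1280x720 frame with Core Graphics and append it at `time`.
func appendFrame(via adaptor: AVAssetWriterInputPixelBufferAdaptor, at time: CMTime) {
    var pixelBuffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault, 1280, 720,
                                     kCVPixelFormatType_32ARGB, nil, &pixelBuffer)
    guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return }

    CVPixelBufferLockBaseAddress(buffer, [])
    // A CGContext over the buffer's memory lets us draw directly into the frame.
    if let context = CGContext(data: CVPixelBufferGetBaseAddress(buffer),
                               width: 1280, height: 720, bitsPerComponent: 8,
                               bytesPerRow: CVPixelBufferGetBytesPerRow(buffer),
                               space: CGColorSpaceCreateDeviceRGB(),
                               bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) {
        context.setFillColor(CGColor(red: 1, green: 0, blue: 0, alpha: 1))
        context.fill(CGRect(x: 0, y: 0, width: 1280, height: 720))
    }
    CVPixelBufferUnlockBaseAddress(buffer, [])

    if adaptor.assetWriterInput.isReadyForMoreMediaData {
        _ = adaptor.append(buffer, withPresentationTime: time)
    }
}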