2. Motivation
● 2011: Build an NRCS (Newsroom Computer System) for 35 journalists
○ Peak 45 people on special events (elections), 8 people on weekends
○ Based in Barcelona, offices in Madrid, but contributions from everywhere
○ Cheap, reliable, easy to support, etc
3. Motivation
● 2017: we have available:
○ Cloud CMS
○ Cloud video editing APIs
○ Cloud storage
○ Reliable, powerful, and more affordable internet links
○ Video editor player? Broadcast UX?
5. Introduction - design goals
● Accuracy is our main concern
● Full control & feedback about displayed video / audio frame
● Use common browser technologies: Javascript / HTML5
● Assumptions (just trimming player):
○ BW is NOT the 1st concern (few users, good connections)
○ Maximising image quality is NOT a concern
○ Full mobile device compatibility is NOT a concern
● Market research:
○ Vimond IO (IBC 09/2016)
○ Grabyo (tested 03/2016)
○ Volicon (tested 03/2016)
○ Accurate player
7. VOD Backend process
● Extracts media information (fps, length, audio fs, etc)
● Extracts time code (SMPTE timecode) information
● Detects scene changes and adds them as cue point information
● Extracts the initial A/V delay, to compensate for it later
● Ready to extract other timed metadata (Cue points)
● For video
○ Decodes video and encodes each frame using JPEG (quality as a parameter)
● For audio
○ Decodes audio and encodes each portion as PCM (video frame aligned) with sample accuracy
● Generate a JSON manifest with all the information
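As an illustration of what the backend emits, a manifest along these lines would carry the extracted information (field names here are hypothetical, not the actual format):

```javascript
// Hypothetical manifest shape; every field name is illustrative.
const manifest = {
  fps: 30,                        // video frame rate
  audioSampleRate: 44100,         // audio fs (Hz)
  numFrames: 600,                 // e.g. a 20 s clip
  startTimecode: "10:00:00:00",   // SMPTE TC of frame 0
  avInitialDelayMs: 21,           // measured A/V offset, compensated by the backend
  cuePoints: [{ frame: 120, type: "sceneChange" }],
  videoFrames: "frames/%06d.jpg", // one JPEG per video frame
  audioFrames: "audio/%06d.pcm"   // one PCM chunk per video frame
};

// Each audio "frame" must hold exactly fs / fps samples (see the backend constraint):
const samplesPerFrame = manifest.audioSampleRate / manifest.fps; // 44100 / 30 = 1470
```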
9. Backend: step by step
● Transcode video to single frame files: ffmpeg
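A self-contained sketch of this step. The exact ffmpeg flags used in production are not in the slides, so these are assumptions; the first command only synthesizes a 1-second, 30 fps test clip so the example can run standalone:

```shell
# Synthesize a 1 s / 30 fps test clip (input.mp4 is a placeholder name)
ffmpeg -y -f lavfi -i testsrc=duration=1:size=320x240:rate=30 -c:v mpeg4 input.mp4

# Dump every decoded frame as one JPEG.
# -qscale:v maps to the slide's "quality as a parameter" (1 = best, 31 = worst).
mkdir -p frames
ffmpeg -y -i input.mp4 -qscale:v 2 frames/%06d.jpg
```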
10. Backend: step by step
● Decode audio to PCM (wav): ffmpeg
● Create one PCM audio “frame” per video frame file: Our own lib
● Compensate A/V delay at the beginning and end of stream: Our own lib
Current constraint: audio fs must be a multiple of the fps
ex: 44.1 kHz / 30 fps = 1470 audio samples per video frame
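A minimal Node.js sketch of the per-frame PCM split; the in-house lib is not public, so the function name and the 16-bit mono assumption are illustrative. The source WAV could come from e.g. `ffmpeg -i input.mp4 -vn -acodec pcm_s16le audio.wav`:

```javascript
// Split raw 16-bit mono PCM into one chunk per video frame.
// Enforces the current constraint: fs must be an integer multiple of fps.
function splitPcmIntoFrames(pcm, sampleRate, fps) {
  if (sampleRate % fps !== 0) {
    throw new Error("audio fs must be a multiple of fps");
  }
  const samplesPerFrame = sampleRate / fps;  // e.g. 44100 / 30 = 1470
  const bytesPerFrame = samplesPerFrame * 2; // 2 bytes per 16-bit sample
  const frames = [];
  for (let off = 0; off + bytesPerFrame <= pcm.length; off += bytesPerFrame) {
    frames.push(pcm.slice(off, off + bytesPerFrame));
  }
  return frames;
}

// 1 second of silence at 44.1 kHz / 30 fps → 30 audio "frames" of 1470 samples each
const oneSecond = Buffer.alloc(44100 * 2);
const frames = splitPcmIntoFrames(oneSecond, 44100, 30);
```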
13. Frontend process: Pure javascript (NO MSE)
● Fetch the manifest
● Fetch all video frame files in the manifest (&)
○ Download & store them in an Image() array matrixV[0..NFrames]
● Fetch all the audio files in the manifest (&)
○ Store them as a byte object matrixA[0..NFrames] (Uses XMLHttpRequest with arraybuffer)
● Wait for user events (pos, play, rev, +/-1 frame, etc):
○ Position X
■ Show video frame X and metadata
■ Create audio context (if not created), write all frame samples into buffer and play it.
○ Playback
■ For Video: Show video frame X+1 every 1/fps (setInterval function)
■ For Audio: Create audio context (if not created), write the samples for each played frame into a buffer, and play it.
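A browser-side sketch of the steps above (function names are hypothetical; the real player also handles seeks, stop, and scheduling). It shows the PCM-to-AudioBuffer path through the Web Audio API and the setInterval playback loop:

```javascript
// Convert one frame's 16-bit PCM bytes into Web Audio float samples in [-1, 1).
function pcmToFloat32(arrayBuffer) {
  const int16 = new Int16Array(arrayBuffer);
  const out = new Float32Array(int16.length);
  for (let i = 0; i < int16.length; i++) out[i] = int16[i] / 32768;
  return out;
}

// Play the audio "frame" for position X (matrixA[x] is an ArrayBuffer of PCM).
function playFrameAudio(audioCtx, pcmArrayBuffer, sampleRate) {
  const samples = pcmToFloat32(pcmArrayBuffer);
  const buf = audioCtx.createBuffer(1, samples.length, sampleRate); // mono
  buf.getChannelData(0).set(samples);
  const src = audioCtx.createBufferSource();
  src.buffer = buf;
  src.connect(audioCtx.destination);
  src.start();
}

// Playback: advance one frame every 1/fps seconds; onFrame draws the Image()
// from matrixV and queues its audio via playFrameAudio.
function play(startFrame, fps, numFrames, onFrame) {
  let x = startFrame;
  const timer = setInterval(() => {
    if (x >= numFrames) { clearInterval(timer); return; }
    onFrame(x++);
  }, 1000 / fps);
  return timer;
}
```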
14. Now yes, the demo (VOD)!
● Check it out!
○ https://jordicenzano.github.io/frame-accurate-scrubbing/
17. Pros and Cons (as a trimming tool)
● Pros:
○ Accuracy
○ Responsiveness
○ Cloud: accessed from everywhere, easy support & upgrade
○ Runs (almost) everywhere: HTML5 + Javascript
○ Broadcast friendly (uses TC, easy to integrate into broadcast workflows)
● Cons:
○ Requires more BW than regular playback (x3...x5)
○ Probable audio clicks at every anchor point (20s). Not designed for long playback.
○ Current limitations:
■ Audio fs must match the playback device fs (most common = 44.1 kHz)
■ Audio fs must be a multiple of the fps
18. Future work
● Accept any audio sample frequency
● Implement live ingest approach
● Add intelligence to download algorithm:
○ Download all the audio, but for the video just the range that surrounds the cursor
○ Using ABR, download the lowest quality first, and improve quality around the cursor
● Test JPEG 2000 instead of JPEG (½ BW savings?)
● Compensate video speed in long playbacks (avoid long term A/V drift)
● Try to use WebWorkers to download (and/or process) audio
● Implement multiple speeds (super easy)
● Implement multiple qualities (ABR approach)
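As a hypothetical illustration of the cursor-centred download idea above (not part of the current code), frame indices can simply be ordered by distance from the cursor, so the fetch queue fans out from the playhead:

```javascript
// Order frame indices by distance from the cursor position.
// Ties resolve lower-index-first because Array.prototype.sort is stable.
function downloadOrder(numFrames, cursor) {
  return Array.from({ length: numFrames }, (_, i) => i)
    .sort((a, b) => Math.abs(a - cursor) - Math.abs(b - cursor));
}

// downloadOrder(7, 3) → [3, 2, 4, 1, 5, 0, 6]
```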