A Progressive Approach to the Past:
Ensuring Cheap Backwards Compatibility Through Cleverness and Pain
derekb@vimeo.com / derek@videolan.org
@daemon404
Derek Buitenhuis
13 April 2021
The British Internet
Who’s this guy?
• Principal Video Engineer @ Vimeo
• Open source developer (FFmpeg, FFMS2, rav1e, obuparse, etc.)
• VideoLAN non-profit board member
• Professional Twitter Sh*tposter
Who am I?
• Usually, I’m this guy:
Who am I, really?
• Currently, I’m this guy:
Sins of Multimedia Past Last Forever
• It is 2021. We encode to and serve fragmented MP4 for VoD.
  • Audio and video are separate files.
  • Segments are just range requests.
  • Easier logic, easier caching.
• Some of us encode to progressive MP4, and segment at the edge.
  • Can be expensive, can require running and maintaining services.
• Some people use MPEG-TS as their mezzanine. These people are monsters.
• Problems:
  • Some Very Bad Programs can only consume progressive MP4.
  • Your company made a bad decision over 10 years ago to give direct progressive MP4 URLs to the highest-paying customers.
  • 10+ years of hardcoded URLs and API use. You also support VoD downloads.
Support Options
• Don’t store videos as FMP4; store as progressive.
  • Almost all traffic will have to be segmented at edge. This is expensive and dumb.
• Entirely remove progressive MP4 support.
  • Least engineering work, most product work.
  • Anger your highest paying users. Anger product. Anger marketing. Anger viewers on terrible devices.
• Store progressive MP4s as well, or just one rendition, such as 720p.
  • Not much work.
  • A lot of expensive storage for a rarely used rendition of every single video.
  • People will still be angry because you took away their 240p or 4K, etc.
• Write a Very Clever Service to proxy FMP4s and make them appear progressive.
  • Most engineering work.
  • Service will be low volume, and thus fairly cheap.
So You’ve Chosen Pain
• Obviously we chose the difficult engineering one.
• Things it needed:
  • Transparently expose a set of FMP4s (one video, one audio) as a progressive MP4.
  • Must support exact range requests, for playback in browsers and Akamai cacheability.
  • Every request must be performant.
  • Can’t read all the source moof boxes every time (more on this later).
• There are so many MP4 muxers and demuxers, but they’re all generic and not suitable.
  • Source MP4s have all the info we need, such as mdat box offsets, timestamps, and sample sizes, so the real solution is closer to de-/re-serialization.
  • All input is known good. Bad input should be hard-rejected.
• So I wrote one.
Meet Artax
MP4 Anatomy (Simplified)
FMP4 Video: ftyp, moov, sidx, moof, mdat, moof, mdat, ...
+
FMP4 Audio: ftyp, moov, sidx, moof, mdat, moof, mdat, ...
→
Progressive: ftyp, moov, mdat
MP4 Anatomy (Deeper)
[ftyp: File Type Box]
[moov: Movie Box]
  [mvhd: Movie Header Box]
  [trak: Track Box]
    [tkhd: Track Header Box]
    [edts: Edit Box]
      [elst: Edit List Box]
    [mdia: Media Box]
      [mdhd: Media Header Box]
      [hdlr: Handler Reference Box]
      [minf: Media Information Box]
        [vmhd: Video Media Header Box]
        [dinf: Data Information Box]
          [dref: Data Reference Box]
            [url : Data Entry Url Box]
        [stbl: Sample Table Box]
          [stsd: Sample Description Box]
            [avc1: Visual Description]
              [avcC: AVC Configuration Box]
              [colr: Colour Information Box]
          [stts: Decoding Time to Sample Box]
          [stsc: Sample To Chunk Box]
          [stsz: Sample Size Box]
          [stco: Chunk Offset Box]
          [sgpd: Sample Group Description Box]
          [sbgp: Sample to Group Box]
  [mvex: Movie Extends Box]
    [mehd: Movie Extends Header Box]
    [trex: Track Extends Box]
[sidx: Segment Index Box]
[moof: Movie Fragment Box]
  [mfhd: Movie Fragment Header Box]
  [traf: Track Fragment Box]
    [tfhd: Track Fragment Header Box]
    [tfdt: Track Fragment Base Media Decode Time Box]
    [trun: Track Fragment Run Box]
    [sgpd: Sample Group Description Box]
    [sbgp: Sample to Group Box]
[mdat: Media Data Box]
→
[ftyp: File Type Box]
[moov: Movie Box]
  [mvhd: Movie Header Box]
  [trak: Track Box]
    [tkhd: Track Header Box]
    [edts: Edit Box]
      [elst: Edit List Box]
    [mdia: Media Box]
      [mdhd: Media Header Box]
      [hdlr: Handler Reference Box]
      [minf: Media Information Box]
        [vmhd: Video Media Header Box]
        [dinf: Data Information Box]
          [dref: Data Reference Box]
            [url : Data Entry Url Box]
        [stbl: Sample Table Box]
          [stsd: Sample Description Box]
            [avc1: Visual Description]
              [avcC: AVC Configuration Box]
              [colr: Colour Information Box]
          [stts: Decoding Time to Sample Box]
          [ctts: Composition Time to Sample Box]
          [stss: Sync Sample Box]
          [stsc: Sample To Chunk Box]
          [stsz: Sample Size Box]
          [co64: Chunk Offset Box]
          [sgpd: Sample Group Description Box]
          [sbgp: Sample to Group Box]
[mdat: Media Data Box]
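A minimal Python sketch of the box framing that both trees share: every box starts with a 32-bit big-endian size and a four-character type, where size 1 signals a 64-bit size and size 0 means "runs to the end of the file". The names `iter_boxes` and `make_box` and the demo bytes are illustrative assumptions, not Artax code; the slides do not say what language Artax is written in.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(buf: bytes, start: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int, int]]:
    """Yield (type, box_offset, payload_offset, box_size) for each box in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos < end:
        if end - pos < 8:
            raise ValueError(f"truncated box header at offset {pos}")
        size, fourcc = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:
            # size == 1: a 64-bit "largesize" follows the type field.
            if end - pos < 16:
                raise ValueError(f"truncated largesize at offset {pos}")
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif size == 0:
            # size == 0: the box extends to the end of the file (here, the slice).
            size = end - pos
        if size < header or pos + size > end:
            # Input is expected to be known good, so reject instead of guessing.
            raise ValueError(f"bad box size {size} at offset {pos}")
        yield fourcc.decode("latin-1"), pos, pos + header, size
        pos += size


def make_box(fourcc: bytes, payload: bytes = b"") -> bytes:
    """Build a box with a 32-bit size field (helper for the synthetic demo)."""
    return struct.pack(">I4s", 8 + len(payload), fourcc) + payload


# Synthetic three-box file: ftyp, moov (containing mvhd), mdat.
demo = (make_box(b"ftyp", b"isom\x00\x00\x00\x00")
        + make_box(b"moov", make_box(b"mvhd", b"\x00" * 4))
        + make_box(b"mdat", b"xyz"))
top_level = [(kind, offset, size) for kind, offset, _, size in iter_boxes(demo)]
```

Calling `iter_boxes` again on a container's payload span (for example `iter_boxes(demo, 24, 36)` for the moov above) walks its children the same way.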
moov Box Strategy
• Parse the input moov and sidx boxes.
• Use the moof offsets from the sidx boxes to parse the moofs in parallel on a thread pool.
• Construct all the non-mdat output boxes from this up front, before remuxing.
  • This allows us to know the moov size, full file size, PTS/DTS, sync points, and all mdat offsets up front. This is extremely important for Content-Length and range request support.
  • Since we have all the exact parsed info from the source boxes, every size and offset is calculable with a bit of book-keeping.
• Cache this information so that any future requests are fast.
• Now about range request support…
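A hedged Python sketch of the "sidx gives moof offsets, then parse in parallel" idea. It assumes each sidx reference covers exactly one moof+mdat pair, which the slides do not state. `Fragment`, `parse_sidx`, `parse_all`, and the `parse_moof` callback are hypothetical names; the payload passed in is the sidx box body after its 8-byte header.

```python
import struct
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Fragment:
    offset: int    # absolute offset of the fragment (its moof) in the source file
    size: int      # bytes covered by this reference (moof + mdat, by assumption)
    duration: int  # in sidx timescale units


def parse_sidx(payload: bytes, sidx_end: int) -> List[Fragment]:
    """Turn a sidx payload into absolute fragment offsets.

    sidx_end is the file offset of the first byte after the sidx box,
    which is the anchor that first_offset is relative to.
    """
    version = payload[0]
    if version == 0:
        _ept, first_offset = struct.unpack_from(">II", payload, 12)
        pos = 20
    elif version == 1:
        _ept, first_offset = struct.unpack_from(">QQ", payload, 12)
        pos = 28
    else:
        raise ValueError(f"unknown sidx version {version}")
    _reserved, count = struct.unpack_from(">HH", payload, pos)
    pos += 4
    if len(payload) != pos + 12 * count:
        raise ValueError("sidx length does not match reference_count")
    fragments = []
    offset = sidx_end + first_offset
    for _ in range(count):
        ref, duration, _sap = struct.unpack_from(">III", payload, pos)
        pos += 12
        if ref >> 31:
            # Top bit set means the reference points at another sidx.
            raise ValueError("hierarchical sidx is not handled in this sketch")
        size = ref & 0x7FFFFFFF
        fragments.append(Fragment(offset, size, duration))
        offset += size
    return fragments


def parse_all(fragments: List[Fragment],
              parse_moof: Callable[[Fragment], T], workers: int = 8) -> List[T]:
    """Run parse_moof over every fragment on a thread pool, keeping source order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(parse_moof, fragments))
```

In a real service `parse_moof` would fetch and decode the moof at `fragment.offset`; here it is left as a caller-supplied function so the sketch stays self-contained.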
mdat Box Strategy
• Packet sizes and positions in source files are all known.
• We need to properly interleave audio and video chunks.
  • Chose 500 ms interleaving.
  • This interleaving is state: it must be consistent regardless of which range was requested.
  • For example, you need to know, for any given range, how many packets into the chunk you are when writing, and how they’re interleaved, 100% exactly.
  • More on this in a second.
• We want to use persistent HTTP connections for reading all the mdats from source files.
  • This means taking a minor hit in bandwidth by skipping over moofs, in order to keep it persistent.
• A prefetch is useful here.
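One way to make the interleaving reproducible is to derive it purely from the sample tables, so any request rebuilds the same layout. The Python sketch below is an assumption about how that could look, not Artax's actual scheme: packets are bucketed into 500 ms windows by decode time, and chunks are emitted per window with track 0 before track 1. `Chunk` and `plan_chunks` are hypothetical names, and the packet sizes in the usage note are made up.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Chunk:
    track: int         # index into the tracks list (0 = video, 1 = audio here)
    first_packet: int  # index of the chunk's first packet within its track
    packet_count: int
    offset: int        # absolute byte offset of the chunk in the output file
    size: int          # total bytes in the chunk


def plan_chunks(tracks: Sequence[Tuple[int, Sequence[Tuple[int, int]]]],
                mdat_payload_offset: int,
                interleave_ms: int = 500) -> List[Chunk]:
    """Lay out chunks deterministically from (timescale, [(size, duration), ...]) per track."""
    # (window, track) -> [first_packet, packet_count, byte_count]
    windows: Dict[Tuple[int, int], List[int]] = {}
    for track, (timescale, packets) in enumerate(tracks):
        dts = 0
        for index, (size, duration) in enumerate(packets):
            # Integer arithmetic only, so the result never depends on float rounding.
            window = dts * 1000 // (timescale * interleave_ms)
            entry = windows.setdefault((window, track), [index, 0, 0])
            entry[1] += 1
            entry[2] += size
            dts += duration
    chunks = []
    offset = mdat_payload_offset
    # Sorting by (window, track) interleaves tracks window by window.
    for window, track in sorted(windows):
        first, count, nbytes = windows[(window, track)]
        chunks.append(Chunk(track, first, count, offset, nbytes))
        offset += nbytes
    return chunks
```

With 24 fps video (timescale 24, duration 1) and audio at 48 packets per second (timescale 48000, duration 1000), `plan_chunks` yields the repeating 12 video / 24 audio pattern, and each `Chunk.offset` is the value that would go into the output chunk offset table.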
Range Request Strategy
• ftyp and moov boxes are already calculated and cached (byte buffer), so ranges within them are easy.
  • Need to be careful when handling ranges which straddle the boundary between the cached moov and the mdat box.
• mdat is much trickier:
  • We need to calculate which source mdats (there are many per stream, remember) to start reading.
  • We need to know which packets within these mdats to start outputting, and when to stop.
  • We need to know how many bytes of the first and last written packets to ignore to satisfy the range.
  • We need to know the exact position and state of the packet interleaving where this range starts.
  • With a little pain, we can calculate this on each request, since we will know exactly what the chunk pattern is, e.g. 12 video packets / 24 audio packets / repeat.
• If this sounds like a ton of tricky book-keeping, you are correct.
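A small Python sketch of the book-keeping for one request. It swaps in a prefix-sum table plus binary search rather than the chunk-pattern arithmetic described above, so treat it as an illustration of the problem, not the Artax method. `RangePlan` and `plan_range` are hypothetical names; `header_len` is assumed to cover the cached ftyp, moov, and the mdat box header, and `packet_sizes` lists packets in output (interleaved) order.

```python
import bisect
from dataclasses import dataclass
from itertools import accumulate
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class RangePlan:
    header_slice: Tuple[int, int]  # [start, stop) into the cached header bytes
    first_packet: Optional[int]    # None when the range ends inside the header
    skip_head: int                 # bytes to drop from the front of first_packet
    last_packet: Optional[int]
    trim_tail: int                 # bytes to drop from the end of last_packet


def plan_range(header_len: int, packet_sizes: Sequence[int],
               start: int, end: int) -> RangePlan:
    """Map an inclusive byte range (as in 'Range: bytes=start-end') onto output packets."""
    # ends[i] = number of payload bytes up to and including packet i.
    # A real service would cache this table instead of rebuilding it per request.
    ends = list(accumulate(packet_sizes))
    total = header_len + (ends[-1] if ends else 0)
    if not 0 <= start <= end < total:
        raise ValueError("unsatisfiable range")  # would become HTTP 416
    # Part of the range served straight from the cached header (possibly empty).
    header_slice = (min(start, header_len), min(end + 1, header_len))
    if end < header_len:
        return RangePlan(header_slice, None, 0, None, 0)
    payload_start = max(start, header_len) - header_len
    payload_end = end - header_len  # inclusive
    # bisect_right finds the first packet whose end lies beyond the wanted byte.
    first = bisect.bisect_right(ends, payload_start)
    last = bisect.bisect_right(ends, payload_end)
    skip_head = payload_start - (ends[first] - packet_sizes[first])
    trim_tail = ends[last] - 1 - payload_end
    return RangePlan(header_slice, first, skip_head, last, trim_tail)
```

A range that straddles the header and the payload comes back with a non-empty `header_slice` and a packet span, so the caller writes the cached bytes first and then streams packets `first_packet` through `last_packet`, dropping `skip_head` and `trim_tail` bytes at the edges.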
Demo Time
Questions? Disgust?
