2. Who Am I?
23 September 2018
• This guy:
VideoLAN Dev Days
3. Preface
• All of these terms are, by nature, vague and have multiple definitions.
• All of the examples in this talk are things I’ve seen in real life marketing.
• There are some good/useful implementations of some of these buzzwords.
• This talk is not about those.
4. Marketing Ruins Everything
• You’re probably tired of getting e-mails asking about $service which has Cyborg Assisted Bit
Twiddling, or why we haven’t yet deployed the hot new codec with the encoder someone found in the
dumpster out back.
• You’re probably tired of looking into exactly what this specific iteration of that buzzword means.
• You’re probably tired of having to defend not buying into the hype, or a different server, etc.
5. The Simple Ones
• Cloud-optimized:
• Too slow to run on a normal server or workstation so throw infinite CPUs at it instead of optimizing.
• Probably FFmpeg / x264 / etc., running via exec in Python / Node / Ruby on EC2 or GCE.
• Expensive.
• Low latency or Realtime:
• Not low latency by broadcast standards. Not low latency enough for sports (e.g. betting).
• Otherwise: Outsourced to WebRTC, implemented by others.
• “30% Better Compression”:
• Not 30% for any real or widely applicable application.
• Likely not compared via PQ and not against competing encoders/methods with psy enabled.
• Engineers working on these products tend to get really antsy when you ask them about this.
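For reference, the "FFmpeg / x264 via exec in Python" pattern mentioned above tends to look roughly like the sketch below. It is only an illustration, not anything from a specific product: the function names, file names, and rate values are hypothetical, and it assumes an ffmpeg binary built with libx264 is on the PATH. The options used (-c:v libx264, -crf, -maxrate, -bufsize) are standard ffmpeg options.

```python
import subprocess


def build_x264_command(src, dst, crf=23, maxrate_kbps=4000, bufsize_kbps=8000):
    """Build an ffmpeg argument list: CRF encode with a VBV cap.

    All default values here are illustrative assumptions, not recommendations.
    """
    return [
        "ffmpeg", "-y",                    # overwrite the output without asking
        "-i", src,
        "-c:v", "libx264",
        "-crf", str(crf),                  # constant-quality target
        "-maxrate", f"{maxrate_kbps}k",    # VBV maximum rate
        "-bufsize", f"{bufsize_kbps}k",    # VBV buffer size
        dst,
    ]


def encode(src, dst, **kwargs):
    """Run the encode; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(build_x264_command(src, dst, **kwargs), check=True)
```

Calling encode("in.mp4", "out.mp4", crf=20) would shell out to ffmpeg once. Running that on a fleet of rented instances is the "cloud" part.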
6. Per-Title Encoding / Content-Adaptive Encoding
• Simple explanation: It’s rate control. Usually built on top of libraries with really, really bad rate control.
• Possible meanings, depending on who is using the word:
• Worst case (yes, this is a thing people do): Encode in a loop until some metric versus the
reference is met (VMAF, PSNR, whatever).
• CRF + VBV + Some concept of tracking chunks. People did this long before this was a buzzword.
• Pick parameters to encode en masse and pick the one with the best resulting metric.
• Apply machine learning (NNs) or Viterbi to the above.
• For some reason, the concept that different media can be coded more or less efficiently is
groundbreaking and interesting. A few approaches to dealing with bad rate control are, but the
concept itself is not.
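To make the "encode in a loop until some metric is met" case concrete, here is a minimal sketch. It assumes a hypothetical callable score_at_crf(crf) that encodes at the given CRF and returns a quality score against the reference (VMAF, PSNR, whatever), and it assumes that score never increases as CRF goes up, which is what makes a binary search valid. Neither the name nor the CRF range comes from any real product.

```python
def find_highest_crf(score_at_crf, target, lo=10, hi=40):
    """Return the highest CRF in [lo, hi] whose score still meets target.

    score_at_crf is a hypothetical callable: encode at that CRF, measure
    against the reference, return the score. Returns None if even the
    lowest CRF in range misses the target.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if score_at_crf(mid) >= target:
            best = mid       # good enough; try a higher (cheaper) CRF
            lo = mid + 1
        else:
            hi = mid - 1     # not good enough; spend more bits
    return best


def toy_score(crf):
    """Stand-in for a real encode + metric run; purely illustrative."""
    return 100 - 1.5 * (crf - 10)


print(find_highest_crf(toy_score, target=93))  # prints 14
```

Every call to score_at_crf is a full encode plus a full metric pass, which is why this is the worst case: it is rate control done from the outside, by brute force.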
7. Shot-based Encoding / Per-scene Adaptation
• Simple explanation: Scene changes are a thing! Different scenes can be coded more or less efficiently!
• Possible meanings, depending on who uses the word:
• “We figured out that only allowing the encoder to put intra frames at exactly N second boundaries
was bad for coding efficiency.”
• Apply all from the previous slide, but per scene. Maybe group similar shots. Maybe use Viterbi or NNs.
• If doing chunked encoding, split at scene changes instead of N seconds.
• GOP-size limited lowest cost placements for intra frames (aka max kf distance + Dijkstra on costs)
• Neither new nor novel. Some implementations can be interesting in the context of global rate control.
• Many people seem to not understand that what a human defines as a shot / scene change, while similar,
is not the same as where it is most efficient, cost-wise, to code an intra frame.
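The "max kf distance + Dijkstra on costs" idea can be sketched as a shortest-path problem. This is a toy model under stated assumptions, not any encoder's actual implementation: each frame has a hypothetical non-negative cost when coded as intra and another when coded as inter, frame 0 must be intra, and no two consecutive keyframes (or the last keyframe and the end) may be more than max_dist frames apart.

```python
import heapq
from itertools import accumulate


def place_keyframes(intra_cost, inter_cost, max_dist):
    """Return (keyframe_indices, total_cost) minimising the summed cost.

    Nodes are candidate keyframe positions 0..n, where n is an end marker.
    Edge i -> j means: frame i is intra, frames i+1..j-1 are inter.
    Costs must be non-negative for Dijkstra to be valid.
    """
    n = len(intra_cost)
    prefix = [0] + list(accumulate(inter_cost))  # prefix[k] = sum(inter_cost[:k])
    dist = {0: 0}
    prev = {}
    heap = [(0, 0)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist.get(i, float("inf")):
            continue                 # stale heap entry
        if i == n:
            break                    # reached the end marker
        for j in range(i + 1, min(i + max_dist, n) + 1):
            cost = d + intra_cost[i] + prefix[j] - prefix[i + 1]
            if cost < dist.get(j, float("inf")):
                dist[j] = cost
                prev[j] = i
                heapq.heappush(heap, (cost, j))
    keyframes = []
    node = n
    while node != 0:                 # walk back from the end marker
        node = prev[node]
        keyframes.append(node)
    return keyframes[::-1], dist[n]


# Frame 3 is expensive as inter and cheap as intra: a scene change.
intra = [10, 10, 10, 2, 10, 10]
inter = [1, 1, 1, 9, 1, 1]
print(place_keyframes(intra, inter, max_dist=4))  # prints ([0, 3], 16)
```

The placement falls out of the costs alone. Nothing in the search knows what a "shot" is, which is the distinction the last bullet is making.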
8. Blockchain
• Simple explanation: A solution in search of a problem. Nobody has ever created a legitimately
useful multimedia product with this.
• There’s nothing useful to verify / provide proof for.
• There’s no implementation of anything multimedia related using blockchain that is actually better than
the non-blockchain implementation.
• Prove me wrong.
• Otherwise: Don’t talk to me or my son ever again.
9. AI AI AI AI AI AI AI AI AI AI AI AI AI AI OMGAI AI AI AI AI AI AI AI AI A
• Simple explanation: Depends too much on who uses it. Ranges anywhere from “literally nothing”
to “Robin Williams”.
• Possible meanings:
• We trained a thing. We swear we didn’t imprint our biases or use bad metrics as a base.
• Literally just applied statistics, no machine intelligence or learning at all.
• We used VMAF as a metric. No, we didn’t train it for our use case. We used the existing Netflix stuff.
• Generic programming. Probably Viterbi.
• A marketing person was accidentally given a keyboard or pen.
10. Bonus
• CMAF:
• A 101-page specification that is mostly defining a subset of other specifications. Nobody knows
why it is so massive. People only care because Apple does.
11. Further Reading
• You, too, can be a jaded Debbie Downer!
• The classic x264 “testing encoders” blog post:
https://web.archive.org/web/20140822041755/http://x264dev.multimedia.cx/archives/472
• My own rambling from some time in… 2013? 2014?:
https://gist.github.com/dwbuiten/d324e7c58cd36696eca11b70aaf4ba22