- OBE is a specialist in software-based media encoders and decoders that has developed a native, high-performance multivendor IP and cloud software stack using agile software practices.
- Moving production workflows to software, containers, and the cloud provides benefits like efficient scaling and reduced costs but also challenges around integration, timing, and multi-vendor interoperability that standards groups are working to address.
- While some proprietary solutions currently offer elements of cloud production, widespread adoption requires open standards for ground-cloud-cloud-ground workflows and transport to allow for multi-vendor innovation.
1. Moving to software-based production workflows and
containerisation of media applications
Kieran Kunhya – kierank@obe.tv
2. Company Overview
• Specialists in software-based
encoders and decoders for
Sport, News and Channel
contribution (B2B)
• Based in Central London
• Build everything in house
• Hardware, firmware, software
• Not to be confused with:
3. • Native, high-performance
multivendor ST 2022-6 / ST 2110
software stack, with NMOS
• SDI products developed around
21st-century software practices
• First major 5G Live Production at
Queen’s Funeral (main path)
• etc.
• Developed with much smaller
budget and headcount
My technical background
4. • “[Provocative] Discussion Topics”
• Software in an SDI world
• Software in a bare-metal IP world
• Software in a containerised IP world
• Software for cloud production
• Work of the Ground-Cloud-Cloud-Ground (GCCG) group in Video
Services Forum
• “Software Is Eating the World” – Marc Andreessen
Topics
5. • Broadcast is next whether you like it or not
• Highly regulated industries, Healthcare, Finance,
National Security already moving
• Sysadmins, database admins etc. all thought
they were immune
• “But Cloud is not Green”
• Is your 1960s building with ten-year-old
equipment really more efficient than purpose-built,
football-field-sized datacentres built next to
hydroelectric dams?
• Cellular bonding / SRT Gateway / some web
streams is neither an IP nor a cloud strategy
Cloud is eating the world
6. • Host your equipment in a datacentre out of town
• Inefficient use of the most expensive real estate on Earth
• This real estate should be used to make television (edits, studios etc)
• Backhaul studios to the datacentre and do heavy compute there
• Metro Ethernet easily available + Remote Hands
• Datacentres have VMs and other Cloud infrastructure
“But Cloud Latency is Too High”
7. • Huge education challenges around IP and Cloud
• Lack of fault finding skills, often non-existent
• IT department too slow, broadcast support lacks
knowledge
• Ask a vendor! - “It’s not working”
• Basic (non-product) IT questions are unacceptable
• DHCP vs Static, Unicast vs Multicast, NAT (seriously)
• “But you [vendors] need to support us in our Digital
Transformation” – erm…
• Train staff, or use service providers
IP / Cloud Education
10. • Josh Steinhour of Devoncroft
• “Low interest-rate phenomenon” – low interest
rates meant uneconomical business models
pursued
• HDR - Never has so much technical complexity
been noticed by so few!
• “We cannot move to cloud because we need
live UHD or HDR”
• Focus technology transformation on cloud to
avoid existential crisis!
• Spin up channels in cloud in minutes, not months
Low interest-rate phenomena and cloud
11. • The Quiet Revolution!
• Large amount of software-based
equipment with SDI ports
• Wide ecosystem of SDI boards available
• Often only as appliance, limited use of
software benefits (visibility, processing)
• New build SDI facilities have large
amounts of COTS hardware
• Very usable in the field, “Edge
Compute” hardware widely available
Software in the SDI world
Credit: @NewsTech
13. • A very complex technical problem
• Timing and packet rates make this
astonishingly difficult
• Road car vs F1 car
• Small ecosystem of COTS solutions and
vendor-specific solutions
• Not really plug-and-play
• Added complexity from NMOS, layering
makes this complicated
• Lots of combinations of audio, not all
supported by everything
Software in a 2110 world
+ In house
14. • Let’s make a complex technical problem much more complex 😭
• IT migration to VMs in 2000s, containers 2010s
Software in a VM/container world
• 2110 much more complex,
mainly because of unicast (IT)
vs multicast (broadcast)
• Lots of issues around this but
most important one: multicast
replication problem
Credit: Netapp
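The unicast/multicast mismatch starts at the receiver: a broadcast-style receiver never connects to a sender, it simply joins a multicast group, and the network (or the virtualisation layer standing in for it) must then replicate the flow to every VM or container that joined. A minimal receiver-side sketch in Python; the group address and port are illustrative, not taken from any standard:

```python
import socket
import struct

MCAST_GRP = "239.1.1.1"   # illustrative multicast group address
MCAST_PORT = 5004          # illustrative port

# A multicast receiver binds locally and issues an IGMP join; it is then
# the network's job to deliver a replicated copy of the flow.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", MCAST_PORT))

# IP_ADD_MEMBERSHIP takes the group plus the local interface (INADDR_ANY here).
mreq = struct.pack("4s4s", socket.inet_aton(MCAST_GRP), socket.inet_aton("0.0.0.0"))
try:
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    joined = True
except OSError:
    # Hosts without multicast-capable networking may refuse the join.
    joined = False
```

In a virtualised deployment, something (the switch per VLAN, the NIC eSwitch, or a vSwitch, as in the following slides) has to honour that join for every guest.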
18. VLAN routing for packet replication
• Put each VM/Container on a
different VLAN and route
between them on trunk port
• Switch performs packet
replication
• Cons:
• Deployers of VMs need to track
available VLANs
• Some bandwidth wastage, except
when all VMs are on different
multicast groups
• Hard to retrofit on an existing
network; needs planning
19. eSwitch for packet replication
• Most modern Network Cards
(NICs) have an Embedded
Switch (eSwitch)
• Send one flow to the Network
Card and the card replicates it
• Cons:
• Complex to program eSwitch
• NIC Vendor specific
• VM hypervisor or container
provisioning needs to configure
eSwitch
20. vSwitch for packet replication
• Classic IT approach where
packets are replicated by
Virtual Switch (vSwitch)
• Cons:
• Software-based approach not
suited to high packet rates
• High CPU usage, low throughput
• Can cause packet reordering
• Not suitable for broadcast
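To put a number on "high packet rates": a single uncompressed HD flow already runs at hundreds of thousands of packets per second, which a general-purpose vSwitch struggles to forward, let alone replicate. A back-of-envelope sketch; the payload size is an illustrative assumption, not a figure from ST 2110:

```python
# Rough packet-rate estimate for one uncompressed HD video flow.
width, height, fps = 1920, 1080, 50      # 1080p50
bits_per_pixel = 20                       # 10-bit 4:2:2 sampling
payload_bits = 1200 * 8                   # assume ~1200 B of video payload per RTP packet

video_bps = width * height * fps * bits_per_pixel
packets_per_second = video_bps // payload_bits

print(f"{video_bps / 1e9:.2f} Gb/s -> ~{packets_per_second:,} packets/s")
```

Per-packet CPU cost in a software switch at that rate leaves almost no cycle budget, which is why the eSwitch and VLAN approaches push replication into hardware.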
22. Software for cloud production
• I want to do all of this in the cloud!
• The Ground-Cloud-Cloud-Ground working
group of the VSF is trying to solve these
problems
• Rest of Presentation adapted from IP
Showcase presentation at NAB/IBC
• Cloud economics (scale-up/scale down)
seems a great alternative to paying for
resources that stay idle most of the time –
what is stopping us?
23. Moving Cloud production to the next level
• “But I’ve been doing live cloud production” – Yes and No
• Single-vendor monolithic applications such as channel-in-a-box, playout
servers and cloud switchers use the cloud as a home, but not necessarily as
a scalable architecture
• Proprietary Transports stifle innovation (IE6, Flash, Silverlight)
• To get widespread adoption we must have:
• Multi-vendor interoperation via standard APIs
• Appropriate-to-task picture quality levels
• Standards for Ground-Cloud-Cloud-Ground
• Agreed mechanism(s) for building workflows
24. Cloud production – What makes it difficult?
• Integration with the ground – both ways
• Must work into existing workflows
• SDI, ST 2110, satellite, cable, DTT
• Legacy workflows have well-defined linear timing
models (e.g. SDI, ST 2110-21, MPEG-TS VBV)
• Without a proper timing model, you end up with
variable (undefined) latency
• One reason web streams are 20-30 seconds
behind broadcast – they don’t have a timing model!
• What are my neighbours cheering about?
• Inter-cutting ground and cloud requires timing
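A toy illustration of what a linear timing model buys you: once every frame has a scheduled output time against a shared reference clock, "late" becomes a defined, measurable quantity instead of undefined latency. The frame rate, epoch and arrival time below are invented for the sketch:

```python
FPS = 50.0
EPOCH = 0.0   # alignment point, e.g. derived from a common reference clock

def scheduled_time(frame_index: int) -> float:
    """Output deadline of a frame under a linear timing model."""
    return EPOCH + frame_index / FPS

# Frame 5 should leave at t = 100 ms; if it arrives at t = 105 ms,
# its lateness is a defined quantity: 5 ms.
arrival_s = 0.105
lateness_s = arrival_s - scheduled_time(5)
```

Inter-cutting ground and cloud sources needs exactly this kind of shared deadline so both sides agree on when a frame is on time.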
25. I’ll just do 2110 in the cloud
• Some people claim to have 2110 in the public cloud
• But it’s not possible right now in any public cloud:
• No (full) PTP in the cloud – all clouds handle time their own way
• Cloud networks are shared and have packet loss
• No real access to network card capabilities (e.g. packet pacing) for 2110
• Is this even a good idea?
• We don’t actually want time-linear processing in cloud any more
• Allow cloud instances to process data non-linearly, sometimes faster or slower
than real-time but on average real-time – known worst case
• How to handle “synthetic” sources (e.g. clips, graphics) played out from the cloud?
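One way to see why NIC packet pacing matters: spreading an HD flow's packets evenly across a frame period leaves only a few microseconds between packets, well below what software timers on shared cloud hosts can hold. A quick calculation; the packets-per-frame figure is an illustrative assumption:

```python
FPS = 50
PACKETS_PER_FRAME = 4320          # illustrative figure for one HD flow

frame_period_s = 1 / FPS          # 20 ms per frame at 50 fps
gap_us = frame_period_s / PACKETS_PER_FRAME * 1e6

# An inter-packet gap of a few microseconds has to be enforced in
# hardware; OS timers on shared hosts jitter by far more than this.
print(f"inter-packet gap = {gap_us:.2f} us")
```

This is one concrete reason the slide argues against forcing time-linear 2110-style pacing into the cloud at all.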
26. Cloud-vendor specific transport
• To get the benefits of cloud, we also must trust the cloud
• Depend on cloud provider bulk-transport protocols
• Throughput with Reliability - all my data arrives correctly
• Latency - all my data will arrive on time
• The Big Data community has similar needs for large data transfers
• Application may not have visibility of the internals of protocol (“black box”)
• Amazon Scalable Reliable Datagram (SRD) such an example
• Used in Amazon CDI (Cloud Digital Interface)
27. What is the “spec”
of latency and variability?
• Push model (not backpressured)
• Based on (uniform?) content “chunks”
• might be frames, fields, stripes
• might be blocks of audio samples
• LMIN = the soonest/shortest amount of
time from input to output
• Input time = buffer 100% arrived to me
• Output time = buffer left me 100%
• Includes the egress transit time
• L99% = the amount of variability beyond
the LMIN for 99th percentile case
[Diagram: chained work steps A, B, …, X on a timeline; each step contributes its own LMIN and L99%, and variability on the input accumulates into the output]
Work of the VSF GCCG AHG
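The LMIN / L99% characterisation above can be sketched numerically: measure per-chunk latencies for each work step, take the floor as LMIN and the 99th-percentile excess beyond it as L99%, then observe that chaining steps adds the floors while the variability only grows. The latency distributions below are invented for illustration:

```python
import random

random.seed(1)

def characterise(samples):
    """Return (LMIN, L99%): the latency floor and the 99th-percentile
    variability beyond that floor."""
    s = sorted(samples)
    lmin = s[0]
    l99 = s[int(len(s) * 0.99) - 1] - lmin
    return lmin, l99

# Two hypothetical work steps: 5 ms and 8 ms floors plus random jitter.
step_a = [5.0 + random.expovariate(1 / 0.5) for _ in range(10_000)]
step_b = [8.0 + random.expovariate(1 / 1.0) for _ in range(10_000)]
chained = [a + b for a, b in zip(step_a, step_b)]

a_min, a_var = characterise(step_a)
b_min, b_var = characterise(step_b)
c_min, c_var = characterise(chained)

# The chained floor approaches a_min + b_min, while the chain's L99% is
# larger than either step's own: input jitter accumulates at the output.
```

This is the push-model consequence the diagram shows: without backpressure, each step's variability rides on top of the variability it inherits.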
28. Conclusion
• Cloud is going to happen, train your staff, it’s the biggest technology
transformation for decades (forget about UHD and HDR, these are never going
to be mainstream)
• Slow but steady progress of software into traditional broadcast processes
• As opposed to usual domain of web
• Hardware chip shortage has accelerated this
• Software in VM/Containers is hard
• Cloud is even harder - (Semi-)Proprietary solutions work in the short-term but are
painful in the long-term. Remember IE6/Flash/Silverlight! – Need to do this in a
multivendor fashion. GCCG is close to completion – please participate.
• Thanks to Richard Hastie, Thomas Kernen (NVIDIA) and Gerard Philips (Arista)
for discussions. Matt Dicken for diagrams.
Editor's Notes
Sports delivery using cloud such as Premier League, NFL etc. Also work with many competitors to linear broadcasting such as DAZN, Amazon Prime etc
Worked on many of the early 2022-6 and 2110 sites.
Clean NMOS implementation.
Remote production by our IMG / Amazon Prime colleagues in a 2110 native gallery.
SDI is a “software-defined interface”.
Two days notice at IBC: 5G Path had a lot of software: encode, transport, software defined radio, decode. Only hardware for physical layer. Went to air to tens of millions.
Done software for high end transmissions for real, 24/7/365.
Some provocative topics. This is not a normal NTS and I want to stimulate discussion.
Software and SDI has been the quiet revolution
This poster works on Multiple Levels.
Netflix is the main competition for audiences, and gives presentations about how its entire business runs on cloud, including production of content.
Easy scapegoat, energy efficiency data is competitive advantage
There’s a reason stock exchanges are based in out of town datacentres with cheap land and electricity.
Going on tours of facilities, I have never seen people so proud to show me how inefficiently they are using some of the most expensive real estate on earth
Could easily afford to put these in major cities
Many broadcasters are still baseband-centric. Tendency to treat IP faults like baseband faults.
You will see vendors take a hard-line on this in challenging economic times. There is no cloud where engineers are pressing buttons.
It’s not our job to explain basic IT concepts; if you think that’s part of Digital Transformation, please hire a management consultancy like Deloitte to do that for you.
So the EBU has a way of judging vendors; maybe vendors should judge broadcasters
Two can play at this game
Josh looks at the world from a Warren Buffet style view of the world, numbers over emotion
I can get groceries delivered to my house in ten minutes and sold at a loss. This company loses hundreds of millions every year.
Josh from Devoncroft uses the term jealously. I’ll pick one of these low-interest rate phenomena – UHD/HDR
But for public broadcasters, average age of viewer in 50s!
UHD/HDR makes sense if you are Netflix or BT Sport, you can sell a new tier of subscription to the enthusiasts.
HDR is like the NFT of our industry, passionate geeks waxing lyrical about jargon and the rest of the world scratching their head
Dave Travis at Devoncroft explained how they spin up channels in the cloud in minutes. High profile revenue generating channels.
Don’t like any of the SDI boards, build your own
I can understand both approaches, locked down IT appliance vs pure software.
The first of around ten street cabinets we put IT hardware in, back in 2016. Specialist rugged IT suppliers for the military (not true COTS). Very unclear whether this would survive
Able to add software features in the field, do IT style updates with Ansible.
Since then put IT hardware in a racing car. Very strange to be sshing into a racing car going at 100mph.
This picture is neat, the flypack from 2022 has around 8 times the processing power of the street cabinets. Delivering this sports match via cloud.
In around ten years this flypack will be able to replace this truck. Moore’s law unrelenting. Latency sensitive compute done locally, rest done in cloud.
Many people in the room have spent significant portions of their lives on this.
I’ve written an IBC paper about this. Most industry work is based on the Stuart Grace paper of 2016.
Apologies if I’ve missed one out. Don’t want to tempt fate but light at the end of the tunnel for NMOS issues.
Containers sit at a lower level compared to VMs
How to manage high performance applications in containers that have realtime deadlines to it.
Multicast involves
These are my personal views, not the VSF working group’s views.
Proprietary transport such as NDI is bad; it’s simple in the short-term, like Internet Explorer 6, Silverlight, Flash etc. I still go to broadcasters that need IE6, running WinXP in a VM.
Doesn’t need to always be 10-bit 4:2:2
See a goal again, or see a racing car overtake twice
Square peg in a round hole
For example, a protocol like TCP does not meet these requirements: if there’s any congestion, TCP latency goes up drastically (“buffering”). It might work some of the time, but fate has a habit of failing at the worst time, unless you throw large delays at it.