"Have you ever wondered how broadcast TV signals get from the sports venues to over a hundred TV transmitters scattered around the United Kingdom? Over 95% of all video signals destined for UK broadcast traverse dedicated networks managed by British Telecom (BT) Networks.
BT’s next generation network infrastructure is a system called Vena. To build Vena, BT embraced Software Defined Wide Area Networks (SD-WAN) as a paradigm to allow users to manage their own services and to quickly establish new video feeds within a few mouse clicks. BT automated the task of alarm monitoring and fault reporting within their network’s operational International Media Centre.
To make Vena tick, BT needs to monitor thousands of complex devices, which could be deployed at race courses, rugby grounds and football stadiums across the country, as well as telemetry data from TV transmitter sites, which could be situated on far-flung Scottish islands. To handle this, BT turned to Confluent Platform as the backbone of its event handling and alarm correlation infrastructure.
In this talk we will explain the scale of the challenge we faced and how we leveraged Confluent Platform to address it, and give a live demonstration of booking a video feed from Birmingham to the Kafka Summit."
3. What is Vena?
If you are watching ITV or C4 (and, soon, any terrestrial TV channel),
you are seeing Vena at work
• Vena is a service delivery platform for linear broadcast quality video distribution
around the UK
• Vena is a fully automated platform both in terms of commissioning equipment and in
allowing customers to order services
• Vena was created to provide best-in-class performance for media traffic, with:
▪ low latency
▪ low jitter
▪ high bandwidth
▪ multicast point-to-multipoint
• Customer self-serve for managing and tracking services
4. Key Vena Features
What the Vena Tribe is most proud of
(in order of length of text, not preference)
01 Path Computation Engine (PCE)
• Finds diverse routes for resilient services
• Ingests known Shared Risk Groups from Reveal
• Services created from customer input port to multiple output ports
02 Service Assurance Integration
• Circuit, Interface & Media alarms correlated for root cause analysis (RCA)
• Automatic incident ticket generation
03 Customer Portal
• Service Booking
• Service Status
• Incident History
• Route Maps
• Service Dashboard
04 Automatic Configs
• Orchestrator manages:
▪ PE (Juniper)
▪ CE (Cisco)
▪ Media Processing (Appear)
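The slides do not say how the PCE computes diverse routes, so the following is only a minimal sketch of the idea, under stated assumptions. It finds a fewest-hop primary route, then searches again with every shared risk group used by the primary banned. The topology, the `duct-N` risk group names and the functions `shortest_path` and `diverse_pair` are all hypothetical. This greedy two-pass search is simpler than a true disjoint-path algorithm and can miss a valid pair that such an algorithm would find.

```python
from collections import deque

# Hypothetical topology: each link is (node_a, node_b, shared_risk_group).
LINKS = [
    ("A", "B", "duct-1"),
    ("B", "Z", "duct-2"),
    ("A", "C", "duct-3"),
    ("C", "Z", "duct-4"),
    ("A", "D", "duct-1"),  # shares a duct with A-B
    ("D", "Z", "duct-5"),
]


def shortest_path(links, src, dst, banned_srgs=frozenset()):
    """Fewest-hop path as a list of links, skipping banned risk groups."""
    adjacency = {}
    for a, b, srg in links:
        if srg in banned_srgs:
            continue
        adjacency.setdefault(a, []).append((b, (a, b, srg)))
        adjacency.setdefault(b, []).append((a, (a, b, srg)))

    came_from = {src: None}  # node -> (previous node, link used)
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for neighbour, link in adjacency.get(node, []):
            if neighbour not in came_from:
                came_from[neighbour] = (node, link)
                queue.append(neighbour)

    if dst not in came_from:
        return None

    # Walk back from the destination to rebuild the path.
    path = []
    node = dst
    while came_from[node] is not None:
        node, link = came_from[node]
        path.append(link)
    return list(reversed(path))


def diverse_pair(links, src, dst):
    """Return (primary, secondary) routes sharing no risk group, or None."""
    primary = shortest_path(links, src, dst)
    if primary is None:
        return None
    used = frozenset(srg for _, _, srg in primary)
    secondary = shortest_path(links, src, dst, banned_srgs=used)
    if secondary is None:
        return None
    return primary, secondary
```

Calling `diverse_pair(LINKS, "A", "Z")` on the sample topology returns two routes whose links have no risk group in common. The route via D is rejected as a secondary because it shares `duct-1` with the primary.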
Vena - an Overview |
5. Opportunity – Renewal of obsolescent infrastructure and management systems
BT Group | Public 5
Leverage this transformation to fulfil BT M&B key business objectives in terms of competitiveness,
operational effectiveness and data integrity.
Area: Increased competitiveness and Customer Experience
• Opportunity
▪ Decreased time to market by automated service life cycle management
▪ Deploy a platform capable of supporting the creation of innovative and competitive service
bundles for broadcasters
▪ Provide customers with simplified self-serve service ordering
• IT System requirements
▪ Modular functional architecture for vertical and horizontal scaling
▪ Adoption of microservices
▪ Model/intent-driven network services
• KPIs
▪ Service delivery times
▪ Market share & revenue
▪ Number of new services launched
▪ Number of new service bundles launched
Area: Operational Effectiveness and Data Integrity
• Opportunity
▪ Decrease operational cost by minimizing human intervention from service fulfilment to assurance
▪ Data model structure to ensure real-time resource status
▪ Guarantee end-to-end view of services
• IT System requirements
▪ Closed loop automation
▪ E2E topology view
▪ Central dynamic inventory
▪ Service, resources, live data correlation for service management decisions
• KPIs
▪ Number of repair calls completed
▪ Cost reduction related to inventory changes
6. Why Confluent? Why Kafka?
Why Confluent?
• High resilience: Confluent Platform clusters deployed to tried-and-tested best-practice standards, with self-balancing, allow BT to reduce operational effort and error.
• Automated deployment: Deployment scripts using Terraform are used to spin up clusters and do zero-downtime upgrades.
• Data replication: Replicator and Cluster Linking are used for near-real-time geo-replication.
• Confluent Control Center: Leveraged for detailed monitoring of Confluent Platform and the data within topics.
• Security: Confluent offers data encryption, bring your own key, audit logs, and fine-grained access control.
• Support and maintenance expertise: Best-in-class support offered by Confluent enables the high uptime requirements of critical national infrastructure.
Why Kafka?
• De facto: Kafka was already being used across BT. Vena leveraged existing knowledge to implement best practices in design and implementation. There is also a sizable pool of engineers skilled in Kafka in the industry.
• Community: There’s a large, active, and global user community with many conferences and events like Kafka Summit, where people share ideas, experiences, and best practices.
• The connector ecosystem: The Kafka Connect framework allows customers to deploy ready-to-use “connectors” to consume and produce streams from and to databases and other applications.
• Stream processing: Kafka Streams simplifies application development by abstracting a lot of the complexity involved in working with high-throughput event streams.
• Throughput, latency, and scale: This is required to deal with alarm processing and alarm storms in the network. Kafka operates at network speeds and can cope with millions of events. This becomes crucial when dealing with alarms, where every second spent processing counts and where alarm storms can easily cause a massive flood of events.
• Real-time message manipulation and routing: Kafka Connect and Kafka Streams provide the capability to route, transform, and enrich network events as they happen.
7. Network On-boarding
The Vena network consists of devices and connections that link two sites together. For each channel,
two separate circuits must be configured that follow different geographical routes, so that a physical
failure on one part of the line, or in one part of the country, does not take the channel down. Once
provisioned, the signal is sent along both routes and combined at the end site to provide resilience
against packet loss or corruption.
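The slide does not describe how the two copies of the signal are combined. As a minimal sketch, assume each packet carries a sequence number and a corrupted packet is discarded before merging (both are assumptions). The end site can then keep one copy per sequence number from whichever route delivered it. The function name `merge_streams` and the packet layout are hypothetical.

```python
def merge_streams(route_a, route_b):
    """Combine packets from two routes, keeping one copy per sequence number.

    Each packet is a (sequence_number, payload) tuple. A packet lost on one
    route is filled in from the other route if it arrived there.
    """
    merged = {}
    for seq, payload in list(route_a) + list(route_b):
        merged.setdefault(seq, payload)  # first copy seen wins
    return [merged[seq] for seq in sorted(merged)]


# Route A lost packet 2 and route B lost packet 3.
route_a = [(1, "p1"), (3, "p3"), (4, "p4")]
route_b = [(1, "p1"), (2, "p2"), (4, "p4")]
recovered = merge_streams(route_a, route_b)
```

In this example each route is missing a different packet, and `recovered` contains all four payloads in sequence order. A packet lost on both routes would still be missing from the output.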
These connections are provided by various connectivity providers. However, there is often a
significant delay, typically weeks and sometimes months, before both ends of a connection are
established. This presents a challenge: engineers must remember to initiate the link manually
during the onboarding process every time a supplier establishes a link.
8. Reducing Fault Detection time
Deliver refined data to operational colleagues to expedite fault identification by eliminating
unnecessary distractions and emphasizing crucial information.
• Decrease operational cost by minimizing human intervention from service fulfilment to assurance
• Data model structure to ensure real time resource status
• Guarantee end to end view of services
9. Architecture
[Architecture diagram; only the labels below are recoverable]
Link discovery (LLDP)
• "Hi I can see Device 2 connected via Link"
• "Hi I can see Device 1 connected via Link"
• Both device LLDP messages published
• Compare LLDP messages
• If match
Alarm correlation (SNMP)
• "Ohh!! Link has gone down"
• Decoded SNMP event
• Get hostname, link ID
• Correlate the event from device 1
• Get Services on Link with Loss priority
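The two flows on this slide can be sketched in plain Python, without any Kafka client, to show the logic only. A link is treated as confirmed once both devices have reported each other over it, and a decoded link-down event is correlated with the services carried on that link. All field names (`local`, `remote`, `link_id`, `loss_priority`) and both functions are hypothetical. Reading "with Loss priority" as "ordered by loss priority" is an assumption.

```python
def confirmed_links(lldp_messages):
    """Return link IDs where both ends reported each other as neighbours."""
    seen = {}  # (local device, remote device) -> link ID
    confirmed = set()
    for msg in lldp_messages:
        seen[(msg["local"], msg["remote"])] = msg["link_id"]
        # Compare with the message published by the other end, if any.
        if seen.get((msg["remote"], msg["local"])) == msg["link_id"]:
            confirmed.add(msg["link_id"])
    return confirmed


def affected_services(event, services_by_link):
    """List services on the failed link, lowest loss_priority value first."""
    services = services_by_link.get(event["link_id"], [])
    return sorted(services, key=lambda s: s["loss_priority"])


lldp = [
    {"local": "device1", "remote": "device2", "link_id": "link-7"},
    {"local": "device2", "remote": "device1", "link_id": "link-7"},
    {"local": "device3", "remote": "device4", "link_id": "link-9"},  # one end only
]
services_by_link = {
    "link-7": [
        {"name": "feed-b", "loss_priority": 2},
        {"name": "feed-a", "loss_priority": 1},
    ],
}
link_down = {"hostname": "device1", "link_id": "link-7", "status": "down"}

up = confirmed_links(lldp)
impacted = affected_services(link_down, services_by_link)
```

Here only `link-7` is confirmed, because `link-9` has been reported from one end only. In a streaming deployment the same comparison would run over messages consumed from topics rather than over an in-memory list.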