1. Visual Search at RAI: Requirements,
Architectures and Use Cases to Visually
Search through Broadcast Programmes
Speaker:
Federico Maria Pandolfi
Rai Teche, CRITS
2. The Scenario
▪ Huge amounts of media files demand proper management and efficient retrieval methodologies
▪ Typical MAM systems: text-based queries, search over textual information and metadata
▪ Pros: reliability, robustness
▪ Cons: metadata extraction is expensive, time consuming and may not be available for every entry
▪ No semantic or analytical representation of contents
▪ No query-by-example or near-duplicate detection
▪ These issues are particularly relevant for the raw video material of the newsrooms (main case study)
▪ CBIR and image search technologies are becoming feasible, mature solutions
3. Ideal production workflow and timeline
[Workflow diagram] Acquisition on field → enters the newsroom → "fresh" footage capture → newsroom storage → becomes "historic" → Media Manager review → discard/delete, or store on T3 and become searchable in the CMM
4. Case study: numbers
Rai's digital archives include (as of end 2015):
▪ 1,540,032 hr of video material
▪ 18,720 photos of scenic costumes
▪ 1,700 photos of set furniture
▪ 1,552 photos of Centro Elettronico Rai
The video material grows at a rate of approx. 130,000 hr/year (new + digitized material). Only about 46% is annotated.
Rai's (single) newsroom stores (approx.):
▪ 2,000 hr of "fresh" raw footage
▪ 10,000 hr of "historic" raw footage
Since each news item is about 3'-5' long, this translates into:
▪ 24,000 – 40,000 news items from "fresh" footage
▪ 120,000 – 200,000 news items from "historic" footage
Only aired material is annotated (no raw)
5. Our vision
▪ Issue: metadata-free raw footage, no metadata attached by journalists (only aired material is documented)
▪ Issue: huge amount of material discarded by the newsrooms (a substantial loss for the company)
▪ Issue: lack of powerful tools for in-depth search over the vast archive of footage
▪ Solution: link raw footage to the related annotated material using a state-of-the-art visual search engine: automatic metadata linking, visual search technologies, browse by indexed references
6. What is ViSer
▪ Modular framework
▪ Few key modules + Workflow Manager
[Architecture diagram: Newsroom, Search engine, Video Info DB, Browser extension, CMM, Workflow Manager]
7. ViSer technologies
Search Engine: Visual Atoms
▪ Ready-to-go, engineered solution (production-ready)
▪ Based on local descriptors
▪ No licensing issues with MPEG
▪ Video-to-video search capability
▪ Easy integration and customization
(parameters, DB)
▪ Full support
Workflow Manager: Apache Airflow
▪ Author, schedule and monitor workflows as
DAGs of tasks using Python code
▪ Custom + out-of-the-box operators
▪ Complete logging utility
▪ Rich user interface (to access and monitor
DAGs, variables, connections, …)
▪ Used to orchestrate all the steps of the binding
process for each raw footage in each
newsroom
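The "DAG of tasks" idea above can be sketched as a minimal Airflow DAG file. This is a configuration sketch only: the task names, shell commands and trigger configuration are illustrative assumptions, not ViSer's actual production workflow.

```python
# Hypothetical sketch of a per-footage binding DAG in Apache Airflow.
# Task ids and bash commands are invented for illustration.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="viser_binding",
    start_date=datetime(2016, 1, 1),
    schedule_interval=None,  # triggered once per raw footage item
    catchup=False,
) as dag:
    extract = BashOperator(
        task_id="extract_descriptors",
        bash_command="extract_descriptors.sh {{ dag_run.conf['footage'] }}",
    )
    search = BashOperator(
        task_id="search_aired_window",
        bash_command="search_window.sh {{ dag_run.conf['footage'] }}",
    )
    link = BashOperator(
        task_id="write_links_to_video_info",
        bash_command="store_links.sh {{ dag_run.conf['footage'] }}",
    )
    # The binding chain: extract descriptors, search the aired set,
    # then persist the raw-to-aired links.
    extract >> search >> link
```

Each step then shows up in Airflow's UI and logs per run, which is what makes the per-footage monitoring described above possible.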
8. Image search intro
▪ SIFT (Scale-Invariant Feature Transform): one of the most widely used and robust feature matching algorithms
▪ Key-points detected using DoG (Difference of Gaussians)
▪ Descriptors extracted from gradient orientations in the area around each key-point
▪ Matching & score based on the similarity between multiple descriptors (+ geometric consistency, …)
▪ Best use cases: identical images, rigid objects, logos, …
▪ Not so good for face recognition (high variability of features)
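The descriptor-matching step alluded to above can be sketched with Lowe's classic ratio test: a query descriptor is kept only when its nearest neighbour is clearly closer than the second nearest. The 128-dimensional toy descriptors below stand in for real SIFT output; the function name is ours, not from any library.

```python
import numpy as np

def ratio_test_matches(query, reference, ratio=0.75):
    """Match each query descriptor to its nearest reference descriptor,
    keeping the pair only when the nearest neighbour is clearly better
    than the second nearest (Lowe's ratio test)."""
    matches = []
    for i, q in enumerate(query):
        dists = np.linalg.norm(reference - q, axis=1)
        nearest, second = np.argsort(dists)[:2]
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, int(nearest)))
    return matches

# Toy descriptors: query[0] is a near-duplicate of reference[1],
# query[1] is unrelated noise and should fail the ratio test.
rng = np.random.default_rng(0)
reference = rng.normal(size=(5, 128))
query = np.vstack([reference[1] + 0.01, rng.normal(size=(1, 128))])
print(ratio_test_matches(query, reference))
```

A real system would add geometric-consistency checks (e.g. RANSAC over matched key-point positions) on top of this raw descriptor score.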
9. Visual Atoms
▪ State-of-the-art software for image and video search
▪ Based upon the extensive use of descriptors (files)
▪ Both batch (command line) & APIs available
▪ Batch allows fine-tuning of parameters and multi-file ingestion
▪ The database can be chosen to match the production database (Postgres DB required)
▪ Automatic video segmentation and keyframe extraction (for videos)
▪ Tunable parameters for video segmentation (trade-off between precision and DB size/retrieval speed)
▪ Descriptors can be extracted per key-frame (videos) or per image, with search performed afterwards
11. Workflow details
▪ Bash/Python custom modules (Operators)
▪ "Video Info" database wrapper with RESTful APIs
▪ Image/video descriptors stored in the "Video Info" DB (no need to extract them every time, saving time)
▪ Input image/video status is saved in the "Video Info" DB
▪ Airflow's internal DB tracks all the steps of the chain for the various runs
▪ Back-end services for interaction with the Visual Atoms engine
▪ Date clustering module to group together videos with similar dates and distribute the queries over multiple search indexes, each working on a different temporal window (ingestion optimization, parallel search)
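The date clustering idea in the last bullet can be sketched in a few lines: sort footage by creation date, then cut a new bucket whenever the date moves past a fixed window, so each bucket can feed its own search index and be queried in parallel. A minimal sketch of the idea, not ViSer's actual module; names and the 7-day window are assumptions.

```python
from datetime import date

def cluster_by_date(items, window_days=7):
    """Group (name, creation_date) pairs into buckets spanning at most
    `window_days` days each, so every bucket can back a separate
    search index (sketch of the date clustering idea)."""
    items = sorted(items, key=lambda it: it[1])
    clusters, current, start = [], [], None
    for name, d in items:
        if start is None or (d - start).days >= window_days:
            if current:
                clusters.append(current)
            current, start = [], d  # open a new temporal bucket
        current.append(name)
    if current:
        clusters.append(current)
    return clusters

footage = [
    ("a.mxf", date(2016, 3, 1)),
    ("b.mxf", date(2016, 3, 3)),
    ("c.mxf", date(2016, 3, 15)),
]
print(cluster_by_date(footage))  # → [['a.mxf', 'b.mxf'], ['c.mxf']]
```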
12. Date management & search strategy
▪ No relevant raw metadata in an XDCAM file
▪ Creation/shooting date: the only reliable raw footage parameter (MXF embedded data)
▪ Raw footage is searched using a temporal windowing system: a window is a pool of episodes ranging from the creation date to a programmable number of hours/episodes from that date
▪ Variable and sliding temporal window
▪ No-match case: the unlinked footage is moved to the next temporal window and searched again against a newer aired set
▪ If, after a programmable number of trials, the search outputs no result, the raw footage is dropped
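The slide-and-retry strategy above can be sketched as follows. `search_fn` is a hypothetical stand-in for the Visual Atoms query against one temporal window; the window size and trial count are the "programmable" parameters the slide mentions, with assumed defaults.

```python
from datetime import datetime, timedelta

def windowed_search(footage_date, search_fn, window_hours=48, max_trials=3):
    """Search aired material in successive temporal windows starting at
    the footage creation date; slide the window forward on a miss and
    give up (footage dropped) after `max_trials` failed attempts."""
    start = footage_date
    for _trial in range(max_trials):
        end = start + timedelta(hours=window_hours)
        matches = search_fn(start, end)
        if matches:
            return matches
        start = end  # no match: move on to the next temporal window
    return None  # dropped after max_trials

# Toy search: the matching item is only aired ~3 days after shooting,
# so the first window misses and the second one hits.
aired_at = datetime(2016, 3, 4, 12)
fake_search = lambda s, e: ["news_clip_17"] if s <= aired_at < e else []
print(windowed_search(datetime(2016, 3, 1, 12), fake_search))
# → ['news_clip_17']
```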
14. Our pilot: TG Leonardo
What is it:
▪ Scientific newscast
▪ Aired daily
▪ Unique example in Europe
▪ 10 minutes (now 15) per episode
▪ Large variety of topics (science, tech, health, economics, society, ...)
Why TG Leo:
▪ Same structure and workflow as a typical newsroom, but with a smaller footprint
▪ Small but prolific newsroom
▪ Long-time collaboration with Rai Teche (same facility)
▪ Visually diversified and appealing
▪ Short episode duration
▪ Large amount of raw footage immediately available and partially annotated
In numbers: 2,896 aired episodes; 134 ×; approx. 220 hr of raw material
16. Demo CMM
▪ Example of CMM integration
▪ After linking raw and aired material in batch, the results will be shown in the CMM
▪ For each aired video (already browsable with CMM) a list of shots, and the matches for each shot, will be displayed
▪ In the first releases a browser extension will be used to show the results
▪ http://10.58.78.175:9080/viser/index.html#/cmm-demo
17. Conclusions
▪ Started as an open-source-based project; the current version proves to be more robust and reliable
▪ Raw footage is linked properly (with shot-level granularity) to the corresponding aired footage
▪ No external documentation needed for raw material: less work for the media manager and less wasted footage, a significant financial gain for the company
▪ Easier for journalists to find video material for future news
▪ The workflow of the ingestion chain is currently under development and will be tested in collaboration with TG Leonardo's newsroom
18. Future work
▪ Adoption in bigger newsrooms
▪ Better integration with the multimedia catalogue (CMM)
▪ Precise advertisement data for better statistics and tailoring of advertising campaigns
▪ Query by example in "online" search mode
19. Advertisements demo
▪ Search for similar ads in both Rai's and competitors' assets and group them by airing hour
▪ Helpful tool for Business and Advertising depts.
▪ http://10.58.78.175:9080/player/shot_extraction.html
Visual Atoms' software works extremely well with brand logos