Visual Search at RAI: Requirements,
Architectures and Use Cases to Visually
Search through Broadcast Programmes
Speaker:
Federico Maria Pandolfi
Rai Teche, CRITS
▪ Importance of proper management and efficient retrieval methodologies
for huge amounts of media files
▪ Typical MAM systems: text-based queries, search over textual information
and metadata
▪ Pros: reliability, robustness
▪ Cons: metadata extraction is expensive, time consuming and may not be available
for each entry
▪ No semantic or analytical representations of contents
▪ No query-by-example or near-duplicate detection
▪ These issues are particularly relevant for the raw video material of the
newsrooms (main case study)
▪ CBIR and Image search technologies are becoming feasible and mature
solutions
The Scenario
Ideal production workflow and timeline
[Diagram: acquisition in the field ("fresh" footage capture); footage enters the newsroom (newsroom storage); Media Manager review: discard/delete or store on T3; footage becomes "historic" (T3, searchable through CMM)]
Case study: numbers
Rai's digital archives include (as of end
2015):
▪ 1.540.032 hr of video material
▪ 18.720 photos of scenic costumes
▪ 1.700 photos of set furniture
▪ 1.552 photos of Centro Elettronico Rai
The amount of video material grows at
a rate of approx. 130.000 hr/year (new +
digitized material)
Only about 46% is annotated
Rai’s (single) newsroom stores (approx.):
▪ 2.000hr of “fresh” raw footage
▪ 10.000hr of “historic” raw footage
Since each news item is about 3-5 minutes long, this
translates into:
▪ 24.000 – 40.000 news items from “fresh”
footage
▪ 120.000 – 200.000 news items from
“historic” footage
Only aired material is annotated (no raw)
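The item counts above follow from dividing the stored hours by the length of a news item. A minimal sketch of that arithmetic, where the function name `news_item_range` is a hypothetical helper and not part of any Rai system:

```python
def news_item_range(hours, min_minutes=3, max_minutes=5):
    """Return (fewest, most) news items that fit in `hours` of footage."""
    total_minutes = hours * 60
    # Longer items give the lower bound, shorter items the upper bound.
    return total_minutes // max_minutes, total_minutes // min_minutes


fresh = news_item_range(2_000)      # "fresh" raw footage, in hours
historic = news_item_range(10_000)  # "historic" raw footage, in hours
print(fresh)     # (24000, 40000)
print(historic)  # (120000, 200000)
```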
▪ Issue: Raw footage is
metadata-free: journalists
attach no metadata
(only aired material is
documented)
▪ Issue: Huge amount of
material discarded by the
newsrooms (substantial
loss for the company)
▪ Issue: Lack of powerful
tools for in-depth search
over the vast archive of
footage
Our vision
▪ Solution: Link raw footage with
the related annotated material,
using a state-of-the-art visual
search engine
Automatic metadata linking,
Visual search technologies,
Browse by indexed
references
What is ViSer
▪ Modular framework
▪ Few key modules + Workflow Manager
[Diagram: Newsroom; Search engine; Video Info DB; Browser extension; CMM; Workflow Manager]
ViSer technologies
Search Engine: Visual Atoms
▪ Ready-to-go, engineered solution (production-ready)
▪ Based on local descriptors
▪ No licensing issues with MPEG
▪ Video-to-video search capability
▪ Easy integration and customization
(parameters, DB)
▪ Full support
Workflow Manager: Apache Airflow
▪ Author, schedule and monitor workflows as
DAGs of tasks using Python code
▪ Custom + out-of-the-box operators
▪ Complete logging utility
▪ Rich user interface (to access and monitor
DAGs, variables, connections, …)
▪ Used to orchestrate all the steps of the binding
process for each raw footage in each
newsroom
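To illustrate what "workflows as DAGs of tasks" means for the binding process, here is a minimal sketch that uses only the Python standard library (`graphlib`), not the Airflow API. The step names are hypothetical; the slides do not list the actual tasks of the ViSer chain.

```python
from graphlib import TopologicalSorter

# Hypothetical steps for binding one raw clip.
# Each key maps to the set of steps it depends on.
BINDING_STEPS = {
    "read_creation_date": set(),
    "extract_descriptors": set(),
    "select_temporal_window": {"read_creation_date"},
    "search_aired_material": {"extract_descriptors", "select_temporal_window"},
    "store_links": {"search_aired_material"},
}


def execution_order(steps):
    """Return one valid run order in which every step follows its dependencies."""
    # static_order() raises graphlib.CycleError if the graph is not a DAG.
    return list(TopologicalSorter(steps).static_order())


order = execution_order(BINDING_STEPS)
print(order)
```

A workflow manager such as Airflow adds scheduling, retries, logging and a user interface on top of this kind of dependency ordering.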
AirFlow
Image search intro
▪ SIFT (Scale-Invariant Feature
Transform), one of the most-used and
robust feature matching algorithms
▪ Key-points detected using Difference of Gaussians (DoG)
▪ Descriptors extracted using
orientations in the areas near each
key-point
▪ Matching & score based on the
similarity between multiple descriptors
(+ geometric-consistency, …)
▪ Best use-cases: same images, rigid
objects, logos, …
▪ Not so good for face recognition (high
variability of features)
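The matching step can be sketched in pure Python with the nearest-neighbour ratio test that is commonly paired with SIFT descriptors. This is an illustration only: the slides do not say which matching rule Visual Atoms applies, the descriptors below are toy 3-dimensional vectors rather than real 128-dimensional SIFT descriptors, and the geometric-consistency check is left out.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ratio_test_matches(query_descs, ref_descs, ratio=0.75):
    """Return (query_index, ref_index) pairs whose best match is clearly
    closer than the second-best one."""
    matches = []
    for qi, q in enumerate(query_descs):
        ranked = sorted((euclidean(q, r), ri) for ri, r in enumerate(ref_descs))
        if len(ranked) < 2:
            continue  # the ratio test needs two candidates
        (best, best_i), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((qi, best_i))
    return matches


def match_score(query_descs, ref_descs, ratio=0.75):
    """Fraction of query descriptors with an unambiguous match."""
    if not query_descs:
        return 0.0
    return len(ratio_test_matches(query_descs, ref_descs, ratio)) / len(query_descs)


query = [[0, 0, 1], [1, 0, 0], [0.5, 0.5, 0.5]]
reference = [[0, 0, 1.1], [1, 0.1, 0], [5, 5, 5]]
print(ratio_test_matches(query, reference))  # [(0, 0), (1, 1)]
```

The third query descriptor is almost equally close to two reference descriptors, so it is rejected as ambiguous. Highly variable features, as in faces, tend to produce many such rejections.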
▪ State-of-the-art software for image and video search
▪ Based upon the extensive use of descriptors (files)
▪ Both batch (command line) & APIs available
▪ Batch mode allows fine-tuning of parameters and ingestion of multiple files
▪ The database can be chosen to match the production database
(Postgres DB requested)
▪ Automatic video-segmentation and keyframe extraction (for videos)
▪ Tuneable parameters for video-segmentation (trade-off between
precision and DB size/retrieval speed)
▪ Possibility to extract descriptors for each key-frame (videos) or image
and perform search afterwards
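The segmentation trade-off can be illustrated with a simple threshold-based shot detector. This is a hypothetical sketch, not the Visual Atoms algorithm: `frame_diffs`, `threshold` and the middle-frame keyframe rule are all assumptions made for the example.

```python
def segment_shots(frame_diffs, threshold):
    """Split frames into shots.

    frame_diffs[i] is a difference score between frame i and frame i + 1;
    a score above `threshold` is treated as a cut.
    Returns a list of (first_frame, last_frame) tuples.
    """
    shots, start = [], 0
    for i, diff in enumerate(frame_diffs):
        if diff > threshold:
            shots.append((start, i))
            start = i + 1
    shots.append((start, len(frame_diffs)))
    return shots


def keyframes(shots):
    """Pick the middle frame of each shot as its keyframe."""
    return [(first + last) // 2 for first, last in shots]


diffs = [0.1, 0.2, 0.9, 0.1, 0.5, 0.1, 0.8, 0.1]  # 9 frames
coarse = segment_shots(diffs, threshold=0.7)
fine = segment_shots(diffs, threshold=0.4)
print(coarse, keyframes(coarse))  # [(0, 2), (3, 6), (7, 8)] [1, 4, 7]
print(fine, keyframes(fine))      # [(0, 2), (3, 4), (5, 6), (7, 8)] [1, 3, 5, 7]
```

A lower threshold yields more shots and keyframes, hence more descriptors to store and search; a higher threshold keeps the database smaller at the cost of granularity.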
Visual Atoms
The workflow
▪ Bash/Python custom modules (Operators)
▪ “Video Info” database wrapper with RESTful APIs
▪ Image/video descriptors stored in “Video Info” DB (no need to extract
them every time, time saving)
▪ Input images/videos status is saved in “Video Info” DB
▪ Airflow’s internal DB tracks all the steps of the chain for the various
runs
▪ Back-end services for the interaction with Visual Atoms engine
▪ Date clustering module to group together videos with similar dates
and distribute the queries on multiple search indexes, each one
working on a different temporal window (ingestion optimization,
parallel search)
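A date clustering step of this kind could look like the sketch below. The grouping rule (start a new cluster when the gap to the previous creation date exceeds `max_gap_days`) is an assumption; the slides do not describe how the actual module groups videos.

```python
from datetime import date, timedelta


def cluster_by_date(videos, max_gap_days=1):
    """Group (video_id, creation_date) pairs so that consecutive dates
    inside a cluster are at most `max_gap_days` apart."""
    clusters = []
    for video_id, created in sorted(videos, key=lambda v: v[1]):
        last_date = clusters[-1][-1][1] if clusters else None
        if last_date is not None and created - last_date <= timedelta(days=max_gap_days):
            clusters[-1].append((video_id, created))
        else:
            clusters.append([(video_id, created)])
    return clusters


# Hypothetical raw clips with their creation dates.
videos = [
    ("raw_c", date(2018, 3, 10)),
    ("raw_a", date(2018, 3, 1)),
    ("raw_e", date(2018, 4, 1)),
    ("raw_b", date(2018, 3, 2)),
    ("raw_d", date(2018, 3, 11)),
]
clusters = cluster_by_date(videos)
for cluster in clusters:
    print([video_id for video_id, _ in cluster])
```

Each resulting cluster can then be sent to the search index that covers its temporal window, so that queries for different periods run in parallel.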
Workflow details
▪ No relevant metadata in raw XDCAM footage
▪ Creation/shooting date: only reliable raw footage parameter (MXF,
embedded data)
▪ Raw footage is searched using a temporal windowing system: a
window is a pool of episodes ranging from the creation date to a
programmable number of hours/episodes starting from that date
▪ Variable and sliding temporal window
▪ No match case: the unlinked footage is moved to the next temporal
window and searched again in a newer aired set
▪ If, after a programmable number of attempts, the search does not return
any result, the raw footage is dropped
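The search strategy above can be sketched as a retry loop over successive temporal windows. The names `link_raw_footage`, `window_hours` and `max_attempts` are hypothetical, `search` is a stand-in callable for the visual search engine, and the non-overlapping windows are an assumption (the real windows are variable and may slide differently).

```python
from datetime import datetime, timedelta


def link_raw_footage(creation_date, aired_episodes, search,
                     window_hours=48, max_attempts=3):
    """Search successive temporal windows starting at the creation date.

    Returns (matches, attempts_used), or (None, max_attempts) when the
    footage is dropped after the last attempt.
    """
    start = creation_date
    for attempt in range(1, max_attempts + 1):
        end = start + timedelta(hours=window_hours)
        window = [ep for ep in aired_episodes if start <= ep["aired"] < end]
        matches = search(window)
        if matches:
            return matches, attempt
        start = end  # no match: move on to the next, newer window
    return None, max_attempts


shot_on = datetime(2018, 5, 1, 9, 0)
episodes = [
    {"id": "ep1", "aired": datetime(2018, 5, 1, 14, 0)},
    {"id": "ep2", "aired": datetime(2018, 5, 2, 14, 0)},
    {"id": "ep3", "aired": datetime(2018, 5, 4, 14, 0)},
]


def fake_search(window):
    """Stand-in for the search engine: pretend only ep3 reuses the footage."""
    return [ep["id"] for ep in window if ep["id"] == "ep3"]


print(link_raw_footage(shot_on, episodes, fake_search))  # (['ep3'], 2)
```

With `max_attempts=1` the same footage would be dropped, because the matching episode aired outside the first window.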
Date management & search
strategy
Date management example
What is it:
▪ Scientific newscast
▪ Aired daily
▪ Unique example in Europe
▪ 10 minutes (now 15) per episode
▪ Large variety of topics (science,
tech, health, economics, society,
...)
Our pilot: TG Leonardo
Why TG Leo:
▪ Same structure and workflow as a
typical newsroom but with a
smaller footprint
▪ Small but prolific newsroom
▪ Long-standing collaboration with Rai
Teche (same facility)
▪ Visually diversified and appealing
▪ Short episode duration
▪ Large amount of raw footage
immediately available and
partially annotated
In numbers:
▪ 2.896 aired episodes
▪ 134 x
▪ Approx. 220 hr of raw material
Preliminary results
Demo CMM
▪ Example of CMM integration
▪ After raw and aired material are linked
in batch, the results will be shown in
the CMM
▪ For each aired video (already
browsable with CMM), a list of shots
and the matches for each shot will
be displayed
▪ In the first releases a browser
extension will be used to show the
results
▪ http://10.58.78.175:9080/viser/index.html#/cmm-demo
▪ The project started on an open-source basis; the current version proves
to be more robust and reliable
▪ Raw footage is linked properly (with shot-level granularity) to the
corresponding aired footage
▪ No external documentation needed for raw material, less work for the
media manager and less waste of footage. Significant financial gain
for the company
▪ Easier for journalists to find video material for future news
▪ The workflow of the ingestion chain is currently under development
and will be tested in collaboration with TG Leonardo’s newsroom
Conclusions
▪ Adoption in bigger newsrooms
▪ Better integration with the multimedia catalogue (CMM)
▪ Precise advertisement data for better statistics and tailoring of
advertising campaigns
▪ Query by example in “online” search mode
Future work
Advertisements demo
▪ Search for similar ads in both Rai's and
competitors' assets and group them by
airing hour
▪ Helpful tool for Business and Advertising
depts.
▪ http://10.58.78.175:9080/player/shot_extraction.html
Visual Atoms’ software works
extremely well with brand logos
Browse the past, edit the
future…

Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 

Visual search at Rai: requirements, architectures and use cases to visually search through broadcast

  • 1. Visual Search at RAI: Requirements, Architectures and Use Cases to Visually Search through Broadcast Programmes Speaker: Federico Maria Pandolfi Rai Teche, CRITS
  • 2. ▪ Importance of proper management and efficient retrieval methodologies for huge amounts of media files ▪ Typical MAM systems: text-based queries, search over textual information and metadata ▪ Pros: reliability, robustness ▪ Cons: metadata extraction is expensive, time consuming and may not be available for each entry ▪ No semantic or analytical representations of contents ▪ No query-by-example or near-duplicate detection ▪ These issues are particularly relevant for the raw video material of the newsrooms (main case study) ▪ CBIR and Image search technologies are becoming feasible and mature solutions The Scenario
  • 3. Ideal production workflow and timeline “Fresh” footage capture Newsroom storage Media Manager Discard/delete Store on T3 Acquisition on field Enter the newsroom Becomes “historic” Media Manager Review T3 CMM searchable
  • 4. Case study: numbers Rai's digital archives include (as of end 2015): ▪ 1.540.032 hr of video material ▪ 18.720 photos of scenic costumes ▪ 1.700 photos of set furniture ▪ 1.552 photos of Centro Elettronico Rai The amount of video material increases at a rate of approx. 130.000 hr/year (new + digitized material) Only about 46% is annotated Rai’s (single) newsroom stores (approx.): ▪ 2.000 hr of “fresh” raw footage ▪ 10.000 hr of “historic” raw footage Since each news item is about 3’-5’ long, this translates into: ▪ 24.000 – 40.000 news items from “fresh” footage ▪ 120.000 – 200.000 news items from “historic” footage Only aired material is annotated (no raw)
  • 5. ▪ Issue: Metadata-free raw footage, no metadata attached by journalists (only aired material is documented) ▪ Issue: Huge amount of material discarded by the newsrooms (substantial loss for the company) ▪ Issue: Lack of powerful tools for in-depth search over the vast archive of footage Our vision ▪ Solution: Link raw footage with the related annotated material, using a state-of-the-art visual search engine Automatic metadata linking, Visual search technologies, Browse by indexed references
  • 6. What is ViSer ▪ Modular framework ▪ Few key modules + Workflow Manager Newsroom Search engine Video Info DB Browser extension CMM Workflow Manager
  • 7. ViSer technologies Search Engine: Visual Atoms ▪ Ready to go, engineered solution (production-ready) ▪ Based on local descriptors ▪ No licensing issues with MPEG ▪ Video-to-video search capability ▪ Easy integration and customization (parameters, DB) ▪ Full support Workflow Manager: Apache Airflow ▪ Author, schedule and monitor workflows as DAGs of tasks using Python code ▪ Custom + out-of-the-box operators ▪ Complete logging utility ▪ Rich user interface (to access and monitor DAGs, variables, connections, …) ▪ Used to orchestrate all the steps of the binding process for each raw footage in each newsroom AirFlow
  • 8. Image search intro ▪ SIFT (Scale-Invariant Feature Transform), one of the most widely used and robust feature-matching algorithms ▪ Key-points calculated using DoGs (Differences of Gaussians) ▪ Descriptors extracted using orientations in the areas near each key-point ▪ Matching & score based on the similarity between multiple descriptors (+ geometric consistency, …) ▪ Best use-cases: same images, rigid objects, logos, … ▪ Not so good for face recognition (high variability of features)
  • 9. ▪ State-of-the-art software for image and video search ▪ Based upon the extensive use of descriptors (files) ▪ Both batch (command line) & APIs available ▪ Batch allows a fine-tuning of parameters and multiple files ingestion ▪ The database can be chosen to match the production database (Postgres DB requested) ▪ Automatic video-segmentation and keyframe extraction (for videos) ▪ Tuneable parameters for video-segmentation (trade-off between precision and DB size/retrieval speed) ▪ Possibility to extract descriptors for each key-frame (videos) or image and perform search afterwards Visual Atoms
  • 11. ▪ Bash/Python custom modules (Operators) ▪ “Video Info” database wrapper with RESTful APIs ▪ Image/video descriptors stored in “Video Info” DB (no need to extract them every time, time saving) ▪ Input images/videos status is saved in “Video Info” DB ▪ Airflow’s internal DB tracks all the steps of the chain for the various runs ▪ Back-end services for the interaction with Visual Atoms engine ▪ Date clustering module to group together videos with similar dates and distribute the queries on multiple search indexes, each one working on a different temporal window (ingestion optimization, parallel search) Workflow details
  • 12. ▪ No relevant raw metadata in an XDCAM ▪ Creation/shooting date: the only reliable raw footage parameter (MXF, embedded data) ▪ Raw footage is searched using a temporal windowing system: a window is a pool of episodes ranging from the creation date to a programmable number of hours/episodes starting from that date ▪ Variable and sliding temporal window ▪ No-match case: the unlinked footage is moved to the next temporal window and searched again in a newer aired set ▪ If, after a programmable number of attempts, the search returns no result, the raw footage is dropped Date management & search strategy
  • 14. What is it: ▪ Scientific newscast ▪ Aired daily ▪ Unique example in Europe ▪ 10 minutes (now 15) per episode ▪ Large variety of topics (science, tech, health, economics, society, ...) Our pilot: TG Leonardo Why TG Leo: ▪ Same structure and workflow as a typical newsroom but with a smaller footprint ▪ Small but prolific newsroom ▪ Long-standing collaboration with Rai Teche (same facility) ▪ Visually diversified and appealing ▪ Short episode duration ▪ Large amount of raw footage immediately available and partially annotated 2.896 x Aired episodes In numbers: 134 x Approx. 220 hr of Raw material = ,
  • 16. Demo CMM ▪ Example of CMM integration ▪ After raw and aired material have been linked in batch, the results will be shown in the CMM ▪ For each aired video (already browsable with CMM), a list of shots and the matches for each shot will be displayed ▪ In the first releases a browser extension will be used to show the results ▪ http://10.58.78.175:9080/viser/index.html#/cmm-demo
  • 17. ▪ Started as an open-source based project, the current version proves to be more robust and reliable ▪ Raw footage is linked properly (with shot-level granularity) to the corresponding aired footage ▪ No external documentation needed for raw material, less work for the media manager and less waste of footage. Significant financial gain for the company ▪ Easier for journalists to find video material for future news ▪ The workflow of the ingestion chain is currently under development and will be tested in collaboration with TG Leonardo’s newsroom Conclusions
  • 18. ▪ Adoption in bigger newsrooms ▪ Better integration with the multimedia catalogue (CMM) ▪ Precise advertisement data for better statistics and tailoring of advertising campaigns ▪ Query by example in “online” search mode Future work
  • 19. Advertisements demo ▪ Search for similar ads on both Rai and competitors' assets and group them by airing hour ▪ Helpful tool for Business and Advertising depts. ▪ http://10.58.78.175:9080/player/shot_extraction.html Visual Atoms’ software works extremely well with brand logos
  • 20. Browse the past, edit the future…
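
The sliding temporal window described in slide 12 (date management & search strategy) can be illustrated with a minimal Python sketch. It is not the ViSer implementation: the function name `link_raw_footage`, the parameters `window_hours` and `max_trials`, and the `match_fn` callback (a stand-in for a query to the visual search engine) are all hypothetical names chosen for illustration, and the window is assumed to be expressed in hours only.

```python
from datetime import datetime, timedelta


def link_raw_footage(creation_date, aired_episodes, match_fn,
                     window_hours=48, max_trials=3):
    """Search aired episodes window by window, starting at the creation date.

    aired_episodes: list of (air_datetime, episode_id) tuples.
    match_fn: hypothetical callback returning True when the raw footage
              visually matches the given episode (stand-in for the engine).
    Returns the list of matching episode ids, or None if the footage is
    dropped after max_trials windows without any match.
    """
    window_start = creation_date
    for _ in range(max_trials):
        window_end = window_start + timedelta(hours=window_hours)
        # The pool is the set of episodes aired inside the current window.
        pool = [ep for air, ep in aired_episodes
                if window_start <= air < window_end]
        matches = [ep for ep in pool if match_fn(ep)]
        if matches:
            return matches
        # No match: slide to the next window and search a newer aired set.
        window_start = window_end
    return None  # no result after the allowed number of attempts


# Usage with made-up data: footage shot on 1 Jan, three aired episodes.
shot = datetime(2016, 1, 1)
episodes = [
    (datetime(2016, 1, 1, 12), "ep1"),
    (datetime(2016, 1, 4), "ep2"),
    (datetime(2016, 1, 11), "ep3"),
]
linked = link_raw_footage(shot, episodes, lambda ep: ep == "ep2")
dropped = link_raw_footage(shot, episodes, lambda ep: ep == "ep3")
```

With these made-up values, `ep2` falls in the second 48-hour window and is linked, while footage that only matches `ep3` is dropped because that episode airs after the third and last window.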