This paper gives an overview of ongoing work that aims to support end-users in conveniently exploring and exploiting large audiovisual archives by deploying multiple multimodal linking approaches. We present work on multimodal video hyperlinking, both from the perspective of unconstrained link anchor identification and based on the identification of named entities, as well as recent attempts to implement and validate the concept of outside-in linking, which relates current events to archive content. Although these concepts are not new, current work is yielding novel insights, more mature technology, benchmark evaluations, and dedicated workshops. Together these open many interesting research questions on various levels that require closer collaboration between research communities.
Convenient Discovery of Archived Video Using Audiovisual Hyperlinking
1. Convenient Discovery of Archived Video Using Audiovisual Hyperlinking
Roeland J.F. Ordelman
University of Twente &
Netherlands Institute
for Sound and Vision
The Netherlands
Robin Aly
Data Management
University of Twente
The Netherlands
Maria Eskevich
EURECOM
Sophia Antipolis
France
Benoit Huet
EURECOM
Sophia Antipolis
France
Gareth J.F. Jones
ADAPT Centre / CNGL
School of Computing
Dublin City University,
Ireland
2. Audio-Visual Explosion
• EU alone hosts 500+ online video platforms
• 42.7m hrs of footage in online archives of
broadcasters and producers (61% of archive footage
is online)
• User-generated content (UGC) is soaring:
– YouTube receives 72 hrs of video/minute
– Vine and Instagram video messaging
• Internet video will reach 62% of internet traffic by the
end of 2015, 75% in 2017
(source: CISCO)
• How to make the content accessible?
11/1/2015 SLAM2015 - Convenient Discovery of Archived Video Using Audiovisual Hyperlinking 2
VISUAL WEB
[R. Jain 2015]
3. Content Accessibility
• New technologies to stimulate active
appropriation of multimedia content
• User Information Need
– Expectation
– Interest
– Desire
– …
• Serendipity
• Story Telling
4. Content Accessibility
• Video search system evaluation results
– BBC and NISV
– Users:
• Don’t know footage availability
• Usually start their search with
– Celebrity
– Location (home town,…)
– Personal Interest (hobby, sport, etc…)
– AND THEN “what else might there be in the archive?”
10. Video HyperLinking Example
... the queen...
11. Usage Scenario
• Exploration of additional information (video)
sources while accessing content in a linear
fashion
• Exploration of an audiovisual archive via a
structure of linked video segments
• Creating narratives on the basis of linked
video segments
• Personalisation
12. 3 types of HyperLinks
• Inside-in (Video to Video Linking)
• Outside-in (Media to Video)
• Inside-out (Video to Media)
13. Automatic Video HyperLinking Process
• Anchor Identification
– Fragment with informative content
– Start-End time w.r.t. the video
• Anchor Representation
– Query creation
– Feature extraction and selection
• Target Search
– Query processing
– Identification of relevant fragments
• Target Presentation
– Ordered list
– Personalised selection
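The four stages above can be read as a simple pipeline. The Python sketch below is only an illustration under stated assumptions: the `Segment` type, the function names, and the term-overlap scoring are hypothetical placeholders chosen for brevity, not the systems evaluated in this work. Anchor identification is assumed to have happened already (the anchor is given).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical video fragment: video id, start/end in seconds, transcript text."""
    video_id: str
    start: float
    end: float
    text: str


def represent(anchor: Segment) -> set[str]:
    """Anchor representation: turn the anchor into a query (here: lowercase terms)."""
    return set(anchor.text.lower().split())


def search(query: set[str], archive: list[Segment]) -> list[tuple[Segment, int]]:
    """Target search: score each candidate by the number of terms shared with the query."""
    scored = [(seg, len(query & set(seg.text.lower().split()))) for seg in archive]
    return [(seg, score) for seg, score in scored if score > 0]


def present(results: list[tuple[Segment, int]], top_k: int = 3) -> list[Segment]:
    """Target presentation: an ordered list, best score first (ties keep archive order)."""
    ranked = sorted(results, key=lambda pair: pair[1], reverse=True)
    return [seg for seg, _ in ranked[:top_k]]


def hyperlink(anchor: Segment, archive: list[Segment]) -> list[Segment]:
    """Run representation, search, and presentation for one already-identified anchor."""
    candidates = [seg for seg in archive if seg != anchor]
    return present(search(represent(anchor), candidates))


# Toy data, invented for illustration only.
archive = [
    Segment("v1", 0.0, 30.0, "the queen visits the castle"),
    Segment("v2", 10.0, 40.0, "football results from the weekend"),
    Segment("v3", 5.0, 25.0, "a documentary on the queen"),
]
anchor = Segment("v0", 12.0, 18.0, "the queen")
targets = hyperlink(anchor, archive)
```

Keeping each stage as a separate function mirrors the slide: any stage (for example the scoring in `search`) could be swapped for a multimodal method without touching the others.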
14. Anchor Identification
• Key issues:
– multimodality: anchor can be in speech/audio,
visual, or both. Other?
– unfamiliarity: user evaluations demonstrate that
anchors in video are a ‘difficult’ concept
• Evaluation perspective:
– ask (professional) users to select anchors
– let task participants automatically identify anchors
15. Anchor Representation
• Focus on segments: start-time and end-time
(media fragment)
• Extract multimodal features
• Context
• Anchor representation may be ‘noisy’ due to
analysis errors
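As a small illustration of addressing a segment by start-time and end-time, the sketch below builds a temporal fragment identifier of the form `#t=start,end`. It assumes that "media fragment" here refers to the temporal dimension of the W3C Media Fragments URI syntax; the function name and the example URL are hypothetical.

```python
def media_fragment_uri(video_url: str, start: float, end: float) -> str:
    """Build a temporal media fragment URI (times in seconds) for a video segment."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")

    def fmt(seconds: float) -> str:
        # Keep millisecond precision, then drop trailing zeros ("90.000" -> "90").
        return f"{seconds:.3f}".rstrip("0").rstrip(".")

    return f"{video_url}#t={fmt(start)},{fmt(end)}"


# Hypothetical example URL; 12.5 s to 90 s into the video.
uri = media_fragment_uri("http://example.org/video.mp4", 12.5, 90.0)
```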
16. Target search
• Search for relevant link targets
• What is relevant?
• Working hypothesis:
– Content about what is represented in the anchor
(topically related) – context is important
– Content that is based upon it, similar, or has
identical semantic labels – context is less
important
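One possible way to approximate "topically related" targets is to rank candidate segments by textual similarity to the anchor. The sketch below uses TF-IDF weighting and cosine similarity over transcript text only. This is a generic retrieval baseline named here for illustration, not necessarily the method used in the evaluations, and it ignores visual features and context; all function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Compute smoothed TF-IDF vectors for a list of texts."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_targets(anchor_text: str, segment_texts: list[str]) -> list[tuple[int, float]]:
    """Return (segment index, similarity) pairs, most similar first."""
    vectors = tfidf_vectors([anchor_text] + segment_texts)
    anchor_vec, segment_vecs = vectors[0], vectors[1:]
    scores = [(i, cosine(anchor_vec, vec)) for i, vec in enumerate(segment_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy transcripts, invented for illustration only.
ranked = rank_targets(
    "queen visits castle",
    [
        "the queen opens a castle museum",
        "weekend football results",
        "queen and castle history",
    ],
)
```

The second hypothesis on the slide (similar content or identical semantic labels) would call for a different representation, for example matching on visual concept labels rather than transcript terms.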
17. Target presentation
• A ranked list of ‘search’ results for each
anchor?
• Depends on scenario, not addressed in current
evaluation set-up
• For assessment of results we provide
assessors with anchor – target pairs
18. 2015 Evaluation Campaigns
• MediaEval - Search and Anchoring in Video
Archives (SAVA)
– video search using multimodal features
(continued)
– automatic anchor selection (new)
• TRECVid – Video Hyperlinking
– automatic video-to-video linking
19. Anchor Identification - Task scenario
• Position yourself in the role of a producer who wants to
create a new production, e.g., a news item, report or
documentary
• The producer is searching the BBC archive for content for
this production and selects clips
• Imagine that the producer wants to place hyperlinks in
the clips that help the end-user watching the final
program to understand it better, or that enrich
their viewing experience
• Imagine that these links are provided to end-users, for
example via a ‘second screen’ (e.g., iPad)
20. Topic Description
Describe what you are looking for
(e.g., I am looking for clips with
castles and medieval villages)
Provide keywords that could be
in the speech of relevant clips
(e.g., middle ages, doomsday)
Provide keywords for visual
content (e.g., castle, bridge,
knight)
21. Search
Here is what you entered earlier.
You can change the description and
queries if needed
You can check if a result is
relevant for you
If you want to ‘have’ this clip
click the button
Clips you collect will end up
here. Provide at least 2 clips
22. Anchor Creation
Title of anchor (e.g., Castle)
To what would you want to link
this (e.g., a documentary on
this castle)
Was the anchor something
visual or in the speech?
24. User Study - Findings
• Anchor Modality Source
1. Spoken Content
2. Whole Scene
– What about Visual Content?
• Anchor creation is intention driven
– Content producer
• VS end-user/viewer
• VS advertisers
25. Entity-Based Anchor Identification
• Content Producer use-case for Video
Hyperlinking
– Rich interactive TV experience
– Keeping the Editor in the loop while automating
much of the process
• Editor Tools
– Anchors / Targets
http://editortool.linkedtv.eu/trial
26. Outside-In Linking
• Using external content to link into the video
archive
– Stimulate discovery and re-use
• Multimodal Analysis of the archive
– speech transcript, visual analysis
• Metadata
• Matching with Named Entities from RSS news
feeds
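The matching step can be sketched as an overlap between entity sets. The Python example below assumes that named entities have already been extracted, both from the news feed items and from the archive (speech transcripts, visual analysis, metadata); the function name, the `min_shared` parameter, and all identifiers and data are hypothetical.

```python
def link_news_to_archive(
    news_entities: dict[str, set[str]],
    archive_entities: dict[str, set[str]],
    min_shared: int = 1,
) -> dict[str, list[str]]:
    """For each news item, list archive segments sharing at least `min_shared` entities.

    Segments are ordered by the number of shared entities (most first),
    with ties broken alphabetically by segment id.
    """
    links: dict[str, list[str]] = {}
    for news_id, entities in news_entities.items():
        scored = []
        for segment_id, segment_entities in archive_entities.items():
            shared = entities & segment_entities
            if len(shared) >= min_shared:
                scored.append((len(shared), segment_id))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        links[news_id] = [segment_id for _, segment_id in scored]
    return links


# Toy entity sets, invented for illustration only.
news = {
    "item1": {"Queen Elizabeth II", "Windsor Castle"},
    "item2": {"Tour de France"},
}
archive = {
    "seg_a": {"Windsor Castle", "BBC"},
    "seg_b": {"Queen Elizabeth II", "Windsor Castle"},
    "seg_c": {"Wimbledon"},
}
links = link_news_to_archive(news, archive)
```

Exact string matching is the simplest choice; in practice entity variants ("the Queen" vs. "Queen Elizabeth II") would need normalisation before such a comparison is meaningful.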
28. End User Evaluation
• Interest
• But Unsatisfactory quality of the hyperlinks
• Limiting factors:
– Fine-tuning of the structured query formulation
– Dataset sparseness (limited to 3000 hours)
29. Inside-Out Linking
• Video content links to external information
– Provide enrichment
• Multimodal Analysis of Audio-Visual Broadcast
– speech transcript, (visual analysis)
• Broadcast Metadata (closed caption)
• Extract Named Entities from the text and use
Semantic Web technologies to identify
relevant structured content
30. The concept of Linked Television
• meet the viewer’s information need
– directly associated with the TV program
– easily accessible for the viewer
– under the control of the broadcaster
31.
https://vimeo.com/119107849
32. Conclusions
• Insights on Audio-Visual Hyperlinking
• Growing interest from both the research
community and industry
• Benchmark Evaluation
• What NEXT?
– Collaborations between fields (audio, visual, nlp,
semantic web, social media, big data…)
– Intent (anchor and hyperlink level)
33. Thank you
• Special thanks to
Jana Eggink and Andy O’Dwyer
• Any questions?
TRECVid (NIST)
Editor's Notes
Content that is about what is represented in the anchor (we sometimes refer to this as
'topically related'), and not content that is based upon it,
which is similar to it, or has identical semantic labels.