These are the slides from my keynote talk about Video Browsing on June 18, 2014, at the International Workshop on Content-Based Multimedia Indexing (CBMI) 2014.
Interactive Video Search - Tutorial at ACM Multimedia 2015 (klschoef)
This is the presentation given by Klaus Schoeffmann and Frank Hopfgartner at the ACM Multimedia 2015 Tutorial in Brisbane, Australia (October 26, 2015). #acmmm15
Find paper here:
http://dl.acm.org/citation.cfm?id=2807417
This document discusses technologies for video fragment creation and annotation for the purpose of video hyperlinking. It describes video temporal segmentation into shots and scenes to break videos into fragments. It also discusses visual concept detection and event detection for annotating fragments so that meaningful hyperlinks between fragments can be identified. An example approach is described that uses visual features to detect both abrupt and gradual shot transitions with high accuracy, while running 7-8 times faster than real time.
This document discusses structured interactive scores, which provide a formalism for interactive multimedia. It presents examples of multimedia interaction in domains like contemporary dance and interactive installations. It describes problems with most existing multimedia tools, such as a lack of formal semantics and unrelated time models. The document proposes interactive scores as a solution and discusses their history and development at LaBRI. It outlines existing tools for interactive scores, related formalisms, and the Virage software implementation.
TVSum: Summarizing Web Videos Using Titles (NEERAJ BAGHEL)
Title-based video summarization is a relatively unexplored domain; there is no publicly available dataset suitable for our purpose.
The authors therefore collected a new dataset, TVSum50, that contains 50 videos and their shot-level importance scores obtained via crowdsourcing.
SpokenMedia: Automatic Lecture Transcription and Rich Media Notebooks (Brandon Muramatsu)
Need to find a specific segment in an hour-long web video, webcast or podcast of a lecture? Want to read a transcript of that lecture? Want to bookmark, annotate, or discuss video or audio clips from an entire lecture? The SpokenMedia project at MIT is developing a web-based service to enable automatic lecture transcription. The project is also developing a suite of tools and services to improve interaction with webcasts and podcasts, enabling students and faculty to create rich media notebooks to support their learning and teaching. Presented by Brandon Muramatsu, Andrew McKinney and Peter Wilkins at NERCOMP 2010, Providence, Rhode Island, March 9, 2010.
Hyper Video Browser: Search and Hyperlinking in Broadcast Media (Benoit HUET)
Massive amounts of digital media are being produced and consumed daily on the Internet. Efficient access to relevant information is of key importance in contemporary society. The Hyper Video Browser provides multiple navigation means within the content of a media repository. Our system utilizes state-of-the-art multimodal content analysis and indexing techniques, at multiple temporal granularities, to satisfy user needs by suggesting relevant material.
We integrate two intuitive interfaces: one for searching and browsing the video archive, and one for hyperlinking to related content while watching a video. The novelty of this work includes a multi-faceted search and browsing interface for navigating video collections, and the dynamic suggestion of hyperlinks related to the media fragment being viewed rather than to the entire video.
The approach was evaluated on the MediaEval Search and Hyperlinking task, demonstrating its effectiveness at accurately locating relevant content in a large media archive.
SpokenMedia: Content, Content Everywhere...What video? Where? at OpenEd 2009 (Brandon Muramatsu)
This document discusses challenges in discovering and accessing open educational resource (OER) videos and audio lectures online. It proposes a lecture transcription service to improve searchability and discoverability of educational videos by generating timed transcripts. The service would automate transcription of lecture-style content and integrate with video hosting and production workflows. Transcripts could then enable richer search, playback, social features and reuse to enhance the user experience of OER videos.
Improving the OER Experience: Enabling Rich Media Notebooks of OER Video and ... (Brandon Muramatsu)
The SpokenMedia project at MIT is developing a web-based service to enable automatic lecture transcription. It is also developing a suite of tools and services to improve interaction with OER webcasts and podcasts, enabling students and faculty to create rich media notebooks to support their learning and teaching. Presented by Brandon Muramatsu at OER 10, Cambridge, UK, March 23, 2010.
Interactive Video Search: Where is the User in the Age of Deep Learning? (klschoef)
Interactive video retrieval tools are commonly evaluated using user studies, log file analysis, and indirect task-based evaluations like competitions. User studies directly observe users performing tasks with a tool and provide qualitative feedback. Log file analysis examines quantitative interaction patterns. Competitions like TRECVID and Video Browser Showdown pose search tasks to quantitatively compare tools. A combination of methods is often used to fully understand a tool's effectiveness from different perspectives.
Libraries as Motion Video: Setting up an in-house studio, getting visual & ex... (Bernadette Daly Swanson)
Libraries as Motion Video: Setting up an in-house studio, getting visual & extending skill-sets into new environments.
Created for the 3.5-hour Engage Workshop during the pre-conference for CARL (California Academic & Research Libraries Conference), April 8-10, 2010, Sacramento, CA.
PDF of the paper from CARL proceedings:
http://carl-acrl.org/Archives/ConferencesArchive/Conference10/2010proceedings/BernadetteDalySwanson.pdf
Accompanying video used during workshop:
http://www.youtube.com/watch?v=hktUGfpLhTw&hd=1
Library Video Channel:
http://www.youtube.com/user/libraryvideochannel
Presenters: Bernadette Daly Swanson & Meredith Saba, UC Davis
Photo credits: many images purchased from http://www.istockphoto.com - istockphoto, Bernadette Daly Swanson, Wikipedia, with screen captures from Second Life® and YouTube, assorted Library websites.
This document describes research using a video repository called the Video Mosaic Collaborative to build multimedia artifacts called VMC Analytics. Students and researchers create narratives with video clips to illustrate concepts from mathematics education and learning sciences. The artifacts are analyzed to identify high-quality examples and emerging themes. Word clouds are generated from coded artifacts to visualize dominant themes. Analysis of contrasting cases shows how artifacts can illustrate connections between teacher questioning and student engagement or conceptual understanding.
Automated Lecture Transcription at OCW Consortium Global Meeting 2009 (Brandon Muramatsu)
Introduction and background to the automated lecture transcription/lecture transcription service project by MIT's Office of Educational Innovation and Technology (OEIT). Presented by Brandon Muramatsu at the OCW Consortium Global Meeting in Monterrey, Mexico, April 22, 2009.
In-Time On-Place Learning — Creation, Annotation and Sharing of Location-Base... (Teemu Leinonen)
Presentation at the 10th International Conference on Mobile Learning 2014, 28 February – 2 March, Madrid, Spain. The aim of the research is to look at how mobile video recording devices could support learning related to physical practices, places, and situations at work. The paper discusses a particular kind of workplace learning, namely learning using short video clips that are related to the physical environment and to tasks performed in situ. It presents the challenges of supporting learning as part of work practices in the workplace, because learning has different attributes during work than in formal educational contexts: it is informal, just-in-time, and social. The theoretical framework of the design is the tradition of pragmatism. We start with the concepts of experience, change of practices/habits, and reflection, claiming that living through experiences suggests changes to practices, and these trigger reflective processing of the situations. We present an Android application, 'Ach So!', for creating and annotating short videos as a potential solution for informal learning of physical work practices. The paper ends by proposing future steps in the development of the application. The co-design process for the application is lean and iterative: the design receives feedback from the project partners, skilled workers, apprentices, and managers of the SMEs targeted to be the main users of the application.
In this talk I will address issues of "rigour" and "quality" in qualitative research, and the way that the two are closely aligned with how the researcher may explore various points of focus within the research process itself. Rigour and quality are inseparable from the generative nature of much qualitative inquiry, and the need to "show your workings" in the field within which the research is carried out. I will discuss this using examples of particular aspects of qualitative research that I have been involved with recently, both in design and execution. I will also discuss the opportunities and challenges of making a case for qualitative insights to augment and add value to other forms of research.
Content Modelling for Human Action Detection via Multidimensional Approach (CSCJournals)
Video content analysis is an active research domain due to the availability and growth of audiovisual data in digital format. There is a need to automatically extract video content for efficient access, understanding, browsing, and retrieval of videos. To obtain the information that is of interest and to provide better entertainment, tools are needed to help users extract relevant content and effectively navigate through the large amount of available video information. Existing methods do not attempt to model and estimate the semantic content of the video. Detecting and interpreting human presence, actions, and activities is one of the most valuable functions in the proposed framework. The general objective of this research is to analyze and process audio-video streams into a robust audiovisual action recognition system by integrating, structuring, and accessing multimodal information via a multidimensional retrieval and extraction model. The proposed technique characterizes action scenes by integrating cues obtained from both the audio and video tracks. Information is combined based on visual features (motion, edge, and visual characteristics of objects) and audio features for recognizing action. The model uses HMMs and GMMs to provide a framework for fusing these features and to represent the multidimensional structure of the framework. The action-related visual cues are obtained by computing the spatiotemporal dynamic activity from the video shots and by abstracting specific visual events. Simultaneously, the audio features are analyzed by locating and computing several sound effects of action events embedded in the video. Finally, these audio and visual cues are combined to identify the action scenes. Compared with using a single source of either the visual or the audio track alone, such combined audiovisual information provides more reliable performance and allows us to understand the story content of movies in more detail. To evaluate the usefulness of the proposed framework, several experiments were conducted; results were obtained using visual features only (77.89% precision; 72.10% recall), audio features only (62.52% precision; 48.93% recall), and combined audiovisual features (90.35% precision; 90.65% recall).
This document discusses motion media and its applications in education. Motion media refers to visual content that appears to be in motion, such as videos, films, and animations. It can communicate information through sight and sound to large audiences simultaneously. When used for education, motion media has several advantages, such as demonstrating processes and skills. It can also help teach problem solving and cultural understanding. However, it also has limitations, like a fixed pace and potential for misinterpretation. When incorporated into instruction, video-based materials can promote student-centered learning if they allow students to interpret content and apply it to new problems. Teachers can still play an important role by facilitating content and ensuring deeper understanding.
This document discusses a proposed system for semantically annotating and retrieving documentary media objects. It presents the system's architecture, which includes a manual annotation tool, authoring tool, and search engine for documentary experts. The system is based on an evolving semantic network that provides flexible organization of documentary content descriptions and related media data. The proposed approach provides semantic structures that can change and grow over time to allow ongoing interpretation of source material.
The document discusses a proposed system for semantically annotating and retrieving documentary media objects. It presents the system's architecture, which includes a manual annotation tool, authoring tool, and search engine. The key aspect of the system is using an evolving semantic network as the basis for audiovisual content description, allowing flexible organization and ongoing interpretation of source material. The proposed approach provides a way to semantically connect information nodes representing technical details of media objects to enable intelligent search and retrieval of documentary content.
This document presents an approach for automated indexing and content-based retrieval of lecture videos. It extracts textual metadata from slides using optical character recognition and spoken text from audio using automatic speech recognition. Keywords are extracted from the OCR and ASR results and used to create search indices. Video segments are identified by detecting transitions between unique lecture slides. The approach aims to enable efficient search and retrieval of specific video clips from large lecture video archives.
SpokenMedia Project: Media-Linked Transcripts and Rich Media Notebooks for Le... (Brandon Muramatsu)
The SpokenMedia project’s goal is to increase the effectiveness of web-based lecture media by improving the search and discoverability of specific, relevant media segments. SpokenMedia creates media-linked transcripts that will enable users to find contextually relevant video segments to improve their teaching and learning. The SpokenMedia project envisions a number of tools and services layered on top of, and supporting, these media-linked transcripts to enable users to interact with the media in more educationally relevant ways. Presented by Brandon Muramatsu at the Technology For Education 2009 Workshop, Bangalore, India, August 4, 2009.
These are the slides of the keynote talk I gave at CBMI 2019 (on September 4, 2019, in Dublin, Ireland) about the Video Browser Showdown (VBS) competition.
This document discusses video-based data collection methods for social network analysis research in sports, specifically football. It provides three examples of studies - two amateur football matches analyzed experimentally and one professional World Cup match analyzed using archived video data. The document also discusses the advantages and disadvantages of experimental and observational video-based data collection approaches for social network analysis.
Action event retrieval from cricket video using audio energy feature for even... (IAEME Publication)
This document summarizes a research paper that proposes an audio-based approach for retrieving action events from cricket videos. The approach detects action events by measuring abnormal increases in audio energy levels during batsman strokes and crowd cheers. The researchers extract audio features like MFCC coefficients and calculate audio energy values from cricket video soundtracks. Peaks in energy levels are detected to find action clips corresponding to strokes and cheers. The experiments analyze cricket videos and show the method can efficiently retrieve action events using only audio analysis of crowd noise and bat impacts.
Action event retrieval from cricket video using audio energy feature for event (IAEME Publication)
This document summarizes a research paper that proposes an audio-based approach for retrieving action events from cricket videos. The approach detects action events by measuring abnormal increases in audio energy levels during batsman strokes and crowd cheers. The researchers extract audio features like MFCC coefficients and calculate audio energy values from cricket video soundtracks. Peaks in energy levels are identified using adaptive thresholding to detect strokes and cheers. Video frames around detected audio peaks are retrieved as highlights of the action event. The method was tested on a dataset of cricket videos and results showed it can efficiently retrieve events like strokes and crowd reactions.
Relevant Content Detection in Cataract Surgery Videos (Invited Talk 1 at IPTA... (klschoef)
This document summarizes research on detecting relevant content in cataract surgery videos using computer vision techniques. It discusses segmenting videos into phases like incision and phacoemulsification. Instrument segmentation using Mask R-CNN is described, achieving over 90% accuracy. Relevance detection can enable compressed storage by encoding relevant segments at high quality and irrelevant segments at low quality. The goal is enabling efficient search, retrieval and analysis of cataract surgery videos for teaching, training and research.
These are the slides of the tutorial I gave at the International Conference on Image Processing Theory, Tools & Applications (IPTA 2022) on April 19, 2022.
More Related Content
Similar to Video Browsing - The Need for Interactive Video Search (Talk at CBMI 2014)
Medical Multimedia Systems and Applications (klschoef)
This document provides an overview of medical multimedia systems and applications. It discusses the use of multimedia data in medicine, including medical images, videos, and sensor data. A focus is placed on endoscopic video, including its characteristics, domains, and challenges. Applications of medical video are explored, such as post-procedural usage for documentation, training, and quality assessment. Techniques for pre-processing, analyzing, and summarizing medical video are also presented.
These are the results of the 7th Video Browser Showdown (VBS 2018), which was performed as a 3 hours competition on February 5th, 2018, at the MMM 2018 conference in Bangkok, Thailand.
These slides were presented as an introduction to the 7th Video Browser Showdown (VBS 2018) on February 5th, 2018, at the MMM 2018 in Bangkok, Thailand.
Medical Multimedia Information Systems (ACMMM17 Tutorial) (klschoef)
This document provides an overview of medical multimedia information systems and discusses various topics related to endoscopic video data. It begins with an introduction to different types of multimedia data in medicine, including medical text, sensor signals, images, and video. It then focuses on the characteristics of endoscopic video data and different research fields and communities. Several applications are discussed, including using surgery videos for post-procedural purposes and diagnostic decision support. The document also covers topics like domain-specific storage, video content analysis, and visualization and annotation of videos. It concludes with a discussion of knowledge transfer and future outlook for medical multimedia information systems.
2. Video Content Search Scenarios
• Private collection of recorded videos
Many long sequences…
You know there are a few interesting (e.g., funny) clips,
but don’t know where
Want to find them for editing/sharing
• Downloaded a suggested lecture video
In a hurry for an exam…
2 hours duration
Want to quickly check for important information
• Recordings from several surveillance cameras
Quickly look for suspicious activities (e.g., forensics expert)
Disasters (e.g., Boston Marathon bombings 2013)
3. Use Video Retrieval Tool?
(Figure: a content‐based video retrieval tool; query by example image or by text, using content‐based features, returning a ranked list of shots with temporal context.)
[ Heesch, D., Howarth, P., Magalhaes, J., May, A., Pickering, M., Yavlinsky, A., & Rüger, S. (2004, November).
Video retrieval using search and browsing. In TREC Video Retrieval Evaluation Online Proceedings. ]
4. Video Search Scenarios
• Private collection of recorded videos
Many long sequences…
You know there are a few interesting (e.g., funny) clips,
but don’t know where
Want to find them for editing/sharing
• Downloaded a suggested lecture video
In a hurry for an exam…
2 hours duration
Want to quickly check for important information
• Recordings from several surveillance cameras
Quickly look for suspicious activities (e.g., forensics expert)
Disasters (e.g., Boston Marathon bombings 2013)
(Highlighted terms: interesting, important information, suspicious activities)
6. Video Retrieval
Well-known issues
Query by example
Typically no perfect example available.
Query by text
How to describe a desired image by text?
Usability Gap
A picture tells a 1000 words.
by marfis75
How to describe a video clip by text???
8. TRECVID Known-item Search
TRECVID KIS (2010‐2012)
models the situation in which
“someone knows of a video, has seen it before, believes it is
contained in a collection, but doesn’t know where to look”
Automatic Search
Text‐description about the video
Return ranked list of 100 videos (out of 9000)
Interactive Search
Pre‐processing based on text query
Searcher browses through result list (e.g., keyframes of shots)
• Interactively find target video as fast as possible
• Within 5 minutes
9. TRECVID Known-item Search
The Performance of State-of-the-Art Video Retrieval Tools
Known items not found by any team:
Year   Interactive       Automatic          Out of
2010   5 / 24   (21%)    69 / 300   (22%)   15 teams
2011   6 / 25   (24%)    142 / 391  (36%)    9 teams
2012   2 / 24   (17%)    108 / 361  (29%)    9 teams
From: [Alan Smeaton, Paul Over, “Known‐Item Search @ TRECVID 2012”, NIST, 2012]
12. How do Users Browse Today?
In practice most users employ a…
VCRs in the 1970s provided similar functionality!
13. Novice vs. Expert
Novice:
• Mostly interactive search
• Simple‐to‐use
• Inflexible and tedious for archives
• Low performance
Expert:
• Mostly automatic search
• Complicated to use
• Flexible and easier (?) for archives
• Still limited performance
14. Modern Video Browsing
• Combines automatic and interactive search
• Integrates the user in search process
Instead of “query‐and‐browse‐results”
User controls search process
Inspects and interacts
Most meaningful feature for current need
• content navigation, abstract visualization,
ad‐hoc querying or content summarization, …
Klaus Schoeffmann, Frank Hopfgartner, Oge Marques, Laszlo Boeszoermenyi, and Joemon M. Jose, “Video browsing interfaces
and applications: a review“, in SPIE Reviews Journal , Vol. 1, No. 1, pp. 1‐35 (018004), SPIE, Online, March 2010
Exploratory Search
“Will know it when I see it!”
(instead of “telling the system what you want”)
15. Modern Video Browsing
• Interactive inspection/exploration of visual content in
order to satisfy an information need
• Focuses on search and exploration in
(i) single videos as well as (ii) video collections
Directed Search
Find a specific shot or segment in a video
Find a specific video in an archive
Undirected Search
Searching to discover information
E.g., browse through a video in order to
• Learn what the content looks like
• See if it is interesting
(Directed search is supported by video retrieval; undirected search is not.)
18. Improving Content Visualization
aka “Video Surrogates”
VideoTree
[Jansen et al., CBMI 2008]
A similar concept was proposed later [Girgensohn et al., ICMR 2011].
However, it was outperformed by a simple “grid of keyframes” in terms of search time.
23. Visual Seeker Bar with 2 Levels
Allows a user to quickly identify
similar/repeating scenes
[ Schoeffmann, K., & Boeszoermenyi, L. (2009, June). Video browsing using interactive navigation summaries. In
Content‐Based Multimedia Indexing, 2009. CBMI'09. Seventh International Workshop on (pp. 243‐248). IEEE. ]
24. Example: Motion Direction + Intensity
Motion Vector (µ) classification into
K=12 equidistant motion directions
Mapping to Hue channel
[ Schoeffmann, K., Lux, M., Taschwer, M., & Boeszoermenyi, L. (2009, June). Visualization of video motion in context
of video browsing. In Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on (pp. 658‐661). IEEE. ]
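The mapping itself is straightforward to sketch. Below is a minimal illustration (function and parameter names are mine, not from the cited paper): the motion-vector angle is quantized into one of K=12 equidistant direction bins, the bin index is mapped to a hue angle, and the vector magnitude can drive saturation or value to encode intensity.

```python
import numpy as np

def motion_to_hue(mv_x, mv_y, k=12):
    """Quantize a motion vector into one of k equidistant direction
    bins and map the bin to a hue angle; the magnitude encodes the
    motion intensity."""
    angle = np.arctan2(mv_y, mv_x) % (2 * np.pi)  # direction in [0, 2*pi)
    direction_bin = int(angle // (2 * np.pi / k)) % k
    hue = direction_bin * (360.0 / k)             # hue in degrees
    intensity = float(np.hypot(mv_x, mv_y))       # motion strength
    return hue, intensity

# Example: an up-right motion vector falls into the second of 12 bins.
print(motion_to_hue(1.0, 1.0))  # (30.0, 1.4142...)
```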
25. Ad-Hoc Query by Motion Pattern
[ Schoeffmann, K., Lux, M., Taschwer, M., & Boeszoermenyi, L. (2009, June). Visualization of video motion in context
of video browsing. In Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on (pp. 658‐661). IEEE. ]
26. Ad-Hoc Query by Color Layout
Region‐of‐Interest (ROI) Search
User selects spatial region‐of‐interest
On search
Compute Euclidean distance of frame F
to every other frame f (acc. to selected region)
Based on color layout descriptor
(Figure: the user‐selected region (I) of frame F is compared against the same region in every other frame, e.g., d(F,1)=350, d(F,k)=8, d(F,n)=400.)
[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single
video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247‐258). ACM. ]
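As a rough sketch of this ROI search (the tool uses an MPEG-7-style color layout descriptor; the grid of mean colors below merely approximates it, and all names are illustrative):

```python
import numpy as np

def roi_distances(frames, roi, ref_idx, grid=(4, 4)):
    """Rank all frames by the Euclidean distance between coarse
    color-layout descriptors computed over the user-selected ROI.

    frames  -- ndarray of shape (n, H, W, 3), e.g. one frame per shot
    roi     -- (top, left, height, width) of the selected region
    ref_idx -- index of the reference frame F
    """
    top, left, h, w = roi
    gh, gw = grid

    def descriptor(frame):
        # Mean color per grid cell inside the ROI; a stand-in for a
        # proper color layout descriptor.
        patch = frame[top:top + h, left:left + w].astype(np.float64)
        cells = [patch[i * h // gh:(i + 1) * h // gh,
                       j * w // gw:(j + 1) * w // gw].mean(axis=(0, 1))
                 for i in range(gh) for j in range(gw)]
        return np.concatenate(cells)

    ref = descriptor(frames[ref_idx])
    d = np.array([np.linalg.norm(descriptor(f) - ref) for f in frames])
    return np.argsort(d), d  # ranking and raw distances
```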
27. Ad-Hoc Query by Color Layout
[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single
video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247‐258). ACM. ]
29. Video Browser for the Digital Native
[ Adams, B., Greenhill, S., & Venkatesh, S. (2012, July). Towards a video browser for the digital native. In
Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on (pp. 127‐132). IEEE. ]
Temporal Semantic Compression
• Compress the content of, e.g., a 1‐hour video to 5 minutes.
• Based on tempo and popularity (see next slide)
Compression based on interestingness
User defines a compression factor (f) that determines the duration of the compressed video
Based on an interest function, k shots are ranked in order of interestingness, satisfying a duration constraint (sketched below)
Shots are presented in their temporal order
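The constraint itself is not reproduced on the slide. Presumably the k top-ranked shots are chosen so that their total duration fits the budget set by the compression factor, along the lines of

\sum_{i=1}^{k} d(s_i) \le f \cdot D

where d(s_i) is the duration of shot s_i and D is the duration of the original video; the exact formulation is given in the cited paper.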
30. Video Browser for the Digital Native
Interestingness
[ Adams, B., Greenhill, S., & Venkatesh, S. (2012, July). Towards a video browser for the digital native. In
Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on (pp. 127‐132). IEEE. ]
Tempo function derived from
motion and audio features
(originally from Greenhill et al.)
Per‐frame and per‐shot popularity based
on information like
YouTube Insights and manual annotations
31. Video Browser for the Digital Native
User study with 8 participants
Configuration elements tested via two tasks
1. Browse a familiar movie to find scenes you remember
2. Browse an unfamiliar movie to get a feel for its story or structure
Questionnaire with Likert‐scale ratings
[ Adams, B., Greenhill, S., & Venkatesh, S. (2012, July). Towards a video browser for the digital native. In
Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on (pp. 127‐132). IEEE. ]
33. Signature-based Video Browser
• Color sketches mapped to feature signatures
• Matched to those of keyframes
[ Kruliš, M., Lokoč, J. and Skopal, T. (2013). Efficient Extraction of Feature Signatures
Using Multi‐GPU Architecture. Springer Berlin Heidelberg, LNCS 7733, pp.446‐456. ]
1. Sampling keypoints
2. Description through location (x,y),
CIE Lab, contrast and entropy of
surrounding pixels
3. K‐means clustering
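A CPU-side sketch of this extraction pipeline (the cited work runs it on multi-GPU hardware; the window size, sample count, and names here are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

def feature_signature(image_lab, n_samples=2000, k=32, seed=0):
    """Sample pixels, describe each by (x, y, L, a, b, contrast,
    entropy) of its neighborhood, then cluster with k-means; the
    weighted centroids form the image's feature signature."""
    rng = np.random.default_rng(seed)
    h, w, _ = image_lab.shape
    ys = rng.integers(0, h, n_samples)
    xs = rng.integers(0, w, n_samples)

    feats = []
    for y, x in zip(ys, xs):
        l_val, a_val, b_val = image_lab[y, x]
        win = image_lab[max(0, y - 3):y + 4, max(0, x - 3):x + 4, 0]
        contrast = win.std()                  # local L-channel spread
        counts, _ = np.histogram(win, bins=16)
        p = counts / counts.sum()
        p = p[p > 0]
        entropy = -(p * np.log2(p)).sum()     # local L-channel entropy
        feats.append([x / w, y / h, l_val, a_val, b_val, contrast, entropy])

    km = KMeans(n_clusters=k, n_init=4, random_state=seed).fit(np.array(feats))
    weights = np.bincount(km.labels_, minlength=k) / n_samples
    return km.cluster_centers_, weights  # centroids + cluster weights
```

Two such signatures (one for the color sketch, one for a keyframe) can then be compared with an adaptive distance such as the Signature Quadratic Form Distance.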
37. Evaluation of Browsing Tools
• User Studies
Reflect real benefit (+)
Unexpected behaviors (+)
Very tedious to do (‐)
Individual data sets (‐)
• User Simulations
Quick procedure (+)
Approximation only (‐)
• Campaigns/Competitions
TRECVID Known‐Item‐Search
Video Browser Showdown
Combine advantages from above
38. Video Browser Showdown (VBS)
• Annual performance evaluation competition
Live evaluation of search performance
Special session at Int. Conference on MultiMedia Modeling (MMM)
• Focus
Known‐item Search tasks
Target clips are presented on site
Teams search in shared data set
Highly interactive search
e.g., text‐queries are not allowed
Should push research on interfaces
and interaction/navigation
Experts and Novices
Easy‐to‐use tools and methods
40. Video Browser Showdown (VBS)
• Scoring through VBS Server
• Score (s) [0‐100] for task i and team k is based on
Solve time (t)
Penalty (p) based on number of submissions (m)
Maximum solve time (Tmax), typically 3 minutes
[ Schoeffmann, K., Ahlström, D., Bailer, W., Cobârzan, C., Hopfgartner, F., McGuinness, K., ... & Weiss, W. (2013). The Video Browser
Showdown: a live evaluation of interactive video search tools. International Journal of Multimedia Information Retrieval, 1‐15. ]
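The slide lists the ingredients but not the formula. As a hedged illustration of how they could combine (the authoritative definition is in the cited IJMIR article, not this sketch):

```python
def vbs_score(t, m, t_max=180.0, solved=True):
    """Illustrative VBS-style score in [0, 100] for one task.

    t      -- solve time in seconds
    m      -- number of submissions made for the task
    t_max  -- maximum solve time, typically 3 minutes
    solved -- whether the correct segment was found in time
    """
    if not solved or t > t_max:
        return 0.0
    time_score = 100.0 - 50.0 * (t / t_max)  # faster solves score higher
    penalty = 1.0 + 0.2 * (m - 1)            # extra submissions cost
    return time_score / penalty
```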
41. VBS 2013 Evaluation
Baseline Study with Novices and a Video Player
• Additional user study (16 participants) for comparison with VBS tools
• Known Item Search Tasks as used for VBS 2013
[ Schoeffmann and Cobarzan, “An Evaluation of Interactive Search with Modern Video Players”, in
Proc. of the 2013 IEEE International Symposium on Multimedia (ISM), Anaheim, CA, USA, 2013 ]
42. VBS 2013: Baseline vs. Experts
Score
[ Schoeffmann, K., Ahlström, D., Bailer, W., Cobârzan, C., Hopfgartner, F., McGuinness, K., ... & Weiss, W. (2013). The Video Browser
Showdown: a live evaluation of interactive video search tools. International Journal of Multimedia Information Retrieval, 1‐15. ]
Avg (Baseline) = 74.8, Avg (VBS) = 71.7
43. VBS 2013: Baseline vs. Experts
Submission Time
Avg (Baseline) = 57.9 s, Avg (VBS) = 40.5 s
45. Conclusions
• Need for interactive/exploratory search
• Video browsing tools
Effective alternative to automatic search tools, support undirected search
Provide reasonable performance, can help to bridge usability gap
Many proposals for single browsing techniques
• But still improvable…
How to even better integrate user into search process?
User knowledge could help to circumvent shortcomings of content analysis
How to better support search behavior of users?
Stronger combination of automatic and interactive search techniques needed!
More research on interface concepts, interaction models, demos, and user studies!
46. Where is the User in Multimedia Retrieval?
IEEE Multimedia Magazine, Oct.‐Dec. 2012, vol. 19, no. 4, pp. 6‐10
Marcel Worring, Paul Sajda, Simone Santini, David Shamma, Alan Smeaton, Qiang Yang
• “In the multimedia retrieval community, the emphasis has moved toward quantitative results to such an extent that the user has moved into the background.”
• “It might be time to rethink what we are doing in the field.”
• “…users often don’t even know what they want from an automatic system….”
• “…user needs and characteristics are dynamic.”
• “It is so much easier to publish papers about improving a standard task than it is to describe a new insight about user intention or a new interface for browsing results.”
47. What About Novice Users?
[ Heesch, D., Howarth, P., Magalhaes, J., May, A., Pickering, M., Yavlinsky, A., & Rüger, S. (2004, November).
Video retrieval using search and browsing. In TREC Video Retrieval Evaluation Online Proceedings. ]
48. Video Browser Showdown 2012
Two examples (of the 11 tools)
Xiangyu Chen, Jin Yuan, Liqiang Nie, Zheng‐Jun Zha, Shuicheng Yan, and Tat‐Seng Chua, "TRECVID 2010
Known‐item Search by NUS", in Proceedings of TRECVID 2010 workshop, NIST, Gaithersburgh, USA, 2011
Jin Yuan, Huanbo Luan, Dejun Hou, Han Zhang, Yan‐Tao Zheng, Zheng‐Jun Zha, and Tat‐Seng Chua, "Video
Browser Showdown by NUS", in Proceedings of the 18th International Conference on Multimedia Modeling
(MMM) 2012, Klagenfurt, Austria, pp. 642‐645
• Keyframe extraction (shots)
• ASR and OCR
• HLF (Concepts)
• RF with Related Samples
• Uniform sampled keyframes (with flexible distance)
• Parallel playback + navigation
Manfred Del Fabro and Laszlo Böszörmenyi, "AAU Video Browser: Non‐
Sequential Hierarchical Video Browsing without Content Analysis", in
Proceedings of the 18th International Conference on Multimedia Modeling
(MMM) 2012, Klagenfurt, Austria, pp. 639‐641
Winner of VBS 2012
50. Mobile Video Browsing
FilmStrip – Improve Visibility [ Hudelist, M. A., Schoeffmann, K., & Boeszoermenyi, L. (2013, April). Mobile
video browsing with a 3D filmstrip. In Proceedings of the 3rd ACM conference on
International Conference on Multimedia Retrieval (pp. 299‐300). ACM. ]