Video Browsing - The Need for Interactive Video Search (Talk at CBMI 2014)
Jun. 19, 2014
These are the slides from my keynote talk about Video Browsing on June 18, 2014, at the International Workshop on Content-Based Multimedia Indexing (CBMI) 2014.
Video Content Search Scenarios
• Private collection of recorded videos
Many long sequences…
You know there are a few interesting (e.g., funny) clips,
but don’t know where
Want to find them for editing/sharing
• Downloaded a suggested lecture video
In a hurry for an exam…
2 hours duration
Want to quickly check for important information
• Recordings from several surveillance cameras
Quickly look for suspicious activities (e.g., forensics expert)
Disasters (e.g., Boston Marathon bombings 2013)
Use Video Retrieval Tool?
[Figure labels: content‐based feature, example image, text, ranked list of shots, temporal context]
[ Heesch, D., Howarth, P., Magalhaes, J., May, A., Pickering, M., Yavlinsky, A., & Rüger, S. (2004, November).
Video retrieval using search and browsing. In TREC Video Retrieval Evaluation Online Proceedings. ]
Search targets in these scenarios:
• interesting
• important information
• suspicious activities
Video Retrieval
Well-known issues
Query by example
Typically no perfect example available.
Query by text
How to describe a desired image by text?
Usability Gap
A picture tells 1000 words. (Image by marfis75)
How to describe a video clip by text???
TRECVID Known-item Search
TRECVID KIS (2010‐2012)
models the situation in which
“someone knows of a video, has seen it before, believes it is
contained in a collection, but doesn’t know where to look”
Automatic Search
Text description of the video
Return ranked list of 100 videos (out of 9000)
Interactive Search
Pre‐processing based on text query
Searcher browses through result list (e.g., keyframes of shots)
• Interactively find target video as fast as possible
• Within 5 minutes
TRECVID Known-item Search
The Performance of State-of-the-Art Video Retrieval Tools
Known items not found by any team:
Year   Interactive      Automatic          Teams
2010   5 / 24  (21%)    69 / 300  (22%)    15
2011   6 / 25  (24%)    142 / 391 (36%)    9
2012   2 / 24  (17%)    108 / 361 (29%)    9
From: [Alan Smeaton, Paul Over, “Known‐Item Search @ TRECVID 2012”, NIST, 2012]
How do Users Browse Today?
In practice most users employ a…
VCRs in the 1970s provided similar functionality!
Novice vs. Expert
Novice:
• Mostly interactive search
• Simple‐to‐use
• Inflexible and tedious for archives
• Low performance
Expert:
• Mostly automatic search
• Complicated to use
• Flexible and easier (?) for archives
• Still limited performance
Modern Video Browsing
• Combines automatic and interactive search
• Integrates the user in search process
Instead of “query‐and‐browse‐results”
User controls search process
Inspects and interacts
Most meaningful feature for current need
• content navigation, abstract visualization,
ad‐hoc querying or content summarization, …
Klaus Schoeffmann, Frank Hopfgartner, Oge Marques, Laszlo Boeszoermenyi, and Joemon M. Jose, “Video browsing interfaces
and applications: a review”, in SPIE Reviews Journal, Vol. 1, No. 1, pp. 1‐35 (018004), SPIE, Online, March 2010
Exploratory Search
“Will know it when I see it!”
(instead of “telling the system what you want”)
Modern Video Browsing
• Interactive inspection/exploration of visual content in
order to satisfy an information need
• Focuses on search and exploration in
(i) single videos as well as (ii) video collections
Directed Search (supported by video retrieval)
Find a specific shot or segment in a video
Find a specific video in an archive
Undirected Search (not supported by video retrieval)
Searching to discover information
E.g., browse through a video in order to
• Learn what the content looks like
• See if it is interesting
Improving Content Visualization
aka “Video Surrogates”
VideoTree
[Jansen et al., CBMI 2008]
However, outperformed by a simple
“grid of keyframes”
in terms of search time.
Similar concept proposed later
[Girgensohn et al., ICMR 2011]
Visual Seeker Bar with 2 Levels
Allows a user to quickly identify
similar/repeating scenes
[ Schoeffmann, K., & Boeszoermenyi, L. (2009, June). Video browsing using interactive navigation summaries. In
Content‐Based Multimedia Indexing, 2009. CBMI'09. Seventh International Workshop on (pp. 243‐248). IEEE. ]
Example: Motion Direction + Intensity
Motion Vector (µ) classification into
K=12 equidistant motion directions
Mapping to Hue channel
[ Schoeffmann, K., Lux, M., Taschwer, M., & Boeszoermenyi, L. (2009, June). Visualization of video motion in context
of video browsing. In Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on (pp. 658‐661). IEEE. ]
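A minimal Python sketch of the direction classification described above. It assumes bins centred on multiples of 30° and assumes that motion intensity is mapped to brightness (the slide only states that direction maps to hue). The names `direction_bin`, `motion_to_rgb`, and `max_magnitude` are illustrative and not taken from the cited paper.

```python
import colorsys
import math

K = 12  # number of equidistant motion directions, as stated on the slide


def direction_bin(dx: float, dy: float, k: int = K) -> int:
    """Classify a motion vector (dx, dy) into one of k equidistant directions."""
    angle = math.atan2(dy, dx) % (2 * math.pi)  # angle in [0, 2*pi]
    width = 2 * math.pi / k                     # angular width of one bin
    # Shift by half a bin so that bin 0 is centred on angle 0 (assumption).
    return int((angle + width / 2) // width) % k


def motion_to_rgb(dx: float, dy: float, max_magnitude: float = 16.0, k: int = K):
    """Map the direction bin to hue and the (assumed) intensity to brightness."""
    hue = direction_bin(dx, dy, k) / k
    value = min(math.hypot(dx, dy) / max_magnitude, 1.0)
    return colorsys.hsv_to_rgb(hue, 1.0, value)
```

With these assumptions, a vector pointing along the positive x axis falls into bin 0 and is drawn in red, and a zero vector is drawn in black.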
Ad-Hoc Query by Motion Pattern
[ Schoeffmann, K., Lux, M., Taschwer, M., & Boeszoermenyi, L. (2009, June). Visualization of video motion in context
of video browsing. In Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on (pp. 658‐661). IEEE. ]
Ad-Hoc Query by Color Layout
Region‐of‐Interest (ROI) Search
User selects spatial region‐of‐interest
On search
Compute Euclidean distance of frame F
to every other frame f (according to selected region)
Based on color layout descriptor
[Figure: frame F with user‐selected region (I) compared with frame 1 … frame k … frame n; example distances d(F,1)=350, d(F,k)=8, d(F,n)=400]
[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single
video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247‐258). ACM. ]
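The region-of-interest search can be sketched in Python as follows. This is a simplified stand-in, not the cited tool's implementation: instead of the full MPEG-7 color layout descriptor it uses the average color of each cell in a small grid over the selected region. Frames are assumed to be lists of rows of (r, g, b) tuples, and `layout_descriptor`, `rank_frames`, and `grid` are hypothetical names.

```python
import math


def layout_descriptor(frame, roi, grid=4):
    """Average colour per cell of a grid x grid partition of the region.

    `roi` is (top, left, height, width); height and width must be >= grid.
    Simplified stand-in for a colour layout descriptor (assumption).
    """
    top, left, height, width = roi
    descriptor = []
    for gy in range(grid):
        y0 = top + gy * height // grid
        y1 = top + (gy + 1) * height // grid
        for gx in range(grid):
            x0 = left + gx * width // grid
            x1 = left + (gx + 1) * width // grid
            pixels = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            for channel in range(3):
                descriptor.append(sum(p[channel] for p in pixels) / len(pixels))
    return descriptor


def rank_frames(frames, query_index, roi, grid=4):
    """Return (frame index, distance) pairs, most similar region first."""
    query = layout_descriptor(frames[query_index], roi, grid)
    scored = [(i, math.dist(query, layout_descriptor(f, roi, grid)))
              for i, f in enumerate(frames) if i != query_index]
    return sorted(scored, key=lambda pair: pair[1])
```

Because only pixels inside the selected region enter the descriptor, frames that differ from frame F elsewhere can still rank as close matches.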
Ad-Hoc Query by Color Layout
[ Schoeffmann, K., Taschwer, M., & Boeszoermenyi, L. (2010, February). The video explorer: a tool for navigation and searching within a single
video based on fast content analysis. In Proceedings of the first annual ACM SIGMM conference on Multimedia systems (pp. 247‐258). ACM. ]
Video Browser for the Digital Native
[ Adams, B., Greenhill, S., & Venkatesh, S. (2012, July). Towards a video browser for the digital native. In
Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on (pp. 127‐132). IEEE. ]
Temporal Semantic Compression
• Compress the content of, e.g., a 1 h video to 5 min
• Based on tempo and popularity (see next slide)
Compression on interestingness
User defines a compression factor (f)
that defines the duration of the compressed video
Based on an interest function, k shots are ranked
in order of interestingness, satisfying [formula not recoverable]
Shots are presented in their temporal order
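One way to read the compression step, as a hedged Python sketch. The constraint on the selected shots is not preserved in the slide text, so the sketch assumes the simplest budget: the selected shots must fit into the original duration divided by the compression factor. The `Shot` fields and the greedy selection are illustrative choices, not the cited authors' method.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    start: float      # start time in seconds
    duration: float   # shot length in seconds
    interest: float   # value of some interest function (hypothetical)


def compress(shots, factor):
    """Keep the most interesting shots that fit into total duration / factor."""
    budget = sum(s.duration for s in shots) / factor  # assumed constraint
    selected, used = [], 0.0
    # Rank by interestingness, then greedily add shots that still fit.
    for shot in sorted(shots, key=lambda s: s.interest, reverse=True):
        if used + shot.duration <= budget:
            selected.append(shot)
            used += shot.duration
    # Present the selected shots in their temporal order.
    return sorted(selected, key=lambda s: s.start)
```

For a 1 h video and a target of 5 minutes, the factor would be 12 under this reading.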
Video Browser for the Digital Native
Interestingness
[ Adams, B., Greenhill, S., & Venkatesh, S. (2012, July). Towards a video browser for the digital native. In
Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on (pp. 127‐132). IEEE. ]
Tempo function derived from
motion and audio features
(originally; Greenhill et al.)
Per‐frame and per‐shot popularity based
on information like
YouTube Insights and manual annotations
Video Browser for the Digital Native
User study with 8 participants
Configuration elements tested in two tasks
1. Browse a familiar movie to find scenes you remember
2. Browse an unfamiliar movie to get a feel for its story or structure
Questionnaire with Likert‐scale ratings
[ Adams, B., Greenhill, S., & Venkatesh, S. (2012, July). Towards a video browser for the digital native. In
Multimedia and Expo Workshops (ICMEW), 2012 IEEE International Conference on (pp. 127‐132). IEEE. ]
Signature-based Video Browser
• Color sketches mapped to
feature signatures
• Matched to those of
keyframes
[ Kruliš, M., Lokoč, J. and Skopal, T. (2013). Efficient Extraction of Feature Signatures
Using Multi‐GPU Architecture. Springer Berlin Heidelberg, LNCS 7733, pp.446‐456. ]
1. Sampling keypoints
2. Description through location (x,y),
CIE Lab, contrast and entropy of
surrounding pixels
3. K‐means clustering
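The three steps above can be sketched on the CPU in plain Python. The cited work targets multi-GPU extraction, so this only illustrates the pipeline. It assumes the image is already given as rows of (L, a, b) tuples, takes contrast as the standard deviation of L in a small window, and takes entropy from a coarse L histogram. These definitions and all function names are assumptions.

```python
import math
import random
from collections import Counter


def describe(image, x, y, radius=2):
    """7-D feature: position, Lab colour, contrast and entropy of the window."""
    height, width = len(image), len(image[0])
    lightness = [image[j][i][0]
                 for j in range(max(0, y - radius), min(height, y + radius + 1))
                 for i in range(max(0, x - radius), min(width, x + radius + 1))]
    n = len(lightness)
    mean = sum(lightness) / n
    contrast = math.sqrt(sum((v - mean) ** 2 for v in lightness) / n)
    counts = Counter(int(v) // 8 for v in lightness)  # coarse L histogram
    entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
    L, a, b = image[y][x]
    return (x / width, y / height, L, a, b, contrast, entropy)


def kmeans(points, k, iterations=10, rng=None):
    """Plain Lloyd iterations; returns (centroid, weight) per non-empty cluster."""
    rng = random.Random(0) if rng is None else rng
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        centroids = [tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return [(centroids[i], len(c) / len(points))
            for i, c in enumerate(clusters) if c]


def feature_signature(image, samples=200, k=4, seed=0):
    """Sample keypoints, describe them, and cluster them into a signature."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    keypoints = [(rng.randrange(width), rng.randrange(height)) for _ in range(samples)]
    return kmeans([describe(image, x, y) for x, y in keypoints], k, rng=rng)
```

The result is a list of (centroid, weight) pairs whose weights sum to 1. A color sketch could be turned into the same form and compared against keyframe signatures.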
Evaluation of Browsing Tools
• User Studies
Reflect real benefit (+)
Unexpected behaviors (+)
Very tedious to do (‐)
Individual data sets (‐)
• User Simulations
Quick procedure (+)
Approximation only (‐)
• Campaigns/Competitions
TRECVID Known‐Item‐Search
Video Browser Showdown
Combine advantages from above
Video Browser Showdown (VBS)
• Annual performance evaluation competition
Live evaluation of search performance
Special session at Int. Conference on MultiMedia Modeling (MMM)
• Focus
Known‐item Search tasks
Target clips are presented on site
Teams search in shared data set
Highly interactive search
e.g., text‐queries are not allowed
Should push research on interfaces
and interaction/navigation
Experts and Novices
Easy‐to‐use tools and methods
Video Browser Showdown (VBS)
• Scoring through VBS Server
• Score (s) [0‐100] for task i and team k is based on
Solve time (t)
Maximum solve time (Tmax), typically 3 minutes
Penalty (p) based on number of submissions (m)
[ Schoeffmann, K., Ahlström, D., Bailer, W., Cobârzan, C., Hopfgartner, F., McGuinness, K., ... & Weiss, W. (2013). The Video Browser
Showdown: a live evaluation of interactive video search tools. International Journal of Multimedia Information Retrieval, 1‐15. ]
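The scoring formula itself is not preserved in the slide text. The Python sketch below is therefore only one plausible instance built from the listed ingredients. It assumes a linear decay from 100 to 50 points over the maximum solve time and a penalty that divides the score once more than one submission was made; the formula in the cited paper may differ. `vbs_score` is a hypothetical name.

```python
def vbs_score(solve_time: float, submissions: int, t_max: float = 180.0) -> float:
    """Score in [0, 100] for one task and one team (assumed formula)."""
    if submissions < 1 or solve_time > t_max:
        return 0.0  # nothing submitted, or not solved within Tmax
    # Assumed time component: 100 points at t = 0, 50 points at t = Tmax.
    time_score = 100.0 - 50.0 * solve_time / t_max
    # Assumed penalty: no reduction for a single submission.
    penalty = 1 if submissions <= 1 else submissions - 1
    return time_score / penalty
```

Under these assumptions, a correct first submission after 90 seconds with Tmax = 3 minutes scores 75 points.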
VBS 2013 Evaluation
Baseline Study with Novices and a Video Player
• Additional user study (16 participants) for comparison with VBS tools
• Known Item Search Tasks as used for VBS 2013
[ Schoeffmann and Cobarzan, “An Evaluation of Interactive Search with Modern Video Players”, in
Proc. of the 2013 IEEE International Symposium on Multimedia (ISM), Anaheim, CA, USA, 2013 ]
VBS 2013: Baseline vs. Experts
Score
[ Schoeffmann, K., Ahlström, D., Bailer, W., Cobârzan, C., Hopfgartner, F., McGuinness, K., ... & Weiss, W. (2013). The Video Browser
Showdown: a live evaluation of interactive video search tools. International Journal of Multimedia Information Retrieval, 1‐15. ]
Avg (Baseline) = 74.8 Avg (VBS) = 71.7
VBS 2013: Baseline vs. Experts
Submission Time
Avg (Baseline) = 57.9 s Avg (VBS) = 40.5 s
Conclusions
• Need for interactive/exploratory search
• Video browsing tools
Effective alternative to automatic search tools, support undirected search
Provide reasonable performance, can help to bridge usability gap
Many proposals for single browsing techniques
• But still improvable…
How to even better integrate user into search process?
User knowledge could help to circumvent shortcomings of content analysis
How to better support search behavior of users?
Stronger combination of automatic and interactive search techniques needed!
More research on interface concepts, interaction models, demos, and user studies!
Marcel Worring, Paul Sajda, Simone Santini, David Shamma, Alan Smeaton, Qiang Yang, “Where is the User
in Multimedia Retrieval?”, IEEE Multimedia Magazine, Oct.‐Dec. 2012, vol. 19, no. 4, pp. 6‐10
• “In the multimedia retrieval community, the
emphasis has moved toward quantitative
results to such an extent that the user has
moved into the background.”
• “It might be time to rethink what we are doing
in the field.”
• “…users often don’t even know what they want
from an automatic system….”
• “…user needs and characteristics are dynamic.”
• “It is so much easier to publish papers about
improving a standard task than it is to describe
a new insight about user intention or a new
interface for browsing results.”
What About Novice Users?
[ Heesch, D., Howarth, P., Magalhaes, J., May, A., Pickering, M., Yavlinsky, A., & Rüger, S. (2004, November).
Video retrieval using search and browsing. In TREC Video Retrieval Evaluation Online Proceedings. ]
Video Browser Showdown 2012
Two examples (of the 11 tools)
Xiangyu Chen, Jin Yuan, Liqiang Nie, Zheng‐Jun Zha, Shuicheng Yan, and Tat‐Seng Chua, "TRECVID 2010
Known‐item Search by NUS", in Proceedings of TRECVID 2010 workshop, NIST, Gaithersburg, USA, 2011
Jin Yuan, Huanbo Luan, Dejun Hou, Han Zhang, Yan‐Tao Zheng, Zheng‐Jun Zha, and Tat‐Seng Chua, "Video
Browser Showdown by NUS", in Proceedings of the 18th International Conference on Multimedia Modeling
(MMM) 2012, Klagenfurt, Austria, pp. 642‐645
• Keyframe extraction (shots)
• ASR and OCR
• HLF (Concepts)
• RF with Related Samples
Manfred Del Fabro and Laszlo Böszörmenyi, "AAU Video Browser: Non‐
Sequential Hierarchical Video Browsing without Content Analysis", in
Proceedings of the 18th International Conference on Multimedia Modeling
(MMM) 2012, Klagenfurt, Austria, pp. 639‐641
Winner of VBS 2012
• Uniformly sampled keyframes
(with flexible distance)
• Parallel playback + navigation
Mobile Video Browsing
FilmStrip – Improve Visibility
[ Hudelist, M. A., Schoeffmann, K., & Boeszoermenyi, L. (2013, April). Mobile
video browsing with a 3D filmstrip. In Proceedings of the 3rd ACM conference on
International Conference on Multimedia Retrieval (pp. 299‐300). ACM. ]