Invited talk
Workshop on Interactive Information Access: Untangling Tasks and Technologies
At Centrum voor Wiskunde en Informatica (CWI), Amsterdam, The Netherlands
On Dec. 6, 2010
Untangling the semantic structure in a broadcast video archive
1. Workshop on Interactive Information Access
Untangling Tasks and Technologies
Untangling the semantic structure
in a broadcast video archive
Ichiro IDE
Nagoya University, Japan
University of Amsterdam, The Netherlands
December 7, 2010
2. Introduction
• Online digital video archives are becoming a reality
• Efficient retrieval and browsing
• Effective reuse
• Our aims:
• Extract the semantic structure between video data
• Rearrange video segments and generate new contents
• Provide a browsing / editing interface based on the extracted semantic structure
• (Semi-)automatic rearrangement of retrieved results for answering queries
3. NII news video archive
[System diagram components: RAID disk, servers, PCs for capturing, MPEG-1/2 decoder, closed-caption decoder, video archiving server, DBMS server, data processing server, client PCs]
Video archive: MPEG-1 video [970 GB], MPEG-2 video [5.9 TB]
Metadata: closed-caption text [79 MB], story boundaries [46 k stories]
• Program: NHK News7
• Period: since March 16, 2001 (over 1,700 hours)
4. Overview of the talk
• Exploring news stories along the topic thread structure
• Cross-language detection of related news stories by text and near-duplicate video segments
• Structuring a broadcast video archive based on near-duplicate video segments
5. Exploring news stories along the topic thread structure
I. Ide, H. Mo, N. Katayama, S. Satoh:
“Exploiting topic thread structures in a news video archive for the semi-automatic generation of video summaries”,
2006 IEEE Int. Conf. on Multimedia and Expo (ICME2006), July 2006
I. Ide, T. Kinoshita, T. Takahashi, S. Satoh, H. Murase:
“mediaWalker: A video archive explorer based on time-series semantic structure”,
15th ACM Int. Multimedia Conf. Demo Session, Sept. 2007
I. Ide, T. Kinoshita, T. Takahashi, H. Mo, N. Katayama, S. Satoh, H. Murase:
“Exploiting the chronological semantic structure in a large-scale broadcast news video archive for its efficient exploration”,
APSIPA Annual Summit and Conf. (ASC) 2010, to appear in Dec. 2010
6. Semantic structures in news video
Intra- & inter-video structure
• Story tracking / topic threading
[Diagram: intra-video structured videos (Video-1 to Video-5, each divided into stories) whose stories are linked across videos into topic threads (Thread-1, Thread-2, …)]
→ Inter-video structure
→ Reveals the semantic structure throughout the archive
7. Example of a topic thread structure
Period: 100 days
Origin: May 1, 2003, Story #1
[Cluster-view]
8. Contents of a topic thread structure
[Diagram: stories in a topic thread on the SARS outbreak; node labels in reading order]
• SARS outbreak in Beijing
• Chinese gov. worries about the spread in rural areas
• Chinese gov. watches the spread in rural areas
• Spreads in mainland China
• Calms down in Taiwan
• WHO sends a mission to Beijing
• WHO declares the cease
• Slows down in mainland China, spreads in Taiwan
• Calms down in mainland China, reports from Toronto
• Taiwanese doctor found infected after traveling to Japan
• Search for infection in Japan
• Anti-SARS conference held in Beijing
9. Browsing news video by the thread structure: mediaWalker
Demo
10. Towards Video Story-Telling
[Diagram labels: "From here", "To here"; "I want to know how it developed"]
• Generate a summarized video that explains how the story developed between two news stories
• Select a path (semi-)automatically
• Summarize the video streams along the path
Work in progress with Frank
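The path-selection step on this slide could be sketched as a graph search. This is a minimal sketch only: it assumes the topic thread structure is available as a directed graph whose nodes are story IDs and whose edges point to follow-up stories. The slides do not say how the path is actually chosen, so the fewest-hops breadth-first search, the `find_path` helper, and the story IDs are all hypothetical.

```python
from collections import deque


def find_path(links, start, goal):
    """Return a fewest-hops path from start to goal, or None if unreachable.

    links maps a story ID to the list of stories that follow it
    (a hypothetical encoding of the topic thread structure).
    """
    prev = {start: None}          # story -> predecessor on the search tree
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:   # walk back to the start story
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in links.get(node, []):
            if nxt not in prev:       # visit each story only once
                prev[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical thread structure: s1 is the "from here" story, s6 the "to here" story.
links = {
    "s1": ["s2", "s3"],
    "s2": ["s4"],
    "s3": ["s4", "s5"],
    "s4": ["s6"],
    "s5": [],
    "s6": [],
}
print(find_path(links, "s1", "s6"))  # ['s1', 's2', 's4', 's6']
```

A semi-automatic variant could let the user pin intermediate stories and run the same search between consecutive pins.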
11. Cross-language detection of related news stories by text and near-duplicate video segments
A. Ogawa, T. Takahashi, I. Ide, H. Murase:
“Cross-lingual retrieval of identical news events by near-duplicate video segment detection”,
14th Intl. Multimedia Modeling Conference (MMM2008), Jan. 2008
12. Cross-language news story detection
• Definition
– Detect news stories in different channels (especially in different languages) discussing the same event
• Problem
– Text-based approach
• Low MT * ASR quality (though recently improving…)
• Different viewpoints and cultures
• Proposed method
• Detect near-duplicate video segments to complement text information
[Figure label: Near duplicate]
13. Comparison of news video streams
• The same event should be broadcast at close times
• Compare news programs broadcast within +/- 24 hours
Compare only the center part to avoid superimposed captions
Cope with color differences by histogram averaging
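The comparison steps on this slide could be sketched as follows. This is a sketch under stated assumptions, not the authors' implementation: "the center part" is read as a central crop of each frame, "histogram averaging" is read as histogram equalization, frames are 8-bit grayscale NumPy arrays, and the mean absolute difference is a stand-in distance. The function names and the `keep` parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone

import numpy as np


def within_window(t1: datetime, t2: datetime, hours: float = 24.0) -> bool:
    """True if two broadcast times are at most `hours` apart."""
    return abs(t1 - t2) <= timedelta(hours=hours)


def center_crop(frame: np.ndarray, keep: float = 0.5) -> np.ndarray:
    """Keep the central `keep` fraction of a grayscale frame (H x W)."""
    h, w = frame.shape
    ch, cw = max(1, int(h * keep)), max(1, int(w * keep))
    top, left = (h - ch) // 2, (w - cw) // 2
    return frame[top:top + ch, left:left + cw]


def equalize(gray: np.ndarray) -> np.ndarray:
    """Histogram-equalize an 8-bit grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = cdf[-1] - cdf_min
    if denom == 0:                      # flat image: nothing to equalize
        return np.zeros_like(gray)
    lut = np.round((cdf - cdf_min) / denom * 255).clip(0, 255).astype(np.uint8)
    return lut[gray]                    # map every pixel through the lookup table


def frame_distance(a: np.ndarray, b: np.ndarray, keep: float = 0.5) -> float:
    """Mean absolute difference (0..1) of equalized center crops."""
    ea = equalize(center_crop(a, keep)).astype(np.float64)
    eb = equalize(center_crop(b, keep)).astype(np.float64)
    return float(np.mean(np.abs(ea - eb)) / 255.0)


# Two synthetic frames that differ only by a brightness offset.
frame_a = (np.arange(64, dtype=np.uint8) * 2).reshape(8, 8)
frame_b = frame_a + 40
t_a = datetime(2010, 12, 6, 19, 0, tzinfo=timezone.utc)
t_b = datetime(2010, 12, 7, 3, 0, tzinfo=timezone.utc)
if within_window(t_a, t_b):
    print(frame_distance(frame_a, frame_b))
```

Because equalization depends only on the rank order of pixel values, a pure brightness offset between two channels leaves the distance unchanged, which is the kind of color difference the slide aims to absorb.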
14. Example of news stories on the same event
Nov 9, 2004, Story #1, 19:01 (GMT+9) --
<<Keywords>> operation [25], US army [20], Fallujah [18], military force [12], troops [7], military strategy [7], attack [5], Iraqi army [5], general citizens [5], Iraq [4], …
Nov 8, 2004, Story #1, 22:03 (GMT-5) --
<<Keywords>> city [9], Jean [6], Aaron [6], Iraqi [4], phone, call [3], army forces [3], casualties [3], …
15. Cross-language news browsing interface: topicTraveller
Demo
16. Result
• Dataset
– 18 pairs of (JP: 1 ↔ US: 2)
– Ground truth: manually given

            Text only     Sum of text and image   Image only
Recall      83% (38/46)   96% (44/46)             43% (20/46)
Precision   72% (38/53)   90% (44/49)             77% (20/26)

→ Advantage of using image information
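The percentages on this slide can be re-derived from the fractions given next to them. The sketch below assumes the standard reading, which the slide does not spell out: recall is correct detections over ground-truth items, and precision is correct detections over all detections. The helper names are hypothetical.

```python
def recall(correct: int, ground_truth: int) -> float:
    """Fraction of ground-truth items that were detected."""
    return correct / ground_truth


def precision(correct: int, detected: int) -> float:
    """Fraction of detections that were correct."""
    return correct / detected


def as_percent(value: float) -> str:
    """Format a ratio as a whole-number percentage."""
    return f"{value:.0%}"


# Counts taken from the fractions shown on the slide.
print(as_percent(recall(38, 46)), as_percent(precision(38, 53)))  # 83% 72%
print(as_percent(recall(20, 46)), as_percent(precision(20, 26)))  # 43% 77%
print(as_percent(recall(44, 46)), as_percent(precision(44, 49)))  # 96% 90%
```

The last line shows that a recall of 96% corresponds to 44/46, in line with the 44 correct detections in the 90% (44/49) precision entry.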
17. Structuring a broadcast video archive based on near-duplicate video segments
I. Ide, Y. Shamoto, D. Deguchi, T. Takahashi, H. Murase:
“Classification of near-duplicate video segments based on their appearance patterns”,
20th Int. Conf. on Pattern Recognition (ICPR2010), Aug. 2010
18. Structuring a broadcast video archive
• Structure?
– For browsing / retrieval
– Differs among programs / genres
• Applications
– Advertisement database
– Related contents detection
• Related news, …
– Periodic contents detection
• Sub-program structure
→ Handle in a unified framework
19. Example of appearance patterns
[Figure panels: Advertisement, Related news, Sub-program]
• Different distributions for different types
Demo
20. Classes of near-duplicate segment types
1) Advertisement 2) Related news 3) Sub-program
4) Rebroadcast 5) Similar framing 6) Extracted segment
21. Near-duplicate detection experiment
• Data set
– 1 week of broadcast from 6 channels in the Tokyo area
→ Total: 1,008 hours
• Computer environment
– Cluster computer
• 40 CPUs (Intel Xeon 3.4 GHz, main memory: 1.0 GB)
• Computation cost
– CPU time: 133 days
– Actual time: 4 days
• Result
– 3,597,943 pairs (40,928 unique segments)
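The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch assumes round-the-clock broadcasting on every channel and an evenly divided workload across the CPUs; neither is stated on the slide, and the variable names are illustrative.

```python
# Data volume: 6 channels recorded for 7 days, assuming 24 hours per day.
channels, days, hours_per_day = 6, 7, 24
total_hours = channels * days * hours_per_day
print(total_hours)  # 1008

# Computation cost: 133 CPU-days spread over 40 CPUs.
cpu_days, cpus, wall_days = 133, 40, 4
ideal_days = cpu_days / cpus          # wall time with perfect parallelism
efficiency = ideal_days / wall_days   # share of the actual 4 days that was ideal work
print(f"{ideal_days:.3f} days ideal, {efficiency:.0%} parallel efficiency")
```

Under these assumptions the ideal wall time is a little over three days, so the reported four days is consistent with the 133 CPU-days of work.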
22. Automatic classification of classes
• Classification rules
– Features of near-duplicate video segments within a unique segment set
• Appearance period
• Appeared channels
• Appearance interval
• Length of the segment
• Periodic or not
• Extracted segment or not
[Diagram: a unique near-duplicate (ND) segment set is classified into Rebroadcast, Advertisement, Sub-program, Similar framing, Extracted segment, or Related news; further labels: "Extracted segment", "Original segment"]
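Most of the features listed on this slide could be computed from the occurrences of one unique near-duplicate segment set, as sketched below. This is an illustration under assumptions, not the rules from the talk: the `Occurrence` record, the `appearance_features` helper, and the 10% tolerance used to call a pattern periodic are all hypothetical, and the "extracted segment or not" feature is left out because the slides do not say how it is determined.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass(frozen=True)
class Occurrence:
    """One appearance of a near-duplicate segment (hypothetical record)."""
    channel: str
    start: datetime
    length_s: float


def appearance_features(occs, tolerance=0.1):
    """Summarize a non-empty list of occurrences of one unique segment set."""
    occs = sorted(occs, key=lambda o: o.start)
    starts = [o.start for o in occs]
    # Gaps between consecutive appearances, in seconds.
    intervals = [(b - a).total_seconds() for a, b in zip(starts, starts[1:])]
    mean_iv = mean(intervals) if intervals else 0.0
    # Periodic (assumed rule): at least two gaps, all within tolerance of the mean gap.
    periodic = (
        len(intervals) >= 2
        and mean_iv > 0
        and all(abs(iv - mean_iv) <= tolerance * mean_iv for iv in intervals)
    )
    return {
        "appearance_period_s": (starts[-1] - starts[0]).total_seconds(),
        "channels": sorted({o.channel for o in occs}),
        "mean_interval_s": mean_iv,
        "mean_length_s": mean(o.length_s for o in occs),
        "periodic": periodic,
    }


# A segment that appears on one channel at the same time on three consecutive days.
daily = [Occurrence("ch1", datetime(2010, 8, d, 19, 0), 15.0) for d in (1, 2, 3)]
print(appearance_features(daily))
```

A classifier over these features, whether hand-written rules or a learned model, would then assign one of the six classes; the thresholds would have to come from the data.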
24. Future directions
• Now we have structured the archives in various ways
→ Consider how to exploit the structure
• Reorganize the video data based on an external “scenario”
– News video archive ↔ Wikipedia description
→ (Semi-)automatic documentary generation
– Cooking video archive ↔ Plain recipe text
→ Multimedia supplementation to a text recipe
…
25. Summary
• Introduced work on analyzing the semantic structures in large-scale news video archives and interfaces for efficient understanding of their contents.
Thanks to:
• Nagoya Univ.: Profs. Hiroshi Murase, Daisuke Deguchi; Akira Ogawa, Yuji Shamoto, Tomoki Okuoka
• NII: Profs. Shin'ichi Satoh, Norio Katayama, Hiroshi Mo
• Gifu Shotoku Gakuen Univ.: Prof. Tomokazu Takahashi
• NetCompass Ltd.: Tomoyoshi Kinoshita, Takeharu Haraigawa
Funded by:
• JSPS, MEXT, MRI Inc., Kayamori Information Science Fund, Hoso Bunka Foundation