We present an exploratory study of the retrieval of semi-professional user-generated Internet video. The study is based on the MediaEval 2011 Rich Speech Retrieval (RSR) task for which the dataset was taken from the Internet sharing platform blip.tv, and search queries associated with specific speech acts occurring in the video. We compare results from three participant groups using: automatic speech recognition system transcript (ASR), metadata manually assigned to each video by the user who uploaded it, and their combination. RSR 2011 was a known-item search for a single manually identified ideal jump-in point in the video for each query where playback should begin. Retrieval effectiveness is measured using the MRR and mGAP metrics.
Using different transcript segmentation methods the participants tried to maximize the rank of the relevant item and to locate the nearest match to the ideal jump-in point. Results indicate that best overall results are obtained for topically homogeneous segments which have a strong overlap with the relevant region associated with the jump-in point, and that use of metadata can be beneficial when segments are unfocused or cover more than one topic.
Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
1. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Comparing Retrieval Effectiveness of
Alternative Content Segmentation Methods
for Internet Video Search
Maria Eskevich1 , Gareth J.F. Jones1 , Martha Larson2
Christian Wartena2,3
Robin Aly4 , Thijs Verschoor4 , Roeland Ordelman4
1 Centre for Digital Video Processing, Centre for Next Generation Localisation
School of Computing, Dublin City University, Dublin, Ireland
2 Delft University of Technology, Delft, The Netherlands
3 Univ. of Applied Sciences and Arts Hannover
4 University of Twente, The Netherlands
2. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Outline
MediaEval 2011 Rich Speech Retrieval Task
3 participant groups methods
Results and examples
Conclusion
Future Work: Brave New Task at MediaEval 2012
3. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Task Goal:
Information to be found - combination of required
audio and visual content, and speaker’s intention
4. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Task Goal:
Information to be found - combination of required
audio and visual content, and speaker’s intention
5. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Task Goal:
Information to be found - combination of required
audio and visual content, and speaker’s intention
6. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Task Goal:
Information to be found - combination of required
audio and visual content, and speaker’s intention
Transcript 1 Transcript 2
7. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Task Goal:
Information to be found - combination of required
audio and visual content, and speaker’s intention
Transcript 1 Transcript 2
Meaning 1 Meaning 2
8. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Task Goal:
Information to be found - combination of required
audio and visual content, and speaker’s intention
Transcript 1 = Transcript 2
Meaning 1 = Meaning 2
9. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Task Goal:
Information to be found - combination of required
audio and visual content, and speaker’s intention
Transcript 1 = Transcript 2
Meaning 1 = Meaning 2
Conventional retrieval
10. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Task Goal:
Information to be found - combination of required
audio and visual content, and speaker’s intention
Transcript 1 = Transcript 2
Meaning 1 = Meaning 2
11. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Task Goal:
Information to be found - combination of required
audio and visual content, and speaker’s intention
Transcript 1 = Transcript 2
Meaning 1 = Meaning 2
Speech act 1 = Speech act 2
12. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Task Goal:
Information to be found - combination of required
audio and visual content, and speaker’s intention
Transcript 1 = Transcript 2
Meaning 1 = Meaning 2
Speech act 1 = Speech act 2
Extended speech retrieval (find jump-in points)
13. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
14. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Data provided to task participants:
15. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Data provided to task participants:
Videos from Internet video sharing platform blip.tv
(ME10WWW dataset: testset: 1727 episodes, ca. 300 hs)
16. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Data provided to task participants:
Videos from Internet video sharing platform blip.tv
(ME10WWW dataset: testset: 1727 episodes, ca. 300 hs)
Automatic Speech Recognition (ASR) transcript provided
by LIMSI and Vocapia Research
17. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Data provided to task participants:
Videos from Internet video sharing platform blip.tv
(ME10WWW dataset: testset: 1727 episodes, ca. 300 hs)
Automatic Speech Recognition (ASR) transcript provided
by LIMSI and Vocapia Research
Metadata manually added by the uploader
18. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Data provided to task participants:
Videos from Internet video sharing platform blip.tv
(ME10WWW dataset: testset: 1727 episodes, ca. 300 hs)
Automatic Speech Recognition (ASR) transcript provided
by LIMSI and Vocapia Research
Metadata manually added by the uploader
50 user-generated short web style queries collected via
crowdsourcing, associated with following speech act types:
19. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Data provided to task participants:
Videos from Internet video sharing platform blip.tv
(ME10WWW dataset: testset: 1727 episodes, ca. 300 hs)
Automatic Speech Recognition (ASR) transcript provided
by LIMSI and Vocapia Research
Metadata manually added by the uploader
50 user-generated short web style queries collected via
crowdsourcing, associated with following speech act types:
’expressives’: apology (1), opinion (21)
’assertives’: definition (17)
’directives’: warning (6)
’commissives’: promise (5)
20. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Data provided to task participants:
Videos from Internet video sharing platform blip.tv
(ME10WWW dataset: testset: 1727 episodes, ca. 300 hs)
Automatic Speech Recognition (ASR) transcript provided
by LIMSI and Vocapia Research
Metadata manually added by the uploader
50 user-generated short web style queries collected via
crowdsourcing, associated with following speech act types:
’expressives’: apology (1), opinion (21)
’assertives’: definition (17)
’directives’: warning (6)
’commissives’: promise (5)
Data available for results assessment:
21. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Data provided to task participants:
Videos from Internet video sharing platform blip.tv
(ME10WWW dataset: testset: 1727 episodes, ca. 300 hs)
Automatic Speech Recognition (ASR) transcript provided
by LIMSI and Vocapia Research
Metadata manually added by the uploader
50 user-generated short web style queries collected via
crowdsourcing, associated with following speech act types:
’expressives’: apology (1), opinion (21)
’assertives’: definition (17)
’directives’: warning (6)
’commissives’: promise (5)
Data available for results assessment:
Time of the relevant item for the labeled speech act
22. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2011
Rich Speech Retrieval (RSR) Task
Data provided to task participants:
Videos from Internet video sharing platform blip.tv
(ME10WWW dataset: testset: 1727 episodes, ca. 300 hs)
Automatic Speech Recognition (ASR) transcript provided
by LIMSI and Vocapia Research
Metadata manually added by the uploader
50 user-generated short web style queries collected via
crowdsourcing, associated with following speech act types:
’expressives’: apology (1), opinion (21)
’assertives’: definition (17)
’directives’: warning (6)
’commissives’: promise (5)
Data available for results assessment:
Time of the relevant item for the labeled speech act
Accurate transcript of the labeled speech act
28. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Evaluation Metrics
Mean Reciprocal Rank (MRR):
1
RR =
RANK
Mean Generalized Average Precision (mGAP):
1
GAP = . PENALTY
RANK
29. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 1: Sliding Window (SW)
Tag and lemmatize the words
30. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 1: Sliding Window (SW)
Tag and lemmatize the words
Segmentation with sliding window:
31. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 1: Sliding Window (SW)
Tag and lemmatize the words
Segmentation with sliding window:
32. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 1: Sliding Window (SW)
Tag and lemmatize the words
Segmentation with sliding window:
33. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 1: Sliding Window (SW)
Tag and lemmatize the words
Segmentation with sliding window:
34. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 1: Sliding Window (SW)
Tag and lemmatize the words
Segmentation with sliding window:
Retrieval using BM25, BM25F - for use of metadata
35. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 1: Sliding Window (SW)
Tag and lemmatize the words
Segmentation with sliding window:
Retrieval using BM25, BM25F - for use of metadata
Post-processing:
36. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 1: Sliding Window (SW)
Tag and lemmatize the words
Segmentation with sliding window:
Retrieval using BM25, BM25F - for use of metadata
Post-processing:
37. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 1: Sliding Window (SW)
Tag and lemmatize the words
Segmentation with sliding window:
Retrieval using BM25, BM25F - for use of metadata
Post-processing:
38. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 1: Sliding Window (SW)
Tag and lemmatize the words
Segmentation with sliding window:
Retrieval using BM25, BM25F - for use of metadata
Post-processing:
39. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 2: Speech Segments (Sp)
Segmentation based on silence points and changes of
speakers
40. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 2: Speech Segments (Sp)
Segmentation based on silence points and changes of
speakers
41. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 2: Speech Segments (Sp)
Segmentation based on silence points and changes of
speakers
Search engine used: PFTijah
42. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 3: Lexical cohesion (LC)
Segmentation:
into lexically coherent segments, using 2 algorithms:
C99 and TextTiling
additional segment boundaries for silences > 0.5 sec
43. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Approach 3: Lexical cohesion (LC)
Segmentation:
into lexically coherent segments, using 2 algorithms:
C99 and TextTiling
additional segment boundaries for silences > 0.5 sec
SMART IR system with language modeling
44. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
RSR Results: MRR and mGAP
RunName WindowSize
60 30 10
MRR mGAP MRR mGAP MRR mGAP
SW asr sh 0.37 0.32 0.32 0.27 0.19 0.19
SW asr meta sh 0.35 0.29 0.30 0.25 0.14 0.14
SW meta 0.20 0.15 0.18 0.13 0.06 0.06
Sp asr 0.34 0.27 0.27 0.22 0.16 0.16
Sp asr meta 0.34 0.25 0.26 0.21 0.15 0.15
Sp meta 0.18 0.14 0.14 0.11 0.07 0.07
LC asr tt 0.36 0.25 0.29 0.18 0.09 0.09
LC asr meta tt 0.39 0.28 0.30 0.20 0.14 0.14
LC meta 0.18 0.11 0.09 0.07 0.03 0.03
45. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
RSR Results: MRR and mGAP
RunName WindowSize
60 30 10
MRR mGAP MRR mGAP MRR mGAP
SW asr sh 0.37 0.32 0.32 0.27 0.19 0.19
SW asr meta sh 0.35 0.29 0.30 0.25 0.14 0.14
SW meta 0.20 0.15 0.18 0.13 0.06 0.06
Sp asr 0.34 0.27 0.27 0.22 0.16 0.16
Sp asr meta 0.34 0.25 0.26 0.21 0.15 0.15
Sp meta 0.18 0.14 0.14 0.11 0.07 0.07
LC asr tt 0.36 0.25 0.29 0.18 0.09 0.09
LC asr meta tt 0.39 0.28 0.30 0.20 0.14 0.14
LC meta 0.18 0.11 0.09 0.07 0.03 0.03
46. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Relationship Between
Retrieval Effectiveness and Segmentation Methods
Segment:
100 % Recall of the relevant content
High Precision (30, 56 %) of the relevant content
Topic consistency
47. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Relationship Between
Retrieval Effectiveness and Segmentation Methods
48. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Relationship Between
Retrieval Effectiveness and Segmentation Methods
49. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Relationship Between
Retrieval Effectiveness and Segmentation Methods
Example 1
50. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Relationship Between
Retrieval Effectiveness and Segmentation Methods
Example 1
51. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Relationship Between
Retrieval Effectiveness and Segmentation Methods
Example 1
52. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Relationship Between
Retrieval Effectiveness and Segmentation Methods
Example 1
53. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Relationship Between
Retrieval Effectiveness and Segmentation Methods
Example 2
54. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Overlap of query words
with ASR transcript or Metadata. Example 3
55. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Overlap of query words
with ASR transcript or Metadata. Example 3
Segments on the same topic
are retrieved in top of the list;
56. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Overlap of query words
with ASR transcript or Metadata. Example 3
Segments on the same topic Use of metadata for segments
are retrieved in top of the list; containing relevant content:
decrease the rank, if
segment > 1 topic
does not affect the rank, if
segment = 1 topic
57. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Overlap of query words
with ASR transcript or Metadata. Example 4
58. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Overlap of query words
with ASR transcript or Metadata. Example 4
Segment:
High Recall (83, 100 %)
Precision = 23 %
Several topics covered
59. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Overlap of query words
with ASR transcript or Metadata. Example 4
Segment:
High Recall (83, 100 %)
Precision = 23 %
Several topics covered
− > Use of metadata increase the rank (Rank = 1)
60. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Conclusions and Future Work
Segmentation plays significant role in retrieving relevant
content
61. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Conclusions and Future Work
Segmentation plays significant role in retrieving relevant
content
High recall and precision of the relevant content within the
segment lead to good segment ranking.
62. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Conclusions and Future Work
Segmentation plays significant role in retrieving relevant
content
High recall and precision of the relevant content within the
segment lead to good segment ranking.
Related metadata is useful to improve ranking of the
segment with high recall and non relevant content.
63. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Conclusions and Future Work
Segmentation plays significant role in retrieving relevant
content
High recall and precision of the relevant content within the
segment lead to good segment ranking.
Related metadata is useful to improve ranking of the
segment with high recall and non relevant content.
− > Current exploration of segmentation methods to generate
segments that have high recall and precision of the relevant
content for the query
64. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
Conclusions and Future Work
Segmentation plays significant role in retrieving relevant
content
High recall and precision of the relevant content within the
segment lead to good segment ranking.
Related metadata is useful to improve ranking of the
segment with high recall and non relevant content.
− > Current exploration of segmentation methods to generate
segments that have high recall and precision of the relevant
content for the query
Due to small size of the query set no general conclusions
on the difference based on the speech act type
65. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2012 Brave New Task:
Search and Hyperlinking
Use Scenario:
A user is searching for a known segment in a video
collection.
Furthermore, because the information in the segment might
not be sufficient for his information need, s/he wants to
have links to other related video segments, which may help
to satisfy information need related to this video.
66. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2012 Brave New Task:
Search and Hyperlinking
Use Scenario:
A user is searching for a known segment in a video
collection.
Furthermore, because the information in the segment might
not be sufficient for his information need, s/he wants to
have links to other related video segments, which may help
to satisfy information need related to this video.
Sub-tasks:
67. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2012 Brave New Task:
Search and Hyperlinking
Use Scenario:
A user is searching for a known segment in a video
collection.
Furthermore, because the information in the segment might
not be sufficient for his information need, s/he wants to
have links to other related video segments, which may help
to satisfy information need related to this video.
Sub-tasks:
Search: finding suitable video segments based on a short
natural language query,
68. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2012 Brave New Task:
Search and Hyperlinking
Use Scenario:
A user is searching for a known segment in a video
collection.
Furthermore, because the information in the segment might
not be sufficient for his information need, s/he wants to
have links to other related video segments, which may help
to satisfy information need related to this video.
Sub-tasks:
Search: finding suitable video segments based on a short
natural language query,
Linking: defining links to other relevant video segments in
the collection.
69. Comparing Retrieval Effectiveness of Alternative Content Segmentation Methods for Internet Video Search
ediaEval 2012
Thank you for your attention!
Welcome to MediaEval 2012! http://multimediaeval.org