Experiments with Segmentation Strategies for Passage Retrieval in Audio-Visual Documents

ICMR 2014

  1. Experiments with Segmentation Strategies for Passage Retrieval in Audio-Visual Documents
     Petra Galuščáková and Pavel Pecina
     galuscakova@ufal.mff.cuni.cz
     Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague
     4. 4. 2014
  2. Information Retrieval
     ● Information Retrieval (IR) is the task of searching for documents relevant to a given query.
  3. Speech Retrieval
     ● Speech Retrieval focuses on retrieval from audio-visual documents (recordings).
  4. Speech Retrieval
     ● Speech Retrieval is often converted to traditional Information Retrieval
     ● An Automatic Speech Recognition (ASR) system is applied to the audio track
  5. Speech Retrieval Problems
     ● Documents are long (e.g. whole TV programmes)
     ● They are often unstructured
     ● Navigation in audio-visual recordings is time-consuming
     ● We need to retrieve relevant segments of full documents
     ● Hyperlinks (links between passages) make it possible to browse the recordings
     → Passage Retrieval
  6. Passage Retrieval
     ● Splits texts into smaller units which then function as documents in the retrieval process
     ● Makes the retrieval process more precise
     ● May improve retrieval of full documents
     ● The segmentation is crucial for the quality of the retrieval
     → We focus on segmentation strategies
  7. Segmentation Strategies
     ● Regular (window-based)
       ● Segments of equal length with a regular shift
       ● Claimed to be a very effective approach
     ● Similarity-based
       ● Measures similarity between neighbouring segments
     ● Lexical-chain-based
       ● Finds sequences of lexicographically related word occurrences
     ● Feature-based
       ● Employs machine learning methods to detect segment boundaries based on various features
  8. Feature-based Segmentation in Passage Retrieval
  9. Experiments: Task Descriptions
  10. ● MediaEval is a benchmarking initiative dedicated to the development, comparison, and improvement of strategies for processing and retrieving multimedia content.
        ● E.g., speech recognition, multimedia content analysis, music and audio analysis, social networks, geo-coordinates, …
      ● 2013 Similar Segments in Social Speech Task
      ● 2013 Search and Hyperlinking Task
  11. Similar Segments in Social Speech (SSSS) Task
      ● Scenario:
        ● A new member (e.g., a new student) joins a community or organization (e.g., a university) which owns an archive of recorded conversations among its members
        ● The member wants to find information in the archive according to his or her interests
          – The student wants to find more segments similar to the ones he or she is interested in, and browses the archive using hyperlinks in videos
      ● The main goal:
        ● To find segments similar to the given ones
  12. Similar Segments in Social Speech Task Data
      ● Interviews recorded specifically for this purpose (5 hours), between two speakers (university student community)
      ● Divided into training and test data
      ● Manual and ASR transcripts
      ● Manually indicated segments (1886 segments), manually grouped into similarity sets
      ● Query segment: specified by the timestamps of its beginning and end
      ● Queries: constructed by including all words lying within the boundaries of the query segment
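The query construction described on the slide above can be sketched as follows. This is a minimal illustration, not the authors' code: the `Word` type, the `build_query` helper, and the sample transcript are hypothetical, and it assumes each transcript word carries start and end timestamps in seconds and that a word belongs to the query only if it lies entirely inside the query segment.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the beginning of the recording
    end: float


def build_query(words, seg_start, seg_end):
    """Join all words lying entirely within [seg_start, seg_end]."""
    return " ".join(
        w.text for w in words if w.start >= seg_start and w.end <= seg_end
    )


# Hypothetical transcript fragment with word-level timestamps.
transcript = [
    Word("so", 0.0, 0.3),
    Word("which", 0.4, 0.7),
    Word("classes", 0.8, 1.3),
    Word("are", 1.4, 1.6),
    Word("you", 1.7, 1.9),
    Word("taking", 2.0, 2.5),
]

# Query segment given by the timestamps of its beginning and end.
query = build_query(transcript, 0.4, 1.9)
print(query)
```

Whether words that straddle a segment boundary should be included is not stated on the slide; the strict containment test here is one possible choice.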
  13. Search and Hyperlinking (SH) Task
      ● Scenario:
        ● A user wants to find a piece of information relevant to a given query in a collection of TV programmes (Search subtask)
        ● The user then navigates through a large archive using hyperlinks to the retrieved segments (Hyperlinking subtask)
      ● The main goal of the Search subtask:
        ● Find passages relevant to a user's interest, given by a textual query, in a large set of audio-visual recordings
  14. Search and Hyperlinking Task Data
      ● TV programme recordings provided by the BBC (1697 hours)
      ● Subtitles and two ASR transcripts (LIMSI and LIUM)
      ● 4 training and 50 test queries
        ● Query text: e.g. Boris Johnson
        ● Visual cue: e.g. 2 men sitting opposite each other
      ● Metadata, synopsis, cast, detected shots, detected faces, visual concepts
  15. Passage Retrieval Quality Evaluation
      ● Full document retrieval → Mean Reciprocal Rank (MRR)
        – RR = 1 / rank of the first correctly retrieved document
      ● Retrieval of the exact passages → MRRw and MGAP
      ● MRR-window (MRRw)
        – A retrieved starting point counts as correct only if it lies less than 60 seconds from the relevant starting point
      ● Mean Generalized Average Precision (MGAP)
        – The quality of a retrieved starting point is assessed by its distance from the relevant starting point, using a penalty function
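A minimal sketch of MRR and MRRw as defined on the slide above. It assumes one relevant passage per query, represented as a hypothetical `(doc_id, start_time)` pair, and ranked results in the same form; the function names and sample data are illustrative only. MGAP is not sketched because the slide does not specify its penalty function.

```python
WINDOW_SECONDS = 60.0  # MRRw tolerance stated on the slide


def mean_reciprocal_rank(runs, is_hit):
    """runs: list of (ranked_results, relevant) pairs, one per query.

    A query contributes 1 / rank of its first hit, or 0 if there is none.
    """
    total = 0.0
    for ranked, relevant in runs:
        for rank, result in enumerate(ranked, start=1):
            if is_hit(result, relevant):
                total += 1.0 / rank
                break
    return total / len(runs)


def same_document(result, relevant):
    """Hit criterion for full-document retrieval (MRR)."""
    return result[0] == relevant[0]


def within_window(result, relevant):
    """Hit criterion for MRRw: same document, start less than 60 s away."""
    return (
        result[0] == relevant[0]
        and abs(result[1] - relevant[1]) < WINDOW_SECONDS
    )


# Hypothetical runs: (ranked list of (doc_id, start_time), relevant passage).
runs = [
    ([("docA", 500.0), ("docA", 130.0), ("docB", 10.0)], ("docA", 100.0)),
    ([("docC", 40.0), ("docD", 300.0)], ("docD", 290.0)),
]

mrr = mean_reciprocal_rank(runs, same_document)
mrrw = mean_reciprocal_rank(runs, within_window)
print(mrr, mrrw)
```

In the first query the top result is in the right document but 400 seconds from the relevant start, so it counts for MRR but not for MRRw, which only finds a hit at rank 2.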
  16. Experiments: System Description
  17. Baseline System
      ● We employ the Terrier IR toolkit
      ● Hiemstra language model
        ● Parameter set to 0.35 (the importance of a query term in a document)
      ● Stopword removal, stemming
      ● Post-filtering of the answers
        ● Segments partially overlapping with either the query segment or a higher-ranked segment are removed from the list of results
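The post-filtering step on the slide above could look roughly like this. It is a sketch under stated assumptions: segments are hypothetical `(doc_id, start, end)` tuples, and a candidate is compared only against higher-ranked segments that themselves survived filtering. The slide does not say whether already-removed segments should also block lower-ranked ones.

```python
def overlaps(a, b):
    """True if two (doc_id, start, end) segments share a document and time."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def post_filter(ranked, query_segment=None):
    """Drop results overlapping the query segment or a kept higher-ranked one."""
    kept = []
    for seg in ranked:
        if query_segment is not None and overlaps(seg, query_segment):
            continue  # overlaps the query segment itself
        if any(overlaps(seg, higher) for higher in kept):
            continue  # overlaps a higher-ranked result that was kept
        kept.append(seg)
    return kept


# Hypothetical ranked list, best result first.
ranked = [
    ("d1", 0.0, 60.0),
    ("d1", 30.0, 90.0),   # overlaps the first result
    ("d1", 60.0, 120.0),  # only touches the first result, so it is kept
    ("d2", 0.0, 60.0),    # overlaps the query segment
]
filtered = post_filter(ranked, query_segment=("d2", 50.0, 70.0))
print(filtered)
```

Under the alternative reading (compare against every higher-ranked segment, kept or not), the third result would also be removed because it overlaps the second.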
  18. Window-based Segmentation
      ● Segments of equal length with a regular shift
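A minimal sketch of window-based segmentation as described on the slide above. The slide gives neither the unit (time or words) nor the values used, so this assumes time-based windows over hypothetical `(text, start_time)` word pairs, with `length` and `shift` in seconds chosen purely for illustration.

```python
def window_segments(words, length, shift):
    """Cut a transcript into equally long, regularly shifted time windows.

    words: list of (text, start_time) pairs sorted by start_time.
    Returns (window_start, window_end, text) tuples; empty windows are skipped.
    """
    if not words:
        return []
    last_start = words[-1][1]
    segments = []
    i = 0
    while i * shift <= last_start:
        begin = i * shift
        end = begin + length
        texts = [text for text, start in words if begin <= start < end]
        if texts:
            segments.append((begin, end, " ".join(texts)))
        i += 1
    return segments


# Hypothetical transcript: one word every 10 seconds.
words = [("a", 0.0), ("b", 10.0), ("c", 20.0), ("d", 30.0), ("e", 40.0)]
segments = window_segments(words, length=20.0, shift=10.0)
for seg in segments:
    print(seg)
```

With a shift smaller than the length, consecutive windows overlap, so each word ends up in more than one segment.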
  19. Feature-based Segmentation
      ● We identify possible segment boundaries (beginnings and ends)
      ● Model: J48 decision trees
      ● Training data available for the SSSS task
        ● Manually marked segments
      ● Binary classification problem
        ● For each word in the transcripts, we predict whether a segment boundary occurs after this word or not
        ● Classes: segment boundary and segment continuation
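One way to derive the per-word training labels from manually marked segments is sketched below. It is an assumption-laden illustration: `label_words` is a hypothetical helper, word positions are reduced to start times in seconds, and segment beginnings and ends are merged into a single boundary class for brevity. Training the J48 model itself (Weka's C4.5 implementation) is not shown.

```python
import bisect


def label_words(word_starts, segments):
    """Label each word by whether a segment boundary occurs after it.

    word_starts: sorted start times of the transcript words.
    segments: manually marked (begin, end) times.
    A word is a "boundary" if some segment begin or end time falls after
    its start and no later than the start of the next word.
    """
    boundaries = sorted({t for seg in segments for t in seg})
    labels = []
    for i, start in enumerate(word_starts):
        if i + 1 < len(word_starts):
            next_start = word_starts[i + 1]
        else:
            next_start = float("inf")
        # Index of the first boundary time strictly greater than `start`.
        j = bisect.bisect_right(boundaries, start)
        if j < len(boundaries) and boundaries[j] <= next_start:
            labels.append("boundary")
        else:
            labels.append("continuation")
    return labels


# Hypothetical data: six words, two marked segments.
word_starts = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
segments = [(0.0, 2.5), (3.0, 5.5)]
labels = label_words(word_starts, segments)
print(labels)
```

Here the third word is a boundary because the first segment ends (and the second begins) before the fourth word starts, and the last word is a boundary because the second segment ends after it.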
  20. Features
      ● Cue words and tags (n-grams which frequently occur at the boundary, most informative n-grams) for segment beginning and end
        ● Segment beginnings: “I’m”, “the”, “are you”, “you have”, ...
        ● Segment ends: “good”, “interesting”, “lot”, ...
      ● Letter case
      ● Length of the silence before the word
      ● Division given in transcripts (e.g., speech segments defined in the LIMSI transcripts)
      ● The output of the TextTiling algorithm
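Some of the features listed above can be sketched as a per-word feature dictionary. This is only an illustration: the cue lists contain just the examples quoted on the slide, the feature names and the `word_features` helper are hypothetical, tokens are assumed to be `(text, start, end)` triples, and the transcript-division and TextTiling features are left out.

```python
# Cue n-grams quoted on the slide; the full lists are not given there.
BEGIN_CUES = {"i'm", "the", "are you", "you have"}
END_CUES = {"good", "interesting", "lot"}


def word_features(tokens, i):
    """Features for deciding whether a boundary occurs after tokens[i].

    tokens: list of (text, start, end) triples with times in seconds.
    """
    text, start, _ = tokens[i]
    following = [t[0].lower() for t in tokens[i + 1:i + 3]]
    next_unigram = following[0] if following else ""
    next_bigram = " ".join(following) if len(following) == 2 else ""
    return {
        # The current word is a typical segment-final cue.
        "ends_with_cue": text.lower() in END_CUES,
        # The following word or word pair is a typical segment-initial cue.
        "next_begins_with_cue": (
            next_unigram in BEGIN_CUES or next_bigram in BEGIN_CUES
        ),
        # Letter case of the current word.
        "is_capitalized": text[:1].isupper(),
        # Silence between the previous word's end and this word's start.
        "silence_before": round(start - tokens[i - 1][2], 3) if i > 0 else 0.0,
    }


# Hypothetical transcript fragment.
tokens = [
    ("That's", 0.0, 0.4),
    ("interesting", 0.5, 1.1),
    ("Are", 2.0, 2.2),
    ("you", 2.3, 2.5),
    ("a", 2.6, 2.7),
    ("student", 2.8, 3.3),
]
features = word_features(tokens, 1)
print(features)
```

For the word "interesting", both cue features fire: it is a listed segment-end cue, and it is followed by the segment-beginning cue "are you".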
  21. Feature-based Segmentation Approaches
  22. Experiments: Results
  23. Similar Segments in Social Speech Task: Evaluation
      ● The best results are obtained by feature-based segmentation into overlapping segments
      ● The manual gold-standard segmentation is outperformed by feature-based segmentation (MRRw score on the manual transcripts)
      ● Results on the manual transcripts are significantly better in all scores
  24. Segmentation Model in the SH Task
      ● The training set used in the SH Search subtask is very small
      ● We apply the SSSS-trained models in the SH task
      ● This allows us to examine the possibility of creating a universal model for feature-based segmentation
      ● Potential problems:
        ● Different vocabulary (student dialogues vs. TV programmes)
        ● Different ASR systems may prefer different vocabulary
        ● Different distribution of silence, different document structure
  25. SH Task Evaluation
      ● The results are not as consistent as for the SSSS task
      ● They depend on the type of transcript
      ● Feature-based approaches creating overlapping segments are effective when applied to the subtitles
  26. Conclusion
  27. Conclusion
      ● Information Retrieval, focus on speech data (Speech Retrieval)
      ● Focus on retrieval of exact relevant passages
      ● Importance of segmentation
      ● Experiments in the MediaEval benchmark
        ● Similar Segments in Social Speech Task (university student dialogues) and Search and Hyperlinking Task (BBC programmes)
      ● We applied window-based segmentation and three types of feature-based segmentation
  28. Conclusion cont.
      ● Feature-based segmentation applied in the two tasks outperformed regular segmentation
        ● Regular segmentation is claimed to be a very effective approach
      ● The improvement in the SSSS Task was statistically significant on the manual (MRRw and MGAP measures) and ASR (MGAP measure) transcripts
      ● The results in the SH task were not so conclusive
        ● Some of the results (on the subtitles) are encouraging
  29. Thank you
      This research has been supported by the project AMALACH (grant n. DF12P01OVV022 of the program NAKI of the Ministry of Culture of the Czech Republic), the Czech Science Foundation (grant n. P103/12/G084), and the Charles University Grant Agency (grant n. 920913).
