Towards Methods for Efficient Access to Spoken Content in the AMI Corpus (SSCS 2010)
1. Towards Methods for Efficient Access to Spoken Content in the AMI Corpus
Gareth J. F. Jones
Maria Eskevich
Ágnes Gyarmati
Centre for Digital Video Processing
School of Computing
Dublin City University, Ireland
(gjones, meskevich, agyarmati @computing.dcu.ie)
2. Outline
• Issues
• AMI corpus
• Pre-processing
• Experiment and Results
• Future work
4. Issues: Types of Spoken Content
– News broadcast:
• Structured
• Clearly articulated speech
-> standard text document retrieval task on the ASR transcript
– Other types of speech (meetings, lectures):
• Lack of clearly defined document form/structure
• Informal style, cross-talk, noisy environment
-> We have to define:
• Search units
• Location of relevant items
5. Issues: Existing Research
• Speech Search:
– TV and radio news: Spoken Document Retrieval (SDR) task at TREC (2000)
– Interviews: Malach Collection (2007)
– AMI (Augmented Multi-party Interaction) corpus
• Recognition word error rate (WER) and retrieval:
– Low recognition error level:
• little loss in retrieval effectiveness (2000)
• documents are retrieved at higher ranks (2003, 2007)
– Specific metrics (semantic impact of substitutions):
• correlation with retrieval performance (AMI Corpus, 2009)
6. Issues
• Goal:
– Investigate how the difference in accuracy between manual and automatic transcripts influences retrieval effectiveness on the AMI Corpus
• Experiment:
– Segmentation of spoken content
– Known-item search task using slides from meetings as queries
8. AMI Corpus
• 100 hours
• Each meeting lasts approximately 30 minutes
• Simulated project meetings
• 4-5 participants
• Headset and circular microphones
• Automatic and manual transcripts available
• Additional data (slides, minutes)
10. Pre-processing: segmentation
• Linear segmentation (C99 algorithm):
Algorithm based on the cosine similarity of sequential sentences
Boundaries are inserted between sentences based on the difference in their (stemmed) lexical inventory
• Time segmentation (approximately 90 seconds)
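The two segmentation strategies on this slide can be sketched roughly as follows. This is a minimal illustration and not the C99 implementation: it only computes the cosine similarity between adjacent sentences (the quantity C99 starts from) and, as a simplifying assumption, places a boundary wherever that similarity drops below a fixed threshold, whereas C99 itself ranks a full sentence similarity matrix and clusters it. The function names, the threshold value and the (start time, word) input format are hypothetical, and stemming is omitted.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def adjacent_similarities(sentences):
    """Cosine similarity of each sentence with the one that follows it."""
    bags = [Counter(s.lower().split()) for s in sentences]
    return [cosine(bags[i], bags[i + 1]) for i in range(len(bags) - 1)]


def lexical_boundaries(sentences, threshold=0.1):
    """Indices i such that a boundary is placed after sentence i.

    Assumption: a fixed threshold stands in for C99's rank-based clustering.
    """
    sims = adjacent_similarities(sentences)
    return [i for i, s in enumerate(sims) if s < threshold]


def time_segments(timed_words, window=90.0):
    """Group (start_time_in_seconds, word) pairs into fixed-length windows."""
    segments = {}
    for start, word in timed_words:
        segments.setdefault(int(start // window), []).append(word)
    return [segments[k] for k in sorted(segments)]
```

With a 90-second window, words starting at 0 s and 45 s fall into the first segment and a word starting at 95 s opens the second one; windows that contain no words produce no segment.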
11. Pre-processing: segmentation
• Number of segments
Type of transcript    Linear segmentation (C99)
Manual transcript     2678
ASR transcript        3831
• Average number of words per segment
Type of transcript    Linear segmentation (C99)
Manual transcript     320
ASR transcript        221
12. Pre-processing: Word Recognition Rate (WRR)
1. Alignment between ASR and manual transcripts
2. Recognition rate count
Recognition rate: the number of correctly recognized words in the meeting divided by the total number of words in the transcript
3. Recognition rate without stop words
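The three steps above can be sketched as below. The slides do not say which alignment tool was used, so Python's standard `difflib.SequenceMatcher` stands in for it here; a scoring tool based on edit-distance alignment may count matches slightly differently. Taking the manual transcript as the denominator is also an assumption, since the slide only says "the transcript". The function name and the stop-word set passed in are hypothetical.

```python
from difflib import SequenceMatcher


def word_recognition_rate(reference, hypothesis, stop_words=frozenset()):
    """Fraction of reference words that the alignment matches in the hypothesis.

    reference:  manual transcript text (assumed to be the denominator)
    hypothesis: ASR transcript text
    Stop words are removed from both sides before alignment.
    """
    ref = [w for w in reference.lower().split() if w not in stop_words]
    hyp = [w for w in hypothesis.lower().split() if w not in stop_words]
    if not ref:
        return 0.0
    # autojunk=False keeps frequent words from being ignored on long inputs.
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    correct = sum(block.size for block in matcher.get_matching_blocks())
    return correct / len(ref)
```

Calling the function once with an empty stop-word set and once with a stop-word list gives the two rates named in steps 2 and 3.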
13. Relation between segmentation and recognition rate
14. Pre-processing: cross-segmentation
16. Experiment: slides and relevant segments selection
17. Experiment: slides and relevant segments selection
Number of relevant segments, by the transcript the segmentation is based on:
Type of queries    Number of queries    ASR transcript    Manual transcript
Min                15                   56                49
Max                24                   68                39
Random             25                   36                42
18. Experiment: Indexing & Retrieval Setup
• Indri language model of the open source Lemur Toolkit (http://www.lemurproject.org/):
– texts are stemmed using Lemur's built-in Porter stemmer
• Stopword list provided by Snowball (http://snowball.tartarus.org/)
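As a rough illustration of language-model retrieval over transcript segments, the sketch below ranks segments by Dirichlet-smoothed query likelihood. It is not Indri's code or API, and the slides do not state which smoothing settings were used: the function name, the value of `mu` and the whitespace tokenisation are assumptions, and stemming and stop-word removal are left out.

```python
import math
from collections import Counter


def rank_segments(query, segments, mu=2000.0):
    """Rank segment ids by Dirichlet-smoothed query log-likelihood.

    segments: dict mapping a segment id to its transcript text.
    """
    docs = {sid: Counter(text.lower().split()) for sid, text in segments.items()}
    collection = Counter()
    for bag in docs.values():
        collection.update(bag)
    total = sum(collection.values())

    scores = {}
    for sid, bag in docs.items():
        length = sum(bag.values())
        score = 0.0
        for term in query.lower().split():
            p_coll = collection[term] / total
            if p_coll == 0.0:
                continue  # term never occurs in the collection: skip it
            score += math.log((bag[term] + mu * p_coll) / (length + mu))
        scores[sid] = score
    return sorted(scores, key=scores.get, reverse=True)
```

In the known-item setting described earlier, the text of a slide would play the role of `query` and the transcript segments would be the documents being ranked.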
19. Results: at rank 100
• Recall at rank 100:
• Mean Reciprocal Rank at rank 100:
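For reference, the two measures named on this slide can be computed as in the sketch below, with the ranked list cut off at 100. The function names and the (ranking, relevant set) input format are hypothetical; the values reported in the talk are not reproduced here.

```python
def reciprocal_rank(ranking, relevant, cutoff=100):
    """1/rank of the first relevant item within the cutoff, else 0.0."""
    for rank, item in enumerate(ranking[:cutoff], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def recall_at(ranking, relevant, cutoff=100):
    """Fraction of the relevant items that appear within the cutoff."""
    if not relevant:
        return 0.0
    return len(set(ranking[:cutoff]) & set(relevant)) / len(relevant)


def mean_over_queries(metric, runs, cutoff=100):
    """Average a per-query metric over (ranking, relevant) pairs."""
    return sum(metric(r, rel, cutoff) for r, rel in runs) / len(runs)
```

Mean Reciprocal Rank is then `mean_over_queries(reciprocal_rank, runs)`, and mean recall at rank 100 is `mean_over_queries(recall_at, runs)`.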
21. Problems
• Errors in the ASR output
• Participants share common knowledge, so some words are never spoken
• All parts of the meetings are indexed in the same way
• Retrieval algorithm favours longer segments
22. Future work
• Construct a proper segment-based relevance set for the slides
• Analyse the influence of ASR errors on segmentation
• Improve the ASR transcripts
23. Thank you for your attention!
Questions?