Centre for Digital Video Processing







            Towards Methods for Efficient Access
                    to Spoken Content
                    in the AMI Corpus
                                                                 Gareth J. F. Jones
                                                                  Maria Eskevich
                                                                  Ágnes Gyarmati

                                                        Centre for Digital Video Processing
                                                              School of Computing
                                                          Dublin City University, Ireland


(gjones, meskevich, agyarmati @computing.dcu.ie)
                Outline




                •  Issues
                •  AMI corpus
                •  Pre-processing
                •  Experiment and Results
                •  Future work




                Issues: types of Spoken Content




                        –  News broadcast:
                                 •  Structured
                                 •  Clearly articulated speech
                        -> Standard text document retrieval task on the
                           ASR transcript

                        –  Other types of speech (meetings, lectures):
                                 •  Lack of clearly defined document form/structure
                                 •  Informal style, cross-talk, noisy environment
                        -> We have to define:
                                 •  Search units
                                 •  Location of relevant items


                Issues: Existing Research




                •  Speech Search:
                        –  TV and radio news: Spoken Document Retrieval
                           (SDR) task at TREC (2000)
                        –  Interviews: Malach Collection (2007)
                        –  AMI (Augmented Multi-party Interaction) corpus
                •  Recognition WER and Retrieval:
                        –  Low recognition error level:
                                 •  little loss in retrieval effectiveness (2000)
                                 •  documents are retrieved at higher ranks (2003, 2007)
                        –  Specific metrics (semantic impact of substitutions):
                                 •  correlation with retrieval performance
                                    (AMI Corpus, 2009)




                Issues




                •  Goal:
                        –  Investigate how the difference in accuracy
                           between manual and automatic transcripts
                           influences retrieval effectiveness on the
                           AMI Corpus


                •  Experiment:
                        –  Segmentation of spoken content
                        –  Known-item search task using slides from
                           meetings as queries



                AMI Corpus




                •  100 hours
                •  Each meeting approximately 30
                   minutes
                •  Simulating project meetings
                •  4-5 participants
                •  Headset and circular microphones
                •  Automatic and manual transcripts
                   available
                •  Additional data (slides, minutes)


                Pre-processing: segmentation




                •  Linear segmentation (C99 algorithm):
                        –  Based on cosine similarity between
                           sequential sentences
                        –  Boundaries are inserted between sentences
                           where their (stemmed) vocabulary differs


                •  Time segmentation
                   (approximately 90 seconds)
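
The two segmentation strategies above can be illustrated with a minimal sketch. It is not the C99 algorithm itself (C99 builds a rank-transformed similarity matrix and applies divisive clustering); it only shows the underlying idea of placing a boundary where adjacent sentences share little vocabulary, plus fixed-length time windows. The function names, the `threshold` value, and the `(start_time, word)` input format are assumptions made for illustration, and stemming is omitted.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexical_boundaries(sentences, threshold=0.1):
    """Return indices i where a boundary falls between sentence i and i + 1.

    Simplified stand-in for C99: compares adjacent sentences only.
    The threshold is a hypothetical value, not taken from the slides.
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    return [i for i in range(len(bags) - 1)
            if cosine(bags[i], bags[i + 1]) < threshold]


def time_segments(words, window=90.0):
    """Group (start_time_in_seconds, word) pairs into fixed-length windows."""
    segments = {}
    for start, word in words:
        segments.setdefault(int(start // window), []).append(word)
    return [segments[k] for k in sorted(segments)]
```

With a 90-second window, a word starting at 95 seconds lands in the second segment; the lexical variant instead depends entirely on the words recognized, so ASR errors can move its boundaries.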



                Pre-processing: segmentation




                        •  Number of segments

                           Type of transcript     Linear segmentation (C99)
                           Manual transcript                           2678
                           ASR transcript                              3831

                        •  Average number of words per segment

                           Type of transcript     Linear segmentation (C99)
                           Manual transcript                            320
                           ASR transcript                               221



                Pre-processing: Word Recognition Rate (WRR)




                1.  Alignment between ASR and manual
                    transcripts

                2.  Recognition rate calculation:
                    the number of correctly recognized words
                    in the meeting divided by the total number
                    of words in the transcript

                3.  Recognition rate without stop words
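
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' tooling: the alignment uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever aligner was actually used, the denominator is assumed to be the word count of the manual transcript, and `STOP_WORDS` is a tiny placeholder list.

```python
from difflib import SequenceMatcher

# Tiny illustrative stop word list (placeholder, not the list used in the study).
STOP_WORDS = frozenset({"the", "a", "is", "of", "to", "and"})


def word_recognition_rate(manual, asr, stop_words=frozenset()):
    """Fraction of manual-transcript words matched in the ASR output.

    Words in `stop_words` are dropped from both transcripts before alignment.
    """
    ref = [w for w in manual.lower().split() if w not in stop_words]
    hyp = [w for w in asr.lower().split() if w not in stop_words]
    if not ref:
        return 0.0
    # Step 1: align the two word sequences.
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    # Step 2: count aligned (correctly recognized) words.
    correct = sum(block.size for block in matcher.get_matching_blocks())
    return correct / len(ref)
```

Calling the function once with an empty `stop_words` set and once with `STOP_WORDS` gives the rate with and without stop words for the same pair of transcripts.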

                Relation between segmentation and recognition rate




                Pre-processing: cross-segmentation




                Experiment: slides and relevant segments selection








                                          Number of relevant segments
                Type of     Number of     with segmentation based on
                queries     queries       ASR transcript     Manual transcript

                Min                15                 56                    49
                Max                24                 68                    39
                Random             25                 36                    42



                Experiment: Indexing & Retrieval Setup




                •  Indri language model of the open-source
                   Lemur Toolkit
                   (http://www.lemurproject.org/):
                        –  texts are stemmed using Lemur's built-in
                           Porter stemmer


                •  Stopword list provided by Snowball
                   (http://snowball.tartarus.org/)
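
To make the setup above more concrete, here is a minimal sketch of language-model retrieval over segments: stop words are removed and each segment is scored by Dirichlet-smoothed query likelihood. It does not call Indri or Lemur and is not their implementation; `rank_segments`, the placeholder `STOP_WORDS` set, and the value `mu=2500.0` (a commonly used smoothing value) are assumptions for illustration, and Porter stemming is left out to keep the example self-contained.

```python
import math
from collections import Counter

# Placeholder stop word list; the experiments used the Snowball list instead.
STOP_WORDS = frozenset({"the", "of", "a", "and", "to", "in"})


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words (no stemming here)."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def rank_segments(query, segments, mu=2500.0):
    """Rank segment ids by Dirichlet-smoothed query log-likelihood.

    `segments` maps a segment id to its transcript text.
    """
    docs = {sid: Counter(tokenize(text)) for sid, text in segments.items()}
    collection = Counter()
    for bag in docs.values():
        collection.update(bag)
    total = sum(collection.values())
    if total == 0:
        return list(docs)
    scores = {}
    for sid, bag in docs.items():
        length = sum(bag.values())
        score = 0.0
        for term in tokenize(query):
            p_coll = collection[term] / total
            if p_coll == 0.0:
                continue  # term never occurs in the collection: ignore it
            score += math.log((bag[term] + mu * p_coll) / (length + mu))
        scores[sid] = score
    return sorted(scores, key=scores.get, reverse=True)
```

In the known-item task, the text of a slide would play the role of `query` and the transcript segments the role of `segments`.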




                Results: at rank 100

            •  Recall at rank 100

            •  Mean Reciprocal Rank at rank 100
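
For reference, the two measures named above can be computed as in the following sketch. It assumes each query comes with a ranked list of unique segment ids and a set of relevant segment ids; the function names and the data layout are illustrative, not taken from the original evaluation scripts.

```python
def recall_at_k(ranked, relevant, k=100):
    """Share of the relevant segments that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for sid in ranked[:k] if sid in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked, relevant, k=100):
    """1 / rank of the first relevant segment in the top k, else 0."""
    for rank, sid in enumerate(ranked[:k], start=1):
        if sid in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs, k=100):
    """Average reciprocal rank over (ranked, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel, k) for r, rel in runs) / len(runs)
```

A query whose first relevant segment is returned at rank 3 contributes 1/3 to the mean; a query with no relevant segment in the top 100 contributes 0.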




                Problems




                •  Errors in the ASR output

                •  Common knowledge of the participants
                   of the meeting -> some words are not
                   spoken

                •  All parts of the meetings are indexed in
                   the same way

                •  Retrieval algorithm favours longer
                   segments

                Future work




            •  Construct proper segment-based
               relevance set for the slides

            •  Analysis of the influence of ASR errors
               on segmentation

            •  ASR transcript improvement




                Thank You
                                    Thank you for your attention!
                                            Questions?




Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Towards Methods for Efficient Access to Spoken Content in the AMI Corpus (SSCS 2010)

  • 1. Centre for Digital Video Processing
      Towards Methods for Efficient Access to Spoken Content in the AMI Corpus
      Gareth J. F. Jones, Maria Eskevich, Ágnes Gyarmati
      Centre for Digital Video Processing, School of Computing, Dublin City University, Ireland
      (gjones, meskevich, agyarmati @computing.dcu.ie)
  • 2. Outline
      •  Issues
      •  AMI corpus
      •  Pre-processing
      •  Experiment and Results
      •  Future work
  • 3. Outline
      •  Issues
      •  AMI corpus
      •  Pre-processing
      •  Experiment and Results
      •  Future work
  • 4. Issues: types of Spoken Content
      –  News broadcast:
          •  Structured
          •  Clearly articulated speech
          -> standard text document retrieval task on ASR transcript
      –  Other types of speech (meetings, lectures):
          •  Lack of clearly defined document form/structure
          •  Informal style, cross-talk, noisy environment
          -> We have to define:
          •  Search units
          •  Location of relevant items
  • 5. Issues: Existing Research
      •  Speech Search:
          –  TV and radio news: Spoken Document Retrieval (SDR) task at TREC (2000)
          –  Interviews: Malach Collection (2007)
          –  AMI (Augmented Multi-party Interaction) corpus
      •  Recognition WER and Retrieval:
          –  Low recognition error level:
              •  little loss in retrieval effectiveness (2000)
              •  documents are retrieved at higher ranks (2003, 2007)
          –  Specific metrics (semantic impact of substitutions):
              •  correlation with retrieval performance (AMI Corpus, 2009)
  • 6. Issues
      •  Goal:
          –  Investigate how the difference in accuracy between manual and automatic transcripts influences retrieval effectiveness on the AMI Corpus
      •  Experiment:
          –  Segmentation of spoken content
          –  Known-item search task using slides from meetings as queries
  • 7. Outline
      •  Issues
      •  AMI corpus
      •  Pre-processing
      •  Experiment and Results
      •  Future work
  • 8. AMI Corpus
      •  100 hours
      •  Each meeting approximately 30 minutes
      •  Simulating project meetings
      •  4-5 participants
      •  Headset and circular microphones
      •  Automatic and manual transcripts available
      •  Additional data (slides, minutes)
  • 9. Outline
      •  Issues
      •  AMI corpus
      •  Pre-processing
      •  Experiment and Results
      •  Future work
  • 10. Pre-processing: segmentation
      •  Linear segmentation (C99 algorithm):
          Algorithm based on cosine similarity between sequential sentences
          Boundaries inserted between sentences based on the difference of lexical inventory (stemmed)
      •  Time segmentation (approximately 90 seconds)
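The two ingredients named on slide 10 can be illustrated with a minimal sketch. It is not the C99 algorithm itself (C99 additionally ranks the similarity matrix and clusters divisively); it only shows fixed-window time segmentation and the cosine similarity that lexical segmentation starts from. The function names, the `(start_time, token)` input format and the sample words are assumptions made for illustration, and stemming is omitted.

```python
from collections import Counter
from math import sqrt

def time_segments(words, window=90.0):
    """Group (start_time_seconds, token) pairs into fixed-length time windows."""
    segments = {}
    for start, token in words:
        segments.setdefault(int(start // window), []).append(token)
    return [segments[k] for k in sorted(segments)]

def cosine(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

# Hypothetical timestamped tokens, not taken from the AMI Corpus.
words = [(0.5, "remote"), (12.0, "control"), (95.0, "battery"),
         (170.0, "design"), (181.0, "remote")]
segs = time_segments(words)
```

A low cosine value between neighbouring units signals a change in lexical inventory, which is where a lexical segmenter would consider placing a boundary.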
  • 11. Pre-processing: segmentation
      •  Number of segments
          Type of transcript      Linear segmentation (C99)
          Manual transcript       2678
          ASR transcript          3831
      •  Average number of words per segment
          Type of transcript      Linear segmentation (C99)
          Manual transcript       320
          ASR transcript          221
  • 12. Pre-processing: Word Recognition Rate (WRR)
      1.  Alignment between ASR and manual transcripts
      2.  Recognition rate count
          Recognition rate – number of correctly recognized words in the meeting divided by the total number of words in the transcript
      3.  Recognition rate without stop words
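The recognition-rate definition on slide 12 can be sketched as follows. This is a rough illustration under several assumptions: `difflib.SequenceMatcher` stands in for whatever alignment tool was actually used (it does not guarantee a minimum-edit-distance alignment), the denominator is taken to be the length of the manual transcript, and the stop-word variant is assumed to remove stop words from both transcripts before alignment. The function name and sample sentences are hypothetical.

```python
from difflib import SequenceMatcher

def word_recognition_rate(manual, asr, stopwords=frozenset()):
    """Correctly recognized words divided by the number of reference words."""
    # Assumption: stop words are dropped from both sides before aligning.
    ref = [w for w in manual if w not in stopwords]
    hyp = [w for w in asr if w not in stopwords]
    if not ref:
        return 0.0
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    # Each matching block is a run of words identical in both transcripts.
    correct = sum(block.size for block in matcher.get_matching_blocks())
    return correct / len(ref)

# Hypothetical utterance, not taken from the AMI Corpus.
manual = "we need a new remote control".split()
asr = "we need the new remote controls".split()

wrr_all = word_recognition_rate(manual, asr)
wrr_no_stop = word_recognition_rate(manual, asr, stopwords=frozenset({"a", "the"}))
```

Here four of the six reference words are matched, and four of five once the two stop words are removed, so dropping stop words raises the rate in this example.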
  • 13. Relation between segmentation and recognition rate
  • 14. Pre-processing: cross-segmentation
  • 15. Outline
      •  Issues
      •  AMI corpus
      •  Pre-processing
      •  Experiment and Results
      •  Future work
  • 16. Experiment: slides and relevant segments selection
  • 17. Experiment: slides and relevant segments selection
      Number of relevant segments with segmentation based on the ASR transcript or the manual transcript:
          Type of queries    Number of queries    ASR transcript    Manual transcript
          Min                15                   56                49
          Max                24                   68                39
          Random             25                   36                42
  • 18. Experiment: Indexing & Retrieval Setup
      •  Indri language model of the open-source Lemur Toolkit (http://www.lemurproject.org/):
          –  texts are stemmed using Lemur's built-in Porter stemmer
      •  Stopword list provided by Snowball (http://snowball.tartarus.org/)
  • 19. Results: at rank 100
      •  Recall at rank 100:
      •  Mean Reciprocal Rank at rank 100:
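For reference, the two measures named on slide 19 can be computed as in the sketch below, using their standard definitions with a cutoff at rank 100. The function names, segment identifiers and example runs are hypothetical and are not results from the presentation.

```python
def recall_at_k(ranked, relevant, k=100):
    """Fraction of the relevant segments that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)

def reciprocal_rank(ranked, relevant, k=100):
    """1 / rank of the first relevant segment in the top k, or 0 if none is found."""
    for rank, segment in enumerate(ranked[:k], start=1):
        if segment in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(runs, k=100):
    """Average reciprocal rank over (ranked_list, relevant_set) pairs, one per query."""
    return sum(reciprocal_rank(r, rel, k) for r, rel in runs) / len(runs)

# Two hypothetical slide queries with made-up segment identifiers.
runs = [
    (["s3", "s1", "s7"], {"s1", "s9"}),  # first relevant segment at rank 2
    (["s2", "s4"], {"s2"}),              # first relevant segment at rank 1
]
mrr = mean_reciprocal_rank(runs)
recalls = [recall_at_k(r, rel) for r, rel in runs]
```

Recall rewards finding every relevant segment for a slide, while the reciprocal rank only reflects how early the first relevant segment appears, so the two can diverge when a query has many relevant segments.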
  • 20. Outline
      •  Issues
      •  AMI corpus
      •  Pre-processing
      •  Experiment
      •  Results
      •  Future work
  • 21. Problems
      •  Errors in the ASR output
      •  Common knowledge of the participants of the meeting -> some words are not spoken
      •  All parts of the meetings are indexed in the same way
      •  Retrieval algorithm favours longer segments
  • 22. Future work
      •  Construct a proper segment-based relevance set for the slides
      •  Analysis of the influence of ASR errors on segmentation
      •  ASR transcript improvement
  • 23. Thank you for your attention! Questions?