DCU at the NTCIR-9 SpokenDoc Passage Retrieval Task

We describe details of our runs and the results obtained for the "IR for Spoken Documents (SpokenDoc) Task" at NTCIR-9. The focus of our participation in this task was the investigation of segmentation methods to divide the manual and ASR transcripts into topically coherent segments. The underlying assumption of this approach is that these segments will capture passages in the transcript relevant to the query. Our experiments investigate two lexical-coherence-based segmentation algorithms (TextTiling, C99). These are run on the provided manual and ASR transcripts, and on the ASR transcript with stop words removed. Evaluation of the results shows that TextTiling consistently performs better than C99, both in segmenting the data into retrieval units, as evaluated using the centre-located relevant information metric, and in having higher content precision in each automatically created segment.

  1. DCU at the NTCIR-9 SpokenDoc Passage Retrieval Task. Maria Eskevich, Gareth J.F. Jones. Centre for Digital Video Processing, Centre for Next Generation Localisation, School of Computing, Dublin City University, Ireland. December 8, 2011.
  2. Outline: Retrieval Methodology (Transcript Preprocessing; Text Segmentation; Retrieval Setup); Results (Official Metrics: uMAP, pwMAP, fMAP); Conclusions.
  3–7. Retrieval Methodology (diagram, built up step by step across five slides): Transcript → Segmentation → Topically Coherent Segments → Indexing → Indexed Segments; Queries → Information Request; the Indexed Segments and the Information Request both feed Retrieval, which produces the Retrieval Results.
  8–12. 6 Retrieval Runs (built up step by step): three transcript types (Manual Transcript; 1-best ASR; 1-best ASR with stop words removed) crossed with two segmentation algorithms (C99; TextTiling), giving the runs Manual C99, Manual TT, ASR C99, ASR TT, ASR NSW C99, ASR NSW TT.
  13–15. Transcript Preprocessing (a sketch follows below):
     • Recognize the individual morphemes of the sentences: ChaSen 2.4.0, based on the Japanese morphological analyzer JUMAN 2.0 with ipadic grammar 2.7.0
     • Form the text out of the base forms of the words in order to avoid stemming
     • Remove stop words (SpeedBlog Japanese stop-word list) for one of the runs
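A minimal sketch of this pipeline, assuming a ChaSen-style command-line analyzer that prints one tab-separated morpheme per line with the base form in the third field and an EOS marker after each sentence. The field order and the tiny stop-word set are illustrative assumptions, not the authors' actual configuration (the runs used the SpeedBlog Japanese stop-word list):

```python
# Sketch only: assumes a `chasen`-style CLI whose output has the base form
# in the third tab-separated field and marks sentence ends with "EOS".
import subprocess

STOP_WORDS = {"の", "に", "は", "を", "た"}  # placeholder; the runs used the SpeedBlog list

def to_base_forms(sentence: str, remove_stop_words: bool = False) -> list[str]:
    """Replace each word by its base form (avoids stemming); optionally drop stop words."""
    out = subprocess.run(["chasen"], input=sentence, capture_output=True,
                         text=True, check=True).stdout
    tokens = []
    for line in out.splitlines():
        if line == "EOS":                    # end-of-sentence marker
            continue
        fields = line.split("\t")
        base = fields[2] if len(fields) > 2 else fields[0]
        if remove_stop_words and base in STOP_WORDS:
            continue
        tokens.append(base)
    return tokens
```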
  16. Text Segmentation: algorithms originally developed for text are used, with individual IPUs treated as sentences.
     • TextTiling: cosine similarities between adjacent blocks of sentences (see the sketch below)
     • C99: compute the similarity between sentences using a cosine similarity measure to form a similarity matrix; cosine scores are replaced by the rank of the score in the local region; segmentation points are assigned using a clustering procedure
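A minimal sketch of the TextTiling idea, not the authors' implementation: adjacent fixed-size blocks of IPUs are compared with cosine similarity, and boundaries are placed where the similarity curve has a sufficiently deep valley. The block size, the simplified depth score over immediate neighbours, and the cutoff value are illustrative assumptions:

```python
# TextTiling-style boundary detection over IPUs (each IPU is a token list).
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def texttile(ipus: list[list[str]], block: int = 6, depth_cutoff: float = 0.1) -> list[int]:
    """Return IPU indices at which a new topically coherent segment starts."""
    sims = []
    for i in range(block, len(ipus) - block):
        left = Counter(t for s in ipus[i - block:i] for t in s)
        right = Counter(t for s in ipus[i:i + block] for t in s)
        sims.append(cosine(left, right))       # similarity across the gap at i
    boundaries = []
    for j in range(1, len(sims) - 1):
        # simplified depth score: how far this valley sits below its neighbours
        depth = (sims[j - 1] - sims[j]) + (sims[j + 1] - sims[j])
        if depth > depth_cutoff:
            boundaries.append(j + block)
    return boundaries
```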
  17–20. Retrieval Setup: the SMART information retrieval system, extended to use language modelling with a uniform document prior probability. A query q is scored against a document d within the SMART framework as

         P(q|d) = \prod_{i=1}^{n} ( \lambda_i P(q_i|d) + (1 - \lambda_i) P(q_i) )

     where q = (q_1, ..., q_n) is a query comprising n query terms, P(q_i|d) is the probability of generating the i-th query term from a given document d, estimated by maximum likelihood, and P(q_i) is the probability of generating it from the collection, estimated by document frequency. The retrieval model used \lambda_i = 0.3 for all q_i, this value having been optimized on the TREC-8 ad hoc dataset (see the scoring sketch below).
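A direct transcription of this scoring formula into runnable form. The SMART extension itself is in C and not shown here, so this Python rendering is illustrative only, with maximum-likelihood P(q_i|d), document-frequency-based P(q_i), and λ_i = 0.3 as in the runs:

```python
# Linearly smoothed query-likelihood scoring:
#   P(q|d) = prod_i ( lam * P(q_i|d) + (1 - lam) * P(q_i) )
from collections import Counter

def score(query: list[str], doc: list[str], df: dict[str, int], n_docs: int,
          lam: float = 0.3) -> float:
    tf = Counter(doc)
    p = 1.0
    for term in query:
        p_doc = tf[term] / len(doc)           # maximum-likelihood estimate P(q_i|d)
        p_coll = df.get(term, 0) / n_docs     # collection estimate P(q_i) from document frequency
        p *= lam * p_doc + (1 - lam) * p_coll
    return p
```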
  21. Outline (section divider): Results (Official Metrics: uMAP, pwMAP, fMAP).
  22–24. Results: Official Metrics

         Transcript type   Segmentation type   uMAP     pwMAP    fMAP
         BASELINE          -                   0.0670   0.0520   0.0536
         manual            tt                  0.0859   0.0429   0.0500
         manual            C99                 0.0713   0.0209   0.0168
         ASR               tt                  0.0490   0.0329   0.0308
         ASR               C99                 0.0469   0.0166   0.0123
         ASR nsw           tt                  0.0312   0.0141   0.0174
         ASR nsw           C99                 0.0316   0.0138   0.0120

     • Only the runs on the manual transcript scored higher than the baseline, and only on the uMAP metric
     • TextTiling results are consistently higher than C99 on all metrics for the manual and ASR runs
  25–27. Time-based Results Assessment Approach (built up step by step): for each run and each query, the ranked list of retrieved passages is inspected; for every passage containing relevant content,

         Precision = (length of the relevant part) / (length of the whole passage)

     and these precision values are averaged per query over the ranks whose passages contain relevant content (see the sketch below).
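A small sketch of this computation, under the assumption that passages and relevant regions are given as (start, end) time intervals in seconds; the interval representation and the zero-overlap filter are illustrative choices:

```python
# Time-based precision: relevant duration inside a passage / passage duration,
# averaged per query over the ranks whose passages contain relevant content.
def passage_precision(passage: tuple[float, float],
                      relevant: list[tuple[float, float]]) -> float:
    start, end = passage
    overlap = sum(max(0.0, min(end, r_end) - max(start, r_start))
                  for r_start, r_end in relevant)
    return overlap / (end - start)

def average_precision_per_query(ranked: list[tuple[float, float]],
                                relevant: list[tuple[float, float]]) -> float:
    scores = [p for p in (passage_precision(psg, relevant) for psg in ranked) if p > 0]
    return sum(scores) / len(scores) if scores else 0.0
```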
  28–29. [Chart: Average of Precision for all passages with relevant content.] The TextTiling algorithm has a higher average precision for all transcript types, i.e. the topically coherent segments are better located.
  30–33. Results: utterance-based MAP (uMAP)

         Transcript type   Segmentation type   uMAP
         BASELINE          -                   0.0670
         manual            tt                  0.0859
         manual            C99                 0.0713
         ASR               tt                  0.0490
         ASR               C99                 0.0469
         ASR nsw           tt                  0.0312
         ASR nsw           C99                 0.0316

     • The trend 'manual > ASR > ASR nsw' seen for both C99 and TextTiling is not borne out by the averages of precision
     • The higher average precision of TextTiling segmentation over C99 is not reflected in the uMAP scores
     • For some of the queries, the runs on C99 segmentation rank the segments with relevant content better
  34–36. Relevance of the Central IPU Assessment. [Charts: number of ranks taken or not taken into account by pwMAP; average precision of the passages at the ranks taken or not taken into account by pwMAP.]
     • TextTiling has a higher number of segments whose central IPU is relevant to the query (see the pwMAP sketch below)
     • Overall, the number of ranks at which a segment with relevant content is retrieved is approximately the same for both segmentation techniques
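A hedged sketch of the centre-IPU criterion behind pwMAP as these slides describe it: a retrieved segment counts as a hit when its central IPU is relevant, and average precision is then computed over the ranked list in the usual way. The exact NTCIR-9 definition may differ in detail, so treat this as illustrative:

```python
# Segments are lists of IPU ids; relevant_ipus is the set of relevant IPU ids.
def centre_ipu_hits(ranked_segments: list[list[int]],
                    relevant_ipus: set[int]) -> list[bool]:
    """A rank is a hit when the IPU at the segment's centre is relevant."""
    return [seg[len(seg) // 2] in relevant_ipus for seg in ranked_segments]

def average_precision(hits: list[bool], n_relevant: int) -> float:
    """Standard AP over a ranked hit list, normalised by the number of relevant items."""
    ap, found = 0.0, 0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            found += 1
            ap += found / rank
    return ap / n_relevant if n_relevant else 0.0
```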
  37–38. Results: pointwise MAP (pwMAP)

         Transcript type   Segmentation type   pwMAP
         BASELINE          -                   0.0520
         manual            tt                  0.0429
         manual            C99                 0.0209
         ASR               tt                  0.0329
         ASR               C99                 0.0166
         ASR nsw           tt                  0.0141
         ASR nsw           C99                 0.0138

     • TextTiling segmentation places better topic boundaries around relevant content and has higher precision scores for the retrieved relevant passages
  39–43. Average Length of Relevant Part and Segments (in seconds). [Charts: centre IPU relevant; centre IPU not relevant.]
     • Centre IPU is relevant: the average length of the relevant content is of the same order for both segmentation schemes, slightly higher for TextTiling
     • Centre IPU is not relevant: the average length of the relevant content is higher for C99; because of its poor segmentation this correlates with much longer segments
  44–45. Results: fraction MAP (fMAP)

         Transcript type   Segmentation type   fMAP
         BASELINE          -                   0.0536
         manual            tt                  0.0500
         manual            C99                 0.0168
         ASR               tt                  0.0308
         ASR               C99                 0.0123
         ASR nsw           tt                  0.0174
         ASR nsw           C99                 0.0120

     • The average number of ranks whose segments have a non-relevant centre IPU is more than 5 times higher than the number with a relevant centre IPU
     • The segmentation technique with longer, poorly segmented passages (C99) therefore has much lower precision-based scores
  46. Outline (section divider): Conclusions.
  47–51. Conclusions (built up step by step). TextTiling segmentation shows better overall retrieval performance than C99:
     • Higher numbers of segments with higher precision
     • Higher precision even for the segments with a non-relevant centre IPU
     • The high level of poor segmentation makes it harder to retrieve relevant content in the C99 runs
     Removing stop words before segmentation did not have any positive effect on the results.
  52. Thank you for your attention! Questions?
