Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Wikipedia-based Kernels for Dialogue Topic Tracking

629 views

Published on

Published in: Software, Technology, Education
  • Be the first to comment

  • Be the first to like this

Wikipedia-based Kernels for Dialogue Topic Tracking

  1. 1. WIKIPEDIA-BASED KERNELS FOR DIALOGUE TOPIC TRACKING Seokhwan Kim, Rafael E. Banchs, Haizhou Li Human Language Technology Department Institute for Infocomm Research (I2R) 6th May 2014 ICASSP, Florence, Italy
  2. 2. Contents • Introduction • Problem Definition • Method • Evaluation • Conclusions Pg 2
  3. 3. Contents • Introduction • Problem Definition • Method • Evaluation • Conclusions Pg 3
  4. 4. Motivation • Spoken Dialogue Systems – Next-generation User Interface – The most natural way for human-human communication • Single-task dialogues – Most previous work focuses on single target task • Eg. Flight Reservation, Bus Information Guide, Restaurant Booking – Cause limitations in practical uses • Multi-task dialogues – [Lin et al. 1999, Ikeda et al. 2008, Celikyilmaz et al. 2011] – Selecting the most probable system at each turn – Each system is independently built and operated from others Pg 4
  5. 5. Related Work • Text categorization-based Dialogue Topic Identification – [Nakata et al., 2002; Lagus&Kuusisto, 2002; Adams&Martell, 2008] – Differences from written texts • Determinations of topics – User’s intentions – System’s decisions • Available features – Unable to see the future turns • Knowledge-based Dialogue Topic Suggestion – External Knowledge Sources • Eg. Domain Models, Heuristics, Agendas • [Roy&Subramaniam, 2006; Young et al., 2007; Bohus&Rudnicky, 2003; Lee et al. 2008 – Limited flexibility • To handle user-initiative cases – High cost • To build a sufficient amount of resources Pg 5
  6. 6. Contents • Introduction • Problem Definition • Method • Evaluation • Conclusions Pg 6
  7. 7. Dialogue Topic Tracking • Subtasks – Dialogue Segmentation • Segmenting a session into topically coherent sub-dialogues – Topic Transition Identification • Identifying the next topic category at each time of topic transition Pg 7
  8. 8. Dialogue Topic Tracking • Example Pg 8
  9. 9. Contents • Introduction • Problem Definition • Method • Evaluation • Conclusions Pg 9
  10. 10. Wikipedia-based Kernel Method • Vector Space Model – The simplest approach to represent features for supervised machine learning methods – An instance for each turn  A weighted term vector – Lack of semantic or domain-specific aspects • Each word is considered as an independent and identical unit Pg 10
  11. 11. Wikipedia-based Kernel Method • Wikipedia for Dialogue Topic Tracking – As an external knowledge source – Without significant effort for building resources – Previous work • [Breuing et al., 2011; Wilcock, 2012] • Focusing only on a single type of information from Wikipedia • Wikipedia-based Kernel Method – Aiming at incorporating various knowledge from Wikipedia – To map the data into a higher dimensional feature space • Vector Space Extension • Vector Transformation Pg 11
  12. 12. Wikipedia-based Kernel Method • Vector Extension Pg 12 Term Vector Concept Vector … U: --------------------------- S: --------------------------- U: --------------------------- S: --------------------------- U: --------------------------- S: --------------------------- … x β1 β2 Β|D| d1 d2 d|D| ⁞
  13. 13. Wikipedia-based Kernel Method • Vector Transformation – Each extended vector is transformed into a new space – Transformation Matrix S – s(di, dj) is the relatedness between di and dj – Update of Concept Vector Values Pg 13
  14. 14. Wikipedia-based Kernel Method Pg 14
  15. 15. Measures of Contextual Relatedness • How to compute s(di, dj)? • Category Relatedness – Based on hierarchical structures of Wikipedia categories • depth(d): the length of the path from the root node to d • lcs(di, dj): the least common subsume of the two articles in the hierarchy Pg 15
  16. 16. Measures of Contextual Relatedness • Category Overlap Score – Based on the ratio of common categories of two concepts – By Jaccard’s coefficient • Contents Similarity – Based on the cosine similarity between term vectors from the body texts Pg 16
  17. 17. Measures of Contextual Relatedness • Co-occurrence Frequency – To represent the discourse relatedness obtained from Wikipedia – Assumption • The more frequently the mentions about two concepts co-occurred • The more similar aspects both concepts take in dialogue flows – By normalized point-wise mutual information Pg 17
  18. 18. Measures of Contextual Relatedness • Geographical Closeness – Domain-specific Measure – Based on the geographic coordinate information of spatial concepts • Final Score Pg 18
  19. 19. Measures of Contextual Relatedness Pg 19
  20. 20. Contents • Introduction • Problem Definition • Method • Evaluation • Conclusions Pg 20
  21. 21. Evaluation • Dataset – Dialogue Corpus on Singapore Tour Guide • Real human-human mixed initiative conversations • Between guides and tourists • Stats – 35 dialogue sessions – 21 hours – 19,651 utterances • Topics – 1,642 topic segments – 9 topic categories » Opening, Closing, Itinerary, Accommodation, Attraction, Food, Transportation, Shopping, Other – Wikipedia Collection • 3,115 articles related to Singapore • Collected from Wikipedia database dump as of Feb 2013 Pg 21
  22. 22. Evaluation • Models – Training Instances • 8,318 instances for user-turn-level segmentation • 1,607 instances for dialogue-segment-level topic prediction – Support Vector Machine (SVM) Models • BOW: Baseline only with term vector space • WK0: Extended vector without transformation • WK1: s(di, dj) = s1(di, dj) • WK2: s(di, dj) = s1(di, dj) + s2(di, dj) • WK3: s(di, dj) = s1(di, dj) + s2(di, dj) + s3(di, dj) • WK4: s(di, dj) = s1(di, dj) + s2(di, dj) + s3(di, dj) + s4(di, dj) • WK5: s(di, dj) = s1(di, dj) + s2(di, dj) + s3(di, dj) + s4(di, dj) + s5(di, dj) • Metrics – Five-fold Cross-validation – Segmentation: P/R/F – Topic Prediction: Accuracy Pg 22
  23. 23. Evaluation • Comparison of dialogue topic tracking performances Pg 23
  24. 24. Evaluation • Distributions of errors on the cascaded results with WK5 – 71.4% of errors result from segmentation – 60.0% of errors occurred for system-initiative cases Pg 24
  25. 25. Contents • Introduction • Problem Definition • Method • Evaluation • Conclusions Pg 25
  26. 26. Conclusions • Summary – Wikipedia-based Kernel Method for Dialogue Topic Tracking – To incorporate various types of information from Wikipedia – Experimental results show the merits of our proposed approach in mixed-initiative dialogues • Ongoing Work – Using more various types of knowledge from Wikipedia – To be presented at ACL 2014 • A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain Knowledge from Wikipedia Pg 26
  27. 27. Thank You Pg 27 Contact: kims@i2r.a-star.edu.sg
  28. 28. References • B. Lin, H. Wang, and L. Lee, “A distributed architecture for cooperative spoken dialogue agents with coherent dialogue state and history,” in Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1999. • S. Ikeda, K. Komatani, T. Ogata, H. G. Okuno, and H. G. Okuno, “Extensibility verification of robust domain selection against out-of-grammar utterances in multidomain spoken dialogue system.,” in Proceedings of the 9th Annual Conference of the International Speech Communicatiuon Association (INTERSPEECH), 2008, pp. 487–490. • A. Celikyilmaz, D. Hakkani-T¨ur, and G. T¨ur, “Approximate inference for domain detection in spoken language understanding.,” in Proceedings of the 12th Annual Conference of the International Speech Communication Association (INTERSPEECH), 2011, pp. 713–716. • T. Nakata, S. Ando, and A. Okumura, “Topic detection based on dialogue history,” in Proceedings of the 19th international conference on Computational linguistics (COLING), 2002, pp. 1–7. • K. Lagus and J. Kuusisto, “Topic identification in natural language dialogues using neural networks,” in Proceedings of the 3rd SIGdial workshop on Discourse and dialogue, 2002, pp. 95–102. • P. H. Adams and C. H. Martell, “Topic detection and extraction in chat,” in Proceedings of the 2008 IEEE International Conference on Semantic Computing, 2008, pp. 581–588. • S. Roy and L. V. Subramaniam, “Automatic generation of domain models for call centers from noisy transcriptions,” in Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, 2006, pp. 737–744. • S. Young, J. Schatzmann, K. Weilhammer, and H. Ye, “The hidden information state approach to dialog management,” in Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2007. • D. Bohus and A. Rudnicky, “Ravenclaw: dialog management using hierarchical task decomposition and an expectation agenda,” in Proceedings of the European Conference on Speech, Communication and Technology, 2003, pp. 597–600. • C. Lee, S. Jung, and G. G. Lee, “Robust dialog management with n-best hypotheses using dialog examples and agenda.,” in Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2008, pp. 630–637. • G. Salton, A. Wong, and C.S. Yang, “A vector space model for automatic indexing,” Communications of the ACM, vol. 18, no. 11, pp. 613–620, 1975. • G.Wilcock, “Wikitalk: a spoken wikipedia-based opendomain knowledge access system,” in Proceedings of the Workshop on Question Answering for Complex Domains, 2012, p. 5770. • A. Breuing, U. Waltinger, and I. Wachsmuth, “Harvesting wikipedia knowledge to identify topics in ongoing natural language dialogs,” in Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011, pp. 445–450. • P. Wang and C. Domeniconi, “Building semantic kernels for text classification using wikipedia,” in Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, 2008, pp. 713–721. • Z. Wu and M. Palmer, “Verbs semantics and lexical selection,” in Proceedings of the 32nd annual meeting on Association for Computational Linguistics, 1994, pp.133–138. • C. C. Chang and C. J. Lin, “Libsvm: a library for support vector machines,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 2, no. 3, pp. 27, 2011.

×