Deep Semantic Learning for Conversational Agents

Presentation for my Master's Thesis defense on 12 April 2018 at Politecnico di Torino.


  1. Deep Semantic Learning for Conversational Agents. Candidate: Martino Mensio. Supervisor: Maurizio Morisio. Tutor: Giuseppe Rizzo. 12 April 2018.
  2. Objectives: (1) identify the approaches to build a Conversational Agent with Natural Language Understanding; (2) use the context of the interaction.
  3. Background
  4. Background: Conversational Agents. What they can do: automated interaction with customers; acting as virtual assistants. What content they can provide: chit-chat (small talk), goal-oriented, knowledge-based.
  5. Background: from questions to answers
  6. Background: an example of Understanding
  7. Background: Recurrent Neural Networks
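A minimal sketch of the recurrence these models build on, assuming plain NumPy and an Elman-style cell; the models in the following slides use LSTM and GRU cells, which add gating on top of the same recurrence:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # One recurrence: the new hidden state mixes the current word vector
    # with the previous hidden state, carrying context across the utterance.
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

embedding_dim, hidden_dim = 300, 128
rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.1, size=(embedding_dim, hidden_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(9, embedding_dim)):  # nine word vectors, e.g. one utterance
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)        # h summarizes the words read so far
```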
  8. Background: intent classification [1]. [1] Liu, B. and Lane, I. (2016). Attention-based recurrent neural network models for joint intent detection and slot filling. Proceedings of the 17th Annual Meeting of the International Speech Communication Association.
  9. Background: slot filling [1].
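Slides 8 and 9 rely on the joint model of [1]. A minimal sketch of the idea, assuming PyTorch and omitting the attention mechanism of [1]: a bidirectional LSTM reads the word vectors, its final states feed an intent classifier (one label per utterance), and its per-token states feed a slot tagger (one IOB label per word):

```python
import torch
import torch.nn as nn

class JointNLU(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, n_intents, n_slots):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embedding_dim)
        self.encoder = nn.LSTM(embedding_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        self.intent_head = nn.Linear(2 * hidden_dim, n_intents)
        self.slot_head = nn.Linear(2 * hidden_dim, n_slots)

    def forward(self, token_ids):
        states, (h_n, _) = self.encoder(self.embed(token_ids))
        # h_n holds the final forward and backward states: the utterance summary
        summary = torch.cat([h_n[0], h_n[1]], dim=-1)
        intent_logits = self.intent_head(summary)  # one intent per utterance
        slot_logits = self.slot_head(states)       # one slot-tag score vector per token
        return intent_logits, slot_logits

# toy forward pass: a batch with one 9-token utterance
model = JointNLU(vocab_size=1000, embedding_dim=300, hidden_dim=128,
                 n_intents=5, n_slots=10)
intent_logits, slot_logits = model(torch.randint(0, 1000, (1, 9)))
```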
  10. Background: Word Embeddings. Distributional semantics [2]: words used in similar contexts have similar meanings. Each word corresponds to a vector of reals; the dimensionality is small (50~300); meaning is distributed across a multidimensional space. [2] Harris, Z. S. (1970). Distributional structure. In Papers in Structural and Transformational Linguistics (pp. 775-794). Springer, Dordrecht.
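A toy illustration of the distributional idea: nearby vectors mean similar words. The three-dimensional vectors below are made up for readability; real embeddings have the 50~300 dimensions mentioned above:

```python
import numpy as np

# hypothetical vectors: "bici" and "bicicletta" (bike/bicycle) appear in
# similar contexts, "piazza" (square) does not
embeddings = {
    "bici":       np.array([0.8, 0.1, 0.3]),
    "bicicletta": np.array([0.7, 0.2, 0.3]),
    "piazza":     np.array([0.1, 0.9, 0.2]),
}

def cosine(u, v):
    # cosine similarity: close to 1 for vectors pointing the same way
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(embeddings["bici"], embeddings["bicicletta"]))  # high: near-synonyms
print(cosine(embeddings["bici"], embeddings["piazza"]))      # lower: unrelated
```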
  11. The approach
  12. Approach: multi-turn interactions. Goals: detect the change of intent; capture intent dependencies; consider the agent's words.
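As a hedged sketch of the third point, one simple way to let the classifier see the agent's words is to prepend the recent history, agent turns included, with speaker markers; the markers, the flat concatenation, and the window size are illustrative assumptions, not the exact encoding of the thesis:

```python
def build_context(history, current_user_turn, max_turns=3):
    # history: list of (speaker, utterance) pairs, speaker is "user" or "agent"
    context = []
    for speaker, utterance in history[-max_turns:]:
        marker = "<agent>" if speaker == "agent" else "<user>"
        context.append(f"{marker} {utterance}")
    context.append(f"<user> {current_user_turn}")
    return " ".join(context)

history = [
    ("user", "find me a bike station nearby"),
    ("agent", "there are two stations within 500 m, which one?"),
]
print(build_context(history, "the closest one"))
# the classifier now sees the agent question that "the closest one" answers
```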
  13. Approach: the difference between multi-turn and single-turn
  14. Approach: a multi-turn example
  15. Approach: Word Embeddings for the Italian language. Recomputation of the Italian Wikipedia embeddings with proper tokenization (with respect to [6]). Example: “Voglio una bici vicino a piazza castello, grazie” (“I want a bike near Piazza Castello, thanks”) → [“Voglio”, “una”, “bici”, “vicino”, “a”, “piazza”, “castello”, “,”, “grazie”]. [6] Berardi, G., Esuli, A., & Marcheggiani, D. (2015). Word Embeddings Go to Italy: A Comparison of Models and Training Datasets. In IIR.
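A minimal sketch of the recomputation, assuming gensim's Word2Vec and spaCy's Italian tokenizer as stand-ins for the thesis pipeline; proper tokenization splits the trailing comma, so “castello” and “castello,” do not end up as two different vocabulary entries:

```python
import spacy
from gensim.models import Word2Vec

nlp = spacy.blank("it")  # Italian tokenizer only, no tagger or parser needed

def tokenize(text):
    return [token.text for token in nlp(text)]

print(tokenize("Voglio una bici vicino a piazza castello, grazie"))
# ['Voglio', 'una', 'bici', 'vicino', 'a', 'piazza', 'castello', ',', 'grazie']

# the corpus is an iterable of tokenized sentences; a toy stand-in for Wikipedia here
corpus = [tokenize("Voglio una bici vicino a piazza castello, grazie")]
model = Word2Vec(sentences=corpus, vector_size=300, window=5, min_count=1)
```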
  16. Results
  17. Results: the datasets. Available: ATIS (single-turn) [3]; nlu-benchmark (single-turn) [4]; kvret (multi-turn) [5]. Collected: bikes Italian (single-turn); bikes English (single-turn). [3] Hemphill, C., Godfrey, J., Doddington, G. (1990). The ATIS spoken language systems pilot corpus. DARPA Speech and Natural Language Workshop. [4] https://github.com/snipsco/nlu-benchmark [5] Eric, M. and Manning, C. (2017). Key-value retrieval networks for task-oriented dialogue. SIGDIAL 2017.
  18. Results: multi-turn intent classification on the kvret dataset [5].

      intent RNN | agent words | F1     | epochs
      ✓ LSTM     | ✓           | 0.9987 | 7
      ✓ LSTM     | ✘           | 0.9987 | 8
      ✓ GRU      | ✓           | 0.9975 | 14
      ✘          | ✓           | 0.9951 | 5
      ✓ GRU      | ✘           | 0.9585 | 9
      ✘ [1]      | ✘           | 0.8524 | 8
  19. Results: Italian Word Embeddings.

      Word Embeddings                         | accuracy
      Italian vectors from [6] on Wikipedia   | 44.81%
      Recomputed Italian vectors on Wikipedia | 58.14%

      Accuracy is measured with the analogy test of [7]: semantic analogies (capital-country, nationality adjective, currency, family) and syntactic analogies (masculine-feminine, singular-plural, tenses, comparatives, superlatives). [7] Mikolov, T., Yih, W. T., & Zweig, G. (2013). Linguistic regularities in continuous space word representations. In Proceedings of NAACL-HLT 2013 (pp. 746-751).
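The analogy test of [7] solves “a : b = c : ?” by vector arithmetic: the answer should be the word nearest to b - a + c. A sketch with gensim's KeyedVectors; the file name is a placeholder for the recomputed Italian vectors:

```python
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("it_wiki_vectors.txt")  # placeholder path

# semantic analogy (capital-country): Roma : Italia = Parigi : ?
result = vectors.most_similar(positive=["Italia", "Parigi"],
                              negative=["Roma"], topn=1)
print(result)  # [('Francia', ...)] if the embeddings capture the regularity
```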
  20. Results: the difference on the global tasks (Italian), measured on the bike-sharing dataset with the approach of [1].

      Word Embeddings                                       | intent classification F1 | slot filling F1
      Italian vectors from [6] on Wikipedia, 730k vectors   | 0.8421                   | 0.5666
      Recomputed Italian vectors on Wikipedia, 758k vectors | 0.8947                   | 0.6153
  21. Results: the difference of embeddings on the two tasks (English), measured with the approach of [1]. F1 scores are given as ATIS / nlu-benchmark / bikes English.

      Embeddings                                      | intent classification F1 | slot filling F1
      Trainable, random initialization                | 0.9740 / 0.9928 / 0.9428 | 0.9425 / 0.9177 / 0.9000
      [8] precomputed, 685k keys, 20k unique vectors  | 0.9660 / 0.9928 / 0.9714 | 0.9588 / 0.8970 / 0.9375
      [8] precomputed, 685k keys, 685k unique vectors | 0.9860 / 0.9928 / 0.9714 | 0.9649 / 0.9170 / 0.9689

      [8] https://spacy.io/models/en
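The key and vector counts in the last two rows appear to correspond to spaCy's pruned (md) and full (lg) English vector packages of the time: in a pruned table, 685k words share only 20k distinct vectors, so rare words reuse the vector of a similar frequent word. A sketch of reading such precomputed vectors into a lookup matrix, assuming the en_core_web_md package is installed:

```python
import numpy as np
import spacy

nlp = spacy.load("en_core_web_md")  # pruned vector table: many keys, fewer vectors

vocab = ["show", "me", "flights", "from", "boston", "to", "denver"]
embedding_matrix = np.stack([nlp.vocab[word].vector for word in vocab])
print(embedding_matrix.shape)  # (7, 300): one precomputed vector per word
```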
  22. Conclusions: the multi-turn results show the importance of context; the word-embedding results show the importance of choosing embeddings properly.
  23. Future work: multi-turn slot filling, to remove the handcrafted dialogue tracking.
