
A Joint Many-Task Model


I will introduce the Joint Many-Task Model paper, which covers five NLP tasks and was accepted at EMNLP 2017.

These slides were presented at the Deep Learning Study Group in DAVIAN LAB.

Paper link: https://arxiv.org/abs/1611.01587


  1. A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks. Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, and Richard Socher. The University of Tokyo. Accepted at EMNLP 2017. Presented by Choi Seong Jae.
  2. Overview: Joint Many-Task (JMT) Model
  3. Motivation • Existing approaches have focused on handling a single task. • Even multi-task approaches only learned closely related tasks together (POS tagging, chunking, etc.). • Zhang and Weiss (2016) showed that jointly learning POS tagging and dependency parsing is effective.
  4. Details: Word Representations • Word embeddings: Skip-gram. • Character embeddings: character n-gram embeddings. Example (n = 1, 2, 3) for the word “Cat”: {C, a, t, #B#C, Ca, at, t#E#, #B#Ca, Cat, at#E#}. The character-level representation is the average of the unique character n-gram embeddings (see the first code sketch after the slide list).
  5. Details: Word-Level Task: POS Tagging • Bi-directional LSTM; the input at time step t is the representation of the t-th word.
  6. Details: Word-Level Task: Chunking • A word-level task that assigns chunking tags (B-NP, I-VP, etc.). • Bi-directional LSTM; its input also includes a weighted POS label embedding: a sum over all POS tags (weighted by their predicted probabilities) of the corresponding label embeddings.
  7. Details: Syntactic Task: Dependency Parsing • A task that identifies the syntactic relations between word pairs within a sentence. • Bi-directional LSTM; a matching function is used to predict the parent node of w_t (see the parent-scoring sketch after the slide list).
  8. Details: Semantic Task: Semantic Relatedness • A task that measures the semantic relationship between two sentences. • The output is a real-valued relatedness score for the sentence pair. • Sentence representations are obtained with a max-pooling strategy; the feature vector representation concatenates the absolute values of the element-wise subtraction and the element-wise multiplication of the two sentence representations (see the pair-feature sketch after the slide list).
  9. Details: Semantic Task: Textual Entailment • Given a sentence s and a hypothesis h, the task is to decide whether h can be inferred from s. • Classification into three classes: entailment, contradiction, and neutral. • The feature vector representation again uses the element-wise subtraction and multiplication of the sentence representations, constructed so that the model can tell which sentence is the hypothesis.
  10. Training: POS Tagging, Chunking, Dependency Parsing Layers • L2-norm regularization and successive regularization, so that the model does not forget what it learned on previous tasks (see the regularization sketch after the slide list).
  11. Training: Relatedness, Textual Entailment Layers • KL-divergence, which measures the difference between two probability distributions, is used as the training objective (see the last sketch after the slide list).
  12. Experimental Settings • POS tagging, Chunking, Dependency Parsing: the Wall Street Journal (WSJ) portion of the Penn Treebank dataset. • Semantic relatedness, Textual entailment: the SICK dataset (Marelli et al., 2014).
  13. Experiments
  14. Experiments
  15. Experiments
  16. Experiments
  17. Conclusion • The model handles multiple NLP tasks by growing the depth of a single network. • When growing the depth, training succeeded by respecting linguistic hierarchies and using shortcut connections. • Beyond the five tasks in the paper, there is ample room for further development with tasks such as entity detection and relation extraction.
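Below are a few code sketches for the technical slides above. They are illustrative only: function names, shapes, and hyperparameter values are mine rather than the paper's, and PyTorch/NumPy are used simply as convenient notation.

The first sketch reproduces the character n-gram construction from slide 4: #B#/#E# boundary markers are added, the unique n-grams (n = 1, 2, 3) are collected, and the character-level word representation is the average of their embeddings.

```python
import numpy as np

def char_ngrams(word, ns=(1, 2, 3)):
    """Unique character n-grams with #B#/#E# boundary markers; for "Cat" this
    yields {C, a, t, #B#C, Ca, at, t#E#, #B#Ca, Cat, at#E#} as on slide 4."""
    symbols = ["#B#"] + list(word) + ["#E#"]
    grams = set()
    for n in ns:
        for i in range(len(symbols) - n + 1):
            gram = symbols[i:i + n]
            # keep only n-grams that contain at least one real character
            if any(s not in ("#B#", "#E#") for s in gram):
                grams.add("".join(gram))
    return grams

def char_representation(word, emb_table, dim=100):
    """Average of the unique character n-gram embeddings; emb_table is a
    hypothetical dict mapping n-gram strings to vectors."""
    vecs = [emb_table.get(g, np.zeros(dim)) for g in char_ngrams(word)]
    return np.mean(vecs, axis=0)

print(sorted(char_ngrams("Cat")))
```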
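For slide 7, a minimal sketch of a bilinear matching function that scores every candidate parent of each word and normalizes the scores with a softmax; the exact parameterization in the paper may differ, and the shapes here are assumptions.

```python
import torch

def parent_distribution(H, W):
    """Matching score m(t, j) = h_t^T W h_j between word t and candidate
    parent j, computed from the dependency-layer bi-LSTM states H (T x d),
    then turned into p(parent of w_t = w_j) with a row-wise softmax."""
    scores = H @ W @ H.T                 # (T, T); entry (t, j) is m(t, j)
    return torch.softmax(scores, dim=1)

H = torch.randn(6, 100)    # toy hidden states for a 6-word sentence
W = torch.randn(100, 100)  # bilinear weight (randomly initialized here)
print(parent_distribution(H, W).shape)   # torch.Size([6, 6])
```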
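For slides 8 and 9, a sketch of the sentence-pair feature vector: each sentence representation is obtained by max pooling the bi-LSTM states over time, and the pair is described by the absolute element-wise difference concatenated with the element-wise product.

```python
import torch

def pair_features(H1, H2):
    """Feature vector for a sentence pair: max-pool each sentence's bi-LSTM
    states (T x d) into a single vector, then concatenate |h1 - h2| and
    h1 * h2, giving a 2d-dimensional representation."""
    h1 = H1.max(dim=0).values   # sentence 1 representation via max pooling
    h2 = H2.max(dim=0).values   # sentence 2 representation via max pooling
    return torch.cat([(h1 - h2).abs(), h1 * h2], dim=-1)

d = pair_features(torch.randn(7, 100), torch.randn(5, 100))
print(d.shape)   # torch.Size([200])
```

This symmetric form suits the relatedness score; for entailment the construction must additionally preserve which sentence is the hypothesis (slide 9), e.g. by feeding the pair in a fixed premise-then-hypothesis order. The paper's exact choice may differ from this sketch.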
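For slide 10, a sketch of the successive regularization term: a snapshot of the shared parameters is kept from before the current task's epoch, and the training loss penalizes drift away from that snapshot so earlier tasks are not forgotten. The coefficient value is a placeholder.

```python
import torch

def successive_regularization(params, snapshot, delta=1e-2):
    """delta * ||theta - theta'||^2 over the shared parameters, where
    `snapshot` holds the parameter values saved before the current task's
    epoch; delta is a hyperparameter (placeholder value here)."""
    return delta * sum(((p - q.detach()) ** 2).sum()
                       for p, q in zip(params, snapshot))
```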
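For slide 11, a sketch of the KL-divergence objective for semantic relatedness. The encoding of a gold score y in [1, 5] as a sparse target distribution over five classes follows the Tree-LSTM-style scheme commonly used for SICK relatedness; treat that specific encoding as an assumption rather than something stated on the slide.

```python
import torch

def relatedness_target(y, num_classes=5):
    """Encode a gold relatedness score y in [1, 5] as a probability
    distribution over the classes {1, ..., 5}: mass is split between the
    two integers surrounding y (all mass on y itself if it is an integer)."""
    p = torch.zeros(num_classes)
    floor = int(y)
    if floor == y:
        p[floor - 1] = 1.0
    else:
        p[floor - 1] = floor + 1 - y
        p[floor] = y - floor
    return p

def kl_relatedness_loss(pred_log_probs, y):
    """KL(target || prediction), summed over the classes with nonzero
    target mass; pred_log_probs are the model's log-probabilities."""
    target = relatedness_target(y)
    mask = target > 0
    return (target[mask] * (target[mask].log() - pred_log_probs[mask])).sum()

log_probs = torch.log_softmax(torch.randn(5), dim=-1)
print(kl_relatedness_loss(log_probs, 3.6))
```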
