DSTC6: An Introduction

DSTC6 – Dialogue System Technology Challenges
An Introduction
Sogang University Natural Language Processing Lab
허광호
2017.07.12

DSTC6 Tracks
• Track 1 – End-to-End Goal Oriented Dialog Learning
• Y-Lan Boureau et al. (Facebook AI Research)
• Track 2 – End-to-End Conversation Modeling
• Chiori Hori et al. (Mitsubishi Electric Research Laboratories)
• Track 3 – Dialog Breakdown Detection
• Ryuichiro Higashinaka et al. (NTT)

Track 3 – Dialog Breakdown Detection
NB: Not a breakdown
PB: Possible breakdown
B: Breakdown

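• Motivation
• Voice agent services are being released commercially, but they still cannot converse as naturally as two humans.
• The biggest problem is that voice agents occasionally generate inappropriate utterances that cause a dialogue breakdown.
• Uses
• Breakdown detection is useful where keeping the conversation going matters, as in chat-oriented dialogue.
• It can also be used for error recovery in dialogue systems.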

• Dataset
• 100 chat-oriented dialogues (21 utterances per dialogue) – 24 annotators.
• 1000 chat-oriented dialogues – 2 to 3 annotators.
• 300 chat-oriented dialogues – 30 annotators.
• Unfortunately, the data above are all in Japanese.
• An additional 100 dialogues in English are reportedly to be collected and distributed.
• Evaluation method
• Classification-Related metrics – Accuracy, Precision, Recall, F-measure
• Distribution-related metrics – JS Divergence and Mean squared error
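The two metric families above can be sketched in a few lines. This is a minimal illustration, not the official scoring script: the log base (2), the use of the majority label as the gold class, and the tie-breaking order are assumptions, and all function names are hypothetical.

```python
import math

# Label order used for every distribution below (labels from the slides).
LABELS = ("NB", "PB", "B")

def label_distribution(annotations):
    """Turn a list of annotator labels into a probability distribution over LABELS."""
    total = len(annotations)
    return [annotations.count(label) / total for label in LABELS]

def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing by convention."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by 1 when using log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def mean_squared_error(p, q):
    """Mean of squared differences between two label distributions."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q)) / len(p)

def majority_label(dist):
    """Most probable label; ties go to the earlier label in LABELS (an assumption)."""
    return LABELS[dist.index(max(dist))]

def accuracy(gold_dists, pred_dists):
    """Share of utterances whose predicted majority label matches the gold one."""
    hits = sum(majority_label(g) == majority_label(p)
               for g, p in zip(gold_dists, pred_dists))
    return hits / len(gold_dists)
```

For example, `label_distribution(["NB", "NB", "PB", "B"])` gives the gold distribution for one system utterance judged by four annotators, which can then be compared with a system's predicted distribution via `js_divergence` or `mean_squared_error`.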

• LREC 2016 Breakdown Detection (in Japanese) results
• Baseline: CRF-based method
• Team1: LSTM-RNN-based method
• Features: Word2Vec + co-occurrence freq. vector + Sent2Vec vector
• Team2: LSTM-RNN-based method (Word2Vec)
• Team3: Rule-based method (keywords extracted from system utterances)
• Team4: SVM-based method (Word frequency vector)
• Team5: DNN-based method
• Features: dialogue act of the system and previous user utterance.
• Team6: LSTM-RNN-based method
• Features: word vectors encoded using an NCM (Neural Conversation Model), an LSTM, bag-of-words embeddings, and an extended NCM.

Classification-Related Metrics
[Figure: results for Baseline (CRF), Team1 (LSTM-RNN-based), Team3 (rule-based), Team4 (SVM-based), Team5 (DNN-based), and Team6 (LSTM-RNN-based)]
Source: The Dialogue Breakdown Detection Challenge - Task Description, Datasets, and Evaluation Metrics

Distribution-Related Metrics
[Figure: results for Baseline (CRF), Team3 (rule-based), Team4 (SVM-based), Team5 (DNN-based), and Team6 (LSTM-RNN-based)]
Source: The Dialogue Breakdown Detection Challenge - Task Description, Datasets, and Evaluation Metrics

Track 1 – End-to-End Goal Oriented Dialog Learning
• Goal-oriented Dialog Learning
• Goal-oriented dialog requires skills beyond language modeling.
• Asking questions to clearly define a user request.
• Querying Knowledge Bases (KBs).
• Interpreting query results to display options to users or to complete a transaction.
• Dialog domain
• Restaurant reservation system
• Uses the Facebook AI Research open resource as the corpus (Bordes et al. 2017)

• Task structure
• The capabilities a goal-oriented dialog system needs are split into sub-tasks, each evaluated separately.
• Task 1: Issuing API calls
• Task 2: Updating API calls
• Task 3: Displaying options
• Task 4: Providing extra information
• Task 5: Conducting full dialogs
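To make Tasks 1 and 2 concrete, here is a minimal sketch of a slot-filling state that issues an API call once every slot is known and re-issues it after the user changes a slot. The slot names, the `api_call ...` string layout, and all function names are illustrative assumptions, not the official DSTC6 data format.

```python
# Hypothetical slot set for a restaurant reservation; the real task may use others.
SLOTS = ("cuisine", "location", "party_size", "price")

def missing_slots(state):
    """Slots the system still has to ask the user about."""
    return [slot for slot in SLOTS if slot not in state]

def api_call(state):
    """Task 1 sketch: format an API call once every slot is filled."""
    missing = missing_slots(state)
    if missing:
        raise ValueError(f"cannot issue API call, missing slots: {missing}")
    return "api_call " + " ".join(state[slot] for slot in SLOTS)

def update(state, **changes):
    """Task 2 sketch: return a new state with the user's changed slots applied."""
    unknown = set(changes) - set(SLOTS)
    if unknown:
        raise KeyError(f"unknown slots: {sorted(unknown)}")
    return {**state, **changes}

state = {"cuisine": "italian", "location": "paris",
         "party_size": "four", "price": "cheap"}
first_call = api_call(state)                            # initial request
second_call = api_call(update(state, location="rome"))  # after the user changes the city
```

Returning a new dictionary from `update` keeps the earlier state intact, which makes it easy to compare the call before and after a user correction.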

• Dataset composition
• 10,000 examples per task
• Correct utterance + candidate utterances
• Evaluation
• Not done by language generation
• Done by ranking candidate utterances
• This is called Next-Utterance Classification
• (Lowe et al. 2016)
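Candidate ranking can be sketched as follows. The word-overlap scorer is only a toy stand-in for a learned model, and Recall@k is assumed here as the ranking metric because the slides do not name one; all function names are hypothetical.

```python
def overlap_score(context, candidate):
    """Toy scorer: Jaccard word overlap between the dialog context and a candidate."""
    context_words = set(context.lower().split())
    candidate_words = set(candidate.lower().split())
    union = context_words | candidate_words
    return len(context_words & candidate_words) / len(union) if union else 0.0

def rank_candidates(context, candidates, score=overlap_score):
    """Order candidates from most to least likely next utterance."""
    return sorted(candidates, key=lambda cand: score(context, cand), reverse=True)

def recall_at_k(examples, k=1, score=overlap_score):
    """Share of examples whose correct utterance is among the top-k candidates."""
    hits = 0
    for context, candidates, answer in examples:
        if answer in rank_candidates(context, candidates, score)[:k]:
            hits += 1
    return hits / len(examples)

# Each example: (dialog context, candidate utterances, correct utterance).
examples = [
    ("book a table for four in paris",
     ["api_call italian paris four cheap", "what is the weather", "goodbye"],
     "api_call italian paris four cheap"),
]
```

Swapping `overlap_score` for a model's scoring function leaves the evaluation code unchanged, which is the practical appeal of ranking over free-form generation.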

Results published in ICLR 2017
Source: Learning End-to-End Goal-Oriented Dialog – A. Bordes 2017

Track 2 – End-to-End Conversation Modeling
• The system must generate sentences that respond to a user input given the
dialogue history, and it may use external knowledge from the web.

• Dataset
• Training Data (OpenSubtitles/Twitter ≈ 1M dialogs, 2.2M utterances)
• Test Data: 500 – 1000 dialogs

• Baseline System
• LSTM-based seq2seq generation system and a pre-trained model will be
provided.
• Evaluation
• Objective measure: Perplexity, BLEU, etc.
• Subjective measure: Human ratings collected via crowdsourcing.
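The objective measures can be illustrated with a short sketch. It shows perplexity computed from per-token log-probabilities and the clipped n-gram precision that BLEU is built on; it is not the full BLEU score (no brevity penalty or geometric mean over n-gram orders) and not the challenge's official scorer. Function names are hypothetical.

```python
import math
from collections import Counter

def perplexity(token_log_probs):
    """exp of the negative mean natural-log probability the model gave each token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(hypothesis, reference, n):
    """Clipped n-gram precision: each hypothesis n-gram is credited at most
    as many times as it occurs in the reference."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return clipped / total
```

A model that assigns probability 0.25 to every token has perplexity 4, and clipping stops a degenerate response such as "the the the" from scoring full unigram precision against a reference that contains "the" only once.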

DSTC6 Tracks Conclusion
• Track 1 – End-to-End Goal Oriented Dialog Learning
• Next-Utterance Classification Task
• Track 2 – End-to-End Conversation Modeling
• Language Generation Task
• Track 3 – Dialog Breakdown Detection
• Label Classification Task
