DSTC6 – Dialogue System
Technology Challenges
An Introduction
Sogang University, Natural Language Processing Lab
허광호
2017.07.12
DSTC6 Tracks
• Track 1 – End-to-End Goal Oriented Dialog Learning
• Y-Lan Boureau et al. (Facebook AI Research)
• Track 2 – End-to-End Conversation Modeling
• Chiori HORI et al. (Mitsubishi Electric Research Laboratories)
• Track 3 – Dialog Breakdown Detection
• Ryuichiro Higashinaka et al. (NTT)
Track 3 – Dialog Breakdown Detection
NB: Not a breakdown
PB: Possible breakdown
B: Breakdown
Track 3 – Dialog Breakdown Detection
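• Motivation
• Voice agent services are being released commercially, but they still cannot converse as naturally as two humans.
• The biggest problem is that voice agents sometimes generate inappropriate utterances that cause a dialogue breakdown.
• Uses
• Breakdown detection is useful where keeping the conversation going matters, as in chat-oriented dialogue.
• It can also be used for error recovery in dialogue systems.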
Track 3 – Dialog Breakdown Detection
• Dataset
• 100 chat-oriented dialogues (21 utterances per dialogue) – 24 annotators.
• 1000 chat-oriented dialogues – 2~3 annotators.
• 300 chat-oriented dialogues – 30 annotators.
• Unfortunately, the data above are in Japanese.
• An additional 100 dialogues in English are reportedly to be collected and distributed.
• Evaluation
• Classification-Related metrics – Accuracy, Precision, Recall, F-measure
• Distribution-related metrics – JS Divergence and Mean squared error
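The distribution-related metrics compare a system's predicted probabilities over the three labels (NB, PB, B) with the distribution of the annotators' labels. A minimal sketch follows, assuming base-2 logarithms for JS divergence and a mean squared error averaged over the three labels; the challenge's exact definitions may differ, and all function names here are illustrative only.

```python
import math

LABELS = ("NB", "PB", "B")  # not a breakdown, possible breakdown, breakdown


def label_distribution(annotations):
    """Turn a list of annotator labels into a probability distribution."""
    counts = [annotations.count(label) for label in LABELS]
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by 1 with log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def mean_squared_error(p, q):
    """Squared error averaged over the labels (an assumed convention)."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q)) / len(p)


# Hypothetical example: 24 annotators label one system utterance.
gold = label_distribution(["NB"] * 12 + ["PB"] * 6 + ["B"] * 6)
predicted = [0.6, 0.2, 0.2]  # made-up system output
print(js_divergence(gold, predicted), mean_squared_error(gold, predicted))
```

Lower values are better for both metrics; identical distributions give zero.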
Track 3 – Dialog Breakdown Detection
• LREC 2016 Breakdown Detection (in Japanese) results
• Baseline: CRF-based method
• Team1: LSTM-RNN-based method
• Features: Word2Vec + co-occurrence freq. vector + Sent2Vec vector
• Team2: LSTM-RNN-based method (Word2Vec)
• Team3: Rule-based method (keywords extracted from system utterances)
• Team4: SVM-based method (Word frequency vector)
• Team5: DNN-based method
• Features: dialogue act of the system and previous user utterance.
• Team6: LSTM-RNN-based method
• Features: Word vector encoded by the use of NCM (Neural Conversation Model), LSTM,
bag-of-word embedding, and an extended NCM.
Track 3 – Dialog Breakdown Detection
Classification-Related Metrics
Systems compared: Baseline (CRF), Team1 (LSTM-RNN-based), Team2 (LSTM-RNN-based), Team3 (Rule-based), Team4 (SVM-based), Team5 (DNN-based), Team6 (LSTM-RNN-based)
Source: The Dialogue Breakdown Detection Challenge - Task Description, Datasets, and Evaluation Metrics
Track 3 – Dialog Breakdown Detection
Distribution-Related Metrics
Systems compared: Baseline (CRF), Team1 (LSTM-RNN-based), Team2 (LSTM-RNN-based), Team3 (Rule-based), Team4 (SVM-based), Team5 (DNN-based), Team6 (LSTM-RNN-based)
Source: The Dialogue Breakdown Detection Challenge - Task Description, Datasets, and Evaluation Metrics
Track 1 – End-to-End Goal Oriented Dialog Learning
• Goal-oriented Dialog Learning
• Goal-oriented dialog requires skills beyond language modeling:
• Asking questions to clearly define a user request.
• Querying Knowledge Bases (KBs).
• Interpreting results from queries to display options to users or completing a transaction.
• Dialog domain
• Restaurant reservation system
• Uses the Facebook AI Research open resource as the corpus (Bordes et al. 2017)
Track 1 – End-to-End Goal Oriented Dialog Learning
• Task structure
• The capabilities a goal-oriented dialog system needs are split into sub-tasks, each evaluated separately.
• Task 1: Issuing API calls
• Task 2: Updating API calls
• Task 3: Displaying options
• Task 4: Providing extra information
• Task 5: Conducting full dialogs
Track 1 Example
Track 1 – End-to-End Goal Oriented Dialog Learning
• Dataset composition
• 10,000 examples per task
• Correct utterance + candidate utterances
• Evaluation
• Done by ranking candidate utterances, not by language generation
• This is called Next-Utterance Classification (Lowe et al. 2016)
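Next-Utterance Classification can be pictured as scoring every candidate utterance against the dialog context and checking whether the correct one ranks at the top. The sketch below is only an illustration: the word-overlap scorer is a toy stand-in for a trained model, the example dialogs are made up, and reporting recall at k is an assumption, since the slides do not state the track's official metric.

```python
def rank_candidates(score, context, candidates):
    """Return the candidates sorted from highest to lowest score."""
    return sorted(candidates, key=lambda c: score(context, c), reverse=True)


def recall_at_k(score, examples, k=1):
    """Fraction of examples whose correct utterance is among the top k."""
    hits = 0
    for context, candidates, correct in examples:
        top_k = rank_candidates(score, context, candidates)[:k]
        hits += correct in top_k
    return hits / len(examples)


def word_overlap_score(context, candidate):
    """Toy scorer (hypothetical): number of words shared with the context."""
    return len(set(context.lower().split()) & set(candidate.lower().split()))


# Made-up examples: (dialog context, candidate utterances, correct utterance).
examples = [
    ("i want italian food in paris",
     ["api_call italian paris", "hello there", "goodbye"],
     "api_call italian paris"),
    ("thank you goodbye",
     ["you are welcome goodbye", "which cuisine"],
     "you are welcome goodbye"),
]
print(recall_at_k(word_overlap_score, examples, k=1))
```

Because the system only has to pick among given candidates, evaluation reduces to a simple hit count and avoids judging free-form generated text.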
Track 1 – End-to-End Goal Oriented Dialog Learning
Results published in ICLR 2017
Source: Learning End-to-End Goal-Oriented Dialog (A. Bordes, 2017)
Track 2 – End-to-End Conversation Modeling
• The system must generate sentences in response to a user input given the
dialogue history, and may use external knowledge from the web.
Track 2 – End-to-End Conversation Modeling
• Dataset
• Training Data (OpenSubtitles/Twitter ≈ 1M dialogs, 2.2M utterances)
• Test Data: 500 – 1000 dialogs
Track 2 – End-to-End Conversation Modeling
• Baseline System
• LSTM-based seq2seq generation system and a pre-trained model will be
provided.
• Evaluation
• Objective measure: Perplexity, BLEU, etc.
• Subjective measure: Human ratings collected via crowdsourcing.
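As a reminder of what the perplexity measure captures, here is a minimal sketch that computes it from per-token log-probabilities. It assumes natural-log probabilities that some model assigned to the reference response; the function name and input format are illustrative and not part of the track's baseline code.

```python
import math


def perplexity(token_log_probs):
    """exp of the average negative log-probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical model that gives each of four reference tokens probability 0.25.
print(perplexity([math.log(0.25)] * 4))
```

A model that spreads probability uniformly over N choices at every step has perplexity N, so lower is better.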
Track 2 – End-to-End Conversation Modeling
DSTC6 Tracks Conclusion
• Track 1 – End-to-End Goal Oriented Dialog Learning
• Next-Utterance Classification Task
• Track 2 – End-to-End Conversation Modeling
• Language Generation Task
• Track 3 – Dialog Breakdown Detection
• Label Classification Task
