Rasa Developer Summit - Bing Liu - Interactive Learning of Task-Oriented Dialog Systems

Interactive Learning of
Task-Oriented Dialog Systems
Bing Liu
Research Scientist, Facebook Conversational AI
Rasa Developer Summit - 2019
PhD, Carnegie Mellon University
Task-Oriented Dialog System
❖ Dialog systems
➢ Chit-chat bot, QA bot, task-oriented dialog system, ...
❖ Get stuff done - assist users in completing specific tasks
➢ Personal assistants (e.g. Siri, Alexa, Google Assistant, Hey Portal)
➢ Voice command in vehicle and smart home
➢ Customer service; Sales and marketing
Modular Dialog System Architecture
Task-Oriented Dialog System
❖ Highly handcrafted
❖ Process interdependent
❖ Data driven end-to-end (E2E) systems
➢ [Wen et al. 2016]: E2E supervised training neural dialog model
➢ [Bordes and Weston, 2017]: E2E model with memory network
➢ [Madotto et al., 2018]: Mem2Seq for incorporating knowledge into E2E systems
❖ Interactive learning for E2E system with less human supervision
Why Learn through Interactions?
❖ Task-oriented dialog as a sequential decision-making process over multiple steps
❖ State space grows exponentially with number of dialog turns
❖ Extremely hard to
➢ Design all possible dialog paths
➢ Collect a dialog corpus that is large enough to cover all dialog scenarios
→ Continuously learn through the interaction with users and improve over time
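To make the exponential growth concrete, here is a small arithmetic sketch. It assumes, purely for illustration, that every turn offers the same fixed number of possible moves (`branching`); the slides do not give such a number.

```python
def count_dialog_paths(branching: int, turns: int) -> int:
    """Number of distinct dialog paths if every turn offers `branching` choices."""
    return branching ** turns


# Hypothetical setting: 10 possible moves per turn.
for turns in (1, 5, 10):
    print(turns, count_dialog_paths(10, turns))
```

Even with a modest number of moves per turn, the count of paths quickly exceeds what a hand-designed flow or a fixed corpus could enumerate.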
How can we learn end-to-end task-oriented dialog systems effectively through interaction with users?
End-to-End Task-Oriented Dialog Modeling
❖ Dialog context modeling with hierarchical RNN
B Liu, et al, "Dialogue Learning with Human Teaching and Feedback in End-To-End Trainable Task-Oriented Dialogue Systems", NAACL 2018.
End-to-End Task-Oriented Dialog Modeling
End-to-end modeling of spoken language understanding (SLU), dialog state tracking (DST), and dialog policy
Supervised Pre-training
❖ Supervised model pre-training on dialog corpus with MLE
➢ Objective function: linear interpolation of cross-entropy losses for
■ Dialog state tracking, i.e. user goal estimation, and
■ Dialog policy, i.e. system action prediction
➢ Optimization: Stochastic gradient descent, Adam
[Equation: pre-training objective, combining a loss for user goal estimation and a loss for system action prediction]
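The objective described above can be sketched for a single dialog turn as follows. This is a minimal illustration, not the authors' implementation: the function names, the per-slot dictionary layout, and the weights `slot_weight` and `action_weight` are assumptions introduced here.

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the target index under a probability list."""
    return -math.log(probs[target])


def pretraining_loss(slot_probs, slot_targets, action_probs, action_target,
                     slot_weight=0.5, action_weight=0.5):
    """Weighted sum of state-tracking and policy cross-entropy losses for one turn.

    slot_probs:    dict of slot name -> predicted distribution over candidate values
    slot_targets:  dict of slot name -> index of the annotated value
    action_probs:  predicted distribution over system actions
    action_target: index of the annotated system action
    """
    # Loss for user goal estimation: one cross-entropy term per tracked slot.
    tracking = sum(cross_entropy(slot_probs[s], slot_targets[s]) for s in slot_targets)
    # Loss for system action prediction.
    policy = cross_entropy(action_probs, action_target)
    return slot_weight * tracking + action_weight * policy


# Hypothetical predictions for one turn in a movie booking dialog.
loss = pretraining_loss(
    slot_probs={"movie": [0.7, 0.2, 0.1], "date": [0.5, 0.5]},
    slot_targets={"movie": 0, "date": 1},
    action_probs=[0.1, 0.8, 0.1],
    action_target=1,
)
print(loss)
```

In training, this per-turn loss would be summed over turns and dialogs and minimized with a gradient-based optimizer.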
Learn Interactively from User Feedback
❖ Interactive dialog learning with user feedback
[Figure: the agent is pre-trained with supervision on human-human dialog corpora; users then provide feedback for policy optimization]
Learn Interactively from User Feedback
❖ Use user feedback as dialog reward
❖ Introduce step penalty to encourage shorter dialog for task completion
❖ Optimize dialog model end-to-end with policy gradient RL
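A toy sketch of policy gradient learning with a step penalty follows. It is not the model from the talk: the two-action "dialog", the stateless softmax policy, the reward values, and the learning rate are all assumptions chosen to keep the example small. It uses the basic REINFORCE update, where each action's log-probability gradient is scaled by the return that followed it.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def run_episode(logits, rng, max_turns=10, step_penalty=0.1):
    """Toy dialog: action 1 completes the task, action 0 wastes a turn."""
    trajectory = []  # (action, reward) per turn
    for _ in range(max_turns):
        probs = softmax(logits)
        action = 0 if rng.random() < probs[0] else 1
        reward = -step_penalty          # every turn costs a little
        done = action == 1
        if done:
            reward += 1.0               # positive user feedback on completion
        trajectory.append((action, reward))
        if done:
            break
    return trajectory


def reinforce_update(logits, trajectory, lr=0.1):
    """One REINFORCE step: logits += lr * return_to_go * grad log pi(action)."""
    probs = softmax(logits)             # the policy that generated the episode
    new_logits = list(logits)
    ret = 0.0
    for action, reward in reversed(trajectory):
        ret += reward                   # undiscounted return from this turn on
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            new_logits[i] += lr * ret * grad
    return new_logits


def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    logits = [0.0, 0.0]
    for _ in range(episodes):
        logits = reinforce_update(logits, run_episode(logits, rng))
    return logits


final_logits = train()
print(softmax(final_logits))
```

Because wasted turns lower the return, the policy should shift probability toward completing the task early, which is the effect the step penalty is meant to produce.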
Learn Interactively from User Feedback
❖ Policy optimization with RL can be slow due to sparse reward
❖ Dialog state distribution mismatch between offline training and interactive learning leads to compounding errors
❖ Agent may learn to recover from bad state with RL but the search process can be very inefficient
→ Ask the user for correction/demonstration when the agent fails at a task, and learn to act from it
Learn Interactively from User Teaching
❖ Interactive dialog learning with user teaching
[Figure: the agent is pre-trained with supervision on human-human dialog corpora; new dialogs are driven by the agent's own policy; users correct mistakes and demo the desired dialog agent behavior; the resulting dialogs are added to the existing corpora]
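The teaching loop can be sketched with a toy slot-filling task. Everything here is hypothetical: the slot list, the rule-based `teacher_action` standing in for a human teacher, and the majority-vote lookup table standing in for supervised training of a neural model. The point is only the control flow: the agent acts with its own policy, mistakes are corrected, the corrections are added to the corpus, and the policy is retrained.

```python
from collections import Counter, defaultdict

SLOTS = ("movie", "date", "time")


def teacher_action(filled):
    """Hypothetical teacher: request the first missing slot, else book."""
    for slot in SLOTS:
        if slot not in filled:
            return "request_" + slot
    return "book"


def train_policy(corpus):
    """Stand-in for supervised training: most frequent action per state."""
    votes = defaultdict(Counter)
    for state, action in corpus:
        votes[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in votes.items()}


def agent_action(policy, state):
    return policy.get(state, "book")    # unseen state: fall back to a guess


def teaching_round(policy, start_states):
    """The agent drives each dialog; the teacher corrects every mistake."""
    corrections = []
    for filled in start_states:
        state = frozenset(filled)
        while True:
            action = agent_action(policy, state)
            correct = teacher_action(state)
            if action != correct:
                corrections.append((state, correct))
                action = correct        # continue from the corrected turn
            if action == "book":
                break
            state = state | {action[len("request_"):]}
    return corrections


corpus = [(frozenset(), "request_movie")]   # tiny initial corpus
policy = train_policy(corpus)
starts = [set(), {"date"}, {"movie", "time"}]
for round_index in range(3):
    corrections = teaching_round(policy, starts)
    corpus.extend(corrections)              # add to existing corpora
    policy = train_policy(corpus)           # retrain on the enlarged corpus
    print(round_index, len(corrections))
```

Because the corrections are collected on states the agent itself reaches, they cover exactly the situations missing from the initial corpus.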
Evaluation
❖ Movie booking domain simulation (M2M)
Slots: theatre name, movie, date, time, num of people
SL: Supervised pre-training model
IL: Imitation learning with user teaching
RL: Reinforcement learning with user feedback
Table: Human evaluation results. Mean and standard deviation of crowd worker scores (1-5)
B Liu, et al, "Dialogue Learning with Human Teaching and Feedback in End-To-End Trainable Task-Oriented Dialogue Systems", NAACL 2018.
What if a user does not provide any feedback? Can we still learn anything from the interaction?
Can we learn a dialog reward function?
❖ User feedback serves as reward to RL optimization
❖ Task completion based reward requires prior knowledge of user’s goal
→ NOT usually accessible in real world user interactions
❖ In practice, user feedback can be inconsistent and is NOT always available
Adversarial Dialog Learning
❖ Reward a machine-agent for conducting task-oriented dialog in a way that is indistinguishable from the way human-agents do it.
Bing Liu and Ian Lane, "Adversarial Learning of Task-Oriented Neural Dialog Models", in SIGDIAL 2018.
Discriminative Reward Model
[Figure: reward model inputs per dialog turn: user’s turn, agent’s turn, external entity info]
❖ Input:
➢ Sequence of dialog turns
❖ Representation:
➢ BiLSTM with max-pooling
❖ Output:
➢ Prob. of a dialog being successfully completed by a human agent
Bing Liu and Ian Lane, "Adversarial Learning of Task-Oriented Neural Dialog Models", in SIGDIAL 2018.
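The shape of such a reward model can be sketched in plain Python. This is an untrained illustration under stated assumptions: each dialog turn is already encoded as a fixed-size vector, weights are random, biases are omitted for brevity, and the class and parameter names (`LSTMCell`, `RewardModel`, `turn_dim`, `hidden_dim`) are invented here. It shows only the data flow: a forward and a backward LSTM over the turns, element-wise max-pooling over time, and a sigmoid output.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class LSTMCell:
    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate: input, forget, output, candidate.
        self.weights = [make_matrix(hidden_dim, input_dim + hidden_dim, rng)
                        for _ in range(4)]

    def step(self, x, h, c):
        xh = x + h                                  # concatenate input and state
        i, f, o, g = (matvec(w, xh) for w in self.weights)
        c_new = [sigmoid(fj) * cj + sigmoid(ij) * math.tanh(gj)
                 for ij, fj, gj, cj in zip(i, f, g, c)]
        h_new = [sigmoid(oj) * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

    def run(self, sequence):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        outputs = []
        for x in sequence:
            h, c = self.step(x, h, c)
            outputs.append(h)
        return outputs


class RewardModel:
    """Untrained sketch: BiLSTM over turns, max-pooling, sigmoid output."""

    def __init__(self, turn_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.forward_cell = LSTMCell(turn_dim, hidden_dim, rng)
        self.backward_cell = LSTMCell(turn_dim, hidden_dim, rng)
        self.out_weights = [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden_dim)]

    def score(self, turns):
        fwd = self.forward_cell.run(turns)
        bwd = self.backward_cell.run(turns[::-1])[::-1]
        states = [f + b for f, b in zip(fwd, bwd)]          # concat per turn
        pooled = [max(column) for column in zip(*states)]   # max over turns
        logit = sum(w * p for w, p in zip(self.out_weights, pooled))
        return sigmoid(logit)


# Hypothetical 3-turn dialog with 3-dimensional turn encodings.
model = RewardModel(turn_dim=3, hidden_dim=4)
dialog = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.2], [0.3, 0.3, 1.0]]
print(model.score(dialog))
```

Max-pooling gives a fixed-size dialog representation regardless of the number of turns, so the same output layer can score dialogs of any length.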
Model Training
❖ Supervised pre-training with an initial set of pos & neg samples
➢ Pre-train dialog agent G on positive dialog samples with MLE
➢ Pre-train discriminative reward function D on pos & neg samples
❖ Interactive learning cycle
➢ Collect new dialog sample(s) between agent G and users
➢ Update dialog agent G with RL using the reward produced by D
➢ Update reward function D using the newly collected sample(s)
➢ Continue for next learning cycle
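The training procedure above can be sketched as a loop over a deliberately tiny G and D. All modeling choices here are assumptions for illustration: a dialog is a list of booleans (whether the agent finished the task on each turn), G has a single parameter, D is a logistic regression on normalized dialog length, and the sample dialogs are made up. Labeling newly collected agent dialogs as negatives for D follows the usual adversarial setup and is also an assumption.

```python
import math
import random

MAX_TURNS = 10
HUMAN_DIALOGS = [[False, True], [True], [False, False, True]]   # positive samples
FAILED_DIALOGS = [[False] * 10, [False] * 8]                    # negative samples


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class Discriminator:
    """D: logistic regression on one feature, the normalized dialog length."""

    def __init__(self):
        self.w, self.b = 0.0, 0.0

    def prob_human(self, dialog):
        return sigmoid(self.w * len(dialog) / MAX_TURNS + self.b)

    def update(self, dialog, label, lr=1.0):
        error = label - self.prob_human(dialog)     # log-likelihood gradient
        self.w += lr * error * len(dialog) / MAX_TURNS
        self.b += lr * error


class Agent:
    """G: on each turn it finishes the task with probability sigmoid(theta)."""

    def __init__(self):
        self.theta = 0.0

    def finish_prob(self):
        return sigmoid(self.theta)

    def pretrain_mle(self, dialogs):
        # Closed-form MLE: fraction of turns on which the task was finished.
        p = sum(sum(d) for d in dialogs) / sum(len(d) for d in dialogs)
        self.theta = math.log(p / (1.0 - p))

    def sample_dialog(self, rng):
        dialog = []
        for _ in range(MAX_TURNS):
            dialog.append(rng.random() < self.finish_prob())
            if dialog[-1]:
                break
        return dialog

    def update(self, dialog, reward, lr=0.2):
        p = self.finish_prob()
        # REINFORCE: d/dtheta log pi is (1 - p) for finish, -p otherwise.
        grad = sum((1.0 - p) if finished else -p for finished in dialog)
        self.theta += lr * reward * grad


def pretrain_discriminator(positives, negatives, epochs=200):
    disc = Discriminator()
    for _ in range(epochs):
        for d in positives:
            disc.update(d, 1.0)
        for d in negatives:
            disc.update(d, 0.0)
    return disc


rng = random.Random(0)
agent = Agent()
agent.pretrain_mle(HUMAN_DIALOGS)                            # pre-train G
disc = pretrain_discriminator(HUMAN_DIALOGS, FAILED_DIALOGS)  # pre-train D

for cycle in range(200):                                     # interactive cycles
    dialog = agent.sample_dialog(rng)                        # collect new sample
    agent.update(dialog, disc.prob_human(dialog))            # RL with D's reward
    disc.update(dialog, 0.0)                                 # agent sample
    disc.update(rng.choice(HUMAN_DIALOGS), 1.0)              # human sample

print(agent.finish_prob())
```

No task-completion label or explicit user rating appears anywhere in the loop; the only reward signal G receives is D's estimate.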
Evaluation
❖ Comparing different reward functions
Bing Liu and Ian Lane, "Adversarial Learning of Task-Oriented Neural Dialog Models", in SIGDIAL 2018.
Summary
❖ The multi-turn nature of task-oriented dialogs makes it especially important for a system to learn through interaction with users
❖ Learning a task-oriented dialog model end-to-end with user teaching and feedback
❖ Adversarial dialog learning addresses the challenges of missing or inconsistent user feedback, with less human supervision
Thanks!
Q & A
