STAR: A Schema-Guided Dialog Dataset for Transfer Learning | Rasa Summit 2021

Rasa Technologies
STAR: A Schema-Guided Dialog
Dataset for Transfer Learning
Johannes Mosig*, Shikib Mehri*, Thomas Kober
@shikibmehri
Motivation
Pre-trained models changed NLP
Large-scale pre-training → Downstream fine-tuning
2
Motivation
Pre-trained models have helped significantly in open-domain dialog
● DialoGPT
● Meena
● Blender
3
Motivation
But what about task-oriented dialog?
● Unlike chit-chat systems, task-oriented systems must accomplish a goal
○ Limited space of valid responses
○ Often must interface with API/knowledge
● After training on Reddit, can a system make restaurant reservations?
4
Motivation
Scenario: Mary joins a COVID-19 hotline. Mary has human-level NLU/NLG and
open-domain dialog ability. Can Mary do her job without any training?
Answer: Probably not.
5
Motivation
So what can we get from large-scale pre-training?
● Language understanding
● Language generation
● General dialog skills
What don’t we get?
● Task-specific instructions/rules
6
Motivation
Scenario: Mary joins a COVID-19 hotline. Mary has human-level NLU/NLG and
open-domain dialog ability. What is the fastest way to train Mary?
● A training corpus that covers many situations → Training data
● A few examples? → Few-shot dialog
● A flow-chart describing the task? → Task-specific schema
7
Motivation
Scenario: COVID-19 is cured! Mary is out of a job. Luckily she found a new job at
a tech support call center. They give her a task-specific schema that looks a lot like
what she used at her old job. Can Mary do her job without any training?
Answer: Probably.
8
Dialog Schema Paradigm
● Schema acts as an inductive bias
● Pre-train → Train with schemas → Zero-shot on new task
● What transfers?
○ NLU, NLG, general dialog skills
○ how to follow the schema
9
Dataset Overview
STAR: Schema-Guided Dialog Dataset for Transfer Learning
● Wizard of Oz data collection
● Schema-guided data collection
● 24 tasks in 13 domains
10
Dataset Overview
11
Dialog Schemas
12
Motivation ctd.
What else makes a task-oriented dialog dataset good?
● System actions should be consistent
● Realistic and variable user behavior
● API/KB interface should be explicit
● There should be a progression of difficulty in the data
14
Consistency of System Actions
● Dialog often has a one-to-many problem → we try to eliminate it
○ Responses in a task-oriented system need not be diverse
○ System action at each timestep should be deterministic
● Achieved by:
○ Instructing AMT workers to follow the schema
○ Providing a suggestions module
15
Suggestions Module
● Wizard enters a response
● Gets a list of suggestions from the schema (using an NLU model)
● Can select suggestion (80%) or write custom response (20%)
16
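The suggestion step can be sketched roughly as follows. This is a minimal illustration, not the module used for STAR: the slides say an NLU model ranks the schema responses, whereas this sketch substitutes plain token-overlap (Jaccard) similarity. The `SCHEMA_RESPONSES` entries and the `suggest` function are hypothetical.

```python
import re

# Hypothetical schema responses: action label -> template utterance.
SCHEMA_RESPONSES = {
    "ask_name": "may i have your name, please?",
    "ask_venue": "at what venue is the party taking place?",
    "ask_arrival_time": "when are you planning to arrive at the party?",
}

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Token-overlap similarity between two token sets (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def suggest(wizard_text, responses=SCHEMA_RESPONSES, top_k=2):
    """Return the top_k schema actions most similar to the wizard's draft.

    Assumption: similarity is token overlap; the real module used an NLU model.
    """
    draft = tokenize(wizard_text)
    scored = [(jaccard(draft, tokenize(template)), action)
              for action, template in responses.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [action for _, action in scored[:top_k]]

print(suggest("what is your name?"))
```

The wizard would then either pick one of the returned actions or keep the custom text.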
Realistic User Behavior
● Realistic dialog rarely follows the task schema (happy path)
● Users:
○ Change their mind
○ Request explanation/justification
○ Engage in small talk
○ Get angry
○ Anything that deviates from the standard schema
17
In-Dialog User Instructions
18
Explicit API Interface
● Can’t transfer to a new task without:
○ making task-specific API requests
○ interpreting task-specific API outputs
● Make API requests/responses part of the schema and the dialog
19
API as Part of the Dialog
20
Instead of …
21
Formulate Dialog as a 3-Party Interaction
22
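One way to picture the 3-party formulation is as a flat list of turns in which the API is simply a third speaker. The sketch below is an assumption about how such data could be held in memory, not the STAR file format: `Turn`, `parse_key_values`, and `api_exchanges` are hypothetical names, and the comma split is naive (a value containing a comma would break it). The turns are taken from the hotel example in this deck.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    speaker: str  # one of "user", "wizard", "api"
    text: str

def parse_key_values(text):
    """Parse 'key = value , key = value' into a dict (naive split on commas)."""
    result = {}
    for part in text.split(","):
        if "=" not in part:
            continue
        key, value = part.split("=", 1)
        result[key.strip()] = value.strip()
    return result

def api_exchanges(dialog):
    """Return (request, response) dict pairs for each wizard turn that is
    immediately followed by an api turn."""
    pairs = []
    for current, following in zip(dialog, dialog[1:]):
        if current.speaker == "wizard" and following.speaker == "api":
            pairs.append((parse_key_values(current.text),
                          parse_key_values(following.text)))
    return pairs

dialog = [
    Turn("user", "hi ! i need a room at the shadyside , can you book it for me ?"),
    Turn("wizard", "when are you arriving ?"),
    Turn("wizard", "name = old town inn , startdate = 11th , enddate = 18th , "
                   "customername = mark , requesttype = check"),
    Turn("api", "message = available , hotelname = old town inn"),
    Turn("wizard", "alright , the old town inn ticks all of your boxes , "
                   "can i book this room for you ?"),
]

for request, response in api_exchanges(dialog):
    print(request, "->", response)
```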
Progression of Difficulty
● Happy Single-Task
● Unhappy Single-Task
● Multi-Task
24
Data Collection Stages
● STAGE I: Single-Task Tutorial + Quiz [10 questions]
● STAGE II: Happy/Unhappy Single-Task Collection
● STAGE III: Multi-Task Tutorial + Quiz [10 questions]
● STAGE IV: Multi-Task Data Collection
25
Example Quiz
For the assistant, what of these things is the MOST important?
(a) Being helpful to the user
(b) Following the flow chart of the current task whenever possible
(c) Making the conversation as short as possible
(d) Making the conversation as long as possible
26
Data Collection Interface
27
Dataset Overview
● 5,820 dialogs (127,833 utterances) across 24 tasks (13 domains)
○ Single-task: happy (2,688)/unhappy (1,464)
○ Multi-task (1,668)
● Properties
○ Schema-guided data collection for transfer learning
○ System-side consistency
○ Explicit API requests
28
Happy Dialog
user: hello
wizard: hello , how can i help ?
user: my name is angela and i need to rsvp to a party
wizard: at what venue is the party taking place ?
user: north heights venue
wizard: could you provide me with the name of the host , please ?
user: the hosts name is alexis
wizard: when are you planning to arrive at the party ?
…
wizard: name = north heights venue , hostname = alexia , guestname = angela , arrivaltime = 2 pm , numberguests = 6 , needparking = api . is _ equal _ to ( yes )
api: message = thank you for your rsvp . see you there .
wizard: great , your rsvp is all done and confirmed !
user: thank you !
29
Unhappy Dialog
user: hi ! i need a room at the shadyside , can you book it for me ?
wizard: when are you arriving ?
user: may 11th - 18th , and i need two beds in the room , must have them
wizard: may i have your name , please ?
user: yup , i ' m mark but i need to change the hotel to the old town inn , i just remembered that my friends said i should check that place out . still need two beds
wizard: name = old town inn , startdate = 11th , enddate = 18th , customername = mark , requesttype = check
api: message = available , hotelname = old town inn
wizard: alright , the old town inn ticks all of your boxes , can i book this room for you ?
user: hate to do this to you but my friends dick and jane stayed at another old town inn in a completely different city . i ' m not going to stay there without a recommendation . can you try the hilton ? this blog i love just raved about the authentic feel of the place .
31
Multi-Task Dialog
user: i ' m havin a party for my friend on sunday . help me out
wizard: when would you like your party to kick off ?
user: 5 pm
wizard: how many people would you like to invite ?
user: 80 . she ' s a big deal
. . .
user: my name is angela . actually , what would the weather be like on sunday ? we might just take it outside
wizard: for what location would you like the weather forecast ?
. . .
user: great . let ' s book a restaurant instead then . somewhere in the west
wizard: i can filter restaurants by name , location , food type , its average rating , whether or not they take reservations , and whether or not they have a delivery service .
user: either chinese or pizza is fine
33
Tasks in this dialog: party_plan → weather → restaurant_search → eventually restaurant_reserve
35
Tasks
● Response Generation
○ Produce the next response given (i) dialog history, (ii) API outputs, (iii) schema
● Next Action Prediction
○ Predict the next action (ask_name, query, confirm_booking) given (i) dialog history, (ii) API outputs, (iii) schema
36
Tasks
user: can you help me ?
wizard: hello , how can i help ?
user: i need an urgent ride to the airport .
wizard: may i have your name , please ?
user: alexis . i need the ride at forbes and murray .
wizard: customername = alexis , departurelocation = forbes and murray , arrivallocation = airport , requesttype = check
api: id = 592 , price = 49 , allowschanges = false , minutestillpickup = 9 , serviceprovider = taxi , drivername = mark , carmodel = corolla , licenseplate = 019 eas , departurelocation = forbes and murray , arrivallocation = airport , customername = alexis
Response: i found a taxi ride for you from ' forbes and murray ' to ' airport ' for 49 credits that could pick you up in 9 minutes . should i book that for you ?
Action: confirm_book_ride
37
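To make the next-action-prediction setup concrete, the sketch below turns a dialog prefix (including the API output) into a single classifier input paired with its action label. The bracketed speaker tags and the `build_example` helper are assumptions made for illustration, not the input format used in the STAR experiments, and the schema input is left out for brevity.

```python
def build_example(turns, action):
    """Flatten (speaker, text) turns into one input string paired with the
    action label. The "[speaker] text" layout is a hypothetical choice."""
    context = " ".join(f"[{speaker}] {text}" for speaker, text in turns)
    return {"input": context, "label": action}

# Dialog prefix taken from the ride-booking example in this deck
# (the API output is abbreviated to a few of its fields).
turns = [
    ("user", "can you help me ?"),
    ("wizard", "hello , how can i help ?"),
    ("user", "i need an urgent ride to the airport ."),
    ("wizard", "may i have your name , please ?"),
    ("user", "alexis . i need the ride at forbes and murray ."),
    ("wizard", "customername = alexis , departurelocation = forbes and murray , "
               "arrivallocation = airport , requesttype = check"),
    ("api", "id = 592 , price = 49 , minutestillpickup = 9 , serviceprovider = taxi"),
]

example = build_example(turns, "confirm_book_ride")
print(example["label"])
print(example["input"][:60])
```

A text classifier would then be trained to map `input` to `label`; for response generation the target would be the wizard's next utterance instead.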
Schema-Free Classification
38
Schema Representation
39
Determinism in the Schema
● System action is always deterministic
● The nodes representing a dialog state before the system action always have an out-degree of 1
43
Schema-Guided Classification
● Goal: Use the task-specific schema to guide next action prediction
44
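The determinism property (every dialog state that precedes a system action has out-degree 1) can be checked mechanically. The sketch below assumes a schema stored as a plain adjacency mapping from state to allowed system actions; that representation, the state and action names, and both helper functions are hypothetical and are not the schema format shipped with STAR.

```python
# Hypothetical schema fragment: each key is a dialog state reached after a
# user or API event; each value lists the system actions allowed from there.
SCHEMA = {
    "user_greets": ["ask_name"],
    "user_gives_name": ["ask_venue"],
    "user_gives_venue": ["query_api"],
    "api_returns_result": ["confirm_booking"],
}

def nondeterministic_states(schema):
    """Return the states whose out-degree is not exactly 1."""
    return [state for state, actions in schema.items()
            if len(set(actions)) != 1]

def next_action(schema, state):
    """Look up the single system action for a state; fail if it is ambiguous."""
    actions = set(schema[state])
    if len(actions) != 1:
        raise ValueError(
            f"state {state!r} has out-degree {len(actions)}, expected 1")
    return next(iter(actions))

print(nondeterministic_states(SCHEMA))
print(next_action(SCHEMA, "user_gives_name"))
```

With a deterministic schema, following the schema reduces to identifying the current state; the system action is then a lookup.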
Results - Next Action Prediction
Model          Happy   Unhappy  Multi-Task
BERT           73.30   73.93    73.61
BERT + Schema  71.09   72.28    73.13
For each stage (happy/unhappy/multi), models are trained with 80% of the data from the current stage plus all data from previous stages.
49
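The stage-wise training regime (80% of the current stage plus all data from earlier stages) could be assembled as sketched below. The shuffle, the seed, and the use of the remaining 20% as the test set are assumptions; the slides only state how the training portion is composed.

```python
import random

STAGES = ["happy", "unhappy", "multi"]

def split_stage(dialogs, train_fraction=0.8, seed=0):
    """Shuffle one stage's dialogs and split them into train and held-out parts."""
    items = list(dialogs)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]

def training_data_for(stage, data_by_stage, train_fraction=0.8, seed=0):
    """Train = train_fraction of the current stage + all data from earlier
    stages. Test = the held-out remainder of the current stage (assumption)."""
    index = STAGES.index(stage)
    train, test = split_stage(data_by_stage[stage], train_fraction, seed)
    for earlier in STAGES[:index]:
        train = train + list(data_by_stage[earlier])
    return train, test

# Toy data: dialog ids only, far smaller than the real dataset.
toy = {
    "happy": [f"h{i}" for i in range(10)],
    "unhappy": [f"u{i}" for i in range(5)],
    "multi": [f"m{i}" for i in range(5)],
}

for stage in STAGES:
    train, test = training_data_for(stage, toy)
    print(stage, len(train), len(test))
```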
Zero-Shot Next Action Prediction
Task Transfer
Model          Happy   Happy + Unhappy
BERT           36.45   36.89
BERT + Schema  36.77   37.15
Domain Transfer
Model          Happy   Happy + Unhappy
BERT           34.84   35.63
BERT + Schema  37.20   35.71
50
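A zero-shot evaluation needs splits in which the test task (or domain) never appears during training. The slides do not spell out the protocol, so the sketch below shows one common way to set it up, a leave-one-out split, as an assumption. The `leave_one_out` helper, the `task` and `domain` field names, and the domain labels are hypothetical; the task names are the ones mentioned in this deck.

```python
def leave_one_out(dialogs, key):
    """Yield (held_out_value, train, test) for each distinct value of `key`.

    With key="task" this gives task-transfer splits; with key="domain" every
    task in the held-out domain is removed from training.
    """
    values = sorted({d[key] for d in dialogs})
    for held_out in values:
        train = [d for d in dialogs if d[key] != held_out]
        test = [d for d in dialogs if d[key] == held_out]
        yield held_out, train, test

# Toy records; the domain labels are illustrative only.
dialogs = [
    {"id": 1, "task": "restaurant_search", "domain": "restaurant"},
    {"id": 2, "task": "restaurant_reserve", "domain": "restaurant"},
    {"id": 3, "task": "weather", "domain": "weather"},
    {"id": 4, "task": "party_plan", "domain": "party"},
]

for held_out, train, test in leave_one_out(dialogs, "domain"):
    print(held_out, [d["id"] for d in train], [d["id"] for d in test])
```

Domain transfer is the stricter setting: holding out `restaurant` removes both restaurant tasks from training, whereas task transfer would still leave the sibling task in the training set.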
Other Tasks
● Response Generation
● Knowledge base query prediction (state tracking)
● Schema prediction (set of dialogs → schema graph)
● Out-of-domain detection (predict when you’ve gone outside the schema)
51
Contributions
● STAR: Schema-guided Dialog Dataset for Transfer Learning
○ Task-specific schema allows zero-shot transfer learning
○ System consistency
○ Realistic user behavior
○ Progression of difficulty
● Schema-guided models for classification and generation
52
Future Work
● Improve schema-guided models
● Investigate happy → unhappy transfer
● Investigate happy → multi-task transfer
● Schema prediction (infer schema from a set of example dialogs)
53
Thank You
54
@shikibmehri
Paper
https://arxiv.org/pdf/2010.11853.pdf
Data
https://github.com/rasahq/star
STAR: A Schema-Guided Dialog Dataset for Transfer Learning | Rasa Summit 2021

  • 1. STAR: A Schema-Guided Dialog Dataset for Transfer Learning Johannes Mosig*, Shikib Mehri*, Thomas Kober @shikibmehri
  • 2. Motivation Pre-trained models changed NLP Large-scale pre-training → Downstream fine-tuning 2
  • 3. Motivation Pre-trained models have helped significantly in open-domain dialog ● DialoGPT ● Meena ● Blender 3
  • 4. Motivation But what about task-oriented dialog? ● Unlike chit-chat systems, task-oriented systems must accomplish a goal ○ Limited space of valid responses ○ Often must interface with API/knowledge ● After training on Reddit, can a system make restaurant reservations? 4
  • 5. Motivation Scenario: Mary joins a COVID-19 hotline. Mary has human-level NLU/NLG and open-domain dialog ability. Can Mary do her job without any training? Answer: Probably not. 5
  • 6. Motivation So what can we get from large-scale pre-training? ● Language understanding ● Language generation ● General dialog skills What don’t we get? ● Task-specific instructions/rules 6
  • 7. Motivation Scenario: Mary joins a COVID-19 hotline. Mary has human-level NLU/NLG and open-domain dialog ability. What is the fastest way to train Mary? ● A training corpus that covers many situations → Training data ● A few examples? → Few-shot dialog ● A flow-chart describing the task? → Task-specific schema 7
  • 8. Motivation Scenario: COVID-19 is cured! Mary is out of a job. Luckily she found a new job at a tech support call center. They give her a task-specific schema that looks a lot like what she used at her old job. Can Mary do her job without any training? Answer: Probably. 8
  • 9. Dialog Schema Paradigm ● Schema acts as an inductive bias ● Pre-train → Train with schemas → Zero-shot on new task ● What transfers? ○ NLU, NLG, general dialog skills ○ how to follow the schema 9
  • 10. Dataset Overview STAR: Schema-Guided Dialog Dataset for Transfer Learning ● Wizard of Oz data collection ● Schema guided data collection ● 24 tasks in 13 domains 10
  • 14. Motivation ctd. What else makes a task-oriented dialog dataset good? ● System actions should be consistent ● Realistic and variable user behavior ● API/KB interface should be explicit ● There should be a progression of difficulty in the data 14
  • 15. Consistency of System Actions ● Dialog often has one-to-many problem → We try to eliminate that ○ Responses in a task-oriented system need not be diverse ○ System action at each timestep should be deterministic ● Achieved this by ○ Told AMT workers to follow the schema ○ Suggestions module 15
  • 16. Suggestions Module ● Wizard enters a response ● Gets a list of suggestions from the schema (using an NLU model) ● Can select suggestion (80%) or write custom response (20%) 16
  • 17. Realistic User Behavior ● Realistic dialog rarely follows the task schema (happy path) ● Users: ○ Change their mind ○ Request explanation/justification ○ Engage in small talk ○ Get angry ○ Anything that deviates from the standard schema 17
  • 19. Explicit API Interface ● Can’t transfer to a new task without: ○ making task-specific API requests ○ interpreting task-specific API outputs ● Make API requests/responses part of the schema and the dialog 19
  • 20. API as Part of the Dialog ● Can’t transfer to a new task without: ○ making task-specific API requests ○ interpreting task-specific API outputs ● Make API requests/responses part of the schema and the dialog 20
  • 22. Formulate Dialog as a 3-Party Interaction 22
  • 23. Formulate Dialog as a 3-Party Interaction 23
  • 24. Progression of Difficulty ● Happy Single-Task ● Unhappy Single-Task ● Multi-Task 24
  • 25. Data Collection Stages ● STAGE I: Single-Task Tutorial + Quiz [10 questions] ● STAGE II: Happy/Unhappy Single-Task Collection ● STAGE III: Multi-Task Tutorial + Quiz [10 questions] ● STAGE IV: Multi-Task Data Collection 25
  • 26. Example Quiz For the assistant, what of these things is the MOST important? (a) Being helpful to the user (b) Following the flow chart of the current task whenever possible (c) Making the conversation as short as possible (d) Making the conversation as long as possible 26
  • 28. Dataset Overview ● 5820 dialogs (127,833 utterances) across 24 tasks (13 domains) ○ Single-task: happy (2,688)/unhappy (1,464) ○ Multi-task (1,668) ● Properties ○ Schema-guided data collection for transfer learning ○ System-side consistency ○ Explicit API requests 28
  • 29. Happy Dialog user: hello wizard: hello , how can i help ? user: my name is angela and i need to rsvp to a party wizard: at what venue is the party taking place ? user: north heights venue wizard: could you provide me with the name of the host , please ? user: the hosts name is alexis wizard: when are you planning to arrive at the party ? … wizard: name = north heights venue, hostname = alexia, guestname = angela, arrivaltime = 2 pm, numberguests = 6 , needparking = api . is _ equal _ to ( yes ) api: message = thank you for your rsvp . see you there . wizard: great , your rsvp is all done and confirmed ! user: thank you ! 29
  • 30. Happy Dialog user: hello wizard: hello , how can i help ? user: my name is angela and i need to rsvp to a party wizard: at what venue is the party taking place ? user: north heights venue wizard: could you provide me with the name of the host , please ? user: the hosts name is alexis wizard: when are you planning to arrive at the party ? … wizard: name = north heights venue, hostname = alexia, guestname = angela, arrivaltime = 2 pm, numberguests = 6 , needparking = api . is _ equal _ to ( yes ) api: message = thank you for your rsvp . see you there . wizard: great , your rsvp is all done and confirmed ! user: thank you ! 30
  • 31. Unhappy Dialog user: hi ! i need a room at the shadyside , can you book it for me ? wizard: when are you arriving ? user: may 11th - 18th , and i need two beds in the room , must have them wizard: may i have your name , please ? user: yup , i ' m mark but i need to change the hotel to the old town inn , i just remembered that my friends said i should check that place out . still need two beds wizard: name = old town inn , startdate = 11th , enddate = 18th , customername = mark , requesttype = check api: message = available , hotelname = old town inn wizard: alright , the old town inn ticks all of your boxes , can i book this room for you ? user: hate to do this to you but my friends dick and jane stayed at another old town inn in a completely different city . i ' m not going to stay there without a recommendation . can you try the hilton ? this blog i love just raved about the authentic feel of the place . 31
  • 32. Unhappy Dialog user: hi ! i need a room at the shadyside , can you book it for me ? wizard: when are you arriving ? user: may 11th - 18th , and i need two beds in the room , must have them wizard: may i have your name , please ? user: yup , i ' m mark but i need to change the hotel to the old town inn , i just remembered that my friends said i should check that place out . still need two beds wizard: name = old town inn , startdate = 11th , enddate = 18th , customername = mark , requesttype = check api: message = available , hotelname = old town inn wizard: alright , the old town inn ticks all of your boxes , can i book this room for you ? user: hate to do this to you but my friends dick and jane stayed at another old town inn in a completely different city . i ' m not going to stay there without a recommendation . can you try the hilton ? this blog i love just raved about the authentic feel of the place . 32
  • 33. Multi-Task Dialog
      user: i ' m havin a party for my friend on sunday . help me out
      wizard: when would you like your party to kick off ?
      user: 5 pm
      wizard: how many people would you like to invite ?
      user: 80 . she ' s a big deal
      . . .
      user: my name is angela . actually , what would the weather be like on sunday ? we might just take it outside
      wizard: for what location would you like the weather forecast ?
      . . .
      user: great . let ' s book a restaurant instead then . somewhere in the west
      wizard: i can filter restaurants by name , location , food type , its average rating , whether or not they take reservations , and whether or not they have a delivery service .
      user: either chinese or pizza is fine
  • 34. Multi-Task Dialog (same dialog as slide 33, annotated with its tasks): party_plan → weather → restaurant_search → eventually restaurant_reserve
  • 36. Tasks
      ● Response Generation
        ○ Produce the next response given (i) dialog history, (ii) API outputs, (iii) schema
      ● Next Action Prediction
        ○ Predict the next action (e.g., ask_name, query, confirm_booking) given (i) dialog history, (ii) API outputs, (iii) schema
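The three inputs listed on this slide can be pictured as one training example per wizard turn. The following is a minimal sketch, not the authors' data format: the class names, the `encode_input` helper, the separator, the schema identifier `ride_book`, and the pairing of this particular wizard turn with the `ask_name` label are all assumptions made for illustration. API outputs are simply treated as turns with the speaker `api`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    speaker: str  # "user", "wizard", or "api"
    text: str


@dataclass
class Example:
    history: List[Turn]  # dialog history, with API outputs as "api" turns
    schema: str          # hypothetical identifier of the task schema
    action: str          # target for next action prediction
    response: str        # target for response generation


def encode_input(example: Example, sep: str = " | ") -> str:
    """Flatten the schema identifier and the history into one model input string."""
    turns = [f"{t.speaker}: {t.text}" for t in example.history]
    return sep.join([f"schema: {example.schema}"] + turns)


example = Example(
    history=[
        Turn("user", "can you help me ?"),
        Turn("wizard", "hello , how can i help ?"),
        Turn("user", "i need an urgent ride to the airport ."),
    ],
    schema="ride_book",   # assumed name, not taken from the slides
    action="ask_name",    # assumed label for the wizard turn below
    response="may i have your name , please ?",
)

print(encode_input(example))
```

Both tasks share the same input; only the target differs (`action` for classification, `response` for generation).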
  • 37. Tasks
      user: can you help me ?
      wizard: hello , how can i help ?
      user: i need an urgent ride to the airport .
      wizard: may i have your name , please ?
      user: alexis . i need the ride at forbes and murray .
      wizard: customername = alexis , departurelocation = forbes and murray , arrivallocation = airport , requesttype = check
      api: id = 592 , price = 49 , allowschanges = false , minutestillpickup = 9 , serviceprovider = taxi , drivername = mark , carmodel = corolla , licenseplate = 019 eas , departurelocation = forbes and murray , arrivallocation = airport , customername = alexis
      Response: i found a taxi ride for you from ' forbes and murray ' to ' airport ' for 49 credits that could pick you up in 9 minutes . should i book that for you ?
      Action: confirm_book_ride
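The wizard's query and the API reply in this example are flat `key = value` lists. As a rough sketch of how such a line could be turned into a slot dictionary, assuming commas separate the pairs and never occur inside a value (`parse_slots` is a hypothetical helper, not part of the dataset tooling):

```python
from typing import Dict


def parse_slots(line: str) -> Dict[str, str]:
    """Parse 'key = value , key = value' into a dict, skipping malformed parts."""
    slots: Dict[str, str] = {}
    for part in line.split(","):
        if "=" not in part:
            continue  # ignore fragments without a key/value separator
        key, value = part.split("=", 1)
        slots[key.strip()] = value.strip()
    return slots


query = (
    "customername = alexis , departurelocation = forbes and murray , "
    "arrivallocation = airport , requesttype = check"
)
print(parse_slots(query))
```

A value that itself contains a comma would be split incorrectly, so real data would need a stricter format than this sketch assumes.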
  • 43. Determinism in the Schema: The system action is always deterministic. Every node representing a dialog state before a system action has an out-degree of 1.
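The out-degree property can be checked mechanically once a schema is written down as a graph. The sketch below assumes a schema stored as an adjacency mapping from node name to successor names; the toy graph, its node names, and the function `nondeterministic_states` are hypothetical and only illustrate the property stated on the slide.

```python
from typing import Dict, Iterable, List


def nondeterministic_states(
    schema: Dict[str, List[str]], pre_system_states: Iterable[str]
) -> List[str]:
    """Return the pre-system-action states whose out-degree is not exactly 1."""
    return sorted(
        state
        for state in pre_system_states
        if len(set(schema.get(state, []))) != 1
    )


# Hypothetical toy schema: each key is a node, each value lists its successors.
toy_schema = {
    "user_requests_ride": ["ask_name"],
    "ask_name": ["user_gives_name", "user_changes_destination"],
    "user_gives_name": ["query"],
    "user_changes_destination": ["query"],
}
# States that precede a system action; the user may branch, the system may not.
pre_system = ["user_requests_ride", "user_gives_name", "user_changes_destination"]

print(nondeterministic_states(toy_schema, pre_system))  # → []
```

Nodes that follow a system action (here `ask_name`) may have several successors, because the user is free to respond in different ways.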
  • 44. Schema-Guided Classification. Goal: use the task-specific schema to guide next action prediction.
  • 49. Results - Next Action Prediction
      Model           Happy   Unhappy   Multi-Task
      BERT            73.30   73.93     73.61
      BERT + Schema   71.09   72.28     73.13
      For each stage (happy/unhappy/multi), models are trained with 80% of the data from the current stage plus all data from previous stages.
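The staged training regime in the note above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the shuffling, the seed, and the use of the remaining 20% of the current stage as the evaluation set are assumptions, and `stage_split` is a hypothetical helper.

```python
import random
from typing import Dict, List, Sequence, Tuple

STAGES = ["happy", "unhappy", "multi"]


def stage_split(
    data: Dict[str, Sequence[str]],
    stage: str,
    train_frac: float = 0.8,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Train on all earlier stages plus train_frac of the current stage."""
    idx = STAGES.index(stage)
    current = list(data[stage])
    random.Random(seed).shuffle(current)  # assumed: random split with fixed seed
    cut = int(len(current) * train_frac)
    earlier = [d for s in STAGES[:idx] for d in data[s]]
    train = earlier + current[:cut]
    held_out = current[cut:]  # assumed: the rest is used for evaluation
    return train, held_out


# Placeholder dialog ids, ten per stage.
data = {s: [f"{s}_{i}" for i in range(10)] for s in STAGES}
train, held_out = stage_split(data, "unhappy")
print(len(train), len(held_out))  # → 18 2
```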
  • 50. Zero-Shot Next Action Prediction
      Task Transfer
      Model           Happy   Happy + Unhappy
      BERT            36.45   36.89
      BERT + Schema   36.77   37.15
      Domain Transfer
      Model           Happy   Happy + Unhappy
      BERT            34.84   35.63
      BERT + Schema   37.20   35.71
  • 51. Other Tasks
      ● Response Generation
      ● Knowledge base query prediction (state tracking)
      ● Schema prediction (set of dialogs → schema graph)
      ● Out-of-domain detection (predict when you’ve gone outside the schema)
  • 52. Contributions
      ● STAR: Schema-guided Dialog Dataset for Transfer Learning
        ○ Task-specific schema allows zero-shot transfer learning
        ○ System consistency
        ○ Realistic user behavior
        ○ Progression of difficulty
      ● Schema-guided models for classification and generation
  • 53. Future Work
      ● Improve schema-guided models
      ● Investigate happy → unhappy transfer
      ● Investigate happy → multi-task transfer
      ● Schema prediction (infer schema from a set of example dialogs)

Editor's Notes

  1. Evaluation metrics are important. They determine what gets PUBLISHED → SHAPE THE DIRECTION OF RESEARCH