SlideShare a Scribd company logo
A Comparative Study on Schema-guided
Dialog State Tracking
NAACL 2021
Jie Cao, Yi Zhang
Outlines
● Motivation
● Task Description and Datasets
● Three Comparative Studies
i. Encoder Architectures
ii. Supplementary training
iii. Impact of Schema Description Style
● Q&A
Motivation
Challenges of Virtual Assistants (Task-oriented)
● Increasing number of new services and APIs → (new annotation, new model retraining)
● Heterogeneous Interfaces for similar services, precisely understanding overlapping
functionalities.
● How to integrate common sense and world knowledge?
Schema-guided Dialogue State Tracking
Using Natural language description to explain the functionalities of tags, to help generalizing to unseen
tags in unseen domains.
Adding Intent
Description
Adding Slot
Description,
Type,
Possible values
{
"name": "SearchFlight",
"description": "Find a flight itinerary
between cities for a given date”,
“required_slots”: …..
},
{
"name": "FindFlight",
"description": “Search for flights to a
desitinaion”,
“required_slots”:……
},
Flight Service 1 Flight Service 2
{
"name": “num_stops",
"description": “Number of layovers
in the flight",
"is_categorical": true,
"possible_values": [“0”,”1”,”2”]
},
{
"name": “is_direct",
"description": “Whether the flight
directly arrive without any stop”,
"is_categorical": True,
"possible_values": [“true”,”false”]
},
Schema-guided Dialogue State Tracking
Given service schema with description and dialogue history,
predict dialog states after each user turn.
Schema-guided Dialogue State Tracking
Two Datasets:
1. Google SG-DST
2. MultiWOZ 2.2
Four Subtasks Each Turn:
1. Active Intent Classification
2. Requested Slot
3. Categorical Slot Values
a. Boolean
b. Predefined Values
i. Numeric
ii. Text
4. Non-Categorical Slot
Values
a. Span-based Value
Datasets
SG-DST has more overlapping functionalities than MultiWOZ 2.2
Challenges: Three Comparative Studies
Q1: How to encode the dialog and schema?
● For each turn, matching the same dialog history with all
schema descriptions multiple times.
● Sentence-pair(SNLI) and Token-level classification(QA)
Q2: How do different supplementary trainings help?
● Zero-shot learning for unseen services
Q3: How the model performs on various description styles?
● Unseen service may have heterogeneous styles.
Q1: How to encode the dialog and schema? (Cross-Encoder)
Cons: a lot of recomputing, slow
a. Dialog encoded multiple times within the same turn
b. Schema encoded multiple times across different turns.
Pros:
Accurate, each representation
is contextualized via
full-attention
Q1: How should the dialog and schema be encoded? (Dual-Encoder)
Pros:
Encoding dialog history and schema independently, can be
percomputed once and cached. Fast inference.
Cons:
a. Local self-attention
b. inaccurate
Q1: How to encode the dialog and schema?
Pros: moderate inference
Still independent precomputing.
but a thin full-attention fusion layer for better performance
Cons:
Moderate accuracy
Q1: How to encode the dialog and schema(for 4 subtasks) ?
By caching the token embedding instead of the single CLS embedding, a simple partial-attention
Fusion-Encoder can achieve much better performance than Dual-Encoder, while still infers two
times faster than Cross-Encoder
Q2: How do different supplementary trainings help?
Pretraining Fine-tuning
Pretraining Supplementary Training Fine-tuning
Target Task
Pre-training Task
Target Task
Pre-training Task Intermediate Tasks: NLI, QA
With Similar Problem Structures
Model Hub
e.g. hugging face,
ParlAI
Q2: How do different supplementary trainings help?
● SNLI only helps for Intent (emphasizing the whole sentence entailment), although
Req and Cat are also sentence-pair classification tasks.
● SQuAD consistently helps for non-categorical slot identification tasks, due to
span-based retrieving
● Supplementary training helps more on unseen services
Q3: How the model performs on various description styles?
Background:
1. To compatible with previous tag-based DST system, many
previous papers show simply adding question format to those
tags may help.
a. Is name-based description enough?
b. Does question format helps?
2. Unseen services may use different description style
a. heterogeneous evaluation?
Q3: How the model performs on various description styles? (homogeneous)
● Most name are meaningful, and perform not bad, especially on Intent/Req subtasks
● Rich description outperforms the name-based on NonCat, but inconsistent on other
tasks.
Is named-based description enough?
Q3: How the model performs on various description styles? (homogeneous)
● It generally helps on Cat/NonCat
● Adding it to rich description will benefit more from SQuAD2
supplementary training on unseen. However, not on MultiWOZ.
Is question-format helpful?
Q3: How the model performs on various description styles? (Heterogeneous)
● For unseen styles, all tasks surfer from inconsistencies, though to varying degrees
● For paraphrased styles, richer description are relatively more robust than named-based
descriptions.
What if unseen service in different description styles?
Takeaways
1. Cross-Encoder > Fusion-Encoder > Dual-Encoder in accuracy, while opposite
on inference speed.
2. To support low-resource unseen services, we quantified the gain via
supplementary training on different subtasks.
3. Simple named-based description are actually meaningful, and they perform
not bad, but not as robust as rich description in most cases.
4. All subtasks suffers from inconsistencies when using heterogeneous
description on unseen services, which requires future work on more robust
cross-style schema-guided dialog modeling.
Q&A?
Thanks

More Related Content

Similar to A Comparative Study on Schema-guided Dialog State Tracking

DOMAIN BASED CHUNKING
DOMAIN BASED CHUNKINGDOMAIN BASED CHUNKING
DOMAIN BASED CHUNKING
ijnlc
 
DOMAIN BASED CHUNKING
DOMAIN BASED CHUNKINGDOMAIN BASED CHUNKING
DOMAIN BASED CHUNKING
kevig
 
DOMAIN BASED CHUNKING
DOMAIN BASED CHUNKINGDOMAIN BASED CHUNKING
DOMAIN BASED CHUNKING
kevig
 
Which Questions We Should Have
Which Questions We Should HaveWhich Questions We Should Have
Which Questions We Should Have
Oracle Korea
 
Differences between method overloading and method overriding
Differences between method overloading and method overridingDifferences between method overloading and method overriding
Differences between method overloading and method overriding
Pinky Anaya
 
Metamorphic Domain-Specific Languages
Metamorphic Domain-Specific LanguagesMetamorphic Domain-Specific Languages
Metamorphic Domain-Specific Languages
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 
AI/ML Infra Meetup | ML explainability in Michelangelo
AI/ML Infra Meetup | ML explainability in MichelangeloAI/ML Infra Meetup | ML explainability in Michelangelo
AI/ML Infra Meetup | ML explainability in Michelangelo
Alluxio, Inc.
 
New c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_ivNew c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_iv
Nico Ludwig
 
Code Smell and Refactoring
Code Smell and RefactoringCode Smell and Refactoring
Code Smell and Refactoring
kimsrung lov
 
OOP lesson1 and Variables.pdf
OOP lesson1 and Variables.pdfOOP lesson1 and Variables.pdf
OOP lesson1 and Variables.pdf
HouseMusica
 
Tomáš Mikolov - Distributed Representations for NLP
Tomáš Mikolov - Distributed Representations for NLPTomáš Mikolov - Distributed Representations for NLP
Tomáš Mikolov - Distributed Representations for NLP
Machine Learning Prague
 
Illustrated Code (ASE 2021)
Illustrated Code (ASE 2021)Illustrated Code (ASE 2021)
C# interview
C# interviewC# interview
C# interview
ajeesharakkal
 
05 Lecture - PARALLEL Programming in C ++.pdf
05 Lecture - PARALLEL Programming in C ++.pdf05 Lecture - PARALLEL Programming in C ++.pdf
05 Lecture - PARALLEL Programming in C ++.pdf
alivaisi1
 
Frequently asked tcs technical interview questions and answers
Frequently asked tcs technical interview questions and answersFrequently asked tcs technical interview questions and answers
Frequently asked tcs technical interview questions and answers
nishajj
 
ODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLP
indico data
 
Importance Of Being Driven
Importance Of Being DrivenImportance Of Being Driven
Importance Of Being Driven
Antonio Terreno
 
Compiler gate question key
Compiler gate question keyCompiler gate question key
Compiler gate question key
ArthyR3
 
C++ programing lanuage
C++ programing lanuageC++ programing lanuage
C++ programing lanuage
Nimai Chand Das
 

Similar to A Comparative Study on Schema-guided Dialog State Tracking (20)

DOMAIN BASED CHUNKING
DOMAIN BASED CHUNKINGDOMAIN BASED CHUNKING
DOMAIN BASED CHUNKING
 
DOMAIN BASED CHUNKING
DOMAIN BASED CHUNKINGDOMAIN BASED CHUNKING
DOMAIN BASED CHUNKING
 
DOMAIN BASED CHUNKING
DOMAIN BASED CHUNKINGDOMAIN BASED CHUNKING
DOMAIN BASED CHUNKING
 
Which Questions We Should Have
Which Questions We Should HaveWhich Questions We Should Have
Which Questions We Should Have
 
Differences between method overloading and method overriding
Differences between method overloading and method overridingDifferences between method overloading and method overriding
Differences between method overloading and method overriding
 
Metamorphic Domain-Specific Languages
Metamorphic Domain-Specific LanguagesMetamorphic Domain-Specific Languages
Metamorphic Domain-Specific Languages
 
AI/ML Infra Meetup | ML explainability in Michelangelo
AI/ML Infra Meetup | ML explainability in MichelangeloAI/ML Infra Meetup | ML explainability in Michelangelo
AI/ML Infra Meetup | ML explainability in Michelangelo
 
New c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_ivNew c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_iv
 
73d32 session1 c++
73d32 session1 c++73d32 session1 c++
73d32 session1 c++
 
Code Smell and Refactoring
Code Smell and RefactoringCode Smell and Refactoring
Code Smell and Refactoring
 
OOP lesson1 and Variables.pdf
OOP lesson1 and Variables.pdfOOP lesson1 and Variables.pdf
OOP lesson1 and Variables.pdf
 
Tomáš Mikolov - Distributed Representations for NLP
Tomáš Mikolov - Distributed Representations for NLPTomáš Mikolov - Distributed Representations for NLP
Tomáš Mikolov - Distributed Representations for NLP
 
Illustrated Code (ASE 2021)
Illustrated Code (ASE 2021)Illustrated Code (ASE 2021)
Illustrated Code (ASE 2021)
 
C# interview
C# interviewC# interview
C# interview
 
05 Lecture - PARALLEL Programming in C ++.pdf
05 Lecture - PARALLEL Programming in C ++.pdf05 Lecture - PARALLEL Programming in C ++.pdf
05 Lecture - PARALLEL Programming in C ++.pdf
 
Frequently asked tcs technical interview questions and answers
Frequently asked tcs technical interview questions and answersFrequently asked tcs technical interview questions and answers
Frequently asked tcs technical interview questions and answers
 
ODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLP
 
Importance Of Being Driven
Importance Of Being DrivenImportance Of Being Driven
Importance Of Being Driven
 
Compiler gate question key
Compiler gate question keyCompiler gate question key
Compiler gate question key
 
C++ programing lanuage
C++ programing lanuageC++ programing lanuage
C++ programing lanuage
 

More from jie cao

Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral CodesObserving Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
jie cao
 
Task-oriented Conversational semantic parsing
Task-oriented Conversational semantic parsingTask-oriented Conversational semantic parsing
Task-oriented Conversational semantic parsing
jie cao
 
CCG
CCGCCG
CCG
jie cao
 
Talking Geckos (Question and Answering)
Talking Geckos (Question and Answering)Talking Geckos (Question and Answering)
Talking Geckos (Question and Answering)
jie cao
 
Spark调研串讲
Spark调研串讲Spark调研串讲
Spark调研串讲
jie cao
 
Parsing Natural Scenes and Natural Language with Recursive Neural Networks
Parsing Natural Scenes and Natural Language with Recursive Neural NetworksParsing Natural Scenes and Natural Language with Recursive Neural Networks
Parsing Natural Scenes and Natural Language with Recursive Neural Networks
jie cao
 
Challenges on Distributed Machine Learning
Challenges on Distributed Machine LearningChallenges on Distributed Machine Learning
Challenges on Distributed Machine Learning
jie cao
 

More from jie cao (7)

Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral CodesObserving Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
 
Task-oriented Conversational semantic parsing
Task-oriented Conversational semantic parsingTask-oriented Conversational semantic parsing
Task-oriented Conversational semantic parsing
 
CCG
CCGCCG
CCG
 
Talking Geckos (Question and Answering)
Talking Geckos (Question and Answering)Talking Geckos (Question and Answering)
Talking Geckos (Question and Answering)
 
Spark调研串讲
Spark调研串讲Spark调研串讲
Spark调研串讲
 
Parsing Natural Scenes and Natural Language with Recursive Neural Networks
Parsing Natural Scenes and Natural Language with Recursive Neural NetworksParsing Natural Scenes and Natural Language with Recursive Neural Networks
Parsing Natural Scenes and Natural Language with Recursive Neural Networks
 
Challenges on Distributed Machine Learning
Challenges on Distributed Machine LearningChallenges on Distributed Machine Learning
Challenges on Distributed Machine Learning
 

Recently uploaded

Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
ViralQR
 

Recently uploaded (20)

Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
 

A Comparative Study on Schema-guided Dialog State Tracking

  • 1. A Comparative Study on Schema-guided Dialog State Tracking NAACL 2021 Jie Cao, Yi Zhang
  • 2. Outlines ● Motivation ● Task Description and Datasets ● Three Comparative Studies i. Encoder Architectures ii. Supplementary training iii. Impact of Schema Description Style ● Q&A
  • 3. Motivation Challenges of Virtual Assistants (Task-oriented) ● Increasing number of new services and APIs → (new annotation, new model retraining) ● Heterogeneous Interfaces for similar services, precisely understanding overlapping functionalities. ● How to integrate common sense and world knowledge?
  • 4. Schema-guided Dialogue State Tracking Using Natural language description to explain the functionalities of tags, to help generalizing to unseen tags in unseen domains. Adding Intent Description Adding Slot Description, Type, Possible values { "name": "SearchFlight", "description": "Find a flight itinerary between cities for a given date”, “required_slots”: ….. }, { "name": "FindFlight", "description": “Search for flights to a desitinaion”, “required_slots”:…… }, Flight Service 1 Flight Service 2 { "name": “num_stops", "description": “Number of layovers in the flight", "is_categorical": true, "possible_values": [“0”,”1”,”2”] }, { "name": “is_direct", "description": “Whether the flight directly arrive without any stop”, "is_categorical": True, "possible_values": [“true”,”false”] },
  • 5. Schema-guided Dialogue State Tracking Given service schema with description and dialogue history, predict dialog states after each user turn.
  • 6. Schema-guided Dialogue State Tracking Two Datasets: 1. Google SG-DST 2. MultiWOZ 2.2 Four Subtasks Each Turn: 1. Active Intent Classification 2. Requested Slot 3. Categorical Slot Values a. Boolean b. Predefined Values i. Numeric ii. Text 4. Non-Categorical Slot Values a. Span-based Value
  • 7. Datasets SG-DST has more overlapping functionalities than MultiWOZ 2.2
  • 8. Challenges: Three Comparative Studies Q1: How to encode the dialog and schema? ● For each turn, matching the same dialog history with all schema descriptions multiple times. ● Sentence-pair(SNLI) and Token-level classification(QA) Q2: How do different supplementary trainings help? ● Zero-shot learning for unseen services Q3: How the model performs on various description styles? ● Unseen service may have heterogeneous styles.
  • 9. Q1: How to encode the dialog and schema? (Cross-Encoder) Cons: a lot of recomputing, slow a. Dialog encoded multiple times within the same turn b. Schema encoded multiple times across different turns. Pros: Accurate, each representation is contextualized via full-attention
  • 10. Q1: How should the dialog and schema be encoded? (Dual-Encoder) Pros: Encoding dialog history and schema independently, can be percomputed once and cached. Fast inference. Cons: a. Local self-attention b. inaccurate
  • 11. Q1: How to encode the dialog and schema? Pros: moderate inference Still independent precomputing. but a thin full-attention fusion layer for better performance Cons: Moderate accuracy
  • 12. Q1: How to encode the dialog and schema(for 4 subtasks) ? By caching the token embedding instead of the single CLS embedding, a simple partial-attention Fusion-Encoder can achieve much better performance than Dual-Encoder, while still infers two times faster than Cross-Encoder
  • 13. Q2: How do different supplementary trainings help? Pretraining Fine-tuning Pretraining Supplementary Training Fine-tuning Target Task Pre-training Task Target Task Pre-training Task Intermediate Tasks: NLI, QA With Similar Problem Structures Model Hub e.g. hugging face, ParlAI
  • 14. Q2: How do different supplementary trainings help? ● SNLI only helps for Intent (emphasizing the whole sentence entailment), although Req and Cat are also sentence-pair classification tasks. ● SQuAD consistently helps for non-categorical slot identification tasks, due to span-based retrieving ● Supplementary training helps more on unseen services
  • 15. Q3: How the model performs on various description styles? Background: 1. To compatible with previous tag-based DST system, many previous papers show simply adding question format to those tags may help. a. Is name-based description enough? b. Does question format helps? 2. Unseen services may use different description style a. heterogeneous evaluation?
  • 16. Q3: How the model performs on various description styles? (homogeneous) ● Most name are meaningful, and perform not bad, especially on Intent/Req subtasks ● Rich description outperforms the name-based on NonCat, but inconsistent on other tasks. Is named-based description enough?
  • 17. Q3: How the model performs on various description styles? (homogeneous) ● It generally helps on Cat/NonCat ● Adding it to rich description will benefit more from SQuAD2 supplementary training on unseen. However, not on MultiWOZ. Is question-format helpful?
  • 18. Q3: How the model performs on various description styles? (Heterogeneous) ● For unseen styles, all tasks surfer from inconsistencies, though to varying degrees ● For paraphrased styles, richer description are relatively more robust than named-based descriptions. What if unseen service in different description styles?
  • 19. Takeaways 1. Cross-Encoder > Fusion-Encoder > Dual-Encoder in accuracy, while opposite on inference speed. 2. To support low-resource unseen services, we quantified the gain via supplementary training on different subtasks. 3. Simple named-based description are actually meaningful, and they perform not bad, but not as robust as rich description in most cases. 4. All subtasks suffers from inconsistencies when using heterogeneous description on unseen services, which requires future work on more robust cross-style schema-guided dialog modeling.