STAR: A Schema-Guided Dialog Dataset for Transfer Learning
Johannes Mosig*, Shikib Mehri*, Thomas Kober
@shikibmehri
Motivation
Pre-trained models changed NLP
Large-scale pre-training → Downstream fine-tuning
2
Motivation
Pre-trained models have helped significantly in open-domain dialog
● DialoGPT
● Meena
● Blender
3
Motivation
But what about task-oriented dialog?
● Unlike chit-chat systems, task-oriented systems must accomplish a goal
○ Limited space of valid responses
○ Often must interface with API/knowledge
● After training on Reddit, can a system make restaurant reservations?
4
Motivation
Scenario: Mary joins a COVID-19 hotline. Mary has human-level NLU/NLG and
open-domain dialog ability. Can Mary do her job without any training?
Answer: Probably not.
5
Motivation
So what can we get from large-scale pre-training?
● Language understanding
● Language generation
● General dialog skills
What don’t we get?
● Task-specific instructions/rules
6
Motivation
Scenario: Mary joins a COVID-19 hotline. Mary has human-level NLU/NLG and
open-domain dialog ability. What is the fastest way to train Mary?
● A training corpus that covers many situations → Training data
● A few examples? → Few-shot dialog
● A flow-chart describing the task? → Task-specific schema
7
Motivation
Scenario: COVID-19 is cured! Mary is out of a job. Luckily she found a new job at
a tech support call center. They give her a task-specific schema that looks a lot like
what she used at her old job. Can Mary do her job without any training?
Answer: Probably.
8
Dialog Schema Paradigm
● Schema acts as an inductive bias
● Pre-train → Train with schemas → Zero-shot on new task
● What transfers?
○ NLU, NLG, general dialog skills
○ how to follow the schema
9
Dataset Overview
STAR: Schema-Guided Dialog Dataset for Transfer Learning
● Wizard of Oz data collection
● Schema guided data collection
● 24 tasks in 13 domains
10
Dataset Overview
11
Dialog Schemas
12
Dialog Schemas
13
Motivation (continued)
What else makes a task-oriented dialog dataset good?
● System actions should be consistent
● Realistic and variable user behavior
● API/KB interface should be explicit
● There should be a progression of difficulty in the data
14
Consistency of System Actions
● Dialog often has a one-to-many problem → We try to eliminate that
○ Responses in a task-oriented system need not be diverse
○ System action at each timestep should be deterministic
● Achieved by:
○ Telling AMT (Amazon Mechanical Turk) workers to follow the schema
○ A suggestions module
15
Suggestions Module
● Wizard enters a response
● Gets a list of suggestions from the schema (using an NLU model)
● Can select a suggestion (80%) or write a custom response (20%)
16
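The suggestion step above can be sketched roughly as follows. This is a minimal illustration, not the module used during data collection: plain string similarity from Python's `difflib` stands in for the NLU model, and `SCHEMA_RESPONSES` and `suggest` are hypothetical names with made-up candidate responses.

```python
from difflib import SequenceMatcher

# Hypothetical schema responses for one task (illustrative only).
SCHEMA_RESPONSES = [
    "may i have your name , please ?",
    "where would you like to be picked up ?",
    "where are you going ?",
    "should i book that for you ?",
]

def suggest(draft, candidates, top_k=3):
    """Rank schema responses by string similarity to the wizard's draft.

    Assumption: SequenceMatcher is a stand-in for the NLU model
    mentioned on the slide.
    """
    scored = [
        (SequenceMatcher(None, draft.lower(), candidate).ratio(), candidate)
        for candidate in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:top_k]]

# The wizard types a free-form draft and receives ranked schema suggestions.
print(suggest("can i get your name please?", SCHEMA_RESPONSES))
```

The wizard would then either pick one of the returned suggestions or keep the custom draft.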
Realistic User Behavior
● Realistic dialog rarely follows the task schema (happy path)
● Users:
○ Change their mind
○ Request explanation/justification
○ Engage in small talk
○ Get angry
○ Anything that deviates from the standard schema
17
In-Dialog User Instructions
18
Explicit API Interface
● Can’t transfer to a new task without:
○ making task-specific API requests
○ interpreting task-specific API outputs
● Make API requests/responses part of the schema and the dialog
19
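One way to picture API requests and responses as part of the dialog is a turn list with three speakers. The sketch below is an assumption about representation, not the dataset's actual file format: `Turn` and `parse_slots` are hypothetical helpers, and the turn texts are abridged from the ride example shown later in the deck.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    speaker: str  # one of "user", "wizard", "api"
    text: str

def parse_slots(text):
    """Parse a flat 'key = value , key = value' string into a dict.

    Assumption: values contain no commas, which holds for the
    examples shown in these slides.
    """
    slots = {}
    for part in text.split(","):
        if "=" in part:
            key, value = part.split("=", 1)
            slots[key.strip()] = value.strip()
    return slots

dialog = [
    Turn("user", "i need an urgent ride to the airport ."),
    Turn("wizard", "customername = alexis , departurelocation = forbes and murray , "
                   "arrivallocation = airport , requesttype = check"),
    Turn("api", "id = 592 , price = 49 , allowschanges = false , "
                "minutestillpickup = 9 , serviceprovider = taxi"),
]

# The API reply is just another turn, so it can be read back as structured data.
api_output = parse_slots(dialog[-1].text)
print(api_output["price"], api_output["serviceprovider"])
```

Keeping the API as a speaker means a model sees the request and the reply in the same history it uses for user and wizard turns.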
Instead of …
21
Formulate Dialog as a 3-Party Interaction
22
Progression of Difficulty
● Happy Single-Task
● Unhappy Single-Task
● Multi-Task
24
Data Collection Stages
● STAGE I: Single-Task Tutorial + Quiz [10 questions]
● STAGE II: Happy/Unhappy Single-Task Collection
● STAGE III: Multi-Task Tutorial + Quiz [10 questions]
● STAGE IV: Multi-Task Data Collection
25
Example Quiz
For the assistant, which of these things is the MOST important?
(a) Being helpful to the user
(b) Following the flow chart of the current task whenever possible
(c) Making the conversation as short as possible
(d) Making the conversation as long as possible
26
Data Collection Interface
27
Dataset Overview
● 5,820 dialogs (127,833 utterances) across 24 tasks (13 domains)
○ Single-task: happy (2,688)/unhappy (1,464)
○ Multi-task (1,668)
● Properties
○ Schema-guided data collection for transfer learning
○ System-side consistency
○ Explicit API requests
28
Happy Dialog
user: hello
wizard: hello , how can i help ?
user: my name is angela and i need to rsvp to a party
wizard: at what venue is the party taking place ?
user: north heights venue
wizard: could you provide me with the name of the host , please ?
user: the hosts name is alexis
wizard: when are you planning to arrive at the party ?
…
wizard: name = north heights venue, hostname = alexia, guestname = angela, arrivaltime = 2 pm, numberguests = 6 , needparking = api . is _ equal _ to ( yes )
api: message = thank you for your rsvp . see you there .
wizard: great , your rsvp is all done and confirmed !
user: thank you !
29
Unhappy Dialog
user: hi ! i need a room at the shadyside , can you book it for me ?
wizard: when are you arriving ?
user: may 11th - 18th , and i need two beds in the room , must have them
wizard: may i have your name , please ?
user: yup , i ' m mark but i need to change the hotel to the old town inn , i just remembered that my friends said i should check that place out . still need two beds
wizard: name = old town inn , startdate = 11th , enddate = 18th , customername = mark , requesttype = check
api: message = available , hotelname = old town inn
wizard: alright , the old town inn ticks all of your boxes , can i book this room for you ?
user: hate to do this to you but my friends dick and jane stayed at another old town inn in a completely different city . i ' m not going to stay there without a recommendation . can you try the hilton ? this blog i love just raved about the authentic feel of the place .
31
Multi-Task Dialog
user: i ' m havin a party for my friend on sunday . help me out
wizard: when would you like your party to kick off ?
user: 5 pm
wizard: how many people would you like to invite ?
user: 80 . she ' s a big deal
. . .
user: my name is angela . actually , what would the weather be like on sunday ? we might just take it outside
wizard: for what location would you like the weather forecast ?
. . .
user: great . let ' s book a restaurant instead then . somewhere in the west
wizard: i can filter restaurants by name , location , food type , its average rating , whether or not they take reservations , and whether or not they have a delivery service .
user: either chinese or pizza is fine
33
Tasks in this dialog: party_plan → weather → restaurant_search → eventually restaurant_reserve
35
Tasks
● Response Generation
○ Produce the next response given (i) dialog history, (ii) API outputs, (iii) schema
● Next Action Prediction
○ Predict the next action (ask_name, query, confirm_booking) given (i) dialog history, (ii) API outputs, (iii) schema
36
Tasks
user: can you help me ?
wizard: hello , how can i help ?
user: i need an urgent ride to the airport .
wizard: may i have your name , please ?
user: alexis . i need the ride at forbes and murray .
wizard: customername = alexis , departurelocation = forbes and murray , arrivallocation = airport , requesttype = check
api: id = 592 , price = 49 , allowschanges = false , minutestillpickup = 9 , serviceprovider = taxi , drivername = mark , carmodel = corolla , licenseplate = 019 eas , departurelocation = forbes and murray , arrivallocation = airport , customername = alexis
Response: i found a taxi ride for you from ' forbes and murray ' to ' airport ' for 49 credits that could pick you up in 9 minutes . should i book that for you ?
Action: confirm_book_ride
37
Schema-Free Classification
38
Schema Representation
39
Schema Representation
40
Schema Representation
41
Schema Representation
42
Determinism in the Schema
System action is always deterministic.
A node representing a dialog state before the system action always has an out-degree of 1.
43
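The out-degree property can be checked mechanically once a schema is written down as a graph. The fragment below is a hypothetical sketch: the state names are invented for illustration, the action names are the ones listed on the Tasks slide, and a plain dict is assumed as the graph representation.

```python
# Hypothetical schema fragment: each key is a dialog state (node) reached
# before the system acts; each value lists the system actions leaving it.
schema = {
    "user_greets": ["ask_name"],
    "user_gives_name": ["query"],
    "api_returns_result": ["confirm_booking"],
}

def nondeterministic_nodes(graph):
    """Return the nodes whose out-degree is not exactly 1."""
    return [node for node, actions in graph.items() if len(actions) != 1]

def next_action(graph, state):
    """Follow the single outgoing edge from a dialog state."""
    (action,) = graph[state]  # raises ValueError if the out-degree is not 1
    return action

print(nondeterministic_nodes(schema))          # empty list for a deterministic schema
print(next_action(schema, "user_gives_name"))
```

With this property in place, choosing the system action reduces to identifying the current node.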
Schema-Guided Classification
Goal: Use the task-specific schema to guide next action prediction.
44
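One way to read this goal: match the current dialog context against the schema's nodes and return the action attached to the best match. The toy sketch below uses token overlap in place of a learned encoder, so it is not the BERT + Schema model from the talk. The `SCHEMA` entries, the `goodbye` action, and the function names are hypothetical.

```python
def tokens(text):
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())

def jaccard(a, b):
    """Jaccard overlap between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical schema: example user utterance for a node -> next system action.
SCHEMA = {
    "i need a ride to the airport": "ask_name",
    "my name is alexis": "query",
    "thank you goodbye": "goodbye",
}

def predict_action(utterance, schema):
    """Return the action of the schema node most similar to the utterance."""
    best_node = max(schema, key=lambda node: jaccard(tokens(utterance), tokens(node)))
    return schema[best_node]

print(predict_action("i need an urgent ride to the airport .", SCHEMA))
```

Because the action comes from the schema rather than from a fixed label set, swapping in a different task's schema changes the predictions without retraining, which is the intuition behind zero-shot transfer here.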
Schema-Guided Classification
45
Schema-Guided Classification
46
Schema-Guided Classification
47
Schema-Guided Classification
48
Results - Next Action Prediction
Model           Happy   Unhappy   Multi-Task
BERT            73.30   73.93     73.61
BERT + Schema   71.09   72.28     73.13
For each stage (happy/unhappy/multi), models are trained with 80% of the data from the current stage plus all data from previous stages.
49
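The cumulative training regime can be sketched as below. The slide states only the training side (80% of the current stage plus everything from earlier stages); holding out the remaining 20% for evaluation, the shuffling, and the name `build_training_set` are assumptions made for this illustration.

```python
import random

STAGES = ["happy", "unhappy", "multi"]  # order of increasing difficulty

def build_training_set(dialogs_by_stage, stage, train_fraction=0.8, seed=0):
    """Return (train, held_out) for one stage.

    train    = train_fraction of the current stage + all earlier stages
    held_out = the rest of the current stage (assumed to be the test set)
    """
    rng = random.Random(seed)
    current = list(dialogs_by_stage[stage])
    rng.shuffle(current)
    cut = int(len(current) * train_fraction)
    train = current[:cut]
    for earlier in STAGES[:STAGES.index(stage)]:
        train.extend(dialogs_by_stage[earlier])
    return train, current[cut:]

# Toy data: integers stand in for dialogs.
data = {
    "happy": list(range(10)),
    "unhappy": list(range(100, 110)),
    "multi": list(range(200, 210)),
}
train, held_out = build_training_set(data, "unhappy")
print(len(train), len(held_out))
```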
Zero-Shot Next Action Prediction
Task Transfer
Model           Happy   Happy + Unhappy
BERT            36.45   36.89
BERT + Schema   36.77   37.15
Domain Transfer
Model           Happy   Happy + Unhappy
BERT            34.84   35.63
BERT + Schema   37.20   35.71
50
Other Tasks
● Response Generation
● Knowledge base query prediction (state tracking)
● Schema prediction (set of dialogs → schema graph)
● Out-of-domain detection (predict when you’ve gone outside the schema)
51
Contributions
● STAR: Schema-guided Dialog Dataset for Transfer Learning
○ Task-specific schema allows zero-shot transfer learning
○ System consistency
○ Realistic user behavior
○ Progression of difficulty
● Schema-guided models for classification and generation
52
Future Work
● Improve schema-guided models
● Investigate happy → unhappy transfer
● Investigate happy → multi-task transfer
● Schema prediction (infer schema from a set of example dialogs)
53
Thank You
@shikibmehri
Paper: https://arxiv.org/pdf/2010.11853.pdf
Data: https://github.com/rasahq/star
54

STAR: A Schema-Guided Dialog Dataset for Transfer Learning | Rasa Summit 2021

  • 1. STAR: A Schema-Guided Dialog Dataset for Transfer Learning Johannes Mosig*, Shikib Mehri*, Thomas Kober @shikibmehri
  • 2. Motivation Pre-trained models changed NLP Large-scale pre-training → Downstream fine-tuning
  • 3. Motivation Pre-trained models have helped significantly in open-domain dialog ● DialoGPT ● Meena ● Blender
  • 4. Motivation But what about task-oriented dialog? ● Unlike chit-chat systems, task-oriented systems must accomplish a goal ○ Limited space of valid responses ○ Often must interface with API/knowledge ● After training on Reddit, can a system make restaurant reservations?
  • 5. Motivation Scenario: Mary joins a COVID-19 hotline. Mary has human-level NLU/NLG and open-domain dialog ability. Can Mary do her job without any training? Answer: Probably not.
  • 6. Motivation So what can we get from large-scale pre-training? ● Language understanding ● Language generation ● General dialog skills What don’t we get? ● Task-specific instructions/rules
  • 7. Motivation Scenario: Mary joins a COVID-19 hotline. Mary has human-level NLU/NLG and open-domain dialog ability. What is the fastest way to train Mary? ● A training corpus that covers many situations → Training data ● A few examples? → Few-shot dialog ● A flow-chart describing the task? → Task-specific schema
  • 8. Motivation Scenario: COVID-19 is cured! Mary is out of a job. Luckily she found a new job at a tech support call center. They give her a task-specific schema that looks a lot like what she used at her old job. Can Mary do her job without any training? Answer: Probably.
  • 9. Dialog Schema Paradigm ● Schema acts as an inductive bias ● Pre-train → Train with schemas → Zero-shot on new task ● What transfers? ○ NLU, NLG, general dialog skills ○ How to follow the schema
  • 10. Dataset Overview STAR: Schema-Guided Dialog Dataset for Transfer Learning ● Wizard of Oz data collection ● Schema-guided data collection ● 24 tasks in 13 domains
  • 14. Motivation (continued) What else makes a task-oriented dialog dataset good? ● System actions should be consistent ● User behavior should be realistic and variable ● API/KB interface should be explicit ● There should be a progression of difficulty in the data
  • 15. Consistency of System Actions ● Dialog often has a one-to-many problem → we try to eliminate it ○ Responses in a task-oriented system need not be diverse ○ System action at each timestep should be deterministic ● Achieved by ○ Telling AMT workers to follow the schema ○ A suggestions module
  • 16. Suggestions Module ● Wizard enters a response ● Gets a list of suggestions from the schema (using an NLU model) ● Can select suggestion (80%) or write custom response (20%)
  • 17. Realistic User Behavior ● Realistic dialog rarely follows the task schema (happy path) ● Users: ○ Change their mind ○ Request explanation/justification ○ Engage in small talk ○ Get angry ○ Anything that deviates from the standard schema
  • 19. Explicit API Interface ● Can’t transfer to a new task without: ○ making task-specific API requests ○ interpreting task-specific API outputs ● Make API requests/responses part of the schema and the dialog
  • 20. API as Part of the Dialog ● Can’t transfer to a new task without: ○ making task-specific API requests ○ interpreting task-specific API outputs ● Make API requests/responses part of the schema and the dialog
  • 22. Formulate Dialog as a 3-Party Interaction
  • 23. Formulate Dialog as a 3-Party Interaction
  • 24. Progression of Difficulty ● Happy Single-Task ● Unhappy Single-Task ● Multi-Task
  • 25. Data Collection Stages ● STAGE I: Single-Task Tutorial + Quiz [10 questions] ● STAGE II: Happy/Unhappy Single-Task Collection ● STAGE III: Multi-Task Tutorial + Quiz [10 questions] ● STAGE IV: Multi-Task Data Collection
  • 26. Example Quiz For the assistant, what of these things is the MOST important? (a) Being helpful to the user (b) Following the flow chart of the current task whenever possible (c) Making the conversation as short as possible (d) Making the conversation as long as possible
  • 28. Dataset Overview ● 5,820 dialogs (127,833 utterances) across 24 tasks (13 domains) ○ Single-task: happy (2,688)/unhappy (1,464) ○ Multi-task (1,668) ● Properties ○ Schema-guided data collection for transfer learning ○ System-side consistency ○ Explicit API requests
  • 29. Happy Dialog
      user: hello
      wizard: hello , how can i help ?
      user: my name is angela and i need to rsvp to a party
      wizard: at what venue is the party taking place ?
      user: north heights venue
      wizard: could you provide me with the name of the host , please ?
      user: the hosts name is alexis
      wizard: when are you planning to arrive at the party ?
      …
      wizard: name = north heights venue, hostname = alexia, guestname = angela, arrivaltime = 2 pm, numberguests = 6 , needparking = api . is _ equal _ to ( yes )
      api: message = thank you for your rsvp . see you there .
      wizard: great , your rsvp is all done and confirmed !
      user: thank you !
  • 31. Unhappy Dialog
      user: hi ! i need a room at the shadyside , can you book it for me ?
      wizard: when are you arriving ?
      user: may 11th - 18th , and i need two beds in the room , must have them
      wizard: may i have your name , please ?
      user: yup , i ' m mark but i need to change the hotel to the old town inn , i just remembered that my friends said i should check that place out . still need two beds
      wizard: name = old town inn , startdate = 11th , enddate = 18th , customername = mark , requesttype = check
      api: message = available , hotelname = old town inn
      wizard: alright , the old town inn ticks all of your boxes , can i book this room for you ?
      user: hate to do this to you but my friends dick and jane stayed at another old town inn in a completely different city . i ' m not going to stay there without a recommendation . can you try the hilton ? this blog i love just raved about the authentic feel of the place .
  • 33. Multi-Task Dialog
      user: i ' m havin a party for my friend on sunday . help me out
      wizard: when would you like your party to kick off ?
      user: 5 pm
      wizard: how many people would you like to invite ?
      user: 80 . she ' s a big deal . . .
      user: my name is angela . actually , what would the weather be like on sunday ? we might just take it outside
      wizard: for what location would you like the weather forecast ?
      . . .
      user: great . let ' s book a restaurant instead then . somewhere in the west
      wizard: i can filter restaurants by name , location , food type , its average rating , whether or not they take reservations , and whether or not they have a delivery service .
      user: either chinese or pizza is fine
  • 34. Multi-Task Dialog (same dialog as slide 33, with task labels) party_plan, weather, restaurant_search, eventually restaurant_reserve
  • 36. Tasks ● Response Generation ○ Produce the next response given (i) dialog history, (ii) API outputs, (iii) schema ● Next Action Prediction ○ Predict the next action (e.g., ask_name, query, confirm_booking) given (i) dialog history, (ii) API outputs, (iii) schema
  • 37. Tasks
      user: can you help me ?
      wizard: hello , how can i help ?
      user: i need an urgent ride to the airport .
      wizard: may i have your name , please ?
      user: alexis . i need the ride at forbes and murray .
      wizard: customername = alexis, departurelocation = forbes and murray , arrivallocation = airport , requesttype = check
      api: id = 592 , price = 49 , allowschanges = false , minutestillpickup = 9 , serviceprovider = taxi , drivername = mark , carmodel = corolla , licenseplate = 019 eas , departurelocation = forbes and murray , arrivallocation = airport , customername = alexis
      Response: i found a taxi ride for you from ' forbes and murray ' to ' airport ' for 49 credits that could pick you up in 9 minutes . should i book that for you ?
      Action: confirm_book_ride
  • 43. Determinism in the Schema System action is always deterministic. The nodes representing a dialog state before the system action always have an out-degree of 1.
  • 44. Schema-Guided Classification Goal: Use the task-specific schema to guide next action prediction.
  • 49. Results - Next Action Prediction
      Model            Happy    Unhappy    Multi-Task
      BERT             73.30    73.93      73.61
      BERT + Schema    71.09    72.28      73.13
      For each stage (happy/unhappy/multi), models are trained with 80% of data from current stage + all data from previous stages
  • 50. Zero-Shot Next Action Prediction
      Task Transfer
      Model            Happy    Happy + Unhappy
      BERT             36.45    36.89
      BERT + Schema    36.77    37.15
      Domain Transfer
      Model            Happy    Happy + Unhappy
      BERT             34.84    35.63
      BERT + Schema    37.20    35.71
  • 51. Other Tasks ● Response Generation ● Knowledge base query prediction (state tracking) ● Schema prediction (set of dialogs → schema graph) ● Out-of-domain detection (predict when you’ve gone outside the schema)
  • 52. Contributions ● STAR: Schema-guided Dialog Dataset for Transfer Learning ○ Task-specific schema allows zero-shot transfer learning ○ System consistency ○ Realistic user behavior ○ Progression of difficulty ● Schema-guided models for classification and generation
  • 53. Future Work ● Improve schema-guided models ● Investigate happy → unhappy transfer ● Investigate happy → multi-task transfer ● Schema prediction (infer schema from a set of example dialogs)
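
Slide 43 says that every node representing a dialog state before a system action has an out-degree of 1. The sketch below shows one way to check that property on a toy graph. The adjacency-dict encoding, the node names, and the function `nondeterministic_states` are all assumptions made for illustration; they are not the STAR schema file format.

```python
from typing import Dict, Iterable, List

# Hypothetical encoding: each node maps to the list of nodes it points to.
SchemaGraph = Dict[str, List[str]]


def nondeterministic_states(graph: SchemaGraph, state_nodes: Iterable[str]) -> List[str]:
    """Return the state nodes whose number of distinct successors is not exactly 1."""
    return sorted(
        node for node in state_nodes
        if len(set(graph.get(node, []))) != 1
    )


# Toy schema (invented for this example): state nodes lead to exactly one
# system action, while system actions may branch into several user states.
toy_schema: SchemaGraph = {
    "user_said_hello": ["ask_name"],
    "user_gave_name": ["ask_departure"],
    "user_gave_departure": ["query_api"],
    "ask_name": ["user_gave_name"],
    "ask_departure": ["user_gave_departure", "user_gave_name"],
}
toy_states = ["user_said_hello", "user_gave_name", "user_gave_departure"]

# A broken variant where one state allows two different system actions.
broken_schema = dict(toy_schema, user_gave_name=["ask_departure", "query_api"])

print(nondeterministic_states(toy_schema, toy_states))
print(nondeterministic_states(broken_schema, toy_states))
```

On the toy schema the check returns an empty list. On the broken variant it reports `user_gave_name`, the one state with two possible system actions.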
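
In the example dialogs (slides 29, 31 and 37), API requests and responses appear as `key = value` pairs separated by commas. A minimal sketch of turning such a turn into a dictionary follows. The function `parse_slots` is hypothetical, and it assumes that values never contain a comma or an equals sign, which holds for the turns shown on the slides but may not hold for the full dataset.

```python
from typing import Dict


def parse_slots(turn: str) -> Dict[str, str]:
    """Parse 'key = value , key = value' text into a dict of slot values."""
    slots: Dict[str, str] = {}
    for part in turn.split(","):
        key, sep, value = part.partition("=")
        if sep:  # skip fragments that contain no '='
            # Collapse runs of whitespace left over from tokenization.
            slots[key.strip()] = " ".join(value.split())
    return slots


# Wizard API request from the unhappy dialog on slide 31.
request = ("name = old town inn , startdate = 11th , enddate = 18th , "
           "customername = mark , requesttype = check")
parsed = parse_slots(request)
print(parsed["name"], "|", parsed["requesttype"])
```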
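
Slide 49 notes that, for each stage, models are trained with 80% of the data from the current stage plus all data from previous stages. The sketch below illustrates that staging on placeholder data. The stage keys, the function `staged_split`, the random shuffle, and the use of the remaining 20% as the evaluation set are assumptions; the slides do not say how the held-out portion was chosen.

```python
import random
from typing import Dict, List, Tuple

# Stage order taken from the "Progression of Difficulty" slide.
STAGES = ("happy", "unhappy", "multi_task")


def staged_split(
    dialogs_by_stage: Dict[str, List[str]],
    stage: str,
    train_fraction: float = 0.8,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Train on all earlier stages plus a fraction of the current stage."""
    index = STAGES.index(stage)
    current = list(dialogs_by_stage[stage])
    random.Random(seed).shuffle(current)  # assumption: random hold-out
    cut = int(len(current) * train_fraction)
    earlier = [d for s in STAGES[:index] for d in dialogs_by_stage[s]]
    return earlier + current[:cut], current[cut:]


# Placeholder dialog IDs, ten per stage.
toy_data = {s: [f"{s}_{i}" for i in range(10)] for s in STAGES}
train, held_out = staged_split(toy_data, "unhappy")
print(len(train), len(held_out))
```

With ten placeholder dialogs per stage, the unhappy stage trains on all ten happy dialogs plus eight unhappy ones, leaving two unhappy dialogs held out.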

Editor's Notes

  1. Evaluation metrics are important. They determine what gets PUBLISHED → SHAPE THE DIRECTION OF RESEARCH