The Fourth Dialog State Tracking Challenge (DSTC4)

IWSDS 2016

Published in: Engineering

  1. IWSDS 2016: The Fourth Dialog State Tracking Challenge (DSTC4)
     • Authors: Seokhwan Kim (1), Luis Fernando D’Haro (1), Rafael E. Banchs (1), Jason D. Williams (2), Matthew Henderson (3)
     • Affiliations: (1) Institute for Infocomm Research, (2) Microsoft Research, (3) Google
  2. Dialogue State Tracking
     • A key subtask in dialogue management
     • To estimate the user’s goal as a dialogue progresses

     | Turn | System (S) | User (U) | Food | Area |
     |------|------------|----------|------|------|
     | 1 | Hello, How may I help you? | I need a Persian restaurant in the south part of town. | Persian | South |
     | 2 | What kind of food would you like? | Persian. | Persian | South |
     | 3 | I’m sorry but there is no restaurant serving persian food | How about Portuguese food? | Portuguese | South |
     | 4 | Peking restaurant is a nice place in the south of town. | Is that Portuguese? | Portuguese | South |
     | 5 | Nandos is a nice place in the south of town serving tasty Portuguese food. | Alright. Whats the phone number? | Portuguese | South |
     | 6 | The phone number of nandos is 01223 327908. | And the address? | Portuguese | South |
     | 7 | Sure, nandos is on Cambridge Leisure Park Clifton Way. | Thank you good bye. | Portuguese | South |
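To make the Food/Area example above concrete, here is a minimal sketch of a tracker that updates a slot-value state after each user turn. It is an illustration only: the keyword-spotting rule, the `ONTOLOGY` dictionary, and the function names are assumptions and do not come from the slides.

```python
# Hypothetical ontology: only the slot values that appear in the slide example.
ONTOLOGY = {
    "food": ["Persian", "Portuguese"],
    "area": ["South"],
}

# User turns copied from the example dialogue.
USER_TURNS = [
    "I need a Persian restaurant in the south part of town.",
    "Persian.",
    "How about Portuguese food?",
    "Is that Portuguese?",
    "Alright. Whats the phone number?",
    "And the address?",
    "Thank you good bye.",
]


def update_state(state, user_utterance):
    """Return a new state in which any slot value mentioned in the utterance
    overwrites the previous value for that slot (naive keyword spotting)."""
    new_state = dict(state)
    text = user_utterance.lower()
    for slot, values in ONTOLOGY.items():
        for value in values:
            if value.lower() in text:
                new_state[slot] = value
    return new_state


def track(user_turns):
    """Run the update over a whole dialogue and keep the state after each turn."""
    state, history = {}, []
    for utterance in user_turns:
        state = update_state(state, utterance)
        history.append(state)
    return history


if __name__ == "__main__":
    for turn, state in enumerate(track(USER_TURNS), start=1):
        print(turn, state)
```

The state carries over unchanged on turns that mention no slot value, which is why the last four turns keep the Portuguese/South goal.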
  3. Previous Dialog State Tracking Challenges
     • DSTC1 (Williams et al., SIGDIAL 2013)
       – Human-machine dialogues on bus timetable search
       – Collected with Let’s Go (CMU)
       – Focused on the evaluation metrics for state tracking
     • DSTC2 (Henderson et al., SIGDIAL 2014)
       – Human-machine dialogues on restaurant search
       – Collected with Cambridge University’s system
       – Introduced changing user goals in a single dialogue session
     • DSTC3 (Henderson et al., IEEE SLT 2014)
       – Human-machine dialogues on tourist information search
       – Collected with Cambridge University’s system
       – Addressed the problem of adaptation to a new domain from DSTC2
  4. TourSG: Dataset for DSTC4
     • Human-human dialogues
     • Tourist information in Singapore
     • Speakers
       – Guide (3 actual tour guides from Singapore)
       – Tourist (35 possible tourists from the Philippines)
     • Characteristics
       – Goal-oriented dialogues
       – Mixed-initiative dialogues
       – Knowledge-based dialogues
       – Multi-topic dialogues
       – Verbose dialogues
       – Noisy dialogues
  5. DSTC4: Timeline

     | Period | Task |
     |--------|------|
     | Mar 2012 – Oct 2012 | Data collection and annotation |
     | Sep 2014 – Dec 2014 | Internal discussions |
     | 7 Dec 2014 | Challenge planning meeting @ SLT 2014 |
     | Dec 2014 – Apr 2015 | Labelling additional annotations and building resources for evaluation |
     | 15 Apr 2015 – 16 Aug 2015 | Development phase of the main and pilot tasks of DSTC4 |
     | 17 Aug 2015 – 31 Aug 2015 | Evaluation phase of the main task of DSTC4 |
     | 14 Sep 2015 – 16 Sep 2015 | Evaluation phase of the pilot tasks of DSTC4 |
     | 30 Sep 2015 | Paper submission deadline to IWSDS 2016 |
  6. Main Task: Dialogue State Tracking
     • Motivation
       – Each subject could be expressed through a series of multiple turns
       – Multiple topics are interlaced in a session
     • Problem Definition
       – Dialogue state tracking for each sub-dialogue level
       – Focusing on the most common topic categories
     • Annotations
       – Segmentation
       – Topic Category
       – Frame Structure for major topic categories
         • Itinerary, Accommodation, Attraction, Food, Transportation
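One way to picture the three annotation layers listed above (segmentation, topic category, frame structure) is as a list of segments, each carrying a topic and a slot-value frame. The sketch below is an assumption made for illustration: the `Segment` class and its field names are hypothetical and are not the challenge's actual data format. The slot names and values are taken from the accommodation example in this deck, and the split of utterances between the two segments is a guess.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One sub-dialogue segment (hypothetical layout, not the official schema)."""
    topic: str                                        # topic category of the segment
    utterances: list = field(default_factory=list)    # utterance texts in the segment
    frame: dict = field(default_factory=dict)         # slot name -> list of values


# A session is an ordered list of segments; the state is tracked per segment.
session = [
    Segment(
        topic="ACCOMMODATION",
        utterances=["So I'm kinda looking for a hotel that is not that expensive."],
        frame={"TYPE": ["Hostel"], "PRICERANGE": ["Cheap"]},
    ),
    Segment(
        topic="ACCOMMODATION",
        utterances=["It's InnCrowd Backpackers Hostel in Singapore."],
        frame={"NAME": ["InnCrowd Backpackers Hostel"]},
    ),
]


def frames_for_topic(segments, topic):
    """Collect the frames of all segments with the given topic, in order."""
    return [seg.frame for seg in segments if seg.topic == topic]


if __name__ == "__main__":
    for frame in frames_for_topic(session, "ACCOMMODATION"):
        print(frame)
```

Keeping the frame on the segment rather than on the whole session reflects the problem definition: the same topic can recur, and each occurrence has its own state.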
  7. Examples of Dialogue States

     | Speaker | Utterance |
     |---------|-----------|
     | Tourist | Can you give me some uh- tell me some cheap rate hotels, because I'm planning just to leave my bags there and go somewhere take some pictures. |
     | Guide | Okay. I'm going to recommend firstly you want to have a backpack type of hotel, right? |
     | Tourist | Yes. I'm just gonna bring my backpack and my buddy with me. So I'm kinda looking for a hotel that is not that expensive. Just gonna leave our things there and, you know, stay out the whole day. |
     | Guide | Okay. Let me get you hm hm. So you don't mind if it's a bit uh not so roomy like hotel because you just back to sleep. |
     | Tourist | Yes. Yes. As we just gonna put our things there and then go out to take some pictures. |
     | Guide | Okay, um- |
     | Tourist | Hm. |
     | Guide | Let's try this one, okay? |
     | Tourist | Okay. |
     | Guide | It’s InnCrowd Backpackers Hostel in Singapore. If you take a dorm bed per person only twenty dollars. If you take a room, it's two single beds at fifty nine dollars. |
     | Tourist | Um. Wow, that's good. |
     | Guide | Yah, the prices are based on per person per bed or dorm. But this one is room. So it should be fifty nine for the two room. So you're actually paying about ten dollars more per person only. |
     | Tourist | Oh okay. That's- the price is reasonable actually. It's good. |

     Dialogue states:
     • TOPIC: ACCOMMODATION; TYPE: Hostel; PRICERANGE: Cheap
     • TOPIC: ACCOMMODATION; NAME: InnCrowd Backpackers Hostel
  8. Examples of Dialogue States

     | Speaker | Utterance |
     |---------|-----------|
     | Tourist | So uh is it near the airport? |
     | Guide | Hm no. But you can get there easily by taking the trains from the airport. You just need to make a change in the train direction. |
     | Tourist | Hm okay. Because I have no idea at all about Singapore trains or transit. Uh how can I go to the train or to the transit from the airport? Is it just outside the airport? |
     | Guide | So when you reach the airport you go down to the basement. |
     | Tourist | Um. Okay. |
     | Guide | So you get your ticket, you pay your deposit. And I think at the airport they gave you a map. and- to give you an idea. So all this is free. And then you travel along the East line towards the West. Can you see Tanah Merah on the second stop? |
     | Tourist | Okay. Hm, Tanah Merah, yes. |
     | Guide | Okay. So that is where you change to go down to town to the West towards the West. And you go down to- I think the easiest way is to go to Outram Park. |
     | Tourist | Outram Park. |
     | Guide | Yah. |
     | Tourist | Alright. |
     | Guide | So when you get up there, you take the line towards Little India. So it's one, two, three stops and you are at Little India. |
     | Tourist | Hm, okay. |

     Dialogue state:
     • TOPIC: TRANSPORTATION; FROM: Changi Airport; TO: InnCrowd Backpackers Hostel; BY: MRT
  9. Examples of Dialogue States

     | Speaker | Utterance |
     |---------|-----------|
     | Guide | So this is the place that you can go out and try street food. You can soak in the atmosphere. You would love taking your camera out because you can photograph the Indian garland makers, the fortune tellers. Uh it's full of life and culture. It's one of my favourite places. |
     | Tourist | Oh- Oh, great. Oh Yah. Is Little India is like a Indian community town or Indiantown? |
     | Guide | Yes. So there are Hindu temples there. You can photograph beautiful architecture and statues of the different Deities, the Hindu Deities. |
     | Tourist | Uh huh. Okay. Great. So other than Indiantowns, are there other uh nations town there or race town? What else? |
     | Guide | Okay. And then Chinatown you take the same line. Two, three stops down. So you'll get off at Chinatown, you are right in the heart of Chinatown. And in Chinatown we have uh also Buddhist temple and Terrace temple also great for photography. |
     | Tourist | Uh Yes. Yes, okay. Okay, great. So we have Little India, then Chinatown. Other than that two, there are other kinds of town, right? Like uh- is there a um something like Vietnamese town or just the two of these? |
     | Guide | Not Vietnamese but there is uh Kampong Glam which you have to go by bus because- well actually you could go by train because you are young and healthy you can walk. |
     | Tourist | Hm. Yah, I like walking. |

     Dialogue states:
     • TOPIC: ATTRACTION; NAME: Little India
     • TOPIC: ATTRACTION; TYPE: Ethnic enclave
     • TOPIC: ATTRACTION; NAME: Chinatown
     • TOPIC: ATTRACTION; TYPE: Ethnic enclave
     • TOPIC: ATTRACTION; NAME: Kampong Glam
  10. Examples of Dialogue States

     | Speaker | Utterance |
     |---------|-----------|
     | Tourist | So what about other than street food, of course I have to eat my dinner. Wha~ where do you suggest me to eat my dinner? I also want to experience Singaporean delicacies or Singaporean dishes. |
     | Guide | Do you like hot food? Do you like curries? |
     | Tourist | Curries? |
     | Guide | Yah. |
     | Tourist | Indian curries? What about Singaporean restaurants? Like they, you know, they offer Singaporean delicacies or Singaporean dishes? Do you have a Singaporean dishes in Singapore? |
     | Guide | Uh, Singaporean food is mostly try at the uh food courts. This is one I am recommending to you. It's at old market. It's Maxwell Road Food Centre. |
     | Tourist | Um. Road Food Centre. |
     | Guide | So it is at place called Maxwell Road which is in Chinatown. So if you take the train to Chinatown from where you are and you'd- It's near. You just walk there. |
     | Tourist | Okay, nice. |

     Dialogue states:
     • TOPIC: FOOD; CUISINE: Singaporean
     • TOPIC: FOOD; DISH: Curry
     • TOPIC: FOOD; CUISINE: Singaporean
     • TOPIC: FOOD; TYPE_OF_PLACE: Hawker centre; NAME: Maxwell Road Food Centre
  11. Main Task: Evaluation
     • Resources
       – Data
         • Training set: 14 dialogues with 12,759 utterances
         • Development set: 6 dialogues with 4,812 utterances
         • Test set: 9 dialogues with 7,848 utterances
       – Ontology
       – Evaluation scripts
       – Baseline tracker
         • Fuzzy string matching with the ontology entries
       – CodaLab: Web-based Competition Platform
     • Metrics
       – Schedules
         • Schedule 1: all turns are included
         • Schedule 2: only the turns at the end of segments are included
       – Metrics
         • Frame Structure-level Accuracy
         • Slot-level Precision/Recall/F-measure
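The two schedules and the two metric families listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the official evaluation script: frames are assumed to be dictionaries from slot name to a set of values, accuracy is assumed to mean an exact match of the whole frame, and the `Turn` class, its fields, and the example data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    gold: dict          # reference frame: slot name -> set of values
    pred: dict          # tracker output in the same shape
    segment_end: bool   # True if this is the last turn of its segment


def slot_value_pairs(frame):
    """Flatten a frame into a set of (slot, value) pairs."""
    return {(slot, value) for slot, values in frame.items() for value in values}


def evaluate(turns, schedule=1):
    """Frame-level accuracy and slot-level precision/recall/F-measure.

    Schedule 1 scores every turn; schedule 2 scores only segment-final turns.
    """
    scored = [t for t in turns if schedule == 1 or t.segment_end]
    exact = true_pos = n_pred = n_gold = 0
    for turn in scored:
        gold, pred = slot_value_pairs(turn.gold), slot_value_pairs(turn.pred)
        exact += int(gold == pred)      # whole frame must match exactly
        true_pos += len(gold & pred)    # correctly predicted slot-value pairs
        n_pred += len(pred)
        n_gold += len(gold)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    denom = precision + recall
    return {
        "accuracy": exact / len(scored) if scored else 0.0,
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / denom if denom else 0.0,
    }


# Illustrative two-turn segment: the tracker misses TYPE on the first turn.
GOLD = {"TYPE": {"Hostel"}, "PRICERANGE": {"Cheap"}}
EXAMPLE = [
    Turn(gold=GOLD, pred={"PRICERANGE": {"Cheap"}}, segment_end=False),
    Turn(gold=GOLD, pred=GOLD, segment_end=True),
]

if __name__ == "__main__":
    print("schedule 1:", evaluate(EXAMPLE, schedule=1))
    print("schedule 2:", evaluate(EXAMPLE, schedule=2))
```

In this toy case schedule 2 looks better than schedule 1 because the tracker has recovered the full frame by the end of the segment; an exact-match accuracy also penalises a single missing slot far more than the slot-level scores do.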
  12. Main task results per team and entry (S1 = Schedule 1, S2 = Schedule 2)

     | Team | Entry | S1 Accuracy | S1 Precision | S1 Recall | S1 F-measure | S2 Accuracy | S2 Precision | S2 Recall | S2 F-measure |
     |------|-------|-------------|--------------|-----------|--------------|-------------|--------------|-----------|--------------|
     | Baseline (0) | 0 | 0.0374 | 0.3589 | 0.1925 | 0.2506 | 0.0488 | 0.3750 | 0.2519 | 0.3014 |
     | 1 | 0 | 0.0456 | 0.3876 | 0.3344 | 0.3591 | 0.0584 | 0.4384 | 0.3377 | 0.3815 |
     | 1 | 1 | 0.0374 | 0.4214 | 0.2762 | 0.3336 | 0.0584 | 0.4384 | 0.3377 | 0.3815 |
     | 1 | 2 | 0.0372 | 0.4173 | 0.2767 | 0.3328 | 0.0575 | 0.4362 | 0.3377 | 0.3807 |
     | 1 | 3 | 0.0371 | 0.4179 | 0.2804 | 0.3356 | 0.0584 | 0.4384 | 0.3426 | 0.3846 |
     | 2 | 0 | 0.0487 | 0.4079 | 0.2626 | 0.3195 | 0.0671 | 0.4280 | 0.3257 | 0.3699 |
     | 2 | 1 | 0.0467 | 0.4481 | 0.2655 | 0.3335 | 0.0671 | 0.4674 | 0.3275 | 0.3851 |
     | 2 | 2 | 0.0478 | 0.4523 | 0.2623 | 0.3320 | 0.0706 | 0.4679 | 0.3226 | 0.3819 |
     | 2 | 3 | 0.0489 | 0.4440 | 0.2703 | 0.3361 | 0.0697 | 0.4634 | 0.3335 | 0.3878 |
     | 3 | 0 | 0.1212 | 0.5393 | 0.4980 | 0.5178 | 0.1500 | 0.5569 | 0.5808 | 0.5686 |
     | 3 | 1 | 0.1210 | 0.5449 | 0.4964 | 0.5196 | 0.1500 | 0.5619 | 0.5787 | 0.5702 |
     | 3 | 2 | 0.1092 | 0.5304 | 0.5031 | 0.5164 | 0.1316 | 0.5437 | 0.5875 | 0.5648 |
     | 3 | 3 | 0.1183 | 0.5780 | 0.4904 | 0.5306 | 0.1473 | 0.5898 | 0.5678 | 0.5786 |
     | 4 | 0 | 0.0887 | 0.5280 | 0.3595 | 0.4278 | 0.1072 | 0.5354 | 0.4273 | 0.4753 |
     | 4 | 1 | 0.0910 | 0.5314 | 0.3122 | 0.3933 | 0.1055 | 0.5325 | 0.3623 | 0.4312 |
     | 4 | 2 | 0.1009 | 0.5583 | 0.3698 | 0.4449 | 0.1264 | 0.5666 | 0.4455 | 0.4988 |
     | 4 | 3 | 0.1002 | 0.5545 | 0.3760 | 0.4481 | 0.1212 | 0.5642 | 0.4540 | 0.5031 |
     | 5 | 0 | 0.0309 | 0.2980 | 0.2559 | 0.2754 | 0.0392 | 0.3344 | 0.2547 | 0.2892 |
     | 5 | 1 | 0.0268 | 0.3405 | 0.2014 | 0.2531 | 0.0401 | 0.3584 | 0.2632 | 0.3035 |
     | 5 | 2 | 0.0309 | 0.3039 | 0.2659 | 0.2836 | 0.0392 | 0.3398 | 0.2639 | 0.2971 |
     | 6 | 0 | 0.0421 | 0.4175 | 0.2142 | 0.2831 | 0.0541 | 0.4380 | 0.2656 | 0.3307 |
     | 6 | 1 | 0.0478 | 0.5516 | 0.2180 | 0.3125 | 0.0654 | 0.5857 | 0.2702 | 0.3698 |
     | 6 | 2 | 0.0486 | 0.5623 | 0.2314 | 0.3279 | 0.0645 | 0.5941 | 0.2850 | 0.3852 |
     | 7 | 0 | 0.0286 | 0.2768 | 0.1826 | 0.2200 | 0.0323 | 0.3054 | 0.2410 | 0.2694 |
     | 7 | 1 | 0.0044 | 0.0085 | 0.0629 | 0.0150 | 0.0061 | 0.0109 | 0.0840 | 0.0194 |
  13. Main Task: Results [figure]
  14. Main Tasks: Error Distribution [figure]
  15. Main Tasks: Ensemble Learning

     | Ensemble | Schedule 1 Accuracy | Schedule 1 F-measure | Schedule 2 Accuracy | Schedule 2 F-measure |
     |----------|---------------------|----------------------|---------------------|----------------------|
     | Single best entry | 0.1212 | 0.5306 | 0.1500 | 0.5786 |
     | Top 3 entries: union | 0.1111- | 0.5147- | 0.1325- | 0.5619- |
     | Top 3 entries: intersection | 0.1241+ | 0.5344+ | 0.1561+ | 0.5861+ |
     | Top 3 entries: majority voting | 0.1172- | 0.5194- | 0.1421- | 0.5703 |
     | Top 5 entries: union | 0.0980- | 0.5133- | 0.1107- | 0.5543- |
     | Top 5 entries: intersection | 0.1157 | 0.4370- | 0.1369 | 0.5008- |
     | Top 5 entries: majority voting | 0.1183- | 0.5210- | 0.1439 | 0.5711 |
     | Top 10 entries: union | 0.0623- | 0.4719- | 0.0680- | 0.5014- |
     | Top 10 entries: intersection | 0.0300- | 0.1816- | 0.0453- | 0.2275- |
     | Top 10 entries: majority voting | 0.1268+ | 0.4741- | 0.1456 | 0.5380- |
     | All entries: union | 0.0077- | 0.1320- | 0.0078- | 0.1366- |
     | All entries: intersection | 0.0132- | 0.0229- | 0.0192- | 0.0331- |
     | All entries: majority voting | 0.0646- | 0.3535- | 0.0898- | 0.4135- |
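The three combination rules named in the table above (union, intersection, majority voting) can be sketched as set operations over tracker outputs. This is an assumption about the granularity: the slide does not say whether entries were combined per slot-value pair or per frame, so the sketch combines sets of (slot, value) pairs, and the function name and example data are hypothetical.

```python
from collections import Counter


def combine(outputs, method):
    """Combine several trackers' outputs for one turn.

    outputs: list of sets of (slot, value) pairs, one set per tracker.
    method:  "union", "intersection" or "majority".
    """
    if not outputs:
        return set()
    if method == "union":
        # Keep a pair if any tracker proposed it.
        return set().union(*outputs)
    if method == "intersection":
        # Keep a pair only if every tracker proposed it.
        return set.intersection(*outputs)
    if method == "majority":
        # Keep a pair if more than half of the trackers proposed it.
        counts = Counter(pair for output in outputs for pair in output)
        return {pair for pair, n in counts.items() if n > len(outputs) / 2}
    raise ValueError(f"unknown method: {method}")


# Illustrative outputs from three hypothetical trackers for one FOOD segment.
TRACKER_A = {("CUISINE", "Singaporean"), ("DISH", "Curry")}
TRACKER_B = {("CUISINE", "Singaporean")}
TRACKER_C = {
    ("CUISINE", "Singaporean"),
    ("DISH", "Curry"),
    ("NAME", "Maxwell Road Food Centre"),
}

if __name__ == "__main__":
    for method in ("union", "intersection", "majority"):
        print(method, sorted(combine([TRACKER_A, TRACKER_B, TRACKER_C], method)))
```

Union can only add pairs and intersection can only remove them, so the first tends to trade precision for recall and the second does the opposite; as more entries are combined, both extremes drift away from the single best entry, which is consistent with the falling scores in the table.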
  16. Pilot Tasks: Evaluation
     • Tasks
       – Spoken Language Understanding (SLU)
       – Speech Act Prediction (SAP)
       – Spoken Language Generation (SLG)
       – End-to-end System (EES)
     • Evaluation Metrics
       – SLU and SAP
         • Precision/Recall/F-measure
       – SLG and EES
         • BLEU
         • AM-FM
  17. Pilot Tasks: Evaluation
     • Resources
       – Data
         • Training set: 14 dialogues with 12,759 utterances
         • Development set: 6 dialogues with 4,812 utterances
         • Test set: 6 dialogues with 5,615 utterances
       – Ontology
       – Evaluation Scripts
         • Offline evaluation
         • Web-based evaluation
     • Web-based Evaluation
       [diagram: Web-server System (Participant) and Web-client Evaluation Script (Organizer), connected by JSON Messages]
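  18. Pilot Tasks: Results
     • Participant
       – SLU
         • Team 3 (5 entries)
     • Results

     | Speaker | Entry | Speech Act Precision | Speech Act Recall | Speech Act F-measure | Semantic Tag Precision | Semantic Tag Recall | Semantic Tag F-measure |
     |---------|-------|----------------------|-------------------|----------------------|------------------------|---------------------|------------------------|
     | Guide | 1 | 0.6287 | 0.5191 | 0.5687 | 0.5646 | 0.4886 | 0.5239 |
     | Guide | 2 | 0.6330 | 0.5227 | 0.5726 | 0.5646 | 0.4886 | 0.5239 |
     | Guide | 3 | 0.7451 | 0.6153 | 0.6740 | 0.5646 | 0.4886 | 0.5239 |
     | Guide | 4 | 0.6314 | 0.5214 | 0.5712 | 0.5646 | 0.4886 | 0.5239 |
     | Guide | 5 | 0.6762 | 0.5584 | 0.6117 | 0.5646 | 0.4886 | 0.5239 |
     | Tourist | 1 | 0.3583 | 0.2977 | 0.3252 | 0.5741 | 0.4764 | 0.5207 |
     | Tourist | 2 | 0.2931 | 0.2435 | 0.2660 | 0.5741 | 0.4764 | 0.5207 |
     | Tourist | 3 | 0.5627 | 0.4675 | 0.5107 | 0.5741 | 0.4764 | 0.5207 |
     | Tourist | 4 | 0.2939 | 0.2442 | 0.2668 | 0.5741 | 0.4764 | 0.5207 |
     | Tourist | 5 | 0.5736 | 0.4766 | 0.5206 | 0.5741 | 0.4764 | 0.5207 |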
  19. Conclusions
     • DSTC4
       – Main Task: Dialogue State Tracking
         • Multi-topic, Mixed-initiative, Human-human conversations
         • Tracking sub-dialogue segment-level state structures
         • 24 entries from 7 participants
       – Pilot Tasks
         • SLU, SAP, SLG, EES
         • Web-based evaluation
         • 5 SLU entries from a participant
  20. Thank You
