
SemEval 2018 Task 4: Character Identification on Multiparty Dialogues


Character identification is an entity linking task that finds the global entity of each personal mention in multiparty dialogue. For this task, the first two seasons of the popular TV show Friends are annotated, comprising a total of 448 dialogues, 15,709 mentions, and 401 entities. The personal mentions are detected from nominals referring to certain characters in the show, and the entities are collected from the list of all characters in those two seasons of the show. This task is challenging because it requires the identification of characters that are mentioned but may not be active during the conversation. Of the 90+ registered participants, four submitted system outputs, each showing strengths in different aspects of the task. Thorough analyses of the distributed datasets and system outputs, along with comparative studies, are also provided. To sustain this momentum, we created an open-source project for this task and publicly released a larger and cleaner dataset, hoping to support researchers in building better models.


  1. SemEval 2018 Task 4: Character Identification on Multiparty Dialogues
     Jinho D. Choi @ Emory University
     Henry Y. Chen @ Snap Inc.
     https://github.com/emorynlp/semeval-2018-task4
     http://nlp.mathcs.emory.edu
  2. Character Mining
     Extract or infer various information about individual characters in multiparty dialogues: text chat, audio chat, video chat.
  3. Character Mining
     Transcripts from the TV show Friends.
     A total of 10 seasons, 236 episodes, 3,107 scenes.
  4. Character Mining
     Character Identification, Reading Comprehension, Emotion Detection, Personality Test
     https://github.com/emorynlp/character-mining
  5. Character Identification
     Ross: I told mom and dad last night, they seemed to take it pretty well.
     Monica: Oh really, so that hysterical phone call I got from a woman at sobbing 3:00 A.M., "I'll never have grandchildren, I'll never have grandchildren." was what? A wrong number?
     Ross: Sorry.
     Joey: Alright Ross, look. You're feeling a lot of pain right now. You're angry. You're hurting. Can I tell you what the answer is?
     Entities: Jack, Monica, Ross, Joey, Judy
  6. Character Identification
     (Same dialogue as on slide 5.)
     Entity Linking or Coreference Resolution?
     Cross-document resolution is required: Judy & Jack appear in some other dialogue.
  7. Corpus
     Two seasons of Friends

                Episodes  Scenes  Speakers  Utterances  Sentences   Tokens
     Season 1         24     326       107       5,968     10,790   81,453
     Season 2         24     293       107       5,747      9,337   81,910
     Total            48     619       171      11,715     20,127  163,363

     Mentions: nominals referring to humans; pseudo-annotated, then manually corrected.
     Entities: characters in the show; manually double-annotated by crowd workers.
  8. Mention Annotation

                Named Entity  Pronoun  Gazetteer   Total
     Season 1            946    7,549        811   9,306
     Season 2          1,037    7,459        806   9,302
     Total             1,983   15,008      1,617  18,608

     A noun phrase is a mention if it is one of the following:
     1. A PERSON named entity, or
     2. A pronoun or possessive pronoun, excluding “it” and “they”, or
     3. An entry in the personal noun gazetteer, which consists of 603 common, singular personal nouns selected from Freebase and DBpedia (e.g., mom, girlfriend).
     These rules give an F1 score of about 95.93%.
  9. Entity Annotation

                Primary  Secondary  Generic  Collective  General  Other   Total
     Season 1     5,160      2,526      178       1,150      111    187   9,312
     Season 2     5,385      2,340      120       1,238       44    169   9,296
     Total       10,545      4,866      298       2,388      155    356  18,608

     1. Primary: the six main characters of the show.
     2. Secondary: all the non-primary characters.
     3. Generic: real entities whose identities are not defined (e.g., That waitress is really cute, I am going to ask her out).
     4. Collective: the plural use of the pronoun you, which cannot be deterministically distinguished from the singular use.
     5. General: mentions used in reference to a general case rather than a specific entity (e.g., The ideal guy you look for doesn’t exist).
     6. Other: all the other kinds of mentions.
     Used for this year’s shared task
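  10. Evaluation Metrics
      Labeling Accuracy:
      $$\text{Accuracy} = \frac{\text{\# of correctly identified mentions}}{\text{\# of all mentions}}$$
      Average Macro F1, where $\text{F1}_i$ is the F1 score of the $i$'th character and $N$ is the number of characters:
      $$\text{Macro F1} = \frac{1}{N} \sum_{i=1}^{N} \text{F1}_i$$
      Label sets: 6 Main Characters + Others; All Characters.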
  11. Overall Scores
      Over 90 teams registered at CodaLab.
      Only 4 teams made submissions :-(

                     Main + Others  All Characters
      Team             Acc      F1     Acc      F1
      AMORE-UPF      77.23   79.36   74.72   41.05
      KNU-CI         85.10   86.00   69.49   16.98
      Kampfpudding   73.36   73.51   59.45   37.37
      Zuma-AR        46.85   44.68   33.06   16.09
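  12. Main Characters + Others

                        Ross    Rachel  Chandler      Joey    Phoebe    Monica     OTHER
      AMORE-UPF        78.57     82.98     81.36     79.83     86.52     85.22     61.02
      KNU-CI           85.86     92.49     84.94     79.67     88.09     91.16     79.79
      Kampfpudding     73.48     70.67     79.25     63.38     79.79     73.35     74.61
      Zuma-AR          38.72     43.05     43.04     36.10     42.90     46.43     51.78
      Evaluation       18.98     13.96      9.80      9.51      9.02      8.97     29.77
      Training         13.93     12.37     11.43      9.43      8.79     10.61     33.44

      Team rows are per-label F1 scores (their mean matches the Main + Others F1 on slide 11). The Evaluation and Training rows each sum to about 100 and appear to give each label's share of mentions (%) in that set.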
  13. All Characters

                   Be     Ca     Ed     Pa     Ju     MB     Ri     Sc     Cr     Fr     Ja     OT
      AMORE     50.00  57.14  80.60  35.56  72.73  64.52  80.85  10.00  61.54   0.00  42.11   7.89
      KNU-CI    38.46  62.79  73.02  15.38  42.55   0.00  66.67  38.46   0.00  18.18  16.00   0.00
      Kamp.     31.86  33.33  68.85  33.33  60.32  50.00  61.22  10.00   0.00   0.00  23.53   0.00
      Zuma       0.00  12.24  44.44   0.00  27.91  15.38  77.78   0.00  38.46   0.00  12.50   0.00
      TST        3.46   1.73   1.56   1.44   1.32   0.86   0.86   0.78   0.74   0.70   0.62   2.92
      TRN        1.41   1.46   1.06   0.71   1.15   0.60   1.83   0.21   0.13   0.51   0.43  13.51
  14. Next Step
      Character Identification v2.0
      Four seasons of Friends are completely annotated.
      Plural mentions are annotated.
      https://github.com/emorynlp/character-identification
      Ethan Zhou and Jinho D. Choi. They Exist! Introducing Plural Mentions to Coreference Resolution and Entity Linking. COLING 2018.
  15. Thank You
      Henry Chen, Ethan Zhou
  16. Questions
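
The three-part mention rule on slide 8 can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' detector: the function name `is_mention`, the `ner_label` argument (assumed to come from some upstream named entity recognizer), and the contents of both word sets are hypothetical. The real gazetteer has 603 nouns; only the two examples named on the slide are included here, and the pronoun list is not exhaustive.

```python
from typing import Optional

# Hypothetical stand-in for the 603-noun gazetteer; only the two examples
# given on the slide are listed.
PERSONAL_NOUNS = {"mom", "girlfriend"}

# Illustrative (not exhaustive) pronouns and possessive pronouns.
# "it" and "they" are excluded per the slide; leaving out their other forms
# ("its", "them", "their") is an assumption made for this sketch.
PRONOUNS = {
    "i", "me", "my", "mine", "myself",
    "you", "your", "yours", "yourself",
    "he", "him", "his", "himself",
    "she", "her", "hers", "herself",
    "we", "us", "our", "ours", "ourselves",
}


def is_mention(text: str, ner_label: Optional[str] = None) -> bool:
    """Return True if a noun phrase counts as a mention under the three rules."""
    word = text.strip().lower()
    if ner_label == "PERSON":      # rule 1: PERSON named entity
        return True
    if word in PRONOUNS:           # rule 2: pronoun or possessive pronoun
        return True
    return word in PERSONAL_NOUNS  # rule 3: personal noun gazetteer entry
```

For example, `is_mention("Ross", "PERSON")` and `is_mention("mom")` both hold, while `is_mention("it")` does not.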
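
The two metrics on slide 10 can be sketched as follows. This is a minimal illustration, not the official scorer: the function names are hypothetical, and two details are assumptions because the slides do not state them. Macro F1 is averaged over the labels that occur in the gold data, and the "Main + Others" setting is approximated by mapping every non-main label to a single `OTHER` label.

```python
from collections import Counter


def labeling_accuracy(gold, pred):
    """# of correctly identified mentions / # of all mentions."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels=None):
    """Unweighted mean of per-character F1 scores."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if labels is None:
        labels = sorted(set(gold))  # assumption: average over gold labels only
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[g] += 1  # gold label gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def collapse_to_main(labels, main_characters):
    """Map every label outside main_characters to 'OTHER' (assumed scheme)."""
    return [x if x in main_characters else "OTHER" for x in labels]


# Toy data, not taken from the corpus.
gold = ["Ross", "Monica", "Judy", "Jack", "Ross"]
pred = ["Ross", "Monica", "Judy", "Ross", "Joey"]
accuracy = labeling_accuracy(gold, pred)
f1 = macro_f1(gold, pred)
```

Because every character carries equal weight in the macro average, rarely mentioned characters pull the score down sharply when a system misses them, which fits the much lower All Characters F1 values on slide 11.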
