Discourse component-based argument extraction of Seoul Korean
directives
Won Ik Cho¹, Young Ki Moon², Nam Soo Kim¹
Department of Electrical and Computer Engineering and INMC¹, Seoul National University,
Department of Computer Engineering², Inha University
E-mail: wicho@hi.snu.ac.kr, ykmoon0814@gmail.com, nkim@snu.ac.kr
Motivation
- Keyword, keyphrase, and argument
• Keyword/keyphrase: core content of a document
• A document can be an article, a paragraph, or a
sentence
• Keyphrase: a sequence of the keywords in a phrase form
• For a sentence (as document):
Keyword/keyphrase: the core expression of the sentence
• e.g., in “there is a dog in front of the door”
• Keyword: dog, door
• Keyphrase: a dog in front of the door
• But what about directive utterances?
• e.g., in “you know when the rain stops in tokyo”
• Keyword: rain, stops, tokyo
• Keyphrase: when the rain stops in tokyo
• In intent analysis, the argument is extracted as follows:
• Domain: Weather
• Intent: Asking
• Argument: The time the rain stops in tokyo
• Thus, an argument can be interpreted as a nominalized,
structured form of the sentence's keywords/keyphrases
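The domain/intent/argument breakdown above can be sketched as a small record type. This is only an illustration: the names `IntentFrame` and `keywords_covered` are hypothetical and do not come from the poster, and the whitespace-based keyword check is a deliberately naive stand-in for real keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntentFrame:
    """Hypothetical container for the three fields named on the poster."""
    domain: str
    intent: str
    argument: str


def keywords_covered(frame: IntentFrame, keywords: list[str]) -> bool:
    """Naive check: every keyword appears as a whitespace token of the argument."""
    tokens = frame.argument.lower().split()
    return all(k.lower() in tokens for k in keywords)


# The poster's example utterance and its analysis.
utterance = "you know when the rain stops in tokyo"
frame = IntentFrame(
    domain="Weather",
    intent="Asking",
    argument="the time the rain stops in tokyo",
)

# The argument is a nominalized phrase that still contains the keywords.
covered = keywords_covered(frame, ["rain", "stops", "tokyo"])
```

The point of the sketch is that the argument keeps the sentence keywords while recasting the keyphrase ("when the rain stops in tokyo") as a noun phrase ("the time the rain stops in tokyo").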
- Why discourse component (Portner, 2004) for directive
utterances (question & command)?
• Can be used as a syntactic-semantic term (if extended to the
speech act level) that conceptually matches an argument
Annotation Guideline
- Targets the questions/commands that are not rhetorical
- Yes/no, alternative, and wh-Q denote the property of the question
set (QS), not the sentence form
- Similarly, prohibition and requirement denote the property
of the to-do list (TDL), regardless of whether the utterance is
declarative, interrogative, or imperative
- For the Korean language:
• A strong requirement often takes the form [PH+REQ]
(prohibition + requirement)
• Scrambling happens frequently, but hardly changes the
argument (DC)
• Because subjects are frequently dropped, marking
speaker/addressee information in the argument often requires
guessing or remains ambiguous
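The label scheme in the guideline can be sketched as follows. This is a minimal illustration, not the authors' annotation format: the class names, field names, and the example utterance with its argument are all assumptions made for the sketch. What it shows is that the directive type is recorded separately from the sentence form, and that the type alone decides whether the argument belongs to the QS or the TDL.

```python
from dataclasses import dataclass
from enum import Enum


class DirectiveType(Enum):
    """Utterance types named in the guideline (identifiers are illustrative)."""
    YES_NO_Q = "yes/no question"
    ALTERNATIVE_Q = "alternative question"
    WH_Q = "wh-question"
    PROHIBITION = "prohibition"
    REQUIREMENT = "requirement"
    STRONG_REQUIREMENT = "strong requirement"


QUESTION_TYPES = {
    DirectiveType.YES_NO_Q,
    DirectiveType.ALTERNATIVE_Q,
    DirectiveType.WH_Q,
}


@dataclass
class AnnotatedUtterance:
    text: str
    sentence_form: str            # "declarative", "interrogative", or "imperative"
    directive_type: DirectiveType
    argument: str


def discourse_component(utt: AnnotatedUtterance) -> str:
    """Questions update the question set; commands update the to-do list."""
    return "QS" if utt.directive_type in QUESTION_TYPES else "TDL"


# Hypothetical example: interrogative in form, but a requirement in function.
example = AnnotatedUtterance(
    text="could you close the door",
    sentence_form="interrogative",
    directive_type=DirectiveType.REQUIREMENT,
    argument="to close the door",
)
component = discourse_component(example)
```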
Corpus Annotation
- Intonation-aided intention identification for Korean (3i4K)
- Corpus composition
- Questions: 17,869
- Commands: 12,968
- Five non-directive utterance types (not used)
- Human annotation
- Kept only the utterances on which at least three native
Seoul Korean speakers agreed
- Prototype corpus
- Size ~30K
- Deficit in:
- Alternative Q
- Prohibition
- Strong REQ
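The consensus step described above can be sketched as a simple agreement filter. The poster states only the threshold of at least three agreeing annotators; the data layout, the function names, and the toy labels below are assumptions made for illustration.

```python
from collections import Counter
from typing import Optional


def consensus_label(labels: list[str], min_agree: int = 3) -> Optional[str]:
    """Return the most common label if at least `min_agree` annotators chose it."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agree else None


def filter_by_consensus(
    annotations: dict[str, list[str]], min_agree: int = 3
) -> dict[str, str]:
    """Keep only utterances whose annotators reach the agreement threshold."""
    kept = {}
    for utterance, labels in annotations.items():
        label = consensus_label(labels, min_agree)
        if label is not None:
            kept[utterance] = label
    return kept


# Toy data: utterance IDs mapped to the labels given by individual annotators.
toy_annotations = {
    "utt-1": ["question", "question", "question"],
    "utt-2": ["command", "question", "command"],
    "utt-3": ["command", "command", "command", "question"],
}
kept = filter_by_consensus(toy_annotations)
```

With this toy input, `utt-2` is dropped because only two annotators agree, while `utt-1` and `utt-3` are kept.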
Data Augmentation (further work)
- Data augmentation
• Discourse component as the core content for (human-resourced)
sentence generation
• Human-generated utterances for AltQ, Wh-Q, PH, StrREQ
• 50,837 utterances in total! (Relatively balanced corpus)
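The corpus sizes quoted on the poster can be cross-checked with a few lines of arithmetic. The sketch assumes that the 50,837 total includes the prototype corpus, which the poster does not state explicitly; under that assumption, the difference is the number of human-generated utterances.

```python
# Figures taken from the poster.
questions = 17_869
commands = 12_968
augmented_total = 50_837

# Prototype corpus size, reported on the poster as "~30K".
prototype_size = questions + commands

# Assumption: the augmented total contains the prototype corpus,
# so the remainder is what augmentation added.
implied_added = augmented_total - prototype_size

print(f"prototype corpus: {prototype_size:,}")
print(f"implied additions: {implied_added:,}")
```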
Conclusion
- Discourse component as a conceptual counterpart of the
intent argument (DC, keyphrase) of directive utterances
- Corpus construction and data augmentation for (1) the
automatic extraction system and (2) linguistic analysis
of utterance types and subject drop
Goal & Research Questions (in progress)
- Implementation of an automatic argument extraction system
• Further application to semantic web search & answer
generation for spoken language understanding (SLU)
systems that aim at human-friendly conversation
- Qualitative comparison of the annotated and generated corpora
• Variance in arguments vs. variance in sentence styles
• Frequency and felicity of scrambling
- Subject drop and speaker/addressee notation
• Tendency (speaker? addressee? both? neither? ambiguous?)
regarding intent arguments and polarity items
[Figure: diagram relating Original sentence, Core content (SQL or keyphrase), and Paraphrase. Edge labels: bilingual pivoting / word swapping; human paraphrase; SeqSQL / argument extraction; rule-based / learning-based / human generation]
