SlideShare a Scribd company logo
1 of 1
Download to read offline
A Survey on Asking Clarification Questions Datasets
in Conversational Systems
Hossein A. Rahmani* 1
Xi Wang* 1
Yue Feng 1
Qiang Zhang 2
Emine Yilmaz 1
Aldo Lipani 1
1University College London, UK 2Zhejiang University, China
Overview
Resource: We systematically search through 77 relevant papers, selected as per their recency, relia-
bility and use frequency, in the Asking Clarification Questions (ACQ) domain from top-tier venues.
Survey Focus: We compare the ACQ datasets from their contributions to the development of ACQ
techniques and experimentally show the performance of representative techniques.
Novel Data Selection Strategy: We introduce a visualised semantic encoding strategy to explain
dataset suitability when selected for their corresponding experiments.
Future Directions: We analytically outline promising open research directions in the construction of
future datasets for ACQs, which sheds light on the development of future research.
Statistical Overview of ACQ Datasets
Table 1. A statistical summary of ACQ datasets for both Conv. Search and Conv. QA. The highlighted colours indicate the
distinct corpus size of datasets.
Dataset #Domains Scale #Clar. Q
Conversational Search
ClariT - 108K 260K
Qulac 198 10K 3K
ClariQ 300 2M 4K
TavakoliCQ 3 170K 7K
MIMICS - 462K 586K
MANtIS 14 80K 435
ClariQ-FKw 230 2K 2K
MSDialog 12 35K 877
MIMICS-Dou - 1K 1K
Conversational Question Answering
ClarQ 173 2M 2M
RaoCQ 3 77K 770K
AmazonCQ 2 24K 179K
CLAQUA 110 40K 40K
Research Tasks: Conversational Search (Conv. Search) and Conversational Question Answering
(Conv. QA).
Varied Scales of Data: Many large Conv. QA datasets (with background coloured in pink) due to a
rather convenient preparation of QAs than dialogues. Most Conv. Search datasets are medium size
with >1k clarification questions (in Cyan colour).
Datasets Details
Table 2. A Summary of collection details of ACQ datasets. ‘-’ means that the information is not available. ‘SE’ is
StackExchange, ‘MC’ refers to Microsoft Community, and ‘KB’ is Knowledge Base.
Dataset Published Built Resource Clar. Source
Conversational Search
ClariT 2023 Aug. 2018 General queries from task-oriented dialogues Crowdsourcing
Qulac 2019 2009-2012 198 topics from TREC WEB Data Crowdsourcing
ClariQ 2021 2009-2014 300 topics from TREC WEB Data Crowdsourcing
TavakoliCQ 2021 Jul. 2009 to Sep. 2019 3 domains of SE Post and Comment
MIMICS 2020 Sep. 2019 General queries from Bing users Machine Generated
MANtIS 2019 Mar. 2019 14 domains of SE Post and Comment
ClariQ-FKw 2021 2009-2014 TREC WEB Data Crowdsourcing
MSDialog 2018 Nov. 2005 to Oct. 2017 4 domains of MC Crowdsourcing
MIMICS-Duo 2022 Jan. 2022 to Feb. 2022 General queries from Bing users HIT on MTurk, Qualtrics
Conversational Question Answering
ClarQ 2020 - 173 domains of SE Post and Comment
RaoCQ 2018 - 3 domains of SE Post and Comment
AmazonCQ 2019 - A category of Amazon dataset Review and Comment
CLAQUA 2019 - From an open-domain KB Crowdsourcing
Recency of Data: Some datasets rely on the data collected years ago. For example, the Qulac, ClariQ
and ClariQ-FKw datasets consistently use the TREC WEB data but run between 2009 and 2014.
Resource Platform: TREC WEB data, StackExchange and Bing are the commonly considered re-
source for preparing clarification questions.
Data Collection Strategy: The crowdsourcing strategy is commonly applied to generate qualified
clarification questions. Note that the posts and comments of StackExchange are also widely used
to provide clarification questions.
According to the provided information, we conclude that the datasets have been collected based
on varied strategies, on different periods and use inconsistent resources. However, it is difficult to
tell how exactly a dataset is different from others and how to properly select a set of datasets to show
the performance of a newly introduced model.
Hence, in this survey, we introduce a visualisation-based approach to assist in the selection of datasets
for an improved experimental setup.
Datasets Analysis
−50 0 50
−100
−50
0
50
100
Dataset
ClariT
Qulac
ClariQ
MIMICS-Dou
ClariQ-FKw
MANtIS
MSDialog
MIMICS
TavakoliCQ
Comp-1
Comp-2
(a) tSNE on Conv. Search Datasets
−40 −20 0 20 40
−60
−40
−20
0
20
40
60 Dataset
AmazonCQ
RaoCQ
CLAQUA
ClarQ
Comp-1
Comp-2
(b) tSNE on Conv. QA Datasets
Figure 1. tSNE Semantic Representation on ACQ Datasets
Qulac and ClariQ datasets, and MIMICS and MIMICS-Dou datasets are highly overlapped.
Conv. Search datasets form 5 distinct clusters that can be used for evaluation.
CQs in Conv. Search are very focused while in Conv. QA datasets are more widely distributed.
Tasks and Evaluation Methods on ACQ Datasets
Table 3. Summary of tasks and evaluation method on ACQs datasets. The tasks can be generation and ranking, which are
indicated by ‘G’ and ‘R’, respectively.
Dataset
Task
Eval. Method
T1 T2 T3
Conv. Search
ClariT feng2023towards X G - Offline
Qulac AliannejadiSigir19 - R - Offline
ClariQ aliannejadi2021building X R - Offline
TavakoliCQ tavakoli2021analyzing - G - Offline
MIMICS zamani2020mimics X R, G X Offline/Online
MANtIS penha2019introducing - R, G - Offline
ClariQ-FKw sekulic2021towards - G - Offline
MSDialog qu2018analyzing - R, G - Offline
MIMICS-Duo tavakoli2022mimics X R, G X Offline/Online
Conv. QA
ClarQ kumar2020clarq - R - Offline
RaoCQ rao2018learning - R - Offline
AmazonCQ rao2019answer - G - Offline
CLAQUA xu2019asking X G - Offline
Evaluation Metrics
Automatic Evaluation
Ranking: MAP, Precision, Recall, F1-score, nDCG, MRR, MSE
Genration: BLUE, METEOR, ROUGE
Human Evaluation: Relevance, Usefulness, Naturalness, Clarification
Model Performance on ACQ Datasets
Table 4. Question relevance ranking performance evaluation on representative approaches. ‘P’ and ‘R’ refers to Precision
and Recall. ↑ or ↓ is added to Doc2Query + BM25 to indicate a consistent performance change to BM25 on all
evaluation metrics.
Model MAP P@10 R@10 NDCG
Qulac
BM25 0.6306 0.9196 0.1864 0.9043
Doc2Query + BM25 0.6289 0.9196 0.1860 0.9069
ClariQ
BM25 0.6360 0.7500 0.5742 0.7211
Doc2Query + BM25 ↑ 0.6705 0.7899 0.6006 0.7501
ClarQ
BM25 0.2011 0.0259 0.2596 0.2200
Doc2Query + BM25 ↓ 0.1977 0.0263 0.2630 0.2168
CLAQUA
BM25 0.9600 0.0992 0.9920 0.9683
Doc2Query + BM25 ↓ 0.9395 0.0990 0.9901 0.9523
Limitations of Existing Studies:
Many research studies have inconsistent uses of datasets as well as incomparable results with distinct
experimental setups.
Another common issue is the use of different baselines to show the leading performance of newly
introduced techniques.
Many techniques were introduced while tested on a single dataset to show their top performance
Discussion and Future Challenges
Findings
Missing Standard Benchmark
Few User-System Interactions Recorded for Evaluation
Inconsistent Dataset Collection and Formatting
Inconsistent Model Evaluation
Future Research Directions
Standard Benchmark Development
ACQ Evaluation Framework
Large-Scale Human-to-Machine Dataset
Multi-Modal ACQs Dataset
Acknowledgements
This research is supported by the Engineering and Physical Sciences Research Council [EP/S021566/1]
and the EPSRC Fellowship titled “Task Based Information Retrieval” [EP/P024289/1].
Paper and Codes
The 61st Annual Meeting of the Association for Computational Linguistics Toronto, Canada hossein.rahmani.22@ucl.ac.uk

More Related Content

Similar to ACQSurvey (Poster)

print mod 2.pdf
print mod 2.pdfprint mod 2.pdf
print mod 2.pdflathass5
 
Network Flow Pattern Extraction by Clustering Eugine Kang
Network Flow Pattern Extraction by Clustering Eugine KangNetwork Flow Pattern Extraction by Clustering Eugine Kang
Network Flow Pattern Extraction by Clustering Eugine KangEugine Kang
 
Using Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDESUsing Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDESDATAVERSITY
 
Linked Data Hypercubes
Linked Data HypercubesLinked Data Hypercubes
Linked Data HypercubesDave Reynolds
 
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval IJECEIAES
 
E132833
E132833E132833
E132833irjes
 
GSK: How Knowledge Graphs Improve Clinical Reporting Workflows
GSK: How Knowledge Graphs Improve Clinical Reporting WorkflowsGSK: How Knowledge Graphs Improve Clinical Reporting Workflows
GSK: How Knowledge Graphs Improve Clinical Reporting WorkflowsNeo4j
 
IRJET- Review of Existing Methods in K-Means Clustering Algorithm
IRJET- Review of Existing Methods in K-Means Clustering AlgorithmIRJET- Review of Existing Methods in K-Means Clustering Algorithm
IRJET- Review of Existing Methods in K-Means Clustering AlgorithmIRJET Journal
 
IRJET- Foster Hashtag from Image and Text
IRJET-  	  Foster Hashtag from Image and TextIRJET-  	  Foster Hashtag from Image and Text
IRJET- Foster Hashtag from Image and TextIRJET Journal
 
Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Rakebul Hasan
 
IRJET- Optimal Number of Cluster Identification using Robust K-Means for ...
IRJET-  	  Optimal Number of Cluster Identification using Robust K-Means for ...IRJET-  	  Optimal Number of Cluster Identification using Robust K-Means for ...
IRJET- Optimal Number of Cluster Identification using Robust K-Means for ...IRJET Journal
 
An Effective Storage Management for University Library using Weighted K-Neare...
An Effective Storage Management for University Library using Weighted K-Neare...An Effective Storage Management for University Library using Weighted K-Neare...
An Effective Storage Management for University Library using Weighted K-Neare...C Sai Kiran
 
Heuristic based query optimisation for rsp(rdf stream processing) engines
Heuristic based query optimisation for rsp(rdf stream processing) enginesHeuristic based query optimisation for rsp(rdf stream processing) engines
Heuristic based query optimisation for rsp(rdf stream processing) enginesWilliam Aruga
 
Review of Existing Methods in K-means Clustering Algorithm
Review of Existing Methods in K-means Clustering AlgorithmReview of Existing Methods in K-means Clustering Algorithm
Review of Existing Methods in K-means Clustering AlgorithmIRJET Journal
 
Optimizing Your Supply Chain with Neo4j
Optimizing Your Supply Chain with Neo4jOptimizing Your Supply Chain with Neo4j
Optimizing Your Supply Chain with Neo4jNeo4j
 

Similar to ACQSurvey (Poster) (20)

print mod 2.pdf
print mod 2.pdfprint mod 2.pdf
print mod 2.pdf
 
Network Flow Pattern Extraction by Clustering Eugine Kang
Network Flow Pattern Extraction by Clustering Eugine KangNetwork Flow Pattern Extraction by Clustering Eugine Kang
Network Flow Pattern Extraction by Clustering Eugine Kang
 
Using Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDESUsing Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDES
 
Linked Data Hypercubes
Linked Data HypercubesLinked Data Hypercubes
Linked Data Hypercubes
 
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
Spectral Clustering and Vantage Point Indexing for Efficient Data Retrieval
 
Recursive
RecursiveRecursive
Recursive
 
E132833
E132833E132833
E132833
 
GSK: How Knowledge Graphs Improve Clinical Reporting Workflows
GSK: How Knowledge Graphs Improve Clinical Reporting WorkflowsGSK: How Knowledge Graphs Improve Clinical Reporting Workflows
GSK: How Knowledge Graphs Improve Clinical Reporting Workflows
 
IRJET- Review of Existing Methods in K-Means Clustering Algorithm
IRJET- Review of Existing Methods in K-Means Clustering AlgorithmIRJET- Review of Existing Methods in K-Means Clustering Algorithm
IRJET- Review of Existing Methods in K-Means Clustering Algorithm
 
IRJET- Foster Hashtag from Image and Text
IRJET-  	  Foster Hashtag from Image and TextIRJET-  	  Foster Hashtag from Image and Text
IRJET- Foster Hashtag from Image and Text
 
G0354451
G0354451G0354451
G0354451
 
Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...
 
IRJET- Optimal Number of Cluster Identification using Robust K-Means for ...
IRJET-  	  Optimal Number of Cluster Identification using Robust K-Means for ...IRJET-  	  Optimal Number of Cluster Identification using Robust K-Means for ...
IRJET- Optimal Number of Cluster Identification using Robust K-Means for ...
 
An Effective Storage Management for University Library using Weighted K-Neare...
An Effective Storage Management for University Library using Weighted K-Neare...An Effective Storage Management for University Library using Weighted K-Neare...
An Effective Storage Management for University Library using Weighted K-Neare...
 
My thesis
My thesisMy thesis
My thesis
 
Heuristic based query optimisation for rsp(rdf stream processing) engines
Heuristic based query optimisation for rsp(rdf stream processing) enginesHeuristic based query optimisation for rsp(rdf stream processing) engines
Heuristic based query optimisation for rsp(rdf stream processing) engines
 
Review of Existing Methods in K-means Clustering Algorithm
Review of Existing Methods in K-means Clustering AlgorithmReview of Existing Methods in K-means Clustering Algorithm
Review of Existing Methods in K-means Clustering Algorithm
 
Optimizing Your Supply Chain with Neo4j
Optimizing Your Supply Chain with Neo4jOptimizing Your Supply Chain with Neo4j
Optimizing Your Supply Chain with Neo4j
 
Data Management.pptx
Data Management.pptxData Management.pptx
Data Management.pptx
 
Poster
PosterPoster
Poster
 

More from Hossein A. (Saeed) Rahmani

GeneralizibilityFairness - DEFirst Reading Group
GeneralizibilityFairness - DEFirst Reading GroupGeneralizibilityFairness - DEFirst Reading Group
GeneralizibilityFairness - DEFirst Reading GroupHossein A. (Saeed) Rahmani
 
Towards Confidence-aware Calibrated Recommendation (Poster)
Towards Confidence-aware Calibrated Recommendation (Poster)Towards Confidence-aware Calibrated Recommendation (Poster)
Towards Confidence-aware Calibrated Recommendation (Poster)Hossein A. (Saeed) Rahmani
 
Towards Confidence-aware Calibrated Recommendation (Slides)
Towards Confidence-aware Calibrated Recommendation (Slides)Towards Confidence-aware Calibrated Recommendation (Slides)
Towards Confidence-aware Calibrated Recommendation (Slides)Hossein A. (Saeed) Rahmani
 
Experiments on Generalizability of User-Oriented Fairness in Recommender Systems
Experiments on Generalizability of User-Oriented Fairness in Recommender SystemsExperiments on Generalizability of User-Oriented Fairness in Recommender Systems
Experiments on Generalizability of User-Oriented Fairness in Recommender SystemsHossein A. (Saeed) Rahmani
 
CPFair: Personalized Consumer and Producer Fairness Re-ranking for Recommende...
CPFair: Personalized Consumer and Producer Fairness Re-ranking for Recommende...CPFair: Personalized Consumer and Producer Fairness Re-ranking for Recommende...
CPFair: Personalized Consumer and Producer Fairness Re-ranking for Recommende...Hossein A. (Saeed) Rahmani
 
The Unfairness of Popularity Bias in Book Recommendation (Bias@ECIR22)
The Unfairness of Popularity Bias in Book Recommendation (Bias@ECIR22)The Unfairness of Popularity Bias in Book Recommendation (Bias@ECIR22)
The Unfairness of Popularity Bias in Book Recommendation (Bias@ECIR22)Hossein A. (Saeed) Rahmani
 
The Unfairness of Active Users and Popularity Bias in Point-of-Interest Recom...
The Unfairness of Active Users and Popularity Bias in Point-of-Interest Recom...The Unfairness of Active Users and Popularity Bias in Point-of-Interest Recom...
The Unfairness of Active Users and Popularity Bias in Point-of-Interest Recom...Hossein A. (Saeed) Rahmani
 

More from Hossein A. (Saeed) Rahmani (11)

Beyond-Accuracy Provider Fairness (Slides)
Beyond-Accuracy Provider Fairness (Slides)Beyond-Accuracy Provider Fairness (Slides)
Beyond-Accuracy Provider Fairness (Slides)
 
ContextsPOI (Slides)
ContextsPOI (Slides)ContextsPOI (Slides)
ContextsPOI (Slides)
 
ACQSurvey (Slides)
ACQSurvey (Slides)ACQSurvey (Slides)
ACQSurvey (Slides)
 
GeneralizibilityFairness - DEFirst Reading Group
GeneralizibilityFairness - DEFirst Reading GroupGeneralizibilityFairness - DEFirst Reading Group
GeneralizibilityFairness - DEFirst Reading Group
 
Towards Confidence-aware Calibrated Recommendation (Poster)
Towards Confidence-aware Calibrated Recommendation (Poster)Towards Confidence-aware Calibrated Recommendation (Poster)
Towards Confidence-aware Calibrated Recommendation (Poster)
 
Towards Confidence-aware Calibrated Recommendation (Slides)
Towards Confidence-aware Calibrated Recommendation (Slides)Towards Confidence-aware Calibrated Recommendation (Slides)
Towards Confidence-aware Calibrated Recommendation (Slides)
 
Experiments on Generalizability of User-Oriented Fairness in Recommender Systems
Experiments on Generalizability of User-Oriented Fairness in Recommender SystemsExperiments on Generalizability of User-Oriented Fairness in Recommender Systems
Experiments on Generalizability of User-Oriented Fairness in Recommender Systems
 
CPFair: Personalized Consumer and Producer Fairness Re-ranking for Recommende...
CPFair: Personalized Consumer and Producer Fairness Re-ranking for Recommende...CPFair: Personalized Consumer and Producer Fairness Re-ranking for Recommende...
CPFair: Personalized Consumer and Producer Fairness Re-ranking for Recommende...
 
The Unfairness of Popularity Bias in Book Recommendation (Bias@ECIR22)
The Unfairness of Popularity Bias in Book Recommendation (Bias@ECIR22)The Unfairness of Popularity Bias in Book Recommendation (Bias@ECIR22)
The Unfairness of Popularity Bias in Book Recommendation (Bias@ECIR22)
 
The Unfairness of Active Users and Popularity Bias in Point-of-Interest Recom...
The Unfairness of Active Users and Popularity Bias in Point-of-Interest Recom...The Unfairness of Active Users and Popularity Bias in Point-of-Interest Recom...
The Unfairness of Active Users and Popularity Bias in Point-of-Interest Recom...
 
Introduction to Complex Networks
Introduction to Complex NetworksIntroduction to Complex Networks
Introduction to Complex Networks
 

Recently uploaded

Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAbhinavSharma374939
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 

Recently uploaded (20)

Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog Converter
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 

ACQSurvey (Poster)

  • 1. A Survey on Asking Clarification Questions Datasets in Conversational Systems Hossein A. Rahmani* 1 Xi Wang* 1 Yue Feng 1 Qiang Zhang 2 Emine Yilmaz 1 Aldo Lipani 1 1University College London, UK 2Zhejiang University, China Overview Resource: We systematically search through 77 relevant papers, selected as per their recency, relia- bility and use frequency, in the Asking Clarification Questions (ACQ) domain from top-tier venues. Survey Focus: We compare the ACQ datasets from their contributions to the development of ACQ techniques and experimentally show the performance of representative techniques. Novel Data Selection Strategy: We introduce a visualised semantic encoding strategy to explain dataset suitability when selected for their corresponding experiments. Future Directions: We analytically outline promising open research directions in the construction of future datasets for ACQs, which sheds light on the development of future research. Statistical Overview of ACQ Datasets Table 1. A statistical summary of ACQ datasets for both Conv. Search and Conv. QA. The highlighted colours indicate the distinct corpus size of datasets. Dataset #Domains Scale #Clar. Q Conversational Search ClariT - 108K 260K Qulac 198 10K 3K ClariQ 300 2M 4K TavakoliCQ 3 170K 7K MIMICS - 462K 586K MANtIS 14 80K 435 ClariQ-FKw 230 2K 2K MSDialog 12 35K 877 MIMICS-Dou - 1K 1K Conversational Question Answering ClarQ 173 2M 2M RaoCQ 3 77K 770K AmazonCQ 2 24K 179K CLAQUA 110 40K 40K Research Tasks: Conversational Search (Conv. Search) and Conversational Question Answering (Conv. QA). Varied Scales of Data: Many large Conv. QA datasets (with background coloured in pink) due to a rather convenient preparation of QAs than dialogues. Most Conv. Search datasets are medium size with >1k clarification questions (in Cyan colour). Datasets Details Table 2. A Summary of collection details of ACQ datasets. ‘-’ means that the information is not available. ‘SE’ is StackExchange, ‘MC’ refers to Microsoft Community, and ‘KB’ is Knowledge Base. Dataset Published Built Resource Clar. Source Conversational Search ClariT 2023 Aug. 2018 General queries from task-oriented dialogues Crowdsourcing Qulac 2019 2009-2012 198 topics from TREC WEB Data Crowdsourcing ClariQ 2021 2009-2014 300 topics from TREC WEB Data Crowdsourcing TavakoliCQ 2021 Jul. 2009 to Sep. 2019 3 domains of SE Post and Comment MIMICS 2020 Sep. 2019 General queries from Bing users Machine Generated MANtIS 2019 Mar. 2019 14 domains of SE Post and Comment ClariQ-FKw 2021 2009-2014 TREC WEB Data Crowdsourcing MSDialog 2018 Nov. 2005 to Oct. 2017 4 domains of MC Crowdsourcing MIMICS-Duo 2022 Jan. 2022 to Feb. 2022 General queries from Bing users HIT on MTurk, Qualtrics Conversational Question Answering ClarQ 2020 - 173 domains of SE Post and Comment RaoCQ 2018 - 3 domains of SE Post and Comment AmazonCQ 2019 - A category of Amazon dataset Review and Comment CLAQUA 2019 - From an open-domain KB Crowdsourcing Recency of Data: Some datasets rely on the data collected years ago. For example, the Qulac, ClariQ and ClariQ-FKw datasets consistently use the TREC WEB data but run between 2009 and 2014. Resource Platform: TREC WEB data, StackExchange and Bing are the commonly considered re- source for preparing clarification questions. Data Collection Strategy: The crowdsourcing strategy is commonly applied to generate qualified clarification questions. Note that the posts and comments of StackExchange are also widely used to provide clarification questions. According to the provided information, we conclude that the datasets have been collected based on varied strategies, on different periods and use inconsistent resources. However, it is difficult to tell how exactly a dataset is different from others and how to properly select a set of datasets to show the performance of a newly introduced model. Hence, in this survey, we introduce a visualisation-based approach to assist in the selection of datasets for an improved experimental setup. Datasets Analysis −50 0 50 −100 −50 0 50 100 Dataset ClariT Qulac ClariQ MIMICS-Dou ClariQ-FKw MANtIS MSDialog MIMICS TavakoliCQ Comp-1 Comp-2 (a) tSNE on Conv. Search Datasets −40 −20 0 20 40 −60 −40 −20 0 20 40 60 Dataset AmazonCQ RaoCQ CLAQUA ClarQ Comp-1 Comp-2 (b) tSNE on Conv. QA Datasets Figure 1. tSNE Semantic Representation on ACQ Datasets Qulac and ClariQ datasets, and MIMICS and MIMICS-Dou datasets are highly overlapped. Conv. Search datasets form 5 distinct clusters that can be used for evaluation. CQs in Conv. Search are very focused while in Conv. QA datasets are more widely distributed. Tasks and Evaluation Methods on ACQ Datasets Table 3. Summary of tasks and evaluation method on ACQs datasets. The tasks can be generation and ranking, which are indicated by ‘G’ and ‘R’, respectively. Dataset Task Eval. Method T1 T2 T3 Conv. Search ClariT feng2023towards X G - Offline Qulac AliannejadiSigir19 - R - Offline ClariQ aliannejadi2021building X R - Offline TavakoliCQ tavakoli2021analyzing - G - Offline MIMICS zamani2020mimics X R, G X Offline/Online MANtIS penha2019introducing - R, G - Offline ClariQ-FKw sekulic2021towards - G - Offline MSDialog qu2018analyzing - R, G - Offline MIMICS-Duo tavakoli2022mimics X R, G X Offline/Online Conv. QA ClarQ kumar2020clarq - R - Offline RaoCQ rao2018learning - R - Offline AmazonCQ rao2019answer - G - Offline CLAQUA xu2019asking X G - Offline Evaluation Metrics Automatic Evaluation Ranking: MAP, Precision, Recall, F1-score, nDCG, MRR, MSE Genration: BLUE, METEOR, ROUGE Human Evaluation: Relevance, Usefulness, Naturalness, Clarification Model Performance on ACQ Datasets Table 4. Question relevance ranking performance evaluation on representative approaches. ‘P’ and ‘R’ refers to Precision and Recall. ↑ or ↓ is added to Doc2Query + BM25 to indicate a consistent performance change to BM25 on all evaluation metrics. Model MAP P@10 R@10 NDCG Qulac BM25 0.6306 0.9196 0.1864 0.9043 Doc2Query + BM25 0.6289 0.9196 0.1860 0.9069 ClariQ BM25 0.6360 0.7500 0.5742 0.7211 Doc2Query + BM25 ↑ 0.6705 0.7899 0.6006 0.7501 ClarQ BM25 0.2011 0.0259 0.2596 0.2200 Doc2Query + BM25 ↓ 0.1977 0.0263 0.2630 0.2168 CLAQUA BM25 0.9600 0.0992 0.9920 0.9683 Doc2Query + BM25 ↓ 0.9395 0.0990 0.9901 0.9523 Limitations of Existing Studies: Many research studies have inconsistent uses of datasets as well as incomparable results with distinct experimental setups. Another common issue is the use of different baselines to show the leading performance of newly introduced techniques. Many techniques were introduced while tested on a single dataset to show their top performance Discussion and Future Challenges Findings Missing Standard Benchmark Few User-System Interactions Recorded for Evaluation Inconsistent Dataset Collection and Formatting Inconsistent Model Evaluation Future Research Directions Standard Benchmark Development ACQ Evaluation Framework Large-Scale Human-to-Machine Dataset Multi-Modal ACQs Dataset Acknowledgements This research is supported by the Engineering and Physical Sciences Research Council [EP/S021566/1] and the EPSRC Fellowship titled “Task Based Information Retrieval” [EP/P024289/1]. Paper and Codes The 61st Annual Meeting of the Association for Computational Linguistics Toronto, Canada hossein.rahmani.22@ucl.ac.uk