
Asking Clarifying Questions in Open-Domain Information-Seeking Conversations

Users often fail to formulate their complex information needs in a single query. As a consequence, they may need to scan multiple result pages or reformulate their queries, which may be a frustrating experience.
Alternatively, systems can improve user satisfaction by proactively asking questions of the users to clarify their information needs. Asking clarifying questions is especially important in conversational systems since they can only return a limited number of (often only one) result(s).

In this paper, we formulate the task of asking clarifying questions in open-domain information-seeking conversational systems. To this end, we propose an offline evaluation methodology for the task and collect a dataset, called Qulac, through crowdsourcing. Our dataset is built on top of the TREC Web Track 2009-2012 data and consists of over 10K question-answer pairs for 198 TREC topics with 762 facets.
Our experiments on an oracle model demonstrate that asking only one good question leads to over 170% retrieval performance improvement in terms of P@1, which clearly demonstrates the potential impact of the task. We further propose a retrieval framework consisting of three components: question retrieval, question selection, and document retrieval. In particular, our question selection model takes into account the original query and previous question-answer interactions while selecting the next question. Our model significantly outperforms competitive baselines. To foster research in this area, we have made Qulac publicly available.
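
As a rough illustration of how the three components described above fit together, the sketch below wires question retrieval, question selection, and document retrieval into a single clarification turn. The concrete stage models are passed in as callables, and every name here is a hypothetical placeholder rather than the authors' implementation.

```python
from typing import Callable, List, Tuple

# Hypothetical sketch of the three-stage framework from the abstract:
# question retrieval -> question selection -> document retrieval.
# The concrete stage models are passed in as callables, so nothing here
# should be read as the authors' actual implementation.

QA = Tuple[str, str]  # one (clarifying question, user answer) turn


def clarification_turn(
    query: str,
    context: List[QA],
    retrieve_questions: Callable[[str, List[QA]], List[str]],
    select_question: Callable[[str, List[QA], List[str]], str],
    answer_question: Callable[[str], str],
    retrieve_documents: Callable[[str, List[QA]], List[str]],
) -> Tuple[List[str], List[QA]]:
    """Run one clarification turn and return (ranked documents, updated context)."""
    # 1. Recall-oriented retrieval of candidate clarifying questions.
    candidates = retrieve_questions(query, context)
    # 2. Precision-oriented selection of the single question to ask next,
    #    conditioned on the original query and previous question-answer turns.
    question = select_question(query, context, candidates)
    # 3. Obtain the answer (from the user, or from Qulac in offline evaluation).
    answer = answer_question(question)
    context = context + [(question, answer)]
    # 4. Retrieve documents for the query expanded with the conversation so far.
    return retrieve_documents(query, context), context
```
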

  1. Asking Clarifying Questions in Open-Domain Information-Seeking Conversations
     Mohammad Aliannejadi (1), Hamed Zamani (2), Fabio Crestani (1), and W. Bruce Croft (2)
     (1) Università della Svizzera italiana (USI), Switzerland
     (2) University of Massachusetts Amherst, USA
  2. © @dawnieando; @JeffD
  7. Can we ask questions to clarify the user's information need?
     • Johannes Kiesel et al. Toward Voice Query Clarification. SIGIR 2018.
     • Radlinski and Craswell. A Theoretical Framework for Conversational Search. CHIIR 2017.
  8. How to evaluate?
  9. ClueWeb Collection
     • A part of the Lemur Project.
     • A common web crawl (English) with 50M documents.
     • Used in the TREC Web Track 2009–2012.
     • Ad-hoc retrieval and diversification tasks.
  10. TREC facets
  11. An offline evaluation methodology
     • We assume that each user is interested in one facet per topic.
  12. An offline evaluation methodology
     • Let T be the set of topics (queries).
     • A collection of facet sets F, where F_t includes all defined facets for topic t ∈ T.
     • A collection of clarifying question sets Q, with Q_t including all clarifying questions relevant to topic t.
     • An offline evaluation requires defining T, F, and Q.
     • T and F are borrowed from the ClueWeb collection (TREC Web Track).
  13. Question Verification and Facet Linking
     • Two main concerns:
       • Precision: what is the quality of the collected clarifying questions?
       • Recall: is every facet addressed by at least one clarifying question?
     • Two expert annotators:
       • Marked invalid and duplicate questions.
       • Linked questions to the facets they found relevant.
       • For facets with no questions, generated new questions relevant to them.
  15. Quality Check
     • Regular quality checks on the collected answers.
     • Manual checks on 10% of submissions per worker.
     • If any invalid answer was observed, we then checked all the submissions of the corresponding worker.
     • Invalid answers were removed and workers banned from future tasks.
     • Disabled the copy/paste feature.
     • Monitored keystrokes.
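
Putting slides 11–15 together, the sketch below shows one way an offline evaluation could loop over topics T, facets F, clarifying questions Q, and the crowdsourced answers. The dictionary layout, field names, and toy data are illustrative assumptions, not Qulac's actual file format.

```python
# Hypothetical layout for an offline evaluation over topics T, facets F,
# clarifying questions Q, and crowdsourced answers A.  The dictionary
# structure, field names, and toy data are illustrative assumptions,
# not the actual Qulac file format.

T = {1: "example topic"}                                      # topic id -> query text
F = {1: ["facet one description", "facet two description"]}   # topic id -> facets
Q = {1: ["are you looking for X?", "do you want Y?"]}          # topic id -> questions
A = {(1, 0, "are you looking for X?"): "yes, exactly"}         # (topic, facet, question) -> answer


def offline_evaluate(system, metric):
    """Average a retrieval metric over every (topic, facet) pair."""
    scores = []
    for tid, query in T.items():
        for facet_idx, _facet in enumerate(F[tid]):
            # The system picks a clarifying question; the simulated user is
            # the crowdsourced answer collected for this facet (defaulting to
            # "no" here is a simplification for the sketch).
            question = system.select(query, Q[tid])
            answer = A.get((tid, facet_idx, question), "no")
            # Retrieve with the expanded conversation and score the ranking
            # against the facet-level relevance judgments.
            ranking = system.retrieve(query, question, answer)
            scores.append(metric(ranking, tid, facet_idx))
    return sum(scores) / len(scores)
```
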
  16. Qulac: Questions for Lack of Clarity
     Qulac has two meanings in Persian:
     • blizzard
     • wonderful or masterpiece
     © HBO
  17. Learning to ask clarifying questions
  18. Question Retrieval
  19. Question Retrieval
     • Task: given a topic and a context (question-answer history), retrieve clarifying questions.
     • Desired objective: high recall.
     • Approaches:
       • Term-matching retrieval models: language models, BM25, RM3 (query expansion).
       • Learning to rank: LambdaMART, RankNet, neural ranking models (e.g., BERT).
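
As a concrete example of the term-matching option on slide 19, the sketch below indexes the bank of clarifying questions with BM25 and retrieves candidates for a topic plus its conversation history. The third-party rank_bm25 package, the whitespace tokenization, and the toy data are assumptions; the paper's term-matching baselines may be configured quite differently.

```python
# Recall-oriented BM25 question retrieval over a bank of clarifying questions.
# Uses the third-party rank_bm25 package and naive whitespace tokenization as
# simplifying assumptions; the paper's term-matching baselines (LM, BM25, RM3)
# may be configured differently.
from rank_bm25 import BM25Okapi


def build_question_retriever(question_bank):
    """Index the question bank once; return a per-topic retrieval function."""
    tokenized = [q.lower().split() for q in question_bank]
    bm25 = BM25Okapi(tokenized)

    def retrieve(topic, context, k=30):
        # Append previous question-answer turns so the candidates reflect
        # what has already been asked and answered in the conversation.
        history = " ".join(q + " " + a for q, a in context)
        query_tokens = (topic + " " + history).lower().split()
        return bm25.get_top_n(query_tokens, question_bank, n=k)

    return retrieve


# Toy usage:
questions = [
    "are you looking for images?",
    "do you want recent news about this topic?",
    "are you interested in a specific location?",
]
retrieve = build_question_retriever(questions)
print(retrieve("dinosaurs", context=[], k=2))
```
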
  22. Question Selection
     • Task: selecting a clarifying question that leads to retrieval improvement.
     • Objective: high precision (in retrieval).
     • Approaches:
       • Query performance prediction (QPP): predicting the retrieval performance after asking each question (without its answer) and selecting the one with the highest QPP score.
       • Learning to rank: defining a set of features for ranking questions, including QPP, similarity to the topic, similarity to the context, etc.
       • Neural ranking models: learning to rank with representation learning (e.g., BERT).
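
The QPP route on slide 22 can be sketched as follows: expand the query with each candidate question, run retrieval, estimate how well that run would perform, and ask the question with the highest estimate. The predictor used here (standard deviation of the top-k retrieval scores, in the spirit of post-retrieval predictors such as NQC) and the `search` callable are illustrative assumptions, not necessarily the paper's choices.

```python
# Question selection via query performance prediction (QPP): expand the query
# with each candidate question, run retrieval, estimate how well that run
# would perform, and ask the question with the highest estimate.  The
# predictor below and the `search` callable are illustrative assumptions.
import statistics
from typing import Callable, List, Tuple


def select_question_qpp(
    query: str,
    candidates: List[str],
    search: Callable[[str], List[Tuple[str, float]]],  # query -> [(doc id, score), ...]
    k: int = 20,
) -> str:
    """Return the candidate question whose expanded query looks most effective."""

    def predicted_quality(question: str) -> float:
        scores = [score for _doc, score in search(query + " " + question)[:k]]
        # A more peaked score distribution is taken as a signal of a clearer,
        # better-performing query; no answer is needed at selection time.
        return statistics.pstdev(scores) if len(scores) > 1 else 0.0

    return max(candidates, key=predicted_quality)
```
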
  23. Question Selection
     Asking only one good question improves the performance by over 100%.
  24. Case Study
     Negative answer; new information. Retrieval model fails.
  25. Case Study
     Open question; new information.
  26. Future Directions
     • Utilizing positive and negative feedback for document retrieval.
     • Joint modeling of question retrieval and selection.
     • Question generation.
     • Determining the number of questions to ask based on the system's confidence.
     • Exploring other ways of evaluating a system:
       • conversation turns;
       • retrieval performance.
  27. Conclusions
     • Asking clarifying questions in open-domain information-seeking conversations.
     • Qulac: a collection for automatic offline evaluation of asking clarifying questions for conversational IR.
     • A simple yet effective retrieval framework.
     • Asking only one good question improves the performance by over 100%!
     • More improvement for:
       • shorter queries;
       • ambiguous queries.
  28. Questions?
     Qulac is publicly available at http://bit.ly/QulacData
     Thanks to SIGIR for the student travel grant!
