Capturing the semantics of key phrases using multiple languages for question retrieval
Abstract:
In the age of Web 2.0, user-contributed community questions and answers
provide an important alternative for knowledge acquisition through web search.
Question retrieval in current community-based question answering (CQA) services
does not, in general, work well for long and complex queries, such as queries posed as full questions.
The main reasons are the verboseness in natural language queries and the word
mismatch between the queries and the candidate questions in the CQA archive
during retrieval. To address these two problems, existing solutions try to refine
the search queries by distinguishing the key concepts in the queries and
expanding the queries with relevant content. However, the existing query
refinement approaches can only separate key concepts from non-key concepts, while the
differences among the key concepts are overlooked. Moreover, the existing
query expansion approaches not only overlook the weights of key concepts in the
queries but also fail to consider concept-level expansion for them. In this paper,
we explore a key concept identification approach for query refinement and a
pivot-language translation-based approach to key concept paraphrasing.
We further propose a new question retrieval model which can seamlessly
integrate the key concepts and their paraphrases. The experimental results
demonstrate that the integrated retrieval model significantly outperforms the
state-of-the-art models in question retrieval.
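
As an illustration of the pivot language idea, the following minimal sketch scores paraphrases of a key phrase by translating it into a pivot language and back, summing the product of the two translation probabilities over all pivot phrases. This is the commonly used pivot formulation and is assumed here for illustration; it is not necessarily the exact model used in this paper. The function name, the toy phrase tables, and all probabilities are hypothetical.

```python
from collections import defaultdict


def pivot_paraphrases(phrase, src_to_pivot, pivot_to_src):
    """Score paraphrases of `phrase` by marginalizing over pivot translations.

    Assumed formulation: score(e2 | e1) = sum_f p(f | e1) * p(e2 | f),
    where f ranges over pivot-language translations of e1.
    """
    scores = defaultdict(float)
    for pivot, p_forward in src_to_pivot.get(phrase, {}).items():
        for candidate, p_back in pivot_to_src.get(pivot, {}).items():
            if candidate != phrase:  # skip the trivial self-paraphrase
                scores[candidate] += p_forward * p_back
    return dict(scores)


# Hypothetical toy phrase tables (English -> French and French -> English).
SRC_TO_PIVOT = {
    "lose weight": {"perdre du poids": 0.7, "maigrir": 0.3},
}
PIVOT_TO_SRC = {
    "perdre du poids": {"lose weight": 0.6, "shed pounds": 0.4},
    "maigrir": {"lose weight": 0.5, "slim down": 0.5},
}

paraphrases = pivot_paraphrases("lose weight", SRC_TO_PIVOT, PIVOT_TO_SRC)
for candidate, score in sorted(paraphrases.items(), key=lambda kv: -kv[1]):
    print(f"{candidate}: {score:.2f}")
```

In a retrieval setting, paraphrases scored this way could be added to the query as weighted alternatives to the original key phrase, which is one way to reduce word mismatch between a query and the archived questions.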