Every question carries an implicit answer type. So even if we don't know the actual answer, we can anticipate what kind of thing it will be. An accurate expectation makes it easier to pick out the answer from a sentence that contains the query words. <example>
The answer to a question like "What are the tourist attractions in Reims?" could be many different things: a church, a historic building, a park, a statue, a famous intersection, etc.
The paper proposes an unsupervised method that dynamically constructs a probabilistic answer type model for each question. Such a model evaluates whether or not a candidate word fits the question context. <example> Given the question context, we can find words that appear in that context in a corpus.
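A minimal sketch of the word-fits-context idea, assuming a simple context-filler count table (the table contents and the function name here are hypothetical, not from the paper):

```python
from collections import defaultdict

# Hypothetical context-filler counts, built offline from a parsed corpus:
# counts[("X is a city", "reims")] = 12 means "reims" filled the
# context "X is a city" 12 times.
counts = defaultdict(int)
counts[("X is a city", "reims")] = 12
counts[("X is a city", "paris")] = 85

def p_word_given_context(word, context):
    """Estimate P(word | context) by relative frequency in the counts."""
    total = sum(c for (ctx, _), c in counts.items() if ctx == context)
    return counts[(context, word)] / total if total else 0.0

print(p_word_given_context("paris", "X is a city"))  # ~0.876
```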
The authors parsed the AQUAINT corpus (3 GB) with Minipar and collected frequency counts of words appearing in various contexts. Parsing and database construction are done offline, since the database is identical for all questions. They extracted 527,768 contexts that appeared at least 25 times in the corpus. <example> For "Which city hosted the Winter Olympics?", the question clearly states that the desired answer type is city, so the context is "X is a city".
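A sketch of the offline database construction, assuming a `parsed_pairs` iterable of (context, filler) tuples extracted from the dependency parses (that input format and the function name are assumptions for illustration):

```python
from collections import Counter

def build_database(parsed_pairs, min_context_count=25):
    """Count (context, filler) pairs and keep only frequent contexts.

    The paper keeps 527,768 contexts that occur at least 25 times;
    this sketch applies the same threshold to hypothetical input.
    """
    context_totals = Counter()
    pair_counts = Counter()
    for context, filler in parsed_pairs:
        context_totals[context] += 1
        pair_counts[(context, filler)] += 1
    kept = {ctx for ctx, n in context_totals.items() if n >= min_context_count}
    return {pair: n for pair, n in pair_counts.items() if pair[0] in kept}
```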
The first model assigns the same likelihood to every instance of a candidate word. Since a word can be polysemous (as in the "Washington" example, which can name a person, a city, or a state), a second model introduces candidate contexts. The parameters of the model are then estimated from the context-filler database using an appropriate probability distribution.
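A rough sketch of the candidate-context idea, not the paper's exact estimator: mix over the candidate word's own contexts so that each sense of a polysemous word contributes according to how often the word is actually used that way (the function and argument names are hypothetical):

```python
def score_candidate(word, question_contexts, counts, word_totals):
    """Score a candidate against the question contexts.

    Rather than treating every instance of `word` alike, weight each
    question context by the fraction of the word's occurrences that
    fill it, so a polysemous word like "Washington" is scored under
    the sense the question asks about.
    """
    total = word_totals.get(word, 0)
    if total == 0:
        return 0.0
    score = 0.0
    for ctx in question_contexts:
        # P(context | word): fraction of the word's occurrences
        # that fill this question context.
        score += counts.get((ctx, word), 0) / total
    return score / len(question_contexts)
```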
The model is used to filter the contents of documents retrieved by the IR portion of the question-answering system. Each answer candidate is scored and the list is sorted in descending order of score. The system is then treated as a filter, and we observe how many candidates must pass through it before at least one correct answer is accepted. A model that lets a low percentage of candidates pass while still accepting at least one correct answer is preferable to one that passes many candidates. The approach is compared against two baselines: an Oracle system that uses manual question classification and manual entity tagging, and ANNIE, which performs automatic tagging.
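A small sketch of this evaluation metric as described above, with made-up candidates (the function name and example data are illustrative, not from the paper):

```python
def percent_passed_before_correct(scored_candidates, gold_answers):
    """Sort candidates by score (descending) and report what fraction
    of the list must pass the filter before the first correct answer
    is accepted. Lower is better."""
    ranked = sorted(scored_candidates, key=lambda cs: cs[1], reverse=True)
    for rank, (candidate, _) in enumerate(ranked, start=1):
        if candidate in gold_answers:
            return rank / len(ranked)
    return 1.0  # no correct answer found; the entire list passed

# Example: the correct answer surfaces after 2 of 4 candidates pass.
cands = [("paris", 0.9), ("reims", 0.7), ("statue", 0.2), ("park", 0.1)]
print(percent_passed_before_correct(cands, {"reims"}))  # 0.5
```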
Users submit a query that reflects their underlying information need. They then consecutively reformulate their queries within a search session until the original need is fulfilled.
The idea is to integrate the learned search intents of a query into the prior preference vector of the personalized random walk, and to apply the random walk under each search intent respectively. Lambda is the teleportation probability; rho is the weight balancing the original query against its intents. Rho is set below 1 to smooth the preference vector with the learned intents, which provide rich information about the original query.
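A sketch of one plausible formulation of that walk, under stated assumptions: the parameter names follow the notes (lambda, rho), but the exact update rule, the function name, and the vector encodings are assumptions, not the paper's definitive equations.

```python
import numpy as np

def intent_random_walk(M, q_vec, intent_vec, lam=0.15, rho=0.7, iters=100):
    """Personalized random walk with an intent-smoothed preference vector.

    M is a column-stochastic transition matrix over the graph nodes,
    q_vec encodes the original query, intent_vec a learned intent
    distribution. rho < 1 smooths the query's own preference with the
    intent; lam is the teleportation probability back to the preference.
    """
    p = rho * q_vec + (1.0 - rho) * intent_vec   # blended prior preference
    r = np.full_like(p, 1.0 / len(p))            # uniform starting scores
    for _ in range(iters):
        r = lam * p + (1.0 - lam) * M @ r        # power-iteration update
    return r
```

To match "apply the random walk under different search intents respectively", one would run this once per learned intent vector and then compare or combine the resulting rankings.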