Efficient Instant-Fuzzy Search
with Proximity Ranking
The system finds answers to a query instantly while the user types in keywords character by character.
Fuzzy search improves the user's search experience by finding relevant answers with keywords similar to the query keywords.
A main computational challenge in this paradigm is the high-speed requirement.
At the same time, we also need good ranking functions that consider the proximity of keywords to compute relevance scores.


  1. Efficient Instant-Fuzzy Search with Proximity Ranking
  2. Abstract
     • The system finds answers to a query instantly while the user types in keywords character by character.
     • Fuzzy search improves the user's search experience by finding relevant answers with keywords similar to the query keywords.
     • A main computational challenge in this paradigm is the high-speed requirement.
     • At the same time, we also need good ranking functions that consider the proximity of keywords to compute relevance scores.
  3. Problem Statement & Proposed Solution
     • Problem Statement:
       o Achieving efficient time and space complexity.
     • Solution:
       o Index phrases with a suitable indexing scheme, and
       o develop an incremental-computation algorithm for efficiently segmenting a query into phrases and computing relevant answers.
     • Result Metrics: Experimental study on real data sets to show the tradeoffs between time, space, and quality of these solutions.
  4. Literature Survey
     • K. Grabski and T. Scheffer, “Sentence completion,” in SIGIR, 2004, pp. 433–439.
     • A. Nandi and H. V. Jagadish, “Effective phrase prediction,” in VLDB, 2007, pp. 219–230.
       o They proposed systems for predicting queries. Many systems do prediction by treating a query with multiple keywords as a single prefix string.
       o Therefore, if a related suggestion contains the query keywords but not consecutively, that suggestion cannot be found.
  5. Literature Survey Cont.
     • H. Bast and I. Weber proposed many indexing and query techniques to support instant search.
     • M. Hadjieleftheriou and C. Li; K. Chakrabarti, S. Chaudhuri, V. Ganti, and D. Xin; S. Chaudhuri, V. Ganti, and R. Motwani: Suggested an approach in which substrings of the data are used for fuzzy string matching.
     • S. Ji, G. Li, C. Li, and J. Feng: Their trie-based approach is especially suitable for instant and fuzzy search, since each query is a prefix and a trie can support incremental computation efficiently.
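
To make the trie-based idea above concrete, here is a minimal Python sketch of fuzzy prefix matching on a trie. It is not the incremental algorithm of Ji et al.; it swaps in the simpler, well-known technique of carrying one edit-distance row per trie node and recomputes everything for each query. The names `TrieNode`, `insert`, `fuzzy_prefix_search`, and the threshold `tau` are illustrative assumptions, not taken from the slides.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True if a keyword ends at this node


def insert(root, word):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_word = True


def words_below(node, prefix):
    """Collect every keyword stored at or under this node."""
    out = [prefix] if node.is_word else []
    for ch, child in node.children.items():
        out.extend(words_below(child, prefix + ch))
    return out


def fuzzy_prefix_search(root, query, tau):
    """Return keywords having some prefix within edit distance tau of query."""
    results = set()

    def visit(node, prefix, row):
        # row[j] = edit distance between this node's string and query[:j]
        if row[-1] <= tau:
            # This node's string matches the whole query closely enough,
            # so every keyword underneath it is an answer.
            results.update(words_below(node, prefix))
            return
        if min(row) > tau:
            return  # no extension of this string can get back under tau
        for ch, child in node.children.items():
            new_row = [row[0] + 1]
            for j in range(1, len(query) + 1):
                cost = 0 if query[j - 1] == ch else 1
                new_row.append(min(new_row[j - 1] + 1,   # insertion
                                   row[j] + 1,           # deletion
                                   row[j - 1] + cost))   # match/substitution
            visit(child, prefix + ch, new_row)

    visit(root, "", list(range(len(query) + 1)))
    return sorted(results)


root = TrieNode()
for w in ["search", "seattle", "fuzzy", "proximity"]:
    insert(root, w)
print(fuzzy_prefix_search(root, "serc", 1))
```

An incremental version would keep the set of matching nodes from the previous keystroke and extend it, rather than restarting from the root as this sketch does.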
  6. Literature Survey Cont.
     • R. Fagin, A. Lotem, and M. Naor; F. Zhang, S. Shi, H. Yan, and J.-R. Wen: Studied extensively how to support top-k queries efficiently.
     • G. Li, J. Wang, C. Li, and J. Feng, “Supporting efficient top-k queries in type-ahead search,” in SIGIR, 2012, pp. 355–364.
       o Adopted existing top-k algorithms to do instant-fuzzy search.
       o Most of these studies reorganize an inverted index to evaluate more relevant documents first.
     • Persin et al. proposed using inverted lists sorted by decreasing document frequency.
     • Zhang et al. studied the effect of term-independent features in index reorganization.
  7. General Idea of Instant Search
  8. Architecture
  9. Example Table for Architecture Explanation
     This data is structured in an indexed format. Two types of indices are used to structure it:
     1. Trie indices
     2. Forward indices
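
The two index types can be illustrated with a small Python sketch. The slides do not show the actual layout, so everything here is an assumption: the keyword-to-records mapping that would sit at the trie leaves is approximated by a plain dictionary, the forward index stores keyword positions per record, and `build_indices` and the sample records are hypothetical.

```python
from collections import defaultdict


def build_indices(records):
    """records: dict of record id -> text (hypothetical sample data)."""
    inverted = defaultdict(list)  # keyword -> record ids (stand-in for trie leaves)
    forward = {}                  # record id -> {keyword: [positions in record]}
    for rid, text in sorted(records.items()):
        positions = defaultdict(list)
        for pos, word in enumerate(text.lower().split()):
            positions[word].append(pos)
        forward[rid] = dict(positions)
        for word in positions:
            inverted[word].append(rid)
    return dict(inverted), forward


records = {1: "instant fuzzy search", 2: "fuzzy string matching"}
inverted, forward = build_indices(records)
print(inverted["fuzzy"])  # record ids that contain the keyword
print(forward[2])         # keywords of record 2 with their positions
```

Keeping positions in the forward index is one way to let a ranking function check how close two query keywords are inside a record.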
  10. Basic Blocks In Architecture: Indices
      1. Trie
      2. Forward
  11. Basic Blocks In Architecture
      • Phrase Validator:
        o When a search server receives a request, it first identifies all the valid phrases in the query, i.e. those that are in the dictionary D, and intersects their inverted lists.
        o The Phrase Validator computes and returns the active nodes for all these terms.
        o If a query keyword appears in multiple valid phrases, the query can be segmented into phrases in different ways.
      • Query Plan Builder:
        o After the valid phrases are identified, the Query Plan Builder generates a Query Plan Q, which contains all the possible valid segmentations in a specific order.
        o The ranking of Q determines the order in which the segmentations will be executed.
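
The segmentation step described above can be sketched in Python as follows. Two points are assumptions rather than statements from the slides: a single keyword is always treated as a valid phrase, and segmentations with fewer (hence longer) phrases are ranked first. The function names and the sample dictionary are hypothetical.

```python
def segmentations(keywords, dictionary):
    """Enumerate every way to split keywords into valid phrases."""
    if not keywords:
        return [[]]
    plans = []
    for end in range(1, len(keywords) + 1):
        phrase = " ".join(keywords[:end])
        # Assumption: a single keyword is always valid; longer phrases
        # must appear in the dictionary D.
        if end == 1 or phrase in dictionary:
            for rest in segmentations(keywords[end:], dictionary):
                plans.append([phrase] + rest)
    return plans


def build_query_plan(query, dictionary):
    # Assumption: fewer phrases means longer phrases, which are tried first.
    return sorted(segmentations(query.split(), dictionary), key=len)


D = {"instant search", "search engine"}
for segmentation in build_query_plan("instant search engine", D):
    print(segmentation)
```

Here the keyword "search" appears in two valid phrases, so the query has more than one segmentation and the plan fixes the order in which they would be executed.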
  12. Basic Blocks In Architecture Cont.
      • Index Searcher: After Q is generated, the segmentations are passed into the Index Searcher one by one until the top-k answers are computed or all the segmentations in the plan are used.
      • Cache Module:
        o The Phrase Validator uses the Cache Module to validate a phrase without traversing the trie from scratch,
        o while the Index Searcher benefits from the Cache by being able to retrieve the answers to an earlier query, which reduces the computational cost.
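
A rough Python sketch of the Index Searcher loop with early termination and a result cache is given below. It is a simplification under stated assumptions: the cache is a plain dictionary keyed by the full query string (the slides describe reusing answers of earlier queries but give no details), and `RECORDS`, `lookup`, and `search_top_k` are hypothetical stand-ins, with `lookup` using naive substring tests instead of real index lookups.

```python
RECORDS = {  # hypothetical sample data
    1: "instant search engine",
    2: "search engine for instant answers",
    3: "fuzzy search",
}


def lookup(segmentation):
    """Toy stand-in for an index lookup: records containing every phrase."""
    return [rid for rid, text in sorted(RECORDS.items())
            if all(phrase in text for phrase in segmentation)]


def search_top_k(query, plan, k, cache):
    if query in cache:            # answers to an earlier query are reused
        return cache[query]
    answers, seen = [], set()
    for segmentation in plan:     # segmentations are tried in plan order
        for rid in lookup(segmentation):
            if rid not in seen:
                seen.add(rid)
                answers.append(rid)
            if len(answers) == k:
                break
        if len(answers) == k:     # stop once the top-k answers are found
            break
    cache[query] = answers
    return answers


cache = {}
plan = [["instant", "search engine"], ["instant", "search", "engine"]]
print(search_top_k("instant search engine", plan, 1, cache))
```

With k = 1 the loop stops inside the first segmentation, so the second one is never executed.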
  13. Phrase Validator
  14. Phrase Validator With Cache Module
  15. Own Contributions
      • Implementing a demand paging algorithm with an efficient page replacement strategy will be advantageous for this application.
      • Previous searches could become part of the history for later searches, so we will put the search log/history in a page table and retrieve pages efficiently according to the query keywords.
      • Different page replacement strategies could be proposed to give much faster recommendations.
      • An architecture like a Translation Lookaside Buffer (TLB) could be employed to fetch pages quickly.
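
As one possible reading of the page replacement idea above, the following Python sketch keeps results of previous searches in a fixed number of "pages" and evicts the least recently used one. LRU is only an assumed example; the slides do not name a specific strategy. The class name `LRUResultCache`, its `load` callback, and the fault counter are hypothetical.

```python
from collections import OrderedDict


class LRUResultCache:
    """Keeps results of recent queries; evicts the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # query -> cached result, oldest first
        self.faults = 0             # number of misses ("page faults")

    def get(self, query, load):
        if query in self.pages:
            self.pages.move_to_end(query)   # mark as most recently used
            return self.pages[query]
        self.faults += 1
        result = load(query)                # demand-load on a miss
        self.pages[query] = result
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict least recently used
        return result


cache = LRUResultCache(capacity=2)
for q in ["fuzzy", "search", "fuzzy", "trie"]:
    cache.get(q, load=lambda query: f"results for {query}")
print(list(cache.pages), cache.faults)
```

Other strategies (FIFO, least frequently used) would only change which entry `get` evicts, so they could be compared by counting faults on the same search log.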
  16. Proposed Architecture
  17. Conclusion
      • Previous systems were able to recommend results based only on the previously typed characters kept in the cache module.
      • In most cases, the previous search log could be used to make the recommendation system faster.
      • Relevance to the user's query, along with the user's intentions, could be mined easily.
