seminar topic

  1. 1. Personalization in Information Retrieval, Extraction and Access. Workshop on Ontology, NLP, Personalization and IE/IR, IIT Bombay, Mumbai, 15-17 July 2008. Vasudeva Varma, www.iiit.ac.in/~vasu
  2. 2. Search Engine Heat is On! Applications of search technologies: web search, product search, service search, domain search. Already a BIG market; HUGE opportunity. IR and IE Technologies and Personalization (C) Vasudeva Varma IIIT-H 5/30/2008
  3. 3. Agenda: Evolution of Search Engines; Information Retrieval vs. Extraction vs. Access; Personalization in IR, IE and IA; Applications in Personalized IA; Conclusions.
  4. 4. Evolution of Search Engines: crawling and indexing; topic directories; clustering and classification; hyperlink analysis; resource discovery and vertical portals; Semantic Web; ???
  5. 5. Current IR engines fail – why? Wide variation in retrieval results: user topic; retrieval system. Different approaches work for different systems. No way to determine which approach will work for a particular query. Solution: deeper analysis of the content and the query.
  6. 6. Motivation for Deeper Analysis: Texts are one of the major sources of information and knowledge. However, they are not transparent. They have to be systematically integrated with other sources such as databases, numerical data, etc. NLP/IR/IE for better analysis; IA for better presentation.
  7. 7. Agenda: Evolution of Search Engines; Information Retrieval vs. Extraction vs. Access; Personalization in IR, IE and IA; Applications in Personalized IA; Conclusions.
  8. 8. IR vs. IE vs. IA. IR: to search and retrieve documents in response to queries for information. IE: to extract information that fits pre-defined database schemas or templates, specifying the output formats. IA: to make the required information accessible to the user in their choice of language, mode, level of detail and format.
  9. 9. [Diagram labels: IR System; Queries; Collection of Texts; Characterization of Texts]
  10. 10. [Diagram labels, as on slide 9, plus: Knowledge; Interpretation]
  11. 11. [Diagram labels, as on slide 10, plus: Passage]
  12. 12. [Diagram labels, as on slide 11, plus: IE System; NLP; Structures of Sentences; Texts; Templates]
  13. 13. Information Access Technologies [Diagram labels: Knowledge; Interpretation; Passage; IE System; IR System; Machine Translation; Summarization; Snippet Generation; NL Generation; Visualization Tools]
  14. 14. Agenda: Evolution of Search Engines; Information Retrieval vs. Extraction vs. Access; Personalization in IR, IE and IA; Applications in Personalized IA; Conclusions.
  15. 15. Limitations of Current IR Systems: All users get the same results for a given query, independent of previous search history and current search context. All users are treated the same. Does one size fit all?
  16. 16. Personalized Web Search: Automatic adjustment of information content, structure, and presentation tailored to an individual user. Characteristics: age, gender, special interest groups, topic. Personalize search results using personal content and past activities (long term and short term). Variations: explicit or implicit profile setup; explicit or implicit relevance feedback; client-side or server-side storage of information (privacy implications); user control over the amount of personalization.
  17. 17. Overview of Personalized Search: Typically a 3-step process: 1. Obtain results (n >> 10). 2. Compute similarity (results, user). 3. Re-rank the results.
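The three steps on this slide can be sketched as follows. This is a minimal illustration, not the talk's implementation: it assumes the results are plain text snippets, that the user is represented by a bag of words, and that cosine similarity over term counts is the similarity measure. The names `tokenize`, `cosine` and `rerank` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer would do)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(results, profile_text):
    """Step 2 and 3: score each result against the user profile, then re-rank.

    `results` is the list obtained in step 1, in the engine's original order.
    Ties keep the original engine order.
    """
    profile = Counter(tokenize(profile_text))
    scored = [
        (cosine(Counter(tokenize(r)), profile), position, r)
        for position, r in enumerate(results)
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [r for _score, _position, r in scored]


results = [
    "jaguar car dealer prices",
    "jaguar big cat habitat",
    "jaguar os release",
]
print(rerank(results, "wildlife cat habitat zoo"))
```

In practice step 1 would fetch many more than ten results, so that documents the generic ranking placed low still have a chance to move up.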
  18. 18. [No text captured for this slide]
  19. 19. [No text captured for this slide]
  20. 20. Techniques: Co-active Techniques; Pro-active Techniques; Collaborative Filtering; User Profile based Result Pruning; User Profile based Query Expansion.
  21. 21. Problem Description. Personalized Search - Issues: What to use to personalize? How to personalize? When not to personalize? How to know personalization helped?
  22. 22. Problem Description. We focus on the issue "How to personalize?" Problem statement: how to learn to personalize future searches using past search history; how to model and represent past search contexts; how to use them to improve search results.
  23. 23. Solution - Outline. Model and represent past user feedback (learning the user profile): use implicit feedback; long-term learning; user contexts as triples {user, query, {relevant documents}}. Improve search results (reranking): get the initial search results; take the top few, rescore them using the user profile, and rearrange.
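One way to hold the user contexts described on this slide is a small record type per {user, query, {relevant documents}} triple. The sketch below is an assumption-laden illustration: it treats clicked documents as the implicit feedback signal, and `UserContext` and `contexts_from_clicks` are hypothetical names, not part of the talk.

```python
from collections import defaultdict
from typing import FrozenSet, NamedTuple


class UserContext(NamedTuple):
    """One triple {user, query, {relevant documents}}."""
    user: str
    query: str
    relevant_docs: FrozenSet[str]


def contexts_from_clicks(click_log):
    """Group (user, query, clicked_doc) rows into one context per (user, query).

    Treating a click as implicit relevance feedback is an assumption here.
    """
    grouped = defaultdict(set)
    for user, query, doc in click_log:
        grouped[(user, query)].add(doc)
    return [
        UserContext(user, query, frozenset(docs))
        for (user, query), docs in grouped.items()
    ]


log = [
    ("u1", "jaguar", "d1"),
    ("u1", "jaguar", "d2"),
    ("u2", "python", "d3"),
]
for context in contexts_from_clicks(log):
    print(context)
```

Accumulating these triples over many sessions is what makes the learning long term rather than session bound.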
  24. 24. Contributions. I Search: a suite of approaches for Personalized Web Search. Proposed personalized search approaches; baseline; basic retrieval methods; automatic evaluation; analysis of query log.
  25. 25. Review of Personalized Search. Personalized Search: query logs; machine learning; language modeling; community based; others.
  26. 26. I Search: A suite of Techniques for Personalized IR. Suite of approaches??? Statistical language modeling based approaches (simple N-gram based methods; noisy channel model based method); machine learning based approach (Ranking SVM based method); personalization without relevance feedback (simple N-gram based method).
  27. 27. Statistical Language Modeling based Approaches: Overview. From user contexts, capture statistical properties of texts and use them to improve search results. Different contexts: unigrams and bigrams (simple N-gram based approaches); relationship between query and document words (noisy channel based approach).
  28. 28. Simple N-gram based approaches. N-gram: general term for a sequence of N adjacent words (1-gram: unigram, 2-gram: bigram). Capture statistical properties of text: single words (unigrams); two adjacent words (bigrams).
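For concreteness, unigrams and bigrams can be extracted from a token list with a single helper. This is a generic sketch; the function name `ngrams` and the whitespace tokenization are assumptions for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n adjacent tokens, as tuples, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "personalized web search for mobile web search".split()

unigram_counts = Counter(ngrams(tokens, 1))
bigram_counts = Counter(ngrams(tokens, 2))

print(unigram_counts.most_common(2))
print(bigram_counts[("web", "search")])
```

Counting these units over a user's past texts is what yields the statistical properties the slide refers to.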
  29. 29. Learning user profile. Given: past search history Hu = {(q1, rf1), (q2, rf2), …, (qn, rfn)}; rfall = concatenation of all rf. For each unigram wi [formula not captured]. User profile [not captured].
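The per-unigram formula on this slide was not captured in the transcript. As a stand-in, the sketch below assumes the simplest choice, a maximum-likelihood estimate: the probability of a word is its count in the concatenated feedback text divided by the total number of words. This estimator and the name `learn_unigram_profile` are assumptions, not the author's stated method.

```python
from collections import Counter


def learn_unigram_profile(history):
    """Build a unigram profile from Hu = [(query, relevance_feedback_text), ...].

    Assumed estimator: P(w | user) = count(w in rf_all) / len(rf_all).
    """
    # rf_all is the concatenation of all relevance-feedback texts.
    rf_all = " ".join(rf for _query, rf in history)
    counts = Counter(rf_all.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


history = [
    ("jaguar", "big cat"),
    ("zoo", "cat habitat"),
]
profile = learn_unigram_profile(history)
print(profile)
```

A real profile would normally be smoothed against a background collection so that unseen words do not get zero probability.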
  30. 30. Sample user profile
  31. 31. Reranking. In general: LM for IR. Our approach.
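The formulas for this slide are missing from the transcript. For orientation, the first line below is the standard query-likelihood form of language modeling for IR with Jelinek-Mercer smoothing, where $Q = q_1 \dots q_m$ is the query, $D$ a document, $C$ the collection and $\lambda \in (0, 1)$ a smoothing weight. The second line is only a hypothetical profile-based rescoring in the same spirit, using a user unigram model $P(w \mid u)$; it is an assumption for illustration and should not be read as the approach actually presented.

```latex
% Standard query-likelihood ranking with Jelinek-Mercer smoothing
P(Q \mid D) = \prod_{i=1}^{m} \left[ (1-\lambda)\, P_{\mathrm{ML}}(q_i \mid D) + \lambda\, P(q_i \mid C) \right]

% Hypothetical profile-based rescoring of a top-ranked document D for user u
\mathrm{score}(D, u) = \frac{1}{|D|} \sum_{w \in D} \log \left[ (1-\lambda)\, P(w \mid u) + \lambda\, P(w \mid C) \right]
```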
  32. 32. Noisy Channel based Approach. Documents and queries are different information spaces: queries are short and concise; documents are more descriptive. Most methods for retrieval or personalized web search do not model this. Capture the relationship between query and document words.
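The results slide later names IBM Model 1 as one of the training schemes, so a relationship between query words and document words can be illustrated with a bare-bones Model 1 EM loop. This is a simplified sketch under stated assumptions: no NULL word, a fixed number of iterations, toy data, and a hypothetical function name `train_model1`. It is not the system described in the talk.

```python
from collections import defaultdict


def train_model1(pairs, iterations=10):
    """Estimate t[(q_word, d_word)], read as P(q_word | d_word), with EM.

    `pairs` is a list of (query_tokens, document_tokens).
    """
    query_vocab = {q for query, _doc in pairs for q in query}
    # Uniform initialisation over the query vocabulary.
    t = defaultdict(lambda: 1.0 / len(query_vocab))

    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: distribute each query word over the document words.
        for query, doc in pairs:
            for q in query:
                norm = sum(t[(q, d)] for d in doc)
                for d in doc:
                    fraction = t[(q, d)] / norm
                    count[(q, d)] += fraction
                    total[d] += fraction
        # M-step: renormalise per document word.
        for (q, d), c in count.items():
            t[(q, d)] = c / total[d]
    return dict(t)


pairs = [
    (["car", "price"], ["automobile", "cost"]),
    (["car", "review"], ["automobile", "rating"]),
    (["bike", "review"], ["bicycle", "rating"]),
]
t = train_model1(pairs)
print(round(t[("car", "automobile")], 3), round(t[("price", "automobile")], 3))
```

Trained on a user's own (query, relevant document) pairs, such a table captures which document vocabulary that user's short queries tend to stand for.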
  33. 33. Machine Learning based Approaches: Introduction. Most machine learning for IR treats ranking as a binary classification problem: “relevant” and “non-relevant”. Click-through data: a click is not an absolute relevance judgment but a relative one, i.e., assuming clicked means relevant and unclicked means irrelevant is wrong. Clicks are biased. Partial relative relevance: clicked documents are more relevant than the unclicked documents.
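The relative reading of clicks can be turned into training data as pairwise preferences, which is the kind of input a Ranking SVM learns from. The sketch below only builds the pairs and does not train a model; it assumes every clicked document is preferred over every unclicked document shown for the same query, and `preference_pairs` is a hypothetical name.

```python
def preference_pairs(ranked_docs, clicked):
    """Return (preferred, less_preferred) pairs for one query.

    Each clicked document is taken to be more relevant than each unclicked
    document in the same result list; no absolute labels are assigned.
    """
    clicked_set = set(clicked)
    clicked_docs = [d for d in ranked_docs if d in clicked_set]
    unclicked_docs = [d for d in ranked_docs if d not in clicked_set]
    return [(c, u) for c in clicked_docs for u in unclicked_docs]


pairs = preference_pairs(["d1", "d2", "d3"], clicked=["d2"])
print(pairs)
```

A list with no clicks, or with every result clicked, yields no pairs, which matches the point that clicks carry only relative information.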
  34. 34. Personalized Search without Relevance Feedback: Introduction. Can personalization be done without relevance feedback about which documents are relevant? How informative are the queries posed by users? Is the information contained in the queries enough to personalize?
  35. 35. Approach. Past queries of the user are available. Make effective use of past queries. Simple N-gram based approach.
  36. 36. Experiment Results. Language Modeling – best results! Interesting framework; Personalized Search; simple N-gram based approaches also worked well; noisy channel model worked best; extracting synthetic queries helped; different training schemes (IBM Model 1 vs. GIZA++; snippet vs. document). Machine Learning – competitive results; different features and weights. Without Relevance Feedback – very encouraging results; simple approach worked well; sparsity – query log was useful.
  37. 37. Agenda: Evolution of Search Engines; Information Retrieval vs. Extraction vs. Access; Personalization in IR, IE and IA; Applications in Personalized IA (Personalized Search Engine for Mobile Phones; Personalized Summarization for Mobile Devices); Conclusions.
  38. 38. “Personalized” Search Engine for mobile devices. To develop a “personalized” search engine for mobile devices that will produce more relevant results based on the query and the “context”. What do we mean by “personalized” search? The user will be able to configure the search interfaces (explicit feedback); the system will observe user behavior and customize itself to suit the user’s needs (implicit feedback). What do we mean by “context”? User, time, location, … The goal is to make search accessible on Nokia mobile devices and to make use of the mobile aspects for personalization. (C) Vasudeva Varma, IIIT Hyderabad, India
  39. 39. Scope of the Application: Client Side; Server Side.
  40. 40. Problem Re-Definition. Dynamic user behavior tracking: an observer (client module) that keeps track of all “relevant” user actions. Analysis of user actions: interpret the user actions to derive user interests (categories of interests) so that more relevant results are displayed. Construction of user profile implicitly: implicit supervised learning. Personalization: based on query; based on user profile; based on other parameters such as time, location.
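The step from observed user actions to categories of interest can be pictured as a weighted tally. Everything specific in the sketch below is hypothetical: the action types, their weights in `ACTION_WEIGHTS`, and the function `category_interests` are invented for illustration, since the slides do not say which actions are tracked or how they are weighted.

```python
from collections import defaultdict

# Hypothetical weights: how strongly each observed action signals interest.
ACTION_WEIGHTS = {"click": 1.0, "bookmark": 2.0, "dismiss": -0.5}


def category_interests(actions):
    """Turn (action_type, category) observations into normalised interests.

    Categories whose net score is not positive are dropped; the rest are
    scaled so that the interest values sum to 1.
    """
    scores = defaultdict(float)
    for action_type, category in actions:
        scores[category] += ACTION_WEIGHTS.get(action_type, 0.0)
    positive = {c: s for c, s in scores.items() if s > 0}
    total = sum(positive.values())
    if total == 0:
        return {}
    return {c: s / total for c, s in positive.items()}


observed = [
    ("click", "sports"),
    ("bookmark", "sports"),
    ("click", "news"),
    ("dismiss", "ads"),
]
print(category_interests(observed))
```

On a mobile client, such a tally could also be kept per time slot or location, in line with the other personalization parameters listed on the slide.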
  41. 41. Solution Overview
  42. 42. Personalized Summarization: Motivation. The success that search engine providers have found on the PC has failed to translate to the mobile phone. Why? Because trying to force a PC-based search experience into a mobile device falls short in a key area of usability. Search queries typically return hundreds of potential hits. Making sense of such output is difficult. The results may or may not be of interest to the user. We are looking for a faster and easier way to access precise information on our mobile devices.
  43. 43. Challenges. Can we offer users a simpler, friendlier and more intuitive experience? We aim to provide more information with less payload, in the form of a summary that takes into account context, history, preferences, device capabilities and social network.
  44. 44. System Model: Search Engine
  45. 45. Summary. Current search engines are inadequate, and current know-how is only the tip of the iceberg. IR, IE and IA areas have enjoyed huge commercial success and have huge growth potential. Personalization is perhaps the next big wave. Various personalization techniques are available, yet this is a very fertile research field. The two personalization applications shown are just examples of many possibilities.
  46. 46. Thank You – Questions? Vasudeva Varma, IIIT Hyderabad, vv@iiit.ac.in or www.iiit.ac.in/~vasu
