SlideShare a Scribd company logo
1 of 53
Modern Information Retrieval
Relevance Feedback
The IR Black Box
Search
Query
Ranked List
Standard Model of IR
 Assumptions:
 The goal is maximizing precision and recall
simultaneously
 The information need remains static
 The value is in the resulting document set
Problems with Standard Model
 Users learn during the search process:
 Scanning titles of retrieved documents
 Reading retrieved documents
 Viewing lists of related topics/thesaurus terms
 Navigating hyperlinks
 Some users don’t like long (apparently)
disorganized lists of documents
IR is an Iterative Process
Repositories
Workspace
Goals
IR is a Dialog
 The exchange doesn’t end with first answer
 Users can recognize elements of a useful
answer, even when incomplete
 Questions and understanding changes as the
process continues
Bates’ “Berry-Picking” Model
 Standard IR model
 Assumes the information need remains the
same throughout the search process
 Berry-picking model
 Interesting information is scattered like
berries among bushes
 The query is continually shifting
Berry-Picking Model
Q0
Q1
Q2
Q3
Q4
Q5
A sketch of a searcher… “moving through many actions towards a
general goal of satisfactory completion of research related to an
information need.” (after Bates 89)
Berry-Picking Model (cont.)
 The query is continually shifting
 New information may yield new ideas and
new directions
 The information need
 Is not satisfied by a single, final retrieved set
 Is satisfied by a series of selections and bits
of information found along the way
Restricted Form of the IR Problem
 The system has available only pre-
existing, “canned” text passages
 Its response is limited to selecting from
these passages and presenting them to
the user
 It must select, say, 10 or 20 passages out
of millions or billions!
Information Retrieval
 Revised Task Statement:
Build a system that retrieves documents that
users are likely to find relevant to their queries
 This set of assumptions underlies the field
of Information Retrieval
Relevance Feedback
 Take advantage of user relevance judgments in the
retrieval process:
 User issues a (short, simple) query and gets back an
initial hit list
 User marks hits as relevant or non-relevant
 The system computes a better representation of the
information need based on this feedback
 Single or multiple iterations (although little is typically
gained after one iteration)
 Idea: you may not know what you’re looking for, but
you’ll know when you see it
Outline
 Explicit feedback: users explicitly mark
relevant and irrelevant documents
 Implicit feedback: system attempts to infer
user intentions based on observable
behavior
 Blind feedback: feedback in absence of
any evidence, explicit or otherwise
Why relevance feedback?
 You may not know what you’re looking for,
but you’ll know when you see it
 Query formulation may be difficult; simplify
the problem through iteration
 Facilitate vocabulary discovery
 Boost recall: “find me more documents
like this…”
Relevance Feedback Example
Image Search Engine
http://nayana.ece.ucsb.edu/imsearch/imsearch.html
Initial Results
Relevance Feedback
Revised Results
Updating Queries
 Let’s assume that there is an optimal query
 The goal of relevance feedback is to bring the user
query closer to the optimal query
 How does relevance feedback actually work?
 Use relevance information to update query
 Use query to retrieve new set of documents
 What exactly do we “feed back”?
 Boost weights of terms from relevant documents
 Add terms from relevant documents to the query
 Note that this is hidden from the user
Query Modification
 Changing or Expanding a query can lead
to better results
 Problem: how to reformulate the query?
 Thesaurus expansion:
 Suggest terms similar to query terms
 Relevance feedback:
 Suggest terms (and documents) similar to
retrieved documents that have been judged to be
relevant
Relevance Feedback
 Main Idea:
 Modify existing query based on relevance
judgements
 Extract terms from relevant documents and add
them to the query
 and/or re-weight the terms already in the query
 Two main approaches:
 Automatic (psuedo-relevance feedback)
 Users select relevant documents
 Users/system select terms from an
automatically-generated list
Picture of Relevance Feedback
x
x
x
x
o
o
o
Revised query
x non-relevant documents
o relevant documents
o
o
o
x
x
x
x
x
x
x
x
x
x
x
x

x
x
Initial query

x
Rocchio Algorithm
 Used in practice:
 New query
Moves toward relevant documents
Away from irrelevant documents







nr
j
r
j D
d
j
nr
D
d
j
r
m d
D
d
D
q
q 




 1
1
0 


qm = modified query vector;
q0 = original query vector;
α,β,γ: weights (hand-chosen or set
empirically);
Dr = set of known relevant doc vectors;
Dnr = set of known irrelevant doc vectors
notes on feedback
 If  is set = 0, we have a positive feedback
system; often preferred
 the feedback formula can be applied
repeatedly, asking user for relevance
information at each iteration
 relevance feedback is generally
considered to be very effective
 one drawback is that it is not fully
automatic.
Simple feedback example:
T = {pudding, jam, traffic, lane, treacle}
d1 = (0.8, 0.8, 0.0, 0.0, 0.4),
d2 = (0.0, 0.0, 0.9, 0.8, 0.0),
d3 = (0.8, 0.0, 0.0, 0.0, 0.8)
d4 = (0.6, 0.9, 0.5, 0.6, 0.0)
Recipe for jam pudding
DoT report on traffic lanes
Radio item on traffic jam in Pudding Lane
Recipe for treacle pudding
Display first 2 documents that match the following
query:
q0 = (1.0, 0.6, 0.0, 0.0, 0.0)
r = (0.91, 0.0, 0.6, 0.73)
Retrieved documents are:
d1 : Recipe for jam pudding
d4 : Radio item on traffic jam
relevant
not relevant
Suppose we set  and  to 0.5,  to 0.2
qm = q0  di / | HR |  di / | HNR|
= 0.5 q0 + 0.5 d1  0.2 d4
= 0.5  (1.0, 0.6, 0.0, 0.0, 0.0)
+ 0.5  (0.8, 0.8, 0.0, 0.0, 0.4)
 0.2  (0.6, 0.9, 0.5, 0.6, 0.0)
= (0.78, 0.52,  0.1,  0.12, 0.2)
di  HR di  HNR
Positive and Negative Feedback
Simple feedback example:
T = {pudding, jam, traffic, lane, treacle}
d1 = (0.8, 0.8, 0.0, 0.0, 0.4),
d2 = (0.0, 0.0, 0.9, 0.8, 0.0),
d3 = (0.8, 0.0, 0.0, 0.0, 0.8)
d4 = (0.6, 0.9, 0.5, 0.6, 0.0)
Display first 2 documents that match the following
query:
qm = (0.78, 0.52,  0.1,  0.12, 0.2)
r’ = (0.96, 0.0, 0.86, 0.63) Retrieved documents are:
d1 : Recipe for jam pudding
d3 : Recipe for treacle pud
relevant
relevant
Relevance Feedback: Cost
 Speed and efficiency issues
 System needs to spend time analyzing
documents
 Longer queries are usually slower
 Users often reluctant to provide explicit
feedback
 It’s often harder to understand why a
particular document was retrieved
Koenemann and Belkin’s Work
 Well-known study on relevance feedback
in information retrieval
 Questions asked:
 Does relevance feedback improve results?
 Is user control over relevance feedback
helpful?
 How do different levels of user control effect
results?
Jürgen Koenemann and Nicholas J. Belkin. (1996) A Case For Interaction: A Study of
Interactive Information Retrieval Behavior and Effectiveness. Proceedings of SIGCHI 1996
Conference on Human Factors in Computing Systems (CHI 1996).
What’s the best interface?
 Opaque (black box)
 User doesn’t get to see the relevance
feedback process
 Transparent
 User shown relevance feedback terms, but
isn’t allowed to modify query
 Penetrable (可渗透的)
 User shown relevance feedback terms and is
allowed to modify the query
Which do you think worked best?
Query Interface
Penetrable Interface
Users get to select which
terms they want to add
Study Details
 Subjects started with a tutorial
 64 novice searchers (43 female, 21 male)
 Goal is to keep modifying the query until
they’ve developed one that gets high
precision
Procedure
 Baseline (Trial 1)
 Subjects get tutorial on relevance feedback
 Experimental condition (Trial 2)
 Shown one of four modes: no relevance
feedback, opaque, transparent, penetrable
 Evaluation metric used: precision at 30
documents
Relevance feedback works!
 Subjects using the relevance feedback
interfaces performed 17-34% better
 Subjects in the penetrable condition
performed 15% better than those in
opaque and transparent conditions
Behavior Results
 Search times approximately equal
 Precision increased in first few iterations
 Penetrable interface required fewer
iterations to arrive at final query
 Queries with relevance feedback are
much longer
 But fewer terms with the penetrable interface
 users were more selective about which
terms to add
Implicit Feedback
 Users are often reluctant to provide
relevance judgments
 Some searches are precision-oriented
 They’re lazy!
 Can we gather feedback without requiring
the user to do anything?
 Idea: gather feedback from observed user
behavior
Observable Behavior
Minimum Scope
Segment Object Class
Behavior
Category
Examine
Retain
Reference
Annotate
View
Listen
Select
Print Bookmark
Save
Purchase
Delete
Subscribe
Copy / paste
Quote
Forward
Reply
Link
Cite
Mark up Rate
Publish
Organize
So far…
 Explicit feedback: take advantage of user-
supplied relevance judgments
 Implicit feedback: observe user behavior
and draw inferences
 Can we perform feedback without having
a user in the loop?
Blind Relevance Feedback
 Also called “pseudo(伪)relevance feedback”
 Motivation: it’s difficult to elicit relevance judgments from
users
 Can we automate this process?
 Idea: take top n documents, and simply assume that
they are relevant
 Perform relevance feedback as before
 If the initial hit list is reasonable, system should pick up
good query terms
 Does it work?
BRF Experiment
 Retrieval engine: Indri
 Test collection: TREC, topics 301-450
 Procedure:
 Used topic description as query to generate
initial hit list
 Selected top 20 terms from top 20 hits using
tf.idf
 Added these terms to the original query
Results
MAP R-Precision
No feedback 0.1591 0.2022
With feedback 0.1806
(+13.5%)
0.2222
(+9.9%)
Blind relevance feedback doesn’t always help!
The Complete Landscape
 Explicit, implicit, blind feedback: it’s all
about manipulating terms
 Dimensions of query expansion
 “Local” vs. “global”
 User involvement vs. no user involvement
Local vs. Global
 “Local” methods
 Only considers documents that have be
retrieved by an initial query
 Query specific
 Computations must be performed on the fly
 “Global” methods
 Takes entire document collection into account
 Does not depend on the query
 Thesauri can be computed off-line (for faster
access)
User Involvement
 Query expansion can be done
automatically
 New terms added without user intervention
 Or it can place a user in the loop
 System presents suggested terms
 Must consider interface issues
Global Methods
 Controlled vocabulary
 For example, MeSH terms
 Manual thesaurus
 For example, WordNet
 Automatically derived thesaurus
 For example, based on co-occurrence
statistics
Using Controlled Vocabulary
Thesauri
 A thesaurus may contain information
about lexical semantic relations:
 Synonyms: similar words
e.g., violin → fiddle
 Hypernyms: more general words
e.g., violin → instrument
 Hyponyms: more specific words
e.g., violin → Stradivari
 Meronyms: parts
e.g., violin → strings
Using Manual Thesauri
 For each query term t, added synonyms
and related words from thesaurus
 feline → feline cat
 Generally improves recall
 Often hurts precision
 “interest rate”  “interest rate fascinate
evaluate”
 Manual thesauri are expensive to produce
and maintain
Automatic Thesauri Generation
 Attempt to generate a thesaurus
automatically by analyzing the document
collection
 Two possible approaches
 Co-occurrence statistics (co-occurring words
are more likely to be similar)
 Shallow analysis of grammatical relations
 Entities that are grown, cooked, eaten, and
digested are more likely to be food items.
Automatic Thesauri: Example
Automatic Thesauri: Discussion
 Quality of associations is usually a problem
 Term ambiguity may introduce irrelevant statistically
correlated terms.
 “Apple computer”  “Apple red fruit computer”
 Problems:
 False positives: Words deemed similar that are not
 False negatives: Words deemed dissimilar that are
similar
 Since terms are highly correlated anyway, expansion
may not retrieve many additional documents
Key Points
 Moving beyond the black box…
interaction is key!
 Different types of feedback:
 Explicit (user does the work)
 Implicit (system watches the user and guess)
 Blind (don’t even involve the user)
 Query expansion as a general mechanism

More Related Content

Similar to Modern information Retrieval-Relevance Feedback

Open domain Question Answering System - Research project in NLP
Open domain  Question Answering System - Research project in NLPOpen domain  Question Answering System - Research project in NLP
Open domain Question Answering System - Research project in NLPGVS Chaitanya
 
User Experience as an Organizational Development Tool
User Experience as an Organizational Development ToolUser Experience as an Organizational Development Tool
User Experience as an Organizational Development ToolDonovan Chandler
 
What Is Log Analyis
What Is Log AnalyisWhat Is Log Analyis
What Is Log AnalyisJim Jansen
 
HCI 3e - Ch 9: Evaluation techniques
HCI 3e - Ch 9:  Evaluation techniquesHCI 3e - Ch 9:  Evaluation techniques
HCI 3e - Ch 9: Evaluation techniquesAlan Dix
 
What is requirement gathering chap3 1.pptx
What is requirement gathering chap3 1.pptxWhat is requirement gathering chap3 1.pptx
What is requirement gathering chap3 1.pptxtadudemise
 
Sweeny group think-ias2015
Sweeny group think-ias2015Sweeny group think-ias2015
Sweeny group think-ias2015Marianne Sweeny
 
Research Report on Document Indexing-Nithish Kumar
Research Report on Document Indexing-Nithish KumarResearch Report on Document Indexing-Nithish Kumar
Research Report on Document Indexing-Nithish KumarNithish Kumar
 
Research report nithish
Research report nithishResearch report nithish
Research report nithishNithish Kumar
 
Understanding Stakeholder Needs
Understanding Stakeholder NeedsUnderstanding Stakeholder Needs
Understanding Stakeholder NeedsSandeep Ganji
 
2 Studies UX types should know about (Straub UXPA unconference13)
2 Studies UX types should know about (Straub UXPA unconference13)2 Studies UX types should know about (Straub UXPA unconference13)
2 Studies UX types should know about (Straub UXPA unconference13)Kath Straub
 
Information Retrieval-06
Information Retrieval-06Information Retrieval-06
Information Retrieval-06Jeet Das
 
Invited Lecture on Interactive Information Retrieval
Invited Lecture on Interactive Information RetrievalInvited Lecture on Interactive Information Retrieval
Invited Lecture on Interactive Information RetrievalDavidMaxwell77
 
Information Access on Social Web
Information Access on Social WebInformation Access on Social Web
Information Access on Social WebDaqing He
 
EDUCAUSE: Transformation of Help Desk Services at Rutgers
EDUCAUSE: Transformation of Help Desk Services at RutgersEDUCAUSE: Transformation of Help Desk Services at Rutgers
EDUCAUSE: Transformation of Help Desk Services at RutgersFrank Reda
 
Getting Them There: A Small-Scale Usability Test of a University Library Website
Getting Them There: A Small-Scale Usability Test of a University Library WebsiteGetting Them There: A Small-Scale Usability Test of a University Library Website
Getting Them There: A Small-Scale Usability Test of a University Library Websitejsimon6
 
Scalable Exploration of Relevance Prospects to Support Decision Making
Scalable Exploration of Relevance Prospects to Support Decision MakingScalable Exploration of Relevance Prospects to Support Decision Making
Scalable Exploration of Relevance Prospects to Support Decision MakingKatrien Verbert
 
Open domain question answering system using semantic role labeling
Open domain question answering system using semantic role labelingOpen domain question answering system using semantic role labeling
Open domain question answering system using semantic role labelingeSAT Publishing House
 

Similar to Modern information Retrieval-Relevance Feedback (20)

Open domain Question Answering System - Research project in NLP
Open domain  Question Answering System - Research project in NLPOpen domain  Question Answering System - Research project in NLP
Open domain Question Answering System - Research project in NLP
 
User Experience as an Organizational Development Tool
User Experience as an Organizational Development ToolUser Experience as an Organizational Development Tool
User Experience as an Organizational Development Tool
 
MBA
MBAMBA
MBA
 
What Is Log Analyis
What Is Log AnalyisWhat Is Log Analyis
What Is Log Analyis
 
HCI 3e - Ch 9: Evaluation techniques
HCI 3e - Ch 9:  Evaluation techniquesHCI 3e - Ch 9:  Evaluation techniques
HCI 3e - Ch 9: Evaluation techniques
 
Collaborative filtering
Collaborative filteringCollaborative filtering
Collaborative filtering
 
What is requirement gathering chap3 1.pptx
What is requirement gathering chap3 1.pptxWhat is requirement gathering chap3 1.pptx
What is requirement gathering chap3 1.pptx
 
Sweeny group think-ias2015
Sweeny group think-ias2015Sweeny group think-ias2015
Sweeny group think-ias2015
 
Research Report on Document Indexing-Nithish Kumar
Research Report on Document Indexing-Nithish KumarResearch Report on Document Indexing-Nithish Kumar
Research Report on Document Indexing-Nithish Kumar
 
Research report nithish
Research report nithishResearch report nithish
Research report nithish
 
Understanding Stakeholder Needs
Understanding Stakeholder NeedsUnderstanding Stakeholder Needs
Understanding Stakeholder Needs
 
2 Studies UX types should know about (Straub UXPA unconference13)
2 Studies UX types should know about (Straub UXPA unconference13)2 Studies UX types should know about (Straub UXPA unconference13)
2 Studies UX types should know about (Straub UXPA unconference13)
 
Introduction.pptx
Introduction.pptxIntroduction.pptx
Introduction.pptx
 
Information Retrieval-06
Information Retrieval-06Information Retrieval-06
Information Retrieval-06
 
Invited Lecture on Interactive Information Retrieval
Invited Lecture on Interactive Information RetrievalInvited Lecture on Interactive Information Retrieval
Invited Lecture on Interactive Information Retrieval
 
Information Access on Social Web
Information Access on Social WebInformation Access on Social Web
Information Access on Social Web
 
EDUCAUSE: Transformation of Help Desk Services at Rutgers
EDUCAUSE: Transformation of Help Desk Services at RutgersEDUCAUSE: Transformation of Help Desk Services at Rutgers
EDUCAUSE: Transformation of Help Desk Services at Rutgers
 
Getting Them There: A Small-Scale Usability Test of a University Library Website
Getting Them There: A Small-Scale Usability Test of a University Library WebsiteGetting Them There: A Small-Scale Usability Test of a University Library Website
Getting Them There: A Small-Scale Usability Test of a University Library Website
 
Scalable Exploration of Relevance Prospects to Support Decision Making
Scalable Exploration of Relevance Prospects to Support Decision MakingScalable Exploration of Relevance Prospects to Support Decision Making
Scalable Exploration of Relevance Prospects to Support Decision Making
 
Open domain question answering system using semantic role labeling
Open domain question answering system using semantic role labelingOpen domain question answering system using semantic role labeling
Open domain question answering system using semantic role labeling
 

More from HasanulFahmi2

Week14-Multimedia Information Retrieval.pptx
Week14-Multimedia Information Retrieval.pptxWeek14-Multimedia Information Retrieval.pptx
Week14-Multimedia Information Retrieval.pptxHasanulFahmi2
 
Week11-Purchasing Managementtttttttttttt
Week11-Purchasing ManagementttttttttttttWeek11-Purchasing Managementtttttttttttt
Week11-Purchasing ManagementttttttttttttHasanulFahmi2
 
Android Architecture, Environment, and Components.pptx
Android Architecture, Environment, and Components.pptxAndroid Architecture, Environment, and Components.pptx
Android Architecture, Environment, and Components.pptxHasanulFahmi2
 
Topic 1. Self Discipline _ Self Control.pptx
Topic 1. Self  Discipline _ Self Control.pptxTopic 1. Self  Discipline _ Self Control.pptx
Topic 1. Self Discipline _ Self Control.pptxHasanulFahmi2
 
komdat3-PHYSICAL LAYER DAN MEDIA.pptx
komdat3-PHYSICAL LAYER DAN MEDIA.pptxkomdat3-PHYSICAL LAYER DAN MEDIA.pptx
komdat3-PHYSICAL LAYER DAN MEDIA.pptxHasanulFahmi2
 
komdat2- MODEL JARINGAN.pptx
komdat2- MODEL JARINGAN.pptxkomdat2- MODEL JARINGAN.pptx
komdat2- MODEL JARINGAN.pptxHasanulFahmi2
 
komdat1-PENDAHULUAN.pptx
komdat1-PENDAHULUAN.pptxkomdat1-PENDAHULUAN.pptx
komdat1-PENDAHULUAN.pptxHasanulFahmi2
 

More from HasanulFahmi2 (10)

Week14-Multimedia Information Retrieval.pptx
Week14-Multimedia Information Retrieval.pptxWeek14-Multimedia Information Retrieval.pptx
Week14-Multimedia Information Retrieval.pptx
 
Week11-Purchasing Managementtttttttttttt
Week11-Purchasing ManagementttttttttttttWeek11-Purchasing Managementtttttttttttt
Week11-Purchasing Managementtttttttttttt
 
Android Architecture, Environment, and Components.pptx
Android Architecture, Environment, and Components.pptxAndroid Architecture, Environment, and Components.pptx
Android Architecture, Environment, and Components.pptx
 
Topic 1. Self Discipline _ Self Control.pptx
Topic 1. Self  Discipline _ Self Control.pptxTopic 1. Self  Discipline _ Self Control.pptx
Topic 1. Self Discipline _ Self Control.pptx
 
2. definisi AI.pptx
2. definisi AI.pptx2. definisi AI.pptx
2. definisi AI.pptx
 
komdat3-PHYSICAL LAYER DAN MEDIA.pptx
komdat3-PHYSICAL LAYER DAN MEDIA.pptxkomdat3-PHYSICAL LAYER DAN MEDIA.pptx
komdat3-PHYSICAL LAYER DAN MEDIA.pptx
 
komdat2- MODEL JARINGAN.pptx
komdat2- MODEL JARINGAN.pptxkomdat2- MODEL JARINGAN.pptx
komdat2- MODEL JARINGAN.pptx
 
komdat1-PENDAHULUAN.pptx
komdat1-PENDAHULUAN.pptxkomdat1-PENDAHULUAN.pptx
komdat1-PENDAHULUAN.pptx
 
Pertemuan 4.pdf
Pertemuan 4.pdfPertemuan 4.pdf
Pertemuan 4.pdf
 
Pertemuan_I.pptx
Pertemuan_I.pptxPertemuan_I.pptx
Pertemuan_I.pptx
 

Recently uploaded

Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |aasikanpl
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Temporomandibular joint Muscles of Mastication
Temporomandibular joint Muscles of MasticationTemporomandibular joint Muscles of Mastication
Temporomandibular joint Muscles of Masticationvidulajaib
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxFarihaAbdulRasheed
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsssuserddc89b
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trssuser06f238
 
Welcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work DayWelcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work DayZachary Labe
 
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaDashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaPraksha3
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
insect anatomy and insect body wall and their physiology
insect anatomy and insect body wall and their  physiologyinsect anatomy and insect body wall and their  physiology
insect anatomy and insect body wall and their physiologyDrAnita Sharma
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
Heredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsHeredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsCharlene Llagas
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 

Recently uploaded (20)

Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Temporomandibular joint Muscles of Mastication
Temporomandibular joint Muscles of MasticationTemporomandibular joint Muscles of Mastication
Temporomandibular joint Muscles of Mastication
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physics
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
Welcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work DayWelcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work Day
 
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaDashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
insect anatomy and insect body wall and their physiology
insect anatomy and insect body wall and their  physiologyinsect anatomy and insect body wall and their  physiology
insect anatomy and insect body wall and their physiology
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Heredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsHeredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of Traits
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 

Modern information Retrieval-Relevance Feedback

  • 2. The IR Black Box Search Query Ranked List
  • 3. Standard Model of IR  Assumptions:  The goal is maximizing precision and recall simultaneously  The information need remains static  The value is in the resulting document set
  • 4. Problems with Standard Model  Users learn during the search process:  Scanning titles of retrieved documents  Reading retrieved documents  Viewing lists of related topics/thesaurus terms  Navigating hyperlinks  Some users don’t like long (apparently) disorganized lists of documents
  • 5. IR is an Iterative Process Repositories Workspace Goals
  • 6. IR is a Dialog  The exchange doesn’t end with first answer  Users can recognize elements of a useful answer, even when incomplete  Questions and understanding changes as the process continues
  • 7. Bates’ “Berry-Picking” Model  Standard IR model  Assumes the information need remains the same throughout the search process  Berry-picking model  Interesting information is scattered like berries among bushes  The query is continually shifting
  • 8. Berry-Picking Model Q0 Q1 Q2 Q3 Q4 Q5 A sketch of a searcher… “moving through many actions towards a general goal of satisfactory completion of research related to an information need.” (after Bates 89)
  • 9. Berry-Picking Model (cont.)  The query is continually shifting  New information may yield new ideas and new directions  The information need  Is not satisfied by a single, final retrieved set  Is satisfied by a series of selections and bits of information found along the way
  • 10. Restricted Form of the IR Problem  The system has available only pre- existing, “canned” text passages  Its response is limited to selecting from these passages and presenting them to the user  It must select, say, 10 or 20 passages out of millions or billions!
  • 11. Information Retrieval  Revised Task Statement: Build a system that retrieves documents that users are likely to find relevant to their queries  This set of assumptions underlies the field of Information Retrieval
  • 12. Relevance Feedback  Take advantage of user relevance judgments in the retrieval process:  User issues a (short, simple) query and gets back an initial hit list  User marks hits as relevant or non-relevant  The system computes a better representation of the information need based on this feedback  Single or multiple iterations (although little is typically gained after one iteration)  Idea: you may not know what you’re looking for, but you’ll know when you see it
  • 13. Outline  Explicit feedback: users explicitly mark relevant and irrelevant documents  Implicit feedback: system attempts to infer user intentions based on observable behavior  Blind feedback: feedback in absence of any evidence, explicit or otherwise
  • 14. Why relevance feedback?  You may not know what you’re looking for, but you’ll know when you see it  Query formulation may be difficult; simplify the problem through iteration  Facilitate vocabulary discovery  Boost recall: “find me more documents like this…”
  • 15. Relevance Feedback Example Image Search Engine http://nayana.ece.ucsb.edu/imsearch/imsearch.html
  • 19. Updating Queries  Let’s assume that there is an optimal query  The goal of relevance feedback is to bring the user query closer to the optimal query  How does relevance feedback actually work?  Use relevance information to update query  Use query to retrieve new set of documents  What exactly do we “feed back”?  Boost weights of terms from relevant documents  Add terms from relevant documents to the query  Note that this is hidden from the user
  • 20. Query Modification  Changing or Expanding a query can lead to better results  Problem: how to reformulate the query?  Thesaurus expansion:  Suggest terms similar to query terms  Relevance feedback:  Suggest terms (and documents) similar to retrieved documents that have been judged to be relevant
  • 21. Relevance Feedback  Main Idea:  Modify existing query based on relevance judgements  Extract terms from relevant documents and add them to the query  and/or re-weight the terms already in the query  Two main approaches:  Automatic (psuedo-relevance feedback)  Users select relevant documents  Users/system select terms from an automatically-generated list
  • 22. Picture of Relevance Feedback x x x x o o o Revised query x non-relevant documents o relevant documents o o o x x x x x x x x x x x x  x x Initial query  x
  • 23. Rocchio Algorithm  Used in practice:  New query Moves toward relevant documents Away from irrelevant documents        nr j r j D d j nr D d j r m d D d D q q       1 1 0    qm = modified query vector; q0 = original query vector; α,β,γ: weights (hand-chosen or set empirically); Dr = set of known relevant doc vectors; Dnr = set of known irrelevant doc vectors
  • 24. notes on feedback  If  is set = 0, we have a positive feedback system; often preferred  the feedback formula can be applied repeatedly, asking user for relevance information at each iteration  relevance feedback is generally considered to be very effective  one drawback is that it is not fully automatic.
  • 25. Simple feedback example: T = {pudding, jam, traffic, lane, treacle} d1 = (0.8, 0.8, 0.0, 0.0, 0.4), d2 = (0.0, 0.0, 0.9, 0.8, 0.0), d3 = (0.8, 0.0, 0.0, 0.0, 0.8) d4 = (0.6, 0.9, 0.5, 0.6, 0.0) Recipe for jam pudding DoT report on traffic lanes Radio item on traffic jam in Pudding Lane Recipe for treacle pudding Display first 2 documents that match the following query: q0 = (1.0, 0.6, 0.0, 0.0, 0.0) r = (0.91, 0.0, 0.6, 0.73) Retrieved documents are: d1 : Recipe for jam pudding d4 : Radio item on traffic jam relevant not relevant
  • 26. Suppose we set  and  to 0.5,  to 0.2 qm = q0  di / | HR |  di / | HNR| = 0.5 q0 + 0.5 d1  0.2 d4 = 0.5  (1.0, 0.6, 0.0, 0.0, 0.0) + 0.5  (0.8, 0.8, 0.0, 0.0, 0.4)  0.2  (0.6, 0.9, 0.5, 0.6, 0.0) = (0.78, 0.52,  0.1,  0.12, 0.2) di  HR di  HNR Positive and Negative Feedback
  • 27. Simple feedback example: T = {pudding, jam, traffic, lane, treacle} d1 = (0.8, 0.8, 0.0, 0.0, 0.4), d2 = (0.0, 0.0, 0.9, 0.8, 0.0), d3 = (0.8, 0.0, 0.0, 0.0, 0.8) d4 = (0.6, 0.9, 0.5, 0.6, 0.0) Display first 2 documents that match the following query: qm = (0.78, 0.52,  0.1,  0.12, 0.2) r’ = (0.96, 0.0, 0.86, 0.63) Retrieved documents are: d1 : Recipe for jam pudding d3 : Recipe for treacle pud relevant relevant
  • 28. Relevance Feedback: Cost  Speed and efficiency issues  System needs to spend time analyzing documents  Longer queries are usually slower  Users often reluctant to provide explicit feedback  It’s often harder to understand why a particular document was retrieved
  • 29. Koenemann and Belkin’s Work  Well-known study on relevance feedback in information retrieval  Questions asked:  Does relevance feedback improve results?  Is user control over relevance feedback helpful?  How do different levels of user control effect results? Jürgen Koenemann and Nicholas J. Belkin. (1996) A Case For Interaction: A Study of Interactive Information Retrieval Behavior and Effectiveness. Proceedings of SIGCHI 1996 Conference on Human Factors in Computing Systems (CHI 1996).
  • 30. What’s the best interface?  Opaque (black box)  User doesn’t get to see the relevance feedback process  Transparent  User shown relevance feedback terms, but isn’t allowed to modify query  Penetrable (可渗透的)  User shown relevance feedback terms and is allowed to modify the query Which do you think worked best?
  • 32. Penetrable Interface Users get to select which terms they want to add
  • 33. Study Details  Subjects started with a tutorial  64 novice searchers (43 female, 21 male)  Goal is to keep modifying the query until they’ve developed one that gets high precision
  • 34. Procedure  Baseline (Trial 1)  Subjects get tutorial on relevance feedback  Experimental condition (Trial 2)  Shown one of four modes: no relevance feedback, opaque, transparent, penetrable  Evaluation metric used: precision at 30 documents
  • 35. Relevance feedback works!  Subjects using the relevance feedback interfaces performed 17-34% better  Subjects in the penetrable condition performed 15% better than those in opaque and transparent conditions
  • 36. Behavior Results  Search times approximately equal  Precision increased in first few iterations  Penetrable interface required fewer iterations to arrive at final query  Queries with relevance feedback are much longer  But fewer terms with the penetrable interface  users were more selective about which terms to add
  • 37. Implicit Feedback  Users are often reluctant to provide relevance judgments  Some searches are precision-oriented  They’re lazy!  Can we gather feedback without requiring the user to do anything?  Idea: gather feedback from observed user behavior
  • 38. Observable Behavior Minimum Scope Segment Object Class Behavior Category Examine Retain Reference Annotate View Listen Select Print Bookmark Save Purchase Delete Subscribe Copy / paste Quote Forward Reply Link Cite Mark up Rate Publish Organize
  • 39. So far…  Explicit feedback: take advantage of user- supplied relevance judgments  Implicit feedback: observe user behavior and draw inferences  Can we perform feedback without having a user in the loop?
  • 40. Blind Relevance Feedback  Also called “pseudo(伪)relevance feedback”  Motivation: it’s difficult to elicit relevance judgments from users  Can we automate this process?  Idea: take top n documents, and simply assume that they are relevant  Perform relevance feedback as before  If the initial hit list is reasonable, system should pick up good query terms  Does it work?
  • 41. BRF Experiment  Retrieval engine: Indri  Test collection: TREC, topics 301-450  Procedure:  Used topic description as query to generate initial hit list  Selected top 20 terms from top 20 hits using tf.idf  Added these terms to the original query
  • 42. Results MAP R-Precision No feedback 0.1591 0.2022 With feedback 0.1806 (+13.5%) 0.2222 (+9.9%) Blind relevance feedback doesn’t always help!
  • 43. The Complete Landscape  Explicit, implicit, blind feedback: it’s all about manipulating terms  Dimensions of query expansion  “Local” vs. “global”  User involvement vs. no user involvement
  • 44. Local vs. Global  “Local” methods  Only considers documents that have be retrieved by an initial query  Query specific  Computations must be performed on the fly  “Global” methods  Takes entire document collection into account  Does not depend on the query  Thesauri can be computed off-line (for faster access)
  • 45. User Involvement  Query expansion can be done automatically  New terms added without user intervention  Or it can place a user in the loop  System presents suggested terms  Must consider interface issues
  • 46. Global Methods  Controlled vocabulary  For example, MeSH terms  Manual thesaurus  For example, WordNet  Automatically derived thesaurus  For example, based on co-occurrence statistics
  • 48. Thesauri  A thesaurus may contain information about lexical semantic relations:  Synonyms: similar words e.g., violin → fiddle  Hypernyms: more general words e.g., violin → instrument  Hyponyms: more specific words e.g., violin → Stradivari  Meronyms: parts e.g., violin → strings
  • 49. Using Manual Thesauri  For each query term t, added synonyms and related words from thesaurus  feline → feline cat  Generally improves recall  Often hurts precision  “interest rate”  “interest rate fascinate evaluate”  Manual thesauri are expensive to produce and maintain
  • 50. Automatic Thesauri Generation  Attempt to generate a thesaurus automatically by analyzing the document collection  Two possible approaches  Co-occurrence statistics (co-occurring words are more likely to be similar)  Shallow analysis of grammatical relations  Entities that are grown, cooked, eaten, and digested are more likely to be food items.
  • 52. Automatic Thesauri: Discussion  Quality of associations is usually a problem  Term ambiguity may introduce irrelevant statistically correlated terms.  “Apple computer”  “Apple red fruit computer”  Problems:  False positives: Words deemed similar that are not  False negatives: Words deemed dissimilar that are similar  Since terms are highly correlated anyway, expansion may not retrieve many additional documents
  • 53. Key Points  Moving beyond the black box… interaction is key!  Different types of feedback:  Explicit (user does the work)  Implicit (system watches the user and guess)  Blind (don’t even involve the user)  Query expansion as a general mechanism