Leif Azzopardi
Modeling Interaction with Economic
Models of Search
Interactive Information Retrieval needs formal models to:
• describe, predict and explain information behaviors
• provide a basis on which to reason about interaction,
• understand the relationships between interaction,
performance and cost,
• help guide the design, development and research of
information systems, and
• derive laws and principles of interaction
• e.g. Law of Least Effort In Finding
Major Research Challenge
Belkin (2008)
Jarvelin (2011)
INITIAL ECONOMIC MODEL OF SEARCH
All models are wrong
but some are useful
George E.P. Box
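Production Theory
Applied to Searching
[Diagram: a firm utilizes technology, which constrains how inputs (capital and labour) are turned into output (widgets). Applied to searching: the inputs are queries and assessments, the technology is the search engine, and the output is relevance gain.]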
Gain Function for the Search Process
Let the gain the user receives through their
interaction be:
$$g(Q, A) = k \, Q^{\alpha} A^{1-\alpha}$$
Where:
– Q is the number of queries,
– A is the number of documents examined per query,
– α is the relative efficiency of querying to assessing, and
– k is the efficiency of the technology/user in extracting or identifying the relevant information returned
Azzopardi (2011)
Gain Curve
Each point on the curve represents a combination
of interactions that will yield the same gain.
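A minimal Python sketch of such a gain curve, assuming the gain takes the form g(Q, A) = k·Q^α·A^(1−α) suggested by the parameters listed for the gain function. The values of `k`, `alpha` and `target_gain` are hypothetical, chosen only for illustration.

```python
def gain(q, a, k=1.0, alpha=0.6):
    """Assumed gain g(Q, A) = k * Q**alpha * A**(1 - alpha)."""
    return k * q**alpha * a ** (1 - alpha)


def queries_for_gain(g, a, k=1.0, alpha=0.6):
    """Queries needed to reach gain g when examining a documents per query."""
    return (g / (k * a ** (1 - alpha))) ** (1 / alpha)


target_gain = 20.0  # hypothetical target, not a value from the slides
curve = [(a, queries_for_gain(target_gain, a)) for a in (1, 2, 5, 10, 20)]
for a, q in curve:
    # Every (Q, A) pair printed here yields the same gain.
    print(f"A = {a:2d}  Q = {q:7.2f}  gain = {gain(q, a):.2f}")
```

Under these assumptions, examining more documents per query lowers the number of queries needed for the same gain, which is the trade-off the curve depicts.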
Cost Function for the Search Process
The total cost can be calculated by:
$$c(Q, A) = c_q \, Q + c_a \, A \, Q$$
Where:
– cq is the cost of a query
– ca is the cost of assessing a document
– A.Q is the total number of documents assessed
Azzopardi (2011)
Modeling Caveats
Abstracted
Simplified
Representative
Few queries, lots of assessments?
Lots of queries, few assessments?
Or some other way?
What strategies can the user employ when interacting with the search system to achieve their end goal?
What course of action minimizes the cost of interaction?
Cost Curve
The total cost is minimized when A = 10, which
corresponds to Q = 18. Any other combination will
result in a higher total cost.
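A rough sketch of how such a minimum could be located, assuming a cost of cq·Q + ca·A·Q and a gain of k·Q^α·A^(1−α). All parameter values below are hypothetical and are not the ones behind the slide's figure, so the optimum found here differs from A = 10, Q = 18.

```python
def queries_for_gain(g, a, k=1.0, alpha=0.6):
    """Queries needed to reach gain g at a documents per query (assumed form)."""
    return (g / (k * a ** (1 - alpha))) ** (1 / alpha)


def total_cost(q, a, cq=8.0, ca=2.0):
    """Assumed cost: cq per query plus ca per assessed document."""
    return cq * q + ca * a * q


def cheapest_strategy(g, a_values):
    """Return (cost, A, Q) for the cheapest point on the iso-gain curve."""
    candidates = []
    for a in a_values:
        q = queries_for_gain(g, a)
        candidates.append((total_cost(q, a), a, q))
    return min(candidates)


best_cost, best_a, best_q = cheapest_strategy(20.0, range(1, 51))
print(f"cheapest: A = {best_a}, Q = {best_q:.1f}, cost = {best_cost:.1f}")
```

The search only walks along combinations that deliver the same gain, so any difference in cost comes purely from how the work is split between querying and assessing.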
How does behavior change
when query cost increases?
Query Cost vs Interaction
[Figure: number of actions (0 to 120) plotted against query cost (0 to 10), with one curve for Q and one for A.]
Azzopardi, Kelly & Brennan (2013)
Query-Cost-Interaction Hypothesis:
as the cost of querying increases,
more documents will be examined per query,
and fewer queries will be issued
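The hypothesis can be illustrated with a small sweep over the query cost. This is a sketch under assumptions: gain k·Q^α·A^(1−α), cost cq·Q + ca·A·Q, and a closed form for the cost-minimizing A obtained by setting the derivative of the cost along the iso-gain curve to zero. That closed form is derived here rather than taken from the slides, it only holds for α > 0.5, and the parameter values are hypothetical.

```python
def optimal_interaction(g, cq, ca=2.0, k=1.0, alpha=0.6):
    """Cost-minimizing (Q, A) under the assumed gain and cost forms."""
    if alpha <= 0.5:
        raise ValueError("an interior optimum requires alpha > 0.5")
    a = (1 - alpha) * cq / ((2 * alpha - 1) * ca)
    q = (g / (k * a ** (1 - alpha))) ** (1 / alpha)
    return q, a


# Sweep the (hypothetical) query cost and record the optimal Q and A.
sweep = [(cq, *optimal_interaction(20.0, cq)) for cq in (2, 4, 6, 8, 10)]
for cq, q, a in sweep:
    print(f"cq = {cq:2d}  Q = {q:6.1f}  A = {a:5.1f}")
```

With these settings, A grows and Q shrinks as cq rises, matching the direction the hypothesis predicts.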
Testing the Query Cost Hypothesis
Azzopardi, Kelly & Brennan (2013)
Structured interface (high query cost): Q = 19, A = 5
Standard interface (medium query cost): Q = 35, A = 1.6
Suggestion interface (low query cost): Q = 31, A = 2.5
Hypothesis supported?
Structured vs Standard and Suggestion: YES
Standard vs Suggestion: NO
The model does not account for the time spent on the search
result page or the interaction with snippets.
Limitations
• Assumes users are rational
• Assumes interaction is fixed
• The model of the interface is too simplified: the search
process is more than just querying and
assessing
– There are lots of other costs involved when
searching
– There are lots of other interactions that can be
performed too
A NEW
ECONOMIC MODEL OF SEARCH
Modeling Other Costs
Cost to enter a query (cq)
Cost to load search page per query
Cost to examine each snippet
Cost to view a document
Cost to return to the search page
Cost to assess the document (ca)
Cost to view next page
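Modeling Other Costs
• Let’s also include the:
– cost of viewing pages (cv) and
– cost of examining snippets (cs)
in the cost model, such that:
$$c(Q, V, S, A) = c_q \, Q + c_v \, V \, Q + c_s \, S \, Q + c_a \, A \, Q$$
where V is the number of result pages viewed per query and S is the number of snippets examined per query.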
Assumptions
• Let’s assume that the number of page views is
equal to some constant v
– Typically this would be v=1
– But it could be the average number of pages
examined, e.g. v=1.1
• Let’s further assume that A = S.pa
– Where pa is the probability of assessing a
document given a snippet.
Reducing the Cost Function
• Given these assumptions, the cost function
can be simplified down to the following:
$$c(Q, A) = (c_q + c_v \, v) \, Q + \left(\frac{c_s}{p_a} + c_a\right) A \, Q$$
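A quick numerical check of the simplification, as a sketch. It assumes the full cost is cq·Q + cv·V·Q + cs·S·Q + ca·A·Q and applies the two stated assumptions, V = v and A = S·pa. The function names and all values are hypothetical.

```python
import math


def full_cost(q, v, s, a, cq, cv, cs, ca):
    """Cost with explicit page views (v) and snippets examined (s) per query."""
    return q * (cq + cv * v + cs * s + ca * a)


def reduced_cost(q, a, cq, cv, cs, ca, v=1.0, pa=0.5):
    """Same cost after substituting V = v and S = A / pa."""
    return (cq + cv * v) * q + (cs / pa + ca) * a * q


costs = dict(cq=10.0, cv=2.0, cs=1.0, ca=4.0)  # hypothetical costs
q, a, v, pa = 5, 4, 1.0, 0.5
s = a / pa  # snippets per query implied by A = S * pa

print(full_cost(q, v, s, a, **costs), reduced_cost(q, a, v=v, pa=pa, **costs))
```

Both expressions give the same total, so the reduced form depends only on Q and A once v and pa are fixed.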
New Gain Function
• Previously, Q and A were linked via α and 1-α,
• Here we decouple this relationship by giving A its own exponent β:
$$g(Q, A) = k \, Q^{\alpha} A^{\beta}$$
– which enables us to estimate the parameters
– and makes the model more intuitive
Optimization Problem
• Given our model, we wish to minimize the
cost c(Q,A), subject to the constraint that
g(Q,A) = g
• To do this we used a Lagrangian multiplier
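A sketch of how the Lagrangian step might go, assuming the gain has the form g(Q, A) = k·Q^α·A^β and the cost has the reduced form c(Q, A) = (cq + cv·v)·Q + (cs/pa + ca)·A·Q. Both forms are reconstructions from the surrounding slides rather than quotations.

```latex
\begin{aligned}
\mathcal{L}(Q, A, \lambda) &= (c_q + c_v v)\,Q + \left(\tfrac{c_s}{p_a} + c_a\right) A\,Q
  - \lambda \left(k\,Q^{\alpha} A^{\beta} - g\right) \\[4pt]
\frac{\partial \mathcal{L}}{\partial Q} = 0 \;&\Rightarrow\;
  (c_q + c_v v) + \left(\tfrac{c_s}{p_a} + c_a\right) A = \lambda\,k\,\alpha\,Q^{\alpha-1} A^{\beta} \\[4pt]
\frac{\partial \mathcal{L}}{\partial A} = 0 \;&\Rightarrow\;
  \left(\tfrac{c_s}{p_a} + c_a\right) Q = \lambda\,k\,\beta\,Q^{\alpha} A^{\beta-1} \\[4pt]
\text{dividing the two conditions} \;&\Rightarrow\;
  \beta \left[(c_q + c_v v) + \left(\tfrac{c_s}{p_a} + c_a\right) A\right]
  = \alpha \left(\tfrac{c_s}{p_a} + c_a\right) A
\end{aligned}
```

The last line is linear in A, so it can be solved directly for the optimal number of assessments per query (a positive solution needs α > β). Substituting that value back into the constraint g(Q, A) = g then gives the optimal number of queries.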
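Optimal Interaction
The optimal number of assessments per query:
$$A^{*} = \frac{\beta \, (c_q + c_v \, v)}{(\alpha - \beta) \left(\frac{c_s}{p_a} + c_a\right)}$$
The optimal number of queries:
$$Q^{*} = \left(\frac{g}{k \, (A^{*})^{\beta}}\right)^{1/\alpha}$$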
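A small Python sketch of the optimum, under assumptions: gain k·Q^α·A^β, cost (cq + cv·v)·Q + (cs/pa + ca)·A·Q, and the closed form A* = β(cq + cv·v) / ((α − β)(cs/pa + ca)), Q* = (g / (k·A*^β))^(1/α) that follows from them when α > β. The parameter values are hypothetical. The script also compares the cost at A* with nearby points on the same gain curve.

```python
PARAMS = dict(g=30.0, k=2.0, alpha=0.6, beta=0.3,
              cq=10.0, cv=2.0, cs=1.0, ca=4.0, v=1.0, pa=0.5)  # hypothetical


def queries_for_gain(a, g, k, alpha, beta, **_):
    """Queries needed to reach gain g at a assessments per query."""
    return (g / (k * a**beta)) ** (1 / alpha)


def cost_on_gain_curve(a, **p):
    """Total cost at the point of the iso-gain curve with a assessments per query."""
    q = queries_for_gain(a, **p)
    return (p["cq"] + p["cv"] * p["v"]) * q + (p["cs"] / p["pa"] + p["ca"]) * a * q


def optimal_interaction(**p):
    """Closed-form (Q*, A*); only valid when alpha > beta."""
    if p["alpha"] <= p["beta"]:
        raise ValueError("closed form needs alpha > beta")
    a_star = (p["beta"] * (p["cq"] + p["cv"] * p["v"])
              / ((p["alpha"] - p["beta"]) * (p["cs"] / p["pa"] + p["ca"])))
    return queries_for_gain(a_star, **p), a_star


q_star, a_star = optimal_interaction(**PARAMS)
print(f"A* = {a_star:.2f}, Q* = {q_star:.2f}, cost = {cost_on_gain_curve(a_star, **PARAMS):.2f}")
for a in (0.9 * a_star, 1.1 * a_star):
    print(f"A = {a:.2f} costs {cost_on_gain_curve(a, **PARAMS):.2f}")
```

Moving A away from A* in either direction while holding the gain fixed raises the total cost, which is what a constrained minimum should look like.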
How does querying behavior change?
• So we can say more precisely that:
– If g increases then Q will go up
– If k increases then Q will go down
– If β increases, then Q will go down
– If α increases, then Q will go up
Some Cost Hypotheses
• Document Cost Hypothesis: as the cost of assessing a
document increases, Q increases, while A decreases.
• Snippet Cost Hypothesis: as the cost of
examining snippets increases, A decreases,
while Q increases.
Performance Hypotheses
• Beta-Performance Hypothesis:
as β increases, A will increase,
while Q will decrease.
Assessment Probability Hypothesis
• Assessment Probability Hypothesis:
as the probability of assessment increases,
A increases, while Q decreases.
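The direction of each hypothesis can be checked numerically. This is a sketch under assumptions: the optimum is taken to be A* = β(cq + cv·v) / ((α − β)(cs/pa + ca)) and Q* = (g / (k·A*^β))^(1/α), and the baseline values are hypothetical. The directions for the cost and probability changes follow from the formulas, while the direction of Q under a change in β can depend on the parameter values chosen.

```python
BASE = dict(g=30.0, k=2.0, alpha=0.6, beta=0.3,
            cq=10.0, cv=2.0, cs=1.0, ca=4.0, v=1.0, pa=0.5)  # hypothetical baseline


def optimal_interaction(g, k, alpha, beta, cq, cv, cs, ca, v, pa):
    """Assumed closed-form optimum (Q*, A*); requires alpha > beta."""
    a = beta * (cq + cv * v) / ((alpha - beta) * (cs / pa + ca))
    q = (g / (k * a**beta)) ** (1 / alpha)
    return q, a


def direction(param, new_value):
    """Report how Q* and A* move when one parameter is raised from its baseline."""
    q0, a0 = optimal_interaction(**BASE)
    q1, a1 = optimal_interaction(**{**BASE, param: new_value})
    return ("Q up" if q1 > q0 else "Q down", "A up" if a1 > a0 else "A down")


for label, param, value in [("document cost", "ca", 8.0),
                            ("snippet cost", "cs", 2.0),
                            ("beta", "beta", 0.4),
                            ("assessment probability", "pa", 1.0)]:
    print(f"{label:>22}: {direction(param, value)}")
```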
ACTUAL VERSUS OBSERVED
Analysis of Empirical Data
• Re-examined the experimental data from
Azzopardi, Kelly & Brennan (2013).
• Where we considered the different
interactions over topics for each condition
• And tested seven of the hypotheses that we
generated
Beta Interaction Hypothesis
• Hypothesis states as β increases,
Q will decrease and A will increase.
• Observations tend to match theory
Assessment Probability Hypothesis
• Hypothesis states that as pa increases, Q
decreases, A increases
• Observations match theory
• Similar finding for snippet cost hypothesis
Document Cost Hypothesis
• Hypothesis suggests that as Cd increases, Q
should increase, while A should decrease
• But clearly this is not the case.
An explanation
• Cs / pa dominates Ca, so when considered
together, the result matched our expectation
• i.e. Cs / pa is bigger than Ca, and thus has a
greater influence on the results.
An explanation
• As cs / pa + ca increases, Q should
increase, while A should decrease.
• Considering all three variables, we see that this
tends to hold in practice.
Summary of Empirical Findings
• The new model generally fits the empirical
data
– When there were deviations, we could explain
these through other variables having a greater
influence on the interaction
• β tends to dominate interaction
– i.e. a low β leads to fewer documents being
assessed per query
– cs and pa also play a major role in shaping
interaction
Summary
• This new model provides a better description
of the search process
• By framing IR Tasks as economic optimization
problems we can derive testable hypotheses!
• These models provide the functional
relationships between interaction,
performance, and cost, and show how these affect
information behaviors.
Open Questions
• How well does the theory match up to
practice?
• How do these economic models relate to
Information Foraging Theory or the
Interactive Probability Ranking Principle?
• What happens when users are not rational?
• What other insights can we obtain when
applying this approach to other IR tasks?
In theory, theory and practice are
the same. In practice, they are not.
Albert Einstein


Editor's Notes

  1. I don’t want the search engine to think – I want the search engine to do free association. Steve Robertson. Google: Prediction and assistance of the user with a much more conversational element RBY: Spoken language to text queries SR: The web is not going to last forever SR: The battle between web search engines and web content providers – the future is not comfortable in that respect. Engineering of content by providers is problematic to search engines.
  2. Belkin (2008) outlined some of the challenges within IIR. Jarvelin (2011) also argued the need to understand information systems through the development of formal models and testable theories to describe the interaction between users and systems. It is a major research challenge because of all the complexities involved with users, their interactions with information and the systems that they employ.
  3. A firm produces output (such as goods or services). A firm requires inputs (such as capital and labor). A firm utilizes some form of technology to then transform the inputs into outputs.
  4. Let’s think about what this formula is saying. The amount of gain or overall performance is a combination of A times Q. Imagine if all these were relevant, and you got 1 point per relevant document, then you would essentially obtain A times Q gain. However, the inclusion of α generally means you get less, as there is a trade-off between A and Q.
  5. Yes, the model is abstracted and general. ….. Search has many more inputs and outputs, and lots more variables, but we have abstracted away these details. We have simplified the search process to two core variables that affect the output. But this doesn’t mean the model doesn’t have any explanatory power. It is representative, but not necessarily wholly realistic. What strategy should a user employ to achieve their goal?
  6. And if we map a cost function to the interactions then we can ask, “what is the most cost-efficient way for a user to interact with an IR system?” What strategy should a user employ to achieve their goal? A user can choose from a range of information seeking strategies. The user’s time is an economic quantity, i.e. cost. The user pursues a particular strategy until the cost incurred exceeds the utility received. At this point the user may choose another strategy, or they stop.
  7. So, for example, on the structured interface the user would have to go to the search box, type in the first query term, go to the next query box, type in a term, and so on, then click the search button and wait for the response. On the standard interface, the user would have to go to the search box, type in the three query terms, hit enter (activating the search) and then wait for the response. On the suggestion interface, the first query is the same, but for subsequent queries they could simply click on a query suggestion.
  8. There is a measure of the output of the system, as well. As opposed to measures it. These models decouple the cost and benefit and parameterize them on the interactions. This means we can tease out how the interactions are functionally related to each other.
  9. Though work by Turpin and Hersh or Cantor and Smith has shown that, on a degraded system, users adapted by posing fewer queries. This seems to fit with the reduction in the efficiency of the system (i.e. the k parameter).