New and improved models that describe, predict and explain the search interaction of users.

Google: Prediction and assistance of the user with a much more conversational element

RBY: Spoken language to text queries

SR: The web is not going to last forever

SR: The battle between web search engines and web content providers – the future is not comfortable in that respect.

Engineering of content by providers is problematic to search engines.

Jarvelin (2011) also argued for the need to understand information systems through the development of formal models and testable theories that describe the interaction between users and systems.

It is a major research challenge because of all the complexities involved with users, their interactions with information and the systems that they employ.

A firm requires inputs (such as capital and labor)

A firm utilizes some form of technology to then transform the inputs into outputs.

Imagine if all these were relevant, and you got 1 point per relevant document, then you would essentially obtain A times Q gain.

However, the inclusion of α generally means you get less, as there is a trade-off between A and Q.
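A minimal sketch of this comparison, assuming the gain takes a Cobb-Douglas form, k · Q^α · A^(1−α). The function name `gain` and the values k = 1 and α = 0.5 are illustrative assumptions, not figures from the talk.

```python
def gain(queries: float, assessments: float, k: float = 1.0, alpha: float = 0.5) -> float:
    """Assumed Cobb-Douglas style gain: k * Q^alpha * A^(1 - alpha)."""
    return k * queries ** alpha * assessments ** (1.0 - alpha)


Q, A = 10, 10
upper_bound = Q * A      # every examined document relevant, 1 point each
modelled = gain(Q, A)    # gain under the assumed trade-off between Q and A

print(f"A times Q gain: {upper_bound}")
print(f"modelled gain:  {modelled:.2f}")
```

With k = 1 and at least one query and one assessment, the modelled gain never exceeds A times Q, which is the sense in which α "means you get less".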

But this doesn’t mean the model doesn’t have any explanatory power

Representative, but not necessarily wholly realistic

What strategy should a user employ to achieve their goal?

A user can choose from a range of information seeking strategies

The user’s time is an economic quantity

i.e. cost

The user pursues a particular strategy until the cost incurred exceeds the utility received.

At this point the user may choose another strategy,

or they stop.
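The stopping rule above can be sketched as a loop over the steps of one strategy. This is only an illustration: the function name, the (cost, utility) pairs, and the choice to compare cumulative totals rather than per-step values are all assumptions.

```python
from typing import Iterable, Tuple


def steps_before_stopping(steps: Iterable[Tuple[float, float]]) -> int:
    """Count the steps taken while cumulative cost does not exceed cumulative utility.

    Each step is a hypothetical (cost, utility) pair, e.g. time spent and gain obtained.
    """
    total_cost = 0.0
    total_utility = 0.0
    taken = 0
    for cost, utility in steps:
        total_cost += cost
        total_utility += utility
        if total_cost > total_utility:
            break  # cost incurred now exceeds utility received: switch strategy or stop
        taken += 1
    return taken


# Hypothetical strategy whose returns diminish with each step.
strategy = [(1.0, 3.0), (1.0, 1.0), (1.0, 0.5), (1.0, 0.0), (1.0, 0.0)]
print(steps_before_stopping(strategy))
```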

On the structured interface, the user would have to go to the search box, type in the first query term, go to the next query box, type in a term, and so on, then click the search button and wait for the response.

On the standard interface, the user would have to:

Go to the search box, type in the three query terms, hit enter (activating the search) and then wait for the response.

On the suggestion interface, the first query is the same, but for subsequent queries the user could simply click on a query suggestion.

As opposed to measures it.

These models decouple the cost and benefit and parameterize them on the interactions. This means we can tease out how the interactions are functionally related to each other.

- 1. Leif Azzopardi Modeling Interaction with Economic Models of Search
- 2. Interactive Information Retrieval needs formal models to: • describe, predict and explain information behaviors, • provide a basis on which to reason about interaction, • understand the relationships between interaction, performance and cost, • help guide the design, development and research of information systems, and • derive laws and principles of interaction, e.g. Law of Least Effort in Finding • Major Research Challenge: Belkin (2008), Jarvelin (2011)
- 3. INITIAL ECONOMIC MODEL OF SEARCH
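- 4. "All models are wrong but some are useful." George E.P. Box
- 5. Production Theory Applied to Searching [Diagram: the firm utilizes a technology, which constrains how inputs (capital, labour) become output (widgets). Applied to search: the inputs are queries and assessments, the technology is the search engine, and the output is relevance gain.]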
- 6. Gain Function for the Search Process Let the gain the user receives through their interaction be: $g(Q, A) = k \cdot Q^{\alpha} \cdot A^{1-\alpha}$ Where: • Q is the number of queries, • A is the number of documents examined per query, • α is the relative efficiency of querying to assessing, and • k is the efficiency of the technology/user to extract/identify relevant information returned. Azzopardi (2011)
- 7. Gain Curve Each point on the curve represents a combination of interactions that will yield the same gain.
- 8. Cost Function for the Search Process The total cost can be calculated by: $c(Q, A) = c_q \cdot Q + c_a \cdot A \cdot Q$ Where: • cq is the cost of a query • ca is the cost of assessing a document • A.Q is the total number of documents assessed. Azzopardi (2011)
- 9. Modeling Caveats: Abstracted, Simplified, Representative
- 10. Few Queries, Lots of Assessments? Lots of Queries, Few Assessments? Or some other way? What strategies can the user employ when interacting with the search system to achieve their end goal? What course of action minimizes the cost of interaction?
- 11. Cost Curve The total cost is minimized when A = 10, which corresponds to Q = 18. Any other combination will result in a higher total cost.
- 12. How does behavior change when query cost increases?
- 13. Query Cost vs Interaction [Figure: number of actions (Q and A) plotted against query cost.] Query-Cost-Interaction Hypothesis: as the cost of querying increases, more documents will be examined per query, and fewer queries will be issued. Azzopardi, Kelly & Brennan (2013)
- 14. Testing the Query Cost Hypothesis (Azzopardi, Kelly & Brennan, 2013) • Structured (High Cost): Q = 19, A = 5 • Standard (Medium Cost): Q = 35, A = 1.6 • Suggestion (Low Cost): Q = 31, A = 2.5 • Structured vs Standard and Suggestion: YES • Standard vs Suggestion: NO • Model does not account for the time spent on the search result page nor the interaction with snippets.
- 15. Limitations • Assumes users are rational • Assumes interaction is fixed • Model of interface too simplified: the search process is more than just querying and assessing – There are lots of other costs involved when searching – There are lots of other interactions that can be performed too
- 16. A NEW ECONOMIC MODEL OF SEARCH
- 17. Modeling Other Costs • Cost to enter a query (cq) • Cost to load search page per query • Cost to examine each snippet • Cost to view a document • Cost of return back to search page • Cost to assess the document (ca) • Cost to view next page
- 18. Modeling Other Costs • Let’s also include the: – cost of viewing pages (cv) and – cost of examining snippets (cs) in the cost model, such that:
- 19. Assumptions • Let’s assume that the number of page views is equal to some constant v – Typically this would be v=1 – But could be the average number of pages examined i.e. v=1.1 • Let’s further assume that A = S.pa – Where pa is the probability of assessing a document given a snippet.
- 20. Reducing the Cost Function • Given these assumptions, the cost function can be simplified down to the following:
- 21. New Gain Function • Previously, Q and A were linked via α and 1-α, • Here we decouple this relationship – which enables us to estimate the parameters – and so the model becomes more intuitive
- 22. Optimization Problem • Given our model, we wish to minimize the cost c(Q,A), subject to the constraint that g(Q,A) = g • To do this we used a Lagrangian multiplier
- 23. Optimal Interaction The optimal number of assessments per query: The optimal number of queries:
- 24. How does querying behavior change? • So we can say more precisely that: – If g increases then Q will go up – If k increases then Q will go down – If β increases, then Q will go down – If α increases, then Q will go up
- 25. Some Cost Hypotheses • Document Cost Hypothesis: as the cost of a document increases, Q increases, while A decreases. • Snippet Cost Hypothesis: as the cost of examining snippets increases, A decreases, while Q increases.
- 26. Performance Hypotheses • Beta-Performance Hypothesis: as β increases, A will increase, while Q will decrease.
- 27. Assessment Probability Hypothesis • Assessment Probability Hypothesis: as the probability of assessment increases, A increases, while Q decreases.
- 28. ACTUAL VERSUS OBSERVED
- 29. Analysis of Empirical Data • Re-examined the experimental data from Azzopardi, Kelly & Brennan (2013). • Where we considered the different interactions over topics for each condition • And tested seven of the hypotheses that we generated
- 30. Beta Interaction Hypothesis • Hypothesis states as β increases, Q will decrease and A will increase. • Observations tend to match theory
- 31. Assessment Probability Hypothesis • Hypothesis states that as pa increases, Q decreases, A increases • Observations match theory • Similar finding for snippet cost hypothesis
- 32. Document Cost Hypothesis • Hypothesis suggests that as Cd increases, Q should increase, while A should decrease • But clearly this is not the case.
- 33. An explanation • Cs / pa dominates Ca, so when considered together, the result matched our expectation • i.e. Cs / pa is bigger than Ca, and thus has a greater influence on the results.
- 34. An explanation • As increases, then Q should increase, while A should decrease. • Considering all three variables we see that this tends to hold in practice.
- 35. Summary of Empirical Findings • The new model generally fits the empirical data – When there were deviations, we could explain these through other variables having a greater influence on the interaction • β tends to dominate interaction – i.e. low β leads to fewer documents being assessed per query – cs and pa also play a major role in shaping interaction
- 36. Summary • This new model provides a better description of the search process • By framing IR tasks as economic optimization problems we can derive testable hypotheses! • These models provide the functional relationships between interaction, performance, and cost and how they affect information behaviors.
- 37. Open Questions • How well does the theory match up to practice? • How do these economic models relate to Information Foraging Theory or the Interactive Probability Ranking Principle? • What happens when users are not rational? • What other insights can we obtain when applying this approach to other IR tasks?
- 38. "In theory, theory and practice are the same. In practice, they are not." Albert Einstein
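The cost-curve idea from the initial model (slides 10 to 13) can be sketched numerically: fix a target gain, walk along the iso-gain curve, and pick the cheapest combination of Q and A. The sketch assumes the gain form k · Q^α · A^(1−α) and the cost form cq · Q + ca · A · Q. All function names and parameter values are hypothetical and are not the ones behind the figures in the talk.

```python
def queries_needed(target_gain, assessments, k, alpha):
    """Solve k * Q^alpha * A^(1 - alpha) = target_gain for Q (assumed gain form)."""
    return (target_gain / (k * assessments ** (1.0 - alpha))) ** (1.0 / alpha)


def total_cost(queries, assessments, cost_query, cost_assess):
    """c(Q, A) = cq * Q + ca * A * Q, with A * Q documents assessed in total."""
    return cost_query * queries + cost_assess * assessments * queries


def cheapest_strategy(target_gain, k, alpha, cost_query, cost_assess, max_assessments=50):
    """Scan A = 1..max_assessments along the iso-gain curve; return the cheapest (A, Q, cost)."""
    best = None
    for a in range(1, max_assessments + 1):
        q = queries_needed(target_gain, a, k, alpha)
        c = total_cost(q, a, cost_query, cost_assess)
        if best is None or c < best[2]:
            best = (a, q, c)
    return best


# Hypothetical parameter values, chosen only for illustration.
a, q, c = cheapest_strategy(target_gain=20.0, k=1.0, alpha=0.6, cost_query=20.0, cost_assess=5.0)
print(f"A = {a}, Q = {q:.1f}, cost = {c:.1f}")

# Doubling the query cost, to see the direction of the query-cost hypothesis.
a2, q2, c2 = cheapest_strategy(target_gain=20.0, k=1.0, alpha=0.6, cost_query=40.0, cost_assess=5.0)
print(f"A = {a2}, Q = {q2:.1f}, cost = {c2:.1f}")
```

Under these assumed forms, raising the query cost moves the cheapest point toward more assessments per query and fewer queries, which is the direction the Query-Cost-Interaction Hypothesis states. A grid scan is used here instead of the Lagrangian approach mentioned on slide 22 simply to keep the sketch short. Note that with α at or below 0.5 the assumed cost keeps falling as A grows, so the scan would just return `max_assessments`.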
