Exploring session search



Slides from my presentation at the ECIR 2012 workshop on "Information Retrieval Over Query Sessions" (SIR2012) held in Barcelona, Spain.

Title: Exploring Session Search


Exploratory search is typically characterized by recall-oriented information needs and by uncertainty and evolution of the information need. As searchers interact with the system, their understanding of the topic evolves in response to found information. These two characteristics – uncertainty of information need and the desire to find multiple documents – drive the need to run multiple queries. Furthermore, these queries are not independent of each other because they often retrieve overlapping sets of documents. Yet traditional information retrieval systems often treat searchers’ queries in isolation, ignoring the evolution of a person’s understanding of the information need and the historical coupling among queries.

In this talk, I will describe some interface ideas we're exploring to help people incorporate their search history into their ongoing retrieval and sense-making tasks, and I will touch on some issues related to retrieval algorithms and evaluation.


  1. Exploring Session Search
     Gene Golovchinsky, FX Palo Alto Laboratory, Inc.
     @HCIR_GeneG
  2. Thanks to: Jeremy Pickens, Abdigani Diriye, Tony Dunnigan
  3. Exploratory search
     – Interactive information seeking
     – Anomalous state of knowledge
     – Evolving information need
     – Often recall-oriented
  4. One Query to Rule Them All
     – No single query satisfies a typical exploratory search information need
     – Search strategies involve many queries
     – Queries return overlapping results
  5. Why we’re here
     1. How do we know what’s a session?
     2. How do we help people deal with this complex task?
     3. How do we evaluate systems and algorithms?
  7. Explicit vs. implicit sessions
     Explicit sessions:
     1. We ask the person
     2. We infer it from structural aspects of the search context
     – Task context may provide strong organizing cues
     – For example, genealogical searches are often tied to a person in a family tree
     What about implicit sessions?
  8. Implicit session detection is based on implicit assumptions
     How do we detect a session?
     – Time heuristics
     – Client connection heuristics
     – Query similarity heuristics
     What are we assuming?
     – Person works continuously
     – Person does not switch tasks
     – Enough overlap in queries
     How good are these assumptions?
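The slide's time and query-similarity heuristics can be sketched in a few lines. This is a hypothetical illustration, not anything from the talk: the 30-minute timeout and the 0.2 term-overlap threshold are arbitrary illustrative values.

```python
def jaccard(a, b):
    """Term-overlap similarity between two query strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def segment_sessions(log, max_gap_s=30 * 60, min_sim=0.2):
    """Split a chronological query log [(timestamp_s, query), ...]
    into implicit sessions. A new session starts only when the time
    gap is too large AND the query shares too few terms with the
    previous one -- exactly the assumptions the slide questions."""
    sessions = []
    for ts, query in log:
        if sessions:
            prev_ts, prev_q = sessions[-1][-1]
            if ts - prev_ts <= max_gap_s or jaccard(query, prev_q) >= min_sim:
                sessions[-1].append((ts, query))
                continue
        sessions.append([(ts, query)])
    return sessions
```

Note how brittle this is: a user who switches tasks without pausing, or who reformulates with entirely new vocabulary, defeats both heuristics at once.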
  9. Tradeoffs
     Implicit sessions
     – Pros: no explicit user input required
     – Cons: effectiveness relies on precision-oriented information needs and inter-query similarity, i.e., on redundancy; more difficult to connect recurring or ongoing instances of the same information need
     Explicit sessions
     – Pros: accurate; needed for collaboration; durable over time
     – Cons: requires manual input in some cases
  10. Dealing with redundancy
  11. Strategies
     – Ignore it: the traditional approach
     – Manage redundancy in the UI: Ancestry.com, Querium
     – Increase diversity through scoring: some algorithmic evaluation, but are such interactive systems deployed?
  12. COPING WITH REDUNDANCY (manage redundancy in the UI)
  13. Some UI examples
     – Google: +1, but no session awareness and no good persistent visual feedback
     – Bing: visible query history, but no help with documents
     – Ancestry.com: flags previously saved records for the current person
     – Querium user interface: variety of document- and query-centric displays
  14. Ancestry.com: Query overlap
     How can we help people make sense of search results?
     – What’s new?
     – What’s redundant?
     – What’s useful?
     – What’s not useful?
  15. Querium: Filtering by process metadata
     History of interaction during a search can be projected onto current results
  16. Querium: Visualizing re-retrieval
     Document-centered retrieval history can be projected onto each search result
     – Indicates “important” documents
     – Indicates new documents
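A minimal sketch of what "projecting document-centered history onto each result" might look like. All names here are hypothetical; Querium's actual data model is not shown in the talk.

```python
from collections import defaultdict

class SessionHistory:
    """Hypothetical per-session record of what was retrieved and viewed."""

    def __init__(self):
        self.retrieved = defaultdict(int)  # doc id -> times retrieved so far
        self.viewed = set()                # doc ids the user has opened

    def record_results(self, doc_ids):
        for d in doc_ids:
            self.retrieved[d] += 1

    def record_view(self, doc_id):
        self.viewed.add(doc_id)

    def annotate(self, doc_ids):
        """Tag each result in a new list as seen, re-retrieved, or new,
        along with its prior retrieval count -- the kind of per-document
        badge the slide describes."""
        out = []
        for d in doc_ids:
            if d in self.viewed:
                tag = "seen"
            elif self.retrieved[d] > 0:
                tag = "re-retrieved"
            else:
                tag = "new"
            out.append((d, tag, self.retrieved[d]))
        return out
```

A repeatedly re-retrieved but never-opened document is a good candidate for an "important" badge; a zero-count document can be flagged as new.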
  17. Querium: Query-centric view
  18. Querium: Query-centric view
  19. Query-centric view
  20. PREVENTING REDUNDANCY (increasing diversity)
  21. Some (cor)related metrics
     – Novelty, diversity, redundancy
     – Precision, recall
     The exact relationship is hard to pin down
  22. Increasing {diversity} with scoring
     Pros
     – Can incorporate prior explicit and implicit relevance assessments
     – More focused queries may retrieve more pertinent documents at a given cutoff
     Cons
     – Relies on accurate assessment of relevance
     – No way to recover “organic” results, so hard for people to understand the effect of personalization
     (Diagram: query → black-box ranking informed by session state → displayed ranking → user feedback → stop)
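One concrete way to "incorporate prior explicit relevance assessments" into scoring is to fold the session's judged-relevant documents into the query as term weights, in the spirit of a simplified Rocchio update. This is an illustrative sketch, not the talk's method; the alpha and beta values are arbitrary.

```python
from collections import Counter

def expand_query(query_terms, relevant_docs, alpha=1.0, beta=0.5):
    """Return a weighted query vector biased toward terms that appeared
    in documents the user already marked relevant this session.
    relevant_docs: a list of token lists, one per judged-relevant doc."""
    weights = Counter({t: alpha for t in query_terms})
    if relevant_docs:
        centroid = Counter()
        for doc_terms in relevant_docs:
            centroid.update(doc_terms)
        # Pull the query toward the centroid of the relevant documents.
        for term, count in centroid.items():
            weights[term] += beta * count / len(relevant_docs)
    return dict(weights)
```

The slide's con is visible here: once the weights are folded in, the "organic" (unweighted) ranking is gone, so the user cannot easily see what personalization changed.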
  23. Increasing {diversity} with post-processing
     Pros
     – Can recover “organic” results
     – Supports feedback on incorrect inference (if user selects a demoted doc)
     – Accommodates shifting info needs better
     – Can be applied interactively
     Cons
     – Limited document set
     (Diagram: query → rank docs → “organic” ranking → re-rank using session state → displayed ranking → user feedback → stop)
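The post-processing strategy can be sketched as a stable re-rank over the organic list: documents the session state says the user has already viewed get a penalty, but nothing is removed, so the organic ranking is recoverable and a click on a demoted document remains possible as feedback. The penalty value is an illustrative assumption.

```python
def rerank(organic, viewed, demote_by=10):
    """Stable re-rank of an organic result list: push already-viewed
    docs down by a fixed penalty while preserving organic order among
    documents with equal penalized rank."""
    scored = [(rank + (demote_by if doc in viewed else 0), rank, doc)
              for rank, doc in enumerate(organic)]
    scored.sort()  # sorts by penalized rank, then original rank (stable)
    return [doc for _, _, doc in scored]
```

Because the re-ranker only permutes the retrieved list, it inherits the slide's con: it cannot surface documents the original query never retrieved.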
  24. EVALUATION (a holistic approach)
  25. Vague generalities
     Session-based search must be evaluated as a human-machine system
     – Hard to account for real human behavior through simulations only
     Recall and precision do not tell the whole story
     – Exploratory search is inherently a learning process
     – Effort, knowledge gain, frustration, and serendipity are important
     Look at patterns of interaction that led to discovery
     – Hard to evaluate the marginal contribution of each query due to negative results, learning, and information need drift
  26. Some thoughts on evaluating algorithms
     Small gains in retrieval effectiveness will be swamped by interaction, good or bad
     – Small statistically-significant effects are meaningless in practice
     Evaluation “in the wild” relies on users for ground truth
     – Use post-hoc analysis to test how well algorithms predicted users’ choices
     Look at the system’s ability to help people recognize useful documents
     – How many times was a document retrieved before it was seen?
     – This works for lab and naturalistic studies
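The slide's suggested measure ("how many times was a document retrieved before it was seen?") is straightforward to compute from an interaction log. A sketch under an assumed log format of ("retrieved" | "viewed", doc_id) events in chronological order:

```python
def retrievals_before_first_view(log):
    """For each document the user eventually opened, count how many
    times it had appeared in a result list before that first view.
    High counts suggest the system kept retrieving a useful document
    that the interface failed to make recognizable."""
    retrieved = {}  # doc id -> retrieval count so far
    result = {}     # doc id -> count at time of first view
    for event, doc in log:
        if event == "retrieved":
            retrieved[doc] = retrieved.get(doc, 0) + 1
        elif event == "viewed" and doc not in result:
            result[doc] = retrieved.get(doc, 0)
    return result
```

The same computation applies to a controlled lab log or to naturalistic usage data, which is the point the slide makes.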
  27. In closing…
     – Information needs evolve
     – Queries are approximations
     – Knowledge is uncertain
     Design challenge: help people plan future actions by understanding the present in the context of the past
  28. While I have your attention…
     There is a pending proposal to create a StackExchange site for information retrieval. Think of it as Stack Overflow for IR geeks. We need more people to vote and promote.
     http://area51.stackexchange.com/proposals/39142/information-retrieval-and-search-engines
  29. Do I still have your attention?
     – IIiX 2012: August 21-24, 2012, Nijmegen, The Netherlands; deadline for papers April 9, 2012
     – EuroHCIR 2012: same place, August 25; deadline for papers June 22, 2012
     – HCIR 2012 (The 6th Symposium on Human Computer Information Retrieval): October 4-5, 2012, Boston, Massachusetts, USA; submission deadline mid-summer; will publish works in progress and archival, full-length papers
  30. Image credits
     http://www.flickr.com/photos/torremountain/6831414535/
     http://www.flickr.com/photos/bigtallguy/233176326/
     http://www.flickr.com/photos/77074420@N00/198347900/
     http://www.flickr.com/photos/racatumba/93569705/
     http://www.flickr.com/photos/chrisolson/3595815374/
     http://www.flickr.com/photos/brymo/2813028454/
     http://www.flickr.com/photos/computix/108732248/
     http://www.flickr.com/photos/funadium/913303959/
     http://www.flickr.com/photos/moriza/189890016/
     http://www.flickr.com/photos/uhdigital/6802789537/
  31. Hiding unwanted results
  32. Hiding unwanted results