Lessons from the Journey: A Query Log Analysis of Within-Session Learning (WSDM'14)

The Internet is the largest source of information in the world. Search engines help people navigate the huge space of available data in order to acquire new skills and knowledge. In this paper, we present an in-depth analysis of sessions in which people explicitly search for new knowledge on the Web based on the log files of a popular search engine. We investigate within-session and cross-session developments of expertise, focusing on how the language and search behavior of a user on a topic evolves over time. In this way, we identify those sessions and page visits that appear to significantly boost the learning process. Our experiments demonstrate a strong connection between clicks and several metrics related to expertise. Based on models of the user and their specific context, we present a method capable of automatically predicting, with good accuracy, which clicks will lead to enhanced learning. Our findings provide insight into how search engines might better help users learn as they search.

This work, co-authored with Jaime Teevan, Ryen White and Susan Dumais, has been accepted for full oral presentation at the 7th ACM International Conference on Web Search and Data Mining (WSDM 2014). The full version of the paper is available at: http://dl.acm.org/citation.cfm?id=2556195.2556217


  1. Lessons from the Journey: A Query-log Analysis of Within-session Learning. Carsten Eickhoff, Jaime Teevan, Ryen White, Susan Dumais
  2. Learning by Searching • Domain expertise seems to be generally useful for in-domain searches • Domain expertise can slowly change over time • Here, we measure this effect at a finer granularity
  3. Studying Expertise over Time (chart: expertise on the vertical axis, time on the horizontal axis)
  4. Explicit Learning Sessions • Learning happens all the time • We look at explicit knowledge acquisition sessions • Two types of informational needs: – Procedural: learn how to do something • E.g.: Ehow.com, YouTube tutorials, … – Declarative: learn about something • E.g.: Wikipedia.com, documentaries, …
  5. Finding Indicator Terms • Group sessions that end at Ehow vs. Wikipedia • Find query terms that occur more frequently in knowledge acquisition sessions (a term-selection sketch follows the transcript)
  6. Selecting Sessions • Based on a set of 26.7 million sessions • Select sessions that contain indicator terms in at least 50% of their queries, yielding the procedural set Dproc and the declarative set Ddecl (see the session-filtering sketch after the transcript)
  7.–9. Session Properties • Knowledge acquisition sessions are long, topically diverse and more exploratory
  10. Session Properties • Knowledge acquisition sessions are long, topically diverse and more exploratory • Extended sets are noisy and mimic the full collection
  11. Within-session Learning • General upward trend for domain count and query complexity • This trend is strongest for learning sessions (a trend-estimation sketch follows the transcript)
  12. Sustained Learning • What happens beyond the session boundary? • Domain expertise metrics are more likely to increase further after within-session learning
  13. Page Visits Spark Learning • We study the origin of new query terms • (Where) did added terms occur previously in the same session? (a term-origin sketch follows the transcript)
  14. The Effect of Page Visits • Condition P+, P= and P− of the expertise metrics (the probability of an increase, no change, or a decrease) on the click status of the previous SERP • Clicks more often result in metric increases • Click duration has no significant effect (a conditioning sketch follows the transcript)
  15. Summary • Introduced procedural/declarative needs • Noted evidence of within-session learning • Learning is sustained across the session boundary • Page visits seem to have a strong influence
  16. Future Directions • Ranking to Learn – Learning potential is spread evenly across the SERP – Predictors of learning potential may serve as ranking criteria • Qualitative study of query reformulation – Here, term presence is taken to imply causality – Better: study what the user really sees (e.g., via eye-gaze tracking)
  17. Thank You.
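
The sketches below illustrate a few of the analysis steps above. The indicator-term step on slide 5 can be read as a smoothed frequency-ratio test over query terms; the function and thresholds below (indicator_terms, min_ratio, min_count) are illustrative assumptions, since the slides only state that terms occurring more frequently in knowledge-acquisition sessions are kept, not the exact statistic used.

```python
from collections import Counter

def indicator_terms(learning_sessions, background_sessions,
                    min_ratio=1.2, min_count=2):
    """Return query terms over-represented in knowledge-acquisition sessions
    relative to a background sample. Sessions are lists of query strings;
    the ratio test and thresholds are illustrative assumptions."""
    def term_freqs(sessions):
        counts = Counter()
        for session in sessions:
            for query in session:
                counts.update(query.lower().split())
        return counts, sum(counts.values()) or 1

    learn_counts, learn_total = term_freqs(learning_sessions)
    back_counts, back_total = term_freqs(background_sessions)

    terms = {}
    for term, count in learn_counts.items():
        if count < min_count:
            continue
        p_learn = count / learn_total
        p_back = (back_counts[term] + 1) / (back_total + 1)  # add-one smoothing
        if p_learn / p_back >= min_ratio:
            terms[term] = p_learn / p_back
    return terms

# Toy usage: sessions ending at eHow/Wikipedia vs. a background sample.
learning = [["how to tie a tie", "tie a windsor knot tutorial"],
            ["what is photosynthesis", "photosynthesis explained"]]
background = [["facebook login"], ["weather seattle"], ["cheap flights"]]
print(indicator_terms(learning, background))  # over-represented terms, e.g. 'tie'
```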
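Slide 6's filtering rule (keep a session if at least 50% of its queries contain an indicator term) can be stated directly in code. The term sets and the select_sessions helper below are hypothetical; only the 50% threshold and the Dproc/Ddecl split come from the slides.

```python
def select_sessions(sessions, proc_terms, decl_terms, threshold=0.5):
    """Split sessions into procedural (Dproc) and declarative (Ddecl) sets.
    A session qualifies when at least `threshold` of its queries contain an
    indicator term of the respective type."""
    def indicator_share(session, terms):
        hits = sum(1 for q in session if terms & set(q.lower().split()))
        return hits / len(session) if session else 0.0

    d_proc = [s for s in sessions if indicator_share(s, proc_terms) >= threshold]
    d_decl = [s for s in sessions if indicator_share(s, decl_terms) >= threshold]
    return d_proc, d_decl

# Hypothetical indicator terms; the real lists come from the previous step.
proc_terms = {"how", "tutorial", "diy"}
decl_terms = {"what", "definition", "history"}
sessions = [["how to fix a faucet", "faucet repair tutorial"],
            ["what is entropy", "entropy definition physics"],
            ["pizza near me"]]
d_proc, d_decl = select_sessions(sessions, proc_terms, decl_terms)
print(len(d_proc), len(d_decl))  # 1 1
```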
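One way to quantify the within-session trend reported on slide 11 is to fit a least-squares slope to an expertise metric (for example, the cumulative number of distinct domains clicked) over query position. The slides do not prescribe this particular estimator; the dependency-free sketch below is an assumption.

```python
def trend_slope(values):
    """Least-squares slope of a metric over query position within a session.
    A positive slope means the metric tends to rise as the session progresses."""
    n = len(values)
    if n < 2:
        return 0.0
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Example: cumulative count of distinct domains clicked after each query.
domain_counts = [1, 1, 2, 3, 3, 4]
print(trend_slope(domain_counts))  # ~0.63: an upward within-session trend
```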
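Slide 13 asks whether, and where, newly added query terms occurred earlier in the same session; this amounts to tracing each added term back to earlier queries, SERP text, or clicked pages. The step layout below ('query', 'serp_text', 'clicked_text') and the precedence order are assumptions for illustration, not the log schema used in the paper.

```python
def term_origins(session):
    """For each term added by a query reformulation, record where (if anywhere)
    it appeared earlier in the same session: a clicked page, SERP text, or an
    earlier query. Returns (term, origin) pairs with origin in
    {'click', 'serp', 'query', 'new'}."""
    origins = []
    seen_query, seen_serp, seen_click = set(), set(), set()
    prev_terms = set()
    for i, step in enumerate(session):
        terms = set(step["query"].lower().split())
        if i > 0:  # only reformulations add terms relative to a previous query
            for term in terms - prev_terms:
                if term in seen_click:
                    origins.append((term, "click"))
                elif term in seen_serp:
                    origins.append((term, "serp"))
                elif term in seen_query:
                    origins.append((term, "query"))
                else:
                    origins.append((term, "new"))  # no earlier occurrence seen
        seen_query |= terms
        seen_serp |= set(step.get("serp_text", "").lower().split())
        seen_click |= set(step.get("clicked_text", "").lower().split())
        prev_terms = terms
    return origins

session = [
    {"query": "photosynthesis", "serp_text": "chlorophyll light reaction",
     "clicked_text": "calvin cycle stroma thylakoid"},
    {"query": "calvin cycle", "serp_text": "", "clicked_text": ""},
]
print(term_origins(session))  # both added terms trace back to a clicked page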
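The conditioning described on slide 14 can be read as estimating, separately for clicked and unclicked previous SERPs, how often an expertise metric increases (P+), stays flat (P=) or decreases (P−). The tuple layout below is an illustrative assumption about how such transitions might be recorded.

```python
from collections import Counter

def conditional_outcomes(transitions):
    """Estimate P+, P= and P- of an expertise metric conditioned on whether the
    previous SERP received a click. `transitions` holds
    (clicked_previous_serp, metric_before, metric_after) tuples."""
    counts = {True: Counter(), False: Counter()}
    for clicked, before, after in transitions:
        outcome = "+" if after > before else "-" if after < before else "="
        counts[clicked][outcome] += 1

    dists = {}
    for clicked, counter in counts.items():
        total = sum(counter.values())
        dists[clicked] = {o: (counter[o] / total if total else 0.0)
                          for o in ("+", "=", "-")}
    return dists

# Toy data: clicks are followed by metric increases more often than non-clicks.
data = [(True, 2, 3), (True, 3, 3), (True, 1, 2),
        (False, 2, 2), (False, 3, 2), (False, 2, 3)]
print(conditional_outcomes(data))  # P+ is higher when the previous SERP was clicked
```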
