An Eye-tracking Study of User Interactions with Query Auto Completion – Katja Hofmann, Microsoft Research Cambridge

Query Auto Completion (QAC) suggests possible queries to web search users from the moment they start entering a query. This popular feature of web search engines is thought to reduce physical and cognitive effort when formulating a query.

Perhaps surprisingly, despite QAC being widely used, users' interactions with it are poorly understood. This paper begins to address this gap. We present the results of an in-depth user study of user interactions with QAC in web search. While study participants completed web search tasks, we recorded their interactions using eye-tracking and client-side logging. This allows us to provide a first look at how users interact with QAC. We specifically focus on the effects of QAC ranking, by controlling the quality of the ranking in a within-subject design.

We identify a strong position bias that is consistent across ranking conditions. Because of this position bias, ranking quality affects QAC usage. We also find an effect on task completion, in particular on the number of result pages visited. We show how these effects can be explained by a combination of searchers' behavior patterns, namely monitoring or ignoring QAC, and searching for spelling support or for complete queries to express a search intent. We conclude the paper with a discussion of the important implications of our findings for QAC evaluation.
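
The ranking manipulation behind the within-subject design can be illustrated with a small sketch. The slides describe two conditions, "original" (production ranking) and "random", counterbalanced so that at most 2 subsequent tasks share a condition. The code below is only an assumption about how such a setup could be generated: the function names, the balanced half/half split, the rejection sampling, and the task count of 12 are all hypothetical and are not taken from the study.

```python
import random


def max_run_length(seq):
    """Length of the longest run of identical consecutive items."""
    if not seq:
        return 0
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def assign_conditions(n_tasks, rng, max_run=2):
    """Draw a balanced condition order with no run longer than max_run.

    Hypothetical procedure (rejection sampling); the study's actual
    counterbalancing scheme is not specified in this document.
    """
    base = ["original", "random"] * (n_tasks // 2) + ["original"] * (n_tasks % 2)
    while True:
        order = base[:]
        rng.shuffle(order)
        if max_run_length(order) <= max_run:
            return order


def apply_condition(suggestions, condition, rng):
    """Return the suggestion list as shown to the searcher."""
    ranked = list(suggestions)
    if condition == "random":
        rng.shuffle(ranked)  # same suggestions, random order
    return ranked


rng = random.Random(0)
production = [
    "massachusetts",
    "massachusetts state lottery",
    "massachusetts unemployment",
    "massachusetts registry of motor vehicles",
]
for condition in assign_conditions(12, rng):  # 12 tasks is an assumed number
    print(condition, apply_condition(production, condition, rng))
```

Because both conditions show the same set of suggestions and differ only in order, any difference in examination or usage can be attributed to ranking rather than to which suggestions were available.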


  1. Joint work with Bhaskar Mitra, Milad Shokouhi, and Filip Radlinski
  2. How do searchers examine QAC rankings? How does the quality of QAC rankings affect examination and usage? Are QAC examination and usage affected by position bias?
  3. Analysis Results Discussion
  4. Example suggestions for the prefix "massachu|" in the two conditions.
     original condition (production): massachusetts; massachusetts state lottery; massachusetts unemployment; massachusetts registry of motor vehicles; massachusetts secretary of state; massachusetts department of revenue; massachusetts department of education; massachusetts general hospital
     random condition: massachusetts unemployment; massachusetts department of education; massachusetts secretary of state; massachusetts registry of motor vehicles; massachusetts; massachusetts general hospital; massachusetts department of revenue; massachusetts state lottery
     Counterbalanced in blocks so maximum of 2 subsequent tasks are in the same condition.
  5. Same tasks for all participants
     navigational closed informational
     Included difficult-to-spell names (schwarzenegger), terms that can be abbreviated (wsj).
     Example search tasks:
     Find the homepage of the Massachusetts General Hospital in Boston, USA. What is their physical address? (navigational)
     Japan is the 10th most populated country in the world. How many people live there? (easy informational)
     How many matches did Roger Federer win against Rafael Nadal in 2007? (complex informational)
  6. Tobii TX300: unobtrusive; tracks natural head movement; 300 Hz temporal resolution; accuracy up to 0.4° visual angle; size of each QAC suggestion on screen: 0.67°
     http://www.tobii.com/Global/Analysis/Downloads/Product_Descriptions/Tobii_TX300_EyeTracker_Product_Description.pdf
  7. Make searchers type: Provide instructions and search task descriptions on screen (avoid copy-paste).
     Participants: 25, diverse backgrounds, level of education, and computer experience.
     Instruction: Participate in a study of search quality; start search from bing.com, then search any way you like.
  8. Experiment Results Discussion
  9. [Figure: interaction timeline with events labeled Q1 to Q3, R1 to R5, S1 to S4, and the character sequence "T E T S T _ Q U E"]
     Time measures: task completion time (TCT); time to first result click (TFC); query formulation time (QFT); time to first fixation (TFF); cumulative fixation time (CFT), where A + B = CFT
     Legend: fixation (anywhere on the screen); saccade (anywhere on the screen); mouse click; typed character; QAC suggestions shown; fixations on QAC suggestions; control characters
     Further measures:
     QU  QAC suggestion used
     QR  QAC rank
     QL  query length
     CS  characters saved
     UQ  unique queries submitted
     UR  unique result pages
     + query and task characteristics
  10. Experiment Analysis Discussion
  11. [Figure: interaction timeline highlighting fixations on QAC suggestions, time to first fixation (TFF), and cumulative fixation time (CFT = A + B)]
      response        type    n    β0      estimate  β1      estimate
      CFT > 0         binary
      CFT | CFT > 0   log
      TFF | CFT > 0   log
      * marks coefficients that are estimated to differ significantly from zero.
  12. [Figure: interaction timeline highlighting fixations on QAC suggestions, time to first fixation (TFF), and cumulative fixation time (CFT = A + B)]
      response        type    n    β0      estimate  β1      estimate
      CFT > 0         binary  331  3.468*  0.97      -0.220  0.96
      CFT | CFT > 0   log     284  7.124*  1241 ms   -0.043  1189 ms
      TFF | CFT > 0   log     284  6.503*  667 ms    -0.094  607 ms
      * marks coefficients that are estimated to differ significantly from zero.
  13. Fixations and use of AS by rank and condition. Condition has little effect, suggesting a strong position bias.
      [Figure: x-axis: AS suggestion rank; y-axes: AS usage (percent) and mean fixation time (milliseconds); series: Fixations (original), Fixations (random), AS usage (original), AS usage (random)]
  14. [Figure: interaction timeline highlighting query formulation time (QFT), with mouse clicks, typed characters, and control characters]
      response   type     n    β0      estimate  β1      estimate
      QFT        log
      QL         Poisson
      QU         binary
      CS | QU    Poisson
      QR | QU    Poisson
      * marks coefficients that are estimated to differ significantly from zero.
  15. [Figure: interaction timeline highlighting query formulation time (QFT), with mouse clicks, typed characters, and control characters]
      response   type     n    β0       estimate  β1      estimate
      QFT        log      331  8.680*   5884 ms   0.058   6235 ms
      QL         Poisson  331  3.224*   25        -0.007  25
      QU         binary   331  -0.915*  0.29      -0.508  0.19
      CS | QU    Poisson  99   2.192*   9         0.223*  11
      QR | QU    Poisson  99   0.344*   1.4       0.044   1.5
      * marks coefficients that are estimated to differ significantly from zero.
  16. [Figure: interaction timeline highlighting task completion time (TCT) and time to first result click (TFC), with mouse clicks]
      response         type     n    β0      estimate  β1      estimate
      UQ               Poisson
      UR = 0           binary
      UR | UR > 0      Poisson
      TFC | UR > 0     log
      TCT ≥ ts         binary
      TCT | TCT < ts   log
      * marks coefficients that are estimated to differ significantly from zero.
  17. [Figure: interaction timeline highlighting task completion time (TCT) and time to first result click (TFC), with mouse clicks]
      response         type     n    β0       estimate  β1      estimate
      UQ               Poisson  331  0.357*   1.4       0.044   1.5
      UR = 0           binary   331  -3.654*  0.03      -0.022  0.02
      UR | UR > 0      Poisson  282  0.703*   2.0       0.161*  2.4
      TFC | UR > 0     log      282  8.625*   5569 ms   -0.036  5372 ms
      TCT ≥ ts         binary   331  -3.217*  0.04      0.764   0.08
      TCT | TCT < ts   log      297  11.096*  65.9 s    -0.021  64.5 s
      * marks coefficients that are estimated to differ significantly from zero.
  18. Experiment Analysis Results
  19. a) touch typing, aware of suggestions
  20. b + c) spelling support vs. expressing an information need
  21. d) seeking suggestions
  22. How to measure QAC ranking quality?
      Rank-based (e.g., MRR, extracted from logs), e.g., [Shokouhi ‘13]
      QAC usage [Kharitonov et al. ‘13]
      Manual judgment of suggestions [Bhatia et al. ‘11]
      Result page quality [Liu et al. ‘12]
      Effort-based (e.g., MKS) [Duan & Hsu ‘11]
      AB-tests [Kohavi et al. ‘13]
      Interleaving [Hofmann et al. ‘13]
  23. effects of ranking changes
      strong position bias
      effect on query effectiveness
      Next:
  24. [Bhatia et al. ‘11] S. Bhatia, D. Majumdar, P. Mitra: Query suggestions in the absence of query logs (SIGIR 2011).
      [Duan & Hsu ‘11] H. Duan, B.-J. P. Hsu: Online spelling correction for query completion (WWW 2011).
      [Hofmann et al. ‘13] K. Hofmann, S. Whiteson, M. de Rijke: Fidelity, soundness, and efficiency of interleaved comparison methods (ACM TOIS 31(4), 2013).
      [Hofmann et al. ‘14] K. Hofmann, B. Mitra, M. Shokouhi, F. Radlinski: An Eye-tracking Study of User Interactions with Query Auto Completion (CIKM 2014).
      [Kharitonov et al. ‘13] E. Kharitonov, C. Macdonald, P. Serdyukov, I. Ounis: User Model-based Metrics for Offline Query Suggestion Evaluation (CIKM 2013).
      [Kohavi et al. ‘13] R. Kohavi, A. Deng, B. Frasca, T. Walker, Y. Xu, N. Pohlmann: Online controlled experiments at large scale (KDD 2013).
      [Li et al. ‘14] Y. Li, A. Dong, H. Wang, H. Deng, Y. Chang, C. Zhai: A Two-Dimensional Click Model for Query Auto-Completion (SIGIR 2014).
      [Liu et al. ‘12] Y. Liu, R. Song, Y. Chen, J.-Y. Nie, J.-R. Wen: Adaptive query suggestion for difficult queries (SIGIR 2012).
      [Mitra et al. ‘14] B. Mitra, M. Shokouhi, F. Radlinski, K. Hofmann: On User’s Interactions with Query Auto-Completion (SIGIR 2014).
      [Shokouhi ‘13] M. Shokouhi: Learning to Personalize Query Auto-Completion (SIGIR 2013).
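
A note on reading the regression tables on slides 12, 15, and 17: the two "estimate" columns appear to be predictions on the response scale, the first computed from β0 alone (original condition) and the second from β0 + β1 (random condition). That interpretation is an assumption rather than something the slides state, but it reproduces the reported values when a logit link is used for "binary" responses and a log link for "log" and "Poisson" responses. The helper names below are hypothetical.

```python
import math


def inverse_link(eta, response_type):
    """Map a linear predictor back to the response scale."""
    if response_type == "binary":
        # Inverse logit: linear predictor -> probability.
        return 1.0 / (1.0 + math.exp(-eta))
    if response_type in ("log", "Poisson"):
        # Inverse log link: linear predictor -> time or count.
        return math.exp(eta)
    raise ValueError(f"unknown response type: {response_type}")


def condition_estimates(beta0, beta1, response_type):
    """Estimates for the original (beta0) and random (beta0 + beta1) conditions."""
    original = inverse_link(beta0, response_type)
    randomized = inverse_link(beta0 + beta1, response_type)
    return original, randomized


# Coefficients copied from the slide tables.
rows = [
    ("CFT > 0", "binary", 3.468, -0.220),
    ("CFT | CFT > 0", "log", 7.124, -0.043),
    ("CS | QU", "Poisson", 2.192, 0.223),
]
for name, response_type, beta0, beta1 in rows:
    original, randomized = condition_estimates(beta0, beta1, response_type)
    print(f"{name}: original={original:.2f}, random={randomized:.2f}")
```

Rounded as in the slides, these rows give 0.97 and 0.96 for CFT > 0, 1241 ms and 1189 ms for CFT | CFT > 0, and 9 and 11 for CS | QU.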
