Background Motivation Model & Metric Experimental Setup Results Summary
Incorporating Clicks, Attention and Satisfaction into a SERP Evaluation Model

Slides by Aleksandr Chuklin and Maarten de Rijke, presented at the 2016 CIKM Conference. The authors propose a methodology for better evaluating searcher satisfaction and incorporating it into how search results are evaluated and ranked.

p.s. This document was originally published at https://www.researchgate.net/publication/309416715_Slides_Incorporating_Clicks_Attention_and_Satisfaction_into_a_SERP_Evaluation_Model


  1. Incorporating Clicks, Attention and Satisfaction into a SERP Evaluation Model. Aleksandr Chuklin¶,§ (chuklin@google.com) and Maarten de Rijke§ (derijke@uva.nl). ¶Google Research Europe, §University of Amsterdam. AC–MdR Incorporating Clicks, Attention and Satisfaction. . .
  2. Background
  3. Search Engine Result Page (SERP) Evaluation. Main problem: combining the relevance of individual SERP items ($R_k$) into a whole-page metric.
  7. Search Engine Result Page (SERP) Evaluation: Examples. Precision at N: $P@N = \frac{1}{N} \sum_{k=1}^{N} R_k$. Discounted Cumulative Gain (DCG): $DCG@N = \sum_{k=1}^{N} \frac{1}{\log_2(1 + k)} \cdot R_k$. Model-Based Metrics (Chuklin et al. 2013): $Utility@N = \sum_{k=1}^{N} P(C_k = 1) \cdot R_k$. [Figure: example SERP listing document 3, document 4, document 1, document 2, document 5.]
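The three example metrics (P@N, DCG@N and the model-based utility) differ only in how each position is weighted. A minimal Python sketch, assuming numeric relevance labels and a list of click probabilities supplied by some click model; the function names and the sample data are hypothetical and do not come from the slides:

```python
import math

def precision_at_n(relevance, n):
    """P@N: average relevance R_k over the top-n items."""
    return sum(relevance[:n]) / n

def dcg_at_n(relevance, n):
    """DCG@N: each R_k is discounted by 1 / log2(1 + k), with k starting at 1."""
    return sum(r / math.log2(1 + k)
               for k, r in enumerate(relevance[:n], start=1))

def utility_at_n(click_probs, relevance, n):
    """Model-based utility: each R_k is weighted by the click probability P(C_k = 1)."""
    return sum(p * r for p, r in zip(click_probs[:n], relevance[:n]))

# Hypothetical five-item SERP: binary relevance and made-up click probabilities.
relevance = [1, 0, 1, 1, 0]
click_probs = [0.6, 0.3, 0.2, 0.1, 0.05]

print(precision_at_n(relevance, 5))
print(dcg_at_n(relevance, 5))
print(utility_at_n(click_probs, relevance, 5))
```

Seen this way, DCG is a model-based metric whose "click probability" is fixed to the position discount $1 / \log_2(1 + k)$.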
  8. Main Goal of This Paper. Better measure for SERP utility. Namely, improve this (Chuklin et al. 2013): $\sum_{k=1}^{N} P(C_k = 1) \cdot R_k$.
  9. Motivation
  10. Complex Heterogeneous SERPs
  11. Motivation 1: Non-Trivial Attention Patterns. [Figure: (c) Mouse Data.] Image credits: F. Diaz, R.W. White, G. Buscher, and D. Liebling. Robust models of mouse movement on dynamic web search results pages. In CIKM, 2013. ACM Press.
  12. Motivation 2: Satisfaction Without Clicks. High direct page utility (measured by DCG or ERR) leads to a higher abandonment rate (SERPs with no clicks). [Figure; axis label: direct page utility.] Image credits: A. Chuklin and P. Serdyukov. Good abandonments in factoid queries. In WWW, 2012.
  15. Problems of Existing Models and Evaluation Metrics: existing models mostly do not model non-trivial user attention patterns; existing models do not use explicit user satisfaction data.
  16. Model & Metric
  17. Clicks + Attention + Satisfaction (CAS) Model. [Diagram of the CAS model: the SERP; per-item features $\varphi_k$, examination $E_k$ and click $C_k$ variables; utility; satisfaction $S$.]
  20. Click Model. Examination assumption: a click happens only when an item was examined and attractive: $P(C_k = 1) = P(E_k = 1) \cdot P(C_k = 1 \mid E_k = 1)$. N.B. Here we assume that $P(C_k = 1 \mid E_k = 1) = \alpha(R_k)$, where $R_k$ comes from the raters and $\alpha$ is a logistic function.
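The examination assumption can be sketched in a few lines of Python. This is an illustration only: it assumes a scalar rating $R_k$ and a logistic $\alpha$ with two made-up parameters `a0` and `a1`; the slides do not give the parametrization or the learned values.

```python
import math

def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def attractiveness(rating, a0=-2.0, a1=1.0):
    """alpha(R_k) = P(C_k = 1 | E_k = 1), a logistic function of the rating.

    a0 and a1 are hypothetical parameters chosen for the example.
    """
    return sigmoid(a0 + a1 * rating)

def click_probability(p_examined, rating):
    """Examination assumption: P(C_k = 1) = P(E_k = 1) * P(C_k = 1 | E_k = 1)."""
    return p_examined * attractiveness(rating)

# An item that is never examined is never clicked, however relevant it is.
print(click_probability(0.0, rating=3))
# An examined item is clicked with probability alpha(R_k).
print(click_probability(1.0, rating=2))
```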
  24. Attention (Examination) Model. Logistic regression model: $P(E_k = 1) = \varepsilon(\varphi_k)$, where $\varphi_k$ is a vector of features for SERP item $k$.

      | Feature group | Features | # of features |
      |---|---|---|
      | rank | user-perceived rank of the SERP item (can be different from $k$) | 1 |
      | CSS classes | SERP item type (Web, News, Weather, Currency, Knowledge Panel, etc.) | 10 |
      | geometry | offset from the top, first or second column (binary), width (w), height (h), w × h | 5 |
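A minimal sketch of how such a feature vector and the logistic examination probability could be put together. Everything concrete here is an assumption for illustration: the item-type list is a hypothetical subset of the ten CSS classes, geometry is expressed as fractions of the page size, and the weights are made up rather than learned.

```python
import math

# Hypothetical subset of SERP item types; the slides mention 10 CSS-class features.
ITEM_TYPES = ["web", "news", "weather", "currency", "knowledge_panel"]

def features(rank, item_type, top_offset, second_column, width, height):
    """Build phi_k: rank, one-hot item type, and the five geometry features."""
    one_hot = [1.0 if item_type == t else 0.0 for t in ITEM_TYPES]
    geometry = [top_offset,
                1.0 if second_column else 0.0,
                width,
                height,
                width * height]
    return [float(rank)] + one_hot + geometry

def examination_probability(phi, weights, bias):
    """Logistic regression: P(E_k = 1) = sigmoid(bias + weights . phi_k)."""
    z = bias + sum(w * x for w, x in zip(weights, phi))
    return 1.0 / (1.0 + math.exp(-z))

# Made-up weights: lower-ranked items get less attention, larger items get more.
weights = [-0.4] + [0.0] * len(ITEM_TYPES) + [0.0, 0.0, 0.0, 0.0, 2.0]
phi_top = features(1, "weather", top_offset=0.0, second_column=False,
                   width=0.6, height=0.3)
phi_low = features(8, "web", top_offset=0.9, second_column=False,
                   width=0.6, height=0.1)
print(examination_probability(phi_top, weights, bias=1.0))
print(examination_probability(phi_low, weights, bias=1.0))
```

Because the rank feature is the user-perceived rank rather than the index $k$, an item in a second column can receive a different rank than its position in the page markup.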
  29. Satisfaction Model. In previous models, satisfaction comes only from clicked results; in our model it also comes from the SERP items that simply attracted attention: $P(S = 1) = \sigma(\tau_0 + U) = \sigma\left(\tau_0 + \sum_k P(E_k = 1)\, u_d(D_k) + \sum_k P(C_k = 1)\, u_r(R_k)\right)$, where $D_k$ and $R_k$ are ratings assigned by the raters for direct snippet relevance and result relevance respectively. $u_d$ and $u_r$ are linear functions of rating histograms.
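The satisfaction formula can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each rating histogram is a list of per-grade fractions, $u_d$ and $u_r$ are dot products with made-up weight vectors `w_d` and `w_r`, and the dictionary keys are hypothetical names.

```python
import math

def linear_utility(histogram, weights):
    """u(.) as a linear function of a rating histogram (one weight per grade)."""
    return sum(w * h for w, h in zip(weights, histogram))

def cas_utility(items, w_d, w_r):
    """U = sum_k P(E_k=1) * u_d(D_k) + sum_k P(C_k=1) * u_r(R_k)."""
    total = 0.0
    for item in items:
        total += item["p_exam"] * linear_utility(item["d_hist"], w_d)
        total += item["p_click"] * linear_utility(item["r_hist"], w_r)
    return total

def satisfaction_probability(items, w_d, w_r, tau0):
    """P(S = 1) = sigmoid(tau0 + U)."""
    return 1.0 / (1.0 + math.exp(-(tau0 + cas_utility(items, w_d, w_r))))

# Hypothetical SERP with one answer-like item: almost always examined, never
# clicked, and all raters judged its snippet directly relevant.
items = [{"p_exam": 0.9, "p_click": 0.0,
          "d_hist": [0.0, 1.0],   # fractions of raters: [not relevant, relevant]
          "r_hist": [1.0, 0.0]}]
w_d, w_r = [0.0, 2.0], [0.0, 1.5]  # made-up weights
print(cas_utility(items, w_d, w_r))
print(satisfaction_probability(items, w_d, w_r, tau0=-1.8))
```

The example SERP earns positive utility without any click, which is the "satisfaction without clicks" case that a purely click-weighted utility scores as zero.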
  32. The CAS Metric. Utility that determines the satisfaction probability: $U = \underbrace{\sum_k P(E_k = 1)\, u_d(D_k)}_{\text{NEW}} + \underbrace{\sum_k P(C_k = 1)\, u_r(R_k)}_{\text{Chuklin et al. 2013}}$; has an additional term; trained on mousing and satisfaction (in addition to clicks).
  33. Experimental Setup
  35. Dataset: 199 queries with explicit unambiguous feedback (satisfied / not satisfied); 1,739 rated results, each rated for direct snippet relevance (D) and result relevance (R).
  37. Baselines and CAS Model Variants. Baselines: UBM, a model that agrees well with online team-draft experimental outcomes; PBM, the position-based model, a robust model with fewer parameters than UBM; random, a model that predicts click and satisfaction with fixed probabilities (learned from the data); uUBM, from Chuklin et al. 2013, similar to UBM, but with parameters trained on a different and much bigger dataset. CAS variants: CASnod is a stripped-down version that does not use (D) labels; CASnosat is a version of the CAS model that does not include the satisfaction term while optimizing the model; CASnoreg is a version of the CAS model that does not use regularization while training. All other models were trained with L2-regularization.
  38. Results
  39. Is the New Metric Really New? Correlation Between Metrics. Table: Correlation between metrics measured by average Pearson's correlation coefficient.

      | | CASnosat | CASnoreg | CAS | UBM | PBM | DCG | uUBM |
      |---|---|---|---|---|---|---|---|
      | CASnod | 0.593 | 0.564 | 0.633 | 0.470 | 0.487 | 0.546 | 0.441 |
      | CASnosat | | 0.664 | 0.715 | 0.707 | 0.668 | 0.735 | 0.684 |
      | CASnoreg | | | 0.974 | 0.363 | 0.379 | 0.417 | 0.341 |
      | CAS | | | | 0.377 | 0.394 | 0.440 | 0.360 |
      | UBM | | | | | 0.814 | 0.972 | 0.882 |
      | PBM | | | | | | 0.906 | 0.965 |
      | DCG | | | | | | | 0.943 |
  40. Is the New Metric Measuring the Right Thing? Metric Correlation with True Satisfaction. [Figure: Pearson correlation coefficient between different model-based metrics and the user-reported satisfaction, for CASnod, CASnosat, CASnoreg, CAS, UBM, PBM, random, DCG and uUBM.]
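For readers who want to reproduce this kind of comparison on their own data, a plain Pearson correlation between per-SERP metric values and binary satisfaction labels can be computed as below. This is a generic sketch; the sample values are invented, and the slides do not say how the data was split or averaged.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical metric values for five SERPs and the users' reported satisfaction.
metric_values = [1.8, 0.2, 1.1, 0.4, 1.5]
satisfied = [1, 0, 1, 0, 1]
print(pearson(metric_values, satisfied))
```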
  41. Bonus Point: Log-Likelihood of Click Prediction. [Figure: log-likelihood of the click data for CASnod, CASnosat, CASnoreg, CAS, UBM, PBM, random and uUBM.] Note that uUBM was trained on a totally different dataset.
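Click log-likelihood scores how well a model's predicted click probabilities explain the observed clicks. A minimal sketch, assuming the model outputs one independent click probability per SERP item; the clipping constant `eps` and the sample data are assumptions, and the slides do not state whether values are summed or averaged per SERP.

```python
import math

def click_log_likelihood(click_probs, clicks, eps=1e-12):
    """Sum of log P(observed click outcome) over the items of one SERP."""
    total = 0.0
    for p, clicked in zip(click_probs, clicks):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0) for overconfident models
        total += math.log(p) if clicked else math.log(1.0 - p)
    return total

# Hypothetical SERP: predicted click probabilities and the observed clicks.
predicted = [0.6, 0.3, 0.1]
observed = [1, 0, 0]
print(click_log_likelihood(predicted, observed))
```

Values closer to zero indicate a better fit.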
  42. Summary
  46. Summary. A model-based metric needs to model satisfaction explicitly and use it for training. Direct snippet relevance (D) is essential for predicting satisfaction. The CAS metric is quite different from the previously used metrics, making it an interesting addition to TREC. When used as a model, CAS consistently predicts user satisfaction with a relatively small penalty in click prediction.
  47. Acknowledgments. All content represents the opinion of the authors, which is not necessarily shared or endorsed by their respective employers and/or sponsors.
  48. Evaluating the User Model: Log-Likelihood of Satisfaction Prediction. [Figure: log-likelihood of the satisfaction prediction for CASnod, CASnosat, CASnoreg, CAS, UBM, PBM, random and uUBM.] Some models have log-likelihood below −0.8, hence there are no boxes for them.
  49. Analyzing the Attention Features. CASrank is the model that only uses the rank to predict attention; CASnogeom only uses the rank and SERP item type information and does not use geometry; CASnoclass does not use the CSS class features (SERP item type). [Figures: Pearson correlation with satisfaction, and log-likelihood of clicks / satisfaction, for CASrank, CASnogeom, CASnoclass, CASnod and CAS.]
  50. Heterogeneous SERPs. 12% of the SERPs in our data are heterogeneous and our metric does well for them. Table: Pearson correlation between utility of heterogeneous SERP and user-reported satisfaction.

      | CAS | UBM | PBM | random | DCG | uUBM |
      |---|---|---|---|---|---|
      | 0.60 | 0.38 | -0.05 | -0.39 | 0.24 | -0.08 |

      | CASrank | CASnogeom | CASclass | CASnod | CASnosat | CASnoreg |
      |---|---|---|---|---|---|
      | 0.15 | -0.04 | 0.27 | -0.04 | 0.48 | 0.67 |
  51. Spammers. Some raters were filtered out as spammers, but there was still some natural disagreement. Table: Filtered out workers and agreement scores for remaining workers.

      | label | % of workers removed | % of ratings removed | Cohen's kappa | Krippendorff's alpha |
      |---|---|---|---|---|
      | (D) | 32% | 27% | 0.339 | 0.144 |
      | (R) | 41% | 29% | 0.348 | 0.117 |