Modeling and Detecting Changes in User Satisfaction
Julia Kiseleva*, Eric Crestan, Riccardo Brigo, Roland Dittel
*Eindhoven University of Technology
Microsoft Bing
What is User Satisfaction?
Example: a user wants to go to the CIKM conference, issues a QUERY, and receives a SERP.
For a <QUERY, SERP> pair we consider Pr(Ref.), the probability that users reformulate the query.
Assumption: if a “significant” number of users reformulate a query when shown a particular SERP, this indicates a change in user preferences
World May Change User Preferences
How Can We Detect the Changes?
[Figure: the same <QUERY, SERP> pair observed at times ti and ti+1 on a timeline, with reformulation probabilities Pr ti and Pr ti+1. The change is measured as | Pr ti - Pr ti+1 |.]
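The quantity | Pr ti - Pr ti+1 | can be illustrated with a minimal sketch. The log format used here, tuples of `(query, serp_id, was_reformulated)`, and the function names are assumptions made for illustration; the slides do not specify how the logs are stored.

```python
from collections import Counter


def reformulation_probability(events):
    """Estimate Pr(Ref.) per <QUERY, SERP> pair.

    `events` is an iterable of (query, serp_id, was_reformulated) tuples.
    This log format is hypothetical, not taken from the slides.
    """
    shown = Counter()
    reformulated = Counter()
    for query, serp_id, was_reformulated in events:
        shown[(query, serp_id)] += 1
        if was_reformulated:
            reformulated[(query, serp_id)] += 1
    return {pair: reformulated[pair] / shown[pair] for pair in shown}


def probability_change(events_t, events_t1, pair):
    """Return | Pr ti - Pr ti+1 | for one <QUERY, SERP> pair."""
    p_t = reformulation_probability(events_t).get(pair, 0.0)
    p_t1 = reformulation_probability(events_t1).get(pair, 0.0)
    return abs(p_t - p_t1)


# Toy data: 1 of 10 impressions reformulated at ti, 7 of 10 at ti+1.
pair = ("cikm conference", "serp-1")
period_t = [pair + (False,)] * 9 + [pair + (True,)] * 1
period_t1 = [pair + (False,)] * 3 + [pair + (True,)] * 7
change = probability_change(period_t, period_t1, pair)
```

A large value of `change` is only a candidate signal; deciding whether it is "significant" is the job of the threshold discussed later in the deck.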
Detecting Query Reformulation
• There are many definitions of query reformulation in the literature
• We use query expansion:
o new years wallpaper IS REFORMULATED WITH 2014
o medals Olympics IS REFORMULATED WITH 2014
o ct 40ez IS REFORMULATED WITH 2013
o march 31 holiday IS REFORMULATED WITH 2014
o …
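A query expansion of the kind listed above can be recognised with a simple term comparison. The sketch below is an assumption about how such a check might look, not the authors' implementation: it treats the next query as an expansion when it contains every term of the original query plus at least one new term, ignoring case and term order.

```python
def expansion_terms(query, next_query):
    """Return the terms added to `query` if `next_query` expands it, else None.

    Hypothetical helper: an expansion keeps all original terms and adds
    at least one new term; case and term order are ignored.
    """
    original = query.lower().split()
    remaining = next_query.lower().split()
    for term in original:
        if term not in remaining:
            return None  # a term was dropped or changed, so not an expansion
        remaining.remove(term)
    return remaining or None


added = expansion_terms("new years wallpaper", "new years wallpaper 2014")
```

Here `added` holds the reformulation term, which is the unit the rest of the deck tracks over time.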
An Example of the Drift in Reformulation Signal
The Explanation of the Drift
[Figure panels: Before November 2013; After November 2013]
The Question: “How to detect this kind of change?”
Change Detection Techniques
• Change detection techniques
o In dynamically changing and non-stationary environments, the data distribution can change over time, yielding the phenomenon of concept drift
o Real concept drift refers to changes in the conditional distribution of the output (i.e., the target variable) given the input (input features)
• Concept drift types (figure: data mean plotted over time):
o Sudden/abrupt: disambiguation, such as “flawless Beyoncé” or “medal olympics 2014”
o Incremental: disambiguation, such as “cikm conference 2014”
o Gradual: breaking news, such as “idaho bus crash investigation”
o Reoccurring concepts: seasonal change, such as “black Friday 2014”
o Outlier (not concept drift)
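The four drift types differ in how the data mean moves over time. The following sketch generates a synthetic mean series for each type; the function names, the levels 0.1 and 0.8, and the random seed are illustrative assumptions and do not come from the slides.

```python
import random


def sudden(n, change_at, old=0.1, new=0.8):
    """Mean switches from the old level to the new level in one step."""
    return [old if t < change_at else new for t in range(n)]


def incremental(n, start, end, old=0.1, new=0.8):
    """Mean moves from old to new through intermediate values."""
    series = []
    for t in range(n):
        if t < start:
            series.append(old)
        elif t >= end:
            series.append(new)
        else:
            series.append(old + (new - old) * (t - start) / (end - start))
    return series


def gradual(n, start, end, old=0.1, new=0.8, seed=0):
    """Mean alternates between old and new; new grows more likely over time."""
    rng = random.Random(seed)
    series = []
    for t in range(n):
        p_new = min(1.0, max(0.0, (t - start) / (end - start)))
        series.append(new if rng.random() < p_new else old)
    return series


def reoccurring(n, period, width, old=0.1, new=0.8):
    """New level returns for `width` steps at the end of every `period` steps."""
    return [new if t % period >= period - width else old for t in range(n)]
```

Series like these are convenient for checking that a detector reacts to each drift type before it is run on real logs.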
Detecting Drifts in Reformulation Signal
Query: “cikm conference”
Reformulation: “2014”
[Figure: timeline from t0 to ti+Δt. Signal values 0.1, 0.1, 0.2, 0.2, 0.3 fall in Window W0 (up to ti); values 0.7, 0.8, 0.8 fall in Window W1 (after ti). The rise in W1 is annotated: the upcoming conference event.]
Size of Window W0 = n0; Size of Window W1 = n1
E(W0) and E(W1) are the expected (mean) signal values in W0 and W1
If |E(W0) - E(W1)| > eout
Then Drift Detected
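The window test can be written in a few lines. This is a sketch under two assumptions: E(W) is taken to be the arithmetic mean of the window, and the split of the slide's values into W0 and W1 follows the order in which they appear in the figure. The threshold value 0.3 is purely illustrative.

```python
from statistics import mean


def drift_detected(w0, w1, e_out):
    """Flag a drift when the window means differ by more than e_out."""
    return abs(mean(w0) - mean(w1)) > e_out


# Signal for query "cikm conference" reformulated with "2014".
signal = [0.1, 0.1, 0.2, 0.2, 0.3, 0.7, 0.8, 0.8]
w0, w1 = signal[:5], signal[5:]  # assumed split at ti

is_drift = drift_detected(w0, w1, e_out=0.3)  # 0.3 is a hypothetical threshold
```

In practice the threshold is not a constant; it is derived from the window sizes and the variance of the signal.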
Calculating Threshold eout
The threshold eout is computed from:
o Confidence
o Variance at W = W0 ∪ W1
o m = 1/(1/n0 + 1/n1)
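The formula for eout did not survive in the slide text. The sketch below assumes the variance-based bound used by the ADWIN change detector, which combines the same three ingredients (a confidence parameter, the variance over W0 ∪ W1, and m). Whether the authors use exactly this form, and the choice of dividing the confidence by the total window length, are assumptions.

```python
import math
from statistics import pvariance


def harmonic_m(n0, n1):
    """m = 1 / (1/n0 + 1/n1), as given on the slide."""
    return 1.0 / (1.0 / n0 + 1.0 / n1)


def threshold_e_out(w0, w1, delta=0.05):
    """ADWIN-style threshold (assumed form, not confirmed by the slides).

    e_out = sqrt((2/m) * var(W) * ln(2/delta')) + (2/(3m)) * ln(2/delta')
    with W = W0 U W1 and delta' = delta / |W|.
    """
    m = harmonic_m(len(w0), len(w1))
    variance = pvariance(w0 + w1)  # population variance over W0 U W1
    delta_prime = delta / (len(w0) + len(w1))
    log_term = math.log(2.0 / delta_prime)
    return math.sqrt(2.0 / m * variance * log_term) + 2.0 / (3.0 * m) * log_term


e_out = threshold_e_out([0.1, 0.1, 0.2, 0.2, 0.3], [0.7, 0.8, 0.8])
```

Under this form the threshold shrinks as the windows grow, so longer windows allow smaller differences in means to be flagged at the same confidence.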
[Figure: framework timeline marked t0, ti, ti+Δt (in one build: ti, ti+w1, ti+w1+w2)]
Learn reformulation model M from User Behavior Logs
Detect changes in model M using Incoming User Behavior Logs
If change detected
Alarm: Change of user satisfaction detected for pairs {<Qi, SERPi>}, 1 ≤ i ≤ n
else Do Nothing
Annotations:
1) List of reformulation terms per query
2) List of URLs per reformulation
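The learn, detect, alarm loop in this diagram can be sketched as follows. Everything concrete here is an assumption made for illustration: the log format `(query, reformulation_term, rate)`, the representation of model M as a dictionary of observed rates, the fixed threshold, and the choice to relearn the model after an alarm.

```python
from collections import defaultdict
from statistics import mean


def learn_model(logs):
    """Learn model M: observed reformulation rates per (query, term).

    `logs` is an iterable of (query, reformulation_term, rate) tuples,
    a hypothetical format not specified in the slides.
    """
    model = defaultdict(list)
    for query, term, rate in logs:
        model[(query, term)].append(rate)
    return dict(model)


def detect_changes(model, incoming_logs, e_out):
    """Return the (query, term) keys whose mean rate moved by more than e_out."""
    incoming = learn_model(incoming_logs)
    return [
        key
        for key, rates in incoming.items()
        if key in model and abs(mean(model[key]) - mean(rates)) > e_out
    ]


def monitor_step(model, incoming_logs, e_out=0.3):
    """One pass of the loop: raise alarms and relearn, or do nothing."""
    alarms = detect_changes(model, incoming_logs, e_out)
    if alarms:
        # Assumed behaviour: relearn M from the incoming window after an alarm.
        return learn_model(incoming_logs), alarms
    return model, []


training = [("cikm conference", "2014", r) for r in (0.1, 0.1, 0.2, 0.2, 0.3)]
training += [("black friday", "deals", r) for r in (0.4, 0.5, 0.4)]
incoming = [("cikm conference", "2014", r) for r in (0.7, 0.8, 0.8)]
incoming += [("black friday", "deals", r) for r in (0.5, 0.4)]

model = learn_model(training)
model, alarms = monitor_step(model, incoming)
```

Each alarm identifies a query and reformulation term; mapping it back to the affected <Qi, SERPi> pairs would need the SERP identifiers from the logs, which this sketch leaves out.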
Experimentation
o The dataset consists of 6 months of behavioral log data from a commercial search engine
o The training window size is one month
o The test window size is two weeks
Evaluation
Results
Conclusion and Future Work
o We successfully leveraged concept drift detection techniques to detect changes in user satisfaction
o The proposed technique works in an unsupervised way
o A large-scale evaluation has been performed
o Classification of the drift type is needed
o Prediction of the lifetime of the drift would help
Questions?


Editor's Notes

  • #6 In our work we use the probability of reformulating a query as a signal of user satisfaction or dissatisfaction
  • #7 Let us revisit our example where the user wanted to visit CIKM
  • #8 When we can this change becomes a drift
  • #33 Talk about applications