SlideShare a Scribd company logo
1 of 18
Download to read offline
When no clicks are good news
Carlos Castillo, Aris Gionis, Ronny Lempel, Yoelle Maarek
Yahoo! Research Barcelona & Haifa
2 SIGIR 2010 Industry Track – Geneva, Switzerland
Usage mining for search
• Behavioral signals are useful to measure
performance of retrieval systems
• Relevant results are
– clicked more often,
– visited for longer time,
– lead to long-term engagement,
– etc.
• However, predicting user satisfaction accurately
from search behavior signals is still an open
problem
3 SIGIR 2010 Industry Track – Geneva, Switzerland
A (not-so-)special case
If we satisfy the user
by impression, then
we observe a lower
click-through rate
4 SIGIR 2010 Industry Track – Geneva, Switzerland
Satisfaction by impression
Oneboxes and Direct Displays
Oneboxes1
and Direct Displays2
(DD) are
●
Very specific results answering (mostly) unambiguous queries
with a unique answer directly on the SERP
●
Displayed above regular Web results, due to their high
relevance, and in a slightly different format.
● Typical example: weather <city name>
●
Test: guess which onebox/DD was served by which search engine:-)
1
: Google terminology
2
:Yahoo! terminology
5 SIGIR 2010 Industry Track – Geneva, Switzerland
Increasing number of “by impression” results
• When searching for specific stocks, movie or train schedules,
sports results, package tracking (Fedex/UPS), etc.
• To the extreme, what about spell checking, arithmetic operations
or currency conversion, addresses, things to do?
6 SIGIR 2010 Industry Track – Geneva, Switzerland
The problem
• Click-based metrics for user satisfaction
• For cases where we expect no clicks
• Not only search sessions
– Any browsing/interaction session
7 SIGIR 2010 Industry Track – Geneva, Switzerland
Our proposal
●
General method
●
Pick a class of users with a distinctive behavior
●
Study their response to changes
8 SIGIR 2010 Industry Track – Geneva, Switzerland
Our proposal
●
General method
●
Pick a class of users with a distinctive behavior
●
Study their response to changes
●
Specific method
– Find users who are “Tenacious”
• reformulate or click, do not let go
– Measure their abandonment
9 SIGIR 2010 Industry Track – Geneva, Switzerland
How to model users?
• Session representation
– Actions classes: queries and clicks
• XQCQX means “start, query, click, query, stop”
– Alternative: reformulation classes
• User representation
– Frequency of action 3-grams = 15 features in total
– Tenacity = (XQQ+XQC)/(XQQ+XQC+XQX)
10 SIGIR 2010 Industry Track – Geneva, Switzerland
(Preliminary) experiments
• Segment sessions into logical “goals”
• Divide goals in two groups
– With direct-displays above position 5 (DD)
– Without (NO-DD)
• Metric
– Find users with TenacityNO-DD >= 80%
– Measure TenacityDD / TenacityNO-DD
• Ground truth
– Ask humans “do you think users querying Q will be
satisfied by impression by this DD?”
• 1=never ... 5=always
Change in the tenacity of tenacious users
Pitbull: editorial vs metric (type “weather”)
BAD
GOOD
Change in the tenacity of tenacious users
“BAD”
“GOOD”
Pitbull: editorial vs metric (type “weather”)
63% of bad cases
83% precision
BAD
GOOD
Change in the tenacity of tenacious users
Pitbull: editorial vs metric (type “weather”)
Change in the tenacity of tenacious users
BAD
GOOD
Pitbull: editorial vs metric (type “reference”)
Change in the tenacity of tenacious users
BAD
GOOD
“BAD”
“GOOD”
Pitbull: editorial vs metric (type “reference”)
71% of bad cases
84% precision
BAD
GOOD
Change in the tenacity of tenacious users
Pitbull: editorial vs metric (type “reference”)
17 SIGIR 2010 Industry Track – Geneva, Switzerland
Summary
●
Tenacious users can be used to identify bad DDs
●
General method: usage mining on classes of users
●
Shoppers
●
Smart searchers
●
Click-a-lots / explorers
●
Leaders
●
Poodles?
●
etc.
●
General/shared taxonomy of users?
Thank you!
chato@yahoo-inc.com

More Related Content

Similar to When no clicks are good news

Introduction To Six Sigma
Introduction To Six SigmaIntroduction To Six Sigma
Introduction To Six Sigmaskoscielak
 
Digital analytics: Optimization (Lecture 10)
Digital analytics: Optimization (Lecture 10)Digital analytics: Optimization (Lecture 10)
Digital analytics: Optimization (Lecture 10)Joni Salminen
 
User Zoom Webinar Monster Aug09 Vf
User Zoom Webinar Monster Aug09 VfUser Zoom Webinar Monster Aug09 Vf
User Zoom Webinar Monster Aug09 VfUserZoom
 
Presentations - Zarget CRO meetup 2017
Presentations - Zarget CRO meetup 2017Presentations - Zarget CRO meetup 2017
Presentations - Zarget CRO meetup 2017ZargetHQ
 
Lean UX and Optimisation - Userzoom : 24 jan 2012 - lean optimisation
Lean UX and Optimisation - Userzoom : 24 jan 2012 - lean optimisationLean UX and Optimisation - Userzoom : 24 jan 2012 - lean optimisation
Lean UX and Optimisation - Userzoom : 24 jan 2012 - lean optimisationCraig Sullivan
 
Chainsaw Conjoint
Chainsaw ConjointChainsaw Conjoint
Chainsaw ConjointQuestionPro
 
Support at scale in a DevOps world How Swarming and Cynefin can save you from...
Support at scale in a DevOps world How Swarming and Cynefin can save you from...Support at scale in a DevOps world How Swarming and Cynefin can save you from...
Support at scale in a DevOps world How Swarming and Cynefin can save you from...Jon Stevens-Hall
 
What is Lean Six Sigma - ADDVALUE - Nilesh Arora
What is Lean Six Sigma -  ADDVALUE - Nilesh AroraWhat is Lean Six Sigma -  ADDVALUE - Nilesh Arora
What is Lean Six Sigma - ADDVALUE - Nilesh AroraADD VALUE CONSULTING Inc
 
Leverage The Power of Small Data
Leverage The Power of Small DataLeverage The Power of Small Data
Leverage The Power of Small DataKaryn Zuidinga
 
Usability Testing for Qualitative Researchers - QRCA NYC Chapter event
Usability Testing for Qualitative Researchers - QRCA NYC Chapter eventUsability Testing for Qualitative Researchers - QRCA NYC Chapter event
Usability Testing for Qualitative Researchers - QRCA NYC Chapter eventKay Aubrey
 
Testing technology products
Testing technology productsTesting technology products
Testing technology productsDave Kreimer
 
Jose Luis Fernandez-Marquez (UNIGE) - CCL tracker
Jose Luis Fernandez-Marquez (UNIGE) - CCL trackerJose Luis Fernandez-Marquez (UNIGE) - CCL tracker
Jose Luis Fernandez-Marquez (UNIGE) - CCL trackerCitizenCyberlab
 
Bazley understanding online audiences vsg conf march 2016 for uploading
Bazley understanding online audiences vsg conf march 2016 for uploadingBazley understanding online audiences vsg conf march 2016 for uploading
Bazley understanding online audiences vsg conf march 2016 for uploadingMartin Bazley
 
Six sigma an overview | Online Mini MBA (Free)
Six sigma  an overview | Online Mini MBA (Free)Six sigma  an overview | Online Mini MBA (Free)
Six sigma an overview | Online Mini MBA (Free)mybskool-online-courses
 
Comparative evaluation
Comparative evaluationComparative evaluation
Comparative evaluationSónia
 
Digital analytics: Analytics problems (Lecture 9)
Digital analytics: Analytics problems (Lecture 9)Digital analytics: Analytics problems (Lecture 9)
Digital analytics: Analytics problems (Lecture 9)Joni Salminen
 
Applying lean ux in designing enterprise software from ground up
Applying lean ux in designing enterprise software from ground upApplying lean ux in designing enterprise software from ground up
Applying lean ux in designing enterprise software from ground upKok Chiann
 

Similar to When no clicks are good news (20)

Introduction To Six Sigma
Introduction To Six SigmaIntroduction To Six Sigma
Introduction To Six Sigma
 
Digital analytics: Optimization (Lecture 10)
Digital analytics: Optimization (Lecture 10)Digital analytics: Optimization (Lecture 10)
Digital analytics: Optimization (Lecture 10)
 
User Zoom Webinar Monster Aug09 Vf
User Zoom Webinar Monster Aug09 VfUser Zoom Webinar Monster Aug09 Vf
User Zoom Webinar Monster Aug09 Vf
 
Presentations - Zarget CRO meetup 2017
Presentations - Zarget CRO meetup 2017Presentations - Zarget CRO meetup 2017
Presentations - Zarget CRO meetup 2017
 
Lean UX and Optimisation - Userzoom : 24 jan 2012 - lean optimisation
Lean UX and Optimisation - Userzoom : 24 jan 2012 - lean optimisationLean UX and Optimisation - Userzoom : 24 jan 2012 - lean optimisation
Lean UX and Optimisation - Userzoom : 24 jan 2012 - lean optimisation
 
Chainsaw Conjoint
Chainsaw ConjointChainsaw Conjoint
Chainsaw Conjoint
 
Simplify your analytics strategy
Simplify your analytics strategySimplify your analytics strategy
Simplify your analytics strategy
 
Support at scale in a DevOps world How Swarming and Cynefin can save you from...
Support at scale in a DevOps world How Swarming and Cynefin can save you from...Support at scale in a DevOps world How Swarming and Cynefin can save you from...
Support at scale in a DevOps world How Swarming and Cynefin can save you from...
 
What is Lean Six Sigma - ADDVALUE - Nilesh Arora
What is Lean Six Sigma -  ADDVALUE - Nilesh AroraWhat is Lean Six Sigma -  ADDVALUE - Nilesh Arora
What is Lean Six Sigma - ADDVALUE - Nilesh Arora
 
UX Research
UX ResearchUX Research
UX Research
 
Leverage The Power of Small Data
Leverage The Power of Small DataLeverage The Power of Small Data
Leverage The Power of Small Data
 
Usability Testing for Qualitative Researchers - QRCA NYC Chapter event
Usability Testing for Qualitative Researchers - QRCA NYC Chapter eventUsability Testing for Qualitative Researchers - QRCA NYC Chapter event
Usability Testing for Qualitative Researchers - QRCA NYC Chapter event
 
Testing technology products
Testing technology productsTesting technology products
Testing technology products
 
Jose Luis Fernandez-Marquez (UNIGE) - CCL tracker
Jose Luis Fernandez-Marquez (UNIGE) - CCL trackerJose Luis Fernandez-Marquez (UNIGE) - CCL tracker
Jose Luis Fernandez-Marquez (UNIGE) - CCL tracker
 
Bazley understanding online audiences vsg conf march 2016 for uploading
Bazley understanding online audiences vsg conf march 2016 for uploadingBazley understanding online audiences vsg conf march 2016 for uploading
Bazley understanding online audiences vsg conf march 2016 for uploading
 
Six sigma an overview | Online Mini MBA (Free)
Six sigma  an overview | Online Mini MBA (Free)Six sigma  an overview | Online Mini MBA (Free)
Six sigma an overview | Online Mini MBA (Free)
 
Comparative evaluation
Comparative evaluationComparative evaluation
Comparative evaluation
 
Digital analytics: Analytics problems (Lecture 9)
Digital analytics: Analytics problems (Lecture 9)Digital analytics: Analytics problems (Lecture 9)
Digital analytics: Analytics problems (Lecture 9)
 
CPI Training overview
CPI Training overviewCPI Training overview
CPI Training overview
 
Applying lean ux in designing enterprise software from ground up
Applying lean ux in designing enterprise software from ground upApplying lean ux in designing enterprise software from ground up
Applying lean ux in designing enterprise software from ground up
 

More from Carlos Castillo (ChaTo)

Finding High Quality Content in Social Media
Finding High Quality Content in Social MediaFinding High Quality Content in Social Media
Finding High Quality Content in Social MediaCarlos Castillo (ChaTo)
 
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017Carlos Castillo (ChaTo)
 
Detecting Algorithmic Bias (keynote at DIR 2016)
Detecting Algorithmic Bias (keynote at DIR 2016)Detecting Algorithmic Bias (keynote at DIR 2016)
Detecting Algorithmic Bias (keynote at DIR 2016)Carlos Castillo (ChaTo)
 

More from Carlos Castillo (ChaTo) (20)

Finding High Quality Content in Social Media
Finding High Quality Content in Social MediaFinding High Quality Content in Social Media
Finding High Quality Content in Social Media
 
When no clicks are good news
When no clicks are good newsWhen no clicks are good news
When no clicks are good news
 
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017
Socia Media and Digital Volunteering in Disaster Management @ DSEM 2017
 
Detecting Algorithmic Bias (keynote at DIR 2016)
Detecting Algorithmic Bias (keynote at DIR 2016)Detecting Algorithmic Bias (keynote at DIR 2016)
Detecting Algorithmic Bias (keynote at DIR 2016)
 
Discrimination Discovery
Discrimination DiscoveryDiscrimination Discovery
Discrimination Discovery
 
Fairness-Aware Data Mining
Fairness-Aware Data MiningFairness-Aware Data Mining
Fairness-Aware Data Mining
 
Big Crisis Data for ISPC
Big Crisis Data for ISPCBig Crisis Data for ISPC
Big Crisis Data for ISPC
 
Databeers: Big Crisis Data
Databeers: Big Crisis DataDatabeers: Big Crisis Data
Databeers: Big Crisis Data
 
Observational studies in social media
Observational studies in social mediaObservational studies in social media
Observational studies in social media
 
Natural experiments
Natural experimentsNatural experiments
Natural experiments
 
Content-based link prediction
Content-based link predictionContent-based link prediction
Content-based link prediction
 
Link prediction
Link predictionLink prediction
Link prediction
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Graph Partitioning and Spectral Methods
Graph Partitioning and Spectral MethodsGraph Partitioning and Spectral Methods
Graph Partitioning and Spectral Methods
 
Finding Dense Subgraphs
Finding Dense SubgraphsFinding Dense Subgraphs
Finding Dense Subgraphs
 
Graph Evolution Models
Graph Evolution ModelsGraph Evolution Models
Graph Evolution Models
 
Link-Based Ranking
Link-Based RankingLink-Based Ranking
Link-Based Ranking
 
Text Indexing / Inverted Indices
Text Indexing / Inverted IndicesText Indexing / Inverted Indices
Text Indexing / Inverted Indices
 
Indexing
IndexingIndexing
Indexing
 
Text Summarization
Text SummarizationText Summarization
Text Summarization
 

When no clicks are good news

  • 1. When no clicks are good news Carlos Castillo, Aris Gionis, Ronny Lempel, Yoelle Maarek Yahoo! Research Barcelona & Haifa
  • 2. 2 SIGIR 2010 Industry Track – Geneva, Switzerland Usage mining for search • Behavioral signals are useful to measure performance of retrieval systems • Relevant results are – clicked more often, – visited for longer time, – lead to long-term engagement, – etc. • However, predicting user satisfaction accurately from search behavior signals is still an open problem
  • 3. 3 SIGIR 2010 Industry Track – Geneva, Switzerland A (not-so-)special case If we satisfy the user by impression, then we observe a lower click-through rate
  • 4. 4 SIGIR 2010 Industry Track – Geneva, Switzerland Satisfaction by impression Oneboxes and Direct Displays Oneboxes1 and Direct Displays2 (DD) are ● Very specific results answering (mostly) unambiguous queries with a unique answer directly on the SERP ● Displayed above regular Web results, due to their high relevance, and in a slightly different format. ● Typical example: weather <city name> ● Test: guess which onebox/DD was served by which search engine:-) 1 : Google terminology 2 :Yahoo! terminology
  • 5. 5 SIGIR 2010 Industry Track – Geneva, Switzerland Increasing number of “by impression” results • When searching for specific stocks, movie or train schedules, sports results, package tracking (Fedex/UPS), etc. • To the extreme, what about spell checking, arithmetic operations or currency conversion, addresses, things to do?
  • 6. 6 SIGIR 2010 Industry Track – Geneva, Switzerland The problem • Click-based metrics for user satisfaction • For cases where we expect no clicks • Not only search sessions – Any browsing/interaction session
  • 7. 7 SIGIR 2010 Industry Track – Geneva, Switzerland Our proposal ● General method ● Pick a class of users with a distinctive behavior ● Study their response to changes
  • 8. 8 SIGIR 2010 Industry Track – Geneva, Switzerland Our proposal ● General method ● Pick a class of users with a distinctive behavior ● Study their response to changes ● Specific method – Find users who are “Tenacious” • reformulate or click, do not let go – Measure their abandonment
  • 9. 9 SIGIR 2010 Industry Track – Geneva, Switzerland How to model users? • Session representation – Actions classes: queries and clicks • XQCQX means “start, query, click, query, stop” – Alternative: reformulation classes • User representation – Frequency of action 3-grams = 15 features in total – Tenacity = (XQQ+XQC)/(XQQ+XQC+XQX)
  • 10. 10 SIGIR 2010 Industry Track – Geneva, Switzerland (Preliminary) experiments • Segment sessions into logical “goals” • Divide goals in two groups – With direct-displays above position 5 (DD) – Without (NO-DD) • Metric – Find users with TenacityNO-DD >= 80% – Measure TenacityDD / TenacityNO-DD • Ground truth – Ask humans “do you think users querying Q will be satisfied by impression by this DD?” • 1=never ... 5=always
  • 11. Change in the tenacity of tenacious users Pitbull: editorial vs metric (type “weather”)
  • 12. BAD GOOD Change in the tenacity of tenacious users “BAD” “GOOD” Pitbull: editorial vs metric (type “weather”)
  • 13. 63% of bad cases 83% precision BAD GOOD Change in the tenacity of tenacious users Pitbull: editorial vs metric (type “weather”)
  • 14. Change in the tenacity of tenacious users BAD GOOD Pitbull: editorial vs metric (type “reference”)
  • 15. Change in the tenacity of tenacious users BAD GOOD “BAD” “GOOD” Pitbull: editorial vs metric (type “reference”)
  • 16. 71% of bad cases 84% precision BAD GOOD Change in the tenacity of tenacious users Pitbull: editorial vs metric (type “reference”)
  • 17. 17 SIGIR 2010 Industry Track – Geneva, Switzerland Summary ● Tenacious users can be used to identify bad DDs ● General method: usage mining on classes of users ● Shoppers ● Smart searchers ● Click-a-lots / explorers ● Leaders ● Poodles? ● etc. ● General/shared taxonomy of users?