Modeling and Predicting the Task-by-
Task Behavior of Search Engine Users
Gabriele Tolomei
Università Ca' Foscari Venezia, Italy
Claudio Lucchese
ISTI-CNR, Pisa, Italy
Salvatore Orlando
Università Ca' Foscari Venezia, Italy
Fabrizio Silvestri
ISTI-CNR, Pisa, Italy
Raffaele Perego
ISTI-CNR, Pisa, Italy
May 23, 2013 - Lisbon, Portugal
OAIR 2013, the 10th International Conference in the RIAO series
Outline
• Motivation
• Research Challenges
• Experiments and Results
• Conclusion and Future Work
A New Way of Search
Alice
Bob
Same Task!
“Reserving a hotel room in New York”
… and Search Engines?
• Roughly, they are still Web document
retrieval tools
– answering on a per-query basis
– ten blue links to relevant Web pages
Information Need Hierarchy
• Web Task: any (atomic) activity that a user
performs through Web search
– “find a recipe”, “book a flight”, “read news”,
etc.
– distinct users may use different queries to
accomplish the same Web task
• Web Mission: composition of Web tasks to
achieve complex goals
– distinct users may use different Web tasks to
accomplish the same Web mission
[Jones and Klinkner, CIKM '08]
Goals
• Mine Search Engine logs to detect Web
tasks
• Provide a user model for task-oriented
search
– from query-by-query to task-by-task
• Show how such a model can be used to
design a real-world application
– from query to task recommendation
The Big Picture
• Bottom-up, 2-stage clustering solution:
– User Task Discovery from “raw” queries
issued by the same user and stored in query
logs
– Collective Task Discovery from distinct User
Tasks
• Graph-based representation of Collective Tasks
User Task Discovery
• User Task
– set of possibly non-contiguous queries (multi-tasking),
issued by a single user, whose aim is to carry out
a specific Web task
• QC-HTC
– Graph-based query clustering solution
proposed in our previous work [Lucchese et al., WSDM '11]
– outperforms other techniques for session
boundary detection in query logs (e.g., QFG
[Boldi et al., CIKM '08])
User Task Discovery: QC-HTC
• Splits long-term user session into shorter time-
based sessions
• Builds a weighted undirected graph for each time-
based session
– nodes in each graph are the queries of a time-based
session
• Weight-links consecutive pairs of queries with their
content-based similarity:
– lexical (query character n-grams)
– semantic (query “wikification”)
• Merges any two sequential clusters if their first
(head) and last (tail) queries are similar enough
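The steps above can be sketched roughly as follows. This is a simplified illustration, not QC-HTC itself: it uses only a lexical similarity (Jaccard overlap of character trigrams), and the 30-minute gap, the 0.3 threshold, the "minimum over head/tail pairs" merge rule, and all function names are assumptions made for the example.

```python
from datetime import datetime, timedelta

def char_ngrams(query, n=3):
    """Set of character n-grams of a query (the whole query if shorter than n)."""
    text = query.lower()
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}

def lexical_sim(q1, q2, n=3):
    """Jaccard overlap of character n-grams (assumed lexical similarity)."""
    a, b = char_ngrams(q1, n), char_ngrams(q2, n)
    return len(a & b) / len(a | b)

def split_by_time(log, max_gap=timedelta(minutes=30)):
    """Split a time-sorted list of (timestamp, query) into time-based sessions."""
    sessions = []
    for ts, query in log:
        if sessions and ts - sessions[-1][-1][0] <= max_gap:
            sessions[-1].append((ts, query))
        else:
            sessions.append([(ts, query)])
    return sessions

def head_tail_clusters(queries, threshold=0.3):
    """Cluster the queries of one time-based session into candidate tasks."""
    # Step 1: sequential clusters, chaining consecutive similar queries.
    clusters = []
    for q in queries:
        if clusters and lexical_sim(clusters[-1][-1], q) >= threshold:
            clusters[-1].append(q)
        else:
            clusters.append([q])
    # Step 2: merge clusters whose head and tail queries are similar enough.
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                ends = [(x, y) for x in (a[0], a[-1]) for y in (b[0], b[-1])]
                if min(lexical_sim(x, y) for x, y in ends) >= threshold:
                    clusters[i] = a + b
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters
```

In this sketch an interleaved query such as "pizza recipe" ends up in its own cluster, while the hotel-related queries around it are merged into one non-contiguous task.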
Task-oriented User Sessions
Collective Task Discovery
• Collective Task
– group of distinct user tasks (i.e., distinct sets of
queries performed by several users) that represent
the same Web task
• Identify similar user tasks by clustering their
“bag of words” representations
– Each user query is a sentence
– Each user task is a concatenation of possibly
many sentences (i.e., a text document)
• T = {T1, …, TK} is the final set of Collective
Tasks
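As a minimal sketch of the "bag of words" idea, each user task can be turned into a term-count vector and compared with cosine similarity. Raw term counts and whitespace tokenization are assumptions here, and the function names are hypothetical.

```python
import math
from collections import Counter

def task_vector(queries):
    """Concatenate the queries of a user task and count its terms."""
    return Counter(" ".join(queries).lower().split())

def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(c * c for c in u.values())) * \
           math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

# Two users working on the same Web task with different queries.
alice = task_vector(["hotel new york", "cheap hotel manhattan"])
bob = task_vector(["new york hotel booking"])
carol = task_vector(["pizza recipe"])
```

Here `cosine(alice, bob)` is high while `cosine(alice, carol)` is zero, so a clustering algorithm would tend to place the first two user tasks in the same collective task.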
Mapping User to Collective Tasks
Task Relation Graph (TRG)
• Task-oriented model of user search behavior
• TRG(T, E, w, η) is a weighted directed graph
– nodes are the set of collective tasks T={T1, …, TK}
– edges E represent task relatedness
– w: T × T → [0,1] is the edge weighting function
– η is a weight threshold
• Ti and Tj are linked together iff w(Ti, Tj) > η
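The definition above maps naturally onto an adjacency dictionary. The sketch below assumes the weights w(Ti, Tj) have already been computed; `build_trg` and the example task labels are hypothetical.

```python
def build_trg(weights, eta):
    """Keep a directed edge Ti -> Tj only if w(Ti, Tj) > eta.

    weights: dict mapping (Ti, Tj) pairs to a weight in [0, 1].
    Returns a dict of dicts: trg[Ti][Tj] = w(Ti, Tj).
    """
    trg = {}
    for (ti, tj), w in weights.items():
        if w > eta:
            trg.setdefault(ti, {})[tj] = w
    return trg

# Illustrative weights only.
weights = {("T1", "T2"): 0.4, ("T2", "T1"): 0.05, ("T1", "T3"): 0.1}
trg = build_trg(weights, eta=0.1)
```

Because the comparison is strict, the edge with weight exactly 0.1 is dropped, and only T1 → T2 survives.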
User Task Discovery
Data Set: AOL 2006 Query Log
Results
Results were evaluated on a manually built ground truth of user tasks
[Lucchese et al., TOIS 2013]
Collective Task Discovery
Data Set: AOL 2006 Query Log
Training Set vs. Test Set
Clustering User Tasks
• Algorithm: Repeated Bisections vs.
Agglomerative
• Similarity Measure: Cosine similarity vs.
Pearson's correlation
• Objective Function: maximize intra-cluster
similarity
• Stop Criterion: choose heuristically the final
number K of clusters through the “elbow
method”
• We select K = 1,024
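The slide only says that K is chosen heuristically with the "elbow method". One common way to automate that choice, shown here purely as an assumption, is to pick the point farthest from the straight line joining the first and last points of the objective curve. The K values and scores below are made up for illustration.

```python
def elbow(ks, scores):
    """Return the K whose (K, score) point lies farthest from the chord
    joining the first and last points, after scaling both axes to [0, 1].

    Assumes ks and scores have distinct first and last values.
    """
    x0, x1 = ks[0], ks[-1]
    y0, y1 = scores[0], scores[-1]
    best_k, best_d = ks[0], -1.0
    for k, s in zip(ks, scores):
        x = (k - x0) / (x1 - x0)
        y = (s - y0) / (y1 - y0)
        d = abs(y - x)  # proportional to the distance from the line y = x
        if d > best_d:
            best_k, best_d = k, d
    return best_k

# Hypothetical intra-cluster similarity values for increasing K.
ks = [2, 4, 8, 16, 32]
scores = [0.10, 0.30, 0.50, 0.55, 0.58]
best = elbow(ks, scores)
```

With these made-up numbers the gain in similarity flattens after K = 8, which is the value the function returns.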
Results and Example
Results were evaluated on a manually built ground truth of collective tasks
[Lucchese et al., TOIS 2013]
Task Relation Graph
Building TRG: Task Relatedness
• Use the training set to compute w(Ti,Tj)
• Frequent Sequential Patterns
– η = support (i.e., probability) of Ti and Tj co-
occurring in a specified sequence: P(<Ti, Tj>)
– task order matters!
• Association Rules Ti → Tj
– η = support: P({Ti, Tj})
– η = confidence: P(Tj|Ti)
– task order doesn't matter!
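The support and confidence measures listed above can be estimated from training sessions, where each session is the ordered list of collective tasks a user went through. The sketch below is one plausible reading: probabilities are taken as fractions of sessions, and the function names and toy sessions are hypothetical.

```python
def seq_support(sessions, ti, tj):
    """P(<Ti, Tj>): fraction of sessions where ti occurs somewhere before tj."""
    hits = 0
    for s in sessions:
        if ti in s and tj in s[s.index(ti) + 1:]:
            hits += 1
    return hits / len(sessions)

def ar_support(sessions, ti, tj):
    """P({Ti, Tj}): fraction of sessions containing both tasks, in any order."""
    return sum(1 for s in sessions if ti in s and tj in s) / len(sessions)

def ar_confidence(sessions, ti, tj):
    """P(Tj | Ti): among sessions containing ti, the fraction also containing tj."""
    with_ti = sum(1 for s in sessions if ti in s)
    both = sum(1 for s in sessions if ti in s and tj in s)
    return both / with_ti if with_ti else 0.0

# Toy training sessions (illustrative only).
sessions = [
    ["hotel", "flight"],
    ["flight", "hotel"],
    ["hotel", "news"],
    ["news"],
]
```

On this toy data the sequential support of hotel followed by flight is 0.25, while the order-insensitive support of the pair is 0.5.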
Task Recommendation
• One out of many possible applications of
TRG
• A user is performing (or has just
performed) a task Ti
– more precisely, a user task that is similar to a known
collective task Ti
• Retrieve from TRG the set Rm(Ti) including
the m-top related nodes/tasks to Ti
– tasks in Rm(Ti) are those having the m highest
edge weights among all the nodes adjacent to Ti
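Retrieving Rm(Ti) amounts to ranking the outgoing edges of Ti by weight. A minimal sketch, assuming the TRG is stored as a dict of dicts (an assumption of this example, as are the names and weights):

```python
def recommend(trg, ti, m):
    """Return the m tasks adjacent to ti with the highest edge weights."""
    neighbours = trg.get(ti, {})
    ranked = sorted(neighbours.items(), key=lambda kv: kv[1], reverse=True)
    return [task for task, _ in ranked[:m]]

# Illustrative graph: trg[Ti][Tj] = w(Ti, Tj).
trg = {"hotel": {"flight": 0.4, "car rental": 0.3, "news": 0.1}}
top2 = recommend(trg, "hotel", 2)
```

A task with no outgoing edges, or one that is not in the graph at all, simply yields an empty recommendation list.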
Task Recommendation: Experiments
• Use TRGs built from training set to
generate task recommendations for the
test set
• Each original user session in the test set is split
into a prefix (first 1/3) and a suffix (remaining 2/3)
of user tasks
• Each user task is mapped to a candidate
collective task Tc (cosine similarity)
• From all the Tc in the prefix, retrieve the union
set of recommendations ∪ Rm(Tc)
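The protocol above can be sketched as follows, assuming each test session has already been mapped to a sequence of collective tasks. The rounding rule for the 1/3 split, the dict-of-dicts graph, and the final overlap check are assumptions of this example; the scoring actually used in the experiments is not spelled out in the text.

```python
def split_session(tasks):
    """Split an ordered session into a 1/3 prefix and a 2/3 suffix."""
    cut = max(1, len(tasks) // 3)  # assumed rounding: at least one prefix task
    return tasks[:cut], tasks[cut:]

def recommend_union(trg, prefix, m):
    """Union of the top-m recommendations Rm(Tc) over all prefix tasks Tc."""
    recs = set()
    for tc in prefix:
        ranked = sorted(trg.get(tc, {}).items(),
                        key=lambda kv: kv[1], reverse=True)
        recs.update(task for task, _ in ranked[:m])
    return recs

# Illustrative data only.
trg = {
    "hotel": {"car rental": 0.4, "news": 0.2},
    "flight": {"car rental": 0.5, "taxi": 0.3},
}
session = ["hotel", "flight", "car rental", "news", "weather", "maps"]
prefix, suffix = split_session(session)
recs = recommend_union(trg, prefix, m=2)
hits = recs & set(suffix)  # recommended tasks the user actually performed later
```

Here the prefix is the first two tasks, and two of the three recommended tasks appear in the suffix.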
Task Recommendation: Evaluation
Coverage is affected by the edge weighting function and by the threshold η
Task Recommendation: Results (top-1)
Task Recommendation: Results (top-3)
Task Recommendation: Examples
Task vs. Query Recommendation
• To show that task recommendation is
different from well-known query
recommendation
• TRG vs. QFG
– 83.8% of top-3 query suggestions generated by
QFG live in the same (collective) task
– Only 15.1% of top-3 query suggestions generated
by QFG lead to 2 separate (collective) tasks
• QFG is great if the user wants to stay in the
same task
• TRG allows the user to switch and jump to other
tasks
The “Take-Away” Message
• Web Search Engines should move from handling
user requests “query-by-query” to handling them
“task-by-task”
• New models for user search behavior are
needed: from Query Flow Graph to Task
Relation Graph
• Task Relation Graph may be exploited for
several applications (e.g., Task
Recommendation)
Future Work
• Advanced Task Representation
– E.g., linked data, as opposed to a simple
bag-of-queries
• Automatic Task Labeling (taxonomy of Web
tasks):
– Linking queries of collective tasks with referent
entities in a knowledge base
– Exploit entity categories to label the whole task
• Use TRG for other applications
– Task-based advertising, Mission discovery, etc.
• New SERP to render task-oriented results
Thank You!
Questions?
