Kosetsu Tsukuda and Masataka Goto
National Institute of Advanced Industrial Science and Technology (AIST)
Japan
ACM SIGIR 2020
Query/Task Satisfaction and
Grid-based Evaluation Metrics
Under Different Image Search Intents
Search intent in image search
People use web image search with various search intents,
from serious demands for study or work to just passing time
• Learn (query "Jupiter"): "I want to learn what Jupiter looks like."
• Entertain (query "Tom Cruise"): "I want to see photos of my favorite actor Tom Cruise."
Research goal
Investigate the influence of the user's intent on
query/task satisfaction and grid-based evaluation metrics
[Diagram: Intent → Query/task satisfaction; Intent → Grid-based evaluation metrics]
Query/task satisfaction
[Figure: under the intent "I want to learn about Jupiter," a task consists of the queries "Jupiter," "Jupiter's satellite," and "Jupiter europa" submitted over time t; each query has a query satisfaction and the whole task has a task satisfaction]
• Under a search intent, a user addresses a specific image search task
• A task consists of one or more queries submitted by the user
• The user gains satisfaction from each query and from the task as a whole
Relationship between intents, query satisfaction, and task satisfaction
• Task difficulty is expected to vary according to the user's search intent
• Query satisfaction is expected to influence task satisfaction
[Figure: task satisfaction under the Learn and Entertain intents]
RQ1
What are the characteristics of the query satisfaction
and the task satisfaction and what is the relationship
between them under different image search intents?
Answering this question enables us to
• understand user behavior at a deeper level
• reveal an appropriate approach to support the users according to their intent
Publicly available dataset
• Dataset developed through a field study [Wu et al. WSDM’19]
• 29 users, 447 tasks, and 1,758 queries
• Each user provided 5-level satisfaction feedback for each query and task
• Assessors annotated each task with one intent from each of four taxonomies
Taxonomy 1: Locate, Learn, Entertain
Taxonomy 2: Work&Study, Daily-life
Taxonomy 3: Specific, General
Taxonomy 4: Mental Image, Navigation
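The dataset structure described on this slide can be sketched as a pair of small record types. This is only an illustration: the class names, the field names, and the 1 to 5 coding of the 5-level feedback are assumptions, and the satisfaction values in the example are invented, not taken from the dataset.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Intent labels as listed on the slide; the dictionary keys are hypothetical names.
TAXONOMIES = {
    "taxonomy_1": ("Locate", "Learn", "Entertain"),
    "taxonomy_2": ("Work&Study", "Daily-life"),
    "taxonomy_3": ("Specific", "General"),
    "taxonomy_4": ("Mental Image", "Navigation"),
}


@dataclass
class Query:
    text: str
    satisfaction: int  # 5-level feedback; coding it as 1..5 is an assumption


@dataclass
class Task:
    intents: Dict[str, str]  # taxonomy name -> one label from that taxonomy
    satisfaction: int        # 5-level feedback for the whole task
    queries: List[Query] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Each task carries exactly one valid label per taxonomy.
        if set(self.intents) != set(TAXONOMIES):
            raise ValueError("a task needs one intent from each taxonomy")
        for taxonomy, label in self.intents.items():
            if label not in TAXONOMIES[taxonomy]:
                raise ValueError(f"{label!r} is not a label of {taxonomy}")


# Illustrative task modeled on the Jupiter example (satisfaction values invented).
example_task = Task(
    intents={
        "taxonomy_1": "Learn",
        "taxonomy_2": "Work&Study",
        "taxonomy_3": "General",
        "taxonomy_4": "Navigation",
    },
    satisfaction=4,
    queries=[
        Query("Jupiter", 3),
        Query("Jupiter's satellite", 2),
        Query("Jupiter europa", 5),
    ],
)
```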
Analysis
• Number of unsatisfied queries before the first satisfied query
• Influence of query satisfaction on task satisfaction
[Figure: a query sequence with 2 unsatisfied queries before the first satisfied query; plot of task satisfaction against the average (Avg) and maximum (Max) of query satisfaction for the Learn intent]
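The two quantities in this analysis can be sketched as follows. The slides do not say which satisfaction level counts as "satisfied," so the threshold of 4 on a 1 to 5 scale is an assumption, as are the function names and the example values.

```python
from statistics import mean
from typing import Dict, List, Optional

SATISFIED_THRESHOLD = 4  # assumption: levels 4 and 5 count as "satisfied"


def unsatisfied_before_first_satisfied(
    query_sats: List[int], threshold: int = SATISFIED_THRESHOLD
) -> Optional[int]:
    """Count the queries issued before the first satisfied query in a task.

    Returns None when no query in the task reaches the threshold.
    """
    for count, sat in enumerate(query_sats):
        if sat >= threshold:
            return count
    return None


def query_satisfaction_summary(query_sats: List[int]) -> Dict[str, float]:
    """Average and maximum query satisfaction within one task."""
    return {"avg": mean(query_sats), "max": max(query_sats)}


# Illustrative task: two unsatisfied queries, then a satisfied one.
example_sats = [2, 3, 5, 4]
example_count = unsatisfied_before_first_satisfied(example_sats)
example_summary = query_satisfaction_summary(example_sats)
```

The per-task average and maximum can then be compared with the task satisfaction, separately for each intent.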
Take-home messages for RQ1
• Users who have more demanding intents (Learn and Work&Study)
tend to have low query/task satisfaction
• Because such users struggle to get satisfying results from the first query
in a task, helping them formulate that first query is one
possible way to increase their satisfaction
• For users who want to learn something and look for general information
rather than specific information, submitting many satisfying queries
contributes to higher task satisfaction
• Therefore, it is beneficial to support such users' search process
even after they have found a desired image
Grid-based evaluation metric
• Xie et al. proposed a grid-based evaluation metric for image search [TheWebConf’19]
• The metric considers “middle bias,” which indicates that users tend to pay
more attention to images in the middle horizontal position on the SERP
• The metric is implemented by extending an evaluation metric for
general web search such as RBP (Rank-Biased Precision)
[Figure: middle bias on the image grid of a SERP for the query "Jupiter"]
RBP + Middle bias (MB) = RBP-MB
RBP: $M = \sum_{i=0}^{\infty} \prod_{j=0}^{i-1} C_j \,(1 - C_i) \sum_{j=0}^{i} R_j$
RBP-MB: $M_{MB} = \sum_{i=0}^{\infty} \prod_{j=0}^{i-1} f(c_i)\, C_j \,(1 - C_i) \sum_{j=0}^{i} R_j$
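As a rough illustration of the RBP expression on this slide, the sketch below reads C_j as the probability of continuing past position j and R_j as the relevance gain at position j. That reading is an interpretation the slide does not spell out. With every C_j set to a constant persistence p, as in RBP, the sum over stopping positions collapses to the sum of R_j p^j, and the code checks this numerically. The function names, the truncation horizon, and the example gains are illustrative assumptions; the middle-bias factor f is not modeled here.

```python
from typing import List


def expected_gain_stopping_form(gains: List[float], p: float, horizon: int = 2000) -> float:
    """Evaluate sum_i (prod_{j<i} C_j) (1 - C_i) sum_{j<=i} R_j with C_j = p.

    The infinite sum is truncated at `horizon`; positions beyond the
    ranked list contribute zero gain.
    """
    total = 0.0
    reach = 1.0       # prod_{j<i} C_j: probability of reaching position i
    cumulative = 0.0  # sum_{j<=i} R_j: gain collected up to position i
    for i in range(horizon):
        if i < len(gains):
            cumulative += gains[i]
        total += reach * (1.0 - p) * cumulative
        reach *= p
    return total


def expected_gain_closed_form(gains: List[float], p: float) -> float:
    """Closed form of the same quantity: sum_j R_j * p**j."""
    return sum(r * p ** j for j, r in enumerate(gains))


example_gains = [1.0, 0.0, 0.5]  # invented relevance gains
stopping_value = expected_gain_stopping_form(example_gains, p=0.8)
closed_value = expected_gain_closed_form(example_gains, p=0.8)
```

Rank-biased precision is usually reported with an additional normalizing factor of (1 - p), which rescales the value but does not change how rankings compare.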
RQ2
How do image search intents affect the
performance of the grid-based evaluation metric?
Answering this question is beneficial for improving the evaluation metric
Analysis
• The dataset includes a relevance score for each query-image pair
• We can compute RBP/RBP-MB for each query
• We compute Pearson’s correlation between a metric and query satisfaction
• A good evaluation metric is highly correlated with query satisfaction
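[Figure: a SERP grid for the query "Jupiter" annotated with per-image relevance scores, shown with the values 0.324 and 0.479 next to the RBP and RBP-MB labels; scatter plot of query satisfaction against RBP/RBP-MB for the Learn intent]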
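The correlation step can be sketched as below: group the queries by intent and compute Pearson's correlation between the metric score and the query satisfaction within each group. The record layout (intent, metric score, query satisfaction) and the function names are assumptions made for illustration, and the example records are invented.

```python
from math import sqrt
from typing import Dict, Iterable, List, Tuple


def pearson_r(xs: List[float], ys: List[float]) -> float:
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


def correlation_by_intent(
    records: Iterable[Tuple[str, float, float]]
) -> Dict[str, float]:
    """records: (intent, metric score, query satisfaction), one per query."""
    grouped: Dict[str, Tuple[List[float], List[float]]] = {}
    for intent, score, satisfaction in records:
        scores, sats = grouped.setdefault(intent, ([], []))
        scores.append(score)
        sats.append(satisfaction)
    return {intent: pearson_r(s, q) for intent, (s, q) in grouped.items()}


# Invented records, only to show the call shape.
example_records = [
    ("Learn", 0.2, 2), ("Learn", 0.5, 3), ("Learn", 0.9, 5),
    ("Locate", 0.3, 4), ("Locate", 0.6, 3), ("Locate", 0.8, 5),
]
example_correlations = correlation_by_intent(example_records)
```

Running the same routine once with RBP scores and once with RBP-MB scores gives one pair of coefficients per intent, which is the comparison this analysis makes.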
Take-home messages for RQ2
• When users want to learn something or find images for daily life, or when
users know what the image content looks like before submitting a query,
it is effective to incorporate the middle bias behavior into the evaluation metric
• For other intents, there is still room to improve the evaluation metric,
for example by developing intent-aware metrics.
Pearson’s correlation between evaluation metrics and query satisfaction (*: p < 0.01)
Intent         RBP     RBP-MB
Locate         0.304   0.304
Learn          0.401   0.429*
Entertain      0.360   0.379
Work&Study     0.299   0.299
Daily-life     0.334   0.372*
Specific       0.389   0.399
General        0.272   0.286
Mental Image   0.377   0.411*
Navigation     0.314   0.319
