Two-layered Summaries for Mobile Search:
Does the Evaluation Measure Reflect User Preferences?
Makoto P. Kato (Kyoto U.), Tetsuya Sakai (Waseda U.),
Takehiro Yamamoto (Kyoto U.), Virgil Pavlu (Northeastern U.),
and Hajime Morita (Kyoto U.)
MOTIVATION AND TASK
IR Systems in Ten-Blue-Link Paradigm
Enter query
Click SEARCH button
Scan ranked list of URLs
Click URL
Read URL contents
Get all desired information
Long way to get all desired information
MobileClick System
Enter query
Click SEARCH button
Get all desired information
Go beyond the "ten-blue-link" paradigm, and tackle
information retrieval rather than document retrieval
LCD is better in terms of the weight, size and
energy saving. OLED shows a better black
color, a faster response speed, and a wider
view angle.
Advantage of OLED
Advantage of LCD
Task: Given a search query,
return a two-layered textual output
System output
OLED LCD difference
Phone: 046-223-3636. Fax: 046-223-3630. Address: 118-1 Nurumizu, Atsugi, 243-8551. Email: soumu@shonan-atsugi.jp.
Visiting hours: general ward Mon-Fri 15-20; Sat & Holidays 13-20 / Intensive Care Unit (ICU) 11-11:30, 15:30, 19-19:30.
Skip
• Given a query, a set of iUnits, and a set of intents,
generate a two-layered summary
iUnit Summarization Subtask at NTCIR-12
Input: Query
NTCIR
Input: iUnit set
– A series of evaluation workshops
– Designed to enhance IA research
– …
Input: Intents
– News
– Schedule
– …
Output: Two-layered summary
1st layer
The NTCIR Workshop is a series of evaluation workshops designed to enhance research in information access technologies including information retrieval, summarization, extraction, question answering, etc.
News
Schedule
Tasks
2nd layer
20/Jan./2016: Task Registration Due
06/Jan./2016: Document Set Release
Jan.-May/2016: Dry Run
Mar.-July/2016: Formal Run
01/Aug./2016: Evaluation Results Due
01/Aug./2016: Task overview release
15/Sep./2016: Paper submission Due
01/Nov./2016: All paper Due
09-12/Dec./2016: NTCIR-11 Conference
M-measure (0.5 in this example): Evaluation metric designed for mobile information access
Challenge: Lay out iUnits so that any type of user can be immediately satisfied
Challenge
Two-layered Summary in Action
Does the Evaluation Measure
Reflect User Preferences?
Research Question Addressed in This Work
                                                A      B
M-measure                                       0.5    0.4
User preference (# of users who prefer A (B))   10     4
Which is higher? 0.5 > 0.4, so A > B according to M-measure
Which is better? 10 > 4, so A > B according to user preference
Same?
DATA
Overview of Data
[Figure: data pipeline. Queries (e.g., "napoleon") are issued to Web search to obtain Documents; Extraction yields iUnits (e.g., "Born on the island of Corsica", "Defeated at the Battle of Waterloo", "Established legal equality and religious toleration an innovator"); Clustering yields Intents (e.g., Achievement, Skill, Career). These are given as Input to iUnit summarization.]
• Queries
– 100 English/Japanese queries
– Most of which were ambiguous/underspecified
– Selected from four categories:
celebrity, location, definition, and QA (similar to NTCIR 1CLICK-2)
• Documents
– 500 commercial search engine results for each query
from which iUnits were extracted
Queries and Documents
Examples
CELEBRITY      LOCATION                DEFINITION       QA
hulk hogan     bank adelanto           bitcoin          what is mirror made of
bruno mars     cafe killeen            divers disease   how to cook coleslaw
sharon stone   cincinnati art museum   windows 7        role of animal tail
• Definition
– Atomic information pieces relevant to a given query
• The number of iUnits
– 2,317 (23.8 iUnits per query) for English
– 4,169 (41.7 iUnits per query) for Japanese
iUnits
Examples of iUnits for query “Napoleon”
– Born on the island of Corsica
– General of the Army of Italy
– Defeated at the Battle of Waterloo
– One of the most controversial political figures
– Won at the Battle of Wagram
– Established legal equality and religious toleration an innovator
– Baptised as a Catholic
– Absent during Peninsular War
– Cut off European trade with Britain
• An intent can be defined as
– A specific interpretation of an ambiguous query
(“Mac OS” and “car brand” for “jaguar”), or
– An aspect of a faceted query
(“windows 8” and “windows 10” for “windows”)
• Obtained by clustering iUnits
Intents
[Figure: Clustering maps iUnits (e.g., "Born on the island of Corsica", "Defeated at the Battle of Waterloo", "Established legal equality and religious toleration an innovator", "Absent during Peninsular War") to Intents (Achievement, Skill, Career).]
EVALUATION
• Importance of iUnits in terms of an intent
• Intent probability P(i|q)
– Probability of having intent i for a given query q
Per-intent iUnit Importance and Intent Probability
In terms of intent “Definition”
iUnit                                 Importance
A series of evaluation workshops      5
Task Registration Due 20/Jun./2016    3

In terms of intent “Schedule”
iUnit                                 Importance
A series of evaluation workshops      2
Task Registration Due 20/Jun./2016    5

Intent        Prob.
Definition    0.4
Schedule      0.3
Tasks         0.3
For details, see our MobileClick-2 overview paper
• Consider single-layered summary evaluation
• U-measure [Sakai and Dou. SIGIR2013]
– Higher if more important iUnits appear earlier
Evaluation of iUnit Summarization (Single-layer Case)
Summary: u1 u2 u3
Trailtext (reading path): u1 (10 chars), u2 (5 chars), u3 (10 chars)
Create a list of iUnits by assuming that users read text from left to right, from top to bottom
U-measure
$$U = \sum_{r=1}^{|\mathbf{t}|} G(u_r) \left(1 - \frac{\mathrm{pos}(u_r)}{L}\right)$$
u_r: r-th iUnit
G(u): importance of u
pos(u): offset of u from the beginning
L: patience parameter
|t|: number of iUnits in the trailtext
For the trailtext above:
U = G(u1)(1-10/L) + G(u2)(1-15/L) + G(u3)(1-25/L)
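The U-measure computation on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `u_measure`, the representation of a trailtext as (importance, length) pairs, and the importance values 3, 1, 2 are hypothetical, and pos(u) is taken to be the offset of the end of the iUnit, which matches the offsets 10, 15, and 25 in the slide's example.

```python
def u_measure(trailtext, L):
    """Sum of G(u_r) * (1 - pos(u_r) / L) over the iUnits of a trailtext.

    trailtext: list of (importance, length_in_chars) pairs in reading order.
    Assumption: pos(u) is the offset of the END of the iUnit.
    """
    offset = 0
    total = 0.0
    for gain, length in trailtext:
        offset += length              # pos(u_r): characters read so far
        total += gain * (1 - offset / L)
    return total


# Hypothetical importance values; lengths follow the slide (10, 5, 10 chars).
example = [(3, 10), (1, 5), (2, 10)]
print(u_measure(example, L=100))  # 3*0.90 + 1*0.85 + 2*0.75
```

With a larger L the discount shrinks, so iUnits placed late in the summary lose less of their importance.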
• M-measure
– Expectation of U-measure over multiple trailtexts
$$M = \sum_{\mathbf{t}} P(\mathbf{t})\, U(\mathbf{t})$$
1. Generate trailtexts by assuming that
– Users read a summary from the top of the first layer
– Users click on an intent if they are interested in it
M-measure
𝑃(𝐭): probability of trailtext t
𝑈(𝐭): U-measure of trailtext t
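[Figure: a two-layered summary with iUnits u1 to u4 and link l1, and its two trailtexts, u1 u2 u3 u4 and u1 u2 u3, one for users interested in Intent 1 (probability P(i1|q)) and one for users interested in Intent 2 (probability P(i2|q)).]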
2. Compute the expectation of U-measure
Evaluation of iUnit Summarization (Two-layer Case)
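[Figure: a two-layered summary with iUnits u1 to u6 and links l1 and l2.]
Trailtext (t) (reading path)    U       P(t)
t1: u1 u2 u3 u4 u5              0.44    P(t1) = P(i1|q) = 0.75
t2: u1 u2 u3 u6                 0.12    P(t2) = P(i2|q) = 0.25
P(t2) = P(i2|q) because trailtext t2 is read by users interested in i2
M-measure
$$M = \sum_{\mathbf{t}} P(\mathbf{t})\, U(\mathbf{t}) = 0.75 \times 0.44 + 0.25 \times 0.12 = 0.36$$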
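The expectation in this example can be checked with a short Python sketch. The function name `m_measure` and the list-of-pairs input are illustrative assumptions; the probabilities and U values are the ones shown on the slide.

```python
def m_measure(trailtexts):
    """Expectation of U-measure over trailtexts.

    trailtexts: iterable of (P(t), U(t)) pairs.
    """
    return sum(p * u for p, u in trailtexts)


# Values from the slide: P(t1) = 0.75, U(t1) = 0.44; P(t2) = 0.25, U(t2) = 0.12.
m = m_measure([(0.75, 0.44), (0.25, 0.12)])
print(round(m, 2))  # 0.36
```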
EXPERIMENT
Pairwise Comparison
All possible pairs of 7 summaries for 25 queries
were presented to about 14 users
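As a rough sense of scale, the number of comparisons can be counted as follows. This sketch assumes that each of the 25 queries has its own 7 summaries and that every unordered pair is shown once per query; the labels `S1` to `S7` are hypothetical.

```python
from itertools import combinations

summaries = [f"S{i}" for i in range(1, 8)]   # 7 hypothetical summary labels
pairs = list(combinations(summaries, 2))     # all unordered pairs

print(len(pairs))        # 21 pairs per query
print(len(pairs) * 25)   # 525 pairs over 25 queries
```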
• Users were asked to select one of the following:
– the left one is better,
– the right one is better,
– equally good, or
– equally bad
• Criteria:
(1) How much useful information you can get
from the summary, and
(2) How quickly you can get useful information
from the summary
Instruction in Pairwise Comparison
• 𝑳 of U-measure in M-measure
– $U = \sum_{r=1}^{|\mathbf{t}|} G(u_r) \max\left(0,\ 1 - \frac{\mathrm{pos}(u_r)}{L}\right)$
– 𝐿 is a patience parameter that controls how the
gain of iUnits decreases as the user reads the text
• Simple variants of M-measure
– Use only first layer
– Use only second layer
– Use a uniform distribution for P(i|q)
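The uniform-distribution variant can be illustrated by reusing the U values 0.44 and 0.12 and the intent probabilities 0.75 and 0.25 from the two-layer example. The function name `m_measure` and the intent keys `i1` and `i2` are hypothetical; this is a sketch of the idea, not the authors' implementation.

```python
def m_measure(u_by_intent, prob_by_intent):
    """Expected U-measure, with one trailtext per intent."""
    return sum(prob_by_intent[i] * u for i, u in u_by_intent.items())


u_by_intent = {"i1": 0.44, "i2": 0.12}

# Original setting: estimated intent probabilities P(i|q).
original = m_measure(u_by_intent, {"i1": 0.75, "i2": 0.25})

# Variant: every intent is equally likely.
uniform_p = {i: 1 / len(u_by_intent) for i in u_by_intent}
uniform = m_measure(u_by_intent, uniform_p)

print(round(original, 2), round(uniform, 2))  # 0.36 0.28
```

In this toy case the uniform variant lowers the score because it down-weights the trailtext of the more probable intent.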
Settings of M-measure
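[Figure: the decay factor 1 - pos(u_r)/L plotted against pos(u_r) for L = 100 and L = 200 (x-axis ticks at 100 and 200), shown next to a summary with iUnits u1 to u4 and link l1.]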
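The effect of the patience parameter can be seen by evaluating the clamped decay factor max(0, 1 - pos/L) at a few offsets. The function name `decay` and the sample offsets are illustrative choices.

```python
def decay(pos, L):
    """Discount applied to an iUnit whose offset is pos, clamped at zero."""
    return max(0.0, 1.0 - pos / L)


for pos in (0, 50, 100, 150, 200):
    print(pos, decay(pos, 100), decay(pos, 200))
```

With L = 100 an iUnit at offset 100 or later contributes nothing, while with L = 200 the same iUnit at offset 100 still keeps half of its importance.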
Interpretation of Results
[Figure: scatter plot. x-axis: Diff. of M-measure (M(A) - M(B)); y-axis: (Num. of votes for A) / (Total num. of votes).]
Each dot represents a pair of systems (A, B) for a particular query
Agree: A is better by both M-measure and user preference, or B is better by both
Disagree: M-measure and user preference favor different systems
Agreement = (#dots in Agree) / (#dots)
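The agreement ratio can be sketched as follows. Several details are assumptions rather than facts from the slides: a dot counts as "A is better" for users when A's vote share exceeds 0.5, ties (a zero difference or a share of exactly 0.5) fall on the B side, and the sample dots are hypothetical except the first, which uses the earlier example (M-measure 0.5 vs. 0.4, votes 10 vs. 4).

```python
def agreement(dots):
    """Fraction of dots where M-measure and users prefer the same system.

    dots: list of (M(A) - M(B), votes_for_A / total_votes) pairs.
    Assumption: ties are treated as "B is better".
    """
    agree = 0
    for m_diff, share_a in dots:
        measure_prefers_a = m_diff > 0
        users_prefer_a = share_a > 0.5
        if measure_prefers_a == users_prefer_a:
            agree += 1
    return agree / len(dots)


dots = [
    (0.5 - 0.4, 10 / 14),  # both prefer A: agree
    (-0.05, 0.30),         # both prefer B: agree
    (0.20, 0.40),          # measure prefers A, users prefer B: disagree
    (-0.10, 0.60),         # measure prefers B, users prefer A: disagree
]
print(agreement(dots))  # 0.5
```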
Experimental Results for Different Patience Parameters
[Figure: agreement plotted against the patience parameter L for English and Japanese; axis ticks: 93.75, 750, 6000, 24000 and 31.25, 125, 2000, 8000.]
LOW agreement for LOW patience parameter (L=93.75)
HIGH agreement for HIGH patience parameter (L=24000)
Agreement is high (70-74%) for both languages
Experimental Results for Simple Variants of M-measure
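[Figure: agreement of the original M-measure ("Original") and its simple variants, annotated "Worse", "Slightly worse", and "Close"; numeric labels: 24000, 2000.]
Use of the second layer and intent probability
improves the agreement (but the first layer doesn’t)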
• Possible explanations include
– The quality of the second layer correlates with the
quality of the whole summary
– Users decided the quality of the summary mainly
based on the second layer
• We asked the users to look at the second layer in the
assessment
Why did only the 2nd layer correlate well with the user pref.?
• Conclusions
– Proposed M-measure
• A special case of intent-aware U-measure for two-layered summarization
– Measured the agreement between
M-measure and user preferences
• Agreement was high (70-74%)
• Future work
– Error analysis
– Address “why did only the second layer correlate
well with the user preferences?”
Conclusions and Future Work

Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 

Recently uploaded (20)

UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 

Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect User Preferences? (at EVIA 2016)

  • 1. Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect User Preferences? Makoto P. Kato (Kyoto U.), Tetsuya Sakai (Waseda U.), Takehiro Yamamoto (Kyoto U.), Virgil Pavlu (Northeastern U.), and Hajime Morita (Kyoto U.)
  • 3. IR Systems in Ten-Blue-Link Paradigm: (1) Enter query, (2) Click SEARCH button, (3) Scan ranked list of URLs, (4) Click URL, (5) Read URL contents, (6) Get all desired information. Long way to get all desired information.
  • 4. MobileClick System: (1) Enter query, (2) Click SEARCH button, (3) Get all desired information. Go beyond the "ten-blue-link" paradigm, and tackle information retrieval rather than document retrieval. Task: Given a search query, return a two-layered textual output. System output for "OLED LCD difference": "LCD is better in terms of weight, size and energy saving. OLED shows a better black color, a faster response speed, and a wider view angle.", with the links "Advantage of OLED" and "Advantage of LCD". Further example text: "Phone: 046-223-3636. Fax: 046-223-3630. Address: 118-1 Nurumizu, Atsugi, 243-8551. Email: soumu@shonan-atsugi.jp. Visiting hours: general ward Mon-Fri 15-20; Sat&Holidays 13-20 / Intensive Care Unit (ICU) 11-11:30, 15:30, 19-19:30." Skip
  • 5. iUnit Summarization Subtask at NTCIR-12
      • Given a query, a set of iUnits, and a set of intents, generate a two-layered summary
      • Input: Query (NTCIR)
      • Input: iUnit set (A series of evaluation workshops; Designed to enhance IA research; …)
      • Input: Intents (News; Schedule; …)
      • Output: Two-layered summary
          – First layer: "The NTCIR Workshop is a series of evaluation workshops designed to enhance research in information access technologies including information retrieval, summarization, extraction, question answering, etc.", with the links News, Schedule, and Tasks
          – 2nd layer: 20/Jan./2016: Task Registration Due; 06/Jan./2016: Document Set Release; Jan.-May/2016: Dry Run; Mar.-July/2016: Formal Run; 01/Aug./2016: Evaluation Results Due; 01/Aug./2016: Task overview release; 15/Sep./2016: Paper submission Due; 01/Nov./2016: All paper Due; 09-12/Dec./2016: NTCIR-11 Conference
      • M-measure (0.5 in the example): evaluation metric designed for mobile information access
      • Challenge: lay out iUnits so that any type of user can be immediately satisfied
  • 7. Research Question Addressed in This Work: Does the Evaluation Measure Reflect User Preferences?
      • M-measure (which is higher?): 0.5 for A and 0.4 for B; 0.5 > 0.4, so A > B
      • User preference, i.e., the number of users who prefer A (B) (which is better?): 10 for A and 4 for B; 10 > 4, so A > B
      • Are the two orderings the same?
  • 9. Overview of Data: Queries (e.g., "napoleon"), then Web search, then Documents, then Extraction, then iUnits (e.g., "Born on the island of Corsica", "Defeated at the Battle of Waterloo", "Established legal equality and religious toleration", "an innovator"), then Clustering, then Intents (e.g., Achievement, Skill, Career). The iUnits and intents are inputs to iUnit summarization.
  • 10. Queries and Documents
      • Queries
          – 100 English/Japanese queries
          – Most of which were ambiguous/underspecified
          – Selected from five categories: celebrity, location, definition, and QA (similar to NTCIR 1CLICK-2)
      • Documents
          – 500 commercial search engine results for each query, from which iUnits were extracted
      • Examples
          – CELEBRITY: hulk hogan; bruno mars; sharon stone
          – LOCATION: bank adelanto; cafe killeen; cincinnati art museum
          – DEFINITION: bitcoin; divers disease; windows 7
          – QA: what is mirror made of; how to cook coleslaw; role of animal tail
  • 11. iUnits
      • Definition
          – Atomic information pieces relevant to a given query
      • The number of iUnits
          – 2,317 (23.8 iUnits per query) for English
          – 4,169 (41.7 iUnits per query) for Japanese
      • Examples of iUnits for query “Napoleon”
          – Born on the island of Corsica
          – General of the Army of Italy
          – Defeated at the Battle of Waterloo
          – One of the most controversial political figures
          – won at the Battle of Wagram
          – Established legal equality and religious toleration
          – an innovator
          – Baptised as a Catholic
          – Absent during Peninsular War
          – Cut off European trade with Britain
  • 12. Intents
      • An intent can be defined as
          – A specific interpretation of an ambiguous query (“Mac OS” and “car brand” for “jaguar”), or
          – An aspect of a faceted query (“windows 8” and “windows 10” for “windows”)
      • Obtained by clustering iUnits
          – Example: the iUnits “Born on the island of Corsica”, “Defeated at the Battle of Waterloo”, “Established legal equality and religious toleration”, “an innovator”, and “Absent during Peninsular War” are clustered into the intents Achievement, Skill, and Career
  • 14. Per-intent iUnit Importance and Intent Probability
      • Importance of iUnits in terms of an intent
          – In terms of intent “Definition”: “A series of evaluation workshops” has importance 5; “Task Registration Due 20/Jun./2016” has importance 3
          – In terms of intent “Schedule”: “A series of evaluation workshops” has importance 2; “Task Registration Due 20/Jun./2016” has importance 5
      • Intent probability P(i|q)
          – Probability of having intent i for a given query q
          – Example: Definition 0.4; Schedule 0.3; Tasks 0.3
      • For details, see our MobileClick-2 overview paper
  • 15. Evaluation of iUnit Summarization (Single-layer Case)
      • Consider single-layered summary evaluation
      • U-measure [Sakai and Dou. SIGIR2013]
          – Higher if more important iUnits appear earlier
      • Trailtext (reading path): create a list of iUnits by assuming that users read text from left to right, from top to bottom
      • $U = \sum_{r \ge 1} G(u_r) \left(1 - \frac{\mathrm{pos}(u_r)}{L}\right)$
          – $u_r$: $r$-th iUnit
          – $G(u)$: importance of $u$
          – $\mathrm{pos}(u)$: offset of $u$ from the beginning
          – $L$: patience parameter
      • Example: for the trailtext $u_1$ (10 chars), $u_2$ (5 chars), $u_3$ (10 chars), $U = G(u_1)(1 - 10/L) + G(u_2)(1 - 15/L) + G(u_3)(1 - 25/L)$
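The single-layer U-measure on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `u_measure`, the representation of a trailtext as (gain, length) pairs, the gain values, and L = 100 are all assumptions made for the example. `pos` is taken to be the end offset of each iUnit, which matches the slide's offsets 10, 15, and 25 for iUnits of 10, 5, and 10 characters.

```python
from typing import List, Tuple


def u_measure(trailtext: List[Tuple[float, int]], L: float) -> float:
    """Sum G(u_r) * (1 - pos(u_r) / L) over the iUnits of a trailtext.

    Each item is (gain, length_in_chars); both are hypothetical inputs.
    pos(u_r) is the offset of the end of u_r from the start of the text.
    """
    total = 0.0
    pos = 0
    for gain, length in trailtext:
        pos += length  # end offset of the current iUnit
        total += gain * (1.0 - pos / L)
    return total


# Slide layout: u1 (10 chars), u2 (5 chars), u3 (10 chars).
# The gains 3, 2, 1 and L = 100 are made-up values for illustration.
example = [(3.0, 10), (2.0, 5), (1.0, 10)]
print(u_measure(example, L=100))  # 3*(1-0.10) + 2*(1-0.15) + 1*(1-0.25)
```

Moving a high-gain iUnit later in the list increases its offset and lowers the total, which is the "higher if more important iUnits appear earlier" property stated on the slide.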
  • 16. M-measure
      • M-measure
          – Expectation of U-measure over multiple trailtexts
          – $M = \sum_{\mathbf{t}} P(\mathbf{t}) U(\mathbf{t})$
          – $P(\mathbf{t})$: probability of trailtext $\mathbf{t}$
          – $U(\mathbf{t})$: U-measure of trailtext $\mathbf{t}$
      1. Generate trailtexts by assuming that
          – Users read a summary from the top of the first layer
          – Users click on an intent if they are interested in it
      • Example: a user interested in Intent 1 (probability $P(i_1|q)$) follows the trailtext $u_1 u_2 u_3 u_4$; a user interested in Intent 2 (probability $P(i_2|q)$) follows the trailtext $u_1 u_2 u_3$
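  • 17. Evaluation of iUnit Summarization (Two-layer Case)
      2. Compute the expectation of U-measure
          – Trailtext (reading path) $\mathbf{t}_1 = u_1 u_2 u_3 u_4 u_5$: $U(\mathbf{t}_1) = 0.44$, $P(\mathbf{t}_1) = P(i_1|q) = 0.75$
          – Trailtext (reading path) $\mathbf{t}_2 = u_1 u_2 u_3 u_6$: $U(\mathbf{t}_2) = 0.12$, $P(\mathbf{t}_2) = P(i_2|q) = 0.25$, because trailtext $\mathbf{t}_2$ is read by users interested in $i_2$
          – M-measure: $M = \sum_{\mathbf{t}} P(\mathbf{t}) U(\mathbf{t}) = 0.36$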
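The expectation step can be checked with a few lines of Python. This is a sketch only: the function name `m_measure` and the (probability, U-measure) pair representation are assumptions, while the numbers are the ones shown on the slide.

```python
from typing import List, Tuple


def m_measure(trailtexts: List[Tuple[float, float]]) -> float:
    """Expected U-measure: sum of P(t) * U(t) over the trailtexts.

    Each item is (probability_of_trailtext, u_measure_of_trailtext).
    """
    return sum(p * u for p, u in trailtexts)


# Values from the slide: P(t1) = 0.75, U(t1) = 0.44; P(t2) = 0.25, U(t2) = 0.12.
m = m_measure([(0.75, 0.44), (0.25, 0.12)])
print(round(m, 2))  # 0.75 * 0.44 + 0.25 * 0.12 = 0.33 + 0.03
```

The result reproduces the M-measure value of 0.36 shown on the slide.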
  • 19. Pairwise Comparison: All possible pairs of 7 summaries for 25 queries were presented to about 14 users.
  • 20. Instruction in Pairwise Comparison
      • Users were asked to select one of four options: the left one is better, the right one is better, equally good, or equally bad
      • Criteria: (1) how much useful information you can get from the summary, and (2) how quickly you can get useful information from the summary
  • 21. Settings of M-measure
      • $L$ of U-measure in M-measure
          – $U = \sum_{r \ge 1} G(u_r) \max\left(0, 1 - \frac{\mathrm{pos}(u_r)}{L}\right)$
          – $L$ is a patience parameter that controls how the gain of iUnits decreases as the user reads the text
          – [Figure: $1 - \mathrm{pos}(u_r)/L$ plotted against $\mathrm{pos}(u_r)$ for $L = 100$ and $L = 200$]
      • Simple variants of M-measure
          – Use only first layer
          – Use only second layer
          – Use a uniform distribution for $P(i|q)$
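The effect of the patience parameter can be illustrated with a short Python sketch. The function names `decay` and `u_measure_clipped`, and the gains and offsets below, are hypothetical; only the discount max(0, 1 - pos/L) and the values L = 100 and L = 200 come from the slide.

```python
def decay(pos: float, L: float) -> float:
    """Position-based discount max(0, 1 - pos / L) from the slide."""
    return max(0.0, 1.0 - pos / L)


def u_measure_clipped(gains_and_positions, L):
    """Sum G(u_r) * max(0, 1 - pos(u_r) / L); inputs are (gain, pos) pairs."""
    return sum(g * decay(pos, L) for g, pos in gains_and_positions)


# Hypothetical iUnits: (gain, offset in characters).
units = [(3.0, 50), (2.0, 150), (1.0, 250)]
for L in (100, 200):
    discounts = [decay(pos, L) for _, pos in units]
    print(L, discounts, u_measure_clipped(units, L))
```

With the smaller L, every iUnit past offset 100 contributes nothing, so the score is decided by the very beginning of the text; a larger L lets later iUnits count.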
  • 22. Interpretation of Results
      • [Figure: scatter plot with axes (Num. of votes for A) / (Total num. of votes) and Diff. of M-measure (M(A) - M(B)); regions are labeled "A is better" or "B is better" according to user preference and according to M-measure]
      • Each dot represents a pair of systems (A, B) for a particular query
      • A dot is in an Agree region when user preference and M-measure favor the same system, and in a Disagree region otherwise
      • Agreement = (#dots in Agree) / (#dots)
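The agreement ratio can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' evaluation script: the function name `agreement`, the 0.5 threshold on the vote ratio, the treatment of ties as disagreement, and the sample dots are all made up for the example.

```python
from typing import List, Tuple


def agreement(pairs: List[Tuple[float, float]]) -> float:
    """Fraction of dots where M-measure and user preference agree.

    Each item is (m_diff, vote_ratio):
      m_diff     = M(A) - M(B)
      vote_ratio = (votes for A) / (total votes)
    A dot counts as "Agree" when both favor A (m_diff > 0, ratio > 0.5)
    or both favor B (m_diff < 0, ratio < 0.5). Ties count as
    disagreement here, which is an assumption of this sketch.
    """
    if not pairs:
        raise ValueError("need at least one pair")
    agree = sum(
        1
        for m_diff, ratio in pairs
        if (m_diff > 0 and ratio > 0.5) or (m_diff < 0 and ratio < 0.5)
    )
    return agree / len(pairs)


# Made-up dots for illustration only.
dots = [(0.10, 10 / 14), (-0.05, 4 / 14), (0.02, 3 / 14), (-0.20, 9 / 14)]
print(agreement(dots))  # 2 of the 4 dots fall in the Agree regions
```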
  • 23. Experimental Results for Different Patience Parameters
      • [Figure: agreement for different values of the patience parameter L, English and Japanese; L values shown: 93.75, 750, 6000, 24000 and 31.25, 125, 2000, 8000]
      • LOW agreement for LOW patience parameter (L=93.75)
      • HIGH agreement for HIGH patience parameter (L=24000)
      • Agreement is high (70-74%) for both languages
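  • 24. Experimental Results for Simple Variants of M-measure
      • [Figure: agreement of the original M-measure and its simple variants, with L = 24000 and L = 2000; annotations: Original, Worse, Slightly worse, Close]
      • Use of the second layer and intent probability improves the agreement (but the first layer doesn’t)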
  • 25. Why did only the 2nd layer correlate well with the user pref.?
      • Possible explanations include
          – The quality of the second layer correlates with the quality of the whole summary
          – Users decided the quality of the summary mainly based on the second layer
              • We asked the users to look at the second layer in the assessment
  • 26. Conclusions and Future Work
      • Conclusions
          – Proposed M-measure
              • A special case of intent-aware U-measure for two-layered summarization
          – Measured the agreement between M-measure and user preferences
              • Agreement was high (70-74%)
      • Future work
          – Error analysis
          – Address “why did only the second layer correlate well with the user preferences?”