Time-aware Evaluation of Cumulative Citation Recommendation Systems
Krisztian Balog
University of Stavanger
Laura Dietz, Jeffrey Dalton
CIIR, University of Massachusetts, Amherst
SIGIR 2013 workshop on Time-aware Information Access (#TAIA2013) | Dublin, Ireland, Aug 2013
CCR @TREC 2012 KBA
Evaluation methodology
Target entity: Aharon Barak
[Figure: ranked system output for the target entity; a score cutoff splits the list into positive and negative predictions]
urlname       stream_id                                    score
Aharon_Barak  1328055120'f6462409e60d2748a0adef82fe68b86d  1000
Aharon_Barak  1328057880'79cdee3c9218ec77f6580183cb16e045  500
Aharon_Barak  1328057280'80fb850c089caa381a796c34e23d9af8  500
Aharon_Barak  1328056560'450983d117c5a7903a3a27c959cc682a  480
Aharon_Barak  1328056560'450983d117c5a7903a3a27c959cc682a  450
Aharon_Barak  1328056260'684e2f8fc90de6ef949946f5061a91e0  430
Aharon_Barak  1328056560'be417475cca57b6557a7d5db0bbc6959  428
Aharon_Barak  1328057520'4e92eb721bfbfdfa0b1d9476b1ecb009  428
Aharon_Barak  1328058660'807e4aaeca58000f6889c31c24712247  380
Aharon_Barak  1328060040'7a8c209ad36bbb9c946348996f8c616b  380
Aharon_Barak  1328063280'1ac4b6f3a58004d1596d6e42c4746e21  375
Aharon_Barak  1328064660'1a0167925256b32d715c1a3a2ee0730c  315
Aharon_Barak  1328062980'7324a71469556bcd1f3904ba090ab685  263
CCR @TREC 2012 KBA
- Cumulative citation recommendation
- Filter a time-ordered corpus for documents that are highly relevant to a predefined set of entities
- For each entity, provide a ranked list of documents based on their “citation-worthiness”
Results are evaluated in a single batch
(temporal aspects are not considered)
Evaluation metrics are set-based
(using a confidence cut-off)
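A set-based evaluation with a confidence cut-off can be sketched roughly as follows. This is only an illustration, not the official KBA scorer: the function name `set_metrics_at_cutoff`, the `(score, is_relevant)` input format, and the relevance labels in the usage lines are assumptions made here, and recall is computed only over the documents passed in.

```python
from typing import Iterable, Tuple


def set_metrics_at_cutoff(
    scored: Iterable[Tuple[float, bool]], cutoff: float
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) when documents scoring >= cutoff are accepted.

    Assumption: `scored` contains every judged document for the entity,
    so recall is measured against the relevant documents in this list.
    """
    scored = list(scored)
    accepted = [rel for score, rel in scored if score >= cutoff]
    true_pos = sum(accepted)
    total_rel = sum(rel for _, rel in scored)
    precision = true_pos / len(accepted) if accepted else 0.0
    recall = true_pos / total_rel if total_rel else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels; the slide does not state which documents are relevant.
run = [(1000, True), (500, True), (480, False), (430, True), (263, False)]
precision, recall, f1 = set_metrics_at_cutoff(run, cutoff=450)
print(precision, recall, f1)
```

Because the result depends only on which documents clear the cut-off, neither the order within the accepted set nor the time at which each document arrived has any effect on the score.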
Aims
- Develop a time-aware evaluation paradigm for streaming collections
- Capture how retrieval effectiveness changes over time
- Handle ground truth that is bursty in nature
- Accommodate various underlying user models
- Test the ideas on CCR
Overview
1. Slicing time
2. Measuring slice relevance
3. Aggregating slice relevance
[Figure: timeline divided into slices, annotated with "Slice importance" and example values .87 and .65]
Slicing time
- Simplifying assumptions
- Slices are non-overlapping
- Unconcerned about slices that don’t contain any relevant documents
(A) Uniform slicing
- Slices of equal length
(B) Non-uniform slicing
- Slices of varying length
[Figure: #relevant documents over time, sliced by (A) uniform and (B) non-uniform slicing; time point t_i marked]
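Uniform slicing (A) can be sketched as bucketing document timestamps into fixed-length, non-overlapping intervals. The helper name `uniform_slices`, the use of integer epoch seconds, and the `DAY`/`WEEK` constants are assumptions for illustration; the slides do not prescribe an implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

DAY = 24 * 60 * 60   # seconds, for daily slicing
WEEK = 7 * DAY       # seconds, for weekly slicing


def uniform_slices(
    timestamps: Iterable[int], start: int, length: int
) -> Dict[int, List[int]]:
    """Group timestamps into non-overlapping slices of equal length.

    Slice i covers the half-open interval
    [start + i * length, start + (i + 1) * length).
    Empty slices are simply absent from the result.
    """
    slices: Dict[int, List[int]] = defaultdict(list)
    for ts in timestamps:
        if ts < start:
            raise ValueError("timestamp precedes the start of the stream")
        slices[(ts - start) // length].append(ts)
    return dict(slices)


# Toy timestamps with a slice length of 10 time units.
print(uniform_slices([1, 5, 12, 35], start=0, length=10))
```

Half-open intervals keep the slices non-overlapping, and leaving empty slices out of the result matches the simplifying assumption that slices without relevant documents are ignored.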
Measuring slice relevance
- Ranked list of documents within a given slice
- Evaluation metric
- Standard IR metrics
- MAP, R-Prec, NDCG
$d = \langle d_1, \ldots, d_n \rangle$
$m(d_i, q)$
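As one possible instance of a per-slice metric, average precision over a single ranked list with binary relevance can be sketched as below. The function name `average_precision` and the use of string document ids are assumptions; any standard IR metric could take its place.

```python
from typing import Sequence, Set


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant ids.

    Assumes binary relevance and no duplicate ids in `ranking`.
    Returns 0.0 when there are no relevant documents.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant)


# Relevant documents at ranks 1 and 3: (1/1 + 2/3) / 2
print(average_precision(["a", "b", "c", "d"], {"a", "c"}))
```

Applying the function to the ranking restricted to one time slice gives a score for that slice alone.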
Aggregating slice relevance
- Probabilistic formulation to estimate the likelihood of relevance
$$P(r = 1 \mid d, q, m) = \sum_{i \in I} P(r = 1 \mid d_i, q, i) \, P(i \mid q)$$
- Slice-based relevance: $P(r = 1 \mid d_i, q, i) \approx m(d_i, q)$
- Slice importance: $P(i \mid q)$
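The aggregation is a weighted sum of per-slice metric scores, with slice importance as the weights. A minimal sketch follows, assuming slices are keyed by integer index and that the importance values already form a probability distribution; the function name `aggregate` is hypothetical.

```python
from typing import Mapping


def aggregate(
    slice_scores: Mapping[int, float], importance: Mapping[int, float]
) -> float:
    """Compute sum over slices i of m(d_i, q) * P(i | q).

    `slice_scores[i]` stands in for m(d_i, q) and `importance[i]` for P(i | q).
    """
    if abs(sum(importance.values()) - 1.0) > 1e-9:
        raise ValueError("slice importance must sum to 1")
    return sum(slice_scores[i] * importance[i] for i in importance)


# Two slices: the second is both better served and more important.
print(aggregate({0: 0.5, 1: 1.0}, {0: 0.25, 1: 0.75}))
```

Keeping the metric scores and the weights as separate inputs lets the same per-slice scores be reused under different importance models.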
Slice importance
- Uniform slicing
- All slices are equally important
$$P(i \mid q) = \frac{1}{|I|}$$
- Non-uniform slicing
- Bursty periods (i.e., slices with more relevant documents) are more important
$$P(i \mid q) = \frac{\#R(i, q)}{\sum_{i' \in I} \#R(i', q)}$$
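Both importance models reduce to normalizing a per-slice quantity. The sketch below assumes the number of relevant documents per slice, #R(i, q), is available as a mapping from slice index to count; the function names are hypothetical.

```python
from typing import Dict, Mapping


def uniform_importance(relevant_counts: Mapping[int, int]) -> Dict[int, float]:
    """P(i | q) = 1 / |I|: every slice gets the same weight."""
    n = len(relevant_counts)
    if n == 0:
        raise ValueError("no slices given")
    return {i: 1.0 / n for i in relevant_counts}


def burst_importance(relevant_counts: Mapping[int, int]) -> Dict[int, float]:
    """P(i | q) proportional to #R(i, q), the relevant documents in slice i."""
    total = sum(relevant_counts.values())
    if total == 0:
        raise ValueError("no relevant documents in any slice")
    return {i: count / total for i, count in relevant_counts.items()}


counts = {0: 1, 1: 3}  # toy counts: slice 1 is a burst
print(uniform_importance(counts))
print(burst_importance(counts))
```

Under the second model, a slice holding three of the four relevant documents carries three quarters of the total weight, so performance during bursts dominates the aggregate score.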
Experiments
- Official TREC 2012 KBA CCR runs
- 8 systems, best run for each system
- Only uniform time slicing
- Binary relevance
Results
Atemporal vs. temporal ranking (MAP, weekly slicing)
[Bar chart: MAP per system (UvA, udel_fang, LSIS, CWI, UMass_CIIR, uiucGSLIS, hltcoe, igpi2012, helsinki); series: Atemporal, Temporal (uniform slice weighting), Temporal (non-uniform slice weighting); y-axis from 0 to 0.6]
Results
Atemporal vs. temporal ranking (MAP, daily slicing)
[Bar chart: MAP per system (UvA, udel_fang, LSIS, CWI, UMass_CIIR, uiucGSLIS, hltcoe, igpi2012, helsinki); series: Atemporal, Temporal (uniform slice weighting), Temporal (non-uniform slice weighting); y-axis from 0 to 0.7]
Zooming in
        atemporal   temporal (MAP), weekly slicing   temporal (MAP), daily slicing
System  (MAP)       uniform      non-uniform         uniform      non-uniform
LSIS    0.48        0.52         0.54                0.60         0.62
CWI     0.45        0.48         0.51                0.62         0.63
Findings
- Top performing teams are (almost) always the same, independent of the metric
- Temporal evaluation provides additional insights
Wrap-up
- Framework for temporal evaluation
- Applied to the evaluation of TREC 2012 KBA CCR systems
- Future work
- Non-uniform slice weighting
- Other streaming tasks/collections (e.g., microblog search)
- Generalize to other time-aware information access tasks
Questions?
Online appendix:
http://ciir.cs.umass.edu/~dietz/streameval/

Time-aware Evaluation of Cumulative Citation Recommendation Systems

  • 1. Time-aware Evaluation of Cumulative Citation Recommendation Systems Krisztian Balog University of Stavanger SIGIR 2013 workshop on Time-aware Information Access (#TAIA2013) | Dublin, Ireland, Aug 2013 Laura Dietz, Jeffrey Dalton CIIR, University of Massachusetts, Amherst
  • 3. Evaluation methodology Target entity: Aharon Barak 1328055120'f6462409e60d2748a0adef82fe68b86d 1328057880'79cdee3c9218ec77f6580183cb16e045 1328057280'80fb850c089caa381a796c34e23d9af8 1328056560'450983d117c5a7903a3a27c959cc682a 1328056560'450983d117c5a7903a3a27c959cc682a 1328056260'684e2f8fc90de6ef949946f5061a91e0 1328056560'be417475cca57b6557a7d5db0bbc6959 1328057520'4e92eb721bfbfdfa0b1d9476b1ecb009 1328058660'807e4aaeca58000f6889c31c24712247 1328060040'7a8c209ad36bbb9c946348996f8c616b 1328063280'1ac4b6f3a58004d1596d6e42c4746e21 1328064660'1a0167925256b32d715c1a3a2ee0730c 1328062980'7324a71469556bcd1f3904ba090ab685 PositiveNegative Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak score Target entity: Aharon Barak urlname stream_id Cutoff 1000 500 500 480 450 430 428 428 380 380 375 315 263 1328055120'f6462409e60d2748a0adef82fe68b86d 1328057880'79cdee3c9218ec77f6580183cb16e045 1328057280'80fb850c089caa381a796c34e23d9af8 1328056560'450983d117c5a7903a3a27c959cc682a 1328056560'450983d117c5a7903a3a27c959cc682a 1328056260'684e2f8fc90de6ef949946f5061a91e0 1328056560'be417475cca57b6557a7d5db0bbc6959 1328057520'4e92eb721bfbfdfa0b1d9476b1ecb009 1328058660'807e4aaeca58000f6889c31c24712247 1328060040'7a8c209ad36bbb9c946348996f8c616b 1328063280'1ac4b6f3a58004d1596d6e42c4746e21 1328064660'1a0167925256b32d715c1a3a2ee0730c 1328062980'7324a71469556bcd1f3904ba090ab685 PositiveNegative Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak Aharon_Barak
  • 4. CCR @TREC 2012 KBA - Cumulative citation recommendation - Filter a time-ordered corpus for documents that are highly relevant to a predefined set of entities - For each entity, provide a ranked list of documents based on their “citation-worthiness”
  • 5. CCR @TREC 2012 KBA (continued) - Results are evaluated in a single batch (temporal aspects are not considered)
  • 6. CCR @TREC 2012 KBA (continued) - Evaluation metrics are set-based (using a confidence cut-off)
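A set-based metric with a confidence cut-off can be sketched as follows. This is only an illustration, not the official TREC KBA scoring tool: the function name `set_based_scores`, the toy run, and the choice of precision, recall and F1 as the reported metrics are all assumptions made here.

```python
def set_based_scores(scored_docs, relevant_ids, cutoff):
    """Precision, recall and F1 over the documents scoring at or above `cutoff`.

    `scored_docs` is a list of (doc_id, confidence) pairs (hypothetical format).
    """
    # Everything at or above the cut-off counts as "retrieved"; rank is ignored.
    retrieved = {doc_id for doc_id, score in scored_docs if score >= cutoff}
    true_pos = len(retrieved & relevant_ids)
    precision = true_pos / len(retrieved) if retrieved else 0.0
    recall = true_pos / len(relevant_ids) if relevant_ids else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (made up for illustration).
run = [("a", 1000), ("b", 500), ("c", 480), ("d", 263)]
relevant = {"a", "c", "e"}
precision, recall, f1 = set_based_scores(run, relevant, cutoff=400)
```

Because the metric only looks at which side of the cut-off a document falls on, neither the ranking nor the time at which a document appeared affects the result.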
  • 7. Aims - Develop a time-aware evaluation paradigm for streaming collections - Capture how retrieval effectiveness changes over time - Deal with ground truth of bursty nature - Accommodate various underlying user models - Test the ideas on CCR
  • 8. Overview - 1. Slicing time - 2. Measuring slice relevance - 3. Aggregating slice relevance [Figure: timeline split into slices; labels: example values .87 and .65, "Slice importance"]
  • 10. Slicing time - Simplifying assumptions - Slices are non-overlapping - Unconcerned about slices that don’t contain any relevant documents - (A) Uniform slicing - Slices of equal length - (B) Non-uniform slicing - Slices of varying length [Figure: #relevant plotted over time, with slice boundaries for (A) and (B); label $t_i$]
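Uniform slicing under the two simplifying assumptions might look like the sketch below. The function names, the `(timestamp, doc_id)` input format, and the one-day slice length are hypothetical choices made for illustration; the slides do not prescribe an implementation.

```python
from collections import defaultdict

DAY = 24 * 60 * 60  # slice length in seconds (illustrative choice)


def uniform_slices(docs, slice_length):
    """Group (timestamp, doc_id) pairs into non-overlapping, equal-length slices."""
    slices = defaultdict(list)
    for timestamp, doc_id in docs:
        # Integer division assigns each document to exactly one slice.
        slices[timestamp // slice_length].append(doc_id)
    return dict(slices)


def drop_slices_without_relevant(slices, relevant_ids):
    """Keep only slices that contain at least one relevant document."""
    return {
        index: doc_ids
        for index, doc_ids in slices.items()
        if any(doc_id in relevant_ids for doc_id in doc_ids)
    }


# Toy stream (made up): timestamps in seconds since an arbitrary origin.
docs = [(0, "d1"), (3600, "d2"), (DAY + 10, "d3"), (2 * DAY + 5, "d4")]
slices = uniform_slices(docs, DAY)
kept = drop_slices_without_relevant(slices, {"d1", "d4"})
```

Non-uniform slicing would replace the integer division with a lookup against variable-length slice boundaries; how those boundaries are chosen is not specified on the slide.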
  • 12. Measuring slice relevance - Ranked list of documents within a given slice - Evaluation metric - Standard IR metrics - MAP, R-Prec, NDCG - Notation: $d = \langle d_1, \ldots, d_n \rangle$, $m(d_i, q)$
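As one possible choice for the metric $m$, average precision over the ranked list of a single slice can be computed as below. This is a sketch under assumptions: binary relevance, normalization by the number of relevant documents in that slice, and a hypothetical function name.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list under binary relevance.

    Normalizes by the number of relevant documents in `relevant_ids`
    (an assumption; other normalizations are possible).
    """
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0


# Toy slice (made up): relevant documents at ranks 1 and 3.
slice_ranking = ["d1", "d2", "d3", "d4"]
slice_relevant = {"d1", "d3"}
ap = average_precision(slice_ranking, slice_relevant)
```

Here the two relevant hits contribute precisions 1/1 and 2/3, so the slice score is their mean.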
  • 14. Aggregating slice relevance - Probabilistic formulation to estimate the likelihood of relevance: $P(r = 1 \mid d, q, m) = \sum_{i \in I} P(r = 1 \mid d_i, q, i)\, P(i \mid q)$ - Slice-based relevance: $P(r = 1 \mid d_i, q, i) \approx m(d_i, q)$ - Slice importance: $P(i \mid q)$
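The aggregation is a weighted sum of per-slice metric values. A minimal sketch, assuming the per-slice scores and importance weights are already available as dictionaries keyed by slice index (the function name and the example numbers are illustrative only):

```python
def aggregate_slice_relevance(slice_scores, slice_importance):
    """Sum over slices of m(d_i, q) * P(i | q)."""
    return sum(
        slice_scores[index] * slice_importance[index] for index in slice_scores
    )


# Two slices with illustrative metric values and equal importance.
slice_scores = {0: 0.87, 1: 0.65}
slice_importance = {0: 0.5, 1: 0.5}
overall = aggregate_slice_relevance(slice_scores, slice_importance)
```

With equal weights the result is simply the mean of the slice scores; the importance weights are expected to sum to 1 so that the result stays on the same scale as the underlying metric.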
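  • 15. Slice importance - Uniform slicing - All slices are equally important: $P(i \mid q) = \frac{1}{|I|}$, where $|I|$ is the number of slices - Non-uniform slicing - Bursty periods (i.e., slices with more relevant documents) are more important: $P(i \mid q) = \frac{\#R(i, q)}{\sum_{i' \in I} \#R(i', q)}$, where $\#R(i, q)$ is the number of relevant documents in slice $i$ for query $q$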
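Both importance estimates can be sketched in a few lines. The function names and the toy counts are hypothetical; the sketch also assumes that slices without relevant documents have already been dropped, so the total count is non-zero.

```python
def uniform_importance(slice_ids):
    """P(i | q) = 1 / |I|: every slice gets the same weight."""
    n = len(slice_ids)
    return {index: 1.0 / n for index in slice_ids}


def bursty_importance(relevant_counts):
    """P(i | q) proportional to the number of relevant documents in slice i."""
    total = sum(relevant_counts.values())
    return {index: count / total for index, count in relevant_counts.items()}


# Toy counts of relevant documents per slice (made up); slice 1 is a burst.
relevant_counts = {0: 1, 1: 6, 2: 3}
uniform = uniform_importance(relevant_counts)
bursty = bursty_importance(relevant_counts)
```

In this toy example the burst in slice 1 receives 0.6 of the weight under the count-based estimate, compared with one third under the uniform one.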
  • 16. Experiments - Official TREC 2012 KBA CCR runs - 8 systems, best run for each system - Only uniform time slicing - Binary relevance
  • 17. Results - Atemporal vs. temporal ranking (MAP, weekly slicing) [Chart: MAP (axis 0 to 0.6) per system (UvA, udel_fang, LSIS, CWI, UMass_CIIR, uiucGSLIS, hltcoe, igpi2012, helsinki); series: Atemporal, Temporal (uniform slice weighting), Temporal (non-uniform slice weighting)]
  • 18. Results - Atemporal vs. temporal ranking (MAP, daily slicing) [Chart: MAP (axis 0 to 0.7) per system (UvA, udel_fang, LSIS, CWI, UMass_CIIR, uiucGSLIS, hltcoe, igpi2012, helsinki); series: Atemporal, Temporal (uniform slice weighting), Temporal (non-uniform slice weighting)]
  • 19. Zooming in (MAP) - LSIS: atemporal 0.48; temporal, weekly slicing: 0.52 (uniform), 0.54 (non-uniform); temporal, daily slicing: 0.60 (uniform), 0.62 (non-uniform) - CWI: atemporal 0.45; temporal, weekly slicing: 0.48 (uniform), 0.51 (non-uniform); temporal, daily slicing: 0.62 (uniform), 0.63 (non-uniform)
  • 20. Findings - Top performing teams are (almost) always the same, independent of the metric - Temporal evaluation provides additional insights
  • 21. Wrap-up - Framework for temporal evaluation - Applied to the evaluation of TREC 2012 KBA CCR systems - Future work - Non-uniform slice weighting - Other streaming tasks/collections (e.g., microblog search) - Generalize to other time-aware information access tasks