Overview of the
NTCIR-14 CENTRE Task
Tetsuya Sakai Nicola Ferro Ian Soboroff
Zhaohao Zeng Peng Xiao Maria Maistro
Waseda University, Japan
University of Padua, Italy
NIST, USA
http://www.centre-eval.org/
Twitter: @_centre_
June 11 and 13, 2019 @ NTCIR-14, Tokyo.
TALK OUTLINE
1. CENTRE@CLEF, NTCIR, TREC: motivations
2. CENTRE@NTCIR-14 task specifications
3. Additional relevance assessments
4. Evaluation measures
5. Replicability results
6. Reproducibility results
7. CENTRE@NTCIR-14 conclusions
8. Plans for NTCIR-15
9. Related efforts
Motivation
• Researcher A publishes a paper; says
Algo X > Algo Y on test data D.
⇒ Researcher B tries Algo X and Y on D but finds
Algo X < Algo Y. A replicability problem.
• Researcher A publishes a paper; says
Algo X > Algo Y on test data D.
⇒ Researcher B tries Algo X and Y on D’ but finds
Algo X < Algo Y. A reproducibility problem.
Can the IR community do better?
Research questions
• Can results from CLEF, NTCIR, and TREC be
replicated/reproduced easily? If not, what are the
difficulties? What needs to be improved?
• Can results from (say) TREC be reproduced with
(say) NTCIR data? Why or why not?
• [Participants may have their own research
questions]
CENTRE = CLEF/NTCIR/TREC
(replicability and) reproducibility
[Diagram] Each venue contributes data, code, and wisdom to replicability and reproducibility experiments:
• CLEF: Conferences and Labs of the Evaluation Forum (Europe and more; annual)
• NTCIR: NII Testbeds and Community for Information access Research (Asia and more; sesquiannual)
• TREC: Text REtrieval Conference (US and more; annual)
Who should participate in CENTRE?
• IR evaluation nerds
• Students willing to replicate/reproduce SOTA
methods in IR, so that they will eventually be able
to propose their own methods and demonstrate
their superiority over the SOTA
• Those who want to be involved in more than one
evaluation community across the globe
• Those who want to use data from more than one
evaluation conference
TALK OUTLINE
1. CENTRE@CLEF, NTCIR, TREC: motivations
2. CENTRE@NTCIR-14 task specifications
3. Additional relevance assessments
4. Evaluation measures
5. Replicability results
6. Reproducibility results
7. CENTRE@NTCIR-14 conclusions
8. Plans for NTCIR-15
9. Related efforts
CENTRE@NTCIR-14: How it works (1)
[Diagram] Target run pairs (A: advanced run, B: baseline run) are drawn from past CLEFs, past TRECs, and past NTCIRs (selected by the organisers), or from the IR literature (selected by each team).
CENTRE@NTCIR-14: How it works (2)
[Diagram] T1: replicability subtask (same NTCIR data). The target run pair (A: advanced, B: baseline) from a past NTCIR, selected by the organisers, is re-implemented by each team to produce the replicated run pair T1{A,B}.
Evaluation measures:
• RMSE of the topicwise deltas
• Pearson correlation of the topicwise deltas
• Effect Ratio (ER)
CENTRE@NTCIR-14: How it works (2)
[Diagram] T2: reproducibility subtask. Target run pairs from past CLEFs and past TRECs (selected by the organisers), or from the IR literature (selected by each team), are reproduced on NTCIR data as the run pairs T2CLEF-{A,B}, T2TREC-{A,B}, and T2OPEN-{A,B}. Past CLEF runs were not considered at CENTRE@NTCIR-14.
Evaluation measure: Effect Ratio (ER)
T2OPEN runs evaluated only with effectiveness measures
Target runs for CENTRE@NTCIR-14
• We focus on ad hoc monolingual web search.
• For T1 (replicability): A pair of runs from the NTCIR-13 We Want Web
task
- Overview paper:
http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings13/pdf/ntcir/01-NTCIR13-OV-WWW-
LuoC.pdf
• For T2 (reproducibility): A pair of runs from the TREC 2013 web track
- Overview paper:
https://trec.nist.gov/pubs/trec22/papers/WEB.OVERVIEW.pdf
• All of the above target runs use clueweb12 as the search corpus
http://www.lemurproject.org/clueweb12.php/
• Both the target NTCIR and TREC runs are implemented using Indri
https://www.lemurproject.org/indri/
Target NTCIR runs and data
for T1 (replicability)
• Test collection:
NTCIR-13 WWW English (clueweb12-B13, 100 topics)
• Target NTCIR run pairs:
From the NTCIR-13 WWW RMIT paper [Gallagher+17]
http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings13/pdf/ntcir/02-NTCIR13-WWW-GallagherL.pdf
A: RMIT-E-NU-Own-1 (sequential dependency model: SDM)
B: RMIT-E-NU-Own-3 (full dependency model: FDM)
• SDM and FDM are from [Metzler+Croft SIGIR05]
https://doi.org/10.1145/1076034.1076115
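For illustration, an SDM query in the Indri query language combines unigram, ordered-window (#1), and unordered-window (#uw8) evidence. The placeholder terms q1 q2 q3 and the 0.85/0.10/0.05 weights below are the commonly used defaults from the Metzler and Croft paper, not necessarily the exact settings of the RMIT runs; FDM is analogous but builds the window components from all term pairs rather than only adjacent ones.

```
#weight(
  0.85 #combine( q1 q2 q3 )
  0.10 #combine( #1(q1 q2) #1(q2 q3) )
  0.05 #combine( #uw8(q1 q2) #uw8(q2 q3) )
)
```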
Target TREC runs and data
for T2 (reproducibility)
• Test collection:
TREC 2013 Web Track (clueweb12 category A, 50 topics)
• Target TREC run pairs:
From the TREC 2013 Web Track U Delaware paper [Yang+13] :
https://trec.nist.gov/pubs/trec22/papers/udel_fang-web.pdf
A: UDInfolabWEB2 (selects semantically related terms using
Web-based working sets)
B: UDInfolabWEB1 (selects semantically related terms using
collection-based working sets)
• The working set construction method is from
[Fang+SIGIR06]
https://doi.org/10.1145/1148170.1148193
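The A-run and B-run differ only in where the working set comes from (the Web vs. the target collection). As a very rough, hypothetical sketch of the general expansion idea only (helper names are invented here; this does not reproduce the actual semantic term matching framework of [Fang+SIGIR06] or the UDel implementation):

```python
from collections import Counter

def expand_query(query_terms, working_set_docs, k=10):
    """Add the k candidate terms that co-occur most often with the
    query terms inside the working set (each doc is a token list)."""
    counts = Counter()
    for doc in working_set_docs:
        if any(t in doc for t in query_terms):
            counts.update(w for w in doc if w not in query_terms)
    return list(query_terms) + [w for w, _ in counts.most_common(k)]

# A-run: working_set_docs drawn from a Web source;
# B-run: working_set_docs drawn from the target collection itself.
```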
How participants were expected to
submit T1 (replicability) runs
0. Obtain clueweb12-B13 (or the full data)
http://www.lemurproject.org/clueweb12.php/
and Indri https://www.lemurproject.org/indri/
1. Register for the CENTRE task and obtain the NTCIR-13 WWW
topics and qrels from the CENTRE organisers
2. Read the NTCIR-13 WWW RMIT paper [Gallagher+17]
http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings13/pdf/ntcir/02-NTCIR13-WWW-GallagherL.pdf
and try to replicate the A-run and the B-run on the NTCIR-13
WWW-1 test collection
A: RMIT-E-NU-Own-1
B: RMIT-E-NU-Own-3
3. Submit the replicated runs by the deadline
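As a concrete but purely illustrative starting point, a replication attempt with Indri might run the replicated queries through IndriRunQuery with a parameter file along the following lines; the index path, query text, and parameter values are assumptions, not the RMIT settings:

```
<parameters>
  <index>/path/to/clueweb12-B13-index</index>
  <count>1000</count>
  <trecFormat>true</trecFormat>
  <runID>CENTRE1-MYTEAM-T1-A</runID>
  <query>
    <number>0001</number>
    <text>#weight( 0.85 #combine( q1 q2 q3 ) ... )</text>
  </query>
</parameters>
```

Invoked as, e.g., IndriRunQuery params.xml > CENTRE1-MYTEAM-T1-A.txt.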
How participants were expected to
submit T2TREC (reproducibility) runs
0. Obtain clueweb12-B13 (or the full data)
http://www.lemurproject.org/clueweb12.php/
and (optionally) Indri https://www.lemurproject.org/indri/
1. Register for the CENTRE task and obtain the NTCIR-13 WWW topics and
qrels from the CENTRE organisers
2. Read the TREC 2013 Web Track U Delaware paper [Yang+13] :
https://trec.nist.gov/pubs/trec22/papers/udel_fang-web.pdf
and try to reproduce the A-run and the B-run on the NTCIR-13 WWW-1
test collection
A: UDInfolabWEB2
B: UDInfolabWEB1
3. Submit the reproduced runs by the deadline
How participants were expected to
submit T2OPEN (reproducibility) runs
0. Obtain clueweb12-B13 (or the full data)
http://www.lemurproject.org/clueweb12.php/
and (optionally) Indri https://www.lemurproject.org/indri/
1. Register for the CENTRE task and obtain the NTCIR-13 WWW topics and
qrels from the CENTRE organisers
2. Pick any pair of runs A (advanced) and B (baseline) from the IR literature,
and try to reproduce the A-run and the B-run on the NTCIR-13 WWW-1
test collection
3. Submit the reproduced runs by the deadline
T2OPEN not discussed further in this overview.
For more, see:
Andrew Yates: MPII at the NTCIR-14 CENTRE Task
Submission instructions
• Each team can submit only one run pair per subtask
(T1, T2TREC, T2OPEN). So at most 6 runs in total.
• Run file format: same as WWW-1; see
http://www.thuir.cn/ntcirwww/ (Run Submissions
format). In short, it’s a TREC run file plus a SYSDESC
line.
• Run file names:
CENTRE1-<teamname>-<subtaskname>-[A,B], where A = Advanced and B = Baseline
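For illustration only, a submitted run file would look roughly like the following; the document IDs, scores, and the wording of the SYSDESC line are made up, and the exact SYSDESC syntax is the one specified in the WWW-1 run submission instructions:

```
<SYSDESC>Replication of RMIT SDM with Indri, default parameters</SYSDESC>
0001 Q0 clueweb12-0000wb-05-12114 1 -5.3112 CENTRE1-MYTEAM-T1-A
0001 Q0 clueweb12-0108wb-10-09123 2 -5.4027 CENTRE1-MYTEAM-T1-A
...
```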
TALK OUTLINE
1. CENTRE@CLEF, NTCIR, TREC: motivations
2. CENTRE@NTCIR-14 task specifications
3. Additional relevance assessments
4. Evaluation measures
5. Replicability results
6. Reproducibility results
7. CENTRE@NTCIR-14 conclusions
8. Plans for NTCIR-15
9. Related efforts
Only one team participated…
Additional relevance assessments
• NTCIR-13 WWW-1 English qrels expanded based on
the new CENTRE runs (i.e. 6 runs from MPII).
• Depth-30 pools were created, following WWW-1.
• Previously unjudged topic-doc pairs (2,617) were judged.
[Diagram]
NTCIR-13 WWW E (original qrels): 100 topics; 22,912 topic-doc pairs; based on 13 runs from three teams.
CENTRE@NTCIR-14 (augmented qrels): 100 topics; 22,912 + 2,617 = 25,529 topic-doc pairs.
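A minimal sketch of the depth-30 pooling step described above, assuming each run has already been parsed into a per-topic ranked list (names are illustrative):

```python
def depth_k_pool(runs, k=30):
    """runs: list of {topic_id: [doc_id, ...]} dicts, one per run.
    Returns the set of (topic_id, doc_id) pairs in the depth-k pool."""
    pool = set()
    for run in runs:
        for topic_id, ranked_docs in run.items():
            for doc_id in ranked_docs[:k]:
                pool.add((topic_id, doc_id))
    return pool

# Only pairs that are not already in the WWW-1 qrels go out for judging:
# new_pairs = depth_k_pool(centre_runs) - already_judged_pairs
```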
Inter-assessor agreement
(two assessors per topic)
TALK OUTLINE
1. CENTRE@CLEF, NTCIR, TREC: motivations
2. CENTRE@NTCIR-14 task specifications
3. Additional relevance assessments
4. Evaluation measures
5. Replicability results
6. Reproducibility results
7. CENTRE@NTCIR-14 conclusions
8. Plans for NTCIR-15
9. Related efforts
T1: examining the topicwise deltas
For a given collection and evaluation measure M, let the original topicwise improvement on topic j be ΔM_j = M_j(A) − M_j(B), and the replicated topicwise improvement be ΔM'_j = M_j(T1-A) − M_j(T1-B).
• Pearson correlation of the topicwise deltas:
  r = covariance({ΔM_j}, {ΔM'_j}) / ( standard deviation of {ΔM_j} × standard deviation of {ΔM'_j} )
• RMSE (Root Mean Squared Error):
  RMSE = sqrt( (1/n) Σ_j (ΔM'_j − ΔM_j)² )
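A minimal sketch of how these two statistics could be computed from per-topic scores (the per-topic values of a measure such as nERR are assumed to be available already; names are illustrative):

```python
import numpy as np
from scipy.stats import pearsonr

def topicwise_deltas(scores_A, scores_B):
    """Per-topic improvement Delta_j = M_j(A) - M_j(B)."""
    return np.asarray(scores_A) - np.asarray(scores_B)

def rmse_and_correlation(orig_A, orig_B, repl_A, repl_B):
    d_orig = topicwise_deltas(orig_A, orig_B)   # original deltas
    d_repl = topicwise_deltas(repl_A, repl_B)   # replicated deltas
    rmse = np.sqrt(np.mean((d_repl - d_orig) ** 2))
    r, p_value = pearsonr(d_orig, d_repl)
    return rmse, r, p_value
```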
T1 and T2: Effect Ratio (1)
• T1 (replicability):
  ER = replicated mean improvement / original mean improvement
     = ( (1/n) Σ_j ΔM'_j ) / ( (1/n) Σ_j ΔM_j )
• T2TREC (reproducibility):
  ER = reproduced mean improvement / original mean improvement,
  where the reproduced mean improvement is computed over the NTCIR WWW-1 topics and the original mean improvement over the TREC 2013 Web Track topics.
T1 and T2: Effect Ratio (2)
• ER compares the sample (unstandardised) effect
size in the “rep” experiment against that in the
original experiment. “Can we still observe the same
effect?”
• ER>1 ⇒ effect larger in “rep” experiment
• ER=1 ⇒ effect identical to original experiment
• 0 < ER < 1 ⇒ effect smaller in “rep” experiment
• ER ≤ 0 ⇒ FAILURE: A-run does not outperform B-run in the “rep” experiment
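Following the definitions above, a small sketch of computing the Effect Ratio from per-topic scores (for T2 the original and “rep” deltas come from different topic sets, so only the means are compared; names are illustrative):

```python
import numpy as np

def effect_ratio(rep_A, rep_B, orig_A, orig_B):
    """ER = mean improvement in the 'rep' experiment / original mean improvement.
    For T1 both pairs are scored on the same NTCIR topics; for T2TREC the
    'rep' pair is scored on NTCIR topics and the original pair on TREC
    topics, so the two arrays may have different lengths."""
    mean_rep = np.mean(np.asarray(rep_A) - np.asarray(rep_B))
    mean_orig = np.mean(np.asarray(orig_A) - np.asarray(orig_B))
    return mean_rep / mean_orig   # ER <= 0: the A-over-B effect was not observed
```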
TALK OUTLINE
1. CENTRE@CLEF, NTCIR, TREC: motivations
2. CENTRE@NTCIR-14 task specifications
3. Additional relevance assessments
4. Evaluation measures
5. Replicability results
6. Reproducibility results
7. CENTRE@NTCIR-14 conclusions
8. Plans for NTCIR-15
9. Related efforts
Replicability: w/o additional assessments (1)
• RMIT said “SDM (A) outperforms FDM (B)”
• MPII confirms this!
• A statistically significantly outperforms B even with nERR!
• Effect sizes for nERR (Orig vs Repli) quite similar.
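nERR itself is not defined on these slides; as a reference point, one common formulation (ERR of Chapelle et al., normalised by the ERR of an ideal ranking) is sketched below. The official WWW-1 evaluation used the NTCIREVAL toolkit, whose exact definition and cutoff may differ in detail.

```python
def err(gains, max_grade):
    """Expected Reciprocal Rank for one ranked list of relevance grades."""
    not_stopped, score = 1.0, 0.0
    for rank, g in enumerate(gains, start=1):
        r = (2 ** g - 1) / (2 ** max_grade)   # prob. the user is satisfied here
        score += not_stopped * r / rank
        not_stopped *= (1.0 - r)
    return score

def nerr(run_gains, all_rel_gains, max_grade, cutoff=10):
    ideal = sorted(all_rel_gains, reverse=True)[:cutoff]
    return err(run_gains[:cutoff], max_grade) / err(ideal, max_grade)
```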
Replicability: w/o additional assessments (2)
• ER for nERR close to 1
• Correlation (Orig Δ vs Repli Δ) for nERR statistically significant
Replicability: WITH additional assessments
• ER for nERR close to 1
• Correlation (Orig Δ vs Repli Δ) for nERR statistically significant: nERR, r = 0.2610, 95%CI [0.0680, 0.4351]
• Example topics from the scatter plot: Orig Δ ≈ 0.4 vs Repli Δ ≈ 0.8 (A>B); Orig Δ ≈ −0.3 vs Repli Δ ≈ −0.6 (A<B)
• But replicability not so successful at the topic level
TALK OUTLINE
1. CENTRE@CLEF, NTCIR, TREC: motivations
2. CENTRE@NTCIR-14 task specifications
3. Additional relevance assessments
4. Evaluation measures
5. Replicability results
6. Reproducibility results
7. CENTRE@NTCIR-14 conclusions
8. Plans for NTCIR-15
9. Related efforts
Reproducibility: w/o additional assessments (1)
• UDel said “web-based working sets (A) outperform
collection-based working sets (B)” for TREC
• MPII confirms this for NTCIR
• A statistically significantly outperforms B even with Q!
• Effect sizes for Q (Orig vs Repli) quite similar.
Reproducibility: w/o additional assessments (2)
• ER for Q close to 1
Reproducibility: WITH additional assessments
• Larger ER values!
• Smaller p-values!
• New relevant docs were found by the CENTRE runs; the additional assessments were worthwhile!
TALK OUTLINE
1. CENTRE@CLEF, NTCIR, TREC: motivations
2. CENTRE@NTCIR-14 task specifications
3. Additional relevance assessments
4. Evaluation measures
5. Replicability results
6. Reproducibility results
7. CENTRE@NTCIR-14 conclusions
8. Plans for NTCIR-15
9. Related efforts
CENTRE@NTCIR-14 conclusions
• We focussed on: “Can an effect (the gain of A over B)
be replicated/reproduced?”
• Replication of the RMIT effect on WWW-1 data:
successful at the ER level; not so successful at the
topic level.
• Reproduction of the TREC UDel effect on WWW-1
data: successful at the ER level; additional
relevance assessments further improved the ERs.
• Only one participating team…this needs to change.
SPECIAL THANKS
• To Andrew Yates (Max Planck Institute for
Informatics, Germany)
• MPII at the NTCIR-14 CENTRE Task
TALK OUTLINE
1. CENTRE@CLEF, NTCIR, TREC: motivations
2. CENTRE@NTCIR-14 task specifications
3. Additional relevance assessments
4. Evaluation measures
5. Replicability results
6. Reproducibility results
7. CENTRE@NTCIR-14 conclusions
8. Plans for NTCIR-15
9. Related efforts
WWW and CENTRE join forces
• WWW-3 task will have three subtasks:
(1) Adhoc Chinese
(2) Adhoc English (WWW-2 + new WWW-3 topics)
(3) CENTRE (WWW-2 + new WWW-3 topics)
• CENTRE will accept two run types:
(a) Revived runs from WWW-2 (Tsinghua and RUC)
(b) REP (REPlicated and REProduced) runs: Tsinghua
and RUC algorithms re-implemented by other
teams
Measuring progress: WWW-2 top run → WWW-3 top run
[Diagram] The WWW-2 target runs (target run 1 from Team A, target run 2 from Team B), which processed the WWW-2 topics, are revived by the same teams as WWW-3 CENTRE revived runs 1 and 2; the revived runs and the WWW-3 Adhoc runs (e.g. from Team D) all process the new WWW-3 topics.
COMPARE the WWW-2 top run with the WWW-3 top run using the WWW-3 topics (since WWW-3 systems may be tuned with WWW-2 topics).
Measuring replicability on WWW-2 topics
[Diagram] As above, the WWW-2 target runs (Teams A and B) are revived by the same teams; in addition, they are replicated/reproduced by a different team (Team C) as WWW-3 CENTRE REP runs 1-1 and 1-2.
COMPARE the original/revived runs with the REP runs using the WWW-2 topics.
Measuring reproducibility: WWW-2 topics → WWW-3 topics
[Diagram] The same REP runs (Team C) also process the new WWW-3 topics.
COMPARE the original/revived runs on the WWW-2 topics with the REP runs on the WWW-3 topics.
TALK OUTLINE
1. CENTRE@CLEF, NTCIR, TREC: motivations
2. CENTRE@NTCIR-14 task specifications
3. Additional relevance assessments
4. Evaluation measures
5. Replicability results
6. Reproducibility results
7. CENTRE@NTCIR-14 conclusions
8. Plans for NTCIR-15
9. Related efforts
CENTRE@CLEF 2019
CLEF NTCIR TREC
REproducibility
Nicola Ferro, Norbert Fuhr, Maria Maistro, Tetsuya
Sakai, and Ian Soboroff
April 15, 2019, Cologne
http://www.centre-eval.org/clef2019/
CENTRE@CLEF 2019
Tasks
The CENTRE@CLEF 2019 lab will offer three tasks:
• Task 1 - Replicability: the task will focus on the
replicability of selected methods on the same
experimental collections;
• Task 2 - Reproducibility: the task will focus on the
reproducibility of selected methods on different
experimental collections;
• Task 3 - Generalizability: the task will focus on collection
performance prediction; the goal is to rank (sub-)collections
on the basis of the expected performance over them.
Task 1 & 2: Selected
Papers
For Task 1 - Replicability and Task 2 - Reproducibility we
selected the following papers from TREC Common
Core Track 2017 and 2018:
• [Grossman et al, 2017] Grossman, M. R., and
Cormack, G. V. (2017). MRG_UWaterloo and
Waterloo Cormack Participation in the TREC 2017
Common Core Track. In TREC 2017.
• [Benham et al, 2018] Benham, R., Gallagher, L.,
Mackenzie, J., Liu, B., Lu, X., Scholer, F., Moffat, A.,
and Culpepper, J. S. (2018). RMIT at the 2018 TREC
CORE Track. In TREC 2018.
Task 3 - Generalizability
• Training: participants need to run plain BM25 and
they need to identify features of the corpus and
topics that allow them to predict the system score
with respect to Average Precision (AP).
• Validation: participants can validate their method
and determine which set of features represents the
best choice for predicting the AP score for each system.
• Test (submission): participants submit a run for
each system (BM25 and their own system) and an
additional file (one for each system) including the
AP score predicted for each topic.
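A minimal sketch of the kind of pipeline Task 3 asks for; the feature names and the choice of regression model are illustrative assumptions, not part of the task definition:

```python
import numpy as np
from sklearn.linear_model import Ridge

# X_train: one feature vector per (sub-)collection/topic pair, e.g. corpus size,
# average document length, query length, IDF statistics of the query terms.
# y_train: the AP obtained by plain BM25 on that (sub-)collection/topic.
def fit_ap_predictor(X_train, y_train):
    model = Ridge(alpha=1.0)
    model.fit(X_train, y_train)
    return model

def predict_ap(model, X_test):
    return np.clip(model.predict(X_test), 0.0, 1.0)   # AP lies in [0, 1]

# Submission: the BM25 run (and the team's own run) plus one file per system
# containing the predicted AP score for each topic.
```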
SIGIR 2019 Open-Source IR
Replicability Challenge (OSIRRC 2019)
• Repeatability, replicability, and reproducibility:
• Good science!
• Helps sustain cumulative empirical progress in the field
• OSIRRC is a workshop at SIGIR with three goals:
• Develop a common Docker interface to capture ad hoc
retrieval experiments
• Build a library of Docker images that implements the
interface
• Explore this approach to additional tasks and evaluation
methodologies
OSIRRC 2019: Current Status
• We’ve converged on a Docker specification for ad
hoc retrieval experiments – called “the jig”
• All materials at https://github.com/osirrc
• Initial library of 12 images from 9 different groups
under development
• It’s not too late if you still want to participate!
• Wrapping an existing retrieval system in “the jig” isn’t
too difficult a task…
• See you at SIGIR 2019!
Nicola Ferro
@frrncl
University of Padua, Italy
The 3rd Strategic Workshop on Information Retrieval in Lorne (SWIRL III)
14 February 2018, Lorne, Australia
ACM Artifact Badging - SIGIR
Survey
ACM Badging: Artifacts Evaluated
This badge is applied to papers whose associated artifacts have
successfully completed an independent audit. Artifacts need not
be made publicly available to be considered for this badge.
However, they do need to be made available to reviewers.
Artifacts Evaluated – Functional
The artifacts associated with the research are found to be documented,
consistent, complete, exercisable, and include appropriate evidence of
verification and validation.
Artifacts Evaluated – Reusable
The artifacts associated with the paper are of a quality that significantly
exceeds minimal functionality. That is, they have all the qualities of the
Artifacts Evaluated – Functional level, but, in addition, they are very
carefully documented and well-structured to the extent that reuse and
repurposing is facilitated. In particular, norms and standards of the
research community for artifacts of this type are strictly adhered to.
ACM Badging: Artifacts Available
This badge is applied to papers in which
associated artifacts have been made
permanently available for retrieval.
Author-created artifacts relevant to this paper have been placed on a publicly accessible archival repository. A DOI or link to this repository along with a unique identifier for the object is provided.
Repositories used to archive data should have a declared plan to enable permanent accessibility. Personal web pages are not acceptable for this purpose.
Artifacts do not need to have been formally
evaluated in order for an article to receive this badge.
ACM Badging: Results Validated
This badge is applied to papers in which the main results of the
paper have been successfully obtained by a person or team
other than the author.
Results Replicated
The main results of the paper have been obtained in a subsequent
study by a person or team other than the authors, using, in part,
artifacts provided by the author.
Results Reproduced
The main results of the paper have been independently obtained in a
subsequent study by a person or team other than the authors,
without the use of author-supplied artifacts.
Exact replication or reproduction of results is not required,
or even expected. In particular, differences in the results
should not change the main claims made in the paper.
SIGIR Badging Task Force
Nicola Ferro (chair)
University of Padua, Italy
Diane Kelly (ex-officio)
The University of Tennessee,
Knoxville, USA
Maarten de Rijke (ex-officio)
University of Amsterdam, The
Netherlands
Leif Azzopardi
University of Strathclyde, UK
Peter Bailey
Microsoft, USA
Hannah Bast
University of Freiburg, Germany
Rob Capra
University of North Carolina at
Chapel Hill, USA
Norbert Fuhr
University of Duisburg-Essen,
Germany
Yiqun Liu
Tsinghua University, China
Martin Potthast
Leipzig University, Germany
Filip Radlinski
Google London, UK
Tetsuya Sakai
Waseda University, Japan
Ian Soboroff
National Institute of Standards and
Technology, USA
Arjen de Vries
Radboud University, The
Netherlands
More Related Content

What's hot

Getting the Most out of Transition-based Dependency Parsing
Getting the Most out of Transition-based Dependency ParsingGetting the Most out of Transition-based Dependency Parsing
Getting the Most out of Transition-based Dependency Parsing
Jinho Choi
 
ICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database VirtualizationICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database Virtualization
Boris Glavic
 
Ipaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, IanIpaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, Ian
Boris Glavic
 
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Capturing and querying fine-grained provenance of preprocessing pipelines in ...Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Paolo Missier
 
Declarative Experimentation in Information Retrieval using PyTerrier
Declarative Experimentation in Information Retrieval using PyTerrierDeclarative Experimentation in Information Retrieval using PyTerrier
Declarative Experimentation in Information Retrieval using PyTerrier
Crai Macdonald
 
Introduction to R for Data Science :: Session 6 [Linear Regression in R]
Introduction to R for Data Science :: Session 6 [Linear Regression in R] Introduction to R for Data Science :: Session 6 [Linear Regression in R]
Introduction to R for Data Science :: Session 6 [Linear Regression in R]
Goran S. Milovanovic
 
Detecting common scientific workflow fragments using templates and execution ...
Detecting common scientific workflow fragments using templates and execution ...Detecting common scientific workflow fragments using templates and execution ...
Detecting common scientific workflow fragments using templates and execution ...dgarijo
 
Introduction to R for Data Science :: Session 2
Introduction to R for Data Science :: Session 2Introduction to R for Data Science :: Session 2
Introduction to R for Data Science :: Session 2
Goran S. Milovanovic
 
A Survey of Entity Ranking over RDF Graphs
A Survey of Entity Ranking over RDF GraphsA Survey of Entity Ranking over RDF Graphs
An Empirical Evaluation of RDF Graph Partitioning Techniques
An Empirical Evaluation of RDF Graph Partitioning TechniquesAn Empirical Evaluation of RDF Graph Partitioning Techniques
An Empirical Evaluation of RDF Graph Partitioning Techniques
Adnan Akhter
 
Quality Control of NGS Data
Quality Control of NGS Data Quality Control of NGS Data
Quality Control of NGS Data
Surya Saha
 
Webinar: What's New in Pipeline Pilot 8.5 Collection Update 1?
Webinar: What's New in Pipeline Pilot 8.5 Collection Update 1?Webinar: What's New in Pipeline Pilot 8.5 Collection Update 1?
Webinar: What's New in Pipeline Pilot 8.5 Collection Update 1?
BIOVIA
 
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Riccardo Tommasini
 
Effective and Efficient Entity Search in RDF data
Effective and Efficient Entity Search in RDF dataEffective and Efficient Entity Search in RDF data
Effective and Efficient Entity Search in RDF data
Roi Blanco
 
Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)
Sung Kim
 
Introduction to R
Introduction to RIntroduction to R
Introduction to R
Samuel Bosch
 
Rob Davidson: Using Galaxy for Metabolomics
Rob Davidson: Using Galaxy for MetabolomicsRob Davidson: Using Galaxy for Metabolomics
Rob Davidson: Using Galaxy for Metabolomics
GigaScience, BGI Hong Kong
 
Reverted Indexing for Expansion and Feedback
Reverted Indexing for Expansion and FeedbackReverted Indexing for Expansion and Feedback
Reverted Indexing for Expansion and Feedback
Gene Golovchinsky
 
Icml2018 naver review
Icml2018 naver reviewIcml2018 naver review
Icml2018 naver review
NAVER Engineering
 
search engine
search enginesearch engine
search engine
Musaib Khan
 

What's hot (20)

Getting the Most out of Transition-based Dependency Parsing
Getting the Most out of Transition-based Dependency ParsingGetting the Most out of Transition-based Dependency Parsing
Getting the Most out of Transition-based Dependency Parsing
 
ICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database VirtualizationICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database Virtualization
 
Ipaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, IanIpaw14 presentation Quan, Tanu, Ian
Ipaw14 presentation Quan, Tanu, Ian
 
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Capturing and querying fine-grained provenance of preprocessing pipelines in ...Capturing and querying fine-grained provenance of preprocessing pipelines in ...
Capturing and querying fine-grained provenance of preprocessing pipelines in ...
 
Declarative Experimentation in Information Retrieval using PyTerrier
Declarative Experimentation in Information Retrieval using PyTerrierDeclarative Experimentation in Information Retrieval using PyTerrier
Declarative Experimentation in Information Retrieval using PyTerrier
 
Introduction to R for Data Science :: Session 6 [Linear Regression in R]
Introduction to R for Data Science :: Session 6 [Linear Regression in R] Introduction to R for Data Science :: Session 6 [Linear Regression in R]
Introduction to R for Data Science :: Session 6 [Linear Regression in R]
 
Detecting common scientific workflow fragments using templates and execution ...
Detecting common scientific workflow fragments using templates and execution ...Detecting common scientific workflow fragments using templates and execution ...
Detecting common scientific workflow fragments using templates and execution ...
 
Introduction to R for Data Science :: Session 2
Introduction to R for Data Science :: Session 2Introduction to R for Data Science :: Session 2
Introduction to R for Data Science :: Session 2
 
A Survey of Entity Ranking over RDF Graphs
A Survey of Entity Ranking over RDF GraphsA Survey of Entity Ranking over RDF Graphs
A Survey of Entity Ranking over RDF Graphs
 
An Empirical Evaluation of RDF Graph Partitioning Techniques
An Empirical Evaluation of RDF Graph Partitioning TechniquesAn Empirical Evaluation of RDF Graph Partitioning Techniques
An Empirical Evaluation of RDF Graph Partitioning Techniques
 
Quality Control of NGS Data
Quality Control of NGS Data Quality Control of NGS Data
Quality Control of NGS Data
 
Webinar: What's New in Pipeline Pilot 8.5 Collection Update 1?
Webinar: What's New in Pipeline Pilot 8.5 Collection Update 1?Webinar: What's New in Pipeline Pilot 8.5 Collection Update 1?
Webinar: What's New in Pipeline Pilot 8.5 Collection Update 1?
 
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
Heaven: Supporting Systematic Comparative Research of RDF Stream Processing E...
 
Effective and Efficient Entity Search in RDF data
Effective and Efficient Entity Search in RDF dataEffective and Efficient Entity Search in RDF data
Effective and Efficient Entity Search in RDF data
 
Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)
 
Introduction to R
Introduction to RIntroduction to R
Introduction to R
 
Rob Davidson: Using Galaxy for Metabolomics
Rob Davidson: Using Galaxy for MetabolomicsRob Davidson: Using Galaxy for Metabolomics
Rob Davidson: Using Galaxy for Metabolomics
 
Reverted Indexing for Expansion and Feedback
Reverted Indexing for Expansion and FeedbackReverted Indexing for Expansion and Feedback
Reverted Indexing for Expansion and Feedback
 
Icml2018 naver review
Icml2018 naver reviewIcml2018 naver review
Icml2018 naver review
 
search engine
search enginesearch engine
search engine
 

Similar to ntcir14centre-overview

Efficient top-k queries processing in column-family distributed databases
Efficient top-k queries processing in column-family distributed databasesEfficient top-k queries processing in column-family distributed databases
Efficient top-k queries processing in column-family distributed databasesRui Vieira
 
Apsec 2014 Presentation
Apsec 2014 PresentationApsec 2014 Presentation
Apsec 2014 Presentation
Ahrim Han, Ph.D.
 
Uk Research Infrastructure Workshop E-infrastructure Juan Bicarregui
Uk Research Infrastructure Workshop E-infrastructure Juan BicarreguiUk Research Infrastructure Workshop E-infrastructure Juan Bicarregui
Uk Research Infrastructure Workshop E-infrastructure Juan Bicarregui
Innovate UK
 
Analysis of Educational Robotics activities using a machine learning approach
Analysis of Educational Robotics activities using a machine learning approachAnalysis of Educational Robotics activities using a machine learning approach
Analysis of Educational Robotics activities using a machine learning approach
Lorenzo Cesaretti
 
VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking
VL/HCC 2014 - A Longitudinal Study of Programmers' BacktrackingVL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking
VL/HCC 2014 - A Longitudinal Study of Programmers' BacktrackingYoungSeok Yoon
 
Hydraulics Team Full-Technical Lab Report
Hydraulics Team Full-Technical Lab ReportHydraulics Team Full-Technical Lab Report
Hydraulics Team Full-Technical Lab Report
Alfonso Figueroa
 
List from 1 – 10 1. Chanel Black Dress 2. Women’s suit 3. .docx
List from 1 – 10 1. Chanel Black Dress 2. Women’s suit 3. .docxList from 1 – 10 1. Chanel Black Dress 2. Women’s suit 3. .docx
List from 1 – 10 1. Chanel Black Dress 2. Women’s suit 3. .docx
croysierkathey
 
Towards explanations for Data-Centric AI using provenance records
Towards explanations for Data-Centric AI using provenance recordsTowards explanations for Data-Centric AI using provenance records
Towards explanations for Data-Centric AI using provenance records
Paolo Missier
 
Triantafyllia Voulibasi
Triantafyllia VoulibasiTriantafyllia Voulibasi
Triantafyllia Voulibasi
ISSEL
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltc
NAVER Engineering
 
Analytics of analytics pipelines: from optimising re-execution to general Dat...
Analytics of analytics pipelines:from optimising re-execution to general Dat...Analytics of analytics pipelines:from optimising re-execution to general Dat...
Analytics of analytics pipelines: from optimising re-execution to general Dat...
Paolo Missier
 
Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...
Rakebul Hasan
 
8th sem (1)
8th sem (1)8th sem (1)
8th sem (1)
IdiotJackveer
 
Multi-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsMulti-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender Systems
Aravind Sesagiri Raamkumar
 
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
krisztianbalog
 
Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram
Praveen Penumathsa
 
Learning by example: training users through high-quality query suggestions
Learning by example: training users through high-quality query suggestionsLearning by example: training users through high-quality query suggestions
Learning by example: training users through high-quality query suggestions
Claudia Hauff
 
Dolap13 v9 7.docx
Dolap13 v9 7.docxDolap13 v9 7.docx
Dolap13 v9 7.docx
Hassan Nazih
 
Object Oriented Programming Lab Manual
Object Oriented Programming Lab Manual Object Oriented Programming Lab Manual
Object Oriented Programming Lab Manual
Abdul Hannan
 
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.pptProto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
AnirbanBhar3
 

Similar to ntcir14centre-overview (20)

Efficient top-k queries processing in column-family distributed databases
Efficient top-k queries processing in column-family distributed databasesEfficient top-k queries processing in column-family distributed databases
Efficient top-k queries processing in column-family distributed databases
 
Apsec 2014 Presentation
Apsec 2014 PresentationApsec 2014 Presentation
Apsec 2014 Presentation
 
Uk Research Infrastructure Workshop E-infrastructure Juan Bicarregui
Uk Research Infrastructure Workshop E-infrastructure Juan BicarreguiUk Research Infrastructure Workshop E-infrastructure Juan Bicarregui
Uk Research Infrastructure Workshop E-infrastructure Juan Bicarregui
 
Analysis of Educational Robotics activities using a machine learning approach
Analysis of Educational Robotics activities using a machine learning approachAnalysis of Educational Robotics activities using a machine learning approach
Analysis of Educational Robotics activities using a machine learning approach
 
VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking
VL/HCC 2014 - A Longitudinal Study of Programmers' BacktrackingVL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking
VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking
 
Hydraulics Team Full-Technical Lab Report
Hydraulics Team Full-Technical Lab ReportHydraulics Team Full-Technical Lab Report
Hydraulics Team Full-Technical Lab Report
 
List from 1 – 10 1. Chanel Black Dress 2. Women’s suit 3. .docx
List from 1 – 10 1. Chanel Black Dress 2. Women’s suit 3. .docxList from 1 – 10 1. Chanel Black Dress 2. Women’s suit 3. .docx
List from 1 – 10 1. Chanel Black Dress 2. Women’s suit 3. .docx
 
Towards explanations for Data-Centric AI using provenance records
Towards explanations for Data-Centric AI using provenance recordsTowards explanations for Data-Centric AI using provenance records
Towards explanations for Data-Centric AI using provenance records
 
Triantafyllia Voulibasi
Triantafyllia VoulibasiTriantafyllia Voulibasi
Triantafyllia Voulibasi
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltc
 
Analytics of analytics pipelines: from optimising re-execution to general Dat...
Analytics of analytics pipelines:from optimising re-execution to general Dat...Analytics of analytics pipelines:from optimising re-execution to general Dat...
Analytics of analytics pipelines: from optimising re-execution to general Dat...
 
Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...
 
8th sem (1)
8th sem (1)8th sem (1)
8th sem (1)
 
Multi-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsMulti-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender Systems
 
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
Towards Filling the Gap in Conversational Search: From Passage Retrieval to C...
 
Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram
 
Learning by example: training users through high-quality query suggestions
Learning by example: training users through high-quality query suggestionsLearning by example: training users through high-quality query suggestions
Learning by example: training users through high-quality query suggestions
 
Dolap13 v9 7.docx
Dolap13 v9 7.docxDolap13 v9 7.docx
Dolap13 v9 7.docx
 
Object Oriented Programming Lab Manual
Object Oriented Programming Lab Manual Object Oriented Programming Lab Manual
Object Oriented Programming Lab Manual
 
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.pptProto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt Proto Spiral.ppt
 

More from Tetsuya Sakai

sigir2020
sigir2020sigir2020
sigir2020
Tetsuya Sakai
 
ipsjifat201909
ipsjifat201909ipsjifat201909
ipsjifat201909
Tetsuya Sakai
 
sigir2019
sigir2019sigir2019
sigir2019
Tetsuya Sakai
 
assia2019
assia2019assia2019
assia2019
Tetsuya Sakai
 
evia2019
evia2019evia2019
evia2019
Tetsuya Sakai
 
ecir2019tutorial-finalised
ecir2019tutorial-finalisedecir2019tutorial-finalised
ecir2019tutorial-finalised
Tetsuya Sakai
 
ecir2019tutorial
ecir2019tutorialecir2019tutorial
ecir2019tutorial
Tetsuya Sakai
 
WSDM2019tutorial
WSDM2019tutorialWSDM2019tutorial
WSDM2019tutorial
Tetsuya Sakai
 
sigir2018tutorial
sigir2018tutorialsigir2018tutorial
sigir2018tutorial
Tetsuya Sakai
 
Evia2017unanimity
Evia2017unanimityEvia2017unanimity
Evia2017unanimity
Tetsuya Sakai
 
Evia2017assessors
Evia2017assessorsEvia2017assessors
Evia2017assessors
Tetsuya Sakai
 
Evia2017dialogues
Evia2017dialoguesEvia2017dialogues
Evia2017dialogues
Tetsuya Sakai
 
Evia2017wcw
Evia2017wcwEvia2017wcw
Evia2017wcw
Tetsuya Sakai
 
sigir2017bayesian
sigir2017bayesiansigir2017bayesian
sigir2017bayesian
Tetsuya Sakai
 
NL20161222invited
NL20161222invitedNL20161222invited
NL20161222invited
Tetsuya Sakai
 
AIRS2016
AIRS2016AIRS2016
AIRS2016
Tetsuya Sakai
 
Nl201609
Nl201609Nl201609
Nl201609
Tetsuya Sakai
 
ictir2016
ictir2016ictir2016
ictir2016
Tetsuya Sakai
 
ICTIR2016tutorial
ICTIR2016tutorialICTIR2016tutorial
ICTIR2016tutorial
Tetsuya Sakai
 
SIGIR2016
SIGIR2016SIGIR2016
SIGIR2016
Tetsuya Sakai
 

More from Tetsuya Sakai (20)

sigir2020
sigir2020sigir2020
sigir2020
 
ipsjifat201909
ipsjifat201909ipsjifat201909
ipsjifat201909
 
sigir2019
sigir2019sigir2019
sigir2019
 
assia2019
assia2019assia2019
assia2019
 
evia2019
evia2019evia2019
evia2019
 
ecir2019tutorial-finalised
ecir2019tutorial-finalisedecir2019tutorial-finalised
ecir2019tutorial-finalised
 
ecir2019tutorial
ecir2019tutorialecir2019tutorial
ecir2019tutorial
 
WSDM2019tutorial
WSDM2019tutorialWSDM2019tutorial
WSDM2019tutorial
 
sigir2018tutorial
sigir2018tutorialsigir2018tutorial
sigir2018tutorial
 
Evia2017unanimity
Evia2017unanimityEvia2017unanimity
Evia2017unanimity
 
Evia2017assessors
Evia2017assessorsEvia2017assessors
Evia2017assessors
 
Evia2017dialogues
Evia2017dialoguesEvia2017dialogues
Evia2017dialogues
 
Evia2017wcw
Evia2017wcwEvia2017wcw
Evia2017wcw
 
sigir2017bayesian
sigir2017bayesiansigir2017bayesian
sigir2017bayesian
 
NL20161222invited
NL20161222invitedNL20161222invited
NL20161222invited
 
AIRS2016
AIRS2016AIRS2016
AIRS2016
 
Nl201609
Nl201609Nl201609
Nl201609
 
ictir2016
ictir2016ictir2016
ictir2016
 
ICTIR2016tutorial
ICTIR2016tutorialICTIR2016tutorial
ICTIR2016tutorial
 
SIGIR2016
SIGIR2016SIGIR2016
SIGIR2016
 

Recently uploaded

zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 

Recently uploaded (20)

zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 

ntcir14centre-overview

  • 1. Overview of the NTCIR-14 CENTRE Task Tetsuya Sakai Nicola Ferro Ian Soboroff Zhaohao Zeng Peng Xiao Maria Maistro Waseda University, Japan University of Padua, Italy NIST, USA http://www.centre-eval.org/ Twitter: @_centre_ June 11 and 13, 2019@NTCIR-14, Tokyo.
  • 2. TALK OUTLINE 1. CENTRE@CLEF, NTCIR, TREC: motivations 2. CENTRE@NTCIR-14 task specifications 3. Additional relevance assessments 4. Evaluation measures 5. Replicability results 6. Reproducibility results 7. CENTRE@NTCIR-14 conclusions 8. Plans for NTCIR-15 9. Related efforts
  • 3. Motivation • Researcher A publishes a paper; says Algo X > Algo Y on test data D. ⇒ Researcher B tries Algo X and Y on D but finds Algo X < Algo Y. A replicability problem. • Researcher A publishes a paper; says Algo X > Algo Y on test data D. ⇒ Researcher B tries Algo X and Y on D’ but finds Algo X < Algo Y. A reproducibility problem. Can the IR community do better?
  • 4. Research questions • Can results from CLEF, NTCIR, and TREC be replicated/reproduced easily? If not, what are the difficulties? What needs to be improved? • Can results from (say) TREC be reproduced with (say) NTCIR data? Why or why not? • [Participants may have their own research questions]
  • 5. CENTRE = CLEF/NTCIR/TREC (replicability and) reproducibility CLEF Conferences and Labs of the Evaluation Forum NTCIR NII Testbeds and Community for Information access Research TREC Text REtrieval Conference Data, code, wisdom Replicability and Reproducibility experiments Data, code, wisdom Replicability and Reproducibility experiments Data, code, wisdom Replicability and Reproducibility experiments Europe and more; Annual US and more; Annual Asia and more; Sesquinnual
  • 6. Who should participate in CENTRE? • IR evaluation nerds • Students willing to replicate/reproduce SOTA methods in IR, so that they will eventually be able to propose their own methods and demonstrate their superiority over the SOTA • Those who want to be involved in more than one evaluation communities across the globe • Those who want to use data from more than one evaluation conferences
  • 7. TALK OUTLINE 1. CENTRE@CLEF, NTCIR, TREC: motivations 2. CENTRE@NTCIR-14 task specifications 3. Additional relevance assessments 4. Evaluation measures 5. Replicability results 6. Reproducibility results 7. CENTRE@NTCIR-14 conclusions 8. Plans for NTCIR-15 9. Related efforts
  • 8. CENTRE@NTCIR-14: How it works (1) Past CLEFs Past TRECs Past NTCIRs A, B: target run pairs A: advanced B: baseline A, B: target run pairs A, B: target run pairs IR literature A, B: target run pairs Selected by organisers Selected by each team
  • 9. CENTRE@NTCIR-14: How it works (2) Past CLEFs Past TRECs Past NTCIRs A, B: target run pairs T1{A,B}: replicated run pairs T1: replicability subtask (same NTCIR data) A: advanced B: baseline A, B: target run pairs A, B: target run pairs IR literature A, B: target run pairs Selected by organisers Selected by each team Evaluation measures: • RSME of the topicwise deltas • Pearson correlation of the topicwise deltas • Effect Ratio (ER)
  • 10. CENTRE@NTCIR-14: How it works (2) Past CLEFs Past TRECs Past NTCIRs A, B: target run pairs T1{A,B}: replicated run pairs T1: replicability subtask (same NTCIR data) A: advanced B: baseline A, B: target run pairs A, B: target run pairs T2: reproducibility subtask IR literature T2CLEF-{A,B} Run pairs reproduced on NTCIR data T2TREC-{A,B} T2OPEN-{A,B} A, B: target run pairs Selected by organisers Selected by each team Evaluation measure: Effect Ratio (ER) T2OPEN runs evaluated only with effectiveness measures Past CLEF runs not considered at CENTRE@NTCIR-14
  • 11. Target runs for CENTRE@NTCIR-14 • We focus on ad hoc monolingual web search. • For T1 (replicability): A pair of runs from the NTCIR-13 We Want Web task - Overview paper: http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings13/pdf/ntcir/01-NTCIR13-OV-WWW- LuoC.pdf • For T2 (reproducibility): A pair of runs from the TREC 2013 web track - Overview paper: https://trec.nist.gov/pubs/trec22/papers/WEB.OVERVIEW.pdf • All of the above target runs use clueweb12 as the search corpus http://www.lemurproject.org/clueweb12.php/ • Both the target NTCIR and TREC runs are implemented using Indri https://www.lemurproject.org/indri/
  • 12. Target NTCIR runs and data for T1 (replicability) • Test collection: NTCIR-13 WWW English (clueweb12-B13, 100 topics) • Target NTCIR run pairs: From the NTCIR-13 WWW RMIT paper [Gallagher+17] http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings13/pdf/ntcir/02-NTCIR13-WWW-GallagherL.pdf A: RMIT-E-NU-Own-1 (sequential dependency model: SDM) B: RMIT-E-NU-Own-3 (full dependency model: FDM) • SDM and FDM are from [Metzler+Croft SIGIR05] https://doi.org/10.1145/1076034.1076115
  • 13. Target TREC runs and data for T2 (reproduciblity) • Test collection: TREC 2013 Web Track (clueweb12 category A, 50 topics) • Target TREC run pairs: From the TREC 2013 Web Track U Delaware paper [Yang+13] : https://trec.nist.gov/pubs/trec22/papers/udel_fang-web.pdf A: UDInfolabWEB2 (selects semantically related terms using Web-based working sets) B: UDInfolabWEB1 (selects semantically related terms using collection-based working sets) • The working set construction method is from [Fang+SIGIR06] https://doi.org/10.1145/1148170.1148193
  • 14. How participants were expected to submit T1 (replicability) runs 0. Obtain clueweb12-B13 (or the full data) http://www.lemurproject.org/clueweb12.php/ and Indri https://www.lemurproject.org/indri/ 1. Register to the CENTRE task and obtain the NTCIR-13 WWW topics and qrels from the CENTRE organisers 2. Read the NTCIR-13 WWW RMIT paper [Gallagher+17] http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings13/pdf/ntcir/02-NTCIR13-WWW-GallagherL.pdf and try to replicate the A-run and the B-run on the NTCIR-13 WWW-1 test collection A: RMIT-E-NU-Own-1 B: RMIT-E-NU-Own-3 3. Submit the replicated runs by the deadline
• 15. How participants were expected to submit T2TREC (reproducibility) runs 0. Obtain clueweb12-B13 (or the full data) http://www.lemurproject.org/clueweb12.php/ and (optionally) Indri https://www.lemurproject.org/indri/ 1. Register for the CENTRE task and obtain the NTCIR-13 WWW topics and qrels from the CENTRE organisers 2. Read the TREC 2013 Web Track U Delaware paper [Yang+13]: https://trec.nist.gov/pubs/trec22/papers/udel_fang-web.pdf and try to reproduce the A-run and the B-run on the NTCIR-13 WWW-1 test collection A: UDInfolabWEB2 B: UDInfolabWEB1 3. Submit the reproduced runs by the deadline
• 16. How participants were expected to submit T2OPEN (reproducibility) runs 0. Obtain clueweb12-B13 (or the full data) http://www.lemurproject.org/clueweb12.php/ and (optionally) Indri https://www.lemurproject.org/indri/ 1. Register for the CENTRE task and obtain the NTCIR-13 WWW topics and qrels from the CENTRE organisers 2. Pick any pair of runs A (advanced) and B (baseline) from the IR literature, and try to reproduce the A-run and the B-run on the NTCIR-13 WWW-1 test collection 3. Submit the reproduced runs by the deadline T2OPEN is not discussed further in this overview. For more, see: Andrew Yates: MPII at the NTCIR-14 CENTRE Task
• 17. Submission instructions • Each team can submit only one run pair per subtask (T1, T2TREC, T2OPEN). So at most 6 runs in total. • Run file format: same as WWW-1; see http://www.thuir.cn/ntcirwww/ (Run Submissions format). In short, it's a TREC run file plus a SYSDESC line. • Run file names: CENTRE1-<teamname>-<subtaskname>-[A,B] (A: advanced, B: baseline)
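Purely as an illustration of the above (the WWW-1 page linked here is authoritative; the exact SYSDESC syntax, the topic and document IDs, and the team name "MYTEAM" are assumptions), a T1 A-run file might look roughly like this:

    <SYSDESC>Replication of RMIT-E-NU-Own-1 (Indri SDM)</SYSDESC>
    0001 Q0 clueweb12-0000tw-05-12114 1 27.73 CENTRE1-MYTEAM-T1-A
    0001 Q0 clueweb12-0106wb-18-19516 2 25.31 CENTRE1-MYTEAM-T1-A
    0002 Q0 clueweb12-0003wb-22-00101 1 31.02 CENTRE1-MYTEAM-T1-A
    ...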
  • 18. TALK OUTLINE 1. CENTRE@CLEF, NTCIR, TREC: motivations 2. CENTRE@NTCIR-14 task specifications 3. Additional relevance assessments 4. Evaluation measures 5. Replicability results 6. Reproducibility results 7. CENTRE@NTCIR-14 conclusions 8. Plans for NTCIR-15 9. Related efforts
  • 19. Only one team participated…
• 20. Additional relevance assessments • The NTCIR-13 WWW-1 English qrels were expanded based on the new CENTRE runs (i.e. the 6 runs from MPII). • Depth-30 pools were created, following WWW-1. • 2,617 previously unjudged topic-doc pairs were assessed. • Original NTCIR-13 WWW E qrels: 100 topics, 22,912 topic-doc pairs, based on 13 runs from three teams. Augmented CENTRE@NTCIR-14 qrels: 100 topics, 22,912 + 2,617 = 25,529 topic-doc pairs.
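For concreteness, depth-k pool augmentation of this kind can be computed from the submitted run files roughly as follows. This is a minimal sketch assuming TREC-style run and qrels layouts; the file names, the qrels column order, and any tie-breaking details are assumptions, not the organisers' actual tooling.

    from collections import defaultdict

    DEPTH = 30  # depth-30 pooling, following WWW-1

    def read_run(path):
        """Return topic_id -> list of doc_ids in rank order (TREC run format)."""
        run = defaultdict(list)
        with open(path) as f:
            for line in f:
                if line.startswith("<"):   # skip the SYSDESC header line
                    continue
                topic, _q0, doc, rank, _score, _tag = line.split()[:6]
                run[topic].append((int(rank), doc))
        return {t: [d for _, d in sorted(ranked)] for t, ranked in run.items()}

    def read_judged(qrels_path):
        """Return the set of (topic_id, doc_id) pairs already judged."""
        judged = set()
        with open(qrels_path) as f:
            for line in f:
                topic, _iteration, doc = line.split()[:3]
                judged.add((topic, doc))
        return judged

    run_files = ["T1-A.run", "T1-B.run", "T2TREC-A.run",
                 "T2TREC-B.run", "T2OPEN-A.run", "T2OPEN-B.run"]  # hypothetical names
    judged = read_judged("www1-original.qrels")                    # hypothetical name

    new_pairs = set()
    for run in (read_run(p) for p in run_files):
        for topic, docs in run.items():
            for doc in docs[:DEPTH]:               # top 30 documents per topic per run
                if (topic, doc) not in judged:
                    new_pairs.add((topic, doc))    # previously unjudged -> send to assessors
    print(len(new_pairs), "new topic-doc pairs to judge")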
  • 22. TALK OUTLINE 1. CENTRE@CLEF, NTCIR, TREC: motivations 2. CENTRE@NTCIR-14 task specifications 3. Additional relevance assessments 4. Evaluation measures 5. Replicability results 6. Reproducibility results 7. CENTRE@NTCIR-14 conclusions 8. Plans for NTCIR-15 9. Related efforts
• 23. T1: examining the topicwise deltas • For each topic, the original topicwise improvement (A over B) is compared with the replicated topicwise improvement. • Per collection and measure, the comparison uses the RMSE (Root Mean Squared Error) of the differences between the two sets of deltas, and their Pearson correlation (covariance divided by the product of the standard deviations).
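In symbols (a sketch consistent with the slide's labels; the CENTRE overview paper gives the exact definitions), write $M_j(\cdot)$ for the score of a run on topic $j$ under the chosen measure, and let the original and replicated topicwise improvements be

\[ \Delta_j = M_j(A) - M_j(B), \qquad \Delta'_j = M_j(A') - M_j(B'), \]

where $A'$ and $B'$ are the replicated runs. Over the $n$ topics of the collection,

\[ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{j=1}^{n} \bigl(\Delta'_j - \Delta_j\bigr)^2 }, \qquad r = \frac{\operatorname{cov}(\Delta, \Delta')}{\sigma_{\Delta}\,\sigma_{\Delta'}}. \]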
• 24. T1 and T2: Effect Ratio (1) • T1 (replicability): ER = replicated mean improvement / original mean improvement • T2TREC (reproducibility): ER = reproduced mean improvement / original mean improvement
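Using the same per-topic deltas as above, the Effect Ratio divides the mean improvement observed in the "rep" experiment by the mean improvement reported originally (again a sketch; see the overview paper for the exact definition):

\[ \mathrm{ER} = \frac{\frac{1}{n'}\sum_{j=1}^{n'} \Delta'_j}{\frac{1}{n}\sum_{j=1}^{n} \Delta_j}, \]

where for T1 (replicability) the numerator and denominator are computed over the same $n' = n$ NTCIR topics, while for T2TREC (reproducibility) the numerator is computed over the $n'$ NTCIR WWW-1 topics and the denominator over the $n$ original TREC 2013 Web Track topics.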
• 25. T1 and T2: Effect Ratio (2) • ER compares the sample (unstandardised) effect size in the “rep” experiment against that in the original experiment. “Can we still observe the same effect?” • ER>1 ⇒ effect larger in “rep” experiment • ER=1 ⇒ effect identical to original experiment • 0 < ER < 1 ⇒ effect smaller in “rep” experiment • ER ≤ 0 ⇒ FAILURE: A-run does not outperform B-run in the “rep” experiment
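A minimal Python sketch of the three quantities, given the per-topic deltas as plain lists (in practice these would be read from the per-topic output of an evaluation tool such as NTCIREVAL; the example numbers below are made up):

    import math

    def rmse(orig, repl):
        """Root Mean Squared Error between original and replicated per-topic deltas."""
        return math.sqrt(sum((r - o) ** 2 for o, r in zip(orig, repl)) / len(orig))

    def pearson(xs, ys):
        """Pearson correlation between the two sets of per-topic deltas."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
        sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
        return cov / (sx * sy)

    def effect_ratio(orig, rep):
        """Mean 'rep' improvement over mean original improvement; topic sets may differ (T2)."""
        return (sum(rep) / len(rep)) / (sum(orig) / len(orig))

    orig_deltas = [0.10, 0.05, -0.02]   # made-up per-topic A-minus-B deltas (original)
    rep_deltas  = [0.08, 0.07, -0.01]   # made-up per-topic deltas ("rep" experiment)
    print(rmse(orig_deltas, rep_deltas), pearson(orig_deltas, rep_deltas),
          effect_ratio(orig_deltas, rep_deltas))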
  • 26. TALK OUTLINE 1. CENTRE@CLEF, NTCIR, TREC: motivations 2. CENTRE@NTCIR-14 task specifications 3. Additional relevance assessments 4. Evaluation measures 5. Replicability results 6. Reproducibility results 7. CENTRE@NTCIR-14 conclusions 8. Plans for NTCIR-15 9. Related efforts
  • 27. Replicability: w/o additional assessments (1) • RMIT said “SDM (A) outperforms FDM (B)” • MPII confirms this! • A statistically significantly outperforms B even with nERR! • Effect sizes for nERR (Orig vs Repli) quite similar.
• 28. Replicability: w/o additional assessments (2) • ER for nERR close to 1 • Correlation (Orig Δ vs. Repli Δ) for nERR statistically significant
• 29. Replicability: WITH additional assessments • ER for nERR close to 1 • Correlation (Orig Δ vs. Repli Δ) for nERR statistically significant
• 30. nERR: r = 0.2610, 95% CI [0.0680, 0.4351]. [Scatter plot of per-topic Orig Δ vs. Repli Δ; e.g. one topic with Orig Δ ≒ 0.4 but Repli Δ ≒ 0.8 (A>B), another with Orig Δ ≒ -0.3 but Repli Δ ≒ -0.6 (A<B).] But replicability not so successful at the topic level.
  • 31. TALK OUTLINE 1. CENTRE@CLEF, NTCIR, TREC: motivations 2. CENTRE@NTCIR-14 task specifications 3. Additional relevance assessments 4. Evaluation measures 5. Replicability results 6. Reproducibility results 7. CENTRE@NTCIR-14 conclusions 8. Plans for NTCIR-15 9. Related efforts
  • 32. Reproducibility: w/o additional assessments (1) • UDel said “web-based working sets (A) outperform collection-based working sets (B)” for TREC • MPII confirms this for NTCIR • A statistically significantly outperforms B even with Q! • Effect sizes for Q (Orig vs Repli) quite similar.
  • 33. Reproducibility: w/o additional assessments (2) ER for Q close to 1
• 34. Reproducibility: WITH additional assessments • Larger ER values! • Smaller p-values! • New relevant docs were found by the CENTRE runs; addition worthwhile!
  • 35. TALK OUTLINE 1. CENTRE@CLEF, NTCIR, TREC: motivations 2. CENTRE@NTCIR-14 task specifications 3. Additional relevance assessments 4. Evaluation measures 5. Replicability results 6. Reproducibility results 7. CENTRE@NTCIR-14 conclusions 8. Plans for NTCIR-15 9. Related efforts
• 36. CENTRE@NTCIR-14 conclusions • We focussed on: “Can an effect (gain of A over B) be replicated/reproduced?” • Replication of the RMIT effect on WWW-1 data: successful at the ER level; not so successful at the topic level. • Reproduction of the TREC UDel effect on WWW-1 data: successful at the ER level; additional relevance assessments further improved the ERs. • Only one participating team… this needs to change.
  • 37. SPECIAL THANKS • To Andrew Yates (Max Planck Institute for Informatics, Germany) • MPII at the NTCIR-14 CENTRE Task
  • 38. TALK OUTLINE 1. CENTRE@CLEF, NTCIR, TREC: motivations 2. CENTRE@NTCIR-14 task specifications 3. Additional relevance assessments 4. Evaluation measures 5. Replicability results 6. Reproducibility results 7. CENTRE@NTCIR-14 conclusions 8. Plans for NTCIR-15 9. Related efforts
• 39. WWW and CENTRE join forces • WWW-3 task will have three subtasks: (1) Adhoc Chinese (2) Adhoc English (WWW-2 + new WWW-3 topics) (3) CENTRE (WWW-2 + new WWW-3 topics) • CENTRE will accept two run types: (a) Revived runs from WWW-2 (Tsinghua and RUC) (b) REP (REPlicated and REProduced) runs: Tsinghua and RUC algorithms re-implemented by other teams
• 40. Measuring progress: WWW-2 top run → WWW-3 top run. [Diagram: WWW-2 target run 1 (Team A) and WWW-2 target run 2 (Team B), processed on the WWW-2 topics, are revived by the same teams as WWW-3 CENTRE revived run 1 and revived run 2, processed on the new WWW-3 topics, and compared with a WWW-3 Adhoc run (Team D).] COMPARE using the WWW-3 topics (since WWW-3 systems may be tuned with WWW-2 topics).
• 41. Measuring replicability on WWW-2 topics. [Diagram: WWW-2 target run 1 (Team A) and target run 2 (Team B) are replicated/reproduced by a different team as WWW-3 CENTRE REP run 1-1 and REP run 1-2 (Team C).] COMPARE using the WWW-2 topics.
• 42. Measuring reproducibility: WWW-2 topics → WWW-3 topics. [Diagram: WWW-3 CENTRE REP run 1-1 and REP run 1-2 (Team C), replicated/reproduced from the WWW-2 target runs by a different team, are processed on both topic sets.] COMPARE using the WWW-2 topics and the WWW-3 topics.
  • 43. TALK OUTLINE 1. CENTRE@CLEF, NTCIR, TREC: motivations 2. CENTRE@NTCIR-14 task specifications 3. Additional relevance assessments 4. Evaluation measures 5. Replicability results 6. Reproducibility results 7. CENTRE@NTCIR-14 conclusions 8. Plans for NTCIR-15 9. Related efforts
  • 44. CENTRE@CLEF 2019 CLEF NTCIR TREC REproducibility Nicola Ferro, Norbert Fuhr, Maria Maistro, Tetsuya Sakai, and Ian Soboroff April 15, 2019, Cologne http://www.centre-eval.org/clef2019/
• 45. CENTRE@CLEF 2019 Tasks The CENTRE@CLEF 2019 lab will offer three tasks: • Task 1 - Replicability: the task will focus on the replicability of selected methods on the same experimental collections; • Task 2 - Reproducibility: the task will focus on the reproducibility of selected methods on different experimental collections; • Task 3 - Generalizability: the task will focus on collection performance prediction, and the goal is to rank (sub-)collections on the basis of the expected performance over them.
  • 46. Task 1 & 2: Selected Papers For Task 1 - Replicability and Task 2 - Reproducibility we selected the following papers from TREC Common Core Track 2017 and 2018: • [Grossman et al, 2017] Grossman, M. R., and Cormack, G. V. (2017). MRG_UWaterloo and Waterloo Cormack Participation in the TREC 2017 Common Core Track. In TREC 2017. • [Benham et al, 2018] Benham, R., Gallagher, L., Mackenzie, J., Liu, B., Lu, X., Scholer, F., Moffat, A., and Culpepper, J. S. (2018). RMIT at the 2018 TREC CORE Track. In TREC 2018.
• 47. Task 3 - Generalizability • Training: participants run plain BM25 and identify features of the corpus and topics that allow them to predict the system score in terms of Average Precision (AP). • Validation: participants can validate their method and determine which set of features is the best choice for predicting the AP score of each system. • Test (submission): participants submit a run for each system (BM25 and their own system) and an additional file (one per system) containing the AP score predicted for each topic.
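To make the phases concrete, here is a minimal sketch of the prediction step only, assuming the corpus/topic features and the per-topic AP scores of BM25 on the training (sub-)collections have already been computed; the feature values, the ridge-regression model, and the array shapes are illustrative assumptions, not part of the task definition.

    import numpy as np
    from sklearn.linear_model import Ridge

    # Training data: one row per topic of the training (sub-)collections.
    # Columns: corpus/topic features (e.g. query length, mean query-term IDF, corpus size).
    X_train = np.array([[3, 7.2, 1.0e6],
                        [5, 5.8, 1.0e6],
                        [2, 9.1, 1.0e6]], dtype=float)
    y_train = np.array([0.31, 0.12, 0.45])   # per-topic AP of plain BM25 (made-up values)

    model = Ridge(alpha=1.0).fit(X_train, y_train)

    # Test phase: predict the AP score for each topic of the unseen (sub-)collection,
    # then write the predictions to the additional per-system submission file.
    X_test = np.array([[4, 6.5, 2.0e6]], dtype=float)
    print(model.predict(X_test))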
  • 48. SIGIR 2019 Open-Source IR Replicability Challenge (OSIRRC 2019) • Repeatability, replicability, and reproducibility: • Good science! • Helps sustain cumulative empirical progress in the field • OSIRRC is a workshop at SIGIR with three goals: • Develop a common Docker interface to capture ad hoc retrieval experiments • Build a library of Docker images that implements the interface • Explore this approach to additional tasks and evaluation methodologies
  • 49. OSIRRC 2019: Current Status • We’ve converged on a Docker specification for ad hoc retrieval experiments – called “the jig” • All materials at https://github.com/osirrc • Initial library of 12 images from 9 different groups under development • It’s not too late if you still want to participate! • Wrapping an existing retrieval system in “the jig” isn’t too difficult a task… • See you at SIGIR 2019!
• 50. ACM Artifact Badging - SIGIR Survey. Nicola Ferro (@frrncl), University of Padua, Italy. The 3rd Strategic Workshop on Information Retrieval in Lorne (SWIRL III), 14 February 2018, Lorne, Australia.
• 51. ACM Badging: Artifacts Evaluated • This badge is applied to papers whose associated artifacts have successfully completed an independent audit. Artifacts need not be made publicly available to be considered for this badge. However, they do need to be made available to reviewers. • Artifacts Evaluated – Functional: The artifacts associated with the research are found to be documented, consistent, complete, exercisable, and include appropriate evidence of verification and validation. • Artifacts Evaluated – Reusable: The artifacts associated with the paper are of a quality that significantly exceeds minimal functionality. That is, they have all the qualities of the Artifacts Evaluated – Functional level, but, in addition, they are very carefully documented and well-structured to the extent that reuse and repurposing is facilitated. In particular, norms and standards of the research community for artifacts of this type are strictly adhered to.
• 52. ACM Badging: Artifacts Available • This badge is applied to papers in which associated artifacts have been made permanently available for retrieval. • Author-created artifacts relevant to this paper have been placed on a publicly accessible archival repository. A DOI or link to this repository along with a unique identifier for the object is provided. • Repositories used to archive data should have a declared plan to enable permanent accessibility. Personal web pages are not acceptable for this purpose. • Artifacts do not need to have been formally evaluated in order for an article to receive this badge.
• 53. ACM Badging: Results Validated • This badge is applied to papers in which the main results of the paper have been successfully obtained by a person or team other than the author. • Results Replicated: The main results of the paper have been obtained in a subsequent study by a person or team other than the authors, using, in part, artifacts provided by the author. • Results Reproduced: The main results of the paper have been independently obtained in a subsequent study by a person or team other than the authors, without the use of author-supplied artifacts. • Exact replication or reproduction of results is not required, or even expected. In particular, differences in the results should not change the main claims made in the paper.
• 54. SIGIR Badging Task Force Nicola Ferro (chair) University of Padua, Italy Diane Kelly (ex-officio) The University of Tennessee, Knoxville, USA Maarten de Rijke (ex-officio) University of Amsterdam, The Netherlands Leif Azzopardi University of Strathclyde, UK Peter Bailey Microsoft, USA Hannah Bast University of Freiburg, Germany Rob Capra University of North Carolina at Chapel Hill, USA Norbert Fuhr University of Duisburg-Essen, Germany Yiqun Liu Tsinghua University, China Martin Potthast Leipzig University, Germany Filip Radlinski Google London, UK Tetsuya Sakai Waseda University, Japan Ian Soboroff National Institute of Standards and Technology, USA Arjen de Vries Radboud University, The Netherlands