1. EASTER: Evaluating Subject Tools for Enhancing Retrieval
Michael Day
Research and Development Team Leader
UKOLN, University of Bath
Automatic Metadata Generation and Text Mining Projects Meeting, London, 25 May 2010
UKOLN is supported by:
www.ukoln.ac.uk
A centre of expertise in digital information management
2. Presentation outline
• Project aims and objectives
• Value and impact
3. Project aims and objectives (1)
• Problems that the project is trying to explore:
– The testing and evaluation of existing automated subject
metadata generation tools
• Creating subject metadata using controlled terms
from vocabularies
• The enrichment of metadata records
– The potential implementation of these tools in metadata
creation workflows (e.g. Intute)
– How enhanced metadata influences quality of retrieval
– Evaluation methodologies – the lack of a consistent
approach makes it difficult to compare tools
4. Project aims and objectives (2)
• Why is EASTER important?
– Manual generation of subject metadata and application of
controlled vocabularies is time-consuming and expensive
– Many tools available – need for evaluation methodologies
(e.g., quality of retrieval, indexing quality)
• To whom is EASTER important?
– Digital collection providers - many potential application
domains for the automatic generation of subject
metadata
– Those implementing tools or developing new ones
(evaluation methodologies)
5. Project aims and objectives (3)
• What is the envisaged final project output?
– Indexing perspectives:
• A framework for the evaluation of automated subject
indexing tools (report)
– Based on comparison of indexing quality with
manually created ‘gold standard’ metadata
(ground truth), reviewing aspects such as exhaustivity,
correctness and consistency
– Evaluation report on tools
• Workflow integration (demonstrator)
– User perspectives:
• End-user retrieval testing (report)
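The comparison of automatically assigned terms with a manually created gold standard, listed under the indexing perspectives above, could be sketched roughly as follows. This is only an illustration and not the project's actual framework: the function name `evaluate_indexing`, the sample terms, and the choice of measures are all assumptions. Precision is used here as a rough stand-in for correctness, recall as a rough stand-in for coverage of the gold-standard terms, and a set-overlap ratio as a simple consistency indicator.

```python
def evaluate_indexing(auto_terms, gold_terms):
    """Compare automatically assigned subject terms with gold-standard terms.

    Hypothetical helper for illustration only; the measures are
    assumptions, not the evaluation framework described in the slides.
    """
    auto, gold = set(auto_terms), set(gold_terms)
    agreed = auto & gold  # terms assigned by both the tool and the indexer

    # Share of the tool's terms that also appear in the gold standard.
    precision = len(agreed) / len(auto) if auto else 0.0
    # Share of the gold-standard terms that the tool managed to assign.
    recall = len(agreed) / len(gold) if gold else 0.0
    # Agreed terms divided by all distinct terms used by either side.
    union = auto | gold
    overlap = len(agreed) / len(union) if union else 1.0

    return {"precision": precision, "recall": recall, "overlap": overlap}


# Invented sample terms for a single record.
scores = evaluate_indexing(
    ["digital libraries", "metadata", "information retrieval", "cataloguing"],
    ["metadata", "information retrieval", "subject indexing"],
)
print(scores)
```

Exact string matching is the simplest possible comparison; an evaluation over a real controlled vocabulary would probably also need to handle near matches such as broader or narrower terms.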
6. Potential value and impact
• Exploring the value of subject metadata and
controlled vocabularies for digital collection
providers:
– Use of tools in metadata generation workflows
– Faster / higher-quality subject metadata generation
• Enhanced support for retrieval:
– Metadata consistency, contextualisation of subject terms
• A number of potential application areas:
– Repositories, OER, large-scale digitisation programmes,
etc.
• Evaluation methodologies for tools