EASTER: Evaluating Subject Tools
    for Enhancing Retrieval
                         Michael Day
            Research and Development Team Leader
                  UKOLN, University of Bath

Automatic Metadata Generation and Text Mining Projects Meeting, London, 25 May 2010


        www.ukoln.ac.uk
        A centre of expertise in digital information management
Presentation outline
• Project aims and objectives
• Value and impact




Project aims and objectives (1)
• Problems that the project is exploring:
    – The testing and evaluation of existing automated subject
      metadata generation tools
        • Creating subject metadata using controlled terms
          from vocabularies
        • The enrichment of metadata records
    – The potential implementation of these tools in metadata
      creation workflows (e.g. Intute)
    – How enhanced metadata influences quality of retrieval
    – Evaluation methodologies – the lack of a consistent
      approach makes it difficult to compare tools

Project aims and objectives (2)
• Why is EASTER important?
   – Manual generation of subject metadata and application of
     controlled vocabularies is time-consuming and expensive
   – Many tools available – need for evaluation methodologies
     (e.g., quality of retrieval, indexing quality)
• To whom is EASTER important?
   – Digital collection providers: many potential application
     domains for the automatic generation of subject
     metadata
   – Those implementing tools or developing new ones
     (evaluation methodologies)

Project aims and objectives (3)
• What is the envisaged final project output?
    – Indexing perspectives:
        • A framework for the evaluation of automated subject
          indexing tools (report)
            – Based on comparison of indexing quality with
              manually created ‘gold standard’ metadata
              (ground truth), assessing criteria such as
              exhaustivity, correctness and consistency
            – Evaluation report on tools
        • Workflow integration (demonstrator)
    – User perspectives:
        • End-user retrieval testing (report)
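
As a rough illustration of comparing tool output with 'gold standard' metadata, the sketch below scores one record's automatically assigned terms against its manually assigned terms. It is not the project's evaluation framework. The function name, the sample terms, and the mapping of correctness to precision, exhaustivity to recall, and consistency to set overlap are all assumptions made for this example.

```python
def evaluate_indexing(tool_terms, gold_terms):
    """Score one record's tool-assigned terms against its gold-standard terms.

    Hypothetical helper: the measure names and their mapping to the
    slide's criteria are assumptions, not the EASTER framework itself.
    """
    tool, gold = set(tool_terms), set(gold_terms)
    shared = tool & gold   # terms both the tool and the indexer assigned
    union = tool | gold    # every term assigned by either

    # Precision: share of the tool's terms that are in the gold standard
    # (used here as a stand-in for "correctness").
    precision = len(shared) / len(tool) if tool else 0.0
    # Recall: share of gold-standard terms the tool found
    # (used here as a stand-in for "exhaustivity").
    recall = len(shared) / len(gold) if gold else 0.0
    # Jaccard overlap between the two term sets
    # (used here as a stand-in for "consistency").
    overlap = len(shared) / len(union) if union else 0.0

    return {"precision": precision, "recall": recall, "overlap": overlap}


# Made-up example record; the terms are illustrative only.
gold = ["Information retrieval", "Metadata", "Subject indexing"]
tool = ["Metadata", "Subject indexing", "Text mining", "Libraries"]

scores = evaluate_indexing(tool, gold)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

Averaging such per-record scores over a test collection would give one comparable figure per tool, assuming every tool is run on the same records and the same vocabulary.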

Potential value and impact
• Exploring the value of subject metadata and
  controlled vocabularies for digital collection
  providers:
    – Use of tools in metadata generation workflows
    – Faster / higher-quality subject metadata generation
• Enhanced support for retrieval:
    – Metadata consistency, contextualisation of subject terms
• A number of potential application areas:
    – Repositories, OER, large-scale digitisation programmes,
      etc.
• Evaluation methodologies for tools

