Succeed Validation and Take up of Tools - Katrien Depuydt

Succeed WP3 Validation and Take-up of Tools at the "Succeed in Digitisation. Spreading Excellence" Conference.

Published in: Technology

  1. Succeed WP3 – Validation and take-up of tools
     Katrien Depuydt (INL), Stefan Eickeler and Sebastian Kirch (IAIS)
     Succeed is supported by the European Union under FP7-ICT and coordinated by Universidad de Alicante.
  2. Objectives
     Many tools and linguistic resources were developed in research and development programs supporting the digitisation of cultural heritage. Still, too few are used in productive environments.
     Succeed’s approach to support the take-up of these tools:
       1. Identify existing tools and resources
       2. Identify libraries willing to use and evaluate tools
       3. Define criteria to validate and evaluate tools
       4. Provide training material for tools
       5. Provide support to libraries using and evaluating tools
       6. Produce a blueprint for validation and take-up of tools
  3. Survey of tools; Training material; Evaluation
  4. 1. SURVEY AND SELECTION OF TOOLS
  5. Survey of tools: Brief description and goals
       • Produce a survey of existing tools, ground truth data and lexicon data for digitisation
       • Select candidate tools for implementation at cultural heritage institutions
  6. Survey of tools: Methodology used to achieve the objectives
       1. Taxonomy for categorisation based on a simplified digitisation workflow
       2. Definition of attributes, e.g. how a tool can be used in the digitisation process
       3. Online spreadsheet to collect and organise tools
       4. Assessment and further selection into a shortlist of tools
  7. Selection of tools
       • First selection: knock-out criteria (three steps)
       • Further selection: expertise of the partners
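The slides do not list the knock-out criteria themselves, so the following is only a minimal sketch of how a stepwise knock-out selection over a categorised tool list could work. The `Tool` record, the attribute names (`available`, `documented`, `maintained`) and the three checks are hypothetical placeholders, not the criteria used by Succeed.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    category: str  # position in a simplified digitisation workflow (hypothetical labels)
    attributes: dict = field(default_factory=dict)


# Hypothetical knock-out checks, applied in order. The real Succeed criteria
# are not given in the slides; these three only illustrate the mechanism.
KNOCK_OUT_STEPS = [
    ("available", lambda t: t.attributes.get("available", False)),
    ("documented", lambda t: t.attributes.get("documented", False)),
    ("maintained", lambda t: t.attributes.get("maintained", False)),
]


def knock_out(tools):
    """Return (shortlist, rejected); rejected maps tool name to the first failed step."""
    shortlist, rejected = [], {}
    for tool in tools:
        failed = next(
            (step for step, check in KNOCK_OUT_STEPS if not check(tool)), None
        )
        if failed is None:
            shortlist.append(tool)
        else:
            rejected[tool.name] = failed
    return shortlist, rejected


# Made-up example records, not entries from the Succeed tool database.
tools = [
    Tool("tool-a", "ocr", {"available": True, "documented": True, "maintained": True}),
    Tool("tool-b", "image-enhancement", {"available": True, "documented": False}),
    Tool("tool-c", "ocr", {"available": False}),
]
shortlist, rejected = knock_out(tools)
```

Recording the first failed step for each rejected tool keeps the selection traceable, which helps when the remaining candidates are later reviewed by partners with the relevant expertise.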
  8. Task 1: Survey of tools. Summary of outcomes
       • Categorised list of 213 research and commercial tools, available in an online database and frequently updated
       • Shortlist of the most relevant tools, based on a quality assessment
       • Overview of existing ground truth material and lexicon data
     http://impact.dlsi.ua.es/digitisation/tools-resources/tools-for-text-digitisation/
  9. 2. VALIDATION PARAMETERS
  10. 1st Project Review – WP3 Validation parameters: Brief description and goals
        • Define validation parameters and procedures for the implementation of tools in productive environments (per task carried out by using a tool)
        • Validate each tool (or group of tools) based on these criteria
        • Work out evaluation work plans and test scenarios in cooperation with libraries, based on their requirements
  11. Validation parameters: Methodology used to achieve the objectives
        1. Definition of the evaluation template structure
        2. Tool selection by libraries
        3. Creation and compilation of evaluation material: separate evaluation forms per task/tool type and a common usability evaluation form
        4. Distribution of evaluation material to participating libraries
  12. 1st Project Review – WP3 Validation parameters: Summary of outcomes
        • Described evaluation procedures and produced 9 evaluation forms per task
        • Worked out evaluation and test scenarios as a “work plan” together with the participating libraries
        • Blueprint for take-up and validation
  13. 3. TAKE-UP SUPPORT
  14. Take-up support: Brief description and goals
        • Support the integration, take-up and validation of digitisation tools and resources
        • Tool implementation at four participant libraries and nine external libraries (of 16 potential external libraries at the start of the project, 9 were retained)
        • Assistance for the adaptation/application of the tools to specific domains and/or languages
  15. Take-up support: Methodology used to achieve the objectives
        1. Each library installs, on average, two tools and tests their performance and usability in a productive environment according to the predefined validation criteria
        2. Some consortium libraries will test existing linguistic resources for enhancement of textual information retrieval
        3. The technical partners (IAIS, INL, PSNC, UA) will provide online assistance for the adaptation of the tools to specific domains and languages
        4. The technical partners will report on the results based on the information provided by the libraries
  16. Take-up support
      External libraries, listed as library (country): selected tools
        • Wielkopolska Biblioteka Cyfrowa (Poland): Scan Tailor, JHOVE2, ImageMagick
        • General Historical Library of Salamanca (Spain): GIMP, OmniPage
        • Wroclaw University Library (Poland): Scan Tailor, Tesseract OCR
        • University Library of Bratislava (Slovak Republic): Scan Tailor, ImageMagick
        • National Library of Finland (Finland): Newspaper segmentation, Korrektor, Document Deskewer
        • Library of the University of Granada (Spain): Scan Tailor, Alchemy API
        • University Library of Leuven (Belgium): Abbyy FRE, NERT
        • University Library of Antwerp (Belgium): NE Attestation tool, NLTK (NE), Stanford (NE)
        • University Library of Darmstadt (Germany): Newspaper segmentation, Korrektor, Document Deskewer
      Internal libraries, listed as library (country): selected tools
        • Biblioteca Virtual Miguel de Cervantes (Spain): Abbyy FRE, Geometric correction: Page Curl, COBaLT, Lexicon as Webservice
        • Bibliothèque nationale de France (France): DBpedia Spotlight, Evaluation Tool for OCR, Lexicon as Webservice
        • Koninklijke Bibliotheek (Netherlands): Lexicon as Webservice, NLTK, NERT
        • The British Library (United Kingdom): Evaluation Tool for OCR, Stanford (NE), Lexicon as Webservice
      Summary of outcomes
        • Involved 9 external libraries in the project to perform tool evaluation, each committed to evaluating at least 2 tools
        • Collected the libraries’ digitisation requirements
        • Consulted libraries in defining interesting use cases for evaluation
        • Provided remote assistance for the take-up of tools selected by the libraries
  17. Take-up support
        • Remote assistance for technical support: assistance for the integration and adaptation of the tools to specific domains, languages and use cases
        • Implementation studies (final report): elaboration of a blueprint on the validation and take-up process for tools and resources; case studies produced from the implementation experiences
  18. 4. TRAINING
  19. Training: Brief description and goals
        • Produce documentation and training material for the tools to be validated; the material must help the participating libraries take up the tools in their productive environment
        • Provide training on specific tools to external stakeholders
        • Organise on-site training workshops depending on the libraries’ requirements
  20. Training: Methodology used to achieve the objectives
        1. Document structure of training material
        2. Tool selection by libraries
        3. Distribution of work among WP3 partners according to their expertise with and knowledge of the selected tools
        4. Creation and compilation of training material
        5. Distribution of training material to participating libraries
  21. Training: Summary of outcomes
        • Prepared training materials for 19 tools (separate document, online SCORM + DigitWiki)
        • Organized a TPDL tutorial attracting experts from digital libraries from around the world
        • Participation in hackathons
  22. 5. CONCLUSIONS
  23. Conclusions
      Evaluation work of each participating library: see the presentations!
  24. Conclusions: Blueprint for evaluation
      General recommendations for evaluation by libraries:
        a. Translate requirements into a detailed use case (including a detailed description of data and data format)
        b. Acquire or produce test data
        c. Determine tools
        d. Produce a work plan
        e. Verify the use case with internal and external experts (tool providers, CoC)
             • If no test data can be produced, adapt the use case
             • If the plan breaks down into too many steps, adapt the use case
             • If necessary, change the tool selection
        f. Document the evaluation (evaluation forms)
        g. Use experienced technical staff
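The verification step (e) of the blueprint is essentially a small decision procedure, which can be sketched as follows. This is an illustration only: the `UseCase` fields, the function name and the `MAX_PLAN_STEPS` threshold are hypothetical, since the slides do not say how many steps count as "too many".

```python
from dataclasses import dataclass

# Hypothetical threshold; the slides give no number for "too many steps".
MAX_PLAN_STEPS = 10


@dataclass
class UseCase:
    description: str
    has_test_data: bool  # could test data be acquired or produced?
    plan_steps: int      # number of steps in the work plan
    tools_fit: bool      # do the selected tools fit the use case?


def verify_use_case(use_case, max_steps=MAX_PLAN_STEPS):
    """Return the follow-up actions suggested by step (e); empty if none are needed."""
    actions = []
    if not use_case.has_test_data:
        actions.append("adapt use case: no test data can be produced")
    if use_case.plan_steps > max_steps:
        actions.append("adapt use case: work plan has too many steps")
    if not use_case.tools_fit:
        actions.append("change tool selection")
    return actions


# Made-up use cases for illustration.
ready = verify_use_case(UseCase("OCR evaluation on newspapers", True, 4, True))
blocked = verify_use_case(UseCase("Named entity tagging", False, 12, False))
```

An empty result means the use case can proceed to documentation of the evaluation; any returned action sends the library back to an earlier step of the blueprint.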
  25. Conclusions: Blueprint for evaluation
      General recommendations for tool providers:
        a. Provide a clear description of the purpose of the tool
        b. Provide a clear description of the formats the tool can handle
        c. Provide a clear description of the type of material the tool can handle with reasonable results; provide information on performance where possible
        d. Provide a clear step-by-step description of the complete procedure that should be followed to get the best possible result, including training and tuning of parameters
        e. Provide compact documentation if possible
        f. Minimize interdependency of parts of the documentation
  26. Thank you!
