
Metadata Harvesting And Validation

Published in: Technology

  1. Metadata Harvesting and Validation. Bram Vandeputte, K.U.Leuven
  2. slideshare • http://www.slideshare.net/bramvandeputte
  3. Overview • Validation Service • Online Validation Service • OAI-PMH • Harvesting Infrastructure
  4. Validation Service • Interoperability : Application Profile (AP) • Manual check : very time consuming • Need a tool for enforcing an AP => validation scheme • A set of validation rules : best practices derived from previous projects such as MELT and MACE • Reusable & extendable : modular + inheritance possible
  5. Validation Service • Components : • XML schema : structure • Schematron : mandatory/conditional elements, empty fields, vocabularies (auto-generated), ... • vCard component
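The three kinds of Schematron check named on this slide (mandatory elements, empty fields, vocabularies) can be illustrated in plain Python. This is only a minimal sketch: it uses `xml.etree.ElementTree` as a stand-in for Schematron, and the record layout, the `MANDATORY` list and the `VOCABULARY` table are hypothetical, not taken from the ASPECT application profile.

```python
import xml.etree.ElementTree as ET

# Hypothetical record and rules, for illustration only (not the ASPECT AP).
RECORD = """
<record>
  <title>Photosynthesis basics</title>
  <language></language>
  <type>video</type>
</record>
"""
MANDATORY = ["title", "language", "identifier"]
VOCABULARY = {"type": {"text", "image", "simulation"}}


def check_record(xml_text):
    """Return a list of human-readable error strings for one record."""
    root = ET.fromstring(xml_text.strip())
    errors = []
    # Mandatory elements must be present.
    for name in MANDATORY:
        if root.find(name) is None:
            errors.append(f"missing mandatory element: {name}")
    # A leaf element that is present must not be empty.
    for elem in root.iter():
        if elem is not root and len(elem) == 0 and not (elem.text or "").strip():
            errors.append(f"empty field: {elem.tag}")
    # Controlled fields must use a term from their vocabulary.
    for name, allowed in VOCABULARY.items():
        for elem in root.findall(name):
            value = (elem.text or "").strip()
            if value and value not in allowed:
                errors.append(f"value not in vocabulary for {name}: {value}")
    return errors


errors = check_record(RECORD)
for error in errors:
    print(error)
```

Each check appends to one shared error list, so a single pass reports every problem in a record rather than stopping at the first.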
  6. Validation Service • Terminology : • Validation Component : atomic block which does specific validation checking • Validation Scheme : collection of components that ensures validity against a whole AP • Validation Scheme URI : unique identifier of a scheme • http://aspect-project.org/validation/ASPECTv1.0/core
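The terminology above (atomic components, schemes that collect them, a URI per scheme, and the "inheritance possible" point from the earlier slide) could be modelled roughly as follows. This is a sketch under assumptions: the `ValidationScheme` class, the `require` helper and the dictionary record format are hypothetical, and the `recommended` URI is formed by analogy with the `core` URI shown on the slide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A component is modelled as a function: record -> list of error messages.
Component = Callable[[Dict[str, str]], List[str]]


@dataclass
class ValidationScheme:
    uri: str
    components: List[Component] = field(default_factory=list)
    parent: Optional["ValidationScheme"] = None  # scheme being extended

    def validate(self, record: Dict[str, str]) -> List[str]:
        # Run the inherited components first, then this scheme's own.
        errors = self.parent.validate(record) if self.parent else []
        for component in self.components:
            errors.extend(component(record))
        return errors


def require(name: str) -> Component:
    """Build a component that requires a non-empty field (hypothetical rule)."""
    def check(record: Dict[str, str]) -> List[str]:
        return [] if record.get(name, "").strip() else [f"missing: {name}"]
    return check


core = ValidationScheme(
    uri="http://aspect-project.org/validation/ASPECTv1.0/core",
    components=[require("title")],
)
# Assumed URI, formed by analogy with the core one.
recommended = ValidationScheme(
    uri="http://aspect-project.org/validation/ASPECTv1.0/recommended",
    components=[require("description")],
    parent=core,
)

print(recommended.validate({"title": "Cells"}))
```

Because `recommended` extends `core`, a record that fails a core rule also fails the recommended scheme, without the core rules being copied.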
  7. Validation Service
  8. Validation Service [diagram of validation schemes and the validation components they use. Schemes: LOM loose, ASPECTv1.0/core, ASPECTv1.0/recommended, IMS ILOX ASPECT. Components: lomloose.xsd, core schematron rules, recommended schematron rules, vocabulary bank, empty attribute fields, vcard validator. Legend: uses, extends, validationScheme, validation component]
  9. Validation Service
  10. Online Validation Service demo
  11. Validation against the LRE AP : refer to the LRE AP document
  12. OAI-PMH • Client-server model • Pull mechanism • Options : • selective harvesting (date and set) • incremental harvesting • Metadata-agnostic
  13. OAI-PMH • Verbs : Identify, ListRecords, GetRecord • Parameters : • baseUrl • from & until date • metadataPrefix • sets
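The verbs and parameters listed above combine into a plain HTTP request. The sketch below builds a `ListRecords` URL for selective harvesting using the standard OAI-PMH argument names (`verb`, `metadataPrefix`, `from`, `until`, `set`). The endpoint, the `oai_lom` prefix and the set name are placeholders, and `list_records_url` is a hypothetical helper, not part of any harvester described here.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None,
                     until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for selective harvesting."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # Optional arguments are only sent when a value was given.
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, prefix and set; substitute the target's real values.
url = list_records_url("http://example.org/oai", "oai_lom",
                       from_date="2009-01-01", set_spec="physics")
print(url)
```

Large result sets are paged by the server with a `resumptionToken`, which this sketch does not handle.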
  14. Harvest Component • Multiple targets • Each target has separate properties (sets, date granularity, metadataPrefix, ...) • Storing metadata (SPI, Filesystem, APP, ...) • Extra features : • Incremental harvesting • Harvesting scheduling • Metadata validation + reporting • OAI-PMH Target validation • (User Friendly) GUI
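Per-target properties and incremental harvesting can be sketched as a small record that remembers when a target was last harvested and derives the next `from` value in that target's date granularity. The `HarvestTarget` class and its field names are hypothetical; the two granularity strings are the ones defined by OAI-PMH.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class HarvestTarget:
    base_url: str
    metadata_prefix: str
    set_spec: Optional[str] = None
    granularity: str = "YYYY-MM-DD"          # or "YYYY-MM-DDThh:mm:ssZ"
    last_harvest: Optional[datetime] = None  # None means never harvested

    def next_from(self) -> Optional[str]:
        """Return the 'from' value for the next incremental harvest."""
        if self.last_harvest is None:
            return None  # first run: full harvest, no lower bound
        if self.granularity == "YYYY-MM-DD":
            return self.last_harvest.strftime("%Y-%m-%d")
        return self.last_harvest.strftime("%Y-%m-%dT%H:%M:%SZ")


# Placeholder target; the timestamp stands for the previous successful run.
target = HarvestTarget(
    "http://example.org/oai", "oai_lom",
    last_harvest=datetime(2009, 3, 1, 12, 0, tzinfo=timezone.utc),
)
print(target.next_from())
```

Keeping the granularity on the target matters because a repository that only supports day granularity will reject a timestamp with a time part.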
  15. The Harvest component [diagram, animated across slides 15 to 27. Labels: ARIADNE Harvester; External Repository (OAI); OAI-PMH; LOM records; validation service; Validation Msg; harvester log; ASPECT Repository (OAI-PMH, SPI, SQI). The final build numbers the steps 1 to 6.] • invalid : discarded or identifier recorded for next harvesting
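The rule stated on these slides (invalid records are discarded or their identifier is recorded for next harvesting) amounts to a routing step after validation. A minimal sketch, assuming hypothetical `validate` and `publish` callables that stand in for the validation service and the repository's publishing interface:

```python
def process_harvest(records, validate, publish):
    """Validate each harvested record; publish valid ones, log the rest."""
    log = []    # one (identifier, errors) entry per record
    retry = []  # identifiers of invalid records, kept for next harvesting
    for identifier, metadata in records:
        errors = validate(metadata)
        log.append((identifier, errors))
        if errors:
            retry.append(identifier)
        else:
            publish(identifier, metadata)
    return log, retry


# Stand-in data and callables, for illustration only.
published = []
records = [("oai:ex:1", {"title": "Cells"}), ("oai:ex:2", {"title": ""})]
log, retry = process_harvest(
    records,
    validate=lambda m: [] if m["title"] else ["empty field: title"],
    publish=lambda identifier, m: published.append(identifier),
)
print(retry)
```

Passing the validator and the publisher in as arguments keeps the routing logic independent of how records are checked or stored.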
  28. Harvester : screenshot or live demo
  29. Validation Reports • After harvesting, a report is generated and put online • The report has 4 "levels" : • full log (incl. metadata) • reporting log • grouped errors • error summary
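The two most condensed report levels, grouped errors and the error summary, can both be derived from the reporting log. The sketch below assumes a hypothetical log format of (record identifier, error message) pairs; the actual report format is not shown in the slides.

```python
from collections import Counter, defaultdict

# Hypothetical reporting log: (record identifier, error message) pairs.
reporting_log = [
    ("oai:ex:1", "missing mandatory element: identifier"),
    ("oai:ex:2", "empty field: language"),
    ("oai:ex:3", "missing mandatory element: identifier"),
]

# Grouped errors: error message -> identifiers of the records that raised it.
grouped = defaultdict(list)
for identifier, message in reporting_log:
    grouped[message].append(identifier)

# Error summary: error message -> number of occurrences.
summary = Counter(message for _, message in reporting_log)

for message, count in summary.most_common():
    print(count, message)
```

The summary shows at a glance which rule a repository breaks most often, and the grouped view then lists the affected records.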
  30. Questions?
  31. References • SPI : http://ariadne.cs.kuleuven.be/lomi/index.php/SimplePublishingInterface • IEEE LOM : http://ltsc.ieee.org/wg12/ • OAI-PMH : http://www.openarchives.org/
