Metadata Harvesting And Validation
Published in Technology
Transcript

  • 1. Metadata Harvesting and Validation • Bram Vandeputte, K.U.Leuven
  • 2. slideshare • http://www.slideshare.net/bramvandeputte
  • 3. Overview • Validation Service • Online Validation Service • OAI-PMH • Harvesting Infrastructure
  • 4. Validation Service • Interoperability : Application Profile (AP) • Manual check : very time consuming • Need a tool for enforcing an AP => validation scheme • A set of validation rules : best practices derived from previous projects such as MELT and MACE • Reusable & extendable : modular + inheritance possible
  • 5. Validation Service • Components : • XML schema : structure • Schematron : • mandatory/conditional elements • empty fields • vocabularies (auto generated) • ... • vCard component
  • 6. Validation Service • Terminology : • Validation Component : atomic block which does specific validation checking • Validation Scheme : collection of components that ensures validity against a whole AP • Validation Scheme URI : unique identifier of a scheme • http://aspect-project.org/validation/ASPECTv1.0/core
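
The terminology above (atomic components, schemes that collect them, inheritance between schemes) can be sketched roughly as follows. This is a minimal illustration and not the ASPECT validation service itself: the class and function names, the simplified un-namespaced record, the language vocabulary and the `example.org` URI are all assumptions made for the example. Only the `ASPECTv1.0/core` URI comes from the slide.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set

# A validation component: takes a parsed record, returns error messages.
Component = Callable[[ET.Element], List[str]]


def mandatory(path: str) -> Component:
    """Component: the element at `path` must be present."""
    def check(record: ET.Element) -> List[str]:
        found = record.find(path) is not None
        return [] if found else [f"missing mandatory element: {path}"]
    return check


def no_empty_fields(record: ET.Element) -> List[str]:
    """Component: leaf elements must contain non-blank text."""
    return [
        f"empty field: {el.tag}"
        for el in record.iter()
        if len(el) == 0 and not (el.text or "").strip()
    ]


def vocabulary(path: str, allowed: Set[str]) -> Component:
    """Component: values at `path` must come from a controlled vocabulary."""
    def check(record: ET.Element) -> List[str]:
        return [
            f"value {el.text!r} not in vocabulary for {path}"
            for el in record.findall(path)
            if (el.text or "").strip() not in allowed
        ]
    return check


@dataclass
class ValidationScheme:
    """A scheme: a URI plus components, optionally extending another scheme."""
    uri: str
    components: List[Component] = field(default_factory=list)
    extends: Optional["ValidationScheme"] = None

    def validate(self, record: ET.Element) -> List[str]:
        # Inherited checks run first, then this scheme's own components.
        errors = self.extends.validate(record) if self.extends else []
        for component in self.components:
            errors.extend(component(record))
        return errors


core = ValidationScheme(
    uri="http://aspect-project.org/validation/ASPECTv1.0/core",
    components=[mandatory("general/title"), no_empty_fields],
)
recommended = ValidationScheme(
    uri="http://example.org/validation/recommended",  # hypothetical URI
    components=[vocabulary("general/language", {"en", "nl", "fr"})],
    extends=core,
)

# Simplified record without namespaces (real LOM XML is namespaced).
record = ET.fromstring(
    "<lom><general><title></title><language>xx</language></general></lom>"
)
errors = recommended.validate(record)
```

Modelling a component as a plain function keeps each check atomic, and `extends` lets a stricter scheme reuse everything a base scheme already enforces.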
  • 7. Validation Service
  • 8. Validation Service [diagram; validation schemes : ASPECTv1.0/core, ASPECTv1.0/recommended, LOM loose, IMS ILOX, ASPECT; validation components : lomloose.xsd, vocabulary bank, core schematron rules, recommended schematron rules, vcard validator, empty attribute fields; legend : uses, extends, validationScheme, validation component]
  • 9. Validation Service
  • 10. Online Validation Service demo
  • 11. Validation against the LRE AP : refer to the LRE AP document
  • 12. OAI-PMH • Client-server model • Pull mechanism • Options : • selective harvesting (date and set) • incremental harvesting • Metadata-agnostic
  • 13. OAI-PMH • Verbs : Identify, ListRecords, GetRecord • Parameters : • baseUrl • from & until date • metadataPrefix • sets
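
As a rough illustration of how the verbs and parameters above combine, an OAI-PMH request is an HTTP query against the base URL. The helper name `oai_request_url`, the `example.org` endpoint, the `oai_lom` prefix and the `physics` set are assumptions chosen for the example; the argument names (`verb`, `metadataPrefix`, `from`, `until`, `set`, `identifier`) are those defined by the protocol.

```python
from typing import Optional
from urllib.parse import urlencode


def oai_request_url(
    base_url: str,
    verb: str,
    metadata_prefix: Optional[str] = None,
    from_date: Optional[str] = None,
    until_date: Optional[str] = None,
    set_spec: Optional[str] = None,
    identifier: Optional[str] = None,
) -> str:
    """Build an OAI-PMH request URL, leaving out arguments that are not set."""
    optional = {
        "metadataPrefix": metadata_prefix,
        "from": from_date,
        "until": until_date,
        "set": set_spec,
        "identifier": identifier,
    }
    params = {"verb": verb}
    params.update({k: v for k, v in optional.items() if v is not None})
    return f"{base_url}?{urlencode(params)}"


# Selective harvesting: restrict by date and by set.
list_url = oai_request_url(
    "http://example.org/oai",  # hypothetical endpoint
    "ListRecords",
    metadata_prefix="oai_lom",
    from_date="2010-01-01",
    set_spec="physics",
)
identify_url = oai_request_url("http://example.org/oai", "Identify")
```

Because the protocol is metadata-agnostic, only `metadataPrefix` changes when a target serves a different format.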
  • 14. Harvest Component • Multiple targets • Each target has separate properties (sets, date granularity, metadataPrefix, ...) • Storing metadata (SPI, Filesystem, APP, ...) • Extra features : • Incremental harvesting • Harvesting scheduling • Metadata validation + reporting • OAI-PMH target validation • (User friendly) GUI
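
Per-target properties and incremental harvesting could be modelled along these lines. This is a sketch under assumptions: the `Target` class and its field names are invented for illustration and do not describe the ARIADNE harvester's actual configuration. The two date formats are the day and seconds granularities that OAI-PMH allows.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Target:
    """One harvesting target with its own properties (hypothetical model)."""
    base_url: str
    metadata_prefix: str
    granularity: str = "YYYY-MM-DD"  # or "YYYY-MM-DDThh:mm:ssZ"
    set_spec: Optional[str] = None
    last_harvest: Optional[datetime] = None  # timezone-aware

    def from_argument(self) -> Optional[str]:
        """Value for the `from` argument of the next incremental harvest."""
        if self.last_harvest is None:
            return None  # first run: harvest everything
        utc = self.last_harvest.astimezone(timezone.utc)
        if self.granularity == "YYYY-MM-DD":
            return utc.strftime("%Y-%m-%d")
        return utc.strftime("%Y-%m-%dT%H:%M:%SZ")


last_run = datetime(2010, 3, 5, 14, 30, tzinfo=timezone.utc)
by_day = Target("http://example.org/oai", "oai_lom", last_harvest=last_run)
by_second = Target(
    "http://example.org/oai",
    "oai_lom",
    granularity="YYYY-MM-DDThh:mm:ssZ",
    last_harvest=last_run,
)
never_harvested = Target("http://example.org/oai", "oai_lom")
```

Storing the last successful harvest time per target is what makes the next run incremental: only records changed since that time are requested.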
  • 15. The Harvest component (slides 15 to 27 build up a single diagram step by step) [diagram : ARIADNE Harvester; External Repository with OAI target, reached over OAI-PMH; validation service; harvester log; ASPECT Repository with OAI-PMH, SPI and SQI interfaces; LOM records and validation messages pass between them in six numbered steps] • invalid : discarded or identifier recorded for next harvesting
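
The flow in this diagram (harvest a record, validate it, publish it if valid, otherwise discard it or remember its identifier for the next harvest) can be approximated as below. It is only a sketch: `harvest`, `HarvestResult` and the stand-in `validate` and `publish` callables are hypothetical, and in the real setup validation goes through the validation service and publishing through SPI.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

Record = Tuple[str, str]  # (identifier, LOM XML)


@dataclass
class HarvestResult:
    published: List[str] = field(default_factory=list)
    retry_next_time: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


def harvest(
    records: Iterable[Record],
    validate: Callable[[str], List[str]],
    publish: Callable[[str, str], None],
    keep_invalid_ids: bool = True,
) -> HarvestResult:
    """Validate each harvested record; publish valid ones, log the rest."""
    result = HarvestResult()
    for identifier, xml in records:
        errors = validate(xml)
        if errors:
            result.log.append(f"{identifier}: INVALID: {'; '.join(errors)}")
            if keep_invalid_ids:
                # Remember the identifier so the next harvest can retry it.
                result.retry_next_time.append(identifier)
            continue  # invalid records are never published
        publish(identifier, xml)
        result.published.append(identifier)
        result.log.append(f"{identifier}: published")
    return result


store: Dict[str, str] = {}  # stand-in for the target repository
result = harvest(
    records=[
        ("oai:example:1", "<lom><title>A</title></lom>"),
        ("oai:example:2", "<lom/>"),
    ],
    validate=lambda xml: [] if "<title>" in xml else ["missing title"],
    publish=store.__setitem__,
)
```

Passing `validate` and `publish` in as callables keeps the loop independent of how validation and storage are actually implemented.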
  • 28. Harvester • Screenshot or live demo
  • 29. Validation Reports • After harvesting -> report generated and put online • report has 4 “levels” : • full log (incl. metadata) • reporting log • Grouped Errors • Error Summary
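
The slides do not say what each report level contains, so the following is only one plausible reading of the "Grouped Errors" and "Error Summary" levels: the first lists the affected records per error message, the second counts occurrences. The function names and the sample log entries are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

LogEntry = Tuple[str, str]  # (record identifier, error message)


def grouped_errors(log: List[LogEntry]) -> Dict[str, List[str]]:
    """Group record identifiers by the error message they triggered."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for identifier, message in log:
        groups[message].append(identifier)
    return dict(groups)


def error_summary(log: List[LogEntry]) -> Counter:
    """Count how often each error message occurs."""
    return Counter(message for _, message in log)


reporting_log = [
    ("oai:example:1", "missing title"),
    ("oai:example:2", "empty field: language"),
    ("oai:example:3", "missing title"),
]
groups = grouped_errors(reporting_log)
summary = error_summary(reporting_log)
```

A summary like this lets a repository owner see the most frequent problem first, then drill down to the affected records.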
  • 30. Questions ?
  • 31. References • SPI : http://ariadne.cs.kuleuven.be/lomi/index.php/SimplePublishingInterface • IEEE LOM : http://ltsc.ieee.org/wg12/ • OAI-PMH : http://www.openarchives.org/