Metadata Harvesting and Validation

Bram Vandeputte
K.U.Leuven

slideshare
• http://www.slideshare.net/bramvandeputte
Metadata Harvesting And Validation

  1. Metadata Harvesting and Validation
     Bram Vandeputte, K.U.Leuven
  2. slideshare
     • http://www.slideshare.net/bramvandeputte
  3. Overview
     • Validation Service
     • Online Validation Service
     • OAI-PMH
     • Harvesting Infrastructure
  4. Validation Service
     • Interoperability: Application Profile (AP)
     • Manual check: very time-consuming
     • Need a tool for enforcing an AP => validation scheme
     • A set of validation rules: best practices derived from previous projects such as MELT and MACE
     • Reusable & extendable: modular, inheritance possible
  5. Validation Service
     • Components:
       • XML schema: structure
       • Schematron:
         • mandatory/conditional elements
         • empty fields
         • vocabularies (auto-generated)
         • ...
       • vCard component
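
The mandatory-element and empty-field checks listed above can be illustrated with a minimal sketch. It is not the ASPECT validation service: the slides name Schematron for these rules, whereas this sketch uses plain Python, and both the rule list `MANDATORY_PATHS` and the un-namespaced sample record are assumptions made for illustration (real LOM records are namespaced).

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set: element paths that must be present in every record.
MANDATORY_PATHS = ["general/title", "general/identifier"]


def check_record(xml_text):
    """Return a list of human-readable error messages for one record."""
    root = ET.fromstring(xml_text)
    errors = []
    # Mandatory-element check: each listed path must match at least one element.
    for path in MANDATORY_PATHS:
        if root.find(path) is None:
            errors.append(f"missing mandatory element: {path}")
    # Empty-field check: a leaf element must carry some non-whitespace text.
    for elem in root.iter():
        if len(elem) == 0 and not (elem.text or "").strip():
            errors.append(f"empty field: {elem.tag}")
    return errors


# Simplified sample record (assumption: no namespaces, not a complete LOM instance).
record = "<lom><general><title></title></general></lom>"
print(check_record(record))
```

Each check returns messages instead of raising, so several components can run over the same record and their findings can be collected in one report.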
  6. Validation Service
     • Terminology:
       • Validation Component: atomic block which does specific validation checking
       • Validation Scheme: collection of components that ensures validity against a whole AP
       • Validation Scheme URI: unique identifier of a scheme
         • http://aspect-project.org/validation/ASPECTv1.0/core
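
One way to picture the terminology above is a scheme object that is identified by a URI, holds a list of components, and may extend another scheme. This is only a sketch under assumptions: the class `ValidationScheme`, the two toy components, the base scheme URI and the assignment of components to schemes are all hypothetical. Only the core scheme URI comes from the slide.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A component is modelled as a function: record text -> list of error messages.
Component = Callable[[str], List[str]]


@dataclass
class ValidationScheme:
    uri: str                                      # unique identifier of the scheme
    components: List[Component] = field(default_factory=list)
    parent: Optional["ValidationScheme"] = None   # scheme this one extends

    def all_components(self) -> List[Component]:
        """Inherited components first, then the scheme's own."""
        inherited = self.parent.all_components() if self.parent else []
        return inherited + self.components

    def validate(self, record: str) -> List[str]:
        errors: List[str] = []
        for component in self.all_components():
            errors.extend(component(record))
        return errors


def not_empty(record: str) -> List[str]:
    # Hypothetical component: the record must contain something.
    return [] if record.strip() else ["record is empty"]


def has_title(record: str) -> List[str]:
    # Hypothetical component: a crude textual check for a title element.
    return [] if "<title>" in record else ["no title element"]


# The base URI is made up; which components belong to which scheme is assumed.
base = ValidationScheme("http://example.org/validation/base", [not_empty])
core = ValidationScheme(
    "http://aspect-project.org/validation/ASPECTv1.0/core", [has_title], parent=base
)
print(core.validate("<lom></lom>"))
```

Keeping components as small independent functions is what makes a scheme reusable: a new application profile can extend an existing scheme and add only the rules that differ.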
  7. Validation Service (figure; no text recovered)
  8. Validation Service (diagram of validation schemes and components)
     • Labels: LOM loose, ASPECTv1.0/recommended, ASPECTv1.0/core, IMS ILOX ASPECT, lomloose.xsd, vocabulary bank, recommended schematron rules, core schematron rules, vcard validator, empty attribute fields
     • Legend: uses, extends, validationScheme, validation component
  9. Validation Service (figure; no text recovered)
  10. Online Validation Service demo
  11. Validation to LRE AP: refer to the LRE AP document
  12. OAI-PMH
      • Client-server model
      • Pull mechanism
      • Options:
        • selective harvesting (date and set)
        • incremental harvesting
      • Metadata-agnostic
  13. OAI-PMH
      • Verbs: Identify, ListRecords, GetRecord
      • Parameters:
        • baseUrl
        • from & until date
        • metadataPrefix
        • sets
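
The verbs and parameters above map directly onto an HTTP request and an XML response. The following sketch builds a `ListRecords` URL and reads record identifiers and the resumption token from a response. The base URL `http://example.org/oai`, the prefix value `oai_lom`, the set name and the sample response are assumptions for illustration; no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace of OAI-PMH 2.0 response elements, in ElementTree notation.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix, from_date=None, until_date=None, set_spec=None):
    """Build a ListRecords request URL; optional arguments enable selective harvesting."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    identifiers = [e.text for e in root.iter(OAI_NS + "identifier")]
    token = root.find(".//" + OAI_NS + "resumptionToken")
    has_token = token is not None and (token.text or "").strip()
    return identifiers, (token.text.strip() if has_token else None)


# Hypothetical target and values.
url = list_records_url("http://example.org/oai", "oai_lom",
                       from_date="2010-01-01", set_spec="physics")
print(url)

# Hand-written sample response, trimmed to the parts this sketch reads.
sample = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
    "<record><header><identifier>oai:example.org:1</identifier></header></record>"
    "<resumptionToken>page2</resumptionToken>"
    "</ListRecords></OAI-PMH>"
)
print(parse_list_records(sample))
```

A non-empty resumption token means the server has more records; the client repeats the request with that token until none is returned.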
  14. Harvest Component
      • Multiple targets
      • Each target has separate properties (sets, date granularity, metadataPrefix, ...)
      • Storing metadata (SPI, filesystem, APP, ...)
      • Extra features:
        • Incremental harvesting
        • Harvesting scheduling
        • Metadata validation + reporting
        • OAI-PMH target validation
        • (User-friendly) GUI
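
Per-target properties and incremental harvesting could be modelled roughly as below. This is a sketch, not the ARIADNE harvester's configuration format: the class `HarvestTarget`, its field names and the example target are hypothetical. The two granularity strings are the ones defined by OAI-PMH.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass
class HarvestTarget:
    base_url: str
    metadata_prefix: str
    sets: Tuple[str, ...] = ()
    granularity: str = "YYYY-MM-DD"          # or "YYYY-MM-DDThh:mm:ssZ"
    last_harvest: Optional[datetime] = None  # UTC time of the previous successful run

    def from_param(self) -> Optional[str]:
        """Value for the OAI-PMH 'from' argument, or None for a full harvest."""
        if self.last_harvest is None:
            return None
        if self.granularity == "YYYY-MM-DD":
            return self.last_harvest.strftime("%Y-%m-%d")
        return self.last_harvest.strftime("%Y-%m-%dT%H:%M:%SZ")


# Hypothetical target that was last harvested on 1 March 2010.
target = HarvestTarget(
    base_url="http://example.org/oai",
    metadata_prefix="oai_lom",
    sets=("physics",),
    last_harvest=datetime(2010, 3, 1, 12, 30, tzinfo=timezone.utc),
)
print(target.from_param())
```

Storing the time of the last run per target is what makes the harvest incremental: the next request only asks for records changed since that date, in the granularity the target supports.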
  15. The Harvest Component / ARIADNE Harvester (one diagram, built up step by step across slides 15 to 27)
      • Labels: ASPECT Repository, SPI, SQI, OAI-PMH, External Repository, OAI, LOM, Validation Msg, validation service, harvester log
      • The final build numbers the steps 1 to 6
      • Annotation: invalid records are discarded, or their identifier is recorded for the next harvesting
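
The flow in the diagram (harvest a record, send it to the validation service, store it if valid, otherwise log it) can be sketched as a small loop. The function names, the in-memory stand-ins for the repository and the harvester log, and the toy validator are assumptions. The slide allows either discarding an invalid record or recording its identifier; this sketch records it.

```python
def process_harvest(records, validate):
    """Route harvested (identifier, xml_text) pairs by validation result."""
    stored = {}          # stands in for the repository that receives valid records
    harvester_log = []   # (identifier, error messages) for invalid records
    retry_ids = set()    # identifiers recorded for the next harvesting
    for identifier, xml_text in records:
        errors = validate(xml_text)
        if errors:
            harvester_log.append((identifier, errors))
            retry_ids.add(identifier)
        else:
            stored[identifier] = xml_text
    return stored, harvester_log, retry_ids


def toy_validate(xml_text):
    # Hypothetical validator: a record is valid if it contains a title element.
    return [] if "<title>" in xml_text else ["missing title"]


records = [
    ("oai:example.org:1", "<lom><title>A</title></lom>"),
    ("oai:example.org:2", "<lom/>"),
]
stored, harvester_log, retry_ids = process_harvest(records, toy_validate)
print(sorted(stored), harvester_log, retry_ids)
```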
  28. Harvester: screenshot or live demo
  29. Validation Reports
      • After harvesting, a report is generated and put online
      • The report has 4 "levels":
        • full log (incl. metadata)
        • reporting log
        • grouped errors
        • error summary
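
The two most condensed report levels, grouped errors and the error summary, can be derived from a flat log of (identifier, message) pairs. The log format, the function names and the sample entries below are assumptions for illustration, not the format of the actual reports.

```python
from collections import Counter, defaultdict

# Hypothetical reporting log: one (record identifier, error message) pair per finding.
reporting_log = [
    ("oai:example.org:1", "missing mandatory element: general/title"),
    ("oai:example.org:2", "empty field: description"),
    ("oai:example.org:3", "missing mandatory element: general/title"),
]


def grouped_errors(log):
    """Map each error message to the identifiers of the records that triggered it."""
    groups = defaultdict(list)
    for identifier, message in log:
        groups[message].append(identifier)
    return dict(groups)


def error_summary(log):
    """Return (message, count) pairs, most frequent first."""
    return Counter(message for _, message in log).most_common()


print(grouped_errors(reporting_log))
print(error_summary(reporting_log))
```

The summary shows a repository owner which rule fails most often, while the grouped view points to the affected records.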
  30. Questions?
  31. References
      • SPI: http://ariadne.cs.kuleuven.be/lomi/index.php/SimplePublishingInterface
      • IEEE LOM: http://ltsc.ieee.org/wg12/
      • OAI-PMH: http://www.openarchives.org/
