Driver Guidelines and Repository Interoperability

On 2008-11-15 Maurice Vanderfeesten gave a presentation in Baltimore at the SPARC OpenAccess conference.
This presentation explains the need for interoperability among repository systems. DRIVER provides guidelines on how to expose metadata via OAI-PMH in an internationally compliant way.

  • Driver Guidelines and Repository Interoperability

    1. Fasten … Seatbelt. Maurice Vanderfeesten - SURFfoundation (NL). 15 November 2008 – Baltimore – DRIVER meeting
    2. Fasten: Excel in Scholarly communication
    3. Seatbelt: Get to the finish line safely
    4. Innovation towards the intelligent web [Figure by Radar Networks / TWINE: amount of data versus productivity of search. Eras: PC Era (The Desktop, 1980 - 1990), Web 1.0 (The World Wide Web, 1990 - 2000), Web 2.0 (The Social Web, 2000 - 2010), Web 3.0 (The Semantic Web, 2010 - 2020), Web 4.0 (The Intelligent Web, 2020 - 2030). Labels along the curve: Files & Folders, Databases, Directories, Keyword search, Tagging, Natural language search, Semantic Search, Reasoning]
    5. Work together: Respect some rules
    6. One goal: “Reliable Content Provision”. Global Digital Repository Infrastructure
    7. Reality: Efforts to interpret and normalize data. Widespread metadata standards: Unqualified Dublin Core & OAI-PMH. Problem: interpreting semantics; standard specifications are not enough. Example: electronic theses need context-specific descriptions for date, type, roles & language. TICER 2008, Tilburg
    8. Effort interpreting dates - Trouble automatically interpreting semantics, ex. [date]
        Cranfield:
        <dc:contributor>Partington, David(supervisor)</dc:contributor>
        <dc:creator>Lupson, Jonathan</dc:creator>
        <dc:date>2007-06-06T18:17:13Z</dc:date> (Publication?)
        <dc:date>2007-06-06T18:17:13Z</dc:date> (Graduation?)
        <dc:date>2007-02</dc:date> (Start?)
        <dc:identifier>http://hdl.handle.net/1826/1729</dc:identifier>
        <dc:description>
        Humboldt:
        <dc:date>2007-06-07</dc:date> (Graduation)
        <dc:date>2007-03-06</dc:date> (Publication)
        <dc:date>2003-02</dc:date> (Start)
        Recommendation: in Unqualified Dublin Core use one date field that represents the Publication date!
    9. Effort interpreting types - Trouble automatically interpreting semantics, ex. [type]
        Cranfield:
        <dc:type>Thesis or dissertation</dc:type>
        <dc:type>Doctoral</dc:type>
        <dc:type>PhD</dc:type>
        DIVA:
        <dc:type>text.thesis.doctoral</dc:type>
        Humboldt:
        <dc:type>Text</dc:type>
        <dc:type>dissertation</dc:type>
        Recommendation: use the following qualifications (Bologna Convention): info:eu-repo/semantics/bachelorThesis, info:eu-repo/semantics/masterThesis, info:eu-repo/semantics/doctoralThesis
    10. Effort interpreting roles. Electronic theses need context-specific descriptions
        <dc:contributor>Partington, David(supervisor)</dc:contributor>
        <dc:creator>Lupson, Jonathan</dc:creator>
        <dc:date>2007-06-06T18:17:13Z</dc:date>
        <dc:date>2007-06-06T18:17:13Z</dc:date>
        <dc:date>2007-02</dc:date>
        <dc:identifier>http://hdl.handle.net/1826/1729</dc:identifier>
        <dc:description>
        Recommendation: use the contributor field in Dublin Core only for the person who supervised the Doctoral thesis project.
    11. Effort interpreting languages. Personal notation flavour of a language:
        <dc:language>Nederlands</dc:language>
        <dc:language>ned</dc:language>
        <dc:language>nl</dc:language>
        <dc:language>nld/dut</dc:language>
        <dc:language>en_UK</dc:language>
        <dc:language>mn</dc:language>
        Recommendation: use ISO639-3 as a standard way of writing down a language in a repository
    12. Number of repositories increases. DRIVER: Collection of Quality Metadata for OpenAccess Material
    13. All service providers must build adaptors for every single repository
    14. Interoperability shares workload
    15. One goal: “Reliable Content Provision”. Global Digital Repository Infrastructure
    16. Reliability: Broken Links Issue [Diagram labels: Repository, URL]
    17. Reliability: Link resolvers [Diagram labels: OAI-PMH, ID, Global Repository Resolver, URL, ID + URL Updates] • Use IDs for citation reference • Obligation to update • Technology independent (future proof)
    18. Standards, Agreements, Rules: Interoperability guidelines
    19. Towards web-reasoning: data efficiency & interoperability levels. By: Andreas Tolk et al., "Composable M&S Web Services for Net-centric Applications," Journal for Defense Modeling & Simulation (JDMS), Volume 3 Number 1, pp. 27-44, January 2006
    20. Interoperability leads towards improved retrieval and recall [Figure repeated from slide 4, by Radar Networks / TWINE: amount of data versus productivity of search, from the PC Era (1980 - 1990) to Web 4.0 (2020 - 2030)]
    21. We have: Tools for Syntactic & Semantic Interoperability - Guidelines for content providers, exposing textual resources with OAI-PMH - Validator, checking the rate of compliance to the “Guidelines for content providers”
    22. Guidelines 2.0 - Build on knowledge from past & current IR projects (EU) - 26 actively involved contributors (experts and repository managers) from 8 countries - Practical answers for IRs on how to: - Improve full-text access - Standardize metadata quality - Create a reliable infrastructure for permanent identification, resolution, traceability and storage - Resolve semantic and classification issues
    23. Guidelines 2.0 - Chapters: 1. Use of OAI-PMH 2. Use of Metadata OAI_DC 3. Use of Best Practices for OAI_DC 4. Use of Compound Object Wrapping 5. Use of Vocabularies and Semantics 6. Use of Quality labels (Long Term Preservation) 7. Use of Persistent Identifiers 8. Use of Usage Statistics Exchange 9. Use of Intellectual Property Rights (IPR)
    24. Validator
    25. Validator - Deep validation - Experimental tool - Self-test for Repository Managers - Embedded in DRIVER registration process - Detects interoperability issues - Provides explanation per interoperability issue - Points to exact location of the issue for easy debugging - Offers recommendations on how to correctly modify your repository to interoperable standards - Creates a report for future reference - Provides a weighted score for balanced effort - Score influences the result list
    26. Looking back on what we have: - Guidelines for content providers, exposing textual resources with OAI-PMH - Validator, checking the rate of compliance to the “Guidelines for content providers”
    27. What is missing? Guidelines
    28. Trias Politica Model: Legislative
    29. We DON’T have - A structure for acceptance of The Repository Interoperability Guidelines World Wide - Executive enforcement enabling action on adopting Interoperability Guidelines for Repositories, World Wide, on a National and local level [Figure repeated from slide 4, by Radar Networks / TWINE: amount of data versus productivity of search]
    30. Questions • What strategies can be used to create a global “Trias Politica” for repositories in order to enforce “reliable content provision” by using interoperability guidelines? • What strategies are there to maintain repository guidelines? Who is responsible? • What strategies are known to create an acceptance mechanism for global agreement to repository guidelines? • What strategies can be used to enforce repository guidelines? • Who is responsible for the (metadata) quality of the repository output?
    31. The end. Thank you. Maurice Vanderfeesten, www.SURFfoundation.nl, vanderfeesten@surf.nl
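
Slides 9 and 11 show why a service provider currently has to normalize every repository's values by hand. The sketch below illustrates that effort for dc:type and dc:language, using only the example values from those slides. It is not part of the DRIVER guidelines or validator: the lookup tables, the function names and the wrapping <record> element are assumptions made for this illustration, and values that cannot be mapped without guessing (such as "Text") are returned as None.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Namespace of the Dublin Core element set used by the dc: prefix.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical, hand-built tables covering only the values shown on the slides.
LANGUAGE_TO_ISO639_3: Dict[str, str] = {
    "nederlands": "nld",
    "ned": "nld",
    "nl": "nld",
    "nld/dut": "nld",
    "en_uk": "eng",
    "mn": "mon",
}
TYPE_TO_EU_REPO: Dict[str, str] = {
    "phd": "info:eu-repo/semantics/doctoralThesis",
    "doctoral": "info:eu-repo/semantics/doctoralThesis",
    "text.thesis.doctoral": "info:eu-repo/semantics/doctoralThesis",
}


def normalize_language(value: str) -> Optional[str]:
    """Return an ISO 639-3 code, or None if the value is not in the table."""
    return LANGUAGE_TO_ISO639_3.get(value.strip().lower())


def normalize_type(value: str) -> Optional[str]:
    """Return an info:eu-repo/semantics/ type, or None if it is unknown."""
    return TYPE_TO_EU_REPO.get(value.strip().lower())


def normalize_record(xml_text: str) -> Dict[str, List[Optional[str]]]:
    """Normalize every dc:language and dc:type element found in a record."""
    root = ET.fromstring(xml_text)
    languages = [normalize_language(e.text or "")
                 for e in root.iter(f"{{{DC_NS}}}language")]
    types = [normalize_type(e.text or "")
             for e in root.iter(f"{{{DC_NS}}}type")]
    return {"language": languages, "type": types}


# Illustrative record; the <record> wrapper is an assumption for this sketch.
SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:type>Text</dc:type>
  <dc:type>PhD</dc:type>
  <dc:language>Nederlands</dc:language>
</record>"""

if __name__ == "__main__":
    print(normalize_record(SAMPLE))
```

Every repository that already exposes the recommended values makes a table like this unnecessary, which is the workload sharing that slide 14 refers to.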

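The link-resolver idea on slide 17 can be pictured as a lookup from a stable identifier to the current URL, which the repository is obliged to keep up to date. The following is a minimal in-memory sketch under that assumption. The class name, method names, identifier and example.org URLs are all hypothetical, and nothing here describes how an actual global resolver is implemented.

```python
from typing import Dict, Optional


class Resolver:
    """Toy resolver: citations use the identifier, never the URL itself."""

    def __init__(self) -> None:
        self._urls: Dict[str, str] = {}

    def update(self, identifier: str, url: str) -> None:
        """Register or replace the current location of an identifier."""
        self._urls[identifier] = url

    def resolve(self, identifier: str) -> Optional[str]:
        """Return the current URL, or None for an unknown identifier."""
        return self._urls.get(identifier)


if __name__ == "__main__":
    resolver = Resolver()
    # Hypothetical identifier and URLs, for illustration only.
    resolver.update("oai:example.org:1729", "https://old.example.org/thesis/1729")
    # The repository moves the document and fulfils its obligation to update.
    resolver.update("oai:example.org:1729", "https://new.example.org/item/1729")
    print(resolver.resolve("oai:example.org:1729"))
```

Because a citation stores only the identifier, moving the document changes a single entry in the resolver rather than breaking every link that points to it.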