Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library
Published in: Technology, Education

  • VIAF: Virtual International Authority File
    GeoNames: geographical database
    MACS: Multilingual Access to Subjects, used to retrieve records with subjects in multiple languages
    LCSH: Library of Congress Subject Headings
  • EDM main CLASSES
  • Transcript

    • 1. Europeana Cloud: Ingestion and Aggregation Workshop. Chiara Latronico, Operations Officer, The European Library. Europeana Cloud Kick-Off Meeting, Den Haag, 04-05 March 2013
    • 2. Agenda: The European Library; Datasets life-cycle and workflows; Content ingestion questionnaire; Ingestion tools; Aggregation and delivery to Europeana; Europeana Data Model (EDM); Full-text index; Questions
    • 3. The European Library: 48 National Libraries; ~ 40 Research and University Libraries; ~ 115 Million Bibliographic Records; > 16 Million Digital Objects; > 25 Million Pages of Full-text
    • 4. The European Library: Data access point for researchers; Combination of bibliographic records and metadata for digital objects; Aggregator for Europeana Cloud
    • 5. The European Library Aggregation into Europeana
    • 6. The European Library Ingestion Workflow: Content ingestion questionnaire; Scheduling of ingestion; Datasets ready for harvesting; Create case in CRM (case # to provider); Harvesting metadata; Enhance metadata (VIAF, GeoNames, MACS, ...); Indexing in acceptance portal; E-mail to provider to accept dataset; Live index = live portal; Delivery to Europeana; Enhancing and publishing in Europeana
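
The workflow on this slide reads as a linear sequence of stages. A minimal sketch, assuming each dataset moves through the stages strictly in the listed order; the stage names and the `next_stage` helper are hypothetical labels chosen for illustration, not identifiers from The European Library's systems:

```python
# Hypothetical stage labels, one per step on the workflow slide, in slide order.
STAGES = [
    "questionnaire_submitted",
    "ingestion_scheduled",
    "ready_for_harvesting",
    "crm_case_created",
    "metadata_harvested",
    "metadata_enhanced",
    "indexed_in_acceptance_portal",
    "awaiting_provider_acceptance",
    "live_in_portal",
    "delivered_to_europeana",
    "published_in_europeana",
]


def next_stage(current):
    """Return the stage that follows `current`, or None after the last stage."""
    position = STAGES.index(current)  # raises ValueError for an unknown stage
    if position == len(STAGES) - 1:
        return None
    return STAGES[position + 1]
```

For example, `next_stage("metadata_harvested")` gives the enhancement stage, matching the order on the slide.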
    • 7. Content Ingestion Questionnaire, Web-form. Personal information (about the person filling in the web-form): Name & surname; Job title; E-mail address; Skype address. Information about organization: Organization name; Country; Website; Type of institution
    • 8. Content Ingestion Questionnaire, Harvesting Details. Which protocol will be used to transfer data? OAI-PMH; File; Z39.50; FTP; HTTP. Harvesting time and date preferences. How often will dataset(s) need to be updated? Weekly; Monthly; Quarterly; Annually; On demand
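
For providers choosing OAI-PMH, a harvest amounts to a series of `ListRecords` requests. A minimal sketch of how such a request URL could be built, assuming a hypothetical endpoint `https://example.org/oai`; the argument names `verb`, `metadataPrefix`, `set` and `resumptionToken` come from the OAI-PMH protocol, while the function name is only illustrative:

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (no network access here)."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is exclusive: it is sent with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and set name, for illustration only.
first_page = list_records_url("https://example.org/oai", set_spec="maps")
```

A harvester would repeat the request with the `resumptionToken` returned by the server until no token is left.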
    • 9. Content Ingestion Questionnaire, Information about dataset(s): Number of dataset(s) to be ingested; Number of records to be expected; Number of digital objects to be expected; Contact person(s) per dataset(s): Editorial (for collection description), Technical (for collection ingestion)
    • 10. Content Ingestion Questionnaire, Information about Metadata. Metadata standard(s) available to describe objects: Marc21; MarcXchange; Unimarc; ESE; EDM; METS; MODS; OAI_DC; TEI. Number of formats available per dataset
    • 11. Content Ingestion Questionnaire, Information about Metadata. Are the metadata ready? If yes, for which dataset(s)? If not, when will they be ready? Type of digital objects per dataset(s): TEXT; IMAGE; AUDIO; VIDEO
    • 12. Content Ingestion Questionnaire, Information about Content. Will content be delivered in addition to metadata? If yes, for which dataset(s)? If yes, in which format(s)? Has the content been digitized? If yes, for which dataset(s)? If not, when will the content be available?
    • 13. Content Ingestion Questionnaire, Information about Authority. Will authority files be delivered? If yes, for which dataset(s)? If yes, in which format(s)? Are controlled vocabularies utilized? If yes, which kind (Classification, Thesauri, Subject Headings, Other)? If yes, for which dataset(s)? Will full-text be delivered? If yes, for which dataset(s)? If yes, in which format(s)?
    • 14. Content Ingestion Questionnaire, Submit. If you make a mistake, we can fix it! After submitting, contact collections@theeuropeanlibrary.org
    • 15. SugarCRM, Customer Relationship Management tool: management of aggregation tasks
    • 16. SugarCRM, Customer Relationship Management tool: generation of automated reports
    • 17. SugarCRM, Customer Relationship Management tool. In SugarCRM: Organizations, contacts, datasets, projects and more. SugarCRM is utilized for: Collections control; Ingestion plans; Automated reports; Cases per specific dataset
    • 18. The European Library System architecture
    • 19. UIM Unique Ingestion Management
    • 20. Dataset in Acceptance Portal. Acceptance Portal: Test environment; Providers to validate data. Reports via UIM workflows: Link Validation; Field Validation
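
The slide names two report types, link validation and field validation, without describing the checks themselves. A rough sketch of what such checks might look like, assuming records are plain dictionaries; the required field names and both helper functions are hypothetical and do not describe UIM's actual rules:

```python
from urllib.parse import urlparse

# Hypothetical required fields; the real field rules are not given in the slides.
REQUIRED_FIELDS = ("title", "identifier")


def missing_fields(record):
    """Field validation sketch: list required fields that are absent or blank."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def is_well_formed_link(url):
    """Link validation sketch: syntax check only, no HTTP request is made."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

A production link check would also request each URL and inspect the response status; that step is left out here so the example runs offline.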
    • 21. Dataset in Acceptance Portal
    • 22. When a Dataset is in the Acceptance Portal: Create an account on http://www.theeuropeanlibrary.org/; Use your credentials to log in to the acceptance portal at http://www.tel.ulcc.ac.uk/acceptance/; Validate data using the tabs for Default and XML
    • 23. Dataset in Acceptance Portal
    • 24. Dataset(s) in Live Index and Portal. When a provider accepts dataset(s): E-mail; Dataset(s) ready for live index; Dataset(s) ready for Europeana; Dataset(s) indexed into the live portal. It takes ~ 24 hrs for dataset(s) to become searchable in the live portal
    • 25. Dataset(s) Live in Europeana. When a provider accepts dataset(s): Dataset(s) delivered to Europeana; Europeana publishes live once a month; Delivery deadline ~ 21st of each month; Dataset(s) searchable in Europeana by the following month; Dataset(s) published live in Europeana; E-mail to provider with link to dataset(s) in the Europeana portal
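
The monthly cycle can be turned into simple date arithmetic. A small sketch, assuming the cut-off is exactly the 21st (the slide only says "~ 21") and that a dataset accepted on the cut-off day itself still makes that month's delivery; the function name is hypothetical:

```python
from datetime import date


def next_delivery_deadline(accepted_on, cutoff_day=21):
    """Return the first delivery deadline on or after the acceptance date."""
    if accepted_on.day <= cutoff_day:
        # Accepted in time for this month's delivery (assumption: cut-off inclusive).
        return date(accepted_on.year, accepted_on.month, cutoff_day)
    if accepted_on.month == 12:
        # Roll over into January of the next year.
        return date(accepted_on.year + 1, 1, cutoff_day)
    return date(accepted_on.year, accepted_on.month + 1, cutoff_day)
```

Under these assumptions, a dataset accepted on 5 March 2013 falls under the 21 March delivery, while one accepted on 22 December 2013 waits for 21 January 2014.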
    • 26. EDM – Europeana Data Model. Europeana Libraries project: EDM for library data. Europeana Cloud project: EDM for museum and archive metadata & content. Delivery in EDM to Europeana
    • 27. EDM – Europeana Data Model
    • 28. Europeana Preview
    • 29. Europeana Preview
    • 30. Full-Text (OCR): Continue full-text indexing
    • 31. Full-Text (OCR), Full-text & OCR: URLs to OCR texts in the metadata; Extraction of full-text; Full-text indexing
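
The three steps on this slide (OCR URLs in the metadata, extraction, indexing) end in a searchable index. A toy sketch of the last step only, assuming the OCR text has already been fetched from the URLs and is held in memory per record; the record IDs, sample texts and `build_index` function are made up for illustration and say nothing about the indexing software actually used:

```python
import re
from collections import defaultdict


def build_index(ocr_texts):
    """Map each lower-cased word to the set of record IDs whose text contains it."""
    index = defaultdict(set)
    for record_id, text in ocr_texts.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(record_id)
    return index


# Made-up records standing in for extracted OCR text.
index = build_index({
    "rec1": "The Hague library",
    "rec2": "Library of maps",
})
```

Looking up `index["library"]` then returns both record IDs, which is the basic behaviour a full-text search needs.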
    • 32. Full-Text (OCR), Continuing the work on full-text: Europeana Newspapers; Europeana Cloud
    • 33. Summary. What we would like to have from you: Richest possible metadata; Content; Full-text; Authority files or ontologies
    • 34. Thank you! Questions? For any questions or feedback, contact collections@theeuropeanlibrary.org. Chiara Latronico, Chiara.Latronico@kb.nl
    • 35. www.theeuropeanlibrary.org