Europeana Cloud - Ingestion and Aggregation Workshop


Europeana Cloud Kick-Off Meeting, 5 March 2013, The Hague, The Netherlands. By Chiara Latronico

Published in: Technology, Education

  • VIAF: Virtual International Authority File; GeoNames: geographical database; MACS: to retrieve records with subjects in multiple languages; LCSH: Library of Congress Subject Headings
  • EDM main CLASSES

    1. Europeana Cloud: Ingestion and Aggregation Workshop. Chiara Latronico, Operations Officer, The European Library. Europeana Cloud Kick-Off Meeting, Den Haag, 04-05 March 2013
    2. Agenda
        • The European Library
        • Datasets life-cycle and workflows
        • Content ingestion questionnaire
        • Ingestion tools
        • Aggregation and delivery to Europeana
        • Europeana Data Model (EDM)
        • Full-text index
        • Questions
    3. The European Library
        • 48 National Libraries
        • ~ 40 Research and University Libraries
        • ~ 115 Million Bibliographic Records
        • > 16 Million Digital Objects
        • > 25 Million Pages of Full-text
    4. The European Library
        • Data access point for researchers
        • Combination of bibliographic records and metadata for digital objects
        • Aggregator for Europeana Cloud
    5. The European Library: Aggregation into Europeana
    6. The European Library: Ingestion Workflow
        • Content ingestion questionnaire
        • Scheduling of ingestion
        • Datasets ready for harvesting
        • Create case in CRM: case # to provider
        • Harvesting metadata
        • Enhance metadata (VIAF, GeoNames, MACS, ...)
        • Indexing in acceptance portal
        • E-mail to provider to accept dataset
        • Live index = live portal
        • Delivery to Europeana
        • Enhancing and publishing in Europeana
    7. Content Ingestion Questionnaire: Web-form
        • Personal Information (about the person filling in the web-form): Name & surname, Job title, E-mail address, Skype address
        • Information about Organization: Organization name, Country, Website, Type of institution
    8. Content Ingestion Questionnaire: Harvesting Details
        • Which protocol will be used to transfer data? OAI-PMH, File, Z39.50, FTP, HTTP
        • Harvesting time and dates preferences
        • How often will dataset(s) need to be updated? Weekly, Monthly, Quarterly, Annually, On demand
    9. Content Ingestion Questionnaire: Information about dataset(s)
        • Number of dataset(s) to be ingested
        • Number of records to be expected
        • Number of digital objects to be expected
        • Contact person(s) per dataset(s): Editorial (for collection description), Technical (for collection ingestion)
    10. Content Ingestion Questionnaire: Information about Metadata
        • Metadata standard(s) available to describe objects: Marc21, MarcXchange, Unimarc, ESE, EDM, METS, MODS, OAI_DC, TEI
        • Number of formats available per dataset
    11. Content Ingestion Questionnaire: Information about Metadata
        • Are the metadata ready? If yes, for which dataset(s)? If not, when will they be ready?
        • Type of digital objects per dataset(s): TEXT, IMAGE, AUDIO, VIDEO
    12. Content Ingestion Questionnaire: Information about Content
        • Will content be delivered in addition to metadata? If yes, for which dataset(s)? If yes, in which format(s)?
        • Has the content been digitized? If yes, for which dataset(s)? If not, when will the content be available?
    13. Content Ingestion Questionnaire: Information about Authority
        • Will authority files be delivered? If yes, for which dataset(s)? If yes, in which format(s)?
        • Are controlled vocabularies utilized? If yes, which kind (Classification, Thesauri, Subject Headings, Other)? If yes, for which dataset(s)?
        • Will full-text be delivered? If yes, for which dataset(s)? If yes, in which format(s)?
    14. Content Ingestion Questionnaire: Submit. If you make a mistake, we can fix it! After
    15. SugarCRM (Customer Relationship Management tool): aggregation tasks management
    16. SugarCRM (Customer Relationship Management tool): generation of automated reports
    17. SugarCRM (Customer Relationship Management tool)
        • In SugarCRM: organizations, contacts, datasets, projects and more
        • SugarCRM is utilized for: collections control, ingestion plans, automated reports, cases per specific dataset
    18. The European Library: System architecture
    19. UIM: Unique Ingestion Management
    20. Dataset in Acceptance Portal
        • Acceptance Portal: test environment; providers to validate data
        • Reports via UIM workflows: Link Validation, Field Validation
    21. Dataset in Acceptance Portal
    22. When Dataset in Acceptance Portal
        • Create an account on
        • Use credentials to log in to acceptance
        • Validate data using tabs for Default XML
    23. Dataset in Acceptance Portal
    24. Dataset(s) in Live Index and Portal
        • When a provider accepts dataset(s)
        • E-mail
        • Dataset(s) ready for live index
        • Dataset(s) ready for Europeana
        • Dataset(s) indexed into the live portal
        • It takes ~ 24 hrs for dataset(s) to be searchable in the live portal
    25. Dataset(s) Live in Europeana
        • When a provider accepts dataset(s)
        • Dataset(s) delivered to Europeana
        • Europeana publishes live once a month
        • Delivery deadline ~ 21st of each month
        • Dataset(s) searchable in Europeana by following month
        • Dataset(s) published live in Europeana
        • E-mail to provider with link to dataset(s) in Europeana portal
    26. EDM – Europeana Data Model
        • Europeana Libraries project: EDM for library data
        • Europeana Cloud Project: EDM for museum and archive metadata & content
        • Delivery in EDM to Europeana
    27. EDM – Europeana Data Model
    28. Europeana Preview
    29. Europeana Preview
    30. Full-Text (OCR): Continue Full-text indexing
    31. Full-Text (OCR)
        • Full-text & OCR
        • URLs to OCR texts into metadata
        • Extraction of Full-text
        • Full-text indexing
    32. Full-Text (OCR): Continue the work on Full-text
        • Europeana Newspapers
        • Europeana Cloud
    33. Summary: What we would like to have from you
        • Richest possible metadata
        • Content
        • Full-text
        • Authority files or ontologies
    34. Thank you! Questions? For any questions or feedback, contact collections@theeuropeanlibrary.org. Chiara
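
The questionnaire's Harvesting Details slide lists OAI-PMH among the transfer protocols. As a rough illustration only, the sketch below builds OAI-PMH ListRecords request URLs and reads the resumption token from a response page. The base URL, the set name, and the function names are hypothetical and do not come from the slides. Only the verb, the argument names (metadataPrefix, set, resumptionToken) and the XML namespace are taken from the OAI-PMH 2.0 protocol. The sketch makes no network calls and says nothing about how The European Library's own harvester is implemented.

```python
from typing import Optional
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
    resumption_token: Optional[str] = None,
) -> str:
    """Build a ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent with the verb and nothing else.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def next_token(response_xml: str) -> Optional[str]:
    """Return the resumption token of a response page, or None on the last page."""
    root = ET.fromstring(response_xml)
    token = root.find(".//oai:resumptionToken", OAI_NS)
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()


# Usage with a made-up endpoint and set name (assumptions, not real services).
first_page = list_records_url("https://example.org/oai", set_spec="maps")
sample_response = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    "<ListRecords><resumptionToken>page2</resumptionToken></ListRecords>"
    "</OAI-PMH>"
)
following_page = list_records_url(
    "https://example.org/oai", resumption_token=next_token(sample_response)
)
```

A harvester would repeat the second request until the response carries no resumption token, which is how OAI-PMH signals the final page of a dataset.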