Chiara Latronico, Europeana Cloud - Ingestion and Aggregation Workshop, The European Library
  • VIAF: Virtual International Authority File
    GeoNames: geographical database
    MACS: to retrieve records with subjects in multiple languages
    LCSH: Library of Congress Subject Headings
  • EDM main CLASSES

    1. europeana cloud: Ingestion and Aggregation Workshop
        Chiara Latronico, Operations Officer, The European Library
        Europeana Cloud Kick-Off Meeting, Den Haag, 04-05 March 2013
    2. Agenda
        • The European Library
        • Datasets life-cycle and workflows
        • Content ingestion questionnaire
        • Ingestion tools
        • Aggregation and delivery to Europeana
        • Europeana Data Model (EDM)
        • Full-text index
        • Questions
    3. The European Library
        • 48 National Libraries
        • ~40 Research and University Libraries
        • ~115 Million Bibliographic Records
        • > 16 Million Digital Objects
        • > 25 Million Pages of Full-text
    4. The European Library
        • Data access point for researchers
        • Combination of bibliographic records and metadata for digital objects
        • Aggregator for Europeana Cloud
    5. The European Library: Aggregation into Europeana
    6. The European Library: Ingestion Workflow
        • Content ingestion questionnaire
        • Scheduling of ingestion
        • Datasets ready for harvesting
        • Create case in CRM: case # to provider
        • Harvesting metadata
        • Enhance metadata (VIAF, GeoNames, MACS, ...)
        • Indexing in acceptance portal
        • E-mail to provider to accept dataset
        • Live index = live portal
        • Delivery to Europeana
        • Enhancing and publishing in Europeana
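
The ingestion workflow on slide 6 is a strictly ordered sequence of steps, so it can be modelled as a simple ordered list. The sketch below is only an illustration: the stage identifiers and the `next_stage` helper are made-up names that paraphrase the slide, not part of any tool used by The European Library.

```python
from typing import Optional

# Stage names paraphrase the workflow slide; the identifiers are illustrative only.
WORKFLOW = [
    "questionnaire_received",
    "ingestion_scheduled",
    "dataset_ready_for_harvesting",
    "crm_case_created",
    "metadata_harvested",
    "metadata_enhanced",
    "indexed_in_acceptance_portal",
    "provider_asked_to_accept",
    "indexed_in_live_portal",
    "delivered_to_europeana",
    "published_in_europeana",
]


def next_stage(current: str) -> Optional[str]:
    """Return the stage that follows `current`, or None after the last stage."""
    position = WORKFLOW.index(current)  # raises ValueError for an unknown stage
    if position + 1 < len(WORKFLOW):
        return WORKFLOW[position + 1]
    return None
```

For example, `next_stage("metadata_harvested")` returns `"metadata_enhanced"`, mirroring the order in which the slide lists harvesting and enhancement.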
    7. Content Ingestion Questionnaire: Web-form
        • Personal Information (about the person filling in the web-form)
            • Name & surname
            • Job title
            • E-mail address
            • Skype address
        • Information about Organization
            • Organization name
            • Country
            • Website
            • Type of institution
    8. Content Ingestion Questionnaire: Harvesting Details
        • Which protocol will be used to transfer data?
            • OAI-PMH
            • File
            • Z39.50
            • FTP
            • HTTP
        • Harvesting time and date preferences
        • How often will the dataset(s) need to be updated?
            • Weekly
            • Monthly
            • Quarterly
            • Annually
            • On demand
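
For providers who choose OAI-PMH, harvesting comes down to a series of `ListRecords` requests against the provider's endpoint. The sketch below only builds the request URL and does no network access. The base URL and set name are placeholders, and `list_records_url` is a hypothetical helper, not the harvester used by The European Library. It relies on one rule from the OAI-PMH protocol: when a `resumptionToken` is sent, it is the only argument besides `verb`.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
    resumption_token: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL (no request is sent)."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: send it with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and set name, for illustration only.
print(list_records_url("https://example.org/oai", set_spec="books"))
```

A harvester would call this once without a token, then keep following the token returned in each response until the repository returns an empty one.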
    9. Content Ingestion Questionnaire: Information about dataset(s)
        • Number of dataset(s) to be ingested
        • Number of records to be expected
        • Number of digital objects to be expected
        • Contact person(s) per dataset(s)
            • Editorial: for collection description
            • Technical: for collection ingestion
    10. Content Ingestion Questionnaire: Information about Metadata
        • Metadata standard(s) available to describe objects
            • Marc21
            • MarcXchange
            • Unimarc
            • ESE
            • EDM
            • METS
            • MODS
            • OAI_DC
            • TEI
        • Number of formats available per dataset
    11. Content Ingestion Questionnaire: Information about Metadata
        • Are the metadata ready?
            • If yes, for which dataset(s)?
            • If not, when will they be ready?
        • Type of digital objects per dataset(s)
            • TEXT
            • IMAGE
            • AUDIO
            • VIDEO
    12. Content Ingestion Questionnaire: Information about Content
        • Will content be delivered in addition to metadata?
            • If yes, for which dataset(s)?
            • If yes, in which format(s)?
        • Has the content been digitized?
            • If yes, for which dataset(s)?
            • If not, when will the content be available?
    13. Content Ingestion Questionnaire: Information about Authority
        • Will authority files be delivered?
            • If yes, for which dataset(s)?
            • If yes, in which format(s)?
        • Are controlled vocabularies used?
            • If yes, which kind? Classification, Thesauri, Subject Headings, Other
            • If yes, for which dataset(s)?
        • Will full-text be delivered?
            • If yes, for which dataset(s)?
            • If yes, in which format(s)?
    14. Content Ingestion Questionnaire: Submit
        • If you make a mistake, we can fix it!
        • After submitting, contact collections@theeuropeanlibrary.org
    15. SugarCRM: Customer Relationship Management tool
        • Aggregation task management in SugarCRM
    16. SugarCRM: Customer Relationship Management tool
        • Automated report generation in SugarCRM
    17. SugarCRM: Customer Relationship Management tool
        • In SugarCRM
            • Organizations, contacts, datasets, projects and more
        • SugarCRM is used for
            • Collections control
            • Ingestion plans
            • Automated reports
            • Cases per specific dataset
    18. The European Library: System architecture
    19. UIM: Unique Ingestion Management
    20. Dataset in Acceptance Portal
        • Acceptance Portal
            • Test environment
            • Providers to validate data
        • Reports via UIM workflows
            • Link Validation
            • Field Validation
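
Slide 20 names two kinds of validation report, link validation and field validation, without describing the checks themselves. As a rough illustration of what such checks can look like, the sketch below flags empty required fields and syntactically malformed URLs in a record held as a plain dictionary. Everything here is assumed for the example: the field names, the `REQUIRED_FIELDS` list and both helper functions are hypothetical, and a real link check would also need to fetch each URL.

```python
from typing import Dict, List, Sequence
from urllib.parse import urlparse

# Hypothetical required fields; the real UIM rules are not given in the slides.
REQUIRED_FIELDS = ("title", "identifier")


def missing_fields(record: Dict[str, str]) -> List[str]:
    """Field validation: list required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def malformed_links(
    record: Dict[str, str], link_fields: Sequence[str] = ("object_url",)
) -> List[str]:
    """Link validation (syntax only): list URLs lacking an http(s) scheme or a host."""
    bad = []
    for name in link_fields:
        url = record.get(name)
        if not url:
            continue
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            bad.append(url)
    return bad


record = {"title": "Example", "identifier": "", "object_url": "htp://bad"}
print(missing_fields(record), malformed_links(record))
```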
    21. Dataset in Acceptance Portal
    22. When a Dataset Is in the Acceptance Portal
        • Create an account on http://www.theeuropeanlibrary.org/
        • Use your credentials to log in to the acceptance portal: http://www.tel.ulcc.ac.uk/acceptance/
        • Validate data using the tabs for
            • Default
            • XML
    23. Dataset in Acceptance Portal
    24. Dataset(s) in Live Index and Portal
        • When a provider accepts dataset(s)
            • E-mail
            • Dataset(s) ready for live index
            • Dataset(s) ready for Europeana
        • Dataset(s) indexed into the live portal
            • It takes ~24 hrs for dataset(s) to become searchable in the live portal
    25. Dataset(s) Live in Europeana
        When a provider accepts dataset(s):
        • Dataset(s) delivered to Europeana
            • Europeana publishes live once a month
            • Delivery deadline: ~21st of each month
            • Dataset(s) searchable in Europeana by the following month
        • Dataset(s) published live in Europeana
            • E-mail to provider with link to the dataset(s) in the Europeana portal
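
The monthly publication cycle on slide 25 implies a simple date rule: a dataset accepted by about the 21st can be expected in Europeana the following month, and one accepted later waits a further cycle. The sketch below works through that arithmetic. It treats the 21st as a hard cut-off, which is an assumption (the slide says "~21"), and `expected_publication_month` is a hypothetical helper name.

```python
from datetime import date
from typing import Tuple

DEADLINE_DAY = 21  # approximate on the slide; treated as a hard cut-off here


def expected_publication_month(accepted: date) -> Tuple[int, int]:
    """Return (year, month) in which a dataset accepted on `accepted` should be searchable."""
    # On or before the deadline: next month. After it: the month after next.
    months_ahead = 1 if accepted.day <= DEADLINE_DAY else 2
    total = accepted.year * 12 + (accepted.month - 1) + months_ahead
    return total // 12, total % 12 + 1


print(expected_publication_month(date(2013, 3, 5)))    # accepted before the deadline
print(expected_publication_month(date(2013, 12, 22)))  # accepted after the deadline
```

Counting months as a single integer keeps the year rollover in December correct without special cases.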
    26. EDM – Europeana Data Model
        • Europeana Libraries project
            • EDM for library data
        • Europeana Cloud Project
            • EDM for museum and archive metadata & content
        • Delivery in EDM to Europeana
    27. EDM – Europeana Data Model
    28. Europeana Preview
    29. Europeana Preview
    30. Full-Text (OCR)
        • Continue full-text indexing
    31. Full-Text (OCR): Full-text & OCR
        • URLs to OCR texts in the metadata
        • Extraction of full-text
        • Full-text indexing
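
The three steps on slide 31 (URLs in the metadata, extraction, indexing) can be sketched end to end on toy data. In the example below, the element name `fulltextUrl`, the record layout and both function names are hypothetical, since the slides do not say which metadata field carries the OCR URL. Fetching the OCR text is left out, so the index is built from text passed in directly.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List, Set


def ocr_urls(record_xml: str, tag: str = "fulltextUrl") -> List[str]:
    """Step 1: collect OCR text URLs from a metadata record (tag name is assumed)."""
    root = ET.fromstring(record_xml)
    return [el.text.strip() for el in root.iter(tag) if el.text and el.text.strip()]


def build_index(texts: Dict[str, str]) -> Dict[str, Set[str]]:
    """Step 3: build a minimal inverted index mapping each token to document ids."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in texts.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(doc_id)
    return index


record = "<record><title>T</title><fulltextUrl>http://example.org/ocr/1.txt</fulltextUrl></record>"
print(ocr_urls(record))
# Step 2 (downloading the OCR text) is omitted; the texts are supplied inline.
index = build_index({"doc1": "Den Haag", "doc2": "den"})
print(sorted(index["den"]))
```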
    32. Full-Text (OCR): Continuing the work on full-text
        • Europeana Newspapers
        • Europeana Cloud
    33. Summary: What we would like to have from you
        • Richest possible metadata
        • Content
        • Full-text
        • Authority files or ontologies
    34. Thank you! Questions?
        For any questions or feedback, contact collections@theeuropeanlibrary.org
        Chiara Latronico, Chiara.Latronico@kb.nl
    35. www.theeuropeanlibrary.org
