Challenges and Issues for Aggregators 
Alastair Dunning, Julia Fallon, Pavel Kats 
October 2014
The Europeana Cloud project wished to 
find out more about the challenges 
and issues faced by aggregators in 
the Europeana ecosystem
A series of discussions was framed 
around this central point: 
“What are aggregators' current 
and future technical and 
strategic challenges?”
Interviews took place with 
the following: 
HOPE 
EU Screen 
European Film Gateway 
Deutsche Digitale Bibliothek 
Europeana Fashion 
Existing data from earlier 
interviews was also used: 
The European Library 
Poznan Supercomputing Centre 
Apex 
Cultura Italia 
Europeana 
(An interview with Hispana also took 
place but was too late to incorporate 
into the results)
The answers will be used to inform the Europeana 
Cloud service
The key findings are here:
Metadata mapping is a slow process, with too many 
steps, services and discussions required 
● EFG does conceptual mapping of metadata, but actual mapping is 
executed via staff at ISTI. 
● EFG “wants to have tools for transformation of data without 
technicians' help” 
● DDB have similar workflow, with different places and staff doing the 
conceptual and technical mapping 
● Cultura Italia wishes its data providers to have much more ability to 
share and edit their own records. This is very limited at the moment 
● EU Screen also interested in harmonising its complex workflow 
● HOPE have too much email back-and-forth with partners to iron out 
issues in metadata mapping 
● Apex want to save different mapping profiles for different types of 
content
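The Apex request for saved mapping profiles can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the profile names, the source field names and the target properties are hypothetical and are not taken from Apex, MINT, DNET or any aggregator's real schema.

```python
from typing import Dict

# Hypothetical mapping profiles: one per content type, each mapping a
# provider's field name to a target property. All names are illustrative.
MAPPING_PROFILES: Dict[str, Dict[str, str]] = {
    "film": {"titolo": "dc:title", "regista": "dc:creator", "anno": "dc:date"},
    "photo": {"titolo": "dc:title", "fotografo": "dc:creator"},
}


def apply_profile(record: Dict[str, str], content_type: str) -> Dict[str, str]:
    """Rename the fields of one record according to the saved profile."""
    profile = MAPPING_PROFILES[content_type]
    # Fields without a mapping are dropped so that gaps show up in review.
    return {profile[key]: value for key, value in record.items() if key in profile}


mapped = apply_profile({"titolo": "Roma", "regista": "Rossi", "note": "x"}, "film")
print(mapped)
```

Keeping the profile as plain data rather than code is one way a metadata expert could edit it without a technician's help, which is the wish EFG expresses above.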
Tools for ingestion and mapping are not as reliable as 
they could be; also some worries over availability of 
data 
● Fashion wants better availability of tools for mapping (e.g. at 
weekends and during holidays) 
● EU Screen has identical issues with the software; EFG mentioned 
this as well 
● TEL needs tools and access to data to be reliable and provide for 
redundancy; same for Poznan
Ingestion and mapping tools have poor usability, 
and sometimes cannot be used without technical 
expertise 
● Fashion like using MINT but want it to improve; they would like 
better interfaces and greater functionality (better editing of 
groups of metadata, for instance) 
● EFG want to have better usability (and documentation) so 
individual data providers can use MINT directly 
● Key concern of EU Screen as well; better usability would allow 
data providers closer access to data 
● Key concern of Poznan too; time wasted getting developers to 
do things that a good interface would allow metadata experts 
to do
Aggregators need the ability to curate (ingest, map, 
enrich) their data at greater speed 
● DDB expect to move from 130 data providers to 1000s. Currently can deal with 
30,000 records an hour; will need more in the future 
● TEL needs capacity to manage and enrich several million records 
● EU Screen needs to deal with larger datasets, and for quicker processing
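To give a feel for the scale involved, the arithmetic can be sketched as below. The 30,000 records an hour is the DDB figure quoted above; the 3,000,000 records is an assumed stand-in for "several million", and combining the two figures is purely illustrative, since they come from different aggregators.

```python
def hours_to_process(records: int, records_per_hour: int = 30_000) -> float:
    """Hours needed to process `records` at a constant hourly rate."""
    return records / records_per_hour


# 3,000,000 is an assumed volume, not a figure from the interviews.
hours = hours_to_process(3_000_000)
print(f"{hours:.0f} hours, or about {hours / 24:.1f} days of continuous processing")
```

Under these assumptions a single pass over the data takes roughly four days of uninterrupted processing, before any re-ingestion after a mapping change.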
Aggregators need better mechanisms for managing 
identifiers and authority files 
● DDB has massive problems with identifiers: many data providers do 
not have persistent identifiers, and some systems seem to produce 
new identifiers with every new ingest. Some kind of mechanism is 
needed to help improve this. 
● EFG wants to work on clustering, but automation has not yet been 
successful; crowdsourcing may help with this? 
● Fashion wishes to implement its own authority files for fashion 
designers. This needs to be reliable 
● EFG also wants to develop authority files 
● Cultura Italia wants to maintain and update its SKOSified thesaurus 
to allow for cross-domain enrichment
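One possible mechanism for the identifier problem is to derive the identifier deterministically instead of generating a fresh one at each ingest. The sketch below is not DDB's approach; it assumes each record carries some stable local key from the provider (an inventory number, for example), and the namespace URL and function name are hypothetical. Where a provider has no stable local key at all, this technique does not help.

```python
import uuid

# Hypothetical namespace for the aggregator. Any fixed UUID works, as long
# as it never changes between ingests.
AGGREGATOR_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/aggregator")


def stable_identifier(provider_id: str, local_id: str) -> str:
    """Derive the same identifier on every ingest from provider and local IDs."""
    return str(uuid.uuid5(AGGREGATOR_NAMESPACE, f"{provider_id}/{local_id}"))


first_ingest = stable_identifier("museum-a", "inv-001")
second_ingest = stable_identifier("museum-a", "inv-001")
print(first_ingest == second_ingest)
```

Because a name-based UUID depends only on its inputs, re-ingesting the same record yields the same identifier, while the provider prefix keeps two institutions' identical local keys apart.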
Many see possibilities for enrichment, but do not 
have the tools for it 
● Fashion wants to do semi-automatic enrichment based on content analysis 
(e.g. color extraction) or semantic enrichment on metadata 
● EFG keen on geographical enrichment - do not have tools at the moment 
● Apex need to improve their enrichment processes before data providers will 
accept data back 
● TEL strategy focussed on enriching and aligning data; also need to 
harmonise what is done by TEL and what is done by Europeana 
● Poznan mentioned Virtual Transcription Tool as a possible way of enriching 
data 
● TEL also mentioned more complex tools and workflows to improve data and 
quality
Many aggregators are either already storing content ... 
● TEL already store content via their hosting providers 
● EU Screen make use of a third party for streaming TV content 
● Fashion make use of Amazon 
● HOPE have their own digital repository 
... or considering it as a long-term option 
● DDB store only previews, but are considering storing content. Demand from 
smaller institutions 
● Likewise, DFG considering strategic possibilities between content 
● Cultura Italia have considered doing storage and preservation 
Not an urgent problem for any aggregator, but it is becoming strategically 
important
Aggregators have different strategies for 
disseminating their data 
● DDB allows others to use its API, e.g. Archive Portals Germany. But little 
re-use of data as dumps 
● Fashion very keen to export to other sources (e.g. Tumblr, Pinterest) 
● TEL strategy depends on third party re-use of its data; very keen to see this 
built into eCloud 
● HOPE makes use of its digital repository for disseminating content 
● Others focussed on aggregating for Europeana
All aggregators encountered problems with the restriction of 
metadata to CC0. However, few rated it an urgent problem 
● DDB - Museums do not provide some metadata because of CC0. Would 
welcome broader approach. EFG and EU Screen said similar things 
● TEL wishes to provide access to data for research use only 
● Different concern for HOPE - for privacy and trust issues content must 
remain on their servers
Aggregator     | Who manages ingestion tools     | Where is data stored?
HOPE           | DNET (managed by ISTI, Pisa)    | Own hosting
EFG            | DNET (managed by ISTI, Pisa)    | ISTI
EU Screen      | MINT (managed by NTUA, Athens)  | Third party (content), Athens (metadata)
DDB            | FIZ Karlsruhe                   | FIZ Karlsruhe
Fashion        | MINT (managed by NTUA, Athens)  | Amazon (metadata and content), though processed in Athens
TEL            | Own tools                       | University of London Computer Centre
Europeana      | Own tools / MINT                | Under negotiation?
Poznan         | Own tools                       | Own hosting
Apex           | Own tools                       | Third party
Cultura Italia | Own tools                       | Third party
An example of the EFG workflow 
Cinecittà Luce S.p.A. provides original metadata to ISTI (the technical 
partners of European Film Gateway) … who then host a copy of the original 
metadata … meanwhile Cinecittà Luce S.p.A. provides a conceptual mapping 
to European Film Gateway, and the two of them agree on the mapping. 
European Film Gateway then sends the conceptual mapping to ISTI, who 
convert it from a concept into XSLT using their specific tools. This 
allows the original metadata to be converted into EDM. ISTI then tell EFG 
that the converted data is ready, EFG then tell Cinecittà Luce it is 
ready, and EFG and Cinecittà Luce look at the test version of the EDM to 
see if there are changes that need to be made. If there are, EFG get in 
touch with ISTI again with an updated conceptual mapping, which ISTI 
convert to XSLT with their specific tool to create an updated EDM version. 
EFG can then confirm with Cinecittà Luce that the updated data is correct. 
Then the data can be forwarded to the Europeana test portal, and Europeana 
asks EFG if the appearance in the Europeana model is correct and … 
… and of course if anyone gets ill, or the tools don’t quite work well, or 
if a new dataset with a new data structure appears … then the process 
is slower
The answers are now being used to inform the actual 
nature of the Europeana Cloud service
Timeline for Europeana Cloud 
2014 - Ongoing project (until 2016) with 3 aggregators (TEL, 
Europeana, Poznan) building shared storage system and services 
2015 and onwards - Ongoing work to connect tools and services 
to the Cloud 
2016 - eCloud open to other aggregators to join 
2018 (?) - eCloud open for data providers to join

Results of aggregator needs europeana cloud

  • 1. Challenges and Issues for Aggregators Alastair Dunning, Julia Fallon, Pavel Kats October 2014
  • 2. The European Cloud project wished to find out more about the challenges and issues faced by aggregators in the Europeana ecosystem
  • 3. A series of discussion were framed around this central point: “What are aggregators' current and future technical and strategic challenges ?”
  • 4. Existing data from earlier interviews was also used The European Library Poznan Supercomputing Centre Apex Cultura Italia Europeana Interviews took place with the following HOPE EU Screen European Film Gateway Deutsche Digitale Bibliotheek Europeana Fashion (An interview with Hispana also took place but was too late to incorporate into the results)
  • 5. The answers will be used to inform the Europeana Cloud service
  • 6. The key findings are here:
  • 7. Metadata mapping is a slow process, with too many steps, services and discussions required ● EFG does the conceptual mapping of metadata, but the actual mapping is executed by staff at ISTI ● EFG “wants to have tools for transformation of data without technicians' help” ● DDB have a similar workflow, with different places and staff doing the conceptual and technical mapping ● Cultura Italia wishes its data providers to have much more ability to share and edit their own records; this is very limited at the moment ● EU Screen is also interested in harmonising its complex workflow ● HOPE have too much email back-and-forth with partners to iron out issues in metadata mapping ● Apex want to save different mapping profiles for different types of content
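The wish on slide 7 to save different mapping profiles for different types of content can be illustrated with a minimal sketch. It is not how MINT, DNET or any aggregator's tooling works; the profile names, the source field names (`titolo`, `regista`, and so on) and the function `apply_profile` are all hypothetical, and only the target names (`dc:title`, `dc:creator`, `dcterms:created`) are real Dublin Core properties of the kind used in EDM.

```python
# Hypothetical saved mapping profiles: one per content type.
# Keys are provider-specific source fields (invented for illustration);
# values are the target properties they should be mapped to.
PROFILES = {
    "film": {
        "titolo": "dc:title",
        "regista": "dc:creator",
        "anno": "dcterms:created",
    },
    "photo": {
        "title": "dc:title",
        "photographer": "dc:creator",
    },
}


def apply_profile(record, profile):
    """Map one source record using a saved profile.

    Returns the mapped record and the list of source fields the
    profile does not cover, so a metadata expert can review them.
    """
    mapped = {}
    unmapped = []
    for field, value in record.items():
        target = profile.get(field)
        if target is None:
            unmapped.append(field)
        else:
            # A target property may receive several values.
            mapped.setdefault(target, []).append(value)
    return mapped, unmapped


film_record = {"titolo": "Roma", "regista": "F. Fellini", "durata": "120"}
mapped, unmapped = apply_profile(film_record, PROFILES["film"])
print(mapped)
print("Fields needing review:", unmapped)
```

Reporting the unmapped fields explicitly is one way a tool could replace some of the email back-and-forth described above: the data provider sees at once which fields still need a mapping decision.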
  • 8. Tools for ingestion and mapping are not as reliable as they could be; also some worries over availability of data ● Fashion wants better availability of tools for mapping (e.g. at weekends and on holidays) ● EU Screen has identical issues with the software; EFG mentioned this as well ● TEL needs tools and access to data to be reliable and provide for redundancy; same for Poznan
  • 9. Ingestion and mapping tools have poor usability, and sometimes cannot be used without technical expertise ● Fashion like using MINT but want it to improve; they desire better interfaces and greater functionality (better editing of groups of metadata, for instance) ● EFG want better usability (and documentation) so individual data providers can use MINT directly ● A key concern of EU Screen as well; better usability would allow data providers closer access to data ● A key concern of Poznan too; time is wasted getting developers to do things that a good interface would allow metadata experts to do
  • 10. Aggregators need the ability to curate (ingest, map, enrich) their data at greater speed ● DDB expect to move from 130 data providers to 1000s. Currently can deal with 30,000 records an hour; will need more in the future ● TEL needs capacity to manage and enrich several million records ● EU Screen needs to deal with larger datasets, and needs quicker processing
  • 11. Aggregators need better mechanisms for managing identifiers and authority files ● DDB has massive problems with identifiers, because many data providers do not have persistent identifiers, and some systems seem to produce new identifiers with every new ingest. Some kind of mechanism is needed to help improve this ● EFG wants to work on clustering, but automation has not yet been successful; crowdsourcing may help with this? ● Fashion wishes to implement its own authority files for fashion designers. This needs to be reliable ● EFG also wants to develop authority files ● Cultura Italia wants to maintain and update its SKOSified thesaurus to allow for cross-domain enrichment
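To put the figures on slide 10 in perspective, the arithmetic can be sketched as below. The only number taken from the slides is DDB's current rate of 30,000 records an hour; the record volumes are illustrative assumptions standing in for "several million records", and `hours_to_process` is a hypothetical helper.

```python
def hours_to_process(record_count, records_per_hour):
    """Hours needed to process record_count records at a steady rate."""
    if records_per_hour <= 0:
        raise ValueError("records_per_hour must be positive")
    return record_count / records_per_hour


DDB_RATE = 30_000  # records per hour, as reported on the slide

# Illustrative volumes only; the slides do not give exact figures.
for volume in (1_000_000, 3_000_000, 10_000_000):
    hours = hours_to_process(volume, DDB_RATE)
    print(f"{volume:>12,} records: {hours:7.1f} hours ({hours / 24:.1f} days)")
```

At this rate, 3 million records would take 100 hours of uninterrupted processing, which suggests why aggregators expecting thousands of data providers ask for greater speed.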
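One possible mechanism for the identifier problem on slide 11 is to derive the aggregator-side identifier deterministically, so that a repeated ingest yields the same identifier instead of a new one. The sketch below uses name-based UUIDs from Python's standard `uuid` module. It is an assumption-laden illustration, not what DDB or Europeana Cloud do: the namespace URL, the provider code and the local keys are invented, and the approach only helps where a provider can supply some stable local key (such as an inventory number).

```python
import uuid

# Hypothetical namespace for one aggregator; the URL is a placeholder.
AGGREGATOR_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/aggregator")


def stable_identifier(provider_code, local_key):
    """Derive a repeatable identifier from a provider code and a
    provider-side key that does not change between ingests."""
    name = f"{provider_code}/{local_key}"
    return str(uuid.uuid5(AGGREGATOR_NAMESPACE, name))


first_ingest = stable_identifier("museum-a", "INV-0042")
second_ingest = stable_identifier("museum-a", "INV-0042")
other_record = stable_identifier("museum-a", "INV-0043")

print(first_ingest == second_ingest)  # same record, same identifier
print(first_ingest == other_record)   # different record, different identifier
```

Where a provider has no stable key at all, a derivation like this cannot help, and some form of lookup table or record matching would be needed instead.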
  • 12. Many see possibilities for enrichment, but do not have the tools for it ● Fashion wants to do semi-automatic enrichment based on content analysis (e.g. color extraction) or semantic enrichment on metadata ● EFG keen on geographical enrichment - do not have tools at the moment ● Apex need to improve their enrichment processes before data providers will accept data back ● TEL strategy focussed on enriching and aligning data; also need to harmonise what is done by TEL and what is done by Europeana ● Poznan mentioned Virtual Transcription Tool as a possible way of enriching data ● TEL also mentioned more complex tools and workflows to improve data and quality
  • 13. Many aggregators are either storing content or considering it as a long-term option ● TEL already store content via their hosting providers ● EU Screen make use of a third party for streaming TV content ● Fashion make use of Amazon ● HOPE have their own digital repository ● DDB are storing only previews, but are considering storing content. Demand from smaller institutions ● Likewise, DFG considering strategic possibilities between content ● Cultura Italia have considered doing storage and preservation. Not an urgent problem for any aggregator, but it is becoming strategically important
  • 14. Aggregators have different strategies for disseminating their data ● DDB allows others to use its API, e.g. Archive Portals Germany. But little re-use of data as dumps ● Fashion very keen to export to other sources (e.g. Tumblr, Pinterest) ● TEL strategy depends on third party re-use of its data; very keen to see this built into eCloud ● HOPE makes use of its digital repository for disseminating content ● Others focussed on aggregating for Europeana
  • 15. All aggregators encountered problems with the restriction of metadata to CC0. However, few rated it an urgent problem ● DDB - Museums do not provide some metadata because of CC0. Would welcome a broader approach. EFG and EU Screen said similar things ● TEL wishes to provide access to data for research use only ● Different concern for HOPE - for privacy and trust issues content must remain on their servers
  • 16. Aggregator | Who manages ingestion tools? | Where is data stored?
      HOPE | DNET (managed by ISTI, Pisa) | Own hosting
      EFG | DNET (managed by ISTI, Pisa) | ISTI
      EU Screen | MINT (managed by NTUA, Athens) | Third party (content), Athens (metadata)
      DDB | FIZ Karlsruhe | FIZ Karlsruhe
      Fashion | MINT (managed by NTUA, Athens) | Amazon (metadata and content), though processed in Athens
      TEL | Own tools | University of London Computer Centre
      Europeana | Own tools / MINT | Under negotiation?
      Poznan | Own tools | Own hosting
      Apex | Own tools | Third party
      Cultura Italia | Own tools | Third party
  • 17. An example of the EFG workflow: Cinecittà Luce S.p.A. provides original metadata to ISTI (the technical partners of European Film Gateway), who then host a copy of the original metadata. Meanwhile, Cinecittà Luce S.p.A. provides a conceptual mapping to European Film Gateway, and the two of them agree on the mapping. EFG then send the conceptual mapping to ISTI, who convert it from a concept into XSLT using their specific tools. This allows the original metadata to be converted into EDM. ISTI then tell EFG that the converted data is ready, and EFG tell Cinecittà Luce. EFG and Cinecittà Luce look at the test version of the EDM and see whether changes need to be made. If so, EFG get in touch with ISTI again with an updated conceptual mapping, which ISTI convert to XSLT with their specific tool to create an updated EDM version. EFG can then confirm with Cinecittà Luce that the updated data is correct. The data can then be forwarded to the Europeana test portal, and Europeana asks EFG if the appearance in the Europeana model is correct, and … And of course, if anyone gets ill, or the tools don't quite work well, or a new dataset with a new data structure appears, then the process is slower
  • 18. The answers are now being used to inform the actual nature of the Europeana Cloud service
  • 19. Timeline for Europeana Cloud ● 2014: Ongoing project (until 2016) with 3 aggregators (TEL, Europeana, Poznan) building a shared storage system and services ● 2015 and onwards: Ongoing work to connect tools and services to the Cloud ● 2016: eCloud open to other aggregators to join ● 2018 (?): eCloud open for data providers to join