Europeana in a research context
Alastair Dunning, @alastairdunning
The European Library / Europeana Foundation
Mining Digital Repositories Conference
National Library of the Netherlands, April 2014
Oxyrhynchus Papyrus No. 20
Fragment of Homer's The Iliad, 2nd century Common Era
British Library, London
Image via Wikimedia Commons
Metadata record for Oxyrhynchus Papyrus No. 20 (numbered 742) from British Library, London
Metadata record for Oxyrhynchus Papyrus No. 20 from University of Oxford
Tool for transcribing Oxyrhynchus Papyrus from Ancient Lives project
Tool for measuring Oxyrhynchus Papyrus from Ancient Lives project
Discussion Forum: Oxyrhynchus Papyri from Ancient Lives project
Images
• Hi-res TIFF – British Library
• JPEG – University of Oxford
Full-text
• Multiple transcriptions – Ancient Lives Project
Metadata
• Version 1 – British Library
• Version 2 – University of Oxford
• Version 3 – Ancient Lives
Unstructured Commentary
• Ancient Lives Project
The richness of this
information is created
by many parties, but it
sits in different places,
took many projects to
make happen and still is
not fully connected.
Europeana as a data
brain, helping connect
disparate datasets
Rather than Europeana
as end-user facing portal
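As a rough illustration of what "connecting disparate datasets" could mean for the papyrus example, the sketch below groups records from different providers that appear to describe the same object. The `Record` fields, the `object_key` normalisation and the sample values are all hypothetical and are not Europeana's data model; real matching would need far more robust identifiers.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    provider: str  # who published this record
    kind: str      # e.g. "image", "metadata", "transcription"
    title: str     # title as given by the provider


def object_key(title: str) -> str:
    """Hypothetical normalisation: lower-case, keep only letters and digits."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def link_records(records):
    """Group records whose normalised titles are identical."""
    linked = defaultdict(list)
    for record in records:
        linked[object_key(record.title)].append(record)
    return dict(linked)


# Illustrative sample data, not real catalogue entries.
records = [
    Record("British Library", "image", "Oxyrhynchus Papyrus No. 20"),
    Record("University of Oxford", "metadata", "Oxyrhynchus papyrus no 20"),
    Record("Ancient Lives", "transcription", "Oxyrhynchus Papyrus No. 20"),
]

linked = link_records(records)
for key, group in linked.items():
    print(key, sorted(r.provider for r in group))
```

Matching on a normalised title is only a stand-in for whatever shared identifier the providers can agree on; the point is that the image, the metadata versions and the transcriptions end up under one key.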
2005 A letter to the European Commission from 6 Heads of State (from
France, Poland, Germany, Italy, Spain and Hungary) suggests the creation of
a European digital library.
2007 The European Digital Library Network - EDLnet - begins to create a
prototype, funded by i2010.
2008 Europeana's prototype is launched on 20 November.
2009 Europeana's collection reaches 5 million items.
2010 A European Parliament report approved in February asks for more
content and funding for Europeana. It is unanimously approved.
2012 Europeana releases all metadata under a CC0 waiver, making it freely
available for re-use. Europeana’s collection reaches 25 million items.
2013 Europeana continues to further its position as a catalyst for innovation
and digital enterprise in support of the Digital Agenda of Europe - one of the
pillars of the EU’s Europe 2020 strategy
How does Europeana get its content?
• Through its aggregation structure, Europeana represents 2,300 organisations across Europe
• From 150 Aggregators
    • Promoting national aggregation structures
    • More efficient than working with every individual content provider
    • Helps to achieve international standardisation
• End-user generated content
    • Crowd-sourcing projects such as Europeana 1914-1918 and Europeana 1989
Who submits data to Europeana?
Domain Aggregators
• Audiovisual collections, e.g. EUScreen, European Film Gateway
• Archives, e.g. APEX
• Thematic collections, e.g. Judaica Europeana, Europeana Fashion
• Libraries, e.g. The European Library
National initiatives
• National Aggregators, e.g. Culture Grid, Culture.fr
• Regional Aggregators, e.g. Musées Lausannois
The evolution is to Europeana Cloud: an
infrastructure for aggregators and data
providers.
This would allow members
of Europeana Cloud to:
1. Upload metadata
2. Define who can use that
metadata and in what ways
(download, annotate, delete)
3. Give third parties access
via APIs
4. Share content (also feasible)
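The capabilities listed above can be sketched as a small in-memory model: a provider uploads a record, grants named parties specific actions, and a third party then works with the record through the store. This is only a sketch under stated assumptions: `CloudStore`, `MetadataRecord` and their methods are hypothetical names for illustration and do not describe the actual Europeana Cloud API.

```python
from dataclasses import dataclass, field

ACTIONS = {"download", "annotate", "delete"}  # the three uses named on the slide


@dataclass
class MetadataRecord:
    record_id: str
    owner: str
    body: dict
    grants: dict = field(default_factory=dict)  # party -> set of allowed actions

    def grant(self, party, *actions):
        unknown = set(actions) - ACTIONS
        if unknown:
            raise ValueError(f"unknown actions: {sorted(unknown)}")
        self.grants.setdefault(party, set()).update(actions)

    def allows(self, party, action):
        # Assumption: the owner may always act on its own record.
        return party == self.owner or action in self.grants.get(party, set())


class CloudStore:
    """Hypothetical stand-in for a shared metadata store."""

    def __init__(self):
        self._records = {}

    def upload(self, record):
        self._records[record.record_id] = record

    def _checked(self, party, record_id, action):
        record = self._records[record_id]
        if not record.allows(party, action):
            raise PermissionError(f"{party} may not {action} {record_id}")
        return record

    def download(self, party, record_id):
        # Return a shallow copy so callers cannot replace stored fields.
        return dict(self._checked(party, record_id, "download").body)

    def annotate(self, party, record_id, note):
        record = self._checked(party, record_id, "annotate")
        record.body.setdefault("annotations", []).append({"by": party, "note": note})

    def delete(self, party, record_id):
        self._checked(party, record_id, "delete")
        del self._records[record_id]


store = CloudStore()
record = MetadataRecord("papyrus-20", "provider-a", {"title": "Oxyrhynchus Papyrus No. 20"})
record.grant("research-tool", "download", "annotate")
store.upload(record)

store.annotate("research-tool", "papyrus-20", "Fragment of Homer's Iliad")
copy = store.download("research-tool", "papyrus-20")
```

In this sketch `research-tool` can read and annotate the record but a call to `store.delete("research-tool", "papyrus-20")` raises `PermissionError`, since the provider never granted that action.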
Development of Europeana
as platform, not portal
Cloud
infrastructure sits
at heart of this
“Portals are for
visiting, platforms
are for building on”
Europeana Labs as a
platform for the creative
industries
Europeana Research as a
platform for humanities,
social sciences
Europeana Research will not be a
single discovery portal; however,
it will offer researchers access to
APIs and to downloadable 'raw data'
stored in Europeana Cloud
Third parties can build their own
specific tools using these APIs or
downloadable data
Europeana Research will give access
to data that can only be used in a
non-commercial context
Europeana Research will have open
APIs to allow bi-directional access
(read, write) to metadata in
Europeana Cloud (dependent on
permissions)
Pilot Study 1:
Tool to search
through
Europeana (and
other content)
related to
philosophy of
logic
http://greenlearningnetwork.com/axiom/
Pilot Study 2:
Musicologists'
tool to annotate
early music
manuscripts from
disparate sources
(Work in Progress)
Other Possibilities
• Service and end-user tool to allow for transcription of multiple documents aggregated from multiple sources
• Service to allow for extraction of geographic or other terms from aggregation of services
• Aggregation of text documents for download for text mining
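To make the term-extraction possibility concrete, here is a minimal sketch that counts place names across documents aggregated from several sources. It assumes a simple gazetteer lookup; the `GAZETTEER` entries, the source names and the sample sentences are invented for illustration, and a real service would more likely rely on a large authority file or a named-entity recogniser.

```python
import re
from collections import Counter

# Hypothetical gazetteer; a real service would use a much larger authority file.
GAZETTEER = {"oxyrhynchus", "london", "oxford", "the hague"}


def extract_places(text, gazetteer=GAZETTEER):
    """Count gazetteer entries that occur as whole words or phrases in the text."""
    counts = Counter()
    lowered = text.lower()
    for place in gazetteer:
        pattern = r"\b" + re.escape(place) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts[place] += hits
    return counts


def extract_from_sources(documents_by_source):
    """Aggregate place counts over documents coming from several sources."""
    total = Counter()
    for documents in documents_by_source.values():
        for doc in documents:
            total.update(extract_places(doc))
    return total


# Illustrative input: two sources, one short document each.
documents_by_source = {
    "library_a": ["The papyrus was found at Oxyrhynchus and is now in London."],
    "library_b": ["Oxford holds many papyri from Oxyrhynchus."],
}

places = extract_from_sources(documents_by_source)
print(places.most_common())
```

The same loop works for any controlled vocabulary, which is why the slide speaks of "geographic or other terms".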
Text Mining Opportunities
• Aggregation of corpora of primary sources, with harmonized licensing
• Versioning corpora of primary sources
• Enrichment of corpora via third-party tools
• Brokerage of in-copyright material for non-commercial usage? (Primary and secondary sources)
• Ability to upload algorithms / software?
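One way to think about versioning a corpus is to derive a version identifier from its contents, so that a text-mining result can cite exactly the corpus it was run on. The sketch below hashes document identifiers and texts with SHA-256; the scheme, the function names and the sample documents are assumptions for illustration, not something Europeana has specified.

```python
import hashlib


def document_digest(text: str) -> str:
    """SHA-256 digest of one document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def corpus_version(documents: dict) -> str:
    """Derive a short version ID from document IDs and contents.

    Documents are processed in sorted ID order, so the result does not
    depend on the order in which they were added.
    """
    hasher = hashlib.sha256()
    for doc_id in sorted(documents):
        hasher.update(doc_id.encode("utf-8"))
        hasher.update(b"\0")
        hasher.update(document_digest(documents[doc_id]).encode("ascii"))
        hasher.update(b"\n")
    return hasher.hexdigest()[:12]


# Hypothetical corpus: the second version corrects one transcription.
v1_docs = {"papyrus-20": "first transcription", "papyrus-21": "another text"}
v2_docs = {**v1_docs, "papyrus-20": "corrected transcription"}

print("v1:", corpus_version(v1_docs))
print("v2:", corpus_version(v2_docs))
```

Any change to a transcription yields a different identifier, while re-ordering the same documents does not.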
Cons
• Lack of maturity in research community in building APIs
• Time taken for tool development
• Quality and extent of underlying data still essential
• Still needs engagement with research communities / tool builders
Pros
• Europeana leverages its aggregation network to provide a single access point to data
• Tools can be built to help answer researchers' specific questions
• Responsibility for sustainability and outreach is distributed
• Works very well for time-limited projects
• Europeana licensing framework provides standards for access to data
Thank you
Alastair Dunning, @alastairdunning
