Multilingual challenges in Europeana

Presentation at the H2020-CEF Infoday, 16 January 2014 http://ec.europa.eu/digital-agenda/en/news/information-and-networking-days-h2020-work-programme-2014-2015-connecting-europe-facility

Published in Technology
  • Les Miserables: Victor Hugo’s handwritten manuscripts: http://www.europeana.eu/portal/record/9200103/5372912AF66AB529E188218BC1F747E75EB1A18F.html
    BnF, public domain
    Matisse ‘53 in the form of a double helix’ http://www.europeana.eu/portal/record/9200104/F8D60AB9136C8A59B59DF1CFEC278A6CABA8B0C6.html
    The Wellcome Library (CC-BY-NC-ND)
    ‘söprűtánc’ – Hungarian traditional dance http://www.europeana.eu/portal/record/08901/E1A7B01BE4AED87FD239672F4F3941F52262D6B2.html
    Hungarian Academy of Sciences Institute for Musicology, public domain
    ‘Neurologico reggae’ Music album http://www.europeana.eu/portal/record/08901/ADC241BCBF8470988DBA6EEAFCF13F14D88E5534.html
    DISMARC – EuropeanaConnect Paid Access
    ‘Castle of Kavala’ 3D exploration of a Greek castle http://www.europeana.eu/portal/record/2020703/05607B24D15BD516EE2B765F74CDA39C7427F7FB.html
    Cultural and Educational Technology Institute - Research Centre Athena, CARARE, CC-BY-NC-ND
  • All partners send us descriptions of their assets, which we aggregate in a single service
  • Germany 15.44%
    France 10.97%
    Netherlands 9.67%
    Sweden 9.44%
    Spain 9.98%
    UK 6.98%
    Norway 6.60%
    Italy 5.4%
    Ireland 4.04%
    Poland 4.02%
    Europe 3.95%
    Finland 2.95%
    Austria 2.05%
    Belgium 1.61%
    Hungary 1.26%
  • http://www.clef-initiative.eu/documents/71612/86374/CLEF2010wn-LogCLEF-StillerEt2010.pdf
  • Users from everywhere
    Data from everywhere
    Tools from everywhere
    http://europeana.eu/portal/record/2022347/B7C7D15C23C28EFD3FA25147ED3A580757CFBB04.html
    http://europeana.eu/portal/record/9200103/ark__12148_btv1b6921004c.html

Transcript

  • 1. Antoine Isaac
    Information and networking days H2020 / Connecting Europe Facility, Jan 15-16, 2014
  • 2. Europe’s platform to access cultural heritage
    Currently 30M objects
  • 3. Built on descriptive metadata from a broad, heterogeneous network
    Audiovisual collections, National Aggregators, Regional Aggregators, Archives, Thematic collections, Libraries
    Musées Lausannois, Culture.fr, The European Library, APEX, European Film Gateway, Europeana Fashion
    2,300 galleries, museums, archives and libraries
  • 4. Accessing items from 36 countries
    top 16
    Portal interface in 31 languages
    Metadata in 33 languages
  • 5. Serving Europe’s citizens
    5M visits on Europeana.eu
    7M Facebook impressions
    API use…
  • 6. Facilitating re-use on the legal side
    Content (digital objects on the site of the provider)
    Metadata (descriptive object information)
    Public Domain, Creative Commons Licenses, Rights reserved, Orphan work
    CC
  • 7. Facilitating re-use on the language side?
    Our network needs automatic translation tools to address information needs all over Europe
  • 8. Gathering/linking existing multilingual data
  • 9. Related projects applying NLP tools
    E.g., the PATHS project has developed techniques to enrich English and Spanish collections:
    1) Identification of key entities
    2) Detection of (typed) similarities between objects, using metadata
    3) “Background links” to external resources such as Wikipedia
    4) Classification of objects against a hierarchy of topics
    Applying these techniques to other languages would require work:
    1) requires language-specific tools (PoS tagging, lemmatization)
    2) is straightforward to apply to new languages
    3) requires language-specific tools
    4) depends on (3) and on translation of some topics
    http://www.paths-project.eu/eng/Resources/Semantic-Enrichment-of-Cultural-Heritage-content-in-PATHS
  • 10. Language challenges for Digital Libraries
    Typical queries are very short: average < 2 terms
    Identification of query language is not easy, even manually: 39% of queries may belong to several languages
    Plenty of named entities: 60% of queries are for persons & places
    Not only is it hard for queries: the same issues apply to the descriptive metadata
    Studies by Humboldt University on Europeana and The European Library
    http://www.clef-initiative.eu/documents/71612/86374/CLEF2010wn-LogCLEF-StillerEt2010.pdf
  • 11. Language processing issues at the scale of Europe
  • 12. Thank you! Antoine Isaac antoine.isaac@europeana.eu @EuropeanaEU
  • 13. Europeana’s vision and mission
    We believe in making cultural heritage openly accessible in a digital way, to promote the exchange of ideas and information
    We want to be a catalyst for change in the world of cultural heritage
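
Illustration for slide 10: a minimal sketch of why very short queries are hard to assign to one language. It is not taken from the presentation or the cited study. The word lists in TOY_LEXICONS and the function names are hypothetical and exist only for this example. A real system would need far larger lexicons or a trained language identifier.

```python
# Toy, hand-made word lists (an assumption for illustration only).
TOY_LEXICONS = {
    "en": {"art", "portrait", "paris", "map", "london"},
    "fr": {"art", "portrait", "paris", "carte", "londres"},
    "de": {"kunst", "portrait", "paris", "karte", "london"},
}


def candidate_languages(query):
    """Return every language whose toy lexicon contains all query terms."""
    terms = query.lower().split()
    if not terms:
        return set()
    return {
        lang
        for lang, words in TOY_LEXICONS.items()
        if all(term in words for term in terms)
    }


def is_ambiguous(query):
    """A query is ambiguous when more than one language could explain it."""
    return len(candidate_languages(query)) > 1


if __name__ == "__main__":
    for q in ["portrait paris", "carte", "london"]:
        print(q, sorted(candidate_languages(q)), is_ambiguous(q))
```

A one- or two-term query made of named entities ("paris", "london") matches several lexicons at once, which is the situation the slide describes for persons and places.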