Europeana and multi-lingual access, challenges and possibilities


Slides for a presentation at the FLaReNet Forum, where I presented Europeana's work on multi-lingual access.

Published in: Education
  • Europeana and multi-lingual access: challenges and possibilities. First, an intro to Europeana: what we are and what we are not. The aim is to identify the major challenges facing Europeana concerning multi-lingual access and to sketch possible solutions. Challenge: ontologies and multi-lingual labelling of metadata. Challenge: query translation. Challenge: results translation of metadata. Challenge: localisation of the Europeana portal. Painter: Lucas van Valckenborch. File:
  • Challenge: Ontologies. Currently we use multi-lingual ontologies to create multi-lingual labels and index them for search. This is probably our main route forward. However, it is difficult to find ontologies and authority files that cover all Europeana languages (the EU 27 languages). Operational ontologies in Europeana: DBpedia, GEMET, GeoNames. Other ontologies we're looking at: VIAF, LCSH. We prefer openly licensed resources. We prefer resources modelled in SKOS.
  • Challenge: Query translation. Under development. Main efforts are part of EuropeanaConnect Work Packages 1 and 2. The basis is language identification. Named entity recognition. Licensed resources: XEROX, CELI. Open resources: language resources registry, inventory of vocabularies and language resources. Google and Bing Translation APIs: very good at translating to and from English. Evaluation of proprietary vs. open vs. Google/Bing (morphological/dictionaries).
  • Challenge: Results translation. Already in production in the Europeana portal. Are commercial APIs the only practical option? They cover numerous languages and have easy-to-use, well-documented APIs. Problem: they can be shut down, as with Google Translate, which will be shut down in December 2011. Are there open and free alternatives? Crowdsourced translation is something we're considering. However, even if successful, it will barely dent our c. 20 million metadata records!
  • Challenge: Localisation. Currently we use our own network of Europeana partner institutions and have volunteers there. Problem with scale! Solution? Larger translation communities, e.g. TranslateWiki.
  • Any questions? This poster by an unknown artist is courtesy of the Municipal Library of Lyon. The work is in the public domain. Slides 2-5 are taken from the Europeana Strategic Plan.
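The ontologies note above describes using multi-lingual ontology labels to enrich metadata and indexing those labels for search. A minimal sketch of that idea follows, assuming SKOS-style concepts reduced to a mapping from language tag to preferred label. The concept URIs, labels, record identifiers, and function names are all hypothetical and are not taken from Europeana's implementation.

```python
from collections import defaultdict

# Hypothetical SKOS-style concepts: concept URI -> {language tag: preferred label}.
CONCEPTS = {
    "http://example.org/concept/painting": {
        "en": "painting", "de": "Malerei", "fr": "peinture", "sv": "måleri",
    },
    "http://example.org/concept/poster": {
        "en": "poster", "de": "Plakat", "fr": "affiche", "sv": "affisch",
    },
}

# Hypothetical metadata records, each already enriched with concept URIs.
RECORDS = {
    "record-1": ["http://example.org/concept/painting"],
    "record-2": ["http://example.org/concept/poster"],
}


def build_label_index(concepts, records):
    """Map every label, in every language, to the records linked to its concept."""
    index = defaultdict(set)
    for record_id, uris in records.items():
        for uri in uris:
            # Unknown URIs contribute no labels instead of raising an error.
            for label in concepts.get(uri, {}).values():
                index[label.casefold()].add(record_id)
    return index


def search(index, query):
    """Return the sorted record ids whose concept labels match the query exactly."""
    return sorted(index.get(query.casefold(), set()))


index = build_label_index(CONCEPTS, RECORDS)
print(search(index, "Malerei"))   # German query
print(search(index, "affiche"))   # French query
```

Because each record is indexed under the labels of its concepts in every available language, a query in German or French can find a record whose original metadata was written in another language. The coverage problem raised in the note shows up here directly: a language missing from a concept's labels simply has no index entry.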

    1. Europeana and multi-lingual access: challenges and possibilities. FLaReNet Forum 2011. David Haskiya, Product Developer & Project Coordinator, Europeana
    2. Semantic enrichment and multi-lingual labelling (Ontologies)
         • Semantic enrichment
         • Multi-lingual labels
         • Persons, places, periods, subjects
         • GEMET, DBpedia, GeoNames, etc.
    3. Query translation: language resources (open and licensed) and commercial APIs
         • Licensed resources, e.g. XEROX or CELI
         • Open resources, e.g. WordNet
         • Free but commercial APIs, e.g. Google and Bing
         • Weighing strengths and weaknesses
    4. Results translation: are commercial APIs the only realistic solution?
         • Commercial translation APIs
             • Google & Bing (Microsoft)
         • Crowdsourced metadata translations
             • But even if successful, we have 20 million records…
         • Or nothing? Are there open alternatives that are easy to develop against?
    5. Localisation: how do we translate the user interface and editorial texts into 27 languages?
         • Crowdsourced translations
             • The Europeana network
             • TranslateWiki
         • Solutions for translation of editorial text?
             • TranslateWiki?
             • Duolingo?
    6. That was it! Questions?