Enabling Language Resources to Expose Translations as Linked Data on the Web
 


Language resources, such as multilingual lexica and multilingual electronic dictionaries, contain collections of lexical entries in several languages. Access to the explicit or implicit translation relations between such entries can be of great interest for many NLP-based applications. With Semantic Web techniques, translations can be made available on the Web and consumed directly by other (semantically enabled) resources, without relying on application-specific formats. To that end, in this paper we propose a model for representing translations as linked data, as an extension of the lemon model. Our translation module represents some core information associated with term translations and does not commit to specific views or translation theories. As a proof of concept, we have extracted the translations of the terms contained in Terminesp, a multilingual terminological database, and represented them as linked data. We have made them accessible on the Web both for humans (via a Web interface) and for software agents (via a SPARQL endpoint).

    Presentation Transcript

    • Enabling Language Resources to Expose Translations as Linked Data on the Web. Jorge Gracia, Elena Montiel-Ponsoda, Daniel Vila-Suero, Guadalupe Aguado-de-Cea. Ontology Engineering Group (OEG), Universidad Politécnica de Madrid (UPM). jgracia@fi.upm.es. Acknowledgments: LIDER and BabeLData projects. 9th Language Resources and Evaluation Conference (LREC 2014), Reykjavik (Iceland), 28/05/2014
    • Outline: • Motivation • The translation model • Terminesp: a validating example • Conclusions
    • Motivation and goals
    • Motivation: Current multilingual lexica and electronic dictionaries • Proprietary formats • Non-standard APIs • Disconnected from other resources
    • Motivation. Goal: to allow language resources to expose translations as Linked Data on the Web, so that semantically enabled applications can consume them directly, without relying on application-specific formats
    • Motivation. Objectives: • To define a model for representing translations in RDF • As a proof of concept: 1. Extract translations from the Terminesp terminological database 2. Represent them in RDF with our model 3. Make them accessible both for human and machine consumption
    • The translation model
    • The translation model: Translation (direct equivalent). [Diagram: the LexicalEntry “medio de pago” in the ES lexicon and the LexicalEntry “payment method” in the EN lexicon each have a LexicalSense; both senses refer to the ontology entity http://purl.org/goodrelations/v1#PaymentMethods, and a Translation links the two senses.]
    • The translation model: Translation (cultural equivalence). [Diagram: the LexicalSense of “Prime Minister” (EN lexicon) refers to the ontology entity http://dbpedia.org/ontology/PrimeMinister, while the LexicalSense of “Presidente del Gobierno” (ES lexicon) refers to http://es.dbpedia.org/resource/Presidente_del_Gobierno; a Translation links the two senses.]
    • The translation model. Characteristics of the model: • Translation as a relation between senses • Translation relation reified → additional information can be attached to it • Support for a variety of translation categories • Translation categories clearly separated from the model → no commitment to specific views or translation theories • Translation sets group translations that, for instance, come from the same language resource or belong to the same organization • Re-use of well-established vocabularies (DC, DCAT, etc.) for provenance and additional information
    • The translation model. [Diagram: Translation Module (http://purl.org/net/translation.owl) with classes Translation and TranslationSet, related classes LexicalSense and Resource, and properties tran, translationSource, translationTarget, translationCategory, context, and translationConfidence: double. Translation Categories (http://purl.org/net/translation-categories): directEquivalent, culturalEquivalent, lexicalEquivalent.]
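
As a rough illustration of why reifying the translation relation is useful, the sketch below treats a translation as an object in its own right that links two senses, so that a category and a confidence score can be attached to it. The field names mirror the property names on the slide (translationSource, translationTarget, translationCategory, translationConfidence). The Python classes, the sense identifiers, and the 0.9 confidence value are assumptions made for illustration and are not part of the published vocabulary.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Translation:
    """A reified translation: a node of its own that links two lexical senses."""
    source_sense: str                    # corresponds to translationSource
    target_sense: str                    # corresponds to translationTarget
    category: Optional[str] = None       # corresponds to translationCategory
    confidence: Optional[float] = None   # corresponds to translationConfidence


@dataclass
class TranslationSet:
    """Groups translations, e.g. those coming from the same language resource."""
    name: str
    translations: List[Translation]

    def by_category(self, category: str) -> List[Translation]:
        """Return the translations tagged with the given category."""
        return [t for t in self.translations if t.category == category]


# Hypothetical sense identifiers, loosely based on the two example slides.
es_en = TranslationSet(
    name="example-es-en",
    translations=[
        Translation("medio_de_pago-sense", "payment_method-sense",
                    category="directEquivalent", confidence=0.9),
        Translation("presidente_del_gobierno-sense", "prime_minister-sense",
                    category="culturalEquivalent"),
    ],
)

cultural = es_en.by_category("culturalEquivalent")
print([t.target_sense for t in cultural])
```

Because the category is just a value attached to the translation node, a different category scheme could be swapped in without touching the `Translation` class, which is the separation the model aims for.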
    • Terminesp, a validating example
    • Terminesp, a validating example. TERMINESP: • Multilingual terminological database • Terms and definitions from Spanish technological standards • More than 30K terms in Spanish, with translations into English, German, French, Italian, …
    • Terminesp, a validating example. [Diagram (legend: class / instance): the lexicons terminesp:lexiconES and terminesp:lexiconEN (lemon:Lexicon) contain, via lemon:entry, the entries terminesp:38756es and terminesp:38756en (lemon:LexicalEntry). Each entry has a lemon:form (lemon:LexicalForm) whose lemon:writtenRep is “red”@es and “network”@en respectively, and a lemon:sense: terminesp:38756es-sense and terminesp:38756en-sense (lemon:LexicalSense). Both senses point via lemon:reference to terminesp:38756 (skos:Concept). The translation terminesp:38756es-en-TR (tr:Translation) links the two senses through tr:translationSource and tr:translationTarget.]
    • Terminesp, a validating example. [Diagram (legend: class / instance): the translation set terminesp:es-en-transet (tr:TranslationSet) is linked via tr:tran to terminesp:38756es-en-TR (tr:Translation), which has tr:translationCategory trcat:directEquivalent and connects terminesp:38756es-sense and terminesp:38756en-sense (lemon:LexicalSense) through tr:translationSource and tr:translationTarget.]
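
The instance data on the two slides above can be written out as plain triples. The following is a minimal sketch that emits N-Triples using only the Python standard library. The Terminesp base URI is inferred from the lexicon URIs shown on the SPARQL slide, and the Spanish sense is taken as the translation source because of the `es-en` naming. The `tr:` and `trcat:` namespace URIs (with a trailing `#`) are assumptions and should be checked against the published vocabularies.

```python
LEMON = "http://lemon-model.net/lemon#"
TERMINESP = "http://linguistic.linkeddata.es/data/terminesp/"  # inferred base URI
TR = "http://purl.org/net/translation#"                # assumed namespace URI
TRCAT = "http://purl.org/net/translation-categories#"  # assumed namespace URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def translation_triples(term_id, src_lang, tgt_lang, category):
    """Return (subject, predicate, object) URI triples for one reified translation."""
    tr_node = f"{TERMINESP}{term_id}{src_lang}-{tgt_lang}-TR"
    src_sense = f"{TERMINESP}{term_id}{src_lang}-sense"
    tgt_sense = f"{TERMINESP}{term_id}{tgt_lang}-sense"
    concept = f"{TERMINESP}{term_id}"
    return [
        (tr_node, RDF_TYPE, TR + "Translation"),
        (tr_node, TR + "translationSource", src_sense),
        (tr_node, TR + "translationTarget", tgt_sense),
        (tr_node, TR + "translationCategory", TRCAT + category),
        # Both senses refer to the same concept, as in the diagram.
        (src_sense, LEMON + "reference", concept),
        (tgt_sense, LEMON + "reference", concept),
    ]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = translation_triples("38756", "es", "en", "directEquivalent")
print(to_ntriples(triples))
```

The sketch covers only URI-valued objects; literals such as “red”@es would need their own serialization rule.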
    • Terminesp, a validating example. Before: • MS Access database and a Web search interface • Non-standard formats and vocabularies • Data “invisible” to software agents • Translations implicit, not explicit
    • Terminesp, a validating example. Now: • Published on the Web as Linked Data • Modelled using lemon and well-established vocabularies • Dereferenceable URIs • Data “visible” to software agents • Translations made explicit • Web search interface for human consumption • SPARQL endpoint for machine consumption
    • Terminesp, a validating example. Terminesp for machine consumption – SPARQL endpoint: http://linguistic.linkeddata.es/terminesp/sparql-editor/
    • Terminesp, a validating example. Terminesp for machine consumption – SPARQL endpoint: http://linguistic.linkeddata.es/terminesp/sparql-editor/ Query results (written representation target | lexicon target): network | http://linguistic.linkeddata.es/data/terminesp/lexiconEN ; Netzwerk (in der Netzwerktopologie) | http://linguistic.linkeddata.es/data/terminesp/lexiconDE
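
For machine consumption, a client could ask the endpoint for the written representations that translate a given term, together with the lexicon each one belongs to. The sketch below only builds the query text and a GET request URL; it does not contact any server. The property names follow the diagram on the earlier slide (lemon:form, lemon:writtenRep, lemon:sense, lemon:entry, tr:translationSource, tr:translationTarget) and may differ from what the deployed data uses. The `tr:` namespace URI is an assumption, and the endpoint address in the usage lines is a placeholder, since the slide gives the address of the query editor rather than of the endpoint itself.

```python
from string import Template
from urllib.parse import urlencode

# SPARQL variables start with "?", so "$" is free to mark template fields.
QUERY_TEMPLATE = Template("""\
PREFIX lemon: <http://lemon-model.net/lemon#>
PREFIX tr: <http://purl.org/net/translation#>
SELECT DISTINCT ?writtenRepTarget ?lexiconTarget WHERE {
  ?formSource lemon:writtenRep "$term"@$lang .
  ?entrySource lemon:form ?formSource ;
               lemon:sense ?senseSource .
  ?translation tr:translationSource ?senseSource ;
               tr:translationTarget ?senseTarget .
  ?entryTarget lemon:sense ?senseTarget ;
               lemon:form ?formTarget .
  ?formTarget lemon:writtenRep ?writtenRepTarget .
  ?lexiconTarget lemon:entry ?entryTarget .
}
""")


def build_query(term, lang):
    """Fill in the source term, escaping backslashes and double quotes."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return QUERY_TEMPLATE.substitute(term=escaped, lang=lang)


def build_request_url(endpoint, query):
    """Build a GET URL that passes the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


query = build_query("red", "es")
# Placeholder endpoint address (assumption), used only to show the URL shape.
url = build_request_url("http://example.org/sparql", query)
print(query)
print(url)
```

Keeping query construction separate from the HTTP call makes the query easy to inspect or paste into the Web-based query editor before sending it programmatically.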
    • Terminesp, a validating example. Terminesp for human consumption – Web interface: http://linguistic.linkeddata.es/terminesp/search/
    • Conclusions
    • Conclusions. Our proposal: • Model to represent translations as Linked Data on the Web • Terminesp as a validating example. Next steps: • Standardization through the W3C Ontolex Community Group • Study possible reuse of ITS 2.0 elements • Links from Terminesp to external resources (e.g., BabelNet)
    • Thanks for your attention!