Enabling Language Resources to Expose Translations as Linked Data on the Web

Language resources, such as multilingual lexica and multilingual electronic dictionaries, contain collections of lexical entries in several languages. Access to the explicit or implicit translation relations between such entries is of great interest for many NLP-based applications. With Semantic Web techniques, translations can be made available on the Web and consumed directly by other (semantically enabled) resources, without relying on application-specific formats. To that end, in this paper we propose a model for representing translations as linked data, as an extension of the lemon model. Our translation module represents some core information associated with term translations and does not commit to specific views or translation theories. As a proof of concept, we have extracted the translations of the terms contained in Terminesp, a multilingual terminological database, and represented them as linked data. We have made them accessible on the Web both for humans (via a Web interface) and for software agents (via a SPARQL endpoint).

  1. Enabling Language Resources to Expose Translations as Linked Data on the Web. Jorge Gracia, Elena Montiel-Ponsoda, Daniel Vila-Suero, Guadalupe Aguado-de-Cea. Ontology Engineering Group (OEG), Universidad Politécnica de Madrid (UPM). jgracia@fi.upm.es. Acknowledgments: LIDER and BabeLData projects. 9th Language Resources and Evaluation Conference (LREC 2014), Reykjavik (Iceland), 28/05/2014.
  2. Outline
     • Motivation
     • The translation model
     • Terminesp: a validating example
     • Conclusions
  3. Motivation and goals
  4. Motivation. Current multilingual lexica and electronic dictionaries:
     • Proprietary formats
     • Non-standard APIs
     • Disconnected from other resources
  5. Motivation. GOAL: to allow language resources to expose translations as Linked Data on the Web, so that semantically enabled applications can consume them directly, without relying on application-specific formats.
  6. Motivation. Objectives:
     • To define a model for representing translations in RDF
     • As a proof of concept:
       1. Extract translations from the Terminesp terminological database
       2. Represent them in RDF with our model
       3. Make them accessible for both human and machine consumption
  7. The translation model
  8. The translation model
  9. The translation model
  10. The translation model. Translation (direct equivalent). Diagram: the LexicalEntry “medio de pago” in the Spanish lexicon (LEXICON ES) and the LexicalEntry “payment method” in the English lexicon (LEXICON EN) each have a LexicalSense; the ontology entity shown is http://purl.org/goodrelations/v1#PaymentMethods.
  11. The translation model. Translation (cultural equivalence). Diagram: the LexicalEntry “Presidente del Gobierno” in the Spanish lexicon (LEXICON ES) and the LexicalEntry “Prime Minister” in the English lexicon (LEXICON EN) each have a LexicalSense; two ontology entities are shown, http://es.dbpedia.org/resource/Presidente_del_Gobierno and http://dbpedia.org/ontology/PrimeMinister.
  12. The translation model. Characteristics of the model:
     • Translation as a relation between senses
     • The translation relation is reified, so additional information can be attached to it
     • Support for a variety of translation categories
     • Translation categories are clearly separated from the model, so there is no commitment to specific views or translation theories
     • Translation sets group translations that, for instance, come from the same language resource or belong to the same organization
     • Re-use of well-established vocabularies (DC, DCAT, etc.) for provenance and additional information
  13. The translation model. Translation module (http://purl.org/net/translation.owl). Diagram elements:
     • Classes: LexicalSense, Translation, TranslationSet, Resource
     • Properties: tran, translationSource, translationTarget, translationCategory, context, translationConfidence (double)
     • Translation categories (http://purl.org/net/translation-categories): directEquivalent, culturalEquivalent, lexicalEquivalent
  14. Terminesp, a validating example
  15. Terminesp, a validating example. TERMINESP:
     • Multilingual terminological database
     • Terms and definitions from Spanish technological standards
     • More than 30K terms in Spanish, with translations into English, German, French, Italian, …
  16. Terminesp, a validating example. Diagram (legend: class, instance):
     • Instances and their classes: terminesp:lexiconES and terminesp:lexiconEN (lemon:Lexicon); terminesp:38756es and terminesp:38756en (lemon:LexicalEntry); terminesp:38756es-sense and terminesp:38756en-sense (lemon:LexicalSense); terminesp:38756 (skos:Concept); terminesp:38756es-en-TR (tr:Translation); two unnamed lemon:LexicalForm nodes
     • Written representations: “red”@es, “network”@en
     • Properties: lemon:entry, lemon:form, lemon:writtenRep, lemon:sense, lemon:reference, tr:translationSource, tr:translationTarget
  17. Terminesp, a validating example. Diagram (legend: class, instance):
     • Instances and their classes: terminesp:38756es-sense and terminesp:38756en-sense (lemon:LexicalSense); terminesp:es-en-transet (tr:TranslationSet); terminesp:38756es-en-TR (tr:Translation)
     • Translation category: trcat:directEquivalent
     • Properties: tr:tran, tr:translationCategory, tr:translationSource, tr:translationTarget
  18. Terminesp, a validating example. Before:
     • MS Access database and a Web search interface
     • Non-standard formats and vocabularies
     • Data “invisible” to software agents
     • Translations implicit, not explicit
  19. Terminesp, a validating example. Now:
     • Published on the Web as Linked Data
     • Modelled using lemon and well-established vocabularies
     • Dereferenceable URIs
     • Data “visible” to software agents
     • Translations were made explicit
     • Web search interface for human consumption
     • SPARQL endpoint for machine consumption
  20. Terminesp for machine consumption: SPARQL endpoint. http://linguistic.linkeddata.es/terminesp/sparql-editor/
  21. Terminesp for machine consumption: SPARQL endpoint. http://linguistic.linkeddata.es/terminesp/sparql-editor/ Query results shown (target written representation, target lexicon):
     • network, http://linguistic.linkeddata.es/data/terminesp/lexiconEN
     • Netzwerk (in der Netzwerktopologie), http://linguistic.linkeddata.es/data/terminesp/lexiconDE
  22. Terminesp for human consumption: Web interface. http://linguistic.linkeddata.es/terminesp/search/
  23. Conclusions
  24. Conclusions. Our proposal:
     • Model to represent translations as Linked Data on the Web
     • Terminesp as a validating example
     Next steps:
     • Standardization through the W3C Ontolex Community Group
     • Study possible reuse of ITS 2.0 elements
     • Links from Terminesp to external resources (e.g., BabelNet)
  25. Thanks for your attention!
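
The Terminesp example on slides 16 and 17 can be approximated in a few lines of Python. This is only a sketch, not the authors' implementation: triples are plain tuples rather than real RDF, prefixed names are kept as strings, and the two `-form` identifiers are hypothetical names, since the slides leave the lemon:LexicalForm nodes unnamed. The direction of each property is read off the diagram labels and is an assumption. The point it illustrates is that a translation is a reified node linking two senses, so finding a translation means walking from a written form to its sense, across the tr:Translation node, and back to a written form.

```python
from typing import List, Tuple, Union

Literal = Tuple[str, str]            # (written representation, language tag)
Node = Union[str, Literal]
Triple = Tuple[str, str, Node]

# Identifiers follow slides 16 and 17, except the two "-form" nodes,
# which are hypothetical names (the slides do not name the forms).
TRIPLES: List[Triple] = [
    ("terminesp:lexiconES", "lemon:entry", "terminesp:38756es"),
    ("terminesp:lexiconEN", "lemon:entry", "terminesp:38756en"),
    ("terminesp:38756es", "lemon:form", "terminesp:38756es-form"),
    ("terminesp:38756en", "lemon:form", "terminesp:38756en-form"),
    ("terminesp:38756es-form", "lemon:writtenRep", ("red", "es")),
    ("terminesp:38756en-form", "lemon:writtenRep", ("network", "en")),
    ("terminesp:38756es", "lemon:sense", "terminesp:38756es-sense"),
    ("terminesp:38756en", "lemon:sense", "terminesp:38756en-sense"),
    ("terminesp:38756es-sense", "lemon:reference", "terminesp:38756"),
    ("terminesp:38756en-sense", "lemon:reference", "terminesp:38756"),
    # The reified translation: a node of its own, linking the two senses.
    ("terminesp:38756es-en-TR", "tr:translationSource", "terminesp:38756es-sense"),
    ("terminesp:38756es-en-TR", "tr:translationTarget", "terminesp:38756en-sense"),
    ("terminesp:38756es-en-TR", "tr:translationCategory", "trcat:directEquivalent"),
    ("terminesp:es-en-transet", "tr:tran", "terminesp:38756es-en-TR"),
]


def objects(subject: Node, predicate: str) -> List[Node]:
    """All o such that (subject, predicate, o) is in TRIPLES."""
    return [o for (s, p, o) in TRIPLES if s == subject and p == predicate]


def subjects(predicate: str, obj: Node) -> List[str]:
    """All s such that (s, predicate, obj) is in TRIPLES."""
    return [s for (s, p, o) in TRIPLES if p == predicate and o == obj]


def translate(written_rep: str, lang: str) -> List[Literal]:
    """Follow form -> entry -> sense -> Translation -> sense -> entry -> form."""
    results: List[Literal] = []
    for form in subjects("lemon:writtenRep", (written_rep, lang)):
        for entry in subjects("lemon:form", form):
            for sense in objects(entry, "lemon:sense"):
                for translation in subjects("tr:translationSource", sense):
                    for target in objects(translation, "tr:translationTarget"):
                        for target_entry in subjects("lemon:sense", target):
                            for target_form in objects(target_entry, "lemon:form"):
                                results.extend(objects(target_form, "lemon:writtenRep"))
    return results


print(translate("red", "es"))  # [('network', 'en')]
```

Because the translation is a node rather than a direct edge between senses, extra statements such as tr:translationCategory can be attached to it without touching the lexical entries. In this sketch the lookup is directional: `translate("network", "en")` returns an empty list, because the only translation node has the Spanish sense as its source.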
