Ilan Kernerman (K Dictionaries Ltd, Tel Aviv) introduced K Dictionaries Ltd, a company with a long tradition in dictionary development. He shared his experience of the transition, driven by technological development, from traditional dictionaries to multilingual datasets, data management, and software engineering, architecture and design. Today the K Dictionaries resources comprise multilingual databases for over 20 major and some minor languages, including linguistic information on morphology and pronunciation, as well as lexicographic editorial tools and applications. The main focus is the quality of the language data: the data is first collected and edited manually by first-language speakers to build monolingual datasets, which are then extended and connected via automatic translation to form bilingual and multilingual datasets. The main goal is to derive value from traditional lexicography for applications such as machine translation, e-learning, word processing, text mining and search engines. Linguistic linked open data is attractive because of its inherent interconnectedness and the vast amount of language data available. However, integrating this data suffers from the mediocre quality of automatically created content. The challenge is to generate high-quality content automatically, which means resolving complex cross-linguistic relations that rarely have a 1:1 equivalence (for instance in compound words), and to extend the few existing quality-sensitive domains, e.g. education and healthcare, which are already interested in high-quality linguistic data.
Ilan Kernerman: Generating Multilingual Lexicographic Resources
1. MLODE Leipzig, 2 September 2014
Generating multilingual lexicographic resources
Ilan Kernerman
K Dictionaries Ltd, Tel Aviv
2. K DICTIONARIES
MLODE • 20140902 2
Established in 1993, based in Tel Aviv
Focus on Technology-Driven Content
Create lexicographic data for 40+ languages
Cooperate worldwide with
- language editors, translators and technicians
- software engineers, architects and designers
- digital & print publishing partners and LT firms
- the academe and professional associations
3. RESOURCES
Dictionaries for English learners & native speakers
Dictionaries for native & foreign language learning
Dictionaries for bilingual & multilingual translation
Multi-language/Multi-layer datasets
Lexicographic editorial tools & applications
Morphology & pronunciation
Language supplements, audio & pictures
7. EVOLUTION
1. Monolingual English learner’s dictionary
2. Bilingual English learner’s dictionary
3. Multilingual English dictionary
4. L2-English reversed indices
5. L2 bilingual glossaries
6. L2, L3 etc. multilingual dictionaries
8. ENGLISH MULTILINGUAL
PASSWORD semi-bilingual dictionary
KEMD (44 languages): Afrikaans | Arabic | Bulgarian | Catalan | Chinese (Simplified | Traditional) | Croatian | Czech | Danish | Dutch | English | Estonian | Farsi | Finnish | French | German | Greek | Hebrew | Hindi | Hungarian | Icelandic | Indonesian | Italian | Japanese | Korean | Latvian | Lithuanian | Malay | Norwegian | Polish | Portuguese (Brazil | Portugal) | Romanian | Russian | Serbian | Slovak | Slovene | Spanish | Swedish | Thai | Turkish | Ukrainian | Urdu | Vietnamese
9. L2 MULTILINGUALS
Generating L2-English Index
― Produce L2 Index table
― Produce L2 Senses table
Editing L2 Index
― Include/Exclude HW in L2 Index
― Include/Exclude Sense (checkbox in Tree preview)
― Edit L2 HW and POS
― Edit the Entry (modify, add, remove, re-order Senses)
― Search Sense in English HW or Definition and add it
Translating Multilingually
― Link L2 HW via each English Sense to translations in all other languages (of the English multilingual)
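
The generate-and-link steps on this slide can be illustrated with a minimal sketch in Python. It is not the K Dictionaries implementation: the `Sense` record, its field names, the functions `build_l2_index` and `link_multilingual`, and the toy data are all hypothetical, and the manual editing step (including or excluding headwords and senses) is not modelled. The sketch only assumes that each English sense carries translations keyed by language code, reverses those translations into an L2 index, and then uses each English sense as a pivot to reach the other languages.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One sense of an English headword (all field names are hypothetical)."""
    sense_id: str
    headword: str
    pos: str
    # language code -> translations of this sense in that language
    translations: dict[str, list[str]] = field(default_factory=dict)


# Toy data for illustration only, not taken from any K Dictionaries resource.
SENSES = [
    Sense("bank.1", "bank", "noun", {"de": ["Bank"], "fr": ["banque"]}),
    Sense("bank.2", "bank", "noun", {"de": ["Ufer"], "fr": ["rive"]}),
    Sense("bench.1", "bench", "noun", {"de": ["Bank"], "fr": ["banc"]}),
]


def build_l2_index(senses, l2):
    """Reverse the English data: L2 headword -> ids of the English senses it translates."""
    index = defaultdict(list)
    for sense in senses:
        for hw in sense.translations.get(l2, []):
            index[hw].append(sense.sense_id)
    return dict(index)


def link_multilingual(senses, l2_index, l2):
    """For each L2 headword, follow every linked English sense to all other languages."""
    by_id = {s.sense_id: s for s in senses}
    linked = {}
    for hw, sense_ids in l2_index.items():
        linked[hw] = [
            {
                "english": by_id[sid].headword,
                "pos": by_id[sid].pos,
                # Keep every language except the L2 itself.
                "translations": {
                    lang: list(words)
                    for lang, words in by_id[sid].translations.items()
                    if lang != l2
                },
            }
            for sid in sense_ids
        ]
    return linked


l2_index = build_l2_index(SENSES, "de")
linked = link_multilingual(SENSES, l2_index, "de")
```

In this toy data the German headword "Bank" is reached from two English senses ("bank" and "bench"), so it links to both French "banque" and "banc". Keeping the senses separate rather than merging their translations is what leaves room for the editing step on the slide, where an editor decides which senses to include.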