We present UWN, a large multilingual lexical knowledge base that describes the meanings and relationships of words in over 200 languages. This paper explains how link prediction, information integration and taxonomy induction methods have been used to build UWN based on WordNet and extend it with millions of named entities from Wikipedia. We additionally introduce extensions to cover lexical relationships, frame-semantic knowledge, and language data. An online interface provides human access to the data, while a software API enables applications to look up over 16 million words and names.
UWN: A Large Multilingual Lexical Knowledge Base
Step 1: Link Prediction
UWN's Multilingual Graph
• Goal: Richer, Less Sparse Features
• How: Model Synonymy, Polysemy, Semantic Relatedness, Taxonomy (within and across languages)
Gerard de Melo and Gerhard Weikum
ICSI Berkeley / Max Planck Institute for Informatics
Better NLP Features using Lexical Semantics
More Information:
www.lexvo.org/gdm/
• Downloadable API available
• Web User Interface
[Figure: excerpt of UWN's multilingual graph.
Meaning nodes: Entity; Institution; Educational institution; University; University of California, Berkeley; Berkeley, CA; George Berkeley; school (group of fish); school (institution); school (building).
Word labels: por: “entidade”; cmn: “制度”; heb: “ישות”; deu: “Bildungseinrichtung”; srp: “универзитете”; eng: “Berkeley”; ara: “كينونة، وجود”; tha: “สถาบัน”; fin: “oppilaitos”; fin: “yliopisto”; cmn: “柏克萊加州大學”; deu: “Schulgebäude”; deu: “Schulhaus”; deu: “Fischschwarm”; ces: “hejno”; fra: “banc”; chv: “шкул”; jpn: “学校”; kor: “학교”; lao: “ໂຮງຮຽນ”; kat: “სკოლა”.]
• Over 16 million words and names in over 200 languages, semantically connected
• Ambiguity and synonymy captured
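The idea that a single graph captures both ambiguity (one word, several meanings) and synonymy (several words, one meaning) can be illustrated with a small sketch. This is not UWN's API or data format: the function names, the `(language, word)` term encoding, and the specific word-to-meaning pairings are assumptions chosen for illustration, loosely based on the labels in the figure.

```python
from collections import defaultdict

# Illustrative (term, meaning) links. The pairings and the plain-string
# meaning identifiers are assumptions, not UWN's actual data.
LINKS = [
    (("eng", "school"), "school (institution)"),
    (("eng", "school"), "school (building)"),
    (("eng", "school"), "school (group of fish)"),
    (("deu", "Schulhaus"), "school (building)"),
    (("deu", "Schulgebäude"), "school (building)"),
    (("deu", "Fischschwarm"), "school (group of fish)"),
    (("fra", "banc"), "school (group of fish)"),
]


def build_index(links):
    """Build both directions of the term <-> meaning relation."""
    meanings_of = defaultdict(set)
    terms_of = defaultdict(set)
    for term, meaning in links:
        meanings_of[term].add(meaning)
        terms_of[meaning].add(term)
    return meanings_of, terms_of


def synonyms(term, meanings_of, terms_of):
    """Terms, in any language, that share at least one meaning with `term`."""
    found = set()
    for meaning in meanings_of.get(term, ()):
        found |= terms_of[meaning]
    found.discard(term)
    return found


meanings_of, terms_of = build_index(LINKS)
ambiguous = meanings_of[("eng", "school")]  # ambiguity: several meanings
same_meaning = synonyms(("deu", "Schulhaus"), meanings_of, terms_of)
```

Keeping both directions of the relation makes the two questions symmetric: looking up a term yields its candidate meanings, and looking up a meaning yields its words across languages.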
[Figure: additional graph labels. Word labels: eng: “UC Berkeley”; eng: “Cal”. Meaning nodes: City; Geopolitical Entity. Lexvo.org language descriptions (Languages, Scripts, Characters, Countries), with nodes Chuvash; Georgian; Cyrillic (Script); Russia (Country).]
UWN: Meaning Distinctions
• Ontological Taxonomy
• Encyclopedic Knowledge, Pictures, Video, Sounds, Maps
• Etymological and other word relationships
• Millions of Named Entities (People, Places, Proteins, Asteroids, Companies, etc.)
• 200+ languages
Step 2: Entity Integration
Step 3: Taxonomy Induction
Extras
• Markov Chain to rank taxonomic parents
• 270 Wikipedia taxonomies integrated with
WordNet's hypernym hierarchy
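One way to picture ranking taxonomic parents with a Markov chain is a random walk with restart: start at an entity, follow weighted candidate-parent links, and rank nodes by how much probability mass they receive. The sketch below is a simplified illustration under that assumption, not the authors' exact formulation; the function name, the restart probability, and the edge weights are hypothetical.

```python
def rank_parents(edges, start, restart=0.15, iterations=100):
    """Rank nodes reachable from `start` by random-walk visit probability.

    edges: dict mapping a node to {candidate_parent: weight}.
    Illustrative sketch only; weights and restart value are assumptions.
    """
    nodes = set(edges) | {p for parents in edges.values() for p in parents}
    nodes.add(start)
    prob = {n: 0.0 for n in nodes}
    prob[start] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for node, mass in prob.items():
            out = edges.get(node, {})
            total = sum(out.values())
            if total == 0:
                nxt[start] += mass  # dead end: jump back to the start node
                continue
            nxt[start] += restart * mass
            for parent, weight in out.items():
                nxt[parent] += (1 - restart) * mass * weight / total
        prob = nxt
    ranked = sorted((n for n in nodes if n != start),
                    key=lambda n: prob[n], reverse=True)
    return [(n, prob[n]) for n in ranked]


# Hypothetical candidate-parent links; "City" stands for a noisy link.
edges = {
    "University of California, Berkeley": {"University": 3.0, "City": 1.0},
    "University": {"Educational institution": 1.0},
    "City": {"Geopolitical Entity": 1.0},
}
ranking = rank_parents(edges, "University of California, Berkeley")
```

In this toy graph the strongly supported parent collects more probability mass than the noisy one, which is the intuition behind using the chain's distribution as a ranking signal.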
[Figure: cross-lingual links between Wikipedia article titles. Labels: es: Televisor; es: Televisión; ru: Телевизор; hi: दूरदर्शन; ja: テレビ; en: Television; en: Television set; zh: 电视机; ja: テレビ受像機; en: TV set; en: T.V. Vertex labels: V1,u and V1,v.]
• LP for constraint-based computation of
equivalence classes of entities
• Region Growing approximation algorithm
• Link multilingual words to WordNet
• Connect Wikipedia with WordNet (equivalence and taxonomic links)
• FrameNet Linking
• Common-Sense Knowledge Extraction
• Multilingual Roget's Thesaurus
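The constraint-based computation of equivalence classes mentioned above can be illustrated with a much simpler stand-in. The sketch below is neither the LP nor the Region Growing approximation from the poster. It is a greedy union-find merge that enforces one plausible constraint, assumed here for illustration: two different articles from the same language edition should not end up in the same class. The function name, the `lang:Title` node encoding, and the link weights are hypothetical.

```python
def merge_entities(links):
    """Greedily merge linked articles into equivalence classes.

    links: iterable of (weight, node_a, node_b), nodes written 'lang:Title'.
    A merge is skipped if it would put two articles from the same language
    edition into one class (an assumed distinctness constraint).
    """
    parent = {}
    langs = {}  # class root -> set of language codes in that class

    def find(x):
        if x not in parent:
            parent[x] = x
            langs[x] = {x.split(":", 1)[0]}
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Consider the strongest links first.
    for weight, a, b in sorted(links, reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            continue
        if langs[root_a] & langs[root_b]:
            continue  # constraint violated: same edition on both sides
        parent[root_b] = root_a
        langs[root_a] |= langs.pop(root_b)

    classes = {}
    for node in list(parent):
        classes.setdefault(find(node), set()).add(node)
    return sorted(sorted(members) for members in classes.values())


# Hypothetical weighted links; the last one is a deliberately noisy link.
links = [
    (0.9, "en:Television", "es:Televisión"),
    (0.9, "en:Television set", "es:Televisor"),
    (0.8, "en:Television", "ja:テレビ"),
    (0.7, "en:Television set", "ja:テレビ受像機"),
    (0.3, "es:Televisor", "ja:テレビ"),
]
classes = merge_entities(links)
```

Here the weak link between the two groups is rejected because accepting it would merge two English, two Spanish, and two Japanese articles into one class. A greedy pass like this has no optimality guarantee, which is one reason a global formulation with an approximation algorithm is attractive.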