Roberto Navigli
From Text to Concepts and Back: Going
Multilingual with BabelNet in a Step or Two
30th November 2017 – Rome, Italy
Machine Learning Meetup
http://lcl.uniroma1.it
Simone Ponzetto
Tiziano Flati
Andrea Moro
Daniele Vannella
Francesco Cecconi
Alessandro Raganato
Claudio Delli Bovi
Taher Pilehvar
Outline of the talk
• Introduction to BabelNet
• Disambiguation and entity linking with Babelfy
• Demos:
– The Babelfy disambiguation system
– Multilingual concept and entity extraction
A 5-year ERC Starting Grant (2011-2016)
on Multilingual Word Sense Disambiguation
http://multijedi.org
INTEGRATING
KNOWLEDGE
[Navigli & Ponzetto, ACL 2010
& Artificial Intelligence Journal 2012;
Pilehvar & Navigli, ACL 2014]
2017 Artificial
Intelligence Prominent
Paper Award!
The resource diaspora
Key Objective 1: create knowledge for all languages
Multilingual Joint Word Sense Disambiguation
(MultiJEDI)
WordNet
MultiWordNet
WOLF
MCR
GermaNet
BalkaNet
It all started with merging WordNet and Wikipedia
[Navigli and Ponzetto, ACL 2010; AIJ 2012]
• A wide-coverage multilingual semantic network
including both encyclopedic (from Wikipedia) and
lexicographic (from WordNet) entries
– Concepts from WordNet
– Named entities and specialized concepts from Wikipedia
– Concepts integrated from both resources
Creating a Multilingual Semantic Network
• Start from two large complementary resources:
– WordNet: full-fledged taxonomy
– Wikipedia: multilingual and continuously updated
[Figure: excerpt of the WordNet graph, with is-a and has-part edges among synsets such as {wheeled vehicle}, {self-propelled vehicle}, {motor vehicle}, {tractor}, {car, auto, automobile, machine, motorcar}, {convertible}, {air bag}, {golf cart, golfcart}, {wagon, waggon}, {accelerator, accelerator pedal, gas pedal, throttle}, {car window}, {locomotive, engine, locomotive engine, railway locomotive}, {brake}, {wheel} and {splasher}]
Get the best from both worlds
[Figure: the same WordNet excerpt, with synsets labeled as concepts and edges labeled as semantic relations]
WordNet [Miller et al., 1990; Fellbaum, 1998]
[Figure: mock-up of Wikipedia pages, shown as concepts linked by (unspecified) semantic relations]
Wikipedia [The Web Community, 2001-today]
To merge or not to merge?
[Pilehvar and Navigli, ACL 2014]
• Measure the similarity of senses of the same word (but
from different resources)
• If they are similar enough, merge the corresponding two
concepts
[Figure: the WordNet sense plant#n#1 compared with the sense plant#n#1 of another resource]
Structural Similarity with Personalized PageRank
[Pilehvar and Navigli, ACL 2014]
The resulting probabilities form a semantic signature
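The two slides above (compare senses, then merge if similar enough) can be sketched as follows. This is only an illustration under stated assumptions: the toy graph, the node names, the cosine comparison and the 0.8 threshold are all hypothetical, and the published approach may compare signatures differently.

```python
from math import sqrt


def personalized_pagerank(graph, seed, damping=0.85, iterations=50):
    """Power iteration for PageRank with all restart mass placed on `seed`."""
    rank = {node: (1.0 if node == seed else 0.0) for node in graph}
    for _ in range(iterations):
        nxt = {node: ((1.0 - damping) if node == seed else 0.0) for node in graph}
        for node, neighbours in graph.items():
            if not neighbours:
                # Dangling node: send its mass back to the seed.
                nxt[seed] += damping * rank[node]
                continue
            share = damping * rank[node] / len(neighbours)
            for neighbour in neighbours:
                nxt[neighbour] += share
        rank = nxt
    return rank  # the "semantic signature" of the seed sense


def cosine(a, b):
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy graph: sense nodes from two resources plus shared context nodes.
graph = {
    "wn:plant#n#1": ["leaf", "flora"],
    "other:plant#n#1": ["leaf", "flora"],
    "other:plant#n#2": ["factory", "industry"],
    "leaf": ["wn:plant#n#1", "other:plant#n#1", "flora"],
    "flora": ["wn:plant#n#1", "other:plant#n#1", "leaf"],
    "factory": ["other:plant#n#2", "industry"],
    "industry": ["other:plant#n#2", "factory"],
}

MERGE_THRESHOLD = 0.8  # assumed value, for illustration only

sig_wn = personalized_pagerank(graph, "wn:plant#n#1")
sig_flora = personalized_pagerank(graph, "other:plant#n#1")
sig_factory = personalized_pagerank(graph, "other:plant#n#2")

merge_flora = cosine(sig_wn, sig_flora) >= MERGE_THRESHOLD
merge_factory = cosine(sig_wn, sig_factory) >= MERGE_THRESHOLD
```

In this toy setting the two "living organism" senses share their neighbourhood and end up above the threshold, while the "factory" sense lives in a disconnected part of the graph and is kept separate.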
Merging entries from different resources into BabelNet
• We collect lexicalizations, definitions, translations,
images, etc. from each of the merged resources
BabelNet: concepts and semantic relations (2)
• We encode knowledge as a labeled directed graph:
– Each vertex is a Babel synset (=synonym set)
– Each edge is a semantic relation between synsets:
• is-a (balloon is-a aircraft)
• part-of (gasbag part-of balloon)
• instance-of (Einstein instance-of physicist)
• …
• unspecified/relatedness (balloon related-to flight)
Example Babel synset: {balloon (EN), Ballon (DE), aerostato (ES), aerostato (IT), pallone aerostatico (IT), montgolfière (FR)}
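A labeled directed graph of this kind can be sketched with two small classes. The class names, the identifiers such as "s:balloon" and the method names are hypothetical and are not the BabelNet API; only the relations and lexicalizations come from the slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BabelSynset:
    synset_id: str          # hypothetical identifier
    lexicalizations: tuple  # (language, lemma) pairs

    def lemmas_in(self, language):
        return [lemma for lang, lemma in self.lexicalizations if lang == language]


@dataclass
class SemanticNetwork:
    # source synset id -> list of (relation label, target synset id)
    edges: dict = field(default_factory=dict)

    def add_relation(self, source, relation, target):
        self.edges.setdefault(source.synset_id, []).append((relation, target.synset_id))

    def related(self, source, relation):
        return [t for r, t in self.edges.get(source.synset_id, []) if r == relation]


balloon = BabelSynset("s:balloon", (
    ("EN", "balloon"), ("DE", "Ballon"), ("ES", "aerostato"),
    ("IT", "aerostato"), ("IT", "pallone aerostatico"), ("FR", "montgolfière"),
))
aircraft = BabelSynset("s:aircraft", (("EN", "aircraft"),))
gasbag = BabelSynset("s:gasbag", (("EN", "gasbag"),))
flight = BabelSynset("s:flight", (("EN", "flight"),))

network = SemanticNetwork()
network.add_relation(balloon, "is-a", aircraft)
network.add_relation(gasbag, "part-of", balloon)
network.add_relation(balloon, "related-to", flight)
```

Keeping the relation label on the edge, rather than one adjacency list per relation, makes it easy to add further labels later without changing the structure.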
Building BabelNet: Translating Babel synsets
1. Exploiting Wikipedia interlanguage links
Example: pallone aerostatico, globo aerostático, Ballon
Building BabelNet: Translating Babel synsets
2. Filling the lexical translation gaps using a Machine
Translation system to translate the English lexicalizations of
a concept
• On August 27, 1783 in Paris, Franklin witnessed the
world's first hydrogen [[Balloon (aircraft)|balloon]]
flight.
• Le 27 Août, 1783 à Paris, Franklin vu le premier vol en
ballon d'hydrogène.
Statistical Machine Translation
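The two translation steps can be combined in a short sketch: prefer an interlanguage link when one exists, otherwise fall back to machine translation output. The function name, both lookup tables and the rule of keeping the most frequent translation are assumptions made for illustration; the sample data below is invented.

```python
from collections import Counter


def translate_synset(title, target_lang, interlanguage_links, mt_translations):
    """Return (lemma, source) for a Wikipedia page title in `target_lang`."""
    links = interlanguage_links.get(title, {})
    if target_lang in links:
        # Step 1: an interlanguage link gives the translation directly.
        return links[target_lang], "interlanguage link"
    candidates = mt_translations.get((title, target_lang), [])
    if candidates:
        # Step 2 (assumed rule): keep the most frequent translation of the
        # sense-tagged word across the machine-translated sentences.
        best, _count = Counter(candidates).most_common(1)[0]
        return best, "machine translation"
    return None, "no translation"


# Hypothetical toy data.
interlanguage_links = {
    "Balloon (aircraft)": {"IT": "pallone aerostatico", "DE": "Ballon"},
}
mt_translations = {
    ("Balloon (aircraft)", "FR"): ["ballon", "ballon", "montgolfière"],
}

german = translate_synset("Balloon (aircraft)", "DE", interlanguage_links, mt_translations)
french = translate_synset("Balloon (aircraft)", "FR", interlanguage_links, mt_translations)
missing = translate_synset("Balloon (aircraft)", "KO", interlanguage_links, mt_translations)
```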
What is BabelNet?
• A merger of resources of different kinds:
– WordNet: the most popular computational lexicon of English
– Open Multilingual WordNet: a collection of open wordnets
– WoNeF: a French WordNet
– Wikipedia: the largest collaborative encyclopedia
– Wikidata: the largest collaborative knowledge base
– Wiktionary: the largest collaborative dictionary
– OmegaWiki: a medium-size collaborative multilingual dictionary
– GeoNames: a worldwide geographical database
– FrameNet lexical units
– VerbNet entries
– Microsoft Terminology: a computer science thesaurus
– High-quality automatic sense-based translations
Anatomy of BabelNet
[Figure: annotated screenshots of a BabelNet entry, with callouts for lemma, lemma language, definition, picture, further languages, synonyms, synonyms in other languages, pronunciation, definition speech, definitions, usage examples, and generalizations and other relations]
Why do we need BabelNet?
• Multilinguality: the same concept is expressed in tens of
languages
• Coverage: 271 languages and 14 million entries!
– 6M concepts and 7.7M named entities
– 119M word senses
– 378M semantic relations (27 relations per concept on avg.)
– 11M images associated with concepts
– 41M textual definitions
– 2M concepts with domains associated
• Concepts and named entities together: dictionary and
encyclopedic knowledge is semantically interconnected
• "Dictionary of the future": semantic network structure
with labeled relations, pictures, multilingual synsets
• Full-fledged taxonomy: is-a relations are available for
both concepts and named entities (Wikipedia Bitaxonomy)
– Ferrari Testarossa is-a sports car
– BabelNet is-a semantic network & encyclopedic dictionary
• Easy access: Java and HTTP RESTful APIs; SPARQL
endpoint (2 billion triples)
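As a rough idea of what the HTTP access looks like, the sketch below only builds a request URL and does not call the service. The base URL, the version segment, the endpoint name getSynsetIds and the parameter names are assumptions that should be checked against the current BabelNet API documentation; the key is a placeholder.

```python
from urllib.parse import urlencode


def synset_ids_url(lemma, search_lang, key, base="https://babelnet.io/v9"):
    """Build a URL asking for the synsets of `lemma` in `search_lang`.

    `base` (including the version) and the parameter names are assumptions.
    """
    query = urlencode({"lemma": lemma, "searchLang": search_lang, "key": key})
    return f"{base}/getSynsetIds?{query}"


url = synset_ids_url("pallone aerostatico", "IT", key="YOUR_API_KEY")
```

Building the query with urlencode takes care of escaping multiword lemmas and non-ASCII characters.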
The core of the Linguistic
Linked Open Data cloud!
What can we do with BabelNet?
• Search and translate:
• Explore the network:
BabelNet is now live!
• 284 languages
• 15 million concepts and named entities
• 1.8 billion semantic relations
ADDRESSING
AMBIGUITY
[Moro, Raganato & Navigli,
TACL 2014]
Multilingual Joint Word Sense Disambiguation
(MultiJEDI)
Key Objective 2: use all languages to disambiguate one
So what?
Core step: Connect all candidate meanings
Thomas and Mario are strikers playing in Munich
Key points:
1. Disambiguate in any language with a multilingual
NLP pre-processing pipeline
2. Do not use training data
3. Use a knowledge-based algorithm that exploits a
processed version of the BabelNet graph
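The core step (connect all candidate meanings) and the knowledge-based, training-free selection can be illustrated on the example sentence. This is a deliberately simplified stand-in: each mention keeps the candidate with the most connections to candidates of other mentions, which is not claimed to be the actual Babelfy procedure. The candidate lists and the relatedness pairs are invented toy data.

```python
from itertools import combinations


def connect_candidates(candidates, related):
    """Count, for each candidate meaning, its links to candidates of other mentions."""
    degree = {m: 0 for meanings in candidates.values() for m in meanings}
    for (_, meanings_a), (_, meanings_b) in combinations(candidates.items(), 2):
        for a in meanings_a:
            for b in meanings_b:
                if frozenset((a, b)) in related:
                    degree[a] += 1
                    degree[b] += 1
    return degree


def disambiguate(candidates, related):
    """Pick, for each mention, its best-connected candidate meaning."""
    degree = connect_candidates(candidates, related)
    return {mention: max(meanings, key=degree.get)
            for mention, meanings in candidates.items()}


# Hypothetical candidates for "Thomas and Mario are strikers playing in Munich".
candidates = {
    "Thomas": ["Thomas Müller", "Thomas Mann"],
    "Mario": ["Mario Gómez", "Mario (video game)"],
    "strikers": ["striker (football)", "striker (labour dispute)"],
    "Munich": ["FC Bayern Munich", "Munich (city)"],
}
# Hypothetical relatedness edges, standing in for the knowledge graph.
related = {
    frozenset(pair) for pair in [
        ("Thomas Müller", "striker (football)"),
        ("Thomas Müller", "FC Bayern Munich"),
        ("Mario Gómez", "striker (football)"),
        ("Mario Gómez", "FC Bayern Munich"),
        ("striker (football)", "FC Bayern Munich"),
        ("Thomas Mann", "Munich (city)"),
    ]
}

chosen = disambiguate(candidates, related)
```

No training data is involved: the choice depends only on how the candidates are connected in the graph, which is the point of key points 2 and 3.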
Babelfy
Experimental Results:
Fine-grained (Multilingual) Disambiguation
[Table: results on Senseval-3, SemEval-2007 task 17 and SemEval-2013 task 12]
Babelfy demo
• Demo with a newspaper article
Working with BabelNet saves you a
lot of time
• Annotating with BabelNet implies annotating with
WordNet, Wikipedia, OmegaWiki, Open Multilingual
WordNet, Wikidata, Wiktionary, ...
Key fact!
The Crazy Polyglot!
Live demo – Crazy polyglot!
KO 자연 언어 처리, 인공 지능
EN In today's knowledge and information society
FR le paysage lexicographique est plus hétérogène que
jamais.
IT Possono le risorse stand-alone competere
ES con múltiples funciones, portale lexicográficas
multilingüe y servicios web,
ZH Web服务,定制的喜好和个人用户的个人资料?
Same can be done with unstructured tags
Neural Models for Word Sense Disambiguation
(Raganato, Delli Bovi, Navigli, EMNLP 2017)
• Sequence labeling:
[Figure: sequence labeling model; output labels are words & BabelNet concepts]
• Training on English (SemCor sense annotated data)
• Testing on all English Senseval & SemEval test sets
• Testing on arbitrary languages (!) – SemEval 2013
– Using multilingual embeddings to encode words in the same space
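Why a shared embedding space allows training on English and testing on other languages can be shown with a toy example. The sketch below uses a nearest-centroid classifier over averaged context vectors, which is a stand-in for the neural sequence labeler of the paper and not its architecture. The 2-dimensional vectors and the sense labels are invented.

```python
def mean(vectors):
    """Component-wise average of equally sized vectors."""
    count = len(vectors)
    return tuple(sum(v[i] for v in vectors) / count for i in range(len(vectors[0])))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train(examples, embeddings):
    """Average the context vectors of each sense into one centroid per sense."""
    by_sense = {}
    for context, sense in examples:
        by_sense.setdefault(sense, []).append(mean([embeddings[w] for w in context]))
    return {sense: mean(vectors) for sense, vectors in by_sense.items()}


def predict(context, centroids, embeddings):
    vector = mean([embeddings[w] for w in context])
    return max(centroids, key=lambda sense: dot(vector, centroids[sense]))


# Hypothetical multilingual embeddings: translations lie close to each other.
embeddings = {
    "river": (1.0, 0.1), "water": (0.9, 0.2),
    "fiume": (0.95, 0.1), "acqua": (0.9, 0.15),
    "money": (0.1, 1.0), "loan": (0.2, 0.9),
    "soldi": (0.1, 0.95), "prestito": (0.15, 0.9),
}

# Training uses English contexts only.
english_examples = [
    (["river", "water"], "bank (river side)"),
    (["money", "loan"], "bank (financial institution)"),
]
centroids = train(english_examples, embeddings)

# Testing uses Italian contexts that were never seen in training.
italian_river = predict(["fiume", "acqua"], centroids, embeddings)
italian_money = predict(["soldi", "prestito"], centroids, embeddings)
```

Because the sense labels are language-independent and the Italian words sit near their English translations, the centroids learned from English transfer without any Italian training data.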
The future of BabelNet and related technologies
• The MultiJEDI ERC project is now over (but: the MOUSSE
ERC grant just started – in a couple of slides!)
• However, much work still to be done in this direction
• We created a Sapienza startup, Babelscape, with the key
objective of making BabelNet sustainable
• Income is reinvested in BabelNet and subsequent projects
Demos
• Babelfy DONE
• Multilingual concept and term extraction
Demo: multilingual concept and term extraction
• http://babeltex.org
– News in Italian: http://www.repubblica.it
– News in English: http://www.bbc.com/news
– News in Chinese: http://www.bbc.com/zhongwen/simp
From MultiJEDI to MOUSSE: a Quantum Leap!
MultiJEDI (2011-16)
1. CONCEPTS: large multilingual repository of concepts: BabelNet
2. DISAMBIGUATION: disambiguation outputs a bag of concepts
MOUSSE (2017-22)
1. SENTENCE REPRESENTATIONS: huge repository of novel structured universal semantic representations
2. SEMANTIC PARSING: we will construct a language-independent structured semantic representation of the sentence
MOUSSE: Multilingual Open-text Unified Syntax-independent SEmantics
Thanks or… (grazie)
MultiJEDI (Starting Grant, 2011-2016) + MOUSSE (Consolidator Grant, 2017-2022)
Roberto Navigli
Linguistic Computing Laboratory
http://lcl.uniroma1.it
@RNavigli

More Related Content

Similar to Roberto Navigli - From Text to Concepts and Back: Going Multilingual with BabelNet in a Step or Two

A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
Tobias Kuhn
 
Introduction about AAT-Taiwan Project
Introduction about AAT-Taiwan ProjectIntroduction about AAT-Taiwan Project
Introduction about AAT-Taiwan Project
AAT Taiwan
 
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
Digital Classicist Seminar Berlin
 

Similar to Roberto Navigli - From Text to Concepts and Back: Going Multilingual with BabelNet in a Step or Two (20)

A Multilingual Semantic Wiki Based on Controlled Natural Language
A Multilingual Semantic Wiki Based on Controlled Natural LanguageA Multilingual Semantic Wiki Based on Controlled Natural Language
A Multilingual Semantic Wiki Based on Controlled Natural Language
 
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
 
Multilingualism ifla 2014 08
Multilingualism ifla 2014 08Multilingualism ifla 2014 08
Multilingualism ifla 2014 08
 
Porting terminologies to the Semantic Web
Porting terminologies to the Semantic WebPorting terminologies to the Semantic Web
Porting terminologies to the Semantic Web
 
Multilingual Access to Cultural Heritage Content on the Semantic Web - Acl2013
Multilingual Access to Cultural Heritage Content on the Semantic Web - Acl2013Multilingual Access to Cultural Heritage Content on the Semantic Web - Acl2013
Multilingual Access to Cultural Heritage Content on the Semantic Web - Acl2013
 
DBpedia i18n - Amsterdam Meeting (30/01/2014)
DBpedia i18n - Amsterdam Meeting (30/01/2014)DBpedia i18n - Amsterdam Meeting (30/01/2014)
DBpedia i18n - Amsterdam Meeting (30/01/2014)
 
Same shit, new wrapping – or? On the termwiki of The Language Council of Norway
Same shit, new wrapping – or?  On the termwiki of The Language Council of NorwaySame shit, new wrapping – or?  On the termwiki of The Language Council of Norway
Same shit, new wrapping – or? On the termwiki of The Language Council of Norway
 
A Closer Look at the Changing Dynamics of DBpedia Mappings
A Closer Look at the Changing Dynamics of DBpedia MappingsA Closer Look at the Changing Dynamics of DBpedia Mappings
A Closer Look at the Changing Dynamics of DBpedia Mappings
 
Diachronic Analysis of Language exploiting Google Ngram
Diachronic Analysis of Language exploiting Google NgramDiachronic Analysis of Language exploiting Google Ngram
Diachronic Analysis of Language exploiting Google Ngram
 
Maximising (Re)Usability of Library metadata using Linked Data
Maximising (Re)Usability of Library metadata using Linked Data Maximising (Re)Usability of Library metadata using Linked Data
Maximising (Re)Usability of Library metadata using Linked Data
 
CROSER
CROSERCROSER
CROSER
 
Can Deep Learning Techniques Improve Entity Linking?
Can Deep Learning Techniques Improve Entity Linking?Can Deep Learning Techniques Improve Entity Linking?
Can Deep Learning Techniques Improve Entity Linking?
 
Introduction about AAT-Taiwan Project
Introduction about AAT-Taiwan ProjectIntroduction about AAT-Taiwan Project
Introduction about AAT-Taiwan Project
 
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
 
20110728 datalift-rpi-troy
20110728 datalift-rpi-troy20110728 datalift-rpi-troy
20110728 datalift-rpi-troy
 
A Semantic Multimedia Web (Part 3)
A Semantic Multimedia Web (Part 3)A Semantic Multimedia Web (Part 3)
A Semantic Multimedia Web (Part 3)
 
Learning Multilingual Semantic Parsers for Question Answering over Linked Dat...
Learning Multilingual Semantic Parsers for Question Answering over Linked Dat...Learning Multilingual Semantic Parsers for Question Answering over Linked Dat...
Learning Multilingual Semantic Parsers for Question Answering over Linked Dat...
 
High quality machine translation short id-2706
High quality machine translation short id-2706High quality machine translation short id-2706
High quality machine translation short id-2706
 
Webinar on Ontology Management using Vocbench in the context of AGINFRA+ Project
Webinar on Ontology Management using Vocbench in the context of AGINFRA+ ProjectWebinar on Ontology Management using Vocbench in the context of AGINFRA+ Project
Webinar on Ontology Management using Vocbench in the context of AGINFRA+ Project
 
Recontextualizing Audiovisual Archives: Immigrants and Remixing practices
Recontextualizing Audiovisual Archives: Immigrants and Remixing practicesRecontextualizing Audiovisual Archives: Immigrants and Remixing practices
Recontextualizing Audiovisual Archives: Immigrants and Remixing practices
 

More from MeetupDataScienceRoma

More from MeetupDataScienceRoma (20)

Serve Davvero il Machine Learning nelle PMI? | Niccolò Annino
Serve Davvero il Machine Learning nelle PMI? | Niccolò AnninoServe Davvero il Machine Learning nelle PMI? | Niccolò Annino
Serve Davvero il Machine Learning nelle PMI? | Niccolò Annino
 
Meta-learning through the lenses of Statistical Learning Theory (Carlo Cilibe...
Meta-learning through the lenses of Statistical Learning Theory (Carlo Cilibe...Meta-learning through the lenses of Statistical Learning Theory (Carlo Cilibe...
Meta-learning through the lenses of Statistical Learning Theory (Carlo Cilibe...
 
Claudio Gallicchio - Deep Reservoir Computing for Structured Data
Claudio Gallicchio - Deep Reservoir Computing for Structured DataClaudio Gallicchio - Deep Reservoir Computing for Structured Data
Claudio Gallicchio - Deep Reservoir Computing for Structured Data
 
Docker for Deep Learning (Andrea Panizza)
Docker for Deep Learning (Andrea Panizza)Docker for Deep Learning (Andrea Panizza)
Docker for Deep Learning (Andrea Panizza)
 
Machine Learning for Epidemiological Models (Enrico Meloni)
Machine Learning for Epidemiological Models (Enrico Meloni)Machine Learning for Epidemiological Models (Enrico Meloni)
Machine Learning for Epidemiological Models (Enrico Meloni)
 
Quantum Machine Learning and QEM for Gaussian mixture models (Alessandro Luongo)
Quantum Machine Learning and QEM for Gaussian mixture models (Alessandro Luongo)Quantum Machine Learning and QEM for Gaussian mixture models (Alessandro Luongo)
Quantum Machine Learning and QEM for Gaussian mixture models (Alessandro Luongo)
 
Web Meetup #2: Modelli matematici per l'epidemiologia
Web Meetup #2: Modelli matematici per l'epidemiologiaWeb Meetup #2: Modelli matematici per l'epidemiologia
Web Meetup #2: Modelli matematici per l'epidemiologia
 
Deep red - The environmental impact of deep learning (Paolo Caressa)
Deep red - The environmental impact of deep learning (Paolo Caressa)Deep red - The environmental impact of deep learning (Paolo Caressa)
Deep red - The environmental impact of deep learning (Paolo Caressa)
 
[Sponsored] C3.ai description
[Sponsored] C3.ai description[Sponsored] C3.ai description
[Sponsored] C3.ai description
 
Paolo Galeone - Dissecting tf.function to discover auto graph strengths and s...
Paolo Galeone - Dissecting tf.function to discover auto graph strengths and s...Paolo Galeone - Dissecting tf.function to discover auto graph strengths and s...
Paolo Galeone - Dissecting tf.function to discover auto graph strengths and s...
 
Multimodal AI Approach to Provide Assistive Services (Francesco Puja)
Multimodal AI Approach to Provide Assistive Services (Francesco Puja)Multimodal AI Approach to Provide Assistive Services (Francesco Puja)
Multimodal AI Approach to Provide Assistive Services (Francesco Puja)
 
Introduzione - Meetup MLOps & Assistive AI
Introduzione - Meetup MLOps & Assistive AIIntroduzione - Meetup MLOps & Assistive AI
Introduzione - Meetup MLOps & Assistive AI
 
Zero, One, Many - Machine Learning in Produzione (Luca Palmieri)
Zero, One, Many - Machine Learning in Produzione (Luca Palmieri)Zero, One, Many - Machine Learning in Produzione (Luca Palmieri)
Zero, One, Many - Machine Learning in Produzione (Luca Palmieri)
 
Mario Incarnati - The power of data visualization
Mario Incarnati - The power of data visualizationMario Incarnati - The power of data visualization
Mario Incarnati - The power of data visualization
 
Machine Learning in the AWS Cloud
Machine Learning in the AWS CloudMachine Learning in the AWS Cloud
Machine Learning in the AWS Cloud
 
OLIVAW: reaching superhuman strength at Othello
OLIVAW: reaching superhuman strength at OthelloOLIVAW: reaching superhuman strength at Othello
OLIVAW: reaching superhuman strength at Othello
 
[Giovanni Galloro] How to use machine learning on Google Cloud Platform
[Giovanni Galloro] How to use machine learning on Google Cloud Platform[Giovanni Galloro] How to use machine learning on Google Cloud Platform
[Giovanni Galloro] How to use machine learning on Google Cloud Platform
 
Bring your neural networks to the browser with TF.js - Simone Scardapane
Bring your neural networks to the browser with TF.js - Simone ScardapaneBring your neural networks to the browser with TF.js - Simone Scardapane
Bring your neural networks to the browser with TF.js - Simone Scardapane
 
Meetup Gennaio 2019 - Slide introduttiva
Meetup Gennaio 2019 - Slide introduttivaMeetup Gennaio 2019 - Slide introduttiva
Meetup Gennaio 2019 - Slide introduttiva
 
Elena Gagliardoni - Neural Chatbot
Elena Gagliardoni - Neural ChatbotElena Gagliardoni - Neural Chatbot
Elena Gagliardoni - Neural Chatbot
 

Recently uploaded

Recently uploaded (20)

Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe

Roberto Navigli - From Text to Concepts and Back: Going Multilingual with BabelNet in a Step or Two

  • 1. Roberto Navigli From Text to Concepts and Back: Going Multilingual with BabelNet in a Step or Two 30th November 2017 – Rome, Italy Machine Learning Meetup http://lcl.uniroma1.it
  • 2. Simone Ponzetto, Tiziano Flati, Andrea Moro, Daniele Vannella, Francesco Cecconi, Alessandro Raganato, Claudio Delli Bovi, Taher Pilehvar
  • 3. Outline of the talk • Introduction to BabelNet • Disambiguation and entity linking with Babelfy • Demos: – The Babelfy disambiguation system – Multilingual concept and entity extraction
  • 4. A 5-year ERC Starting Grant (2011-2016) on Multilingual Word Sense Disambiguation http://multijedi.org
  • 5. INTEGRATING KNOWLEDGE [Navigli & Ponzetto, ACL 2010 & Artificial Intelligence Journal 2012; Pilehvar & Navigli, ACL 2014] 2017 Artificial Intelligence Prominent Paper Award!
  • 6. The resource diaspora
  • 7. Multilingual Joint Word Sense Disambiguation (MultiJEDI) Key Objective 1: create knowledge for all languages [Figure labels: WordNet, MultiWordNet, WOLF, MCR, GermaNet, BalkaNet]
  • 8. It all started with merging WordNet and Wikipedia [Navigli and Ponzetto, ACL 2010; AIJ 2012] • A wide-coverage multilingual semantic network including both encyclopedic (from Wikipedia) and lexicographic (from WordNet) entries [Figure labels: concepts from WordNet; named entities and specialized concepts from Wikipedia; concepts integrated from both resources]
  • 9. Creating a Multilingual Semantic Network • Start from two large complementary resources: – WordNet: full-fledged taxonomy – Wikipedia: multilingual and continuously updated • Get the best from both worlds [Figure: WordNet fragment around {car, auto, automobile, machine, motorcar}, linked by is-a and has-part relations to synsets such as {motor vehicle}, {convertible}, {air bag}, {car window} and {accelerator, accelerator pedal, gas pedal, throttle}]
  • 10. WordNet [Miller et al., 1990; Fellbaum, 1998] [Figure: WordNet fragment around {car, auto, automobile, machine, motorcar}; the synsets are labeled as concepts and the is-a and has-part edges as semantic relations]
  • 11. Wikipedia [The Web Community, 2001-today] [Figure: mock Wikipedia page "Playing with senses" with placeholder text; pages are labeled as concepts and the links between them as (unspecified) semantic relations]
  • 12. To merge or not to merge? [Pilehvar and Navigli, ACL 2014] • Measure the similarity of senses of the same word (but from different resources) • If they are similar enough, merge the corresponding two concepts [Figure: plant#n#1 in WordNet compared with plant#n#1 in a second resource]
  • 13. Structural Similarity with Personalized PageRank [Pilehvar and Navigli, ACL 2014] The resulting probabilities form a semantic signature
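The idea on slides 12 and 13 can be sketched in a few lines: run Personalized PageRank with all restart mass on one sense, treat the resulting probability vector as that sense's semantic signature, and merge two senses when their signatures are similar enough. This is only a toy illustration. The graph, the node names, the damping factor of 0.85, the cosine comparison and the 0.5 threshold are all assumptions made for the example, not values taken from the cited paper.

```python
import math

def personalized_pagerank(graph, seed, damping=0.85, iterations=100):
    """Power iteration for PageRank with the whole restart mass on `seed`."""
    rank = {node: 1.0 if node == seed else 0.0 for node in graph}
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        nxt[seed] = 1.0 - damping                    # restart probability
        for node, neighbours in graph.items():
            if not neighbours:                       # dangling node: send mass back to the seed
                nxt[seed] += damping * rank[node]
                continue
            share = damping * rank[node] / len(neighbours)
            for neighbour in neighbours:
                nxt[neighbour] += share
        rank = nxt
    return rank

def cosine(sig_a, sig_b):
    """Cosine similarity between two signatures defined over the same nodes."""
    dot = sum(sig_a[n] * sig_b[n] for n in sig_a)
    norm_a = math.sqrt(sum(v * v for v in sig_a.values()))
    norm_b = math.sqrt(sum(v * v for v in sig_b.values()))
    return dot / (norm_a * norm_b)

def should_merge(sig_a, sig_b, threshold=0.5):
    """Illustrative decision rule: merge when the signatures are close enough."""
    return cosine(sig_a, sig_b) >= threshold

# Toy undirected graph (hypothetical): two resources, A and B, each with a "plant" sense.
graph = {
    "A:plant_factory":  ["factory", "industry"],
    "B:plant_factory":  ["factory", "industry"],
    "B:plant_organism": ["leaf", "tree"],
    "factory":  ["A:plant_factory", "B:plant_factory", "industry"],
    "industry": ["A:plant_factory", "B:plant_factory", "factory"],
    "leaf": ["B:plant_organism", "tree"],
    "tree": ["B:plant_organism", "leaf"],
}

sig_a = personalized_pagerank(graph, "A:plant_factory")
sig_b = personalized_pagerank(graph, "B:plant_factory")
sig_o = personalized_pagerank(graph, "B:plant_organism")

print(should_merge(sig_a, sig_b))  # the two factory senses share their neighbourhood
print(should_merge(sig_a, sig_o))  # the organism sense lives in a disconnected region
```

In this toy graph the two factory senses end up with nearly identical signatures, while the organism sense shares no probability mass with them, so only the first pair would be merged.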
  • 14. Merging entries from different resources into BabelNet • We collect lexicalizations, definitions, translations, images, etc. from each of the merged resources
  • 15. BabelNet: concepts and semantic relations (2) • We encode knowledge as a labeled directed graph: – Each vertex is a Babel synset (=synonym set) – Each edge is a semantic relation between synsets: • is-a (balloon is-a aircraft) • part-of (gasbag part-of balloon) • instance-of (Einstein instance-of physicist) • … • unspecified/relatedness (balloon related-to flight) • Example synset: balloon (EN), Ballon (DE), aerostato (ES), aerostato (IT), pallone aerostatico (IT), montgolfière (FR)
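The labeled directed graph on slide 15 maps naturally onto a small data structure: vertices are multilingual synsets and edges carry a relation label. The sketch below is one possible in-memory representation, not BabelNet's actual storage format or API. The class names and the `toy:` identifiers are hypothetical; the lemmas and relations are the ones shown on the slide.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class BabelSynset:
    synset_id: str                                        # hypothetical id, not a real BabelNet id
    lexicalizations: dict = field(default_factory=dict)   # language code -> list of lemmas

class SemanticNetwork:
    """Labeled directed graph: vertices are synsets, edges are semantic relations."""

    def __init__(self):
        self.synsets = {}
        self.edges = defaultdict(list)                    # source id -> [(relation, target id)]

    def add_synset(self, synset):
        self.synsets[synset.synset_id] = synset

    def add_relation(self, source, relation, target):
        self.edges[source].append((relation, target))

    def related(self, source, relation):
        """Targets reachable from `source` through edges labeled `relation`."""
        return [t for r, t in self.edges.get(source, []) if r == relation]

net = SemanticNetwork()
net.add_synset(BabelSynset("toy:balloon", {
    "EN": ["balloon"], "DE": ["Ballon"], "ES": ["aerostato"],
    "IT": ["aerostato", "pallone aerostatico"], "FR": ["montgolfière"],
}))
for sid in ("toy:aircraft", "toy:gasbag", "toy:flight"):
    net.add_synset(BabelSynset(sid))

net.add_relation("toy:balloon", "is-a", "toy:aircraft")
net.add_relation("toy:gasbag", "part-of", "toy:balloon")
net.add_relation("toy:balloon", "related-to", "toy:flight")

print(net.related("toy:balloon", "is-a"))
print(net.synsets["toy:balloon"].lexicalizations["IT"])
```

Keeping the relation label on the edge rather than in separate per-relation graphs makes it easy to add the unspecified relatedness edges alongside typed ones such as is-a.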
  • 16. Building BabelNet: Translating Babel synsets 1. Exploiting Wikipedia interlanguage links (e.g. pallone aerostatico, globo aerostático, Ballon)
  • 17. Building BabelNet: Translating Babel synsets 2. Filling the lexical translation gaps using a Machine Translation system to translate the English lexicalizations of a concept • Source (EN): On August 27, 1783 in Paris, Franklin witnessed the world's first hydrogen [[Balloon (aircraft)|balloon]] flight. • Statistical Machine Translation output (FR): Le 27 Août, 1783 à Paris, Franklin vu le premier vol en ballon d'hydrogène.
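The slide does not say how a single translation is chosen once many sentences containing the same sense-tagged word have been machine-translated. One simple rule, assumed here purely for illustration, is to keep the translation that occurs most often. The candidate lists below are made up, and extracting the aligned target word from each MT output is taken as already done.

```python
from collections import Counter

def pick_translation(candidate_translations):
    """Return the most frequent candidate, or None when there are no candidates."""
    if not candidate_translations:
        return None
    counts = Counter(candidate_translations)
    best, _ = counts.most_common(1)[0]
    return best

# Hypothetical target words aligned to "balloon" in several translated sentences.
french_candidates = ["ballon", "ballon", "montgolfière", "ballon"]

print(pick_translation(french_candidates))
print(pick_translation([]))
```

A frequency vote of this kind is robust to occasional bad translations of individual sentences, at the cost of discarding legitimate but rarer synonyms.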
  • 18. What is BabelNet? • A merger of resources of different kinds:
  • 19. What is BabelNet? • A merger of resources of different kinds: – WordNet: the most popular computational lexicon of English – Open Multilingual WordNet: a collection of open wordnets – WoNeF: a French WordNet – Wikipedia: the largest collaborative encyclopedia – Wikidata: the largest collaborative knowledge base – Wiktionary: the largest collaborative dictionary – OmegaWiki: a medium-size collaborative multilingual dictionary – GeoNames: a worldwide geographical database – FrameNet lexical units – VerbNet entries – Microsoft Terminology: a computer science thesaurus – High-quality automatic sense-based translations
  • 20. What is BabelNet? • A merger of resources of different kinds:
  • 21. Anatomy of BabelNet [Screenshot callouts: lemma, lemma language, definition, picture, further languages, synonyms, synonyms in other languages]
  • 22. Anatomy of BabelNet [Screenshot callouts: pronunciation, definition speech, definitions, usage examples, generalizations and other relations]
  • 23. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages
  • 24. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages
  • 25. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages
  • 26. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages • Coverage: 271 languages and 14 million entries! – 6M concepts and 7.7M named entities – 119M word senses – 378M semantic relations (27 relations per concept on avg.) – 11M images associated with concepts – 41M textual definitions – 2M concepts with domains associated
  • 27. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages • Coverage: 271 languages and 14 million entries! • Concepts and named entities together: dictionary and encyclopedic knowledge is semantically interconnected
  • 28. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages • Coverage: 271 languages and 14 million entries! • Concepts and named entities together: dictionary and encyclopedic knowledge is semantically interconnected • "Dictionary of the future": semantic network structure with labeled relations, pictures, multilingual synsets
  • 29. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages • Coverage: 271 languages and 14 million entries! • Concepts and named entities together: dictionary and encyclopedic knowledge is semantically interconnected • "Dictionary of the future": semantic network structure with labeled relations, pictures, multilingual synsets • Full-fledged taxonomy: is-a relations are available for both concepts and named entities (Wikipedia Bitaxonomy) – Ferrari Testarossa is-a sports car – BabelNet is-a semantic network & encyclopedic dictionary
  • 30. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages • Coverage: 271 languages and 14 million entries! • Concepts and named entities together: dictionary and encyclopedic knowledge is semantically interconnected • "Dictionary of the future": semantic network structure with labeled relations, pictures, multilingual synsets • Full-fledged taxonomy: is-a relations are available for both concepts and named entities (Wikipedia Bitaxonomy) • Easy access: Java and HTTP RESTful APIs; SPARQL endpoint (2 billion triples)
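As a rough idea of what the HTTP RESTful access mentioned on slide 30 can look like, the sketch below only builds a lookup URL and sends nothing over the network. The base URL, the `getSynsetIds` endpoint name and the `lemma`, `searchLang` and `key` parameter names are assumptions, not taken from the slides. They may differ between API versions, so check them against the official BabelNet API documentation before use.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_synset_query(lemma, language, api_key, base_url="https://babelnet.io/v9"):
    """Build a synset-lookup URL.

    ASSUMPTION: endpoint name, parameter names and base URL are illustrative
    and must be verified against the official API documentation.
    """
    params = {"lemma": lemma, "searchLang": language, "key": api_key}
    return f"{base_url}/getSynsetIds?{urlencode(params)}"

# "YOUR_KEY" is a placeholder; a real key is issued on registration.
url = build_synset_query("pallone aerostatico", "IT", "YOUR_KEY")
print(url)

# Round-trip check: the query string decodes back to the original values.
query = parse_qs(urlsplit(url).query)
print(query["lemma"], query["searchLang"])
```

Building the query with `urlencode` rather than string concatenation matters for multiword and non-ASCII lemmas, which are common in a multilingual resource.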
  • 31. The core of the Linguistic Linked Open Data cloud!
  • 32. What can we do with BabelNet? • Search and translate:
  • 33. What can we do with BabelNet?
  • 34. What can we do with BabelNet? • Explore the network:
  • 35. BabelNet is now live! • 284 languages • 15 million concepts and named entities • 1.8 billion semantic relations
  • 36. Outline of the talk • Introduction to BabelNet • Disambiguation and entity linking with Babelfy • Demos: – The Babelfy disambiguation system – Multilingual concept and entity extraction
  • 37. ADDRESSING AMBIGUITY [Moro, Raganato & Navigli, TACL 2014]
  • 38. Multilingual Joint Word Sense Disambiguation (MultiJEDI) Key Objective 2: use all languages to disambiguate one
  • 39. So what?
  • 40. Core step: Connect all candidate meanings • Example sentence: "Thomas and Mario are strikers playing in Munich"
  • 41. Babelfy • Key points: 1. Disambiguate in any language with a multilingual NLP pre-processing pipeline 2. Do not use training data 3. Use a knowledge-based algorithm that exploits a processed version of the BabelNet graph
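The "connect all candidate meanings" step on slide 40 can be illustrated with a deliberately simplified, training-free heuristic: list the candidate meanings of every mention, link candidates that are related in the knowledge graph, and keep for each mention the candidate with the most links to candidates of other mentions. This is a toy stand-in, not Babelfy's actual algorithm. The candidate lists and the relatedness edges below are invented for the example sentence.

```python
def disambiguate(candidates, related):
    """Pick, per mention, the candidate most connected to other mentions' candidates."""
    chosen = {}
    for mention, senses in candidates.items():
        # Candidate meanings belonging to every other mention in the sentence.
        others = [s for m, ss in candidates.items() if m != mention for s in ss]

        def score(sense):
            return sum(frozenset((sense, other)) in related for other in others)

        chosen[mention] = max(senses, key=score)   # ties go to the first candidate
    return chosen

# Hypothetical candidate meanings for "Thomas and Mario are strikers playing in Munich".
candidates = {
    "Thomas":   ["Thomas Müller (footballer)", "Thomas (apostle)"],
    "Mario":    ["Mario Gómez (footballer)", "Mario (video game character)"],
    "strikers": ["striker (football)", "striker (worker on strike)"],
    "Munich":   ["FC Bayern Munich", "Munich (city)"],
}

# Hypothetical undirected relatedness edges between meanings.
related = {frozenset(pair) for pair in [
    ("Thomas Müller (footballer)", "FC Bayern Munich"),
    ("Mario Gómez (footballer)", "FC Bayern Munich"),
    ("Thomas Müller (footballer)", "striker (football)"),
    ("Mario Gómez (footballer)", "striker (football)"),
    ("Thomas Müller (footballer)", "Mario Gómez (footballer)"),
    ("striker (football)", "FC Bayern Munich"),
    ("Munich (city)", "FC Bayern Munich"),
]}

result = disambiguate(candidates, related)
for mention, sense in result.items():
    print(mention, "->", sense)
```

Note that the edge between the city and the club does not help "Munich (city)", because links between candidates of the same mention are ignored: only evidence from the rest of the sentence counts.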
  • 43. Experimental Results: Fine-grained (Multilingual) Disambiguation [Results shown for: Senseval-3, SemEval-2007 task 17, SemEval-2013 task 12]
  • 44. Babelfy demo • Demo with a newspaper article
  • 45. Working with BabelNet saves you a lot of time • Key fact! Annotating with BabelNet implies annotating with WordNet, Wikipedia, OmegaWiki, Open Multilingual WordNet, Wikidata, Wiktionary, ...
  • 46. The Crazy Polyglot!
  • 47. Live demo – Crazy polyglot! KO: 자연 언어 처리, 인공 지능 EN: In today's knowledge and information society FR: le paysage lexicographique est plus hétérogène que jamais. IT: Possono le risorse stand-alone competere ES: con múltiples funciones, portale lexicográficas multilingüe y servicios web, ZH: Web服务,定制的喜好和个人用户的个人资料?
  • 48. Same can be done with unstructured tags
  • 49. Same can be done with unstructured tags
  • 50. Neural Models for Word Sense Disambiguation (Raganato, Delli Bovi, Navigli, EMNLP 2017) • Sequence labeling: words & BabelNet concepts
  • 51. Neural Models for Word Sense Disambiguation (Raganato, Delli Bovi, Navigli, EMNLP 2017) • Training on English (SemCor sense annotated data) • Testing on all English Senseval & SemEval test sets
  • 52. Neural Models for Word Sense Disambiguation (Raganato, Delli Bovi, Navigli, EMNLP 2017) • Training on English (SemCor sense annotated data) • Testing on arbitrary languages (!) – SemEval 2013 – Using multilingual embeddings to encode words in the same space
  • 53. The future of BabelNet and related technologies • The MultiJEDI ERC project is now over (but: the MOUSSE ERC grant just started – in a couple of slides!) • However, much work still to be done in this direction • We created a Sapienza startup, Babelscape, with the key objective of making BabelNet sustainable • Income is reinvested in BabelNet and subsequent projects
  • 54. Demos • Babelfy DONE • Multilingual concept and term extraction
  • 55. Demo: multilingual concept and term extraction • http://babeltex.org – News in Italian: http://www.repubblica.it – News in English: http://www.bbc.com/news – News in Chinese: http://www.bbc.com/zhongwen/simp
  • 56. From MultiJEDI to MOUSSE: a Quantum Leap! • MultiJEDI (2011-16): 1. CONCEPTS: large multilingual repository of concepts (BabelNet) 2. DISAMBIGUATION: disambiguation outputs a bag of concepts • MOUSSE (2017-22), Multilingual Open-text Unified Syntax-independent SEmantics: 1. SENTENCE REPRESENTATIONS: huge repository of novel structured universal semantic representations 2. SEMANTIC PARSING: we will construct a language-independent structured semantic representation of the sentence
  • 57. Thanks or… m i(grazie) MultiJEDI (Starting Grant, 2011-2016) + MOUSSE (Consolidator Grant, 2017-2022)
  • 58. Roberto Navigli Linguistic Computing Laboratory http://lcl.uniroma1.it @RNavigli