EVALITA 2018 NLP4FUN - Solving language games

EVALITA 2018
EVALUATION OF NLP AND SPEECH TOOLS FOR ITALIAN
Overview of the EVALITA 2018 Solving
language games (NLP4FUN) Task
Pierpaolo Basile, Marco de Gemmis
Lucia Siciliani, Giovanni Semeraro
Dipartimento di Informatica
Università degli Studi di Bari Aldo Moro, Italy

EVALITA 2018 Workshop
December 12-13 2018, Turin

“La Ghigliottina”
The solution is pacco:
✓ “Pacco, doppio pacco e contropaccotto” (movie)
✓ Carta da pacco
✓ Pacco di soldi
✓ Pacco di pasta
✓ Pacco regalo

Motivation
● Language Games have attracted the attention of
researchers in the fields of AI and NLP
○ Jeopardy!, crossword puzzles
● “La Ghigliottina” is a challenging language
game which demands knowledge covering a
broad range of topics
○ take advantage of the availability of open
repositories and the web
○ cultural and linguistic background is
necessary to understand the clues

Task and dataset
● The task: given a set of five words - the
clues - each linked in some way to a
specific word that represents the unique
solution of the game, find that solution
○ clues are unrelated to each other
○ the player has one minute to find the
solution!!!
● Dataset: set of games taken from
○ the TV show “L’Eredità”
○ the board game “L’Eredità”

Data format
<games>
<game>
<id>3fc953bd...</id>
<clue>uomo</clue>
<clue>cane</clue>
<clue>musica</clue>
<clue>casa</clue>
<clue>pietra</clue>
<solution>chiesa</solution>
<type>TV</type>
</game>
...
</games>
● XML format
● a root element games which contains several game elements
● each game has five clue elements and one solution
● the element type specifies the type of the game: TV or board game
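
The structure described above can be read with a standard XML parser. The following is a minimal sketch, assuming the file follows exactly the layout shown on the slide; the function name `parse_games` and the id `example-id-1` are hypothetical placeholders, not part of the task material.

```python
import xml.etree.ElementTree as ET

# Sample document following the structure shown on the slide.
# The id is a made-up placeholder, not a real game id.
SAMPLE = """<games>
  <game>
    <id>example-id-1</id>
    <clue>uomo</clue>
    <clue>cane</clue>
    <clue>musica</clue>
    <clue>casa</clue>
    <clue>pietra</clue>
    <solution>chiesa</solution>
    <type>TV</type>
  </game>
</games>"""


def parse_games(xml_text):
    """Return one dict per <game> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    games = []
    for game in root.findall("game"):
        games.append({
            "id": game.findtext("id"),
            "clues": [clue.text for clue in game.findall("clue")],
            "solution": game.findtext("solution"),
            "type": game.findtext("type"),
        })
    return games


games = parse_games(SAMPLE)
print(games[0]["clues"], "->", games[0]["solution"])
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring(xml_text)`.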

Output
The participants must return a ranked list of
solutions in a plain text file:
id solution score rank time
For example:
3fc953bd-... porta 0.978 1 3459
3fc953bd-... chiesa 0.932 2 3251
3fc953bd-... santo 0.897 3 4321
...
3fc953bd-... carta 0.321 100 2343
A maximum of 100 candidate solutions is allowed for each game.
The time taken by the system to compute the solution is reported in milliseconds.
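
The output rows described above could be produced along these lines. This is only a sketch under stated assumptions: fields are separated by a single space, each candidate carries its own time in milliseconds (as the example rows suggest), and the helper name `format_ranked_list` and the id `example-id-1` are hypothetical.

```python
def format_ranked_list(game_id, candidates, max_candidates=100):
    """Build "id solution score rank time" rows for one game.

    candidates: iterable of (word, score, time_ms) tuples.
    Rows are sorted by decreasing score and cut at max_candidates.
    """
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
    lines = []
    for rank, (word, score, time_ms) in enumerate(ranked[:max_candidates], start=1):
        # Single-space separator is an assumption; the slide only shows whitespace.
        lines.append(f"{game_id} {word} {score:.3f} {rank} {time_ms}")
    return lines


lines = format_ranked_list(
    "example-id-1",
    [("chiesa", 0.932, 3251), ("porta", 0.978, 3459), ("santo", 0.897, 4321)],
)
print("\n".join(lines))
```

The cut at `max_candidates` enforces the limit of 100 candidate solutions per game.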

Dataset: statistics
● Games have different levels of difficulty
○ instances taken both from the TV game and
from the official board game
● Training set: 315 instances of the game
○ 64.8% (TV game), 35.2% (board game)
● Test set: 105 instances of the game
○ 62.9% (TV game)
○ 37.1% (board game)
● 300 fake games (automatically created)
added to the evaluation data

Evaluation
● a (time) weighted version of Mean
Reciprocal Rank (MRR)
● G is the set of games
● r_g is the rank of the solution
● t_g denotes the minutes taken by the system
to give the solution
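
The slide text does not reproduce the formula itself, so the following is only a sketch of how a time-weighted MRR could be computed from r_g and t_g. The weight function `time_weight` (linear decay over 10 minutes with a floor of 0.05) is an illustrative assumption, not taken from the slides; all function names are hypothetical.

```python
def time_weight(minutes, limit=10.0, floor=0.05):
    """Assumed weight: decays linearly from 1 towards `floor` as time nears `limit`."""
    return max(floor, (limit - minutes) / limit)


def weighted_mrr(results, weight=time_weight):
    """Average of weight(t_g) / r_g over all games.

    results: list of (rank, minutes) pairs, one per game; rank is None
    when the solution is missing from the returned list (contributes 0).
    """
    if not results:
        return 0.0
    total = 0.0
    for rank, minutes in results:
        if rank is not None:
            total += weight(minutes) / rank
    return total / len(results)


def standard_mrr(results):
    """Plain MRR: same average with the time weight fixed to 1."""
    return weighted_mrr(results, weight=lambda minutes: 1.0)


# Three toy games: solved at rank 1 instantly, at rank 2 after 5 minutes, unsolved.
results = [(1, 0.0), (2, 5.0), (None, 1.0)]
print(weighted_mrr(results), standard_mrr(results))
```

Since the output file reports time in milliseconds, a value would need to be divided by 60000 to obtain minutes before being used as t_g.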

Participants
● 12 registered teams
● only 2 teams submitted results
○ UNIOR4NLP: the idea is that clue words and
the corresponding solution are often part of a
multiword expression (multiword expressions
are filtered by linguistic patterns)
○ LucaSquadrone: co-occurrences of clues and
candidate solutions
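
To illustrate the general idea of ranking candidates by co-occurrence with the clues, here is a toy sketch. It is not the participants' actual system: the phrase list, the stopword set, and the function `rank_candidates` are hypothetical, and a real system would draw on a large corpus rather than six phrases.

```python
from collections import Counter

# Tiny hand-made phrase list (an assumption, for illustration only).
PHRASES = [
    "carta da pacco",
    "pacco di soldi",
    "pacco di pasta",
    "pacco regalo",
    "carta di credito",
    "pasta al forno",
]
STOPWORDS = {"da", "di", "al"}


def rank_candidates(clues, phrases, stopwords=STOPWORDS):
    """Score each non-clue word by how often it shares a phrase with a clue."""
    clue_set = set(clues)
    counts = Counter()
    for phrase in phrases:
        words = [w for w in phrase.split() if w not in stopwords]
        hits = clue_set.intersection(words)
        if not hits:
            continue
        for word in words:
            if word not in clue_set:
                counts[word] += len(hits)
    # Highest co-occurrence count first.
    return counts.most_common()


ranking = rank_candidates(["carta", "soldi", "pasta", "regalo"], PHRASES)
print(ranking[0])
```

On this toy data the word that co-occurs with the most clues comes first, which mirrors the "pacco" example from the opening slide.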

Results
● UNIOR4NLP reports a very high MRR: the
system is able to place the solution in the
first positions
● The Squadrone system takes more time to
solve games, so its MRR differs from its MRR (std)
System      MRR     MRR (std)  Solved
UNIOR4NLP   0.6428  0.6428     81.90%
Squadrone   0.0134  0.0350     25.71%

Comments
Reported results are remarkable but some
difficult games requiring inference are
unsolved:
● uno, notte, la trippa, auto, palazzo → portiere
○ uno is the number generally assigned to the
role of the goalkeeper (portiere)
○ “La Trippa” is the surname of “Antonio La
Trippa”, a character in the Italian movie “Gli
onorevoli” who works as the porter (portiere) of
a building

Conclusions
● Challenging task
● Good results when the solution is a
multiword expression
○ inference is hard to tackle
● Few participants
○ Is the task too difficult?
○ Do non-classification tasks attract few
participants?
● Mobile app “Ghigliottiniamo”
○ integrate your artificial player through REST API,
contact support@quiztime.io

Thank you!
Download our dataset from the GitHub
EVALITA 2018 repository
https://github.com/evalita2018/data
