Empirical Methods in Software Engineering - an Overview (alessio_ferrari)
A first introductory lecture on empirical methods in software engineering. It includes:
1) Motivation for empirical software engineering studies
2) How to define research questions
3) Measures and data collection methods
4) Formulating theories in software engineering
5) Software engineering research strategies
Find the videos at: https://www.youtube.com/playlist?list=PLSKM4VZcJjV-P3fFJYMu2OhlTjEr9Bjl0
Continuous Deployment and Testing Workshop from Better Software West (Cory Foy)
In this workshop from the 2015 SQE Better Software West conference, Cory Foy details the Continuous Paradigm companies are embracing - including Continuous Integration, Continuous Deployment, and Continuous Testing. This presentation was co-created by Jared Richardson.
Boost Your Base Bootcamp - [Online & Offline] In Bangla (Stack Learner)
Boost Your Base Bootcamp
Stack School:
https://courses.stackschool.co/courses/boost-your-base-bootcamp
"Boost Your Base Bootcamp [Online + Offline]": in this long course we will introduce you to the C programming language, Java, data structures and algorithms, design patterns, and problem solving. At the end of the bootcamp, you will find yourself in a place from which you can engage in any field of the IT world.
50+ Weeks, 100+ Classes - A Long Journey to Become A Programmer
Object-oriented programming, data structures, and algorithms.
To make yourself fit for the IT world, you need programming and computer science skills. In this long course we will introduce you to the C programming language, Java, data structures and algorithms, design patterns, and problem solving. Alongside hands-on teaching, this extensive course will aim to make you proficient through individual and group projects. By the end of this course you will find yourself in a place from which you can engage in any field of the IT world. This bootcamp will build your programming foundation.
Question Focus Recognition in Question Answering Systems (Waheeb Ahmed)
Question Answering (QA) systems are systems that attempt to answer questions posed by humans in natural language. Part of a QA system is the question processing module, which serves several tasks, including question classification and focus identification; both play a crucial role in QA systems. This paper describes and evaluates the techniques we developed for answer-type detection based on question classification and focus identification in Arabic QA systems. Question classification helps provide the type of the expected answer, directing the answer extraction module to apply the proper technique for extracting the answer, while focus identification helps in ranking the candidate answers; together they increase the accuracy of the answers produced by the QA system. The question processing module analyses questions to extract the information needed to identify what is being asked and how to approach answering it, and it is one of the most important components of a QA system. We therefore propose methods for solving two main problems in question analysis, namely question classification and focus extraction.
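The answer-type detection described above can be illustrated with a minimal rule-based sketch. The trigger patterns and the `classify`/`focus` helpers below are hypothetical stand-ins (the paper targets Arabic and uses its own rule set), assuming the common approach of mapping interrogative phrases to expected answer types:

```python
# Hypothetical interrogative-phrase -> answer-type rules, checked in order.
# These patterns are illustrative, not the paper's actual rule set.
RULES = [
    ("what percentage", "MEASURE"),
    ("how many", "MEASURE"),
    ("who", "PERSON"),
    ("when", "DATE"),
    ("where", "LOCATION"),
    ("why", "REASON"),
]

def classify(question: str) -> str:
    """Return the expected answer type for a question, or OTHER."""
    q = question.lower()
    for trigger, answer_type in RULES:
        if q.startswith(trigger) or f" {trigger} " in q:
            return answer_type
    return "OTHER"  # fall back when no pattern matches

def focus(question: str) -> str:
    """Naive focus heuristic: the token right after the wh-word."""
    tokens = question.rstrip("?").split()
    return tokens[1] if len(tokens) > 1 else ""
```

For example, `classify("What percentage of people in Italy relies on television for information?")` returns `"MEASURE"`, and `focus` on the same question returns `"percentage"`.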
Arabic is the 6th most widespread natural language in the world, with more than 350 million native speakers. Arabic question answering systems are gaining significance due to the increasing amount of Arabic unstructured content on the Internet and the increasing demand for information that regular information retrieval techniques do not satisfy. Question answering systems in general, and Arabic systems are no exception, hit an upper bound on performance due to the propagation of errors through their pipeline. This increases the significance of answer selection and validation systems, as they enhance the certainty and accuracy of question answering systems. Very few works have tackled the Arabic answer selection and validation problem, and they used the same question answering pipeline without any changes to satisfy the requirements of answer selection and validation, which is why they did not perform adequately in this task. In this dissertation, a new approach to Arabic answer selection and validation is presented through “ALQASIM”, a QA4MRE (Question Answering for Machine Reading Evaluation) system. ALQASIM analyzes the reading test documents instead of the questions, and utilizes sentence splitting, root expansion, and semantic expansion using an ontology built from the CLEF 2012 background collections. Our experiments were conducted on the test set provided by CLEF 2012 through the QA4MRE task. This approach led to a promising performance of 0.36 accuracy and 0.42 c@1, double the performance of the best performing Arabic QA4MRE system.
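The c@1 measure quoted above rewards a system for leaving a question unanswered rather than answering it wrongly: unanswered questions are credited at the system's observed accuracy. A small sketch of the standard formula (the counts in the usage note are illustrative, not the dissertation's):

```python
def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
    # c@1 = (nR + nU * (nR / n)) / n, where nR is the number of correctly
    # answered questions, nU the number left unanswered, and n the total.
    return (n_correct + n_unanswered * (n_correct / n_total)) / n_total
```

With no unanswered questions, c@1 reduces to plain accuracy, e.g. `c_at_1(36, 0, 100)` is 0.36; answering 40 correctly and abstaining on 20 out of 100 gives 0.48.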
Publications:
http://scholar.google.com/citations?user=XGJiEioAAAAJ&hl=en
https://aast.academia.edu/AhmedMagdy
When Development Met Test (Shift-Left Testing) (SangIn Choung)
Sharing my thoughts and cases about collaboration between test and development. Two big approaches.
One is the engineering approach:
1. Early testing education
2. Test design
3. Test code guide
4. Pair testing and pair programming
5. Test automation
The second is strategic activities:
1. Test strategy/plan
2. Test analysis/report
I also wanted to mention testers' various career paths.
Thank you.
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models (DataScienceConferenc1)
As many organizations are bundling large language models (LLMs) in their products, they face the problem of rigorous model selection. This talk gives a data-centric understanding of how LLMs are built and evaluated. We will discuss the limitations of current models and pay special attention to the available evaluation protocols. How do we distinguish good models from the others? What tasks and datasets should we try or avoid? How do we incorporate feedback from our users? We will present the guidelines the attendees can use in their future experiments.
This is a presentation I gave for the new interns at Duo Software, in which I highlight the pros and cons of being creative versus following widely used best practices in software development.
EVALUATION OF SINGLE-SPAN MODELS ON EXTRACTIVE MULTI-SPAN QUESTION-ANSWERING (IJwest)
Machine Reading Comprehension (MRC), particularly extractive close-domain question-answering, is a prominent field in Natural Language Processing (NLP). Given a question and a passage or set of passages, a machine must be able to extract the appropriate answer from the passage(s). However, the majority of these existing questions have only one answer, and more substantial testing on questions with multiple answers, or multi-span questions, has not yet been applied. Thus, we introduce a newly compiled dataset consisting of questions with multiple answers that originate from previously existing datasets. In addition, we run BERT-based models pre-trained for question-answering on our constructed dataset to evaluate their reading comprehension abilities. Runtime of base models on the entire dataset is approximately one day while the runtime for all models on a third of the dataset is a little over two days. Among the three of BERT-based models we ran, RoBERTa exhibits the highest consistent performance, regardless of size. We find that all our models perform similarly on this new, multi-span dataset compared to the single-span source datasets. While the models tested on the source datasets were slightly fine-tuned in order to return multiple answers, performance is similar enough to judge that task formulation does not drastically affect question-answering abilities. Our evaluations indicate that these models are indeed capable of adjusting to answer questions that require multiple answers. We hope that our findings will assist future development in question-answering and improve existing question-answering products and methods.
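One way to score a system on such multi-span questions is span-level F1 over the sets of extracted answers. The sketch below is a hedged illustration of that idea, not necessarily the exact scoring used in the paper:

```python
def multi_span_f1(predicted: set, gold: set) -> float:
    """Span-level F1: each answer span counts as one unit, and credit is
    given for predicted spans that exactly match a gold span."""
    if not predicted or not gold:
        # Both empty counts as a perfect match; one-sided empty scores 0.
        return float(predicted == gold)
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)
```

In practice, spans would be normalised (case, punctuation, articles) before comparison, as is common in extractive QA evaluation.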
This presentation is about a lecture I gave within the "Software systems and services" immigration course at the Gran Sasso Science Institute, L'Aquila (Italy): http://cs.gssi.it/.
http://www.ivanomalavolta.com
In this presentation, we see how artificial intelligence can be used in software engineering to develop projects faster, more efficiently, and at the best quality.
The habilitation thesis presents two main directions:
1. Exploiting data from social networks (Twitter, Facebook, Flickr, etc.) - creating resources for text and image processing (classification, retrieval, credibility, diversification, etc.);
2. Creating applications with new technologies: augmented reality (eLearning, games, smart museums, gastronomy, etc.), virtual reality (eLearning and games), speech processing with Amazon Alexa (eLearning, entertainment, IoT, etc.).
The work was validated with good results in evaluation campaigns like CLEF (Question Answering, Image CLEF, LifeCLEF, etc.), SemEval (Sentiment and Emotion in text, Anorexia, etc.).
After presenting the notion of augmented reality, the main areas of applicability are listed and some of the students' projects from the Faculty of Computer Science in Iasi are shown.
Epistemic Interaction - tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
UiPath Test Automation using UiPath Test Suite series, part 3 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf (91mobiles)
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Kubernetes & AI - Beauty and the Beast!?! @KCD Istanbul 2024 (Tobias Schneck)
As AI technology pushes into IT, I found myself wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations point of view. Is it possible to apply our lovely cloud-native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and provide you with a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need in order to apply it to our own infrastructure and make it work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I have already got working for real.
Essentials of Automations: Optimizing FME Workflows with Parameters (Safe Software)
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality (Inflectra)
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Securing your Kubernetes cluster: a step-by-step guide to success! (KatiaHIMEUR1)
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Question Answering on Romanian, English and French Languages
1. Question Answering on Romanian, English and French Languages
„Al. I. Cuza” University of Iași, Romania
Faculty of Computer Science
2. Introduction
System components
◦ Question analysis
◦ Index creation and information retrieval
◦ Answer extraction
Results
Applications of the QA system
◦ eLearning
◦ Robotics
◦ CriES 2010
Conclusions
4. System architecture (diagram): question analysis (tokenization and lemmatization; focus, keyword and named-entity identification; question classification) turns the initial questions into Lucene queries; information retrieval against a Lucene index built over the JRC-Acquis and EUROPARL corpora returns relevant snippets; dedicated modules for definition, reason and other answer extraction, supported by a Romanian grammar, produce the final answers.
5. Q1: What percentage of people in Italy relies on television for information?
<q q_id="0001" source_lang="EN" target_lang="RO">
  <string>Ce procent al populaţiei din Italia contează pe televiziune pentru a obţine informaţii</string>
  <focus>procent</focus>
  <verb>contează obţine</verb>
  <noun>populaţiei televiziune informaţii</noun>
  <nameEntities>Italia</nameEntities>
  <luceneQuery>procent~0.7 populaţiei~0.7 Italia^3 (contează^2 conta) televiziune~0.7 obţine informaţii~0.7</luceneQuery>
  <questionType>FACTOID</questionType> <!-- ~ 40 patterns -->
  <answerType>MEASURE</answerType> <!-- ~ 30 patterns -->
</q>
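The annotated question format shown on this slide is plain XML, so downstream components can consume it directly. A minimal sketch using Python's standard `xml.etree.ElementTree`; the `SAMPLE` string is a simplified, diacritics-free copy of the slide's example, and the field names follow the slide:

```python
import xml.etree.ElementTree as ET

# Simplified copy of the slide's annotated question (diacritics omitted).
SAMPLE = """<q q_id="0001" source_lang="EN" target_lang="RO">
  <string>Ce procent al populatiei din Italia conteaza pe televiziune</string>
  <focus>procent</focus>
  <nameEntities>Italia</nameEntities>
  <questionType>FACTOID</questionType>
  <answerType>MEASURE</answerType>
</q>"""

def parse_question(xml_text: str) -> dict:
    """Extract the annotated fields of one <q> element into a dict."""
    q = ET.fromstring(xml_text)
    return {
        "id": q.get("q_id"),
        "text": q.findtext("string"),
        "focus": q.findtext("focus"),
        "entities": (q.findtext("nameEntities") or "").split(),
        "question_type": q.findtext("questionType"),
        "answer_type": q.findtext("answerType"),
    }
```

For the sample above, `parse_question(SAMPLE)["answer_type"]` is `"MEASURE"` and `["entities"]` is `["Italia"]`.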
6. We used Lucene and created two indexes, one at paragraph level and one at document level. Using the Lucene queries and the indexes, the Lucene search engine extracts a ranked list of snippets for every question as possible answer candidates.
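The retrieval step can be sketched without Lucene itself. Below, a tiny pure-Python TF-IDF scorer stands in for the paragraph-level index and search; it is illustrative only (the actual system relies on Lucene's scoring and query operators):

```python
import math
from collections import Counter

def build_index(paragraphs):
    """Paragraph-level index: per-paragraph term frequencies plus
    document frequencies, loosely mirroring the Lucene index above."""
    tfs = [Counter(p.lower().split()) for p in paragraphs]
    df = Counter()
    for tf in tfs:
        df.update(tf.keys())
    return tfs, df, len(paragraphs)

def search(query, index, top_k=3):
    """Return indices of the top_k paragraphs by a simple TF-IDF score."""
    tfs, df, n = index
    terms = query.lower().split()
    scores = []
    for i, tf in enumerate(tfs):
        s = sum(tf[t] * math.log(1 + n / df[t]) for t in terms if t in tf)
        scores.append((s, i))
    scores.sort(reverse=True)
    return [i for s, i in scores[:top_k] if s > 0]
```

A real system would add lemmatization, fuzzy matching, and term boosting, as the Lucene queries on the earlier slide do.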
7. Answer extraction depends on the Lucene score; in addition, we built special modules to extract answers for questions of type DEFINITION, REASON-PURPOSE, PROCEDURE, OPINION.
Two threshold values:
◦ A higher one – the system offers many NOA answers; RA is affected, but c@1 is higher
◦ A lower one – the system offers only a few NOA answers; RA is higher, but c@1 is lower
9. eLearning – fast answers for ~30% of questions
Robotics – communication
CriES 2010 – identifying experts on Yahoo! Answers
10. Similar questions and answers (translated from Romanian):
Similar question | Answer | Ok?
Are there applications where using design patterns is not efficient? | Obviously (for example, a Hello World program)... | Go
What else could design patterns be applied to? | Finding the solution to a problem, creating an advanced programming language, writing documentation, discussions with colleagues at a software company. | Go
What is the difference between a pattern and a coding expression (idiom)? | I have not heard of the expression "coding idiom", but it seems something particular, whereas a design pattern is general... | Go
What is the difference between a pattern and classes? | A design pattern is a solution to a problem and is therefore composed of a hierarchy of classes with relations between them. | Go
Is a design pattern different from a pattern? Why was this name chosen? | A design pattern is a pattern in the software engineering domain. I don't know why this name was chosen... :) | Go
Do we use design patterns in the same application or in different applications? | In the same application. | Go
What is a design pattern? | First of all: a name, a problem and a solution | Go
Question-management view (screenshot residue, translated from Romanian): user questions are listed with answer, priority, status and details columns, e.g. "What are design patterns used for?" (priority: normal) and "Can exception handling in Java be considered an application of the Decorator pattern?" (priority: urgent), each with "Answer the question" / "Answer" actions and status flags.
11. With Swoogle we extend the knowledge base; the ontologies returned are then converted to AIML format and saved in the robot's memory.
12. CriES 2010 pipeline (diagram): starting from an initial digraph and the initial Yahoo! Answers collections (en, fr, ge, sp), stop words are eliminated to obtain domain keywords and question keywords; from the relevant words for questions and for domains, a similarity score between questions and domains is computed (Runs 0, 1 and 2).
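The similarity score between questions and domains can be sketched as keyword-set overlap. The `similarity` and `best_domain` helpers below are illustrative assumptions, not the system's actual scoring:

```python
def similarity(question_keywords, domain_keywords):
    """Overlap coefficient between question and domain keyword sets,
    one plausible instantiation of the question-domain similarity step."""
    q, d = set(question_keywords), set(domain_keywords)
    if not q or not d:
        return 0.0
    return len(q & d) / min(len(q), len(d))

def best_domain(question_keywords, domains):
    """domains: mapping of domain name -> keyword list; returns the
    domain whose keywords best overlap the question keywords."""
    return max(domains,
               key=lambda name: similarity(question_keywords, domains[name]))
```

In the pipeline above, the keyword sets would come from the stop-word-filtered Yahoo! Answers collections and user questions.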
13. The UAIC QA system evolved over time (from 9% in 2006 to 47.5% in 2010).
The main problem is related to the quality and quantity of the Romanian resources involved.
At present we are concerned with using QA components in other applications in order to improve their capabilities.