Porting the QALL-ME framework to Romanian

                    Constantin Orăsan

           Research Group in Computational Linguistics
    Research Institute in Information and Language Processing
                   University of Wolverhampton
                http://www.wlv.ac.uk/~in6093/


                      29th March 2010
Structure of the presentation



1 Introduction


2 The QALL-ME project


3 Multilingual information access in QALL-ME


4 Conclusions
Need to access information




• as a result of the Internet development more and more
  information becomes available
• this information is in many languages
• fields from computational linguistics such as automatic
  summarisation, question answering, text mining, etc. can help
  people deal with information
Question answering (QA)



• Question answering aims at identifying the answer to a
  question in a large collection of documents
• the information provided by QA is more focused than
  information retrieval
• the output can be the exact answer or a text snippet which
  contains the answer
• the field took off after the QA track was introduced at TREC,
  and cross-lingual QA after it was introduced at CLEF
Types of QA systems

• open-domain QA systems: can answer any question from any
  collection
  + can potentially answer any question
  - very low accuracy (especially in cross-lingual settings)
• canned QA systems: rely on a very large repository of
  questions for which the answer is known
  + very little processing necessary
  - limited to the answers in the database
• closed-domain QA systems: are built for very specific domains
  and exploit expert knowledge in them
  + very high accuracy
  - can require extensive language processing and is limited to
    one domain
Purpose of the presentation




• briefly present the QALL-ME project
• show how it was adapted to answer questions in Romanian
  about movies
The QALL-ME project


• QALL-ME = Question Answering Learning technologies in a
  multiLingual and Multimodal Environment
• EU-funded project, part of FP6
• 7 partners:
    • FBK-irst, Italy
    • University of Wolverhampton, UK
    • University of Alicante, Spain
    • DFKI, Germany
    • Comdata, Italy
    • UbiEST, Italy
    • WayCom, Italy
• Web page: http://qallme.fbk.eu
The QALL-ME project



• aimed at establishing a shared infrastructure for multilingual
  and multimodal QA in the domain of tourism
• In the QALL-ME system
     • users ask natural language questions in several languages (in
       both text and speech) using a variety of input devices
       (e.g. mobile phones), and
     • the system returns a list of specific answers in the most
       appropriate modality: short texts, maps, videos or
       pictures.
[Figure: QALL-ME architecture. Components shown: local information
sources with their semantic representation; a service provider with
the QALL-ME central QA planner; English, German, Spanish and Italian
answer extractors; question type ontology, answer type ontology,
speech recognizers and dialog models.]
Main outputs of the project




  • an ontology for the domain of tourism
  • an entailment-based QA framework
  • the QALL-ME benchmark
  • an entailment framework

(all accessible from the project’s web page:
http://qallme.fbk.eu)
The ontology



• A domain-specific ontology for the tourism domain was
  developed and shared among all the partners.
• The ontology was used to serve as:
    • bridge between different languages
    • communication language between different components of the
      system
• The ontology was linked to domain-independent ontologies
  such as MultiWordNet and SUMO
• For more information see (Ou et al., 2008)
Design of the ontology



• Analysis of data from content providers
• Analysis of user requirements
• Inspired by similar ontologies:
     • Harmonise and eTourism: focus on static information (e.g.
       accommodation and events/activities)
     • Like eTourism, it is written in OWL rather than RDFS
     • but it has wider coverage
• Introspection
The ontology



• Main classes: Country, Destination, Site (i.e.
  Accommodation, Attraction, Gastro, and Infrastructure),
  Transportation, EventContent and Event
• Element classes: Facility, Room, PersonOrganization,
  Language, and Currency
• Attribute classes: Contact, Location, Period and Price.

• Element and attribute classes cannot exist independently and
  have to be attached to other main or element classes
[Figure: fragment of the ontology covering cinemas and movies.
Classes shown: Site, Cinema, CinemaRoom, SiteFacility, RoomFacility,
Contact, PostalAddress, GPSCoordinate, DirectionLocation, Event,
MovieShow, EventContent, Movie, Director, Producer, Star, Writer,
Price, TicketPrice, Currency, Period, TimePeriod, DatePeriod,
DateTimePeriod.
Relations shown: subClassOf, isInSite, hasPrice, hasCurrency,
hasPeriod, hasTimePeriod, hasDatePeriod, hasEventContent,
hasDirector, hasProducer, hasStar, hasWriter, hasContact, hasRoom,
hasSiteFacility, hasRoomFacility, hasPostalAddress, hasGPSCoordinate.
Attributes shown: name, description, synopsis, genre, certificate,
priceType, priceValue, startTime, endTime, startDate, endDate.]
The ontology


• Encoded using OWL DL, since it has more expressive power
  than OWL Lite and has more efficient reasoning support than
  OWL Full
• Used Protege-OWL as the editor and RacerPro7 as the
  reasoner
• The ontology contains
    • 122 classes (concepts),
    • 55 datatype properties and
    • 52 object properties which indicate the relationships among
      the 122 classes.
    • 15 top-level classes.
• The class hierarchy has a maximum depth of 4.
The QALL-ME framework



• is an architecture skeleton for multilingual QA systems for
  closed domains
• designed in such a way that it allows fast development of
  closed domain QA systems
• freely available from http://qallme.sourceforge.net/
• is based on a Service Oriented Architecture (SOA) which is
  realised using web services
• relies on textual entailment recognisers
Web services
1   Context providers: are used to anchor questions in space
    and time
2   Annotators: Currently three types of annotators are
    available:
      • named entity annotators which identify names of cinemas,
        movies, persons, etc.
      • term annotators which identify hotel facilities, movie genres
        and other domain-specific terminology
      • temporal annotators that are used to recognise and normalise
        temporal expressions in user questions
3   Entailment engine: determines whether a user question
    entails a retrieval procedure
4   Query generator: relies on the entailment engine to
    generate a query that extracts the answer.
5   Answer pool: retrieves the answers from a database.
Context providers



• are used to anchor a question in space and time
• return the current position and time
• used by the presentation module when maps are displayed
• used by the temporal annotator to normalise temporal expressions
• determine which services are used in a cross-lingual scenario
• can be static or determined from a mobile phone
Named entity and term annotators

• named entity recogniser = identifies names of hotels, movies,
  persons, etc.
• term annotator = identifies domain specific terms such as
  hotel facilities, movie genres, etc.
• the entities and terms are known, so the task is reduced to a
  database look up
• Gazetteers are the main source for determining the entities
• The annotation module needs to determine the canonical form
  of an entity
• matching uses character-based similarity, a modified TF*IDF
  and a greedy algorithm
• overlapping annotations are not allowed, and there are few
  ambiguities
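The gazetteer lookup described above can be sketched roughly as follows. This is only an illustration, not the QALL-ME implementation: the gazetteer entries, the `annotate` function, the 0.8 threshold and the use of `difflib.SequenceMatcher` as the character-based similarity are all assumptions, and the modified TF*IDF weighting is left out.

```python
from difflib import SequenceMatcher

# Hypothetical gazetteer: canonical form -> entity type.
GAZETTEER = {
    "becoming jane": "MOVIE",
    "saw iii": "MOVIE",
    "wolverhampton": "DESTINATION",
}

def similarity(a, b):
    """Character-based similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()

def annotate(question, threshold=0.8):
    """Greedy annotation: longest spans first, no overlapping spans."""
    tokens = question.lower().rstrip("?").split()
    used = [False] * len(tokens)
    found = []
    max_len = max(len(entry.split()) for entry in GAZETTEER)
    for n in range(max_len, 0, -1):
        # Only compare a span with entries of the same length in words.
        candidates = [e for e in GAZETTEER if len(e.split()) == n]
        if not candidates:
            continue
        for i in range(len(tokens) - n + 1):
            if any(used[i:i + n]):
                continue  # span would overlap an earlier annotation
            span = " ".join(tokens[i:i + n])
            best = max(candidates, key=lambda e: similarity(span, e))
            if similarity(span, best) >= threshold:
                found.append((i, span, best, GAZETTEER[best]))
                used[i:i + n] = [True] * n
    return sorted(found)
```

Each result is a tuple of token position, surface string, canonical form and entity type, so a misspelt title can still be mapped to its gazetteer entry.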
Named entity and term annotators


• Annotates both standard and non-standard entities: cinema,
  movie, location, genre, certificate
• Needs to deal with noisy input:
    • misspelt words/input from ASR engines/SMS input e.g.
       becaming Jane, becoming Jade
    • free word order (Will Smith / Smith, Will)
    • equivalent strings (saw III / three / 3; Smith, Will / Smith,
       W.)
• Needs to deal with questions in mixed languages
• Needs to deal with ambiguous entities
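One way to handle free word order and equivalent strings is to map every variant to a single canonical string before the lookup. The sketch below is an assumption about how this could be done, not a description of the QALL-ME code: `canonical_form` and the small `NUMBER_FORMS` table are hypothetical.

```python
import re

# Hypothetical table of number variants; a real system needs a fuller one.
NUMBER_FORMS = {"ii": "2", "two": "2", "iii": "3", "three": "3"}

def canonical_form(name):
    """Map surface variants of a name to one canonical string."""
    name = name.strip().lower()
    # "Smith, Will" -> "will smith"
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    tokens = re.findall(r"[a-z0-9]+", name)
    # "iii" / "three" -> "3"
    return " ".join(NUMBER_FORMS.get(token, token) for token in tokens)
```

With this, "Will Smith" and "Smith, Will" share one form, as do "saw III", "saw three" and "saw 3". Initials such as "Smith, W." are not resolved by this sketch and would need an extra matching step.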
Temporal annotator


• questions from the domain of tourism contain a large number
  of temporal expressions
• we use a simplified version of the tagger implemented by
  Pușcașu (2004)
• the simplification was done to reduce the processing time
  (Varga, Pușcașu, and Orăsan, 2009)
• identifies both self-contained temporal expressions (TEs) and
  indexical/under-specified TEs
• uses the TIMEX2 standard
• the output is used by the TIMEX2SPARQL service to restrict the
  extracted answers
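Indexical expressions such as "tonight" only get a value once they are resolved against the time supplied by the context provider. A minimal sketch of that step, assuming a hypothetical `normalise` function that covers just three expressions and returns TIMEX2-style VAL strings (with `TNI` as the night-time suffix):

```python
from datetime import datetime, timedelta

def normalise(expression, now):
    """Return a TIMEX2-style VAL for a few indexical expressions,
    resolved against the context time `now`."""
    expression = expression.lower().strip()
    if expression == "today":
        return now.strftime("%Y-%m-%d")
    if expression == "tonight":
        # Same calendar day, restricted to the night part of the day.
        return now.strftime("%Y-%m-%d") + "TNI"
    if expression == "tomorrow":
        return (now + timedelta(days=1)).strftime("%Y-%m-%d")
    return None  # not covered by this sketch
```

The real tagger covers far more expressions; the point here is only that the same word yields a different value depending on the context time.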
Entailment engine

• often closed-domain QA systems transform a question to a
  Prolog fact or SQL query
• often this solution works only partially due to language
  variability
• in QALL-ME this problem is solved using textual entailment
• the entailment engine determines whether two questions entail
  the same meaning, in which case they share the same retrieval
  procedure:
    • T is the input question
    • H is a textual pattern stored in a repository
    • textual patterns have associated SPARQL retrieval procedures
• we calculate the similarity between the two sentences to
  determine whether an entailment relation holds between them
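A similarity-based decision of this kind could look like the sketch below. The slides do not say which similarity measure is used, so everything here is an assumption: the measure is plain word overlap (Jaccard), a pattern is only considered if it has the same slots as the question, and `select_pattern` and the 0.2 threshold are hypothetical.

```python
import re

SLOT = re.compile(r"\[[A-Z]+\]")

def words(text):
    """Lower-cased word set, with slots and punctuation removed."""
    return set(re.findall(r"[a-z]+", SLOT.sub(" ", text).lower()))

def similarity(t, h):
    """Jaccard overlap between the word sets of T and H."""
    a, b = words(t), words(h)
    return len(a & b) / len(a | b) if a | b else 0.0

def select_pattern(question, patterns, threshold=0.2):
    """Return the pattern H judged to be entailed by T, or None."""
    slots = sorted(SLOT.findall(question))
    # A pattern can only be filled if it has the same slots as T.
    candidates = [p for p in patterns if sorted(SLOT.findall(p)) == slots]
    if not candidates:
        return None
    best = max(candidates, key=lambda p: similarity(question, p))
    return best if similarity(question, best) >= threshold else None
```

Because the measure uses no language-specific resources, the same code can be applied to patterns in any language, at the cost of missing paraphrases that share few words.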
Query generation service



• produces a SPARQL query that can be used to answer the
  question
• has a list of question templates with their associated SPARQL
  queries
• relies on the entailment engine to determine which of the
  question patterns entail the same meaning as the user
  question
• fills in the slots of the question patterns
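Slot filling can be illustrated with a query template whose placeholders are replaced by the annotated values. This is a sketch under assumptions: the property names `hasEventContent`, `isInSite`, `hasPeriod`, `name` and `startDate` appear in the ontology, but the way they are combined here, the `q:` namespace, the `inDestination` property and the `fill_slots` function are hypothetical.

```python
# Hypothetical retrieval procedure for the pattern
# "What movies are on in [DESTINATION] [TIMEX]?"
QUERY_TEMPLATE = """\
PREFIX q: <http://example.org/qallme#>
SELECT DISTINCT ?title WHERE {
  ?show q:hasEventContent ?movie ;
        q:isInSite ?cinema ;
        q:hasPeriod ?period .
  ?movie q:name ?title .
  ?cinema q:inDestination [DESTINATION] .
  ?period q:startDate [TIMEX] .
}"""

def fill_slots(template, slots):
    """Replace each [SLOT] placeholder with an escaped string literal."""
    query = template
    for name, value in slots.items():
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        query = query.replace(f"[{name}]", f'"{escaped}"')
    return query
```

Calling `fill_slots(QUERY_TEMPLATE, {"DESTINATION": "Wolverhampton", "TIMEX": "2010-03-29"})` yields a query with both placeholders replaced by quoted literals, ready to be passed to the answer pool.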
Example
User question (T): What movie can I see tonight in
Wolverhampton? → What movie can I see [TIMEX] in
[DESTINATION]?


List of patterns (H):
  • Who is the director of [MOVIE]?
  • Where can I see [MOVIE] [TIMEX]?
  • What movies are on in [DESTINATION] [TIMEX]?
  • What is the address of [CINEMA]?
  • ...



Select the retrieval procedure associated with the matching pattern,
here: What movies are on in Wolverhampton tonight?
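The first step of the example, turning the user question into a slotted form, can be sketched as below. The `abstract_question` function and the shape of the annotations (surface string plus label) are assumptions made for illustration; in the framework the spans would come from the annotators.

```python
def abstract_question(question, annotations):
    """Replace each annotated span with its slot label.

    `annotations` is a list of (surface_text, label) pairs.
    Returns the slotted question and the slot values.
    """
    for surface, label in annotations:
        question = question.replace(surface, f"[{label}]", 1)
    values = {label: surface for surface, label in annotations}
    return question, values
```

For the question above with the annotations `("tonight", "TIMEX")` and `("Wolverhampton", "DESTINATION")`, this gives the slotted question that is compared with the stored patterns, while the values are kept for filling the query.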
Answer Pool service




• takes the SPARQL query generated by the query generator
  and extracts the answer
• SPARQL is a query language for accessing RDF graphs, developed
  by the W3C RDF Data Access Working Group
• SPARQL provides interoperability between languages
Cross-lingual QA




• the QALL-ME tourism prototype is designed to allow both
  monolingual and cross-lingual QA
• relevant web services are activated depending on the source
  and target language
• user scenario: Romanian tourist in UK who wants to find out
  more about the movies in Wolverhampton
Prototype for Romanian


• we wanted to find out how long it takes to develop a demo for
  Romanian
• components had to be adapted:
    • named entity and term annotators had to be trained on a
      different list of entities
    • a simple temporal annotator was implemented on the basis of
      the English one
    • the language independent similarity entailment engine was used
    • the question patterns were translated to Romanian
    • the answer pool did not require any change
• the whole process took under one week
Romanian demo




http://qallme.wlv.ac.uk:8080/QALL-ME-web-demo/index.jsp
Conclusions




• multilinguality is a very important issue for the QALL-ME
  project
• the ontology constitutes the bridge between languages
• the QALL-ME framework can be used to quickly develop
  prototypes for other languages
Thank you!
References
Ou, Shiyan, Viktor Pekar, Constantin Orăsan, Christian Spurk, and Matteo Negri.
2008. Development and alignment of a domain-specific ontology for question
answering. In European Language Resources Association (ELRA), editor, Proceedings
of the Sixth International Language Resources and Evaluation (LREC’08), Marrakech,
Morocco, May 28 – 30.
Pușcașu, Georgiana. 2004. A framework for temporal resolution. In Proceedings of
the 4th Conference on Language Resources and Evaluation (LREC 2004), Lisbon,
Portugal, May 26 – 28.
Varga, Andrea, Georgiana Pușcașu, and Constantin Orăsan. 2009. Identification of
temporal expressions in the domain of tourism. In Knowledge Engineering: Principles
and Techniques, volume 1, pages 29 – 32, Cluj-Napoca, Romania, July 2 – 4.

More Related Content

Viewers also liked

Milieu
MilieuMilieu
Milieu
yoeri.torel
 
Fond memories of Zanzibar
Fond memories of ZanzibarFond memories of Zanzibar
Fond memories of Zanzibar
Heena Modi
 
Developing Cocoa Applications with macRuby
Developing Cocoa Applications with macRubyDeveloping Cocoa Applications with macRuby
Developing Cocoa Applications with macRuby
Brendan Lim
 
Linkedin power point
Linkedin power pointLinkedin power point
Linkedin power point
Natasha Margolina
 
Lecture 02 - DSA
Lecture 02 - DSALecture 02 - DSA
Lecture 02 - DSA
Haitham El-Ghareeb
 
Jean Fares Couture BIO
Jean Fares Couture BIO Jean Fares Couture BIO
Jean Fares Couture BIO
Norma HAYEK
 
Linked In Presentation
Linked In PresentationLinked In Presentation
Linked In Presentation
Benaud Jacob
 
Kansas sights
Kansas sightsKansas sights
Kansas sights
lorie.schaller
 
IOS-Basic Configuration
IOS-Basic ConfigurationIOS-Basic Configuration
IOS-Basic Configuration
Haitham El-Ghareeb
 
Software Testing Services
Software Testing ServicesSoftware Testing Services
Software Testing Services
Fuad Mak
 
Art Mini Portfolio
Art Mini PortfolioArt Mini Portfolio
Art Mini Portfolio
zbent
 
Prem Ni Parab
Prem Ni ParabPrem Ni Parab
Prem Ni Parab
Heena Modi
 
Subtraction problem
Subtraction problemSubtraction problem
Subtraction problem
Heena Modi
 
24 Tirthankaras
24 Tirthankaras24 Tirthankaras
24 Tirthankaras
Heena Modi
 
Interview with Warren Buffet
Interview with Warren BuffetInterview with Warren Buffet
Interview with Warren Buffet
Heena Modi
 
Fear Factor with Outsourcing
Fear Factor with OutsourcingFear Factor with Outsourcing
Fear Factor with Outsourcing
Benaud Jacob
 
Way out cafe - amazing vegan desserts!
Way out cafe - amazing vegan desserts!Way out cafe - amazing vegan desserts!
Way out cafe - amazing vegan desserts!
Heena Modi
 

Viewers also liked (19)

Milieu
MilieuMilieu
Milieu
 
Milieu
MilieuMilieu
Milieu
 
Fond memories of Zanzibar
Fond memories of ZanzibarFond memories of Zanzibar
Fond memories of Zanzibar
 
Developing Cocoa Applications with macRuby
Developing Cocoa Applications with macRubyDeveloping Cocoa Applications with macRuby
Developing Cocoa Applications with macRuby
 
Linkedin power point
Linkedin power pointLinkedin power point
Linkedin power point
 
Lecture 02 - DSA
Lecture 02 - DSALecture 02 - DSA
Lecture 02 - DSA
 
Jean Fares Couture BIO
Jean Fares Couture BIO Jean Fares Couture BIO
Jean Fares Couture BIO
 
Linked In Presentation
Linked In PresentationLinked In Presentation
Linked In Presentation
 
Iso dinkes
Iso dinkesIso dinkes
Iso dinkes
 
Kansas sights
Kansas sightsKansas sights
Kansas sights
 
IOS-Basic Configuration
IOS-Basic ConfigurationIOS-Basic Configuration
IOS-Basic Configuration
 
Software Testing Services
Software Testing ServicesSoftware Testing Services
Software Testing Services
 
Art Mini Portfolio
Art Mini PortfolioArt Mini Portfolio
Art Mini Portfolio
 
Prem Ni Parab
Prem Ni ParabPrem Ni Parab
Prem Ni Parab
 
Subtraction problem
Subtraction problemSubtraction problem
Subtraction problem
 
24 Tirthankaras
24 Tirthankaras24 Tirthankaras
24 Tirthankaras
 
Interview with Warren Buffet
Interview with Warren BuffetInterview with Warren Buffet
Interview with Warren Buffet
 
Fear Factor with Outsourcing
Fear Factor with OutsourcingFear Factor with Outsourcing
Fear Factor with Outsourcing
 
Way out cafe - amazing vegan desserts!
Way out cafe - amazing vegan desserts!Way out cafe - amazing vegan desserts!
Way out cafe - amazing vegan desserts!
 

Similar to Porting the QALL-ME framework to Romanian

Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech RecognitionTeaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
Zachary S. Brown
 
II-SDV 2017: Localizing International Content for Search, Data Mining and Ana...
II-SDV 2017: Localizing International Content for Search, Data Mining and Ana...II-SDV 2017: Localizing International Content for Search, Data Mining and Ana...
II-SDV 2017: Localizing International Content for Search, Data Mining and Ana...
Dr. Haxel Consult
 
Curation Technologies for Multilingual Europe
Curation Technologies for Multilingual EuropeCuration Technologies for Multilingual Europe
Curation Technologies for Multilingual Europe
Georg Rehm
 
Text mining and Visualizations
Text mining  and VisualizationsText mining  and Visualizations
Text mining and Visualizations
Kasturi SR Narayana Murthy
 
IRJET - Language Linguist using Image Processing on Intelligent Transport Sys...
IRJET - Language Linguist using Image Processing on Intelligent Transport Sys...IRJET - Language Linguist using Image Processing on Intelligent Transport Sys...
IRJET - Language Linguist using Image Processing on Intelligent Transport Sys...
IRJET Journal
 
COBWEB Authentication Workshop
COBWEB Authentication WorkshopCOBWEB Authentication Workshop
COBWEB Authentication Workshop
EDINA, University of Edinburgh
 
Content Processing Architecture and Applications - Introduction to Text Mining
Content Processing Architecture and Applications - Introduction to Text MiningContent Processing Architecture and Applications - Introduction to Text Mining
Content Processing Architecture and Applications - Introduction to Text Mining
Findwise
 
Localize your business - Software Localization Services LocServ
Localize your business - Software Localization Services LocServLocalize your business - Software Localization Services LocServ
Localize your business - Software Localization Services LocServ
Softengi
 
LocServ - presentation of great localization and internationalization services
LocServ - presentation of great localization and internationalization servicesLocServ - presentation of great localization and internationalization services
LocServ - presentation of great localization and internationalization services
LocServ
 
traffic sign detection using deep learning.pptx
traffic sign detection using deep learning.pptxtraffic sign detection using deep learning.pptx
traffic sign detection using deep learning.pptx
brijeshbs2
 
Plone at Harvard School of Engineering and Applied Sciences
Plone at Harvard School of Engineering and Applied SciencesPlone at Harvard School of Engineering and Applied Sciences
Plone at Harvard School of Engineering and Applied Sciences
Jazkarta, Inc.
 
DaViT.pdf
DaViT.pdfDaViT.pdf
DaViT.pdf
ShahidJabbar10
 
Mobile Multi-domain Search over Structured Web Data
Mobile Multi-domain Search over Structured Web DataMobile Multi-domain Search over Structured Web Data
Mobile Multi-domain Search over Structured Web Data
AtakanAral
 
Linking Services and Linked Data: Keynote for AIMSA 2012
Linking Services and Linked Data: Keynote for AIMSA 2012Linking Services and Linked Data: Keynote for AIMSA 2012
Linking Services and Linked Data: Keynote for AIMSA 2012
John Domingue
 
44 language resources for computer assisted translation
44 language resources for computer assisted translation44 language resources for computer assisted translation
44 language resources for computer assisted translation
AEGIS-ACCESSIBLE Projects
 
Adaptive streaming for immersive communication
Adaptive streaming for immersive communicationAdaptive streaming for immersive communication
Adaptive streaming for immersive communication
Silvia Rossi
 
Using Deep Learning at Scale - Guhan Suriyanarayanan and Adi Oltean, Microsoft
Using Deep Learning at Scale - Guhan Suriyanarayanan and Adi Oltean, MicrosoftUsing Deep Learning at Scale - Guhan Suriyanarayanan and Adi Oltean, Microsoft
Using Deep Learning at Scale - Guhan Suriyanarayanan and Adi Oltean, Microsoft
Guhan Suriyanarayanan
 
Denovo SIP VoIP Termination SBC Session Boarder Controler @ denofolab.com
Denovo SIP VoIP Termination SBC Session Boarder Controler @ denofolab.comDenovo SIP VoIP Termination SBC Session Boarder Controler @ denofolab.com
Denovo SIP VoIP Termination SBC Session Boarder Controler @ denofolab.com
Anne Kwong
 
Steven Ramage: THE LANGUAGE OF BUSINESS
Steven Ramage: THE LANGUAGE OF BUSINESSSteven Ramage: THE LANGUAGE OF BUSINESS
Steven Ramage: THE LANGUAGE OF BUSINESS
AGI Geocommunity
 
Semantically-aware Networks and Services for Training and Knowledge Managemen...
Semantically-aware Networks and Services for Training and Knowledge Managemen...Semantically-aware Networks and Services for Training and Knowledge Managemen...
Semantically-aware Networks and Services for Training and Knowledge Managemen...
Gilbert Paquette
 

Similar to Porting the QALL-ME framework to Romanian (20)

Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech RecognitionTeaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
 
II-SDV 2017: Localizing International Content for Search, Data Mining and Ana...
II-SDV 2017: Localizing International Content for Search, Data Mining and Ana...II-SDV 2017: Localizing International Content for Search, Data Mining and Ana...
II-SDV 2017: Localizing International Content for Search, Data Mining and Ana...
 
Curation Technologies for Multilingual Europe
Curation Technologies for Multilingual EuropeCuration Technologies for Multilingual Europe
Curation Technologies for Multilingual Europe
 
Text mining and Visualizations
Text mining  and VisualizationsText mining  and Visualizations
Text mining and Visualizations
 
IRJET - Language Linguist using Image Processing on Intelligent Transport Sys...
IRJET - Language Linguist using Image Processing on Intelligent Transport Sys...IRJET - Language Linguist using Image Processing on Intelligent Transport Sys...
IRJET - Language Linguist using Image Processing on Intelligent Transport Sys...
 
COBWEB Authentication Workshop
COBWEB Authentication WorkshopCOBWEB Authentication Workshop
COBWEB Authentication Workshop
 
Content Processing Architecture and Applications - Introduction to Text Mining
Content Processing Architecture and Applications - Introduction to Text MiningContent Processing Architecture and Applications - Introduction to Text Mining
Content Processing Architecture and Applications - Introduction to Text Mining
 
Localize your business - Software Localization Services LocServ
Localize your business - Software Localization Services LocServLocalize your business - Software Localization Services LocServ
Localize your business - Software Localization Services LocServ
 
LocServ - presentation of great localization and internationalization services
LocServ - presentation of great localization and internationalization servicesLocServ - presentation of great localization and internationalization services
LocServ - presentation of great localization and internationalization services
 
traffic sign detection using deep learning.pptx
traffic sign detection using deep learning.pptxtraffic sign detection using deep learning.pptx
traffic sign detection using deep learning.pptx
 
Plone at Harvard School of Engineering and Applied Sciences
Plone at Harvard School of Engineering and Applied SciencesPlone at Harvard School of Engineering and Applied Sciences
Plone at Harvard School of Engineering and Applied Sciences
 
DaViT.pdf
DaViT.pdfDaViT.pdf
DaViT.pdf
 
Mobile Multi-domain Search over Structured Web Data
Mobile Multi-domain Search over Structured Web DataMobile Multi-domain Search over Structured Web Data
Mobile Multi-domain Search over Structured Web Data
 
Linking Services and Linked Data: Keynote for AIMSA 2012
Linking Services and Linked Data: Keynote for AIMSA 2012Linking Services and Linked Data: Keynote for AIMSA 2012
Linking Services and Linked Data: Keynote for AIMSA 2012
 
44 language resources for computer assisted translation
44 language resources for computer assisted translation44 language resources for computer assisted translation
44 language resources for computer assisted translation
 
Adaptive streaming for immersive communication
Adaptive streaming for immersive communicationAdaptive streaming for immersive communication
Adaptive streaming for immersive communication
 
Using Deep Learning at Scale - Guhan Suriyanarayanan and Adi Oltean, Microsoft
Using Deep Learning at Scale - Guhan Suriyanarayanan and Adi Oltean, MicrosoftUsing Deep Learning at Scale - Guhan Suriyanarayanan and Adi Oltean, Microsoft
Using Deep Learning at Scale - Guhan Suriyanarayanan and Adi Oltean, Microsoft
 
Denovo SIP VoIP Termination SBC Session Boarder Controler @ denofolab.com
Denovo SIP VoIP Termination SBC Session Boarder Controler @ denofolab.comDenovo SIP VoIP Termination SBC Session Boarder Controler @ denofolab.com
Denovo SIP VoIP Termination SBC Session Boarder Controler @ denofolab.com
 
Steven Ramage: THE LANGUAGE OF BUSINESS
Steven Ramage: THE LANGUAGE OF BUSINESSSteven Ramage: THE LANGUAGE OF BUSINESS
Steven Ramage: THE LANGUAGE OF BUSINESS
 
Semantically-aware Networks and Services for Training and Knowledge Managemen...
Semantically-aware Networks and Services for Training and Knowledge Managemen...Semantically-aware Networks and Services for Training and Knowledge Managemen...
Semantically-aware Networks and Services for Training and Knowledge Managemen...
 

Porting the QALL-ME framework to Romanian

  • 1. Porting the QALL-ME framework to Romanian. Constantin Orăsan, Research Group in Computational Linguistics, Research Institute in Information and Language Processing, University of Wolverhampton, http://www.wlv.ac.uk/~in6093/ 29th March 2010
  • 3. Structure of the presentation 1 Introduction 2 The QALL-ME project 3 Multilingual information access in QALL-ME 4 Conclusions
  • 4. Need to access information • as a result of the Internet development more and more information becomes available • this information is in many languages • fields from computational linguistics such as automatic summarisation, question answering, text mining, etc. can help people deal with information
  • 6. Question answering (QA) • Question answering aims at identifying the answer to a question in a large collection of documents • the information provided by QA is more focused than that returned by information retrieval • the output can be the exact answer or a text snippet which contains the answer • the field took off after the introduction of the QA track at TREC, and cross-lingual QA after the introduction of CLEF
  • 9. Types of QA systems • open-domain QA systems: can answer any question from any collection + can potentially answer any question - very low accuracy (especially in cross-lingual settings) • canned QA systems: rely on a very large repository of questions for which the answer is known + very little processing necessary - limited to the answers in the database • closed-domain QA systems: are built for very specific domains and exploit expert knowledge in them + very high accuracy - can require extensive language processing and limited to one domain
  • 11. Purpose of the presentation • briefly present the QALL-ME project • show how it was adapted to answer questions in Romanian about movies
  • 12. Structure of the presentation 1 Introduction 2 The QALL-ME project 3 Multilingual information access in QALL-ME 4 Conclusions
  • 13. The QALL-ME project • QALL-ME = Question Answering Learning technologies in a multiLingual and Multimodal Environment • EU-funded project part of FP6 • 7 partners: • FBK-irst, Italy • University of Wolverhampton, UK • University of Alicante, Spain • DFKI, Germany • Comdata, Italy • UbiEST, Italy • WayCom, Italy • Web page: http://qallme.fbk.eu
  • 14. The QALL-ME project • aimed at establishing a shared infrastructure for multilingual and multimodal QA in the domain of tourism • In the QALL-ME system • users ask natural language questions in several languages (in both text and speech) using a variety of input devices (e.g. mobile phones), and • the system returns a list of specific answers formatted in the most appropriate modality, ranging from short texts to maps, videos, and pictures.
  • 15. [Figure: architecture of the QALL-ME system. Components shown: local information sources, semantic representation, service provider, the QALL-ME central QA planner, answer extractors for English, German, Spanish and Italian, question type ontology, answer type ontology, speech recognizers, dialog models]
  • 16. Main outputs of the project • an ontology for the domain of tourism • an entailment-based QA framework • the QALL-ME benchmark • an entailment framework (all accessible from the project's web page: http://qallme.fbk.eu)
  • 17. The ontology • A domain-specific ontology for the tourism domain was developed and shared among all the partners. • The ontology serves as: • a bridge between different languages • a communication language between different components of the system • The ontology was linked to domain-independent ontologies such as MultiWordNet and SUMO • For more information see (Ou et al., 2008)
  • 18. Design of the ontology • Analysis of data from content providers • Analysis of user requirements • Inspired by similar ontologies: • Harmonise and eTourism: focus on static information (e.g. accommodation and events/activities) • Like eTourism, it is written in OWL rather than RDFS • but has wider coverage • Introspection
  • 19. The ontology • Main classes: Country, Destination, Site (i.e. Accommodation, Attraction, Gastro, and Infrastructure), Transportation, EventContent and Event • Element classes: Facility, Room, PersonOrganization, Language, and Currency • Attribute classes: Contact, Location, Period and Price. • Element and attribute classes cannot exist independently and have to be attached to other main or element classes
  • 20. [Figure: fragment of the ontology covering cinemas and movies. Classes shown: Site, Cinema, CinemaRoom, SiteFacility, RoomFacility, Event, MovieShow, EventContent, Movie, Director, Producer, Star, Writer, Price, TicketPrice, Currency, Contact, GPSCoordinate, PostalAddress, DirectionLocation, Period, TimePeriod, DatePeriod, DateTimePeriod. Relations shown: subClassOf, hasGPSCoordinate, hasPostalAddress, hasCurrency, hasPrice, hasContact, hasSiteFacility, hasRoom, hasRoomFacility, isInSite, hasPeriod, hasEventContent, hasTimePeriod, hasDatePeriod, hasDirector, hasProducer, hasStar, hasWriter. Attributes shown: priceType, priceValue, name, description, startTime, endTime, startDate, endDate, certificate, synposis, genre]
  • 21. The ontology • Encoded using OWL DL, since it has more expressive power than OWL Lite and has more efficient reasoning support than OWL Full • Used Protege-OWL as the editor and RacerPro7 as the reasoner • The ontology contains • 122 classes (concepts), • 55 datatype properties and • 52 object properties which indicate the relationships among the 122 classes. • 15 top-level classes. • The class hierarchy has a maximum depth of 4.
  • 22. The QALL-ME framework • is an architecture skeleton for multilingual QA systems for closed domains • designed in such a way that it allows fast development of closed domain QA systems • freely available from http://qallme.sourceforge.net/ • is based on a Service Oriented Architecture (SOA) which is realised using web services • relies on textual entailment recognisers
  • 23. Web services 1 Context providers: are used to anchor questions in space and time 2 Annotators: currently three types of annotators are available: • named entity annotators which identify names of cinemas, movies, persons, etc. • term annotators which identify hotel facilities, movie genres and other domain-specific terminology • temporal annotators that are used to recognise and normalise temporal expressions in user questions 3 Entailment engine: determines whether a user question entails a retrieval procedure 4 Query generator: relies on the entailment engine to generate a query to extract the answer 5 Answer pool: retrieves the answers from a database.
  • 24. Context providers • are used to anchor a question in space and time • return the current position and time • used by the presentation module when maps are displayed • used by temporal processing to normalise temporal expressions • determine which services are used in a cross-lingual scenario • can be static or determined from a mobile phone
  • 25. Named entity and term annotators • named entity recogniser = identifies names of hotels, movies, persons, etc. • term annotator = identifies domain-specific terms such as hotel facilities, movie genres, etc. • the entities and terms are known, so the task is reduced to a database lookup • Gazetteers are the main source for determining the entities • The annotation module needs to determine the canonical form of an entity • uses character-based similarity, a modified TF*IDF and a greedy algorithm • does not allow overlapping and there are few ambiguities
  • 26. Named entity and term annotators • Annotates both standard and non-standard entities: cinema, movie, location, genre, certificate • Needs to deal with noisy input: • misspelt words/input from ASR engines/SMS input e.g. becaming Jane, becoming Jade • free word order (Will Smith / Smith, Will) • equivalent strings (saw III / three / 3; Smith, Will / Smith, W.) • Needs to deal with questions in mixed languages • Needs to deal with ambiguous entities
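
The gazetteer lookup described on slides 25 and 26 can be illustrated with a minimal Python sketch. It is not the QALL-ME annotator: it swaps in `difflib` character similarity from the standard library in place of the modified TF*IDF and greedy matching mentioned on the slides, and the gazetteer contents, the `lookup` function and the 0.8 threshold are hypothetical choices made for illustration.

```python
import difflib
import re

# Hypothetical gazetteer: canonical entity name -> entity type.
GAZETTEER = {
    "Becoming Jane": "MOVIE",
    "Saw III": "MOVIE",
    "Will Smith": "PERSON",
}

def normalise(text):
    """Lowercase, drop punctuation and sort the tokens so word order is ignored."""
    tokens = re.findall(r"\w+", text.lower())
    return " ".join(sorted(tokens))

def lookup(mention, threshold=0.8):
    """Return (canonical name, type) of the most similar entry, or None."""
    key = normalise(mention)
    best, best_score = None, 0.0
    for name, etype in GAZETTEER.items():
        # Character-based similarity between the normalised strings.
        score = difflib.SequenceMatcher(None, key, normalise(name)).ratio()
        if score > best_score:
            best, best_score = (name, etype), score
    return best if best_score >= threshold else None

print(lookup("becaming Jane"))  # misspelt input
print(lookup("Smith, Will"))    # free word order
```

Sorting the tokens handles free word order and the similarity threshold absorbs small misspellings. Equivalent strings such as "saw III / three / 3" would need an additional mapping step that this sketch does not attempt.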
  • 27. Temporal annotator • questions from the domain of tourism contain a large number of temporal expressions • we use a simplified version of the tagger implemented by Pușcașu (2004) • the simplification was done to reduce the processing time (Varga, Pușcașu, and Orăsan, 2009) • identifies both self-contained temporal expressions (TEs) and indexical/under-specified TEs • uses the TIMEX2 standard • the output is used by the TIMEX2SPARQL service to restrict the extracted answers
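
To make the idea of normalising indexical temporal expressions concrete, here is a small Python sketch. It is an assumption-laden illustration rather than the tagger cited on the slide: the function name `normalise_indexical` is hypothetical, only three expressions are covered, and the reference time stands in for what a context provider would return. The `TNI` suffix follows the TIMEX2 convention for night-time as far as I recall it, so treat the exact value format as an assumption.

```python
from datetime import datetime, timedelta

def normalise_indexical(expression, now):
    """Map a few indexical expressions to TIMEX2-style values relative to `now`."""
    expr = expression.lower().strip()
    if expr == "today":
        return now.strftime("%Y-%m-%d")
    if expr == "tomorrow":
        return (now + timedelta(days=1)).strftime("%Y-%m-%d")
    if expr == "tonight":
        # Assumed TIMEX2-style value: the date followed by the night-time token.
        return now.strftime("%Y-%m-%d") + "TNI"
    return None  # expression not covered by this sketch

# The reference time would normally come from a context provider.
reference = datetime(2010, 3, 29, 15, 0)
print(normalise_indexical("tonight", reference))
```

A self-contained expression such as "29th March 2010" needs no reference time, whereas the indexical ones above cannot be resolved without it, which is why the context provider is part of the pipeline.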
  • 28. Entailment engine • closed-domain QA systems often transform a question into a Prolog fact or an SQL query • this solution often works only partially because of language variability • in QALL-ME this problem is solved using textual entailment • the entailment engine determines whether two questions entail the same meaning, in which case they share the same retrieval procedure: • T is the input question • H is a textual pattern stored in a repository • textual patterns have SPARQL retrieval procedures • we calculate the similarity between the two sentences to determine whether there is an entailment relation between them
  • 29. Query generation service • produces a SPARQL query that can be used to answer the question • has a list of question templates with their associated SPARQL queries • relies on the entailment engine to determine which of the question patterns entail the same meaning as the user question • fills in the slots of the question patterns
  • 31. Example User question (T): What movie can I see tonight in Wolverhampton? → What movie can I see [TIMEX] in [DESTINATION]? List of patterns (H): • Who is the director of [MOVIE]? • Where can I see [MOVIE] [TIMEX]? • What movies are on in [DESTINATION] [TIMEX]? • What is the address of [CINEMA]? • ... Select the retrieval pattern associated with the question What movies are on in Wolverhampton tonight
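
The pattern selection in this example can be sketched in a few lines of Python. This is only an approximation of a similarity-based entailment check: the slot-type filter and the Jaccard word overlap are my own simplifications, and the names `slots`, `words` and `select_pattern` are hypothetical. The patterns are the ones listed on the slide.

```python
import re

PATTERNS = [
    "Who is the director of [MOVIE]?",
    "Where can I see [MOVIE] [TIMEX]?",
    "What movies are on in [DESTINATION] [TIMEX]?",
    "What is the address of [CINEMA]?",
]

def slots(text):
    """Set of slot types, e.g. {'[TIMEX]', '[DESTINATION]'}."""
    return set(re.findall(r"\[[A-Z]+\]", text))

def words(text):
    """Set of lowercased words, ignoring the slots."""
    without_slots = re.sub(r"\[[A-Z]+\]", " ", text)
    return set(re.findall(r"[a-z]+", without_slots.lower()))

def select_pattern(question):
    """Return (pattern, score) for the best pattern with the same slots, or None."""
    best = None
    for pattern in PATTERNS:
        if slots(pattern) != slots(question):
            continue  # a pattern with different slots cannot be filled
        union = words(pattern) | words(question)
        score = len(words(pattern) & words(question)) / len(union) if union else 0.0
        if best is None or score > best[1]:
            best = (pattern, score)
    return best

# The user question after annotation, as on the slide.
annotated = "What movie can I see [TIMEX] in [DESTINATION]?"
print(select_pattern(annotated))
```

A real entailment engine would also apply a similarity threshold so that questions outside the covered patterns are rejected rather than mapped to the nearest one.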
  • 32. Answer Pool service • takes the SPARQL query generated by the query generator and extracts the answer • SPARQL is a query language for accessing RDF graphs, developed by the W3C RDF Data Access Working Group • SPARQL provides interoperability between languages
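
As an illustration of how a retrieval procedure might be turned into a concrete SPARQL query, the Python sketch below fills the slots of a query template. The template is hypothetical: the prefix URI, the `q:city` property and the overall graph shape are assumptions made for the example, and only class and property names such as MovieShow, hasEventContent, isInSite, hasPeriod and startDate are taken from the ontology slide.

```python
# Hypothetical retrieval procedure for "What movies are on in [DESTINATION] [TIMEX]?".
# The prefix URI, q:city and the graph shape are illustrative assumptions.
QUERY_TEMPLATE = """
PREFIX q: <http://example.org/qallme#>
SELECT DISTINCT ?title WHERE {
  ?show a q:MovieShow ;
        q:hasEventContent ?movie ;
        q:isInSite ?cinema ;
        q:hasPeriod ?period .
  ?movie q:name ?title .
  ?cinema q:hasPostalAddress ?address .
  ?address q:city "[DESTINATION]" .
  ?period q:startDate "[TIMEX]" .
}
"""

def escape_literal(value):
    """Escape backslashes and double quotes for use inside a SPARQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')

def fill_slots(template, bindings):
    """Replace each [SLOT] in the template with its escaped value."""
    query = template
    for slot, value in bindings.items():
        query = query.replace("[" + slot + "]", escape_literal(value))
    return query

print(fill_slots(QUERY_TEMPLATE, {"DESTINATION": "Wolverhampton", "TIMEX": "2010-03-29"}))
```

Because the query refers only to ontology concepts and normalised values, the same template can serve questions asked in any language, which is the interoperability point made on the slide.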
  • 33. Structure of the presentation 1 Introduction 2 The QALL-ME project 3 Multilingual information access in QALL-ME 4 Conclusions
  • 34. Cross-lingual QA • the QALL-ME tourism prototype is designed to allow both monolingual and cross-lingual QA • relevant web services are activated depending on the source and target language • user scenario: a Romanian tourist in the UK who wants to find out more about the movies in Wolverhampton
  • 36. Prototype for Romanian • we wanted to find out how long it takes to develop a demo for Romanian • components had to be adapted: • named entity and term annotators had to be trained on a different list of entities • a simple temporal annotator was implemented on the basis of the English one • the language-independent similarity-based entailment engine was used • the question patterns were translated to Romanian • the answer pool did not require any change • the whole process took under one week
  • 38. Structure of the presentation 1 Introduction 2 The QALL-ME project 3 Multilingual information access in QALL-ME 4 Conclusions
  • 39. Conclusions • multilinguality is a very important issue for the QALL-ME project • the ontology constitutes the bridge between languages • the QALL-ME framework can be used to quickly develop prototypes for other languages
  • 42. Ou, Shiyan, Viktor Pekar, Constantin Orăsan, Christian Spurk, and Matteo Negri. 2008. Development and alignment of a domain-specific ontology for question answering. In European Language Resources Association (ELRA), editor, Proceedings of the Sixth International Language Resources and Evaluation (LREC'08), Marrakech, Morocco, May 28 – 30. Pușcașu, Georgiana. 2004. A framework for temporal resolution. In Proceedings of the 4th Conference on Language Resources and Evaluation (LREC 2004), Lisbon, Portugal, May 26 – 28. Varga, Andrea, Georgiana Pușcașu, and Constantin Orăsan. 2009. Identification of temporal expressions in the domain of tourism. In Knowledge Engineering: Principles and Techniques, volume 1, pages 29 – 32, Cluj-Napoca, Romania, July 2 – 4.