SlideShare a Scribd company logo
A Tutorial on
Question Answering Systems
Saeedeh Shekarpour
Postdoc researcher from EIS research group
31 August 2015EIS research group - Bonn University
1
Outline
 Introduction
 Associations for evaluation QA systems
 Preliminary Concepts
 Data Web
 Emerging Concepts
 Deeper view on SINA Project
31 August 2015EIS research group - Bonn University
2
Query
Answer
What is a Question Answering (QA)
system?
Systems that automatically answer questions posed by
humans in natural language query.
31 August 2015EIS research group - Bonn University
3
Natural language is the common
way for sharing knowledge
Question answering is a multi
disciplinary field
31 August 2015EIS research group - Bonn University
4
Semantic
Web
Linked Data
Difference to Information Retrieval
 Information Retrieval is a query driven approach for
accessing information.
 System returns a list of documents.
 It is responsibility of user to navigate on the retrieved
documents and find its own information need.
 Question Answering is an answer driven approach for
accessing information.
 User asks its question in natural language ( i.e. phrase-
based, full sentence or even keyword based) queries.
 System returns the list of short answers.
 More complex functionality.
31 August 2015EIS research group - Bonn University
5
Search engines are moving towards
QA
31 August 2015EIS research group - Bonn University
6
Search engines still lack the ability to
answer more complex queries
31 August 2015EIS research group - Bonn University
7
Natural language queries are
classified into different categories
 Factoid queries: WH questions like when, who, where.
 Yes/ No queries: Is Berlin capital of Germany?
 Definition queries: what is leukemia?
 Cause/consequence queries: How, Why, What. what are
the consequences of the Iraq war ?
31 August 2015EIS research group - Bonn University
8
Natural language queries are
classified into different categories
 Procedural queries: which are the steps for getting a
Master degree?
 Comparative queries: what are the differences between
the model A and B?
 Queries with examples: list of hard disks similar to hard
disk X.
 Queries about opinion: What is the opinion of the majority
of Americans about the Iraq war?
31 August 2015EIS research group - Bonn University
9
Corpus Type
 Structured data (relational data bases, RDF knowledge
bases).
 Semi-structured data (XML databases)
 Free text
 Multimodal data: image, voice, video
31 August 2015EIS research group - Bonn University
10
Types of QA systems
 Open-domain: domain independent QA systems can
answer any query from any corpus
+ covers wide range of queries
- low accuracy
 Closed-domain: domain specific QA systems are limited
to specific domains
+ High accuracy
- limited coverage over the possible queries
- Needs domain expert
31 August 2015EIS research group - Bonn University
11
Is QA system a need for user?
Search engines query log analysis shows that
Real informational queries in Google:
 Who first invented rock and roll music?
 When was the mobile phone invented?
 Where was the hamburger invented?
 How to lose weight?
31 August 2015EIS research group - Bonn University
12
Type of query Query log
analysis
Informational 48%
Navigational 20%
Transactional 30%
 Introduction
 Associations for evaluation QA systems
 Preliminary Concepts
 Data Web
 Emerging Concepts
 Deeper view on SINA Project
31 August 2015EIS research group - Bonn University
13
Text Retrieval Conference (Trec)
 In 1999, Trec initiated a QA track,
 From 1999-2002, participant systems were allowed to
return ranked answer snippets. These snippets of text
contain the actual answer.
 From 2002, participant systems were allowed to return
only the exact answer with confidence rate.
 The evaluation metric was mean reciprocal rank (MRR).
31 August 2015EIS research group - Bonn University
14
Typical pipeline for the participating
systems in Trec
31 August 2015EIS research group - Bonn University
15
Query
Expansion
IR
Approach
Passage
Retrieval
Answer
Selection
Predicting
the type
of answer
Answer
Query
QA at Clef
The Cross-Language Evaluation Forum (CLEF) initiative
provides
 Since 2003, Clef included a QA track.
 an infrastructure for the testing, tuning and evaluation of
information retrieval systems operating on European
languages in both monolingual and cross-language
contexts.
31 August 2015EIS research group - Bonn University
16
Outline
 Introduction
 Associations for evaluation QA systems
 Preliminary Concepts
 Data Web
 Emerging Concepts
 Deeper view on SINA Project
31 August 2015EIS research group - Bonn University
17
Core of a QA system
31 August 2015EIS research group - Bonn University
18
query
Syntactic Parsing: Part-of-speech
Tagging
I like sweet cakes
I/PRP like/VBP sweet/JJ cakes/NNS
31 August 2015EIS research group - Bonn University
19
like/VBP
I/PRP cakes/NNS
sweet/JJ
Type of Answer
Where was the hamburger invented?
White Castle traces the origin of the hamburger to
Hamburg, Germany with its invention by Otto Kuase.
31 August 2015EIS research group - Bonn University
20
Place
Named Entity Recognition on Query
where was Franklin Roosevelt born?
31 August 2015EIS research group - Bonn University
21
Named Entity: Person
Relation Extraction
Barack Hussein Obama is the 44th and current President of
the United States. Born in Honolulu, Hawaii, Obama is a
graduate of Columbia University and Harvard Law School.
31 August 2015EIS research group - Bonn University
22
Named Entity: Person
Named Entity: Place
Relation: President of
Watson Project
 Watson is a computer which is capable of answering
question issued in natural language.
 Questions come from quiz show called Jeopardy.
 The software of this project is called DeepQA project.
 In 2011, Watson won the former winners of quiz show
Jeopardy.
31 August 2015EIS research group - Bonn University
23
Watson Description
 Hardware: Watson system has 2,880 POWER processor
threads and has 16 terabytes of RAM.
 Data: encyclopedias, dictionaries, thesauri, newswire
articles, and literary works. Watson also used databases,
taxonomies, and ontologies such as DBPedia, WordNet, and
Yago were used.
 Software: DeepQA software and the Apache UIMA
framework. The system was written in various languages,
including Java, C++, and Prolog, and runs on the SUSE Linux
Enterprise Server 11 operating system using Apache Hadoop
framework to provide distributed computing.
31 August 2015EIS research group - Bonn University
24
The high level architecture of
DeepQA
31 August 2015EIS research group - Bonn University
25
Decomposition example
Query: Of the four countries in the world that the United
States does not have diplomatic relations with, the one
that’s farthest north.
Bhutan, Cuba, Iran, North Korea
North Korea
31 August 2015EIS research group - Bonn University
26
Outline
 Introduction
 Associations for evaluation QA systems
 Preliminary Concepts
 Data Web
 Emerging Concepts
 Deeper view on SINA Project
31 August 2015EIS research group - Bonn University
27
Evolution of Web
31 August 2015EIS research group - Bonn University
28
The growth of Linked Open Data
31 August 2015EIS research group - Bonn University
29
August 2014
570 Datasets
More than 74 billion triples
May 2007
12 Datasets
How to retrieve data from Linked
Data?
31 August 2015EIS research group - Bonn University
30
RDF Model
 RDF is an standard for describing Web resources.
 The RDF data model expresses statements about Web
resources in the form of subject-predicate-object (triple).
 The statement “Jack knows Alice” is represented as:
31 August 2015EIS research group - Bonn University
31
knowing
Outline
 Introduction
 Associations for evaluation QA systems
 Preliminary Concepts
 Data Web
 Emerging Concepts
 Deeper view on SINA Project
31 August 2015EIS research group - Bonn University
32
Semantic Indexing
31 August 2015EIS research group - Bonn University
33
Semantic Annotation
 Name Entity Recognition
Where is the capital of Germany?
 Semantic Annotation
Where is the capital of Germany?
31 August 2015EIS research group - Bonn University
34
Named Entity: Place
Entity:
http://dbpedia.org/resource/Germ
any
31 August 2015EIS research group - Bonn University
35
Berlin in the 1920s was the third largest municipality in the world.
After World War II, the city became divided into East Berlin -- the
capital of East Germany -- and West Berlin, a West German
exclave surrounded by the Berlin Wall from 1961–89.
Berlin in the 1920s was the third largest municipality in the world.
After World War II, the city became divided into East Berlin -- the
capital of East Germany -- and West Berlin, a West German
exclave surrounded by the Berlin Wall from 1961–89.
Relation extraction leveraging Data
Web: BOA Library
31 August 2015EIS research group - Bonn University
36
HAWK: Hybrid Question Answering
over Linked Data
31 August 2015EIS research group - Bonn University
37
TBSL: Template-based SPARQL
generator
31 August 2015EIS research group - Bonn University
38
Outline
 Introduction
 Associations for evaluation QA systems
 Preliminary Concepts
 Data Web
 Emerging Concepts
 Deeper view on SINA Project
31 August 2015EIS research group - Bonn University
39
SINAArchitecture
31 August 2015EIS research group - Bonn University
40
Client
QueryPreprocessing
QueryExpansion
ResourceRetrieval
Disambiguation
QueryConstruction
Representation
Server
UnderlyingInterlinked
KnowledgeBases
query result
keywords
valid segments
mapped resources
tuple of
resources
SPARQL
queries
OWL API
http client
Stanford
CoreNLP
SegmentValidation
Reformulated query
Comparison of search approaches
31 August 2015EIS research group - Bonn University
41
Data-semantic
unaware
Data-semantic
aware
Keyword-based
query
Objective: transformation from
textual query to formal query
31 August 2015EIS research group - Bonn University
42
Which televisions shows were created by Walt
Disney?
SELECT * WHERE
{ ?v0 a dbo:TelevisionShow.
?v0 dbo:creator dbr:Walt_Disney. }
The addressed challenges in SINA
31 August 2015EIS research group - Bonn University
43
Query Segmentation
31 August 2015EIS research group - Bonn University
44
 Definition: query segmentation is the process of identifying the right
segments of data items that occur in the keyword queries.
Resource Disambiguation
31 August 2015EIS research group - Bonn University
45
 Definition: resource disambiguation is the process of
recognizing the suitable resources in the underlying
knowledge base.
•Who produced films starring Natalie
Portman?
•dbpedia/ontology/film
•dbpedia/property/film
•What are the side effects of drugs used
for Tuberculosis?
•diseasome:Tuberculosis
•sider:Tuberculosis
Concurrent Approach
31 August 2015EIS research group - Bonn University
46
Query
Segmentation
Resource
Disambiguation
Modeling using hidden Markov
model
31 August 2015EIS research group - Bonn University
47
1
2
3
Unknown
Entity
4
5
6
7
8
9
Start
Keyword 1 Keyword 3Keyword 2 Keyword 4
Bootstrapping the model parameters
1. Emission probability is defined based on the similarity of the label of
each state with a segment, this similarity is computed based on
string-similarity and Jaccard-similarity.
2. Semantic relatedness is a base for transition probability and initial
probability. Intuitively, it is based on two values: distance and
connectivity degree. We transform these two values to hub and
authority values using weighted HITS algorithm.
3. HITS algorithm is a link analysis algorithm that was originally
developed for ranking Web pages. It assign a hub and authority
value to each web page.
4. Initial probability and transition probability are defined as a uniform
distribution over the hub and and authority values.
31 August 2015EIS research group - Bonn University
48
31 August 2015EIS research group - Bonn University
49
Sequence
of
keywords
(television show creat Walt Disney)
Paths 0.0023 dbo:TelevisionShow dbo:creator dbr:Walt_Disney
0.0014 dbo:TelevisionShow dbo:creator dbr:Category:Walt_Disney
0.000589 dbr:TelevisionShow dbo:creator dbr:Walt_Disney
0.000353 dbr:TelevisionShow dbo:creator dbr:Category:Walt_Disney
0.0000376 dbp:television dbp:show dbo:creator dbr:Category:Walt_Disney
Output of the model
Query Expansion
 Definition: query expansion is a way of reformulating the
input query in order to overcome the vocabulary
mismatch problem.
31 August 2015EIS research group - Bonn University
50
• Wife of Barak Obama
•Spouse of Barak Obama
Query Expansion
31 August 2015EIS research group - Bonn University
51
Linguistic features
 WordNet is a popular data source for expansion.
 Linguistic features extracted from WordNet are:
1. Synonyms: words having a similar meanings to the input
keyword.
2. Hyponyms: words representing a specialization of the input
keyword.
3. Hypernyms: words representing a generalization of the input
keyword.
31 August 2015EIS research group - Bonn University
52
Semantic features from Linked Data
31 August 2015EIS research group - Bonn University
53
1. SameAs: deriving resources using owl:sameAs.
2. SeeAlso: deriving resources using rdfs:seeAlso.
3. Equivalence class/property: deriving classes or properties
using owl:equivalentClass and owl:equivalentProperty.
4. Super class/property: deriving all super classes/properties of
by following the rdfs:subClassOf or rdfs:subPropertyOf
property.
5. Sub class/property: deriving resources by following the
rdfs:subClassOf or rdfs:subPropertyOf property paths ending
with the input resource.
Semantic features from Linked Data
5. Broader concepts: deriving using the SKOS vocabulary
properties skos:broader and skos:broadMatch.
6. Narrower concepts: deriving concepts using skos:narrower
and skos:narrowMatch.
7. Related concepts: deriving concepts using skos:closeMatch,
skos:mappingRelation and skos:exactMatch.
31 August 2015EIS research group - Bonn University
54
Exemplary expansion graph of the
word movie
31 August 2015EIS research group - Bonn University
55
movie
home
movie
production
film
motion
picture show
video
telefilm
Statistics over the number of the
derived words from WordNet and
Linked Data
31 August 2015EIS research group - Bonn University
56
Feature #derived words
synonym 503
hyponym 2703
hypernym 657
sameAs 2332
seeAlso 49
equivalence 2
super class/property 267
Sub class/property 2166
Automatic query expansion
31 August 2015EIS research group - Bonn University
57
Input query
External Data source
Data extraction and
preparation
Heuristic method
Reformulated query
Expansion set for each segment
31 August 2015EIS research group - Bonn University
58
Reformulating query using hidden
Markov model
31 August 2015EIS research group - Bonn University
59
Barak
Barak
Obama
spouse
Obama
wife
first
lady
woman
Barak Obama wife
Barak
Obama
Start
Obama
wife
Barak
Obama
wife
Input query: wife of Barak Obama
Formal Query Construction
 Definition: Once the resources are detected, a
connected subgraph of the knowledge base graph,
called the query graph, has to be determined which fully
covers the set of mapped resources.
31 August 2015EIS research group - Bonn University
60
Disambiguated
resources
sider:sideEffect
diseasome:possibleDrug
diseasome:1154
SPARQL query SELECT ?v3 WHERE {
diseasome:115 diseasome:possibleDrug ?v1 .
?v1 owl:sameAs ?v2 .
?v2 sider:sideEffect ?v3 .}
Data Fusion on Linked Data
 Answer of a question may be spread among different
datasets employing heterogeneous schemas.
 Constructing a federated query from needs to exploit
links between the different datasets on the schema and
instance levels.
31 August 2015EIS research group - Bonn University
61
Two different approaches
Formal query construction based on
 Template-based query construction
 Forward chaining based query construction
31 August 2015EIS research group - Bonn University
62
Federated query construction using
forward chaining
31 August 2015EIS research group - Bonn University
63
Query What is the side effects of drugs used for Tuberculosis?
resources diseasome:1154 (type instance)
diseasome:possibleDrug (type property)
sider:sideEffect (type property)
1154 ?v0
possibleDrug
Graph 1
?v1 ?v2
sideEffect
Graph 2
115
4
?v0
possibleDrug
Template 1
?v1 ?v2
sideEffect
Template 2
115
4
?v0
possibleDrug
?v1 ?v2
sideEffect
Any question?
31 August 2015EIS research group - Bonn University
64

More Related Content

What's hot

Natural lanaguage processing
Natural lanaguage processingNatural lanaguage processing
Natural lanaguage processing
gulshan kumar
 
Recommendation System Explained
Recommendation System ExplainedRecommendation System Explained
Recommendation System Explained
Crossing Minds
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language Models
Leon Dohmen
 
Text summarization
Text summarizationText summarization
Text summarization
Akash Karwande
 
Information Extraction
Information ExtractionInformation Extraction
Information Extraction
Rubén Izquierdo Beviá
 
Word embedding
Word embedding Word embedding
Word embedding
ShivaniChoudhary74
 
NLP.pptx
NLP.pptxNLP.pptx
NLP.pptx
Rahul Borate
 
Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics
Prakash Pimpale
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introduction
Robert Lujo
 
Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...
Rajnish Raj
 
AI: Logic in AI
AI: Logic in AIAI: Logic in AI
AI: Logic in AI
DataminingTools Inc
 
Step-by-step approach to question answering
Step-by-step approach to question answeringStep-by-step approach to question answering
Step-by-step approach to question answering
NAVER Engineering
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
Yuriy Guts
 
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Universitat Politècnica de Catalunya
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
Pranav Gupta
 
Learning to rank
Learning to rankLearning to rank
Learning to rank
Bruce Kuo
 

What's hot (20)

NLTK
NLTKNLTK
NLTK
 
Natural lanaguage processing
Natural lanaguage processingNatural lanaguage processing
Natural lanaguage processing
 
Recommendation System Explained
Recommendation System ExplainedRecommendation System Explained
Recommendation System Explained
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language Models
 
Text summarization
Text summarizationText summarization
Text summarization
 
Information Extraction
Information ExtractionInformation Extraction
Information Extraction
 
Word embedding
Word embedding Word embedding
Word embedding
 
Language models
Language modelsLanguage models
Language models
 
NLP.pptx
NLP.pptxNLP.pptx
NLP.pptx
 
Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics Natural Language Toolkit (NLTK), Basics
Natural Language Toolkit (NLTK), Basics
 
Natural language processing (NLP) introduction
Natural language processing (NLP) introductionNatural language processing (NLP) introduction
Natural language processing (NLP) introduction
 
Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
AI: Logic in AI
AI: Logic in AIAI: Logic in AI
AI: Logic in AI
 
Step-by-step approach to question answering
Step-by-step approach to question answeringStep-by-step approach to question answering
Step-by-step approach to question answering
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
NLP
NLPNLP
NLP
 
Learning to rank
Learning to rankLearning to rank
Learning to rank
 

Viewers also liked

Instant Question Answering System
Instant Question Answering SystemInstant Question Answering System
Instant Question Answering System
Dhwaj Raj
 
Deep Learning Models for Question Answering
Deep Learning Models for Question AnsweringDeep Learning Models for Question Answering
Deep Learning Models for Question Answering
Sujit Pal
 
KAIST Web Engineering Lab Introduction (2017 ver.)
KAIST Web Engineering Lab Introduction (2017 ver.)KAIST Web Engineering Lab Introduction (2017 ver.)
KAIST Web Engineering Lab Introduction (2017 ver.)
webeng-kaist
 
연구실 소개(2014)
연구실 소개(2014)연구실 소개(2014)
연구실 소개(2014)NCLab_KAIST
 
iDBLab @KAIST 소개 20170306-업로드용(김태훈)
iDBLab @KAIST 소개 20170306-업로드용(김태훈)iDBLab @KAIST 소개 20170306-업로드용(김태훈)
iDBLab @KAIST 소개 20170306-업로드용(김태훈)
Taehun Kim, Ph.D
 
Sentence representations and question answering (YerevaNN)
Sentence representations and question answering (YerevaNN)Sentence representations and question answering (YerevaNN)
Sentence representations and question answering (YerevaNN)
YerevaNN research lab
 

Viewers also liked (6)

Instant Question Answering System
Instant Question Answering SystemInstant Question Answering System
Instant Question Answering System
 
Deep Learning Models for Question Answering
Deep Learning Models for Question AnsweringDeep Learning Models for Question Answering
Deep Learning Models for Question Answering
 
KAIST Web Engineering Lab Introduction (2017 ver.)
KAIST Web Engineering Lab Introduction (2017 ver.)KAIST Web Engineering Lab Introduction (2017 ver.)
KAIST Web Engineering Lab Introduction (2017 ver.)
 
연구실 소개(2014)
연구실 소개(2014)연구실 소개(2014)
연구실 소개(2014)
 
iDBLab @KAIST 소개 20170306-업로드용(김태훈)
iDBLab @KAIST 소개 20170306-업로드용(김태훈)iDBLab @KAIST 소개 20170306-업로드용(김태훈)
iDBLab @KAIST 소개 20170306-업로드용(김태훈)
 
Sentence representations and question answering (YerevaNN)
Sentence representations and question answering (YerevaNN)Sentence representations and question answering (YerevaNN)
Sentence representations and question answering (YerevaNN)
 

Similar to Tutorial on Question Answering Systems

TIDSR
TIDSRTIDSR
TIDSR
Eric Meyer
 
Big data impact on society: a research roadmap for Europe (BYTE project resea...
Big data impact on society: a research roadmap for Europe (BYTE project resea...Big data impact on society: a research roadmap for Europe (BYTE project resea...
Big data impact on society: a research roadmap for Europe (BYTE project resea...
Anna Fensel
 
Platform Research Agenda
Platform Research AgendaPlatform Research Agenda
Platform Research Agenda
Peter C. Evans, PhD
 
e-Research: A Social Informatics Perspective
e-Research: A Social Informatics Perspectivee-Research: A Social Informatics Perspective
e-Research: A Social Informatics Perspective
Eric Meyer
 
Data and science
Data and scienceData and science
Data and science
Anand Deshpande
 
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Juan Antonio Vizcaino
 
From Argument Mapping to Argument Mining, and Back
From Argument Mapping to Argument Mining, and BackFrom Argument Mapping to Argument Mining, and Back
From Argument Mapping to Argument Mining, and Back
EDV Project
 
Smart Subjects - Application Independent Subject Recommendations
Smart Subjects - Application Independent Subject RecommendationsSmart Subjects - Application Independent Subject Recommendations
Smart Subjects - Application Independent Subject Recommendations
eby
 
Tracing Publics in the Australian Blogosphere: New Methods for International ...
Tracing Publics in the Australian Blogosphere: New Methods for International ...Tracing Publics in the Australian Blogosphere: New Methods for International ...
Tracing Publics in the Australian Blogosphere: New Methods for International ...
Jean Burgess
 
Csora, "2Collab, The Research Collaboration Tool"
Csora, "2Collab, The Research Collaboration Tool"Csora, "2Collab, The Research Collaboration Tool"
Csora, "2Collab, The Research Collaboration Tool"
National Information Standards Organization (NISO)
 
New Perspectives in Scholarly Publishing – Potenziale und Möglichkeiten digi...
New Perspectives  in Scholarly Publishing – Potenziale und Möglichkeiten digi...New Perspectives  in Scholarly Publishing – Potenziale und Möglichkeiten digi...
New Perspectives in Scholarly Publishing – Potenziale und Möglichkeiten digi...
Alexander Grossmann
 
Semantic Interpretation of User Query for Question Answering on Interlinked Data
Semantic Interpretation of User Query for Question Answering on Interlinked DataSemantic Interpretation of User Query for Question Answering on Interlinked Data
Semantic Interpretation of User Query for Question Answering on Interlinked Data
Saeedeh Shekarpour
 
Knowledge Acquisition from Social Awareness Streams
Knowledge Acquisition from Social Awareness StreamsKnowledge Acquisition from Social Awareness Streams
Knowledge Acquisition from Social Awareness Streams
Claudia Wagner
 
Automated Content Analysis of Discussion Transcripts
Automated Content Analysis of Discussion TranscriptsAutomated Content Analysis of Discussion Transcripts
Automated Content Analysis of Discussion Transcripts
Vitomir Kovanovic
 
OSFair2017 Workshop | Text mining
OSFair2017 Workshop | Text miningOSFair2017 Workshop | Text mining
OSFair2017 Workshop | Text mining
Open Science Fair
 
QALD-7 Question Answering over Linked Data Challenge
QALD-7 Question Answering over Linked Data ChallengeQALD-7 Question Answering over Linked Data Challenge
QALD-7 Question Answering over Linked Data Challenge
Holistic Benchmarking of Big Linked Data
 
Qald 7 at ESWC2017
Qald 7 at ESWC2017Qald 7 at ESWC2017
Qald 7 at ESWC2017
Giulio Napolitano
 
Navigation Support in Evolving Communities by a Web-based Dashboard
Navigation Support in Evolving Communities by a Web-based DashboardNavigation Support in Evolving Communities by a Web-based Dashboard
Navigation Support in Evolving Communities by a Web-based Dashboard
Ralf Klamma
 
Sparc-Japan-Slow-revolution-in-scholarly-communication
Sparc-Japan-Slow-revolution-in-scholarly-communicationSparc-Japan-Slow-revolution-in-scholarly-communication
Sparc-Japan-Slow-revolution-in-scholarly-communication
hierohiero
 
COUNTER Standards for Open Access: the Value of Measuring/ the Measuring of V...
COUNTER Standards for Open Access: the Value of Measuring/ the Measuring of V...COUNTER Standards for Open Access: the Value of Measuring/ the Measuring of V...
COUNTER Standards for Open Access: the Value of Measuring/ the Measuring of V...
UCD Library
 

Similar to Tutorial on Question Answering Systems (20)

TIDSR
TIDSRTIDSR
TIDSR
 
Big data impact on society: a research roadmap for Europe (BYTE project resea...
Big data impact on society: a research roadmap for Europe (BYTE project resea...Big data impact on society: a research roadmap for Europe (BYTE project resea...
Big data impact on society: a research roadmap for Europe (BYTE project resea...
 
Platform Research Agenda
Platform Research AgendaPlatform Research Agenda
Platform Research Agenda
 
e-Research: A Social Informatics Perspective
e-Research: A Social Informatics Perspectivee-Research: A Social Informatics Perspective
e-Research: A Social Informatics Perspective
 
Data and science
Data and scienceData and science
Data and science
 
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
 
From Argument Mapping to Argument Mining, and Back
From Argument Mapping to Argument Mining, and BackFrom Argument Mapping to Argument Mining, and Back
From Argument Mapping to Argument Mining, and Back
 
Smart Subjects - Application Independent Subject Recommendations
Smart Subjects - Application Independent Subject RecommendationsSmart Subjects - Application Independent Subject Recommendations
Smart Subjects - Application Independent Subject Recommendations
 
Tracing Publics in the Australian Blogosphere: New Methods for International ...
Tracing Publics in the Australian Blogosphere: New Methods for International ...Tracing Publics in the Australian Blogosphere: New Methods for International ...
Tracing Publics in the Australian Blogosphere: New Methods for International ...
 
Csora, "2Collab, The Research Collaboration Tool"
Csora, "2Collab, The Research Collaboration Tool"Csora, "2Collab, The Research Collaboration Tool"
Csora, "2Collab, The Research Collaboration Tool"
 
New Perspectives in Scholarly Publishing – Potenziale und Möglichkeiten digi...
New Perspectives  in Scholarly Publishing – Potenziale und Möglichkeiten digi...New Perspectives  in Scholarly Publishing – Potenziale und Möglichkeiten digi...
New Perspectives in Scholarly Publishing – Potenziale und Möglichkeiten digi...
 
Semantic Interpretation of User Query for Question Answering on Interlinked Data
Semantic Interpretation of User Query for Question Answering on Interlinked DataSemantic Interpretation of User Query for Question Answering on Interlinked Data
Semantic Interpretation of User Query for Question Answering on Interlinked Data
 
Knowledge Acquisition from Social Awareness Streams
Knowledge Acquisition from Social Awareness StreamsKnowledge Acquisition from Social Awareness Streams
Knowledge Acquisition from Social Awareness Streams
 
Automated Content Analysis of Discussion Transcripts
Automated Content Analysis of Discussion TranscriptsAutomated Content Analysis of Discussion Transcripts
Automated Content Analysis of Discussion Transcripts
 
OSFair2017 Workshop | Text mining
OSFair2017 Workshop | Text miningOSFair2017 Workshop | Text mining
OSFair2017 Workshop | Text mining
 
QALD-7 Question Answering over Linked Data Challenge
QALD-7 Question Answering over Linked Data ChallengeQALD-7 Question Answering over Linked Data Challenge
QALD-7 Question Answering over Linked Data Challenge
 
Qald 7 at ESWC2017
Qald 7 at ESWC2017Qald 7 at ESWC2017
Qald 7 at ESWC2017
 
Navigation Support in Evolving Communities by a Web-based Dashboard
Navigation Support in Evolving Communities by a Web-based DashboardNavigation Support in Evolving Communities by a Web-based Dashboard
Navigation Support in Evolving Communities by a Web-based Dashboard
 
Sparc-Japan-Slow-revolution-in-scholarly-communication
Sparc-Japan-Slow-revolution-in-scholarly-communicationSparc-Japan-Slow-revolution-in-scholarly-communication
Sparc-Japan-Slow-revolution-in-scholarly-communication
 
COUNTER Standards for Open Access: the Value of Measuring/ the Measuring of V...
COUNTER Standards for Open Access: the Value of Measuring/ the Measuring of V...COUNTER Standards for Open Access: the Value of Measuring/ the Measuring of V...
COUNTER Standards for Open Access: the Value of Measuring/ the Measuring of V...
 

More from Saeedeh Shekarpour

Metrics for Evaluating Quality of Embeddings for Ontological Concepts
Metrics for Evaluating Quality of Embeddings for Ontological Concepts Metrics for Evaluating Quality of Embeddings for Ontological Concepts
Metrics for Evaluating Quality of Embeddings for Ontological Concepts
Saeedeh Shekarpour
 
CEVO: Comprehensive EVent Ontology Enhancing Cognitive Annotation on Relations
CEVO: Comprehensive EVent Ontology  Enhancing Cognitive Annotation on RelationsCEVO: Comprehensive EVent Ontology  Enhancing Cognitive Annotation on Relations
CEVO: Comprehensive EVent Ontology Enhancing Cognitive Annotation on Relations
Saeedeh Shekarpour
 
A quality type aware annotated corpus and lexicon for harassment research
A quality type aware annotated corpus and lexicon for harassment researchA quality type aware annotated corpus and lexicon for harassment research
A quality type aware annotated corpus and lexicon for harassment research
Saeedeh Shekarpour
 
Windowing of attention
Windowing of attentionWindowing of attention
Windowing of attention
Saeedeh Shekarpour
 

More from Saeedeh Shekarpour (6)

Metrics for Evaluating Quality of Embeddings for Ontological Concepts
Metrics for Evaluating Quality of Embeddings for Ontological Concepts Metrics for Evaluating Quality of Embeddings for Ontological Concepts
Metrics for Evaluating Quality of Embeddings for Ontological Concepts
 
CEVO: Comprehensive EVent Ontology Enhancing Cognitive Annotation on Relations
CEVO: Comprehensive EVent Ontology  Enhancing Cognitive Annotation on RelationsCEVO: Comprehensive EVent Ontology  Enhancing Cognitive Annotation on Relations
CEVO: Comprehensive EVent Ontology Enhancing Cognitive Annotation on Relations
 
A quality type aware annotated corpus and lexicon for harassment research
A quality type aware annotated corpus and lexicon for harassment researchA quality type aware annotated corpus and lexicon for harassment research
A quality type aware annotated corpus and lexicon for harassment research
 
Windowing of attention
Windowing of attentionWindowing of attention
Windowing of attention
 
Sina presentation in IBM
Sina presentation in IBMSina presentation in IBM
Sina presentation in IBM
 
Wi presentation
Wi presentationWi presentation
Wi presentation
 

Recently uploaded

DfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributionsDfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributions
gestioneergodomus
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
Robbie Edward Sayers
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
zwunae
 
CW RADAR, FMCW RADAR, FMCW ALTIMETER, AND THEIR PARAMETERS
CW RADAR, FMCW RADAR, FMCW ALTIMETER, AND THEIR PARAMETERSCW RADAR, FMCW RADAR, FMCW ALTIMETER, AND THEIR PARAMETERS
CW RADAR, FMCW RADAR, FMCW ALTIMETER, AND THEIR PARAMETERS
veerababupersonal22
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Dr.Costas Sachpazis
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
TeeVichai
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
BrazilAccount1
 
Basic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparelBasic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparel
top1002
 
6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)
ClaraZara1
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
Building Electrical System Design & Installation
Building Electrical System Design & InstallationBuilding Electrical System Design & Installation
Building Electrical System Design & Installation
symbo111
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
manasideore6
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
thanhdowork
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation & Control
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
JoytuBarua2
 
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
ssuser7dcef0
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
Pratik Pawar
 
Investor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptxInvestor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptx
AmarGB2
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
VENKATESHvenky89705
 

Recently uploaded (20)

DfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributionsDfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributions
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
 
CW RADAR, FMCW RADAR, FMCW ALTIMETER, AND THEIR PARAMETERS
CW RADAR, FMCW RADAR, FMCW ALTIMETER, AND THEIR PARAMETERSCW RADAR, FMCW RADAR, FMCW ALTIMETER, AND THEIR PARAMETERS
CW RADAR, FMCW RADAR, FMCW ALTIMETER, AND THEIR PARAMETERS
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
 
Basic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparelBasic Industrial Engineering terms for apparel
Basic Industrial Engineering terms for apparel
 
6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
Building Electrical System Design & Installation
Building Electrical System Design & InstallationBuilding Electrical System Design & Installation
Building Electrical System Design & Installation
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 
Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
 
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
 
Investor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptxInvestor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptx
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
 

Tutorial on Question Answering Systems

  • 1. A Tutorial on Question Answering Systems Saeedeh Shekarpour Postdoc researcher from EIS research group 31 August 2015EIS research group - Bonn University 1
  • 2. Outline  Introduction  Associations for evaluation QA systems  Preliminary Concepts  Data Web  Emerging Concepts  Deeper view on SINA Project 31 August 2015EIS research group - Bonn University 2
  • 3. Query Answer What is a Question Answering (QA) system? Systems that automatically answer questions posed by humans in natural language query. 31 August 2015EIS research group - Bonn University 3 Natural language is the common way for sharing knowledge
  • 4. Question answering is a multi disciplinary field 31 August 2015EIS research group - Bonn University 4 Semantic Web Linked Data
  • 5. Difference to Information Retrieval  Information Retrieval is a query driven approach for accessing information.  System returns a list of documents.  It is responsibility of user to navigate on the retrieved documents and find its own information need.  Question Answering is an answer driven approach for accessing information.  User asks its question in natural language ( i.e. phrase- based, full sentence or even keyword based) queries.  System returns the list of short answers.  More complex functionality. 31 August 2015EIS research group - Bonn University 5
  • 6. Search engines are moving towards QA 31 August 2015EIS research group - Bonn University 6
  • 7. Search engines still lack the ability to answer more complex queries 31 August 2015EIS research group - Bonn University 7
  • 8. Natural language queries are classified into different categories  Factoid queries: WH questions like when, who, where.  Yes/ No queries: Is Berlin capital of Germany?  Definition queries: what is leukemia?  Cause/consequence queries: How, Why, What. what are the consequences of the Iraq war ? 31 August 2015EIS research group - Bonn University 8
  • 9. Natural language queries are classified into different categories  Procedural queries: which are the steps for getting a Master degree?  Comparative queries: what are the differences between the model A and B?  Queries with examples: list of hard disks similar to hard disk X.  Queries about opinion: What is the opinion of the majority of Americans about the Iraq war? 31 August 2015EIS research group - Bonn University 9
  • 10. Corpus Type  Structured data (relational data bases, RDF knowledge bases).  Semi-structured data (XML databases)  Free text  Multimodal data: image, voice, video 31 August 2015EIS research group - Bonn University 10
  • 11. Types of QA systems  Open-domain: domain independent QA systems can answer any query from any corpus + covers wide range of queries - low accuracy  Closed-domain: domain specific QA systems are limited to specific domains + High accuracy - limited coverage over the possible queries - Needs domain expert 31 August 2015EIS research group - Bonn University 11
  • 12. Is QA system a need for user? Search engines query log analysis shows that Real informational queries in Google:  Who first invented rock and roll music?  When was the mobile phone invented?  Where was the hamburger invented?  How to lose weight? 31 August 2015EIS research group - Bonn University 12 Type of query Query log analysis Informational 48% Navigational 20% Transactional 30%
  • 13.  Introduction  Associations for evaluation QA systems  Preliminary Concepts  Data Web  Emerging Concepts  Deeper view on SINA Project 31 August 2015EIS research group - Bonn University 13
  • 14. Text Retrieval Conference (Trec)  In 1999, Trec initiated a QA track,  From 1999-2002, participant systems were allowed to return ranked answer snippets. These snippets of text contain the actual answer.  From 2002, participant systems were allowed to return only the exact answer with confidence rate.  The evaluation metric was mean reciprocal rank (MRR). 31 August 2015EIS research group - Bonn University 14
  • 15. Typical pipeline for the participating systems in Trec 31 August 2015EIS research group - Bonn University 15 Query Expansion IR Approach Passage Retrieval Answer Selection Predicting the type of answer Answer Query
  • 16. QA at Clef The Cross-Language Evaluation Forum (CLEF) initiative provides  Since 2003, Clef included a QA track.  an infrastructure for the testing, tuning and evaluation of information retrieval systems operating on European languages in both monolingual and cross-language contexts. 31 August 2015EIS research group - Bonn University 16
  • 17. Outline  Introduction  Associations for evaluation QA systems  Preliminary Concepts  Data Web  Emerging Concepts  Deeper view on SINA Project 31 August 2015EIS research group - Bonn University 17
  • 18. Core of a QA system 31 August 2015EIS research group - Bonn University 18 query
  • 19. Syntactic Parsing: Part-of-speech Tagging I like sweet cakes I/PRP like/VBP sweet/JJ cakes/NNS 31 August 2015EIS research group - Bonn University 19 like/VBP I/PRP cakes/NNS sweet/JJ
  • 20. Type of Answer Where was the hamburger invented? White Castle traces the origin of the hamburger to Hamburg, Germany with its invention by Otto Kuase. 31 August 2015EIS research group - Bonn University 20 Place
  • 21. Named Entity Recognition on Query where was Franklin Roosevelt born? 31 August 2015EIS research group - Bonn University 21 Named Entity: Person
  • 22. Relation Extraction Barack Hussein Obama is the 44th and current President of the United States. Born in Honolulu, Hawaii, Obama is a graduate of Columbia University and Harvard Law School. 31 August 2015EIS research group - Bonn University 22 Named Entity: Person Named Entity: Place Relation: President of
  • 23. Watson Project  Watson is a computer which is capable of answering question issued in natural language.  Questions come from quiz show called Jeopardy.  The software of this project is called DeepQA project.  In 2011, Watson won the former winners of quiz show Jeopardy. 31 August 2015EIS research group - Bonn University 23
  • 24. Watson Description  Hardware: Watson system has 2,880 POWER processor threads and has 16 terabytes of RAM.  Data: encyclopedias, dictionaries, thesauri, newswire articles, and literary works. Watson also used databases, taxonomies, and ontologies such as DBPedia, WordNet, and Yago were used.  Software: DeepQA software and the Apache UIMA framework. The system was written in various languages, including Java, C++, and Prolog, and runs on the SUSE Linux Enterprise Server 11 operating system using Apache Hadoop framework to provide distributed computing. 31 August 2015EIS research group - Bonn University 24
  • 25. The high level architecture of DeepQA 31 August 2015EIS research group - Bonn University 25
  • 26. Decomposition example Query: Of the four countries in the world that the United States does not have diplomatic relations with, the one that’s farthest north. Bhutan, Cuba, Iran, North Korea North Korea 31 August 2015EIS research group - Bonn University 26
  • 27. Outline  Introduction  Associations for evaluation QA systems  Preliminary Concepts  Data Web  Emerging Concepts  Deeper view on SINA Project 31 August 2015EIS research group - Bonn University 27
  • 28. Evolution of Web 31 August 2015EIS research group - Bonn University 28
  • 29. The growth of Linked Open Data 31 August 2015EIS research group - Bonn University 29 August 2014 570 Datasets More than 74 billion triples May 2007 12 Datasets
  • 30. How to retrieve data from Linked Data? 31 August 2015EIS research group - Bonn University 30
  • 31. RDF Model  RDF is an standard for describing Web resources.  The RDF data model expresses statements about Web resources in the form of subject-predicate-object (triple).  The statement “Jack knows Alice” is represented as: 31 August 2015EIS research group - Bonn University 31 knowing
  • 32. Outline  Introduction  Associations for evaluation QA systems  Preliminary Concepts  Data Web  Emerging Concepts  Deeper view on SINA Project 31 August 2015EIS research group - Bonn University 32
  • 33. Semantic Indexing 31 August 2015EIS research group - Bonn University 33
  • 34. Semantic Annotation  Name Entity Recognition Where is the capital of Germany?  Semantic Annotation Where is the capital of Germany? 31 August 2015EIS research group - Bonn University 34 Named Entity: Place Entity: http://dbpedia.org/resource/Germ any
  • 35. 31 August 2015EIS research group - Bonn University 35 Berlin in the 1920s was the third largest municipality in the world. After World War II, the city became divided into East Berlin -- the capital of East Germany -- and West Berlin, a West German exclave surrounded by the Berlin Wall from 1961–89. Berlin in the 1920s was the third largest municipality in the world. After World War II, the city became divided into East Berlin -- the capital of East Germany -- and West Berlin, a West German exclave surrounded by the Berlin Wall from 1961–89.
  • 36. Relation extraction leveraging Data Web: BOA Library 31 August 2015EIS research group - Bonn University 36
  • 37. HAWK: Hybrid Question Answering over Linked Data 31 August 2015EIS research group - Bonn University 37
  • 38. TBSL: Template-based SPARQL generator 31 August 2015EIS research group - Bonn University 38
  • 39. Outline  Introduction  Associations for evaluation QA systems  Preliminary Concepts  Data Web  Emerging Concepts  Deeper view on SINA Project 31 August 2015EIS research group - Bonn University 39
  • 40. SINAArchitecture 31 August 2015EIS research group - Bonn University 40 Client QueryPreprocessing QueryExpansion ResourceRetrieval Disambiguation QueryConstruction Representation Server UnderlyingInterlinked KnowledgeBases query result keywords valid segments mapped resources tuple of resources SPARQL queries OWL API http client Stanford CoreNLP SegmentValidation Reformulated query
  • 41. Comparison of search approaches 31 August 2015EIS research group - Bonn University 41 Data-semantic unaware Data-semantic aware Keyword-based query
  • 42. Objective: transformation from textual query to formal query 31 August 2015EIS research group - Bonn University 42 Which televisions shows were created by Walt Disney? SELECT * WHERE { ?v0 a dbo:TelevisionShow. ?v0 dbo:creator dbr:Walt_Disney. }
  • 43. The addressed challenges in SINA 31 August 2015EIS research group - Bonn University 43
  • 44. Query Segmentation 31 August 2015EIS research group - Bonn University 44  Definition: query segmentation is the process of identifying the right segments of data items that occur in the keyword queries.
  • 45. Resource Disambiguation 31 August 2015EIS research group - Bonn University 45  Definition: resource disambiguation is the process of recognizing the suitable resources in the underlying knowledge base. •Who produced films starring Natalie Portman? •dbpedia/ontology/film •dbpedia/property/film •What are the side effects of drugs used for Tuberculosis? •diseasome:Tuberculosis •sider:Tuberculosis
  • 46. Concurrent Approach 31 August 2015EIS research group - Bonn University 46 Query Segmentation Resource Disambiguation
  • 47. Modeling using hidden Markov model 31 August 2015EIS research group - Bonn University 47 1 2 3 Unknown Entity 4 5 6 7 8 9 Start Keyword 1 Keyword 3Keyword 2 Keyword 4
  • 48. Bootstrapping the model parameters 1. Emission probability is defined based on the similarity of the label of each state with a segment, this similarity is computed based on string-similarity and Jaccard-similarity. 2. Semantic relatedness is a base for transition probability and initial probability. Intuitively, it is based on two values: distance and connectivity degree. We transform these two values to hub and authority values using weighted HITS algorithm. 3. HITS algorithm is a link analysis algorithm that was originally developed for ranking Web pages. It assign a hub and authority value to each web page. 4. Initial probability and transition probability are defined as a uniform distribution over the hub and and authority values. 31 August 2015EIS research group - Bonn University 48
  • 49. 31 August 2015EIS research group - Bonn University 49 Sequence of keywords (television show creat Walt Disney) Paths 0.0023 dbo:TelevisionShow dbo:creator dbr:Walt_Disney 0.0014 dbo:TelevisionShow dbo:creator dbr:Category:Walt_Disney 0.000589 dbr:TelevisionShow dbo:creator dbr:Walt_Disney 0.000353 dbr:TelevisionShow dbo:creator dbr:Category:Walt_Disney 0.0000376 dbp:television dbp:show dbo:creator dbr:Category:Walt_Disney Output of the model
  • 50. Query Expansion  Definition: query expansion is a way of reformulating the input query in order to overcome the vocabulary mismatch problem. 31 August 2015EIS research group - Bonn University 50 • Wife of Barak Obama •Spouse of Barak Obama
  • 51. Query Expansion 31 August 2015EIS research group - Bonn University 51
  • 52. Linguistic features  WordNet is a popular data source for expansion.  Linguistic features extracted from WordNet are: 1. Synonyms: words having a similar meanings to the input keyword. 2. Hyponyms: words representing a specialization of the input keyword. 3. Hypernyms: words representing a generalization of the input keyword. 31 August 2015EIS research group - Bonn University 52
  • 53. Semantic features from Linked Data 31 August 2015EIS research group - Bonn University 53 1. SameAs: deriving resources using owl:sameAs. 2. SeeAlso: deriving resources using rdfs:seeAlso. 3. Equivalence class/property: deriving classes or properties using owl:equivalentClass and owl:equivalentProperty. 4. Super class/property: deriving all super classes/properties of by following the rdfs:subClassOf or rdfs:subPropertyOf property. 5. Sub class/property: deriving resources by following the rdfs:subClassOf or rdfs:subPropertyOf property paths ending with the input resource.
  • 54. Semantic features from Linked Data 5. Broader concepts: deriving using the SKOS vocabulary properties skos:broader and skos:broadMatch. 6. Narrower concepts: deriving concepts using skos:narrower and skos:narrowMatch. 7. Related concepts: deriving concepts using skos:closeMatch, skos:mappingRelation and skos:exactMatch. 31 August 2015EIS research group - Bonn University 54
  • 55. Exemplary expansion graph of the word movie 31 August 2015EIS research group - Bonn University 55 movie home movie production film motion picture show video telefilm
  • 56. Statistics over the number of the derived words from WordNet and Linked Data 31 August 2015EIS research group - Bonn University 56 Feature #derived words synonym 503 hyponym 2703 hypernym 657 sameAs 2332 seeAlso 49 equivalence 2 super class/property 267 Sub class/property 2166
  • 57. Automatic query expansion 31 August 2015EIS research group - Bonn University 57 Input query External Data source Data extraction and preparation Heuristic method Reformulated query
  • 58. Expansion set for each segment 31 August 2015EIS research group - Bonn University 58
  • 59. Reformulating query using hidden Markov model 31 August 2015EIS research group - Bonn University 59 Barak Barak Obama spouse Obama wife first lady woman Barak Obama wife Barak Obama Start Obama wife Barak Obama wife Input query: wife of Barak Obama
  • 60. Formal Query Construction  Definition: Once the resources are detected, a connected subgraph of the knowledge base graph, called the query graph, has to be determined which fully covers the set of mapped resources. 31 August 2015EIS research group - Bonn University 60 Disambiguated resources sider:sideEffect diseasome:possibleDrug diseasome:1154 SPARQL query SELECT ?v3 WHERE { diseasome:115 diseasome:possibleDrug ?v1 . ?v1 owl:sameAs ?v2 . ?v2 sider:sideEffect ?v3 .}
  • 61. Data Fusion on Linked Data  Answer of a question may be spread among different datasets employing heterogeneous schemas.  Constructing a federated query from needs to exploit links between the different datasets on the schema and instance levels. 31 August 2015EIS research group - Bonn University 61
  • 62. Two different approaches Formal query construction based on  Template-based query construction  Forward chaining based query construction 31 August 2015EIS research group - Bonn University 62
  • 63. Federated query construction using forward chaining 31 August 2015EIS research group - Bonn University 63 Query What is the side effects of drugs used for Tuberculosis? resources diseasome:1154 (type instance) diseasome:possibleDrug (type property) sider:sideEffect (type property) 1154 ?v0 possibleDrug Graph 1 ?v1 ?v2 sideEffect Graph 2 115 4 ?v0 possibleDrug Template 1 ?v1 ?v2 sideEffect Template 2 115 4 ?v0 possibleDrug ?v1 ?v2 sideEffect
  • 64. Any question? 31 August 2015EIS research group - Bonn University 64

Editor's Notes

  1. UIMA (Unstructured Information Management Architecture)
  2. . Decomposition: Some more complex clues contain multiple facts about the answer, all of which are required to arrive at the correct response but are unlikely to occur together in one place.