SEMANTIC SEARCH AND RESULT
PRESENTATION WITH ENTITY CARDS
FAEGHEH HASIBI | SEARCH ENGINES AMSTERDAM | JUNE 30, 2017
SEMANTIC SEARCH
“Search with Meaning”
KNOWLEDGE BASE
(Core data-enabling component of semantic search)
Albert Einstein
ENTITY
<dbr:Albert_Einstein, foaf:name, Albert Einstein>
<dbr:Albert_Einstein, dbo:birthDate, 1879-03-14>
<dbr:Albert_Einstein, dbo:birthPlace, dbr:Ulm>
<dbr:Albert_Einstein, dbo:birthPlace, dbr:German_Empire>
<dbr:Albert_Einstein, dbp:description, dbr:Physicist>
…
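As a rough illustration, triples like the ones above can be held as plain (subject, predicate, object) tuples and grouped per predicate. This is a minimal Python sketch; `TRIPLES` and `facts_by_predicate` are hypothetical names chosen for the example and are not part of DBpedia or of any toolkit mentioned in these slides.

```python
from collections import defaultdict

# Triples copied from the slide; each one is (subject, predicate, object).
TRIPLES = [
    ("dbr:Albert_Einstein", "foaf:name", "Albert Einstein"),
    ("dbr:Albert_Einstein", "dbo:birthDate", "1879-03-14"),
    ("dbr:Albert_Einstein", "dbo:birthPlace", "dbr:Ulm"),
    ("dbr:Albert_Einstein", "dbo:birthPlace", "dbr:German_Empire"),
    ("dbr:Albert_Einstein", "dbp:description", "dbr:Physicist"),
]


def facts_by_predicate(triples, entity):
    """Group the objects of one entity's triples under their predicate."""
    grouped = defaultdict(list)
    for subject, predicate, obj in triples:
        if subject == entity:
            grouped[predicate].append(obj)
    return dict(grouped)


einstein = facts_by_predicate(TRIPLES, "dbr:Albert_Einstein")
print(einstein["dbo:birthPlace"])
```

Grouping by predicate already exposes the multi-valued case (two `dbo:birthPlace` objects) that the summary generation step has to deal with later in the deck.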
SEMANTIC SEARCH
“Search with Meaning”
an umbrella term that encompasses various techniques
‣ Knowledge acquisition and curation
‣ Query understanding
‣ Answer retrieval
‣ Result presentation
‣ …
BUILDING BLOCKS
OVERVIEW
‣ Result presentation
‣ Resources
RESULT PRESENTATION
Summarizing Entities for Entity Cards
F. Hasibi, K. Balog, and S.E. Bratsberg. “Dynamic Factual Summaries for Entity Cards”. In Proceedings of SIGIR ’17.
ENTITY CARDS
Entity
Summary
ENTITY SUMMARIZATION
Albert Einstein
… and ~700 more facts
dbo:almaMater dbr:ETH_Zurich
dbo:almaMater dbr:University_of_Zurich
dbo:award dbr:Max_Planck_Medal
dbo:award dbr:Nobel_Prize_in_Physics
dbo:birthDate 1879-03-14
dbo:birthPlace dbr:Ulm
dbo:birthPlace dbr:German_Empire
dbo:citizenship dbr:Austria-Hungary
dbo:children dbr:Eduard_Einstein
dbo:children dbr:Hans_Albert_Einstein
dbo:deathDate 1955-04-18
dbo:deathPlace dbr:Princeton,_New_Jersey
dbo:spouse dbr:Elsa_Einstein
dbo:spouse dbr:Mileva_Marić
dbp:influenced dbr:Leo_Szilard
ENTITY SUMMARIES
Queries: “einstein awards”, “einstein family”
Other applications
‣ News search
• hovering over an entity in entity-annotated documents
‣ Job search
• company descriptions for a given topic
ENTITY SUMMARIES
Question: How to generate query-dependent entity summaries that can directly address users’ information needs?
METHOD
Fact ranking
Ranking a set of entity facts (and a search query)
with respect to some criterion
Summary generation
Constructing an entity summary from ranked entity facts, for a given size
RANKING CRITERIA
Importance: The general importance of a fact in describing the entity, irrespective of any particular information need.
Relevance: The relevance of a fact to the query reflects how well the fact supports the information need underlying the query.
RANKING CRITERIA
Utility: The utility of a fact combines the general importance and the relevance of a fact into a single number.
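The slide only says that utility "combines" importance and relevance into one number. A minimal sketch, assuming the combination is a weighted sum: the function name `utility`, the weights `alpha` and `beta`, and the 0 to 2 grade scale are illustrative assumptions and are not taken from the slides.

```python
def utility(importance, relevance, alpha=1.0, beta=1.0):
    """Combine two graded judgments into one score.

    Assumption: a weighted sum. alpha and beta are hypothetical weights that
    shift the bias towards importance or towards relevance.
    """
    return alpha * importance + beta * relevance


# Hypothetical grades on a 0-2 scale for a single fact.
balanced = utility(importance=2, relevance=1)
importance_only = utility(importance=2, relevance=1, beta=0.0)
print(balanced, importance_only)
```

Setting one weight to zero reduces the score to a pure importance (or pure relevance) criterion, which is one way to read the "/imp" and "/rel" variants reported later.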
FACT RANKING
[Diagram labels: Importance, Relevance]
‣ Supervised ranking with fact-query pairs as learning instances
‣ Learning is optimized on utility with different weights
• more bias towards importance or relevance
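To make "fact-query pairs as learning instances" concrete, the sketch below builds training targets from judged pairs and shows how changing the weights changes the target ordering. Everything here is an assumption for illustration: the `Instance` class, the `make_instances` helper, the weighted-sum target, and the two judgments (which are invented, not taken from the benchmark). No particular learning algorithm is implied.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    query: str
    fact: tuple   # (predicate, object)
    label: float  # target value a supervised ranker would be trained on


# Hypothetical judgments, invented for illustration only.
JUDGED = [
    {"query": "einstein awards", "fact": ("dbo:birthDate", "1879-03-14"),
     "importance": 2, "relevance": 0},
    {"query": "einstein awards", "fact": ("dbo:award", "dbr:Nobel_Prize_in_Physics"),
     "importance": 1, "relevance": 2},
]


def make_instances(judged, alpha, beta):
    """One learning instance per fact-query pair, labelled with a weighted utility."""
    return [
        Instance(row["query"], row["fact"],
                 alpha * row["importance"] + beta * row["relevance"])
        for row in judged
    ]


def target_order(instances):
    """Predicates sorted by label, best first: the ordering the ranker should learn."""
    ranked = sorted(instances, key=lambda inst: inst.label, reverse=True)
    return [inst.fact[0] for inst in ranked]


importance_biased = target_order(make_instances(JUDGED, alpha=1.0, beta=0.0))
balanced = target_order(make_instances(JUDGED, alpha=1.0, beta=1.0))
print(importance_biased, balanced)
```

With these toy grades the birth date leads under an importance-only target, while the award moves to the top once relevance to "einstein awards" is counted.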
FACT RANKING
‣ Knowledge base statistics as ingredients for importance features
• used in the absence of query logs
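One simple knowledge-base statistic of this kind is how many entities carry a given predicate. The sketch below is an illustrative assumption, not the feature definition used in the paper: `predicate_entity_frequency` is a hypothetical name, and the toy knowledge base uses synthetic entity and value identifiers.

```python
def predicate_entity_frequency(kb, predicate):
    """Fraction of entities in kb that have at least one fact with this predicate.

    kb maps an entity id to a list of (predicate, object) pairs.
    """
    if not kb:
        return 0.0
    having = sum(
        1 for facts in kb.values()
        if any(p == predicate for p, _ in facts)
    )
    return having / len(kb)


# Synthetic toy knowledge base (entity ids and values are placeholders).
TOY_KB = {
    "e1": [("dbo:birthDate", "v1"), ("dbo:award", "v2"), ("dbo:award", "v3")],
    "e2": [("dbo:birthDate", "v4")],
    "e3": [("dbo:birthPlace", "v5")],
    "e4": [("dbo:birthDate", "v6"), ("dbo:birthPlace", "v7")],
}

birth_date_freq = predicate_entity_frequency(TOY_KB, "dbo:birthDate")
award_freq = predicate_entity_frequency(TOY_KB, "dbo:award")
print(birth_date_freq, award_freq)
```

Statistics like this need only the knowledge base itself, which is why they remain usable when no query log is available.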
SUMMARY GENERATION
1. dbo:birthDate 1879-03-14
2. dbp:placeOfBirth Ulm
3. dbo:birthPlace dbr:Ulm
4. dbo:deathDate 1955-04-18
5. dbo:award dbr:Nobel_Prize_in_Physics
6. dbo:deathPlace dbr:Princeton,_New_Jersey
7. dbo:birthPlace dbr:German_Empire
8. dbo:almaMater dbr:ETH_Zurich
9. dbo:award dbr:Max_Planck_Medal
10. dbp:influenced dbr:Nathan_Rosen
11. dbo:almaMater dbr:University_of_Zurich
…
Issue: multi-valued predicates (e.g. dbo:birthPlace, dbo:award, and dbo:almaMater each occur more than once)
Issue: identical facts (e.g. dbp:placeOfBirth Ulm and dbo:birthPlace dbr:Ulm in the ranked list above)
SUMMARY GENERATION
Algorithm 1 Summary generation algorithm
Input: Ranked facts F_e, max height h, max width w
Output: Entity summary lines
M ← Predicate-Name Mapping(F_e)
headings ← []                          ▷ Determine line headings
for f in F_e do
    p_name ← M[f_p]
    if (p_name ∉ headings) AND (size(headings) ≤ h) then
        headings.add((f_p, p_name))
    end if
end for
values ← []                            ▷ Determine line values
for f in F_e do
    if f_p ∈ headings then
        values[f_p].add(f_o)
    end if
end for
lines ← []                             ▷ Construct lines
for (f_p, p_name) in headings do
    line ← p_name + ‘:’
    for v in values[f_p] do
        if len(line) + len(v) ≤ w then
            line ← line + v            ▷ Add comma if needed
        end if
    end for
    lines.add(line)
end for
‣ Creates a summary of a given size (length and width)
‣ Resolving identical facts (RF feature)
‣ Grouping multi-valued predicates (GF feature)
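The three points above can be sketched in Python as follows. This is a loose, hedged reading of the algorithm rather than the authors' implementation: headings are keyed by display name, the height check is strict so that at most `max_height` lines are produced, and the `label` helper and the `PREDICATE_NAMES` mapping are hypothetical. The ranked facts are the ones shown on the earlier slide.

```python
def label(obj):
    """Turn a resource such as 'dbr:German_Empire' into display text (assumed helper)."""
    if obj.startswith("dbr:"):
        return obj[len("dbr:"):].replace("_", " ")
    return obj


def generate_summary(ranked_facts, predicate_names, max_height, max_width):
    # 1. Line headings: one per distinct display name, in rank order.
    headings = []
    for predicate, _ in ranked_facts:
        name = predicate_names[predicate]
        if name not in headings and len(headings) < max_height:
            headings.append(name)

    # 2. Line values: group objects under their heading (GF) and
    #    skip values already present for that heading (RF).
    values = {name: [] for name in headings}
    for predicate, obj in ranked_facts:
        name = predicate_names[predicate]
        text = label(obj)
        if name in values and text not in values[name]:
            values[name].append(text)

    # 3. Lines: append values while the line stays within max_width.
    lines = []
    for name in headings:
        line, count = name + ":", 0
        for value in values[name]:
            separator = ", " if count else " "
            if len(line) + len(separator) + len(value) <= max_width:
                line += separator + value
                count += 1
        lines.append(line)
    return lines


RANKED_FACTS = [
    ("dbo:birthDate", "1879-03-14"), ("dbp:placeOfBirth", "Ulm"),
    ("dbo:birthPlace", "dbr:Ulm"), ("dbo:deathDate", "1955-04-18"),
    ("dbo:award", "dbr:Nobel_Prize_in_Physics"),
    ("dbo:deathPlace", "dbr:Princeton,_New_Jersey"),
    ("dbo:birthPlace", "dbr:German_Empire"), ("dbo:almaMater", "dbr:ETH_Zurich"),
    ("dbo:award", "dbr:Max_Planck_Medal"), ("dbp:influenced", "dbr:Nathan_Rosen"),
    ("dbo:almaMater", "dbr:University_of_Zurich"),
]
# Hypothetical predicate-to-name mapping; two predicates share "Birth place".
PREDICATE_NAMES = {
    "dbo:birthDate": "Birth date", "dbp:placeOfBirth": "Birth place",
    "dbo:birthPlace": "Birth place", "dbo:deathDate": "Death date",
    "dbo:award": "Awards", "dbo:deathPlace": "Death place",
    "dbo:almaMater": "Alma mater", "dbp:influenced": "Influenced",
}

summary = generate_summary(RANKED_FACTS, PREDICATE_NAMES, max_height=4, max_width=60)
narrow = generate_summary(RANKED_FACTS, PREDICATE_NAMES, max_height=4, max_width=30)
print("\n".join(summary))
```

Mapping both birth-place predicates to one display name is what lets the duplicate "Ulm" collapse into a single value, while the second award is appended to the existing "Awards" line instead of taking a line of its own.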
EVALUATION
QUERIES
Query types
‣ Named entity
‣ Keyword
‣ List search
‣ Natural language
Example queries
• “madrid”
• “brooklyn bridge”
• “vietnam war facts”
• “eiffel”
• “states that border oklahoma”
• “What is the second highest mountain?”
Taken from the DBpedia-entity collection
K. Balog and R. Neumayer. “A Test Collection for Entity Search in DBpedia”. In Proceedings of SIGIR ’13.
EVALUATION (FACT RANKING)
Benchmark construction by crowdsourcing experiments
‣ rate the importance of the fact w.r.t. the entity
‣ rate the relevance of the fact to the query for the given entity
Example question: How important is this fact for the given entity?
• Very important
• Important
• Not important
‣ Collecting judgments for ~4K facts
‣ 5 judgments per record
‣ Fleiss’ Kappa of 0.52 and 0.41 for importance and relevance, respectively (moderate agreement)
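For reference, Fleiss' kappa for a fixed number of raters per item can be computed as in the sketch below. The standard formula is used; the function name and the toy count matrices are illustrative and do not reproduce the benchmark's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a matrix where counts[i][j] is the number of raters
    who assigned item i to category j. Every row must sum to the same number
    of raters (here that would be 5 judgments per record)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Share of all assignments that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item, then averaged over items.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    # Agreement expected by chance.
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1 - p_expected)


# Toy data: 3 facts, 5 raters, categories (very important, important, not important).
perfect = fleiss_kappa([[5, 0, 0], [0, 5, 0], [0, 0, 5]])
print(perfect)
```

A value of 1 means complete agreement and values at or below 0 mean agreement no better than chance.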
RESULTS (FACT RANKING)
Table 2: Comparison of fact ranking against the state-of-the-art approaches with URI-only objects. Significance for lines i > 3 is tested against lines 1, 2, 3, and for lines 2, 3 against lines 1, 2.
Model       Importance              Utility
            NDCG@5      NDCG@10     NDCG@5      NDCG@10
RELIN       0.6368      0.7130      0.6300      0.7066
LinkSum     0.7018△     0.7031      0.6504      0.6648
SUMMARUM    0.7181▲     0.7412△     0.6719      0.7111
DynES/imp   0.8354▲▲▲   0.8604▲▲▲   0.7645▲▲▲   0.8117▲▲▲
DynES       0.8291▲▲▲   0.8652▲▲▲   0.8164▲▲▲   0.8569▲▲▲
Table 4: Fact ranking performance by removing features; features are sorted by the relative difference they make.
Group   Removed feature        NDCG@10   Δ%      p
        DynES - all features   0.7873    -       -
Imp.    - NEFp                 0.7757    -1.16   0.08
Imp.    - TypeImp              0.7760    -1.13   0.14
16% improvement over the best baseline
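The tables above report NDCG at cutoffs 5 and 10. A minimal sketch of the metric follows, assuming linear gains and a log2 rank discount; the slides do not say which NDCG variant was used, so treat this formulation as an assumption.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k positions (rank 1 has discount 1)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(gains, k):
    """DCG of the given ranking divided by the DCG of the ideal reordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Graded judgments of the facts in ranked order (hypothetical grades).
print(ndcg_at_k([2, 0, 1, 2, 0], k=5))
```

A ranking that already lists facts in decreasing grade order scores 1.0, and pushing high-grade facts down the list lowers the score.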

EVALUATION (SUMMARY GENERATION)
‣ Users consume all facts displayed in the summary
‣ The quality of the whole summary should be assessed
‣ Side-by-side evaluation of factual summaries by humans
RESULTS (SUMMARY GENERATION)
[Figure 4: Boxplot for distribution of user preferences for each q… DynES/imp or DynES/rel. Panel (a): DynES vs. DynES/imp; y-axis: user preferences]
Table 5: Side-by-side evaluation of summaries for different fact ranking methods.
Model                       Win   Loss   Tie   RI
DynES vs. DynES/imp         46    23     31    0.23
DynES vs. DynES/rel         75    12     13    0.63
DynES vs. RELIN             95    5      0     0.90
Utility vs. Importance      47    16     37    0.31
Table 6: Side-by-side evaluation of summaries for different summary generation algorithms.
Model                       Win   Loss   Tie   RI
DynES vs. DynES(-GF)(-RF)   84    1      15    0.83
DynES vs. DynES(-GF)        74    0      26    0.74
DynES vs. DynES(-RF)        46    2      52    0.44
… preferred DynES summaries over DynES/imp (or DynES/rel) summaries; ties are ignored. Considering all queries (the black boxes), we observe that the utility-based summaries (DynES) are generally preferred over the other two, and especially over the relevance-…
• Users preferred utility-based
summaries over the others
• Grouping of multivalued
predicates (GF) is perceived
as more important by the
users than the resolution of
identical facts (RF)
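The RI column in Tables 5 and 6 is consistent with (Win − Loss) / (Win + Loss + Tie). The sketch below checks that reading against the reported rows; the function name `ri` is chosen for this example, and the formula is inferred from the numbers rather than stated on the slides.

```python
def ri(win, loss, tie):
    """Net share of comparisons won: (win - loss) / (win + loss + tie)."""
    return (win - loss) / (win + loss + tie)


# (comparison, win, loss, tie, reported RI) as listed in Tables 5 and 6.
ROWS = [
    ("DynES vs. DynES/imp", 46, 23, 31, 0.23),
    ("DynES vs. DynES/rel", 75, 12, 13, 0.63),
    ("DynES vs. RELIN", 95, 5, 0, 0.90),
    ("Utility vs. Importance", 47, 16, 37, 0.31),
    ("DynES vs. DynES(-GF)(-RF)", 84, 1, 15, 0.83),
    ("DynES vs. DynES(-GF)", 74, 0, 26, 0.74),
    ("DynES vs. DynES(-RF)", 46, 2, 52, 0.44),
]

for name, win, loss, tie, reported in ROWS:
    print(f"{name}: computed {ri(win, loss, tie):.2f}, reported {reported:.2f}")
```

Read this way, removing grouping (GF) costs 0.74 while removing identical-fact resolution (RF) costs 0.44, which matches the second conclusion above.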
RESOURCES
1. Entity search toolkit
2. Test collection
SEMANTIC SEARCH TOOLKIT
Functionalities
Entity retrieval: Returns a ranked list of entities in response to a query
Entity linking: Identifies entities in a query and links them to the corresponding entry in the knowledge base
Target type identification: Detects the target types (or categories) of a query
Highlights
• Web interface, API, and command line usage
• 3-tier architecture
• Online source code and documentation
NORDLYS
http://nordlys.cc/
DBPEDIA-ENTITY V2
Details
• (37 + 17) runs + old qrels
• Pool size: 150K
• 3-point Likert scale
• 5 judgments per query-entity pair
Baseline runs:
Model      SemSearch ES     INEX-LD          ListSearch       QALD-2           Total
           @10     @100     @10     @100     @10     @100     @10     @100     @10     @100
BM25       0.2497  0.4110   0.1828  0.3612   0.0627  0.3302   0.2751  0.3366   0.2558  0.3582
PRMS       0.5340  0.6108   0.3590  0.4295   0.3684  0.4436   0.3151  0.4026   0.3905  0.4688
MLM-all    0.5528  0.6247   0.3752  0.4493   0.3712  0.4577   0.3249  0.4208   0.4021  0.4852
LM         0.5555  0.6475   0.3999  0.4745   0.3925  0.4723   0.3412  0.4338   0.4182  0.5036
SDM        0.5535  0.6672   0.4030  0.4911   0.3961  0.4900   0.3390  0.4274   0.4185  0.5143
LM+ELR     0.5557  0.6477   0.4013  0.4763   0.4037  0.4885   0.3464  0.4377   0.4228  0.5093
SDM+ELR    0.5533  0.6676   0.4097  0.4975   0.4142  0.5058   0.3434  0.4350   0.4257  0.5220
MLM-CA     0.6247  0.6854   0.4029  0.4796   0.4021  0.4786   0.3365  0.4301   0.4365  0.5143
BM25-CA    0.5858  0.6883   0.4120  0.5050   0.4220  0.5142   0.3566  0.4426   0.4399  0.5329
FSDM       0.6521  0.7220   0.4214  0.5043   0.4196  0.4952   0.3401  0.4358   0.4524  0.5342
BM25F-CA   0.6281  0.7200   0.4394  0.5296   0.4252  0.5106   0.3689  0.4614   0.4605  0.5505
FSDM+ELR   0.6568  0.7260   0.4397  0.5144   0.4246  0.5011   0.3467  0.4450   0.4607  0.5416
Table 8.3: Results, broken down into query subtypes, on DBpedia-entity v2.
Generative models are available on Nordlys
DBPEDIA-ENTITY COLLECTIONS
Entity Retrieval
• 468 queries
• 19K relevant entities
Entity Summarization
• 100 queries
• 4K entity facts
Target Type Identification
• 485 queries
• ~900 types
Built with DBpedia 2015-10
https://github.com/iai-group/DBpedia-Entity
ACKNOWLEDGEMENT
Nordlys:
Krisztian Balog, Dario Garigliotti, Shuo Zhang, Heng Ding
DBpedia-Entity Collection:
Fedor Nikolaev, Chenyan Xiong, Svein Erik Bratsberg, Krisztian Balog, Alexander Kotov, Jamie Callan
THANK YOU

More Related Content

Similar to Semantic Search and Result Presentation with Entity Cards

Session 1.5 supporting virtual integration of linked data with just-in-time...
Session 1.5   supporting virtual integration of linked data with just-in-time...Session 1.5   supporting virtual integration of linked data with just-in-time...
Session 1.5 supporting virtual integration of linked data with just-in-time...semanticsconference
 
Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysisYabebal Ayalew
 
Collective entity linking with WSRM DocEng'19
Collective entity linking with WSRM DocEng'19Collective entity linking with WSRM DocEng'19
Collective entity linking with WSRM DocEng'19ngamou
 
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...Databricks
 
Learning content - Data Science Basics
Learning content - Data Science Basics Learning content - Data Science Basics
Learning content - Data Science Basics PredicSis
 
ArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on ArraysArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on ArraysGoon83
 
Towards research data knowledge graphs
Towards research data knowledge graphsTowards research data knowledge graphs
Towards research data knowledge graphsStefan Dietze
 
Data Mining For Supermarket Sale Analysis Using Association Rule
Data Mining For Supermarket Sale Analysis Using Association RuleData Mining For Supermarket Sale Analysis Using Association Rule
Data Mining For Supermarket Sale Analysis Using Association Ruleijtsrd
 
Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...Greg Makowski
 
Standard Datasets in Information Retrieval
Standard Datasets in Information Retrieval Standard Datasets in Information Retrieval
Standard Datasets in Information Retrieval Jean Brenda
 
Reranking based-recommender-system-with-deep-learning
Reranking based-recommender-system-with-deep-learningReranking based-recommender-system-with-deep-learning
Reranking based-recommender-system-with-deep-learningAhmed Saleh
 
Introducing Multi Valued Vectors Fields in Apache Lucene
Introducing Multi Valued Vectors Fields in Apache LuceneIntroducing Multi Valued Vectors Fields in Apache Lucene
Introducing Multi Valued Vectors Fields in Apache LuceneSease
 
Preserving the currency of analytics outcomes over time through selective re-...
Preserving the currency of analytics outcomes over time through selective re-...Preserving the currency of analytics outcomes over time through selective re-...
Preserving the currency of analytics outcomes over time through selective re-...Paolo Missier
 
The lifecycle of reproducible science data and what provenance has got to do ...
The lifecycle of reproducible science data and what provenance has got to do ...The lifecycle of reproducible science data and what provenance has got to do ...
The lifecycle of reproducible science data and what provenance has got to do ...Paolo Missier
 
2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledge2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledgeChristopher Williams
 

Similar to Semantic Search and Result Presentation with Entity Cards (20)

Session 1.5 supporting virtual integration of linked data with just-in-time...
Session 1.5   supporting virtual integration of linked data with just-in-time...Session 1.5   supporting virtual integration of linked data with just-in-time...
Session 1.5 supporting virtual integration of linked data with just-in-time...
 
Bill howe 2_databases
Bill howe 2_databasesBill howe 2_databases
Bill howe 2_databases
 
Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysis
 
Collective entity linking with WSRM DocEng'19
Collective entity linking with WSRM DocEng'19Collective entity linking with WSRM DocEng'19
Collective entity linking with WSRM DocEng'19
 
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs t...
 
Learning content - Data Science Basics
Learning content - Data Science Basics Learning content - Data Science Basics
Learning content - Data Science Basics
 
ArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on ArraysArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on Arrays
 
Towards research data knowledge graphs
Towards research data knowledge graphsTowards research data knowledge graphs
Towards research data knowledge graphs
 
Data Mining For Supermarket Sale Analysis Using Association Rule
Data Mining For Supermarket Sale Analysis Using Association RuleData Mining For Supermarket Sale Analysis Using Association Rule
Data Mining For Supermarket Sale Analysis Using Association Rule
 
Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...
 
Standard Datasets in Information Retrieval
Standard Datasets in Information Retrieval Standard Datasets in Information Retrieval
Standard Datasets in Information Retrieval
 
Srikanta Mishra
Srikanta MishraSrikanta Mishra
Srikanta Mishra
 
Reranking based-recommender-system-with-deep-learning
Reranking based-recommender-system-with-deep-learningReranking based-recommender-system-with-deep-learning
Reranking based-recommender-system-with-deep-learning
 
Introducing Multi Valued Vectors Fields in Apache Lucene
Introducing Multi Valued Vectors Fields in Apache LuceneIntroducing Multi Valued Vectors Fields in Apache Lucene
Introducing Multi Valued Vectors Fields in Apache Lucene
 
Preserving the currency of analytics outcomes over time through selective re-...
Preserving the currency of analytics outcomes over time through selective re-...Preserving the currency of analytics outcomes over time through selective re-...
Preserving the currency of analytics outcomes over time through selective re-...
 
The lifecycle of reproducible science data and what provenance has got to do ...
The lifecycle of reproducible science data and what provenance has got to do ...The lifecycle of reproducible science data and what provenance has got to do ...
The lifecycle of reproducible science data and what provenance has got to do ...
 
Kenett On Information NYU-Poly 2013
Kenett On Information NYU-Poly 2013Kenett On Information NYU-Poly 2013
Kenett On Information NYU-Poly 2013
 
Data mining
Data miningData mining
Data mining
 
Data mining
Data miningData mining
Data mining
 
2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledge2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledge
 

More from Faegheh Hasibi

Entity Linking in Queries: Efficiency vs. Effectiveness
Entity Linking in Queries: Efficiency vs. EffectivenessEntity Linking in Queries: Efficiency vs. Effectiveness
Entity Linking in Queries: Efficiency vs. EffectivenessFaegheh Hasibi
 
Exploiting Entity Linking in Queries For Entity Retrieval
Exploiting Entity Linking in Queries For Entity RetrievalExploiting Entity Linking in Queries For Entity Retrieval
Exploiting Entity Linking in Queries For Entity RetrievalFaegheh Hasibi
 
On the Reproducibility of the TAGME entity linking system
On the Reproducibility of the TAGME entity linking systemOn the Reproducibility of the TAGME entity linking system
On the Reproducibility of the TAGME entity linking systemFaegheh Hasibi
 
Being a PhD student: Experiences and Challenges
Being a PhD student: Experiences and ChallengesBeing a PhD student: Experiences and Challenges
Being a PhD student: Experiences and ChallengesFaegheh Hasibi
 
Entity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and EvaluationEntity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and EvaluationFaegheh Hasibi
 

More from Faegheh Hasibi (6)

Entity Linking in Queries: Efficiency vs. Effectiveness
Entity Linking in Queries: Efficiency vs. EffectivenessEntity Linking in Queries: Efficiency vs. Effectiveness
Entity Linking in Queries: Efficiency vs. Effectiveness
 
Exploiting Entity Linking in Queries For Entity Retrieval
Exploiting Entity Linking in Queries For Entity RetrievalExploiting Entity Linking in Queries For Entity Retrieval
Exploiting Entity Linking in Queries For Entity Retrieval
 
On the Reproducibility of the TAGME entity linking system
On the Reproducibility of the TAGME entity linking systemOn the Reproducibility of the TAGME entity linking system
On the Reproducibility of the TAGME entity linking system
 
Being a PhD student: Experiences and Challenges
Being a PhD student: Experiences and ChallengesBeing a PhD student: Experiences and Challenges
Being a PhD student: Experiences and Challenges
 
Entity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and EvaluationEntity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and Evaluation
 
Yalda night
Yalda nightYalda night
Yalda night
 

Recently uploaded

VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PPRINCE C P
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyDrAnita Sharma
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.Nitya salvi
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticssakshisoni2385
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 

Recently uploaded (20)

VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomology
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 

Semantic Search and Result Presentation with Entity Cards

  • 1. SEMANTIC SEARCH AND RESULT PRESENTATION WITH ENTITY CARDS FAEGHEH HASIBI | SEARCH ENGINES AMSTERDAM | JUNE 30, 2017
  • 2. SEMANTIC SEARCH AND RESULT PRESENTATION WITH ENTITY CARDS FAEGHEH HASIBI | SEARCH ENGINES AMSTERDAM | JUNE 30, 2017
  • 5.
  • 6.
  • 7. KNOWLEDGE BASE (Core data-enabling component of semantic search)
  • 8. Albert Einstein ENTITY <dbr:Albert_Einstein, foaf:name, Albert Einstein> <dbr:Albert_Einstein, dbo:birthDate, 1879-03-14> <dbr:Albert_Einstein, dbo:birthPlace, dbr:Ulm> <dbr:Albert_Einstein, dbo:birthPlace, dbr:German_Empire> <dbr:Albert_Einstein, dbp:description, dbr:Physicist> …
  • 10. SEMANTIC SEARCH “Search with Meaning” an umbrella term that encompasses various techniques
  • 11. ‣ Knowledge acquisition
 and curation ‣ Query understanding ‣ Answer retrieval ‣ Result presentation ‣ … BUILDING BLOCKS
  • 13. RESULT PRESENTATION Summarizing Entities for Entity Cards F. Hasibi, K. Balog, and S.E. Bratsberg. “Dynamic Factual Summaries for Entity Cards”. 
 In Proceedings of SIGIR ’17
  • 15. ENTITY SUMMARIZATION Albert Einstein … and ~700 more facts dbo:almaMater dbr:ETH_Zurich dbo:almaMater dbr:University_of_Zurich dbo:award dbr:Max_Planck_Medal dbo:award dbr:Nobel_Prize_in_Physics dbo:birthDate 1879-03-14 dbo:birthPlace dbr:Ulm dbo:birthPlace dbr:German_Empire dbo:citizenship dbr:Austria-Hungary dbo:children dbr:Eduard_Einstein dbo:children dbr:Hans_Albert_Einstein dbo:deathDate 1955-04-18 dbo:deathPlace dbr:Princeton,_New_Jersey dbo:spouse dbr:Elsa_Einstein dbo:spouse dbr:Mileva_Marić dbp:influenced dbr:Leo_Szilard
  • 17. ENTITY SUMMARIES Other applications ‣ News search • hovering over an entity in entity-annotated documents ‣ Job search • company descriptions for a given topic
  • 18. ENTITY SUMMARIES Question: How to generate query-dependent entity summaries that can directly address users’ information needs?
  • 19. METHOD Fact ranking: ranking a set of entity facts (given a search query) with respect to some criterion. Summary generation: constructing an entity summary of a given size from the ranked entity facts.
  • 20. RANKING CRITERIA Importance: the general importance of a fact in describing the entity, irrespective of any particular information need. Relevance: the relevance of a fact to the query, reflecting how well the fact supports the information need underlying the query.
  • 21. RANKING CRITERIA Utility: the utility of a fact combines its general importance and its relevance into a single number.
  • 22. FACT RANKING ‣ Feature groups: Importance, Relevance ‣ Supervised ranking with fact-query pairs as learning instances ‣ Learning is optimized on utility with different weights • more bias towards importance or relevance
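
The slides do not give the formula that combines importance and relevance. As a minimal sketch, assuming a simple weighted sum of the two labels (the `alpha` weight, the function names, and the toy labels are all assumptions made here), the "bias towards importance or relevance" could look like this in Python:

```python
from typing import List, Tuple


def utility(importance: float, relevance: float, alpha: float = 0.5) -> float:
    """Weighted sum of importance and relevance (assumed form, not from the slides).

    alpha = 1.0 uses importance only; alpha = 0.0 uses relevance only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * importance + (1.0 - alpha) * relevance


def rank_by_utility(facts: List[Tuple[str, float, float]], alpha: float = 0.5) -> List[str]:
    """Sort (fact, importance, relevance) tuples by descending utility."""
    ranked = sorted(facts, key=lambda f: utility(f[1], f[2], alpha), reverse=True)
    return [f[0] for f in ranked]


# Toy labels on a 0-2 scale, invented purely for illustration.
toy_facts = [
    ("dbo:birthDate 1879-03-14", 2.0, 0.0),
    ("dbo:award dbr:Nobel_Prize_in_Physics", 1.0, 2.0),
]

print(rank_by_utility(toy_facts, alpha=1.0))  # importance only
print(rank_by_utility(toy_facts, alpha=0.0))  # relevance only
```

Shifting `alpha` changes which fact ranks first, which is the kind of trade-off the slide describes; the actual weighting used in the talk may differ.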
  • 23. FACT RANKING ‣ Knowledge base statistics serve as ingredients for importance features • needed in the absence of query logs
  • 24. METHOD Fact ranking: ranking a set of entity facts (given a search query) with respect to some criterion. Summary generation: constructing an entity summary of a given size from the ranked entity facts.
  • 25. SUMMARY GENERATION 1. dbo:birthDate 1879-03-14 2. dbp:placeOfBirth Ulm 3. dbo:birthPlace dbr:Ulm 4. dbo:deathDate 1955-04-18 5. dbo:award dbr:Nobel_Prize_in_Physics 6. dbo:deathPlace dbr:Princeton,_New_Jersey 7. dbo:birthPlace dbr:German_Empire 8. dbo:almaMater dbr:ETH_Zurich 9. dbo:award dbr:Max_Planck_Medal 10. dbp:influenced dbr:Nathan_Rosen 11. dbo:almaMater dbr:University_of_Zurich … ‣ multi-valued predicates
  • 26. SUMMARY GENERATION 1. dbo:birthDate 1879-03-14 2. dbp:placeOfBirth Ulm 3. dbo:birthPlace dbr:Ulm 4. dbo:deathDate 1955-04-18 5. dbo:award dbr:Nobel_Prize_in_Physics 6. dbo:deathPlace dbr:Princeton,_New_Jersey 7. dbo:birthPlace dbr:German_Empire 8. dbo:almaMater dbr:ETH_Zurich 9. dbo:award dbr:Max_Planck_Medal 10. dbp:influenced dbr:Nathan_Rosen 11. dbo:almaMater dbr:University_of_Zurich … ‣ identical facts
  • 27. SUMMARY GENERATION Algorithm 1: Summary generation algorithm
      Input: Ranked facts Fe, max height h, max width w
      Output: Entity summary lines
      M ← Predicate-Name Mapping(Fe)
      headings ← []                        ▷ Determine line headings
      for f in Fe do
        pname ← M[fp]
        if (pname ∉ headings) AND (size(headings) ≤ h) then
          headings.add((fp, pname))
        end if
      end for
      values ← []                          ▷ Determine line values
      for f in Fe do
        if fp ∈ headings then
          values[fp].add(fo)
        end if
      end for
      lines ← []                           ▷ Construct lines
      for (fp, pname) in headings do
        line ← pname + ‘:’
        for v in values[fp] do
          if len(line) + len(v) ≤ w then
            line ← line + v                ▷ Add comma if needed
          end if
        end for
        lines.add(line)
      end for
      ‣ Creates a summary of a given size (length and width) ‣ Resolving identical facts (RF feature) ‣ Grouping multi-valued predicates (GF feature)
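
The three phases of Algorithm 1 (headings, values, lines) can be sketched in Python roughly as follows. This is an illustrative reading of the slide, not the authors' code: the function name, the `predicate_names` mapping (standing in for the Predicate-Name Mapping M), and the display names are assumptions, the sketch caps the summary at exactly `max_height` lines, and it counts the separator when checking the width.

```python
from typing import Dict, List, Tuple


def generate_summary(ranked_facts: List[Tuple[str, str]],
                     predicate_names: Dict[str, str],
                     max_height: int,
                     max_width: int) -> List[str]:
    """Build summary lines from (predicate, object) facts given in rank order."""
    # Phase 1: line headings. Keep the first predicate seen for each display
    # name, so a later predicate with the same name (identical fact) is skipped.
    headings: List[Tuple[str, str]] = []
    seen_names = set()
    for predicate, _ in ranked_facts:
        name = predicate_names[predicate]
        if name not in seen_names and len(headings) < max_height:
            headings.append((predicate, name))
            seen_names.add(name)

    # Phase 2: line values. Group all objects of a heading predicate together
    # (this is where multi-valued predicates end up on one line).
    values: Dict[str, List[str]] = {p: [] for p, _ in headings}
    for predicate, obj in ranked_facts:
        if predicate in values:
            values[predicate].append(obj)

    # Phase 3: construct lines, adding values while they fit the width.
    lines: List[str] = []
    for predicate, name in headings:
        line = name + ":"
        for value in values[predicate]:
            sep = " " if line.endswith(":") else ", "
            if len(line) + len(sep) + len(value) <= max_width:
                line += sep + value
        lines.append(line)
    return lines


# Ranked facts from the slide; the display names are hypothetical.
FACTS = [
    ("dbo:birthDate", "1879-03-14"), ("dbp:placeOfBirth", "Ulm"),
    ("dbo:birthPlace", "dbr:Ulm"), ("dbo:deathDate", "1955-04-18"),
    ("dbo:award", "dbr:Nobel_Prize_in_Physics"),
    ("dbo:deathPlace", "dbr:Princeton,_New_Jersey"),
    ("dbo:birthPlace", "dbr:German_Empire"), ("dbo:almaMater", "dbr:ETH_Zurich"),
    ("dbo:award", "dbr:Max_Planck_Medal"), ("dbp:influenced", "dbr:Nathan_Rosen"),
    ("dbo:almaMater", "dbr:University_of_Zurich"),
]
NAMES = {
    "dbo:birthDate": "Birth date", "dbp:placeOfBirth": "Birth place",
    "dbo:birthPlace": "Birth place", "dbo:deathDate": "Death date",
    "dbo:award": "Awards", "dbo:deathPlace": "Death place",
    "dbo:almaMater": "Alma mater", "dbp:influenced": "Influenced",
}

for summary_line in generate_summary(FACTS, NAMES, max_height=4, max_width=60):
    print(summary_line)
```

With these inputs, `dbp:placeOfBirth` and `dbo:birthPlace` share one display name, so only the higher-ranked one gets a line, while the two `dbo:award` facts are grouped under a single heading.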
  • 29. QUERIES Query types ‣ Named entity • “madrid” • “brooklyn bridge” ‣ Keyword • “vietnam war facts” • “eiffel” ‣ List search • “states that border oklahoma” ‣ Natural language • “What is the second highest mountain?” Taken from the DBpedia-entity collection: K. Balog and R. Neumayer. “A Test Collection for Entity Search in DBpedia”. In Proceedings of SIGIR ’13.
  • 30. EVALUATION (FACT RANKING) Benchmark construction by crowdsourcing experiments ‣ rate the importance of the fact w.r.t. the entity ‣ rate the relevance of the fact to the query for the given entity. Example question: “How important is this fact for the given entity?” Options: Very important / Important / Not important
  • 31. EVALUATION (FACT RANKING) Benchmark construction by crowdsourcing experiments ‣ Collecting judgments for ~4K facts ‣ 5 judgments per record ‣ Fleiss’ kappa of 0.52 for importance and 0.41 for relevance (moderate agreement)
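
For readers unfamiliar with the agreement statistic reported here, the following Python sketch computes Fleiss' kappa from a table of per-item category counts using the standard textbook definition. The function name and the toy count tables are assumptions for illustration; they are not the benchmark data.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned item i to category j. Every item needs the same number of
    raters. Raises ValueError when all ratings fall into a single category,
    because the statistic is undefined in that case.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that went to each category.
    p_j = [sum(row[j] for row in counts) / (n_items * n_raters)
           for j in range(n_categories)]
    # Observed agreement per item, then averaged over items.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_items
    # Agreement expected by chance.
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)


# Toy table: 3 facts, 5 judgments each, 3-point scale (invented numbers).
toy_counts = [
    [4, 1, 0],
    [0, 3, 2],
    [1, 1, 3],
]
print(round(fleiss_kappa(toy_counts), 3))
```

Values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance.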
  • 32. RESULTS (FACT RANKING)
      Table 2: Comparison of fact ranking against the state-of-the-art approaches with URI-only objects. Significance for lines i > 3 is tested against lines 1, 2, 3, and for lines 2, 3 against lines 1, 2.
      Model     | Importance NDCG@5 | Importance NDCG@10 | Utility NDCG@5 | Utility NDCG@10
      RELIN     | 0.6368            | 0.7130             | 0.6300         | 0.7066
      LinkSum   | 0.7018△           | 0.7031             | 0.6504         | 0.6648
      SUMMARUM  | 0.7181▲           | 0.7412△            | 0.6719         | 0.7111
      DynES/imp | 0.8354▲▲▲         | 0.8604▲▲▲          | 0.7645▲▲▲      | 0.8117▲▲▲
      DynES     | 0.8291▲▲▲         | 0.8652▲▲▲          | 0.8164▲▲▲      | 0.8569▲▲▲
      Table 4: Fact ranking performance by removing features; features are sorted by the relative difference they make.
      Group | Removed feature      | NDCG@10 | Δ%    | p
            | DynES - all features | 0.7873  | -     | -
      Imp.  | - NEFp               | 0.7757  | -1.16 | 0.08
      Imp.  | - TypeImp            | 0.7760  | -1.13 | 0.14
      ‣ 16% improvement over the best baseline
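
The fact ranking results are reported as NDCG@5 and NDCG@10. A minimal Python sketch of NDCG@k is given below, assuming the common formulation with linear gains and a log2 position discount; the slides do not say which NDCG variant was used, so treat the gain function as an assumption.

```python
import math
from typing import List


def dcg_at_k(gains: List[float], k: int) -> float:
    """Discounted cumulative gain over the first k positions (linear gains)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains: List[float], k: int) -> float:
    """NDCG@k for a ranking, where `gains` lists the graded judgment of every
    judged item in ranked order. The ideal ranking is the same list sorted by
    descending gain. Returns 0.0 when no item has a positive gain.
    """
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Toy ranking of five facts with graded labels (invented for illustration).
print(ndcg_at_k([2, 0, 1, 2, 0], k=5))
```

A perfectly ordered list scores 1.0, and moving highly graded facts down the ranking lowers the score.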

  • 33. EVALUATION (SUMMARY GENERATION) ‣ Users consume all facts displayed in the summary ‣ The quality of the whole summary should be assessed ‣ Side-by-side evaluation of factual summaries by humans
  • 34. RESULTS (SUMMARY GENERATION)
      [Figure 4: Boxplot of the distribution of user preferences; panel (a) DynES vs. DynES/imp]
      Table 5: Side-by-side evaluation of summaries for different fact ranking methods.
      Model                  | Win | Loss | Tie | RI
      DynES vs. DynES/imp    | 46  | 23   | 31  | 0.23
      DynES vs. DynES/rel    | 75  | 12   | 13  | 0.63
      DynES vs. RELIN        | 95  | 5    | 0   | 0.90
      Utility vs. Importance | 47  | 16   | 37  | 0.31
      Table 6: Side-by-side evaluation of summaries for different summary generation algorithms.
      Model                     | Win | Loss | Tie | RI
      DynES vs. DynES(-GF)(-RF) | 84  | 1    | 15  | 0.83
      DynES vs. DynES(-GF)      | 74  | 0    | 26  | 0.74
      DynES vs. DynES(-RF)      | 46  | 2    | 52  | 0.44
      • Users preferred utility-based summaries over the others • Grouping of multi-valued predicates (GF) is perceived as more important by the users than the resolution of identical facts (RF)
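
The slide does not define the RI column, but every reported value is consistent with (Win − Loss) / (Win + Loss + Tie). Assuming that definition, the Python sketch below recomputes RI from the Win/Loss/Tie counts of Tables 5 and 6; the function name is chosen here for illustration.

```python
def robustness_index(wins: int, losses: int, ties: int) -> float:
    """(wins - losses) / total comparisons; assumed definition of the RI column."""
    total = wins + losses + ties
    if total == 0:
        raise ValueError("at least one comparison is required")
    return (wins - losses) / total


# Win / Loss / Tie counts copied from Tables 5 and 6.
comparisons = {
    "DynES vs. DynES/imp": (46, 23, 31),
    "DynES vs. DynES/rel": (75, 12, 13),
    "DynES vs. RELIN": (95, 5, 0),
    "Utility vs. Importance": (47, 16, 37),
    "DynES vs. DynES(-GF)(-RF)": (84, 1, 15),
    "DynES vs. DynES(-GF)": (74, 0, 26),
    "DynES vs. DynES(-RF)": (46, 2, 52),
}

for name, (w, l, t) in comparisons.items():
    print(f"{name}: RI = {robustness_index(w, l, t):.2f}")
```

RI ranges from -1 (always worse) to 1 (always better), so the larger RI for removing GF than for removing RF matches the slide's conclusion that grouping matters more to users.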
  • 35. RESOURCES 1. Entity search toolkit 2. Test collection
  • 36. SEMANTIC SEARCH “Search with Meaning” an umbrella term that encompasses various techniques
  • 37. SEMANTIC SEARCH TOOLKIT Functionalities ‣ Entity retrieval: returns a ranked list of entities in response to a query ‣ Entity linking: identifies entities in a query and links them to the corresponding entry in the knowledge base ‣ Target type identification: detects the target types (or categories) of a query
  • 38. SEMANTIC SEARCH TOOLKIT Highlights • Web interface, API, and command line usage • 3-tier architecture • Online source code and documentation
  • 41. DBPEDIA-ENTITY V2 Details • (37 + 17) runs + old qrels • Pool size: 150K • 3-point Likert scale • 5 judgments per query-entity pair
  • 42. DBPEDIA-ENTITY V2
      Table 8.3: Results, broken down into query subtypes, on DBpedia-entity v2.
      Model    | SemSearch ES @10 | @100   | INEX-LD @10 | @100   | ListSearch @10 | @100   | QALD-2 @10 | @100   | Total @10 | @100
      BM25     | 0.2497 | 0.4110 | 0.1828 | 0.3612 | 0.0627 | 0.3302 | 0.2751 | 0.3366 | 0.2558 | 0.3582
      PRMS     | 0.5340 | 0.6108 | 0.3590 | 0.4295 | 0.3684 | 0.4436 | 0.3151 | 0.4026 | 0.3905 | 0.4688
      MLM-all  | 0.5528 | 0.6247 | 0.3752 | 0.4493 | 0.3712 | 0.4577 | 0.3249 | 0.4208 | 0.4021 | 0.4852
      LM       | 0.5555 | 0.6475 | 0.3999 | 0.4745 | 0.3925 | 0.4723 | 0.3412 | 0.4338 | 0.4182 | 0.5036
      SDM      | 0.5535 | 0.6672 | 0.4030 | 0.4911 | 0.3961 | 0.4900 | 0.3390 | 0.4274 | 0.4185 | 0.5143
      LM+ELR   | 0.5557 | 0.6477 | 0.4013 | 0.4763 | 0.4037 | 0.4885 | 0.3464 | 0.4377 | 0.4228 | 0.5093
      SDM+ELR  | 0.5533 | 0.6676 | 0.4097 | 0.4975 | 0.4142 | 0.5058 | 0.3434 | 0.4350 | 0.4257 | 0.5220
      MLM-CA   | 0.6247 | 0.6854 | 0.4029 | 0.4796 | 0.4021 | 0.4786 | 0.3365 | 0.4301 | 0.4365 | 0.5143
      BM25-CA  | 0.5858 | 0.6883 | 0.4120 | 0.5050 | 0.4220 | 0.5142 | 0.3566 | 0.4426 | 0.4399 | 0.5329
      FSDM     | 0.6521 | 0.7220 | 0.4214 | 0.5043 | 0.4196 | 0.4952 | 0.3401 | 0.4358 | 0.4524 | 0.5342
      BM25F-CA | 0.6281 | 0.7200 | 0.4394 | 0.5296 | 0.4252 | 0.5106 | 0.3689 | 0.4614 | 0.4605 | 0.5505
      FSDM+ELR | 0.6568 | 0.7260 | 0.4397 | 0.5144 | 0.4246 | 0.5011 | 0.3467 | 0.4450 | 0.4607 | 0.5416
      Baseline runs: generative models are available on Nordlys
  • 43. DBPEDIA-ENTITY COLLECTIONS ‣ Entity Retrieval • 468 queries • 19K relevant entities ‣ Entity Summarization • 100 queries • 4K entity facts ‣ Target Type Identification • 485 queries • ~900 types. Built with DBpedia 2015-10. https://github.com/iai-group/DBpedia-Entity
  • 44. ACKNOWLEDGEMENT Nordlys: Krisztian Balog, Dario Garigliotti, Shuo Zhang, Heng Ding. DBpedia-Entity Collection: Fedor Nikolaev, Chenyan Xiong, Svein Erik Bratsberg, Krisztian Balog, Alexander Kotov, Jamie Callan