Automated Comparative Table Generation for Facilitating Human Intervention in Multi-Entity Resolution
1. Jiacheng Huang, Wei Hu*, Haoxuan Li, Yuzhong Qu
Nanjing University, China
* Corresponding author: whu@nju.edu.cn
SIGIR’18, July 8–12, Ann Arbor, MI, USA
2. Outline
Introduction
Knowledge graph
(Crowd) Entity resolution
Related work
Our approach
Experiments and results
Conclusion
3. Knowledge graph (KG)
Knowledge graph (KG) is a knowledge base used by Google
to enhance its search engine’s results
Other famous knowledge bases
DBpedia, Freebase, Wikidata, YAGO …
Linked Open Data (LOD) cloud
KGs have reached a scale of billions of entities!
Problem: Many different entities refer to
the same real-world thing
4. Entity Resolution
Entity resolution (ER): find different entities referring to the same real-world thing
a.k.a. entity linkage, entity matching …
also widely studied in DB and NLP
resolve heterogeneity and achieve interoperability
Crowd ER
use humans, in addition to machines, to obtain
the truths of ER tasks
Key issues
How to present a single ER task?
How to select “right” humans?
How to pick tasks under a budget?
……
Little effort has been made on how to
present the critical information (such as
important properties and values) to help
complete a task efficiently and accurately
[Verroios et al., SIGMOD’17]
5. Related work
Multi-entity
resolution (MER)
1. Display multiple entities in the form of a list
just like what is typically seen from a Web search engine
2. Use pairwise presentation
compare two entities at a time and align similar properties between them
Pros & cons for MER
1. List: remember and compare in mind
2. Pairwise: focus, but difficult to scale
Both lose transitivity & grouping info
[Figure: entities with similar properties & values, to be judged as match / nonmatch]
e1 [dbp:Lil_Eazy-E]
– rdf:type : Person, MusicalArtist
– rdfs:label : Lil Eazy-E
– owl:sameAs : fb:m.01wf_p_
– birthDate : 1984-4-23
– birthPlace : Compton
– gender : male
– genre : Gangsta rap, Hip hop
– givenName : Eric Darnell Wright
(146 property-values in total)
e2 [fb:m.01wf_p_]
– alias : Eric Wright, Eazy-E
– date_of_birth : 1963-9-7
– gender : male
– genre : gangsta rap, hip hop
– name : Eazy-E
– place_of_birth : Compton
– profession : rapper, producer
– type : person, music.artist
(1,253 property-values in total)
e3 [wd:Q36804]
– rdfs:label : Eazy-E
– altLabel : Eric Lynn Wright
– date_of_birth : 1963-9-7
– desc : Gangsta rapper, producer
– genre : gangsta rap
– instance_of : human
– occupation : musician, rapper
– place_of_birth : Compton
(141 property-values in total)
[Figure: the same three entities shown as a list (one entity after another), pairwise (e1 next to e2 with similar properties aligned), and as the comparative table below]

group?  entity  givenName / alias / altLabel  rdf:type / type / instance_of  birthDate / date_of_birth / date_of_birth
1       e1      Eric Darnell Wright           Person, MusicalArtist          1984-4-23
2       e2      Eric Wright, Eazy-E           person, music.artist           1963-9-7
2       e3      Eric Lynn Wright              human                          1963-9-7
7. Our approach: comparative table
Comparative table
arrange entities and properties as
row and column headers, resp.
assign values in cells
Workflow
1. Holistic property matching: similarity calculation → property clique derivation
2. Goodness measurement: discriminability, abundance, semantics & diversity
3. Comparative table generation: property clique selection → value selection
[Figure: workflow]
Input: multiple entities (e1 [dbp:Lil_Eazy-E], e2 [fb:m.01wf_p_], e3 [wd:Q36804])
Holistic property matching (similarity calculation, prop. clique derivation) → property cliques
  {rdfs:label, name}
  {givenName, alias, altLabel}
  {rdf:type, type, instance_of}
  …
Goodness measurement (abundance, discriminability, semantics, diversity) → goodness scores
  0.2 {givenName, alias, altLabel}
  0.4 {rdf:type, type, instance_of}
  0.5 {birthDate, date_of_birth}
  …
Comparative table generation (prop. clique selection, value selection) → comparative table
Human intervention: divide into groups
Challenge: heterogeneity, large-scale
vs. limited presentation space
8. 1. Holistic property matching
Heterogeneous properties
Label, local name & value similarity, combined with logistic regression
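The combination step above can be sketched as a logistic model over the three similarity scores. This is a minimal illustration only: the function name, the weights and the bias are hypothetical placeholders, not the coefficients learned in the paper.

```python
import math

def match_probability(label_sim, local_name_sim, value_sim,
                      weights=(2.0, 1.5, 2.5), bias=-3.0):
    """Logistic combination of three similarity scores in [0, 1].

    The weights and bias are hypothetical; in practice they would be
    fitted on labelled property pairs.
    """
    z = (bias
         + weights[0] * label_sim
         + weights[1] * local_name_sim
         + weights[2] * value_sim)
    return 1.0 / (1.0 + math.exp(-z))

# Two properties with similar labels, local names and values
high = match_probability(1.0, 1.0, 1.0)
# Two properties that share nothing
low = match_probability(0.0, 0.0, 0.0)
```

The output is a match probability estimate for one property pair, which is the quantity the holistic matching step then works with.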
Property cliques for multiple entities
restrict each property to match at most one other property
choosing the pairs with the highest match probability estimates may lead to conflicts
Holistic property matching
maximize the overall match probability
estimate among all matched property pairs
s.t. 1:1 matching constraint is satisfied
NP-hard (3-dimensional assignment)
Greedy algorithm
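One way such a greedy procedure could look is sketched below. It is an assumption-laden illustration, not the paper's exact algorithm: pairs are visited by descending match probability, and two cliques are merged only if the result holds at most one property per source. The function name, the threshold and the example probabilities are hypothetical.

```python
def greedy_property_cliques(pair_probs, source_of, threshold=0.5):
    """pair_probs: {(p, q): match probability}; source_of: {property: source}."""
    clique_of = {p: frozenset([p]) for p in source_of}
    for (p, q), prob in sorted(pair_probs.items(), key=lambda kv: -kv[1]):
        if prob < threshold:
            break  # remaining pairs are even less likely to match
        a, b = clique_of[p], clique_of[q]
        if a == b:
            continue  # already in the same clique
        # 1:1 constraint: a clique may hold at most one property per source
        if {source_of[x] for x in a} & {source_of[x] for x in b}:
            continue
        merged = a | b
        for x in merged:
            clique_of[x] = merged
    return {c for c in clique_of.values() if len(c) > 1}

source_of = {
    "dbp:givenName": "e1", "dbp:label": "e1",
    "fb:alias": "e2", "fb:name": "e2",
    "wd:altLabel": "e3", "wd:label": "e3",
}
pair_probs = {  # hypothetical probability estimates
    ("dbp:givenName", "fb:alias"): 0.90,
    ("dbp:label", "fb:name"): 0.85,
    ("fb:alias", "wd:altLabel"): 0.80,
    ("fb:name", "wd:label"): 0.70,
    ("dbp:givenName", "fb:name"): 0.60,  # conflicts with the pairs above
}
cliques = greedy_property_cliques(pair_probs, source_of)
```

The last pair is rejected because merging would put two properties of the same source into one clique, which is the kind of conflict the 1:1 constraint rules out.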
9. 2. Goodness measurement
Goodness of property cliques
1. Discriminability: a property clique that holds completely different or exactly identical
values for all the entities may not be good
2. Abundance: a property clique whose values
are largely missing may be less convincing
3. Semantics gives extra scores to the ones
particularly useful, e.g., owl:sameAs
4. Diversity evaluates the redundancy between
different property cliques (MMR)
2-phase combination: (discriminability + abundance + semantics) + diversity
Goodness of values
Longer length, less redundancy
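The two-phase combination can be sketched as follows. Everything numeric here is an assumption for illustration: the hat-shaped discriminability function, the weights, the MMR trade-off `lam` and the similarity function are hypothetical, and the paper's actual scoring functions may differ.

```python
def discriminability(values):
    """Low when values are all identical or all different (hypothetical shape)."""
    present = [v for v in values if v is not None]
    if len(present) < 2:
        return 0.0
    ratio = (len(set(present)) - 1) / (len(present) - 1)  # 0 = identical, 1 = all distinct
    return 4.0 * ratio * (1.0 - ratio)

def abundance(values):
    """Proportion of entities that have a value for the clique."""
    return sum(v is not None for v in values) / len(values)

def base_goodness(values, semantic_bonus=0.0, w=(0.4, 0.4, 0.2)):
    """Phase 1: weighted sum of discriminability, abundance and semantics."""
    return (w[0] * discriminability(values)
            + w[1] * abundance(values)
            + w[2] * semantic_bonus)

def mmr_rank(scores, similarity, lam=0.7):
    """Phase 2: MMR-style re-ranking that penalises redundant cliques."""
    remaining, ranked = set(scores), []
    while remaining:
        def mmr(c):
            redundancy = max((similarity(c, s) for s in ranked), default=0.0)
            return lam * scores[c] - (1.0 - lam) * redundancy
        best = max(sorted(remaining), key=mmr)
        ranked.append(best)
        remaining.remove(best)
    return ranked

birth = ["1984-4-23", "1963-9-7", "1963-9-7"]   # values from the running example
gender = ["male", "male", None]
scores = {"birth": 0.9, "birth_year": 0.85, "genre": 0.5}  # hypothetical phase-1 scores
sim = lambda a, b: 1.0 if {a, b} == {"birth", "birth_year"} else 0.0
ranking = mmr_rank(scores, sim)
```

In this toy ranking the redundant "birth_year" clique drops below "genre" even though its phase-1 score is higher, which is the effect the diversity phase is meant to have.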
[Figures: discriminability as a function of the proportion of distinct values; goodness as a function of the proportion of entities and the proportion of distinct values; parameter settings shown: 1 = 0.5, 2 = 0.3, 3 = 0.2]
10. 3. Comparative table generation
Property clique selection
Greedy method
Given the maximal number of property cliques in a comparative table, simply
select top property cliques with best goodness
cannot guarantee that each entity is described by at least several properties
Optimal property clique selection
with entity coverage constraint
NP-hard (set cover)
𝐻(𝑁)-approximation
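A coverage-aware greedy selection in the spirit of the classic set cover heuristic might look like the sketch below. It is not claimed to be the paper's 𝐻(𝑁)-approximation algorithm: the gain function (goodness times the number of still-undercovered entities), the parameter names and the example numbers are all assumptions.

```python
def select_cliques(covers, goodness, min_cover, max_cliques):
    """covers: {clique: entities having a value}; goodness: {clique: score}."""
    need = {e: min_cover for members in covers.values() for e in members}
    chosen, candidates = [], set(covers)
    # Phase 1: satisfy the coverage constraint greedily
    while candidates and len(chosen) < max_cliques and any(need.values()):
        def gain(c):
            helped = sum(1 for e in covers[c] if need[e] > 0)
            return helped * goodness[c]
        best = max(sorted(candidates), key=gain)
        if gain(best) == 0:
            break  # no remaining clique helps any undercovered entity
        chosen.append(best)
        candidates.remove(best)
        for e in covers[best]:
            need[e] = max(0, need[e] - 1)
    # Phase 2: fill any remaining slots by goodness alone
    for c in sorted(candidates, key=lambda c: (-goodness[c], c)):
        if len(chosen) >= max_cliques:
            break
        chosen.append(c)
    return chosen

covers = {  # hypothetical: which entities have values for each clique
    "name": {"e1", "e2", "e3"},
    "birth": {"e2", "e3"},
    "occupation": {"e2", "e3"},
    "genre": {"e1", "e2", "e3"},
}
goodness = {"name": 0.7, "birth": 0.9, "occupation": 0.8, "genre": 0.5}
chosen = select_cliques(covers, goodness, min_cover=2, max_cliques=3)
```

Picking the top three cliques by goodness alone would give birth, occupation and name, leaving e1 described by a single property; the coverage-aware variant trades "occupation" for "genre" so that every entity is covered twice.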
Value selection
model it based on the classic 0/1 knapsack
problem with a table cell size constraint
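The knapsack view of value selection can be sketched with the textbook dynamic program. The assumptions are spelled out in the code: a value's weight is its character length, separators between values are ignored, and the goodness scores and the extra value "West Coast hip hop" are hypothetical.

```python
def select_values(values, capacity):
    """0/1 knapsack: values is a list of (text, goodness); capacity in characters."""
    # best[c] = (total goodness, chosen texts) using at most c characters
    best = [(0.0, [])] * (capacity + 1)
    for text, score in values:
        weight = len(text)  # assumption: cell space is measured in characters
        for cap in range(capacity, weight - 1, -1):
            candidate = best[cap - weight][0] + score
            if candidate > best[cap][0]:
                best[cap] = (candidate, best[cap - weight][1] + [text])
    return best[capacity][1]

genre_values = [  # hypothetical goodness scores
    ("Gangsta rap", 0.9),
    ("Hip hop", 0.6),
    ("West Coast hip hop", 0.7),
]
cell = select_values(genre_values, capacity=20)
```

With 20 characters available, the two short values fit together and beat the single long one.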
11. Outline
Introduction
Our approach
Experiments and results
Test on holistic property matching
Test on property clique ranking
Test on human intervention
Conclusion
12. Test on holistic property matching
Quality of matched property pairs
“Official” property matches
Label others by 3 graduate students
484 matches, 1397 non-matches
Quality of derived property cliques
Compute connected components
135 reference property cliques
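Deriving cliques from matched pairs by connected components can be sketched with a small union-find. The function name and the example pairs are illustrative assumptions; the property names are taken from the running example.

```python
def connected_components(pairs):
    """Group items linked by matched pairs into connected components."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())

matches = [
    ("givenName", "alias"),
    ("alias", "altLabel"),
    ("birthDate", "date_of_birth"),
]
components = connected_components(matches)
```

Each component is then treated as one property clique.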
MER tasks
10 popular domains, 25 DBpedia entities per domain as seeds
Wikipedia disambiguation page, 2~4 Freebase, Wikidata, YAGO entities
randomly select 10 entities to constitute an MER task
250 tasks, 804 distinct real objects
[Bar charts, values as plotted]

Quality of matched property pairs (vs. other classifiers)
            Precision  Recall  F1-score
CTab (LR)   0.868      0.727   0.791
LinReg      0.824      0.73    0.773
DecTree     0.893      0.669   0.763
SVM         0.877      0.706   0.782

Quality of matched property pairs (vs. Falcon, LogMap)
            Precision  Recall  F1-score
CTab (LR)   0.868      0.727   0.791
Falcon      0.983      0.233   0.377
LogMap      0.97       0.066   0.124

Quality of derived property cliques
            NMI    Purity  V-measure
CTab        0.789  0.869   0.787
K-medoids   0.558  0.64    0.548
DBSCAN      0.65   0.741   0.648
APCluster   0.43   0.55    0.422
13. Test on property clique ranking
1. Directly rank ref. property cliques
Assess property clique derivation &
ranking together
The Hausdorff version of Kendall
tau distance
treat property clique rankings as
partial rankings of properties (the
properties with the same grade
and in the same clique are tied)
Ablation study
3 experienced humans score property cliques in each task
Highly-useful (3), fairly-useful (2), marginally-useful (1) and useless (0)
Comparative systems
FACES (list) [Gunaratna et al., AAAI’15]
C3D+P (pairwise) [Cheng et al., JWS’15]
CTab, CTab (entropy), CTab (greedy)
Results (P@1, P@5, P@10, nDCG@5: use reference property cliques; KHaus: derivation & ranking together)
                P@1    P@5    P@10   nDCG@5  KHaus
FACES           0.176  0.310  0.290  0.239   0.753
C3D+P           0.040  0.347  0.511  0.154   0.647
CTab (entropy)  0.180  0.178  0.184  0.092   0.811
CTab (greedy)   0.632  0.660  0.615  0.684   0.647
CTab            0.756  0.754  0.643  0.798   0.615

Ablation study (KHaus)
                Discr.  Abund.  Sem.   w/o Div.  Good
CTab (greedy)   0.678   0.686   0.673  0.655     0.647
CTab            0.675   0.633   0.815  0.618     0.615
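For readers unfamiliar with nDCG@k over graded judgements (here 0 to 3), a minimal sketch follows. It assumes the common exponential-gain formulation and computes the ideal ranking from the judged list itself; the slides do not state which variant was used.

```python
import math

def dcg_at_k(grades, k):
    """Discounted cumulative gain of the first k graded items."""
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades[:k]))

def ndcg_at_k(grades, k):
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(grades, reverse=True), k)
    return dcg_at_k(grades, k) / ideal if ideal > 0 else 0.0

# Grades of property cliques in ranked order: highly-useful (3) ... useless (0)
perfect = ndcg_at_k([3, 2, 1, 0], k=5)
reversed_order = ndcg_at_k([0, 1, 2, 3], k=5)
```

A ranking that places the highly-useful cliques first scores 1.0; any other ordering of the same grades scores lower.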
14. Test on human intervention
60 graduate students (top-5/top-10), 30 orthogonal tasks per human, 100RMB
Task difficulty does not differ statistically significantly among FACES, C3D+P, CTab
1. Completion time
2. Precision
Break the entities in each
entity group down to pairs
3. Human scoring and comments
For CTab, the minimum number of covers per entity could not always be satisfied
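Breaking entity groups down to pairs for the precision measure can be sketched as below. The function names and the example grouping are illustrative assumptions; returning 0.0 when no pair is predicted is a convention chosen here, not one taken from the paper.

```python
from itertools import combinations

def group_pairs(groups):
    """All unordered entity pairs that share a group."""
    return {frozenset(p) for g in groups for p in combinations(sorted(g), 2)}

def pairwise_precision(predicted_groups, gold_groups):
    """Fraction of predicted same-group pairs that are also gold pairs."""
    predicted = group_pairs(predicted_groups)
    if not predicted:
        return 0.0  # convention for the empty case (assumption)
    return len(predicted & group_pairs(gold_groups)) / len(predicted)

gold = [{"e1"}, {"e2", "e3"}]            # e2 and e3 refer to the same person
merged_all = [{"e1", "e2", "e3"}]        # a participant who grouped everything
precision = pairwise_precision(merged_all, gold)
```

Grouping all three entities yields three predicted pairs, of which only (e2, e3) is correct.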
                  FACES (L)  C3D+P (P)  CTab (T)  p-value  Post-hoc
Top-5   Time (s)  153        208        96        0.01%    P < L < T
Top-5   Prec.     0.63       0.69       0.77      0.07%    L, P < T
Top-10  Time (s)  175        180        131       1.13%    L, P < T
Top-10  Prec.     0.79       0.77       0.80      69.8%

Questions [from 1: “totally disagree” to 5: “totally agree”]
                                                                 FACES (L)  C3D+P (P)  CTab (T)  p-value   Post-hoc
Q1. The system provided adequate information of entities.        3.11       3.17       3.70      0.76%     L, P < T
Q2. The system provided unsuperfluous information of entities.   2.67       3.30       3.23      4.46%     L < T, P
Q3. The system helped me easily compare entities of interest.    2.43       3.37       4.00      < 0.01%   L < P < T
Q4. I found the system easy to use.                              3.00       3.13       3.70      2.28%     L, P < T
16. Conclusion
Main contributions
1. Discovery of matched property cliques
2. Scoring functions to measure the goodness of property cliques and values
3. Optimal comparative table generation with the entity coverage constraint
An 𝐻(𝑁)-approximation algorithm to obtain approximate solutions
4. Comparison to state-of-the-art methods and user study
Accuracy of matched properties, effectiveness of goodness measures and user
satisfaction of comparative tables for MER
Future work
Combine comparative tables with other presentation enhancements
Extend to other areas such as knowledge base summarization
17. Datasets & source code: http://ws.nju.edu.cn/ctab/
Acknowledgements
National Natural Science Foundation of China (No. 61772264)
Collaborative Innovation Center of Novel Software Technology and Industrialization
Thank you for your time!