Zequn Sun, Wei Hu, Qingheng Zhang and Yuzhong Qu
National Key Laboratory for Novel Software Technology
Nanjing University, China
{zqsun, qhzhang}.nju@gmail.com, {whu, yzqu}@nju.edu.cn
Bootstrapping Entity Alignment with Knowledge Graph Embedding
Background
• Entity alignment
  ◦ Finds entities in different knowledge graphs (KGs) that refer to the same real-world object
  ◦ Plays a vital role in automatically integrating multiple KGs
• Conventional approaches
  ◦ Compute entity similarities based on entity attributes
  ◦ Are not always effective because of semantic heterogeneity
• Embedding-based approaches
  ◦ Encode KGs into vector spaces
  ◦ Measure entity similarities via entity embeddings
Challenges
• Although embedding a single KG has been extensively studied in the past few years, alignment-oriented KG embedding remains largely unexplored.
• Embedding-based entity alignment usually relies on existing entity alignment (prior alignment) as training data. However, the accessible prior alignment usually accounts for only a small proportion of the entities.
Framework
• We model entity alignment as a classification problem of using KG2 entities to label KG1 entities.
• To address the two issues above, we propose a bootstrapping framework:
[Figure: bootstrapping framework. Components shown: KG1 triples, KG2 triples, prior alignment, parameter swapping, supervised triples, training of alignment-oriented KG embeddings, alignment predictor, likely alignment, alignment labeling, alignment editing.]
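Parameter Swapping
• We swap aligned entities in their triples to calibrate the embeddings of KG1 and KG2 in the unified vector space. For an aligned pair (x, y):
$$T^{s}_{(x,y)} = \{(y, r, t) \mid (x, r, t) \in T_1^{+}\} \cup \{(h, r, y) \mid (h, r, x) \in T_1^{+}\} \cup \{(x, r, t) \mid (y, r, t) \in T_2^{+}\} \cup \{(h, r, x) \mid (h, r, y) \in T_2^{+}\}$$
  The first two sets are derived from KG1's triples ($T_1^{+}$) and the last two from KG2's triples ($T_2^{+}$).
• The supervised triples are fed to our KG embedding model as positives.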
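The swapping rule above can be sketched in a few lines of Python. This is only an illustration under assumptions: triples are plain `(head, relation, tail)` tuples, and the function name `swap_triples` and the toy entity names are hypothetical, not taken from the BootEA code.

```python
def swap_triples(aligned_pair, kg1_triples, kg2_triples):
    """Build the supervised triples for one aligned pair (x, y).

    x is an entity of KG1 and y is its counterpart in KG2.
    """
    x, y = aligned_pair
    supervised = set()
    for h, r, t in kg1_triples:
        if h == x:                      # (x, r, t) in KG1 gives (y, r, t)
            supervised.add((y, r, t))
        if t == x:                      # (h, r, x) in KG1 gives (h, r, y)
            supervised.add((h, r, y))
    for h, r, t in kg2_triples:
        if h == y:                      # (y, r, t) in KG2 gives (x, r, t)
            supervised.add((x, r, t))
        if t == y:                      # (h, r, y) in KG2 gives (h, r, x)
            supervised.add((h, r, x))
    return supervised


kg1 = [("Nanjing", "locatedIn", "China"), ("NJU", "locatedIn", "Nanjing")]
kg2 = [("zh:Nanjing", "country", "zh:China")]
swapped = swap_triples(("Nanjing", "zh:Nanjing"), kg1, kg2)
```

Each swapped triple mixes entities from both KGs, which is what ties the two embedding spaces together during training.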
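Alignment-Oriented Embedding
• Translational score function: $f(\tau) = \lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert_2^2$.
• Margin-based ranking loss:
$$\mathcal{O}_{\mathrm{margin}} = \sum_{\tau \in T^{+}} \sum_{\tau' \in T^{-}_{\tau}} [\gamma + f(\tau) - f(\tau')]_{+}$$
  ◦ It only enforces $f(\tau') - f(\tau) > \gamma$; the absolute scores $f(\tau)$ and $f(\tau')$ are not controlled.
• Limited loss function:
$$\mathcal{O}_{\mathrm{limit}} = \sum_{\tau \in T^{+}} [f(\tau) - \gamma_1]_{+} + \sum_{\tau' \in T^{-}} [\gamma_2 - f(\tau')]_{+}$$
  ◦ It enforces $f(\tau') \ge \gamma_2$ and $f(\tau) \le \gamma_1$, and hence $f(\tau') - f(\tau) \ge \gamma_2 - \gamma_1$.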
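The difference between the two losses is easy to see on a toy example. The sketch below is a plain-Python illustration, not the authors' implementation: embeddings are lists of floats, the function names are hypothetical, and the two terms of the limited loss are summed without any weighting, as on the slide.

```python
def score(h, r, t):
    """Translational score f(tau) = ||h + r - t||_2^2."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))


def margin_ranking_loss(pos_scores, neg_scores_per_pos, gamma):
    """Sum of [gamma + f(pos) - f(neg)]_+ over positives and their negatives."""
    return sum(
        max(0.0, gamma + fp - fn)
        for fp, negs in zip(pos_scores, neg_scores_per_pos)
        for fn in negs
    )


def limited_loss(pos_scores, neg_scores, gamma1, gamma2):
    """Sum of [f(pos) - gamma1]_+ plus sum of [gamma2 - f(neg)]_+."""
    pos_term = sum(max(0.0, fp - gamma1) for fp in pos_scores)
    neg_term = sum(max(0.0, gamma2 - fn) for fn in neg_scores)
    return pos_term + neg_term


# One positive triple with a large score and one negative scored slightly higher.
f_pos, f_neg = 5.0, 6.5
margin = margin_ranking_loss([f_pos], [[f_neg]], gamma=1.0)
limited = limited_loss([f_pos], [f_neg], gamma1=0.5, gamma2=2.0)
```

Here the margin-based loss is already zero because the gap exceeds the margin, even though the positive triple still has a large score; the limited loss keeps penalizing the positive until its score drops below `gamma1`.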
ε-Truncated Negative Sampling
• Conventional uniform negative sampling
  ◦ Example: (Washington DC, capitalOf, USA) is corrupted to (Tim Berners-Lee, capitalOf, USA)
  ◦ The replacer is randomly sampled from all entities. It may be easily distinguished from its original.
• ε-Truncated negative sampling
  ◦ Example: (Washington DC, capitalOf, USA) is corrupted to (New York, capitalOf, USA)
  ◦ The sampling scope is limited to a group of candidates, i.e., the s-nearest neighbors of the original entity, where $s = \lceil (1 - \epsilon) N \rceil$ and N is the number of entities.
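A minimal sketch of the truncated sampler follows, under several assumptions that are not stated on the slide: neighbors are ranked by squared Euclidean distance, the original entity is excluded from its own candidate list, only the head is corrupted, and ε lies in [0, 1). All function names and the toy embeddings are hypothetical.

```python
import math
import random


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def truncated_candidates(entity, embeddings, epsilon):
    """Return the s nearest neighbours of `entity`, s = ceil((1 - epsilon) * N)."""
    n = len(embeddings)
    s = math.ceil((1 - epsilon) * n)
    others = [e for e in embeddings if e != entity]
    others.sort(key=lambda e: squared_distance(embeddings[entity], embeddings[e]))
    return others[:s]


def sample_negative(triple, embeddings, epsilon, rng=random):
    """Corrupt the head of `triple` with one of its nearest neighbours."""
    h, r, t = triple
    replacer = rng.choice(truncated_candidates(h, embeddings, epsilon))
    return (replacer, r, t)


embeddings = {
    "Washington DC": [0.0, 0.0],
    "New York": [0.1, 0.0],
    "Boston": [0.2, 0.1],
    "Tim Berners-Lee": [5.0, 5.0],
}
candidates = truncated_candidates("Washington DC", embeddings, epsilon=0.5)
negative = sample_negative(
    ("Washington DC", "capitalOf", "USA"), embeddings, 0.5, random.Random(0)
)
```

With ε = 0.5 and four entities, only the two nearest entities are eligible replacers, so the distant entity "Tim Berners-Lee" can no longer be drawn as an easy negative.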
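Likely Alignment Labeling
• We label likely alignment at the t-th iteration by solving the following optimization problem:
$$\max \sum_{x \in X} \sum_{y \in Y_x} \pi(y \mid x; \Theta^{(t)}) \cdot \psi^{(t)}(x, y)$$
$$\text{s.t.} \quad \sum_{x' \in X} \psi^{(t)}(x', y) \le 1, \quad \sum_{y' \in Y_x} \psi^{(t)}(x, y') \le 1, \quad \forall x, y$$
  Here $\pi(y \mid x; \Theta^{(t)})$ is the alignment likelihood of labeling x as y under the embeddings $\Theta^{(t)}$, $Y_x$ is the set of candidate labels for x, and $\psi^{(t)}(x, y)$ indicates whether x is labeled as y at the t-th iteration.
• We transform it into a maximum-weight matching problem on a bipartite graph.
[Figure: one-to-one labeling between the entity sets X and Y]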
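To make the one-to-one constraint concrete, the sketch below solves the labeling problem by brute force on a tiny input. It is an illustration only: a real implementation would use a polynomial-time matching algorithm, and the threshold that defines the candidate labels, the function name, and the toy likelihoods are assumptions.

```python
from itertools import permutations


def label_likely_alignment(likelihood, threshold):
    """Brute-force one-to-one labeling that maximises total likelihood.

    likelihood[x][y] plays the role of pi(y | x; Theta). Pairs whose
    likelihood is not above `threshold` are not candidates.
    """
    xs = list(likelihood)
    ys = sorted({y for row in likelihood.values() for y in row})
    # Pad with None so that an x may stay unlabeled.
    slots = ys + [None] * len(xs)
    best, best_score = {}, 0.0
    for assignment in permutations(slots, len(xs)):
        pairs = [(x, y) for x, y in zip(xs, assignment) if y is not None]
        if any(likelihood[x].get(y, 0.0) <= threshold for x, y in pairs):
            continue
        total = sum(likelihood[x][y] for x, y in pairs)
        if total > best_score:
            best, best_score = dict(pairs), total
    return best


likelihood = {
    "x1": {"y1": 0.9, "y2": 0.8},
    "x2": {"y1": 0.85, "y2": 0.1},
}
labels = label_likely_alignment(likelihood, threshold=0.5)
```

Picking the best label for each x independently would assign y1 to both x1 and x2. The matching instead labels x1 as y2 and x2 as y1, which respects the one-to-one constraint and has the highest total likelihood.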
Likely Alignment Editing
• Labeling conflicts arise when accumulating the newly labeled alignment of different iterations.
  ◦ x is labeled as y at the t-th iteration but as y' at the (t+1)-th iteration.
• We calculate the following likelihood difference:
$$\Delta^{(t)}_{(x, y, y')} = \pi(y \mid x; \Theta^{(t)}) - \pi(y' \mid x; \Theta^{(t)})$$
  ◦ If $\Delta^{(t)}_{(x, y, y')} > 0$, labeling x as y gives more alignment likelihood, so we choose y to label x. Otherwise, we choose y'.
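The editing rule can be sketched as a merge of two label dictionaries. This is a simplified illustration that covers only the conflict described above (one x receiving two different labels); the function name and the toy likelihood values are hypothetical.

```python
def edit_alignment(labeled, new_labels, likelihood):
    """Merge newly labeled alignment into the accumulated alignment.

    On a conflict (x already labeled y, now proposed y'), keep y if
    pi(y | x) - pi(y' | x) > 0 under the current embeddings, else use y'.
    """
    merged = dict(labeled)
    for x, y_new in new_labels.items():
        y_old = merged.get(x)
        if y_old is None or y_old == y_new:
            merged[x] = y_new
            continue
        delta = likelihood[x][y_old] - likelihood[x][y_new]
        merged[x] = y_old if delta > 0 else y_new
    return merged


labeled = {"x1": "y1", "x2": "y2"}
new_labels = {"x1": "y3", "x2": "y5", "x3": "y4"}
likelihood = {
    "x1": {"y1": 0.6, "y3": 0.9},   # delta < 0: switch to y3
    "x2": {"y2": 0.8, "y5": 0.7},   # delta > 0: keep y2
}
merged = edit_alignment(labeled, new_labels, likelihood)
```

Because the likelihoods are recomputed from the current embeddings, an early labeling mistake can be overturned in a later iteration instead of being accumulated.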
Experiments
• Datasets
  ◦ DBP15K: three cross-lingual datasets built from the multilingual versions of DBpedia: DBP_ZH-EN (Chinese to English), DBP_JA-EN (Japanese to English) and DBP_FR-EN (French to English). Each dataset contains 15 thousand reference entity alignments.
  ◦ DWY100K: two large-scale datasets extracted from DBpedia, Wikidata and YAGO3, denoted by DBP-WD and DBP-YG. Each dataset has 100 thousand reference entity alignments.
• Comparative approaches
  ◦ MTransE [IJCAI 2017] learns a linear transformation between KGs.
  ◦ IPTransE [IJCAI 2017] is an iterative method for entity alignment.
  ◦ JAPE [ISWC 2017] combines relation and attribute embeddings for entity alignment.
• Metrics
  ◦ Hits@k: the percentage of correct alignments ranked in the top k
  ◦ MRR: the mean of the reciprocal ranks of the correct results
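Both metrics can be computed from the rank of the correct counterpart for each test entity. The sketch below assumes 1-based ranks and uses hypothetical function names and toy ranks.

```python
def hits_at_k(ranks, k):
    """Percentage of test entities whose correct counterpart is ranked in the top k."""
    return 100.0 * sum(1 for rank in ranks if rank <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Mean of 1 / rank over all test entities."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Rank of the correct counterpart for four test entities (1 = best).
ranks = [1, 2, 1, 4]
hits1 = hits_at_k(ranks, 1)
hits2 = hits_at_k(ranks, 2)
mrr = mean_reciprocal_rank(ranks)
```

Hits@1 only rewards exact top-ranked matches, while MRR also gives partial credit to correct counterparts ranked slightly lower.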
• Main results on entity alignment (Hits@1 in %, MRR)

Approach    DBP_ZH-EN        DBP_JA-EN        DBP_FR-EN        DBP-WD           DBP-YG
            Hits@1  MRR      Hits@1  MRR      Hits@1  MRR      Hits@1  MRR      Hits@1  MRR
MTransE     30.83   0.364    27.86   0.349    24.41   0.335    28.12   0.363    25.15   0.334
IPTransE    40.59   0.516    36.69   0.474    33.30   0.451    34.85   0.447    29.74   0.386
JAPE        41.18   0.490    36.25   0.476    32.39   0.430    31.84   0.411    23.57   0.320
AlignE      47.18   0.581    44.76   0.563    48.12   0.599    56.55   0.655    63.29   0.707
BootEA      62.94   0.703    62.23   0.701    65.30   0.731    74.79   0.801    76.10   0.808

  ◦ AlignE outperformed the comparative approaches.
  ◦ BootEA considerably improved on the performance of AlignE by employing bootstrapping.
• F1-score w.r.t. the distribution of relation triple numbers
  ◦ We divided the entity links in the testing data into several intervals based on the number of their relation triples.
  ◦ The performance was assessed by F1-score within each interval.
  ◦ This analysis showed that BootEA can achieve promising results on sparse data, indicating its practical use for real KGs.
[Figure: F1-score (0.0 to 1.0) of MTransE, IPTransE, JAPE and BootEA for entities grouped by number of relation triples, with intervals [1,6), [6,11), [11,16), [16,21) and [21,∞); the figure also shows the number of entity alignments within each interval.]
Conclusion
• In this paper, we studied embedding-based entity alignment.
  ◦ We introduced a KG embedding model to learn alignment-oriented embeddings across different KGs. It employs an ε-truncated uniform negative sampling method to improve alignment performance.
  ◦ We conducted entity alignment in a bootstrapping process. It labels likely alignment as training data and edits alignment during iterations.
  ◦ Our experimental results showed that the proposed approach significantly outperformed three state-of-the-art embedding-based approaches on three cross-lingual datasets and two new large-scale datasets.
Thanks for your attention!
• This work is supported by the National Key R&D Program of China (No. 2018YFB1004300).
• Code and datasets of BootEA are available at https://github.com/nju-websoft/BootEA
• Welcome to my poster (#1425)
