Beyond Triplets: Hyper-Relational Knowledge Graph
Embedding for Link Prediction
Paolo Rosso, Dingqi Yang, Philippe Cudré-Mauroux
eXascale Infolab
University of Fribourg, Switzerland
April 2020 – The Web Conference 2020
Knowledge Graph (KG)
• Multi-relational graph composed of entities and relations
• Each fact is represented as a triplet (head entity, relation, tail entity)
• A fact indicates that two entities are connected by a specific relation
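As a minimal sketch of this triplet representation, a KG can be held as a set of (head, relation, tail) tuples. The `Triplet` type and the `tails` helper are illustrative names, and the second fact is added here only to make the graph multi-relational; it is not taken from the slides.

```python
from typing import NamedTuple, Set


class Triplet(NamedTuple):
    head: str
    relation: str
    tail: str


# A tiny multi-relational graph: a set of facts.
kg: Set[Triplet] = {
    Triplet("Marie Curie", "educated at", "University of Paris"),
    Triplet("University of Paris", "located in", "Paris"),  # illustrative fact
}


def tails(graph: Set[Triplet], head: str, relation: str) -> Set[str]:
    """Return every tail entity connected to `head` by `relation`."""
    return {t.tail for t in graph if t.head == head and t.relation == relation}


print(tails(kg, "Marie Curie", "educated at"))
```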
Knowledge Graph embeddings
• Represent entities and relations in a Knowledge Graph in a vector space
• Learn low-dimensional vector representations of entities and relations from a Knowledge Graph while preserving the graph properties
[Figure: TransE embedding of the triplet (Marie Curie, educated at, University of Paris)]
• Applications: semantic search, question answering, query expansion, recommendation systems
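The slide names TransE without describing it. As a hedged sketch: TransE treats a relation as a translation in the embedding space, so a plausible triplet should satisfy h + r ≈ t, and the negative distance between h + r and t can serve as a score. The two-dimensional vectors below are hand-picked toy values, not learned embeddings.

```python
import math


def transe_score(h, r, t):
    """Negative L2 distance between h + r and t; higher means more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# Toy embeddings (assumed values, chosen by hand for illustration only).
marie_curie = [0.2, 0.1]
educated_at = [0.3, 0.4]
university_of_paris = [0.5, 0.5]
physics = [0.9, -0.3]

good = transe_score(marie_curie, educated_at, university_of_paris)
bad = transe_score(marie_curie, educated_at, physics)
print(good > bad)
```

With these toy values the true tail scores higher than the unrelated entity, which is the property a trained model is meant to have.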
Hyper-relational facts
• The triplet-based representation of a KG oversimplifies the complex nature of hyper-relational data
• A hyper-relational fact contains multiple relations and entities
• A hyper-relational fact is represented by a base triplet and a set of key-value pairs
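One possible in-memory shape for such a fact, sketched under the assumption that key-value pairs are stored as an ordered tuple of pairs. The class name `HyperRelationalFact` and the field name `qualifiers` are hypothetical; the example values are the Marie Curie fact used elsewhere in the deck.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HyperRelationalFact:
    head: str
    relation: str
    tail: str
    # Zero or more (key, value) pairs attached to the base triplet.
    qualifiers: Tuple[Tuple[str, str], ...] = ()

    def base_triplet(self) -> Tuple[str, str, str]:
        """The (head, relation, tail) part of the fact."""
        return (self.head, self.relation, self.tail)


fact = HyperRelationalFact(
    head="Marie Curie",
    relation="educated at",
    tail="University of Paris",
    qualifiers=(
        ("academic major", "Physics"),
        ("academic degree", "Master of Science"),
    ),
)
print(fact.base_triplet(), len(fact.qualifiers))
```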
Hyper-relational facts representations
1. Triplet only: keep only the base triplet (irreversible information loss)
2. Reification: add an artificial entity that represents the base triplet and use it as the head entity for each k-v pair (creates additional triplets)
3. Relation paths: combine the relation and a key into a relation path and use it as a relation connecting the head and the value (creates additional triplets)
… we need a method to better learn hyper-relational facts!
Hyper-relational facts representations: Triplet only
• Marie Curie educated at University of Paris
Hyper-relational facts representations: Reification
• q := Marie Curie educated at University of Paris
• q academic major Physics
• q academic degree Master of Science
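The reification example can be sketched as a small transformation. This is an illustrative reading of the slide, not the authors' code: the function name `reify`, the tuple layout of the input fact, and the `q<id>` naming of the artificial entity are all assumptions.

```python
def reify(fact, fact_id):
    """Rewrite one hyper-relational fact as plain triplets.

    `fact` is (head, relation, tail, [(key, value), ...]). An artificial
    entity q stands for the base triplet and heads one triplet per k-v pair.
    """
    head, relation, tail, qualifiers = fact
    q = f"q{fact_id}"  # hypothetical naming scheme for the artificial entity
    triplets = [(head, relation, tail)]
    triplets += [(q, key, value) for key, value in qualifiers]
    return q, triplets


fact = (
    "Marie Curie", "educated at", "University of Paris",
    [("academic major", "Physics"), ("academic degree", "Master of Science")],
)
q, triplets = reify(fact, 0)
for triplet in triplets:
    print(triplet)
```

Each key-value pair adds one triplet and each fact adds one entity, which is the growth the deck refers to as "creates additional triplets".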
Hyper-relational facts representations: Relation paths
• Marie Curie educated at University of Paris
• rk1 := educated at AND academic major
• rk2 := educated at AND academic degree
• Marie Curie rk1 Physics
• Marie Curie rk2 Master of Science
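A comparable sketch for the relation-path transformation, again as an assumption-laden illustration: the function name `relation_paths` and the use of the literal string " AND " to join relation and key simply mirror the notation on the slide.

```python
def relation_paths(fact):
    """Rewrite one hyper-relational fact as plain triplets via relation paths.

    `fact` is (head, relation, tail, [(key, value), ...]). Each k-v pair
    yields a new relation "relation AND key" linking the head to the value.
    """
    head, relation, tail, qualifiers = fact
    triplets = [(head, relation, tail)]
    for key, value in qualifiers:
        triplets.append((head, f"{relation} AND {key}", value))
    return triplets


fact = (
    "Marie Curie", "educated at", "University of Paris",
    [("academic major", "Physics"), ("academic degree", "Master of Science")],
)
paths = relation_paths(fact)
for triplet in paths:
    print(triplet)
```

No artificial entity is needed here, but every distinct relation-key combination becomes a new relation in the vocabulary.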
N-Ary Representation
• A hyper-relational fact can be transformed into an n-ary representation
• E.g., the relation education is extracted from the hyper-relational fact
• N-ary representation: {education_head: Marie Curie, education_tail: University of Paris, education_major: Physics, education_degree: Master of Science}
… but triplets serve as the fundamental data structure in modern KGs because they preserve the essential information for link prediction
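The n-ary form above can be produced with a few lines of code. In this sketch the abstract relation name ("education") and the short role names ("major", "degree") are supplied by hand, since the slides do not say how they are derived; `to_nary` is a hypothetical helper.

```python
def to_nary(relation_name, head, tail, roles):
    """Build a role -> entity mapping for one fact.

    `roles` is a list of (role, value) pairs for the non-triplet elements.
    Note that head and tail become ordinary roles with no special status.
    """
    nary = {
        f"{relation_name}_head": head,
        f"{relation_name}_tail": tail,
    }
    for role, value in roles:
        nary[f"{relation_name}_{role}"] = value
    return nary


nary_fact = to_nary(
    "education",
    "Marie Curie",
    "University of Paris",
    [("major", "Physics"), ("degree", "Master of Science")],
)
print(nary_fact)
```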
Limitation of N-Ary Representation
• Null model: one triplet is extracted from the n-ary representation of each hyper-relational fact via a randomly sampled relation path (h, r_h k_i, v_i)
• E.g., Marie Curie education_head_education_major Physics
• Null hypothesis: the information for link prediction encoded by the original base triplets is not greater than that encoded by the triplets created by the null model
• We test the null hypothesis on link prediction tasks by running the baseline models on the original base triplets and on the null-model triplets
• Result: the null hypothesis is rejected
  • The performance of the baselines on the original base triplets is consistently and significantly better than their performance on the null model
  • The information encoded in the original base triplets is greater than the information encoded in the null model
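A rough sketch of how the null model's sampling step might look. Two points are assumptions not settled by the slide: that the sampled role excludes both the head and the tail roles, and that the path name is formed by joining the two role names with an underscore (which matches the slide's example). `null_model_triplet` is a hypothetical name.

```python
import random


def null_model_triplet(nary, relation_name, rng):
    """Sample one (h, r_h k_i, v_i) triplet from an n-ary fact."""
    head_role = f"{relation_name}_head"
    tail_role = f"{relation_name}_tail"
    # Assumption: only the non-triplet roles are eligible for sampling.
    candidates = [role for role in nary if role not in (head_role, tail_role)]
    role = rng.choice(candidates)
    return (nary[head_role], f"{head_role}_{role}", nary[role])


nary_fact = {
    "education_head": "Marie Curie",
    "education_tail": "University of Paris",
    "education_major": "Physics",
    "education_degree": "Master of Science",
}
sampled = null_model_triplet(nary_fact, "education", random.Random(0))
print(sampled)
```

Training the same baseline on these sampled triplets and on the original base triplets, then comparing link prediction quality, is the test the slide describes.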
HINGE: Hyper-relatIonal kNowledge Graph Embedding
• A KG embedding model that learns from hyper-relational facts in a KG
• Captures the primary structural information of the KG encoded in the triplets, in order to preserve the essential information for link prediction
• Captures the correlation between each triplet and its associated key-value pairs (if any)
HINGE overview
[Figure: overview of the HINGE model]
HINGE architecture
• The base triplet encodes the primary structural information and captures the essential information for link prediction
• A CNN learns from the base triplet and captures the triple-wise relatedness between the h, r, t embeddings
• The triple-wise relatedness vector is used to characterize the plausibility of the base triplet being true
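To make the idea of a triple-wise relatedness vector concrete, here is a deliberately simplified, dependency-free sketch. It stacks the h, r, t embeddings as rows, slides each filter along the embedding dimension, applies a ReLU, and flattens the results. The 3x3 filter shape, the ReLU, and the toy numbers are assumptions for illustration; the slides do not give the layer details, and a real model would learn the filter weights.

```python
def relatedness_vector(rows, filters):
    """Convolve filters over stacked embeddings and flatten the feature maps.

    `rows` is a list of equal-length embeddings (e.g. [h, r, t]).
    Each filter has one row per embedding and slides along the dimension axis.
    """
    n_rows, dim = len(rows), len(rows[0])
    features = []
    for filt in filters:
        assert len(filt) == n_rows, "each filter must span all stacked embeddings"
        width = len(filt[0])
        for start in range(dim - width + 1):
            activation = sum(
                filt[i][j] * rows[i][start + j]
                for i in range(n_rows)
                for j in range(width)
            )
            features.append(max(0.0, activation))  # ReLU
    return features


# Toy 4-dimensional embeddings and a single all-ones 3x3 filter (assumed values).
h = [1.0, 0.0, 1.0, 0.0]
r = [0.0, 1.0, 0.0, 1.0]
t = [1.0, 1.0, 1.0, 1.0]
ones_filter = [[1.0] * 3 for _ in range(3)]

triple_vec = relatedness_vector([h, r, t], [ones_filter])
print(triple_vec)
```

The same function accepts five stacked rows, so it could also illustrate a quintuple-wise vector over h, r, t, k, v with filters that have five rows.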
HINGE architecture
• A CNN learns from the base triplet together with each associated key-value pair and captures the quintuple-wise relatedness between the h, r, t, k, v embeddings
• The quintuple-wise relatedness vector is used to characterize the plausibility of the base triplet, together with its k-v pair, being true
HINGE architecture
• Concatenate the triple-wise and quintuple-wise relatedness feature vectors
• Take the minimum value along each dimension
  • For a valid hyper-relational fact, all relatedness vectors should have high values
  • The minimum along each dimension is therefore expected to be high
• A fully connected layer outputs the score for the input hyper-relational fact
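The merge step can be sketched as follows, assuming "concatenate" means stacking the relatedness vectors as rows of a matrix before taking the column-wise minimum. The function names, the weights, and the numeric vectors are made up for illustration; in a trained model the fully connected layer's weights would be learned.

```python
def min_merge(triple_vec, quintuple_vecs):
    """Element-wise minimum over the triple-wise and all quintuple-wise vectors."""
    vectors = [triple_vec] + list(quintuple_vecs)
    return [min(column) for column in zip(*vectors)]


def fact_score(merged, weights, bias=0.0):
    """A single fully connected output unit applied to the merged vector."""
    return sum(w * m for w, m in zip(weights, merged)) + bias


triple_vec = [0.9, 0.8, 0.7]
quintuple_vecs = [
    [0.6, 0.9, 0.8],  # base triplet + first key-value pair
    [0.7, 0.5, 0.9],  # base triplet + second key-value pair
]

merged = min_merge(triple_vec, quintuple_vecs)
score = fact_score(merged, weights=[1.0, 1.0, 1.0])
print(merged, score)
```

Because of the minimum, a single implausible key-value pair pulls the merged vector down, and a fact without key-value pairs is scored from its triple-wise vector alone.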
Experiments
• Baselines learning from triplet facts only:
  • Translational distance models: TransE, TransH, TransR and TransD
  • Semantic matching models: RESCAL, DistMult, ComplEx, ANALOGY and ConvE
• Baselines learning from hyper-relational facts:
  • m-TransH, RAE, NaLP, NaLP-Fix
Datasets
• Hyper-relational datasets
  • JF17K: filtered from Freebase to have a significant presence of hyper-relational facts
  • WikiPeople: extracted from Wikidata and focuses on entities of type human
• JF17K and WikiPeople contain triplet facts and hyper-relational facts
• Dataset configurations: Triplet Only, Relation Path, Reification, Original Hyper-Relational
Evaluation Tasks and Metrics
• Link prediction: given two elements of a triplet in a fact, predict the missing one
• E.g., (?,r,t), (h,?,t) or (h,r,?)
• Metrics: Mean Reciprocal Rank (MRR), Hits@10 and Hits@1
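For reference, both metrics are computed from the rank that the model assigns to the correct answer of each test query (rank 1 is best). The ranks below are invented for illustration and are not results from the paper.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


def hits_at(ranks, k):
    """Fraction of test queries whose correct answer is ranked in the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Made-up ranks of the correct entity for four (h, r, ?) queries.
ranks = [1, 2, 4, 20]

print(mean_reciprocal_rank(ranks))  # approximately 0.45
print(hits_at(ranks, 10))           # 0.75
print(hits_at(ranks, 1))            # 0.25
```

MRR rewards placing the correct answer near the top even when it is not first, while Hits@k only checks whether it falls inside the cutoff.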
Comparison with Baselines Learning from Hyper-Relational Facts
• HINGE outperforms all baselines learning from hyper-relational facts
• Among the baselines:
• NaLP shows the best performance (it learns the relatedness between k-v pairs but
it is not aware of the triplet structure)
• m-TransH and RAE learn only from entities
Comparison with Baselines Learning from Triplets Only
Transformation Setting | WikiPeople: Head/Tail Prediction (%) | WikiPeople: Relation Prediction (%) | JF17K: Head/Tail Prediction (%) | JF17K: Relation Prediction (%)
Basic | 0.81 | 18.55 | 41.45 | 8.41
Relation Path | 2.85 | 24.65 | 22.96 | 12.24
Reification | 3.28 | 22.22 | 29.71 | 6.53
• Basic setting discards the key-value pairs (information loss)
• Exception for the head/tail prediction on WikiPeople with Basic setting (dominance
of triplet)
• Relation Path and Reification create extra entities/relations (not capturing
essential information for link prediction)
Conclusions
• We investigate the problem of hyper-relational KG embedding
showing that triplets serve as the fundamental data structure in
modern KGs encoding the essential information for link prediction
• We introduce HINGE, a KG embedding model that captures the
information encoded in the triplets and the correlation between each
triplet and its associated key-value pairs
• We show that the triplet structure is the fundamental structure for
link prediction and show the limitations of a commonly used
representation scheme
• We observe an improvement on various link prediction tasks with
different data transformation settings
Any questions?
exascale.info
Paper link: bit.ly/2VaDNC8