Beyond Triplets: Hyper-Relational Knowledge Graph
Embedding for Link Prediction
Paolo Rosso, Dingqi Yang, Philippe Cudré-Mauroux
eXascale Infolab
University of Fribourg, Switzerland
April 2020 – The Web Conference 2020
Knowledge Graph (KG)
• Multi-relational graph composed of entities and relations
• Each fact is represented as a triplet (head entity, relation, tail entity)
• A fact indicates that two entities are connected by a specific relation
Knowledge Graph embeddings
• Represent entities and relations in a Knowledge Graph using a vector
space
• Learn low-dimensional vector representation of entities and relations
from a Knowledge Graph while preserving the graph properties
• Example triplet: (Marie Curie, educated at, University of Paris)
• Applications: semantic search, question answering, query expansion, recommendation systems
• Example embedding model: TransE
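To make the idea of scoring a triplet in a vector space concrete, here is a minimal sketch of the standard TransE scoring function, which treats a relation as a translation from head to tail. The 2-dimensional embeddings are hand-picked toy values, not learned ones, and the function name is only illustrative.

```python
import math


def transe_score(h, r, t):
    """Negative L2 distance between h + r and t (higher means more plausible)."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# Toy embeddings chosen by hand for illustration only (assumption, not trained).
marie_curie = [0.0, 1.0]
educated_at = [1.0, 0.0]
university_of_paris = [1.0, 1.0]
physics = [3.0, -2.0]

plausible = transe_score(marie_curie, educated_at, university_of_paris)
implausible = transe_score(marie_curie, educated_at, physics)
print(plausible > implausible)
```

With these toy values the true tail lies exactly at h + r, so it receives the higher score.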
Hyper-relational facts
• Triplet-based representation of a KG oversimplifies the complex nature of
hyper-relational data
• A hyper-relational fact is a fact that involves multiple relations and
entities
• A hyper-relational fact is represented by a base triplet and a set of key-value pairs
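As a small illustration of this structure, the sketch below stores a hyper-relational fact as a base triplet plus a tuple of key-value pairs. The class and field names are assumptions made for this example; the sample fact is the Marie Curie example used in these slides.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HyperRelationalFact:
    """A base triplet (head, relation, tail) plus optional key-value pairs."""
    head: str
    relation: str
    tail: str
    # Zero or more (key, value) pairs that qualify the base triplet.
    key_values: Tuple[Tuple[str, str], ...] = ()

    def base_triplet(self):
        return (self.head, self.relation, self.tail)


fact = HyperRelationalFact(
    "Marie Curie",
    "educated at",
    "University of Paris",
    (("academic major", "Physics"), ("academic degree", "Master of Science")),
)
print(fact.base_triplet(), len(fact.key_values))
```

A plain triplet fact is the special case with an empty `key_values` tuple.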
Hyper-relational facts representations
1. Triplet only: keeping only the base triplet (irreversible information
loss)
2. Reification: adding an artificial entity to represent the base triplet
and using it as the head entity for each k-v pair (creates additional triplets)
3. Relation paths: creating a relation path from the relation and each key and using it as
a relation to connect the head and the value (creates additional triplets)
… we need a method to better learn hyper-relational facts!
Hyper-relational facts representations: triplet only
• (Marie Curie, educated at, University of Paris)
Hyper-relational facts representations: reification
• q := (Marie Curie, educated at, University of Paris)
• (q, academic major, Physics)
• (q, academic degree, Master of Science)
Hyper-relational facts representations: relation paths
• (Marie Curie, educated at, University of Paris)
• rk1 := educated at AND academic major
• rk2 := educated at AND academic degree
• (Marie Curie, rk1, Physics)
• (Marie Curie, rk2, Master of Science)
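The three transformations can be sketched as small functions over a fact given as head, relation, tail and a list of key-value pairs. This is only an illustration under assumptions: the function names are made up, and since the slides do not say how the artificial entity q is tied back to its base triplet, the reification sketch simply keeps that mapping in a side dictionary.

```python
def triplet_only(h, r, t, key_values):
    """Keep only the base triplet; the key-value pairs are dropped."""
    return [(h, r, t)]


def reification(h, r, t, key_values, node="q"):
    """Use an artificial entity as the head of one triplet per key-value pair."""
    represents = {node: (h, r, t)}  # side mapping (assumption of this sketch)
    return represents, [(node, k, v) for k, v in key_values]


def relation_paths(h, r, t, key_values):
    """Connect the head to each value through a combined relation-key path."""
    return [(h, r, t)] + [(h, f"{r} AND {k}", v) for k, v in key_values]


fact = ("Marie Curie", "educated at", "University of Paris",
        [("academic major", "Physics"), ("academic degree", "Master of Science")])

print(triplet_only(*fact))
print(reification(*fact))
print(relation_paths(*fact))
```

Reification and relation paths both produce one extra triplet per key-value pair, which is the "creates additional triplets" cost noted in the list.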
N-ary representation
• A hyper-relational fact can be transformed into an n-ary representation
• E.g., the relation education is extracted from the hyper-relational fact
• N-ary representation: {education_head: Marie Curie, education_tail: University of
Paris, education_major: Physics, education_degree: Master of Science}
… but triplets serve as the fundamental data structure in modern KGs
because they preserve the essential information for link prediction
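A minimal sketch of this transformation, assuming the n-ary relation name ("education") and the role names ("major", "degree") are supplied by the caller; how they are derived from "educated at", "academic major" and so on is not shown in the slides.

```python
def to_nary(relation, head, tail, roles):
    """Flatten a hyper-relational fact into a role -> entity mapping."""
    nary = {f"{relation}_head": head, f"{relation}_tail": tail}
    for role, value in roles:
        nary[f"{relation}_{role}"] = value
    return nary


nary_fact = to_nary(
    "education",
    "Marie Curie",
    "University of Paris",
    [("major", "Physics"), ("degree", "Master of Science")],
)
print(nary_fact)
```

In this form the head and tail are just two roles among several, so the base triplet no longer has a distinguished place in the representation.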
Limitation of N-Ary Representation
• Null model: one triplet is extracted from the n-ary representation of each hyper-relational
fact via a randomly sampled relation path (h, r_h k_i, v_i)
• E.g., (Marie Curie, education_head_education_major, Physics)
• Null hypothesis: the information for link prediction encoded by the original base
triplets is not greater than that encoded by the triplets created by the null model
• We test the null hypothesis on link prediction tasks by running the baseline models on the
original base triplets and on the null-model triplets
• Result: the null hypothesis is rejected
• The performance of the baselines on the original base triplets is consistently and significantly
better than their performance on the null-model triplets
• The information encoded in the original base triplets is greater than the information encoded in
the null model
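The null-model sampling step can be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, the path label is built by joining the head role and the sampled key with an underscore (matching the example above), and the sample is drawn only from roles other than head and tail, which is one reading of the (h, r_h k_i, v_i) notation.

```python
import random


def null_model_triplet(nary, relation, rng):
    """Sample one relation-path triplet (head, head_role + key, value)."""
    head_role = f"{relation}_head"
    tail_role = f"{relation}_tail"
    # Candidate keys: every role except head and tail (assumption of this sketch).
    keys = sorted(k for k in nary if k not in (head_role, tail_role))
    key = rng.choice(keys)
    return (nary[head_role], f"{head_role}_{key}", nary[key])


nary_example = {
    "education_head": "Marie Curie",
    "education_tail": "University of Paris",
    "education_major": "Physics",
    "education_degree": "Master of Science",
}
sampled = null_model_triplet(nary_example, "education", random.Random(0))
print(sampled)
```

Each hyper-relational fact contributes exactly one such triplet, so the null-model dataset has the same number of facts as the base-triplet dataset it is compared against.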
HINGE: Hyper-relatIonal kNowledge Graph Embedding
• KG embedding model to learn hyper-relational facts in a KG
• Capturing the primary structural information of the KG encoded in
the triplets in order to preserve the essential information for link
prediction
• Capturing the correlation between each triplet and its associated key-
value pairs (if any)
HINGE overview
HINGE architecture
• The base triplet encodes the primary structural information and captures the essential
information for link prediction
• A CNN learns from the base triplet and captures the triple-wise relatedness between the h, r, t
embeddings
• The triple-wise relatedness vector is used to characterize the plausibility of the base triplet
being true
HINGE architecture
• A CNN learns from each key-value pair together with its associated triplet and captures the quintuple-wise
relatedness between the h, r, t, k, v embeddings
• The quintuple-wise relatedness vector is used to characterize the plausibility of the base triplet associated with
the k-v pair being true
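The convolution step can be illustrated with a deliberately simplified sketch: the embeddings are stacked as rows (3 rows for h, r, t; 5 rows for h, r, t, k, v) and a single filter spanning all rows slides along the embedding dimensions, followed by ReLU. The filter width, the use of one filter, and the toy numbers are assumptions of this example, not the configuration of the actual model, which would use learned filters.

```python
def relatedness(rows, filt):
    """Slide one filter over stacked embeddings and apply ReLU.

    rows: n embeddings of equal length (n = 3 or n = 5 in the setting above).
    filt: n x w weights covering all rows and w adjacent dimensions.
    Returns a feature vector with len(rows[0]) - w + 1 entries.
    """
    n, dim, width = len(rows), len(rows[0]), len(filt[0])
    if len(filt) != n:
        raise ValueError("filter must have one row per embedding")
    features = []
    for start in range(dim - width + 1):
        total = sum(
            filt[i][j] * rows[i][start + j]
            for i in range(n)
            for j in range(width)
        )
        features.append(max(0.0, total))  # ReLU
    return features


# Toy embeddings and filter weights, hand-picked for illustration only.
h, r, t = [1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]
triple_filter = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
triple_features = relatedness([h, r, t], triple_filter)
print(triple_features)
```

The same function handles the quintuple case by passing five rows and a five-row filter.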
HINGE architecture
• Concatenate the triple-wise and quintuple-wise relatedness feature vectors
• Take the minimum value along each dimension
• For a valid hyper-relational fact, the values of all relatedness vectors should be high
• The minimum along each dimension is therefore expected to be high
• A fully connected layer outputs the score for the input hyper-relational fact
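The merge step described above can be sketched in a few lines: stack the relatedness vectors, take the element-wise minimum, and apply a linear layer. The weights, bias, and vector values are made-up numbers for illustration, and the function names are hypothetical.

```python
def merge_min(triple_vec, quintuple_vecs):
    """Element-wise minimum over the triple-wise and all quintuple-wise vectors."""
    vectors = [triple_vec] + list(quintuple_vecs)
    return [min(column) for column in zip(*vectors)]


def fact_score(triple_vec, quintuple_vecs, weights, bias=0.0):
    """Fully connected layer applied to the merged vector."""
    merged = merge_min(triple_vec, quintuple_vecs)
    return sum(w * m for w, m in zip(weights, merged)) + bias


# Toy relatedness vectors: one for the base triplet, one per key-value pair.
triple_vec = [0.9, 0.8, 0.7]
quintuple_vecs = [[0.6, 0.9, 0.8], [0.95, 0.5, 0.9]]
weights = [1.0, 1.0, 1.0]  # assumption: untrained placeholder weights

print(merge_min(triple_vec, quintuple_vecs))
print(fact_score(triple_vec, quintuple_vecs, weights))
```

Because the minimum is taken per dimension, a single implausible key-value pair is enough to pull the merged vector, and hence the score, down. A plain triplet fact passes an empty list of quintuple-wise vectors and is scored from its triple-wise vector alone.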
HINGE architecture
Experiments
• Baselines learning from triplet facts only:
• Translational distance models: TransE, TransH, TransR and TransD
• Semantic matching models: RESCAL, DistMult, ComplEx, ANALOGY and ConvE
• Baselines learning from hyper-relational facts:
• m-TransH, RAE, NaLP, NaLP-Fix
Datasets
• Hyper-relational datasets
• JF17K: filtered from Freebase to have a significant presence of hyper-relational
facts
• WikiPeople: extracted from Wikidata and focuses on entities of type human
• JF17K and WikiPeople contain triplet facts and hyper-relational facts
• Dataset configurations: Triplet Only, Relation Path, Reification, Original
Hyper-Relational
Evaluation Tasks and Metrics
• Link prediction: given two elements of a triplet in a fact, predict the missing one
• E.g., (?,r,t), (h,?,t) or (h,r,?)
• Mean Reciprocal Rank (MRR), Hits@10 and Hits@1
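These metrics follow their standard definitions and can be computed from the rank of the correct answer in each test query. The sketch below uses made-up ranks, and the helper names are illustrative.

```python
def rank_of(true_item, scores):
    """1-based rank of the true item; candidates with higher scores rank first."""
    true_score = scores[true_item]
    return 1 + sum(1 for s in scores.values() if s > true_score)


def mean_reciprocal_rank(ranks):
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at(ranks, k):
    """Fraction of queries whose correct answer is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy query (h, r, ?) with made-up scores for three candidate tails.
scores = {"University of Paris": 0.9, "Physics": 0.4, "Warsaw": 0.95}
print(rank_of("University of Paris", scores))

# Made-up ranks for four test queries.
ranks = [1, 2, 10, 50]
print(mean_reciprocal_rank(ranks), hits_at(ranks, 10), hits_at(ranks, 1))
```

MRR rewards placing the correct answer near the top of the ranking, while Hits@k only checks whether it appears within the first k candidates.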
Comparison with Baselines Learning from Hyper-Relational Facts
• HINGE outperforms all baselines learning from hyper-relational facts
• Among the baselines:
• NaLP shows the best performance (it learns the relatedness between k-v pairs but
it is not aware of the triplet structure)
• m-TransH and RAE learn only from entities
Comparison with Baselines Learning from Triplets Only
Dataset transformation   WikiPeople        WikiPeople        JF17K             JF17K
setting                  Head/Tail         Relation          Head/Tail         Relation
                         Prediction (%)    Prediction (%)    Prediction (%)    Prediction (%)
Basic                    0.81              18.55             41.45             8.41
Relation Path            2.85              24.65             22.96             12.24
Reification              3.28              22.22             29.71             6.53
• Basic setting discards the key-value pairs (information loss)
• Exception: head/tail prediction on WikiPeople with the Basic setting (triplet facts
dominate the dataset)
• Relation Path and Reification create extra entities/relations (not capturing
essential information for link prediction)
Conclusions
• We investigate the problem of hyper-relational KG embedding
showing that triplets serve as the fundamental data structure in
modern KGs encoding the essential information for link prediction
• We introduce HINGE, a KG embedding model that captures the
information encoded in the triplets and the correlation between each
triplet and its associated key-value pairs
• We show that the triplet structure is the fundamental structure for
link prediction and show the limitations of a commonly used
representation scheme
• We observe an improvement on various link prediction tasks with
different data transformation settings
Any questions?
exascale.info
Paper link: bit.ly/2VaDNC8
29

More Related Content

Similar to Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction

Nimrita koul Machine Learning
Nimrita koul  Machine LearningNimrita koul  Machine Learning
Nimrita koul Machine Learning
Nimrita Koul
 
Dimensionality reduction by matrix factorization using concept lattice in dat...
Dimensionality reduction by matrix factorization using concept lattice in dat...Dimensionality reduction by matrix factorization using concept lattice in dat...
Dimensionality reduction by matrix factorization using concept lattice in dat...
eSAT Journals
 
graph_embeddings
graph_embeddingsgraph_embeddings
graph_embeddings
Armando Vieira
 
151028_abajpai1
151028_abajpai1151028_abajpai1
151028_abajpai1
Anshumaan Bajpai
 
Kernel methods and variable selection for exploratory analysis and multi-omic...
Kernel methods and variable selection for exploratory analysis and multi-omic...Kernel methods and variable selection for exploratory analysis and multi-omic...
Kernel methods and variable selection for exploratory analysis and multi-omic...
tuxette
 
Composing graphical models with neural networks for structured representatio...
Composing graphical models with  neural networks for structured representatio...Composing graphical models with  neural networks for structured representatio...
Composing graphical models with neural networks for structured representatio...
Jeongmin Cha
 
17- Kernels and Clustering.pptx
17- Kernels and Clustering.pptx17- Kernels and Clustering.pptx
17- Kernels and Clustering.pptx
ssuser2023c6
 
Neural Nets Deconstructed
Neural Nets DeconstructedNeural Nets Deconstructed
Neural Nets Deconstructed
Paul Sterk
 
How Does Math Matter in Data Science
How Does Math Matter in Data ScienceHow Does Math Matter in Data Science
How Does Math Matter in Data Science
Mutia Ulfi
 
Probabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering NetworksProbabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering Networks
Tomaso Aste
 
Data mining knowledge representation Notes
Data mining knowledge representation NotesData mining knowledge representation Notes
Data mining knowledge representation Notes
RevathiSundar4
 
Probablistic information retrieval
Probablistic information retrievalProbablistic information retrieval
Probablistic information retrieval
Nisha Arankandath
 
Ranking nodes in growing networks: when PageRank fails
Ranking nodes in growing networks: when PageRank failsRanking nodes in growing networks: when PageRank fails
Ranking nodes in growing networks: when PageRank fails
Pietro De Nicolao
 
Introduction to Principle Component Analysis
Introduction to Principle Component AnalysisIntroduction to Principle Component Analysis
Introduction to Principle Component Analysis
Sunjeet Jena
 
A multithreaded method for network alignment
A multithreaded method for network alignmentA multithreaded method for network alignment
A multithreaded method for network alignment
David Gleich
 
More investment in Research and Development for better Education in the future?
More investment in Research and Development for better Education in the future?More investment in Research and Development for better Education in the future?
More investment in Research and Development for better Education in the future?
Dhafer Malouche
 
What makes a linked data pattern interesting?
What makes a linked data pattern interesting?What makes a linked data pattern interesting?
What makes a linked data pattern interesting?
Szymon Klarman
 
probabilistic ranking
probabilistic rankingprobabilistic ranking
probabilistic ranking
FELIX75
 
Document ranking using qprp with concept of multi dimensional subspace
Document ranking using qprp with concept of multi dimensional subspaceDocument ranking using qprp with concept of multi dimensional subspace
Document ranking using qprp with concept of multi dimensional subspace
Prakash Dubey
 
Practical dimensions
Practical dimensionsPractical dimensions
Practical dimensions
tholem
 

Similar to Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction (20)

Nimrita koul Machine Learning
Nimrita koul  Machine LearningNimrita koul  Machine Learning
Nimrita koul Machine Learning
 
Dimensionality reduction by matrix factorization using concept lattice in dat...
Dimensionality reduction by matrix factorization using concept lattice in dat...Dimensionality reduction by matrix factorization using concept lattice in dat...
Dimensionality reduction by matrix factorization using concept lattice in dat...
 
graph_embeddings
graph_embeddingsgraph_embeddings
graph_embeddings
 
151028_abajpai1
151028_abajpai1151028_abajpai1
151028_abajpai1
 
Kernel methods and variable selection for exploratory analysis and multi-omic...
Kernel methods and variable selection for exploratory analysis and multi-omic...Kernel methods and variable selection for exploratory analysis and multi-omic...
Kernel methods and variable selection for exploratory analysis and multi-omic...
 
Composing graphical models with neural networks for structured representatio...
Composing graphical models with  neural networks for structured representatio...Composing graphical models with  neural networks for structured representatio...
Composing graphical models with neural networks for structured representatio...
 
17- Kernels and Clustering.pptx
17- Kernels and Clustering.pptx17- Kernels and Clustering.pptx
17- Kernels and Clustering.pptx
 
Neural Nets Deconstructed
Neural Nets DeconstructedNeural Nets Deconstructed
Neural Nets Deconstructed
 
How Does Math Matter in Data Science
How Does Math Matter in Data ScienceHow Does Math Matter in Data Science
How Does Math Matter in Data Science
 
Probabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering NetworksProbabilistic Modelling with Information Filtering Networks
Probabilistic Modelling with Information Filtering Networks
 
Data mining knowledge representation Notes
Data mining knowledge representation NotesData mining knowledge representation Notes
Data mining knowledge representation Notes
 
Probablistic information retrieval
Probablistic information retrievalProbablistic information retrieval
Probablistic information retrieval
 
Ranking nodes in growing networks: when PageRank fails
Ranking nodes in growing networks: when PageRank failsRanking nodes in growing networks: when PageRank fails
Ranking nodes in growing networks: when PageRank fails
 
Introduction to Principle Component Analysis
Introduction to Principle Component AnalysisIntroduction to Principle Component Analysis
Introduction to Principle Component Analysis
 
A multithreaded method for network alignment
A multithreaded method for network alignmentA multithreaded method for network alignment
A multithreaded method for network alignment
 
More investment in Research and Development for better Education in the future?
More investment in Research and Development for better Education in the future?More investment in Research and Development for better Education in the future?
More investment in Research and Development for better Education in the future?
 
What makes a linked data pattern interesting?
What makes a linked data pattern interesting?What makes a linked data pattern interesting?
What makes a linked data pattern interesting?
 
probabilistic ranking
probabilistic rankingprobabilistic ranking
probabilistic ranking
 
Document ranking using qprp with concept of multi dimensional subspace
Document ranking using qprp with concept of multi dimensional subspaceDocument ranking using qprp with concept of multi dimensional subspace
Document ranking using qprp with concept of multi dimensional subspace
 
Practical dimensions
Practical dimensionsPractical dimensions
Practical dimensions
 

More from eXascale Infolab

It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
eXascale Infolab
 
Representation Learning on Complex Graphs
Representation Learning on Complex GraphsRepresentation Learning on Complex Graphs
Representation Learning on Complex Graphs
eXascale Infolab
 
A force directed approach for offline gps trajectory map
A force directed approach for offline gps trajectory mapA force directed approach for offline gps trajectory map
A force directed approach for offline gps trajectory map
eXascale Infolab
 
Cikm 2018
Cikm 2018Cikm 2018
Cikm 2018
eXascale Infolab
 
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
eXascale Infolab
 
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
eXascale Infolab
 
Dependency-Driven Analytics: A Compass for Uncharted Data Oceans
Dependency-Driven Analytics: A Compass for Uncharted Data OceansDependency-Driven Analytics: A Compass for Uncharted Data Oceans
Dependency-Driven Analytics: A Compass for Uncharted Data Oceans
eXascale Infolab
 
Crowd scheduling www2016
Crowd scheduling www2016Crowd scheduling www2016
Crowd scheduling www2016
eXascale Infolab
 
SANAPHOR: Ontology-based Coreference Resolution
SANAPHOR: Ontology-based Coreference ResolutionSANAPHOR: Ontology-based Coreference Resolution
SANAPHOR: Ontology-based Coreference Resolution
eXascale Infolab
 
Efficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked DataEfficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked Data
eXascale Infolab
 
Entity-Centric Data Management
Entity-Centric Data ManagementEntity-Centric Data Management
Entity-Centric Data Management
eXascale Infolab
 
SSSW 2015 Sense Making
SSSW 2015 Sense MakingSSSW 2015 Sense Making
SSSW 2015 Sense Making
eXascale Infolab
 
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked DataLDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
eXascale Infolab
 
Executing Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataExecuting Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web Data
eXascale Infolab
 
The Dynamics of Micro-Task Crowdsourcing
The Dynamics of Micro-Task CrowdsourcingThe Dynamics of Micro-Task Crowdsourcing
The Dynamics of Micro-Task Crowdsourcing
eXascale Infolab
 
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
eXascale Infolab
 
CIKM14: Fixing grammatical errors by preposition ranking
CIKM14: Fixing grammatical errors by preposition rankingCIKM14: Fixing grammatical errors by preposition ranking
CIKM14: Fixing grammatical errors by preposition ranking
eXascale Infolab
 
OLTP-Bench
OLTP-BenchOLTP-Bench
OLTP-Bench
eXascale Infolab
 
An Introduction to Big Data
An Introduction to Big DataAn Introduction to Big Data
An Introduction to Big Data
eXascale Infolab
 
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
eXascale Infolab
 

More from eXascale Infolab (20)

It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
It Takes Two: Instrumenting the Interaction between In-Memory Databases and S...
 
Representation Learning on Complex Graphs
Representation Learning on Complex GraphsRepresentation Learning on Complex Graphs
Representation Learning on Complex Graphs
 
A force directed approach for offline gps trajectory map
A force directed approach for offline gps trajectory mapA force directed approach for offline gps trajectory map
A force directed approach for offline gps trajectory map
 
Cikm 2018
Cikm 2018Cikm 2018
Cikm 2018
 
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms wit...
 
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
SwissLink: High-Precision, Context-Free Entity Linking Exploiting Unambiguous...
 
Dependency-Driven Analytics: A Compass for Uncharted Data Oceans
Dependency-Driven Analytics: A Compass for Uncharted Data OceansDependency-Driven Analytics: A Compass for Uncharted Data Oceans
Dependency-Driven Analytics: A Compass for Uncharted Data Oceans
 
Crowd scheduling www2016
Crowd scheduling www2016Crowd scheduling www2016
Crowd scheduling www2016
 
SANAPHOR: Ontology-based Coreference Resolution
SANAPHOR: Ontology-based Coreference ResolutionSANAPHOR: Ontology-based Coreference Resolution
SANAPHOR: Ontology-based Coreference Resolution
 
Efficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked DataEfficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked Data
 
Entity-Centric Data Management
Entity-Centric Data ManagementEntity-Centric Data Management
Entity-Centric Data Management
 
SSSW 2015 Sense Making
SSSW 2015 Sense MakingSSSW 2015 Sense Making
SSSW 2015 Sense Making
 
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked DataLDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
LDOW2015 - Uduvudu: a Graph-Aware and Adaptive UI Engine for Linked Data
 
Executing Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web DataExecuting Provenance-Enabled Queries over Web Data
Executing Provenance-Enabled Queries over Web Data
 
The Dynamics of Micro-Task Crowdsourcing
The Dynamics of Micro-Task CrowdsourcingThe Dynamics of Micro-Task Crowdsourcing
The Dynamics of Micro-Task Crowdsourcing
 
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
Fixing the Domain and Range of Properties in Linked Data by Context Disambigu...
 
CIKM14: Fixing grammatical errors by preposition ranking
CIKM14: Fixing grammatical errors by preposition rankingCIKM14: Fixing grammatical errors by preposition ranking
CIKM14: Fixing grammatical errors by preposition ranking
 
OLTP-Bench
OLTP-BenchOLTP-Bench
OLTP-Bench
 
An Introduction to Big Data
An Introduction to Big DataAn Introduction to Big Data
An Introduction to Big Data
 
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
 

Recently uploaded

Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
Antonios Katsarakis
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
DianaGray10
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
Jason Yip
 
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
“How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff...
Edge AI and Vision Alliance
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
Neo4j
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Neo4j
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
Pablo Gómez Abajo
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Precisely
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
saastr
 
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsConnector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
DianaGray10
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
Fwdays
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
Javier Junquera
 

Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction

  • 1. Beyond Triplets: Hyper-Relational Knowledge Graph Embedding for Link Prediction
      Paolo Rosso, Dingqi Yang, Philippe Cudré-Mauroux
      eXascale Infolab, University of Fribourg, Switzerland
      April 2020 – The Web Conference 2020
  • 2. Knowledge Graph (KG)
      • Multi-relational graph composed of entities and relations
      • Each fact is represented as a triplet (head entity, relation, tail entity)
      • A fact indicates that two entities are connected by a specific relation
  • 3. Knowledge Graph embeddings
      • Represent entities and relations in a Knowledge Graph using a vector space
      • Learn low-dimensional vector representations of entities and relations from a Knowledge Graph while preserving the graph properties
      [Figure: the triplet (Marie Curie, educated at, University of Paris) embedded with TransE; listed applications: semantic search, question answering, query expansion, recommendation systems]
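As a minimal sketch of what a triplet embedding model scores, the block below uses the standard TransE idea that a plausible triplet satisfies h + r ≈ t. The embedding size, the random vectors, and the function name `transe_score` are illustrative assumptions, not values from the slides.

```python
import numpy as np


def transe_score(h: np.ndarray, r: np.ndarray, t: np.ndarray) -> float:
    """TransE-style plausibility of (h, r, t): closer to 0 means more plausible."""
    return -float(np.linalg.norm(h + r - t, ord=2))


rng = np.random.default_rng(0)
dim = 8  # toy embedding size (assumption)

marie_curie = rng.normal(size=dim)
educated_at = rng.normal(size=dim)
# A tail built to satisfy h + r = t exactly, and an unrelated random tail.
university_of_paris = marie_curie + educated_at
unrelated = rng.normal(size=dim)

good = transe_score(marie_curie, educated_at, university_of_paris)
bad = transe_score(marie_curie, educated_at, unrelated)
```

In a trained model the vectors are learned so that observed triplets score higher than corrupted ones; here the "good" tail is constructed by hand purely to show the scoring function.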
  • 4. Hyper-relational facts
      • A triplet-based representation of a KG oversimplifies the complex nature of hyper-relational data
      • A hyper-relational fact contains multiple relations and entities
      • A hyper-relational fact is represented by a base triplet and a set of key-value pairs
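One way to hold such a fact in code is a base triplet plus a tuple of key-value pairs. The class name `HyperRelationalFact` and its field names are hypothetical; the example values are the Marie Curie fact used later in the deck.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HyperRelationalFact:
    head: str
    relation: str
    tail: str
    # Key-value pairs attached to the base triplet; empty for a plain triplet fact.
    qualifiers: Tuple[Tuple[str, str], ...] = ()

    @property
    def base_triplet(self) -> Tuple[str, str, str]:
        return (self.head, self.relation, self.tail)


fact = HyperRelationalFact(
    head="Marie Curie",
    relation="educated at",
    tail="University of Paris",
    qualifiers=(
        ("academic major", "Physics"),
        ("academic degree", "Master of Science"),
    ),
)
```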
  • 5. Hyper-relational facts representations
      1. Triplet only: keeping only the base triplet (irreversible information loss)
      2. Reification: adding an artificial entity to represent the base triplet and using it as the head entity for each k-v pair (creates additional triplets)
      3. Relation paths: creating a relation path from the relation and a key, and using it as a relation to connect the head and the value (creates additional triplets)
      … we need a method to better learn hyper-relational facts!
  • 6. Hyper-relational facts representations: Triplet only
      (Marie Curie, educated at, University of Paris)
  • 8. Hyper-relational facts representations: Reification
      q := (Marie Curie, educated at, University of Paris)
      (q, academic major, Physics)
      (q, academic degree, Master of Science)
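The reification step can be sketched as follows. The function name `reify` and the return shape are assumptions; the slide does not say how the artificial entity q is linked back to the head and tail, so the sketch only records which base triplet q stands for.

```python
from typing import Dict, List, Sequence, Tuple

Triplet = Tuple[str, str, str]


def reify(
    base: Triplet,
    qualifiers: Sequence[Tuple[str, str]],
    fact_id: str = "q",
) -> Tuple[Dict[str, Triplet], List[Triplet]]:
    """Replace a hyper-relational fact by triplets headed by an artificial entity."""
    # The artificial entity stands for the whole base triplet.
    artificial_entities = {fact_id: base}
    # One extra triplet per key-value pair, with the artificial entity as head.
    triplets = [(fact_id, key, value) for key, value in qualifiers]
    return artificial_entities, triplets


entities, reified = reify(
    ("Marie Curie", "educated at", "University of Paris"),
    [("academic major", "Physics"), ("academic degree", "Master of Science")],
)
```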
  • 10. Relation paths
      (Marie Curie, educated at, University of Paris)
      rk1 := educated at AND academic major
      rk2 := educated at AND academic degree
      (Marie Curie, rk1, Physics)
      (Marie Curie, rk2, Master of Science)
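A small sketch of the relation-path transformation shown on this slide: the base triplet is kept, and each key-value pair becomes a triplet from the head to the value under a composite relation. The function name `to_relation_paths` is hypothetical, and joining the names with " AND " simply mirrors the slide's notation.

```python
from typing import List, Sequence, Tuple

Triplet = Tuple[str, str, str]


def to_relation_paths(
    base: Triplet, qualifiers: Sequence[Tuple[str, str]]
) -> List[Triplet]:
    """Keep the base triplet and add one head-to-value triplet per key-value pair."""
    head, relation, _tail = base
    triplets = [base]
    for key, value in qualifiers:
        # Composite relation built from the base relation and the key.
        triplets.append((head, f"{relation} AND {key}", value))
    return triplets


paths = to_relation_paths(
    ("Marie Curie", "educated at", "University of Paris"),
    [("academic major", "Physics"), ("academic degree", "Master of Science")],
)
```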
  • 12. N-Ary Representation
      • A hyper-relational fact can be transformed into an n-ary representation
      • E.g., the relation education is extracted from the hyper-relational fact
      • N-ary representation: {education_head: Marie Curie, education_tail: University of Paris, education_major: Physics, education_degree: Master of Science}
      … but triplets serve as the fundamental data structure in modern KGs because they preserve the essential information for link prediction
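The n-ary form can be sketched as a flat mapping from roles to entities. The function name `to_nary` is hypothetical, and the short role names ("major", "degree") are passed in by the caller because the slide does not state how keys such as "academic major" are mapped to role suffixes.

```python
from typing import Dict, Sequence, Tuple


def to_nary(
    head: str,
    tail: str,
    roles: Sequence[Tuple[str, str]],
    relation_name: str,
) -> Dict[str, str]:
    """Flatten a hyper-relational fact into role -> entity pairs."""
    nary = {f"{relation_name}_head": head, f"{relation_name}_tail": tail}
    for role, value in roles:
        nary[f"{relation_name}_{role}"] = value
    return nary


nary_fact = to_nary(
    head="Marie Curie",
    tail="University of Paris",
    roles=[("major", "Physics"), ("degree", "Master of Science")],
    relation_name="education",
)
```

Note that in this flat form the head and tail are just two roles among several, so the base triplet is no longer distinguished from the key-value pairs.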
  • 14. Limitation of N-Ary Representation
      • Null model: one triplet is extracted from the n-ary representation of each hyper-relational fact via a randomly sampled relation path (h, r_h k_i, v_i)
      • E.g., (Marie Curie, education_head_education_major, Physics)
      • Null hypothesis: the information for link prediction encoded by the original base triplets is not greater than that encoded by the triplets created by the null model
      • We test the null hypothesis on link prediction tasks by running models on both the original base triplets and the null-model triplets
      • Result: the null hypothesis is rejected
          • Performance of the baselines on the original base triplets is consistently and significantly better than on the null-model triplets
          • The information encoded in the original base triplets is greater than the information encoded in the null model
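A rough sketch of how the null model might draw its triplet from an n-ary fact. The function name `null_model_triplet` is hypothetical, and the sampling pool (every role other than the head role, which may include the tail role) is an assumption, since the slide only gives one example.

```python
import random
from typing import Dict, Tuple


def null_model_triplet(
    nary: Dict[str, str], head_role: str, rng: random.Random
) -> Tuple[str, str, str]:
    """Sample one (head, head_role + '_' + role, value) triplet from an n-ary fact."""
    head = nary[head_role]
    # Assumption: any non-head role may be sampled, including the tail role.
    candidates = sorted(role for role in nary if role != head_role)
    role = rng.choice(candidates)
    return (head, f"{head_role}_{role}", nary[role])


nary_example = {
    "education_head": "Marie Curie",
    "education_tail": "University of Paris",
    "education_major": "Physics",
    "education_degree": "Master of Science",
}
sampled = null_model_triplet(nary_example, "education_head", random.Random(0))
```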
  • 15. HINGE: Hyper-relatIonal kNowledge Graph Embedding
      • KG embedding model to learn hyper-relational facts in a KG
      • Captures the primary structural information of the KG encoded in the triplets, in order to preserve the essential information for link prediction
      • Captures the correlation between each triplet and its associated key-value pairs (if any)
  • 17. HINGE architecture
      • The base triplet encodes the primary structural information and captures the essential information for link prediction
      • A CNN learns from the base triplet and captures the triple-wise relatedness between the h, r, t embeddings
      • The triple-wise relatedness vector is used to characterize the plausibility that the base triplet is true
  • 18. HINGE architecture
      • A CNN learns from each key-value pair together with its associated triplet and captures the quintuple-wise relatedness between the h, r, t, k, v embeddings
      • The quintuple-wise relatedness vector is used to characterize the plausibility that the base triplet associated with the k-v pair is true
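The convolution over stacked embeddings can be sketched in NumPy as below. This is not the authors' implementation: the filter width, the number of filters, the ReLU activation, and the function name `relatedness` are all assumptions made for illustration, and the filters are random rather than learned.

```python
import numpy as np


def relatedness(rows, filters):
    """Slide each filter along the embedding dimension of the stacked rows.

    rows:    list of n embedding vectors of length K
             (n = 3 for h, r, t; n = 5 for h, r, t, k, v)
    filters: array of shape (n_filters, n, width)
    Returns a flat vector of length n_filters * (K - width + 1).
    """
    image = np.stack(rows)                      # shape (n, K)
    n_filters, n_rows, width = filters.shape
    if n_rows != image.shape[0]:
        raise ValueError("filter height must match the number of stacked embeddings")
    positions = image.shape[1] - width + 1
    maps = np.empty((n_filters, positions))
    for p in range(positions):
        patch = image[:, p:p + width]           # shape (n, width)
        # Dot product of every filter with the current patch.
        maps[:, p] = np.tensordot(filters, patch, axes=([1, 2], [0, 1]))
    return np.maximum(maps, 0.0).ravel()        # ReLU (assumption), then flatten


rng = np.random.default_rng(0)
K = 6  # toy embedding size (assumption)
h, r, t, k, v = (rng.normal(size=K) for _ in range(5))

triple_vec = relatedness([h, r, t], rng.normal(size=(2, 3, 3)))
quintuple_vec = relatedness([h, r, t, k, v], rng.normal(size=(2, 5, 3)))
```

Using the same number of filters and the same width for both inputs gives two vectors of equal length, which is what an element-wise comparison between them would need.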
  • 19. HINGE architecture
      • Concatenate the triple-wise and quintuple-wise relatedness feature vectors
      • Take the minimum value along each dimension
      • For a valid hyper-relational fact, the values in all relatedness vectors should be high
      • The minimum along each dimension is therefore also expected to be high
      • A fully connected layer outputs the score for the input hyper-relational fact
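The merge step described above can be sketched as: stack the relatedness vectors, take the element-wise minimum, and apply a linear layer. The function name `merge_and_score`, the toy values, and the hand-picked weights are assumptions; in the real model the weights are learned.

```python
import numpy as np


def merge_and_score(triple_vec, quintuple_vecs, weights, bias):
    """Element-wise minimum over all relatedness vectors, then a linear layer."""
    # One row per relatedness vector: shape (1 + number of k-v pairs, D).
    stacked = np.vstack([triple_vec, *quintuple_vecs])
    merged = stacked.min(axis=0)                # weakest evidence per dimension
    return float(merged @ weights + bias)


triple_vec = np.array([0.9, 0.8])
quintuple_vecs = [np.array([0.7, 0.95]), np.array([0.85, 0.1])]
weights = np.array([1.0, 1.0])  # placeholder weights (assumption)

score = merge_and_score(triple_vec, quintuple_vecs, weights, bias=0.0)
triplet_only_score = merge_and_score(triple_vec, [], weights, bias=0.0)
```

Because of the minimum, a single implausible key-value pair (the 0.1 above) pulls the merged vector down even when the base triplet itself looks plausible. A fact with no key-value pairs falls back to the triple-wise vector alone.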
  • 21. Experiments
      • Baselines learning from triplet facts only:
          • Translational distance models: TransE, TransH, TransR and TransD
          • Semantic matching models: RESCAL, DistMult, ComplEx, ANALOGY and ConvE
      • Baselines learning from hyper-relational facts:
          • m-TransH, RAE, NaLP, NaLP-Fix
  • 22. Datasets
      • Hyper-relational datasets
          • JF17K: filtered from Freebase to have a significant presence of hyper-relational facts
          • WikiPeople: extracted from Wikidata, focusing on entities of type human
      • JF17K and WikiPeople contain both triplet facts and hyper-relational facts
      • Dataset configurations: Triplet Only, Relation Path, Reification, Original Hyper-Relational
  • 23. Evaluation Tasks and Metrics
      • Link prediction: given two elements of a triplet in a fact, predict the missing one
      • E.g., (?, r, t), (h, ?, t) or (h, r, ?)
      • Metrics: Mean Reciprocal Rank (MRR), Hits@10 and Hits@1
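For reference, the metrics named above can be computed from the 1-based rank of the correct answer in each test query. The helper names and the toy rank list are illustrative assumptions, not results from the talk.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank, where rank 1 means the correct answer was ranked first."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


def hits_at(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose correct answer is ranked within the top k."""
    return sum(rank <= k for rank in ranks) / len(ranks)


# Toy ranks of the correct entity for four test queries (made-up numbers).
ranks = [1, 3, 12, 2]

mrr = mean_reciprocal_rank(ranks)
hits10 = hits_at(ranks, 10)
hits1 = hits_at(ranks, 1)
```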
  • 24. Comparison with Baselines Learning from Hyper-Relational Facts
      • HINGE outperforms all baselines learning from hyper-relational facts
      • Among the baselines:
          • NaLP shows the best performance (it learns the relatedness between k-v pairs but is not aware of the triplet structure)
          • m-TransH and RAE learn only from entities
  • 28. Comparison with Baselines Learning from Triplets Only

      Dataset                 WikiPeople                                      JF17K
      Transformation Setting  Head/Tail Pred. (%)   Relation Pred. (%)        Head/Tail Pred. (%)   Relation Pred. (%)
      Basic                    0.81                 18.55                     41.45                  8.41
      Relation Path            2.85                 24.65                     22.96                 12.24
      Reification              3.28                 22.22                     29.71                  6.53

      • The Basic setting discards the key-value pairs (information loss)
      • Exception: head/tail prediction on WikiPeople with the Basic setting (dominance of triplets)
      • Relation Path and Reification create extra entities/relations (not capturing the essential information for link prediction)
  • 29. Conclusions
      • We investigate the problem of hyper-relational KG embedding, showing that triplets serve as the fundamental data structure in modern KGs and encode the essential information for link prediction
      • We introduce HINGE, a KG embedding model that captures the information encoded in the triplets and the correlation between each triplet and its associated key-value pairs
      • We show that the triplet structure is the fundamental structure for link prediction and show the limitations of a commonly used representation scheme
      • We observe an improvement on various link prediction tasks with different data transformation settings