How Much and When Do We Need Higher-order Information in Hypergraphs? A Case Study on Hyperedge Prediction

LANADA
How Much and When Do We Need
Higher-order Information in Hypergraphs?
A Case Study on Hyperedge Prediction
The Web Conference 2020
Se-eun Yoon, Hyungseok Song, Kijung Shin, and Yung Yi
Contact: seeuny@kaist.ac.kr
Table of contents
1. Introduction and related work
2. Problem formulation
3. Methods
4. Experiments
5. Conclusion
1. Introduction and related work
Hypergraphs
• What are hypergraphs?
• Graph: nodes joined by edges (links); each edge is an interaction of two entities.
• What about interactions of more than two?
• Hypergraph: nodes joined by hyperedges; each hyperedge is an interaction of an arbitrary number of entities.
• Examples: coauthorship, protein interactions, Web hashtags
[Figure: a graph and a hypergraph drawn on nodes 1 to 5]
Simplifying a hypergraph
• Hypergraphs are not straightforward to use.
• Common practice is to simplify them.
• E.g., projected graphs
[Figure: an original hypergraph with hyperedges {1, 2, 3, 4}, {1, 3, 5}, {1, 6}, {2, 6}, {1, 7, 8}, {3, 9}, {5, 8}, {1, 2, 6} and its projected graph. Source: [PNAS 2018] Simplicial closure and higher-order link prediction]
[Figure: the whole hypergraph as a node-by-hyperedge incidence matrix (nodes 1 to 4, hyperedges 𝑒1 to 𝑒6) next to the projected graph as a node-by-node adjacency matrix. Source: [AAAI 2018] Beyond link prediction: Predicting hyperlinks in adjacency space]
Using a hypergraph as it is
• However, simplification comes with information loss.
• Projected graphs express hypergraphs with only 2-way information.
• Many studies propose methods to use the whole hypergraph.
• [KDD 2018] Sequences of sets: timestamped hyperedges, e.g., 𝑡1: {1, 2, 3, 4}, 𝑡2: {1, 3, 5}, 𝑡3: {1, 6}, 𝑡4: {2, 6}, 𝑡5: {1, 7, 8}, 𝑡6: {3, 9}, 𝑡7: {5, 8}, 𝑡8: {1, 2, 6}
• [AAAI 2019] Hypergraph neural networks: hypergraph as neural net input
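The information loss caused by projection can be illustrated with a small sketch. The helper name `project_pairwise` and the two toy hypergraphs are assumptions made for illustration; they do not come from the slides.

```python
from collections import Counter
from itertools import combinations


def project_pairwise(hyperedges):
    """Return the weighted projected graph as {(u, v): co-occurrence count}."""
    weights = Counter()
    for edge in hyperedges:
        for u, v in combinations(sorted(edge), 2):
            weights[(u, v)] += 1
    return dict(weights)


# Two different hypergraphs on nodes 1, 2, 3 (toy data, assumed for illustration).
one_triple = [{1, 2, 3}]
three_pairs = [{1, 2}, {2, 3}, {1, 3}]

# Both collapse to the same projected graph, so the 3-way interaction is lost.
print(project_pairwise(one_triple))
print(project_pairwise(three_pairs))
print(project_pairwise(one_triple) == project_pairwise(three_pairs))
```

One 3-way interaction and three separate pairwise interactions become indistinguishable after projection, even when edge weights are kept.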
Our question
[Figure: accuracy vs. complexity, comparing the projected graph (2-way information) with the original hypergraph (all information)]
[Figure: the same axes for 2-way info; 2-way and 3-way info; 2-way, 3-way, and 4-way info; …; all info, annotated with "?"]
Proposed method to answer our question
• [Our question] How much higher-order information is sufficient for accurately
solving a hypergraph task?
• That is, how much n-way information do we need?
• [Proposed method to capture n-way information] n-projected graph
• 2-projected graph: captures 2-way information
• 3-projected graph: captures 3-way information
• ⋯
• 𝑛-projected graph: captures n-way information
• [Our task] Hyperedge prediction
• Measure prediction accuracy as n grows
Example
• In the figure below:
a) Suppose we want to predict whether {1, 2, 3, 4} would collaborate in the future.
b) Knowing about {1, 2}, {3, 4}, … could be useful
• Ex) How often pairs have collaborated
c) Knowing also about {1, 2, 3}, {2, 3, 4}, … could be even more useful.
• Ex) How often 3 people have collaborated
• How much n-way information do we need for accurate enough prediction?
[Figure, drawn on nodes 1 to 5: a) hyperedge prediction on the hypergraph, b) pairwise information (pairwise interactions), c) 3-way information (pairwise + 3-way interactions)]
Related work
[Table: prior work grouped by representation (pairwise vs. whole) and by setting (hypergraph representation vs. hyperedge prediction)]
• [NeurIPS 2007] Learning with hypergraphs: Clustering, classification, and embedding
• [CVPR 2005] Beyond pairwise clustering
• [VLSI design 2000] Multilevel k-way hypergraph partitioning
• [AAAI 2018] Beyond link prediction: Predicting hyperlinks in adjacency space
• [PNAS 2018] Simplicial closure and higher-order link prediction
• [WWW 2013] Link prediction in social networks based on hypergraph
• [DS 2013] Hyperlink prediction in hypernetworks using latent social features
• [AAAI 2019] Hypergraph neural networks
• [arXiv 2018] HyperGCN: Hypergraph convolutional networks for semi-supervised classification
• [ICML 2005] Higher order learning with graphs
• [KDD 2018] Sequences of sets
• [Multimedia 2018] Exploiting relational information in social networks using geometric deep learning on hypergraphs
• [arXiv 2014] Predicting multi-actor collaborations using hypergraphs
2. Problem formulation
Concept: Hypergraphs
• Hypergraph $G = (V, E, \omega)$
• $V$: set of nodes
• $E$: set of hyperedges
• $\omega(e)$: weight of hyperedge $e$ = number of times $e$ occurred
Interactions that took place: {1, 2, 3, 4}, {1, 2, 4}, {4, 5}, {1, 2, 3, 4}
Resulting weights: $\omega(\{1, 2, 3, 4\}) = 2$, $\omega(\{1, 2, 4\}) = 1$, $\omega(\{4, 5\}) = 1$
[Figure: hypergraph representation of these interactions (nodes and hyperedges): (a) Hypergraph, (b) 2-pg]
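As a minimal sketch of this definition, the weights in the example can be obtained by counting repeated interactions. The variable names are illustrative assumptions.

```python
from collections import Counter

# Interactions from the slide, in the order they took place.
interactions = [{1, 2, 3, 4}, {1, 2, 4}, {4, 5}, {1, 2, 3, 4}]

# omega(e): number of times hyperedge e occurred.
omega = Counter(frozenset(e) for e in interactions)
nodes = set().union(*interactions)   # V
hyperedges = set(omega)              # E (distinct hyperedges)

for e, w in omega.items():
    print(sorted(e), w)
```

Using `frozenset` keys makes a hyperedge independent of the order in which its nodes are listed.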
Problem: Hyperedge prediction
• Hyperedge prediction
• Binary classification problem: find $f \cong f^\star$, where $f^\star(c) = 1$ if $c \in C_p$ and $f^\star(c) = 0$ if $c \in C_n$
• $C$: candidate hyperedges
• $C_p$: positive hyperedges
• $C_n$: negative hyperedges
[Figure: a hypergraph on nodes 1 to 8 from which some hyperedges are removed, together with the candidate hyperedges]
Constructing hyperedge candidate set 𝐶
• We can create 𝐶 in many different ways.
• Depending on how we create it, task difficulty may change.
• Target hyperedge size: size 4, size 5, size 10
• Quality of negative hyperedges: stars, cliques (cliques are more difficult)
• Quantity of negative hyperedges: 1:1, 1:2, 1:5, 1:10 (larger ratios are more difficult)
3. Methods
The n-projected graph
• How to capture n-way information?
• Idea: extend the (pairwise) projected graph to more than just 2 nodes
• n-projected graph (n-pg)
• n-pg captures n-way information = number of times each group of n nodes has interacted
[Figure, built up over several animation steps: (a) Hypergraph, (b) 2-pg, (c) 3-pg, (d) 4-pg. The nodes of the 3-pg are the pairs 12, 13, 14, 23, 24, 34; the nodes of the 4-pg are the triples 123, 124, 134, 234.]
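The construction can be sketched as follows. This is not the authors' code: the convention that the nodes of the n-pg are (n-1)-node sets, joined when their union is a group of n nodes that interacted, is inferred from the figure labels, and the function names are assumptions.

```python
from collections import Counter
from itertools import combinations

# Running example from the slides: hyperedge -> weight (number of occurrences).
HYPERGRAPH = {frozenset({1, 2, 3, 4}): 2, frozenset({1, 2, 4}): 1, frozenset({4, 5}): 1}


def n_way_weights(hypergraph, n):
    """Map each group of n nodes to the number of times it interacted."""
    counts = Counter()
    for edge, weight in hypergraph.items():
        for group in combinations(sorted(edge), n):
            counts[frozenset(group)] += weight
    return counts


def n_projected_graph(hypergraph, n):
    """Return n-pg edges as {frozenset({u, v}): weight}; u, v are (n-1)-node sets."""
    edges = {}
    for group, weight in n_way_weights(hypergraph, n).items():
        parts = [frozenset(p) for p in combinations(sorted(group), n - 1)]
        for u, v in combinations(parts, 2):
            edges[frozenset({u, v})] = weight  # u | v == group
    return edges


pg3 = n_projected_graph(HYPERGRAPH, 3)
print(len(pg3))  # 12 edges: 4 interacting triples, 3 edges each
print(pg3[frozenset({frozenset({1, 2}), frozenset({1, 4})})])  # 3
```

With n = 2 this reduces to the usual weighted projected graph, where each node is a single-element set.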
The n-order expansion
• The n-projected graph captures only n-way information.
• However, we want to represent a hypergraph with up to n-way information.
• n-order expansion
[Figure: the 2-order, 3-order, and 4-order expansions marked over panels (b) 2-pg, (c) 3-pg, (d) 4-pg of the running example]
As n increases, the n-order expansion becomes a more accurate representation of the original hypergraph.
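A minimal sketch of the expansion, assuming it is the collection of the 2-pg up to the n-pg. Each k-pg is stood in for by its table of k-way weights, which is a simplification chosen here; the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Running example from the slides: hyperedge -> weight.
HYPERGRAPH = {frozenset({1, 2, 3, 4}): 2, frozenset({1, 2, 4}): 1, frozenset({4, 5}): 1}


def k_way_weights(hypergraph, k):
    """Map each group of k nodes to the number of times it interacted."""
    counts = Counter()
    for edge, weight in hypergraph.items():
        for group in combinations(sorted(edge), k):
            counts[frozenset(group)] += weight
    return counts


def n_order_expansion(hypergraph, n):
    """Collect the k-way weights for every order k from 2 up to n."""
    return {k: k_way_weights(hypergraph, k) for k in range(2, n + 1)}


expansion = n_order_expansion(HYPERGRAPH, 4)
for k, weights in expansion.items():
    print(k, len(weights))  # number of interacting groups of size k
```

The 2-order expansion is just the projected graph; each increment of n adds one more table on top of the previous ones.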
Prediction model: Features
• Given a candidate hyperedge, we can extract its features from the n-order expansion.
Example feature: Common neighbors (CN)
Candidate hyperedge: {1, 2, 3, 5}
• 2-pg: CN of nodes 1, 2, 3, 5 = node 4, so # CN = 1
• 3-pg: CN of nodes 12, 13, …, 35 = none, so # CN = 0
• 4-pg: CN of nodes 123, …, 235 = none, so # CN = 0
4-order expansion feature vector: (1, 0, 0)
[Figure: the running example with panels (a) Hypergraph, (b) 2-pg, (c) 3-pg, (d) 4-pg and the candidate hyperedge marked with "?"]
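The worked example can be reproduced with a short sketch. The adjacency rule for the n-pg (two (n-1)-node sets are neighbors when their union has n nodes and lies inside some hyperedge) is an assumption inferred from the figure, and the function names are hypothetical.

```python
from itertools import combinations

# Running example from the slides (weights are not needed for counting neighbors).
HYPEREDGES = [{1, 2, 3, 4}, {1, 2, 4}, {4, 5}]


def neighbors(node, hyperedges):
    """Neighbors of an (n-1)-node set in the n-pg: sets that swap one node for
    another such that the union with `node` lies inside some hyperedge."""
    result = set()
    for edge in hyperedges:
        if node <= edge:
            for extra in edge - node:
                for dropped in node:
                    result.add((node - {dropped}) | {extra})
    return result


def common_neighbor_count(candidate, n, hyperedges):
    """# CN in the n-pg over all (n-1)-node subsets of the candidate."""
    subsets = [frozenset(s) for s in combinations(sorted(candidate), n - 1)]
    common = set.intersection(*(neighbors(s, hyperedges) for s in subsets))
    return len(common)


candidate = {1, 2, 3, 5}
features = [common_neighbor_count(candidate, n, HYPEREDGES) for n in (2, 3, 4)]
print(features)  # [1, 0, 0], matching the slide
```

Node 5 only ever interacts with node 4, so every pair or triple containing 5 has no neighbors and the higher-order counts drop to zero.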
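Prediction model: Features
• This is the list of features we used.
Feature: Definition
Mean variations
• Geometric mean (GM): $x_n(c) = \left( \prod_{e_n \in E_n(c)} \omega_n(e_n) \right)^{1/|E_n(c)|}$
• Harmonic mean (HM): $x_n(c) = |E_n(c)| \Big/ \sum_{e_n \in E_n(c)} \omega_n(e_n)^{-1}$
• Arithmetic mean (AM): $x_n(c) = \frac{1}{|E_n(c)|} \sum_{e_n \in E_n(c)} \omega_n(e_n)$
Features widely used in link prediction
• Common neighbors (CN): $x_n(c) = \left| \bigcap_{v_n \subseteq c} N_n(v_n) \right|$
• Jaccard coefficient (JC): $x_n(c) = \left| \bigcap_{v_n \subseteq c} N_n(v_n) \right| \Big/ \left| \bigcup_{v_n \subseteq c} N_n(v_n) \right|$
• Adamic-Adar index (AA): $x_n(c) = \sum_{u_n \in \bigcap_{v_n \subseteq c} N_n(v_n)} \frac{1}{\log |N_n(u_n)|}$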
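The three mean features can be sketched as below. Two assumptions are made: $E_n(c)$ is taken to be the interacting n-node groups that lie inside the candidate $c$ (each group yields equally weighted n-pg edges, so the means are the same either way), and an empty $E_n(c)$ gives zeros. Function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Running example from the slides: hyperedge -> weight.
HYPERGRAPH = {frozenset({1, 2, 3, 4}): 2, frozenset({1, 2, 4}): 1, frozenset({4, 5}): 1}


def weights_inside(candidate, n, hypergraph):
    """Weights of the n-node groups inside the candidate that have interacted."""
    counts = Counter()
    for edge, weight in hypergraph.items():
        for group in combinations(sorted(edge & candidate), n):
            counts[group] += weight
    return list(counts.values())


def mean_features(weights):
    """Return (GM, HM, AM); all zeros when there are no weights (assumed convention)."""
    if not weights:
        return 0.0, 0.0, 0.0
    k = len(weights)
    gm = math.exp(sum(math.log(w) for w in weights) / k)
    hm = k / sum(1.0 / w for w in weights)
    am = sum(weights) / k
    return gm, hm, am


w2 = weights_inside({1, 2, 3, 5}, 2, HYPERGRAPH)
gm, hm, am = mean_features(w2)
print(sorted(w2), gm, hm, am)
```

For the candidate {1, 2, 3, 5}, only the pairs 12, 13, and 23 have interacted, so the 2-way means are taken over their three weights.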
Prediction model: Classifier
• Classifier: logistic regression classifier with L2 regularization
• Classifier input: feature vector from n-order expansion
• Classifier output: 1/0
[Figure: n-order expansion feature vector → classifier → 1/0]
How does prediction performance change as we increase n?
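A minimal sketch of such a classifier, written with plain gradient descent so that it needs no external library. This is not the authors' implementation: the optimizer, the hyperparameters (`l2`, `lr`, `epochs`), and the toy feature vectors are all assumptions.

```python
import math


def train_logistic_l2(X, y, l2=0.01, lr=0.1, epochs=5000):
    """Fit weights and bias by gradient descent on L2-regularized log loss."""
    dim, m = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            grad_b += err
            for j in range(dim):
                grad_w[j] += err * xi[j]
        # The L2 penalty shrinks the weights; the bias is left unregularized.
        w = [wj - lr * (gj / m + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / m
    return w, b


def predict(w, b, x):
    """Output 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if b + sum(wj * xj for wj, xj in zip(w, x)) >= 0 else 0


# Toy feature vectors (assumed): one feature per order n = 2, 3, 4.
X = [[3, 2, 1], [2, 2, 1], [3, 3, 1], [1, 0, 0], [0, 0, 0], [1, 1, 0]]
y = [1, 1, 1, 0, 0, 0]
w, b = train_logistic_l2(X, y)
print([predict(w, b, x) for x in X])
```

To study the effect of n, the same training routine would be run on feature vectors truncated to the first n - 1 entries.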
4. Experiments
Setup: Datasets
• 15 datasets from 8 domains
• Dataset sizes range from about 1,000 to 2,500,000 hyperedges
1) Email: recipient addresses of an email
2) Contact: persons that appeared in face-to-face proximity
3) Drug components: classes or substances within a single drug, listed in the National Drug Code Directory
4) Drug use: drugs used by a patient, reported to the Drug Abuse Warning Network, before an emergency visit
5) US Congress: congress members cosponsoring a bill
6) Online tags: tags in a question in Stack Exchange forums
7) Online threads: users answering a question in Stack Exchange forums
8) Coauthorship: coauthors of a publication
Setup: Training and evaluation
• Training and test sets
• First, generate the candidate set 𝐶
• Then split 𝐶 into training (50%) and test (50%) sets
• Performance metric: area under the precision-recall curve (AUC-PR)
• Recall: “How many true hyperedges can you find?”
  Recall = (# true hyperedges I found) / (# true hyperedges)
• Precision: “How precisely can you find true hyperedges?”
  Precision = (# true hyperedges I found) / (# hyperedges that I claim “true”)
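The two ratios and the area under the precision-recall curve can be sketched as follows. The slides do not say which AUC-PR estimator was used; the step-wise average precision below is one common choice and is an assumption, as are the function names.

```python
def precision_recall(labels, predictions):
    """labels and predictions are 0/1 lists over the candidate hyperedges."""
    found = sum(1 for t, p in zip(labels, predictions) if t == 1 and p == 1)
    claimed = sum(predictions)   # hyperedges that I claim "true"
    true_total = sum(labels)     # true hyperedges
    precision = found / claimed if claimed else 0.0
    recall = found / true_total if true_total else 0.0
    return precision, recall


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (ties in scores ignored)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_total = sum(labels)
    found, area = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            found += 1
            area += found / rank  # precision at this recall step
    return area / true_total if true_total else 0.0


print(precision_recall([1, 0, 1, 0], [1, 1, 0, 0]))             # (0.5, 0.5)
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

Unlike accuracy, this metric stays informative when negatives outnumber positives, as in the 1:5 and 1:10 candidate sets.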
Results and messages (1)
(M1) More higher-order information leads to better prediction quality, but with diminishing returns.
[Figure annotations: "Large gain", "Small gain"]
Results and messages (2)
(M2) The harder the task, the more valuable higher-order information becomes.
Hardness of the task:
• Stars < Cliques
• 1:1 < 1:2 < 1:5 < 1:10
Results and messages (3)
(M3) Why is higher-order information more important in some datasets than in others?
Such datasets have the following properties:
(i) Higher-order information is more abundant.
(ii) Higher-order information shares less information with pairwise information.
“How to measure abundance of 3-way information?”
Edge density = (# edges in 3-pg) / (# all possible 3-way combinations) × 100%
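A rough sketch of this measure on the running example. It counts interacting node triples rather than 3-pg edges. Under the assumed pairs-as-nodes convention each triple contributes exactly three edges, and each possible triple three possible edges, so the ratio is the same. The function name is hypothetical.

```python
from itertools import combinations
from math import comb

# Running example from the slides.
HYPEREDGES = [{1, 2, 3, 4}, {1, 2, 4}, {4, 5}]


def three_way_density(hyperedges):
    """Percentage of node triples that appear together in at least one hyperedge."""
    nodes = set().union(*hyperedges)
    triples = {t for e in hyperedges for t in combinations(sorted(e), 3)}
    return 100.0 * len(triples) / comb(len(nodes), 3)


print(three_way_density(HYPEREDGES))  # 4 of the 10 possible triples -> 40.0
```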
“How to measure shared information between 2-way and 3-way information?”
1. Sample three nodes $v_1, v_2, v_3$ from the hypergraph.
2. Obtain from the 2-pg: $W_2 := (\omega_2(v_1, v_2), \omega_2(v_2, v_3), \omega_2(v_1, v_3))$
3. Obtain from the 3-pg: $W_3 := \omega_3(v_1, v_2, v_3)$
• Mutual information $I(W_3; W_2)$: “shared information between 2-pg and 3-pg”
• Conditional entropy $H(W_3 \mid W_2)$: “information exclusive to 3-pg”
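The two quantities can be estimated empirically, as in the sketch below. It departs from the slide in ways that are assumptions: all node triples are enumerated instead of sampled, nodes within a triple are taken in sorted order, and entropies are measured in bits. Function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Running example from the slides: hyperedge -> weight.
HYPERGRAPH = {frozenset({1, 2, 3, 4}): 2, frozenset({1, 2, 4}): 1, frozenset({4, 5}): 1}


def group_weight(group, hypergraph):
    """Number of times all nodes in `group` interacted together."""
    return sum(w for e, w in hypergraph.items() if set(group) <= e)


def entropy(counter):
    """Shannon entropy (bits) of the empirical distribution in a Counter."""
    total = sum(counter.values())
    return -sum(c / total * math.log2(c / total) for c in counter.values())


def shared_information(hypergraph):
    """Return (I(W3; W2), H(W3 | W2)) over all node triples."""
    nodes = sorted(set().union(*hypergraph))
    joint, w2_only, w3_only = Counter(), Counter(), Counter()
    for v1, v2, v3 in combinations(nodes, 3):
        w2 = tuple(group_weight(p, hypergraph) for p in ((v1, v2), (v2, v3), (v1, v3)))
        w3 = group_weight((v1, v2, v3), hypergraph)
        joint[(w2, w3)] += 1
        w2_only[w2] += 1
        w3_only[w3] += 1
    cond = entropy(joint) - entropy(w2_only)  # H(W3 | W2) = H(W2, W3) - H(W2)
    return entropy(w3_only) - cond, cond      # I(W3; W2) = H(W3) - H(W3 | W2)


mi, cond = shared_information(HYPERGRAPH)
print(mi, cond)
```

In this tiny example the pairwise weights determine the 3-way weight for every triple, so the conditional entropy is zero and all 3-way information is shared with the 2-pg.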
5. Conclusion
Conclusion
• We ask and answer the following questions.
1) How much higher-order information is needed to accurately represent a hypergraph?
2) When is such higher-order information particularly useful?
3) Why is higher-order information more important in some datasets than in others?
• Our results could offer insights for future work on hypergraphs.
• E.g., higher performance on hypergraph tasks with less computational complexity
• Some examples of hypergraph tasks: node classification, node embedding
Links
• Preprint: https://arxiv.org/pdf/2001.11181.pdf
• Source code & Supplementary document: https://github.com/granelle/www20-higher-order
• Datasets: https://www.cs.cornell.edu/~arb/data/
Any questions?
Thank you!