Hypergraph Mining for Social Networks
Giacomo Bergami
giacomo90@libero.it
Università di Bologna
July 17, 2014
Giacomo Bergamigiacomo90@libero.it (Università di Bologna)Hypergraph Mining for Social Networks July 17, 2014 1 / 19
Contents
1 Goals & State of the Art
2 Why Hypergraphs?
3 Data Mining Algebra
4 gSpan
The Original Algorithm
Specialization Proposal
5 Conclusions & Future work
Goals & . . .
Define a hypergraph data structure for both data and relations.
Define some algebra operators.
Open question: how to automate the mining process.
Evaluations: online social network (OSN) data representation.
Experiments: Graph clustering of Iris data set.
. . . & State of the Art
Toon Calders’s Data Mining (Relational) Algebra.
Data structures that have already been studied in detail:
Data mining operations have been developed for both graphs and relational graphs.
Well-known algorithms (optimal computational cost) and operators.
While hypergraphs may be an intuitive representation of higher order
similarities, it seems (anecdotally at least) that graphs lie at the heart
of this problem.
Why Hypergraphs?
[Figure: the attribute values tommy32, Tom S. and (lat1,long1) of one user, shown separately and then grouped together.]
Data Structures: ED as data entities’ representation
[Figure: the users tommy32 (Tom S., (lat1,long1)) and sam (Adeola S., (lat2,long2)) and the posts 47561, 10231, 75234, 80235 and 11230, drawn as entities. Annotations: Database with Uncertain Data, Collection, Attributes, Primary key, Foreign key.]

Users <: Object, Entities (ED)
UserID  | UserLocation | UserName      | w | ϕ
tommy32 | (lat1,long1) | Tom Sawyer    | 1 | 1000
sam     | (lat2,long2) | Adeola Samuel | 1 | 10002

Posts <: Object, Entities (ED)
PostID | PostLocation | PostContent           | UserID  | w | ϕ
47561  | (lat3,long3) | Have a nice day!      | tommy32 | 1 | 1
80235  | (lat4,long4) | Some great news...    | sam     | 1 | 2
10231  | (NULL,NULL)  | J.S. Bach and ...     | tommy32 | 1 | 3
75234  | (lat5,long5) | Telemann & Xenakis... | tommy32 | 1 | 4
11230  | (NULL,NULL)  | Yet another plot!     | sam     | 1 | 5
Data Structures: EE as binary relationships’ representation
[Figure: the users tommy32 and sam and the posts 47561, 10231, 75234, 80235 and 11230, linked by binary edges; some edges are labelled "follows".]
Atomizing the information makes it possible to identify relationships between data automatically (for example, between users and their posts).
If we keep the EE relations between ED binary, we retain a time complexity linear in the size of a binary graph (O(|G|)), where the ED are mapped to G's vertices.
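The idea above can be sketched in Python. This is only an illustration under assumptions: the record layout and the names `users`, `posts`, `derive_edges` and `reachable` are hypothetical, while the values come from the entity tables of the previous slide. The user-to-post edges are derived from the foreign key `UserID`, and the resulting binary graph is visited in time linear in its size.

```python
from collections import defaultdict

# Entity tables (ED); each row carries a weight w and an index phi.
users = [
    {"UserID": "tommy32", "UserName": "Tom Sawyer", "w": 1, "phi": 1000},
    {"UserID": "sam", "UserName": "Adeola Samuel", "w": 1, "phi": 10002},
]
posts = [
    {"PostID": 47561, "UserID": "tommy32", "w": 1, "phi": 1},
    {"PostID": 80235, "UserID": "sam", "w": 1, "phi": 2},
    {"PostID": 10231, "UserID": "tommy32", "w": 1, "phi": 3},
    {"PostID": 75234, "UserID": "tommy32", "w": 1, "phi": 4},
    {"PostID": 11230, "UserID": "sam", "w": 1, "phi": 5},
]


def derive_edges(users, posts):
    """Derive binary user -> post edges (EE) from the foreign key UserID."""
    phi_of_user = {u["UserID"]: u["phi"] for u in users}
    return [(phi_of_user[p["UserID"]], p["phi"]) for p in posts]


def reachable(edges, start):
    """Visit the binary graph from `start`; each edge is looked at once."""
    adjacency = defaultdict(list)
    for src, dst in edges:
        adjacency[src].append(dst)
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


edges = derive_edges(users, posts)
```

Here vertices are identified by their index ϕ, so `reachable(edges, 1000)` collects tommy32 together with the indices of his posts.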
DHImp, the final step. (DHImp = (db, T))
The relational database expresses data relations (ED), while tensors express data correlations (EE).
This definition also allows tensors τi ∈ T for non-binary relations (OSN relations are mainly binary).
It separates the operations on data from the operations on relations.
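A minimal Python sketch of the pair DHImp = (db, T) may help fix ideas. It is not the author's implementation: the class layout, the method names `relate` and `T`, the direction of the "follows" edge and the ternary relation "mentions" are all assumptions made for illustration. Each relation gets one sparse tensor, stored as a dictionary from index tuples to values, so binary and non-binary relations are handled the same way.

```python
from dataclasses import dataclass, field


@dataclass
class DHImp:
    """Sketch of DHImp = (db, T): tables for data, sparse tensors for relations."""

    db: dict = field(default_factory=dict)       # table name -> list of rows (ED)
    tensors: dict = field(default_factory=dict)  # relation -> {index tuple: value}

    def relate(self, relation, *indices, value=1.0):
        """Set one tensor entry; the tuple length is the arity of the relation."""
        self.tensors.setdefault(relation, {})[indices] = value

    def T(self, relation, *indices):
        """Read one tensor entry; entries never set are zero."""
        return self.tensors.get(relation, {}).get(indices, 0.0)


h = DHImp(db={"Users": [("tommy32", 1000), ("sam", 10002)]})
h.relate("follows", 1000, 10002)      # hypothetical binary relation entry
h.relate("mentions", 1000, 1, 10002)  # hypothetical ternary relation entry
```

Operations on `db` never touch `tensors` and vice versa, which mirrors the separation between data operations and relation operations stated above.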
(Relational) Data Mining Algebra
[Figure: diagram with the nodes D, E and I and the operators Pop, πA, πRDA, κ and λ.]
Toon Calders's DMA identifies three categories of "data": the Data World, the Intensional World, and the Extensional World.
It defines both operations inside a world and operations between worlds.
Data mining algorithms can be described with this algebra.
Is it possible to map these worlds over our hypergraphs? Yes
Is it possible to define an algebra for (weighted and indexed)
hypergraphs? Yes
Is it possible to map these worlds over our hypergraphs?
[Figure: DHImp as a hypergraph, with the nodes e_user, e_pos, e_id and f_user, f_pos, f_id, next to the pure I-hypergraph with {e}, {f} and {e, f}.]
HDM = (h, H, EL)
Is it possible to define an algebra for (weighted and indexed) hypergraphs?
Definition (Database operations)
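For a unary table operation ⊙ and a binary table operation ⊗ (symbols chosen here), the dotted operators lift them to databases:
⊙̇(DB) = { ⊙(t) | t ∈ DB }
DB1 ⊗̇ DB2 = { t1 ⊗ t2 | t1 ∈ DB1 ∧ t2 ∈ DB2 }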
Definition (Index-consistency)
A unary database operation is said to be index-consistent iff, across all the tables of the current database, the indices are kept distinct (and similarly for binary operations).
Relational algebra operators over DWorld should be redefined for
weight update and indexing properties.
The reindexing over the tables obtained as a result of algebraic
operations is performed via dovetailing.
All relational algebra operations must be proved index-consistent.
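Reindexing by dovetailing can be illustrated with a pairing function. The sketch below assumes the Cantor pairing function as the dovetailing scheme, which the slides do not specify; the names `dovetail`, `undovetail`, `indexed_product` and `index_consistent`, and the representation of rows as (ϕ, data) pairs, are hypothetical.

```python
import math


def dovetail(i, j):
    """Cantor pairing: maps a pair of naturals to a single natural, injectively."""
    return (i + j) * (i + j + 1) // 2 + j


def undovetail(n):
    """Inverse of dovetail: recover the pair (i, j) from n."""
    s = (math.isqrt(8 * n + 1) - 1) // 2  # s = i + j
    j = n - s * (s + 1) // 2
    return s - j, j


def indexed_product(table1, table2):
    """Pair every row of table1 with every row of table2.

    Rows are (phi, data) tuples; the new index dovetails the two old ones.
    """
    return [(dovetail(p1, p2), d1 + d2) for p1, d1 in table1 for p2, d2 in table2]


def index_consistent(*tables):
    """True when no index is shared by two rows across the given tables."""
    indices = [phi for table in tables for phi, _ in table]
    return len(indices) == len(set(indices))


users = [(1000, ("tommy32",)), (10002, ("sam",))]
posts = [(1, (47561,)), (2, (80235,))]
product = indexed_product(users, posts)
```

Because the pairing is injective, distinct pairs of input rows always receive distinct indices inside the result table. Keeping the result distinct from the indices of the other tables of the database still has to be argued separately, which is what the index-consistency proofs are for.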
Hypergraph Data Mining Example: H. Clustering via Binary Graph Clustering
[Figure: two graph plots, each over nodes numbered 1 to 150.]
Real data (left) vs. Clustered (right) values.
Hypergraph Data Operators Example: 2nd Neighbours and Node Degree
<gyankos,Jack Bergus>
<jsbach,Johann Sebastian Bach>
<handel,G F Haendel>
<mozart,W A Mozart>
<faux,P.D.Q. Bach>
[Figure: the five tuples, each drawn with weight 1.]
G′ ← Calc_{f(x)=1 as Count}(ρ_{NScreenName←ScreenName, NUserId←UserId}(G))
<Jack Bergus,gyankos,1>
<Johann Sebastian Bach,jsbach,1>
<G F Haendel,handel,1>
<W A Mozart,mozart,1>
<P.D.Q. Bach,faux,1>
[Figure: the five resulting tuples, each drawn with weight 1.]

G″ ← σ^P_{e→w(e)≥0.5}(G ⋈_{x,y→ϕ(x)=ϕ(y) ∨ T[x,y,hasFriend]≠0} G′)
<jsbach,Johann Sebastian Bach,G F Haendel,handel,1>
<jsbach,Johann Sebastian Bach,P.D.Q. Bach,faux,1>
<gyankos,Jack Bergus,Jack Bergus,gyankos,1>
<handel,G F Haendel,G F Haendel,handel,1>
<mozart,W A Mozart,W A Mozart,mozart,1>
<gyankos,Jack Bergus,P.D.Q. Bach,faux,1>
<jsbach,Johann Sebastian Bach,Johann Sebastian Bach,jsbach,1>
<mozart,W A Mozart,G F Haendel,handel,1>
<faux,P.D.Q. Bach,Johann Sebastian Bach,jsbach,1>
<faux,P.D.Q. Bach,P.D.Q. Bach,faux,1>
[Figure: the joined tuples, drawn with weights 0.5 and 1.]

G‴ ← σ^P_{e→w(e)≥0.25}(Γ^{sum(Count)−1 as Count}_{ScreenName, UserId}(G″))
<W A Mozart,mozart,1>
<Jack Bergus,gyankos,1>
<P.D.Q. Bach,faux,1>
<Johann Sebastian Bach,jsbach,2>
<G F Haendel,handel,0>
[Figure: the resulting tuples, drawn with weights between 0.25 and 0.75.]
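The node-degree computation on these slides (add a Count column, join on identity or hasFriend, then group and take the sum minus one) can be replayed with plain Python lists. This is a sketch under assumptions: the hasFriend entries are read off the join result shown on the slide, ϕ(x) = ϕ(y) is approximated by equality of UserId, the weights and the selections on w(e) are left out, and the function names are hypothetical.

```python
from collections import defaultdict

# G: <UserId, ScreenName> tuples from the slide.
G = [
    ("gyankos", "Jack Bergus"),
    ("jsbach", "Johann Sebastian Bach"),
    ("handel", "G F Haendel"),
    ("mozart", "W A Mozart"),
    ("faux", "P.D.Q. Bach"),
]

# Assumed non-zero hasFriend entries of T, inferred from the join result.
has_friend = {
    ("jsbach", "handel"), ("jsbach", "faux"), ("gyankos", "faux"),
    ("mozart", "handel"), ("faux", "jsbach"),
}


def calc_count(rows):
    """Rename to (NScreenName, NUserId) and append the column Count = 1."""
    return [(name, uid, 1) for uid, name in rows]


def theta_join(left, right):
    """Keep the pairs that are the same user or are linked by hasFriend."""
    return [
        (uid, name, n_name, n_uid, count)
        for uid, name in left
        for n_name, n_uid, count in right
        if uid == n_uid or (uid, n_uid) in has_friend
    ]


def degree(joined):
    """Group by (ScreenName, UserId) and compute sum(Count) - 1."""
    totals = defaultdict(int)
    for uid, name, _n_name, _n_uid, count in joined:
        totals[(name, uid)] += count
    return {key: total - 1 for key, total in totals.items()}


joined = theta_join(G, calc_count(G))
result = degree(joined)
```

Subtracting one removes the pair that every user forms with itself, so `result` holds the number of outgoing hasFriend edges per user and agrees with the counts listed on the slide (2 for jsbach, 0 for handel, 1 for the others).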
gSpan
The Original Algorithm
gSpan is a frequent subgraph mining algorithm that works as follows:

gSpan(DB, minsupp, Solution) {
  // frequent edges of DB, in sorted order
  Ξ ← sort(FrequentEdges(DB, minsupp))
  Solution ← Ξ
  NStack ← Solution
  while (g ← NStack.pop()) {
    // skip g unless its code is the minimal DFS code
    if g ≠ minDfsCode(g) continue
    Solution ← Solution ∪ {g}
    ∀e ∈ Ξ. if (e re g) ⇒ {  // if e re g is a rightmost expansion of g by e
      // GS: support of the expanded graph in DB
      if GS(e re g, DB) ≥ minsupp
        NStack.push(e re g)
    }
  }
}
Listing 1: gSpan
gSpan
Specialization Proposal
https://github.com/jackbergus/gSpanExtended
Suppose we view DHImp = (db, T) as a directed (binary) graph:
({ϕ(e) | t ∈ db ∧ e ∈ t}, {(e, f) | ∃k. T[e, f, k] ≠ 0})
We suppose that our HDM contains only one vertex per entity instance. Hence each vertex has a unique label, namely its data representation (D(e)) or its index (ϕ(e)).
Suppose that an edge e = (v, w) carries the relation label λ(e) = λ(v, w) = (ϕ(v), T(e), ϕ(w)); then each edge has a unique label, because each vertex is unique.
gSpan guarantees that the minimal DFS code is unique for each graph. This specialization strengthens that property.
Subgraph isomorphism over distinct edges and vertices reduces to an approximated ordered-subset test that can be implemented in linear time.
The overall original algorithm has a polynomial time complexity.
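The effect of unique labels can be sketched in Python. Note the substitution: instead of the ordered-subset test mentioned above, this sketch uses hash-set inclusion, which takes expected time linear in the size of the pattern. Graphs are represented as sets of edge labels (ϕ(v), k, ϕ(w)); this representation and the names `is_subgraph`, `support` and `frequent_edges` are assumptions for illustration and are not taken from the gSpanExtended repository.

```python
def is_subgraph(pattern, graph):
    """With unique labels, subgraph containment is set inclusion on edge labels."""
    return pattern <= graph


def support(pattern, database):
    """Number of database graphs that contain the pattern."""
    return sum(1 for graph in database if is_subgraph(pattern, graph))


def frequent_edges(database, minsupp):
    """Edge labels occurring in at least minsupp graphs, in sorted order."""
    candidates = set().union(*database) if database else set()
    return sorted(
        edge for edge in candidates
        if support(frozenset([edge]), database) >= minsupp
    )


# Three toy graphs over the vertex indices 1, 2, 3 (hypothetical data).
database = [
    frozenset({(1, "hasFriend", 2), (2, "hasFriend", 3)}),
    frozenset({(1, "hasFriend", 2)}),
    frozenset({(2, "hasFriend", 3), (3, "hasFriend", 1)}),
]
```

`frequent_edges` plays the role of the sorted frequent-edge set Ξ in Listing 1, and `support` the role of GS; no vertex matching is needed, because a label can only ever map to itself.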
Conclusions & Future work
Conclusion
We used R to experiment with data mining operations. Some Java implementations of the algebra are provided (https://github.com/jackbergus/hypergraphalgebra/).
Hypergraphs express n-ary relations and cover both data and relations.
Hypergraph problems can be reduced to graph ones.
Future Work
We could study the time complexity of each data mining operator for hypergraphs and complete the definition of the algebra.
We could set up an environment in which such hypergraph algebraic operations are executed with distributed or parallel algorithms.

 

Recently uploaded (20)

2023 Ukraine Crisis Media Center Finance Balance
2023 Ukraine Crisis Media Center Finance Balance2023 Ukraine Crisis Media Center Finance Balance
2023 Ukraine Crisis Media Center Finance Balance
 
SASi-SPi Science Policy Lab Pre-engagement
SASi-SPi Science Policy Lab Pre-engagementSASi-SPi Science Policy Lab Pre-engagement
SASi-SPi Science Policy Lab Pre-engagement
 
Prsentation for VIVA Welike project 1semester.pptx
Prsentation for VIVA Welike project 1semester.pptxPrsentation for VIVA Welike project 1semester.pptx
Prsentation for VIVA Welike project 1semester.pptx
 
The Intersection between Competition and Data Privacy – KEMP – June 2024 OECD...
The Intersection between Competition and Data Privacy – KEMP – June 2024 OECD...The Intersection between Competition and Data Privacy – KEMP – June 2024 OECD...
The Intersection between Competition and Data Privacy – KEMP – June 2024 OECD...
 
Proposal: The Ark Project and The BEEP Inc
Proposal: The Ark Project and The BEEP IncProposal: The Ark Project and The BEEP Inc
Proposal: The Ark Project and The BEEP Inc
 
2 December UAE National Day - United Arab Emirates
2 December UAE National Day - United Arab Emirates2 December UAE National Day - United Arab Emirates
2 December UAE National Day - United Arab Emirates
 
AWS User Group Torino 2024 #3 - 18/06/2024
AWS User Group Torino 2024 #3 - 18/06/2024AWS User Group Torino 2024 #3 - 18/06/2024
AWS User Group Torino 2024 #3 - 18/06/2024
 
Genesis chapter 3 Isaiah Scudder.pptx
Genesis    chapter 3 Isaiah Scudder.pptxGenesis    chapter 3 Isaiah Scudder.pptx
Genesis chapter 3 Isaiah Scudder.pptx
 
怎么办理(lincoln学位证书)英国林肯大学毕业证文凭学位证书原版一模一样
怎么办理(lincoln学位证书)英国林肯大学毕业证文凭学位证书原版一模一样怎么办理(lincoln学位证书)英国林肯大学毕业证文凭学位证书原版一模一样
怎么办理(lincoln学位证书)英国林肯大学毕业证文凭学位证书原版一模一样
 
ACTIVE IMPLANTABLE MEDICAL DEVICE IN EUROPE
ACTIVE IMPLANTABLE MEDICAL DEVICE IN EUROPEACTIVE IMPLANTABLE MEDICAL DEVICE IN EUROPE
ACTIVE IMPLANTABLE MEDICAL DEVICE IN EUROPE
 
The Intersection between Competition and Data Privacy – OECD – June 2024 OECD...
The Intersection between Competition and Data Privacy – OECD – June 2024 OECD...The Intersection between Competition and Data Privacy – OECD – June 2024 OECD...
The Intersection between Competition and Data Privacy – OECD – June 2024 OECD...
 
The Intersection between Competition and Data Privacy – CAPEL – June 2024 OEC...
The Intersection between Competition and Data Privacy – CAPEL – June 2024 OEC...The Intersection between Competition and Data Privacy – CAPEL – June 2024 OEC...
The Intersection between Competition and Data Privacy – CAPEL – June 2024 OEC...
 
2023 Ukraine Crisis Media Center Annual Report
2023 Ukraine Crisis Media Center Annual Report2023 Ukraine Crisis Media Center Annual Report
2023 Ukraine Crisis Media Center Annual Report
 
Bridging the visual gap between cultural heritage and digital scholarship
Bridging the visual gap between cultural heritage and digital scholarshipBridging the visual gap between cultural heritage and digital scholarship
Bridging the visual gap between cultural heritage and digital scholarship
 
Data Processing in PHP - PHPers 2024 Poznań
Data Processing in PHP - PHPers 2024 PoznańData Processing in PHP - PHPers 2024 Poznań
Data Processing in PHP - PHPers 2024 Poznań
 
IEEE CIS Webinar Sustainable futures.pdf
IEEE CIS Webinar Sustainable futures.pdfIEEE CIS Webinar Sustainable futures.pdf
IEEE CIS Webinar Sustainable futures.pdf
 
ServiceNow CIS-ITSM Exam Dumps & Questions [2024]
ServiceNow CIS-ITSM Exam Dumps & Questions [2024]ServiceNow CIS-ITSM Exam Dumps & Questions [2024]
ServiceNow CIS-ITSM Exam Dumps & Questions [2024]
 
2023 Ukraine Crisis Media Center Financial Report
2023 Ukraine Crisis Media Center Financial Report2023 Ukraine Crisis Media Center Financial Report
2023 Ukraine Crisis Media Center Financial Report
 
Gamify it until you make it Improving Agile Development and Operations with ...
Gamify it until you make it  Improving Agile Development and Operations with ...Gamify it until you make it  Improving Agile Development and Operations with ...
Gamify it until you make it Improving Agile Development and Operations with ...
 
Using-Presentation-Software-to-the-Fullf.pptx
Using-Presentation-Software-to-the-Fullf.pptxUsing-Presentation-Software-to-the-Fullf.pptx
Using-Presentation-Software-to-the-Fullf.pptx
 

Hypergraph Mining For Social Networks

  • 1. Hypergraph Mining for Social Networks. Giacomo Bergami (giacomo90@libero.it), Università di Bologna, July 17, 2014.
  • 2. Contents
      1. Goals & State of the Art
      2. Why Hypergraphs?
      3. Data Mining Algebra
      4. gSpan: The Original Algorithm; Specialization Proposal
      5. Conclusions & Future Work
  • 3. Goals & ...
      - Define a hypergraph data structure for both data and relations.
      - Define some algebra operators.
      - Open question: how to automate the mining process.
      - Evaluation: representation of online social network (OSN) data.
      - Experiments: graph clustering of the Iris data set.
  • 4. ... & State of the Art
      - Toon Calders's Data Mining (Relational) Algebra.
      - Data structures that have already been studied in detail.
      - Data mining operations have been developed for both graphs and relational graphs.
      - Well-known algorithms (with optimal computational cost) and operators.
      - "While hypergraphs may be an intuitive representation of higher order similarities, it seems (anecdotally at least) that graphs lie at the heart of this problem."
  • 5. Why Hypergraphs?
      [Figure: the values tommy32, Tom S. and (lat1,long1) of one user, drawn in two alternative arrangements.]
  • 6. Data Structures: E_D as data entities' representation
      [Figure: the user entities (tommy32, Tom S., (lat1,long1)) and (sam, Adeola S., (lat2,long2)) and the post entities 47561, 10231, 75234, 80235, 11230.]

      Users <: Object, Entities (E_D):
        UserID  | UserLocation | UserName      | w | ϕ
        tommy32 | (lat1,long1) | Tom Sawyer    | 1 | 1000
        sam     | (lat2,long2) | Adeola Samuel | 1 | 10002

      Posts <: Object, Entities (E_D):
        PostID | PostLocation | PostContent           | UserID  | w | ϕ
        47561  | (lat3,long3) | Have a nice day!      | tommy32 | 1 | 1
        80235  | (lat4,long4) | Some great news...    | sam     | 1 | 2
        10231  | (NULL,NULL)  | J.S. Bach and ...     | tommy32 | 1 | 3
        75234  | (lat5,long5) | Telemann & Xenakis... | tommy32 | 1 | 4
        11230  | (NULL,NULL)  | Yet another plot!     | sam     | 1 | 5

      Database with Uncertain Data. Legend: collection, attributes, primary key, foreign key.
  • 7. Data Structures: E_E as binary relationships' representation
      [Figure: the entities of the previous slide redrawn as a binary graph over the users tommy32 and sam and their posts, with two edges labelled "follows".]
      - Atomizing the information makes it possible to identify relationships between data (such as users' posts) automatically.
      - If we keep the E_E relations between E_D binary, we retain a time complexity that is linear in the size of a binary graph, O(|G|), where the E_D are mapped to G's vertices.
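The linear-time remark on slide 7 can be illustrated with a minimal sketch, assuming entities become vertices and binary relations become labelled directed edges stored in an adjacency list. The function names `build_graph` and `reachable` and the edge label `posted` are illustrative assumptions, not taken from the slides; the user and post identifiers come from slide 6.

```python
from collections import defaultdict, deque

def build_graph(vertices, edges):
    """Adjacency list: vertices are entity ids, edges are (src, label, dst)."""
    adj = defaultdict(list)
    for v in vertices:
        adj[v]  # touch the key so that isolated vertices are kept
    for src, label, dst in edges:
        adj[src].append((label, dst))
    return adj

def reachable(adj, start):
    """Breadth-first visit: every vertex and edge is handled at most once."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _label, nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Users and posts from slide 6; "posted" is a hypothetical relation label.
vertices = ["tommy32", "sam", "47561", "80235", "10231", "75234", "11230"]
edges = [
    ("tommy32", "posted", "47561"),
    ("tommy32", "posted", "10231"),
    ("tommy32", "posted", "75234"),
    ("sam", "posted", "80235"),
    ("sam", "posted", "11230"),
]
graph = build_graph(vertices, edges)
tommy_component = reachable(graph, "tommy32")
```

Because the visit never revisits a vertex, its cost grows with the number of vertices plus the number of edges, which is the sense in which binary relations keep processing linear in |G|.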
  • 8. DHImp, the final step (DHImp = (db, T))
      - The relational database expresses data relations (E_D), while tensors express data correlations (E_E).
      - This definition also allows tensors τ_i ∈ T to be defined for non-binary relations (OSN relations are mainly binary).
      - It separates the operations on data from the operations on relations.
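As a rough sketch of the pair DHImp = (db, T), the class below keeps tables in a dictionary and stores the tensors as one sparse map keyed by (x, y, relation). This layout, the class name and its methods are assumptions made for illustration; the slides only state that data live in a relational database and correlations live in tensors. The ϕ indices 1000 and 10002 and the relation name hasFriend appear in the slides, but the friendship between these two users is a made-up example.

```python
from dataclasses import dataclass, field

@dataclass
class DHImp:
    """Hypothetical container for DHImp = (db, T)."""
    db: dict = field(default_factory=dict)  # table name -> list of row dicts
    T: dict = field(default_factory=dict)   # (x, y, relation) -> weight, sparse

    def relate(self, x, y, relation, weight=1.0):
        """Record that entity x is related to entity y (both given by index)."""
        self.T[(x, y, relation)] = weight

    def weight(self, x, y, relation):
        """Missing entries of the sparse tensor are read as 0."""
        return self.T.get((x, y, relation), 0.0)

    def neighbours(self, x, relation):
        """Entities y with a non-zero entry T[x, y, relation]."""
        return sorted(
            y for (a, y, r), w in self.T.items()
            if a == x and r == relation and w != 0
        )

h = DHImp()
h.db["Users"] = [
    {"UserID": "tommy32", "UserName": "Tom Sawyer", "w": 1, "phi": 1000},
    {"UserID": "sam", "UserName": "Adeola Samuel", "w": 1, "phi": 10002},
]
# Illustrative relation only: the slides do not say these users are friends.
h.relate(1000, 10002, "hasFriend")
```

Keeping rows in `db` and relations in `T` mirrors the separation the slide describes: table operations never need to touch `T`, and relation queries only need entity indices.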
  • 9. (Relational) Data Mining Algebra
      [Figure: Toon Calders's DMA, with the labels D, E, I, Pop, π_A, π_RDA, κ and λ.]
      - It identifies different "data" categories: the Data World, the Intensional World and the Extensional World.
      - Some operations internal to a world and some operations across worlds are defined.
      - Data mining algorithms can be described with this algebra.
      - Is it possible to map these worlds onto our hypergraphs? Yes.
      - Is it possible to define an algebra for (weighted and indexed) hypergraphs? Yes.
  • 10. Is it possible to map these worlds over our hypergraphs?
      [Figure: DHImp as a hypergraph, with nodes e_user, e_pos, e_id and f_user, f_pos, f_id, next to a pure I-hypergraph with hyperedges {e}, {f} and {e, f}.]
      HDM = (h, H, EL)
  • 11. Is it possible to define an algebra for (weighted and indexed) hypergraphs?
      Definition (Database operations). For a generic unary table operation $\odot$ and a generic binary table operation $\otimes$:
        $\dot{\odot}(DB) = \{\, \odot(t) \mid t \in DB \,\}$
        $DB_1 \mathbin{\dot{\otimes}} DB_2 = \{\, t_1 \otimes t_2 \mid t_1 \in DB_1 \wedge t_2 \in DB_2 \,\}$
      Definition (Index-consistency). A unary database operation $\dot{\odot}$ is said to be index-consistent iff, for all the tables of the current database, the indices among the tables are kept distinct (and similarly for binary operations).
      - Relational algebra operators over the Data World should be redefined to handle weight updates and indexing properties.
      - The tables obtained as the result of algebraic operations are reindexed via dovetailing.
      - All the relational algebra operations have to be proved index-consistent.
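The reindexing step can be sketched as follows, assuming the dovetailing function is the Cantor pairing function; the slides do not say which pairing is used, so this choice and the names `dovetail`, `undovetail` and `product_indices` are assumptions. The point of the sketch is that every pair of source indices receives its own new index, so the rows of a product table stay distinguishable.

```python
from math import isqrt

def dovetail(i, j):
    """Cantor pairing: a bijection from pairs of naturals to naturals."""
    return (i + j) * (i + j + 1) // 2 + j

def undovetail(z):
    """Inverse of dovetail: recover the pair (i, j) from its code z."""
    w = (isqrt(8 * z + 1) - 1) // 2  # diagonal on which z lies
    j = z - w * (w + 1) // 2
    return w - j, j

def product_indices(indices1, indices2):
    """New index for every row pair of a binary table operation."""
    return {(a, b): dovetail(a, b) for a in indices1 for b in indices2}

# Two small tables whose rows carry the indices 1..3 and 1..2.
new_index = product_indices([1, 2, 3], [1, 2])
```

Since the pairing is a bijection, the original row indices can be recovered from the new one with `undovetail`, which is convenient when a result row must be traced back to its operands.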
  • 12. Hypergraph Data Mining Example: H. Clustering via Binary Graph Clustering
      [Figure: two plots of the same 150 points, labelled 1 to 150.] Real data (left) vs. clustered (right) values.
  • 13. Hypergraph Data Operators Example: 2nd Neighbours and Node Degree
      [Figure: graph G with the nodes <gyankos, Jack Bergus>, <jsbach, Johann Sebastian Bach>, <handel, G F Haendel>, <mozart, W A Mozart> and <faux, P.D.Q. Bach>; the value 1 is shown five times.]
  • 14. $G \leftarrow \mathrm{Calc}_{f(x)=1\ \mathrm{as}\ Count}\bigl(\rho_{\{NScreenName \leftarrow ScreenName,\ NUserId \leftarrow UserId\}}(G)\bigr)$
      [Figure: the resulting nodes <Jack Bergus, gyankos, 1>, <Johann Sebastian Bach, jsbach, 1>, <G F Haendel, handel, 1>, <W A Mozart, mozart, 1> and <P.D.Q. Bach, faux, 1>; the value 1 is shown five times.]
  • 15. $G \leftarrow \sigma^{P}_{e \to w(e) \ge 0.5}\bigl(G \bowtie_{x,y \to \varphi(x)=\varphi(y)\ \lor\ T[x,y,\mathit{hasFriend}] \ne 0} G\bigr)$
      Resulting tuples:
        <jsbach, Johann Sebastian Bach, G F Haendel, handel, 1>
        <jsbach, Johann Sebastian Bach, P.D.Q. Bach, faux, 1>
        <jsbach, Johann Sebastian Bach, Johann Sebastian Bach, jsbach, 1>
        <gyankos, Jack Bergus, Jack Bergus, gyankos, 1>
        <gyankos, Jack Bergus, P.D.Q. Bach, faux, 1>
        <handel, G F Haendel, G F Haendel, handel, 1>
        <mozart, W A Mozart, W A Mozart, mozart, 1>
        <mozart, W A Mozart, G F Haendel, handel, 1>
        <faux, P.D.Q. Bach, Johann Sebastian Bach, jsbach, 1>
        <faux, P.D.Q. Bach, P.D.Q. Bach, faux, 1>
      [Figure: the tuples above drawn as nodes, with edge weights of 0.5 or 1.]
  • 16. $G \leftarrow \sigma^{P}_{e \to w(e) \ge 0.25}\bigl(\Gamma^{\mathrm{sum}(Count)-1\ \mathrm{as}\ Count}_{ScreenName,\ UserId}(G)\bigr)$
      Resulting tuples: <W A Mozart, mozart, 1>, <Jack Bergus, gyankos, 1>, <P.D.Q. Bach, faux, 1>, <Johann Sebastian Bach, jsbach, 2>, <G F Haendel, handel, 0>.
      [Figure: the tuples above drawn as nodes, with edge weights 0.75, 0.62, 0.25, 0.38, 0.58, 0.25, 0.5, 0.28 and 0.67.]
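The three steps of slides 14 to 16 can be replayed on plain Python data, as a sketch under a few assumptions: the hasFriend pairs below are read off the joined tuples listed on slide 15, the weight-based selections are left out because the individual edge weights are not recoverable from the slides, and the function name `node_degrees` is illustrative.

```python
def node_degrees(nodes, has_friend):
    """Calc Count = 1, self-join on (same node or hasFriend), then group."""
    # Slide 14: attach Count = 1 to every node.
    counted = [(user_id, name, 1) for user_id, name in nodes]
    # Slide 15: join G with itself; a pair is kept when both sides are the
    # same node or when the left node has the right node as a friend.
    joined = [
        (x, y)
        for x in counted
        for y in counted
        if x[0] == y[0] or (x[0], y[0]) in has_friend
    ]
    # Slide 16: group by the left node and compute sum(Count) - 1,
    # which removes the contribution of the self pair.
    totals = {}
    for x, y in joined:
        totals[x[0]] = totals.get(x[0], 0) + y[2]
    return joined, {user_id: total - 1 for user_id, total in totals.items()}

nodes = [
    ("gyankos", "Jack Bergus"),
    ("jsbach", "Johann Sebastian Bach"),
    ("handel", "G F Haendel"),
    ("mozart", "W A Mozart"),
    ("faux", "P.D.Q. Bach"),
]
# Directed hasFriend pairs, as they appear in the tuples of slide 15.
has_friend = {
    ("jsbach", "handel"),
    ("jsbach", "faux"),
    ("gyankos", "faux"),
    ("mozart", "handel"),
    ("faux", "jsbach"),
}
joined, degree = node_degrees(nodes, has_friend)
```

The join produces ten pairs, matching the ten tuples on slide 15, and the grouped counts agree with the values shown on slide 16 (2 for jsbach, 0 for handel, 1 for the others).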
  • 17. gSpan: The Original Algorithm
      gSpan is a frequent subgraph mining algorithm that works in the following way:

        gSpan(DB, minsupp, Solution) {
          Ξ ← sort(FrequentEdges(DB, minsupp))
          Solution ← Ξ; NStack ← Solution
          while (g ← NStack.pop()) {
            if g ≠ minDfsCode(g) continue      // skip codes that are not minimal
            Solution ← Solution ∪ {g}
            ∀e ∈ Ξ:
              if (e re g) exists {             // e re g is a rightmost expansion of g by e
                if GS(e re g, DB) ≥ minsupp    // GS is the support of the pattern in DB
                  NStack.push(e re g)
              }
          }
        }

      Listing 1: gSpan
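The first line of the listing, the computation of the sorted frequent edges, can be sketched as below. The representation of a graph as a list of labelled edges (source label, edge label, target label), the plain lexicographic sort and the name `frequent_edges` are assumptions; gSpan itself orders edges by its own DFS lexicographic order, which this sketch does not implement.

```python
from collections import Counter

def frequent_edges(db, minsupp):
    """Labelled edges that occur in at least minsupp graphs of db."""
    support = Counter()
    for graph in db:
        # set() makes an edge count once per graph, however often it repeats.
        support.update(set(graph))
    return sorted(edge for edge, count in support.items() if count >= minsupp)

# Toy graph database: each graph is a list of (source, label, target) edges.
db = [
    [("a", "x", "b"), ("b", "y", "c")],
    [("a", "x", "b"), ("a", "x", "b"), ("c", "z", "a")],
    [("b", "y", "c"), ("a", "x", "b")],
]
seeds = frequent_edges(db, minsupp=2)
```

Edges below the threshold can be discarded at this point because any pattern containing an infrequent edge is itself infrequent, so only the surviving edges are ever used to expand a pattern.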
  • 18. gSpan: Specialization Proposal
      https://github.com/jackbergus/gSpanExtended
      - Suppose we view DHImp = (db, T) as a directed (binary) graph: ({ϕ(e) | t ∈ db ∧ e ∈ t}, {(e, f) | ∃k. T[e, f, k] ≠ 0}).
      - We assume that our HDM contains only one vertex per entity instance. Hence each vertex has a unique label, which is either its data representation (D(e)) or its index (ϕ(e)).
      - If the relation label is defined as λ(e) = λ(v, w) = (ϕ(v), T(e), ϕ(w)), then each edge has a unique label, because each vertex is unique.
      - The gSpan algorithm guarantees that the minimal DFS code is unique for each graph. This specialization strengthens that property.
      - Subgraph isomorphism over distinct edges and vertices reduces to an approximated ordered-subset test that can be implemented in linear time.
      - The overall original algorithm has polynomial time complexity.
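A minimal sketch of the ordered-subset idea, assuming every edge carries a unique label of the form (ϕ(v), relation, ϕ(w)) and that each graph is stored as a sorted list of such labels. Under those assumptions a pattern is contained in a graph exactly when its sorted labels form a subsequence of the graph's sorted labels, which a single pass can check. The names `contains_sorted` and `support` are illustrative and are not taken from the gSpanExtended repository.

```python
def contains_sorted(pattern, graph):
    """True if every label of sorted `pattern` occurs in sorted `graph`."""
    matched = 0
    for edge in graph:
        if matched == len(pattern):
            break
        if edge == pattern[matched]:
            matched += 1
    return matched == len(pattern)

def support(pattern, db):
    """Number of graphs in db that contain the pattern (the role of GS)."""
    return sum(1 for graph in db if contains_sorted(pattern, graph))

# Edge labels (phi(v), relation, phi(w)); the indices are made up.
g1 = sorted([(1, "hasFriend", 2), (2, "hasFriend", 3), (3, "hasFriend", 1)])
g2 = sorted([(1, "hasFriend", 2), (3, "hasFriend", 1)])
pattern = sorted([(3, "hasFriend", 1), (1, "hasFriend", 2)])
pattern_support = support(pattern, [g1, g2])
```

Each call to `contains_sorted` scans the graph once, so its cost is linear in the number of edges, in contrast with general subgraph isomorphism where labels may repeat.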
  • 19. Conclusions & Future Work
      Conclusions
      - We used R to experiment with data mining operations.
      - Some Java implementations of the algebra were provided (https://github.com/jackbergus/hypergraphalgebra/).
      - Hypergraphs express n-ary relations, and change both data and relations.
      - Hypergraph problems can be reduced to graph problems.
      Future Work
      - We could study the time complexity of each data mining operator for hypergraphs and complete the definition of the algebra.
      - We could set up an environment in which to execute such hypergraph algebraic operations with distributed or parallel algorithms.