THoSP: an Algorithm for Nesting Property Graphs
Giacomo Bergami¹, André Petermann², Danilo Montesi¹
¹Università di Bologna, ²Universität Leipzig
1st Joint GRADES-NDA International Workshop, 2018
10th June 2018
Key Ideas
Key Ideas – Research Problem
1 An operator that generalizes the current “grouping” and
“nesting” operations is missing. Current (G)DBMSs can express
nesting operations, but the query plans of their languages
cannot optimize the whole process by combining the following
tasks:
• path joins, performed separately for both patterns.
• grouping, to create an id collection over the matched elements.
2 The general nesting algorithm could lead to an exponential
evaluation time.
Key Ideas – Use Case
Vertex Pattern: Author –authorOf→ Paper∗
Edge Pattern: Author_src –authorOf→ Paper∗ ←authorOf– Author_dst, with Author_src ≠ Author_dst
Input Bibliography Network
Vertices:
  id  label   properties
  0   Author  name: Abigail, surname: Conner
  1   Author  name: Baldwin, surname: Oliver
  2   Author  name: Cassie, surname: Norman
  3   Paper   title: On Joining Graphs
  4   Paper   title: Object Databases
  5   Paper   title: On Nesting Graphs
Edges:
  ids 6, 7, 8, 9, 10, all labelled AuthorOf
Key Ideas – Desired Result
Expected result (figure): a nested graph over the Paper vertices 3 (On Joining
Graphs), 4 (Object Databases) and 5 (On Nesting Graphs) and the Author vertices
0 (Abigail Conner), 1 (Baldwin Oliver) and 2 (Cassie Norman). The figure is
annotated with (0), (1) and (2), and shows two coauthorship edges annotated with
(0 → 1), (1 → 0) and (0 → 2), (2 → 0).
Key Ideas – Research Goals
1 As for graph joins, the data model must improve the
serialization of both the operands and the resulting graph.
2 The logical graph nesting operator must be general enough to
support both the THoSP algorithm and other graph
summarization tasks.
3 Grouping can be avoided by defining a nesting index, through
which each contained element is associated with its container.
This can be achieved by extending the graph join’s data
structures with such an index.
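The nesting index of goal 3 can be pictured as an external table from each container id to the ids it contains, filled while matching rather than derived by a later grouping step. The following is a minimal Python sketch under stated assumptions: the `NestingIndex` class, its method names and the ids are hypothetical and do not come from the slides.

```python
from collections import defaultdict


class NestingIndex:
    """Hypothetical external index: container id -> ids of contained elements."""

    def __init__(self):
        self._content = defaultdict(list)

    def write(self, container, member):
        # Record the association at match time; no separate grouping pass is needed.
        self._content[container].append(member)

    def members(self, container):
        # Return a copy so callers cannot alter the index by accident.
        return list(self._content.get(container, []))


index = NestingIndex()
# Illustrative ids only: container 100 nests elements 0 and 2, container 101 nests 0 and 1.
for container, member in [(100, 0), (100, 2), (101, 0), (101, 1)]:
    index.write(container, member)

print(index.members(100))  # [0, 2]
```

Keeping the index outside the graph means the graph itself can still be serialized by ids only.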
Logical Model
Logical Model – Design (1)
The nested (property) graph data model is an extension of the
logical model for graph joins. Therefore, we want to preserve the
same assumptions:
• The resulting nested graph is not a materialized view (as in
SQL’s SELECT).
• The nested graph is serialized by using only the ID information.
• Attributes, values and labels can be completely reconstructed
from this information and from the pattern rewriting information.
Logical Model – Design (2)
The following modelling choices allow the reconstruction of the
required pieces of information:
• Vertices and edges are distinctly identified by ids (ℕ²).
• A nested graph database is a property graph, where each vertex
and edge may contain (nest) another property graph (ν, ϵ).
• Each vertex or edge within the graph can be considered as a
possible graph operand.
Logical Model – Definition
Graph Nesting
A nested graph database is a nested graph, where each vertex and edge may
represent a graph. Given a nested graph G = (V, E), a vertex pattern gV and an
edge pattern gE containing grouping references, the graph nesting operator is
defined as:
\[ \eta^{\mathit{keep}}_{\iota}(G) = \Big( \{ v \in V \mid g_V(v) = \emptyset,\ \mathit{keep} \} \cup \iota(g_V(G)),\ \ \{ e \in E \mid g_E(e) = \emptyset,\ \mathit{keep} \} \cup \iota(g_E(G)) \Big) \]
where ι is an indexing function associating each matched graph with a single
new identifier not appearing in G, and keep is set to true if the non-traversed
vertices and edges must be preserved in the final graph.
The newly generated nested graph is inserted into the graph database which
also contains G. Values associated with both nested vertices and edges are
determined by user-defined functions.
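The definition can be read operationally: elements touched by no match survive only when keep is true, and every match contributes one fresh id through ι. The sketch below is a simplified Python reading of it, not the authors' implementation. It assumes each match is given as a frozenset of the ids it traverses, and `make_iota` and `graph_nesting` are hypothetical helper names.

```python
from itertools import count


def make_iota(first_free_id):
    """Build an indexing function giving each distinct match one fresh id."""
    ids, counter = {}, count(first_free_id)

    def iota(match):
        if match not in ids:
            ids[match] = next(counter)
        return ids[match]

    return iota


def graph_nesting(V, E, v_matches, e_matches, iota, keep):
    """v_matches / e_matches: lists of frozensets of ids traversed by each match."""
    traversed_v = set().union(*v_matches)
    traversed_e = set().union(*e_matches)
    # Non-traversed elements are preserved only if keep is true.
    new_V = {v for v in V if keep and v not in traversed_v}
    new_E = {e for e in E if keep and e not in traversed_e}
    # Each matched graph becomes one new element.
    new_V |= {iota(m) for m in v_matches}
    new_E |= {iota(m) for m in e_matches}
    return new_V, new_E


V, E = {0, 1, 2, 3}, {4, 5}
v_matches = [frozenset({0, 2}), frozenset({1, 2})]
e_matches = [frozenset({4, 5})]
iota = make_iota(first_free_id=6)  # 6 is the first id not appearing in the graph

print(graph_nesting(V, E, v_matches, e_matches, iota, keep=True))   # ({3, 6, 7}, {8})
print(graph_nesting(V, E, v_matches, e_matches, iota, keep=False))  # ({6, 7}, {8})
```

Vertex 3 is the only element traversed by no match, so it is the only original id that `keep=True` preserves.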
THoSP Algorithm
THoSP Algorithm – Physical Model
Motivations:
1 Reduce the number of graph visits by visiting the
subpattern first, and then extending the visit to the remaining
patterns.
2 Represent the nested graph as an adjacency list enriched with
an external nesting index.
The algorithm uses the same principles that were adopted for
implementing graph joins:
• Use memory mapping (OS buffering).
• Serialized graphs represent vertices associated with both
ingoing and outgoing edges.
• No additional indexing structures are exploited.
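One way to picture these principles is a flat file of vertex records, each carrying its outgoing and ingoing edges, scanned sequentially through a memory map. The Python sketch below assumes a record layout invented for illustration (it is not the authors' on-disk format), and the edge endpoints in the usage example are made up.

```python
import mmap
import os
import struct
import tempfile

# Assumed record layout, all little-endian uint32:
# vertex id, #outgoing, #ingoing, then one (edge id, neighbour id) pair per edge.


def serialize(path, adjacency):
    """adjacency: {vertex id: (outgoing pairs, ingoing pairs)}."""
    with open(path, "wb") as f:
        for vid, (out_edges, in_edges) in sorted(adjacency.items()):
            f.write(struct.pack("<III", vid, len(out_edges), len(in_edges)))
            for eid, other in out_edges + in_edges:
                f.write(struct.pack("<II", eid, other))


def scan(path):
    """Visit the memory-mapped file once, yielding one vertex record at a time."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        pos = 0
        while pos < len(mm):
            vid, n_out, n_in = struct.unpack_from("<III", mm, pos)
            pos += 12
            pairs = [struct.unpack_from("<II", mm, pos + 8 * i) for i in range(n_out + n_in)]
            pos += 8 * (n_out + n_in)
            yield vid, pairs[:n_out], pairs[n_out:]


# Illustrative graph: edge 6 goes from vertex 0 to vertex 3.
adjacency = {0: ([(6, 3)], []), 3: ([], [(6, 0)])}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "graph.bin")
    serialize(path, adjacency)
    records = list(scan(path))

print(records)  # [(0, [(6, 3)], []), (3, [], [(6, 0)])]
```

Storing each edge on both of its endpoints doubles the edge data, but lets a visit starting from any vertex follow edges in either direction without a separate index.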
THoSP Algorithm – Example
(Animated figure over the input bibliography network: Authors 0, 1, 2, Papers 3,
4, 5 and AuthorOf edges 6 to 10. The nested result is built incrementally, one
element per frame, in the following order.)
1 Paper 3 (On Joining Graphs) is added.
2 Author 0 (Abigail Conner) is added, annotated with (0).
3 Author 2 (Cassie Norman) is added.
4 A coauthorship edge is added, annotated with (0 → 2), (2 → 0).
5 Paper 4 (Object Databases) is added, and the annotation (2) appears.
6 Author 1 (Baldwin Oliver) is added, annotated with (1).
7 A second coauthorship edge is added, annotated with (0 → 1), (1 → 0).
8 Paper 5 (On Nesting Graphs) is added, completing the expected result.
Experimental Evaluation
Experimental Evaluation – Dataset
We want to show that the combination of THoSP with the proposed
physical data model outperforms the query plans of other query
languages (Cypher, SPARQL, SQL, AQL).
We performed our tests on both synthetic and real-world data, using
operands with 10^n vertices, for n = 1, …, 8:
• GMark graph generator.
• Random samples of the Microsoft Academic Graph.
Our tests’ source code is available at:
https://bitbucket.org/unibogb/graphnestingc/src
Experimental Evaluation – Competing DataBases
Given that the only graph database using Java was the worst
performing one, we implemented our solution only in C++. The
graph nesting operator was implemented in each DB language by
returning ID collections.
• PostgreSQL was used to evaluate SQL queries. We ran the
queries directly in psql.
• SPARQL queries were evaluated over Virtuoso. SPARQL
queries were sent via ODBC (C++).
• Cypher queries were evaluated over Neo4J. Cypher queries
were sent via the execute method.
• AQL queries were evaluated over ArangoDB. We ran the
queries directly in arangosh.
Experimental Evaluation – GMark Benchmark
Operands Size          Two-Hop Separated Pattern Time (C/C++) (ms)
|V|    #Subgraph       SQL+JSON      SPARQL       AQL           Cypher       THoSP
10     3               2.10          11           15.00         681.40       0.11
10^2   58              9.68          63           3.89          1,943.98     0.14
10^3   968             17.96         63           12.34         >3.60×10^6   0.46
10^4   8,683           69.27         364          46.74         >3.60×10^6   4.07
10^5   88,885          294.23        4,153        508.87        >3.60×10^6   43.81
10^6   902,020         2,611.48      50,341       7,212.19      >3.60×10^6   563.02
10^7   8,991,417       25,666.14     672,273      922,590.00    >3.60×10^6   8,202.93
10^8   89,146,891      396,523.88    >3.60×10^6   >3.60×10^6    >3.60×10^6   91,834.20
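The GMark figures are easier to compare as ratios. The short Python sketch below derives, for each operand size, how many times slower SQL+JSON is than THoSP. SQL+JSON is chosen because it is the only competitor in the table that finishes every operand size below the 3.60×10^6 ms limit. The times are copied from the table and nothing else is assumed.

```python
# Times in ms copied from the GMark table, for |V| = 10^1 ... 10^8.
sql_json = [2.10, 9.68, 17.96, 69.27, 294.23, 2611.48, 25666.14, 396523.88]
thosp = [0.11, 0.14, 0.46, 4.07, 43.81, 563.02, 8202.93, 91834.20]

# Ratio of the two running times for each operand size.
speedups = [s / t for s, t in zip(sql_json, thosp)]
for n, ratio in enumerate(speedups, start=1):
    print(f"|V| = 10^{n}: SQL+JSON / THoSP = {ratio:.1f}x")
```

THoSP is faster at every size. The ratio is largest on the small operands and narrows to a single-digit factor from 10^5 vertices onwards.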
Experimental Evaluation – Microsoft Academic Graph Benchmark
Operands Size          Two-Hop Separated Pattern Time (C/C++) (ms)
|V|    #Subgraph       SQL+JSON     SPARQL       AQL           Cypher       THoSP
10     19              1.69·10^0    3.4·10^1     6.57·10^-1    2.38·10^3    2.82·10^-1
10^2   255             1.75·10^0    3.22·10^2    2.51·10^0     1.01·10^4    3.46·10^-1
10^3   23,119          4.71·10^1    1.22·10^3    8.18·10^1     >1H          1.39·10^1
10^4   5,411,205       1.53·10^4    2.77·10^5    2.08·10^4     >1H          2.58·10^3
10^5   97,079,329      1.20·10^6    >1H          OOM¹          >1H          1.97·10^5
10^6   241,448,529     >1H          >1H          OOM¹          >1H          6.22·10^5
10^7   361,759,509     OOM²         >1H          OOM¹          >1H          7.74·10^5
Experimental Evaluation – Results
• These benchmarks show that none of the current data models
supporting nested representations offers query plans that
optimize this specific case of (graph) nesting.
• The proposed approach extends the secondary memory
representation of property graphs by adding associations to
nested vertices and edges.
• The serialized data structure provides a graph with an
external containment data structure.
• This data model achieves structural aggregation for graph data,
where aggregated data may preserve the original vertices and
edges.
Experimental Evaluation – Further Results
• GROQ: THoSP can be generalized into a broader algorithm.
• Generalized Semistructured Model: this data structure can be
generalized into a broader data representation.
Experimental Evaluation – Future Work
• GROQ: further benchmarks have to be carried out over this
more general nesting algorithm.
• General Nesting: provide a query plan where either grouping or
GROQ is used.
Backup Slides
Backup Slides – Nested Graph Database
Nested Graph DataBase
Given a set Σ∗ of strings, a nested (property) graph database G is a tuple
G = ⟨V, E, λ, ℓ, ω, ν, ϵ⟩, where:
• V, E ⊆ ℕ² s.t. V ∩ E = ∅
• source and target λ: E → V²
• labelling ℓ: V ∪ E → ℘(Σ∗)
• object mapping ω: V ∪ E → Ω
• vertices’ containment ν: (V ∪ E) → ℘(V)
• edges’ containment ϵ: (V ∪ E) → ℘(E)
Each vertex or edge o ∈ V ∪ E induces a nested (property) graph as the
following pair:
\[ G_o = \Big( \nu(o),\ \big\{ e \in \epsilon(o) \mid \lambda(e) \in \big( \textstyle\bigcup_{n \ge 0} \nu^{(n)}(\{o\}) \big)^2 \big\} \Big) \]
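The induced graph keeps the vertices directly contained in o, plus those edges contained in o whose source and target are both reachable from o through repeated vertex containment. The Python sketch below makes that closure explicit. The dictionaries `nu`, `eps` and `lam` stand for the vertex containment, edge containment and source/target functions, and the ids are illustrative assumptions.

```python
def induced_graph(o, nu, eps, lam):
    """nu / eps: {id: set of contained vertex / edge ids}; lam: {edge id: (src, dst)}."""
    # Union over n >= 0 of the n-fold vertex containment applied to {o}; n = 0 gives {o}.
    reachable, frontier = {o}, {o}
    while frontier:
        frontier = {c for x in frontier for c in nu.get(x, set())} - reachable
        reachable |= frontier
    # Keep only the contained edges whose endpoints both fall inside that closure.
    edges = {e for e in eps.get(o, set())
             if lam[e][0] in reachable and lam[e][1] in reachable}
    return set(nu.get(o, set())), edges


# Illustrative ids: element 10 nests vertices 0 and 1, and vertex 0 nests vertex 2.
nu = {10: {0, 1}, 0: {2}}
eps = {10: {20, 21}}
lam = {20: (0, 1), 21: (0, 3)}  # edge 21 points to vertex 3, which 10 does not contain

print(induced_graph(10, nu, eps, lam))  # ({0, 1}, {20})
```

Edge 21 is dropped because its target lies outside everything reachable from 10, which is the role of the squared closure in the definition.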
THoSP Pseudocode
nest(Cont, patt, u, S):
  for each s in S s.t. patt.doSerialize(s):
    Cont.write(<u, s>)

Input: G, gV, gE
Cont ← ∅
NestedGraph ← ∅
a ← V ∩ E \ (γV ∪ γE^src ∪ γE^dst);
for each v: vertex in G s.t. a(v):
  for each V(u →e v):
    u := dtl(u)_c; nest(Cont, V, u, {u, e, v})
    NGraph(V) ← NGraph(V) ∪ {u}
    for each V(w →e v) s.t. E(u →e v e← w):
      w := dtl(w)_c;
      e' := dtl(u, w)_c;
      nest(Cont, E, e', {u, e, v, e', w})
      NGraph(E) ← NGraph(E) ∪ {u →e' w}
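The pseudocode can be tried on the bibliography example with a small Python sketch. This is a simplified reading of it rather than the authors' C++ implementation. It assumes that only the shared Paper is written to the nesting index, that `dtl` can be any injective map to fresh ids, and that the two authors of a coauthorship edge must differ. The AuthorOf endpoints are also assumed, since the slides only list the edge ids 6 to 10.

```python
from collections import defaultdict


def dtl(*ids):
    # Stand-in for the slides' dtl(): any injective map to fresh ids would do (assumption).
    return ("nested",) + ids


def thosp(vertices, edges):
    """vertices: {id: label}; edges: {id: (src, dst)}, all edges being Author -> Paper."""
    incoming = defaultdict(list)
    for eid, (src, dst) in sorted(edges.items()):
        incoming[dst].append((eid, src))

    cont = defaultdict(set)  # nesting index: container id -> nested paper ids
    nested_vertices, nested_edges = set(), set()
    for v in sorted(vertices):
        if vertices[v] != "Paper":  # start from the vertex shared by both patterns
            continue
        for _e, u in incoming[v]:
            cont[dtl(u)].add(v)  # vertex pattern: author u nests paper v
            nested_vertices.add(dtl(u))
            for _e2, w in incoming[v]:
                if w != u:  # assumed constraint: the two authors are distinct
                    cont[dtl(u, w)].add(v)  # edge pattern: edge u -> w nests paper v
                    nested_edges.add((u, w))
    return nested_vertices, nested_edges, dict(cont)


labels = {0: "Author", 1: "Author", 2: "Author", 3: "Paper", 4: "Paper", 5: "Paper"}
# Edge endpoints are assumed for illustration; the slides only give the ids 6 to 10.
author_of = {6: (0, 3), 7: (2, 3), 8: (0, 4), 9: (1, 4), 10: (1, 5)}

nv, ne, index = thosp(labels, author_of)
print(sorted(ne))             # [(0, 1), (0, 2), (1, 0), (2, 0)]
print(sorted(index[dtl(0)]))  # [3, 4]
```

Each Paper is visited once and both patterns are extended from it, so the vertex and edge nestings are filled in the same pass and no separate grouping step is run.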

CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
 

THoSP: an Algorithm for Nesting Property Graphs

  • 1. THoSP: an Algorithm for Nesting Property Graphs. Giacomo Bergami¹, André Petermann², Danilo Montesi¹. 1st Joint GRADES-NDA International Workshop, 10th June 2018. ¹Università di Bologna, ²Universität Leipzig
  • 3. Key Ideas – Research Problem
      1. An operator that generalizes the current “grouping” and “nesting” operations is missing. Current (G)DBMSs can express nesting operations, but the plans produced for their query languages cannot optimize the whole process by combining the following tasks:
          • path joins, executed separately for both patterns;
          • grouping, to create an id collection over the matched elements.
      2. The general nesting algorithm could lead to an exponential evaluation time.
  • 4. Key Ideas – Use Case
      Vertex pattern: Author –authorOf→ Paper∗
      Edge pattern: Author_src –authorOf→ Paper∗ ←authorOf– Author_dst, with Author_src ≠ Author_dst
      Input bibliography network:
          • Author vertices: 0 (name: Abigail, surname: Conner), 1 (name: Baldwin, surname: Oliver), 2 (name: Cassie, surname: Norman)
          • Paper vertices: 3 (title: On Joining Graphs), 4 (title: Object Databases), 5 (title: On Nesting Graphs)
          • AuthorOf edges: 6, 7, 8, 9, 10
  • 5. Key Ideas – Desired Result
      [Figure: expected result. The Author vertices 0 (Abigail Conner), 1 (Baldwin Oliver) and 2 (Cassie Norman) carry the labels (0), (1) and (2); the Paper vertices 3 (On Joining Graphs), 4 (Object Databases) and 5 (On Nesting Graphs) appear as nested content; two coauthorship edges are labelled (0 → 1), (1 → 0) and (0 → 2), (2 → 0).]
  • 8. Key Ideas – Research Goals
      1. As for graph joins, the data model must enhance the serialization of both the operands and the resulting graph.
      2. The logical graph nesting operator must be general enough to support both the THoSP algorithm and other graph summarization tasks.
      3. Grouping can be avoided by defining a nesting index, which associates each contained element with its container. This can be achieved by extending the graph join’s data structures with this index.
  • 10. Logical Model – Design (1)
      The nested (property) graph data model extends the logical model for graph joins. We therefore preserve the same assumptions:
          • The resulting nested graph is not a materialized view (as in SQL’s SELECT).
          • The nested graph is serialized using only the ID information.
          • Attributes, values and labels can be completely reconstructed from this information and the pattern rewriting information.
  • 11. Logical Model – Design (2)
      The following modelling choices allow the required pieces of information to be reconstructed:
          • Vertices and edges are distinctly identified by ids (ℕ²).
          • A nested graph database is a property graph, where each vertex and edge may contain (nest) another property graph (ν, ϵ).
          • Each vertex or edge within the graph can be considered as a possible graph operand.
  • 12. Logical Model – Definition: Graph Nesting
      A nested graph database is a nested graph, where each vertex and edge may represent a graph. Given a nested graph $G = (V, E)$, a vertex pattern $g_V$ and an edge pattern $g_E$, both containing grouping references:
      $$\eta^{keep}_{\iota}(G) = \Big\langle \{\, v \in V \mid g_V(v) = \emptyset,\ keep \,\} \cup \iota(g_V(G)),\ \ \{\, e \in E \mid g_E(e) = \emptyset,\ keep \,\} \cup \iota(g_E(G)) \Big\rangle$$
      where $\iota$ is an indexing function that associates each matched graph with a single new identifier not appearing in $G$, and $keep$ is set to true when the non-traversed vertices and edges must be preserved in the final graph. The newly generated nested graph is inserted into the graph database that also contains $G$. The values associated with both nested vertices and nested edges are determined by user-defined functions.
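The definition above can be read operationally. The following is a minimal sketch, not the authors' implementation: it assumes that patterns are plain Python functions returning the list of matched subgraphs as sets of ids, and that fresh identifiers are simply the integers above the largest id in the input. The names `nest_graph`, `g_v`, `g_e` and `nesting` are illustrative.

```python
from itertools import count

def nest_graph(vertices, edges, g_v, g_e, keep=True):
    """Sketch of the nesting operator over id sets.

    g_v / g_e are hypothetical pattern functions: each returns a list of
    matches, every match being a set of ids taken from the input graph.
    """
    # Stand-in for the indexing function: new ids that do not appear in G.
    new_id = count(max(vertices | edges, default=-1) + 1)
    v_matches = g_v(vertices, edges)
    e_matches = g_e(vertices, edges)

    nesting = {}                      # new id -> ids of the contained elements
    nested_v, nested_e = set(), set()
    for match in v_matches:           # one nested vertex per vertex-pattern match
        i = next(new_id)
        nesting[i] = set(match)
        nested_v.add(i)
    for match in e_matches:           # one nested edge per edge-pattern match
        i = next(new_id)
        nesting[i] = set(match)
        nested_e.add(i)

    if keep:                          # preserve the elements that no match traversed
        nested_v |= vertices - set().union(*v_matches)
        nested_e |= edges - set().union(*e_matches)
    return nested_v, nested_e, nesting

# Toy input: four vertices, two edges, one match per pattern.
V, E = {0, 1, 2, 3}, {4, 5}
result = nest_graph(V, E,
                    lambda vs, es: [frozenset({0, 3, 4})],
                    lambda vs, es: [frozenset({5})])
```

With `keep=False` only the freshly created nested vertices and edges are returned.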
  • 14. THoSP Algorithm – Physical Model
      Motivations:
          1. Reduce the number of graph visits by visiting the sub-pattern first, and then extending the visit to the remaining patterns.
          2. Represent the nested graph as an adjacency list enriched with an external nesting index.
      The algorithm uses the same principles that were adopted for implementing graph joins:
          • Use memory mapping (OS buffering).
          • Serialized graphs represent each vertex together with both its ingoing and outgoing edges.
          • No additional indexing structures are exploited.
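As a rough illustration of an external nesting index read through memory mapping, the sketch below stores (container id, contained id) pairs in a flat binary file and scans them via `mmap`. The record layout, the file name and the function names are assumptions made for this example; the slides do not describe the actual on-disk format, and the authors' implementation is in C++.

```python
import mmap
import os
import struct
import tempfile

# Assumed record layout: two little-endian unsigned 64-bit integers,
# (container id, contained id). This is illustrative only.
PAIR = struct.Struct("<QQ")

def write_nesting_index(path, pairs):
    """Serialize the containment pairs, sorted by container id."""
    with open(path, "wb") as f:
        for container, contained in sorted(pairs):
            f.write(PAIR.pack(container, contained))

def contained_in(path, container):
    """Return the ids nested inside `container`, reading through a memory map."""
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        found = []
        # Linear scan for clarity; a sorted file would also allow binary search.
        for offset in range(0, len(mm), PAIR.size):
            a, b = PAIR.unpack_from(mm, offset)
            if a == container:
                found.append(b)
        return found

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "nesting.idx")
    write_nesting_index(path, [(100, 3), (101, 4), (100, 5)])
    members = contained_in(path, 100)   # ids nested inside container 100
    missing = contained_in(path, 7)     # unknown container: nothing nested
```

Keeping the index outside the adjacency list means the serialized graph itself does not have to be rewritten when nested elements are added.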
  • 15–23. THoSP Algorithm – Example
      [Figure sequence: the algorithm runs over the input bibliography network (Author vertices 0–2, Paper vertices 3–5, AuthorOf edges 6–10), and the nested result is built up one element per slide:]
          1. Paper 3 (On Joining Graphs) appears in the result.
          2. Author 0 (Abigail Conner) is added, labelled (0).
          3. Author 2 (Cassie Norman) is added.
          4. A coauthorship edge labelled (0 → 2), (2 → 0) is added.
          5. Paper 4 (Object Databases) is added, and the label (2) appears.
          6. Author 1 (Baldwin Oliver) is added, labelled (1).
          7. A coauthorship edge labelled (0 → 1), (1 → 0) is added.
          8. Paper 5 (On Nesting Graphs) is added, completing the expected result.
  • 25. Experimental Evaluation – Dataset
      We want to show that the combination of THoSP with the proposed physical data model outperforms the query plans of other query languages (Cypher, SPARQL, SQL, AQL). We ran our tests on both synthetic and real-world data, using operands with 10^n vertices for n = 1, …, 8:
          • GMark graph generator.
          • Random samples of Microsoft Academic Graph.
      Our tests’ source code is available at: https://bitbucket.org/unibogb/graphnestingc/src
  • 26. Experimental Evaluation – Competing Databases
      Given that the only graph database using Java was the worst performing one, we implemented our solution only in C++. The graph nesting operator was implemented in each DB language by returning ID collections.
          • PostgreSQL was used to evaluate SQL queries. We ran the queries directly in psql.
          • SPARQL queries were evaluated over Virtuoso. The queries were sent via ODBC (C++).
          • Cypher queries were evaluated over Neo4J. The queries were sent via the execute method.
          • AQL queries were evaluated over ArangoDB. We ran the queries directly in arangosh.
  • 27. Experimental Evaluation – GMark Benchmark
      Two-hop separated pattern time (C/C++), in ms, by operand size:

      |V|     #Subgraph     SQL+JSON      SPARQL        AQL           Cypher        THoSP
      10      3             2.10          11            15.00         681.40        0.11
      10^2    58            9.68          63            3.89          1,943.98      0.14
      10^3    968           17.96         63            12.34         >3.60×10^6    0.46
      10^4    8,683         69.27         364           46.74         >3.60×10^6    4.07
      10^5    88,885        294.23        4,153         508.87        >3.60×10^6    43.81
      10^6    902,020       2,611.48      50,341        7,212.19      >3.60×10^6    563.02
      10^7    8,991,417     25,666.14     672,273       922,590.00    >3.60×10^6    8,202.93
      10^8    89,146,891    396,523.88    >3.60×10^6    >3.60×10^6    >3.60×10^6    91,834.20
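One way to read the GMark table is as a speed-up of THoSP over each competitor at a given operand size. The short sketch below does that arithmetic for the two largest rows, using the times exactly as reported; entries listed as ">3.60×10^6" are treated as hitting that limit, so the corresponding ratio is only a lower bound. The function and variable names are illustrative.

```python
# Times in ms copied from the GMark table (rows |V| = 10^7 and 10^8).
# None marks an entry reported as ">3.60×10^6" (the run exceeded the limit).
TIMEOUT_MS = 3.60e6

times = {
    10**7: {"SQL+JSON": 25_666.14, "SPARQL": 672_273.0,
            "AQL": 922_590.00, "Cypher": None, "THoSP": 8_202.93},
    10**8: {"SQL+JSON": 396_523.88, "SPARQL": None,
            "AQL": None, "Cypher": None, "THoSP": 91_834.20},
}

def speedups(row):
    """Return {system: (ratio over THoSP, is_lower_bound)} for one table row."""
    base = row["THoSP"]
    return {
        name: ((t if t is not None else TIMEOUT_MS) / base, t is None)
        for name, t in row.items()
        if name != "THoSP"
    }

for size, row in times.items():
    for name, (ratio, lower_bound) in speedups(row).items():
        prefix = ">" if lower_bound else ""
        print(f"|V| = {size}: THoSP vs {name}: {prefix}{ratio:.1f}x")
```

For systems that timed out, the printed ratio is marked with ">" because the true running time is unknown.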
  • 28. Experimental Evaluation – Microsoft Academic Graph Benchmark
      Two-hop separated pattern time (C/C++), in ms, by operand size:

      |V|     #Subgraph      SQL+JSON     SPARQL      AQL           Cypher      THoSP
      10      19             1.69·10^0    3.4·10^1    6.57·10^-1    2.38·10^3   2.82·10^-1
      10^2    255            1.75·10^0    3.22·10^2   2.51·10^0     1.01·10^4   3.46·10^-1
      10^3    23,119         4.71·10^1    1.22·10^3   8.18·10^1     >1H         1.39·10^1
      10^4    5,411,205      1.53·10^4    2.77·10^5   2.08·10^4     >1H         2.58·10^3
      10^5    97,079,329     1.20·10^6    >1H         OOM1          >1H         1.97·10^5
      10^6    241,448,529    >1H          >1H         OOM1          >1H         6.22·10^5
      10^7    361,759,509    OOM2         >1H         OOM1          >1H         7.74·10^5
  • 29. Experimental Evaluation – Results
          • These benchmarks show that none of the current data models supporting nested representations provides query plans tailored to this specific case of (graph) nesting.
          • The proposed approach extends the property graph representation in secondary memory by adding associations to nested vertices and edges.
          • The serialized data structure provides a graph with an external containment data structure.
          • This data model achieves structural aggregation for graph data, where aggregated data may preserve the original vertices and edges.
  • 30. Experimental Evaluation – Further Results
          • GROQ: THoSP can be generalized into a more general algorithm.
          • Generalized Semistructured Model: this data structure can be generalized into a broader data representation.
  • 31. Experimental Evaluation – Future Work
          • GROQ: further benchmarks have to be carried out on this more general nesting algorithm.
          • General Nesting: provide a query plan where either grouping or GROQ is used.
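  • 33. Backup Slides – Nested Graph Database
      Given a set $\Sigma^*$ of strings, a nested (property) graph database $G$ is a tuple $G = \langle V, E, \lambda, \ell, \omega, \nu, \epsilon \rangle$, where:
          • $V, E \subseteq \mathbb{N}^2$ s.t. $V \cap E = \emptyset$
          • source and target $\lambda\colon E \to V^2$
          • labelling $\ell\colon V \cup E \to \wp(\Sigma^*)$
          • object mapping $\omega\colon V \cup E \to \Omega$
          • vertices’ containment $\nu\colon (V \cup E) \to \wp(V)$
          • edges’ containment $\epsilon\colon (V \cup E) \to \wp(E)$
      Each vertex or edge $o \in V \cup E$ induces a nested (property) graph as the following pair:
      $$G_o = \Big\langle \nu(o),\ \big\{\, e \in \epsilon(o) \;\big|\; \lambda(e) \in \big(\textstyle\bigcup_{n \ge 0} \nu^{(n)}(\{o\})\big)^2 \,\big\} \Big\rangle$$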
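The tuple above maps naturally onto a small record type. The following is a minimal in-memory sketch under simplifying assumptions: ids are plain integers rather than pairs, every component is a Python dict or set, and the class and method names are illustrative rather than taken from the authors' code. The `induced` method computes the nested graph induced by an object, keeping only the contained edges whose endpoints are both reachable through repeated vertex containment.

```python
from dataclasses import dataclass, field

@dataclass
class NestedGraphDB:
    vertices: set = field(default_factory=set)   # V
    edges: set = field(default_factory=set)      # E, disjoint from V
    lam: dict = field(default_factory=dict)      # edge -> (source, target)
    labels: dict = field(default_factory=dict)   # labelling: object -> set of strings
    omega: dict = field(default_factory=dict)    # object mapping: object -> properties
    nu: dict = field(default_factory=dict)       # vertices' containment: object -> set of vertices
    eps: dict = field(default_factory=dict)      # edges' containment: object -> set of edges

    def reachable(self, o):
        """Union over n >= 0 of the n-fold vertex containment applied to {o}."""
        seen, frontier = {o}, {o}
        while frontier:
            frontier = {v for x in frontier for v in self.nu.get(x, ())} - seen
            seen |= frontier
        return seen

    def induced(self, o):
        """Nested graph induced by o: (contained vertices, admissible contained edges)."""
        scope = self.reachable(o)
        kept = {e for e in self.eps.get(o, ())
                if all(endpoint in scope for endpoint in self.lam[e])}
        return set(self.nu.get(o, ())), kept

# Toy database: object 9 nests vertices 0 and 1, and lists edges 5 and 6.
db = NestedGraphDB(
    vertices={0, 1, 2, 9},
    edges={5, 6},
    lam={5: (0, 1), 6: (0, 2)},
    nu={9: {0, 1}},
    eps={9: {5, 6}},
)
sub_vertices, sub_edges = db.induced(9)
```

Edge 6 is dropped from the induced graph because its target, vertex 2, is not reachable from object 9 through vertex containment.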
  • 34. THoSP Pseudocode
      nest(Cont, patt, u, S):
          for each s in S s.t. patt.doSerialize(s):
              Cont.write(<u, s>)

      Input: G, gV, gE
      Cont ← ∅
      NestedGraph ← ∅
      a ← V ∩ E (γV ∪ γsrc E ∪ γdst E);
      for each v: vertex in G s.t. a(v):
          for each V(u →e v):
              u := dtl(u)c;
              nest(Cont, V, u, {u, e, v})
              NGraph(V) ← NGraph(V) ∪ {u}
              for each V(w →e v) s.t. E(u →e v ←e w):
                  w := dtl(w)c;
                  e’ := dtl(u, w)c;
                  nest(Cont, E, e’, {u, e, v, e', w})
                  NGraph(E) ← NGraph(E) ∪ {u →e’ w}
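The pseudocode can be approximated in a few lines for the coauthorship use case. This is a sketch under stated assumptions, not the authors' C++ implementation: the endpoints of the AuthorOf edges below are hypothetical (the slides only list the edge ids 6 to 10), tuple keys such as `("V", author)` stand in for the `dtl` id construction, whose definition is not shown, and only the starred Paper vertex is written to the nesting index. Distinct source and target authors are also assumed.

```python
from collections import defaultdict

def thosp_coauthors(in_edges):
    """Two-hop nesting sketch.

    in_edges: paper id -> list of (edge id, author id), i.e. the ingoing
    authorOf edges of each paper (the shared sub-pattern vertex).
    """
    cont = defaultdict(set)            # nesting index: container -> contained ids
    nested_v, nested_e = set(), set()
    for paper, incoming in in_edges.items():       # visit each sub-pattern match once
        for _edge, u in incoming:                  # vertex pattern: u -authorOf-> paper
            cont[("V", u)].add(paper)
            nested_v.add(("V", u))
            for _edge2, w in incoming:             # extend to the edge pattern
                if w == u:                         # assumed: source and target differ
                    continue
                cont[("E", u, w)].add(paper)
                nested_e.add(("E", u, w))
    return nested_v, nested_e, dict(cont)

# Hypothetical endpoints for the AuthorOf edges 6..10 of the example network.
in_edges = {
    3: [(6, 0), (7, 2)],    # "On Joining Graphs"
    4: [(8, 0), (9, 1)],    # "Object Databases"
    5: [(10, 0)],           # "On Nesting Graphs"
}
authors, coauthorships, index = thosp_coauthors(in_edges)
```

Each paper is visited once and both the nested vertices and the nested edges are produced during that visit, which is the point of visiting the shared sub-pattern first; no separate grouping step over the matches is needed because the containment is recorded directly in `index`.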