www.moving-project.eu
TraininG towards a society of data-saVvy inforMation prOfessionals to enable open leadership INnovation
Till Blume
ZBW – Leibniz Information Centre for Economics
Kiel University
The FLuID Meta Model: Incrementally Compute
Schema-level Indices for the Web of Data
September 26th, 2018, DLR, Jena, Germany.
MOVING search scenario:
• The MOVING platform provides access to a large variety of scientific literature,
metadata, videos, social-media content, websites, …
Information Retrieval (IR) System
• Additional metadata from the Web of Data is of great value:
• We use it as an additional source of information [4].
• We complement existing metadata.
• We train machine learning models to further improve the IR [5].
Integrating the Web of Data
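[Figure: the MOVING platform, the Data Integration Service (DIS), and the Schema-level Index (SLI), connected by steps 1 to 4. Example structural query over foaf:Agent, bibo:Book, dct:subject, and dct:creator. Example record: "Towards a clean air policy", "Great Britain. Central Electricity".]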
Data Integration Service (DIS) for the Web of Data:
1. We formulate a structural query to find matching databases (e.g. containing
bibliographic metadata).
2. The Schema-level Index returns a list of matching databases.
3. Access all databases and check the contained data instances.
4. Harvest the relevant data instances and integrate them into our database.
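Steps 1 and 2 above can be sketched as follows. This is a minimal illustration, not the MOVING implementation: the index is assumed to be a plain dictionary from a schema (a set of types and properties) to the data sources containing matching instances. The function name `find_sources`, the example.org URLs, and the subset-matching rule are hypothetical.

```python
def find_sources(sli, query_terms):
    """Return all data sources whose indexed schema contains every query term.

    sli: dict mapping a schema (frozenset of types/properties) to a set of
         data source URIs (assumed layout, for illustration only).
    """
    query = frozenset(query_terms)
    sources = set()
    for schema, payload in sli.items():
        if query <= schema:  # the structural query matches this schema
            sources |= payload
    return sources


# Hypothetical index content.
example_sli = {
    frozenset({"bibo:Book", "dct:creator", "dct:subject"}): {"http://example.org/source-a"},
    frozenset({"foaf:Agent"}): {"http://example.org/source-b"},
}

matching = find_sources(example_sli, {"bibo:Book", "dct:subject"})
```

Steps 3 and 4 (accessing the sources and harvesting instances) would then operate only on the returned sources rather than on the whole Web of Data.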
Problem Statement
• The Web of Data is a huge, dynamic, heterogeneous network:
• Huge: the largest collection of the Web of Data has more than 38 billion edges (LOD Laundromat, http://lodlaundromat.org/).
• Dynamic: 40-50% of the data changes either frequently or infrequently [10].
• Heterogeneous: no central authority is responsible for data management.
• How can schema-level indices be updated efficiently over time?
• Re-computing from scratch is not feasible: crawls take weeks and computations take days.
• Incremental instance-level graph indices [8,13,14,18,19] are not applicable due to the different level of abstraction.
• Incremental schema discovery in NoSQL databases [7,17] is not applicable due to the decentralized nature of the Web of Data.
The Meta Model FLuID
• M2: Meta model FLuID (our previous work [2]); acts as a blueprint for index models.
• M1: Model, e.g., Index Model A: Characteristic Sets. Various index models exist [1,3,6,7,9,11,12,15,16].
• M0: Implementation, e.g., Index 1, produced by the index computation over the data graph.
• Each level is an <<instanceOf>> the level above it.
• Our ongoing work: extend to incremental index computation.
Approach
Goal: Develop an incremental schema-level index
• We extend our previously developed meta model FLuID, which can define
arbitrary schema-level indices [2].
• We analyze the effect of different types of changes to the data graph on
schema-level indices.
• We analyze the best and worst case complexity for all types of updates.
• We outline an algorithm using an additional data structure to coordinate the
updates efficiently.
• We experimentally evaluate the expected size of the additional data
structure.
Impact:
• Ease analyses of how data instances evolve at the schema level.
• Allow "always up-to-date" data caches at the schema level.
• FLuID provides 4 schema elements:
• 3 simple elements: Object Cluster (OC), Property Cluster (PC), and Property-
Object Cluster (POC)
• 1 Complex Schema Element (CSE)
• FLuID provides 5 parameterizations:
• Label parameterization
• Chaining parameterization
• Direction parameterization
• Ontology parameterization
• Instance parameterization
• In total, FLuID provides 9 building blocks sufficient to model all
existing approaches and beyond.
The FLuID Meta Model
• Instances: edges <s,p,o> with same subject node s, i.e.,
$((i_1, p_1, o_1), (i_2, p_2, o_2)) \in I \Leftrightarrow i_1 = i_2$.
• Edges belong to exactly 1 instance, nodes not necessarily
• Since instances partition the data graph, a set of instances also partitions the
data graph.
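The partitioning of edges into instances can be illustrated with a short sketch. It assumes the data graph is given as a list of (s, p, o) tuples; the function name `partition_into_instances` is hypothetical.

```python
from collections import defaultdict


def partition_into_instances(triples):
    """Group edges (s, p, o) by their subject s; each group is one instance."""
    instances = defaultdict(list)
    for s, p, o in triples:
        instances[s].append((s, p, o))
    return dict(instances)


triples = [("i1", "p1", "i2"), ("i1", "p2", "i3"), ("i2", "p1", "i4")]
instances = partition_into_instances(triples)
```

Every edge lands in exactly one group, while a node such as i2 can appear both as a subject of its own instance and as an object in another one.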
FLuID: Equivalence Relation Approach
[Figure: example data graph with nodes i1 to i10 and edges labeled p1, p2, and p3.]
• Object Cluster: summarize instances that share a set of connected objects, i.e.,
$([i_1]_I, [i_2]_I) \in OC \Leftrightarrow \forall (i_1, p_1, o_1)\, \exists (i_2, p_2, o_2) : o_1 = o_2 \;\wedge\; \forall (i_2, p_2, o_2)\, \exists (i_1, p_1, o_1) : o_1 = o_2$
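Read operationally, the definition says that two instances fall into the same object cluster exactly when their sets of connected objects are equal. A minimal sketch under that reading, again assuming (s, p, o) tuples and a hypothetical function name:

```python
from collections import defaultdict


def object_clusters(triples):
    """Map each distinct object set to the instances (subjects) that share it."""
    objects_of = defaultdict(set)
    for s, _p, o in triples:
        objects_of[s].add(o)

    clusters = defaultdict(set)
    for s, objects in objects_of.items():
        clusters[frozenset(objects)].add(s)
    return dict(clusters)


clusters = object_clusters([
    ("i1", "p1", "x"),
    ("i2", "p2", "x"),  # different property, same object: same cluster as i1
    ("i3", "p1", "y"),
])
```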
The FLuID Model
• Label Parameterized Object Cluster: summarize instances that share the same set of
connected objects, considering only edges whose property is p1
• Label Parameterized Object Cluster: summarize instances that share the same set of
connected objects, considering only edges whose property is rdf:type
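A label parameterization can be sketched by filtering edges before clustering. The function name and the `labels` parameter are hypothetical, and it is an assumption of this sketch that instances without any matching edge end up together under the empty object set.

```python
from collections import defaultdict


def label_parameterized_object_clusters(triples, labels=frozenset({"rdf:type"})):
    """Cluster instances by the objects reached via properties in `labels` only."""
    objects_of = defaultdict(set)
    for s, p, o in triples:
        objects = objects_of[s]  # registers s even if none of its edges match
        if p in labels:
            objects.add(o)

    clusters = defaultdict(set)
    for s, objects in objects_of.items():
        clusters[frozenset(objects)].add(s)
    return dict(clusters)


type_clusters = label_parameterized_object_clusters([
    ("i1", "rdf:type", "bibo:Book"),
    ("i1", "p2", "i2"),
    ("i4", "rdf:type", "bibo:Book"),
    ("i4", "p3", "i5"),
    ("i6", "rdf:type", "foaf:Agent"),
])
```

With the rdf:type label, i1 and i4 are summarized together although their other edges differ.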
[Figure: the example data graph with p1 replaced by rdf:type; the rdf:type objects are bibo:Book, foaf:Agent, and bibo:Proceedings.]
• Ontology parameterization: RDFS Schema Graph
• Instance parameterization: owl:sameAs
[Figure: the example data graph with dct:creator, rdf:type, and owl:sameAs edges; the rdf:type objects are bibo:Proceedings, bibo:Book, and foaf:Agent.]
How to compute a Schema-level Index?
Define an index model using the meta model FLuID [2]:
1. Characteristic Sets [12]:
• Instances have the same incoming and outgoing PROPERTIES
Char.Sets:= u-PROPS
2. TermPicker [15]:
• Instances have the same TYPES
• Instances have the same PROPERTIES
• Neighboring Instances have the same TYPES
TermPicker:= (TYPES ∩ PROPS, T, TYPES)
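As an illustration of the first index model, the sketch below groups nodes by their incoming and outgoing properties. It is an assumption of this sketch that direction is kept as an "in"/"out" tag on each property; the function name is hypothetical and this is not the FLuID implementation.

```python
from collections import defaultdict


def characteristic_sets(triples):
    """Group nodes that have the same incoming and outgoing properties."""
    properties_of = defaultdict(set)
    for s, p, o in triples:
        properties_of[s].add(("out", p))  # p leaves the subject
        properties_of[o].add(("in", p))   # p enters the object

    index = defaultdict(set)
    for node, properties in properties_of.items():
        index[frozenset(properties)].add(node)
    return dict(index)


char_sets = characteristic_sets([
    ("b1", "author", "a1"),
    ("b2", "author", "a2"),
    ("b3", "keyword", "k1"),
])
```

TermPicker would additionally compare the types of an instance and the types of its neighbors, which is where the chaining parameterization comes in.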
[Figure annotations on the two definitions: Simple Schema Elements; Complex Schema Elements; Simple Schema Element × 2; Chaining Parameterization.]
Computing a Schema-level Index
Example index computation for TermPicker:
• Index Model (TYPES ∩ PROPS, T, TYPES):
• τ: number of complex schema elements (1)
• ς: number of simple schema elements (2)
• k: chaining parameter (1)
• Data graph (instance information):
• v: number of nodes (6)
• Index size:
• α: number of simple schema elements (3)
• β: number of complex schema elements (1)
• Upper bound for the index size: α + β ≤ v * (ς + τ * k); in the example, 4 < 18.
[Figure: data graph with instances i1 to i6 (types Book, Subject, Person; properties author, keyword) and the resulting schema-level index with the schema elements pcrel-2, octype-1, octype-2, and cse-1, split into schema and payload. Caption: computational complexity for (re-)computing from scratch.]
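Plugging the example values into the bound gives a quick check. This is only a worked instance of the formula above, using the counts stated on this slide:

```latex
\alpha + \beta = 3 + 1 = 4,
\qquad
v \cdot (\varsigma + \tau \cdot k) = 6 \cdot (2 + 1 \cdot 1) = 18,
\qquad
4 \le 18 .
```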
Updating a Schema-level Index
Example index update for TermPicker:
• Removing instance information can increase the index size!
• Index size after the update:
• α: number of simple schema elements (4)
• β: number of complex schema elements (2)
[Figure: the example data graph (instances i1 to i6; types Book, Subject, Person; properties author, keyword) after an update, and the schema-level index, which now contains octype-3 and cse-2 in addition to pcrel-2, octype-1, octype-2, and cse-1.]
Possible Update Operations
There are six cases of updates possible for an SLI:
1. a new instance is observed with a new schema (SEnew)
2. a new instance is observed with a known schema (PEadd)
3. an instance is observed with a changed schema (SEmod)
4. an instance is observed with only changed instance information (PEmod)
5. an instance no longer exists (PEdel)
6. no more instance with a specific schema exists (SEdel)
We analyzed for each update type the best and worst case time complexity of
the update operation on the schema graph.
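The six cases can be told apart by comparing what was previously known about an instance with what is observed now. The sketch below is one possible reading, not the authors' procedure: the function name, the (schema, payload) tuples, and the rule that a deletion is reported as SEdel when it removes the last instance of a schema are all assumptions.

```python
def classify_update(index, instance, prev, new):
    """Return the update case for one observed instance.

    index: dict mapping a schema to the set of instances that currently have it.
    prev / new: (schema, payload) before and after the observation, or None
                if the instance did not exist / no longer exists.
    """
    if prev is None and new is None:
        raise ValueError("nothing observed")
    if prev is None:  # a new instance
        return "PEadd" if new[0] in index else "SEnew"
    if new is None:  # the instance was deleted
        remaining = index.get(prev[0], set()) - {instance}
        return "PEdel" if remaining else "SEdel"
    if prev[0] != new[0]:
        return "SEmod"  # the schema changed
    if prev[1] != new[1]:
        return "PEmod"  # only instance information changed
    return None  # nothing changed


example_index = {"S1": {"i1", "i2"}, "S2": {"i3"}}
```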
Overall maximum update complexity:
$\varsigma + \tau \cdot k + \delta^{-}(I^{1}_{\max k})^{k} \cdot \tau$
• $\varsigma + \tau \cdot k$: number of simple and complex schema elements in the index model
• $\delta^{-}(I^{1}_{\max k})$: in-degree of instances found in the data graph
• $k$: chaining parameter in the index model
Incremental Schema-level Index
• Base algorithm iterates over all instances and computes the schema according
to the Index Model.
• Incremental algorithm uses a partial history of the indexed data graph to
coordinate the updates – the Update Coordinator (UC)
procedure computeSchema(instance)                                  ▷ O(v)
    for all SchemaElement in IndexModel do                         ▷ O(ς + τ * k)
        hash(se_prev) ⟵ UC.previousElement(instance)
        se_new ⟵ extractSchema(instance, SchemaElement)            ▷ O(|instance|)
        if hash(se_prev) ≠ hash(se_new) then
            se_prev ⟵ retrieveFromSLI(hash(se_prev))
            updateSchemaElement(se_prev, se_new)                   ▷ O(|instance|)
            parentInstances ⟵ UC.getParentInstances(instance)
            for all pInstance in parentInstances do                ▷ O(δ− * (ς + τ * k))
                computeSchema(pInstance)
        storeInSLI(se_new)
        UC.put(instance, se_new)
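The outline above can be made concrete for one simplified index model. The sketch below is not the authors' implementation. It assumes a TermPicker-like schema with chaining parameter k = 1 (own types, own properties, types of neighbors), keeps the whole graph in memory, and uses hypothetical names throughout (`IncrementalIndex`, `observe`, `uc_element`, `uc_parents`).

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


class IncrementalIndex:
    def __init__(self):
        self.graph = {}                     # instance -> set of (property, object)
        self.sli = {}                       # schema element -> instances (payload)
        self.uc_element = {}                # UC: instance -> schema element stored last
        self.uc_parents = defaultdict(set)  # UC: instance -> instances linking to it

    def _types(self, node):
        return frozenset(o for p, o in self.graph.get(node, ()) if p == RDF_TYPE)

    def _extract_schema(self, instance):
        edges = self.graph.get(instance, ())
        props = frozenset(p for p, _ in edges if p != RDF_TYPE)
        neighbour_types = frozenset(self._types(o) for p, o in edges if p != RDF_TYPE)
        return (self._types(instance), props, neighbour_types)

    def _compute_schema(self, instance):
        se_prev = self.uc_element.get(instance)
        se_new = self._extract_schema(instance)
        if se_prev == se_new:
            return  # the schema is unchanged, nothing to update
        if se_prev is not None:
            self.sli[se_prev].discard(instance)
            if not self.sli[se_prev]:
                del self.sli[se_prev]  # no instance with this schema is left
        self.sli.setdefault(se_new, set()).add(instance)
        self.uc_element[instance] = se_new

    def observe(self, instance, edges):
        """Replace the edges of `instance` and update the index incrementally."""
        for p, o in self.graph.get(instance, ()):
            if p != RDF_TYPE:
                self.uc_parents[o].discard(instance)
        self.graph[instance] = set(edges)
        for p, o in self.graph[instance]:
            if p != RDF_TYPE:
                self.uc_parents[o].add(instance)
        self._compute_schema(instance)
        # With k = 1, only direct parents depend on this instance's types.
        for parent in sorted(self.uc_parents[instance]):
            self._compute_schema(parent)


idx = IncrementalIndex()
idx.observe("b1", [(RDF_TYPE, "bibo:Book"), ("dct:creator", "a1")])
idx.observe("b2", [(RDF_TYPE, "bibo:Book"), ("dct:creator", "a2")])
size_before = len(idx.sli)
idx.observe("a1", [(RDF_TYPE, "foaf:Agent")])  # also changes the schema of b1
size_between = len(idx.sli)
idx.observe("a2", [(RDF_TYPE, "foaf:Agent")])  # also changes the schema of b2
```

Typing a1 touches only a1 and its parent b1, so the rest of the graph is never revisited.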
Update Coordinator
Store a partial history of the indexed data graph in a 3-layered data structure to
cope with the additional level of abstraction:
• Schema Element hashes (e.g., SE-URI-12, SE-URI-13, SE-URI-55, …): for each instance, one type of Schema Element is linked.
• Instance Element URIs (e.g., I-URI-67, I-URI-68, I-URI-88, I-URI-89, …): each instance can have dependencies on zero or more instances.
• Data Source URIs + timestamps (e.g., DS-URI-1, DS-URI-2, DS-URI-3, …): each instance is defined in at least one data source.
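One way to picture the three layers is as three lookups keyed by instance URI. The sketch below is an assumption about the layout, not the actual Update Coordinator: the field names, the `links_to` parameter, and the use of ISO strings as timestamps are hypothetical, while the method names mirror the calls used in the algorithm outline.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateCoordinator:
    schema_hash: dict = field(default_factory=dict)  # instance URI -> schema element hash
    parents: dict = field(default_factory=dict)      # instance URI -> parent instance URIs
    sources: dict = field(default_factory=dict)      # instance URI -> {source URI: timestamp}

    def put(self, instance, se_hash, source, timestamp, links_to=()):
        """Record the schema element, data source, and outgoing links of an instance."""
        self.schema_hash[instance] = se_hash
        self.sources.setdefault(instance, {})[source] = timestamp
        for target in links_to:
            # `instance` becomes a parent of every instance it links to.
            self.parents.setdefault(target, set()).add(instance)

    def previous_element(self, instance):
        return self.schema_hash.get(instance)

    def get_parent_instances(self, instance):
        return set(self.parents.get(instance, ()))


uc = UpdateCoordinator()
uc.put("I-URI-67", "SE-URI-12", "DS-URI-1", "2018-09-26T00:00:00Z", links_to=["I-URI-88"])
uc.put("I-URI-67", "SE-URI-12", "DS-URI-2", "2018-09-27T00:00:00Z")
```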
Preliminary Results & Discussion
• Datasets: 4 snapshots of the Web of Data crawled by the Dynamic Linked Data
Observatory (DyLDO, http://swse.deri.org/dyldo/), each with about 7 million data instances
• We analyze the datasets with respect to the 3-layered data structure:
• About 28% of the data instances are defined in more than 1 (average 2.2)
data sources.
• About 33% of the instances have dependencies on 1 or more (average 5)
parent instances (in-degree).
Implications for the additional space complexity on DyLDO datasets:
• SLIs using a complex schema element require storing 33% more data.
• SLIs using only simple schema elements require storing 15% more data.
Implications for the update time complexity on DyLDO datasets:
• Updates: ς + τ * k + 5 * (ς + τ * k) (TermPicker: 18)
• Re-computation: v * (ς + τ * k) (TermPicker: 21,000,000)
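The two TermPicker figures follow directly from the formulas. The small sketch below reproduces them with ς = 2, τ = 1, k = 1 taken from the TermPicker index model, an average in-degree of 5, and about 7 million instances; the function names are hypothetical.

```python
def update_cost(simple, complex_, k, in_degree):
    """Cost of one incremental update: the instance itself plus its parents."""
    per_instance = simple + complex_ * k
    return per_instance + in_degree * per_instance


def recompute_cost(nodes, simple, complex_, k):
    """Cost of re-computing the index from scratch over all nodes."""
    return nodes * (simple + complex_ * k)


termpicker_update = update_cost(simple=2, complex_=1, k=1, in_degree=5)
termpicker_recompute = recompute_cost(nodes=7_000_000, simple=2, complex_=1, k=1)
```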
Conclusion
1. We analyzed all different types of changes.
2. We outlined an algorithm that can map all instance-level changes to updates on
the schema-level Index.
3. Advantage: Only a small number of Schema Elements may need to be updated
compared to re-computing from scratch.
4. Limitation: Depending on the Index Model, we may need to store more data.
Future Work
1. Implement and empirically evaluate the performance of the incremental schema-level index algorithm on real-world datasets (DyLDO snapshots, http://swse.deri.org/dyldo/) and benchmark datasets (Berlin SPARQL Benchmark, http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/spec/index.html, and Lehigh University Benchmark, http://swat.cse.lehigh.edu/projects/lubm/).
2. Reduce the data overhead by changing the Update Coordinator data structure, e.g.,
by using approximate data structures such as Bloom filters.
Conclusion & Future Work
Search Engine Prototype: LODatio+
http://lodatio.informatik.uni-kiel.de
Data Integration Service:
1. Index the LOD cloud.
2. Identify the relevant sources.
3. Harvest the relevant sources.
4. Crawl & harvest additional relevant sources.
[Figure components: LOD Crawler, FLuID Framework, Schema-level Index, Harvey Framework, Data cloud, Focused Crawler, JSON-Mapping, Query, Index, Discovery System.]
Ongoing Work
Thank you for your attention!
Any questions?
Project consortium and funding agency
MOVING is funded by the EU Horizon 2020 Programme under the project number INSO-4-2015: 693092
References
1. Benedetti, F., Bergamaschi, S., Po, L.: Exposing the underlying schema of LOD sources. In Joint IEEE/WIC/ACM WI and IAT, 2015.
2. Blume, T., Scherp, A.: Towards flexible indices for distributed graph data: The formal schema-level index model FLuID. In Foundations of Databases. CEUR-WS.org, 2018.
3. Ciglan, M., Nørvåg, K., Hluchý, L.: The SemSets model for ad-hoc semantic list search. In WWW, 2012.
4. Galke, L., Mai, F., Schelten, A., Brunsch, D., Scherp, A.: Using titles vs. full-text as source for automated semantic document annotation. In K-CAP, 2017.
5. Galke, L., Saleh, A., Scherp, A.: Evaluating the Impact of Word Embeddings on Similarity Scoring in Practical Information Retrieval. In INFORMATIK, 2017.
6. Goldman, R., Widom, J.: DataGuides: Enabling query formulation and optimization in semistructured databases. In VLDB, 1997.
7. Gómez, S.N., Etcheverry, L., Marotta, A., Consens, M.P.: Findings from two decades of research on schema discovery using a systematic literature review. In AMW. CEUR-WS.org, 2018.
8. Kansal, A., Spezzano, F.: A scalable graph-coarsening based index for dynamic graph databases. In CIKM, 2017.
9. Konrath, M., Gottron, T., Staab, S., Scherp, A.: SchemEX - efficient construction of a data catalogue by stream-based indexing of Linked Data. In J. Web Sem., 16:52–58, 2012.
10. Käfer, T., Abdelrahman, A., Umbrich, J., O'Byrne, P., Hogan, A.: Observing linked data dynamics. In ESWC. Springer, 2013.
11. McHugh, J., Abiteboul, S., Goldman, R., Quass, D., Widom, J.: Lore: a database management system for semistructured data. In SIGMOD Record, 26(3):54–66, 1997.
12. Neumann, T., Moerkotte, G.: Characteristic sets: Accurate cardinality estimation for RDF queries with multiple joins. In ICDE, 2011.
13. Qiao, M., Zhang, H., Cheng, H.: Subgraph matching: on compression and computation. In PVLDB, 11(2):176–188, 2017.
14. Sakr, S., Al-Naymat, G.: Graph indexing and querying: a review. In J. of Web Inf. Sys., 6(2):101–120, 2010.
15. Schaible, J., Gottron, T., Scherp, A.: TermPicker: Enabling the reuse of vocabulary terms by exploiting data from the Linked Open Data cloud. In ESWC, 2016.
16. Spahiu, B., Porrini, R., Palmonari, M., Rula, A., Maurino, A.: ABSTAT: ontology-driven Linked Data summaries with pattern minimalization. In ESWC Satellite Events, Revised Selected Papers, 2016.
17. Wang, L., Zhang, S., Shi, J., Jiao, L., Hassanzadeh, O., Zou, J., Wang, C.: Schema management for document stores. In PVLDB, 8(9):922–933, 2015.
18. Yuan, D., Mitra, P., Yu, H., Giles, C.L.: Iterative graph feature mining for graph indexing. In ICDE, 2012.
19. Yuan, D., Mitra, P., Yu, H., Giles, C.L.: Updating graph indices with a one-pass algorithm. In SIGMOD, 2015.
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, ConfluentApache Kafka and the Data Mesh | Ben Stopford and Michael Noll, Confluent
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, Confluent
 
Enterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshEnterprise guide to building a Data Mesh
Enterprise guide to building a Data Mesh
 
Whats new in_mlflow
Whats new in_mlflowWhats new in_mlflow
Whats new in_mlflow
 

Recently uploaded

CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfhenrik385807
 
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceDelhi Call girls
 
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxNikitaBankoti2
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Kayode Fayemi
 
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkataanamikaraghav4
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesPooja Nehwal
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Hasting Chen
 
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyCall Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyPooja Nehwal
 
Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AITatiana Gurgel
 
Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Chameera Dedduwage
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Vipesco
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...Sheetaleventcompany
 
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfOpen Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfhenrik385807
 
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...henrik385807
 
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxMohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxmohammadalnahdi22
 
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...NETWAYS
 
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Delhi Call girls
 
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Salam Al-Karadaghi
 
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )Pooja Nehwal
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024eCommerce Institute
 

Recently uploaded (20)

CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
 
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
 
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
 
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
 
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyCall Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
 
Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AI
 
Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
 
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfOpen Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
 
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
 
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxMohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
 
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
 
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
 
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
 
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
 

The FLuID Meta Model: Incrementally Compute Schema-level Indices for the Web of Data

  • 1. www.moving-project.eu TraininG towards a society of data-saVvy inforMation prOfessionals to enable open leadership INnovation Till Blume ZBW – Leibniz Information Centre for Economics Kiel University The FLuID Meta Model: Incrementally Compute Schema-level Indices for the Web of Data September 26th, 2018, DLR, Jena, Germany.
  • 2. MOVING search scenario: The MOVING platform provides access to a large variety of scientific literature, metadata, videos, social-media content, websites, and more. [Figure: Information Retrieval (IR) System]
  • 6. MOVING search scenario (continued): Additional metadata from the Web of Data is of great value:
      • We use it as an additional source of information [4].
      • We complement existing metadata.
      • We train machine learning models to further improve the IR [5].
  • 7. Integrating the Web of Data
      Data Integration Service (DIS) for the Web of Data:
          1. We formulate a structural query to find matching databases (e.g., containing bibliographic metadata).
          2. The Schema-level Index (SLI) returns a list of matching databases.
          3. We access all databases and check the contained data instances.
          4. We harvest the relevant data instances and integrate them into our database.
      [Figure: the DIS of the MOVING platform sends a structural query over foaf:Agent, bibo:Book, dct:subject, and dct:creator to the SLI; example result: … Towards a clean air policy, Great Britain. Central Electricity]
  • 8. Problem Statement
      • The Web of Data is a huge, dynamic, heterogeneous network:
          • Huge: the largest collection of the Web of Data has more than 38 billion edges (see http://lodlaundromat.org/).
          • Dynamic: 40-50% of the data changes, either frequently or infrequently [10].
          • Heterogeneous: no central authority is responsible for data management.
      • How can schema-level indices be updated efficiently over time?
          • Re-computing from scratch is not feasible: crawls take weeks and computations take days.
          • Incremental instance-level graph indices [8,13,14,18,19] are not applicable because they work on a different level of abstraction.
          • Incremental schema discovery in NoSQL databases [7,17] is not applicable because of the decentralized nature of the Web of Data.
  • 9. The Meta Model FLuID
      • M2 (meta model): FLuID, our previous work [2]. It acts as a blueprint for index models.
      • M1 (model): index models, e.g., Index Model A: Characteristic Sets (<<instanceOf>> the meta model). Various index models exist [1,3,6,7,9,11,12,15,16].
      • M0 (implementation): a concrete index (Index 1), produced by the index computation over the data graph (<<instanceOf>> the index model).
      • Our ongoing work: extend FLuID to incremental index computation.
  • 10. Approach
      Goal: develop an incremental schema-level index.
          • We extend our previously developed meta model FLuID, which can define arbitrary schema-level indices [2].
          • We analyze how different types of changes in the data graph affect schema-level indices.
          • We analyze the best-case and worst-case complexity for all types of updates.
          • We outline an algorithm that uses an additional data structure to coordinate the updates efficiently.
          • We experimentally evaluate the expected size of the additional data structure.
      Impact:
          • Ease analyses of the evolution of data instances on the schema level.
          • Allow "always up-to-date" data caches on the schema level.
  • 11. The FLuID Meta Model
      • FLuID provides 4 schema elements:
          • 3 simple elements: Object Cluster (OC), Property Cluster (PC), and Property-Object Cluster (POC)
          • 1 Complex Schema Element (CSE)
      • FLuID provides 5 parameterizations:
          • Label parameterization
          • Chaining parameterization
          • Direction parameterization
          • Ontology parameterization
          • Instance parameterization
      • In total, FLuID provides 9 building blocks, which are sufficient to model all existing approaches and beyond.
  • 12. FLuID: Equivalence Relation Approach
      • Instances: edges <s,p,o> with the same subject node s, i.e., ((i_1, p_1, o_1), (i_2, p_2, o_2)) ∈ I ⇔ i_1 = i_2.
      • Each edge belongs to exactly 1 instance; nodes do not necessarily.
      • Since instances partition the data graph, a set of instances also partitions the data graph.
      [Figure: example data graph with nodes i1 to i10 and edge labels p1, p2, p3]
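The instance definition on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `partition_into_instances` and the sample triples are hypothetical and not taken from the FLuID implementation.

```python
from collections import defaultdict


def partition_into_instances(triples):
    """Group (s, p, o) edges by subject node; each group is one instance.

    Every edge lands in exactly one group, so the groups partition
    the edge set of the data graph.
    """
    instances = defaultdict(list)
    for s, p, o in triples:
        instances[s].append((p, o))
    return dict(instances)


# Hypothetical sample edges (not the graph from the slide figure).
triples = [
    ("i1", "p1", "i2"),
    ("i1", "p2", "i4"),
    ("i3", "p1", "i2"),
]
instances = partition_into_instances(triples)
print(instances)
```

Note that node `i2` appears as an object in two instances, while each edge appears in only one, which matches the remark that edges, but not necessarily nodes, belong to exactly one instance.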
  • 13. The FLuID Model
      • Object Cluster: summarizes instances that share a set of connected objects, i.e., ([i_1]_I, [i_2]_I) ∈ OC ⇔ ∀(i_1, p_1, o_1) ∃(i_2, p_2, o_2): o_1 = o_2 ∧ ∀(i_2, p_2, o_2) ∃(i_1, p_1, o_1): o_1 = o_2
      [Figure: example data graph with nodes i1 to i10 and edge labels p1, p2, p3]
  • 14. The FLuID Model
      • Label Parameterized Object Cluster: summarizes instances that have the same set of connected objects, considering only edges whose property is p1.
      [Figure: example data graph with nodes i1 to i10 and edge labels p1, p2, p3]
  • 15. The FLuID Model
      • Label Parameterized Object Cluster: summarizes instances that have the same set of connected objects, considering only edges whose property is rdf:type.
      [Figure: example data graph in which rdf:type edges point to bibo:Book, foaf:Agent, and bibo:Proceedings]
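A minimal Python sketch of the (label parameterized) Object Cluster may help here. It is an assumption-laden illustration: `object_clusters`, its `label` argument, and the sample triples are hypothetical names and data, and instances without any matching edge are simply left out.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def object_clusters(triples, label=None):
    """Group subjects that have the same set of connected objects.

    With label=None every edge counts (plain Object Cluster); with a
    label only edges carrying that property count (label parameterization).
    Subjects without any matching edge are omitted in this sketch.
    """
    objects = defaultdict(set)
    for s, p, o in triples:
        if label is None or p == label:
            objects[s].add(o)
    clusters = defaultdict(set)
    for s, objs in objects.items():
        clusters[frozenset(objs)].add(s)
    return dict(clusters)


# Hypothetical sample data using vocabulary terms from the slides.
triples = [
    ("i1", RDF_TYPE, "bibo:Book"),
    ("i1", "dct:creator", "i2"),
    ("i4", RDF_TYPE, "bibo:Book"),
    ("i4", "dct:creator", "i5"),
    ("i2", RDF_TYPE, "foaf:Agent"),
]
type_clusters = object_clusters(triples, label=RDF_TYPE)
plain_clusters = object_clusters(triples)
```

With the rdf:type label, `i1` and `i4` fall into the same cluster because both are typed bibo:Book; without the label they are separated because their creators differ.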
  • 16. The FLuID Model
      • Label Parameterized Object Cluster: summarizes instances that have the same set of connected objects, considering only edges whose property is rdf:type.
      • Ontology parameterization: RDFS Schema Graph
      [Figure: example data graph with rdf:type edges to bibo:Proceedings, bibo:Book, and foaf:Agent]
  • 17. The FLuID Model
      • Label Parameterized Object Cluster: summarizes instances that have the same set of connected objects, considering only edges whose property is rdf:type.
      • Ontology parameterization: RDFS Schema Graph
      • Instance parameterization: owl:sameAs
      [Figure: example data graph with dct:creator, rdf:type, and owl:sameAs edges; types bibo:Proceedings, bibo:Book, and foaf:Agent]
  • 18. How to compute a Schema-level Index?
      Define an index model using the meta model FLuID [2]:
          1. Characteristic Sets [12] (simple schema elements):
              • Instances have the same incoming and outgoing PROPERTIES.
              • Char.Sets := u-PROPS
          2. TermPicker [15] (complex schema elements):
              • Instances have the same TYPES.
              • Instances have the same PROPERTIES.
              • Neighboring instances have the same TYPES.
              • TermPicker := (TYPES ∩ PROPS, T, TYPES)
      [Figure labels: Simple Schema Elements; Complex Schema Elements; Simple Schema Element * 2 (Chaining Parameterization)]
  • 19. Computing a Schema-level Index
      Example index computation for TermPicker: computational complexity for (re-)computing from scratch.
          • Index Model (TYPES ∩ PROPS, T, TYPES): τ = number of complex schema elements (1); ς = number of simple schema elements (2); k = chaining parameter (1).
          • Data graph: v = number of nodes (6).
          • Index size: α = number of simple schema elements (3); β = number of complex schema elements (1).
          • Upper bound for the index size: α + β ≤ v * (ς + τ * k); in the example, 4 < 18.
      [Figure: data graph (instance information) with instances i1 to i6, the types Book, Subject, and Person, and author and keyword edges; schema-level index with schema elements oc_type-1, oc_type-2, pc_rel-2, and cse-1 plus payload]
  • 20. Updating a Schema-level Index
      Example index computation for TermPicker:
          • Removing instance information can increase the index size!
          • Index size after the update: α = number of simple schema elements (4); β = number of complex schema elements (2).
      [Figure: the data graph and schema-level index of the previous slide after removing instance information; the index gains the schema elements oc_type-3 and cse-2]
  • 21. Possible Update Operations
      There are six possible cases of updates for an SLI:
          1. A new instance is observed with a new schema (SE_new).
          2. A new instance is observed with a known schema (PE_add).
          3. An instance is observed with a changed schema (SE_mod).
          4. An instance is observed with only changed instance information (PE_mod).
          5. An instance no longer exists (PE_del).
          6. No more instances with a specific schema exist (SE_del).
      For each update type, we analyzed the best-case and worst-case time complexity of the update operation on the schema graph.
      Overall maximum update complexity: ς + τ * k + δ⁻(I^max)^k * τ, where
          • ς + τ * k is the number of simple and complex schema elements in the Index Model,
          • δ⁻(I^max) is the in-degree of instances found in the data graph, and
          • k is the chaining parameter in the Index Model.
  • 22. Incremental Schema-level Index
      • The base algorithm iterates over all instances and computes the schema according to the Index Model.
      • The incremental algorithm uses a partial history of the indexed data graph to coordinate the updates: the Update Coordinator (UC).

      procedure computeSchema(instance)                              ▷ O(v)
        for all SchemaElement in IndexModel do                       ▷ O(ς + τ * k)
          hash(se_prev) ⟵ UC.previousElement(instance)
          se_new ⟵ extractSchema(instance, SchemaElement)            ▷ O(|instance|)
          if hash(se_prev) ≠ hash(se_new) then
            se_prev ⟵ retrieveFromSLI(hash(se_prev))
            updateSchemaElement(se_prev, se_new)                     ▷ O(|instance|)
            parentInstances ⟵ UC.getParentInstances(instance)
            for all pInstance in parentInstances do                  ▷ O(δ⁻ * ς + τ * k)
              computeSchema(pInstance)
            storeInSLI(se_new)
            UC.put(instance, se_new)
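The procedure above can be turned into a small runnable sketch. The following Python is not the authors' implementation; it assumes a TermPicker-like schema (own types, own properties, types of neighbors), compares schema tuples directly instead of hashes, and keeps the Update Coordinator as two plain dictionaries. The class name `IncrementalIndex` and all method names are hypothetical, and deleting instances is not covered.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


class IncrementalIndex:
    """Toy schema-level index with an Update Coordinator (assumed design)."""

    def __init__(self):
        self.edges = {}                  # data graph: instance -> set of (p, o)
        self.schema_of = {}              # UC: instance -> last schema element
        self.parents = defaultdict(set)  # UC: instance -> instances linking to it
        self.sli = defaultdict(set)      # SLI: schema element -> payload

    def _types(self, node):
        return frozenset(o for p, o in self.edges.get(node, ()) if p == RDF_TYPE)

    def _extract_schema(self, instance):
        edges = self.edges[instance]
        props = frozenset(p for p, _ in edges if p != RDF_TYPE)
        neighbour_types = frozenset(
            t for p, o in edges if p != RDF_TYPE for t in self._types(o)
        )
        return (self._types(instance), props, neighbour_types)

    def update(self, instance, edges):
        """Replace all edges of one instance and propagate schema changes."""
        for p, o in self.edges.get(instance, ()):
            if p != RDF_TYPE:
                self.parents[o].discard(instance)
        self.edges[instance] = set(edges)
        for p, o in self.edges[instance]:
            if p != RDF_TYPE:
                self.parents[o].add(instance)
        self._compute_schema(instance, seen=set())

    def _compute_schema(self, instance, seen):
        if instance in seen:             # guard against cycles in the graph
            return
        seen.add(instance)
        se_prev = self.schema_of.get(instance)
        se_new = self._extract_schema(instance)
        if se_prev == se_new:            # schema unchanged: nothing to propagate
            return
        if se_prev is not None:
            self.sli[se_prev].discard(instance)
            if not self.sli[se_prev]:    # last instance gone: drop the element
                del self.sli[se_prev]
        self.sli[se_new].add(instance)
        self.schema_of[instance] = se_new
        for parent in list(self.parents[instance]):
            self._compute_schema(parent, seen)


idx = IncrementalIndex()
idx.update("i1", [(RDF_TYPE, "bibo:Book"), ("dct:creator", "i2")])
idx.update("i2", [(RDF_TYPE, "foaf:Agent")])
idx.update("i2", [(RDF_TYPE, "foaf:Person")])
print(idx.schema_of["i1"])
```

In this sketch, changing the type of `i2` triggers a recomputation of its parent `i1`, whose neighbor types change from foaf:Agent to foaf:Person. Instances that do not link to `i2` are never touched, which is the point of keeping the parent links in the Update Coordinator.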
  • 23. Update Coordinator
      Store a partial history of the indexed data graph in a 3-layered data structure to cope with the additional level of abstraction:
          • Schema element hashes (e.g., SE-URI-12, SE-URI-13, SE-URI-55): for each instance, one type of schema element is linked.
          • Instance element URIs (e.g., I-URI-67, I-URI-68, I-URI-88, I-URI-89): each instance can have dependencies on zero or more instances.
          • Data source URIs plus timestamps (e.g., DS-URI-1, DS-URI-2, DS-URI-3): each instance is defined in at least one data source.
  • 24. Preliminary Results & Discussion
      • Datasets: 4 snapshots of the Web of Data crawled by the Dynamic Linked Data Observatory (DyLDO, http://swse.deri.org/dyldo/), each with about 7 million data instances.
      • Analysis of the datasets with respect to the 3-layered data structure:
          • About 28% of the data instances are defined in more than 1 data source (2.2 on average).
          • About 33% of the instances have dependencies on 1 or more parent instances (in-degree; 5 on average).
      • Implications for the additional space complexity on the DyLDO datasets:
          • SLIs using a complex schema element require storing 33% more data.
          • SLIs using only simple schema elements require storing 15% more data.
      • Implications for the update time complexity on the DyLDO datasets:
          • Updates: ς + τ * k + 5 * (ς + τ * k) (TermPicker: 18)
          • Re-computation: v * (ς + τ * k) (TermPicker: 21,000,000)
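The two TermPicker figures can be reproduced with a short calculation. The sketch below plugs in the TermPicker parameters from the earlier slide (ς = 2, τ = 1, k = 1), the average in-degree of 5, and about 7 million instances; the function and variable names are hypothetical.

```python
def elements_per_instance(simple, complex_, k):
    """Schema elements touched per instance: ς + τ * k."""
    return simple + complex_ * k


def update_cost(simple, complex_, k, in_degree):
    """Cost of one update: the instance itself plus its parent instances."""
    per_instance = elements_per_instance(simple, complex_, k)
    return per_instance + in_degree * per_instance


def recomputation_cost(nodes, simple, complex_, k):
    """Cost of re-computing the index from scratch: v * (ς + τ * k)."""
    return nodes * elements_per_instance(simple, complex_, k)


# TermPicker parameters as stated on the slides.
TERMPICKER = dict(simple=2, complex_=1, k=1)

update = update_cost(in_degree=5, **TERMPICKER)
recompute = recomputation_cost(nodes=7_000_000, **TERMPICKER)
print(update, recompute)  # 18 21000000
```

The same bound gives 6 * (2 + 1 * 1) = 18 for the six-node example graph used to illustrate the index size.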
  • 25. Conclusion & Future Work
      Conclusion
          1. We analyzed all different types of changes.
          2. We outlined an algorithm that can map all instance-level changes to updates on the schema-level index.
          3. Advantage: only a small number of schema elements may need to be updated compared to re-computing from scratch.
          4. Limitation: depending on the Index Model, we may need to store more data.
      Future Work
          1. Implement and empirically evaluate the performance of the incremental schema-level index algorithm on real-world datasets (DyLDO snapshots, http://swse.deri.org/dyldo/) and benchmark datasets (Berlin SPARQL Benchmark, http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/spec/index.html, and Lehigh University Benchmark, http://swat.cse.lehigh.edu/projects/lubm/).
          2. Reduce the data overhead by changing the Update Coordinator data structure, e.g., by using approximate data structures such as Bloom filters.
  • 26. Search Engine Prototype: LODatio+ (http://lodatio.informatik.uni-kiel.de)
  • 29. Ongoing Work
      Data Integration Service:
          1. Index the LOD cloud.
          2. Identify the relevant sources.
          3. Harvest the relevant sources.
          4. Crawl & harvest additional relevant sources.
      [Figure labels: LOD Crawler, FLuID Framework, Schema-level Index, Query, Index Discovery System, Harvey Framework, JSON-Mapping, Focused Crawler, Data cloud]
  • 30. Thank you for your attention! Any questions? Project consortium and funding agency: MOVING is funded by the EU Horizon 2020 Programme under the project number INSO-4-2015: 693092.
  • 31. References
      1. Benedetti, F., Bergamaschi, S., Po, L.: Exposing the underlying schema of LOD sources. In Joint IEEE/WIC/ACM WI and IAT, 2015.
      2. Blume, T., Scherp, A.: Towards flexible indices for distributed graph data: The formal schema-level index model FLuID. In Foundations of Databases. CEUR-WS.org, 2018.
      3. Ciglan, M., Nørvåg, K., Hluchý, L.: The SemSets model for ad-hoc semantic list search. In WWW, 2012.
      4. Galke, L., Mai, F., Schelten, A., Brunsch, D., Scherp, A.: Using titles vs. full-text as source for automated semantic document annotation. In K-CAP, 2017.
      5. Galke, L., Saleh, A., Scherp, A.: Evaluating the Impact of Word Embeddings on Similarity Scoring in Practical Information Retrieval. In INFORMATIK, 2017.
      6. Goldman, R., Widom, J.: DataGuides: Enabling query formulation and optimization in semistructured databases. In VLDB, 1997.
      7. Gómez, S.N., Etcheverry, L., Marotta, A., Consens, M.P.: Findings from two decades of research on schema discovery using a systematic literature review. In AMW. CEUR-WS.org, 2018.
      8. Kansal, A., Spezzano, F.: A scalable graph-coarsening based index for dynamic graph databases. In CIKM, 2017.
      9. Konrath, M., Gottron, T., Staab, S., Scherp, A.: SchemEX - efficient construction of a data catalogue by stream-based indexing of Linked Data. In J. Web Sem., 16:52–58, 2012.
      10. Käfer, T., Abdelrahman, A., Umbrich, J., O'Byrne, P., Hogan, A.: Observing linked data dynamics. In ESWC. Springer, 2013.
      11. McHugh, J., Abiteboul, S., Goldman, R., Quass, D., Widom, J.: Lore: a database management system for semistructured data. In SIGMOD Record, 26(3):54–66, 1997.
  • 32. References (continued)
      12. Neumann, T., Moerkotte, G.: Characteristic sets: Accurate cardinality estimation for RDF queries with multiple joins. In ICDE, 2011.
      13. Qiao, M., Zhang, H., Cheng, H.: Subgraph matching: on compression and computation. In PVLDB, 11(2):176–188, 2017.
      14. Sakr, S., Al-Naymat, G.: Graph indexing and querying: a review. In J. of Web Inf. Sys., 6(2):101–120, 2010.
      15. Schaible, J., Gottron, T., Scherp, A.: TermPicker: Enabling the reuse of vocabulary terms by exploiting data from the Linked Open Data cloud. In ESWC, 2016.
      16. Spahiu, B., Porrini, R., Palmonari, M., Rula, A., Maurino, A.: ABSTAT: ontology-driven Linked Data summaries with pattern minimalization. In ESWC Satellite Events, Revised Selected Papers, 2016.
      17. Wang, L., Zhang, S., Shi, J., Jiao, L., Hassanzadeh, O., Zou, J., Wang, C.: Schema management for document stores. In PVLDB, 8(9):922–933, 2015.
      18. Yuan, D., Mitra, P., Yu, H., Giles, C.L.: Iterative graph feature mining for graph indexing. In ICDE, 2012.
      19. Yuan, D., Mitra, P., Yu, H., Giles, C.L.: Updating graph indices with a one-pass algorithm. In SIGMOD, 2015.

Editor's Notes

  1–5. In the context of the European H2020 project MOVING, we developed an Information Retrieval system.
  6. The first challenge we faced: lots of variation in the data schema. Solution: see the next slide.
  7. Challenge 2: efficiently compute concrete indices.
  8. Colors indicate partitions on the data graph
  9–13. P1 = rdf:type, P2 = dct:creator, P3 = owl:sameAs, P4 = rdfs:subClassOf
  14. There are more features available, such as ontology reasoning, but they do not impact the update complexity.
  15. For each instance, one type of Schema Element is linked (Index Model). Each instance is defined in at least one data source (Data Graph). Each instance can have dependencies on zero or more instances due to complex schema elements (Data Graph).