An RDF Dataset Generator
for the Social Network
Benchmark with
Real-World Coherence
Mirko Spasic, Milos Jovanovik, Arnau Prat-Pérez
Workshop on Benchmarking Linked Data (BLINK) 2016
at the International Semantic Web Conference (ISWC) 2016
18 October 2016, Kobe, Japan
Introduction
● Real-world vs Synthetic RDF datasets
● Difference in coherence - real-world RDF datasets have a lower degree of
structuredness than synthetic ones
● Modification of the existing RDF data generator for the Social Network
Benchmark (SNB)
● Virtuoso procedure for coherence measurement of RDF datasets
Data Generator for the Social Network
Benchmark
● Existing synthetic RDF data generator for SNB
○ Persons with friendship relations, posts, comments, likes, etc.
○ Models an online social network
● Reproduces the structural characteristics of a real online social network:
○ Attribute correlation
○ Degree distributions
○ Structure-Attribute correlations
○ Spiky activity volume
Dataset Coherence
● Composite metric - coherence of a dataset
● Weight coefficient (WT) - gives more weight to types with more instances and
properties
● Coverage of a type (CV) - depends on whether the instances of the type T set
a value for all of its properties
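The slides do not spell out the formulas, so the following is only a minimal sketch under an assumed, commonly used formulation: CV(T) is the sum, over the properties of T, of the number of instances of T that set the property, divided by |P(T)| · |I(T)|; WT(T) is (|P(T)| + |I(T)|) divided by the same sum over all types; coherence is the sum of WT(T) · CV(T). The function name, the "rdf:type" shorthand and the toy triples are illustrative assumptions, not taken from the presentation.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # assumed shorthand for the rdf:type predicate


def coherence(triples):
    """Return (coherence, per-type coverage) for an iterable of (s, p, o) triples."""
    types_of = defaultdict(set)  # subject -> types it is declared with
    props_of = defaultdict(set)  # subject -> properties it sets at least once
    for s, p, o in triples:
        if p == RDF_TYPE:
            types_of[s].add(o)
        else:
            props_of[s].add(p)

    instances = defaultdict(set)  # type -> its instances
    for s, ts in types_of.items():
        for t in ts:
            instances[t].add(s)

    coverage, size = {}, {}
    for t, inst in instances.items():
        props = set()  # P(T): properties used by at least one instance of T
        for s in inst:
            props |= props_of.get(s, set())
        if not props:
            continue  # a type without properties contributes nothing
        # Sum over p in P(T) of the number of instances that set p.
        occurrences = sum(len(props_of.get(s, set())) for s in inst)
        coverage[t] = occurrences / (len(props) * len(inst))
        size[t] = len(props) + len(inst)

    total = sum(size.values())
    ch = sum(size[t] / total * coverage[t] for t in coverage) if total else 0.0
    return ch, coverage


# Toy data (hypothetical): one Post lacks isLocatedIn, so Post coverage drops.
example = [
    ("post1", RDF_TYPE, "Post"), ("post1", "content", "hi"),
    ("post1", "isLocatedIn", "Japan"),
    ("post2", RDF_TYPE, "Post"), ("post2", "content", "hello"),
    ("comment1", RDF_TYPE, "Comment"), ("comment1", "content", "nice"),
]
ch, cv = coherence(example)
print(cv, round(ch, 3))
```

In this toy dataset the Post type has coverage 3/4 because one of its two instances does not set isLocatedIn, while Comment has full coverage; the weights 4/6 and 2/6 combine them into an overall coherence of 5/6.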
Measuring RDF Dataset Coherence in
Virtuoso
● Proposed workflow to compute the coherence of a dataset in a few steps:
○ Assembling all triples into a single file
○ Data cleaning and normalization
○ Generating several temporary files
○ Sorting files in different orders
○ Computing coherence by making a single pass of the sorted file
● Disadvantages: memory requirements, time, temporary space, …
● Virtuoso stored procedure
○ The system takes care of efficiency and data compression ⟹ a simpler and much more
space- and time-efficient procedure
○ Available on GitHub
A Realistic RDF Dataset Generator for the
SNB
● Most dominant types from the SNB dataset - Post and Comment
● Removal of a high number of triples related to a specific property
○ isLocatedIn
● Introducing new properties
○ gifFile
○ mentions
○ visibility
○ link
● Available on GitHub
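The two kinds of modification listed above can be illustrated with a small sketch. This is not the generator's actual code: the function names, the random sampling strategy and the fractions are hypothetical, chosen only to show how dropping triples of an existing property and adding a property to only some instances both lower the coverage of a type.

```python
import random


def thin_property(triples, prop, keep_fraction, seed=0):
    """Keep each triple with predicate `prop` with probability `keep_fraction`."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    return [t for t in triples if t[1] != prop or rng.random() < keep_fraction]


def add_sparse_property(triples, subjects, prop, value, fraction, seed=0):
    """Attach `prop` to roughly `fraction` of the given subjects."""
    rng = random.Random(seed)
    extra = [(s, prop, value) for s in subjects if rng.random() < fraction]
    return list(triples) + extra


# Hypothetical toy data: ten posts, each with content and isLocatedIn.
posts = [f"post{i}" for i in range(10)]
data = [(p, "content", "text") for p in posts]
data += [(p, "isLocatedIn", "Japan") for p in posts]

# Drop most isLocatedIn triples, then give only some posts a visibility value.
modified = thin_property(data, "isLocatedIn", keep_fraction=0.2)
modified = add_sparse_property(modified, posts, "visibility", "public", fraction=0.3)
print(len(data), len(modified))
```

Because both steps leave some instances without a value for a property that the type uses, the coverage of the type falls below 1, which is the effect the modifications aim for.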
Measurements
[Figures: Original SNB Datasets | Modified SNB Datasets]
Conclusion and Future Work
● Done: A modified RDF Datagen for SNB, with a coherence of ~0.6
○ Used general characteristics of Twitter, Facebook, Instagram, etc., to introduce
modifications to the RDF Datagen
● Done: A set of procedures in Virtuoso for coherence measurement
● Plan: Reduce coverage of other types, apart from Posts and Comments
● Plan: Address the correlation in the newly added parts
Q&A
Questions?
This work was supported by grants from the EU H2020 Framework Programme
provided for the project HOBBIT (GA no. 688227).

Dynamic planning for link discovery - ESWC 2018
 
Hobbit project overview presented at EBDVF 2017
Hobbit project overview presented at EBDVF 2017Hobbit project overview presented at EBDVF 2017
Hobbit project overview presented at EBDVF 2017
 
Leopard ISWC Semantic Web Challenge 2017
Leopard ISWC Semantic Web Challenge 2017Leopard ISWC Semantic Web Challenge 2017
Leopard ISWC Semantic Web Challenge 2017
 


An RDF Dataset Generator for the Social Network Benchmark with Real-World Coherence

  • 1. An RDF Dataset Generator for the Social Network Benchmark with Real-World Coherence
     Mirko Spasic, Milos Jovanovik, Arnau Prat-Pérez
     Workshop on Benchmarking Linked Data (BLINK) 2016 at the International Semantic Web Conference (ISWC) 2016
     18 October 2016, Kobe, Japan
  • 2. Introduction
     ● Real-world vs Synthetic RDF datasets
     ● Difference in coherence - a lower degree of structuredness in synthetic RDF datasets
     ● Modification of the existing RDF data generator for the Social Network Benchmark (SNB)
     ● Virtuoso procedure for coherence measurement of RDF datasets
  • 3. Data Generator for the Social Network Benchmark
     ● Existing synthetic RDF data generator for SNB
       ○ Persons with friendship relations, posts, comments, likes, etc.
       ○ Models an online social network
     ● Reproduces the structural characteristics of a real online social network:
       ○ Attribute correlation
       ○ Degree distributions
       ○ Structure-Attribute correlations
       ○ Spiky activity volume
  • 4. Dataset Coherence
     ● Composite metric - coherence of a dataset
     ● Weight coefficient (WT) - higher impact to types with more instances and properties
     ● Coverage of type (CV) - depends on whether the instances of the type T set a value for all its properties
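The slide names the two ingredients of the metric but does not spell out the formulas. The Python sketch below shows one way they could be combined, assuming the following definitions (an assumption, not taken from the slides): CV(T) is the number of (instance, property) pairs that are actually set, divided by |P(T)| * |I(T)|; WT(T) is (|P(T)| + |I(T)|) divided by the same sum over all types; coherence is the sum of WT(T) * CV(T) over all types. The function name `coherence`, the in-memory triple list, and the `rdf:type` string are illustrative choices only.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # illustrative predicate name for type statements


def coherence(triples):
    """Coherence of a dataset given as (subject, predicate, object) tuples.

    Assumed definitions (not stated on the slide):
      CV(T) = set (instance, property) pairs / (|P(T)| * |I(T)|)
      WT(T) = (|P(T)| + |I(T)|) / sum over all types of the same quantity
      coherence = sum over types of WT(T) * CV(T)
    """
    triples = list(triples)

    # I(T): instances of each type, taken from rdf:type statements.
    instances = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            instances[o].add(s)

    # Properties each subject actually sets (rdf:type itself is excluded).
    props_of = defaultdict(set)
    for s, p, o in triples:
        if p != RDF_TYPE:
            props_of[s].add(p)

    stats = {}
    for t, insts in instances.items():
        # P(T): every property set by at least one instance of T.
        type_props = set().union(*(props_of[s] for s in insts))
        # Number of (instance, property) pairs that are set for this type.
        filled = sum(len(props_of[s]) for s in insts)
        stats[t] = (len(type_props), len(insts), filled)

    total = sum(n_props + n_insts for n_props, n_insts, _ in stats.values())
    result = 0.0
    for n_props, n_insts, filled in stats.values():
        if n_props == 0:
            continue  # a type without properties adds no coverage
        cv = filled / (n_props * n_insts)
        wt = (n_props + n_insts) / total
        result += wt * cv
    return result


example = [
    ("post1", RDF_TYPE, "Post"),
    ("post1", "content", "hi"),
    ("post1", "isLocatedIn", "JP"),
    ("post2", RDF_TYPE, "Post"),
    ("post2", "content", "yo"),
]
print(coherence(example))  # 3 of 4 possible (instance, property) pairs are set
```

In this toy dataset the single type Post has two properties and two instances, and one instance omits isLocatedIn, so the coverage (and, with only one type, the coherence) is 3/4.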
  • 5. Measuring RDF Dataset Coherence in Virtuoso
     ● Proposed workflow to compute the coherence of a dataset in a few steps:
       ○ Assembling all triples into a single file
       ○ Data cleaning and normalization
       ○ Generating several temporary files
       ○ Sorting files in different orders
       ○ Computing coherence by making a single pass of the sorted file
     ● Disadvantages: memory requirements, time, temporary space, …
     ● Virtuoso stored procedure
       ○ System will take care of the efficacy and data compression ⟹ simpler and much more space- and time-efficient procedure
       ○ Available on GitHub
  • 6. A Realistic RDF Dataset Generator for the SNB
     ● Most dominant types from the SNB dataset - Post and Comment
     ● Removal of a high number of triples related to a specific property
       ○ isLocatedIn
     ● Introducing new properties
       ○ gifFile
       ○ mentions
       ○ visibility
       ○ link
     ● Available on GitHub
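The effect of removing triples for one property can be illustrated with a small Python sketch. It is not the authors' generator code: the function name `thin_property`, the `keep_ratio` parameter, and the fixed random seed are hypothetical, and the slides do not state what fraction of isLocatedIn triples is removed. The idea is only that dropping a property from many instances of a dominant type lowers that type's coverage.

```python
import random


def thin_property(triples, types, prop, keep_ratio, seed=0):
    """Drop a share of `prop` triples whose subject is an instance of `types`.

    `triples` is a list of (subject, predicate, object) tuples.
    `keep_ratio` (hypothetical parameter) is the probability of keeping
    each matching triple; all other triples are passed through unchanged.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Subjects that belong to one of the targeted types.
    typed = {s for s, p, o in triples if p == "rdf:type" and o in types}
    kept = []
    for s, p, o in triples:
        if p == prop and s in typed and rng.random() >= keep_ratio:
            continue  # this triple is removed
        kept.append((s, p, o))
    return kept


data = [
    ("post1", "rdf:type", "Post"),
    ("post1", "isLocatedIn", "JP"),
    ("comment1", "rdf:type", "Comment"),
    ("comment1", "isLocatedIn", "JP"),
    ("person1", "rdf:type", "Person"),
    ("person1", "isLocatedIn", "JP"),
]
# keep_ratio=0.0 removes isLocatedIn from every Post and Comment,
# while the Person triple is left untouched.
thinned = thin_property(data, {"Post", "Comment"}, "isLocatedIn", keep_ratio=0.0)
```

New optional properties such as gifFile or mentions could be attached in the same probabilistic way, so that only some instances set them.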
  • 7. Measurements (figure panels: Original SNB Datasets; Modified SNB Datasets)
  • 8. Conclusion and Future Work
     ● Done: A modified RDF Datagen for SNB, with a coherence of ~0.6
       ○ Used general characteristics of Twitter, Facebook, Instagram, etc., to introduce modifications to the RDF Datagen
     ● Done: A set of procedures in Virtuoso for coherence measurement
     ● Plan: Reduce coverage of other types, apart from Posts and Comments
     ● Plan: Address the correlation in the newly added parts
  • 10. This work was supported by grants from the EU H2020 Framework Programme provided for the project HOBBIT (GA no. 688227).