SlideShare a Scribd company logo
1 of 23
Incremental Export of Relational
Database Contents into RDF Graphs
Nikolaos Konstantinou, Dimitris Kouis, Nikolas Mitrou
By Dr. Nikolaos Konstantinou
National Technical University of Athens
School of Electrical and Computer Engineering
Multimedia, Communications & Web Technologies
4th International Conference on Web Intelligence, Mining and Semantics
Outline
• Introduction
• Proposed Approach and Measurements
• Discussion and Conclusions
24th International Conference on Web Intelligence, Mining and Semantics
Introduction
• Information collection, maintenance and update is not always taking
place directly at a triplestore, but at a RDBMS
• Triplestores are often kept as an alternative content delivery channel
• It can be difficult to change established methodologies and systems
• Newer technologies need to operate side-by-side to existing ones
before migration
34th International Conference on Web Intelligence, Mining and Semantics
Mapping Relational Data to RDF
• No one-size-fits-all approach
• Synchronous Vs Asynchronous RDF Views
• Real-time Vs Ad hoc RDF Views
• Real-time SPARQL-to-SQL Vs Querying the RDF dump using SPARQL
• Queries on the RDF dump are faster in certain conditions, compared to
round-trips to the database
• Difference in the performance more visible when SPARQL queries involve numerous
triple patterns (which translate to expensive JOIN statements)
• In this paper, we focus on the asynchronous approach
• Exporting (dumping) relational database contents into an RDF graph
44th International Conference on Web Intelligence, Mining and Semantics
Incremental Export into RDF (1/2)
• Problem
• Avoid dumping the whole database contents every time
• In cases when few data change in the source database, it is not necessary to
dump the entire database
• Approach
• Every time the RDF export is materialized
• Detect the changes in the source database or the mapping definition
• Insert/delete/update only the necessary triples, in order to reflect these changes in the
resulting RDF graph
54th International Conference on Web Intelligence, Mining and Semantics
Incremental Export into RDF (2/2)
• Incremental transformation
• Each time the transformation is executed, not all of the initial information
that lies in the database should be transformed into RDF, but only the one
that changed
• Incremental storage
• Storing (persisting) to the destination RDF graph only the triples that were
modified and not the whole graph
• Only when the resulting RDF graph is stored in a relational database or using
Jena TDB
• Regardless to whether the transformation took place fully or incrementally
64th International Conference on Web Intelligence, Mining and Semantics
R2RML and Triples Maps
• RDB to RDF Mapping Language
• A W3C Recommendation, as of 2012
• Triples Map: a reusable mapping
definition
• Specifies a rule for translating each row of
a logical table to zero or more RDF triples
• A logical table is a tabular SQL query result
set that is to be mapped to RDF triples
• Execution of a triples map generates the
triples that originate from a specific result
set (logical table)
74th International Conference on Web Intelligence, Mining and Semantics
The R2RML Parser
• An R2RML implementation
• Command-line tool that can export relational database contents as
RDF graphs, based on an R2RML mapping document
• Open-source (CC BY-NC), written in Java
• Publicly available at https://github.com/nkons/r2rml-parser
• Tested against MySQL and PostgreSQL
• Output can be written in RDF/OWL
• N3, Turtle, N-Triple, TTL, RDF/XML notation, or Jena TDB backend
• Covers most (not all) of the R2RML constructs (see the wiki)
• Does not offer SPARQL-to-SQL translations
84th International Conference on Web Intelligence, Mining and Semantics
Outline
• Introduction
• Proposed Approach and Measurements
• Discussion and Conclusions
94th International Conference on Web Intelligence, Mining and Semantics
Information Flow
• Parse the source database contents into result sets
• According to the R2RML Mapping File, the Parser generates a set of
instructions to the Generator
• The Generator instantiates in-memory the resulting RDF graph
• Persist the generated RDF graph into
• An RDF file in the Hard Disk, or
• In Jena’s relational database, or
• In Jena’s TDB (Tuple Data Base, a custom implementation Of B+ Trees)
• Log the results
Parser Generator
Mapping
fileSource database
RDF graph Hard Disk
Target database
TDB
R2RML Parser
104th International Conference on Web Intelligence, Mining and Semantics
Incremental RDF Triple Generation
• Basic challenge
• Discover, since the last time the incremental RDF
generation took place
• Which database tuples were modified
• Which Triples Maps were modified
• Then, perform the mapping only for this altered subset
• Ideally, we should detect the exact changed
database cells and modify only the respectively
generated elements in the RDF graph
• However, using R2RML, the atom of the mapping
definition becomes the Triples Map
a b c d e
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
s p O
. . .
. . .
. . .
.
. . .
. . .
. . .
. . .
. . .
. . .
Triples Map
4th International Conference on Web Intelligence, Mining and Semantics
Result
set
Triples
11
Keeping Track
• Reification
• Allows assertions about RDF statements
• “Reified” model
• A model that contains only reified
statements
• Stores the Triples Map URI that produced
each triple
• Logging
• Store MD5 hashes of
• Triples Maps, SELECT queries, Result sets
• A change in any of the hashes triggers
execution of the Triples Map
subject property object
subject
property
object
blank
node
rdf:subject
rdf:type
rdf:predicate
rdf:object
rdf:Statement
dc:source Triple Map URI
<http://data.example.org/repository/person/1>
foaf:name "John Smith" .
[] a rdf:Statement ;
rdf:subject
<http://data.example.org/repository/person/1> ;
rdf:predicate foaf:name ;
rdf:object "John Smith" ;
dc:source map:persons .
Proposed Approach
• For each Triples Map in the Mapping Document
• Decide whether we have to produce the resulting triples, based on the logged
MD5 hashes
• Dumping to the Hard Disk
• Initially, generate a number of RDF triples
• RDF triples are logged as reified statements, followed by a provenance note
• Incremental generation
• In subsequent executions, modify the existing reified model, by reflecting only the
changes in the source database
• Dumping to the Database or TDB
• No log is needed, storage is incremental by default
134th International Conference on Web Intelligence, Mining and Semantics
Measurements Setup
• An Ubuntu server, 2GHz dual-core, 4GB RAM
• Oracle Java 1.7, Postgresql 9.1, Mysql 5.5.32
• 7 DSpace (dspace.org) repositories
• 1k, 5k, 10k, 50k, 100k, 500k, 1m items, respectively
• A set of complicated, a set of simplified, and a set of simple queries
• In order to deal with database caching effects, the queries were run several
times, prior to performing the measurements
144th International Conference on Web Intelligence, Mining and Semantics
Query Sets
• Complicated
• 3 expensive JOIN conditions
among 4 tables
• 4 WHERE clauses
• Simplified
• 2 JOIN conditions among 3
tables
• 2 WHERE clauses
• Simple
• No JOIN or WHERE conditions
SELECT i.item_id AS item_id, mv.text_value AS text_value
FROM item AS i, metadatavalue AS mv,
metadataschemaregistry
AS msr, metadatafieldregistry AS mfr WHERE
msr.metadata_schema_id=mfr.metadata_schema_id AND
mfr.metadata_field_id=mv.metadata_field_id AND
mv.text_value is not null AND
i.item_id=mv.item_id AND
msr.namespace='http://dublincore.org/documents/dcmi-
terms/'
AND mfr.element='coverage'
AND mfr.qualifier='spatial'
SELECT i.item_id AS item_id, mv.text_value AS text_value
FROM item AS i, metadatavalue AS mv,
metadatafieldregistry AS mfr WHERE
mfr.metadata_field_id=mv.metadata_field_id AND
i.item_id=mv.item_id AND
mfr.element='coverage' AND
mfr.qualifier='spatial'
SELECT "language", "netid", "phone",
"sub_frequency","last_active", "self_registered",
"require_certificate", "can_log_in", "lastname",
"firstname", "digest_algorithm", "salt", "password",
"email", "eperson_id"
FROM "eperson" ORDER BY "language"
Measurements (1/3)
• Export to the Hard Disk
• Simple and complicated queries,
initial export
• Initial incremental dumps take
more time than non-incremental,
as the reified model also has to be
created
0
100
200
300
400
500
600
700
1000 5000 10000
source database rows
non-ncremental
incemental
0
5
10
15
20
25
30
35
40
3000 15000 30000 150000 300000
result model triples
non-incremental
incremental
164th International Conference on Web Intelligence, Mining and Semantics
Measurements (2/3)
• 12 Triples Maps
a. non-incremental mapping
transformation
b. incremental, for the initial time
c. 0/12
d. 1/12
e. 3/12
f. 6/12
g. 9/12
h. 12/12
0
100
200
300
400
500
600
700
a b c d e f g h
1000 items 5000 items 10000 items
0
50
100
150
200
250
300
350
a b c d e f g h
complicated mappings simpler mappings
174th International Conference on Web Intelligence, Mining and Semantics
Data change
Measurements (3/3)
• Similar overall behavior in cases of up
to 3 million triples
• Poor relational database performance
• Jena TDB is the optimal approach
regarding scalability
0
100
200
300
400
500
600
700
800
900
a b c d e f g h
RDF file database jena TDB
0
1000
2000
3000
4000
5000
6000
7000
8000
9000
b c d e f g h
database jena TDB
184th International Conference on Web Intelligence, Mining and Semantics
~180k triples
~1.8m triples
Outline
• Introduction
• Proposed Approach and Measurements
• Discussion and Conclusions
194th International Conference on Web Intelligence, Mining and Semantics
Discussion (1/2)
• The approach is efficient when data freshness is not crucial and/or
selection queries over the contents are more frequent than the updates
• The task of exposing database contents as RDF could be considered
similar to the task of maintaining search indexes next to text content
• Third party software systems can operate completely based on the
exported graph
• E.g. using Fuseki, Sesame, Virtuoso
• Updates can be pushed or pulled from the database
• TDB is the optimal solution regarding scaling
• Caution is still needed in producing de-referenceable URIs
204th International Conference on Web Intelligence, Mining and Semantics
Discussion (2/2)
• On the efficiency of the approach for storing on the Hard Disk
• Good results for mappings (or queries) that include (or lead to) expensive SQL
queries
• E.g. with numerous JOIN statements
• For changes that can affect as much as ¾ of the source data
• Limitations
• By physical memory
• Scales up to several millions of triples, does not qualify as “Big Data”
• Formatting of the logged reified model did affect performance
• RDF/XML and TTL try to pretty-print the result, consuming extra resources, N-TRIPLES is
the optimal
214th International Conference on Web Intelligence, Mining and Semantics
Room for Improvement
• Hashing Result sets is expensive
• Requires re-run of the query, adds an “expensive” ORDER BY clause
• Further study the impact of SQL complexity on the performance
• Reification is currently being reconsidered in RDF 1.1 semantics
• Named graphs being the successor
• Investigation of two-way updates
• Send changes from the triplestore back to the database
224th International Conference on Web Intelligence, Mining and Semantics
Thank you for your attention!
Questions?
234th International Conference on Web Intelligence, Mining and Semantics

More Related Content

What's hot

Apache Spark — Fundamentals and MLlib
Apache Spark — Fundamentals and MLlibApache Spark — Fundamentals and MLlib
Apache Spark — Fundamentals and MLlibJens Fisseler, Dr.
 
Spark Sql and DataFrame
Spark Sql and DataFrameSpark Sql and DataFrame
Spark Sql and DataFramePrashant Gupta
 
Map Reduce Execution Architecture
Map Reduce Execution Architecture Map Reduce Execution Architecture
Map Reduce Execution Architecture Rupak Roy
 
Presentation data collection and gtfs
Presentation data collection and gtfsPresentation data collection and gtfs
Presentation data collection and gtfsLIFE GreenYourMove
 
HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...
HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...
HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...Kyong-Ha Lee
 
LIFE GreenYourMove Project - GTFS data
LIFE GreenYourMove Project - GTFS data LIFE GreenYourMove Project - GTFS data
LIFE GreenYourMove Project - GTFS data LIFE GreenYourMove
 
Study Notes: Apache Spark
Study Notes: Apache SparkStudy Notes: Apache Spark
Study Notes: Apache SparkGao Yunzhong
 
A Workshop on R
A Workshop on RA Workshop on R
A Workshop on RAjay Ohri
 
Hadoop MapReduce framework - Module 3
Hadoop MapReduce framework - Module 3Hadoop MapReduce framework - Module 3
Hadoop MapReduce framework - Module 3Rohit Agrawal
 
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...Blake Regalia
 
Bitmap Indexes for Relational XML Twig Query Processing
Bitmap Indexes for Relational XML Twig Query ProcessingBitmap Indexes for Relational XML Twig Query Processing
Bitmap Indexes for Relational XML Twig Query ProcessingKyong-Ha Lee
 
Bringing OpenClinica Data into SAS
Bringing OpenClinica Data into SASBringing OpenClinica Data into SAS
Bringing OpenClinica Data into SASRick Watts
 
Graph basedrdf storeforapachecassandra
Graph basedrdf storeforapachecassandraGraph basedrdf storeforapachecassandra
Graph basedrdf storeforapachecassandraRavindra Ranwala
 
Scalable Algorithm Design with MapReduce
Scalable Algorithm Design with MapReduceScalable Algorithm Design with MapReduce
Scalable Algorithm Design with MapReducePietro Michiardi
 
Hadoop in sigmod 2011
Hadoop in sigmod 2011Hadoop in sigmod 2011
Hadoop in sigmod 2011Bin Cai
 

What's hot (20)

Apache Spark — Fundamentals and MLlib
Apache Spark — Fundamentals and MLlibApache Spark — Fundamentals and MLlib
Apache Spark — Fundamentals and MLlib
 
Data Interoperability
Data InteroperabilityData Interoperability
Data Interoperability
 
NetCDF and HDF5
NetCDF and HDF5NetCDF and HDF5
NetCDF and HDF5
 
Status of HDF-EOS, Related Software and Tools
Status of HDF-EOS, Related Software and ToolsStatus of HDF-EOS, Related Software and Tools
Status of HDF-EOS, Related Software and Tools
 
Spark Sql and DataFrame
Spark Sql and DataFrameSpark Sql and DataFrame
Spark Sql and DataFrame
 
Map Reduce Execution Architecture
Map Reduce Execution Architecture Map Reduce Execution Architecture
Map Reduce Execution Architecture
 
Presentation data collection and gtfs
Presentation data collection and gtfsPresentation data collection and gtfs
Presentation data collection and gtfs
 
HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...
HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...
HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...
 
Unit 2
Unit 2Unit 2
Unit 2
 
LIFE GreenYourMove Project - GTFS data
LIFE GreenYourMove Project - GTFS data LIFE GreenYourMove Project - GTFS data
LIFE GreenYourMove Project - GTFS data
 
Study Notes: Apache Spark
Study Notes: Apache SparkStudy Notes: Apache Spark
Study Notes: Apache Spark
 
A Workshop on R
A Workshop on RA Workshop on R
A Workshop on R
 
Hadoop MapReduce framework - Module 3
Hadoop MapReduce framework - Module 3Hadoop MapReduce framework - Module 3
Hadoop MapReduce framework - Module 3
 
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
 
Bitmap Indexes for Relational XML Twig Query Processing
Bitmap Indexes for Relational XML Twig Query ProcessingBitmap Indexes for Relational XML Twig Query Processing
Bitmap Indexes for Relational XML Twig Query Processing
 
Bringing OpenClinica Data into SAS
Bringing OpenClinica Data into SASBringing OpenClinica Data into SAS
Bringing OpenClinica Data into SAS
 
Graph basedrdf storeforapachecassandra
Graph basedrdf storeforapachecassandraGraph basedrdf storeforapachecassandra
Graph basedrdf storeforapachecassandra
 
Scalable Algorithm Design with MapReduce
Scalable Algorithm Design with MapReduceScalable Algorithm Design with MapReduce
Scalable Algorithm Design with MapReduce
 
Hadoop in sigmod 2011
Hadoop in sigmod 2011Hadoop in sigmod 2011
Hadoop in sigmod 2011
 
Timbuctoo 2 EASY
Timbuctoo 2 EASYTimbuctoo 2 EASY
Timbuctoo 2 EASY
 

Viewers also liked

Materializing the Web of Linked Data
Materializing the Web of Linked DataMaterializing the Web of Linked Data
Materializing the Web of Linked DataNikolaos Konstantinou
 
Deploying Linked Open Data: Methodologies and Software Tools
Deploying Linked Open Data: Methodologies and Software ToolsDeploying Linked Open Data: Methodologies and Software Tools
Deploying Linked Open Data: Methodologies and Software ToolsNikolaos Konstantinou
 
Introduction: Linked Data and the Semantic Web
Introduction: Linked Data and the Semantic WebIntroduction: Linked Data and the Semantic Web
Introduction: Linked Data and the Semantic WebNikolaos Konstantinou
 
Creating Linked Data from Relational Databases
Creating Linked Data from Relational DatabasesCreating Linked Data from Relational Databases
Creating Linked Data from Relational DatabasesNikolaos Konstantinou
 
Publishing and Using Linked Data
Publishing and Using Linked DataPublishing and Using Linked Data
Publishing and Using Linked Dataostephens
 
Learning to assess Linked Data relationships using Genetic Programming
Learning to assess Linked Data relationships using Genetic ProgrammingLearning to assess Linked Data relationships using Genetic Programming
Learning to assess Linked Data relationships using Genetic ProgrammingVrije Universiteit Amsterdam
 
Linked Open Data Principles, benefits of LOD for sustainable development
Linked Open Data Principles, benefits of LOD for sustainable developmentLinked Open Data Principles, benefits of LOD for sustainable development
Linked Open Data Principles, benefits of LOD for sustainable developmentMartin Kaltenböck
 
Generating Linked Data in Real-time from Sensor Data Streams
Generating Linked Data in Real-time from Sensor Data StreamsGenerating Linked Data in Real-time from Sensor Data Streams
Generating Linked Data in Real-time from Sensor Data StreamsNikolaos Konstantinou
 
Linking KOS Data [using SKOS and OWL2]
Linking KOS Data [using SKOS and OWL2]Linking KOS Data [using SKOS and OWL2]
Linking KOS Data [using SKOS and OWL2]Marcia Zeng
 
Entity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and EvaluationEntity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and EvaluationFaegheh Hasibi
 
From Research to Innovation: Linked Open Data and Gamification to Design Inte...
From Research to Innovation: Linked Open Data and Gamification to Design Inte...From Research to Innovation: Linked Open Data and Gamification to Design Inte...
From Research to Innovation: Linked Open Data and Gamification to Design Inte...Ig Bittencourt
 
Linked Data tutorial at Semtech 2012
Linked Data tutorial at Semtech 2012Linked Data tutorial at Semtech 2012
Linked Data tutorial at Semtech 2012Juan Sequeda
 
Introduction to Linked Data 1/5
Introduction to Linked Data 1/5Introduction to Linked Data 1/5
Introduction to Linked Data 1/5Juan Sequeda
 
Consuming Linked Data 4/5 Semtech2011
Consuming Linked Data 4/5 Semtech2011Consuming Linked Data 4/5 Semtech2011
Consuming Linked Data 4/5 Semtech2011Juan Sequeda
 
Méthodes et outils pour interrelier le web des données
Méthodes et outils pour interrelier le web des donnéesMéthodes et outils pour interrelier le web des données
Méthodes et outils pour interrelier le web des donnéesFrançois Scharffe
 
RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031kwangsub kim
 

Viewers also liked (20)

Materializing the Web of Linked Data
Materializing the Web of Linked DataMaterializing the Web of Linked Data
Materializing the Web of Linked Data
 
Conclusions: Summary and Outlook
Conclusions: Summary and OutlookConclusions: Summary and Outlook
Conclusions: Summary and Outlook
 
Technical Background
Technical BackgroundTechnical Background
Technical Background
 
Deploying Linked Open Data: Methodologies and Software Tools
Deploying Linked Open Data: Methodologies and Software ToolsDeploying Linked Open Data: Methodologies and Software Tools
Deploying Linked Open Data: Methodologies and Software Tools
 
Introduction: Linked Data and the Semantic Web
Introduction: Linked Data and the Semantic WebIntroduction: Linked Data and the Semantic Web
Introduction: Linked Data and the Semantic Web
 
Creating Linked Data from Relational Databases
Creating Linked Data from Relational DatabasesCreating Linked Data from Relational Databases
Creating Linked Data from Relational Databases
 
Publishing and Using Linked Data
Publishing and Using Linked DataPublishing and Using Linked Data
Publishing and Using Linked Data
 
Learning to assess Linked Data relationships using Genetic Programming
Learning to assess Linked Data relationships using Genetic ProgrammingLearning to assess Linked Data relationships using Genetic Programming
Learning to assess Linked Data relationships using Genetic Programming
 
Linked Open Data Principles, benefits of LOD for sustainable development
Linked Open Data Principles, benefits of LOD for sustainable developmentLinked Open Data Principles, benefits of LOD for sustainable development
Linked Open Data Principles, benefits of LOD for sustainable development
 
Generating Linked Data in Real-time from Sensor Data Streams
Generating Linked Data in Real-time from Sensor Data StreamsGenerating Linked Data in Real-time from Sensor Data Streams
Generating Linked Data in Real-time from Sensor Data Streams
 
RDF Streams and Continuous SPARQL (C-SPARQL)
RDF Streams and Continuous SPARQL (C-SPARQL)RDF Streams and Continuous SPARQL (C-SPARQL)
RDF Streams and Continuous SPARQL (C-SPARQL)
 
Linking KOS Data [using SKOS and OWL2]
Linking KOS Data [using SKOS and OWL2]Linking KOS Data [using SKOS and OWL2]
Linking KOS Data [using SKOS and OWL2]
 
Publishing Linked Data from RDB
Publishing Linked Data from RDBPublishing Linked Data from RDB
Publishing Linked Data from RDB
 
Entity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and EvaluationEntity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and Evaluation
 
From Research to Innovation: Linked Open Data and Gamification to Design Inte...
From Research to Innovation: Linked Open Data and Gamification to Design Inte...From Research to Innovation: Linked Open Data and Gamification to Design Inte...
From Research to Innovation: Linked Open Data and Gamification to Design Inte...
 
Linked Data tutorial at Semtech 2012
Linked Data tutorial at Semtech 2012Linked Data tutorial at Semtech 2012
Linked Data tutorial at Semtech 2012
 
Introduction to Linked Data 1/5
Introduction to Linked Data 1/5Introduction to Linked Data 1/5
Introduction to Linked Data 1/5
 
Consuming Linked Data 4/5 Semtech2011
Consuming Linked Data 4/5 Semtech2011Consuming Linked Data 4/5 Semtech2011
Consuming Linked Data 4/5 Semtech2011
 
Méthodes et outils pour interrelier le web des données
Méthodes et outils pour interrelier le web des donnéesMéthodes et outils pour interrelier le web des données
Méthodes et outils pour interrelier le web des données
 
RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031
 

Similar to Incremental Export of Relational Database Contents into RDF Graphs

ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...eswcsummerschool
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataGiorgos Santipantakis
 
11. From Hadoop to Spark 1:2
11. From Hadoop to Spark 1:211. From Hadoop to Spark 1:2
11. From Hadoop to Spark 1:2Fabio Fumarola
 
(DAT203) Building Graph Databases on AWS
(DAT203) Building Graph Databases on AWS(DAT203) Building Graph Databases on AWS
(DAT203) Building Graph Databases on AWSAmazon Web Services
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingPlanetData Network of Excellence
 
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...Oscar Corcho
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQLDon Demcsak
 
Apache Spark: The Next Gen toolset for Big Data Processing
Apache Spark: The Next Gen toolset for Big Data ProcessingApache Spark: The Next Gen toolset for Big Data Processing
Apache Spark: The Next Gen toolset for Big Data Processingprajods
 
Apache Geode Meetup, Cork, Ireland at CIT
Apache Geode Meetup, Cork, Ireland at CITApache Geode Meetup, Cork, Ireland at CIT
Apache Geode Meetup, Cork, Ireland at CITApache Geode
 
PEARC 17: Spark On the ARC
PEARC 17: Spark On the ARCPEARC 17: Spark On the ARC
PEARC 17: Spark On the ARCHimanshu Bedi
 
Web analytics at scale with Druid at naver.com
Web analytics at scale with Druid at naver.comWeb analytics at scale with Druid at naver.com
Web analytics at scale with Druid at naver.comJungsu Heo
 
Training Slides: Basics 103: The Power of Tungsten Connector / Proxy
Training Slides: Basics 103: The Power of Tungsten Connector / ProxyTraining Slides: Basics 103: The Power of Tungsten Connector / Proxy
Training Slides: Basics 103: The Power of Tungsten Connector / ProxyContinuent
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesDavid Martínez Rego
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudOntotext
 
Semantic Web use cases in outcomes research
Semantic Web use cases in outcomes researchSemantic Web use cases in outcomes research
Semantic Web use cases in outcomes researchChimezie Ogbuji
 
RDB2RDF, an overview of R2RML and Direct Mapping
RDB2RDF, an overview of R2RML and Direct MappingRDB2RDF, an overview of R2RML and Direct Mapping
RDB2RDF, an overview of R2RML and Direct MappingBoris Villazón-Terrazas
 
Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)Anthony Baker
 

Similar to Incremental Export of Relational Database Contents into RDF Graphs (20)

ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
11. From Hadoop to Spark 1:2
11. From Hadoop to Spark 1:211. From Hadoop to Spark 1:2
11. From Hadoop to Spark 1:2
 
(DAT203) Building Graph Databases on AWS
(DAT203) Building Graph Databases on AWS(DAT203) Building Graph Databases on AWS
(DAT203) Building Graph Databases on AWS
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
 
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQL
 
Apache Spark: The Next Gen toolset for Big Data Processing
Apache Spark: The Next Gen toolset for Big Data ProcessingApache Spark: The Next Gen toolset for Big Data Processing
Apache Spark: The Next Gen toolset for Big Data Processing
 
Hadoop
HadoopHadoop
Hadoop
 
Apache Geode Meetup, Cork, Ireland at CIT
Apache Geode Meetup, Cork, Ireland at CITApache Geode Meetup, Cork, Ireland at CIT
Apache Geode Meetup, Cork, Ireland at CIT
 
PEARC 17: Spark On the ARC
PEARC 17: Spark On the ARCPEARC 17: Spark On the ARC
PEARC 17: Spark On the ARC
 
Web analytics at scale with Druid at naver.com
Web analytics at scale with Druid at naver.comWeb analytics at scale with Druid at naver.com
Web analytics at scale with Druid at naver.com
 
RDFauthor (EKAW)
RDFauthor (EKAW)RDFauthor (EKAW)
RDFauthor (EKAW)
 
Training Slides: Basics 103: The Power of Tungsten Connector / Proxy
Training Slides: Basics 103: The Power of Tungsten Connector / ProxyTraining Slides: Basics 103: The Power of Tungsten Connector / Proxy
Training Slides: Basics 103: The Power of Tungsten Connector / Proxy
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming Architectures
 
Spark architechure.pptx
Spark architechure.pptxSpark architechure.pptx
Spark architechure.pptx
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
 
Semantic Web use cases in outcomes research
Semantic Web use cases in outcomes researchSemantic Web use cases in outcomes research
Semantic Web use cases in outcomes research
 
RDB2RDF, an overview of R2RML and Direct Mapping
RDB2RDF, an overview of R2RML and Direct MappingRDB2RDF, an overview of R2RML and Direct Mapping
RDB2RDF, an overview of R2RML and Direct Mapping
 
Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)
 

More from Nikolaos Konstantinou

Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Nikolaos Konstantinou
 
Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...
Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...
Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...Nikolaos Konstantinou
 
Διαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματα
Διαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματαΔιαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματα
Διαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματαNikolaos Konstantinou
 
Publishing Data Using Semantic Web Technologies
Publishing Data Using Semantic Web TechnologiesPublishing Data Using Semantic Web Technologies
Publishing Data Using Semantic Web TechnologiesNikolaos Konstantinou
 
From Sensor Data to Triples: Information Flow in Semantic Sensor Networks
From Sensor Data to Triples: Information Flow in Semantic Sensor NetworksFrom Sensor Data to Triples: Information Flow in Semantic Sensor Networks
From Sensor Data to Triples: Information Flow in Semantic Sensor NetworksNikolaos Konstantinou
 
A rule-based approach for the real-time semantic annotation in context-aware ...
A rule-based approach for the real-time semantic annotation in context-aware ...A rule-based approach for the real-time semantic annotation in context-aware ...
A rule-based approach for the real-time semantic annotation in context-aware ...Nikolaos Konstantinou
 
VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...
VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...
VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...Nikolaos Konstantinou
 

More from Nikolaos Konstantinou (8)

Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...
 
OR2012 Biblio-transformation-engine
OR2012 Biblio-transformation-engineOR2012 Biblio-transformation-engine
OR2012 Biblio-transformation-engine
 
Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...
Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...
Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...
 
Διαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματα
Διαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματαΔιαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματα
Διαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματα
 
Publishing Data Using Semantic Web Technologies
Publishing Data Using Semantic Web TechnologiesPublishing Data Using Semantic Web Technologies
Publishing Data Using Semantic Web Technologies
 
From Sensor Data to Triples: Information Flow in Semantic Sensor Networks
From Sensor Data to Triples: Information Flow in Semantic Sensor NetworksFrom Sensor Data to Triples: Information Flow in Semantic Sensor Networks
From Sensor Data to Triples: Information Flow in Semantic Sensor Networks
 
A rule-based approach for the real-time semantic annotation in context-aware ...
A rule-based approach for the real-time semantic annotation in context-aware ...A rule-based approach for the real-time semantic annotation in context-aware ...
A rule-based approach for the real-time semantic annotation in context-aware ...
 
VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...
VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...
VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...
 

Recently uploaded

Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PPRINCE C P
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxSwapnil Therkar
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 

Recently uploaded (20)

Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 

Incremental Export of Relational Database Contents into RDF Graphs

  • 1. Incremental Export of Relational Database Contents into RDF Graphs Nikolaos Konstantinou, Dimitris Kouis, Nikolas Mitrou By Dr. Nikolaos Konstantinou National Technical University of Athens School of Electrical and Computer Engineering Multimedia, Communications & Web Technologies 4th International Conference on Web Intelligence, Mining and Semantics
  • 2. Outline • Introduction • Proposed Approach and Measurements • Discussion and Conclusions 24th International Conference on Web Intelligence, Mining and Semantics
  • 3. Introduction • Information collection, maintenance and update is not always taking place directly at a triplestore, but at a RDBMS • Triplestores are often kept as an alternative content delivery channel • It can be difficult to change established methodologies and systems • Newer technologies need to operate side-by-side to existing ones before migration 34th International Conference on Web Intelligence, Mining and Semantics
  • 4. Mapping Relational Data to RDF • No one-size-fits-all approach • Synchronous Vs Asynchronous RDF Views • Real-time Vs Ad hoc RDF Views • Real-time SPARQL-to-SQL Vs Querying the RDF dump using SPARQL • Queries on the RDF dump are faster in certain conditions, compared to round-trips to the database • Difference in the performance more visible when SPARQL queries involve numerous triple patterns (which translate to expensive JOIN statements) • In this paper, we focus on the asynchronous approach • Exporting (dumping) relational database contents into an RDF graph 44th International Conference on Web Intelligence, Mining and Semantics
  • 5. Incremental Export into RDF (1/2) • Problem • Avoid dumping the whole database contents every time • In cases when few data change in the source database, it is not necessary to dump the entire database • Approach • Every time the RDF export is materialized • Detect the changes in the source database or the mapping definition • Insert/delete/update only the necessary triples, in order to reflect these changes in the resulting RDF graph 54th International Conference on Web Intelligence, Mining and Semantics
  • 6. Incremental Export into RDF (2/2) • Incremental transformation • Each time the transformation is executed, not all of the initial information that lies in the database should be transformed into RDF, but only the one that changed • Incremental storage • Storing (persisting) to the destination RDF graph only the triples that were modified and not the whole graph • Only when the resulting RDF graph is stored in a relational database or using Jena TDB • Regardless to whether the transformation took place fully or incrementally 64th International Conference on Web Intelligence, Mining and Semantics
  • 7. R2RML and Triples Maps • RDB to RDF Mapping Language • A W3C Recommendation, as of 2012 • Triples Map: a reusable mapping definition • Specifies a rule for translating each row of a logical table to zero or more RDF triples • A logical table is a tabular SQL query result set that is to be mapped to RDF triples • Execution of a triples map generates the triples that originate from a specific result set (logical table) 74th International Conference on Web Intelligence, Mining and Semantics
  • 8. The R2RML Parser • An R2RML implementation • Command-line tool that can export relational database contents as RDF graphs, based on an R2RML mapping document • Open-source (CC BY-NC), written in Java • Publicly available at https://github.com/nkons/r2rml-parser • Tested against MySQL and PostgreSQL • Output can be written in RDF/OWL • N3, Turtle, N-Triple, TTL, RDF/XML notation, or Jena TDB backend • Covers most (not all) of the R2RML constructs (see the wiki) • Does not offer SPARQL-to-SQL translations 84th International Conference on Web Intelligence, Mining and Semantics
  • 9. Outline • Introduction • Proposed Approach and Measurements • Discussion and Conclusions 94th International Conference on Web Intelligence, Mining and Semantics
  • 10. Information Flow • Parse the source database contents into result sets • According to the R2RML Mapping File, the Parser generates a set of instructions to the Generator • The Generator instantiates in-memory the resulting RDF graph • Persist the generated RDF graph into • An RDF file in the Hard Disk, or • In Jena’s relational database, or • In Jena’s TDB (Tuple Data Base, a custom implementation Of B+ Trees) • Log the results Parser Generator Mapping fileSource database RDF graph Hard Disk Target database TDB R2RML Parser 104th International Conference on Web Intelligence, Mining and Semantics
  • 11. Incremental RDF Triple Generation • Basic challenge • Discover, since the last time the incremental RDF generation took place • Which database tuples were modified • Which Triples Maps were modified • Then, perform the mapping only for this altered subset • Ideally, we should detect the exact changed database cells and modify only the respectively generated elements in the RDF graph • However, using R2RML, the atom of the mapping definition becomes the Triples Map a b c d e . . . . . . . . . . . . . . . . . . . . . . . . . s p O . . . . . . . . . . . . . . . . . . . . . . . . . . . . Triples Map 4th International Conference on Web Intelligence, Mining and Semantics Result set Triples 11
  • 12. Keeping Track • Reification • Allows assertions about RDF statements • “Reified” model • A model that contains only reified statements • Stores the Triples Map URI that produced each triple • Logging • Store MD5 hashes of • Triples Maps, SELECT queries, Result sets • A change in any of the hashes triggers execution of the Triples Map subject property object subject property object blank node rdf:subject rdf:type rdf:predicate rdf:object rdf:Statement dc:source Triple Map URI <http://data.example.org/repository/person/1> foaf:name "John Smith" . [] a rdf:Statement ; rdf:subject <http://data.example.org/repository/person/1> ; rdf:predicate foaf:name ; rdf:object "John Smith" ; dc:source map:persons .
  • 13. Proposed Approach • For each Triples Map in the Mapping Document • Decide whether we have to produce the resulting triples, based on the logged MD5 hashes • Dumping to the Hard Disk • Initially, generate a number of RDF triples • RDF triples are logged as reified statements, followed by a provenance note • Incremental generation • In subsequent executions, modify the existing reified model, by reflecting only the changes in the source database • Dumping to the Database or TDB • No log is needed, storage is incremental by default 134th International Conference on Web Intelligence, Mining and Semantics
  • 14. Measurements Setup • An Ubuntu server, 2GHz dual-core, 4GB RAM • Oracle Java 1.7, Postgresql 9.1, Mysql 5.5.32 • 7 DSpace (dspace.org) repositories • 1k, 5k, 10k, 50k, 100k, 500k, 1m items, respectively • A set of complicated, a set of simplified, and a set of simple queries • In order to deal with database caching effects, the queries were run several times, prior to performing the measurements 144th International Conference on Web Intelligence, Mining and Semantics
  • 15. Query Sets • Complicated • 3 expensive JOIN conditions among 4 tables • 4 WHERE clauses • Simplified • 2 JOIN conditions among 3 tables • 2 WHERE clauses • Simple • No JOIN or WHERE conditions SELECT i.item_id AS item_id, mv.text_value AS text_value FROM item AS i, metadatavalue AS mv, metadataschemaregistry AS msr, metadatafieldregistry AS mfr WHERE msr.metadata_schema_id=mfr.metadata_schema_id AND mfr.metadata_field_id=mv.metadata_field_id AND mv.text_value is not null AND i.item_id=mv.item_id AND msr.namespace='http://dublincore.org/documents/dcmi- terms/' AND mfr.element='coverage' AND mfr.qualifier='spatial' SELECT i.item_id AS item_id, mv.text_value AS text_value FROM item AS i, metadatavalue AS mv, metadatafieldregistry AS mfr WHERE mfr.metadata_field_id=mv.metadata_field_id AND i.item_id=mv.item_id AND mfr.element='coverage' AND mfr.qualifier='spatial' SELECT "language", "netid", "phone", "sub_frequency","last_active", "self_registered", "require_certificate", "can_log_in", "lastname", "firstname", "digest_algorithm", "salt", "password", "email", "eperson_id" FROM "eperson" ORDER BY "language"
  • 16. Measurements (1/3) • Export to the Hard Disk • Simple and complicated queries, initial export • Initial incremental dumps take more time than non-incremental, as the reified model also has to be created 0 100 200 300 400 500 600 700 1000 5000 10000 source database rows non-ncremental incemental 0 5 10 15 20 25 30 35 40 3000 15000 30000 150000 300000 result model triples non-incremental incremental 164th International Conference on Web Intelligence, Mining and Semantics
  • 17. Measurements (2/3) • 12 Triples Maps a. non-incremental mapping transformation b. incremental, for the initial time c. 0/12 d. 1/12 e. 3/12 f. 6/12 g. 9/12 h. 12/12 0 100 200 300 400 500 600 700 a b c d e f g h 1000 items 5000 items 10000 items 0 50 100 150 200 250 300 350 a b c d e f g h complicated mappings simpler mappings 174th International Conference on Web Intelligence, Mining and Semantics Data change
  • 18. Measurements (3/3) • Similar overall behavior in cases of up to 3 million triples • Poor relational database performance • Jena TDB is the optimal approach regarding scalability 0 100 200 300 400 500 600 700 800 900 a b c d e f g h RDF file database jena TDB 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 b c d e f g h database jena TDB 184th International Conference on Web Intelligence, Mining and Semantics ~180k triples ~1.8m triples
  • 19. Outline • Introduction • Proposed Approach and Measurements • Discussion and Conclusions 194th International Conference on Web Intelligence, Mining and Semantics
  • 20. Discussion (1/2) • The approach is efficient when data freshness is not crucial and/or selection queries over the contents are more frequent than the updates • The task of exposing database contents as RDF could be considered similar to the task of maintaining search indexes next to text content • Third party software systems can operate completely based on the exported graph • E.g. using Fuseki, Sesame, Virtuoso • Updates can be pushed or pulled from the database • TDB is the optimal solution regarding scaling • Caution is still needed in producing de-referenceable URIs 204th International Conference on Web Intelligence, Mining and Semantics
  • 21. Discussion (2/2) • On the efficiency of the approach for storing on the Hard Disk • Good results for mappings (or queries) that include (or lead to) expensive SQL queries • E.g. with numerous JOIN statements • For changes that can affect as much as ¾ of the source data • Limitations • By physical memory • Scales up to several millions of triples, does not qualify as “Big Data” • Formatting of the logged reified model did affect performance • RDF/XML and TTL try to pretty-print the result, consuming extra resources, N-TRIPLES is the optimal 214th International Conference on Web Intelligence, Mining and Semantics
  • 22. Room for Improvement • Hashing Result sets is expensive • Requires re-run of the query, adds an “expensive” ORDER BY clause • Further study the impact of SQL complexity on the performance • Reification is currently being reconsidered in RDF 1.1 semantics • Named graphs being the successor • Investigation of two-way updates • Send changes from the triplestore back to the database 224th International Conference on Web Intelligence, Mining and Semantics
  • 23. Thank you for your attention! Questions? 234th International Conference on Web Intelligence, Mining and Semantics