Graph Databases &
data integration
Voxxed Days Athens 2018
Dimitris Kontokostas
Senior Knowledge Engineer @GeoPhy
About me
● Data geek, software engineer & open source enthusiast
● Involved in many R&D projects since 2003
● Participate(d) in graph-related standardization activities
● PhD in knowledge extraction and quality assessment
● Working on the GeoPhy Real Estate Knowledge Graph
Agenda
● Graphs
● RDF Graphs (*)
● Semantics & why they matter (*)
● Linked Data
● Who uses RDF
● How Google uses RDF
● How we (GeoPhy) use RDF
(*)
Some concepts are simplified or skipped to make this talk easier to digest in the allocated time
Heatmap for Graph Databases
(*) See also this
Gartner study in 2013 found:
● many organizations find the variety dimension a greater challenge than volume or velocity.
Graph DBs to the rescue:
● Combine multiple sources with different structures
● Retain the flexibility to add new ones without adapting schemas
● Query combined data, or multiple sources at once
● Detect patterns in the data
© Image by Max De Marzi
Graphs
● A graph is a way of specifying relationships among a collection of items
● Items can be:
○ Nodes: Alice, Bob, …
○ Edges
■ undirected: knows, …
■ directed: follows, …
○ Attributes: name, age, type, since, ...
○ Values: 18, 2001/10/13, ...
Image source: Wikimedia Commons
Graph Data Models
Property graphs
● Industry standards
○ Cypher (mainly Neo4j)
○ Gremlin traversal API (Apache TinkerPop) => Most common
○ GraphQL
● Data import / export using Cypher, Gremlin or vendor-specific
● Usually optimized for specific operations / use cases
RDF Graphs
● W3C standards
○ Like XML, HTML: define once, run everywhere ™
● Standardised way for querying (SPARQL), exporting & importing (RDF)
Slide input from Andy Seaborne @VoxxedDays Bristol
Graph Databases Landscape
[Diagram: databases grouped as Property Graphs (Gremlin traversal API), RDF Graphs (SPARQL), and Hybrid (Gremlin API + SPARQL + Cypher)]
Property Graphs
● Each node has
○ unique identifier
○ outgoing edges
○ incoming edges
○ key-value properties collection
● Each edge has
○ unique identifier
○ direction
○ label for the relationship
○ key-value properties collection
● Extreme flexibility
RDF - Resource Description Framework
● An RDF Graph is a set of RDF Triples
● An RDF triple consists of only three components (simplified):
○ the subject which is a Thing
○ the predicate which is a (special) Thing
○ the object that can be either a Thing or a Literal (Value)
● Things are represented with URIs
● Literals have a value and a value type or a language tag (defaults to string)
Subject Predicate Object
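As a rough illustration of the triple model above, the following Python sketch stores an RDF graph as a plain set of (subject, predicate, object) tuples. The `Literal` class and the variable names are hypothetical, made up for this example and not taken from any RDF library; a real application would normally use an RDF toolkit.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """A value with an optional datatype URI or language tag (illustrative only)."""
    value: str
    datatype: Optional[str] = None
    language: Optional[str] = None


DBPEDIA = "http://dbpedia.org/resource/"
SCHEMA = "http://schema.org/"

# Things are plain URI strings; an RDF graph is simply a set of triples.
graph = set()
graph.add((DBPEDIA + "Friends", SCHEMA + "name", Literal("Friends", language="en")))
graph.add((DBPEDIA + "Friends", SCHEMA + "genre", DBPEDIA + "Sitcom"))
# Adding the same triple again changes nothing, because a graph is a set.
graph.add((DBPEDIA + "Friends", SCHEMA + "genre", DBPEDIA + "Sitcom"))

# Objects are either Things (URIs) or Literals (values).
things = {o for _, _, o in graph if isinstance(o, str)}
literals = {o for _, _, o in graph if isinstance(o, Literal)}
```

Using a set mirrors the definition on the slide: a graph has no ordering and no duplicate triples.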
RDF - Resource Description Framework
Depending on the serialization format, URIs can be abbreviated with namespaces
> just like XML
> Improves readability, e.g.
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix schema: <http://schema.org/> .
Subject Predicate Object
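The namespace abbreviation can be sketched in a few lines of Python. The `expand` and `shorten` helpers below are hypothetical names for this illustration; they only mimic what a Turtle parser or serializer does with `@prefix` declarations.

```python
# Prefix table matching the @prefix declarations on the slide.
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "schema": "http://schema.org/",
}


def expand(name, prefixes=PREFIXES):
    """Turn a prefixed name such as 'dbpedia:Friends' into a full URI."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name  # unknown prefix or already a full URI: leave untouched


def shorten(uri, prefixes=PREFIXES):
    """Abbreviate a full URI using the first matching namespace."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return prefix + ":" + uri[len(namespace):]
    return uri


full = expand("dbpedia:Friends")
short = shorten("http://schema.org/name")
```

The abbreviation is purely cosmetic: both forms denote the same identifier.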
RDF is an abstract data model
Many different serialization formats…
Turtle, N-Triples, JSON-LD, RDF/XML, RDFa, Microdata*
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
dbpedia:Friends
    schema:name "Friends"@en ;
    schema:datePublished "1994-09-22"^^xsd:date ;
    schema:numberOfSeasons 10 ;
    schema:genre dbpedia:Sitcom .
dbpedia:The_Office
    schema:name "The Office"@en ;
    schema:genre dbpedia:Sitcom .
[Fun fact]
What does RSS stand for?
Rich Site Summary but...
Original name was: RDF Site Summary
Based on first versions of RDF/XML
See https://en.wikipedia.org/wiki/RSS
You can store RDF ...
In simple (text) files,
locally, remote, HDFS, ...
Embedded web documents
In graph databases
RDF & Graphs (Separate)
/data/tvseries/metadata.ttl
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
dbpedia:Friends
    schema:numberOfSeasons 10 ;
    schema:datePublished "1994-09-22"^^xsd:date ;
    schema:genre dbpedia:Sitcom .
/data/tvseries/labels.ttl
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
dbpedia:Friends schema:name "Friends"@en .
dbpedia:The_Office schema:name "The Office"@en .
RDF & Graphs (merge)
File_all.ttl
Can you name any other format where files can be merged without losing data integrity?
CSV, SQL, XML, JSON, ...
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
dbpedia:Friends
    schema:name "Friends"@en ;
    schema:numberOfSeasons 10 ;
    schema:datePublished "1994-09-22"^^xsd:date ;
    schema:genre dbpedia:Sitcom .
dbpedia:The_Office
    schema:name "The Office"@en ;
    schema:genre dbpedia:Sitcom .
/data/tvseries.ttl
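Why merging is safe can be shown with a small Python sketch: if each file is treated as a set of triples, merging is just a set union. The variable names are hypothetical, and terms are kept as prefixed strings for brevity instead of being parsed from real Turtle files.

```python
# Triples as they would be read from labels.ttl (prefixed names kept as strings).
labels = {
    ("dbpedia:Friends", "schema:name", '"Friends"@en'),
    ("dbpedia:The_Office", "schema:name", '"The Office"@en'),
}

# Triples as they would be read from metadata.ttl.
metadata = {
    ("dbpedia:Friends", "schema:numberOfSeasons", "10"),
    ("dbpedia:Friends", "schema:datePublished", '"1994-09-22"^^xsd:date'),
    ("dbpedia:Friends", "schema:genre", "dbpedia:Sitcom"),
}

# Merging two RDF graphs is a set union: nothing is overwritten or reordered.
merged = labels | metadata

# Merging the same file twice is harmless (the operation is idempotent).
merged_again = merged | labels
```

Because the subject identifiers are global, statements about `dbpedia:Friends` from both files line up without any join key or schema migration.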
Datasets / multi-graph TriG files
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://example.com/labels> {
    dbpedia:Friends schema:name "Friends"@en .
    dbpedia:The_Office schema:name "The Office"@en .
}
<http://example.com/metadata> {
dbpedia:Friends
schema:datePublished "1994-09-22"^^xsd:date ;
schema:numberOfSeasons 10 .
}
<http://example.com/genre> {
dbpedia:Friends schema:genre dbpedia:Sitcom .
dbpedia:The_Office schema:genre dbpedia:Sitcom .
}
/data/tvseries.trig
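One way to picture a multi-graph dataset is as a mapping from graph name to a set of triples. The Python sketch below is an assumption-laden illustration (the helper names `to_quads`, `union_graph` and `graphs_mentioning` are invented here), not how any particular RDF store implements named graphs.

```python
# A dataset maps each graph name to its own set of triples.
dataset = {
    "http://example.com/labels": {
        ("dbpedia:Friends", "schema:name", '"Friends"@en'),
        ("dbpedia:The_Office", "schema:name", '"The Office"@en'),
    },
    "http://example.com/metadata": {
        ("dbpedia:Friends", "schema:datePublished", '"1994-09-22"^^xsd:date'),
        ("dbpedia:Friends", "schema:numberOfSeasons", "10"),
    },
    "http://example.com/genre": {
        ("dbpedia:Friends", "schema:genre", "dbpedia:Sitcom"),
        ("dbpedia:The_Office", "schema:genre", "dbpedia:Sitcom"),
    },
}


def to_quads(ds):
    """Flatten the dataset into (subject, predicate, object, graph) quads."""
    return {(s, p, o, g) for g, triples in ds.items() for s, p, o in triples}


def union_graph(ds):
    """Forget the graph names and merge everything into one graph."""
    return set().union(*ds.values())


def graphs_mentioning(ds, subject):
    """Names of the graphs that say something about the given subject."""
    return sorted(g for g, triples in ds.items()
                  if any(s == subject for s, _, _ in triples))


quads = to_quads(dataset)
everything = union_graph(dataset)
office_graphs = graphs_mentioning(dataset, "dbpedia:The_Office")
```

Keeping the graph name as a fourth component is what lets a store track where each statement came from while still querying all of them together.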
RDF is persistent, wherever it’s stored
[Diagram: input files -> import -> RDF DB -> export -> output files. The exported data is exactly the same (*) as the imported data; the RDF DB can also be queried.]
(*) The proper term is isomorphic graphs, to cover some special cases called blank nodes
Big ecosystem
SPARQL: RDF query language
RDFS, OWL: RDF schema languages
SHACL, ShEx: RDF constraint languages
See http://book.validatingrdf.com (free online)
R2RML: Virtual RDF views on top of RDBMS (e.g. MySQL)
And many more specifications & tools...
Takeaway points, so far...
RDF is a graph data model
> can be serialized in many formats
> identifiers are persistent by design
Natively stores & integrates diverse data
RDF is kind of the new XML
> but it is much cooler...
> and you don’t need to write XML ;)
Semantics & RDF
Why they matter
Semantics & RDF
● RDF is a core part of the Semantic Web vision
● Semantics is defined as:
○ the meaning of something (word, phrase, text, etc)
○ the branch of linguistics and logic concerned with meaning
● Too academic?
“A Little Semantics Goes a Long Way”
by prof. J. Hendler
BuzzwordAlert!!!
RDF & Semantics
Ontologies are the results of modelling a specific domain
Some people prefer the terms: model, vocabulary, taxonomy, schema
(doesn’t make much difference)
Ontologies in RDF deal with classes & properties
> Some part is machine readable
> Some part is human readable
Can you tell which part is more important?
(... a more pragmatic view)
RDF Schema - Classes
Classes of Things
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.com/> .
ex:TVSeries
    rdf:type rdfs:Class ;
    rdfs:comment "Series dedicated to TV broadcast" ;
    rdfs:subClassOf ex:CreativeWork .
ex:CreativeWork
    rdf:type rdfs:Class ;
    rdfs:comment "A generic kind of creative work, e.g. books, movies, etc." .
Machine-Readable Semantics: rdf:type, rdfs:subClassOf
Human-Readable Semantics: rdfs:comment
… and we can assign types to Things
(e.g. "Friends" is an instance of "TVSeries")
dbpedia:Friends rdf:type ex:TVSeries .
RDF Schema - Properties
Relationships between subjects and objects
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.com/> .
ex:actor
    rdf:type rdf:Property ;
    rdfs:comment "The person that is the actor of a TVSeries." ;
    rdfs:domain ex:TVSeries ;
    rdfs:range ex:Person .
Machine-Readable Semantics: rdf:type, rdfs:domain, rdfs:range
Human-Readable Semantics: rdfs:comment
… and we can use this in RDF statements
dbpedia:Friends ex:actor dbpedia:Jennifer_Aniston .
to Infer or to Validate ?
Given only the following, what can we say about
dbpedia:Jennifer_Aniston and dbpedia:Friends ?
dbpedia:Jennifer_Aniston rdf:type ex:Person .
dbpedia:Friends rdf:type ex:TVSeries .
ex:actor
    rdf:type rdf:Property ;
    rdfs:domain ex:TVSeries ;
    rdfs:range ex:Person .
dbpedia:Friends ex:actor dbpedia:Jennifer_Aniston .
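The "infer" reading of the slide can be sketched as follows: given only the property definition and the `ex:actor` statement, the two `rdf:type` statements are derived from `rdfs:domain` and `rdfs:range`. This is a minimal Python illustration of those two RDFS rules only; the function name `infer_types` is hypothetical and a real reasoner covers many more rules.

```python
RDF_TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

# Only the property definition and one statement using it.
graph = {
    ("ex:actor", RDF_TYPE, "rdf:Property"),
    ("ex:actor", DOMAIN, "ex:TVSeries"),
    ("ex:actor", RANGE, "ex:Person"),
    ("dbpedia:Friends", "ex:actor", "dbpedia:Jennifer_Aniston"),
}


def infer_types(g):
    """Derive new rdf:type triples from rdfs:domain and rdfs:range."""
    domains = [(prop, cls) for prop, p, cls in g if p == DOMAIN]
    ranges = [(prop, cls) for prop, p, cls in g if p == RANGE]
    inferred = set()
    for s, p, o in g:
        for prop, cls in domains:
            if p == prop:
                inferred.add((s, RDF_TYPE, cls))  # the subject is in the domain
        for prop, cls in ranges:
            if p == prop:
                inferred.add((o, RDF_TYPE, cls))  # the object is in the range
    return inferred - g  # report only what was not already stated


new_triples = infer_types(graph)
```

Under this reading the schema never rejects data; it only adds conclusions.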
to Infer or to Validate ?
Given only the following, what can we say ?
ex:actor
    rdf:type rdf:Property ;
    rdfs:domain ex:TVSeries ;
    rdfs:range ex:Person .
ex:Dimitris rdf:type ex:Person .
ex:VoxxedDaysAthens rdf:type ex:Conference .
ex:VoxxedDaysAthens ex:actor ex:Dimitris .
Something is not right…
ex:VoxxedDaysAthens is not an ex:TVSeries
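The "validate" reading treats the same domain and range declarations as constraints to check rather than rules to apply. The sketch below is a deliberately naive closed-world check in Python; the `validate` function and its message format are assumptions of this example and are not SHACL or ShEx, which are the actual constraint languages.

```python
RDF_TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

graph = {
    ("ex:actor", RDF_TYPE, "rdf:Property"),
    ("ex:actor", DOMAIN, "ex:TVSeries"),
    ("ex:actor", RANGE, "ex:Person"),
    ("ex:Dimitris", RDF_TYPE, "ex:Person"),
    ("ex:VoxxedDaysAthens", RDF_TYPE, "ex:Conference"),
    ("ex:VoxxedDaysAthens", "ex:actor", "ex:Dimitris"),
}


def validate(g):
    """Report statements whose subject or object lacks the declared type."""
    types = {(s, o) for s, p, o in g if p == RDF_TYPE}
    domains = [(prop, cls) for prop, p, cls in g if p == DOMAIN]
    ranges = [(prop, cls) for prop, p, cls in g if p == RANGE]
    violations = []
    for s, p, o in sorted(g):
        for prop, cls in domains:
            if p == prop and (s, cls) not in types:
                violations.append(f"{s} is not a {cls} (domain of {p})")
        for prop, cls in ranges:
            if p == prop and (o, cls) not in types:
                violations.append(f"{o} is not a {cls} (range of {p})")
    return violations


problems = validate(graph)
```

An inference engine given the same data would instead conclude that `ex:VoxxedDaysAthens` is an `ex:TVSeries`, which is why the choice between inferring and validating matters.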
to Infer or to Validate ?
Given only the following, what can we say ?
ex:actor
    rdf:type rdf:Property ;
    rdfs:domain ex:TVSeries ;
    rdfs:range ex:Person .
ex:Dimitris rdf:type ex:Person .
dbpedia:Friends rdf:type ex:TVSeries .
dbpedia:Friends ex:actor ex:Dimitris .
Appears legit
Schema stored & queried as Data
ex:TVSeries
    rdf:type rdfs:Class ;
    rdfs:subClassOf ex:CreativeWork .
ex:BookSeries
    rdf:type rdfs:Class ;
    rdfs:subClassOf ex:CreativeWork .
ex:CreativeWork
    rdf:type rdfs:Class .
dbpedia:Friends rdf:type ex:TVSeries .
dbpedia:The_Office rdf:type ex:TVSeries .
dbpedia:Narnia rdf:type ex:BookSeries .
SELECT ?s WHERE {
    ?s rdfs:subClassOf ex:CreativeWork .
}
=> ex:TVSeries, ex:BookSeries
SELECT ?s WHERE {
    ?s rdf:type ex:TVSeries .
}
=> dbpedia:Friends, dbpedia:The_Office
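Since schema and data are both just triples, a single matching routine can answer both queries. The Python sketch below imitates a one-pattern SPARQL query with `None` as the wildcard; `match` is a hypothetical helper for illustration, not a SPARQL engine.

```python
graph = {
    ("ex:TVSeries", "rdf:type", "rdfs:Class"),
    ("ex:TVSeries", "rdfs:subClassOf", "ex:CreativeWork"),
    ("ex:BookSeries", "rdf:type", "rdfs:Class"),
    ("ex:BookSeries", "rdfs:subClassOf", "ex:CreativeWork"),
    ("ex:CreativeWork", "rdf:type", "rdfs:Class"),
    ("dbpedia:Friends", "rdf:type", "ex:TVSeries"),
    ("dbpedia:The_Office", "rdf:type", "ex:TVSeries"),
    ("dbpedia:Narnia", "rdf:type", "ex:BookSeries"),
}


def match(g, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts like a ?variable."""
    return sorted(t for t in g
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))


# Schema-level question: which classes are subclasses of ex:CreativeWork?
subclasses = [t[0] for t in match(graph, p="rdfs:subClassOf", o="ex:CreativeWork")]

# Data-level question: which things are TV series?
tv_series = [t[0] for t in match(graph, p="rdf:type", o="ex:TVSeries")]
```

The same function serves both questions because nothing in the storage layer distinguishes schema triples from data triples.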
Schema stored & queried as Data
Navigates the class hierarchy
SELECT ?s WHERE {
    ?s rdf:type/rdfs:subClassOf* ex:CreativeWork .
}
=> dbpedia:Friends, dbpedia:The_Office, dbpedia:Narnia
Hierarchy can be extended without breaking the query
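The property path `rdf:type/rdfs:subClassOf*` can be read as "take the type, then follow zero or more subclass links". A rough Python equivalent is sketched below; the function names, the `ex:Sitcom` class and the `ex:Some_Show` instance are invented for this illustration and do not appear in the slides.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

graph = {
    ("ex:TVSeries", SUBCLASS, "ex:CreativeWork"),
    ("ex:BookSeries", SUBCLASS, "ex:CreativeWork"),
    ("dbpedia:Friends", RDF_TYPE, "ex:TVSeries"),
    ("dbpedia:The_Office", RDF_TYPE, "ex:TVSeries"),
    ("dbpedia:Narnia", RDF_TYPE, "ex:BookSeries"),
}


def class_and_subclasses(g, cls):
    """cls itself plus every class reaching it via rdfs:subClassOf (the * part)."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in g:
            if p == SUBCLASS and o == current and s not in seen:
                seen.add(s)
                stack.append(s)
    return seen


def instances_of(g, cls):
    """Everything typed with cls or with any of its (transitive) subclasses."""
    classes = class_and_subclasses(g, cls)
    return sorted({s for s, p, o in g if p == RDF_TYPE and o in classes})


works = instances_of(graph, "ex:CreativeWork")

# Extending the hierarchy (hypothetical ex:Sitcom class) needs no query change.
extended = graph | {
    ("ex:Sitcom", SUBCLASS, "ex:TVSeries"),
    ("ex:Some_Show", RDF_TYPE, "ex:Sitcom"),
}
works_extended = instances_of(extended, "ex:CreativeWork")
```

The `seen` set also keeps the traversal from looping if the hierarchy accidentally contains a cycle.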
Many Available free Schemas
Many existing free (as in beer) ontologies (or schemas)
model different domains
> General purpose (DBpedia, schema.org)
> Geographical (geo)
> Provenance (prov-o)
> Taxonomies / Classification (SKOS family)
> Organizations (org)
> Find ~600 entries at http://lov.okfn.org
Reusing Available (Free) schemas
Get part of your data modeling for free
> Groups of people already worked on modeling the domain
> Spent time defining human and machine-readable semantics
Makes data integration easier
> Data is published with common schemas
> Data is easier to consume
Mapping to Available (Free) schemas
Map when not reusing
> integrate data in a loosely coupled way
ex:TVSeries owl:equivalentClass schema:TVSeries .
ex:actor owl:equivalentProperty schema:actor .
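How such mappings keep systems loosely coupled can be sketched in Python: local data stays as it is, and a separate set of mapping triples is used to translate terms on demand. The `translate` function is a hypothetical one-directional rewrite and a strong simplification of what OWL equivalence means to a reasoner.

```python
EQUIVALENCES = ("owl:equivalentClass", "owl:equivalentProperty")

# Mapping triples, kept apart from the data they describe.
mappings = {
    ("ex:TVSeries", "owl:equivalentClass", "schema:TVSeries"),
    ("ex:actor", "owl:equivalentProperty", "schema:actor"),
}

# Data expressed with our own (local) schema.
local_data = {
    ("dbpedia:Friends", "rdf:type", "ex:TVSeries"),
    ("dbpedia:Friends", "ex:actor", "dbpedia:Jennifer_Aniston"),
}


def translate(graph, mapping_triples):
    """Rewrite local classes and properties to their mapped equivalents."""
    rename = {s: o for s, p, o in mapping_triples if p in EQUIVALENCES}
    return {tuple(rename.get(term, term) for term in triple) for triple in graph}


shared_view = translate(local_data, mappings)
```

Changing or adding a mapping only touches the mapping triples; the stored data and the external schema remain independent.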
RDF & Semantics - take away points
It’s all about Classes & Properties
Human-readable semantics
> Commonly accepted modelling conventions
Machine-readable semantics
> Can be used for inference and/or validation
> Can be queried together with data
Reusing [or linking to] common ontologies / schemas
> Integrating data with less variety
> Network effect (the more people/data use it the better)
> Developing reusable applications against schemas
Linked Data & RDF
Given only this, what can we do/say?
<https://voxxeddays.com/athens> <https://schema.org/attendee> <http://kontokostas.com>.
schema:Event (domain), schema:Person (range): A person attending the event.
HTTP GET
<https://voxxeddays.com/athens>
    rdf:type schema:Event ;
    schema:name "Voxxed Athens" ;
    schema:startDate "2018-06-01" ;
    schema:endDate "2018-06-02" ;
    schema:inLanguage "English" ;
    schema:description "..." .
HTTP GET
<http://kontokostas.com>
    rdf:type schema:Person ;
    schema:givenName "Dimitris" ;
    schema:familyName "Kontokostas" ;
    schema:birthPlace dbpedia:Greece ;
    schema:jobTitle "Data Engineer" ;
    schema:worksFor <https://geophy.com> .
HTTP GET
Follow your nose pattern
<http://kontokostas.com> <https://schema.org/birthPlace> <http://dbpedia.org/resource/Greece>.
schema:Person (domain), schema:Place (range): The place where the person was born.
HTTP GET
<http://dbpedia.org/resource/Greece>
    rdf:type schema:Place, dbpedia:Country ;
    dbo:capital dbpedia:Athens ;
    dbo:currency dbpedia:Euro ;
    geo:lat "39.0"^^xsd:float ;
    geo:long "22.0"^^xsd:float .
HTTP GET
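The follow-your-nose pattern amounts to a breadth-first crawl: dereference a URI, collect the triples, then dereference the URIs found in object position. The Python sketch below replaces real HTTP GET requests with an in-memory dictionary (`WEB`) so that it runs offline; that dictionary, the `dereference` stub and the abbreviated predicates are all assumptions of this example.

```python
VOXXED = "https://voxxeddays.com/athens"
ME = "http://kontokostas.com"
GREECE = "http://dbpedia.org/resource/Greece"
ATHENS = "http://dbpedia.org/resource/Athens"

# Stand-in for the Web: what an HTTP GET on each URI is assumed to return.
WEB = {
    VOXXED: {(VOXXED, "schema:attendee", ME)},
    ME: {(ME, "schema:givenName", '"Dimitris"'),
         (ME, "schema:birthPlace", GREECE)},
    GREECE: {(GREECE, "dbo:capital", ATHENS)},
}


def dereference(uri):
    """Pretend to HTTP GET a URI; unknown URIs return no triples."""
    return WEB.get(uri, set())


def follow_your_nose(start, max_hops):
    """Collect triples by following object URIs up to max_hops links away."""
    collected = set()
    seen = {start}
    frontier = [start]
    for _ in range(max_hops + 1):
        next_frontier = []
        for uri in frontier:
            for s, p, o in dereference(uri):
                collected.add((s, p, o))
                # Objects that look like URIs are candidates for the next hop.
                if o.startswith("http") and o not in seen:
                    seen.add(o)
                    next_frontier.append(o)
        frontier = next_frontier
    return collected


zero_hops = follow_your_nose(VOXXED, 0)
one_hop = follow_your_nose(VOXXED, 1)
two_hops = follow_your_nose(VOXXED, 2)
```

Starting from a single event URI, two hops are enough to learn the capital of the attendee's birth country, without any of the three sources knowing about each other.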
RDF & Linked Data
Things represented with http(s)-based URIs
can be self-published
HTTP GET requests on Things return RDF Triples
where it is a subject (or an object)
Decentralized storage / access / semantics
(*) a.k.a. the Web of Data, see the TED talk by Tim Berners-Lee (creator of the WWW)
RDF & Linked Data (on the web)
[Diagram: the Web of Data, linking kontokostas.com, example.com, voxxeddays.com/Athens and DBpedia (Wikipedia as RDF)]
RDF & Linked Data (on the enterprise)
[Diagram: enterprise RDF databases (RDF DB x, y, z), each exposed as Linked Data (LD x, LD y, LD z), shown together with LD w and the Web of Data]
Linked Open Data Cloud
Diagram from 2014
v2018 is too big
1,184 datasets
15,993 links
https://lod-cloud.net/
Reusing available datasets / identifiers
Just like reusing schemas, referencing / reusing external identifiers facilitates:
Data integration
e.g. dbpedia:Friends represents the Friends TV series, not some friends
> use dbpedia:Friends directly
> link it: ex:tv_series_123 owl:sameAs dbpedia:Friends
Data enrichment
e.g. dbpedia:Friends may have more information about the series than our database, and we can easily get it over HTTP
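Enrichment through an `owl:sameAs` link can be sketched as rewriting the local identifier to the external one and then merging both descriptions. In the Python illustration below, `ex:rating` and its value are hypothetical local data, and the `enrich` function is a simplification that treats `owl:sameAs` as a one-way rename.

```python
SAME_AS = "owl:sameAs"

# Our own database: a local identifier plus a link to DBpedia.
local = {
    ("ex:tv_series_123", "ex:rating", "9"),  # hypothetical local attribute
    ("ex:tv_series_123", SAME_AS, "dbpedia:Friends"),
}

# What the external source is assumed to say about the same Thing.
remote = {
    ("dbpedia:Friends", "schema:numberOfSeasons", "10"),
}


def canonical(term, same_as):
    """Follow sameAs links to a final identifier, guarding against cycles."""
    seen = set()
    while term in same_as and term not in seen:
        seen.add(term)
        term = same_as[term]
    return term


def enrich(local_graph, remote_graph):
    """Merge both graphs after rewriting linked identifiers to one name."""
    same_as = {s: o for s, p, o in local_graph if p == SAME_AS}
    merged = set()
    for s, p, o in local_graph | remote_graph:
        if p == SAME_AS:
            continue  # the link itself has served its purpose
        merged.add((canonical(s, same_as), p, canonical(o, same_as)))
    return merged


enriched = enrich(local, remote)
```

After the merge, local and external facts hang off a single identifier and can be queried together.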
RDF & Linked Data - take away points
Decentralisation of Data Management
Self-documented schemas & data
Scale your [local] graphs to the [Enterprise] Web
Big pool of stable identifiers (e.g. DBpedia)
Pay as you go data integration
You can get benefit with low effort
> RDF views on top of RDBMS with R2RML (mappings, SPARQL-to-SQL translation)
> Convert XML/JSON/CSV/… to RDF with RML
The more time you invest the better the results
> Schema development, mapping & linking
> Semi-automatic link discovery with tools like Limes & Silk
e.g.: ex:tv_series_123 owl:sameAs dbpedia:Friends
RDF does not need to be your master dataset
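The low-effort end of this spectrum, converting tabular data to triples, might look like the Python sketch below. It is a hand-rolled stand-in and not RML or R2RML syntax; the CSV content, the column names and the `COLUMN_TO_PREDICATE` mapping are assumptions made for the example.

```python
import csv
import io

# Hypothetical CSV export; an empty cell means the value is unknown.
CSV_TEXT = "id,name,seasons\nFriends,Friends,10\nThe_Office,The Office,\n"

# Which column becomes which predicate (the 'mapping' part of the job).
COLUMN_TO_PREDICATE = {
    "name": "schema:name",
    "seasons": "schema:numberOfSeasons",
}


def rows_to_triples(text, subject_prefix="dbpedia:"):
    """Turn each CSV row into triples about one subject, skipping empty cells."""
    triples = set()
    for row in csv.DictReader(io.StringIO(text)):
        subject = subject_prefix + row["id"]
        for column, predicate in COLUMN_TO_PREDICATE.items():
            if row[column]:
                triples.add((subject, predicate, row[column]))
    return triples


triples = rows_to_triples(CSV_TEXT)
```

Missing values simply produce no triple, so sparse sources do not require NULL columns or schema changes.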
Who uses RDF
https://github.com/json-ld/json-ld.org/wiki/Users-of-JSON-LD
28% of TLD (or 39% of HTML pages)
> 3.7M Microdata
> 2.7M JSON-LD
> 1.2M RDFa
In total 9 billion Things & 38 billion RDF triples
Full report at http://webdatacommons.org/structureddata/#results-2017-1
Structured data on the web (Nov 2017)
RDF @ Google
RDF Ontology
> Less strict / formal
> Promotes JSON-LD
Funded & maintained by all search engines
Drives many Google products...
Schema.org && Google && Search
https://developers.google.com/search/docs/guides/search-features
Google is...
Using the RDF graph model to integrate diverse
data from webpages & emails
By using the concept of Linked Data
And this is all empowered by a
common ontology (or schema)
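To give a feel for how schema.org markup embedded in a page becomes graph data, the Python sketch below flattens a tiny JSON-LD document into triples. It handles only a flat object with an `@vocab` context, which is a small subset chosen for this example; real JSON-LD processing is considerably more involved, and the function name `jsonld_to_triples` is hypothetical.

```python
import json

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A minimal JSON-LD document, as it might be embedded in a web page.
DOC = """
{
  "@context": {"@vocab": "http://schema.org/"},
  "@id": "https://voxxeddays.com/athens",
  "@type": "Event",
  "name": "Voxxed Athens",
  "startDate": "2018-06-01"
}
"""


def jsonld_to_triples(text):
    """Flatten a single flat JSON-LD object into (s, p, o) triples."""
    doc = json.loads(text)
    vocab = doc["@context"]["@vocab"]
    subject = doc["@id"]
    triples = set()
    for key, value in doc.items():
        if key in ("@context", "@id"):
            continue
        if key == "@type":
            # Type names are resolved against the vocabulary as well.
            triples.add((subject, RDF_TYPE, vocab + value))
        else:
            # Plain keys become properties in the vocabulary namespace.
            triples.add((subject, vocab + key, value))
    return triples


event_triples = jsonld_to_triples(DOC)
```

The page author writes ordinary-looking JSON, while the consumer ends up with triples that merge with any other RDF source.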
RDF @ GeoPhy
GeoPhy provides value, risk, & quality metrics for every building in the world
RDF @GeoPhy
We collect & integrate a lot of data
> on properties, on their surroundings, and on the market conditions
Master dataset on Real Estate (aka Knowledge Graph)
> driving our Machine Learning / Deep Learning models
Challenges...
> We have thousands of sources
> Sources are updated at arbitrary intervals
> We get our data in CSV, on the good days
And, of course… we are not Google, so we cannot make people write RDF for us :-)
GeoPhy Data Management Platform
[Diagram: platform components. Inputs: CSV, PDF. Data Wrangling & Extraction; Annotation & Provenance; Modeling and Mapping with the GeoPhy Ontologies; Data Ingestion (Transform to RDF, Validate, Identify & Deduplicate, Conflict resolution, Data Fusion); CoreDB with value-level Provenance; Data Indexing; Data Enrichment (Dependency Detection, Geo Enrichment, Trigger ML/DL); API]
And the closing slide...
People think RDF is a pain because it is complicated.
The truth is even worse.
RDF is painfully simplistic, but it allows you to work with
real-world data and problems that are horribly complicated.
While you can avoid RDF, it is harder to avoid complicated
data and complicated computer problems.
Dan Brickley, Schema.org and Google
Libby Miller, BBC
Thank you for your attention
Questions?
Many thanks to Sander, Matt and the whole GeoPhy Eng. Team for their feedback

More Related Content

What's hot

Semantic Variation Graphs the case for RDF & SPARQL
Semantic Variation Graphs the case for RDF & SPARQLSemantic Variation Graphs the case for RDF & SPARQL
Semantic Variation Graphs the case for RDF & SPARQLJerven Bolleman
 
Debunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative FactsDebunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative FactsNeo4j
 
Open data easy, explicit and fast
Open data easy, explicit and fastOpen data easy, explicit and fast
Open data easy, explicit and fastMetaSolutions AB
 
SHACL: Shaping the Big Ball of Data Mud
SHACL: Shaping the Big Ball of Data MudSHACL: Shaping the Big Ball of Data Mud
SHACL: Shaping the Big Ball of Data MudRichard Cyganiak
 
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011François Scharffe
 
Theory behind Image Compression and Semantic Search
Theory behind Image Compression and Semantic SearchTheory behind Image Compression and Semantic Search
Theory behind Image Compression and Semantic SearchSanti Adavani
 
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...Fabrizio Orlandi
 
ELSE IF 2019: Porting the xEBR Taxonomy to a Linked Open Data compliant Format
ELSE IF 2019: Porting the xEBR Taxonomy to a Linked Open Data compliant FormatELSE IF 2019: Porting the xEBR Taxonomy to a Linked Open Data compliant Format
ELSE IF 2019: Porting the xEBR Taxonomy to a Linked Open Data compliant FormatPretaLLOD
 
Semantic Cartography: Using ontologies to create adaptable tools for text exp...
Semantic Cartography: Using ontologies to create adaptable tools for text exp...Semantic Cartography: Using ontologies to create adaptable tools for text exp...
Semantic Cartography: Using ontologies to create adaptable tools for text exp...andyashton
 
RDF SHACL, Annotations, and Data Frames
RDF SHACL, Annotations, and Data FramesRDF SHACL, Annotations, and Data Frames
RDF SHACL, Annotations, and Data FramesKurt Cagle
 
Indexing, searching, and aggregation with redi search and .net
Indexing, searching, and aggregation with redi search and .netIndexing, searching, and aggregation with redi search and .net
Indexing, searching, and aggregation with redi search and .netStephen Lorello
 
Semantic Web introduction
Semantic Web introductionSemantic Web introduction
Semantic Web introductionGraphity
 
Why is JSON-LD Important to Businesses - Franz Inc
Why is JSON-LD Important to Businesses - Franz IncWhy is JSON-LD Important to Businesses - Franz Inc
Why is JSON-LD Important to Businesses - Franz IncFranz Inc. - AllegroGraph
 

What's hot (20)

Jesús Barrasa
Jesús BarrasaJesús Barrasa
Jesús Barrasa
 
Semantic Variation Graphs the case for RDF & SPARQL
Semantic Variation Graphs the case for RDF & SPARQLSemantic Variation Graphs the case for RDF & SPARQL
Semantic Variation Graphs the case for RDF & SPARQL
 
Debunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative FactsDebunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative Facts
 
Open data easy, explicit and fast
Open data easy, explicit and fastOpen data easy, explicit and fast
Open data easy, explicit and fast
 
JSON-LD and SHACL for Knowledge Graphs
JSON-LD and SHACL for Knowledge GraphsJSON-LD and SHACL for Knowledge Graphs
JSON-LD and SHACL for Knowledge Graphs
 
SHACL: Shaping the Big Ball of Data Mud
SHACL: Shaping the Big Ball of Data MudSHACL: Shaping the Big Ball of Data Mud
SHACL: Shaping the Big Ball of Data Mud
 
20110728 datalift-rpi-troy
20110728 datalift-rpi-troy20110728 datalift-rpi-troy
20110728 datalift-rpi-troy
 
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
Datalift a-catalyser-for-the-web-of-data-fosdem-05-02-2011
 
Christian Jakenfelds
Christian JakenfeldsChristian Jakenfelds
Christian Jakenfelds
 
Theory behind Image Compression and Semantic Search
Theory behind Image Compression and Semantic SearchTheory behind Image Compression and Semantic Search
Theory behind Image Compression and Semantic Search
 
Presentation shexer
Presentation shexerPresentation shexer
Presentation shexer
 
RDF validation tutorial
RDF validation tutorialRDF validation tutorial
RDF validation tutorial
 
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
 
ELSE IF 2019: Porting the xEBR Taxonomy to a Linked Open Data compliant Format
ELSE IF 2019: Porting the xEBR Taxonomy to a Linked Open Data compliant FormatELSE IF 2019: Porting the xEBR Taxonomy to a Linked Open Data compliant Format
ELSE IF 2019: Porting the xEBR Taxonomy to a Linked Open Data compliant Format
 
Semantic Cartography: Using ontologies to create adaptable tools for text exp...
Semantic Cartography: Using ontologies to create adaptable tools for text exp...Semantic Cartography: Using ontologies to create adaptable tools for text exp...
Semantic Cartography: Using ontologies to create adaptable tools for text exp...
 
RDF SHACL, Annotations, and Data Frames
RDF SHACL, Annotations, and Data FramesRDF SHACL, Annotations, and Data Frames
RDF SHACL, Annotations, and Data Frames
 
Indexing, searching, and aggregation with redi search and .net
Indexing, searching, and aggregation with redi search and .netIndexing, searching, and aggregation with redi search and .net
Indexing, searching, and aggregation with redi search and .net
 
Semantic Web introduction
Semantic Web introductionSemantic Web introduction
Semantic Web introduction
 
What's New in RDF 1.1?
What's New in RDF 1.1?What's New in RDF 1.1?
What's New in RDF 1.1?
 
Why is JSON-LD Important to Businesses - Franz Inc
Why is JSON-LD Important to Businesses - Franz IncWhy is JSON-LD Important to Businesses - Franz Inc
Why is JSON-LD Important to Businesses - Franz Inc
 

Similar to Graph databases & data integration v2

Graph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFGraph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFDimitris Kontokostas
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIsJosef Petrák
 
A Little SPARQL in your Analytics
A Little SPARQL in your AnalyticsA Little SPARQL in your Analytics
A Little SPARQL in your AnalyticsDr. Neil Brittliff
 
Slides semantic web and Drupal 7 NYCCamp 2012
Slides semantic web and Drupal 7 NYCCamp 2012Slides semantic web and Drupal 7 NYCCamp 2012
Slides semantic web and Drupal 7 NYCCamp 2012scorlosquet
 
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...Databricks
 
Find your way in Graph labyrinths
Find your way in Graph labyrinthsFind your way in Graph labyrinths
Find your way in Graph labyrinthsDaniel Camarda
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudOntotext
 
Infromation Reprentation, Structured Data and Semantics
Infromation Reprentation,Structured Data and SemanticsInfromation Reprentation,Structured Data and Semantics
Infromation Reprentation, Structured Data and SemanticsYogendra Tamang
 
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph DatabaseBringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph DatabaseJimmy Angelakos
 
Introduction to RDFa
Introduction to RDFaIntroduction to RDFa
Introduction to RDFaIvan Herman
 
Understanding RDF: the Resource Description Framework in Context (1999)
Understanding RDF: the Resource Description Framework in Context  (1999)Understanding RDF: the Resource Description Framework in Context  (1999)
Understanding RDF: the Resource Description Framework in Context (1999)Dan Brickley
 
Rdf data-model-and-storage
Rdf data-model-and-storageRdf data-model-and-storage
Rdf data-model-and-storage灿辉 葛
 
RDFa: an introduction
RDFa: an introductionRDFa: an introduction
RDFa: an introductionKai Li
 
SemanticWeb Nuts 'n Bolts
SemanticWeb Nuts 'n BoltsSemanticWeb Nuts 'n Bolts
SemanticWeb Nuts 'n BoltsRinke Hoekstra
 
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation LanguagesSyntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation LanguagesTara Athan
 

Similar to Graph databases & data integration v2 (20)

Graph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFGraph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDF
 
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs
 
Danbri Drupalcon Export
Danbri Drupalcon ExportDanbri Drupalcon Export
Danbri Drupalcon Export
 
RDFa Tutorial
RDFa TutorialRDFa Tutorial
RDFa Tutorial
 
SWT Lecture Session 2 - RDF
SWT Lecture Session 2 - RDFSWT Lecture Session 2 - RDF
SWT Lecture Session 2 - RDF
 
A Little SPARQL in your Analytics
A Little SPARQL in your AnalyticsA Little SPARQL in your Analytics
A Little SPARQL in your Analytics
 
Slides semantic web and Drupal 7 NYCCamp 2012
Slides semantic web and Drupal 7 NYCCamp 2012Slides semantic web and Drupal 7 NYCCamp 2012
Slides semantic web and Drupal 7 NYCCamp 2012
 
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
 
Find your way in Graph labyrinths
Find your way in Graph labyrinthsFind your way in Graph labyrinths
Find your way in Graph labyrinths
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
 
Semantic Web talk TEMPLATE
Semantic Web talk TEMPLATESemantic Web talk TEMPLATE
Semantic Web talk TEMPLATE
 
Infromation Reprentation, Structured Data and Semantics
Infromation Reprentation,Structured Data and SemanticsInfromation Reprentation,Structured Data and Semantics
Infromation Reprentation, Structured Data and Semantics
 
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph DatabaseBringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
Bringing the Semantic Web closer to reality: PostgreSQL as RDF Graph Database
 
Introduction to RDFa
Introduction to RDFaIntroduction to RDFa
Introduction to RDFa
 
SWT Lecture Session 10 R2RML Part 1
SWT Lecture Session 10 R2RML Part 1SWT Lecture Session 10 R2RML Part 1
SWT Lecture Session 10 R2RML Part 1
 
Understanding RDF: the Resource Description Framework in Context (1999)
Understanding RDF: the Resource Description Framework in Context  (1999)Understanding RDF: the Resource Description Framework in Context  (1999)
Understanding RDF: the Resource Description Framework in Context (1999)
 
Rdf data-model-and-storage
Rdf data-model-and-storageRdf data-model-and-storage
Rdf data-model-and-storage
 
RDFa: an introduction
RDFa: an introductionRDFa: an introduction
RDFa: an introduction
 
SemanticWeb Nuts 'n Bolts
SemanticWeb Nuts 'n BoltsSemanticWeb Nuts 'n Bolts
SemanticWeb Nuts 'n Bolts
 
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation LanguagesSyntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
Syntax Reuse: XSLT as a Metalanguage for Knowledge Representation Languages
 

Graph databases & data integration v2

  • 1. Graph Databases & data integration Voxxed Days Athens 2018 Dimitris Kontokostas Senior Knowledge Engineer @GeoPhy
  • 2. About me ● Data geek, software engineer & open source enthusiast ● Involved in many R&D projects since 2003 ● Participate(d) in graph-related standardization activities ● PhD in knowledge extraction and quality assessment ● Working on the GeoPhy Real Estate Knowledge Graph
  • 3. Agenda ● Graphs ● RDF Graphs (*) ● Semantics & why they matter (*) ● Linked Data ● Who uses RDF ● How Google uses RDF ● How we (GeoPhy) use RDF (*) Some concepts are simplified or skipped to make this talk easier to digest in the allocated time
  • 5. Heatmap for Graph Databases (*) See also this Gartner study in 2013 found: ● many organizations find the variety dimension a greater challenge than volume or velocity. Graph DBs to the rescue: ● Combine multiple sources with different structures ● Retain the flexibility to add new ones without adapting schemas ● Query combined data, or multiple sources at once ● Detect patterns in the data
  • 6. © Image by Max De Marzi
  • 7. ● A graph is a way of specifying relationships among a collection of items ● Items can be: ○ Nodes: Alice, Bob, … ○ Edges ■ undirected: knows, … ■ directed: follows, … ○ Attributes: name, age, type, since, ... ○ Values: 18, 2001/10/13, ... Graphs Image source from wikimedia commons
  • 8. Graph Data Models
     Property graphs
     ● Industry standards
       ○ Cypher (mainly Neo4j)
       ○ Gremlin traversal API (Apache TinkerPop) => Most common
       ○ GraphQL
     ● Data import / export using Cypher, Gremlin or vendor-specific tools
     ● Usually optimized for specific operations / use cases
     RDF Graphs
     ● W3C standards
       ○ Like XML, HTML: define once, run everywhere ™
     ● Standardised way for querying (SPARQL), exporting & importing (RDF)
     Slide input from Andy Seaborne @VoxxedDays Bristol
  • 9. Graph Databases Landscape Property Graphs Gremlin traversal API RDF Graphs SPARQL Hybrid Gremlin API + SPARQL +Cypher
  • 10. Property Graphs
     ● Each node has
       ○ unique identifier
       ○ outgoing edges
       ○ incoming edges
       ○ key-value properties collection
     ● Each edge has
       ○ unique identifier
       ○ direction
       ○ label for the relationship
       ○ key-value properties collection
     ● Extreme flexibility
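The node and edge description above can be sketched as two small Python classes. This is only an illustration of the data model, not how any particular graph database stores it; the names `Node`, `Edge` and `connect` are made up for this sketch, and the sample values come from the earlier Graphs slide.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(eq=False)  # identity-based equality avoids recursing through the cyclic links
class Node:
    id: str                                                     # unique identifier
    properties: Dict[str, Any] = field(default_factory=dict)    # key-value properties
    outgoing: List["Edge"] = field(default_factory=list)        # outgoing edges
    incoming: List["Edge"] = field(default_factory=list)        # incoming edges


@dataclass(eq=False)
class Edge:
    id: str                                                     # unique identifier
    label: str                                                  # label for the relationship
    source: Node                                                # direction: source -> target
    target: Node
    properties: Dict[str, Any] = field(default_factory=dict)    # key-value properties


def connect(edge_id: str, label: str, source: Node, target: Node, **properties: Any) -> Edge:
    """Create a directed edge and register it on both of its endpoints."""
    edge = Edge(edge_id, label, source, target, dict(properties))
    source.outgoing.append(edge)
    target.incoming.append(edge)
    return edge


alice = Node("n1", {"name": "Alice", "age": 18})
bob = Node("n2", {"name": "Bob"})
follows = connect("e1", "follows", alice, bob, since="2001/10/13")
```

Note that edges carry their own properties here (`since`), which is the main structural difference from the plain RDF triples introduced next.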
  • 11. RDF - Resource Description Framework ● An RDF Graph is a set of RDF Triples ● An RDF triple consists of only three components (simplified): ○ the subject which is a Thing ○ the predicate which is a (special) Thing ○ the object that can be either a Thing or a Literal (Value) ● Things are represented with URIs ● Literals have a value and a value type or a language tag (defaults to string) Subject Predicate Object
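As a rough sketch of the (simplified) triple model above, the three components can be written as plain Python tuples and a graph as a set of them. The class names `URI`, `Literal` and `Triple` are illustrative only and are not taken from any RDF library; blank nodes are left out, as on the slide.

```python
from typing import NamedTuple, Optional, Union


class URI(NamedTuple):
    """A Thing, identified by a URI."""
    value: str


class Literal(NamedTuple):
    """A value with an optional datatype or language tag."""
    value: str
    datatype: Optional[str] = None
    language: Optional[str] = None


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    object: Union[URI, Literal]


DBPEDIA = "http://dbpedia.org/resource/"
SCHEMA = "http://schema.org/"

name_triple = Triple(URI(DBPEDIA + "Friends"), URI(SCHEMA + "name"), Literal("Friends", language="en"))
genre_triple = Triple(URI(DBPEDIA + "Friends"), URI(SCHEMA + "genre"), URI(DBPEDIA + "Sitcom"))

# An RDF graph is a *set* of triples: adding the same statement twice changes nothing.
graph = {name_triple, genre_triple}
graph.add(Triple(URI(DBPEDIA + "Friends"), URI(SCHEMA + "genre"), URI(DBPEDIA + "Sitcom")))
```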
  • 13. RDF - Resource Description Framework
     Depending on the serialization format, URIs can be abbreviated with namespaces
     > just like XML
     > Improves readability, e.g.
     @prefix dbpedia: <http://dbpedia.org/resource/> .
     @prefix schema: <http://schema.org/> .
     Subject Predicate Object
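Namespace abbreviation is plain string substitution, which a short sketch makes concrete. The helper names `expand` and `shorten` are hypothetical; the two prefixes are the ones declared on the slide.

```python
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "schema": "http://schema.org/",
}


def expand(prefixed_name: str, prefixes: dict = PREFIXES) -> str:
    """Turn 'dbpedia:Friends' into the full URI it abbreviates."""
    prefix, separator, local_name = prefixed_name.partition(":")
    if not separator or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return prefixes[prefix] + local_name


def shorten(uri: str, prefixes: dict = PREFIXES) -> str:
    """Abbreviate a full URI if a declared namespace matches, else keep it in <...>."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return f"<{uri}>"
```

The abbreviated and the full form denote the same identifier; the prefix is only a convenience of the serialization.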
  • 14. RDF is an abstract data model Many different serialization formats… Turtle, NTriples, JSON-LD, XML, RDFa, Microdata*
  • 15. RDF is an abstract data model
     Many different serialization formats… Turtle, NTriples, JSON-LD, XML, RDFa, Microdata*
     @prefix dbpedia: <http://dbpedia.org/resource/> .
     @prefix schema: <http://schema.org/> .
     @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
     dbpedia:Friends schema:name "Friends"@en ;
         schema:datePublished "1994-09-22"^^xsd:date ;
         schema:numberOfSeasons 10 ;
         schema:genre dbpedia:Sitcom .
     dbpedia:The_Office schema:name "The Office"@en ;
         schema:genre dbpedia:Sitcom .
  • 19. [Fun fact] What does RSS stand for? Rich Site Summary but... Original name was: RDF Site Summary Based on first versions of RDF/XML See https://en.wikipedia.org/wiki/RSS
  • 22. You can store RDF ... In simple (text) files, locally, remote, HDFS, ... Embedded web documents In graph databases
  • 23. RDF & Graphs (Separate)
     /data/tvseries/metadata.ttl
     @prefix dbpedia: <http://dbpedia.org/resource/> .
     @prefix schema: <http://schema.org/> .
     @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
     dbpedia:Friends schema:numberOfSeasons 10 ;
         schema:datePublished "1994-09-22"^^xsd:date ;
         schema:genre dbpedia:Sitcom .
     /data/tvseries/labels.ttl
     @prefix dbpedia: <http://dbpedia.org/resource/> .
     @prefix schema: <http://schema.org/> .
     @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
     dbpedia:Friends schema:name "Friends"@en .
     dbpedia:The_Office schema:name "The Office"@en .
  • 24. RDF & Graphs (merge)
     File_all.ttl
     Can you name any other format where files can be merged without losing data integrity? CSV, SQL, XML, JSON, ...
     /data/tvseries.ttl
     @prefix dbpedia: <http://dbpedia.org/resource/> .
     @prefix schema: <http://schema.org/> .
     @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
     dbpedia:Friends schema:name "Friends"@en ;
         schema:numberOfSeasons 10 ;
         schema:datePublished "1994-09-22"^^xsd:date ;
         schema:genre dbpedia:Sitcom .
     dbpedia:The_Office schema:name "The Office"@en ;
         schema:genre dbpedia:Sitcom .
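Why merging works can be shown with ordinary set union. In this sketch each file is modelled as a Python set of (subject, predicate, object) string tuples, holding the triples of the two separate files from the previous slide; it is not a Turtle parser, and it ignores blank nodes, which need extra care when merging.

```python
# Contents of labels.ttl, one tuple per triple.
labels = {
    ("dbpedia:Friends", "schema:name", '"Friends"@en'),
    ("dbpedia:The_Office", "schema:name", '"The Office"@en'),
}

# Contents of metadata.ttl.
metadata = {
    ("dbpedia:Friends", "schema:numberOfSeasons", "10"),
    ("dbpedia:Friends", "schema:datePublished", '"1994-09-22"^^xsd:date'),
    ("dbpedia:Friends", "schema:genre", "dbpedia:Sitcom"),
}

# Merging two graphs is set union: order does not matter and duplicates collapse.
merged = labels | metadata
```

Because every statement carries its full identifiers, no row alignment or key matching is needed, unlike concatenating two CSV files with different columns.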
  • 25. Datasets / multi-graph TriG files
     /data/tvseries.trig
     @prefix dbpedia: <http://dbpedia.org/resource/> .
     @prefix schema: <http://schema.org/> .
     @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
     <http://example.com/labels> {
         dbpedia:Friends schema:name "Friends"@en .
         dbpedia:The_Office schema:name "The Office"@en .
     }
     <http://example.com/metadata> {
         dbpedia:Friends schema:datePublished "1994-09-22"^^xsd:date ;
             schema:numberOfSeasons 10 .
     }
     <http://example.com/genre> {
         dbpedia:Friends schema:genre dbpedia:Sitcom .
         dbpedia:The_Office schema:genre dbpedia:Sitcom .
     }
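A dataset with named graphs can be pictured as a mapping from graph name to a set of triples. The sketch below mirrors the three graphs of the TriG file with plain Python containers; the helper names (`add`, `graphs_mentioning`, `union_graph`) are invented for illustration and only part of the metadata graph is reproduced.

```python
from collections import defaultdict

# graph name -> set of (subject, predicate, object) tuples
dataset = defaultdict(set)


def add(graph_name, subject, predicate, obj):
    """Store one triple in the named graph (a 'quad')."""
    dataset[graph_name].add((subject, predicate, obj))


add("http://example.com/labels", "dbpedia:Friends", "schema:name", '"Friends"@en')
add("http://example.com/labels", "dbpedia:The_Office", "schema:name", '"The Office"@en')
add("http://example.com/metadata", "dbpedia:Friends", "schema:numberOfSeasons", "10")
add("http://example.com/genre", "dbpedia:Friends", "schema:genre", "dbpedia:Sitcom")
add("http://example.com/genre", "dbpedia:The_Office", "schema:genre", "dbpedia:Sitcom")


def graphs_mentioning(subject):
    """Names of the graphs that say something about the given subject."""
    return sorted(
        name for name, triples in dataset.items()
        if any(s == subject for s, _, _ in triples)
    )


def union_graph():
    """All triples, ignoring which graph they came from."""
    return set().union(*dataset.values())
```

Keeping the graph name next to each triple is what later makes source-level bookkeeping (which file or provider said what) possible.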
  • 26. RDF is persistent, wherever it’s stored RDF DB Input Files Output Files Import Export Query Exactly the same (*) (*) The proper term is isomorphic graphs, to cover some special cases called blank nodes
  • 27. Big ecosystem SPARQL: RDF query language RDFS, OWL: RDF schema languages SHACL, ShEx: RDF constraint languages See http://book.validatingrdf.com (free online) R2RML: Virtual RDF views on top of RDBMS (e.g. MySQL) And many more specifications & tools...
  • 28. Takeaway points, so far... RDF is a graph data model > can be serialized in many formats > identifiers are persistent by design Natively stores & integrates diverse data RDF is kind of the new XML > but it is much cooler... > and you don’t need to write XML ;)
  • 29. Semantics & RDF Why they matter
  • 30. Semantics & RDF ● RDF is a core part of the Semantic Web vision ● Semantics is defined as: ○ the meaning of something (word, phrase, text, etc) ○ the branch of linguistics and logic concerned with meaning ● Too academic? “A Little Semantics Goes a Long Way” by prof. J. Hendler BuzzwordAlert!!!
  • 31. RDF & Semantics Ontologies are the results of modelling a specific domain Some people prefer the terms: model, vocabulary, taxonomy, schema (doesn’t make much difference) Ontologies in RDF deal with classes & properties > Some part is machine readable > Some part is human readable Can you tell which part is more important? (... a more pragmatic view)
  • 32. RDF Schema - Classes
     Classes of Things
     Machine-Readable Semantics
     Human-Readable Semantics
     @prefix ex: <http://example.com/> .
     @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
     @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
     ex:TVSeries rdf:type rdfs:Class ;
         rdfs:comment "Series dedicated to TV broadcast" ;
         rdfs:subClassOf ex:CreativeWork .
     ex:CreativeWork rdf:type rdfs:Class ;
         rdfs:comment "A generic kind of creative work, e.g. books, movies, etc." .
     … and we can assign types to Things (e.g. “Friends” is an instance of “TVSeries”)
     dbpedia:Friends rdf:type ex:TVSeries .
  • 33. RDF Schema - Properties
     Relationships between subjects and objects
     Machine-Readable Semantics
     Human-Readable Semantics
     @prefix ex: <http://example.com/> .
     @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
     @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
     ex:actor rdf:type rdf:Property ;
         rdfs:comment "The person that is the actor of a TVSeries." ;
         rdfs:domain ex:TVSeries ;
         rdfs:range ex:Person .
     … and we can use this in RDF statements
     dbpedia:Friends ex:actor dbpedia:Jennifer_Aniston .
  • 34. To Infer or to Validate?
     Given only the following, what can we say about dbpedia:Jennifer_Aniston and dbpedia:Friends?
     ex:actor rdf:type rdf:Property ;
         rdfs:domain ex:TVSeries ;
         rdfs:range ex:Person .
     dbpedia:Friends ex:actor dbpedia:Jennifer_Aniston .
     Inferred:
     dbpedia:Jennifer_Aniston rdf:type ex:Person .
     dbpedia:Friends rdf:type ex:TVSeries .
  • 35. To Infer or to Validate?
     Given only the following, what can we say?
     ex:actor rdf:type rdf:Property ;
         rdfs:domain ex:TVSeries ;
         rdfs:range ex:Person .
     ex:Dimitris rdf:type ex:Person .
     ex:VoxxedDaysAthens rdf:type ex:Conference .
     ex:VoxxedDaysAthens ex:actor ex:Dimitris .
     Something is not right… ex:VoxxedDaysAthens is not an ex:TVSeries
  • 36. To Infer or to Validate?
     Given only the following, what can we say?
     ex:actor rdf:type rdf:Property ;
         rdfs:domain ex:TVSeries ;
         rdfs:range ex:Person .
     ex:Dimitris rdf:type ex:Person .
     dbpedia:Friends rdf:type ex:TVSeries .
     dbpedia:Friends ex:actor ex:Dimitris .
     Appears legit
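The two readings of `rdfs:domain` and `rdfs:range` can be contrasted in a few lines of Python. This is a deliberately small sketch over string tuples: `infer_types` follows the RDFS reading (the schema adds new type statements), while `validate` treats the same schema as a closed-world constraint, which is an assumption closer in spirit to SHACL or ShEx than to RDFS itself. Both function names are invented here, and the data is the conference example from the slides.

```python
RDF_TYPE = "rdf:type"

# Schema from the slides: ex:actor has domain ex:TVSeries and range ex:Person.
DOMAIN = {"ex:actor": "ex:TVSeries"}
RANGE = {"ex:actor": "ex:Person"}

DATA = [
    ("ex:Dimitris", RDF_TYPE, "ex:Person"),
    ("ex:VoxxedDaysAthens", RDF_TYPE, "ex:Conference"),
    ("ex:VoxxedDaysAthens", "ex:actor", "ex:Dimitris"),
]


def infer_types(triples):
    """Inference: using a property implies the types of its subject and object."""
    known = set(triples)
    inferred = set()
    for s, p, o in triples:
        if p in DOMAIN:
            inferred.add((s, RDF_TYPE, DOMAIN[p]))
        if p in RANGE:
            inferred.add((o, RDF_TYPE, RANGE[p]))
    return inferred - known  # only the statements that are actually new


def validate(triples):
    """Validation: report subjects/objects whose expected type is not declared."""
    declared = {(s, o) for s, p, o in triples if p == RDF_TYPE}
    problems = []
    for s, p, o in triples:
        if p in DOMAIN and (s, DOMAIN[p]) not in declared:
            problems.append(f"{s} is not declared as {DOMAIN[p]}")
        if p in RANGE and (o, RANGE[p]) not in declared:
            problems.append(f"{o} is not declared as {RANGE[p]}")
    return problems


new_facts = infer_types(DATA)
violations = validate(DATA)
```

With inference the conference silently becomes a TV series as well; with validation the same statement is flagged. Which behaviour is wanted depends on how much the incoming data is trusted.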
  • 37. Schema stored & queried as Data
     ex:TVSeries rdf:type rdfs:Class ;
         rdfs:subClassOf ex:CreativeWork .
     ex:BookSeries rdf:type rdfs:Class ;
         rdfs:subClassOf ex:CreativeWork .
     ex:CreativeWork rdf:type rdfs:Class .
     dbpedia:Friends rdf:type ex:TVSeries .
     dbpedia:The_Office rdf:type ex:TVSeries .
     dbpedia:Narnia rdf:type ex:BookSeries .
     SELECT ?s WHERE { ?s rdfs:subClassOf ex:CreativeWork . }
     => ex:TVSeries, ex:BookSeries
     SELECT ?s WHERE { ?s rdf:type ex:TVSeries . }
     => dbpedia:Friends, dbpedia:The_Office
  • 38. Schema stored & queried as Data
     Navigates the class hierarchy
     SELECT ?s WHERE { ?s rdf:type/rdfs:subClassOf* ex:CreativeWork }
     => dbpedia:Friends, dbpedia:The_Office, dbpedia:Narnia
     Hierarchy can be extended without breaking the query
     ex:TVSeries rdf:type rdfs:Class ;
         rdfs:subClassOf ex:CreativeWork .
     ex:BookSeries rdf:type rdfs:Class ;
         rdfs:subClassOf ex:CreativeWork .
     ex:CreativeWork rdf:type rdfs:Class .
     dbpedia:Friends rdf:type ex:TVSeries .
     dbpedia:The_Office rdf:type ex:TVSeries .
     dbpedia:Narnia rdf:type ex:BookSeries .
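What the property path `rdf:type/rdfs:subClassOf*` does can be imitated in plain Python: take each instance's declared class, then follow zero or more `rdfs:subClassOf` links upwards. This is a sketch over string tuples, not a SPARQL engine; `superclasses` and `instances_of` are made-up helper names, and `ex:Sitcom` / `dbpedia:Seinfeld` are hypothetical additions used only to show the hierarchy being extended.

```python
TRIPLES = [
    ("ex:TVSeries", "rdf:type", "rdfs:Class"),
    ("ex:TVSeries", "rdfs:subClassOf", "ex:CreativeWork"),
    ("ex:BookSeries", "rdf:type", "rdfs:Class"),
    ("ex:BookSeries", "rdfs:subClassOf", "ex:CreativeWork"),
    ("ex:CreativeWork", "rdf:type", "rdfs:Class"),
    ("dbpedia:Friends", "rdf:type", "ex:TVSeries"),
    ("dbpedia:The_Office", "rdf:type", "ex:TVSeries"),
    ("dbpedia:Narnia", "rdf:type", "ex:BookSeries"),
]


def superclasses(cls, triples):
    """The class itself plus everything reachable via rdfs:subClassOf (the '*' part)."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def instances_of(target, triples):
    """Subjects whose declared type is the target class or any of its subclasses."""
    return sorted(
        s for s, p, o in triples
        if p == "rdf:type" and target in superclasses(o, triples)
    )


before = instances_of("ex:CreativeWork", TRIPLES)

# Extending the hierarchy (hypothetical class and instance) needs no change to the query.
EXTENDED = TRIPLES + [
    ("ex:Sitcom", "rdfs:subClassOf", "ex:TVSeries"),
    ("dbpedia:Seinfeld", "rdf:type", "ex:Sitcom"),
]
after = instances_of("ex:CreativeWork", EXTENDED)
```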
  • 39. Many Available free Schemas Many existing free (as in beer) ontologies (or schemas) model different domains > General purpose (DBpedia, schema.org) > Geographical (geo) > Provenance (prov-o) > Taxonomies / Classification (SKOS family) > Organizations (org) > Find ~600 entries at http://lov.okfn.org
  • 40. Reusing Available (Free) schemas Get part of your data modeling for free > Groups of people already worked on modeling the domain > Spent time defining human and machine-readable semantics Makes data integration easier > Data published with common schemas > Data easier to consume
  • 41. Mapping to Available (Free) schemas Map when not reusing > integrate data in a loosely coupled way ex:TVSeries owl:equivalentClass schema:TVSeries . ex:actor owl:equivalentProperty schema:actor .
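One pragmatic way to use such mappings is to rewrite local terms into the shared vocabulary before handing data to a consumer. The sketch below does that with a dictionary; it is a simplification, since OWL equivalence is symmetric and a reasoner would normally derive the extra statements rather than replace terms. The function name `to_shared_vocabulary` is invented for this example.

```python
# Local term -> shared term, taken from the two mapping statements on the slide.
EQUIVALENT = {
    "ex:TVSeries": "schema:TVSeries",   # owl:equivalentClass
    "ex:actor": "schema:actor",         # owl:equivalentProperty
}

LOCAL_DATA = {
    ("dbpedia:Friends", "rdf:type", "ex:TVSeries"),
    ("dbpedia:Friends", "ex:actor", "dbpedia:Jennifer_Aniston"),
}


def to_shared_vocabulary(triples, mapping=EQUIVALENT):
    """Replace mapped classes and properties; leave everything else untouched."""
    return {(s, mapping.get(p, p), mapping.get(o, o)) for s, p, o in triples}


shared = to_shared_vocabulary(LOCAL_DATA)
```

The local model stays as it is; only the mapping table has to change when the shared schema evolves, which is what keeps the coupling loose.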
  • 42. RDF & Semantics - take away points It’s all about Classes & Properties Human-readable semantics > Commonly accepted modelling conventions Machine-readable semantics > Can be used for inference and/or validation > Can be queried together with data Reusing [or linking to] common ontologies / schemas > Integrating data with less variety > Network effect (the more people/data use it the better) > Developing reusable applications against schemas
  • 44. Given only this, what can we do/say?
     <https://voxxeddays.com/athens> <https://schema.org/attendee> <http://kontokostas.com> .
     schema:Event (domain), schema:Person (range). A person attending the event.
     HTTP GET
     <https://voxxeddays.com/athens> rdf:type schema:Event ;
         schema:name "Voxxed Athens" ;
         schema:startDate "2018-06-01" ;
         schema:endDate "2018-06-02" ;
         schema:inLanguage "English" ;
         schema:description "..." .
     HTTP GET
     <http://kontokostas.com> rdf:type schema:Person ;
         schema:givenName "Dimitris" ;
         schema:familyName "Kontokostas" ;
         schema:birthPlace dbpedia:Greece ;
         schema:jobTitle "Data Engineer" ;
         schema:worksFor <https://geophy.com> .
     HTTP GET
  • 45. Follow your nose pattern
     <http://kontokostas.com> <https://schema.org/birthPlace> <http://dbpedia.org/resource/Greece> .
     schema:Person (domain), schema:Place (range). The place where the person was born.
     HTTP GET
     <http://kontokostas.com> rdf:type schema:Person ;
         schema:givenName "Dimitris" ;
         schema:familyName "Kontokostas" ;
         schema:birthPlace dbpedia:Greece ;
         schema:jobTitle "Data Engineer" ;
         schema:worksFor <https://geophy.com> .
     HTTP GET
     <http://dbpedia.org/resource/Greece> rdf:type schema:Place, dbpedia:Country ;
         dbo:capital dbpedia:Athens ;
         dbo:currency dbpedia:Euro ;
         geo:lat "39.0"^^xsd:float ;
         geo:long "22.0"^^xsd:float .
     HTTP GET
  • 46. RDF & Linked Data Things represented with http(s)-based URIs can be self-published HTTP GET requests on Things return RDF Triples where it is a subject (or an object) Decentralized storage / access / semantics (*) a.k.a. the Web of Data, see TED talk from Tim Berners Lee (Creator of WWW)
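Dereferencing a Thing is an ordinary HTTP GET that asks for an RDF media type. The sketch below uses only the Python standard library; the helper names are made up, and whether a given server honours the `Accept` header or offers `text/turtle` at all is an assumption that has to be checked per site.

```python
from urllib.request import Request, urlopen


def linked_data_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build a GET request that asks the server for an RDF serialization."""
    return Request(uri, headers={"Accept": media_type})


def fetch(uri: str, media_type: str = "text/turtle", timeout: float = 10.0) -> str:
    """Perform the request and return the response body as text (assumes UTF-8)."""
    with urlopen(linked_data_request(uri, media_type), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; calling fetch(...) would.
request = linked_data_request("http://dbpedia.org/resource/Greece")
```

Following your nose then means parsing the returned triples, picking a URI that appears in them, and calling `fetch` again on that URI.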
  • 47. RDF & Linked Data (on the web) kontokostas.com example.com voxxeddays.com/Athens DBpedia Web of Data DBpedia DBpedia DBpedia Wikipedia As RDF
  • 48. RDF & Linked Data (on the enterprise) Web of Data RDF DB x LD x RDF DB y LD y RDF DB z LD z LD w
  • 49. Linked Open Data Cloud Diagram from 2014 (v2018 is too big) 1,184 datasets 15,993 links https://lod-cloud.net/
  • 50. Reusing available datasets / identifiers
     Just like reusing schemas, referencing / reusing external identifiers facilitates:
     Data integration
     e.g. dbpedia:Friends represents the Friends TV series, not some friends
     > use dbpedia:Friends directly
     > link it: ex:tv_series_123 owl:sameAs dbpedia:Friends
     Data enrichment
     e.g. dbpedia:Friends may have more information about the series than our database, and we can easily get it over HTTP
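How an `owl:sameAs` link turns two descriptions into one can be sketched by picking a single representative identifier per group of linked identifiers and rewriting all triples to use it. This is a simplification of what a reasoner or an integration pipeline would do; the helper names and the local property `ex:internalRating` with its value are hypothetical.

```python
def canonical_ids(same_as_pairs):
    """Map every linked identifier to one representative per sameAs group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the alphabetically smaller identifier so the result is deterministic.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep
    return {x: find(x) for x in list(parent)}


def rewrite(triples, canon):
    """Replace subjects and objects by their canonical identifier."""
    return {(canon.get(s, s), p, canon.get(o, o)) for s, p, o in triples}


SAME_AS = [("ex:tv_series_123", "dbpedia:Friends")]
LOCAL = {("ex:tv_series_123", "ex:internalRating", "5")}          # hypothetical local fact
EXTERNAL = {("dbpedia:Friends", "schema:numberOfSeasons", "10")}  # fact from the shared dataset

canon = canonical_ids(SAME_AS)
integrated = rewrite(LOCAL | EXTERNAL, canon)
```

After the rewrite, local and external facts sit on the same node, which is the enrichment effect described above.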
  • 51. RDF & Linked Data - take away points Decentralisation of Data Management Self-documented schemas & data Scale your [local] graphs to the [Enterprise] Web Big pool of stable identifiers (i.e. DBpedia)
  • 52. Pay as you go data integration
     You can get benefit with low effort
     > RDF views on top of RDBMS with R2RML (mappings, SPARQL-to-SQL translation)
     > Convert XML/JSON/CSV/… to RDF with RML
     The more time you invest the better the results
     > Schema development, mapping & linking
     > Semi-automatic link discovery with tools like Limes & Silk, e.g.: ex:tv_series_123 owl:sameAs dbpedia:Friends
     RDF does not need to be your master dataset
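The low-effort end of this spectrum can be as small as a column-to-predicate table. The sketch below converts CSV rows to triples with the standard library; it only imitates the idea behind mapping languages such as RML and does not use them. The CSV content, the column names and the `rows_to_triples` helper are all made up for illustration.

```python
import csv
import io

# Hypothetical input: one row per TV series, with an empty cell for a missing value.
CSV_TEXT = "id,name,seasons\nFriends,Friends,10\nThe_Office,The Office,\n"

# The whole "mapping": which column becomes which predicate.
COLUMN_TO_PREDICATE = {
    "name": "schema:name",
    "seasons": "schema:numberOfSeasons",
}


def rows_to_triples(text, subject_prefix="dbpedia:"):
    """Yield one triple per non-empty mapped cell; the id column forms the subject."""
    for row in csv.DictReader(io.StringIO(text)):
        subject = subject_prefix + row["id"]
        for column, predicate in COLUMN_TO_PREDICATE.items():
            value = row.get(column)
            if value:  # empty cells simply produce no statement
                yield (subject, predicate, value)


triples = list(rows_to_triples(CSV_TEXT))
```

A missing value produces no triple rather than a NULL, so sources with different columns can be loaded side by side without agreeing on a common table layout first.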
  • 54. 28% of TLD (or 39% of HTML pages) > 3.7M Microdata > 2.7M JSON-LD > 1.2M RDFa In total 9 billion Things & 38 billion RDF triples Full report at http://webdatacommons.org/structureddata/#results-2017-1 Structured data on the web (Nov 2017)
  • 56. RDF Ontology > Less strict / formal > Promotes JSON-LD Funded & maintained by the major search engines Drives many Google products...
  • 58. Schema.org && Google && Search https://developers.google.com/search/docs/guides/search-features
  • 59. Google is... Using the RDF graph model to integrate diverse data from webpages & emails By using the concept of Linked Data And this is all empowered by a common ontology (or schema)
  • 61. GeoPhy provides value, risk, & quality metrics for every building in the world
  • 62. RDF @GeoPhy
     We collect & integrate a lot of data
     > on properties, on their surroundings, and on the market conditions
     Master dataset on Real Estate (aka Knowledge Graph)
     > driving our Machine Learning / Deep Learning models
     Challenges...
     > We have thousands of sources
     > Sources are updated at arbitrary intervals
     > We get our data in CSV, on the good days
     And, of course… we are not Google to make people write RDF for us :-)
  • 63. GeoPhy Data Management Platform CSV PDF GeoPhy Ontologies Transform To RDF Validate Identify & Deduplicate Conflict resolution Data Fusion Data Wrangling & Extraction Annotation & Provenance Modeling Mapping CoreDB Provenance (value-level) Data Indexing Data Ingestion Data Enrichment Dependency Detection Geo Enrichment Trigger ML/DL API
  • 64. And the closing slide... People think RDF is a pain because it is complicated. The truth is even worse. RDF is painfully simplistic, but it allows you to work with real-world data and problems that are horribly complicated. While you can avoid RDF, it is harder to avoid complicated data and complicated computer problems. Dan Brickley, Schema.org and Google Libby Miller, BBC
  • 65. Thank you for your attention Questions? Many thanks to Sander, Matt and the whole GeoPhy Eng. Team for their feedback