Graph databases & data integration
The case of RDF
By Dimitris Kontokostas
AKSW/KILT - Leipzig
DBpedia Association
Thessaloniki Java Meetup / 09.05.2016
Thessaloniki Java meetup - 09.05.2016
About me
● I live in Veria
● I am an ex-ICT teacher
● Since 2003 I have been working mainly on R&D projects
○ + some web development
● Since 2012 doing a PhD & working in the AKSW group in Leipzig
○ Focusing on semantic web technologies (RDF, SPARQL, and many other scary terms)
○ aka Knowledge Engineer
● I am an open source enthusiast (DBpedia, RDFUnit)
● Recently became a W3C specification editor for SHACL
● Walked across many languages but ended up in Scala, Java, & Bash
○ With bash / CLI as a first choice ;)
Before we start… who knows?
LOD Cloud
Linked Data
Agenda*
● Graphs
● RDF Graphs
● Data integration
● Who uses RDF
● Quick overview of:
○ DBpedia
○ SPARQL
○ RelFinder
○ Schema.org & actions
○ JSON-LD
○ Entity disambiguation
○ Data Quality
(*) focusing mostly on getting familiar with basic terms and concepts
(**) Apologies in advance for mixing Greek with English
The four V’s heatmap for Graph Databases
A 2013 study found:
● many organizations find the “variety” dimension a greater challenge than volume or velocity.
Graph DBs to the rescue:
● Combine multiple sources with different structures
● while retaining the flexibility to add new ones without adapting schemas
● Query combined data, or multiple sources at once
● Detect patterns in the data
(*) See also this
© Image by Max De Marzi
Graphs
● A graph is a way of specifying relationships among a collection of items
● Items
○ Nodes - Alice, Bob, …
○ Edges
■ undirected - knows, …
■ directed - follows, …
○ Values - weights, distances, scores, 0-5 scale, …
○ Attributes - name, time, ...
Graph Data Models
Property graphs
● Industry standards
○ Neo4j, Titan, Apache TinkerPop, ...
○ App-specific ways of querying, exporting, importing, etc.
○ Optimized for specific operations and in many cases faster
RDF Graphs
● W3C standards
○ Like XML / HTML: define once, run everywhere™
○ Standardised way of querying, exporting, importing
Property Graphs
● Each node has a
○ unique identifier.
○ set of outgoing edges.
○ set of incoming edges.
○ collection of key-value properties.
● Each edge
○ is directed.
○ has a unique identifier.
○ has a label that denotes the type of relationship between its source and target nodes.
○ has a collection of key-value properties.
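The node and edge definitions above can be sketched as a small in-memory structure. This is only an illustration of the model, not how Neo4j or Titan store data; the class names, the `add_node` / `add_edge` helpers and the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str                                        # unique identifier
    properties: dict = field(default_factory=dict)  # key-value properties
    outgoing: list = field(default_factory=list)    # ids of edges leaving this node
    incoming: list = field(default_factory=list)    # ids of edges arriving at this node


@dataclass
class Edge:
    id: str                                        # unique identifier
    label: str                                     # type of relationship
    source: str                                    # id of the source node
    target: str                                    # id of the target node
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical minimal property graph: two dictionaries keyed by id."""

    def __init__(self):
        self.nodes = {}
        self.edges = {}

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, dict(properties))
        return self.nodes[node_id]

    def add_edge(self, edge_id, label, source, target, **properties):
        edge = Edge(edge_id, label, source, target, dict(properties))
        self.edges[edge_id] = edge
        # Every edge is directed, so it is registered on both endpoints.
        self.nodes[source].outgoing.append(edge_id)
        self.nodes[target].incoming.append(edge_id)
        return edge


g = PropertyGraph()
g.add_node("n1", name="Alice")
g.add_node("n2", name="Bob")
g.add_edge("e1", "follows", "n1", "n2", since=2016)
```

Note that both nodes and edges carry their own properties, which is the main structural difference from the RDF triples introduced next.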
RDF - Resource Description Framework
● An RDF Graph is a set of RDF Triples
● An RDF triple consists of (only) three components:
○ the subject (is an IRI)
○ the predicate (is an IRI)
○ the object (can be an IRI or Literal)
○ (subjects and objects can also be blank nodes but let’s leave it for now)
[Figure: example RDF graph drawn as Subject - Predicate - Object. Nodes: http://dbpedia.org/resource/Java, http://dbpedia.org/resource/C++, http://dbpedia.org/resource/C# and the literal "1.8.0_60". Edge labels: dbo:latestReleaseVersion, dbo:influencedBy (twice).]
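A minimal sketch of the definition above, assuming we ignore blank nodes as the slide does: a triple is an immutable record of three terms and a graph is a plain set of such records. The `IRI`, `Literal` and `Triple` classes and the `objects` helper are names invented for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: IRI                 # blank nodes are left out, as on the slide
    predicate: IRI
    obj: Union[IRI, Literal]     # only the object may be a literal


DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"
RDF_TYPE = IRI("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")

java_version = Triple(IRI(DBR + "Java"), IRI(DBO + "latestReleaseVersion"), Literal("1.8.0_60"))
graph = {
    java_version,
    Triple(IRI("http://example.com/Dimitris"), RDF_TYPE, IRI(DBO + "Person")),
}
graph.add(java_version)  # adding the same triple again changes nothing: a graph is a set


def objects(graph, subject, predicate):
    """Return every object linked from `subject` through `predicate`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}
```

For instance, `objects(graph, IRI(DBR + "Java"), IRI(DBO + "latestReleaseVersion"))` returns a set holding the single literal.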
RDF is an abstract data model
Turtle
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix ex: <http://example.com/> .
ex:Dimitris a dbo:Person .
N-Triples
<http://example.com/Dimitris> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Person> .
JSON-LD
{ "@id": "http://example.com/Dimitris",
"@type": "http://dbpedia.org/ontology/Person" }
RDF/XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="http://example.com/Dimitris">
<rdf:type rdf:resource="http://dbpedia.org/ontology/Person"/>
</rdf:Description>
</rdf:RDF>
RDFa (embedded in html)
<div xmlns="http://www.w3.org/1999/xhtml"
prefix=" rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
dbo: http://dbpedia.org/ontology/
rdfs: http://www.w3.org/2000/01/rdf-schema#">
<div typeof="dbo:Person" about="http://example.com/Dimitris">
</div>
</div>
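Because the model is abstract, a serializer is just a function from triples to text. The sketch below writes the simplest of the formats, N-Triples, by hand. It assumes terms are tagged tuples built with the hypothetical `iri` / `literal` helpers, and it skips language tags, datatypes and blank nodes; in practice a library such as Apache Jena would do this.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    return ("iri", value)


def literal(value):
    return ("literal", value)


def term_to_nt(term):
    """Render one term in N-Triples syntax (plain literals and IRIs only)."""
    kind, value = term
    if kind == "iri":
        return "<" + value + ">"
    if kind == "literal":
        escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                        .replace("\n", "\\n").replace("\r", "\\r"))
        return '"' + escaped + '"'
    raise ValueError("unknown term kind: " + kind)


def to_ntriples(triples):
    """One triple per line, each terminated by ' .'; sorted for stable output."""
    lines = sorted(" ".join(term_to_nt(t) for t in triple) + " ." for triple in triples)
    return "\n".join(lines) + "\n"


triples = {
    (iri("http://example.com/Dimitris"), iri(RDF_TYPE), iri("http://dbpedia.org/ontology/Person")),
}
print(to_ntriples(triples), end="")
```

N-Triples has no prefixes and no `a` shorthand, so `rdf:type` has to be written as the full IRI.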
RDF & Graphs (Separate)
File1.ttl
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex: <http://example.com/> .
ex:Dimitris foaf:knows ex:Petros .
File2.ttl
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex: <http://example.com/> .
ex:Dimitris a foaf:Person .
ex:Petros a foaf:Person .
File3.ttl
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix ex: <http://example.com/> .
ex:Dimitris foaf:interest dbpedia:RDF .
ex:Petros foaf:interest dbpedia:Cassandra .
RDF & Graphs (merge)
File_all.ttl
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex: <http://example.com/> .
ex:Dimitris foaf:knows ex:Petros .
ex:Dimitris a foaf:Person .
ex:Petros a foaf:Person .
@prefix dbpedia: <http://dbpedia.org/resource/> .
ex:Dimitris foaf:interest dbpedia:RDF .
ex:Petros foaf:interest dbpedia:Cassandra .
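Merging works this easily because graphs are sets of triples and the same IRI always names the same node. A minimal sketch, assuming the three files have already been parsed into Python sets of (subject, predicate, object) string tuples; the `describe` helper is made up for the example. Graphs that contain blank nodes need extra care when merging, which is not covered here.

```python
EX = "http://example.com/"
FOAF = "http://xmlns.com/foaf/0.1/"
DBPEDIA = "http://dbpedia.org/resource/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# One set per source file.
file1 = {(EX + "Dimitris", FOAF + "knows", EX + "Petros")}
file2 = {
    (EX + "Dimitris", RDF_TYPE, FOAF + "Person"),
    (EX + "Petros", RDF_TYPE, FOAF + "Person"),
}
file3 = {
    (EX + "Dimitris", FOAF + "interest", DBPEDIA + "RDF"),
    (EX + "Petros", FOAF + "interest", DBPEDIA + "Cassandra"),
}

# Merging is set union: no schema changes and no join keys to agree on.
merged = file1 | file2 | file3


def describe(graph, subject):
    """All (predicate, object) pairs known about one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


for predicate, obj in describe(merged, EX + "Dimitris"):
    print(predicate, obj)
```

After the union, everything the three files said about ex:Dimitris is reachable from a single node.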
RDF & Graphs (dataset / multi-graph) .trig files
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix ex: <http://example.com/> .
<http://example.com/relations-graph> {
ex:Dimitris foaf:knows ex:Petros .
}
<http://example.com/types-graph> {
ex:Dimitris a foaf:Person .
ex:Petros a foaf:Person .
}
<http://example.com/interests-graph> {
ex:Dimitris foaf:interest dbpedia:RDF .
ex:Petros foaf:interest dbpedia:Cassandra .
}
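One way to picture a dataset is as a mapping from graph name to a set of triples, or equivalently as a set of quads where the fourth element is the graph name. The sketch below assumes that representation; the helper names are invented for illustration.

```python
EX = "http://example.com/"
FOAF = "http://xmlns.com/foaf/0.1/"
DBPEDIA = "http://dbpedia.org/resource/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Graph name -> set of (subject, predicate, object) triples.
dataset = {
    EX + "relations-graph": {(EX + "Dimitris", FOAF + "knows", EX + "Petros")},
    EX + "types-graph": {
        (EX + "Dimitris", RDF_TYPE, FOAF + "Person"),
        (EX + "Petros", RDF_TYPE, FOAF + "Person"),
    },
    EX + "interests-graph": {
        (EX + "Dimitris", FOAF + "interest", DBPEDIA + "RDF"),
        (EX + "Petros", FOAF + "interest", DBPEDIA + "Cassandra"),
    },
}


def quads(dataset):
    """Flatten the dataset into (subject, predicate, object, graph) quads."""
    return {(s, p, o, g) for g, triples in dataset.items() for s, p, o in triples}


def graphs_with_subject(dataset, subject):
    """Names of the graphs that say something about `subject`."""
    return sorted(g for g, triples in dataset.items()
                  if any(s == subject for s, _, _ in triples))


def union_graph(dataset):
    """Forget the graph names and merge everything into one graph."""
    return set().union(*dataset.values())
```

Keeping the graph name around makes it possible to track where each statement came from, while `union_graph` gives back the merged view from the previous slide.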
RDF & Linked Data
● Using HTTP(S)-based IRIs we get the Web of Data
○ See the TED talk by Tim Berners-Lee (creator of the WWW)
● Every RDF resource becomes like a REST GET API that returns all the RDF triples it is associated with
○ content negotiation for RDF (machine) or HTML (human)
○ Follow-your-nose pattern
[Figure: the example graph extended across datasets. Nodes: http://dbpedia.org/resource/Java, http://dbpedia.org/resource/C++, http://dbpedia.org/resource/C#, http://aksw.org/DimitrisKontokostas, http://www.geonames.org/733905/ and the literals "1.8.0_60", 40.52437, 22.20242. Edge labels: dbo:latestReleaseVersion, dbo:influencedBy (twice), ex:learns, dbo:birthPlace, geo:lat, geo:long.]
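Content negotiation boils down to sending an `Accept` header with an ordinary GET request. The sketch below only builds the request so that it runs offline; `linked_data_request` is a hypothetical helper, and whether a given server answers with Turtle for `text/turtle` is an assumption to check per data source.

```python
import urllib.request


def linked_data_request(iri, accept="text/turtle"):
    """Build a GET request that asks for an RDF serialization instead of HTML."""
    return urllib.request.Request(iri, headers={"Accept": accept})


machine = linked_data_request("http://dbpedia.org/resource/Java")
human = linked_data_request("http://dbpedia.org/resource/Java", accept="text/html")

# Sending is one more call, left out here so the sketch needs no network:
#   with urllib.request.urlopen(machine) as response:
#       turtle = response.read().decode("utf-8")
print(machine.get_method(), machine.full_url, machine.get_header("Accept"))
```

Following your nose then means parsing the returned triples, picking an IRI from them and issuing the same kind of request again.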
LOD CLOUD
>1K Datasets
>50B Triples
>100M links
Vocabularies & Semantics
● Vocabularies/Ontologies define classes and predicates (properties) in RDF
○ ex:Dimitris a dbo:Person
○ ex:Dimitris dbo:birthDate "1981-06-06"^^xsd:date
● Existing vocabularies capture many use cases
○ DBpedia ontology (general purpose)
○ Schema.org (general purpose / new, backed by Google, Yahoo, Bing & Yandex)
○ FOAF (Friend of a Friend)
○ Geo (geographical)
○ PROV-O (data provenance)
○ SKOS (classifications)
○ Org (organization structure)
○ … http://lov.okfn.org lists more than 400
Vocabularies & Semantics
● Classes and predicates (properties) have definitions (semantics)
● ex:Dimitris a dbo:Person
○ dbo:Person belongs to a class hierarchy
● ex:Dimitris dbo:birthDate "1981-06-06"^^xsd:date
○ dbo:birthDate expects a dbo:Person as subject
○ dbo:birthDate expects an xsd:date as object
● Reusing existing vocabularies (classes & properties) with defined semantics is a good practice
○ Get part of the data modeling for free
○ Using common terms makes data easier to integrate
○ Validation (or inference) for free
■ ex:Thessaloniki dbo:birthDate "1981-06-06"^^xsd:date (is Thessaloniki a Person?)
■ ex:Dimitris dbo:birthDate ex:Thessaloniki (ex:Thessaloniki is not an xsd:date)
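The two failing statements above can be caught mechanically once the expected subject class and object datatype of a property are known. The sketch below reads those expectations as validation rules (a reasoner would instead infer that Thessaloniki is a person). It is a toy checker, not RDFUnit or SHACL: prefixed names are kept as plain strings, typed literals are (lexical, datatype) tuples, the class hierarchy is ignored, and the `dbo:City` typing is an assumption added for the example.

```python
RDF_TYPE = "rdf:type"

# Expectations taken from the slide: dbo:birthDate links a dbo:Person to an xsd:date.
SCHEMA = {"dbo:birthDate": {"domain": "dbo:Person", "range": "xsd:date"}}


def typed_literal(lexical, datatype):
    return (lexical, datatype)


def validate(triples, schema):
    """Return (subject, predicate, message) for every violated expectation."""
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    violations = []
    for s, p, o in triples:
        rule = schema.get(p)
        if rule is None:
            continue  # nothing is known about this property
        if rule["domain"] not in types.get(s, set()):
            violations.append((s, p, "subject is not a " + rule["domain"]))
        if not (isinstance(o, tuple) and o[1] == rule["range"]):
            violations.append((s, p, "object is not an " + rule["range"]))
    return violations


data = [
    ("ex:Dimitris", RDF_TYPE, "dbo:Person"),
    ("ex:Thessaloniki", RDF_TYPE, "dbo:City"),  # assumed typing, for illustration
    ("ex:Dimitris", "dbo:birthDate", typed_literal("1981-06-06", "xsd:date")),
    ("ex:Thessaloniki", "dbo:birthDate", typed_literal("1981-06-06", "xsd:date")),
    ("ex:Dimitris", "dbo:birthDate", "ex:Thessaloniki"),
]
violations = validate(data, SCHEMA)
for violation in violations:
    print(violation)
```

The rules come from the vocabulary, not from the application, so any dataset that reuses dbo:birthDate can be checked with the same code.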
Data integration with RDF
● Very simple graph data model
● Convert your data to RDF and model against common vocabularies
○ Design applications against vocabularies
○ Integrate multiple different sources
● Local identifiers are a common integration problem
● Link to data authorities
○ ex:Dimitris dbo:birthPlace geonames:733905 (instead of the local ex:Veria)
○ (or) ex:Veria owl:sameAs geonames:733905
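A sketch of why linking to an authority helps: once each local identifier is declared the same as a shared one, data from different sources can be rewritten onto common nodes. For simplicity the code treats `owl:sameAs` as a one-way rewrite rule from local IRI to authority IRI, although the property itself is symmetric. The second source (`http://other.example/`) and the helper names are hypothetical.

```python
EX = "http://example.com/"
OTHER = "http://other.example/"          # hypothetical second data source
GEONAMES_VERIA = "http://www.geonames.org/733905/"
BIRTH_PLACE = "http://dbpedia.org/ontology/birthPlace"

# local identifier -> authority identifier (one owl:sameAs link per entry)
same_as = {
    EX + "Veria": GEONAMES_VERIA,
    OTHER + "places/veria": GEONAMES_VERIA,
}


def resolve(iri, same_as):
    """Follow sameAs links until an IRI without a further link is reached."""
    seen = set()
    while iri in same_as and iri not in seen:  # `seen` guards against cycles
        seen.add(iri)
        iri = same_as[iri]
    return iri


def integrate(triples, same_as):
    """Rewrite subjects and objects onto their authority identifiers."""
    return {(resolve(s, same_as), p, resolve(o, same_as)) for s, p, o in triples}


source_a = {(EX + "Dimitris", BIRTH_PLACE, EX + "Veria")}
source_b = {(OTHER + "people/42", BIRTH_PLACE, OTHER + "places/veria")}

integrated = integrate(source_a | source_b, same_as)
born_in_veria = sorted(s for s, p, o in integrated
                       if p == BIRTH_PLACE and o == GEONAMES_VERIA)
print(born_in_veria)
```

Before the rewrite the two sources share no node; afterwards a single lookup on the GeoNames IRI finds people from both.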
Pay as you go Data Integration
● RDF views on top of an RDBMS (e.g. MySQL) with R2RML (W3C spec)
○ Mapping files define how SQL queries / tables translate to RDF
○ Queryable through a virtual SPARQL endpoint that translates SPARQL to SQL
● Convert XML/JSON/CSV/… to RDF with RML.io using mapping files
● Find links to external databases with LIMES & Silk
○ e.g.: ex:Veria owl:sameAs geonames:733905
● You can get some benefit with low effort
● The more time you invest the better the results
● (Common practice) work on secondary RDF views of your data
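The idea behind a mapping file can be sketched in a few lines: a query, a template for the subject IRI, and a predicate per column. The dictionary below is a made-up stand-in and not R2RML syntax; the table, the column names and the choice of `foaf:name` are assumptions, and an in-memory SQLite database replaces MySQL so the sketch is self-contained.

```python
import sqlite3

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, birth_date TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Dimitris', '1981-06-06')")

# Hypothetical mapping: how one SQL query translates to triples.
mapping = {
    "query": "SELECT id, name, birth_date FROM person",
    "subject_template": "http://example.com/person/{id}",
    "class": "http://dbpedia.org/ontology/Person",
    "columns": {
        "name": "http://xmlns.com/foaf/0.1/name",
        "birth_date": "http://dbpedia.org/ontology/birthDate",
    },
}


def rows_to_triples(conn, mapping):
    """Yield one rdf:type triple per row plus one triple per mapped, non-NULL column."""
    cursor = conn.execute(mapping["query"])
    names = [column[0] for column in cursor.description]
    for row in cursor:
        record = dict(zip(names, row))
        subject = mapping["subject_template"].format(**record)
        yield (subject, RDF_TYPE, mapping["class"])
        for column, predicate in mapping["columns"].items():
            if record[column] is not None:
                yield (subject, predicate, str(record[column]))


triples = list(rows_to_triples(conn, mapping))
for triple in triples:
    print(triple)
```

The relational data stays where it is; the RDF is a secondary view that can be regenerated whenever the mapping improves.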
Who uses RDF (in public)
https://github.com/json-ld/json-ld.org/wiki/Users-of-JSON-LD
Some More Statistics
● Based on the Common Crawl of Nov 2015
● 30% of HTML pages (541M / 1.77B pages) contained structured data.
● This 30% originates from 2.72M different pay-level domains out of the 14.41 million pay-level domains covered by the crawl (19%).
○ 521K websites use RDFa
○ 1.1 million use Microdata
○ 586K have embedded JSON-LD (mostly for search actions)
● Altogether, the extracted data sets consist of 24.38 billion RDF quads.
http://webdatacommons.org/structureddata/2015-11/stats/stats.html#results-2015-1
DBpedia: let’s look at John Cleese (Monty Python)
SPARQL
“Which films starred John Cleese without any other members of Monty Python?”
SPARQL examples by Markus Ackermann & Markus Freudenberg
Basic Graph Pattern
Graph Group Pattern
Filtering Unwanted Results
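The query slides themselves are images, so here is a rough sketch of the two ideas named in their titles: matching a basic graph pattern (triples with variables) and filtering out unwanted results. It is a toy matcher over an in-memory set, not a SPARQL engine. The films `ex:FilmA` and `ex:FilmB`, the actor `ex:SomeActor`, the `ex:memberOf` property and the use of `dbo:starring` are placeholders chosen for the example.

```python
EX = "http://example.com/"
DBR = "http://dbpedia.org/resource/"
STARRING = "http://dbpedia.org/ontology/starring"  # assumed property name
MEMBER_OF = EX + "memberOf"                         # hypothetical property
CLEESE = DBR + "John_Cleese"
PYTHON = DBR + "Monty_Python"

graph = {
    (EX + "FilmA", STARRING, CLEESE),
    (EX + "FilmA", STARRING, EX + "SomeActor"),
    (EX + "FilmB", STARRING, CLEESE),
    (EX + "FilmB", STARRING, DBR + "Eric_Idle"),
    (CLEESE, MEMBER_OF, PYTHON),
    (DBR + "Eric_Idle", MEMBER_OF, PYTHON),
}


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):                 # variable: bind it or compare
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:                      # constant: must match exactly
            return None
    return extended


def match_bgp(graph, patterns):
    """All variable bindings that satisfy every triple pattern at once."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [extended
                    for binding in bindings
                    for triple in graph
                    for extended in [match_pattern(pattern, triple, binding)]
                    if extended is not None]
    return bindings


def has_other_python(film):
    """True if the film stars a Monty Python member other than John Cleese."""
    cast = match_bgp(graph, [(film, STARRING, "?actor"), ("?actor", MEMBER_OF, PYTHON)])
    return any(b["?actor"] != CLEESE for b in cast)


films = match_bgp(graph, [("?film", STARRING, CLEESE)])
result = sorted(b["?film"] for b in films if not has_other_python(b["?film"]))
print(result)
```

In SPARQL the first `match_bgp` call corresponds to the triple patterns in the `WHERE` clause and `has_other_python` plays the role of a `FILTER NOT EXISTS { ... }` block.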
RelFinder demo (flash)
Schema.org
● Vocabulary backed by all search engines
● RDF data model
○ Normative format is JSON-LD
○ RDF is not actively mentioned (to not scare people away)
○ Allows use as general structured data (e.g. microdata)
● Enriches a lot of (at least) Google’s applications
○ Search (try e.g. recipes)
○ Gmail (travel, events, actions, ...)
○ Google Now
○ Google Knowledge Graph
○ ...
Schema.org actions
JSON-LD
● Like normal JSON but better ;)
● @context makes the difference
● Append your own context
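What `@context` adds can be shown with a deliberately tiny reader: it maps the short JSON keys to IRIs and so turns an ordinary-looking JSON object into triples. This is a sketch for flat documents only and does not implement the JSON-LD processing algorithms; the sample document and the `to_triples` helper are assumptions made for the example.

```python
import json

doc = json.loads("""
{
  "@context": {
    "name": "http://schema.org/name",
    "knows": {"@id": "http://schema.org/knows", "@type": "@id"}
  },
  "@id": "http://example.com/Dimitris",
  "name": "Dimitris",
  "knows": "http://example.com/Petros",
  "nickname": "jimkont"
}
""")


def to_triples(doc):
    """Turn a flat JSON-LD-like object into (subject, predicate, object) tuples."""
    context = doc.get("@context", {})
    subject = doc["@id"]
    triples = []
    for key, value in doc.items():
        if key.startswith("@"):
            continue                      # keywords are not properties
        definition = context.get(key)
        if definition is None:
            continue                      # no mapping in the context: key carries no meaning
        if isinstance(definition, dict):
            predicate = definition["@id"]
            is_iri = definition.get("@type") == "@id"
        else:
            predicate, is_iri = definition, False
        # IRIs are kept as-is, other values are rendered as quoted literals.
        triples.append((subject, predicate, value if is_iri else json.dumps(value)))
    return triples


triples = to_triples(doc)
for triple in triples:
    print(triple)
```

A plain JSON consumer can ignore `@context` and keep reading `name` and `knows`, which is why existing APIs can adopt JSON-LD without breaking clients.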
JSON-LD links
● Previous examples
● JSON-LD specification & playground
● Hypermedia self-described APIs with Hydra
Entity disambiguation
aka NERD (Named Entity Recognition & Disambiguation)
● George Bush is sitting in front of the White House
○ George: some George?
○ Bush: a small plant
○ George Bush: former president of the USA
○ White: Colour
○ House: a house
○ White House:
● http://dbpedia-spotlight.github.io/demo/
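The sentence above shows why word-by-word lookup fails: "Bush" and "George Bush" point to different things. A toy way to see the effect is to prefer the longest phrase found in a lexicon. The sketch below does only that; it is not how DBpedia Spotlight works (real systems also use the surrounding context), and the lexicon with its `http://example.com/entity/...` IRIs is entirely hypothetical.

```python
ENTITY = "http://example.com/entity/"   # hypothetical identifiers

lexicon = {
    ("george", "bush"): ENTITY + "George_Bush_(president)",
    ("bush",): ENTITY + "Bush_(plant)",
    ("white", "house"): ENTITY + "White_House",
    ("white",): ENTITY + "White_(colour)",
    ("house",): ENTITY + "House",
}


def annotate(text, lexicon):
    """Greedy longest-match lookup: returns (surface form, entity) pairs."""
    tokens = text.split()
    longest = max(len(key) for key in lexicon)
    annotations = []
    i = 0
    while i < len(tokens):
        for n in range(min(longest, len(tokens) - i), 0, -1):
            key = tuple(token.lower() for token in tokens[i:i + n])
            if key in lexicon:
                annotations.append((" ".join(tokens[i:i + n]), lexicon[key]))
                i += n                    # skip past the matched phrase
                break
        else:
            i += 1                        # no entry starts at this token
    return annotations


annotations = annotate("George Bush is sitting in front of the White House", lexicon)
for surface, entity in annotations:
    print(surface, "->", entity)
```

Because the two-word phrases win, neither the plant nor the colour is returned for this sentence.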
Data Quality
● As mentioned earlier, we can (re)use the vocabulary semantics for automatic data validation
● RDFUnit - https://github.com/AKSW/RDFUnit
○ Automatically generates data unit tests based on the vocabularies your data uses
○ Custom JUnit runner
● SHACL - http://w3c.github.io/data-shapes/shacl/
○ Language to define advanced data constraints on RDF graphs
○ (In progress) W3C Recommendation
ALIGNED project
● Aligning software & data engineering
● Tools & techniques for agility in changes in code / data
● http://aligned-project.eu
● Offers free consultancy on ALIGNED tools
○ See website for more info
Wrapping up / Key points
● Data variety is a common problem
● Integrating Data can be a pain :)
● Graph Databases can help, RDF can sometimes be more appropriate
● Pay as you go data integration
○ Map your data to RDF
○ Keep RDF as a copy of your source data
● RDF helps you develop reusable applications against schemas
● Schema.org
○ For website markup
○ For defining actions
● JSON-LD (embedded mappings)
● RDF for text annotations
● There is very good tool support for RDF in Java
Links
● http://json-ld.org/
● http://wiki.dbpedia.org
● http://dbpedia-spotlight.github.io/demo/
● http://schema.org
● http://aksw.org - Many interesting tools
● http://wikidata.org
● Apache Jena - RDF Java library
● Virtuoso - Open Source RDF & RDBMS DB
Thank you!
Questions?
Slides available at slideshare.net/jimkont

Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get CytotecAbortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
 
Solution manual for managerial accounting 8th edition by john wild ken shaw b...
Solution manual for managerial accounting 8th edition by john wild ken shaw b...Solution manual for managerial accounting 8th edition by john wild ken shaw b...
Solution manual for managerial accounting 8th edition by john wild ken shaw b...
 
Seven tools of quality control.slideshare
Seven tools of quality control.slideshareSeven tools of quality control.slideshare
Seven tools of quality control.slideshare
 
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di Bontang
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di  Bontangobat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di  Bontang
obat aborsi Bontang wa 082135199655 jual obat aborsi cytotec asli di Bontang
 
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
 
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
Jual Obat Aborsi Bandung (Asli No.1) Wa 082134680322 Klinik Obat Penggugur Ka...
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
How to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data AnalyticsHow to Transform Clinical Trial Management with Advanced Data Analytics
How to Transform Clinical Trial Management with Advanced Data Analytics
 
jll-asia-pacific-capital-tracker-1q24.pdf
jll-asia-pacific-capital-tracker-1q24.pdfjll-asia-pacific-capital-tracker-1q24.pdf
jll-asia-pacific-capital-tracker-1q24.pdf
 
Predictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting TechniquesPredictive Precipitation: Advanced Rain Forecasting Techniques
Predictive Precipitation: Advanced Rain Forecasting Techniques
 
如何办理澳洲拉筹伯大学毕业证(LaTrobe毕业证书)成绩单原件一模一样
如何办理澳洲拉筹伯大学毕业证(LaTrobe毕业证书)成绩单原件一模一样如何办理澳洲拉筹伯大学毕业证(LaTrobe毕业证书)成绩单原件一模一样
如何办理澳洲拉筹伯大学毕业证(LaTrobe毕业证书)成绩单原件一模一样
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...
Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...
Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
如何办理(UPenn毕业证书)宾夕法尼亚大学毕业证成绩单本科硕士学位证留信学历认证
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Unsatisfied Bhabhi ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Gi...
Unsatisfied Bhabhi ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Gi...Unsatisfied Bhabhi ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Gi...
Unsatisfied Bhabhi ℂall Girls Vadodara Book Esha 7427069034 Top Class ℂall Gi...
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
Identify Customer Segments to Create Customer Offers for Each Segment - Appli...
 

Graph databases & data integration - the case of RDF

  • 1. Graph databases & data integration The case of RDF By Dimitris Kontokostas AKSW/KILT - Leipzig DBpedia Association Thessaloniki Java Meetup / 09.05.2016
  • 2. About me
    ● I live in Veria
    ● I am an ex-ICT teacher
    ● Since 2003 I have been working mainly on R&D projects
      ○ + some web development
    ● Since 2012 doing a PhD & working in the AKSW group in Leipzig
      ○ Focusing on semantic web technologies (RDF, SPARQL, and many other scary terms)
      ○ aka Knowledge Engineer
    ● I am an open source enthusiast (DBpedia, RDFUnit)
    ● Recently became a W3C specification editor for SHACL
    ● Walked across many languages but ended up with Scala, Java, & Bash
      ○ With bash / CLI as a first choice ;)
  • 3. Before we start… who knows? LOD Cloud / Linked Data
  • 4. Agenda*
    ● Graphs
    ● RDF Graphs
    ● Data integration
    ● Who uses RDF
    ● Quick overview of:
      ○ DBpedia
      ○ SPARQL
      ○ RelFinder
      ○ Schema.org & actions
      ○ JSON-LD
      ○ Entity disambiguation
      ○ Data Quality
    (*) focusing mostly on getting familiar with basic terms and concepts
    (**) Apologies in advance for mixing Greek with English
  • 6. The four V’s heatmap for Graph Databases
    Study in 2013 found:
    ● many organizations find the “variety” dimension a greater challenge than volume or velocity.
    Graph DBs to the rescue:
    ● Combine multiple sources with different structures
    ● while retaining the flexibility to add new ones without adapting schemas
    ● query combined data, or multiple sources at once
    ● detect patterns in the data
    (*) See also this
  • 7. © Image by Max De Marzi
  • 8. Graphs
    ● A graph is a way of specifying relationships among a collection of items
    ● Items
      ○ Nodes - Alice, Bob, …
      ○ Edges
        ■ undirected - knows, …
        ■ directed - follows, …
      ○ Values - weights, distances, scores, 0-5 scale, …
      ○ Attributes - name, time, ...
  • 9. Graph Data Models
    Property graphs
    ● Industry standards
      ○ Neo4j, Titan, Apache TinkerPop, ...
      ○ App-specific ways of querying, exporting, importing, etc.
      ○ Optimized for specific operations and in many cases faster
    RDF Graphs
    ● W3C standards
      ○ Like XML / HTML, define once, run everywhere (TM)
      ○ Standardised way of querying, exporting, importing
  • 10. Property Graphs
    ● Each node has a
      ○ unique identifier.
      ○ set of outgoing edges.
      ○ set of incoming edges.
      ○ collection of key-value properties.
    ● Each edge
      ○ is directed.
      ○ has a unique identifier.
      ○ has a label that denotes the type of relationship between its source and target nodes.
      ○ has a collection of key-value properties.
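The node and edge description above can be sketched as a small in-memory structure. This is only an illustration of the data model, not the API of Neo4j, Titan or TinkerPop; the class names, the `neighbours` helper and the sample identifiers (`n1`, `e1`, `since`) are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str                                         # unique identifier
    properties: dict = field(default_factory=dict)  # key-value properties
    outgoing: list = field(default_factory=list)    # ids of outgoing edges
    incoming: list = field(default_factory=list)    # ids of incoming edges


@dataclass
class Edge:
    id: str        # unique identifier
    label: str     # type of relationship between source and target
    source: str    # id of the source node (edges are directed)
    target: str    # id of the target node
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical minimal property graph, for illustration only."""

    def __init__(self):
        self.nodes = {}
        self.edges = {}

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, dict(properties))
        return self.nodes[node_id]

    def add_edge(self, edge_id, label, source, target, **properties):
        edge = Edge(edge_id, label, source, target, dict(properties))
        self.edges[edge_id] = edge
        # Keep both adjacency lists in sync with the new edge.
        self.nodes[source].outgoing.append(edge_id)
        self.nodes[target].incoming.append(edge_id)
        return edge

    def neighbours(self, node_id, label):
        """Targets reachable from node_id over outgoing edges with this label."""
        return [self.edges[e].target
                for e in self.nodes[node_id].outgoing
                if self.edges[e].label == label]


graph = PropertyGraph()
graph.add_node("n1", name="Alice")
graph.add_node("n2", name="Bob")
graph.add_edge("e1", "follows", "n1", "n2", since=2016)
```

Note that properties live on both nodes and edges, which is the main structural difference from the RDF model, where everything is expressed as triples.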
  • 11. RDF - Resource Description Framework
    ● An RDF Graph is a set of RDF Triples
    ● An RDF triple consists of (only) three components:
      ○ the subject (is an IRI)
      ○ the predicate (is an IRI)
      ○ the object (can be an IRI or Literal)
      ○ (subjects and objects can also be blank nodes but let’s leave it for now)
    [Figure: example graph with Subject / Predicate / Object labels, showing http://dbpedia.org/resource/Java with dbo:latestReleaseVersion “1.8.0_60” and dbo:influencedBy edges involving http://dbpedia.org/resource/C++ and http://dbpedia.org/resource/C#]
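A minimal sketch of the triple model described above, assuming blank nodes are left out as the slide does. The `IRI`, `Literal` and `Triple` classes are hypothetical names for this illustration and not the API of Apache Jena or any other RDF library.

```python
from dataclasses import dataclass
from typing import Union

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str = XSD_STRING


@dataclass(frozen=True)
class Triple:
    subject: IRI
    predicate: IRI
    object: Union[IRI, Literal]

    def __post_init__(self):
        # Enforce the component rules from the slide (blank nodes omitted).
        if not isinstance(self.subject, IRI):
            raise TypeError("subject must be an IRI")
        if not isinstance(self.predicate, IRI):
            raise TypeError("predicate must be an IRI")
        if not isinstance(self.object, (IRI, Literal)):
            raise TypeError("object must be an IRI or a Literal")


# An RDF graph is a *set* of triples: adding the same triple twice changes nothing.
graph = set()
graph.add(Triple(IRI("http://dbpedia.org/resource/Java"),
                 IRI("http://dbpedia.org/ontology/latestReleaseVersion"),
                 Literal("1.8.0_60")))
graph.add(Triple(IRI("http://example.com/Dimitris"),
                 IRI(RDF_TYPE),
                 IRI("http://dbpedia.org/ontology/Person")))
graph.add(Triple(IRI("http://example.com/Dimitris"),
                 IRI(RDF_TYPE),
                 IRI("http://dbpedia.org/ontology/Person")))
```

Because the classes are frozen dataclasses they are hashable, which is what lets a plain Python `set` stand in for the graph.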
  • 12. RDF is an abstract data model: the same triple in different serializations
    Turtle
      @prefix dbo: <http://dbpedia.org/ontology/> .
      @prefix ex: <http://example.com/> .
      ex:Dimitris a dbo:Person .
    N-Triples (no prefixes and no "a" shorthand, so the full rdf:type IRI is spelled out)
      <http://example.com/Dimitris> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Person> .
    JSON-LD
      { "@id": "http://example.com/Dimitris", "@type": "http://dbpedia.org/ontology/Person" }
    RDF/XML
      <rdf:Description rdf:about="http://example.com/Dimitris">
        <rdf:type rdf:resource="http://dbpedia.org/ontology/Person"/>
      </rdf:Description>
    RDFa (embedded in HTML)
      <div xmlns="http://www.w3.org/1999/xhtml" prefix="rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# dbo: http://dbpedia.org/ontology/ rdfs: http://www.w3.org/2000/01/rdf-schema#">
        <div typeof="dbo:Person" about="http://example.com/Dimitris"></div>
      </div>
  • 13. RDF & Graphs (Separate)
    File1.ttl
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix ex: <http://example.com/> .
      ex:Dimitris foaf:knows ex:Petros .
    File2.ttl
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix ex: <http://example.com/> .
      ex:Dimitris a foaf:Person .
      ex:Petros a foaf:Person .
    File3.ttl
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix dbpedia: <http://dbpedia.org/resource/> .
      @prefix ex: <http://example.com/> .
      ex:Dimitris foaf:interest dbpedia:RDF .
      ex:Petros foaf:interest dbpedia:Apache_Cassandra .
  • 14. RDF & Graphs (merge)
    File_all.ttl
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix ex: <http://example.com/> .
      @prefix dbpedia: <http://dbpedia.org/resource/> .
      ex:Dimitris foaf:knows ex:Petros .
      ex:Dimitris a foaf:Person .
      ex:Petros a foaf:Person .
      ex:Dimitris foaf:interest dbpedia:RDF .
      ex:Petros foaf:interest dbpedia:Apache_Cassandra .
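Since an RDF graph is a set of triples, merging the three files amounts to a set union. The sketch below models each file as a Python set of (subject, predicate, object) tuples written as prefixed names; the `describe` helper is a hypothetical name, and real merging additionally has to keep blank nodes from different files apart, which is ignored here.

```python
file1 = {("ex:Dimitris", "foaf:knows", "ex:Petros")}

file2 = {
    ("ex:Dimitris", "rdf:type", "foaf:Person"),
    ("ex:Petros", "rdf:type", "foaf:Person"),
}

file3 = {
    ("ex:Dimitris", "foaf:interest", "dbpedia:RDF"),
    ("ex:Petros", "foaf:interest", "dbpedia:Apache_Cassandra"),
}

# Merging graphs is a plain set union: no schema migration, no join tables.
merged = file1 | file2 | file3


def describe(graph, subject):
    """All triples in the graph that have the given subject."""
    return {triple for triple in graph if triple[0] == subject}


petros = describe(merged, "ex:Petros")
```

Merging the same file in again leaves the graph unchanged, which is what makes incremental integration of new sources cheap.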
  • 15. RDF & Graphs (dataset / multi-graph)
    TriG (.trig) file
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix dbpedia: <http://dbpedia.org/resource/> .
      @prefix ex: <http://example.com/> .
      <http://example.com/relations-graph> {
        ex:Dimitris foaf:knows ex:Petros .
      }
      <http://example.com/types-graph> {
        ex:Dimitris a foaf:Person .
        ex:Petros a foaf:Person .
      }
      <http://example.com/interests-graph> {
        ex:Dimitris foaf:interest dbpedia:RDF .
        ex:Petros foaf:interest dbpedia:Apache_Cassandra .
      }
  • 16. RDF & Linked Data
    ● Using HTTP(S)-based IRIs we get the Web of Data
      ○ See the TED talk by Tim Berners-Lee (creator of the WWW)
    ● Every RDF resource becomes like a REST GET API that returns all the RDF triples it is associated with
      ○ content negotiation for RDF (machine) or HTML (human)
      ○ Follow-your-nose pattern
    [Figure: example graph linking http://dbpedia.org/resource/Java (dbo:latestReleaseVersion “1.8.0_60”, dbo:influencedBy edges involving http://dbpedia.org/resource/C++ and http://dbpedia.org/resource/C#), http://aksw.org/DimitrisKontokostas (ex:learns, dbo:birthPlace) and http://www.geonames.org/733905/ (geo:lat 40.52437, geo:long 22.20242)]
  • 17. LOD Cloud: >1K datasets, >50B triples, >100M links
  • 18. Vocabularies & Semantics
    ● Vocabularies/Ontologies define classes and predicates (properties) in RDF
      ○ ex:Dimitris a dbo:Person
      ○ ex:Dimitris dbo:birthDate “1981-06-06”^^xsd:date
    ● Existing vocabularies capture many use cases
      ○ DBpedia ontology (general purpose)
      ○ Schema.org (general purpose / newer, backed by Google, Yahoo, Bing & Yandex)
      ○ FOAF (Friend of a Friend)
      ○ Geo (geographical)
      ○ PROV-O (data provenance)
      ○ SKOS (classifications)
      ○ Org (organization structure)
      ○ … http://lov.okfn.org lists more than 400
  • 19. Vocabularies & Semantics
    ● Classes and predicates (properties) have definitions (semantics)
    ● ex:Dimitris a dbo:Person
      ○ dbo:Person belongs in a class hierarchy
    ● ex:Dimitris dbo:birthDate “1981-06-06”^^xsd:date
      ○ dbo:birthDate expects a dbo:Person as subject
      ○ dbo:birthDate expects an xsd:date as object
    ● Reusing existing vocabularies (classes & properties) with defined semantics is a good practice
      ○ Get part of the data modeling for free
      ○ Using common terms can help integrate data more easily
      ○ Validation (or inference) for free
        ■ ex:Thessaloniki dbo:birthDate “1981-06-06”^^xsd:date (is Thessaloniki a Person?)
        ■ ex:Dimitris dbo:birthDate ex:Thessaloniki (ex:Thessaloniki is not an xsd:date)
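The two failing examples above can be caught mechanically once the expected subject class and object datatype of a property are known. The sketch below treats those expectations as constraints to check (the validation reading, rather than inference). The `SCHEMA` and `TYPES` dictionaries, the `validate` function and the use of `dbo:City` for Thessaloniki are assumptions made for illustration; literals are modelled as (lexical form, datatype) tuples and IRIs as plain strings.

```python
# Expected subject class (domain) and object datatype (range) per property.
SCHEMA = {
    "dbo:birthDate": {"domain": "dbo:Person", "range": "xsd:date"},
}

# Known rdf:type statements (illustrative).
TYPES = {
    "ex:Dimitris": {"dbo:Person"},
    "ex:Thessaloniki": {"dbo:City"},
}

DATA = [
    ("ex:Dimitris", "dbo:birthDate", ("1981-06-06", "xsd:date")),      # fine
    ("ex:Thessaloniki", "dbo:birthDate", ("1981-06-06", "xsd:date")),  # wrong subject type
    ("ex:Dimitris", "dbo:birthDate", "ex:Thessaloniki"),               # wrong object
]


def validate(triples, types, schema):
    """Return one message per domain or range violation."""
    errors = []
    for subject, predicate, obj in triples:
        rule = schema.get(predicate)
        if rule is None:
            continue  # no semantics known for this property
        if rule["domain"] not in types.get(subject, set()):
            errors.append(f"{subject} is not a {rule['domain']} (subject of {predicate})")
        is_expected_literal = isinstance(obj, tuple) and obj[1] == rule["range"]
        if not is_expected_literal:
            errors.append(f"object of {subject} {predicate} is not an {rule['range']}")
    return errors


errors = validate(DATA, TYPES, SCHEMA)
```

A fuller check would also walk the class hierarchy, so that an instance of a subclass of dbo:Person still satisfies the subject constraint.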
  • 20. Data integration with RDF
    ● Very simple graph data model
    ● Convert your data to RDF and model against common vocabularies
      ○ Design applications against vocabularies
      ○ Integrate multiple different sources
    ● Local identifiers are a common integration problem
    ● Link to data authorities
      ○ ex:Dimitris dbo:birthPlace geonames:733905 (instead of the local ex:Veria)
      ○ (or) ex:Veria owl:sameAs geonames:733905
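One way to see why linking to an authority helps: once a local identifier is declared the same as an authority identifier, triples from different sources line up and can be joined. This is a rough sketch that rewrites terms along owl:sameAs links; the `canonicalise` function and the source names are hypothetical, and a real reasoner treats owl:sameAs as a symmetric, transitive relation rather than a one-way rewrite.

```python
def canonicalise(triples, same_as):
    """Rewrite subjects and objects to the identifier they are linked to.

    same_as is a list of (local identifier, authority identifier) pairs.
    """
    mapping = dict(same_as)

    def resolve(term):
        seen = set()
        # Follow links until no further mapping exists (guarding against cycles).
        while term in mapping and term not in seen:
            seen.add(term)
            term = mapping[term]
        return term

    return {(resolve(s), p, resolve(o)) for s, p, o in triples}


# Source A uses a local identifier for the city.
source_a = {("ex:Dimitris", "dbo:birthPlace", "ex:Veria")}

# Source B describes the authority identifier (coordinates as shown on slide 16).
source_b = {
    ("geonames:733905", "geo:lat", "40.52437"),
    ("geonames:733905", "geo:long", "22.20242"),
}

links = [("ex:Veria", "geonames:733905")]
merged = canonicalise(source_a | source_b, links)

# Join across the two sources: latitude of Dimitris' birth place.
place = next(o for s, p, o in merged if s == "ex:Dimitris" and p == "dbo:birthPlace")
latitude = next(o for s, p, o in merged if s == place and p == "geo:lat")
```

Without the link, the join finds nothing, because `ex:Veria` never appears as a subject in source B.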
  • 21. Pay as you go Data Integration
    ● RDF views on top of an RDBMS (e.g. MySQL) with R2RML (W3C spec)
      ○ Mapping files define how SQL queries / tables translate to RDF
      ○ Queryable through a virtual SPARQL endpoint that translates SPARQL to SQL
    ● Convert XML/JSON/CSV/… to RDF with RML.io using mapping files
    ● Find links to external databases with Limes & Silk
      ○ e.g.: ex:Veria owl:sameAs geonames:733905
    ● You can get some benefit with low effort
    ● The more time you invest the better the results
    ● (Common practice) work on secondary RDF views of your data
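To make the mapping idea concrete, here is a rough sketch that turns relational rows into N-Triples lines using a small mapping dictionary. The dictionary format is invented for this example and is not R2RML or RML syntax; the `person` table, the subject IRI template and the choice of FOAF terms are likewise assumptions. An in-memory SQLite database stands in for the RDBMS.

```python
import sqlite3

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping: which table to read, how to mint subject IRIs,
# which class the rows belong to, and which predicate each column maps to.
MAPPING = {
    "table": "person",
    "subject_template": "http://example.com/person/{id}",
    "class": "http://xmlns.com/foaf/0.1/Person",
    "columns": {"name": "http://xmlns.com/foaf/0.1/name"},
}


def escape(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(conn, mapping):
    columns = ["id"] + list(mapping["columns"])
    query = f"SELECT {', '.join(columns)} FROM {mapping['table']} ORDER BY id"
    lines = []
    for row in conn.execute(query):
        record = dict(zip(columns, row))
        subject = mapping["subject_template"].format(**record)
        lines.append(f"<{subject}> <{RDF_TYPE}> <{mapping['class']}> .")
        for column, predicate in mapping["columns"].items():
            if record[column] is not None:  # NULL columns produce no triple
                lines.append(f'<{subject}> <{predicate}> "{escape(record[column])}" .')
    return lines


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO person (id, name) VALUES (?, ?)",
                 [(1, "Dimitris"), (2, "Petros")])

ntriples = rows_to_ntriples(conn, MAPPING)
```

This materialises the triples as a secondary copy. A virtual endpoint, as mentioned on the slide, would instead rewrite incoming SPARQL queries to SQL at query time.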
  • 22. Who uses RDF (in public)
    https://github.com/json-ld/json-ld.org/wiki/Users-of-JSON-LD
  • 23. Some More Statistics
    ● Based on the Common Crawl of Nov 2015
    ● 30% of HTML pages (541M / 1.77B pages) contained structured data.
    ● This 30% originates from 2.72M different pay-level domains out of the 14.41M pay-level domains covered by the crawl (19%).
      ○ 521K websites use RDFa
      ○ 1.1M use Microdata
      ○ 586K have embedded JSON-LD (mostly for search actions)
    ● Altogether, the extracted data sets consist of 24.38 billion RDF quads.
    http://webdatacommons.org/structureddata/2015-11/stats/stats.html#results-2015-1
  • 24. DBpedia
    Let’s look at John Cleese (Monty Python)
  • 25. SPARQL
    “Which films starred John Cleese without any other members of Monty Python?”
    SPARQL examples by Markus Ackermann & Markus Freudenberg
  • 27. Basic Graph Pattern
  • 29. Graph Group Pattern
  • 31. Filtering Unwanted Results
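The three steps named above (basic graph pattern, grouping patterns, filtering unwanted results) can be imitated over a list of triples. This is a toy matcher written to show the idea, not SPARQL and not the queries from the original slides. The film identifiers (`ex:FilmA`, ...), `ex:Some_Actor` and the `ex:memberOf` property are invented for the example; variables are strings starting with `?`.

```python
DATA = [
    ("ex:FilmA", "dbo:starring", "ex:John_Cleese"),
    ("ex:FilmA", "dbo:starring", "ex:Michael_Palin"),
    ("ex:FilmB", "dbo:starring", "ex:John_Cleese"),
    ("ex:FilmB", "dbo:starring", "ex:Some_Actor"),
    ("ex:FilmC", "dbo:starring", "ex:Michael_Palin"),
    ("ex:John_Cleese", "ex:memberOf", "ex:Monty_Python"),
    ("ex:Michael_Palin", "ex:memberOf", "ex:Monty_Python"),
]


def match_pattern(pattern, triple, binding):
    """Try to match one triple pattern against one triple, extending the binding."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def evaluate(patterns, triples, binding=None):
    """Basic graph pattern: yield every binding that satisfies all patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        extended = match_pattern(first, triple, binding)
        if extended is not None:
            yield from evaluate(rest, triples, extended)


def has_other_python(film):
    """True if the film stars a Monty Python member other than John Cleese."""
    group = [(film, "dbo:starring", "?other"),
             ("?other", "ex:memberOf", "ex:Monty_Python")]
    return any(b["?other"] != "ex:John_Cleese" for b in evaluate(group, DATA))


candidates = evaluate([("?film", "dbo:starring", "ex:John_Cleese")], DATA)
results = sorted(b["?film"] for b in candidates if not has_other_python(b["?film"]))
```

In SPARQL the last step would typically be written with a `FILTER NOT EXISTS { ... }` block (or `MINUS`) inside the `WHERE` clause.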
  • 33. RelFinder demo (Flash)
  • 34. Schema.org
    ● Vocabulary backed by all major search engines
    ● RDF data model
      ○ Normative format is JSON-LD
      ○ RDF is not actively mentioned (to not scare people away)
      ○ Allows use as general structured data (e.g. Microdata)
    ● Enriches a lot of (at least) Google’s applications
      ○ Search (try e.g. recipes)
      ○ Gmail (travel, events, actions, ...)
      ○ Google Now
      ○ Google Knowledge Graph
      ○ ...
  • 35. Schema.org actions
  • 36. JSON-LD
    ● Like normal JSON but better ;)
  • 37. JSON-LD
    ● Like normal JSON but better ;)
    ● @context makes the difference
    ● Append your own context
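What the @context buys you can be shown with a deliberately simplified expansion step: plain JSON keys are replaced by the IRIs the context maps them to, so two documents with different key names can turn out to say the same thing. This sketch handles only flat documents with simple string term definitions; it is not the JSON-LD expansion algorithm, and the `expand` function and both sample documents are assumptions for illustration.

```python
def expand(document):
    """Replace context-defined keys by their IRIs (flat documents only)."""
    context = document.get("@context", {})
    expanded = {}
    for key, value in document.items():
        if key == "@context":
            continue                      # the context itself is not data
        if key.startswith("@"):
            expanded[key] = value         # keywords such as @id stay as they are
        else:
            expanded[context.get(key, key)] = value
    return expanded


# Two hypothetical APIs that use different JSON keys for the same properties.
doc_a = {
    "@context": {"name": "http://schema.org/name",
                 "homepage": "http://schema.org/url"},
    "@id": "http://example.com/Dimitris",
    "name": "Dimitris",
    "homepage": "http://aksw.org",
}

doc_b = {
    "@context": {"fullName": "http://schema.org/name",
                 "site": "http://schema.org/url"},
    "@id": "http://example.com/Dimitris",
    "fullName": "Dimitris",
    "site": "http://aksw.org",
}

expanded_a = expand(doc_a)
expanded_b = expand(doc_b)
```

Existing JSON consumers can keep reading `name` or `fullName` as before, while RDF-aware consumers see identical data once the context is applied.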
  • 38. JSON-LD
  • 39. JSON-LD
  • 40. JSON-LD
  • 41. JSON-LD links
    ● Previous examples
    ● JSON-LD specification & playground
    ● Hypermedia self-described APIs with Hydra
  • 42. Entity disambiguation
    aka NERD (Named Entity Resolution & Disambiguation)
    ● George Bush is sitting in front of the White House
      ○ George: some George?
      ○ Bush: a small plant
      ○ George Bush: former president of USA
      ○ White: Colour
      ○ House: a house
      ○ White House:
    ● http://dbpedia-spotlight.github.io/demo/
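The example sentence shows why word-by-word lookup gives the wrong entities. A very naive way to prefer "George Bush" over "Bush" is a longest-match dictionary lookup, sketched below. This is not how DBpedia Spotlight works (it also uses context to choose between candidates); the `LEXICON`, its `ex:` identifiers and the `annotate` function are made up for this illustration.

```python
# Surface forms (lower-cased token tuples) mapped to hypothetical entity identifiers.
LEXICON = {
    ("george", "bush"): "ex:George_Bush",
    ("bush",): "ex:Bush_plant",
    ("white", "house"): "ex:White_House",
    ("white",): "ex:White_colour",
    ("house",): "ex:House",
}

MAX_LENGTH = max(len(key) for key in LEXICON)


def annotate(sentence):
    """Greedy longest-match lookup: longer surface forms win over shorter ones."""
    tokens = sentence.split()
    keys = [token.lower().strip(".,") for token in tokens]
    annotations = []
    position = 0
    while position < len(tokens):
        longest = min(MAX_LENGTH, len(tokens) - position)
        for length in range(longest, 0, -1):
            candidate = tuple(keys[position:position + length])
            if candidate in LEXICON:
                surface = " ".join(tokens[position:position + length])
                annotations.append((surface, LEXICON[candidate]))
                position += length
                break
        else:
            position += 1  # no entity starts at this token
    return annotations


annotations = annotate("George Bush is sitting in front of the White House")
```

Longest match resolves "White House" versus "White" and "House", but it cannot decide which George Bush is meant; that is the disambiguation part, which needs context.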
  • 43. Data Quality
    ● As mentioned earlier, we can (re)use the vocabulary semantics for automatic data validation
    ● RDFUnit - https://github.com/AKSW/RDFUnit
      ○ Automatically generates data unit tests based on the vocabularies your data uses
      ○ Custom JUnit runner
    ● SHACL - http://w3c.github.io/data-shapes/shacl/
      ○ Language to define advanced data constraints on RDF Graphs
      ○ (In progress) W3C recommendation
  • 44. ALIGNED project
    ● Aligning software & data engineering
    ● Tools & techniques for agility when code / data change
    ● http://aligned-project.eu
    ● Offers free consultancy on ALIGNED tools
      ○ See website for more info
  • 45. Wrapping up / Key points
    ● Data variety is a common problem
    ● Integrating data can be a pain :)
    ● Graph databases can help, RDF can sometimes be more appropriate
    ● Pay as you go data integration
      ○ Map your data to RDF
      ○ Keep RDF as a copy of your source data
    ● RDF helps you develop reusable applications against schemas
    ● Schema.org
      ○ For website markup
      ○ For defining actions
    ● JSON-LD (embedded mappings)
    ● RDF for text annotations
    ● There is very good tool support for RDF in Java
  • 46. Links
    ● http://json-ld.org/
    ● http://wiki.dbpedia.org
    ● http://dbpedia-spotlight.github.io/demo/
    ● http://schema.org
    ● http://aksw.org - Many interesting tools
    ● http://wikidata.org
    ● Apache Jena - RDF Java library
    ● Virtuoso - Open Source RDF & RDBMS DB
  • 47. Thank you! Questions?
    Slides available at slideshare.net/jimkont