Two graph data models
RDF and Property Graphs
Andy Seaborne
Paolo Castagna
andy@a.o, castagna@a.o
Outline
➢ Graphs
➢ Data Model: RDF
➢ Data Model: Property Graphs
➢ Best of both?
Andy
➢ Involved in Linked Data standards
(SPARQL, RDF)
➢ Open source: contributor to Apache Jena
➢ Work for TopQuadrant, an RDF tools
company
Graphs
Org charts are
not trees
Graphs
Reference data
Life Sciences Ontologies
Vocabularies
Sharable data
Wikipedia Info boxes (DBpedia)
Analytics and Unstructured data
Fraud analysis
Social Graphs
Use Case for Graphs
Looking for patterns
➢ Analytics
● Social networks and recommendation engines
● Data center infrastructure management
➢ Knowledge Graphs
● Happenings: people, places, events
● Customer databases / products catalogues
Graph Data Models
➢ RDF
● W3C Standard
➢ Property Graphs
● Industry standard
RDF
➢ A graph is a set of links
Link: a triple : subject - predicate - object
predicate (or property) is the link name : an IRI
➢ Nodes are IRIs (=URIs), literals (strings, numbers, …) or blank nodes
prefix : <http://example/myData/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
# foaf:name is a short form of <http://xmlns.com/foaf/0.1/name>
:alice rdf:type foaf:Person ;
foaf:name "Alice Smith" ;
foaf:knows :bob .
[Diagram: :alice with links rdf:type → foaf:Person, foaf:name → "Alice Smith", foaf:knows → :bob]
prefix : <http://example/myData/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
:bob rdf:type foaf:Person ;
foaf:name "Bob Brown" .
[Diagram: :bob with links rdf:type → foaf:Person, foaf:name → "Bob Brown"]
prefix : <http://example/myData/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
:alice rdf:type foaf:Person ;
foaf:name "Alice Smith" ;
foaf:knows :bob .
:bob rdf:type foaf:Person ;
foaf:name "Bob Brown" .
[Diagram: the two graphs merged: :alice (rdf:type foaf:Person, foaf:name "Alice Smith") linked by foaf:knows to :bob (rdf:type foaf:Person, foaf:name "Bob Brown")]
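Because an RDF graph is a set of links, merging the two graphs shown on the previous slides amounts to a set union. A minimal Python sketch, assuming triples are modelled as plain tuples of strings (a simplification for illustration, not how an RDF library such as Apache Jena stores them):

```python
# Illustrative only: prefixed names and literals are kept as plain strings.
alice_graph = {
    (":alice", "rdf:type", "foaf:Person"),
    (":alice", "foaf:name", "Alice Smith"),
    (":alice", "foaf:knows", ":bob"),
}
bob_graph = {
    (":bob", "rdf:type", "foaf:Person"),
    (":bob", "foaf:name", "Bob Brown"),
}

# Merging is set union. :bob is one node in the result because both
# graphs name it with the same IRI.
merged = alice_graph | bob_graph
print(len(merged))
```

Neither graph here contains blank nodes; a real merge would have to keep blank nodes from different graphs apart.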
RDFS
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
foaf:Person rdfs:subClassOf foaf:Agent .
foaf:Person rdfs:subClassOf
<http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing> .
foaf:skypeID
rdfs:domain foaf:Agent ;
rdfs:label "Skype ID" ;
rdfs:range rdfs:Literal ;
rdfs:subPropertyOf foaf:nick .
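One consequence of the rdfs:subClassOf statements is that anything typed as foaf:Person can also be inferred to be a foaf:Agent and a SpatialThing. A small Python sketch of that single inference rule, assuming triples as tuples of strings and using geo: as a made-up shorthand for the wgs84_pos namespace (a real RDFS reasoner covers more rules than this):

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

schema = {
    ("foaf:Person", SUBCLASS, "foaf:Agent"),
    # "geo:" is an illustrative abbreviation, not declared on the slide.
    ("foaf:Person", SUBCLASS, "geo:SpatialThing"),
}
data = {(":alice", TYPE, "foaf:Person")}


def infer_types(data, schema):
    """Apply the subclass rule repeatedly until no new rdf:type triple appears."""
    inferred = set(data)
    changed = True
    while changed:
        changed = False
        for (s, p, cls) in list(inferred):
            if p != TYPE:
                continue
            for (sub, sp, sup) in schema:
                if sp == SUBCLASS and sub == cls and (s, TYPE, sup) not in inferred:
                    inferred.add((s, TYPE, sup))
                    changed = True
    return inferred


closure = infer_types(data, schema)
```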
RDF : Access
➢ SPARQL : Query language
➢ Protocol : over HTTP
## Names of people Alice knows.
PREFIX : <http://example/myData/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT * {
:alice foaf:knows ?X .
?X foaf:name ?name .
}
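The query is a two-step pattern match joined on ?X. As a rough illustration of what a SPARQL engine computes here, the following Python sketch runs the same match over triples held as tuples of strings (an assumption made for the example; real engines use indexes and a query algebra rather than nested loops):

```python
triples = {
    (":alice", "rdf:type", "foaf:Person"),
    (":alice", "foaf:name", "Alice Smith"),
    (":alice", "foaf:knows", ":bob"),
    (":bob", "rdf:type", "foaf:Person"),
    (":bob", "foaf:name", "Bob Brown"),
}


def names_alice_knows(triples):
    """Match ':alice foaf:knows ?X . ?X foaf:name ?name' and return bindings."""
    results = []
    for (s, p, x) in triples:
        if s == ":alice" and p == "foaf:knows":
            # Join on ?X: the object of the first pattern is the
            # subject of the second.
            for (s2, p2, name) in triples:
                if s2 == x and p2 == "foaf:name":
                    results.append({"X": x, "name": name})
    return results


bindings = names_alice_knows(triples)
```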
RDF : Access
➢ SPARQL : Query language
➢ Protocol : over HTTP
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?numFriends {
{ SELECT ?person (count(*) AS ?numFriends) {
?person foaf:knows ?X .
} GROUP BY ?person
}
?person foaf:name ?name .
} ORDER BY ?numFriends
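The subquery counts foaf:knows links per person; the outer query then joins each count to a name and sorts. A Python sketch of the same steps, using hypothetical data (the :carol entries are invented for the example and do not appear on the slides):

```python
from collections import Counter

# Hypothetical data: :carol is made up to give the counts some variety.
triples = [
    (":alice", "foaf:name", "Alice Smith"),
    (":bob", "foaf:name", "Bob Brown"),
    (":carol", "foaf:name", "Carol Jones"),
    (":alice", "foaf:knows", ":bob"),
    (":alice", "foaf:knows", ":carol"),
    (":bob", "foaf:knows", ":alice"),
]

# Inner SELECT: one row per ?person with the number of foaf:knows links.
friend_counts = Counter(s for (s, p, o) in triples if p == "foaf:knows")

# Outer pattern: join on ?person to fetch the name, then ORDER BY ?numFriends.
names = {s: o for (s, p, o) in triples if p == "foaf:name"}
rows = sorted(
    ((names[person], n) for person, n in friend_counts.items() if person in names),
    key=lambda row: row[1],
)
```

Here :carol has a name but no outgoing foaf:knows link, so she produces no row, just as the grouped subquery yields no group for her.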
RDF : Access
➢ SPARQL : Update language
➢ Protocol : over HTTP
PREFIX : <http://example/myData/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT DATA {
:bob foaf:name "Bob Brown" ;
foaf:knows :alice
} ;
INSERT { :alice foaf:knows ?B }
WHERE {
:bob foaf:knows ?B
}
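The update has two parts: INSERT DATA adds fixed triples, and INSERT ... WHERE adds triples built from pattern matches. A Python sketch of the effect, reading both patterns as foaf:knows and assuming (for illustration only) that the store starts with the merged Alice and Bob graph:

```python
graph = {
    (":alice", "rdf:type", "foaf:Person"),
    (":alice", "foaf:name", "Alice Smith"),
    (":alice", "foaf:knows", ":bob"),
    (":bob", "rdf:type", "foaf:Person"),
    (":bob", "foaf:name", "Bob Brown"),
}

# INSERT DATA: add ground triples. The name triple is already present,
# and a graph is a set, so only one new triple is actually added.
graph |= {
    (":bob", "foaf:name", "Bob Brown"),
    (":bob", "foaf:knows", ":alice"),
}

# INSERT ... WHERE: evaluate the WHERE pattern first, then insert one
# triple per match of ?B.
matches = [o for (s, p, o) in graph if s == ":bob" and p == "foaf:knows"]
graph |= {(":alice", "foaf:knows", b) for b in matches}
```

With this starting data the only match for ?B is :alice, so the second operation adds the triple :alice foaf:knows :alice.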
Apache Jena
TLP: April 2012
➢ Involvement in standards
➢ RDF 1.1, SPARQL 1.1
➢ RDF database
➢ SPARQL server
Other RDF@ASF:
➢ Any23, Marmotta, Clerezza, Stanbol, Rya
Property Graph Data Model
A property graph is a set of vertices and edges with
respective properties (i.e. key / values):
➢ each vertex or edge has a unique identifier
➢ each vertex has a set of outgoing edges and a set of incoming edges
➢ edges are directed: each edge has a start vertex and an end vertex
➢ each edge has a label which denotes the type of relationship
➢ vertices and edges can have properties (i.e. key / value pairs)
Directed multigraph with properties
attached to vertices and edges
Property Graph: Example
Vertex (id = 1)
name = “Alice”
surname = “Smith”
age = 32
email = alice@example.com
...
Vertex (id = 2)
name = “Bob”
surname = “Brown”
age = 45
email = bob@example.com
...
Edge (id = 3), label: knows, connecting vertex 1 and vertex 2
since = 01/01/1970
...
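The data model and the example can be sketched as a small in-memory structure. This is a minimal Python illustration, not the API of TinkerPop or any other product; the class and method names are invented, and the edge direction (vertex 1 to vertex 2) is an assumption, since the slide text does not state it:

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    id: int
    label: str   # type of relationship
    start: int   # id of the start vertex
    end: int     # id of the end vertex
    properties: dict = field(default_factory=dict)


@dataclass
class PropertyGraph:
    vertices: dict = field(default_factory=dict)
    edges: dict = field(default_factory=dict)

    def add_vertex(self, vertex_id, **properties):
        self.vertices[vertex_id] = Vertex(vertex_id, properties)

    def add_edge(self, edge_id, label, start, end, **properties):
        self.edges[edge_id] = Edge(edge_id, label, start, end, properties)

    def out_edges(self, vertex_id):
        return [e for e in self.edges.values() if e.start == vertex_id]

    def in_edges(self, vertex_id):
        return [e for e in self.edges.values() if e.end == vertex_id]


pg = PropertyGraph()
pg.add_vertex(1, name="Alice", surname="Smith", age=32, email="alice@example.com")
pg.add_vertex(2, name="Bob", surname="Brown", age=45, email="bob@example.com")
# Direction 1 -> 2 is assumed for the example.
pg.add_edge(3, "knows", 1, 2, since="01/01/1970")
```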
Property Graphs : Access
➢ TinkerPop Gremlin
DSL for various languages
g.V().as('person').out('knows').as('friend')
.select('person','friend').by{it.value('name').length()}
➢ Cypher
MATCH (you:Person {name:"You"})
FOREACH (name in ["Johan","Rajesh","Anna","Julia","Andrew"] |
CREATE (you)-[:FRIEND]->(:Person {name:name}))
➢ Connect : API
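As a rough reading of the Gremlin traversal: for every vertex with an outgoing knows edge, it reports the length of the person's name and of the friend's name. A Python sketch of that computation over hypothetical in-memory data (the dictionaries and the function name are assumptions for the example, not part of any Gremlin API):

```python
# Hypothetical data: vertex id -> properties, plus (start, label, end) edges.
vertices = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
edges = [(1, "knows", 2)]


def name_lengths_of_person_and_friend(vertices, edges):
    """For each 'knows' edge, pair the name lengths of person and friend."""
    rows = []
    for person_id in vertices:
        for (start, label, end) in edges:
            if start == person_id and label == "knows":
                rows.append({
                    "person": len(vertices[person_id]["name"]),
                    "friend": len(vertices[end]["name"]),
                })
    return rows


rows = name_lengths_of_person_and_friend(vertices, edges)
```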
Property Graphs @ASF
➢ Apache TinkerPop
➢ Apache Spark > GraphX
➢ Apache Giraph
➢ Apache Flink > Gelly
RDF
Standards
Information modeling
Data publishing
Property Graphs
Code
Analytics
Data capture
Layering
Using Property Graphs tech for RDF
Using RDF tech for Property Graphs
Doable but why?
Can’t use the tools of one without
understanding the other.
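To see why layering is awkward, consider one possible way to encode a property graph as triples. The Python sketch below is an illustration under stated assumptions, not a mapping proposed in the talk: the v:, e:, p:, rel: and pg: names are all invented. Vertex properties map directly to triples, but an edge that carries properties needs an extra node standing for the edge itself.

```python
def property_graph_to_triples(vertices, edges):
    """Encode a property graph as triples using a made-up naming scheme."""
    triples = set()
    for vid, props in vertices.items():
        for key, value in props.items():
            triples.add((f"v:{vid}", f"p:{key}", value))
    for eid, (label, start, end, props) in edges.items():
        # The plain link maps to a single triple.
        triples.add((f"v:{start}", f"rel:{label}", f"v:{end}"))
        if props:
            # Edge properties need a node that stands for the edge.
            triples.add((f"e:{eid}", "pg:start", f"v:{start}"))
            triples.add((f"e:{eid}", "pg:label", f"rel:{label}"))
            triples.add((f"e:{eid}", "pg:end", f"v:{end}"))
            for key, value in props.items():
                triples.add((f"e:{eid}", f"p:{key}", value))
    return triples


vertices = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
edges = {3: ("knows", 1, 2, {"since": "01/01/1970"})}
encoded = property_graph_to_triples(vertices, edges)
```

One edge with one property becomes five triples in this encoding, and a query over the result has to know about the e: nodes, which is one way the layering shows through to the user.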
What to take from RDF
URIs as data types
Data Exchange
Data modelling
Emphasis on data formats for exchange
Relational Algebra engines
URIs matter
https://twitter.com/canberratimes/status/700198365393321984
What to take from PG
Separate links and values
Short names for attributes
Engines for Graph Algorithms
Some Conclusions
➢ Data Graphs are (still) new to many people
➢ RDF emphasizes information modelling
→ Knowledge graphs, e.g. SNOMED
→ SQL-like query
➢ Property Graph emphasizes data syntax
→ Data capture
→ Graph analytic algorithms
➢ Naive layering of data models leads to dissatisfaction
→ Can only mix toolsets by knowing it’s layered
➢ Could share technology
→ Storage, data access, query algebra
Thanks and Q&A
?
The Answer
Building one on top of the other is possible
… but why do it?
Really hard to use! Worst of both worlds.
Semantic Web has some useful features
Apply to property graphs

2016-02 Graphs - PG+RDF
