1st UIM-GDB - Connections to the Real World

Connections to the Real World
Graph Databases and Applications

Achim Friedland <achim@graph-database.org>, Aperis GmbH
1st University-Industrial Meeting on Graph Databases, 7.-8. Feb. 2011, Barcelona, Spain

Let’s change our point of view...


Welcome to the customer side... ;)

www.graph-database.org


The Graph Representation Problem

Adjacency matrix vs. Incidence matrix vs. Adjacency list vs. Edge list vs. Classes, Index-based vs. Index-free Adjacency, Dense vs. Sparse graphs, On-disc vs. In-memory graphs, All-Indexed vs. Specific-Index-Creation, directed vs. undirected edges, hypergraphs?, hierarchical graphs?, dynamic graphs?

• Different levels of expressivity
• Sometimes very application-specific
• Hard to optimize a single one for every use case
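To make the trade-off concrete, here is a minimal sketch in Python (the slides themselves use C#) that stores the same directed graph as an edge list, an adjacency list, and an adjacency matrix. The three-vertex graph and all variable names are illustrative assumptions, not taken from the talk.

```python
# Hypothetical example graph: three vertices, three directed edges.
edge_list = [(0, 1), (0, 2), (1, 2)]
num_vertices = 3

# Adjacency list: memory proportional to V + E, cheap neighbour iteration.
adjacency_list = {v: [] for v in range(num_vertices)}
for src, dst in edge_list:
    adjacency_list[src].append(dst)

# Adjacency matrix: V x V cells, constant-time edge lookup,
# but most cells stay zero when the graph is sparse.
adjacency_matrix = [[0] * num_vertices for _ in range(num_vertices)]
for src, dst in edge_list:
    adjacency_matrix[src][dst] = 1

print(adjacency_list)
print(adjacency_matrix)
```

Which form is preferable depends on the workload: the matrix suits dense graphs with many edge-existence checks, the adjacency list suits sparse graphs and traversals.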

The GraphDB Vendor Problem

• Multiple APIs from different vendors
• Unknown internal graph representation
• Unclear design goals
• Community involvement?


Step 1) Define a common API


The Property-Graph Model
The most common graph model within
the NoSQL GraphDB space

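[Figure: vertex Id: 1 (name: Alice, age: 21) is connected to vertex Id: 2 (name: Bob, age: 23) by an edge with the edge label "Friends" (since: 2009/09/21). Callouts mark the vertex, the edge, the edge label, and the vertex and edge properties.]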

• directed: Each edge has a source and destination vertex
• attributed: Vertices and edges carry key/value pairs
• edge-labeled: The label denotes the type of relationship
• multi-graph: Multiple edges between any two vertices allowed
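The four characteristics above can be sketched with two small record types. This is a minimal Python illustration (the slides use C#); the class names, fields, and the second "Colleagues" edge are assumptions made for the example, not part of any vendor API.

```python
from dataclasses import dataclass, field

@dataclass
class Vertex:
    id: int
    properties: dict = field(default_factory=dict)   # attributed

@dataclass
class Edge:
    id: int
    source: Vertex        # directed: where the edge starts
    destination: Vertex   # directed: where the edge ends
    label: str            # edge-labeled: type of relationship
    properties: dict = field(default_factory=dict)   # attributed

alice = Vertex(1, {"name": "Alice", "age": 21})
bob = Vertex(2, {"name": "Bob", "age": 23})

# Multi-graph: a plain list allows several edges between the same two vertices.
edges = [
    Edge(1, alice, bob, "Friends", {"since": "2009/09/21"}),
    Edge(2, alice, bob, "Colleagues"),  # hypothetical second relationship
]

labels = [e.label for e in edges if e.source is alice and e.destination is bob]
print(labels)
```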

Property-Graph Constraints?

• Vertex type vs. vertex interfaces?
• Edge label/type vs. edge interfaces?
• Vertex<->Edge constraints?
• Extension: Undirected Edges?
• Extension: Hyperedges?
• Extension: Semantic graphs?
• Extension: Dynamic graphs?
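One possible reading of a vertex<->edge constraint is a rule that restricts which vertex types an edge label may connect. The Python sketch below assumes exactly that; the rule table, the type names "User" and "Product", and the function name are hypothetical.

```python
# Hypothetical schema: edge label -> (required source type, required destination type).
EDGE_RULES = {
    "Friends": ("User", "User"),
    "Likes": ("User", "Product"),
}

def edge_allowed(label, source_type, destination_type):
    """Return True if the schema permits this edge; unknown labels are rejected."""
    return EDGE_RULES.get(label) == (source_type, destination_type)

print(edge_allowed("Friends", "User", "User"))
print(edge_allowed("Likes", "Product", "User"))
```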


A Property Graph Model Interface for Java and .NET

// Use a class-based in-memory graph
var graph = new InMemoryGraph();

var v1 = graph.AddVertex(new VertexId(1));
var v2 = graph.AddVertex(new VertexId(2));
v1.SetProperty("name", "Alice");
v1.SetProperty("age" , 21);
v2.SetProperty("name", "Bob");
v2.SetProperty("age" , 23);

var e1 = graph.AddEdge(v1, v2, new EdgeId(1), "Friends");
e1.SetProperty("since", "2009/09/21");

structured data (XML, JSON)

Supported datatypes?

• Strings
• Integers
• DateTime?
• byte[]?
• structured data like XML/JSON?
• List<...>
• ...
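The question marks matter as soon as properties cross a wire format. As a sketch in Python (the slides use C#), the standard `json` module handles strings, integers, and lists natively but needs a fallback for dates and byte arrays. The encoding choices here (ISO 8601 for dates, Base64 for bytes) and the sample property names are assumptions for illustration.

```python
import base64
import json
from datetime import datetime

def encode_property(value):
    """Fallback encoder for values the json module cannot serialise natively."""
    if isinstance(value, datetime):
        return value.isoformat()                          # assumed: ISO 8601 text
    if isinstance(value, bytes):
        return base64.b64encode(value).decode("ascii")    # assumed: Base64 text
    raise TypeError(f"unsupported property type: {type(value).__name__}")

properties = {
    "name": "Alice",
    "age": 21,
    "since": datetime(2009, 9, 21),
    "avatar": b"\x00\x01",
    "tags": ["a", "b"],
}
encoded = json.dumps(properties, default=encode_property, sort_keys=True)
print(encoded)
```

Note that the round trip is lossy: after decoding, the date and the byte array come back as plain strings unless the reader knows the intended type.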


Step 2) Declarative ways for querying


Querying a Graph Database
• Programmatic / API
  • From any programming language, Pipes, ...
  • Synchronous or Asynchronous
  • Allow bypassing all optimizations
  • Do not try to be smarter than the application developer

• Ad hoc / Explorative
  • Gremlin aka. “high-level pipes”?
  • sones GQL, OrientDB QL aka. “SQL style”?
  • Pattern matching aka. “SPARQL style”?
  • Easy embedding of domain-specific query languages?

A data flow framework for property graph models

[Figure: source elements (S) enter an ISideEffectPipe<in S, out E, out T>, which emits elements (E) and exposes a side effect (T). The pipe itself is an IEnumerator<E> and an IEnumerable<E>.]
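The idea of a pipe that is itself an iterator and carries a side effect can be sketched in Python, where the iterator protocol plays the role of IEnumerator/IEnumerable. The class name `CountingPipe` and its `side_effect` attribute are hypothetical and only mirror the S/E/T roles in the diagram.

```python
class CountingPipe:
    """Passes elements through unchanged and counts them as a side effect."""

    def __init__(self, source):
        self._source = iter(source)   # the source elements (S)
        self.side_effect = 0          # the side effect (T): number of elements seen

    def __iter__(self):
        return self

    def __next__(self):
        element = next(self._source)  # propagates StopIteration when exhausted
        self.side_effect += 1
        return element                # the emitted element (E)

pipe = CountingPipe(["Alice", "Bob", "Carol"])
emitted = list(pipe)
print(emitted, pipe.side_effect)
```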


Create complex pipes by combining pipes into pipelines

[Figure: a Pipeline<S, E> chains pipe1<S,A>, pipe2<B,C> and pipe3<C,E>. Source elements (S) enter on the left, emitted elements (E) leave on the right.]
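Composition can be sketched in a few lines of Python if each pipe is modelled as a function from an iterator to an iterator. The `pipeline` helper and the three example pipes are assumptions made for illustration, not the Pipes API from the slides.

```python
from functools import reduce

def pipeline(*pipes):
    """Chain pipes so that each one consumes the previous pipe's output."""
    def run(source):
        return reduce(lambda stream, pipe: pipe(stream), pipes, iter(source))
    return run

# Three hypothetical pipes; the output type of one is the input type of the next.
def parse(stream):        # str -> int
    return (int(s) for s in stream)

def only_even(stream):    # int -> int
    return (n for n in stream if n % 2 == 0)

def describe(stream):     # int -> str
    return (f"value={n}" for n in stream)

to_labels = pipeline(parse, only_even, describe)
result = list(to_labels(["1", "2", "3", "4"]))
print(result)
```

Because every stage is lazy, elements flow through the whole chain one at a time and nothing is materialised until the result is consumed.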


A “perl”-style Ad Hoc query language for graphs

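// Friends-of-a-friend
// First hop: outgoing "Friends" edges of the start vertex, then their target vertices
var pipe1 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);
var pipe2 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);
var pipe3 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);
// Second hop: the same three steps again, using fresh pipe instances
var pipe4 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);
var pipe5 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);
var pipe6 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);
// Finally, emit the "name" property of each vertex reached
var pipe7 = new PropertyPipe("name");

var pipeline = new Pipeline(pipe1,pipe2,pipe3,pipe4,pipe5,pipe6,pipe7);
pipeline.SetSource(new SingleEnumerator(graph.GetVertex(new VertexId(1))));

// The same traversal as a single path expression
g:id-v(1)/outE[@label='Friends']/inV/outE[@label='Friends']/inV/@name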
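For readers who want to trace what the friends-of-a-friend traversal returns, here is a small self-contained Python sketch over made-up data. The edge triples, the names beyond Alice and Bob, and the helper `out_friends` are hypothetical; only the two-hop "Friends" pattern comes from the slide.

```python
# Hypothetical data: (source id, label, destination id) triples plus vertex names.
edges = [(1, "Friends", 2), (2, "Friends", 3), (2, "Friends", 4), (1, "Likes", 5)]
names = {1: "Alice", 2: "Bob", 3: "Carol", 4: "Dave", 5: "Eve"}

def out_friends(vertex_ids):
    """One hop: follow outgoing 'Friends' edges and yield the target vertex ids."""
    for vid in vertex_ids:
        for src, label, dst in edges:
            if src == vid and label == "Friends":
                yield dst

# Two hops starting from vertex 1, then project the name property.
foaf_names = [names[v] for v in out_friends(out_friends([1]))]
print(foaf_names)
```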

sones GQL
A “SQL”-style Ad Hoc query language for graphs

// Friends-of-a-friend
FROM User u SELECT u.Friends.Friends.name WHERE u.Id = 1

Step 3) Query result formats


Query Result Formats

• Graphs
  • A query result (QR) may be queried over and over again
  • A QR may be stored/cached as a graph
  • But again: (too) many graph representations available

• Other data structures
  • If the result is just a list, why convert it to a graph?
  • Simple for programming languages
  • Much more complicated for query languages


Query Result Formats

• Reduced 2-tier architecture (GraphDB -> Client)
  • Higher performance
  • Avoids relational architecture anti-patterns
• Link-aware, self-describing hypermedia (see Neo4J)
  • e.g. Atom, XML + XLink, RDFa
• User-defined/application-specific protocols
  • e.g. serve HTML/GEXF directly (see CouchDB)
  • Allows creating powerful embedded applications

Step 4) Accessing remote graphs


An HTTP/REST interface for property graphs

• Rexster server
  • Exposes a graph via HTTP/REST
  • Vertices and edges are REST resources
  • Neo4J and OrientDB are available, InfiniteGraph announced

• Rexster client
  • Accessing remote graphs

Common CRUD operations...
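One conventional mapping, assumed here rather than taken from the slides, is POST for create, GET for read, PUT for update, and DELETE for delete on a vertex resource. The Python sketch below simulates that mapping against an in-memory dictionary; the function name `handle` and the chosen status codes are illustrative and do not describe any particular server.

```python
vertices = {}   # in-memory stand-in for the remote graph: id -> properties

def handle(method, vertex_id, body=None):
    """Return (HTTP status code, payload) for a request on /vertices/<vertex_id>."""
    if method == "POST":                       # Create
        if vertex_id in vertices:
            return 409, None                   # already exists
        vertices[vertex_id] = dict(body or {})
        return 201, vertices[vertex_id]
    if vertex_id not in vertices:
        return 404, None
    if method == "GET":                        # Read
        return 200, vertices[vertex_id]
    if method == "PUT":                        # Update: replace all properties
        vertices[vertex_id] = dict(body or {})
        return 200, vertices[vertex_id]
    if method == "DELETE":                     # Delete
        del vertices[vertex_id]
        return 204, None
    return 405, None                           # method not supported

print(handle("POST", "1", {"name": "Alice"}))
print(handle("GET", "1"))
```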


What about other HTTP verbs?

• PATCH for applying small changes?
• NEIGHBORS?
• EXPLORE (more neighbors...)
• SHORTESTPATH
• CENTRALITY
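As an illustration of what a SHORTESTPATH request might compute on the server side, here is a plain breadth-first search in Python for unweighted graphs. The function name, the adjacency-dictionary input format, and the sample graph are assumptions; the slides do not specify an algorithm.

```python
from collections import deque

def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the vertices of a shortest path, or None."""
    previous = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:     # walk back to the start vertex
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in adjacency.get(current, []):
            if neighbor not in previous:
                previous[neighbor] = current
                queue.append(neighbor)
    return None

graph = {1: [2, 3], 2: [4], 3: [4], 4: [5]}   # hypothetical directed graph
print(shortest_path(graph, 1, 5))
```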


Default resource representation: JSON

curl -H "Accept: application/json" http://localhost:8182/graph1/vertices/1
{
  "version" : "0.1",
  "results" : {
    "_type" : "vertex",
    "_id" : "1",
    "name" : "Alice",
    "age" : 21
  },
  "query_time" : 0.014235
}
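A client could unpack this representation as follows. The Python sketch embeds the response body from the slide as a string instead of calling a server; treating keys that start with an underscore as metadata is an assumption based on the `_type` and `_id` fields shown.

```python
import json

# The response body shown above, embedded as a string for this sketch.
response_body = """
{
  "version": "0.1",
  "results": {"_type": "vertex", "_id": "1", "name": "Alice", "age": 21},
  "query_time": 0.014235
}
"""

document = json.loads(response_body)
vertex = document["results"]

# Assumption: keys starting with "_" are metadata, the rest are user properties.
properties = {k: v for k, v in vertex.items() if not k.startswith("_")}
print(vertex["_id"], properties)
```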


Advanced HTTP/REST concepts

• HTTP caching support?
• HTTP Authentication support?
• Conditional PUT/POST requests?
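A conditional PUT is typically built on entity tags: the client sends the ETag it last saw in an If-Match header, and the server rejects the update with 412 Precondition Failed if the resource has changed since. The Python sketch below simulates this with an in-memory store; the helper names and the choice of SHA-1 over canonical JSON as the ETag are assumptions for illustration.

```python
import hashlib
import json

def etag_of(properties):
    """Derive a version tag from the resource's canonical JSON form."""
    canonical = json.dumps(properties, sort_keys=True).encode("utf-8")
    return hashlib.sha1(canonical).hexdigest()

def conditional_put(store, vertex_id, new_properties, if_match):
    """Apply the update only if the client's ETag still matches (If-Match)."""
    if etag_of(store[vertex_id]) != if_match:
        return 412                     # Precondition Failed: resource changed
    store[vertex_id] = new_properties
    return 200

store = {"1": {"name": "Alice", "age": 21}}
tag = etag_of(store["1"])
first = conditional_put(store, "1", {"name": "Alice", "age": 22}, tag)
second = conditional_put(store, "1", {"name": "Alice", "age": 23}, tag)  # stale tag
print(first, second)
```

The second request fails because the first one changed the resource, which is how lost updates between concurrent clients are prevented.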


The GraphDB Graph...

• OrientDB for Documents
• TinkerGraph & Gremlin for Ad Hoc
• InfiniteGraph for Clustering
• Neo4J for GIS
• InfoGrid for WebApps
• In-Memory for Caching
• OrientDB for Ad Hoc
• Neo4J for HA


Questions?

http://www.graph-database.org
http://www.twitter.com/graphdbs
