Connections to the Real World                                          Graph Databases and ApplicationsAchim Friedland <ac...
Let’s change out point of view...                                    2
Welcome on the customer side... ;)        www.graph-database.org                                     3
The Graph Representation Problem                              Adjacency matrix vs. Incidence matrix vs.                   ...
The GraphDB Vendor Problem• Multiple APIs from different vendors• Unknown internal graph representation• Unclear design go...
Step 1) Define a common API                             6
The Property-Graph Model               The most common graph model within                    the NoSQL GraphDB space      ...
Property-Graph Constraints?•   Vertex type vs. vertex interfaces?•   Edge label/type vs. edge interfaces?•   Vertex<->Edge...
A Property Graph Model Interface for Java and .NET// Use a class-based in-memory graphvar graph = new InMemoryGraph();var ...
Supported datatypes?•   Strings•   Integers•   DataTime?•   byte[]?•   structured data like XML/JSON?•   List<...>•   ... ...
Step 2) Declarative ways for querying                                        11
Querying a Graph Database• Programmatic / API  •   From any programming language, Pipes, ...  •   Synchronous or Asynchron...
A data flow framework for property graph models                           : IEnumerator<E>, IEnumerable<E>   S         ISid...
Create complex pipes by combining pipes to pipelines   S                      Pipeline<S, E>                 E            ...
A “perl”-style Ad Hoc query language for graphs// Friends-of-a-friendvar pipe1 = new VertexEdgePipe(VertexEdgePipe.Step.OU...
sones GQL    A “SQL”-style Ad Hoc query language for graphs// Friends-of-a-friendvar pipe1 = new VertexEdgePipe(VertexEdge...
Step 3) Query result formats                               17
Query Result Formats• Graphs  •   QR may be queried over and over again  •   QR may be stored/cached as a graph  •   But a...
Query Result Formats• Reduced 2-tier architecture (GraphDB -> Client)   • Higher performance   • Avoids relational archite...
Step 4) Accessing remote graphs                                  20
A HTTP/REST interface for property graphs• rexster server    • Exposes a graph via HTTP/REST    • Vertices and edges are R...
Common CRUD operations...                            22
Common CRUD operations...                            23
What about other HTTP verbs?• PATCH for applying small changes?• NEIGHBORS?• EXPLORE (more neighbors...)• SHORTESTPATH• CE...
Default resource representation: JSONcurl -H Accept:application/json http://localhost:8182/graph1/vertices/1{  "version" :...
Advanced HTTP/REST concepts• HTTP caching support?• HTTP Authentication support?• Conditional PUT/POST requests?          ...
The GraphDB Graph...          OrientDB for Documents    ThinkerGraph &                                   Gremlin for Ad Ho...
Questions? http://www.graph-database.orghttp://www.twitter.com/graphdbs                                  28
Upcoming SlideShare
Loading in...5
×

1st UIM-GDB - Connections to the Real World

1,849

Published on

Published in: Technology
0 Comments
3 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,849
On Slideshare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
32
Comments
0
Likes
3
Embeds 0
No embeds

No notes for slide

1st UIM-GDB - Connections to the Real World

  1. 1. Connections to the Real World Graph Databases and ApplicationsAchim Friedland <achim@graph-database.org>, Aperis GmbH 1st University-Industrial Meeting on Graph Databases - 7.-8. Feb.. 2011, Barcelona , Spain
  2. 2. Let’s change out point of view... 2
  3. 3. Welcome on the customer side... ;) www.graph-database.org 3
  4. 4. The Graph Representation Problem Adjacency matrix vs. Incidence matrix vs. Adjacency list vs. Edge list vs. Classes, Index-based vs. Index-free Adjacency, Dense vs. Sparse graphs, On-disc vs. In-memory graphs, All-Indexed vs. Specific-Index- Creation, directed vs. undirected edges, hypergraphs?, hierarchical graphs?, dynamic graphs?• Different levels of expressivity• Sometimes very application specific• Hard to optimize a single one for every use-case 4
  5. 5. The GraphDB Vendor Problem• Multiple APIs from different vendors• Unknown internal graph representation• Unclear design goals• Community involvement? 5
  6. 6. Step 1) Define a common API 6
  7. 7. The Property-Graph Model The most common graph model within the NoSQL GraphDB space edge label Id: 1 Id: 2 Friends name: Alice name: Bob since: 2009/09/21 age: 21 age: 23 edge vertex propertiesproperties• directed: Each edge has a source and destination vertex• attributed: Vertices and edges carry key/value pairs• edge-labeled: The label denotes the type of relationship• multi-graph: Multiple edges between any two vertices allowed 7
  8. 8. Property-Graph Constraints?• Vertex type vs. vertex interfaces?• Edge label/type vs. edge interfaces?• Vertex<->Edge constraints?• Extension: Undirected Edges?• Extension: Hyperedges?• Extension: Semantic graphs?• Extension: Dynamic graphs? 8
  9. 9. A Property Graph Model Interface for Java and .NET// Use a class-based in-memory graphvar graph = new InMemoryGraph();var v1 = graph.AddVertex(new VertexId(1));var v2 = graph.AddVertex(new VertexId(2));v1.SetProperty("name", "Alice");v1.SetProperty("age" , 21);v2.SetProperty("name", "Bob");v2.SetProperty("age" , 23);var e1 = graph.AddEdge(v1, v2, new EdgeId(1), "Friends");e1.SetProperty(“since”, ”2009/09/21”); structured data (XML, JSON) 9
  10. 10. Supported datatypes?• Strings• Integers• DataTime?• byte[]?• structured data like XML/JSON?• List<...>• ... 10
  11. 11. Step 2) Declarative ways for querying 11
  12. 12. Querying a Graph Database• Programmatic / API • From any programming language, Pipes, ... • Synchronous or Asynchronous • Allow bypassing all optimizations • Do not try to be smarter than the application developer• Ad hoc / Explorative • Gremlin aka. “high-level pipes”? • sones GQL, OrientDB QL aka. “SQL style”? • Pattern matching aka. “SPARQL style”? • Easy embedding of domain specific query languages? 12
  13. 13. A data flow framework for property graph models : IEnumerator<E>, IEnumerable<E> S ISideEffectPipe<in S, out E, out T> E Source EmittedElements T Elements Side Effect 13
  14. 14. Create complex pipes by combining pipes to pipelines S Pipeline<S, E> E pipe1<S,A> pipe2<B,C> pipe3<C,E> Source EmittedElements Elements 14
  15. 15. A “perl”-style Ad Hoc query language for graphs// Friends-of-a-friendvar pipe1 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);var pipe2 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);var pipe3 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);var pipe4 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);var pipe5 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);var pipe6 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);var pipe7 = new PropertyPipe("name");var pipeline = new Pipeline(pipe1,pipe2,pipe3,pipe4,pipe5,pipe6,pipe7);pipeline.SetSource(new SingleEnumerator( graph.GetVertex(new VertexId(1)))); g:id-v(1)/outE[@label=Friends]/inV/outE [@label=Friends]/inV/@name 15
  16. 16. sones GQL A “SQL”-style Ad Hoc query language for graphs// Friends-of-a-friendvar pipe1 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);var pipe2 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);var pipe3 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);var pipe4 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);var pipe5 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);var pipe6 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);var pipe7 = new PropertyPipe("name");var pipeline = new Pipeline(pipe1,pipe2,pipe3,pipe4,pipe5,pipe6,pipe7);pipeline.SetSource(new SingleEnumerator( graph.GetVertex(new VertexId(1)))); From User u SELECT u.Friends.Friends.name WHERE u.Id = 1 16
  17. 17. Step 3) Query result formats 17
  18. 18. Query Result Formats• Graphs • QR may be queried over and over again • QR may be stored/cached as a graph • But again: (Too) may graph representations available• Other data structures • If result is just a list, why converting it to a graph? • Simple for programming languages • Much more complicated for Query Languages 18
  19. 19. Query Result Formats• Reduced 2-tier architecture (GraphDB -> Client) • Higher performance • Avoids relational architecture anti-patterns• Link-aware, self-describing hypermedia (see Neo4J) • e.g. ATOM, XML + XLINK, RDFa• User-defined/application specific protocols • E.g. serve HTML/GEXF directly (see CouchDB) • Allows to create powerful embedded applications 19
  20. 20. Step 4) Accessing remote graphs 20
  21. 21. A HTTP/REST interface for property graphs• rexster server • Exposes a graph via HTTP/REST • Vertices and edges are REST resources • Neo4J, OrientDB are available, InfiniteGraph announced• rexster client • Accessing remote graphs 21
  22. 22. Common CRUD operations... 22
  23. 23. Common CRUD operations... 23
  24. 24. What about other HTTP verbs?• PATCH for applying small changes?• NEIGHBORS?• EXPLORE (more neighbors...)• SHORTESTPATH• CENTRALITY 24
  25. 25. Default resource representation: JSONcurl -H Accept:application/json http://localhost:8182/graph1/vertices/1{ "version" : "0.1", "results" : { "_type" : "vertex", "_id" : "1", "name" : "Alice", "age" : 21 }, "query_time" : 0.014235} 25
  26. 26. Advanced HTTP/REST concepts• HTTP caching support?• HTTP Authentication support?• Conditional PUT/POST requests? 26
  27. 27. The GraphDB Graph... OrientDB for Documents ThinkerGraph & Gremlin for Ad HocInfiniteGraph for Neo4J for GIS Clustering InfoGrid for WebApps In-Memory for Caching OrientDB for Ad Hoc Neo4J for HA 27
  28. 28. Questions? http://www.graph-database.orghttp://www.twitter.com/graphdbs 28
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×