1st UIM-GDB - Connections to the Real World
Published in Technology
Transcript

  • 1. Connections to the Real World
    Graph Databases and Applications
    Achim Friedland <achim@graph-database.org>, Aperis GmbH
    1st University-Industrial Meeting on Graph Databases, 7-8 Feb. 2011, Barcelona, Spain
  • 2. Let’s change our point of view...
  • 3. Welcome to the customer side... ;) www.graph-database.org
  • 4. The Graph Representation Problem
    Adjacency matrix vs. incidence matrix vs. adjacency list vs. edge list vs. classes; index-based vs. index-free adjacency; dense vs. sparse graphs; on-disk vs. in-memory graphs; all-indexed vs. specific index creation; directed vs. undirected edges; hypergraphs? hierarchical graphs? dynamic graphs?
    • Different levels of expressivity
    • Sometimes very application-specific
    • Hard to optimize a single one for every use case
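To make the trade-off on slide 4 concrete, here is a minimal Java sketch that stores the same small directed graph as an edge list, an adjacency matrix, and an adjacency list. The class name `GraphRepresentations`, its methods, and the three-vertex example graph are assumptions made for illustration; none of them come from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the slides: one directed graph, three representations.
final class GraphRepresentations {

    /** Edge list: one {source, target} pair per edge. Compact, but neighbor lookup scans all edges. */
    static int[][] edgeList() {
        return new int[][] { {0, 1}, {0, 2}, {1, 2} };
    }

    /** Adjacency matrix: matrix[u][v] is true when an edge u -> v exists. Needs n * n cells. */
    static boolean[][] toAdjacencyMatrix(int vertexCount, int[][] edges) {
        boolean[][] matrix = new boolean[vertexCount][vertexCount];
        for (int[] edge : edges) {
            matrix[edge[0]][edge[1]] = true;
        }
        return matrix;
    }

    /** Adjacency list: for each vertex, the targets of its outgoing edges. Suits sparse graphs. */
    static List<List<Integer>> toAdjacencyList(int vertexCount, int[][] edges) {
        List<List<Integer>> adjacency = new ArrayList<>();
        for (int v = 0; v < vertexCount; v++) {
            adjacency.add(new ArrayList<>());
        }
        for (int[] edge : edges) {
            adjacency.get(edge[0]).add(edge[1]);
        }
        return adjacency;
    }

    public static void main(String[] args) {
        int[][] edges = edgeList();
        System.out.println(toAdjacencyList(3, edges));
        System.out.println(toAdjacencyMatrix(3, edges)[0][2]);
    }
}
```

The matrix answers "is there an edge u -> v?" in constant time but grows quadratically with the vertex count, while the adjacency list grows with the number of edges. That is one reason no single representation fits every use case.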
  • 5. The GraphDB Vendor Problem
    • Multiple APIs from different vendors
    • Unknown internal graph representation
    • Unclear design goals
    • Community involvement?
  • 6. Step 1) Define a common API
  • 7. The Property-Graph Model
    The most common graph model within the NoSQL GraphDB space
    [Figure: vertex Id: 1 (name: Alice, age: 21) connects to vertex Id: 2 (name: Bob, age: 23) through an edge labeled Friends (since: 2009/09/21); callouts mark the edge label, the edge properties, and the vertex properties]
    • directed: Each edge has a source and destination vertex
    • attributed: Vertices and edges carry key/value pairs
    • edge-labeled: The label denotes the type of relationship
    • multi-graph: Multiple edges between any two vertices allowed
  • 8. Property-Graph Constraints?
    • Vertex type vs. vertex interfaces?
    • Edge label/type vs. edge interfaces?
    • Vertex<->Edge constraints?
    • Extension: Undirected edges?
    • Extension: Hyperedges?
    • Extension: Semantic graphs?
    • Extension: Dynamic graphs?
  • 9. A Property Graph Model Interface for Java and .NET
    // Use a class-based in-memory graph
    var graph = new InMemoryGraph();
    var v1 = graph.AddVertex(new VertexId(1));
    var v2 = graph.AddVertex(new VertexId(2));
    v1.SetProperty("name", "Alice");
    v1.SetProperty("age" , 21);
    v2.SetProperty("name", "Bob");
    v2.SetProperty("age" , 23);
    var e1 = graph.AddEdge(v1, v2, new EdgeId(1), "Friends");
    e1.SetProperty("since", "2009/09/21");
    [Callout on the slide: structured data (XML, JSON)]
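The slides show only how the interface is called, not what sits behind it. The following Java sketch is one possible in-memory implementation of the four property-graph traits from slide 7 (directed, attributed, edge-labeled, multi-graph). All class and method names here (`PropertyGraphSketch`, `Graph`, `addVertex`, and so on) are hypothetical and chosen to mirror the slide; they are not the API of any vendor or of the library presented in the talk.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names are illustrative, not a vendor API.
final class PropertyGraphSketch {

    /** Attributed: vertices and edges both carry key/value pairs. */
    static class Element {
        final long id;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Element(long id) { this.id = id; }

        void setProperty(String key, Object value) { properties.put(key, value); }

        Object getProperty(String key) { return properties.get(key); }
    }

    static final class Vertex extends Element {
        final List<Edge> outEdges = new ArrayList<>();
        final List<Edge> inEdges = new ArrayList<>();

        Vertex(long id) { super(id); }
    }

    /** Directed and edge-labeled: each edge has a source, a target, and a label. */
    static final class Edge extends Element {
        final Vertex source;
        final Vertex target;
        final String label;

        Edge(long id, Vertex source, Vertex target, String label) {
            super(id);
            this.source = source;
            this.target = target;
            this.label = label;
        }
    }

    static final class Graph {
        private final Map<Long, Vertex> vertices = new LinkedHashMap<>();
        private final Map<Long, Edge> edges = new LinkedHashMap<>();

        Vertex addVertex(long id) {
            if (vertices.containsKey(id)) {
                throw new IllegalArgumentException("duplicate vertex id " + id);
            }
            Vertex vertex = new Vertex(id);
            vertices.put(id, vertex);
            return vertex;
        }

        /** Multi-graph: edges are keyed by their own id, so parallel edges are allowed. */
        Edge addEdge(Vertex source, Vertex target, long id, String label) {
            if (edges.containsKey(id)) {
                throw new IllegalArgumentException("duplicate edge id " + id);
            }
            Edge edge = new Edge(id, source, target, label);
            edges.put(id, edge);
            source.outEdges.add(edge);
            target.inEdges.add(edge);
            return edge;
        }

        Vertex getVertex(long id) { return vertices.get(id); }
    }

    /** Rebuilds the Alice/Bob example from the slides. */
    static Graph aliceAndBob() {
        Graph graph = new Graph();
        Vertex v1 = graph.addVertex(1);
        Vertex v2 = graph.addVertex(2);
        v1.setProperty("name", "Alice");
        v1.setProperty("age", 21);
        v2.setProperty("name", "Bob");
        v2.setProperty("age", 23);
        Edge e1 = graph.addEdge(v1, v2, 1, "Friends");
        e1.setProperty("since", "2009/09/21");
        return graph;
    }
}
```

Each vertex keeps direct references to its incident edges, which corresponds to the "index-free adjacency" option named on slide 4. An index-based design would look the edges up in a separate structure instead.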
  • 10. Supported datatypes?
    • Strings
    • Integers
    • DateTime?
    • byte[]?
    • structured data like XML/JSON?
    • List<...>
    • ...
  • 11. Step 2) Declarative ways for querying
  • 12. Querying a Graph Database
    • Programmatic / API
      • From any programming language, Pipes, ...
      • Synchronous or asynchronous
      • Allow bypassing all optimizations
      • Do not try to be smarter than the application developer
    • Ad hoc / explorative
      • Gremlin, a.k.a. “high-level pipes”?
      • sones GQL, OrientDB QL, a.k.a. “SQL style”?
      • Pattern matching, a.k.a. “SPARQL style”?
      • Easy embedding of domain-specific query languages?
  • 13. A data flow framework for property graph models
    ISideEffectPipe<in S, out E, out T> : IEnumerator<E>, IEnumerable<E>
    [Figure: a pipe consumes source elements (S), emits elements (E), and produces a side effect (T)]
  • 14. Create complex pipes by combining pipes into pipelines
    [Figure: Pipeline<S, E> chains pipe1<S,A>, pipe2<B,C>, and pipe3<C,E>, turning source elements (S) into emitted elements (E)]
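The idea on slides 13 and 14 is that a pipe is a lazy transformation from one enumerator to another, and that pipes whose types line up can be chained into a pipeline. The Java sketch below illustrates that idea under stated assumptions: `PipeSketch`, `Pipe`, `then`, `map`, `filter`, and `run` are hypothetical names, the side-effect channel (T) from slide 13 is left out, and this is not the Pipes library shown in the talk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch of pipe composition; not the Pipes library from the talk.
final class PipeSketch {

    /** A pipe lazily turns an iterator over S into an iterator over E. */
    interface Pipe<S, E> {
        Iterator<E> apply(Iterator<S> source);

        /** Chains two pipes; the emitted type of this pipe must equal the source type of the next. */
        default <T> Pipe<S, T> then(Pipe<E, T> next) {
            return source -> next.apply(this.apply(source));
        }
    }

    /** One-to-one pipe: transforms every element. */
    static <S, E> Pipe<S, E> map(Function<S, E> transform) {
        return source -> new Iterator<E>() {
            public boolean hasNext() { return source.hasNext(); }
            public E next() { return transform.apply(source.next()); }
        };
    }

    /** Filter pipe: emits only the elements accepted by the predicate. */
    static <S> Pipe<S, S> filter(Predicate<S> keep) {
        return source -> new Iterator<S>() {
            private S buffered;
            private boolean ready;

            public boolean hasNext() {
                // Pull from the source until an accepted element is buffered.
                while (!ready && source.hasNext()) {
                    S candidate = source.next();
                    if (keep.test(candidate)) {
                        buffered = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            public S next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                return buffered;
            }
        };
    }

    /** Drains a pipe over the given input and collects the emitted elements. */
    static <S, E> List<E> run(Pipe<S, E> pipe, List<S> input) {
        List<E> emitted = new ArrayList<>();
        pipe.apply(input.iterator()).forEachRemaining(emitted::add);
        return emitted;
    }

    /** Three pipes combined into one pipeline: double, keep values above 2, then label. */
    static List<String> demo() {
        Pipe<Integer, Integer> doubled = map(x -> x * 2);
        Pipe<Integer, Integer> aboveTwo = filter(x -> x > 2);
        Pipe<Integer, String> labeled = map(x -> "v" + x);
        Pipe<Integer, String> pipeline = doubled.then(aboveTwo).then(labeled);
        return run(pipeline, Arrays.asList(1, 2, 3));
    }
}
```

Because `then` only type-checks when the emitted type of one pipe equals the source type of the next, a mismatched chain is rejected at compile time. Nothing is computed until the final iterator is drained.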
  • 15. A “perl”-style Ad Hoc query language for graphs
    // Friends-of-a-friend
    var pipe1 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);
    var pipe2 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);
    var pipe3 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);
    var pipe4 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);
    var pipe5 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);
    var pipe6 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);
    var pipe7 = new PropertyPipe("name");
    var pipeline = new Pipeline(pipe1,pipe2,pipe3,pipe4,pipe5,pipe6,pipe7);
    pipeline.SetSource(new SingleEnumerator(graph.GetVertex(new VertexId(1))));

    Equivalent ad hoc query:
    g:id-v(1)/outE[@label=Friends]/inV/outE[@label=Friends]/inV/@name
  • 16. sones GQL: A “SQL”-style Ad Hoc query language for graphs
    // Friends-of-a-friend
    var pipe1 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);
    var pipe2 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);
    var pipe3 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);
    var pipe4 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);
    var pipe5 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);
    var pipe6 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);
    var pipe7 = new PropertyPipe("name");
    var pipeline = new Pipeline(pipe1,pipe2,pipe3,pipe4,pipe5,pipe6,pipe7);
    pipeline.SetSource(new SingleEnumerator(graph.GetVertex(new VertexId(1))));

    Equivalent ad hoc query:
    FROM User u SELECT u.Friends.Friends.name WHERE u.Id = 1
  • 17. Step 3) Query result formats
  • 18. Query Result Formats
    • Graphs
      • A query result (QR) may be queried over and over again
      • A QR may be stored/cached as a graph
      • But again: (too) many graph representations available
    • Other data structures
      • If the result is just a list, why convert it to a graph?
      • Simple for programming languages
      • Much more complicated for query languages
  • 19. Query Result Formats
    • Reduced 2-tier architecture (GraphDB -> Client)
      • Higher performance
      • Avoids relational architecture anti-patterns
    • Link-aware, self-describing hypermedia (see Neo4J)
      • e.g. Atom, XML + XLink, RDFa
    • User-defined/application-specific protocols
      • e.g. serve HTML/GEXF directly (see CouchDB)
      • Allows creating powerful embedded applications
  • 20. Step 4) Accessing remote graphs
  • 21. An HTTP/REST interface for property graphs
    • rexster server
      • Exposes a graph via HTTP/REST
      • Vertices and edges are REST resources
      • Neo4J, OrientDB are available, InfiniteGraph announced
    • rexster client
      • Accessing remote graphs
  • 22. Common CRUD operations...
  • 23. Common CRUD operations...
  • 24. What about other HTTP verbs?
    • PATCH for applying small changes?
    • NEIGHBORS?
    • EXPLORE (more neighbors...)
    • SHORTESTPATH
    • CENTRALITY
  • 25. Default resource representation: JSON
    curl -H Accept:application/json http://localhost:8182/graph1/vertices/1
    {
      "version" : "0.1",
      "results" : {
        "_type" : "vertex",
        "_id" : "1",
        "name" : "Alice",
        "age" : 21
      },
      "query_time" : 0.014235
    }
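The response on slide 25 wraps the vertex in an envelope with `version`, `results`, and `query_time` fields and addresses it under a `/graph1/vertices/1` path. The Java sketch below shows how such a path and envelope could be produced. It is a hand-rolled illustration under assumptions: `VertexJsonSketch` and its methods are hypothetical names, the string escaping covers only quotes and backslashes, and this is not how the rexster server is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the rexster implementation.
final class VertexJsonSketch {

    /** Resource path for a vertex, following the /<graph>/vertices/<id> layout on the slide. */
    static String vertexPath(String graphName, String vertexId) {
        return "/" + graphName + "/vertices/" + vertexId;
    }

    /** Minimal JSON string quoting: escapes only quotes and backslashes (an assumption, not full JSON). */
    static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Numbers and booleans are emitted bare; everything else is emitted as a quoted string. */
    static String render(Object value) {
        if (value instanceof Number || value instanceof Boolean) {
            return value.toString();
        }
        return quote(String.valueOf(value));
    }

    /** Builds the envelope from the slide: version, results (the vertex), query_time. */
    static String vertexResource(String id, Map<String, Object> properties, double queryTime) {
        StringBuilder json = new StringBuilder();
        json.append("{\"version\":\"0.1\",\"results\":{\"_type\":\"vertex\",\"_id\":").append(quote(id));
        for (Map.Entry<String, Object> property : properties.entrySet()) {
            json.append(',').append(quote(property.getKey())).append(':').append(render(property.getValue()));
        }
        json.append("},\"query_time\":").append(queryTime).append('}');
        return json.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> alice = new LinkedHashMap<>();
        alice.put("name", "Alice");
        alice.put("age", 21);
        System.out.println(vertexPath("graph1", "1"));
        System.out.println(vertexResource("1", alice, 0.014235));
    }
}
```

In practice a JSON library would handle escaping and number formatting; the manual string building here only keeps the sketch free of dependencies.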
  • 26. Advanced HTTP/REST concepts
    • HTTP caching support?
    • HTTP authentication support?
    • Conditional PUT/POST requests?
  • 27. The GraphDB Graph...
    [Figure: a graph of graph databases, each node labeled with a use case]
    • OrientDB for Documents
    • TinkerGraph & Gremlin for Ad Hoc
    • InfiniteGraph for Clustering
    • Neo4J for GIS
    • InfoGrid for WebApps
    • In-Memory for Caching
    • OrientDB for Ad Hoc
    • Neo4J for HA
  • 28. Questions?
    http://www.graph-database.org
    http://www.twitter.com/graphdbs