1st UIM-GDB - Connections to the Real World
Upcoming SlideShare
Loading in...5
×
 

1st UIM-GDB - Connections to the Real World

on

  • 2,119 views

 

Statistics

Views

Total Views
2,119
Views on SlideShare
1,588
Embed Views
531

Actions

Likes
3
Downloads
31
Comments
0

5 Embeds 531

http://www.graph-database.org 524
http://translate.googleusercontent.com 3
http://graph-databases.org 2
http://static.slidesharecdn.com 1
http://www.linkedin.com 1

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

1st UIM-GDB - Connections to the Real World 1st UIM-GDB - Connections to the Real World Presentation Transcript

  • Connections to the Real World Graph Databases and ApplicationsAchim Friedland <achim@graph-database.org>, Aperis GmbH 1st University-Industrial Meeting on Graph Databases - 7.-8. Feb.. 2011, Barcelona , Spain
  • Let’s change out point of view... 2
  • Welcome on the customer side... ;) www.graph-database.org 3
  • The Graph Representation Problem Adjacency matrix vs. Incidence matrix vs. Adjacency list vs. Edge list vs. Classes, Index-based vs. Index-free Adjacency, Dense vs. Sparse graphs, On-disc vs. In-memory graphs, All-Indexed vs. Specific-Index- Creation, directed vs. undirected edges, hypergraphs?, hierarchical graphs?, dynamic graphs?• Different levels of expressivity• Sometimes very application specific• Hard to optimize a single one for every use-case 4
  • The GraphDB Vendor Problem• Multiple APIs from different vendors• Unknown internal graph representation• Unclear design goals• Community involvement? 5
  • Step 1) Define a common API 6
  • The Property-Graph Model The most common graph model within the NoSQL GraphDB space edge label Id: 1 Id: 2 Friends name: Alice name: Bob since: 2009/09/21 age: 21 age: 23 edge vertex propertiesproperties• directed: Each edge has a source and destination vertex• attributed: Vertices and edges carry key/value pairs• edge-labeled: The label denotes the type of relationship• multi-graph: Multiple edges between any two vertices allowed 7
  • Property-Graph Constraints?• Vertex type vs. vertex interfaces?• Edge label/type vs. edge interfaces?• Vertex<->Edge constraints?• Extension: Undirected Edges?• Extension: Hyperedges?• Extension: Semantic graphs?• Extension: Dynamic graphs? 8
  • A Property Graph Model Interface for Java and .NET// Use a class-based in-memory graphvar graph = new InMemoryGraph();var v1 = graph.AddVertex(new VertexId(1));var v2 = graph.AddVertex(new VertexId(2));v1.SetProperty("name", "Alice");v1.SetProperty("age" , 21);v2.SetProperty("name", "Bob");v2.SetProperty("age" , 23);var e1 = graph.AddEdge(v1, v2, new EdgeId(1), "Friends");e1.SetProperty(“since”, ”2009/09/21”); structured data (XML, JSON) 9
  • Supported datatypes?• Strings• Integers• DataTime?• byte[]?• structured data like XML/JSON?• List<...>• ... 10
  • Step 2) Declarative ways for querying 11
  • Querying a Graph Database• Programmatic / API • From any programming language, Pipes, ... • Synchronous or Asynchronous • Allow bypassing all optimizations • Do not try to be smarter than the application developer• Ad hoc / Explorative • Gremlin aka. “high-level pipes”? • sones GQL, OrientDB QL aka. “SQL style”? • Pattern matching aka. “SPARQL style”? • Easy embedding of domain specific query languages? 12
  • A data flow framework for property graph models : IEnumerator<E>, IEnumerable<E> S ISideEffectPipe<in S, out E, out T> E Source EmittedElements T Elements Side Effect 13
  • Create complex pipes by combining pipes to pipelines S Pipeline<S, E> E pipe1<S,A> pipe2<B,C> pipe3<C,E> Source EmittedElements Elements 14
  • A “perl”-style Ad Hoc query language for graphs// Friends-of-a-friendvar pipe1 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);var pipe2 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);var pipe3 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);var pipe4 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);var pipe5 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);var pipe6 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);var pipe7 = new PropertyPipe("name");var pipeline = new Pipeline(pipe1,pipe2,pipe3,pipe4,pipe5,pipe6,pipe7);pipeline.SetSource(new SingleEnumerator( graph.GetVertex(new VertexId(1)))); g:id-v(1)/outE[@label=Friends]/inV/outE [@label=Friends]/inV/@name 15
  • sones GQL A “SQL”-style Ad Hoc query language for graphs// Friends-of-a-friendvar pipe1 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);var pipe2 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);var pipe3 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);var pipe4 = new VertexEdgePipe(VertexEdgePipe.Step.OUT_EDGES);var pipe5 = new LabelFilterPipe("Friends", ComparisonFilter.EQUALS);var pipe6 = new EdgeVertexPipe(EdgeVertexPipe.Step.IN_VERTEX);var pipe7 = new PropertyPipe("name");var pipeline = new Pipeline(pipe1,pipe2,pipe3,pipe4,pipe5,pipe6,pipe7);pipeline.SetSource(new SingleEnumerator( graph.GetVertex(new VertexId(1)))); From User u SELECT u.Friends.Friends.name WHERE u.Id = 1 16
  • Step 3) Query result formats 17
  • Query Result Formats• Graphs • QR may be queried over and over again • QR may be stored/cached as a graph • But again: (Too) may graph representations available• Other data structures • If result is just a list, why converting it to a graph? • Simple for programming languages • Much more complicated for Query Languages 18
  • Query Result Formats• Reduced 2-tier architecture (GraphDB -> Client) • Higher performance • Avoids relational architecture anti-patterns• Link-aware, self-describing hypermedia (see Neo4J) • e.g. ATOM, XML + XLINK, RDFa• User-defined/application specific protocols • E.g. serve HTML/GEXF directly (see CouchDB) • Allows to create powerful embedded applications 19
  • Step 4) Accessing remote graphs 20
  • A HTTP/REST interface for property graphs• rexster server • Exposes a graph via HTTP/REST • Vertices and edges are REST resources • Neo4J, OrientDB are available, InfiniteGraph announced• rexster client • Accessing remote graphs 21
  • Common CRUD operations... 22
  • Common CRUD operations... 23
  • What about other HTTP verbs?• PATCH for applying small changes?• NEIGHBORS?• EXPLORE (more neighbors...)• SHORTESTPATH• CENTRALITY 24
  • Default resource representation: JSONcurl -H Accept:application/json http://localhost:8182/graph1/vertices/1{ "version" : "0.1", "results" : { "_type" : "vertex", "_id" : "1", "name" : "Alice", "age" : 21 }, "query_time" : 0.014235} 25
  • Advanced HTTP/REST concepts• HTTP caching support?• HTTP Authentication support?• Conditional PUT/POST requests? 26
  • The GraphDB Graph... OrientDB for Documents ThinkerGraph & Gremlin for Ad HocInfiniteGraph for Neo4J for GIS Clustering InfoGrid for WebApps In-Memory for Caching OrientDB for Ad Hoc Neo4J for HA 27
  • Questions? http://www.graph-database.orghttp://www.twitter.com/graphdbs 28