TinkerPop: a story of
graphs, DBs, and graph DBs
Joshua Shinavier and James Thornton
Texas Linux Festival
June 13th, 2014
Once, there was a thing
v(1)
Let’s call it a vertex
The vertex had
some metadata
v(1)
name: “Graph DB workshop”
We’ll call that a property
v(1)
name: “Graph DB workshop”
You are here.
In fact, the vertex had
multiple properties
v(1)
name: “Graph DB workshop”
type: “Event”
The properties were
of various types
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type:...
Thus, an edge
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Lin...
The edge was directed…
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “...
…and labeled
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linu...
The label types the relationship
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(...
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type:...
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas Linux Fest”
type:...
Now it was a labeled multigraph
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2...
A few more edges
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
name: “Texas ...
Some edges also had properties
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)...
We call this a Property Graph
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
v(2)
...
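The model built up over the preceding slides can be sketched in a few lines of code. This is only an illustrative in-memory toy, not the Blueprints API: the class and method names (`PropertyGraph`, `add_vertex`, `add_edge`, `out_edges`) are hypothetical, and the direction of the `partOf` edge and the endpoints and weight of the `contributesTo` edge are assumptions about the diagram.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Vertex:
    id: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    out_v: Vertex   # tail vertex: the edge points away from it
    label: str      # the label types the relationship
    in_v: Vertex    # head vertex: the edge points into it
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory property graph; parallel edges are allowed."""

    def __init__(self) -> None:
        self.vertices: Dict[int, Vertex] = {}
        self.edges: List[Edge] = []

    def add_vertex(self, vertex_id: int, **properties: Any) -> Vertex:
        vertex = Vertex(vertex_id, dict(properties))
        self.vertices[vertex_id] = vertex
        return vertex

    def add_edge(self, out_v: Vertex, label: str, in_v: Vertex,
                 **properties: Any) -> Edge:
        edge = Edge(out_v, label, in_v, dict(properties))
        self.edges.append(edge)
        return edge

    def out_edges(self, vertex: Vertex, label: Optional[str] = None) -> List[Edge]:
        # Linear scan; a real graph database would index edges per vertex.
        return [e for e in self.edges
                if e.out_v is vertex and (label is None or e.label == label)]


graph = PropertyGraph()
workshop = graph.add_vertex(1, name="Graph DB workshop", type="Event",
                            starts=1402682400000, ends=1402696800000)
fest = graph.add_vertex(2, name="Texas Linux Fest", type="Event",
                        starts=1402664400000, ends=1402808400000)
josh = graph.add_vertex(6, name="Joshua Shinavier", type="Person",
                        githubId="joshsh")
suite = graph.add_vertex(7, name="TinkerPop suite", type="Software")

# Assumption: the workshop is the tail of the partOf edge.
graph.add_edge(workshop, "partOf", fest)
# Assumption: endpoints and weight are picked for illustration only.
graph.add_edge(josh, "contributesTo", suite, weight=1.0)

for edge in graph.out_edges(workshop, "partOf"):
    print(edge.out_v.properties["name"], "-partOf->", edge.in_v.properties["name"])
```

Vertices and edges both carry an open-ended property map, which is what separates a Property Graph from a plain labeled multigraph.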
Many graph DB data models
are variations on this theme
v(1)
name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
...
Neo4j
OrientDB
Sparksee*
* the graph database previously known as DEX
etc.
Enter Blueprints
• single Property Graph API supported by diverse
graph database backends
• choose your favorite, but avoid vendor lo...
Blueprints implementations
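The idea of one API over many backends can be sketched as follows. Everything here is hypothetical and much smaller than the real thing: `GraphStore`, `AdjacencyStore`, `EdgeListStore`, and `load_events` are made-up names, not Blueprints interfaces, and the `partOf` endpoints are assumed from the example graph.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Dict, List, Tuple


class GraphStore(ABC):
    """Hypothetical backend-neutral interface (not the Blueprints API)."""

    @abstractmethod
    def add_edge(self, out_id: int, label: str, in_id: int) -> None:
        ...

    @abstractmethod
    def out_neighbors(self, vertex_id: int, label: str) -> List[int]:
        ...


class AdjacencyStore(GraphStore):
    """Backend that keeps one neighbor list per (vertex, label) pair."""

    def __init__(self) -> None:
        self._adj: Dict[Tuple[int, str], List[int]] = defaultdict(list)

    def add_edge(self, out_id: int, label: str, in_id: int) -> None:
        self._adj[(out_id, label)].append(in_id)

    def out_neighbors(self, vertex_id: int, label: str) -> List[int]:
        return list(self._adj.get((vertex_id, label), []))


class EdgeListStore(GraphStore):
    """Backend that keeps a flat list of (out, label, in) triples."""

    def __init__(self) -> None:
        self._edges: List[Tuple[int, str, int]] = []

    def add_edge(self, out_id: int, label: str, in_id: int) -> None:
        self._edges.append((out_id, label, in_id))

    def out_neighbors(self, vertex_id: int, label: str) -> List[int]:
        return [dst for src, lbl, dst in self._edges
                if src == vertex_id and lbl == label]


def load_events(store: GraphStore) -> GraphStore:
    # Application code sees only the interface, never the backend.
    for event_id in (1, 3, 4):
        store.add_edge(event_id, "partOf", 2)
    return store


for backend in (AdjacencyStore(), EdgeListStore()):
    store = load_events(backend)
    print(type(store).__name__, store.out_neighbors(1, "partOf"))
```

Swapping the backend changes one constructor call and nothing else, which is the sense in which a shared API helps avoid lock-in.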
Now we need a
query language…
• build it on the Blueprints API
• query over any Blueprints-compatible DB
• make it path-li...
• a domain-specific language for traversing graphs
• Turing-complete, permits access to the full JDK
• has been adapted to ...
Think “pipes and filters”
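A rough way to picture "pipes and filters" is a chain of lazy steps, each consuming the stream produced by the one before it. The sketch below uses Python generators; it is not Gremlin syntax or the Pipes API, the step names (`in_`, `has`, `values`, `traverse`) are invented for the example, and the edge directions in `EDGES` are assumptions about the example graph.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

VERTICES: Dict[int, Dict[str, Any]] = {
    1: {"name": "Graph DB workshop", "type": "Event"},
    2: {"name": "Texas Linux Fest", "type": "Event"},
    3: {"name": "Chef Workshop", "type": "Event"},
    7: {"name": "TinkerPop suite", "type": "Software"},
}
# (out vertex, label, in vertex); directions are assumed.
EDGES: List[Tuple[int, str, int]] = [
    (1, "partOf", 2),
    (3, "partOf", 2),
    (1, "hasTopic", 7),
]

Pipe = Callable[[Iterable[Any]], Iterator[Any]]


def in_(label: str) -> Pipe:
    """Pipe: follow edges with the given label backwards, head to tail."""
    def pipe(ids: Iterable[int]) -> Iterator[int]:
        for vid in ids:
            for src, lbl, dst in EDGES:
                if dst == vid and lbl == label:
                    yield src
    return pipe


def has(key: str, value: Any) -> Pipe:
    """Filter: keep only vertices whose property matches."""
    def pipe(ids: Iterable[int]) -> Iterator[int]:
        return (vid for vid in ids if VERTICES[vid].get(key) == value)
    return pipe


def values(key: str) -> Pipe:
    """Pipe: replace each vertex id with one of its property values."""
    def pipe(ids: Iterable[int]) -> Iterator[Any]:
        return (VERTICES[vid][key] for vid in ids)
    return pipe


def traverse(start: Iterable[int], *pipes: Pipe) -> Iterator[Any]:
    """Connect the pipes end to end; nothing runs until the result is consumed."""
    stream: Iterator[Any] = iter(start)
    for pipe in pipes:
        stream = pipe(stream)
    return stream


# Names of the events that are part of v(2).
names = list(traverse([2], in_("partOf"), has("type", "Event"), values("name")))
print(names)
```

Because every step is a generator, results stream through the chain one element at a time instead of being materialized between steps.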
• Pipes: dataflow framework. The basis of Gremlin
• Frames: Java bean framework for graphs
• Furnace: Property Graph algori...
TinkerPop is…
• a developer group creating an open-source graph DB
stack
• a community of users and third-party implemento...
A detailed guide to the
rest of this workshop
• intro to the Aurelius Graph Cluster
• demos of graph tools and concepts
• ...
Thanks!
The Aurelius
Graph Cluster
In TinkerPop…
• we adapt various graph DBs to a unified API
• they become Property Graph databases
With AGC…
• we adapt various high-performance databases to
the Titan API
• they become graph databases
Take your pick of CAP
Titan highlights
• graphs, transactions scale with the number of
machines in a cluster
• geo, numeric range, and full text...
Dealing with supernodes
• Titan’s vertex-centric indices permit ordered querying
from a vertex
• e.g. retrieve “knows” edg...
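One way to picture a vertex-centric index is a per-vertex, per-label edge list kept sorted by one edge property, so that ordered and range queries never scan unrelated edges. The sketch below is an assumption-laden toy, not Titan's implementation or API: `VertexCentricIndex` and its methods are hypothetical, and small integers stand in for "since" timestamps.

```python
import bisect
from collections import defaultdict
from typing import Any, Dict, List, Optional, Tuple


class VertexCentricIndex:
    """Toy index: edges grouped by (vertex, label), sorted by one property."""

    def __init__(self, sort_key: str) -> None:
        self.sort_key = sort_key
        # (out vertex id, label) -> sorted list of (sort value, in vertex id)
        self._index: Dict[Tuple[int, str], List[Tuple[Any, int]]] = defaultdict(list)

    def add_edge(self, out_id: int, label: str, in_id: int, **props: Any) -> None:
        # insort keeps the list ordered as edges arrive.
        bisect.insort(self._index[(out_id, label)], (props[self.sort_key], in_id))

    def edges(self, out_id: int, label: str,
              start: Optional[Any] = None,
              end: Optional[Any] = None) -> List[Tuple[Any, int]]:
        """Return edges in sort order, limited to start <= value < end."""
        entries = self._index.get((out_id, label), [])
        # A 1-tuple sorts before any 2-tuple sharing its first element,
        # so bisect_left lands on the first entry with value >= bound.
        lo = 0 if start is None else bisect.bisect_left(entries, (start,))
        hi = len(entries) if end is None else bisect.bisect_left(entries, (end,))
        return entries[lo:hi]


index = VertexCentricIndex("since")
index.add_edge(1, "knows", 2, since=2012)
index.add_edge(1, "knows", 3, since=2009)
index.add_edge(1, "knows", 4, since=2014)

print(index.edges(1, "knows"))                        # all "knows" edges, oldest first
print(index.edges(1, "knows", start=2010, end=2014))  # only the 2010..2013 range
```

The range query costs two binary searches plus the size of the result, which is why a supernode with very many edges can still be queried quickly.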
What about
Faunus
Faunus…
• is a Hadoop-based graph analytics engine
• in Titan 0.5 will simply be called Titan/Hadoop
• adds support for gl...
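Evaluating a traversal breadth-first over the whole graph can be pictured as moving every traverser one step at a time and tracking only how many traversers sit at each vertex. The sketch below is a single-process illustration of that idea under stated assumptions; it is not Faunus code, `step_out` is a hypothetical helper, and the "knows" edges are invented sample data.

```python
from collections import Counter
from typing import List, Tuple

# (out vertex, label, in vertex); invented sample data.
EDGES: List[Tuple[int, str, int]] = [
    (1, "knows", 2),
    (1, "knows", 3),
    (2, "knows", 3),
    (3, "knows", 4),
]


def step_out(frontier: Counter, edges: List[Tuple[int, str, int]], label: str) -> Counter:
    """Advance every traverser across one outgoing edge with the given label.

    The frontier maps vertex id -> number of traversers currently there.
    One pass over the edge list moves the entire frontier, which is the
    kind of bulk step that suits a batch system.
    """
    nxt: Counter = Counter()
    for src, lbl, dst in edges:
        if lbl == label and src in frontier:
            nxt[dst] += frontier[src]
    return nxt


# Start with one traverser on every vertex, then take two "knows" steps.
start = Counter({1: 1, 2: 1, 3: 1, 4: 1})
one_hop = step_out(start, EDGES, "knows")
two_hops = step_out(one_hop, EDGES, "knows")
print(dict(one_hop))
print(dict(two_hops))
```

Each count is the number of distinct paths ending at that vertex, so a global question such as "how many two-hop paths end here?" needs only two passes over the edges.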
Faunus inputs and outputs
• Hadoop SequenceFile format (in/out)
• Titan graph DB (in/out)
• GraphSON format (in/out)
• Rex...
Demo time
TinkerPop3
What’s new in TP3
• new Gremlin implementation which makes good use of
Java 8 closures, enables introspection and optimiza...
Gremlitron
• Blueprints, Pipes, and
Gremlin are all integrated
in TinkerPop3
• Frames obsoleted by
Gremlin DSLs
• Furnace ...
Try it out
• at:
• https://github.com/tinkerpop/tinkerpop3
• mailing list:
• https://groups.google.com/forum/gremlin-users...
josh@fortytwo.net james.thornton@gmail.com
http://tinkerpop.com


intro to TinkerPop and the Aurelius Graph Cluster for the Graph DB Workshop, Texas Linux Festival 2014

Published in: Technology

TinkerPop: a story of graphs, DBs, and graph DBs

  1. 1. TinkerPop: a story of graphs, DBs, and graph DBs Joshua Shinavier and James Thornton Texas Linux Festival June 13th, 2014
  2. 2. Once, there was a thing
  3. 3. v(1) Let’s call it a vertex
  4. 4. The vertex had some metadata v(1) name: “Graph DB workshop”
  5. 5. We’ll call that a property v(1) name: “Graph DB workshop” You are here.
  6. 6. In fact, the vertex had multiple properties v(1) name: “Graph DB workshop” type: “Event”
  7. 7. The properties were of various types v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000
  8. 8. Our vertex was not alone v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000
  9. 9. Thus, an edge v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000
  10. 10. The edge was directed… v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000
  11. 11. …and labeled v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf
  12. 12. The label types the relationship v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf You are here, too.
  13. 13. More vertices joined the fun… v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf v(3) name: “Chef Workshop” type: “Event” starts: 1402664400000 ends: 1402696800000 v(4) name: “Canonical Charm School” type: “Event” starts: 1402664400000 ends: 1402696800000 partOf partOf
  14. 14. More labels, too v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf v(3) name: “Chef Workshop” type: “Event” starts: 1402664400000 ends: 1402696800000 v(4) name: “Canonical Charm School” type: “Event” starts: 1402664400000 ends: 1402696800000 partOf partOf v(7) name: “TinkerPop suite” type: “Software” hasTopic v(8) name: “Aurelius Graph Cluster” type: “Software” hasTopic
  15. 15. Now it was a labeled multigraph v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf v(3) name: “Chef Workshop” type: “Event” starts: 1402664400000 ends: 1402696800000 v(4) name: “Canonical Charm School” type: “Event” starts: 1402664400000 ends: 1402696800000 partOf partOf v(5) name: “James Thornton” type: “Person” githubId: “espeed” v(6) name: “Joshua Shinavier” type: “Person” githubId: “joshsh” presentedBy presentedBy v(7) name: “TinkerPop suite” type: “Software” hasTopic v(8) name: “Aurelius Graph Cluster” type: “Software” hasTopic
  16. 16. A few more edges v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf v(3) name: “Chef Workshop” type: “Event” starts: 1402664400000 ends: 1402696800000 v(4) name: “Canonical Charm School” type: “Event” starts: 1402664400000 ends: 1402696800000 partOf partOf v(5) name: “James Thornton” type: “Person” githubId: “espeed” v(6) name: “Joshua Shinavier” type: “Person” githubId: “joshsh” presentedBy presentedBy v(7) name: “TinkerPop suite” type: “Software” hasTopic v(8) name: “Aurelius Graph Cluster” type: “Software” contributesTo contributesTo hasTopic contributesTo
  17. 17. Some edges also had properties v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf v(3) name: “Chef Workshop” type: “Event” starts: 1402664400000 ends: 1402696800000 v(4) name: “Canonical Charm School” type: “Event” starts: 1402664400000 ends: 1402696800000 partOf partOf v(5) name: “James Thornton” type: “Person” githubId: “espeed” v(6) name: “Joshua Shinavier” type: “Person” githubId: “joshsh” presentedBy presentedBy v(7) name: “TinkerPop suite” type: “Software” hasTopic v(8) name: “Aurelius Graph Cluster” type: “Software” contributesTo contributesTo hasTopic contributesTo weight: 0.2 weight: 0.8 weight: 1.0
  18. 18. We call this a Property Graph v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf v(3) name: “Chef Workshop” type: “Event” starts: 1402664400000 ends: 1402696800000 v(4) name: “Canonical Charm School” type: “Event” starts: 1402664400000 ends: 1402696800000 partOf partOf v(5) name: “James Thornton” type: “Person” githubId: “espeed” v(6) name: “Joshua Shinavier” type: “Person” githubId: “joshsh” presentedBy presentedBy v(7) name: “TinkerPop suite” type: “Software” hasTopic v(8) name: “Aurelius Graph Cluster” type: “Software” contributesTo contributesTo hasTopic contributesTo weight: 0.2 weight: 0.8 weight: 1.0
  19. 19. Many graph DB data models are variations on this theme v(1) name: “Graph DB workshop” type: “Event” starts: 1402682400000 ends: 1402696800000 v(2) name: “Texas Linux Fest” type: “Event” starts: 1402664400000 ends: 1402808400000 partOf v(3) name: “Chef Workshop” type: “Event” starts: 1402664400000 ends: 1402696800000 v(4) name: “Canonical Charm School” type: “Event” starts: 1402664400000 ends: 1402696800000 partOf partOf v(5) name: “James Thornton” type: “Person” githubId: “espeed” v(6) name: “Joshua Shinavier” type: “Person” githubId: “joshsh” presentedBy presentedBy v(7) name: “TinkerPop suite” type: “Software” hasTopic v(8) name: “Aurelius Graph Cluster” type: “Software” contributesTo contributesTo hasTopic contributesTo weight: 0.2 weight: 0.8 weight: 1.0
  20. 20. Neo4j
  21. 21. OrientDB
  22. 22. Sparksee* * the graph database previously known as DEX
  23. 23. etc.
  24. 24. Enter Blueprints • single Property Graph API supported by diverse graph database backends • choose your favorite, but avoid vendor lock-in • Blueprints : graph DB :: JDBC : RDBMS • implementations, “ouplementations”, test suites, and helper utilities are built on top
  25. 25. Blueprints implementations
  26. 26. Now we need a query language… • build it on the Blueprints API • query over any Blueprints-compatible DB • make it path-like, with side-effects • match abstract traversals through the graph, filtering, ranking, and mutating as you go • make it interactive. How about a REPL?
  27. 27. Enter Gremlin • a domain-specific language for traversing graphs • Turing-complete, permits access to the full JDK • has been adapted to various JVM languages • Gremlin : graph DB :: SQL : RDBMS… sort of
  28. 28. Think “pipes and filters”
  29. 29. The rest of the TinkerPop family • Pipes: dataflow framework. The basis of Gremlin • Frames: Java bean framework for graphs • Furnace: Property Graph algorithms • Rexster: high-performance graph database server
  30. 30. TinkerPop is… • a developer group creating an open-source graph DB stack • a community of users and third-party implementors • a foundation for building high-performance graph applications of any size • model some data on your laptop • build massive clustered applications • open source, BSD licensed
  31. 31. A detailed guide to the rest of this workshop • intro to the Aurelius Graph Cluster • demos of graph tools and concepts • guided installation of tools • preview of TinkerPop3
  32. 32. Thanks!
  33. 33. The Aurelius Graph Cluster
  34. 34. In TinkerPop… • we adapt various graph DBs to a unified API • they become Property Graph databases
  35. 35. With AGC… • we adapt various high-performance databases to the Titan API • they become graph databases
  36. 36. Take your pick of CAP
  37. 37. Titan highlights • graphs, transactions scale with the number of machines in a cluster • geo, numeric range, and full text search for vertices and edges • support for either of two indexing backends • ElasticSearch, Lucene • native support for Blueprints, Rexster
  38. 38. Dealing with supernodes • Titan’s vertex-centric indices permit ordered querying from a vertex • e.g. retrieve “knows” edges… in order of “since” timestamp • iterates efficiently, even if there are thousands of edges
  39. 39. What about Faunus
  40. 40. Faunus… • is a Hadoop-based graph analytics engine • in Titan 0.5 will simply be called Titan/Hadoop • adds support for global distributed graph operations • applies (a subset of) Gremlin in a breadth-first fashion
  41. 41. Faunus inputs and outputs • Hadoop SequenceFile format (in/out) • Titan graph DB (in/out) • GraphSON format (in/out) • Rexster (in) • RDF (in) • Gremlin scripts (in/out)
  42. 42. Demo time
  43. 43. TinkerPop3
  44. 44. What’s new in TP3 • new Gremlin implementation which makes good use of Java 8 closures, enables introspection and optimization of traversals • new OLAP API with support for message passing systems like Giraph, Hama, Faunus, etc. • revamped I/O utilities with support for GraphSON, GraphML, and GremlinKryo • new server model, incl. remote execution of scripts via WebSocket API, server plugin support, customizable serialization formats
  45. 45. Gremlitron • Blueprints, Pipes, and Gremlin are all integrated in TinkerPop3 • Frames obsoleted by Gremlin DSLs • Furnace is Gremlin OLAP • Rexster is Gremlin Server
  46. 46. Try it out • at: • https://github.com/tinkerpop/tinkerpop3 • mailing list: • https://groups.google.com/forum/gremlin-users • we welcome your feedback and/or PRs
  47. 47. josh@fortytwo.net james.thornton@gmail.com http://tinkerpop.com